llvm-mirror/test at 32c30ddee7ead45445b4da83fbe8b8e7664216b4

RPCS3/llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Simon Pilgrim 32c30ddee7 [X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.

We know that "CVTTPS2SI" returns 0x80000000 for out-of-range inputs (and for FP_TO_UINT, the result for negative float values is undefined). We can use this to make unsigned conversions from vXf32 to vXi32 more efficient, particularly on targets without blend, using the following logic:

small := CVTTPS2SI(x);
fp_to_ui(x) := small | (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31))

Even on targets where "PBLENDVPS"/"PBLENDVB" exist, they are often latency-2, low-throughput instructions, so this logic is applied there too (in particular for AVX2). It also removes one high-latency floating-point comparison that the previous lowering required.

@TomHender checked the correctness of this for all possible floats between -1 and 2^32 (both ends excluded).
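The logic in the commit message can be illustrated with a scalar model in C. This is only a sketch, not the code from the patch: `cvttps2si_lane` and `fp_to_ui_sketch` are hypothetical names, the first function merely imitates the behaviour the message describes for one lane (truncate toward zero, return 0x80000000 for NaN or out-of-range inputs), and the arithmetic right shift by 31 is written as an explicit sign test so the example stays portable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one conversion lane, as described in the
   commit message: truncate toward zero, and return 0x80000000 for NaN or
   for values that do not fit in a signed 32-bit integer. */
int32_t cvttps2si_lane(float x) {
    if (x != x || x >= 2147483648.0f || x < -2147483648.0f)
        return INT32_MIN;
    return (int32_t)x;
}

/* Sketch of: small | (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31)).
   Assumes -1 < x < 2^32, the range the commit message reports as checked. */
uint32_t fp_to_ui_sketch(float x) {
    assert(x > -1.0f && x < 4294967296.0f);

    uint32_t small = (uint32_t)cvttps2si_lane(x);
    uint32_t big = (uint32_t)cvttps2si_lane(x - 2147483648.0f);

    /* Equivalent to an arithmetic right shift of `small` by 31:
       all ones when the sign bit is set, zero otherwise. */
    uint32_t mask = (small & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    return small | (big & mask);
}
```

For inputs below 2^31 the sign bit of `small` is clear, the mask is zero, and `small` is returned unchanged. For inputs in [2^31, 2^32) the first conversion yields 0x80000000, the mask is all ones, and OR-ing in the conversion of `x - 2^31` restores the low 31 bits, so no blend or floating-point comparison is needed.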

Original Patch by @TomHender (Tom Hender)

Differential Revision: https://reviews.llvm.org/D89697

2021-07-14 12:03:49 +01:00

..

[X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.

2021-07-14 12:03:49 +01:00

[remangleIntrinsicFunction] Detect and resolve name clash

2021-07-13 11:21:12 +02:00

…

Revert "[DebugInfo] Enforce implicit constraints on distinct MDNodes"

2021-07-02 15:57:07 -07:00

…

[X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.

2021-07-14 12:03:49 +01:00

[DebugInfo] Correctly update dbg.values with duplicated location ops

2021-07-14 11:17:24 +01:00

[Clang] Introduce Swift async calling convention.

2021-07-09 11:50:10 -07:00

[Orc] At CBindings for LazyRexports

2021-07-01 21:52:05 +02:00

ExecutionEngine

[JITLink][ELF] Move ELF section and symbol parsing into ELFLinkGraphBuilder.

2021-06-29 09:59:49 +10:00

…

…

Instrumentation

[DebugInfo] Correctly update dbg.values with duplicated location ops

2021-07-14 11:17:24 +01:00

…

…

…

…

MachineVerifier

CodeGen: Print/parse LLTs in MachineMemOperands

2021-06-30 16:54:13 -04:00

[AArch64][SME] Add matrix register definitions and parsing support

2021-07-14 08:25:49 +00:00

[llvm-readobj] Switch command line parsing from llvm::cl to OptTable

2021-07-12 10:14:42 -07:00

…

[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch

2021-07-13 16:09:42 -07:00

SafepointIRVerifier

…

[llvm-readobj] Switch command line parsing from llvm::cl to OptTable

2021-07-12 10:14:42 -07:00

…

[TableGen] Allow identical MnemonicAliases with no predicate

2021-06-30 10:53:39 +01:00

[CSSPGO] Do not import pseudo probe desc in thinLTO

2021-07-13 18:26:36 -07:00

[CSSPGO][llvm-profgen] Allow multiple executable load segments.

2021-07-13 18:22:24 -07:00

[X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction.

2021-07-14 12:03:49 +01:00

…

[OpaquePtr] Support VecOfAnyPtrsToElt intrinsics

2021-07-01 20:35:33 +02:00

…

.clang-format

…

CMakeLists.txt

[Orc] At CBindings for LazyRexports

2021-07-01 21:52:05 +02:00

lit.cfg.py

[Orc] At CBindings for LazyRexports

2021-07-01 21:52:05 +02:00

lit.site.cfg.py.in

…

TestRunner.sh

…