mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-25 12:12:47 +01:00
faad462651
This adds new node types for each intrinsic. For instance, for addv, we have AArch64ISD::UADDV, such that: (v4i32 (uaddv ...)) is the same as (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...)))) that is, (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)), (i32 (int_aarch64_neon_uaddv ...)), ssub) In a combine, we transform all such across-vector-lanes intrinsics to: (i32 (extract_vector_elt (uaddv ...), 0)) This has one big advantage: by making the extract_element explicit, we enable the existing patterns for lane-aware instructions to fire. This lets us avoid needlessly going through the GPRs. Consider: uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) { return vmulq_n_u32(a, vaddvq_u32(b)); } We now generate: addv.4s s1, v1 mul.4s v0, v0, v1[0] instead of the previous: addv.4s s1, v1 fmov w8, s1 dup.4s v1, w8 mul.4s v0, v1, v0 rdar://20044838 llvm-svn: 231840 |
||
---|---|---|
.. | ||
Analysis | ||
Assembler | ||
Bindings | ||
Bitcode | ||
BugPoint | ||
CodeGen | ||
DebugInfo | ||
ExecutionEngine | ||
Feature | ||
FileCheck | ||
Instrumentation | ||
Integer | ||
JitListener | ||
Linker | ||
LTO | ||
MC | ||
Object | ||
Other | ||
SymbolRewriter | ||
TableGen | ||
tools | ||
Transforms | ||
Unit | ||
Verifier | ||
YAMLParser | ||
.clang-format | ||
CMakeLists.txt | ||
lit.cfg | ||
lit.site.cfg.in | ||
Makefile | ||
Makefile.tests | ||
TestRunner.sh |