If we were to have an operation with an s16 def that needs to be
executed in a waterfall loop, not having s16 legal would place an
avoidable burden on RegBankSelect to widen it.
This was trying to constrain a physical register. By the verifier's
understanding, it's impossible to have a 1-bit copy to vcc/vcc_lo so
don't try to handle physregs.
This reverts commit 87c5437afd273e909e0fed3389de7531d5452ea5.
The commit includes several headers in the middle of a function, which
breaks pretty much everything.
This patch makes the 'Values' field optional. This is useful when we
handcraft the terminating entry of DIEs.
```
debug_info:
- Version: 4
...
Entries:
- AbbrCode: 1
Values:
- Value: 0x1234
- AbbrCode: 0 ## Termination
```
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D85397
This patch adds one test case that tests dumping an empty .debug_aranges
section.
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D85405
Add given input and mark it as tied.
Doesn't create additional copy compared to
matching input constraint to virtual register.
Differential Revision: https://reviews.llvm.org/D85122
As noted on PR46885, the number of mask elements should always be a power of 2, so to fix the static analyzer warning we are better off replacing the condition to <= 4, and I've added a pow2 assertion as well.
We already need to include raw_ostream.h, also add missing StringRef.h and cstdint implicit dependencies.
Remove unnecessary includes from PDBExtras.cpp
Since we're messing with individual element loads we need to expose this to show whats going on.
Part of the work to fix the masked_expandload.ll regressions in D66004
This allows us to remove extra patterns from AArch64SVEInstrInfo.td
because we can reuse those required for fixed length vectors.
Differential Revision: https://reviews.llvm.org/D85328
NOTE: Also uses SVE code generation for NEON size vectors, instead
of expanding i64 based vector multiplications.
Differential Revision: https://reviews.llvm.org/D85327
The dsymutil/X86/reproducer.test test could create paths
longer than MAX_PATH:
C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win
\build\test\tools\dsymutil\X86\Output\reproducer.test.tmp.repro
\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\build
\test\tools\dsymutil\X86\Output\reproducer.test.tmp\Inputs\
basic1.macho.x86_64.o
Disable this test on windows.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D85294
This is a simple patch that folds freeze(undef) into a proper constant after inspecting its uses.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84948
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.
In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).
Differential Revision: https://reviews.llvm.org/D75069
Unit.Format, Unit.Version and Unit.AddrSize are replaced with
dwarf::FormParams in D84496 to get rid of unnecessary functions
getOffsetSize() and getRefSize(). However, that change makes it
difficult to make AddrSize optional (Optional<uint8_t>). This change
pulls out dwarf::FormParams from DWARFYAML::Unit and use it as a helper
struct in DWARFYAML::emitDebugInfo().
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D85296
Failing test output sometimes contains control characters like \x1b (e.g.
if there was some -fcolor-diagnostics output) which are not allowed inside
XML files. This causes problems with CI systems: for example, the Jenkins
JUnit XML will throw an exception when ecountering those characters and
similar problems also occur with GitLab CI.
Reviewed By: yln, jdenny
Differential Revision: https://reviews.llvm.org/D84233
These might occur in seemingly generic assembly. Previously when
targeting COFF, they were silently ignored, which certainly won't
give the right result. Instead clearly error out, to make it clear
that the assembly needs to be adjusted for this target.
Also change a preexisting report_fatal_error into a proper error
message, pointing out the offending source instruction. This isn't
strictly an internal error, as it can be triggered by user input.
Differential Revision: https://reviews.llvm.org/D85242
We need to have special handling of i128 div/rem on Windows due
to a weird calling convention needed for the libcall. There was
also some code that made it look like we do the same for sdivrem/udiv,
but the code didn't account for multiple return values of those
functions so couldn't possibly work. I think this code never
triggers because we don't have libcall names defined for those
functions by default so DAGCombine never creates DIVREM nodes.
Correctly sign extend the addend, and fix implicit shift operand decoding
(it incorrectly returned 0 for some cases), and check that the initial
encoded immediate is 0.
The functionality is used when calling imageAtomicExhange() on float
type imageBuffer in Graphics shaders.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D85187
This is the last JumpThreading patch for getting the performance numbers shown at
https://reviews.llvm.org/D84940#2184653 .
This patch makes ProcessBlock call ProcessBranchOnPHI when the branch condition
is freeze(phi) as well (originally it calls the function when the condition is
phi only).
Since what ProcessBranchOnPHI does is to duplicate the basic block into
predecessors if profitable, it is still valid when the condition is freeze(phi)
too.
```
p = phi [a, pred1] [b, pred2]
p.fr = freeze p
br p.fr, ...
=>
pred1:
p.fr = freeze a
br p.fr, ...
pred2:
p.fr2 = freeze b
br p.fr2, ...
```
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85029
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.
Differential Revision: https://reviews.llvm.org/D79219
-print-memoryssa in legacy PM is print<memoryssa> in NPM.
Pin tests with -print-memoryssa to legacy PM.
Add corresponding tests for NPM where missing.
This fixes "unknown pass name 'print-memoryssa'".
Some tests still fail in Analysis/MemorySSA due to other passes that
haven't been ported.
pr43427.ll and pr43438.ll required adding -aa-pipeline=basic-aa,
-loop-simplify (since it doesn't run on legacy PM by default), and
decrementing some of the MemoryPhi numbers.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D85333
For example a v4f16 argument is scalarized to 4 i32 values. So
the values are spread out instead of being packed tightly like
in the original vector.
Fixes PR47000.