dumpDebugStrings() and dumpDebugAbbrev() are no longer used in
macho2yaml. This patch helps remove them.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D85496
As mentioned on D85463, we should be using SimplifyMultipleUseDemandedBits (which is the default fallback).
The minor regression in illegal-bitfield-loadstore.ll will be addressed properly by D77804.
Change to expand MULHU/MULHS/UMUL_LOHI/SMUL_LOHI for i32 and i64 since
those instructions are not available on Aurora SX VE. Some of them
are used in expansion of i128 multiply, so need to modify them to
support i128. Then, update basic arithmetic regression tests of
i128 and signed/unsigned i32 typed integer values.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85490
This removes members of the DIEUnit class which were used only in unit
tests. Note also that child classes shadowed some of these methods,
namely, getDwarfVersion() was overridden in DwartfUnit and getLength()
was overridden in DwarfCompileUnit.
Differential Revision: https://reviews.llvm.org/D85436
This is a split patch of D80991.
This patch introduces AAPotentialValues and its interface only.
For more detail of AAPotentialValues abstract attribute, see the original patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83283
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td
for storing out <vscale x 2 x f32> unpacked scalable vectors. Added
a couple of tests to
test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
Differential Revision: https://reviews.llvm.org/D85441
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D84622
If it is load cluster, we don't need to create the dependency edges(SUb->reg) from SUb to SUa
as they both depend on the base register "reg"
+-------+
+----> reg |
| +---+---+
| ^
| |
| |
| |
| +---+---+
| | SUa | Load 0(reg)
| +---+---+
| ^
| |
| |
| +---+---+
+----+ SUb | Load 4(reg)
+-------+
But if it is store cluster, we need to create it as follow shows to avoid the instruction store
depend on scheduled in-between SUb and SUa.
+-------+
+----> reg |
| +---+---+
| ^
| | Missing +-------+
| | +-------------------->+ y |
| | | +---+---+
| +---+-+-+ ^
| | SUa | Store x 0(reg) |
| +---+---+ |
| ^ |
| | +------------------------+
| | |
| +---+--++
+----+ SUb | Store y 4(reg)
+-------+
Reviewed By: evandro, arsenm, rampitec, foad, fhahn
Differential Revision: https://reviews.llvm.org/D72031
This patch is a follow up of D84733.
If a function has noundef attribute in returned position, instructions that return undef or poison value cause UB.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85178
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D84630
addGlobalValueSummary can check newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85182
Change to not generate truncate instructions if all use of a truncate
operation don't care about higher bits. For example, an i32 add
instruction doesn't care about higher 32 bits in 64 bit registers.
Updates regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85418
In GlobalISel, if you have a load into a small type with a range, you'll hit
an assert if you try to compute known bits on it starting at a larger type.
e.g.
```
%x:_(s8) = G_LOAD %whatever(p0) :: (load 1 ... !range !n)
...
%y:_(s32) = G_SOMETHING %x
```
When we walk through G_SOMETHING and hit the load, the width of our known bits
is 32. However, the width of the range is going to be 8. This will cause us
to hit an assert.
To fix this, make computeKnownBitsFromRangeMetadata zero extend or truncate
the range type to match the bitwidth of the known bits we're calculating.
Add a testcase in CodeGen/GlobalISel/KnownBitsTest.cpp to reflect that this
works now.
https://reviews.llvm.org/D85375
As of GN 3028c6a426a4, the hack that transformed "libs" ending in
".framework" from -l arguments to -framework arguments has been removed.
Instead, "frameworks" must be used, and the toolchain must provide
support.
Differential Revision: https://reviews.llvm.org/D84219
Buildbot reported a build failure when building shared
library libLLVMBPFCodeGen.so with unknown reference to
"createCFGSimplificationPass".
Commit 87cba434027b ("BPF: add a SimplifyCFG IR pass during
generic Scalar/IPO optimization") added an IR pass SimplifyCFG
by BPF target. The commit called function
createCFGSimplificationPass() defined in "Scalar" library.
Add this library in Target/BPF/LLVMBuild.txt so
shared library build can succeed.
Negator knows how to do this, but the one-use reasoning is getting
a bit muddy here, we don't really want to increase instruction count,
so we need to both lie that "IsNegation" and have an one-use check
on the outermost LHS value.
Multiplication is commutative, and either of operands can be negative,
so if the RHS is a negated power-of-two, we should try to make it
true power-of-two (which will allow us to turn it into a left-shift),
by trying to sink the negation down into LHS op.
But, we shouldn't re-invent the logic for sinking negation,
let's just use Negator for that.
Tests and original patch by: Simon Pilgrim @RKSimon!
Differential Revision: https://reviews.llvm.org/D85446
One of the callers only wants the condition, but the vselect can
be simplified by getNode making it hard or impossible to retrieve
the condition.
Instead, return the condition and make the other 2 callers
responsible for creating the vselect node using the condition.
Rename the function to WidenVSELECTMask accordingly.
Differential Revision: https://reviews.llvm.org/D85468
The following bpf linux kernel selftest failed with latest
llvm:
$ ./test_progs -n 7/10
...
The sequence of 8193 jumps is too complex.
verification time 126272 usec
stack depth 320
processed 114799 insns (limit 1000000)
...
libbpf: failed to load object 'pyperf600_nounroll.o'
test_bpf_verif_scale:FAIL:110
#7/10 pyperf600_nounroll.o:FAIL
#7 bpf_verif_scale:FAIL
After some investigation, I found the following llvm patch
https://reviews.llvm.org/D84108
is responsible. The patch disabled hoisting common instructions
in SimplifyCFG by default. Later on, the code changes and a
SimplifyCFG phase with hoisting on cannot do the work any more.
A test is provided to demonstrate the problem.
The IR before simplifyCFG looks like:
for.cond:
%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
%cmp = icmp ult i32 %i.0, 6
br i1 %cmp, label %for.body, label %for.cond.cleanup
for.cond.cleanup:
%2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%cmp2 = icmp eq i8* %2, null
%conv = zext i1 %cmp2 to i32
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3
ret i32 %conv
for.body:
%3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%tobool.not = icmp eq i8* %3, null
br i1 %tobool.not, label %for.inc, label %land.lhs.true
The first two insns of `for.cond.cleanup` and `for.body`, load and
icmp, can be hoisted to `for.cond` block. With Patch D84108, the
optimization is delayed. But unfortunately, later on loop rotation
added addition phi nodes to `for.body` and hoisting cannot
be done any more.
Note such a hoisting is beneficial to bpf programs as
bpf verifier does path sensitive analysis and verification.
The hoisting preverts reloading from stack which will assume
conservative value and increase exploited insns. In this case,
it caused verifier failure.
To fix this problem, I added an IR pass from bpf target
to performance additional simplifycfg with hoisting common inst
enabled.
Differential Revision: https://reviews.llvm.org/D85434