This includes integer arithmetic of various kinds (add/sub/multiply,
saturating and not), and the immediate forms of VMOV and VMVN that
load an immediate into all lanes of a vector.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62674
llvm-svn: 363936
AMDGPU target needs to override getRegClass() used during
instruction selection. We now may have either 32 or 64 bit
conditional registers used in the same instructions. For
that purpose special SReg_1 register class is created which
is dynamically resolved to either SReg_64 or SGPR_32 depending
on the subtarget attributes.
Differential Revision: https://reviews.llvm.org/D63205
llvm-svn: 363931
ELFState<ELFT>::addSymbols method looks a bit strange.
User code have to create the destination symbols vector outside,
add a null symbol and then pass it to addSymbols when it seems
the more natural logic is to isolate all work with symbols inside some
function, build the list right there and return it.
Differential revision: https://reviews.llvm.org/D63493
llvm-svn: 363930
We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc.
llvm-svn: 363924
Nothing of these tests made much sense. Loops were iterating too much, and I
also don't think it was actually testing anything. I think we simply want to
check that AEK_SOME_EXT returns "+some_ext".
I've given the AArch64 tests the same treatment as they very similarly didn't
made any sense either.
This fixes PR42316.
Differential Revision: https://reviews.llvm.org/D63569
llvm-svn: 363913
The caller of this is looking for comparisons of the input
to these instructions with 0. But the memory instructions
input is an addess not a value input in a register.
llvm-svn: 363907
Introducing VCC defs during SIFixSGPRCopies is generally
problematic. Avoid it by starting with the VOP3 form with the general
condition register. This is the easiest to fix instance, but doesn't
solve any specific problems I'm looking at.
llvm-svn: 363904
The ARMDisassembler changes allow changing between ARM and Thumb mode
based on the MCSubtargetInfo, rather than the Target, which simplifies
the other changes a bit.
I'm not really happy with adding more target-specific logic to
tools/llvm-objdump/, but there isn't any easy way around it: the logic
in question specifically applies to disassembling an object file, and
that code simply isn't located in lib/Target, at least at the moment.
Differential Revision: https://reviews.llvm.org/D60927
llvm-svn: 363903
This is incomplete, and ideally these would all be removed, but it's
better to localize them to the subtarget first with comments about
what they're for.
llvm-svn: 363902
Summary:
Stop referring to "numeric expression", using simply the term
"expression" instead. Likewise for numeric operation since operations
are only used in numeric expressions.
Reviewers: jhenderson, jdenny, probinson, arichardson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63500
llvm-svn: 363901
Summary:
Make use of Error and Expected to bubble up diagnostics and force
checking of errors in the callers.
Reviewers: jhenderson, jdenny, probinson, arichardson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63125
llvm-svn: 363900
Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code or someone decides to create i65536 or something.....
llvm-svn: 363887
Simple little utility which takes a opt logfile generated with "opt -print-before-all -print-module-scope -o /dev/null <args> 2&>1", and splits into a series of individual "chunk-X.ll" files. The intended purpose is to help automate one step in failure reduction.
The imagined workflow is:
New crasher bug reported against clang or other frontend
Frontend run with -emit-llvm equivalent and manually confirmed that opt -O2 <emit.ll> crashes
Run this splitter script
Manually map pass name to invocation command (next on the to automate list)
Run bugpoint on last chunk file + manual command
I chose to dump every chunk rather than only the last since miscompile debugging frequently requires either manual step by step reduction, or cross feeding IR into different compiler versions. Not an immediate target, but there may be applications.
Differential Revision: https://reviews.llvm.org/D63461
llvm-svn: 363884
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.
This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)
I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.
I do want to note that I added one bit after the review. When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one. This was theoretically possible with a single exit, but the zero case covered it in practice. With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached. Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done. The solution seemed obvious, so I didn't bother with another round of review.
Differential Revision: https://reviews.llvm.org/D62625
llvm-svn: 363883
Summary:
This is unfortunately needed for correctness, if we are to extend the tolerance of the update API to the way simple loop unswitch is doing cloning.
In simple loop unswitch (as opposed to loop unswitch), not all blocks are cloned. This can create unreachable cloned blocks (no predecessor), which are later cleaned up.
In MemorySSA, the APIs for supporting these kind of updates (clone + update exit blocks), make certain assumption on the integrity of the CFG. When cloning, if something was not cloned, it's values in MemorySSA default to LiveOnEntry. When updating exit blocks, it is safe to assume that we can first insert phis in the blocks merging two clones, then add additional phis in the IDF of the blocks that received phis. This no longer holds true if one of the clones being merged comes from an unreachable block. We'd conservatively need to add all phis before filling in their incoming definitions. In practice this restriction can be relaxed if we clean up trivial phis after the first round of insertion.
Reviewers: george.burgess.iv
Subscribers: jlebar, Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63354
llvm-svn: 363880
Summary:
When computing IDF for insert updates, ensure we use the snapshot CFG offered by GraphDiff.
Caught by D63389.
Reviewers: kuhar, george.burgess.iv
Subscribers: jlebar, Prazek, llvm-commits, Szelethus
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63443
llvm-svn: 363879
The def instruction for the vreg may not match, because it may be
folding through a reg_sequence. The assert was overly conservative and
not necessary. It's not actually important if DefMI really defined the
register, because the fold that will be done cares about the def of
the value that will be folded.
For some reason copies aren't making it through the reg_sequence,
although they should.
llvm-svn: 363876
(Recommit of r363293 which was reverted when a dependent patch was.)
As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called a ExitCount in SCEV code, so be consistent here.
llvm-svn: 363875
Turns out that we can save an instruction by folding the right shift into
the compare.
Differential Revision: https://reviews.llvm.org/D63568
llvm-svn: 363874
This reapplies r363678, using the correct chain for the CopyToReg for
v0. glueCopyToM0 counterintuitively changes the operands of the
original node.
llvm-svn: 363870
Reword the ScalarEvolution::getExitCount comment in the same terminology as used by getBackedgeTakenCount since they're equivelent for single exit loops. Also, strengthen the comment to indicate exiting on the exact iteration specified is guaranteed. Several transforms implicitly rely on this; and the actual implementation checks for it (via dominating latch checks). So, spell out the guarantee in the comment.
llvm-svn: 363867