To unblock the builders this disables a test for which the CHECK lines
need to be updated. The patch causing the failure was not reverted
because it is needed for a different problem we are investigating. Here
we just need to update the CHECK lines which will happen in the
meantime.
This patch adds a cache that is valid only for the duration of a call
to getKnownBits. With such short lived cache we avoid all the problems
of cache invalidation while still getting the benefits of reusing
the information we already computed.
This cache is useful whenever an instruction occurs more than once
in a chain of computation.
E.g.,
v0 = G_ADD v1, v2
v3 = G_ADD v0, v1
Previously we would compute the known bits for:
v1, v2, v0, then v1 again and finally v3.
With the patch, now we won't have to recompute v1 again.
NFC
Summary:
Blocks in a loop can be in any order as long as the loop header is the
first block in Blocks.
With some order of Blocks, cloneLoopWithPreheader would trigger the
assertion in addBasicBlockToLoop.
Example:
define void @test(i64 %N) {
preheader.i:
br label %header.i
header.i:
%i = phi i64 [ 0, %preheader.i ], [ %inc.i, %latch.i ]
br label %header.j
header.j:
%j = phi i64 [ 0, %header.i ], [ %inc.j, %latch.j ]
br label %header.k
header.k:
%k = phi i64 [ 0, %header.j ], [ %inc.k, %latch.k ]
call void @baz(i64 %i, i64 %j, i64 %k)
br label %latch.k
latch.k:
%inc.k = add nsw i64 %k, 1
%cmp.k = icmp slt i64 %inc.k, %N
br i1 %cmp.k, label %header.k, label %latch.j
latch.j:
%inc.j = add nsw i64 %j, 1
%cmp.j = icmp slt i64 %inc.j, %N
br i1 %cmp.j, label %header.j, label %latch.i
latch.i:
%inc.i = add nsw i64 %i, 1
%cmp.i = icmp slt i64 %inc.i, %N
br i1 %cmp.i, label %header.i, label %exit.i
exit.i:
ret void
}
declare void @baz(i64, i64, i64)
If the blocks of loop-i is in the order: header.i, latch.k, header.k,
header.j, latch.j, latch.i,
then cloneLoopWithPreheader would trigger the assertion in
addBasicBlockToLoop
assert(contains(SameHeader) && getHeader() == SameHeader->getHeader() &&
"Incorrect LI specified for this loop!");
As latch.k is in both loop-j and loop-k, it would be set as the header
of both loops after adding latch.k.
If we update loop headers during cloning blocks, then after adding
header.k,
the header of loop-k would be updated with header.k,
while the header of loop-j stays as latch.k.
When adding header.j, SameHeader is loop-k, SameHeader->getHeader() is
header.k, but getHeader() is latch.k, which trigger the assertion.
Reviewer: jdoerfert, Meinersbur, fhahn, kbarton, hfinkel, bmahjour,
etiotto
Reviewed By: Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D74382
We want flag users to check individual fast-math flags,
not that all of them are set. This was also probably
not working as intended because NoFPExcept isn't always
set on non-strict nodes.
In this diff we change the storage of a section to unique_ptr.
This refactoring was factored out from D71647.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D74946
Add support for DestructiveBinaryComm DestructiveInstType, as well as the lowering code to expand the new Pseudos into the final movprfx+instruction pairs.
Differential Revision: https://reviews.llvm.org/D73711
This should be the last step in the current cleanup.
Follow-ups should resolve the TODO about cost calc
and enable the more general case where we extract
different elements.
This moves all the logic of converting LLVM Triples to
MachO::CPU_(SUB_)TYPE from the specific target (Target)AsmBackend to
more convenient functions in lib/BinaryFormat.
This also gets rid of the separate two X86AsmBackend classes.
The previous attempt was to add it to libObject, but that adds an
unnecessary dependency to libObject from all the targets.
Differential Revision: https://reviews.llvm.org/D74808
isPrefix was added to support the patches to align branches.
it relies on a switch over instruction names.
This moves those opcodes to a new format so the information is
tablegen and we can just check for a specific value in some bits
in TSFlags instead.
I've left the other function in place for now so that the
existing patches in phabricator will still work. I'll work with
the owner to get them migrated.
Summary:
Added register + immediate and register + register addressing modes for the following intrinsics:
1. Masked load and stores:
* Sign and zero extended load and truncated stores.
* No extension or truncation.
2. Masked non-temporal load and store.
Reviewers: andwar, efriedma
Subscribers: cameron.mcinally, sdesmalen, tschuett, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74254
The legalizer helper functions are unusably awkward to perform the 3-5
part legalization. This needs to be widened, scalarized, lowered, and
we should avoid creating vector extends and truncates. Manually do all
of this and expand.
I tried to use some of the new tablegen features to avoid creating
different operand list permutations, but I still don't see a way to
programmatically build a source pattern dag.
Also add GlobalISel tests, which now all import successfully.
Some of the fneg fold tests are incorrect, which need to be fixed in a
future commit
G_SHUFFLE_VECTOR is legal since it theoretically may help match op_sel
for VOP3P instructions. Expand it in some other way in case it doesn't
fold into the use instructions.
Summary:
When we have a long name for the undefined symbol, we would hit this assertion:
Assertion failed: I != StringIndexMap.end() && "String is not in table!"
This patch addresses that.
Reviewed by: DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D74924
This fixes a small mistake from D72944: The worklist add should
happen before assigning the new operand, not after.
In case an actual replacement happens, the old operand needs to
be added for DCE. If no actual replacement happens, then old/new
are the same, so it doesn't matter.
This drops one iteration from the annotated test case.
Summary:
This prevents BFI queries on new blocks (from
MachineSinking::GetAllSortedSuccessors) and fixes a bunch of assert failures
under -check-bfi-unknown-block-queries=true.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74511
The Blocks runtime provide a header named Block.h.
It is generally preferable to avoid name collision with system headers
(reducing reliance on -isystem order, more friendly when navigating files in
an editor, etc).
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D74934
Followup to D73919 with another batch of replacements of
setOperand() -> replaceOperand(), to make sure the old
operand gets DCEd right away.
Differential Revision: https://reviews.llvm.org/D74932
ToVectorTy is defined and used in multiple places. Hoist it to
VectorUtils.h to avoid duplication and improve re-usability.
Reviewers: rengolin, hsaito, Ayal, gilr, fpetrogalli
Reviewed By: fpetrogalli
Differential Revision: https://reviews.llvm.org/D74959
This changes the SimplifyLibCalls utility to accept an IRBuilderBase,
which allows us to pass through the IRBuilder used by InstCombine.
This will ensure that new instructions get added to the worklist.
The annotated test-case drops from 4 to 2 InstCombine iterations thanks
to this.
To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard,
which is basically the same as the existing InsertPointGuard and
FastMathFlagsGuard, but for operand bundles. Also add a
setDefaultOperandBundles() method so these can be set outside the
constructor.
Differential Revision: https://reviews.llvm.org/D74792
A cost query for a vector instruction should return a cost even without
target vector support, and not trigger an assert.
VectorCombine does this with an input containing source code vectors.
Review: Ulrich Weigand
We don't use this, and matching from the def doesn't make much sense.
There are multiple tablegen bugs with default operand
handling. undef_tied_input should work to handle the vdst_in
correctly, but this breaks the operand register class constraint which
it should be able to infer.
Can be used like
-debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20
Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D72147
Add +fullfp16 to sve-vector-splat.ll so we can test folding of immediates into moves.
This attribute can go away later when SVE has a full set of fp16 patterns in place.
Differential Revision: https://reviews.llvm.org/D74965
We should try the generated matchers before the manual selection. This
means the patterns are now handling the common cases, but the manual
selection code is not yet dead. It's still handling the non-s32/s64
cases (like v2s16 and v2s32). Currently tablegen doesn't have a nice
way to have a single pattern that covers multiple types.