Summary:
Currently the check is incorrect and the following invalid
instruction is accepted and incorrectly assembled:
vmov.i32 d2, #0x00a500a6
This patch fixes the issue.
Reviewers: olista01, rengolin
Reviewed By: rengolin
Subscribers: SjoerdMeijer, javed.absar, rogfer01, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44460
llvm-svn: 327704
This patch provides an implementation of getArithmeticReductionCost for
AArch64. We can specialize the cost of add reductions since they are computed
using the 'addv' instruction.
Differential Revision: https://reviews.llvm.org/D44490
llvm-svn: 327702
Summary:
This patch adds more checks to the .debug_names validator. Specifically,
they check for:
- buckets claiming to be non-empty but pointing to mismatched hashes
(most consumers would interpret this as an empty bucket, but it
questionable whether the generator meant that)
- hashes that are not reachable from any bucket
- names with incorrect hashes
Together, these checks ensure that any name in the index can be reached
through the hash table using the regular lookup algorithm. We also warn
if we encounter a name index without a hash table.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44433
llvm-svn: 327699
This patch considers the experimental vector reduce intrinsics in the default
implementation of getIntrinsicInstrCost. The cost of these intrinsics is
computed with getArithmeticReductionCost and getMinMaxReductionCost. This patch
also adds a test case for AArch64 that indicates the costs we currently compute
for vector reduce intrinsics. These costs are inaccurate and will be updated in
a follow-on patch.
Differential Revision: https://reviews.llvm.org/D44489
llvm-svn: 327698
This implements lowering of SELECT_CC for f16s, which enables
codegen of VSEL with f16 types.
Differential Revision: https://reviews.llvm.org/D44518
llvm-svn: 327695
Previously if getSetccResultType returned an illegal type we just fell back to using the default promoted type. This appears to have been to handle the case where for vectors getSetccResultType returns the input type, but the input type itself isn't legal and will need to be promoted. Without the legality check we would never reach a legal type.
But just picking the promoted type to be the setcc type can create strange setccs where the result type is 128 bits and the operand type is 256 bits. If for example the result type was promoted to v8i16 from v8i1, but the input type was promoted from v8i23 to v8i32. We currently handle this with custom lowering code in X86.
This legality check also caused us reject the getSetccResultType when the input type needed to be widened or split. Even though that result wouldn't have caused legalization to get stuck.
This patch tries to fix this by detecting the getSetccResultType needs to be promoted. If its input type also needs to be promoted we'll try a ask for a new setcc result type based on its eventual promoted value. Otherwise we fall back to default type to promote to.
For any other illegal values we might get back from the initial call to getSetccResultType we just keep and allow it to be re-legalized later via splitting or widening or scalarizing.
llvm-svn: 327683
YMM FDiv/FSqrt are dispatched on pipe JFPU1 but should be performed on the JFPM unit - that is where most of the cycles are spent.
This matches the pipes for WriteFSqrt/WriteFDiv definitions.
llvm-svn: 327682
This test was originally disabled because it was failing on a bot.
It turns out I had run dos2unix on the file, and that removed a
necessary byte from the file. I'm just recomitting the proper
file and updating the test to test a little bit more now.
llvm-svn: 327679
There was some code that tried to calculate the number of 4-byte
words required to hold N bits, but it was instead computing the
number of bytes required to hold N bits. This was leading to
extraneous data being output into the hash table, which would
cause certain operations in DIA (the Microsoft PDB reader) to
fail.
llvm-svn: 327675
If the loop body contains conditions of the form IndVar < #constant, we
can remove the checks by peeling off #constant iterations.
This improves codegen for PR34364.
Reviewers: mkuper, mkazantsev, efriedma
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D43876
llvm-svn: 327671
We were unnecessarily copying a bunch of these FunctionInfo objects
around when rehashing the DenseMap.
Furthermore, r327620 introduced pointers referring to objects owned by
FunctionInfo, and the default copy ctor did the wrong thing in this
case, leading to use-after-free when the DenseMap gets rehashed.
I will rebase r327620 on this next and recommit it.
llvm-svn: 327665
It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patches extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest *is* known to be taken on the first iteration.
This case comes up in two major ways:
* If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration.
* Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated.
The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate.
Differential Revision: https://reviews.llvm.org/D44287
llvm-svn: 327664
The FADD part of the addsub/subadd pattern can have its operands commuted, but when checking for fsubadd we were using the fadd as reference and commuting the fsub node.
llvm-svn: 327660
The code that creates fmsubadd from shuffle vector has some code to allow commuting the operands of the fadd node. This code was originally created when we only recognized fmaddsub. When fmsubadd support was added this code was not updated and is now commuting the fsub operands instead.
llvm-svn: 327659
If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well.
Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip!
llvm-svn: 327655
PR35402 triggered this case. It bswap and stores a 48bit value, current STBRX optimization transforms it into STBRX. Unfortunately 48bit is not a simple MVT, there is no PPC instruction to support it, and it can't be automatically expanded by llvm, so caused a crash.
This patch detects the non-simple MVT and returns early.
Differential Revision: https://reviews.llvm.org/D44500
llvm-svn: 327651
Rather than enumerating all specific types, for the DAG combine we can just use TLI::isTypeLegal and an SSE3 check. For the BUILD_VECTOR version we already know the type is legal so we just need to check SSE3.
llvm-svn: 327649
It previously only worked when the key and value types were
both 4 byte integers. We now have a use case for a non trivial
value type, so we need to extend it to support arbitrary value
types, which means templatizing it.
llvm-svn: 327647
This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with?
Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well.
Differential Revision: https://reviews.llvm.org/D44497
llvm-svn: 327646
Now both method DispatchUnit::checkRAT() and DispatchUnit::canDispatch take as
input an Instruction refrence instead of an instruction descriptor.
This was requested by Simon in D44488 to simplify the diff.
llvm-svn: 327640
This patch adds new load/store instructions for integer scalar types
which can be used for X-Form when fed by add with an @tls relocation.
Differential Revision: https://reviews.llvm.org/D43315
llvm-svn: 327635
As discussed on D44428 and PR36726, this patch splits off WriteFMove/WriteVecMove, WriteFLoad/WriteVecLoad and WriteFStore/WriteVecStore scheduler classes to permit vectors to be handled separately from gpr/scalar types.
I've minimised the diff here by only moving various basic SSE/AVX vector instructions across - we can fix the rest when called for. This does fix the MOVDQA vs MOVAPS/MOVAPD discrepancies mentioned on D44428.
Differential Revision: https://reviews.llvm.org/D44471
llvm-svn: 327630