MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.
Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.
llvm-svn: 288980
If we don't skip over DEBUG_VALUEs, we get differences between -g and non-g
code.
This fixes PR31242.
Differential Revision: https://reviews.llvm.org/D27485
llvm-svn: 288965
The second operand of an "ri" instruction may be an immediate, but it may
also be a globalvariable, so we should make any assumptions.
This fixes PR31271.
Differential Revision: https://reviews.llvm.org/D27481
llvm-svn: 288964
The tests that already work are folded in InstSimplify, so those
tests should be redundant and we can remove them if they don't
seem worthwhile for completeness.
llvm-svn: 288957
This is now performed more generally by the target shuffle combine code.
Already covered by tests that were originally added in D7666/rL229480 to support combineVectorZext (or VectorZextCombine as it was known then....).
Differential Revision: https://reviews.llvm.org/D27510
llvm-svn: 288918
This patch attempts to scalarize the operand expressions of predicated
instructions if they were conditionally executed in the original loop. After
scalarization, the expressions will be sunk inside the blocks created for the
predicated instructions. The transformation essentially performs
un-if-conversion on the operands.
The cost model has been updated to determine if scalarization is profitable. It
compares the cost of a vectorized instruction, assuming it will be
if-converted, to the cost of the scalarized instruction, assuming that the
instructions corresponding to each vector lane will be sunk inside a predicated
block, possibly avoiding execution. If it's more profitable to scalarize the
entire expression tree feeding the predicated instruction, the expression will
be scalarized; otherwise, it will be vectorized. We only consider the cost of
the entire expression to accurately estimate the cost of the required
insertelement and extractelement instructions.
Differential Revision: https://reviews.llvm.org/D26083
llvm-svn: 288909
In the case of a fully redundant load LI dominated by an equivalent load V, GVN
should always preserve the original debug location of V. Otherwise, we risk to
introduce an incorrect stepping.
If V has debug info, then clearly it should not be modified. If V has a null
debugloc, then it is still potentially incorrect to propagate LI's debugloc
because LI may not post-dominate V.
Differential Revision: https://reviews.llvm.org/D27468
llvm-svn: 288903
We are being inconsistent with these instructions (and all their variants.....) with a random mix of them using the default float domain.
Differential Revision: https://reviews.llvm.org/D27419
llvm-svn: 288902
The non-constant pool version of DecodeVPERMIL2PMask was not offsetting correctly for the second input. I've updated the code to match the implementation in the constant-pool version.
Annoyingly this bug was hidden for so long as it's tricky to combine to useful variable shuffle masks that don't become constant-pool entries.
llvm-svn: 288898
When a function F is inlined, InlineFunction extends the debug location of every
instruction inlined from F by adding an InlinedAt.
However, if an instruction has a 'null' debug location, InlineFunction would
propagate the callsite debug location to it. This behavior existed since
revision 210459.
Revision 210459 was originally committed specifically to workaround the lack of
debug information for instructions inlined from intrinsic functions (which are
usually declared with attributes `__always_inline__, __nodebug__`).
The problem with revision 210459 is that it doesn't make any sort of distinction
between instructions inlined from a 'nodebug' function and instructions which
are inlined from a function built with debug info. This issue may lead to
incorrect stepping in the debugger.
This patch works under the assumption that a nodebug function does not have a
DISubprogram. When a function F is inlined into another function G,
InlineFunction checks if F has debug info associated with it.
For nodebug functions, the InlineFunction logic is unchanged (i.e. it would
still propagate the callsite debugloc to the inlined instructions). Otherwise,
InlineFunction no longer propagates the callsite debug location.
Differential Revision: https://reviews.llvm.org/D27462
llvm-svn: 288895
I believe this is the cause of the failure, but have not been able to confirm. Note that this is a speculative fix; I'm still waiting for a full build to finish as I synced and ended up doing a clean build which takes 20+ minutes on my machine.
llvm-svn: 288886
Requesting metadata for a global is a relatively expensive operation as it
involves a map lookup, but it's one that we need to do relatively frequently in
this pass to collect the list of type metadata nodes associated with a global.
This change improves the performance of type metadata queries by prebuilding
data structures that keep the global together with its list of type metadata,
and changing the pass to use that data structure wherever we were previously
passing global references around.
This change also eliminates some O(N^2) behavior by collecting the list of
globals associated with each type identifier during the first pass over the
list of globals rather than visiting each global to compute that list every
time we add a new type identifier.
Reduces pass runtime on a module containing Chrome's vtables from over 60s
to 0.9s.
Differential Revision: https://reviews.llvm.org/D27484
llvm-svn: 288859