This is similar to what's done in computeKnownBits and computeSignBits. Don't do anything fancy; just collect information valid for any element.
Differential Revision: https://reviews.llvm.org/D43789
llvm-svn: 326237
We were always setting the block alignment to 2 bytes in Thumb mode
and 4 bytes in ARM mode (r325754 and r325012), but this could reduce
the alignment of a block that was already more strictly aligned (e.g.
in Thumb mode when the block is a CPE that was already 4-byte aligned).
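The intended behaviour can be sketched as taking the maximum of the existing and the mode-required alignment. This is a minimal standalone illustration, not the code from the patch; the helper name and the use of byte counts (rather than whatever unit the backend stores) are assumptions.

```cpp
#include <algorithm>

// Hypothetical helper, not the ARM backend's API.
// Alignments are expressed in bytes: Thumb needs at least 2, ARM at least 4.
unsigned adjustedBlockAlignment(unsigned currentAlignBytes, bool isThumb) {
  const unsigned minAlignBytes = isThumb ? 2u : 4u;
  // Only ever raise the alignment; never lower one that is already stricter.
  return std::max(currentAlignBytes, minAlignBytes);
}
```

With this shape, a Thumb-mode CPE block that is already 4-byte aligned keeps its 4-byte alignment instead of being reset to 2.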
Patch by Momchil Velikov; I've only added a test.
Differential Revision: https://reviews.llvm.org/D43777
llvm-svn: 326232
Following DW_AT_sibling attributes completely defeats the pruning pass.
Although clang doesn't generate the DW_AT_sibling attribute, we should
still handle it correctly.
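To illustrate why following sibling links defeats pruning, here is a toy reachability walk over a simplified DIE list. The types and the function name are hypothetical and are not the actual data structures of the pruning pass; the only point is that the walk skips sibling references.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified model of DIE reference attributes.
enum class Attr { Type, Sibling };

struct Ref {
  Attr attr;
  std::size_t target;  // index of the referenced DIE
};

struct Die {
  std::vector<Ref> refs;
};

// Marks every DIE reachable from `root` through reference attributes,
// deliberately ignoring Attr::Sibling. A sibling link only says "the next
// DIE is over there"; following it would keep every neighbour alive.
std::vector<bool> markKept(const std::vector<Die>& dies, std::size_t root) {
  assert(root < dies.size());
  std::vector<bool> kept(dies.size(), false);
  std::vector<std::size_t> worklist{root};
  while (!worklist.empty()) {
    const std::size_t cur = worklist.back();
    worklist.pop_back();
    if (kept[cur]) continue;
    kept[cur] = true;
    for (const Ref& r : dies[cur].refs) {
      if (r.attr == Attr::Sibling) continue;  // do not follow sibling links
      assert(r.target < dies.size());
      worklist.push_back(r.target);
    }
  }
  return kept;
}
```

In this model, a DIE referenced only through a sibling link stays unmarked and can be pruned.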
Differential revision: https://reviews.llvm.org/D43439
llvm-svn: 326231
These tables add 3000 lines to X86InstrInfo.cpp, and if we ever manage to auto-generate them they'll be a separate file anyway.
Differential Revision: https://reviews.llvm.org/D43806
llvm-svn: 326225
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.
llvm-svn: 326221
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.
llvm-svn: 326217
Currently when abort is enabled, we get a diagnostic saying "Fallback
path used .... " and the program terminates. To actually figure out what
the reason is, we need to run again with another verbose argument
"-pass-remarks-missed=gisel". Instead, when we are going to abort,
we might as well print expensive remarks.
https://reviews.llvm.org/D43796
llvm-svn: 326215
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
Summary:
Since r325479 the DataLayout includes a program address space. However, it
is not possible to use `call %foo` if foo is a `i8(...) addrspace(200)` and
the DataLayout specifies address space 200 as the address space for functions.
With this change the IR parser will still accept variables in the program
address space as well as address space 0 for call and invoke functions.
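The acceptance rule described above amounts to a simple predicate. The sketch below is illustrative only: the function name is hypothetical and it is not the parser's actual code.

```cpp
// Hypothetical predicate: a callee is acceptable if it lives either in
// address space 0 or in the DataLayout's program address space.
bool isAcceptableCalleeAddrSpace(unsigned calleeAddrSpace,
                                 unsigned programAddrSpace) {
  return calleeAddrSpace == 0 || calleeAddrSpace == programAddrSpace;
}
```

For a DataLayout whose program address space is 200, callees in address space 200 and in address space 0 both pass, while any other address space does not.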
Reviewers: pcc, arsenm, bjope, dylanmckay, theraven
Reviewed By: dylanmckay
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D43645
llvm-svn: 326188
Add test that verifies that we don't follow DWARF values with a
reference form class, such as DW_AT_sibling.
Since clang doesn't generate the latter attribute, we added a PowerPC
test generated on an old PowerBook G4. (Thanks Adrian!)
llvm-svn: 326183
Until this patch, only `powerpc` and `ppc32` were recognized as valid
PowerPC 32-bit architectures in a target triple. This was incompatible
with the triple `ppc-apple-darwin` as returned by libObject. I found
out about this when working on a test case using a binary generated on
an old PowerBook G4.
We had the choice of either fixing this in the Mach-O object parser or
in the Triple implementation. I chose the latter because it feels like
the most canonical place.
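As a rough illustration of the idea, the toy parser below maps the architecture component of a triple to an enum and treats `ppc` as another spelling of 32-bit PowerPC. The enum and function are hypothetical stand-ins, not the Triple implementation.

```cpp
#include <string>

// Hypothetical stand-in for the architecture enum.
enum class Arch { Unknown, PPC32, PPC64 };

// Extracts the first dash-separated component and maps it to an Arch.
Arch parseArch(const std::string& triple) {
  const std::string arch = triple.substr(0, triple.find('-'));
  // "ppc" is accepted alongside the two spellings recognized before.
  if (arch == "powerpc" || arch == "ppc32" || arch == "ppc")
    return Arch::PPC32;
  if (arch == "powerpc64" || arch == "ppc64")
    return Arch::PPC64;
  return Arch::Unknown;
}
```

Handling the alias at the parsing layer means every consumer of the triple benefits, not only the Mach-O reader.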
Differential revision: https://reviews.llvm.org/D43760
llvm-svn: 326182
In case we update a ValuePHI node created earlier, we could update it
based on a different OpPHI which could be in a different block.
We need to update the TempToBlock mapping reflecting the new block,
otherwise we would end up placing the new phi node in a wrong block.
This problem is exposed by the test case in
https://bugs.llvm.org/show_bug.cgi?id=36504.
This patch fixes a slightly simpler problem than in the bug report. In
the bug's reproducer, the additional problem is that we are re-using a
ValuePHI node with too few incoming values for the new OpPHI. If this
patch makes sense, I will follow it up with a patch that creates a new
PHI node if the existing PHI node has a different number of incoming
values.
Reviewers: davide, dberlin
Reviewed By: dberlin
Differential Revision: https://reviews.llvm.org/D43770
llvm-svn: 326181
Since getNode() might not always return the requested opcode, for instance if
called with (ISD::AND, -1) arguments, there should be a check so that
SelectCode() is only called when appropriate.
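The pattern can be sketched with a toy node factory that folds `AND x, -1` to `x`, so the caller has to inspect the result before treating it as an AND. All types and names below are hypothetical; this is not the SelectionDAG API.

```cpp
#include <cstdint>

// Hypothetical miniature of a DAG node.
enum class Opcode { Constant, Register, And };

struct Node {
  Opcode opcode;
  std::int64_t value;
};

// Toy factory: may return something other than the requested opcode,
// because AND with an all-ones constant folds to its left operand.
Node getNode(Opcode requested, const Node& lhs, const Node& rhs) {
  if (requested == Opcode::And && rhs.opcode == Opcode::Constant &&
      rhs.value == -1)
    return lhs;
  return Node{requested, 0};
}

// The guard the caller needs before running opcode-specific selection.
bool hasExpectedOpcode(const Node& n, Opcode expected) {
  return n.opcode == expected;
}
```

A caller would only proceed with AND-specific selection when `hasExpectedOpcode(result, Opcode::And)` holds.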
Review: Ulrich Weigand
llvm-svn: 326178
The only cases I can come up with where this invalidation needs to
happen are when there's a deletion somewhere. If we find more creative
test-cases, we can probably go with another approach mentioned on
PR36529.
Fixes PR36529.
llvm-svn: 326177
It appears that there were many cases where we were directly (through
templates) calling the dtor of MemoryAccess, which is conceptually an
abstract class.
This hasn't been a problem, since the data members of all of the
subclasses of MemoryAccess have been POD. I'm planning on changing that.
:)
llvm-svn: 326175
Set the default value for IgnoreOtherLoops of SCEVInitRewriter::rewrite to true
to be consistent with SCEVPostIncRewriter, which does not have this parameter
but behaves as if it were true.
This is a follow-up for rL326067.
llvm-svn: 326174
AVX512 used to promote v32i1 to v32i8 during legalization when BWI was disabled. So this code was added to improve legalization of v32i1 concat_vectors of v16i1 by extending the v16i1 to v16i8 to avoid scalarization.
X86 has since switched to legalizing v32i1 by splitting to v16i1 instead. This has rendered this code unnecessary and it's no longer exercised.
llvm-svn: 326153
Currently we assert that only non-target-specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands that don't have RegisterClass
constraints (say using unknown_class), in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.
https://reviews.llvm.org/D43409
llvm-svn: 326142
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
Differential Revision: https://reviews.llvm.org/D43733
llvm-svn: 326133
There are still some shortcomings in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.
llvm-svn: 326128
Summary:
We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.
This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.
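A scalar emulation may help show why sign extend plus MOVMSK yields the same integer as bitcasting the mask. The sketch assumes 16 lanes and models the byte-wise MOVMSK in plain C++; the function names are illustrative and this is not the lowering code itself.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Mask16 = std::array<bool, 16>;
using Bytes16 = std::array<std::uint8_t, 16>;

// Sign-extend each i1 lane to i8: true becomes 0xFF, false becomes 0x00.
Bytes16 signExtendLanes(const Mask16& mask) {
  Bytes16 out{};
  for (std::size_t i = 0; i < mask.size(); ++i)
    out[i] = mask[i] ? 0xFF : 0x00;
  return out;
}

// Emulates a byte-wise MOVMSK: bit i of the result is the top bit of byte i.
std::uint32_t moveMask(const Bytes16& bytes) {
  std::uint32_t result = 0;
  for (std::size_t i = 0; i < bytes.size(); ++i)
    result |= static_cast<std::uint32_t>(bytes[i] >> 7) << i;
  return result;
}

// Reference result: what bitcasting v16i1 to a 16-bit integer should give.
std::uint32_t bitcastMask(const Mask16& mask) {
  std::uint32_t result = 0;
  for (std::size_t i = 0; i < mask.size(); ++i)
    result |= static_cast<std::uint32_t>(mask[i] ? 1u : 0u) << i;
  return result;
}
```

For any mask, `moveMask(signExtendLanes(m))` equals `bitcastMask(m)`, which is why no per-element extract is needed.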
Reviewers: spatel, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D43593
llvm-svn: 326119
This change improves incremental rebuild performance on dual Xeon 8168
machines by 54%. This change also improves run time code gen by not
forcing the case values to be lvalues.
llvm-svn: 326109
This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO.
Next is to change the clang driver to pass this
with -fdiagnostics-hotness-threshold.
Differential Revision: https://reviews.llvm.org/D41465
llvm-svn: 326107