Summary:
In PR29973 Sanjay Patel reported an assertion failure when a certain
loop was optimized, for a target without SSE2 support. It turned out
this was because of the AVG pattern detection introduced in rL253952.
Prevent the assertion failure by bailing out early in
`detectAVGPattern()`, if the target does not support SSE2.
Also add a minimized test case.
Reviewers: congh, eli.friedman, spatel
Subscribers: emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D20905
llvm-svn: 271548
Although this was intended to be NFC, the test case wiggle shows a change in
code scheduling/RA caused by a difference in the SDLoc() generation.
Depending on how you look at it, this is the (dis)advantage of exact checking
in regression tests.
llvm-svn: 271526
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.
Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.
Differential Revision: http://reviews.llvm.org/D20860
llvm-svn: 271510
I'm not sure why this was missing for so long.
This also exposed that we were picking floating point 256-bit VMOVNTPS for some integer types in normal isel for AVX1 even though VMOVNTDQ is available. In practice it doesn't matter due to the execution dependency fix pass, but it required extra isel patterns. Fixing that in a follow up commit.
llvm-svn: 271481
When the index is known to be constant 0, insert directly into the the low half,
instead of spilling, performing the insert in-memory, and reloading.
Differential Revision: http://reviews.llvm.org/D20763
llvm-svn: 271428
Summary:
Re-enable lifetime-start-on-first-use for stack coloring,
but explicitly disable it for slots with more than one start
or end lifetime marker.
Bug: 27903
Reviewers: wmi, tejohnson, qcolombet, gbiv
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20739
llvm-svn: 271412
This adds support to the backed to actually support SjLj EH as an exception
model. This is *NOT* the default model, and requires explicitly opting into it
from the frontend. GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.
Addresses PR27749!
llvm-svn: 271244
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.
Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.
Differential Revision: http://reviews.llvm.org/D20686
llvm-svn: 271131
It would be better to check the valid/expected size of the immediate operand, but this is
generally better than what we print right now.
Differential Revision: http://reviews.llvm.org/D20385
llvm-svn: 271114
Summary:
Turn off lifetime-start-on-first-use enhancement for the moment
pending a fix for bug 27903.
Bug: 27903
Reviewers: tejohnson, wmi, qcolombet, gbiv
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20731
llvm-svn: 271003
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.
A companion patch (D20684) removes/auto-upgrade the clang intrinsics.
Differential Revision: http://reviews.llvm.org/D20686
llvm-svn: 270973
CriticalAntiDepBreaker was not correctly tracking defs of the high X86 byte
registers, leading to incorrect use of a busy register to break an
antidependence.
Fixes pr27681, and its duplicates pr27580, pr27804.
Differential Revision: http://reviews.llvm.org/D20456
llvm-svn: 270935
Most often as not this is what it started out as, the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.
llvm-svn: 270858
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
Fixes PR19797.
llvm-svn: 270720
As noted in the review, there are still problems, so this doesn't the bug completely.
Differential Revision: http://reviews.llvm.org/D20529
llvm-svn: 270718