1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

7555 Commits

Author SHA1 Message Date
Simon Pilgrim
29c4e8764f [X86][SSE] Added SSE41/AVX2 non-temporal tests
Useful for when we add MOVNTDQA support

llvm-svn: 271552
2016-06-02 18:01:21 +00:00
Dimitry Andric
9beb691de0 Only attempt to detect AVG if SSE2 is available
Summary:
In PR29973 Sanjay Patel reported an assertion failure when a certain
loop was optimized, for a target without SSE2 support.  It turned out
this was because of the AVG pattern detection introduced in rL253952.

Prevent the assertion failure by bailing out early in
`detectAVGPattern()`, if the target does not support SSE2.

Also add a minimized test case.

Reviewers: congh, eli.friedman, spatel

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D20905

llvm-svn: 271548
2016-06-02 17:30:49 +00:00
Sanjay Patel
153b260453 [DAG] use getBitcast() to reduce code
Although this was intended to be NFC, the test case wiggle shows a change in
code scheduling/RA caused by a difference in the SDLoc() generation.

Depending on how you look at it, this is the (dis)advantage of exact checking
in regression tests.

llvm-svn: 271526
2016-06-02 16:01:15 +00:00
Simon Pilgrim
e49fac5565 [X86][SSE] Added non-temporal load tests for vector types
These currently lower to regular loads instead of MOVNTDQA

llvm-svn: 271516
2016-06-02 13:51:50 +00:00
Simon Pilgrim
2e72cbb66e [X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm)
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.

Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.

Differential Revision: http://reviews.llvm.org/D20860

llvm-svn: 271510
2016-06-02 10:55:21 +00:00
Craig Topper
f70345a66d [X86] Add AVX 256-bit load and stores to fast isel.
I'm not sure why this was missing for so long.

This also exposed that we were picking floating point 256-bit VMOVNTPS for some integer types in normal isel for AVX1 even though VMOVNTDQ is available. In practice it doesn't matter due to the execution dependency fix pass, but it required extra isel patterns. Fixing that in a follow up commit.

llvm-svn: 271481
2016-06-02 04:19:45 +00:00
Craig Topper
1887664778 [AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked loads.

llvm-svn: 271478
2016-06-02 04:19:36 +00:00
Sanjay Patel
287ac2dfec [x86, AVX2] regenerate checks
llvm-svn: 271434
2016-06-01 21:32:56 +00:00
Michael Kuperstein
1e7dd66dfc [DAG] Improve legalization of INSERT_SUBVECTOR
When the index is known to be constant 0, insert directly into the the low half,
instead of spilling, performing the insert in-memory, and reloading.

Differential Revision: http://reviews.llvm.org/D20763

llvm-svn: 271428
2016-06-01 20:49:35 +00:00
Than McIntosh
a03aeb4a97 Better fix for PR27903.
Summary:
Re-enable lifetime-start-on-first-use for stack coloring,
but explicitly disable it for slots with more than one start
or end lifetime marker.

Bug: 27903

Reviewers: wmi, tejohnson, qcolombet, gbiv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20739

llvm-svn: 271412
2016-06-01 17:55:10 +00:00
Simon Pilgrim
c1ef71aea6 [X86][SSE] Added non-temporal store tests for all 512-bit vector types
llvm-svn: 271393
2016-06-01 13:58:00 +00:00
Simon Pilgrim
8864f582d0 [X86][SSE] Added non-temporal store tests for all 256-bit vector types
Also added KNL AVX-512 checks

llvm-svn: 271391
2016-06-01 13:20:25 +00:00
Simon Pilgrim
577c6a3c29 [X86][SSE] Added non-temporal store tests for all 128-bit integer vector types
llvm-svn: 271389
2016-06-01 13:05:00 +00:00
Michael Zuckerman
e5673d8456 Adding back-end support to two bit scanning intrinsics
Adding LLVM back-end support to two intrinsics dealing with bit scan: _bit_scan_forward and _bit_scan_reverse.
Their functionality is as described in Intel intrinsics guide:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_forward&expand=371,370
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_reverse&expand=371,370

Commit on behalf of Omer Paparo Bivas


Differential Revision: http://reviews.llvm.org/D19915

llvm-svn: 271386
2016-06-01 12:02:37 +00:00
Craig Topper
bc9e4ba942 Revert r271362 "[AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead."
Looks like something isn't quite right still. Also forgot to move the test cases to an autoupgrade test.

llvm-svn: 271363
2016-06-01 05:57:55 +00:00
Craig Topper
734c8343a6 [AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked loads.

llvm-svn: 271362
2016-06-01 05:35:16 +00:00
Kevin B. Smith
21876b39ee [X86]: Add a pattern that uses GR16_ABCD rather than GR32_ABCD to avoid falsely marking whole 32 bit register as live.
Differential Revision: http://reviews.llvm.org/D20649

llvm-svn: 271341
2016-05-31 22:00:12 +00:00
Simon Pilgrim
29a89e51e6 [X86][SSE] Add load-folding patterns for (V)CVTDQ2PD (PR27291)
Added patterns for (V)CVTDQ2PD -> 2f64 loading from a 64-bit source.

llvm-svn: 271269
2016-05-31 12:04:35 +00:00
Igor Breger
dab6232733 [AVX512] Fix intrinsic vcvtps2ph lowering.
Differential Revision: http://reviews.llvm.org/D20788

llvm-svn: 271255
2016-05-31 08:04:21 +00:00
Igor Breger
cc3e7d2dd1 Fix intrinsic vbroadcast{i32|f32}x2 lowering.
Differential Revision: http://reviews.llvm.org/D20780

llvm-svn: 271254
2016-05-31 07:43:39 +00:00
Craig Topper
cb79936a4b [AVX512] Remove masked store intrinsics. Clang now emits generic masked store intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked stores.

llvm-svn: 271245
2016-05-31 01:50:02 +00:00
Saleem Abdulrasool
f85a029a33 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Craig Topper
4f195e8edb [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Craig Topper
9d8b6d7b31 [X86] Use update_llc_test_checks.py to re-generate a test in preparation for an upcoming commit. NFC
llvm-svn: 271234
2016-05-30 22:54:14 +00:00
Simon Pilgrim
9226b7680f [X86][XOP] Split off auto-upgraded xop intrinsics
llvm-svn: 271228
2016-05-30 19:50:56 +00:00
Simon Pilgrim
5e4ee0eb80 [X86][SSE] Renamed pmovxrm tests
These aren't intrinsics anymore - as discussed on D20686

llvm-svn: 271226
2016-05-30 19:14:37 +00:00
Simon Pilgrim
15ddf25bef [X86][AVX2] Regenerated AVX2 extension tests
llvm-svn: 271224
2016-05-30 18:49:57 +00:00
Simon Pilgrim
83a5e7cad9 [X86][SSE] Updated storeu fast-isel tests to match clang builtin tests
Since rL271214 the headers have no longer used the storeu intrinsic

llvm-svn: 271222
2016-05-30 18:42:51 +00:00
Simon Pilgrim
97b3734c4d [X86][SSE2] Updated _mm_store_pd1/_mm_store1_pd fast-isel tests to match D20617
llvm-svn: 271220
2016-05-30 18:18:44 +00:00
Simon Pilgrim
6ec0f7efbc [X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 271131
2016-05-28 18:03:41 +00:00
Sanjay Patel
fc1048d379 [x86] avoid printing unnecessary sign bits of hex immediates in asm comments (PR20347)
It would be better to check the valid/expected size of the immediate operand, but this is
generally better than what we print right now.

Differential Revision: http://reviews.llvm.org/D20385

llvm-svn: 271114
2016-05-28 14:58:37 +00:00
Ahmed Bougacha
396b16d3af [X86] Try to zero elts when lowering 256-bit shuffle with PSHUFB.
Otherwise we fallback to a blend of PSHUFBs later on.

Differential Revision: http://reviews.llvm.org/D19661

llvm-svn: 271113
2016-05-28 14:38:04 +00:00
Michael Kuperstein
ac24107aac [X86] Detect SAD patterns and emit psadbw instructions.
This recommits r267649 with a fix for PR27539.

Differential Revision: http://reviews.llvm.org/D20598

llvm-svn: 271033
2016-05-27 18:53:22 +00:00
Simon Pilgrim
67f884d956 [X86][AVX] Removed some remains of old (pre-regeneration) filechecks
llvm-svn: 271007
2016-05-27 15:56:19 +00:00
Than McIntosh
4e1e347d86 Disable lifetime-start-on-first-use analysis.
Summary:
Turn off lifetime-start-on-first-use enhancement for the moment
pending a fix for bug 27903.

Bug: 27903

Reviewers: tejohnson, wmi, qcolombet, gbiv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20731

llvm-svn: 271003
2016-05-27 15:27:51 +00:00
Simon Pilgrim
99e3cf65ff Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
llvm-svn: 270976
2016-05-27 09:02:25 +00:00
Simon Pilgrim
c8925e270b [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 270973
2016-05-27 08:49:15 +00:00
Mitch Bodart
8ebfa4a476 [CodeGen] Fix problem with X86 byte registers in CriticalAntiDepBreaker
CriticalAntiDepBreaker was not correctly tracking defs of the high X86 byte
registers, leading to incorrect use of a busy register to break an
antidependence.

Fixes pr27681, and its duplicates pr27580, pr27804.

Differential Revision: http://reviews.llvm.org/D20456

llvm-svn: 270935
2016-05-26 23:08:52 +00:00
Simon Pilgrim
3ed5f417f8 [X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input vector to the lower 128-bit subvector.
Most often as not this is what it started out as, the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.

llvm-svn: 270858
2016-05-26 15:40:36 +00:00
Simon Pilgrim
91342a593e [X86][SSE] Added load_zext_16i8_to_8i32 test
Odd issue with input vector not being folded into pmovzx on AVX2+ targets

llvm-svn: 270852
2016-05-26 14:45:30 +00:00
Igor Breger
d6da40dfa4 [AVX512] Fix intrinsic cmp{sd|ss} lowering.
Differential Revision: http://reviews.llvm.org/D20615

llvm-svn: 270843
2016-05-26 12:42:25 +00:00
Simon Pilgrim
bc104367db [X86][F16C] Added F16C fast-isel tests to match clang/test/CodeGen/f16c-builtins.c
llvm-svn: 270837
2016-05-26 10:26:56 +00:00
Simon Pilgrim
e303ca3030 [X86][AVX2] Added gather fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270835
2016-05-26 10:07:05 +00:00
Simon Pilgrim
e9a179168b [X86][SSE41] Removed pblendw intrinsics tests - they are auto-upgraded
Equivalent tests included in sse41-intrinsics-x86-upgrade.ll - the i8/i32 immediate diff doesn't matter anymore

llvm-svn: 270767
2016-05-25 21:27:58 +00:00
Simon Pilgrim
0da4a2c2e6 [X86][SSE41] Regenerated intrinsics tests
llvm-svn: 270764
2016-05-25 21:21:51 +00:00
Simon Pilgrim
5142359484 [X86][SSE41] Removed blendpd/blendps intrinsics tests - they are auto-upgraded
Equivalent tests included in sse41-intrinsics-x86-upgrade.ll

llvm-svn: 270761
2016-05-25 21:06:36 +00:00
Simon Pilgrim
0792d32b20 [X86][AVX2] Regenerate avx2 vector shift tests
llvm-svn: 270756
2016-05-25 21:00:40 +00:00
Rafael Espindola
2224908028 Fix shouldAssumeDSOLocal for private linkage.
llvm-svn: 270746
2016-05-25 19:55:16 +00:00
Hal Finkel
c1f6823ee0 [SDAG] Add a fallback multiplication expansion
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

llvm-svn: 270720
2016-05-25 16:50:22 +00:00
Sanjay Patel
289425eb9f [x86, AVX] allow explicit calls to VZERO* to modify state in VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't the bug completely.

Differential Revision: http://reviews.llvm.org/D20529

llvm-svn: 270718
2016-05-25 16:39:47 +00:00