1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00
Commit Graph

13489 Commits

Author SHA1 Message Date
Simon Pilgrim
22068445e0 [X86][AVX] Add target shuffle decode support for VBROADCAST
Currently we only decode broadcasts from a vector of the same size.

llvm-svn: 275823
2016-07-18 17:32:59 +00:00
Craig Topper
187b793c90 [AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables.
llvm-svn: 275775
2016-07-18 06:49:32 +00:00
Craig Topper
192c9a001b [AVX512] Add KADD/KAND/KOR/KXOR to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275771
2016-07-18 06:14:59 +00:00
Craig Topper
696842b228 [X86] Add VPMULLW/D/Q instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275770
2016-07-18 06:14:57 +00:00
Craig Topper
ee440de5cc [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275769
2016-07-18 06:14:54 +00:00
Craig Topper
30e4fba167 [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275768
2016-07-18 06:14:50 +00:00
Craig Topper
324b6001c8 [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275767
2016-07-18 06:14:47 +00:00
Craig Topper
121c063d30 [X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. Also add all packed fp division instructions.
llvm-svn: 275766
2016-07-18 06:14:45 +00:00
Craig Topper
af7075c21c [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr.
llvm-svn: 275765
2016-07-18 06:14:43 +00:00
Craig Topper
0a7edae8f6 [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related.
llvm-svn: 275764
2016-07-18 06:14:39 +00:00
Craig Topper
937c51bc76 [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported.
Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step.

llvm-svn: 275763
2016-07-18 06:14:34 +00:00
Craig Topper
85b83b8d33 [X86] Fix 80-column violations. NFC
llvm-svn: 275762
2016-07-18 06:14:26 +00:00
Simon Pilgrim
e1f863012f Strip trailing whitespace
llvm-svn: 275726
2016-07-17 19:02:27 +00:00
Simon Pilgrim
116932e299 [X86][SSE] lowerVectorShuffleAsPermuteAndUnpack tidyup. NFCI.
Moved unpack type determination into TryUnpack lambda.

Added missing comment describing lowerVectorShuffleAsPermuteAndUnpack call.

llvm-svn: 275708
2016-07-17 15:48:25 +00:00
Guy Blank
9b5ca203ee test commit
llvm-svn: 275703
2016-07-17 12:10:35 +00:00
Craig Topper
2def7ae5d7 [AVX512] Remove CodeGenOnly VBROADCAST m_Int instructions. They can be implemented with patterns selecting existing instructions. NFC
llvm-svn: 275671
2016-07-16 03:42:59 +00:00
Nico Weber
37fadfd6e6 Teach fast isel about the win64 calling convention.
This mostly just works.

Vectorcall rets are still not supported.

The win64_eh test change is because fast isel doesn't use rsi for temporary
computations, so it doesn't need to be pushed. The test case I'm changing was
originally added to test pushes, but by now there are other test cases in that
file exercising that code path.

https://reviews.llvm.org/D22422

llvm-svn: 275607
2016-07-15 20:18:37 +00:00
Justin Lebar
4964e23787 [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar
700af803a3 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Jacques Pienaar
4ab4ea3179 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Simon Pilgrim
a3a6e24eb1 [X86][AVX2] Improve lowerShuffleAsRepeatedMaskAndLanePermute permutation of 64-bit sub-lanes
As discussed on PR28136, lowerShuffleAsRepeatedMaskAndLanePermute was attempting to match repeated masks at the 128-bit level and then permute the resultant lanes at the 128-bit (AVX1) or 64-bit (AVX2) sub-lane level.

This change allows us to create the repeated masks at the sub-lane level (and then concat them together to create a 128-bit repeated mask) and then select which sub-lane to permute. This has no effect on the AVX1 codegen.

Fixes PR28136.

llvm-svn: 275543
2016-07-15 09:49:12 +00:00
Simon Pilgrim
590f1569f5 [X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle (reapplied)
This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.

This was incorrectly reverted in rL275421 during triage of PR28552.

llvm-svn: 275497
2016-07-14 23:05:09 +00:00
Nirav Dave
29b7bfa64f [X86][MC] Fix bracket expression parsing in intel-style assembly.
Only perform struct field check on Identifier tokens.

Fixes PR28547.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22361

llvm-svn: 275445
2016-07-14 17:37:05 +00:00
Nico Weber
c1714f06d7 Don't optimize movs to pushes in -O0 builds.
https://reviews.llvm.org/D22362

llvm-svn: 275431
2016-07-14 15:40:22 +00:00
Nico Weber
9a98abcd5d Delete some trailing whitespace.
llvm-svn: 275429
2016-07-14 15:07:44 +00:00
Ahmed Bougacha
9d56162be1 [X86] Decode MPX BND registers.
We were able to assemble, but not disassemble.

Note that fixupRMValue was truncating EA_REG_BND0-3 because we hit
the uint8_t max.  The control registers were already squarely above
it, but I don't think they ever go in .r/m, only in .reg.

I also did notice an extra REX.W in our encoding, but I think that's
fine.

llvm-svn: 275427
2016-07-14 14:53:21 +00:00
Ahmed Bougacha
2e4800fe66 [X86] Don't mark addressing mode operands as "outs". NFC-ish.
Nothing in-tree can tell the difference, but it's incorrect: the
addressing mode registers aren't what's defined.

llvm-svn: 275426
2016-07-14 14:53:17 +00:00
Nico Weber
ab8cedf91d Revert r275411, it cause PR28552.
llvm-svn: 275421
2016-07-14 14:49:35 +00:00
Nico Weber
2cf597abfa Teach fast isel calls and rets about stdcall.
stdcall is callee-pop like thiscall, so the thiscall changes already did most
of the work for this.  This change only opts stdcall in and adds tests.

llvm-svn: 275414
2016-07-14 13:54:26 +00:00
Simon Pilgrim
7cbec99936 Remove trailing whitespace.
llvm-svn: 275412
2016-07-14 13:29:23 +00:00
Simon Pilgrim
6806c88fdb [X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle
This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.

llvm-svn: 275411
2016-07-14 13:28:43 +00:00
Simon Pilgrim
783ab10519 [X86][AVX] Add support for narrowing 128-bit+ shuffle mask elements to 64-bits to allow combining
Primarily this is to allow blend with zero instead of having to use vperm2f128, but we can use this in the future to deal with AVX512 cases where we need to keep the original element size to correctly fold masked operations.

llvm-svn: 275406
2016-07-14 12:58:04 +00:00
Simon Pilgrim
da1433bf2e [X86][AVX] Add VBROADCASTF128/VBROADCASTI128 shuffle comments support
llvm-svn: 275400
2016-07-14 12:07:43 +00:00
Simon Pilgrim
7a627e5944 [X86][AVX2] VBROADCASTSSrr/VBROADCASTSSYrr require AVX2 not AVX
llvm-svn: 275391
2016-07-14 10:37:14 +00:00
Craig Topper
d3e63ad239 [AVX512] Implement EXTLOAD lowering with patterns to select existing VPMOVZX instructions instead of creating CodeGenOnly instructions.
llvm-svn: 275378
2016-07-14 06:41:34 +00:00
Eli Friedman
f453c38f88 [X86] Fix stupid typo in isel lowering.
Apparently someone miscounted the number of zeros in the immediate.
Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 .

llvm-svn: 275376
2016-07-14 05:48:25 +00:00
Dean Michael Berris
b3cb9bd89d XRay: Add entry and exit sleds
Summary:
In this patch we implement the following parts of XRay:

- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.

There are some caveats here:

1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.

2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.

Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk

Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D19904

llvm-svn: 275367
2016-07-14 04:06:33 +00:00
Nico Weber
d4fcb8d00e Teach fast isel about thiscall (and callee-pop) calls.
http://reviews.llvm.org/D22315

llvm-svn: 275360
2016-07-14 01:52:51 +00:00
Nico Weber
7706680906 Fix a TODO in X86CallFrameOptimization to not rely on a codegen artifact.
This happens to make X86CallFrameOptimization in -O0 / FastISel builds as well,
but it's not clear if the pass should run in that setup.

http://reviews.llvm.org/D22314

llvm-svn: 275320
2016-07-13 21:38:27 +00:00
Sanjay Patel
a2decca56b [x86][SSE/AVX] optimize pcmp results better (PR28484)
We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading.

One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the
same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if
it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the
test cases is filed as PR28505:
https://llvm.org/bugs/show_bug.cgi?id=28505

Differential Revision: http://reviews.llvm.org/D22225

llvm-svn: 275276
2016-07-13 16:04:07 +00:00
Simon Pilgrim
af898a0a8f [X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask comments
llvm-svn: 275272
2016-07-13 15:45:36 +00:00
Simon Pilgrim
4bfe4bab88 [X86][AVX] Add support for target shuffle combining to VPERMILPS variable shuffle mask
Added AVX512F VPERMILPS shuffle decoding support

llvm-svn: 275270
2016-07-13 15:10:43 +00:00
Simon Pilgrim
5643451e2d [X86][SSE] Check for lane crossing shuffles before trying to combine to PSHUFB
Removes a return-on-fail that was making it tricky to add other variable mask shuffles.

llvm-svn: 275262
2016-07-13 12:48:41 +00:00
Craig Topper
1f20f5e1b0 [X86] Remove some seemingly unnecessary patterns that supported vector zext/sext with 256-bit source types producing a 256-bit result.
These patterns just extracted the source down to 128-bits to use the instructions. AVX512 seems to have blindly copied them over for VLX, but did not create similar patterns for 512-bit sources. So I'm hoping the backend can't actually produce these cases.

llvm-svn: 275240
2016-07-13 02:21:25 +00:00
Simon Pilgrim
673a2f80ff [X86][AVX] Add support for target shuffle combining to VPERM2F128/VPERM2I128
llvm-svn: 275212
2016-07-12 20:27:32 +00:00
Matthias Braun
6ab8340235 X86FixupBWInsts: No need for forward liveness analysis.
With r274952 and r275201 in place there are no cases left where a
forward liveness analysis yields different results than a backward one.
So we can remove the forward stepping logic.

Differential Revision: http://reviews.llvm.org/D22083

llvm-svn: 275204
2016-07-12 19:04:30 +00:00
Craig Topper
d120449666 [AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR.
llvm-svn: 275155
2016-07-12 05:27:53 +00:00
Duncan P. N. Exon Smith
7f746b1ebb X86: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and
using range-based for loops.

llvm-svn: 275149
2016-07-12 03:18:50 +00:00
Nico Weber
4da55bdb13 Teach FastISel about thiscall (and, hence, about callee-pop).
http://reviews.llvm.org/D22115

llvm-svn: 275135
2016-07-12 01:30:35 +00:00
Michael Kuperstein
7e6a08b33c [X86] Make some cast costs more precise
Make some AVX and AVX512 cast costs more precise.
Based on part of a patch by Elena Demikhovsky (D15604).

Differential Revision: http://reviews.llvm.org/D22064

llvm-svn: 275106
2016-07-11 21:39:44 +00:00