llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	22068445e0	[X86][AVX] Add target shuffle decode support for VBROADCAST Currently we only decode broadcasts from a vector of the same size. llvm-svn: 275823	2016-07-18 17:32:59 +00:00
Craig Topper	187b793c90	[AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables. llvm-svn: 275775	2016-07-18 06:49:32 +00:00
Craig Topper	192c9a001b	[AVX512] Add KADD/KAND/KOR/KXOR to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275771	2016-07-18 06:14:59 +00:00
Craig Topper	696842b228	[X86] Add VPMULLW/D/Q instructions to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275770	2016-07-18 06:14:57 +00:00
Craig Topper	ee440de5cc	[X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275769	2016-07-18 06:14:54 +00:00
Craig Topper	30e4fba167	[X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275768	2016-07-18 06:14:50 +00:00
Craig Topper	324b6001c8	[X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275767	2016-07-18 06:14:47 +00:00
Craig Topper	121c063d30	[X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. Also add all packed fp division instructions. llvm-svn: 275766	2016-07-18 06:14:45 +00:00
Craig Topper	af7075c21c	[X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr. llvm-svn: 275765	2016-07-18 06:14:43 +00:00
Craig Topper	0a7edae8f6	[X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related. llvm-svn: 275764	2016-07-18 06:14:39 +00:00
Craig Topper	937c51bc76	[AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this is a good first step. llvm-svn: 275763	2016-07-18 06:14:34 +00:00
Craig Topper	85b83b8d33	[X86] Fix 80-column violations. NFC llvm-svn: 275762	2016-07-18 06:14:26 +00:00
Simon Pilgrim	e1f863012f	Strip trailing whitespace llvm-svn: 275726	2016-07-17 19:02:27 +00:00
Simon Pilgrim	116932e299	[X86][SSE] lowerVectorShuffleAsPermuteAndUnpack tidyup. NFCI. Moved unpack type determination into TryUnpack lambda. Added missing comment describing lowerVectorShuffleAsPermuteAndUnpack call. llvm-svn: 275708	2016-07-17 15:48:25 +00:00
Guy Blank	9b5ca203ee	test commit llvm-svn: 275703	2016-07-17 12:10:35 +00:00
Craig Topper	2def7ae5d7	[AVX512] Remove CodeGenOnly VBROADCAST m_Int instructions. They can be implemented with patterns selecting existing instructions. NFC llvm-svn: 275671	2016-07-16 03:42:59 +00:00
Nico Weber	37fadfd6e6	Teach fast isel about the win64 calling convention. This mostly just works. Vectorcall rets are still not supported. The win64_eh test change is because fast isel doesn't use rsi for temporary computations, so it doesn't need to be pushed. The test case I'm changing was originally added to test pushes, but by now there are other test cases in that file exercising that code path. https://reviews.llvm.org/D22422 llvm-svn: 275607	2016-07-15 20:18:37 +00:00
Justin Lebar	4964e23787	[SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592	2016-07-15 18:27:10 +00:00
Justin Lebar	700af803a3	[CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand. Summary: Previously we took an unsigned. Hooray for type-safety. Reviewers: chandlerc Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D22282 llvm-svn: 275591	2016-07-15 18:26:59 +00:00
Jacques Pienaar	4ab4ea3179	Rename AnalyzeBranch* to analyzeBranch*. Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564	2016-07-15 14:41:04 +00:00
Simon Pilgrim	a3a6e24eb1	[X86][AVX2] Improve lowerShuffleAsRepeatedMaskAndLanePermute permutation of 64-bit sub-lanes As discussed on PR28136, lowerShuffleAsRepeatedMaskAndLanePermute was attempting to match repeated masks at the 128-bit level and then permute the resultant lanes at the 128-bit (AVX1) or 64-bit (AVX2) sub-lane level. This change allows us to create the repeated masks at the sub-lane level (and then concat them together to create a 128-bit repeated mask) and then select which sub-lane to permute. This has no effect on the AVX1 codegen. Fixes PR28136. llvm-svn: 275543	2016-07-15 09:49:12 +00:00
Simon Pilgrim	590f1569f5	[X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle (reapplied) This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better. This was incorrectly reverted in rL275421 during triage of PR28552. llvm-svn: 275497	2016-07-14 23:05:09 +00:00
Nirav Dave	29b7bfa64f	[X86][MC] Fix bracket expression parsing in intel-style assembly. Only perform struct field check on Identifier tokens. Fixes PR28547. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22361 llvm-svn: 275445	2016-07-14 17:37:05 +00:00
Nico Weber	c1714f06d7	Don't optimize movs to pushes in -O0 builds. https://reviews.llvm.org/D22362 llvm-svn: 275431	2016-07-14 15:40:22 +00:00
Nico Weber	9a98abcd5d	Delete some trailing whitespace. llvm-svn: 275429	2016-07-14 15:07:44 +00:00
Ahmed Bougacha	9d56162be1	[X86] Decode MPX BND registers. We were able to assemble, but not disassemble. Note that fixupRMValue was truncating EA_REG_BND0-3 because we hit the uint8_t max. The control registers were already squarely above it, but I don't think they ever go in .r/m, only in .reg. I also did notice an extra REX.W in our encoding, but I think that's fine. llvm-svn: 275427	2016-07-14 14:53:21 +00:00
Ahmed Bougacha	2e4800fe66	[X86] Don't mark addressing mode operands as "outs". NFC-ish. Nothing in-tree can tell the difference, but it's incorrect: the addressing mode registers aren't what's defined. llvm-svn: 275426	2016-07-14 14:53:17 +00:00
Nico Weber	ab8cedf91d	Revert r275411, it caused PR28552. llvm-svn: 275421	2016-07-14 14:49:35 +00:00
Nico Weber	2cf597abfa	Teach fast isel calls and rets about stdcall. stdcall is callee-pop like thiscall, so the thiscall changes already did most of the work for this. This change only opts stdcall in and adds tests. llvm-svn: 275414	2016-07-14 13:54:26 +00:00
Simon Pilgrim	7cbec99936	Remove trailing whitespace. llvm-svn: 275412	2016-07-14 13:29:23 +00:00
Simon Pilgrim	6806c88fdb	[X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better. llvm-svn: 275411	2016-07-14 13:28:43 +00:00
Simon Pilgrim	783ab10519	[X86][AVX] Add support for narrowing 128-bit+ shuffle mask elements to 64-bits to allow combining Primarily this is to allow blend with zero instead of having to use vperm2f128, but we can use this in the future to deal with AVX512 cases where we need to keep the original element size to correctly fold masked operations. llvm-svn: 275406	2016-07-14 12:58:04 +00:00
Simon Pilgrim	da1433bf2e	[X86][AVX] Add VBROADCASTF128/VBROADCASTI128 shuffle comments support llvm-svn: 275400	2016-07-14 12:07:43 +00:00
Simon Pilgrim	7a627e5944	[X86][AVX2] VBROADCASTSSrr/VBROADCASTSSYrr require AVX2 not AVX llvm-svn: 275391	2016-07-14 10:37:14 +00:00
Craig Topper	d3e63ad239	[AVX512] Implement EXTLOAD lowering with patterns to select existing VPMOVZX instructions instead of creating CodeGenOnly instructions. llvm-svn: 275378	2016-07-14 06:41:34 +00:00
Eli Friedman	f453c38f88	[X86] Fix stupid typo in isel lowering. Apparently someone miscounted the number of zeros in the immediate. Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 . llvm-svn: 275376	2016-07-14 05:48:25 +00:00
Dean Michael Berris	b3cb9bd89d	XRay: Add entry and exit sleds Summary: In this patch we implement the following parts of XRay: - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture. There are some caveats here: 1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time, to be unpacked by default for platforms where XRay is not available yet. 2) The generated section for X86 is different from what is described in the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library. Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D19904 llvm-svn: 275367	2016-07-14 04:06:33 +00:00
Nico Weber	d4fcb8d00e	Teach fast isel about thiscall (and callee-pop) calls. http://reviews.llvm.org/D22315 llvm-svn: 275360	2016-07-14 01:52:51 +00:00
Nico Weber	7706680906	Fix a TODO in X86CallFrameOptimization to not rely on a codegen artifact. This happens to make X86CallFrameOptimization run in -O0 / FastISel builds as well, but it's not clear if the pass should run in that setup. http://reviews.llvm.org/D22314 llvm-svn: 275320	2016-07-13 21:38:27 +00:00
Sanjay Patel	a2decca56b	[x86][SSE/AVX] optimize pcmp results better (PR28484) We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading. One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the test cases is filed as PR28505: https://llvm.org/bugs/show_bug.cgi?id=28505 Differential Revision: http://reviews.llvm.org/D22225 llvm-svn: 275276	2016-07-13 16:04:07 +00:00
Simon Pilgrim	af898a0a8f	[X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask comments llvm-svn: 275272	2016-07-13 15:45:36 +00:00
Simon Pilgrim	4bfe4bab88	[X86][AVX] Add support for target shuffle combining to VPERMILPS variable shuffle mask Added AVX512F VPERMILPS shuffle decoding support llvm-svn: 275270	2016-07-13 15:10:43 +00:00
Simon Pilgrim	5643451e2d	[X86][SSE] Check for lane crossing shuffles before trying to combine to PSHUFB Removes a return-on-fail that was making it tricky to add other variable mask shuffles. llvm-svn: 275262	2016-07-13 12:48:41 +00:00
Craig Topper	1f20f5e1b0	[X86] Remove some seemingly unnecessary patterns that supported vector zext/sext with 256-bit source types producing a 256-bit result. These patterns just extracted the source down to 128-bits to use the instructions. AVX512 seems to have blindly copied them over for VLX, but did not create similar patterns for 512-bit sources. So I'm hoping the backend can't actually produce these cases. llvm-svn: 275240	2016-07-13 02:21:25 +00:00
Simon Pilgrim	673a2f80ff	[X86][AVX] Add support for target shuffle combining to VPERM2F128/VPERM2I128 llvm-svn: 275212	2016-07-12 20:27:32 +00:00
Matthias Braun	6ab8340235	X86FixupBWInsts: No need for forward liveness analysis. With r274952 and r275201 in place there are no cases left where a forward liveness analysis yields different results than a backward one. So we can remove the forward stepping logic. Differential Revision: http://reviews.llvm.org/D22083 llvm-svn: 275204	2016-07-12 19:04:30 +00:00
Craig Topper	d120449666	[AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR. llvm-svn: 275155	2016-07-12 05:27:53 +00:00
Duncan P. N. Exon Smith	7f746b1ebb	X86: Avoid implicit iterator conversions, NFC Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr, mainly by preferring MachineInstr& over MachineInstr and using range-based for loops. llvm-svn: 275149	2016-07-12 03:18:50 +00:00
Nico Weber	4da55bdb13	Teach FastISel about thiscall (and, hence, about callee-pop). http://reviews.llvm.org/D22115 llvm-svn: 275135	2016-07-12 01:30:35 +00:00
Michael Kuperstein	7e6a08b33c	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00


13489 Commits