llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	a05b70241c	Don't check liveness of unallocatable registers. This includes registers like EFLAGS and ST0-ST7. We don't check for liveness issues in the verifier and scavenger because registers will never be allocated from these classes. While in SSA form, we do care about the liveness of unallocatable unreserved registers. Liveness of EFLAGS and ST0 needs to be correct for MachineDCE and MachineSinking. llvm-svn: 136541	2011-07-29 23:36:21 +00:00
Eric Christopher	96b31d5681	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Bruno Cardoso Lopes	871df895f4	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	2b3d85d81c	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	473d982caf	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Jakob Stoklund Olesen	cc29034b4c	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	f97f492104	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	5f429460ba	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Bruno Cardoso Lopes	e24a043703	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instructions for several AVX codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	1f63a37172	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	06d8be564f	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes	8830fde434	The vpermilps and vpermilpd instructions behave differently regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same one used by the low part, while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Devang Patel	e85a416d4e	It is quite possible that an inlined function body is split into multiple chunks of consecutive instructions. But there is no way to describe this in the .debug_inline accelerator table used by gdb. However, describe non-contiguous ranges of an inlined function body appropriately using AT_range of the DW_TAG_inlined_subroutine debug info entry. llvm-svn: 136196	2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen	3f729850d3	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	32a2ce8416	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	bfc2dfe3f7	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	e53bb853ea	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	4e16c5341a	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	906ecb46ed	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	8779017138	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	e52bee3cc9	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	ab40a57cce	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	c94d6a2d2c	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128. This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	9380919dc5	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Eli Friedman	dc213dadcc	Attempt to fix test failure reported on llvm-commits. llvm-svn: 135995	2011-07-25 22:28:51 +00:00
Eli Friedman	99fd6d41b5	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Eli Friedman	234bbb2b95	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen	0e4f7f92a2	Correctly handle <undef> tied uses when rewriting after a split. This fixes PR10463. A two-address instruction with an <undef> use operand was incorrectly rewritten so the def and use no longer used the same register, violating the tie constraint. Fix this by always rewriting <undef> operands with the register a def operand would use. llvm-svn: 135885	2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes	7347599e42	Fix test check! llvm-svn: 135802	2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes	50a38b479a	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Rafael Espindola	0c8190c4a3	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes	b7b9688aa5	- Inspected an AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced because we don't want duplicate nodes for 128 AVX and non-AVX modes, the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	85357a460f	Although we already support this, add testcases for consistency llvm-svn: 135728	2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes	1ee6122518	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	3691063149	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	ba1a2a9135	Add support for 256-bit versions of the VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation as the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Devang Patel	9914fe1aca	While emitting a constant value, look through the derived type and use the underlying basic type to determine the size and signedness of the constant value. llvm-svn: 135627	2011-07-20 21:57:04 +00:00
Eli Friedman	3af0eb7b5f	PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS. llvm-svn: 135595	2011-07-20 18:14:33 +00:00
Evan Cheng	380dc98371	Add MCObjectFileInfo and sink the MCSections initialization code from TargetLoweringObjectFileImpl down to MCObjectFileInfo. TargetAsmInfo is down to one last method. It's almost gone! llvm-svn: 135569	2011-07-20 05:58:47 +00:00
Eric Christopher	7510091996	New pointer rotate test. llvm-svn: 135562	2011-07-20 03:09:11 +00:00
Akira Hatanaka	a50bbdfe15	Lower memory barriers to sync instructions. llvm-svn: 135537	2011-07-19 23:30:50 +00:00
Evan Cheng	9a80b0a7e6	Fix an obvious typo that's preventing x86 (32-bit) from using .literal16. llvm-svn: 135535	2011-07-19 23:14:32 +00:00
Akira Hatanaka	14e517df43	Use the correct opcodes: SLLV/SRLV or AND must be used instead of SLL/SRL or ANDi, when the instruction does not have any immediate operands. llvm-svn: 135520	2011-07-19 20:34:00 +00:00
Akira Hatanaka	f59cbeec14	Remove redundant instructions. - In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the instruction being expanded, instead of masking it in thisMBB. - Remove redundant Or in EmitAtomicCmpSwap. llvm-svn: 135495	2011-07-19 18:14:26 +00:00
Richard Osborne	b469141419	Add intrinsics for the zext / sext instructions. llvm-svn: 135476	2011-07-19 13:28:50 +00:00
Richard Osborne	50303e0d38	Add intrinsics for the testct, testwct instructions. llvm-svn: 135475	2011-07-19 13:00:40 +00:00
Richard Osborne	409c0d7768	Add intrinsics for the peek and endin instructions. llvm-svn: 135474	2011-07-19 12:50:25 +00:00
Evan Cheng	bfc0cac54d	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Devang Patel	72886ba8d8	Revert r135423. llvm-svn: 135454	2011-07-19 00:28:24 +00:00
Eli Friedman	887bb0b25a	FileCheck-ize a couple tests. llvm-svn: 135427	2011-07-18 21:23:42 +00:00
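
Note on the VPERMIL commits above (8830fde434 and ba1a2a9135): the difference between the two mask layouts can be sketched with a small model. This is only an illustration of the immediate forms as described in those messages, not the LLVM code itself; the function names are hypothetical, and each vector is modelled as a Python list with index 0 as the lowest element.

```python
def vpermilps_imm(src, imm8):
    """Model of 256-bit vpermilps with an immediate (assumed semantics).

    src holds 8 x 32-bit elements. The immediate carries four 2-bit
    selectors, and the same four selectors are applied to both 128-bit lanes.
    """
    assert len(src) == 8
    out = []
    for lane_base in (0, 4):              # low lane, then high lane
        for i in range(4):
            sel = (imm8 >> (2 * i)) & 0b11
            out.append(src[lane_base + sel])
    return out


def vpermilpd_imm(src, imm8):
    """Model of 256-bit vpermilpd with an immediate (assumed semantics).

    src holds 4 x 64-bit elements. Each destination element has its own
    selector bit, so the two lanes are controlled independently.
    """
    assert len(src) == 4
    out = []
    for i in range(4):
        lane_base = (i // 2) * 2          # elements 0-1 or 2-3
        sel = (imm8 >> i) & 1
        out.append(src[lane_base + sel])
    return out


# $85 is 0b01010101: every selector picks element 1 of its lane (a splat per lane).
splat = vpermilps_imm(list(range(8)), 85)
# 0b0110: keep the low lane as is, swap the two elements of the high lane.
swapped_high = vpermilpd_imm([0, 1, 2, 3], 0b0110)
```

With this model, the `vpermilps $85` from ba1a2a9135 broadcasts element 1 within each lane, while the vpermilpd example changes only the high lane, which the shared-mask form cannot express.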

5015 Commits
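
Note on bfc2dfe3f7 (the two's complement hack for x86): the rewrite relies on the identities -y = ~y + 1 and ~(x ^ k) = x ^ ~k, so c - (x ^ k) equals (x ^ ~k) + (c + 1). The sketch below checks that identity in 32-bit wraparound arithmetic. The function names are hypothetical and this is not the DAG combine itself.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def sub_imm_lhs(c, x, k):
    """Original form: c - (x ^ k), with the immediate on the left of the sub."""
    return (c - (x ^ k)) & MASK32


def add_folded(c, x, k):
    """Rewritten form: fold the negation into the xor and add c + 1."""
    return ((x ^ (~k & MASK32)) + c + 1) & MASK32


# Exhaustive check over a small range of values, including a wrap-around case.
for c in (0, 1, 31, 32, MASK32):
    for k in (0, 1, 31, 0x80000000, MASK32):
        for x in range(64):
            assert sub_imm_lhs(c, x, k) == add_folded(c, x, k)
```

The folded form is an add with an immediate operand, which can be commuted and encoded directly, as the commit message describes.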