llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Elena Demikhovsky	40f2c3077f	Fixed a bug in SelectionDAG.cpp. The failure seen on win32, when i64 type is illegal. It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147445	2012-01-03 11:59:04 +00:00
Nadav Rotem	6929a8868b	Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit. llvm-svn: 147426	2012-01-02 08:05:46 +00:00
Craig Topper	f7c9bf17dd	Allow CRC32 instructions to be selected when AVX is enabled. llvm-svn: 147411	2012-01-01 19:51:58 +00:00
Craig Topper	d8ae2d9f27	Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers. llvm-svn: 147409	2012-01-01 19:40:22 +00:00
Rafael Espindola	1cb17796db	Revert 147399. It broke CodeGen/ARM/vext.ll. llvm-svn: 147400	2012-01-01 17:36:23 +00:00
Elena Demikhovsky	9b74049783	Fixed a bug in SelectionDAG.cpp. The failure seen on win32, when i64 type is illegal. It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147399	2012-01-01 16:22:47 +00:00
Craig Topper	0311c45aed	Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. llvm-svn: 147393	2011-12-31 23:24:49 +00:00
Craig Topper	c01ce759d7	Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected. llvm-svn: 147392	2011-12-31 23:15:11 +00:00
Craig Topper	33091db89a	Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms. llvm-svn: 147361	2011-12-30 02:18:36 +00:00
Craig Topper	e066262284	Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere. llvm-svn: 147360	2011-12-30 01:49:53 +00:00
Eli Friedman	db54b4b68f	Fix type-checking for load transformation which is not legal on floating-point types. PR11674. llvm-svn: 147323	2011-12-28 21:24:44 +00:00
Nadav Rotem	d8c4880903	PR11662. Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage. llvm-svn: 147309	2011-12-28 13:08:20 +00:00
Elena Demikhovsky	9b4613ff14	Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. Matching MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests. llvm-svn: 147308	2011-12-28 08:14:01 +00:00
Eli Friedman	064187912e	Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. llvm-svn: 147283	2011-12-26 22:49:32 +00:00
Chandler Carruth	7a5c52fadf	Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the LZCNT instructions are available. Force promotion to i32 to get a smaller encoding since the fix-ups necessary are just as complex for either promoted type. We can't do standard promotion for CTLZ when lowering through BSR because it results in poor code surrounding the 'xor' at the end of this instruction. Essentially, if we promote the entire CTLZ node to i32, we end up doing the xor on a 32-bit CTLZ implementation, and then subtracting appropriately to get back to an i8 value. Instead, our custom logic just uses the knowledge of the incoming size to compute a perfect xor. I'd love to know of a way to fix this, but so far I'm drawing a blank. I suspect the legalizer could be more clever and/or it could collude with the DAG combiner, but how... ;] llvm-svn: 147251	2011-12-24 12:12:34 +00:00
Chandler Carruth	82b7a7478b	Add systematic testing for cttz as well, and fix the bug I spotted by inspection earlier. llvm-svn: 147250	2011-12-24 11:46:10 +00:00
Chandler Carruth	b52ba33d0a	Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test. llvm-svn: 147249	2011-12-24 11:26:59 +00:00
Chandler Carruth	800a803717	Tidy up this rather crufty test. Put the declarations at the top to make my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like nounwind. Remove stray uses comments. Name things semantically rather than tN so that adding a new test in the middle doesn't cause pain, and so that new tests can be grouped semantically. This exposes how little systematic testing is going on here. I noticed this by finding several bugs via inspection and wondering why this test wasn't catching any of them. =[ llvm-svn: 147248	2011-12-24 11:26:57 +00:00
Chandler Carruth	48f5be6ce0	Expand more when we have a nice 'tzcnt' instruction, to avoid generating 'bsf' instructions here. This one is actually debatable to my eyes. It's not clear that any chip implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding. Still, this restores the old behavior with 'tzcnt' enabled for now. llvm-svn: 147246	2011-12-24 11:11:38 +00:00
Chandler Carruth	1846086903	Tidy up some of these tests. llvm-svn: 147245	2011-12-24 11:11:36 +00:00
Chandler Carruth	9ef50ef1f7	Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the X86ISelLowering C++ code. Because this is lowered via an xor wrapped around a bsr, we want the dagcombine which runs after isel lowering to have a chance to clean things up. In particular, it is very common to see code which looks like: (sizeof(x)*8 - 1) ^ __builtin_clz(x) Which is trying to compute the most significant bit of 'x'. That's actually the value computed directly by the 'bsr' instruction, but if we match it too late, we'll get completely redundant xor instructions. The more naive code for the above (subtracting rather than using an xor) still isn't handled correctly due to the dagcombine getting confused. Also, while here fix an issue spotted by inspection: we should have been expanding the zero-undef variants to the normal variants when there is an 'lzcnt' instruction. Do so, and test for this. We don't want to generate unnecessary 'bsr' instructions. These two changes fix some regressions in encoding and decoding benchmarks. However, there is still a lot to be improved on in this type of code. llvm-svn: 147244	2011-12-24 10:55:54 +00:00
Chandler Carruth	514920d53b	Cleanup this test a bit, sorting things and grouping them more clearly. llvm-svn: 147243	2011-12-24 10:55:42 +00:00
Elena Demikhovsky	b37883fe87	This is the second fix related to VZEXT_MOVL node. The failure that I see in the current version is: LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14] 0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13] 0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12] 0x18b9870: v4i64 = undef [ID=4] 0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10] 0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9970: i32 = Constant<0> [ID=3] 0x18b9170: v2i64 = undef [ORD=1] [ID=1] 0x18b9570: i32 = Constant<2> [ID=5] llvm-svn: 146975	2011-12-20 13:34:28 +00:00
Chandler Carruth	7564e8371a	Begin teaching the X86 target how to efficiently codegen patterns that use the zero-undefined variants of CTTZ and CTLZ. These are just simple patterns for now, there is more to be done to make real world code using these constructs be optimized and codegen'ed properly on X86. The existing tests are spiffed up to check that we no longer generate unnecessary cmov instructions, and that we generate the very important 'xor' to transform bsr which counts the index of the most significant one bit to the number of leading (most significant) zero bits. Also they now check that when the variant with defined zero result is used, the cmov is still produced. llvm-svn: 146974	2011-12-20 11:19:37 +00:00
Lang Hames	e32ef23ba8	Make sure that the lower bits on the VSELECT condition are properly set. llvm-svn: 146800	2011-12-17 01:08:46 +00:00
Craig Topper	88e2bfef0a	Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes. llvm-svn: 146726	2011-12-16 08:06:31 +00:00
Chad Rosier	62ebee9859	Add missing zmovl AVX patterns which were causing crashes. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146689	2011-12-15 22:11:31 +00:00
Chad Rosier	e74b3b1469	Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146684	2011-12-15 21:34:44 +00:00
Lang Hames	d5cee672a7	Set specific target cpu for testcase. llvm-svn: 146678	2011-12-15 20:22:34 +00:00
Lang Hames	0e361e816d	Added test case for r146671. llvm-svn: 146675	2011-12-15 19:56:07 +00:00
Eli Friedman	71c0914b64	Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570. llvm-svn: 146630	2011-12-15 02:07:20 +00:00
Chad Rosier	b93733686c	Add support for lowering fneg when AVX is enabled. rdar://10566486 llvm-svn: 146625	2011-12-15 01:02:25 +00:00
Chandler Carruth	2bedf185c9	Manually upgrade the test suite to specify the flag to cttz and ctlz. I followed three heuristics for deciding whether to set 'true' or 'false': - Everything target independent got 'true' as that is the expected common output of the GCC builtins. - If the target arch only has one way of implementing this operation, set the flag in the way that exercises the most of codegen. For most architectures this is also the likely path from a GCC builtin, with 'true' being set. It will (eventually) require lowering away that difference, and then lowering to the architecture's operation. - Otherwise, set the flag differently depending on which target operation should be tested. Let me know if anyone has any issue with this pattern or would like specific tests of another form. This should allow the x86 codegen to just iteratively improve as I teach the backend how to differentiate between the two forms, and everything else should remain exactly the same. llvm-svn: 146370	2011-12-12 11:59:10 +00:00
Evan Cheng	77f0fb0296	Update test to something more sensible. llvm-svn: 146282	2011-12-09 21:54:10 +00:00
Benjamin Kramer	06cd66b1d7	X86: Add patterns for the various rounding ops for SSE4.1 and AVX. llvm-svn: 146257	2011-12-09 15:44:03 +00:00
Evan Cheng	2c8bac6b4c	Forgot setting -march. llvm-svn: 146244	2011-12-09 06:15:00 +00:00
Evan Cheng	ad8debd736	Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417 llvm-svn: 146196	2011-12-08 22:30:45 +00:00
Evan Cheng	d8a73b8918	Add various missing AVX patterns which were causing crashes. Sadly, the generated code looks pretty bad compared to SSE. rdar://10538793 llvm-svn: 146191	2011-12-08 22:05:28 +00:00
Evan Cheng	0e0e920975	Add test for r146163. llvm-svn: 146167	2011-12-08 19:21:39 +00:00
NAKAMURA Takumi	671c1da473	test/CodeGen/X86/vec_compare-2.ll: Add explicit -mtriple=i686-linux. llvm-svn: 146152	2011-12-08 15:24:09 +00:00
Nadav Rotem	341b30a457	Fix a bug in the integer-promotion of bitcast operations on vector types. We must not issue a bitcast operation for integer-promotion of vector types, because the location of the values in the vector may be different. llvm-svn: 146150	2011-12-08 13:10:01 +00:00
Eli Friedman	9e8d557cd1	Support vector bitcasts in the AsmPrinter. PR11495. llvm-svn: 146001	2011-12-07 00:50:54 +00:00
Eli Friedman	5545db0906	Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494. llvm-svn: 145996	2011-12-07 00:11:56 +00:00
Craig Topper	8b05e7d035	Fix a bunch of SSE/AVX patterns to use v2i64/v4i64 loads since all other integer vector loads are promoted to those. llvm-svn: 145927	2011-12-06 09:04:59 +00:00
Craig Topper	72b41227d8	Merge isSHUFPMask and isCommutedSHUFPMask into single function that can do both. Do the same for the 256-bit version. Use loops to reduce size of isVSHUFPYMask. Fix test cases that were incorrectly passing due to isCommutedSHUFPMask not checking for the vector being 128-bit. This caused some 256-bit shuffles to be incorrectly commuted. llvm-svn: 145921	2011-12-06 04:59:07 +00:00
NAKAMURA Takumi	c6a187dfdd	test/CodeGen/X86/pointer-vector.ll: Add explicit -mtriple=i686-linux. llvm-svn: 145805	2011-12-05 07:54:57 +00:00
Nadav Rotem	1a91e4381d	Add support for vectors of pointers. llvm-svn: 145801	2011-12-05 06:29:09 +00:00
Sanjoy Das	fe35e107cd	Check for stack space more intelligently. libgcc sets the stack limit field in TCB to 256 bytes above the actual allocated stack limit. This means if the function's stack frame needs less than 256 bytes, we can just compare the stack pointer with the stack limit. This should result in fewer calls to __morestack. llvm-svn: 145766	2011-12-03 09:32:07 +00:00
Sanjoy Das	d1c3d82afe	Fix a bug in the x86-32 code generated for segmented stacks. Currently LLVM pads the call to __morestack with an add and sub of 8 bytes to esp. This isn't correct since __morestack expects the call to be followed directly by a ret. This commit also adjusts the relevant test-case. llvm-svn: 145765	2011-12-03 09:21:07 +00:00
Craig Topper	d381116357	Add instruction selection support for horizontal add/sub of 256-bit floating point vectors. Also add the test case for 256-bit integer vectors. llvm-svn: 145680	2011-12-02 07:16:01 +00:00
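
Commit 9ef50ef1f7 above mentions the idiom (sizeof(x)*8 - 1) ^ __builtin_clz(x) for finding the most significant set bit, which is the value 'bsr' produces directly. The following is only an illustrative sketch of why the xor and the more naive subtraction agree; it is not code from the LLVM tree. The function names are hypothetical, and it assumes a GCC- or Clang-compatible compiler (for __builtin_clz) and an unsigned type whose width is a power of two.

```c
#include <assert.h>
#include <limits.h>

/* Reference: index of the most significant set bit, found by shifting.
   Like 'bsr', the result is meaningless for x == 0, so callers must
   pass a non-zero value. (Hypothetical helper for illustration.) */
static unsigned msb_index_reference(unsigned x) {
    unsigned idx = 0;
    while (x >>= 1)
        idx++;
    return idx;
}

/* The idiom from the commit message: xor the leading-zero count with
   (bit width - 1). Because (bit width - 1) is all ones in its low bits
   and the count never exceeds it, the xor equals the subtraction. */
static unsigned msb_index_xor(unsigned x) {
    return (unsigned)((sizeof(x) * CHAR_BIT - 1) ^ (unsigned)__builtin_clz(x));
}

/* The more naive form that subtracts instead of using an xor. */
static unsigned msb_index_sub(unsigned x) {
    return (unsigned)(sizeof(x) * CHAR_BIT - 1) - (unsigned)__builtin_clz(x);
}

/* Asserts that all three formulations agree for a non-zero input. */
static void check_msb_identity(unsigned x) {
    assert(x != 0);
    assert(msb_index_xor(x) == msb_index_reference(x));
    assert(msb_index_sub(x) == msb_index_reference(x));
}
```

Calling check_msb_identity on a few non-zero inputs (for example 1, 0x80, and UINT_MAX) exercises the equivalence; zero is excluded because __builtin_clz(0) is undefined.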


3083 Commits