llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Eli Friedman	3db429c878	Convert a bunch more tests over to the new atomic instructions. llvm-svn: 140582	2011-09-26 23:15:09 +00:00
Eli Friedman	d01fc33809	Convert more tests to new atomic instructions. llvm-svn: 140567	2011-09-26 21:36:10 +00:00
Eli Friedman	6aaaadc188	Convert more tests over to the new atomic instructions. I did not convert Atomics-32.ll and Atomics-64.ll by hand; the diff is autoupgrade output. The wmb test is gone because there isn't any way to express wmb with the new atomic instructions; if someone really needs a non-asm way to write a wmb on Alpha, a platform-specific intrinsic could be added. llvm-svn: 140566	2011-09-26 21:30:17 +00:00
Eli Friedman	56e68f7271	Convert more tests over to the new atomic instructions. llvm-svn: 140559	2011-09-26 20:27:49 +00:00
Justin Holewinski	52c50104d7	PTX: Fix detection of stack load/store vs. global load/store, as well as fix the printing of local offsets llvm-svn: 140547	2011-09-26 18:57:22 +00:00
Justin Holewinski	443a122ac3	PTX: Add .align tests to stack object test file llvm-svn: 140537	2011-09-26 16:20:38 +00:00
Justin Holewinski	859dd9fa59	PTX: Fix some lingering issues with stack allocation llvm-svn: 140535	2011-09-26 16:20:34 +00:00
Justin Holewinski	83ae9143fd	PTX: Unify handling of loads/stores llvm-svn: 140533	2011-09-26 16:20:28 +00:00
David Meyer	90ed5fdd4f	Only run tests in test/CodeGen/CBackend/X86 when both X86 and CBackend are supported llvm-svn: 140517	2011-09-26 06:44:27 +00:00
David Meyer	a6e588d80c	PR11004: Inline memcpy to avoid generating nested call sequence. Un-XFAIL 2011-06-09-TailCallByVal and 2010-11-04-BigByval llvm-svn: 140516	2011-09-26 06:13:20 +00:00
Jakob Stoklund Olesen	59b2982dcf	Only run MF.verify() with EXPENSIVE_CHECKS=1. llvm-svn: 140441	2011-09-24 01:11:19 +00:00
Jakob Stoklund Olesen	bc6ae70907	Verify that terminators follow non-terminators. This exposes a -segmented-stacks bug. llvm-svn: 140429	2011-09-23 22:45:39 +00:00
Eli Friedman	a66a438876	PR10998: It is not legal to sink an instruction past the terminator of a block; make sure we don't do that. llvm-svn: 140428	2011-09-23 22:41:57 +00:00
Jakob Stoklund Olesen	ca6877343b	Also match negative offsets for addrmode3 and addrmode5. Math is hard, and isScaledConstantInRange() always returned false for negative constants. It was doing unsigned division of negative numbers before casting back to signed. llvm-svn: 140425	2011-09-23 22:10:33 +00:00
Justin Holewinski	1c0e0dcfbe	PTX: Handle function call return values llvm-svn: 140386	2011-09-23 16:48:41 +00:00
Justin Holewinski	0231798704	PTX: Start fixing function calls llvm-svn: 140378	2011-09-23 14:31:12 +00:00
Eli Friedman	6f0131b3a7	PR10989: Don't print .hidden on Windows. llvm-svn: 140356	2011-09-23 00:13:02 +00:00
Eli Friedman	31c7bde95a	PR10991: make fast-isel correctly check whether accessing a global through an alias involves thread-local storage. (I'm not entirely sure how this is supposed to work, but this patch makes fast-isel consistent with the normal isel path.) llvm-svn: 140355	2011-09-22 23:41:28 +00:00
Dan Gohman	d63418e497	Fix SimplifySelectCC to add newly created nodes to the DAGCombiner worklist, as it may be possible to perform further optimization on them. llvm-svn: 140349	2011-09-22 23:01:29 +00:00
Duncan Sands	1da590b589	Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from floating point add/sub of appropriate shuffle vectors. Does not synthesize the 256 bit AVX versions because they work differently. llvm-svn: 140332	2011-09-22 20:15:48 +00:00
Justin Holewinski	9acce6aa64	PTX: fixup test cases for register changes llvm-svn: 140311	2011-09-22 16:45:51 +00:00
Devang Patel	5d43ab8434	Do not unnecessarily use AT_specification DIE because it does not add any value. A few weeks ago, llvm completely inverted the debug info graph. Earlier, each debug info node kept track of its compile unit; now the compile unit keeps track of important nodes. One impact of this change is that global variables do not have any context, which should be checked before deciding to use AT_specification DIE. llvm-svn: 140282	2011-09-21 23:41:11 +00:00
Akira Hatanaka	0c87291a10	Remove +. llvm-svn: 140266	2011-09-21 17:43:48 +00:00
Akira Hatanaka	d987b12b57	Re-enable some of the disabled tests. Use FileCheck instead of grep to check output. llvm-svn: 140263	2011-09-21 17:36:30 +00:00
Nadav Rotem	50430e8160	add another testcase for pr10902 llvm-svn: 140257	2011-09-21 17:13:40 +00:00
Nadav Rotem	af5643de3c	[VECTOR-SELECT] Address one of the bugs in pr10902. Vector SetCC result types need to be type-legalized. This code worked before because scalar result types are known to be legal. llvm-svn: 140249	2011-09-21 14:34:38 +00:00
Eric Christopher	9b721ff19e	Remove llvm-gcc and various compiler handling from llvm. It's not needed here anymore and has been migrated to the test-suite project. llvm-svn: 140216	2011-09-20 23:58:15 +00:00
Bill Wendling	67cf034fe3	This test is completely invalid with the modern EH model. Delete. llvm-svn: 140213	2011-09-20 23:52:09 +00:00
Bruno Cardoso Lopes	1ffbef8ad1	Add a DAGCombine for subvector extracts to remove useless chains of subvector inserts and extracts. Initial patch by Rackover, Zvi with some tweaks done by me. llvm-svn: 140204	2011-09-20 23:19:33 +00:00
Bruno Cardoso Lopes	629e7c2410	Revert r140097, working on a better approach llvm-svn: 140203	2011-09-20 23:19:29 +00:00
Evan Cheng	ead45e2ba6	Fix a bug introduced during refactoring a couple of months ago. Cortex-M3 does not support Thumb2 dsp instructions. rdar://10152911. llvm-svn: 140181	2011-09-20 21:38:18 +00:00
NAKAMURA Takumi	595c0c8e15	test/CodeGen/X86/avx-minmax.ll: Unbreak Win32. On Windows x64, 128-bit arguments are passed indirectly, not in registers, e.g. maxpd: vmovapd (%rcx), %xmm0; vmaxpd (%rdx), %xmm0, %xmm0. FIXME: I don't care about YMM on x64 for now. llvm-svn: 140143	2011-09-20 14:11:35 +00:00
Craig Topper	df17f1cc99	Extend changes from r139986 to produce 256-bit AVX minps/minpd/maxps/maxpd. llvm-svn: 140140	2011-09-20 07:38:59 +00:00
Andrew Trick	53aeb9f663	ARM isel bug fix for adds/subs operands. Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the full gamut of CPSR defs/uses including instructions whose "optional" cc_out operand is not really optional. This allowed removal of the hasPostISelHook to simplify the .td files and make the implementation more robust. Fixes rdar://10137436: sqlite3 miscompile llvm-svn: 140134	2011-09-20 03:17:40 +00:00
Bruno Cardoso Lopes	bed7ef51b6	Attempt to fix -mtriple=i686-{cygwin|mingw|win32} regressions. Nakamura, if this doesn't work, please provide more details. llvm-svn: 140107	2011-09-20 00:08:12 +00:00
Bruno Cardoso Lopes	7cf7f02c3d	Based on the small opt Zvi's patch was trying to achieve, eliminate 128-bit undef subvector insertion into a 256-bit vector llvm-svn: 140097	2011-09-19 23:36:50 +00:00
Eli Friedman	b11676fb4b	Some additional tests for Thumb atomic load and store (which I somehow forgot to commit earlier). llvm-svn: 140074	2011-09-19 22:02:33 +00:00
Bruno Cardoso Lopes	9e5ef44daf	Match X86ISD::FSETCCsd and X86ISD::FSETCCss while in AVX mode. This fixes PR10955 and PR10948. llvm-svn: 140069	2011-09-19 21:29:24 +00:00
Nadav Rotem	1cfdc59e94	setOperationAction should be done on the return value of the type, not the operands. llvm-svn: 140001	2011-09-18 14:57:03 +00:00
Nadav Rotem	cfc77bc719	When promoting integer vectors we often create ext-loads. This patch adds a dag-combine optimization to implement the ext-load efficiently (using shuffles). For example the type <4 x i8> is stored in memory as i32, but it needs to find its way into a <4 x i32> register. Previously we scalarized the memory access, now we use shuffles. llvm-svn: 139995	2011-09-18 10:39:32 +00:00
Benjamin Kramer	547157073b	Apply Duncan's test fix from r139986 to the avx version of that test too. llvm-svn: 139992	2011-09-18 00:41:38 +00:00
Duncan Sands	4149334f09	Synthesize x86 max/min instructions also for vectors (i.e. produce maxps and maxpd). This broke the sse41-blend.ll testcase by causing maxpd to be produced rather than a cmp+blend pair, which is the reason I tweaked it. Gives a small speedup on doduc with dragonegg when the GCC vectorizer is used. llvm-svn: 139986	2011-09-17 16:49:39 +00:00
Andrew Trick	10ea51b841	Test case trial and error. Not sure the proper way to check MBB names. llvm-svn: 139900	2011-09-16 03:57:19 +00:00
Andrew Trick	5be06c8057	Reduced a stronger test case for coalescer bug PR10920. llvm-svn: 139898	2011-09-16 03:46:49 +00:00
Eli Friedman	f7bb39b592	Some legalization fixes for atomic load and store. llvm-svn: 139851	2011-09-15 21:20:49 +00:00
Jakob Stoklund Olesen	b36a98d18f	VirtRegMap is counting spill slots, not register spills. Fix the stats counters to reflect that. llvm-svn: 139819	2011-09-15 18:31:13 +00:00
Bruno Cardoso Lopes	8e702bba63	Change all checks regarding the presence of any SSE level to always take into consideration the presence of AVX. This change, together with the SSEDomainFix enabled for AVX, makes AVX codegen (hopefully) always emit the same code as SSE for 128-bit vector ops. I don't have a testcase for this, but AVX now beats SSE in performance for 128-bit ops in the majority of programs in the llvm testsuite. llvm-svn: 139817	2011-09-15 18:27:36 +00:00
Andrew Trick	e5bb7267ff	[regcoalescing] bug fix for RegistersDefinedFromSameValue. An improper SlotIndex->VNInfo lookup was leading to unsafe copy removal. Fixes PR10920 401.bzip2 miscompile with no IV rewrite. llvm-svn: 139765	2011-09-15 01:09:33 +00:00
Nadav Rotem	8e3edccebe	Add integer promotion support for vselect llvm-svn: 139692	2011-09-14 14:42:15 +00:00
Bruno Cardoso Lopes	3e6b9661d1	Vector shuffle mask <i32 4, i32 5, i32 2, i32 3> should yield "movsd", not "movss". llvm-svn: 139686	2011-09-14 02:36:14 +00:00
Devang Patel	75c70b2315	Remove ancient debug info constructs from test cases, they are not relevant to test case's main objective. llvm-svn: 139675	2011-09-14 00:29:50 +00:00
Devang Patel	f9dcd6261d	Remove unnecessary old test. llvm-svn: 139674	2011-09-14 00:28:54 +00:00
Akira Hatanaka	3d26b79a9a	Delete test cases that generate code for allegrex/psp and cannot be repurposed. llvm-svn: 139652	2011-09-13 22:29:13 +00:00
Eli Friedman	f13b5ef0e1	Error out on CodeGen of unaligned load/store. Fix test so it isn't accidentally testing that case. llvm-svn: 139641	2011-09-13 20:50:54 +00:00
Akira Hatanaka	44c745931f	Add pattern used to match MipsLo, which is needed when the instruction selector tries to match a dead MipsLo node (explanation in the link below). http://article.gmane.org/gmane.comp.compilers.llvm.devel/42757/match=dagcombiner+dead llvm-svn: 139634	2011-09-13 20:13:58 +00:00
Akira Hatanaka	1376e1eabf	Disable tests which generate code for allegrex or psp. llvm-svn: 139632	2011-09-13 20:00:35 +00:00
Nadav Rotem	5ea703debf	update checked pattern llvm-svn: 139631	2011-09-13 19:59:18 +00:00
Nadav Rotem	60df99b809	Add vselect target support for targets that do not support blend but do support xor/and/or (For example SSE2). llvm-svn: 139623	2011-09-13 19:17:42 +00:00
Andrew Trick	a534a558a2	Generalize this test's CHECK statements to handle different indvars modes. llvm-svn: 139577	2011-09-13 02:46:27 +00:00
Bruno Cardoso Lopes	eb09ab7c3f	Change testcase commandline to be more strict and silence buildbots llvm-svn: 139554	2011-09-12 22:59:26 +00:00
Bruno Cardoso Lopes	a4d2bdfa40	Fix PR10845. SUBREG_TO_REG shouldn't be used when the input and destination types are equal! llvm-svn: 139553	2011-09-12 22:59:23 +00:00
Bruno Cardoso Lopes	64e2e852f9	Revert the wrong part of r139528, and fix testcases. llvm-svn: 139541	2011-09-12 21:24:07 +00:00
Bruno Cardoso Lopes	c67e996fc3	Not sure how CMPPS and CMPPD had ever worked; I guess they didn't. However, with this fix they do now. Basically the operand order for the x86 target specific node is not the same as the instruction, but since the intrinsic needs that specific order at the instruction definition, just change the order during legalization. Also, there were some wrong inversions of condition codes, such as GE => LE, GT => LT; fix that too. Fix PR10907. llvm-svn: 139528	2011-09-12 19:30:40 +00:00
Eli Friedman	08926ecbfb	Fix mistake in test runline. llvm-svn: 139505	2011-09-12 17:32:58 +00:00
Richard Osborne	962b1ca071	Associate a MemOperand with LDWCP nodes introduced during ISel. This information is required if we want LDWCP to be hoisted out of loops. llvm-svn: 139495	2011-09-12 14:43:23 +00:00
Eli Friedman	2275f7612e	Really un-XFAIL the testcase, like I said I would in r139458. llvm-svn: 139459	2011-09-10 02:02:27 +00:00
Richard Trieu	0485e133f2	Fixed an assert from: assert("not implemented for target shuffle node"); to: assert(0 && "not implemented for target shuffle node"); This causes a test failure in CodeGen/X86/palignr.ll which has been marked as XFAIL for the time being. Test failure filed at PR10901. llvm-svn: 139454	2011-09-10 01:26:21 +00:00
Akira Hatanaka	a8f0f7babb	Fix test cases. Generate code for Mips32r1 unless a Mips32r2 feature is tested. llvm-svn: 139433	2011-09-09 23:14:58 +00:00
Eli Friedman	4bae1c4f70	Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897. llvm-svn: 139407	2011-09-09 21:04:06 +00:00
Akira Hatanaka	f65d050693	Drop support for Mips1 and Mips2. llvm-svn: 139405	2011-09-09 20:45:50 +00:00
Nadav Rotem	ccb46031e6	Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type llvm-svn: 139400	2011-09-09 20:29:17 +00:00
Akira Hatanaka	17df2dfe8c	Drop support for Allegrex. Allegrex implements a variant of Mips2. llvm-svn: 139383	2011-09-09 19:00:51 +00:00
Akira Hatanaka	e1eb015eb9	Change default target architecture from Mips1 to Mips32r1 in preparation for removing support for Mips1 and Mips2. This change and the ones that follow have been discussed with and approved by Bruno. llvm-svn: 139344	2011-09-09 01:13:27 +00:00
Devang Patel	ba2d56b1ef	Directly point debug info to the stack slot of the argument, instead of trying to keep track of the vreg into which the argument is copied. The LiveDebugVariable can keep track of variable's ranges. llvm-svn: 139330	2011-09-08 22:59:09 +00:00
Bruno Cardoso Lopes	54962ac233	Add a AVX version of a simple i64 -> f64 bitcast. This could be triggered using llc with -O0, which wouldn't let it be folded and expose the lack of this pattern. llvm-svn: 139320	2011-09-08 21:52:33 +00:00
Bruno Cardoso Lopes	50596b096c	Reapply testcase from r139309! llvm-svn: 139318	2011-09-08 21:05:43 +00:00
Bruno Cardoso Lopes	3ecc7a69fd	Remove this crashing test, until I figure out what's going wrong here llvm-svn: 139309	2011-09-08 18:32:36 +00:00
Bruno Cardoso Lopes	74a67e22b0	Add AVX versions of blend vector operations and fix some issues noticed in Nadav's r139285 and r139287 commits. 1) Rename vsel.ll to a more descriptive name 2) Change the order of BLEND operands to "Op1, Op2, Cond", this is necessary because PBLENDVB is already used in different places with this order, and it was being emitted in the wrong way for vselect 3) Add AVX patterns and tests for the same SSE41 instructions llvm-svn: 139305	2011-09-08 18:05:08 +00:00
Bruno Cardoso Lopes	84c53e3965	Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl. Triggered using llc -O0. Also fix some SET0PS patterns to their AVX forms and test it on the testcase. llvm-svn: 139304	2011-09-08 18:05:02 +00:00
Nadav Rotem	fd68584146	This test is already covered by llvm/trunk/test/CodeGen/X86/vsel.ll llvm-svn: 139288	2011-09-08 08:43:23 +00:00
Nadav Rotem	dbfa2c8810	add a testcase for the previous patch llvm-svn: 139287	2011-09-08 08:31:31 +00:00
Nadav Rotem	b461f2190e	Add X86-SSE4 codegen support for vector-select. llvm-svn: 139285	2011-09-08 08:11:19 +00:00
Eli Friedman	9ea5599729	Fix atomic load and store on x86 to pass -verify-machineinstrs (and possibly fix some subtle bugs involving passes which check mayStore()). This isn't exactly ideal, but it is good enough for the moment. llvm-svn: 139245	2011-09-07 18:48:32 +00:00
Duncan Sands	8df5170d0d	Another forgotten trampoline testcase. llvm-svn: 139230	2011-09-07 10:05:14 +00:00
Eli Friedman	6a45370c0f	Relax the MemOperands on atomics a bit. Fixes -verify-machineinstrs failures for atomic load/store on ARM. (The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.) llvm-svn: 139221	2011-09-07 02:23:42 +00:00
Devang Patel	f4483238b6	While sinking machine instructions, sink matching DBG_VALUEs also otherwise live debug variable pass will drop DBG_VALUEs on the floor. llvm-svn: 139208	2011-09-07 00:07:58 +00:00
Nick Lewycky	4e3daabb26	Disable these tests harder. They're XFAIL'd, but that means they still run, and these tests all infinitely recurse, bringing my system down into swapping hell. llvm-svn: 139192	2011-09-06 22:08:18 +00:00
Evan Cheng	891e9696ea	Fix fall outs from my recent change on how carry bit is modeled during isel. Now the 'S' instructions, e.g. ADDS, treat S bit as optional operand as well. Also fix isel hook to correctly set the optional operand. rdar://10073745 llvm-svn: 139157	2011-09-06 18:52:20 +00:00
Jakob Stoklund Olesen	7994269719	Atomic pseudos don't use (as in read) CPSR. They clobber it. llvm-svn: 139148	2011-09-06 17:40:35 +00:00
Duncan Sands	6939ae53ac	Split the init.trampoline intrinsic, which currently combines GCC's init.trampoline and adjust.trampoline intrinsics, into two intrinsics like in GCC. While having one combined intrinsic is tempting, it is not natural because typically the trampoline initialization needs to be done in one function, and the result of adjust trampoline is needed in a different (nested) function. To get around this llvm-gcc hacks the nested function lowering code to insert an additional parent variable holding the adjust.trampoline result that can be accessed from the child function. Dragonegg doesn't have the luxury of tweaking GCC code, so it stored the result of adjust.trampoline in the memory GCC set aside for the trampoline itself (this is always available in the child function), and set up some new memory (using an alloca) to hold the trampoline. Unfortunately this breaks Go which allocates trampoline memory on the heap and wants to use it even after the parent has exited (!). Rather than doing even more hacks to get Go working, it seemed best to just use two intrinsics like in GCC. Patch mostly by Sanjoy Das. llvm-svn: 139140	2011-09-06 13:37:06 +00:00
Dan Gohman	cbadb0f92c	Revert r129875, XFAILing this test for arm, since the fix was reverted. llvm-svn: 139058	2011-09-03 00:14:24 +00:00
Jakob Stoklund Olesen	ef8527b836	Pseudo CMOV instructions don't clobber EFLAGS. The explanation about a 0 argument being materialized as xor is no longer valid. Rematerialization will check if EFLAGS is live before clobbering it. The code produced by X86TargetLowering::EmitLoweredSelect does not clobber EFLAGS. This causes one less testb instruction to be generated in the cmov.ll test case. llvm-svn: 139057	2011-09-02 23:52:55 +00:00
Bill Wendling	145872a92c	Try to eliminate the use of the 'unwind' instruction. llvm-svn: 139046	2011-09-02 22:41:11 +00:00
Eli Friedman	383a3c76b2	Don't fast-isel for atomic load/store; some cases require extra handling missing from fast-isel. llvm-svn: 139044	2011-09-02 22:33:24 +00:00
Bill Wendling	267bc5089a	Better fix for this testcase. Update it to the new EH scheme entirely. llvm-svn: 139039	2011-09-02 21:27:08 +00:00
Bill Wendling	a4f9142ad4	Update for new EH stuff. (I'm not sure if this is 100% correct.) llvm-svn: 139038	2011-09-02 21:24:17 +00:00
Duncan Sands	33f33411e8	Darwin wants ctors/dtors to be ordered the other way round to linux. llvm-svn: 139015	2011-09-02 18:07:19 +00:00
Kalle Raiskila	7c154fe467	Pass signed (not unsigned) 10 bit field to SPU 'ori' instruction. llvm-svn: 139004	2011-09-02 10:05:01 +00:00
Dan Gohman	6d0230847c	Revert r131152, r129796, r129761. This code is currently considered to be unreliable on platforms which require memcpy calls, and it is complicating broader legalize cleanups. It is hoped that these cleanups will make memcpy byval easier to implement in the future. llvm-svn: 138977	2011-09-01 23:07:08 +00:00
Benjamin Kramer	bd939ad83e	Don't drop alignment info on local common symbols. On COFF the .lcomm directive has an alignment argument; on ELF we fall back to .local + .comm. Based on a patch by NAKAMURA Takumi. Fixes PR9337, PR9483 and PR10128. llvm-svn: 138976	2011-09-01 23:04:27 +00:00
Benjamin Kramer	4ed9810e77	XFAIL this test on arm until the backend is fixed. llvm-svn: 138955	2011-09-01 18:40:03 +00:00
Benjamin Kramer	ca7001eedb	This test depends on cmov being available. llvm-svn: 138954	2011-09-01 18:40:01 +00:00
Jakob Stoklund Olesen	c26e2e6221	Permit remat of partial register defs when it is safe. An instruction may define part of a register where the other bits are undefined. In that case, it is safe to rematerialize the instruction. For example: %vreg2:ssub_0<def> = VLDRS <cp#0>, 0, pred:14, pred:%noreg, %vreg2<imp-def> The extra <imp-def> operand indicates that the instruction does not read the other parts of the virtual register, so a remat is safe. This patch simply allows multiple def operands for the virtual register. It is MI->readsVirtualRegister() that determines if we depend on a previous value so remat is impossible. llvm-svn: 138953	2011-09-01 18:27:51 +00:00
Bruno Cardoso Lopes	10f234f1a7	Fix vbroadcast matching logic to early unmatch if the node doesn't have only one use. Fix PR10825. llvm-svn: 138951	2011-09-01 18:15:06 +00:00
Jakob Stoklund Olesen	bc000bf219	Prevent remat of partial register redefinitions. An instruction that redefines only part of a larger register can never be rematerialized since the virtual register value depends on the old value in other parts of the register. This was fixed for the inline spiller in r138794. This patch fixes the problem for all register allocators, and includes a small test case. <rdar://problem/10032939> llvm-svn: 138944	2011-09-01 17:18:50 +00:00
Andrew Trick	e5d7c0d111	PreRA scheduler should avoid cloning compares. Added canClobberReachingPhysRegUse() to handle a particular pattern in which a two-address instruction could be forced to interfere with EFLAGS, causing a compare to be unnecessarily cloned. Fixes rdar://problem/5875261 llvm-svn: 138924	2011-09-01 00:54:31 +00:00
Bill Wendling	66b613c8b9	Remove old declare statements. llvm-svn: 138905	2011-08-31 21:41:20 +00:00
Bill Wendling	78fdf13f12	Update more tests to the new EH scheme. llvm-svn: 138904	2011-08-31 21:40:15 +00:00
Bill Wendling	aca68bbe94	Update more tests to the new EH scheme. llvm-svn: 138903	2011-08-31 21:39:05 +00:00
Bill Wendling	da12f7322a	Revert r138894. This was failing on cmake-clang-i686-msvc10. llvm-svn: 138900	2011-08-31 21:20:25 +00:00
Bill Wendling	722de8a9aa	Update more tests to the new EH scheme. llvm-svn: 138894	2011-08-31 21:04:11 +00:00
Eli Friedman	8ae6a88723	Generic expansion for atomic load/store into cmpxchg/atomicrmw xchg; implements 64-bit atomic load/store for ARM. llvm-svn: 138872	2011-08-31 18:26:09 +00:00
Eli Friedman	5d3814e0c4	64-bit atomic cmpxchg for ARM. llvm-svn: 138868	2011-08-31 17:52:22 +00:00
David Greene	a975ef3213	Compress Repeated Byte Output. Emit a repeated sequence of bytes using .zero. This saves an enormous amount of asm file space for certain programs. llvm-svn: 138864	2011-08-31 17:30:56 +00:00
Benjamin Kramer	ce13423304	This test requires sse, otherwise x87 ops will block tailcall optimization llvm-svn: 138859	2011-08-31 16:49:05 +00:00
Bruno Cardoso Lopes	5bd6e92f99	Move all MOVSS and MOVSD patterns close to their definitions. Duplicate some store patterns to their AVX forms. Caught a bug while restricting the patterns' subtarget; fix it and update a testcase to check it properly. llvm-svn: 138851	2011-08-31 03:04:20 +00:00
Evan Cheng	bbabe9ff60	Fix (movhps load) lowering / pattern to match more cases. rdar://10050549 llvm-svn: 138848	2011-08-31 02:05:24 +00:00
Eli Friedman	928959bc52	Some minor cleanups for r138845. llvm-svn: 138846	2011-08-31 00:41:05 +00:00
Eli Friedman	d71c865ae0	Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. llvm-svn: 138845	2011-08-31 00:31:29 +00:00
Benjamin Kramer	2da6e863e7	Fix test typo. llvm-svn: 138843	2011-08-31 00:02:59 +00:00
Rafael Espindola	17f15bc464	Add a triple. llvm-svn: 138831	2011-08-30 21:19:37 +00:00
Rafael Espindola	83257fe618	Some test code to check if correct code is being generated. Patch by Sanjoy Das. llvm-svn: 138820	2011-08-30 19:51:29 +00:00
Roman Divacky	7ac1bc57f7	Set CR1EQ only when lowering vararg floating arguments (not any vararg arguments as before), unset CR1EQ otherwise. llvm-svn: 138802	2011-08-30 17:04:16 +00:00
Evan Cheng	1eacb83316	Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical register dependency (rather than glue them together). This is general goodness as it gives the scheduler more freedom. However, it is motivated by a nasty bug in isel that arises when an i64 sub is expanded to subc + sube: libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), the legalizer can serialize them in arbitrary order. If it's unlucky, it can force libcall #2 before libcall #1 in the above case: subc | libcall #2 | libcall #1 | sube However, since subc and sube are "glued" together, this ends up being a cycle when the scheduler combines subc and sube as a single scheduling unit. The right solution is to fix LegalizeType to chain the libcalls together. However, LegalizeType is not processing nodes in order, so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791	2011-08-30 01:34:54 +00:00
Eli Friedman	4d90e53381	Explicitly zero out parts of a vector which are required to be zero by the algorithm in LowerUINT_TO_FP_i32. This only has a substantial effect on the generated code when the input is extracted from a vector register; other ways of loading an i32 do the appropriate zeroing implicitly. Fixes PR10802. llvm-svn: 138768	2011-08-29 21:15:46 +00:00
Owen Anderson	d2fc51c0e5	Add testcase for r138746. llvm-svn: 138747	2011-08-29 18:02:40 +00:00
Duncan Sands	1a69e0119a	Fix PR5329: pay attention to constructor/destructor priority when outputting them. With this, the entire LLVM testsuite passes when built with dragonegg. llvm-svn: 138724	2011-08-28 13:17:22 +00:00
Bill Wendling	aeeb59947e	Update to new EH scheme. llvm-svn: 138699	2011-08-27 04:53:41 +00:00
Bill Wendling	38b5c3a5bc	Cannot have an llvm.eh.exception call in a non-landing pad block. llvm-svn: 138698	2011-08-27 04:53:28 +00:00
Eli Friedman	9f95c7d381	Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction. llvm-svn: 138660	2011-08-26 21:21:21 +00:00
Bill Wendling	5b7cbeacad	Revert r138606 until LowerInvoke has been converted to the new EH scheme. llvm-svn: 138656	2011-08-26 21:11:23 +00:00
Eli Friedman	802dd20495	Atomic load/store on ARM/Thumb. I don't really like the patterns, but I'm having trouble coming up with a better way to handle them. I plan on making other targets use the same legalization ARM-without-memory-barriers is using... it's not especially efficient, but if anyone cares, it's not that hard to fix for a given target if there's some better lowering. llvm-svn: 138621	2011-08-26 02:59:24 +00:00
Bill Wendling	077e9ea84b	Update to the new EH scheme. llvm-svn: 138606	2011-08-25 23:48:37 +00:00
Bruno Cardoso Lopes	5b3d2c9e17	Add support for AVX 256-bit version of MOVDDUP! llvm-svn: 138588	2011-08-25 21:40:37 +00:00
Andrew Trick	0dd0ae11f8	ARM fix for missing implicit operands on ldmia_ret. rdar://10005094: miscompile of 176.gcc llvm-svn: 138568	2011-08-25 17:50:53 +00:00
Bill Wendling	1eec5affec	LSR wants to split the landing pad's critical edge. Let it do it, but use the proper function to do it. llvm-svn: 138550	2011-08-25 05:55:40 +00:00
Bruno Cardoso Lopes	5d34219953	Add support for 256-bit versions of VSHUFPD and VSHUFPS. llvm-svn: 138546	2011-08-25 02:58:26 +00:00
Eli Friedman	b6597a2e70	Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually. llvm-svn: 138505	2011-08-24 22:33:28 +00:00
Eli Friedman	688794e1d1	Basic tests for atomic load and store on x86. llvm-svn: 138486	2011-08-24 21:16:59 +00:00
Richard Osborne	6b6b0b535d	Add Uses=[SP] to call instructions. This fixes a miscompilation with a variable sized alloca. llvm-svn: 138433	2011-08-24 13:32:43 +00:00
Craig Topper	1da38a34a6	Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711. llvm-svn: 138427	2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes	8959b54713	Fix a nasty bug where a v4i64 was being wrongly emitted with 32-bit permutations. Also tidy up some patterns and make them close to their instruction definition! llvm-svn: 138392	2011-08-23 22:06:37 +00:00
Nick Lewycky	11874a4e0a	PerformSubCombine to work on integers larger than i128. Fixes a crasher. llvm-svn: 138354	2011-08-23 19:01:24 +00:00
Craig Topper	67b22aedb4	Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding scalarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712. llvm-svn: 138321	2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes	8024703a16	Introduce a pass to insert vzeroupper instructions to avoid AVX to SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317	2011-08-23 01:14:17 +00:00
Bruno Cardoso Lopes	8007165688	Add support for breaking 256-bit int VSETCC into two 128-bit ones, avoiding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271	2011-08-22 20:31:04 +00:00
Chad Rosier	0bfea70d09	With the fix in r138164: "Add <imp-def> operands to QQ and QQQQ stack loads." -verify-machineinstrs can be enabled for this test case. llvm-svn: 138171	2011-08-20 00:34:45 +00:00
Chad Rosier	55c57f07dd	VMOVQQQQs pseudo instructions are only created by ARMBaseInstrInfo::copyPhysReg. Therefore, rather than generate a pseudo instruction, which is later expanded, generate the necessary instructions in place. llvm-svn: 138163	2011-08-20 00:17:25 +00:00
Devang Patel	e4127d626e	Do not use named md nodes to track variables that are completely optimized. This does not scale while doing LTO with debug info. New approach is to include list of variables in the subprogram info directly. llvm-svn: 138145	2011-08-19 23:28:12 +00:00
Jim Grosbach	969c7a9037	Use regex to remove false dependencies on register allocation. llvm-svn: 138137	2011-08-19 23:10:31 +00:00
Jim Grosbach	5481e15390	Update tests. llvm-svn: 138116	2011-08-19 22:19:48 +00:00
Jakob Stoklund Olesen	f847cb77db	Add test case for r138018. llvm-svn: 138033	2011-08-19 04:30:24 +00:00
Akira Hatanaka	163382894e	Use subword loads instead of a 4-byte load when the size of a structure (or a piece of it) that is being passed by value is smaller than a word. llvm-svn: 138007	2011-08-18 23:39:37 +00:00
Ivan Krasin	338df71d60	FastISel: avoid function calls between the materialization of the constant and its use. llvm-svn: 137993	2011-08-18 22:06:10 +00:00
Jim Grosbach	7ecefeb594	Thumb assembly parsing and encoding for LDM instruction. Fix base register type and canonicalize to the "ldm" spelling rather than "ldmia." Add diagnostics for incorrect writeback token and out-of-range registers. llvm-svn: 137986	2011-08-18 21:50:53 +00:00
Richard Osborne	415c5ff412	Add intrinsics for SETEV, GETED, GETET. llvm-svn: 137938	2011-08-18 13:00:48 +00:00
Bruno Cardoso Lopes	c174d8ac48	Cleanup vector logical ops in AVX and add use int versions for simple v2i64 llvm-svn: 137919	2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes	82795e6b41	Fix PR10688. Add support for splitting 256-bit vector shifts when the shift amount is variable llvm-svn: 137885	2011-08-17 22:12:20 +00:00
Jim Grosbach	0115c6f75b	Thumb assembly parsing and encoding for ADR. llvm-svn: 137864	2011-08-17 20:37:40 +00:00
Bruno Cardoso Lopes	98531dfd08	Introduce matching patterns for vbroadcast AVX instruction. The idea is to match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working but because of PR8156, there are no ways to match loads, cause they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed since MayFoldLoad will work as expected. llvm-svn: 137810	2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes	0a3b3123fd	Update test to not use the scalar type to splat from a load llvm-svn: 137809	2011-08-17 02:29:15 +00:00
Bruno Cardoso Lopes	4ff4ed28af	Now that we have a canonical way to handle 256-bit splats: vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and also makes a better room for the upcoming vbroadcast! llvm-svn: 137807	2011-08-17 02:29:10 +00:00
Akira Hatanaka	0179c7fa68	Add support for ext and ins. llvm-svn: 137804	2011-08-17 02:05:42 +00:00
Bruno Cardoso Lopes	d64294fb0a	Instead of always leaving the work to the generic legalizer when there is no support for native 256-bit shuffles, be more smart in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example: For this shuffle: shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6> This was expanded to: vextractf128 $1, %ymm1, %xmm2 vpextrq $0, %xmm2, %rax vmovd %rax, %xmm1 vpextrq $1, %xmm2, %rax vmovd %rax, %xmm2 vpunpcklqdq %xmm1, %xmm2, %xmm1 vpextrq $0, %xmm0, %rax vmovd %rax, %xmm2 vpextrq $1, %xmm0, %rax vmovd %rax, %xmm0 vpunpcklqdq %xmm2, %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 ret Now we get: vshufpd $1, %xmm0, %xmm0, %xmm0 vextractf128 $1, %ymm1, %xmm1 vshufpd $1, %xmm1, %xmm1, %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 llvm-svn: 137733	2011-08-16 18:21:54 +00:00
Akira Hatanaka	dcbf455b98	Add test case for r137711. llvm-svn: 137725	2011-08-16 17:32:01 +00:00
Akira Hatanaka	12df91513e	Fix handling of double precision loads and stores when Mips1 is targeted. Mips1 does not support double precision loads or stores, therefore two single precision loads or stores must be used in place of these instructions. This patch treats double precision loads and stores as if they are legal instructions until MCInstLowering, instead of generating the single precision instructions during instruction selection or Prolog/Epilog code insertion. Without the changes made in this patch, llc produces code that has the same problem described in r137484 or bails out when MipsInstrInfo::storeRegToStackSlot or loadRegFromStackSlot is called before register allocation. llvm-svn: 137711	2011-08-16 03:51:51 +00:00
Bruno Cardoso Lopes	b81c3ed76d	Fix PR10656. It's only profitable to use 128-bit inserts and extracts when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661	2011-08-15 21:45:54 +00:00
Eric Christopher	1bb5eaa978	Fix this test to avoid leaving a temporary file behind. llvm-svn: 137651	2011-08-15 20:55:03 +00:00
Bob Wilson	90799621b3	Expand VMOVQQQQ pseudo instructions. Apparently we never added code to expand these pseudo instructions, and in over a year, no one has noticed. Our register allocator must be awesome! llvm-svn: 137551	2011-08-13 05:14:55 +00:00
Bruno Cardoso Lopes	2d100ca13c	The VPERM2F128 is a AVX instruction which permutes between two 256-bit vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519	2011-08-12 21:48:26 +00:00
Akira Hatanaka	c9c0190cbe	Define unaligned load and store. llvm-svn: 137515	2011-08-12 21:30:06 +00:00
Akira Hatanaka	6caf61a6ac	Test case for 137484 llvm-svn: 137486	2011-08-12 18:12:06 +00:00
Akira Hatanaka	b787f8a8a5	Enclose directive .cprestore with .set macro and nomacro to silence assembler warning. llvm-svn: 137378	2011-08-11 22:42:31 +00:00
Bruno Cardoso Lopes	328a6a980b	Add a dag combine to xform 256-bit shuffles into simple vector inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362	2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes	884d8b9cb5	Fix the test added by Nadav in r137308. Make it more strict: 1) check for the "v" version of movaps 2) add a couple of CHECK-NOT to guarantee the behavior 3) move to a more appropriate test file llvm-svn: 137361	2011-08-11 21:50:35 +00:00
Bruno Cardoso Lopes	38d4afa02f	Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict. llvm-svn: 137324	2011-08-11 18:59:13 +00:00
Jim Grosbach	9717a9c0d3	ARM push of a single register encodes as pre-indexed STR. Per the ARM ARM, a 'push' of a single register encodes as an STR, not an STM. llvm-svn: 137318	2011-08-11 18:07:11 +00:00
Jim Grosbach	abaaf4513f	ARM pop of a single register encodes as post-indexed LDR. Per the ARM ARM, a 'pop' of a single register encodes as an LDR, not an LDM. llvm-svn: 137316	2011-08-11 17:35:48 +00:00
Nadav Rotem	de1b485f3f	[AVX] If the data which is going to be saved is already in two XMM registers (for example, after integer operation), do not pack the registers into a YMM before saving. It's better to save as two XMM registers. Before: vinsertf128 $1, %xmm3, %ymm0, %ymm3 vinsertf128 $0, %xmm1, %ymm3, %ymm1 vmovaps %ymm1, 416(%rsp) After: vmovaps %xmm3, 416+16(%rsp) vmovaps %xmm1, 416(%rsp) llvm-svn: 137308	2011-08-11 16:41:21 +00:00
Chris Lattner	3ae8704c4f	add missing colon, thanks peter. llvm-svn: 137306	2011-08-11 16:15:10 +00:00
Chris Lattner	575057916a	fix PR10605 / rdar://9930964 by adding a pretty scary missed check. It's somewhat surprising anything works without this. Before we would compile the testcase into: test: # @test movl $4, 8(%rdi) movl 8(%rdi), %eax orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 now we produce: test: # @test movl 8(%rdi), %eax movl $4, 8(%rdi) orl %esi, %eax cmpl $32, %edx movl %eax, -4(%rsp) # 4-byte Spill je .LBB0_2 llvm-svn: 137303	2011-08-11 06:26:54 +00:00
Bruno Cardoso Lopes	8674ddf55a	Splats for v8i32/v8f32 can be handled by VPERMILPSY. This was causing infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296	2011-08-11 02:49:44 +00:00
Bruno Cardoso Lopes	954ac403c7	Use the splat index to generate the desired shuffle. Otherwise we could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295	2011-08-11 02:49:41 +00:00
Eli Friedman	17bd9e5d7c	Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292	2011-08-11 01:48:05 +00:00
NAKAMURA Takumi	5d316f7632	test/CodeGen/X86/opt-shuff-tstore.ll: Add explicit -mtriple=x86_64-linux. llvm-svn: 137262	2011-08-10 22:52:48 +00:00
Devang Patel	393d6e1fd0	While extending definition range of a debug variable, consult lexical scopes also. There is no point extending debug variable out side its lexical block. This provides 6x compile time speedup in some cases. llvm-svn: 137250	2011-08-10 21:25:34 +00:00
Nadav Rotem	1b3075c0ab	Fix the test. Add cpu target. llvm-svn: 137241	2011-08-10 19:49:19 +00:00
Nadav Rotem	4a8d78d24a	When performing a truncating store, it is sometimes possible to rearrange the data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238	2011-08-10 19:30:14 +00:00
Bruno Cardoso Lopes	565ab1542a	The following X86 pattern is incorrect: def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>; This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added and illustrates the bug and also modified the one that was already present. Patch by Tanya Lattner. llvm-svn: 137227	2011-08-10 17:45:17 +00:00
Rafael Espindola	45cd7316b5	Add support for the R and Q constraints. llvm-svn: 137217	2011-08-10 16:26:42 +00:00
Bruno Cardoso Lopes	4a435a361d	Fix a bug in vpermilps mask checking. Fix PR10560 llvm-svn: 137194	2011-08-10 01:54:17 +00:00
Bruno Cardoso Lopes	9a695724bd	Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 llvm-svn: 137179	2011-08-09 23:27:13 +00:00
Bruno Cardoso Lopes	7461b930f3	Add v16i16 and v32i8 store patterns llvm-svn: 137166	2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes	028c6aa951	Use fp unpack instructions to unpack int types. Until we have AVX2, this is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161	2011-08-09 22:18:37 +00:00
Eli Friedman	44fd5b2b59	Fix a couple ridiculous copy-paste errors. rdar://9914773 . llvm-svn: 137160	2011-08-09 22:17:39 +00:00
Bill Wendling	250ea7930e	Revert r137134. It breaks some code as Eli pointed out. llvm-svn: 137135	2011-08-09 18:56:35 +00:00
Bill Wendling	ca256c0d2d	Print out the variable declaration only if it is a declaration. Otherwise, a 'static' variable will be emitted twice. PR10081 llvm-svn: 137134	2011-08-09 18:31:50 +00:00
Jakob Stoklund Olesen	e43aca1c39	Inflate register classes after coalescing. Coalescing can remove copy-like instructions with sub-register operands that constrained the register class. Examples are: x86: GR32_ABCD:sub_8bit_hi -> GR32 arm: DPR_VFP2:ssub0 -> DPR Recompute the register class of any virtual registers that are used by fewer instructions after coalescing. This affects code generation for the Cortex-A8 where we use NEON instructions for f32 operations, c.f. fp_convert.ll: vadd.f32 d16, d1, d0 vcvt.s32.f32 d0, d16 The register allocator is now free to use d16 for the temporary, and that comes first in the allocation order because it doesn't interfere with any s-registers. llvm-svn: 137133	2011-08-09 18:19:41 +00:00
Bruno Cardoso Lopes	633400ee00	Reapply a more appropriate solution than in r137114. AVX supports v4f64 = sitofp v4i32. This fixes PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128	2011-08-09 17:39:13 +00:00
Bruno Cardoso Lopes	1962a341d8	Revert r137114 llvm-svn: 137127	2011-08-09 17:39:01 +00:00
Justin Holewinski	021ab783b7	PTX: Add initial support for device function calls - Calls are supported on SM 2.0+ for function with no return values llvm-svn: 137125	2011-08-09 17:36:31 +00:00
Bruno Cardoso Lopes	5dac86dac6	Handle sitofp between v4f64 <- v4i32. Fix PR10559 llvm-svn: 137114	2011-08-09 05:48:01 +00:00
Bruno Cardoso Lopes	d521431558	Add support for avx vector fextend llvm-svn: 137105	2011-08-09 03:04:29 +00:00
Bruno Cardoso Lopes	81534df169	Rename and tidy up tests llvm-svn: 137103	2011-08-09 03:04:23 +00:00
Bruno Cardoso Lopes	1025d1eb3b	Add two patterns to match special vmovss and vmovsd cases. Also fix the patterns already there to be more strict regarding the predicate. This fixes PR10558 llvm-svn: 137100	2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes	d7eac41193	Make LowerVSETCC aware of AVX types and add patterns to match them. llvm-svn: 137090	2011-08-09 00:46:57 +00:00
Bruno Cardoso Lopes	d8534855ff	Add support for several vector shifts operations while in AVX mode. Fix PR10581 llvm-svn: 137067	2011-08-08 21:31:08 +00:00
Eli Friedman	7a34419c6f	Fix up the patterns for SXTB, SXTH, UXTB, and UXTH so that they are correctly active without HasT2ExtractPack. PR10611. llvm-svn: 137061	2011-08-08 19:49:37 +00:00
Jakob Stoklund Olesen	85931574b0	Don't clobber pending ST regs when FP regs are killed. X86FloatingPoint keeps track of pending ST registers for an upcoming inline asm instruction with fixed stack register constraints. It does this by remembering which FP register holds the value that should appear at a fixed stack position for the inline asm. When that FP register is killed before the inline asm, make sure to duplicate it to a scratch register, so the ST register still has a live FP reference. This could happen when the same FP register was copied to two ST registers, or when a spill instruction is inserted between the ST copy and the inline asm. This fixes PR10602. llvm-svn: 137050	2011-08-08 17:15:43 +00:00
Rafael Espindola	2da6e6a1d8	print st_shndx with the correct number of bits. llvm-svn: 136880	2011-08-04 15:50:13 +00:00
Rafael Espindola	c1a076eeb1	print st_other with the correct number of bits. llvm-svn: 136877	2011-08-04 15:38:19 +00:00
Rafael Espindola	368850841d	print st_type with the correct number of bits. llvm-svn: 136875	2011-08-04 15:24:00 +00:00
Rafael Espindola	e08bb3d50f	Print st_bind with the correct number of bits. llvm-svn: 136874	2011-08-04 15:10:35 +00:00
Rafael Espindola	865ab6cb05	Print r_sym with the correct number of bits. llvm-svn: 136873	2011-08-04 14:48:27 +00:00
Rafael Espindola	f65dd30907	Print r_type with the correct number of bits. llvm-svn: 136872	2011-08-04 14:39:30 +00:00
Rafael Espindola	edfafcbfb0	Change another counter to decimal. llvm-svn: 136870	2011-08-04 14:01:03 +00:00
Rafael Espindola	3e8393e6f7	Don't print a counter in hex. llvm-svn: 136869	2011-08-04 13:39:15 +00:00
Bill Wendling	60e17f8212	Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. Fixes PR10527. llvm-svn: 136853	2011-08-04 00:32:58 +00:00
Benjamin Kramer	d93ac7d0b6	Remove underscore that's breaking linux buildbots. llvm-svn: 136833	2011-08-03 23:13:01 +00:00
Jakub Staszak	9d083611d4	Use MachineBranchProbabilityInfo in If-Conversion instead of its own heuristics. llvm-svn: 136826	2011-08-03 22:34:43 +00:00
Jakob Stoklund Olesen	002075193b	Handle IMPLICIT_DEF instructions in X86FloatingPoint. This fixes PR10575. llvm-svn: 136787	2011-08-03 16:33:19 +00:00
Devang Patel	99a2f0d98c	Use byte offset, instead of element number, to access merged global. llvm-svn: 136759	2011-08-03 01:25:46 +00:00
Rafael Espindola	cefc38659a	Assume .cfi_startproc is the first thing in a function. If the function is externally visible, create a local symbol to use in the CFE. If not, use the function label itself. Fixes PR10420. llvm-svn: 136716	2011-08-02 20:24:22 +00:00
Bruno Cardoso Lopes	ac0984dc7e	Make this kind of lowering to be supported by 256-bit instructions: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr)), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691	2011-08-02 16:06:18 +00:00
Bruno Cardoso Lopes	771876cade	Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654	2011-08-01 21:54:09 +00:00
Bruno Cardoso Lopes	d3a5171087	Since vectors with all ones can't be created with a 256-bit instruction, avoid returning early for v8i32 types, which would only be valid for vector with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547 llvm-svn: 136642	2011-08-01 19:51:53 +00:00
Richard Osborne	2cd07cf351	Fix crash with varargs function with no named parameters. llvm-svn: 136623	2011-08-01 16:45:59 +00:00
Jakob Stoklund Olesen	0f099a3c58	Revert "Don't check liveness of unallocatable registers." The ARM target depends on CPSR liveness being tracked after register allocation. llvm-svn: 136548	2011-07-30 00:57:25 +00:00
Jakob Stoklund Olesen	a05b70241c	Don't check liveness of unallocatable registers. This includes registers like EFLAGS and ST0-ST7. We don't check for liveness issues in the verifier and scavenger because registers will never be allocated from these classes. While in SSA form, we do care about the liveness of unallocatable unreserved registers. Liveness of EFLAGS and ST0 needs to be correct for MachineDCE and MachineSinking. llvm-svn: 136541	2011-07-29 23:36:21 +00:00
Eric Christopher	96b31d5681	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Bruno Cardoso Lopes	871df895f4	Fix two tests that I crashed in the previous commits. The mask elts on the second half must be reindexed. llvm-svn: 136454	2011-07-29 02:05:28 +00:00
Bruno Cardoso Lopes	2b3d85d81c	Match VPERMIL masks more strictly and update the target specific mask generation to always catch the weird cases. llvm-svn: 136453	2011-07-29 01:31:15 +00:00
Bruno Cardoso Lopes	473d982caf	Add v8i32 and v4i64 vpermil patterns llvm-svn: 136451	2011-07-29 01:31:07 +00:00
Jakob Stoklund Olesen	cc29034b4c	Transfer implicit operands in NEONMoveFixPass. Later passes /are/ using this information when running the register scavenger. This fixes the second problem in PR10520. llvm-svn: 136440	2011-07-29 00:27:35 +00:00
Jakob Stoklund Olesen	f97f492104	Add -verify-arm-pseudo-expand. This hidden llc option runs the machine code verifier after expanding ARM pseudo-instructions, but before if-conversion. The machine code verifier is much better at pointing out liveness errors that can trip up the register scavenger. llvm-svn: 136439	2011-07-29 00:27:32 +00:00
Jakob Stoklund Olesen	5f429460ba	Handle REG_SEQUENCE with implicitly defined operands. Code like that would only be produced by bugpoint, but we should still handle it correctly. When a register is defined by a REG_SEQUENCE of undefs, the register itself is undef. Previously, we would create a register with uses but no defs. Fixes part of PR10520. llvm-svn: 136401	2011-07-28 21:38:51 +00:00
Bruno Cardoso Lopes	e24a043703	Add patterns to generate copies for extract_subvector instead of using vextractf128. This will reduce the number of issued instruction for several avx codes. llvm-svn: 136323	2011-07-28 01:26:50 +00:00
Bruno Cardoso Lopes	1f63a37172	Add a few patterns to match allzeros without having to use the fp unit. Take advantage that the 128-bit vpxor zeros the higher part and use it. This also fixes PR10491 llvm-svn: 136321	2011-07-28 01:26:43 +00:00
Bruno Cardoso Lopes	06d8be564f	Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move a convert pattern close to the instruction definition. llvm-svn: 136320	2011-07-28 01:26:39 +00:00
Bruno Cardoso Lopes	8830fde434	The vpermilps and vpermilpd have different behaviour regarding the usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same used by the low part while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200	2011-07-27 00:56:34 +00:00
Devang Patel	e85a416d4e	It is quite possible that inlined function body is split into multiple chunks of consecutive instructions. But, there is not any way to describe this in .debug_inline accelerator table used by gdb. However, describe non contiguous ranges of inlined function body appropriately using AT_range of DW_TAG_inlined_subroutine debug info entry. llvm-svn: 136196	2011-07-27 00:34:13 +00:00
Jakob Stoklund Olesen	3f729850d3	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	32a2ce8416	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	bfc2dfe3f7	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	e53bb853ea	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	4e16c5341a	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	906ecb46ed	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	8779017138	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	e52bee3cc9	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	ab40a57cce	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00

5243 Commits