llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 06:22:51 +01:00

Author	SHA1	Message	Date
Evan Cheng	1f4b84cad0	Fix test. llvm-svn: 55849	2008-09-05 20:04:37 +00:00
Evan Cheng	10a350fa89	If SSE2 is available, x86 should pass first 3 f32/f64 arguments in XMM registers for fastcc calls. llvm-svn: 55840	2008-09-05 17:24:07 +00:00
Evan Cheng	bd15e330d0	For whatever reason, x86 CallingConv::Fast (i.e. fastcc) was not passing scalar arguments in registers. This patch defines a new fastcc CC which is slightly different from the FastCall CC. In addition to passing integer arguments in ECX and EDX, it also specifies that doubles are passed in 8-byte slots which are 8-byte aligned (instead of 4-byte aligned). This avoids a potential performance hazard where doubles span cacheline boundaries. llvm-svn: 55807	2008-09-04 22:59:58 +00:00
Owen Anderson	cd3ee9198d	Fix the ordering of operands to the store (inverted relative to LLVM IR), and fix the testcase. llvm-svn: 55777	2008-09-04 16:48:33 +00:00
Owen Anderson	35485dbae3	Add a first attempt at implementing stores for X86 fast isel using target hooks. Dan or Evan, please review. llvm-svn: 55764	2008-09-04 07:08:58 +00:00
Evan Cheng	9c728a557d	Load from GV stub should be locally CSE'd. llvm-svn: 55763	2008-09-04 06:18:33 +00:00
Evan Cheng	53ce5fa5ce	Remove code that pads the number of bytes to pop for the X86_FastCall CC. The code doesn't do the "aligning" for Cygwin, Mingw, and Windows. But aligning it on Darwin and Linux breaks gcc compatibility. That ruled out all the platforms we support! llvm-svn: 55756	2008-09-04 01:04:15 +00:00
Evan Cheng	942d55dd92	Add X86 target hook to implement load (even from GlobalAddress). llvm-svn: 55693	2008-09-03 06:44:39 +00:00
Evan Cheng	b40b710766	Re-apply 55467 with fix. If copy is being replaced by remat'ed def, transfer the implicit defs onto the remat'ed instruction. llvm-svn: 55564	2008-08-30 09:09:33 +00:00
Evan Cheng	4bc8c9652e	Transform (x << (y&31)) -> (x << y). This takes advantage of the fact x86 shift instructions 2nd operand (shift count) is limited to 0 to 31 (or 63 in the x86-64 case). llvm-svn: 55558	2008-08-30 02:03:58 +00:00
Evan Cheng	c1c53221c5	Swap fp comparison operands and change predicate to allow load folding (safely this time). llvm-svn: 55553	2008-08-29 23:22:12 +00:00
Evan Cheng	79d2a8f97d	xfail this. llvm-svn: 55550	2008-08-29 22:59:13 +00:00
Evan Cheng	cdd06ba3f4	Swap fp comparison operands and change predicate to allow load folding. llvm-svn: 55521	2008-08-28 23:48:31 +00:00
Dan Gohman	35a69c106a	Optimize DAGCombiner's worklist processing. Previously it started its work by putting all nodes in the worklist, requiring a big dynamic allocation. Now, DAGCombiner just iterates over the AllNodes list and maintains a worklist for nodes that are newly created or need to be revisited. This allows the worklist to stay small in most cases, so it can be a SmallVector. This has the side effect of making DAGCombine not miss a folding opportunity in alloca-align-rounding.ll. llvm-svn: 55498	2008-08-28 21:01:56 +00:00
Dan Gohman	8f4d612996	Revert r55467; it causes regressions in UnitTests/Vector/divides, Benchmarks/sim/sim, and others on x86-64. llvm-svn: 55475	2008-08-28 17:22:54 +00:00
Evan Cheng	28b0b18082	If a copy isn't coalesced, but its src is defined by trivial computation. Re-materialize the src to replace the copy. llvm-svn: 55467	2008-08-28 07:53:51 +00:00
Dale Johannesen	ae522b8463	This test crashes on non-x86 host; make SSE explicit. Feel free to fix a better way! llvm-svn: 55456	2008-08-28 01:51:09 +00:00
Dan Gohman	5e5f1c9e8f	Basic FastISel support for floating-point constants. llvm-svn: 55401	2008-08-27 01:09:54 +00:00
Chris Lattner	c5c00890e5	If an xmm register is referenced explicitly in an inline asm, make sure to assign it to a version of the xmm register with the regclass that matches its type. This fixes PR2715, a bug handling some crazy xpcom case in mozilla. llvm-svn: 55358	2008-08-26 06:19:02 +00:00
Evan Cheng	569b489cf5	Try approach to moving call address load inside of callseq_start. Now it's done during the preprocess of x86 isel. callseq_start's chain is changed to load's chain node; while load's chain is the last of callseq_start or the loads or copytoreg nodes inserted to move arguments to the right spot. llvm-svn: 55338	2008-08-25 21:27:18 +00:00
Owen Anderson	27491bbf2c	Add support for fast isel of (integer) immediate materialization patterns, and use them to support bitcast of constants in fast isel. llvm-svn: 55325	2008-08-25 20:20:32 +00:00
Evan Cheng	2b9f879a99	Fix asm printing of MOVSDto64mr and MOV64toSDrm. llvm-svn: 55300	2008-08-25 04:11:42 +00:00
Bill Wendling	05e1910595	Fix this test. Don't null out the file, just XFAIL it until patch can be fixed. llvm-svn: 55296	2008-08-24 21:48:46 +00:00
Bill Wendling	5728cf59fd	Temporarily reverting r55292. It's causing a bootstrapping failure: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc ... src/libiberty/make-temp-file.c -o make-temp-file.o Assertion failed: (Node2Index[SU->NodeNum] > Node2Index[I->Dep->NodeNum] && "Wrong topological sorting"), function InitDAGTopologicalSorting, file /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp, line 508. ../../../../llvm-gcc.src/libiberty/hashtab.c:955: internal compiler error: Abort trap Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://developer.apple.com/bugreporter> for instructions. make[4]: * [hashtab.o] Error 1 make[4]: * Waiting for unfinished jobs.... make[3]: * [multi-do] Error 1 make[2]: * [all] Error 2 make[1]: * [all-target-libiberty] Error 2 make: * [all] Error 2 llvm-svn: 55295	2008-08-24 21:45:30 +00:00
Evan Cheng	a600778748	Move callseq_start above the call address load to allow load to be folded into the call node. llvm-svn: 55292	2008-08-24 19:19:55 +00:00
Anton Korobeynikov	496a2865db	Testcase for 64bit maskmovq llvm-svn: 55239	2008-08-23 15:53:47 +00:00
Dale Johannesen	a8dbf73ffd	Test all currently supported atomic builtins on x86-{32,64}. These just test that they go through the BE. llvm-svn: 55208	2008-08-22 22:39:21 +00:00
Dan Gohman	a398d11527	Factor out the predicate check code from DAGISelEmitter.cpp and use it in FastISelEmitter.cpp, and make FastISel subtarget aware. Among other things, this lets it work properly on x86 targets that don't have SSE, where it successfully selects x87 instructions. llvm-svn: 55156	2008-08-22 00:20:26 +00:00
Bill Wendling	8ff0d8f829	Testcase for PR2585. llvm-svn: 55151	2008-08-21 23:04:49 +00:00
Dan Gohman	4562b2bcfe	Add -mattr=sse2 so this test doesn't fail on non-x86 hosts. llvm-svn: 55145	2008-08-21 22:34:25 +00:00
Dale Johannesen	6fe9da3acc	Make x86 and sse2 explicit for non-x86 hosts. llvm-svn: 55141	2008-08-21 21:26:06 +00:00
Evan Cheng	ef2509b3ba	Fix a number of byval / memcpy / memset related codegen issues. 1. x86-64 byval alignment should be max of 8 and alignment of type. Previously the code was not doing what the commit message was saying. 2. Do not use byte repeat move and store operations. These are slow. llvm-svn: 55139	2008-08-21 21:00:15 +00:00
Dan Gohman	42fa2945d3	getelementptr doesn't work on x86-64 yet, because it has MOV64ri32 and no plain MOV64ri. llvm-svn: 55126	2008-08-21 17:28:42 +00:00
Dan Gohman	f4269f7bea	MVT::getMVT uses iPTR for pointer types, while we need the actual intptr_t type in this case. FastISel can now select simple getelementptr instructions. llvm-svn: 55125	2008-08-21 17:25:26 +00:00
Dan Gohman	a6e647dd7c	Basic fast-isel support for instructions with constant int operands. llvm-svn: 55099	2008-08-21 01:41:07 +00:00
Dan Gohman	bb28e0fc6d	Add a -march line for this test, and run it on x86-64 too for fun. llvm-svn: 55030	2008-08-20 00:56:07 +00:00
Dan Gohman	455abe7436	Add FastISel support for floating-point operations. llvm-svn: 55021	2008-08-20 00:23:20 +00:00
Dan Gohman	ce636764de	Add FastISel support for several more binary operators. llvm-svn: 55020	2008-08-20 00:11:48 +00:00
Bill Wendling	ab7c8c091e	Add support for the __sync_sub_and_fetch atomics and friends for X86. The code was already present, but not hooked up to anything. llvm-svn: 55018	2008-08-19 23:09:18 +00:00
Dan Gohman	d5c84e8061	Fast-isel is now minimally functional. Add a testcase to demonstrate the extent of its capabilities. Note that it only attempts to operate on one of the blocks in this testcase. llvm-svn: 55016	2008-08-19 22:37:59 +00:00
Dale Johannesen	15b76de064	Add support for 8 and 16 bit forms of __sync builtins on X86. Change "lock" instructions to be on a separate line. This is needed to work around a bug in the Darwin assembler. llvm-svn: 54999	2008-08-19 18:47:28 +00:00
Evan Cheng	6534c78383	Fix a (u)comiss intrinsic lowering bug. It was using anyext which can return junk in higher bits. Patch by Nate Begeman. llvm-svn: 54903	2008-08-17 19:22:34 +00:00
Dan Gohman	096cdc6059	Allow SelectionDAG to create EXTRACT_VECTOR_ELT nodes with non-constant indices. Only a few of the peephole checks require a constant index. llvm-svn: 54764	2008-08-13 21:51:37 +00:00
Dan Gohman	6789ef32d7	Improve the grep commands for this test to be tolerant of ABI differences, and to be more specific. llvm-svn: 54648	2008-08-11 20:10:41 +00:00
Dan Gohman	a27ed39f05	Take the FrameOffset into account when computing the alignment of stack objects. This fixes PR2656. llvm-svn: 54646	2008-08-11 18:27:03 +00:00
Dan Gohman	ac992cdc1c	Add an EXTRACTPSmr pattern to match the pattern that X86ISelLowering creates. llvm-svn: 54544	2008-08-08 18:30:21 +00:00
Dan Gohman	74fa421281	Re-enable elimination of unnecessary SUBREG_TO_REG instructions in LowerSubregs, and fix an x86-64 isel bug that this exposed. SUBREG_TO_REG for x86-64 implicit zero extension is only safe for isel to generate when the source is known to always have zeros in the high 32 bits. The EXTRACT_SUBREG instruction does not clear the high 32 bits. llvm-svn: 54444	2008-08-07 02:54:50 +00:00
Dan Gohman	1674a7c2f3	Add an extra example that shouldn't get an and instruction. llvm-svn: 54443	2008-08-07 02:23:06 +00:00
Dan Gohman	cc784f1662	Re-introduce the 8-bit subreg zext-inreg patterns for x86-32, this time using MOV32to32_ and MOV16to16_. Thanks to Evan for suggesting this. llvm-svn: 54418	2008-08-06 18:27:21 +00:00
Evan Cheng	f4d1119fbd	Fix PR2620: Fix X86cmppd selection code so it expects operands to be v2f64. llvm-svn: 54376	2008-08-05 22:19:15 +00:00
Evan Cheng	a07795a0c3	Fix PR2596: out of bound reference. llvm-svn: 54375	2008-08-05 21:51:46 +00:00
Owen Anderson	d1185e4da3	Update the remaining tests not to use -disable-correct-folding, and remove two that couldn't be updated. llvm-svn: 54359	2008-08-05 18:19:14 +00:00
Owen Anderson	117b0e405d	One more -disable-correct-folding case removed. llvm-svn: 54358	2008-08-05 18:08:56 +00:00
Owen Anderson	c5fd801d85	Remove another -disable-correct-folding use. llvm-svn: 54357	2008-08-05 18:05:58 +00:00
Owen Anderson	f845ea8d52	Eliminate another use of -disable-correct-folding. llvm-svn: 54356	2008-08-05 18:03:01 +00:00
Evan Cheng	754148a2ec	Fix PR2568: Fix bug that causes a redundant kill marker after its live interval has been extended due to coalescing. llvm-svn: 54346	2008-08-05 07:10:38 +00:00
Owen Anderson	231111faf9	Update these tests to work by disabling the new correct CFG generation. This flag should ONLY be used for tests like these. llvm-svn: 54334	2008-08-04 23:55:29 +00:00
Dan Gohman	60ea311ec8	Fix SDISel lowering of PHI nodes to use ComputeValueVTs. This allows it to work correctly on aggregate values. This fixes PR2623. llvm-svn: 54331	2008-08-04 23:42:46 +00:00
Dale Johannesen	c1ae4b8c08	Make sse2 explicit, for non-x86 hosts. llvm-svn: 54251	2008-07-31 20:16:33 +00:00
Dan Gohman	f691fc703d	Improve dagcombining for sext-loads and sext-in-reg nodes. llvm-svn: 54239	2008-07-31 00:50:31 +00:00
Dan Gohman	6f3fa16fd9	I missed this file in r54223. movzbl is now used instead of movzbw here. llvm-svn: 54224	2008-07-30 18:23:34 +00:00
Dan Gohman	efb5d2ce6e	Reapply r54147 with a constraint to only use the 8-bit subreg form on x86-64, to avoid the problem with x86-32 having GPRs that don't have 8-bit subregs. Also, change several 16-bit instructions to use equivalent 32-bit instructions. These have a smaller encoding and avoid partial-register updates. llvm-svn: 54223	2008-07-30 18:09:17 +00:00
Mon P Wang	fb483982f5	Added support for overloading intrinsics (atomics) based on pointers to different address spaces. This alters the naming scheme for those intrinsics, e.g., atomic.load.add.i32 => atomic.load.add.i32.p0i32 llvm-svn: 54195	2008-07-30 04:36:53 +00:00
Dan Gohman	ebe629a4b2	Revert 54147. llvm-svn: 54148	2008-07-29 01:02:18 +00:00
Dan Gohman	1816900fd1	Add x86 isel patterns to match what would be a ZERO_EXTEND_INREG operation, which is represented in codegen as an 'and' operation. This matches them with movz instructions, instead of leaving them to be matched by and instructions with an immediate field. llvm-svn: 54147	2008-07-28 22:18:25 +00:00
Dan Gohman	a5a50a8853	Fix embedded CRLF characters. llvm-svn: 54125	2008-07-27 18:37:58 +00:00
Nate Begeman	1396e3d206	Fix test RUN line llvm-svn: 54040	2008-07-25 19:08:59 +00:00
Nate Begeman	5523d40e4b	Disable mov{L, LP, HP, HLP, *DUP} shuffles for mmx. mmx needs its own fancy shuffle logic based on unpack; for now we get correct but awful code. Also commit Mon Ping's VSETCC patch. llvm-svn: 54039	2008-07-25 19:05:58 +00:00
Dan Gohman	6d394147f2	This test needs -aggressive-remat enabled. llvm-svn: 54015	2008-07-25 15:25:32 +00:00
Dan Gohman	680e1bd958	Enable rematerialization of constants using AliasAnalysis::pointsToConstantMemory, and knowledge of PseudoSourceValues. This unfortunately isn't sufficient to allow constants to be rematerialized in PIC mode -- the extra indirection is a complication. llvm-svn: 54000	2008-07-25 00:02:30 +00:00
Dan Gohman	da5c2b50b8	Add target triples so these tests behave as expected on non-darwin hosts. llvm-svn: 53991	2008-07-24 18:08:01 +00:00
Evan Cheng	055f5e6ed0	New test case. llvm-svn: 53971	2008-07-24 00:22:05 +00:00
Evan Cheng	20c9cdbe69	Fix PR2485: do all 4-element SSE shuffles in max. of 2 shuffle instructions. Based on patch by Nicolas Capens. llvm-svn: 53939	2008-07-23 00:22:17 +00:00
Duncan Sands	550e0de239	LegalizeTypes support for VSETCC. Fixes PR2575. llvm-svn: 53938	2008-07-22 23:54:03 +00:00
Evan Cheng	1aa928a8e6	Fix pr2566: incorrect assumption about bit_convert. It does not have to output a vector value. Patch by Nicolas Capens! llvm-svn: 53932	2008-07-22 20:42:56 +00:00
Evan Cheng	901d469e05	Fix PR2574: implement v2f32 scalar_to_vector. llvm-svn: 53927	2008-07-22 18:39:19 +00:00
Bill Wendling	98b6e63176	Fix for first part of PR2562. Generate the "pinsrw" instruction for inserts into v4i16 vectors. llvm-svn: 53807	2008-07-20 02:32:23 +00:00
Anton Korobeynikov	6f354293fe	Testcase for PR2549 llvm-svn: 53785	2008-07-19 06:31:12 +00:00
Evan Cheng	d26080487b	Subreg live interval valno may not have a corresponding def machineinstr since it's less precise. llvm-svn: 53734	2008-07-17 19:48:53 +00:00
Evan Cheng	48b2f3dfe9	Add nounwind. llvm-svn: 53733	2008-07-17 19:48:04 +00:00
Evan Cheng	05e5317cab	Fix PR2536: a nasty spiller bug. If a two-address instruction uses a register but the use portion of its live range is not part of its liveinterval, it must be defined by an implicit_def. In that case, do not spill the use. e.g. 8 %reg1024<def> = IMPLICIT_DEF 12 %reg1024<def> = INSERT_SUBREG %reg1024<kill>, %reg1025, 2 The live range [12, 14) is not part of the r1024 live interval since it's defined by an implicit def. It will not conflict with the live interval of r1025. Now suppose both registers are spilled, you can easily see a situation where both registers are reloaded before the INSERT_SUBREG and both target registers that would overlap. llvm-svn: 53503	2008-07-12 01:56:02 +00:00
Duncan Sands	52f1dbf139	Port a shift-by-1 optimization from LegalizeDAG: it was presumably added after the rest of the code was copied to LegalizeTypes. llvm-svn: 53459	2008-07-11 16:54:57 +00:00
Bill Wendling	9f17caa9a9	The frame address on an x86-64 box needs to be offset by -8, not -4. llvm-svn: 53450	2008-07-11 07:18:52 +00:00
Evan Cheng	02a618dc56	Fix for PR2472. Use movss to set lower 32-bits of a zero XMM vector. llvm-svn: 53386	2008-07-10 01:08:23 +00:00
Anton Korobeynikov	f710ada483	Testcase for PR2024 llvm-svn: 53327	2008-07-09 14:09:41 +00:00
Evan Cheng	cf3a4ad46d	Fix two serious LSR bugs. 1. LSR runOnLoop is always returning false regardless if any transformation is made. 2. AddUsersIfInteresting can create new instructions that are added to DeadInsts. But there is a later early exit which prevents them from being freed. llvm-svn: 53193	2008-07-07 19:51:32 +00:00
Dale Johannesen	51edab312c	Considering predecessors of exit blocks gets us a little more tail merging. llvm-svn: 52986	2008-07-01 21:50:49 +00:00
Chris Lattner	153b6695b8	test doesn't need eh info llvm-svn: 52811	2008-06-27 03:14:20 +00:00
Dale Johannesen	76f5dc0cc4	Allow for rounding up of stack frame. llvm-svn: 52751	2008-06-26 01:55:32 +00:00
Chris Lattner	2b67ff8632	when we know the signbit of an input to uint_to_fp is zero, change it to sint_to_fp on targets where that is cheaper (and vice versa of course). This allows us to compile uint_to_fp to: _test: movl 4(%esp), %eax shrl $23, %eax cvtsi2ss %eax, %xmm0 movl 8(%esp), %eax movss %xmm0, (%eax) ret instead of: .align 3 LCPI1_0: ## double .long 0 ## double least significant word 4.5036e+15 .long 1127219200 ## double most significant word 4.5036e+15 .text .align 4,0x90 .globl _test _test: subl $12, %esp movl 16(%esp), %eax shrl $23, %eax movl %eax, (%esp) movl $1127219200, 4(%esp) movsd (%esp), %xmm0 subsd LCPI1_0, %xmm0 cvtsd2ss %xmm0, %xmm0 movl 20(%esp), %eax movss %xmm0, (%eax) addl $12, %esp ret llvm-svn: 52747	2008-06-26 00:16:49 +00:00
Evan Cheng	71fbfe73c1	- Fix a x86 vector isel bug: illegal transformation of a vector_shuffle into a shift. - Add a readme entry for a missing vector_shuffle optimization that results in awful codegen. llvm-svn: 52740	2008-06-25 20:52:59 +00:00
Mon P Wang	7d89d61387	Added MemOperands to Atomic operations since Atomics touch memory. Added abstract class MemSDNode for any Node that has an associated MemOperand. Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and atomic.lss => atomic.load.sub llvm-svn: 52706	2008-06-25 08:15:39 +00:00
Evan Cheng	bab5925a0b	Enable two-address remat by default. llvm-svn: 52701	2008-06-25 01:16:38 +00:00
Dale Johannesen	244433ebb1	v2f32 is now a valid (MMX) type which breaks this test (doesn't work for any MMX vector types, it's not me). Rewritten to use v2i16 which is generic and going to stay that way; I think that preserves the point of the test. llvm-svn: 52692	2008-06-24 22:03:36 +00:00
Evan Cheng	a62f5f0f82	If it's determined safe, remat MOV32r0 (i.e. xor r, r) and others as it is instead of using the longer MOV32ri instruction. llvm-svn: 52670	2008-06-24 07:10:51 +00:00
Bill Wendling	d6b7d457cf	Make test work on non-x86 machines (like my G4 PPC). llvm-svn: 52619	2008-06-23 06:16:31 +00:00
Evan Cheng	1d07cd32c2	Undo spill weight tweak. Need to investigate the performance regressions. llvm-svn: 52572	2008-06-21 06:45:54 +00:00
Eli Friedman	570aa6f801	Fix a bug with <8 x i16> shuffle lowering on X86 where parts of the shuffle could be skipped. The check is invalid because the loop index i doesn't correspond to the element actually inserted. The correct check is already done a few lines earlier, for whether the element is already in the right spot, so this shouldn't have any effect on the codegen for code that was already correct. llvm-svn: 52486	2008-06-19 06:09:51 +00:00
Evan Cheng	919b735586	New test case. llvm-svn: 52483	2008-06-19 01:50:24 +00:00
Evan Cheng	ee801276b3	This also got better (55 - 51 instructions). But doing one more re-materialization. llvm-svn: 52482	2008-06-19 01:50:13 +00:00
Evan Cheng	56e17b525c	This got better. llvm-svn: 52481	2008-06-19 01:46:43 +00:00
Evan Cheng	8cfd1d39a1	Do not issue identity copies. llvm-svn: 52373	2008-06-16 22:52:53 +00:00
Evan Cheng	d27948e716	- Add "Commutative" property to intrinsics. This allows tblgen to generate the commuted variants for dagisel matching code. - Mark lots of X86 intrinsics as "Commutative" to allow load folding. llvm-svn: 52353	2008-06-16 20:29:38 +00:00
Evan Cheng	2e99c9cbf8	Teach the spiller to commute instructions in order to fold a reload. This hits 410 times on 444.namd and 122 times on 252.eon. llvm-svn: 52266	2008-06-13 23:58:02 +00:00
Duncan Sands	40c8db881a	Disable some DAG combiner optimizations that may be wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems: (1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects. (2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware. (3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic. My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much. 
However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation. However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it. llvm-svn: 52254	2008-06-13 19:07:40 +00:00
Evan Cheng	66ce588b87	Fix some tests. llvm-svn: 52245	2008-06-12 21:23:38 +00:00
Dale Johannesen	47cee90b57	Fix parameter spelling: sse not sse1 llvm-svn: 52185	2008-06-10 17:57:58 +00:00
Matthijs Kooijman	00a807266e	Fix some more quoting issues in RUN lines, this time regarding unintended variable expansions involving the $ character. This fixes 4 tests that were not running properly before. llvm-svn: 52183	2008-06-10 16:10:32 +00:00
Matthijs Kooijman	281711dc95	Remove double pipes in RUN commandlines. This fixes 5 testcases that were not being run properly before. llvm-svn: 52180	2008-06-10 15:11:36 +00:00
Dan Gohman	f5602924ae	Convert several tests to use temporary files instead of redundantly executing the test commands. llvm-svn: 52163	2008-06-10 00:36:41 +00:00
Rafael Espindola	feaadb1e05	add support for PIC on linux x86-64 llvm-svn: 52139	2008-06-09 09:52:31 +00:00
Evan Cheng	e77d6a1a2d	Fix a memcpy lowering bug. Even though the memcpy alignment is smaller than the desired alignment, the frame destination alignment may still be larger than the desired alignment. Don't change its alignment to something smaller. llvm-svn: 51970	2008-06-04 23:37:54 +00:00
Dan Gohman	385b7d76ed	Fix the position of MemOperands in nodes that use variadic_ops in DAGISelEmitter output. This bug was recently uncovered by the addition of patterns for CALL32m and CALL64m, which are nodes that now have both MemOperands and variadic_ops. This bug was especially visible with PIC in various configurations, because the new patterns are matching the indirect call code used in many PIC configurations. llvm-svn: 51877	2008-06-02 17:40:38 +00:00
Dan Gohman	aa8fcd5657	Add patterns for CALL32m and CALL64m. They aren't matched in most cases due to an isel deficiency already noted in lib/Target/X86/README.txt, but they can be matched in this fold-call.ll testcase, for example. This is interesting mainly because it exposes a tricky tblgen bug; tblgen was incorrectly computing the starting index for variable_ops in the case of a complex pattern. llvm-svn: 51706	2008-05-29 21:50:34 +00:00
Dan Gohman	e256337a1a	Expand small memmovs using inline code. Set the X86 threshold for expanding memmove to a more plausible value, now that it's actually being used. llvm-svn: 51696	2008-05-29 19:42:22 +00:00
Evan Cheng	04c0915a2f	Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. llvm-svn: 51667	2008-05-29 08:22:04 +00:00
Evan Cheng	f2e38956ff	Add nounwind. llvm-svn: 51665	2008-05-29 07:09:24 +00:00
Evan Cheng	cd45b11bc1	Fix PR2289: vr defined by multiple implicit_def as result of coalescing. llvm-svn: 51648	2008-05-28 17:40:10 +00:00
Evan Cheng	591b57edd6	Teach local register allocator to deal with landing pad MBB's. llvm-svn: 51647	2008-05-28 17:22:32 +00:00
Dan Gohman	568685ffa7	Specify a target so that this tests tests what it's intended to test. llvm-svn: 51600	2008-05-27 17:55:57 +00:00
Dan Gohman	3ba9d77adb	Make this test independent of the target-triple; the stack alignment is specifically what this test depends on. llvm-svn: 51599	2008-05-27 17:44:23 +00:00
Nick Lewycky	c096899392	The Linux ABI emits an extra "movl %esp, %ebp" in function prologue and sometimes a "mov %ebp, %esp" in the epilogue. Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0 where they were written. llvm-svn: 51568	2008-05-26 20:18:56 +00:00
Evan Cheng	e9c1c96f7b	New loadl_pd and loadh_pd tests. llvm-svn: 51525	2008-05-24 00:10:02 +00:00
Evan Cheng	4f660778f0	Use movlps / movhps to modify low / high half of 16-byte memory location. llvm-svn: 51501	2008-05-23 21:23:16 +00:00
Dan Gohman	6cc0b4f262	Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add load-folding table entries for PMULDQ and PMULLD. llvm-svn: 51489	2008-05-23 17:49:40 +00:00
Evan Cheng	097e95b1f7	Bug: rcpps can only fold a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which are missing the proper alignment checks. Also fixed some 80 col. violations. llvm-svn: 51462	2008-05-23 00:37:07 +00:00
Evan Cheng	dc3a3d3a2c	Add a couple of test cases. llvm-svn: 51441	2008-05-22 21:19:19 +00:00
Evan Cheng	d1373cd497	Add missing patterns. llvm-svn: 51435	2008-05-22 18:56:56 +00:00
Chris Lattner	477239c56d	testcase for PR2267 llvm-svn: 51408	2008-05-22 04:45:22 +00:00
Evan Cheng	8e02953de8	Fix PR2343. An interesting coalescer bug. BB1: vr1025 = copy vr1024 .. BB2: vr1024 = op = op vr1025 <loop eventually branch back to BB1> Even though vr1025 is copied from vr1024, it's not safe to coalesce them since the live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop. llvm-svn: 51394	2008-05-21 22:34:12 +00:00
Gabor Greif	807c2df887	sabre brings to my attention that the 'tr' suffix is also obsolete llvm-svn: 51349	2008-05-20 21:00:03 +00:00
Gabor Greif	d8a4dbb5da	Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too. llvm-svn: 51328	2008-05-20 19:52:04 +00:00
Dan Gohman	7681889f75	Run vortex-bug as x86-64, which is what the original bug was triggered on. llvm-svn: 51289	2008-05-20 00:54:39 +00:00
Dale Johannesen	4e46c5601d	Use common where we mean common, not weak. llvm-svn: 51173	2008-05-16 00:52:30 +00:00
Dan Gohman	2da4145cd8	Fix a bug in LoopStrengthReduce that caused it to emit IR with use-before-def. The problem comes up in code with multiple PHIs where one PHI is being rewritten in terms of the other, but the other needs to be cast first. LLVM rules require the cast instruction to be inserted after any PHI instructions, but when instructions were inserted to replace the second PHI value with a function of the first, they ended up going before the cast instruction. Avoid this problem by remembering the location of the cast instruction, when one is needed, and inserting the expansion of the new value after it. This fixes a bug that surfaced in 255.vortex on x86-64 when instcombine was removed from the middle of the loop optimization passes. llvm-svn: 51169	2008-05-15 23:26:57 +00:00
Dan Gohman	cd29e1fa60	When bit-twiddling CondCode values for integer comparisons produces SETOEQ, as it does with (SETEQ & SETULE), map it to SETEQ. llvm-svn: 51112	2008-05-14 18:17:09 +00:00
Evan Cheng	9e15622879	Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset. pshufd $1, (%rdi), %xmm0 movd %xmm0, %eax => movl 4(%rdi), %eax llvm-svn: 51026	2008-05-13 08:35:03 +00:00
Evan Cheng	e4ee4c2870	On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16. llvm-svn: 51019	2008-05-13 00:54:02 +00:00
Evan Cheng	fcbdc8bd6e	Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other. llvm-svn: 51008	2008-05-12 23:04:07 +00:00
Dale Johannesen	b54491d31a	New test for tail merging llvm-svn: 51007	2008-05-12 22:59:44 +00:00
Evan Cheng	c19c639ad7	When transforming a vector_shuffle to a load, the base address must not be an undef. llvm-svn: 50940	2008-05-10 06:46:49 +00:00
Evan Cheng	fc4b8e1d96	Add nounwind. llvm-svn: 50931	2008-05-10 02:22:25 +00:00
Evan Cheng	cf4d5567d5	If all sources of a PHI node are defined by an implicit_def, just emit an implicit_def instead of a copy. llvm-svn: 50927	2008-05-10 00:17:50 +00:00
Evan Cheng	2adea48f7e	Add a pattern to do move the low element of a v4f32 and zero extend the rest. llvm-svn: 50922	2008-05-09 23:37:55 +00:00
Evan Cheng	3493e43afd	Handle a few more cases of folding load i64 into xmm and zero top bits. Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918	2008-05-09 21:53:03 +00:00
Evan Cheng	a0688bf1cb	Simplify test. llvm-svn: 50911	2008-05-09 19:56:32 +00:00
Evan Cheng	f824b47188	Use movq to move low half of XMM register and zero-extend the rest. llvm-svn: 50874	2008-05-08 22:35:02 +00:00
Evan Cheng	f97e716511	Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838	2008-05-08 00:57:18 +00:00
Evan Cheng	7ff000c175	Add nounwind. llvm-svn: 50837	2008-05-07 22:59:08 +00:00
Evan Cheng	c86c035346	Yet another nasty spiller bug. %ecx = op store %cl<kill>, (addr) (addr) = op %al It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below. llvm-svn: 50794	2008-05-07 00:49:28 +00:00
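
Note on commit 4bc8c9652e above, (x << (y&31)) -> (x << y): the commit message says the transform is valid because the x86 shift count is limited to 0 to 31 for 32-bit shifts. The following is a minimal sketch of that reasoning in Python, not LLVM code. The helper names (x86_shl32, shl_with_explicit_mask, shl_without_mask, transform_is_safe) are hypothetical, and the model assumes the hardware keeps only the low 5 bits of the count, as the message describes.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def x86_shl32(value: int, count: int) -> int:
    """Model of a 32-bit x86 SHL: only the low 5 bits of the count are used."""
    return (value << (count & 31)) & MASK32


def shl_with_explicit_mask(value: int, count: int) -> int:
    """The form before the transform: the source masks the count itself."""
    return x86_shl32(value, count & 31)


def shl_without_mask(value: int, count: int) -> int:
    """The form after the transform: the explicit 'and' is dropped."""
    return x86_shl32(value, count)


def transform_is_safe(values, counts) -> bool:
    """Check that both forms agree for every (value, count) pair given."""
    return all(
        shl_with_explicit_mask(v, c) == shl_without_mask(v, c)
        for v in values
        for c in counts
    )


if __name__ == "__main__":
    samples = [0, 1, 0x7FFFFFFF, 0x80000001, MASK32]
    print(transform_is_safe(samples, range(256)))
```

Under this model the explicit mask is redundant because masking twice with 31 gives the same count as masking once. The same argument would apply with 63 in place of 31 for 64-bit shifts.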

750 Commits