llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Chris Lattner	42b8e431b6	move a target-specific test into its directory so it isn't run if you don't configure the ARM target in. llvm-svn: 68843	2009-04-10 23:58:38 +00:00
Chris Lattner	0577b8e2ef	fix two problems with machine sinking: 1. Sinking would crash when the first instruction of a block was sunk due to iterator problems. 2. Instructions could be sunk to their current block, causing an infinite loop. This fixes PR3968 llvm-svn: 68787	2009-04-10 16:38:36 +00:00
Rafael Espindola	88986ef511	Don't fold a load if the other operand is a TLS address. With this we generate movl %gs:0, %eax leal i@NTPOFF(%eax), %eax instead of movl $i@NTPOFF, %eax addl %gs:0, %eax llvm-svn: 68778	2009-04-10 10:09:34 +00:00
Bob Wilson	c53238dff1	Fix pr3954. The register scavenger asserts for inline assembly with register destinations that are tied to source operands. The TargetInstrDescr::findTiedToSrcOperand method silently fails for inline assembly. The existing MachineInstr::isRegReDefinedByTwoAddr was very close to doing what is needed, so this revision makes a few changes to that method and also renames it to isRegTiedToUseOperand (for consistency with the very similar isRegTiedToDefOperand and because it handles both two-address instructions and inline assembly with tied registers). llvm-svn: 68714	2009-04-09 17:16:43 +00:00
Chris Lattner	301c4f39a0	reg0 references are not real registers. This fixes a crash on the attached testcase. llvm-svn: 68712	2009-04-09 16:50:43 +00:00
Dan Gohman	68de98eef3	Generalize ExtendUsesToFormExtLoad to be usable for ANY_EXTEND, in addition to ZERO_EXTEND and SIGN_EXTEND. Fix a bug in the way it checked for live-out values, and simplify the way it finds users by using SDNode::use_iterator's (relatively) new features. Also, make it slightly more permissive on targets with free truncates. In SelectionDAGBuild, avoid creating ANY_EXTEND nodes that are larger than necessary. If the target's SwitchAmountTy has enough bits, use it. This exposes the truncate to optimization early, enabling more optimizations. llvm-svn: 68670	2009-04-09 03:51:29 +00:00
Rafael Espindola	7eb72dc5f2	Re-apply 68552. Tested by bootstrapping llvm-gcc and using that to build llvm. llvm-svn: 68645	2009-04-08 21:14:34 +00:00
Bob Wilson	e0e4a070da	Add testcase for PR3795. llvm-svn: 68620	2009-04-08 18:00:55 +00:00
Duncan Sands	d0e186d90f	Soft float support for FREM. llvm-svn: 68614	2009-04-08 16:20:57 +00:00
Duncan Sands	ee34b0d05d	Soft float support for undef. Reported by Xerxes Rånby. llvm-svn: 68607	2009-04-08 13:33:37 +00:00
Dan Gohman	94fde57da3	Fully escape the grep string for this test. llvm-svn: 68580	2009-04-08 00:54:40 +00:00
Dan Gohman	b979f332fd	Update this test for recent codegen improvements. CodeGen is now using an lea in place of a mov and an add for this test. llvm-svn: 68579	2009-04-08 00:51:11 +00:00
Dan Gohman	c9ce27d6b7	Implement support for modeling implicit-zero-extension on x86-64 with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG instructions), and teach the DAGCombiner to take advantage of this on targets which support it. This eliminates many redundant zero-extension operations on x86-64. This adds a new TargetLowering hook, isZExtFree. It's similar to isTruncateFree, except it only applies to actual definitions, and not no-op truncates which may not zero the high bits. Also, this adds a new optimization to SimplifyDemandedBits: transform operations like x+y into (zext (add (trunc x), (trunc y))) on targets where all the casts are no-ops. In contexts where the high part of the add is explicitly masked off, this allows the mask operation to be eliminated. Fix the DAGCombiner to avoid undoing these transformations to eliminate casts on targets where the casts are no-ops. Also, this adds a new two-address lowering heuristic. Since two-address lowering runs before coalescing, it helps to be able to look through copies when deciding whether commuting and/or three-address conversion are profitable. Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle the case that a clobber range extended both before and beyond an existing live range. In that case, multiple live ranges need to be added. This was exposed by the new subreg coalescing code. Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the spiller behavior it was looking for no longer occurs with the new instruction selection. llvm-svn: 68576	2009-04-08 00:15:30 +00:00
Bill Wendling	6e702cf68c	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	0324937229	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Dan Gohman	e98c3b1ea1	Don't attempt to handle aggregate argument values in FastISel; let SelectionDAG do those. This fixes PR3955. llvm-svn: 68546	2009-04-07 20:40:11 +00:00
Bob Wilson	39c7bec188	Handle 'a' modifier in ARM inline assembly. Patch by Richard Pennington. llvm-svn: 68464	2009-04-06 21:46:51 +00:00
Nick Lewycky	811a7377b0	Try SSE2? llvm-svn: 68423	2009-04-04 10:24:24 +00:00
Nick Lewycky	49df985ad9	Fix test on non-x86 platforms. llvm-svn: 68419	2009-04-04 07:20:43 +00:00
Dan Gohman	ea48adc739	Fix a TargetLowering optimization so that it doesn't duplicate loads when an input node has multiple uses. llvm-svn: 68398	2009-04-03 20:11:30 +00:00
Mon P Wang	f829fb5cab	Added a x86 dag combine to increase the chances to use a movq for v2i64 on x86-32. llvm-svn: 68368	2009-04-03 02:43:30 +00:00
Bob Wilson	5b42ebe6a9	Fix PR3862: Recognize some ARM-specific constraints for immediates in inline assembly. llvm-svn: 68218	2009-04-01 17:58:54 +00:00
Evan Cheng	0674a512bf	Fully general expansion of integer shift of any size. llvm-svn: 68134	2009-03-31 19:39:24 +00:00
Dan Gohman	9f7d1f2bd4	Add an explicit -asm-verbose to these tests, to make it possible to run the tests with -asm-verbose defaulting to false. llvm-svn: 68124	2009-03-31 18:20:47 +00:00
Owen Anderson	59cff6919d	Remove the "fast" cases for spill and restore point determination, as these were subtly wrong in obscure cases. Patch the testcase to account for this change. llvm-svn: 68093	2009-03-31 08:27:09 +00:00
Dan Gohman	cba99ee717	Fix live-out reg logic to not insert over-aggressive AssertZExt instructions. This fixes lua. llvm-svn: 68083	2009-03-31 01:38:29 +00:00
Evan Cheng	d7824e208a	Turn a 2-address instruction into a 3-address one when it's profitable even if the two-address operand is killed. e.g. %reg1024<def> = MOV r1 %reg1025<def> = ADD %reg1024, %reg1026 r0 = MOV %reg1025 If it's not possible / profitable to commute ADD, then turning ADD into a LEA saves a copy. llvm-svn: 68065	2009-03-30 21:34:07 +00:00
Anton Korobeynikov	497fd0e996	Tweak test for recent relro stuff llvm-svn: 68035	2009-03-30 15:28:40 +00:00
Evan Cheng	5c460dbc3d	Forgot this test. llvm-svn: 68025	2009-03-30 06:17:34 +00:00
Anton Korobeynikov	d24f576124	Testcase for recent ro/relocs stuff llvm-svn: 68008	2009-03-29 17:14:57 +00:00
Duncan Sands	602234cdf3	Fix PR3899: add support for extracting floats from vectors when using -soft-float. Based on a patch by Jakob Stoklund Olesen. llvm-svn: 67996	2009-03-29 13:51:06 +00:00
Arnold Schwaighofer	76188bc8a1	Make check in CheckTailCallReturnConstraints for ignorable instructions between a CALL and a RET node more generic. Add a test for tail calls with a void return. llvm-svn: 67943	2009-03-28 12:36:29 +00:00
Arnold Schwaighofer	636127325b	Enable tail call optimization for functions that return a struct (bug 3664) and for functions that return types that need extending (e.g i1). llvm-svn: 67934	2009-03-28 08:33:27 +00:00
Evan Cheng	a15fdaa292	Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g. x * 40 => shlq $3, %rdi leaq (%rdi,%rdi,4), %rax This has the added benefit of allowing more multiply to be folded into addressing mode. e.g. a * 24 + b => leaq (%rdi,%rdi,2), %rax leaq (%rsi,%rax,8), %rax llvm-svn: 67917	2009-03-28 05:57:29 +00:00
Dan Gohman	66fc2f0a88	Fix this test so that it doesn't spuriously fail due to some unrelated debugging output happening to contain the string "store". llvm-svn: 67849	2009-03-27 16:17:22 +00:00
Evan Cheng	86f6af35bf	Add -march=x86. llvm-svn: 67783	2009-03-26 23:03:32 +00:00
Bill Wendling	e59b8d1cad	Add -f to RUN line. llvm-svn: 67744	2009-03-26 06:17:54 +00:00
Chris Lattner	ad28e0de5f	no need for eh info llvm-svn: 67740	2009-03-26 05:51:18 +00:00
Bill Wendling	3b9278a1c5	Add testcase for r67728. llvm-svn: 67729	2009-03-26 01:52:47 +00:00
Evan Cheng	bddc7d1032	Add a test case for PR3779: when to promote the function return value. llvm-svn: 67702	2009-03-25 20:30:19 +00:00
Evan Cheng	7e4217176a	Revert 67132. This is breaking some objective-c apps. Also fixes SDISel so it does not force promote return value if the function is not marked signext / zeroext. llvm-svn: 67701	2009-03-25 20:20:11 +00:00
Evan Cheng	3a7489a4cc	CodeGen still defaults to non-verbose asm, but llc now overrides it and defaults to verbose. llvm-svn: 67668	2009-03-25 01:47:28 +00:00
Dan Gohman	edd5fa3721	Add a testcase for the scheduling heuristic introduced in r67586. llvm-svn: 67622	2009-03-24 16:38:27 +00:00
Evan Cheng	b3196f1298	Do not emit comments unless -asm-verbose. llvm-svn: 67580	2009-03-24 00:17:40 +00:00
Evan Cheng	702a8b4399	Fix a bug in spill weight computation. If the alias is a super-register, and the super-register is in the register class we are trying to allocate, then add the weight to all sub-registers of the super-register even if they are not aliases. e.g. allocating for GR32, bh is not used, updating bl spill weight. bl should get the same spill weight otherwise it will be chosen as a spill candidate since spilling bh doesn't make ebx available. This fixes PR2866. llvm-svn: 67574	2009-03-23 22:57:19 +00:00
Dale Johannesen	34123aba43	Fix internal representation of fp80 to be the same as a normal i80 {low64, high16} rather than its own {high64, low16}. A depressing number of places know about this; I think I got them all. Bitcode readers and writers convert back to the old form to avoid breaking compatibility. llvm-svn: 67562	2009-03-23 21:16:53 +00:00
Evan Cheng	e09988f66b	Update test for pr3864. llvm-svn: 67545	2009-03-23 18:27:36 +00:00
Evan Cheng	7e4a6972d6	Fix PR3391 and PR3864. Reg allocator infinite looping. llvm-svn: 67544	2009-03-23 18:24:37 +00:00
Evan Cheng	2ec94dd447	Model inline asm constraint which ties an input to an output register as machine operand TIED_TO constraint. This eliminated the need to pre-allocate registers for these. This also allows the register allocator to eliminate the unneeded copies. llvm-svn: 67512	2009-03-23 08:01:15 +00:00
Evan Cheng	4b11d96b62	Do not fold away subreg_to_reg if the source register has a sub-register index. That means the source register is taking a sub-register of a larger register. e.g. On x86 %RAX<def> = ... %RAX<def> = SUBREG_TO_REG 0, %EAX:3<kill>, 3 The first def is defining RAX, not EAX so the top bits were not zero-extended. llvm-svn: 67511	2009-03-23 07:19:58 +00:00
Rafael Espindola	4461c91700	Add -relocation-model=pic so that the test works both in Linux and Darwin. llvm-svn: 67191	2009-03-18 09:38:28 +00:00
Mon P Wang	3d7fb6738a	Added missing support for widening when splitting an unary op (PR3683) and expanding a bit convert (PR3711). In both cases, we extract the valid part of the widen vector and then do the conversion. llvm-svn: 67175	2009-03-18 06:24:04 +00:00
Evan Cheng	2a51157172	Add another test case for r64440. llvm-svn: 67156	2009-03-18 02:43:01 +00:00
Chris Lattner	205380a4e4	Disable the "call to immediate" optimization on x86-64. It is not safe in general because the immediate could be an arbitrary value that does not fit in a 32-bit pcrel displacement. Conservatively fall back to loading the value into a register and calling through it. We still do the optzn on X86-32. llvm-svn: 67142	2009-03-18 00:43:52 +00:00
Bill Wendling	28358589ec	A more proper -mtriple. llvm-svn: 67138	2009-03-18 00:19:44 +00:00
Bill Wendling	8ec55afd27	Temporary fix. I think Rafael wanted this to be Linux-only. llvm-svn: 67137	2009-03-18 00:16:36 +00:00
Chris Lattner	ee2d69fc7b	LSR shouldn't ever try to hack on integer IV's larger than 64-bits. Right now it is not APInt clean, but even when it is it needs to be evaluated carefully to determine whether it is actually profitable. This fixes a crash on PR3806 llvm-svn: 67134	2009-03-17 23:58:30 +00:00
Rafael Espindola	6a6d9e48dd	Don't force promotion of return arguments on the callee. Some architectures (like x86) don't require it. This fixes bug 3779. llvm-svn: 67132	2009-03-17 23:43:59 +00:00
Chris Lattner	970d2828f4	this is apparently passing now. Evan/Dan, please check to see if this is producing the expected code or not, I'm not sure what the test was intended to check. llvm-svn: 67099	2009-03-17 20:23:43 +00:00
Chris Lattner	e3c442050d	Fix codegen to compute the size of an allocation by multiplying the size by the array amount as an i32 value instead of promoting from i32 to i64 then doing the multiply. Not doing this broke wrap-around assumptions that the optimizers (validly) made. The ultimate real fix for this is to introduce i64 version of alloca and remove mallocinst. This fixes PR3829 llvm-svn: 67093	2009-03-17 19:36:00 +00:00
Evan Cheng	3e355f6c0d	Add newline at end of file. llvm-svn: 67085	2009-03-17 17:08:25 +00:00
Scott Michel	a023598ad3	CellSPU: Revert inadvertent mis-fix of fneg. llvm-svn: 67084	2009-03-17 16:45:16 +00:00
Duncan Sands	34e7f207ee	Reapply r67049, with the test adjusted for darwin (which produces "call L_f$stub" rather than "call f"). llvm-svn: 67079	2009-03-17 09:46:22 +00:00
Mon P Wang	7184e30bdb	Fix a problem with DAGCombine where we were building an illegal build vector shuffle mask. Forced the mask to be built using i32. Note: this will be irrelevant once vector_shuffle no longer takes a build vector for the shuffle mask. llvm-svn: 67076	2009-03-17 06:33:10 +00:00
Dan Gohman	f6c57d0fe7	Recognize bswapl as bswap too. llvm-svn: 67072	2009-03-17 02:45:40 +00:00
Dan Gohman	4efda2b52b	Recognize "bswapq" as an alternate spelling for the bswap instruction. llvm-svn: 67071	2009-03-17 02:17:27 +00:00
Evan Cheng	0dccb325d1	Spiller may unfold load / mod / store instructions as an optimization when the would be loaded value is available in a register. It needs to check if it's legal to clobber the register. Also, the register can contain values of multiple spill slots, make sure to check all instead of just the one being unfolded. llvm-svn: 67068	2009-03-17 01:23:09 +00:00
Scott Michel	2c4ac99ef8	CellSPU: - Fix fabs, fneg for f32 and f64. - Use BuildVectorSDNode.isConstantSplat, now that the functionality exists - Continue to improve i64 constant lowering. Lower certain special constants to the constant pool when they correspond to SPU's shufb instruction's special mask values. This avoids the overhead of performing a shuffle on a zero-filled vector just to get the special constant when the memory load suffices. llvm-svn: 67067	2009-03-17 01:15:45 +00:00
Bill Wendling	51bfef84e1	--- Reverse-merging (from foreign repository) r67049 into '.': U test/CodeGen/X86/2009-03-13-PHIElimBug.ll D test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll U lib/CodeGen/PHIElimination.cpp r67049 was causing this failure: Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/dg.exp ... FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll for PR3784 Failed with exit(1) at line 1 while running: llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll | llc -march=x86 | /usr/bin/grep -A 2 {call f} | /usr/bin/grep movl child process exited abnormally llvm-svn: 67051	2009-03-16 20:27:20 +00:00
Duncan Sands	8572084279	Tweak the fix for PR3784: be less sensitive about just how invokes are set up. The fix could be disturbed by register copies coming after the EH_LABEL, and also didn't behave quite right when it was the invoke result that was used in a phi node. Also (see new testcase) fix another phi elimination bug while there: register copies in the landing pad need to come after the EH_LABEL, because that's where execution branches to when unwinding. If they come before the EH_LABEL then they will never be executed... Also tweak the original testcase so it doesn't use a no-longer existing counter. The accumulated phi elimination changes fix two of seven Ada testsuite failures that turned up after landing pad critical edge splitting was turned off. So there's probably more to come. llvm-svn: 67049	2009-03-16 19:58:38 +00:00
Scott Michel	2e2bccf754	CellSPU: Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the llvm-gcc bootstrap a bit further along. llvm-svn: 67048	2009-03-16 18:47:25 +00:00
Dan Gohman	c938e0ab02	Add a testcase that covers a wide variety of ABI isel cases. llvm-svn: 67003	2009-03-14 02:35:10 +00:00
Dan Gohman	fd6debff99	Use %rip-relative addressing on x86-64 whenever practical, as it has a smaller encoding than absolute addressing. llvm-svn: 67002	2009-03-14 02:33:41 +00:00
Dan Gohman	5c4cd59ce4	Add a few more ptrtoint/inttoptr cast tests. llvm-svn: 66989	2009-03-13 23:54:51 +00:00
Dan Gohman	fa0a3504ba	Improve FastISel's handling of truncates to i1, and implement ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension. llvm-svn: 66988	2009-03-13 23:53:06 +00:00
Evan Cheng	cda58e565f	Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make sure the copy is inserted before the try range (unless it's used as an input to the invoke, then insert it after the last use), not at the end of the bb. Also re-apply r66140 which was disabled as a workaround. llvm-svn: 66976	2009-03-13 22:59:14 +00:00
Dan Gohman	790659c0d6	Fix FastISel's assumption that i1 values are always zero-extended by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended. llvm-svn: 66941	2009-03-13 20:42:20 +00:00
Rafael Espindola	ff17d02271	Improve sext and zext of TLS variables. llvm-svn: 66922	2009-03-13 18:37:06 +00:00
Evan Cheng	f9951d1557	Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. 1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants. 2. MachineConstantPool alignment field is also a log2 value. 3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values. 4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries. 5. Asm printer uses expensive data structure multimap to track constant pool entries by sections. 6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic. Solutions: 1. ConstantPoolSDNode alignment field is changed to keep non-log2 value. 2. MachineConstantPool alignment field is also changed to keep non-log2 value. 3. Functions that create ConstantPool nodes are passing in non-log2 alignments. 4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT. 5. Asm printer uses cheaper data structure to group constant pool entries. 6. Asm printer compute entry offsets after grouping is done. 7. Change JIT code to compute entry offsets on the fly. llvm-svn: 66875	2009-03-13 07:51:59 +00:00
Chris Lattner	cbbdd230dd	generalize the previous code to use the full generality of LEA for i32/i64 expressions (we could also do i16 on cpus where i16 lea is fast, but I didn't add this). On the example, we now generate: _test: movl 4(%esp), %eax cmpl $42, (%eax) setl %al movzbl %al, %eax leal 4(%eax,%eax,8), %eax ret instead of: _test: movl 4(%esp), %eax cmpl $41, (%eax) movl $4, %ecx movl $13, %eax cmovg %ecx, %eax ret llvm-svn: 66869	2009-03-13 05:53:31 +00:00
Chris Lattner	878d951f8f	optimize the case of cond ? 42 : 41 and friends. This compiles the example to: _test: movl 4(%esp), %eax cmpl $41, (%eax) setg %al movzbl %al, %eax orl $4294967294, %eax ret instead of: movl 4(%esp), %eax cmpl $41, (%eax) movl $4294967294, %ecx movl $4294967295, %eax cmova %ecx, %eax ret which is smaller in code size and faster. rdar://6668608 llvm-svn: 66868	2009-03-13 05:22:11 +00:00
Dan Gohman	37d843c129	Enhance address-mode folding of ISD::ADD to handle cases where the operands can't both be fully folded at the same time. For example, in the included testcase, a global variable is being added with an add of two values. The global variable wants RIP-relative addressing, so it can't share the address with another base register, but it's still possible to fold the initial add. llvm-svn: 66865	2009-03-13 02:25:09 +00:00
Evan Cheng	27be5272b4	Add this test back. llvm-svn: 66838	2009-03-12 23:01:35 +00:00
Duncan Sands	968939fca2	Revert commit 66140 since it caused several failures in the Ada testcase. Reverting this only covers up the real problem, which is a nasty conceptual difficulty in the phi elimination pass: when eliminating phi nodes in landing pads, the register copies need to come before the invoke, not at the end of the basic block which is too late... See PR3784. llvm-svn: 66826	2009-03-12 21:13:42 +00:00
Evan Cheng	e04a84ed9f	Typo. llvm-svn: 66797	2009-03-12 17:07:39 +00:00
Evan Cheng	45aa89fa9a	Fix test after Chris' select changes. llvm-svn: 66795	2009-03-12 16:10:08 +00:00
Chris Lattner	26a971c4ec	Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll). Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to: _test: movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 seta %al movzbl %al, %eax movl 4(%esp), %ecx movsbl (%ecx,%eax,4), %eax ret instead of: _test: movl 4(%esp), %eax leal 4(%eax), %ecx movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 cmovbe %eax, %ecx movsbl (%ecx), %eax ret This passes multisource and dejagnu. llvm-svn: 66779	2009-03-12 06:52:53 +00:00
Evan Cheng	46e903d2f6	On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead. llvm-svn: 66776	2009-03-12 05:59:15 +00:00
Chris Lattner	a5368fd283	add no-unwind, remove duplicate run line. llvm-svn: 66775	2009-03-12 05:56:37 +00:00
Chris Lattner	cb52d81eea	add nounwinds llvm-svn: 66773	2009-03-12 05:35:33 +00:00
Dan Gohman	d30e108f0e	Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the assembly text output uses an indirect call ("call *") instead of a direct call. llvm-svn: 66735	2009-03-11 23:01:47 +00:00
Rafael Espindola	a8fe373200	optimize i8 and i16 tls values. llvm-svn: 66725	2009-03-11 22:40:04 +00:00
Evan Cheng	042e05cb31	My last coalescer fix introduced a subtler one. It's aborting a commuting optimization too late and left the live intervals to be out of sync with instructions. This fixes 8b10b. llvm-svn: 66715	2009-03-11 22:18:44 +00:00
Mon P Wang	287e422039	For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits. llvm-svn: 66684	2009-03-11 18:47:57 +00:00
Mon P Wang	2867737ad2	Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw. llvm-svn: 66645	2009-03-11 06:35:11 +00:00
Chris Lattner	a0c4a99ec2	reapply my previous patch (r66358) with a tweak to set the alignment of the generated constant pool entry to the desired alignment of a type. If we don't do this, we end up trying to do movsd from 4-byte alignment memory. This fixes 450.soplex and 456.hmmer. llvm-svn: 66641	2009-03-11 05:08:08 +00:00
Evan Cheng	264173da40	Two coalescer fixes in one. 1. Use the same value# to represent unknown values being merged into sub-registers. 2. When the coalescer commutes an instruction and the destination is a physical register, update its sub-registers by merging in the extended ranges. llvm-svn: 66610	2009-03-11 00:03:21 +00:00
Bill Wendling	5af061a657	Readd test, but XFAIL it. llvm-svn: 66581	2009-03-10 21:31:00 +00:00
Evan Cheng	535dbf4ffd	Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 / Darwin. llvm-svn: 66574	2009-03-10 20:47:18 +00:00
Bill Wendling	4ddb112b2f	Add radar number. llvm-svn: 66534	2009-03-10 06:53:54 +00:00
Chris Lattner	03060f6d50	wire up support for emitting "special" values from inline asm format strings with the standard ${:foo} syntax. llvm-svn: 66527	2009-03-10 05:37:13 +00:00
Chris Lattner	93662edfb8	Fix PR3763 by using proper APInt methods instead of uint64_t's. llvm-svn: 66434	2009-03-09 20:22:18 +00:00
Evan Cheng	3c9a084a1b	ARM isLegalAddressImmediate should check if type is a simple type now that optimizer can create values of funky scalar types. llvm-svn: 66429	2009-03-09 19:15:00 +00:00
Evan Cheng	8b47b553f9	Yet another case where the spiller marked two uses of the same register on the same instruction as kill. This fixes PR3706. llvm-svn: 66428	2009-03-09 19:00:05 +00:00
Evan Cheng	67ccd79e39	Recognize triplets starting with armv5-, armv6- etc. And set the ARM arch version accordingly. llvm-svn: 66365	2009-03-08 04:02:49 +00:00
Evan Cheng	483ece4d2e	If a MI uses the same register more than once, only mark one of them as 'kill'. llvm-svn: 66363	2009-03-08 03:58:35 +00:00
Chris Lattner	578b634f56	implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4. For 2009-03-07-FPConstSelect.ll we now produce: _f: xorl %eax, %eax testl %edi, %edi movl $4, %ecx cmovne %rax, %rcx leaq LCPI1_0(%rip), %rax movss (%rcx,%rax), %xmm0 ret previously we produced: _f: subl $4, %esp cmpl $0, 8(%esp) movss LCPI1_0, %xmm0 je LBB1_2 ## entry LBB1_1: ## entry movss LCPI1_1, %xmm0 LBB1_2: ## entry movss %xmm0, (%esp) flds (%esp) addl $4, %esp ret on PPC the code also improves to: _f: cntlzw r2, r3 srwi r2, r2, 5 li r3, lo16(LCPI1_0) slwi r2, r2, 2 addis r3, r3, ha16(LCPI1_0) lfsx f1, r3, r2 blr from: _f: li r2, lo16(LCPI1_1) cmplwi cr0, r3, 0 addis r2, r2, ha16(LCPI1_1) beq cr0, LBB1_2 ; entry LBB1_1: ; entry li r2, lo16(LCPI1_0) addis r2, r2, ha16(LCPI1_0) LBB1_2: ; entry lfs f1, 0(r2) blr This also improves the existing pic-cpool case from: foo: subl $12, %esp call .Lllvm$1.$piclabel .Lllvm$1.$piclabel: popl %eax addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax cmpl $0, 16(%esp) movsd .LCPI1_0@GOTOFF(%eax), %xmm0 je .LBB1_2 # entry .LBB1_1: # entry movsd .LCPI1_1@GOTOFF(%eax), %xmm0 .LBB1_2: # entry movsd %xmm0, (%esp) fldl (%esp) addl $12, %esp ret to: foo: call .Lllvm$1.$piclabel .Lllvm$1.$piclabel: popl %eax addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax xorl %ecx, %ecx cmpl $0, 4(%esp) movl $8, %edx cmovne %ecx, %edx fldl .LCPI1_0@GOTOFF(%eax,%edx) ret This triggers a few dozen times in spec FP 2000. llvm-svn: 66358	2009-03-08 01:51:30 +00:00
Dan Gohman	b9c32f1aca	Arithmetic instructions don't set the EFLAGS bits OF and CF the same way the "test" instruction does in overflow cases, so eliminating the test is only safe when those bits aren't needed, as is the case for COND_E and COND_NE, or if it can be proven that no overflow will occur. For now, just restrict the optimization to COND_E and COND_NE and don't do any overflow analysis. llvm-svn: 66318	2009-03-07 01:58:32 +00:00
Dan Gohman	42049dafd7	Fix ScheduleDAGRRList::CopyAndMoveSuccessors' handling of nodes with multiple chain operands. This can occur when the scheduler has added chain operands to a node that already has a chain operand, in order to handle physical register dependencies. This fixes an llvm-gcc bootstrap failure on x86-64 introduced in r66058. llvm-svn: 66240	2009-03-06 02:23:01 +00:00
Dan Gohman	f6f684b206	Fix the "test" optimization to recognize "dec" as an add of negative one, as subtracts of immediates are canonicalized to adds. llvm-svn: 66180	2009-03-05 19:32:48 +00:00
Dan Gohman	5e88bf4c00	Make this test more thorough. Not only should there be no %esi, there should be no spilling of anything. llvm-svn: 66179	2009-03-05 19:31:32 +00:00
Evan Cheng	24c138a1cd	Do not split edges to EH landing pads. It will cause code size explosion. llvm-svn: 66140	2009-03-05 06:31:26 +00:00
Dan Gohman	31fb085c2e	Re-apply 66008, now that the unfoldMemoryOperand bug is fixed. llvm-svn: 66058	2009-03-04 19:44:21 +00:00
Owen Anderson	fb7f64ea0d	Add a restore folder, which shaves a dozen or so machineinstrs off oggenc. Update a testcase to check this. llvm-svn: 66029	2009-03-04 08:52:31 +00:00
Evan Cheng	7d9019d0f3	Fix PR3666: isel calls to constant addresses. llvm-svn: 66024	2009-03-04 06:48:53 +00:00
Eli Friedman	1bed40b86a	PR3686: make the legalizer handle bitcast from i80 to x86 long double. llvm-svn: 66021	2009-03-04 06:23:34 +00:00
Dan Gohman	6831e2c2a6	Revert r66004 for now; it's causing a variety of test failures. llvm-svn: 66008	2009-03-04 03:54:19 +00:00
Evan Cheng	32eef2f73f	Rename test. llvm-svn: 66006	2009-03-04 02:47:25 +00:00
Dan Gohman	c6c669cc1e	Teach the x86 backend to eliminate "test" instructions by using the EFLAGS result from add, sub, inc, and dec instructions in simple cases. llvm-svn: 66004	2009-03-04 02:33:24 +00:00
Evan Cheng	db402a7a49	Fix PR3701. 1. X86 target renamed eflags register to flags. This matches what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs. llvm-svn: 65996	2009-03-04 01:41:49 +00:00
Bill Wendling	93eeea0493	The DAG combiner was performing a BT combine. The BT combine had a value of -1, so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it would go through the DAG combiner again. This time it had a value of 31, which was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong forever. Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded value is an XOR of all ones. llvm-svn: 65985	2009-03-04 00:18:06 +00:00
Nate Begeman	6b41d33726	Fix a problem with DAGCombine on 64b targets where folding extracts + build_vector into a shuffle would fail, because the type of the new build_vector would not be legal. Try harder to create a legal build_vector type. Note: this will be totally irrelevant once vector_shuffle no longer takes a build_vector for shuffle mask. New: _foo: xorps %xmm0, %xmm0 xorps %xmm1, %xmm1 subps %xmm1, %xmm1 mulps %xmm0, %xmm1 addps %xmm0, %xmm1 movaps %xmm1, 0 Old: _foo: xorps %xmm0, %xmm0 movss %xmm0, %xmm1 xorps %xmm2, %xmm2 unpcklps %xmm1, %xmm2 pshufd $80, %xmm1, %xmm1 unpcklps %xmm1, %xmm2 pslldq $16, %xmm2 pshufd $57, %xmm2, %xmm1 subps %xmm0, %xmm1 mulps %xmm0, %xmm1 addps %xmm0, %xmm1 movaps %xmm1, 0 llvm-svn: 65791	2009-03-01 23:44:07 +00:00
Evan Cheng	276f9b02c5	Minor optimization: Look for situations like this: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 Commute the ADD to hopefully eliminate an otherwise unavoidable copy. llvm-svn: 65752	2009-03-01 02:03:43 +00:00
Evan Cheng	9c3ce7905e	Last commit accidentally deleted this code. llvm-svn: 65679	2009-02-28 06:02:14 +00:00
Rafael Espindola	880e63bf01	Refactor TLS code and add some tests. The tests and expected results are: pic \| declaration \| linkage \| visibility \| !pic \| declaration \| external \| default \| tls1.ll tls2.ll \| local exec pic \| declaration \| external \| default \| tls1-pic.ll tls2-pic.ll \| general dynamic !pic \| !declaration \| external \| default \| tls3.ll tls4.ll \| initial exec pic \| !declaration \| external \| default \| tls3-pic.ll tls4-pic.ll \| general dynamic !pic \| declaration \| external \| hidden \| tls7.ll tls8.ll \| local exec pic \| declaration \| external \| hidden \| X \| local dynamic !pic \| !declaration \| external \| hidden \| tls9.ll tls10.ll \| local exec pic \| !declaration \| external \| hidden \| X \| local dynamic !pic \| declaration \| internal \| default \| tls5.ll tls6.ll \| local exec pic \| declaration \| internal \| default \| X \| local dynamic The ones marked with an X have not been implemented since local dynamic is not implemented. llvm-svn: 65632	2009-02-27 13:37:18 +00:00
Evan Cheng	257e065df6	Make sure this test passes on linux-ppc. llvm-svn: 65600	2009-02-27 00:51:50 +00:00
Evan Cheng	0ca6e3dba5	MachineLICM CSE should match destination register classes; avoid hoisting implicit_def's. llvm-svn: 65592	2009-02-27 00:02:22 +00:00
Evan Cheng	4014a9a5b8	ADDS{D\|S}rr_Int and MULS{D\|S}rr_Int are not commutable. The users of these intrinsics expect the high bits will not be modified. llvm-svn: 65499	2009-02-26 03:12:02 +00:00
Evan Cheng	86fc9440db	The last commit was overly conservative. It's ok to reuse value that's already marked livein. llvm-svn: 65498	2009-02-26 03:02:21 +00:00
Evan Cheng	ec34226c2b	Revert BuildVectorSDNode related patches: 65426, 65427, and 65296. llvm-svn: 65482	2009-02-25 22:49:59 +00:00
Dan Gohman	3766eeea36	Fast-isel can't do TLS yet, so it should fall back to SDISel if it sees TLS addresses. llvm-svn: 65341	2009-02-23 22:03:08 +00:00
Dan Gohman	c1229a3a86	Use the -stack-alignment option instead of using a target triple for avoiding dynamic stack realignment. llvm-svn: 65319	2009-02-23 16:34:46 +00:00
Evan Cheng	dd139e795c	Only v1i16 (i.e. _m64) is returned via RAX / RDX. llvm-svn: 65313	2009-02-23 09:03:22 +00:00
Nate Begeman	7d9e99560f	Make this test use darwin target triple, to avoid stack traffic on linux. llvm-svn: 65312	2009-02-23 09:03:06 +00:00
Nate Begeman	e0093d2501	Generate better code for v8i16 shuffles on SSE2 Generate better code for v16i8 shuffles on SSE2 (avoids stack) Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops. Document the shuffle matching logic and add some FIXMEs for later further cleanups. New tests that test the above. Examples: New: _shuf2: pextrw $7, %xmm0, %eax punpcklqdq %xmm1, %xmm0 pshuflw $128, %xmm0, %xmm0 pinsrw $2, %eax, %xmm0 Old: _shuf2: pextrw $2, %xmm0, %eax pextrw $7, %xmm0, %ecx pinsrw $2, %ecx, %xmm0 pinsrw $3, %eax, %xmm0 movd %xmm1, %eax pinsrw $4, %eax, %xmm0 ret ========= New: _shuf4: punpcklqdq %xmm1, %xmm0 pshufb LCPI1_0, %xmm0 Old: _shuf4: pextrw $3, %xmm0, %eax movsd %xmm1, %xmm0 pextrw $3, %xmm1, %ecx pinsrw $4, %ecx, %xmm0 pinsrw $5, %eax, %xmm0 ======== New: _shuf1: pushl %ebx pushl %edi pushl %esi pextrw $1, %xmm0, %eax rolw $8, %ax movd %xmm0, %ecx rolw $8, %cx pextrw $5, %xmm0, %edx pextrw $4, %xmm0, %esi pextrw $3, %xmm0, %edi pextrw $2, %xmm0, %ebx movaps %xmm0, %xmm1 pinsrw $0, %ecx, %xmm1 pinsrw $1, %eax, %xmm1 rolw $8, %bx pinsrw $2, %ebx, %xmm1 rolw $8, %di pinsrw $3, %edi, %xmm1 rolw $8, %si pinsrw $4, %esi, %xmm1 rolw $8, %dx pinsrw $5, %edx, %xmm1 pextrw $7, %xmm0, %eax rolw $8, %ax movaps %xmm1, %xmm0 pinsrw $7, %eax, %xmm0 popl %esi popl %edi popl %ebx ret Old: _shuf1: subl $252, %esp movaps %xmm0, (%esp) movaps %xmm0, 16(%esp) movaps %xmm0, 32(%esp) movaps %xmm0, 48(%esp) movaps %xmm0, 64(%esp) movaps %xmm0, 80(%esp) movaps %xmm0, 96(%esp) movaps %xmm0, 224(%esp) movaps %xmm0, 208(%esp) movaps %xmm0, 192(%esp) movaps %xmm0, 176(%esp) movaps %xmm0, 160(%esp) movaps %xmm0, 144(%esp) movaps %xmm0, 128(%esp) movaps %xmm0, 112(%esp) movzbl 14(%esp), %eax movd %eax, %xmm1 movzbl 22(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 42(%esp), %eax movd %eax, %xmm1 movzbl 50(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm1, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 77(%esp), %eax movd %eax, %xmm1 movzbl 84(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 104(%esp), %eax movd %eax, %xmm1 punpcklbw %xmm1, %xmm0 punpcklbw %xmm2, %xmm0 movaps %xmm0, %xmm1 punpcklbw %xmm3, %xmm1 movzbl 127(%esp), %eax movd %eax, %xmm0 movzbl 135(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 155(%esp), %eax movd %eax, %xmm0 movzbl 163(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm0, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 188(%esp), %eax movd %eax, %xmm0 movzbl 197(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 217(%esp), %eax movd %eax, %xmm4 movzbl 225(%esp), %eax movd %eax, %xmm0 punpcklbw %xmm4, %xmm0 punpcklbw %xmm2, %xmm0 punpcklbw %xmm3, %xmm0 punpcklbw %xmm1, %xmm0 addl $252, %esp ret llvm-svn: 65311	2009-02-23 08:49:38 +00:00
Scott Michel	3f8637305f	Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR instruction. The class also consolidates the code for detecting constant splats that's shared across PowerPC and the CellSPU backends (and might be useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for generating new BUILD_VECTOR nodes. llvm-svn: 65296	2009-02-22 23:36:09 +00:00
Richard Pennington	72dc7c621e	bug 3610: Test case. llvm-svn: 65287	2009-02-22 15:54:44 +00:00
Evan Cheng	41687ff389	If a use operand is marked isKill, don't forget to add kill to its live interval as well. llvm-svn: 65279	2009-02-22 08:35:56 +00:00
Evan Cheng	4385f393f7	Be bug compatible with gcc by returning MMX values in RAX. llvm-svn: 65274	2009-02-22 08:05:12 +00:00
Anton Korobeynikov	5df82e3e25	Drop a bunch of half-working stuff in the ext_weak linkage support. Now we're using one gross, but quite robust hack :) (previous ones did not work, for example, when an ext_weak symbol was used deep inside a constant expression in the initializer). The proper fix of this problem will require some quite huge asmprinter changes and that's why it was postponed. This fixes PR3629 by the way :) llvm-svn: 65230	2009-02-21 11:53:32 +00:00
Evan Cheng	d4dc06f55a	If two-address def is dead and the instruction does not define other registers, and it doesn't produce side effects, just delete the instruction. llvm-svn: 65218	2009-02-21 03:14:25 +00:00
Evan Cheng	56b43045f6	Teach LSR sink to sink the immediate portion of the common expression back into uses if they fit in address modes of all the uses. llvm-svn: 65215	2009-02-21 02:06:47 +00:00
Evan Cheng	c2541a4450	Fix strange logic in CollectIVUsers used to determine whether all uses are addresses, part 1. This fixes an obvious logic bug. Previously if the only in-loop use is a PHI, it would return AllUsesAreAddresses as true. llvm-svn: 65178	2009-02-20 22:16:49 +00:00
Evan Cheng	c40c3e28f7	Support return of MMX values in 64-bit mode. llvm-svn: 65152	2009-02-20 20:43:02 +00:00
Owen Anderson	8d12de1c25	Fix a crash in the pre-alloc splitter exposed by recent codegen changes. llvm-svn: 65121	2009-02-20 10:02:23 +00:00
Chris Lattner	c25ea86424	make these tests pass when run on a G5. llvm-svn: 65117	2009-02-20 07:10:11 +00:00
Dan Gohman	4e8fc41d48	Implement "superhero" strength reduction, or full strength reduction of address calculations down to basic pointer arithmetic. This is currently off by default, as it needs a few other features before it becomes generally useful. And even when enabled, full strength reduction is only performed when it doesn't increase register pressure, and when several other conditions are true. This also factors a bunch of existing LSR code out of StrengthReduceStridedIVUsers into separate functions, and tidies up IV insertion. This actually decreases register pressure even in non-superhero mode. The change in iv-users-in-other-loops.ll is an example of this; there are two more adds because there are two fewer leas, and there is less spilling. llvm-svn: 65108	2009-02-20 04:17:46 +00:00
Evan Cheng	bd63a1f40d	GV with null value initializer shouldn't go to BSS if it's meant for a mergeable strings section. Currently it only checks for Darwin. Someone else please check if it should apply to other targets as well. llvm-svn: 64877	2009-02-18 02:19:52 +00:00
Evan Cheng	2e0cf05ad2	A couple of places where reused use operands should be marked kill. This is exposed by recent availability fallthrough changes. llvm-svn: 64745	2009-02-17 06:41:03 +00:00
Dan Gohman	3d93bc5654	Change these tests to use regular loads instead of llvm.x86.sse2.loadu.dq. Enhance instcombine to use the preferred field of GetOrEnforceKnownAlignment in more cases, so that regular IR operations are optimized in the same way that the intrinsics currently are. llvm-svn: 64623	2009-02-16 00:44:23 +00:00

1735 Commits