llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Chris Lattner	99ca33d324	move .set generation out of DwarfPrinter into AsmPrinter and MCize it. llvm-svn: 98010	2010-03-08 23:58:37 +00:00
Chris Lattner	10d571f349	simplify EmitSectionOffset to always use .set if it is available, the only thing this affects is that we produce .set in one case we didn't before, which shouldn't harm anything. Make EmitSectionOffset call EmitDifference instead of duplicating it. llvm-svn: 98005	2010-03-08 23:23:25 +00:00
Bob Wilson	116599fe52	Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit immediate instructions cannot set the condition codes, so they do not have the extra cc_out operand. We hit an assertion during tail duplication because the instruction being duplicated had more operands than expected. llvm-svn: 98001	2010-03-08 22:56:15 +00:00
Evan Cheng	cfe037000a	Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll. llvm-svn: 97980	2010-03-08 21:05:02 +00:00
Wesley Peck	bc2c6d7b1b	Re-committing the failed r97807 commit with changes to eliminate warnings. llvm-svn: 97891	2010-03-06 23:23:12 +00:00
Anton Korobeynikov	e0e616a74d	Initial bits of ARMv4-only support. Patch by John Tytgat! llvm-svn: 97886	2010-03-06 19:39:36 +00:00
Anton Korobeynikov	6c841e6a44	Do not use '&' prefix for globals when register base field is non-zero, otherwise msp430-as will silently miscompile the code (TI's assembler reports an error though). This fixes PR6349 llvm-svn: 97877	2010-03-06 11:41:12 +00:00
Chris Lattner	b2ed5cb501	revert r97807, it introduced build warnings. llvm-svn: 97869	2010-03-06 04:32:46 +00:00
Charles Davis	faa2f44081	Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This is a workaround for <rdar://problem/7672401/> (which I filed). This lets us build Wine on Darwin, and it gets the Qt build there a little bit further (so Doug says). llvm-svn: 97845	2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen	4e033d2070	Better handling of dead super registers in LiveVariables. We used to do this: CALL ... %RAX<imp-def> ... [not using %RAX] %EAX = ..., %RAX<imp-use, kill> RET %EAX<imp-use,kill> Now we do this: CALL ... %RAX<imp-def, dead> ... [not using %RAX] %EAX = ... RET %EAX<imp-use,kill> By not artificially keeping %RAX alive, we lower register pressure a bit. The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously 55, anybody can see that. Sheesh. llvm-svn: 97838	2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen	1fce28720a	We don't really care about correct register liveness information after the post-ra scheduler has run. Disable the verifier checks that late in the game. llvm-svn: 97837	2010-03-05 21:49:13 +00:00
Jakob Stoklund Olesen	67476519d7	Avoid creating bad PHI instructions when BR is being const-folded. llvm-svn: 97836	2010-03-05 21:49:10 +00:00
Chris Lattner	e1452aba94	fix bss section printing for cell, patch by Kalle Raiskila! llvm-svn: 97814	2010-03-05 18:55:36 +00:00
Wesley Peck	79c2f7afcc	Reworking the stack layout that the MicroBlaze backend generates. The MicroBlaze backend was generating stack layouts that did not conform correctly to the ABI. This update generates stack layouts which are closer to what GCC does. Variable arguments support was added as well but the stack layout for varargs has not been finalized. llvm-svn: 97807	2010-03-05 15:26:02 +00:00
Evan Cheng	eb43cbfc75	Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call. llvm-svn: 97797	2010-03-05 08:38:04 +00:00
Chris Lattner	80aaccb987	Fix PR6497, a bug where we'd fold a load into an addc node which has a flag. That flag in turn was used by an already-selected adde which turned into an ADC32ri8 which used a selected load which was chained to the load we folded. This flag use caused us to form a cycle. Fix this by not ignoring chains in IsLegalToFold even in cases where the isel thinks it can. llvm-svn: 97791	2010-03-05 06:19:13 +00:00
Chris Lattner	5804690a45	cleanup llvm-svn: 97790	2010-03-05 06:17:43 +00:00
Evan Cheng	04b9deff58	Revert 96389 and 96990. They are causing some miscompilation that I do not fully understand. llvm-svn: 97782	2010-03-05 03:08:23 +00:00
Bill Wendling	a92a5b8a6c	Revert r97766. It's deleting a tag. llvm-svn: 97768	2010-03-05 00:33:59 +00:00
Bill Wendling	55b4dfcd9d	Micro-optimization: This code: float floatingPointComparison(float x, float y) { double product = (double)x * y; if (product == 0.0) return product; return product - 1.0; } produces this: _floatingPointComparison: 0000000000000000 cvtss2sd %xmm1,%xmm1 0000000000000004 cvtss2sd %xmm0,%xmm0 0000000000000008 mulsd %xmm1,%xmm0 000000000000000c pxor %xmm1,%xmm1 0000000000000010 ucomisd %xmm1,%xmm0 0000000000000014 jne 0x00000004 0000000000000016 jp 0x00000002 0000000000000018 jmp 0x00000008 000000000000001a addsd 0x00000006(%rip),%xmm0 0000000000000022 cvtsd2ss %xmm0,%xmm0 0000000000000026 ret The "jne/jp/jmp" sequence can be reduced to this instead: _floatingPointComparison: 0000000000000000 cvtss2sd %xmm1,%xmm1 0000000000000004 cvtss2sd %xmm0,%xmm0 0000000000000008 mulsd %xmm1,%xmm0 000000000000000c pxor %xmm1,%xmm1 0000000000000010 ucomisd %xmm1,%xmm0 0000000000000014 jp 0x00000002 0000000000000016 je 0x00000008 0000000000000018 addsd 0x00000006(%rip),%xmm0 0000000000000020 cvtsd2ss %xmm0,%xmm0 0000000000000024 ret for a savings of 2 bytes. This xform can happen when we recognize that jne and jp jump to the same "true" MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch is the fall-through MBB. llvm-svn: 97766	2010-03-05 00:24:26 +00:00
Johnny Chen	b6d35fd803	Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version of either sxtb16 or uxtb16, and the unified syntax does not specify ".w". llvm-svn: 97760	2010-03-04 22:24:41 +00:00
Bob Wilson	188e15d7a5	pr6478: The frame pointer spill frame index is only defined when there is a frame pointer. llvm-svn: 97755	2010-03-04 21:42:36 +00:00
Bob Wilson	73b96c00d2	pr6480: Don't try producing ld/st-multiple instructions when the address is an undef value. This is only going to come up for bugpoint-reduced tests -- correct programs will not access memory at undefined addresses -- so it's not worth the effort of doing anything more aggressive. llvm-svn: 97745	2010-03-04 21:04:38 +00:00
Jakob Stoklund Olesen	3408cd6de1	Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH. These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG reads AX in order to avoid reading AH with a REX instruction. Fix PR6489. llvm-svn: 97742	2010-03-04 20:42:07 +00:00
Dan Gohman	265f85f6d8	Fix recognition of 16-bit bswap for C front-ends which emit the clobber registers in a different order. llvm-svn: 97741	2010-03-04 19:58:08 +00:00
Dan Gohman	da13ee1220	Revert r97580; that's not the right way to fix this. llvm-svn: 97639	2010-03-03 04:36:42 +00:00
Bill Wendling	d1f658563d	This test case: long test(long x) { return (x & 123124) | 3; } Currently compiles to: _test: orl $3, %edi movq %rdi, %rax andq $123127, %rax ret This is because instruction and DAG combiners canonicalize (or (and x, C), D) -> (and (or x, D), (C | D)) However, this is only profitable if (C & D) != 0. It gets in the way of the 3-addressification because the input bits are known to be zero. llvm-svn: 97616	2010-03-03 00:35:56 +00:00
Chris Lattner	9c9c1158cb	Fix some issues in WalkChainUsers dealing with CopyToReg/CopyFromReg/INLINEASM. These are annoying because they have the same opcode before and after isel. Fix this by setting their NodeID to -1 to indicate that they are selected, just like what automatically happens when selecting things that end up being machine nodes. With that done, give IsLegalToFold a new flag that causes it to ignore chains. This lets the HandleMergeInputChains routine be the one place that validates chains after a match is successful, enabling the new hotness in chain processing. This smarter chain processing eliminates the need for "PreprocessRMW" in the X86 and MSP430 backends and enables MSP to start matching its multiple mem operand instructions more aggressively. I currently #if out the dead code in the X86 backend and MSP backend, I'll remove it for real in a follow-on patch. The testcase changes are: test/CodeGen/X86/sse3.ll: we generate better code test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was miscompiling this before, we now generate correct code Convert it to filecheck while I'm at it. test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem folding to make anton happy. :) llvm-svn: 97596	2010-03-02 22:20:06 +00:00
Chris Lattner	d25f212f9f	this testcase is failing because pic16 doesn't define a reg/reg xor pattern. I have no plans to fix this XFAIL. llvm-svn: 97587	2010-03-02 20:48:24 +00:00
Chris Lattner	f84b94d738	xfail this for now. llvm-svn: 97584	2010-03-02 19:53:25 +00:00
Dan Gohman	f06941597a	When expanding an expression such as (A + B + C + D), sort the operands by loop depth and emit loop-invariant subexpressions outside of loops. This speeds up MultiSource/Applications/viterbi and others. llvm-svn: 97580	2010-03-02 19:32:21 +00:00
Chris Lattner	845db3b26d	clean up some testcases. llvm-svn: 97576	2010-03-02 18:56:03 +00:00
Chris Lattner	2019e2922f	Fix the xfail I added a couple of patches back. The issue was that we weren't properly handling the case when interior nodes of a matched pattern become dead after updating chain and flag uses. Now we handle this explicitly in UpdateChainsAndFlags. llvm-svn: 97561	2010-03-02 07:50:03 +00:00
Chris Lattner	0b41a42411	Rewrite chain handling validation and input TokenFactor handling stuff now that we don't care about emulating the old broken behavior of the old isel. This eliminates the 'CheckChainCompatible' check (along with IsChainCompatible) which did an incorrect and inefficient scan up the chain nodes which happened as the pattern was being formed and does the validation at the end in HandleMergeInputChains when it forms a structural pattern. This scans "down" the graph, which means that it is quickly bounded by nodes already selected. This also handles token factors that get "trapped" in the dag. Removing the CheckChainCompatible nodes also shrinks the generated tables by about 6K for X86 (down to 83K). There are two pieces remaining before I can nuke PreprocessRMW: 1. I xfailed a test because we're now producing worse code in a case that has nothing to do with the change: it turns out that our use of MorphNodeTo will leave dead nodes in the graph which (depending on how the graph is walked) end up causing bogus uses of chains and blocking matches. This is really bad for other reasons, so I'll fix this in a follow-up patch. 2. CheckFoldableChainNode needs to be improved to handle the TF. llvm-svn: 97539	2010-03-02 02:22:10 +00:00
Dan Gohman	56a20fc5eb	Fix several places to handle vector operands properly. Based on a patch by Micah Villmow for PR6438. llvm-svn: 97538	2010-03-02 02:14:38 +00:00
Chris Lattner	c0839055a9	Fix PR2590 by making PatternSortingPredicate actually be ordered correctly. Previously it would get in trouble when two patterns were too similar and give them nondet ordering. We force this by using the record ID order as a fallback. The testsuite diff is due to alpha patterns being ordered slightly differently, the change is a semantic noop afaict: < lda $0,-100($16) --- > subq $16,100,$0 llvm-svn: 97509	2010-03-01 22:09:11 +00:00
Chris Lattner	ac2f5c24a0	stop using anders-aa llvm-svn: 97491	2010-03-01 20:24:05 +00:00
Devang Patel	6853e2432e	Rewrite test to test VLA using new debug info encoding scheme. llvm-svn: 97465	2010-03-01 18:30:58 +00:00
Devang Patel	c56aee014c	Remove this generic debug info intrinsic test. LLVM does not use this llvm.dbg.stoppoint intrinsic anymore. There are tests to check new implementation, which attaches location information directly with an instruction using metadata. llvm-svn: 97464	2010-03-01 18:30:08 +00:00
Chris Lattner	74db1864da	add some random nounwinds. llvm-svn: 97411	2010-02-28 20:36:49 +00:00
Dan Gohman	0799a41c48	Don't try to replace physical registers when doing CSE. llvm-svn: 97360	2010-02-28 01:33:43 +00:00
Dan Gohman	3d9622fa97	Add nounwinds. llvm-svn: 97349	2010-02-27 23:53:53 +00:00
Evan Cheng	94051bc37e	Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap. llvm-svn: 97310	2010-02-27 07:36:59 +00:00
Jakob Stoklund Olesen	755ba2ee84	Use the right floating point load/store instructions in PPCInstrInfo::foldMemoryOperandImpl(). The PowerPC floating point registers can represent both f32 and f64 via the two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to allow cross-class coalescing. This coalescing only affects whether registers are spilled as f32 or f64. Spill slots must be accessed with load/store instructions corresponding to the class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking at the instruction opcode which is wrong. X86 has similar floating point register classes, but doesn't try to fold memory operands, so there is no problem there. llvm-svn: 97262	2010-02-26 21:09:24 +00:00
Sanjiv Gupta	8a971a1dc5	Reapply things reverted back in 97220, with the fixed test case. llvm-svn: 97228	2010-02-26 17:59:28 +00:00
Richard Osborne	fe30a8a2c1	Fix XCoreTargetLowering::isLegalAddressingMode() to handle VoidTy. Previously LoopStrengthReduce would sometimes be unable to find a legal formula, causing an assertion failure. llvm-svn: 97226	2010-02-26 16:44:51 +00:00
Chris Lattner	e4b5559cf8	change the scope node to include a list of children to be checked instead of to have a chained series of scope nodes. This makes the generated table smaller, improves the efficiency of the interpreter, and makes the factoring optimization much more reasonable to implement. llvm-svn: 97160	2010-02-25 19:00:39 +00:00
Dan Gohman	084112437d	Revert r97064. Duncan pointed out that bitcasts are defined in terms of store and load, which means bitcasting between scalar integer and vector has endian-specific results, which undermines this whole approach. llvm-svn: 97137	2010-02-25 15:20:39 +00:00
Dan Gohman	52ed61204b	Make LoopSimplify change conditional branches in loop exiting blocks which branch on undef to branch on a boolean constant for the edge exiting the loop. This helps ScalarEvolution compute trip counts for loops. Teach ScalarEvolution to recognize single-value PHIs, when safe, and teach ForgetSymbolicName to forget such single-value PHI nodes as appropriate. llvm-svn: 97126	2010-02-25 06:57:05 +00:00
Jakob Stoklund Olesen	2b93d17560	Create a stack frame on ARM when - Function uses all scratch registers AND - Function does not use any callee saved registers AND - Stack size is too big to address with immediate offsets. In this case a register must be scavenged to calculate the address of a stack object, and the scavenger needs a spare register or emergency spill slot. llvm-svn: 97071	2010-02-24 22:43:17 +00:00
Bob Wilson	4ffb88d388	Check for comparisons of +/- zero when optimizing less-than-or-equal and greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions. This is only allowed when UnsafeFPMath is set or when at least one of the operands is known to be nonzero. llvm-svn: 97065	2010-02-24 22:15:53 +00:00
Dan Gohman	424e8f22d0	Make getTypeSizeInBits work correctly for array types; it should return the number of value bits, not the number of bits of allocation for in-memory storage. Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and vectors. Fix several places in CodeGen which compute offsets into in-memory vectors to use TargetData information. This fixes PR1784. llvm-svn: 97064	2010-02-24 22:05:23 +00:00
Daniel Dunbar	24c99e027e	Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in the hopes of fixing PPC bootstrap. llvm-svn: 97040	2010-02-24 17:05:47 +00:00
Dan Gohman	c0c6077fed	When forming SSE min and max nodes for UGE and ULE comparisons, it's necessary to swap the operands to handle NaN and negative zero properly. Also, reintroduce logic for checking for NaN conditions when forming SSE min and max instructions, fixed to take into consideration NaNs and negative zeros. This allows forming min and max instructions in more cases. llvm-svn: 97025	2010-02-24 06:52:40 +00:00
Chris Lattner	52a02205d8	Change the scheduler from adding nodes in allnodes order to adding them in a deterministic order (bottom up from the root) based on the structure of the graph itself. This updates tests for some random changes, interesting bits: CodeGen/Blackfin/promote-logic.ll no longer crashes. I have no idea why, but that's good right? CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but now compiles to have one fewer constant pool entry, making the expected load that was being folded disappear. Since it is an unreduced mass of gnast, I just removed it. This fixes PR6370 llvm-svn: 97023	2010-02-24 06:11:37 +00:00
Jim Grosbach	6f72657d6e	LowerCall() should always do getCopyFromReg() to reference the stack pointer. Machine instruction selection is much happier when operands are in virtual registers. llvm-svn: 97012	2010-02-24 01:43:03 +00:00
Evan Cheng	5787cd9349	Re-apply 96540 and 96556 with fixes. llvm-svn: 97011	2010-02-24 01:42:31 +00:00
Jakob Stoklund Olesen	a946f9eb7d	DIV8r must define %AX since X86DAGToDAGISel::Select() sometimes uses it instead of %AL/%AH. llvm-svn: 97006	2010-02-24 00:39:35 +00:00
Jakob Stoklund Olesen	3406ec2f57	Remember to handle sub-registers when moving imp-defs to a rematted instruction. llvm-svn: 96995	2010-02-23 22:44:02 +00:00
Jakob Stoklund Olesen	cf29251712	Keep track of phi join registers explicitly in LiveVariables. Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply defined registers. That doesn't work if the phi join is implicitly defined in all but one of the predecessors. llvm-svn: 96994	2010-02-23 22:43:58 +00:00
Wesley Peck	94cdac52e5	Adding the MicroBlaze backend. The MicroBlaze is a highly configurable 32-bit soft-microprocessor for use on Xilinx FPGAs. For more information see: http://www.xilinx.com/tools/microblaze.htm http://en.wikipedia.org/wiki/MicroBlaze The current LLVM MicroBlaze backend generates assembly which can be compiled using an appropriate binutils assembler. llvm-svn: 96969	2010-02-23 19:15:24 +00:00
Richard Osborne	7387249531	Lower BR_JT on the XCore to a jump into a series of jump instructions. llvm-svn: 96942	2010-02-23 13:25:07 +00:00
Evan Cheng	8096221984	These should not have been committed. llvm-svn: 96827	2010-02-22 23:37:48 +00:00
Chris Lattner	c4fea4c8a1	no need to run llvm-as here. llvm-svn: 96826	2010-02-22 23:34:12 +00:00
Evan Cheng	d9816ef946	Instcombine constant folding can normalize gep with negative index to index with large offset. When instcombine objsize checking transformation sees these geps where the offset seemingly points out of bounds, it should just return "i don't know" rather than asserting. llvm-svn: 96825	2010-02-22 23:34:00 +00:00
Dan Gohman	97481fb5e0	Canonicalize ConstantInts to the right operand of commutative operators. The test difference is just due to the multiplication operands being commuted (and thus requiring a more elaborate match). In optimized code, that expression would be folded. llvm-svn: 96816	2010-02-22 22:43:23 +00:00
Dan Gohman	8f672b95f2	Actually enable the -enable-unsafe-fp-math tests. llvm-svn: 96796	2010-02-22 18:53:26 +00:00
Arnold Schwaighofer	8427969f9c	Mark the return address stack slot as mutable when moving the return address during a tail call. A parameter might overwrite this stack slot during the tail call. The sequence during a tail call is: 1.) load return address to temp reg 2.) move parameters (might involve storing to return address stack slot) 3.) store return address to new location from temp reg If the stack location is marked immutable CodeGen can colocate load (1) with the store (3). This fixes bug 6225. llvm-svn: 96783	2010-02-22 16:18:09 +00:00
Dan Gohman	c281a5da15	Remove the logic for reasoning about NaNs from the code that forms SSE min and max instructions. The real thing this code needs to be concerned about is negative zero. Update the sse-minmax.ll test accordingly, and add tests for -enable-unsafe-fp-math mode as well. llvm-svn: 96775	2010-02-22 04:03:39 +00:00
Dan Gohman	c44dee5fbd	When emitting an instruction which depends on both a post-incremented induction variable value and a loop-variant value, don't force the insert position to be at the post-increment position, because it may not be dominated by the loop-variant value. This fixes a use-before-def problem noticed on PPC. llvm-svn: 96774	2010-02-22 03:59:54 +00:00
Chris Lattner	37f20c29c8	add some no-unwinds, other minor cleanups. llvm-svn: 96756	2010-02-21 20:33:20 +00:00
Chris Lattner	fa1fdcf146	add a triple so that this doesn't fail due to linux/ppc register printing syntax. llvm-svn: 96748	2010-02-21 19:27:38 +00:00
Chris Lattner	654f38165b	filecheckize and add nounwinds. llvm-svn: 96745	2010-02-21 18:53:28 +00:00
Anton Korobeynikov	fe0d6453ec	It turns out that during jumpless setcc lowering eq and ne were swapped. This fixes PR6348 llvm-svn: 96734	2010-02-21 12:28:58 +00:00
Chris Lattner	b328c57aa4	fix and un-xfail X86/vec_ss_load_fold.ll llvm-svn: 96720	2010-02-21 04:53:34 +00:00
Chris Lattner	b28cf9f8d9	temporarily disable this. llvm-svn: 96717	2010-02-21 03:24:41 +00:00
Dan Gohman	9db0689627	Check for overflow when scaling up an add or an addrec for scaled reuse. llvm-svn: 96692	2010-02-19 19:32:49 +00:00
Charles Davis	a64fc8c41b	Add support for the 'alignstack' attribute to the x86 backend. Fixes PR5254. Also, FileCheck'ize a test. llvm-svn: 96686	2010-02-19 18:17:13 +00:00
Duncan Sands	5d5cce2e19	Revert commits 96556 and 96640, because commit 96556 breaks the dragonegg self-host build. I reverted 96640 in order to revert 96556 (96640 goes on top of 96556), but it also looks like with both of them applied the breakage happens even earlier. The symptom of the 96556 miscompile is the following crash: llvm[3]: Compiling AlphaISelLowering.cpp for Release build cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode, llvm::SDNode, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed. Stack dump: 0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE' g++: Internal error: Aborted (program cc1plus) This occurs when building LLVM using LLVM built by LLVM (via dragonegg). Probably LLVM has miscompiled itself, though it may have miscompiled GCC and/or dragonegg itself: at this point of the self-host build, all of GCC, LLVM and dragonegg were built using LLVM. Unfortunately this kind of thing is extremely hard to debug, and while I did rummage around a bit I didn't find any smoking guns, aka obviously miscompiled code. Found by bisection. r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines Some dag combiner goodness: Transform br (xor (x, y)) -> br (x != y) Transform br (xor (xor (x,y), 1)) -> br (x == y) Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm" r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines Transform (xor (setcc), (setcc)) == / != 1 to (xor (setcc), (setcc)) != / == 1. e.g. On x86_64 %0 = icmp eq i32 %x, 0 %1 = icmp eq i32 %y, 0 %2 = xor i1 %1, %0 br i1 %2, label %bb, label %return => testl %edi, %edi sete %al testl %esi, %esi sete %cl cmpb %al, %cl je LBB1_2 llvm-svn: 96672	2010-02-19 11:30:41 +00:00
Evan Cheng	32031f7404	Transform (xor (setcc), (setcc)) == / != 1 to (xor (setcc), (setcc)) != / == 1. e.g. On x86_64 %0 = icmp eq i32 %x, 0 %1 = icmp eq i32 %y, 0 %2 = xor i1 %1, %0 br i1 %2, label %bb, label %return => testl %edi, %edi sete %al testl %esi, %esi sete %cl cmpb %al, %cl je LBB1_2 llvm-svn: 96640	2010-02-19 00:34:39 +00:00
Dan Gohman	58199e30dc	When determining the set of interesting reuse factors, consider strides in foreign loops. This helps locate reuse opportunities with existing induction variables in foreign loops and reduces the need for inserting new ones. This fixes rdar://7657764. llvm-svn: 96629	2010-02-19 00:05:23 +00:00
Mon P Wang	64cd1a8d7f	getSplatIndex assumes that the first element of the mask contains the splat index which is not always true if the mask contains undefs. Modified it to return the first non undef value. llvm-svn: 96621	2010-02-18 22:33:18 +00:00
Jakob Stoklund Olesen	5b9d14b55e	Always normalize spill weights, also for intervals created by spilling. Moderate the weight given to very small intervals. The spill weight given to new intervals created when spilling was not normalized in the same way as the original spill weights calculated by CalcSpillWeights. That meant that restored registers would tend to hang around because they had a much higher spill weight than unspilled registers. This improves the runtime of a few tests by up to 10%, and there are no significant regressions. llvm-svn: 96613	2010-02-18 21:33:05 +00:00
Dan Gohman	34b5cb7deb	Make CodePlacementOpt detect special EH control flow by checking whether AnalyzeBranch disagrees with the CFG directly, rather than looking for EH_LABEL instructions. EH_LABEL instructions aren't always at the end of the block, due to FP_REG_KILL and other things. This fixes an infinite loop compiling MultiSource/Benchmarks/Bullet. llvm-svn: 96611	2010-02-18 21:25:53 +00:00
Chris Lattner	a2e094064f	remove empty file llvm-svn: 96573	2010-02-18 06:29:06 +00:00
Bob Wilson	84fc0200bd	Use NEON vmin/vmax instructions for floating-point selects. Radar 7461718. llvm-svn: 96572	2010-02-18 06:05:53 +00:00
Evan Cheng	9af06dfc83	Some dag combiner goodness: Transform br (xor (x, y)) -> br (x != y) Transform br (xor (xor (x,y), 1)) -> br (x == y) Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm" llvm-svn: 96556	2010-02-18 02:13:50 +00:00
Dan Gohman	3cb7dc5912	Don't check for comments, which vary between subtargets. llvm-svn: 96434	2010-02-17 01:08:57 +00:00
Dan Gohman	493a1fcbe0	Don't attempt to divide INT_MIN by -1; consider such cases to have overflowed. llvm-svn: 96428	2010-02-17 00:41:53 +00:00
Chris Lattner	c87f9d6d1a	roundss is an sse 4 thing, fix the test on non-sse41 builders like llvm-gcc-x86_64-darwin10-selfhost llvm-svn: 96417	2010-02-17 00:29:06 +00:00
Dale Johannesen	d147b9a4d4	Make g5 target explicit; scheduling affects register choice. llvm-svn: 96413	2010-02-16 23:25:23 +00:00
Chris Lattner	0d35c68d5c	fix rdar://7653908, a crash on a case where we would fold a load into a roundss intrinsic, producing a cyclic dag. The root cause of this is badness handling ComplexPattern nodes in the old dagisel that I noticed through inspection. Eliminate a copy of the code that handled ComplexPatterns by making EmitChildMatchCode call into EmitMatchCode. llvm-svn: 96408	2010-02-16 22:35:06 +00:00
Dale Johannesen	60d48aef7b	Adjust register numbers in tests to compensate for the new lack of R2. llvm-svn: 96407	2010-02-16 22:31:31 +00:00
Chris Lattner	008f62bfa2	filecheckize llvm-svn: 96404	2010-02-16 22:13:43 +00:00
Evan Cheng	ee44d6a752	Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)). If there exists a use of a build_vector that's the bitwise complement of the mask, then transform the node to (and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)). Since this transformation is only useful when 1) the given build_vector will become a load from constpool, and 2) (and (xor x -1), y) matches to a single instruction, I decided this is appropriate as a x86 specific transformation. rdar://7323335 llvm-svn: 96389	2010-02-16 21:09:44 +00:00
David Greene	c10133139e	Add support for emitting non-temporal stores for DAGs marked non-temporal. Fix from r96241 for botched encoding of MOVNTDQ. Add documentation for !nontemporal metadata. Add a simpler movnt testcase. llvm-svn: 96386	2010-02-16 20:50:18 +00:00
Bob Wilson	94eef3fc13	Fix pr6111: Avoid using the LR register for the target address of an indirect branch in ARM v4 code, since it gets clobbered by the return address before it is used. Instead of adding a new register class containing all the GPRs except LR, just use the existing tGPR class. llvm-svn: 96360	2010-02-16 17:24:15 +00:00
Dan Gohman	d19ecedc40	Split the main for-each-use loop again, this time for GenerateTruncates, as it also peeks at which registers are being used by other uses. This makes LSR less sensitive to use-list order. llvm-svn: 96308	2010-02-16 01:42:53 +00:00
Anton Korobeynikov	dccd240998	Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there llvm-svn: 96285	2010-02-15 22:35:59 +00:00
Jakob Stoklund Olesen	143339a43a	Fix PR6300. A virtual register can be used before it is defined in the same MBB if the MBB is part of a loop. Teach the implicit-def pass about this case. llvm-svn: 96279	2010-02-15 22:03:29 +00:00
Bob Wilson	01e8d35855	Last week we were generating code with duplicate induction variables in this test, but the problem seems to have gone away today. Add a check to make sure it doesn't come back. llvm-svn: 96277	2010-02-15 21:56:40 +00:00
Chris Lattner	2ce5f89c01	remove empty file. llvm-svn: 96271	2010-02-15 21:14:50 +00:00
Chris Lattner	d7470aa340	revert r96241. It breaks two regression tests, isn't documented, and the testcase needs improvement. llvm-svn: 96265	2010-02-15 20:53:01 +00:00
Chris Lattner	a8505609fe	fix PR6305 by handling BlockAddress in a helper function called by jump threading. llvm-svn: 96263	2010-02-15 20:47:49 +00:00
David Greene	ba8bac644b	Add support for emitting non-temporal stores for DAGs marked non-temporal. llvm-svn: 96241	2010-02-15 17:02:56 +00:00
Jakob Stoklund Olesen	0a65533a38	Fix PR6283. When coalescing with a physreg, remember to add imp-def and imp-kill when dealing with sub-registers. Also fix a related bug in VirtRegRewriter where substitutePhysReg may reallocate the operand list on an instruction and invalidate the reg_iterator. This can happen when a register is mentioned twice on the same instruction. llvm-svn: 96072	2010-02-13 02:06:10 +00:00
Bob Wilson	5d66f81412	Besides removing phi cycles that reduce to a single value, also remove dead phi cycles. Adjust a few tests to keep dead instructions from being optimized away. This (together with my previous change for phi cycles) fixes Apple radar 7627077. llvm-svn: 96057	2010-02-13 00:31:44 +00:00
Dale Johannesen	ea96b2974f	When save/restoring CR at prolog/epilog, in a large stack frame, the prolog/epilog code was using the same register for the copy of CR and the address of the save slot. Oops. This is fixed here for Darwin, sort of, by reserving R2 for this case. A better way would be to do the store before the decrement of SP, which is safe on Darwin due to the red zone. SVR4 probably has the same problem, but I don't know how to fix it; there is no red zone and R2 is already used for something else. I'm going to leave it to someone interested in that target. Better still would be to rewrite the CR-saving code completely; spilling each CR subregister individually is horrible code. llvm-svn: 96015	2010-02-12 21:35:34 +00:00
Anton Korobeynikov	c66de6687b	Testcases for recent stdcall / fastcall mangling improvements llvm-svn: 95982	2010-02-12 15:29:13 +00:00
Anton Korobeynikov	7073515c86	Clean up stdcall / fastcall name mangling. This should fix a lot of problems we saw so far, e.g. PRs 5851 & 2936 llvm-svn: 95980	2010-02-12 15:28:40 +00:00
Dan Gohman	c40eb525ad	Reapply the new LoopStrengthReduction code, with compile time and bug fixes, and with improved heuristics for analyzing foreign-loop addrecs. This change also flattens IVUsers, eliminating the stride-oriented groupings, which makes it easier to work with. llvm-svn: 95975	2010-02-12 10:34:29 +00:00
Bob Wilson	2fd80c3d94	Add a new pass on machine instructions to optimize away PHI cycles that reduce down to a single value. InstCombine already does this transformation but DAG legalization may introduce new opportunities. This has turned out to be important for ARM where 64-bit values are split up during type legalization: InstCombine is not able to remove the PHI cycles on the 64-bit values but the separate 32-bit values can be optimized. I measured the compile time impact of this (running llc on 176.gcc) and it was not significant. llvm-svn: 95951	2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen	b800ff8ca9	Reapply coalescer fix for better cross-class coalescing. This time with fixed test cases. llvm-svn: 95938	2010-02-11 23:55:29 +00:00
Mon P Wang	c17e781f35	The previous fix of widening divides that trap was too fragile as it depends on custom lowering and requires that certain types exist in ValueTypes.h. Modified widening to check if an op can trap and if so, the widening algorithm will apply only the op on the defined elements. It is safer to do this in widening because the optimizer can't guarantee removing unused ops in some cases. llvm-svn: 95823	2010-02-10 23:37:45 +00:00
Bob Wilson	82d5534acc	Delete dead PHI machine instructions. These can be created due to type legalization even when the IR-level optimizer has removed dead phis, such as when the high half of an i64 value is unused on a 32-bit target. I had to adjust a few test cases that had dead phis. This is a partial fix for Radar 7627077. llvm-svn: 95816	2010-02-10 22:58:57 +00:00
Evan Cheng	8bee7fb61d	Now that ShrinkDemandedOps() is separated out from DAG combine, it sometimes leaves some obvious nops which dag combine used to clean up afterwards, e.g. (trunc (ext n)) -> n. Look for them and squash them. llvm-svn: 95757	2010-02-10 02:17:34 +00:00
Chris Lattner	340fe1f187	move tests that depend on the x86 backend out of codegen/generic, and remove a few old and unreduced ones. Fixes PR5624. llvm-svn: 95656	2010-02-09 06:41:03 +00:00
Chris Lattner	fdb9fda4af	make target independent. llvm-svn: 95655	2010-02-09 06:36:30 +00:00
Chris Lattner	28b79c686d	merge a target-specific add test into x86 directory. llvm-svn: 95654	2010-02-09 06:35:50 +00:00
Chris Lattner	34c420d11b	merge another test in, drop the trivially constant folded cases. llvm-svn: 95653	2010-02-09 06:33:27 +00:00
Chris Lattner	ddb2e5a05c	consolidate and filecheckize two tests. llvm-svn: 95652	2010-02-09 06:24:00 +00:00
Chris Lattner	e669789912	merge two tests, make target independent. llvm-svn: 95651	2010-02-09 06:19:20 +00:00
Chris Lattner	20be5fb012	convert to filecheck. llvm-svn: 95608	2010-02-08 23:47:34 +00:00
Chris Lattner	6162c89bfe	add an x86 implementation of MCTargetExpr for representing @GOT and friends. Use it for personality references as a first use. llvm-svn: 95588	2010-02-08 22:09:08 +00:00
Dan Gohman	56b9ea088b	When CodeGen'ing unoptimized code, there may be unfolded constant expressions in global initializers. Instead of aborting, attempt to fold them on the spot. If folding succeeds, emit the folded expression instead. This fixes PR6255. llvm-svn: 95583	2010-02-08 22:02:38 +00:00
Dan Gohman	f113e5466c	In guaranteed tailcall mode, don't decline the tailcall optimization for blocks ending in "unreachable". llvm-svn: 95565	2010-02-08 20:34:14 +00:00
Evan Cheng	5541068ad3	Run codegen dce pass for all targets at all optimization levels. Previously it was only run for x86 with fastisel. I've found it to be very effective in eliminating some obvious dead code as a result of formal parameter lowering, especially when tail call optimization eliminated the need for some of the loads from fixed frame objects. It also shrinks a number of the tests. A couple of tests no longer make sense and are now eliminated. llvm-svn: 95493	2010-02-06 09:07:11 +00:00
Evan Cheng	c3cfda4e7e	Remove a large test case that (soon will) no longer make sense. llvm-svn: 95492	2010-02-06 09:00:30 +00:00
Rafael Espindola	b0bb1ddfe3	Fix alignment on ppc linux. This fixes the build of crtend.o llvm-svn: 95477	2010-02-06 03:32:21 +00:00
Evan Cheng	de1a4726e6	Do not emit callseq instructions around sibcalls. This eliminated some unnecessary stack adjustments. llvm-svn: 95475	2010-02-06 03:28:46 +00:00
Bob Wilson	1a324958d6	Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex. Radar 7614112. llvm-svn: 95456	2010-02-06 00:24:38 +00:00
Jakob Stoklund Olesen	7b4c60adae	Don't unroll loops containing function calls. llvm-svn: 95454	2010-02-05 23:21:31 +00:00
Bill Wendling	c3f4101cc6	Make test more focused by eliminating extraneous bits. llvm-svn: 95384	2010-02-05 11:21:05 +00:00
Evan Cheng	4b03f55de1	Fix test. llvm-svn: 95373	2010-02-05 06:37:00 +00:00
Evan Cheng	81dde4c7f7	Handle tail call with byval arguments. llvm-svn: 95351	2010-02-05 02:21:12 +00:00
Evan Cheng	94fe5501b7	When the scheduler unfolds a load-folding instruction, it moves some of the predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the case where the predecessor is a flagged scheduling unit. rdar://7604000 llvm-svn: 95339	2010-02-05 01:27:11 +00:00
Bill Wendling	9761f067f8	An empty global constant (one of size 0) may have a section immediately following it. However, the EmitGlobalConstant method wasn't emitting a body for the constant. The assembler doesn't like that. Before, we were generating this: .zerofill __DATA, __common, __cmd, 1, 3 This fix puts us back to that semantic. llvm-svn: 95336	2010-02-05 00:17:02 +00:00
Jakob Stoklund Olesen	d72e82107d	Fix small bug in handling instructions with more than one implicitly defined operand. ProcessImplicitDefs would only mark one operand per instruction with <undef>. This fixed PR6086. llvm-svn: 95319	2010-02-04 18:46:28 +00:00
Evan Cheng	f5ee7fb571	Re-enable x86 tail call optimization. llvm-svn: 95295	2010-02-04 06:47:24 +00:00
Chris Lattner	e43007d443	add support for the sparcv9-- target triple to turn on 64-bit sparc codegen. Patch by Nathan Keynes! llvm-svn: 95293	2010-02-04 06:34:01 +00:00
Evan Cheng	5c8b1b9164	Speculatively disable x86 automatic tail call optimization while we track down a self-hosting issue. llvm-svn: 95259	2010-02-03 21:40:40 +00:00
Evan Cheng	ccbbdfa8c4	Make test less fragile llvm-svn: 95258	2010-02-03 21:39:04 +00:00
Evan Cheng	e273e42195	Revert 94937 and move the noreturn check to codegen. llvm-svn: 95198	2010-02-03 03:55:59 +00:00
Evan Cheng	d9cf09b0d6	Allow all types of callees to be tail called. But avoid automatic tailcall if the callee is a result of bitcast to avoid losing necessary zext / sext etc. llvm-svn: 95195	2010-02-03 03:28:02 +00:00
Dale Johannesen	1e9d147461	Reapply 95050 with a tweak to check the register class. llvm-svn: 95183	2010-02-03 01:40:33 +00:00
Chris Lattner	2b798aafd0	make these less sensitive to asm verbose changes by disabling it for them. llvm-svn: 95175	2010-02-03 00:48:53 +00:00
Dale Johannesen	08ab638bdc	Test revert 95050; there's a good chance it's causing buildbot failure. llvm-svn: 95103	2010-02-02 18:52:56 +00:00
Evan Cheng	fac0fdc6a0	Perform sibcall in some cases when arguments are passed in memory. Look for cases where callee's arguments are already in the caller's own caller's stack and they line up perfectly. e.g. extern int foo(int a, int b, int c); int bar(int a, int b, int c) { return foo(a, b, c); } llvm-svn: 95053	2010-02-02 02:22:50 +00:00
Dale Johannesen	a20fc3d1a9	Make local RA smarter about reusing input register of a copy as output. Needed for (functional) correctness in inline asm, and should be generally beneficial. 7361612. llvm-svn: 95050	2010-02-02 02:08:02 +00:00
Evan Cheng	efa391da81	Fix PR6196. GV callee may not be a function. llvm-svn: 95017	2010-02-01 22:40:09 +00:00

3085 Commits