llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	e27902ac68	Simplify the tracking of used physregs to a bulk bitor followed by a transitive closure after allocating all blocks. Add a few more test cases for -regalloc=fast. llvm-svn: 103500	2010-05-11 20:30:28 +00:00
Jakob Stoklund Olesen	442e38c4de	Mostly rewrite RegAllocFast. Sorry for the big change. The path leading up to this patch had some TableGen changes that I didn't want to commit before I knew they were useful. They weren't, and this version does not need them. The fast register allocator now does no liveness calculations. Instead it relies on kill flags provided by isel. (Currently those kill flags are also ignored due to isel bugs). The allocation algorithm is supposed to work with any subset of valid kill flags. More kill flags simply means fewer spills inserted. Registers are allocated from a working set that contains no aliases. That means most allocations can be done directly without expensive alias checks. When the working set runs out of registers we do the full alias check to find new free registers. llvm-svn: 103488	2010-05-11 18:54:45 +00:00
Kalle Raiskila	e302300b51	Make SPU backend not assert on jump tables. llvm-svn: 103466	2010-05-11 11:00:02 +00:00
Evan Cheng	11130a0a22	Select @llvm.trap to the special B with 1111 condition (i.e. trap) instruction. llvm-svn: 103459	2010-05-11 07:26:32 +00:00
Evan Cheng	df350445c6	Be careful with operand promotion. For a binary operation, the source operands may be the same. PR7018. rdar://7939869. llvm-svn: 103419	2010-05-10 19:03:57 +00:00
Kalle Raiskila	61289abcda	Fix encoding of 'sf' and 'sfh' instructions. llvm-svn: 103399	2010-05-10 08:13:49 +00:00
Bill Wendling	787de8fe38	Readd testcase. llvm-svn: 103335	2010-05-08 04:47:54 +00:00
Dan Gohman	4ca74b9c6c	When pruning candidate formulae out of an LSRUse, update the LSRUse's Regs set after all pruning is done, rather than trying to do it on the fly, which can produce an incomplete result. This fixes a case where heuristic pruning was stripping all formulae from a use, which led the solver to enter an infinite loop. Also, add a few asserts to diagnose this kind of situation. llvm-svn: 103328	2010-05-07 23:36:59 +00:00
Bill Wendling	4241941b52	Remove. Don't XFAIL. llvm-svn: 103321	2010-05-07 23:09:17 +00:00
Bill Wendling	3846ec7889	Temporarily revert r101984. llvm-svn: 103314	2010-05-07 22:45:36 +00:00
Dan Gohman	95040c18f4	SDDbgValues are apparently not being legalized. Fix a symptom of the problem, and not the real problem itself, by dropping debug info for i128 values. rdar://7958162. llvm-svn: 103310	2010-05-07 22:19:08 +00:00
Dale Johannesen	1ee37ac5d4	Fix PR 7087, and probably other things, by extending getConstantFP to accept the two supported long double target types. This was not the original intent, but there are other places that assume this works and it's easy enough to do. llvm-svn: 103299	2010-05-07 21:35:53 +00:00
Jim Grosbach	2db1618b44	Clean up the conditional for handling of sign_extend_inreg based on whether the extract instructions are available. rdar://7956878 llvm-svn: 103277	2010-05-07 18:34:55 +00:00
Duncan Sands	ed2ef5a987	Correct some bogus target triples. llvm-svn: 103265	2010-05-07 17:03:48 +00:00
Nick Lewycky	3e5720a898	Revert r103133 and add testcase from PR7066. llvm-svn: 103233	2010-05-07 01:45:38 +00:00
Dan Gohman	f863ad3dc5	Disable the new unknown-location code for now. It causes a major increase in the debug line info section, and it's causing regressions in a gdb testsuite. llvm-svn: 103226	2010-05-07 01:08:53 +00:00
Dan Gohman	497e752655	Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it doesn't have to guess. llvm-svn: 103194	2010-05-06 20:33:48 +00:00
Dan Gohman	3190e5c292	Add a testcase for r103135, explicitly representing unknown locations in debug line info. llvm-svn: 103189	2010-05-06 17:49:17 +00:00
Chris Lattner	014a954e3d	Fix PR7054 - Assertion `Symbol->isUndefined() && "Cannot define a symbol twice!"' failed. Users can write broken code that emits the same label twice with asm renaming, detect this and emit a fatal backend error instead of aborting. llvm-svn: 103140	2010-05-06 00:05:37 +00:00
Jim Grosbach	e04cc6cb43	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136	2010-05-05 23:44:43 +00:00
Jakob Stoklund Olesen	2e5d12acfa	Fix PR6520. An earlyclobber physreg must not be allocated to anything else. llvm-svn: 103133	2010-05-05 23:07:41 +00:00
Jim Grosbach	7eb0b4d646	fix copy/paste oops. llvm-svn: 103122	2010-05-05 21:07:46 +00:00
Jim Grosbach	25fc725b2a	Add tests for ARMV7M divide instruction use llvm-svn: 103120	2010-05-05 20:47:15 +00:00
Jim Grosbach	9b7ae2027f	remove unneeded underscores. llvm-svn: 103114	2010-05-05 19:55:58 +00:00
Jim Grosbach	7ea67d346f	Convert to filecheck llvm-svn: 103113	2010-05-05 19:41:11 +00:00
Chris Lattner	b4696853af	"on the rare occasion the SPU BE produces illegal assembly - it tries to emit an add instruction of the form 'a reg, reg, imm'." Patch by Kalle Raiskila! llvm-svn: 103021	2010-05-04 17:58:46 +00:00
Dale Johannesen	b10ca6bf4c	Implement builtin_return_address(x) and builtin_frame_address(x) on PPC for x!=0. 7624113. llvm-svn: 102972	2010-05-03 22:59:34 +00:00
Jakob Stoklund Olesen	51ab2653d5	Check that subregisters don't have independent values in RemoveCopyByCommutingDef(). This fixes PR6941. llvm-svn: 102970	2010-05-03 22:40:32 +00:00
Dan Gohman	c024fd43c9	Fix tests to use fadd, fsub, and fmul, instead of add, sub, and mul, when the type is floating-point. llvm-svn: 102969	2010-05-03 22:36:46 +00:00
Dan Gohman	15cb983f55	Fix a bug which prevented tail merging of return instructions in beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and test/CodeGen/ARM/ifcvt2.ll for details. The fix is to change HashEndOfMBB to hash at most one instruction, instead of trying to apply heuristics about when it will be profitable to consider more than one instruction. The regular tail-merging heuristics are already prepared to handle the same cases, and they're more precise. Also, make test/CodeGen/ARM/ifcvt5.ll and test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they continue to test what they're intended to test. And, this eliminates the problem in test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from PR5204. Update it accordingly. llvm-svn: 102907	2010-05-03 14:35:47 +00:00
Duncan Sands	153ad3b903	Remove the -enable-sjlj-eh option, which doesn't do anything. Remove the -enable-eh option which is only used by the JIT, and replace it with -jit-enable-eh. llvm-svn: 102865	2010-05-02 15:36:26 +00:00
Anton Korobeynikov	a3726088fa	Insert ANY_EXTEND node instead of invalid truncate during DAG Combining (X & 1), when needed. This fixes PR7001 llvm-svn: 102838	2010-05-01 12:52:34 +00:00
Anton Korobeynikov	f31181a0cc	Do folding for indirect branches, where possible llvm-svn: 102836	2010-05-01 12:28:21 +00:00
Anton Korobeynikov	9b724bd446	Implement indirect branches on MSP430 llvm-svn: 102835	2010-05-01 12:04:32 +00:00
Bill Wendling	7b6527b94b	Test failing too much on too many platforms. llvm-svn: 102812	2010-05-01 00:12:33 +00:00
Bill Wendling	a348ff69b8	Maybe it needs sse2? llvm-svn: 102802	2010-04-30 23:19:29 +00:00
Bill Wendling	d1c5f11f72	Force 64-bit. llvm-svn: 102800	2010-04-30 22:45:20 +00:00
Bill Wendling	95a4929ac7	EXTRACT_VECTOR_ELT of an INSERT_VECTOR_ELT may have the same index, but the indexes could be of a different value type. Or not even using the same SDNode for the constant (weird, I know). Compare the actual values instead of the pointers. llvm-svn: 102791	2010-04-30 22:19:17 +00:00
Jakob Stoklund Olesen	fac584ed9e	The local register allocator has to spill dirty callee saved registers before a call that might throw. The landing pad assumes that all registers are in stack slots. We used to spill those dirty CSRs after the call, and the stack slots would be wrong when arriving at the landing pad. llvm-svn: 102770	2010-04-30 21:19:29 +00:00
Evan Cheng	fc86d7fbdc	Fix test. llvm-svn: 102694	2010-04-30 06:00:56 +00:00
Evan Cheng	9303b11e47	Another sibcall bug. If caller and callee calling conventions differ, then it's only safe to do a tail call if the results are returned in the same way. llvm-svn: 102683	2010-04-30 01:12:32 +00:00
Jakob Stoklund Olesen	01254ea96d	Reject really weird coalescer case when trying to merge identical subregisters of different register classes. e.g. %reg1048:3<def> = EXTRACT_SUBREG %RAX<kill>, 3 Where %reg1048 is a GR32 register. This is not impossible to handle, but it is pretty hard and very rare. This should unbreak the dragonegg builder. llvm-svn: 102672	2010-04-29 23:47:46 +00:00
Evan Cheng	da89f22f3d	Load folding tail call should not use ebp / rbp after it's popped. PEI should use esp / rsp to reference frame instead. llvm-svn: 102596	2010-04-29 05:08:22 +00:00
Chris Lattner	9867c1a075	Rework global alignment computation again. Now we do round up alignment of globals to the preferred alignment, but only when there is no section specified on the global (by far the common case). llvm-svn: 102515	2010-04-28 19:58:07 +00:00
Evan Cheng	d4fe387eb8	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Evan Cheng	08e5f737d2	Update tests. llvm-svn: 102487	2010-04-28 01:53:13 +00:00
Devang Patel	570e9d53a7	Emit debug info for byval parameters. llvm-svn: 102486	2010-04-28 01:39:28 +00:00
Evan Cheng	2aaefc6167	Do not count kill, implicit_def instructions as printed instructions. llvm-svn: 102453	2010-04-27 19:38:45 +00:00
Chris Lattner	a9c1328501	round zero-byte .zerofill directives up to 1 byte. This should fix some "g++.dg-struct-layout-1" failures, rdar://7886017 llvm-svn: 102421	2010-04-27 07:41:44 +00:00
Chris Lattner	9292bad5f5	on darwin empty functions need to codegen into something of non-zero length, otherwise labels get incorrectly merged. We handled this by emitting a ".byte 0", but this isn't correct on thumb/arm targets where the text segment needs to be a multiple of 2/4 bytes. Handle this by emitting a noop. This is more gross than it should be because arm/ppc are not fully mc'ized yet. This fixes rdar://7908505 llvm-svn: 102400	2010-04-26 23:37:21 +00:00
Bob Wilson	ece63716aa	Handle register-to-register copies within the tGPR class. Radar 7896289 llvm-svn: 102396	2010-04-26 23:20:08 +00:00
Dan Gohman	40561dd0ba	When checking whether the special handling for an addrec increment which doesn't dominate the header is needed, don't check whether the increment expression has computable loop evolution. While the operands of an addrec are required to be loop-invariant, they're not required to dominate any part of the loop. This fixes PR6914. llvm-svn: 102389	2010-04-26 21:46:36 +00:00
Chris Lattner	4854eab087	fix PR6921 a different way. Instead of increasing the alignment of globals with a specified alignment, we fix common variables to obey their alignment. Add a comment explaining why this behavior is important. llvm-svn: 102365	2010-04-26 18:46:46 +00:00
Chris Lattner	a8cd2ac893	Revert r102300/102301, which seriously broke objc apps. llvm-svn: 102359	2010-04-26 18:30:45 +00:00
Chris Lattner	e4a25eb35a	Fix PR6921: globals were not getting correctly rounded up to their preferred alignment unless they were common or some other special case. llvm-svn: 102300	2010-04-25 05:30:43 +00:00
Dan Gohman	42337e0ee9	Generalize LSR's OptimizeMax to handle the new kinds of max expressions that indvars may use, now that indvars is recognizing le and ge loops. llvm-svn: 102235	2010-04-24 03:13:44 +00:00
Stuart Hastings	85b5c330f2	Per Chris, fuse four trivial tests using grep (r102199) into one that uses FileCheck. llvm-svn: 102216	2010-04-23 22:12:57 +00:00
Dan Gohman	6a48222bd8	Change TargetData's algorithm for computing default vector type alignment to match what's used in clang and GCC for __alignof, rather than trying to guess what Legalize is going to be doing. llvm-svn: 102206	2010-04-23 19:41:15 +00:00
Stuart Hastings	ad81819149	Add some missing x86 patterns for movdq2q. Fixes two (LLVM-)GCC DejaGNU testcases. Radar 6881029. llvm-svn: 102199	2010-04-23 19:03:32 +00:00
Dan Gohman	38949c2f1f	Fix LSR to tolerate cases where ScalarEvolution initially misses an opportunity to fold add operands, but folds them after LSR has separated them out. This fixes rdar://7886751. llvm-svn: 102157	2010-04-23 01:55:05 +00:00
Jim Grosbach	b9dccb6103	Update ARM DAGtoDAG for matching UBFX instruction for unsigned bitfield extraction. This fixes PR5998. llvm-svn: 102144	2010-04-22 23:24:18 +00:00
Evan Cheng	a324da99ae	Do not try to optimize a copy that has already been marked for deletion. llvm-svn: 102027	2010-04-21 20:57:54 +00:00
Evan Cheng	dbfb7dc438	Implement -disable-non-leaf-fp-elim, which disables the frame pointer elimination optimization for non-leaf functions. This will be hooked up to gcc's -momit-leaf-frame-pointer option. rdar://7886181 llvm-svn: 101984	2010-04-21 03:18:23 +00:00
Evan Cheng	a0c4b2952f	- Clean up some crappy code which deals with coalescing of copies which look at extract_subreg / insert_subreg, etc. - Add support for more aggressive insert_subreg coalescing. llvm-svn: 101971	2010-04-21 00:44:22 +00:00
Dan Gohman	570b621976	Add another variant of this test which found a place where CodeGen's ComputeMaskedBits was being over-conservative when computing bits for an ADD. llvm-svn: 101963	2010-04-21 00:19:28 +00:00
Chris Lattner	6db0f451a7	teach the x86 address matching stuff to handle (shl (or x,c), 3) the same as (shl (add x, c), 3) when x doesn't have any bits from c set. This finishes off PR1135. Before we compiled the block to: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) leaq 2(%rdx), %r9 movl %esi, (%rdi,%r9,4) leaq 1(%rdx), %r9 movl %esi, (%rdi,%r9,4) addq $3, %rdx movl %esi, (%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 Now we produce: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) movl %esi, 8(%rdi,%rdx,4) movl %esi, 4(%rdi,%rdx,4) movl %esi, 12(%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 llvm-svn: 101958	2010-04-20 23:18:40 +00:00
Bill Wendling	a87efb5d0f	Move CodeGen/X86/2010-04-19-DAGCombineCrash.ll into CodeGen/X86/crash.ll. Also reduce. llvm-svn: 101925	2010-04-20 18:14:47 +00:00
Chris Lattner	b66b0c36cd	Bill's change in r95336 broke empty aggregates embedded in other types. fix this by only bumping zero-byte globals up to a single byte if the entire global is zero size, fixing PR6340. This also fixes empty arrays etc to be handled correctly, and only does this on subsection-via-symbols targets (aka darwin) which is the only place where this matters. llvm-svn: 101879	2010-04-20 06:20:21 +00:00
Chris Lattner	04fb51984f	teach cellspu how to return i8 and i16 from calls, patch by Kalle Raiskila! llvm-svn: 101875	2010-04-20 05:36:09 +00:00
Bill Wendling	887dac2aa6	The visitXOR method can return the same SDNode. If so, we don't want to delete it as it's not dead. llvm-svn: 101855	2010-04-20 01:25:01 +00:00
Bob Wilson	2e6cd50a50	Fix tests for Neon load/store intrinsics to match the i8* types expected by the intrinsics. The reason for those i8* types is that the intrinsics are overloaded on the vector type and we don't have a way to declare an intrinsic where one argument is an overloaded vector type and another argument is a pointer to the vector element type. The bitcasts added here will match what the frontend will typically generate when these intrinsics are used. llvm-svn: 101840	2010-04-20 00:17:16 +00:00
Nick Lewycky	c639c07492	Fix declarations in a few more tests. llvm-svn: 101676	2010-04-17 21:29:25 +00:00
Chris Lattner	99d17acb35	fix PR6332, allowing an index of zero into a zero sized array even if the element of the array has no size. llvm-svn: 101662	2010-04-17 19:02:33 +00:00
Dan Gohman	5736cd1e47	Start function numbering at 0. llvm-svn: 101638	2010-04-17 16:29:15 +00:00
Evan Cheng	d3d5e6793a	Add nounwind. llvm-svn: 101613	2010-04-17 03:43:36 +00:00
Jakob Stoklund Olesen	7e77f60652	Add test case for machine-sink on critical edges llvm-svn: 101416	2010-04-15 23:19:16 +00:00
Evan Cheng	c843326d60	Use default lowering of DYNAMIC_STACKALLOC. As far as I can tell, ARM isel is doing the right thing and codegen looks correct for both Thumb and Thumb2. llvm-svn: 101410	2010-04-15 22:20:34 +00:00
Jakob Stoklund Olesen	a40915cc26	Fix PR6847. RegScavenger should ignore DebugValues. llvm-svn: 101392	2010-04-15 20:28:39 +00:00
Evan Cheng	2f6d7ecd1b	ARM SelectDYN_ALLOC should emit a copy from SP rather than referencing SP directly. In cases where there are two dyn_alloc in the same BB it would have caused the old SP value to be reused and badness ensues. rdar://7493908 llvm is generating poor code for dynamic alloca, I'll fix that later. llvm-svn: 101383	2010-04-15 18:42:28 +00:00
Chris Lattner	1b7ecfdf60	enhance the load/store narrowing optimization to handle a tokenfactor in between the load/store. This allows us to optimize test7 into: _test7: ## @test7 ## BB#0: ## %entry movl (%rdx), %eax ## kill: SIL<def> ESI<kill> movb %sil, 5(%rdi) ret instead of: _test7: ## @test7 ## BB#0: ## %entry movl 4(%esp), %ecx movl $-65281, %eax ## imm = 0xFFFFFFFFFFFF00FF andl 4(%ecx), %eax movzbl 8(%esp), %edx shll $8, %edx addl %eax, %edx movl 12(%esp), %eax movl (%eax), %eax movl %edx, 4(%ecx) ret llvm-svn: 101355	2010-04-15 06:10:49 +00:00
Chris Lattner	8c5a5c9094	teach codegen to turn trunc(zextload) into load when possible. This doesn't occur much at all, it only seems to be formed in the case when the trunc optimization kicks in due to phase ordering. In that case it saves a few bytes on x86-32. llvm-svn: 101350	2010-04-15 05:40:59 +00:00
Chris Lattner	510d19e597	add a simple dag combine to replace trivial shl+lshr with and. This happens with the store->load narrowing stuff. llvm-svn: 101348	2010-04-15 05:28:43 +00:00
Chris Lattner	3282f3d34f	Implement rdar://7860110 (also in target/readme.txt) narrowing a load/or/and/store sequence into a narrower store when it is safe. Daniel tells me that clang will start producing this sort of thing with bitfields, and this does trigger a few dozen times on 176.gcc produced by llvm-gcc even now. This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll into: movl %eax, 36(%rdi) instead of: movl $4294967295, %eax ## imm = 0xFFFFFFFF andq 32(%rdi), %rax shlq $32, %rcx addq %rax, %rcx movq %rcx, 32(%rdi) and each of the testcases into a single store. Each of them used to compile into craziness like this: _test4: movl $65535, %eax ## imm = 0xFFFF andl (%rdi), %eax shll $16, %esi addl %eax, %esi movl %esi, (%rdi) ret llvm-svn: 101343	2010-04-15 04:48:01 +00:00
Chris Lattner	553267e9cc	further tweak this to do something useful. llvm-svn: 101341	2010-04-15 04:31:42 +00:00
Chris Lattner	a4b3756baf	remove undef control flow. llvm-svn: 101340	2010-04-15 04:30:19 +00:00
Jakob Stoklund Olesen	7343a90490	Remove unneeded types from test. llvm-svn: 101286	2010-04-14 20:56:09 +00:00
Bob Wilson	7b19d89e3a	Don't custom lower bit converts to ARM VMOVDRRD or VMOVDRR when the operand does not have a legal type. The legalizer does not know how to handle those nodes. Radar 7854640. llvm-svn: 101282	2010-04-14 20:45:23 +00:00
Evan Cheng	172f2f9e2d	Add test for post-ra machine licm. llvm-svn: 101182	2010-04-13 22:10:03 +00:00
Bob Wilson	526e615ff9	Handle a v2f64 formal parameter that is split between registers and memory such that the entire second half is in memory. Radar 7855014. llvm-svn: 101181	2010-04-13 22:03:22 +00:00
Evan Cheng	6ffb1ed4fb	Fix test on non-x86 hosts. llvm-svn: 101163	2010-04-13 18:54:04 +00:00
Evan Cheng	b8861dcb04	Re-apply 101075 and fix it properly. Just reuse the debug info of the branch instruction being optimized. There is no need to --I which can deref off start of the BB. llvm-svn: 101162	2010-04-13 18:50:27 +00:00
Eric Christopher	330ca0c937	Temporarily revert r101075, it's causing invalid iterator assertions in a nightly tester. llvm-svn: 101158	2010-04-13 18:37:58 +00:00
Chris Lattner	dabcd9738c	add llvm codegen support for -ffunction-sections and -fdata-sections, patch by Sylvere Teissier! llvm-svn: 101106	2010-04-13 00:36:43 +00:00
Evan Cheng	ec21a36774	Use .set expression for x86 pic jump table reference to reduce assembly relocation. rdar://7738756 llvm-svn: 101085	2010-04-12 23:07:17 +00:00
Bill Wendling	5a56f7fc20	Third time's a charm... llvm-svn: 101081	2010-04-12 22:43:21 +00:00
Bill Wendling	e1bf74de52	Genericize the label test. llvm-svn: 101079	2010-04-12 22:40:37 +00:00
Bill Wendling	fd58812c95	Correct test to test what I mean it to test. llvm-svn: 101077	2010-04-12 22:25:42 +00:00
Bill Wendling	1f2e71928c	Micro-optimization: If we have this situation: jCC L1 jmp L2 L1: ... L2: ... We can get a small performance boost by emitting this instead: jnCC L2 L1: ... L2: ... This testcase shows an example of this: float func(float x, float y) { double product = (double)x * y; if (product == 0.0) return product; return product - 1.0; } llvm-svn: 101075	2010-04-12 22:19:57 +00:00
Evan Cheng	90788354c9	Enable post regalloc machine licm by default. llvm-svn: 101023	2010-04-12 06:25:28 +00:00
Benjamin Kramer	f040734da3	Make sure this test tests something. llvm-svn: 100879	2010-04-09 19:03:31 +00:00
Bob Wilson	ee7665078a	Add a testcase for svn r100568. llvm-svn: 100876	2010-04-09 18:29:29 +00:00
Chris Lattner	5408e8a62b	"On SPU, variables in the .bss section that are allocated with the .lcomm directive are not aligned on 16 byte boundaries. This causes misaligned loads, as the generated assembly assumes this "default" alignment. this patch disables .lcomm in favour of '.local .comm'" Patch by Kalle Raiskila! llvm-svn: 100875	2010-04-09 18:27:03 +00:00
Dan Gohman	a451b859f9	Merge a few fast-isel tests. llvm-svn: 100860	2010-04-09 15:03:55 +00:00
Evan Cheng	619f1b8a94	Coalescer should not delete copy instructions whose defs are partially dead. e.g. %RDI<def,dead> = MOV64rr %RAX<kill>, %EDI<imp-def> llvm-svn: 100804	2010-04-08 20:02:37 +00:00
Evan Cheng	3fa0b6fb03	Avoid using f64 to lower memcpy from constant string. It's cheaper to use i32 store of immediates. llvm-svn: 100751	2010-04-08 07:37:57 +00:00
Dan Gohman	0a77dd29de	When expanding expressions which are using post-inc mode for multiple loops, ensure that the expansion is dominated by the increments of those loops. llvm-svn: 100748	2010-04-08 05:57:57 +00:00
Chris Lattner	23334439e9	add newlines at the end of files. llvm-svn: 100705	2010-04-07 22:53:17 +00:00
Dan Gohman	b5210c934f	Generalize IVUsers to track arbitrary expressions rather than expressions explicitly split into stride-and-offset pairs. Also, add the ability to track multiple post-increment loops on the same expression. This refines the concept of "normalizing" SCEV expressions used for post-increment uses, and introduces a dedicated utility routine for normalizing and denormalizing expressions. This fixes the expansion of expressions which are post-increment users of more than one loop at a time. More broadly, this takes LSR another step closer to being able to reason about more than one loop at a time. llvm-svn: 100699	2010-04-07 22:27:08 +00:00
Dale Johannesen	4cdb545401	Split big test into multiple directories to cater to those who don't build all targets. llvm-svn: 100688	2010-04-07 20:43:35 +00:00
Chris Lattner	367cbc160c	this has a pr! llvm-svn: 100637	2010-04-07 18:04:56 +00:00
Chris Lattner	60e5f55565	fix a latent bug my inline asm stuff exposed: MachineOperand::isIdenticalTo wasn't handling metadata operands. llvm-svn: 100636	2010-04-07 18:03:19 +00:00
Sanjiv Gupta	b3557c497a	Remove XFAIL for vg_leak as the leaks are fixed by 100601. llvm-svn: 100612	2010-04-07 07:06:48 +00:00
Jakob Stoklund Olesen	370a3e553f	Don't try to collapse DomainValues onto an incompatible SSE domain. This fixes the Bullet regression on i386/nocona. llvm-svn: 100553	2010-04-06 19:48:56 +00:00
Evan Cheng	7d177beb1d	Add nounwind. llvm-svn: 100482	2010-04-05 22:30:05 +00:00
Dan Gohman	2aff0055c1	Don't do code sinking on unreachable blocks. It's unprofitable and hazardous. llvm-svn: 100455	2010-04-05 19:17:22 +00:00
Chris Lattner	779e051357	resolve a fixme. llvm-svn: 100346	2010-04-04 19:28:59 +00:00
Evan Cheng	5d825988d0	Correctly lower memset / memcpy of undef. It should be a nop. PR6767. llvm-svn: 100208	2010-04-02 19:36:14 +00:00
Dan Gohman	7051600e3f	Revert the recent alignment changes. They're broken for -Os because, in particular, they end up aligning strings at 16-byte boundaries, and there's no way for GlobalOpt to check OptForSize. llvm-svn: 100172	2010-04-02 03:04:37 +00:00
Evan Cheng	921fc2c77b	After trivial coalescing, the MI being visited may have become a copy. Avoid adding it to CSE hash table since copies aren't being considered for CSE and they may be deleted. rdar://7819990 llvm-svn: 100170	2010-04-02 02:21:24 +00:00
Dan Gohman	ec731a99fa	Remove this initializer so that the optimizer doesn't convert unaligned loads into aligned loads. llvm-svn: 100166	2010-04-02 01:26:13 +00:00
Dan Gohman	d14fe2a5cd	Update this test for the new preferred alignment heuristics. llvm-svn: 100165	2010-04-02 01:24:08 +00:00
Evan Cheng	9508456bb6	In 64-bit mode, use i64 to lower memcpy / memset instead of f64. llvm-svn: 100137	2010-04-01 20:27:45 +00:00
Evan Cheng	8728924812	- Avoid using floating point stores to implement memset unless the value is zero. - Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type. llvm-svn: 100118	2010-04-01 18:19:11 +00:00
Evan Cheng	202b6327c4	Add -mcpu to memcpy / memset tests to ensure they behave the same on all hosts / targets. llvm-svn: 100101	2010-04-01 08:25:26 +00:00
Evan Cheng	562bb43207	Fix sdisel memcpy, memset, memmove lowering: 1. Makes it possible to lower with floating point loads and stores. 2. Avoid unaligned loads / stores unless it's fast. 3. Fix some memcpy lowering logic bug related to when to optimize a load from constant string into a constant. 4. Adjust x86 memcpy lowering threshold to make it more sane. 5. Fix x86 target hook so it uses vector and floating point memory ops more effectively. rdar://7774704 llvm-svn: 100090	2010-04-01 06:04:33 +00:00
Jakob Stoklund Olesen	58296f9543	Replace V_SET0 with variants for each SSE execution domain. llvm-svn: 99975	2010-03-31 00:40:13 +00:00
Jakob Stoklund Olesen	13a7a0adff	Fix typo. Thank you, valgrind. llvm-svn: 99974	2010-03-31 00:40:08 +00:00
Jakob Stoklund Olesen	c6213a8bc0	Not all platforms start symbols with _ llvm-svn: 99959	2010-03-30 23:12:48 +00:00
Jakob Stoklund Olesen	a027a6d4f2	Enable -sse-domain-fix by default. Now with tests! llvm-svn: 99954	2010-03-30 22:47:00 +00:00
Eric Christopher	4c3a3208e3	Remove the pmulld intrinsic and autoupdate it as a vector multiply. Rewrite the pmulld patterns, and make sure that they fold in loads of arguments into the instruction. llvm-svn: 99910	2010-03-30 18:49:01 +00:00
Benjamin Kramer	1be90cca94	XFAIL some PIC16 tests when running under valgrind-leaks. I don't expect these to be fixed any time soon. llvm-svn: 99888	2010-03-30 14:34:13 +00:00
Evan Cheng	b1ddb193c7	Fix PR4975. Avoid referencing empty vector. llvm-svn: 99840	2010-03-29 21:27:30 +00:00
Chris Lattner	c0d5bcc160	From Kalle Raiskila: "the bigstack patch for SPU, with testcase. It is essentially the patch committed as 97091, and reverted as 97099, but with the following additions: -in vararg handling, registers are marked to be live, to not confuse the register scavenger -function prologue and epilogue are not emitted, if the stack size is 16. 16 means it is empty - there is only the register scavenger emergency spill slot, which is not used as there is no stack." llvm-svn: 99819	2010-03-29 17:38:47 +00:00
Chris Lattner	e04fb0a1a7	teach tblgen to allow patterns like (add (i32 (bitconvert (i32 GPR))), 4), transforming it into (add (i32 GPR), 4). This allows us to write type generic multi patterns and have tblgen automatically drop the bitconvert in the case when the types align. This allows us to fold an extra load in the changed testcase. llvm-svn: 99756	2010-03-28 08:38:32 +00:00
Chris Lattner	1a838000ec	add some nounwinds llvm-svn: 99752	2010-03-28 07:58:37 +00:00
Chris Lattner	de8b42ce67	this takes an insane amount of time to run, disable it for now (PR6727) llvm-svn: 99751	2010-03-28 07:58:09 +00:00
Evan Cheng	d1ee7e0ba3	Do not sibcall if stack needs to be dynamically aligned. llvm-svn: 99620	2010-03-26 16:26:03 +00:00
Evan Cheng	377bb993d8	Allow trivial sibcall of vararg callee when no arguments are being passed. llvm-svn: 99598	2010-03-26 02:13:13 +00:00
Evan Cheng	0b7dd682dd	Try trivial remat before the coalescer gives up on a vr / physreg coalescing for fear of tying up a physical register. llvm-svn: 99575	2010-03-26 00:07:25 +00:00
Jim Grosbach	2a0b14a387	switch the flag for using NEON for SP floating point to a subtarget 'feature'. Re-commit. This time complete with testsuite updates. llvm-svn: 99570	2010-03-25 23:47:34 +00:00
Evan Cheng	eb5c7f65dc	Add nounwind. llvm-svn: 99546	2010-03-25 20:01:07 +00:00
Chris Lattner	391902aa30	Make the NDEBUG assertion stronger and more clear what is happening. Enhance scheduling to set the DEAD flag on implicit defs more aggressively. Before, we'd set an implicit def operand to dead if it were present in the SDNode corresponding to the machineinstr but had no use. Now we do it in this case AND if the implicit def does not exist in the SDNode at all. This exposes a couple of problems: one is the FIXME, which causes a live intervals crash on CodeGen/X86/sibcall.ll. The second is that it makes machinecse and licm more aggressive (which is a good thing) but also exposes a case where licm hoists a set0 and then it doesn't get resunk. Talking to codegen folks about both these issues, but I need this patch in in the meantime. llvm-svn: 99485	2010-03-25 05:40:48 +00:00
Nate Begeman	820b07ed10	BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts. llvm-svn: 99423	2010-03-24 20:49:50 +00:00
Bob Wilson	32d0ded51e	Revert Edwin's change that is breaking MultiSource/Applications/ClamAV/clamscan. --- Reverse-merging r99400 into '.': D test/CodeGen/Generic/2010-03-24-liveintervalleak.ll U lib/CodeGen/LiveIntervalAnalysis.cpp llvm-svn: 99419	2010-03-24 20:25:25 +00:00
Torok Edwin	c8751938dd	Fix memory leak in liveintervals: the destructor for VNInfos must be called, otherwise the SmallVector it contains doesn't free its memory. In most cases LiveIntervalAnalysis could get away by not calling the destructor, because VNInfos are bumpptr-allocated, and smallvectors usually don't grow. However when the SmallVector does grow it always leaks. This is the valgrind shown leak from the original testcase: ==8206== 18,304 bytes in 151 blocks are definitely lost in loss record 164 of 164 ==8206== at 0x4A079C7: operator new(unsigned long) (vg_replace_malloc.c:220) ==8206== by 0x4DB7A7E: llvm::SmallVectorBase::grow_pod(unsigned long, unsigned long) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4F90382: llvm::VNInfo::addKill(llvm::SlotIndex) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x5126B5C: llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock, llvm::ilist_iterator<llvm::MachineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int, llvm::LiveInterval&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x512725E: llvm::LiveIntervals::handleRegisterDef(llvm::MachineBasicBlock, llvm::ilist_iterator<llvm::MachineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x51278A8: llvm::LiveIntervals::computeIntervals() (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x5127CB4: llvm::LiveIntervals::runOnMachineFunction(llvm::MachineFunction&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4DAE935: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4DAEB10: llvm::FunctionPassManagerImpl::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4DAED3D: llvm::FunctionPassManager::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4D8BE8E: llvm::JIT::runJITOnFunctionUnlocked(llvm::Function, llvm::MutexGuard const&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) ==8206== by 0x4D8CA72: llvm::JIT::getPointerToFunction(llvm::Function) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0) llvm-svn: 99400	2010-03-24 13:50:36 +00:00
Chris Lattner	772ef50bf1	Fix PR6673: updating the callback should not clear the map. llvm-svn: 99227	2010-03-22 23:15:57 +00:00
Bob Wilson	e5614d6b2d	pr6652: Use LDM to restore PC to the return address on ARMv4. Patch by John Tytgat! llvm-svn: 99096	2010-03-20 22:20:40 +00:00
Evan Cheng	77f5ed5de5	Stupid svn. Add back the lost sibcall tests. llvm-svn: 99033	2010-03-20 03:17:05 +00:00
Kevin Enderby	766909ae3b	Fixed the encoding problems of the crc32 instructions. All had the Operand size override prefix and only the r/m16 forms should have had that. Also for variant one, the AT&T syntax, added suffixes to all forms. Also added the missing 64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and tweaked one test case to add the needed suffixes. llvm-svn: 98980	2010-03-19 20:04:42 +00:00
Mon P Wang	005a544af8	Fixed a widening bug where we were not using the correct size for the load llvm-svn: 98920	2010-03-19 01:19:52 +00:00
Evan Cheng	5ace6fffac	Turning off post-ra scheduling for x86. It isn't a consistent win. llvm-svn: 98810	2010-03-18 06:55:42 +00:00
Evan Cheng	41730fc7c8	X86 address mode matching code MatchAddressRecursively does some aggressive hacks which require doing a RAUW. It may end up deleting some SDNode upstream. It should avoid referencing deleted nodes. llvm-svn: 98780	2010-03-17 23:58:35 +00:00
Johnny Chen	0212e0df47	Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm instructions to help disassembly. We also changed the output of the addressing modes to omit the '+' from the assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60. And modified test cases to not expect '+' in +reg or #+num. For example, ; CHECK: ldr.w r9, [r7, #28] llvm-svn: 98745	2010-03-17 17:52:21 +00:00
Evan Cheng	fb978ca2c5	Fix liveintervals handling of dbg_value instructions. llvm-svn: 98686	2010-03-16 21:51:27 +00:00
Dan Gohman	7ac4578c0e	Add an rdar number to this test. llvm-svn: 98654	2010-03-16 19:08:20 +00:00
Bob Wilson	34aca030c5	--- Reverse-merging r98637 into '.': U test/CodeGen/ARM/tls2.ll U test/CodeGen/ARM/arm-negative-stride.ll U test/CodeGen/ARM/2009-10-30.ll U test/CodeGen/ARM/globals.ll U test/CodeGen/ARM/str_pre-2.ll U test/CodeGen/ARM/ldrd.ll U test/CodeGen/ARM/2009-10-27-double-align.ll U test/CodeGen/Thumb2/thumb2-strb.ll U test/CodeGen/Thumb2/ldr-str-imm12.ll U test/CodeGen/Thumb2/thumb2-strh.ll U test/CodeGen/Thumb2/thumb2-ldr.ll U test/CodeGen/Thumb2/thumb2-str_pre.ll U test/CodeGen/Thumb2/thumb2-str.ll U test/CodeGen/Thumb2/thumb2-ldrh.ll U utils/TableGen/TableGen.cpp U utils/TableGen/DisassemblerEmitter.cpp D utils/TableGen/RISCDisassemblerEmitter.h D utils/TableGen/RISCDisassemblerEmitter.cpp U Makefile.rules U lib/Target/ARM/ARMInstrNEON.td U lib/Target/ARM/Makefile U lib/Target/ARM/AsmPrinter/ARMInstPrinter.cpp U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp U lib/Target/ARM/AsmPrinter/ARMInstPrinter.h D lib/Target/ARM/Disassembler U lib/Target/ARM/ARMInstrFormats.td U lib/Target/ARM/ARMAddressingModes.h U lib/Target/ARM/Thumb2ITBlockPass.cpp llvm-svn: 98640	2010-03-16 16:59:47 +00:00
Johnny Chen	ff030064fb	Initial ARM/Thumb disassembler check-in. It consists of a tablegen backend (RISCDisassemblerEmitter) which emits the decoder functions for ARM and Thumb, and the disassembler core which invokes the decoder function and builds up the MCInst based on the decoded Opcode. Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm instructions to help disassembly. We also changed the output of the addressing modes to omit the '+' from the assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60. And modified test cases to not expect '+' in +reg or #+num. For example, ; CHECK: ldr.w r9, [r7, #28] llvm-svn: 98637	2010-03-16 16:36:54 +00:00
Bob Wilson	67c88e4977	Stop using the old pre-UAL syntax for LDM/STM instruction suffixes. This does not move entirely to UAL syntax, since the default "increment after" suffix is empty but we still use "IA" for that. llvm-svn: 98635	2010-03-16 16:19:07 +00:00
Bob Wilson	545aba3681	Add a testcase for the change in r98586. llvm-svn: 98610	2010-03-16 05:33:29 +00:00
Bill Wendling	8d2ee208ab	Forgot testcase for r98599. llvm-svn: 98602	2010-03-16 01:54:20 +00:00
Chris Lattner	adff4d133f	Fix the third (and last known) case of code update problems due to LLVM IR changes with addr label weirdness. In the testcase, we generate references to the two bb's when codegen'ing the first function: _test1: ## @test1 leaq Ltmp0(%rip), %rax .. leaq Ltmp1(%rip), %rax Then continue to codegen the second function where the blocks get merged. We're now smart enough to emit both labels, producing this code: _test_fun: ## @test_fun ## BB#0: ## %entry Ltmp1: ## Block address taken Ltmp0: ## BB#1: ## %ret movl $-1, %eax ret Rejoice. llvm-svn: 98595	2010-03-16 00:29:39 +00:00
Daniel Dunbar	241d3cb048	MC: Allow modifiers in MCSymbolRefExpr, and eliminate X86MCTargetExpr. - Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue. - This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol. llvm-svn: 98592	2010-03-15 23:51:06 +00:00
Dan Gohman	db6002b964	Recognize code for doing vector gather/scatter index calculations with 32-bit indices. Instead of shuffling each element out of the index vector, when all indices are needed, just store the input vector to the stack and load the elements out. llvm-svn: 98588	2010-03-15 23:23:03 +00:00
Chris Lattner	45a0ae21b8	Implement support for the case when a reference to an addr-of-bb label is generated, but then the block is deleted. Since the value is undefined, we just emit the label right after the entry label of the function. It might matter that the label is in the same section as the function was after all. llvm-svn: 98579	2010-03-15 20:39:00 +00:00
Chris Lattner	802ebf9561	Fix the case when a reference to an address taken BB is emitted in one function, then the BB is RAUW'd before the definition is emitted. There are still two cases not being handled, but this should improve us back to the situation before I touched anything. llvm-svn: 98566	2010-03-15 19:09:43 +00:00
Chris Lattner	8be174c089	filecheckize a test and mark these with a cpu so it passes on hosts without cmovs. llvm-svn: 98521	2010-03-14 22:31:16 +00:00
Duncan Sands	217cec1786	Turn calls to copysignl into an FCOPYSIGN node. Handle FCOPYSIGN nodes with ppc_f128 type by having the type legalizer turn these back into a call to copysignl. llvm-svn: 98514	2010-03-14 21:08:40 +00:00
Chris Lattner	70eca8f78e	fix ShrinkDemandedOps to not leave dead nodes around, fixing PR6607 llvm-svn: 98512	2010-03-14 19:46:02 +00:00
Chris Lattner	95e0a61b2d	don't have i386-specific tests in CodeGen/Generic, PR6601. llvm-svn: 98508	2010-03-14 18:51:18 +00:00
Chris Lattner	c50b8b27f5	fix PR6605, X86ISD::CMP always returns i32 (EFLAGS), not the operand type. llvm-svn: 98507	2010-03-14 18:44:35 +00:00
Anton Korobeynikov	9333c20518	Fix typo llvm-svn: 98506	2010-03-14 18:42:52 +00:00
Anton Korobeynikov	3997a71ef8	Feature test for half precision FP. llvm-svn: 98504	2010-03-14 18:42:43 +00:00
Chris Lattner	2bdb0765f8	fix AsmPrinter::GetBlockAddressSymbol to always return a unique label instead of trying to form one based on the BB name (which causes collisions if the name is empty). This fixes PR6608 llvm-svn: 98495	2010-03-14 17:53:23 +00:00
Chris Lattner	9331acc6d7	get MMI out of the label uniquing business, just go to MCContext to get unique assembler temporary labels. llvm-svn: 98489	2010-03-14 08:36:50 +00:00
Evan Cheng	7d8c39bb1c	Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. llvm-svn: 98465	2010-03-14 03:48:46 +00:00
Chris Lattner	683801add5	simplify code to use OutContext.GetOrCreateTemporarySymbol with no arguments instead of having to come up with a unique name. This also makes the code less fragile. llvm-svn: 98364	2010-03-12 18:47:50 +00:00
Chris Lattner	80ab250a1c	fix PR6577, a bug in sdbuilder lowering select instructions whose true value was not Val#0. llvm-svn: 98336	2010-03-12 07:15:36 +00:00
Bill Wendling	368a68ac82	revert r98270. llvm-svn: 98281	2010-03-11 19:50:31 +00:00
Evan Cheng	20dbd70316	Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now. llvm-svn: 98270	2010-03-11 18:49:14 +00:00
Richard Osborne	0dcde97cbc	Add dag combine to simplify lmul(x, 0, a, b) llvm-svn: 98258	2010-03-11 16:26:35 +00:00
Evan Cheng	4ef6d8fa15	The check for coalescing a virtual register to a physical register, e.g. cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check for overlaps of vr's live interval with the super registers of the physical register (ECX in this case) and let JoinIntervals() handle checking the coalescing feasibility against the physical register (cl in this case). llvm-svn: 98251	2010-03-11 08:20:21 +00:00
Eric Christopher	bbdbe41a97	Have fast-isel understand llvm.objectsize. Update testcase for slightly different codegen. llvm-svn: 98244	2010-03-11 06:20:22 +00:00
Chris Lattner	d6d11e53ab	add support, testcases, and dox for the new GHC calling convention. Patch by David Terei! llvm-svn: 98212	2010-03-11 00:22:57 +00:00
Chris Lattner	7cd70b8066	fix PR6533 by updating the br(xor) code to remember the case when it looked past a trunc. llvm-svn: 98203	2010-03-10 23:46:44 +00:00
Richard Osborne	41c5f84f1d	Handle MVT::i64 type in DAG combine for ISD::ADD. Fold 64 bit expression add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if all operands are zero extended. llvm-svn: 98168	2010-03-10 18:12:27 +00:00
Richard Osborne	d400202a43	Fold add(add(mul(x,y),a),b) -> lmul(x,y,a,b) if the intermediate results are unused elsewhere. llvm-svn: 98157	2010-03-10 16:19:31 +00:00
Richard Osborne	c19c2bd177	Prefer LMUL to MACCU as LMUL has no tied operands. llvm-svn: 98153	2010-03-10 13:27:10 +00:00
Richard Osborne	43210638f1	Custom lower (S|U)MUL_LOHI -> MACC(S|U) llvm-svn: 98152	2010-03-10 13:20:07 +00:00
Richard Osborne	c88a8e8d66	Lower add (mul a, b), c into MACCU / MACCS nodes which translate directly to the maccu / maccs instructions. We handle this in ExpandADDSUB since after type legalisation it is messy to recognise these operations. llvm-svn: 98150	2010-03-10 11:41:08 +00:00
Richard Osborne	6c55bfe516	Convert test to FileCheck. llvm-svn: 98148	2010-03-10 11:24:03 +00:00
Evan Cheng	1f7af27386	Fix typo. llvm-svn: 98142	2010-03-10 07:07:55 +00:00
Evan Cheng	96e1e20fd5	Unbreak test on Linux. llvm-svn: 98141	2010-03-10 07:07:45 +00:00
Evan Cheng	668ceddeec	Enable machine cse pass. llvm-svn: 98132	2010-03-10 03:07:41 +00:00
Dale Johannesen	4b8f3692f4	The address of an indirect call must be in R12 on Darwin. Make it so. (This patch is in LowerCall_Darwin, which seems to be used by SVR4 code as well; since that doesn't belong here, I haven't worried about this case.) llvm-svn: 98077	2010-03-09 20:15:42 +00:00
Richard Osborne	4077517135	In cases where the carry / borrow is unused, convert ladd / lsub to an add or a sub. llvm-svn: 98059	2010-03-09 16:34:25 +00:00
Richard Osborne	173efed224	Add DAG combine for ladd / lsub. llvm-svn: 98057	2010-03-09 16:07:47 +00:00
Chris Lattner	99ca33d324	move .set generation out of DwarfPrinter into AsmPrinter and MCize it. llvm-svn: 98010	2010-03-08 23:58:37 +00:00
Chris Lattner	10d571f349	simplify EmitSectionOffset to always use .set if it is available, the only thing this affects is that we produce .set in one case we didn't before, which shouldn't harm anything. Make EmitSectionOffset call EmitDifference instead of duplicating it. llvm-svn: 98005	2010-03-08 23:23:25 +00:00
Bob Wilson	116599fe52	Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit immediate instructions cannot set the condition codes, so they do not have the extra cc_out operand. We hit an assertion during tail duplication because the instruction being duplicated had more operands than expected. llvm-svn: 98001	2010-03-08 22:56:15 +00:00
Evan Cheng	cfe037000a	Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll. llvm-svn: 97980	2010-03-08 21:05:02 +00:00
Wesley Peck	bc2c6d7b1b	Re-committing the failed r97807 commit with changes to eliminate warnings. llvm-svn: 97891	2010-03-06 23:23:12 +00:00
Anton Korobeynikov	e0e616a74d	Initial bits of ARMv4-only support. Patch by John Tytgat! llvm-svn: 97886	2010-03-06 19:39:36 +00:00
Anton Korobeynikov	6c841e6a44	Do not use '&' prefix for globals when register base field is non-zero, otherwise msp430-as will silently miscompile the code (TI's assembler report an error though). This fixes PR6349 llvm-svn: 97877	2010-03-06 11:41:12 +00:00
Chris Lattner	b2ed5cb501	revert r97807, it introduced build warnings. llvm-svn: 97869	2010-03-06 04:32:46 +00:00
Charles Davis	faa2f44081	Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This is a workaround for <rdar://problem/7672401/> (which I filed). This lets us build Wine on Darwin, and it gets the Qt build there a little bit further (so Doug says). llvm-svn: 97845	2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen	4e033d2070	Better handling of dead super registers in LiveVariables. We used to do this: CALL ... %RAX<imp-def> ... [not using %RAX] %EAX = ..., %RAX<imp-use, kill> RET %EAX<imp-use,kill> Now we do this: CALL ... %RAX<imp-def, dead> ... [not using %RAX] %EAX = ... RET %EAX<imp-use,kill> By not artificially keeping %RAX alive, we lower register pressure a bit. The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously 55, anybody can see that. Sheesh. llvm-svn: 97838	2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen	1fce28720a	We don't really care about correct register liveness information after the post-ra scheduler has run. Disable the verifier checks that late in the game. llvm-svn: 97837	2010-03-05 21:49:13 +00:00
Jakob Stoklund Olesen	67476519d7	Avoid creating bad PHI instructions when BR is being const-folded. llvm-svn: 97836	2010-03-05 21:49:10 +00:00
Chris Lattner	e1452aba94	fix bss section printing for cell, patch by Kalle Raiskila! llvm-svn: 97814	2010-03-05 18:55:36 +00:00
Wesley Peck	79c2f7afcc	Reworking the stack layout that the MicroBlaze backend generates. The MicroBlaze backend was generating stack layouts that did not conform correctly to the ABI. This update generates stack layouts which are closer to what GCC does. Variable arguments support was added as well but the stack layout for varargs has not been finalized. llvm-svn: 97807	2010-03-05 15:26:02 +00:00
Evan Cheng	eb43cbfc75	Fix an oops in x86 sibcall optimization. If the ByVal callee argument is itself passed as a pointer, then it's obviously not safe to do a tail call. llvm-svn: 97797	2010-03-05 08:38:04 +00:00
Chris Lattner	80aaccb987	Fix PR6497, a bug where we'd fold a load into an addc node which has a flag. That flag in turn was used by an already-selected adde which turned into an ADC32ri8 which used a selected load which was chained to the load we folded. This flag use caused us to form a cycle. Fix this by not ignoring chains in IsLegalToFold even in cases where the isel thinks it can. llvm-svn: 97791	2010-03-05 06:19:13 +00:00
Chris Lattner	5804690a45	cleanup llvm-svn: 97790	2010-03-05 06:17:43 +00:00
Evan Cheng	04b9deff58	Revert 96389 and 96990. They are causing some miscompilation that I do not fully understand. llvm-svn: 97782	2010-03-05 03:08:23 +00:00
Bill Wendling	a92a5b8a6c	Revert r97766. It's deleting a tag. llvm-svn: 97768	2010-03-05 00:33:59 +00:00
Bill Wendling	55b4dfcd9d	Micro-optimization: This code: float floatingPointComparison(float x, float y) { double product = (double)x * y; if (product == 0.0) return product; return product - 1.0; } produces this: _floatingPointComparison: 0000000000000000 cvtss2sd %xmm1,%xmm1 0000000000000004 cvtss2sd %xmm0,%xmm0 0000000000000008 mulsd %xmm1,%xmm0 000000000000000c pxor %xmm1,%xmm1 0000000000000010 ucomisd %xmm1,%xmm0 0000000000000014 jne 0x00000004 0000000000000016 jp 0x00000002 0000000000000018 jmp 0x00000008 000000000000001a addsd 0x00000006(%rip),%xmm0 0000000000000022 cvtsd2ss %xmm0,%xmm0 0000000000000026 ret The "jne/jp/jmp" sequence can be reduced to this instead: _floatingPointComparison: 0000000000000000 cvtss2sd %xmm1,%xmm1 0000000000000004 cvtss2sd %xmm0,%xmm0 0000000000000008 mulsd %xmm1,%xmm0 000000000000000c pxor %xmm1,%xmm1 0000000000000010 ucomisd %xmm1,%xmm0 0000000000000014 jp 0x00000002 0000000000000016 je 0x00000008 0000000000000018 addsd 0x00000006(%rip),%xmm0 0000000000000020 cvtsd2ss %xmm0,%xmm0 0000000000000024 ret for a savings of 2 bytes. This xform can happen when we recognize that jne and jp jump to the same "true" MBB, the unconditional jump would jump to the "false" MBB, and the "true" branch is the fall-through MBB. llvm-svn: 97766	2010-03-05 00:24:26 +00:00
Johnny Chen	b6d35fd803	Drop the ".w" qualifier for t2UXTB16* instructions as there is no 16-bit version of either sxtb16 or uxtb16, and the unified syntax does not specify ".w". llvm-svn: 97760	2010-03-04 22:24:41 +00:00
Bob Wilson	188e15d7a5	pr6478: The frame pointer spill frame index is only defined when there is a frame pointer. llvm-svn: 97755	2010-03-04 21:42:36 +00:00
Bob Wilson	73b96c00d2	pr6480: Don't try producing ld/st-multiple instructions when the address is an undef value. This is only going to come up for bugpoint-reduced tests -- correct programs will not access memory at undefined addresses -- so it's not worth the effort of doing anything more aggressive. llvm-svn: 97745	2010-03-04 21:04:38 +00:00
Jakob Stoklund Olesen	3408cd6de1	Fix the remaining MUL8 and DIV8 to define AX instead of AL,AH. These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG reads AX in order to avoid reading AH with a REX instruction. Fix PR6489. llvm-svn: 97742	2010-03-04 20:42:07 +00:00
Dan Gohman	265f85f6d8	Fix recognition of 16-bit bswap for C front-ends which emit the clobber registers in a different order. llvm-svn: 97741	2010-03-04 19:58:08 +00:00
Dan Gohman	da13ee1220	Revert r97580; that's not the right way to fix this. llvm-svn: 97639	2010-03-03 04:36:42 +00:00
Bill Wendling	d1f658563d	This test case: long test(long x) { return (x & 123124) | 3; } Currently compiles to: _test: orl $3, %edi movq %rdi, %rax andq $123127, %rax ret This is because instruction and DAG combiners canonicalize (or (and x, C), D) -> (and (or x, D), (C | D)) However, this is only profitable if (C & D) != 0. It gets in the way of the 3-addressification because the input bits are known to be zero. llvm-svn: 97616	2010-03-03 00:35:56 +00:00
Chris Lattner	9c9c1158cb	Fix some issues in WalkChainUsers dealing with CopyToReg/CopyFromReg/INLINEASM. These are annoying because they have the same opcode before and after isel. Fix this by setting their NodeID to -1 to indicate that they are selected, just like what automatically happens when selecting things that end up being machine nodes. With that done, give IsLegalToFold a new flag that causes it to ignore chains. This lets the HandleMergeInputChains routine be the one place that validates chains after a match is successful, enabling the new hotness in chain processing. This smarter chain processing eliminates the need for "PreprocessRMW" in the X86 and MSP430 backends and enables MSP to start matching its multiple mem operand instructions more aggressively. I currently #if out the dead code in the X86 backend and MSP backend, I'll remove it for real in a follow-on patch. The testcase changes are: test/CodeGen/X86/sse3.ll: we generate better code test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was miscompiling this before, we now generate correct code Convert it to filecheck while I'm at it. test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem folding to make anton happy. :) llvm-svn: 97596	2010-03-02 22:20:06 +00:00
Chris Lattner	d25f212f9f	this testcase is failing because pic16 doesn't define a reg/reg xor pattern. I have no plans to fix this XFAIL. llvm-svn: 97587	2010-03-02 20:48:24 +00:00
Chris Lattner	f84b94d738	xfail this for now. llvm-svn: 97584	2010-03-02 19:53:25 +00:00
Dan Gohman	f06941597a	When expanding an expression such as (A + B + C + D), sort the operands by loop depth and emit loop-invariant subexpressions outside of loops. This speeds up MultiSource/Applications/viterbi and others. llvm-svn: 97580	2010-03-02 19:32:21 +00:00
Chris Lattner	845db3b26d	clean up some testcases. llvm-svn: 97576	2010-03-02 18:56:03 +00:00
Chris Lattner	2019e2922f	Fix the xfail I added a couple of patches back. The issue was that we weren't properly handling the case when interior nodes of a matched pattern become dead after updating chain and flag uses. Now we handle this explicitly in UpdateChainsAndFlags. llvm-svn: 97561	2010-03-02 07:50:03 +00:00
Chris Lattner	0b41a42411	Rewrite chain handling validation and input TokenFactor handling stuff now that we don't care about emulating the old broken behavior of the old isel. This eliminates the 'CheckChainCompatible' check (along with IsChainCompatible) which did an incorrect and inefficient scan up the chain nodes which happened as the pattern was being formed and does the validation at the end in HandleMergeInputChains when it forms a structural pattern. This scans "down" the graph, which means that it is quickly bounded by nodes already selected. This also handles token factors that get "trapped" in the dag. Removing the CheckChainCompatible nodes also shrinks the generated tables by about 6K for X86 (down to 83K). There are two pieces remaining before I can nuke PreprocessRMW: 1. I xfailed a test because we're now producing worse code in a case that has nothing to do with the change: it turns out that our use of MorphNodeTo will leave dead nodes in the graph which (depending on how the graph is walked) end up causing bogus uses of chains and blocking matches. This is really bad for other reasons, so I'll fix this in a follow-up patch. 2. CheckFoldableChainNode needs to be improved to handle the TF. llvm-svn: 97539	2010-03-02 02:22:10 +00:00
Dan Gohman	56a20fc5eb	Fix several places to handle vector operands properly. Based on a patch by Micah Villmow for PR6438. llvm-svn: 97538	2010-03-02 02:14:38 +00:00
Chris Lattner	c0839055a9	Fix PR2590 by making PatternSortingPredicate actually be ordered correctly. Previously it would get in trouble when two patterns were too similar and give them nondet ordering. We force this by using the record ID order as a fallback. The testsuite diff is due to alpha patterns being ordered slightly differently, the change is a semantic noop afaict: < lda $0,-100($16) --- > subq $16,100,$0 llvm-svn: 97509	2010-03-01 22:09:11 +00:00
Chris Lattner	ac2f5c24a0	stop using anders-aa llvm-svn: 97491	2010-03-01 20:24:05 +00:00
Devang Patel	6853e2432e	Rewrite test to test VLA using new debug info encoding scheme. llvm-svn: 97465	2010-03-01 18:30:58 +00:00
Devang Patel	c56aee014c	Remove this generic debug info intrinsic test. LLVM does not use this llvm.dbg.stoppoint intrinsic anymore. There are tests to check new implementation, which attaches location information directly with an instruction using metadata. llvm-svn: 97464	2010-03-01 18:30:08 +00:00
Chris Lattner	74db1864da	add some random nounwinds. llvm-svn: 97411	2010-02-28 20:36:49 +00:00
Dan Gohman	0799a41c48	Don't try to replace physical registers when doing CSE. llvm-svn: 97360	2010-02-28 01:33:43 +00:00
Dan Gohman	3d9622fa97	Add nounwinds. llvm-svn: 97349	2010-02-27 23:53:53 +00:00
Evan Cheng	94051bc37e	Re-apply 97040 with fix. This survives a ppc self-host llvm-gcc bootstrap. llvm-svn: 97310	2010-02-27 07:36:59 +00:00
Jakob Stoklund Olesen	755ba2ee84	Use the right floating point load/store instructions in PPCInstrInfo::foldMemoryOperandImpl(). The PowerPC floating point registers can represent both f32 and f64 via the two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to allow cross-class coalescing. This coalescing only affects whether registers are spilled as f32 or f64. Spill slots must be accessed with load/store instructions corresponding to the class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking at the instruction opcode which is wrong. X86 has similar floating point register classes, but doesn't try to fold memory operands, so there is no problem there. llvm-svn: 97262	2010-02-26 21:09:24 +00:00
Sanjiv Gupta	8a971a1dc5	Reapply things reverted back in 97220, with the fixed test case. llvm-svn: 97228	2010-02-26 17:59:28 +00:00
Richard Osborne	fe30a8a2c1	Fix XCoreTargetLowering::isLegalAddressingMode() to handle VoidTy. Previously LoopStrengthReduce would sometimes be unable to find a legal formula, causing an assertion failure. llvm-svn: 97226	2010-02-26 16:44:51 +00:00
Chris Lattner	e4b5559cf8	change the scope node to include a list of children to be checked instead of to have a chained series of scope nodes. This makes the generated table smaller, improves the efficiency of the interpreter, and make the factoring optimization much more reasonable to implement. llvm-svn: 97160	2010-02-25 19:00:39 +00:00
Dan Gohman	084112437d	Revert r97064. Duncan pointed out that bitcasts are defined in terms of store and load, which means bitcasting between scalar integer and vector has endian-specific results, which undermines this whole approach. llvm-svn: 97137	2010-02-25 15:20:39 +00:00
Dan Gohman	52ed61204b	Make LoopSimplify change conditional branches in loop exiting blocks which branch on undef to branch on a boolean constant for the edge exiting the loop. This helps ScalarEvolution compute trip counts for loops. Teach ScalarEvolution to recognize single-value PHIs, when safe, and ForgetSymbolicName to forget such single-value PHI nodes as appropriate in ForgetSymbolicName. llvm-svn: 97126	2010-02-25 06:57:05 +00:00
Jakob Stoklund Olesen	2b93d17560	Create a stack frame on ARM when - Function uses all scratch registers AND - Function does not use any callee saved registers AND - Stack size is too big to address with immediate offsets. In this case a register must be scavenged to calculate the address of a stack object, and the scavenger needs a spare register or emergency spill slot. llvm-svn: 97071	2010-02-24 22:43:17 +00:00
Bob Wilson	4ffb88d388	Check for comparisons of +/- zero when optimizing less-than-or-equal and greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions. This is only allowed when UnsafeFPMath is set or when at least one of the operands is known to be nonzero. llvm-svn: 97065	2010-02-24 22:15:53 +00:00
Dan Gohman	424e8f22d0	Make getTypeSizeInBits work correctly for array types; it should return the number of value bits, not the number of bits of allocation for in-memory storage. Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and vectors. Fix several places in CodeGen which compute offsets into in-memory vectors to use TargetData information. This fixes PR1784. llvm-svn: 97064	2010-02-24 22:05:23 +00:00
Daniel Dunbar	24c99e027e	Speculatively revert r97011, "Re-apply 96540 and 96556 with fixes.", again in the hopes of fixing PPC bootstrap. llvm-svn: 97040	2010-02-24 17:05:47 +00:00
Dan Gohman	c0c6077fed	When forming SSE min and max nodes for UGE and ULE comparisons, it's necessary to swap the operands to handle NaN and negative zero properly. Also, reintroduce logic for checking for NaN conditions when forming SSE min and max instructions, fixed to take into consideration NaNs and negative zeros. This allows forming min and max instructions in more cases. llvm-svn: 97025	2010-02-24 06:52:40 +00:00

3381 Commits