llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:42:57 +01:00

Author	SHA1	Message	Date
Rafael Espindola	c97d642bf7	Relax address updates in the eh_frame section. llvm-svn: 122591	2010-12-28 05:39:27 +00:00
Rafael Espindola	0552cb0638	Start adding basic support for emitting the call frame instructions. llvm-svn: 122590	2010-12-28 04:15:37 +00:00
Rafael Espindola	12c30aed07	Add support for .cfi_lsda. llvm-svn: 122584	2010-12-27 15:56:22 +00:00
Daniel Dunbar	2d0cf8e149	MC/Mach-O/Thumb: Select appropriate relocation types for Thumb. llvm-svn: 122583	2010-12-27 14:49:49 +00:00
Rafael Espindola	7f947794d7	Handle reloc_riprel_4byte_movq_load. Should make the bots happy. llvm-svn: 122579	2010-12-27 02:03:24 +00:00
Rafael Espindola	e7e67fce10	Add support for the same encodings of the personality function that gnu as supports. llvm-svn: 122577	2010-12-27 00:36:05 +00:00
Chris Lattner	d4daf9f002	implement enough of the memset inference algorithm to recognize and insert memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573	2010-12-26 23:42:51 +00:00
Chris Lattner	9007b56712	start using irbuilder to make mem intrinsics in a few passes. llvm-svn: 122572	2010-12-26 22:57:41 +00:00
Rafael Espindola	2ebe553431	Add support for @note. Patch by Jörg Sonnenberger. llvm-svn: 122568	2010-12-26 21:30:59 +00:00
Rafael Espindola	99f1527316	Add basic support for .cfi_personality. llvm-svn: 122566	2010-12-26 20:20:31 +00:00
Chris Lattner	2129ce0891	Generalize a previous change, fixing PR8855 - a valid large immediate rejected by the mc assembler. llvm-svn: 122557	2010-12-25 21:36:35 +00:00
Benjamin Kramer	49e40d4c4b	MemCpyOpt: Turn memcpys from a constant into a memset if possible. This allows us to compile "int cst[] = {-1, -1, -1};" into movl $-1, 16(%rsp) movq $-1, 8(%rsp) instead of movl _cst+8(%rip), %eax movl %eax, 16(%rsp) movq _cst(%rip), %rax movq %rax, 8(%rsp) llvm-svn: 122548	2010-12-24 21:17:12 +00:00
Daniel Dunbar	592854a10a	MC/Mach-O/ARM: Start handling some Thumb branches. llvm-svn: 122547	2010-12-24 16:41:46 +00:00
Kevin Enderby	ff7e68c5e7	In llvm-mc parse a Hash token as a full line comment. Allows handling of preprocessed .s files and matches darwin gas. rdar://8798690 Also fix a comment on the next line of AsmParser.cpp after this new code. llvm-svn: 122531	2010-12-24 00:12:02 +00:00
Owen Anderson	6afd90810e	When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero are not the low bits of x, but the bits that WILL be the low bits after the operation completes. llvm-svn: 122529	2010-12-23 23:56:24 +00:00
Bob Wilson	85dbc89f44	Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions. If the basic block containing the BCCi64 (or BCCZi64) instruction ends with an unconditional branch, that branch needs to be deleted before appending the expansion of the BCCi64 to the end of the block. llvm-svn: 122521	2010-12-23 22:45:49 +00:00
Torok Edwin	cfd30fdf42	XFAIL vg_leak the new test as the rest. llvm-svn: 122517	2010-12-23 21:22:09 +00:00
Torok Edwin	2acdd67db2	Fix OCaml bindings crash, PR8847. See http://caml.inria.fr/mantis/view.php?id=4166 If we call only external functions from a module, then its 'let _' bindings don't get executed, which means that the exceptions don't get registered for use in the C code. This in turn causes llvm_raise to call raise_with_arg() with a NULL pointer and cause a segmentation fault. The workaround is to declare all 'external' functions as 'val' in these .mli files. Also added a separate testcase (the testcase must call only external functions for the bug to occur). llvm-svn: 122497	2010-12-23 15:49:26 +00:00
Andrew Trick	cc701bcfdc	Fixes PR8823: add-with-overflow-128.ll In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472	2010-12-23 03:15:51 +00:00
Benjamin Kramer	49942a90b7	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Benjamin Kramer	27d13684f5	InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1. llvm-svn: 122453	2010-12-22 23:12:15 +00:00
Benjamin Kramer	d8387aa9bd	X86: Lower a select directly to a setcc_carry if possible. int test(unsigned long a, unsigned long b) { return -(a < b); } compiles to _test: ## @test cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] sbbl %eax, %eax ## encoding: [0x19,0xc0] ret ## encoding: [0xc3] instead of _test: ## @test xorl %ecx, %ecx ## encoding: [0x31,0xc9] cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] movl $-1, %eax ## encoding: [0xb8,0xff,0xff,0xff,0xff] cmovael %ecx, %eax ## encoding: [0x0f,0x43,0xc1] ret ## encoding: [0xc3] llvm-svn: 122451	2010-12-22 23:09:28 +00:00
Daniel Dunbar	e6ec0e7149	MC/Mach-O/ARM: Don't try to use scattered relocs for BR24 fixups. llvm-svn: 122441	2010-12-22 21:26:43 +00:00
Rafael Espindola	5004de4d8b	Add reduced test from 8845. llvm-svn: 122438	2010-12-22 21:15:13 +00:00
Duncan Sands	68d969c2f5	When determining whether the new instruction was already present in the original instruction, half the cases were missed (making it not wrong but suboptimal). Also correct a typo (A <-> B) in the second chunk. llvm-svn: 122414	2010-12-22 17:15:25 +00:00
Duncan Sands	e1522867e6	Make this test not depend on how the variable is named. llvm-svn: 122413	2010-12-22 17:08:04 +00:00
Daniel Dunbar	cb8ac619a2	MC/Mach-O/ARM: We always use the SECTDIFF reloc type on ARM, which is esp. important given that the LOCAL_SECTDIFF enumeration got redefined. llvm-svn: 122412	2010-12-22 16:52:19 +00:00
Daniel Dunbar	e44a2c1166	MC/Mach-O/ARM: Add enough relocation logic to get BR24 relocations. llvm-svn: 122407	2010-12-22 16:19:24 +00:00
Rafael Espindola	7c995a90fc	Simplify the handling of .size expressions. llvm-svn: 122404	2010-12-22 16:03:00 +00:00
Duncan Sands	922251757b	Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C) if both A op B and A op C simplify. This fires fairly often but doesn't make that much difference. On gcc-as-one-file it removes two "and"s and turns one branch into a select. llvm-svn: 122399	2010-12-22 13:36:08 +00:00
Che-Liang Chiou	e73ad4387e	ptx: add ld instruction and test llvm-svn: 122398	2010-12-22 10:38:51 +00:00
Chris Lattner	04ef853e23	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Owen Anderson	b4f1511864	Give GVN back the ability to perform simple conditional propagation on conditional branch values. I still think that LVI should be handling this, but that capability is some ways off in the future, and this matters for some significant benchmarks. llvm-svn: 122378	2010-12-21 23:54:34 +00:00
Dale Johannesen	e0fb87c3d7	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 122355 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00
Benjamin Kramer	369872edfc	Add some x86 specific dagcombines for conditional increments. (add Y, (sete X, 0)) -> cmp X, 1; adc 0, Y (add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y (sub (sete X, 0), Y) -> cmp X, 1; sbb 0, Y (sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y for unsigned foo(unsigned a, unsigned b) { if (a == 0) b++; return b; } we now get: foo: cmpl $1, %edi movl %esi, %eax adcl $0, %eax ret instead of: foo: testl %edi, %edi sete %al movzbl %al, %eax addl %esi, %eax ret llvm-svn: 122364	2010-12-21 21:41:44 +00:00
Dale Johannesen	972aba543a	Revert 122353-122355 for the moment, they broke stuff. llvm-svn: 122360	2010-12-21 21:22:27 +00:00
Dale Johannesen	39186cfb0b	Add a new transform to DAGCombiner. llvm-svn: 122355	2010-12-21 20:10:51 +00:00
Dale Johannesen	5f3e7b08f6	Get the type of a shift from the shift, not from its shift count operand. These should be the same but apparently are not always, and this is cleaner anyway. This improves the code in an existing test. llvm-svn: 122354	2010-12-21 20:06:19 +00:00
David Greene	33a91c0c9a	Revert 122341. It breaks some darwin tests. llvm-svn: 122346	2010-12-21 17:25:43 +00:00
David Greene	28140b5288	Fix PR 8199. This patch prepends the build tool dir to LLVM programs being tested. This ensures that we test the tools just built and not some random tools that might happen to be in the user's PATH. This makes LLVM testing much more stable and predictable. llvm-svn: 122341	2010-12-21 16:55:53 +00:00
Duncan Sands	658dd68e10	Add an additional InstructionSimplify factorization test. llvm-svn: 122333	2010-12-21 15:12:22 +00:00
Duncan Sands	b4497c7e0f	While I don't think any later transforms can fire, it seems cleaner to not assume this (for example in case more transforms get added below it). Suggested by Frits van Bommel. llvm-svn: 122332	2010-12-21 15:03:43 +00:00
Duncan Sands	3ceeaf218e	Fix typo in comment, spotted by Deewiant. llvm-svn: 122329	2010-12-21 13:39:20 +00:00
Duncan Sands	0bd25425b6	Teach InstructionSimplify about distributive laws. These transforms fire quite often, but don't make much difference in practice presumably because instcombine also knows them and more. llvm-svn: 122328	2010-12-21 13:32:22 +00:00
Duncan Sands	5880f299da	Add generic simplification of associative operations, generalizing a couple of existing transforms. This fires surprisingly often, for example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well as various "and" simplifications (usually with a phi node operand). Most of the time this doesn't make a real difference since the same thing would have been done elsewhere anyway, eg: by instcombine, but there are a few places where this results in simplifications that we were not doing before. llvm-svn: 122326	2010-12-21 08:49:00 +00:00
Bob Wilson	01593c55a2	Add ARM-specific DAG combining to cast i64 vector element load/stores to f64. Type legalization splits up i64 values into pairs of i32 values, which leads to poor quality code when inserting or extracting i64 vector elements. If the vector element is loaded or stored, it can be treated as an f64 value and loaded or stored directly from a VPR register. Use the pre-legalization DAG combiner to cast those vector elements to f64 types so that the type legalizer won't mess them up. Radar 8755338. llvm-svn: 122319	2010-12-21 06:43:19 +00:00
Wesley Peck	e8ec7a4d1f	Teach the MBlaze disassembler to disassemble special purpose registers. llvm-svn: 122269	2010-12-20 21:18:04 +00:00
Roman Divacky	42b3eee794	Set the value of absolute symbols. llvm-svn: 122268	2010-12-20 21:14:39 +00:00
Roman Divacky	13b5260f62	Print all 64bits for st_value and st_size. Adjust tests accordingly. llvm-svn: 122263	2010-12-20 20:49:43 +00:00
Wesley Peck	af2890a051	Teach the MBlaze asm parser how to parse special purpose register names. llvm-svn: 122261	2010-12-20 20:43:24 +00:00
Dale Johannesen	036c3da142	Cosmetic changes. llvm-svn: 122259	2010-12-20 20:10:50 +00:00
Benjamin Kramer	bec7a6be15	Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1. InstCombine creates these so now we compile x == 23 || x == 24 || x == 25 to %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 3 instead of %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 2 %cmp3 = icmp eq i32 %x, 25 %ret2 = or i1 %1, %cmp3 llvm-svn: 122248	2010-12-20 16:18:51 +00:00
Duncan Sands	f72cfa961d	Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods (they had just been forgotten before). Adding Xor causes "main" in the existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified. llvm-svn: 122245	2010-12-20 14:47:04 +00:00
Chris Lattner	b27b5d0a3a	fix PR8807 by making transformConstExprCastCall aware of byval arguments. llvm-svn: 122238	2010-12-20 08:36:38 +00:00
Chris Lattner	ba962825a4	when eliding a byval copy due to inlining a readonly function, we have to make sure that the reused alloca has sufficient alignment. llvm-svn: 122236	2010-12-20 08:10:40 +00:00
Chris Lattner	c0a48df9f9	pull byval processing out to its own helper function. llvm-svn: 122235	2010-12-20 07:57:41 +00:00
Chris Lattner	029952c844	fix PR8769, a miscompilation by inliner when inlining a function with a byval argument. The generated alloca has to have at least the alignment of the byval, if not, the client may be making assumptions that the new alloca won't satisfy. llvm-svn: 122234	2010-12-20 07:45:28 +00:00
Chris Lattner	52149d6e21	merge two tests. llvm-svn: 122233	2010-12-20 07:39:57 +00:00
Chris Lattner	2fa128c4c5	filecheckize llvm-svn: 122232	2010-12-20 07:38:24 +00:00
Chris Lattner	4c3662e299	temporarily disable this: PR8823. llvm-svn: 122222	2010-12-20 02:11:23 +00:00
Chris Lattner	bee7320c3c	now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results, the same as setcc. Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS). This is a step towards finishing off PR5443. In the testcase in that bug we now get: movq %rdi, %rax addq %rsi, %rax sbbq %rcx, %rcx testb $1, %cl setne %dl ret instead of: movq %rdi, %rax addq %rsi, %rax movl $0, %ecx adcq $0, %rcx testq %rcx, %rcx setne %dl ret llvm-svn: 122219	2010-12-20 01:37:09 +00:00
Chris Lattner	2d4e17d195	We lower setb to sbb with the hope that the and will go away, when it doesn't, match it back to setb. On a 64-bit version of the testcase before we'd get: movq %rdi, %rax addq %rsi, %rax sbbb %dl, %dl andb $1, %dl ret now we get: movq %rdi, %rax addq %rsi, %rax setb %dl ret llvm-svn: 122217	2010-12-20 01:16:03 +00:00
Mon P Wang	236fa96503	Test case for r122215 when InstCombine optimizes memset llvm-svn: 122216	2010-12-20 01:06:23 +00:00
Mon P Wang	666259546c	Add comment for testcase for 122206 llvm-svn: 122210	2010-12-20 00:54:26 +00:00
Mon P Wang	d3adab7a64	Prevents PerformShuffleCombine from creating a node with an illegal type after legalize types has run, e.g., prevent creating an i64 node from a v2i64 when i64 is not a legal type. llvm-svn: 122206	2010-12-19 23:55:53 +00:00
Chris Lattner	297259f6f1	improve the setcc -> setcc_carry optimization to happen more consistently by moving it out of lowering into dag combine. Add some missing patterns for matching away extended versions of setcc_c. llvm-svn: 122201	2010-12-19 22:08:31 +00:00
Chris Lattner	ad85635a93	now that generic vector types aren't selected onto MMX registers, these tests don't need -disable-mmx. llvm-svn: 122188	2010-12-19 20:12:58 +00:00
Chris Lattner	29475c23d0	X86 supports i8/i16 overflow ops (except i8 multiplies), we should generate them. Now we compile: define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp { entry: %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b) %cmp = extractvalue %0 %0, 1 br i1 %cmp, label %if.then, label %if.end into: _X: ## @X ## BB#0: ## %entry subl $12, %esp movb 16(%esp), %al addb 20(%esp), %al jo LBB0_2 Before we were generating: _X: ## @X ## BB#0: ## %entry pushl %ebp movl %esp, %ebp subl $8, %esp movb 12(%ebp), %al testb %al, %al setge %cl movb 8(%ebp), %dl testb %dl, %dl setge %ah cmpb %cl, %ah sete %cl addb %al, %dl testb %dl, %dl setge %al cmpb %al, %ah setne %al andb %cl, %al testb %al, %al jne LBB0_2 llvm-svn: 122186	2010-12-19 20:03:11 +00:00
Chris Lattner	2d6a408c4c	add a general coverage test for overflow intrinsics. llvm-svn: 122185	2010-12-19 20:01:13 +00:00
Chris Lattner	3bc741a0d2	recognize an unsigned add with overflow idiom into uadd. This resolves a README entry and technically resolves PR4916, but we still get poor code for the testcase in that PR because GVN isn't CSE'ing uadd with add, filed as PR8817. Previously we got: _test7: ## @test7 addq %rsi, %rdi cmpq %rdi, %rsi movl $42, %eax cmovaq %rsi, %rax ret Now we get: _test7: ## @test7 addq %rsi, %rdi movl $42, %eax cmovbq %rsi, %rax ret llvm-svn: 122182	2010-12-19 19:37:52 +00:00
Chris Lattner	faef9b6bfb	optimize uadd(x, cst) into a comparison when the normal result is dead. This is required for my next patch to not regress the testsuite. llvm-svn: 122181	2010-12-19 19:35:32 +00:00
Chris Lattner	d1f114d8f2	generalize the sadd creation code to not require that the sadd formed is half the size of the original type. We can now compile this into a sadd.i8: unsigned char X(char a, char b) { int res = a+b; if ((unsigned )(res+128) > 255U) abort(); return res; } llvm-svn: 122178	2010-12-19 18:35:09 +00:00
Chris Lattner	bb0d067691	fix another miscompile in the llvm.sadd formation logic: it wasn't checking to see if the high bits of the original add result were dead. Inserting a smaller add and zexting back to that size is not good enough. This is likely to be the fix for 8816. llvm-svn: 122177	2010-12-19 18:22:06 +00:00
Chris Lattner	c7876edb16	fix a bug (possibly 8816) in the sadd forming xform: it isn't profitable (or safe) to promote code when the add-with-constant has other uses. llvm-svn: 122175	2010-12-19 17:59:02 +00:00
Chris Lattner	bb93cd80d6	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. llvm-svn: 122172	2010-12-19 05:57:25 +00:00
Chris Lattner	71fcecf597	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. llvm-svn: 122171	2010-12-19 05:51:54 +00:00
Chris Lattner	ac82ea26da	fix PR8642: if a critical edge has a PHI value that can trap, isel is required to split the edge. PHI values get evaluated on the edge, not in their predecessor block. llvm-svn: 122170	2010-12-19 04:58:57 +00:00
Chris Lattner	0965f3f76d	revert r122164, I'm going to go with a different approach. llvm-svn: 122168	2010-12-19 04:23:03 +00:00
Chris Lattner	14a3e26146	first step to fixing PR8642: don't fold away empty basic blocks which have trapping constant exprs in them due to PHI nodes. Eliminating them can cause the constant expr to be evaluated on new paths if the input edges are critical. llvm-svn: 122164	2010-12-19 03:02:34 +00:00
Chris Lattner	1cc35d2472	move this test into the ARM test so that it is only run when the arm backend is enabled. llvm-svn: 122163	2010-12-19 02:58:14 +00:00
Anton Korobeynikov	f49c9c02d6	Restore the behavior of frame lowering before my refactoring. It turns out that ppc backend has really weird interdependencies over different hooks and all stuff is fragile wrt small changes. This should fix PR8749 llvm-svn: 122155	2010-12-18 19:53:14 +00:00
Benjamin Kramer	84d3e6cfd0	Just rename the functions, relying on matching an instruction that has the same name as a symbol is way too fragile. llvm-svn: 122154	2010-12-18 14:23:57 +00:00
Benjamin Kramer	4d591385d1	Test more than just label names and make test work on non-x86 hosts. llvm-svn: 122153	2010-12-18 14:07:28 +00:00
Roman Divacky	ed5bb14415	Add support for lexing single quotes like 'c'. This fixed 8615. llvm-svn: 122150	2010-12-18 08:56:37 +00:00
Rafael Espindola	4c1b83e53f	Add a test that shows that we produce no fixups when computing the difference of two symbols in the same fragment. llvm-svn: 122145	2010-12-18 05:07:45 +00:00
Rafael Espindola	dff6c18a5e	Test for push being relaxed. llvm-svn: 122124	2010-12-18 01:16:59 +00:00
Bob Wilson	776d3f73eb	Fix result type of Neon floating-point comparisons against zero. The result vector elements are always integers. Radar 8782191. llvm-svn: 122112	2010-12-18 00:04:33 +00:00
Nate Begeman	063d88d6fb	Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms. llvm-svn: 122105	2010-12-17 23:12:19 +00:00
Bill Wendling	c16f9b1ccc	During local stack slot allocation, the materializeFrameBaseRegister function may be called. If the entry block is empty, the insertion point iterator will be the "end()" value. Calling ->getParent() on it (among others) causes problems. Modify materializeFrameBaseRegister to take the machine basic block and insert the frame base register at the beginning of that block. (It's very similar to what the code does already. The only difference is that it will always insert at the beginning of the entry block instead of after a previous materialization of the frame base register. I doubt that that matters here.) <rdar://problem/8782198> llvm-svn: 122104	2010-12-17 23:09:14 +00:00
Bob Wilson	c57c2d755b	Fix a DAGCombiner crash when folding binary vector operations with constant BUILD_VECTOR operands where the element type is not legal. I had previously changed this code to insert TRUNCATE operations, but that was just wrong. llvm-svn: 122102	2010-12-17 23:06:49 +00:00
Bob Wilson	347efe0b29	Combine several vector-related DAGCombiner tests. llvm-svn: 122101	2010-12-17 23:06:46 +00:00
Nate Begeman	ef5f3c0fa7	Add support for matching psign & pblendvb to the x86 target. Remove unnecessary pandn patterns; 'vnot' patfrag looks through bitcasts. llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Dale Johannesen	c2c6ebd82a	Add a transform to DAG Combiner. This improves the code for the case where 32-bit divide by constant is turned into 64-bit multiply by constant. 8771012. llvm-svn: 122090	2010-12-17 21:45:49 +00:00
Owen Anderson	6acf8c9125	Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself on the DragonEgg self-host bot. Unfortunately, the testcase is pretty messy and doesn't reduce well due to interactions with other parts of InstCombine. llvm-svn: 122072	2010-12-17 18:08:00 +00:00
Benjamin Kramer	39b30b18fa	SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build. llvm-svn: 122054	2010-12-17 10:48:14 +00:00
Kalle Raiskila	68f221707a	Don't feed 19 bit immediates to ILA. Patch (slightly modified) by Visa Putkinen. llvm-svn: 122052	2010-12-17 09:36:09 +00:00
Chris Lattner	e92f8121d4	improve switch formation to handle small range comparisons formed by comparisons. For example, this: void foo(unsigned x) { if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) bar(); } compiles into: _foo: ## @foo ## BB#0: ## %entry cmpl $6, %edi ja LBB0_2 ## BB#1: ## %entry movl %edi, %eax movl $91, %ecx btq %rax, %rcx jb LBB0_3 instead of: _foo: ## @foo ## BB#0: ## %entry cmpl $2, %edi jb LBB0_4 ## BB#1: ## %switch.early.test cmpl $6, %edi ja LBB0_3 ## BB#2: ## %switch.early.test movl %edi, %eax movl $88, %ecx btq %rax, %rcx jb LBB0_4 This catches a bunch of cases in GCC, which look like this: %804 = load i32* @which_alternative, align 4, !tbaa !0 %805 = icmp ult i32 %804, 2 %806 = icmp eq i32 %804, 3 %or.cond121 = or i1 %805, %806 %807 = icmp eq i32 %804, 4 %or.cond124 = or i1 %or.cond121, %807 br i1 %or.cond124, label %.thread, label %808 turning this into a range comparison. llvm-svn: 122045	2010-12-17 06:20:15 +00:00
Daniel Dunbar	1f9fd0b79b	MC/Expr: Implement more aggressive folding during symbol evaluation using IsSymbolRefDifferenceFullyResolved(). For example, we will now fold away something like: -- _a: ... L0: ... L1: ... .long (L1 - L0) / 2 -- llvm-svn: 122043	2010-12-17 05:50:33 +00:00
Bob Wilson	e06f6eabe7	Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation. Radar 8776599 llvm-svn: 122018	2010-12-17 01:21:12 +00:00
Dan Gohman	f8949c3d1a	Revert r64460. strtol and friends cannot be marked readonly, even with a null endptr argument, because they may write to errno. This fixes a selfhost miscompile observed on Linux targets when TBAA was enabled. llvm-svn: 122014	2010-12-17 01:09:43 +00:00
Rafael Espindola	bd13ceed72	"Fix" FDE alignment to match what gas does. llvm-svn: 122006	2010-12-17 00:28:02 +00:00
Rafael Espindola	3ee4530406	Make pushq produce signed relocations. llvm-svn: 122005	2010-12-16 22:50:01 +00:00
Duncan Sands	22de496ae3	Speculatively revert commit 121905 since it looks like it might have broken the dragonegg self-host buildbot. Original commit message: Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. llvm-svn: 121965	2010-12-16 09:40:54 +00:00
Jason W Kim	2f4a8d7553	1. ARM/MC/ELF: A few more ELF relocs for .o 2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :) Test added. llvm-svn: 121951	2010-12-16 03:12:17 +00:00
Dan Gohman	29a260015a	-enable-tbaa is on by default now. llvm-svn: 121945	2010-12-16 02:53:48 +00:00
Dan Gohman	e106936414	Make memcpyopt TBAA-aware. llvm-svn: 121944	2010-12-16 02:51:19 +00:00
Jason W Kim	529d762fff	Fix elf-dump --dump-section-data for .bss section llvm-svn: 121927	2010-12-16 00:15:10 +00:00
Dan Gohman	a2fd4f2e22	Preserve TBAA tags when doing load PRE. llvm-svn: 121921	2010-12-15 23:53:55 +00:00
Jim Grosbach	84c2b29b58	Thumb1 had two patterns for the same load-from-constant-pool instruction. Canonicalize on tLDRpci and remove tLDRcp. llvm-svn: 121920	2010-12-15 23:52:36 +00:00
Eric Christopher	339499f8f3	Don't handle -arm-long-calls in fast isel for now. llvm-svn: 121919	2010-12-15 23:47:29 +00:00
Owen Anderson	aefeb448a9	Add an InstCombine transform to recognize instances of manual overflow-safe addition (performing the addition in a wider type and explicitly checking for overflow), and fold them down to intrinsics. This currently only supports signed-addition, but could be generalized if someone works out the magic constant formulas for other operations. Fixes <rdar://problem/8558713>. llvm-svn: 121905	2010-12-15 22:32:38 +00:00
Evan Cheng	68e1ed8752	Teach machine cse to commute instructions. llvm-svn: 121903	2010-12-15 22:16:21 +00:00
Bob Wilson	438a9a1367	Add Neon VCVT instructions for f32 <-> f16 conversions. Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427. llvm-svn: 121902	2010-12-15 22:14:12 +00:00
Bob Wilson	1082705e72	Fix misspelled target triples in MC/ARM test commands. llvm-svn: 121901	2010-12-15 22:14:01 +00:00
Wesley Peck	2a376c535e	Lower the MBlaze target specific calling conventions for "interrupt_handler" and "save_volatiles" correctly. This completes the custom calling convention functionality changes for the MBlaze backend that were started in 121888. llvm-svn: 121891	2010-12-15 20:27:28 +00:00
Duncan Sands	2699fb1072	Move Sub simplifications and additional Add simplifications out of instcombine and into InstructionSimplify. llvm-svn: 121861	2010-12-15 14:07:39 +00:00
Frits van Bommel	83b7c3773f	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Rafael Espindola	94d026d157	Relax alignment fragments. With this we don't need the EffectiveSize field anymore. Without that field LayoutFragment only updates offsets and we don't need to invalidate the current fragment when it is relaxed (only the ones following it). This is also a very small improvement in the accuracy of the layout info as we now use the after relaxation size immediately. llvm-svn: 121857	2010-12-15 08:45:53 +00:00
Rafael Espindola	f55f520a7a	Patch by David Meyer to avoid an O(N^2) behaviour when relaxing fragments. Since we now don't update addresses so early, we might relax a bit more than we need to. This is similar to the issue in PR8467. llvm-svn: 121856	2010-12-15 07:39:29 +00:00
Chris Lattner	81815cd4db	take care of some todos, transforming [us]mul_lohi into a wider mul if the wider mul is legal. llvm-svn: 121848	2010-12-15 06:04:19 +00:00
Chris Lattner	3bec2e7d0d	merge two tests llvm-svn: 121847	2010-12-15 05:58:59 +00:00
Kevin Enderby	6b3ae489f8	Add some more MC tests for ARM arithmetic instructions that update or don't update the condition codes. These come from my test generator and are just the ones that MC currently assembles correctly. llvm-svn: 121830	2010-12-15 01:24:36 +00:00
Owen Anderson	de42e1136e	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail, this case involves a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Evan Cheng	7e96e67d98	Fix a minor bug in two-address pass. It was missing a commute opportunity. regB = move RCX regA = op regB, regC RAX = move regA where both regB and regC are killed. If regB is constrained to non-compatible physical registers but regC is not constrained at all, then it's better to commute the instruction. movl %edi, %eax shlq $32, %rcx leaq (%rcx,%rax), %rax => movl %edi, %eax shlq $32, %rcx orq %rcx, %rax rdar://8762995 llvm-svn: 121793	2010-12-14 21:34:53 +00:00
Daniel Dunbar	3f9b9dc852	MC/ARM: Fix-up fixup offset for fixup_arm_branch target specific fixup. llvm-svn: 121772	2010-12-14 17:37:16 +00:00
Chris Lattner	c1aaf52608	- Insert new instructions before DomBlock's terminator, which is simpler than finding a place to insert in BB. - Don't perform the 'if condition hoisting' xform on certain i1 PHIs, as it interferes with switch formation. This re-fixes "example 7", without breaking the world hopefully. llvm-svn: 121764	2010-12-14 08:46:09 +00:00
Chris Lattner	22d4dc5a4d	fix two significant issues with FoldTwoEntryPHINode: first, it can kick in on blocks whose conditions have been folded to a constant, even though one of the edges will be trivially folded. second, it doesn't clean up the "if diamond" that it just eliminated away. This is a problem because other simplifycfg xforms kick in depending on the order of block visitation, causing pointless work. llvm-svn: 121762	2010-12-14 08:01:53 +00:00
Chris Lattner	5d4aea9791	fix yet another broken line llvm-svn: 121750	2010-12-14 06:09:07 +00:00
Chris Lattner	093b5b256d	reapply my recent change that disables a piece of the switch formation work, but fixes 400.perlbmk. llvm-svn: 121749	2010-12-14 05:57:30 +00:00
Evan Cheng	6a2bed92f5	bfi A, (and B, C1), C2 -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663 llvm-svn: 121746	2010-12-14 03:22:07 +00:00
Jason W Kim	0cf0f7a078	fix fixme case typo :-) llvm-svn: 121743	2010-12-14 01:42:38 +00:00
Owen Anderson	5536134dc4	Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state where I'm confident there were no crashes or miscompilations. XFAIL the test added since then for now. llvm-svn: 121733	2010-12-13 23:49:28 +00:00
Jason W Kim	b5cc5dad79	First cut of ARM/MC/ELF PIC relocations. Test has fixme, to move to .s -> .o test when AsmParser works better. llvm-svn: 121732	2010-12-13 23:16:07 +00:00
Bob Wilson	33e5e902b0	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. llvm-svn: 121730	2010-12-13 23:02:37 +00:00
Chris Lattner	dcba81d96f	temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk. llvm-svn: 121728	2010-12-13 23:02:19 +00:00
Dan Gohman	b187cce266	Reapply r121520, PartialAlias implementation for BasicAA, now that memdep is updated to handle it. llvm-svn: 121725	2010-12-13 22:50:24 +00:00
Benjamin Kramer	7f1cdac1e4	Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780. llvm-svn: 121705	2010-12-13 18:20:38 +00:00
Chris Lattner	d17cbf803b	rename test llvm-svn: 121697	2010-12-13 08:39:40 +00:00
Chris Lattner	14810c808b	Add a couple dag combines to transform mulhi/mullo into a wider multiply when the wider type is legal. This allows us to compile: define zeroext i16 @test1(i16 zeroext %x) nounwind { entry: %div = udiv i16 %x, 33 ret i16 %div } into: test1: # @test1 movzwl 4(%esp), %eax imull $63551, %eax, %eax # imm = 0xF83F shrl $21, %eax ret instead of: test1: # @test1 movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F mulw 4(%esp) andl $65504, %edx # imm = 0xFFE0 movl %edx, %eax shrl $5, %eax ret Implementing rdar://8760399 and example #4 from: http://blog.regehr.org/archives/320 We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this. llvm-svn: 121696	2010-12-13 08:39:01 +00:00
Chris Lattner	0368bf7457	reinstate my patch: the miscompile was caused by an inverted branch in the 'and' case. llvm-svn: 121695	2010-12-13 08:12:19 +00:00
Chris Lattner	caad324345	Completely disable the optimization I added in r121680 until I can track down a miscompile. This should bring the buildbots back to life llvm-svn: 121693	2010-12-13 07:41:29 +00:00
Chris Lattner	5ce3e42d80	Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions when simplifying, allowing them to be eagerly turned into switches. This is the last step required to get "Example 7" from this blog post: http://blog.regehr.org/archives/320 On X86, we now generate this machine code, which (to my eye) seems better than the ICC generated code: _crud: ## @crud ## BB#0: ## %entry cmpb $33, %dil jb LBB0_4 ## BB#1: ## %switch.early.test addb $-34, %dil cmpb $58, %dil ja LBB0_3 ## BB#2: ## %switch.early.test movzbl %dil, %eax movabsq $288230376537592865, %rcx ## imm = 0x400000017001421 btq %rax, %rcx jb LBB0_4 LBB0_3: ## %lor.rhs xorl %eax, %eax ret LBB0_4: ## %lor.end movl $1, %eax ret llvm-svn: 121690	2010-12-13 07:00:06 +00:00
Chris Lattner	ea15ce73be	fix a bug in r121680 that upset the various buildbots. llvm-svn: 121687	2010-12-13 05:34:18 +00:00
Chris Lattner	c331eb8e1e	make these tests a bit less fragile llvm-svn: 121682	2010-12-13 05:10:30 +00:00
Chris Lattner	5cbbcc56ad	enhance the "change or icmp's into switch" xform to handle one value in an 'or sequence' that it doesn't understand. This allows us to optimize something insane like this: int crud (unsigned char c, unsigned x) { if(((((((((( (int) c <= 32 || (int) c == 46) || (int) c == 44) || (int) c == 58) || (int) c == 59) || (int) c == 60) || (int) c == 62) || (int) c == 34) || (int) c == 92) || (int) c == 39) != 0) foo(); } into: define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone { entry: %cmp = icmp ult i8 %c, 33 br i1 %cmp, label %if.then, label %switch.early.test switch.early.test: ; preds = %entry switch i8 %c, label %if.end [ i8 39, label %if.then i8 44, label %if.then i8 58, label %if.then i8 59, label %if.then i8 60, label %if.then i8 62, label %if.then i8 46, label %if.then i8 92, label %if.then i8 34, label %if.then ] by pulling the < comparison out ahead of the newly formed switch. llvm-svn: 121680	2010-12-13 04:50:38 +00:00
Chris Lattner	e35f4b31f4	merge two tests llvm-svn: 121679	2010-12-13 04:45:56 +00:00
Chris Lattner	25b642edfd	Fix my previous patch to handle a degenerate case that the llvm-gcc bootstrap buildbot tripped over. llvm-svn: 121674	2010-12-13 03:43:57 +00:00
Chris Lattner	a21c02e807	fix a fairly serious oversight with switch formation from or'd conditions. Previously we'd compile something like this: int crud (unsigned char c) { return c == 62 || c == 34 || c == 92; } into: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end ] lor.rhs: ; preds = %entry %cmp8 = icmp eq i8 %c, 92 br label %lor.end lor.end: ; preds = %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which failed to merge the compare-with-92 into the switch. With this patch we simplify this all the way to: switch i8 %c, label %lor.rhs [ i8 62, label %lor.end i8 34, label %lor.end i8 92, label %lor.end ] lor.rhs: ; preds = %entry br label %lor.end lor.end: ; preds = %entry, %entry, %entry, %lor.rhs %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ] %lor.ext = zext i1 %0 to i32 ret i32 %lor.ext which is much better for codegen's switch lowering stuff. This kicks in 33 times on 176.gcc (for example) cutting 103 instructions off the generated code. llvm-svn: 121671	2010-12-13 03:18:54 +00:00
Bill Wendling	16e7d7bf2f	Add support for using the `!if' operator when initializing variables: class A<bit a, bits<3> x, bits<3> y> { bits<3> z; let z = !if(a, x, y); } The variable z will get the value of x when 'a' is 1 and 'y' when a is '0'. llvm-svn: 121666	2010-12-13 01:46:19 +00:00
Wesley Peck	f842b79b4b	Missed some ADDI <-> ADDIK conversions in 121649. llvm-svn: 121652	2010-12-12 22:53:14 +00:00
Benjamin Kramer	a638216447	Generalize the and-icmp-select instcombine further by allowing selects of the form (x & 2^n) ? 2^m+C : C we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the select to a shift and apply the offset afterwards. llvm-svn: 121609	2010-12-11 10:49:22 +00:00
Benjamin Kramer	5a1721f4ac	Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it to catch cases where n != m with a shift. llvm-svn: 121608	2010-12-11 09:42:59 +00:00
Evan Cheng	b6773d7e1f	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 llvm-svn: 121606	2010-12-11 04:11:38 +00:00
Bob Wilson	d30768fe3e	Add float patterns for Neon vld1-lane/dup and vst1-lane operations. llvm-svn: 121583	2010-12-10 22:13:32 +00:00
Dan Gohman	18e2a55c07	Revert r121520, which may have introduced miscompilations. llvm-svn: 121573	2010-12-10 21:48:28 +00:00
Dan Gohman	d1bf1d8013	Implement PartialAlias checking in BasicAA. llvm-svn: 121520	2010-12-10 20:47:03 +00:00
Bob Wilson	5ff13f9d5c	Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions. Alignments smaller than the total size of the memory being loaded or stored, unless the alignment is 8 bytes, are not allowed. Add tests for this, too. llvm-svn: 121506	2010-12-10 19:37:42 +00:00
NAKAMURA Takumi	e6b65793ea	macho-dump: Fix CMake build, following up to r121466. llvm-svn: 121476	2010-12-10 09:18:26 +00:00
Rafael Espindola	0e665e502d	Fixed version of 121434 with no new memory leaks. llvm-svn: 121471	2010-12-10 07:39:47 +00:00
Daniel Dunbar	599da0cadf	macho-dump: Switch to C++ macho-dump tool. llvm-svn: 121466	2010-12-10 06:19:45 +00:00
Rafael Espindola	011e168728	Revert my previous patch to make the valgrind bots happy. llvm-svn: 121461	2010-12-10 04:01:09 +00:00
NAKAMURA Takumi	5ad673e4d9	Add dependency to "make check". cmake/modules/AddLLVM.cmake: Add empty "phony" target in add_llvm_loadable_module() even if loadable module were not supported. llvm-svn: 121455	2010-12-10 02:15:36 +00:00
Nate Begeman	cb6d1c8193	Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable. llvm-svn: 121439	2010-12-10 00:26:57 +00:00
Rafael Espindola	03ad1e8f1f	Initial support for the cfi directives. This is just enough to get f: .cfi_startproc nop .cfi_endproc assembled (on ELF). llvm-svn: 121434	2010-12-09 23:48:29 +00:00
Kevin Enderby	55cb19813e	Add support for parsing ARM arithmetic instructions that update or don't update the condition codes. The ones that update them have an 's' suffix and the ones that don't have no suffix. The trick is that if MatchInstructionImpl() fails, we try again after adding a CCOut operand with the correct value and removing the 's' if present. Four simple test cases added for now, lots more to come. llvm-svn: 121401	2010-12-09 19:19:43 +00:00
Jim Grosbach	8bc33cc6e5	ARM stm/ldm instructions require more than one register in the register list. Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460 llvm-svn: 121391	2010-12-09 18:31:13 +00:00
Bruno Cardoso Lopes	93e5c2fb64	Add ROTR and ROTRV mips32 instructions. Patch by Akira Hatanaka llvm-svn: 121377	2010-12-09 17:32:30 +00:00
Chris Lattner	996691e79c	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	4fef82afa0	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Eric Christopher	0e40452eb0	Rewrite the darwin tlv support to use a chain and return to copying the output to the correct register. Fixes a hidden problem uncovered by the last patch where we'd try to DAG combine our MVT::Other node oddly. llvm-svn: 121358	2010-12-09 06:25:53 +00:00
Dan Gohman	3d9fc7db03	Really check that the bits that will become zero are actually already zero before eliminating the operation that zeros them. This fixes rdar://8739316. llvm-svn: 121353	2010-12-09 02:52:17 +00:00
Eric Christopher	0100a8fda4	Remove extraneous copy from DAG conversion for darwin tls. This was popping up at O0 when it wasn't folded and the fast allocator would complain. llvm-svn: 121330	2010-12-09 00:27:58 +00:00
Kevin Enderby	988dab6b5c	Allow a slash, '/', as a prefix separator for X86. rdar://8741045 llvm-svn: 121320	2010-12-08 23:57:59 +00:00
Eric Christopher	d601b8288f	Move this test to tlv* to make it easier to notice versus linux tls support. llvm-svn: 121316	2010-12-08 23:33:23 +00:00
Jason W Kim	2e6e50c1b0	ARM/MC/ELF TPsoft is now a proper pseudo inst. Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM as well as ELF/OBJ (including fixup) Also added support for ELF::R_ARM_TLS_IE32 llvm-svn: 121312	2010-12-08 23:14:44 +00:00
Evan Cheng	3bd9b95b4d	Fix a bad prologue / epilogue codegen bug where the compiler would emit illegal vpush instructions to save / restore VFP / NEON registers like this: vpush {d8,d10,d11} vpop {d8,d10,d11} vpush and vpop do not allow gaps in the register list. rdar://8728956 llvm-svn: 121197	2010-12-07 23:08:38 +00:00
Bruno Cardoso Lopes	0e14644599	Match a pattern generated by a dag combiner opt where: (select (load (load tga0)) (load tga1)) => (load (select (load tga0) tga1)) Thanks to Akira for pointing that out. llvm-svn: 121163	2010-12-07 19:00:20 +00:00
Rafael Espindola	866531d633	Fix absolute recording of differences of symbols in two sections. Reduced from ctor_dtor_count-2.cpp. llvm-svn: 121152	2010-12-07 17:12:32 +00:00
Rafael Espindola	da64b6aa50	Fix relocations with weak definitions. llvm-svn: 121114	2010-12-07 05:57:28 +00:00
NAKAMURA Takumi	9d1f909489	Revert test/Archive/check_binary_output.ll. It fails on a buildbot. llvm-svn: 121113	2010-12-07 05:57:02 +00:00
Chris Lattner	12c2c17ac7	reapply r121100 with a tweak to constant fold ConstExprs with TargetData (if available) as we go so that we get simple constantexprs not insane ones. This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp that the previous iteration of this patch had. llvm-svn: 121111	2010-12-07 04:33:29 +00:00
Rafael Espindola	9ede5ef045	Fix pcrel relocations that cross sections. llvm-svn: 121107	2010-12-07 03:50:14 +00:00
NAKAMURA Takumi	159722407b	test/Archive/check_binary_output.ll: Add a new test to check output of 'llvm-ar -p' is sane. Thanks to Danil Malyshev! llvm-svn: 121106	2010-12-07 03:35:20 +00:00
NAKAMURA Takumi	f260cf36f0	test/Other/close-stderr.ll: Require the feature 'shell'. It is not executable on Win32 but it is executable on MSYS-bash. llvm-svn: 121105	2010-12-07 02:43:58 +00:00
NAKAMURA Takumi	63bff1a5d3	test: Add the feature 'shell' on LLVM_ON_UNIX. llvm-svn: 121104	2010-12-07 02:43:51 +00:00
Eric Christopher	cab6997dc8	Temporarily revert r121100 as it's causing clang to fail CodeGenCXX/virtual-base-ctor.cpp. llvm-svn: 121102	2010-12-07 02:41:11 +00:00
Chris Lattner	5996a47663	fix PR8710 - teach global opt that some constantexprs are too complex to put in a global variable's initializer. llvm-svn: 121100	2010-12-07 01:59:32 +00:00
Michael J. Spencer	4c0dfbd472	Test: Fix Support.Path and _all_ of the unittest death tests. GetTempPath defaults to \Windows\. If I typed anything else it would just decline into cursing. llvm-svn: 121095	2010-12-07 01:23:49 +00:00
Rafael Espindola	c98cc0b286	Fix a crash reduced from gcc produced assembly. llvm-svn: 121085	2010-12-07 01:09:54 +00:00
Owen Anderson	81f8b084e6	Second attempt at converting Thumb2's LDRpci, including updating the gazillion places that need to know about it. llvm-svn: 121082	2010-12-07 00:45:21 +00:00
Frits van Bommel	1494a2f6fe	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresss. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Devang Patel	6fe7fe8dd4	If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0. llvm-svn: 121059	2010-12-06 22:39:26 +00:00
Wesley Peck	996c76c27e	Fixed reversed operands for IDIV and CMP instructions in MBlaze backend. Use BRAD instead of BRD for indirect branches in MBlaze backend. patch contributed by Jack Whitham! llvm-svn: 121044	2010-12-06 22:06:49 +00:00
Wesley Peck	b168ddedaa	Fix a 16-bit immediate value detection bug in the MBlaze delay slot filler. Address more hazards in the MBlaze delay slot filler. patch contributed by Jack Whitham! llvm-svn: 121037	2010-12-06 21:11:01 +00:00
Rafael Espindola	3e954d16f4	Second try at making direct object emission produce the same results as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc bootstraps on darwin10 using darwin9's assembler and linker. llvm-svn: 121006	2010-12-06 17:27:56 +00:00
Rafael Espindola	4ec917db9b	Revert previous two patches while I try to find out how to make both linux and darwin assemblers happy :-( llvm-svn: 121004	2010-12-06 15:35:15 +00:00
Rafael Espindola	ad6219b193	Update test for the extra =. llvm-svn: 121001	2010-12-06 15:05:36 +00:00
Che-Liang Chiou	cd2878d421	ptx: add shift instructions llvm-svn: 120982	2010-12-06 04:00:03 +00:00
Rafael Espindola	f56c11276e	Don't use PadSectionToAlignment on windows. llvm-svn: 120978	2010-12-06 03:03:44 +00:00
Chris Lattner	db6c348f31	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Evan Cheng	fc78767730	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Frits van Bommel	e390b379ae	Fix PR 4170 by having ExtractValueInst::getIndexedType() reject out-of-bounds indexing. Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices. llvm-svn: 120956	2010-12-05 20:50:26 +00:00
Frits van Bommel	31cf7b99f9	Teach SimplifyCFG to turn (indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943	2010-12-05 18:29:03 +00:00
Chris Lattner	e30adfb732	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936	2010-12-05 07:49:54 +00:00
Chris Lattner	76601e7a99	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Chris Lattner	9b4b9e751a	fix the rest of the linux miscompares :) llvm-svn: 120933	2010-12-05 02:08:07 +00:00
Chris Lattner	16bafb2414	generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of: void *func(long count) { return new int[count]; } from: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] to: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01] sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff] notq %rdi ## encoding: [0x48,0xf7,0xd7] orq %rax, %rdi ## encoding: [0x48,0x09,0xc7] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] llvm-svn: 120932	2010-12-05 02:00:51 +00:00
Chris Lattner	e1c32a116b	relax this to handle linux defaulting to -static. llvm-svn: 120930	2010-12-05 01:31:13 +00:00
Chris Lattner	474ed0aa9b	Improve an integer select optimization in two ways: 1. generalize (select (x == 0), -1, 0) -> (sign_bit (x - 1)) to: (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y 2. Handle the identical pattern that happens with !=: (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got: < testb %sil, %sil < movl $-1, %ecx < movl $1, %eax < cmovel %ecx, %eax now we get: > cmpb $1, %sil > sbbl %eax, %eax > orl $1, %eax llvm-svn: 120929	2010-12-05 01:23:24 +00:00
Chris Lattner	c383be807e	merge some tests into select.ll and make them more specific. llvm-svn: 120928	2010-12-05 01:13:58 +00:00
Chris Lattner	7e0a594633	rename test llvm-svn: 120927	2010-12-05 01:02:23 +00:00
Chris Lattner	1a760f6247	remove two tests that aren't really testing anything. llvm-svn: 120926	2010-12-05 01:02:13 +00:00
Benjamin Kramer	851691ddb2	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). llvm-svn: 120917	2010-12-04 20:32:23 +00:00
Bob Wilson	20c65a9d33	The Thumb tADDrSPi instruction is not valid when the destination is SP. Check for that and try narrowing it to tADDspi instead. Radar 8724703. llvm-svn: 120892	2010-12-04 04:40:19 +00:00
Rafael Espindola	9215947c83	There are two reasons why we might want to use foo = a - b .long foo instead of just .long a - b First, on darwin9 64 bits the assembler produces the wrong result. Second, if "a" is the end of the section all darwin assemblers (9, 10 and mc) will not consider a - b to be a constant but will if the dummy foo is created. Split how we handle these cases. The first one is something MC should take care of. The second one has to be handled by the caller. llvm-svn: 120889	2010-12-04 03:21:47 +00:00
Rafael Espindola	50b6170457	Next step: Only pad debug_line when the target is darwin. Add a FIXME to avoid doing that if the target is darwin10 or newer. This fixes two problems: direct object emission was producing objects without the workaround on darwin9, and assembly printing was producing objects with the workaround on linux. llvm-svn: 120866	2010-12-04 00:31:13 +00:00
Jim Grosbach	8cef570ed9	Encode the 32-bit wide Thumb (and Thumb2) instructions with the high order halfword being emitted to the stream first. rdar://8728174 llvm-svn: 120848	2010-12-03 22:31:40 +00:00
Jim Grosbach	c69ad2176a	When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the 32-bit wide version by adding the .w suffix. llvm-svn: 120838	2010-12-03 20:33:01 +00:00
Devang Patel	dad3193123	Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot. llvm-svn: 120750	2010-12-02 23:29:58 +00:00
Devang Patel	822facd787	Use set directive for StartMinusEndExpr. This is a fix for llvm-gcc-i386-darwin9 buildbot failure. llvm-svn: 120742	2010-12-02 21:32:30 +00:00
Stuart Hastings	55928bc576	Test case for r120740. Radar 8712503. llvm-svn: 120741	2010-12-02 21:25:55 +00:00
Duncan Sands	b38ddd79ed	Adjust this test for the fact that the stores are no longer being combined (which is being tracked as PR8699). llvm-svn: 120734	2010-12-02 20:56:51 +00:00
Jim Grosbach	5a94ed2d67	XFAIL for now. If someone with access to an ARM/Linux host wants to have a look that would be great. They're ARM JIT failures, so without that, it's tough. llvm-svn: 120731	2010-12-02 20:20:32 +00:00
Evan Cheng	402157b66e	Fix test. llvm-svn: 120730	2010-12-02 20:17:34 +00:00
Duncan Sands	84bd43844e	This test dates from the time when llvm-gcc had problems if two types were named the same, so it had to qualify type names according to the enclosing scope to ensure uniqueness. This is no longer needed for correctness (though it may be helpful when reading the IR), so this test has lost its importance. Zap it because dragonegg will never be able to produce the qualified type name since modern gcc zaps language specific info (such as whether a type is nested inside another - needed to get X::Y here) before dragonegg is reached. llvm-svn: 120721	2010-12-02 18:19:23 +00:00
NAKAMURA Takumi	f8cc69a131	test/Archive/extract.ll: Use cmp instead of diff. Thanks to Danil Malyshev! llvm-svn: 120698	2010-12-02 09:16:14 +00:00
Evan Cheng	4118b24aca	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Rafael Espindola	16b64c646a	Rename temporary symbols if they conflict with artificial symbols created by the assembler. This was blocking parsing any large .s produced by clang for example. Fixes PR8596. llvm-svn: 120603	2010-12-01 20:46:11 +00:00
Owen Anderson	8802c68592	Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax. llvm-svn: 120589	2010-12-01 19:18:46 +00:00
Evan Cheng	84162760b7	Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot. llvm-svn: 120549	2010-12-01 03:27:20 +00:00
Jason W Kim	4d960e071c	ARM/MC/ELF relocation "hello world" for movw/movt. Lifted adjustFixupValue() from Darwin for sharing w ELF. Test added TODO: refactor ELFObjectWriter::RecordRelocation more. Possibly share more code with Darwin? Lots more relocations... llvm-svn: 120534	2010-12-01 02:40:06 +00:00
Chris Lattner	c3112f1e94	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
NAKAMURA Takumi	ffb10289fb	test/Archive: FileCheck-ize, and remove *.toc. These may be CRLF-tolerant. llvm-svn: 120506	2010-12-01 00:09:25 +00:00
Evan Cheng	f7e586d749	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Chris Lattner	c888f3ec58	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	191aa08db1	remove fixme comment too. llvm-svn: 120493	2010-11-30 23:25:01 +00:00
Chris Lattner	eee2bb2ff0	check in all files. This is now handled by my previous DSE commit. llvm-svn: 120492	2010-11-30 23:23:59 +00:00
Chris Lattner	b9c5a6fa04	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	41b6b286a3	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Owen Anderson	e2a8781847	Add tests for more forms of Thumb2 loads and stores. llvm-svn: 120436	2010-11-30 18:15:21 +00:00
Che-Liang Chiou	f594fe5fc5	ptx: add command-line options for gpu target and ptx version llvm-svn: 120423	2010-11-30 10:14:14 +00:00
Eric Christopher	990bcd83b8	Not all platforms use _<func>. Duh. llvm-svn: 120418	2010-11-30 09:23:54 +00:00
Bill Wendling	ae920bcc50	Add parsing for the Thumb t_addrmode_s4 addressing mode. This can almost certainly be made more generic. But it does allow us to parse something like: ldr r3, [r2, r4] correctly in Thumb mode. llvm-svn: 120408	2010-11-30 07:44:32 +00:00
Chris Lattner	7d444d0682	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Eric Christopher	f27f0b5234	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Anders Carlsson	67e9e6234c	Add a puts optimization that converts puts() to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Anders Carlsson	2a46a03898	Fix a typo. llvm-svn: 120394	2010-11-30 06:03:55 +00:00
Anders Carlsson	a2ad88fb73	Rename this test to FPuts.ll since it actually tests fputs. llvm-svn: 120393	2010-11-30 05:59:26 +00:00
Chris Lattner	bea813875e	remove a use of llvm-dis llvm-svn: 120383	2010-11-30 02:04:15 +00:00
Chris Lattner	56b0cc6974	merge one more away llvm-svn: 120375	2010-11-30 01:06:43 +00:00
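
The message for 14810c808b (r121696) shows `udiv i16 %x, 33` compiled to a multiply by 63551 followed by a right shift of 21. A minimal Python sketch of that arithmetic, using only the constants from the listing, can check the equivalence over every 16-bit input. The function names are hypothetical and this models the arithmetic only, not the DAG combine itself.

```python
# Constants taken from the assembly in the r121696 commit message:
#   imull $63551, %eax, %eax   # imm = 0xF83F
#   shrl  $21, %eax
MAGIC = 63551
SHIFT = 21


def udiv33_reference(x):
    """Plain unsigned 16-bit division by 33."""
    return (x & 0xFFFF) // 33


def udiv33_mul_shift(x):
    """Same quotient via one wide multiply and a shift (hypothetical helper)."""
    return ((x & 0xFFFF) * MAGIC) >> SHIFT


# Exhaustive check over the whole i16 range.
mismatches = [x for x in range(1 << 16)
              if udiv33_reference(x) != udiv33_mul_shift(x)]
```

The product needs more than 16 bits, which is why the message talks about a wider multiply being legal.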
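
The `btq` sequence quoted in 5ce3e42d80 (r121690) tests membership in the switch's case set with a single 64-bit mask. The sketch below rebuilds that mask from the case values listed in 5cbbcc56ad (r121680) and compares the bit-test path against the original `||` chain. It is an illustration under the assumption that bit `c - 34` of the mask stands for case value `c`; the helper names are hypothetical.

```python
# Case values from the switch in the r121680 commit message.
SWITCH_CASES = (34, 39, 44, 46, 58, 59, 60, 62, 92)
BASE = 34  # the listing does `addb $-34, %dil` before the range check


def build_mask(cases, base):
    """Set bit (value - base) for every case value."""
    mask = 0
    for value in cases:
        mask |= 1 << (value - base)
    return mask


MASK = build_mask(SWITCH_CASES, BASE)


def crud_reference(c):
    """The original condition: c <= 32 or c equal to one of the cases."""
    return 1 if c <= 32 or c in SWITCH_CASES else 0


def crud_bit_test(c):
    """Follows the shape of the x86 listing for an unsigned 8-bit c."""
    if c < 33:                       # cmpb $33 ; jb
        return 1
    shifted = (c - BASE) & 0xFF      # addb $-34 wraps at 8 bits
    if shifted > 58:                 # cmpb $58 ; ja
        return 0
    return (MASK >> shifted) & 1     # btq ; jb
```

Under that assumption, `MASK` comes out equal to the immediate shown in the listing, `0x400000017001421`.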
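
The select rewrite in 474ed0aa9b (r120929) replaces `(x == 0) ? -1 : y` with a `cmp $1` / `sbb` / `or` sequence. One way to read that sequence, and this reading is an interpretation rather than something the message spells out, is that `sbb` materialises the borrow from the unsigned comparison `x < 1` as an all-ones or all-zero mask, which is then ORed with `y`. A small Python model on 32-bit values, with hypothetical function names:

```python
MASK32 = 0xFFFFFFFF


def select_eq_zero(x, y):
    """Reference semantics: (x == 0) ? -1 : y on 32-bit values."""
    return MASK32 if (x & MASK32) == 0 else (y & MASK32)


def borrow_mask(x):
    """Models `cmp $1, x ; sbb r, r`: all ones if x < 1 unsigned, else 0."""
    borrow = 1 if (x & MASK32) < 1 else 0
    return (-borrow) & MASK32


def select_via_sbb(x, y):
    """Branch-free form: borrow mask ORed with y."""
    return borrow_mask(x) | (y & MASK32)


SAMPLES = (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0x80000001, MASK32)
agree = all(select_eq_zero(x, y) == select_via_sbb(x, y)
            for x in SAMPLES for y in SAMPLES)
```

The `!=` form in the message, `(select (x != 0), y, -1)`, is the same function with the arms swapped, so the same mask applies.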

12057 Commits