llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-31 16:02:52 +01:00

Author	SHA1	Message	Date
Chris Lattner	79947d56ea	filecheckize llvm-svn: 125710	2011-02-17 02:21:03 +00:00
Chris Lattner	828b97cdc2	fix PR9215, preventing -reassociate from clearing nsw/nuw when it swaps the LHS/RHS of a single binop. llvm-svn: 125700	2011-02-17 01:29:24 +00:00
Rafael Espindola	4ef1268b39	Gas is very inconsistent about when a relaxation/relocation is needed. Do the right thing and stop trying to copy it. Fixes PR8944. llvm-svn: 125648	2011-02-16 03:25:55 +00:00
Eric Christopher	8965ba39fb	The change for PR9190 wasn't quite right. We need to avoid making the transformation if we can't legally create a build vector of the correct type. Check that we can make the transformation first, and add a TODO to refactor this code with similar cases. Fixes: PR9223 and rdar://9000350 llvm-svn: 125631	2011-02-16 01:10:03 +00:00
Eric Christopher	2d3de0727f	Add testcase for PR9190. llvm-svn: 125630	2011-02-16 01:08:31 +00:00
Rafael Espindola	b59fdeb3de	Add support for pushsection and popsection. Patch by Joerg Sonnenberger. llvm-svn: 125629	2011-02-16 01:08:29 +00:00
Nick Lewycky	5c854580b2	Teach PatternMatch that splat vectors could be floating point as well as integer. Fixes PR9228! llvm-svn: 125613	2011-02-15 23:13:23 +00:00
Roman Divacky	3277e18a42	Add support for parsing [expr]. This is submitted by Joerg Sonnenberger and fixes his PR8685. llvm-svn: 125595	2011-02-15 20:43:39 +00:00
Devang Patel	d5b7c28519	Ignore DBG_VALUE machine instructions while constructing instruction ranges based on location info. Machine instruction range consisting of only DBG_VALUE MIs only contributes consecutive labels in assembly output, which is harmless, and empty scope entry in DebugInfo, which confuses debugger tools. llvm-svn: 125577	2011-02-15 17:56:09 +00:00
Nadav Rotem	5306a4ae96	Fix 9216 - Endless loop in InstCombine pass. The pattern "A&(A^B) -> A & ~B" recreated itself because ~B is actually a xor -1. llvm-svn: 125557	2011-02-15 07:13:48 +00:00
Devang Patel	4b1ea1ef94	Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" to a value that is modified inside the loop. llvm-svn: 125529	2011-02-14 23:03:23 +00:00
Rafael Espindola	3c32af5834	Switch llvm to using comdats. For now always use groups with a single section. llvm-svn: 125526	2011-02-14 22:23:49 +00:00
Bob Wilson	e53a0cef51	PR9139: Specify ARM/Darwin triple for vector-DAGCombine.ll test. The i64_buildvector test in this file relies on the alignment of i64 and f64 types being the same, which is true for Darwin but not AAPCS. llvm-svn: 125525	2011-02-14 22:12:50 +00:00
Bruno Cardoso Lopes	e65a98b127	Fix encoding and add parsing support for the arm/thumb CPS instruction: - Add custom operand matching for imod and iflags. - Rename SplitMnemonicAndCC to SplitMnemonic since it splits more than CC from mnemonic. - While adding ".w" as an operand, don't change "Head" to avoid passing the wrong mnemonic to ParseOperand. - Add asm parser tests. - Add disassembler tests just to make sure it can catch all cps versions. llvm-svn: 125489	2011-02-14 13:09:44 +00:00
Chris Lattner	68d9fae34e	fix PR9210 by implementing some type legalization logic for vector fp conversions. llvm-svn: 125482	2011-02-14 06:30:45 +00:00
Chris Lattner	bcf2d46d8a	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470	2011-02-13 22:25:43 +00:00
Duncan Sands	c5e791fdd9	Teach instsimplify that X+Y>=X+Z is the same as Y>=Z if neither side overflows, plus some variations of this. According to my auto-simplifier this occurs a lot but usually in combination with max/min idioms. Because max/min aren't handled yet this unfortunately doesn't have much effect in the testsuite. llvm-svn: 125462	2011-02-13 17:15:40 +00:00
Nadav Rotem	98c5af8517	Fix test llvm-svn: 125460	2011-02-13 16:13:16 +00:00
Nadav Rotem	3ce2bfbb55	Fix a regression from r125393; It caused a crash in MultiSource/Benchmarks/Bullet. Opt hit an assertion with "opt -std-compile-opts" because Constant::getAllOnesValue doesn't know how to handle floats. This patch added a test to reproduce the problem and a check that the destination vector is of integer type. Thank you Benjamin! llvm-svn: 125459	2011-02-13 15:45:34 +00:00
Chris Lattner	378faeecc8	when legalizing extremely wide shifts, make sure that the shift amounts are in a suitably wide type so that we don't generate out of range constant shift amounts. This fixes PR9028. llvm-svn: 125458	2011-02-13 09:10:56 +00:00
Chris Lattner	47e8019142	fix visitShift to properly zero extend the shift amount if the provided operand is narrower than the shift register. Doing an anyext provides undefined bits in the top part of the register. llvm-svn: 125457	2011-02-13 09:02:52 +00:00
Chris Lattner	4362c74d13	add PR# llvm-svn: 125455	2011-02-13 08:27:31 +00:00
Chris Lattner	72b78e11ba	implement instcombine folding for things like (x >> c) < 42. We were previously simplifying divisions, but not right shifts! llvm-svn: 125454	2011-02-13 08:07:21 +00:00
Chris Lattner	2596ac19b9	teach SCEV that the scale and addition of an inbounds gep don't NSW. This fixes a FIXME in scev-aa.ll (allowing a new no-alias result) and generally makes things more precise. llvm-svn: 125449	2011-02-13 03:14:49 +00:00
Reid Kleckner	0e68b2ed88	Add encodings and mnemonics for FXSAVE64 and FXRSTOR64. These are just FXSAVE and FXRSTOR with REX.W prefixes. These versions use 64-bit pointer values instead of 32-bit pointer values in the memory map they dump and restore. llvm-svn: 125446	2011-02-12 23:24:13 +00:00
Venkatraman Govindaraju	3cc16c2b89	Prevent IMPLICIT_DEF/KILL from becoming a delay filler instruction in SPARC backend. llvm-svn: 125444	2011-02-12 19:02:33 +00:00
Daniel Dunbar	74c3b94237	SimplifyLibCalls: Add missing legalize check on various printf to puts and putchar transforms, their return values are not compatible. llvm-svn: 125442	2011-02-12 18:19:57 +00:00
Daniel Dunbar	33aae18345	tests: FileCheckize llvm-svn: 125441	2011-02-12 18:19:53 +00:00
Nadav Rotem	367d73ccb6	A fix for 9165. The DAGCombiner created illegal BUILD_VECTOR operations. The patch added a check that either illegal operations are allowed or that the created operation is legal. llvm-svn: 125435	2011-02-12 14:40:33 +00:00
Benjamin Kramer	793cd269de	Also fold (A+B) == A -> B == 0 when the add is commuted. llvm-svn: 125411	2011-02-11 21:46:48 +00:00
Chris Lattner	fec8b6bd6d	Per discussion with Dan G, inbounds geps certainly can have unsigned overflow (e.g. "gep P, -1"), and while they can have signed wrap in theoretical situations, modelling an AddRec as not having signed wrap is good enough for any case we can think of today. In the future if this isn't enough, we can revisit this. Modeling them as having NUW isn't causing any known problems either FWIW. llvm-svn: 125410	2011-02-11 21:43:33 +00:00
Nate Begeman	0a8f9ff53b	Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types. This avoids moving each element to the integer register file and calling __divsi3 etc. on it. llvm-svn: 125402	2011-02-11 20:53:29 +00:00
Nadav Rotem	c2e51ee68a	Fix 9173. Add more folding patterns to constant expressions of vector selects and vector bitcasts. llvm-svn: 125393	2011-02-11 19:37:55 +00:00
Daniel Dunbar	5620b11961	Disable this test for now... llvm-svn: 125361	2011-02-11 02:59:08 +00:00
Evan Cheng	7cfe7b71e6	Fix buggy fcopysign lowering. This define float @foo(float %x, float %y) nounwind readnone { entry: %0 = tail call float @copysignf(float %x, float %y) nounwind readnone ret float %0 } Was compiled to: vmov s0, r1 bic r0, r0, #-2147483648 vmov s1, r0 vcmpe.f32 s0, #0 vmrs apsr_nzcv, fpscr it lt vneglt.f32 s1, s1 vmov r0, s1 bx lr This fails to copy the sign of -0.0f because it's lost during the float to int conversion. Also, it's sub-optimal when the inputs are in GPR registers. Now it uses integer and + or operations when it's profitable. And it's correct! lsrs r1, r1, #31 bfi r0, r1, #31, #1 bx lr rdar://8984306 llvm-svn: 125357	2011-02-11 02:28:55 +00:00
Cameron Zwarich	898b10d36b	Add a test for the LSR issue exposed by r125254. llvm-svn: 125325	2011-02-11 00:49:27 +00:00
Nick Lewycky	6380885ba1	Tolerate degenerate phi nodes that can occur in the middle of optimization passes. Fixes PR9112. Patch by Jakub Staszak! llvm-svn: 125319	2011-02-10 23:54:10 +00:00
Cameron Zwarich	9d8ab7d0f7	Rename 'loopsimplify' to 'loop-simplify'. llvm-svn: 125317	2011-02-10 23:38:10 +00:00
Bruno Cardoso Lopes	3792d82914	Add mips o32 tests again with the hope that the buildbot won't complain again llvm-svn: 125316	2011-02-10 23:37:20 +00:00
Bruno Cardoso Lopes	c9968d544b	Remove the test to silence the buildbot, will check it in again with a proper fix soon llvm-svn: 125305	2011-02-10 20:10:17 +00:00
Bruno Cardoso Lopes	7cc40009a8	Fix a lot of o32 CC issues and add a bunch of tests. Patch by Akira Hatanaka with some small modifications by me. llvm-svn: 125292	2011-02-10 18:05:10 +00:00
Che-Liang Chiou	762ff2a943	ptx: add passing parameter to kernel functions llvm-svn: 125279	2011-02-10 12:01:24 +00:00
Chris Lattner	d2c1936c14	implement the first part of PR8882: when lowering an inbounds gep to explicit addressing, we know that none of the intermediate computation overflows. This could use review: it seems that the shifts certainly wouldn't overflow, but could the intermediate adds overflow if there is a negative index? Previously the testcase would instcombine to: define i1 @test(i64 %i) { %p1.idx.mask = and i64 %i, 4611686018427387903 %cmp = icmp eq i64 %p1.idx.mask, 1000 ret i1 %cmp } now we get: define i1 @test(i64 %i) { %cmp = icmp eq i64 %i, 1000 ret i1 %cmp } llvm-svn: 125271	2011-02-10 07:11:16 +00:00
Chris Lattner	6e84f48cd8	Enhance a bunch of transformations in instcombine to start generating exact/nsw/nuw shifts and have instcombine infer them when it can prove that the relevant properties are true for a given shift without them. Also, a variety of refactoring to use the new patternmatch logic thrown in for good luck. I believe that this takes care of a bunch of related code quality issues attached to PR8862. llvm-svn: 125267	2011-02-10 05:36:31 +00:00
Chris Lattner	72ac244f4e	Enhance the "compare with shift" and "compare with div" optimizations to be much more aggressive in the face of exact/nsw/nuw div and shifts. For example, these (which are the same except the first is 'exact' sdiv: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %A = sdiv exact i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } define i1 @sdiv_icmp4(i64 %X) nounwind { %A = sdiv i64 %X, -5 ; X/-5 == 0 --> x == 0 %B = icmp eq i64 %A, 0 ret i1 %B } compile down to: define i1 @sdiv_icmp4_exact(i64 %X) nounwind { %1 = icmp eq i64 %X, 0 ret i1 %1 } define i1 @sdiv_icmp4(i64 %X) nounwind { %X.off = add i64 %X, 4 %1 = icmp ult i64 %X.off, 9 ret i1 %1 } This happens when you do something like: (ptr1-ptr2) == 42 where the pointers are pointers to non-unit types. llvm-svn: 125266	2011-02-10 05:23:05 +00:00
Chris Lattner	0decae4bf7	more cleanups, notably bitcast isn't used for "signed to unsigned type conversions". :) llvm-svn: 125265	2011-02-10 05:17:27 +00:00
Evan Cheng	5a42a6a20f	After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508. llvm-svn: 125259	2011-02-10 02:20:55 +00:00
Jim Grosbach	48e8554772	Do AsmMatcher operand classification per-opcode. When matching operands for a candidate opcode match in the auto-generated AsmMatcher, check each operand against the expected operand match class. Previously, operands were classified independently of the opcode being handled, which led to difficulties when operand match classes were more complicated than simple subclass relationships. llvm-svn: 125245	2011-02-10 00:08:28 +00:00
Chris Lattner	02088f3ab8	Teach instsimplify some tricks about exact/nuw/nsw shifts. improve interfaces to instsimplify to take this info. llvm-svn: 125196	2011-02-09 17:15:04 +00:00
Chris Lattner	e29022d779	merge two tests. llvm-svn: 125195	2011-02-09 17:06:41 +00:00
Chris Lattner	c2750a7e3b	remove a small scattering of basically pointless tests. These are all covered by llvm-test, which is what they were reduced from back in 2003. llvm-svn: 125189	2011-02-09 16:41:31 +00:00
Chris Lattner	8abd838838	remove a broken test, this is matching nounwind on intrinsics, not the old unwind instruction llvm-svn: 125188	2011-02-09 16:40:56 +00:00
Richard Osborne	112cff2533	Add intrinsic for setc instruction on the XCore. llvm-svn: 125186	2011-02-09 13:22:12 +00:00
Nick Lewycky	b162446cda	When removing a function from the function set and adding it to deferred, we could end up removing a different function than we intended because it was functionally equivalent, then end up with a comparison of a function against itself in the next round of comparisons (the one in the function set and the one on the deferred list). To fix this, I introduce a choice in the form of comparison for ComparableFunctions, either normal or "pointer only" used to find exact Function*'s in lookups. Also add some debugging statements. llvm-svn: 125180	2011-02-09 06:32:02 +00:00
NAKAMURA Takumi	0c1790b8d2	test/lit.cfg: Seek sane tools (and bash) in directories and set to $PATH. LitConfig.getBashPath() will not seek in $PATH after LitConfig.getToolsPath() was executed. llvm-svn: 125176	2011-02-09 04:19:21 +00:00
NAKAMURA Takumi	81da9fa2a4	CMake: Add the new option LLVM_LIT_TOOLS_DIR. It can specify "Path to GnuWin32 tools". llvm-svn: 125173	2011-02-09 04:18:58 +00:00
Owen Anderson	899a6d74bf	Revert both r121082 (which broke a bunch of constant pool stuff) and r125074 (which worked around it). This should get us back to the old, correct behavior, though it will make the integrated assembler unhappy for the time being. llvm-svn: 125127	2011-02-08 22:39:40 +00:00
Benjamin Kramer	8ff71a1384	Support for .ifdef / .ifndef in the assembler parser. Patch by Joerg Sonnenberger. llvm-svn: 125120	2011-02-08 22:29:56 +00:00
Andrew Trick	4bf0d6a37a	PostRA antidependence breaker unit test for PR8986. llvm-svn: 125091	2011-02-08 17:42:05 +00:00
Andrew Trick	4c1ebd08ef	PostRA antidependence breaker unit test for rdar://8959122. llvm-svn: 125090	2011-02-08 17:41:12 +00:00
Benjamin Kramer	04249128ab	SimplifyCFG: Track the number of used icmps when turning an icmp chain into a switch. If we used only one icmp, don't turn it into a switch. Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler. llvm-svn: 125056	2011-02-07 22:37:28 +00:00
Bruno Cardoso Lopes	0ce5b0f4a8	Add support for parsing dmb/dsb instructions llvm-svn: 125055	2011-02-07 22:09:15 +00:00
Evan Cheng	56b78e409e	Fix an obvious typo which caused an isel assertion. rdar://8964854. llvm-svn: 125023	2011-02-07 18:50:47 +00:00
Devang Patel	46db608b81	Reduce test case, smaller is better. llvm-svn: 125019	2011-02-07 18:24:18 +00:00
Bob Wilson	65f4a70b82	Add codegen support for using post-increment NEON load/store instructions. The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. llvm-svn: 125014	2011-02-07 17:43:21 +00:00
Chris Lattner	2fd09e3397	implement .ll and .bc support for nsw/nuw on shl and exact on lshr/ashr. Factor some code better. llvm-svn: 125006	2011-02-07 16:40:21 +00:00
Jason W Kim	7342155b4c	Teach ARM/MC/ELF about gcc compatible reloc output to get past odd linkage failures with relocations. The code committed is a first cut at compatibility for emitted relocations in ELF .o. Why do this? Because existing ARM tools like emitting relocs symbols as explicit relocations, not as section-offset relocs. Result is that with these changes, 1) relocs are now substantially identical to what gcc outputs. 2) larger apps (including many spec2k tests) compile, cross-link, and pass. Added reminder fixme to tests for future conversion to .s form. llvm-svn: 124996	2011-02-07 01:11:15 +00:00
Jason W Kim	b0d4492aa1	Rework some .ARM.attribute work for improved gcc compatibility. Unified EmitTextAttribute for both Asm and Obj emission (.cpu only) Added necessary cortex-A8 related attrs for codegen compat tests. llvm-svn: 124995	2011-02-07 00:49:53 +00:00
Chris Lattner	1c1b342a62	teach instsimplify to transform (X / Y) * Y to X when the div is an exact udiv. llvm-svn: 124994	2011-02-06 22:05:31 +00:00
Chris Lattner	8d427ed03c	rename test. llvm-svn: 124993	2011-02-06 21:59:10 +00:00
Chris Lattner	7b6a968f5d	enhance vmcore to know that udiv's can be exact, and add a trivial instcombine xform to exercise this. Nothing forms exact udivs yet though. This is progress on PR8862 llvm-svn: 124992	2011-02-06 21:44:57 +00:00
Anders Carlsson	1eeebf1c22	When loading from a constant, fold inttoptr if the integer type and the resulting pointer type both have the same size. llvm-svn: 124987	2011-02-06 20:11:56 +00:00
NAKAMURA Takumi	07a84f5950	Target/X86: Tweak allocating shadow area (aka home) on Win64. It must be enough for caller to allocate one. llvm-svn: 124949	2011-02-05 15:11:32 +00:00
Bob Wilson	d46874eb5d	Move a test that ended up in the wrong place. llvm-svn: 124933	2011-02-05 04:15:50 +00:00
Devang Patel	930b4b16a1	Merge .debug_loc entries whenever possible to reduce debug_loc size. llvm-svn: 124904	2011-02-04 22:57:18 +00:00
Nick Lewycky	a4f2b5a934	Mark that the return is using EAX so that we don't use it for some other purpose. Fixes PR9080! llvm-svn: 124903	2011-02-04 22:44:08 +00:00
Jason W Kim	056e5aacb7	Teach ARM/MC/ELF about EF_ARM_EABI_VERSION. The magic number is set to 5 to match the current doc. Added FIXME reminder: make it really configurable later. llvm-svn: 124899	2011-02-04 21:41:11 +00:00
Jason W Kim	10c1a81736	Teach ARM/MC/ELF to handle R_ARM_JUMP24 relocation type for conditional jumps. (yes, this is different from R_ARM_CALL) - Adds a new method getARMBranchTargetOpValue() which handles the necessary distinction between the conditional and unconditional br/bl needed for ARM/ELF At least for ARM mode, the needed fixup for conditional versus unconditional br/bl is identical, but the ARM docs and existing ARM tools expect this reloc type... Added a few FIXME's for future naming fixups in ARMInstrInfo.td llvm-svn: 124895	2011-02-04 19:47:15 +00:00
Devang Patel	a586bb8ecd	DebugLoc associated with a machine instruction is used to emit location entries. DebugLoc associated with a DBG_VALUE is used to identify lexical scope of the variable. After register allocation, while inserting DBG_VALUE remember original debug location for the first instruction and reuse it, otherwise dwarf writer may be misled in identifying the variable's scope. llvm-svn: 124845	2011-02-04 01:43:25 +00:00
Benjamin Kramer	75785ec972	SimplifyCFG: Also transform switches that represent a range comparison but are not sorted into sub+icmp. This transforms another 1000 switches in gcc.c. llvm-svn: 124826	2011-02-03 22:51:41 +00:00
Richard Osborne	5c655f451e	Add XCore intrinsics for resource instructions. llvm-svn: 124794	2011-02-03 13:14:25 +00:00
Duncan Sands	fc33df78c1	Improve threading of comparisons over select instructions (spotted by my auto-simplifier). This has a big impact on Ada code, but not much else. Unfortunately the impact is mostly negative! This is due to PR9004 (aka SCCP failing to resolve conditional branch conditions in the destination blocks of the branch), in which simple correlated expressions are not resolved but complicated ones are, so simplifying has a bad effect! llvm-svn: 124788	2011-02-03 09:37:39 +00:00
NAKAMURA Takumi	04bf5d54a6	test/Makefile: "check-all" should update tools/clang/test/Unit/lit.site.cfg, too. Follow up to clang r124777. llvm-svn: 124783	2011-02-03 07:36:02 +00:00
Rafael Espindola	b0a802c8bf	Add -march to fix the bots. llvm-svn: 124774	2011-02-03 04:21:01 +00:00
Rafael Espindola	5bfba89832	Fix PR9127 by reversing the operands even if they have more than one use. Reversing the operands allows us to fold, but doesn't force us to. Also, at this point the DAG is still being optimized, so the check for hasOneUse is not very precise. llvm-svn: 124773	2011-02-03 03:58:05 +00:00
Duncan Sands	7eecb72021	Reenable the transform "(X*Y)/Y->X" when the multiplication is known not to overflow (nsw flag), which was disabled because it breaks 254.gap. I have informed the GAP authors of the mistake in their code, and arranged for the testsuite to use -fwrapv when compiling this benchmark. llvm-svn: 124746	2011-02-02 20:52:00 +00:00
Benjamin Kramer	b739613711	SimplifyCFG: Turn switches into sub+icmp+branch if possible. This makes the job of the later optzn passes easier, allowing the vast amount of icmp transforms to chew on it. We transform 840 switches in gcc.c, leading to a 16k byte shrink of the resulting binary on i386-linux. The testcase from README.txt now compiles into decl %edi cmpl $3, %edi sbbl %eax, %eax andl $1, %eax ret llvm-svn: 124724	2011-02-02 15:56:22 +00:00
Richard Osborne	5ee859cb22	Add support for trampolines on the XCore. llvm-svn: 124722	2011-02-02 14:57:41 +00:00
Dan Gohman	4dc130ea78	Fix reassociate to clear optional flags, such as nsw. llvm-svn: 124712	2011-02-02 02:02:34 +00:00
Evan Cheng	c7ce7e2ac3	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Duncan Sands	06e82c76ee	Have m_One also match constant vectors for which every element is 1. llvm-svn: 124655	2011-02-01 08:39:12 +00:00
Rafael Espindola	e60f9519d8	Correctly merge available_externally and regular definitions when they have different visibilities. llvm-svn: 124650	2011-02-01 05:33:52 +00:00
Evan Cheng	dc27913f2d	Fix test for non-darwin targets. llvm-svn: 124640	2011-02-01 01:16:18 +00:00
Devang Patel	2b1331b4d1	Remove stale test that has never worked, afaik. llvm-svn: 124635	2011-02-01 00:47:16 +00:00
Devang Patel	97c467ee47	Keep track of incoming argument's location while emitting LiveIns. llvm-svn: 124611	2011-01-31 21:38:14 +00:00
Richard Osborne	11cdda2346	Fix bug where ReduceLoadWidth was creating illegal ZEXTLOAD instructions. llvm-svn: 124587	2011-01-31 17:41:44 +00:00
Anders Carlsson	f184e5de9a	Recognize and simplify (A+B) == A -> B == 0 A == (A+B) -> B == 0 llvm-svn: 124567	2011-01-30 22:01:13 +00:00
Duncan Sands	987c8bc759	Commit 124487 broke 254.gap. See if disabling the part that might be triggered by PR9088 fixes things. llvm-svn: 124561	2011-01-30 18:24:20 +00:00
Duncan Sands	ac01c21937	Transform (X/Y)*Y into X if the division is exact. Instcombine already knows how to do this and more, but would only do it if X/Y had only one use. Spotted as the most common missed simplification in SPEC by my auto-simplifier, now that it knows about nuw/nsw/exact flags. This removes a bunch of multiplications from 447.dealII and 483.xalancbmk. It also removes a lot from tramp3d-v4, which results in much more inlining. llvm-svn: 124560	2011-01-30 18:03:50 +00:00
Benjamin Kramer	6b3c3de09a	Teach DAGCombine to fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2)) when c1 equals the amount of bits that are truncated off. This happens all the time when a smul is promoted to a larger type. On x86-64 we now compile "int test(int x) { return x/10; }" into movslq %edi, %rax imulq $1717986919, %rax, %rax movq %rax, %rcx shrq $63, %rcx sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax" addl %ecx, %eax This fires 96 times in gcc.c on x86-64. llvm-svn: 124559	2011-01-30 16:38:43 +00:00
Nick Lewycky	5259b6a6e2	Add the select optimization recently added to instcombine to constant folding. This is the one where one of the branches of the select is another select on the same condition. llvm-svn: 124547	2011-01-29 20:35:06 +00:00
Frits van Bommel	92dc04df67	Move InstCombine's knowledge of fdiv to SimplifyInstruction(). llvm-svn: 124534	2011-01-29 15:26:31 +00:00
Duncan Sands	0587f785bf	Fix typo: should have been testing that X was odd, not V. llvm-svn: 124533	2011-01-29 13:27:00 +00:00
Evan Cheng	20433f6339	Add a test for TCE return duplication. llvm-svn: 124527	2011-01-29 04:53:35 +00:00
Evan Cheng	4af5487b74	Re-apply r124518 with fix. Watch out for invalidated iterator. llvm-svn: 124526	2011-01-29 04:46:23 +00:00
Evan Cheng	1f943b9b13	Revert r124518. It broke Linux self-host. llvm-svn: 124522	2011-01-29 02:43:04 +00:00
Evan Cheng	a1e4cb5f09	Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand. llvm-svn: 124518	2011-01-29 01:29:26 +00:00
Bob Wilson	e7ac2389b2	PR9030: Fix disassembly of ARM "mov pc, lr" instruction. Patch by Jyun-Yan You. llvm-svn: 124492	2011-01-28 17:50:30 +00:00
Duncan Sands	1a18d8df96	My auto-simplifier noticed that ((X/Y)*Y)/Y occurs several times in SPEC benchmarks, and that it can be simplified to X/Y. (In general you can only simplify (Z*Y)/Y to Z if the multiplication did not overflow; if Z has the form "X/Y" then this is the case). This patch implements that transform and moves some Div logic out of instcombine and into InstructionSimplify. Unfortunately instcombine gets in the way somewhat, since it likes to change (X/Y)*Y into X-(X rem Y), so I had to teach instcombine about this too. Finally, thanks to the NSW/NUW flags, sometimes we know directly that "Z*Y" does not overflow, because the flag says so, so I added that logic too. This eliminates a bunch of divisions and subtractions in 447.dealII, and has good effects on some other benchmarks too. It seems to have quite an effect on tramp3d-v4 but it's hard to say if it's good or bad because inlining decisions changed, resulting in massive changes all over. llvm-svn: 124487	2011-01-28 16:51:11 +00:00
Roman Divacky	c6a20d1728	Add support for parsing .float llvm-svn: 124485	2011-01-28 14:20:32 +00:00
Evan Cheng	5b6c72e549	Revert r124462. There are a few big regressions that I need to fix first. llvm-svn: 124478	2011-01-28 07:12:38 +00:00
Nick Lewycky	9bbbb3e6f5	Clean up the tests a little, make sure we match an instruction in the right test. llvm-svn: 124473	2011-01-28 05:13:17 +00:00
Rafael Espindola	d93551f227	Add a triple. llvm-svn: 124471	2011-01-28 03:57:55 +00:00
Nick Lewycky	74dfcccec4	Fold select + select where both selects are on the same condition. llvm-svn: 124469	2011-01-28 03:28:10 +00:00
Rafael Espindola	9bc19ee478	Print the visibility of declarations. llvm-svn: 124468	2011-01-28 03:20:10 +00:00
Nico Weber	66fd0e8119	PR8951: Support for .equiv in integrated assembler, patch by Jörg Sonnenberger! llvm-svn: 124467	2011-01-28 03:04:41 +00:00
Evan Cheng	7031f450b3	- Stop simplifycfg from duplicating "ret" instructions into unconditional branches. PR8575, rdar://5134905, rdar://8911460. - Allow codegen tail duplication to dup small return blocks after register allocation is done. llvm-svn: 124462	2011-01-28 02:19:21 +00:00
Evan Cheng	2042be8132	Fix PLD encoding. llvm-svn: 124458	2011-01-27 23:48:34 +00:00
Roman Divacky	f817c82cf3	Add support for specifying register name in cfi-register/offset/def as well as register number. llvm-svn: 124379	2011-01-27 17:16:37 +00:00
Nick Lewycky	864a35740a	Fix surprising missed optimization in mergefunc where we forgot to consider that relationships like "i8* null" is equivalent to "i32* null". llvm-svn: 124368	2011-01-27 08:38:19 +00:00
Eric Christopher	3906b33289	Add a testcase for my last checkin. llvm-svn: 124358	2011-01-27 06:01:17 +00:00
Bruno Cardoso Lopes	228d126d6f	Add encoding testcases for ARM vcvtr variations llvm-svn: 124289	2011-01-26 13:53:38 +00:00
Bruno Cardoso Lopes	2d6bd03b18	fix the encoding and add testcases for ARM nop, yield, wfe and wfi instructions llvm-svn: 124288	2011-01-26 13:28:14 +00:00
Duncan Sands	e1912ca7e0	Fix PR9039, a use-after-free in reassociate. The issue was that the operand being factorized (and erased) could occur several times in Ops, resulting in freed memory being used when the next occurrence in Ops was analyzed. llvm-svn: 124287	2011-01-26 10:08:38 +00:00
NAKAMURA Takumi	8ace7260cc	Target/X86: Tweak win64's tailcall. llvm-svn: 124272	2011-01-26 02:04:09 +00:00
NAKAMURA Takumi	066378440a	Fix whitespace. llvm-svn: 124270	2011-01-26 02:03:37 +00:00
Rafael Espindola	29e8317caa	Move unnamed_addr after the function arguments on Sabre's request. llvm-svn: 124209	2011-01-25 19:09:56 +00:00
Devang Patel	fce915414e	Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinsic. llvm-svn: 124203	2011-01-25 18:09:58 +00:00
Duncan Sands	017a3d76f7	In which I discover that zero+zero is zero, d'oh! llvm-svn: 124188	2011-01-25 15:14:15 +00:00
Duncan Sands	76eef3df7e	Turn off this test - the corresponding instsimplify logic has been disabled. llvm-svn: 124185	2011-01-25 12:31:43 +00:00
Duncan Sands	92b081bd42	According to my auto-simplifier the most common missed simplifications in optimized code are: (non-negative number)+(power-of-two) != 0 -> true and (x | 1) != 0 -> true Instcombine knows about the second one of course, but only does it if X|1 has only one use. These fire thousands of times in the testsuite. llvm-svn: 124183	2011-01-25 09:38:29 +00:00
Nick Lewycky	b20b284b35	Teach mergefunc how to emit aliases safely again -- but keep it turned off for now. It's controlled by the HasGlobalAliases variable which is not attached to any flag yet. llvm-svn: 124182	2011-01-25 08:56:50 +00:00
Evan Cheng	8e47a6e196	Don't merge restore with tail call instruction. llvm-svn: 124167	2011-01-25 01:28:33 +00:00
Devang Patel	431a9b9c2f	Speculatively revert r124138. llvm-svn: 124142	2011-01-24 20:04:37 +00:00
Rafael Espindola	f290466899	Jörg Sonnenberger noticed that we were missing this test. llvm-svn: 124139	2011-01-24 19:40:38 +00:00
Devang Patel	5ccc4e884c	Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinsic. llvm-svn: 124138	2011-01-24 19:24:37 +00:00
Duncan Sands	64d4cb968a	Testcase for dragonegg commit 124128. llvm-svn: 124129	2011-01-24 18:04:33 +00:00
Rafael Espindola	ead58a5259	Handle strings in section names the same way as gas: * If the name is a single string, we remove the quotes * If the name starts without a quote, we include any quotes in the name llvm-svn: 124127	2011-01-24 18:02:54 +00:00
Dan Gohman	6f83adb763	Add another rdar number. llvm-svn: 124125	2011-01-24 17:54:01 +00:00
Chris Lattner	9ba0a83f2b	fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey! llvm-svn: 124102	2011-01-24 03:42:46 +00:00
Chris Lattner	8f4ed7d057	merge all the "crash tests" into crash.ll llvm-svn: 124101	2011-01-24 03:37:34 +00:00
Chris Lattner	f33b65080c	fix PR9017, a bug where we'd assert when promoting in unreachable code. llvm-svn: 124100	2011-01-24 03:29:07 +00:00
Chris Lattner	077cdfcadb	fix PR9015, a crash linking recursive metadata. llvm-svn: 124099	2011-01-24 03:18:24 +00:00
Chris Lattner	3609e5afb4	enhance SRoA to promote allocas that are used by PHI nodes. This often occurs because instcombine sinks loads and inserts phis. This kicks in on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in spec2006 in some app that uses std::deque. This resolves the last of rdar://7339113. llvm-svn: 124090	2011-01-24 01:07:11 +00:00
Chris Lattner	50288f8e54	Enhance SRoA to promote allocas that are used by selects in some common cases. This triggers a surprising number of times in SPEC2K6 because min/max idioms end up doing this. For example, code from the STL ends up looking like this to SRoA: %202 = load i64* %__old_size, align 8, !tbaa !3 %203 = load i64* %__old_size, align 8, !tbaa !3 %204 = load i64* %__n, align 8, !tbaa !3 %205 = icmp ult i64 %203, %204 %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size %206 = load i64* %storemerge.i, align 8, !tbaa !3 We can now promote both the __n and the __old_size allocas. This addresses another chunk of rdar://7339113, poor codegen on stringswitch. llvm-svn: 124088	2011-01-23 22:04:55 +00:00
Nick Lewycky	13a2b8281f	Simplify some code with no functionality change. Make the test a lot more robust against smarter optimizations, using the power of FileCheck. llvm-svn: 124081	2011-01-23 20:06:05 +00:00
Rafael Espindola	547873da60	Add support for the --noexecstack option. llvm-svn: 124077	2011-01-23 17:55:27 +00:00
Rafael Espindola	14333be1af	Add support for lowercase variants. llvm-svn: 124071	2011-01-23 16:11:25 +00:00
Chris Lattner	2ef605b499	Enhance SRoA to be more aggressive about scalarization of aggregate allocas that have PHI or select uses of their element pointers. This can often happen when instcombine sinks two loads into a successor, inserting a phi or select. With this patch, we can scalarize the alloca, but the pinned elements are not yet promoted. This is still a win for large aggregates where only one element is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen on stringswitch). llvm-svn: 124070	2011-01-23 08:27:54 +00:00
Chris Lattner	ba66871643	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Nick Lewycky	2503c9f9c8	Use value ranges to fold ext(trunc) in SCEV when possible. llvm-svn: 124062	2011-01-23 06:20:19 +00:00
Rafael Espindola	aefd549139	Delay the creation of eh_frame so that the user can change the defaults. Add support for SHT_X86_64_UNWIND. llvm-svn: 124059	2011-01-23 05:43:40 +00:00
Venkatraman Govindaraju	66369057ae	Pass sret arguments through the stack instead of through registers in Sparc backend. It makes the code generated more compliant with the sparc32 ABI. llvm-svn: 124030	2011-01-22 13:05:16 +00:00
Venkatraman Govindaraju	3c37c9914f	Added ICC, FCC as uses of movcc instruction to generate correct code when -mattr=v9 is used. llvm-svn: 124027	2011-01-22 11:36:24 +00:00
Dan Gohman	a9724ada65	Actually check memcpy lengths, instead of just commenting about how they should be checked. llvm-svn: 123999	2011-01-21 22:07:57 +00:00
Venkatraman Govindaraju	c20beed917	Sparc backend: Rename FLUSH to FLUSHW. Output "ta 3" instead of a "flushw" instruction if v8 instruction set is used. llvm-svn: 123997	2011-01-21 22:00:00 +00:00
Owen Anderson	6e3425c7e0	Just because we have determined that an (fcmp | fcmp) is true for A < B, A == B, and A > B, does not mean we can fold it to true. We still need to check for A ? B (A unordered B). llvm-svn: 123993	2011-01-21 19:39:42 +00:00
Evan Cheng	0dfe28a9b5	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Bruno Cardoso Lopes	2f96371a7a	Fix the encoding of QADD/SUB, QDADD/SUB. While qadd16, qadd8 use "rd, rn, rm", qadd and qdadd uses "rd, rm, rn", the same applies to the 'sub' variants. This is described in ARM manuals and matches the encoding used by the gnu assembler. llvm-svn: 123975	2011-01-21 14:07:40 +00:00
Venkatraman Govindaraju	6a083c355b	Implement support for byval arguments in Sparc backend. llvm-svn: 123974	2011-01-21 14:00:01 +00:00
Michael J. Spencer	08bf1dbce4	Revert "Object: Renable the tests now that none of the build bots complain about aliasing." This reverts commit 281f3901b7b0869929caf8946c1ad1228bc38922. llvm-svn: 123972	2011-01-21 06:27:04 +00:00
Andrew Trick	e0bccb5f87	Enable support for precise scheduling of the instruction selection DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little effect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). llvm-svn: 123971	2011-01-21 06:19:05 +00:00
Andrew Trick	7155e98904	Convert -enable-sched-cycles and -enable-sched-hazard to -disable flags. They are still not enabled in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969	2011-01-21 05:51:33 +00:00
Chris Lattner	f225708ef1	fix PR9013, an infinite loop in instcombine. llvm-svn: 123968	2011-01-21 05:29:50 +00:00
Michael J. Spencer	3089c44b26	Object: Renable the tests now that none of the build bots complain about aliasing. llvm-svn: 123964	2011-01-21 05:07:13 +00:00
Nick Lewycky	c4300debc2	Don't try to pull vector bitcasts that change the number of elements through a select. A vector select is pairwise on each element so we'd need a new condition with the right number of elements to select on. Fixes PR8994. llvm-svn: 123963	2011-01-21 02:30:43 +00:00
Nick Lewycky	50f86f3414	Add a constant folding of casts from zero to zero. Fixes PR9011! While here, I'd like to complain about how vector is not an aggregate type according to llvm::Type::isAggregateType(), but they're listed under aggregate types in the LangRef and zero vectors are stored as ConstantAggregateZero. llvm-svn: 123956	2011-01-21 01:12:09 +00:00
Evan Cheng	52fe62c996	Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative value, the "add pc" must be CSE'ed at the same time. We could follow the same approach as T2 by adding pseudo instructions that combine the ldr + "add pc". But the better approach is to use movw + movt (which I will enable soon), so I'll leave this as a TODO. llvm-svn: 123949	2011-01-20 23:55:07 +00:00
Tobias Grosser	ea8985cc25	Implement requiredTransitive The PassManager did not implement the transitivity of requiredTransitive. This was unnoticed since 2006. llvm-svn: 123942	2011-01-20 21:03:22 +00:00
Bruno Cardoso Lopes	dc3853d7b5	Add testcases for clz encoding llvm-svn: 123937	2011-01-20 19:27:16 +00:00
Bruno Cardoso Lopes	6aeb2e320f	Fix the encoding and parsing of clrex instruction llvm-svn: 123936	2011-01-20 19:18:32 +00:00
Bruno Cardoso Lopes	5f06c0aa3b	Add cdp/cdp2 instructions for thumb/thumb2 llvm-svn: 123929	2011-01-20 18:32:09 +00:00
Devang Patel	e3e6201a2f	Disable objdump-trivial-object.test. It is broken on powerpc-darwin9. llvm-svn: 123928	2011-01-20 18:08:44 +00:00
Bruno Cardoso Lopes	3584c02d83	- Use a more appropriate name for Owen's ARM Parser isMCR hack since the same operands can be present in cdp/cdp2 instructions. Also increase the hack with cdp/cdp2 instructions. - Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2 yet) and add testcases for them. llvm-svn: 123927	2011-01-20 18:06:58 +00:00
Bruno Cardoso Lopes	75712e8a7a	Add mcr2 and mrc2 support to thumb2 targets llvm-svn: 123919	2011-01-20 16:58:48 +00:00
Bruno Cardoso Lopes	f377d1721e	Add mcr* and mr*c support to thumb targets llvm-svn: 123917	2011-01-20 16:35:57 +00:00
Michael J. Spencer	d74f931baa	Disable this test until I can figure out why it's broken. Not xfailed because it uses 100% CPU and times out, so it's annoying to run it. llvm-svn: 123915	2011-01-20 16:24:07 +00:00
Kalle Raiskila	070fb5e54d	Allow sign-extending of i8 and i16 to i128 on SPU. llvm-svn: 123912	2011-01-20 15:49:06 +00:00
Duncan Sands	1faa8712c9	At -O123 the early-cse pass is run before instcombine has run. According to my auto-simplifier the transform most missed by early-cse is (zext X) != 0 -> X != 0. This patch adds this transform and some related logic to InstructionSimplify and removes some of the logic from instcombine (unfortunately not all because there are several situations in which instcombine can improve things by making new instructions, whereas instsimplify is not allowed to do this). At -O2 this often results in more than 15% more simplifications by early-cse, and results in hundreds of lines of bitcode being eliminated from the testsuite. I did see some small negative effects in the testsuite, for example a few additional instructions in three programs. One program, 483.xalancbmk, got an additional 35 instructions, which seems to be due to a function getting an additional instruction and then being inlined all over the place. llvm-svn: 123911	2011-01-20 13:21:55 +00:00
Eric Christopher	f7579ff174	Expand invalid return values for umulo and smulo. Handle these similarly to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. llvm-svn: 123908	2011-01-20 08:54:28 +00:00
Evan Cheng	5c5e42a878	Add test. llvm-svn: 123906	2011-01-20 08:38:21 +00:00
Evan Cheng	6dc21c7358	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allows machine LICM to hoist the set of instructions out of the loop and makes it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Michael J. Spencer	c3f075e648	Object: Add some tests! llvm-svn: 123899	2011-01-20 06:39:15 +00:00
Venkatraman Govindaraju	5280b2876f	Sparc backend: Implement a delay slot filler that attempts to fill delay slots with useful instructions. llvm-svn: 123884	2011-01-20 05:08:26 +00:00
Eric Christopher	1b0e5debb4	If we can, lower the multiply part of a umulo/smulo call to a libcall with an invalid type then split the result and perform the overflow check normally. Fixes the 32-bit parts of rdar://8622122 and rdar://8774702. llvm-svn: 123864	2011-01-20 00:29:24 +00:00
Devang Patel	729c5e59af	Fix debug info for merged global. llvm-svn: 123862	2011-01-20 00:02:16 +00:00
Nick Lewycky	51c13384f5	Similarly, analyze truncate through multiply. llvm-svn: 123842	2011-01-19 18:56:00 +00:00
Nick Lewycky	9867e58096	Add a missed SCEV fold that is required to continue analyzing the IR produced by indvars through the scev expander. trunc(add x, y) --> add(trunc x, y). Currently SCEV largely folds the other way which is probably wrong, but preserved to minimize churn. Instcombine doesn't do this fold either, demonstrating a missed optz'n opportunity on code doing add+trunc+add. llvm-svn: 123838	2011-01-19 16:59:46 +00:00
Bruno Cardoso Lopes	0f7a30b1cb	Fix the encoding of mrrc and mcrr family of instructions. Also add testcases for mcr and mrc llvm-svn: 123837	2011-01-19 16:56:52 +00:00
Rafael Espindola	ce499efe1d	Add unnamed_addr when we can show that address of a global is not used. llvm-svn: 123834	2011-01-19 16:32:21 +00:00
Nick Lewycky	5a538b62ca	Add a missing SCEV simplification sext(zext x) --> zext x. llvm-svn: 123832	2011-01-19 15:56:12 +00:00
Owen Anderson	ed4acd59cb	When matching asm operands, always try to match the most restricted type first. Unfortunately, while this is the "right" thing to do, it breaks some ARM asm parsing tests because MemMode5 and ThumbMemModeReg are ambiguous. This is tricky to resolve since neither is a subset of the other. XFAIL the test for now. The old way was broken in other ways, just ways we didn't happen to be testing, and our ARM asm parsing is going to require significant revisiting at a later point anyways. llvm-svn: 123786	2011-01-18 23:01:21 +00:00
Bruno Cardoso Lopes	e0f8fee637	Create two new generic classes to represent the following VMRS/VMSR variations: vmrs reg, fpexc vmrs reg, fpsid vmsr fpexc, reg vmsr fpsid, reg llvm-svn: 123783	2011-01-18 21:58:20 +00:00
Bruno Cardoso Lopes	82c6fe3dfe	Fix MRS encoding for arm and thumb. llvm-svn: 123778	2011-01-18 21:31:35 +00:00
Bruno Cardoso Lopes	6e4c5af01e	Fix the encoding of t2ISB by using the right class and also parse it correctly llvm-svn: 123776	2011-01-18 21:17:09 +00:00
Dan Gohman	df668227fb	Teach BasicAA to return PartialAlias in cases where both pointers are pointing to the same object, one pointer is accessing the entire object, and the other access has a non-zero size. This prevents TBAA from kicking in and saying NoAlias in such cases. llvm-svn: 123775	2011-01-18 21:16:06 +00:00
Bruno Cardoso Lopes	c1e21b06b9	Follow the current hack set and enable the correct parsing of bkpt while in thumb mode. llvm-svn: 123772	2011-01-18 20:55:11 +00:00
Chris Lattner	4832a9d32c	fix rdar://8878965, a regression I introduced with the recent llvm.objectsize changes. llvm-svn: 123771	2011-01-18 20:53:04 +00:00
Bruno Cardoso Lopes	94247155c4	Add support for parsing and encoding ARM's official syntax for the BFI instruction llvm-svn: 123770	2011-01-18 20:45:56 +00:00
Bruno Cardoso Lopes	6c5db0236a	Add support for mips32 madd and msub instructions. Patch by Akira Hatanaka llvm-svn: 123760	2011-01-18 19:29:17 +00:00
Duncan Sands	732cb58b61	For completeness, generalize the (X + Y) - Y -> X transform and add X - (X + 1) -> -1. These were not recommended by my auto-simplifier since they don't fire often enough. However they do fire from time to time, for example they remove one subtraction from the final bitcode for 483.xalancbmk. llvm-svn: 123755	2011-01-18 11:50:19 +00:00
Duncan Sands	2abe6f500f	Simplify (X<<1)-X into X. According to my auto-simplifier this is the most common missed simplification in fully optimized code. It occurs sporadically in the testsuite, and many times in 403.gcc: the final bitcode has 131 fewer subtractions after this change. The reason that the multiplies are not eliminated is the same reason that instcombine did not catch this: they are used by other instructions (instcombine catches this with a more general transform which in general is only profitable if the operands have only one use). llvm-svn: 123754	2011-01-18 09:24:58 +00:00
Daniel Dunbar	ba39b2fdc1	McARM: Start marking T2 address operands as such, for the benefit of the parser. llvm-svn: 123722	2011-01-18 03:06:03 +00:00
Benjamin Kramer	869dc645f1	Fix an off-by-one error in ctpop combining. llvm-svn: 123664	2011-01-17 18:00:28 +00:00
Devang Patel	ec7c842bfa	Update tests to accommodate unnamed_addr introduction. llvm-svn: 123663	2011-01-17 17:54:17 +00:00
Benjamin Kramer	e9488ed8eb	Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0. This shaves off 4 popcounts from the hacked 186.crafty source. This is enabled even when a native popcount instruction is available. The combined code is one operation longer but it should be faster nevertheless. llvm-svn: 123621	2011-01-17 12:04:57 +00:00
Kalle Raiskila	8eaf0e83d5	Don't crash SPU BE with memory accesses with big alignment. llvm-svn: 123620	2011-01-17 11:59:20 +00:00
Evan Cheng	53ec6fc591	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Nick Lewycky	8f0b243661	Test for lazy value info's ability to prove the absence of NULLs in pointers. llvm-svn: 123601	2011-01-16 21:57:20 +00:00
Michael J. Spencer	3ea4ed2e6b	Make everyone happy this time. llvm-svn: 123599	2011-01-16 21:34:34 +00:00
Anders Carlsson	c9781e5764	Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead. llvm-svn: 123596	2011-01-16 21:25:33 +00:00
Michael J. Spencer	b0510b04d5	Try and fix this test. For some reason llvm-ar thinks that the file exists when it shouldn't, but I have no way to verify that it doesn't actually exist on the buildbot. llvm-svn: 123594	2011-01-16 20:52:58 +00:00
Rafael Espindola	9afb7af08a	Update tests. llvm-svn: 123591	2011-01-16 18:02:57 +00:00
Rafael Espindola	41852873f7	Don't merge two constants if we care about the address of both. This fixes the original testcase in PR8927. It also causes a clang binary built with a patched clang to increase in size by 0.21%. We can probably get some of the size back by writing a pass that detects that a global never has its pointer compared and adds unnamed_addr to it (maybe extend global opt). It is also possible that there are some other cases clang could add unnamed_addr to. I will investigate extending globalopt next. llvm-svn: 123584	2011-01-16 17:05:09 +00:00
Owen Anderson	b86db71ad0	Reduce and merge testcases. llvm-svn: 123579	2011-01-16 09:13:31 +00:00
Chris Lattner	dde85de90f	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Chris Lattner	a4454efc85	fix PR8932, a case where arg promotion could infinitely promote. llvm-svn: 123574	2011-01-16 08:09:24 +00:00
Chris Lattner	29f339f87c	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	2067fb2a93	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Evan Cheng	144b435a15	Spill R4 if it's going to be used to restore SP from FP. llvm-svn: 123567	2011-01-16 05:14:33 +00:00
Owen Anderson	6e0fa67f91	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563	2011-01-16 04:33:33 +00:00
Chris Lattner	aba06ce448	fix PR8983, a broken assertion. llvm-svn: 123562	2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju	fe346f6cba	Implement AnalyzeBranch in Sparc Backend. llvm-svn: 123561	2011-01-16 03:15:11 +00:00
Chris Lattner	24ea7f696e	fix PR8981, a crash trying to form a conditional inc with a floating point compare. llvm-svn: 123560	2011-01-16 02:56:53 +00:00
Chris Lattner	c4d1d86d3e	reapply my fix for PR8961 with a tweak to properly handle multi-instruction sequences like calls. Many thanks to Jakob for finding a testcase. llvm-svn: 123559	2011-01-16 02:27:38 +00:00
Michael J. Spencer	76f1706025	Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1." llvm-svn: 123557	2011-01-16 01:43:22 +00:00
Chris Lattner	44bcf63348	one of michael's recent patches broke this, temporarily disable it so the bots go green llvm-svn: 123555	2011-01-16 01:04:49 +00:00
Chris Lattner	75599bb566	remove the partial specialization pass. It is unmaintained and has bugs. llvm-svn: 123554	2011-01-16 00:27:10 +00:00
Nick Lewycky	1d57e867a4	Make constmerge a two-pass algorithm so that it won't miss merging opportunities. Fixes PR8978. llvm-svn: 123541	2011-01-15 18:14:21 +00:00
Rafael Espindola	3b43f22391	Allow unnamed_addr on declarations. llvm-svn: 123529	2011-01-15 08:15:00 +00:00
Chris Lattner	55c2150f36	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527	2011-01-15 07:51:19 +00:00
Chris Lattner	68a47147ba	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch of dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526	2011-01-15 07:36:13 +00:00
Chris Lattner	d4eaf6eba8	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524	2011-01-15 07:25:29 +00:00
Chris Lattner	74ed5d30ca	implement an instcombine xform that canonicalizes casts outside of and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile: union xy { struct x { _Bool b[15]; } x; __attribute__((packed)) struct y { __attribute__((packed)) unsigned long b0to7; __attribute__((packed)) unsigned int b8to11; __attribute__((packed)) unsigned short b12to13; __attribute__((packed)) unsigned char b14; } y; }; struct x foo(union xy *xy) { return xy->x; } into: _foo: ## @foo movq (%rdi), %rax movabsq $1095216660480, %rcx ## imm = 0xFF00000000 andq %rax, %rcx movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000 andq %rax, %rdx movzbl %al, %esi orq %rdx, %rsi movq %rax, %rdx andq $65280, %rdx ## imm = 0xFF00 orq %rsi, %rdx movq %rax, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rdx, %rsi movl %eax, %edx andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rdx orq %rcx, %rdx movabsq $280375465082880, %rcx ## imm = 0xFF0000000000 movq %rax, %rsi andq %rcx, %rsi orq %rdx, %rsi movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000 andq %r8, %rax orq %rsi, %rax movzwl 12(%rdi), %edx movzbl 14(%rdi), %esi shlq $16, %rsi orl %edx, %esi movq %rsi, %r9 shlq $32, %r9 movl 8(%rdi), %edx orq %r9, %rdx andq %rdx, %rcx movzbl %sil, %esi shlq $32, %rsi orq %rcx, %rsi movl %edx, %ecx andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rcx movq %rdx, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rcx, %rsi movq %rdx, %rcx andq $65280, %rcx ## imm = 0xFF00 orq %rsi, %rcx movzbl %dl, %esi orq %rcx, %rsi andq %r8, %rdx orq %rsi, %rdx ret We now compile this into: _foo: ## @foo ## BB#0: ## %entry movzwl 12(%rdi), %eax movzbl 14(%rdi), %ecx shlq $16, %rcx orl %eax, %ecx shlq $32, %rcx movl 8(%rdi), %edx orq %rcx, %rdx movq (%rdi), %rax ret A small improvement :-) llvm-svn: 123520	2011-01-15 06:32:33 +00:00
Rafael Espindola	70a277119f	Update llvm-gcc's tests. llvm-svn: 123447	2011-01-14 17:01:20 +00:00
Duncan Sands	dc51b0ee48	Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite, for example it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc. llvm-svn: 123442	2011-01-14 15:26:10 +00:00
Duncan Sands	4757061c47	Factorize common code out of the InstructionSimplify shift logic. Add in threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. llvm-svn: 123441	2011-01-14 14:44:12 +00:00
Duncan Sands	01be7e406d	Rename this test. llvm-svn: 123440	2011-01-14 14:16:33 +00:00
Chris Lattner	de9ec03027	relax testcase a bit. llvm-svn: 123433	2011-01-14 07:46:33 +00:00
Chris Lattner	eba719204c	revert my fastisel patch again which apparently still gives the llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431	2011-01-14 06:14:33 +00:00
Chris Lattner	ee950eeb24	reapply r123414 now that the botz are calmed down and the fix is already in. llvm-svn: 123427	2011-01-14 04:24:28 +00:00
Evan Cheng	0cdd5547f1	Completed :lower16: / :upper16: support for movw / movt pairs on Darwin. - Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first. - Added support for Thumb2 :lower16: and :upper16: fix up. - Added :upper16: and :lower16: relocation support to mach-o object writer. llvm-svn: 123424	2011-01-14 02:38:49 +00:00
Chris Lattner	349735530b	r123414 broke llvm-gcc bootstrap apparently, revert llvm-svn: 123422	2011-01-14 02:07:32 +00:00
Duncan Sands	44c273d907	Move some shift transforms out of instcombine and into InstructionSimplify. While there, I noticed that the transform "undef >>a X -> undef" was wrong. For example if X is 2 then the top two bits must be equal, so the result can not be anything. I fixed this in the constant folder as well. Also, I made the transform for "X << undef" stronger: it now folds to undef always, even though X might be zero. This is in accordance with the LangRef, but I must admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef" following the LangRef and the constant folder, likewise fairly aggressive. llvm-svn: 123417	2011-01-14 00:37:45 +00:00
Chris Lattner	5baec05809	fix PR8961 - a fast isel miscompilation where we'd insert a new instruction after sext's generated for addressing that got folded. Previously we compiled test5 into: _test5: ## @test5 ## BB#0: movq -8(%rsp), %rax ## 8-byte Reload movq (%rdi,%rax), %rdi addq %rdx, %rdi movslq %esi, %rax movq %rax, -8(%rsp) ## 8-byte Spill movq %rdi, %rax ret which is insane and wrong. Now we produce: _test5: ## @test5 ## BB#0: movslq %esi, %rax movq (%rdi,%rax), %rax addq %rdx, %rax ret llvm-svn: 123414	2011-01-14 00:01:01 +00:00
Owen Anderson	4f5dac3541	As far as I can tell, unified syntax uses c0-c15 instead of cr0-cr15 for mcr and friends. llvm-svn: 123407	2011-01-13 22:38:16 +00:00
Bob Wilson	3b0197489e	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	9f8d730f9b	Make SROA more aggressive with allocas containing padding. SROA only splits up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Duncan Sands	36b007d63b	The most common simplification missed by instsimplify in unoptimized bitcode is "X != 0 -> X" when X is a boolean. This occurs a lot because of the way llvm-gcc converts gcc's conditional expressions. Add this, and a few other similar transforms for completeness. llvm-svn: 123372	2011-01-13 08:56:29 +00:00
Evan Cheng	cc474b4864	Model :upper16: and :lower16: as ARM specific MCTargetExpr. This is a step in the right direction. It eliminated some hacks and will unblock codegen work. But it's far from being done. It doesn't reject illegal expressions, e.g. (FOO - :lower16:BAR). It also doesn't work in Thumb2 mode at all. llvm-svn: 123369	2011-01-13 07:58:56 +00:00
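
Several of the commits listed above (123442, 123754, 123755, 123621) describe integer folds such as X-(X-Y) -> Y, (X<<1)-X -> X, (X + Y) - Y -> X, X - (X + 1) -> -1, and (ctpop x) u< 2 -> (x & x-1) == 0. The following is a minimal sketch, not LLVM code, that spot-checks those identities under 32-bit wrapping arithmetic. The 32-bit width, the helper names, and the sample values are assumptions made for illustration only.

```python
import random

# Assumption: model an i32 with wrapping (modulo 2**32) arithmetic.
MASK = (1 << 32) - 1


def add(a, b):
    """Wrapping 32-bit addition."""
    return (a + b) & MASK


def sub(a, b):
    """Wrapping 32-bit subtraction."""
    return (a - b) & MASK


def shl(a, n):
    """32-bit left shift, discarding bits shifted out of the top."""
    return (a << n) & MASK


def popcount(x):
    """Number of set bits, standing in for ctpop."""
    return bin(x).count("1")


def check_identities(x, y):
    """Return True if every fold from the log holds for this (x, y) pair."""
    return all([
        sub(x, sub(x, y)) == y,                        # X-(X-Y) -> Y
        sub(shl(x, 1), x) == x,                        # (X<<1)-X -> X
        sub(add(x, y), y) == x,                        # (X+Y)-Y -> X
        sub(x, add(x, 1)) == MASK,                     # X-(X+1) -> -1
        (popcount(x) < 2) == ((x & sub(x, 1)) == 0),   # ctpop x u< 2
    ])


# Edge cases plus a few reproducible random values (hypothetical sample set).
rng = random.Random(0)
samples = [0, 1, 2, 3, 0x7FFFFFFF, 0x80000000, MASK]
samples += [rng.randrange(1 << 32) for _ in range(50)]

all_hold = all(check_identities(x, y) for x in samples for y in samples)
```

A spot check like this only samples the input space; it does not prove the folds, and it says nothing about the use-count or nsw/nuw considerations the commit messages mention.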

12430 Commits