llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Mon P Wang	5b74ab1f1e	Added some basic test cases for r61209 llvm-svn: 61210	2008-12-18 20:05:58 +00:00
Nick Lewycky	ab50d88e6a	Make all the vector elements positive in an srem of constant vector. llvm-svn: 61195	2008-12-18 06:31:11 +00:00
Bill Wendling	61b2c0c046	XFAIL on Linux. llvm-svn: 61176	2008-12-18 00:35:21 +00:00
Bill Wendling	0aa119d1f9	Do not XFAIL. llvm-svn: 61174	2008-12-18 00:27:15 +00:00
Devang Patel	fb89dfcc3f	XFAIL for now. llvm-svn: 61167	2008-12-17 22:54:54 +00:00
Devang Patel	c500dff638	Xfail these tests for now. llvm-svn: 61166	2008-12-17 22:53:09 +00:00
Chris Lattner	196c166a06	Enhance heap sra to be substantially more aggressive w.r.t PHI nodes. This allows it to do fairly general phi insertion if a load from a pointer global wants to be SRAd but the load is used by (recursive) phi nodes. This fixes a pessimization on ppc introduced by Load PRE. llvm-svn: 61123	2008-12-17 05:28:49 +00:00
Eli Friedman	4aae828bf8	Fix for PR3225: disable a broken optimization in DAGTypeLegalizer::ExpandShiftWithKnownAmountBit. In terms of restoring the optimization, the best fix here isn't obvious... any ideas? llvm-svn: 61119	2008-12-17 03:35:17 +00:00
Dale Johannesen	e348900657	A new dag combine; several permutations of this are there under ADD, this one was missing. llvm-svn: 61107	2008-12-16 22:13:49 +00:00
Chris Lattner	c4cc4a328f	Fix another crash found by inspection. If we have a PHI node merging the load multiple times, make sure to check the uses of the PHI to ensure they are transformable. llvm-svn: 61102	2008-12-16 21:24:51 +00:00
Chris Lattner	8b1f2f76d7	fix a crash found by inspection. llvm-svn: 61101	2008-12-16 21:04:51 +00:00
Eli Friedman	de614f9842	Add a helper to remove a branch and DCE the condition, and use it consistently for deleting branches. In addition to being slightly more readable, this makes SimplifyCFG a bit better about cleaning up after itself when it makes conditions unused. llvm-svn: 61100	2008-12-16 20:54:32 +00:00
Bill Wendling	905350a341	Remove empty test. llvm-svn: 61095	2008-12-16 19:07:17 +00:00
Bill Wendling	f807a68f2e	Temporarily revert r61019, r61030, and r61040. These were breaking LLVM Release builds. llvm-svn: 61094	2008-12-16 19:06:48 +00:00
Evan Cheng	96d87db03b	We have decided not to support inline asm where an output operand has a matching input operand of an incompatible type (i.e. either one is a floating point and the other is an integer or the sizes of the types differ). SelectionDAGBuild will catch these and exit with an error. llvm-svn: 61092	2008-12-16 18:21:39 +00:00
Torok Edwin	453ef5b57c	add testcase for -print-dbginfo llvm-svn: 61086	2008-12-16 10:10:23 +00:00
Nick Lewycky	1b0fc83809	Generalize support for analyzing loops to include SLE/SGE loop exit conditions and support for non-unit strides with signed exit conditions. llvm-svn: 61082	2008-12-16 08:30:01 +00:00
Chris Lattner	b3becc5776	fix PR3217: fully cached queries need to be verified against the visited set before they are used. If used, their blocks need to be added to the visited set so that subsequent queries don't use conflicting pointer values in the cache result blocks. llvm-svn: 61080	2008-12-16 07:10:09 +00:00
Dan Gohman	10eb3ccaeb	Enable anti-dependence breaking by default when post-RA scheduling is enabled. llvm-svn: 61078	2008-12-16 06:21:45 +00:00
Dan Gohman	40a40dd7c1	Fix some register-alias-related bugs in the post-RA scheduler liveness computation code. Also, avoid adding output-dependency edges when both defs are dead, which frequently happens with EFLAGS defs. Compute Depth and Height lazily, and always in terms of edge latency values. For the schedulers that don't care about latency, edge latencies are set to 1. Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array. These are all subsumed by the Depth and Height fields. llvm-svn: 61073	2008-12-16 03:25:46 +00:00
Chris Lattner	3ac8ed076a	add testcase for r61051 llvm-svn: 61052	2008-12-15 21:46:23 +00:00
Mon P Wang	bb3c2994f0	Added support for splitting and scalarizing vector shifts. llvm-svn: 61050	2008-12-15 21:44:00 +00:00
Chris Lattner	dd4c8f09fa	add a basic test for heap-sra llvm-svn: 61041	2008-12-15 19:42:05 +00:00
Chris Lattner	0e79aa6595	Teach basicaa to use the nocapture attribute when possible. When the intrinsics are properly marked nocapture, the fixme should be addressed. llvm-svn: 61040	2008-12-15 18:59:22 +00:00
Chris Lattner	8119a1f70d	Add a testcase for GCC PR 23455, which lpre handles now. Add some comments about why we're not getting other cases. llvm-svn: 61032	2008-12-15 07:49:24 +00:00
Mon P Wang	2f96113348	Added support to LegalizeType for expanding the operands of scalar to vector and insert vector element. Modified extract vector element to extend the result to match the expected promoted type. llvm-svn: 61029	2008-12-15 06:57:02 +00:00
Chris Lattner	30c1871282	gvn now hoists this load out of the hot non-call path. llvm-svn: 61028	2008-12-15 06:34:48 +00:00
Chris Lattner	ea2933ff07	Adjust testcase to make it more stable across visitation order changes, unbreaking it after r61024. llvm-svn: 61025	2008-12-15 04:42:00 +00:00
Chris Lattner	22cfa14eed	make GVN try to rename inputs to the resultant replaced values, which cleans up the generated code a bit. This should have the added benefit of not randomly renaming functions/globals like my previous patch did. :) llvm-svn: 61023	2008-12-15 03:46:38 +00:00
Chris Lattner	c92b131639	Implement initial support for PHI translation in memdep. This means that memdep keeps track of how PHIs affect the pointer in dep queries, which allows it to eliminate the load in cases like rle-phi-translate.ll, which basically end up being: BB1: X = load P br BB3 BB2: Y = load Q br BB3 BB3: R = phi [P] [Q] load R turning "load R" into a phi of X/Y. In addition to additional exposed opportunities, this makes memdep safe in many cases that it wasn't before (which is required for load PRE) and also makes it substantially more efficient. For example, consider: bb1: // has many predecessors. P = some_operator() load P In this example, previously memdep would scan all the predecessors of BB1 to see if they had something that would mustalias P. In some cases (e.g. test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end up eliminating something. In many other cases though, it would scan and not find anything useful. MemDep now stops at a block if the pointer is defined in that block and cannot be phi translated to predecessors. This causes it to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not scanning tons of stuff that is unlikely to be useful. For example, this speeds up GVN as a whole from 3.928s to 2.448s (60%)!. IMO, scalar GVN should be enhanced to simplify the rle-must-alias pointer base anyway, which would allow the loads to be eliminated. In the future, this should be enhanced to phi translate through geps and bitcasts as well (as indicated by FIXMEs) making memdep even more powerful. llvm-svn: 61022	2008-12-15 03:35:32 +00:00
Chris Lattner	8f6a8a85a3	another random testcase that shouldn't crash gvn and is good for coverage with future changes. llvm-svn: 61011	2008-12-14 21:20:46 +00:00
Chris Lattner	af4007b39f	RLE isn't smart enough to eliminate this safely yet. llvm-svn: 60994	2008-12-13 21:04:20 +00:00
Chris Lattner	cc5ee569a3	rename some tests to be more uniform in naming convention. llvm-svn: 60988	2008-12-13 18:47:40 +00:00
Chris Lattner	5cb658f43c	gvn should never crash on this. llvm-svn: 60987	2008-12-13 18:39:44 +00:00
Bill Wendling	34182ae3ae	Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM: llvm[2]: Linking Release executable opt (without symbols) ... Undefined symbols: "llvm::APFloat::IEEEsingle", referenced from: __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) "llvm::APFloat::IEEEdouble", referenced from: __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) ld: symbol(s) not found This is in release mode. To replicate, compile llvm and llvm-gcc in optimized mode. Then build llvm, in optimized mode, with the newly created compiler. llvm-svn: 60977	2008-12-13 09:28:44 +00:00
Chris Lattner	8753175cd6	make RLE preserve the name of the load that it replaces. This is just a prettification of the IR. llvm-svn: 60973	2008-12-13 07:22:47 +00:00
Devang Patel	91736025e1	Re-enable test. llvm-svn: 60968	2008-12-12 22:42:35 +00:00
Bill Wendling	13e4a3d0b0	- Use patterns instead of creating completely new instruction matching patterns, which are identical to the original patterns. - Change the multiply with overflow so that we distinguish between signed and unsigned multiplication. Currently, unsigned multiplication with overflow isn't working! llvm-svn: 60963	2008-12-12 21:15:41 +00:00
Devang Patel	0aae72ae88	XFAIL these tests for now. llvm-svn: 60959	2008-12-12 19:08:08 +00:00
Nick Lewycky	51228d6707	Revert my re-instated reverted commit, fixes the bootstrap build on x86-64 linux. llvm-svn: 60951	2008-12-12 17:09:07 +00:00
Nick Lewycky	312d95be37	Sneaky, sneaky: move the -1 to the outside of the SMax. Reinstate the optimization of SGE/SLE with unit stride, now that it works properly. llvm-svn: 60881	2008-12-11 17:40:14 +00:00
Bill Wendling	292263313b	If ADD, SUB, or MUL have an overflow bit that's used, don't do transformation on them. The DAG combiner expects that nodes that are transformed have one value result. llvm-svn: 60857	2008-12-10 22:36:00 +00:00
Duncan Sands	81499a8e1c	For amusement, implement SADDO, SSUBO, UADDO, USUBO for promoted integer types, eg: i16 on ppc-32, or i24 on any platform. Complete support for arbitrary precision integers would require handling expanded integer types, eg: i128, but I couldn't be bothered. llvm-svn: 60834	2008-12-10 12:30:42 +00:00
Mon P Wang	308879dcfc	Fixed a bug when trying to optimize a extract vector element of a bit convert that changes the number of elements of a shuffle. llvm-svn: 60829	2008-12-10 03:59:02 +00:00
Chris Lattner	e2b5854e41	Allow basicaa to walk through geps with identical indices in parallel, allowing it to decide that P/Q must alias if A/B must alias in things like: P = gep A, 0, i, 1 Q = gep B, 0, i, 1 This allows GVN to delete 62 more instructions out of 403.gcc. llvm-svn: 60820	2008-12-10 01:04:47 +00:00
Evan Cheng	9419dfe08a	Fix a couple of Dwarf bugs. - Emit DW_AT_byte_size for struct and union of size zero. - Emit DW_AT_declaration for forward type declaration. llvm-svn: 60812	2008-12-10 00:15:44 +00:00
Bill Wendling	1c1dacdd42	Implement fast-isel conversion of a branch instruction that's branching on an overflow/carry from the "arithmetic with overflow" intrinsics. It searches the machine basic block from bottom to top to find the SETO/SETC instruction that is its conditional. If an instruction modifies EFLAGS before it reaches the SETO/SETC instruction, then it defaults to the normal instruction emission. llvm-svn: 60807	2008-12-09 23:19:12 +00:00
Chris Lattner	2550938060	loosen up an assertion that isn't valid when called from invalidateCachedPointerInfo. Thanks to Bill for sending me a testcase. llvm-svn: 60805	2008-12-09 22:45:32 +00:00
Bill Wendling	4c8fb3a0cc	Add sub/mul overflow intrinsics. This currently doesn't have a target-independent way of determining overflow on multiplication. It's very tricky. Patch by Zoltan Varga! llvm-svn: 60800	2008-12-09 22:08:41 +00:00
Duncan Sands	88a2901801	Fix PR3117: not all nodes being legalized. The essential problem was that the DAG can contain random unused nodes which were never analyzed. When remapping a value of a node being processed, such a node may become used and need to be analyzed; however due to operands being transformed during analysis the node may morph into a different one. Users of the morphing node need to be updated, and this wasn't happening. While there I added a bunch of documentation and sanity checks, so I (or some other poor soul) won't have to scratch their head over this stuff so long trying to remember how it was all supposed to work next time some obscure problem pops up! The extra sanity checking exposed a few places where invariants weren't being preserved, so those are fixed too. Since some of the sanity checking is expensive, I added a flag to turn it on. It is also turned on when building with ENABLE_EXPENSIVE_CHECKS=1. llvm-svn: 60797	2008-12-09 21:33:20 +00:00
Chris Lattner	6a5e9eaa36	Teach BasicAA::getModRefInfo(CallSite, CallSite) some tricks based on readnone/readonly functions. Teach memdep to look past readonly calls when analyzing deps for a readonly call. This allows elimination of a few more calls from 403.gcc: before: 63 gvn - Number of instructions PRE'd 153986 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted after: 63 gvn - Number of instructions PRE'd 153991 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted 5 calls isn't much, but this adds plumbing for the next change. llvm-svn: 60794	2008-12-09 21:19:42 +00:00
Evan Cheng	35f9ef25f1	xfail this for now. llvm-svn: 60777	2008-12-09 18:43:00 +00:00
Mikhail Glushenkov	25ca49afd4	Remove Clang tests since clang is not installed on the buildbots. llvm-svn: 60767	2008-12-09 15:11:45 +00:00
Mikhail Glushenkov	b1adbf1413	Add some rudimentary tests for . llvm-svn: 60766	2008-12-09 14:41:27 +00:00
Nick Lewycky	41060b1556	It's easy to handle SLE/SGE when the loop has a unit stride. llvm-svn: 60748	2008-12-09 07:25:04 +00:00
Scott Michel	5c944059b4	CellSPU: - Fix call.ll and call_indirect.ll expected results, now that it's using a different pre-register allocation scheduler. llvm-svn: 60741	2008-12-09 06:12:03 +00:00
Mon P Wang	0c011f8ba9	Fix getNode to allow a vector for the shift amount for shifts of vectors. Fix the shift amount when unrolling a vector shift into scalar shifts. Fix problem in getShuffleScalarElt where it assumes that the input of a bit convert must be a vector. llvm-svn: 60740	2008-12-09 05:46:39 +00:00
Devang Patel	0ef5e583cd	Actually test something. Use PR3170 test case. llvm-svn: 60727	2008-12-08 23:44:46 +00:00
Devang Patel	82fb6bc606	Undo previous patch. llvm-svn: 60701	2008-12-08 17:02:37 +00:00
Dan Gohman	14d4094968	Factor out the code for sign-extending/truncating gep indices and use it in x86 address mode folding. Also, make getRegForValue return 0 for illegal types even if it has a ValueMap for them, because Argument values are put in the ValueMap. This fixes PR3181. llvm-svn: 60696	2008-12-08 07:57:47 +00:00
Mikhail Glushenkov	c75a4df77c	Make 'extern' an option property. Makes (forward) work better. llvm-svn: 60667	2008-12-07 16:47:12 +00:00
Mikhail Glushenkov	a60c58c6dc	Add some clarifying comments. llvm-svn: 60662	2008-12-07 16:44:15 +00:00
Mikhail Glushenkov	d676237550	Add tests for tblgen's LLVMC backend. llvm-svn: 60657	2008-12-07 16:41:50 +00:00
Chris Lattner	a79a341f1e	fix a bug I introduced in simplifycfg handling single entry phi nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no need to delete it afterward. llvm-svn: 60653	2008-12-07 07:22:45 +00:00
Evan Cheng	5c92d425a9	Clean up some ARM GV asm printing out; minor fixes to match what gcc does. llvm-svn: 60621	2008-12-06 02:00:55 +00:00
Chris Lattner	022b15083b	Reimplement the inner loop of DSE. It now uniformly uses getDependence(), doesn't do its own local caching, and is slightly more aggressive about free/store dse (see testcase). This eliminates the last external client of MemDep::getDependenceFrom(). llvm-svn: 60619	2008-12-06 00:53:22 +00:00
Dale Johannesen	f4758579eb	Fix test to pass on Linux. llvm-svn: 60614	2008-12-05 22:38:21 +00:00
Dale Johannesen	f5a072c388	Make LoopStrengthReduce smarter about hoisting things out of loops when they can be subsumed into addressing modes. Change X86 addressing mode check to realize that some PIC references need an extra register. (I believe this is correct for Linux, if not, I'm sure someone will tell me.) llvm-svn: 60608	2008-12-05 21:47:27 +00:00
Evan Cheng	460beb0063	This test also requires -mattr=+sse41. llvm-svn: 60601	2008-12-05 19:26:37 +00:00
Evan Cheng	144447bfa0	Effectively undo 60461 in PIC mode which simply transform V_SET0 / V_SETALLONES into a load from constpool in order to fold into restores. This is not safe to do when PIC base is being used for a number of reasons: 1. GlobalBaseReg may have been spilled. 2. It may not be live at the use. 3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later). llvm-svn: 60595	2008-12-05 17:23:48 +00:00
Chris Lattner	211146e709	Fix test/Transforms/GVN/pre-load.ll llvm-svn: 60594	2008-12-05 17:04:12 +00:00
Evan Cheng	1b795803dd	Re-did 60519. It turns out Darwin's handling of hidden visibility symbols is a bit more complicated than I expected. Both declarations and weak definitions still need a stub indirection. However, the stubs are in data section and they contain the addresses of the actual symbols. llvm-svn: 60571	2008-12-05 01:06:39 +00:00
Scott Michel	550ec4540c	CellSPU: Add new directory under tests/CodeGen/CellSPU to retain tests that aren't part of the test suite but are generally useful nonetheless, and can be expanded later to test the backend against the actual Cell SPU system. There's basically no other good place to put this code, so put it here for the time being. - vecoperations.c: Vector shuffles for all supported vector types, tests for v16i8 add and multiply. llvm-svn: 60566	2008-12-05 00:01:00 +00:00
Devang Patel	4fcea36b8b	Rewrite code that 1) filters loops and 2) calculates new loop bounds. This fixes many bugs. I will add more test cases in a separate check-in. Some day, the code that manipulates CFG and updates dom. info could use refactoring help. llvm-svn: 60554	2008-12-04 21:38:42 +00:00
Bill Wendling	a0466523bd	Temporarily revert r60519. It was causing a bootstrap failure: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c ../../../llvm-gcc.src/libgomp/barrier.c -fno-common -DPIC -o .libs/barrier.o checking for sys/file.h... /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:non-relocatable subtraction expression, "_gomp_tls_key" minus "L1$pb" /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:symbol: "_gomp_tls_key" can't be undefined in a subtraction expression make[4]: * [barrier.lo] Error 1 make[4]: * Waiting for unfinished jobs.... /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT alloc.lo -MD -MP -MF .deps/alloc.Tpo -c ../../../llvm-gcc.src/libgomp/alloc.c -o alloc.o >/dev/null 2>&1 yes checking for sys/param.h... make[3]: * [all-recursive] Error 1 make[2]: * [all] Error 2 make[1]: * [all-target-libgomp] Error 2 make[1]: * Waiting for unfinished jobs.... llvm-svn: 60527	2008-12-04 04:07:00 +00:00
Evan Cheng	d4b7459179	Visibility hidden GVs do not require extra load of symbol address from the GOT or non-lazy-ptr. llvm-svn: 60519	2008-12-04 01:56:50 +00:00
Evan Cheng	05ded29738	Use mmx (punpckldq VR64, (mmx_v_set0)) to clear high 32-bits of a VR64 register. llvm-svn: 60499	2008-12-03 19:38:05 +00:00
Rafael Espindola	0b01e188e5	Fix some tests. The grep for "il" was matching "file". llvm-svn: 60485	2008-12-03 17:14:56 +00:00
Richard Osborne	e74ae9dbb7	Add support for ISD::TRAP to the XCore backend llvm-svn: 60479	2008-12-03 10:59:16 +00:00
Evan Cheng	803ac3b438	Fix test. llvm-svn: 60476	2008-12-03 08:20:45 +00:00
Chris Lattner	3f3717a4e2	testcase for br undef folding. llvm-svn: 60471	2008-12-03 07:48:27 +00:00
Chris Lattner	f00b2f3fb4	Teach jump threading some more simple tricks: 1) have it fold "br undef", which does occur with surprising frequency as jump threading iterates. 2) teach j-t to delete dead blocks. This removes the successor edges, reducing the in-edges of other blocks, allowing recursive simplification. 3) Fold things like: br COND, BBX, BBY BBX: br COND, BBZ, BBW which also happens because jump threading iterates. llvm-svn: 60470	2008-12-03 07:48:08 +00:00
Chris Lattner	683df044b0	don't spew tons of stuff to the output. This testcase is not for loop deletion (it is for a ton of passes), which is very bad. llvm-svn: 60465	2008-12-03 06:41:50 +00:00
Dan Gohman	ac6561793c	Mark x86's V_SET0 and V_SETALLONES with isSimpleLoad, and teach X86's foldMemoryOperand how to "fold" them, by converting them into constant-pool loads. When they aren't folded, they use xorps/cmpeqd, but for example when register pressure is high, they may now be folded as memory operands, which reduces register pressure. Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will remat it instead of copying zeros around (V_SETALLONES was already marked). llvm-svn: 60461	2008-12-03 05:21:24 +00:00
Bill Wendling	f1fab58701	Change label to 'carry' for unsigned adds. llvm-svn: 60460	2008-12-03 02:43:12 +00:00
Dan Gohman	dcd4896f12	Fix byval arguments in the fastcc calling convention. The fastcc convention delegates to the regular x86-32 convention which handles byval, but only after it handles a few cases, and it's necessary to handle byval before handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc miscompiling LLVM. llvm-svn: 60453	2008-12-03 01:28:04 +00:00
Dan Gohman	06c3ee5aa8	Add nounwind attributes to this test. llvm-svn: 60451	2008-12-03 01:10:18 +00:00
Dale Johannesen	da5e01399a	testcases for recent dag combiner changes llvm-svn: 60449	2008-12-03 00:52:41 +00:00
Evan Cheng	a77559c870	Remove a (what appears to be) overly strict assertion. Here is what happened: 1. ppcf128 select is expanded to f64 select's. 2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend. 3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed. 4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map. 5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert. Duncan, please take a look. Thanks. llvm-svn: 60443	2008-12-02 21:57:09 +00:00
Scott Michel	e0bbe7afb7	CellSPU: - Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch - Update SPU calling convention info, even if it's not used yet (but can be at some point or another) - Ensure that any-extended f32 loads are custom lowered, especially when they're promoted for use in printf. llvm-svn: 60438	2008-12-02 19:53:53 +00:00
Chris Lattner	2a9747548e	Implement PRE of loads in the GVN pass with a pretty cheap and straight-forward implementation. This does not require any extra alias analysis queries beyond what we already do for non-local loads. Some programs really really like load PRE. For example, SPASS triggers this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc. The biggest limitation to the implementation is that it does not split critical edges. This is a huge killer on many programs and should be addressed after the initial patch is enabled by default. The implementation of this should incidentally speed up rejection of non-local loads because it avoids creating the repl densemap in cases when it won't be used for fully redundant loads. This is currently disabled by default. Before I turn this on, I need to fix a couple of miscompilations in the testsuite, look at compile time performance numbers, and look at perf impact. This is pretty close to ready though. llvm-svn: 60408	2008-12-02 08:16:11 +00:00
Owen Anderson	bd844014fa	Add a test for my previous PRE fix. llvm-svn: 60394	2008-12-02 04:25:42 +00:00
Evan Cheng	39d7e00ff9	Fix PR3124: overly strict assert. llvm-svn: 60392	2008-12-02 02:15:36 +00:00
Bill Wendling	580f12ae30	Second stab at target-dependent lowering of everyone's favorite nodes: [SU]ADDO - LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C, depending on the type of ADDO requested. - LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or COND_C set. llvm-svn: 60388	2008-12-02 01:06:39 +00:00
Chris Lattner	baf38b4f91	Add rdar reference, make this actually fail when the patch isn't applied. llvm-svn: 60376	2008-12-01 22:35:31 +00:00
Dale Johannesen	f4362aae8c	Consider only references to an IV within the loop when figuring out the base of the IV. This produces better code in the example. (Addresses use (IV) instead of (BASE,IV) - a significant improvement on low-register machines like x86). llvm-svn: 60374	2008-12-01 22:00:01 +00:00
Scott Michel	cf677b5a67	CellSPU: - Fix v2[if]64 vector insertion code before IBM files a bug report. - Ensure that zero (0) offsets relative to $sp don't trip an assert (add $sp, 0 gets legalized to $sp alone, tripping an assert) - Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32 llvm-svn: 60358	2008-12-01 17:56:02 +00:00
Bill Wendling	a6e7dd2299	Use m_Specific() instead of double matching. llvm-svn: 60341	2008-12-01 08:09:47 +00:00
Chris Lattner	e6c7ed156f	simplify these patterns using m_Specific. No need to grep for xor in testcase (or is a substring). llvm-svn: 60328	2008-12-01 05:16:26 +00:00
Chris Lattner	0e03e40a76	Teach inst combine to merge GEPs through PHIs. This is really important because it is sinking the loads using the GEPs, but not the GEPs themselves. This triggers 647 times on 403.gcc and makes the .s file much much nicer. For example before: je LBB1_87 ## bb78 LBB1_62: ## bb77 leal 84(%esi), %eax LBB1_63: ## bb79 movl (%eax), %eax ... LBB1_87: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub jmp LBB1_62 ## bb77 after: jne LBB1_63 ## bb79 LBB1_62: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub LBB1_63: ## bb79 movl 84(%esi), %eax The input code was (and the GEPs are merged and the PHI is now eliminated by instcombine): br i1 %tmp233, label %bb78, label %bb77 bb77: %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb78: call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb79: %iftmp.12.0.in = phi %struct.rtx_def [ %tmp235, %bb78 ], [ %tmp234, %bb77 ] %iftmp.12.0 = load %struct.rtx_def %iftmp.12.0.in llvm-svn: 60322	2008-12-01 02:34:36 +00:00
Chris Lattner	01150dce74	testcase for my previous commit. llvm-svn: 60315	2008-12-01 01:42:03 +00:00
Bill Wendling	23684a026c	Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of permutations of this pattern. llvm-svn: 60312	2008-12-01 01:07:11 +00:00
Bill Wendling	66a7442059	Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations. llvm-svn: 60291	2008-11-30 13:52:49 +00:00
Bill Wendling	3e27ac16a6	Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This takes care of all permutations of this pattern. llvm-svn: 60290	2008-11-30 13:08:13 +00:00
Bill Wendling	97ad688c1b	getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all APInt calls instead. This fixes PR3144. llvm-svn: 60288	2008-11-30 12:38:24 +00:00
Eli Friedman	2bc3921ce2	Optimize memmove and memset into the LLVM builtins. Note that these only show up in code from front-ends besides llvm-gcc, like clang. llvm-svn: 60287	2008-11-30 08:32:11 +00:00
Eli Friedman	ccdfdbfc99	Followup to r60283: optimize arbitrary width signed divisions as well as unsigned divisions. Same caveats as before. llvm-svn: 60284	2008-11-30 06:35:39 +00:00
Eli Friedman	d7a261120f	Fix for PR2164: allow transforming arbitrary-width unsigned divides into multiplies. Some more cleverness would be nice, though. It would be nice if we could do this transformation on illegal types. Also, we would prefer a narrower constant when possible so that we can use a narrower multiply, which can be cheaper. llvm-svn: 60283	2008-11-30 06:02:26 +00:00
Eli Friedman	0ef5e1dc82	APIntify a test which is potentially unsafe otherwise, and fix the nearby FIXME. I'm not sure what the right way to fix the Cell test was; if the approach I used isn't okay, please let me know. llvm-svn: 60277	2008-11-30 04:59:26 +00:00
Bill Wendling	5020e916ef	Strengthen check for div inst-combining. llvm-svn: 60276	2008-11-30 04:33:53 +00:00
Bill Wendling	ac11f7d37e	Instcombine was illegally transforming -X/C into X/-C when either X or C overflowed on negation. This commit checks to make sure that neither C nor X overflows. This requires that the RHS of X (a subtract instruction) be a constant integer. llvm-svn: 60275	2008-11-30 03:42:12 +00:00
Chris Lattner	203a3299e9	don't require GVN to work on dead values, just make the test return the loaded value. llvm-svn: 60252	2008-11-29 21:21:48 +00:00
Chris Lattner	f3e49f038c	Fix a thinko that manifested as a crash on clamav last night. llvm-svn: 60251	2008-11-29 20:29:04 +00:00
Chris Lattner	494758e720	Fix PR3141 by ensuring that MemoryDependenceAnalysis::removeInstruction properly updates the reverse dependency map when it installs updated dependencies for instructions that depend on the removed instruction. llvm-svn: 60222	2008-11-28 22:51:08 +00:00
Chris Lattner	a854ab3760	don't call MergeBasicBlockIntoOnlyPred on a block whose only predecessor is itself. This doesn't make sense, and this is a dead infinite loop anyway. llvm-svn: 60210	2008-11-28 19:54:49 +00:00
Nick Lewycky	40db216722	Chris prefers icmp/select over udiv! llvm-svn: 60187	2008-11-27 22:41:10 +00:00
Nick Lewycky	882443585d	Add a couple of missed optimizations on integer vectors. Multiply and divide by 1, as well as multiply by -1. llvm-svn: 60182	2008-11-27 20:21:08 +00:00
Chris Lattner	73b251b3bf	Fix PR3138: if we merge the entry block into another block, make sure to move the other block back up into the entry position! llvm-svn: 60179	2008-11-27 19:25:19 +00:00
Bill Wendling	7742719284	XFAIL test due to reverting of patch. llvm-svn: 60161	2008-11-27 07:34:10 +00:00
Chris Lattner	532458b89f	Make jump threading substantially more powerful, in the following ways: 1. Make it fold blocks separated by an unconditional branch. This enables jump threading to see a broader scope. 2. Make jump threading able to eliminate locally redundant loads when they feed the branch condition of a block. This frequently occurs due to reg2mem running. 3. Make jump threading able to eliminate partially redundant loads when they feed the branch condition of a block. This is common in code with lots of loads and stores like C++ code and 255.vortex. This implements thread-loads.ll and rdar://6402033. Per the fixme's, several pieces of this should be moved into Transforms/Utils. llvm-svn: 60148	2008-11-27 05:07:53 +00:00
Evan Cheng	ee5e950c25	Avoid inserting noop's in the middle of a loop. llvm-svn: 60141	2008-11-27 01:16:00 +00:00
Evan Cheng	f18016728c	On x86, favor folding short immediates into some arithmetic operations (e.g. add, and, xor, etc.) because materializing an immediate in a register is expensive in terms of code size. E.g. "movl 4(%esp), %eax; addl $4, %eax" is 2 bytes shorter than "movl $4, %eax; addl 4(%esp), %eax". llvm-svn: 60139	2008-11-27 00:49:46 +00:00
Evan Cheng	4da44412cf	Add -march=x86. llvm-svn: 60135	2008-11-27 00:37:06 +00:00
Bill Wendling	3376836463	Add x86-specific test for add-with-overflow intrinsics. llvm-svn: 60125	2008-11-26 22:42:19 +00:00
Chris Lattner	d01522d33a	Turn on my codegen prepare heuristic by default. It doesn't affect performance in most cases on the Grawp tester, but does speed some things up (like shootout/hash by 15%). This also doesn't impact compile time in a noticeable way on the Grawp tester. It also, of course, gets the testcase it was designed for right :) llvm-svn: 60120	2008-11-26 22:16:44 +00:00
Duncan Sands	f64dd4b09c	Check that running the DAG combiner between type and operation legalization does something useful. llvm-svn: 60108	2008-11-26 16:44:30 +00:00
Bill Wendling	f069b62cd7	Add test for rdar://6394879. llvm-svn: 60079	2008-11-26 02:21:12 +00:00
Chris Lattner	61c2a0fc8a	This adds in some code (currently disabled unless you pass -enable-smarter-addr-folding to llc) that gives CGP a better cost model for when to sink computations into addressing modes. The basic observation is that sinking increases register pressure when part of the addr computation has to be available for other reasons, such as having a use that is a non-memory operation. In cases where it works, it can substantially reduce register pressure. This code is currently an overall win on 403.gcc and 255.vortex (the two things I've been looking at), but there are several things I want to do before enabling it by default: 1. This isn't doing any caching of results, so it is much slower than it could be. It currently slows down release-asserts llc by 1.7% on 176.gcc: 27.12s -> 27.60s. 2. This doesn't think about inline asm memory operands yet. 3. The cost model botches the case when the needed value is live across the computation for other reasons. I'll continue poking at this, and eventually turn it on as llcbeta. llvm-svn: 60074	2008-11-26 02:00:14 +00:00
Chris Lattner	8209f83091	Teach CodeGenPrepare to look through Bitcast instructions when attempting to optimize addressing modes. This allows us to optimize things like isel-sink2.ll into: movl 4(%esp), %eax cmpb $0, 4(%eax) jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 7(%eax), %eax ret instead of: _test: movl 4(%esp), %eax cmpb $0, 4(%eax) leal 4(%eax), %eax jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 3(%eax), %eax ret This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s. Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt it is really testing what it thinks it is. llvm-svn: 60068	2008-11-26 00:26:16 +00:00
Chris Lattner	017dde7e2b	fix an over-reduced test. llvm-svn: 60067	2008-11-26 00:12:08 +00:00
Chris Lattner	72db9f8bdd	this doesn't need EH llvm-svn: 60066	2008-11-26 00:03:26 +00:00
Mikhail Glushenkov	89bfeb825b	Since the old llvmc was removed, rename llvmc2 to llvmc. llvm-svn: 60048	2008-11-25 21:38:12 +00:00
Evan Cheng	c11d7e324f	convertToSignExtendedInteger should return opInvalidOp instead of asserting if the semantics of the float do not allow arithmetic. llvm-svn: 60042	2008-11-25 19:00:29 +00:00
Scott Michel	59013b297c	CellSPU: (a) Remove conditionally removed code in SelectXAddr. Basically, hope for the best that the A-form and D-form address predicates catch everything before the code decides to emit a X-form address. (b) Expand vector store test cases to include the usual suspects. llvm-svn: 60034	2008-11-25 17:29:43 +00:00
Scott Michel	bb575152bc	CellSPU: test should use shlqby, not shlqbyi llvm-svn: 60001	2008-11-25 01:30:37 +00:00
Bill Wendling	c9f3eec3f9	XFAIL this test. A recent CellSPU check-in broke it. llvm-svn: 60000	2008-11-25 00:56:34 +00:00
Dan Gohman	92cedc8a95	Initial support for anti-dependence breaking. Currently this code does not introduce any new spilling; it just uses unused registers. Refactor the SUnit topological sort code out of the RRList scheduler and make use of it to help with the post-pass scheduler. llvm-svn: 59999	2008-11-25 00:52:40 +00:00
Bill Wendling	cb92038dbd	Testcase for constant CFStrings. llvm-svn: 59992	2008-11-24 23:28:09 +00:00
Chris Lattner	a07ad05059	reenable test llvm-svn: 59986	2008-11-24 21:27:20 +00:00
Bill Wendling	36ee715e71	Temporarily XFAIL this test. r59976 and r59972 broke it. llvm-svn: 59981	2008-11-24 20:43:33 +00:00
Chris Lattner	e5bf93e61f	Fix 3113: If we have a dead cyclic PHI, replace the whole thing with an undef. llvm-svn: 59972	2008-11-24 19:25:36 +00:00
Scott Michel	259a64c097	CellSPU: (a) Slight rethink on i64 zero/sign/any extend code - use a shuffle to directly zero-extend i32 to i64, but use rotates and shifts for sign extension. Also ensure unified register consistency. (b) Add new test harness for i64 operations: i64ops.ll llvm-svn: 59970	2008-11-24 18:20:46 +00:00
Scott Michel	c3965308a4	CellSPU: (a) Improve the extract element code: there's no need to do gymnastics with rotates into the preferred slot if a shuffle will do the same thing. (b) Rename a couple of SPUISD pseudo-instructions for readability and better semantic correspondence. (c) Fix i64 sign/any/zero extension lowering. llvm-svn: 59965	2008-11-24 17:11:17 +00:00
Bill Wendling	855ac77084	Test add-with-overflow with fast ISel. llvm-svn: 59945	2008-11-24 05:23:38 +00:00
Nick Lewycky	47fa9bd187	Extend the 'noalias' attribute to function return values. This is intended to indicate functions that allocate, such as operator new, or list::insert. The actual definition is slightly less strict (for now). No changes to the bitcode reader/writer, asm printer or verifier were needed. llvm-svn: 59934	2008-11-24 03:41:24 +00:00
Bill Wendling	4bb8a7a498	Add support for llvm.uadd.with.overflow. llvm-svn: 59926	2008-11-24 01:38:29 +00:00
Scott Michel	50e49b28f0	CellSPU: Fix bug 3056. Variadic extract_element was not implemented (nor was it ever conceived to occur). llvm-svn: 59891	2008-11-22 23:50:42 +00:00
Nick Lewycky	2fbf26fe70	Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and a subtract is cheaper than a multiply. This generalizes an existing transform. llvm-svn: 59800	2008-11-21 07:33:58 +00:00
Scott Michel	314d705baf	CellSPU: (a) Fix bugs 3052, 3057 (b) Incorporate Duncan's suggestions re: i1 promotion (c) Indentation updates. llvm-svn: 59790	2008-11-21 02:56:16 +00:00
Bill Wendling	1e6d74b84a	Add generic test for add with overflow. llvm-svn: 59781	2008-11-21 02:15:51 +00:00

6224 Commits