llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00

Author	SHA1	Message	Date
Benjamin Kramer	a58b69aa9d	Try to reuse the value when lowering memset. This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710	2011-01-02 19:57:05 +00:00
Benjamin Kramer	38491f47ce	Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that doesn't support the multiply or it is slow (p4) if someone cares enough. Example code: void test(char *s, int a) { __builtin_memset(s, a, 4); } before: _test: ## @test movzbl 8(%esp), %eax movl %eax, %ecx shll $8, %ecx orl %eax, %ecx movl %ecx, %eax shll $16, %eax orl %ecx, %eax movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret after: _test: ## @test movzbl 8(%esp), %eax imull $16843009, %eax, %eax ## imm = 0x1010101 movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret llvm-svn: 122707	2011-01-02 19:44:58 +00:00
Oscar Fuentes	6514b0ac68	A workaround for a bug in cmake 2.8.3 diagnosed on PR 8885. llvm-svn: 122706	2011-01-02 19:32:31 +00:00
Nick Lewycky	06b94a5e5b	Also remove functions that use complex constant expressions in terms of another function. llvm-svn: 122705	2011-01-02 19:16:44 +00:00
Chris Lattner	c78e4bc366	enhance loop idiom recognition to scan all unconditionally executed blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704	2011-01-02 19:01:03 +00:00
Chris Lattner	3bb2e83433	make inSubLoop much more efficient. llvm-svn: 122703	2011-01-02 18:53:08 +00:00
Chris Lattner	2bcd2564d6	rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead. isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was just a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree". llvm-svn: 122702	2011-01-02 18:45:39 +00:00
Chris Lattner	bbae3ddf12	add a list of opportunities for future improvement. llvm-svn: 122701	2011-01-02 18:32:09 +00:00
Chris Lattner	222b24e2de	update a bunch of entries. llvm-svn: 122700	2011-01-02 18:31:38 +00:00
Duncan Sands	2d1c116071	Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described in the PR, the pass could break LCSSA form when inserting preheaders. It probably would be easy enough to fix this, but since currently we always go into LCSSA form after running this pass, doing so is not urgent. llvm-svn: 122695	2011-01-02 13:38:21 +00:00
Cameron Zwarich	482eeb4c8e	Remove an unused member function. llvm-svn: 122693	2011-01-02 12:37:22 +00:00
Oscar Fuentes	c3becc8af6	Propagate to parent scope changes made to CMAKE_CXX_FLAGS. llvm-svn: 122692	2011-01-02 12:30:18 +00:00
Cameron Zwarich	8444a578bb	Fix a typo in a variable name. llvm-svn: 122691	2011-01-02 12:17:10 +00:00
Cameron Zwarich	72a49f9271	Move a load into the only branch where it is used and eliminate a temporary. llvm-svn: 122690	2011-01-02 10:50:14 +00:00
Cameron Zwarich	0a0e69ca0d	Add the explanatory comment from r122680's commit message to the code itself. llvm-svn: 122689	2011-01-02 10:40:14 +00:00
Cameron Zwarich	e522fd8efe	Tidy up indentation. llvm-svn: 122688	2011-01-02 10:10:02 +00:00
Cameron Zwarich	7becf43554	Fix a typo, which should also fix the failure on llvm-x86_64-linux-checks. llvm-svn: 122687	2011-01-02 10:06:44 +00:00
Chris Lattner	f669d6a901	Allow loop-idiom to run on multiple BB loops, but still only scan the loop header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but DOESN'T MERGE THE BLOCKS, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many many more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685	2011-01-02 07:58:36 +00:00
Cameron Zwarich	25272921bb	Remove the #ifdef'd code for balancing the eval-link data structure. It doesn't compile, and everyone's tests have shown it to be slower in practice, even for quite large graphs. I also hope to do an optimization that is only correct with the simpler data structure, which would break this even further. llvm-svn: 122684	2011-01-02 07:53:49 +00:00
Chris Lattner	8a72a8f315	remove debugging code. llvm-svn: 122683	2011-01-02 07:37:13 +00:00
Chris Lattner	bbd22e0c3c	add some -stats output. llvm-svn: 122682	2011-01-02 07:36:44 +00:00
Chris Lattner	2afc3c0dc4	improve loop rotation to use CodeMetrics to analyze the size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely. llvm-svn: 122681	2011-01-02 07:35:53 +00:00
Cameron Zwarich	c8a0461c46	Speed up dominator computation some more by optimizing bucket processing. When naively implemented, the Lengauer-Tarjan algorithm requires a separate bucket for each vertex. However, this is unnecessary, because each vertex is only placed into a single bucket (that of its semidominator), and each vertex's bucket is processed before it is added to any bucket itself. Instead of using a bucket per vertex, we use a single array Buckets that has two purposes. Before the vertex V with DFS number i is processed, Buckets[i] stores the index of the first element in V's bucket. After V's bucket is processed, Buckets[i] stores the index of the next element in the bucket to which V now belongs, if any. Reading from the buckets can also be optimized. Instead of processing the bucket of V's parent at the end of processing V, we process the bucket of V itself at the beginning of processing V. This means that the case of the root vertex can be simplified somewhat. It also means that we don't need to look up the DFS number of the semidominator of every node in the bucket we are processing, since we know it is the current index being processed. This is a 6.5% speedup running -domtree on test-suite + SPEC2000/2006, with larger speedups of around 12% on the larger benchmarks like GCC. llvm-svn: 122680	2011-01-02 07:03:00 +00:00
Chris Lattner	34a61ab676	teach loop idiom recognition to form memcpy's from simple loops. llvm-svn: 122678	2011-01-02 03:37:56 +00:00
Nick Lewycky	68d915ae1a	Remove functions from the FnSet when one of their callee's is being merged. This maintains the guarantee that the DenseSet expects two elements it contains to not go from inequal to equal under its nose. As a side-effect, this also lets us switch from iterating to a fixed-point to actually maintaining a work queue of functions to look at again, and we don't add thunks to our work queue so we don't need to detect and ignore them. llvm-svn: 122677	2011-01-02 02:46:33 +00:00
Chris Lattner	9dadac901f	a missed __builtin_object_size case. llvm-svn: 122676	2011-01-01 22:57:31 +00:00
Chris Lattner	e3e3cb83a5	various updates. llvm-svn: 122675	2011-01-01 22:52:11 +00:00
Chris Lattner	fda382af51	fix a globalopt crash on two Adobe-C++ testcases that the recent loop idiom pass exposed. llvm-svn: 122674	2011-01-01 22:31:46 +00:00
Rafael Espindola	2600dec19b	Fix darwin bots. llvm-svn: 122672	2011-01-01 21:58:41 +00:00
Rafael Espindola	55f7a5057d	Add support for the 'H' modifier. llvm-svn: 122667	2011-01-01 20:58:46 +00:00
Anton Korobeynikov	c3dd871e95	Update the test llvm-svn: 122666	2011-01-01 20:57:26 +00:00
Chris Lattner	e2ca13d366	turn on memset idiom recognition by default. Though there are still lots of limitations, this kicks in dozens of times in the 4 specfp2000 benchmarks, and hundreds of times in the int part. It also kicks in hundreds of times in multisource. This kicks in right before loop deletion, which has the pleasant effect of deleting loops that just do a memset. llvm-svn: 122664	2011-01-01 20:39:18 +00:00
Anton Korobeynikov	d37cb4cd1c	Model operand restrictions of mul-like instructions on ARMv5 via earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments. llvm-svn: 122663	2011-01-01 20:38:38 +00:00
Chris Lattner	b9c1684fce	add a validity check that was missed, fixing a crash on the new testcase. llvm-svn: 122662	2011-01-01 20:12:04 +00:00
Duncan Sands	aaddf57af9	Revert commit 122654 at the request of Chris, who reckons that instsimplify is the wrong hammer for this nail, and is probably right. llvm-svn: 122661	2011-01-01 20:08:02 +00:00
Chris Lattner	9a9f43c4a2	improve validity check to handle constant-trip-count loops more aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660	2011-01-01 19:54:22 +00:00
Chris Lattner	4651f8b037	implement the "no aliasing accesses in loop" safety check. This pass should be correct now. llvm-svn: 122659	2011-01-01 19:39:01 +00:00
Rafael Espindola	e223c0aa14	Fix PR8878. llvm-svn: 122658	2011-01-01 19:05:35 +00:00
Duncan Sands	6569cc29d1	Correct a bunch of mistakes which meant that the example pass didn't even compile, let alone work. llvm-svn: 122657	2011-01-01 17:37:07 +00:00
Duncan Sands	621983efe9	I was unable to get the instructions to work if LLVM was built using a separate objects directory. llvm-svn: 122656	2011-01-01 17:28:49 +00:00
Duncan Sands	3ba8ee3552	Clarify that the loadable module turns up in the top-level directory, not locally. llvm-svn: 122655	2011-01-01 17:21:58 +00:00
Duncan Sands	ec8b2b4cc5	Fix a README item by having InstructionSimplify do a mild form of value numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant. llvm-svn: 122654	2011-01-01 16:12:09 +00:00
Che-Liang Chiou	a188cfb574	ptx: remove reg-reg addressing mode and st.const llvm-svn: 122653	2011-01-01 11:58:58 +00:00
Che-Liang Chiou	995a853724	ptx: add store instruction llvm-svn: 122652	2011-01-01 10:50:37 +00:00
Erick Tryzelaar	f6477b0a90	Add a reference to the OCamlLangImpl8. llvm-svn: 122651	2011-01-01 03:29:25 +00:00
Erick Tryzelaar	5ac1b424f0	Add an OCaml tutorial page 8 llvm-svn: 122650	2011-01-01 03:27:43 +00:00
Oscar Fuentes	a63d0dbfbf	Add to the list of cmake files the object file, not the asm file. This is necessary for executing the custom command that runs the assember. Fixes PR8877. llvm-svn: 122649	2010-12-31 20:15:37 +00:00
Oscar Fuentes	2e6eb6e191	CMake (MSVC): cmake automatically adds the /EHsc and /GR compiler options. If we are building with exceptions/rtti disabled, we replace /EHsc with /EHs-c- and /GR with /GR-, respectively. If we just add the disabling options we get warnings like this: cl : Command line warning D9025 : overriding '/EHs' with '/EHs-' llvm-svn: 122648	2010-12-31 19:10:49 +00:00
Duncan Sands	74270e8100	Simplify this pass by using a depth-first iterator to ensure that all operands are visited before the instructions themselves. llvm-svn: 122647	2010-12-31 17:49:05 +00:00
Duncan Sands	ca280dbcd5	Zap dead instructions harder. llvm-svn: 122645	2010-12-31 16:17:54 +00:00

... 6 7 8 9 10 ...

69096 Commits