Chris Lattner
b86c76fdbb
reenable array_pod_sort, this time hopefully happy on 64-bit
and big endian systems.
llvm-svn: 60371
2008-12-01 21:11:25 +00:00
Bill Wendling
33f3e77a5b
Don't rebuild RHSNeg. Just use the one that's already there.
llvm-svn: 60370
2008-12-01 21:06:30 +00:00
Bill Wendling
d436da480d
Document what this check is doing. Also, no need to cast to ConstantInt.
llvm-svn: 60369
2008-12-01 21:03:43 +00:00
Bill Wendling
1e4fb7a143
Use a simple comparison. Overflow on integer negation can only occur when the
integer is "minint".
llvm-svn: 60366
2008-12-01 19:46:27 +00:00
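The identity behind this commit: in two's complement, -x is representable for every x except the minimum value, so the overflow test collapses to a single comparison. A standalone C++ check (illustrative, not the instcombine code itself):

#include <cassert>
#include <climits>

// In two's complement, negation overflows only for INT_MIN, because
// -INT_MIN == INT_MAX + 1 is not representable.
bool negationOverflows(int x) {
  return x == INT_MIN;
}

int main() {
  assert(negationOverflows(INT_MIN));
  assert(!negationOverflows(INT_MIN + 1));
  assert(!negationOverflows(0) && !negationOverflows(INT_MAX));
}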
Chris Lattner
d6533f8d0a
don't #include <algorithm> into the llvm namespace.
llvm-svn: 60365
2008-12-01 19:45:45 +00:00
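For context, the bug class here: an #include inside a namespace re-parses the whole standard header in that namespace, so std's declarations land in llvm::std and clash with the real ones. A minimal sketch of the wrong and right shapes (illustrative, not the original diff):

// Wrong: drags the entire header into namespace llvm.
//   namespace llvm {
//   #include <algorithm>   // declares llvm::std::sort, etc.
//   }

// Right: include at global scope, then open the namespace.
#include <algorithm>

namespace llvm {
// ... code that uses std::sort ...
} // namespace llvm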
Scott Michel
cf677b5a67
CellSPU:
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
llvm-svn: 60358
2008-12-01 17:56:02 +00:00
Chris Lattner
ddae8937e6
switch to std::sort until I have time to sort this out.
llvm-svn: 60354
2008-12-01 17:00:08 +00:00
Chris Lattner
f73ecf1a6c
cleanups suggested by duncan, thanks!
llvm-svn: 60353
2008-12-01 16:55:19 +00:00
Chris Lattner
ba0f3caaa7
define array_pod_sort in terms of operator< instead of my brain
damaged approximation. This should fix it on big endian platforms
and on 64-bit.
llvm-svn: 60352
2008-12-01 16:50:01 +00:00
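A sketch of the operator<-based shape, simplified from what llvm/ADT/STLExtras.h ended up with: the qsort comparator is synthesized from operator< rather than from raw-byte tricks, so host endianness and word size are irrelevant.

#include <cstdlib>
#include <iterator>

// qsort comparator built on operator<: returns -1/0/1 without ever
// inspecting the object representation.
template <typename T>
static int array_pod_sort_comparator(const void *P1, const void *P2) {
  const T &A = *static_cast<const T *>(P1);
  const T &B = *static_cast<const T *>(P2);
  if (A < B) return -1;
  if (B < A) return 1;
  return 0;
}

template <class IteratorTy>
static void array_pod_sort(IteratorTy Start, IteratorTy End) {
  if (Start == End) return; // don't hand qsort an empty/invalid base
  std::qsort(&*Start, End - Start, sizeof(*Start),
             array_pod_sort_comparator<
                 typename std::iterator_traits<IteratorTy>::value_type>);
}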
Duncan Sands
5de8739964
There are no longer any places that require a
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
llvm-svn: 60349
2008-12-01 11:41:29 +00:00
Duncan Sands
1fae2ea219
Change the interface to the type legalization method
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
llvm-svn: 60348
2008-12-01 11:39:25 +00:00
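The vector-returning interface survives into today's tree; a target's override simply appends one SDValue per replaced result. Roughly, using the modern signature as an assumption (MyTargetLowering, ISD::MY_NODE, and LowerMYNODE are made up for illustration):

void MyTargetLowering::ReplaceNodeResults(SDNode *N,
                                          SmallVectorImpl<SDValue> &Results,
                                          SelectionDAG &DAG) const {
  switch (N->getOpcode()) {
  default:
    return; // leave Results empty: nothing was replaced
  case ISD::MY_NODE: // hypothetical opcode
    // One push_back per result; no MERGE_VALUES bookkeeping needed.
    Results.push_back(LowerMYNODE(SDValue(N, 0), DAG)); // hypothetical helper
    break;
  }
}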
Bill Wendling
48b7cbbc01
Generalize the FoldOrWithConstant method to fold for any two constants which
don't have overlapping bits.
llvm-svn: 60344
2008-12-01 08:32:40 +00:00
Bill Wendling
2a182b838d
Reduce copy-and-paste code by splitting out the code into its own function.
llvm-svn: 60343
2008-12-01 08:23:25 +00:00
Bill Wendling
a6e7dd2299
Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
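For readers unfamiliar with the helper: m_Specific constrains a position to an already-captured Value*, so a pattern binds once with m_Value and then reuses the binding, instead of matching twice and comparing the results. A sketch (modern header path assumed; in 2008 PatternMatch.h lived under llvm/Support/):

#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace llvm::PatternMatch;

// Recognize A & (A | B): capture A at its first occurrence, then demand
// the very same Value again via m_Specific.
static bool matchAndOfOr(Value *V, Value *&A, Value *&B) {
  return match(V, m_And(m_Value(A), m_Or(m_Specific(A), m_Value(B))));
}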
Bill Wendling
8e484e9556
Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to.
llvm-svn: 60340
2008-12-01 07:47:02 +00:00
Chris Lattner
3b908483b7
Rename some variables, only increment BI once at the start of the loop instead of throughout it.
llvm-svn: 60339
2008-12-01 07:35:54 +00:00
Chris Lattner
c6e6eaf6d3
pull the predMap densemap out of the inner loop of performPRE, so
that it isn't reallocated all the time. This is a tiny speedup for
GVN: 3.90->3.88s
llvm-svn: 60338
2008-12-01 07:29:03 +00:00
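The underlying pattern is general: a DenseMap constructed inside a hot loop pays allocation and rehashing on every iteration, while one hoisted outside and clear()ed keeps its bucket memory. Shape of the change, not the actual performPRE code:

#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/IR/BasicBlock.h"

using namespace llvm;

void processBlocks(ArrayRef<BasicBlock *> Blocks) {
  DenseMap<BasicBlock *, Value *> predMap; // hoisted: allocated once
  for (BasicBlock *BB : Blocks) {
    predMap.clear(); // drops the entries but keeps the bucket memory
    // ... fill and consume predMap for this block ...
  }
}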
Chris Lattner
f72f8e3b74
switch a couple more calls to use array_pod_sort.
llvm-svn: 60337
2008-12-01 06:52:57 +00:00
Chris Lattner
3ac27eff64
don't assume iterators implicitly convert to pointers.
llvm-svn: 60336
2008-12-01 06:50:46 +00:00
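The portability issue: with the libstdc++ of the day, std::vector<T>::iterator happened to be T*, so passing an iterator where a pointer is expected compiled by accident; standard libraries with class-type iterators reject it. The portable spelling takes the address of the dereferenced iterator:

#include <vector>

void takesPointer(int *) {}

void example(std::vector<int> &V) {
  // Non-portable; only compiles where vector<int>::iterator is int*:
  //   takesPointer(V.begin());
  if (!V.empty())
    takesPointer(&*V.begin()); // portable: form the pointer explicitly
}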
Chris Lattner
80d0eff786
Introduce a new array_pod_sort function and switch LSR to use it
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
llvm-svn: 60335
2008-12-01 06:49:59 +00:00
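The size win comes from instantiation: std::sort stamps out a full templated introsort per element type, whereas array_pod_sort routes every call through the single qsort in libc plus a tiny per-type comparator. A typical call site looks like this (sketch; sortUsers and its vector are invented for illustration):

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

void sortUsers(SmallVectorImpl<Instruction *> &Users) {
  // Drop-in for std::sort on POD-ish elements with a total order.
  array_pod_sort(Users.begin(), Users.end());
}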
Chris Lattner
74f1e6d3ec
Eliminate use of setvector for the DeadInsts set, just use a smallvector.
This is a lot cheaper and conceptually simpler.
llvm-svn: 60332
2008-12-01 06:27:41 +00:00
Chris Lattner
db86ff62f9
DeleteTriviallyDeadInstructions is always passed the
DeadInsts ivar; just use it directly.
llvm-svn: 60330
2008-12-01 06:14:28 +00:00
Chris Lattner
d6be279b4d
simplify DeleteTriviallyDeadInstructions again; unlike my previous
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it and
then notifying.
llvm-svn: 60329
2008-12-01 06:11:32 +00:00
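The ordering matters because ScalarEvolution caches results keyed by the Instruction*; erasing first would have the analysis told about a pointer that is already gone. A sketch of the corrected loop shape (the operand re-queueing and the modern forgetValue entry point are assumptions, not the 2008 code):

SmallVector<Instruction *, 32> DeadInsts; // simple vector worklist
// ... populated elsewhere; SE is the cached ScalarEvolution* ...
while (!DeadInsts.empty()) {
  Instruction *I = DeadInsts.pop_back_val();
  if (!isInstructionTriviallyDead(I))
    continue;

  // Operands may become dead once I is gone; queue them for a look.
  for (Use &Op : I->operands())
    if (Instruction *OpI = dyn_cast<Instruction>(Op))
      DeadInsts.push_back(OpI);

  SE->forgetValue(I);   // notify ScalarEvolution first... (modern API)
  I->eraseFromParent(); // ...then actually erase the instruction
}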
Chris Lattner
e6c7ed156f
simplify these patterns using m_Specific. No need to grep for
xor in the testcase ("or" is a substring of "xor").
llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner
c92e1e104b
Teach jump threading to clean up after itself, DCE'ing and constant-folding
the new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHIs, we inherently expose a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
llvm-svn: 60327
2008-12-01 04:48:07 +00:00
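The cleanup itself needs nothing jump-threading-specific; stock utilities suffice. A hedged sketch of the idea (cleanupNewBlock is invented; ConstantFoldInstruction and isInstructionTriviallyDead are the real helpers in today's tree):

#include "llvm/ADT/STLExtras.h"
#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Transforms/Utils/Local.h"

using namespace llvm;

// After threading an edge, cloned instructions often compute constants
// (their PHI inputs were constants). Fold and delete them so the cost
// model sees the block's real size on the next iteration.
void cleanupNewBlock(BasicBlock *BB, const DataLayout &DL) {
  for (Instruction &I : llvm::make_early_inc_range(*BB)) {
    if (Constant *C = ConstantFoldInstruction(&I, DL)) {
      I.replaceAllUsesWith(C);
      if (isInstructionTriviallyDead(&I))
        I.eraseFromParent();
    }
  }
}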
Chris Lattner
ceeb559995
The PreVerifier pass preserves everything. In practice, this
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.
llvm-svn: 60326
2008-12-01 03:58:38 +00:00
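"Preserves everything" is a one-liner in the legacy pass's getAnalysisUsage; without it, the pass manager conservatively assumes every analysis was invalidated and re-schedules them. The idiom, inside the pass class:

// Declare that running this pass invalidates nothing, so live
// analyses (a DominatorTree, say) stay live across it.
void getAnalysisUsage(AnalysisUsage &AU) const override {
  AU.setPreservesAll();
}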
Chris Lattner
13942f82c4
Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs
instead of using FoldPHIArgBinOpIntoPHI. In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices. This prevented instcombine
from factoring big chunks of code in 403.gcc. For example:
insn_cuid.exit:
- %tmp336 = load i32** @uid_cuid, align 4
- %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3
- %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*
- %tmp339 = load i32* %tmp338, align 4
- %tmp340 = getelementptr i32* %tmp336, i32 %tmp339
br label %bb62
bb61:
- %tmp341 = load i32** @uid_cuid, align 4
- %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3
- %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*
- %tmp344 = load i32* %tmp343, align 4
- %tmp345 = getelementptr i32* %tmp341, i32 %tmp344
br label %bb62
bb62:
- %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]
+ %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]
+ %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3
+ %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*
+ %tmp341.pn = load i32** @uid_cuid
+ %tmp344.pn = load i32* %tmp344.pn.in
+ %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn
%iftmp.62.0 = load i32* %iftmp.62.0.in
llvm-svn: 60325
2008-12-01 03:42:51 +00:00
Chris Lattner
0e03e40a76
Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves. This triggers 647 times on 403.gcc
and makes the .s file much much nicer. For example before:
je LBB1_87 ## bb78
LBB1_62: ## bb77
leal 84(%esi), %eax
LBB1_63: ## bb79
movl (%eax), %eax
...
LBB1_87: ## bb78
movl $0, 4(%esp)
movl %esi, (%esp)
call L_make_decl_rtl$stub
jmp LBB1_62 ## bb77
after:
jne LBB1_63 ## bb79
LBB1_62: ## bb78
movl $0, 4(%esp)
movl %esi, (%esp)
call L_make_decl_rtl$stub
LBB1_63: ## bb79
movl 84(%esi), %eax
The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):
br i1 %tmp233, label %bb78, label %bb77
bb77:
%tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22
br label %bb79
bb78:
call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
%tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22
br label %bb79
bb79:
%iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]
%iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in
llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner
01150dce74
testcase for my previous commit.
llvm-svn: 60315
2008-12-01 01:42:03 +00:00
Chris Lattner
c1adf6fc51
Make GVN be more intelligent about redundant load
elimination: when finding dependent loads/stores, realize that
they are the same if the alias analysis reports must-alias, instead of
relying on the pointers being exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
llvm-svn: 60314
2008-12-01 01:31:36 +00:00
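The predicate swap, sketched with the modern AAResults names (an assumption; the 2008 interface was spelled differently): instead of Ptr1 == Ptr2, ask whether the two locations must alias, which also matches pointers that differ syntactically (through bitcasts, etc.) but provably address the same memory.

#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Two loads are redundant when their addresses MustAlias, a strictly
// weaker requirement than Value* identity. (Modern API names.)
static bool sameAddress(AAResults &AA, LoadInst *L1, LoadInst *L2) {
  return AA.alias(MemoryLocation::get(L1), MemoryLocation::get(L2)) ==
         AliasResult::MustAlias;
}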
Chris Lattner
bd1bc4a75e
Reimplement the non-local dependency data structure in terms of a sorted
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly; gvn goes from
3.9376 to 3.9118s on 403.gcc.
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
llvm-svn: 60313
2008-12-01 01:15:42 +00:00
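The container swap generalizes well beyond memdep: a sorted vector stores its pairs contiguously, which keeps the high-water mark low and makes full scans cache-friendly, while point lookup stays O(log n) via binary search; the trade-off is O(n) insertion in the middle. A self-contained sketch of the shape, not the MemoryDependenceAnalysis code itself:

#include <algorithm>
#include <utility>
#include <vector>

template <typename K, typename V> struct SortedVectorMap {
  std::vector<std::pair<K, V>> Entries; // invariant: sorted by key

  static bool keyLess(const std::pair<K, V> &E, const K &Key) {
    return E.first < Key;
  }

  V *lookup(const K &Key) {
    auto It = std::lower_bound(Entries.begin(), Entries.end(), Key, keyLess);
    return (It != Entries.end() && It->first == Key) ? &It->second : nullptr;
  }

  void insert(const K &Key, const V &Val) {
    auto It = std::lower_bound(Entries.begin(), Entries.end(), Key, keyLess);
    if (It != Entries.end() && It->first == Key)
      It->second = Val;               // overwrite existing entry
    else
      Entries.insert(It, {Key, Val}); // O(n) shift; fine when small
  }
};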
Bill Wendling
23684a026c
Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.
llvm-svn: 60312
2008-12-01 01:07:11 +00:00
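The identity from r60312 is mechanical to confirm: the left side takes bit 0 from A|B and every other bit from B, which is exactly (A&1)|B. An exhaustive 8-bit check (in 8 bits, -2 is 0xFE):

#include <cassert>
#include <cstdint>

int main() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t L = ((A | B) & 1) | (B & 0xFE); // ((A|B)&1)|(B&-2)
      uint8_t R = (A & 1) | B;                // (A&1) | B
      assert(L == R);
    }
  return 0;
}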
Eli Friedman
401743c904
Fix bogus assertion using getSExtValue for legitimate values, like -1 in
a 128-bit-wide integer. No testcase; the issue I ran into depends on
local changes.
llvm-svn: 60311
2008-12-01 00:43:48 +00:00
Chris Lattner
1f8482ffc8
Cache analyses in ivars and add some useful DEBUG output.
This speeds up GVN from 4.0386s to 3.9376s.
llvm-svn: 60310
2008-12-01 00:40:32 +00:00
Chris Lattner
77908d9ccf
improve indentation, do cheap checks before expensive ones,
remove some fixme's. This speeds up GVN very slightly on 403.gcc
(4.06->4.03s)
llvm-svn: 60309
2008-11-30 23:39:23 +00:00
Chris Lattner
36257aabe4
Eliminate the DepResultTy abstraction. It is now completely
redundant with MemDepResult, and MemDepResult has a nicer interface.
llvm-svn: 60308
2008-11-30 23:17:19 +00:00
Eli Friedman
052df7e062
Minor cleanup: use getTrue and getFalse where appropriate. No
functional change.
llvm-svn: 60307
2008-11-30 22:48:49 +00:00
Eli Friedman
8da9f2f8d3
Some minor cleanups to instcombine; no functionality change.
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
llvm-svn: 60306
2008-11-30 21:09:11 +00:00
Chris Lattner
9f7facc8eb
Cache TargetData/AliasAnalysis in the pass instead of calling
getAnalysis<>. getAnalysis<> is apparently extremely expensive.
Doing this speeds up GVN on 403.gcc by 16%!
llvm-svn: 60304
2008-11-30 19:24:31 +00:00
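The fix is the standard caching idiom for legacy passes: resolve each analysis once per runOnFunction and stash the pointer in a member, so hot paths read a field instead of querying the pass manager. Shape of the change, not the actual GVN diff (TargetData is the era's name for today's DataLayout):

class GVN : public FunctionPass {
  AliasAnalysis *AA = nullptr;
  TargetData *TD = nullptr;

public:
  bool runOnFunction(Function &F) override {
    AA = &getAnalysis<AliasAnalysis>(); // one lookup per function...
    TD = &getAnalysis<TargetData>();
    return processFunction(F);          // ...instead of one per query
  }

private:
  bool processFunction(Function &F);    // reads the cached AA/TD members
};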
Chris Lattner
ee7fbed62d
add the rest of the comparison routines.
llvm-svn: 60303
2008-11-30 19:10:41 +00:00
Bill Wendling
66a7442059
Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling
3e27ac16a6
Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.
llvm-svn: 60290
2008-11-30 13:08:13 +00:00
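Both identities from the two commits above (r60291 and r60290) also verify by exhaustive enumeration, e.g. over all 8-bit pairs:

#include <cassert>
#include <cstdint>

int main() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t a = A, b = B;
      // (A & ~B) | (~A & B) == A ^ B  -- the definition of XOR
      assert((uint8_t)((a & ~b) | (~a & b)) == (uint8_t)(a ^ b));
      // A & (~A | B) == A & B         -- absorption
      assert((uint8_t)(a & (~a | b)) == (uint8_t)(a & b));
    }
  return 0;
}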
Bill Wendling
92ebd6902d
Forgot one remaining call to getSExtValue().
llvm-svn: 60289
2008-11-30 12:41:09 +00:00
Bill Wendling
97ad688c1b
getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use
APInt operations throughout instead.
This fixes PR3144.
llvm-svn: 60288
2008-11-30 12:38:24 +00:00
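The constraint behind this: APInt::getSExtValue() asserts unless the value fits in an int64_t, so any path that can see wide ConstantInts (i128 and up) must stay in APInt arithmetic end to end. A hedged sketch using current APInt spellings (isAllOnes was isAllOnesValue in older trees):

#include "llvm/ADT/APInt.h"

using llvm::APInt;

// All of these are safe at any bit width; round-tripping through
// getSExtValue() would assert once the value needs more than 64 bits.
bool isMinusOne(const APInt &V) { return V.isAllOnes(); }
bool isMinSigned(const APInt &V) { return V.isMinSignedValue(); }
APInt negated(const APInt &V) { return -V; } // 2's-complement negation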
Eli Friedman
2bc3921ce2
Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.
llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Eli Friedman
3b8efd50d7
A couple small cleanups, plus a new potential optimization.
llvm-svn: 60286
2008-11-30 07:52:27 +00:00
Eli Friedman
97d37825f1
Moving potential optimizations out of PR2330 into lib/Target/README.txt.
Hopefully this isn't too much stuff to dump into this file.
llvm-svn: 60285
2008-11-30 07:36:04 +00:00
Eli Friedman
ccdfdbfc99
Followup to r60283: optimize arbitrary width signed divisions as well
as unsigned divisions. Same caveats as before.
llvm-svn: 60284
2008-11-30 06:35:39 +00:00
Eli Friedman
d7a261120f
Fix for PR2164: allow transforming arbitrary-width unsigned divides into
multiplies.
Some more cleverness would be nice, though. It would be good if we
could do this transformation on illegal types. Also, we would
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.
llvm-svn: 60283
2008-11-30 06:02:26 +00:00
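Concretely, "divide becomes multiply" means choosing a fixed-point reciprocal (a "magic number") and a shift such that the truncated product equals the quotient for every dividend; the commit's note about preferring narrower constants is the cost visible here, since the product needs a widening 32x32->64 multiply. A standalone check of the well-known pair for x/10 from Hacker's Delight (M = 0xCCCCCCCD, shift = 35):

#include <cassert>
#include <cstdint>

// x / 10 for any 32-bit unsigned x, as a multiply plus a shift.
uint32_t div10(uint32_t x) {
  return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

int main() {
  assert(div10(0) == 0 && div10(9) == 0 && div10(10) == 1);
  assert(div10(0xFFFFFFFFu) == 0xFFFFFFFFu / 10);
  for (uint32_t x = 0; x < 1000000; ++x) // spot-check a dense range
    assert(div10(x) == x / 10);
  return 0;
}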
Bill Wendling
115290ddd3
Don't make TwoToExp signed by default.
llvm-svn: 60279
2008-11-30 05:29:33 +00:00