1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

26281 Commits

Author SHA1 Message Date
Chris Lattner
d7df2dce20 use hte new pred cache to speed up the new non-local memdep
queries.  This speeds up GVN using the new queries (not yet
checked in) by just over 10%.

llvm-svn: 60743
2008-12-09 06:28:49 +00:00
Mon P Wang
0c011f8ba9 Fix getNode to allow a vector for the shift amount for shifts of vectors.
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.

llvm-svn: 60740
2008-12-09 05:46:39 +00:00
Chris Lattner
e32dbaddd2 Fix a really subtle off-by-one bug that Duncan noticed with valgrind
on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad.

llvm-svn: 60739
2008-12-09 04:47:21 +00:00
Scott Michel
cf7ec43939 CellSPU:
- Change default scheduling preference to list-burr, which produces somewhat
  better code than the default. Could also use list-tdrr, but need to ask
  dev list about the appropriate handy mnemonic before commiting.

llvm-svn: 60738
2008-12-09 03:37:19 +00:00
Bill Wendling
7250a29def Add initial support for fast-isel of the [SU]ADDO intrinsics. It isn't
complete. For instance, it lowers the common case into this less-than-optimal
code:

        addl    %ecx, %eax
        seto    %cl
        testb   %cl, %cl
        jne     LBB1_2  ## overflow

instead of:

        addl    %ecx, %eax
        jo      LBB1_2  ## overflow

That will come in a future commit.

llvm-svn: 60737
2008-12-09 02:42:50 +00:00
Dan Gohman
e99b76aa24 Don't charge full latency for an anti-dependence, in this simplistic
pipeline model.

llvm-svn: 60733
2008-12-09 00:26:46 +00:00
Dan Gohman
9e5cc22129 Fix a couple of mistaken switch case fall-throughs. Thanks to Bill
for spotting these!

llvm-svn: 60728
2008-12-08 23:50:06 +00:00
Chris Lattner
23e2ac8894 remove DebugIterations option. Despite the accusations,
jump threading has been shown to only expose problems not
have bugs itself.  I'm sure it's completely bug free! ;-)

llvm-svn: 60725
2008-12-08 22:44:07 +00:00
Evan Cheng
3bb2ad8a0a Re-apply 60689 now my head is screwed on right.
llvm-svn: 60711
2008-12-08 19:29:03 +00:00
Dan Gohman
6f3258586e Fix the top-level comments, and fix some 80-column violations.
llvm-svn: 60707
2008-12-08 17:50:35 +00:00
Dan Gohman
5bca97fc4f Revert 60689. It caused many regressions on Darwin targets.
llvm-svn: 60705
2008-12-08 17:38:02 +00:00
Devang Patel
a8d0117253 Fix spelling.
Thanks Duncan!

llvm-svn: 60702
2008-12-08 17:07:24 +00:00
Devang Patel
82fb6bc606 Undo previous patch.
llvm-svn: 60701
2008-12-08 17:02:37 +00:00
Duncan Sands
982c6ed1d9 Fix comment typo.
llvm-svn: 60697
2008-12-08 14:01:59 +00:00
Dan Gohman
14d4094968 Factor out the code for sign-extending/truncating gep indices
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.

llvm-svn: 60696
2008-12-08 07:57:47 +00:00
Chris Lattner
7307ef0ba3 add another level of caching for non-local pointer queries, keeping
track of whether the CachedNonLocalPointerInfo for a block is specific
to a block.  If so, just return it without any pred scanning.  This is
good for a 6% speedup on GVN (when it uses this lookup method, which
it doesn't right now).

llvm-svn: 60695
2008-12-08 07:31:50 +00:00
Chris Lattner
9020891916 consistency
llvm-svn: 60694
2008-12-08 07:21:39 +00:00
Chris Lattner
6ab4673c40 introduce a new RoundUpAlignment helper function, use it to
remove some more 64-bit divs and rems from the StructLayout 
ctor.

llvm-svn: 60692
2008-12-08 07:11:56 +00:00
Dan Gohman
f187690987 Make ConstantAggregateZero::get return a ConstantAggregateZero*,
as suggested in PR3182.

llvm-svn: 60691
2008-12-08 07:10:54 +00:00
Dan Gohman
84876ffe23 Update CPP backend for PrintModulePass API changes.
llvm-svn: 60690
2008-12-08 07:07:24 +00:00
Evan Cheng
d668dd83c0 Perform cheap checks first.
llvm-svn: 60689
2008-12-08 06:52:43 +00:00
Chris Lattner
494eb47570 Some minor optimizations for isObjectSmallerThan.
llvm-svn: 60687
2008-12-08 06:28:54 +00:00
Dan Gohman
e4b1a93573 Move createVirtualRegister out-of-line.
llvm-svn: 60684
2008-12-08 04:54:11 +00:00
Dan Gohman
7954facae5 Clarify some comments.
llvm-svn: 60683
2008-12-08 04:53:23 +00:00
Nick Lewycky
69eb224a5e Fixes for Visual Studio users. Patch by OvermindDL1 on llvm-dev!
llvm-svn: 60679
2008-12-08 00:45:02 +00:00
Chris Lattner
1ff38d6629 add an assert. the cast<> below would catch this but a message is more
useful.

llvm-svn: 60674
2008-12-07 18:45:15 +00:00
Chris Lattner
8cd875dac6 factor some code better.
llvm-svn: 60673
2008-12-07 18:42:51 +00:00
Chris Lattner
9ebcc276e4 factor some code, fixing some fixme's.
llvm-svn: 60672
2008-12-07 18:39:13 +00:00
Chris Lattner
16ea827dfd add support for caching pointer dependence queries. Nothing uses this yet
so it "can't" break anything.  That said, it does appear to work.

llvm-svn: 60654
2008-12-07 08:50:20 +00:00
Chris Lattner
a79a341f1e fix a bug I introduced in simplifycfg handling single entry phi
nodes. FoldSingleEntryPHINodes deletes the PHI, so there is no
need to delete it afterward.

llvm-svn: 60653
2008-12-07 07:22:45 +00:00
Owen Anderson
a5f2ce1ee3 Switch to top-down mode and fix a crasher this exposed caused by an error in the
live interval updating.

llvm-svn: 60652
2008-12-07 05:33:18 +00:00
Nick Lewycky
e277f75880 Fix typo, psuedo -> pseudo.
llvm-svn: 60651
2008-12-07 03:49:52 +00:00
Chris Lattner
35095d7722 Some internal refactoring to make it easier to cache results.
llvm-svn: 60650
2008-12-07 02:56:57 +00:00
Chris Lattner
ad82419b46 Introduce a new MemDep::getNonLocalPointerDependency
method.  This will eventually take over load/store dep
queries from getNonLocalDependency.  For now it works
fine, but is incredibly slow because it does no caching.
Lets not switch GVN to use it until that is fixed :)

llvm-svn: 60649
2008-12-07 02:15:47 +00:00
Chris Lattner
8e8a6b4ec3 push the "pointer case" up the analysis stack a bit. This causes
duplication of logic (in 2 places) to determine what pointer a 
load/store touches.  This will be addressed in a future commit.

llvm-svn: 60648
2008-12-07 01:50:16 +00:00
Chris Lattner
d14e6778c6 make clients have to know how to call getCallSiteDependencyFrom
instead of making getDependencyFrom do it.

llvm-svn: 60647
2008-12-07 01:21:14 +00:00
Chris Lattner
ddfcaff37c rename some variables for consistency
llvm-svn: 60644
2008-12-07 00:39:19 +00:00
Chris Lattner
e4c5f66b3b I love how using out of scope variables is not an error with GCC, no really I do.
llvm-svn: 60643
2008-12-07 00:38:27 +00:00
Chris Lattner
20b7d9667d Rename getCallSiteDependency -> getCallSiteDependencyFrom to
emphasize the scanning and make it more similar to 
getDependencyFrom
 

llvm-svn: 60642
2008-12-07 00:35:51 +00:00
Chris Lattner
dc8cf1fa91 a memdep query on a volatile load/store will always return
clobber with the current implementation.  Instead of returning
a "precise clobber" just return a fuzzy one.  This doesn't 
matter to any clients anyway and should speed up analysis time
very very slightly.

llvm-svn: 60641
2008-12-07 00:28:02 +00:00
Chris Lattner
135a48d48b don't bother touching volatile stores, they will just return clobber on
everything interesting anyway.

llvm-svn: 60640
2008-12-07 00:25:15 +00:00
Chris Lattner
a04521164c remove the ability to get memdep info for vaarg. I don't think the
original impl was correct and noone actually makes the query anyway.

llvm-svn: 60639
2008-12-07 00:21:18 +00:00
Chris Lattner
bd507e3e4d improve a note.
llvm-svn: 60636
2008-12-07 00:15:10 +00:00
Chris Lattner
1fa53e3e56 some more PRE/GVN/DSE related notes.
llvm-svn: 60633
2008-12-06 22:52:12 +00:00
Chris Lattner
00104cf8f8 add a note
llvm-svn: 60632
2008-12-06 22:49:05 +00:00
Chris Lattner
a87ff83a83 some random notes.
llvm-svn: 60624
2008-12-06 19:28:22 +00:00
Nick Lewycky
d33c83b1af Minor cleanup. Use dyn_cast, not isa/cast pairs. No functionality change.
llvm-svn: 60623
2008-12-06 17:57:05 +00:00
Evan Cheng
5c92d425a9 Clean up some ARM GV asm printing out; minor fixes to match what gcc does.
llvm-svn: 60621
2008-12-06 02:00:55 +00:00
Chris Lattner
022b15083b Reimplement the inner loop of DSE. It now uniformly uses getDependence(),
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase).  This eliminates the last external client 
of MemDep::getDependenceFrom().

llvm-svn: 60619
2008-12-06 00:53:22 +00:00
Dan Gohman
e2ee41d1d1 Don't use plain %x to print pointer values. I had changed it from %p
since %p isn't formatted consistently, but obviously plain %x is wrong.
PRIxPTR with a cast to uintptr_t would work here, but that requires
inconvenient build-system changes. %lu works on all current and
foreseable future hosts.

llvm-svn: 60616
2008-12-05 23:39:24 +00:00
Dale Johannesen
c6404f98b2 Forgot a file.
llvm-svn: 60609
2008-12-05 21:55:35 +00:00
Dale Johannesen
f5a072c388 Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Chris Lattner
2b5e1b5263 Make a few major changes to memdep and its clients:
1. Merge the 'None' result into 'Normal', making loads
   and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
   distinguish between the cases when memdep knows the value is
   produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
   are CSEs into memdep instead of it being in GVN.  This still
   leaves verification that the arguments are hte same to GVN to
   let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use 
   getModRefInfo(CallSite,CallSite) instead of doing something 
   very weak.  This only really matters for things like DSA, but
   someday maybe we'll have some other decent context sensitive
   analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
   a) readonly call CSE is slightly simpler
   b) I eliminated the "getDependencyFrom" chaining for load 
      elimination and load CSE doesn't have to worry about 
      volatile (they are always clobbers) anymore.
   c) GVN no longer does any 'lastLoad' caching, leaving it to 
      memdep.
7. The logic in DSE is simplified a bit and sped up.  A potentially
   unsafe case was eliminated.

llvm-svn: 60607
2008-12-05 21:04:20 +00:00
Dan Gohman
5e30c5b83b Demangle and pretty-print symbols in internal backtraces. Patch by
Wesley Peck, with a few fixes by me.

llvm-svn: 60605
2008-12-05 20:12:48 +00:00
Anton Korobeynikov
30085a6f51 Revert invalid r60393. It causes llvm-gcc bootstrap fails in release builds.
See PR3160 for details

llvm-svn: 60604
2008-12-05 19:38:49 +00:00
Chris Lattner
08ad59d631 Make it illegal to call getDependency* on non-memory instructions
like binary operators.

llvm-svn: 60600
2008-12-05 18:46:19 +00:00
Evan Cheng
03ef7cf749 Reason #3 from 60595 doesn't hold true. If we can fold a PIC load from constpool into a use, the rewrite happens at time of spill (not in VirtRegMap). Later on, if the GlobalBaseReg is spilled, the spiller can see the use uses GlobalBaseReg and do the right thing.
llvm-svn: 60596
2008-12-05 17:41:31 +00:00
Evan Cheng
144447bfa0 Effectively undo 60461 in PIC mode which simply transform V_SET0 / V_SETALLONES into a load from constpool in order to fold into restores. This is not safe to do when PIC base is being used for a number of reasons:
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).

llvm-svn: 60595
2008-12-05 17:23:48 +00:00
Chris Lattner
211146e709 Fix test/Transforms/GVN/pre-load.ll
llvm-svn: 60594
2008-12-05 17:04:12 +00:00
Evan Cheng
6879b66c9e Fix comment.
llvm-svn: 60592
2008-12-05 17:00:16 +00:00
Chris Lattner
35547ba5ca Make IsValueFullyAvailableInBlock safe.
llvm-svn: 60588
2008-12-05 07:49:08 +00:00
Dan Gohman
1e7dff35a6 Drop the reg argument to isRegReDefinedByTwoAddr, which was redundant.
llvm-svn: 60586
2008-12-05 05:45:42 +00:00
Dan Gohman
c157324a23 Teach StackSlotColoring to update MachineMemOperands when
changing the stack slots on an instruction, to keep them
consistent with the actual memory addresses.

llvm-svn: 60584
2008-12-05 05:31:14 +00:00
Dan Gohman
be3e0caacb Ignore IMPLICIT_DEF instructions when computing physreg liveness.
While they appear to provide a normal clobbering def, they don't
in the case of the awkward IMPLICIT_DEF+INSERT_SUBREG idiom. It
would be good to change INSERT_SUBREG; until then, this change
allows post-regalloc scheduling to cope in a mildly conservative
way.

llvm-svn: 60583
2008-12-05 05:30:02 +00:00
Evan Cheng
1b795803dd Re-did 60519. It turns out Darwin's handling of hidden visibility symbols are a bit more complicate than I expected. Both declarations and weak definitions still need a stub indirection. However, the stubs are in data section and they contain the addresses of the actual symbols.
llvm-svn: 60571
2008-12-05 01:06:39 +00:00
Ted Kremenek
a851e459e1 Have raw_fd_ostream keep track of the position in the file to make tell() go faster by not requiring a flush().
llvm-svn: 60560
2008-12-04 22:51:11 +00:00
Devang Patel
4fcea36b8b Rewrite code that 1) filters loops and 2) calculates new loop bounds.
This fixes many bugs. I will add more test cases in a separate check-in.

Some day, the code that manipulates CFG and updates dom. info could use refactoring help.

llvm-svn: 60554
2008-12-04 21:38:42 +00:00
Owen Anderson
9e2293bda3 Factor out some common code.
llvm-svn: 60553
2008-12-04 21:20:30 +00:00
Scott Michel
6e9747d2d6 CellSPU: Fix bug 3055
- Add v4f32, v2f64 to LowerVECTOR_SHUFFLE
- Look for vector rotate in shuffle elements, generate a vector rotate
  instead of a full-blown shuffle when opportunity presents itself.
- Generate larger test harness and fix a few interesting but obscure bugs.

llvm-svn: 60552
2008-12-04 21:01:44 +00:00
Duncan Sands
658b461a3c When allocating a stack temporary, use the correct
number of bytes for types such as i1 which are not
a multiple of 8 bits in length.

llvm-svn: 60543
2008-12-04 18:08:40 +00:00
Scott Michel
26d15f31ac Missing closing brace and reverse conditional condition on NDEBUG
llvm-svn: 60541
2008-12-04 17:16:59 +00:00
Chris Lattner
7b3576824a Start simplifying a switch that has a successor that is a switch.
llvm-svn: 60534
2008-12-04 06:31:07 +00:00
Chris Lattner
3acb266d60 This code is apparently quite confused. In the meantime,
get it building when NDEBUG is set.

llvm-svn: 60532
2008-12-04 06:14:27 +00:00
Bill Wendling
a0466523bd Temporarily revert r60519. It was causing a bootstrap failure:
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c ../../../llvm-gcc.src/libgomp/barrier.c  -fno-common -DPIC -o .libs/barrier.o
checking for sys/file.h... /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:non-relocatable subtraction expression, "_gomp_tls_key" minus "L1$pb"
/var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:symbol: "_gomp_tls_key" can't be undefined in a subtraction expression
make[4]: *** [barrier.lo] Error 1
make[4]: *** Waiting for unfinished jobs....
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT alloc.lo -MD -MP -MF .deps/alloc.Tpo -c ../../../llvm-gcc.src/libgomp/alloc.c -o alloc.o >/dev/null 2>&1
yes
checking for sys/param.h... make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-target-libgomp] Error 2
make[1]: *** Waiting for unfinished jobs....

llvm-svn: 60527
2008-12-04 04:07:00 +00:00
Scott Michel
1f907dd784 CellSPU:
- First patch from Nehal Desai, a new contributor at Aerospace. Nehal's patch
  fixes sign/zero/any-extending loads for integers and floating point. Example
  code, compiled w/o debugging or optimization where he first noticed the bug:

  int main(void) {
    float a = 99.0;
    printf("%d\n", a);
    return 0;
  }

  Verified that this code actually works on a Cell SPU.

Changes by Scott Michel:
- Fix bug in the value type list constructed by SPUISD::LDRESULT to include
  both the load result's result and chain, not just the chain alone.
- Simplify LowerLOAD and remove extraneous and unnecessary chains.
- Remove unused SPUISD pseudo instructions.

llvm-svn: 60526
2008-12-04 03:02:42 +00:00
Dan Gohman
6ff2c1234b Use register names instead of numbers in debug output.
llvm-svn: 60525
2008-12-04 02:15:26 +00:00
Dan Gohman
93e73ed7f2 Make debug output more informative.
llvm-svn: 60524
2008-12-04 02:14:57 +00:00
Evan Cheng
d4b7459179 Visibility hidden GVs do not require extra load of symbol address from the GOT or non-lazy-ptr.
llvm-svn: 60519
2008-12-04 01:56:50 +00:00
Dan Gohman
f8e215d4b1 Add minimal support for disambiguating memory references. Currently
the main thing this covers is spills to distinct spill slots.

llvm-svn: 60517
2008-12-04 01:35:46 +00:00
Chris Lattner
2677286c25 add a debugging option to help track down j-t problems.
llvm-svn: 60514
2008-12-04 00:07:59 +00:00
Dan Gohman
3836431ec6 Rewrite the liveness bookkeeping code to fix a bunch of
issues with subreg operands and tied operands.

llvm-svn: 60510
2008-12-03 23:07:27 +00:00
Dale Johannesen
0a0e2b1033 Make the debugging dump be a full line.
llvm-svn: 60509
2008-12-03 22:45:31 +00:00
Dale Johannesen
119036d435 Remove an unused field.
llvm-svn: 60508
2008-12-03 22:43:56 +00:00
Dan Gohman
0edbed16c7 Have PseudoSourceValue override Value::dump, so that it works
on PseudoSourceValue values. This also fixes a FIXME in
lib/VMCode/AsmWriter.cpp.

llvm-svn: 60507
2008-12-03 21:37:21 +00:00
Dale Johannesen
a0b1516bdc Fix a misspelled function name.
llvm-svn: 60506
2008-12-03 20:56:12 +00:00
Chris Lattner
420385f8c3 Factor some code into a new FoldSingleEntryPHINodes method.
llvm-svn: 60501
2008-12-03 19:44:02 +00:00
Dan Gohman
19b43e462f Fix an inconsistency in a comment.
llvm-svn: 60500
2008-12-03 19:38:38 +00:00
Evan Cheng
05ded29738 Use mmx (punpckldq VR64, (mmx_v_set0)) to clear high 32-bits of a VR64 register.
llvm-svn: 60499
2008-12-03 19:38:05 +00:00
Dan Gohman
af9b4a8a21 Don't charge the full latency for anti and output dependencies. This is
an area where eventually it would be good to use target-dependent
information.

llvm-svn: 60498
2008-12-03 19:37:34 +00:00
Dale Johannesen
6322cd40c6 A step towards geting linux ppc to work (see PR 3099)
llvm-svn: 60497
2008-12-03 19:33:10 +00:00
Dan Gohman
4f8709518d When looking for anti-dependences on the critical path, don't bother
examining non-anti-dependence edges.

llvm-svn: 60496
2008-12-03 19:32:26 +00:00
Dan Gohman
1020320a05 Add a comment about callee-saved registers.
llvm-svn: 60495
2008-12-03 19:30:13 +00:00
Dale Johannesen
a851280d26 Fix a really wrong comment.
llvm-svn: 60494
2008-12-03 19:25:46 +00:00
Dan Gohman
74529a2226 Split foldMemoryOperand into public non-virtual and protected virtual
parts, and add target-independent code to add/preserve
MachineMemOperands.

llvm-svn: 60488
2008-12-03 18:43:12 +00:00
Dan Gohman
5dad0993a9 Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning.
llvm-svn: 60487
2008-12-03 18:15:48 +00:00
Dan Gohman
fc05cdda64 Extend X86's addFrameReference to add a MachineMemOperand for
the frame reference. This will help post-RA scheduling determine
that spills to distinct stack slots are independent.

llvm-svn: 60486
2008-12-03 18:11:40 +00:00
Dan Gohman
6be47e9542 Update a comment.
llvm-svn: 60484
2008-12-03 17:10:41 +00:00
Duncan Sands
fbc8da66d6 Only check that the result of the mapping was not
a new node if the node was actually remapped.

llvm-svn: 60482
2008-12-03 12:36:16 +00:00
Rafael Espindola
0c800cf35e Fix bug 3140.
Print a single parameter .file directive if we have an ELF target.

llvm-svn: 60480
2008-12-03 11:01:37 +00:00
Richard Osborne
e74ae9dbb7 Add support for ISD::TRAP to the XCore backend
llvm-svn: 60479
2008-12-03 10:59:16 +00:00
Evan Cheng
440e75e1d5 Refactor code. No functionality change.
llvm-svn: 60478
2008-12-03 08:38:43 +00:00
Bill Wendling
d2208d570b CC should only be a ConstantSDNode at this point. Just use 'cast' instead of 'dyn_cast'.
llvm-svn: 60477
2008-12-03 08:32:02 +00:00
Chris Lattner
f00b2f3fb4 Teach jump threading some more simple tricks:
1) have it fold "br undef", which does occur with
   surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks.  This removes the successor
   edges, reducing the in-edges of other blocks, allowing 
   recursive simplification.
3) Fold things like:
     br COND, BBX, BBY
  BBX:
     br COND, BBZ, BBW

   which also happens because jump threading iterates.

llvm-svn: 60470
2008-12-03 07:48:08 +00:00
Chris Lattner
29326a6d1f third time is the charm.
llvm-svn: 60469
2008-12-03 07:45:15 +00:00
Chris Lattner
d03c1b5440 fix assertion.
llvm-svn: 60468
2008-12-03 07:43:05 +00:00
Chris Lattner
7a00825f57 Rename DeleteBlockIfDead to DeleteDeadBlock and make it
unconditionally delete the block.  All likely clients will
do the checking anyway.

llvm-svn: 60464
2008-12-03 06:40:52 +00:00
Chris Lattner
12c3938837 Factor some code out of SimplifyCFG, forming a new
DeleteBlockIfDead method.

llvm-svn: 60463
2008-12-03 06:37:44 +00:00
Dan Gohman
ac6561793c Mark x86's V_SET0 and V_SETALLONES with isSimpleLoad, and teach X86's
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.

Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).

llvm-svn: 60461
2008-12-03 05:21:24 +00:00
Dan Gohman
6333d48459 Add a sanity-check to tablegen to catch the case where isSimpleLoad
is set but mayLoad is not set. Fix all the problems this turned up.

Change code to not use isSimpleLoad instead of mayLoad unless it
really wants isSimpleLoad.

llvm-svn: 60459
2008-12-03 02:30:17 +00:00
Dan Gohman
18c4a4c9ea Fix a missing #include.
llvm-svn: 60458
2008-12-03 02:10:00 +00:00
Dan Gohman
86b0a220af Fix this comment to reflect that it applies to types other
than just i32.

llvm-svn: 60455
2008-12-03 01:39:44 +00:00
Dan Gohman
dcd4896f12 Fix byval arguments in the fastcc calling convention. The fastcc convention
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.

llvm-svn: 60453
2008-12-03 01:28:04 +00:00
Evan Cheng
a77559c870 Remove a (what appears to be) overly strict assertion. Here is what happened:
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.

llvm-svn: 60443
2008-12-02 21:57:09 +00:00
Dale Johannesen
e06fb96c43 Minor rewrite per review feedback.
llvm-svn: 60442
2008-12-02 21:17:11 +00:00
Scott Michel
69c9d01241 Non-functional change: make custom lowering for truncate stylistically
consistent with the way it's generally done in other places.

llvm-svn: 60439
2008-12-02 19:55:08 +00:00
Scott Michel
e0bbe7afb7 CellSPU:
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
  at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
  they're promoted for use in printf.

llvm-svn: 60438
2008-12-02 19:53:53 +00:00
Dan Gohman
0834d959e9 Fix a typo in a comment.
llvm-svn: 60434
2008-12-02 19:27:20 +00:00
Owen Anderson
afbd11e227 Add support for folding spills into preceding defs when doing pre-alloc splitting.
llvm-svn: 60433
2008-12-02 18:53:47 +00:00
Dale Johannesen
c9123e12e3 One more transformation.
llvm-svn: 60432
2008-12-02 18:40:40 +00:00
Dale Johannesen
531472926e Make the code do what the comment says it does.
llvm-svn: 60431
2008-12-02 18:40:09 +00:00
Chris Lattner
2f766df51f Comment typeo fix, thanks Duncan!
llvm-svn: 60429
2008-12-02 18:33:11 +00:00
Tilmann Scheller
14310949e3 make it possible to custom lower TRUNCATE (needed for the CellSPU target)
llvm-svn: 60409
2008-12-02 12:12:25 +00:00
Chris Lattner
2a9747548e Implement PRE of loads in the GVN pass with a pretty cheap and
straight-forward implementation.  This does not require any extra
alias analysis queries beyond what we already do for non-local loads.

Some programs really really like load PRE.  For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.

The biggest limitation to the implementation is that it does not split
critical edges.  This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.

The implementation of this should incidentally speed up rejection of 
non-local loads because it avoids creating the repl densemap in cases 
when it won't be used for fully redundant loads.

This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact.  This is pretty close to ready though.

llvm-svn: 60408
2008-12-02 08:16:11 +00:00
Nick Lewycky
81635468b3 Add a new SCEV representing signed division.
llvm-svn: 60407
2008-12-02 08:05:48 +00:00
Mon P Wang
7543e266ad Removed some unnecessary code in widening.
llvm-svn: 60406
2008-12-02 07:35:08 +00:00
Chris Lattner
90d1a3c343 add a little helper function that does PHI translation.
llvm-svn: 60405
2008-12-02 07:16:45 +00:00
Chris Lattner
40462e032c add a note
llvm-svn: 60404
2008-12-02 06:32:34 +00:00
Bill Wendling
f5798b5d6c Remove some errors that crept in. No functionality change.
llvm-svn: 60403
2008-12-02 06:24:20 +00:00
Bill Wendling
9981b7bcdc Merge two if-statements into one.
llvm-svn: 60402
2008-12-02 06:22:04 +00:00
Bill Wendling
109da8c135 More styalistic changes. No functionality change.
llvm-svn: 60401
2008-12-02 06:18:11 +00:00
Bill Wendling
654cc91c36 - Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a
constant. If X is a constant, then this is folded elsewhere.

- Added a note to Target/README.txt to indicate that we'd like to implement
  this when we're able.

llvm-svn: 60399
2008-12-02 05:12:47 +00:00
Bill Wendling
a60e3e3539 Improve comment.
llvm-svn: 60398
2008-12-02 05:09:00 +00:00
Bill Wendling
e319ca5f21 - Reduce nesting.
- No need to do a swap on a canonicalized pattern.

No functionality change.

llvm-svn: 60397
2008-12-02 05:06:43 +00:00
Chris Lattner
cc084722e7 some random comment improvements.
llvm-svn: 60395
2008-12-02 04:52:26 +00:00
Owen Anderson
92e405b332 Fix an issue that Chris noticed, where local PRE was not properly instantiating
a new value numbering set after splitting a critical edge.  This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.

llvm-svn: 60393
2008-12-02 04:09:22 +00:00
Evan Cheng
39d7e00ff9 Fix PR3124: overly strict assert.
llvm-svn: 60392
2008-12-02 02:15:36 +00:00
Dale Johannesen
29fb1bf708 Add a few more transformations.
llvm-svn: 60391
2008-12-02 01:30:54 +00:00
Bill Wendling
580f12ae30 Second stab at target-dependent lowering of everyone's favorite nodes: [SU]ADDO
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
  EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
  depending on the type of ADDO requested.

- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
  COND_C set.

llvm-svn: 60388
2008-12-02 01:06:39 +00:00
Bill Wendling
039240b301 Reapply r60382. This time, don't mark "ADC" nodes with "implicit EFLAGS".
llvm-svn: 60385
2008-12-02 00:07:05 +00:00
Bill Wendling
16840cba04 Temporarily revert r60382. It caused CodeGen/X86/i2k.ll and others to fail.
llvm-svn: 60383
2008-12-01 23:44:08 +00:00
Bill Wendling
628848b540 - Have "ADD" instructions return an implicit EFLAGS.
- Add support for seto, setno, setc, and setnc instructions.

llvm-svn: 60382
2008-12-01 23:30:42 +00:00
Bill Wendling
dff9b81623 Expand getVTList, getNodeValueTypes, and SelectNodeTo to handle more value types.
llvm-svn: 60381
2008-12-01 23:28:22 +00:00
Dale Johannesen
f4362aae8c Consider only references to an IV within the loop when
figuring out the base of the IV.  This produces better
code in the example.  (Addresses use (IV) instead of 
(BASE,IV) - a significant improvement on low-register
machines like x86).

llvm-svn: 60374
2008-12-01 22:00:01 +00:00
Bill Wendling
33f3e77a5b Don't rebuild RHSNeg. Just use the one that's already there.
llvm-svn: 60370
2008-12-01 21:06:30 +00:00
Bill Wendling
d436da480d Document what this check is doing. Also, no need to cast to ConstantInt.
llvm-svn: 60369
2008-12-01 21:03:43 +00:00
Bill Wendling
1e4fb7a143 Use a simple comparison. Overflow on integer negation can only occur when the
integer is "minint".

llvm-svn: 60366
2008-12-01 19:46:27 +00:00
Scott Michel
cf677b5a67 CellSPU:
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
  (add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32

llvm-svn: 60358
2008-12-01 17:56:02 +00:00
Duncan Sands
5de8739964 There are no longer any places that require a
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.

llvm-svn: 60349
2008-12-01 11:41:29 +00:00
Duncan Sands
1fae2ea219 Change the interface to the type legalization method
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.

llvm-svn: 60348
2008-12-01 11:39:25 +00:00
Bill Wendling
48b7cbbc01 Generalize the FoldOrWithConstant method to fold for any two constants which
don't have overlapping bits.

llvm-svn: 60344
2008-12-01 08:32:40 +00:00
Bill Wendling
2a182b838d Reduce copy-and-paste code by splitting out the code into its own function.
llvm-svn: 60343
2008-12-01 08:23:25 +00:00
Bill Wendling
a6e7dd2299 Use m_Specific() instead of double matching.
llvm-svn: 60341
2008-12-01 08:09:47 +00:00
Bill Wendling
8e484e9556 Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to.
llvm-svn: 60340
2008-12-01 07:47:02 +00:00
Chris Lattner
3b908483b7 Rename some variables, only increment BI once at the start of the loop instead of throughout it.
llvm-svn: 60339
2008-12-01 07:35:54 +00:00
Chris Lattner
c6e6eaf6d3 pull the predMap densemap out of the inner loop of performPRE, so
that it isn't reallocated all the time.  This is a tiny speedup for
GVN: 3.90->3.88s

llvm-svn: 60338
2008-12-01 07:29:03 +00:00
Chris Lattner
f72f8e3b74 switch a couple more calls to use array_pod_sort.
llvm-svn: 60337
2008-12-01 06:52:57 +00:00
Chris Lattner
80d0eff786 Introduce a new array_pod_sort function and switch LSR to use it
instead of std::sort.  This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.

We should start using array_pod_sort where possible.

llvm-svn: 60335
2008-12-01 06:49:59 +00:00
Chris Lattner
74f1e6d3ec Eliminate use of setvector for the DeadInsts set, just use a smallvector.
This is a lot cheaper and conceptually simpler.

llvm-svn: 60332
2008-12-01 06:27:41 +00:00
Chris Lattner
db86ff62f9 DeleteTriviallyDeadInstructions is always passed the
DeadInsts ivar, just use it directly.

llvm-svn: 60330
2008-12-01 06:14:28 +00:00
Chris Lattner
d6be279b4d simplify DeleteTriviallyDeadInstructions again, unlike my previous
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then 
notifying.

llvm-svn: 60329
2008-12-01 06:11:32 +00:00
Chris Lattner
e6c7ed156f simplify these patterns using m_Specific. No need to grep for
xor in testcase (or is a substring).

llvm-svn: 60328
2008-12-01 05:16:26 +00:00
Chris Lattner
c92e1e104b Teach jump threading to clean up after itself, DCE and constfolding the
new instructions it simplifies.  Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block.  Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.

llvm-svn: 60327
2008-12-01 04:48:07 +00:00
Chris Lattner
ceeb559995 The PreVerifier pass preserves everything. In practice, this
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.

llvm-svn: 60326
2008-12-01 03:58:38 +00:00
Chris Lattner
13942f82c4 Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs
instead of using FoldPHIArgBinOpIntoPHI.  In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices.  This prevented instcombine
from factoring big chunks of code in 403.gcc.  For example:

 insn_cuid.exit:                
-       %tmp336 = load i32** @uid_cuid, align 4      
-       %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3    
-       %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*               
-       %tmp339 = load i32* %tmp338, align 4           
-       %tmp340 = getelementptr i32* %tmp336, i32 %tmp339     
        br label %bb62
 
 bb61:       
-       %tmp341 = load i32** @uid_cuid, align 4     
-       %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3        
-       %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*           
-       %tmp344 = load i32* %tmp343, align 4        
-       %tmp345 = getelementptr i32* %tmp341, i32 %tmp344          
        br label %bb62
 
 bb62:      
-       %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]         
+       %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]         
+       %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3     
+       %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*  
+       %tmp341.pn = load i32** @uid_cuid     
+       %tmp344.pn = load i32* %tmp344.pn.in 
+       %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn   
        %iftmp.62.0 = load i32* %iftmp.62.0.in     

llvm-svn: 60325
2008-12-01 03:42:51 +00:00
Chris Lattner
0e03e40a76 Teach inst combine to merge GEPs through PHIs. This is really
important because it is sinking the loads using the GEPs, but
not the GEPs themselves.  This triggers 647 times on 403.gcc
and makes the .s file much much nicer.  For example before:

        je      LBB1_87 ## bb78
LBB1_62:        ## bb77
        leal    84(%esi), %eax
LBB1_63:        ## bb79
        movl    (%eax), %eax
...
LBB1_87:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
        jmp     LBB1_62 ## bb77


after:

        jne     LBB1_63 ## bb79
LBB1_62:        ## bb78
        movl    $0, 4(%esp)
        movl    %esi, (%esp)
        call    L_make_decl_rtl$stub
LBB1_63:        ## bb79
        movl    84(%esi), %eax

The input code was (and the GEPs are merged and
the PHI is now eliminated by instcombine):

        br i1 %tmp233, label %bb78, label %bb77
bb77:           
        %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb78:           
        call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind
        %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22              
        br label %bb79
bb79:           
        %iftmp.12.0.in = phi %struct.rtx_def** [ %tmp235, %bb78 ], [ %tmp234, %bb77 ]           
        %iftmp.12.0 = load %struct.rtx_def** %iftmp.12.0.in             

llvm-svn: 60322
2008-12-01 02:34:36 +00:00
Chris Lattner
c1adf6fc51 Make GVN be more intelligent about redundant load
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal.  This makes load elimination
more aggressive.  For example, on 403.gcc, we had:

<     68 gvn    - Number of instructions PRE'd
< 152718 gvn    - Number of instructions deleted
<  49699 gvn    - Number of loads deleted
<   6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses

now we have:

>     64 gvn    - Number of instructions PRE'd
> 153623 gvn    - Number of instructions deleted
>  49856 gvn    - Number of loads deleted
>   5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses

That's an extra 157 loads deleted and extra 905 other instructions nuked.

This slows down GVN very slightly, from 3.91 to 3.96s.

llvm-svn: 60314
2008-12-01 01:31:36 +00:00
Chris Lattner
bd1bc4a75e Reimplement the non-local dependency data structure in terms of a sorted
vector instead of a densemap.  This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster.  This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc

This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case.  Here's the stats
for 403.gcc:

  6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses

yay for caching :)

llvm-svn: 60313
2008-12-01 01:15:42 +00:00
Bill Wendling
23684a026c Implement ((A|B)&1)|(B&-2) -> (A&1) | B transformation. This also takes care of
permutations of this pattern.

llvm-svn: 60312
2008-12-01 01:07:11 +00:00
Chris Lattner
1f8482ffc8 Cache analyses in ivars and add some useful DEBUG output.
This speeds up GVN from 4.0386s to 3.9376s.

llvm-svn: 60310
2008-12-01 00:40:32 +00:00
Chris Lattner
77908d9ccf improve indentation, do cheap checks before expensive ones,
remove some fixme's.  This speeds up GVN very slightly on 403.gcc 
(4.06->4.03s)

llvm-svn: 60309
2008-11-30 23:39:23 +00:00
Chris Lattner
36257aabe4 Eliminate the DepResultTy abstraction. It is now completely
redundant with MemDepResult, and MemDepResult has a nicer interface.

llvm-svn: 60308
2008-11-30 23:17:19 +00:00
Eli Friedman
052df7e062 Minor cleanup: use getTrue and getFalse where appropriate. No
functional change.

llvm-svn: 60307
2008-11-30 22:48:49 +00:00
Eli Friedman
8da9f2f8d3 Some minor cleanups to instcombine; no functionality change.
Note that the FoldOpIntoPhi call is dead because it's impossible for the 
first operand of a subtraction to be both a ConstantInt and a PHINode.

llvm-svn: 60306
2008-11-30 21:09:11 +00:00
Chris Lattner
9f7facc8eb Cache TargetData/AliasAnalysis in the pass instead of calling
getAnalysis<>.  getAnalysis<> is apparently extremely expensive.
Doing this speeds up GVN on 403.gcc by 16%!

llvm-svn: 60304
2008-11-30 19:24:31 +00:00
Bill Wendling
66a7442059 Add instruction combining for ((A&~B)|(~A&B)) -> A^B and all permutations.
llvm-svn: 60291
2008-11-30 13:52:49 +00:00
Bill Wendling
3e27ac16a6 Implement (A&((~A)|B)) -> A&B transformation in the instruction combiner. This
takes care of all permutations of this pattern.

llvm-svn: 60290
2008-11-30 13:08:13 +00:00
Bill Wendling
92ebd6902d Forgot one remaining call to getSExtValue().
llvm-svn: 60289
2008-11-30 12:41:09 +00:00
Bill Wendling
97ad688c1b getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all
APInt calls instead.

This fixes PR3144.

llvm-svn: 60288
2008-11-30 12:38:24 +00:00
Eli Friedman
2bc3921ce2 Optimize memmove and memset into the LLVM builtins. Note that these
only show up in code from front-ends besides llvm-gcc, like clang.

llvm-svn: 60287
2008-11-30 08:32:11 +00:00
Eli Friedman
3b8efd50d7 A couple small cleanups, plus a new potential optimization.
llvm-svn: 60286
2008-11-30 07:52:27 +00:00
Eli Friedman
97d37825f1 Moving potential optimizations out of PR2330 into lib/Target/README.txt.
Hopefully this isn't too much stuff to dump into this file.

llvm-svn: 60285
2008-11-30 07:36:04 +00:00
Eli Friedman
ccdfdbfc99 Followup to r60283: optimize arbitrary width signed divisions as well
as unsigned divisions.  Same caveats as before.

llvm-svn: 60284
2008-11-30 06:35:39 +00:00
Eli Friedman
d7a261120f Fix for PR2164: allow transforming arbitrary-width unsigned divides into
multiplies.

Some more cleverness would be nice, though. It would be nice if we 
could do this transformation on illegal types.  Also, we would 
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.

llvm-svn: 60283
2008-11-30 06:02:26 +00:00
Bill Wendling
115290ddd3 Don't make TwoToExp signed by default.
llvm-svn: 60279
2008-11-30 05:29:33 +00:00
Bill Wendling
4e018f4c22 From Hacker's Delight:
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."

In this case, x == -1.

llvm-svn: 60278
2008-11-30 05:01:05 +00:00
Eli Friedman
0ef5e1dc82 APIntify a test which is potentially unsafe otherwise, and fix the
nearby FIXME.

I'm not sure what the right way to fix the Cell test was; if the 
approach I used isn't okay, please let me know.

llvm-svn: 60277
2008-11-30 04:59:26 +00:00
Bill Wendling
ac11f7d37e Instcombine was illegally transforming -X/C into X/-C when either X or C
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.

llvm-svn: 60275
2008-11-30 03:42:12 +00:00
Chris Lattner
baf3cdd3a1 Two changes: Make getDependency remove QueryInst for a dirty record's
ReverseLocalDeps when we update it.  This fixes a regression test
failure from my last commit.

Second, for each non-local cached information structure, keep a bit that
indicates whether it is dirty or not.  This saves us a scan over the whole
thing in the common case when it isn't dirty.

llvm-svn: 60274
2008-11-30 02:52:26 +00:00
Chris Lattner
90904fda3c introduce a typedef, no functionality change.
llvm-svn: 60272
2008-11-30 02:30:50 +00:00
Chris Lattner
a772cad6ff Change NonLocalDeps to be a densemap of pointers to densemap
instead of containing them by value.  This increases the density
(!) of NonLocalDeps as well as making the reallocation case 
faster.  This speeds up gvn on 403.gcc by 2% and makes room for
future improvements.

I'm not super thrilled with having to explicitly manage the new/delete
of the map, but it is necesary for the next change.

llvm-svn: 60271
2008-11-30 02:28:25 +00:00
Chris Lattner
2e3c1d37d9 calls never depend on allocations.
llvm-svn: 60268
2008-11-30 01:44:00 +00:00
Chris Lattner
2f7da36732 Fix a fixme by making memdep's handling of allocations more logical.
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' depedency instead of "Normal".
This tweaks GVN to do its optimization with the new result.

llvm-svn: 60267
2008-11-30 01:39:32 +00:00
Chris Lattner
ae0e214d84 implement a fixme by introducing a new getDependencyFromInternal
method that returns its result as a DepResultTy instead of as a
MemDepResult.  This reduces conversion back and forth.

llvm-svn: 60266
2008-11-30 01:26:32 +00:00
Chris Lattner
20e7b9bf7e Move the getNonLocalDependency method to a more logical place in
the file, no functionality change.

llvm-svn: 60265
2008-11-30 01:18:27 +00:00
Chris Lattner
9be5c5c763 REmove an old fixme, resolve another fixme by adding liberal
comments about what this class does.

llvm-svn: 60264
2008-11-30 01:17:08 +00:00
Chris Lattner
214e6decde remove a bit of incorrect code that tried to be tricky about speeding up
dependencies.  The basic situation was this: consider if we had:

  store1
  ...
  store2
  ...
  store3

Where memdep thinks that store3 depends on store2 and store2 depends 
on store1.  The problem happens when we delete store2: The code in 
question was updating dep info for store3 to be store1.  This is a
spiffy optimization, but is not safe at all, because aliasing isn't
transitive.  This bug isn't exposed today with DSE because DSE will only
zap store2 if it is identifical to store 3, and in this case, it is 
safe to update it to depend on store1.  However, memcpyopt is not so
fortunate, which is presumably why the "dropInstruction" code used to
exist.

Since this doesn't actually provide a speedup in practice, just rip the
code out.

llvm-svn: 60263
2008-11-30 01:09:30 +00:00
Chris Lattner
adf33d42ed Eliminate the dropInstruction method, which is not needed any more.
Fix a subtle iterator invalidation bug I introduced in the last commit.

llvm-svn: 60258
2008-11-29 23:30:39 +00:00
Chris Lattner
6d2b59a58f implement some fixme's: when deleting an instruction with
an entry in the nonlocal deps map, don't reset entries
referencing that instruction to [dirty, null], instead, set
them to [dirty,next] where next is the instruction after the
deleted one.  Use this information in the non-local deps
code to avoid rescanning entire blocks.

This speeds up GVN slightly by avoiding pointless work.  On
403.gcc this makes GVN 1.5% faster. 

llvm-svn: 60256
2008-11-29 22:02:15 +00:00
Chris Lattner
3e86ec7289 Change MemDep::getNonLocalDependency to return its results as
a smallvector instead of a DenseMap.  This speeds up GVN by 5%
on 403.gcc.

llvm-svn: 60255
2008-11-29 21:33:22 +00:00
Chris Lattner
b19c015a87 move MemoryDependenceAnalysis::verifyRemoved to the end of the file,
no functionality/code change.

llvm-svn: 60254
2008-11-29 21:25:10 +00:00