1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00
Commit Graph

2027 Commits

Author SHA1 Message Date
Bob Wilson
8d4a8b9370 Implement NEON vst1 instruction.
llvm-svn: 75037
2009-07-08 20:32:02 +00:00
Chris Lattner
1eeea0d262 add some more check for vector compares.
llvm-svn: 75024
2009-07-08 18:51:25 +00:00
Chris Lattner
182817004d convert a test to "FileCheck" style.
llvm-svn: 75023
2009-07-08 18:48:24 +00:00
Bob Wilson
3809b333de Implement NEON vld1 instructions.
llvm-svn: 75019
2009-07-08 18:11:30 +00:00
David Goodwin
d19a9aa207 Add rev16 test... xfail for now
llvm-svn: 75012
2009-07-08 16:15:06 +00:00
David Goodwin
5bdef4b3f7 Checkpoint Thumb2 Instr info work. Generalized base code so that it can be shared between ARM and Thumb2. Not yet activated because register information must be generalized first.
llvm-svn: 75010
2009-07-08 16:09:28 +00:00
Chris Lattner
c153b998d9 eliminate the v[if]cmp versions of these tests, now that [if]cmp+sext works.
llvm-svn: 74980
2009-07-08 00:49:35 +00:00
Chris Lattner
2939f0a318 Change these tests to use [fi]cmp+sext instead of v[fi]cmp. No
functionality change.

llvm-svn: 74979
2009-07-08 00:46:57 +00:00
Chris Lattner
ea7bd9b484 dag combine sext(setcc) -> vsetcc before legalize. To make this safe,
VSETCC must define all bits, which is different than it was documented
to before.  Since all targets that implement VSETCC already have this
behavior, and we don't optimize based on this, just change the 
documentation.  We now get nice code for vec_compare.ll

llvm-svn: 74978
2009-07-08 00:31:33 +00:00
Chris Lattner
e2435c4f6f add support for legalizing an icmp where the result is illegal (4xi1) but
the input is legal (4 x i32)

llvm-svn: 74964
2009-07-07 23:03:54 +00:00
Chris Lattner
1de1b3155b add a trivial test that vector compares work.
llvm-svn: 74963
2009-07-07 22:51:09 +00:00
Chris Lattner
a754f344b2 implement support for spliting and scalarizing vector setcc's. This
finishes off enough support for vector compares to get the icmp/fcmp
version of 2008-07-23-VSetCC.ll passing.

llvm-svn: 74961
2009-07-07 22:47:46 +00:00
Chris Lattner
573a3eeda2 verify that the fcmp version of this works just as well as the
vfcmp version.  We actually get better code for this silly testcase.

llvm-svn: 74954
2009-07-07 22:07:47 +00:00
Evan Cheng
393e38e44b Add Thumb2 movcc instructions.
llvm-svn: 74946
2009-07-07 20:39:03 +00:00
Evan Cheng
37abb8fc28 Add missing tests.
llvm-svn: 74945
2009-07-07 20:38:08 +00:00
Evan Cheng
fa864ab886 Add Thumb2 pkhbt / pkhtb.
llvm-svn: 74895
2009-07-07 05:35:52 +00:00
Evan Cheng
46b98516f6 Add some more Thumb2 multiplication instructions.
llvm-svn: 74889
2009-07-07 01:17:28 +00:00
Evan Cheng
5a279bb4b2 Add bfc to armv6t2.
llvm-svn: 74868
2009-07-06 22:23:46 +00:00
Evan Cheng
2570d8b541 Added ARM::mls for armv6t2.
llvm-svn: 74866
2009-07-06 22:05:45 +00:00
Evan Cheng
29ce3bfbb8 Avoid adding a duplicate def. This fixes PR4478.
llvm-svn: 74857
2009-07-06 21:34:05 +00:00
Evan Cheng
f20e4fba49 Add thumb2 sign / zero extend with rotate instructions.
llvm-svn: 74755
2009-07-03 01:43:10 +00:00
Evan Cheng
162bd9cead Added indexed stores.
llvm-svn: 74740
2009-07-03 00:06:39 +00:00
Evan Cheng
fcab8e743a Sign extending pre/post indexed loads.
llvm-svn: 74736
2009-07-02 23:16:11 +00:00
Evan Cheng
dad6a41d14 Thumb2 pre/post indexed loads.
llvm-svn: 74696
2009-07-02 07:28:31 +00:00
Chris Lattner
4cddab0f14 @GOTPCREL is also rip-relative. Fix fast-isel to do the right thing.
This fixes an llvm-gcc bootstrap problem I introduced.

llvm-svn: 74691
2009-07-02 04:22:01 +00:00
Chris Lattner
e703feb0ac Fix yet-another bug I introduced into fastisel, this time handling
constant pool references that weren't getting properly rip-relative.

llvm-svn: 74689
2009-07-02 03:14:25 +00:00
Chris Lattner
b51c7950bf Fix codegen for references to available_externally symbols. This fixes
PR4482.

llvm-svn: 74613
2009-07-01 16:53:44 +00:00
Evan Cheng
e6989735a6 CommuteChangesDestination() should check if to-be-commuted instruction defines any register. Also teaches the default commuteInstruction() to commute instruction without definitions (e.g. X86::test / ARM::tsp).
llvm-svn: 74602
2009-07-01 08:29:08 +00:00
Evan Cheng
7d78cb531e Remove special handling of implicit_def. Fix a couple more bugs in liveintervalanalysis and coalescer handling of implicit_def.
Note, isUndef marker must be placed even on implicit_def def operand or else the scavenger will not ignore it. This is necessary because -O0 path does not use liveintervalanalysis, it treats implicit_def just like any other def.

llvm-svn: 74601
2009-07-01 08:19:36 +00:00
Chris Lattner
2bbdc61f92 Fix some fast-isel problems selecting global variable addressing in
pic mode.

llvm-svn: 74582
2009-07-01 03:27:19 +00:00
Evan Cheng
37503e9671 Handle IMPLICIT_DEF with isUndef operand marker, part 2. This patch moves the code to annotate machineoperands to LiveIntervalAnalysis. It also add markers for implicit_def that define physical registers. The rest, is just a lot of details.
llvm-svn: 74580
2009-07-01 01:59:31 +00:00
David Goodwin
19aa5c7d51 Add PIC load and store patterns for Thumb-2.
llvm-svn: 74577
2009-07-01 00:01:13 +00:00
David Goodwin
5805e9aef5 Add thumb-2 store word, halfword, and byte.
llvm-svn: 74555
2009-06-30 22:11:34 +00:00
David Goodwin
aad223dd8a Improve Thumb-2 jump table support.
llvm-svn: 74549
2009-06-30 19:50:22 +00:00
Rafael Espindola
340632e814 Fix PR4485.
Avoid unnecessary duplication of operand 0 of X86::FpSET_ST0_80. This duplication would
cause one register to remain on the stack at the function return.

llvm-svn: 74534
2009-06-30 16:40:03 +00:00
Rafael Espindola
33b0aa0274 Fix PR4484.
This was caused by me confounding FP0 and ST(0).

llvm-svn: 74523
2009-06-30 12:18:16 +00:00
Evan Cheng
28b9e77f19 Temporarily restore the scavenger implicit_def checking code. MachineOperand isUndef mark is not being put on implicit_def of physical registers (created for parameter passing, etc.).
llvm-svn: 74519
2009-06-30 09:19:42 +00:00
Evan Cheng
c6c942b70f Add a bit IsUndef to MachineOperand. This indicates the def / use register operand is defined by an implicit_def. That means it can def / use any register and passes (e.g. register scavenger) can feel free to ignore them.
The register allocator, when it allocates a register to a virtual register defined by an implicit_def, can allocate any physical register without worrying about overlapping live ranges. It should mark all of operands of the said virtual register so later passes will do the right thing.

This is not the best solution. But it should be a lot less fragile to having the scavenger try to track what is defined by implicit_def.

llvm-svn: 74518
2009-06-30 08:49:04 +00:00
Evan Cheng
2a527c3419 A few more load instructions.
llvm-svn: 74500
2009-06-30 02:15:48 +00:00
David Goodwin
6ed9f9c9c5 Enhance tests to include shifted-register operand testing.
llvm-svn: 74490
2009-06-30 01:02:20 +00:00
David Goodwin
4f53387d26 Add Thumb-2 support for TEQ amd TST.
llvm-svn: 74468
2009-06-29 22:49:42 +00:00
David Goodwin
e7df96eedf Thumb-2 tests
llvm-svn: 74464
2009-06-29 22:25:22 +00:00
Rafael Espindola
a0fdda93be FIX PR 4459.
Not sure I understand how the temp register gets used,
but this fixes a bug and introduces no regressions.

llvm-svn: 74446
2009-06-29 20:29:59 +00:00
David Goodwin
9e1280adf3 Rename ARMcmpNZ to ARMcmpZ and use it to represent comparisons that set only the Z flag (i.e. eq and ne). Make ARMcmpZ commutative.
llvm-svn: 74423
2009-06-29 15:33:01 +00:00
Evan Cheng
093adf3ff9 Implement Thumb2 ldr.
After much back and forth, I decided to deviate from ARM design and split LDR into 4 instructions (r + imm12, r + imm8, r + r << imm12, constantpool). The advantage of this is 1) it follows the latest ARM technical manual, and 2) makes it easier to reduce the width of the instruction later. The down side is this creates more inconsistency between the two sub-targets. We should split ARM LDR instruction in a similar fashion later. I've added a README entry for this.

llvm-svn: 74420
2009-06-29 07:51:04 +00:00
Chris Lattner
e711b85035 factor some logic out into a helper function, allow remat of loads from constant
globals.  This implements remat-constant.ll even without aggressive-remat.

llvm-svn: 74373
2009-06-27 04:38:55 +00:00
Chris Lattner
19eb0dad26 Reimplement rip-relative addressing in the X86-64 backend. The new
implementation primarily differs from the former in that the asmprinter
doesn't make a zillion decisions about whether or not something will be
RIP relative or not.  Instead, those decisions are made by isel lowering
and propagated through to the asm printer.  To achieve this, we:

1. Represent RIP relative addresses by setting the base of the X86 addr
   mode to X86::RIP.
2. When ISel Lowering decides that it is safe to use RIP, it lowers to
   X86ISD::WrapperRIP.  When it is unsafe to use RIP, it lowers to
   X86ISD::Wrapper as before.
3. This removes isRIPRel from X86ISelAddressMode, representing it with
   a basereg of RIP instead.
4. The addressing mode matching logic in isel is greatly simplified.
5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
   passed through various printoperand routines is gone now.
6. The various symbol printing routines in asmprinter now no longer infer
   when to emit (%rip), they just print the symbol.

I think this is a big improvement over the previous situation.  It does have
two small caveats though: 1. I implemented a horrible "no-rip" modifier for
the inline asm "P" constraint modifier.  This is a short term hack, there is
a much better, but more involved, solution.  2. I had to xfail an 
-aggressive-remat testcase because it isn't handling the use of RIP in the
constant-pool reading instruction.  This specific test is easy to fix without
-aggressive-remat, which I intend to do next.

llvm-svn: 74372
2009-06-27 04:16:01 +00:00
Chris Lattner
aef726f8b9 remove some unneeded eh info.
llvm-svn: 74371
2009-06-27 04:07:31 +00:00
Chris Lattner
3e94ce2426 testcase for PR4466
llvm-svn: 74367
2009-06-27 01:33:35 +00:00
David Goodwin
90fc344e41 When possible, use "mvn ra, rb" instead of "eor ra, rb, -1" because mvn has a narrow version and eor(i) does not.
llvm-svn: 74355
2009-06-26 23:13:13 +00:00
Dan Gohman
49b2ecafe7 Add some testcases for some of the recent ScalarEvolution bug fixes.
llvm-svn: 74353
2009-06-26 22:54:11 +00:00
David Goodwin
4997a459c7 Thumb-2 tests
llvm-svn: 74345
2009-06-26 22:37:07 +00:00
Chris Lattner
4384816259 remove unwind info, add test for asmprinting of jump table labels with (%rip)
llvm-svn: 74337
2009-06-26 22:16:49 +00:00
Evan Cheng
016ed65455 Add x86 support for 'n' inline asm modifier. This will be handled target independently as part of MC work.
llvm-svn: 74336
2009-06-26 22:00:19 +00:00
David Goodwin
921faa64cd Thumb-2 has CLZ.
llvm-svn: 74322
2009-06-26 20:47:43 +00:00
David Goodwin
9da977f216 Use "adcs/sbcs" only when the carry-out is live, otherwise use "adc/sbc".
llvm-svn: 74321
2009-06-26 20:45:56 +00:00
Daniel Dunbar
23337e07ce More spelling Count as count.
llvm-svn: 74306
2009-06-26 18:35:07 +00:00
Daniel Dunbar
fa37f8bf95 Spell Count as count.
llvm-svn: 74298
2009-06-26 18:21:54 +00:00
David Goodwin
4a98908300 Add Thumb-2 tests.
llvm-svn: 74295
2009-06-26 18:10:30 +00:00
David Goodwin
46eb5a7a2d ADC used to implement adde should use "adcs" opcode instead of "adc".
llvm-svn: 74293
2009-06-26 18:07:25 +00:00
David Goodwin
b2c485c6bd ORN and BIC tests.
llvm-svn: 74289
2009-06-26 16:20:06 +00:00
David Goodwin
877790aa5f Currently there is a pattern for the thumb-2 MOV 16-bit immediate instruction. That instruction cannot write the flags so it should use T2I instead of T2sI.
Also, added a pattern for the thumb-2 MOV of shifted immediate since that can encode immediates not encodable by the 16-bit immediate.

llvm-svn: 74288
2009-06-26 16:10:07 +00:00
Evan Cheng
7883ae3121 Fix tests: Count -> count.
llvm-svn: 74282
2009-06-26 07:05:57 +00:00
Evan Cheng
da10be895c Fix a CodeGenDAGPatterns bug. Check if top level predicates match when it's looking for duplicates.
llvm-svn: 74276
2009-06-26 05:59:16 +00:00
Daniel Dunbar
be7f3311ad Fix spelling of 'count'
llvm-svn: 74249
2009-06-26 01:33:02 +00:00
Evan Cheng
4ac765118d Select ADC, SBC, and RSC instead of the ADCS, SBCS, and RSCS when the carry bit def is not used.
llvm-svn: 74228
2009-06-25 23:34:10 +00:00
David Goodwin
74414108e9 Use MVN for ~t2_so_imm immediates.
llvm-svn: 74223
2009-06-25 23:11:21 +00:00
Bill Wendling
f962a1df07 Don't grep the -debug output. This isn't the way to test changes.
llvm-svn: 74211
2009-06-25 21:59:32 +00:00
Chris Lattner
f035685176 down with unwind info :)
llvm-svn: 74206
2009-06-25 21:48:17 +00:00
Evan Cheng
0cced3daa8 ISD::ADDE / ISD::SUBE updates the carry bit so they should isle to ADCS and SBCS / RSCS.
llvm-svn: 74200
2009-06-25 20:59:23 +00:00
Evan Cheng
e8c58ee743 Add Thumb2 pc relative add.
llvm-svn: 74141
2009-06-24 23:47:58 +00:00
Evan Cheng
f7814163db We should run these tests as well.
llvm-svn: 74121
2009-06-24 21:36:26 +00:00
Chris Lattner
6e06dc1168 unwind info not needed.
llvm-svn: 74112
2009-06-24 19:48:04 +00:00
Evan Cheng
b4139189b0 Move thumb and thumb2 tests into separate directories.
llvm-svn: 74068
2009-06-24 06:36:07 +00:00
Evan Cheng
7292cadf06 Fix support for inline asm input / output operand tying when operand spans across multiple registers (e.g. two i64 operands in 32-bit mode).
llvm-svn: 74053
2009-06-24 02:05:51 +00:00
Dan Gohman
4f4bda36df Extend ScalarEvolution's multiple-exit support to compute exact
trip counts in more cases.

Generalize ScalarEvolution's isLoopGuardedByCond code to recognize
And and Or conditions, splitting the code out into an
isNecessaryCond helper function so that it can evaluate Ands and Ors
recursively, and make SCEVExpander be much more aggressive about
hoisting instructions out of loops.

test/CodeGen/X86/pr3495.ll has an additional instruction now, but
it appears to be due to an arbitrary register allocation difference.

llvm-svn: 74048
2009-06-24 01:18:18 +00:00
Evan Cheng
eaad82627b Proper patterns for thumb2 shift and rotate instructions.
llvm-svn: 73987
2009-06-23 19:39:13 +00:00
Bob Wilson
6db76aaf10 Add support for ARM's Advanced SIMD (NEON) instruction set.
This is still a work in progress but most of the NEON instruction set
is supported.

llvm-svn: 73919
2009-06-22 23:27:02 +00:00
Evan Cheng
2814371831 It's coalescer, not coaleser.
llvm-svn: 73902
2009-06-22 21:09:17 +00:00
Bob Wilson
0c2c5f65e2 For Darwin on ARMv6 and newer, make register r9 available for use as a
caller-saved register.

llvm-svn: 73901
2009-06-22 21:01:46 +00:00
Evan Cheng
2410955c62 Fix another register coalescer crash: forgot to check if the instruction being updated has already been coalesced.
llvm-svn: 73898
2009-06-22 20:49:32 +00:00
Evan Cheng
b37e7e24d0 hasFP should return true if frame address is taken.
llvm-svn: 73893
2009-06-22 18:38:48 +00:00
Rafael Espindola
373c6bdbc5 Fix PR4185.
Handle FpSET_ST0_80 being used when ST0 is still alive.

llvm-svn: 73850
2009-06-21 12:02:51 +00:00
Chris Lattner
580eecebbd change TLS_ADDR lowering to lower to a real mem operand, instead of matching as
a global with that gets printed with the :mem modifier.  All operands to lea's 
should be handled with the lea32mem operand kind, and this allows the TLS stuff
to do this.  There are several better ways to do this, but I went for the minimal
change since I can't really test this (beyond make check).

This also makes the use of EBX explicit in the operand list in the 32-bit, 
instead of implicit in the instruction.

llvm-svn: 73834
2009-06-20 20:38:48 +00:00
Chris Lattner
965cc0e45b no need for unwind info
llvm-svn: 73832
2009-06-20 19:48:26 +00:00
Chris Lattner
33d1976328 no need for unwind info here.
llvm-svn: 73831
2009-06-20 19:43:09 +00:00
Evan Cheng
b45918e5bb Fix PR4419: handle defs of partial uses.
llvm-svn: 73816
2009-06-20 04:34:51 +00:00
Dan Gohman
651faa1905 Re-apply r73718, now that the fix in r73787 is in, and add a
hand-crafted testcase which demonstrates the bug that was exposed
in 254.gap.

llvm-svn: 73793
2009-06-19 23:23:27 +00:00
Evan Cheng
f18de63563 Enable arm pre-allocation load / store multiple optimization pass.
llvm-svn: 73791
2009-06-19 23:17:27 +00:00
Evan Cheng
b90241ac42 Revert 73718. It's breaking 254.gap.
llvm-svn: 73783
2009-06-19 21:15:06 +00:00
Eli Friedman
5cccb60bad Fix for PR2484: add an SSE1 pattern for a shuffle we normally prefer to
handle with an SSE2 instruction.

llvm-svn: 73760
2009-06-19 07:00:55 +00:00
Eli Friedman
003abaa60d Mark a few Thumb instructions commutable; just happened to spot this
while experimenting.  I'm reasonably sure this is correct, but please 
tell me if these instructions have some strange property which makes this
change unsafe.

llvm-svn: 73746
2009-06-19 01:43:08 +00:00
Evan Cheng
6c1c55f942 On Darwin, ams printer should output a second label before a jump table so the linker knows it's a new atom. But this is only needed if the jump table is put in a separate section from the function body.
llvm-svn: 73720
2009-06-18 20:37:15 +00:00
Dan Gohman
da82dc2ec1 Generalize LSR's OptimizeSMax to handle unsigned max tests as well
as signed max tests. Along with r73717, this helps CodeGen avoid
emitting code for a maximum operation for this class of loop.

llvm-svn: 73718
2009-06-18 20:23:18 +00:00
Dan Gohman
fd857b0406 Remove the code from IVUsers that attempted to handle
casted induction variables in cases where the cast
isn't foldable. It ended up being a pessimization in
many cases. This could be fixed, but it would require
a bunch of complicated code in IVUsers' clients. The
advantages of this approach aren't visible enough to
justify it at this time.

llvm-svn: 73706
2009-06-18 16:54:06 +00:00
Anton Korobeynikov
7fd29c57a8 Initial support for some Thumb2 instructions.
Patch by Viktor Kutuzov and Anton Korzh from Access Softek, Inc.

llvm-svn: 73622
2009-06-17 18:13:58 +00:00
Anton Korobeynikov
d6004a164c Make the test target-neutral
llvm-svn: 73547
2009-06-16 20:25:25 +00:00
Anton Korobeynikov
a74b8323d0 GNU as refuses to assemble "pop {}" instruction. Do not emit such
(this is the case when we have thumb vararg function with single
callee-saved register, which is handled separately).

llvm-svn: 73529
2009-06-16 18:49:08 +00:00
Evan Cheng
a98ff05fca If a val# is defined by an implicit_def and it is being removed, all of the copies off the val# were removed. This causes problem later since the scavenger will see uses of registers without defs. The proper solution is to change the copies into implicit_def's instead.
TurnCopyIntoImpDef turns a copy into implicit_def and remove the val# defined by it. This causes an scavenger assertion later if the def reaches other blocks. Disable the transformation if the value live interval extends beyond its def block.

llvm-svn: 73478
2009-06-16 07:12:58 +00:00
Eli Friedman
6a984089f4 Add some generic expansion logic for SMULO and UMULO. Fixes UMULO
support for x86, and UMULO/SMULO for many architectures, including PPC 
(PR4201), ARM, and Cell. The resulting expansion isn't perfect, but it's
not bad.

llvm-svn: 73477
2009-06-16 06:58:29 +00:00
Dan Gohman
255bcad466 Update this test to use fmul instead of mul.
llvm-svn: 73436
2009-06-15 22:49:34 +00:00
Evan Cheng
4b77794613 ifcvt should ignore cfg where true and false successors are the same.
llvm-svn: 73423
2009-06-15 21:24:34 +00:00
Bill Wendling
a0a5984345 This test is failing. Revert for now.
llvm-svn: 73404
2009-06-15 19:10:56 +00:00
Bill Wendling
1ea00229de Add another testcase for r71478.
llvm-svn: 73399
2009-06-15 18:36:34 +00:00
Arnold Schwaighofer
6b340f9247 CheckTailCallReturnConstraints is missing a check on the
incomming chain of the RETURN node. The incomming chain must
be the outgoing chain of the CALL node. This causes the
backend to identify tail calls that are not tail calls. This
patch fixes this.

llvm-svn: 73387
2009-06-15 14:43:36 +00:00
Evan Cheng
3219c7fbe5 Part 1.
- Change register allocation hint to a pair of unsigned integers. The hint type is zero (which means prefer the register specified as second part of the pair) or entirely target dependent.
- Allow targets to specify alternative register allocation orders based on allocation hint.

Part 2.
- Use the register allocation hint system to implement more aggressive load / store multiple formation.
- Aggressively form LDRD / STRD. These are formed *before* register allocation. It has to be done this way to shorten live interval of base and offset registers. e.g.
v1025 = LDR v1024, 0
v1026 = LDR v1024, 0
=>
v1025,v1026 = LDRD v1024, 0

If this transformation isn't done before allocation, v1024 will overlap v1025 which means it more difficult to allocate a register pair.

- Even with the register allocation hint, it may not be possible to get the desired allocation. In that case, the post-allocation load / store multiple pass must fix the ldrd / strd instructions. They can either become ldm / stm instructions or back to a pair of ldr / str instructions.

This is work in progress, not yet enabled.

llvm-svn: 73381
2009-06-15 08:28:29 +00:00
Evan Cheng
d0a66e438f Add a ARM specific pre-allocation pass that re-schedule loads / stores from
consecutive addresses togther. This makes it easier for the post-allocation pass
to form ldm / stm.

This is step 1. We are still missing a lot of ldm / stm opportunities because
of register allocation are not done in the desired order. More enhancements
coming.

llvm-svn: 73291
2009-06-13 09:12:55 +00:00
Evan Cheng
98216808fe If killed register is defined by implicit_def, do not clear it since it's live range may overlap another def of same register.
llvm-svn: 73255
2009-06-12 21:34:26 +00:00
Evan Cheng
2f784781aa Mark some pattern-less instructions as neverHasSideEffects.
llvm-svn: 73252
2009-06-12 20:46:18 +00:00
Arnold Schwaighofer
780e3addf8 Fix Bug 4278: X86-64 with -tailcallopt calling convention
out of sync with regular cc.

The only difference between the tail call cc and the normal
cc was that one parameter register - R9 - was reserved for
calling functions through a function pointer. After time the
tail call cc has gotten out of sync with the regular cc. 

We can use R11 which is also caller saved but not used as
parameter register for potential function pointers and
remove the special tail call cc on x86-64.

llvm-svn: 73233
2009-06-12 16:26:57 +00:00
Anton Korobeynikov
c82243e658 Add testcase for register scanveger assertion fix in r72755
(double def due to livevars)

llvm-svn: 73096
2009-06-08 22:54:15 +00:00
Eli Friedman
62028b7323 Fix the run-line for this test to work correctly outside of x86.
llvm-svn: 73025
2009-06-07 09:44:19 +00:00
Eli Friedman
2964aa5a38 Tweak the expansion code for BIT_CONVERT to generate better code
converting from an MMX vector to an i64.

llvm-svn: 73024
2009-06-07 09:41:57 +00:00
Eli Friedman
d4b463b0dc Slightly generalize the code that handles shuffles of consecutive loads
on x86 to handle more cases.  Fix a bug in said code that would cause it 
to read past the end of an object.  Rewrite the code in 
SelectionDAGLegalize::ExpandBUILD_VECTOR to be a bit more general. 
Remove PerformBuildVectorCombine, which is no longer necessary with 
these changes.  In addition to simplifying the code, with this change, 
we can now catch a few more cases of consecutive loads.

llvm-svn: 73012
2009-06-07 06:52:44 +00:00
Eli Friedman
2b6cb1684f PR3628: Add patterns to match SHL/SRL/SRA to the corresponding Altivec
instructions.

llvm-svn: 73009
2009-06-07 01:07:55 +00:00
Eli Friedman
2dadbd05f9 Fix the expansion for CONCAT_VECTORS so that it doesn't create illegal
types.

llvm-svn: 72993
2009-06-06 07:08:26 +00:00
Eli Friedman
4395222136 Avoid crashing on a variable-index insertelement with element type i16.
llvm-svn: 72991
2009-06-06 06:32:50 +00:00
Eli Friedman
e546f94ef5 Get rid of some bogus patterns for X86vzmovl. Don't create VZEXT_MOVL
nodes for vectors with an i16 element type.  Add an optimization for 
building a vector which is all zeros/undef except for the bottom 
element, where the bottom element is an i8 or i16.

llvm-svn: 72988
2009-06-06 06:05:10 +00:00
Eli Friedman
539325c8e7 Fix an obvious typo.
llvm-svn: 72987
2009-06-06 05:55:37 +00:00
Eli Friedman
1227d199be Get rid of a bogus pattern that interferes with optimization.
llvm-svn: 72985
2009-06-06 04:17:04 +00:00
Eli Friedman
05eef883e8 PR2598: make sure to expand illegal forms of integer/floating-point
conversions for x86, like <2 x i32> -> <2 x float> and <4 x i16> -> 
<4 x float>.

llvm-svn: 72983
2009-06-06 03:57:58 +00:00
Nate Begeman
058d4eeccf Adapt the x86 build_vector dagcombine to the current state of the legalizer.
build vectors with i64 elements will only appear on 32b x86 before legalize.
Since vector widening occurs during legalize, and produces i64 build_vector 
elements, the dag combiner is never run on these before legalize splits them
into 32b elements.

Teach the build_vector dag combine in x86 back end to recognize consecutive 
loads producing the low part of the vector.

Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes
since that was required implicitly.

Add a testcase for the transform.

Old:
	subl	$28, %esp
	movl	32(%esp), %eax
	movl	4(%eax), %ecx
	movl	%ecx, 4(%esp)
	movl	(%eax), %eax
	movl	%eax, (%esp)
	movaps	(%esp), %xmm0
	pmovzxwd	%xmm0, %xmm0
	movl	36(%esp), %eax
	movaps	%xmm0, (%eax)
	addl	$28, %esp
	ret

New:
	movl	4(%esp), %eax
	pmovzxwd	(%eax), %xmm0
	movl	8(%esp), %eax
	movaps	%xmm0, (%eax)
	ret

llvm-svn: 72957
2009-06-05 21:37:30 +00:00
Evan Cheng
ea31ec569b Changing allocation ordering from r3 ... r0 back to r0 ... r3. The order change no longer make sense after the coalescing changes we have made since then.
llvm-svn: 72955
2009-06-05 19:08:58 +00:00
Dan Gohman
31fc8d27b1 Fix an erroneous check for isFNeg; the FNeg case is handled
a few lines later on.

llvm-svn: 72904
2009-06-04 23:43:29 +00:00
Dan Gohman
5f6f8101d5 Split the Add, Sub, and Mul instruction opcodes into separate
integer and floating-point opcodes, introducing
FAdd, FSub, and FMul.

For now, the AsmParser, BitcodeReader, and IRBuilder all preserve
backwards compatability, and the Core LLVM APIs preserve backwards
compatibility for IR producers. Most front-ends won't need to change
immediately.

This implements the first step of the plan outlined here:
http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt

llvm-svn: 72897
2009-06-04 22:49:04 +00:00
Devang Patel
9757e4f9f3 Add new function attribute - noredzone.
Update code generator to use this attribute and remove DisableRedZone target option.
Update llc to set this attribute when -disable-red-zone command line option is used.

llvm-svn: 72894
2009-06-04 22:05:33 +00:00
Evan Cheng
dada49d18a RALinScan::attemptTrivialCoalescing() was returning a virtual register instead of the physical register it is allocated to. This resulted in virtual register(s) being added the live-in sets.
llvm-svn: 72890
2009-06-04 20:53:36 +00:00
Evan Cheng
8a6c448ab0 A value defined by an implicit_def can be liven to a use BB. This is unfortunate. But register allocator still has to add it to the live-in set of the use BB.
llvm-svn: 72888
2009-06-04 20:25:48 +00:00
Dan Gohman
05fe1217c7 Check in test changes that I accidentally left out of r72872.
llvm-svn: 72875
2009-06-04 18:22:31 +00:00
Eli Friedman
11070e275f PR3739, part 2: Use an explicit store to spill XMM registers. (Previously,
the code tried to use "push", which doesn't exist for XMM registers.)

llvm-svn: 72836
2009-06-04 02:32:04 +00:00
Eli Friedman
fd27229206 PR3739, part 1: Disable the red zone on Win64.
llvm-svn: 72830
2009-06-04 02:02:01 +00:00
Evan Cheng
e3a05e6690 Re-apply 72756 with fixes. One of those was introduced by we changed MachineInstrBuilder::addReg() interface.
llvm-svn: 72826
2009-06-04 01:15:28 +00:00
Eli Friedman
dbf32ddf16 PR4317: Handle splits where the new block is unreachable correctly in
DominatorTreeBase::Split.

llvm-svn: 72810
2009-06-03 21:42:06 +00:00
Evan Cheng
b71402d6ae For Darwin / x86_64, override -relocation-model=static to pic if the output is assembly since Darwin assembler does not really support -static codeine.
I view this as a temporary workaround until the assembler / linker changes.

llvm-svn: 72806
2009-06-03 21:13:54 +00:00
Evan Cheng
4e47a019ba Fix for PR4225: When rewriter reuse a value in a physical register , it clear the register kill operand marker and its kill ops information. However, the cleared operand may be a def of a super-register. Clear the kill ops info for the super-register's sub-registers as well.
llvm-svn: 72758
2009-06-03 09:00:27 +00:00
Evan Cheng
82f8fa333e Temporarily revert 72756 for now.
llvm-svn: 72757
2009-06-03 07:40:47 +00:00
Evan Cheng
5afbef29fa Fold preceding / trailing base inc / dec into the single load / store as well.
llvm-svn: 72756
2009-06-03 06:14:58 +00:00
Dan Gohman
609f627ed7 Revert r72734. The Darwin assembler doesn't support the static
relocation model on x86-64. Higher level logic should override
the relocation model to PIC on x86_64-apple-darwin.

llvm-svn: 72746
2009-06-03 00:37:20 +00:00
Dan Gohman
f6e6588203 Fix CodeGenPrepare's address-mode sinking to handle unusual
addresses, involving Base values which do not have Pointer type.
This fixes PR4297.

llvm-svn: 72739
2009-06-02 21:29:13 +00:00
Evan Cheng
7e66d61bec On Darwin x86_64 small code model doesn't guarantee code address fits in 32-bit.
llvm-svn: 72734
2009-06-02 20:09:31 +00:00
Evan Cheng
2d198e1bc2 (i64 (zext (srl GR32 8))) -> movzbl AH is not safe since srl 8 only clear the top 8 bits.
llvm-svn: 72618
2009-05-30 08:43:27 +00:00
Evan Cheng
57f85a1529 Remove an accidental commit.
llvm-svn: 72560
2009-05-29 05:28:52 +00:00
Evan Cheng
550fc9ba9f More h-registers tricks: folding zext nodes.
llvm-svn: 72558
2009-05-29 01:44:43 +00:00
Evan Cheng
a36a15ff66 Do not try to create a MVT type of width 0.
llvm-svn: 72557
2009-05-28 23:52:18 +00:00
Eli Friedman
5a376ed43e Add explicit test for PR4280.
llvm-svn: 72539
2009-05-28 21:04:35 +00:00
Eli Friedman
8b0b7c2d6d Add a testcase which got fixed by recent legalization work.
llvm-svn: 72517
2009-05-28 05:10:20 +00:00
Evan Cheng
40810c4d1b Added optimization that narrow load / op / store and the 'op' is a bit twiddling instruction and its second operand is an immediate. If bits that are touched by 'op' can be done with a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code.
e.g.
orl     $65536, 8(%rax)
=>
orb     $1, 10(%rax)

Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization.

llvm-svn: 72507
2009-05-28 00:35:15 +00:00
Bill Wendling
2944bcd25c This looks like it passes now.
llvm-svn: 72485
2009-05-27 17:43:21 +00:00
Torok Edwin
99b1003c2e Fix PR4254.
The DAGCombiner created a negative shiftamount, stored in an
unsigned variable. Later the optimizer eliminated the shift entirely as being
undefined.
Example: (srl (shl X, 56) 48). ShiftAmt is 4294967288.
Fix it by checking that the shiftamount is positive, and storing in a signed
variable.

llvm-svn: 72331
2009-05-23 17:29:48 +00:00
Torok Edwin
beb86bd0b4 available_externall linkage is not local, this was confusing the codegenerator,
and it wasn't generating calls through @PLT for these functions.
hasLocalLinkage() is now false for available_externally,
I attempted to fix the inliner and dce to handle available_externally properly.
It passed make check.

llvm-svn: 72328
2009-05-23 14:06:57 +00:00
Eli Friedman
262a99ffed Fix test to account for legalization changes; I think this ends up
running an extra DAGCombine pass which improves the code a bit.

llvm-svn: 72326
2009-05-23 13:15:11 +00:00
Duncan Sands
bbd03677ee Add a new codegen pass that normalizes dwarf exception handling
code in preparation for code generation.  The main thing it does
is handle the case when eh.exception calls (and, in a future
patch, eh.selector calls) are far away from landing pads.  Right
now in practice you only find eh.exception calls close to landing
pads: either in a landing pad (the common case) or in a landing
pad successor, due to loop passes shifting them about.  However
future exception handling improvements will result in calls far
from landing pads:
(1) Inlining of rewinds.  Consider the following case:
In function @f:
...
  invoke @g to label %normal unwind label %unwinds
...
unwinds:
  %ex = call i8* @llvm.eh.exception()
...

In function @g:
...
  invoke @something to label %continue unwind label %handler
...
handler:
  %ex = call i8* @llvm.eh.exception()
... perform cleanups ...
  "rethrow exception"

Now inline @g into @f.  Currently this is turned into:
In function @f:
...
  invoke @something to label %continue unwind label %handler
...
handler:
  %ex = call i8* @llvm.eh.exception()
... perform cleanups ...
  invoke "rethrow exception" to label %normal unwind label %unwinds
unwinds:
  %ex = call i8* @llvm.eh.exception()
...

However we would like to simplify invoke of "rethrow exception" into
a branch to the %unwinds label.  Then %unwinds is no longer a landing
pad, and the eh.exception call there is then far away from any landing
pads.

(2) Using the unwind instruction for cleanups.
It would be nice to have codegen handle the following case:
  invoke @something to label %continue unwind label %run_cleanups
...
handler:
... perform cleanups ...
  unwind

This requires turning "unwind" into a library call, which
necessarily takes a pointer to the exception as an argument
(this patch also does this unwind lowering).  But that means
you are using eh.exception again far from a landing pad.

(3) Bugpoint simplifications.  When bugpoint is simplifying
exception handling code it often generates eh.exception calls
far from a landing pad, which then causes codegen to assert.
Bugpoint then latches on to this assertion and loses sight
of the original problem.

Note that it is currently rare for this pass to actually do
anything.  And in fact it normally shouldn't do anything at
all given the code coming out of llvm-gcc!  But it does fire
a few times in the testsuite.  As far as I can see this is
almost always due to the LoopStrengthReduce codegen pass
introducing pointless loop preheader blocks which are landing
pads and only contain a branch to another block.  This other
block contains an eh.exception call.  So probably by tweaking
LoopStrengthReduce a bit this can be avoided.

llvm-svn: 72276
2009-05-22 20:36:31 +00:00
Eli Friedman
b6fe72e457 Fix for PR4235: to build a floating-point value from integer parts,
build an integer and cast that to a float.  This fixes a crash 
caused by trying to split an f32 into two f16's.

This changes the behavior in test/CodeGen/XCore/fneg.ll because that 
testcase now triggers a DAGCombine which converts the fneg into an integer
operation.  If someone is interested, it's probably possible to tweak 
the test to generate an actual fneg.

llvm-svn: 72162
2009-05-20 06:02:09 +00:00
Evan Cheng
ff129ff17f Fix test on non-darwin hosts.
llvm-svn: 72161
2009-05-20 05:45:36 +00:00
Evan Cheng
e17c02e328 Try again. Allow call to immediate address for ELF or when in static relocation mode.
llvm-svn: 72160
2009-05-20 04:53:57 +00:00
Evan Cheng
8a4887572e Cannot use immediate as call absolute target in PIC mode.
llvm-svn: 72154
2009-05-20 01:11:00 +00:00
Bob Wilson
c6726ecca5 Fix pr4058 and pr4059. Do not split i64 or double arguments between r3 and
the stack.  Patch by Sandeep Patel.

llvm-svn: 72106
2009-05-19 10:02:36 +00:00
Bob Wilson
ec676a76e7 Fix pr4091: Add support for "m" constraint in ARM inline assembly.
llvm-svn: 72105
2009-05-19 05:53:42 +00:00
Dan Gohman
904f081ce7 Add nounwind to a few tests.
llvm-svn: 72002
2009-05-18 15:16:49 +00:00
Anton Korobeynikov
85accafcba Mark rotl/rotr as expand. This generates pretty ugly code, but this is better than nothing.
llvm-svn: 71976
2009-05-17 10:16:28 +00:00
Anton Korobeynikov
8753e89b79 Typo
llvm-svn: 71975
2009-05-17 10:15:22 +00:00
Jakob Stoklund Olesen
fa57451cf5 Help DejaGnu avoid pipe-jam by producing less output from certain test cases.
When a test fails with more than a pipeful of output on stdout AND stderr, one
of the DejaGnu programs blocks. The problem can be avoided by redirecting
stdout to a file.

llvm-svn: 71919
2009-05-16 00:34:42 +00:00
Dan Gohman
a09e38894a Add nounwind to this test.
llvm-svn: 71734
2009-05-13 22:29:12 +00:00
Evan Cheng
e43bfc153e If header of inner loop is aligned, do not align the outer loop header. We don't want to add nops in the outer loop for the sake of aligning the inner loop.
llvm-svn: 71609
2009-05-12 23:58:14 +00:00
Evan Cheng
c7f7276825 Teach TransferDeadness to delete truly dead instructions if they do not produce side effects.
llvm-svn: 71606
2009-05-12 23:07:00 +00:00
Evan Cheng
b0a4c44103 Add nounwind.
llvm-svn: 71575
2009-05-12 18:35:43 +00:00
Evan Cheng
d6e3e4d746 Fixed a stack slot coloring with reg bug: do not update implicit use / def when doing forward / backward propagation.
llvm-svn: 71574
2009-05-12 18:31:57 +00:00
Bob Wilson
16f684a429 Fix pr4195: When iterating through predecessor blocks, break out of the loop
after finding the (unique) layout predecessor.  Sometimes a block may be listed
more than once, and processing it more than once in this loop can lead to
inconsistent values for FtTBB/FtFBB, since the AnalyzeBranch method does not
clear these values.  There's no point in continuing the loop regardless.
The testcase for this is reduced from the 2003-05-02-DependentPHI SingleSource
test.

llvm-svn: 71536
2009-05-12 03:48:10 +00:00
Dan Gohman
d13f674130 Factor the code for collecting IV users out of LSR into an IVUsers class,
and generalize it so that it can be used by IndVarSimplify. Implement the
base IndVarSimplify transformation code using IVUsers. This removes
TestOrigIVForWrap and associated code, as ScalarEvolution now has enough
builtin overflow detection and folding logic to handle all the same cases,
and more. Run "opt -iv-users -analyze -disable-output" on your favorite
loop for an example of what IVUsers does.

This lets IndVarSimplify eliminate IV casts and compute trip counts in
more cases. Also, this happens to finally fix the remaining testcases
in PR1301.

Now that IndVarSimplify is being more aggressive, it occasionally runs
into the problem where ScalarEvolutionExpander's code for avoiding
duplicate expansions makes it difficult to ensure that all expanded
instructions dominate all the instructions that will use them. As a
temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function
to fix up instructions inserted by SCEVExpander. Fortunately, this code
is contained, and can be easily removed once a more comprehensive
solution is available.

llvm-svn: 71535
2009-05-12 02:17:14 +00:00
Evan Cheng
9b27f3ec42 Teach LSR to optimize more loop exit compares, i.e. change them to use postinc iv value. Previously LSR would only optimize those which are in the loop latch block. However, if LSR can prove it is safe (and profitable), it's now possible to change those not in the latch blocks to use postinc values.
Also, if the compare is the only use, LSR would place the iv increment instruction before the compare instead in the latch.

llvm-svn: 71485
2009-05-11 22:33:01 +00:00
Dale Johannesen
dd32623987 Fix PR4188. TailMerging can't tolerate inexact
sucessor info.

llvm-svn: 71478
2009-05-11 21:54:13 +00:00
Dan Gohman
25ab4c185c Make this grep line a little more specific so that it doesn't
accidentally match something unrelated.

llvm-svn: 71458
2009-05-11 18:49:56 +00:00
Dan Gohman
dfa39efe6d When scalarizing a vector BITCAST, check whether the operand has vector
type, rather than assume that it does. If the operand is not vector, it
shouldn't be run through ScalarizeVectorOp. This fixes one of the
testcases in PR3886.

llvm-svn: 71453
2009-05-11 18:30:42 +00:00
Dan Gohman
0edabc8a6f Convert a subtract into a negate and an add when it helps x86
address folding.

llvm-svn: 71446
2009-05-11 18:02:53 +00:00
Dale Johannesen
f86e34065b Reverse a loop that is counting up to a maximum to
count down to 0 instead, under very restricted
circumstances.  Adjust 4 testcases in which this
optimization fires.

llvm-svn: 71439
2009-05-11 17:15:42 +00:00
Anton Korobeynikov
fe1c6d85b8 Add MSP430 test for PR4136
llvm-svn: 71392
2009-05-10 14:48:36 +00:00
Evan Cheng
06b0d3879e Enable loop bb placement optimization.
llvm-svn: 71291
2009-05-08 23:35:49 +00:00
Chris Lattner
7b2dabcac9 Fix PR4152: asm constraint validation happens before dag combine, so we
need to work a bit to combine things like (x+c1+c2) into x+c3.

llvm-svn: 71232
2009-05-08 18:23:14 +00:00
Evan Cheng
2a1d20b0fb Optimize code placement in loop to eliminate unconditional branches or move unconditional branch to the outside of the loop. e.g.
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       <fallthrough to B>                                                                                                                                                 
///                                                                                                                                                                          
///       B:  --> loop header                                                                                                                                                
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit]                                                                                                                                               
///                                                                                                                                                                          
///       C:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
/// ==>                                                                                                                                                                      
///                                                                                                                                                                          
///       A:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jmp B                                                                                                                                                              
///                                                                                                                                                                          
///       C:  --> new loop header                                                                                                                                            
///       ...                                                                                                                                                                
///       <fallthough to B>                                                                                                                                                  
///                                                                                                                                                                          
///       B:                                                                                                                                                                 
///       ...                                                                                                                                                                
///       jcc <cond> C, [exit] 

llvm-svn: 71209
2009-05-08 06:34:09 +00:00
Bob Wilson
d61f4e70d8 Fix pr4100. Do not remove no-op copies when they are dead. The register
scavenger gets confused about register liveness if it doesn't see them.
I'm not thrilled with this solution, but it only comes up when there are dead
copies in the code, which is something that hopefully doesn't happen much.

Here is what happens in pr4100: As shown in the following excerpt from the
debug output of llc, the source of a move gets reloaded from the stack,
inserting a new load instruction before the move.  Since that source operand
is a kill, the physical register is free to be reused for the destination
of the move.  The move ends up being a no-op, copying R3 to R3, so it is
deleted.  But, it leaves behind the load to reload %reg1028 into R3, and
that load is not updated to show that it's destination operand (R3) is dead.
The scavenger gets confused by that load because it thinks that R3 is live.

Starting RegAlloc of: %reg1025<def,dead> = MOVr %reg1028<kill>, 14, %reg0, %reg0
  Regs have values: 
  Reloading %reg1028 into R3
  Last use of R3[%reg1028], removing it from live set
  Assigning R3 to %reg1025
  Register R3 [%reg1025] is never used, removing it from live set

Alternative solutions might be either marking the load as dead, or zapping
the load along with the no-op copy.  I couldn't see an easy way to do
either of those, though.

llvm-svn: 71196
2009-05-07 23:47:03 +00:00
Bill Wendling
864cbcfc46 THis doesn't fail.
llvm-svn: 71142
2009-05-07 01:41:42 +00:00
Bill Wendling
7c50dcd02e Temporarily revert r71010. It was causing massive failures during self-hosting.
llvm-svn: 71138
2009-05-07 01:27:25 +00:00
Evan Cheng
0ee6696fd8 Do not use register as base ptr of pre- and post- inc/dec load / store nodes.
llvm-svn: 71098
2009-05-06 18:25:01 +00:00
Lang Hames
fcc5ebb1d4 Renamed Spiller classes (plus uses and related files) to VirtRegRewriter.
llvm-svn: 71057
2009-05-06 02:36:21 +00:00
Evan Cheng
0d781df8dc Quotes should be printed before private prefix; some code clean up.
llvm-svn: 71032
2009-05-05 22:50:29 +00:00
Dan Gohman
5e839321f2 If a MachineBasicBlock has multiple ways of reaching another block,
allow it to have multiple CFG edges to that block. This is needed
to allow MachineBasicBlock::isOnlyReachableByFallthrough to work
correctly. This fixes PR4126.

llvm-svn: 71018
2009-05-05 21:10:19 +00:00
Evan Cheng
984da04cd0 Enable stack coloring with regs at -O3.
llvm-svn: 71010
2009-05-05 20:30:36 +00:00
Chris Lattner
5cc9a36d1c Add basic support for code generation of
addrspace(257) -> FS relative on x86.  Patch by Zoltan Varga!

llvm-svn: 70992
2009-05-05 18:52:19 +00:00
Dan Gohman
2973567a95 X86FastISel doesn't support the -tailcallopt ABI.
llvm-svn: 70902
2009-05-04 19:50:33 +00:00
Anton Korobeynikov
262a397978 Fix code emission for conditional branches.
Patch by Collin Winter!

llvm-svn: 70898
2009-05-04 19:10:38 +00:00
Dan Gohman
a79cce4aef Previously, RecursivelyDeleteDeadInstructions provided an option
of returning a list of pointers to Values that are deleted. This was
unsafe, because the pointers in the list are, by nature of what
RecursivelyDeleteDeadInstructions does, always dangling. Replace this
with a simple callback mechanism. This may eventually be removed if
all clients can reasonably be expected to use CallbackVH.

Use this to factor out the dead-phi-cycle-elimination code from LSR
utility function, and generalize it to use the
RecursivelyDeleteTriviallyDeadInstructions utility function.

This makes LSR more aggressive about eliminating dead PHI cycles;
adjust tests to either be less trivial or to simply expect fewer
instructions.

llvm-svn: 70636
2009-05-02 18:29:22 +00:00
Chris Lattner
c6d561ed27 'The attached patch fixes an issue where llc -march=cpp fails with
"Invalid primitive type" on input containing the x86_fp80 type.'
Patch by Collin Winter!

llvm-svn: 70610
2009-05-01 23:54:26 +00:00
Evan Cheng
b7d41a6680 Mark MOV8mr_NOREX and MOV8rm_NOREX as mayStore / mayLoad respectively.
llvm-svn: 70461
2009-04-30 00:58:57 +00:00
Chris Lattner
794fb5b4b3 fix a regression handling indirect results: these need to be considered
memory operands otherwise the writebacks get lost when the inline asm 
doesn't otherwise have side effects.  This fixes rdar://6839427, though
clang really shouldn't generate these anymore.

llvm-svn: 70455
2009-04-30 00:48:50 +00:00
Nate Begeman
b407809122 Fix infinite recursion in the C++ code which handles movddup by making it unnecessary.
llvm-svn: 70425
2009-04-29 22:47:44 +00:00
Evan Cheng
62fdc300dd spillPhysRegAroundRegDefsUses() may have invalidated iterators stored in fixed_ IntervalPtrs. Reset them.
llvm-svn: 70378
2009-04-29 07:16:34 +00:00
Chris Lattner
e1eefefdc3 Disable the load-shrinking optimization from looking at
anything larger than 64-bits, avoiding a crash.  This should
really be fixed to use APInts, though type legalization happens
to help us out and we get good code on the attached testcase at
least.

This fixes rdar://6836460

llvm-svn: 70360
2009-04-29 03:45:07 +00:00
Bill Wendling
7546bed590 Second attempt:
Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to
use the old behavior, the flag is -O0. This change allows for finer-grained
control over which optimizations are run at different -O levels.

Most of this work was pretty mechanical. The majority of the fixes came from
verifying that a "fast" variable wasn't used anymore. The JIT still uses a
"Fast" flag. I'll change the JIT with a follow-up patch.

llvm-svn: 70343
2009-04-29 00:15:41 +00:00
Anton Korobeynikov
1799ac4b55 Properly print 'P' modifier on inline asm memory operands.
This should fix PR3379 and PR4064.
Patch inspired by Edwin Török!

llvm-svn: 70328
2009-04-28 21:49:33 +00:00
Evan Cheng
754a0d2f9e Fix PR4034. Bug in LiveInterval::join when it's compacting new valno's.
llvm-svn: 70291
2009-04-28 06:24:09 +00:00