1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 16:33:37 +01:00
Commit Graph

3210 Commits

Author SHA1 Message Date
Chris Lattner
b28cf9f8d9 temporarily disable this.
llvm-svn: 96717
2010-02-21 03:24:41 +00:00
Dan Gohman
9db0689627 Check for overflow when scaling up an add or an addrec for
scaled reuse.

llvm-svn: 96692
2010-02-19 19:32:49 +00:00
Charles Davis
a64fc8c41b Add support for the 'alignstack' attribute to the x86 backend. Fixes PR5254.
Also, FileCheck'ize a test.

llvm-svn: 96686
2010-02-19 18:17:13 +00:00
Duncan Sands
5d5cce2e19 Revert commits 96556 and 96640, because commit 96556 breaks the
dragonegg self-host build.  I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier.  The
symptom of the 96556 miscompile is the following crash:

  llvm[3]: Compiling AlphaISelLowering.cpp for Release build
  cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
  Stack dump:
  0.	Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
  g++: Internal error: Aborted (program cc1plus)

This occurs when building LLVM using LLVM built by LLVM (via
dragonegg).  Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM.  Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.

Found by bisection.

r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines

Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines

Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96672
2010-02-19 11:30:41 +00:00
Evan Cheng
32031f7404 Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.

e.g. On x86_64
  %0 = icmp eq i32 %x, 0
  %1 = icmp eq i32 %y, 0
  %2 = xor i1 %1, %0
  br i1 %2, label %bb, label %return
=>
	testl   %edi, %edi
	sete    %al
	testl   %esi, %esi
	sete    %cl
	cmpb    %al, %cl
	je      LBB1_2

llvm-svn: 96640
2010-02-19 00:34:39 +00:00
Dan Gohman
58199e30dc When determining the set of interesting reuse factors, consider
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.

llvm-svn: 96629
2010-02-19 00:05:23 +00:00
Mon P Wang
64cd1a8d7f getSplatIndex assumes that the first element of the mask contains the splat index
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.

llvm-svn: 96621
2010-02-18 22:33:18 +00:00
Jakob Stoklund Olesen
5b9d14b55e Always normalize spill weights, also for intervals created by spilling.
Moderate the weight given to very small intervals.

The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.

This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.

llvm-svn: 96613
2010-02-18 21:33:05 +00:00
Dan Gohman
34b5cb7deb Make CodePlacementOpt detect special EH control flow by
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.

llvm-svn: 96611
2010-02-18 21:25:53 +00:00
Chris Lattner
a2e094064f remove empty file
llvm-svn: 96573
2010-02-18 06:29:06 +00:00
Bob Wilson
84fc0200bd Use NEON vmin/vmax instructions for floating-point selects.
Radar 7461718.

llvm-svn: 96572
2010-02-18 06:05:53 +00:00
Evan Cheng
9af06dfc83 Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"

llvm-svn: 96556
2010-02-18 02:13:50 +00:00
Dan Gohman
3cb7dc5912 Don't check for comments, which vary between subtargets.
llvm-svn: 96434
2010-02-17 01:08:57 +00:00
Dan Gohman
493a1fcbe0 Don't attempt to divide INT_MIN by -1; consider such cases to
have overflowed.

llvm-svn: 96428
2010-02-17 00:41:53 +00:00
Chris Lattner
c87f9d6d1a roundss is an sse 4 thing, fix the test on non-sse41 builders
like llvm-gcc-x86_64-darwin10-selfhost

llvm-svn: 96417
2010-02-17 00:29:06 +00:00
Dale Johannesen
d147b9a4d4 Make g5 target explicit; scheduling affects register choice.
llvm-svn: 96413
2010-02-16 23:25:23 +00:00
Chris Lattner
0d35c68d5c fix rdar://7653908, a crash on a case where we would fold a load
into a roundss intrinsic, producing a cyclic dag.  The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection.  Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.

llvm-svn: 96408
2010-02-16 22:35:06 +00:00
Dale Johannesen
60d48aef7b Adjust register numbers in tests to compensate for the
new lack of R2.

llvm-svn: 96407
2010-02-16 22:31:31 +00:00
Chris Lattner
008f62bfa2 filecheckize
llvm-svn: 96404
2010-02-16 22:13:43 +00:00
Evan Cheng
ee44d6a752 Look for SSE and instructions of this form: (and x, (build_vector c1,c2,c3,c4)).
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).

Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335

llvm-svn: 96389
2010-02-16 21:09:44 +00:00
David Greene
c10133139e Add support for emitting non-temporal stores for DAGs marked
non-temporal.  Fix from r96241 for botched encoding of MOVNTDQ.

Add documentation for !nontemporal metadata.

Add a simpler movnt testcase.

llvm-svn: 96386
2010-02-16 20:50:18 +00:00
Bob Wilson
94eef3fc13 Fix pr6111: Avoid using the LR register for the target address of an indirect
branch in ARM v4 code, since it gets clobbered by the return address before
it is used.  Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.

llvm-svn: 96360
2010-02-16 17:24:15 +00:00
Dan Gohman
d19ecedc40 Split the main for-each-use loop again, this time for GenerateTruncates,
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.

llvm-svn: 96308
2010-02-16 01:42:53 +00:00
Anton Korobeynikov
dccd240998 Preliminary patch to improve dwarf EH generation - Hooks to return Personality / FDE / LSDA / TType encoding depending on target / options (e.g. code model / relocation model) - MCIzation of Dwarf EH printer to use encoding information - Stub generation for ELF target (needed for indirect references) - Some other small changes here and there
llvm-svn: 96285
2010-02-15 22:35:59 +00:00
Jakob Stoklund Olesen
143339a43a Fix PR6300.
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.

llvm-svn: 96279
2010-02-15 22:03:29 +00:00
Bob Wilson
01e8d35855 Last week we were generating code with duplicate induction variables in this
test, but the problem seems to have gone away today.  Add a check to make sure
it doesn't come back.

llvm-svn: 96277
2010-02-15 21:56:40 +00:00
Chris Lattner
2ce5f89c01 remove empty file.
llvm-svn: 96271
2010-02-15 21:14:50 +00:00
Chris Lattner
d7470aa340 revert r96241. It breaks two regression tests, isn't documented,
and the testcase needs improvement.

llvm-svn: 96265
2010-02-15 20:53:01 +00:00
Chris Lattner
a8505609fe fix PR6305 by handling BlockAddress in a helper function
called by jump threading.

llvm-svn: 96263
2010-02-15 20:47:49 +00:00
David Greene
ba8bac644b Add support for emitting non-temporal stores for DAGs marked
non-temporal.

llvm-svn: 96241
2010-02-15 17:02:56 +00:00
Jakob Stoklund Olesen
0a65533a38 Fix PR6283.
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.

Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.

llvm-svn: 96072
2010-02-13 02:06:10 +00:00
Bob Wilson
5d66f81412 Besides removing phi cycles that reduce to a single value, also remove dead
phi cycles.  Adjust a few tests to keep dead instructions from being optimized
away.  This (together with my previous change for phi cycles) fixes Apple
radar 7627077.

llvm-svn: 96057
2010-02-13 00:31:44 +00:00
Dale Johannesen
ea96b2974f When save/restoring CR at prolog/epilog, in a large
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot.  Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.

SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.

Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.

llvm-svn: 96015
2010-02-12 21:35:34 +00:00
Anton Korobeynikov
c66de6687b Testcases for recent stdcall / fastcall mangling improvements
llvm-svn: 95982
2010-02-12 15:29:13 +00:00
Anton Korobeynikov
7073515c86 Cleanup stdcall / fastcall name mangling.
This should fix alot of problems we saw so far, e.g. PRs 5851 & 2936

llvm-svn: 95980
2010-02-12 15:28:40 +00:00
Dan Gohman
c40eb525ad Reapply the new LoopStrengthReduction code, with compile time and
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.

This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.

llvm-svn: 95975
2010-02-12 10:34:29 +00:00
Bob Wilson
2fd80c3d94 Add a new pass on machine instructions to optimize away PHI cycles that
reduce down to a single value.  InstCombine already does this transformation
but DAG legalization may introduce new opportunities.  This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized.  I measured the compile time 
impact of this (running llc on 176.gcc) and it was not significant.

llvm-svn: 95951
2010-02-12 01:30:21 +00:00
Jakob Stoklund Olesen
b800ff8ca9 Reapply coalescer fix for better cross-class coalescing.
This time with fixed test cases.

llvm-svn: 95938
2010-02-11 23:55:29 +00:00
Mon P Wang
c17e781f35 The previous fix of widening divides that trap was too fragile as it depends on custom
lowering and requires that certain types exist in ValueTypes.h.  Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements.  It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.

llvm-svn: 95823
2010-02-10 23:37:45 +00:00
Bob Wilson
82d5534acc Delete dead PHI machine instructions. These can be created due to type
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.

llvm-svn: 95816
2010-02-10 22:58:57 +00:00
Evan Cheng
8bee7fb61d Now that ShrinkDemandedOps() is separated out from DAG combine. It sometimes leave some obvious nops which dag combine used to clean up afterwards e.g. (trunk (ext n)) -> n. Look for them and squash them.
llvm-svn: 95757
2010-02-10 02:17:34 +00:00
Chris Lattner
340fe1f187 move tests that depend on the x86 backend out of codegen/generic,
and remove a few old and unreduced ones.  Fixes PR5624. 

llvm-svn: 95656
2010-02-09 06:41:03 +00:00
Chris Lattner
fdb9fda4af make target independent.
llvm-svn: 95655
2010-02-09 06:36:30 +00:00
Chris Lattner
28b79c686d merge a target-specific add test into x86 directory.
llvm-svn: 95654
2010-02-09 06:35:50 +00:00
Chris Lattner
34c420d11b merge another test in, drop the trivially constant folded cases.
llvm-svn: 95653
2010-02-09 06:33:27 +00:00
Chris Lattner
ddb2e5a05c consolidate and filecheckize two tests.
llvm-svn: 95652
2010-02-09 06:24:00 +00:00
Chris Lattner
e669789912 merge two tests, make target independent.
llvm-svn: 95651
2010-02-09 06:19:20 +00:00
Chris Lattner
20be5fb012 convert to filecheck.
llvm-svn: 95608
2010-02-08 23:47:34 +00:00
Chris Lattner
6162c89bfe add an x86 implementation of MCTargetExpr for
representing @GOT and friends.  Use it for
personality references as a first use.

llvm-svn: 95588
2010-02-08 22:09:08 +00:00
Dan Gohman
56b9ea088b When CodeGen'ing unoptimized code, there may be unfolded constant expressions
in global initializers. Instead of aborting, attempt to fold them on the
spot. If folding succeeds, emit the folded expression instead.

This fixes PR6255.

llvm-svn: 95583
2010-02-08 22:02:38 +00:00
Dan Gohman
f113e5466c In guaranteed tailcall mode, don't decline the tailcall optimization
for blocks ending in "unreachable".

llvm-svn: 95565
2010-02-08 20:34:14 +00:00
Evan Cheng
5541068ad3 Run codegen dce pass for all targets at all optimization levels. Previously it's
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.

llvm-svn: 95493
2010-02-06 09:07:11 +00:00
Evan Cheng
c3cfda4e7e Remove a large test case that (soon will) no longer make sense.
llvm-svn: 95492
2010-02-06 09:00:30 +00:00
Rafael Espindola
b0bb1ddfe3 Fix alignment on ppc linux. This fixes the build of crtend.o
llvm-svn: 95477
2010-02-06 03:32:21 +00:00
Evan Cheng
de1a4726e6 Do not emit callseq instructions around sibcalls. This eliminated some unnecessary stack adjustments.
llvm-svn: 95475
2010-02-06 03:28:46 +00:00
Bob Wilson
1a324958d6 Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex.
Radar 7614112.

llvm-svn: 95456
2010-02-06 00:24:38 +00:00
Jakob Stoklund Olesen
7b4c60adae Don't unroll loops containing function calls.
llvm-svn: 95454
2010-02-05 23:21:31 +00:00
Bill Wendling
c3f4101cc6 Make test more fucused eliminating extraneous bits.
llvm-svn: 95384
2010-02-05 11:21:05 +00:00
Evan Cheng
4b03f55de1 Fix test.
llvm-svn: 95373
2010-02-05 06:37:00 +00:00
Evan Cheng
81dde4c7f7 Handle tail call with byval arguments.
llvm-svn: 95351
2010-02-05 02:21:12 +00:00
Evan Cheng
94fe5501b7 When the scheduler unfold a load folding instruction it move some of the predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the cases whether the predecessor is a flagged scheduling unit.
rdar://7604000

llvm-svn: 95339
2010-02-05 01:27:11 +00:00
Bill Wendling
9761f067f8 An empty global constant (one of size 0) may have a section immediately
following it. However, the EmitGlobalConstant method wasn't emitting a body for
the constant. The assembler doesn't like that. Before, we were generating this:

  .zerofill __DATA, __common, __cmd, 1, 3

This fix puts us back to that semantic.

llvm-svn: 95336
2010-02-05 00:17:02 +00:00
Jakob Stoklund Olesen
d72e82107d Fix small bug in handling instructions with more than one implicitly defined operand.
ProcessImplicitDefs would only mark one operand per instruction with <undef>.
This fixed PR6086.

llvm-svn: 95319
2010-02-04 18:46:28 +00:00
Evan Cheng
f5ee7fb571 Re-enable x86 tail call optimization.
llvm-svn: 95295
2010-02-04 06:47:24 +00:00
Chris Lattner
e43007d443 add support for the sparcv9-*-* target triple to turn on
64-bit sparc codegen.  Patch by Nathan Keynes!

llvm-svn: 95293
2010-02-04 06:34:01 +00:00
Evan Cheng
5c8b1b9164 Speculatively disable x86 automatic tail call optimization while we track down a self-hosting issue.
llvm-svn: 95259
2010-02-03 21:40:40 +00:00
Evan Cheng
ccbbdfa8c4 Make test less fragile
llvm-svn: 95258
2010-02-03 21:39:04 +00:00
Evan Cheng
e273e42195 Revert 94937 and move the noreturn check to codegen.
llvm-svn: 95198
2010-02-03 03:55:59 +00:00
Evan Cheng
d9cf09b0d6 Allow all types of callee's to be tail called. But avoid automatic tailcall if the callee is a result of bitcast to avoid losing necessary zext / sext etc.
llvm-svn: 95195
2010-02-03 03:28:02 +00:00
Dale Johannesen
1e9d147461 Reapply 95050 with a tweak to check the register class.
llvm-svn: 95183
2010-02-03 01:40:33 +00:00
Chris Lattner
2b798aafd0 make these less sensitive to asm verbose changes by disabling it for them.
llvm-svn: 95175
2010-02-03 00:48:53 +00:00
Dale Johannesen
08ab638bdc Test revert 95050; there's a good chance it's causing
buildbot failure.

llvm-svn: 95103
2010-02-02 18:52:56 +00:00
Evan Cheng
fac0fdc6a0 Perform sibcall in some cases when arguments are passes memory. Look for cases
where callee's arguments are already in the caller's own caller's stack and
they line up perfectly. e.g.

extern int foo(int a, int b, int c);

int bar(int a, int b, int c) {
  return foo(a, b, c);
}

llvm-svn: 95053
2010-02-02 02:22:50 +00:00
Dale Johannesen
a20fc3d1a9 Make local RA smarter about reusing input register of a copy
as output.  Needed for (functional) correctness in inline asm,
and should be generally beneficial.  7361612.

llvm-svn: 95050
2010-02-02 02:08:02 +00:00
Evan Cheng
efa391da81 Fix PR6196. GV callee may not be a function.
llvm-svn: 95017
2010-02-01 22:40:09 +00:00
Dan Gohman
7b3c210a47 Update this test for a trivial register allocation difference.
llvm-svn: 94989
2010-02-01 19:00:32 +00:00
Evan Cheng
dcc1816642 Undo r94946 now all the tests are passing again.
llvm-svn: 94970
2010-02-01 02:13:39 +00:00
Evan Cheng
b5f97d871c Avoid recursive sibcall's.
llvm-svn: 94946
2010-01-31 06:44:49 +00:00
Anton Korobeynikov
f7651ec593 Fix a gross typo: ARMv6+ may or may not support unaligned memory operations.
Even if they are suported by the core, they can be disabled
(this is just a configuration bit inside some register).

Allow unaligned memops on darwin and conservatively disallow them otherwise.

llvm-svn: 94889
2010-01-30 14:08:12 +00:00
Evan Cheng
40ae22e14d Allow more tailcall optimization: calls with inputs that are all passed in registers.
llvm-svn: 94873
2010-01-30 01:22:00 +00:00
Evan Cheng
2cbd1b19db Catch more trivial tail call opportunities: no inputs and output types match.
llvm-svn: 94804
2010-01-29 06:45:59 +00:00
Chris Lattner
d7a8482810 convert the last 3 targets to use EmitFunctionBody() now that
it has before/end body hooks.

 lib/Target/Alpha/AsmPrinter/AlphaAsmPrinter.cpp |   49 ++-----------
 lib/Target/Mips/AsmPrinter/MipsAsmPrinter.cpp   |   87 ++++++------------------
 lib/Target/XCore/AsmPrinter/XCoreAsmPrinter.cpp |   56 +++------------
 test/CodeGen/XCore/ashr.ll                      |    2 
 4 files changed, 48 insertions(+), 146 deletions(-)

llvm-svn: 94741
2010-01-28 06:22:43 +00:00
Evan Cheng
7e26fdaa78 Fix a bug introduced by r94490 where it created a X86ISD::CMP whose output type is different from its inputs.
This fixes PR6146.

llvm-svn: 94731
2010-01-28 01:57:22 +00:00
Chris Lattner
95118672e3 Give AsmPrinter the most common expected implementation of
runOnMachineFunction, and switch PPC to use EmitFunctionBody.
The two ppc asmprinters now don't heave to define 
runOnMachineFunction.

llvm-svn: 94722
2010-01-28 01:28:58 +00:00
Chris Lattner
df1662f0e6 emit a 0 byte instead of a noop if a function is empty on darwin.
"0" is nice and target independent.

llvm-svn: 94718
2010-01-28 01:06:32 +00:00
Chandler Carruth
4b62a01a0c Quick fix to a test that is currently failing on every Linux build bot. No idea
if this is the "correct" fix, but it seems a strict improvement.

llvm-svn: 94675
2010-01-27 10:36:15 +00:00
Evan Cheng
381bc804d6 Perform trivial tail call optimization for callees with "C" ABI. These are done
even when -tailcallopt is not specified and it does not require changing ABI.
First case is the most trivial one. Perform tail call optimization when both
the caller and callee do not return values and when the callee does not take
any input arguments.

llvm-svn: 94664
2010-01-27 06:25:16 +00:00
Chris Lattner
ee2b6b1cc5 emit jump table an alias ".set" directives through MCStreamer as
assignments.

.set x, a-b

is the same as:

x = a-b

llvm-svn: 94596
2010-01-26 21:53:08 +00:00
Rafael Espindola
f46baf3304 Emit .comm alignment in bytes but .align in powers of 2 for ARM ELF.
Original patch by Sandeep Patel and updated by me.

llvm-svn: 94582
2010-01-26 20:21:43 +00:00
Chris Lattner
044439c9bc eliminate MCAsmInfo::NeedsSet: we now just use .set on any platform
that has it.

llvm-svn: 94581
2010-01-26 20:20:43 +00:00
Evan Cheng
548d00d77c Implement cond ? -1 : 0 with sbb.
llvm-svn: 94490
2010-01-26 02:00:44 +00:00
Rafael Espindola
575697fd65 Update test for darwin.
llvm-svn: 94421
2010-01-25 15:32:10 +00:00
Chris Lattner
2020423588 we removed support for darwin8 tools.
llvm-svn: 94414
2010-01-25 07:43:40 +00:00
Rafael Espindola
82a8b3efd4 Fix PR6134.
We are not emitting alignments on Darwin for "bar". Not sure what is the
correct way to do it.

llvm-svn: 94400
2010-01-25 02:27:39 +00:00
Daniel Dunbar
c1df55e99c Attempt to unbreak test on Linux. Chris, please check.
llvm-svn: 94399
2010-01-25 00:54:13 +00:00
Chris Lattner
6fdaf12267 just remove this test, it is not reduced, is not clear what its testing for and
it is dying due to fragility in the asmprinter .s comments.

llvm-svn: 94372
2010-01-24 19:23:09 +00:00
Mon P Wang
d4d1cbb72b It seems better to scalarize vectors of size 1 instead of widening them.
Add support to widen SETCC.

llvm-svn: 94342
2010-01-24 00:24:43 +00:00
Mon P Wang
871ea08e40 Improved widening loads by adding support for wider loads if
the alignment allows.  Fixed a bug where we didn't use a
vector load/store for PR5626.

llvm-svn: 94338
2010-01-24 00:05:03 +00:00
Chris Lattner
1b7c00a4f2 Change constantexpr global variable initializers to convert the constants
to MCExpr then emit them through MCStreamer with EmitValue.  I think all
global variable initializers are now going through mcstreamer.

llvm-svn: 94293
2010-01-23 06:17:14 +00:00
Eric Christopher
e6d6bfcc32 Don't lower splat vector load to relative to the esp if the
stack may be misaligned.

Update test accordingly.

Patch by Evan Cheng!

llvm-svn: 94291
2010-01-23 06:02:43 +00:00
Chris Lattner
e5e7b41090 stop testing for invalid output.
llvm-svn: 94288
2010-01-23 05:45:28 +00:00
Chris Lattner
20a336f1df emit .ascii and .asciz through MCStreamer.
llvm-svn: 94282
2010-01-23 04:54:10 +00:00
Chris Lattner
4515320f9f remove this test.
llvm-svn: 94276
2010-01-23 03:11:10 +00:00
Evan Cheng
9523142c01 Fix test.
llvm-svn: 94272
2010-01-23 01:21:27 +00:00
Evan Cheng
3bd9efa510 Fix tests.
llvm-svn: 94271
2010-01-23 01:19:28 +00:00
Chris Lattner
5f6aca81b6 make this less constrained, we want blank lines between globals.
llvm-svn: 94201
2010-01-22 19:51:08 +00:00
Dan Gohman
525f7d7833 Revert LoopStrengthReduce.cpp to pre-r94061 for now.
llvm-svn: 94123
2010-01-22 00:46:49 +00:00
Chris Lattner
75db03497a testcase for r94095
llvm-svn: 94096
2010-01-21 20:01:04 +00:00
Dan Gohman
be34c35f32 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
Chris Lattner
e4ac42e11e emit basic block labels with mcstreamer.
llvm-svn: 93993
2010-01-20 07:24:05 +00:00
Chris Lattner
d69a9cc334 emit integer and fp zeros as (e.g.) .byte 0 instead of .space 1,
for tidiness.

llvm-svn: 93992
2010-01-20 07:19:19 +00:00
Chris Lattner
3104fa4a71 signficant cleanups to EmitGlobalConstant (including streamerization
of int initializers), change some methods to be static functions,
use raw_ostream::write_hex instead of a smallstring dance with 
APValue::toStringUnsigned(S, 16).

llvm-svn: 93991
2010-01-20 07:11:32 +00:00
Dan Gohman
34b548b94a Fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n)), to simplify some code
that SCEVExpander can produce when running on behalf of LSR.

llvm-svn: 93949
2010-01-19 23:30:49 +00:00
Dan Gohman
94c0e08951 Make SCEVAddRecExpr's getType return a pointer type when the add
has a pointer member. This helps reduce unnecessary bitcasting
and uglygeps.

llvm-svn: 93939
2010-01-19 22:53:50 +00:00
Dan Gohman
190fee462e Add nounwinds.
llvm-svn: 93919
2010-01-19 21:51:51 +00:00
Jakob Stoklund Olesen
e7d4286d73 Remove predicates when changing an add into an unpredicable mov.
Since the mov is executed unconditionally, make sure that the add didn't have
any predicate.

llvm-svn: 93909
2010-01-19 21:08:28 +00:00
Evan Cheng
4b916556a5 Do not extend extension results beyond the use of a PHI instruction at the start of a use block. A PHI use is expected to kill its source values.
llvm-svn: 93895
2010-01-19 19:45:51 +00:00
Chris Lattner
ff16ee18d9 don't let asm-verbose break the check-next lines in these tests.
llvm-svn: 93869
2010-01-19 06:39:54 +00:00
Chris Lattner
377bd87849 Now that we have everything nicely factored (e.g. asmprinter is not
doing global variable classification anymore) and hookized, sink almost
all target targets global variable emission code into AsmPrinter and out
of each target.

Some notes:

1. PIC16 does completely custom and crazy stuff, so it is not changed.
2. XCore has some custom handling for extra directives.  I'll look at it next.
3. This switches linux/ppc to use .globl instead of .global.  If .globl is
   actually wrong, let me know and I'll fix it.
4. This makes linux/ppc get a lot of random cases right which were obviously
   wrong before, it is probably now a bit healthier.
5. Blackfin will probably start getting .comm and other things that it didn't
   before.  If this is undesirable, it should explicitly opt out of these
   things by clearing the relevant fields of MCAsmInfo.

This leads to a nice diffstat:
 14 files changed, 127 insertions(+), 830 deletions(-)

llvm-svn: 93858
2010-01-19 05:38:33 +00:00
Chris Lattner
c42d723862 fix a significant difference between llvm and gcc on ELF systems:
GCC would put weak zero initialized mutable data in the .bss section,
we would put it into a crasy '.gnu.linkonce.b.test,"aw",@nobits' 
section.  Fixing this will allow simplifications next up.

llvm-svn: 93844
2010-01-19 03:06:01 +00:00
Chris Lattner
394370b299 there is no need to emit a .section above .comm on linux.
llvm-svn: 93842
2010-01-19 02:46:56 +00:00
Evan Cheng
572390be3b Test case for r93758.
llvm-svn: 93824
2010-01-19 00:35:20 +00:00
Evan Cheng
5cf9d23e4e Canonicalize -1 - x to ~x.
Instcombine does this but apparently there are situations where this pattern will escape the optimizer and / or created by isel. Here is a case that's seen in JavaScriptCore:
  %t1 = sub i32 0, %a
  %t2 = add i32 %t1, -1
The dag combiner pattern: ((c1-A)+c2) -> (c1+c2)-A
will fold it to -1 - %a.

llvm-svn: 93773
2010-01-18 21:38:44 +00:00
Chris Lattner
fb103355dd reduce this test and convert to filecheck, hopefully the linux buildbot
will tell me something more useful.

llvm-svn: 93688
2010-01-17 19:09:12 +00:00
Bob Wilson
72cf548263 The Neon "vtst" instruction takes a suffix that is the element size alone --
adding an "i" to the suffix, indicating that the elements are integers, is
accepted but not part of the standard syntax.  This helps us pass a few more
of the Neon tests from gcc.

llvm-svn: 93677
2010-01-17 06:35:17 +00:00
Kenneth Uildriks
d6b30baf78 When checking for sret-demotion, it needs to use legal types. When using the return value of an sret-demoted call, it needs to use possibly illegal types that match the declared Type of the callee.
llvm-svn: 93667
2010-01-16 23:37:33 +00:00
Chris Lattner
4d92f15423 this teestcase takes a long time to crash, remove it. If someone cares about this, they should file a bug, it's not doing any good as an xfail.
llvm-svn: 93604
2010-01-16 00:53:22 +00:00
Bob Wilson
3386047bdb Run the pre-register allocation tail duplication pass by default. Remove
the -pre-regalloc-taildup command-line option, and add a new
-disable-early-taildup option.

llvm-svn: 93597
2010-01-16 00:29:50 +00:00
David Greene
e52529d7cb Fix PR6019. A load has more than one use if it feeds a bitconvert that
has more than one use.

llvm-svn: 93576
2010-01-15 23:23:41 +00:00
Jim Grosbach
b09e69dd22 add testcase for r93564
llvm-svn: 93567
2010-01-15 22:27:37 +00:00
Anton Korobeynikov
7a7f5c50d8 Reenable tests
llvm-svn: 93555
2010-01-15 21:19:26 +00:00
Anton Korobeynikov
b4484a2bab Temporary disable tests
llvm-svn: 93501
2010-01-15 02:09:27 +00:00
Anton Korobeynikov
953a94cb69 Add variable-width shifts for MSP430
llvm-svn: 93468
2010-01-14 22:09:38 +00:00
Dan Gohman
7c596d2b00 Fix a codegen abort seen in 483.xalancbmk.
llvm-svn: 93417
2010-01-14 03:08:49 +00:00
Chris Lattner
b5605b72b7 this test requires SSE, thanks to jyasskin for pointing this out.
llvm-svn: 93360
2010-01-13 21:51:41 +00:00
Evan Cheng
0fa1e2d063 Commit some changes I had managed to lose last night while refactoring the code. Avoid change use of PHI instructions because it's not legal to insert any instructions before them.
This fixes PR6027.

llvm-svn: 93335
2010-01-13 19:16:39 +00:00
Evan Cheng
2afc417122 Re-enable extension optimization pass.
llvm-svn: 93313
2010-01-13 08:45:40 +00:00
Chris Lattner
1b6c061cd0 remove uses of deprecated functions, this generates slightly
different BlockAddress labels, but nothing semantically important.

Add a FIXME that BlockAddress codegen is broken if the LLVM BB has 
an empty name (e.g. strip was run).

llvm-svn: 93303
2010-01-13 07:30:49 +00:00
Evan Cheng
973fceab0c Disable opt-ext pass to unbreak the build for now.
llvm-svn: 93286
2010-01-13 01:51:43 +00:00
Jeffrey Yasskin
59ca529e20 Try to fix the ARM and PPC buildbots. The -mattr=vector-unaligned-mem
flag doesn't exist there, and this is an x86 test.

llvm-svn: 93279
2010-01-13 00:31:43 +00:00
Evan Cheng
76db3bb18e Add a quick pass to optimize sign / zero extension instructions. For targets where the pre-extension values are available in the subreg of the result of the extension, replace the uses of the pre-extension value with the result + extract_subreg.
For now, this pass is fairly conservative. It only perform the replacement when both the pre- and post- extension values are used in the block. It will miss cases where the post-extension values are live, but not used.

llvm-svn: 93278
2010-01-13 00:30:23 +00:00
Evan Cheng
0dddace5f1 Add nounwind.
llvm-svn: 93244
2010-01-12 18:29:23 +00:00
Duncan Sands
395053f13a Revert commit 93204, since it causes the assembler to barf
on x86-64 linux with messages like this:
Error: Incorrect register `%r14' used with `l' suffix

llvm-svn: 93242
2010-01-12 17:46:16 +00:00
Dan Gohman
a48d524fbc Make several tests less fragile.
llvm-svn: 93230
2010-01-12 04:52:47 +00:00
Dan Gohman
51b3e804dc Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS.
llvm-svn: 93229
2010-01-12 04:42:54 +00:00
Evan Cheng
a93b476689 Add manual ISD::OR fastisel selection routines. TableGen is no longer autogen them after 93152 and 93191.
llvm-svn: 93204
2010-01-11 22:59:27 +00:00
Evan Cheng
bd938ebc90 Extend r93152 to work on OR r, r. If the source set bits are known not to overlap, then select as an ADD instead.
llvm-svn: 93191
2010-01-11 22:03:29 +00:00
Chris Lattner
644f29ddf5 reduce this to a sensible testcase.
llvm-svn: 93189
2010-01-11 21:58:19 +00:00
David Greene
5d479fa341 Shorten up this testcase.
llvm-svn: 93187
2010-01-11 21:50:35 +00:00
Evan Cheng
bc84a42d7b Revert 93158. It's breaking quite a few x86_64 tests.
llvm-svn: 93185
2010-01-11 21:13:41 +00:00
Jakob Stoklund Olesen
f1c71ef6ba Avoid adding PHI arguments for a predecessor that has gone away when a BRCOND was constant folded.
This fixes PR5980.

llvm-svn: 93184
2010-01-11 21:02:33 +00:00
Dan Gohman
541c4f4c5d Use a 32-bit and with implicit zero-extension instead of a 64-bit and if it
has an immediate with at least 32 bits of leading zeros, to avoid needing to
materialize that immediate in a register first.

FileCheckize, tidy, and extend a testcase to cover this case.

This fixes rdar://7527390.

llvm-svn: 93160
2010-01-11 17:58:34 +00:00
Dan Gohman
5b79391087 Re-instate MOV64r0 and MOV16r0, with adjustments to work with the
new AsmPrinter. This is perhaps less elegant than describing them
in terms of MOV32r0 and subreg operations, but it allows the
current register to rematerialize them.

llvm-svn: 93158
2010-01-11 17:37:57 +00:00
Dan Gohman
5f2685d424 Generalize this check to avoid depending on a specific register assignment.
llvm-svn: 93157
2010-01-11 17:24:27 +00:00
Dan Gohman
d2df677a8f Make this test less trivial, to avoid spurious failures.
llvm-svn: 93156
2010-01-11 17:23:56 +00:00
Evan Cheng
ee806a0db5 Select an OR with immediate as an ADD if the input bits are known zero. This allow the instruction to be 3address-fied if needed.
llvm-svn: 93152
2010-01-11 17:03:47 +00:00
David Greene
b879ff4855 Implement a feature (-vector-unaligned-mem) to allow targets to
ignore alignment requirements for SIMD memory operands.  This
is useful on architectures like the AMD 10h that do not trap on
unaligned references if a status bit is twiddled at startup time.

llvm-svn: 93151
2010-01-11 16:29:42 +00:00
Jeffrey Yasskin
53a8f3981c Fix http://llvm.org/PR5729: x86-64 tail calls were putting their targets into
R11, and then asserting that the target was in R9.  Since R9 isn't reserved for
the target anymore, and is used as an argument, this patch changes the
assertion.

llvm-svn: 93065
2010-01-09 18:56:43 +00:00
Dan Gohman
3708af1c59 Revert an earlier change to SIGN_EXTEND_INREG for vectors. The VTSDNode
really does need to be a vector type, because
TargetLowering::getOperationAction for SIGN_EXTEND_INREG uses that type,
and it needs to be able to distinguish between vectors and scalars.

Also, fix some more issues with legalization of vector casts.

llvm-svn: 93043
2010-01-09 02:13:55 +00:00
Evan Cheng
2e497d1ed4 Fix a critical bug in 64-bit atomic operation lowering for 32-bit. The results of the cmpxchg8b instructions are being thrown away when it branches back to the top of the checking loop. This means the loop always compares against the old value and this can result in a dead lock.
llvm-svn: 93028
2010-01-08 23:41:50 +00:00
Evan Cheng
f96a9ec02b ReplaceAllUsesOfValueWith may delete other nodes that the one being replaced. Do not delete dead nodes again.
llvm-svn: 92988
2010-01-08 02:36:12 +00:00
Chris Lattner
e0199dff81 Fix rdar://7517201, a regression introduced by r92849.
When folding a and(any_ext(load)) both the any_ext and the
load have to have only a single use.

This removes the anyext-uses.ll testcase which started failing
because it is unreduced and unclear what it is testing.

llvm-svn: 92950
2010-01-07 21:59:23 +00:00
Evan Cheng
4523041394 APInt'fy TargetLowering::SimplifySetCC to fix PR5963.
llvm-svn: 92943
2010-01-07 20:58:44 +00:00
Evan Cheng
51d86260ff Fix a minor regression from my dag combiner changes. One more place which needs to look pass truncates.
llvm-svn: 92885
2010-01-07 00:54:06 +00:00
Jakob Stoklund Olesen
09012552b8 Add comments.
llvm-svn: 92883
2010-01-07 00:51:04 +00:00
Jakob Stoklund Olesen
a63aa4e54b Add Target hook to duplicate machine instructions.
Some instructions refer to unique labels, and so cannot be trivially cloned
with CloneMachineInstr.

llvm-svn: 92873
2010-01-06 23:47:07 +00:00
Evan Cheng
25dcf9b830 Teach dag combine to fold the following transformation more aggressively:
(OP (trunc x), (trunc y)) -> (trunc (OP x, y))

Unfortunately this simple change causes dag combine to infinite looping. The problem is the shrink demanded ops optimization tend to canonicalize expressions in the opposite manner. That is badness. This patch disable those optimizations in dag combine but instead it is done as a late pass in sdisel.

This also exposes some deficiencies in dag combine and x86 setcc / brcond lowering. Teach them to look pass ISD::TRUNCATE in various places.

llvm-svn: 92849
2010-01-06 19:38:29 +00:00
Dan Gohman
93a28a6ce9 Move this test from test/Transforms/IndVarSimplify to
test/CodeGen/X86, as doesn't use -indvars, and it does use
llc -march=x86-64.

llvm-svn: 92799
2010-01-05 22:52:54 +00:00
Bill Wendling
7e9607ab56 Don't assign the shift the same type as the variable being shifted. This could
result in illegal types for the SHL operator.

llvm-svn: 92797
2010-01-05 22:39:10 +00:00
Dan Gohman
5fa04f2707 Delete useless trailing semicolons.
llvm-svn: 92740
2010-01-05 17:55:26 +00:00
Dan Gohman
73b0882c6e Make this test more portable.
llvm-svn: 92514
2010-01-04 21:23:34 +00:00
Dan Gohman
b71bc40eed Add some tests and update an existing test to reflect recent
x86 isel peeps.

llvm-svn: 92509
2010-01-04 20:53:54 +00:00
Anton Korobeynikov
3915cf5ef4 Fix invalid chain folding for memory variant of sdiv / udiv
llvm-svn: 92472
2010-01-04 10:31:54 +00:00
Chris Lattner
8e83066d12 fix PR5930, allowing the asmprinter to emit difference between
two labels as a truncate.

llvm-svn: 92455
2010-01-03 18:33:18 +00:00
Chris Lattner
49cda26f7e add PR#
llvm-svn: 92451
2010-01-03 18:10:58 +00:00
Chris Lattner
7246a69d2b differences between two blockaddress's don't cause a
global variable initializer to require relocations.

llvm-svn: 92450
2010-01-03 18:09:40 +00:00
Chris Lattner
9e64bad0da allow this to work on linux hosts.
llvm-svn: 92407
2010-01-02 00:22:15 +00:00
Chris Lattner
fe8af82cd4 Teach codegen to handle:
(X != null) | (Y != null) --> (X|Y) != 0
 (X == null) & (Y == null) --> (X|Y) == 0

so that instcombine can stop doing this for pointers.  This is part of PR3351,
which is a case where instcombine doing this for pointers (inserting ptrtoint)
is pessimizing code.

llvm-svn: 92406
2010-01-02 00:00:03 +00:00
Chris Lattner
4e49a69ec5 rename file.
llvm-svn: 92405
2010-01-01 23:55:04 +00:00
Chris Lattner
44298d184a Teach codegen to lower llvm.powi to an efficient (but not optimal)
multiply sequence when the power is a constant integer.  Before, our
codegen for std::pow(.., int) always turned into a libcall, which was
really inefficient.

This should also make many gfortran programs happier I'd imagine.

llvm-svn: 92388
2010-01-01 03:32:16 +00:00
Chris Lattner
3d38dbff2a Make this more likely to generate a libcall.
llvm-svn: 92387
2010-01-01 03:26:51 +00:00
Sanjiv Gupta
543a6716fb Extern declaration for unordered.f32 libcall was not being emitted. Fixed that.
llvm-svn: 92242
2009-12-29 03:24:34 +00:00
Sanjiv Gupta
efad5b2a93 Fixed llc crash for zext (i1 -> i8) loads.
llvm-svn: 92201
2009-12-28 04:53:24 +00:00
Chris Lattner
4e96d36f72 handle equality memcmp of 8 bytes on x86-64 with two unaligned loads and a
compare.  On other targets we end up with a call to memcmp because we don't
want 16 individual byte loads.  We should be able to use movups as well, but
we're failing to select the generated icmp.

llvm-svn: 92107
2009-12-24 01:07:17 +00:00
Chris Lattner
5d3919d5f9 move an optimization for memcmp out of simplifylibcalls and into
SDISel.  This optimization was causing simplifylibcalls to 
introduce type-unsafe nastiness.  This is the first step, I'll be 
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.

llvm-svn: 92098
2009-12-24 00:37:38 +00:00
Sanjiv Gupta
7872817f59 Reapply 91904.
llvm-svn: 91996
2009-12-23 11:19:09 +00:00
Sanjiv Gupta
1cd15ef29f deleting empty file.
llvm-svn: 91994
2009-12-23 10:35:24 +00:00
Sanjiv Gupta
70e1523215 Reverting back 91904.
llvm-svn: 91993
2009-12-23 09:46:01 +00:00
Dale Johannesen
b4485fd8a9 Use more sensible type for flags in asms. PR 5570.
Patch by Sylve`re Teissier (sorry, ASCII only).

llvm-svn: 91988
2009-12-23 07:32:51 +00:00
Eric Christopher
ce677a909d Update objectsize intrinsic and associated dependencies. Fix
lowering code and update testcases.

llvm-svn: 91979
2009-12-23 02:51:48 +00:00
Anton Korobeynikov
04878d43e1 Add testcase for PR5703
llvm-svn: 91931
2009-12-22 22:37:23 +00:00
Evan Cheng
7cd6bfe549 Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size.
llvm-svn: 91910
2009-12-22 17:47:23 +00:00
Sanjiv Gupta
9581b4dc62 While converting one of the operands to a memory operand, we need to check if it is Legal and does not result into a cyclic dep.
llvm-svn: 91904
2009-12-22 14:25:37 +00:00
Sanjiv Gupta
14c9f2ed42 Emit direction operand in binary insns that stores in memory.
llvm-svn: 91777
2009-12-19 13:52:01 +00:00
Sanjiv Gupta
df6eadc436 Test cases for changes done in 91768.
llvm-svn: 91773
2009-12-19 11:38:14 +00:00
Evan Cheng
bc37151dea Increase opportunities to optimize (brcond (srl (and c1), c2)).
llvm-svn: 91717
2009-12-18 21:31:31 +00:00
Evan Cheng
d97d025eba On recent Intel u-arch's, folding loads into some unary SSE instructions can
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.

movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0

instead of
cvtss2sd (%rdi), %xmm0

An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0

llvm-svn: 91672
2009-12-18 07:40:29 +00:00
Dan Gohman
d97f165eb2 Tidy up this testcase and add test for tailcall optimization
with unreachable.

llvm-svn: 91650
2009-12-18 01:05:06 +00:00
Bob Wilson
a9f20f9f6e Handle ARM inline asm "w" constraints with 64-bit ("d") registers.
The change in SelectionDAGBuilder is needed to allow using bitcasts to convert
between f64 (the default type for ARM "d" registers) and 64-bit Neon vector
types.  Radar 7457110.

llvm-svn: 91649
2009-12-18 01:03:29 +00:00
Dan Gohman
c382d6519c Remove "tail" keywords. These calls are not intended to be tail calls.
This protects this test from depending on codegen not performing the
tail call optimization by default.

llvm-svn: 91648
2009-12-18 01:02:18 +00:00
Jakob Stoklund Olesen
b39930cf6d Add test case for the phi reuse patch.
llvm-svn: 91642
2009-12-18 00:11:44 +00:00
Sean Callanan
06b6feb2e1 Instruction fixes, added instructions, and AsmString changes in the
X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-line violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-line violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have to a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMV_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638
2009-12-18 00:01:26 +00:00
Evan Cheng
dbd8789125 Revert this dag combine change:
Fold (zext (and x, cst)) -> (and (zext x), cst)

DAG combiner likes to optimize expression in the other way so this would end up cause an infinite looping.

llvm-svn: 91574
2009-12-17 00:40:05 +00:00
Nick Lewycky
503ef79cc5 Make this test pass on Linux.
llvm-svn: 91521
2009-12-16 07:35:25 +00:00
Evan Cheng
aaf2f58a04 Re-enable 91381 with fixes.
llvm-svn: 91489
2009-12-16 00:53:11 +00:00
Dale Johannesen
365ae431a7 Do better with physical reg operands (typically, from inline asm)
in local register allocator.  If a reg-reg copy has a phys reg
input and a virt reg output, and this is the last use of the phys
reg, assign the phys reg to the virt reg.  If a reg-reg copy has
a phys reg output and we need to reload its spilled input, reload
it directly into the phys reg than passing it through another reg.

Following 76208, there is sometimes no dependency between the def of
a phys reg and its use; this creates a window where that phys reg
can be used for spilling (this is true in linear scan also).  This
is bad and needs to be fixed a better way, although 76208 works too
well in practice to be reverted.  However, there should normally be
no spilling within inline asm blocks.  The patch here goes a long way
towards making this actually be true.

llvm-svn: 91485
2009-12-16 00:29:41 +00:00
Kenneth Uildriks
c0ab5a6e88 For fastcc on x86, let ECX be used as a return register after EAX and EDX
llvm-svn: 91410
2009-12-15 03:27:52 +00:00
Evan Cheng
4adb4acc7b Disable 91381 for now. It's miscompiling ARMISelDAG2DAG.cpp.
llvm-svn: 91405
2009-12-15 03:07:11 +00:00
Evan Cheng
c531da60aa Make 91378 more conservative.
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it remove at least one zest.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.

llvm-svn: 91399
2009-12-15 03:00:32 +00:00
Evan Cheng
cd8f0de016 Use sbb x, x to materialize carry bit in a GPR. The result is all one's or all zero's.
llvm-svn: 91381
2009-12-15 00:53:42 +00:00
Evan Cheng
bd48ad16fa Fold (zext (and x, cst)) -> (and (zext x), cst).
llvm-svn: 91380
2009-12-15 00:52:11 +00:00
Evan Cheng
f3b2e55b34 Propagate zest through logical shift.
llvm-svn: 91378
2009-12-15 00:41:36 +00:00
Dan Gohman
57dc006590 Fix integer cast code to handle vector types.
llvm-svn: 91362
2009-12-14 23:40:38 +00:00
Evan Cheng
ee5b5917fd Disable r91104 for x86. It causes partial register stall which pessimize code in 32-bit.
llvm-svn: 91223
2009-12-12 20:03:14 +00:00
Anton Korobeynikov
724c82337f Lower setcc branchless, if this is profitable.
Based on the patch by Brian Lucas!

llvm-svn: 91175
2009-12-11 23:01:29 +00:00
Dan Gohman
2e616e859b Implement vector widening, splitting, and scalarizing for SIGN_EXTEND_INREG.
llvm-svn: 91158
2009-12-11 21:31:27 +00:00
Dan Gohman
0a78e32f6b Change this to the correct PR number.
llvm-svn: 91148
2009-12-11 20:09:21 +00:00
Dan Gohman
b2cbb1e37e Fix the result type of SELECT nodes lowered from Select instructions with
aggregate return values. This fixes PR5754.

llvm-svn: 91145
2009-12-11 19:50:50 +00:00
Anton Korobeynikov
f8b2e2868e Honour setHasCalls() set from isel.
This is used in some weird cases like general dynamic TLS model.
This fixes PR5723

llvm-svn: 91144
2009-12-11 19:39:55 +00:00
Evan Cheng
4c304eebe9 Tests for 91103 and 91104.
llvm-svn: 91105
2009-12-11 06:02:21 +00:00
Evan Cheng
4b7cf3ed41 It's not safe to coalesce a move where src and dst registers have different subregister indices. e.g.:
%reg16404:1<def> = MOV8rr %reg16412:2<kill>

llvm-svn: 91061
2009-12-10 20:59:45 +00:00
Evan Cheng
bc633478bd Fix test.
llvm-svn: 90988
2009-12-09 22:24:42 +00:00
Evan Cheng
9e2442c0be Optimize splat of a scalar load into a shuffle of a vector load when it's legal. e.g.
vector_shuffle (scalar_to_vector (i32 load (ptr + 4))), undef, <0, 0, 0, 0>
=>
vector_shuffle (v4i32 load ptr), undef, <1, 1, 1, 1>

iff ptr is 16-byte aligned (or can be made into 16-byte aligned).

llvm-svn: 90984
2009-12-09 21:00:30 +00:00
Evan Cheng
41c13e41fe Teach InferPtrAlignment to infer GV+cst alignment and use it to simplify x86 isl lowering code.
llvm-svn: 90925
2009-12-09 01:53:58 +00:00
Evan Cheng
edcc21919f - Support inline asm 'w' constraint for 128-bit vector types.
- Also support the 'q' NEON registers asm code.

llvm-svn: 90894
2009-12-08 23:06:22 +00:00
Anton Korobeynikov
0ace515a4c Reduce (cmp 0, and_su (foo, bar)) into (bit foo, bar). This saves extra instruction. Patch inspired by Brian Lucas!
llvm-svn: 90819
2009-12-08 01:03:04 +00:00
David Greene
73ad44c6b6 Use FileCheck and set nounwind on calls.
llvm-svn: 90790
2009-12-07 19:40:26 +00:00
Dan Gohman
44e25ed254 Don't enable the post-RA scheduler on x86 except at -O3. In its
current form, it is too expensive in compile time.

llvm-svn: 90781
2009-12-07 19:04:31 +00:00
Anton Korobeynikov
eee906f4f0 Dynamic stack realignment use of sp register as source/dest register
in "bic sp, sp, #15" leads to unpredicatble behaviour in Thumb2 mode.
Emit the following code instead:
mov r4, sp
bic r4, r4, #15
mov sp, r4

llvm-svn: 90724
2009-12-06 22:39:50 +00:00
Bill Wendling
887646a585 Temporarily revert r90502. It was causing the llvm-gcc bootstrap on PPC to fail.
llvm-svn: 90653
2009-12-05 07:30:23 +00:00
Jakob Stoklund Olesen
7c5af26d12 Also attempt trivial coalescing for live intervals that end in a copy.
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.

The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.

This patch takes care of a few more cases that r90163 missed.

llvm-svn: 90502
2009-12-04 00:16:04 +00:00
Nate Begeman
3a9c51f256 Don't pull vector sext through both hands of a logical operation, since doing so prevents the fusion of vector sext and setcc into vsetcc.
Add a testcase for the above transformation.
Fix a bogus use of APInt noticed while tracking this down.

llvm-svn: 90423
2009-12-03 07:11:29 +00:00
Bob Wilson
b53c801366 Recognize canonical forms of vector shuffles where the same vector is used for
both source operands.  In the canonical form, the 2nd operand is changed to an
undef and the shuffle mask is adjusted to only reference elements from the 1st
operand.  Radar 7434842.

llvm-svn: 90417
2009-12-03 06:40:55 +00:00
Bill Wendling
0eb481a249 Remove unnecessary check.
llvm-svn: 90352
2009-12-02 22:02:20 +00:00
Evan Cheng
0c687845b1 Fix PR5391: support early clobber physical register def tied with a use (ewwww)
- A valno should be set HasRedefByEC if there is an early clobber def in the middle of its live ranges. It should not be set if the def of the valno is defined by an early clobber.
- If a physical register def is tied to an use and it's an early clobber, it just means the HasRedefByEC is set since it's still one continuous live range.
- Add a couple of missing checks for HasRedefByEC in the coalescer. In general, it should not coalesce a vr with a physical register if the physical register has a early clobber def somewhere. This is overly conservative but that's the price for using such a nasty inline asm "feature".

llvm-svn: 90269
2009-12-01 22:25:00 +00:00
Jim Grosbach
7688d320c9 test case for IV-Users simplification loop improvement
llvm-svn: 90260
2009-12-01 21:53:51 +00:00
Jakob Stoklund Olesen
f07d6129a2 Use CFG connectedness as a secondary sort key when deciding the order of copy coalescing.
This means that well connected blocks are copy coalesced before the less connected blocks. Connected blocks are more difficult to
coalesce because intervals are more complicated, so handling them first gives a greater chance of success.

llvm-svn: 90194
2009-12-01 03:03:00 +00:00
Evan Cheng
fcbc30f36e Fix PR5614: parts of a physical register def may be killed the rest.
llvm-svn: 90180
2009-12-01 00:44:45 +00:00
Jakob Stoklund Olesen
ce2743a619 New virtual registers created for spill intervals should inherit allocation hints from the original register.
This helps us avoid silly copies when rematting values that are copied to a physical register:

leaq	_.str44(%rip), %rcx
movq	%rcx, %rsi
call	_strcmp

becomes:

leaq	_.str44(%rip), %rsi
call	_strcmp

The coalescer will not touch the movq because that would tie down the physical register.

llvm-svn: 90163
2009-11-30 22:55:54 +00:00
Mon P Wang
22b4e4e223 Add test case for r90108
llvm-svn: 90109
2009-11-30 02:42:27 +00:00
Duncan Sands
638c57757d While this test is testing a problem in the generic part of codegen,
the problem only shows for msp430 and pic16 which is why it specifies
them using -march.  But it is wrong to put such tests in CodeGen/Generic,
since not everyone builds these targets.  Put a copy of the test in each
of the target test directories.

llvm-svn: 90005
2009-11-27 16:04:14 +00:00
Evan Cheng
dd352c2a81 Test for 89905.
llvm-svn: 89906
2009-11-26 00:35:01 +00:00
Evan Cheng
bdedf32e51 ProcessImplicitDefs should watch out for invalidated iterator and extra implicit operands on copies.
llvm-svn: 89880
2009-11-25 21:13:39 +00:00
Bruno Cardoso Lopes
038281c523 Support PIC loading of constant pool entries
llvm-svn: 89863
2009-11-25 12:17:58 +00:00
Dale Johannesen
5809ff0e58 Do not store R31 into the caller's link area on PPC.
This violates the ABI (that area is "reserved"), and
while it is safe if all code is generated with current
compilers, there is some very old code around that uses
that slot for something else, and breaks if it is stored
into.  Adjust testcases looking for current behavior.
I've verified that the stack frame size is right in all
testcases, whether it changed or not.  7311323.

llvm-svn: 89811
2009-11-24 22:59:02 +00:00
Evan Cheng
b81878ed80 Enable predication of NEON instructions in Thumb2 mode.
llvm-svn: 89748
2009-11-24 08:06:15 +00:00
Anton Korobeynikov
0f885eb7fd Materialize global addresses via movt/movw pair, this is always better
than doing the same via constpool:
1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
2. Load from constpool might stall up to 300 cycles due to cache miss.
3. Movt/movw does not use load/store unit.
4. Less constpool entries => better compiler performance.

This is only enabled on ELF systems, since darwin does not have needed
relocations (yet).

llvm-svn: 89720
2009-11-24 00:44:37 +00:00
Jim Grosbach
76b545e988 move fconst[sd] to UAL. <rdar://7414913>
llvm-svn: 89700
2009-11-23 21:08:25 +00:00
Jim Grosbach
b7607ee5fe update test for 89694
llvm-svn: 89695
2009-11-23 20:39:53 +00:00
Edward O'Callaghan
573a04cfbb Miss two, PR5307.
llvm-svn: 89596
2009-11-22 15:35:28 +00:00
Edward O'Callaghan
a295e7bd9b Convert Thumb2 tests to FileCheck for PR5307.
llvm-svn: 89595
2009-11-22 15:18:27 +00:00
Benjamin Kramer
7968de0cde Turns out stuff gets allocated to different registers depending on the subtarget.
llvm-svn: 89594
2009-11-22 15:15:52 +00:00
Edward O'Callaghan
d1c7b40bb5 Convert ARM tests to FileCheck for PR5307.
llvm-svn: 89593
2009-11-22 14:23:33 +00:00
Benjamin Kramer
a08534a88d Convert test to FileCheck.
llvm-svn: 89589
2009-11-22 13:16:36 +00:00
Edward O'Callaghan
1a250b4109 Forgot to alter RUN line when converting to FileCheck.
llvm-svn: 89588
2009-11-22 13:09:48 +00:00
Edward O'Callaghan
5ae4559914 Fix for bad FileCheck converts in revision 89584.
llvm-svn: 89586
2009-11-22 12:50:05 +00:00
Edward O'Callaghan
949850890f Convert a few tests to FileCheck for PR5307.
llvm-svn: 89584
2009-11-22 11:45:44 +00:00
Jim Grosbach
99c5b49c61 Revert 89562. We're being sneakier than I was giving us credit for, and this
isn't necessary.

llvm-svn: 89568
2009-11-21 23:34:09 +00:00
Jim Grosbach
d4603a5c4e Darwin requires a frame pointer for all non-leaf functions to support correct
backtraces.

llvm-svn: 89562
2009-11-21 21:40:08 +00:00
Jakob Stoklund Olesen
78f465dc49 Don't leave temporary files in the test directory.
llvm-svn: 89531
2009-11-21 02:05:31 +00:00
Dale Johannesen
907ff5a620 When generating a vector the really slow way, via loads
and stores, handle the case where the element size is not
a valid target type correctly (PPC).

llvm-svn: 89521
2009-11-21 00:53:23 +00:00
Evan Cheng
9828118adf Enable hoisting load from constant memories.
llvm-svn: 89510
2009-11-20 23:31:34 +00:00
Sean Callanan
78ee7f5d57 Recommitting PALIGNR shift width fixes.
Thanks to Daniel Dunbar for fixing clang intrinsics:
  http://llvm.org/viewvc/llvm-project?view=rev&revision=89499

llvm-svn: 89500
2009-11-20 22:28:42 +00:00
Dale Johannesen
45f80d39f6 Remove an incorrect overaggressive optimization
(PPC specific).

llvm-svn: 89496
2009-11-20 22:16:40 +00:00
Sean Callanan
d92626fc0d Reverting PALIGNR fix until I figure out how this
broke the Clang testsuite.

llvm-svn: 89495
2009-11-20 22:09:28 +00:00
Sean Callanan
0da77167d3 Fixed PALIGNR to take 8-bit rotations in all cases.
Also fixed the corresponding testcase, and the PALIGNR
  intrinsic (tested for correctness with llvm-gcc).

llvm-svn: 89491
2009-11-20 21:40:28 +00:00
Evan Cheng
9f57c4916e Remat VLDRD from constpool. Clean up some instruction property specifications.
llvm-svn: 89478
2009-11-20 19:57:15 +00:00
Duncan Sands
072d688d75 Fix PR5558, which was caused by a wrong fix for PR3393 (see commit 63048),
which was an expensive checks failure due to a bug in the checking.  This
patch in essence reverts the original fix for PR3393, and refixes it by a
tweak to the way expensive checking is done.

llvm-svn: 89454
2009-11-20 10:45:10 +00:00
Dan Gohman
d3d7358309 Fix fast-isel to avoid selecting the return instruction if a
tail call has been encountered.

llvm-svn: 89444
2009-11-20 02:51:26 +00:00
Evan Cheng
5fe8b0b3c5 Also CSE non-pic load from constant pools.
llvm-svn: 89440
2009-11-20 02:10:27 +00:00
Evan Cheng
405012b096 Fix codegen of conditional move of immediates. We were not making use of the immediate forms of cmov instructions at all.
llvm-svn: 89423
2009-11-20 00:54:03 +00:00
Daniel Dunbar
cfcc2952fb Unbreak test, Bruno please check.
llvm-svn: 89329
2009-11-19 07:18:49 +00:00
Evan Cheng
987b8c3d9a More consistent thumb1 asm printing.
llvm-svn: 89328
2009-11-19 06:57:41 +00:00
Evan Cheng
c2e359a418 Shrink ldr / str [sp, imm0-1024] to 16-bit instructions.
llvm-svn: 89326
2009-11-19 06:32:27 +00:00
Bruno Cardoso Lopes
bf95b9699e - Add sugregister logic to handle f64=(f32,f32).
- Support mips1 like load/store of doubles:

Instead of:
  sdc $f0, X($3)
Generate:
  swc $f0, X($3)
  swc $f1, X+4($3)

llvm-svn: 89322
2009-11-19 06:06:13 +00:00
Bill Wendling
ecc50bcc77 Test from Dhrystone to make sure that we're not emitting an aligned load for a
string that's aligned at 8-bytes instead of 16-bytes.

llvm-svn: 89295
2009-11-19 01:33:57 +00:00
Bob Wilson
70bfa110eb Fix buildbots.
llvm-svn: 89274
2009-11-18 23:30:38 +00:00
Richard Osborne
fc2d5141a4 Add XCore support for indirectbr / blockaddress.
llvm-svn: 89273
2009-11-18 23:20:42 +00:00
Bob Wilson
dccd3bdb4e Tail duplication still needs to iterate. Duplicating new instructions onto
the tail of a block may make that block a new candidate for duplication.

llvm-svn: 89264
2009-11-18 22:52:37 +00:00
Jakob Stoklund Olesen
7b5afd4dd6 Fix PR5300.
When TwoAddressInstructionPass deletes a dead instruction, make sure that all
register kills are accounted for. The 2-addr register does not get special
treatment.

llvm-svn: 89246
2009-11-18 21:33:35 +00:00
Jakob Stoklund Olesen
9472aae362 Fix inverted test and add testcase from failing self-host.
llvm-svn: 89167
2009-11-18 00:02:18 +00:00
Jakob Stoklund Olesen
f96b51a084 Remove fragile test.
llvm-svn: 89150
2009-11-17 21:52:40 +00:00
Jim Grosbach
d4db2d58ae Enable arm jumpt table adjustment.
llvm-svn: 89143
2009-11-17 21:24:11 +00:00
Anton Korobeynikov
6b1a243be8 Forgot to commit test fixes
llvm-svn: 89138
2009-11-17 20:38:36 +00:00
Jakob Stoklund Olesen
0ca73b9208 Enable -split-phi-edges by default, except when -regalloc=local.
The local register allocator doesn't like it when LiveVariables is run.
We should also disable edge splitting under -O0, but that has to wait a bit.

llvm-svn: 89125
2009-11-17 19:15:50 +00:00
Evan Cheng
0f7e9f7cec Revert 89021. It's miscompiling llvm-gcc driver driver at -O0.
llvm-svn: 89082
2009-11-17 09:55:52 +00:00
Jakob Stoklund Olesen
6ac8f7ec34 Enable -split-phi-edges by default
llvm-svn: 89021
2009-11-17 01:07:22 +00:00
Evan Cheng
6e4430374e MOV64rm should be marked isReMaterializable.
llvm-svn: 89019
2009-11-17 00:55:55 +00:00
Jim Grosbach
b123a9cbc0 Convert to FileCheck
llvm-svn: 89007
2009-11-17 00:20:26 +00:00
Jim Grosbach
2f09113304 Convert to FileCheck
llvm-svn: 89002
2009-11-17 00:03:38 +00:00
Jim Grosbach
299e4e76c4 Cleanup. Missed removing these when converting. Oops.
llvm-svn: 89001
2009-11-17 00:00:33 +00:00
Dan Gohman
c2979de134 Fix this test - there don't appear to be any actual Reload Reuses
in this testcase.

llvm-svn: 88998
2009-11-16 23:49:55 +00:00
Dan Gohman
c35e84e1f5 Revert r87049, which was the workaround for the regression triggered
by the recent FixedStackPseudoSourceValue-related changes, now that
the specific bug that affected it is fixed, in r88954.

llvm-svn: 88997
2009-11-16 23:43:42 +00:00
Jim Grosbach
95cf7fad36 Convert to FileCheck
llvm-svn: 88991
2009-11-16 23:19:29 +00:00
Evan Cheng
78be20d62e - Check memoperand alignment instead of checking stack alignment. Most load / store folding instructions are not referencing spill stack slots.
- Mark MOVUPSrm re-materializable.

llvm-svn: 88974
2009-11-16 21:56:03 +00:00
Jim Grosbach
f4abb1280a Convert to FileCheck
llvm-svn: 88947
2009-11-16 20:04:15 +00:00
Lang Hames
6a5810c037 Added a testcase for PR5495.
llvm-svn: 88946
2009-11-16 20:03:13 +00:00
Jim Grosbach
deee4fbd5d Convert to FileCheck
llvm-svn: 88942
2009-11-16 19:46:46 +00:00
Jim Grosbach
0ba7bb08d7 tbb opt off by default
llvm-svn: 88921
2009-11-16 17:24:45 +00:00
David Greene
6469fa6824 Support spill comments.
Have the asm printer emit a comment if an instruction is a spill or
reload and have the spiller mark copies it introdues so the asm printer
can also annotate those.

llvm-svn: 88911
2009-11-16 15:12:23 +00:00
Evan Cheng
ea46259f53 Check if subreg index is zero.
llvm-svn: 88899
2009-11-16 06:31:49 +00:00
Evan Cheng
2fa416debd For some targets, a copy can use a register multiple times, e.g. ppc.
llvm-svn: 88895
2009-11-16 05:52:06 +00:00
Evan Cheng
5c06a152a8 xfail for now. It has been failing.
llvm-svn: 88892
2009-11-16 05:44:04 +00:00
Bruno Cardoso Lopes
21ca44ba49 - Fix a small bug while handling target constant pools (one param was missing).
- Add a smarter constant pool loading, instead of:

lui $2, %hi($CPI1_0)
addiu $2, $2, %lo($CPI1_0)
lwc1 $f0, 0($2)

Generate:

lui $2, %hi($CPI1_0)
lwc1 $f0, %lo($CPI1_0)($2)

llvm-svn: 88886
2009-11-16 04:33:42 +00:00
Jim Grosbach
1aa571da3c Detect need for autoalignment of the stack earlier to catch spills more
conservatively. eliminateFrameIndex() machinery adjust to handle addr mode
6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling

llvm-svn: 88874
2009-11-15 21:45:34 +00:00
Jim Grosbach
6028068e88 remove xfail
llvm-svn: 88817
2009-11-14 21:57:35 +00:00
Richard Osborne
8748f55236 Add XCore support for arbitrary-sized aggregate returns.
llvm-svn: 88802
2009-11-14 19:33:35 +00:00
Evan Cheng
b8c04e1226 Added getSubRegIndex(A,B) that returns subreg index of A to B. Use it to replace broken code in VirtRegRewriter.
llvm-svn: 88753
2009-11-14 03:42:17 +00:00
Evan Cheng
9b46e74f42 - Change TargetInstrInfo::reMaterialize to pass in TargetRegisterInfo.
- If destination is a physical register and it has a subreg index, use the
  sub-register instead.
This fixes PR5423.

llvm-svn: 88745
2009-11-14 02:55:43 +00:00
Evan Cheng
3781b2e7b3 Add radar number.
llvm-svn: 88739
2009-11-14 02:11:32 +00:00
Evan Cheng
c56b0a0f14 Fix PR5412: Fix an inverted check and another missing sub-register check.
llvm-svn: 88738
2009-11-14 02:09:09 +00:00
Dan Gohman
b36274632d Enable the tail call optimization when the caller returns undef.
llvm-svn: 88737
2009-11-14 02:06:30 +00:00
Evan Cheng
e43198c166 When expanding t2STRDi8 r, r to two stores, add kill markers correctly.
llvm-svn: 88734
2009-11-14 01:50:00 +00:00
Evan Cheng
e2907b91de Fix PR5411. Bug in UpdateKills. A reg def partially define its super-registers.
llvm-svn: 88719
2009-11-13 23:16:41 +00:00
Dan Gohman
972293611d When optimizing for size, don't tail-merge unless it's likely to be a
code-size win, and not when it's only likely to be code-size neutral,
such as when only a single instruction would be eliminated and a new
branch would be required.

This fixes rdar://7392894.

llvm-svn: 88692
2009-11-13 21:02:15 +00:00
Evan Cheng
f629fdcab2 Fix PR5410: LiveVariables lost subreg def:
D0<def,dead> = ...
...
             = S0<use, kill>
S0<def>      = ...
...
D0<def>      = 

The first D0 def is correctly marked dead, however, livevariables should have
added an implicit def of S0 or we end up with a use without a def.

llvm-svn: 88690
2009-11-13 20:36:40 +00:00
Dan Gohman
01b65e1e48 Don't let a noalias difference disrupt the tailcall optimization.
llvm-svn: 88672
2009-11-13 18:49:38 +00:00
Dale Johannesen
f57a58c4fe Adjust isConstantSplat to allow for big-endian targets.
PPC is such a target; make it work.

llvm-svn: 87060
2009-11-13 01:45:18 +00:00
Daniel Dunbar
cdeab5257c Update test.
llvm-svn: 87049
2009-11-13 01:01:58 +00:00
Jim Grosbach
8ffdb5d109 Clean up testcase a bit. Simplify case blocks and adjust switch instruction to not take an undefined value as input.
llvm-svn: 86997
2009-11-12 17:19:09 +00:00
Benjamin Kramer
86592507dc Fix typo in run line.
llvm-svn: 86984
2009-11-12 12:35:27 +00:00
Evan Cheng
deacae0dd9 RegScavenger::enterBasicBlock should always reset register state.
llvm-svn: 86972
2009-11-12 07:49:10 +00:00
Evan Cheng
b0a193db31 - Teach LSR to avoid changing cmp iv stride if it will create an immediate that
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
  later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
  users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
  iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.

llvm-svn: 86969
2009-11-12 07:35:05 +00:00
Dan Gohman
f8ec4856e4 Tail merge at any size when there are two potentials blocks and one
can be made to fall through into the other.

llvm-svn: 86909
2009-11-12 00:39:10 +00:00
Kenneth Uildriks
82bc831061 x86 users can now return arbitrary sized structs. Structs too large to fit in return registers will be returned through a hidden sret parameter introduced during SelectionDAG construction.
llvm-svn: 86876
2009-11-11 19:59:24 +00:00
Dan Gohman
9f47de10e3 Add support for tail duplication to BranchFolding, and extend
tail merging support to handle more cases.
 - Recognize several cases where tail merging is beneficial even when
   the tail size is smaller than the generic threshold.
 - Make use of MachineInstrDesc::isBarrier to help detect
   non-fallthrough blocks.
 - Check for and avoid disrupting fall-through edges in more cases.

llvm-svn: 86871
2009-11-11 19:48:59 +00:00
Evan Cheng
913687616e Add nounwind.
llvm-svn: 86814
2009-11-11 07:11:02 +00:00
Bill Wendling
a6d7a411d3 Fix test to work on every platform.
llvm-svn: 86786
2009-11-11 01:44:22 +00:00
Bill Wendling
8718dfbbaa Fix test to work on every platform.
llvm-svn: 86785
2009-11-11 01:41:32 +00:00
Bill Wendling
33ab3cd1bc Make sure that the exception handling data has the same visibility as the
function it's generated for.

llvm-svn: 86779
2009-11-11 01:24:59 +00:00
Bill Wendling
ff705446e1 Test this on Darwin only.
llvm-svn: 86752
2009-11-10 23:18:33 +00:00
Dale Johannesen
20e1cd09ba Emit correct code when making a ConstantPool entry for a vector
constant whose component type is not a legal type for the target.
(If the target ConstantPool cannot handle this type either, it has
an opportunity to merge elements.  In practice any target with
8-bit bytes must support i8 *as data*).  7320806 (partial).

llvm-svn: 86751
2009-11-10 23:16:41 +00:00
Bill Wendling
1176227990 Modify how the prologue encoded the "move" information for the FDE. GCC
generates a sequence similar to this:

__Z4funci:
LFB2:
        mflr r0
LCFI0:
        stmw r30,-8(r1)
LCFI1:
        stw r0,8(r1)
LCFI2:
        stwu r1,-80(r1)
LCFI3:
        mr r30,r1
LCFI4:

where LCFI3 and LCFI4 are used by the FDE to indicate what the FP, LR, and other
things are. We generated something more like this:

Leh_func_begin1:
        mflr r0
        stw r31, 20(r1)
        stw r0, 8(r1)
Llabel1:
        stwu r1, -80(r1)
Llabel2:
        mr r31, r1

Note that we are missing the "mr" instruction. This patch makes it more like the
GCC output.

llvm-svn: 86729
2009-11-10 22:14:04 +00:00
Mike Stump
ee3ba929d0 Add testcase for recent checkin.
llvm-svn: 86620
2009-11-09 23:10:49 +00:00
Jim Grosbach
9f37156ae5 Update test
llvm-svn: 86614
2009-11-09 22:59:01 +00:00
Jim Grosbach
ea6c9c17f5 Use Unified Assembly Syntax for the ARM backend.
llvm-svn: 86494
2009-11-09 00:11:35 +00:00
Anton Korobeynikov
552b831b91 Add and-not (bic) patterns. Based heavily on patch by Brian Lucas!
llvm-svn: 86471
2009-11-08 15:33:12 +00:00
Anton Korobeynikov
6f4ee0efe1 Fix invalid operand updates & implement post-inc memory operands
llvm-svn: 86466
2009-11-08 14:27:38 +00:00
Anton Korobeynikov
7b3a35eee8 It is invalid to infer the value type from the result #0 of the node
since the instruction might use the other result of different type.

llvm-svn: 86462
2009-11-08 12:14:54 +00:00
Nate Begeman
49d93dc6d1 x86 vector shuffle cleanup/fixes:
1. rename the movhp patfrag to movlhps, since thats what it actually matches
2. eliminate the bogus movhps load and store patterns, they were incorrect.  The load transforms are already handled (correctly) by shufps/unpack.
3. revert a recent test change to its correct form.

llvm-svn: 86415
2009-11-07 23:17:15 +00:00
Anton Korobeynikov
9dc741f523 Add some dummy support for post-incremented loads
llvm-svn: 86385
2009-11-07 17:15:06 +00:00
Anton Korobeynikov
0a13189111 Add 8 bit libcalls and make use of them for msp430
llvm-svn: 86384
2009-11-07 17:14:39 +00:00
Anton Korobeynikov
da044db0f5 Initial support for addrmode handling. Tests by Brian Lucas!
llvm-svn: 86382
2009-11-07 17:13:35 +00:00
Anton Korobeynikov
30095499fc It turns out that the testcase in question uncovered subreg-handling bug.
Add assert in asmprinter to catch such cases and xfail the tests.
PR is to be filled.

llvm-svn: 86375
2009-11-07 15:20:32 +00:00
Eric Christopher
c5bcc1db29 Fix a couple of shuffle patterns to use movhlps instead
of movhps as the constraint.  Changes optimizations so
update testcases as appropriate as well.

llvm-svn: 86360
2009-11-07 08:45:53 +00:00
Chris Lattner
d5eaa6d39b Fix PR5421 by APInt'izing switch lowering.
llvm-svn: 86354
2009-11-07 07:50:34 +00:00
Chris Lattner
1be54634e0 merge cmp1 into cmp0 and filecheckize.
llvm-svn: 86345
2009-11-07 06:19:20 +00:00
Evan Cheng
899d8cb6a0 Refactor code. Fix a potential missing check. Teach isIdentical() about tLDRpci_pic.
llvm-svn: 86330
2009-11-07 04:04:34 +00:00
Evan Cheng
8eaaffb9da - Add TargetInstrInfo::isIdentical(). It's similar to MachineInstr::isIdentical
except it doesn't care if the definitions' virtual registers differ. This is
  used by machine LICM and other MI passes to perform CSE.
- Teach Thumb2InstrInfo::isIdentical() to check two t2LDRpci_pic are identical.
  Since pc relative constantpool entries are always different, this requires it
  it check if the values can actually the same.

llvm-svn: 86328
2009-11-07 03:52:02 +00:00
Evan Cheng
6e3e66375a - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which does a pc-relative
load of a GV from constantpool and then add pc. It allows the code sequence to
  be rematerializable so it would be hoisted by machine licm.
- Add a late pass to break these pseudo instructions into a number of real
  instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm
  to this pass. This is done before post regalloc scheduling to allow the
  scheduler to proper schedule these instructions. It also allow them to be
  if-converted and shrunk by later passes.

llvm-svn: 86304
2009-11-06 23:52:48 +00:00