1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

4888 Commits

Author SHA1 Message Date
Chris Lattner
f9d9976374 fix an oversight which caused us to compile the testcase (and other
less trivial things) into a dummy lea.  Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558

llvm-svn: 129662
2011-04-17 17:12:08 +00:00
Chris Lattner
5e00f501ff Fix rdar://9289512 - not folding load into compare at -O0
The basic issue here is that bottom-up isel is matching the branch
and compare, and was failing to fold the load into the branch/compare
combo.  Fixing this (by allowing folding into any instruction of a
sequence that is selected) allows us to produce things like:


cmpb    $0, 52(%rax)
je      LBB4_2

instead of:

movb    52(%rax), %cl
cmpb    $0, %cl
je      LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up
compile time by putting less pressure on the register allocator and 
generating less code.

This was one of the biggest classes of missing load folding.  Implementing
this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm)
line count.

llvm-svn: 129656
2011-04-17 06:35:44 +00:00
Chris Lattner
1fe5f78b7e split a complex predicate out to a helper function. Simplify two for loops,
which don't need to check for falling off the end of a block *and* end of phi
nodes, since terminators are never phis.

llvm-svn: 129655
2011-04-17 06:03:19 +00:00
Chris Lattner
cb194276e0 fix rdar://9289583 - fast isel should handle non-canonical commutative binops
allowing us to fold the immediate into the 'and' in this case:

int test1(int i) {
  return 8&i;
}

llvm-svn: 129653
2011-04-17 01:16:47 +00:00
Eli Friedman
2798137293 PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.

llvm-svn: 129650
2011-04-16 23:25:34 +00:00
Evan Cheng
b720f37282 Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991
llvm-svn: 129633
2011-04-16 03:08:26 +00:00
Chris Lattner
0304b82f80 Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!

llvm-svn: 129558
2011-04-15 05:18:47 +00:00
Owen Anderson
0ce6c0f86e Fix another instance of the DAG combiner not using the correct type for the RHS of a shift.
llvm-svn: 129522
2011-04-14 17:30:49 +00:00
Andrew Trick
e89c19ab7b In the pre-RA scheduler, maintain cmp+br proximity.
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion

llvm-svn: 129508
2011-04-14 05:15:06 +00:00
Chris Lattner
d4ba43dc76 sink a call into its only use.
llvm-svn: 129503
2011-04-14 04:12:47 +00:00
Owen Anderson
d98929ed6c During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation.
llvm-svn: 129484
2011-04-13 23:22:23 +00:00
Andrew Trick
916e01c917 Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.

Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.

llvm-svn: 129421
2011-04-13 00:38:32 +00:00
Andrew Trick
d83e7b6a5d Revert 129383. It causes some targets to hit a scheduler assert.
llvm-svn: 129385
2011-04-12 20:14:07 +00:00
Andrew Trick
1e0821075d PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make these heuristic adjustments to node latency work,
I also needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.

llvm-svn: 129383
2011-04-12 19:54:36 +00:00
Jay Foad
0d5ca4cf44 Don't include Operator.h from InstrTypes.h.
llvm-svn: 129271
2011-04-11 09:35:34 +00:00
Chris Lattner
e8dfbaef19 Avoid excess precision issues that lead to generating host-compiler-specific code.
Switch lowering probably shouldn't be using FP for this.  This resolves PR9581.

llvm-svn: 129199
2011-04-09 06:57:13 +00:00
Chris Lattner
badb8ca63c have dag combine zap "store undef", which can be formed during call lowering
with undef arguments.

llvm-svn: 129185
2011-04-09 02:32:02 +00:00
Evan Cheng
bc053100af Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time.
llvm-svn: 129152
2011-04-08 21:37:21 +00:00
Andrew Trick
36a1759769 Added a check in the preRA scheduler for potential interference on a
induction variable. The preRA scheduler is unaware of induction vars,
so we look for potential "virtual register cycles" instead.

Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing

llvm-svn: 129100
2011-04-07 19:54:57 +00:00
Bill Wendling
a8db395dc1 Revamp the SjLj "dispatch setup" intrinsic.
It needed to be moved closer to the setjmp statement, because the code directly
after the setjmp needs to know about values that are on the stack. Also, the
'bitcast' of the function context was causing a dead load. This wouldn't be too
horrible, except that at -O0 it wasn't optimized out, and because it wasn't
using the correct base pointer (if there is a VLA), it would try to access a
value from a garbage address.
<rdar://problem/9130540>

llvm-svn: 128873
2011-04-05 01:37:43 +00:00
Stuart Hastings
1635b37415 Revert 123704; it broke threaded LLVM.
llvm-svn: 128868
2011-04-05 00:37:28 +00:00
Cameron Zwarich
2748634089 Add a RemoveFromWorklist method to DCI. This is needed to do some complicated
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.

llvm-svn: 128758
2011-04-02 02:40:26 +00:00
Evan Cheng
28382f9178 Add comments.
llvm-svn: 128730
2011-04-01 19:57:01 +00:00
Evan Cheng
13c73e4836 Assign node order numbers to results of call instruction lowering. This should improve src line debug info when sdisel is used. rdar://9199118
llvm-svn: 128728
2011-04-01 19:42:22 +00:00
Evan Cheng
39574b2766 Issue libcalls __udivmod*i4 / __divmod*i4 for div / rem pairs.
rdar://8911343

llvm-svn: 128696
2011-04-01 00:42:02 +00:00
Benjamin Kramer
f9e1ba7398 Turn SelectionDAGBuilder::GetRegistersForValue into a local function.
It couldn't be used outside of the file because SDISelAsmOperandInfo
is local to SelectionDAGBuilder.cpp. Making it a static function avoids
a weird linkage dance.

llvm-svn: 128342
2011-03-26 16:35:10 +00:00
Andrew Trick
651a3701f9 Fix for -pre-RA-sched=source.
Yet another case of unchecked NULL node (for physreg copy).
May fix PR9509.

llvm-svn: 128266
2011-03-25 06:40:55 +00:00
Eli Friedman
76fcfaab12 PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND.
Also cleaning up some duplicated code while I'm here.

llvm-svn: 128176
2011-03-23 22:18:48 +00:00
Andrew Trick
b702dae9b2 Ensure that def-side physreg copies are scheduled above any other uses
so the scheduler can't create new interferences on the copies
themselves. Prior to this fix the scheduler could get stuck in a loop
creating copies.
Fixes PR9509.

llvm-svn: 128164
2011-03-23 20:42:39 +00:00
Andrew Trick
ca42e62048 whitespace
llvm-svn: 128163
2011-03-23 20:40:18 +00:00
Andrew Trick
d9c599d01c Added block number and name to isel debug output.
I'm tired of doing this manually for each checkout.
If anyone knows a better way debug isel for non-trivial tests feel
free to revert and let me know how to do it.

llvm-svn: 128132
2011-03-23 01:38:28 +00:00
Eric Christopher
44e3d3c26e Grammar-o.
llvm-svn: 128004
2011-03-21 18:06:21 +00:00
Nadav Rotem
92561196b7 Add support for legalizing UINT_TO_FP of vectors on platforms which do
not have native support for this operation (such as X86).
The legalized code uses two vector INT_TO_FP operations and is faster
than scalarizing.

llvm-svn: 127951
2011-03-19 13:09:10 +00:00
Benjamin Kramer
52ffb6ea96 BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift.
This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into
	shrl	$2, %edi
	imulq	$613566757, %rdi, %rax
	shrq	$32, %rax
	ret

instead of
	movl    %edi, %eax
	imulq   $613566757, %rax, %rcx
	shrq    $32, %rcx
	subl    %ecx, %eax
	shrl    %eax
	addl    %ecx, %eax
	shrl    $4, %eax

on x86_64

llvm-svn: 127829
2011-03-17 20:39:14 +00:00
Cameron Zwarich
cea63dc052 Move more logic into getTypeForExtArgOrReturn.
llvm-svn: 127809
2011-03-17 14:53:37 +00:00
Cameron Zwarich
a5746339cc Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn().
llvm-svn: 127807
2011-03-17 14:21:56 +00:00
Cameron Zwarich
2bb1e45ea3 The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte
rather than an int. Thankfully, this only causes LLVM to miss optimizations, not
generate incorrect code.

This just fixes the zext at the return. We still insert an i32 ZextAssert when
reading a function's arguments, but it is followed by a truncate and another i8
ZextAssert so it is not optimized.

llvm-svn: 127766
2011-03-16 22:20:18 +00:00
Cameron Zwarich
860d06739b Don't recompute something that we already have in a local variable.
llvm-svn: 127764
2011-03-16 22:20:07 +00:00
Evan Cheng
bac3e87eaa sext(undef) = 0, because the top bits will all be the same.
zext(undef) = 0, because the top bits will be zero.

llvm-svn: 127649
2011-03-15 02:22:10 +00:00
Evan Cheng
50f2d406ec BIT_CONVERT has been renamed to BITCAST.
llvm-svn: 127600
2011-03-14 18:19:52 +00:00
Evan Cheng
cb70b9e80b Minor optimization. sign-ext/anyext of undef is still undef.
llvm-svn: 127598
2011-03-14 18:15:55 +00:00
Owen Anderson
78afadfa5d Teach FastISel to support register-immediate-immediate instructions.
llvm-svn: 127496
2011-03-11 21:33:55 +00:00
Andrew Trick
6aa37a4a2b Replace -dag-chain-limit flag with constant. It has survived a release cycle without being touched, so no longer needs to pollute the hidden-help text.
llvm-svn: 127468
2011-03-11 17:46:59 +00:00
Evan Cheng
d5d2d4a158 Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613.
llvm-svn: 127440
2011-03-11 00:48:56 +00:00
Evan Cheng
a3a7a7e364 Re-commit 127368 and 127371. They are exonerated.
llvm-svn: 127380
2011-03-10 00:16:32 +00:00
Evan Cheng
d7a2008a55 Revert 127368 and 127371 for now.
llvm-svn: 127376
2011-03-09 23:53:17 +00:00
Evan Cheng
b717770dfe Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more
flexible.

If it returns a register class that's different from the input, then that's the
register class used for cross-register class copies.
If it returns a register class that's the same as the input, then no cross-
register class copies are needed (normal copies would do).
If it returns null, then it's not at all possible to copy registers of the
specified register class.

llvm-svn: 127368
2011-03-09 22:47:38 +00:00
Andrew Trick
e529ddb2d7 Improve pre-RA-sched register pressure tracking for duplicate operands.
This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks

llvm-svn: 127347
2011-03-09 19:12:43 +00:00
Benjamin Kramer
4cf03850a2 Fix typo, make helper static.
llvm-svn: 127335
2011-03-09 16:19:12 +00:00
Eric Christopher
d3c7e834e8 Fix some latent bugs if the nodes are unschedulable. We'd gotten away
with this before since none of the register tracking or nightly tests
had unschedulable nodes.

This should probably be refixed with a special default Node that just
returns some "don't touch me" values.

Fixes PR9427

llvm-svn: 127263
2011-03-08 19:35:47 +00:00