mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00

467 Commits

Author SHA1 Message Date
Craig Topper
d8faffd93b Declare register classes as const. Fix a couple pointers to register classes that weren't already const.
llvm-svn: 151138
2012-02-22 07:28:11 +00:00
Jakob Stoklund Olesen
b498ebe5b7 Use the same CALL instructions for Windows as for everything else.
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708
2012-02-16 17:56:02 +00:00
Pete Cooper
21409dd760 Stop custom lowering for x86 DEC64m from happening if the load in the lowered sequence has more than 1 user
llvm-svn: 150537
2012-02-15 00:33:37 +00:00
Pete Cooper
b1229a8866 Fixed bug when custom lowering DEC64m on x86.
If the DEC node had more than one user, it was doing this lowering but
leaving the original DEC node around and so decrementing twice.

Fixes PR11964.

llvm-svn: 150356
2012-02-13 00:10:03 +00:00
David Blaikie
06ecc99a56 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Chandler Carruth
751438a272 Switch all of the uses of my InsertDAGNode helper to follow the exact
same pattern. We already had this pattern in a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.

I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.

I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.

llvm-svn: 148001
2012-01-12 01:34:44 +00:00
Chandler Carruth
e3facd7860 Revert r147945 which disabled an addressing mode transformation. I had
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

llvm-svn: 147957
2012-01-11 18:36:12 +00:00
Chandler Carruth
e443e5e55d Disable the transformation I added in r147936 to see if it fixes some
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

llvm-svn: 147945
2012-01-11 12:17:47 +00:00
Chandler Carruth
9729a20760 Hoist a really redundant code pattern into a helper function, and delete
lots of lines of code. No functionality changed.

llvm-svn: 147942
2012-01-11 11:04:36 +00:00
Chandler Carruth
f44e3c7745 Simplify the AND-rooted mask+shift checking code to match that of the
SRL-rooted code.

llvm-svn: 147941
2012-01-11 09:35:04 +00:00
Chandler Carruth
5a0ea8a5fd Unify the interface of the three mask+shift transform helpers, and
factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.

llvm-svn: 147940
2012-01-11 09:35:02 +00:00
Chandler Carruth
7d8eac052d Clarify and make explicit some of the requirements for transforming
mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.

llvm-svn: 147939
2012-01-11 09:35:00 +00:00
Chandler Carruth
068b0ca15a Hoist the logic to transform shift+mask combinations into sub-register
extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.

llvm-svn: 147937
2012-01-11 08:48:20 +00:00
Chandler Carruth
b3371fa250 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
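
The canonicalization that r147936 has to see through can be checked with a small sketch. The helper names below are hypothetical and do not come from the LLVM sources; only the shift amounts (11 and 2) are taken from the example in the commit message. The first function is the offset as written, and the second is the "smaller shift right plus mask" form that hides the shift.

```cpp
#include <cstdint>

// Byte offset as written in the source: index with the high bits of
// `input`, then scale by the 4-byte element size (a shift left by 2).
constexpr std::uint32_t offsetAsWritten(std::uint32_t input) {
  return (input >> 11) << 2;
}

// Canonical form described in the commit message: a smaller shift right
// (11 - 2 = 9) followed by a mask that clears the low two bits.
constexpr std::uint32_t offsetCanonical(std::uint32_t input) {
  return (input >> 9) & ~std::uint32_t{3};
}
```

Both functions return the same value for every input, which is why the instruction selector can go back to the longer shift right and let the addressing-mode scale supply the multiply by 4.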
Chandler Carruth
f762ca322b Don't rely on the fact that shift values are never very large, and thus
this subtraction will result in small negative numbers at worst which
become very large positive numbers on assignment and are thus caught by
the <=4 check on the next line. The >0 check was clearly intended to catch
these as negative numbers.

Spotted by inspection, and impossible to trigger given the shift widths
that can be used.

llvm-svn: 147773
2012-01-09 09:47:25 +00:00
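
The wrap-around that r147773 stops relying on can be shown in isolation. This is only a sketch: the function names and the 64-bit types are assumptions for illustration, not the types used in the LLVM code.

```cpp
#include <cstdint>

// Difference stored in an unsigned variable: a "negative" result wraps
// around to a very large positive value, so a `> 0` test cannot reject it.
constexpr std::uint64_t unsignedDifference(std::uint64_t a, std::uint64_t b) {
  return a - b;
}

// Difference stored in a signed variable: a negative result stays
// negative, so a `> 0` test rejects it directly.
constexpr std::int64_t signedDifference(std::int64_t a, std::int64_t b) {
  return a - b;
}
```

With the unsigned form, `3 - 5` is only rejected by an upper-bound check such as `<= 4`; with the signed form, the `> 0` check does the job it appears to do.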
Pete Cooper
4f4a9794b2 Added missing comment about new custom lowering of DEC64
llvm-svn: 144811
2011-11-16 19:03:23 +00:00
Pete Cooper
8441c08e0b Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Dan Gohman
a5f382da8b Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Dan Gohman
826cec9a4b Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00
Dan Gohman
dedcc22bcd Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206
2011-10-28 17:55:38 +00:00
Duncan Sands
a6507c4bcb Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188
2011-10-28 09:55:57 +00:00
Dan Gohman
484df993bd Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177
2011-10-28 01:29:32 +00:00
Jakob Stoklund Olesen
b49557d06d Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies.
In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
target all GR8 registers, only those in GR8_NOREX.

To enforce this, we ensure that all instructions using the
EXTRACT_SUBREG are GR8_NOREX constrained.

This fixes PR11088.

llvm-svn: 141499
2011-10-08 18:28:28 +00:00
Bruno Cardoso Lopes
0db437e0d6 Teach PreprocessISelDAG to be aware of vector types and to not process them.
llvm-svn: 136653
2011-08-01 21:54:05 +00:00
Eli Friedman
30d557cc28 Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile.
<rdar://problem/9763308>

llvm-svn: 135084
2011-07-13 21:29:53 +00:00
Eli Friedman
e0a117fbdf Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress.
Part of <rdar://problem/9763308>.

llvm-svn: 135079
2011-07-13 20:44:23 +00:00
Eric Christopher
7260817287 TargetConstant immediates won't be placed into registers so tighten
up the valid constant check earlier.

rdar://9692967

llvm-svn: 134286
2011-07-01 23:04:38 +00:00
Eric Christopher
7ce905754f Fix a small thinko for constant i64 lock/orq optimization where we
didn't have an opcode for 64-bit constant or expressions.

Fixes rdar://9692967

llvm-svn: 134121
2011-06-30 00:48:30 +00:00
Stuart Hastings
e3158f93ec Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends.
rdar://problem/8614450

llvm-svn: 131746
2011-05-20 19:04:40 +00:00
Eric Christopher
d613c05f26 Update comment.
llvm-svn: 131459
2011-05-17 08:16:14 +00:00
Eric Christopher
c03ef7ebb3 Support XOR and AND optimization with no return value.
Finishes off rdar://8470697

llvm-svn: 131458
2011-05-17 08:10:18 +00:00
Eric Christopher
f81a665961 Couple less magic numbers.
llvm-svn: 131457
2011-05-17 07:50:41 +00:00
Eric Christopher
dc12267689 Make this code a little less magic number laden.
llvm-svn: 131456
2011-05-17 07:47:55 +00:00
Eric Christopher
281a4a4550 Turn this into a table, this will make more sense shortly.
Part of rdar://8470697

llvm-svn: 131200
2011-05-11 21:44:58 +00:00
Eric Christopher
3c17ef53c3 Optimize atomic lock or that doesn't use the result value.
Next up: xor and and.

Part of rdar://8470697

llvm-svn: 131171
2011-05-10 23:57:45 +00:00
Benjamin Kramer
fee48a936f Silence an overzealous uninitialized variable warning from GCC.
llvm-svn: 130053
2011-04-23 08:21:06 +00:00
Benjamin Kramer
7feae20986 X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) << C1. (Part of PR5039)
This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
	ret                             ## encoding: [0xc3]

with this patch we can fold the immediate into the and:
	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
	ret                             ## encoding: [0xc3]

It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.

llvm-svn: 129990
2011-04-22 15:30:40 +00:00
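
The rewrite in r129990 rests on the identity (X << C1) & C2 == (X & (C2 >> C1)) << C1 for unsigned values. A minimal sketch of that identity follows; the function names are hypothetical, and the constants in the usage note (C1 = 42, C2 = 4398046511104, which is 1 << 42) are the ones from the commit message.

```cpp
#include <cstdint>

// Original form: shift first, then mask with the (possibly large) constant.
// Assumes c1 < 64 so the shift is well defined.
constexpr std::uint64_t shiftThenMask(std::uint64_t x, unsigned c1,
                                      std::uint64_t c2) {
  return (x << c1) & c2;
}

// Rewritten form: mask with the shifted-down (smaller) constant, then shift.
// Assumes c1 < 64 so the shifts are well defined.
constexpr std::uint64_t maskThenShift(std::uint64_t x, unsigned c1,
                                      std::uint64_t c2) {
  return (x & (c2 >> c1)) << c1;
}
```

For the commit's example the mask shrinks from 4398046511104 to 1, which fits in an 8-bit immediate and so avoids the `movabsq`.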
Stuart Hastings
47e45a32a8 Swap VT and DebugLoc operands of getExtLoad() for consistency with
other getNode() methods.  Radar 9002173.

llvm-svn: 125665
2011-02-16 16:23:55 +00:00
Chris Lattner
bcf2d46d8a Enhance ComputeMaskedBits to know that aligned frameindexes
have their low bits set to zero.  This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.

Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively.  Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST).  The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.

llvm-svn: 125470
2011-02-13 22:25:43 +00:00
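
The reason FI+cst can become FI|cst in r125470 is that adding a constant to a value whose corresponding low bits are known to be zero produces no carries. The sketch below illustrates that arithmetic fact only; the function names are hypothetical and this is not the SelectionDAG::isBaseWithConstantOffset implementation.

```cpp
#include <cstdint>

// True when `base` and `cst` share no set bits, for example when `base`
// is 16-byte aligned and `cst` is less than 16.
constexpr bool noCommonBits(std::uint64_t base, std::uint64_t cst) {
  return (base & cst) == 0;
}

// Offset computed with ADD.
constexpr std::uint64_t addOffset(std::uint64_t base, std::uint64_t cst) {
  return base + cst;
}

// Offset computed with OR; equal to addOffset whenever noCommonBits holds.
constexpr std::uint64_t orOffset(std::uint64_t base, std::uint64_t cst) {
  return base | cst;
}
```

When the two operands do share a set bit, the results differ, which is why code looking for ADD(X,CST) needs a predicate that also accepts OR only in the no-common-bits case.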
NAKAMURA Takumi
d418ff3b1a lib/Target/X86/X86ISelDAGToDAG.cpp: __main should be WINCALL64 on Win64.
CALL64 marks %xmm* as dead.

llvm-svn: 124354
2011-01-27 03:20:19 +00:00
Chris Lattner
dde85de90f fix PR8514, a bug where the "heroic" transformation of shift/and
into and/shift would cause nodes to move around and a dangling pointer
to happen.  The code tried to avoid this with a HandleSDNode, but 
got the details wrong.

llvm-svn: 123578
2011-01-16 08:48:11 +00:00
Ted Kremenek
4b09cdedb2 'HiReg' is written but never read. Nuke its
declaration and its assignments.

Found by clang static analyzer.

llvm-svn: 123486
2011-01-14 22:34:13 +00:00
Bill Wendling
fae0dd1afa PR8918 - When used with MinGW64, LLVM generates a "calll __main" at the
beginning of the "main" function. The assembler complains about the invalid
suffix for the 'call' instruction. The right instruction is "callq __main".
Patch by KS Sreeram!

llvm-svn: 122933
2011-01-06 00:47:10 +00:00
Chris Lattner
65c5243bd6 rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner
76601e7a99 it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.

llvm-svn: 120935
2010-12-05 07:30:36 +00:00
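
The observable behavior of the generated code in r120935 (multiply by the element size, and pass an all-ones size to the allocator if the unsigned multiply overflowed) can be sketched portably. The function names are hypothetical, and the division-based overflow test is a stand-in chosen for portability; the compiled code checks the overflow flag set by `mul` instead.

```cpp
#include <cstdint>
#include <limits>

// True when a * b does not fit in 64 unsigned bits.
constexpr bool umulOverflows(std::uint64_t a, std::uint64_t b) {
  return a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a;
}

// Byte size for `count` 4-byte elements, or all ones (-1 as unsigned)
// when the multiplication overflows, mirroring the movq $-1 / cmov pair.
constexpr std::uint64_t allocationSize(std::uint64_t count) {
  return umulOverflows(count, 4)
             ? std::numeric_limits<std::uint64_t>::max()
             : count * 4;
}
```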
Dale Johannesen
e7f07349e4 Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches
memory, so a MachineMemOperand is useful (not propagated
into the MachineInstr yet).  No functional change except
for dump output.

llvm-svn: 117413
2010-10-26 23:11:10 +00:00
Chris Lattner
195a9c3877 Use #NAME# to have the CMOV multiclass define things with the same names as before
(e.g. CMOVBE16rr instead of CMOVBErr16).

llvm-svn: 115705
2010-10-05 23:00:14 +00:00
Chris Lattner
c3c03dfeff switch CMOVBE to the multipattern:
21 insertions(+), 53 deletions(-)

Moar change coming before I switch the rest.

llvm-svn: 115697
2010-10-05 22:23:58 +00:00
Eric Christopher
84827bd9f5 Temporarily work around new address lowering while I figure out what
needs to happen for darwin.

llvm-svn: 114577
2010-09-22 20:42:08 +00:00
Chris Lattner
26d11d7501 reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress.
llvm-svn: 114529
2010-09-22 04:39:11 +00:00