1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

497 Commits

Author SHA1 Message Date
Duncan Sands
a6507c4bcb Speculatively disable Dan's commits 143177 and 143179 to see if
it fixes the dragonegg self-host (it looks like gcc is miscompiled).
Original commit messages:
Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

Delete #if 0 code accidentally left in.

llvm-svn: 143188
2011-10-28 09:55:57 +00:00
Dan Gohman
484df993bd Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW
on every node as it legalizes them. This makes it easier to use
hasOneUse() heuristics, since unneeded nodes can be removed from the
DAG earlier.

Make LegalizeOps visit the DAG in an operands-last order. It previously
used operands-first, because LegalizeTypes has to go operands-first, and
LegalizeTypes used to be part of LegalizeOps, but they're now split.
The operands-last order is more natural for several legalization tasks.
For example, it allows lowering code for nodes with floating-point or
vector constants to see those constants directly instead of seeing the
lowered form (often constant-pool loads). This makes some things
somewhat more complicated today, though it ought to allow things to be
simpler in the future. It also fixes some bugs exposed by Legalizing
using RAUW aggressively.

Remove the part of LegalizeOps that attempted to patch up invalid chain
operands on libcalls generated by LegalizeTypes, since it doesn't work
with the new LegalizeOps traversal order. Instead, define what
LegalizeTypes is doing to be correct, and transfer the responsibility
of keeping calls from having overlapping calling sequences into the
scheduler.

Teach the scheduler to model callseq_begin/end pairs as having a
physical register definition/use to prevent calls from having
overlapping calling sequences. This is also somewhat complicated, though
there are ways it might be simplified in the future.

This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others.
Please direct high-level questions about this patch to management.

llvm-svn: 143177
2011-10-28 01:29:32 +00:00
Jakob Stoklund Olesen
b49557d06d Add TEST8ri_NOREX pseudo to constrain sub_8bit_hi copies.
In 64-bit mode, sub_8bit_hi sub-registers can only be used by NOREX
instructions. The COPY created from the EXTRACT_SUBREG DAG node cannot
target all GR8 registers, only those in GR8_NOREX.

TO enforce this, we ensure that all instructions using the
EXTRACT_SUBREG are GR8_NOREX constrained.

This fixes PR11088.

llvm-svn: 141499
2011-10-08 18:28:28 +00:00
Bruno Cardoso Lopes
0db437e0d6 Teach PreprocessISelDAG to be aware of vector types and to not process them.
llvm-svn: 136653
2011-08-01 21:54:05 +00:00
Eli Friedman
30d557cc28 Make sure we don't combine a large displacement and a frame index in the same addressing mode on x86-64. It can overflow, leading to a crash/miscompile.
<rdar://problem/9763308>

llvm-svn: 135084
2011-07-13 21:29:53 +00:00
Eli Friedman
e0a117fbdf Refactor out checking for displacements on x86-64 addressing modes. No functionality change. Refactoring in preparation for an additional safety check in FoldOffsetIntoAddress.
Part of <rdar://problem/9763308>.

llvm-svn: 135079
2011-07-13 20:44:23 +00:00
Eric Christopher
7260817287 TargetConstant immediates won't be placed into registers so tighten
up the valid constant check earlier.

rdar://9692967

llvm-svn: 134286
2011-07-01 23:04:38 +00:00
Eric Christopher
7ce905754f Fix a small thinko for constant i64 lock/orq optimization where we
we didn't have an opcode for 64-bit constant or expressions.

Fixes rdar://9692967

llvm-svn: 134121
2011-06-30 00:48:30 +00:00
Stuart Hastings
e3158f93ec Re-commit 131641 with fixes; de-pseudoize MOVSX16rr8 and friends.
rdar://problem/8614450

llvm-svn: 131746
2011-05-20 19:04:40 +00:00
Eric Christopher
d613c05f26 Update comment.
llvm-svn: 131459
2011-05-17 08:16:14 +00:00
Eric Christopher
c03ef7ebb3 Support XOR and AND optimization with no return value.
Finishes off rdar://8470697

llvm-svn: 131458
2011-05-17 08:10:18 +00:00
Eric Christopher
f81a665961 Couple less magic numbers.
llvm-svn: 131457
2011-05-17 07:50:41 +00:00
Eric Christopher
dc12267689 Make this code a little less magic number laden.
llvm-svn: 131456
2011-05-17 07:47:55 +00:00
Eric Christopher
281a4a4550 Turn this into a table, this will make more sense shortly.
Part of rdar://8470697

llvm-svn: 131200
2011-05-11 21:44:58 +00:00
Eric Christopher
3c17ef53c3 Optimize atomic lock or that doesn't use the result value.
Next up: xor and and.

Part of rdar://8470697

llvm-svn: 131171
2011-05-10 23:57:45 +00:00
Benjamin Kramer
fee48a936f Silence an overzealous uninitialized variable warning from GCC.
llvm-svn: 130053
2011-04-23 08:21:06 +00:00
Benjamin Kramer
7feae20986 X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) & C1. (Part of PR5039)
This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
	ret                             ## encoding: [0xc3]

with this patch we can fold the immediate into the and:
	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
	ret                             ## encoding: [0xc3]

It's possible to save another byte by using 'andl' instead of 'andq' but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.

llvm-svn: 129990
2011-04-22 15:30:40 +00:00
Stuart Hastings
47e45a32a8 Swap VT and DebugLoc operands of getExtLoad() for consistency with
other getNode() methods.  Radar 9002173.

llvm-svn: 125665
2011-02-16 16:23:55 +00:00
Chris Lattner
bcf2d46d8a Enhance ComputeMaskedBits to know that aligned frameindexes
have their low bits set to zero.  This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.

Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively.  Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST).  The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.

llvm-svn: 125470
2011-02-13 22:25:43 +00:00
NAKAMURA Takumi
d418ff3b1a lib/Target/X86/X86ISelDAGToDAG.cpp: __main should be WINCALL64 on Win64.
CALL64 marks %xmm* as dead.

llvm-svn: 124354
2011-01-27 03:20:19 +00:00
Chris Lattner
dde85de90f fix PR8514, a bug where the "heroic" transformation of shift/and
into and/shift would cause nodes to move around and a dangling pointer
to happen.  The code tried to avoid this with a HandleSDNode, but 
got the details wrong.

llvm-svn: 123578
2011-01-16 08:48:11 +00:00
Ted Kremenek
4b09cdedb2 'HiReg' is written but never read. Nuke its
declaration and its assignments.

Found by clang static analyzer.

llvm-svn: 123486
2011-01-14 22:34:13 +00:00
Bill Wendling
fae0dd1afa PR8918 - When used with MinGW64, LLVM generates a "calll __main" at the
beginning of the "main" function. The assembler complains about the invalid
suffix for the 'call' instruction. The right instruction is "callq __main".
Patch by KS Sreeram!

llvm-svn: 122933
2011-01-06 00:47:10 +00:00
Chris Lattner
65c5243bd6 rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner
76601e7a99 it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.

llvm-svn: 120935
2010-12-05 07:30:36 +00:00
Dale Johannesen
e7f07349e4 Use a MemIntrinsicSDNode for ISD::PREFETCH, which touches
memory, so a MachineMemOperand is useful (not propagated
into the MachineInstr yet).  No functional change except
for dump output.

llvm-svn: 117413
2010-10-26 23:11:10 +00:00
Chris Lattner
195a9c3877 Use #NAME# to have the CMOV multiclass define things with the same names as before
(e.g. CMOVBE16rr instead of CMOVBErr16).

llvm-svn: 115705
2010-10-05 23:00:14 +00:00
Chris Lattner
c3c03dfeff switch CMOVBE to the multipattern:
21 insertions(+), 53 deletions(-)

Moar change coming before I switch the rest.

llvm-svn: 115697
2010-10-05 22:23:58 +00:00
Eric Christopher
84827bd9f5 Temporarily work around new address lowering while I figure out what
needs to happen for darwin.

llvm-svn: 114577
2010-09-22 20:42:08 +00:00
Chris Lattner
26d11d7501 reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress.
llvm-svn: 114529
2010-09-22 04:39:11 +00:00
Chris Lattner
c81dcbec9e convert the last 4 X86ISD nodes that should have memoperands to have them.
llvm-svn: 114523
2010-09-22 01:28:21 +00:00
Chris Lattner
29754fc406 give X86ISD::FNSTCW16m a memoperand, since it touches memory. It only
can access the stack due to how it is generated though.

llvm-svn: 114522
2010-09-22 01:11:26 +00:00
Chris Lattner
fee1ac61bd give FP_TO_INT16_IN_MEM and friends a memoperand. They are only
used with stack slots, but hey, lets be safe.

llvm-svn: 114521
2010-09-22 01:05:16 +00:00
Chris Lattner
e52da86fab give VZEXT_LOAD a memory operand, it now works with segment registers.
llvm-svn: 114515
2010-09-22 00:34:38 +00:00
Chris Lattner
706b9206da revert r114386 now that address modes work correctly, we get a nice
call through gs-relative memory now.

llvm-svn: 114510
2010-09-22 00:11:31 +00:00
Chris Lattner
f9861312cb give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257
llvm-svn: 114508
2010-09-21 23:59:42 +00:00
Chris Lattner
b227ae4ddb reimplement support for GS and FS relative address space matching
by having X86DAGToDAGISel::SelectAddr get passed in the parent node
of the operand match (the load/store/atomic op) and having it get
the address space from that, instead of having special FS/GS addr
mode operations that require duplicating the entire instruction set
to support.

This makes FS and GS relative accesses *far* more predictable and
work much better.  It also simplifies the X86 backend a bit, more
to come.

There is still a pending issue with nodes like ISD::PREFETCH and
X86ISD::FLD, which really should be MemSDNode's but aren't.

llvm-svn: 114491
2010-09-21 22:07:31 +00:00
Chris Lattner
55043ef46a fix a long standing wart: all the ComplexPattern's were being
passed the root of the match, even though only a few patterns
actually needed this (one in X86, several in ARM [which should
be refactored anyway], and some in CellSPU that I don't feel 
like detangling).   Instead of requiring all ComplexPatterns to
take the dead root, have targets opt into getting the root by
putting SDNPWantRoot on the ComplexPattern.

llvm-svn: 114471
2010-09-21 20:31:19 +00:00
Chris Lattner
c153d48869 even though I'm about to rip it out, simplify the address mode stuff
llvm-svn: 114468
2010-09-21 19:41:58 +00:00
Chris Lattner
cdfd993df0 propagate MachinePointerInfo through various uses of the old
SelectionDAG::getExtLoad overload, and eliminate it.

llvm-svn: 114446
2010-09-21 17:04:51 +00:00
Chris Lattner
ecdba24738 fix rdar://8453210, a crash handling a call through a GS relative load.
For now, just disable folding the load into the call.

llvm-svn: 114386
2010-09-21 03:37:00 +00:00
Chris Lattner
8df3ffd7ac zap dead code.
llvm-svn: 113073
2010-09-04 18:12:00 +00:00
Jakob Stoklund Olesen
b7bd26db67 Don't call Predicate_* from X86 target.
llvm-svn: 112921
2010-09-03 00:35:18 +00:00
Benjamin Kramer
4eb0e8bb2c Remove dead recursive function. Yay for clang -Wunused-function.
llvm-svn: 112060
2010-08-25 17:27:58 +00:00
Eli Friedman
401dbe036d PR7814: Truncates cannot be ignored for signed comparisons.
llvm-svn: 110268
2010-08-04 22:40:58 +00:00
Chris Lattner
49ac65543c Change LEA to have 5 operands for its memory operand, just
like all other instructions, even though a segment is not
allowed.  This resolves a bunch of gross hacks in the 
encoder and makes LEA more consistent with the rest of the
instruction set.

No functionality change.

llvm-svn: 107934
2010-07-08 23:46:44 +00:00
Evan Cheng
22b3e8f3b1 Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake.
llvm-svn: 107820
2010-07-07 22:15:37 +00:00
Devang Patel
7ab104353b Propagate debug loc.
llvm-svn: 107710
2010-07-06 22:08:15 +00:00
Jakob Stoklund Olesen
6dee31aa07 When creating X86 MUL8 and DIV8 instructions, make sure we don't produce
CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast
register allocator.

Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit).

This fixes PR7312.

llvm-svn: 106934
2010-06-26 00:39:23 +00:00
Dan Gohman
e9dfb84007 Change UpdateNodeOperands' operand and return value from SDValue to
SDNode *, since it doesn't care about the ResNo value.

llvm-svn: 106282
2010-06-18 15:30:29 +00:00
Dan Gohman
9d7cf23808 Don't maintain a set of deleted nodes; instead, use a HandleSDNode
to track a node over CSE events. This fixes PR7368.

llvm-svn: 106266
2010-06-18 01:24:29 +00:00
Eric Christopher
30010cae3a Add first pass at darwin tls compiler support.
llvm-svn: 105381
2010-06-03 04:07:48 +00:00
Jakob Stoklund Olesen
f40bb16b94 Rename X86 subregister indices to something shorter.
Use the tablegen-produced enums.

llvm-svn: 104493
2010-05-24 14:48:17 +00:00
Dan Gohman
0a6811070b Don't leave Base.FrameIndex uninitialized, so that it doesn't
print randomly in debug output.

llvm-svn: 102668
2010-04-29 23:30:41 +00:00
Evan Cheng
d4fe387eb8 Enable i16 to i32 promotion by default.
llvm-svn: 102493
2010-04-28 08:30:49 +00:00
Chris Lattner
6db0f451a7 teach the x86 address matching stuff to handle
(shl (or x,c), 3) the same as (shl (add x, c), 3)
when x doesn't have any bits from c set.

This finishes off PR1135.  Before we compiled the block to:
to:

LBB0_3:                                 ## %bb
	cmpb	$4, %dl
	sete	%dl
	addb	%dl, %cl
	movb	%cl, %dl
	shlb	$2, %dl
	addb	%r8b, %dl
	shlb	$2, %dl
	movzbl	%dl, %edx
	movl	%esi, (%rdi,%rdx,4)
	leaq	2(%rdx), %r9
	movl	%esi, (%rdi,%r9,4)
	leaq	1(%rdx), %r9
	movl	%esi, (%rdi,%r9,4)
	addq	$3, %rdx
	movl	%esi, (%rdi,%rdx,4)
	incb	%r8b
	decb	%al
	movb	%r8b, %dl
	jne	LBB0_1

Now we produce:

LBB0_3:                                 ## %bb
	cmpb	$4, %dl
	sete	%dl
	addb	%dl, %cl
	movb	%cl, %dl
	shlb	$2, %dl
	addb	%r8b, %dl
	shlb	$2, %dl
	movzbl	%dl, %edx
	movl	%esi, (%rdi,%rdx,4)
	movl	%esi, 8(%rdi,%rdx,4)
	movl	%esi, 4(%rdi,%rdx,4)
	movl	%esi, 12(%rdi,%rdx,4)
	incb	%r8b
	decb	%al
	movb	%r8b, %dl
	jne	LBB0_1

llvm-svn: 101958
2010-04-20 23:18:40 +00:00
Dan Gohman
a0f855157e Use const qualifiers with TargetLowering. This eliminates several
const_casts, and it reinforces the design of the Target classes being
immutable.

SelectionDAGISel::IsLegalToFold is now a static member function, because
PIC16 uses it in an unconventional way. There is more room for API
cleanup here.

And PIC16's AsmPrinter no longer uses TargetLowering.

llvm-svn: 101635
2010-04-17 15:26:15 +00:00
Dan Gohman
0e0b8cf9fd Add const qualifiers to CodeGen's use of LLVM IR constructs.
llvm-svn: 101334
2010-04-15 01:51:59 +00:00
Dan Gohman
1b67547226 Delete unneeeded arguments.
llvm-svn: 101276
2010-04-14 20:17:22 +00:00
Chris Lattner
58b7cca257 use DebugLoc default ctor instead of DebugLoc::getUnknownLoc()
llvm-svn: 100214
2010-04-02 20:16:16 +00:00
Evan Cheng
41730fc7c8 X86 address mode matching code MatchAddressRecursively does some aggressive hack which require doing a RAUW. It may end up deleting some SDNode up stream. It should avoid referencing deleted nodes.
llvm-svn: 98780
2010-03-17 23:58:35 +00:00
Evan Cheng
7d8c39bb1c Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions.
llvm-svn: 98465
2010-03-14 03:48:46 +00:00
Chris Lattner
3d7dc22a7a add a comment.
llvm-svn: 97709
2010-03-04 01:43:43 +00:00
Chris Lattner
b178d16c23 factor the 'sign extended from 8 bit' patterns better so
that they are not destination type specific.  This allows
tblgen to factor them and the type check is redundant with
what the isel does anyway.

llvm-svn: 97629
2010-03-03 01:45:01 +00:00
Chris Lattner
eed11ae4f8 merge two loops over all nodes in the graph into one.
llvm-svn: 97606
2010-03-02 23:12:51 +00:00
Chris Lattner
27d98e4772 eliminate PreprocessForRMW now that isel handles it.
We still preprocess calls and fp return stuff.

llvm-svn: 97598
2010-03-02 22:33:56 +00:00
Chris Lattner
9c9c1158cb Fix some issues in WalkChainUsers dealing with
CopyToReg/CopyFromReg/INLINEASM.  These are annoying because
they have the same opcode before an after isel.  Fix this by
setting their NodeID to -1 to indicate that they are selected,
just like what automatically happens when selecting things that
end up being machine nodes.

With that done, give IsLegalToFold a new flag that causes it to
ignore chains.  This lets the HandleMergeInputChains routine be
the one place that validates chains after a match is successful,
enabling the new hotness in chain processing.  This smarter
chain processing eliminates the need for "PreprocessRMW" in the
X86 and MSP430 backends and enables MSP to start matching it's
multiple mem operand instructions more aggressively.

I currently #if out the dead code in the X86 backend and MSP 
backend, I'll remove it for real in a follow-on patch.

The testcase changes are:
  test/CodeGen/X86/sse3.ll: we generate better code
  test/CodeGen/X86/store_op_load_fold2.ll: PreprocessRMW was 
      miscompiling this before, we now generate correct code
      Convert it to filecheck while I'm at it.
  test/CodeGen/MSP430/Inst16mm.ll: Add a testcase for mem/mem
      folding to make anton happy. :)

llvm-svn: 97596
2010-03-02 22:20:06 +00:00
Chris Lattner
1707a88a2c Sink InstructionSelect() out of each target into SDISel, and rename it
DoInstructionSelection.  Inline "SelectRoot" into it from DAGISelHeader.
Sink some other stuff out of DAGISelHeader into SDISel.

Eliminate the various 'Indent' stuff from various targets, which dates
to when isel was recursive.

 17 files changed, 114 insertions(+), 430 deletions(-)

llvm-svn: 97555
2010-03-02 06:34:30 +00:00
Chris Lattner
745181da4b remove a little hack I did for the old isel, not needed
now that it is gone.

llvm-svn: 97516
2010-03-01 22:51:11 +00:00
Chris Lattner
fa3b904028 remove a terrible hack that disabled assertions from this file because of build time
problems.  rdar://7697850.

llvm-svn: 97500
2010-03-01 21:20:46 +00:00
Chris Lattner
74715dd299 no need to override IsLegalToFold, the base implementation
disables load folding at -O0.

llvm-svn: 96973
2010-02-23 19:33:11 +00:00
Chris Lattner
b328c57aa4 fix and un-xfail X86/vec_ss_load_fold.ll
llvm-svn: 96720
2010-02-21 04:53:34 +00:00
Chris Lattner
fe81834e6e rename SelectScalarSSELoad -> SelectScalarSSELoadXXX and rewrite
it to follow the mode needed by the new isel.  Instead of returning
the input and output chains, it just returns the (currently only one,
which is a silly limitation) node that has input and output chains.

Since we want the old thing to still work, add a new 
SelectScalarSSELoad to emulate the old interface.  The XXX suffix
and the wrapper will eventually go away.

llvm-svn: 96715
2010-02-21 03:17:59 +00:00
Chris Lattner
735a4efeff rename and document some arguments so I don't have to keep
reverse engineering what they are.

llvm-svn: 96456
2010-02-17 06:07:47 +00:00
Chris Lattner
0d35c68d5c fix rdar://7653908, a crash on a case where we would fold a load
into a roundss intrinsic, producing a cyclic dag.  The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection.  Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.

llvm-svn: 96408
2010-02-16 22:35:06 +00:00
Evan Cheng
b5fe25544c Split SelectionDAGISel::IsLegalAndProfitableToFold to
IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use.

This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses.

llvm-svn: 96255
2010-02-15 19:41:07 +00:00
David Greene
1b3d9eece5 Remove an assumption of default arguments. This is in anticipation of a
change to SelectionDAG build APIs.

llvm-svn: 96239
2010-02-15 16:57:43 +00:00
Chris Lattner
d69e1c1bb2 refactor the conditional jump instructions in the .td file to
use a multipattern that generates both the 1-byte and 4-byte 
versions from the same defm

llvm-svn: 95901
2010-02-11 19:25:55 +00:00
Chris Lattner
7acf9be6c4 move target-independent opcodes out of TargetInstrInfo
into TargetOpcodes.h.  #include the new TargetOpcodes.h
into MachineInstr.  Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the 
codebase.

llvm-svn: 95687
2010-02-09 19:54:29 +00:00
Dan Gohman
be34c35f32 Re-implement the main strength-reduction portion of LoopStrengthReduction.
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.

It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.

llvm-svn: 94061
2010-01-21 02:09:26 +00:00
David Greene
80fdc554d0 When XDEBUG is enabled, check for SelectionDAG cycles at some key
points.  This will help us find future problems like the one
described in PR6019.

llvm-svn: 94019
2010-01-20 20:13:31 +00:00
David Greene
e52529d7cb Fix PR6019. A load has more than one use if it feeds a bitconvert that
has more than one use.

llvm-svn: 93576
2010-01-15 23:23:41 +00:00
Dan Gohman
51b3e804dc Reapply the MOV64r0 patch, with a fix: MOV64r0 clobbers EFLAGS.
llvm-svn: 93229
2010-01-12 04:42:54 +00:00
Evan Cheng
bc84a42d7b Revert 93158. It's breaking quite a few x86_64 tests.
llvm-svn: 93185
2010-01-11 21:13:41 +00:00
Dan Gohman
5b79391087 Re-instate MOV64r0 and MOV16r0, with adjustments to work with the
new AsmPrinter. This is perhaps less elegant than describing them
in terms of MOV32r0 and subreg operations, but it allows the
current register to rematerialize them.

llvm-svn: 93158
2010-01-11 17:37:57 +00:00
David Greene
8a6a52d2b8 Change errs() to dbgs().
llvm-svn: 92647
2010-01-05 01:29:08 +00:00
Dan Gohman
9bcfdf98f1 Change SelectCode's argument from SDValue to SDNode *, to make it more
clear what information these functions are actually using.

This is also a micro-optimization, as passing a SDNode * around is
simpler than passing a { SDNode *, int } by value or reference.

llvm-svn: 92564
2010-01-05 01:24:18 +00:00
Dan Gohman
f87ad693cf Flags-producing add, and, or, etc. have the same profibility
rules as normal add, and, or, etc.

llvm-svn: 92507
2010-01-04 20:51:50 +00:00
Chris Lattner
d7e8bd73fe completely eliminate the MOV16r0 'instruction'. The only
interesting part of this is the divrem changes, which are
already tested by CodeGen/X86/divrem.ll.

llvm-svn: 91975
2009-12-23 01:45:04 +00:00
Evan Cheng
a647318eb3 Re-apply 91623 now that I actually know what I was trying to do.
llvm-svn: 91655
2009-12-18 01:59:21 +00:00
Jeffrey Yasskin
f39a138a7c Revert r91623 to unbreak the buildbots.
llvm-svn: 91632
2009-12-17 22:44:34 +00:00
Evan Cheng
d765952b17 Remove an unused option.
llvm-svn: 91623
2009-12-17 21:23:58 +00:00
Dan Gohman
3517f425b8 Target-independent support for TargetFlags on BlockAddress operands,
and support for blockaddresses in x86-32 PIC mode.

llvm-svn: 89506
2009-11-20 23:18:13 +00:00
Daniel Dunbar
9789af1797 llvm-gcc/clang don't (won't?) need this hack.
llvm-svn: 86769
2009-11-11 00:28:38 +00:00
Daniel Dunbar
d8e09f2f13 Add a monstrous hack to improve X86ISelDAGToDAG compile time.
- Force NDEBUG on in any Release build. This drops the compile time to ~100s
   from ~600s, in Release mode.

 - This may just be a temporary workaround, I don't know the true nature of the
   gcc-4.2 compile time performance problem.

llvm-svn: 86695
2009-11-10 18:24:37 +00:00
Dan Gohman
1a8616a0d7 Use SUBREG_TO_REG instead of INSERT_SUBREG to model x86-64's
implicit zero-extend.

llvm-svn: 86196
2009-11-05 23:53:08 +00:00
Dan Gohman
eec0f1c506 Remove uninteresting and confusing debug output.
llvm-svn: 86149
2009-11-05 18:47:09 +00:00
Chris Lattner
4cf2980e59 improve x86 codegen support for blockaddress. We now compile
the testcase into:

_test1:                                                     ## @test1
## BB#0:                                                    ## %entry
	leaq	L_test1_bb6(%rip), %rax
	jmpq	*%rax
L_test1_bb:                                                 ## Address Taken
LBB1_1:                                                     ## %bb
	movb	$1, %al
	ret
L_test1_bb6:                                                ## Address Taken
LBB1_2:                                                     ## %bb6
	movb	$2, %al
	ret

Note, it is very very strange that BlockAddressSDNode doesn't carry 
around TargetFlags.  Dan, please fix this.

llvm-svn: 85703
2009-11-01 03:25:03 +00:00
Nick Lewycky
2b8400628d Remove includes of Support/Compiler.h that are no longer needed after the
VISIBILITY_HIDDEN removal.

llvm-svn: 85043
2009-10-25 06:57:41 +00:00
Nick Lewycky
711c726c97 Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces.
Chris claims we should never have visibility_hidden inside any .cpp file but
that's still not true even after this commit.

llvm-svn: 85042
2009-10-25 06:33:48 +00:00