1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

4263 Commits

Author SHA1 Message Date
Chris Lattner
da05d37aa1 Add new TargetInstrDesc::hasImplicitUseOfPhysReg and
hasImplicitDefOfPhysReg methods.  Use them to remove a 
look in X86 fast isel.

llvm-svn: 68886
2009-04-12 07:26:51 +00:00
Dan Gohman
ac11c8d30f Revert r68847. It breaks the build on non-Darwin targets, with this message
from the assembler:

Error: unknown pseudo-op: `.debug_inlined'
llvm-svn: 68863
2009-04-11 15:57:04 +00:00
Devang Patel
6f907173e0 Keep track of inlined functions and their locations. This information is collected when nested llvm.dbg.func.start intrinsics are seen. (Right now, inliner removes nested llvm.dbg.func.start intrinisics during inlining.)
Create debug_inlined dwarf section using these information. This info is used by gdb, at least on Darwin, to enable better experience debugging inlined functions. See DwarfWriter.cpp for more information on structure of debug_inlined section.

llvm-svn: 68847
2009-04-11 00:16:47 +00:00
Rafael Espindola
88986ef511 Don't fold a load if the other operand is a TLS address.
With this we generate

movl    %gs:0, %eax
leal    i@NTPOFF(%eax), %eax

instead of

movl    $i@NTPOFF, %eax
addl    %gs:0, %eax

llvm-svn: 68778
2009-04-10 10:09:34 +00:00
Chris Lattner
26aee059ba a few fixes to "addrspace(256) is reference offset of GS segment register".
It turns out that there are still several problems with this, will file a bugzilla.

llvm-svn: 68749
2009-04-10 00:16:23 +00:00
Dan Gohman
8121b3f88d Remove the obsolete SelectionDAG::getNodeValueTypes and simplify
code that uses it by using SelectionDAG::getVTList instead.

llvm-svn: 68744
2009-04-09 23:54:40 +00:00
Chris Lattner
e0d0edaf3f Fix code size computation on x86-64, patch by Zoltan Varga!
llvm-svn: 68690
2009-04-09 06:10:51 +00:00
Dan Gohman
6cb1387261 Fix grammaros in comments.
llvm-svn: 68666
2009-04-09 02:06:09 +00:00
Rafael Espindola
7eb72dc5f2 Re-apply 68552.
Tested by bootstrapping llvm-gcc and using that to build llvm.

llvm-svn: 68645
2009-04-08 21:14:34 +00:00
Rafael Espindola
d4563305fd Avoid a hard coded constant.
llvm-svn: 68603
2009-04-08 08:09:33 +00:00
Dan Gohman
c9ce27d6b7 Implement support for using modeling implicit-zero-extension on x86-64
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.

This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.

Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.

Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.

Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.

Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.

llvm-svn: 68576
2009-04-08 00:15:30 +00:00
Bill Wendling
6e702cf68c Temporarily revert r68552. This was causing a failure in the self-hosting LLVM
builds.

--- Reverse-merging (from foreign repository) r68552 into '.':
U    test/CodeGen/X86/tls8.ll
U    test/CodeGen/X86/tls10.ll
U    test/CodeGen/X86/tls2.ll
U    test/CodeGen/X86/tls6.ll
U    lib/Target/X86/X86Instr64bit.td
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86RegisterInfo.cpp
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86CodeEmitter.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86InstrInfo.h
U    lib/Target/X86/X86ISelDAGToDAG.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U    lib/Target/X86/X86ISelLowering.h
U    lib/Target/X86/X86InstrInfo.cpp
U    lib/Target/X86/X86InstrBuilder.h
U    lib/Target/X86/X86RegisterInfo.td

llvm-svn: 68560
2009-04-07 22:35:25 +00:00
Rafael Espindola
0324937229 Reduce code duplication on the TLS implementation.
This introduces a small regression on the generated code
quality in the case we are just computing addresses, not
loading values.

Will work on it and on X86-64 support.

llvm-svn: 68552
2009-04-07 21:37:46 +00:00
Mon P Wang
f829fb5cab Added a x86 dag combine to increase the chances to use a
movq for v2i64 on x86-32.

llvm-svn: 68368
2009-04-03 02:43:30 +00:00
Chris Lattner
f1719bf7b5 silence warning in release-asserts build.
llvm-svn: 68253
2009-04-01 22:14:45 +00:00
Evan Cheng
44fdb5d570 i128 shift libcalls are not available on x86.
llvm-svn: 68133
2009-03-31 19:38:51 +00:00
Dan Gohman
86e4d0130c Reapply 68073, with fixes. EH Landing-pad basic blocks are not
entered via fall-through. Don't miss fallthroughs from blocks
terminated by conditional branches. Also, move
isOnlyReachableByFallthrough out of line.

llvm-svn: 68129
2009-03-31 18:39:13 +00:00
Rafael Espindola
3d866ac20c remove unused arguments.
llvm-svn: 68109
2009-03-31 16:16:57 +00:00
Bill Wendling
28fad6fcc1 Really temporarily revert r68073.
llvm-svn: 68100
2009-03-31 08:42:40 +00:00
Bill Wendling
1c40c8c242 Oy! When reverting r68073, I added in experimental code. Sorry...
llvm-svn: 68099
2009-03-31 08:41:31 +00:00
Bill Wendling
4706abded2 Revert r68073. It's causing a failure in the Apple-style builds.
llvm-svn: 68092
2009-03-31 08:26:26 +00:00
Evan Cheng
5c02e62620 X86 address mode isel tweak. If the base of the address is also used by a CopyToReg (i.e. it's likely live-out), do not fold the sub-expressions into the addressing mode to avoid computing the address twice. The CopyToReg use will be isel'ed to a LEA, re-use it for address instead.
This is not yet enabled.

llvm-svn: 68082
2009-03-31 01:13:53 +00:00
Dan Gohman
29694088d3 Except in asm-verbose mode, avoid printing labels for blocks that are
only reachable via fall-through edges. This dramatically reduces the
number of labels printed, and thus also the number of labels the
assembler must parse and remember.

llvm-svn: 68073
2009-03-30 22:55:17 +00:00
Evan Cheng
3e30bcbd69 When optimzing a mul by immediate into two, the resulting mul's should get a x86 specific node to avoid dag combiner from hacking on them further.
llvm-svn: 68066
2009-03-30 21:36:47 +00:00
Anton Korobeynikov
0404baca28 Do not propagate ELF-specific stuff (data.rel) into other targets. This simplifies code and also ensures correctness.
llvm-svn: 68032
2009-03-30 15:27:43 +00:00
Anton Korobeynikov
2ea565a37b Add data.rel stuff
llvm-svn: 68031
2009-03-30 15:27:03 +00:00
Rafael Espindola
34f59009d1 Use array_lengthof
llvm-svn: 67950
2009-03-28 19:02:18 +00:00
Rafael Espindola
37522e768a Have only one definition of X86AddrNumOperands.
llvm-svn: 67949
2009-03-28 18:55:31 +00:00
Rafael Espindola
884992a7e9 Make code a bit less brittle by no hardcoding the number
of operands in an address in so many places.

llvm-svn: 67945
2009-03-28 17:03:24 +00:00
Evan Cheng
a15fdaa292 Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g.
x * 40
=>
shlq    $3, %rdi
leaq    (%rdi,%rdi,4), %rax

This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq    (%rdi,%rdi,2), %rax
leaq    (%rsi,%rax,8), %rax

llvm-svn: 67917
2009-03-28 05:57:29 +00:00
Rafael Espindola
83ee99d7b0 Avoid hardcoding that X86 addresses have 4 operands.
llvm-svn: 67848
2009-03-27 15:57:50 +00:00
Rafael Espindola
7c113e5354 Use less hard coded constants to make the code less brittle.
llvm-svn: 67846
2009-03-27 15:45:05 +00:00
Rafael Espindola
38604d9598 I am trying to add a segment to the X86 addresses matching to
improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle.

This patch just makes it a bit more resistant.

llvm-svn: 67843
2009-03-27 15:26:30 +00:00
Evan Cheng
ab6e38c88d -no-implicit-float means explicit fp operations are legal.
llvm-svn: 67784
2009-03-26 23:06:32 +00:00
Bill Wendling
f4247ff478 Pull transform from target-dependent code into target-independent code.
llvm-svn: 67742
2009-03-26 06:14:09 +00:00
Bill Wendling
f79eccc675 Match this pattern so that we can generate simpler code:
%a = ...
  %b = and i32 %a, 2
  %c = srl i32 %b, 1
  %d = br i32 %c, 

into

  %a = ...
  %b = and %a, 2
  %c = X86ISD::CMP %b, 0
  %d = X86ISD::BRCOND %c ...

This applies only when the AND constant value has one bit set and the SRL
constant is equal to the log2 of the AND constant. The back-end is smart enough
to convert the result into a TEST/JMP sequence.

llvm-svn: 67728
2009-03-26 01:47:50 +00:00
Bill Wendling
40ac545f38 Doxygen-ify comments.
llvm-svn: 67727
2009-03-26 01:46:56 +00:00
Evan Cheng
3a7489a4cc CodeGen still defaults to non-verbose asm, but llc now overrides it and default to verbose.
llvm-svn: 67668
2009-03-25 01:47:28 +00:00
Evan Cheng
6ff8cea903 Don't print global names twice with -asm-verbose.
llvm-svn: 67667
2009-03-25 01:08:42 +00:00
Dan Gohman
7a9e8cbf79 I was convinced that it's ok to allow a second i8 return value
to be returned in DL. LLVM's multiple-return-value support is
not ABI-conforming; front-ends that wish to have code emitted
that conforms to an ABI are currently expected to make
arrangements for this on their own rather than assuming that
multiple-return-values will automatically do the right thing.
This commit doesn't fundamentally change this situation.

llvm-svn: 67588
2009-03-24 01:04:34 +00:00
Evan Cheng
b3196f1298 Do not emit comments unless -asm-verbose.
llvm-svn: 67580
2009-03-24 00:17:40 +00:00
Dan Gohman
e9cf3083d2 Correct some comments. Operand numbers start at 0.
llvm-svn: 67518
2009-03-23 15:40:10 +00:00
Evan Cheng
2ec94dd447 Model inline asm constraint which ties an input to an output register as machine operand TIED_TO constraint. This eliminated the need to pre-allocate registers for these. This also allows register allocator can eliminate the unneeded copies.
llvm-svn: 67512
2009-03-23 08:01:15 +00:00
Dan Gohman
745c0acc79 Fix a grammaro in a comment that Bill noticed.
llvm-svn: 67507
2009-03-23 05:02:44 +00:00
Dan Gohman
16b4a33039 Add comments explaining why there's only one register for
i8 return values.

llvm-svn: 67502
2009-03-23 04:28:24 +00:00
Nick Lewycky
a0dcd7e173 Remove strange extra semicolons.
llvm-svn: 67287
2009-03-19 05:51:39 +00:00
Chris Lattner
205380a4e4 Disable the "call to immediate" optimization on x86-64. It is
not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.  
Conservatively fall back to loading the value into a register
and calling through it.

We still do the optzn on X86-32.

llvm-svn: 67142
2009-03-18 00:43:52 +00:00
Dan Gohman
f6c57d0fe7 Recognize bswapl as bswap too.
llvm-svn: 67072
2009-03-17 02:45:40 +00:00
Dan Gohman
4efda2b52b Recognize "bswapq" as an alternate spelling for the bswap instruction.
llvm-svn: 67071
2009-03-17 02:17:27 +00:00
Dan Gohman
fd6debff99 Use %rip-relative addressing on x86-64 whenever practical, as
it has a smaller encoding than absolute addressing.

llvm-svn: 67002
2009-03-14 02:33:41 +00:00
Dan Gohman
e7495ef7aa Don't forego folding of loads into 64-bit adds when the other
operand is a signed 32-bit immediate. Unlike with the 8-bit
signed immediate case, it isn't actually smaller to fold a
32-bit signed immediate instead of a load. In fact, it's
larger in the case of 32-bit unsigned immediates, because
they can be materialized with movl instead of movq.

llvm-svn: 67001
2009-03-14 02:07:16 +00:00
Dan Gohman
fa0a3504ba Improve FastISel's handling of truncates to i1, and implement
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.

llvm-svn: 66988
2009-03-13 23:53:06 +00:00
Dan Gohman
790659c0d6 Fix FastISel's assumption that i1 values are always zero-extended
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.

llvm-svn: 66941
2009-03-13 20:42:20 +00:00
Rafael Espindola
aadb9af093 add 8 and 16 bit TLS moves.
add a fixme note on how to remove code duplication.

llvm-svn: 66932
2009-03-13 19:39:55 +00:00
Rafael Espindola
ff17d02271 Improve sext and zext of TLS variables.
llvm-svn: 66922
2009-03-13 18:37:06 +00:00
Chris Lattner
63569fa327 generalize this code so that fast isel handles integer truncates to i1, which
codegen to the same thing as integer truncates to i8 (the top bits are 
just undefined).  This implements rdar://6667338

llvm-svn: 66902
2009-03-13 16:36:42 +00:00
Bill Wendling
2fe64f48aa These instructions have special lowering that may lower them to SSE
instructions. Prevent that if we don't want implicit uses of SSE.

llvm-svn: 66877
2009-03-13 08:41:47 +00:00
Evan Cheng
f9951d1557 Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.


Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.

llvm-svn: 66875
2009-03-13 07:51:59 +00:00
Chris Lattner
cbbdd230dd generalize the previous code to use the full generality of LEA
for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this).  On the example, we now
generate:

_test:
	movl	4(%esp), %eax
	cmpl	$42, (%eax)
	setl	%al
	movzbl	%al, %eax
	leal	4(%eax,%eax,8), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	movl	$4, %ecx
	movl	$13, %eax
	cmovg	%ecx, %eax
	ret

llvm-svn: 66869
2009-03-13 05:53:31 +00:00
Chris Lattner
878d951f8f optimize the case of cond ? 42 : 41 and friends. This compiles the
example to:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	setg	%al
	movzbl	%al, %eax
	orl	$4294967294, %eax
	ret

instead of:

        movl    4(%esp), %eax
        cmpl    $41, (%eax)
	movl	$4294967294, %ecx
	movl	$4294967295, %eax
	cmova	%ecx, %eax
	ret

which is smaller in code size and faster. rdar://6668608

llvm-svn: 66868
2009-03-13 05:22:11 +00:00
Dan Gohman
37d843c129 Enhance address-mode folding of ISD::ADD to handle cases where the
operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.

llvm-svn: 66865
2009-03-13 02:25:09 +00:00
Evan Cheng
d112c41d95 Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address assembly. 2. Fixed JIT encoding by making the address pc-relative.
llvm-svn: 66803
2009-03-12 18:15:39 +00:00
Chris Lattner
26a971c4ec Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))"
related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.

llvm-svn: 66779
2009-03-12 06:52:53 +00:00
Chris Lattner
b904fac1b8 improve comment.
llvm-svn: 66778
2009-03-12 06:46:02 +00:00
Evan Cheng
46e903d2f6 On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead.
llvm-svn: 66776
2009-03-12 05:59:15 +00:00
Dan Gohman
d30e108f0e Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the
assembly text output uses an indirect call ("call *") instead of a direct call.

llvm-svn: 66735
2009-03-11 23:01:47 +00:00
Rafael Espindola
a8fe373200 optimize i8 and i16 tls values.
llvm-svn: 66725
2009-03-11 22:40:04 +00:00
Bill Wendling
fca05e3a5c Add a -no-implicit-float flag. This acts like -soft-float, but may generate
floating point instructions that are explicitly specified by the user.

llvm-svn: 66719
2009-03-11 22:30:01 +00:00
Duncan Sands
b27c523449 It makes no sense to have a ODR version of common
linkage, so remove it.

llvm-svn: 66690
2009-03-11 20:14:15 +00:00
Mon P Wang
287e422039 For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits.
llvm-svn: 66684
2009-03-11 18:47:57 +00:00
Mon P Wang
2867737ad2 Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw.
llvm-svn: 66645
2009-03-11 06:35:11 +00:00
Chris Lattner
eb9327f335 formatting change, reduce indentation. No functionality change.
llvm-svn: 66642
2009-03-11 05:48:52 +00:00
Dan Gohman
e15d8f03c3 Add more information to the EFLAGS note.
llvm-svn: 66515
2009-03-10 00:26:23 +00:00
Dan Gohman
995c3dd344 Add a note about EFLAGS optimization.
llvm-svn: 66508
2009-03-09 23:47:02 +00:00
Chris Lattner
b89dbcd448 do not export all the X86FastISel symbols, ever.
llvm-svn: 66382
2009-03-08 18:44:31 +00:00
Chris Lattner
3342ba06d4 add a note.
llvm-svn: 66360
2009-03-08 03:04:26 +00:00
Chris Lattner
8ace06fdda add a note.
llvm-svn: 66359
2009-03-08 01:54:43 +00:00
Duncan Sands
5ab54d488f Introduce new linkage types linkonce_odr, weak_odr, common_odr
and extern_weak_odr.  These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global.  In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time.   This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function.  If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body.  The
code generators on the other hand map weak and weak_odr linkage
to the same thing.

llvm-svn: 66339
2009-03-07 15:45:40 +00:00
Dan Gohman
b9c32f1aca Arithmetic instructions don't set EFLAGS bits OF and CF bits
the same say the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.

llvm-svn: 66318
2009-03-07 01:58:32 +00:00
Dan Gohman
f9599e6c5f Don't use plain INC32 and DEC32 on x86-64; it needs
INC64_32r and INC64_16r, because these instructions are encoded
differently on x86-64. This fixes JIT regressions on x86-64 in
kimwitu++ and others.

llvm-svn: 66207
2009-03-05 21:32:23 +00:00
Dan Gohman
1e9db7c1a1 When creating X86ISD::INC and X86ISD::DEC nodes, only add one operand.
The extra operand didn't appear to cause any trouble, but it was
erroneous regardless.

llvm-svn: 66206
2009-03-05 21:29:28 +00:00
Dan Gohman
f6f684b206 Fix the "test" optimization to recognize "dec" as an add of
negative one, as subtracts of immediates are canonicalized
to adds.

llvm-svn: 66180
2009-03-05 19:32:48 +00:00
Dan Gohman
31fb085c2e Re-apply 66008, now that the unfoldMemoryOperand bug is fixed.
llvm-svn: 66058
2009-03-04 19:44:21 +00:00
Dan Gohman
f41e54c5af Correct this comment.
llvm-svn: 66057
2009-03-04 19:24:25 +00:00
Dan Gohman
04453ca36c When using MachineInstr operand indices on SDNodes, the number
of MachineInstr def operands must be subtracted out. This bug
was uncovered by the recent x86 EFLAGS optimization. Before
that, the only instructions that ever needed unfolding were
things like CMP32rm, where NumDefs is zero.

llvm-svn: 66056
2009-03-04 19:23:38 +00:00
Evan Cheng
7d9019d0f3 Fix PR3666: isel calls to constant addresses.
llvm-svn: 66024
2009-03-04 06:48:53 +00:00
Dan Gohman
6831e2c2a6 Revert r66004 for now; it's causing a variety of test failures.
llvm-svn: 66008
2009-03-04 03:54:19 +00:00
Dan Gohman
c6c669cc1e Teach the x86 backend to eliminate "test" instructions by using the EFLAGS
result from add, sub, inc, and dec instructions in simple cases.

llvm-svn: 66004
2009-03-04 02:33:24 +00:00
Evan Cheng
db402a7a49 Fix PR3701. 1. X86 target renamed eflags register to flags. This matches what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs.
llvm-svn: 65996
2009-03-04 01:41:49 +00:00
Dan Gohman
3c6c7754b2 Add '(implicit EFLAGS)' for AND, OR, XOR, NEG, INC, and DEC
instructions. These aren't used yet.

llvm-svn: 65965
2009-03-03 19:53:46 +00:00
Dan Gohman
51d4e8db6a Fix a bunch of Doxygen syntax issues. Escape special characters,
and put @file directives on their own comment line.

llvm-svn: 65920
2009-03-03 02:55:14 +00:00
Mon P Wang
0258a27c5a Added another darwin subtarget
llvm-svn: 65662
2009-02-28 00:25:30 +00:00
Rafael Espindola
880e63bf01 Refactor TLS code and add some tests. The tests and expected results are:
pic |  declaration | linkage  | visibility |

!pic |  declaration | external | default    | tls1.ll     tls2.ll     | local exec
 pic |  declaration | external | default    | tls1-pic.ll tls2-pic.ll | general dynamic
!pic | !declaration | external | default    | tls3.ll     tls4.ll     | initial exec
 pic | !declaration | external | default    | tls3-pic.ll tls4-pic.ll | general dynamic

!pic |  declaration | external | hidden     | tls7.ll     tls8.ll     | local exec
 pic |  declaration | external | hidden     | X                       | local dynamic
!pic | !declaration | external | hidden     | tls9.ll     tls10.ll    | local exec
 pic | !declaration | external | hidden     | X                       | local dynamic

!pic |  declaration | internal | default    | tls5.ll     tls6.ll     | local exec
 pic |  declaration | internal | default    | X                       | local dynamic

The ones marked with an X have not been implemented since local dynamic is not implemented.

llvm-svn: 65632
2009-02-27 13:37:18 +00:00
Evan Cheng
4014a9a5b8 ADDS{D|S}rr_Int and MULS{D|S}rr_Int are not commutable. The users of these intrinsics expect the high bits will not be modified.
llvm-svn: 65499
2009-02-26 03:12:02 +00:00
Evan Cheng
ec34226c2b Revert BuildVectorSDNode related patches: 65426, 65427, and 65296.
llvm-svn: 65482
2009-02-25 22:49:59 +00:00
Bill Wendling
9d4eb136da Overhaul my earlier submission due to feedback. It's a large patch, but most of
them are generic changes.

- Use the "fast" flag that's already being passed into the asm printers instead
  of shoving it into the DwarfWriter.

- Instead of calling "MI->getParent()->getParent()" for every MI, set the
  machine function when calling "runOnMachineFunction" in the asm printers.

llvm-svn: 65379
2009-02-24 08:30:20 +00:00
Dan Gohman
3766eeea36 Fast-isel can't do TLS yet, so it should fall back to SDISel
if it sees TLS addresses.

llvm-svn: 65341
2009-02-23 22:03:08 +00:00
Evan Cheng
dd139e795c Only v1i16 (i.e. _m64) is returned via RAX / RDX.
llvm-svn: 65313
2009-02-23 09:03:22 +00:00
Nate Begeman
e0093d2501 Generate better code for v8i16 shuffles on SSE2
Generate better code for v16i8 shuffles on SSE2 (avoids stack)
Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops.
Document the shuffle matching logic and add some FIXMEs for later further
  cleanups.
New tests that test the above.

Examples:

New:
_shuf2:
	pextrw	$7, %xmm0, %eax
	punpcklqdq	%xmm1, %xmm0
	pshuflw	$128, %xmm0, %xmm0
	pinsrw	$2, %eax, %xmm0

Old:
_shuf2:
	pextrw	$2, %xmm0, %eax
	pextrw	$7, %xmm0, %ecx
	pinsrw	$2, %ecx, %xmm0
	pinsrw	$3, %eax, %xmm0
	movd	%xmm1, %eax
	pinsrw	$4, %eax, %xmm0
	ret

=========

New:
_shuf4:
	punpcklqdq	%xmm1, %xmm0
	pshufb	LCPI1_0, %xmm0

Old:
_shuf4:
	pextrw	$3, %xmm0, %eax
	movsd	%xmm1, %xmm0
	pextrw	$3, %xmm1, %ecx
	pinsrw	$4, %ecx, %xmm0
	pinsrw	$5, %eax, %xmm0

========

New:
_shuf1:
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	pextrw	$1, %xmm0, %eax
	rolw	$8, %ax
	movd	%xmm0, %ecx
	rolw	$8, %cx
	pextrw	$5, %xmm0, %edx
	pextrw	$4, %xmm0, %esi
	pextrw	$3, %xmm0, %edi
	pextrw	$2, %xmm0, %ebx
	movaps	%xmm0, %xmm1
	pinsrw	$0, %ecx, %xmm1
	pinsrw	$1, %eax, %xmm1
	rolw	$8, %bx
	pinsrw	$2, %ebx, %xmm1
	rolw	$8, %di
	pinsrw	$3, %edi, %xmm1
	rolw	$8, %si
	pinsrw	$4, %esi, %xmm1
	rolw	$8, %dx
	pinsrw	$5, %edx, %xmm1
	pextrw	$7, %xmm0, %eax
	rolw	$8, %ax
	movaps	%xmm1, %xmm0
	pinsrw	$7, %eax, %xmm0
	popl	%esi
	popl	%edi
	popl	%ebx
	ret

Old:
_shuf1:
	subl	$252, %esp
	movaps	%xmm0, (%esp)
	movaps	%xmm0, 16(%esp)
	movaps	%xmm0, 32(%esp)
	movaps	%xmm0, 48(%esp)
	movaps	%xmm0, 64(%esp)
	movaps	%xmm0, 80(%esp)
	movaps	%xmm0, 96(%esp)
	movaps	%xmm0, 224(%esp)
	movaps	%xmm0, 208(%esp)
	movaps	%xmm0, 192(%esp)
	movaps	%xmm0, 176(%esp)
	movaps	%xmm0, 160(%esp)
	movaps	%xmm0, 144(%esp)
	movaps	%xmm0, 128(%esp)
	movaps	%xmm0, 112(%esp)
	movzbl	14(%esp), %eax
	movd	%eax, %xmm1
	movzbl	22(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	42(%esp), %eax
	movd	%eax, %xmm1
	movzbl	50(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm1, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	77(%esp), %eax
	movd	%eax, %xmm1
	movzbl	84(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	104(%esp), %eax
	movd	%eax, %xmm1
	punpcklbw	%xmm1, %xmm0
	punpcklbw	%xmm2, %xmm0
	movaps	%xmm0, %xmm1
	punpcklbw	%xmm3, %xmm1
	movzbl	127(%esp), %eax
	movd	%eax, %xmm0
	movzbl	135(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	155(%esp), %eax
	movd	%eax, %xmm0
	movzbl	163(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm0, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	188(%esp), %eax
	movd	%eax, %xmm0
	movzbl	197(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	217(%esp), %eax
	movd	%eax, %xmm4
	movzbl	225(%esp), %eax
	movd	%eax, %xmm0
	punpcklbw	%xmm4, %xmm0
	punpcklbw	%xmm2, %xmm0
	punpcklbw	%xmm3, %xmm0
	punpcklbw	%xmm1, %xmm0
	addl	$252, %esp
	ret

llvm-svn: 65311
2009-02-23 08:49:38 +00:00
Scott Michel
3f8637305f Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.

llvm-svn: 65296
2009-02-22 23:36:09 +00:00
Evan Cheng
3ea8bd42f3 Add a note.
llvm-svn: 65275
2009-02-22 08:13:45 +00:00
Evan Cheng
4385f393f7 Be bug compatible with gcc by returning MMX values in RAX.
llvm-svn: 65274
2009-02-22 08:05:12 +00:00
Evan Cheng
9d9688ec15 Do not consider MMX_MOVD64rr a move instructions. The source register is in GR32, the destination is VR64. They are not compatible.
llvm-svn: 65273
2009-02-22 08:04:23 +00:00
Anton Korobeynikov
5df82e3e25 Drop bunch of half-working stuff in the ext_weak linkage support.
Now we're using one gross, but quite robust hack :) (previous ones
did not work, for example, when ext_weak symbol was used deep inside
constant expression in the initializer).

The proper fix of this problem will require some quite huge asmprinter
changes and that's why was postponed. This fixes PR3629 by the way :)

llvm-svn: 65230
2009-02-21 11:53:32 +00:00
Bill Wendling
bfc216c45a Make sure this doesn't access .end() too.
llvm-svn: 65213
2009-02-21 01:11:36 +00:00
Bill Wendling
96430050a5 Make sure we don't dereference the .end() of the container.
llvm-svn: 65211
2009-02-21 01:07:26 +00:00
Bill Wendling
66c3ffa2de Propagate more debug loc infos. This also includes some code cleaning.
llvm-svn: 65207
2009-02-21 00:43:56 +00:00
Bill Wendling
09289bc433 We need to propagate the debug location information even when dealing with the
prologue/epilogue.

llvm-svn: 65206
2009-02-21 00:32:08 +00:00
Evan Cheng
c40c3e28f7 Support return of MMX values in 64-bit mode.
llvm-svn: 65152
2009-02-20 20:43:02 +00:00
Bill Wendling
306b992133 Put code that generates debug labels into TableGen so that it can be used by
everyone.

llvm-svn: 64978
2009-02-18 23:12:06 +00:00
Nate Begeman
5e78e558ff Add support to the JIT for true non-lazy operation. When a call to a function
that has not been JIT'd yet, the callee is put on a list of pending functions
to JIT.  The call is directed through a stub, which is updated with the address
of the function after it has been JIT'd.  A new interface for allocating and
updating empty stubs is provided.

Add support for removing the ModuleProvider the JIT was created with, which
would otherwise invalidate the JIT's PassManager, which is initialized with the
ModuleProvider's Module.

Add support under a new ExecutionEngine flag for emitting the infomration 
necessary to update Function and GlobalVariable stubs after JITing them, by
recording the address of the stub and the name of the GlobalValue.  This allows
code to be copied from one address space to another, where libraries may live
at different virtual addresses, and have the stubs updated with their new
correct target addresses.

llvm-svn: 64906
2009-02-18 08:31:02 +00:00
Dan Gohman
9c258bd2ec Factor out the code to add a MachineOperand to a MachineInstrBuilder.
llvm-svn: 64891
2009-02-18 05:45:50 +00:00
Evan Cheng
bd63a1f40d GV with null value initializer shouldn't go to BSS if it's meant for a mergeable strings section. Currently it only checks for Darwin. Someone else please check if it should apply to other targets as well.
llvm-svn: 64877
2009-02-18 02:19:52 +00:00
Scott Michel
4c5fa6c982 Remove trailing whitespace to reduce later commit patch noise.
(Note: Eventually, commits like this will be handled via a pre-commit hook that
 does this automagically, as well as expand tabs to spaces and look for 80-col
 violations.)

llvm-svn: 64827
2009-02-17 22:15:04 +00:00
Chris Lattner
3b67310686 add a horrible note
llvm-svn: 64719
2009-02-17 01:16:14 +00:00
Bill Wendling
266c8bc98f --- Merging (from foreign repository) r64714 into '.':
U    include/llvm/CodeGen/DebugLoc.h
U    lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp

Enable debug location generation at -Os. This goes with the reapplication of the
r63639 patch.

llvm-svn: 64715
2009-02-17 01:04:54 +00:00
Dan Gohman
856736a187 MachineLICM now handles these cases.
llvm-svn: 64620
2009-02-15 23:24:52 +00:00
Dan Gohman
684359ea96 The x86-64 red zone is now being used.
llvm-svn: 64535
2009-02-14 03:30:05 +00:00
Evan Cheng
9041a71923 Teach x86 target -soft-float.
llvm-svn: 64496
2009-02-13 22:36:38 +00:00
Dale Johannesen
560b03bbcd Remove non-DebugLoc versions of BuildMI from X86.
There were some that might even matter in X86FastISel.

llvm-svn: 64437
2009-02-13 02:33:27 +00:00
Bill Wendling
40e4b271af Revert this. It was breaking stuff.
llvm-svn: 64428
2009-02-13 02:16:35 +00:00
Bill Wendling
83b6edd760 Turn off the old way of handling debug information in the code generator. Use
the new way, where all of the information is passed on SDNodes and machine
instructions.

llvm-svn: 64427
2009-02-13 02:01:04 +00:00
Dale Johannesen
5a21722625 Eliminate a couple of non-DebugLoc BuildMI variants.
Modify callers.

llvm-svn: 64409
2009-02-12 23:08:38 +00:00
Dale Johannesen
47321cf01f Arrange to print constants that match "n" and "i" constraints
in inline asm as signed (what gcc does).  Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.

llvm-svn: 64400
2009-02-12 20:58:09 +00:00
Chris Lattner
1174b80823 fix the X86 backend to just drop llvm.declare nodes for VLAs instead of
leaving them in the DAG and then getting selection errors.  This is a 
fix for PR3538.

llvm-svn: 64382
2009-02-12 17:33:11 +00:00
Bill Wendling
49baa5465c Propagate DebugLoc info for spiller call-backs.
llvm-svn: 64329
2009-02-11 21:51:19 +00:00
Dan Gohman
2decb4495d Don't try to set an EFLAGS operand to dead if no instruction was created.
This fixes a bug introduced by r61215.

llvm-svn: 64316
2009-02-11 19:50:24 +00:00
Evan Cheng
cdb35e3f0f Handle llvm.x86.sse2.maskmov.dqu in 64-bit.
llvm-svn: 64240
2009-02-10 22:06:28 +00:00
Evan Cheng
a1b9cf3143 80 col violations.
llvm-svn: 64237
2009-02-10 21:39:44 +00:00
Evan Cheng
3b84024598 Implement FpSET_ST1_*.
llvm-svn: 64186
2009-02-09 23:32:07 +00:00
Dan Gohman
548d7c9145 Use doxygen comment syntax.
llvm-svn: 64150
2009-02-09 18:12:09 +00:00
Evan Cheng
9dc1507838 Turns out AnalyzeBranch can modify the mbb being analyzed. This is a nasty
suprise to some callers, e.g. register coalescer. For now, add an parameter
that tells AnalyzeBranch whether it's safe to modify the mbb. A better
solution is out there, but I don't have time to deal with it right now.

llvm-svn: 64124
2009-02-09 07:14:22 +00:00
Chris Lattner
a50a554333 add a note.
llvm-svn: 64093
2009-02-08 20:44:19 +00:00
Dale Johannesen
b22cb23f6f Use getDebugLoc forwarder instead of getNode()->getDebugLoc.
No functional change.

llvm-svn: 64026
2009-02-07 19:59:05 +00:00
Dan Gohman
4105a38248 Constify TargetInstrInfo::EmitInstrWithCustomInserter, allowing
ScheduleDAG's TLI member to use const.

llvm-svn: 64018
2009-02-07 16:15:20 +00:00
Dale Johannesen
a259483aae Get rid of the last non-DebugLoc versions of getNode!
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base.  There's no
sensible way to associate debug info with these.  I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands. 
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.

llvm-svn: 63992
2009-02-07 00:55:49 +00:00
Dan Gohman
8437b9efa1 Refactor some repeated logic into a separate function.
llvm-svn: 63989
2009-02-07 00:43:41 +00:00
Dan Gohman
8aeae15ebd Make a comment a doxygen comment.
llvm-svn: 63988
2009-02-07 00:42:54 +00:00
Dale Johannesen
1580ab6b7f Remove more non-DebugLoc getNode variants. Use
getCALLSEQ_{END,START} to permit passing no DebugLoc
there.  UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.

llvm-svn: 63978
2009-02-06 23:05:02 +00:00
Dale Johannesen
c405486235 Remove more non-DebugLoc versions of getNode.
llvm-svn: 63969
2009-02-06 21:50:26 +00:00
Bill Wendling
2c7f47d1d4 Record debug location information in the Dwarf writer.
A simple test program shows that debugging works. :-)

llvm-svn: 63968
2009-02-06 21:45:08 +00:00
Dan Gohman
a2f300f26d Use .size and .type on ELF systems; this helps tools that map
addresses to symbols.

llvm-svn: 63962
2009-02-06 21:15:52 +00:00
Evan Cheng
e00df1d39c Move getPointerRegClass from TargetInstrInfo to TargetRegisterInfo.
llvm-svn: 63938
2009-02-06 17:43:24 +00:00
Evan Cheng
381b2df5ff Add TargetInstrInfo::isSafeToMoveRegisterClassDefs. It returns true if it's safe to move an instruction which defines a value in the register class. Replace pre-splitting specific IgnoreRegisterClassBarriers with this new hook.
llvm-svn: 63936
2009-02-06 17:17:30 +00:00
Dale Johannesen
e95c76b65e Get rid of one more non-DebugLoc getNode and
its corresponding getTargetNode.  Lots of
caller changes.

llvm-svn: 63904
2009-02-06 01:31:28 +00:00
Evan Cheng
87def37f67 A few more isAsCheapAsAMove.
llvm-svn: 63852
2009-02-05 08:42:55 +00:00
Dale Johannesen
15a801f11d Remove non-DebugLoc versions of getLoad and getStore.
Adjust the many callers of those versions.

llvm-svn: 63767
2009-02-04 20:06:27 +00:00
Chris Lattner
feb6b56242 Bill implemented this.
llvm-svn: 63752
2009-02-04 19:09:07 +00:00
Chris Lattner
eae4653469 add a note, this is why we're faster at SciMark-MonteCarlo with
SSE disabled.

llvm-svn: 63751
2009-02-04 19:08:01 +00:00
Dan Gohman
1cd89d625c Minor code cleanups; no functionality change.
llvm-svn: 63740
2009-02-04 17:28:58 +00:00