1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

179 Commits

Author SHA1 Message Date
Dan Gohman
ac47a4b9ed Enable the new no-SP register classes by default. This is to address
PR4572. A few tests have some minor code regressions due to different
coalescing.

llvm-svn: 78217
2009-08-05 17:40:24 +00:00
Dan Gohman
5d566d918b Major calling convention code refactoring.
Instead of awkwardly encoding calling-convention information with ISD::CALL,
ISD::FORMAL_ARGUMENTS, ISD::RET, and ISD::ARG_FLAGS nodes, TargetLowering
provides three virtual functions for targets to override:
LowerFormalArguments, LowerCall, and LowerRet, which replace the custom
lowering done on the special nodes. They provide the same information, but
in a more immediately usable format.

This also reworks much of the target-independent tail call logic. The
decision of whether or not to perform a tail call is now cleanly split
between target-independent portions, and the target dependent portion
in IsEligibleForTailCallOptimization.

This also synchronizes all in-tree targets, to help enable future
refactoring and feature work.

llvm-svn: 78142
2009-08-05 01:29:28 +00:00
Anton Korobeynikov
0bac80c138 Unbreak Win64 CC. Step one: honour register save area, fix some alignment and provide a different set of call-clobberred registers.
llvm-svn: 77962
2009-08-03 08:12:53 +00:00
Dan Gohman
d36cbd0574 Resync lea32addr and lea64addr.
llvm-svn: 77893
2009-08-02 16:09:17 +00:00
Evan Cheng
148032a1a2 Optimize some common usage patterns of atomic built-ins __sync_add_and_fetch() and __sync_sub_and_fetch.
When the return value is not used (i.e. only care about the value in the memory), x86 does not have to use add to implement these. Instead, it can use add, sub, inc, dec instructions with the "lock" prefix.

This is currently implemented using a bit of instruction selection trick. The issue is the target independent pattern produces one output and a chain and we want to map it into one that just output a chain. The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant. DAG combiner can then transform the node before it gets to target node selection.

Problem #2 is we are adding a whole bunch of x86 atomic instructions when in fact these instructions are identical to the non-lock versions. We need a way to add target specific information to target nodes and have this information carried over to machine instructions. Asm printer (or JIT) can use this information to add the "lock" prefix.

llvm-svn: 77582
2009-07-30 08:33:02 +00:00
Bill Wendling
50e5f4884a Add the following functions:
- SYSCALL (normal fast system call instruction) [0f 05]
- SYSENTER (system call entry instruction) [0f 34]
- SYSEXIT (system call exit instruction) [0f 35]
- SYSEXIT64 (system call exit instruction to 64-bit user code) [REX.W 0f 35]
- SYSRET (system call return instruction) [0f 07]

Patch by Sean Callanan.

llvm-svn: 76528
2009-07-21 01:07:24 +00:00
Chris Lattner
e12dcd84ca use SUBREG_TO_REG instead of INSERT_SUBREG, this way the code
generator can know the top bits are zero, not undefined.
Thanks to Dan for pointing this out.

llvm-svn: 75899
2009-07-16 06:31:37 +00:00
Chris Lattner
290c415b94 reapply r75408, which eliminates MOV64r0 in favor of using
MOV32r0 + subregs to do the same thing.  This should work now
that PR4544 is fixed.  Thanks Evan!

llvm-svn: 75671
2009-07-14 20:19:57 +00:00
Bill Wendling
dcf4c8e237 Temporarily revert r75408. It appears to break the Apple-style builds:
x86_64-apple-darwin10-gcc -c   -g -O2  -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute   -mdynamic-no-pic -DHAVE_CONFIG_H -I. -I. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/. -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../include -I./../intl -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libcpp/include  -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~obj/src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmCore.roots/llvmCore~dst/Developer/usr/local/include  -D_DEBUG  -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -DLLVM_VERSION_INFO='"9999"' -DBUILD_LLVM_APPLE_STYLE   /Volumes/Sandbox/Buildbot/llvm/build.llvm-gcc-x86_64-darwin10-selfhost/build/llvmgcc42.roots/llvmgcc42~obj/src/gcc/tree-ssa-alias.c -o tree-ssa-alias.o
/var/tmp//ccJQ2JBT.s:4134:Incorrect register `%rcx' used with `l' suffix
make[2]: *** [tree-ssa-live.o] Error 1
make[2]: *** Waiting for unfinished jobs....

llvm-svn: 75412
2009-07-12 02:49:22 +00:00
Chris Lattner
83effafb6b eliminate MOV64r0 in favor of a Pat<> pattern. This is only nontrivial because
the div lowering code explicitly references it.

llvm-svn: 75408
2009-07-12 00:47:55 +00:00
Chris Lattner
a54c70286c fix x86-64 static codegen to materialize the address of a global with movl instead
of lea.  It is better for code size (and presumably efficiency) to use:

  movl $foo, %eax

rather than:

  leal foo, eax

Both give a nice zero extending "move immediate" instruction, the former is just
smaller.  Note that global addresses should be handled different by the x86
backend, but I chose to follow the style already in place and add more fixme's.

llvm-svn: 75403
2009-07-11 23:17:29 +00:00
Chris Lattner
4902f811b6 comment cleanup, reduce nesting.
llvm-svn: 75398
2009-07-11 22:50:33 +00:00
Chris Lattner
2af5f2aeca remove some dead patterns, WrapperRIP doesn't exist in -static mode
anymore, so these aren't needed.

llvm-svn: 75397
2009-07-11 22:47:21 +00:00
Chris Lattner
19eb0dad26 Reimplement rip-relative addressing in the X86-64 backend. The new
implementation primarily differs from the former in that the asmprinter
doesn't make a zillion decisions about whether or not something will be
RIP relative or not.  Instead, those decisions are made by isel lowering
and propagated through to the asm printer.  To achieve this, we:

1. Represent RIP relative addresses by setting the base of the X86 addr
   mode to X86::RIP.
2. When ISel Lowering decides that it is safe to use RIP, it lowers to
   X86ISD::WrapperRIP.  When it is unsafe to use RIP, it lowers to
   X86ISD::Wrapper as before.
3. This removes isRIPRel from X86ISelAddressMode, representing it with
   a basereg of RIP instead.
4. The addressing mode matching logic in isel is greatly simplified.
5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate
   passed through various printoperand routines is gone now.
6. The various symbol printing routines in asmprinter now no longer infer
   when to emit (%rip), they just print the symbol.

I think this is a big improvement over the previous situation.  It does have
two small caveats though: 1. I implemented a horrible "no-rip" modifier for
the inline asm "P" constraint modifier.  This is a short term hack, there is
a much better, but more involved, solution.  2. I had to xfail an 
-aggressive-remat testcase because it isn't handling the use of RIP in the
constant-pool reading instruction.  This specific test is easy to fix without
-aggressive-remat, which I intend to do next.

llvm-svn: 74372
2009-06-27 04:16:01 +00:00
Chris Lattner
580eecebbd change TLS_ADDR lowering to lower to a real mem operand, instead of matching as
a global with that gets printed with the :mem modifier.  All operands to lea's 
should be handled with the lea32mem operand kind, and this allows the TLS stuff
to do this.  There are several better ways to do this, but I went for the minimal
change since I can't really test this (beyond make check).

This also makes the use of EBX explicit in the operand list in the 32-bit, 
instead of implicit in the instruction.

llvm-svn: 73834
2009-06-20 20:38:48 +00:00
Chris Lattner
12ba79a2b7 eliminate the "call" operand modifier from the asm descriptions, modeling
it as a pcrel immediate instead.  This gets pc-rel weirdness out of the
main printoperand codepath.

llvm-svn: 73829
2009-06-20 19:34:09 +00:00
Chris Lattner
7b43ca847d implement support for lowering subregs when preparing to print
LEA64_32r, eliminating a bunch of modifier logic stuff on addr modes.

Implement support for printing mbb labels as operands.

llvm-svn: 73817
2009-06-20 07:03:18 +00:00
Evan Cheng
c2b51a746a CALL64pcrel32 immediate field is 32-bit. Patch by Abhinav Duggal.
llvm-svn: 73536
2009-06-16 19:44:27 +00:00
Bill Wendling
43f2a61c26 The Ls and Qs were mixed up. Patch by Sean.
llvm-svn: 73417
2009-06-15 20:59:31 +00:00
Bill Wendling
8b64cfd877 "The Intel instruction tables should include the 64-bit and 32-bit instructions
that push immediate operands of 1, 2, and 4 bytes (extended to the native
register size in each case).  The assembly mnemonics are "pushl" and "pushq."
One such instruction appears at the beginning of the "start" function , so this
is essential for accurate disassembly when unwinding."

Patch by Sean Callanan!

llvm-svn: 73407
2009-06-15 19:39:04 +00:00
Dan Gohman
609f627ed7 Revert r72734. The Darwin assembler doesn't support the static
relocation model on x86-64. Higher level logic should override
the relocation model to PIC on x86_64-apple-darwin.

llvm-svn: 72746
2009-06-03 00:37:20 +00:00
Evan Cheng
7e66d61bec On Darwin x86_64 small code model doesn't guarantee code address fits in 32-bit.
llvm-svn: 72734
2009-06-02 20:09:31 +00:00
Dale Johannesen
8b6ee9e312 Revert 72707 and 72709, for the moment.
llvm-svn: 72712
2009-06-02 03:12:52 +00:00
Dale Johannesen
c08669561e Make the implicit inputs and outputs of target-independent
ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to)
instead of MVT::Flag.  Remove CARRY_FALSE in favor of 0; adjust
all target-independent code to use this format.

Most targets will still produce a Flag-setting target-dependent
version when selection is done.  X86 is converted to use i32
instead, which means TableGen needs to produce different code
in xxxGenDAGISel.inc.  This keys off the new supportsHasI1 bit
in xxxInstrInfo, currently set only for X86; in principle this
is temporary and should go away when all other targets have
been converted.  All relevant X86 instruction patterns are
modified to represent setting and using EFLAGS explicitly.  The
same can be done on other targets.

The immediate behavior change is that an ADC/ADD pair are no
longer tightly coupled in the X86 scheduler; they can be
separated by instructions that don't clobber the flags (MOV).
I will soon add some peephole optimizations based on using
other instructions that set the flags to feed into ADC.

llvm-svn: 72707
2009-06-01 23:27:20 +00:00
Dan Gohman
b38d9b6a57 Fix a grammaro and clarify a comment.
llvm-svn: 72668
2009-05-31 17:52:18 +00:00
Evan Cheng
2d198e1bc2 (i64 (zext (srl GR32 8))) -> movzbl AH is not safe since srl 8 only clear the top 8 bits.
llvm-svn: 72618
2009-05-30 08:43:27 +00:00
Evan Cheng
550fc9ba9f More h-registers tricks: folding zext nodes.
llvm-svn: 72558
2009-05-29 01:44:43 +00:00
Chris Lattner
5cc9a36d1c Add basic support for code generation of
addrspace(257) -> FS relative on x86.  Patch by Zoltan Varga!

llvm-svn: 70992
2009-05-05 18:52:19 +00:00
Dan Gohman
180fa04e35 Rename GR8_, GR16_, GR32_, and GR64_ to GR8_ABCD, GR16_ABCD,
GR32_ABCD, and GR64_ABCD, respectively, to help describe them.

llvm-svn: 70210
2009-04-27 16:33:14 +00:00
Dan Gohman
885b9c3688 Break up long multi-mnemonic strings into separate lines for readability.
llvm-svn: 70209
2009-04-27 15:13:28 +00:00
Rafael Espindola
4e7a0bf1f1 Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not
very elegant, but neither is the tls specification :-(

llvm-svn: 69968
2009-04-24 12:59:40 +00:00
Rafael Espindola
5adc7ad39e TLS_addr64 and TLS_addr32 define RDI and EAX. They don't use them.
This fixes PR4002.

llvm-svn: 69672
2009-04-21 08:22:09 +00:00
Rafael Espindola
d74132e2c5 For general dynamic TLS access we must use
leaq	foo@TLSGD(%rip), %rdi

as part of the instruction sequence. Using a register other than %rdi and then
copying it to %rdi is not valid.

llvm-svn: 69350
2009-04-17 14:35:58 +00:00
Dan Gohman
8393d29bc8 Rename COPY_TO_SUBCLASS to COPY_TO_REGCLASS, and generalize
it accordingly. Thanks to Jakob Stoklund Olesen for pointing
out how this might be useful.

llvm-svn: 68986
2009-04-13 21:06:25 +00:00
Dan Gohman
be7227005f Implement x86 h-register extract support.
- Add patterns for h-register extract, which avoids a shift and mask,
   and in some cases a temporary register.
 - Add address-mode matching for turning (X>>(8-n))&(255<<n), where
   n is a valid address-mode scale value, into an h-register extract
   and a scaled-offset address.
 - Replace X86's MOV32to32_ and related instructions with the new
   target-independent COPY_TO_SUBREG instruction.

On x86-64 there are complicated constraints on h registers, and
CodeGen doesn't currently provide a high-level way to express all of them,
so they are handled with a bunch of special code. This code currently only
supports extracts where the result is used by a zero-extend or a store,
though these are fairly common.

These transformations are not always beneficial; since there are only
4 h registers, they sometimes require extra move instructions, and
this sometimes increases register pressure because it can force out
values that would otherwise be in one of those registers. However,
this appears to be relatively uncommon.

llvm-svn: 68962
2009-04-13 16:09:41 +00:00
Dan Gohman
a68d99c707 Add a comment about MOVSX64rr8.
llvm-svn: 68950
2009-04-13 15:13:28 +00:00
Rafael Espindola
7eb72dc5f2 Re-apply 68552.
Tested by bootstrapping llvm-gcc and using that to build llvm.

llvm-svn: 68645
2009-04-08 21:14:34 +00:00
Dan Gohman
c9ce27d6b7 Implement support for using modeling implicit-zero-extension on x86-64
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.

This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.

Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.

Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.

Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.

Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.

llvm-svn: 68576
2009-04-08 00:15:30 +00:00
Bill Wendling
6e702cf68c Temporarily revert r68552. This was causing a failure in the self-hosting LLVM
builds.

--- Reverse-merging (from foreign repository) r68552 into '.':
U    test/CodeGen/X86/tls8.ll
U    test/CodeGen/X86/tls10.ll
U    test/CodeGen/X86/tls2.ll
U    test/CodeGen/X86/tls6.ll
U    lib/Target/X86/X86Instr64bit.td
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86InstrInfo.td
U    lib/Target/X86/X86RegisterInfo.cpp
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86CodeEmitter.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86InstrInfo.h
U    lib/Target/X86/X86ISelDAGToDAG.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp
U    lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h
U    lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h
U    lib/Target/X86/X86ISelLowering.h
U    lib/Target/X86/X86InstrInfo.cpp
U    lib/Target/X86/X86InstrBuilder.h
U    lib/Target/X86/X86RegisterInfo.td

llvm-svn: 68560
2009-04-07 22:35:25 +00:00
Rafael Espindola
0324937229 Reduce code duplication on the TLS implementation.
This introduces a small regression on the generated code
quality in the case we are just computing addresses, not
loading values.

Will work on it and on X86-64 support.

llvm-svn: 68552
2009-04-07 21:37:46 +00:00
Evan Cheng
3e30bcbd69 When optimzing a mul by immediate into two, the resulting mul's should get a x86 specific node to avoid dag combiner from hacking on them further.
llvm-svn: 68066
2009-03-30 21:36:47 +00:00
Chris Lattner
205380a4e4 Disable the "call to immediate" optimization on x86-64. It is
not safe in general because the immediate could be an arbitrary
value that does not fit in a 32-bit pcrel displacement.  
Conservatively fall back to loading the value into a register
and calling through it.

We still do the optzn on X86-32.

llvm-svn: 67142
2009-03-18 00:43:52 +00:00
Evan Cheng
d112c41d95 Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address assembly. 2. Fixed JIT encoding by making the address pc-relative.
llvm-svn: 66803
2009-03-12 18:15:39 +00:00
Dan Gohman
d30e108f0e Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the
assembly text output uses an indirect call ("call *") instead of a direct call.

llvm-svn: 66735
2009-03-11 23:01:47 +00:00
Dan Gohman
f9599e6c5f Don't use plain INC32 and DEC32 on x86-64; it needs
INC64_32r and INC64_16r, because these instructions are encoded
differently on x86-64. This fixes JIT regressions on x86-64 in
kimwitu++ and others.

llvm-svn: 66207
2009-03-05 21:32:23 +00:00
Dan Gohman
31fb085c2e Re-apply 66008, now that the unfoldMemoryOperand bug is fixed.
llvm-svn: 66058
2009-03-04 19:44:21 +00:00
Evan Cheng
7d9019d0f3 Fix PR3666: isel calls to constant addresses.
llvm-svn: 66024
2009-03-04 06:48:53 +00:00
Dan Gohman
6831e2c2a6 Revert r66004 for now; it's causing a variety of test failures.
llvm-svn: 66008
2009-03-04 03:54:19 +00:00
Dan Gohman
c6c669cc1e Teach the x86 backend to eliminate "test" instructions by using the EFLAGS
result from add, sub, inc, and dec instructions in simple cases.

llvm-svn: 66004
2009-03-04 02:33:24 +00:00
Dan Gohman
3c6c7754b2 Add '(implicit EFLAGS)' for AND, OR, XOR, NEG, INC, and DEC
instructions. These aren't used yet.

llvm-svn: 65965
2009-03-03 19:53:46 +00:00
Evan Cheng
a1b9cf3143 80 col violations.
llvm-svn: 64237
2009-02-10 21:39:44 +00:00
Evan Cheng
87def37f67 A few more isAsCheapAsAMove.
llvm-svn: 63852
2009-02-05 08:42:55 +00:00
Nate Begeman
92efc4f0ce Map address space 256 to gs; similar mappings could be supported for the
other x86 segments.  address space 0 is stack/default, 1-255 are reserved for
client use.

llvm-svn: 62980
2009-01-26 01:24:32 +00:00
Evan Cheng
43d680b0d8 Also favors NOT64r.
llvm-svn: 62710
2009-01-21 19:45:31 +00:00
Dan Gohman
8c835f6285 Disable the register+memory forms of the bt instructions for now. Thanks
to Eli for pointing out that these forms don't ignore the high bits of
their index operands, and as such are not immediately suitable for use
by isel.

llvm-svn: 62194
2009-01-13 23:23:30 +00:00
Dan Gohman
15e69a394a Add bt instructions that take immediate operands.
llvm-svn: 62180
2009-01-13 20:33:23 +00:00
Dan Gohman
ca4475dd7b Add patterns to match conditional moves with loads folded
into their left operand, rather than their right. Do this
by commuting the operands and inverting the condition.

llvm-svn: 61842
2009-01-07 01:00:24 +00:00
Dan Gohman
e78fdaec67 Define instructions for cmovo and cmovno.
llvm-svn: 61836
2009-01-07 00:35:10 +00:00
Chris Lattner
062ed6e3dd Fix some JIT encodings.
llvm-svn: 61425
2008-12-25 01:32:49 +00:00
Chris Lattner
f34b843728 BT memory operands load from their address operand.
llvm-svn: 61424
2008-12-25 01:27:10 +00:00
Dan Gohman
1ba93ac6be Add instruction patterns and encodings for the x86 bt instructions.
llvm-svn: 61400
2008-12-23 22:45:23 +00:00
Dan Gohman
22b7b328a4 Move the patterns which have i8 immediates before the patterns
that have i32 immediates so that they get selected first. This
currently only matters in the JIT, as assemblers will
automatically use the smallest encoding.

llvm-svn: 61250
2008-12-19 18:25:21 +00:00
Bill Wendling
13e4a3d0b0 - Use patterns instead of creating completely new instruction matching patterns,
which are identical to the original patterns.

- Change the multiply with overflow so that we distinguish between signed and
  unsigned multiplication. Currently, unsigned multiplication with overflow
  isn't working!

llvm-svn: 60963
2008-12-12 21:15:41 +00:00
Bill Wendling
5d026e47c1 Redo the arithmetic with overflow architecture. I was changing the semantics of
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.

Similar for SUB and MUL instructions.

llvm-svn: 60915
2008-12-12 00:56:36 +00:00
Bill Wendling
4c8fb3a0cc Add sub/mul overflow intrinsics. This currently doesn't have a
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!

llvm-svn: 60800
2008-12-09 22:08:41 +00:00
Nick Lewycky
e277f75880 Fix typo, psuedo -> pseudo.
llvm-svn: 60651
2008-12-07 03:49:52 +00:00
Dan Gohman
5dad0993a9 Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning.
llvm-svn: 60487
2008-12-03 18:15:48 +00:00
Bill Wendling
039240b301 Reapply r60382. This time, don't mark "ADC" nodes with "implicit EFLAGS".
llvm-svn: 60385
2008-12-02 00:07:05 +00:00
Bill Wendling
16840cba04 Temporarily revert r60382. It caused CodeGen/X86/i2k.ll and others to fail.
llvm-svn: 60383
2008-12-01 23:44:08 +00:00
Bill Wendling
628848b540 - Have "ADD" instructions return an implicit EFLAGS.
- Add support for seto, setno, setc, and setnc instructions.

llvm-svn: 60382
2008-12-01 23:30:42 +00:00
Dan Gohman
e5420a0ae9 Don't set neverHasSideEffects on x86's divide instructions, since
they trap on divide-by-zero, and this side effect is otherwise
unmodeled.

llvm-svn: 59551
2008-11-18 21:29:14 +00:00
Nate Begeman
e621f0539e Fix PEXTRQ encoding
llvm-svn: 58403
2008-10-29 23:07:17 +00:00
Dan Gohman
268cfea6bc Fun x86 encoding tricks: when adding an immediate value of 128,
use a SUB instruction instead of an ADD, because -128 can be
encoded in an 8-bit signed immediate field, while +128 can't be.
This avoids the need for a 32-bit immediate field in this case.

A similar optimization applies to 64-bit adds with 0x80000000,
with the 32-bit signed immediate field.

To support this, teach tablegen how to handle 64-bit constants.

llvm-svn: 57663
2008-10-17 01:33:43 +00:00
Dan Gohman
5d83bd89a5 Define patterns for shld and shrd that match immediate
shift counts, and patterns that match dynamic shift counts
when the subtract is obscured by a truncate node.

Add DAGCombiner support for recognizing rotate patterns
when the shift counts are defined by truncate nodes.

Fix and simplify the code for commuting shld and shrd
instructions to work even when the given instruction doesn't
have a parent, and when the caller needs a new instruction.

These changes allow LLVM to use the shld, shrd, rol, and ror
instructions on x86 to replace equivalent code using two
shifts and an or in many more cases.

llvm-svn: 57662
2008-10-17 01:23:35 +00:00
Chris Lattner
7910d59d44 Change CALLSEQ_BEGIN and CALLSEQ_END to take TargetConstant's as
parameters instead of raw Constants.  This prevents the constants from
being selected by the isel pass, fixing PR2735.

llvm-svn: 57385
2008-10-11 22:08:30 +00:00
Dan Gohman
4aacc3ab83 Split x86's ADJCALLSTACK instructions into 32-bit and 64-bit forms.
This allows the 64-bit forms to use+def RSP instead of ESP. This
doesn't fix any real bugs today, but it is more precise and it
makes the debug dumps on x86-64 look more consistent.

Also, add some comments describing the CALL instructions' physreg
operand uses and defs.

llvm-svn: 56925
2008-10-01 18:28:06 +00:00
Dan Gohman
d6e96e9888 Mark CALL instructions as having a Use of ESP/RSP.
llvm-svn: 56911
2008-10-01 04:14:30 +00:00
Bill Wendling
932818c75a Reverting r56249. On further investigation, this functionality isn't needed.
Apologies for the thrashing.

llvm-svn: 56251
2008-09-16 21:48:12 +00:00
Bill Wendling
1a240c8033 - Change "ExternalSymbolSDNode" to "SymbolSDNode".
- Add linkage to SymbolSDNode (default to external).
- Change ISD::ExternalSymbol to ISD::Symbol.
- Change ISD::TargetExternalSymbol to ISD::TargetSymbol

These changes pave the way to allowing SymbolSDNodes with non-external linkage.

llvm-svn: 56249
2008-09-16 21:12:30 +00:00
Dan Gohman
89660301e3 Rename ConstantSDNode::getValue to getZExtValue, for consistency
with ConstantInt. This led to fixing a bug in TargetLowering.cpp
using getValue instead of getAPIntValue.

llvm-svn: 56159
2008-09-12 16:56:44 +00:00
Anton Korobeynikov
33c69aaf24 Reapply 55899: First draft of EH support on x86/64-linux
Now with fix, which prevents subtle codegen bug to trigger on darwin.
No fix for bug though, it's still there.

llvm-svn: 55955
2008-09-08 21:12:47 +00:00
Bill Wendling
4cc4caab72 Reverting r55898 to r55909. One of these patches was causing an ICE during the full bootstrap on Darwin:
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/bin/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/lib/
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/include
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/sys-include
-O2  -O2 -g -O2  -DIN_GCC    -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
-isystem ./include  -fPIC -pipe -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2
-D__GCC_FLOAT_NOT_NEEDED  -I. -I. -I../../llvm-gcc.src/gcc
-I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include
-I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include
-I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include
-DSHARED -m64 -DL_negdi2 -c ../../llvm-gcc.src/gcc/libgcc2.c -o
libgcc/x86_64/_negdi2_s.o
Assertion failed: (TargetRegisterInfo::isVirtualRegister(regA) &&
TargetRegisterInfo::isVirtualRegister(regB) && "cannot update physical
register live information"), function runOnMachineFunction, file
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/lib/CodeGen/TwoAddressInstructionPass.cpp,
line 311.
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/bin/
-B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/lib/
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/include
-isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.4.0/sys-include
-O2  -O2 -g -O2  -DIN_GCC    -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
-isystem ./include  -fPIC -pipe -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2
-D__GCC_FLOAT_NOT_NEEDED  -I. -I. -I../../llvm-gcc.src/gcc
-I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include
-I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include
-I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include
-I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include
-DSHARED -m64 -DL_lshrdi3 -c ../../llvm-gcc.src/gcc/libgcc2.c -o
libgcc/x86_64/_lshrdi3_s.o
../../llvm-gcc.src/gcc/unwind-dw2.c:1527: internal compiler error: Abort trap
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
{standard input}:unknown:Undefined local symbol LBB21_11
{standard input}:unknown:Undefined local symbol LBB21_12
{standard input}:unknown:Undefined local symbol LBB21_13
{standard input}:unknown:Undefined local symbol LBB21_8

llvm-svn: 55928
2008-09-08 17:59:12 +00:00
Anton Korobeynikov
8528e4dc99 First draft of EH support on x86/64-linux
llvm-svn: 55899
2008-09-08 14:21:53 +00:00
Evan Cheng
c3c439a624 For now, can't mark XOR64rr isAsCheapAsAMove. It's technically correct. But various passes cannot handle remating these.
llvm-svn: 55562
2008-08-30 08:54:22 +00:00
Evan Cheng
4bc8c9652e Transform (x << (y&31)) -> (x << y). This takes advantage of the fact x86 shift instructions 2nd operand (shift count) is limited to 0 to 31 (or 63 in the x86-64 case).
llvm-svn: 55558
2008-08-30 02:03:58 +00:00
Dale Johannesen
490c016734 Split the ATOMIC NodeType's to include the size, e.g.
ATOMIC_LOAD_ADD_{8,16,32,64} instead of ATOMIC_LOAD_ADD.
Increased the Hardcoded Constant OpActionsCapacity to match.
Large but boring; no functional change.

This is to support partial-word atomics on ppc; i8 is
not a valid type there, so by the time we get to lowering, the
ATOMIC_LOAD nodes looks the same whether the type was i8 or i32.
The information can be added to the AtomicSDNode, but that is the
largest SDNode; I don't fully understand the SDNode allocation,
but it is sensitive to the largest node size, so increasing
that must be bad.  This is the alternative.

llvm-svn: 55457
2008-08-28 02:44:49 +00:00
Dan Gohman
3976cccecd Reinstate the x86-64 portion of r55190. When doing extloads into
64-bit registers from 16-bit and smaller memory locations, prefer
instructions that define the entire 64-bit register, to avoid
partial-register updates.

llvm-svn: 55422
2008-08-27 17:33:15 +00:00
Evan Cheng
2b9f879a99 Fix asm printing of MOVSDto64mr and MOV64toSDrm.
llvm-svn: 55300
2008-08-25 04:11:42 +00:00
Bill Wendling
60e176391d Reverting r55190, r55191, and r55192. They broke the build with this error message:
{standard input}:17:bad register name `%sil'
make[4]: *** [libgcc/./_addvsi3.o] Error 1
make[4]: *** Waiting for unfinished jobs....
{standard input}:23:bad register name `%dil'
{standard input}:28:bad register name `%dil'
make[4]: *** [libgcc/./_addvdi3.o] Error 1
{standard input}:18:bad register name `%sil'
make[4]: *** [libgcc/./_subvsi3.o] Error 1

llvm-svn: 55200
2008-08-22 20:51:05 +00:00
Dan Gohman
897aa30d7c Anyext tweaks for x86. When extloading a value to i32 or i64, choose
instructions that define the full 32 or 64-bit value. When anyexting
from i8 to i16 or i32, it's not necessary to zero out the high
portion of the register.

llvm-svn: 55190
2008-08-22 19:19:31 +00:00
Dan Gohman
411cc551cb Move the handling of ANY_EXTEND, SIGN_EXTEND_INREG, and TRUNCATE
out of X86ISelDAGToDAG.cpp C++ code and into tablegen code.
Among other things, using tablegen for these things makes them
friendlier to FastISel.

Tablegen can handle the case of i8 subregs on x86-32, but currently
the C++ code for that case uses MVT::Flag in a tricky way, and it
happens to schedule better in some cases. So for now, leave the
C++ code in place to handle the i8 case on x86-32.

llvm-svn: 55078
2008-08-20 21:27:32 +00:00
Dale Johannesen
69c9d47dce Add remaining 64-bit atomic patterns for x86-64.
llvm-svn: 55029
2008-08-20 00:48:50 +00:00
Bill Wendling
ab390189dc Revert r55018 and apply the correct "fix" for the 64-bit sub_and_fetch atomic.
Just expand it like the other X-bit sub_and_fetches.

llvm-svn: 55023
2008-08-20 00:28:16 +00:00
Bill Wendling
ab7c8c091e Add support for the __sync_sub_and_fetch atomics and friends for X86. The code
was already present, but not hooked up to anything.

llvm-svn: 55018
2008-08-19 23:09:18 +00:00
Dale Johannesen
15b76de064 Add support for 8 and 16 bit forms of __sync
builtins on X86.

Change "lock" instructions to be on a separate line.
This is needed to work around a bug in the Darwin
assembler.

llvm-svn: 54999
2008-08-19 18:47:28 +00:00
Dan Gohman
74fa421281 Re-enable elimination of unnecessary SUBREG_TO_REG instructions in
LowerSubregs, and fix an x86-64 isel bug that this exposed.

SUBREG_TO_REG for x86-64 implicit zero extension is only safe for
isel to generate when the source is known to always have zeros in
the high 32 bits. The EXTRACT_SUBREG instruction does not clear
the high 32 bits.

llvm-svn: 54444
2008-08-07 02:54:50 +00:00
Dan Gohman
cc784f1662 Re-introduce the 8-bit subreg zext-inreg patterns for x86-32,
this time using MOV32to32_ and MOV16to16_. Thanks to Evan for
suggesting this.

llvm-svn: 54418
2008-08-06 18:27:21 +00:00
Dan Gohman
99d70043f9 xchg does not modify FLAGS.
llvm-svn: 54411
2008-08-06 15:52:50 +00:00
Dan Gohman
efb5d2ce6e Reapply r54147 with a constraint to only use the 8-bit
subreg form on x86-64, to avoid the problem with x86-32
having GPRs that don't have 8-bit subregs.

Also, change several 16-bit instructions to use 
equivalent 32-bit instructions. These have a smaller
encoding and avoid partial-register updates.

llvm-svn: 54223
2008-07-30 18:09:17 +00:00
Dan Gohman
ebe629a4b2 Revert 54147.
llvm-svn: 54148
2008-07-29 01:02:18 +00:00