Chris Lattner
2d59eef5fd
now that generic vector types aren't selected onto MMX operations,
...
we don't need -disable-mmx anymore.
llvm-svn: 122189
2010-12-19 20:19:20 +00:00
Chris Lattner
30438e63c8
reduce copy/paste programming with the power of for loops.
...
llvm-svn: 122187
2010-12-19 20:07:10 +00:00
Chris Lattner
29475c23d0
X86 supports i8/i16 overflow ops (except i8 multiplies), we should
...
generate them.
Now we compile:
define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
%0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
%cmp = extractvalue %0 %0, 1
br i1 %cmp, label %if.then, label %if.end
into:
_X: ## @X
## BB#0: ## %entry
subl $12, %esp
movb 16(%esp), %al
addb 20(%esp), %al
jo LBB0_2
Before we were generating:
_X: ## @X
## BB#0: ## %entry
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movb 12(%ebp), %al
testb %al, %al
setge %cl
movb 8(%ebp), %dl
testb %dl, %dl
setge %ah
cmpb %cl, %ah
sete %cl
addb %al, %dl
testb %dl, %dl
setge %al
cmpb %al, %ah
setne %al
andb %cl, %al
testb %al, %al
jne LBB0_2
llvm-svn: 122186
2010-12-19 20:03:11 +00:00
Nate Begeman
ef5f3c0fa7
Add support for matching psign & plendvb to the x86 target
...
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts
llvm-svn: 122098
2010-12-17 22:55:37 +00:00
Nate Begeman
cb6d1c8193
Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
...
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Eric Christopher
0e40452eb0
Rewrite the darwin tlv support to use a chain and return to copying
...
the output to the correct register. Fixes a hidden problem uncovered
by the last patch where we'd try to DAG combine our MVT::Other node
oddly.
llvm-svn: 121358
2010-12-09 06:25:53 +00:00
Eric Christopher
64e662fce9
Stop confusing people, it's not really a chain, or a tumor.
...
llvm-svn: 121340
2010-12-09 00:57:19 +00:00
Eric Christopher
0100a8fda4
Remove extraneous copy from DAG conversion for darwin tls. This was
...
popping up at O0 when it wasn't folded and the fast allocator would
complain.
llvm-svn: 121330
2010-12-09 00:27:58 +00:00
Chris Lattner
e30adfb732
Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags
...
result. This allows us to compile:
void *test12(long count) {
return new int[count];
}
into:
test12:
movl $4, %ecx
movq %rdi, %rax
mulq %rcx
movq $-1, %rdi
cmovnoq %rax, %rdi
jmp __Znam ## TAILCALL
instead of:
test12:
movl $4, %ecx
movq %rdi, %rax
mulq %rcx
seto %cl
testb %cl, %cl
movq $-1, %rdi
cmoveq %rax, %rdi
jmp __Znam
Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
which would eliminate the need for the 'movq %rdi, %rax'.
llvm-svn: 120936
2010-12-05 07:49:54 +00:00
Chris Lattner
76601e7a99
it turns out that when ".with.overflow" intrinsics were added to the X86
...
backend that they were all implemented except umul. This one fell back
to the default implementation that did a hi/lo multiply and compared the
top. Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test. Now we compile:
void *func(long count) {
return new int[count];
}
into:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
seto %cl ## encoding: [0x0f,0x90,0xc1]
testb %cl, %cl ## encoding: [0x84,0xc9]
movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8]
jmp __Znam ## TAILCALL
instead of:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2]
movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8]
jmp __Znam ## TAILCALL
Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.
llvm-svn: 120935
2010-12-05 07:30:36 +00:00
Chris Lattner
16bafb2414
generalize the previous check to handle -1 on either side of the
...
select, inserting a not to compensate. Add a missing isZero check
that I lost somehow.
This improves codegen of:
void *func(long count) {
return new int[count];
}
from:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2]
movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8]
jmp __Znam ## TAILCALL
## encoding: [0xeb,A]
to:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01]
sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff]
notq %rdi ## encoding: [0x48,0xf7,0xd7]
orq %rax, %rdi ## encoding: [0x48,0x09,0xc7]
jmp __Znam ## TAILCALL
## encoding: [0xeb,A]
llvm-svn: 120932
2010-12-05 02:00:51 +00:00
Chris Lattner
474ed0aa9b
Improve an integer select optimization in two ways:
...
1. generalize
(select (x == 0), -1, 0) -> (sign_bit (x - 1))
to:
(select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
2. Handle the identical pattern that happens with !=:
(select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
cmov is often high latency and can't fold immediates or
memory operands. For example for (x == 0) ? -1 : 1, before
we got:
< testb %sil, %sil
< movl $-1, %ecx
< movl $1, %eax
< cmovel %ecx, %eax
now we get:
> cmpb $1, %sil
> sbbl %eax, %eax
> orl $1, %eax
llvm-svn: 120929
2010-12-05 01:23:24 +00:00
Benjamin Kramer
851691ddb2
Add patterns for the x86 popcnt instruction.
...
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
supports SSE4.2 (nehalem) or SSE4A (barcelona).
llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Benjamin Kramer
77faee6ba1
Simplify code. No functionality change.
...
llvm-svn: 120907
2010-12-04 14:22:24 +00:00
Evan Cheng
4118b24aca
Fix and re-enable tail call optimization of expanded libcalls.
...
llvm-svn: 120622
2010-12-01 22:59:46 +00:00
Duncan Sands
cd4f56b8e2
I don't think it makes any sense to assert that the target supports SSE3 here.
...
The user (i.e. whoever generated a call to the intrinsic in the first place) is
essentially asking for a particular instruction to be placed in the assembler.
If that instruction won't execute on the target machine, that's their problem
not ours. Two buildbots with processors that don't support SSE3 were barfing
on the apm.ll test in CodeGen/X86 because of this assertion.
llvm-svn: 120574
2010-12-01 12:58:13 +00:00
Evan Cheng
84162760b7
Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot.
...
llvm-svn: 120549
2010-12-01 03:27:20 +00:00
Evan Cheng
f7e586d749
Enable sibling call optimization of libcalls which are expanded during
...
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777
llvm-svn: 120501
2010-11-30 23:55:39 +00:00
Eric Christopher
73365ae8b6
Fix insertion point in pcmp expander.
...
While I'm there, clean up too many \n even for me.
llvm-svn: 120411
2010-11-30 08:20:21 +00:00
Eric Christopher
2170738538
Fix some cleanups from my last patch.
...
llvm-svn: 120410
2010-11-30 08:10:28 +00:00
Eric Christopher
f27f0b5234
Rewrite mwait and monitor support and custom lower arguments.
...
Fixes PR8573.
llvm-svn: 120404
2010-11-30 07:20:12 +00:00
Rafael Espindola
9287c4b38f
Move lowering of TLS_addr32 and TLS_addr64 to X86MCInstLower.
...
llvm-svn: 120263
2010-11-28 21:16:39 +00:00
Rafael Espindola
45cd9713f2
Lower TLS_addr32 and TLS_addr64.
...
llvm-svn: 120225
2010-11-27 20:43:02 +00:00
Wesley Peck
d589353ad0
Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept.
...
llvm-svn: 119990
2010-11-23 03:31:01 +00:00
Anton Korobeynikov
269e7d3be1
Move hasFP() and few related hooks to TargetFrameInfo.
...
llvm-svn: 119740
2010-11-18 21:19:35 +00:00
Chris Lattner
9a0a840839
add targetoperand flags for jump tables, constant pool and block address
...
nodes to indicate when ha16/lo16 modifiers should be used. This lets
us pass PowerPC/indirectbr.ll.
The one annoying thing about this patch is that the MCSymbolExpr isn't
expressive enough to represent ha16(label1-label2) which we need on
PowerPC. I have a terrible hack in the meantime, but this will have
to be revisited at some point.
Last major conversion item left is global variable references.
llvm-svn: 119105
2010-11-15 02:46:57 +00:00
Chris Lattner
51168d6510
move the pic base symbol stuff up to MachineFunction
...
since it is trivial and will be shared between ppc and x86.
This substantially simplifies the X86 backend also.
llvm-svn: 119089
2010-11-14 22:48:15 +00:00
Chris Lattner
09ea2ce78f
simplify getPICBaseSymbol a bit.
...
llvm-svn: 119088
2010-11-14 22:37:11 +00:00
Peter Collingbourne
4ec6b2dbe7
Recognise 32-bit ror-based bswap implementation used by uclibc
...
llvm-svn: 119007
2010-11-13 19:54:30 +00:00
Peter Collingbourne
6c3094234e
Support ; as asm separator
...
llvm-svn: 119006
2010-11-13 19:54:23 +00:00
Dale Johannesen
659061fcd3
Remove possibly useful info from comment, per Chris.
...
llvm-svn: 118865
2010-11-12 00:43:18 +00:00
Duncan Sands
41edf30895
Simplify uses of MVT and EVT. An MVT can be compared directly
...
with a SimpleValueType, while an EVT supports equality and
inequality comparisons with SimpleValueType.
llvm-svn: 118169
2010-11-03 12:17:33 +00:00
Duncan Sands
92f33ea784
Factorize the duplicated logic for choosing the right argument
...
calling convention out of the fast and normal ISel files, and
into the calling convention TD file.
llvm-svn: 117856
2010-10-31 13:21:44 +00:00
John Thompson
6115a7f1d4
Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support.
...
llvm-svn: 117667
2010-10-29 17:29:13 +00:00
Michael J. Spencer
5518dda87e
x86-Win32: Switch ftol2 calling convention from stdcall to C.
...
llvm-svn: 117474
2010-10-27 18:52:38 +00:00
Dale Johannesen
2566d39d07
An stdcall function calling a non-stdcall function
...
cannot use tailcall. PR 8461.
llvm-svn: 117322
2010-10-25 22:17:05 +00:00
Duncan Sands
8c5f243aa0
Add parentheses to pacify gcc, which warns otherwise.
...
llvm-svn: 117020
2010-10-21 16:02:12 +00:00
Michael J. Spencer
5a68d7ce94
X86: Add alloca probing to dynamic alloca on Windows. Fixes PR8424.
...
llvm-svn: 116984
2010-10-21 01:41:01 +00:00
Dale Johannesen
58fe3193a6
Remove Synthesizable from the Type system; as MMX vector
...
types are no longer Legal on X86, we don't need it.
No functional change. 8499854.
llvm-svn: 116947
2010-10-20 21:32:10 +00:00
Michael J. Spencer
e528f2588f
X86: Add MS-CRT libcalls.
...
llvm-svn: 116801
2010-10-19 07:32:52 +00:00
Michael J. Spencer
cd6be63d05
Fix Whitespace.
...
llvm-svn: 116800
2010-10-19 07:32:42 +00:00
Eric Christopher
303692631e
Combine these together - should probably have some text associated
...
that says what why what we just asserted is wrong.
llvm-svn: 116333
2010-10-12 19:44:17 +00:00
Nick Lewycky
a6815ae877
Mark variable 'NoImplicitFloatOps' used only in an assert as used.
...
llvm-svn: 116323
2010-10-12 18:18:03 +00:00
Dan Gohman
d904add908
Initial va_arg support for x86-64. Patch by David Meyer!
...
llvm-svn: 116319
2010-10-12 18:00:49 +00:00
Andrew Trick
5704a15e36
Fixes bug 8297: i386 cmpxchg8b, missing MachineMemOperand
...
llvm-svn: 116214
2010-10-11 19:02:04 +00:00
Michael J. Spencer
08b96a9209
X86: Call ulldiv and ftol2 on Windows instead of their libgcc eqivilents.
...
llvm-svn: 116188
2010-10-11 05:29:15 +00:00
Michael J. Spencer
8f7251a0e9
X86: MinGW should always use libgcc on Windows.
...
llvm-svn: 116177
2010-10-10 23:11:06 +00:00
Michael J. Spencer
b13cb4dd72
X86: Call _alldiv instead of __divdi3 on Windows (excluding cygwin).
...
llvm-svn: 116174
2010-10-10 22:04:34 +00:00
Michael J. Spencer
ae56350c76
Fix Whitespace.
...
llvm-svn: 116173
2010-10-10 22:04:20 +00:00
Cameron Esfahani
664317d6cd
Recommit 116056, now with the missing file...
...
llvm-svn: 116083
2010-10-08 19:24:18 +00:00