Nate Begeman
dc46021355
Finish off bug 680, allowing targets to custom lower frame and return
...
address nodes.
llvm-svn: 33636
2007-01-29 22:58:52 +00:00
Anton Korobeynikov
611d5e2eda
Propagate changes from my local tree. This patch includes:
...
1. New parameter attribute called 'inreg'. It has meaning "place this
parameter in registers, if possible". This is some generalization of
gcc's regparm(n) attribute. It's currently used only in X86-32 backend.
2. Completely rewritten CC handling/lowering code inside X86 backend.
Merged stdcall + c CCs and fastcall + fast CC.
3. Dropped CSRET CC. We cannot add struct return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in
on first attribute has meaning 'This is hidden pointer to structure
return. Handle it gently'.
5. Fixed small bug in llvm-extract + add new feature to
FunctionExtraction pass, which relinks all internal-linkaged callees
from deleted function to external linkage. This will allow further
linking everything together.
NOTEs: 1. Documentation will be updated soon.
2. llvm-upgrade should be improved to translate csret => sret.
Before this, there will be some unexpected test fails.
llvm-svn: 33597
2007-01-28 13:31:35 +00:00
Evan Cheng
df277336b8
- FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before
...
or'ing in the sign bit of operand 1.
- Tweaking: rather than left shift the sign bit, fp_extend operand 1 first
before taking its sign bit if its type is smaller than that of operand 0.
llvm-svn: 32932
2007-01-05 21:37:56 +00:00
Evan Cheng
bcf3d2bd15
With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations.
...
llvm-svn: 32900
2007-01-05 07:55:56 +00:00
Evan Cheng
456101ebb9
- Use a different wrapper node for RIP-relative GV, etc.
...
- Proper support for both small static and PIC modes under X86-64
- Some (non-optimal) support for medium modes.
llvm-svn: 32046
2006-11-30 21:55:46 +00:00
Evan Cheng
ae1f3758bd
Don't dag combine floating point select to max and min intrinsics. Those
...
take v4f32 / v2f64 operands and may end up causing larger spills / restores.
Added X86 specific nodes X86ISD::FMAX, X86ISD::FMIN instead.
This fixes PR996.
llvm-svn: 31645
2006-11-10 21:43:37 +00:00
Evan Cheng
7ca1f47a96
Fixed a bug which causes x86 be to incorrectly match
...
shuffle v, undef, <2, ?, 3, ?>
to movhlps
It should match to unpckhps instead.
Added proper matching code for
shuffle v, undef, <2, 3, 2, 3>
llvm-svn: 31519
2006-11-07 22:14:24 +00:00
Chris Lattner
def30d3eda
allow the address of a global to be used with the "i" constraint when in
...
-static mode. This implements PR882.
llvm-svn: 31326
2006-10-31 20:13:11 +00:00
Evan Cheng
090e9abaee
Fixed a significant bug where unpcklpd is incorrectly used to extract element 1 from a v2f64 value.
...
llvm-svn: 31228
2006-10-27 21:08:32 +00:00
Chris Lattner
62a0f00312
Implement branch analysis/xform hooks required by the branch folding pass.
...
llvm-svn: 31065
2006-10-20 17:42:20 +00:00
Chris Lattner
a86fff7583
fit in 80 cols
...
llvm-svn: 31039
2006-10-18 18:26:48 +00:00
Chris Lattner
04ad43b4de
update comments
...
llvm-svn: 30663
2006-09-28 23:33:12 +00:00
Anton Korobeynikov
59ef7e94eb
Adding codegeneration for StdCall & FastCall calling conventions
...
llvm-svn: 30549
2006-09-20 22:03:51 +00:00
Evan Cheng
cfd7b147cf
X86ISD::CMP now produces a chain as well as a flag. Make that the chain
...
operand of a conditional branch to allow load folding into CMP / TEST
instructions.
llvm-svn: 30241
2006-09-11 02:19:56 +00:00
Evan Cheng
15dd42884e
Committing X86-64 support.
...
llvm-svn: 30177
2006-09-08 06:48:29 +00:00
Chris Lattner
26ff12f7f5
Fix PR850 and CodeGen/X86/2006-07-31-SingleRegClass.ll.
...
The CFE refers to all single-register constraints (like "A") by their 16-bit
name, even though the 8 or 32-bit version of the register may be needed.
The X86 backend should realize what is going on and redecode the name back
to its proper form.
llvm-svn: 29420
2006-07-31 23:26:50 +00:00
Chris Lattner
b75fe307e1
Implement the inline asm 'A' constraint. This implements PR825 and
...
CodeGen/X86/2006-07-10-InlineAsmAConstraint.ll
llvm-svn: 29101
2006-07-11 02:54:03 +00:00
Evan Cheng
1d48a494a2
X86 target specific DAG combine: turn build_vector (load x), (load x+4),
...
(load x+8), (load x+12), <0, 1, 2, 3> to a single 128-bit load (aligned and
unaligned).
e.g.
__m128 test(float a, float b, float c, float d) {
return _mm_set_ps(d, c, b, a);
}
_test:
movups 4(%esp), %xmm0
ret
llvm-svn: 29042
2006-07-07 08:33:52 +00:00
Evan Cheng
db5c7909f5
Simplify X86CompilationCallback: always align to 16-byte boundary; don't save EAX/EDX if unnecessary.
...
llvm-svn: 28910
2006-06-24 08:36:10 +00:00
Evan Cheng
bb17ad5ffa
Switch X86 over to a call-selection model where the lowering code creates
...
the copyto/fromregs instead of making the X86ISD::CALL selection code create
them.
llvm-svn: 28463
2006-05-25 00:59:30 +00:00
Chris Lattner
f604017e47
Patches to make the LLVM sources more -pedantic clean. Patch provided
...
by Anton Korobeynikov! This is a step towards closing PR786.
llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Evan Cheng
65c3f3f26b
Remove PreprocessCCCArguments and PreprocessFastCCArguments now that
...
FORMAL_ARGUMENTS nodes include a token operand.
llvm-svn: 28439
2006-05-23 21:06:34 +00:00
Chris Lattner
9aec97df10
Implement an annoying part of the Darwin/X86 abi: the callee of a struct
...
return argument pops the hidden struct pointer if present, not the caller.
For example, in this testcase:
struct X { int D, E, F, G; };
struct X bar() {
struct X a;
a.D = 0;
a.E = 1;
a.F = 2;
a.G = 3;
return a;
}
void foo(struct X *P) {
*P = bar();
}
We used to emit:
_foo:
subl $28, %esp
movl 32(%esp), %eax
movl %eax, (%esp)
call _bar
addl $28, %esp
ret
_bar:
movl 4(%esp), %eax
movl $0, (%eax)
movl $1, 4(%eax)
movl $2, 8(%eax)
movl $3, 12(%eax)
ret
This is correct on Linux/X86 but not Darwin/X86. With this patch, we now
emit:
_foo:
subl $28, %esp
movl 32(%esp), %eax
movl %eax, (%esp)
call _bar
*** addl $24, %esp
ret
_bar:
movl 4(%esp), %eax
movl $0, (%eax)
movl $1, 4(%eax)
movl $2, 8(%eax)
movl $3, 12(%eax)
*** ret $4
For the record, GCC emits (which is functionally equivalent to our new code):
_bar:
movl 4(%esp), %eax
movl $3, 12(%eax)
movl $2, 8(%eax)
movl $1, 4(%eax)
movl $0, (%eax)
ret $4
_foo:
pushl %esi
subl $40, %esp
movl 48(%esp), %esi
leal 16(%esp), %eax
movl %eax, (%esp)
call _bar
subl $4, %esp
movl 16(%esp), %eax
movl %eax, (%esi)
movl 20(%esp), %eax
movl %eax, 4(%esi)
movl 24(%esp), %eax
movl %eax, 8(%esi)
movl 28(%esp), %eax
movl %eax, 12(%esi)
addl $40, %esp
popl %esi
ret
This fixes SingleSource/Benchmarks/CoyoteBench/fftbench with LLC and the
JIT, and fixes the X86-backend portion of PR729. The CBE still needs to
be updated.
llvm-svn: 28438
2006-05-23 18:50:38 +00:00
Evan Cheng
d282cb8542
Should pass by reference.
...
llvm-svn: 28357
2006-05-17 19:07:40 +00:00
Evan Cheng
a1f9f34f35
- Clean up formal argument lowering code. Prepare for vector pass by value work.
...
- Fixed vararg support.
llvm-svn: 27985
2006-04-27 01:32:22 +00:00
Evan Cheng
58d4133b60
Switching over FORMAL_ARGUMENTS mechanism to lower call arguments.
...
llvm-svn: 27975
2006-04-26 01:20:17 +00:00
Evan Cheng
09112df9d3
Separate LowerOperation() into multiple functions, one per opcode.
...
llvm-svn: 27972
2006-04-25 20:13:52 +00:00
Evan Cheng
e0289de5ab
Now generating perfect (I think) code for "vector set" with a single non-zero
...
scalar value.
e.g.
_mm_set_epi32(0, a, 0, 0);
==>
movd 4(%esp), %xmm0
pshufd $69, %xmm0, %xmm0
_mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
movzbw 4(%esp), %ax
movzwl %ax, %eax
pxor %xmm0, %xmm0
pinsrw $5, %eax, %xmm0
llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Evan Cheng
41f2933444
- Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1>
...
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffle with zero (or any splat) vector.
llvm-svn: 27875
2006-04-20 08:58:49 +00:00
Evan Cheng
265831aa45
Commute vector_shuffle to match more movlhps, movlp{s|d} cases.
...
llvm-svn: 27840
2006-04-19 20:35:22 +00:00
Evan Cheng
32c4470374
Last few SSE3 intrinsics.
...
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng
da283be867
Added support for _mm_move_ss and _mm_move_sd.
...
llvm-svn: 27575
2006-04-11 00:19:04 +00:00
Evan Cheng
9f27046dc9
- movlp{s|d} and movhp{s|d} support.
...
- Normalize shuffle nodes so result vector lower half elements come from the
first vector, the rest come from the second vector. (Except for the
exceptions :-).
- Other minor fixes.
llvm-svn: 27474
2006-04-06 23:23:56 +00:00
Evan Cheng
6d470008c8
Support for comi / ucomi intrinsics.
...
llvm-svn: 27444
2006-04-05 23:38:46 +00:00
Evan Cheng
056e0af55a
Handle canonical form of e.g.
...
vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7>
This is turned into
vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3>
by dag combiner.
It would match a {p}unpckl on x86.
llvm-svn: 27437
2006-04-05 07:20:06 +00:00
Evan Cheng
4623ebd3d0
Use a X86 target specific node X86ISD::PINSRW instead of a mal-formed
...
INSERT_VECTOR_ELT to insert a 16-bit value in a 128-bit vector.
llvm-svn: 27314
2006-03-31 21:55:24 +00:00
Evan Cheng
7b9a0c6d7a
Add support to use pextrw and pinsrw to extract and insert a word element
...
from a 128-bit vector.
llvm-svn: 27304
2006-03-31 19:22:53 +00:00
Evan Cheng
7bc3bc8246
- Added some SSE2 128-bit packed integer ops.
...
- Added SSE2 128-bit integer pack with signed saturation ops.
- Added pshufhw and pshuflw ops.
llvm-svn: 27252
2006-03-29 23:07:14 +00:00
Evan Cheng
fb4b2bfc7d
* Prefer using operation of matching types. e.g unpcklpd rather than movlhps.
...
* Bug fixes.
llvm-svn: 27218
2006-03-28 06:50:32 +00:00
Evan Cheng
4d554dae17
- Clean up / consoladate various shuffle masks.
...
- Some misc. bug fixes.
- Use MOVHPDrm to load from m64 to upper half of a XMM register.
llvm-svn: 27210
2006-03-28 02:43:26 +00:00
Evan Cheng
d8d7ec47bd
Model unpack lower and interleave as vector_shuffle so we can lower the
...
intrinsics as such.
llvm-svn: 27200
2006-03-28 00:39:58 +00:00
Evan Cheng
1dfbede1d1
Remove X86:isZeroVector, use ISD::isBuildVectorAllZeros instead; some fixes / cleanups
...
llvm-svn: 27150
2006-03-26 09:53:12 +00:00
Evan Cheng
e5807f6b47
Build arbitrary vector with more than 2 distinct scalar elements with a
...
series of unpack and interleave ops.
llvm-svn: 27119
2006-03-25 09:37:23 +00:00
Evan Cheng
bdb85b387f
Support for scalar to vector with zero extension.
...
llvm-svn: 27091
2006-03-24 23:15:12 +00:00
Evan Cheng
d58d54cf3e
Handle BUILD_VECTOR with all zero elements.
...
llvm-svn: 27056
2006-03-24 07:29:27 +00:00
Evan Cheng
3028b04057
More efficient v2f64 shuffle using movlhps, movhlps, unpckhpd, and unpcklpd.
...
llvm-svn: 27040
2006-03-24 02:58:06 +00:00
Evan Cheng
68410804f0
Handle more shuffle cases with SHUFP* instructions.
...
llvm-svn: 27024
2006-03-24 01:18:28 +00:00
Evan Cheng
54215cd1ea
Added a ValueType operand to isShuffleMaskLegal(). For now, x86 will not do
...
64-bit vector shuffle.
llvm-svn: 26964
2006-03-22 22:07:06 +00:00
Evan Cheng
cff38e19c3
- Implement X86ISelLowering::isShuffleMaskLegal(). We currently only support
...
splat and PSHUFD cases.
- Clean up shuffle / splat matching code.
llvm-svn: 26954
2006-03-22 18:59:22 +00:00
Evan Cheng
f6dc0a7f5e
- VECTOR_SHUFFLE of v4i32 / v4f32 with undef second vector always matches
...
PSHUFD. We can make permutes entries which point to the undef pointing
anything we want.
- Change some names to appease Chris.
llvm-svn: 26951
2006-03-22 08:01:21 +00:00