Dan Gohman
bdb94669ba
Fix the spelling of the prefetchnta instruction.
...
llvm-svn: 36256
2007-04-18 14:09:14 +00:00
Bill Wendling
3b1189afbf
Add support for our first SSSE3 instruction "pmulhrsw".
...
llvm-svn: 35869
2007-04-10 22:10:25 +00:00
Evan Cheng
00a5cbf9e7
Mark re-materializable instructions.
...
llvm-svn: 35230
2007-03-21 00:16:56 +00:00
Chris Lattner
6d7701714e
add missing braces
...
llvm-svn: 34905
2007-03-04 06:13:52 +00:00
Evan Cheng
a6399ed8d6
How the heck did I forget patterns for llvm.x86.sse2.cmp.sd?
...
llvm-svn: 34434
2007-02-20 00:39:09 +00:00
Evan Cheng
df277336b8
- FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before
...
or'ing in the sign bit of operand 1.
- Tweaking: rather than left shift the sign bit, fp_extend operand 1 first
before taking its sign bit if its type is smaller than that of operand 0.
llvm-svn: 32932
2007-01-05 21:37:56 +00:00
Evan Cheng
bcf3d2bd15
With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations.
...
llvm-svn: 32900
2007-01-05 07:55:56 +00:00
Evan Cheng
4dc2f8e9bb
- Rename MOVDSS2DIrr to MOVSS2DIrr for consistency sake.
...
- Add MOVDI2SSrm and MOVSS2DImr to fold load / store for i32 <-> f32 bit_convert
patterns.
llvm-svn: 32582
2006-12-14 19:43:11 +00:00
Chris Lattner
6a9de21df5
If we have ScalarSSE, we can select bitconvert into single instructions.
...
This compiles bitcast.ll:test3/test4 into:
_test3:
movd %xmm0, %eax
ret
_test4:
movd %edi, %xmm0
ret
llvm-svn: 32230
2006-12-05 18:45:06 +00:00
Evan Cheng
fc1b3d8bc8
Correct instructions for moving data between GR64 and SSE registers; also correct load i64 / store i64 from v2i64.
...
llvm-svn: 31795
2006-11-16 23:33:25 +00:00
Evan Cheng
ae1f3758bd
Don't dag combine floating point select to max and min intrinsics. Those
...
take v4f32 / v2f64 operands and may end up causing larger spills / restores.
Added X86 specific nodes X86ISD::FMAX, X86ISD::FMIN instead.
This fixes PR996.
llvm-svn: 31645
2006-11-10 21:43:37 +00:00
Evan Cheng
7ca1f47a96
Fixed a bug which causes x86 be to incorrectly match
...
shuffle v, undef, <2, ?, 3, ?>
to movhlps
It should match to unpckhps instead.
Added proper matching code for
shuffle v, undef, <2, 3, 2, 3>
llvm-svn: 31519
2006-11-07 22:14:24 +00:00
Chris Lattner
7c265ad682
remove dead/redundant vars
...
llvm-svn: 31435
2006-11-03 23:48:56 +00:00
Evan Cheng
790d5c7697
Fix ldmxcsr JIT encoding.
...
llvm-svn: 31343
2006-11-01 06:53:52 +00:00
Evan Cheng
090e9abaee
Fixed a significant bug where unpcklpd is incorrectly used to extract element 1 from a v2f64 value.
...
llvm-svn: 31228
2006-10-27 21:08:32 +00:00
Evan Cheng
034305a2e8
X86ISD::PEXTRW 3rd operand type is always target pointer type.
...
llvm-svn: 31185
2006-10-25 21:35:05 +00:00
Evan Cheng
95140c9c64
ComplexPatterns sse_load_f32 and sse_load_f64 returns in / out chain operands.
...
llvm-svn: 30892
2006-10-11 21:06:01 +00:00
Evan Cheng
d1a37cb9dc
Don't go too crazy with these AddComplexity. Try matching shufps with load
...
folding first.
llvm-svn: 30848
2006-10-09 21:42:15 +00:00
Evan Cheng
d22f3dd3ed
Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
...
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner
3cd1d08ac6
completely disable folding of loads into scalar sse instructions and provide
...
a framework for doing it right. This fixes
CodeGen/X86/2006-10-07-ScalarSSEMiscompile.ll.
Once X86DAGToDAGISel::SelectScalarSSELoad is implemented right, this task
will be done.
llvm-svn: 30817
2006-10-07 21:55:32 +00:00
Chris Lattner
a51aea84b8
convert packed FP add/sub/mul/div to use a multiclass.
...
llvm-svn: 30815
2006-10-07 21:17:13 +00:00
Chris Lattner
da75127cea
one multiclass now defines all 8 variants of binary-scalar-sse-fp operations.
...
llvm-svn: 30814
2006-10-07 20:55:57 +00:00
Chris Lattner
8ce6993f53
Switch ADD/MUL/DIV/SUB scalarsse fp ops to a multiclass
...
llvm-svn: 30813
2006-10-07 20:35:44 +00:00
Chris Lattner
ec39f5bcd5
Random acts of shrinkage
...
llvm-svn: 30812
2006-10-07 19:49:05 +00:00
Chris Lattner
8e3aa16298
Convert pand/por/pxor to use multiclass
...
llvm-svn: 30811
2006-10-07 19:37:30 +00:00
Chris Lattner
33aecdebfc
Convert some more instructions over to use a new multiclass.
...
Fix a bug where the asmstring for PSUBQrm was wrong.
llvm-svn: 30810
2006-10-07 19:34:33 +00:00
Chris Lattner
260659336a
Fix a bug where PADDQrm printed paddd instead of paddq.
...
llvm-svn: 30809
2006-10-07 19:15:46 +00:00
Chris Lattner
0122bfac98
Add multiclass for SSE2 instructions that correspond to simple binops.
...
llvm-svn: 30808
2006-10-07 19:14:49 +00:00
Chris Lattner
db12d69657
rename:
...
PDI_binop_rm -> PDI_binop_rm_int
PDI_binop_rmi -> PDI_binop_rmi_int
to make it clear that these are for use with intrinsics.
llvm-svn: 30807
2006-10-07 19:02:31 +00:00
Chris Lattner
36709eed45
Convert saturating PADD/PSUB's to use a multiclass
...
llvm-svn: 30806
2006-10-07 18:48:46 +00:00
Chris Lattner
d5d4378010
Convert PAVG*, PMADDWD, and PMUL* to use multiclasses.
...
llvm-svn: 30805
2006-10-07 18:39:00 +00:00
Chris Lattner
753ec9950a
Fix typo in packsswb instr definition, where the load had the wrong type.
...
This allows us to use the multiclass for other packs.
llvm-svn: 30804
2006-10-07 18:23:58 +00:00
Chris Lattner
59bf33e5e4
handle pmin/pmax with multiclasses
...
llvm-svn: 30800
2006-10-07 07:49:33 +00:00
Chris Lattner
2177d324c5
simplify pack and shift intrinsics with multiclasses
...
llvm-svn: 30797
2006-10-07 07:06:17 +00:00
Chris Lattner
31eb3af1a8
Use a multiclass to simplify 'SSE2 Integer comparison'
...
llvm-svn: 30796
2006-10-07 06:47:08 +00:00
Chris Lattner
7cde5d8820
move class defns close to uses to make it easier to read
...
llvm-svn: 30795
2006-10-07 06:33:36 +00:00
Chris Lattner
2842be4e37
simplify horizontal op definitions
...
llvm-svn: 30794
2006-10-07 06:31:41 +00:00
Chris Lattner
b3b659492b
remove more unneeded type info
...
llvm-svn: 30793
2006-10-07 06:27:03 +00:00
Chris Lattner
8a2d78d3cf
remove unneeded definitions and type info
...
llvm-svn: 30792
2006-10-07 06:19:41 +00:00
Chris Lattner
a75da38d99
remove some unneeded type info
...
llvm-svn: 30791
2006-10-07 06:17:43 +00:00
Chris Lattner
d704b454b9
simplify patterns by merging in operand info
...
llvm-svn: 30790
2006-10-07 05:50:25 +00:00
Chris Lattner
bf6419cef6
Factor operands into packed unary classes
...
llvm-svn: 30789
2006-10-07 05:47:20 +00:00
Chris Lattner
06c9aa41f1
remove dead/duplicate instructions
...
llvm-svn: 30788
2006-10-07 05:41:52 +00:00
Chris Lattner
72b130720d
Pull operand info up into parent class for scalar sse intrinsics.
...
llvm-svn: 30787
2006-10-07 05:26:13 +00:00
Chris Lattner
cf13d058a3
convert the sole sd unary intrinsic to a multiclass for consistency
...
llvm-svn: 30786
2006-10-07 05:19:31 +00:00
Chris Lattner
67ea3292d2
pull operand string into the multiclass
...
llvm-svn: 30785
2006-10-07 05:13:26 +00:00
Chris Lattner
e234302d01
Remove RSQRTSS[rm] RCPSS[rm], which are dead.
...
Introduce SS_IntUnary, a multiclass to replace SS_Int[rm].
llvm-svn: 30784
2006-10-07 05:09:48 +00:00
Chris Lattner
22137d1891
eliminate redundancy
...
llvm-svn: 30783
2006-10-07 04:52:09 +00:00
Evan Cheng
79d9bdd28b
These don't have immediate operands.
...
llvm-svn: 30694
2006-10-03 06:55:11 +00:00
Evan Cheng
cfd7b147cf
X86ISD::CMP now produces a chain as well as a flag. Make that the chain
...
operand of a conditional branch to allow load folding into CMP / TEST
instructions.
llvm-svn: 30241
2006-09-11 02:19:56 +00:00
Evan Cheng
8ffe1aa35a
JIT encoding bug.
...
llvm-svn: 30112
2006-09-05 05:59:25 +00:00
Evan Cheng
692215be9c
Can't commute shufps. The high / low parts elements come from different vectors.
...
llvm-svn: 29275
2006-07-25 20:25:40 +00:00
Evan Cheng
1d48a494a2
X86 target specific DAG combine: turn build_vector (load x), (load x+4),
...
(load x+8), (load x+12), <0, 1, 2, 3> to a single 128-bit load (aligned and
unaligned).
e.g.
__m128 test(float a, float b, float c, float d) {
return _mm_set_ps(d, c, b, a);
}
_test:
movups 4(%esp), %xmm0
ret
llvm-svn: 29042
2006-07-07 08:33:52 +00:00
Evan Cheng
0df13a4f2a
Should just use xorps to clear XMM registers for all data types. pxor is also one byte longer.
...
llvm-svn: 28984
2006-06-29 18:04:54 +00:00
Evan Cheng
803891eaa8
Always use xorps to clear XMM registers.
...
llvm-svn: 28979
2006-06-29 00:34:23 +00:00
Chris Lattner
2e64872117
Remove some ugly now-redundant casts.
...
llvm-svn: 28864
2006-06-20 00:25:29 +00:00
Chris Lattner
fbea064e90
Fix some mismatched type constraints
...
llvm-svn: 28862
2006-06-20 00:12:37 +00:00
Evan Cheng
98d508af83
Minor clean up.
...
llvm-svn: 28860
2006-06-19 19:25:30 +00:00
Evan Cheng
bc79e5f0e4
Type of vector extract / insert index operand should be iPTR.
...
llvm-svn: 28796
2006-06-15 08:14:54 +00:00
Evan Cheng
889544823a
Rename instructions for consistency sake.
...
llvm-svn: 28594
2006-05-31 19:00:07 +00:00
Evan Cheng
abbbe57ba2
Select vector_shuffle v1, undef <2, 3, ?, ?> to MOVHLPS.
...
llvm-svn: 28582
2006-05-31 00:51:37 +00:00
Evan Cheng
c024ad7f32
MAXP{D|S} and MINP{D|S} are commutable.
...
llvm-svn: 28578
2006-05-30 23:47:30 +00:00
Evan Cheng
e2397256c1
Commute shufps / shufpd.
...
llvm-svn: 28577
2006-05-30 23:34:30 +00:00
Evan Cheng
03ca651244
Allow shufps x, x, mask to be converted to pshufd x, mask to save a move.
...
llvm-svn: 28565
2006-05-30 20:26:50 +00:00
Evan Cheng
dc9b5f5fc0
X86 integer register classes naming changes. Make them consistent with FP, vector classes.
...
llvm-svn: 28324
2006-05-16 07:21:53 +00:00
Chris Lattner
a03676690b
Teach the code generator to use cvtss2sd as extload f32 -> f64
...
llvm-svn: 28131
2006-05-05 21:35:18 +00:00
Evan Cheng
ef2fbe7460
Use movsd to shuffle in the lowest two elements of a v4f32 / v4i32 vector when
...
movlps cannot be used (e.g. when load from m64 has multiple uses).
llvm-svn: 28089
2006-05-03 20:32:03 +00:00
Evan Cheng
abc391a5a6
Fix a typo.
...
llvm-svn: 27968
2006-04-25 17:48:41 +00:00
Evan Cheng
7f0e30d1a2
Explicitly specify result type for def : Pat<> patterns (if it produces a vector
...
result). Otherwise tblgen will pick the default (v16i8 for 128-bit vector).
llvm-svn: 27965
2006-04-25 00:50:01 +00:00
Evan Cheng
e521de4e60
Added X86 SSE2 intrinsics which can be represented as vector_shuffles. This is
...
a temporary workaround for the 2-wide vector_shuffle problem (i.e. its mask
would have type v2i32 which is not legal).
llvm-svn: 27964
2006-04-24 23:34:56 +00:00
Evan Cheng
3306427d87
Some missing movlps, movhps, movlpd, and movhpd patterns.
...
llvm-svn: 27960
2006-04-24 21:58:20 +00:00
Evan Cheng
e0289de5ab
Now generating perfect (I think) code for "vector set" with a single non-zero
...
scalar value.
e.g.
_mm_set_epi32(0, a, 0, 0);
==>
movd 4(%esp), %xmm0
pshufd $69, %xmm0, %xmm0
_mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
movzbw 4(%esp), %ax
movzwl %ax, %eax
pxor %xmm0, %xmm0
pinsrw $5, %eax, %xmm0
llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Evan Cheng
7bbfc1d41a
Prefer {p}unpack* and mov*dup over {p}shuf* as well.
...
llvm-svn: 27844
2006-04-19 21:15:24 +00:00
Evan Cheng
a52eb1d7d5
- Renamed AddedCost to AddedComplexity.
...
- Added more movhlps and movlhps patterns.
llvm-svn: 27842
2006-04-19 20:37:34 +00:00
Evan Cheng
56e205e534
More mov{h|l}p{d|s} patterns.
...
llvm-svn: 27836
2006-04-19 18:20:17 +00:00
Evan Cheng
b42424177c
- More mov{h|l}ps patterns.
...
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These
are preferred over shufps in most cases.
llvm-svn: 27835
2006-04-19 18:11:52 +00:00
Evan Cheng
7364ee1c92
- PEXTRW cannot take a memory location as its first source operand.
...
- PINSRWrmi encoding bug.
llvm-svn: 27818
2006-04-18 21:59:43 +00:00
Evan Cheng
f16e4bf29d
Name change for clarity sake
...
llvm-svn: 27816
2006-04-18 21:55:35 +00:00
Evan Cheng
8e87e9b0db
Name change for clarity sake
...
llvm-svn: 27814
2006-04-18 21:29:50 +00:00
Evan Cheng
838f053b09
Left a pattern out
...
llvm-svn: 27813
2006-04-18 21:29:08 +00:00
Evan Cheng
2cd4e2d240
Fixed an encoding bug: movd from XMM to R32.
...
llvm-svn: 27807
2006-04-18 18:19:00 +00:00
Evan Cheng
98b1ca65dd
Use movss to insert_vector_elt(v, s, 0).
...
llvm-svn: 27782
2006-04-17 22:45:49 +00:00
Evan Cheng
833ce43152
Encoding bug
...
llvm-svn: 27773
2006-04-17 21:33:57 +00:00
Evan Cheng
3d26db8148
Errors in patterns preventing load folding
...
llvm-svn: 27762
2006-04-17 18:05:01 +00:00
Evan Cheng
68b2e5b4b0
movduprm, movshduprm bugs
...
llvm-svn: 27734
2006-04-16 18:11:28 +00:00
Evan Cheng
26d917789c
Encoding bugs
...
llvm-svn: 27733
2006-04-16 07:02:22 +00:00
Evan Cheng
9f33b2abc5
More encoding bugs
...
llvm-svn: 27722
2006-04-15 06:10:09 +00:00
Evan Cheng
87e0cd1569
pslldrm, psrawrm, etc. encoding bug
...
llvm-svn: 27721
2006-04-15 05:59:08 +00:00
Evan Cheng
4487cf8125
hsubp{s|d} encoding bug
...
llvm-svn: 27720
2006-04-15 05:52:42 +00:00
Evan Cheng
32e5d4f6bc
Silly bug
...
llvm-svn: 27719
2006-04-15 05:37:34 +00:00
Evan Cheng
c626c9bb00
Some clean up
...
llvm-svn: 27715
2006-04-14 23:32:40 +00:00
Evan Cheng
32c4470374
Last few SSE3 intrinsics.
...
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng
184264997e
Misc. SSE2 intrinsics: clflush, lfench, mfence
...
llvm-svn: 27699
2006-04-14 07:43:12 +00:00
Evan Cheng
360a73046f
pcmpeq* and pcmpgt* intrinsics.
...
llvm-svn: 27685
2006-04-14 01:39:53 +00:00
Evan Cheng
18a1a0e199
psll*, psrl*, and psra* intrinsics.
...
llvm-svn: 27684
2006-04-14 00:14:05 +00:00
Evan Cheng
f7645b0c49
Doh. PANDrm, etc. are not commutable.
...
llvm-svn: 27668
2006-04-13 18:11:28 +00:00
Evan Cheng
2de048bc69
psad, pmax, pmin intrinsics.
...
llvm-svn: 27647
2006-04-13 06:11:45 +00:00
Evan Cheng
93dcea2b5a
Various SSE2 packed integer intrinsics: pmulhuw, pavgw, etc.
...
llvm-svn: 27645
2006-04-13 05:24:54 +00:00
Evan Cheng
2f634fac6d
padds{b|w}, paddus{b|w}, psubs{b|w}, psubus{b|w} intrinsics.
...
llvm-svn: 27639
2006-04-13 00:43:35 +00:00
Evan Cheng
537bdb370c
Naming inconsistency.
...
llvm-svn: 27638
2006-04-13 00:00:23 +00:00