Chris Lattner
4e99e6dfdd
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
136a27d0d0
* Add supprot for SCALAR_TO_VECTOR operations where the input needs to be
...
promoted/expanded (e.g. SCALAR_TO_VECTOR from i8/i16 on PPC).
* Add support for targets to request that VECTOR_SHUFFLE nodes be promoted
to a canonical type, for example, we only want v16i8 shuffles on PPC.
* Move isShuffleLegal out of TLI into Legalize.
* Teach isShuffleLegal to allow shuffles that need to be promoted.
llvm-svn: 27399
2006-04-04 17:23:26 +00:00
Chris Lattner
020ff34600
Signed shr by a constant is not the same as sdiv by 2^k
...
llvm-svn: 27395
2006-04-04 06:11:42 +00:00
Evan Cheng
f07104b717
cmpps / cmppd encoding bug
...
llvm-svn: 27393
2006-04-04 03:04:07 +00:00
Chris Lattner
85dc06c29e
Constant fold bitconvert(undef)
...
llvm-svn: 27391
2006-04-04 01:02:22 +00:00
Evan Cheng
2be8582ddb
Compact some intrinsic definitions.
...
llvm-svn: 27388
2006-04-04 00:10:53 +00:00
Chris Lattner
2bf9c8cc18
Plug in the byte and short splats
...
llvm-svn: 27387
2006-04-04 00:05:13 +00:00
Chris Lattner
0128e4d335
Revert accidentally committed hunks.
...
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
57b9e01b3e
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
...
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Evan Cheng
bbe72d932c
Some SSE1 intrinsics: min, max, sqrt, etc.
...
llvm-svn: 27384
2006-04-03 23:49:17 +00:00
Chris Lattner
ab96554280
revert previous patch
...
llvm-svn: 27383
2006-04-03 23:14:49 +00:00
Evan Cheng
7ff32cd571
Use movlpd to: store lower f64 extracted from v2f64.
...
Use movhpd to: store upper f64 extracted from v2f64.
llvm-svn: 27382
2006-04-03 22:30:54 +00:00
Chris Lattner
eb9684f6a4
Force use of a frame-pointer if there is anything on the stack that is aligned
...
more than the OS keeps the stack aligned.
llvm-svn: 27381
2006-04-03 22:03:29 +00:00
Chris Lattner
4601b86ad0
The stack alignment is now computed dynamically, just verify it is correct.
...
llvm-svn: 27380
2006-04-03 21:39:57 +00:00
Chris Lattner
264e4a438a
Remove unused method
...
llvm-svn: 27379
2006-04-03 21:39:03 +00:00
Evan Cheng
169240beb7
- More efficient extract_vector_elt with shuffle and movss, movsd, movd, etc.
...
- Some bug fixes and naming inconsistency fixes.
llvm-svn: 27377
2006-04-03 20:53:28 +00:00
Chris Lattner
758878175b
Align vectors to the size in bytes, not bits.
...
llvm-svn: 27376
2006-04-03 19:28:50 +00:00
Chris Lattner
d13dd8ef5c
Add a missing check, this fixes UnitTests/Vector/sumarray.c
...
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner
d9902c3de0
Add a missing check, which broke a bunch of vector tests.
...
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Chris Lattner
c65511b05c
Add the full set of min/max instructions
...
llvm-svn: 27372
2006-04-03 15:58:28 +00:00
Andrew Lenharth
4760eaae91
support x * (c1 + c2) where c1 and c2 are pow2s. special case for c2 == 4
...
llvm-svn: 27370
2006-04-03 04:19:17 +00:00
Andrew Lenharth
af4a638eab
mul by const conversion sequences. more coming soon
...
llvm-svn: 27368
2006-04-03 03:18:59 +00:00
Andrew Lenharth
b133e47444
back this out
...
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth
91c6f28ad6
This should be a win of every arch
...
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Andrew Lenharth
4c950de4c1
This makes McCat/12-IOtest go 8x faster or so
...
llvm-svn: 27363
2006-04-02 21:08:39 +00:00
Andrew Lenharth
067eaad4d9
This will be needed soon
...
llvm-svn: 27362
2006-04-02 20:13:57 +00:00
Chris Lattner
fa82c33ae7
add a note
...
llvm-svn: 27360
2006-04-02 07:20:00 +00:00
Chris Lattner
8ba4723c74
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
0b4f7786d2
relax assertion
...
llvm-svn: 27358
2006-04-02 06:19:46 +00:00
Chris Lattner
6433904644
Allow targets to compute masked bits for intrinsics.
...
llvm-svn: 27357
2006-04-02 06:15:09 +00:00
Chris Lattner
dbdc830c83
Add a little dag combine to compile this:
...
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
%tmp1 = load <4 x float>* %in ; <<4 x float>> [#uses=1]
%tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 ) ; <int> [#uses=1]
%tmp = seteq int %tmp, 0 ; <bool> [#uses=1]
%tmp3 = cast bool %tmp to int ; <int> [#uses=1]
ret int %tmp3
}
into this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
mtspr 256, r2
blr
instead of this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
xori r3, r3, 1
cntlzw r3, r3
srwi r3, r3, 5
mtspr 256, r2
blr
llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner
42a1e621f1
vector casts of casts are eliminable. Transform this:
...
%tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1]
%tmp = cast <4 x int> %tmp to <4 x float> ; <<4 x float>> [#uses=1]
into:
%tmp = cast <4 x uint> %tmp to <4 x float> ; <<4 x float>> [#uses=1]
llvm-svn: 27355
2006-04-02 05:43:13 +00:00
Chris Lattner
d9ca6b5bfc
vector casts never reinterpret bits
...
llvm-svn: 27354
2006-04-02 05:40:28 +00:00
Chris Lattner
3c994295fe
Allow transforming this:
...
%tmp = cast <4 x uint>* %testData to <4 x int>* ; <<4 x int>*> [#uses=1]
%tmp = load <4 x int>* %tmp ; <<4 x int>> [#uses=1]
to this:
%tmp = load <4 x uint>* %testData ; <<4 x uint>> [#uses=1]
%tmp = cast <4 x uint> %tmp to <4 x int> ; <<4 x int>> [#uses=1]
llvm-svn: 27353
2006-04-02 05:37:12 +00:00
Chris Lattner
cb26b2dfe8
Turn altivec lvx/stvx intrinsics into loads and stores. This allows the
...
elimination of one load from this:
int AreSecondAndThirdElementsBothNegative( vector float *in ) {
#define QNaN 0x7FC00000
const vector unsigned int testData = (vector unsigned int)( QNaN, 0, 0, QNaN );
vector float test = vec_ld( 0, (float*) &testData );
return ! vec_any_ge( test, *in );
}
Now generating:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
addi r6, r1, -16
lvx v0, r5, r4
stvx v0, 0, r6
lvx v1, 0, r3
vcmpgefp. v0, v0, v1
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
xori r3, r3, 1
cntlzw r3, r3
srwi r3, r3, 5
mtspr 256, r2
blr
llvm-svn: 27352
2006-04-02 05:30:25 +00:00
Chris Lattner
8967316b8c
Remove done item
...
llvm-svn: 27351
2006-04-02 05:28:54 +00:00
Chris Lattner
c76f9e8691
Implement promotion for EXTRACT_VECTOR_ELT, allowing v16i8 multiplies to work with PowerPC.
...
llvm-svn: 27349
2006-04-02 05:06:04 +00:00
Chris Lattner
9c24ec6de5
add a note
...
llvm-svn: 27348
2006-04-02 03:59:11 +00:00
Chris Lattner
f15063eadf
Implement the Expand action for binary vector operations to break the binop
...
into elements and operate on each piece. This allows generic vector integer
multiplies to work on PPC, though the generated code is horrible.
llvm-svn: 27347
2006-04-02 03:57:31 +00:00
Chris Lattner
389e309bfb
Intrinsics that just load from memory can be treated like loads: they don't
...
have to serialize against each other. This allows us to schedule lvx's
across each other, for example.
llvm-svn: 27346
2006-04-02 03:41:14 +00:00
Chris Lattner
e314cf19ba
Adjust to change in Intrinsics.gen interface.
...
llvm-svn: 27344
2006-04-02 03:35:01 +00:00
Chris Lattner
104db817c8
Constant fold all of the vector binops. This allows us to compile this:
...
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
aka:
void %test2(<16 x sbyte>* %P) {
store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
ret void
}
into this:
_test2:
mfspr r2, 256
oris r4, r2, 32768
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
lvx v0, r5, r4
stvx v0, 0, r3
mtspr 256, r2
blr
instead of this:
_test2:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
vspltisb v0, 8
lvx v1, r5, r4
vxor v0, v1, v0
stvx v0, 0, r3
mtspr 256, r2
blr
... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html
llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner
badebf1c9b
Add a new -view-legalize-dags command line option
...
llvm-svn: 27342
2006-04-02 03:07:27 +00:00
Chris Lattner
52732a272f
Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
...
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner
8a66373ad7
These entries already exist
...
llvm-svn: 27340
2006-04-02 02:51:27 +00:00
Chris Lattner
5eefd8e0b3
Add some missing node names
...
llvm-svn: 27339
2006-04-02 02:41:18 +00:00
Chris Lattner
2a24d68439
New note
...
llvm-svn: 27337
2006-04-02 01:47:20 +00:00
Chris Lattner
3aa0246b4a
Constant fold casts from things like <4 x int> -> <4 x uint>, likewise int<->fp.
...
llvm-svn: 27336
2006-04-02 01:38:28 +00:00
Chris Lattner
da4217646a
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
...
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
13e8d5973c
Prefer larger register classes over smaller ones when a register occurs in
...
multiple register classes. This fixes PowerPC/2006-04-01-FloatDoubleExtend.ll
llvm-svn: 27334
2006-04-02 00:24:45 +00:00