1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

1509 Commits

Author SHA1 Message Date
Chris Lattner
2bd91746e1 pretty print node name
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
44ea12c5f8 Implement an important entry from README_ALTIVEC:
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's.  Instead, just branch on CR6 directly. :)

For example, for:
void foo2(vector float *A, vector float *B) {
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
}

We now generate:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

instead of:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        cmpwi cr0, r3, 0
        beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

This implements CodeGen/PowerPC/vec_br_cmp.ll.

llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
519001b0ee move some stuff around, clean things up
llvm-svn: 27802
2006-04-18 17:52:36 +00:00
Chris Lattner
e90fdf3b98 Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
even/odd halves.  Thanks to Nate telling me what's what.

llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner
5951b60cb4 Implement v16i8 multiply with this code:
vmuloub v5, v3, v2
        vmuleub v2, v3, v2
        vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.

llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner
4d84b56e64 Lower v8i16 multiply into this code:
li r5, lo16(LCPI1_0)
        lis r6, ha16(LCPI1_0)
        lvx v4, r6, r5
        vmulouh v5, v3, v2
        vmuleuh v2, v3, v2
        vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:                                        ;  <16 x ubyte>
        .byte   2
        .byte   3
        .byte   18
        .byte   19
        .byte   6
        .byte   7
        .byte   22
        .byte   23
        .byte   10
        .byte   11
        .byte   26
        .byte   27
        .byte   14
        .byte   15
        .byte   30
        .byte   31

This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.

llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner
613d7fda64 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll

llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner
81938fa3db remove done item
llvm-svn: 27778
2006-04-17 21:52:03 +00:00
Chris Lattner
fdecddb741 Don't diddle VRSAVE if no registers need to be added/removed from it. This
allows us to codegen functions as:

_test_rol:
        vspltisw v2, -12
        vrlw v2, v2, v2
        blr

instead of:

_test_rol:
        mfvrsave r2, 256
        mr r3, r2
        mtvrsave r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtvrsave r2
        blr

Testcase here: CodeGen/PowerPC/vec_vrsave.ll

llvm-svn: 27777
2006-04-17 21:48:13 +00:00
Chris Lattner
021f521a41 Vectors that are known live-in and live-out are clearly already marked in
the vrsave register for the caller.  This allows us to codegen a function as:

_test_rol:
        mfspr r2, 256
        mr r3, r2
        mtspr 256, r3
        vspltisw v2, -12
        vrlw v2, v2, v2
        mtspr 256, r2
        blr

instead of:

_test_rol:
        mfspr r2, 256
        oris r3, r2, 40960
        mtspr 256, r3
        vspltisw v0, -12
        vrlw v2, v0, v0
        mtspr 256, r2
        blr

llvm-svn: 27772
2006-04-17 21:22:06 +00:00
Chris Lattner
a717d4f53b Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
vspltisw v2, -12
        vrlw v2, v2, v2

instead of:

        vspltisw v0, -12
        vrlw v2, v0, v0

when a function is returning a value.

llvm-svn: 27771
2006-04-17 21:19:12 +00:00
Chris Lattner
6b76deffb5 Move some knowledge about registers out of the code emitter into the register info.
llvm-svn: 27770
2006-04-17 21:07:20 +00:00
Chris Lattner
face261a94 Use a small table instead of macros to do this conversion.
llvm-svn: 27769
2006-04-17 20:59:25 +00:00
Chris Lattner
f2347c31b4 Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.

llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner
cc4222d95b Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol

llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
7d66e5a118 add a note
llvm-svn: 27758
2006-04-17 17:29:41 +00:00
Chris Lattner
2d8d6c9feb Make some code more general, adding support for constant formation of several
new patterns.

llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
9dd4ebffca Learn how to make odd splatted constants in range [17,29]. This implements
PowerPC/vec_constants.ll:test_29.

llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
72a67a5b1f Pull some code out into a helper function.
Effeciently codegen even splats in the range [-32,30].

This allows us to codegen <30,30,30,30> as:

        vspltisw v0, 15
        vadduwm v2, v0, v0

instead of as a cp load.

llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner
5367a73dec Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such.  This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll

llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
34ec6432f6 Regenerate with adjusted costs
llvm-svn: 27746
2006-04-17 05:26:20 +00:00
Chris Lattner
36ceea9e96 Regenerate with correct offset
llvm-svn: 27744
2006-04-17 05:08:46 +00:00
Chris Lattner
671f50cf33 Increase the opcodes by one each to disambiguate COPY from VMRGHW.
llvm-svn: 27742
2006-04-17 00:47:48 +00:00
Chris Lattner
99ee809cb6 Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
of various 4-element vectors.

llvm-svn: 27739
2006-04-17 00:37:02 +00:00
Chris Lattner
d86516991a Implement a TODO: have the legalizer canonicalize a bunch of operations to
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.

llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
f4126f0db7 Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
Remove some done items from the todo list.

llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
44245f11c3 Fix a crash when faced with a shuffle vector that has an undef in its mask.
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
2ede0fef98 Add patterns for matching vnots with bit converted inputs. Most of these will
go away when I start using evan's binop type canonicalizer

llvm-svn: 27725
2006-04-15 23:45:24 +00:00
Chris Lattner
5c9d357d7c Allow undef in a shuffle mask
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
cf80e569f6 Move the rest of the PPCTargetLowering::LowerOperation cases out into
separate functions, for simplicity and code clarity.

llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
aacabea404 Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
functions, which makes the code much cleaner :)

llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
569ea9c6dd Force non-darwin targets to use a static relo model. This fixes PR734,
tested by CodeGen/Generic/vector.ll

llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
cec07adf4d add a note, move an altivec todo to the altivec list.
llvm-svn: 27654
2006-04-13 16:48:00 +00:00
Reid Spencer
b08854af39 Add the README files to the distribution.
llvm-svn: 27651
2006-04-13 06:39:24 +00:00
Chris Lattner
e087b8e321 Add a new way to match vector constants, which make it easier to bang bits of
different types.

Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1.  This compiles:

typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
  *P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
  *P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
  *P3 = vec_abs((vector float)*P3);
}

to:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        vspltisw v0, -1
        vslw v0, v0, v0
        lvx v1, 0, r3
        vand v1, v1, v0
        stvx v1, 0, r3
        lvx v1, 0, r4
        vandc v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vandc v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

instead of (with two constant pool entries):

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        li r6, lo16(LCPI1_0)
        lis r7, ha16(LCPI1_0)
        li r8, lo16(LCPI1_1)
        lis r9, ha16(LCPI1_1)
        lvx v0, r7, r6
        lvx v1, 0, r3
        vand v0, v1, v0
        stvx v0, 0, r3
        lvx v0, r9, r8
        lvx v1, 0, r4
        vand v1, v1, v0
        stvx v1, 0, r4
        lvx v1, 0, r5
        vand v0, v1, v0
        stvx v0, 0, r5
        mtspr 256, r2
        blr

GCC produces (with 2 cp entries):

_test:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        lis r9,ha16(LC1)
        la r2,lo16(LC0)(r2)
        lvx v0,0,r3
        lvx v1,0,r5
        la r9,lo16(LC1)(r9)
        lwz r12,-4(r1)
        lvx v12,0,r2
        lvx v13,0,r9
        vand v0,v0,v12
        stvx v0,0,r3
        vspltisw v0,-1
        vslw v12,v0,v0
        vandc v1,v1,v12
        stvx v1,0,r5
        lvx v0,0,r4
        vand v0,v0,v13
        stvx v0,0,r4
        mtspr 256,r12
        blr

llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
ce6e988fa6 Rename get_VSPLI_elt -> get_VSPLTI_elt
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively.  This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI

llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
602d86f7af Ensure that zero vectors are always v4i32, which forces them to CSE with
each other.  This implements CodeGen/PowerPC/vxor-canonicalize.ll

llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Nate Begeman
ccd6ea1913 Fix SingleSource/UnitTests/Vector/sumarray-dbl
llvm-svn: 27594
2006-04-11 19:44:43 +00:00
Nate Begeman
786d44f822 Fix PR727, correctly handling large stack aligments on ppc
llvm-svn: 27593
2006-04-11 19:29:21 +00:00
Chris Lattner
0e63e916b3 we have a shuffle instr, add an example.
llvm-svn: 27592
2006-04-11 18:47:03 +00:00
Jim Laskey
1e0cbe4158 Suppress debug label when not debug.
llvm-svn: 27588
2006-04-11 08:11:53 +00:00
Chris Lattner
e12152a64b Vector function results go into V2 according to GCC. The darwin ABI doc
doesn't say where they go :-/

llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
5d1acb831a Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
No functionality change.

llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3c6e4a1dc9 properly mark vector selects as expanded to select_cc
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
2ffa288a23 Add VRRC select support
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Nate Begeman
6cdc599d05 Disable switch lowering for targets based on the selection dag isel,
letting the code generator handle them directly.

llvm-svn: 27539
2006-04-08 19:46:55 +00:00
Chris Lattner
8234bfe18e Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
constant pool load.

llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
e8defcff7d Change the interface to the predicate that determines if vsplti* can be used.
No functionality changes.

llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Jim Laskey
fabb0ba736 Make sure that debug labels are defined within the same section and after the
entry point of a function.

llvm-svn: 27494
2006-04-07 20:44:42 +00:00
Jim Laskey
b93bc75add Foundation for call frame information.
llvm-svn: 27491
2006-04-07 16:34:46 +00:00
Chris Lattner
db7dfe8c61 Add an item
llvm-svn: 27470
2006-04-06 23:16:19 +00:00
Chris Lattner
a390188fd4 Make sure to return the result in the right type.
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
c0680ae07e Match vpku[hw]um(x,x).
Convert vsldoi(x,x) to work the same way other (x,x) cases work.

llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
a52d88ee89 Add support for matching vmrg(x,x) patterns
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
300076cbd8 Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
6cf87c1b01 remove two done items
llvm-svn: 27453
2006-04-06 19:19:38 +00:00
Chris Lattner
2875bb116e Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
lower it and LLVM to have one fewer intrinsic.  This implements
CodeGen/PowerPC/vec_shuffle.ll

llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
10fa7be550 Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
vperm with a perm mask lvx'd from the constant pool.

llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
7f13e50435 Add all of the data stream intrinsics and instructions. woo
llvm-svn: 27442
2006-04-05 22:27:14 +00:00
Chris Lattner
338945e669 Fix a typo
llvm-svn: 27440
2006-04-05 20:15:25 +00:00
Chris Lattner
d1b47b18ed Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
9e56e97205 Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
ee971bedf2 add vsl
llvm-svn: 27425
2006-04-05 01:16:22 +00:00
Chris Lattner
993209029f add vmladduhm
llvm-svn: 27423
2006-04-05 00:49:48 +00:00
Chris Lattner
66c3b75644 Add m[tf]vscr instructions.
llvm-svn: 27421
2006-04-05 00:03:57 +00:00
Chris Lattner
10394b1c42 add a note
llvm-svn: 27419
2006-04-04 23:45:11 +00:00
Chris Lattner
e7a52b473f Add missing byte merges.
llvm-svn: 27418
2006-04-04 23:43:56 +00:00
Chris Lattner
ab137b431f Add FP -> Int Conversions
llvm-svn: 27417
2006-04-04 23:25:02 +00:00
Chris Lattner
6cf881590f add average intrinsics
llvm-svn: 27416
2006-04-04 23:14:00 +00:00
Chris Lattner
59c4add58a add a note
llvm-svn: 27414
2006-04-04 22:43:55 +00:00
Chris Lattner
d1483ca1ad Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
4e99e6dfdd Ask legalize to promote all vector shuffles to be v16i8 instead of having to
handle all 4 PPC vector types.   This simplifies the matching code and allows
us to eliminate a bunch of patterns.  This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.

llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
2bf9c8cc18 Plug in the byte and short splats
llvm-svn: 27387
2006-04-04 00:05:13 +00:00
Chris Lattner
0128e4d335 Revert accidentally committed hunks.
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
57b9e01b3e Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
eb9684f6a4 Force use of a frame-pointer if there is anything on the stack that is aligned
more than the OS keeps the stack aligned.

llvm-svn: 27381
2006-04-03 22:03:29 +00:00
Chris Lattner
c65511b05c Add the full set of min/max instructions
llvm-svn: 27372
2006-04-03 15:58:28 +00:00
Chris Lattner
fa82c33ae7 add a note
llvm-svn: 27360
2006-04-02 07:20:00 +00:00
Chris Lattner
8ba4723c74 Inform the dag combiner that the predicate compares only return a low bit.
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
8967316b8c Remove done item
llvm-svn: 27351
2006-04-02 05:28:54 +00:00
Chris Lattner
9c24ec6de5 add a note
llvm-svn: 27348
2006-04-02 03:59:11 +00:00
Chris Lattner
da4217646a Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
"vspltisb v0, 8" instead of a constant pool load.

llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
38318b2706 Implement vnot using VNOR instead of using 'vspltisb v0, -1' and vxor
llvm-svn: 27331
2006-04-01 22:41:47 +00:00
Chris Lattner
32bb17a5f3 Shrinkify some more intrinsic definitions.
llvm-svn: 27322
2006-03-31 22:41:56 +00:00
Chris Lattner
12e9ce7104 Pull operand asm string into base class, shrinkifying intrinsic definitions.
No functionality change.

llvm-svn: 27320
2006-03-31 22:34:05 +00:00
Chris Lattner
3d6e5f8a05 Fix 80 column violations :)
llvm-svn: 27315
2006-03-31 21:57:36 +00:00
Chris Lattner
d66dd2a4ee fix a pasto
llvm-svn: 27308
2006-03-31 21:19:06 +00:00
Chris Lattner
28219f34bc Add vperm support for all datatypes
llvm-svn: 27307
2006-03-31 20:00:35 +00:00
Chris Lattner
336d6646ab Rearrange code a bit
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
786f782398 Add, sub and shuffle are legal for all vector types
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
d27ced882b add a note
llvm-svn: 27302
2006-03-31 19:00:22 +00:00
Chris Lattner
e3774da014 note to self: *save* file, then check it in
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
95d358dbdb Implement an item from the readme, folding vcmp/vcmp. instructions with
identical instructions into a single instruction.  For example, for:

void test(vector float *x, vector float *y, int *P) {
  int v = vec_any_out(*x, *y);
  *x = (vector float)vec_cmpb(*x, *y);
  *P = v;
}

we now generate:

_test:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v0, v1, v0
        mfcr r4, 2
        stvx v0, 0, r3
        rlwinm r3, r4, 27, 31, 31
        xori r3, r3, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

instead of:

_test:
        mfspr r2, 256
        oris r6, r2, 57344
        mtspr 256, r6
        lvx v0, 0, r4
        lvx v1, 0, r3
        vcmpbfp. v2, v1, v0
        mfcr r4, 2
***     vcmpbfp v0, v1, v0
        rlwinm r4, r4, 27, 31, 31
        stvx v0, 0, r3
        xori r3, r4, 1
        stw r3, 0(r5)
        mtspr 256, r2
        blr

Testcase here: CodeGen/PowerPC/vcmp-fold.ll

llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
560f734320 compactify some more instruction definitions
llvm-svn: 27288
2006-03-31 05:38:32 +00:00
Chris Lattner
2c3d6bdb55 Compactify comparisons.
llvm-svn: 27287
2006-03-31 05:32:57 +00:00
Chris Lattner
e330741a6c Lower vector compares to VCMP nodes, just like we lower vector comparison
predicates to VCMPo nodes.

llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
a7a7c035b3 These are done
llvm-svn: 27284
2006-03-31 04:53:21 +00:00
Chris Lattner
a31d719e0a Mark INSERT_VECTOR_ELT as expand
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Chris Lattner
87d3a2e045 Add the rest of the vmul instructions and the vmulsum* instructions.
llvm-svn: 27268
2006-03-30 23:39:06 +00:00
Chris Lattner
22b7e551f1 Use a new tblgen feature to significantly shrinkify instruction definitions that
directly correspond to intrinsics.

llvm-svn: 27266
2006-03-30 23:21:27 +00:00
Chris Lattner
6aca6013d2 Add a bunch of new instructions for intrinsics.
llvm-svn: 27265
2006-03-30 23:07:36 +00:00
Chris Lattner
1a773f8f18 add a note
llvm-svn: 27243
2006-03-29 00:24:13 +00:00
Chris Lattner
93559450b8 add a note
llvm-svn: 27227
2006-03-28 18:56:23 +00:00
Jim Laskey
eb38a3e83a Expose base register for DwarfWriter. Refactor code accordingly.
llvm-svn: 27225
2006-03-28 13:48:33 +00:00
Nate Begeman
d432d66cc8 Fix a couple typos
llvm-svn: 27216
2006-03-28 04:18:18 +00:00
Nate Begeman
5a82c8ccbd Add a few more altivec intrinsics
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner
a570305421 implement a bunch more intrinsics.
llvm-svn: 27209
2006-03-28 02:29:37 +00:00
Chris Lattner
ac98e20cc9 Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
same thing and we have a dag node for the former.

llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
d5da541d42 Tblgen doesn't like multiple SDNode<> definitions that map to the sameenum value. Split them into separate enums.
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Jim Laskey
8688957c53 Translate llvm target registers to dwarf register numbers properly.
llvm-svn: 27180
2006-03-27 20:18:45 +00:00
Chris Lattner
dab8425129 Add a bunch of notes from my journey thus far.
llvm-svn: 27170
2006-03-27 07:41:00 +00:00
Chris Lattner
f1d6a9483f Split out altivec notes into their own README
llvm-svn: 27168
2006-03-27 07:04:16 +00:00
Chris Lattner
4b0fc38fe7 Fix the JIT encoding of VSEL
llvm-svn: 27160
2006-03-27 03:34:17 +00:00
Chris Lattner
b5efa3e0f5 Fix the JIT encoding of VSPLTI*
llvm-svn: 27159
2006-03-27 03:28:57 +00:00
Nate Begeman
3d518334b9 SelectionDAGISel can now natively handle Switch instructions, in the same
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks.  The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.

This functionality is currently only enabled on x86, but should be safe for
every target.  In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.

llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner
03ad35fd49 add vsel
llvm-svn: 27153
2006-03-26 22:38:43 +00:00
Chris Lattner
65a455b060 Codegen vector predicate compares.
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b17bbf8ccb Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
f0c36b99e6 Add all of the altivec comparison instructions. Add patterns for the
non-predicate altivec compare intrinsics.

llvm-svn: 27143
2006-03-26 04:57:17 +00:00
Chris Lattner
4e0a78ea30 Add and 8/16-bit adds, add all integer subtracts, add saturating subtract
intrinsics.

llvm-svn: 27142
2006-03-26 02:39:02 +00:00
Chris Lattner
d33ef7a1bc implement the vsldoi intrinsic.
llvm-svn: 27139
2006-03-26 00:41:48 +00:00
Chris Lattner
7d557e00f3 fix the pattern for vandc, it's NOT vnand
llvm-svn: 27136
2006-03-25 23:10:40 +00:00
Chris Lattner
88a0c65463 add patterns for VANDC/VNOR, implementing
CodeGen/PowerPC/eqv-andc-orc-nor.ll:VNOR/VANDC

llvm-svn: 27135
2006-03-25 23:05:29 +00:00
Chris Lattner
f80b39f9b1 Add some logical operations
llvm-svn: 27127
2006-03-25 22:16:05 +00:00
Chris Lattner
d2823658b4 implement a bunch of intrinsics
llvm-svn: 27118
2006-03-25 08:01:02 +00:00
Chris Lattner
cb5f9269a9 Move all Altivec stuff out into a new PPCInstrAltivec.td file.
Add a bunch of patterns for different datatypes, e.g. bit_convert, undef and
zero vector support.

llvm-svn: 27117
2006-03-25 07:51:43 +00:00
Chris Lattner
57064915a6 Add some basic patterns for other datatypes
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner
7f5fba9c67 add all supported formats to the vector register file
llvm-svn: 27115
2006-03-25 07:36:56 +00:00
Chris Lattner
2fa3a6c436 Add support for __builtin_altivec_vnmsubfp /vmaddfp
llvm-svn: 27112
2006-03-25 07:05:55 +00:00
Chris Lattner
e199d55073 #include Intrinsics.h into all dag isels
llvm-svn: 27109
2006-03-25 06:47:10 +00:00
Chris Lattner
0899b16b2d Codegen things like:
<int -1, int -1, int -1, int -1>
and
 <int 65537, int 65537, int 65537, int 65537>

Using things like:
  vspltisb v0, -1
and:
  vspltish v0, 1

instead of using constant pool loads.

This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.

llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Jim Laskey
1716e53341 Add dwarf register numbering to register data.
llvm-svn: 27081
2006-03-24 21:15:58 +00:00
Chris Lattner
8840036091 add another note
llvm-svn: 27077
2006-03-24 20:04:27 +00:00
Chris Lattner
21abff3712 Fix a bad JIT encoding of VPERM. Why is VPERM D,A,B,C but vfmadd is D,A,C,B ??
llvm-svn: 27069
2006-03-24 18:24:43 +00:00
Chris Lattner
3133dafd4b Like the comment says, prefer to use the implicit add done by [r+r] addressing
modes than emitting an explicit add and using a base of r0.  This implements
Regression/CodeGen/PowerPC/mem-rr-addr-mode.ll

llvm-svn: 27068
2006-03-24 17:58:06 +00:00
Chris Lattner
303dc30593 Disable the i32->float G5 optimization. It is unsafe, as documented in the
comment.

This fixes 177.mesa, and McCat/09-vor with the td scheduler.

llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner
ba4966c16c add support for using vxor to build zero vectors. This implements
Regression/CodeGen/PowerPC/vec_zero.ll

llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
ace2d0d227 Gabor points out that we can't spell. :)
llvm-svn: 27049
2006-03-24 07:12:19 +00:00
Chris Lattner
2e5162fa6e add a note
llvm-svn: 27000
2006-03-23 21:28:44 +00:00
Chris Lattner
974982c89c Add PPC vector bit-convert support
llvm-svn: 26995
2006-03-23 19:54:27 +00:00
Jim Laskey
cec9c18c62 Add support to locate local variables in frames (early version.)
llvm-svn: 26994
2006-03-23 18:12:57 +00:00
Jim Laskey
f3cc740d75 Change interface to DwarfWriter.
llvm-svn: 26991
2006-03-23 18:09:44 +00:00
Chris Lattner
89e0790edb Eliminate IntrinsicLowering from TargetMachine.
Make the CBE and V9 backends create their own, since they're the only ones that use it.

llvm-svn: 26974
2006-03-23 05:43:16 +00:00
Chris Lattner
5141ebb2c4 This has been implemented. Tweak it into another note
llvm-svn: 26944
2006-03-22 05:33:23 +00:00
Chris Lattner
f84f3bf95b When possible, custom lower 32-bit SINT_TO_FP to this:
_foo2:
        extsw r2, r3
        std r2, -8(r1)
        lfd f0, -8(r1)
        fcfid f0, f0
        frsp f1, f0
        blr

instead of this:

_foo2:
        lis r2, ha16(LCPI2_0)
        lis r4, 17200
        xoris r3, r3, 32768
        stw r3, -4(r1)
        stw r4, -8(r1)
        lfs f0, lo16(LCPI2_0)(r2)
        lfd f1, -8(r1)
        fsub f0, f1, f0
        frsp f1, f0
        blr

This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).

llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
cfbce5186a Add support for "ri" addressing modes where the immediate is a 14-bit field
which is shifted left two bits before use.  Instructions like STD use this
addressing mode.

llvm-svn: 26942
2006-03-22 05:26:03 +00:00
Chris Lattner
2e606dc60f Fix the JIT encoding of the VAForm_1 instructions, including vmaddfp
llvm-svn: 26935
2006-03-22 01:44:36 +00:00
Chris Lattner
31a93c7740 These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner
140be98ab8 Don't emit pseudo instructions!
llvm-svn: 26926
2006-03-21 20:19:37 +00:00
Nate Begeman
4bd73df4bd Update readme
llvm-svn: 26924
2006-03-21 18:58:20 +00:00
Chris Lattner
414fed4108 Print absolute memory references like this:
lwz r2, 8(0)
instead of this:
       lwz r2, 8(r0)

This fixes the llc/llc-beta failures on PPC last night.

llvm-svn: 26922
2006-03-21 17:21:13 +00:00
Chris Lattner
6417236c41 With Evan's latest tblgen patch, this code is obsolete, thanks Evan!
llvm-svn: 26917
2006-03-21 06:37:40 +00:00
Chris Lattner
acb2506622 When codegen'ing vector MUL using VFMADD, *add* the 0, don't *mul* the 0.
llvm-svn: 26913
2006-03-21 00:51:38 +00:00
Chris Lattner
9e611a25c7 minor note
llvm-svn: 26912
2006-03-21 00:47:09 +00:00
Chris Lattner
cdc4657988 Handle constant addresses more efficiently, folding the low bits into the
disp field of the load/store if possible.  This compiles
CodeGen/PowerPC/load-constant-addr.ll to:

_test:
        lis r2, 2838
        lfs f1, 26848(r2)
        blr

instead of:

_test:
        lis r2, 2838
        ori r2, r2, 26848
        lfs f1, 0(r2)
        blr

llvm-svn: 26908
2006-03-20 22:38:22 +00:00
Chris Lattner
a498dd25d9 remove dead variable
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner
978628896b Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
figuring these out! :)

llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
5c994b8c63 reenable this hack, the tblgen version isn't quite ready
llvm-svn: 26902
2006-03-20 17:54:43 +00:00
Chris Lattner
fb0e160aa5 Fix the pattern for VADDUWM, add i32 splat
llvm-svn: 26901
2006-03-20 17:51:58 +00:00
Evan Cheng
57da1afbc8 Use tblgen'd VECTOR_SHUFFLE selection code.
llvm-svn: 26900
2006-03-20 08:14:16 +00:00
Chris Lattner
dc3605efdb Add support for generating vspltw, instead of a vperm instruction with a
constant pool load.  This generates significantly nicer code for splats.

When tblgen gets bugfixed, we can remove the custom selection code.

llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
8faa2cf693 Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner
4b7aa59bbc fix duplicate definition errors
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner
1cdeda1c5a Check in some intermediate code that adds a skeleton for matching vsplt*
instructions

llvm-svn: 26894
2006-03-20 06:15:45 +00:00
Chris Lattner
c230af9810 fix typo
llvm-svn: 26889
2006-03-20 05:05:55 +00:00
Chris Lattner
bea056ecf2 add vsplat instructions, fix sched description for vperm
llvm-svn: 26888
2006-03-20 04:47:33 +00:00
Chris Lattner
0e56cf0d94 Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*

llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner
65e6e12dca Claim to have v16i8 for perm masks
llvm-svn: 26886
2006-03-20 01:53:02 +00:00
Chris Lattner
6f502da274 add the vperm instruction
llvm-svn: 26883
2006-03-20 01:00:56 +00:00
Chris Lattner
ada41aad4d Add a note about the MUL -> FMADD vector bug.
llvm-svn: 26874
2006-03-19 22:08:08 +00:00
Chris Lattner
789570bafb Custom lower SCALAR_TO_VECTOR into lve*x.
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner
80f9f7138a PPC doesn't have SCALAR_TO_VECTOR
llvm-svn: 26865
2006-03-19 06:17:19 +00:00
Chris Lattner
89bc332152 add support for vector undef
llvm-svn: 26863
2006-03-19 06:10:09 +00:00
Chris Lattner
a9b4a2ab99 minor fixes
llvm-svn: 26857
2006-03-19 05:43:01 +00:00
Chris Lattner
d3910ca755 notes
llvm-svn: 26856
2006-03-19 05:33:30 +00:00
Chris Lattner
b46a4c28ad we don't use lmw/stmw. When we want them they are easy enough to add
llvm-svn: 26853
2006-03-19 04:33:37 +00:00
Chris Lattner
1bd0aaf2b8 rename these nodes
llvm-svn: 26848
2006-03-19 01:13:28 +00:00
Nate Begeman
793c8136ae Fix subfic to match subc by default instead of sub so that it is correctly
cost-modeled as producing a flag.  This fixes the test I just added for neg

llvm-svn: 26835
2006-03-17 22:41:37 +00:00
Nate Begeman
42736d46b2 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
87dbd49cbe remove dead variable
llvm-svn: 26813
2006-03-16 23:52:08 +00:00
Nate Begeman
63c4456867 Notes on how to kill the eeevil brtwoway, and make ppc branch selector
more target independant, generate better code, and be less conservative.

llvm-svn: 26809
2006-03-16 22:37:48 +00:00
Chris Lattner
f2008cb73b Strangely, calls clobber call-clobbered vector regs. Whodathoughtit?
llvm-svn: 26808
2006-03-16 22:35:59 +00:00
Chris Lattner
8a756c5171 add a note
llvm-svn: 26807
2006-03-16 22:25:55 +00:00
Chris Lattner
57773fdac1 teach the ppc backend how to spill/reload vector regs
llvm-svn: 26806
2006-03-16 22:24:02 +00:00
Chris Lattner
661ee5d3c1 add callee saved vector regs
llvm-svn: 26805
2006-03-16 22:07:06 +00:00
Evan Cheng
cad75d9f0c Added a way for TargetLowering to specify what values can be used as the
scale component of the target addressing mode.

llvm-svn: 26802
2006-03-16 21:47:42 +00:00
Chris Lattner
7f5361757b in functions that use a lot of callee saved regs, this can be more than
5 instructions away.

llvm-svn: 26801
2006-03-16 21:31:45 +00:00
Chris Lattner
bf153651b1 Add support for copying registers. still needed: spilling and reloading them
llvm-svn: 26800
2006-03-16 20:03:58 +00:00
Nate Begeman
cbca1b3d14 Another case we could do better on.
llvm-svn: 26795
2006-03-16 18:50:44 +00:00
Chris Lattner
b5d0896994 Save/restore VRSAVE once per function, not once per block.
llvm-svn: 26793
2006-03-16 18:25:23 +00:00
Nate Begeman
e371cb595a Update scheduling info for vrsave instruction
llvm-svn: 26776
2006-03-15 05:25:05 +00:00
Chris Lattner
392087f5bd Fix an off by one error that caused PPC LLC failures last night.
llvm-svn: 26758
2006-03-14 17:56:49 +00:00
Evan Cheng
ae7469b2c5 PPC LSR pass should use target lowering hooks.
llvm-svn: 26743
2006-03-13 23:56:51 +00:00
Evan Cheng
7ec94f2ff7 Added getTargetLowering() to TargetMachine. Refactored targets to support this.
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Chris Lattner
d0505331d2 For functions that use vector registers, save VRSAVE, mark used
registers, and update it on entry to each function, then restore it on exit.

This compiles:

void func(vfloat *a, vfloat *b, vfloat *c) {
        *a = *b * *c + *c;
}

to this:

_func:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r5
        lvx v1, 0, r4
        vmaddfp v0, v1, v0, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

GCC produces this (which has additional stack accesses):

_func:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc000
        mtspr 256,r0
        lvx v0,0,r5
        lvx v1,0,r4
        lwz r12,-4(r1)
        vmaddfp v0,v0,v1,v0
        stvx v0,0,r3
        mtspr 256,r12
        blr

llvm-svn: 26733
2006-03-13 21:52:10 +00:00
Chris Lattner
3aff8e6acf Fix a couple of bugs that broke the alpha tester build
llvm-svn: 26722
2006-03-13 05:23:59 +00:00
Chris Lattner
9898674f99 Handle cracked instructions in dispatch group formation.
llvm-svn: 26721
2006-03-13 05:20:04 +00:00
Chris Lattner
ba10d4e4ab Mark instructions that are cracked by the PPC970 decoder as such.
llvm-svn: 26720
2006-03-13 05:15:10 +00:00
Chris Lattner
a278639f29 Several big changes:
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
   type instead of a table in the .cpp file.  Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
   algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
   accurately.

llvm-svn: 26719
2006-03-12 09:13:49 +00:00
Chris Lattner
19b93158c1 blr is a branch too
llvm-svn: 26710
2006-03-11 21:49:49 +00:00