1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

15390 Commits

Author SHA1 Message Date
Chris Lattner
4be8276d27 This has apparently been fixed
llvm-svn: 30864
2006-10-11 01:44:46 +00:00
Rafael Espindola
46e7aceb1d uint <-> double conversion
llvm-svn: 30862
2006-10-10 20:38:57 +00:00
Evan Cheng
0d8a340a8f Also update getNodeLabel for LoadSDNode.
llvm-svn: 30861
2006-10-10 20:11:26 +00:00
Evan Cheng
a12747d2b4 SDNode::dump should also print out extension type and VT.
llvm-svn: 30860
2006-10-10 20:05:10 +00:00
Rafael Espindola
0112351e9a add fp sub
llvm-svn: 30859
2006-10-10 19:35:01 +00:00
Rafael Espindola
27d68a3c22 add double <-> int conversion
llvm-svn: 30858
2006-10-10 18:55:14 +00:00
Chris Lattner
e0734f522f Fix another bug in extload promotion.
llvm-svn: 30857
2006-10-10 18:54:19 +00:00
Rafael Espindola
413aa20bc8 compare doubles
llvm-svn: 30856
2006-10-10 16:33:47 +00:00
Rafael Espindola
b0719f1374 initial support for fp compares. Unordered compares not implemented yet
llvm-svn: 30854
2006-10-10 12:56:00 +00:00
Evan Cheng
070ae65fa8 Fix a bug introduced by my LOAD/LOADX changes.
llvm-svn: 30853
2006-10-10 07:51:21 +00:00
Evan Cheng
b2998e15f2 More isel time load folding checking for nodes that produce flag values.
See comment in CanBeFoldedBy() for detailed explanation.

llvm-svn: 30851
2006-10-10 01:46:56 +00:00
Evan Cheng
d1a37cb9dc Don't go too crazy with these AddComplexity. Try matching shufps with load
folding first.

llvm-svn: 30848
2006-10-09 21:42:15 +00:00
Evan Cheng
8f6c6b19e6 Don't convert to MOVLP if using shufps etc. may allow load folding.
llvm-svn: 30847
2006-10-09 21:39:25 +00:00
Evan Cheng
d22f3dd3ed Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Rafael Espindola
bae07b25d6 add float -> double and double -> float conversion
llvm-svn: 30835
2006-10-09 17:50:29 +00:00
Reid Spencer
2e9ba7166f Fix PR886:
The result of yyparse() was not being checked. When YYERROR or YYABORT is
called it causes yyparse() to return 1 to indicate the error. The code was
silently ignoring this situation because it previously expected either an
exception or a null ParserResult to indicate an error. The patch corrects
this situation.

llvm-svn: 30834
2006-10-09 17:36:59 +00:00
Chris Lattner
0c057ef048 Fix a bug pointed out by Zhongxing Xu
llvm-svn: 30831
2006-10-09 17:28:13 +00:00
Rafael Espindola
f917f096e2 add ADDS and ADCS
llvm-svn: 30830
2006-10-09 17:18:28 +00:00
Rafael Espindola
319b5e9c95 expand ISD::SELECT
llvm-svn: 30829
2006-10-09 16:28:33 +00:00
Rafael Espindola
fed11f040c add a note
llvm-svn: 30828
2006-10-09 14:18:33 +00:00
Rafael Espindola
aaeadcb6f5 expand ISD::EXTLOAD
llvm-svn: 30827
2006-10-09 14:13:40 +00:00
Rafael Espindola
1e16a7e972 most ARM targets are little endian
llvm-svn: 30826
2006-10-09 14:12:15 +00:00
Chris Lattner
9f980ec2a1 Implement SROA of unions with mixed pointers/integers in them. This implements
PR892 and Transforms/ScalarRepl/union-pointer.ll:test2

llvm-svn: 30825
2006-10-08 23:53:04 +00:00
Chris Lattner
f8afa75cef Implement Transforms/ScalarRepl/union-pointer.ll:test
llvm-svn: 30823
2006-10-08 23:28:04 +00:00
Chris Lattner
b0e0a23959 Eliminate more token factors by taking advantage of transitivity:
if TF depends on A and B, and A depends on B, TF just needs to depend on
A.  With Jim's alias-analysis stuff enabled, this compiles the testcase in
PR892 into:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %edx, 28(%esp)
        movl %eax, 32(%esp)
        movl %eax, 24(%esp)
        movl %edx, 36(%esp)
        movl 52(%esp), %ecx
        movl %ecx, 4(%esp)
        movl %eax, 8(%esp)
        movl %edx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

instead of:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %eax, 24(%esp)
        movl %edx, 28(%esp)
        movl 24(%esp), %eax
        movl %eax, 32(%esp)
        movl 28(%esp), %eax
        movl %eax, 36(%esp)
        movl 32(%esp), %eax
        movl 36(%esp), %ecx
        movl 52(%esp), %edx
        movl %edx, 4(%esp)
        movl %eax, 8(%esp)
        movl %ecx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

llvm-svn: 30821
2006-10-08 22:57:01 +00:00
Jim Laskey
9260b2f86e Combiner alias analysis passes Multisource (release-asserts.)
llvm-svn: 30818
2006-10-07 23:37:56 +00:00
Chris Lattner
3cd1d08ac6 completely disable folding of loads into scalar sse instructions and provide
a framework for doing it right.  This fixes
CodeGen/X86/2006-10-07-ScalarSSEMiscompile.ll.

Once X86DAGToDAGISel::SelectScalarSSELoad is implemented right, this task
will be done.

llvm-svn: 30817
2006-10-07 21:55:32 +00:00
Chris Lattner
a51aea84b8 convert packed FP add/sub/mul/div to use a multiclass.
llvm-svn: 30815
2006-10-07 21:17:13 +00:00
Chris Lattner
da75127cea one multiclass now defines all 8 variants of binary-scalar-sse-fp operations.
llvm-svn: 30814
2006-10-07 20:55:57 +00:00
Chris Lattner
8ce6993f53 Switch ADD/MUL/DIV/SUB scalarsse fp ops to a multiclass
llvm-svn: 30813
2006-10-07 20:35:44 +00:00
Chris Lattner
ec39f5bcd5 Random acts of shrinkage
llvm-svn: 30812
2006-10-07 19:49:05 +00:00
Chris Lattner
8e3aa16298 Convert pand/por/pxor to use multiclass
llvm-svn: 30811
2006-10-07 19:37:30 +00:00
Chris Lattner
33aecdebfc Convert some more instructions over to use a new multiclass.
Fix a bug where the asmstring for PSUBQrm was wrong.

llvm-svn: 30810
2006-10-07 19:34:33 +00:00
Chris Lattner
260659336a Fix a bug where PADDQrm printed paddd instead of paddq.
llvm-svn: 30809
2006-10-07 19:15:46 +00:00
Chris Lattner
0122bfac98 Add multiclass for SSE2 instructions that correspond to simple binops.
llvm-svn: 30808
2006-10-07 19:14:49 +00:00
Chris Lattner
db12d69657 rename:
PDI_binop_rm -> PDI_binop_rm_int
  PDI_binop_rmi -> PDI_binop_rmi_int

to make it clear that these are for use with intrinsics.

llvm-svn: 30807
2006-10-07 19:02:31 +00:00
Chris Lattner
36709eed45 Convert saturating PADD/PSUB's to use a multiclass
llvm-svn: 30806
2006-10-07 18:48:46 +00:00
Chris Lattner
d5d4378010 Convert PAVG*, PMADDWD, and PMUL* to use multiclasses.
llvm-svn: 30805
2006-10-07 18:39:00 +00:00
Chris Lattner
753ec9950a Fix typo in packsswb instr definition, where the load had the wrong type.
This allows us to use the multiclass for other packs.

llvm-svn: 30804
2006-10-07 18:23:58 +00:00
Rafael Espindola
38e9e2e01d implement FUITOS and FUITOD
llvm-svn: 30803
2006-10-07 14:24:52 +00:00
Rafael Espindola
90a24709fb implement FLDD
llvm-svn: 30802
2006-10-07 14:03:39 +00:00
Rafael Espindola
b8ce0f8bbd implement fadds, faddd, fmuls and fmuld
llvm-svn: 30801
2006-10-07 13:46:42 +00:00
Chris Lattner
59bf33e5e4 handle pmin/pmax with multiclasses
llvm-svn: 30800
2006-10-07 07:49:33 +00:00
Chris Lattner
2177d324c5 simplify pack and shift intrinsics with multiclasses
llvm-svn: 30797
2006-10-07 07:06:17 +00:00
Chris Lattner
31eb3af1a8 Use a multiclass to simplify 'SSE2 Integer comparison'
llvm-svn: 30796
2006-10-07 06:47:08 +00:00
Chris Lattner
7cde5d8820 move class defns close to uses to make it easier to read
llvm-svn: 30795
2006-10-07 06:33:36 +00:00
Chris Lattner
2842be4e37 simplify horizontal op definitions
llvm-svn: 30794
2006-10-07 06:31:41 +00:00
Chris Lattner
b3b659492b remove more unneeded type info
llvm-svn: 30793
2006-10-07 06:27:03 +00:00
Chris Lattner
8a2d78d3cf remove unneeded definitions and type info
llvm-svn: 30792
2006-10-07 06:19:41 +00:00
Chris Lattner
a75da38d99 remove some unneeded type info
llvm-svn: 30791
2006-10-07 06:17:43 +00:00
Chris Lattner
d704b454b9 simplify patterns by merging in operand info
llvm-svn: 30790
2006-10-07 05:50:25 +00:00
Chris Lattner
bf6419cef6 Factor operands into packed unary classes
llvm-svn: 30789
2006-10-07 05:47:20 +00:00
Chris Lattner
06c9aa41f1 remove dead/duplicate instructions
llvm-svn: 30788
2006-10-07 05:41:52 +00:00
Chris Lattner
72b130720d Pull operand info up into parent class for scalar sse intrinsics.
llvm-svn: 30787
2006-10-07 05:26:13 +00:00
Chris Lattner
cf13d058a3 convert the sole sd unary intrinsic to a multiclass for consistency
llvm-svn: 30786
2006-10-07 05:19:31 +00:00
Chris Lattner
67ea3292d2 pull operand string into the multiclass
llvm-svn: 30785
2006-10-07 05:13:26 +00:00
Chris Lattner
e234302d01 Remove RSQRTSS[rm] RCPSS[rm], which are dead.
Introduce SS_IntUnary, a multiclass to replace SS_Int[rm].

llvm-svn: 30784
2006-10-07 05:09:48 +00:00
Chris Lattner
22137d1891 eliminate redundancy
llvm-svn: 30783
2006-10-07 04:52:09 +00:00
Chris Lattner
f5758df6cd Fix a bug legalizing zero-extending i64 loads into 32-bit loads. The bottom
part was always forced to be sextload, even when we needed an zextload.

llvm-svn: 30782
2006-10-07 00:58:36 +00:00
Chris Lattner
f5b9b4a4b2 Set the jt section
llvm-svn: 30781
2006-10-06 22:52:33 +00:00
Chris Lattner
3f92c791b4 initialize ivar
llvm-svn: 30780
2006-10-06 22:52:08 +00:00
Chris Lattner
d5f5a433b2 If a target uses a GOT, put it in the jt data section, not the text
section.  This will fix alpha when Andrew implements
AlphaTargetMachine::getTargetLowering().

llvm-svn: 30779
2006-10-06 22:50:56 +00:00
Chris Lattner
2ca01febcf Alpha uses a got
llvm-svn: 30778
2006-10-06 22:46:51 +00:00
Chris Lattner
b5b96302f2 jump tables handle pic
llvm-svn: 30776
2006-10-06 22:32:29 +00:00
Chris Lattner
ad60994822 print labels even if a MBB doesn't have a corresponding LLVM BB, just don't
print the LLVM BB label.

llvm-svn: 30775
2006-10-06 21:28:17 +00:00
Rafael Espindola
a96c205e12 add optional input flag to FMRRD
llvm-svn: 30774
2006-10-06 20:33:26 +00:00
Rafael Espindola
54301ca490 add support for calling functions that return double
llvm-svn: 30771
2006-10-06 19:10:05 +00:00
Evan Cheng
6d15f83d46 80 col violation.
llvm-svn: 30770
2006-10-06 18:57:51 +00:00
Chris Lattner
399106d8f8 ugly codegen
llvm-svn: 30769
2006-10-06 17:39:34 +00:00
Chris Lattner
0d39b3a4cf Fix a miscompilation of:
long long foo(long long X) {
  return (long long)(signed char)(int)X;
}

Instead of:

_foo:
        extsb r2, r4
        srawi r3, r4, 31
        mr r4, r2
        blr

we now produce:

_foo:
        extsb r4, r4
        srawi r3, r4, 31
        blr

This fixes a miscompilation in ConstantFolding.cpp.

llvm-svn: 30768
2006-10-06 17:34:12 +00:00
Rafael Espindola
d870b158b3 fix some bugs affecting functions with no arguments
llvm-svn: 30767
2006-10-06 17:26:30 +00:00
Rafael Espindola
f35563ff66 fix the stack alignment
llvm-svn: 30766
2006-10-06 14:29:47 +00:00
Rafael Espindola
f679bdf121 add support for calling functions that have double arguments
llvm-svn: 30765
2006-10-06 12:50:22 +00:00
Evan Cheng
9ce3d493f0 Still need to support -mcpu=<> or cross compilation will fail. Doh.
llvm-svn: 30764
2006-10-06 09:17:41 +00:00
Evan Cheng
6fc0ae2136 Do away with CPU feature list. Just use CPUID to detect MMX, SSE, SSE2, SSE3, and 64-bit support.
llvm-svn: 30763
2006-10-06 08:21:07 +00:00
Evan Cheng
35a3337e1d It appears the inline asm in GetCpuIDAndInfo() may clobbers some registers if it isn't inlined (at < -O3). Force it to be inlined.
llvm-svn: 30762
2006-10-06 07:50:56 +00:00
Chris Lattner
5fc3bb074c MachineBasicBlock::splice was incorrectly updating parent pointers on
instructions.

llvm-svn: 30760
2006-10-06 01:12:44 +00:00
Evan Cheng
275825195a Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng
c9e079d0c1 Add getStore() helper function to create ISD::STORE nodes.
llvm-svn: 30758
2006-10-05 22:57:11 +00:00
Chris Lattner
eca9897bd5 Don't crash if an MBB doesn't have an LLVM BB
llvm-svn: 30757
2006-10-05 21:40:14 +00:00
Rafael Espindola
2e4743b6d1 use a const ref for passing the vector to ArgumentLayout
llvm-svn: 30756
2006-10-05 17:46:48 +00:00
Rafael Espindola
f0e4950ef4 implement a ArgumentLayout class to factor code common to LowerFORMAL_ARGUMENTS and LowerCALL
implement FMDRR
add support for f64 function arguments

llvm-svn: 30754
2006-10-05 16:48:49 +00:00
Jim Laskey
3f9f064fd1 Alias analysis code clean ups.
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Chris Lattner
513ba43053 add a new SimplifyDemandedVectorElts method, which works similarly to
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret


This is just a first step, it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
2006-10-05 06:55:50 +00:00
Chris Lattner
14ad447136 Add insertelement/extractelement helper ctors.
llvm-svn: 30750
2006-10-05 06:24:58 +00:00
Chris Lattner
7f98896c02 Lower some min/max idioms to minss/maxss when unsafe fp math is enabled.
llvm-svn: 30748
2006-10-05 04:11:26 +00:00
Chris Lattner
08da1a510d Don't bother setting JumpTableTextSection, it is about to disappear
llvm-svn: 30745
2006-10-05 03:13:59 +00:00
Chris Lattner
4f41b86e7f Emit pic jumptables to the same section that the function is emitted to,
allowing label differences to work.  This fixes CodeGen/X86/pic_jumptable.ll

llvm-svn: 30744
2006-10-05 03:13:28 +00:00
Chris Lattner
068190eb91 Pass the MachineFunction into EmitJumpTableInfo.
llvm-svn: 30742
2006-10-05 03:01:21 +00:00
Chris Lattner
75e572ab20 implement and use getSectionForFunction
llvm-svn: 30741
2006-10-05 02:51:36 +00:00
Chris Lattner
cc21d20348 Use getSectionForFunction.
llvm-svn: 30740
2006-10-05 02:49:23 +00:00
Chris Lattner
d62ecab2e3 Use getSectionForFunction
llvm-svn: 30739
2006-10-05 02:48:40 +00:00
Chris Lattner
a293b73042 use getSectionForFunction to decide which section to emit code into
llvm-svn: 30738
2006-10-05 02:47:13 +00:00
Chris Lattner
758352e9b1 Implement getSectionForFunction, use it when printing function body.
llvm-svn: 30737
2006-10-05 02:43:52 +00:00
Chris Lattner
b92a46c4f6 move getSectionForFunction to AsmPrinter
llvm-svn: 30736
2006-10-05 02:42:47 +00:00
Chris Lattner
ca844c6695 Move getSectionForFunction to AsmPrinter, change it to return a string.
llvm-svn: 30735
2006-10-05 02:42:20 +00:00
Chris Lattner
0ca8a69c28 implement DarwinTargetAsmInfo::getSectionForFunction, use it when outputting
function bodies

llvm-svn: 30733
2006-10-05 00:35:50 +00:00
Chris Lattner
6ba6d0e937 Give TargetAsmInfo a virtual dtor, add a new getSectionForFunction method.
llvm-svn: 30732
2006-10-05 00:35:16 +00:00
Chris Lattner
2e10b0c095 emit jump table before debug info
llvm-svn: 30731
2006-10-05 00:26:05 +00:00
Chris Lattner
dd6343bd8d Always emit the jump table after the function so it's part of the same 'atom'
as the function body.

llvm-svn: 30730
2006-10-05 00:24:46 +00:00