Chris Lattner
4978a4f2f4
New note, Nate, please check to see if I'm full of it :)
...
llvm-svn: 28118
2006-05-05 05:36:15 +00:00
Jeff Cohen
953baf0494
Fix VC++ compilation error.
...
llvm-svn: 28117
2006-05-05 01:47:05 +00:00
Chris Lattner
56694d0bb0
Sink noop copies into the basic block that uses them. This reduces the number
...
of cross-block live ranges, and allows the bb-at-a-time selector to always
coallesce these away, at isel time.
This reduces the load on the coallescer and register allocator. For example
on a codec on X86, we went from:
1643 asm-printer - Number of machine instrs printed
419 liveintervals - Number of loads/stores folded into instructions
1144 liveintervals - Number of identity moves eliminated after coalescing
1022 liveintervals - Number of interval joins performed
282 liveintervals - Number of intervals after coalescing
1304 liveintervals - Number of original intervals
86 regalloc - Number of times we had to backtrack
1.90232 regalloc - Ratio of intervals processed over total intervals
40 spiller - Number of values reused
182 spiller - Number of loads added
121 spiller - Number of stores added
132 spiller - Number of register spills
6 twoaddressinstruction - Number of instructions commuted to coalesce
360 twoaddressinstruction - Number of two-address instructions
to:
1636 asm-printer - Number of machine instrs printed
403 liveintervals - Number of loads/stores folded into instructions
1155 liveintervals - Number of identity moves eliminated after coalescing
1033 liveintervals - Number of interval joins performed
279 liveintervals - Number of intervals after coalescing
1312 liveintervals - Number of original intervals
76 regalloc - Number of times we had to backtrack
1.88998 regalloc - Ratio of intervals processed over total intervals
1 spiller - Number of copies elided
41 spiller - Number of values reused
191 spiller - Number of loads added
114 spiller - Number of stores added
128 spiller - Number of register spills
4 twoaddressinstruction - Number of instructions commuted to coalesce
356 twoaddressinstruction - Number of two-address instructions
On this testcase, this change provides a modest reduction in spill code,
regalloc iterations, and total instructions emitted. It increases the number
of register coallesces.
llvm-svn: 28115
2006-05-05 01:04:50 +00:00
Chris Lattner
bf8f91715e
Adjust to use proper TargetData copy ctor
...
llvm-svn: 28112
2006-05-04 21:18:40 +00:00
Chris Lattner
db888e7880
Final pass of minor cleanups for MachineInstr
...
llvm-svn: 28110
2006-05-04 19:36:09 +00:00
Evan Cheng
ceb7645d40
Initial support for register pressure aware scheduling. The register reduction
...
scheduler can go into a "vertical mode" (i.e. traversing up the two-address
chain, etc.) when the register pressure is low.
This does seem to reduce the number of spills in the cases I've looked at. But
with x86, it's no guarantee the performance of the code improves.
It can be turned on with -sched-vertically option.
llvm-svn: 28108
2006-05-04 19:16:39 +00:00
Chris Lattner
32adc4592f
Remove redundancy and a level of indirection when creating machine operands
...
llvm-svn: 28107
2006-05-04 19:14:44 +00:00
Chris Lattner
075404adaa
Remove and simplify some more machineinstr/machineoperand stuff.
...
llvm-svn: 28105
2006-05-04 18:16:01 +00:00
Chris Lattner
eb41c99161
Rename MO_VirtualRegister -> MO_Register. Clean up immediate handling.
...
llvm-svn: 28104
2006-05-04 18:05:43 +00:00
Chris Lattner
685568510a
Move some methods out of MachineInstr into MachineOperand
...
llvm-svn: 28102
2006-05-04 17:52:23 +00:00
Chris Lattner
55938f67ae
Fix Transforms/InstCombine/2006-05-04-DemandedBitCrash.ll
...
llvm-svn: 28101
2006-05-04 17:33:35 +00:00
Chris Lattner
97f1af2f14
There shalt be only one "immediate" operand type!
...
llvm-svn: 28099
2006-05-04 17:21:20 +00:00
Chris Lattner
51b0cac238
Change "value" in MachineOperand to be a GlobalValue, as that is the only
...
thing that can be in it. Remove a dead method.
llvm-svn: 28098
2006-05-04 17:02:51 +00:00
Chris Lattner
20affbd29a
Revert Nate's CR patch from last night, which caused many regressions (e.g. fhourstones).
...
Loading and storing off R0 isn't what we wanted. Also, taking some CR's out of
CRRC seems to cause failures as well. Further investigation is required.
llvm-svn: 28097
2006-05-04 16:56:45 +00:00
Jeff Cohen
097cc5d00b
Make external globals public; other minor cleanup.
...
llvm-svn: 28096
2006-05-04 16:20:22 +00:00
Jeff Cohen
a954c15ea1
Make Intel syntax the default when LLVM is built with VC++.
...
llvm-svn: 28095
2006-05-04 16:19:27 +00:00
Chris Lattner
a39a7f900f
Remove a bunch more dead V9 specific stuff
...
llvm-svn: 28094
2006-05-04 01:26:39 +00:00
Chris Lattner
c779fca289
Remove a bunch more SparcV9 specific stuff
...
llvm-svn: 28093
2006-05-04 01:15:02 +00:00
Chris Lattner
ed58ec2a57
Remove some more V9-specific stuff.
...
llvm-svn: 28092
2006-05-04 00:49:59 +00:00
Chris Lattner
0f89e6b11d
Remove some more unused stuff from MachineInstr that was leftover from V9.
...
llvm-svn: 28091
2006-05-04 00:44:25 +00:00
Chris Lattner
b14a767a3e
Simplify handling of relocations
...
llvm-svn: 28090
2006-05-04 00:42:08 +00:00
Evan Cheng
ef2fbe7460
Use movsd to shuffle in the lowest two elements of a v4f32 / v4i32 vector when
...
movlps cannot be used (e.g. when load from m64 has multiple uses).
llvm-svn: 28089
2006-05-03 20:32:03 +00:00
Chris Lattner
f89e1162ad
Change from using MachineRelocation ctors to using static methods
...
in MachineRelocation to create Relocations.
llvm-svn: 28088
2006-05-03 20:30:20 +00:00
Chris Lattner
326eedfa77
minor cleanups, no functionality change
...
llvm-svn: 28087
2006-05-03 18:55:56 +00:00
Chris Lattner
87fa1cef04
inline a simple method
...
llvm-svn: 28083
2006-05-03 17:21:32 +00:00
Chris Lattner
d36b66d6dc
Suck block address tracking out of targets into the JIT Emitter. This
...
simplifies the MachineCodeEmitter interface just a little bit and makes
BasicBlocks work like constant pools and jump tables.
llvm-svn: 28082
2006-05-03 17:10:41 +00:00
Chris Lattner
b12bd9d7a7
Fix a bug in Owen's checkin that broke the CBE on all non sparc v9 platforms.
...
llvm-svn: 28081
2006-05-03 05:48:41 +00:00
Nate Begeman
a4ea552058
Teach the x86 jit how to handle jump tables not directly used by a jump
...
instruction.
llvm-svn: 28080
2006-05-03 04:52:47 +00:00
Nate Begeman
95a4f191e0
Finish up the initial jump table implementation by allowing jump tables to
...
not be 100% dense. Increase the minimum threshold for the number of cases
in a switch statement from 4 to 6 in order to create a jump table.
llvm-svn: 28079
2006-05-03 03:48:02 +00:00
Evan Cheng
2922daab4b
Bottom up register pressure reduction work: clean up some hacks and enhanced
...
the heuristic to further reduce spills for several test cases. (Note, it may
not necessarily translate to runtime win!)
llvm-svn: 28076
2006-05-03 02:10:45 +00:00
Owen Anderson
71bc529dfa
Refactor TargetMachine, pushing handling of TargetData into the target-specific subclasses. This has one caller-visible change: getTargetData() now returns a pointer instead of a reference.
...
This fixes PR 759.
llvm-svn: 28074
2006-05-03 01:29:57 +00:00
Chris Lattner
be9958e6f7
Align function bodies correctly.
...
llvm-svn: 28073
2006-05-03 01:03:20 +00:00
Chris Lattner
c43d309514
Simplify some code. Don't add memory blocks to the Blocks list twice.
...
llvm-svn: 28071
2006-05-03 00:54:49 +00:00
Chris Lattner
95f75d0506
Add assertions that verify that the actual arguments to a call or invoke match
...
the prototype of the called function.
llvm-svn: 28070
2006-05-03 00:48:22 +00:00
Chris Lattner
06ccac43d7
Change the BasicBlockAddrs map to be a vector, indexed by MBB number.
...
llvm-svn: 28069
2006-05-03 00:32:55 +00:00
Chris Lattner
28ac95615b
Keep the alpha JIT similar to the PPC/X86 jits
...
llvm-svn: 28068
2006-05-03 00:31:21 +00:00
Chris Lattner
be23568cea
Simplify some code
...
llvm-svn: 28066
2006-05-03 00:13:06 +00:00
Chris Lattner
2bf37af52d
Several related changes:
...
1. Change several methods in the MachineCodeEmitter class to be pure virtual.
2. Suck emitConstantPool/initJumpTableInfo into startFunction, removing them
from the MachineCodeEmitter interface, and reducing the amount of target-
specific code.
3. Change the JITEmitter so that it allocates constantpools and jump tables
*right* next to the functions that they belong to, instead of in a separate
pool of memory. This makes all memory for a function be contiguous, and
means the JITEmitter only tracks one block of memory now.
llvm-svn: 28065
2006-05-02 23:22:24 +00:00
Nate Begeman
d9438bedaa
Remove some stuff from the README
...
llvm-svn: 28063
2006-05-02 22:43:31 +00:00
Chris Lattner
8054cd3830
Do not make the JIT memory manager manage the memory for globals. Instead
...
just have the JIT malloc them.
llvm-svn: 28062
2006-05-02 21:57:51 +00:00
Chris Lattner
41cc593fc3
Minor cleanups, no functionality change.
...
llvm-svn: 28061
2006-05-02 21:44:14 +00:00
Chris Lattner
d100478886
Fix a purely hypothetical problem (for now): emitWord emits in the host
...
byte format. This doesn't work when using the code emitter in a cross target
environment. Since the code emitter is only really used by the JIT, this
isn't a current problem, but if we ever start emitting .o files, it would be.
llvm-svn: 28060
2006-05-02 19:14:47 +00:00
Chris Lattner
055baf5c7b
Refactor the machine code emitter interface to pull the pointers for the current
...
code emission location into the base class, instead of being in the derived classes.
This change means that low-level methods like emitByte/emitWord now are no longer
virtual (yaay for speed), and we now have a framework to support growable code
segments. This implements feature request #1 of PR469.
llvm-svn: 28059
2006-05-02 18:27:26 +00:00
Nate Begeman
d7b4d2a743
Since we don't handle callee-save CRs right yet, don't allocate them. Also
...
don't step on R11 in the middle of a function when saving and restoring CRs
llvm-svn: 28058
2006-05-02 17:37:31 +00:00
Nate Begeman
2a81b6be22
Print function number instead of name
...
llvm-svn: 28057
2006-05-02 17:36:46 +00:00
Nate Begeman
fa83cee567
Hooray, everyone now uses the same printBasicBlockLabel implementation
...
llvm-svn: 28056
2006-05-02 17:34:51 +00:00
Chris Lattner
5c1abc2b94
Remove dead method
...
llvm-svn: 28055
2006-05-02 17:20:28 +00:00
Chris Lattner
c11eac5284
There is no reason to use a virtual method to store this word.
...
llvm-svn: 28053
2006-05-02 17:16:20 +00:00
Chris Lattner
bb8c8b5d9d
Remove the debug machine code emitter. The "FilePrinterEmitter" is more
...
useful for debugging.
llvm-svn: 28051
2006-05-02 16:59:24 +00:00
Nate Begeman
05174045df
Extend printBasicBlockLabel a bit so that it can be used to print all
...
basic block labels, consolidating the code to do so in one place for each
target.
llvm-svn: 28050
2006-05-02 05:37:32 +00:00
Nate Begeman
82a6c0c66c
Update the PPC compilation callback code to not need weird abi-violating
...
prologs and epilogs, keep all the asm in one place, and remove use of
compiler builtin functions.
llvm-svn: 28049
2006-05-02 04:50:05 +00:00
Chris Lattner
1275941193
Add pass ID's for various passes, so they can be AddRequiredID. Patch by
...
Domagoj Babic!
llvm-svn: 28048
2006-05-02 04:24:36 +00:00
Jeff Cohen
a35a8a5f9c
De-virtualize SwitchSection.
...
llvm-svn: 28047
2006-05-02 03:58:45 +00:00
Jeff Cohen
b257253098
De-virtualize EmitZeroes.
...
llvm-svn: 28046
2006-05-02 03:46:13 +00:00
Jeff Cohen
5c2e201a63
Finish support for Microsoft ML/MASM. May still be a few rough edges.
...
llvm-svn: 28045
2006-05-02 03:11:50 +00:00
Jeff Cohen
ec0f5808a1
Make Intel syntax mode friendlier to Microsoft ML assembler (still needs more work).
...
llvm-svn: 28044
2006-05-02 01:16:28 +00:00
Chris Lattner
c2914c8ac5
Fix a latent bug that my spiller patch last week exposed: we were leaving
...
instructions in the virtregfolded map that were deleted. Because they
were deleted, newly allocated instructions could end up at the same address,
magically finding themselves in the map. The solution is to remove entries
from the map when we delete the instructions.
llvm-svn: 28041
2006-05-01 22:03:24 +00:00
Chris Lattner
a9f3c7c50a
When promoting a load to a reg-reg copy, where the load was a previous
...
instruction folded with spill code, make sure the remove the load from
the virt reg folded map.
llvm-svn: 28040
2006-05-01 21:17:10 +00:00
Chris Lattner
befcd1e76d
Remove previous patch, which wasn't quite right.
...
llvm-svn: 28039
2006-05-01 21:16:03 +00:00
Chris Lattner
8456272509
Put PHI/INLINEASM into the correct namespace.
...
llvm-svn: 28037
2006-05-01 17:00:49 +00:00
Evan Cheng
44e851f6ec
Dis-favor stores more
...
llvm-svn: 28035
2006-05-01 09:20:44 +00:00
Evan Cheng
494e47d755
Bottom up register-pressure reduction scheduler now pushes store operations
...
up the schedule. This helps code that looks like this:
loads ...
computations (first set) ...
stores (first set) ...
loads
computations (seccond set) ...
stores (seccond set) ...
Without this change, the stores and computations are more likely to
interleave:
loads ...
loads ...
computations (first set) ...
computations (second set) ...
computations (first set) ...
stores (first set) ...
computations (second set) ...
stores (stores set) ...
This can increase the number of spills if we are unlucky.
llvm-svn: 28033
2006-05-01 09:14:40 +00:00
Evan Cheng
94264c7c90
Didn't mean ScheduleDAGList.cpp to make the last checkin.
...
llvm-svn: 28030
2006-05-01 08:56:34 +00:00
Evan Cheng
ef706b77c4
Remove temp. option -spiller-check-liveout, it didn't cause any failure nor performance regressions.
...
llvm-svn: 28029
2006-05-01 08:54:57 +00:00
Chris Lattner
fe8f858ec0
Remove %'s from register names when in intel mode.
...
llvm-svn: 28027
2006-05-01 05:53:50 +00:00
Chris Lattner
bce271e829
Format #APP lines a bit nicer
...
llvm-svn: 28026
2006-05-01 04:11:03 +00:00
Evan Cheng
02e72f8f55
Local spiller kills a store if the folded restore is turned into a copy.
...
But this is incorrect if the spilled value live range extends beyond the
current BB.
It is currently controlled by a temporary option -spiller-check-liveout.
llvm-svn: 28024
2006-04-30 08:41:47 +00:00
Jeff Cohen
1b3f7b8b48
Mingw32 patches supplied by Anton Korobeynikov.
...
llvm-svn: 28023
2006-04-29 18:41:44 +00:00
Chris Lattner
6ce6942d21
Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
...
with some changes I have to the new CFE.
llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Evan Cheng
a7ee4891c5
I can't spell: Register, not Regsiter.
...
llvm-svn: 28021
2006-04-28 23:19:39 +00:00
Evan Cheng
516164744a
Implemented x86 inline asm b, h, w, k modifiers.
...
llvm-svn: 28020
2006-04-28 23:11:40 +00:00
Chris Lattner
4c79d3b238
Fix InstCombine/2006-04-28-ShiftShiftLongLong.ll
...
llvm-svn: 28019
2006-04-28 22:21:41 +00:00
Chris Lattner
e3de67fae2
Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
...
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Evan Cheng
a33feb51db
Initial caller side support (for CCC only, not FastCC) of 128-bit vector
...
passing by value.
llvm-svn: 28015
2006-04-28 21:29:37 +00:00
Evan Cheng
a8b295feb2
Bare-bone X86 inline asm printer support.
...
llvm-svn: 28014
2006-04-28 21:19:05 +00:00
Evan Cheng
9f21c2daf4
Remove the temporary option: -no-isel-fold-inflight
...
llvm-svn: 28012
2006-04-28 18:54:11 +00:00
Evan Cheng
d577ce4c4a
Implement four-wide shuffle with 2 shufps if no more than two elements come
...
from each vector. e.g.
shuffle(G1, G2, 7, 1, 5, 2)
==>
movaps _G2, %xmm0
shufps $151, _G1, %xmm0
shufps $216, %xmm0, %xmm0
llvm-svn: 28011
2006-04-28 07:03:38 +00:00
Chris Lattner
a297ad27db
Fix PR743: emit -help output of a tool to cout, not cerr.
...
llvm-svn: 28010
2006-04-28 05:36:25 +00:00
Evan Cheng
f843942504
TargetLowering::LowerArguments should return a VBIT_CONVERT of
...
FORMAL_ARGUMENTS SDOperand in the return result vector.
llvm-svn: 28009
2006-04-28 05:25:15 +00:00
Chris Lattner
4e783a3101
Mapping of physregs can make it so that the designated and input physregs are
...
the same. In this case, don't emit a noop copy.
llvm-svn: 28008
2006-04-28 04:43:18 +00:00
Chris Lattner
2118062b9c
Fix Transforms/Reassociate/2006-04-27-ReassociateVector.ll
...
llvm-svn: 28007
2006-04-28 04:14:49 +00:00
Evan Cheng
37af498015
Use movaps instead of movapd for spill / restore.
...
llvm-svn: 28005
2006-04-28 02:23:35 +00:00
Evan Cheng
ff5a88b62e
Added a temporary option -no-isel-fold-inflight to control whether a "inflight"
...
node can be folded.
llvm-svn: 28003
2006-04-28 02:09:19 +00:00
Chris Lattner
85a24e23a2
When we have a two-address instruction where the input cannot be clobbered
...
and is already available, instead of falling back to emitting a load, fall
back to emitting a reg-reg copy. This generates significantly better code
for some SSE testcases, as SSE has lots of two-address instructions and
none of them are read/modify/write. As one example, this change does:
pshufd %XMM5, XMMWORD PTR [%ESP + 84], 255
xorps %XMM2, %XMM5
cmpltps %XMM1, %XMM0
- movaps XMMWORD PTR [%ESP + 52], %XMM0
- movapd %XMM6, XMMWORD PTR [%ESP + 52]
+ movaps %XMM6, %XMM0
cmpltps %XMM6, XMMWORD PTR [%ESP + 68]
movapd XMMWORD PTR [%ESP + 52], %XMM6
movaps %XMM6, %XMM0
cmpltps %XMM6, XMMWORD PTR [%ESP + 36]
cmpltps %XMM3, %XMM0
- movaps XMMWORD PTR [%ESP + 20], %XMM0
- movapd %XMM7, XMMWORD PTR [%ESP + 20]
+ movaps %XMM7, %XMM0
cmpltps %XMM7, XMMWORD PTR [%ESP + 4]
movapd XMMWORD PTR [%ESP + 20], %XMM7
cmpltps %XMM4, %XMM0
... which is far better than a store followed by a load!
llvm-svn: 28001
2006-04-28 01:46:50 +00:00
Chris Lattner
65291785c8
Add a note
...
llvm-svn: 27999
2006-04-28 00:04:05 +00:00
Chris Lattner
53275cb616
Add a note
...
llvm-svn: 27998
2006-04-27 21:40:57 +00:00
Chris Lattner
3f6a151e2d
Add support for inserting undef into a vector. This implements
...
Transforms/InstCombine/vec_insert_to_shuffle.ll
llvm-svn: 27997
2006-04-27 21:14:21 +00:00
Evan Cheng
11e3cec8bd
Make x86 isel lowering produce tailcall nodes. They are match to normal calls
...
for now.
Patch contributed by Alexander Friedman.
llvm-svn: 27994
2006-04-27 08:40:39 +00:00
Evan Cheng
efbc112b7c
A couple of new entries.
...
llvm-svn: 27993
2006-04-27 08:31:33 +00:00
Evan Cheng
24795120e1
Support for passing 128-bit vector arguments via XMM registers.
...
llvm-svn: 27992
2006-04-27 08:31:10 +00:00
Evan Cheng
a6081bc674
Insert a VBIT_CONVERT between a FORMAL_ARGUMENT node and its vector uses
...
(VAND, VADD, etc.). Legalizer will assert otherwise.
llvm-svn: 27991
2006-04-27 08:29:42 +00:00
Evan Cheng
1e065ae594
Oops
...
llvm-svn: 27989
2006-04-27 05:44:50 +00:00
Evan Cheng
a0e0eabc07
Bug fix: not updating NumIntRegs.
...
llvm-svn: 27988
2006-04-27 05:35:28 +00:00
Chris Lattner
3b27c1bd33
Fix Regression/CodeGen/Generic/2006-04-26-SetCCAnd.ll and
...
PR748.
llvm-svn: 27987
2006-04-27 05:01:07 +00:00
Evan Cheng
a1f9f34f35
- Clean up formal argument lowering code. Prepare for vector pass by value work.
...
- Fixed vararg support.
llvm-svn: 27985
2006-04-27 01:32:22 +00:00
Chris Lattner
be67e21327
Fix some nondeterminstic behavior in the mem2reg pass that (in addition to
...
nondeterminism being bad) could cause some trivial missed optimizations (dead
phi nodes being left around for later passes to clean up).
With this, llvm-gcc4 now bootstraps and correctly compares. I don't know
why I never tried to do it before... :)
llvm-svn: 27984
2006-04-27 01:14:43 +00:00
Chris Lattner
b42f17a66a
Implement Transforms/IndVarsSimplify/complex-scev.ll, a case where we didn't
...
recognize some simple affine IV's.
llvm-svn: 27982
2006-04-26 18:34:07 +00:00
Evan Cheng
3abec16563
Fix fastcc failures.
...
llvm-svn: 27980
2006-04-26 18:21:31 +00:00
Evan Cheng
58d4133b60
Switching over FORMAL_ARGUMENTS mechanism to lower call arguments.
...
llvm-svn: 27975
2006-04-26 01:20:17 +00:00
Evan Cheng
b54140b0d6
Don't forget return void.
...
llvm-svn: 27974
2006-04-25 23:03:35 +00:00
Nate Begeman
627fd2faaa
Keep the stack from on darwin 16-byte aligned. This fixes many JIT
...
failres.
llvm-svn: 27973
2006-04-25 20:54:26 +00:00
Evan Cheng
09112df9d3
Separate LowerOperation() into multiple functions, one per opcode.
...
llvm-svn: 27972
2006-04-25 20:13:52 +00:00
Andrew Lenharth
d62d88bae0
slightly more useful error message
...
llvm-svn: 27971
2006-04-25 19:33:41 +00:00
Andrew Lenharth
39d53414a5
better c99 struct handling
...
llvm-svn: 27970
2006-04-25 19:33:23 +00:00
Evan Cheng
abc391a5a6
Fix a typo.
...
llvm-svn: 27968
2006-04-25 17:48:41 +00:00
Nate Begeman
acdefe26fa
Fix a warning
...
llvm-svn: 27967
2006-04-25 17:46:32 +00:00
Nate Begeman
deeb953086
No functionality changes, but cleaner code with correct comments.
...
llvm-svn: 27966
2006-04-25 04:45:59 +00:00
Evan Cheng
7f0e30d1a2
Explicitly specify result type for def : Pat<> patterns (if it produces a vector
...
result). Otherwise tblgen will pick the default (v16i8 for 128-bit vector).
llvm-svn: 27965
2006-04-25 00:50:01 +00:00
Evan Cheng
e521de4e60
Added X86 SSE2 intrinsics which can be represented as vector_shuffles. This is
...
a temporary workaround for the 2-wide vector_shuffle problem (i.e. its mask
would have type v2i32 which is not legal).
llvm-svn: 27964
2006-04-24 23:34:56 +00:00
Evan Cheng
b7a2ab21a5
Add a new entry.
...
llvm-svn: 27963
2006-04-24 23:30:10 +00:00
Evan Cheng
0282b48ec2
Special case handling two wide build_vector(0, x).
...
llvm-svn: 27961
2006-04-24 22:58:52 +00:00
Evan Cheng
3306427d87
Some missing movlps, movhps, movlpd, and movhpd patterns.
...
llvm-svn: 27960
2006-04-24 21:58:20 +00:00
Evan Cheng
1eae7398a6
A little bit more build_vector enhancement for v8i16 cases.
...
llvm-svn: 27959
2006-04-24 18:01:45 +00:00
Evan Cheng
f74b046b06
Remove a completed entry.
...
llvm-svn: 27958
2006-04-24 17:38:16 +00:00
Evan Cheng
70237fcb5d
MakeMIInst() should handle jump table index operands.
...
llvm-svn: 27955
2006-04-24 05:37:35 +00:00
Chris Lattner
86f1e02800
Add a note
...
llvm-svn: 27954
2006-04-23 19:47:09 +00:00
Evan Cheng
4812ce5035
MOVL shuffle (i.e. movd or movss / movsd from memory) of undef, V2 == V2
...
llvm-svn: 27953
2006-04-23 06:35:19 +00:00
Nate Begeman
7bb910dbe7
Fix the updating of the machine CFG when a PHI node was in a successor of
...
the jump table's range check block. This re-enables 100% dense jump tables
by default on PPC & x86
llvm-svn: 27952
2006-04-23 06:26:20 +00:00
Nate Begeman
f86709e9e2
Code cleanup associated with jump tables, thanks to Chris for noticing
...
these.
llvm-svn: 27950
2006-04-22 23:52:35 +00:00
Nate Begeman
6cfb33a42c
Turn of jump tables for a bit, there are still some issues to work out with
...
updating the machine CFG.
llvm-svn: 27949
2006-04-22 23:51:56 +00:00
Nate Begeman
0d74cbcb6b
Optimized stores to the constant pool, while cool, are unnecessary.
...
llvm-svn: 27948
2006-04-22 22:31:45 +00:00
Nate Begeman
7ed816f900
JumpTable support! What this represents is working asm and jit support for
...
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Evan Cheng
1c33e83af5
Don't do all the lowering stuff for 2-wide build_vector's. Also, minor optimization for shuffle of undef.
...
llvm-svn: 27946
2006-04-22 08:34:05 +00:00
Evan Cheng
ec33bd04fb
Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector.
...
llvm-svn: 27945
2006-04-22 06:21:46 +00:00
Chris Lattner
de560fcaf7
Teach the JIT how to relocate LI, this fixes the JIT on Prolangs-C/TimberWolfMC
...
llvm-svn: 27943
2006-04-22 06:17:56 +00:00
Chris Lattner
4f0240b577
Fix JIT support for static ctors, which was apparently completely broken!
...
This allows Prolangs-C++/city and probably a bunch of other stuff to work
well with the new front-end
llvm-svn: 27941
2006-04-22 05:02:46 +00:00
Evan Cheng
5cb5fdd8eb
Revamp build_vector lowering to take advantage of movss and movd instructions.
...
movd always clear the top 96 bits and movss does so when it's loading the
value from memory.
The net result is codegen for 4-wide shuffles is much improved. It is near
optimal if one or more elements is a zero. e.g.
__m128i test(int a, int b) {
return _mm_set_epi32(0, 0, b, a);
}
compiles to
_test:
movd 8(%esp), %xmm1
movd 4(%esp), %xmm0
punpckldq %xmm1, %xmm0
ret
compare to gcc:
_test:
subl $12, %esp
movd 20(%esp), %xmm0
movd 16(%esp), %xmm1
punpckldq %xmm0, %xmm1
movq %xmm1, %xmm0
movhps LC0, %xmm0
addl $12, %esp
ret
or icc:
_test:
movd 4(%esp), %xmm0 #5.10
movd 8(%esp), %xmm3 #5.10
xorl %eax, %eax #5.10
movd %eax, %xmm1 #5.10
punpckldq %xmm1, %xmm0 #5.10
movd %eax, %xmm2 #5.10
punpckldq %xmm2, %xmm3 #5.10
punpckldq %xmm3, %xmm0 #5.10
ret #5.10
There are still room for improvement, for example the FP variant of the above example:
__m128 test(float a, float b) {
return _mm_set_ps(0.0, 0.0, b, a);
}
_test:
movss 8(%esp), %xmm1
movss 4(%esp), %xmm0
unpcklps %xmm1, %xmm0
xorps %xmm1, %xmm1
movlhps %xmm1, %xmm0
ret
The xorps and movlhps are unnecessary. This will require post legalizer optimization to handle.
llvm-svn: 27939
2006-04-21 23:03:30 +00:00
Nate Begeman
dc60393018
Fix the comment
...
llvm-svn: 27938
2006-04-21 22:11:27 +00:00
Nate Begeman
67b3094f27
Change the PPC JIT to use a Static relocation model
...
llvm-svn: 27937
2006-04-21 22:04:15 +00:00
Chris Lattner
d81dcf9da4
fix thinko
...
llvm-svn: 27935
2006-04-21 21:05:22 +00:00
Chris Lattner
84a811d57e
add some low-prio notes
...
llvm-svn: 27934
2006-04-21 21:03:21 +00:00
Chris Lattner
8f7c394f3a
The BFS scheduler is apparently nondeterminstic (causes many llvmgcc bootstrap
...
miscompares). Switch RISC targets to use the list-td scheduler, which isn't.
llvm-svn: 27933
2006-04-21 17:16:16 +00:00
Chris Lattner
7cc9363937
Remove a hack required by V9.
...
llvm-svn: 27931
2006-04-21 15:33:35 +00:00
Chris Lattner
3b9a7570cb
Fix a couple more memory issues
...
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Evan Cheng
e0289de5ab
Now generating perfect (I think) code for "vector set" with a single non-zero
...
scalar value.
e.g.
_mm_set_epi32(0, a, 0, 0);
==>
movd 4(%esp), %xmm0
pshufd $69, %xmm0, %xmm0
_mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
movzbw 4(%esp), %ax
movzwl %ax, %eax
pxor %xmm0, %xmm0
pinsrw $5, %eax, %xmm0
llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Chris Lattner
2ae3ed5e1a
Fix a really subtle and obnoxious memory bug that caused issues with an
...
llvm-gcc4 boostrap. Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).
llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Chris Lattner
01afcd337a
Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll
...
llvm-svn: 27912
2006-04-20 20:48:50 +00:00
Chris Lattner
f1a59f3dc1
Fix the CodeGen/PowerPC/buildvec_canonicalize.ll regression last night.
...
llvm-svn: 27908
2006-04-20 19:01:30 +00:00
Chris Lattner
2c1c3896ed
add a note
...
llvm-svn: 27907
2006-04-20 18:49:28 +00:00
Chris Lattner
829d8b5f7b
remove some v9 specific code
...
llvm-svn: 27900
2006-04-20 18:33:11 +00:00
Chris Lattner
04a8496ef0
This field no longer exists
...
llvm-svn: 27899
2006-04-20 18:32:41 +00:00
Chris Lattner
93d2acdead
Remove this obsolete file
...
llvm-svn: 27895
2006-04-20 18:16:45 +00:00
Chris Lattner
a6a6ac15af
Remove some of the obvious V9-specific cruft
...
llvm-svn: 27893
2006-04-20 18:08:53 +00:00
Chris Lattner
c751750a4f
This target is no longer built. The ,v files now live in the reoptimizer.
...
llvm-svn: 27885
2006-04-20 17:15:44 +00:00
Andrew Lenharth
e2b150550a
Make code match cvs commit message :)
...
llvm-svn: 27881
2006-04-20 15:41:37 +00:00
Andrew Lenharth
8f08647a6d
If we can convert the return pointer type into an integer that IntPtrType
...
can be converted to losslessly, we can continue the conversion to a direct call.
llvm-svn: 27880
2006-04-20 14:56:47 +00:00
Evan Cheng
41f2933444
- Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1>
...
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffle with zero (or any splat) vector.
llvm-svn: 27875
2006-04-20 08:58:49 +00:00
Evan Cheng
618314ed2f
Turn a VAND into a VECTOR_SHUFFLE is applicable.
...
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.
llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner
d11e0056ae
Make sure that the new instructions selected have the right type. This fixes
...
CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll
llvm-svn: 27868
2006-04-20 05:58:10 +00:00
Chris Lattner
fa2e364d9c
Implement folding of a bunch of binops with undef
...
llvm-svn: 27863
2006-04-20 05:39:12 +00:00
Evan Cheng
9dcd046bbd
Handle v2i64 BUILD_VECTOR custom lowering correctly. v2i64 is a legal type,
...
but i64 is not. If possible, change a i64 op to a f64 (e.g. load, constant)
and then cast it back.
llvm-svn: 27849
2006-04-20 00:11:39 +00:00
Evan Cheng
d79f6a9f5a
isSplatMask() bug: first element can be an undef.
...
llvm-svn: 27847
2006-04-19 23:28:59 +00:00
Chris Lattner
a1cf724536
Simplify some code
...
llvm-svn: 27846
2006-04-19 23:17:50 +00:00
Evan Cheng
019dea6886
- Added support to do aribitrary 4 wide shuffle with no more than three
...
instructions.
- Fixed a commute vector_shuff bug.
llvm-svn: 27845
2006-04-19 22:48:17 +00:00
Evan Cheng
7bbfc1d41a
Prefer {p}unpack* and mov*dup over {p}shuf* as well.
...
llvm-svn: 27844
2006-04-19 21:15:24 +00:00
Evan Cheng
5e80563052
Renamed AddedCost to AddedComplexity.
...
llvm-svn: 27843
2006-04-19 20:38:28 +00:00
Evan Cheng
a52eb1d7d5
- Renamed AddedCost to AddedComplexity.
...
- Added more movhlps and movlhps patterns.
llvm-svn: 27842
2006-04-19 20:37:34 +00:00
Evan Cheng
265831aa45
Commute vector_shuffle to match more movlhps, movlp{s|d} cases.
...
llvm-svn: 27840
2006-04-19 20:35:22 +00:00
Evan Cheng
56e205e534
More mov{h|l}p{d|s} patterns.
...
llvm-svn: 27836
2006-04-19 18:20:17 +00:00
Evan Cheng
b42424177c
- More mov{h|l}ps patterns.
...
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These
are preferred over shufps in most cases.
llvm-svn: 27835
2006-04-19 18:11:52 +00:00
Evan Cheng
318120f8ad
Allow "let AddedCost = n in" to increase pattern complexity.
...
llvm-svn: 27834
2006-04-19 18:07:24 +00:00
Chris Lattner
e307f43f35
add a note
...
llvm-svn: 27832
2006-04-19 16:22:38 +00:00
Andrew Lenharth
ca2734f9f1
Another simple case type merge case to try
...
llvm-svn: 27831
2006-04-19 15:34:34 +00:00
Andrew Lenharth
ef07d5c5a5
deal with memchr
...
llvm-svn: 27830
2006-04-19 15:34:02 +00:00
Andrew Lenharth
09948a5331
friendlier error message
...
llvm-svn: 27829
2006-04-19 15:33:35 +00:00
Chris Lattner
62537a04fb
add a note
...
llvm-svn: 27828
2006-04-19 05:55:06 +00:00
Chris Lattner
99c7c3ad2f
Add a note.
...
llvm-svn: 27827
2006-04-19 05:53:27 +00:00
Andrew Lenharth
d89cdc541a
stupid stuff
...
llvm-svn: 27821
2006-04-19 03:45:25 +00:00
Andrew Lenharth
87c0d713cc
I understand now. Shoot.
...
llvm-svn: 27819
2006-04-18 22:36:11 +00:00
Evan Cheng
7364ee1c92
- PEXTRW cannot take a memory location as its first source operand.
...
- PINSRWrmi encoding bug.
llvm-svn: 27818
2006-04-18 21:59:43 +00:00
Evan Cheng
d6fa185be2
SHUFP{S|D}, PSHUF* encoding bugs. Left out the mask immediate operand.
...
llvm-svn: 27817
2006-04-18 21:56:36 +00:00
Evan Cheng
f16e4bf29d
Name change for clarity sake
...
llvm-svn: 27816
2006-04-18 21:55:35 +00:00
Evan Cheng
82d7cacbbc
Encoding bug: CMPPSrmi, CMPPDrmi dropped operand 2 (condtion immediate).
...
llvm-svn: 27815
2006-04-18 21:31:08 +00:00
Evan Cheng
8e87e9b0db
Name change for clarity sake
...
llvm-svn: 27814
2006-04-18 21:29:50 +00:00
Evan Cheng
838f053b09
Left a pattern out
...
llvm-svn: 27813
2006-04-18 21:29:08 +00:00
Andrew Lenharth
4d432bcb19
llvm.memc* improvements. helps PA a lot in some specmarks
...
llvm-svn: 27812
2006-04-18 20:59:52 +00:00
Andrew Lenharth
d5fd116ede
llvm.memc* improvements. helps PA a lot in some specmarks
...
llvm-svn: 27811
2006-04-18 19:54:11 +00:00
Chris Lattner
f58f727be6
These are correctly encoded by the JIT. I checked :)
...
llvm-svn: 27810
2006-04-18 19:03:38 +00:00
Chris Lattner
5f153584d9
add a note
...
llvm-svn: 27809
2006-04-18 18:30:19 +00:00
Chris Lattner
47a41ae889
Fix a crash on:
...
void foo2(vector float *A, vector float *B) {
vector float C = (vector float)vec_cmpeq(*A, *B);
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
*A = C;
}
llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Evan Cheng
2cd4e2d240
Fixed an encoding bug: movd from XMM to R32.
...
llvm-svn: 27807
2006-04-18 18:19:00 +00:00
Chris Lattner
2bd91746e1
pretty print node name
...
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
44ea12c5f8
Implement an important entry from README_ALTIVEC:
...
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
519001b0ee
move some stuff around, clean things up
...
llvm-svn: 27802
2006-04-18 17:52:36 +00:00
Chris Lattner
3e2a664ada
Teach the codegen about instructions used for SSE spill code, allowing it
...
to optimize cases where it has to spill a lot
llvm-svn: 27801
2006-04-18 16:44:51 +00:00
Chris Lattner
e90fdf3b98
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
...
even/odd halves. Thanks to Nate telling me what's what.
llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner
5951b60cb4
Implement v16i8 multiply with this code:
...
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Evan Cheng
6be2e4b419
Correct comments
...
llvm-svn: 27790
2006-04-18 03:45:01 +00:00
Chris Lattner
4d84b56e64
Lower v8i16 multiply into this code:
...
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner
613d7fda64
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
...
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Evan Cheng
13a5022494
Another entry
...
llvm-svn: 27786
2006-04-18 01:22:57 +00:00
Evan Cheng
2f9011cd87
Another entry.
...
llvm-svn: 27784
2006-04-18 00:21:01 +00:00
Evan Cheng
98b1ca65dd
Use movss to insert_vector_elt(v, s, 0).
...
llvm-svn: 27782
2006-04-17 22:45:49 +00:00
Chris Lattner
76beeb373d
Turn x86 unaligned load/store intrinsics into aligned load/store instructions
...
if the pointer is known aligned.
llvm-svn: 27781
2006-04-17 22:26:56 +00:00
Chris Lattner
8c8f1e35c9
Fix handling of calls in functions that use vectors. This fixes a crash on
...
the code in GCC PR26546.
llvm-svn: 27780
2006-04-17 22:10:08 +00:00
Evan Cheng
ecf13c5d79
Use two pinsrw to insert an element into v4i32 / v4f32 vector.
...
llvm-svn: 27779
2006-04-17 22:04:06 +00:00
Chris Lattner
81938fa3db
remove done item
...
llvm-svn: 27778
2006-04-17 21:52:03 +00:00
Chris Lattner
fdecddb741
Don't diddle VRSAVE if no registers need to be added/removed from it. This
...
allows us to codegen functions as:
_test_rol:
vspltisw v2, -12
vrlw v2, v2, v2
blr
instead of:
_test_rol:
mfvrsave r2, 256
mr r3, r2
mtvrsave r3
vspltisw v2, -12
vrlw v2, v2, v2
mtvrsave r2
blr
Testcase here: CodeGen/PowerPC/vec_vrsave.ll
llvm-svn: 27777
2006-04-17 21:48:13 +00:00
Chris Lattner
5f5a9cc8ea
Add a MachineInstr::eraseFromParent convenience method.
...
llvm-svn: 27775
2006-04-17 21:35:41 +00:00
Evan Cheng
833ce43152
Encoding bug
...
llvm-svn: 27773
2006-04-17 21:33:57 +00:00
Chris Lattner
021f521a41
Vectors that are known live-in and live-out are clearly already marked in
...
the vrsave register for the caller. This allows us to codegen a function as:
_test_rol:
mfspr r2, 256
mr r3, r2
mtspr 256, r3
vspltisw v2, -12
vrlw v2, v2, v2
mtspr 256, r2
blr
instead of:
_test_rol:
mfspr r2, 256
oris r3, r2, 40960
mtspr 256, r3
vspltisw v0, -12
vrlw v2, v0, v0
mtspr 256, r2
blr
llvm-svn: 27772
2006-04-17 21:22:06 +00:00
Chris Lattner
a717d4f53b
Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
...
vspltisw v2, -12
vrlw v2, v2, v2
instead of:
vspltisw v0, -12
vrlw v2, v0, v0
when a function is returning a value.
llvm-svn: 27771
2006-04-17 21:19:12 +00:00
Chris Lattner
6b76deffb5
Move some knowledge about registers out of the code emitter into the register info.
...
llvm-svn: 27770
2006-04-17 21:07:20 +00:00
Chris Lattner
face261a94
Use a small table instead of macros to do this conversion.
...
llvm-svn: 27769
2006-04-17 20:59:25 +00:00
Evan Cheng
4de1805c84
Implement v8i16, v16i8 splat using unpckl + pshufd.
...
llvm-svn: 27768
2006-04-17 20:43:08 +00:00
Chris Lattner
e1d38ad84b
implement returns of a vector, testcase here: CodeGen/X86/vec_return.ll
...
llvm-svn: 27767
2006-04-17 20:32:50 +00:00
Chris Lattner
9bcfa7f378
Codegen insertelement with constant insertion points as scalar_to_vector
...
and a shuffle. For this:
void %test2(<4 x float>* %F, float %f) {
%tmp = load <4 x float>* %F ; <<4 x float>> [#uses=2]
%tmp3 = add <4 x float> %tmp, %tmp ; <<4 x float>> [#uses=1]
%tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2 ; <<4 x float>> [#uses=2]
%tmp6 = add <4 x float> %tmp2, %tmp2 ; <<4 x float>> [#uses=1]
store <4 x float> %tmp6, <4 x float>* %F
ret void
}
we now get this on X86 (which will get better):
_test2:
movl 4(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, %xmm1
shufps $3, %xmm1, %xmm1
movaps %xmm0, %xmm2
shufps $1, %xmm2, %xmm2
unpcklps %xmm1, %xmm2
movss 8(%esp), %xmm1
unpcklps %xmm1, %xmm0
unpcklps %xmm2, %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
ret
instead of:
_test2:
subl $28, %esp
movl 32(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%esp)
movss 36(%esp), %xmm0
movss %xmm0, 8(%esp)
movaps (%esp), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
addl $28, %esp
ret
llvm-svn: 27765
2006-04-17 19:21:01 +00:00
Chris Lattner
f2347c31b4
Make sure to check splats of every constant we can, handle splat(31) by
...
being a bit more clever, add support for odd splats from -31 to -17.
llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Evan Cheng
5728f30f7c
Incorrect foldMemoryOperand entries
...
llvm-svn: 27763
2006-04-17 18:06:12 +00:00
Evan Cheng
3d26db8148
Errors in patterns preventing load folding
...
llvm-svn: 27762
2006-04-17 18:05:01 +00:00
Jeff Cohen
4cacdf3a2b
Add checks for __OpenBSD__.
...
llvm-svn: 27761
2006-04-17 17:55:41 +00:00
Chris Lattner
cc4222d95b
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
...
This implements vec_constants.ll:test_vsldoi and test_rol
llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
7d66e5a118
add a note
...
llvm-svn: 27758
2006-04-17 17:29:41 +00:00
Evan Cheng
eb739d0355
FP SETOLT, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly
...
llvm-svn: 27755
2006-04-17 07:24:10 +00:00
Chris Lattner
2d8d6c9feb
Make some code more general, adding support for constant formation of several
...
new patterns.
llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
9dd4ebffca
Learn how to make odd splatted constants in range [17,29]. This implements
...
PowerPC/vec_constants.ll:test_29.
llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
72a67a5b1f
Pull some code out into a helper function.
...
Effeciently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner
5367a73dec
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
...
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
34ec6432f6
Regenerate with adjusted costs
...
llvm-svn: 27746
2006-04-17 05:26:20 +00:00
Chris Lattner
36ceea9e96
Regenerate with correct offset
...
llvm-svn: 27744
2006-04-17 05:08:46 +00:00
Chris Lattner
671f50cf33
Increase the opcodes by one each to disambiguate COPY from VMRGHW.
...
llvm-svn: 27742
2006-04-17 00:47:48 +00:00
Chris Lattner
99ee809cb6
Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
...
of various 4-element vectors.
llvm-svn: 27739
2006-04-17 00:37:02 +00:00
Evan Cheng
68b2e5b4b0
movduprm, movshduprm bugs
...
llvm-svn: 27734
2006-04-16 18:11:28 +00:00
Evan Cheng
26d917789c
Encoding bugs
...
llvm-svn: 27733
2006-04-16 07:02:22 +00:00
Evan Cheng
b2e3339cb2
Can't fold loads into alias vector SSE ops used for scalar operation. The load
...
address has to be 16-byte aligned but the values aren't spilled to 128-bit
locations.
llvm-svn: 27732
2006-04-16 06:58:19 +00:00
Chris Lattner
d86516991a
Implement a TODO: have the legalizer canonicalize a bunch of operations to
...
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
c2f3923b63
Add support for promoting stores from one legal type to another, allowing us
...
to write one pattern for vector stores instead of 4.
llvm-svn: 27730
2006-04-16 01:36:45 +00:00
Chris Lattner
f4126f0db7
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
...
Remove some done items from the todo list.
llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
4422d3de1b
Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform
...
Make the insert/extract elt -> shuffle code more aggressive.
This fixes CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27728
2006-04-16 00:51:47 +00:00
Chris Lattner
da260db137
Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask').
...
llvm-svn: 27727
2006-04-16 00:03:56 +00:00
Chris Lattner
44245f11c3
Fix a crash when faced with a shuffle vector that has an undef in its mask.
...
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
2ede0fef98
Add patterns for matching vnots with bit converted inputs. Most of these will
...
go away when I start using evan's binop type canonicalizer
llvm-svn: 27725
2006-04-15 23:45:24 +00:00
Chris Lattner
254683a3df
Add a new vnot_conv predicate for matching vnot's where the allones vector is
...
bitconverted from some other type.
llvm-svn: 27724
2006-04-15 23:39:14 +00:00
Chris Lattner
918dbf7203
Make these predicates return true for bit_convert(buildvector)'s as well as
...
buildvectors.
llvm-svn: 27723
2006-04-15 23:38:00 +00:00
Evan Cheng
9f33b2abc5
More encoding bugs
...
llvm-svn: 27722
2006-04-15 06:10:09 +00:00
Evan Cheng
87e0cd1569
pslldrm, psrawrm, etc. encoding bug
...
llvm-svn: 27721
2006-04-15 05:59:08 +00:00
Evan Cheng
4487cf8125
hsubp{s|d} encoding bug
...
llvm-svn: 27720
2006-04-15 05:52:42 +00:00
Evan Cheng
32e5d4f6bc
Silly bug
...
llvm-svn: 27719
2006-04-15 05:37:34 +00:00
Evan Cheng
f9a93a1d3f
Do not use movs{h|l}dup for a shuffle with a single non-undef node.
...
llvm-svn: 27718
2006-04-15 03:13:24 +00:00
Chris Lattner
055889cfb9
significant cleanups to code that uses insert/extractelt heavily. This builds
...
maximal shuffles out of them where possible.
llvm-svn: 27717
2006-04-15 01:39:45 +00:00
Evan Cheng
300456c7f2
Added SSE (and other) entries to foldMemoryOperand().
...
llvm-svn: 27716
2006-04-14 23:33:27 +00:00
Evan Cheng
c626c9bb00
Some clean up
...
llvm-svn: 27715
2006-04-14 23:32:40 +00:00
Chris Lattner
5c9d357d7c
Allow undef in a shuffle mask
...
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
5ecb2adcc6
Move these ctors out of line
...
llvm-svn: 27713
2006-04-14 22:20:32 +00:00
Evan Cheng
32c4470374
Last few SSE3 intrinsics.
...
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Chris Lattner
b3cae60d0b
Teach scalarrepl to promote unions of vectors and floats, producing
...
insert/extractelement operations. This implements
Transforms/ScalarRepl/vector_promote.ll
llvm-svn: 27710
2006-04-14 21:42:41 +00:00
Evan Cheng
184264997e
Misc. SSE2 intrinsics: clflush, lfench, mfence
...
llvm-svn: 27699
2006-04-14 07:43:12 +00:00
Evan Cheng
4831ed56e4
We were not adjusting the frame size to ensure proper alignment when alloca /
...
vla are present in the function. This causes a crash when a leaf function
allocates space on the stack used to store / load with 128-bit SSE
instructions.
llvm-svn: 27698
2006-04-14 07:26:43 +00:00
Evan Cheng
cc83472e2d
New entry
...
llvm-svn: 27697
2006-04-14 07:24:04 +00:00
Reid Spencer
ac43fc444d
Don't print out the install command for Intrinsics.gen unless VERBOSE mode.
...
llvm-svn: 27696
2006-04-14 06:32:31 +00:00
Chris Lattner
8882bef982
Make this assertion better
...
llvm-svn: 27695
2006-04-14 06:08:35 +00:00
Chris Lattner
cf80e569f6
Move the rest of the PPCTargetLowering::LowerOperation cases out into
...
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
aacabea404
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
...
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
9f3ef19ced
Implement value #'ing for vector operations, implementing
...
Regression/Transforms/GCSE/vectorops.ll
llvm-svn: 27691
2006-04-14 05:10:20 +00:00
Evan Cheng
360a73046f
pcmpeq* and pcmpgt* intrinsics.
...
llvm-svn: 27685
2006-04-14 01:39:53 +00:00
Evan Cheng
18a1a0e199
psll*, psrl*, and psra* intrinsics.
...
llvm-svn: 27684
2006-04-14 00:14:05 +00:00
Reid Spencer
86c9d10360
Remove the .cvsignore file so this directory can be pruned.
...
llvm-svn: 27683
2006-04-13 22:00:10 +00:00
Reid Spencer
460ea010ae
Remove .cvsignore so that this directory can be pruned.
...
llvm-svn: 27682
2006-04-13 21:59:03 +00:00
Andrew Lenharth
b6264f0dbd
Handle some kernel code than ends in [0 x sbyte]. I think this is safe
...
llvm-svn: 27672
2006-04-13 19:31:49 +00:00
Reid Spencer
4719e9699d
Expand some code with temporary variables to rid ourselves of the warning
...
about "dereferencing type-punned pointer will break strict-aliasing rules"
llvm-svn: 27671
2006-04-13 18:29:58 +00:00
Evan Cheng
f7645b0c49
Doh. PANDrm, etc. are not commutable.
...
llvm-svn: 27668
2006-04-13 18:11:28 +00:00
Chris Lattner
569ea9c6dd
Force non-darwin targets to use a static relo model. This fixes PR734,
...
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
cec07adf4d
add a note, move an altivec todo to the altivec list.
...
llvm-svn: 27654
2006-04-13 16:48:00 +00:00
Andrew Lenharth
bffed48656
linear -> constant time
...
llvm-svn: 27652
2006-04-13 13:43:31 +00:00
Reid Spencer
b08854af39
Add the README files to the distribution.
...
llvm-svn: 27651
2006-04-13 06:39:24 +00:00
Evan Cheng
2de048bc69
psad, pmax, pmin intrinsics.
...
llvm-svn: 27647
2006-04-13 06:11:45 +00:00
Evan Cheng
93dcea2b5a
Various SSE2 packed integer intrinsics: pmulhuw, pavgw, etc.
...
llvm-svn: 27645
2006-04-13 05:24:54 +00:00
Evan Cheng
25fcfb9f2d
X86 SSE2 supports v8i16 multiplication
...
llvm-svn: 27644
2006-04-13 05:10:25 +00:00
Evan Cheng
d6cad69ef4
Update
...
llvm-svn: 27643
2006-04-13 05:09:45 +00:00
Evan Cheng
2f634fac6d
padds{b|w}, paddus{b|w}, psubs{b|w}, psubus{b|w} intrinsics.
...
llvm-svn: 27639
2006-04-13 00:43:35 +00:00
Evan Cheng
537bdb370c
Naming inconsistency.
...
llvm-svn: 27638
2006-04-13 00:00:23 +00:00
Evan Cheng
8768f25c80
SSE / SSE2 conversion intrinsics.
...
llvm-svn: 27637
2006-04-12 23:42:44 +00:00
Evan Cheng
2c2d734efd
All "integer" logical ops (pand, por, pxor) are now promoted to v2i64.
...
Clean up and fix various logical ops issues.
llvm-svn: 27633
2006-04-12 21:21:57 +00:00
Evan Cheng
1477d2d08f
Promote vector AND, OR, and XOR
...
llvm-svn: 27632
2006-04-12 21:20:24 +00:00
Reid Spencer
7f718db335
Make sure CVS versions of yacc and lex files get distributed.
...
llvm-svn: 27630
2006-04-12 20:57:05 +00:00
Reid Spencer
56aa7c79b7
Get rid of a signed/unsigned compare warning.
...
llvm-svn: 27625
2006-04-12 19:28:15 +00:00
Chris Lattner
e087b8e321
Add a new way to match vector constants, which make it easier to bang bits of
...
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
7900e6da3b
Turn casts into getelementptr's when possible. This enables SROA to be more
...
aggressive in some cases where LLVMGCC 4 is inserting casts for no reason.
This implements InstCombine/cast.ll:test27/28.
llvm-svn: 27620
2006-04-12 18:09:35 +00:00
Reid Spencer
23eae83205
Don't emit useless warning messages.
...
llvm-svn: 27617
2006-04-12 17:56:16 +00:00
Chris Lattner
ce6e988fa6
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Evan Cheng
66fb7beed7
Promote v4i32, v8i16, v16i8 load to v2i64 load.
...
llvm-svn: 27612
2006-04-12 17:12:36 +00:00
Chris Lattner
602d86f7af
Ensure that zero vectors are always v4i32, which forces them to CSE with
...
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Evan Cheng
ce4e1c0068
Vector type promotion for ISD::LOAD and ISD::SELECT
...
llvm-svn: 27606
2006-04-12 16:33:18 +00:00
Chris Lattner
70d68fcfcb
Implement support for the formal_arguments node. To get this, targets shouldcustom legalize it and remove their XXXTargetLowering::LowerArguments overload
...
llvm-svn: 27604
2006-04-12 16:20:43 +00:00
Evan Cheng
fbdf6ece4a
Various SSE2 conversion intrinsics
...
llvm-svn: 27603
2006-04-12 05:20:24 +00:00
Chris Lattner
3df4b4ca55
Don't memoize vloads in the load map! Don't memoize them anywhere here, let
...
getNode do it. This fixes CodeGen/Generic/2006-04-11-vecload.ll
llvm-svn: 27602
2006-04-12 03:25:41 +00:00
Evan Cheng
68b885f50c
Added __builtin_ia32_storelv4si, __builtin_ia32_movqv4si,
...
__builtin_ia32_loadlv4si, __builtin_ia32_loaddqu, __builtin_ia32_storedqu.
llvm-svn: 27599
2006-04-11 22:28:25 +00:00
Nate Begeman
ccd6ea1913
Fix SingleSource/UnitTests/Vector/sumarray-dbl
...
llvm-svn: 27594
2006-04-11 19:44:43 +00:00
Nate Begeman
786d44f822
Fix PR727, correctly handling large stack aligments on ppc
...
llvm-svn: 27593
2006-04-11 19:29:21 +00:00
Chris Lattner
0e63e916b3
we have a shuffle instr, add an example.
...
llvm-svn: 27592
2006-04-11 18:47:03 +00:00
Evan Cheng
c0848b1eaf
gcc lower SSE prefetch into generic prefetch intrinsic. Need to add support
...
later.
llvm-svn: 27591
2006-04-11 18:04:57 +00:00
Evan Cheng
7a9fca11b5
Misc. intrinsics.
...
llvm-svn: 27590
2006-04-11 17:35:57 +00:00
Jim Laskey
1e0cbe4158
Suppress debug label when not debug.
...
llvm-svn: 27588
2006-04-11 08:11:53 +00:00
Evan Cheng
798acd4094
movnt* and maskmovdqu intrinsics
...
llvm-svn: 27587
2006-04-11 06:57:30 +00:00
Evan Cheng
be9f313cd8
Only get Tmp2 for cases where number of operands is > 1. Fixed return void.
...
llvm-svn: 27586
2006-04-11 06:33:39 +00:00
Chris Lattner
651f0655d0
add some todos
...
llvm-svn: 27580
2006-04-11 02:00:08 +00:00
Chris Lattner
e12152a64b
Vector function results go into V2 according to GCC. The darwin ABI doc
...
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
d02e72ffc3
Add basic support for legalizing returns of vectors
...
llvm-svn: 27578
2006-04-11 01:31:51 +00:00
Chris Lattner
5d1acb831a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
...
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Evan Cheng
da283be867
Added support for _mm_move_ss and _mm_move_sd.
...
llvm-svn: 27575
2006-04-11 00:19:04 +00:00