Bruno Cardoso Lopes
483bb7eed2
Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask
...
llvm-svn: 113050
2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes
742030b3db
Emit target specific nodes for isSHUFPMask
...
llvm-svn: 113048
2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes
3e3169873e
Previous isMOVLMask matching already emits target nodes, remove check
...
llvm-svn: 113047
2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes
b867456bfc
One more check from the original isShuffleMaskLegal goes away
...
llvm-svn: 113045
2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes
3081ae493b
Remove a duplicated but useless check that I've inserted in the previous commit.
...
llvm-svn: 113044
2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes
22775e6e65
Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef
...
llvm-svn: 113043
2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes
5d71537f4a
Remove check for unpckh mask
...
llvm-svn: 113035
2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes
d9d2ed558e
Remove check for unpckl mask
...
llvm-svn: 113034
2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes
ecfa52b251
Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start
...
checking each standalone condition and decide whether to emit target
specific nodes or remove the condition if it's already matched before.
llvm-svn: 113031
2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes
01dc6f1195
Reapply the part of r112934 and r112942 considered harmful.
...
"Use target specific nodes instead of relying on unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt."
llvm-svn: 113020
2010-09-03 22:09:41 +00:00
Bruno Cardoso Lopes
4753ce5e2c
Reintroduce a simple function refactoring done in r112934, also without any functionality changes
...
llvm-svn: 113008
2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes
3c43bc3214
Reapply pieces of r112942 and r112934 which don't make
...
functional changes
llvm-svn: 113007
2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes
9635a81d34
Reapply Fix comment
...
llvm-svn: 113006
2010-09-03 19:55:05 +00:00
Daniel Dunbar
26e0e964ab
Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
...
some infinite loop and select failures.
- Apologies for eager reverting, but it's branch day.
llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Daniel Dunbar
c8af4f3a0a
Revert r112938 "Fix comment", which depends on r112934, which introduced some
...
infinite loop and select failures.
llvm-svn: 112999
2010-09-03 19:38:08 +00:00
Daniel Dunbar
4ece67890b
Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh
...
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.
llvm-svn: 112998
2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes
70f376e9da
Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment
...
llvm-svn: 112942
2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes
b107a092a5
Fix comment
...
llvm-svn: 112938
2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes
e1ad6555a8
- Use specific nodes to match unpckl masks.
...
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description in the comments
llvm-svn: 112934
2010-09-03 01:24:00 +00:00
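For readers unfamiliar with what "matching an unpckl mask" means in the entries above, the following is a minimal sketch, not LLVM's actual predicate. It assumes a shuffle mask is a list of element indices into the concatenation of two N-element inputs, with -1 standing for an undefined lane, and that an unpckl mask interleaves the low halves of the two inputs (for four elements: 0, 4, 1, 5). The helper name `looksLikeUnpcklMask` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only; not the LLVM implementation.
// Returns true if `mask` interleaves the low halves of two N-element inputs:
// result lane 2k comes from input 1 lane k, lane 2k+1 from input 2 lane k.
// A mask entry of -1 (undef) is accepted in any position.
bool looksLikeUnpcklMask(const std::vector<int> &mask) {
    const int n = static_cast<int>(mask.size());
    if (n == 0 || n % 2 != 0)
        return false;
    for (int i = 0; i < n; ++i) {
        // Even lanes read input 1 (indices 0..n-1), odd lanes read
        // input 2 (indices n..2n-1).
        const int expected = (i % 2 == 0) ? i / 2 : n + i / 2;
        if (mask[i] != -1 && mask[i] != expected)
            return false;
    }
    return true;
}
```

Under these assumptions, `{0, 4, 1, 5}` and `{0, -1, 1, 5}` are accepted, while the unpckh-style mask `{2, 6, 3, 7}` is rejected.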
Anton Korobeynikov
a65910e5ca
Revert win64 changes. They seem to be incomplete
...
llvm-svn: 112885
2010-09-02 22:31:32 +00:00
Anton Korobeynikov
339ab60a5b
Properly allocate win64 shadow reg area.
...
Patch by Jan Sjodin!
llvm-svn: 112875
2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes
659f549638
Replace unpckl_undef and unpckh_undef matching with target specific opcodes
...
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes
9d4a11d4c6
Move condition out to prepare for more matching
...
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes
1b9095fff1
Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
...
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes
dcdab94661
become more strict about when it's safe to use X86ISD::MOVLPS
...
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes
b73f0cbc7a
Revert r112689, avoid those kinds of checks because they mess up MMX
...
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes
9375b2f67d
Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
...
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
b69568ab33
minor change, simplify some logic
...
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes
c31697f68c
Move some functions around so they can be used by other upcoming functions
...
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes
80613a070e
Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
...
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
8fc83b1960
Use x86 specific MOVSHDUP node and add more patterns to match it
...
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes
dfa177cf81
Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
...
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes
6fbe7b9ddd
Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
...
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
08d5d62dcb
Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when searching for scalars inside vector shuffles
...
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Chris Lattner
8cb4abbc0e
fix the buildvector->insertp[sd] logic to not always create a redundant
...
insertp[sd] $0, which is a noop. Before:
_f32:                                   ## @f32
        pshufd $1, %xmm1, %xmm2
        pshufd $1, %xmm0, %xmm3
        addss %xmm2, %xmm3
        addss %xmm1, %xmm0
        ## kill: XMM0<def> XMM0<kill> XMM0<def>
        insertps $0, %xmm0, %xmm0
        insertps $16, %xmm3, %xmm0
        ret
after:
_f32:                                   ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm3
        addss %xmm1, %xmm3
        movdqa %xmm2, %xmm0
        insertps $16, %xmm3, %xmm0
        ret
The extra movs are due to a random (poor) scheduling decision.
llvm-svn: 112379
2010-08-28 17:59:08 +00:00
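Why `insertps $0, %xmm0, %xmm0` in r112379 above is a no-op can be seen from a small model of the instruction. This is a sketch of the register form only, assuming the usual immediate layout (bits 7:6 select the source lane, bits 5:4 the destination lane, bits 3:0 are a zero mask); the type alias `V4` and the function name are illustrative.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<float, 4>;

// Model of SSE4.1 insertps (register form), for illustration.
//   imm[7:6] : which lane of src to read
//   imm[5:4] : which lane of dst to overwrite
//   imm[3:0] : result lanes to force to zero
V4 insertps(V4 dst, const V4 &src, unsigned imm) {
    const unsigned srcLane = (imm >> 6) & 3;
    const unsigned dstLane = (imm >> 4) & 3;
    dst[dstLane] = src[srcLane];
    for (unsigned i = 0; i < 4; ++i)
        if (imm & (1u << i))
            dst[i] = 0.0f;
    return dst;
}
```

With an immediate of 0 and the same register as source and destination, lane 0 is copied onto itself and nothing is zeroed, so the value is unchanged. An immediate of 16 copies lane 0 of the source into lane 1 of the destination, which is the only insert the "after" sequence keeps.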
Chris Lattner
c3b630d64b
fix the BuildVector -> unpcklps logic to not do pointless shuffles
...
when the top elements of a vector are undefined. This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined. For example, on:
_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}
We used to produce (with SSE2, SSE4.1+ uses insertps):
_f32:                                   ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $16, %xmm2, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm0
        addss %xmm1, %xmm0
        pshufd $16, %xmm0, %xmm1
        movdqa %xmm2, %xmm0
        unpcklps %xmm1, %xmm0
        ret
We now produce:
_f32:                                   ## @f32
        movdqa %xmm0, %xmm2
        addss %xmm1, %xmm2
        pshufd $1, %xmm1, %xmm1
        pshufd $1, %xmm0, %xmm3
        addss %xmm1, %xmm3
        movaps %xmm2, %xmm0
        unpcklps %xmm3, %xmm0
        ret
This implements rdar://8368414
llvm-svn: 112378
2010-08-28 17:28:30 +00:00
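The shorter sequence in r112378 above can be traced with a small lane-level model. This is a sketch under stated assumptions: `V4`, `pshufd`, `unpcklps`, and `complexAdd` are illustrative names, each `_Complex float` is assumed to occupy the low two lanes of an XMM register (real part in lane 0, imaginary part in lane 1), and the upper lanes are treated as don't-care.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<float, 4>;

// pshufd: result lane i takes the source lane selected by imm bits [2i+1:2i].
V4 pshufd(const V4 &src, unsigned imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i)
        r[i] = src[(imm >> (2 * i)) & 3];
    return r;
}

// unpcklps: interleave the low two lanes of a and b.
V4 unpcklps(const V4 &a, const V4 &b) {
    return {a[0], b[0], a[1], b[1]};
}

// Mirrors the "We now produce" sequence step by step.
V4 complexAdd(const V4 &A, const V4 &B) {
    V4 x2 = A;                 // movdqa %xmm0, %xmm2
    x2[0] += B[0];             // addss  %xmm1, %xmm2   (lane 0 only)
    V4 x1 = pshufd(B, 1);      // pshufd $1, %xmm1, %xmm1 : lane 0 = B[1]
    V4 x3 = pshufd(A, 1);      // pshufd $1, %xmm0, %xmm3 : lane 0 = A[1]
    x3[0] += x1[0];            // addss  %xmm1, %xmm3
    return unpcklps(x2, x3);   // lanes 0,1 = {re sum, im sum}
}
```

Only lanes 0 and 1 of the result are meaningful. Because the upper lanes are undefined anyway, the two `pshufd $16` shuffles of the older sequence, which only rearranged lanes that `unpcklps` does not need in that position, can be dropped.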
Chris Lattner
7fa5fa1207
improve comments in the unpcklps generating logic, introduce
...
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose. No functionality change.
llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes
1052e6d5d9
Clean up the logic of vector shuffles -> vector shifts.
...
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.
llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Anton Korobeynikov
62a9879ef4
Properly handle passing of FP stuff to varargs function on Win64:
...
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!
llvm-svn: 112262
2010-08-27 14:43:06 +00:00
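The rule that r112262 above implements can be sketched as a lookup from argument position to registers. This is a simplified illustration, not the X86 backend's code: it assumes the standard Win64 convention in which the first four arguments are passed positionally in RCX, RDX, R8, R9 (integers) or XMM0 to XMM3 (floating point), and that a floating-point argument to a varargs callee is placed in both registers for its position. The function name `win64ArgRegs` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Registers carrying the argument at 0-based position `index` under Win64.
// An empty result means the argument is passed on the stack.
// Illustrative only; names and structure are not taken from LLVM.
std::vector<std::string> win64ArgRegs(unsigned index, bool isFP,
                                      bool calleeIsVarArg) {
    static const char *const intRegs[] = {"RCX", "RDX", "R8", "R9"};
    static const char *const fpRegs[] = {"XMM0", "XMM1", "XMM2", "XMM3"};
    if (index >= 4)
        return {};
    if (!isFP)
        return {std::string(intRegs[index])};
    if (calleeIsVarArg)
        // The callee cannot know the type, so the value goes in both places.
        return {std::string(fpRegs[index]), std::string(intRegs[index])};
    return {std::string(fpRegs[index])};
}
```

For example, a `double` passed as the second argument to a varargs function would be reported as living in both XMM1 and RDX under this model.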
Bruno Cardoso Lopes
6150648a64
zap the now unused MVT::getIntVectorWithNumElements
...
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Chris Lattner
148485f707
implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
...
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner
5256226fc8
fix sse1 only codegen in x86-64 mode, which is something we
...
apparently try to support.
llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes
28f3261dbd
Revert this for now, PUNPCKLDQ doesn't operate on v4f32
...
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Anton Korobeynikov
1544f79e36
Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
...
Mark _alloca call as clobbering EFLAGS, otherwise some DCE might remove
other flags-clobbering stuff (e.g. cmp instructions) occurring after
_alloca call.
llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes
af72dd7362
PUNPCKLDQ should also be used for v4f32
...
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes
33aa4f7d1c
teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
...
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Dan Gohman
e400c660e4
Fix X86's isLegalAddressingMode to recognize that static addresses
...
need not be RIP-relative in small mode.
llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes
7939025262
Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
...
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes
ed9ff8d8d0
Start using target specific nodes for shuffles: pshufhw and pshuflw
...
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Anton Korobeynikov
a68e2a53a1
Revert invalid r111792. Jump tables are not broken on x86-64 / coff,
...
it's the COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is pretty fine with code produced.
llvm-svn: 111801
2010-08-23 07:38:51 +00:00