Nadav Rotem
157be301c5
AVX2: Add an additional broadcast idiom.
...
llvm-svn: 156540
2012-05-10 12:39:13 +00:00
Nadav Rotem
64319ce27c
Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
...
Starting with r155461, we are able to select patterns for vbroadcast even when the load op is used by other users.
Fix PR11900.
llvm-svn: 156539
2012-05-10 12:22:05 +00:00
Jakob Stoklund Olesen
88cf278739
Use ptr_rc_tailcall instead of GR32_TC.
...
The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.
Patch by Yiannis Tsiouris!
llvm-svn: 156459
2012-05-09 01:50:09 +00:00
Jakob Stoklund Olesen
989c6b112d
s/CSR_Ghc/CSR_NoRegs/
...
Share the CalleeSavedRegs defs between all calling conventions having no
callee-saved registers.
Patch by Yiannis Tsiouris!
llvm-svn: 156382
2012-05-08 15:07:29 +00:00
Craig Topper
77b1a4cee5
Remove 256-bit AVX non-temporal store intrinsics. The same was previously done for the 128-bit versions.
...
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Jakob Stoklund Olesen
cc0cf22b98
Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
...
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).
So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.
Patch by Yiannis Tsiouris!
llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Chad Rosier
3e284d8bd6
Fix a regression from r147481. This combine should only happen if there is a
...
single use.
rdar://11360370
llvm-svn: 156316
2012-05-07 18:47:44 +00:00
Manman Ren
6fde9f74b4
X86: optimization for -(x != 0)
...
This patch will optimize -(x != 0) on X86
FROM
cmpl $0x01,%edi
sbbl %eax,%eax
notl %eax
TO
negl %edi
sbbl %eax, %eax
In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;
rdar: 10961709
llvm-svn: 156312
2012-05-07 18:06:23 +00:00
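The flag reasoning behind the -(x != 0) rewrite in r156312 above can be modeled in plain C. This is only an illustrative sketch, not code from the patch: the function names are hypothetical, and the carry flag is modeled as an ordinary integer. It assumes the usual x86 semantics, where negl sets CF exactly when its operand is nonzero and sbbl %eax, %eax yields -CF.

```c
#include <stdint.h>

/* Reference semantics: 0 when x == 0, otherwise all ones (-1). */
static int32_t neg_ne_zero_ref(uint32_t x) {
    return -(int32_t)(x != 0);
}

/* Model of the old three-instruction sequence. */
static int32_t neg_ne_zero_old(uint32_t x) {
    int32_t carry = (x < 1u);  /* cmpl $1, %edi: borrow (CF) iff x == 0 */
    int32_t eax = -carry;      /* sbbl %eax, %eax: eax - eax - CF == -CF */
    return ~eax;               /* notl %eax */
}

/* Model of the new two-instruction sequence. */
static int32_t neg_ne_zero_new(uint32_t x) {
    int32_t carry = (x != 0);  /* negl %edi: CF set iff operand != 0 */
    return -carry;             /* sbbl %eax, %eax: -CF */
}
```

Both models agree with the reference for every input, which is why the notl can be dropped once negl produces the carry with the right polarity.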
Craig Topper
02644ca6b7
Fix some issues in the f16c instructions.
...
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Craig Topper
c6d0bc2afc
Add SSE4A MOVNTSS/MOVNTSD instructions.
...
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Craig Topper
4246b08208
Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions.
...
llvm-svn: 156268
2012-05-06 19:46:21 +00:00
Craig Topper
b3b4c9476d
Add VPERMQ/VPERMPD to the list of target specific shuffles that can be looked through for DAG combine purposes.
...
llvm-svn: 156266
2012-05-06 18:54:26 +00:00
Craig Topper
b95ee6cfc1
Add shuffle decode support for VPERMQ/VPERMPD.
...
llvm-svn: 156265
2012-05-06 18:44:02 +00:00
Jim Grosbach
f7461026c2
Nuke a few dead remnants of the CBE.
...
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Benjamin Kramer
7a9528b540
Add a new target hook "predictableSelectIsExpensive".
...
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Preston Gurd
8de39bd4f6
Adds Intel Atom scheduling latencies to X86InstrSystem.td.
...
llvm-svn: 156194
2012-05-04 19:26:37 +00:00
Craig Topper
88bf1f4404
Fix some loops to match coding standards. No functional change intended.
...
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper
3845ea5b9e
Fix up some spacing. No functional change.
...
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper
71aab70d71
Simplify broadcast lowering code. No functional change intended.
...
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper
6881f1067c
Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
...
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Craig Topper
f7516089b7
Simplify shuffle narrowing code a bit. No functional change intended.
...
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Jakob Stoklund Olesen
7bdae32bfd
Remove the SubRegClasses field from RegisterClass descriptions.
...
This information is now computed by TableGen.
llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Craig Topper
9bdd3bb279
Use 'unsigned' instead of 'int' in a few places dealing with counts of vector elements.
...
llvm-svn: 156060
2012-05-03 07:26:59 +00:00
Craig Topper
52869bf5bf
Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
...
llvm-svn: 156059
2012-05-03 07:12:59 +00:00
Preston Gurd
047af997f6
For Intel Atom, use ILP scheduling always, instead of ILP for 64 bit
...
and Hybrid for 32 bit, since benchmarks show ILP scheduling is better
most of the time.
llvm-svn: 156028
2012-05-02 22:02:02 +00:00
Preston Gurd
24f13ffba6
Change the Intel Atom detection code to recognize
...
Lincroft and Medfield.
llvm-svn: 156025
2012-05-02 21:38:46 +00:00
Preston Gurd
29e60325bf
This patch continues the work of adding instruction latencies for X86 Atom,
...
by providing the latencies for the instructions in X86InstrFPStack.td.
llvm-svn: 155996
2012-05-02 16:03:35 +00:00
Manman Ren
0bdd46e32e
Revert r155853
...
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.
llvm-svn: 155992
2012-05-02 15:24:32 +00:00
Craig Topper
00ccecdc84
Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
...
llvm-svn: 155982
2012-05-02 08:03:44 +00:00
Jakub Staszak
5a4bcd5559
Remove unneeded break.
...
llvm-svn: 155959
2012-05-01 23:08:16 +00:00
Jakub Staszak
56c14bb368
Remove trailing spaces.
...
llvm-svn: 155956
2012-05-01 23:04:38 +00:00
Preston Gurd
bee1603263
This patch marks the X86 floating point stack registers ST0-ST7 as reserved
...
in order to avoid assertion failures in the register scavenger. The assertion
failures were “Bad machine code: Using an undefined physical register” and
“Bad machine code: MBB exits via unconditional fall-through but its successor
differs from its CFG successor!”.
llvm-svn: 155930
2012-05-01 19:50:22 +00:00
Manman Ren
2a032bd8f9
X86: optimization for max-like struct
...
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0
FROM
movl %edi, %ecx
subl %esi, %ecx
cmpl %edi, %esi
movl $0, %eax
cmovll %ecx, %eax
TO
xorl %eax, %eax
subl %esi, %edi
cmovll %eax, %edi
movl %edi, %eax
rdar: 10734411
llvm-svn: 155919
2012-05-01 17:16:15 +00:00
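The idea behind the max-like rewrite in r155919 above is that the subtraction already sets the flags the separate cmpl used to compute, so the compare can be dropped. A minimal C sketch of that equivalence follows. The function names are hypothetical, and the signed "less than" condition (SF != OF after the subtraction) is modeled with an ordinary comparison; this is not code from the patch.

```c
#include <stdint.h>

/* Reference semantics for (a > b) ? (a - b) : 0.
 * The difference is computed in unsigned arithmetic so that wraparound
 * matches subl instead of being undefined behavior in C. */
static int32_t max_like_ref(int32_t a, int32_t b) {
    return (a > b) ? (int32_t)((uint32_t)a - (uint32_t)b) : 0;
}

/* Model of the new sequence:
 *   xorl   %eax, %eax   -> zero = 0
 *   subl   %esi, %edi   -> diff = a - b, flags describe a versus b
 *   cmovll %eax, %edi   -> if a < b (signed), replace diff with 0
 *   movl   %edi, %eax   -> return the result
 * When a == b the difference is already 0, so no separate case is needed. */
static int32_t max_like_sub_cmov(int32_t a, int32_t b) {
    uint32_t zero = 0u;
    uint32_t diff = (uint32_t)a - (uint32_t)b;
    if (a < b) {
        diff = zero;
    }
    return (int32_t)diff;
}
```

The sketch only covers the first of the four source forms listed in the commit message; the other three reduce to the same comparison after swapping operands or noting that the equal case yields 0 either way.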
Alexey Samsonov
246af5318a
X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment
...
llvm-svn: 155917
2012-05-01 15:16:06 +00:00
Bill Wendling
003b1bf46c
Change the PassManager from a reference to a pointer.
...
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a reference to a null l-value. Make the
ivar a pointer instead.
PR12468
llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Craig Topper
1624fb0549
Allow BMI, AES, F16C, POPCNT, FMA3, and CLMUL to be detected on AMD processors.
...
llvm-svn: 155899
2012-05-01 07:10:32 +00:00
Craig Topper
405f995b07
Make XOP and FMA4 require SSE4A to match GCC behavior. Use this to simplify Bulldozer feature list.
...
llvm-svn: 155897
2012-05-01 06:54:48 +00:00
Craig Topper
0272669dd1
Attempt to handle MRMInitReg in emitVEXOpcodePrefix. Hopefully fixes PR12711.
...
llvm-svn: 155896
2012-05-01 06:34:01 +00:00
Craig Topper
d4974e4713
Make XOP imply AVX as it's needed to legalize the register types.
...
llvm-svn: 155891
2012-05-01 05:41:41 +00:00
Craig Topper
9fa14ed244
Remove HasSSE2 from AES and CLMUL predicates. It's now implied by the HasAES and HasCLMUL predicates.
...
llvm-svn: 155890
2012-05-01 05:35:02 +00:00
Craig Topper
50be3b60a4
Make CLMUL and AES imply SSE2 since it's needed to legalize the type.
...
llvm-svn: 155888
2012-05-01 05:28:32 +00:00
Craig Topper
cfc6060070
Enable AVX and FMA4 for AMD Bulldozer processors.
...
llvm-svn: 155885
2012-05-01 05:18:13 +00:00
Manman Ren
0a8b8b491f
X86: optimization for -(x != 0)
...
This patch will optimize -(x != 0) on X86
FROM
cmpl $0x01,%edi
sbbl %eax,%eax
notl %eax
TO
negl %edi
sbbl %eax, %eax
llvm-svn: 155853
2012-04-30 22:51:25 +00:00
Chad Rosier
0092397f80
Tidy up. No functional change intended.
...
llvm-svn: 155832
2012-04-30 17:47:15 +00:00
Derek Schuff
85abcc8498
Fix fastcc structure return with fast-isel on x86-32
...
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention in
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.
(this time, actually commit what was reviewed!)
llvm-svn: 155825
2012-04-30 16:57:15 +00:00
Craig Topper
78a563fd27
No need to normalize index before calling Extract128BitVector
...
llvm-svn: 155811
2012-04-30 05:17:10 +00:00
Pete Cooper
584ad8ab86
Copied all the VEX prefix encoding code from X86MCCodeEmitter to the x86 JIT emitter. Needs some major refactoring as these two code emitters are almost identical
...
llvm-svn: 155810
2012-04-30 03:56:44 +00:00
Jakub Staszak
f526e691cf
Remove unneeded casts. No functionality change.
...
llvm-svn: 155800
2012-04-29 20:52:53 +00:00
Craig Topper
ce1e652483
Simplify code a bit. No functional change intended.
...
llvm-svn: 155798
2012-04-29 20:22:05 +00:00
Derek Schuff
7fe1fbbe81
Revert r155745
...
llvm-svn: 155746
2012-04-27 23:37:41 +00:00