1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

7422 Commits

Author SHA1 Message Date
Craig Topper
06ed6cb856 Add TB encoding to VEROALL, VZEROUPPER, and VCVTPS2PD to allow them to be disassembled. Fixes PR10723.
llvm-svn: 138551
2011-08-25 06:57:46 +00:00
Bruno Cardoso Lopes
5d34219953 Add support for 256-bit versions of VSHUFPD and VSHUFPS.
llvm-svn: 138546
2011-08-25 02:58:26 +00:00
Bruno Cardoso Lopes
ccffe56b5d Add memory version of SHUFPD to mask decoding!
llvm-svn: 138545
2011-08-25 02:58:21 +00:00
Bruno Cardoso Lopes
dfa5cf4620 Create a section for non-instructions patterns in the beginning of the
file, and move more code around!

llvm-svn: 138521
2011-08-24 23:18:11 +00:00
Bruno Cardoso Lopes
719b357628 Move code around!
llvm-svn: 138520
2011-08-24 23:18:09 +00:00
Bruno Cardoso Lopes
3824b766ac Organize UNPCK* patterns, also add remaining for AVX.
llvm-svn: 138519
2011-08-24 23:18:06 +00:00
Bruno Cardoso Lopes
82c8bc7efd Move remaining MOVDDUP patterns close to MOVDDUP defintion and duplicate
the missing ones for AVX.

llvm-svn: 138518
2011-08-24 23:18:04 +00:00
Bruno Cardoso Lopes
d315a6b6e6 Organize and tidy up MOVDDUP section. Also update comments!
llvm-svn: 138517
2011-08-24 23:18:02 +00:00
Bruno Cardoso Lopes
762fb13cc9 Move MOVHLPS patterns close to MOVHLPS definition, and duplicate the
pattern for 128-bit AVX mode.

llvm-svn: 138516
2011-08-24 23:17:59 +00:00
Bruno Cardoso Lopes
d62766849f Move all PSHUF* patterns close to the PSHUF* definitions. Also be
explicit about which subtarget they refer to, and add AVX versions of
the ones we currently don't. Remove old and now wrong comments!

llvm-svn: 138515
2011-08-24 23:17:57 +00:00
Bruno Cardoso Lopes
122f7cfc92 Move all SHUFP* patterns close to the SHUFP* definitions. Also be
explicit about which subtarget they refer to, and add AVX versions of
the ones we currently don't. Make the mask check more strict, to be
clear it won't be used to match to 256-bit versions!

llvm-svn: 138514
2011-08-24 23:17:55 +00:00
Eli Friedman
b6597a2e70 Hook up 64-bit atomic load/store on x86-32. I plan to write more efficient implementations eventually.
llvm-svn: 138505
2011-08-24 22:33:28 +00:00
Eli Friedman
e4cd816e7b Fix whitespace.
llvm-svn: 138487
2011-08-24 21:17:30 +00:00
Eli Friedman
6f95a6ae1b Basic x86 code generation for atomic load and store instructions.
llvm-svn: 138478
2011-08-24 20:50:09 +00:00
Bruno Cardoso Lopes
734febce18 Mark VZEROALL as clobbering all YMM registers
llvm-svn: 138461
2011-08-24 18:48:33 +00:00
Evan Cheng
420bf5446c Move TargetRegistry and TargetSelect from Target to Support where they belong.
These are strictly utilities for registering targets and components.

llvm-svn: 138450
2011-08-24 18:08:43 +00:00
Craig Topper
1da38a34a6 Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711.
llvm-svn: 138427
2011-08-24 06:14:18 +00:00
Bruno Cardoso Lopes
8959b54713 Fix a nasty bug where a v4i64 was being wrong emitted with 32-bit
permutations. Also tidy up some patterns and make them close to their
instruction definition!

llvm-svn: 138392
2011-08-23 22:06:37 +00:00
Evan Cheng
ed13551c1d Some refactoring so TargetRegistry.h no longer has to include any files
from MC.

llvm-svn: 138367
2011-08-23 20:15:21 +00:00
Nick Lewycky
11874a4e0a PerformSubCombine to work on integers larger than i128. Fixes a crasher.
llvm-svn: 138354
2011-08-23 19:01:24 +00:00
Craig Topper
67b22aedb4 Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding sclarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712.
llvm-svn: 138321
2011-08-23 04:36:33 +00:00
Bruno Cardoso Lopes
8024703a16 Introduce a pass to insert vzeroupper instructions to avoid AVX to
SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper"
llc command line option. This is only the first step (very naive and
conservative one) to sketch out the idea, but proper DFA is coming next
to allow smarter decisions. Comments and ideas now and in further commits
will be very appreciated.

llvm-svn: 138317
2011-08-23 01:14:17 +00:00
Benjamin Kramer
bd13a6a319 X86: Add some operand types required to identify calls.
llvm-svn: 138285
2011-08-22 22:55:32 +00:00
Bruno Cardoso Lopes
8007165688 Add support for breaking 256-bit int VETCC into two 128-bit ones,
avoding scalarization of the compare. Reduces code from 59 to 6
instructions. Fix PR10712.

llvm-svn: 138271
2011-08-22 20:31:04 +00:00
Bruno Cardoso Lopes
23ff325f5b Add 128-bit AVX codegen for PCMP* family of integer instructions
llvm-svn: 138270
2011-08-22 20:31:00 +00:00
Bruno Cardoso Lopes
9979e44f1b Re-write part of VEX encoding logic, to be more easy to read! Also fix
a bug and add a testcase!

llvm-svn: 138123
2011-08-19 22:27:29 +00:00
Craig Topper
f68d77215d Add TB encoding to VEX versions of SSE fp logical operations to fix disassembler
llvm-svn: 138034
2011-08-19 05:28:50 +00:00
Bruno Cardoso Lopes
306110c29a Fix PR10677. Initial patch and idea by Peter Cooper but I've changed the
implementation!

llvm-svn: 138029
2011-08-19 02:23:56 +00:00
Bruno Cardoso Lopes
0d458d4bb3 Re-encoded 128-bit AVX versions of SQRT, RSQRT, RCP have 3 operands
instead of 2. They were already defined this way in their regular
version, but not for the intrinsics versions (*_Int), and that would work
for assembly emission but not for object code, since a MachineOperand
would be missing. This commit fix PR10697.

Also removed the {VSQRT,VRSQRT,VRCP}r_Int forms and match the intrinsic
via INSERT_SUBREG+EXTRACT_SUBREG patterns. The same couldn't be done for
memory versions because sse_load_f32/sse_load_f64 operand need special
handling and don't work like regular "addr" operands.

There are right now 114 "*_Int" and 98 "Int_*" forms! I'm slowly
removing them as I step through, but hope we can get rid of these
someday, they are really annoying :)

llvm-svn: 138012
2011-08-18 23:59:21 +00:00
Bruno Cardoso Lopes
c174d8ac48 Cleanup vector logical ops in AVX and add use int versions for simple
v2i64

llvm-svn: 137919
2011-08-18 02:11:34 +00:00
Bruno Cardoso Lopes
82795e6b41 Fix PR10688. Add support for spliting 256-bit vector shifts when the
shift amount is variable

llvm-svn: 137885
2011-08-17 22:12:20 +00:00
Owen Anderson
3146968039 Allow the MCDisassembler to return a "soft fail" status code, indicating an instruction that is disassemblable, but invalid. Only used for ARM UNPREDICTABLE instructions at the moment.
Patch by James Molloy.

llvm-svn: 137830
2011-08-17 17:44:15 +00:00
Bruno Cardoso Lopes
98531dfd08 Introduce matching patterns for vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there are no ways to match loads, cause
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

llvm-svn: 137810
2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes
e3bab71a4b Update comments about vector splat handling in x86
llvm-svn: 137808
2011-08-17 02:29:13 +00:00
Bruno Cardoso Lopes
4ff4ed28af Now that we have a canonical way to handle 256-bit splats:
vinsertf128 $1 + vpermilps $0, remove the old code that used to first
do the splat in a 128-bit vector and then insert it into a larger one.
This is better because the handling code gets simpler and also makes a
better room for the upcoming vbroadcast!

llvm-svn: 137807
2011-08-17 02:29:10 +00:00
Bruno Cardoso Lopes
d64294fb0a Instead of always leaving the work to the generic legalizer when
there is no support for native 256-bit shuffles, be more smart in some
cases, for example, when you can extract specific 128-bit parts and use
regular 128-bit shuffles for them. Example:

For this shuffle:
  shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32>
                <i32 1, i32 0, i32 7, i32 6>

This was expanded to:
  vextractf128  $1, %ymm1, %xmm2
  vpextrq $0, %xmm2, %rax
  vmovd %rax, %xmm1
  vpextrq $1, %xmm2, %rax
  vmovd %rax, %xmm2
  vpunpcklqdq %xmm1, %xmm2, %xmm1
  vpextrq $0, %xmm0, %rax
  vmovd %rax, %xmm2
  vpextrq $1, %xmm0, %rax
  vmovd %rax, %xmm0
  vpunpcklqdq %xmm2, %xmm0, %xmm0
  vinsertf128 $1, %xmm1, %ymm0, %ymm0
  ret

Now we get:
  vshufpd $1, %xmm0, %xmm0, %xmm0
  vextractf128  $1, %ymm1, %xmm1
  vshufpd $1, %xmm1, %xmm1, %xmm1
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

llvm-svn: 137733
2011-08-16 18:21:54 +00:00
Bruno Cardoso Lopes
f026c60f3d While I'm here, remove the "_alt" hacks to a series of INSERT_SUBREG and
also add the AVX versions of the 128-bit patterns

llvm-svn: 137685
2011-08-15 23:36:51 +00:00
Bruno Cardoso Lopes
1e817d1451 Reorder declarations of vmovmskp* and also put the necessary AVX
predicate and TB encoding fields. This fix the encoding for the
attached testcase. This fixes PR10625.

llvm-svn: 137684
2011-08-15 23:36:45 +00:00
Jim Grosbach
31c0c9a1f6 MCTargetAsmParser target match predicate support.
Allow a target assembly parser to do context sensitive constraint checking
on a potential instruction match. This will be used, for example, to handle
Thumb2 IT block parsing.

llvm-svn: 137675
2011-08-15 23:03:29 +00:00
Bruno Cardoso Lopes
b81c3ed76d Fix PR10656. It's only profitable to use 128-bit inserts and extracts
when AVX mode is one. Otherwise is just more work for the type
legalizer.

llvm-svn: 137661
2011-08-15 21:45:54 +00:00
Bruno Cardoso Lopes
8bdbc680ea Fix comment!
llvm-svn: 137521
2011-08-12 21:54:42 +00:00
Bruno Cardoso Lopes
2d100ca13c The VPERM2F128 is a AVX instruction which permutes between two 256-bit
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

llvm-svn: 137519
2011-08-12 21:48:26 +00:00
Bruno Cardoso Lopes
17ae896095 Move code around and add comments
llvm-svn: 137518
2011-08-12 21:48:22 +00:00
Duncan Sands
10a9e984bc Silence a bunch (but not all) "variable written but not read" warnings
when building with assertions disabled.

llvm-svn: 137460
2011-08-12 14:54:45 +00:00
Andrew Trick
68f830e252 findDeadCallerSavedReg fix: Missing NULL terminator in register arrays.
Fix by Ivan Baev. Sorry I don't have a unit test, but the fix is obvious so I don't want to delay it.

llvm-svn: 137404
2011-08-12 00:49:19 +00:00
Bruno Cardoso Lopes
328a6a980b Add a dag combine to xform 256-bit shuffles into simple vector
inserts and extracts. This simple combine makes us generate only 1
instruction instead of 11 in the v8 case.

llvm-svn: 137362
2011-08-11 21:50:44 +00:00
Bruno Cardoso Lopes
38d4afa02f Fix PR10492 by teaching MOVHLPS and MOVLPS mask matching to be more strict.
llvm-svn: 137324
2011-08-11 18:59:13 +00:00
Nadav Rotem
2461d23062 Add a comment, per Bruno's CR.
llvm-svn: 137313
2011-08-11 17:05:47 +00:00
Nadav Rotem
de1b485f3f [AVX] If the data which is going to be saved is already in two XMM registers
(for example, after integer operation), do not pack the registers into a YMM
before saving. Its better to save as two XMM registers.

Before:
                vinsertf128         $1, %xmm3, %ymm0, %ymm3
                vinsertf128         $0, %xmm1, %ymm3, %ymm1
                vmovaps              %ymm1, 416(%rsp)

After:
                vmovaps              %xmm3, 416+16(%rsp)
                vmovaps              %xmm1, 416(%rsp)

llvm-svn: 137308
2011-08-11 16:41:21 +00:00
Bruno Cardoso Lopes
4106caa9af Cleanup: Remove Int_ CVTSS2SI* forms
llvm-svn: 137297
2011-08-11 02:52:36 +00:00