1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

13625 Commits

Author SHA1 Message Date
Johnny Chen
4a97a176e7 Add N3RegVShFrm to represent 3-Register Vector Shift Instructions, which do not
follow the N3RegFrm's operand order of D:Vd N:Vn M:Vm.  The operand order of
N3RegVShFrm is D:Vd M:Vm N:Vn (notice that M:Vm is the first src operand).

Add a parent class N3Vf which requires passing a Format argument and which the
N3V class is modified to inherit from.  N3V class represents the "normal"
3-Register NEON Instructions with N3RegFrm.

Also add a multiclass N3VSh_QHSD to represent clusters of NEON 3-Register Shift
Instructions and replace 8 invocations with it.

llvm-svn: 99655
2010-03-26 21:26:28 +00:00
Jim Grosbach
97d626c850 vldm/vstm can only do up to 16 double-word registers at a time.
Radar 7797856

llvm-svn: 99630
2010-03-26 18:41:09 +00:00
Johnny Chen
c986f10733 Add N3RegFrm to represent "NEON 3 vector register format" instructions.
Examples are VABA (Vector Absolute Difference and Accumulate), VABAL (Vector
Absolute Difference and Accumulate Long), and VABD (Vector Absolute Difference).

llvm-svn: 99628
2010-03-26 18:32:20 +00:00
Evan Cheng
d1ee7e0ba3 Do not sibcall if stack needs to be dynamically aligned.
llvm-svn: 99620
2010-03-26 16:26:03 +00:00
Evan Cheng
377bb993d8 Allow trivial sibcall of vararg callee when no arguments are being passed.
llvm-svn: 99598
2010-03-26 02:13:13 +00:00
Johnny Chen
a8b02d6451 Add N2RegVShLFrm and N2RegVShRFrm formats so that the disassembler can easily
dispatch to the appropriate routines to handle the different interpretations of
the shift amount encoded in the imm6 field.  The Vd, Vm fields are interpreted
the same between the two, though.

See, for example, A8.6.367 VQSHL, VQSHLU (immediate) for N2RegVShLFrm format and
A8.6.368 VQSHRN, VQSHRUN for N2RegVShRFrm format.

llvm-svn: 99590
2010-03-26 01:07:59 +00:00
Jim Grosbach
2a0b14a387 switch the flag for using NEON for SP floating point to a subtarget 'feature'.
Re-commit. This time complete with testsuite updates.

llvm-svn: 99570
2010-03-25 23:47:34 +00:00
Jim Grosbach
97d5bc2b86 need to fix 'make check' tests first. revert for a moment.
llvm-svn: 99569
2010-03-25 23:34:05 +00:00
Jim Grosbach
7e87ba79e6 switch the flag for using NEON for SP floating point to a subtarget 'feature'
llvm-svn: 99568
2010-03-25 23:32:19 +00:00
Johnny Chen
d56897bddc Removed instruction class NI from ARMInstrFormats.td.
It doesn't seem to be used anywhere.

llvm-svn: 99566
2010-03-25 23:11:56 +00:00
Jim Grosbach
b97ff2a4c1 switch the use-vml[as] instructions flag to a subtarget 'feature'
llvm-svn: 99565
2010-03-25 23:11:16 +00:00
Johnny Chen
38c9f64289 Add NVDupLnFrm and change NVDupLane class to use that format.
llvm-svn: 99557
2010-03-25 21:49:12 +00:00
Jim Grosbach
0975d55c8e ARM cortex-a8 doesn't do vmla/vmls well. disable them by default for that cpu
llvm-svn: 99549
2010-03-25 20:48:50 +00:00
Johnny Chen
58278a364d Add NVCVTFrm (NEON Convert with fractional bits immediate) and modify N2VImm to
expect a Format arg.  N2VCvtD/N2VCvtQ are modified to use the NVCVTFrm format.

llvm-svn: 99548
2010-03-25 20:39:04 +00:00
Daniel Dunbar
aeb4d40a70 Fix -Asserts warning, again.
llvm-svn: 99542
2010-03-25 19:35:53 +00:00
Jakob Stoklund Olesen
17f506ccdd Tag SSE2 integer instructions as SSEPackedInt.
llvm-svn: 99540
2010-03-25 18:52:04 +00:00
Jakob Stoklund Olesen
5a6e614de9 Teach TableGen to understand X.Y notation in the TSFlagsFields strings.
Remove much horribleness from X86InstrFormats as a result. Similar
simplifications are probably possible for other targets.

llvm-svn: 99539
2010-03-25 18:52:01 +00:00
Jakob Stoklund Olesen
5ca19faccc Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.

The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.

This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.

The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.

llvm-svn: 99524
2010-03-25 17:25:00 +00:00
Johnny Chen
cc491eff10 Added a new instruction class NVDupLane to be inherited by VDUPLND and VDUPLNQ,
instead of the current N2V.  Format of NVDupLane instances are set to NEONFrm
currently.

llvm-svn: 99518
2010-03-25 17:01:27 +00:00
Bob Wilson
04e9ff15cb Reapply Kevin's change 94440, now that Chris has fixed the limitation on
opcode values fitting in one byte (svn r99494).

llvm-svn: 99514
2010-03-25 16:36:14 +00:00
Chris Lattner
cda90fafdd eliminate a bunch more parallels now that scheduling
handles dead implicit results more aggressively.  More
to come, I think this is now just a data entry problem.

llvm-svn: 99486
2010-03-25 05:44:01 +00:00
Evan Cheng
d663ac8306 Disable folding loads into tail call in 32-bit PIC mode. It can introduce illegal code like this:
addl    $12, %esp
        popl    %esi
        popl    %edi
        popl    %ebx
        popl    %ebp
        jmpl    *__Block_deallocator-L1$pb(%esi)  # TAILCALL

The problem is the global base register is assigned GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class.

The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important the PIC base, it's not worth it especially since this is not an issue on 64-bit.

llvm-svn: 99455
2010-03-25 00:10:31 +00:00
Bob Wilson
d5673d9f1f Speculatively revert this to see if it fixes buildbot failures.
--- Reverse-merging r99440 into '.':
U    test/MC/AsmParser/X86/x86_32-bit_cat.s
U    test/MC/AsmParser/X86/x86_32-encoding.s
U    include/llvm/IntrinsicsX86.td
U    include/llvm/CodeGen/SelectionDAGNodes.h
U    lib/Target/X86/X86InstrSSE.td
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 99450
2010-03-24 23:26:29 +00:00
Kevin Enderby
9cab7fdb12 Added the Advanced Encryption Standard (AES) Instructions.
llvm-svn: 99440
2010-03-24 22:33:33 +00:00
Jim Grosbach
d285f71b9a Make the use of the vmla and vmls VFP instructions controllable via cmd line.
Preliminary testing shows significant performance wins by not using these
instructions.

llvm-svn: 99436
2010-03-24 22:31:46 +00:00
Kevin Enderby
8d8cb5c491 Fixed the SS42AI template for the SSE 4.2 instructions with TA prefix so it does
not get an "Unknown immediate size" assert failure when used.  All instructions 
of this form have an 8-bit immediate.  Also added a test case of an example
instruction that is of this form.

llvm-svn: 99435
2010-03-24 22:28:42 +00:00
Nate Begeman
c45fc4a137 Per chris's request, add some comments.
llvm-svn: 99434
2010-03-24 22:19:06 +00:00
Johnny Chen
f53b3e5b12 Trivial formating change.
llvm-svn: 99428
2010-03-24 21:25:07 +00:00
Nate Begeman
820b07ed10 BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts.
llvm-svn: 99423
2010-03-24 20:49:50 +00:00
Johnny Chen
3e5750fc46 Reverted r99326 which added NVdVmVCVTFrm, and later renamed to NVCVTFrm.
NVCVTFrm will later be used to describe "vcvt with fractional bits".

llvm-svn: 99415
2010-03-24 19:47:14 +00:00
Johnny Chen
31c01c1811 Reverted r99376. The disassembler will deal with the 2-reg format of these two
N3VX instructions using special case code.

llvm-svn: 99409
2010-03-24 18:46:34 +00:00
Jim Grosbach
7dc50db8fa tweak the arm if conversion heuristic
llvm-svn: 99402
2010-03-24 16:15:14 +00:00
Johnny Chen
d31726dba1 Mark VMOVDneon and VMOVQ as having the N2RegFrm form to help the disassembler.
llvm-svn: 99376
2010-03-24 01:29:25 +00:00
Chris Lattner
a44c023751 Switch INC8r to defining its pattern in terms of X86inc_flag
and defining the add pattern with Pat<>, eliminating a use of
parallel.

llvm-svn: 99375
2010-03-24 01:02:12 +00:00
Johnny Chen
dabf739480 Renamed NVdVmImmFrm and NVdVmVCVTFrm to the more proper N2RegFrm and NVCVTFrm,
respectively, and add some more comment.

llvm-svn: 99373
2010-03-24 00:57:50 +00:00
Chris Lattner
36f990dc18 switch SDTBinaryArithWithFlags to be a multiple-result node as well.
llvm-svn: 99370
2010-03-24 00:49:29 +00:00
Chris Lattner
0d53d0a634 Switch SDTUnaryArithWithFlags to being modeled as a two-result
ISD node.  The only change in the generated isel code are comments
like:

<                 // Src: (X86dec_flag:i16 GR16:i16:$src)
---
>                 // Src: (X86dec_flag:i16:i32 GR16:i16:$src)

because now it knows that X86dec_flag returns both an i16 (for the result)
and an i32 (for EFLAGS) in this case.  Wewt.

llvm-svn: 99369
2010-03-24 00:47:47 +00:00
Chris Lattner
e42b80792e remove 64-bit or_is_add parallels.
llvm-svn: 99360
2010-03-24 00:16:52 +00:00
Chris Lattner
6a8da47891 remove useless or_is_add parallel's.
llvm-svn: 99359
2010-03-24 00:15:23 +00:00
Chris Lattner
f468453934 reduce nesting.
llvm-svn: 99358
2010-03-24 00:12:57 +00:00
Jim Grosbach
b19d22fcae try being more permissive for if-conversion on ARM V7. see what the nightly
test run permformance numbers say as to whether it helps.

llvm-svn: 99355
2010-03-24 00:03:13 +00:00
Jakob Stoklund Olesen
1dba4a4389 Revert "Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings."
This reverts commit 99345. It was breaking buildbots.

llvm-svn: 99352
2010-03-23 23:48:51 +00:00
Chris Lattner
4eac41e12e [llvm_void_ty] is no longer needed for result types,
just use an empty result list.

llvm-svn: 99346
2010-03-23 23:46:07 +00:00
Jakob Stoklund Olesen
9df76f18b4 Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
This is work in progress. So far, SSE execution domain tables are added to
X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix.

llvm-svn: 99345
2010-03-23 23:14:44 +00:00
Johnny Chen
b7f2a26117 Renamed NVdImmFrm to N1RegModImmFrm.
llvm-svn: 99344
2010-03-23 23:09:14 +00:00
Johnny Chen
d97eb200cf Fix typo in the comment for N3VX class.
llvm-svn: 99328
2010-03-23 21:35:03 +00:00
Johnny Chen
8249bce25e Add comment.
llvm-svn: 99327
2010-03-23 21:30:12 +00:00
Johnny Chen
415ce90919 Add New NEON Format NVdVmVCVTFrm.
Converted some of the NEON vcvt instructions to this format.

llvm-svn: 99326
2010-03-23 21:25:38 +00:00
Johnny Chen
81fb570fda Add New NEON Format NVdVmImmFrm.
llvm-svn: 99322
2010-03-23 20:40:44 +00:00
Evan Cheng
e6099e3382 Teach isSafeToClobberEFLAGS to ignore dbg_value's. We need a MachineBasicBlock::iterator that does this automatically?
llvm-svn: 99320
2010-03-23 20:35:45 +00:00