1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00
Commit Graph

22390 Commits

Author SHA1 Message Date
Reed Kotler
abe153fc74 blank line for test commit
llvm-svn: 164640
2012-09-25 22:34:20 +00:00
Sebastian Pop
270e075dec TargetLowering interface to set/get minimum block entries for jump tables.
Provide interface in TargetLowering to set or get the minimum number of basic
blocks whereby jump tables are generated for switch statements rather than an
if sequence.

    getMinimumJumpTableEntries() defaults to 4.
    setMinimumJumpTableEntries() allows target configuration.

    This patch changes the default for the Hexagon architecture to 5
    as it improves performance on some benchmarks.

llvm-svn: 164628
2012-09-25 20:35:36 +00:00
Michael Liao
b50e89ddce Add missing i64 max/min/umax/umin on 32-bit target
- Turn on atomic6432.ll and add specific test case as well

llvm-svn: 164616
2012-09-25 18:08:13 +00:00
Jim Grosbach
97e019d375 ARM: Darwin BL/BLX relocations to out-of-range symbols.
When a BL/BLX references a symbol in the same translation unit that is
out of range, use an external relocation. The linker will use this to
generate a branch island rather than a direct reference, allowing the
relocation to resolve correctly.

rdar://12359919

llvm-svn: 164615
2012-09-25 18:07:17 +00:00
Bob Wilson
f2dc7ae3f5 Consistently specify the assembly variant to MatchInstructionImpl.
llvm-svn: 164611
2012-09-25 17:19:29 +00:00
Evan Cheng
a5b2ee52c9 Fix an illegal tailcall opt where the callee returns a double via xmm while caller returns x86_fp80 via st0. rdar://12229511
llvm-svn: 164588
2012-09-25 05:32:34 +00:00
Jim Grosbach
d5ba471995 ARM: 'add Rd, pc, #imm' is an alias for 'adr Rd, #imm'.
rdar://9795790

llvm-svn: 164577
2012-09-25 00:08:13 +00:00
Jim Grosbach
484960af64 Mark jump tables in code sections with DataRegion directives.
Even out-of-line jump tables can be in the code section, so mark them
as data-regions for those targets which support the directives.

rdar://12362871&12362974

llvm-svn: 164571
2012-09-24 23:06:27 +00:00
Chad Rosier
4c89e0343a Rather then have a wrapper function, have tblgen instantiate the implementation.
Also remove an unused argument.

llvm-svn: 164567
2012-09-24 22:57:55 +00:00
Roman Divacky
87f9c41b1c Specify MachinePointerInfo as refering to the argument value and offset of the
store when handling byval arguments. Thus preventing reordering of the store
with load with post-RA scheduler.

llvm-svn: 164553
2012-09-24 20:47:19 +00:00
Chad Rosier
599b467187 Rather then have a wrapper function, have tblgen instantiate the implementation.
llvm-svn: 164548
2012-09-24 19:32:29 +00:00
NAKAMURA Takumi
8377e86232 ARMInstPrinter.cpp: Fix a warning in -Asserts. [-Wunused-variable]
llvm-svn: 164459
2012-09-22 13:12:28 +00:00
NAKAMURA Takumi
79a8490f7a Whitespace.
llvm-svn: 164458
2012-09-22 13:12:22 +00:00
Tim Northover
7cab153d37 Fix edge cases of ARM shift operands in arith instructions.
As before with load instructions, oddities like "asr #32", "rrx" could
be printed incorrectly.

Patch by Chris Lidbury.

llvm-svn: 164456
2012-09-22 11:18:19 +00:00
Tim Northover
1c60305666 Fix the handling of edge cases in ARM shifted operands.
This patch fixes load/store instructions to handle less common cases
like "asr #32", "rrx" properly throughout the MC layer.

Patch by Chris Lidbury.

llvm-svn: 164455
2012-09-22 11:18:12 +00:00
Michael Liao
3d9c40c0c8 Fix 16-bit atomic inst encoding and keep pseudo-inst starting with '#'
llvm-svn: 164453
2012-09-22 05:41:15 +00:00
Michael Liao
0a4f3eefaf Fix typo in r164357
llvm-svn: 164452
2012-09-22 03:39:42 +00:00
Akira Hatanaka
cf8158381d MIPS DSP: Add immediate leaves.
llvm-svn: 164435
2012-09-22 00:07:12 +00:00
Akira Hatanaka
e8ffbb3ace MIPS DSP: Add predicates and instruction template.
llvm-svn: 164434
2012-09-22 00:06:06 +00:00
Akira Hatanaka
4acf68deb2 Add MIPS DSP register classes. Set actions of DSP vector operations and override
TargetLowering's callback functions.

llvm-svn: 164431
2012-09-21 23:58:31 +00:00
Akira Hatanaka
5ead4f3d78 SelectionDAG node enums for MIPS DSP nodes.
llvm-svn: 164430
2012-09-21 23:52:47 +00:00
Akira Hatanaka
00202df6d5 Add MIPS accumulator and DSP control registers.
llvm-svn: 164429
2012-09-21 23:48:37 +00:00
Akira Hatanaka
d89661f8bd Add flags and feature bits for mips dsp.
llvm-svn: 164428
2012-09-21 23:41:49 +00:00
Chad Rosier
fd5e542cea [ms-inline asm] Expose the mnemonicIsValid() function in the AsmParser.
llvm-svn: 164420
2012-09-21 22:21:26 +00:00
Chad Rosier
bfd7fc3e7e Add comment.
llvm-svn: 164415
2012-09-21 21:08:46 +00:00
Chad Rosier
2cc6afaac6 Add comment.
llvm-svn: 164414
2012-09-21 20:51:43 +00:00
Chad Rosier
a58913fc00 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-aligned i32 loads/stores.
rdar://12304911

llvm-svn: 164381
2012-09-21 16:58:35 +00:00
Michael Liao
9a17cba52b Fix a typo in r164357
llvm-svn: 164372
2012-09-21 16:03:03 +00:00
Bill Wendling
38bffadcad Make the 'get*AlignmentFromAttr' functions into member functions within the Attributes class. Now with fix.
llvm-svn: 164370
2012-09-21 15:26:31 +00:00
Andrew Trick
2545253eda Cortex-A9 latency fixes (w/ -schedmodel only).
Quick review against the manual revealed a few obvious mistakes.

llvm-svn: 164361
2012-09-21 05:06:40 +00:00
Michael Liao
2197b133f8 Add missing i8 max/min/umax/umin support
- Fix PR5145 and turn on test 8-bit atomic ops

llvm-svn: 164358
2012-09-21 03:18:52 +00:00
Michael Liao
439a9cea68 Revise td of X86 atomic instructions
- Rewirte most atomic instructions in templates for both better
  maintenance and future extensions, such as HLE in TSX.

llvm-svn: 164357
2012-09-21 03:00:17 +00:00
NAKAMURA Takumi
6900d8a214 Mips16FrameLowering.cpp: Remove unused TII introduced in r164349. [-Wunused-variable]
llvm-svn: 164354
2012-09-21 02:21:30 +00:00
Akira Hatanaka
39d54479a3 Properly save and restore RA and Mips16 callee save registers S0,S1
Patch by Reed Kotler.

llvm-svn: 164349
2012-09-21 01:08:16 +00:00
Chad Rosier
8a1b0217f6 [fast-isel] Fallback to SelectionDAG isel if we require strict alignment for
non-halfword-aligned i16 loads/stores.
rdar://12304911

llvm-svn: 164345
2012-09-21 00:41:42 +00:00
Jim Grosbach
cfecc18fc8 Tidy up. Whitespace.
llvm-svn: 164344
2012-09-21 00:36:42 +00:00
Jim Grosbach
8293ae4ed7 Tidy up. Formatting.
llvm-svn: 164343
2012-09-21 00:26:53 +00:00
Jim Grosbach
135898ebe3 ARM: Use a dedicated intrinsic for vector bitwise select.
The expression based expansion too often results in IR level optimizations
splitting the intermediate values into separate basic blocks, preventing
the formation of the VBSL instruction as the code author intended. In
particular, LICM would often hoist part of the computation out of a loop.

rdar://11011471

llvm-svn: 164340
2012-09-21 00:18:20 +00:00
Bill Wendling
65a9731d9c Revert r164308 to fix buildbots.
llvm-svn: 164309
2012-09-20 16:59:57 +00:00
Bill Wendling
89e5c2d955 Make the 'get*AlignmentFromAttr' functions into member functions within the Attributes class.
llvm-svn: 164308
2012-09-20 16:27:05 +00:00
Craig Topper
2eb5a713a8 Change enum type in a static table to uint8_t instead. Saves about 700 hundred bytes of static data. Change unsigned char in same table to uint8_t for explicitness.
llvm-svn: 164285
2012-09-20 06:14:08 +00:00
Michael Liao
34658dca78 Re-work X86 code generation of atomic ops with spin-loop
- Rewrite/merge pseudo-atomic instruction emitters to address the
  following issue:
  * Reduce one unnecessary load in spin-loop

    previously the spin-loop looks like

        thisMBB:
        newMBB:
          ld  t1 = [bitinstr.addr]
          op  t2 = t1, [bitinstr.val]
          not t3 = t2  (if Invert)
          mov EAX = t1
          lcs dest = [bitinstr.addr], t3  [EAX is implicit]
          bz  newMBB
          fallthrough -->nextMBB

    the 'ld' at the beginning of newMBB should be lift out of the loop
    as lcs (or CMPXCHG on x86) will load the current memory value into
    EAX. This loop is refined as:

        thisMBB:
          EAX = LOAD [MI.addr]
        mainMBB:
          t1 = OP [MI.val], EAX
          LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
          JNE mainMBB
        sinkMBB:

  * Remove immopc as, so far, all pseudo-atomic instructions has
    all-register form only, there is no immedidate operand.

  * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
    td

  * Fix issues in PR13458

- Add comprehensive tests on atomic ops on various data types.
  NOTE: Some of them are turned off due to missing functionality.

- Revise tests due to the new spin-loop generated.

llvm-svn: 164281
2012-09-20 03:06:15 +00:00
Michael Liao
2730b7865e Unify the logic in SelectAtomicLoadAdd and SelectAtomicLoadArith
- Merge the processing of LOAD_ADD with other atomic load-arith
  operations
- Separate the logic getting target constant for atomic-load-op and add
  an optimization for atomic-load-add on i16 with negative value
- Optimize a minor case for atomic-fetch-add i16 with negative operand. Test
  case is revised.

llvm-svn: 164243
2012-09-19 19:36:58 +00:00
Bill Schmidt
4e7e64ff70 Small structs for PPC64 SVR4 must be passed right-justified in registers.
lib/Target/PowerPC/PPCISelLowering.{h,cpp}
 Rename LowerFormalArguments_Darwin to LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerFormalArguments_SVR4 to LowerFormalArguments_32SVR4.
 Receive small structs right-justified in LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerCall_Darwin to LowerCall_Darwin_Or_64SVR4.
 Rename LowerCall_SVR4 to LowerCall_32SVR4.
 Pass small structs right-justified in LowerCall_Darwin_Or_64SVR4.

test/CodeGen/PowerPC/structsinregs.ll
 New test.

llvm-svn: 164228
2012-09-19 15:42:13 +00:00
Craig Topper
abbf768c15 Remove code for setting the VEX L-bit as a function of operand size from the code emitters and the disassembler table builder. Fix a couple instructions that were still missing VEX_L.
llvm-svn: 164204
2012-09-19 06:37:45 +00:00
Craig Topper
7c37abcace Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit.
llvm-svn: 164202
2012-09-19 06:06:34 +00:00
Evan Cheng
1a3416521f MOVi16 (movw) is only legal on cpus with V6T2 support. rdar://12300648
llvm-svn: 164169
2012-09-18 21:24:16 +00:00
Roman Divacky
748e9dfd91 Fix the isLocalCall() by checking for linker weakness as well.
llvm-svn: 164155
2012-09-18 18:27:49 +00:00
Akira Hatanaka
a1ab530be9 Revert r164051.
llvm-svn: 164150
2012-09-18 18:08:25 +00:00
Roman Divacky
bb7740900c Avoid symbol name clash when filling TOC.
Patch by Adhemerval Zanella.

llvm-svn: 164141
2012-09-18 17:10:37 +00:00
Roman Divacky
377f342a56 On PPC64 emit the environment pointer. Patch by Adhemerval Zanella.
llvm-svn: 164139
2012-09-18 16:55:29 +00:00
Roman Divacky
953cd43dfa Optimize local func calls to not emit nop for TOC restoration.
Patch by Adhemerval Zanella.

llvm-svn: 164138
2012-09-18 16:47:58 +00:00
Roman Divacky
e91b4521bf When creating MCAsmBackend pass the CPU string as well. In X86AsmBackend
store this and use it to not emit long nops when the CPU is geode which
doesnt support them.

Fixes PR11212.

llvm-svn: 164132
2012-09-18 16:08:49 +00:00
James Molloy
4cb3751b3e More domain conversion; convert VFP VMOVS to NEON instructions in more cases - when we may clobber the other S-lane by converting an S to a D instruction, make an effort to work out if the S lane is clobberable or not.
llvm-svn: 164114
2012-09-18 08:31:15 +00:00
Andrew Trick
65c7aae93f TableGen subtarget emitter. Initialize MCSubtargetInfo with the new machine model.
llvm-svn: 164092
2012-09-18 03:18:56 +00:00
Evan Cheng
82c85585f9 Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte
aligned address. Based on patch by David Peixotto.

Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment
hints. rdar://12090772, rdar://12238782

llvm-svn: 164089
2012-09-18 01:42:45 +00:00
Andrew Trick
150c97940b Revert r164061-r164067. Most of the new subtarget emitter.
I have to work out the Target/CodeGen header dependencies
before putting this back.

llvm-svn: 164072
2012-09-17 23:00:42 +00:00
Andrew Trick
8a499d1f62 TableGen subtarget emitter. Initialize MCSubtargetInfo with the new machine model.
llvm-svn: 164061
2012-09-17 22:18:55 +00:00
Jan Wen Voung
bd8575d1d7 Add some cases to x86 OptimizeCompare to handle DEC and INC, too.
While we are setting the earlier def to true, also make it live.

llvm-svn: 164056
2012-09-17 22:04:23 +00:00
Akira Hatanaka
c0b9726fe7 Make sure there is enough room for RA. getStackSize needs to be cleaned up but
we will do that when we implement the full save/restore.

Patch by Reed Kotler.

llvm-svn: 164051
2012-09-17 20:02:42 +00:00
Benjamin Kramer
2844c979a6 LLVM_ATTRIBUTE_USED forces emission of a function. To silence unused function warnings use LLVM_ATTRIBUTE_UNUSED.
llvm-svn: 164036
2012-09-17 16:46:22 +00:00
Silviu Baranga
aa267976b5 Removed the VMLxForwarding feature for the Cortex-A15 target.
llvm-svn: 164030
2012-09-17 14:10:54 +00:00
Craig Topper
389226c0ae Change unsigned to uint32_t to match base class declaration and other targets.
llvm-svn: 164001
2012-09-16 18:10:23 +00:00
Nadav Rotem
c790bc0984 The PMOVZXWD family of functions had patterns extends narrow vector types to wide vector types.
It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.

rdar://11897677

llvm-svn: 163995
2012-09-16 07:39:07 +00:00
Craig Topper
95869a202b Use LLVM_DELETED_FUNCTION in place of 'DO NOT IMPLEMENT' comments.
llvm-svn: 163974
2012-09-15 17:09:36 +00:00
Craig Topper
ded986759d Remove unused private fields to silence -Wunused-private-field.
llvm-svn: 163973
2012-09-15 17:08:51 +00:00
Benjamin Kramer
3be8d89f89 X86: Emitting x87 fsin/fcos for sinf/cosf is not safe without unsafe fp math.
This was only an issue if sse is disabled.

llvm-svn: 163967
2012-09-15 12:44:27 +00:00
Akira Hatanaka
03b00bdf4d Remove aligned/unaligned load/store fragments defined in MipsInstrInfo.td and
use load/store fragments defined in TargetSelectionDAG.td in place of them.
Unaligned loads/stores are either expanded or lowered to target-specific nodes,
so instruction selection should see only aligned load/store nodes.

No changes in functionality.

llvm-svn: 163960
2012-09-15 01:52:08 +00:00
Akira Hatanaka
5540fca519 Handled unaligned load/stores properly in Mips16
Patch by Reed Kotler.

llvm-svn: 163956
2012-09-15 01:02:03 +00:00
Andrew Trick
1659f12c7b Implement getNumLDMAddresses and expose through ARMBaseInstrInfo.
llvm-svn: 163922
2012-09-14 18:48:46 +00:00
Andrew Trick
5b81497e7b Cortex-A9 instruction-level scheduling machine model.
This models the A9 processor at the level of instruction operands, as
opposed to the itinerary, which models each operation at the level of
pipeline stages.

The two primary motivations are:

1) Allow MachineScheduler to model A9 as an out-of-order processor. It
can now distinguish between hazards that force interlocking vs.
buffered resources.

2) Reduce long-term maintenance by allowing the itinerary and target
hooks to eventually be removed. Note that almost all of the complexity
in the new model exists to model instruction variants, which the
itinerary cannot handle. Instead the scheduler previously relied on
processor-specific target hooks which are incomplete and buggy.

llvm-svn: 163921
2012-09-14 18:31:58 +00:00
Sergei Larin
3c6b1bfa42 DAG post-process for Hexagon MI scheduler
This patch introduces a possibility for Hexagon MI scheduler
to perform some target specific post- processing on the scheduling
DAG prior to scheduling.

llvm-svn: 163903
2012-09-14 15:07:59 +00:00
Dmitri Gribenko
93c7ec80b7 Fix Doxygen issues:
* wrap code blocks in \code ... \endcode;
* refer to parameter names in paragraphs correctly (\arg is not what most
  people want -- it starts a new paragraph);
* use \param instead of \arg to document parameters in order to be consistent
  with the rest of the codebase.

llvm-svn: 163902
2012-09-14 14:57:36 +00:00
Benjamin Kramer
36c359546f Remove redundant private field.
clang warned about this being unused in Release builds.

llvm-svn: 163899
2012-09-14 12:19:58 +00:00
Akira Hatanaka
0944ce5e68 mips16 fixes.
1. Add MoveR3216
2. Correct spelling for Move32R16

Patch by Reed Kotler.

llvm-svn: 163869
2012-09-14 03:21:56 +00:00
Michael Liao
5eea004951 Fix comment
llvm-svn: 163835
2012-09-13 20:30:16 +00:00
Michael Liao
0c0da113c5 Add wider vector/integer support for PR12312
- Enhance the fix to PR12312 to support wider integer, such as 256-bit
  integer. If more than 1 fully evaluated vectors are found, POR them
  first followed by the final PTEST.

llvm-svn: 163832
2012-09-13 20:24:54 +00:00
Jakob Stoklund Olesen
72138019a9 Fix the TCRETURNmi64 bug differently.
Add a PatFrag to match X86tcret using 6 fixed registers or less. This
avoids folding loads into TCRETURNmi64 using 7 or more volatile
registers.

<rdar://problem/12282281>

llvm-svn: 163819
2012-09-13 18:31:27 +00:00
Akira Hatanaka
a6138a9115 mips16: When copying operands in a conditional branch instruction, allow for
immediate operands to be copied.

Patch by Reed Kotler.

llvm-svn: 163811
2012-09-13 17:12:37 +00:00
Jakob Stoklund Olesen
eae8fc91cf Revert r163761 "Don't fold indexed loads into TCRETURNmi64."
The patch caused "Wrong topological sorting" assertions.

llvm-svn: 163810
2012-09-13 16:52:17 +00:00
Silviu Baranga
11ff2a551d This patch introduces A15 as a target in LLVM.
llvm-svn: 163803
2012-09-13 15:05:10 +00:00
Craig Topper
e2e98bb26b Add a new compression type to ModRM table that detects when the memory modRM byte represent 8 instructions and the reg modRM byte represents up to 64 instructions. Reduces modRM table from 43k entreis to 25k entries. Based on a patch from Manman Ren.
llvm-svn: 163774
2012-09-13 05:45:42 +00:00
Jakob Stoklund Olesen
b15912aafd Don't fold indexed loads into TCRETURNmi64.
We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>

llvm-svn: 163761
2012-09-13 00:25:00 +00:00
Akira Hatanaka
2706c4f9e2 Misc.
1. Remove RA from list of allocatable registers
2. Enable d,y,r constraint inline assembly instructions

Patch by Reed Kotler.

llvm-svn: 163753
2012-09-12 23:27:55 +00:00
Michael Liao
e600a8a616 Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.

llvm-svn: 163743
2012-09-12 21:43:09 +00:00
Chad Rosier
e57a278d0d [ms-inline asm] Make the operand size directives case insensitive.
llvm-svn: 163729
2012-09-12 18:24:26 +00:00
Dmitri Gribenko
8982c8a34d Fix a couple of Doxygen comment issues pointed out by -Wdocumentation.
llvm-svn: 163721
2012-09-12 16:59:47 +00:00
Roman Divacky
5ad9880cb1 Enable exceptions handling on PPC64 now that cr misaligned spilling
was fixed in r163713.

llvm-svn: 163715
2012-09-12 15:29:32 +00:00
Roman Divacky
3d302860e6 This patch corrects logic in PPCFrameLowering for save and restore of
nonvolatile condition register fields across calls under the SVR4 ABIs.                                            
                                                                                                                   
 * With the 64-bit ABI, the save location is at a fixed offset of 8 from                                           
the stack pointer.  The frame pointer cannot be used to access this                                                
portion of the stack frame since the distance from the frame pointer may                                           
change with alloca calls.                                                                                          
                                                                                                                   
 * With the 32-bit ABI, the save location is just below the general
register save area, and is accessed via the frame pointer like the rest
of the save areas.  This is an optional slot, so it must only be created                                           
if any of CR2, CR3, and CR4 were modified.                                                                      
                                                                                                                   
 * For both ABIs, save/restore logic is generated only if one of the     
nonvolatile CR fields were modified.                                   

I also took this opportunity to clean up an extra FIXME in
PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are meaningless
for the 64-bit ABI, so I removed them for correctness and efficiency.


Fixes PR13708 and partially also PR13623. It lets us enable exception handling
on PPC64.

Patch by William J. Schmidt!

llvm-svn: 163713
2012-09-12 14:47:47 +00:00
Roman Divacky
a811b158e5 Add support for AMD Geode.
llvm-svn: 163710
2012-09-12 14:36:02 +00:00
Craig Topper
24d6cafc79 Indentation fixes. No functional change.
llvm-svn: 163682
2012-09-12 06:20:41 +00:00
Chad Rosier
786e0643c5 Rename the isMemory() function to isMem(). No functional change intended.
llvm-svn: 163654
2012-09-11 23:02:35 +00:00
Manman Ren
1a047422a0 Release build: guard dump functions with
"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"

No functional change. Update r163339.

llvm-svn: 163653
2012-09-11 22:23:19 +00:00
Chad Rosier
c778c0a3f4 StringSwitchify.
llvm-svn: 163649
2012-09-11 21:10:25 +00:00
Chad Rosier
e7a6502bbe Simplify logic. No functional change intended.
llvm-svn: 163648
2012-09-11 20:57:04 +00:00
Jakob Stoklund Olesen
8a149baa44 Add TRI::getSubRegIndexLaneMask().
Sub-register lane masks are bitmasks that can be used to determine if
two sub-registers of a virtual register will overlap. For example, ARM's
ssub0 and ssub1 sub-register indices don't overlap each other, but both
overlap dsub0 and qsub0.

The lane masks will be accurate on most targets, but on targets that use
sub-register indexes in an irregular way, the masks may conservatively
report that two sub-register indices overlap when the eventually
allocated physregs don't.

Irregular register banks also mean that the bits in a lane mask can't be
mapped onto register units, but the concept is similar.

llvm-svn: 163630
2012-09-11 16:34:08 +00:00
Craig Topper
557b8a5a81 Make a bunch of lowering helper functions static instead of member functions. No functional change.
llvm-svn: 163596
2012-09-11 06:15:32 +00:00
Craig Topper
c9fd7a1602 Change unsigned to a uint16_t in static disassembler tables to reduce the table size.
llvm-svn: 163594
2012-09-11 04:19:21 +00:00
Andrew Trick
ffec33601b Reorganize MachineScheduler interfaces and publish them in the header.
The Hexagon target decided to use a lot of functionality from the
target-independent scheduler. That's fine, and other targets should be
able to do the same. This reorg and API update makes that easy.

For the record, ScheduleDAGMI was not meant to be subclassed. Instead,
new scheduling algorithms should be able to implement
MachineSchedStrategy and be done. But if need be, it's nice to be
able to extend ScheduleDAGMI, so I also made that easier. The target
scheduler is somewhat more apt to break that way though.

llvm-svn: 163580
2012-09-11 00:39:15 +00:00
Chad Rosier
419fa9e0b0 Update function names to conform to guidelines. No functional change intended.
llvm-svn: 163561
2012-09-10 22:50:57 +00:00
Chad Rosier
3758eb20e1 Revert r163556. Missed updates to tablegen files.
llvm-svn: 163557
2012-09-10 22:30:35 +00:00
Chad Rosier
8bbcf2b7ae Update function names to conform to guidelines. No functional change intended.
llvm-svn: 163556
2012-09-10 22:23:45 +00:00
Dmitri Gribenko
1d75adbbb2 Remove redundant semicolons which are null statements.
llvm-svn: 163547
2012-09-10 21:26:47 +00:00
Chad Rosier
054e489dd3 [ms-inline asm] Pass the correct AsmVariant to the PrintAsmOperand() function
and update the printOperand() function accordingly.

llvm-svn: 163544
2012-09-10 21:10:49 +00:00
Chad Rosier
e1355ead98 [ms-inline asm] Add support for .att_syntax directive.
llvm-svn: 163542
2012-09-10 20:54:39 +00:00
Jakob Stoklund Olesen
ab77839866 Don't attempt to use flags from predicated instructions.
The ARM backend can eliminate cmp instructions by reusing flags from a
nearby sub instruction with similar arguments.

Don't do that if the sub is predicated - the flags are not written
unconditionally.

<rdar://problem/12263428>

llvm-svn: 163535
2012-09-10 19:17:25 +00:00
Michael Liao
7dfa5e2092 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix an remaining issue of PR11674 as well

llvm-svn: 163528
2012-09-10 18:33:51 +00:00
Sergei Larin
adf81918db Add "blocked" heuristic to the Hexagon MI scheduler.
Improve AQ instruction selection in the Hexagon MI scheduler.

llvm-svn: 163523
2012-09-10 17:31:34 +00:00
Michael Liao
2791a08d7e Add boolean simplification support from CMOV
- If a boolean value is generated from CMOV and tested as boolean value,
  simplify the use of test result by referencing the original condition.
  RDRAND intrinisc is one of such cases.

llvm-svn: 163516
2012-09-10 16:36:16 +00:00
Elena Demikhovsky
56cdc6a59a The VPSHUFB 256-bit instruction may be generated when one of input vector is undefined or zeroinitializer.
I've added the "zeroinitializer" case in this patch.

llvm-svn: 163506
2012-09-10 12:13:11 +00:00
Benjamin Kramer
37ce5fbc3b Make helper function static.
llvm-svn: 163504
2012-09-10 11:52:14 +00:00
Nick Lewycky
ad25150d03 Add missing space before {. No functionality change.
llvm-svn: 163484
2012-09-09 23:40:55 +00:00
Craig Topper
a91d731898 Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.
llvm-svn: 163473
2012-09-08 17:42:27 +00:00
Craig Topper
9bda7e421e Use 256-bit alignment for constant pool value for 256-bit vector FNEG lowering.
llvm-svn: 163463
2012-09-08 07:46:05 +00:00
Craig Topper
53ec08b4fc Add support for lowering FABS of vector types.
llvm-svn: 163461
2012-09-08 07:31:51 +00:00
Craig Topper
eb1db45675 Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.
llvm-svn: 163458
2012-09-08 04:58:43 +00:00
Benjamin Kramer
f7e00de5d0 Fix alignment of .comm and .lcomm on mingw32.
For some reason .lcomm uses byte alignment and .comm log2 alignment so we can't
use the same setting for both. Fix this by reintroducing the LCOMM enum.
I verified this against mingw's gcc.

llvm-svn: 163420
2012-09-07 21:08:01 +00:00
Jakob Stoklund Olesen
b062caadab Custom DAGCombine for and/or/xor are for all ARMs.
The 'select' transformations apply to all ARM architectures and don't
require hasV6T2Ops.

llvm-svn: 163396
2012-09-07 17:34:15 +00:00
Benjamin Kramer
f7fdee3ce3 MC: Overhaul handling of .lcomm
- Darwin lied about not supporting .lcomm and turned it into zerofill in the
  asm parser. Push the zerofill-conversion down into macho-specific code.
- This makes the tri-state LCOMMType enum superfluous, there are no targets
  without .lcomm.
- Do proper error reporting when trying to use .lcomm with alignment on a target
  that doesn't support it.
- .comm and .lcomm alignment was parsed in bytes on COFF, should be power of 2.
- Fixes PR13755 (.lcomm crashes on ELF).

llvm-svn: 163395
2012-09-07 17:25:13 +00:00
Benjamin Kramer
2c1f1b0513 PR13754: llvm-mc/x86 crashes on .cfi directives without the % prefix for registers.
gas accepts this and it seems to be common enough to be worth supporting. This
doesn't affect the parsing of reg operands outside of .cfi directives.

llvm-svn: 163390
2012-09-07 14:51:35 +00:00
Benjamin Kramer
2e362224b1 MipsAsmParser: Fix a couple of string use-after-frees and misuses of classof.
llvm-svn: 163383
2012-09-07 09:47:42 +00:00
Jack Carter
93a95cbdde The Mips standalone assembler aliased instruction support.
The assembler can alias one instruction into another based
on the operands. For example the jump instruction "J" takes
and immediate operand, but if the operand is a register the
assembler will change it into a jump register "JR" instruction.

These changes are in the instruction td file.

Test cases included

Contributer: Vladimir Medic
llvm-svn: 163368
2012-09-07 01:42:38 +00:00
Jack Carter
d4ab2f65df The Mips standalone assembler intial directive support.
Actually these are just stubs for parsing the directives.
Semantic support will come later.

Test cases included

Contributer: Vladimir Medic
llvm-svn: 163364
2012-09-07 00:48:02 +00:00
Jack Carter
0a824e63ab The Mips standalone assembler fpu instruction support.
Test cases included

Contributer: Vladimir Medic
llvm-svn: 163363
2012-09-07 00:23:42 +00:00
David Blaikie
d96efbd7a3 Remove unused variable introduced by r163346.
llvm-svn: 163359
2012-09-06 23:31:29 +00:00
Jack Carter
b3ec1ea360 The Mips standalone assembler memory instruction support.
This includes sb,sc,sh,sw,lb,lw,lbu,lh,lhu,ll,lw

Test case included

Contributer: Vladimir Medic
llvm-svn: 163346
2012-09-06 20:00:02 +00:00
Manman Ren
b9d2a6fa2e Release build: guard dump functions with "ifndef NDEBUG"
No functional change.

llvm-svn: 163339
2012-09-06 19:06:06 +00:00
Tim Northover
bfaeb1ab9d Diagnose invalid alignments on duplicating VLDn instructions.
Patch by Chris Lidbury.

llvm-svn: 163323
2012-09-06 15:27:12 +00:00
Tim Northover
b12fa01bc6 Check for invalid alignment values when decoding VLDn/VSTn (single ln) instructions.
Patch by Chris Lidbury.

llvm-svn: 163321
2012-09-06 15:17:49 +00:00
Tim Northover
1c637c210f Use correct part of complex operand to encode VST1 alignment.
Patch by Chris Lidbury.

llvm-svn: 163318
2012-09-06 14:36:55 +00:00
Elena Demikhovsky
9339eef307 AVX2 optimization.
Added generation of VPSHUB instruction for <32 x i8> vector shuffle when possible.

llvm-svn: 163312
2012-09-06 12:42:01 +00:00
Nadav Rotem
196b00bd57 Fix a few old-GCC warnings. No functional change.
llvm-svn: 163309
2012-09-06 11:13:55 +00:00
James Molloy
7f0b3c1514 Fix self-host; ensure signedness is consistent.
llvm-svn: 163306
2012-09-06 10:32:08 +00:00
James Molloy
791ec0aa52 Improve codegen for BUILD_VECTORs on ARM.
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base.

llvm-svn: 163304
2012-09-06 09:55:02 +00:00
James Molloy
90179e600b Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer.
llvm-svn: 163298
2012-09-06 09:16:01 +00:00
Michael Liao
290d2703fe Remove duplicated helper function
llvm-svn: 163295
2012-09-06 07:11:22 +00:00
Craig Topper
0b9e2dd7a7 Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder.
llvm-svn: 163293
2012-09-06 06:09:01 +00:00
Craig Topper
b2bad42f00 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
llvm-svn: 163292
2012-09-06 05:15:01 +00:00
Jack Carter
43a54f6830 There are some Mips instructions that are lowered by the
assembler such as shifts greater than 32. In the case 
of direct object, the code gen needs to do this lowering 
since the assembler is not involved.

With the advent of the llvm-mc assembler, it also needs 
to do the same lowering.

This patch makes that specific lowering code accessible 
to both the direct object output and the assembler.

This patch does not affect generated output.

llvm-svn: 163287
2012-09-06 02:31:34 +00:00
Jack Carter
2a8cbd60d3 Mips specific llvm assembler support for branch and jump instructions.
Test case included.

Contributer: Vladimir Medic
llvm-svn: 163277
2012-09-06 00:43:26 +00:00
Jakob Stoklund Olesen
826d399ee6 Remove predicated pseudo-instructions.
These pseudos are no longer needed now that it is possible to represent
predicated instructions in SSA form.

llvm-svn: 163275
2012-09-05 23:58:04 +00:00
Jakob Stoklund Olesen
0324528c8c Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

llvm-svn: 163274
2012-09-05 23:58:02 +00:00
Jack Carter
f7221de872 Mips specific llvm assembler support for ALU instructions. This includes
register support. Test case included.

Contributer: Vladimir Medic
llvm-svn: 163268
2012-09-05 23:34:03 +00:00
Roman Divacky
85348270cd Stop casting away const qualifier needlessly.
llvm-svn: 163258
2012-09-05 22:26:57 +00:00
Roman Divacky
4be967f49b Use const properly so that we dont remove const qualifier from region and MII
by casting. Found with gcc48.

llvm-svn: 163247
2012-09-05 21:17:34 +00:00
Hal Finkel
a414a44e22 Move the PPC TOC defs into the PPC64 InstrInfo file.
Since TOC is just defined for PPC64, move its definition to PPC64 td file.

Patch by Adhemerval Zanella.

llvm-svn: 163234
2012-09-05 19:22:27 +00:00
Tim Northover
4e03b89c79 Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

llvm-svn: 163230
2012-09-05 18:37:53 +00:00
Pranav Bhandarkar
876ff208b6 LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access the
subreg_hireg of register pair Rp.

	* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
	 DenseMap similar to PeepholeMap that additionally records subreg info
	 too.
        (runOnMachineFunction): Record information in PeepholeDoubleRegsMap
        and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
	the instruction Rx = COPY Rp1:logreg_subreg.
	* test/CodeGen/Hexagon/remove_lsr.ll: New test.
	

llvm-svn: 163214
2012-09-05 16:01:40 +00:00
Craig Topper
864ef1eec5 Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing.
llvm-svn: 163198
2012-09-05 07:26:35 +00:00
Craig Topper
f029cfe913 Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.
llvm-svn: 163196
2012-09-05 06:58:39 +00:00
Logan Chien
a15abb3d65 Fix UseInitArray option for MIPS target.
llvm-svn: 163193
2012-09-05 06:17:17 +00:00
Craig Topper
6274d26545 Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Richard Smith
8213d2a51b Remove redundant semicolons to fix -pedantic-errors build.
llvm-svn: 163190
2012-09-05 01:41:37 +00:00
Chad Rosier
b75afa43e4 Fix function name per coding standard.
llvm-svn: 163187
2012-09-05 01:15:43 +00:00
Preston Gurd
c80dc7d214 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Sergei Larin
905bc1964f Porting Hexagon MI Scheduler to the new API.
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.

llvm-svn: 163137
2012-09-04 14:49:56 +00:00
Arnold Schwaighofer
d606c6fcdf Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 163136
2012-09-04 14:37:49 +00:00
Elena Demikhovsky
61924c155d This patch optimizes shuffle instruction - generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads we have ~10% performance on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0

llvm-svn: 163134
2012-09-04 12:49:02 +00:00
Chad Rosier
294688cf56 [ms-inline asm] Asm operands can map to one or more MCOperands. Therefore, add
the NumMCOperands argument to the GetMCInstOperandNum() function that is set
to the number of MCOperands this asm operand mapped to.

llvm-svn: 163124
2012-09-03 20:31:23 +00:00
Chad Rosier
6d692c7883 [ms-inline asm] Add a comment.
llvm-svn: 163123
2012-09-03 19:04:35 +00:00
Chad Rosier
bd31fcd8a9 [ms-inline asm] Add an interface to the GetMCInstOperandNum() function in the
MCTargetAsmParser class.

llvm-svn: 163122
2012-09-03 18:47:45 +00:00
Roman Divacky
1a4b67cd3a Remove always true checks. Noticed by Adhemerval Zanella.
llvm-svn: 163117
2012-09-03 16:55:42 +00:00
Chad Rosier
bb0dcf509a Add braces to the case statement.
llvm-svn: 163116
2012-09-03 16:21:15 +00:00
Chad Rosier
fac2e7b419 Removed unused argument.
llvm-svn: 163104
2012-09-03 03:16:09 +00:00
Chris Lattner
4a8f2bcb32 some peepholes that should match horizontal add/sub operations.
llvm-svn: 163103
2012-09-03 02:58:21 +00:00
Chad Rosier
6fbf85d859 [ms-inline asm] Expose the Kind and Opcode variables from the
MatchInstructionImpl() function.

These values are used by the ConvertToMCInst() function to index into the
ConversionTable.  The values are also needed to call the GetMCInstOperandNum()
function.

llvm-svn: 163101
2012-09-03 02:06:46 +00:00
Chad Rosier
ee2993d684 Move ErrorLoc decl into the scope where it's actually used.
llvm-svn: 163100
2012-09-03 01:55:11 +00:00
Nadav Rotem
d1815a0763 Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.

llvm-svn: 163093
2012-09-02 12:10:19 +00:00
Tim Northover
316bfd78cd Limit domain conversion to cases where it won't break dep chains.
NEON domain conversion was too heavy-handed with its widened
registers, which could have stripped existing instructions of their
dependency, leaving them vulnerable to scheduling errors.

llvm-svn: 163070
2012-09-01 18:07:29 +00:00
Logan Chien
b022dbf7dc Fix Thumb2 fixup kind in the integrated-as.
llvm-svn: 163063
2012-09-01 15:06:36 +00:00
Craig Topper
0791e3f380 Typos
llvm-svn: 163053
2012-09-01 06:33:50 +00:00
Manman Ren
9afdad8207 SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its
output chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://11457792

llvm-svn: 163036
2012-08-31 23:16:57 +00:00
Craig Topper
2e53378ff6 Mark FMA4 instructions as commutable and add them to the folding tables.
llvm-svn: 163035
2012-08-31 23:10:34 +00:00
Chad Rosier
1335fb4cf0 Remove an unused argument. The MCInst opcode is set in the ConvertToMCInst()
function nowadays.

llvm-svn: 163030
2012-08-31 22:12:31 +00:00
Craig Topper
4a81c1cbe0 Add selection of RegOp2MemOpTable3 to canFoldMemoryOperand
llvm-svn: 163029
2012-08-31 22:12:16 +00:00
Michael Liao
6f4b3f358d Fix PR12359
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as
  well as PSHUFB will zero elements with negative indices.

  Patch by Sriram Murali <sriram.murali@intel.com>

llvm-svn: 163018
2012-08-31 20:12:31 +00:00
Jack Carter
a986033975 The instruction DINS may be transformed into DINSU or DEXTM depending
on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword

llvm-svn: 163010
2012-08-31 18:06:48 +00:00
Chad Rosier
9367dbd900 Add a comment to explain what's really going on.
llvm-svn: 163005
2012-08-31 17:24:10 +00:00
Chad Rosier
5e5a7c4932 The ConvertToMCInst() function can't fail, so remove the now dead Match_ConversionFail enum.
llvm-svn: 163002
2012-08-31 16:41:07 +00:00
Craig Topper
917333c8c7 Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted.
llvm-svn: 163001
2012-08-31 16:31:13 +00:00
Craig Topper
6bb3145d0d Add support for converting llvm.fma to fma4 instructions.
llvm-svn: 162999
2012-08-31 15:40:30 +00:00
Michael Liao
43c7369b24 Clean up AddedComplexity further after adding UseSSEx
llvm-svn: 162973
2012-08-31 03:01:35 +00:00
Jakob Stoklund Olesen
eb687a399c Fix a couple of typos in EmitAtomic.
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.

rdar://problem/12203728

llvm-svn: 162968
2012-08-31 02:08:34 +00:00
Jim Grosbach
6d3cb70105 X86: Fix encoding of 'movd %xmm0, %rax'
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.

llvm-svn: 162963
2012-08-31 00:30:30 +00:00
Chad Rosier
802539bb46 With the fix in r162954/162955 every cvt function returns true. Thus, have
the ConvertToMCInst() return void, rather then a bool.  Update all the cvt
functions as well.

llvm-svn: 162961
2012-08-31 00:03:31 +00:00
Chad Rosier
495e9f8b7b Fix for r162954. Return the Error.
llvm-svn: 162955
2012-08-30 23:22:05 +00:00
Chad Rosier
1421e2d649 Move a check to the validateInstruction() function where it more properly belongs.
llvm-svn: 162954
2012-08-30 23:20:38 +00:00
Chad Rosier
54ce68581e Typo.
llvm-svn: 162952
2012-08-30 23:00:00 +00:00
Michael Liao
b6735b87b0 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
NAKAMURA Takumi
80e2544fa6 PPCISelLowering.cpp: Fix r162725.
[Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!

Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good.

llvm-svn: 162916
2012-08-30 15:52:29 +00:00
NAKAMURA Takumi
df4cfcd69b PPCISelLowering.cpp: Whitespace.
llvm-svn: 162915
2012-08-30 15:52:23 +00:00
Tim Northover
627f946e05 Add support for moving pure S-register to NEON pipeline if desired
llvm-svn: 162898
2012-08-30 10:17:45 +00:00
Craig Topper
3bc01e8fa4 Only perform DAG combine on FMAs of legal types.
llvm-svn: 162892
2012-08-30 06:56:15 +00:00
Michael Liao
0e40defe86 Fix PR13727
- The root cause is that target constant materialization in X86 fast-isel
  creates a PC-rel addressing which may overflow 32-bit range in non-Small code
  model if .rodata section is allocated too far away from code segment in
  MCJIT, which uses Large code model so far.
- Follow the similar logic to fix non-Small code model in fast-isel by skipping
  non-Small code model.

llvm-svn: 162881
2012-08-30 00:30:16 +00:00
Jakob Stoklund Olesen
50309198d1 Rename hasVolatileMemoryRef() to hasOrderedMemoryRef().
Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.

llvm-svn: 162861
2012-08-29 21:19:21 +00:00
Hal Finkel
b356af14b1 Reserve space for the mandatory traceback fields on PPC64.
We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.

Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields.  GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function.  DWARF information is used for everything
else.  We need the extra 8 bytes of pad so the size field is found in
the right place.

As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest.  IBM's proprietary OSes do
make use of the full traceback table facility.

Patch by Bill Schmidt.

llvm-svn: 162854
2012-08-29 20:22:24 +00:00
Tim Northover
692b4c6860 Refactor setExecutionDomain to be clearer about what it's doing and more robust.
llvm-svn: 162844
2012-08-29 16:36:07 +00:00
Benjamin Kramer
49d736fb29 Make helper function static.
llvm-svn: 162843
2012-08-29 16:17:01 +00:00
Benjamin Kramer
b92d13cc42 Make MemoryBuiltins aware of TargetLibraryInfo.
This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.

Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.

Fixes PR13694 and probably others.

llvm-svn: 162841
2012-08-29 15:32:21 +00:00
Craig Topper
aa2444a397 Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3.
llvm-svn: 162829
2012-08-29 07:18:25 +00:00
Andrew Trick
66d93eaf98 Cleanup sloppy code. Jakob's review.
llvm-svn: 162825
2012-08-29 04:41:37 +00:00
Jush Lu
5a78c68e1d [arm-fast-isel] Add support for ARM PIC.
llvm-svn: 162823
2012-08-29 02:41:21 +00:00
Andrew Trick
48b2b90d4d Fix ARM vector copies of overlapping register tuples.
I have tested the fix, but have not been successfull in generating
a robust unit test. This can only be exposed through particular
register assignments.

llvm-svn: 162821
2012-08-29 01:58:55 +00:00
Andrew Trick
e8b0d4d64e cleanup
llvm-svn: 162820
2012-08-29 01:58:52 +00:00
Chad Rosier
eed9ef7a03 Typo.
llvm-svn: 162807
2012-08-28 23:57:47 +00:00
Michael Liao
2136b1b1ed Add comments on the literal value used.
llvm-svn: 162805
2012-08-28 23:42:17 +00:00
Jack Carter
c918c7a81f The instruction DEXT may be transformed into DEXTU or DEXTM depending
on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword

llvm-svn: 162782
2012-08-28 20:07:41 +00:00
Michael Liao
32ad80c81f Explicitly update the number of nodes to be traversed
llvm-svn: 162780
2012-08-28 19:20:29 +00:00
Jack Carter
a525a54e64 Some instructions are passed to the assembler to be
transformed to the final instruction variant. An
example would be dsrll which is transformed into 
dsll32 if the shift value is greater than 32.

For direct object output we need to do this transformation
in the codegen. If the instruction was inside branch
delay slot, it was being missed. This patch corrects this
oversight.

llvm-svn: 162779
2012-08-28 19:07:39 +00:00
Roman Divacky
7c3f29735a Emit word of zeroes after the last instruction as a start of the mandatory
traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement so just add
there a FIXME.

Patch by Bill Schmidt. PR13641.

llvm-svn: 162778
2012-08-28 19:06:55 +00:00
Akira Hatanaka
d8b83a17c8 Follow-up patch to r162731.
Fix a couple of bugs in mips' long branch pass.
This patch was supposed to be committed along with r162731, so I don't have a
new test case.

llvm-svn: 162777
2012-08-28 18:58:57 +00:00
Hal Finkel
0673920af6 Add PPC Freescale e500mc and e5500 subtargets.
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.

Patch by Tobias von Koch.

llvm-svn: 162764
2012-08-28 16:12:39 +00:00
Bill Wendling
6488dc22bb The commutative flag is already correctly set within the multiclass. If we set
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>

llvm-svn: 162741
2012-08-28 07:36:46 +00:00
Craig Topper
803047a9bb Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos.
llvm-svn: 162740
2012-08-28 07:30:47 +00:00
Craig Topper
02bb8ce5e0 Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
llvm-svn: 162738
2012-08-28 07:05:28 +00:00
Michael Liao
1f793b9c47 Fix PR12312
- Add a target-specific DAG optimization to recognize a pattern PTEST-able.
  Such a pattern is a OR'd tree with X86ISD::OR as the root node. When
  X86ISD::OR node has only its flag result being used as a boolean value and
  all its leaves are extracted from the same vector, it could be folded into an
  X86ISD::PTEST node.

llvm-svn: 162735
2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen
eefb981463 Revert r162713: "Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM."
This wasn't the right way to enforce ordering of atomics.

We are already setting the isVolatile bit on memory operands of atomic
operations which is good enough to enforce the correct ordering.

llvm-svn: 162732
2012-08-28 03:11:27 +00:00
Akira Hatanaka
ab45f57419 Fix mips' long branch pass.
Instructions emitted to compute branch offsets now use immediate operands
instead of symbolic labels. This change was needed because there were problems
when R_MIPS_HI16/LO16 relocations were used to make shared objects.

llvm-svn: 162731
2012-08-28 03:03:05 +00:00
Hal Finkel
a65f8ac557 Split several PPC instruction classes.
Slight reorganisation of PPC instruction classes for scheduling. No
functionality change for existing subtargets.
 - Clearly separate load/store-with-update instructions from regular loads and stores.
 - Split IntRotateD -> IntRotateD and IntRotateDI
 - Split out fsub and fadd from FPGeneral -> FPAddSub
 - Update existing itineraries

Patch by Tobias von Koch.

llvm-svn: 162729
2012-08-28 02:49:14 +00:00
Hal Finkel
367c494415 Allow remat of LI on PPC.
Allow load-immediates to be rematerialised in the register coalescer for
PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail,
because it relies on a register move getting emitted. The immediate load is
equivalent, so change this test case.

Patch by Tobias von Koch.

llvm-svn: 162727
2012-08-28 02:10:33 +00:00
Hal Finkel
d28587407f Eliminate redundant CR moves on PPC32.
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.

Patch by Tobias von Koch.

llvm-svn: 162725
2012-08-28 02:10:27 +00:00
Hal Finkel
caa4701e37 Optimize zext on PPC64.
The zeroextend IR instruction is lowered to an 'and' node with an immediate
mask operand, which in turn gets legalised to a sequence of ori's & ands.
This can be done more efficiently using the rldicl instruction.

Patch by Tobias von Koch.

llvm-svn: 162724
2012-08-28 02:10:15 +00:00
Jakob Stoklund Olesen
882cb360be More missing mayLoad flags on AVX multiclasses.
llvm-svn: 162714
2012-08-28 00:02:01 +00:00
Jakob Stoklund Olesen
b91754771a Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM.
It is not safe to use normal LDR instructions because they may be
reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
that prevents reordering.

Atomic loads are also prevented from participating in rematerialization
and load folding.

llvm-svn: 162713
2012-08-27 23:58:52 +00:00
Bill Wendling
d49e183a6f Make sure we add the predicate after all of the registers are added.
<rdar://problem/12183003>

llvm-svn: 162703
2012-08-27 22:12:44 +00:00
Craig Topper
3e5376d85a Remove MMX shift intrinsic handling code that also exists in SelectionDAGBuilder.
llvm-svn: 162661
2012-08-27 08:08:30 +00:00
Craig Topper
bbee14ad9d Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur.
llvm-svn: 162658
2012-08-27 07:19:59 +00:00
Craig Topper
57dd6db42e Fold some patterns into instruction definitons so tablegen can infer flags removing the need for an explicit 'neverHasSideEffects = 1'
llvm-svn: 162656
2012-08-27 07:04:50 +00:00
Craig Topper
b524d2e36d Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity.
llvm-svn: 162654
2012-08-27 06:08:57 +00:00
Richard Smith
865f47cbb6 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen
d1820cea0b Add missing mayLoad flags to a large class of AVX *_Int instructions.
llvm-svn: 162622
2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen
5eccfd2aed Missed tLEApcrelJT.
ARMConstantIslandPass expects this instruction to stay in the same basic
block as the jump table branch.

llvm-svn: 162615
2012-08-24 22:46:55 +00:00
Jakob Stoklund Olesen
38fa28fb10 Explicitly mark LEApcrel pseudos with hasSideEffects.
It's not clear that they should be marked as such, but tbb formation
fails if t2LEApcrelJT is hoisted of of a loop.

This doesn't change the flags on these instructions,
UnmodeledSideEffects was already inferred from the missing pattern.

llvm-svn: 162603
2012-08-24 21:44:11 +00:00
Jakob Stoklund Olesen
708279db06 Fix call instruction operands in ARMFastISel.
The ARM BL and BLX instructions don't have predicate operands, but the
thumb variants tBL and tBLX do.

The argument registers should be added as implicit uses.

llvm-svn: 162593
2012-08-24 20:52:46 +00:00
Jakob Stoklund Olesen
9ebe947bb0 Mark X86::RET and RETI instructions as variadic.
There is special magic happening when returning floating point values on
the x87 stack. The RET instructions get extra f80 operands.

llvm-svn: 162592
2012-08-24 20:52:44 +00:00
Akira Hatanaka
8411cfdb72 Disable Mips' delay slot filler when optimization level is O0.
llvm-svn: 162589
2012-08-24 20:40:15 +00:00
Akira Hatanaka
8e8bb580a8 In MipsDAGToDAGISel::SelectAddr, fold add node into address operand, if its
second operand is MipsISD::GPRel.

llvm-svn: 162584
2012-08-24 20:21:49 +00:00
Roman Divacky
eab620e38c Lower constant pools and jump tables via TOC on PPC64/SVR4.
In collaboration with Adhemerval Zanella.

llvm-svn: 162562
2012-08-24 16:26:02 +00:00
Jakob Stoklund Olesen
02cb24658a Fix load/store SDNode flags.
llvm-svn: 162558
2012-08-24 14:43:30 +00:00
Jakob Stoklund Olesen
4da790818a Add missing SDNPSideEffect flags.
llvm-svn: 162557
2012-08-24 14:43:27 +00:00
Jakob Stoklund Olesen
48bb81b28a Remove more mayLoad workarounds.
llvm-svn: 162556
2012-08-24 14:43:22 +00:00
Craig Topper
aa57ba3944 Custom lower FMA intrinsics to target specific nodes and remove the patterns.
llvm-svn: 162534
2012-08-24 04:03:22 +00:00
Richard Smith
188ddbae92 Fix undefined behavior (negation of INT_MIN) in ARM backend.
llvm-svn: 162520
2012-08-24 00:35:46 +00:00
Jakob Stoklund Olesen
3739d6ca99 Remove some spurious mayLoad = 0 flags.
They were inserted to silence TableGen's warning about
redundant properties. That warning is now gone.

llvm-svn: 162517
2012-08-24 00:31:20 +00:00
Jakob Stoklund Olesen
e9fa31838d Add missing SDNP properties on the flushw node.
llvm-svn: 162515
2012-08-24 00:31:13 +00:00
Jakob Stoklund Olesen
2f512d8eba X86MemBarrier has unmodeled side effects.
llvm-svn: 162514
2012-08-24 00:31:10 +00:00
Jakob Stoklund Olesen
16126ffe0d Preserve operand flags in convertToThreeAddress() by copying operands.
No test case, this is a generalization of r160260.

llvm-svn: 162485
2012-08-23 22:36:31 +00:00
Craig Topper
3d4254e5b4 Favor FMA3 over FMA4 if both are enabled.
llvm-svn: 162454
2012-08-23 18:14:30 +00:00
Craig Topper
528004fc78 Use a switch statement instead of a bunch of if-else checks and pull out the common function call.
llvm-svn: 162428
2012-08-23 04:57:36 +00:00
Craig Topper
68f6b47a37 Remove unused private field to silence build warning.
llvm-svn: 162426
2012-08-23 04:45:31 +00:00