1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-30 15:32:52 +01:00
Commit Graph

8529 Commits

Author SHA1 Message Date
Richard Sandiford
11e0918feb [SystemZ] Handle extensions in RxSBG optimizations
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care.  This fixes a FIXME added in r192783.

llvm-svn: 192790
2013-10-16 13:35:13 +00:00
Richard Sandiford
15044afbed [SystemZ] Improve handling of SETCC
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI".  In most cases it's better to use an IPM-based
sequence instead.

llvm-svn: 192784
2013-10-16 11:10:55 +00:00
Richard Sandiford
7921f75ba9 Handle (shl (anyext (shr ...))) in SimpilfyDemandedBits
This is really an extension of the current (shl (shr ...)) -> shl optimization.
The main difference is that certain upper bits must also not be demanded.

The motivating examples are the first two in the testcase, which occur
in llvmpipe output.

llvm-svn: 192783
2013-10-16 10:26:19 +00:00
NAKAMURA Takumi
ab81f8a305 Revert r192758 (and r192759), "MC: Better handling of tricky symbol and section names"
GNU AS didn't like quotes in symbol names.

    Error: junk at end of line, first unrecognized character is `"'

        .def "@feat.00";
        "@feat.00" = 1

Reproduced on Cygwin's 2.23.52.20130309 and mingw32's 2.20.1.20100303.

llvm-svn: 192775
2013-10-16 08:22:49 +00:00
Rafael Espindola
3779f822e1 Add a triple to this test.
llvm-svn: 192767
2013-10-16 02:27:33 +00:00
Rafael Espindola
c17b7cf2ed Add support for metadata representing .ident directives.
llvm-svn: 192764
2013-10-16 01:49:05 +00:00
Hans Wennborg
3b3efddc64 MC: Better handling of tricky symbol and section names
Because of win32 mangling, we produce symbol and section names with
funny characters in them, most notably @ characters.

MC would choke on trying to parse its own assembly output. This patch addresses
that by:

- Making @ trigger quoting of symbol names
- Also quote section names in the same way
- Just parse section names like other identifiers (to allow for quotes)
- Don't assume @ signifies a symbol variant if it is in a string.

Differential Revision: http://llvm-reviews.chandlerc.com/D1945

llvm-svn: 192758
2013-10-16 01:20:40 +00:00
Andrew Trick
e3e67d4a0a Enable MI Sched for x86.
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed saving
a nice chunk of compile time.

Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.

On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.

Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.

The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.

Unit tests are updated so they don't depend on SD scheduling heuristics.

llvm-svn: 192750
2013-10-15 23:33:07 +00:00
Chad Rosier
3e791b2408 [AArch64] Add support for NEON scalar signed saturating absolute value and
scalar signed saturating negate instructions.

llvm-svn: 192733
2013-10-15 21:18:44 +00:00
Manman Ren
39d1a84681 Struct byval: fix a copy-paste error for thumb2.
PR17309

llvm-svn: 192730
2013-10-15 19:42:32 +00:00
Michael Liao
1081bbac6c Fix PR17546
- Type of index used in extract_vector_elt or insert_vector_elt supposes
  to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd
  better to truncate (or zero-extend in case it's changed later) it to
  mask element type to guarantee they are matching instead of asserting
  that.

llvm-svn: 192722
2013-10-15 17:51:58 +00:00
Michael Liao
a94d0a900a Fix PR16807
- Lower signed division by constant powers-of-2 to target-independent
  DAG operators instead of target-dependent ones to support them better
  on targets where vector types are legal but shift operators on that
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.

llvm-svn: 192721
2013-10-15 17:51:02 +00:00
Daniel Sanders
21c7c7cd9b [mips][msa] Added support for build_vector for v4f32 and v2f64.
llvm-svn: 192699
2013-10-15 13:14:41 +00:00
Richard Sandiford
86798c4d26 [SystemZ] Use A(G)SI when spilling the target of a constant addition
llvm-svn: 192681
2013-10-15 08:42:59 +00:00
Job Noorman
54f125fb4b Fix MSP430 calling convention to match MSPGCC
llvm-svn: 192678
2013-10-15 08:19:39 +00:00
NAKAMURA Takumi
8c0f09fed1 llvm/test/CodeGen/X86/break-avx-dep.ll: Relax an expression to be matched to also r[89], not only rXX.
llvm-svn: 192675
2013-10-15 06:36:36 +00:00
Andrew Trick
e196a05dc8 Improve on r192635, ExeDepsFix for avx, and add a test case.
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.

This was blocking the switch to MI-Sched.

llvm-svn: 192669
2013-10-15 03:39:43 +00:00
Akira Hatanaka
29e44ea3aa [mips] Transfer kill flag to the newly created operand.
llvm-svn: 192662
2013-10-15 01:06:30 +00:00
Quentin Colombet
cb4b84532c [X86][FastISel] During X86 fastisel, the address of indirect call was resolved
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live accross basic blocks
resulting in undefined virtual registers.

The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.

<rdar://problem/15192473>

llvm-svn: 192636
2013-10-14 22:32:09 +00:00
Nick Lewycky
0da8d88a82 Fix a typo, in a comment, in a test.
llvm-svn: 192632
2013-10-14 22:02:53 +00:00
Eric Christopher
1a04817b81 Revert part of a fix from 2010, changes since then:
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.

I've also added a fixme for the test case that we should improve
the code generated, it should look something like is documented
in the tls abi document.

llvm-svn: 192631
2013-10-14 21:52:26 +00:00
Will Dietz
ad27c13a64 MachineSink: Fix and tweak critical-edge breaking heuristic.
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one.  The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.

This commit makes a few small changes
to help better realize this goal:

First, modify the loop to ignore registers
defined by this instruction.  We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.

Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)

Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.

Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.

llvm-svn: 192608
2013-10-14 16:57:17 +00:00
Chad Rosier
40761dc629 [AArch64] Add support for NEON scalar integer compare instructions.
llvm-svn: 192596
2013-10-14 14:37:20 +00:00
Bernard Ogden
f482ee15d7 Add Cortex-A57 support
llvm-svn: 192591
2013-10-14 13:17:07 +00:00
Bernard Ogden
ec0167a2ce Add subtarget feature support for Cortex-A53
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.

llvm-svn: 192590
2013-10-14 13:16:57 +00:00
Elena Demikhovsky
c460e7e50a Fixed a bug in dynamic allocation memory on stack.
The alignment of allocated space was wrong, see Bugzila 17345.

Done by Zvi Rackover <zvi.rackover@intel.com>.

llvm-svn: 192573
2013-10-14 07:26:51 +00:00
Vincent Lejeune
7594bd2071 R600: improve dump of S_WAITCNT
llvm-svn: 192557
2013-10-13 17:56:28 +00:00
Vincent Lejeune
316b632e03 R600: Use masked read sel for texture instructions
llvm-svn: 192554
2013-10-13 17:56:10 +00:00
Vincent Lejeune
b337ac16bc R600: fix swizzle export
llvm-svn: 192553
2013-10-13 17:56:04 +00:00
Benjamin Kramer
3000b81f9a Force a CPU on test so it doesn't depend on microarchitectural scheduling decisions.
llvm-svn: 192532
2013-10-12 11:17:12 +00:00
Reed Kotler
9efb450361 For Mips16, start to consolidate all forms of 32 bit literal loading so that
they can be better handled and optimized in the Mips16 constant island code.

llvm-svn: 192520
2013-10-12 02:19:08 +00:00
Matt Arsenault
289accc07f R600: Add scalar i32 add test
llvm-svn: 192501
2013-10-11 21:03:41 +00:00
Matt Arsenault
d5c3e13cc5 Use CHECK-LABEL
llvm-svn: 192500
2013-10-11 21:03:39 +00:00
Matthias Braun
f96d183309 Remove kill flags after if conversion if necessary
When if converting something like:
true:
   ... = R0<kill>

false:
   ... = R0<kill>

then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.

llvm-svn: 192482
2013-10-11 19:04:37 +00:00
Quentin Colombet
7ba3455dfc [DAGCombiner] Load slicing test case: attempt to really fix the buildbots (used sse4.2 instead of avx!).
<rdar://problem/14477220>

llvm-svn: 192480
2013-10-11 18:54:49 +00:00
Quentin Colombet
c02e5604f4 [DAGCombiner] Reapply load slicing (192471) with a test that explicitly set sse4.2 support.
This should fix the buildbots.

Original commit message:
[DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.

E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32

into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.

One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.

<rdar://problem/14477220>

llvm-svn: 192476
2013-10-11 18:29:42 +00:00
Quentin Colombet
fd0097531f [DAGCombiner] Revert load slicing (r192471), until I figure out why it fails on ubuntu.
llvm-svn: 192474
2013-10-11 18:17:17 +00:00
Matthias Braun
434fbd854b Revert "Tests: Be less dependent on a specific schedule/regalloc"
This reverts r192454

Apparently FileCheck isn't as smart as I though and does not enforce a
topological order between variable defs+uses.

llvm-svn: 192472
2013-10-11 18:09:19 +00:00
Quentin Colombet
b60dc81c8b [DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.

E.g., this optimization will transform something like this:
 a = load i64* addr
 b = trunc i64 a to i32
 c = lshr i64 a, 32
 d = trunc i64 c to i32

into:
 b = load i32* addr1
 d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.

One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.

<rdar://problem/14477220>

llvm-svn: 192471
2013-10-11 18:01:14 +00:00
Amara Emerson
bf6dcda63c [ARM] Fix FP ABI attributes with no VFP enabled.
llvm-svn: 192458
2013-10-11 16:03:43 +00:00
Matthias Braun
4beef11e35 Tests: Be less dependent on a specific schedule/regalloc
llvm-svn: 192454
2013-10-11 15:40:12 +00:00
Matheus Almeida
73759d3a3b [mips][msa] Improves robustness of the test by enhancing pattern matching.
llvm-svn: 192446
2013-10-11 13:18:01 +00:00
Justin Holewinski
9769d1f0ef [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc
Fixes PR17529

llvm-svn: 192445
2013-10-11 12:39:39 +00:00
Justin Holewinski
f7d6ae0d5b Make AsmPrinter::emitImplicitDef a virtual method so targets can emit custom comments for implicit defs
For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers,
while NVPTX uses virtual registers (with a couple of exceptions).  Now, the implicit def comment will be
emitted as a true PTX register name. Other targets can use this to customize the output of implicit def
comments.

Fixes PR17519

llvm-svn: 192444
2013-10-11 12:39:36 +00:00
Amara Emerson
83afefcfe3 [ARM] Add a test case for disabled neon/fpu features.
llvm-svn: 192440
2013-10-11 11:07:00 +00:00
Daniel Sanders
3649e05b17 [mips][msa] Added support for matching maddv.[bhwd], and msubv.[bhwd] from normal IR (i.e. not intrinsics)
llvm-svn: 192438
2013-10-11 10:50:42 +00:00
Daniel Sanders
9bec7b823b [mips][msa] Added support for matching fmsub.[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192435
2013-10-11 10:27:32 +00:00
Robert Lytton
864d2bd56d XCore target fix bug in emitArrayBound() causing segmentation fault
llvm-svn: 192434
2013-10-11 10:27:13 +00:00
Robert Lytton
12def987ea XCore target does not emit '.hidden' or '.protected' attributes
llvm-svn: 192433
2013-10-11 10:27:00 +00:00
Robert Lytton
b441cef9c5 XCore target: fix bug in XCoreLowerThreadLocal.cpp
When a ConstantExpr which uses a thread local is part of a PHI node
instruction, the insruction that replaces the ConstantExpr must
be inserted in the predecessor block, in front of the terminator instruction.
If the predecessor block has multiple successors, the edge is first split.

llvm-svn: 192432
2013-10-11 10:26:48 +00:00
Robert Lytton
e5a2d050ac XCore target: add XCoreTargetLowering::isZExtFree()
llvm-svn: 192431
2013-10-11 10:26:29 +00:00
Daniel Sanders
253e018134 [mips][msa] Added support for matching fmadd.[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192430
2013-10-11 10:14:25 +00:00
Daniel Sanders
4971ec128b [mips][msa] Added support for matching ffint_[us].[wd], and ftrunc_[us].[wd] from normal IR (i.e. not intrinsics)
llvm-svn: 192429
2013-10-11 10:00:06 +00:00
Kevin Qin
e90902acc5 Implement aarch64 neon instruction set AdvSIMD (copy).
llvm-svn: 192410
2013-10-11 02:33:55 +00:00
Matthias Braun
9e9e0f5d4c Tests: Do not unnecessarily depend on kill comments
llvm-svn: 192404
2013-10-10 22:37:49 +00:00
Matthias Braun
fd7da2a38d Tests: Use CHECK-LABEL where possible
llvm-svn: 192403
2013-10-10 22:37:47 +00:00
Matt Arsenault
6f45619203 R600: Fix trunc i64 to i32 on SI
llvm-svn: 192375
2013-10-10 18:04:16 +00:00
Tom Stellard
582bb030db R600/SI: Use -verify-machineinstrs for most tests
We can't enable the verifier for tests with SI_IF and SI_ELSE, because
these instructions are always followed by a COPY which copies their
result to the next basic block.  This violates the machine verifier's
rule that non-terminators can not folow terminators.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 192366
2013-10-10 17:11:46 +00:00
Hao Liu
d0ab407a23 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192361
2013-10-10 17:00:52 +00:00
Rafael Espindola
bb93e39fe2 Revert "Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem). Including following 14 instructions: 4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4)."
This reverts commit r192352. It broke the build.

llvm-svn: 192354
2013-10-10 15:15:17 +00:00
Hao Liu
0ff11c9c71 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192352
2013-10-10 15:01:24 +00:00
Benjamin Kramer
11421cc493 Disable function padding to get this test to pass on atom.
llvm-svn: 192348
2013-10-10 12:46:23 +00:00
Tim Northover
50b95fa75d ARM: correct liveness flags during ARMLoadStoreOpt
When we had a sequence like:

    s1 = VLDRS [r0, 1], Q0<imp-def>
    s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def>
    s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def>
    s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def>

we were gathering the {s0, s1} loads below the s3 load. This is fine,
but confused the verifier since now the s3 load had Q0<imp-use> with
no definition above it.

This should mark such uses <undef> as well. The liveness structure at
the beginning and end of the block is unaffected, and the true sN
definitions should prevent any dodgy reorderings being introduced
elsewhere.

rdar://problem/15124449

llvm-svn: 192344
2013-10-10 09:28:20 +00:00
Akira Hatanaka
d7e78a8926 [mips] Do not generate INS/EXT nodes if target does not have support for
ins/ext.

llvm-svn: 192330
2013-10-09 23:36:17 +00:00
Venkatraman Govindaraju
aedc12be2e [Sparc] Disable tail call optimization for sparc64.
This patch fixes PR17506.

llvm-svn: 192294
2013-10-09 12:50:39 +00:00
Elena Demikhovsky
f24ecf7862 AVX-512: Added VRCP28 and VRSQRT28 instructions and intrinsics.
llvm-svn: 192283
2013-10-09 08:16:14 +00:00
Tim Northover
87db53ff7a AArch64: enable MISched by default.
Substantial SelectionDAG scheduling is going away soon, and is
interfering with Hao's attempts to implement LDn/STn instructions, so
I say we make the leap first.

There were a few reorderings (inevitably) which broke some tests. I
tried to replace them with CHECK-DAG variants mostly, but some too
complex for that to be useful and I just reordered them.

llvm-svn: 192282
2013-10-09 07:53:57 +00:00
Tim Northover
a9df6657ee AArch64: migrate ADRP relaxation test to be llvm-mc only.
llvm-svn: 192281
2013-10-09 07:53:49 +00:00
Craig Topper
d5082631e1 Add in64BitMode/in32BitMode to the MMX/SSE2/AVX maskmovq/dq instructions. This way the asm parser will pick the right one based on the mode. Instruction selection already did the right thing based on the pointer size.
llvm-svn: 192266
2013-10-09 02:18:34 +00:00
Chad Rosier
d30c4af71b [AArch64] Add support for NEON scalar floating-point reciprocal estimate,
reciprocal exponent, and reciprocal square root estimate instructions.

llvm-svn: 192242
2013-10-08 22:09:04 +00:00
Chad Rosier
e281a17b84 [AArch64] Add support for NEON scalar signed/unsigned integer to floating-point
convert instructions.

llvm-svn: 192231
2013-10-08 20:43:30 +00:00
Reed Kotler
0b1b97d48b Add fabsf to the list of inlined functions; otherwise
Mips16 will try and create a stub for it and this will
result in a link error because that function does not exist in libc.

llvm-svn: 192223
2013-10-08 19:55:01 +00:00
Matt Arsenault
ed8ec1a52a Add some xfaild R600 tests.
These are bugs to fix later.

llvm-svn: 192212
2013-10-08 18:06:36 +00:00
Reed Kotler
57455fdc7c Let rotr and bswap be handled by expansion for Mips16 since we don't
have native instructions for this.

llvm-svn: 192207
2013-10-08 17:32:33 +00:00
Craig Topper
2d70555027 Fix a typo in the mattr part of the run line.
llvm-svn: 192174
2013-10-08 06:12:26 +00:00
Craig Topper
3d7c6afb79 Explicitly disable AVX on a bunch of tests so they won't fail on AVX machines post r192171.
llvm-svn: 192173
2013-10-08 06:06:57 +00:00
Craig Topper
aa1a4d51f0 Remove some instructions that existed to provide aliases to the assembler. Can be done with InstAlias instead. Unfortunately, this was causing printer to use 'vmovq' or 'vmovd' based on what was parsed. To cleanup the inconsistencies convert all 'vmovd' with 64-bit registers to 'vmovq', but provide an alias so that 'vmovd' will still parse.
llvm-svn: 192171
2013-10-08 05:53:50 +00:00
Akira Hatanaka
6c2bf15c93 [mips] Test case for r192124.
llvm-svn: 192135
2013-10-07 21:32:57 +00:00
Reed Kotler
33301878d0 Add Mips16 patterns for sign extend byte and sign extend halfword.
llvm-svn: 192130
2013-10-07 20:46:19 +00:00
Manman Ren
b284db0070 Struct byval: use the correct alignment for loads generated to load
from struct byval to registers.

We used to pass 0 which means the alignment of PtrVT. Even when the alignment
of the struct is smaller than 4, the LOADs would have alignment of 4, and
further optimizations could combine the LOADs into a ldm, which would
cause crash.

The fix is to pass the alignment of the struct byval.

rdar://problem/15144402

llvm-svn: 192126
2013-10-07 19:47:53 +00:00
Benjamin Kramer
feace9b737 X86: Fix type check. Just because an integer type is illegal doesn't mean it's i64.
Fixes PR17495, where an i24 triggered this code. It's intended to
optimize i64 loads on 32 bit x86.

llvm-svn: 192123
2013-10-07 19:11:35 +00:00
Matt Arsenault
9c8541d286 Change objectsize intrinsic to accept different address spaces.
Bitcasting everything to i8* won't work. Autoupgrade the old
intrinsic declarations to use the new mangling.

llvm-svn: 192117
2013-10-07 18:06:48 +00:00
Amara Emerson
688cdc2151 [ARM] Improve build attributes emission.
llvm-svn: 192111
2013-10-07 16:55:23 +00:00
Chad Rosier
128d9134e7 [AArch64] Add support for NEON scalar arithmetic instructions:
SQDMULH, SQRDMULH, FMULX, FRECPS, and FRSQRTS.

llvm-svn: 192107
2013-10-07 16:36:15 +00:00
Rafael Espindola
499aaf305a Add support for aliases with linkonce_odr.
This will be used to extend constructor aliases in clang.

llvm-svn: 192066
2013-10-06 15:10:43 +00:00
Benjamin Kramer
44710574cb Force a CPU that doesn't have AVX, otherwise this test fails.
llvm-svn: 192065
2013-10-06 13:52:41 +00:00
Benjamin Kramer
a7e734d765 X86: Don't fold spills into SSE operations if the stack is unaligned.
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.

llvm-svn: 192064
2013-10-06 13:48:22 +00:00
Elena Demikhovsky
cb8eaca2e4 AVX-512: added scalar convert instructions and intrinsics.
Fixed load folding in VPERM2I instruction.

llvm-svn: 192063
2013-10-06 13:11:09 +00:00
Venkatraman Govindaraju
2d62beab83 [Sparc] Do not emit nop after fcmp* instruction with V9.
llvm-svn: 192056
2013-10-06 07:06:44 +00:00
Elena Demikhovsky
0ff833ab99 AVX-512: fixed shuffle lowering
in case of BLEND and added VSHUFPS patterns.

llvm-svn: 192055
2013-10-06 06:11:18 +00:00
Venkatraman Govindaraju
aacd252702 [Sparc] Custom lower addc/adde/subc/sube on i64 in sparc64.
This is required because i64 is a legal type but addxcc/subxcc reads icc carry bit, which are 32 bit conditional codes.

llvm-svn: 192054
2013-10-06 03:36:18 +00:00
Venkatraman Govindaraju
fa75d8536b [Sparc] Use addxcc/subxcc for adde/sube instead of addx/subx.
addx/subx does not modify conditional codes whereas addxcc/subxx does.

llvm-svn: 192053
2013-10-06 02:11:10 +00:00
Benjamin Kramer
3a6afef4e7 Emit a better error when running out of registers on inline asm.
The most likely case where this error happens is when the user specifies
too many register operands. Don't make it look like an internal LLVM bug
when we can see that the error is coming from an inline asm instruction.
For other instructions we keep the "ran out of registers" error.

llvm-svn: 192041
2013-10-05 19:33:37 +00:00
Craig Topper
0a8f3fc996 Remove unneeded TBM intrinsics. The arithmetic/logical operation patterns are sufficient.
llvm-svn: 192039
2013-10-05 19:22:59 +00:00
Craig Topper
d0a63f6722 Add an additional pattern for BLCI since opt can turn (not (add x, 1)) into (sub -2, x).
llvm-svn: 192037
2013-10-05 17:17:53 +00:00
Jiangning Liu
6d9b4a0e25 Implement aarch64 neon instruction set AdvSIMD (Across).
llvm-svn: 192028
2013-10-05 08:22:10 +00:00
Rafael Espindola
86d473eceb Convert test to FileCheck.
llvm-svn: 192025
2013-10-05 02:58:36 +00:00
Venkatraman Govindaraju
179e7e6dea [Sparc] Use correct alignment while loading/storing fp128 values.
llvm-svn: 192023
2013-10-05 02:29:47 +00:00
Venkatraman Govindaraju
cf869e9b2a [Sparc] Respect hasHardQuad parameter correctly when lowering SINT_TO_FP with fp128 operand.
llvm-svn: 192015
2013-10-05 00:31:41 +00:00
Venkatraman Govindaraju
271e9485db [Sparc] Correct the floating point conditional code mapping in GetOppositeBranchCondition().
llvm-svn: 192006
2013-10-04 23:54:30 +00:00
Reed Kotler
13ebdc7d9c Support tblockaddr for static compilation in Mips16.
llvm-svn: 191986
2013-10-04 22:01:40 +00:00
Akira Hatanaka
e85ca33e98 [mips] Fix a bug in MipsLongBranch::replaceBranch, which was erasing
instructions in delay slots along with the original branch instructions.

llvm-svn: 191978
2013-10-04 20:51:40 +00:00
Matthias Braun
f7ddf86363 ARM: optimizeSelect has to consider the previous register class
optimizeSelect folds (predicated) copy instructions, it must not ignore
the original register class of the operand when replacing the register
with the copies dest register.

llvm-svn: 191963
2013-10-04 16:52:56 +00:00
Matthias Braun
fbba53e45c ARM: do not add a regmask for TAILJUMPs
The jump doesn't really kill the registers, the following call does but
we never get back anyway.
This avoids some verify-machineinstrs problems when TAILJUMPs are
if-converted.

llvm-svn: 191962
2013-10-04 16:52:54 +00:00
Matthias Braun
ae6465eb28 ARM: preserve undef flag in pseudo instruction expanders
Copy over the whole register machine operand instead of creating a new one
with an incomplete set of flags.

llvm-svn: 191961
2013-10-04 16:52:51 +00:00
Jiangning Liu
9f33a743ab Implement aarch64 neon instruction set AdvSIMD (3V elem).
llvm-svn: 191944
2013-10-04 09:20:44 +00:00
Logan Chien
b51d0cd53b [arm] Enhance the test case by checking .fpu directive.
llvm-svn: 191891
2013-10-03 12:18:56 +00:00
Craig Topper
5b62ea95ec Remove duplicated test cases that occurred when I applied the same patch file to my model twice.
llvm-svn: 191873
2013-10-03 04:27:14 +00:00
Craig Topper
5ac188d0f2 Add patterns for selecting TBM instructions from logical operations. Patch from Yunzhong Gao.
llvm-svn: 191871
2013-10-03 04:16:45 +00:00
Elena Demikhovsky
ee11e148e9 AVX-512: fixed a bug in getLoadStoreRegOpcode() for AVX-512 target
llvm-svn: 191818
2013-10-02 12:20:42 +00:00
Vincent Lejeune
c7c1075d49 R600: add a pass that merges clauses.
llvm-svn: 191790
2013-10-01 19:32:58 +00:00
Vincent Lejeune
0321b7798e R600: Put PRED_X instruction in its own clause
llvm-svn: 191789
2013-10-01 19:32:49 +00:00
Vincent Lejeune
e0ac07a3cb R600: Enable -verify-machineinstrs in some tests.
llvm-svn: 191788
2013-10-01 19:32:38 +00:00
Preston Gurd
c52dfda610 Add test case for PR16785.
Thanks for Dimitry Andric, Rafael Espindola, and Benjamin Kramer
for providing and progressively reducing the test case!

llvm-svn: 191782
2013-10-01 17:02:48 +00:00
Richard Sandiford
8ac2bcbe80 [SystemZ] Add comparisons of high words and memory
llvm-svn: 191777
2013-10-01 15:00:44 +00:00
Richard Sandiford
2ed79fb1d7 [SystemZ] Add comparisons of large immediates using high words
There are no corresponding patterns for small immediates because they would
prevent the use of fused compare-and-branch instructions.

llvm-svn: 191775
2013-10-01 14:56:23 +00:00
Richard Sandiford
3b7b53e6f4 [SystemZ] Add immediate addition involving high words
llvm-svn: 191774
2013-10-01 14:53:46 +00:00
Richard Sandiford
884566de6e [SystemZ] Extend test-under-mask support to high GR32s
llvm-svn: 191773
2013-10-01 14:41:52 +00:00
Richard Sandiford
d2e34690a4 [SystemZ] Extend 32-bit RISBG optimizations to high words
This involves using RISB[LH]G, whereas the equivalent z10 optimization
uses RISBG.

llvm-svn: 191770
2013-10-01 14:36:20 +00:00
Richard Sandiford
e2f5332463 [SystemZ] Extend pseudo conditional 8- and 16-bit stores to high words
As the comment says, we always want to use STOC for 32-bit stores.

llvm-svn: 191767
2013-10-01 14:33:55 +00:00
Tim Northover
684a0e633d ARM: support interrupt attribute
This function-attribute modifies the callee-saved register list and function
epilogue (specifically the return instruction) so that a routine is suitable
for use as an interrupt-handler of the specified type without disrupting
user-mode applications.

rdar://problem/14207019

llvm-svn: 191766
2013-10-01 14:33:28 +00:00
Richard Sandiford
5df2380d20 [SystemZ] Add test missing from r191764.
llvm-svn: 191765
2013-10-01 14:31:50 +00:00
Richard Sandiford
7125240faa [SystemZ] Allow integer AND involving high words
llvm-svn: 191762
2013-10-01 14:20:41 +00:00
Richard Sandiford
9d3cacb101 [SystemZ] Allow integer XOR involving high words
llvm-svn: 191759
2013-10-01 14:08:44 +00:00
Richard Sandiford
d2a449d3de [SystemZ] Allow integer OR involving high words
llvm-svn: 191755
2013-10-01 13:22:41 +00:00
Richard Sandiford
3af32e8cab [SystemZ] Allow integer insertions with a high-word destination
llvm-svn: 191753
2013-10-01 13:18:56 +00:00
Richard Sandiford
497097c027 [SystemZ] Allow selects with a high-word destination
llvm-svn: 191751
2013-10-01 13:10:16 +00:00
Richard Sandiford
8c8e2f0237 [SystemZ] Add patterns to load a constant into a high word (IIHF)
Similar to low words, we can use the shorter LLIHL and LLIHH if it turns
out that the other half of the GR64 isn't live.

llvm-svn: 191750
2013-10-01 13:02:28 +00:00
Richard Sandiford
ac3360b004 [SystemZ] Add register zero extensions involving at least one high word
llvm-svn: 191746
2013-10-01 12:49:07 +00:00
Joey Gouly
12afb60cf2 [ARM] Introduce the 'sevl' instruction in ARMv8.
This also removes the restriction on the immediate field of the 'hint'
instruction.

llvm-svn: 191744
2013-10-01 12:39:11 +00:00
Richard Sandiford
192be1070b [SystemZ] Add truncating high-word stores (STCH and STHH)
llvm-svn: 191743
2013-10-01 12:22:49 +00:00
Richard Sandiford
de433bf58d [SystemZ] Add zero-extending high-word loads (LLCH and LLHH)
llvm-svn: 191742
2013-10-01 12:19:08 +00:00
Richard Sandiford
dd8ae7a617 [SystemZ] Add sign-extending high-word loads (LBH and LHH)
llvm-svn: 191740
2013-10-01 12:11:47 +00:00
Richard Sandiford
c2e496f7ba [SystemZ] Use upper words of GR64s for codegen
This just adds the basics necessary for allocating the upper words to
virtual registers (move, load and store).  The move support is parameterised
in a way that makes it easy to handle zero extensions, but the associated
zero-extend patterns are added by a later patch.

The easiest way of testing this seemed to be add a new "h" register
constraint for high words.  I don't expect the constraint to be useful
in real inline asms, but it should work, so I didn't try to hide it
behind an option.

llvm-svn: 191739
2013-10-01 11:26:28 +00:00
Daniel Sanders
6ffe6fc99c [mips][msa] Added support for matching mod_[us] from normal IR (i.e. not intrinsics)
llvm-svn: 191737
2013-10-01 10:22:35 +00:00
Elena Demikhovsky
84c6cd222d AVX-512: Added X86vzmovl patterns
llvm-svn: 191733
2013-10-01 08:38:02 +00:00
Manman Ren
ad317a135a TBAA: update tbaa format from scalar format to struct-path aware format.
llvm-svn: 191690
2013-09-30 18:17:55 +00:00
Manman Ren
799fd39420 TBAA: remove !tbaa from testing cases when they are not needed.
llvm-svn: 191689
2013-09-30 18:17:35 +00:00
Robert Wilhelm
6b36431ffa Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Tom Stellard
1cb4ba2a4d R600: Fix handling of NAN in comparison instructions
We were completely ignoring the unorder/ordered attributes of condition
codes and also incorrectly lowering seto and setuo.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 191603
2013-09-28 02:50:50 +00:00
Akira Hatanaka
e5351a10fe [mips] Make sure loads from lazy-binding entries do not get CSE'd or hoisted out
of loops.

Previously, two consecutive calls to function "func" would result in the
following sequence of instructions:

1. load $16, %got(func)($gp) // load address of lazy-binding stub.
2. move $25, $16
3. jalr $25                  // jump to lazy-binding stub.
4. nop
5. move $25, $16
6. jalr $25                  // jump to lazy-binding stub again.

With this patch, the second call directly jumps to func's address, bypassing
the lazy-binding resolution routine:

1. load $25, %got(func)($gp) // load address of lazy-binding stub.
2. jalr $25                  // jump to lazy-binding stub.
3. nop
4. load $25, %got(func)($gp) // load resolved address of func.
5. jalr $25                  // directly jump to func.

llvm-svn: 191591
2013-09-28 00:12:32 +00:00
Yunzhong Gao
e51da27a74 Adding intrinsics to the llvm backend for TBM instruction set.
Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1750

llvm-svn: 191539
2013-09-27 18:38:42 +00:00
Manman Ren
2ef9ca7627 TBAA: handle scalar TBAA format and struct-path aware TBAA format.
Remove the command line argument "struct-path-tbaa" since we should not depend
on command line argument to decide which format the IR file is using. Instead,
we check the first operand of the tbaa tag node, if it is a MDNode, we treat
it as struct-path aware TBAA format, otherwise, we treat it as scalar TBAA
format.

When clang starts to use struct-path aware TBAA format no matter whether
struct-path-tbaa is no, and we can auto-upgrade existing bc files, the support
for scalar TBAA format can be dropped.

Existing testing cases are updated to use the struct-path aware TBAA format.

llvm-svn: 191538
2013-09-27 18:34:27 +00:00
Richard Sandiford
e1db330ce8 [SystemZ] Rein back the use of block operations
The backend tries to use block operations like MVC, NC, OC and XC for
simple scalar operations.  For correctness reasons, it rejects any case
in which the regions might partially overlap.  However, for performance
reasons, it should also reject cases where the regions might be equal,
since the instruction might then not use the fast path.

This fixes a performance regression seen in bzip2.  We may want to limit
the optimisation even more in future, or even remove it entirely, but I'll
try with this for now.

llvm-svn: 191525
2013-09-27 15:29:20 +00:00
Richard Sandiford
cae9d29151 [SystemZ] Improve handling of PC-relative addresses
The backend previously folded offsets into PC-relative addresses
whereever possible.  That's the right thing to do when the address
can be used directly in a PC-relative memory reference (using things
like LRL).  But if we have a register-based memory reference and need
to load the PC-relative address separately, it's better to use an anchor
point that could be shared with other accesses to the same area of the
variable.

Fixes a FIXME.

llvm-svn: 191524
2013-09-27 15:14:04 +00:00
Daniel Sanders
0987676281 [mips][msa] Implemented insert.d intrinsic.
This intrinsic is lowered into an equivalent INSERT_VECTOR_ELT which is
further lowered into a sequence of insert.w's on MIPS32.

llvm-svn: 191521
2013-09-27 13:36:54 +00:00
Daniel Sanders
3c43957555 [mips][msa] Implemented fill.d intrinsic.
This intrinsic is lowered into an equivalent BUILD_VECTOR which is further
lowered into a sequence of insert.w's on MIPS32.

llvm-svn: 191519
2013-09-27 13:20:41 +00:00
Daniel Sanders
935673af60 [mips][msa] Implemented copy_[us].d intrinsic.
This intrinsic is lowered into equivalent copy_s.w instructions during
legalization.

llvm-svn: 191518
2013-09-27 13:04:21 +00:00
Daniel Sanders
8c83ddcdd2 [mips][msa] Implemented insert_vector_elt for v4f32 and v2f64.
For v4f32 and v2f64, INSERT_VECTOR_ELT is matched by a pseudo-insn which is
later expanded to appropriate insve.[wd] insns.

llvm-svn: 191515
2013-09-27 12:31:32 +00:00
Daniel Sanders
0bb1b5a37f [mips][msa] Implemented extract_vector_elt for v4f32 or v2f64
For v4f32 and v2f64, EXTRACT_VECTOR_ELT is matched by a pseudo-insn which may
be expanded to subregister copies and/or instructions as appropriate.

llvm-svn: 191514
2013-09-27 12:17:32 +00:00
Andrea Di Biagio
a10165167b Remove superfluous comment accidentally checked-in.
llvm-svn: 191513
2013-09-27 12:13:58 +00:00
Daniel Sanders
0f009e6be5 [mips][msa] Added support for MSA registers to copyPhysReg
llvm-svn: 191512
2013-09-27 12:03:51 +00:00
Daniel Sanders
8e7e5fd076 [mips][msa] Added support for matching splati from normal IR (i.e. not intrinsics)
Updated some of the vshf since they (correctly) emit splati's now

llvm-svn: 191511
2013-09-27 11:48:57 +00:00
Andrea Di Biagio
a96ff5eeac Re-apply the change from r191393 with fix for pr17380.
This change fixes the problem reported in pr17380 and re-add the dagcombine 
transformation ensuring that the value types are always legal if the 
transformation is triggered after Legalization took place.

Added the test case from pr17380.

llvm-svn: 191509
2013-09-27 11:37:05 +00:00
Daniel Sanders
d13fea547a [mips][msa] MSA requires FR=1 mode (64-bit FPU register file). Report fatal error when using it in FR=0 mode.
llvm-svn: 191498
2013-09-27 10:08:31 +00:00
Daniel Sanders
6a20248b3a [mips][msa] Expand all truncstores and loadexts for MSA as well as DSP
llvm-svn: 191496
2013-09-27 09:44:59 +00:00
Daniel Sanders
27836999cd [mips][msa] Added missing check in performSRACombine
Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D1755

llvm-svn: 191495
2013-09-27 09:25:29 +00:00
Weiming Zhao
c16af8ee70 Fix PR 17372: Emitting PLD for stack address for ARM Thumb2
t2PLDi12, t2PLDi8, t2PLDs was omitted in Thumb2InstrInfo.
This patch fixes it.

llvm-svn: 191441
2013-09-26 17:25:10 +00:00
Bill Schmidt
b5aca928c2 [PowerPC] Fix PR17354: Generate nop after local calls for PIC code.
When generating code for shared libraries, even local calls may be
intercepted, so we need a nop after the call for the linker to fix up the
TOC.  Test case adapted from the one provided in PR17354.

llvm-svn: 191440
2013-09-26 17:09:28 +00:00
Andrea Di Biagio
0901efb8fb Revert r191393 since it caused pr17380.
llvm-svn: 191438
2013-09-26 16:54:01 +00:00
Venkatraman Govindaraju
af5985d1f5 [Sparc] Implements exception handling in SPARC with DwarfCFI.
llvm-svn: 191432
2013-09-26 15:11:00 +00:00
Amara Emerson
80d8b3db1e [ARM] Use the load-acquire/store-release instructions optimally in AArch32.
Patch by Artyom Skrobov.

llvm-svn: 191428
2013-09-26 12:22:36 +00:00
Weiming Zhao
14a079be0c Fix PR 17368: disable vector mul distribution for square of add/sub for ARM
Generally, it is desirable to distribute (a + b) * c to a*c + b*c for
ARM with VMLx forwarding, where a, b and c are vectors.
However, for (a + b)*(a + b), distribution will result in one extra
instruction.
With distribution:
  x = a + b (add)
  y = a * x (mul)
  z = y + b * y (mla)

Without distribution:
  x = a + b (add)
  z = x * x (mul)

This patch checks if a mul is a square of add/sub. If yes, skip
distribution.

llvm-svn: 191410
2013-09-25 23:12:06 +00:00
Josh Magee
2c804b5636 Test commit. Removed trailing whitespace.
llvm-svn: 191402
2013-09-25 22:07:48 +00:00
Reed Kotler
ea8c398b50 Fix a bad typo in the inline assembly code for mips16 pic fp stubs
and make one cosmetic cleanup to make it look the same as gcc
in this area; adjusting test cases.

llvm-svn: 191400
2013-09-25 20:58:50 +00:00
Andrea Di Biagio
1968361975 Teach DAGCombiner how to canonicalize dags according to the rule
(shl (zext (shr A, X)), X) => (zext (shl (shr A, X), X)).

The rule only triggers when there are no other uses of the
zext to avoid materializing more instructions.

This helps the DAGCombiner understand that the shl/shr
sequence can then be converted into an and instruction.

llvm-svn: 191393
2013-09-25 19:01:01 +00:00
Quentin Colombet
b8a9667008 [PR16882] Ignore noreturn definitions when setting isPhysRegUsed.
PEI inserts a save/restore sequence for the link register, according to the
information it gets from the MachineRegisterInfo.
MachineRegisterInfo is populated by the VirtRegMap pass.
This pass was not aware of noreturn calls and was registering the definitions of
these calls the same way as regular operations.

Modify VirtRegPass so that it does not set the isPhysRegUsed information for
registers only defined by noreturn calls.
The rational is that a noreturn call is the "last instruction" of the program
(if it returns the behavior is undefined), so everything that is defined by it
cannot be used and will not interfere with anything else. Therefore, it is
pointless to account for then.

llvm-svn: 191349
2013-09-25 00:26:17 +00:00
Andrew Trick
3b462e7046 CriticalAntiDepBreaker is no longer needed for armv7 scheduling.
This is being disabled because it is no longer needed for
performance. It is only used by postRAscheduler which is also planned
for removal, and it is implemented with an out-dated view of register
liveness. It consideres aliases instead of register units, assumes
valid kill flags, and assumes implicit uses on partial register
defs. Kill flags and implicit operands are error prone and impossible
to verify. We should gradually eliminate dependence on them in the
postRA phases.

Targets that still benefit from this should move to the MI
scheduler. If that doesn't solve the problem, then we should add a
hook to regalloc to optimize reload placement.

llvm-svn: 191348
2013-09-25 00:26:16 +00:00
Eli Friedman
bdb3e2822e Add missing check to SETCC optimization.
PR17338.

llvm-svn: 191337
2013-09-24 22:50:14 +00:00
Daniel Sanders
d110591231 [mips][msa] Added support for matching pckev, and pckod from normal IR (i.e. not intrinsics)
llvm-svn: 191306
2013-09-24 14:53:25 +00:00
Daniel Sanders
48059bf5ef [mips][msa] Added support for matching ilv[lr], ilvod, and ilvev from normal IR (i.e. not intrinsics)
llvm-svn: 191304
2013-09-24 14:36:12 +00:00
Daniel Sanders
db41b542e8 [mips][msa] Added support for matching shf from normal IR (i.e. not intrinsics)
llvm-svn: 191302
2013-09-24 14:20:00 +00:00
Daniel Sanders
7c64721346 [mips][msa] Added support for matching vshf from normal IR (i.e. not intrinsics)
llvm-svn: 191301
2013-09-24 14:02:15 +00:00
Daniel Sanders
e154d03143 [mips][msa] Remove the VSPLAT and VSPLATD nodes in favour of matching BUILD_VECTOR.
Most constant BUILD_VECTOR's are matched using ComplexPatterns which cover
bitcasted as well as normal vectors. However, it doesn't seem to be possible to
match ldi.[bhwd] in a type-agnostic manner (e.g. to support the widest range of
immediates, it should be possible to use ldi.b to load v2i64) using TableGen so
ldi.[bhwd] is matched using custom code in MipsSEISelDAGToDAG.cpp

This made the majority of the constant splat BUILD_VECTOR lowering redundant.
The only transformation remaining for constant splats is when an (up-to) 32-bit
constant splat is possible but the value does not fit into a 10-bit signed
integer. In this case, the BUILD_VECTOR is transformed into a bitcasted
BUILD_VECTOR so that fill.[bhw] can be used to splat the vector from a GPR32
register (which is initialized using the usual lui/addui sequence).

There are no additional tests since this is a re-implementation of previous
functionality. The change is intended to make it easier to implement some of
the upcoming instruction selection patches since they can rely on existing
support for BUILD_VECTOR's in the DAGCombiner.

compare_float.ll changed slightly because a BITCAST is no longer
introduced during legalization.

llvm-svn: 191299
2013-09-24 13:33:07 +00:00
Daniel Sanders
1c08f8b17d [mips][msa] Non-constant BUILD_VECTOR's should be expanded to INSERT_VECTOR_ELT instead of memory operations.
The resulting code is the same length, but doesnt cause memory traffic or latency.

llvm-svn: 191297
2013-09-24 13:16:15 +00:00
Daniel Sanders
d201758a30 [mips][msa] Added partial support for matching fmax_a from normal IR (i.e. not intrinsics)
This covers the case where fmax_a can be used to implement ISD::FABS.

llvm-svn: 191296
2013-09-24 13:02:08 +00:00
Daniel Sanders
fe71effbbd [mips][msa] Added support for matching andi, ori, nori, and xori from normal IR (i.e. not intrinsics)
llvm-svn: 191293
2013-09-24 12:32:47 +00:00
Daniel Sanders
f05ed8bd9a [mips][msa] Added support for matching max, maxi, min, mini from normal IR (i.e. not intrinsics)
llvm-svn: 191291
2013-09-24 12:18:31 +00:00
Daniel Sanders
0167ec55f4 [mips][msa] Added support for matching bsel and bseli from normal IR (i.e. not intrinsics)
This required correcting the definition of the bsel and bseli intrinsics.

llvm-svn: 191290
2013-09-24 12:04:44 +00:00
Daniel Sanders
9a3de1f604 [mips][msa] Added support for matching comparisons from normal IR (i.e. not intrinsics)
MIPS SelectionDAG changes:
* Added VCEQ, VCL[ET]_[SU] nodes to represent vector comparisons that produce a bitmask.

llvm-svn: 191286
2013-09-24 10:46:19 +00:00
Daniel Sanders
362149b5a7 [mips][msa] Added support for matching slli, srai, and srli from normal IR (i.e. not intrinsics)
llvm-svn: 191285
2013-09-24 10:28:18 +00:00
NAKAMURA Takumi
3b910496bf llvm/test/CodeGen/AArch64/neon-scalar-reduce-pairwise.ll: Use -mtriple here, or aach64-pecoff might be misassumed on win32 hosts.
llvm-svn: 191275
2013-09-24 04:14:29 +00:00
Jiangning Liu
5867567c41 Initial support for Neon scalar instructions.
Patch by Ana Pazos.

1.Added support for v1ix and v1fx types.
2.Added Scalar Pairwise Reduce instructions.
3.Added initial implementation of Scalar Arithmetic instructions.

llvm-svn: 191263
2013-09-24 02:47:27 +00:00
Michael Gottesman
a2ef7dd057 [stackprotector] Forgot to add in PR number to test case.
llvm-svn: 191261
2013-09-24 02:10:55 +00:00
Michael Gottesman
2ec63d27a9 [stackprotector] Allow for copies from vreg -> vreg to be in a terminator sequence.
Sometimes a copy from a vreg -> vreg sneaks into the middle of a terminator
sequence. It is safe to slice this into the stack protector success bb.

This fixes PR16979.

llvm-svn: 191260
2013-09-24 01:50:26 +00:00
Bill Wendling
339b0f39aa Selecting the address from a very long chain of GEPs can blow the stack.
The recursive nature of the address selection code can cause the stack to
explode if there is a long chain of GEPs. Convert the recursive bit into a
iterative method to avoid this.

<rdar://problem/12445434>

llvm-svn: 191252
2013-09-24 00:13:08 +00:00
Reed Kotler
ed09a36fb5 Make nomips16 mask not repeat if it ends with a '.'.
This mask is purely for debugging and testing.

llvm-svn: 191231
2013-09-23 22:36:11 +00:00
Ben Langmuir
706a7ccbeb Add sha intrinsic tests
These should have been included with r190864, but I forgot to use svn add.

llvm-svn: 191208
2013-09-23 16:57:52 +00:00
Daniel Sanders
ced4e4005c [mips][msa] Added support for matching addvi, and subvi from normal IR (i.e. not intrinsics)
llvm-svn: 191203
2013-09-23 14:29:55 +00:00
Daniel Sanders
34cb8f3e4d [mips][msa] Added support for matching insert and copy from normal IR (i.e. not intrinsics)
Changes to MIPS SelectionDAG:
* Added nodes VEXTRACT_[SZ]EXT_ELT to represent extract and extend in a single
  operation and implemented the DAG combines necessary to fold sign/zero
  extends into the extract.

llvm-svn: 191199
2013-09-23 14:03:12 +00:00
Daniel Sanders
d1df1263eb [mips][msa] Added support for matching pcnt from normal IR (i.e. not intrinsics)
llvm-svn: 191198
2013-09-23 13:40:21 +00:00
Daniel Sanders
7d945d142d [mips][msa] Added support for matching nor from normal IR (i.e. not intrinsics)
llvm-svn: 191195
2013-09-23 13:22:24 +00:00
Daniel Sanders
91c78d1d33 [mips][msa] Added support for matching and, or, and xor from normal IR (i.e. not intrinsics)
llvm-svn: 191194
2013-09-23 12:57:42 +00:00
Daniel Sanders
d3c403c386 [mips][msa] Implemented build_vector using ldi, fill, and custom SelectionDAG nodes (VSPLAT and VSPLATD)
Note: There's a later patch on my branch that re-implements this to select
build_vector without the custom SelectionDAG nodes. The future patch avoids
the constant-folding problems stemming from the custom node (i.e. it doesn't
need to re-implement all the DAG combines related to BUILD_VECTOR).

Changes to MIPS specific SelectionDAG nodes:
* Added VSPLAT
    This is a special case of BUILD_VECTOR that covers the case the
    BUILD_VECTOR is a splat operation.
* Added VSPLATD
    This is a special case of VSPLAT that handles the cases when v2i64 is legal

llvm-svn: 191191
2013-09-23 12:02:46 +00:00
Tim Northover
c9a7e47164 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
Venkatraman Govindaraju
ae9ddc5768 [Sparc] Add support for TLS in sparc.
llvm-svn: 191164
2013-09-22 06:48:52 +00:00
Venkatraman Govindaraju
df68ba133b [SPARC] Make functions with GLOBAL_OFFSET_TABLE access as non-leaf functions.
llvm-svn: 191160
2013-09-22 01:40:24 +00:00
Venkatraman Govindaraju
e3ed207140 [Sparc] Emit .register directive to declare the use of global registers %g2, %g4, %g6 and %g7.
llvm-svn: 191158
2013-09-22 00:42:30 +00:00
Venkatraman Govindaraju
54744c0b41 [Sparc] Fix lowering FABS on fp128 (long double) on pre-v9 targets.
llvm-svn: 191154
2013-09-21 23:51:08 +00:00
Juergen Ributzka
b55735e2d8 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
This reverts commit r191130.

llvm-svn: 191138
2013-09-21 15:09:46 +00:00