1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

34920 Commits

Author SHA1 Message Date
Craig Topper
c02d07a6f0 [X86] Fix feature flags on some MMX register instructions that really were introduced with SSE or SSE2.
llvm-svn: 252709
2015-11-11 07:29:25 +00:00
Craig Topper
4d5ab434af [X86] Remove redundant MMX isel patterns.
llvm-svn: 252708
2015-11-11 07:29:22 +00:00
Dan Gohman
15237f2be8 [WebAssembly] Support non-legal argument and return types.
llvm-svn: 252687
2015-11-11 01:33:02 +00:00
Ahmed Bougacha
021134fdb6 [MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin.
Follow-up to r235963: this matches other assemblers and is less
unexpected (e.g. PR23227).

llvm-svn: 252681
2015-11-11 00:51:36 +00:00
Matt Arsenault
e81b011d19 AMDGPU: Print more fields in comments
llvm-svn: 252677
2015-11-11 00:27:46 +00:00
Matt Arsenault
4f97a53de0 AMDGPU: Remove dead code
llvm-svn: 252675
2015-11-11 00:01:36 +00:00
Matt Arsenault
01b2f20bdc AMDGPU: Set isAllocatable = 0 on VS_32/VS_64
llvm-svn: 252674
2015-11-11 00:01:32 +00:00
Reid Kleckner
30c859c3a9 [WinEH] Insert the MBB for EH_RESTORE after the catchret
Inserting it before the target block could be bad, we might already have
a fallthrough edge to it.

llvm-svn: 252670
2015-11-10 23:22:20 +00:00
Dan Gohman
a1d7c89fbe [WebAssembly] Remove special cases for things that are no longer special. NFC.
llvm-svn: 252656
2015-11-10 21:48:21 +00:00
Bill Schmidt
c7c997268f Add PPCMIPeephole.cpp to CMakeLists.txt
llvm-svn: 252654
2015-11-10 21:43:45 +00:00
Dan Gohman
115720fa23 [WebAssembly] Support for floating point min and max.
llvm-svn: 252653
2015-11-10 21:40:21 +00:00
Bill Schmidt
4bb7c62dcb [PowerPC] Add an MI SSA peephole pass.
This patch adds a pass for doing PowerPC peephole optimizations at the
MI level while the code is still in SSA form.  This allows for easy
modifications to the instructions while depending on a subsequent pass
of DCE.  Both passes are very fast due to the characteristics of SSA.

At this time, the only peepholes added are for cleaning up various
redundancies involving the XXPERMDI instruction.  However, I would
expect this will be a useful place to add more peepholes for
inefficiencies generated during instruction selection.  The pass is
placed after VSX swap optimization, as it is best to let that pass
remove unnecessary swaps before performing any remaining clean-ups.

The utility of these clean-ups are demonstrated by changes to four
existing test cases, all of which now have tighter expected code
generation.  I've also added Eric Schweiz's bugpoint-reduced test from
PR25157, for which we now generate tight code.  One other test started
failing for me, and I've fixed it
(test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not
related to my changes, and I'm not sure why it works before and not
after.  The problem is that the CHECK-NOT: of "statepoint" from test1
fails because of the "statepoint" in test2, and so forth.  Adding a
CHECK-LABEL in between keeps the different occurrences of that string
properly scoped.

llvm-svn: 252651
2015-11-10 21:38:26 +00:00
Sanjay Patel
8d957f05fa [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr

This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Differential Revision: http://reviews.llvm.org/D14469

llvm-svn: 252639
2015-11-10 19:24:31 +00:00
Sanjay Patel
d4ce17626c [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any AArch64
implementation.

The net result of allowing this speculation for the regression tests in this
patch is that we get this code:

ctlz:
  clz  w0, w0
  ret

cttz:
  rbit  w8, w0
  clz  w0, w8
  ret

Instead of:

ctlz:
  cbz  w0, .LBB0_2
  clz  w0, w0
  ret
.LBB0_2:
  orr  w0, wzr, #0x20
  ret

cttz:
  cbz  w0, .LBB1_2
  rbit  w8, w0
  clz  w0, w8
  ret
.LBB1_2:
  orr  w0, wzr, #0x20
  ret

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14505

llvm-svn: 252625
2015-11-10 18:11:37 +00:00
Michael Kuperstein
fa11182125 [X86] Do not try to custom-lower sitofp/fptosi in soft-float mode
Differential Revision: http://reviews.llvm.org/D14495

llvm-svn: 252621
2015-11-10 17:37:49 +00:00
James Molloy
bd56418ec0 Reapply "[ARM] Combine CMOV into BFI where possible"
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!

Original commit message:

If we have a CMOV, OR and AND combination such as:
  if (x & CN)
      y |= CM;

And:
  * CN is a single bit;
    * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

llvm-svn: 252606
2015-11-10 14:22:05 +00:00
Tilmann Scheller
f1b47116b7 [PowerPC] Remove redundant code.
The local variable Hi is never being read.

Issue identified by the Clang static analyzer.

llvm-svn: 252600
2015-11-10 12:29:37 +00:00
Oliver Stannard
ab0d4de68d [AArch64] Fix halfword load merging for big-endian targets
For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.

This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.

llvm-svn: 252597
2015-11-10 11:04:18 +00:00
Igor Breger
bfb07ae48a AVX512 : Implemented encoding and DAG lowering for VMOVHPS/PD and VMOVLPS/PD instructions.
Differential Revision: http://reviews.llvm.org/D14492

llvm-svn: 252592
2015-11-10 07:09:07 +00:00
David Blaikie
474d4d0ac0 Remove another variable unused in -Asserts build
llvm-svn: 252582
2015-11-10 04:10:04 +00:00
David Blaikie
a9d50b2e4f Remove some unused variables to clean up the -Werror build
llvm-svn: 252580
2015-11-10 03:16:28 +00:00
Colin LeMahieu
145827ad73 [Hexagon] Adding instruction aliases and tests.
llvm-svn: 252579
2015-11-10 01:58:26 +00:00
Andy Ayers
abf4852bd1 Support for emitting inline stack probes
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.

For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.

Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).

Add test case

Fix tests

llvm-svn: 252578
2015-11-10 01:50:49 +00:00
Colin LeMahieu
6b18080d94 [Hexagon] Fixing compound register printing and reenabling more tests.
llvm-svn: 252574
2015-11-10 00:51:56 +00:00
Tim Northover
7a6f0f1427 AArch64: add experimental support for address tagging.
AArch64 has the ability to use the top 8-bits of an "address" for extra
information, with the memory subsystem automatically masking them off for loads
and stores. When that's happening, we can sometimes skip masks on memory
operations in the compiler.

However, this requires the host OS and support stack to preserve those bits so
it can't be enabled everywhere. In principle iOS 8.0 and above do take the
required precautions and but we'll put it under a flag for now.

llvm-svn: 252573
2015-11-10 00:44:23 +00:00
Derek Schuff
3784bab599 [WebAssembly] Support 'unreachable' expression
Lower LLVM's 'unreachable' terminator to ISD::TRAP, and lower ISD::TRAP to
wasm's 'unreachable' expression.

WebAssembly type-checks expressions, but a noreturn function with a
return type that doesn't match the context will cause a check
failure. So we lower LLVM 'unreachable' to ISD::TRAP and then lower that
to WebAssembly's 'unreachable' expression, which typechecks in any
context and causes a trap if executed.

Differential Revision: http://reviews.llvm.org/D14515

llvm-svn: 252566
2015-11-10 00:30:57 +00:00
Colin LeMahieu
fab9e4fee9 [Hexagon] Fixing store instructions and reenabling a few more tests.
llvm-svn: 252561
2015-11-10 00:22:00 +00:00
Akira Hatanaka
5dffc58891 [ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction.
This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where
llvm_unreachable was reached because t2ADDri wasn't handled.

Test case provided by Tim Northover.

rdar://problem/23270609

http://reviews.llvm.org/D14518

llvm-svn: 252557
2015-11-10 00:10:41 +00:00
Colin LeMahieu
caeaa3d918 [Hexagon] Fixing load instruction parsing and reenabling tests.
llvm-svn: 252555
2015-11-10 00:02:27 +00:00
Reid Kleckner
da90738ed9 [WinEH] Remove isBarrier from instructions that do not return
Fixes machine verification failures with David's latest EH change.

llvm-svn: 252541
2015-11-09 23:34:42 +00:00
Sanjay Patel
a712e76470 add a SelectionDAG method to check if no common bits are set in two nodes; NFCI
This was suggested in:
http://reviews.llvm.org/D13956

and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519

This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.

A corresponding function for IR instructions already exists in ValueTracking.

llvm-svn: 252539
2015-11-09 23:31:38 +00:00
David Majnemer
624c46d055 [WinEH] Don't emit CATCHRET from visitCatchPad
Instead, emit a CATCHPAD node which will get selected to a target
specific sequence.

llvm-svn: 252528
2015-11-09 23:07:48 +00:00
Sanjay Patel
4f4bd24ec4 [x86] try harder to match bitwise 'or' into an LEA
The motivation for this patch starts with the epic fail example in PR18007:
https://llvm.org/bugs/show_bug.cgi?id=18007

...unfortunately, this patch makes no difference for that case, but it solves some
simpler cases. We'll get there some day. :)

The current 'or' matching code was using computeKnownBits() via 
isBaseWithConstantOffset() -> MaskedValueIsZero(), but that's an unnecessarily limited use. 
We can do more by copying the logic in ValueTracking's haveNoCommonBitsSet(), so we can 
treat the 'or' as if it was an 'add'.

There's a TODO comment here because we should lift the bit-checking logic into a helper
function, so it's not duplicated in DAGCombiner.

An example of the better LEA matching:

leal (%rdi,%rdi), %eax
andl $1, %esi
orl %esi, %eax

Becomes:

andl $1, %esi
leal (%rsi,%rdi,2), %eax

Differential Revision: http://reviews.llvm.org/D13956

llvm-svn: 252515
2015-11-09 21:16:49 +00:00
Colin LeMahieu
ff6307914f [Hexagon] Separating statement to match what clang-format would do.
llvm-svn: 252513
2015-11-09 21:06:28 +00:00
Reid Kleckner
f7f2cdef8b [WinEH] Tweak funclet prologue/epilogue insertion to pass verifier
For some reason we'd never run MachineVerifier on WinEH code, and you
explicitly have to ask for it with llc. I added it to a few test cases
to get some coverage.

Fixes PR25461.

llvm-svn: 252512
2015-11-09 21:04:00 +00:00
Reid Kleckner
fb328c983a [Hexagon] Fix -Wmicrosoft-enum-value warning with explicit enum type
llvm-svn: 252505
2015-11-09 19:44:38 +00:00
Sanjay Patel
db0dd82683 don't repeat function names in comments; NFC
llvm-svn: 252502
2015-11-09 19:18:26 +00:00
Charlie Turner
4ca51cd513 [AArch64] Add UABDL patterns for log2 shuffle.
Summary:
This matches the sum-of-absdiff patterns emitted by the vectoriser using log2 shuffles.

Relies on D14207 to be able to match the `extract_subvector(..., 0)`

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14208

llvm-svn: 252465
2015-11-09 13:10:52 +00:00
Charlie Turner
fc688fc567 [AArch64] Handle extract_subvector(..., 0) in ISel.
Summary:
Lowering this pattern early to an `EXTRACT_SUBREG` was making it impossible to match larger patterns in tblgen that use `extract_subvector(..., 0)` as part of the their input pattern.

It seems like there will exist somewhere a better way of specifying this pattern over all relevant register value types, but I didn't manage to find it.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14207

llvm-svn: 252464
2015-11-09 12:45:11 +00:00
Renato Golin
75f12574da [EABI] Add LLVM support for -meabi flag
"GCC requires the freestanding environment provide memcpy, memmove, memset
and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html

Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent
'__aeabi_memops'. This convertion violates GCC contract.

The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI
targets.

Without -meabi: use the triple default EABI.
With -meabi=default: use the triple default EABI.
With -meabi=gnu: use 'memops'.
With -meabi=4 or -meabi=5: use '__aeabi_memops'.
With -meabi set to an unknown value: same as -meabi=default.

Patch by Vinicius Tinti.

llvm-svn: 252462
2015-11-09 12:40:30 +00:00
Renato Golin
ecabe1e17c Revert "[ARM] Combine CMOV into BFI where possible"
This reverts commit r252057, as it broke ARM self-hosting buildbots, probably
due to a code-gen fault.

llvm-svn: 252460
2015-11-09 12:19:10 +00:00
Colin LeMahieu
432c5802d8 [Hexagon] Adding override to methods.
llvm-svn: 252453
2015-11-09 07:10:24 +00:00
Colin LeMahieu
9c4783c8cd [Hexagon] Fixing warnings.
llvm-svn: 252448
2015-11-09 05:47:56 +00:00
Colin LeMahieu
d80e0ce601 [Hexagon] Removing extra gen line.
llvm-svn: 252447
2015-11-09 05:31:39 +00:00
Colin LeMahieu
386150c550 [Hexagon] Maybe the makefile?
llvm-svn: 252446
2015-11-09 05:16:08 +00:00
Colin LeMahieu
733ea75725 [Hexagon] Adding LLVMBuild.txt reference to HexagonAsmParser.
llvm-svn: 252444
2015-11-09 04:31:02 +00:00
Colin LeMahieu
350f96d137 [Hexagon] Enabling ASM parsing on Hexagon backend and adding instruction parsing tests. General updating of the code emission.
llvm-svn: 252443
2015-11-09 04:07:48 +00:00
Colin LeMahieu
51dd820505 [AsmParser] Backends can parameterize ASM tokenization.
llvm-svn: 252439
2015-11-09 00:31:07 +00:00
Hal Finkel
62d1b738df [PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplifications
Under most circumstances, if SCEV can simplify X-Y to a constant, then it can
also simplify Y-X to a constant. However, there is no guarantee that this is
always true, and concensus is not to consider that a correctness bug in SCEV
(although it is undesirable).

PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and
prefetches) into buckets, where in each bucket the relative pointer offsets are
constant. We used to keep each bucket as a multimap, where SCEV's subtraction
operation was used to define the ordering predicate. Instead, use a fixed SCEV
base expression for each bucket, record the constant offsets from that base
expression, and adjust it later, if desirable, once all pointers have been
collected.

Doing it this way should be more compile-time efficient than the previous
scheme (in addition to making the implementation less sensitive to SCEV
simplification quirks).

Fixes PR25170.

llvm-svn: 252417
2015-11-08 08:04:40 +00:00
David Majnemer
b682989c7b [WinEH] Update PHIs of CATCHRET successors
The TailDuplication machine pass ran across a malformed CFG: a PHI node
referred it's predecessor's predecessor instead of it's predecessor.
This occurred because we split the edge in X86ISelLowering when we
processed the CATCHRET but forgot to do something about the PHI nodes.

This fixes PR25444.

llvm-svn: 252413
2015-11-08 02:36:00 +00:00