1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

13389 Commits

Author SHA1 Message Date
Silviu Baranga
f1f3f2018c [ARM] Update ReconstructShuffle to handle mismatched types
Summary:
Port the ReconstructShuffle function from AArch64 to ARM
to handle mismatched incoming types in the BUILD_VECTOR
node.

This fixes an outstanding FIXME in the ReconstructShuffle
code.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11720

llvm-svn: 244314
2015-08-07 11:40:46 +00:00
John Brawn
093484721d Revert "Make global aliases have symbol size equal to their type"
This reverts r242520, as it caused pr24379. Also removes part of the test added
by r243874 that checks the size of alias symbols.

llvm-svn: 244313
2015-08-07 10:56:21 +00:00
JF Bastien
c6f87ed268 WebAssembly: textual emission uses expected opcode names
Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files.

Reviewers: sunfish

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11776

llvm-svn: 244305
2015-08-07 01:57:03 +00:00
Alex Lorenz
28c3634e6a MIR Serialization: Fix serialization of unnamed IR block references.
The block address machine operands can reference IR blocks in other functions.
This commit fixes a bug where the references to unnamed IR blocks in other
functions weren't serialized correctly.

llvm-svn: 244299
2015-08-06 23:57:04 +00:00
Juergen Ributzka
fb208d24f5 [AArch64][FastISel] Always use AND before checking the branch flag.
When we are not emitting the condition for the branch, because the condition is
in another BB or SDAG did the selection for us, then we have to mask the flag in
the register with AND.

This is required when the condition comes from a truncate, because SDAG only
truncates down to a legal size of i32.

This fixes rdar://problem/22161062.

llvm-svn: 244291
2015-08-06 22:44:15 +00:00
Juergen Ributzka
0453fd2de1 Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types."
This reverts commit r243198 and 243304.

Turns out this wasn't the correct fix for this problem. It works only within
FastISel, but fails when the truncate is selected by SDAG.

llvm-svn: 244287
2015-08-06 22:13:48 +00:00
Tom Stellard
544c5ba738 AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11604

llvm-svn: 244254
2015-08-06 19:28:38 +00:00
Tom Stellard
5d9e608825 AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes
Summary: This allows us to consolidate several of the TableGen patterns.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11602

llvm-svn: 244253
2015-08-06 19:28:30 +00:00
Kit Barton
5eb6e29fcf Fix possible infinite loop in shrink wrapping when searching for save/restore
points.

There is an infinite loop that can occur in Shrink Wrapping while searching 
for the Save/Restore points. 

Part of this search checks whether the save/restore points are located in
different loop nests and if so, uses the (post) dominator trees to find the
immediate (post) dominator blocks. However, if the current block does not have
any immediate (post) dominators then this search will result in an infinite
loop. This can occur in code containing an infinite loop.

The modification checks whether the immediate (post) dominator is different from
the current save/restore block. If it is not, then the search terminates and the
current location is not considered as a valid save/restore point for shrink wrapping.

Phabricator: http://reviews.llvm.org/D11607
llvm-svn: 244247
2015-08-06 19:01:57 +00:00
Alex Lorenz
8ecabffd63 MIR Parser: Report an error when parsing duplicate memory operand flags.
llvm-svn: 244240
2015-08-06 18:26:36 +00:00
Alex Lorenz
0d1da50e29 MIR Serialization: Serialize the 'invariant' machine memory operand flag.
llvm-svn: 244230
2015-08-06 16:55:53 +00:00
Alex Lorenz
b105a060cc MIR Serialization: Serialize the 'non-temporal' machine memory operand flag.
llvm-svn: 244228
2015-08-06 16:49:30 +00:00
Michael Kuperstein
a32d396d60 [X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions.
This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo
instructions with the same (or exactly opposite) conditions get lowered using a single
new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points)
when contiguous CMOVs are being lowered.

Patch by: kevin.b.smith@intel.com
Differential Revision: http://reviews.llvm.org/D11428

llvm-svn: 244202
2015-08-06 08:45:34 +00:00
Alex Lorenz
b06d114835 MIR Serialization: Initial serialization of the machine operand target flags.
This commit implements the initial serialization of the machine operand target
flags. It extends the 'TargetInstrInfo' class to add two new methods that help
to provide text based serialization for the target flags.

This commit can serialize only the X86 target flags, and the target flags for
the other targets will be serialized in the follow-up commits.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244185
2015-08-06 00:44:07 +00:00
Bjarke Hammersholt Roune
f7c0d778e2 [NVPTX] Use LDG for pointer induction variables.
More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables.

Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads.

llvm-svn: 244166
2015-08-05 23:11:57 +00:00
Reid Kleckner
972f55c2b4 Fix Windows test failure with triple instead of using the native OS
llvm-svn: 244159
2015-08-05 22:27:08 +00:00
Alex Lorenz
3768a6786c MIR Serialization: Serialize the machine operand's offset.
This commit serializes the offset for the following operands: target index,
global address, external symbol, constant pool index, and block address.

llvm-svn: 244157
2015-08-05 22:26:15 +00:00
JF Bastien
521bc853f0 x86 atomic: optimize a.store(reg op a.load(acquire), release)
Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations.

Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11382

llvm-svn: 244128
2015-08-05 21:04:59 +00:00
JF Bastien
cf4a8a3579 Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk."
I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff.

llvm-svn: 244121
2015-08-05 20:53:56 +00:00
JF Bastien
e3b11e6e69 Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk.
llvm-svn: 244120
2015-08-05 20:49:46 +00:00
Alex Lorenz
f8bf4abe0d MIR Parser: Report an error when parsing large immediate operands.
llvm-svn: 244100
2015-08-05 19:03:42 +00:00
Alex Lorenz
8fc56ec7f9 MIR Serialization: Serialize the typed immediate integer machine operands.
llvm-svn: 244098
2015-08-05 18:52:21 +00:00
Alex Lorenz
32764a9d0f MIR Parser: Report an error when parsing duplicate register flags.
llvm-svn: 244081
2015-08-05 18:09:03 +00:00
Alex Lorenz
6b0071372b MIR Serialization: Serialize the 'early-clobber' register operand flag.
llvm-svn: 244075
2015-08-05 17:49:03 +00:00
Alex Lorenz
5058149b87 MIR Serialization: Serialize the 'debug-use' register operand flag.
llvm-svn: 244071
2015-08-05 17:41:17 +00:00
Artyom Skrobov
b497a0eb34 ARMISelDAGToDAG.cpp had this self-contradictory code:
return StringSwitch<int>(Flags)
          .Case("g", 0x1)
          .Case("nzcvq", 0x2)
          .Case("nzcvqg", 0x3)
          .Default(-1);
...

  // The _g and _nzcvqg versions are only valid if the DSP extension is
  // available.
  if (!Subtarget->hasThumb2DSP() && (Mask & 0x2))
    return -1;

ARMARM confirms that the comment is right, and the code was wrong.

llvm-svn: 244029
2015-08-05 11:02:14 +00:00
Hal Finkel
58e66c2716 [MachineCombiner] Don't use the opcode-only form of computeInstrLatency
In r242277, I updated the MachineCombiner to work with itineraries, but I
missed a call that is scheduling-model-only (the opcode-only form of
computeInstrLatency). Using the form that takes an MI* allows this to work with
itineraries (and should be NFC for subtargets with scheduling models).

llvm-svn: 244020
2015-08-05 07:45:28 +00:00
Sanjay Patel
c7b58da814 [x86] machine combiner reassociation: mark EFLAGS operand as 'dead'
In the commentary for D11660, I wasn't sure if it was alright to create new
integer machine instructions without also creating the implicit EFLAGS operand. 
From what I can see, the implicit operand is always created by the MachineInstrBuilder
based on the instruction type, so we don't have to do that explicitly. However, in
reviewing the debug output, I noticed that the operand was not marked as 'dead'. 
The machine combiner should do that to preserve future optimization opportunities 
that may be checking for that dead EFLAGS operand themselves.

Differential Revision: http://reviews.llvm.org/D11696

llvm-svn: 243990
2015-08-04 15:21:56 +00:00
Vasileios Kalintiris
e8da6d9454 Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions.
It introduced two regressions on 64-bit big-endian targets running under N32
(MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and
MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets
comparisons such as BEQ compare the whole GPR64 but incorrectly tell the
instruction selector that they operate on GPR32's. This leads to the
elimination of i32->i64 extensions that are actually required by
comparisons to work correctly.

There's currently a patch under review that fixes this problem.

llvm-svn: 243984
2015-08-04 14:26:35 +00:00
Mehdi Amini
950be4e219 Update test suite to make "ninja check" succeed without native backend builtin
Requires "native" feature in most places that were failing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243960
2015-08-04 06:32:54 +00:00
Mehdi Amini
08771c2f10 Move generic MIR tests in their own subdir, requires "native" as well
These tests rely on the native backend to be built-in.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243959
2015-08-04 06:32:45 +00:00
Saleem Abdulrasool
70820fc14d ARM: support windows division routines
This adds the software division routines for the Windows RTABI.  These are not
expected to be used often though as most modern Windows ARM capable targets
support hardware division.  In the case that the target CPU doesnt support
hardware division, this will be the fallback.

llvm-svn: 243952
2015-08-04 03:57:56 +00:00
Ahmed Bougacha
480b23f13e [AArch64] Add isel support for f16 indexed LD/ST.
llvm-svn: 243935
2015-08-04 01:29:38 +00:00
Ahmed Bougacha
075c5bb353 [AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such.
There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and
is actually already vector-aware; let's use it instead of scalarizing!

The only interesting change is that for v2f32, we previously always used
use v4i32 as the integer vector type.
Use v2i32 instead, and mark FCOPYSIGN as Custom.

llvm-svn: 243926
2015-08-04 00:42:34 +00:00
Ahmed Bougacha
4da2acb3ed [CodeGen] Fix FCOPYSIGN legalization to account for mismatched types.
We used to legalize it like it's any other binary operations.  It's not,
because it accepts mismatched operand types.  Because of that, we used
to hit various asserts and miscompiles.

Specialize vector legalizations to, in the worst case, unroll, or, when
possible, to just legalize the operand that needs legalization.

Scalarization isn't covered, because I can't think of a target where
some but not all of the 1-element vector types are to be scalarized.

llvm-svn: 243924
2015-08-04 00:32:55 +00:00
Alex Lorenz
076303e68c MIR Serialization: Serialize the 'volatile' machine memory operand flag.
llvm-svn: 243923
2015-08-04 00:24:45 +00:00
Alex Lorenz
1167a86bcb MIR Serialization: Initial serialization of the machine memory operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243915
2015-08-03 23:08:19 +00:00
Duncan P. N. Exon Smith
87c77233df DI: Disallow uniquable DICompileUnits
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode.  This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.

Almost all the testcases were updated with this script:

    git grep -e '= !DICompileUnit' -l -- test |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'

I imagine something similar should work for out-of-tree testcases.

llvm-svn: 243885
2015-08-03 17:26:41 +00:00
Tim Northover
a8c3f49d75 ARM: prefer allocating VFP regs at stride 4 on Darwin.
This is necessary for WatchOS support, where the compact unwind format assumes
this kind of layout. For now we only want this on Swift-like CPUs though, where
it's been the Xcode behaviour for ages. Also, since it can expand the prologue
we don't want it at -Oz.

llvm-svn: 243884
2015-08-03 17:20:10 +00:00
John Brawn
e863b52aae [ARM] Make GlobalMerge merge extern globals by default
Enabling merging of extern globals appears to be generally either beneficial or
harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57)
it gives improvements in the 1-5% range, but in the rest the overall effect is
zero.

Differential Revision: http://reviews.llvm.org/D10966

llvm-svn: 243874
2015-08-03 12:13:33 +00:00
James Molloy
08be907489 Be less conservative about forming IT blocks.
In http://reviews.llvm.org/rL215382, IT forming was made more conservative under
the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M.

But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for
v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an
IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block
as long as the flags register is dead afterwards.

This gives significant performance improvements in a variety of MPEG based workloads.

Differential revision: http://reviews.llvm.org/D11680

llvm-svn: 243869
2015-08-03 09:24:48 +00:00
JF Bastien
1f39c2f623 WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type
Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integers sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!).

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11715

llvm-svn: 243860
2015-08-03 00:00:11 +00:00
Simon Pilgrim
71fbcd2045 [X86][SSE] Refreshed sse2 vector shift tests
llvm-svn: 243850
2015-08-02 15:23:53 +00:00
Jingyue Wu
2c2fc26fb5 [NVPTX] allow register copy between float and int
Summary:
Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class
register copying (e.g. i32 to f32) becomes possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.

Reviewers: jholewinski

Subscribers: eliben, jholewinski, llvm-commits, bruno

Differential Revision: http://reviews.llvm.org/D11622

llvm-svn: 243839
2015-08-01 18:02:12 +00:00
Simon Pilgrim
67ccd7713f [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level
The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks.

This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading.

There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.)

Differential Revision: http://reviews.llvm.org/D11518

llvm-svn: 243831
2015-08-01 10:01:46 +00:00
JF Bastien
ba4f461411 WebAssembly: handle more than int32 argument/return
Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11699

llvm-svn: 243822
2015-08-01 04:48:44 +00:00
Alex Lorenz
ba9313c06c AMDGPU/SI: Add implicit register operands in the correct order.
This commit fixes a bug in the class 'SIInstrInfo' where the implicit register
machine operands were added to a machine instruction in an incorrect order -
the implicit uses were added before the implicit defs.

I found this bug while working on moving the implicit register operand
verification code from the MIR parser to the machine verifier.

This commit also makes the method 'addImplicitDefUseOperands' in the machine
instruction class public so that it can be reused in the 'SIInstrInfo' class.

Reviewers: Matt Arsenault

Differential Revision: http://reviews.llvm.org/D11689

llvm-svn: 243799
2015-07-31 23:30:09 +00:00
Alex Lorenz
36d3115a60 MIR Parser: Report an error when a jump table entry is redefined.
llvm-svn: 243798
2015-07-31 23:13:23 +00:00
Jingyue Wu
d1717a8a10 [NVPTX] convert pointers in byval kernel arguments to global
Summary:
For example, in

  struct S {
    int *x;
    int *y;
  };
  __global__ void foo(S s) {
    int *b = s.y;
    // use b
  }

"b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for
accessing "b".

Reviewers: jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11505

llvm-svn: 243790
2015-07-31 21:44:14 +00:00
JF Bastien
ccecc328a6 WebAssembly: handle ret void.
Summary:
Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this ins't an "anything goes" variadic on ret!

The next step will be to handle other local types (not just int32).

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11692

llvm-svn: 243783
2015-07-31 21:04:18 +00:00