1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00
Commit Graph

13123 Commits

Author SHA1 Message Date
Pat Gavlin
a6d3ba4544 Allow {e,r}bp as the target of {read,write}_register.
This patch allows the read_register and write_register intrinsics to
read/write the RBP/EBP registers on X86 iff the targeted register is
the frame pointer for the containing function.

Differential Revision: http://reviews.llvm.org/D10977

llvm-svn: 241827
2015-07-09 17:40:29 +00:00
Sanjay Patel
6ce96b0ff0 fix an invisible bug when combining repeated FP divisors
This patch fixes bugs that were exposed by the addition of fast-math-flags in the DAG:
r237046 ( http://reviews.llvm.org/rL237046 ):

1. When replacing a division node, it's not enough to RAUW.
   We should call CombineTo() to delete dead nodes and combine again.
2. Because we are changing the DAG, we can't return an empty SDValue
   after the transform. As the code comments say:

    Visitation implementation - Implement dag node combining for different node types.
    The semantics are as follows: Return Value:
      SDValue.getNode() == 0 - No change was made
      SDValue.getNode() == N - N was replaced, is dead and has been handled.
      otherwise - N should be replaced by the returned Operand.

The new test case shows no difference with or without this patch, but it will crash if
we re-apply r237046 or enable FMF via the current -enable-fmf-dag cl::opt.

Differential Revision: http://reviews.llvm.org/D9893

llvm-svn: 241826
2015-07-09 17:28:37 +00:00
Pawel Bylica
b5caea461d Reapply fixed r241790: Fix shift legalization and lowering for big constants.
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.

Reviewers: nadav, majnemer, sanjoy, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: http://reviews.llvm.org/D10767

llvm-svn: 241806
2015-07-09 14:58:04 +00:00
Krzysztof Parzyszek
cac9b5847a [Hexagon] Add support for atomic RMW operations
llvm-svn: 241804
2015-07-09 14:51:21 +00:00
Arnaud A. de Grandmaison
90c89b61da [AArch64] Select SBFIZ or UBFIZ instead of left + right shifts
And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology.

llvm-svn: 241803
2015-07-09 14:33:38 +00:00
Renato Golin
fc38fe2618 Test for 241794 (nest attribute in AArch64)
Forgot to git add the test.

Patch by Stephen Cross.

llvm-svn: 241797
2015-07-09 13:29:35 +00:00
Pawel Bylica
7aa3d79c2c Revert r241790: Fix shift legalization and lowering for big constants.
llvm-svn: 241792
2015-07-09 09:50:54 +00:00
Pawel Bylica
f94083a5ee Fix shift legalization and lowering for big constants.
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.

Reviewers: nadav, majnemer, sanjoy, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: http://reviews.llvm.org/D10767

llvm-svn: 241790
2015-07-09 08:01:36 +00:00
Elena Demikhovsky
88c04dfc81 Extended syntax of vector version of getelementptr instruction.
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html

According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.

%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.

(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

In all cases the %A type is <4 x i8*>

In the case (2) we add the same offset to all pointers.

The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].

The documentation is updated.

http://reviews.llvm.org/D10496

llvm-svn: 241788
2015-07-09 07:42:48 +00:00
Alex Lorenz
284c4b13e7 MIR Serialization: Serialize the 'undef' register machine operand flag.
llvm-svn: 241762
2015-07-08 23:58:31 +00:00
Sanjay Patel
806e80e796 [x86] enable machine combiner reassociations for scalar single-precision multiplies
llvm-svn: 241752
2015-07-08 22:35:20 +00:00
Eli Bendersky
374b7e43da Add tests for the NVPTXLowerAggrCopies pass.
Note: not testing memmove lowering for now, as it's broken
[see https://llvm.org/bugs/show_bug.cgi?id=24056]

llvm-svn: 241736
2015-07-08 21:29:28 +00:00
Alex Lorenz
1fa43c2c9d MIR Serialization: Serialize the 'killed' register machine operand flag.
llvm-svn: 241734
2015-07-08 21:23:34 +00:00
Simon Pilgrim
0c528a0762 [X86][SSE] Vector shift test cleanup. NFC.
llvm-svn: 241730
2015-07-08 21:11:17 +00:00
Reid Kleckner
f93836486a [Win64] Only treat some functions as having the Win64 convention
All the usual X86 target-specific conventions are collapsed to the
normal Win64 convention, but the custom conventions like GHC and webkit
should not be.

Previously we would assume that the caller allocated 32 bytes of shadow
space for us, which is not how webkit_jscc or other custom conventions
are supposed to work.

Based on a patch by peavo@outlook.com.

Fixes PR24051.

llvm-svn: 241725
2015-07-08 21:03:47 +00:00
Alex Lorenz
8b999fafb7 MIR Parser: Use source locations for MBB naming errors.
This commit changes the type of the field 'Name' in the struct
'yaml::MachineBasicBlock' from 'std::string' to 'yaml::StringValue'. This change
allows the MIR parser to report errors related to the MBB name with the proper
source locations.

llvm-svn: 241718
2015-07-08 20:22:20 +00:00
Krzysztof Parzyszek
ac54e4bbae [Hexagon] Implement commoning of GetElementPtr instructions
llvm-svn: 241714
2015-07-08 19:22:28 +00:00
Reid Kleckner
78c492e610 [SEH] Add missing test case from previous realignment commit
llvm-svn: 241700
2015-07-08 18:09:39 +00:00
Reid Kleckner
cde3a2cf79 [SEH] Ensure that empty __except blocks have their own BB
The 32-bit lowering assumed that WinEHPrepare had this invariant.
WinEHPrepare did it for C++, but not SEH. The result was that we would
insert calls to llvm.x86.seh.restoreframe in normal basic blocks, which
corrupted the frame pointer.

llvm-svn: 241699
2015-07-08 18:08:52 +00:00
James Y Knight
4f71b891ec [SPARC] Cleanup handling of the Y/ASR registers.
- Implement copying ASR to/from GPR regs.
- Mark ASRs as non-allocatable, so it won't try to arbitrarily use
  them inappropriately.
- Instead of inserting explicit WRASR/RDASR nodes in the MUL/DIV
  routines, just do normal register copies.
- Also...mark div as using Y, not just writing it.

Added a test case with some code which previously died with an
assertion failure (with -O0), or produced wrong code (otherwise).

(Third time's the charm?)

Differential Revision: http://reviews.llvm.org/D10401

llvm-svn: 241686
2015-07-08 16:25:12 +00:00
Krzysztof Parzyszek
1ff6cefced [Hexagon] Generate "insert" instructions more aggressively
llvm-svn: 241683
2015-07-08 14:47:34 +00:00
Krzysztof Parzyszek
0942a0a955 Revert 241681: causes Windows builds to fail
llvm-svn: 241682
2015-07-08 14:34:13 +00:00
Krzysztof Parzyszek
bba8c32fe9 [Hexagon] Generate "insert" instructions more aggressively
llvm-svn: 241681
2015-07-08 14:22:27 +00:00
Simon Pilgrim
5c44e5f75e [X86][SSE] Added (V)ROUNDSD + (V)ROUNDSS stack folding support
llvm-svn: 241671
2015-07-08 08:07:57 +00:00
Reid Kleckner
0138358834 [WinEH] Make llvm.x86.seh.restoreframe work for stack realignment prologues
The incoming EBP value points to the end of a local stack allocation, so
we can use that to restore ESI, the base pointer. Once we do that, we
can use local stack allocations. If we know we need stack realignment,
spill the original frame pointer in the prologue and reload it after
restoring ESI.

llvm-svn: 241648
2015-07-07 23:45:58 +00:00
Reid Kleckner
6207d850e4 [WinEH] Add localaddress intrinsic instead of using frameaddress
Clang uses this for SEH finally. The new intrinsic will produce the
right value when stack realignment is required.

llvm-svn: 241643
2015-07-07 23:23:03 +00:00
Arnold Schwaighofer
6d6b413fa3 Add more nvcasts
Tim Northover has told me that they can occur when the compiler cleverly
constructs constants - as demonstrated in the test case.

rdar://21703486

llvm-svn: 241641
2015-07-07 23:13:18 +00:00
Reid Kleckner
45072b933e Rename llvm.frameescape and llvm.framerecover to localescape and localrecover
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.

These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.

Naming suggestions at this point are welcome, I'm happy to re-run sed.

Reviewers: majnemer, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11011

llvm-svn: 241633
2015-07-07 22:25:32 +00:00
Alex Lorenz
19382e1c52 MIR Serialization: Serialize the 'dead' register machine operand flag.
llvm-svn: 241624
2015-07-07 20:34:53 +00:00
Arnold Schwaighofer
e78f06bc6f Add CHECK lines to test case
llvm-svn: 241619
2015-07-07 19:26:31 +00:00
Arnold Schwaighofer
a112326960 Add a pattern for a nvcast from v2f64 -> v4f32
Since the NvCast is generated by the selection process the concerns about
endianess and bit reversal don't apply.

rdar://21703486

llvm-svn: 241611
2015-07-07 18:31:55 +00:00
Akira Hatanaka
f2cd5836e0 Fix test case to unbreak build.
This commit changes the target arch to fix the test case commited in r241566
that was failing on ninja-x64-msvc-RA-centos6. Also add checks to make sure
the callee's address is loaded to blx's operand. 

llvm-svn: 241588
2015-07-07 14:45:12 +00:00
Akira Hatanaka
548fcd7ec7 [ARM] Define a subtarget feature and use it to decide whether long calls should
be emitted.

This is needed to enable ARM long calls for LTO and enable and disable it on a
per-function basis.

Out-of-tree projects currently using EnableARMLongCalls to emit long calls
should start passing "+long-calls" to the feature string (see the changes made
to clang in r241565).

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D9364

llvm-svn: 241566
2015-07-07 06:54:42 +00:00
Alex Lorenz
813af3fadc MIR Parser: Verify the implicit machine register operands.
This commit verifies that the parsed machine instructions contain the implicit
register operands as specified by the MCInstrDesc. Variadic and call
instructions aren't verified.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10781

llvm-svn: 241537
2015-07-07 02:08:46 +00:00
Dan Gohman
fc49082461 [WebAssembly] Create a CodeGen unittest directory.
llvm-svn: 241520
2015-07-06 23:14:57 +00:00
Alex Lorenz
583f921888 MIR Serialization: Serialize the implicit register flag.
This commit serializes the implicit flag for the register machine operands. It
introduces two new keywords into the machine instruction syntax: 'implicit' and
'implicit-def'. The 'implicit' keyword is used for the implicit register
operands, and the 'implicit-def' keyword is used for the register operands that
have both the implicit and the define flags set.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10709

llvm-svn: 241519
2015-07-06 23:07:26 +00:00
Simon Pilgrim
3c101973b3 [X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes
The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.

As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.

Differential Revision: http://reviews.llvm.org/D10593

llvm-svn: 241516
2015-07-06 22:46:46 +00:00
Sanjay Patel
6781c4029c [x86] extend machine combiner reassociation optimization to SSE scalar adds
Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460)
to SSE scalar FP SP adds in addition to AVX scalar FP SP adds.

With the 'switch' in place, we can trivially add other opcodes and test cases in
future patches.

Differential Revision: http://reviews.llvm.org/D10975

llvm-svn: 241515
2015-07-06 22:35:29 +00:00
Simon Pilgrim
a825efbf95 [X86][SSE] Vectorized i64 uniform constant SRA shifts
This patch adds vectorization support for uniform constant i64 arithmetic shift right operators.

Differential Revision: http://reviews.llvm.org/D9645

llvm-svn: 241514
2015-07-06 22:35:19 +00:00
Reid Kleckner
c447c449e6 [WinEH] Add some test cases I forgot to add to previous commits
llvm-svn: 241510
2015-07-06 21:13:53 +00:00
Reid Kleckner
80b41774f1 [WinEH] Insert the EH code load before the block terminator
The previous code put the load after the terminator, leading to invalid
IR and downstream crashes. This caused http://crbug.com/506446.

llvm-svn: 241509
2015-07-06 21:13:43 +00:00
Simon Pilgrim
385bee8c59 [X86][SSE4A] Shuffle lowering using SSE4A EXTRQ/INSERTQ instructions
This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.

As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.

From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.

Differential Revision: http://reviews.llvm.org/D10146

llvm-svn: 241508
2015-07-06 20:46:41 +00:00
Alex Lorenz
6945ffc637 llc: Add a 'run-pass' option.
This commit adds a 'run-pass' option to llc, which instructs the compiler to run
one specific code generation pass only.

Llc already has the 'start-after' and the 'stop-after' options, and this new
option complements the other two by making it easier to write tests that want
to invoke a single pass only.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10776

llvm-svn: 241476
2015-07-06 17:44:26 +00:00
Matt Arsenault
36294af135 AMDGPU/SI: Add debugging subtarget feature for DS offsets
We don't have a good way to detect most situations where
DS offsets are usable on SI, so add an option to force using
them even if unsafe for debugging performance problems.

llvm-svn: 241462
2015-07-06 16:01:58 +00:00
Simon Pilgrim
e9f414f573 [X86][SSE] Added missing stack folding test for SQRTSD and SQRTSS instructions.
llvm-svn: 241445
2015-07-06 14:15:02 +00:00
Asaf Badouh
a51b8d0d5b [X86][AVX512] Multiply Packed Unsigned Integers with Round and Scale
pmulhrsw

review:
http://reviews.llvm.org/D10948

llvm-svn: 241443
2015-07-06 14:03:40 +00:00
Simon Pilgrim
99be799579 [X86][SSE3] Just use an explicit SSE3 target attribute - not a cpu type.
Merged arch/target into a specific triple - we had i686 and x86_64 targets overriding each other....

llvm-svn: 241410
2015-07-05 19:06:32 +00:00
Simon Pilgrim
7cc9f6e96f [X86][SSE2] Just use an explicit SSE2 target attribute - not a cpu type.
corei7 is capable of a lot more than just SSE2.... 

llvm-svn: 241409
2015-07-05 19:03:51 +00:00
Asaf Badouh
7e53a288e3 [x86][AVX512] add Multiply High Op
include encoding and intrinsics tests.

review
http://reviews.llvm.org/D10896

llvm-svn: 241406
2015-07-05 12:23:20 +00:00
Nemanja Ivanovic
4dede06034 Add missing builtins to the PPC back end for ABI compliance (vol. 2)
This patch corresponds to review:
http://reviews.llvm.org/D10874

Back end portion of the second round of additions to altivec.h.

llvm-svn: 241398
2015-07-05 06:03:51 +00:00