1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

116680 Commits

Author SHA1 Message Date
Pete Cooper
cdb160f205 Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC
Note, this is a reapplication of r236515 with a fix to not assert on non-register operands, but instead only handle them until the subsequent commit.  Original commit message follows.

The code was basically the same here already.  Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.

Will be used in the next commit to also handle regmasks.

llvm-svn: 236538
2015-05-05 20:14:22 +00:00
Peter Collingbourne
549025e337 Thumb2SizeReduction: Check the correct set of registers for LDMIA.
The register set for LDMIA begins at offset 3, not 4. We were previously
missing the short encoding of this instruction in the case where the base
register was the first register in the register set.

Also clean up some dead code:

- The isARMLowRegister check is redundant with what VerifyLowRegs does;
  replace with an assert.
- Remove handling of LDMDB instruction, which has no short encoding (and
  does not appear in ReduceTable).

Differential Revision: http://reviews.llvm.org/D9485

llvm-svn: 236535
2015-05-05 20:07:10 +00:00
Ulrich Weigand
353f5c25c8 [DAGCombiner] Account for getVectorIdxTy() when narrowing vector load
This patch makes ReplaceExtractVectorEltOfLoadWithNarrowedLoad convert
the element number from getVectorIdxTy() to PtrTy before doing pointer
arithmetic on it.  This is needed on z, where element numbers are i32
but pointers are i64.

Original patch by Richard Sandiford.

llvm-svn: 236530
2015-05-05 19:34:10 +00:00
Ulrich Weigand
8229aae81f [DAGCombiner] Fix ReplaceExtractVectorEltOfLoadWithNarrowedLoad for BE
For little-endian, the function would convert (extract_vector_elt (load X), Y)
to X + Y*sizeof(elt).  For big-endian it would instead use
X + sizeof(vec) - Y*sizeof(elt).  The big-endian case wasn't right since
vector index order always follows memory/array order, even for big-endian.
(Note that the current handling has to be wrong for Y==0 since it would
access beyond the end of the vector.)

Original patch by Richard Sandiford.

llvm-svn: 236529
2015-05-05 19:33:37 +00:00
Ulrich Weigand
aab0c45488 [LegalizeVectorTypes] Allow single loads and stores for more short vectors
When lowering a load or store for TypeWidenVector, the type legalizer
would use a single load or store if the associated integer type was legal.
E.g. it would load a v4i8 as an i32 if i32 was legal.

This patch extends that behavior to promoted integers as well as legal ones.
If the integer type for the full vector width is TypePromoteInteger,
the element type is going to be TypePromoteInteger too, and it's still
better to use a single promoting load or truncating store rather than N
individual promoting loads or truncating stores.  E.g. if you have a v2i8
on a target where i16 is promoted to i32, it's better to load the v2i8 as
an i16 rather than load both i8s individually.

Original patch by Richard Sandiford.

llvm-svn: 236528
2015-05-05 19:32:57 +00:00
Ulrich Weigand
cd1cd8d125 [SystemZ] Add vector intrinsics
This adds intrinsics to allow access to all of the z13 vector instructions.
Note that instructions whose semantics can be described by standard LLVM IR
do not get any intrinsics.

For each instructions whose semantics *cannot* (fully) be described, we
define an LLVM IR target-specific intrinsic that directly maps to this
instruction.

For instructions that also set the condition code, the LLVM IR intrinsic
returns the post-instruction CC value as a second result.  Instruction
selection will attempt to detect code that compares that CC value against
constants and use the condition code directly instead.

Based on a patch by Richard Sandiford.

llvm-svn: 236527
2015-05-05 19:31:09 +00:00
Ulrich Weigand
f07c2265fd [SystemZ] Mark v1i128 and v1f128 as unsupported
The ABI specifies that <1 x i128> and <1 x fp128> are supposed to be
passed in vector registers.  We do not yet support those types, and
some infrastructure is missing before we can do so.

In order to prevent accidentally generating code violating the ABI,
this patch adds checks to detect those types and error out if user
code attempts to use them.

llvm-svn: 236526
2015-05-05 19:30:05 +00:00
Ulrich Weigand
096deccae0 [SystemZ] Handle sub-128 vectors
The ABI allows sub-128 vectors to be passed and returned in registers,
with the vector occupying the upper part of a register.  We therefore
want to legalize those types by widening the vector rather than promoting
the elements.

The patch includes some simple tests for sub-128 vectors and also tests
that we can recognize various pack sequences, some of which use sub-128
vectors as temporary results.  One of these forms is based on the pack
sequences generated by llvmpipe when no intrinsics are used.

Signed unpacks are recognized as BUILD_VECTORs whose elements are
individually sign-extended.  Unsigned unpacks can have the equivalent
form with zero extension, but they also occur as shuffles in which some
elements are zero.

Based on a patch by Richard Sandiford.

llvm-svn: 236525
2015-05-05 19:29:21 +00:00
Ulrich Weigand
73b770d782 [SystemZ] Add CodeGen support for scalar f64 ops in vector registers
The z13 vector facility includes some instructions that operate only on the
high f64 in a v2f64, effectively extending the FP register set from 16
to 32 registers.  It's still better to use the old instructions if the
operands happen to fit though, since the older instructions have a shorter
encoding.

Based on a patch by Richard Sandiford.

llvm-svn: 236524
2015-05-05 19:28:34 +00:00
Ulrich Weigand
119ff7dab3 [SystemZ] Add CodeGen support for v4f32
The architecture doesn't really have any native v4f32 operations except
v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32
elements being used.  Even so, using vector registers for <4 x float>
and scalarising individual operations is much better than generating
completely scalar code, since there's much less register pressure.
It's also more efficient to do v4f32 comparisons by extending to 2
v2f64s, comparing those, then packing the result.

This particularly helps with llvmpipe.

Based on a patch by Richard Sandiford.

llvm-svn: 236523
2015-05-05 19:27:45 +00:00
Ulrich Weigand
2413036cdd [SystemZ] Add CodeGen support for v2f64
This adds ABI and CodeGen support for the v2f64 type, which is natively
supported by z13 instructions.

Based on a patch by Richard Sandiford.

llvm-svn: 236522
2015-05-05 19:26:48 +00:00
Ulrich Weigand
938ff50eda [SystemZ] Add CodeGen support for integer vector types
This the first of a series of patches to add CodeGen support exploiting
the instructions of the z13 vector facility.  This patch adds support
for the native integer vector types (v16i8, v8i16, v4i32, v2i64).

When the vector facility is present, we default to the new vector ABI.
This is characterized by two major differences:
- Vector types are passed/returned in vector registers
  (except for unnamed arguments of a variable-argument list function).
- Vector types are at most 8-byte aligned.

The reason for the choice of 8-byte vector alignment is that the hardware
is able to efficiently load vectors at 8-byte alignment, and the ABI only
guarantees 8-byte alignment of the stack pointer, so requiring any higher
alignment for vectors would require dynamic stack re-alignment code.

However, for compatibility with old code that may use vector types, when
*not* using the vector facility, the old alignment rules (vector types
are naturally aligned) remain in use.

These alignment rules are not only implemented at the C language level
(implemented in clang), but also at the LLVM IR level.  This is done
by selecting a different DataLayout string depending on whether the
vector ABI is in effect or not.

Based on a patch by Richard Sandiford.

llvm-svn: 236521
2015-05-05 19:25:42 +00:00
Ulrich Weigand
05f3cb9f0c [SystemZ] Add z13 vector facility and MC support
This patch adds support for the z13 processor type and its vector facility,
and adds MC support for all new instructions provided by that facilily.

Apart from defining the new instructions, the main changes are:

- Adding VR128, VR64 and VR32 register classes.
- Making FP64 a subclass of VR64 and FP32 a subclass of VR32.
- Adding a D(V,B) addressing mode for scatter/gather operations
- Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields.
  Until now all immediate operands have been the same width as the
  underlying field (hence the assert->return change in decode[SU]ImmOperand).

In addition, sys::getHostCPUName is extended to detect running natively
on a z13 machine.

Based on a patch by Richard Sandiford.

llvm-svn: 236520
2015-05-05 19:23:40 +00:00
Pete Cooper
a4a30efee4 Revert "Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC"
This reverts commit 963cdbccf6e5578822836fd9b2ebece0ba9a60b7 (ie r236514)

This is to get the bots green while i investigate.

llvm-svn: 236518
2015-05-05 18:49:08 +00:00
Pete Cooper
d33c29a169 Revert "Fix IfConverter to handle regmask machine operands."
This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515).

This is to get the bots green while i investigate the failures.

llvm-svn: 236517
2015-05-05 18:49:05 +00:00
Pete Cooper
14239b5817 Fix IfConverter to handle regmask machine operands.
A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

llvm-svn: 236515
2015-05-05 18:31:36 +00:00
Pete Cooper
0705733a6e Refactor UpdatePredRedefs and StepForward to avoid duplication. NFC
The code was basically the same here already.  Just added an out parameter for a vector of seen defs so that UpdatePredRedefs can call StepForward first, then do its own post processing on the seen defs.

Will be used in the next commit to also handle regmasks.

llvm-svn: 236514
2015-05-05 18:31:31 +00:00
Diego Novillo
c80738bb5c Fix typo in assert message. NFC.
llvm-svn: 236513
2015-05-05 18:24:47 +00:00
David Blaikie
2520d6743e Fix the clang -Werror build, use of uninitialized variable.
llvm-svn: 236512
2015-05-05 18:12:33 +00:00
Daniel Berlin
af1954cd08 Update BasicAliasAnalysis to understand that nothing aliases with undef values.
It got this in some cases (if one of them was an identified object), but not in all cases.

This caused stores to undef to block load-forwarding in some cases, etc.

Added test to Transforms/GVN to verify optimization occurs as expected.

llvm-svn: 236511
2015-05-05 18:10:49 +00:00
David Blaikie
b888632efe [opaque pointer type] Track explicit GEP pointee type through in-memory IR
llvm-svn: 236510
2015-05-05 18:03:48 +00:00
Reid Kleckner
feb0da7d82 Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"
This reverts commit r236360.

This change exposed a bug in WinEHPrepare by opting win32 code into EH
preparation. We already knew that WinEHPrepare has bugs, and is the
status quo for x64, so I don't think that's a reason to hold off on this
change. I disabled exceptions in the sanitizer tests in r236505 and an
earlier revision.

llvm-svn: 236508
2015-05-05 17:44:16 +00:00
Quentin Colombet
c82cc9dc57 [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507
2015-05-05 17:38:16 +00:00
Lang Hames
a366b4ef77 [Orc] Reapply r236465 with fixes for the MSVC bots.
llvm-svn: 236506
2015-05-05 17:37:18 +00:00
Daniel Sanders
0790458bde [bugpoint] Increase default memory limit to 400MB to fix bugpoint tests.
I tracked down the bug to an unchecked malloc in SmallVectorBase::grow_pod().
This malloc is returning NULL on my machine when running under bugpoint but not
when -enable-valgrind is given.

llvm-svn: 236504
2015-05-05 16:29:40 +00:00
Kit Barton
4fec877662 This patch adds ABI support for v1i128 data type.
It adds v1i128 to the appropriate register classes and checks parameter passing
and return values.

This is related to http://reviews.llvm.org/D9081, which will add instructions
that exploit the v1i128 datatype.

Phabricator review: http://reviews.llvm.org/D9475

llvm-svn: 236503
2015-05-05 16:10:44 +00:00
Igor Laevsky
14843ae54c Emit comment for gc.relocate showing base and derived pointers in human readable form.
Differential Revision: http://reviews.llvm.org/D9326

llvm-svn: 236497
2015-05-05 13:20:42 +00:00
Daniel Sanders
b35b5a2e1c [mips] Generate code for insert/extract operations when using the N64 ABI and MSA.
Summary:
When using the N64 ABI, element-indices use the i64 type instead of i32.
In many cases, we can use iPTR to account for this but additional patterns
and pseudo's are also required.

This fixes most (but not quite all) failures in the test-suite when using
N64 and MSA together.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9342

llvm-svn: 236494
2015-05-05 10:32:24 +00:00
Ismail Donmez
1bf806ae63 Fix regression in parsing armv{6,7}hl- triples. These are used by SUSE
and Redhat currently.

Reviewed by Jonathan Roelofs.

llvm-svn: 236492
2015-05-05 09:29:43 +00:00
Daniel Sanders
ba800c7a2f [mips][msa] Test basic operations for the N32 ABI too.
Summary:
This required adding instruction aliases for dneg.

N64 will be enabled shortly but requires additional bugfixes.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9341

llvm-svn: 236489
2015-05-05 08:48:35 +00:00
Kostya Serebryany
15a12f25e5 [lib/Fuzzer] use handle_abort=1 by default so that when assert() fires we save the test case
llvm-svn: 236476
2015-05-05 01:42:55 +00:00
Lang Hames
1ee1ff3529 [Orc] Revert r236465 - It broke the Windows bots.
Looks like the usual missing explicit move-constructor issue with MSVC. I should
have a fix shortly.

llvm-svn: 236472
2015-05-04 23:30:01 +00:00
Reid Kleckner
92f8523946 [X86] Fix assertion while DAG combining offsets and ExternalSymbols
ExternalSymbol nodes do not contain offsets, unlike GlobalValue nodes.

llvm-svn: 236471
2015-05-04 23:22:36 +00:00
Pete Cooper
7f7b570248 [ARM] IT block insertion needs to update kill flags
When forming an IT block from the first MOV here:

	%R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg
	%R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg

the move in to R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction.

However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, ie, R0 here.

This appeases the machine verifier which thought that R0 wasn't defined when used.

I have a test case, but its extremely register allocator specific.  It would be too fragile to commit a test which depends on the register allocator here.

llvm-svn: 236468
2015-05-04 22:44:47 +00:00
Pete Cooper
601370f079 Add TransformUtils dependency to lli.
After r236465, Orc uses ValueMaterializer and so needs to link against TransformUtils to get the ValueMaterializer::anchor().

llvm-svn: 236467
2015-05-04 22:33:39 +00:00
Lang Hames
63e2fcbb79 [Orc] Refactor the compile-on-demand layer to make module partitioning lazy,
and avoid cloning unused decls into every partition.

Module partitioning showed up as a source of significant overhead when I
profiled some trivial test cases. Avoiding the overhead of partitionging
for uncalled functions helps to mitigate this.

This change also means that it is no longer necessary to have a
LazyEmittingLayer underneath the CompileOnDemand layer, since the
CompileOnDemandLayer will not extract or emit function bodies until they are
called.

llvm-svn: 236465
2015-05-04 22:03:10 +00:00
Matthias Braun
981c4f348d Vim: Fix some bugs in llvm indent plugin.
llvm-svn: 236464
2015-05-04 21:41:25 +00:00
Matthias Braun
13bf28ceb8 Vim: Set filetype=python for lit configuration files.
llvm-svn: 236463
2015-05-04 21:41:23 +00:00
Matthias Braun
47d7a6a19f Document some of the options in test/lit.cfg
llvm-svn: 236462
2015-05-04 21:37:00 +00:00
Matthias Braun
6cedce8963 Lit: Allow overriding llvm tool paths+arguments, make -D an alias for --param
These changes allow usages where you want to pass an additional
commandline option to all invocations of a specific llvm tool. Example:

> llvm-lit -Dllc=llc -enable-misched -verify-machineinstrs

Differential Revision: http://reviews.llvm.org/D9487

llvm-svn: 236461
2015-05-04 21:36:36 +00:00
Sanjay Patel
8a09ce2621 zap windows line endings; NFC
llvm-svn: 236460
2015-05-04 21:27:27 +00:00
Tim Northover
47387fb490 CodeGen: match up correct insertvalue indices when assessing tail calls.
When deciding whether a value comes from the aggregate or inserted value of an
insertvalue instruction, we compare the indices against those of the location
we're interested in. One of the lists needs reversing because the input data is
backwards (so that modifications take place at the end of the SmallVector), but
we were reversing both before leading to incorrect results.

Should fix PR23408

llvm-svn: 236457
2015-05-04 20:41:51 +00:00
Alex Lorenz
92999a396f YAML: Add an optional 'flow' field to the mapping trait to allow flow mapping output.
This patch adds an optional 'flow' field to the MappingTrait
class so that yaml IO will be able to output flow mappings.

Reviewers: Justin Bogner

Differential Revision: http://reviews.llvm.org/D9450

llvm-svn: 236456
2015-05-04 20:11:40 +00:00
Keno Fischer
7ca523be5c Respect object format choice on Darwin
Summary:
The object format can be set to something other than MachO, e.g.
to use ELF-on-Darwin for MCJIT. This already works on Windows, so
there's no reason it shouldn't on Darwin.

Reviewers: lhames, grosbach

Subscribers: rafael, grosbach, t.p.northover, llvm-commits

Differential Revision: http://reviews.llvm.org/D6185

llvm-svn: 236455
2015-05-04 20:03:01 +00:00
Reid Kleckner
9b04bfdf89 Fix -Wmicrosoft warning by making enum unsigned
llvm-svn: 236436
2015-05-04 18:21:35 +00:00
Davide Italiano
c07bfb4eca [IR/Diagnostic] Assert that DebugLoc is valid before accessing.
PR:		23380
Differential Revision:	http://reviews.llvm.org/D9464
Reviewed by:	dexonsmith

llvm-svn: 236435
2015-05-04 18:08:35 +00:00
Hans Wennborg
1667136f5e Option parsing: properly handle flag aliases for joined options (PR23394)
A joined option always needs to have an argument, even if it's an empty one.

Clang would previously assert when trying to use --extra-warnings, which is
a flag alias for -W, which is a joined option.

llvm-svn: 236434
2015-05-04 18:00:13 +00:00
Ulrich Weigand
b59883a864 [SystemZ] Reclassify f32 subregs of f64 registers
At the moment, all subregs defined by the SystemZ target can be modified
independently of the wider register.  E.g. writing to a GR32 does not
change the upper 32 bits of the GR64.  Writing to an FP32 does not change
the lower 32 bits of the FP64.

Hoewver, the upcoming support for the vector extension redefines FP64 as
one half of a V128.  Floating-point operations leave the other half of
a V128 in an unpredictable state, so it's no longer the case that writing
to an FP32 leaves the bits of the underlying register (the V128) alone.
I'd prefer to have separate subreg_ names for this situation, so that
it's obvious at a glance whether we're talking about a subreg that leaves
the other parts of the register alone.

No behavioral change intended.

Patch originally by Richard Sandiford.

llvm-svn: 236433
2015-05-04 17:41:22 +00:00
Ulrich Weigand
b0ad1dc891 [SystemZ] Clean up AsmParser isMem() handling
We know what MemoryKind an operand has at the time we construct it,
so we might as well just record it in an unused part of the structure.
This makes it easier to add scatter/gather addresses later.

No behavioral change intended.

Patch originally by Richard Sandiford.

llvm-svn: 236432
2015-05-04 17:40:53 +00:00
Ulrich Weigand
e1eb83590c [SystemZ] Fix getTargetNodeName
It seems SystemZTargetLowering::getTargetNodeName got out of sync with
some recent changes to the SystemZISD opcode list.  Add back all the
missing opcodes (and re-sort to the same order as SystemISelLowering.h).

llvm-svn: 236430
2015-05-04 17:39:40 +00:00