1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

29436 Commits

Author SHA1 Message Date
Rafael Espindola
4214740c57 Use sext in fast isel.
Fast isel used to zero extends immediates to 64 bits. This normally goes
unnoticed because the value is truncated to 32 bits for output.

Two cases were it is noticed:

* We fail to use smaller encodings.
* If the original constant was smaller than i32.

In the tests using i1 constants, codegen would change to use -1, which is fine
(and matches what regular isel does) since only the lowest bit is then used.

Instead, this patch then changes the ir to use i8 constants, which looks more
like what clang produces.

llvm-svn: 234249
2015-04-06 22:29:07 +00:00
Reid Kleckner
b881568c8b [WinEH] Don't sink allocas into child handlers
The uselist isn't enough to infer anything about the lifetime of such
allocas. If we want to re-add this optimization, we will need to
leverage lifetime markers to do it.

Fixes PR23122.

llvm-svn: 234196
2015-04-06 18:50:38 +00:00
Tim Northover
497a3115c3 ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available.
After recognising that a certain narrow instruction might need a relocation to
be represented, we used to unconditionally relax it to a Thumb2 instruction to
permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2
instructions, so we end up emitting a completely invalid instruction.

Theoretically, ELF does have relocations for these situations; but they are
fairly unusable with such short ranges and the ABI document even says they're
documented "for completeness". So an error is probably better there too.

rdar://20391953

llvm-svn: 234195
2015-04-06 18:44:42 +00:00
Simon Pilgrim
a5a17c0b23 [X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets
This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into a XMM register instead of merging pairs of i8 scalars into a i16 and using the SSE2 PINSRW instruction.

This allows folding of byte loads and reduces scalar register usage as well.

Differential Revision: http://reviews.llvm.org/D8839

llvm-svn: 234193
2015-04-06 18:39:00 +00:00
Kevin Enderby
d56c59218e Fix failure on builder llvm-clang-lld-x86_64-debian-fast as the
test macho-objc-meta-data.test had a line it shouldn't have had.

llvm-svn: 234190
2015-04-06 18:18:23 +00:00
Kevin Enderby
3bf355af03 For llvm-objdump added support for printing Objc2 32-bit runtime meta data
with the existing -objc-meta-data and -macho options for Mach-O files.

llvm-svn: 234185
2015-04-06 17:47:03 +00:00
Jingyue Wu
cee19c8e7e [SLSR] consider &B[S << i] as &B[(1 << i) * S]
Summary: This reduces handling &B[(1 << i) * s] to handling &B[i * S].

Test Plan: slsr-gep.ll

Reviewers: meheff

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D8837

llvm-svn: 234180
2015-04-06 17:15:48 +00:00
Simon Pilgrim
e8ed19bff1 [DAGCombiner] Add support for FCEIL, FFLOOR and FTRUNC vector constant folding
Differential Revision: http://reviews.llvm.org/D8715

llvm-svn: 234179
2015-04-06 17:15:41 +00:00
Duncan P. N. Exon Smith
acefeaad01 Verifier: Check composite type template params
Add missing checks for `templateParams:` in `MDCompositeType`.  Pull the
current check for `MDSubprogram` to reduce duplicated code and fix it up
to print a good message when the immediate operand isn't an `MDTuple`
(as a drive-by, make the same fix to `variables:` in `MDSubprogram`).

llvm-svn: 234177
2015-04-06 17:04:58 +00:00
Rafael Espindola
6fa49967ca Use a comma after the unique keyword.
H.J. Lu noted that all .section options are separated by a comma.

This patch changes the syntax of unique to require one.

llvm-svn: 234174
2015-04-06 16:34:41 +00:00
Rafael Espindola
91f7164378 Be consistent when deciding if a relocation is needed.
Before when deciding if we needed a relocation in A-B, we wore only checking
if A was weak.

This fixes the asymmetry.

The "InSet" argument should probably be renamed to "ForValue", since InSet is
very MachO specific, but doing so in this patch would make it hard to read.

This fixes PR22815.

llvm-svn: 234165
2015-04-06 15:27:57 +00:00
Rafael Espindola
5574b26c05 Store the sh_link of ARM_EXIDX directly in MCSectionELF.
This avoids some pretty horrible and broken name based section handling.

llvm-svn: 234142
2015-04-06 04:25:18 +00:00
Rafael Espindola
0601e2fab8 Implement unique sections with an unique ID.
This allows the compiler/assembly programmer to switch back to a
section. This in turn fixes the bootstrap failure on powerpc (tested
on gcc110) without changing the ppc codegen at all.

I will try to cleanup the various getELFSection overloads in a  followup patch.
Just using a default argument now would lead to ambiguities.

llvm-svn: 234099
2015-04-04 18:02:01 +00:00
Simon Pilgrim
9b09e6fee5 [DAGCombiner] Canonicalize vector constants for ADD/MUL/AND/OR/XOR re-association
Scalar integers are commuted to move constants to the RHS for re-association - this ensures vectors do the same.

llvm-svn: 234092
2015-04-04 10:20:31 +00:00
Eric Christopher
97beb00b59 Strip trailing whitespace and reword explanatory comment.
llvm-svn: 234078
2015-04-04 02:26:47 +00:00
Craig Topper
54ac1d95cd [X86] Don't use GR64 register 'and with immediate' instructions if the immediate is zero in the upper 33-bits or upper 57-bits. Use GR32 instructions instead.
Previously the patterns didn't have high enough priority and we would only use the GR32 form if the only the upper 32 or 56 bits were zero.

Fixes PR23100.

llvm-svn: 234075
2015-04-04 02:08:20 +00:00
David Majnemer
f0e072b4f9 [WinEH] Fill out CatchHigh in the TryBlockMap
Now all fields in the WinEH xdata have been filled out.

llvm-svn: 234067
2015-04-03 23:37:34 +00:00
David Majnemer
4a823b111e [WinEH] Fill out .xdata for catch objects
This add support for catching an exception such that an exception object
available to the catch handler will be initialized by the runtime.

llvm-svn: 234062
2015-04-03 22:49:05 +00:00
David Majnemer
694a466675 [WinEH] Sink UnwindHelp completely out of IR
We don't need to represent UnwindHelp in IR.  Instead, we can use the
knowledge that we are emitting the parent function to decide if we
should create the UnwindHelp stack object.

llvm-svn: 234061
2015-04-03 22:32:26 +00:00
David Majnemer
2af32ca579 Fix a typo
CHECK-LABEL had the wrong function name.

llvm-svn: 234051
2015-04-03 20:56:24 +00:00
David Majnemer
2243ffe08d [InstCombine] Use DataLayout to determine vector element width
InstCombine didn't realize that it needs to use DataLayout to determine
how wide pointers are.  This lead to assertion failures.

This fixes PR23113.

llvm-svn: 234046
2015-04-03 20:18:40 +00:00
Andrew Kaylor
cd70932cf9 [WinEH] Handle nested landing pads in outlined catch handlers
Differential Revision: http://reviews.llvm.org/D8596

llvm-svn: 234041
2015-04-03 19:37:50 +00:00
Sanjay Patel
ac7b801edf use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
llvm-svn: 234029
2015-04-03 17:17:50 +00:00
Sanjay Patel
09dfdcbcd0 use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
llvm-svn: 234027
2015-04-03 17:13:31 +00:00
Sanjay Patel
ad7b3271f0 use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
llvm-svn: 234024
2015-04-03 17:09:37 +00:00
Sanjay Patel
682732f0ca use update_llc_test_checks.py to tighten checking
remove redundant and unnecessary test parameters

llvm-svn: 234022
2015-04-03 17:02:48 +00:00
Duncan P. N. Exon Smith
9bfb0f6678 Verifier: Check that inlined-at locations agree
Check that the `MDLocalVariable::getInlinedAt()` in a debug info
intrinsic's variable always matches the `MDLocation::getInlinedAt()` of
its `!dbg` attachment.

The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), since it's expensive and unnecessary, but I'll let
this verifier check bake for a while (a week maybe?) first.  I've
updated the testcases that had the wrong value for `inlinedAt:`.

This checks that things are sane in the IR, but currently things go out
of whack in a few places in the backend.  I'll follow shortly with
assertions in the backend (with code fixes).

If you have out-of-tree testcases that just started failing, here's how
I updated these ones:

 1. The verifier check gives you the basic block, function, instruction,
    and relevant metadata arguments (metadata numbering doesn't
    necessarily match the source file, unfortunately).
 2. Look at the `@llvm.dbg.*()` instruction, and compare the
    `inlinedAt:` fields of the variable argument (second `metadata`
    argument) and the `!dbg` attachment.
 3. Figure out based on the variable `scope:` chain and the functions in
    the file whether the variable has been inlined (and into what), so
    you can determine which `inlinedAt:` is actually correct.  In all of
    the in-tree testcases, the `!MDLocation()` was correct and the
    `!MDLocalVariable()` was wrong, but YMMV.
 4. Duplicate the metadata that you're going to change, and add/drop the
    `inlinedAt:` field from one of them.  Be careful that the other
    references to the same metadata node point at the correct one.

llvm-svn: 234021
2015-04-03 16:54:30 +00:00
Sanjay Patel
074d61ed44 add checks; remove redundant testing parameters
llvm-svn: 234020
2015-04-03 16:44:42 +00:00
Duncan P. N. Exon Smith
e27224ace5 CodeGen: Fix MachineInstr::print() for DBG_VALUE
Grab the `MDLocalVariable` from the second-to-last argument; the last
argument is an `MDExpression`, and mixing them up will crash.

llvm-svn: 234019
2015-04-03 16:23:04 +00:00
Sanjay Patel
73ded85ef2 use update_llc_test_checks.py to tighten checking; remove darwin and sandybridge overspecification
llvm-svn: 234017
2015-04-03 16:06:58 +00:00
Simon Pilgrim
828e8b6f6d Added vector tests for DAGCombiner::ReassociateOps
Missing vector tests for rL233482

llvm-svn: 234015
2015-04-03 15:04:46 +00:00
Simon Pilgrim
4f794f11fb [X86] Added SSE4.2 CRC32 memory folding patterns + tests
llvm-svn: 234013
2015-04-03 14:24:40 +00:00
Bill Schmidt
3f570f996a [PowerPC] Enable splat generation for BUILD_VECTOR with little endian
When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR
nodes for little endian because wrong results were produced.  I've
subsequently investigated and found this is due to a call to
BuildVectorSDNode::isConstantSplat that was always specifying
big-endian.  With this changed to correctly identify the target
endianness, the optimizations work as expected.

I found another case of a call to the same method with big-endian
hardcoded, in PPC::isAllNegativeZeroVector().  I discovered this was
an orphaned method with no callers, so I've just removed it.

The existing test/CodeGen/PowerPC/vec_constants.ll checks these
optimizations, so for testing I've just added a variant for little
endian.

llvm-svn: 234011
2015-04-03 13:48:24 +00:00
Simon Pilgrim
380961e86b [X86][3DNow] Added 3DNow! memory folding patterns + tests
llvm-svn: 234008
2015-04-03 11:50:30 +00:00
Simon Pilgrim
a600d06557 [X86][MMX] Added MMX stack folding tests
llvm-svn: 234006
2015-04-03 11:01:15 +00:00
Simon Pilgrim
335a565d46 [DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR
This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed.

Differential Revision: http://reviews.llvm.org/D8516

llvm-svn: 234004
2015-04-03 10:02:21 +00:00
Peter Collingbourne
b2266efe11 MC: For variable symbols, maintain MCSymbol::Section as a cache.
Fixes PR19582.

Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:

.data
.Llocal:

.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal

the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.

Or in the following asm:

alias_to_local = Ltmp0
Ltmp0:

the Mach-O object file emitter would give the alias_to_local symbol a n_type
of N_SECT and a n_sect of 0.  This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.

https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist

After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.

This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.

This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.

Re-applies r233595 (aka D8586), which was reverted in r233898.

Differential Revision: http://reviews.llvm.org/D8798

llvm-svn: 233995
2015-04-03 01:46:11 +00:00
Matthias Braun
d8cacdcd25 ARM: Handle physreg targets in RegPair hints gracefully
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.

llvm-svn: 233987
2015-04-03 00:18:38 +00:00
Reid Kleckner
012454b680 [ASan] Don't use stack malloc for 32-bit functions using inline asm
This prevents us from running out of registers in the backend.

Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.

Reviewers: kcc, samsonov

Differential Revision: http://reviews.llvm.org/D8790

llvm-svn: 233979
2015-04-02 21:44:55 +00:00
Jingyue Wu
f16e3eebfb [SLSR] handles off bounds GEPs
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have

  &B[i * S] = B + (i * S) * e = B + (i * e) * S

Test Plan: slsr_offbound_gep in slsr-gep.ll

Reviewers: meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8809

llvm-svn: 233949
2015-04-02 21:18:32 +00:00
Reid Kleckner
b08b580e85 [WinEH] Make llvm.eh.actions use frameescape indices for catch params
This makes it possible to use the same representation of llvm.eh.actions
in outlined handlers as we use in the parent function because i32's are
just constants that can be copied freely between functions.

I had to add a sentinel alloca to the list of child allocas so that we
don't try to sink the catch object into the handler. Normally, one would
use nullptr for this kind of thing, but TinyPtrVector doesn't support
null elements. More than that, it's elements have to have a suitable
alignment. Therefore, I settled on this for my sentinel:

  AllocaInst *getCatchObjectSentinel() {
    return static_cast<AllocaInst *>(nullptr) + 1;
  }

llvm-svn: 233947
2015-04-02 21:13:31 +00:00
Sanjay Patel
ee1b1c4540 [AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector
Without this patch, we split the 256-bit vector into halves and produced something like:
	movzwl	(%rdi), %eax
	vmovd	%eax, %xmm0
	vxorps	%xmm1, %xmm1, %xmm1
	vblendps	$15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]

Now, we eliminate the xor and blend because those zeros are free with the vmovd:
        movzwl  (%rdi), %eax
        vmovd   %eax, %xmm0

This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685

llvm-svn: 233941
2015-04-02 20:21:52 +00:00
Sanjay Patel
c17be4ec1a [X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073)
For code like this:

define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}

We produce this AVX code:

_load_v8i32:                            ## @load_v8i32
  movl	$7, %eax
  vmovd	%eax, %xmm0
  vxorps	%ymm1, %ymm1, %ymm1
  vblendps	$1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
  retq

There are at least 2 bugs in play here:

    We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
    We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. 
The best we have is this comment: "roughly corresponds to the number of nodes that are covered". 
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). 

I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794

llvm-svn: 233931
2015-04-02 17:56:17 +00:00
Elena Demikhovsky
74d944b41a AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 233906
2015-04-02 10:51:40 +00:00
Vasileios Kalintiris
861cb631df [mips] Make sure that we don't adjust the stack pointer by zero amount.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8638

llvm-svn: 233904
2015-04-02 10:14:54 +00:00
Peter Collingbourne
08b6124ef0 Revert r233595, "MC: For variable symbols, maintain MCSymbol::Section as a cache."
llvm-svn: 233898
2015-04-02 07:02:51 +00:00
Lang Hames
ecdf9920d8 [Orc] Fix local-linkage handling in the CompileOnDemand layer.
llvm-svn: 233895
2015-04-02 05:28:10 +00:00
Philip Reames
d3d2df55e2 Teach gcroot how to handle dynamically realigned frames
I'm playing with supporting custom stack map formats with statepoints.  While 
doing so, I noticed that the existing implementation didn't indicate inherently 
unsized frames.  This change essentially just ports the functionality that already 
exists for the default StackMaps section to custom stackmaps.

llvm-svn: 233891
2015-04-02 05:00:40 +00:00
Adam Nemet
86058cb79c [LoopAccesses] Remove unused global variables in tests
llvm-svn: 233887
2015-04-02 04:42:51 +00:00
Lang Hames
3dbd9b3b85 [Orc] Add support classes for inspecting and running C++ static ctor/dtors, and
use these to add support for C++ static ctors/dtors to the Orc-lazy JIT in LLI.

Replace the trivial_retval_1 regression test - the new 'hello' test is covering
strictly more code. 

llvm-svn: 233885
2015-04-02 04:34:45 +00:00