mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

115708 Commits

Author SHA1 Message Date
Daniel Jasper
ee5f11800e [MachineLICM] Small cleanup: Constify and rangeify.
NFC.

llvm-svn: 234018
2015-04-03 16:19:48 +00:00
Sanjay Patel
73ded85ef2 use update_llc_test_checks.py to tighten checking; remove darwin and sandybridge overspecification
llvm-svn: 234017
2015-04-03 16:06:58 +00:00
Simon Pilgrim
828e8b6f6d Added vector tests for DAGCombiner::ReassociateOps
Missing vector tests for rL233482

llvm-svn: 234015
2015-04-03 15:04:46 +00:00
Simon Pilgrim
4f794f11fb [X86] Added SSE4.2 CRC32 memory folding patterns + tests
llvm-svn: 234013
2015-04-03 14:24:40 +00:00
Bill Schmidt
3f570f996a [PowerPC] Enable splat generation for BUILD_VECTOR with little endian
When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR
nodes for little endian because wrong results were produced.  I've
subsequently investigated and found this is due to a call to
BuildVectorSDNode::isConstantSplat that was always specifying
big-endian.  With this changed to correctly identify the target
endianness, the optimizations work as expected.

I found another case of a call to the same method with big-endian
hardcoded, in PPC::isAllNegativeZeroVector().  I discovered this was
an orphaned method with no callers, so I've just removed it.

The existing test/CodeGen/PowerPC/vec_constants.ll checks these
optimizations, so for testing I've just added a variant for little
endian.

llvm-svn: 234011
2015-04-03 13:48:24 +00:00
Simon Pilgrim
380961e86b [X86][3DNow] Added 3DNow! memory folding patterns + tests
llvm-svn: 234008
2015-04-03 11:50:30 +00:00
Simon Pilgrim
a600d06557 [X86][MMX] Added MMX stack folding tests
llvm-svn: 234006
2015-04-03 11:01:15 +00:00
Simon Pilgrim
335a565d46 [DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR
This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed.

Differential Revision: http://reviews.llvm.org/D8516

llvm-svn: 234004
2015-04-03 10:02:21 +00:00
Peter Collingbourne
b2266efe11 MC: For variable symbols, maintain MCSymbol::Section as a cache.
Fixes PR19582.

Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:

.data
.Llocal:

.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal

the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.

Or in the following asm:

alias_to_local = Ltmp0
Ltmp0:

the Mach-O object file emitter would give the alias_to_local symbol an n_type
of N_SECT and an n_sect of 0.  This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.

https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist

After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.

This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.

This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.

Re-applies r233595 (aka D8586), which was reverted in r233898.

Differential Revision: http://reviews.llvm.org/D8798

llvm-svn: 233995
2015-04-03 01:46:11 +00:00
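A conceptual C++ sketch of the on-demand lookup described in the commit above; the names are illustrative and do not match the real MCSymbol/MCExpr interfaces.

  // Stand-ins for the MC types; illustrative only.
  struct Section {};
  struct Expr {
    const Section *Resolved = nullptr;               // placeholder for walking the real expression
    const Section *findAssociatedSection() const { return Resolved; }
  };

  class VariableSymbol {
    const Expr *Value = nullptr;                     // RHS of "sym = expr" / ".set sym, expr"
    mutable const Section *Cached = nullptr;         // filled in lazily, not at assignment time

  public:
    // The assignment no longer triggers a section lookup, so it stays correct
    // even when the RHS refers to symbols that have not been seen yet.
    void setVariableValue(const Expr *E) { Value = E; Cached = nullptr; }

    const Section *getSection() const {
      if (!Cached && Value)
        Cached = Value->findAssociatedSection();     // looked up on demand, then cached
      return Cached;
    }
  };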
Daniel Berlin
585bd68759 Return iterator from BasicBlock::eraseFromParent
Summary:
Same as the last patch, but for BasicBlock
(Requires same code movement)

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8801

llvm-svn: 233992
2015-04-03 01:20:33 +00:00
David Blaikie
dad445edfc [opaque pointer types] Push explicit type parameter for geps through the constant folders
Next: more IRBuilder changes.
llvm-svn: 233991
2015-04-03 01:15:16 +00:00
Matthias Braun
d8cacdcd25 ARM: Handle physreg targets in RegPair hints gracefully
Register coalescing can change the target of a RegPair hint to a
physreg; we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.

llvm-svn: 233987
2015-04-03 00:18:38 +00:00
Matthias Braun
aa841574c7 MachineRegisterInfo: Make it clear that hints are for vregs
llvm-svn: 233986
2015-04-03 00:18:33 +00:00
NAKAMURA Takumi
6c1c2d6143 llvm/examples/BrainF: Give an explicit pointee type to ConstantExpr::getGetElementPtr(ty...), according to r233938.
llvm-svn: 233983
2015-04-02 22:44:00 +00:00
Reid Kleckner
012454b680 [ASan] Don't use stack malloc for 32-bit functions using inline asm
This prevents us from running out of registers in the backend.

Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.

Reviewers: kcc, samsonov

Differential Revision: http://reviews.llvm.org/D8790

llvm-svn: 233979
2015-04-02 21:44:55 +00:00
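To illustrate the register-pressure issue described above, here is a hypothetical 32-bit x86 example (GNU extended asm; not taken from the patch or its tests). Each "=m" operand is a stack object the backend can address directly off %esp or %ebp; if those objects lived in a stack-malloc'ed fake frame instead, each address would first have to be materialized in one of the few general-purpose registers.

  // Hypothetical 32-bit x86 example; compiles with GCC/Clang targeting x86.
  // Every "=m" operand below is a plain stack object, so the backend can use
  // simple SP/FP-relative memory operands for all four of them at once.
  void clobber_locals() {
    int a, b, c, d;
    __asm__ volatile("movl $1, %0\n\t"
                     "movl $2, %1\n\t"
                     "movl $3, %2\n\t"
                     "movl $4, %3"
                     : "=m"(a), "=m"(b), "=m"(c), "=m"(d));
  }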
Reid Kleckner
9ca8805526 Fix unused variable in NDEBUG builds
llvm-svn: 233978
2015-04-02 21:43:22 +00:00
Jingyue Wu
f16e3eebfb [SLSR] handles off-bounds GEPs
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have

  &B[i * S] = B + (i * S) * e = B + (i * e) * S

Test Plan: slsr_offbound_gep in slsr-gep.ll

Reviewers: meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8809

llvm-svn: 233949
2015-04-02 21:18:32 +00:00
Reid Kleckner
b08b580e85 [WinEH] Make llvm.eh.actions use frameescape indices for catch params
This makes it possible to use the same representation of llvm.eh.actions
in outlined handlers as we use in the parent function because i32's are
just constants that can be copied freely between functions.

I had to add a sentinel alloca to the list of child allocas so that we
don't try to sink the catch object into the handler. Normally, one would
use nullptr for this kind of thing, but TinyPtrVector doesn't support
null elements. More than that, its elements have to have a suitable
alignment. Therefore, I settled on this for my sentinel:

  AllocaInst *getCatchObjectSentinel() {
    return static_cast<AllocaInst *>(nullptr) + 1;
  }

llvm-svn: 233947
2015-04-02 21:13:31 +00:00
Sanjay Patel
ee1b1c4540 [AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector
Without this patch, we split the 256-bit vector into halves and produced something like:
	movzwl	(%rdi), %eax
	vmovd	%eax, %xmm0
	vxorps	%xmm1, %xmm1, %xmm1
	vblendps	$15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]

Now, we eliminate the xor and blend because those zeros are free with the vmovd:
        movzwl  (%rdi), %eax
        vmovd   %eax, %xmm0

This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685

llvm-svn: 233941
2015-04-02 20:21:52 +00:00
David Blaikie
f300e9b75c [opaque pointer type] API migration for GEP constant factories
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the chance to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.

llvm-svn: 233938
2015-04-02 18:55:32 +00:00
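A rough before/after sketch of the migration described above; the helper and its arguments are hypothetical, and only the explicit pointee-type parameter to ConstantExpr::getGetElementPtr reflects the commit.

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/LLVMContext.h"

  using namespace llvm;

  // Builds the constant expression &Base[0][1], where PointeeTy is the source
  // element type of Base (e.g. [4 x i32] for a global of that type).
  static Constant *buildElementAddress(Type *PointeeTy, Constant *Base,
                                       LLVMContext &Ctx) {
    Type *I32 = Type::getInt32Ty(Ctx);
    Constant *Idxs[] = {ConstantInt::get(I32, 0), ConstantInt::get(I32, 1)};
    // Previously the element type was derived from Base's pointer type:
    //   ConstantExpr::getGetElementPtr(Base, Idxs);
    // After r233938 the pointee type is passed explicitly:
    return ConstantExpr::getGetElementPtr(PointeeTy, Base, Idxs);
  }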
Quentin Colombet
fcd49dac2f [AArch64] Add a comment to make it explicit why we increased the complexity.
Follow-up of r233653.

llvm-svn: 233936
2015-04-02 18:54:23 +00:00
Sanjay Patel
c17be4ec1a [X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073)
For code like this:

define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}

We produce this AVX code:

_load_v8i32:                            ## @load_v8i32
  movl	$7, %eax
  vmovd	%eax, %xmm0
  vxorps	%ymm1, %ymm1, %ymm1
  vblendps	$1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
  retq

There are at least 2 bugs in play here:

    1. We're generating a blend when a scalar move does the same job using 2 fewer instruction bytes (see FIXMEs).
    2. We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. 
The best we have is this comment: "roughly corresponds to the number of nodes that are covered". 
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). 

I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794

llvm-svn: 233931
2015-04-02 17:56:17 +00:00
Adam Nemet
a56777ef49 [LoopAccesses] Handle case when no memchecks are needed after partitioning
llvm-svn: 233930
2015-04-02 17:51:57 +00:00
Reid Kleckner
53dc058e4d Add an LLVM_PTR_SIZE macro to make LLVM_ALIGNAS more useful
MSVC 2013 requires the argument to __declspec(align()) to be an integer
constant expression that doesn't involve any identifiers like sizeof.

For GCC and Clang, LLVM_PTR_SIZE is equivalent to __SIZEOF_POINTER__,
which dates back to GCC 4.6 and Clang in 2010. If that's not available, we
get sizeof(void*), which works with alignas() and
__attribute__((aligned())).

For MSVC, LLVM_PTR_SIZE is 4 or 8 depending on _WIN64.

llvm-svn: 233929
2015-04-02 17:43:35 +00:00
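A sketch of the kind of definition the commit describes; the real macro in llvm/Support/Compiler.h may differ in names and exact conditions.

  // GCC (since 4.6) and Clang predefine __SIZEOF_POINTER__, which MSVC 2013
  // accepts inside __declspec(align()); MSVC itself does not define it, so
  // fall back on _WIN64/_WIN32, and finally on sizeof(void *).
  #ifndef LLVM_PTR_SIZE
  # ifdef __SIZEOF_POINTER__
  #  define LLVM_PTR_SIZE __SIZEOF_POINTER__
  # elif defined(_WIN64)
  #  define LLVM_PTR_SIZE 8
  # elif defined(_WIN32)
  #  define LLVM_PTR_SIZE 4
  # else
  #  define LLVM_PTR_SIZE sizeof(void *)  // fine for alignas()/__attribute__((aligned()))
  # endif
  #endif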
Eli Bendersky
6b1c001175 Fix typo and reword in LangRef
Patch by Douglas Katzman

Differential Revision: http://reviews.llvm.org/D8785

llvm-svn: 233920
2015-04-02 15:20:04 +00:00
Benjamin Kramer
ae20520543 [alignof] Put back the hack for old versions of GCC.
This works around a bug (PR56859) that is fixed in all versions of GCC I tested
with but was present in 4.8.0. Using 4.8.0 is of course a terrible idea, but it looks
like we can't drop it just yet.

llvm-svn: 233916
2015-04-02 13:31:50 +00:00
Benjamin Kramer
6923c15ed3 [support] Add a macro wrapper for alignas and simplify some code.
We wrap __attribute__((aligned)) for GCC 4.7 and __declspec(align) for
MSVC. The latter behaves oddly in some contexts, so this should be used
carefully.

llvm-svn: 233910
2015-04-02 11:32:48 +00:00
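A minimal sketch of such a wrapper under assumed version checks; the actual guards in llvm/Support/Compiler.h are more involved.

  #if defined(_MSC_VER)
    // __declspec(align()) only accepts simple integer constants; see the
    // LLVM_PTR_SIZE commit above for one consequence of that restriction.
  # define LLVM_ALIGNAS(x) __declspec(align(x))
  #elif defined(__GNUC__) && !defined(__clang__) && \
        (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
    // Old GCC (4.7 and earlier) lacks alignas; use the attribute instead.
  # define LLVM_ALIGNAS(x) __attribute__((aligned(x)))
  #else
  # define LLVM_ALIGNAS(x) alignas(x)
  #endif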
Vasileios Kalintiris
633e903757 [mips] Implement eliminateCallFramePseudoInstr() in MipsFrameLowering. NFC.
Summary:
Avoid duplicate code in Mips16FrameLowering and MipsSEFrameLowering by
providing an implementation of the eliminateCallFramePseudoInstr()
function from their base class.

Depends on D8640.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8641

llvm-svn: 233909
2015-04-02 11:09:40 +00:00
Elena Demikhovsky
74d944b41a AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 233906
2015-04-02 10:51:40 +00:00
Vasileios Kalintiris
7f63d1f4eb [mips] Expose adjustStackPtr() from MipsInstrInfo. NFC.
Summary:
adjustStackPtr() is implemented in both MipsSEInstrInfo and
Mips16InstrInfo. It makes sense to expose this function from
MipsInstrInfo and avoid explicit casting in some cases.

Depends on D8638.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8640

llvm-svn: 233905
2015-04-02 10:42:44 +00:00
Vasileios Kalintiris
861cb631df [mips] Make sure that we don't adjust the stack pointer by zero amount.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8638

llvm-svn: 233904
2015-04-02 10:14:54 +00:00
Vladimir Sukharev
8943e09b29 [ARM] Rename v8.1a from "extension" to "architecture": follow-up
Adds the forgotten change to remove the excess "generic-armv8.1-a" cpu

Subscribers: llvm-commits

Completion of http://reviews.llvm.org/rL233811

llvm-svn: 233903
2015-04-02 09:32:14 +00:00
Peter Collingbourne
08b6124ef0 Revert r233595, "MC: For variable symbols, maintain MCSymbol::Section as a cache."
llvm-svn: 233898
2015-04-02 07:02:51 +00:00
Lang Hames
ecdf9920d8 [Orc] Fix local-linkage handling in the CompileOnDemand layer.
llvm-svn: 233895
2015-04-02 05:28:10 +00:00
Philip Reames
b94079ff0c [gcroot] Remove unused items from an enum
These two were never implemented for gcroot, so there's no point in keeping them around now.

llvm-svn: 233892
2015-04-02 05:02:16 +00:00
Philip Reames
d3d2df55e2 Teach gcroot how to handle dynamically realigned frames
I'm playing with supporting custom stack map formats with statepoints.  While 
doing so, I noticed that the existing implementation didn't indicate inherently 
unsized frames.  This change essentially just ports the functionality that already 
exists for the default StackMaps section to custom stackmaps.

llvm-svn: 233891
2015-04-02 05:00:40 +00:00
Sanjoy Das
29c7d72cfb [ADT] Increment epoch from DenseMap::swap.
Swapping DenseMap A with DenseMap B invalidates iterators pointing into
both A and B.

llvm-svn: 233890
2015-04-02 04:58:12 +00:00
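A conceptual sketch of the epoch check this relies on; the names below are illustrative, and LLVM's own DebugEpochBase machinery differs in detail.

  #include <cstdint>

  // Illustrative only; not the real DenseMap/DebugEpochBase code.
  struct EpochedMap {
    std::uint64_t Epoch = 0;
    void incrementEpoch() { ++Epoch; }

    void swapWith(EpochedMap &Other) {
      incrementEpoch();        // iterators into *this are now considered stale
      Other.incrementEpoch();  // ...and so are iterators into Other
      // the maps' contents would be exchanged here
    }
  };

  struct Iterator {
    const EpochedMap *Parent;
    std::uint64_t EpochAtCreation;
    explicit Iterator(const EpochedMap &M)
        : Parent(&M), EpochAtCreation(M.Epoch) {}
    // A debug build would assert this before every dereference.
    bool isStillValid() const { return Parent->Epoch == EpochAtCreation; }
  };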
Sanjoy Das
efa5f61003 [ADT] Remove dead code.
Nothing is using DenseMapBase::swap.  Besides, it does not compile in its
current form.

llvm-svn: 233889
2015-04-02 04:58:08 +00:00
Adam Nemet
86058cb79c [LoopAccesses] Remove unused global variables in tests
llvm-svn: 233887
2015-04-02 04:42:51 +00:00
Lang Hames
3dbd9b3b85 [Orc] Add support classes for inspecting and running C++ static ctor/dtors, and
use these to add support for C++ static ctors/dtors to the Orc-lazy JIT in LLI.

Replace the trivial_retval_1 regression test - the new 'hello' test covers
strictly more code.

llvm-svn: 233885
2015-04-02 04:34:45 +00:00
Craig Topper
c61d7739c0 Don't print an error message when looking up the scheduling model if the user specified -mcpu=help.
llvm-svn: 233884
2015-04-02 04:27:50 +00:00
Alexey Samsonov
03b2851bcd Fix a bug indicated by -fsanitize=shift-exponent.
llvm-svn: 233881
2015-04-02 01:30:10 +00:00
Daniel Berlin
88d447a8c9 Mark this inline properly
llvm-svn: 233870
2015-04-02 00:03:15 +00:00
Daniel Berlin
bb6f3adec8 Return iterator from Instruction::eraseFromParent.
Summary:
This is necessary in order to make removal work while using reverse iterators.

(See http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/084122.html)

Updates to other eraseFromParent's to come in later patches.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8783

llvm-svn: 233869
2015-04-02 00:03:07 +00:00
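A minimal sketch (not part of the patch) of the forward-iteration pattern this return value enables; isDead is a hypothetical predicate, and the reverse-iterator use case from the thread above builds on the same return value.

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"

  using namespace llvm;

  // Erase instructions while walking a block, without manually saving the next
  // iterator before each erase. Assumes the erased instructions have no
  // remaining uses.
  static void removeMatching(BasicBlock &BB,
                             bool (*isDead)(const Instruction &)) {
    for (BasicBlock::iterator It = BB.begin(), End = BB.end(); It != End;) {
      if (isDead(*It))
        It = It->eraseFromParent();  // returned iterator points at the successor
      else
        ++It;
    }
  }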
Matthias Braun
d748ed9399 TableGen: Generate more const goodness
llvm-svn: 233857
2015-04-01 22:09:55 +00:00
Kevin Enderby
2cb84b4af8 Fix a sanitizer-x86_64-linux-fast failure caused by not deleting the bindtable.
llvm-svn: 233856
2015-04-01 21:50:45 +00:00
Kostya Serebryany
af347bcc4a [fuzzer] document the -tokens flag. Also change the diagnostic output
llvm-svn: 233842
2015-04-01 21:33:20 +00:00
Matthias Braun
599e837694 Remove declarations for nonexistent methods
llvm-svn: 233841
2015-04-01 21:16:02 +00:00
Kevin Enderby
2bc88f43cf Add the option -objc-meta-data to llvm-objdump used with -macho to
print the Objective-C runtime meta data for Mach-O files.

There are three types of Objective-C runtime meta data, Objc2 64-bit,
Objc2 32-bit and Objc1 32-bit.  This prints the first of these types. The
changes to print the others will follow next.

llvm-svn: 233840
2015-04-01 20:57:01 +00:00
Benjamin Kramer
a711d21dc6 [X86] Don't accidentally select shll $1, %eax when shrinking an immediate.
addl has higher throughput, and this was needlessly picking a suboptimal
encoding, causing PR23098.

I wish there was a way of doing this without further duplicating tbl-
generated patterns, but so far I haven't found one.

llvm-svn: 233832
2015-04-01 19:01:09 +00:00