mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

105879 Commits

Author SHA1 Message Date
Juergen Ributzka
dd1be32354 [RuntimeDyld][AArch64] Update relocation tests and also add a simple GOT test.
llvm-svn: 213807
2014-07-23 22:23:17 +00:00
Eric Christopher
052c4b4f44 Remove the query for TargetMachine and TargetInstrInfo since we're
already inside TargetInstrInfo.

llvm-svn: 213806
2014-07-23 22:12:03 +00:00
David Blaikie
eaf4b8c44f ArgPromo+DebugInfo: Handle updating debug info over multiple applications of argument promotion.
While the subprogram map cache used by Dead Argument Elimination works
there, I made a mistake when reusing it for Argument Promotion in
r212128 because ArgPromo may transform functions more than once whereas
DAE transforms each function only once, removing all the dead arguments
in one go.

To address this, ensure that the map is updated after each argument
promotion.

In retrospect it might be a little wasteful to create a map of all
subprograms when only handling a single CGSCC, but the alternative is
walking the debug info for each function in the CGSCC that gets updated.
It's not clear to me what the right tradeoff is there, but since the
current tradeoff seems to be working OK (and the code to keep things
updated is very cheap), let's stick with that for now.

llvm-svn: 213805
2014-07-23 22:09:29 +00:00
David Blaikie
ab62d52eec Test debug info in arg promotion with an actual promotion case, rather than a degenerate arg promotion that's actually DAE performed by ArgPromo
Also, the debug location I had here was bogus (it described the call site as
being located in the callee) and unnecessary, so just drop it.

llvm-svn: 213803
2014-07-23 21:30:59 +00:00
Jim Grosbach
98597dd715 Use an explicit triple in testcase.
Make the test work better on non-darwin hosts. Hopefully.

llvm-svn: 213801
2014-07-23 20:46:32 +00:00
Jim Grosbach
f6143714d5 [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison applies when the constant is not a splat of a scalar
as well.

llvm-svn: 213800
2014-07-23 20:41:43 +00:00
Jim Grosbach
5e39096957 X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

llvm-svn: 213799
2014-07-23 20:41:38 +00:00
Jim Grosbach
ae8284ac32 DAG: fp->int conversion for non-splat constants.
Constant fold the lanes of the input constant build_vector individually
so we correctly handle the case where the vector elements are not all the
same constant value.

PR20394
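
A minimal sketch of the per-lane idea (a hypothetical standalone helper, not the actual DAGCombiner code): fold every lane of the constant vector on its own rather than assuming all lanes hold the same value.

#include <cmath>
#include <cstdint>
#include <vector>

static std::vector<int32_t> foldFPToSILanes(const std::vector<double> &Lanes) {
  std::vector<int32_t> Out;
  Out.reserve(Lanes.size());
  for (double L : Lanes)                                // fold each lane separately,
    Out.push_back(static_cast<int32_t>(std::trunc(L))); // not just lane 0 splatted
  return Out;
}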

llvm-svn: 213798
2014-07-23 20:41:31 +00:00
Justin Holewinski
d4b153f954 [NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types
llvm-svn: 213794
2014-07-23 20:23:49 +00:00
Justin Holewinski
d906a7b7fa [NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.

llvm-svn: 213793
2014-07-23 20:23:47 +00:00
Mark Heffernan
6e2086acbe Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full).
llvm-svn: 213789
2014-07-23 20:05:44 +00:00
Juergen Ributzka
2a176cc855 [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

llvm-svn: 213788
2014-07-23 20:03:13 +00:00
Justin Holewinski
851431280b [NVPTX] mul.wide generation works for any smaller integer source types, not just the next smaller power of two
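(For instance, a 32-bit multiply whose operands are sign-extended from i8 can presumably be lowered to mul.wide.s16 by first extending the operands to i16, instead of being rejected because i8 is not exactly half of i32.)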
llvm-svn: 213784
2014-07-23 18:46:03 +00:00
Robert Khasanov
4f172d9c81 [SKX] Added missed test files for rev 213757
llvm-svn: 213780
2014-07-23 18:17:49 +00:00
Saleem Abdulrasool
ef57f88f77 AsmParser: remove deprecated LLIR support
linker_private and linker_private_weak were deprecated in 3.5.  Remove support
for them now that the 3.5 branch has been created.

llvm-svn: 213777
2014-07-23 18:09:31 +00:00
Saleem Abdulrasool
d67537b102 ExecutionEngine: remove a stray semicolon
Detected via GCC 4.8 [-Wpedantic].

llvm-svn: 213776
2014-07-23 18:09:28 +00:00
Robert Khasanov
4e33d5c3f9 [SKX] Fix lowercase "error:" in rev 213757
llvm-svn: 213774
2014-07-23 17:42:13 +00:00
Justin Holewinski
ece54a0498 [NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled
With optimizations disabled, we disable the isel patterns for mul.wide; but we
were still generating MULWIDE ISD nodes.  Now, we only try to generate MULWIDE
ISD nodes in DAGCombine if the optimization level is not zero.

llvm-svn: 213773
2014-07-23 17:40:45 +00:00
Mark Heffernan
cf39d19c7f In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full".
llvm-svn: 213772
2014-07-23 17:31:37 +00:00
Alex Lorenz
6942a2c804 test commit: remove trailing space
llvm-svn: 213770
2014-07-23 17:18:05 +00:00
Chad Rosier
2cb7fd5dae [AArch64] Lower sdiv x, pow2 using add + select + shift.
The target-independent DAGCombiner will generate:
asr w1, X, #31         // w1 = splat sign bit
add X, X, w1, lsr #28  // X = X + 0 or pow2-1
asr w0, X, #4          // w0 = X/pow2

However, the add + shifts sequence is expensive, so generate instead:
add w0, X, 15          // w0 = X + pow2-1
cmp X, wzr             // X - 0
csel X, w0, X, lt      // X = (X < 0) ? X + pow2-1 : X
asr w0, X, #4          // w0 = X/pow2
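
(For concreteness, both sequences implement a signed division by 16: pow2 = 16, pow2-1 = 15, and the final arithmetic shift right by 4 performs the division.)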

llvm-svn: 213758
2014-07-23 14:57:52 +00:00
Robert Khasanov
cfc9aa43e1 [SKX] Enabling mask instructions: encoding, lowering
KMOVB, KMOVW, KMOVD, KMOVQ, KNOTB, KNOTW, KNOTD, KNOTQ

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 213757
2014-07-23 14:49:42 +00:00
Tim Northover
869fa46eae ARM: spot SBFX-compatible code expressed with sign_extend_inreg
We were assuming all SBFX-like operations would have the shl/asr form, but
often when the field being extracted is an i8 or i16, we end up with a
SIGN_EXTEND_INREG acting on a shift instead. Simple enough to check for though.
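
(For example, extracting a signed byte starting at bit 8 may show up as sign_extend_inreg(srl x, 8) with type i8 rather than as an shl/asr pair, yet both describe the same SBFX.)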

llvm-svn: 213754
2014-07-23 13:59:12 +00:00
Tim Northover
68aeddde08 ARM: add patterns for [su]xta[bh] from just a shift.
Although the final shifter operand is a rotate, this actually only matters for
the half-word extends when the amount == 24. Otherwise folding a shift in is
just as good.

llvm-svn: 213753
2014-07-23 13:59:07 +00:00
James Molloy
41a2f5b855 Enable partial libcall inlining for all targets by default.
This pass attempts to speculatively use a sqrt instruction if one exists on the target, falling back to a libcall if the target instruction returns NaN.

This was enabled for MIPS and SystemZ, but it is well guarded and is good for most targets - GCC does this for X86, ARM and AArch64 (the targets I've checked).
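
A minimal sketch of the shape this pass emits, written as C++ for illustration (the function name is hypothetical; the real pass rewrites IR, and __builtin_sqrt stands in for the target's sqrt instruction):

#include <cmath>

double speculativeSqrt(double X) {
  double R = __builtin_sqrt(X); // speculate on the hardware instruction
  if (R != R)                   // NaN result (e.g. negative input):
    R = std::sqrt(X);           // fall back to the libcall for errno semantics
  return R;
}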

llvm-svn: 213752
2014-07-23 13:33:00 +00:00
Tilmann Scheller
189ad507d0 [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRB instructions.
The ARM ARM prohibits STRB instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRB instructions with unpredictable behavior.
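
For example, the pre-indexed form "strb r0, [r0, #4]!" both stores r0 and writes the updated address back into r0; the ARM ARM classes this as unpredictable, and the assembler now rejects it.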

llvm-svn: 213750
2014-07-23 13:03:47 +00:00
Daniel Sanders
8566b88442 Added release notes for MIPS.
llvm-svn: 213749
2014-07-23 12:59:26 +00:00
Tim Northover
c357579164 AArch64: remove "arm64_be" support in favour of "aarch64_be".
There really is no arm64_be: it was a useful fiction to test big-endian support
while both backends existed in parallel, but now the only platform that uses
the name (iOS) doesn't have a big-endian variant, let alone one called
"arm64_be".

llvm-svn: 213748
2014-07-23 12:58:11 +00:00
Tilmann Scheller
ea661e77fd [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STR instructions.
The ARM ARM prohibits STR instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STR instructions with unpredictable behavior.

llvm-svn: 213745
2014-07-23 12:38:17 +00:00
Tim Northover
3e8fd62854 AArch64: remove arm64 triple enumerator.
Having both Triple::arm64 and Triple::aarch64 is extremely confusing, and
invites bugs where only one is checked. In reality, the only legitimate
difference between the two (arm64 usually means iOS) is also present in the OS
part of the triple and that's what should be checked.

We still parse the "arm64" triple, just canonicalise it to Triple::aarch64, so
there aren't any LLVM-side test changes.
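
A minimal sketch of what a check looks like once the enumerator is gone (a hypothetical helper, not actual LLVM code): the iOS-vs-other distinction that Triple::arm64 used to imply is read from the OS component instead.

#include "llvm/ADT/Triple.h"

static bool isDarwinAArch64(const llvm::Triple &T) {
  return T.getArch() == llvm::Triple::aarch64 && T.isOSDarwin();
}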

llvm-svn: 213743
2014-07-23 12:32:47 +00:00
Andrea Di Biagio
e373d3d06d Revert r211771. It was: "[X86] Improve the selection of SSE3/AVX addsub instructions".
This change fully reverts r211771.
That revision added a canonicalization rule which has the potential to cause a
combine-cycle in the target-independent canonicalizing DAG combine.

The plan is to move the logic that forms target-specific addsub nodes into the
lowering of shuffles.

llvm-svn: 213736
2014-07-23 11:20:24 +00:00
Chandler Carruth
d673be21b7 [x86] Clean up a test case to use check labels and spell out the exact
instruction sequences with CHECK-NEXT for these test cases.

This notably exposes how absolutely horrible the generated code is for
several of these test cases, and will make any future changes to the
generated code easy to spot as our vector instruction selection gets better.
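
(For example, each function in the test gets a "; CHECK-LABEL:" line naming its symbol, followed by "; CHECK-NEXT:" lines pinning the exact instruction sequence, so matches can no longer bleed between functions.)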

llvm-svn: 213732
2014-07-23 09:11:48 +00:00
Tilmann Scheller
92abecadc2 [ARM] Add regression test for the earlyclobber constraint of ARM STRB.
The constraint was added in r213369.

llvm-svn: 213730
2014-07-23 08:39:50 +00:00
Tilmann Scheller
e0c6a75a6a [ARM] Add earlyclobber constraint to pre/post-indexed ARM STRH instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STRH instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined. However, it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well: at some point someone might create instances of them programmatically, and then the constraint is definitely needed.

llvm-svn: 213729
2014-07-23 08:12:51 +00:00
Chandler Carruth
908e62868c [SDAG] Make the DAGCombine worklist not grow endlessly due to duplicate
insertions.

The old behavior could cause arbitrarily bad memory usage in the DAG
combiner if there was heavy traffic of adding nodes already on the
worklist to it. This commit switches the DAG combine worklist to work
the same way as the instcombine worklist where we null-out removed
entries and only add new entries to the worklist. My measurements of
codegen time shows slight improvement. The memory utilization is
unsurprisingly dominated by other factors (the IR and DAG itself
I suspect).
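
A minimal sketch of the null-out worklist shape described above (hypothetical names and types, not the actual DAGCombiner members):

#include <cstddef>
#include <unordered_map>
#include <vector>

class Node;

class CombineWorklist {
  std::vector<Node *> List;                      // insertion-ordered entries
  std::unordered_map<Node *, size_t> Positions;  // node -> index in List

public:
  void add(Node *N) {
    // Only genuinely new entries are appended; duplicates are ignored, so the
    // list cannot grow endlessly from re-adding nodes already on it.
    if (Positions.insert({N, List.size()}).second)
      List.push_back(N);
  }
  void remove(Node *N) {
    auto It = Positions.find(N);
    if (It == Positions.end())
      return;
    List[It->second] = nullptr;  // null-out in place instead of erasing
    Positions.erase(It);
  }
  Node *pop() {
    while (!List.empty()) {
      Node *N = List.back();
      List.pop_back();
      if (N) {                   // skip entries that were nulled-out
        Positions.erase(N);
        return N;
      }
    }
    return nullptr;
  }
};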

This change results in subtle, frustrating churn in the particular order
in which DAG combines are applied, which causes a number of minor
regressions where we fail to match a pattern previously matched by
accident. AFAICT, all of these should be using AddToWorklist directly or
should be written in a less brittle way. None of the changes seem
drastically bad, and a few of the changes seem distinctly better.

A major change required to make this work is to significantly harden the
way in which the DAG combiner handles nodes which become dead
(zero-uses). Previously, we relied on the ability to "priority-bump"
them on the combine worklist to achieve recursive deletion of these
nodes and ensure that the frontier of remaining live nodes all were
added to the worklist. Instead, I've introduced a routine to just
implement that precise logic with no indirection. It is a significantly
simpler operation than that of the combiner worklist proper. I suspect
this will also fix some other problems with the combiner.

I think the x86 changes are really minor and uninteresting, but the
avx512 change at least is hiding a "regression" (despite the test case
being just noise, not testing some performance invariant) that might be worth
looking into. Not sure if any of the others impact specific "important"
code paths, but they didn't look terribly interesting to me, or the
changes were really minor. The consensus in review is to fix any
regressions that show up after the fact here.

Thanks to the other reviewers for checking the output on other
architectures. There is a specific regression on ARM that Tim already
has a fix prepped to commit.

Differential Revision: http://reviews.llvm.org/D4616

llvm-svn: 213727
2014-07-23 07:08:53 +00:00
Nick Lewycky
08dae2c274 We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false.
We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
llvm-svn: 213726
2014-07-23 06:24:49 +00:00
NAKAMURA Takumi
413d897e7d Rework to let RuntimeDyld/X86/MachO_x86-64_PIC_relocations.s pass on win32.
FIXME: "llvm-rtdyld -verify -check" is still sensitive to path separator.
Fix searching StubMap to be tolerant of both '/' and '\\' on Win32.

llvm-svn: 213723
2014-07-23 04:32:21 +00:00
NAKAMURA Takumi
4555cd0e60 Suppress a test on win32 for now, llvm/test/ExecutionEngine/RuntimeDyld/X86/MachO_x86-64_PIC_relocations.s.
FIXME: Fix searching StubMap with '/' and '\\' on Win32.
llvm-svn: 213721
2014-07-23 04:05:58 +00:00
NAKAMURA Takumi
a6795992fa RuntimeDyld/X86/MachO_x86-64_PIC_relocations.s: Use %/T here, or sed(1) would be confused by the DOS path.
llvm-svn: 213720
2014-07-23 04:05:46 +00:00
NAKAMURA Takumi
8e44b58021 Trailing whitespace.
llvm-svn: 213711
2014-07-23 00:42:52 +00:00
NAKAMURA Takumi
d63be9bfd5 RuntimeDyldMachOAArch64.h: Fix a warning. [-Wunused-variable]
llvm-svn: 213710
2014-07-23 00:17:44 +00:00
Lang Hames
0a8073389f [MCJIT] Make stub_addr functionality in RuntimeDyldChecker work in release mode.
There's no reason to restrict this particular piece of RuntimeDyldChecker
functionality to +Asserts builds.

This should fix failures in MachO_x86-64_PIC_relocations.s on release bots.

llvm-svn: 213708
2014-07-22 23:50:51 +00:00
Lang Hames
8cf25fbd15 [MCJIT] Teach RuntimeDyldChecker to handle underscores at the start of symbols.
RuntimeDyldChecker had been testing isalpha(Expr[0]) to recognise symbol tokens,
and throwing unrecognized token errors when it hit symbols with leading
underscores. This fixes that.
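
A minimal sketch of the fix (a hypothetical helper, not the actual checker code): accept a leading underscore in addition to an alphabetic character when recognising a symbol token.

#include <cctype>

static bool isSymbolStart(char C) {
  return std::isalpha(static_cast<unsigned char>(C)) || C == '_';
}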

llvm-svn: 213706
2014-07-22 23:17:21 +00:00
Juergen Ributzka
00d4bb61a8 XFAIL the test on MIPS
Not sure how to debug this one without a MIPS machine. Any takers?

llvm-svn: 213705
2014-07-22 23:15:01 +00:00
Juergen Ributzka
c2d9ee45f3 [FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks.
This commit modifies the existing call lowering functions to be used as the
FastLowerCall and FastLowerIntrinsicCall target-hooks instead.

This enables patchpoint intrinsic lowering for AArch64.

This fixes <rdar://problem/17733076>

llvm-svn: 213704
2014-07-22 23:14:58 +00:00
Juergen Ributzka
49aab6445a [AArch64] Use CHECK-LABEL in ARM64 ABI unit tests.
llvm-svn: 213703
2014-07-22 23:14:54 +00:00
Lang Hames
b38fc8cc0c [MCJIT] Improve stub_addr file-not-found diagnostic to help track down a
buildbot failure.

llvm-svn: 213701
2014-07-22 23:07:52 +00:00
Lang Hames
59246bdf0c [MCJIT] Refactor and add stub inspection to the RuntimeDyldChecker framework.
This patch introduces a 'stub_addr' builtin that can be used to find the address
of the stub for a given (<file>, <section>, <symbol>) tuple. This address can be
used both to verify the contents of stubs (by loading from the returned address)
and to verify references to stubs (by comparing against the returned address).

Example (1) - Verifying stub contents:

Load 8 bytes (assuming a 64-bit target) from the stub for 'x' in the __text
section of f.o, and compare that value against the address of 'x'.

# rtdyld-check: *{8}(stub_addr(f.o, __text, x)) = x

Example (2) - Verifying references to stubs:

Decode the immediate of the instruction at label 'l', and verify that it's
equal to the offset from the next instruction's PC to the stub for 'y' in the
__text section of f.o (i.e. it's the correct PC-rel difference).

# rtdyld-check: decode_operand(l, 4) = stub_addr(f.o, __text, y) - next_pc(l)
l:
        movq    y@GOTPCREL(%rip), %rax

Since stub inspection requires cooperation with RuntimeDyldImpl, this patch
pimpl-ifies RuntimeDyldChecker. Its implementation is moved into a new class,
RuntimeDyldCheckerImpl, that has access to the definition of RuntimeDyldImpl.

llvm-svn: 213698
2014-07-22 22:47:39 +00:00
Juergen Ributzka
09c7bc6096 Appease the buildbots.
llvm-svn: 213694
2014-07-22 22:02:19 +00:00
Juergen Ributzka
b056ac8df3 [RuntimeDyld][MachO][AArch64] Add a helper function for encoding addends in instructions.
Factor out the addend encoding into a helper function and simplify
processRelocationRef.

Also add a few simple rtdyld tests. More tests to come once GOTs can be tested too.

Related to <rdar://problem/17768539>

llvm-svn: 213689
2014-07-22 21:42:55 +00:00