1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

108920 Commits

Author SHA1 Message Date
Rafael Espindola
8c36da38ee Move getNonexecutableStackSection up to the base ELF class.
The .note.GNU-stack section is not SystemZ/X86 specific.

llvm-svn: 219796
2014-10-15 15:44:16 +00:00
Matt Arsenault
cb725dcde9 R600: Use existing variable
llvm-svn: 219778
2014-10-15 05:07:00 +00:00
Matt Arsenault
04b9e0240c R600: Remove outdated comment
llvm-svn: 219777
2014-10-15 05:06:57 +00:00
Juergen Ributzka
a0f67257e6 Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

llvm-svn: 219776
2014-10-15 04:55:48 +00:00
Jingyue Wu
3ee10fc280 [MachineSink] Use the real post dominator tree
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in
isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo.
The old heuristics caused instructions to sink unnecessarily, and might create
register pressure.

This is the second try of the fix. The first one (D4814) caused a performance
regression due to failing to sink instructions out of loops (PR21115). This
patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower
one regardless of whether the target block post-dominates the source.

Thanks Alexey Volkov for reporting PR21115! 

Test Plan:
Added a NVPTX codegen test to verify that our change prevents the backend from
over-sinking. It also shows the unnecessary register pressure caused by
over-sinking.

Added an X86 test to verify we can sink instructions out of loops regardless of
the dominance relationship. This test is reduced from Alexey's test in PR21115.

Updated an affected test in X86.

Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime
performance. Results are attached separately in the review thread.

Reviewers: Jiangning, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5633

llvm-svn: 219773
2014-10-15 03:27:43 +00:00
Tim Northover
da6d757afb ARM: drop check for triple that's no longer used.
Early attempts to support AAPCS bare metal MachO targets based the decision on
the CPU being compiled for. This was not a particularly great idea and we've
got a better option now, but this check remained.

No functional change for any target we care about.

llvm-svn: 219767
2014-10-15 01:05:01 +00:00
Eric Christopher
6ae3b9e3df Remove unused variable.
llvm-svn: 219750
2014-10-15 00:09:07 +00:00
Eric Christopher
58efc65f68 No need to cache this unused variable.
Patch by Ehsan Akhgari.

llvm-svn: 219749
2014-10-14 23:58:51 +00:00
Gerolf Hoflehner
93b11ca136 [AArch64] Wrong CC access in CSINC-conditional branch sequence
This is a follow up to commit r219742. It removes the CCInMI variable
and accesses the CC in CSCINC directly. In the case of a conditional
branch accessing the CC with CCInMI was wrong.

llvm-svn: 219748
2014-10-14 23:55:00 +00:00
Nick Kledzik
88bf7954fb [llvm-objdump] Update error message and add test case for mach-o file with bad library ordinals
llvm-svn: 219746
2014-10-14 23:29:38 +00:00
Gerolf Hoflehner
fbd25ba142 [AAarch64] Optimize CSINC-branch sequence
Peephole optimization that generates a single conditional branch
for csinc-branch sequences like in the examples below. This is
possible when the csinc sets or clears a register based on a condition
code and the branch checks that register. Also the condition
code may not be modified between the csinc and the original branch.

Examples:

1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44
   to b.<invCC>

2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44
   to b.<CC>


rdar://problem/18506500

llvm-svn: 219742
2014-10-14 23:07:53 +00:00
Hal Finkel
3a23f40590 [LoopVectorize] Ignore @llvm.assume for cost estimates and legality
A few minor changes to prevent @llvm.assume from interfering with loop
vectorization. First, treat @llvm.assume like the lifetime intrinsics, which
are scalarized (but don't otherwise interfere with the legality checking).
Second, ignore the cost of ephemeral instructions in the loop (these will go
away anyway during CodeGen).

Alignment assumptions and other uses of @llvm.assume can often end up inside of
loops that should be vectorized (this is not uncommon for assumptions generated
by __attribute__((align_value(n))), for example).

llvm-svn: 219741
2014-10-14 22:59:49 +00:00
David Majnemer
c1e1f7acd9 MC, COFF: Make bigobj test compatible with python3
No functionality change intended.

llvm-svn: 219739
2014-10-14 22:35:11 +00:00
Simon Pilgrim
6a516dfd91 [X86][SSE] pslldq/psrldq shuffle mask decodes
Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions.

Differential Revision: http://reviews.llvm.org/D5598

llvm-svn: 219738
2014-10-14 22:31:34 +00:00
David Majnemer
dcd5a0a888 MC: Rewrite bigobj test in python
This makes the test easier to work with.  No functionality change
intended.

llvm-svn: 219737
2014-10-14 22:26:49 +00:00
Tim Northover
a7b041da1f ARM: remove ARM/Thumb distinction for preferred alignment.
Thumb1 has legitimate reasons for preferring 32-bit alignment of types
i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be
a multiple of 4. However, this is a trade-off betweem code size and RAM usage;
the DataLayout string is not the best place to represent it even if desired.

So this patch removes the extra Thumb requirements, hopefully making ARM and
Thumb completely compatible in this respect.

llvm-svn: 219734
2014-10-14 22:12:17 +00:00
Tim Northover
c5e92d5ca6 ARM: allow misaligned local variables in Thumb1 mode.
There's no hard requirement on LLVM to align local variable to 32-bits, so the
Thumb1 frame handling needs to be able to deal with variables that are only
naturally aligned without falling over.

llvm-svn: 219733
2014-10-14 22:12:14 +00:00
David Majnemer
2a4ee60b8b Add a test for writing COFF BigObj
llvm-svn: 219729
2014-10-14 21:47:53 +00:00
Juergen Ributzka
d3bf2383f1 [FastISel][AArch64] Add custom lowering for GEPs.
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficientily.

llvm-svn: 219726
2014-10-14 21:41:23 +00:00
Hans Wennborg
71be459fd4 [x86 asm] allow fwait alias in both At&t and Intel modes (PR21208)
Differential Revision: http://reviews.llvm.org/D5741

llvm-svn: 219725
2014-10-14 21:41:17 +00:00
Tim Northover
519637b1b7 ARM: set preferred aggregate alignment to 32 universally.
Before, ARM and Thumb mode code had different preferred alignments, which could
lead to some rather unexpected results. There's justification for reducing it
from the default 64-bits (wasted space), but I don't think there is for going
below 32-bits.

There's no actual ABI change here, just to reassure people.

llvm-svn: 219719
2014-10-14 20:57:26 +00:00
Hal Finkel
aebf419a5c [CFL-AA] CFL-AA should not assert on an va_arg instruction
The CFL-AA implementation was missing a visit* routine for va_arg instructions,
causing it to assert when run on a function that had one. For now, handle these
in a conservative way.

Fixes PR20954.

llvm-svn: 219718
2014-10-14 20:51:26 +00:00
Sanjay Patel
c1c7410bc8 Optimize away fabs() calls when input is squared (known positive).
Eliminate library calls and intrinsic calls to fabs when the input 
is a squared value.

Note that no unsafe-math / fast-math assumptions are needed for
this optimization.

Differential Revision: http://reviews.llvm.org/D5777

llvm-svn: 219717
2014-10-14 20:43:11 +00:00
Juergen Ributzka
3aad57b5cd [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.
Sign-/zero-extend folding depended on the load and the integer extend to be
both selected by FastISel. This cannot always be garantueed and SelectionDAG
might interfer. This commit adds additonal checks to load and integer extend
lowering to catch this.

Related to rdar://problem/18495928.

llvm-svn: 219716
2014-10-14 20:36:02 +00:00
David Majnemer
cd6d05c0a5 InstCombine: Don't miscompile X % ((Pow2 << A) >>u B)
We assumed that A must be greater than B because the right hand side of
a remainder operator must be nonzero.

However, it is possible for A to be less than B if Pow2 is a power of
two greater than 1.

Take for example:
i32 %A = 0
i32 %B = 31
i32 Pow2 = 2147483648

((Pow2 << 0) >>u 31) is non-zero but A is less than B.

This fixes PR21274.

llvm-svn: 219713
2014-10-14 20:28:40 +00:00
Jan Vesely
a3b17a8d8b Reapply "R600: Add new intrinsic to read work dimensions"
This effectively reverts revert 219707. After fixing the test to work with
new function name format and renamed intrinsic.

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219710
2014-10-14 20:05:26 +00:00
Hal Finkel
717a51f66b Revert "r216914 - Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'"
Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted
because of buildbot failures, and credit goes to Ulrich Weigand for isolating
the underlying issue (which can be confirmed by Valgrind, which does helpfully
light up like the fourth of July). Uli explained the problem with the original
patch as:

  It seems the problem is calling multiplySignificand with an addend of category
  fcZero; that is not expected by this routine.  Note that for fcZero, the
  significand parts are simply uninitialized, but the code in (or rather, called
  from) multiplySignificand will unconditionally access them -- in effect using
  uninitialized contents.

This version avoids using a category == fcZero addend within
multiplySignificand, which avoids this problem (the Valgrind output is also now
clean).

Original commit message:

[APFloat] Fixed a bug in method 'fusedMultiplyAdd'.

When folding a fused multiply-add builtin call, make sure that we propagate the
correct result in the case where the addend is zero, and the two other operands
are finite non-zero.

Example:
  define double @test() {
    %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0)
    ret double %1
  }

Before this patch, the instruction simplifier wrongly folded the builtin call
in function @test to constant 'double 7.0'.
With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and
propagates the expected result (i.e. 56.0).

Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra
test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence
of NaN/Inf operands.

This fixes PR20832.

llvm-svn: 219708
2014-10-14 19:23:07 +00:00
Rafael Espindola
48d2b0a972 Revert "R600: Add new intrinsic to read work dimensions"
This reverts commit r219705.

CodeGen/R600/work-item-intrinsics.ll was failing on linux.

llvm-svn: 219707
2014-10-14 18:58:04 +00:00
Rafael Espindola
c98553226f Remove unused member variable.
Fixes pr20904.

llvm-svn: 219706
2014-10-14 18:53:16 +00:00
Jan Vesely
c97f76d270 R600: Add new intrinsic to read work dimensions
v2: Add SI lowering
    Add test

v3: Place work dimensions after the kernel arguments.
v4: Calculate offset while lowering arguments
v5: rebase
v6: change prefix to AMDGPU

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219705
2014-10-14 18:52:07 +00:00
Jan Vesely
a1d9fe15f6 R600: FMA is VecALU only instruction
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219704
2014-10-14 18:52:04 +00:00
Reed Kotler
5d3770d1fc Finish getting Mips fast-isel to match up with AArch64 fast-isel
Summary:
In order to facilitate use of common code, checking by reviewers of other fast-isel ports, and hopefully to eventually move most of Mips and other fast-isel ports into target independent code, I've tried to get the two implementations to line up.

There is no functional code change. Just methods moved in the file to be in the same order as in AArch64.

Test Plan: No functional change.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, aemerson, rfuhler

Differential Revision: http://reviews.llvm.org/D5692

llvm-svn: 219703
2014-10-14 18:27:58 +00:00
David Blaikie
2044b61ced DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.
Let me tell you a tale...

Originally committed in r211723 after discovering a nasty case of weird
scoping due to inlining, this was reverted in r211724 after it fired in
ASan/compiler-rt.

(minor diversion where I accidentally committed/reverted again in
r211871/r211873)

After further testing and fixing bugs in ArgumentPromotion (r211872) and
Inlining (r212065) it was recommitted in r212085. Reverted in r212089
after the sanitizer buildbots still showed problems.

Fixed another bug in ArgumentPromotion (r212128) found by this
assertion.

Recommitted in r212205, reverted in r212226 after it crashed some more
on sanitizer buildbots.

Fix clang some more in r212761.

Recommitted in r212776, reverted in r212793. ASan failures.
Recommitted in r213391, reverted in r213432, trying to reproduce flakey
ASan build failure.

Fixed bugs in r213805 (ArgPromo + DebugInfo), r213952
(LiveDebugVariables strips dbg_value intrinsics in functions not
described by debug info).

Recommitted in r214761, reverted in r214999, flakey failure on Windows
buildbot.

Fixed DeadArgElimination + DebugInfo bug in r219210.

Recommitted in r219215, reverted in r219512, failure on ObjC++ atomic
properties in the test-suite on Darwin.

Fixed ObjC++ atomic properties issue in Clang in r219690.

[This commit is provided 'as is' with no hope that this is the last time
I commit this change either expressed or implied]

llvm-svn: 219702
2014-10-14 18:22:52 +00:00
Rafael Espindola
a9ca7347ee Remove method that is identical to the base class one.
llvm-svn: 219700
2014-10-14 17:38:38 +00:00
Matt Arsenault
5c68d89c00 R600/SI: Use DS offsets for constant addresses
Use 0 as the base address for a constant address, so if
we have a constant address we can save moves and form
read2/write2s.

llvm-svn: 219698
2014-10-14 17:21:19 +00:00
David Blaikie
c221665891 Revert "Fix stuff... again."
Accidental commit.

This reverts commit r219693.

llvm-svn: 219695
2014-10-14 17:13:09 +00:00
David Blaikie
7ece1f98c4 Revert some parts of r196288 that were confusing and untested.
If we figure out why they should be here, let's add some testing of some
kind so we can better demonstrate why it's needed.

llvm-svn: 219694
2014-10-14 17:12:02 +00:00
David Blaikie
47d8204a49 Fix stuff... again.
llvm-svn: 219693
2014-10-14 17:11:59 +00:00
Hal Finkel
9e140a3f0a [LVI] Check for @llvm.assume dominating the edge branch
When LazyValueInfo uses @llvm.assume intrinsics to provide edge-value
constraints, we should check for intrinsics that dominate the edge's branch,
not just any potential context instructions. An assumption that dominates the
edge's branch represents a truth on that edge. This is specifically useful, for
example, if multiple predecessors assume a pointer to be nonnull, allowing us
to simplify a later null comparison.

The test case, and an initial patch, were provided by Philip Reames. Thanks!

llvm-svn: 219688
2014-10-14 16:04:49 +00:00
NAKAMURA Takumi
6727c3f0f4 Revert r219638, (r219640 and r219676), "Removing the static destructor from ManagedStatic.cpp by controlling the allocation and de-allocation of the mutex."
It caused hang-up on msc17 builder, probably deadlock.

llvm-svn: 219687
2014-10-14 15:58:16 +00:00
Robert Khasanov
1896253278 [AVX512] Extended avx512_binop_rm to DQ/VL subsets.
Added encoding tests.

llvm-svn: 219686
2014-10-14 15:13:56 +00:00
Robert Khasanov
518a523445 [AVX512] Extended avx512_binop_rm to BW/VL subsets.
Added encoding tests.

llvm-svn: 219685
2014-10-14 14:36:19 +00:00
Bradley Smith
bb04f108aa [AArch64] Fix crash with empty/pseudo-only blocks in A53 erratum (835769) workaround
llvm-svn: 219684
2014-10-14 14:02:41 +00:00
Alexander Potapenko
ad195b7bc4 [llvm-symbolizer] Minor typedef cleanup. NFC.
llvm-svn: 219682
2014-10-14 13:40:44 +00:00
NAKAMURA Takumi
547f5d3262 Threading.h: Use \tparam for template parameters. [-Wdocumentation]
llvm-svn: 219676
2014-10-14 09:34:16 +00:00
Eric Christopher
de9f191394 Grab the subtarget info off of the MachineFunction rather than
indirecting through the TargetMachine.

llvm-svn: 219674
2014-10-14 08:44:19 +00:00
Eric Christopher
c65e60f7c8 Use the triple to figure out if this is a darwin target, not
the subtarget.

llvm-svn: 219673
2014-10-14 08:25:26 +00:00
Eric Christopher
45581b7332 Remove unnecessary TargetMachine.h includes.
llvm-svn: 219672
2014-10-14 07:22:08 +00:00
Eric Christopher
e4d58538b3 Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

llvm-svn: 219671
2014-10-14 07:22:00 +00:00
Eric Christopher
db23ede8d2 Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

llvm-svn: 219670
2014-10-14 07:17:23 +00:00