1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

105589 Commits

Author SHA1 Message Date
Daniel Sanders
e88052c0c5 [mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple
Summary:
Generally speaking, mips-* vs mips64-* should not be used to make decisions
about the content or format of the ELF. This should be based on the ABI
and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64`
should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`.
Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as
should `mips-linux-gnu-clang -mips64r2 -mabi=n32`.

This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now
since there is no apparent way to base this decision on the ABI and CPU.

Differential Revision: http://reviews.llvm.org/D4539

llvm-svn: 213244
2014-07-17 10:02:08 +00:00
Daniel Sanders
7121fa671b [mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6
Summary:
The cpr1_size field describes the minimum register width to run the program
rather than the size of the registers on the target. MIPS32r6 was acting
as if -mfp64 has been given because it starts off with 64-bit FPU registers.

Differential Revision: http://reviews.llvm.org/D4538

llvm-svn: 213243
2014-07-17 09:57:23 +00:00
Daniel Sanders
eea41061d0 [mips] Fix ELF e_flags related to -mabicalls and -mplt.
Summary:
These options are not implemented yet but we act as if they are always
given.

The integrated assembler is driven by the clang driver so the e_flag test
cases should match the e_flags emitted by GCC+GAS rather than GAS
by itself.

Differential Revision: http://reviews.llvm.org/D4536

llvm-svn: 213242
2014-07-17 09:52:56 +00:00
Yi Kong
d5d6470070 Fix the prefix for arm64 triple
Triple.cpp still returns "arm64" as prefix for arm64 triple, causing Clang not
being able to select the correct GCCBuiltin IR.

This patch changes the value to correct prefix "aarch64". Regression test will
be added in the coming patch.

Differential Revision: http://reviews.llvm.org/D4516

llvm-svn: 213240
2014-07-17 09:43:27 +00:00
Evgeniy Stepanov
5b945c9d7e [msan] Avoid redundant origin stores.
Origin is meaningless for fully initialized values. Avoid
storing origin for function arguments that are known to
be always initialized (i.e. shadow is a compile-time null
constant).

This is not about correctness, but purely an optimization.
Seems to affect compilation time of blacklisted functions
significantly.

llvm-svn: 213239
2014-07-17 09:10:37 +00:00
Suyog Sarda
e62b39fcd0 Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.

Differential Revision: http://reviews.llvm.org/D4102
 

llvm-svn: 213231
2014-07-17 06:28:15 +00:00
Matt Arsenault
45d9529fe9 Use range for
llvm-svn: 213230
2014-07-17 06:19:06 +00:00
Matt Arsenault
54393bb30e R600: Short circuit alloca check if address space isn't private.
Skip calling GetUnderlyingObject in cases where it obviously
isn't from an alloca. This should only be a compile time improvement.

llvm-svn: 213229
2014-07-17 06:13:41 +00:00
Suyog Sarda
fe6fdd5295 Fix Typo (first commit to test commit access)
llvm-svn: 213228
2014-07-17 06:09:34 +00:00
Eric Fiselier
95d51f6cee [lit] Add --show-unsupported flag to LIT
llvm-svn: 213227
2014-07-17 05:53:00 +00:00
Saleem Abdulrasool
b2cdb8f33d MC: make WinEH opcode an opaque value
This makes the opcode an opaque value (unsigned int) rather than the
enumeration.  This permits the use of target specific operands.

Split out the generic type into a MCWinEH header and add a supporting
MCWin64EH::Instruction to abstract out the selection of the opcode and
construction of the actual instruction.

llvm-svn: 213221
2014-07-17 03:08:50 +00:00
Hal Finkel
9779454568 Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.

Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.

Original commit message:

BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.

Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.

This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.

Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).

llvm-svn: 213219
2014-07-17 01:28:25 +00:00
Jingyue Wu
7c4bea3e99 Partially revert r210444 due to performance regression
Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting

... = array[zext(a)]
... = array[zext(a) + 1]

to

... = array[sext(a)]
... = array[zext(a) + 1],

the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.

Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.

Test Plan: make check-all

Reviewers: eliben, meheff

Reviewed By: meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D4542

llvm-svn: 213209
2014-07-16 23:25:00 +00:00
Sanjay Patel
5537765676 Fixed formatting, removed bug reference, renamed testcase
Thanks to Duncan Exon Smith for reviewing and cleanup suggestions. 

llvm-svn: 213205
2014-07-16 22:40:28 +00:00
Juergen Ributzka
63d2af65d4 [FastISel] Local values shouldn't be alive across an inline asm call with side effects.
This fixes an issue where a local value is defined before and used after an
inline asm call with side effects.

This fix simply flushes the local value map, which updates the insertion point
for the inline asm call to be above any previously defined local values.

This fixes <rdar://problem/17694203>

llvm-svn: 213203
2014-07-16 22:20:51 +00:00
Lang Hames
ddc836290a [MCJIT] Improve a RuntimeDyldChecker diagnostic.
When a RuntimeDyldChecker test requests an invalid operand for an instruction,
print the decoded instruction to aid diagnosis.

llvm-svn: 213202
2014-07-16 22:02:20 +00:00
Hal Finkel
d8b9bfa0c2 Fix a typo in the inalloca description
llvm-svn: 213200
2014-07-16 21:22:46 +00:00
Sanjay Patel
8befe236c4 trivial fix for PR20314
Make sure that the AddrInst is an Instruction.

llvm-svn: 213197
2014-07-16 21:08:10 +00:00
Sanjay Patel
b3fb7dc171 Remove Atom references in description.
Any CPU can run this pass.

llvm-svn: 213190
2014-07-16 20:18:49 +00:00
Manuel Jacob
e41c2e7cde Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.
llvm-svn: 213189
2014-07-16 20:13:45 +00:00
Chris Bieneman
8124ab9a24 [RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.
llvm-svn: 213188
2014-07-16 20:13:31 +00:00
Justin Holewinski
35f9408e7f [NVPTX] Honor alignment on vector loads/stores
We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.

Now, for IR like:

  %1 = load <4 x float>, <4 x float>* %ptr, align 4

we will generate correct, conservative PTX like:

  ld.f32 ... [%ptr]
  ld.f32 ... [%ptr+4]
  ld.f32 ... [%ptr+8]
  ld.f32 ... [%ptr+12]

Or if we have an alignment of 8 (for example), we can
generate code like:

  ld.v2.f32 ... [%ptr]
  ld.v2.f32 ... [%ptr+8]

llvm-svn: 213186
2014-07-16 19:45:35 +00:00
Alexey Samsonov
d7eaaec8e8 CHECK-LABEL-ize one test
llvm-svn: 213177
2014-07-16 18:11:31 +00:00
Kevin Enderby
d2a97a67de Add the "-x" flag to llvm-nm for Mach-O files that prints the fields of a symbol in hex.
(generally use for debugging the tools).  This is same functionality as darwin’s
nm(1) "-x" flag.

llvm-svn: 213176
2014-07-16 17:38:26 +00:00
David Blaikie
3569fd24ba Remove unnecessary/redundant std::move
(run returns unique_ptr by value already)

llvm-svn: 213174
2014-07-16 17:09:21 +00:00
Alp Toker
78faffcebb Track clang r213171
The clang rewriter is now a core facility.

llvm-svn: 213173
2014-07-16 16:50:34 +00:00
Chris Bieneman
aa19c0dccd Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations.
No functional code changes.

llvm-svn: 213169
2014-07-16 16:27:31 +00:00
Justin Holewinski
84f0bca9c1 [NVPTX] Rename registers %fl -> %fd and %rl -> %rd
This matches the internal behavior of NVIDIA tools like libnvvm.

llvm-svn: 213168
2014-07-16 16:26:58 +00:00
Tim Northover
c6c02a43ba CodeGen: don't form illegail EXTLOAD operations.
It turns out that in most cases (the main exception being i1-related
types) once these operations are formed we cannot separate them and
the targets end up having to deal with them whether they want to or
not.

This is not a good situation, and a more reasonable default can be
formed by ackowledging this and having targets leave them as Legal.
Only x86 seems to be affected (other targets don't even try marking
the operation Expand).

Mostly there's no visible change here yet, but it will be useful to
have truly expanded EXTLOADS for MVT::f16 softening support.

llvm-svn: 213162
2014-07-16 15:37:24 +00:00
Tim Northover
8d2acb0c42 Convert test to CHECK-LABEL
llvm-svn: 213161
2014-07-16 15:37:08 +00:00
Daniel Sanders
512a21080d [mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI.
Summary:
A few instructions (mostly cvt.d.w and similar) are causing problems with
-mfp64 and -mno-odd-spreg and it looks like fixing it properly may
take several weeks. In the meantime, let's disable the odd-numbered
double-precision registers so that the generated code is at least valid.

The problem is that instructions like cvt.d.w read from the 32-bit low
subregister of a double-precision FPU register. This often leads to the compiler
to inserting moves to transfer a GPR32 to a FGR32 using mtc1. Such moves
violate the rules against 32-bit writes to odd-numbered FPU registers imposed
by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it
becomes impossible for the 32-bit low subregister to be odd-numbered.

This fixes numerous test-suite failures when compiling for the FP64A ABI
('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to
test that odd-numbered FPU registers are not allocatable. Instead, we depend on
the assembler (GAS and -fintegrated-as) raising errors when the rules are
violated.

Differential Revision: http://reviews.llvm.org/D4532

llvm-svn: 213160
2014-07-16 15:34:07 +00:00
Andrea Di Biagio
a161db3638 [X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.
Before this change, method 'isShuffleMaskLegal' didn't know that shuffles
implementing a 'movhlps' operation were perfectly legal for SSE targets.

This patch adds the missing check for 'isMOVHLPSMask' inside method
'isShuffleMaskLegal' to fix the problem.

The reason why it is important to do this is because the DAGCombiner
conservatively avoids combining a pair of shuffles if the resulting shuffle
node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were
wrongly considered not to be legal. This was the root cause of some poor-code
generation bugs.

llvm-svn: 213137
2014-07-16 11:29:39 +00:00
Justin Bogner
0043d606da unittests: Actually test reverse iterators in Path tests
This re-enables some #if 0'd code (since 2010) in the Path unittests
and makes at least a weak effort at testing sys::path's rbegin/rend.

This change was inspired by some test failures near uses of rbegin and
rend here:

    http://lab.llvm.org:8011/builders/clang-x86_64-linux-vg/builds/3209

The "valgrind was whining" comment looked promising in terms of a
simpler to debug case of the same errors. However, it appears that the
valgrind complaints the comment was referring to are distinct from the
ones in the frontend, since this updated test isn't complaining for me
under valgrind.

In any case, the disabled tests weren't helping anybody.

llvm-svn: 213125
2014-07-16 08:18:58 +00:00
Reid Kleckner
d5cc38a11b Roundtrip the inalloca bit on allocas through bitcode
This was an oversight in the original support.  As it is, I stuffed this
bit into the alignment.  The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.

Reviewers: nicholas

Differential Revision: http://reviews.llvm.org/D3943

llvm-svn: 213118
2014-07-16 01:34:27 +00:00
Manuel Jacob
a903d1b422 Fix comment in InstCombiner::visitAddrSpaceCast.
In the original version of the patch the behaviour was like described in
the comment.  This behaviour was changed before committing it without
updating the comment.

llvm-svn: 213117
2014-07-16 01:34:21 +00:00
Hans Wennborg
46aad38a9b Perform wildcard expansion in Process::GetArgumentVector on Windows (PR17098)
On Windows, wildcard expansion isn't performed by the shell, but left to the
program itself. The common way to do this is to link with setargv.obj, which
performs the expansion on argc/argv before main is entered. However, we don't
use argv in Clang on Windows, but instead call GetCommandLineW so we can handle
unicode arguments. This means we have to do wildcard expansion ourselves.

A test case will be added on the Clang side.

Differential Revision: http://reviews.llvm.org/D4529

llvm-svn: 213114
2014-07-16 00:52:11 +00:00
Tyler Nowicki
16db81fdfa Emit warnings if vectorization is forced and fails.
This patch modifies the existing DiagnosticInfo system to create a generic base
class that is inherited to produce diagnostic-based warnings. This is used by
the loop vectorizer to trigger a warning when vectorization is forced and
fails. Several tests have been added to verify this behavior.

Reviewed by: Arnold Schwaighofer

llvm-svn: 213110
2014-07-16 00:36:00 +00:00
Juergen Ributzka
51acdaca09 Remove TLI from isInTailCallPosition's arguments. NFC.
There is no need to pass on TLI separately to the function. As Eric pointed out
the Target Machine already provides everything we need.

llvm-svn: 213108
2014-07-16 00:01:22 +00:00
Matt Arsenault
15eb0d54b0 R600/SI: Allow using f32 rcp / rsq when denormals not handled.
These are precise enough to use for OpenCL unless denormals
are handled.

llvm-svn: 213107
2014-07-15 23:50:10 +00:00
David Majnemer
53236d3873 X86: Simplify X86WindowsTargetObjectFile::getSectionForConstant
There exists a helper function to abstract away the various differences
between ConstantVector, ConstantDataVector, ConstantAggregateZero, etc.

Use it to simplify X86WindowsTargetObjectFile::getSectionForConstant.

llvm-svn: 213104
2014-07-15 23:01:10 +00:00
Sanjay Patel
2f0f025b2b Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended

    Removed PostRAScheduler bits from subtargets (X86, ARM).
    Added PostRAScheduler bit to MCSchedModel class.
    This bit is set by a CPU's scheduling model (if it exists).
    Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
    Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
    Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
    Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
    Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: 
       a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. 
       b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. 
       c. PPC overrides the CPU's postRA settings by enabling postRA for everything. 
       d. X86 is the only target that actually has postRA specified via sched model info.

Differential Revision: http://reviews.llvm.org/D4217

llvm-svn: 213101
2014-07-15 22:39:58 +00:00
Peter Collingbourne
81db7497a2 [dfsan] Introduce further optimization to reduce the number of union queries.
Specifically, do not compute a union if it is statically known that one
shadow set subsumes the other.

llvm-svn: 213100
2014-07-15 22:13:19 +00:00
Alp Toker
5aa10577c4 CMake: avoid a reconfigure loop from r213091
Removing the native CMakeCache.txt causes the target to get re-run needlessly
on some systems. We'll want another solution for that part of the fix.

llvm-svn: 213099
2014-07-15 22:11:54 +00:00
Matt Arsenault
c093eee935 R600/SI: Fix select on i1
llvm-svn: 213096
2014-07-15 21:44:37 +00:00
David Blaikie
1a77466a94 Try out FileCheck's new (in r212810) -implicit-check-not in a DebugInfo test.
Just tried this on a few tests and this was the only one that was
easily ported to use the new feature, so we'll go with that for now.
Hopefully can act as inspiration/reminder for other tests.

Not all debug info tests need to check for every DW_TAG or NULL child
terminator, but perhaps they should (just to ensure they don't accidentally
end up with tags nested inside other tags without the test failing, for example)

llvm-svn: 213092
2014-07-15 21:06:37 +00:00
Alp Toker
7013c592c2 CMake: fix cross-compilation with external source directories
This adds support for building native artifacts when cross-compiling using the
popular side-by-side source directory layout (no symlinks, no nested
repositories).

llvm-svn: 213091
2014-07-15 21:04:12 +00:00
Duncan P. N. Exon Smith
920db48687 ADT: Add MapVector::remove_if
Add a `MapVector::remove_if()` that erases items in bulk in linear time,
as opposed to quadratic time for repeated calls to `MapVector::erase()`.

llvm-svn: 213090
2014-07-15 20:24:56 +00:00
Matt Arsenault
1ceb5e82c1 R600/SI: Implement less wrong f32 fdiv
Assuming single precision denormals and accurate sqrt/div are not
reported, this passes the OpenCL conformance test.

llvm-svn: 213089
2014-07-15 20:18:31 +00:00
Matt Arsenault
42709b4212 R600: Add predicate for UnsafeFPMath
llvm-svn: 213088
2014-07-15 20:18:24 +00:00
Matt Arsenault
b916da43e5 R600: Remove intrinsics that appear to be unused
llvm-svn: 213087
2014-07-15 20:10:27 +00:00