1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

108342 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith
e70bbe654f DIBuilder: Remove dead code
I neglected to update `DIBuilder::createPieceExpression()` in r218797,
which I noticed while rebasing a patch for PR17891.  On closer
inspection, it looks like dead code.

If there are any downstream users of this, you should transition to the
more general `createExpression()`.  Or, we can add this back, but then
it should just forward to `createExpression()`.

llvm-svn: 218820
2014-10-01 21:14:20 +00:00
Chandler Carruth
efba997ca5 [x86] Merge the remaining test cases into vector-blend.ll and remove all
the ISA-specific test files.

llvm-svn: 218818
2014-10-01 21:07:07 +00:00
Eric Christopher
bce38d60f8 Now that the optimization level is adjusting the feature string
before we hit the subtarget, remove the constructor parameter.

llvm-svn: 218817
2014-10-01 21:05:35 +00:00
Chandler Carruth
17a5f1f40d [x86] Expand the ISA coverage of our blend test in preparation for
merging ISA-specific testing into this file.

llvm-svn: 218816
2014-10-01 21:03:21 +00:00
Argyrios Kyrtzidis
8575c06fb9 Adds 'override' to overriding methods. NFC.
llvm-svn: 218815
2014-10-01 21:00:44 +00:00
Chandler Carruth
54cba461be [x86] Merge the interesting test cases from blend-msb.ll into
vector-blend.ll and remove the former.

llvm-svn: 218814
2014-10-01 20:56:57 +00:00
Chandler Carruth
eb0ef09cba [x86] Move the AVX blend test to a generic name. I'm going to fold other
blend tests into this one.

llvm-svn: 218813
2014-10-01 20:52:55 +00:00
Chandler Carruth
fe43286675 [x86] Remove a test that wasn't doing anything really. We have plenty of
better tests for zext of vectors at this point.

llvm-svn: 218811
2014-10-01 20:50:58 +00:00
Chandler Carruth
1757cc6b66 [x86] Add a 32-bit run to the sext test, and remove a sad vec_sext.ll
test file.

This old test had a bunch of functions that were never even checked. =/
The only thing it really did was to make sure that we did something
reasonable in 32-bit mode with SSE4.1. Adding another run line to the
main vector-sext.ll test seems a better way to do that.

llvm-svn: 218810
2014-10-01 20:49:54 +00:00
Chandler Carruth
d9dbba2329 [x86] Teach both sext and zext vector tests to cover a nice wide range
of architectures: SSE2, SSSE3, SSE4.1, AVX, and AVX2.

Unfortunately, this exposses the absolute horror of the code we generate
for many of these patterns. Anyone wanting to familiarize themselves
with the x86 backend and improve performance could do a lot of good
sitting down and making these test cases not look so terrible. While the
new vector shuffle code I'm working on well help some, it won't fix all
of the crimes here.

llvm-svn: 218807
2014-10-01 20:41:36 +00:00
Eric Christopher
4ce55b7a5c Rework the PPC TargetMachine so that the non-function specific
overrides happen at TargetMachine creation and not on every
subtarget creation.

llvm-svn: 218805
2014-10-01 20:38:26 +00:00
Eric Christopher
5245670b4d constify TargetMachine parameter for X86TargetLowering.
llvm-svn: 218804
2014-10-01 20:38:22 +00:00
Sanjay Patel
790c02f096 Make the sqrt intrinsic return undef for a negative input.
As discussed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140609/220598.html

And again here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/077168.html

The sqrt of a negative number when using the llvm intrinsic is undefined. 
We should return undef rather than 0.0 to match the definition in the LLVM IR lang ref.

This change should not affect any code that isn't using "no-nans-fp-math"; 
ie, no-nans is a requirement for generating the llvm intrinsic in place of a sqrt function call.

Unfortunately, the behavior introduced by this patch will not match current gcc, xlc, icc, and 
possibly other compilers. The current clang/llvm behavior of returning 0.0 doesn't either. 
We knowingly approve of this difference with the other compilers in an attempt to flag code 
that is invoking undefined behavior.

A front-end warning should also try to convince the user that the program will fail:
http://llvm.org/bugs/show_bug.cgi?id=21093

Differential Revision: http://reviews.llvm.org/D5527

llvm-svn: 218803
2014-10-01 20:36:33 +00:00
Chandler Carruth
0c80c6833d [x86] Sort the ISA-specific RUN lines for vector-sext.ll to go from
oldest to newest. This makes more sense to me and is more consistent
with other tests.

llvm-svn: 218802
2014-10-01 20:32:44 +00:00
Tim Northover
f58482e671 ARM: yes it can (as of r218789)
llvm-svn: 218801
2014-10-01 20:31:58 +00:00
Chandler Carruth
c6c3e58c92 [x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll.
These tests are far and away the best sext and zext tests we have for
vectors. I'm going to merge the other similar tests into them and expand
the ISA coverage.

llvm-svn: 218800
2014-10-01 20:30:30 +00:00
Chandler Carruth
fbc2708243 [x86] Cleanup and re-generate the checks for avx-zext.ll using the new
script.

llvm-svn: 218799
2014-10-01 20:27:16 +00:00
Duncan P. N. Exon Smith
51b31a68a3 DIBuilder: Encapsulate DIExpression's element type
`DIExpression`'s elements are 64-bit integers that are stored as
`ConstantInt`.  The accessors already encapsulate the storage.  This
commit updates the `DIBuilder` API to also encapsulate that.

llvm-svn: 218797
2014-10-01 20:26:08 +00:00
Chandler Carruth
a1f7067666 [x86] Generate the FileCheck assertions for avx-blend.ll with my new
script to make them nice and predictable. This will ease updating them
for the new vector shuffle lowering and seeing the delta if any.

llvm-svn: 218795
2014-10-01 20:19:45 +00:00
Chandler Carruth
f4596dd7b6 [x86] Clean up and generate detailed FileCheck assertions for
avx-sext.ll using my new script.

Also add an AVX2 mode to this test.

Part of cleaning up the test suite before enabling the new vector
shuffle lowering. This also highlights some of the abysmal failures of
the old shuffle lowering. Check out those 'pinsrw' and 'pextrw'
sequences!

llvm-svn: 218794
2014-10-01 20:19:32 +00:00
Bruno Cardoso Lopes
c2183ae743 [MemoryDepAnalysis] Fix compile time slowdown
- Problem
One program takes ~3min to compile under -O2. This happens after a certain
function A is inlined ~700 times in a function B, inserting thousands of new
BBs. This leads to 80% of the compilation time spent in
GVN::processNonLocalLoad and
MemoryDependenceAnalysis::getNonLocalPointerDependency, while searching for
nonlocal information for basic blocks.

Usually, to avoid spending a long time to process nonlocal loads, GVN bails out
if it gets more than 100 deps as a result from
MD->getNonLocalPointerDependency.  However this only happens *after* all
nonlocal information for BBs have been computed, which is the bottleneck in
this scenario. For instance, there are 8280 times where
getNonLocalPointerDependency returns deps with more than 100 bbs and from
those, 600 times it returns more than 1000 blocks.

- Solution
Bail out early during the nonlocal info computation whenever we reach a
specified threshold.  This patch proposes a 100 BBs threshold, it also
reduces the compile time from 3min to 23s.

- Testing
The test-suite presented no compile nor execution time regressions.

Some numbers from my machine (x86_64 darwin):
 - 17s under -Oz (which avoids inlining).
 - 1.3s under -O1.
 - 2m51s under -O2 ToT
 *** 23s under -O2 w/ Result.size() > 100
 - 1m54s under -O2 w/ Result.size() > 500

With NumResultsLimit = 100, GVN yields the same outcome as in the
unlimited 3min version.

http://reviews.llvm.org/D5532
rdar://problem/18188041

llvm-svn: 218792
2014-10-01 20:07:13 +00:00
Sanjay Patel
923727403c Don't repeat function/variable name in comment. NFC.
llvm-svn: 218791
2014-10-01 19:39:32 +00:00
Adam Nemet
e0d1a483d8 [X86 disasm tblegen backend] Clean up numPhysicalOperands asserts
No functionality change intended.

This implements Elena's idea to put the new additionalOperand outside the
switch to cover all cases
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140929/237763.html).

Note only nontrivial change is in MRMSrcMemFrm.  This requires an inclusive
interval of [2, 4] because we have prefix-dependent *optional* immediate
operand.

llvm-svn: 218790
2014-10-01 19:28:11 +00:00
Tim Northover
64a066f76a ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.

rdar://problem/18011155

llvm-svn: 218789
2014-10-01 19:21:03 +00:00
Adrian Prantl
2b1df58ebe Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Duncan P. N. Exon Smith
ee9cf8af02 LTO: Add missing target triple from r218784
llvm-svn: 218786
2014-10-01 18:49:58 +00:00
Reed Kotler
f0d9d7da37 Add fptrunc to mips fast-sel
Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel

Test Plan:
fptrunc.ll
checked also with 4 internal mips build bot flavors mip32r1/miprs32r2 and at -O0 and -O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: rfuhler

Differential Revision: http://reviews.llvm.org/D5553

llvm-svn: 218785
2014-10-01 18:47:02 +00:00
Duncan P. N. Exon Smith
236c4250e7 LTO: Ignore disabled diagnostic remarks
r206400 and r209442 added remarks that are disabled by default.
However, if a diagnostic handler is registered, the remarks are sent
unfiltered to the handler.  This is the right behaviour for clang, since
it has its own filters.

However, the diagnostic handler exposed in the LTO API receives only the
severity and message.  It doesn't have the information to filter by pass
name.  For LTO, disabled remarks should be filtered by the producer.

I've changed `LLVMContext::setDiagnosticHandler()` to take a `bool`
argument indicating whether to respect the built-in filters.  This
defaults to `false`, so other consumers don't have a behaviour change,
but `LTOCodeGenerator::setDiagnosticHandler()` sets it to `true`.

To make this behaviour testable, I added a `-use-diagnostic-handler`
command-line option to `llvm-lto`.

This fixes PR21108.

llvm-svn: 218784
2014-10-01 18:36:03 +00:00
David Blaikie
7b8bbd0e46 Add an immovable type to test Optional<T>::emplace more rigorously after r218732.
llvm-svn: 218783
2014-10-01 18:29:44 +00:00
Adrian Prantl
0959156fa3 Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl
229943585f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Tom Stellard
7f89ae5275 R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol table
llvm-svn: 218776
2014-10-01 17:15:17 +00:00
Tom Stellard
a5133e90d3 C API: Add LLVMCloneModule()
llvm-svn: 218775
2014-10-01 17:14:57 +00:00
Jingyue Wu
784352be06 Revert r216862 due to a performance regression
Reported by Alexey Volkov in PR21115

llvm-svn: 218771
2014-10-01 15:22:13 +00:00
Toma Tabacu
ff071cc027 [mips] Rename emit and parse functions for the .cpload assembler directive. NFC.
Summary: It's better if we have a consistent name for .cpload-related functions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5437

llvm-svn: 218768
2014-10-01 14:53:19 +00:00
Tom Stellard
ed0eb691ca R600/SI: Add a generic pseudo EXP instruction
llvm-svn: 218767
2014-10-01 14:44:45 +00:00
Tom Stellard
c3d0f56e3e R600/SI: Add generic pseudo MTBUF instructions
llvm-svn: 218766
2014-10-01 14:44:43 +00:00
Tom Stellard
a2e56fb3c0 R600/SI: Add generic pseudo SMRD instructions
llvm-svn: 218765
2014-10-01 14:44:42 +00:00
Oliver Stannard
d5cdc39fe7 [ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5
Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.

llvm-svn: 218763
2014-10-01 13:13:18 +00:00
Chandler Carruth
327441cee3 [x86] Fix a few more tiny patterns with the new vector shuffle lowering
that keep cropping up in the regression test suite.

This also addresses one of the issues raised on the mailing list with
failing to form 'movsd' in as many cases as we realistically should.
There will be corresponding patches forthcoming for v4f32 at least. This
was a lot of fuss for a relatively small gain, but all the fuss was on
my end trying different ways of holding the pieces of the x86 fragment
patterns *just right*. Now that it works, the code is reasonably simple.

In the new test cases I'm adding here, v2i64 sticks out as just plain
horrible. I've not come up with any great ideas here other than that it
would be nice to recognize when we're *going* to take a domain crossing
hit and cross earlier to get the decent instructions. At least with AVX
it is slightly less silly....

llvm-svn: 218756
2014-10-01 11:14:02 +00:00
Chandler Carruth
52fe0fc691 [x86] Delete some extraneous logic from the new vector shuffle lowering.
Nothing was relying on this and there are potentially some edge cases
that it would not be correct under. Removing it seems better than trying
to "fix" it as nothing was relying on it.

llvm-svn: 218755
2014-10-01 11:13:57 +00:00
Tom Coxon
50ff005894 [AArch64] Allow access to all system registers with MRS/MSR instructions.
The A64 instruction set includes a generic register syntax for accessing
implementation-defined system registers. The syntax for these registers is:
    S<op0>_<op1>_<CRn>_<CRm>_<op2>

The encoding space permitted for implementation-defined system registers
is:
    op0 op1  CRn   CRm   op2
    11  xxx  1x11  xxxx  xxx

The full encoding space can now be accessed:
    op0 op1  CRn   CRm   op2
    xx  xxx  xxxx  xxxx  xxx

This is useful to anyone needing to write assembly code supporting new
system registers before the assembler has learned the official names for
them.

llvm-svn: 218753
2014-10-01 10:13:59 +00:00
Evgeniy Stepanov
07f2031151 Revert r218721, r218735.
Failing bootstrap on Linux (arm, x86).

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13139/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/470
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/8518

llvm-svn: 218752
2014-10-01 10:07:28 +00:00
Asiri Rathnayake
0e12aa2ad9 Add missing natual vector cast.
Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.

llvm-svn: 218751
2014-10-01 09:59:45 +00:00
NAKAMURA Takumi
db6522181c ADTTests/OptionalTest.cpp: Use LLVM_DELETED_FUNCTION.
llvm-svn: 218750
2014-10-01 09:14:43 +00:00
Oliver Stannard
f4985610cd [ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)
The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and
FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be
modelled using the same target feature, and all double-precision
operations are already disabled by the fp-only-sp target features.

llvm-svn: 218747
2014-10-01 09:02:17 +00:00
Daniel Sanders
57fd199eaa [mips] Fix disassembly of [ls][wd]c[23], cache, and pref
Fixes PR21015, and PR20993.                                                       
                                                                                  
Patch by Jun Koi

llvm-svn: 218745
2014-10-01 08:26:55 +00:00
Sasa Stankovic
f65f1c04c4 [mips] For indirect calls we don't need $gp to point to .got. Mips linker
doesn't generate lazy binding stub for a function whose address is taken in
the program.

Differential Revision: http://reviews.llvm.org/D5067

llvm-svn: 218744
2014-10-01 08:22:21 +00:00
Justin Bogner
47571604c9 test: XFAIL the non-darwin gmlt test on darwin
r218702 disabled a -gmlt optimization for darwin, but this means the
non-darwin test isn't working there anymore.

llvm-svn: 218742
2014-10-01 05:45:45 +00:00
Lang Hames
61140d8e80 [MCJIT] Turn the getSymbolAddress free function created in r218626 into a static
member of RTDyldMemoryManager (and rename to getSymbolAddressInProcess).

The functionality this provides is very specific to RTDyldMemoryManager, so it
makes sense to keep it in that class to avoid accidental re-use.

No functional change.

llvm-svn: 218741
2014-10-01 04:11:13 +00:00