1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00
Commit Graph

123310 Commits

Author SHA1 Message Date
Michael Kuperstein
715d358c66 [X86] DAGCombine should not introduce FILD in soft-float mode
The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp 
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.

llvm-svn: 252042
2015-11-04 11:17:53 +00:00
James Molloy
080c19dae1 Revert "[PatternMatch] Switch to use ValueTracking::matchSelectPattern"
This was breaking the modules build and is being reverted while we reach consensus on the right way to solve this layering problem. This reverts commit r251785.

llvm-svn: 252040
2015-11-04 08:36:53 +00:00
Pawel Bylica
12fbf20eab Fix unit tests on Windows: handle env vars with non-ASCII chars.
Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.

Reviewers: yaron.keren

Subscribers: yaron.keren, llvm-commits

Differential Revision: http://reviews.llvm.org/D14278

llvm-svn: 252039
2015-11-04 08:25:20 +00:00
Sanjoy Das
35a16a42b9 [OperandBundles] Refactor; NFCI.
Extract out a helper function `operandBundleFromBundleOpInfo`.

llvm-svn: 252038
2015-11-04 04:31:21 +00:00
Sanjoy Das
ce6d998be5 [OperandBundles] Refactor; NFCI
Intended to make later changes simpler.  Exposes
`getBundleOperandsStartIndex` and `getBundleOperandsEndIndex`, and uses
them for the computation in `getNumTotalBundleOperands`.

llvm-svn: 252037
2015-11-04 04:31:06 +00:00
Philip Reames
93ead3ba46 [LVI] Update a comment to clarify what's actually happening and why
llvm-svn: 252033
2015-11-04 01:47:04 +00:00
Philip Reames
d4059ff8b7 [CVP] Fold return values if possible
In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns.

The motivation for this is two fold:
 * The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified.
 * It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263.

Differential Revision: http://reviews.llvm.org/D14271

llvm-svn: 252032
2015-11-04 01:43:54 +00:00
Igor Laevsky
c5001abb24 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
 allow to lower call safepoints with relocates and results in a different 
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

llvm-svn: 252028
2015-11-04 01:16:10 +00:00
Alexey Samsonov
5c12917a28 Fix the test case for Windows.
llvm-svn: 252027
2015-11-04 01:09:37 +00:00
Alexey Samsonov
4cb73711bb [LLVMSymbolize] Reduce indentation by using helper function. NFC.
llvm-svn: 252022
2015-11-04 00:30:26 +00:00
Alexey Samsonov
27ddee3db7 [LLVMSymbolize] Properly propagate object parsing errors from the library.
llvm-svn: 252021
2015-11-04 00:30:24 +00:00
Alexey Samsonov
198315389a [llvm-symbolizer] Improve the test for missing input file.
llvm-svn: 252020
2015-11-04 00:30:19 +00:00
Adam Nemet
24ab61577e Fix unused variable warning from r252017
llvm-svn: 252019
2015-11-04 00:10:33 +00:00
Adam Nemet
933537bc6b LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

llvm-svn: 252017
2015-11-03 23:50:08 +00:00
Adam Nemet
6d8e7bca0f [LAA] LLE 5/6: Add predicate functions Dependence::isForward/isBackward, NFC
Summary: Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13258

llvm-svn: 252016
2015-11-03 23:50:03 +00:00
Adam Nemet
ea9a067ee3 [LAA] LLE 4/6: APIs to access the dependent instructions for a dependence, NFC
Summary:
The functions use LAI and MemoryDepChecker classes so they need to be
defined after those definitions outside of the Dependence class.

Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13257

llvm-svn: 252015
2015-11-03 23:49:58 +00:00
Peter Collingbourne
284b8079b3 CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().

It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.

NFCI.

Differential Revision: http://reviews.llvm.org/D14168

llvm-svn: 252014
2015-11-03 23:40:03 +00:00
Matt Arsenault
510281e46d AMDGPU: Make flat_scratch name consistent
The printed name and the parsed assembler names weren't the same.
I'm not sure which name SC prints these as, but I think it's this one.

llvm-svn: 252010
2015-11-03 22:50:34 +00:00
Matt Arsenault
1e730f96ac AMDGPU: Fix asserts on invalid register ranges
If the requested SGPR was not actually aligned, it was
accepted and rounded down instead of rejected.

Also fix an assert if the range is an invalid size.

llvm-svn: 252009
2015-11-03 22:50:32 +00:00
Matt Arsenault
b15a099217 AMDGPU: Fix off by one error in register parsing
If trying to use one past the end, this would assert.

llvm-svn: 252008
2015-11-03 22:50:27 +00:00
Derek Schuff
3b7694d3d7 Address nit
llvm-svn: 252004
2015-11-03 22:40:45 +00:00
Derek Schuff
3d1dc78633 Align whitespace
llvm-svn: 252003
2015-11-03 22:40:43 +00:00
Derek Schuff
d517fd90fa [WebAssembly] Support wasm select operator
Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.

Reviewers: sunfish

Subscribers: dschuff, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D14295

llvm-svn: 252002
2015-11-03 22:40:40 +00:00
Matt Arsenault
086c4e9cfd AMDGPU: s[102:103] is unavailable on VI
llvm-svn: 252000
2015-11-03 22:39:52 +00:00
Matt Arsenault
e66b0aba3a AMDGPU: Define correct number of SGPRs
There are actually 104 so 2 were missing.

More assembler tests with high register number tuples
will be included in later patches.

llvm-svn: 251999
2015-11-03 22:39:50 +00:00
Matt Arsenault
62d416ff43 AMDGPU: Make findUsedSGPR more readable
Add more comments etc.

llvm-svn: 251996
2015-11-03 22:30:15 +00:00
Matt Arsenault
16a158d1ea AMDGPU: Initialize SIFixSGPRCopies so -print-after works
llvm-svn: 251995
2015-11-03 22:30:13 +00:00
Matt Arsenault
6851d38056 AMDGPU: Alphabetize includes
llvm-svn: 251994
2015-11-03 22:30:08 +00:00
Fiona Glaser
abcc6a8ee2 InstCombine: fix sinking of convergent calls
llvm-svn: 251991
2015-11-03 22:23:39 +00:00
Simon Pilgrim
31ffa103fc [SelectionDAG] Use existing constant nodes instead of recreating them. NFC.
llvm-svn: 251990
2015-11-03 22:21:38 +00:00
Alexey Samsonov
37b571b778 [LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC.
Introduce DIPrinter which takes care of rendering DILineInfo and
friends. This allows LLVMSymbolizer class to return a structured data
instead of plain std::strings.

llvm-svn: 251989
2015-11-03 22:20:52 +00:00
Simon Pilgrim
90953f232d [X86][AVX] Tweaked shuffle stack folding tests
To avoid alternative lowerings.

llvm-svn: 251986
2015-11-03 21:58:35 +00:00
Adam Nemet
8ce9fb467e [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13256

llvm-svn: 251985
2015-11-03 21:39:52 +00:00
Simon Pilgrim
034b12fe37 [X86][AVX512] Fixed shuffle test name to match shuffle
llvm-svn: 251984
2015-11-03 21:39:30 +00:00
Alexey Samsonov
785313f5bc [LLVMSymbolize] Move demangling away from printing routines. NFC.
Make printDILineInfo and friends responsible for just rendering the
contents of the structures, demangling should actually be performed
earlier, when we have the information about the originating
SymbolizableModule at hand.

llvm-svn: 251981
2015-11-03 21:36:13 +00:00
Davide Italiano
063a880856 [SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)
This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.

Before (on FreeBSD x86-64):
4007f0:       50                      push   %rax
4007f1:       f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
4007f6:       e8 75 fd ff ff          callq  400570 <exp2@plt>
4007fb:       f2 0f 10 0c 24          movsd  (%rsp),%xmm1
400800:       58                      pop    %rax
400801:       e9 7a fd ff ff          jmpq   400580 <pow@plt>
400806:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
40080d:       00 00 00

After:
4007b0:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
4007b4:       e9 87 fd ff ff          jmpq   400540 <exp2@plt>
4007b9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Differential Revision:	http://reviews.llvm.org/D14045

llvm-svn: 251976
2015-11-03 20:32:23 +00:00
Simon Pilgrim
ac4c196247 [X86][XOP] Add support for the matching of the VPCMOV bit select instruction
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

This patch adds tablegen pattern matching for this instruction.

Differential Revision: http://reviews.llvm.org/D8841

llvm-svn: 251975
2015-11-03 20:27:01 +00:00
Rui Ueyama
2aae8dc2fb llmv-pdbdump: Make BuiltinDumper shorter. NFC.
llvm-svn: 251974
2015-11-03 20:16:18 +00:00
Adam Nemet
b9c59b29d9 [LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence
Summary:
When the dependence distance in zero then we have a loop-independent
dependence from the earlier to the later access.

No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.

This and the previous patch were tested together for compile-time
regression.  None found in LNT/SPEC.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13255

llvm-svn: 251973
2015-11-03 20:13:43 +00:00
Adam Nemet
c168f14a53 [LAA] LLE 1/6: Expose Forward dependences
Summary:
Before this change, we didn't use to collect forward dependences since
none of the current clients (LV, LDist) required them.

The motivation to also collect forward dependences is a new pass
LoopLoadElimination (LLE) which discovers store-to-load forwarding
opportunities across the loop's backedge.  The pass uses both lexically
forward or backward loop-carried dependences to detect these
opportunities.

The new pass also analyzes loop-independent (forward) dependences since
they can conflict with the loop-carried dependences in terms of how the
data flows through memory.

The newly added test only covers loop-carried forward dependences
because loop-independent ones are currently categorized as NoDep.  The
next patch will fix this.

The two patches were tested together for compile-time regression.  None
found in LNT/SPEC.

Note that with this change LAA provides all dependences rather than just
"interesting" ones.  A subsequent NFC patch will remove the now trivial
isInterestingDependence and rename the APIs.

Reviewers: hfinkel

Subscribers: jmolloy, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13254

llvm-svn: 251972
2015-11-03 20:13:23 +00:00
Rafael Espindola
dc3ad4835d Don't create empty sections just to look like gas.
We are long past the time when this much bug for bug compatibility was
useful.

llvm-svn: 251970
2015-11-03 20:02:22 +00:00
Rafael Espindola
454fc24e96 Relax a few more overspecified tests.
llvm-svn: 251967
2015-11-03 19:38:19 +00:00
Teresa Johnson
93fae75d76 Revert "Move metadata linking after lazy global materialization/linking."
This reverts commit r251926. I believe this is causing an LTO
bootstrapping bot failure
(http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/).

Haven't been able to repro it yet, but after looking at the metadata I
am pretty sure I know what is going on.

llvm-svn: 251965
2015-11-03 19:36:04 +00:00
Rafael Espindola
7d3b847175 Remove unnecessary dependency on section and string positions.
llvm-svn: 251964
2015-11-03 19:24:17 +00:00
Kostya Serebryany
6ba411ce7a [libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to it's capacity
llvm-svn: 251961
2015-11-03 18:57:25 +00:00
Rafael Espindola
5583e816fe Delete dead code.
llvm-svn: 251960
2015-11-03 18:55:58 +00:00
Rafael Espindola
7953bd5d91 Simplify local common output.
We now create them as they are found and use higher level APIs.

This is a step in avoiding creating unnecessary sections.

llvm-svn: 251958
2015-11-03 18:50:51 +00:00
Igor Laevsky
691dbb68d2 [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258

llvm-svn: 251957
2015-11-03 18:37:40 +00:00
Rafael Espindola
9103955acf Move code out of a loop and use a range loop.
llvm-svn: 251952
2015-11-03 18:04:07 +00:00
Rafael Espindola
7ec60e8686 Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines.""
This reverts commit r251937.

The test was updated to the new API, bring the API back.

llvm-svn: 251944
2015-11-03 16:40:37 +00:00