1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00
Commit Graph

155044 Commits

Author SHA1 Message Date
Davide Italiano
234ce81e02 [NewPassManager] Run global dead code elimination after the inliner.
This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.

llvm-svn: 315003
2017-10-05 18:36:01 +00:00
Balaram Makam
8affd7c83b [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for ARM/AArch64
Summary: Make test robust enough to not fail due to CFG changes and re-enable for ARM/AArch64.

Reviewers: rovka, fhahn

Reviewed By: fhahn

Subscribers: fhahn, aemerson, rengolin, mcrosier, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38590

llvm-svn: 315002
2017-10-05 18:33:34 +00:00
Reid Kleckner
abb6374299 [X86] Simplify X86 epilogue frame size calculation, NFC
Sink the insertion of "pop ebp" out of the frame size calculation
branches. They all check for HasFP.

Our handling of CLEANUPRET and CATCHRET was equivalent, both are
funclets and use the same frame size. We can eliminate the CLEANUPRET
case.

Hoist the hasFP(MF) query into a local bool.

Rename TargetMBB to CatchRetTarget to be more descriptive.

Eliminate the Optional<unsigned> RetOpcode local, now that it has one
use.

It's only a net savings of 10 lines, but hopefully it's *slightly* more
readable.

llvm-svn: 315000
2017-10-05 18:27:08 +00:00
Davide Italiano
f1ba0c8d8e [PassManager] Improve the interaction between -O2 and ThinLTO.
Run GDCE slightly later so that we don't have to repeat it
twice when preparing for Thin. Thanks to Mehdi for the suggestion.

llvm-svn: 314999
2017-10-05 18:23:25 +00:00
Davide Italiano
ade71dcea1 [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweight the cost, making the whole pipeline
faster.

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

llvm-svn: 314997
2017-10-05 18:06:37 +00:00
Matthew Simpson
e6be9e96c4 [SparsePropagation] Move member definitions to header (NFC)
AbstractLatticeFunction and SparseSolver are class templates parameterized by a
lattice value, so we need to move these member functions over to the header.

Differential Revision: https://reviews.llvm.org/D38561

llvm-svn: 314996
2017-10-05 18:03:30 +00:00
Petar Jovanovic
b4543f0948 [mips] implement .set dspr2 directive
Implement .set dspr2 directive with appropriate feature bits. This
directive is a counterpart of -mattr=dspr2 command line option with the
exception that it does not influence elf header flags.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D38537

llvm-svn: 314994
2017-10-05 17:40:32 +00:00
Matt Arsenault
409ab5393f AMDGPU: Set v2i32 any_extend to expand
llvm-svn: 314993
2017-10-05 17:38:30 +00:00
Krzysztof Parzyszek
973a56b36f [RDF] Simplify construction of maximal registers
The old algoritm was not correct, although it worked most of the time.
Avoid the complex reachability analysis and simply calculate the maximal
registers out of the set of all referenced registers.

llvm-svn: 314991
2017-10-05 17:12:49 +00:00
Rong Xu
c705f0e0e8 [ProfileData] Fix data racing in merging indexed profiles
There is data racing to the static variable RecordIndex in index profile reader
when merging in multiple threads. Make it a member variable in
IndexedInstrProfReader to fix this.

Differential Revision: https://reviews.llvm.org/D38431

llvm-svn: 314990
2017-10-05 17:05:20 +00:00
Artur Pilipenko
d636a4062e [X86] Fix chains update when lowering BUILD_VECTOR to a vector load
The code which lowers BUILD_VECTOR of consecutive loads into a single vector
load doesn't update chains properly. As a result the vector load can be
reordered with the store to the same location.

The current code in EltsFromConsecutiveLoads only updates the chain following
the first load. The fix is to update the chains following all the loads
comprising the vector.

This is a fix for PR10114.

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D38547

llvm-svn: 314988
2017-10-05 16:28:21 +00:00
Konstantin Zhuravlyov
fbd1d211cf AMDGPU: Add and set AMDGPU-specific e_flags
Differential Revision: https://reviews.llvm.org/D38556

llvm-svn: 314987
2017-10-05 16:19:18 +00:00
Ayal Zaks
d3b59ba5b1 [LV] Fix PR34743 - handle casts that sink after interleaved loads
When ignoring a load that participates in an interleaved group, make sure to
move a cast that needs to sink after it.

Testcase derived from reproducer of PR34743.

Differential Revision: https://reviews.llvm.org/D38338

llvm-svn: 314986
2017-10-05 15:45:14 +00:00
Clement Courbet
f0d5e92683 Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."""
broken test on windows

This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca.

llvm-svn: 314985
2017-10-05 14:42:06 +00:00
Sanjay Patel
ead684532b revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)
There is a bot failure that appears to be related to this change:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117

...so reverting to confirm that and attempting to keep the bot green while investigating.

llvm-svn: 314984
2017-10-05 14:26:15 +00:00
Javed Absar
98f52cdd94 [TablgeGen] : Tidy up CodeGenSchedule. NFC.
Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38534

llvm-svn: 314982
2017-10-05 13:27:43 +00:00
Ayal Zaks
44ffc1b606 [LV] Fix PR34711 - widen instruction ranges when sinking casts
Instead of trying to keep LastWidenRecipe updated after creating each recipe,
have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check
if it's a VPWidenRecipe when attempting to extend its range. This ensures that
such extensions, optimized to maintain the original instruction order, do so
only when the instructions are to maintain their relative order. The latter does
not always hold, e.g., when a cast needs to sink to unravel first order
recurrence (r306884).

Testcase derived from reproducer of PR34711.

Differential Revision: https://reviews.llvm.org/D38339

llvm-svn: 314981
2017-10-05 12:41:49 +00:00
Clement Courbet
05c746ebcd Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""
llvm-svn: 314980
2017-10-05 12:39:57 +00:00
Simon Dardis
b2f10c3cc3 [mips] Place certain 64 bit FPU instructions in their own decoder namespace
Previously, instructions that were defined to use the FGR64 register class
were associated with the Mips64 table which was incorrect.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38454

llvm-svn: 314976
2017-10-05 10:27:37 +00:00
Karl-Johan Karlsson
0d005669c5 [DebugInfo] Insert DEBUG_VALUEs after each register redefinition
Summary:
When reinserting debug values after register allocation, make sure to
insert debug values after each redefinition of debug value register in
the slot index range. The reason for this is that DwarfDebug will end
the range of a debug variable when the physical reg is defined. For
instructions with e.g. tied operands this result in prematurely ended
debug range.

This resolves pr34545

Patch by Karl-Johan Karlsson and Bjorn Pettersson

Reviewers: rnk, aprantl

Reviewed By: rnk

Subscribers: bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D38229

llvm-svn: 314974
2017-10-05 08:37:31 +00:00
George Rimar
fe2f59a586 [MC] - llvm-mc hangs on non-english characters.
Currently llvm-mc just hangs inside infinite loop
while trying to parse file which has ".section .с" inside,
where section name is non-english character.
Patch fixes the issue.

In this patch I also moved content of non-english-characters.s
to test/MC/AsmParser/Inputs folder  so that non-english-characters.s
becomes a single testcase for all invalid inputs containing non-english
symbols. That is convinent because llvm-mc otherwise tries
to parse and tokenize the whole testcase file with tools invocations and
it is harder to isolate the issue.

Differential revision: https://reviews.llvm.org/D38545

llvm-svn: 314973
2017-10-05 08:15:55 +00:00
Clement Courbet
fa18c6d0db Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."
Breaks
clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll

This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa.

llvm-svn: 314972
2017-10-05 08:03:39 +00:00
Craig Topper
9c3200ed1a [InstCombine] Fix a vector splat handling bug in transformZExtICmp.
We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us.

This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first.

Fixes PR34841.

llvm-svn: 314971
2017-10-05 07:59:11 +00:00
Clement Courbet
18754ff013 [MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.
Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later.

Reviewers: dberlin, spatel

Subscribers: davide, nemanjai

Differential Revision: https://reviews.llvm.org/D38232

llvm-svn: 314970
2017-10-05 07:49:09 +00:00
Mikael Holmen
4c352af309 Minor refactoring regarding Cast::isNoopCast(), NFC
Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.

With the above done, the remaining isNoopCast() could then be simplified
a bit more.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D38497

llvm-svn: 314969
2017-10-05 07:07:09 +00:00
Dean Michael Berris
7c47c0e53b [XRay][tools] Support arg1 logging entries in the basic logging mode
Summary:
The arg1 logging handler changed in compiler-rt to start writing a
different type for entries encountered when logging the first argument
of XRay-instrumented functions. This change allows the trace loader to
support reading these record types as well as prepare for when the
basic (naive) mode implementation starts writing down the argument
payloads.

Without this change, binaries with arg1 logging support enabled start
writing unreadable logs for any of the XRay tracing tools.

Reviewers: pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38550

llvm-svn: 314967
2017-10-05 05:18:17 +00:00
Sean Fertile
3695295d20 Enabling new pass manager in LTO (and thinLTO) link step.
Adds the option 'new-pass-manager' to the gold pluggin to enable using the
new pass manager during the lto/thinlto link step.

Patch by Graham Yiu.

 Differential Revision: https://reviews.llvm.org/D38517

llvm-svn: 314963
2017-10-05 01:48:42 +00:00
Xinliang David Li
9b1601a096 Revert r314928 to investigate thinLTO bootstrap failure
llvm-svn: 314961
2017-10-05 01:40:13 +00:00
Eugene Zelenko
6515aa7ae4 [X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314953
2017-10-05 00:33:50 +00:00
Matt Arsenault
d79bc6a3f1 AMDGPU: Add comment about clamps
llvm-svn: 314952
2017-10-05 00:13:20 +00:00
Matt Arsenault
263f0dbf65 AMDGPU: Do not fold clamp instructions when sources are different
Patch by hakzsam (Samuel Pitoiset)

llvm-svn: 314951
2017-10-05 00:13:17 +00:00
Craig Topper
d93296dbb4 [InstCombine] Improve support for ashr in foldICmpAndShift
We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users.

Differential Revision: https://reviews.llvm.org/D38521

llvm-svn: 314945
2017-10-04 23:06:13 +00:00
Matt Arsenault
6dee648d90 AMDGPU: Fix not accounting for instruction size in bundles
These were counted as 0. Fixes branch limit exceeded errors
in some large programs.

llvm-svn: 314944
2017-10-04 22:59:12 +00:00
Konstantin Zhuravlyov
298e3abdb4 AMDGPU: Correctly set EI_OSABI based on the os
Differential Revision: https://reviews.llvm.org/D38555

llvm-svn: 314943
2017-10-04 22:44:13 +00:00
Adrian Prantl
6d68e5d516 clang-format file.
llvm-svn: 314942
2017-10-04 22:26:19 +00:00
Adrian Prantl
55b7ddd894 delete commented out code.
llvm-svn: 314941
2017-10-04 22:26:19 +00:00
Sanjoy Das
6419f6461c Do not call Loop::getName on possibly dead loops
This fixes PR34832.

llvm-svn: 314938
2017-10-04 22:02:27 +00:00
Xin Tong
61d1a4c598 [MachineBlockPlacement] Make sure PreferredLoopExit is cleared everytime new loop is processed
Summary: Rotate on exit that actually exits the current loop.

Reviewers: davidxl, danielcdh, iteratee, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38563

llvm-svn: 314937
2017-10-04 21:39:25 +00:00
Hans Wennborg
e7ea46e7a9 Fix a -Wparentheses warning. NFC.
llvm-svn: 314936
2017-10-04 21:14:07 +00:00
Justin Lebar
c8a4e60a71 Convert an APInt to int64_t properly in TTI::getGEPCost().
Summary:
If the pointer width is 32 bits and the calculated GEP offset is
negative, we call APInt::getLimitedValue(), which does a
*zero*-extension of the offset.  That's wrong -- we should do an sext.

Fixes a bug introduced in rL314362 and found by Evgeny Astigeevich.

Reviewers: efriedma

Subscribers: sanjoy, javed.absar, llvm-commits, eastig

Differential Revision: https://reviews.llvm.org/D38557

llvm-svn: 314935
2017-10-04 20:47:33 +00:00
Marcello Maggioni
e3a4a32d00 [LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFC
llvm-svn: 314934
2017-10-04 20:42:46 +00:00
Rafael Espindola
e01aed9846 Bring r314809 back.
But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.

Original message:

Use sched_getaffinity instead of std:🧵:hardware_concurrency.

The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.

llvm-svn: 314931
2017-10-04 20:27:01 +00:00
Sanjay Patel
95e7a2f063 [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Xinliang David Li
63c93124d5 Recommit r314561 after fixing msan build failure
(trial 2) Incoming val defined by terminator instruction which
also requires bitcasts can not be handled.

llvm-svn: 314928
2017-10-04 20:17:55 +00:00
Guozhi Wei
e2af0b6a83 [TargetTransformInfo] Check if function pointer is valid before calling isLoweredToCall
Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.

Differential Revision: https://reviews.llvm.org/D38204

llvm-svn: 314927
2017-10-04 20:14:08 +00:00
Jun Bum Lim
a6292a6c0f Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

llvm-svn: 314923
2017-10-04 18:33:52 +00:00
Daniel Neilson
46cdd30793 Revert D38481 due to missing cmake check for CPU_COUNT
Summary:
This reverts D38481. The change breaks systems with older versions of glibc. It
injects a use of CPU_COUNT() from sched.h without checking to ensure that the
function exists first.

Reviewers:

Subscribers:

llvm-svn: 314922
2017-10-04 18:19:03 +00:00
Simon Pilgrim
480172cb76 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results.
AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts

llvm-svn: 314921
2017-10-04 18:00:42 +00:00
Krzysztof Parzyszek
87db7408be [Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC
llvm-svn: 314920
2017-10-04 18:00:15 +00:00
Hans Wennborg
05b9db2476 Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"
It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919
2017-10-04 17:54:06 +00:00