1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

52843 Commits

Author SHA1 Message Date
Sanjay Patel
b9989c7cc8 [PowerPC] add tests for FMF propagation; NFC
I'm choosing PPC out of convenience because it does
all of the transforms of interest in these tests by
default. There are multiple FMF problems shown in the 
current checks. D45710 is proposing to fix part of 
that.

llvm-svn: 331471
2018-05-03 17:41:37 +00:00
Bjorn Pettersson
80220423bb Reapply "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
Summary:
This reverts SVN r331441 (reapplies r331337), together with a fix
in to handle an already existing fragment expression in the
dbg.value that must be fragmented due to a split PHI node.

This should solve the problem seen in PR37321, which was the
reason for the revert of r331337.

The situation in PR37321 is that we have a PHI node like this

   %u.sroa = phi i80 [ %u.sroa.x, %if.x ],
                     [ %u.sroa.y, %if.y ],
                     [ %u.sroa.z, %if.z ]

and a dbg.value like this

  call void @llvm.dbg.value(metadata i80 %u.sroa,
                            metadata !13,
                            metadata !DIExpression(DW_OP_LLVM_fragment, 0, 80))

The phi node is split into three 32-bit PHI nodes

  %30:gr32 = PHI %11:gr32, %bb.4, %14:gr32, %bb.5, %27:gr32, %bb.8
  %31:gr32 = PHI %12:gr32, %bb.4, %15:gr32, %bb.5, %28:gr32, %bb.8
  %32:gr32 = PHI %13:gr32, %bb.4, %16:gr32, %bb.5, %29:gr32, %bb.8

but since the original value only is 80 bits we need to adjust the size
of the last fragment expression, and with this patch we get

  DBG_VALUE debug-use %30:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 0, 32)
  DBG_VALUE debug-use %31:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 32, 32)
  DBG_VALUE debug-use %32:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 64, 16)

Reviewers: vsk, aprantl, mstorsjo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46384

llvm-svn: 331464
2018-05-03 17:04:16 +00:00
Roman Lebedev
8cf123ed80 [CodeGen][X86][NFC] Copy two selectcc tests from AArch64.
These tests are for DAGCombiner::foldSelectCCToShiftAnd().
Right now, they were only tested for AArch64,
but given the upcoming X86 changes to the hasAndNot(),
the test coverage needs to be added.

These tests originated from D27489 / rL289738

llvm-svn: 331454
2018-05-03 13:33:07 +00:00
Simon Pilgrim
c4c90c5eac [X86] Split WriteVecALU/WritePHAdd into XMM and YMM/ZMM scheduler classes
llvm-svn: 331453
2018-05-03 13:27:10 +00:00
Tim Northover
9ef696c849 ARM: don't try to over-align large vectors as arguments.
By default LLVM thinks very large vectors get aligned to their size when
passed across functions. Unfortunately no-one told the ARM backend so it
doesn't trigger stack realignment and so accesses can cause the usual
misalignment issues (e.g. a data abort).

This changes the ABI alignment to the stack alignment, which in practice
(and as a bonus) also coincides with the alignment "natural" vectors get.

llvm-svn: 331451
2018-05-03 12:54:25 +00:00
Piotr Padlewski
997163a54e perform DSE through launder.invariant.group
Summary:
Alias Analysis knows that llvm.launder.invariant.group
returns pointer that mustalias argument, but this information
wasn't used, therefor we didn't DSE through launder.invariant.group

Reviewers: chandlerc, dberlin, bogner, hfinkel, efriedma

Reviewed By: dberlin

Subscribers: amharc, llvm-commits, nlewycky, rsmith

Differential Revision: https://reviews.llvm.org/D31581

llvm-svn: 331449
2018-05-03 11:03:53 +00:00
Piotr Padlewski
1e96fe1a21 Rename invariant.group.barrier to launder.invariant.group
Summary:
This is one of the initial commit of "RFC: Devirtualization v2" proposal:
https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing

Reviewers: rsmith, amharc, kuhar, sanjoy

Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45111

llvm-svn: 331448
2018-05-03 11:03:01 +00:00
Simon Pilgrim
b7289046cc [X86] Split WriteVecIMul/WriteVecPMULLD/WriteMPSAD/WritePSADBW into XMM and YMM/ZMM scheduler classes
Also retagged VDBPSADBW instructions as SchedWritePSADBW instead of SchedWriteVecIMul which matches the behaviour on SkylakeServer (the only thing that supports it...)

llvm-svn: 331445
2018-05-03 10:31:20 +00:00
Benjamin Kramer
8410bf3c2f [WebAssembly] MC: Don't litter test directory.
llvm-svn: 331442
2018-05-03 08:25:14 +00:00
Martin Storsjo
0e9558e4c6 Revert "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
This reverts SVN r331337, see PR37321 for details on the regression
it introduced.

llvm-svn: 331441
2018-05-03 07:09:33 +00:00
Craig Topper
711da34d55 [LoopIdiomRecognize] When looking for 'x & (x -1)' for popcnt, make sure the left hand side of the 'and' matches the left hand side of the 'subtract'
llvm-svn: 331437
2018-05-03 05:48:49 +00:00
Craig Topper
4ae293e526 [LoopIdiomRecognize] Add a test case showing that we transform to ctpop without fully checking the 'x & (x-1)' part.
The code fails to check that the same value is used twice. We only make sure the left hand side of the and is part of the loop recurrence. The 'x' in the subtract can be any value.

llvm-svn: 331436
2018-05-03 05:48:48 +00:00
Max Kazantsev
3dbdfd12d9 Re-enable "[SCEV] Make computeExitLimit more simple and more powerful"
This patch was temporarily reverted because it has exposed bug 37229 on
PowerPC platform. The bug is unrelated to the patch and was just a general
bug in the optimization done for PowerPC platform only. The bug was fixed
by the patch rL331410.

This patch returns the disabled commit since the bug was fixed.

llvm-svn: 331427
2018-05-03 02:37:55 +00:00
Michael Berg
dc3d19e5de MachineInst support mapping SDNode fast math flags for support in Back End code generation
Summary:
Machine Instruction flags for fast math support and MIR print support


Reviewers: spatel, arsenm

Reviewed By: arsenm

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D45781

llvm-svn: 331417
2018-05-03 00:07:56 +00:00
Nemanja Ivanovic
3c3f64c605 [PowerPC] Implement isMaskAndCmp0FoldingBeneficial
Sinking the and closer to a compare against zero is beneficial on PPC as it
allows us to emit record-form instructions. In the future, we may expand this
to a larger set of operations that feed compares against zero since PPC has
lots of record-form instructions.

Differential revision: https://reviews.llvm.org/D46060

llvm-svn: 331416
2018-05-02 23:55:23 +00:00
Sam Clegg
10436289e7 [WebAssembly] MC: Create and use first class section symbols
Differential Revision: https://reviews.llvm.org/D46335

llvm-svn: 331413
2018-05-02 23:11:38 +00:00
Nemanja Ivanovic
9dcc8baaaa [PowerPC] No CTR loop if the candidate exiting block is in a different loop
The CTR loops pass will insert the decrementing branch instruction in an exiting
block for the loop being transformed. However if that block is part of another
loop as well (whether a nested loop or with irreducible CFG), it is not valid
to use that exiting block. In fact, if the loop hass irreducible CFG, we don't
bother analyzing it and we just bail on the transformation. In practice, this
doesn't lead to a noticeable reduction in the number of loops transformed by
this pass.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37229

Differential Revision: https://reviews.llvm.org/D46162

llvm-svn: 331410
2018-05-02 22:56:04 +00:00
Chandler Carruth
213b57c660 [GCOV] Emit the writeout function as nested loops of global data.
Summary:
Prior to this change, LLVM would in some cases emit *massive* writeout
functions with many 10s of 1000s of function calls in straight-line
code. This is a very wasteful way to represent what are fundamentally
loops and creates a number of scalability issues. Among other things,
register allocating these calls is extremely expensive. While D46127 makes this
less severe, we'll still run into scaling issues with this eventually. If not
in the compile time, just from the code size.

Now the pass builds up global data structures modeling the inputs to
these functions, and simply loops over the data structures calling the
relevant functions with those values. This ensures that the code size is
a fixed and only data size grows with larger amounts of coverage data.

A trivial change to IRBuilder is included to make it easier to build
the constants that make up the global data.

Reviewers: wmi, echristo

Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D46357

llvm-svn: 331407
2018-05-02 22:24:39 +00:00
Martin Storsjo
214c7bcdc6 [llvm-rc] Default to writing the output next to the input, if no output is specified
This matches what rc.exe does if no output is specified.

Differential Revision: https://reviews.llvm.org/D46239

llvm-svn: 331403
2018-05-02 21:15:24 +00:00
Martin Storsjo
cb354baeb1 [llvm-cvtres] Allow parameters preceded by '-' in addition to '/'
The real cvtres.exe also allows parameters in either form.

Differential Revision: https://reviews.llvm.org/D46358

llvm-svn: 331402
2018-05-02 21:15:13 +00:00
Paul Semel
de5f098777 [llvm-objcopy] Add --discard-all (-x) option
llvm-svn: 331400
2018-05-02 20:19:22 +00:00
Paul Semel
9070f9fa91 [llvm-objcopy] Add --weaken option
llvm-svn: 331397
2018-05-02 20:14:49 +00:00
Roman Tereshin
58fdf359a9 [GlobalISel][InstructionSelect] Refactoring out a getMatchTable virtual method + other small NFC's
The main goal is to share getMatchTable between the Instruction
Selector and the Testgen.

The commit also contains some NFC only loosely related to refactoring
out the getMatchTable, but strongly related to the initial Testgen
patch (see https://reviews.llvm.org/D43962)

Reviewers: dsanders, aemerson

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, llvm-commits, dsanders

Differential Revision: https://reviews.llvm.org/D46096

llvm-svn: 331395
2018-05-02 20:07:15 +00:00
Martin Storsjo
6a5bc398fd [llvm-rc] Add rudimentary support for codepages
Only support UTF-8 (since LLVM contains UTF-8 parsing support
already, and the code even does that already) and Windows-1252
(where most code points has the same value in unicode). Keep the
existing default as only allowing ASCII input.

Using the option type JoinedOrSeparate, since the real rc.exe
handles options in this form, even if llvm-rc uses Separate for
other similar existing options.

Rename the struct SearchParams to WriterParams since it's now used
for more than just include paths.

Add a missing getResourceTypeName method to the BundleResource class,
to fix error printing from within STRINGTABLE resources (used in
tests).

Differential Revision: https://reviews.llvm.org/D46238

llvm-svn: 331391
2018-05-02 19:43:44 +00:00
Simon Pilgrim
e2abbcab78 [X86][SNB] Fix scheduling of MMX integer multiply instructions.
The entries were being bound to the wrong class.

llvm-svn: 331388
2018-05-02 19:26:14 +00:00
Simon Pilgrim
92b8f5874f [X86] Split WriteShuffle/WriteVarShuffle + WriteBlend/WriteVarBlend into XMM and YMM/ZMM scheduler classes
llvm-svn: 331386
2018-05-02 18:48:23 +00:00
Martin Storsjo
f77528e6b4 [COFF, ARM64] Hook up a few remaining relocations
Differential Revision: https://reviews.llvm.org/D46355

llvm-svn: 331384
2018-05-02 18:24:37 +00:00
Farhana Aleen
ba85d939d0 [AMDGPU] A trivial fix for a buildbot failure caused by "commit 224a839fcbbead221f872cd32a1dd0c308d37299".
Author: FarhanaAleen
llvm-svn: 331383
2018-05-02 18:16:39 +00:00
Daniel Sanders
c59dd75a2c [reassociate] Fix excessive revisits when processing long chains of reassociatable instructions.
Summary:
Some of our internal testing detected a major compile time regression which I've
tracked down to:
    r278938 - Revert "Reassociate: Reprocess RedoInsts after each inst".
It appears that processing long chains of reassociatable instructions causes
non-linear (potentially exponential) growth in the number of times an
instruction is revisited. For example, the included test revisits instructions
220 times in a 20-instruction test.

It appears that r278938 reversed the order instructions were visited and that
this is preventing scheduled revisits from being cancelled as a result of
visiting the instructions naturally during normal processing. However, simply
reversing the order also harmed the generated code. Upon closer inspection, it
was discovered that revisits occurred in the opposite order to the first pass
(Thanks to escha for spotting that).

This patch makes the revisit order consistent with the first pass which allows
more revisits to be cancelled. This does appear to have a small impact on the
generated code in few cases but it significantly reduces compile-time.

After this patch, our internal test that was most affected by the regression
dropped from ~2 million revisits to ~4k resulting in Reassociate having 0.46%
of the runtime it had before (99.54% improvement).

Here's the summaries reported by lnt for the LLVM test-suite with --benchmarking-only:
| metric         | geomean before patch | geomean after patch | delta   |
| -----          | -----                | -----               | -----   |
| compile time   | 0.1956               | 0.1261              | -35.54% |
| execution time | 0.3240               | 0.3237              | -       |
| code size      | 7365.4459            | 7365.6079           | -       |

The results have a few wins and losses on compile-time, mostly in the +/- 2.5% range. There was one outlier though:
| Performance Regressions - compile_time | Δ | Previous | Current |
| MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk | 9.82% | 2.0473 | 2.2483 |

Reviewers: javed.absar, dberlin

Reviewed By: dberlin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45734

llvm-svn: 331381
2018-05-02 17:59:16 +00:00
Simon Pilgrim
9487f9759e [X86] Cleanup WriteFShuffle/WriteFVarShuffle (+256 variants) scheduler classes with more common default values
llvm-svn: 331380
2018-05-02 17:58:50 +00:00
Farhana Aleen
ea282222ac Revert "[AMDGPU] performAddCombine should run after DAG is legalized."
This reverts commit 6b97d2995566b4dddd6bf0d75579ff44501d4494.

llvm-svn: 331371
2018-05-02 16:48:52 +00:00
Farhana Aleen
e54c727c8e [AMDGPU] performAddCombine should run after DAG is legalized.
Summary: performAddCombine should run after DAG is legalized; Otherwise generic optimization
         in the DAGCombiner can optimize an addcarry+trunc into an addcarry instruction with
         illegal types.

Author: FarhanaAleen

Reviewed By: rampitec

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D46337

llvm-svn: 331368
2018-05-02 16:24:10 +00:00
Clement Courbet
bc7c9645b0 [X86] Fix scheduling info for (V?)SQRTPDm on silvermont.
https://reviews.llvm.org/D46356

llvm-svn: 331356
2018-05-02 13:46:14 +00:00
Sander de Smalen
04ee355bf3 [AArch64][SVE] Asm: Support for LDR/STR fill and spill instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D46270

llvm-svn: 331352
2018-05-02 13:32:39 +00:00
Simon Tatham
1754ebf170 [TableGen] Don't quote variable name when printing !foreach.
An input !foreach expression such as !foreach(a, lst, !add(a, 1))
would be re-emitted by llvm-tblgen -print-records with the first
argument in quotes, giving !foreach("a", lst, !add(a, 1)), which isn't
valid TableGen input syntax.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46352

llvm-svn: 331351
2018-05-02 13:17:26 +00:00
Sander de Smalen
617d4ec81a [AArch64][SVE] Asm: Support for scatter ST1 store instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46248

llvm-svn: 331349
2018-05-02 13:00:30 +00:00
Simon Dardis
7445041586 Revert "[mips] Correct the predicates of sign extension instructions"
I accidently committed this patch after asking for a review, but it has not
been reviewed yet.

This reverts r331346.

llvm-svn: 331348
2018-05-02 12:35:29 +00:00
Simon Dardis
e0cd1d12d8 [mips] Correct the predicates of sign extension instructions
And eliminate the duplication of those instructions for microMIPS32r6.

llvm-svn: 331346
2018-05-02 12:25:33 +00:00
Sander de Smalen
e969c96428 [AArch64][SVE] Asm: Support for non-temporal, contiguous LDNT1/STNT1 load/store instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D46269

llvm-svn: 331343
2018-05-02 11:48:49 +00:00
Simon Dardis
aeaacfef81 [mips] Correct the predicates for shifts.
Reviewers: smaksimovic, abeserminji, atanasyan

Differential Revision: https://reviews.llvm.org/D46123

llvm-svn: 331341
2018-05-02 09:55:49 +00:00
Simon Pilgrim
d0a9ff112b [X86] Cleanup WriteFAdd/WriteFCmp scheduler classes with more common default values
Intel models were targeting x87 instead of packed sse.

Also fixes XOP's VFRCZ to use WriteFAdd/WriteFAddY.

llvm-svn: 331340
2018-05-02 09:18:49 +00:00
Sander de Smalen
86aaf3b716 [AArch64][SVE] Asm: Support for LD1RQ load-and-replicate quad-word vector instructions.
Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D46250

llvm-svn: 331339
2018-05-02 08:49:08 +00:00
Piotr Padlewski
e2eb851fbe Mark invariant.group.barrier as inaccessiblememonly
It turned out that readonly argmemonly is not enough.

  store 42, %p
  %b = barrier(%p)
  store 43, %b
the first store is dead, but because barrier was marked as
reading argument memory, it was considered alive. With
inaccessiblememonly it doesn't read the argument, but
it also can't be CSEd.

based on: https://reviews.llvm.org/D32006

llvm-svn: 331338
2018-05-02 08:22:07 +00:00
Bjorn Pettersson
fec32a90f5 [SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)
Summary:
This is a follow up to rL331182. A PHI node can be split up into
several MIR PHI nodes when being selected. When there is a
dbg.value intrinsic that uses the result of such a PHI node we
need to select several DBG_VALUE instructions, with fragment
expressions, in order to do a correct selection.

Reviewers: rnk, aprantl, vsk

Reviewed By: vsk

Subscribers: mattd, llvm-commits, JDevlieghere, aprantl, gbedwell, rnk

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D46329

llvm-svn: 331337
2018-05-02 06:56:38 +00:00
Farhana Aleen
e7719ff6ee [AMDGPU] Support horizontal vectorization.
Author: FarhanaAleen

Reviewed By: rampitec, arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D46213

llvm-svn: 331313
2018-05-01 21:41:12 +00:00
Sanjay Patel
e1a261a27e [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). And a test-suite run with statistics showed that this 
pattern only happened 2 times currently. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
Sanjay Patel
340bdee43a [AggressiveInstCombine] add more bitfield test patterns; NFC
Add another baseline for D45986 and a pattern that won't be
matched with that patch.

llvm-svn: 331309
2018-05-01 20:55:03 +00:00
Sanjay Patel
59d02dad02 [PhaseOrdering] add tests for bittest patterns from bitfields; NFC
As mentioned in D45986, there's a potential ordering dependency 
between instcombine and aggressive-instcombine for detecting these,
so I'm adding a few tests to confirm that the expected folds occur
using -O3 (because aggressive-instcombine only runs at -O3 currently).

llvm-svn: 331308
2018-05-01 20:53:44 +00:00
Eli Friedman
4d25e8bf01 [AArch64] Add more tests for 64-bit immediate lowering.
This adds a some more tests, and adds some notes to tests which are using
a suboptimal lowering.

The constants with suboptimal lowerings seem to be relatively rare in
practice, but it might be a fun project to work on improvements.

llvm-svn: 331304
2018-05-01 20:00:14 +00:00
Vedant Kumar
4ecaa6ff5f [DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N)
The logic for this combine is almost identical to the logic for a
(sext (sextload x)) combine.

This commit factors out the logic so it can be shared by both combines,
and corrects the SDLoc assigned in the zext version of the combine.

Prior to this patch, for the given test case, we would apply the
location associated with the udiv instruction to instructions which
perform the load.

Part of: llvm.org/PR37262

llvm-svn: 331303
2018-05-01 19:51:15 +00:00