1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

187487 Commits

Author SHA1 Message Date
Alexander Shaposhnikov
457557bce5 Revert "Introduce llvm-install-name-tool"
This reverts commit b5913e6d2f6d13fb753df701619731ca11936316.
2019-11-06 17:04:04 -08:00
Matt Arsenault
57e4983b8d AMDGPU: Select global atomicrmw fadd
This only works if there is no use of the return value.
2019-11-06 16:06:38 -08:00
Matt Arsenault
bb02da34a0 TableGen: Remove assert that pattern results match input number
AMDGPU has some atomic instructions that do not return the previous
result, and can only be selected if there are no uses. The source
pattern will only match if the use is empty, so it should be safe to
discard the result.
2019-11-06 16:06:37 -08:00
Eric Christopher
c052667ccc Temporarily Revert:
"[SLP] Generalization of stores vectorization."
 "[SLP] Fix -Wunused-variable. NFC"
 "[SLP] Vectorize jumbled stores."

As they're causing significant (10-30x) compile time regressions on
vectorizable code.

The primary cause of the compile-time regression is f228b5371647f471853c5fb3e6719823a42fe451.

This reverts commits:

f228b5371647f471853c5fb3e6719823a42fe451
5503455ccb3f5fcedced158332c016c8d3a7fa81
21d498c9c0f32dcab5bc89ac593aa813b533b43a
2019-11-06 16:06:15 -08:00
Stanislav Mekhanoshin
5d1e88c9e0 [AMDGPU] Add handling of 160 bit registers in analyzeResourceUsage
This was omitted. Also SReg_96Reg missed IsSGPR assignment.

Differential Revision: https://reviews.llvm.org/D69919
2019-11-06 15:47:32 -08:00
Philip Reames
98ed51c042 [LoopPred] Enable new transformation by default
The basic idea of the transform is to convert variant loop exit conditions into invariant exit conditions by changing the iteration on which the exit is taken when we know that the trip count is unobservable.  See the original patch which introduced the code for a more complete explanation.

The individual parts of this have been reviewed, the result has been fuzzed, and then further analyzed by hand, but despite all of that, I will not be suprised to see breakage here.  If you see problems, please don't hesitate to revert - though please do provide a test case.  The most likely class of issues are latent SCEV bugs and without a reduced test case, I'll be essentially stuck on reducing them.

(Note: A bunch of tests were opted out of the new transform to preserve coverage.  That landed in a previous commit to simplify revert cycles if they turn out to be needed.)
2019-11-06 15:41:57 -08:00
Philip Reames
23f506e253 [LoopPred] Selectively disable to preserve test cases
I'm about to enable the new loop predication transform by default.  It has the effect of completely destroying many read only loops - which happen to be a super common idiom in our test cases.  So as to preserve test coverage of other transforms, disable the new transform where it would cause sharp test coverage regressions.

(This is semantically part of the enabling commit.  It's committed separate to ease revert if the actual flag flip gets reverted.)
2019-11-06 15:41:57 -08:00
Nico Weber
2f60cbb6e9 gn build: (manually) merge b5913e6d2f 2019-11-06 18:26:56 -05:00
Eric Christopher
8e003eb6e9 When lowering calls and tail calls in AArch64, the register mask and
return value location depends on the calling convention of the callee.
`F.getCallingConv()`, however, is the caller CC. Correct it to the
callee CC from `CallLoweringInfo`.

Fixes PR43449

Patch by Shu-Chun Weng!
2019-11-06 15:25:10 -08:00
Lang Hames
bc7e42d8b9 [docs] Fix references to a renamed flag.
The -use-mcjit option was replaced with -jit-kind=mcjit a while back. This patch
updates the docs to reflect that.

Patch by Yu Jian. Thanks Jian!
2019-11-06 14:42:57 -08:00
Roman Lebedev
17f631ed23 [ConstantRange] Add subWithNoWrap() method
Summary:
Much like D67339, adds ConstantRange handling for
when we know no-wrap behavior of the `sub`.

Unlike addWithNoWrap(), we only get lucky re returning empty set
for signed wrap. For unsigned, we must perform overflow check manually.

A patch that makes use of this in LVI (CVP) to be posted later.

Reviewers: nikic, shchenz, efriedma

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69918
2019-11-07 01:30:53 +03:00
Roman Lebedev
e4ece27aeb [ConstantRange] Cleanup addWithNoWrap() by just piggybacking on sadd_sat()/uadd_sat()
As discussed in https://reviews.llvm.org/D69918
that happens to work as intended, and returns empty set if
there is always an overflow because we get lucky with intersection.
Since there's now an explicit test for that, let's prefer cleaner code.
2019-11-07 01:30:53 +03:00
Roman Lebedev
6c155b3a15 [ConstantRange] TestAddWithNo*WrapExhaustive: check that all overflow means empty set
As disscussed in https://reviews.llvm.org/D69918 / https://reviews.llvm.org/D67339
that is an implied postcondition, but it's not really fully tested.
2019-11-07 01:30:53 +03:00
Lang Hames
023dd287ea [JITLink] Refactor EH-frame handling to support eh-frames with existing relocs.
Some targets (E.g. MachO/arm64) use relocations to fix some CFI record fields
in the eh-frame section. When relocations are used the initial (pre-relocation)
content of the eh-frame section can no longer be interpreted by following the
eh-frame specification. This causes errors in the existing eh-frame parser.

This patch moves eh-frame handling into two LinkGraph passes that are run after
relocations have been parsed (but before they are applied). The first] pass
breaks up blocks in the eh-frame section into per-CFI-record blocks, and the
second parses blocks of (potentially multiple) CFI records and adds the
appropriate edges to any CFI fields that do not have existing relocations.
These passes can be run independently of one another. By handling eh-frame
splitting/fixing with LinkGraph passes we can both re-use existing relocations
for CFI record fields and avoid applying eh-frame fixups before parsing the
section (which would complicate the linker and require extra temporary
allocations of working memory).
2019-11-06 14:30:26 -08:00
Alexandre Ganea
0fe15c92e6 [Orc] Fix iterator usage after remove
Differential Revision: https://reviews.llvm.org/D69805
2019-11-06 17:17:27 -05:00
Kazu Hirata
05208cfb69 [JumpThreading] Factor out code to clone instructions (NFC)
Summary:
This patch factors out code to clone instructions -- partly for
readability and partly to facilitate an upcoming patch of my own.

Reviewers: wmi

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69861
2019-11-06 14:16:48 -08:00
Philip Reames
e77f587e1a [WC] Fix a subtle bug in our definition of widenable branch
We had a subtle, but nasty bug in our definition of a widenable branch, and thus in the transforms which used that utility. Specifically, we returned true for any branch which included a widenable condition within it's condition, regardless of whether that widenable condition also had other uses.

The problem is that the result of the WC() call is defined to be one particular value. As such, all users must agree as to what that value is. If we widen a branch without also updating *all other users* of the WC in the same way, we have broken the required semantics.

Most of the textual diff is updating existing transforms not to leave dead uses hanging around. They're largely NFC as the dead instructions would be immediately deleted by other passes. The reason to make these changes is so that the transforms preserve the widenable branch form.

In practice, we don't get bitten by this only because it isn't profitable to CSE WC() calls and the lowering pass from guards uses distinct WC calls per branch.

Differential Revision: https://reviews.llvm.org/D69916
2019-11-06 14:16:34 -08:00
Dávid Bolvanský
0ed7a94cab [Analysis] Attribute deref/deref_or_null should not prevent tail call optimization 2019-11-06 23:08:07 +01:00
Philip Reames
a4e77afd7b [LoopPred] Fix two subtle issues found by inspection
This patch fixes two issues noticed by inspection when going to enable the loop predication code in IndVarSimplify.

Issue 1 - Both the LoopPredication transform, and the already on by default optimizeLoopExits transform, modify the exit count of the exits they modify. (either to 0 or Infinity) Looking at the code more closely, this was not reflected into SCEV and we were instead running later transforms with incorrect SCEVs. Fixing this requires forgetting the loop, weakening a too strong assert, and updating SCEV to not pessimize results when a loop is provable untaken. I haven't been able to find a test case to demonstrate the miscompile.

Issue 2 - For modules without a data layout, we can end up with unsized pointer typed exit counts. Just bail out of this case.

I think these are the last two issues which need addressed before we enable this by default. The code has already survived a decent amount of fuzzing without revealing either of the above.

Differential Revision: https://reviews.llvm.org/D69695
2019-11-06 14:04:45 -08:00
Joel E. Denny
7fc52c9da6 [lit] Protect full test suite from FILECHECK_OPTS
lit's test suite calls lit multiple times for various sample test
suites.  `FILECHECK_OPTS` is safe for FileCheck calls in lit's test
suite.  It's not safe for FileCheck calls in the sample test suites,
whose output affects the results of lit's test suite.

Without this patch, only one such sample test suite is protected from
`FILECHECK_OPTS`, and currently `shtest-shell.py` breaks with
`FILECHECK_OPTS=-vv`.  Moreover, it's hard to predict the future,
especially false passes.  Thus, this patch protects all existing and
future sample test suites from `FILECHECK_OPTS` (and the deprecated
`FILECHECK_DUMP_INPUT_ON_FAILURE`).

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D65156
2019-11-06 16:25:25 -05:00
Craig Topper
17612b1710 [X86] Clamp large constant shift amounts for MMX shift intrinsics to 8-bits.
The MMX intrinsics for shift by immediate take a 32-bit shift
amount but the hardware for shifting by immediate only encodes
8-bits. For the intrinsic we don't require the shift amount to
fit in 8-bits in the frontend because we don't check that its an
immediate in the frontend. If its is not an immediate we move it
to an MMX register and use the shift by register.

But if it is an immediate we'll use the shift by immediate
instruction. But we need to change the shift amount to 8-bits.
We were previously doing this accidentally by masking it in the
encoder. But this can make a large shift amount into a small
in bounds shift amount. Instead we should clamp larger shift
amounts to 255 so that the they don't become in bounds.

Fixes PR43922
2019-11-06 13:03:18 -08:00
Eli Friedman
60c0901efb [AArch64] Re-add patterns for (s/u)mull2.
These patterns were added in D46009, but removed in D54276 due to
missing test coverage.

Differential Revision: https://reviews.llvm.org/D69831
2019-11-06 12:24:18 -08:00
Alexander Shaposhnikov
4e23992441 Introduce llvm-install-name-tool
This diff adds a new "driver" for llvm-objcopy
which is supposed to emulate the behavior of install-name-tool.

Differential revision: https://reviews.llvm.org/D69146

Test plan: make check-all
2019-11-06 11:50:58 -08:00
Steven Wu
b79544457f Fix a typo in my previous commit 2019-11-06 11:42:30 -08:00
David Tenty
377a7f74c0 [NFC] Add SUPPORT_PLUGINS to add_llvm_executable()
Summary:
this allows us to move logic about when it is appropriate set
LLVM_NO_DEAD_STRIP out of each tool and into add_llvm_executable,
which will enable future platform specific handling.

This is a follow on to the reverted D69356

Reviewers: hubert.reinterpretcast, beanz, lhames

Reviewed By: beanz

Subscribers: mgorny, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69638
2019-11-06 14:32:35 -05:00
Quentin Colombet
63eb1844d8 [GISel][ArtifactCombiner] Relax the constraint to combine unmerge with concat_vectors
The combine G_UNMERGE_VALUES with G_CONCAT_VECTORS used to only be performed
when the result type of the G_UNMERGE_VALUES was a vector type.
In other words, we were expecting that the G_UNMERGE_VALUES was effectively
the exact opposite of the G_CONCAT_VECTORS.

Lift that constraint by allowing any G_UNMERGE_VALUES to be combined
with any G_CONCAT_VECTORS (as long as the size of the different pieces
that we merge/unmerge match).

Differential Revision: https://reviews.llvm.org/D69288
2019-11-06 11:27:50 -08:00
Steven Wu
7a47cbea30 [Object][MachO] Rewrite macho-invalid-fat-arch-size into YAML
Summary:
Rewrite one of the invalid macho test input file with YAML file. The
original invalid macho is breaking our internal test infrastusture
because it is too broken to be copy around.

Need to relax an assertion in the YAML/MachoEmitter to allow yaml2obj to
write an invalid object like this.

rdar://problem/56879982

Reviewers: beanz, mtrent

Reviewed By: beanz

Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69856
2019-11-06 11:26:25 -08:00
Dávid Bolvanský
0d3ac93316 [X86TargetTransformInfo] Fixed warning: Expression 'ISD == ISD::UREM' is always true. NFCI. 2019-11-06 20:10:29 +01:00
Simon Pilgrim
7d4b7cf3b9 [X86] Fix SLM v2i64 ADD/Sub/CMPEQ instruction schedules
Noticed while fixing the reduction costs for D59710 - the SLM model doesn't account for the poor throughput of v2i64 ops.

Numbers taken from Intel AOM (+ checked against Agner)
2019-11-06 19:08:15 +00:00
Simon Pilgrim
02871604f5 [X86] Fix SLM v2f64 ADD/MUL + FP BLEND/HADD instruction schedules
Noticed while fixing the reduction costs for D59710 - the SLM model doesn't account for the poor throughput of v2f64/v2i64 ops.
2019-11-06 19:08:15 +00:00
Dávid Bolvanský
c3caeb5a37 [X86ISelLowering] Fixed typo in assert. NFCI. 2019-11-06 20:04:15 +01:00
Simon Pilgrim
09d86aa16b [CostModel][X86] Improve add vXi64 + fadd vXf64 reduction tests for SLM
As noted on D59710 we weren't handling the high costs of these operations on SLM.
2019-11-06 17:55:38 +00:00
Simon Pilgrim
eec64adc30 [CostModel][X86] Add add/fadd reduction tests for SLM 2019-11-06 17:04:22 +00:00
Simon Pilgrim
45791378ea CodeGenInstruction - fix uninitialized variable warnings. NFCI. 2019-11-06 17:04:21 +00:00
Simon Pilgrim
b3fa3af42f LoopAccessAnalysis - fix uninitialized variable warnings. NFCI. 2019-11-06 17:04:21 +00:00
Simon Pilgrim
512a23db84 BranchProbabilityInfo - fix uninitialized variable warning. NFCI. 2019-11-06 17:04:21 +00:00
Don Hinton
f1b6995b38 [CommandLine] Add inline ArgName printing
Summary:
This patch adds PrintArgInline (after PrintArg) that strips the
leading spaces from an argument before printing them, for usage
inline.

Related bug: PR42943 <https://bugs.llvm.org/show_bug.cgi?id=42943>

Patch by Daan Sprenkels!

Reviewers: jhenderson, chandlerc, hintonda

Reviewed By: jhenderson

Subscribers: hiraditya, kristina, llvm-commits, dsprenkels

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69501
2019-11-06 08:17:33 -08:00
Pavel Labath
d82cf6fb61 DWARFDebugLoclists: Move to a incremental parsing model
Summary:
This patch stems from the discussion D68270 (including some offline
talks). The idea is to provide an "incremental" api for parsing location
lists, which will avoid caching or materializing parsed data. An
additional goal is to provide a high level location list api, which
abstracts the differences between different encoding schemes, and can be
used by users which don't care about those (such as LLDB).

This patch implements the first part. It implements a call-back based
"visitLocationList" api. This function parses a single location list,
calling a user-specified callback for each entry. This is going to be
the base api, which other location list functions (right now, just the
dumping code) are going to be based on.

Future patches will do something similar for the v4 location lists, and
add a mechanism to translate raw entries into concrete address ranges.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69672
2019-11-06 16:25:06 +01:00
Miloš Stojanović
1ac6de2765 [NFC][APInt] Fix typos in comments.
Testing git commit access.
2019-11-06 16:01:58 +01:00
Sanjay Patel
d8c7faafe9 [x86] avoid crashing when splitting AVX stores with non-simple type (PR43916)
The store splitting transform was assuming a simple type (MVT),
but that's not necessarily the case as shown in the test.
2019-11-06 09:28:41 -05:00
Ilya Biryukov
9f476ae4bb [Support] fix mingw-w64 build
Older versions of Mingw-w64 do not define _beginthreadex_proc_type,
so we replace it with `unsigned (__stdcall *ThreadFunc)(void *)`.

Fixes https://github.com/clangd/clangd/issues/188

Patch by lh123!

Differential Revision: https://reviews.llvm.org/D69879
2019-11-06 15:18:58 +01:00
Simon Pilgrim
1556de2854 [X86] Fix uninitialized variable warnings. NFCI. 2019-11-06 14:02:43 +00:00
Simon Pilgrim
f134f16f11 X86FoldTablesEmitter - fix static analyzer potential invalid iterator warning. NFCI. 2019-11-06 13:31:00 +00:00
Simon Pilgrim
9ef7cc05e4 [X86] LowerAVXExtend - fix dodgy self-comparison assert.
PVS Studio noticed that we were asserting "VT.getVectorNumElements() == VT.getVectorNumElements()" instead of "VT.getVectorNumElements() == InVT.getVectorNumElements()".
2019-11-06 12:50:29 +00:00
Momchil Velikov
186e50f9f0 [AArch64] Move the branch relaxation pass after BTI insertion
Summary:
Inserting BTI instructions can push branch destinations out of range.

The branch relaxation pass itself cannot insert indirect branches since `TargetInstrInfo::insertIndirecrtBranch` is not implemented for AArch64 (guess +/-128 MB direct branch range is more than enough in practice).

Testing this is a bit tricky.

The original test case we have is 155kloc/6.1M. I've generated a test case using this program:
```

int main() {
  std::cout << R"src(int test();
void g0(), g1(), g2(), g3(), g4(), e();

void f(int v) {
  if ((test() & 2) == 0) {
  switch (v) {
  case 0:
    g0();
  case 1:
    g1();
  case 2:
    g2();
  case 3:
    g3();
  }
)src";

  const int N = 8176;

  for (int i = 0; i < N; ++i)
    std::cout << "    void h" << i << "();\n";
  for (int i = 0; i < N; ++i)
    std::cout << "    h" << i << "();\n";

  std::cout << R"src(
  } else {
    e();
  }
}
)src";
}
```
which is still a bit too much to commit as a regression test, IMHO.

Reviewers: t.p.northover, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69118

Change-Id: Ide5c922bcde08ff4cf635da5e52365525a997a0a
2019-11-06 12:46:50 +00:00
Simon Pilgrim
f32861caf3 [APInt] Fix implicit truncation warning in bitsToFloat(). NFCI. 2019-11-06 12:30:04 +00:00
Roman Lebedev
f8cfdaecea [LoopUnroll] countToEliminateCompares(): fix handling of [in]equality predicates (PR43840)
Summary:
I believe this bisects to https://reviews.llvm.org/D44983
(`[LoopUnroll] Only peel if a predicate becomes known in the loop body.`)

While that revision did contain tests that showed arguably-subpar peeling
for [in]equality predicates that [not] happen in the middle of the loop,
it also disabled peeling for the *first* loop iteration,
because latch would be canonicalized to [in]equality comparison..

That was intentional as per https://reviews.llvm.org/D44983#1059583.
I'm not 100% sure that i'm using correct checks here,
but this fix appears to be going in the right direction..

Let me know if i'm missing some checks here..

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=43840 | PR43840 ]].

Reviewers: fhahn, mkazantsev, efriedma

Reviewed By: fhahn

Subscribers: xbolva00, hiraditya, zzheng, llvm-commits, fhahn

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69617
2019-11-06 15:08:59 +03:00
Roman Lebedev
5cd2221e03 [NFC][LoopUnroll] Update test coverage for peeling w/ inequality predicates 2019-11-06 15:08:59 +03:00
dfukalov
3c6fc97f38 [AMDGPU] Improve code size cost model (part 2)
Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69629
2019-11-06 13:55:48 +03:00
Sjoerd Meijer
2a67a1de31 [TTI][LV] preferPredicateOverEpilogue
We have two ways to steer creating a predicated vector body over creating a
scalar epilogue. To force this, we have 1) a command line option and 2) a
pragma available. This adds a third: a target hook to TargetTransformInfo that
can be queried whether predication is preferred or not, which allows the
vectoriser to make the decision without forcing it.

While this change behaves as a non-functional change for now, it shows the
required TTI plumbing, usage of this new hook in the vectoriser, and the
beginning of an ARM MVE implementation. I will follow up on this with:
- a complete MVE implementation, see D69845.
- a patch to disable this, i.e. we should respect "vector_predicate(disable)"
  and its corresponding loophint.

Differential Revision: https://reviews.llvm.org/D69040
2019-11-06 10:14:20 +00:00