mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

180633 Commits

Author SHA1 Message Date
Andrea Di Biagio
6d80ed7fe4 [MCA] Slightly refactor the bottleneck analysis view. NFCI
This patch slightly refactors data structures internally used by the bottleneck
analysis to track data and resource dependencies.
This patch also updates methods used to print out information about dependency
edges when in debug mode.
This is the last of a sequence of commits done in preparation for an upcoming
patch that fixes PR37494. No functional change intended.

llvm-svn: 363677
2019-06-18 12:59:46 +00:00
Matt Arsenault
7aedd203b5 AMDGPU: Change API for checking for exec modification
Invert the name and return value to better reflect the imprecise
nature.

Force passing in the DefMI, since it's known in the 2 users and could
possibly fail for an arbitrary vreg.

Allow specifying a specific user instruction. Scan through use
instructions, instead of use operands. Add scan thresholds instead of
searching infinitely.

Stop using a set to track seen uses. I didn't understand this usage,
or why it would not check the last use. I don't think the use list has
any particular order.

llvm-svn: 363675
2019-06-18 12:48:36 +00:00
Fangrui Song
0976dce29c MCContext: Delete unused functions
llvm-svn: 363674
2019-06-18 12:30:06 +00:00
Nico Weber
106284874f gn build: Merge r363658
llvm-svn: 363673
2019-06-18 12:29:04 +00:00
Nico Weber
a55cae1e66 gn build: Merge r363649
This reverts commit "gn build: Merge r363626" because r363626
was reverted in r363649.

llvm-svn: 363672
2019-06-18 12:26:31 +00:00
Simon Pilgrim
b3e775ce41 [SelectionDAG] Legalize vaargs that require vector splitting
This adds vector splitting for vaarg instructions during type legalization

Committed on behalf of @luke (Luke Lau)

Differential Revision: https://reviews.llvm.org/D60762

llvm-svn: 363671
2019-06-18 12:24:02 +00:00
Matt Arsenault
1df190857f AMDGPU: Fold readlane from copy of SGPR or imm
These may be inserted to assert uniformity somewhere.

llvm-svn: 363670
2019-06-18 12:23:46 +00:00
Matt Arsenault
4cac027add AMDGPU: Remove unnecessary check for virtual register
The copy was found by searching the uses of a virtual register, so
it's already known to be virtual.

llvm-svn: 363669
2019-06-18 12:23:45 +00:00
Matt Arsenault
dfd05cd0bd AMDGPU: Fix iterator crash in AMDGPUPromoteAlloca
The lifetime intrinsic was erased, which was the next iterator.

llvm-svn: 363668
2019-06-18 12:23:44 +00:00
Matt Arsenault
e85e7e3719 AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.scale
llvm-svn: 363667
2019-06-18 12:23:42 +00:00
Sjoerd Meijer
beeadf0b5d [ARM] Some Thumb2ITBlock clean ups. NFC
Some more refactoring, like registering the IT Block pass, less cryptic
variable names, and some simplification of loops.

Differential Revision: https://reviews.llvm.org/D63419

llvm-svn: 363666
2019-06-18 12:13:11 +00:00
Jonas Paulsson
221b23dde2 [SystemZ] Fix AHIMuxK pseudo expansion.
Do not emit a copy if the source and destination registers are the same.

Review: Ulrich Weigand
llvm-svn: 363665
2019-06-18 12:10:02 +00:00
Valery Pykhtin
730e35ad92 [AMDGPU] Speed up live-in virtual register set computation in GCNScheduleDAGMILive.
Differential revision: https://reviews.llvm.org/D62401

llvm-svn: 363661
2019-06-18 11:43:17 +00:00
Graham Hunter
19a94b9e2a [SVE][IR] Scalable Vector IR Type with pr42210 fix
Recommit of D32530 with a few small changes:
  - Stopped recursively walking through aggregates in
    the verifier, so that we don't impose too much
    overhead on large modules under LTO (see PR42210).
  - Changed tests to match; the errors are slightly
    different since they only report the array or
    struct that actually contains a scalable vector,
    rather than all aggregates which contain one in
    a nested member.
  - Corrected an older comment

Reviewers: thakis, rengolin, sdesmalen

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D63321

llvm-svn: 363658
2019-06-18 10:11:56 +00:00
Simon Pilgrim
43740afb59 [X86] Regenerate promote.ll. NFC.
llvm-svn: 363657
2019-06-18 10:10:53 +00:00
Diogo N. Sampaio
66545ae9c9 [NFC] Improve triple match of scripts that update tests
Summary:
The prior behavior of the triple matcher would stop
at the first matched triple. It was not possible to
create specific matches for subsets of a triple
(e.g. aarch64-apple-darwin would never be used after
aarch64 was matched).

This patch:
1) Allows specialized triples to take priority,
considering that the string length of the triple
identifies how specialized a triple is. If two
triples of the same length match, the one matched first
prevails, preserving the old behavior.

2) Removes 20 duplicated triples of arm, thumb,
aarch64 options with same arguments, matching
the common prefix (aarch64, arm, thumb) of them.

3) Creates three new function matching regexes and
five triple options for arm64-apple-ios,
(arm|thumb)-apple-ios and thumb(v5)?-macho
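
The priority rule in item 1 can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the function name `pick_triple_option`, the `(prefix, value)` list layout, and the option values are all hypothetical, and matching by string prefix is an assumption.

```python
def pick_triple_option(triple, options):
    """Return the value of the most specialized prefix that matches `triple`.

    `options` is an ordered list of (prefix, value) pairs (hypothetical layout).
    A longer prefix counts as more specialized. Among matching prefixes of the
    same length, the one listed first wins, which mirrors the old behavior.
    """
    best = None
    for prefix, value in options:
        if not triple.startswith(prefix):
            continue
        # Strictly greater: an equal-length later match never replaces
        # the earlier one.
        if best is None or len(prefix) > len(best[0]):
            best = (prefix, value)
    return None if best is None else best[1]


options = [
    ("aarch64", "generic-aarch64"),
    ("aarch64-apple-darwin", "darwin-aarch64"),
    ("arm", "generic-arm"),
]

print(pick_triple_option("aarch64-apple-darwin18", options))
print(pick_triple_option("aarch64-linux-gnu", options))
```

With a first-match rule the darwin triple above would have been handled by the generic `aarch64` entry, because that entry is listed first.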

Reviewers: lebedev.ri, RKSimon, MaskRay, gbedwell

Reviewed By: MaskRay

Subscribers: javed.absar, kristof.beyls, llvm-commits, carwil

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63145

llvm-svn: 363656
2019-06-18 10:04:36 +00:00
Simon Pilgrim
40e9ca2ba9 [X86] Replace any_extend* vector extensions with zero_extend* equivalents
First step toward addressing the vector-reduce-mul-widen.ll regression in D63281 - we should replace ANY_EXTEND/ANY_EXTEND_VECTOR_INREG in X86ISelDAGToDAG to avoid having to add duplicate patterns when treating any extensions as legal.

In future patches this will also allow us to keep any extension nodes around a lot longer in the DAG, which should mean that we can keep better track of undef elements that otherwise become zeros that we think we have to keep.

Differential Revision: https://reviews.llvm.org/D63326

llvm-svn: 363655
2019-06-18 09:50:13 +00:00
Jeremy Morse
9e238559b7 [DebugInfo][Docs] Document that prologue/epilogue variable location changes are ignored
This patch documents that LLVM does not describe all changes in variable
locations during the prologue and the epilogue. The debugger doesn't /
shouldn't step through that portion of the function anyway, and describing
every location through such stages would bloat location lists.

Perform some minor cleanup at the same time:
 * Fix an enumerated list
 * Document that dbg.declare intrinsics have their variable location recorded
   in a MachineFunction table, not with DBG_VALUE meta-insts
 * Add frame-indexes to the list of things that can be operands to
   DBG_VALUEs.

Differential Revision: https://reviews.llvm.org/D63083

llvm-svn: 363654
2019-06-18 08:52:38 +00:00
Yevgeny Rouban
71f592d2bb [SimplifyCFG] NFC, prof branch_weights handling is simplified
Using the new SwitchInstProfUpdateWrapper this patch
simplifies 3 places of prof branch_weights handling.

Differential Revision: https://reviews.llvm.org/D62123

llvm-svn: 363652
2019-06-18 06:50:52 +00:00
Fangrui Song
1e6ebcabad [llvm-objdump] Tidy up AMDGCNPrettyPrinter
llvm-svn: 363650
2019-06-18 06:35:18 +00:00
Craig Topper
aeb8ad0216 [X86] Add i128 ctpop and i32/i64/i128 optsize test cases to popcnt.ll
Test cases for PR41151 and D59909.

llvm-svn: 363647
2019-06-18 04:52:49 +00:00
Craig Topper
eadc882e41 [X86] Move code that shrinks immediates for ((x << C1) op C2) into a helper function. NFCI
Preliminary step for D59909

llvm-svn: 363645
2019-06-18 04:23:58 +00:00
Craig Topper
ec570b779e [X86] Remove MOVDI2SSrm/MOV64toSDrm/MOVSS2DImr/MOVSDto64mr CodeGenOnly instructions.
The isel patterns for these use a bitcast and load/store, but
DAG combine should have canonicalized those away.

For the purposes of the memory folding table these opcodes can be
replaced by the MOVSSrm_alt/MOVSDrm_alt and MOVSSmr/MOVSDmr opcodes.

llvm-svn: 363644
2019-06-18 03:23:15 +00:00
Craig Topper
a26008def0 [X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class.
Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt.

Use the new versions in patterns that previously used a COPY_TO_REGCLASS
to VR128. These patterns expect the upper bits to be zero. The
current set up appears to work, but I'm not sure we should be
enforcing upper bits being zero through a COPY_TO_REGCLASS.

I wanted to flip the arrangement and use a COPY_TO_REGCLASS to
FR32/FR64 for the patterns that need an f32/f64 result, but that
complicated fastisel and globalisel.

I've been doing some experiments with reducing some isel patterns
and ended up in a situation where I had a
(SUBREG_TO_REG (COPY_TO_REGCLASS (VMOVSSrm), VR128)) and our
post-isel peephole was unable to avoid using an instruction for
the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128
instruction removes the COPY_TO_REGCLASS that was breaking this.

llvm-svn: 363643
2019-06-18 03:23:11 +00:00
Tom Stellard
de4d18699c GlobalISel: Remove redundant pass initialization
Summary:
All the GlobalISel passes are initialized when the target calls
initializeGlobalISel(), so we don't need to call the initializers
from the pass constructors.

Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar

Reviewed By: aemerson

Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63235

llvm-svn: 363642
2019-06-18 02:05:06 +00:00
Alex Brachet
8f102655c8 [llvm-strip] Error when using stdin twice
Summary: Implements bug 42204 (https://bugs.llvm.org/show_bug.cgi?id=42204). llvm-strip now warns when the same input file is used more than once, and errors when stdin is used more than once.
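
The warn-versus-error policy described above could look roughly like this. It is a sketch under stated assumptions, not the tool's implementation: `check_inputs` is a hypothetical helper, the message strings are invented, and the use of "-" to denote stdin is assumed.

```python
STDIN_NAME = "-"  # assumption: "-" is how stdin is spelled on the command line


def check_inputs(paths):
    """Return warnings for repeated input files; raise if stdin repeats.

    Hypothetical helper that only illustrates the policy. The message text
    is invented and is not the tool's real output.
    """
    seen = set()
    warnings = []
    for path in paths:
        if path in seen:
            if path == STDIN_NAME:
                # Stdin can only be consumed once, so a repeat is a hard error.
                raise ValueError("stdin was specified more than once")
            warnings.append(f"'{path}' was already specified")
        seen.add(path)
    return warnings


print(check_inputs(["a.o", "b.o", "a.o"]))
```

A repeated regular file only produces a warning because processing it twice is redundant but still possible, whereas stdin cannot be re-read.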

Reviewers: jhenderson, rupprecht, espindola, alexshap

Reviewed By: jhenderson, rupprecht

Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63122

llvm-svn: 363638
2019-06-18 00:39:10 +00:00
Matt Arsenault
6f840c2a43 GlobalISel: Use the original flags when lowering fneg to fsub
This was ignoring the flag on fneg, and using the source instruction's
flags. Also fixes tests missing from r358702.

Note the expansion itself isn't correct without nnan, but that should
be fixed separately.

llvm-svn: 363637
2019-06-17 23:48:43 +00:00
Peter Collingbourne
0104f81329 hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag.
This saves roughly 32 bytes of instructions per function with stack objects
and causes us to preserve enough information that we can recover the original
tags of all stack variables.
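
As a small illustration of the bit range named in the title, the half-open range [3..11) covers eight bits starting at bit 3. The helper name `base_stack_tag` is hypothetical, and the sketch only shows the bit arithmetic, not how the pass emits it.

```python
def base_stack_tag(ring_buffer_entry_addr):
    """Extract bits [3..11) of an address: bits 3 through 10, eight bits total."""
    return (ring_buffer_entry_addr >> 3) & 0xFF


# Bits 3..10 all set give the largest tag; bit 11 and bits 0..2 are ignored.
print(hex(base_stack_tag(0x7F8)))
print(hex(base_stack_tag(0x807)))
```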

Now that stack tags are deterministic, we no longer need to pass
-hwasan-generate-tags-with-calls during check-hwasan. This also means that
the new stack tag generation mechanism is exercised by check-hwasan.

Differential Revision: https://reviews.llvm.org/D63360

llvm-svn: 363636
2019-06-17 23:39:51 +00:00
Peter Collingbourne
93adc422a8 hwasan: Add a tag_offset DWARF attribute to instrumented stack variables.
The goal is to improve hwasan's error reporting for stack use-after-return by
recording enough information to allow the specific variable that was accessed
to be identified based on the pointer's tag. Currently we record the PC and
lower bits of SP for each stack frame we create (which will eventually be
enough to derive the base tag used by the stack frame) but that's not enough
to determine the specific tag for each variable, which is the stack frame's
base tag XOR a value (the "tag offset") that is unique for each variable in
a function.
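
The relationship between the base tag, the per-variable tag offset, and the pointer tag can be sketched as below. The function names, the variable names in `offsets`, and the 8-bit tag width are assumptions made for illustration. This is not the reporting code itself.

```python
TAG_MASK = 0xFF  # assumption: tags are 8 bits wide


def variable_tag(base_tag, tag_offset):
    """Tag of one stack variable: the frame's base tag XOR its tag offset."""
    return (base_tag ^ tag_offset) & TAG_MASK


def variables_matching(pointer_tag, base_tag, tag_offsets):
    """Names of variables whose recorded tag offset explains `pointer_tag`.

    XOR is its own inverse, so the offset is recovered as
    pointer_tag XOR base_tag and then looked up among the recorded offsets.
    """
    wanted_offset = (pointer_tag ^ base_tag) & TAG_MASK
    return [name for name, off in tag_offsets.items() if off == wanted_offset]


offsets = {"buf": 0x01, "count": 0x02, "tmp": 0x03}  # hypothetical variables
base = 0xA5
ptr_tag = variable_tag(base, offsets["count"])
print(hex(ptr_tag), variables_matching(ptr_tag, base, offsets))
```

This is why both pieces of information are needed: the base tag alone identifies the frame, and the offset narrows it down to a variable.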

In IR, the tag offset is most naturally represented as part of a location
expression on the llvm.dbg.declare instruction. However, the presence of the
tag offset in the variable's actual location expression is likely to confuse
debuggers which won't know about tag offsets, and moreover the tag offset
is not required for a debugger to determine the location of the variable on
the stack, so at the DWARF level it is represented as an attribute so that
it will be ignored by debuggers that don't know about it.

Differential Revision: https://reviews.llvm.org/D63119

llvm-svn: 363635
2019-06-17 23:39:41 +00:00
Peter Collingbourne
5b2ec1b30d gn build: Merge r363626.
llvm-svn: 363634
2019-06-17 23:39:31 +00:00
Amara Emerson
2f870f9e0c [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block.
Inter-block localization is the same as what currently happens, except now it
only runs on the entry block because that's where the problematic constants with
long live ranges come from.

The second phase is a new intra-block localization phase which attempts to
re-sink the already localized instructions further right before one of the
multiple uses.

One additional change is to also localize G_GLOBAL_VALUE as they're constants
too. However, on some targets like arm64 it takes multiple instructions to
materialize the value, so some additional heuristics with a TTI hook have been
introduced to attempt to prevent code size regressions when localizing these.

Overall, these changes improve CTMark code size on arm64 by 1.2%.

Full code size results:

Program                                         baseline       new       diff
------------------------------------------------------------------------------
 test-suite...-typeset/consumer-typeset.test    1249984      1217216     -2.6%
 test-suite...:: CTMark/ClamAV/clamscan.test    1264928      1232152     -2.6%
 test-suite :: CTMark/SPASS/SPASS.test          1394092      1361316     -2.4%
 test-suite...Mark/mafft/pairlocalalign.test    731320       714928      -2.2%
 test-suite :: CTMark/lencod/lencod.test        1340592      1324200     -1.2%
 test-suite :: CTMark/kimwitu++/kc.test         3853512      3820420     -0.9%
 test-suite :: CTMark/Bullet/bullet.test        3406036      3389652     -0.5%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    8017000      8016992     -0.0%
 test-suite...TMark/7zip/7zip-benchmark.test    2856588      2856588      0.0%
 test-suite...:: CTMark/sqlite3/sqlite3.test    765704       765704       0.0%
 Geomean difference                                                      -1.2%
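
The geomean row can be reproduced from the baseline and new columns. The sketch below assumes the geomean is taken over the per-test size ratios (new / baseline), which is one common convention; the sizes are copied from the table, in row order.

```python
import math

# (baseline, new) code sizes, in the same order as the table rows.
sizes = [
    (1249984, 1217216),
    (1264928, 1232152),
    (1394092, 1361316),
    (731320, 714928),
    (1340592, 1324200),
    (3853512, 3820420),
    (3406036, 3389652),
    (8017000, 8016992),
    (2856588, 2856588),
    (765704, 765704),
]


def geomean_diff_percent(pairs):
    """Geometric mean of new/baseline ratios, as a percentage change."""
    log_sum = sum(math.log(new / baseline) for baseline, new in pairs)
    return (math.exp(log_sum / len(pairs)) - 1.0) * 100.0


print(f"{geomean_diff_percent(sizes):.1f}%")
```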

Differential Revision: https://reviews.llvm.org/D63303

llvm-svn: 363632
2019-06-17 23:20:29 +00:00
Michael Berg
c1dc4dff1a Propagate fmf in IRTranslate for fneg
Summary: This case is related to D63405 in that we need to be propagating FMF on negates.

Reviewers: volkan, spatel, arsenm

Reviewed By: arsenm

Subscribers: wdng, javed.absar

Differential Revision: https://reviews.llvm.org/D63458

llvm-svn: 363631
2019-06-17 23:19:40 +00:00
Craig Topper
85a3beda95 Use VR128X instead of FR32X/FR64X for the register class in VMOVSSZmrk/VMOVSDZmrk.
Removes COPY_TO_REGCLASS from some patterns.

llvm-svn: 363630
2019-06-17 23:08:29 +00:00
Craig Topper
dbfa74b0eb [X86] Make an assert in LowerSCALAR_TO_VECTOR stricter to make it clear what types are allowed here. NFC
Make it clear that only integer types with i32 or smaller elements should get to this part of the code.

llvm-svn: 363629
2019-06-17 23:08:09 +00:00
Stanislav Mekhanoshin
685065c365 [AMDGPU] Use custom inserter for gfx10 VOP2b
This is part of the approved D63204 pending parent revision.
This small change is in fact a part of the VOP2b legalization which
does not technically belong to wave32 support, so extracted
separately.

llvm-svn: 363625
2019-06-17 22:37:37 +00:00
Stanislav Mekhanoshin
da1ba915e4 [AMDGPU] gfx1010 subvector test. NFC.
llvm-svn: 363623
2019-06-17 21:55:06 +00:00
Volkan Keles
38c5d67fe9 [test][AArch64] Relax the check line for G_BRJT in legalizer-info-validation.mir
Replace the specific number with a pattern to relax the test.

llvm-svn: 363621
2019-06-17 21:25:25 +00:00
Philip Reames
5e7c28ccdb Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Richard Smith
c3e8a686b4 Add convenience utility for replacing a range within a container with a
different range, in preparation for use in Clang.

llvm-svn: 363617
2019-06-17 21:01:09 +00:00
Daniel Sanders
e694af37ff [globalisel] Fix iterator invalidation in the extload combines
Summary:
Change the way we deal with iterator invalidation in the extload combines as it
was still possible to neglect to visit a use. Even worse, it happened in the
in-tree test cases and the checks weren't good enough to detect it.

We now take a cheap copy of the use list before iterating over it. This
prevents iterator invalidation from occurring and has the nice side effect
of making the existing schedule-for-erase/schedule-for-insert mechanism
moot.
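
The snapshot approach can be illustrated with a plain Python list standing in for a use list. This is an analogy rather than the GlobalISel code: `visit_and_erase` is a hypothetical function, and a Python list silently skips elements where a C++ iterator would be invalidated.

```python
def visit_and_erase(uses, snapshot):
    """Visit each use and erase it from `uses`; return the uses visited.

    With snapshot=True the loop walks a cheap copy, so mutating `uses`
    cannot make the walk skip anything. With snapshot=False the loop walks
    the list being mutated and misses elements.
    """
    visited = []
    worklist = list(uses) if snapshot else uses
    for use in worklist:
        visited.append(use)
        uses.remove(use)
    return visited


print(visit_and_erase(["a", "b", "c", "d"], snapshot=False))
print(visit_and_erase(["a", "b", "c", "d"], snapshot=True))
```

Only the snapshot variant visits all four uses, which is the kind of neglected use the summary describes.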

Reviewers: aditya_nandakumar

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61813

llvm-svn: 363616
2019-06-17 20:56:31 +00:00
Stanislav Mekhanoshin
19b230399e [AMDGPU] Propagate function attributes thru bitcasts
AMDGPUPropagateAttributes will not work on function bitcasts,
so move AMDGPUFixFunctionBitcasts before it.

Differential Revision: https://reviews.llvm.org/D63455

llvm-svn: 363614
2019-06-17 20:42:48 +00:00
Philip Reames
21cd56f0ab Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different; see Nikita's comment for a fuller explanation, as he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363613
2019-06-17 20:32:22 +00:00
Peter Collingbourne
b1afe40af5 gn build: Merge r363483.
llvm-svn: 363610
2019-06-17 20:03:11 +00:00
Peter Collingbourne
54e38a1e56 gn build: Merge r363584.
llvm-svn: 363609
2019-06-17 19:59:16 +00:00
Nicolai Haehnle
b637ea1f84 AMDGPU/GFX10: Don't generate s_code_end padding in the asm-printer
Summary:
The purpose of the padding is to guard against stale code being
fetched into the instruction cache by the lowest level prefetching.
We're generating relocatable ELF here, and so the padding should
arguably be added by the linker. This is in fact what Mesa does.

This also fixes multi-part shaders for Mesa.

Change-Id: I6bfede58f20e9f337762ccf39ef9e0e263e69e82

Reviewers: arsenm, rampitec, t-tye

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63427

llvm-svn: 363602
2019-06-17 19:28:43 +00:00
Philip Reames
a9a0f0aedb Reduced test case for pr42279 in advance of the relevant re-commit + fix
llvm-svn: 363601
2019-06-17 19:27:45 +00:00
Nicolai Haehnle
ab6e036531 AMDGPU: Explicitly define a triple for some tests
Summary:
This is related to the changes to the groupstaticsize intrinsic in
D61494 which would otherwise make the related tests in these files
fail or be much less useful.

Note that for some reason, SOPK generation is less effective in the
amdhsa OS, which is why I chose PAL. I haven't investigated this
deeper.

Change-Id: I6bb99569338f7a433c28b4c9eb1e3e036b00d166

Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63392

llvm-svn: 363600
2019-06-17 19:25:57 +00:00
Joseph Tremoulet
a724f9ede8 [EarlyCSE] Fix hashing of self-compares
Summary:
Update compare normalization in SimpleValue hashing to break ties (when
the same value is being compared to itself) by switching to the swapped
predicate if it has a lower numerical value.  This brings the hashing in
line with isEqual, which already recognizes the self-compares with
swapped predicates as equal.
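
The tie-break can be sketched as follows. This is a simplified model, not the EarlyCSE code: operands are plain integers standing in for value identity, `normalize_compare` is a hypothetical name, and only the unsigned and equality predicates are listed.

```python
# Numeric predicate values; the tie-break only relies on their relative order.
PRED_VALUE = {"eq": 32, "ne": 33, "ugt": 34, "uge": 35, "ult": 36, "ule": 37}
SWAPPED = {"eq": "eq", "ne": "ne", "ugt": "ult", "uge": "ule",
           "ult": "ugt", "ule": "uge"}


def normalize_compare(pred, lhs, rhs):
    """Canonical (pred, lhs, rhs) form used as the hash key in this sketch.

    Operands are ordered first. For a self-compare (lhs == rhs) the operand
    order cannot break the tie, so the predicate with the lower numeric
    value is chosen instead.
    """
    swapped = SWAPPED[pred]
    if lhs > rhs or (lhs == rhs and PRED_VALUE[swapped] < PRED_VALUE[pred]):
        lhs, rhs, pred = rhs, lhs, swapped
    return (pred, lhs, rhs)


# "x ugt x" and "x ult x" are the same compare with operands swapped,
# so they need to normalize (and therefore hash) identically.
print(normalize_compare("ugt", 7, 7), normalize_compare("ult", 7, 7))
```

Without the self-compare clause the two forms would keep different predicates and hash differently, even though an equality check that accounts for swapped predicates treats them as equal.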

Fixes PR 42280.

Reviewers: spatel, efriedma, nikic, fhahn, uabelho

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63349

llvm-svn: 363598
2019-06-17 19:11:28 +00:00
Alina Sbirlea
0d9e90adfb [MemorySSA] Don't use template when the clone is a simplified instruction.
Summary:
LoopRotate doesn't create a faithful clone of an instruction, it may
simplify it beforehand. Hence the clone of an instruction that has a
MemoryDef associated may not be a definition, but a use or not a memory
altering instruction.
Don't rely on the template when the clone may be simplified.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63355

llvm-svn: 363597
2019-06-17 18:58:40 +00:00
Jessica Paquette
f633cdef41 [GlobalISel][AArch64] Fold G_SUB into G_ICMP when it's safe to do so
Basically porting over the behaviour in AArch64ISelLowering to GISel. See
emitComparison for reference.

When we have something like this:

```
  lhs = G_SUB 0, y
  ...
  G_ICMP lhs, rhs
```

We can fold away the G_SUB and produce a cmn instead, given that we produce
the same value in NZCV.
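
As a rough sanity check of the idea, the sketch below models only the Z (zero) flag on 8-bit values and confirms that comparing `0 - y` against `rhs` sets Z exactly when adding `y` and `rhs` does. The bit width and helper names are assumptions, and the sketch says nothing about the N, C, or V flags or about which predicates the patch treats as safe.

```python
BITS = 8  # assumption: a small width so every input can be checked
MASK = (1 << BITS) - 1


def z_after_cmp(a, b):
    """Z flag of a subtract-and-set-flags: set when a - b wraps to zero."""
    return ((a - b) & MASK) == 0


def z_after_cmn(a, b):
    """Z flag of an add-and-set-flags: set when a + b wraps to zero."""
    return ((a + b) & MASK) == 0


def z_flag_agrees_everywhere():
    """Check every (y, rhs) pair: cmp (0 - y), rhs versus cmn y, rhs."""
    for y in range(1 << BITS):
        lhs = (0 - y) & MASK  # models lhs = G_SUB 0, y
        for rhs in range(1 << BITS):
            if z_after_cmp(lhs, rhs) != z_after_cmn(y, rhs):
                return False
    return True


print(z_flag_agrees_everywhere())
```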

Add a test showing that the transformation works, and also showing that we
don't perform the transformation when it's unsafe.

Also factor out the CSet emission into emitCSetForICMP.

Differential Revision: https://reviews.llvm.org/D63163

llvm-svn: 363596
2019-06-17 18:40:06 +00:00