1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

752 Commits

Author SHA1 Message Date
Simon Pilgrim
052e3ea653 GVN.cpp - remove unused <vector> include. NFCI. 2021-06-13 14:06:32 +01:00
Daniil Fukalov
ce7972786d [NFC] Fix 'Load' name masking.
Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103456
2021-06-02 11:09:53 +03:00
Arthur Eubanks
7a1762f190 [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved, they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Adam Nemet
c2f9888fca [GVN] Improve analysis for missed optimization remark
This change tries to handle multiple dominating users of the pointer operand
by choosing the most immediately dominating one, if possible.  While making
this change I also found that the previous implementation had a missing break
statement, making all loads with an odd number of dominating users emit an
OtherAccess value, so that has also been fixed.

Patch by Henrik G Olsson!

Differential Revision: https://reviews.llvm.org/D79097
2021-05-17 21:51:15 -07:00
dfukalov
3f7e516e28 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-05-14 11:17:14 +03:00
Jordan Rupprecht
f19959f1c4 Revert "[GVN] Clobber partially aliased loads."
This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb.

It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543:

```
$ cat repro.ll
; ModuleID = 'repro.ll'
source_filename = "repro.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.widget = type { i32 }
%struct.baz = type { i32, %struct.snork }
%struct.snork = type { %struct.spam }
%struct.spam = type { i32, i32 }

@global = external local_unnamed_addr global %struct.widget, align 4
@global.1 = external local_unnamed_addr global i8, align 1
@global.2 = external local_unnamed_addr global i32, align 4

define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 {
bb:
  %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1
  %tmp1 = bitcast %struct.snork* %tmp to i64*
  %tmp2 = load i64, i64* %tmp1, align 4
  %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1
  %tmp4 = icmp ugt i64 %tmp2, 4294967295
  br label %bb5

bb5:                                              ; preds = %bb14, %bb
  %tmp6 = load i32, i32* %tmp3, align 4
  %tmp7 = icmp ne i32 %tmp6, 0
  %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false
  %tmp9 = zext i1 %tmp8 to i8
  store i8 %tmp9, i8* @global.1, align 1
  %tmp10 = load i32, i32* @global.2, align 4
  switch i32 %tmp10, label %bb11 [
    i32 1, label %bb12
    i32 2, label %bb12
  ]

bb11:                                             ; preds = %bb5
  br label %bb14

bb12:                                             ; preds = %bb5, %bb5
  %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4
  br label %bb14

bb14:                                             ; preds = %bb12, %bb11
  br label %bb5
}
$ opt -O2 repro.ll -disable-output
opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value *llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst *, unsigned int, llvm::Type *, llvm::Instruction *, const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output
...
```
2021-05-11 16:08:53 -07:00
dfukalov
4398288fa7 [GVN] Clobber partially aliased loads.
Use offsets stored in `AliasResult` implemented in D98718.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95543
2021-04-24 14:14:20 +03:00
Max Kazantsev
9fbf6639f4 [GVN] Introduce loop load PRE
This patch allows PRE of the following type of loads:

```
preheader:
  br label %loop

loop:
  br i1 ..., label %merge, label %clobber

clobber:
  call foo() // Clobbers %p
  br label %merge

merge:
  ...
  br i1 ..., label %loop, label %exit

```

Into
```
preheader:
  %x0 = load %p
  br label %loop

loop:
  %x.pre = phi(x0, x2)
  br i1 ..., label %merge, label %clobber

clobber:
  call foo() // Clobbers %p
  %x1 = load %p
  br label %merge

merge:
  x2 = phi(x.pre, x1)
  ...
  br i1 ..., label %loop, label %exit

```

So instead of loading from %p on every iteration, we load only when the actual clobber happens.
The typical pattern which it is trying to address is: hot loop, with all code inlined and
provably having no side effects, and some side-effecting calls on cold path.

The worst overhead from it is, if we always take clobber block, we make 1 more load
overall (in preheader). It only matters if loop has very few iteration. If clobber block is not taken
at least once, the transform is neutral or profitable.

There are several improvements prospect open up:
- We can sometimes be smarter in loop-exiting blocks via split of critical edges;
- If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that
  we don't know if their sum is colder than the header.

Differential Revision: https://reviews.llvm.org/D99926
Reviewed By: reames
2021-04-22 12:50:38 +07:00
Max Kazantsev
0d71c82fc6 [NFC] Move statictic increment out of helper 2021-04-09 16:32:35 +07:00
Max Kazantsev
33c6e92932 [GVN][NFC] Factor out load elimination logic via PRE for reuse 2021-04-09 16:12:25 +07:00
Arthur Eubanks
1debc51397 [GVN] Properly invalidate ICF cache when we simplify a value
This fixes a "Cached first special instruction is wrong!" assert.

The assert fires because replacing a value with another can cause an
instruction to no longer be "special" to ICF. In this case,
devirtualization happened, turning an indirect call to a
call to a willreturn function which is no longer special.

Reviewed By: nikic, rnk

Differential Revision: https://reviews.llvm.org/D99977
2021-04-08 14:01:57 -07:00
Philip Reames
631f83b6d7 Plumb AssumeInst through operand bundle apis [nfc]
Follow up to a6d2a8d6f5.  This covers all the public interfaces of the bundle related code.  I tried to cleanup the internals where the changes were obvious, but there's definitely more room for improvement.
2021-04-06 12:53:53 -07:00
Arthur Eubanks
abf4528faf [GVN] Add missing ICF update
performScalarPREInsertion() inserts instructions into blocks that we
need to tell ImplicitControlFlowTracking about, otherwise the ICF cache
may be invalid.

Fixes PR49193.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D99909
2021-04-06 10:13:42 -07:00
Philip Reames
5c891f93c3 Move GCRelocateInst and GCResultInst to IntrinsicInst.h [nfc]
These two are part of the IntrinsicInst class hierarchy and it helps to cut down on some redundant includes.
2021-04-06 08:33:15 -07:00
Philip Reames
ccaf7c4e15 Remove last remnants of PR49607 migration [NFC]
The key change (4f5e92c) to switch gc.result and gc.relocate to being readnone landed nearly two weeks ago, and we haven't seen any fallout.  Time to remove the code added to make reverting easy.
2021-04-06 07:56:55 -07:00
Max Kazantsev
2231fbd3ed [NFC] Undo some erroneous renamings
Some vars renamed by mistake during auto-replacements. Undoing them.
2021-04-01 13:10:10 +07:00
Max Kazantsev
467293274d [NFC] Disambiguate LI in GVN
Name GVN uses name 'LI' for two different unrelated things:
LoadInst and LoopInfo. This patch relates the variables with
former meaning into 'Load' to disambiguate the code.
2021-04-01 12:40:35 +07:00
KAWASHIMA Takahiro
dd33b4b9e5 [GVN] Propagate llvm.access.group metadata of loads
Before this change, the `llvm.access.group` metadata was dropped
when moving a load instruction in GVN. This prevents vectorizing
a C/C++ loop with `#pragma clang loop vectorize(assume_safety)`.
This change propagates the metadata as well as other metadata if
it is safe (the move-destination basic block and source basic
block belong to the same loop).

Differential Revision: https://reviews.llvm.org/D93503
2021-04-01 10:00:48 +09:00
Philip Reames
3be79b1237 [gvn] CSE gc.relocates based on meaning, not spelling (try 2)
This was (partially) reverted in cfe8f8e0 because the conversion from readonly to readnone in Intrinsics.td exposed a couple of problems.  This change has been reworked to not need that change (via some explicit checks in client code).  This is being done to address the original optimization issue and simplify the testing of the readonly changes.  I'm working on that piece under 49607.

Original commit message follows:

The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974
2021-03-16 10:59:31 -07:00
Nikita Popov
2cc01b071e [GVN] Don't explicitly materialize undefs from dead blocks
When materializing an available load value, do not explicitly
materialize the undef values from dead blocks. Doing so will
will force creation of a phi with an undef operand, even if there
is a dominating definition. The phi will be folded away on
subsequent GVN iterations, but by then we may have already
poisoned MDA cache slots.

Simply don't register these values in the first place, and let
SSAUpdater do its thing.
2021-03-06 23:46:24 +01:00
Philip Reames
83f044db4a [gvn] Handle simply phi equivalence cases
GVN basically doesn't handle phi nodes at all. This is for a reason - we can't value number their inputs since the predecessor blocks have probably not been visited yet.

However, it also creates a significant pass ordering problem. As it stands, instcombine and simplifycfg ends up implementing CSE of phi nodes. This means that for any series of CSE opportunities intermixed with phi nodes, we end up having to alternate instcombine/simplifycfg and gvn to make progress.

This patch handles the simplest case by simply preprocessing the phi instructions in a block, and CSEing them if they are syntactically identical. This turns out to be powerful enough to handle many cases in a single invocation of GVN since blocks which use the cse'd phi results are visited after the block containing the phi. If there's a CSE opportunity in one the phi predecessors required to recognize the phi CSE opportunity, that will require a second iteration on the function. (Still within a single run of gvn though.)

Compile time wise, this could go either way. On one hand, we're potentially causing GVN to iterate over the function more. On the other, we're cutting down on iterations between two passes and potentially shrinking the IR aggressively. So, a bit unclear what to expect.

Note that this does still rely on instcombine to canonicalize block order of the phis, but that's a one time transformation independent of the values incoming to the phi.

Differential Revision: https://reviews.llvm.org/D98080
2021-03-06 09:31:12 -08:00
Philip Reames
551c0ac82d [gvn] CSE gc.relocates based on meaning, not spelling
The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974

(Note: Part of the reviewed change was split and landed as f352463a)
2021-03-05 10:16:12 -08:00
Kazu Hirata
1e91412444 [Scalar] Use range-based for loops (NFC) 2021-02-25 19:54:38 -08:00
ksyx
a9e0e1dc75 [GVN] Fix a typo in comment
NFC.

Differential Revision: https://reviews.llvm.org/D97200

Reviewed By: fhahn
2021-02-23 10:39:34 +08:00
Kazu Hirata
d872c6e963 [Transforms] Use range-based for loops (NFC) 2021-02-08 22:33:53 -08:00
Kazu Hirata
9912faec1f [Transforms/Scalar] Use range-based for loops (NFC) 2021-02-04 21:18:05 -08:00
Nick Desaulniers
87bfbbe74e [GVN] do not repeat PRE on failure to split critical edge
Fixes an infinite loop encountered in GVN.

GVN will delay PRE if it encounters critical edges, attempt to split
them later via calls to SplitCriticalEdge(), then restart.

The caller of GVN::splitCriticalEdges() assumed a return value of true
meant that critical edges were split, that the IR had changed, and that
PRE should be re-attempted, upon which we loop infinitely.

This was exposed after D88438, by compiling the Linux kernel for s390,
but the test case is reproducible on x86.

Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D94996
2021-01-25 11:23:44 -08:00
Kazu Hirata
bc78fac755 [Transforms] Use llvm::append_range (NFC) 2021-01-20 21:35:54 -08:00
Kazu Hirata
beb2026c3d [llvm] Call *(Set|Map)::erase directly (NFC)
We can erase an item in a set or map without checking its membership
first.
2021-01-03 09:57:47 -08:00
Kazu Hirata
e162a6c78e [Scalar] Construct SmallVector with iterator ranges (NFC) 2020-12-28 19:55:18 -08:00
Florian Hahn
a50a21617e [GVN] Correctly set modified status when doing PRE on indices.
This patch updates GVN to correctly return the modified status, if PRE
is performed on indices. It fixes a crash when building the test-suite
with EXPENSIVE_CHECKS and LTO.
2020-12-27 21:58:31 +00:00
Juneyoung Lee
5e880dcc6f [GVN] Use m_LogicalAnd/Or to propagate equality from branch conditions
This patch makes GVN recognize `select c1, c2, false` as well as `select c1, true, c2`
branch condition and propagate equality from these.

See llvm.org/pr48353, D93065

Differential Revision: https://reviews.llvm.org/D93841
2020-12-28 05:28:38 +09:00
Kazu Hirata
0fce5c3fd1 [Transforms] Use llvm::is_contained (NFC) 2020-11-18 20:42:22 -08:00
Florian Hahn
0fbc387566 [GVN] Fix MemorySSA update when replacing assume(false) with stores.
When replacing an assume(false) with a store, we have to be more careful
with the order we insert the new access. This patch updates the code to
look at the accesses in the block to find a suitable insertion point.

Alterantively we could check the defining access of the assume, but IIRC
there has been some discussion about making assume() readnone, so
looking at the access list might be more future proof.

Fixes PR48072.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90784
2020-11-05 12:09:32 +00:00
Jameson Nash
b65f8f86ef [GVN] small improvements to comments 2020-11-03 13:21:48 -05:00
Michael Liao
15bc14c9fa [gvn] PRE needs to skip convergent intrinsics/calls.
- As convergent intrinsics/calls could only be moved to
  control-equivalent blocks, or more precisely the same divergent
  branch, PRE needs to skip them.

Differential Revision: https://reviews.llvm.org/D90391
2020-10-30 11:24:40 -04:00
Serguei Katkov
fd4f36fcb9 [GVN LoadPRE] Add an option to disable splitting backedge
GVN Load PRE can split the backedge causing breaking the loop structure where the latch
contains the conditional branch with for example induction variable.

Different optimizations expect this form of the loop, so it is better to preserve it for some time.
This CL adds an option to control an ability to split backedge.

Default value is true so technically it is NFC and current behavior is not changed.

Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn
Reviewed By: mkazasntsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89854
2020-10-27 11:59:52 +07:00
Serguei Katkov
83ae2d53d6 [GVN LoadPRE] Extend the scope of optimization by using context to prove safety of speculation
Use context to prove that load can be safely executed at a point where load is being hoisted.

Postpone the decision about safety of speculative load execution till the moment we know
where we hoist load and check safety at that context.

Reviewers: nikic, fhahn, mkazantsev, lebedev.ri, efriedma, reames
Reviewed By: reames, mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D88725
2020-10-06 09:25:16 +07:00
Vedant Kumar
20166f03cc [Instruction] Add dropLocation and updateLocationAfterHoist helpers
Introduce a helper which can be used to update the debug location of an
Instruction after the instruction is hoisted. This can be used to safely
drop a source location as recommended by the docs.

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-09-24 15:00:04 -07:00
Nikita Popov
868b57b4b2 [GVN] Use that assume(!X) implies X==false (PR47496)
We already use that assume(X) implies X==true, do the same for
assume(!X) implying X==false. This fixes PR47496.
2020-09-17 21:34:44 +02:00
Krzysztof Parzyszek
3d10a53bba [GVN] Account for masked loads/stores depending on load/store instructions
This is a case where an intrinsic depends on a non-call instruction.

Differential Revision: https://reviews.llvm.org/D87423
2020-09-10 10:57:33 -05:00
Florian Hahn
1aaacc4182 [EarlyCSE] Explicitly require AAResultsWrapperPass.
The MemorySSAWrapperPass depends on AAResultsWrapperPass and if
MemorySSA is preserved but AAResultsWrapperPass is not, this could lead
to a crash when updating the last user of the MemorySSAWrapperPass.

Alternatively AAResultsWrapperPass could be marked preserved by GVN, but
I am not sure if that would be safe. I am not sure what is required in
order to preserve AAResultsWrapperPass. At the moment, it seems like a
couple of passes that do similar transforms to GVN are preserving it.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87137
2020-09-09 09:14:50 +01:00
Sanjay Patel
80dc8a6aaa [IR][GVN] add/allow commutative intrinsics with >2 args
Follow-up to D86798 and rGe25449f.
2020-09-03 10:14:53 -04:00
Florian Hahn
dc272f692f [GVN] Preserve MemorySSA if it is available.
Preserve MemorySSA if it is available before running GVN.

DSE with MemorySSA will run closely after GVN. If GVN and 2 other
passes preserve MemorySSA, DSE can re-use MemorySSA used by LICM
when doing LTO.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86534
2020-09-03 12:28:13 +01:00
Sanjay Patel
251968146e [IR][GVN] allow intrinsics in Instruction's isCommutative query (2nd try)
The 1st try was reverted because I missed an assert that
needed softening.

As discussed in D86798 / rG09652721 , we were potentially
returning a different result for whether an Instruction
is commutable depending on if we call the base class or
derived class method.

This requires relaxing asserts in GVN, but that pass
seems to be working otherwise.

NewGVN requires more work because it uses different
code paths for numbering binops and calls.
2020-08-31 16:01:19 -04:00
Sanjay Patel
56bc7f03f4 Revert "[IR][GVN] allow intrinsics in Instruction's isCommutative query"
This reverts commit 25597f7783e7038b8a2ee88bb49ac605b211b564.
It is causing crashing on bots such as:
http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10523/steps/ninja-build/logs/stdio
2020-08-30 17:02:51 -04:00
Sanjay Patel
d58c2f282d [IR][GVN] allow intrinsics in Instruction's isCommutative query
As discussed in D86798 / rG09652721 , we were potentially
returning a different result for whether an Instruction
is commutable depending on if we call the base class or
derived class method.

This requires relaxing an assert in GVN, but that pass
seems to be working otherwise.

NewGVN requires more work because it uses different
code paths for numbering binops and calls.
2020-08-30 16:49:22 -04:00
Vedant Kumar
60a6ae1b59 Revert "[Instruction] Add updateLocationAfterHoist helper"
This reverts commit 4a646ca9e2caf70d6312714770f516fb83b7e3cb.

This is causing some bots to fail with "!dbg attachment points at wrong
subprogram for function", like:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/67958/steps/stage%201%20check/logs/stdio
2020-08-11 14:54:09 -07:00
Vedant Kumar
62d3804379 [Instruction] Add updateLocationAfterHoist helper
Introduce a helper on Instruction which can be used to update the debug
location after hoisting.

Use this in GVN and LICM, where we were mistakenly introducing new line
0 locations after hoisting (the docs recommend dropping the location in
this case).

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-08-11 14:05:20 -07:00
Jay Foad
e9385fa488 [NFC][GVN] Fix "avaliable" typos
Differential Revision: https://reviews.llvm.org/D85520
2020-08-07 14:22:24 +01:00