mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
llvm-mirror/lib/Analysis
Sanjoy Das b20d278ebd Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it, since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and then folding the
comparison to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.
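
The two-load example above can be sketched in C++ as follows.  This is
only an illustration: the names `shared_value`, `sink`, `poll_original`
and `poll_refined` are hypothetical and do not come from the patch or
from PR26774.  The functions are marked `inline` on the assumption that
this is a typical way for a function to end up with comdat linkage.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical globals, used only for this illustration.
std::atomic<int> shared_value{0};
int sink = 0;

// Variant A: the function as written.  Another thread may store to
// shared_value between the two loads, so the write to sink is reachable.
inline void poll_original() {
  int a = shared_value.load(std::memory_order_relaxed);
  int b = shared_value.load(std::memory_order_relaxed);
  if (a != b)
    sink = 1;  // a real write to memory
}

// Variant B: a refinement of variant A.  The two loads were merged, the
// comparison folded to "equal", and the store removed.  Seen in isolation,
// this body only reads memory.
inline void poll_refined() {
  (void)shared_value.load(std::memory_order_relaxed);
}

// With a single thread the two variants cannot be told apart.
inline void run_single_threaded() {
  poll_original();
  poll_refined();
  assert(sink == 0);
}
```

A caller compiled against variant B could be optimized as if the callee
never writes memory, yet the linker may keep variant A, whose store to
`sink` is reachable once a second thread writes `shared_value`.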

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change identifies a new set of linkage types through the predicate
`GlobalValue::mayBeDerefined`, which returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.
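
A minimal toy model of such a predicate and of an IPO-style bail-out is
sketched below.  It is not LLVM's code: the `Linkage` enum, the
`ToyFunction` struct, `canInferReadNone`, and the exact split of linkage
kinds are assumptions made for illustration.  The actual set of linkage
types is defined by the patch itself.

```cpp
// Toy model only; these are NOT LLVM's declarations.
enum class Linkage {
  External,
  AvailableExternally,
  LinkOnceAny,
  LinkOnceODR,
  WeakAny,
  WeakODR,
  Internal,
  Private
};

// Assumed classification: linkages where the definition that survives
// linking may be a differently optimized (or entirely different) body.
constexpr bool mayBeDerefined(Linkage L) {
  switch (L) {
  case Linkage::AvailableExternally:
  case Linkage::LinkOnceODR:
  case Linkage::WeakODR:
  case Linkage::LinkOnceAny:
  case Linkage::WeakAny:
    return true;
  case Linkage::External:
  case Linkage::Internal:
  case Linkage::Private:
    return false;
  }
  return false;
}

// Hypothetical summary of a function body as seen by one translation unit.
struct ToyFunction {
  Linkage linkage;
  bool bodyAccessesMemory;
};

// An inference in the style of FunctionAttrs: the visible body is trusted
// only when it is guaranteed to be the body that ends up in the program.
constexpr bool canInferReadNone(const ToyFunction &F) {
  return !mayBeDerefined(F.linkage) && !F.bodyAccessesMemory;
}
```

In this model a body with no visible memory access is still not marked
`readnone` when its linkage allows de-refinement, which is the bail-out
behavior described above.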

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

llvm-svn: 265762
2016-04-08 00:48:30 +00:00
AliasAnalysis.cpp NFC: make AtomicOrdering an enum class 2016-04-06 21:19:33 +00:00
AliasAnalysisEvaluator.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
AliasSetTracker.cpp NFC: make AtomicOrdering an enum class 2016-04-06 21:19:33 +00:00
Analysis.cpp [CG] Actually hoist up the generic CallGraphPrinter pass from a weird 2016-03-10 11:08:44 +00:00
AssumptionCache.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
BasicAliasAnalysis.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
BlockFrequencyInfo.cpp Add getBlockProfileCount method to BlockFrequencyInfo 2016-03-23 18:18:26 +00:00
BlockFrequencyInfoImpl.cpp Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. 2016-02-02 18:20:45 +00:00
BranchProbabilityInfo.cpp Const correctness for BranchProbabilityInfo (NFC) 2016-04-07 21:59:28 +00:00
CallGraph.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
CallGraphSCCPass.cpp Recommit r256952 "Filtering IR printing for print-after-all/print-before-all" 2016-01-06 22:55:03 +00:00
CallPrinter.cpp [CG] Rename the DOT printing pass to actually reference "DOT". 2016-03-10 11:04:40 +00:00
CaptureTracking.cpp [CaptureTracking] Support atomicrmw and cmpxchg 2016-02-18 19:23:27 +00:00
CFG.cpp Avoid overly large SmallPtrSet/SmallSet 2016-01-30 01:24:31 +00:00
CFGPrinter.cpp
CFLAliasAnalysis.cpp [CFLAA] Fix PR27213; incorrect tagging of args/globals 2016-04-05 21:40:45 +00:00
CGSCCPassManager.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
CMakeLists.txt PM: Implement a basic loop pass manager 2016-02-25 07:23:08 +00:00
CodeMetrics.cpp use range-based for loop; NFCI 2016-03-08 20:53:48 +00:00
ConstantFolding.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
CostModel.cpp Implemented cost model for masked gather and scatter operations 2015-12-28 20:10:59 +00:00
Delinearization.cpp
DemandedBits.cpp [DemandedBits] Revert r249687 due to PR26071 2016-02-03 15:05:06 +00:00
DependenceAnalysis.cpp [DependenceAnalysis] Check if result of getConstantPart is null 2016-04-04 18:13:18 +00:00
DivergenceAnalysis.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
DominanceFrontier.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
DomPrinter.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
EHPersonalities.cpp Add Rust's personality function to the list of known personality functions 2016-03-15 20:35:45 +00:00
GlobalsModRef.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
InlineCost.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
InstCount.cpp
InstructionSimplify.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp
IVUsers.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
LazyCallGraph.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
LazyValueInfo.cpp [LVI] Fix a bug which prevented use of !range metadata within a query 2016-03-04 22:27:39 +00:00
Lint.cpp [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer. 2016-01-22 01:51:51 +00:00
LLVMBuild.txt
Loads.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
LoopAccessAnalysis.cpp Revert r265535 until we know how we can fix the bots 2016-04-06 14:06:32 +00:00
LoopInfo.cpp IR: Reserve an MDKind for !llvm.loop; NFC 2016-03-25 00:35:38 +00:00
LoopPass.cpp LoopInfo: Simplify ownership of Loop objects 2016-01-08 19:08:53 +00:00
LoopPassManager.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
LoopUnrollAnalyzer.cpp [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating. 2016-02-26 02:57:05 +00:00
MemDepPrinter.cpp [PM] Port memdep to the new pass manager. 2016-03-10 00:55:30 +00:00
MemDerefPrinter.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
MemoryDependenceAnalysis.cpp NFC: make AtomicOrdering an enum class 2016-04-06 21:19:33 +00:00
MemoryLocation.cpp
ModuleDebugInfoPrinter.cpp
ObjCARCAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp Add support for objc_unsafeClaimAutoreleasedReturnValue to the 2016-01-27 19:05:08 +00:00
OrderedBasicBlock.cpp
PHITransAddr.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
PostDominators.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
ScalarEvolutionAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ScalarEvolutionExpander.cpp Revert r265535 until we know how we can fix the bots 2016-04-06 14:06:32 +00:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
SparsePropagation.cpp
StratifiedSets.h
TargetLibraryInfo.cpp [NVPTX] Infer __nvvm_reflect as nounwind, readnone 2016-03-31 21:29:57 +00:00
TargetTransformInfo.cpp [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead 2016-03-18 00:27:43 +00:00
Trace.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ValueTracking.cpp Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
VectorUtils.cpp [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan 2016-04-06 07:04:53 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
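
As a sanity check on the claim above, the recurrence can be stepped by
hand.  At iteration i its value is 1 + 3*i + i*(i-1) = (i+1)^2.  Assuming
the exit value is taken at iteration %n - 1 (which is what the expanded
expression above corresponds to), the result is %n * %n.  The sketch below
is plain C++ and does not use ScalarEvolution; `addRecAt` is a
hypothetical helper name.

```cpp
#include <cstdint>

// Value of the chain of recurrences {1,+,3,+,2} at iteration i, computed
// by stepping: the value starts at 1, the step starts at 3, and the step
// itself grows by 2 on every iteration.
constexpr std::uint64_t addRecAt(std::uint64_t i) {
  std::uint64_t value = 1;
  std::uint64_t step = 3;
  for (std::uint64_t k = 0; k < i; ++k) {
    value += step;
    step += 2;
  }
  return value;
}

// Closed form for the exit value under the assumption stated above.
constexpr std::uint64_t exitValue(std::uint64_t n) { return n * n; }
```

For small n, `addRecAt(n - 1)` and `exitValue(n)` agree, with no
arithmetic wider than 64 bits.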

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
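
The fold is valid because truncation commutes with wrapping negation and
addition, so the first two terms cancel.  The sketch below checks this
with ordinary unsigned arithmetic.  It models `undef` as one fixed but
arbitrary value `u`, which is an assumption made for illustration, and the
function names are hypothetical.

```cpp
#include <cstdint>

// trunc i64 ... to i32
constexpr std::uint32_t trunc32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// The expression as ScalarEvolution forms it; (-1 * x) is written as the
// wrapping negation 0 - x.
constexpr std::uint32_t unfolded(std::uint64_t arg5, std::uint64_t u) {
  return trunc32(0 - arg5) + trunc32(arg5) + (0u - trunc32(u));
}

// The proposed folded form, which no longer depends on arg5.
constexpr std::uint32_t folded(std::uint64_t u) {
  return 0u - trunc32(u);
}
```

Both functions return the same value for every `arg5`, since
`trunc32(0 - arg5) + trunc32(arg5)` wraps to zero.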

//===---------------------------------------------------------------------===//