This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
This patch adds support for explicitly highlighting sub-expressions
shared by multiple leaf nodes. For example consider the following
code
%shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10
%trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10
tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10
%load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10
%mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11
tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11
We have two leaf nodes (the 2 stores) and the first store stores %trans
which is also used by the matrix multiply %mult. We generate separate
remarks for each leaf (stores). To denote that parts are shared, the
shared expressions are marked as shared (), with a reference to the
other remark that shares it. The operation summary also denotes the
shared operations separately.
Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D72526
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,
if (cond)
A
if (cond)
B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
This patch updates the remark to also include a summary of the number of
vector operations generated for each matrix expression.
Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D72480
Generate remarks for matrix operations in a function. To generate remarks
for matrix expressions, the following approach is used:
1. Collect leafs of matrix expressions (done in
RemarkGenerator::getExpressionLeafs). Leafs are lowered matrix
instructions without other matrix users (like stores).
2. For each leaf, create a remark containing a linearizied version of the
matrix expression.
The following improvements will be submitted as follow-ups:
* Summarize number of vector instructions generated for each expression.
* Account for shared sub-expressions.
* Propagate matrix remarks up the inlining chain.
The information provided by the matrix remarks helps users to spot cases
where matrix expression got split up, e.g. due to inlining not
happening. The remarks allow users to address those issues, ensuring
best performance.
Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D72453
Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility
method, which sets to nullptr the instructions that are not trivially
dead. Use the new method in LoopStrengthReduce.
Alternative: add a bool to the same method; this option adds a marginal
amount of overhead to the other callers, and the method needs to be
updated to return a bool status when it removes/doesn't remove
instructions.
The utility method RecursivelyDeleteTriviallyDeadInstructions receives
as input a vector of Instructions, where all inputs are valid
instructions. This same vector is used as a scratch storage (per the
header comment) to recursively delete instructions. If an instruction is
added as an operand of multiple other instructions, it may be added twice,
then deleted once, then the second reference in the vector is invalid.
Switch to using a Vector<WeakTrackingVH>.
This change facilitates a clean-up in LoopStrengthReduction.
We currently use integer ranges to merge concrete function arguments.
We use the ParamState range for those, but we only look up concrete
values in the regular state. For concrete function arguments that are
themselves arguments of the containing function, we can use the param
state directly and improve the precision in some cases.
Besides improving the results in some cases, this is also a small step towards
switching to ValueLatticeElement, by allowing D60582 to be a NFC.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71836
Apparently cache of AliasSetTrackers held by LICM was the only user of
SimpleAnalysis infrastructure. Now, given that we no longer have that
cache, this infrastructure is obsolete and, taking into account its
nature, we don't want any new solutions to be based on it.
Reviewers: asbirlea, fhahn, efriedma, reames
Reviewed-By: asbirlea
Differential Revision: https://reviews.llvm.org/D73085
Since LICM doesn't use AST caching any more (see D73081), this
infrastructure is now obsolete and we can remove it.
Reviewers: asbirlea, fhahn, efriedma, reames
Reviewed-By: asbirlea
Differential Revision: https://reviews.llvm.org/D73084
Summary:
This is the first step towards complete removal of AST caching from
LICM. Attempts to keep LICM's AST cache up to date across passes can lead
to miscompiles like this one: https://bugs.llvm.org/show_bug.cgi?id=44320.
LICM has already switched to using MemorySSA to do sinking and hoisting
and only builds an AliasSetTracker on demand for the promoteToScalars
step, without caching it from one LICM instance to the next. Given this,
we don't have compile-time reasons to keep AST caching any more.
The only scenario where the caching would be used currently is when
using the LegacyPassManager and setting -enable-mssa-loop-dependency=false.
This switch should help us to surface any possible issues that may arise
along this way, also it turns subsequent removal of AST caching into NFC.
Reviewers: asbirlea, fhahn, efriedma, reames
Reviewed By: asbirlea
Subscribers: hiraditya, george.burgess.iv, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73081
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, nicolasvasilache
Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73041
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus
making it a generic loop utility function. This allows to rewrite loop exit
values by just calling this function without running the whole IndVarSimplify
pass.
We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.
Differential Revision: https://reviews.llvm.org/D72602
During the SeparateConstOffsetFromGEP pass, signed extensions are distributed
to the values that feed into them and then later recombined. The recombination
stage is somewhat problematic- it doesn't differ add and sub instructions
from another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern
in some instances.
An example- the IR contains:
%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extA + extB
The problematic optimization will transform that into:
%unextendedA
%unextendedB
%subuAuB = unextendedA - unextendedB
%extA = extend A
%extB = extend B
%addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input.
This patch fixes that.
Patch by Drew Wock <drew.wock@sas.com>
Differential Revision: https://reviews.llvm.org/D65967
This reverts commit 3f3017e because there's a failure on peel-loop-nests.ll
with LLVM_ENABLE_EXPENSIVE_CHECKS on.
Differential Revision: https://reviews.llvm.org/D70304
There are a few global (cl::opt) controls that enable optional
behavior in GVN. Introduce GVNOptions that provide corresponding
per-pass instance controls.
That will allow to use GVN multiple times in pipeline each time
with different settings.
Reviewers: asbirlea, rnk, reames, skatkov, fhahn
Reviewed By: fhahn
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72732
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.
In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.
This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.
The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.
Reviewers: tejohnson, davidxl
Reviewed By: tejohnson
Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72547
It appears to be rather useful when analyzing Loops with multiple
deoptimizing exits, perhaps merged ones.
For now it is used in LoopPredication, will be adding more uses
in other loop passes.
Reviewers: asbirlea, fhahn, skatkov, spatel, reames
Reviewed By: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72754
Summary:
InlineResult is used both in APIs assessing whether a call site is
inlinable (e.g. llvm::isInlineViable) as well as in the function
inlining utility (llvm::InlineFunction). It means slightly different
things (can/should inlining happen, vs did it happen), and the
implicit casting may introduce ambiguity (casting from 'false' in
InlineFunction will default a message about hight costs,
which is incorrect here).
The change renames the type to a more generic name, and disables
implicit constructors.
Reviewers: eraman, davidxl
Reviewed By: davidxl
Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72744
Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function.
Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits, amehsan
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72652
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.
Reviewers: fhahn, lebedev.ri, xbolva00
Reviewed By: xbolva00
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D70304
pass.
Summary: This patch changes LoopUnrollAndJamPass to a function pass, and
keeps the loops traversal order same as defined in
FunctionToLoopPassAdaptor LoopPassManager.h.
The next patch will change the loop traversal to outer to inner order,
so more loops can be transform.
Discussion in llvm-dev mailing list:
https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D72230
This patch updates the shape propagation to iterate until no new shape
information is discovered.
As initial seed for the forward propagation, we use the matrix intrinsic
instructions. Both propagateShapeForward and propagateShapeBackward
return new work lists, with the instructions to be used for the next
iteration. When propagating forward, we record all instructions we added
new shape information for. When propagating backward, we record all
users of instructions we added new shape information for.
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70901
This patch extends to shape propagation to also include load
instructions and implements shape aware lowering for vector loads.
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70900
This patch extends the shape propagation for matrix operations to also
propagate the shape of instructions to their operands.
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70899
Factor out common logic into some reasonable commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't.
Differential Revision: https://reviews.llvm.org/D67126
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.
While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.
In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.
Reviewers: fhahn, bcahoon, efriedma
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D65326
This patch extends the current shape propagation and shape aware
lowering to also support binary operators. Those operators are uniform
with respect to their shape (shape of the input operands is the same as
the shape of their result).
Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70898
If the matrix.multiply calls have the contract fast math flag, we can
use fmuladd. This als adds a command line option to force fmuladd
generation. We can retire this option once there is a clang-level
option.
Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D70951