I foresee two uses for this:
1) It's easier to use those in debugger.
2) Once we start implementing more VPlan-to-VPlan transformations (especially
inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
LIT test would become too obscure. I can imagine that we'd want to CHECK
against VPlan dumps after multiple transformations instead. That would be
easier with plain text dumps than with DOT format.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96628
Count iterations of zero-trip loops and assert the count is zero,
rather than asserting inside the loop.
Unreachable functions should use llvm_unreachable.
Remove tautological 'if' statements, even when they're following a
pattern of checks.
Found by the Rotten Green Tests project.
Suppresses an implicit TypeSize to uint64_t conversion warning.
We might be able to just not offset it since we're writing to a
Fixed stack object, but I wasn't sure so I just did what
DAGTypeLegalizer::IncrementPointer does.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D98736
This reverts commit 6b053c9867a3ede32e51cef3ed972d5ce5b38bc0.
The build is broken:
ld.lld: error: undefined symbol: llvm::VPlan::printDOT(llvm::raw_ostream&) const
>>> referenced by LoopVectorize.cpp
>>> LoopVectorize.cpp.o:(llvm::LoopVectorizationPlanner::printPlans(llvm::raw_ostream&)) in archive lib/libLLVMVectorize.a
I foresee two uses for this:
1) It's easier to use those in debugger.
2) Once we start implementing more VPlan-to-VPlan transformations (especially
inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
LIT test would become too obscure. I can imagine that we'd want to CHECK
against VPlan dumps after multiple transformations instead. That would be
easier with plain text dumps than with DOT format.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96628
This reverts commit 11b70b9e3a7458b5b78c30020b56e8ca563a4801.
The bot failure was due to ArgumentPromotion deleting functions
without deleting their analyses. This was separately fixed in 4b1c807.
The XCore backend does not support object emission. Several tests fail for this
reason when XCore is the default target. See staging buildbot builder:
clang-xcore-ubuntu-20-x64.
So check for backend object emission before running the tests requiring it.
Incorporate isConfigurationSupported functionality in isObjectEmissionSupported,
to avoid calling them both in the same tests.
Differential Revision: https://reviews.llvm.org/D98400
A more general name might be match-time error propagation. That is,
it's conceivable we'll one day have non-numeric errors that require
the handling fixed by this patch.
Without this patch, FileCheck behaves as follows:
```
$ cat check
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
$ FileCheck -vv -dump-input=never check < input
check:1:54: remark: implicit EOF: expected string found in input
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
^
<stdin>:2:1: note: found here
^
check:1:15: error: unable to substitute variable or numeric expression: overflow error
CHECK-NOT: [[#0x8000000000000000+0x8000000000000000]]
^
$ echo $?
0
```
Notice that the exit status is 0 even though there's an error.
Moreover, FileCheck doesn't print the error diagnostic unless both
`-dump-input=never` and `-vv` are specified.
The same problem occurs when `CHECK-NOT` does have a match but a
capture fails due to overflow: exit status is 0, and no diagnostic is
printed unless both `-dump-input=never` and `-vv` are specified. The
usefulness of capturing from `CHECK-NOT` is questionable, but this
case should certainly produce an error.
With this patch, FileCheck always includes the error diagnostic and
has non-zero exit status for the above examples. It's conceivable
that this change will cause some existing tests to fail, but my
assumption is that they should fail. Moreover, with nearly every
project enabled, this patch didn't produce additional `check-all`
failures for me.
This patch also extends input dumps to include such numeric error
diagnostics for both expected and excluded patterns.
As noted in fixmes in some of the tests added by this patch, this
patch worsens an existing issue with redundant diagnostics. I'll fix
that bug in a subsequent patch.
Reviewed By: thopre, jhenderson
Differential Revision: https://reviews.llvm.org/D98086
Fixed section of code that iterated through a SmallDenseMap and added
instructions in each iteration, causing non-deterministic code; replaced
SmallDenseMap with MapVector to prevent non-determinism.
This reverts commit 01ac6d1587e8613ba4278786e8341f8b492ac941.
Unreachable code should be self-documented as unreachable.
Found by the Rotten Green Tests project.
Differential Revision: https://reviews.llvm.org/D98518
This caused non-deterministic compiler output; see comment on the
code review.
> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232
This reverts commit df69c69427dea7f5b3b3a4d4564bc77b0926ec88.
BasicAA stores a reference to LoopInfo inside. This imposes an implicit
requirement of keeping it up to date whenever we modify the IR (in particular,
whenever we modify terminators of blocks that belong to loops). Failing
to do so leads to incorrect state of the LoopInfo.
Because general AA does not require loop info updates and provides to API to
update it properly, the users of AA reasonably assume that there is no need to
update the loop info. It may be a reason of bugs, as example in PR43276 shows.
This patch drops dependence of BasicAA on LoopInfo to avoid this problem.
This may potentially pessimize the result of queries to BasicAA.
Differential Revision: https://reviews.llvm.org/D98627
Reviewed By: nikic
- Previously, https://reviews.llvm.org/D97703 was [[ https://reviews.llvm.org/D98543 | reverted ]] as it broke when building the unit tests when shared libs on.
- This patch reverts the "revert" and makes two minor changes
- The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with with shared libs on and off
- The second renames the name of the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests"
Reviewed By: Kai
Differential Revision: https://reviews.llvm.org/D98666
This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types.
Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module.
In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat.
Differential Revision: https://reviews.llvm.org/D98302
This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64
backend to use these edge kinds. This simplifies the implementation of the
MachO/x86-64 backend and makes it possible to write generic x86-64 passes and
utilities.
The new edge kinds are different from the original set used in the MachO/x86-64
backend. Several edge kinds that were not meaningfully distinguished in that
backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in
the new scheme (these edge kinds can be reintroduced later if we find a use for
them). At the same time, new edge kinds have been introduced to convey extra
information about the state of the graph. E.g. The Request*AndTransformTo**
edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP
entries, and the 'Relaxable' suffix distinguishes edges that are candidates for
optimization from edges which should be left as-is (e.g. to enable runtime
redirection).
ELF/x86-64 will be refactored to use these generic edges at some point in the
future, and I anticipate a similar refactor to create a generic arm64 support
header too.
Differential Revision: https://reviews.llvm.org/D98305
This is a patch to add nonnull and align to assume's operand bundle
only if noundef exists.
Since nonnull and align in fn attr have poison semantics, they should be
paired with noundef or noundef-implying attributes to be immediate UB.
Reviewed By: jdoerfert, Tyker
Differential Revision: https://reviews.llvm.org/D98228
D98289 was erroneously reporting `invalid range list offset 0x20110`
instead of `invalid range list table index 0`.
Differential Revision: https://reviews.llvm.org/D98589
This reverts commit 329aeb5db43f5e69df038fb20d2def77fe6f8595,
and relands commit 61f006ac655431bd44b9e089f74c73bec0c1a48c.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
Add printf-style alternate form flag to prefix hex number with 0x when
present. This works on both empty numeric expression (e.g. variable
definition from input) and when matching a numeric expression. The
syntax is as follows:
[[#%#<precision specifier><format specifier>, ...]
where <precision specifier> and <format specifier> are optional and ...
can be a variable definition or not with an empty expression or not.
This feature was requested in https://reviews.llvm.org/D81144#2075532
for llvm/test/MC/ELF/gen-dwarf64.s
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D97845
- This patch adds in support for the ordinary HLASM comment syntax asm
statements (Reference - Chapter 7, Comment Statements, Ordinary Comment
Statements)
- In brief, the ordinary comment syntax if used, must begin with the "*"
character
- To achieve this, this patch makes use of the CommentString attribute
provided in the base MCAsmInfo class
- In the SystemZMCAsmInfo class, the CommentString attribute was set to
"*" based on the assembler dialect
- Furthermore, a new attribute RestrictCommentString, is provided to only
treat a string as a comment if it appears at the start of the asm
statement. Example: "jo *-4" is valid in HLASM (jump back 4 bytes from
current point - similar to jo -4 in gnu asm) and we don't want "*-4" to
be treated as a comment.
- RFC for HLASM Parser support implementation: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html
Reviewed By: scott.linder, Kai
Differential Revision: https://reviews.llvm.org/D97703
This patch makes uses of the context bridges introduced in D83299 to make
AAValueConstantRange call site specific.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83744
Add support to widen select instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer.
Previously select instructions get handled by the wrong recipe and resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D97136
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.
Reviewed By: arsenm, paquette
Differential Revision: https://reviews.llvm.org/D96013
now -funique-internal-linkage-name flag is available, and we want to flip
it on by default since it is beneficial to have separate sample profiles
for different internal symbols with the same name. As a preparation, we
want to avoid regression caused by the flip.
When we flip -funique-internal-linkage-name on, the profile is collected
from binary built without -funique-internal-linkage-name so it has no uniq
suffix, but the IR in the optimized build contains the suffix. This kind of
mismatch may introduce transient regression.
To avoid such mismatch, we introduce a NameTable section flag indicating
whether there is any name in the profile containing uniq suffix. Compiler
will decide whether to keep uniq suffix during name canonicalization
depending on the NameTable section flag. The flag is only available for
extbinary format. For other formats, by default compiler will keep uniq
suffix so they will only experience transient regression when
-funique-internal-linkage-name is just flipped.
Another type of regression is caused by places where we miss to call
getCanonicalFnName. Those places are fixed.
Differential Revision: https://reviews.llvm.org/D96932
This test currently fails to compile when using a MinGW toolchain as setenv is not defined. This function is a POSIX function Windows does not implement.
This patch enables the setenv macro used in the unit test for all of Windows, making the test compile and run successfully.
Differential Revision: https://reviews.llvm.org/D98271
The LiveDebugValues and LiveDebugVariables implementations for handling
DBG_VALUE_LIST instructions can be simplified significantly if they do not have
to deal with any duplicated operands, such as a DBG_VALUE_LIST that uses the
same register multiple times in its expression. This patch adds a function,
replaceArg, that can be used to simplify a DIExpression in the case of
duplicated operands.
Differential Revision: https://reviews.llvm.org/D83896
This patch updates the various IR passes to correctly handle dbg.values with a
DIArgList location. This patch does not actually allow DIArgLists to be produced
by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
Other than that, it should cover every IR pass.
Most of the changes simply extend code that operated on a single debug value to
operate on the list of debug values in the style of any_of, all_of, for_each,
etc. Instances of setOperand(0, ...) have been replaced with with
replaceVariableLocationOp, which takes the value that is being replaced as an
additional argument. In places where this value isn't readily available, we have
to track the old value through to the point where it gets replaced.
Differential Revision: https://reviews.llvm.org/D88232
Changes to function calls in LocalTest resulted in comparisons between
unsigned values and signed literals; the latter have been updated to be
unsigned to prevent this warning.
This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
location operand, resulting in a significant change to its interface. This patch
does not update all IR passes to support multiple location operands in a
dbg.value; the only change is to update the DbgVariableIntrinsic interface and
its uses. All code outside of the intrinsic classes assumes that an intrinsic
will always have exactly one location operand; they will still support
DIArgLists, but only if they contain exactly one Value.
Among other changes, the setOperand and setArgOperand functions in
DbgVariableIntrinsic have been made private. This is to prevent code from
setting the operands of these intrinsics directly, which could easily result in
incorrect/invalid operands being set. This does not prevent these functions from
being called on a debug intrinsic at all, as they can still be called on any
CallInst pointer; it is assumed that any code directly setting the operands on a
generic call instruction is doing so safely. The intention for making these
functions private is to prevent DIArgLists from being overwritten by code that's
naively trying to replace one of the Values it points to, and also to fail fast
if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
corresponding DIExpression.
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).
This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).
Differential Revision: https://reviews.llvm.org/D88175
This is recommit of 4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4.
MIR in one unit test had mismatched types.
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D96122
This reverts commit 7479a2e00bc41f399942e5106fbdf9b4b0c11506.
This commit causes compile errors on clang-x64-windows-msvc, so I'm
reverting the patch for now.
For reference, the error in question is:
```
error C2280: 'llvm::raw_ostream_iterator<char,char>
&llvm::raw_ostream_iterator<char,char>::operator =(const
llvm::raw_ostream_iterator<char,char> &)': attempting to reference a deleted
function
note: compiler has generated 'llvm::raw_ostream_iterator<char,char>::operator ='
here
note: 'llvm::raw_ostream_iterator<char,char>
&llvm::raw_ostream_iterator<char,char>::operator =(const
llvm::raw_ostream_iterator<char,char> &)': function was implicitly deleted
because 'llvm::raw_ostream_iterator<char,char>' has a data member
'llvm::raw_ostream_iterator<char,char>::OutStream' of reference type
```
Adds a class `raw_ostream_iterator` that behaves like
std::ostream_iterator, but can be used with raw_ostream.
This is useful for using raw_ostream with std algorithms.
For example, it can be used to output std containers as follows:
```
std::vector<int> V = { 1, 2, 3 };
std::copy(V.begin(), V.end(), raw_ostream_iterator<int>(outs(), ", "));
// Output: "1, 2, 3, "
```
The API tries to follow std::ostream_iterator as closely as is
practically possible.
Reviewed By: dblaikie, mkitzan
Differential Revision: https://reviews.llvm.org/D78795
:: (store 1 + 4, addrspace 1)
->
:: (store 1 into undef + 4, addrspace 1)
An offset without a base isn't terribly useful but it's convenient to update
the offset without checking the value. For example, when breaking apart
stores into smaller units
Differential Revision: https://reviews.llvm.org/D97812
As stated in the CMake manual, we are supposed to use MODULE rules to generate
plugin libraries:
"MODULE libraries are plugins that are not linked into other targets but may be
loaded dynamically at runtime using dlopen-like functionality"
Besides, LLVM's plugin infrastructure fits with the AIX treatment of .so
shared objects more than it fits with the AIX treatment of .a library archives
(which may contain shared objects).
Differential revision: https://reviews.llvm.org/D96282
Similar to b3a33553aec7, but this shows a TODO and a potential
miscompile is already present.
We are tracking an FP instruction that does *not* have FMF (reassoc)
properties, so calling that "Unsafe" seems opposite of the common
reading.
I also removed one getter method by rolling the null check into
the access. Further simplification may be possible.
The motivation is to clean up the interactions between FMF and
function-level attributes in these classes and their callers.
The new test shows that there is an existing bug somewhere in
the callers. We assumed that the original code was fully 'fast'
and so we produced IR with 'fast' even though it was just 'reassoc'.
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D96122
This removes the unit test from a968e7b82eac as it reportedly causes
some link problems. It can be reinstated once the issues are understood
and sorted out.
This adds some simple known bits handling for the three CSINC/NEG/INV
instructions. From the operands known bits we can compute the common
bits of the first operand and incremented/negated/inverted second
operand. The first, especially CSINC ZR, ZR, comes up fair amount in the
tests. The others are more rare so a unit test for them is added.
Differential Revision: https://reviews.llvm.org/D97788
* Introduce the new intrinsic amdgcn_strict_wwm
* Deprecate the old intrinsic amdgcn_wwm
The change is done for consistency as the "strict"
prefix will become an important, distinguishing factor
between amdgcn_wqm and amdgcn_strictwqm in the future.
The "strict" prefix indicates that inactive lanes do not
take part in control flow, specifically an inactive lane
enabled by a strict mode will always be enabled irrespective
of control flow decisions.
The amdgcn_wwm will be removed, but doing so in two steps
gives users time to switch to the new name at their own pace.
Reviewed By: critson
Differential Revision: https://reviews.llvm.org/D96257
This merges more AMDGPU ABI lowering code into the generic call
lowering. Start cleaning up by factoring away more of the pack/unpack
logic into the buildCopy{To|From}Parts functions. These could use more
improvement, and the SelectionDAG versions are significantly more
complex, and we'll eventually have to emulate all of those cases too.
This is mostly NFC, but does result in some minor instruction
reordering. It also removes some of the limitations with mismatched
sizes the old code had. However, similarly to the merge on the input,
this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we
actually want, but SelectionDAG is stuck using the weird emergent
ABI).
This also changes the load/store size for stack passed EVTs for
AArch64, which makes it consistent with the DAG behavior.
For the cases of two clobbering loads and one loaded object is fully contained
in the second `BasicAAResult::aliasGEP` returns just `PartialAlias` that
is actually more common case of partial overlap, it doesn't say anything about
actual overlapping sizes.
AA users such as GVN and DSE have no functionality to estimate aliasing of GEPs
with non-constant offsets. The change stores estimated relative offsets so they
can be used further.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93529
remove `Hi` `Lo` argument from `emitDwarfUnitLength`, so we
can make caller of emitDwarfUnitLength easier.
Reviewed By: MaskRay, dblaikie, ikudrin
Differential Revision: https://reviews.llvm.org/D96409
We will pass StringRef and change it in reader.
But we reuse the same Filename vector without clear it,
so in some systems, we may clobbeer previous results.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D97353
We may need to do some customization for DWARF unit length in DWARF
section headers for some targets for some code generation path.
For example, for XCOFF in assembly path, AIX assembler does not require
the debug section containing its debug unit length in the header.
Move emitDwarfUnitLength to MCStreamer class so that we can do
customization in different Streamers
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D95932
This CL is not big but contains changes that span multiple analyses and
passes. This description is very long because it tries to explain basics
on what each pass/analysis does and why we need this change on top of
that. Please feel free to skip parts that are not necessary for your
understanding.
---
`WasmEHFuncInfo` contains the mapping of <EH pad, the EH pad's next
unwind destination>. The value (unwind dest) here is where an exception
should end up when it is not caught by the key (EH pad). We record this
info in WasmEHPrepare to fix catch mismatches, because the CFG itself
does not have this info. A CFG only contains BBs and
predecessor-successor relationship between them, but in `WasmEHFuncInfo`
the unwind destination BB is not necessarily a successor or the key EH
pad BB. Their relationship can be intuitively explained by this C++ code
snippet:
```
try {
try {
foo();
} catch (int) { // EH pad
...
}
} catch (...) { // unwind destination
}
```
So when `foo()` throws, it goes to `catch (int)` first. But if it is not
caught by it, it ends up in the next unwind destination `catch (...)`.
This unwind destination is what you see in `catchswitch`'s
`unwind label %bb` part.
---
`WebAssemblyExceptionInfo` groups exceptions so that they can be sorted
continuously together in CFGSort, as we do for loops. What this analysis
does is very simple: it creates a single `WebAssemblyException` per EH
pad, and all BBs that are dominated by that EH pad are included in this
exception. We also identify subexception relationship in this way: if
EHPad A domiantes EHPad B, EHPad B's exception is a subexception of
EHPad A's exception.
This simple rule turns out to be incorrect in some cases. In
`WasmEHFuncInfo`, if EHPad A's unwind destination is EHPad B, it means
semantically EHPad B should not be included in EHPad A's exception,
because it does not make sense to rethrow/delegate to an inner scope.
This is what happened in CFGStackify as a result of this:
```
try
try
catch
... <- %dest_bb is among here!
end
delegate %dest_bb
```
So this patch adds a phase in `WebAssemblyExceptionInfo::recalculate` to
make sure excptions' unwind destinations are not subexceptions of
their unwind sources in `WasmEHFuncInfo`.
But this alone does not prevent `dest_bb` in the example above from
being sorted within the inner `catch`'s exception, even if its exception
is not a subexception of that `catch`'s exception anymore, because of
how CFGSort works, which will be explained below.
---
CFGSort places BBs within the same `SortRegion` (loop or exception)
continuously together so they can be demarcated with `loop`-`end_loop`
or `catch`-`end_try` in CFGStackify.
`SortRegion` is a wrapper for one of `MachineLoop` or
`WebAssemblyException`. `SortRegionInfo` already does some complicated
things because there discrepancies between those two data structures.
`WebAssemblyException` is what we control, and it is defined as an EH
pad as its header and BBs dominated by the header as its BBs (with a
newly added exception of unwind destinations explained in the previous
paragraph). But `MachineLoop` is an LLVM data structure and uses the
standard loop detection algorithm. So by the algorithm, BBs that are 1.
dominated by the loop header and 2. have a path back to its header.
Because of the second condition, many BBs that are dominated by the loop
header are not included in the loop. So BBs that contain `return` or
branches to outside of the loop are not technically included in
`MachineLoop`, but they can be sorted together with the loop with no
problem.
Maybe to relax the condition, in CFGSort, when we are in a `SortRegion`
we allow sorting of not only BBs that belong to the current innermost
region but also BBs that are by the current region header.
(This was written this way from the first version written by Dan, when
only loops existed.) But now, we have cases in exceptions when EHPad B
is the unwind destination for EHPad A, even if EHPad B is dominated by
EHPad A it should not be included in EHPad A's exception, and should not
be sorted within EHPad A.
One way to make things work, at least correctly, is change `dominates`
condition to `contains` condition for `SortRegion` when sorting BBs, but
this will change compilation results for existing non-EH code and I
can't be sure it will not degrade performance or code size. I think it
will degrade performance because it will force many BBs dominated by a
loop, which don't have the path back to the header, to be placed after
the loop and it will likely to create more branches and blocks.
So this does a little hacky check when adding BBs to `Preferred` list:
(`Preferred` list is a ready list. CFGSort maintains ready list in two
priority queues: `Preferred` and `Ready`. I'm not very sure why, but it
was written that way from the beginning. BBs are first added to
`Preferred` list and then some of them are pushed to `Ready` list, so
here we only need to guard condition for `Preferred` list.)
When adding a BB to `Preferred` list, we check if that BB is an unwind
destination of another BB. To do this, this adds the reverse mapping,
`UnwindDestToSrc`, and getter methods to `WasmEHFuncInfo`. And if the BB
is an unwind destination, it checks if the current stack of regions
(`Entries`) contains its source BB by traversing the stack backwards. If
we find its unwind source in there, we add the BB to its `Deferred`
list, to make sure that unwind destination BB is added to `Preferred`
list only after that region with the unwind source BB is sorted and
popped from the stack.
---
This does not contain a new test that crashes because of this bug, but
this fix changes the result for one of existing test case. This test
case didn't crash because it fortunately didn't contain `delegate` to
the incorrectly placed unwind destination BB.
Fixes https://github.com/emscripten-core/emscripten/issues/13514.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D97247
If resulting size of the output stream is already known,
then the space for stream data could be preliminary
allocated in some cases. f.e. raw_string_ostream could
preallocate the space for the target string(it allows
to avoid reallocations during writing into the stream).
Differential Revision: https://reviews.llvm.org/D91693
When one of the inputs is a wrapping range, intersect with the
union of the two inputs. The union of the two inputs corresponds
to the result we would get if we treated the min/max as a simple
select.
This fixes PR48643.
We don't need any special handling for wrapping ranges (or empty
ranges for that matter). The sub() call will already compute a
correct and precise range.
We only need to adjust the test expectation: We're now computing
an optimal result, rather than an unsigned envelope.
When the optimality check fails, print the inputs, the computed
range and the better range that was found. This makes it much
simpler to identify the cause of the failure.
Make sure that full ranges (which, unlikely all the other cases,
have multiple ways to construct them that all result in the same
range) only print one message by handling them separately.
The current infrastructure for exhaustive ConstantRange testing is
somewhat confusing in what exactly it tests and currently cannot even
be used for operations that produce smallest-size results, rather than
signed/unsigned envelopes.
This patch makes the testing more principled by collecting the exact
set of results of an operation into a bit set and then comparing it
against the range approximation by:
* Checking conservative correctness: All elements in the set must be
in the range.
* Checking optimality under a given preference function: None of the
(slack-free) ranges that can be constructed from the set are
preferred over the computed range.
Implemented preference functions are:
* PreferSmallest: Smallest range regardless of signed/unsigned wrapping
behavior. Probably what we would call "optimal" without further
qualification.
* PreferSmallestUnsigned/Signed: Smallest range that has no
unsigned/signed wrapping. We use this if our calculation is precise
only up to signed/unsigned envelope.
* PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
unsigned/signed wrapping -- but preferring a smaller wrapping range
over a (non-wrapping) full range. We use this if we have a fully
precise calculation but apply a sign preference to the result
(union/intersection). Even with a sign preference, returning a
wrapping range is still "strictly better" than returning a full one.
This also addresses PR49273 by replacing the fragile manual range
construction logic in testBinarySetOperationExhaustive() with generic
code that isn't specialized to the particular form of ranges that set
operations can produces.
Differential Revision: https://reviews.llvm.org/D88356
As discussed on the RFC [0], I am sharing the set of patches that
enables checking of original Debug Info metadata preservation in
optimizations. The proof-of-concept/proposal can be found at [1].
The implementation from the [1] was full of duplicated code,
so this set of patches tries to merge this approach into the existing
debugify utility.
For example, the utility pass in the original-debuginfo-check
mode could be invoked as follows:
$ opt -verify-debuginfo-preserve -pass-to-test sample.ll
Since this is very initial stage of the implementation,
there is a space for improvements such as:
- Add support for the new pass manager
- Add support for metadata other than DILocations and DISubprograms
[0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
[1] https://github.com/djolertrk/llvm-di-checker
Differential Revision: https://reviews.llvm.org/D82545
The test that was failing is now forced to use the old PM.
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
This patch adds functionality to compare for the equality between `InterfaceFile`s based on attributes specific to linking.
Reviewed By: cishida, steven_wu
Differential Revision: https://reviews.llvm.org/D96629
As discussed on the RFC [0], I am sharing the set of patches that
enables checking of original Debug Info metadata preservation in
optimizations. The proof-of-concept/proposal can be found at [1].
The implementation from the [1] was full of duplicated code,
so this set of patches tries to merge this approach into the existing
debugify utility.
For example, the utility pass in the original-debuginfo-check
mode could be invoked as follows:
$ opt -verify-debuginfo-preserve -pass-to-test sample.ll
Since this is very initial stage of the implementation,
there is a space for improvements such as:
- Add support for the new pass manager
- Add support for metadata other than DILocations and DISubprograms
[0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
[1] https://github.com/djolertrk/llvm-di-checker
Differential Revision: https://reviews.llvm.org/D82545
Same implementation as G_SEXT_INREG.
Add a testcase to combine-sext-inreg for a concrete example, and a testcase
to KnownBitsTest.
Differential Revision: https://reviews.llvm.org/D96897
We can't construct a working unique_function from an object that's not callable
with the right types, so don't allow deduction to succeed.
This avoids some ambiguous conversion cases, e.g. allowing to overload
on different unique_function types, and to conversion operators to
unique_function.
std::function and the any_invocable proposal have these.
This was added to llvm::function_ref in D88901 and followups
Differential Revision: https://reviews.llvm.org/D96794
The GPUDivergenceAnalysis is now renamed to just "DivergenceAnalysis"
since there is no conflict with LegacyDivergenceAnalysis. In the
legacy PM, this analysis can only be used through the legacy DA
serving as a wrapper. It is now made available as a pass in the new
PM, and has no relation with the legacy DA.
The new DA currently cannot handle irreducible control flow; its
presence can cause the analysis to run indefinitely. The analysis is
now modified to detect this and report all instructions in the
function as divergent. This is super conservative, but allows the
analysis to be used without hanging the compiler.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D96615
This commit fixes how metadata is handled in CloneModule to be sound,
and improves how it's handled in CloneFunctionInto (although the latter
is still awkward when called within a module).
Ruiling Song pointed out in PR48841 that CloneModule was changed to
unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
fa35c1f80f0ea080a7cbc581416929b0a654f25c for clarity). This flag papered
over a crash caused by other various changes made to CloneFunctionInto
over the past few years that made it unsound to use cloning between
different modules.
(This commit partially addresses PR48841, fixing the repro from
preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
became unsound in df763188c9a1ecb1e7e5c4d4ea53a99fbb755903 and this
commit does not address that regression.)
RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
avoiding unnecessary clones of all referenced metadata when linking
between modules (with IRMover, the source module is discarded after
linking). It never makes sense to use when you're not discarding the
source. This commit drops its incorrect use in CloneModule.
Sadly, the right thing to do with metadata when cloning a function is
complicated, and this patch doesn't totally fix it.
The first problem is that there are two different types of referenceable
metadata and it's not obvious what to with one of them when remapping.
- `!0 = !{!1}` is metadata's version of a constant. Programatically it's
called "uniqued" (probably a better term would be "constant") because,
like `ConstantArray`, it's stored in uniquing tables. Once it's
constructed, it's illegal to change its arguments.
- `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
to change the operands after construction.
What should be done with distinct metadata when cloning functions within
the same module?
- Should new, cloned nodes be created?
- Should all references point to the same, old nodes?
The answer depends on whether that metadata is effectively owned by a
function.
And that's the second problem. Referenceable metadata's ownership model
is not clear or explicit. Technically, it's all stored on an
LLVMContext. However, any metadata that is `distinct`, that transitively
references a `distinct` node, or that transitively references a
GlobalValue is specific to a Module and is effectively owned by it. More
specifically, some metadata is effectively owned by a specific Function
within a module.
Effectively function-local metadata was introduced somewhere around
c10d0e5ccd12f049bddb24dcf8bbb7fbbc6c68f2, which made it illegal for two
functions to share a DISubprogram attachment.
When cloning a function within a module, you need to clone the
function-local debug info and suppress cloning of global debug info (the
status quo suppresses cloning some global debug info but not all). When
cloning a function to a new/different module, you need to clone all of
the debug info.
Here's what I think we should do (eventually? soon? not this patch
though):
- Distinguish explicitly (somehow) between pure constant metadata owned
by the LLVMContext, global metadata owned by the Module, and local
metadata owned by a GlobalValue (such as a function).
- Update CloneFunctionInto to trigger cloning of all "local" metadata
(only), perhaps by adding a bit to RemapFlag. Alternatively, split
out a separate function CloneFunctionMetadataInto to prime the
metadata map that callers are updated to call ahead of time as
appropriate.
Here's the somewhat more isolated fix in this patch:
- Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
an enum called `CloneFunctionChangeType` that is one of
LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
- The code maintaining the "functions uniquely own subprograms"
invariant is now only active in the first two cases, where a function
is being cloned within a single module. That's necessary because this
code inhibits cloning of (some) "global" metadata that's effectively
owned by the module.
- The code maintaining the "all compile units must be explicitly
referenced by !llvm.dbg.cu" invariant is now only active in the
DifferentModule case, where a function is being cloned into a new
module in isolation.
- CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
uses LocalChangeOnly, since fa635d730f74f3285b77cc1537f1692184b8bf5b
only set `ModuleLevelChanges` to trigger cloning of local metadata.
- CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
and special handling of !llvm.dbg.cu.
- Fixed some outdated header docs and left a couple of FIXMEs.
Differential Revision: https://reviews.llvm.org/D96531
Adds an *unaudited* SHA-256 implementation to `llvm/Support`. The ongoing lld-macho effort needs this to emit an adhoc code signature for macho files on macOS Big Sur.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96540
Some of these accidentally disabled tests failed as a result; updated
tests per @qcolombet instructions. A small number needed additional
updates because legalization has actually changed since they were
written.
Found by the Rotten Green Tests project.
Differential Revision: https://reviews.llvm.org/D95257
The individual recipes have been updated to manage their operands using
VPUser a while back. Now that the transition is done, we can instead
make VPRecipeBase a VPUser and get rid of the toVPUser helper.
Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and
`MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more
precisely describe its effect and clarify the header documentation.
Found this while helping to investigate PR48841, which pointed out an
unsound use of the flag in `CloneModule()`. For now I've just added a
FIXME there, but I'm hopeful that the new (more precise) name will
prevent other similar errors.
This attempts to move all tools over to using `add_llvm_library` for
better consistency. After doing this, I noticed it ended up as nearly a
reimplementation of https://reviews.llvm.org/rL342148, which later got
reverted in r342336 (b09a8c9bd9b819741b38071a7ccd95042ef2643a).
With ccache and ninja on a large core machine (40), I haven't run into
build errors, so I'm hopeful it's better now, though it doesn't seem to
be any different / new.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D90970
We implement getHostCPUName() for AIX via systemcfg interfaces since access to the processor version register is a privileged operation. We return a value based on the current processor implementation mode.
This fixes the cpu detection used by clang for `-mcpu=native`.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D95966
Currently, the SmallPtrSet type allows inserting elements but it does
not support inserting elements with a positional hint. The lack of this
signature means that you cannot use SmallPtrSet with
std::insert_iterator or std::inserter(), which makes some code
constructs more awkward. This adds an overload of insert() that can be
used in these scenarios.
The positional hint is unused by SmallPtrSet and the call is equivalent
to calling insert() without a hint.
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.
Patch by Yuta Saito.
Differential Revision: https://reviews.llvm.org/D77384