Split impliesPoison into two recursive walks, one over V, the
other over ValAssumedPoison. This allows us to reason about poison
implications in a number of additional cases that are important
in practice. This is a generalized form of D94859, which handles
the cmp to cmp implication in particular.
Differential Revision: https://reviews.llvm.org/D94866
This CPU supports all v8.5a features except BTI, and so identifies as v8.5a to
Clang. A bit weird, but the best way for things like xnu to detect the new
features it cares about.
This patch adds the default value of 1 to drop_begin.
In the llvm codebase, 70% of calls to drop_begin have 1 as the second
argument. The interface similar to with std::next should improve
readability.
This patch converts a couple of calls to drop_begin as examples.
Differential Revision: https://reviews.llvm.org/D94858
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations. Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
&& std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted. Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace verson.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93510
This reverts commit 33be50daa9ce1074c3b423a4ab27c70c0722113a,
effectively reapplying:
- 260a856c2abcef49c7cb3bdcd999701db3e2af38
- 3043e5a5c33c4c871f4a1dfd621a8839f9a1f0b3
- 49142991a685bd427d7e877c29c77371dfb7634c
... with a fix to skip a call to `SmallVector::isReferenceToStorage()`
when we know the parameter had been taken by value for small, POD-like
`T`. See https://reviews.llvm.org/D93779 for the discussion on the
revert.
At a high-level, these commits fix reference invalidation in
SmallVector's push_back, append, insert (one or N), and resize
operations. For more details, please see the original commit messages.
This commit fixes a bug that crept into
`SmallVectorTemplateCommon::reserveForAndGetAddress()` during the review
process after performance analysis was done. That function is now called
`reserveForParamAndGetAddress()`, clarifying that it only works for
parameter values. It uses that knowledge to bypass
`SmallVector::isReferenceToStorage()` when `TakesParamByValue`. This is
`constexpr` and avoids adding overhead for "small enough", trivially
copyable `T`.
Performance could potentially be tuned further by increasing the
threshold for `TakesParamByValue`, which is currently defined as:
```
bool TakesParamByValue = sizeof(T) <= 2 * sizeof(void *);
```
in the POD-like version of SmallVectorTemplateBase (else, `false`).
Differential Revision: https://reviews.llvm.org/D94800
Add a matcher that checks if the given subpattern has only one non-debug use.
Also improve existing m_OneUse testcase.
Differential Revision: https://reviews.llvm.org/D94705
This reverts commit 260a856c2abcef49c7cb3bdcd999701db3e2af38.
This reverts commit 3043e5a5c33c4c871f4a1dfd621a8839f9a1f0b3.
This reverts commit 49142991a685bd427d7e877c29c77371dfb7634c.
This change had a larger than anticipated compile-time impact,
possibly because the small value optimization is not working as
intended. See D93779.
It turns out we need to handle `LangOptions` separately from the rest of the options. `LangOptions` used to be conditionally parsed only when `!(DashX.getFormat() == InputKind::Precompiled || DashX.getLanguage() == Language::LLVM_IR)` and we need to restore this order (for more info, see D94682).
We could do this similarly to how `DiagnosticOptions` are handled: via a counterpart to the `IsDiag` mix-in (e.g. `IsLang`). These mix-ins would prefix the option key path with the appropriate `CompilerInvocation::XxxOpts` member. However, this solution would be problematic, as we'd now have two kinds of options (`Lang` and `Diag`) with seemingly incomplete key paths in the same file. To understand what `CompilerInvocation` member an option affects, one would need to read the whole option definition and notice the `IsDiag` or `IsLang` class.
Instead, this patch introduces more robust way to handle different kinds of options separately: via the `KeyPathAndMacroPrefix` class. We have one specialization of that class per `CompilerInvocation` member (e.g. `LangOpts`, `DiagnosticOpts`, etc.). Now, instead of specifying a key path with `"LangOpts->UndefPrefixes"`, we use `LangOpts<"UndefPrefixes">`. This keeps the readability intact (you don't have to look for the `IsLang` mix-in, the key path is complete on its own) and allows us to specify a custom macro prefix within `LangOpts`.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D94676
The number of hardware threads available to a ThreadPool can be limited if setting an affinity mask.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which was preventing from using both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
For small enough, trivially copyable `T`, take the parameter by-value in
`SmallVector::resize`. Otherwise, when growing, update the arugment
appropriately.
Differential Revision: https://reviews.llvm.org/D93781
For small enough, trivially copyable `T`, take the parameter by-value in
`SmallVector::append` and `SmallVector::insert`. Otherwise, when
growing, update the arugment appropriately.
Differential Revision: https://reviews.llvm.org/D93780
This reverts commit 56d1ffb927d03958a7a31442596df749264a7792, reapplying
9abac60309006db00eca0af406c2e16bef26807c, removing insert_one_maybe_copy
and using a helper called forward_value_param instead. This avoids use
of `std::is_same` (or any SFINAE), so I'm hoping it's more portable and
MSVC will be happier.
Original commit message follows:
For small enough, trivially copyable `T`, take the argument by value in
`SmallVector::push_back` and copy it when forwarding to
`SmallVector::insert_one_impl`. Otherwise, when growing, update the
argument appropriately.
Differential Revision: https://reviews.llvm.org/D93779
For small enough, trivially copyable `T`, take the argument by value in
`SmallVector::push_back` and copy it when forwarding to
`SmallVector::insert_one_impl`. Otherwise, when growing, update the
argument appropriately.
Differential Revision: https://reviews.llvm.org/D93779
The number of hardware threads available to a ThreadPool can be limited if setting an affinity mask.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which was preventing from using both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations. Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
&& std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted. Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace verson.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93510
Currently we don't support multiple SHT_SYMTAB_SHNDX sections
and the DT_SYMTAB_SHNDX tag currently.
This patch implements it and fixes the
https://bugs.llvm.org/show_bug.cgi?id=43991.
I had to introduce the `struct DataRegion` to ELF.h,
it is used to represent a region that might have no known size.
It is needed, because we don't know the size of the extended
section indices table when it is located via DT_SYMTAB_SHNDX.
In this case we still want to validate that we don't read
past the end of the file.
Differential revision: https://reviews.llvm.org/D92923
Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only available as VOP3 did not. With this
patch, all VOP3s will have the _e64 suffix.
The assembly does not change, only the mir.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D94341
Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
Remove the InsertionPoint argument from SlotIndexes::insertMBBInMaps
because it was confusing: what does it mean to insert a new block
between two instructions, in the middle of an existing block?
Instead, support the case that MachineBasicBlock::splitAt really needs,
where the new block contains some instructions that are already in the
maps because they have been moved there from the tail of the previous
block.
In all other use cases the new block is empty.
Based on work by Carl Ritson!
Differential Revision: https://reviews.llvm.org/D94311
This patch unifies the way recipes and VPValues are printed after the
transition to VPDef.
VPSlotTracker has been updated to iterate over all recipes and all
their defined values to number those. There is no need to number
values in Value2VPValue.
It also updates a few places that only used slot numbers for
VPInstruction. All recipes now can produce numbered VPValues.
This patch introduces a helper class SubsequentDelim to simplify loops
that generate a comma-separated lists.
For example, consider the following loop, taken from
llvm/lib/CodeGen/MachineBasicBlock.cpp:
for (auto I = pred_begin(), E = pred_end(); I != E; ++I) {
if (I != pred_begin())
OS << ", ";
OS << printMBBReference(**I);
}
The new class allows us to rewrite the loop as:
SubsequentDelim SD;
for (auto I = pred_begin(), E = pred_end(); I != E; ++I)
OS << SD << printMBBReference(**I);
where SD evaluates to the empty string for the first time and ", " for
subsequent iterations.
Unlike interleaveComma, defined in llvm/include/llvm/ADT/STLExtras.h,
SubsequentDelim can accommodate a wider variety of loops, including:
- those that conditionally skip certain items,
- those that need iterators to call getSuccProbability(I), and
- those that iterate over integer ranges.
As an example, this patch cleans up MachineBasicBlock::print.
Differential Revision: https://reviews.llvm.org/D94377
Currently make_early_inc_range cannot be used with iterators with
operator* implementations that do not return a reference.
Most notably in the LLVM codebase, this means the User iterator ranges
cannot be used with make_early_inc_range, which slightly simplifies
iterating over ranges while elements are removed.
Instead of directly using BaseT::reference as return type of operator*,
this patch uses decltype to get the actual return type of the operator*
implementation in WrappedIteratorT.
This patch also updates a few places to use make use of
make_early_inc_range.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93992
This implements basic instructions for the new spec.
- Adds new versions of instructions: `catch`, `catch_all`, and `rethrow`
- Adds support for instruction selection for the new instructions
- `catch` needs a custom routine for the same reason `throw` needs one,
to encode `__cpp_exception` tag symbol.
- Updates `WebAssembly::isCatch` utility function to include `catch_all`
and Change code that compares an instruction's opcode with `catch` to
use that function.
- LateEHPrepare
- Previously in LateEHPrepare we added `catch` instruction to both
`catchpad`s (for user catches) and `cleanuppad`s (for destructors).
In the new version `catch` is generated from `llvm.catch` intrinsic
in instruction selection phase, so we only need to add `catch_all`
to the beginning of cleanup pads.
- `catch` is generated from instruction selection, but we need to
hoist the `catch` instruction to the beginning of every EH pad,
because `catch` can be in the middle of the EH pad or even in a
split BB from it after various code transformations.
- Removes `addExceptionExtraction` function, which was used to
generate `br_on_exn` before.
- CFGStackfiy: Deletes `fixUnwindMismatches` function. Running this
function on the new instruction causes crashes, and the new version
will be added in a later CL, whose contents will be completely
different. So deleting the whole function will make the diff easier to
read.
- Reenables all disabled tests in exception.ll and eh-lsda.ll and a
single basic test in cfg-stackify-eh.ll.
- Updates existing tests to use the new assembly format. And deletes
`br_on_exn` instructions from the tests and FileCheck lines.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94040
The new test case here contains a first order recurrences and an
instruction that is replicated. The first order recurrence forces an
instruction to be sunk _into_, as opposed to after the replication
region. That causes several things to go wrong including registering
vector instructions multiple times and failing to create dominance
relations correctly.
Instead we should be sinking to after the replication region, which is
what this patch makes sure happens.
Differential Revision: https://reviews.llvm.org/D93629
When creating pi-blocks we try to avoid creating duplicate edges
between outside nodes and the pi-block when an edge is of the
same kind and direction as another one that has already been
created. We do this by keeping track of the edges in an
enumerated array called EdgeAlreadyCreated. The problem is that
this array is declared local to the loop that iterates over the
nodes in the pi-block, so the information gets lost every time a
new inside-node is iterated over. The fix is to move the
declaration to the outer loop.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D94094
Change the way NoAlias assumptions in BasicAA are handled. Instead of
handling this inside the phi-phi code, always initially insert a
NoAlias result into the map and keep track whether it is used.
If it is used, then we require that we also get back NoAlias from
the recursive queries. Otherwise, the entry is changed to MayAlias.
Additionally, keep track of all location pairs we inserted that may
still be based on assumptions higher up. If it turns out one of those
assumptions is incorrect, we flush them from the cache.
The compile-time impact for the new implementation is significantly
higher than the previous iteration of this patch:
https://llvm-compile-time-tracker.com/compare.php?from=c0bb9859de6991cc233e2dedb978dd118da8c382&to=c07112373279143e37568b5bcd293daf81a35973&stat=instructions
However, it should avoid the exponential runtime cases we run into
if we don't cache assumption-based results entirely.
This also produces better results in some cases, because NoAlias
assumptions can now start at any root, rather than just phi-phi pairs.
This is not just relevant for analysis quality, but also for BatchAA
consistency: Otherwise, results would once again depend on query order,
though at least they wouldn't be wrong.
This ended up both more complicated and more expensive than I hoped,
but I wasn't able to come up with another solution that satisfies all
the constraints.
Differential Revision: https://reviews.llvm.org/D91936
Previously when trying to support CoroSplit's function splitting, we
added in a hack that simply added the new function's node into the
original function's SCC (https://reviews.llvm.org/D87798). This is
incorrect since it might be in its own SCC.
Now, more similar to the previous design, we have callers explicitly
notify the LazyCallGraph that a function has been split out from another
one.
In order to properly support CoroSplit, there are two ways functions can
be split out.
One is the normal expected "outlining" of one function into a new one.
The new function may only contain references to other functions that the
original did. The original function must reference the new function. The
new function may reference the original function, which can result in
the new function being in the same SCC as the original function. The
weird case is when the original function indirectly references the new
function, but the new function directly calls the original function,
resulting in the new SCC being a parent of the original function's SCC.
This form of function splitting works with CoroSplit's Switch ABI.
The second way of splitting is more specific to CoroSplit. CoroSplit's
Retcon and Async ABIs split the original function into multiple
functions that all reference each other and are referenced by the
original function. In order to keep the LazyCallGraph in a valid state,
all new functions must be processed together, else some nodes won't be
populated. To keep things simple, this only supports the case where all
new edges are ref edges, and every new function references every other
new function. There can be a reference back from any new function to the
original function, putting all functions in the same RefSCC.
This also adds asserts that all nodes in a (Ref)SCC can reach all other
nodes to prevent future incorrect hacks.
The original hacks in https://reviews.llvm.org/D87798 are no longer
necessary since all new functions should have been registered before
calling updateCGAndAnalysisManagerForPass.
This fixes all coroutine tests when opt's -enable-new-pm is true by
default. This also fixes PR48190, which was likely due to the previous
hack breaking SCC invariants.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D93828
This patch
- Adds containsPoisonElement that checks existence of poison in constant vector elements,
- Renames containsUndefElement to containsUndefOrPoisonElement to clarify its behavior & updates its uses properly
With this patch, isGuaranteedNotToBeUndefOrPoison's tests w.r.t constant vectors are added because its analysis is improved.
Thanks!
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D94053
Check if all possible values for a pair of knownbits give the same icmp result - these are based off the checks performed in InstCombineCompares.cpp and D86578.
Add exhaustive unit test coverage - a followup will update InstCombineCompares.cpp to use this.