This teaches obj2yaml to dump valid regular (not thin) archives.
This also teaches yaml2obj to recognize archives YAML descriptions,
what allows to craft all different kinds of archives (valid and broken ones).
Differential revision: https://reviews.llvm.org/D89949
The removed tests are handled by optimization passes before code gen and
therefore are just a distraction when making code gen changes that may
(as a side effect) reimplement earlier optimization work as a side effect.
Specifically, the following tests where removed:
ult_0_v* -> false
ult_1_v* -> x == 0
ugt_0_v* -> x != 0
ult_{size-of-element-plus-one}_v* -> true
ugt_{size-of-element}_v* -> false
ult_{size-of-element}_v* -> x != mask
ugt_{size-of-element-minus-one}_v* -> x == mask
This reverts commit e038b60d9169733367393f733058f0ff23c28d3f.
This reverts commit a0d84d80315d0c725b5efcd889928bad1171ba56.
This revert was a mistake. The reason of the failures was
"Use uint64_t for branch weights instead of uint32_t"
Differential Revision: https://reviews.llvm.org/D87832
Our "implicit" sections are handled separately from regular ones.
It turns out that the "Offset" key is not handled properly for them.
Perhaps we can generalize handling in one place, but before doing that I'd like
to add support and test cases for each implicit section.
(I need this particular single change to unblock another patch that is already on review,
and I guess doing it independently for each section will be cleaner, see below).
In this patch I've removed `explicit-dynsym-no-dynstr.yaml` to `dynsym-section.yaml`
and added the new test into. In a follow-up we probably might want
to merge 2 another existent `dynsymtab-*.yaml` tests into it too.
Differential revision: https://reviews.llvm.org/D90224
Instead of getting the defining access we should be able to use
getClobberingMemoryAccess to skip non-aliasing MemoryDefs. No additional
checks should be needed, because we only remove the starting def if it
matches the defining access of the load. All we need to worry about is
that there are no (may)alias stores between the starting def and the
load and getClobberingMemoryAccess should guarantee that.
Partly fixes PR47887.
This improves the number of redundant stores removed in some cases
(numbers below for MultiSource, SPEC2000, SPEC2006 on X86 with -flto
-O3).
Same hash: 226 (filtered out)
Remaining: 11
Metric: dse.NumRedundantStores
Program base patch1 diff
test-suite...:: External/Povray/povray.test 1.00 5.00 400.0%
test-suite...chmarks/MallocBench/gs/gs.test 1.00 3.00 200.0%
test-suite...0/253.perlbmk/253.perlbmk.test 21.00 37.00 76.2%
test-suite...0.perlbench/400.perlbench.test 24.00 37.00 54.2%
test-suite.../Applications/SPASS/SPASS.test 3.00 4.00 33.3%
test-suite...006/453.povray/453.povray.test 15.00 18.00 20.0%
test-suite...T2006/445.gobmk/445.gobmk.test 27.00 29.00 7.4%
test-suite.../CINT2006/403.gcc/403.gcc.test 136.00 137.00 0.7%
test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0%
test-suite.../Benchmarks/Bullet/bullet.test NaN 3.00 nan%
test-suite.../Benchmarks/Ptrdist/bc/bc.test NaN 1.00 nan%
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D89647
BRIND and BR_JT are not implmented yet, so expand them atm.
Add regression tests too.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D90283
The addition of TILELOADD instructions with a new encoding format
triggered a hard abort instead of proper error reporting due to the use
of `llvm_unreachable` for actually reachable code.
Properly report an error when the encoding format is unknown.
Differential Revision: https://reviews.llvm.org/D90289
When we need to prove implication of expressions of different type width,
the default strategy is to widen everything to wider type and prove in this
type. This does not interact well with AddRecs with negative steps and
unsigned predicates: such AddRec will likely not have a `nuw` flag, and its
`zext` to wider type will not be an AddRec. In contraty, `trunc` of an AddRec
in some cases can easily be proved to be an `AddRec` too.
This patch introduces an alternative way to handling implications of different
type widths. If we can prove that wider type values actually fit in the narrow type,
we truncate them and prove the implication in narrow type.
The return was due to revert of underlying patch that this one depends on.
Unit test temporarily disabled because the required logic in SCEV is switched
off due to compile time reasons.
Differential Revision: https://reviews.llvm.org/D89548
vnot (xor -1) should be equivalent to the AArch64 specific AArch64ISD::NOT
node, but allow more folding thanks to all the target independent
optimizations. Specifically this allows select(icmp ne, x, y) to
become "cmeq; bsl y, x" as opposed to needing to convert the predicate
with "cmeq; mvn; bsl x, y"
Unfortunately there is a regression in a cmtst test, but the code it
selected from was already non-canonical, with instcombine preferring to
use an eq predicate instead. Plus the more common case of icmp ne is
improved.
Differential Revision: https://reviews.llvm.org/D90126
The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining
vs table-based exception handling. x86-32 is the only arch for which Windows
uses the former. 32-bit ARM would use what is called Win64SEH today, which
is a bit confusing so instead let's just rename it to be a bit more clear.
Reviewed By: compnerd, rnk
Differential Revision: https://reviews.llvm.org/D90117
We can sharpen the range of a AddRec if we know that it does not
self-wrap and know the symbolic iteration count in the loop. If we can
evaluate the value of AddRec on the last iteration and prove that at least
one its intermediate value lies between start and end, then no-wrap flag
allows us to conclude that all of them also lie between start and end. So
the estimate of range can be improved to union of ranges of start and end.
Switched off by default, can be turned on by flag.
Differential Revision: https://reviews.llvm.org/D89381
Reviewed By: lebedev.ri, nikic
This patch removes extraneous calls to setEdgeProbability introduced
in c91487769d80487eba1712a7a172a1c8977a9b4f.
The follow-up patch, a7b662d0f4098371b96ce4446fb0eba79b0b649f, has
since fixed BranchProbabilityInfo::eraseBlock, so we don't need to
worry about getting stale values from getEdgeProbability.
Also, since getEdgeProbability(BB, BB->getSingleSuccessor()) returns
edge probability 1/1 by default for BB with exactly one successor
edge, we don't need to explicitly call setEdgeProbability.
This patch introduces almost no functional change, but we do end up
reducing debug messages from setEdgeProbability.
Differential Revision: https://reviews.llvm.org/D90284
This reverts commit 5b3bf8b453b8cc00efd5269009a1e63c4442a30e.
This caused a regression in the ASan buildbot. See comments at
https://reviews.llvm.org/D89817 for more information.
For any newly added parse function, clang-tidy complains. New parse
functions are implicitly defined by a macro "Parse##CLASS(N, IsDistinct)".
Now this macro and exising function definitions are corrected (lower case
first character). Some other variable/function names are also corrected
to comply LLVM coding style.
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D90243
SIPreAllocateWWMRegs was being inserted after RegisterCoalescer
but this pass does not exist during FastAlloc so pre-allocation
pass was never being run.
Insert pre-allocation after TwoAddressInstructionPass instead.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D90236
Before we used to only mark unreachable static functions as dead if all
uses were known dead. Now we optimistically assume uses to be dead until
proven otherwise.
If we are looking at a call site argument it might be a load or call
which is in a different context than the call site argument. We cannot
simply use the call site argument range for the call or load.
Bug reported and reduced by Whitney Tsang <whitneyt@ca.ibm.com>.
In the AANoAlias logic we determine if a pointer may have been captured
before a call. We need to look at other uses in the call not uses of the
call.
The new code is not perfect as it does not allow trivial cases where the
call has multiple arguments but it is at least not unsound and a TODO
was added.
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
If `null_pointer_is_valid` is present, `dereferenceable` does not imply
`nonnull`, make it clear.
Came up in D17993.
Reviewed By: aqjune
Differential Revision: https://reviews.llvm.org/D89417
This patch ensures that BranchProbabilityInfo::eraseBlock(BB) deletes
all entries in Probs associated with with BB.
Without this patch, stale entries for BB may remain in Probs after
eraseBlock(BB), leading to a situation where a newly created basic
block has an edge probability associated with it even before the pass
responsible for creating the basic block adds any edge probability to
it.
Consider the current implementation of eraseBlock(BB):
for (const_succ_iterator I = succ_begin(BB), E = succ_end(BB); I != E; ++I) {
auto MapI = Probs.find(std::make_pair(BB, I.getSuccessorIndex()));
if (MapI != Probs.end())
Probs.erase(MapI);
}
Notice that it uses succ_begin(BB) and succ_end(BB), which are based
on BB->getTerminator(). This means that if the terminator changes
between calls to setEdgeProbability and eraseBlock, then we may not
examine all pairs associated with BB.
This is exactly what happens in MaybeMergeBasicBlockIntoOnlyPred,
which merges basic blocks A into B if A is the sole predecessor of B,
and B is the sole successor of A. It replaces the terminator of A
with UnreachableInst before (indirectly) calling eraseBlock(A).
The patch fixes the problem by keeping track of all edge probablities
entered with setEdgeProbability in a map from BasicBlock* to a
successor index.
Differential Revision: https://reviews.llvm.org/D90272
This patch teaches the jump threading pass to set edge probabilities
whenever the pass creates new basic blocks.
Without this patch, the compiler sometimes produces non-deterministic
results. The non-determinism comes from the jump threading pass using
stale edge probabilities in BranchProbabilityInfo. Specifically, when
the jump threading pass creates a new basic block, we don't initialize
its outgoing edge probability.
Edge probabilities are maintained in:
DenseMap<Edge, BranchProbability> Probs;
in class BranchProbabilityInfo, where Edge is an ordered pair of
BasicBlock * and a successor index declared as:
using Edge = std::pair<const BasicBlock *, unsigned>;
Probs maps edges to their corresponding probabilities.
Now, we rarely remove entries from this map, so if we happen to
allocate a new basic block at the same address as a previously deleted
basic block with an edge probability assigned, the newly created basic
block appears to have an edge probability, albeit a stale one.
This patch fixes the problem by explicitly setting edge probabilities
whenever the jump threading pass creates new basic blocks.
Differential Revision: https://reviews.llvm.org/D90106
This was originally part of:
f2c25c70791d
but that was reverted because there was an underlying bug in
processing the vector type of these intrinsics. That was
fixed with:
74ffc823ed21
This is similar in spirit to 01ea93d85d6e (memcpy) except that
here the underlying caller assumptions were created for vectorizer
use (throughput) rather than other passes.
That meant targets could have an enormous throughput cost with no
corresponding size, latency, or blended cost increase.
Paraphrasing from the previous commits:
This may not make sense for some callers, but at least now the
costs will be consistently wrong instead of mysteriously wrong.
Targets should provide better overrides if the current modeling
is not accurate.
When converting a BUILD_VECTOR or VECTOR_SHUFFLE to a splatting load
as of 1461fb6e783cb946b061f66689b419f74f7fad63, we inaccurately check
for a single user of the load and neglect to update the users of the
output chain of the original load. As a result, we can emit a new
load when the original load is kept and the new load can be reordered
after a dependent store. This patch fixes those two issues.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47891
Summary:
This patch adds support for passing in the original delcaration name in the
source file to the libomptarget runtime. This will allow the runtime to provide
more intelligent debugging messages. This patch takes the original expression
parsed from the OpenMP map / update clause and provides a textual
representation if it was explicitly mapped, otherwise it takes the name of the
variable declaration as a fallback. The information in passed to the runtime in
a global array of strings that matches the existing ident_t source location
strings using ";name;filename;column;row;;". See
clang/test/OpenMP/target_map_names.cpp for an example of the generated output
for a given map clause.
Reviewers: jdoervert
Differential Revision: https://reviews.llvm.org/D89802