These currently always require a type parameter. The bitcode reader
already upgrades old bitcode without the type parameter to use the
pointee type.
In cases where the caller does not have byval but the callee does, we
need to follow CallBase::paramHasAttr() and also look at the callee for
the byval type so that CallBase::isByValArgument() and
CallBase::getParamByValType() are in sync. Do the same for preallocated.
While we're here add a corresponding version for inalloca since we'll
need it soon.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D104663
Chrome OS and Android have found it useful to have an automated revert
checker. It was requested to upstream it, since other folks in the LLVM
community may also find value in it.
The tests depend on having a full (non-shallow) checkout of LLVM. This
seems reasonable to me, since:
- the tests should only be run if the user is developing on this script
- it's kind of hard to develop on this script without local git history
:)
If people really want, the tests' dependency on LLVM's history can be
removed. It's mostly just effort/complexity that doesn't seem necessary.
Differential Revision: https://reviews.llvm.org/D105578
This commit also makes some slight changes to the scheduling model for AMDGPU to set the RetireOOO flag for all scheduling classes.
This flag is only used by llvm-mca and allows instructions to retire out of order.
See the differential link below for a deeper explanation of everything.
Differential Revision: https://reviews.llvm.org/D104730
Avoid enumerating all attributes here and instead use
getNameFromAttrKind(), which is based on the tablegen data.
This only leaves us with custom handling for int attributes,
which don't have uniform printing.
Part of https://lists.llvm.org/pipermail/llvm-dev/2021-July/151622.html
"Binary utilities: switch command line parsing from llvm::cl to OptTable"
Users should generally observe no difference as long as they only use intended
option forms. Behavior changes:
* `-t=d` is removed. Use `-t d` instead.
* `--demangle=0` cannot be used. Omit the option or use `--no-demangle` instead.
* `--help-list` is removed. This is a `cl::` specific option.
Note:
* `-t` diagnostic gets improved.
* This patch avoids cl::opt collision if we decide to support multiplexing for binary utilities
* One-dash long options are still supported.
* The `-s` collision (`-s segment section` for Mach-O) is unfortunate. `-s` means `--print-armap` in GNU nm.
* This patch removes the last `cl::multi_val` use case from the `llvm/lib/Support/CommandLine.cpp` library
`-M` (`--print-armap`), `-U` (`--defined-only`), and `-W` (`--no-weak`)
are now deprecated. They could conflict with future GNU nm options.
(--print-armap has an existing alias -s, so GNU will unlikely add a new one.
--no-weak (not in GNU nm) is rarely used anyway.)
`--just-symbol-name` is now deprecated in favor of
`--format=just-symbols` and `-j`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D105330
This patch prevents GlobalISel from optimizing out redundant branch
instructions when compiling without optimizations.
The motivating example is code like the following common pattern in
Swift, where users expect to be able to set a breakpoint on the early
exit:
public func f(b: Bool) {
guard b else {
return // I would like to set a breakpoint here.
}
...
}
The patch modifies two places in GlobalISEL: The first one is in
IRTranslator.cpp where the removal of redundant branches is made
conditional on the optimization level. The second one is in
AArch64InstructionSelector.cpp where an -O0 *only* optimization is
being removed.
Disabling these optimizations increases code size at -O0 by
~8%. However, doing so improves debuggability, and debug builds are
the primary reason why developers compile without optimizations. We
thus concluded that this is the right trade-off.
rdar://79515454
Differential Revision: https://reviews.llvm.org/D105238
There are some patterns involving the permuted scalar to vector node
for which we don't have patterns without direct moves on little endian
subtargets. This causes selection errors. While we can of course add
the missing patterns, any additional effort to make this work is not
useful since there is no support for any CPU that can run in
little endian mode and does not support direct moves.
This adds support for opaque pointers to expandAddToGEP() by always
generating an i8 GEP for opaque pointers. After looking at some other
cases (constexpr GEP folding, SROA GEP generation), I've come around
to the idea that we should use i8 GEPs for opaque pointers, because
the alternative would be to guess a GEP type from surrounding code,
which will not be reliable. Ultimately, i8 GEPs is where we want to
end up anyway, and opaque pointers just make that the natural choice.
There are a couple of other places in SCEVExpander that check pointer
element types, I plan to update those when I run across usable test
coverage that doesn't assert elsewhere.
Differential Revision: https://reviews.llvm.org/D105398
The reduction matching was probably only dealing with binops
when it was written, but we have now generalized it to handle
select and intrinsics too, so assert on that too.
Add a function removePointerBase that returns, essentially, S -
getPointerBase(S). Use it in getMinusSCEV instead of actually
subtracting pointers.
Differential Revision: https://reviews.llvm.org/D105503
Match whats documented in the Intel AOM - almost all the conversion instructions requires BOTH ports (apart from the MMX cvtpi2ps/cvtpi2ps instructions which we already override) - this was being incorrectly modelled as EITHER port.
Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.
Before this patch we would normally use the ABI alignment which can be
to high for the context alginment.
For spilled values we don't need ABI alignment, since the frame entry's
address is not escaped.
rdar://79664965
Differential Revision: https://reviews.llvm.org/D105288
Resubmit after the following changes:
* Fix a latent bug related to unrolling with required epilogue (see e49d65f). I believe this is the cause of the prior PPC buildbot failure.
* Disable non-latch exits for epilogue vectorization to be safe (9ffa90d)
* Split out assert movement (600624a) to reduce churn if this gets reverted again.
Previous commit message (try 3)
Resubmit after fixing test/Transforms/LoopVectorize/ARM/mve-gather-scatter-tailpred.ll
Previous commit message...
This is a resubmit of 3e5ce4 (which was reverted by 7fe41ac). The original commit caused a PPC build bot failure we never really got to the bottom of. I can't reproduce the issue, and the bot owner was non-responsive. In the meantime, we stumbled across an issue which seems possibly related, and worked around a latent bug in 80e8025. My best guess is that the original patch exposed that latent issue at higher frequency, but it really is just a guess.
Original commit message follows...
If we know that the scalar epilogue is required to run, modify the CFG to end the middle block with an unconditional branch to scalar preheader. This is instead of a conditional branch to either the preheader or the exit block.
The motivation to do this is to support multiple exit blocks. Specifically, the current structure forces us to identify immediate dominators and *which* exit block to branch from in the middle terminator. For the multiple exit case - where we know require scalar will hold - these questions are ill formed.
This is the last change needed to support multiple exit loops, but since the diffs are already large enough, I'm going to land this, and then enable separately. You can think of this as being NFCIish prep work, but the changes are a bit too involved for me to feel comfortable tagging the review that way.
Differential Revision: https://reviews.llvm.org/D94892
The build system was linking the PluginsTests unittest against libLLVM.so
and LLVMAsmParser which was causing the test to fail with this error:
LLVM ERROR: inconsistency in registered CommandLine options
We need to add llvm libraries to LLVM_LINK_COMPONENTS so that
they are dropped from the linker arguments when linking with
LLVM_LINK_LLVM_DYLIB=ON
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D105523
The Legalizer expands the operations of urem/srem into a div+mul+sub or divrem
when those are legal/custom. This patch changes the cost-model to reflect that
cost.
Since there is no 'divrem' Instruction in LLVM IR, the cost of divrem
is assumed to be the same as div+mul+sub since the three operations will
need to be executed at runtime regardless.
Patch co-authored by David Sherwood (@david-arm)
Reviewed By: RKSimon, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D103799
Adding usage of VSSRC and VSFRC when adding the live in registers on AIX.
This matches the behaviour of the rest of PPC Subtargets.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D104396
Update costs based on the worst case costs from the script in D103695.
Move to using legalized types wherever possible, which allows us to prune the cost tables.
Update (mainly) vXi8/vXi16 -> vXf32/vXf64 sitofp/uitofp costs based on the worst case costs from the script in D103695.
Move to using legalized types wherever possible, which allows us to prune the cost tables.
We already have reassociation code for Adds and Ors separately in DAG
combiner, this adds it for the combination of the two where Ors act like
Adds. It reassociates (add (or (x, c), y) -> (add (add (x, y), c)) where
we know that the Ors operands have no common bits set, and the Or has
one use.
Differential Revision: https://reviews.llvm.org/D104765
We have several checks for both cl::opt and OptLevel over our
pass config, although these checks do not properly work if
default value of a cl::opt will be false. Create a helper to
use instead and properly handle it. NFC for now.
Differential Revision: https://reviews.llvm.org/D105517
Before we replaced value by registering all their uses. However, as we
replace a value old uses become stale. We now replace values explicitly
and keep track of "new values" when doing so to avoid replacing only
uses in stale/old values but not their replacements.
We often need to deal with the value lattice that contains none and
undef as special values. A simple helper makes this much nicer.
Differential Revision: https://reviews.llvm.org/D103857
When we do simplification via AAPotentialValues or AAValueConstantRange
we need to simplify the operands of an instruction we deconstruct first.
This does not only improve the result, see for example range.ll, but is
required as we allow outside AAs to provide simplification rules via
callbacks. If we do ignore the simplification rules and base other
simplifications on the IR instead we can create an inconsistent state.
This will use the python that LLVM was configured to use rather than
python from PATH.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D105224
The combine was disabled in 4e22c7265d86 as it caused failures in
the ppc64be-multistage (bootstrap) bot.
It turns out that the combine did not correctly update the MMO for
the high load which caused aliased stores to be reported as unaliased.
This patch fixes that problem and re-enables the combine.
There are cases where infer address spaces pass cannot yet
infer an address space in the opt pipeline and then in the
llc pipeline it runs too late for atomic expand pass to
benefit from a specific address space.
Move atomic expand pass past the infer address spaces.
Fixes: SWDEV-293410
Differential Revision: https://reviews.llvm.org/D105511