Summary:
This would have been marginally useful to me during/for rG7ea46aee3670981827c04df89b2c3a1cbdc7561b.
With ongoing migration to representing assumes via operand bundles on the assume, this will be gradually more useful.
Reviewers: nickdesaulniers, diegotf, dblaikie, george.burgess.iv, jdoerfert, Tyker
Reviewed By: nickdesaulniers
Subscribers: hiraditya, mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83177
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.
Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers
Reviewed By: nickdesaulniers
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83248
This diff merges all options for llvm-install-name-tool under a single
function processLoadCommands. Also adds another test case for -add_rpath
option.
Test plan: make check-all
Reviewed by: jhenderson, alexshap, smeenai, Ktwu
Differential Revision: https://reviews.llvm.org/D82812
As reported in https://reviews.llvm.org/D83101#2133062
the new visitInsertElementInst()/visitExtractElementInst() functionality
is causing miscompiles (previously-crashing test added)
It is due to the fact how the infra of Scalarizer is dealing with DCE,
it was not updated or was it ready for such scalar value forwarding.
It always assumed that the moment we "scalarized" something,
it can go away, and did so with prejudice.
But that is no longer safe/okay to do.
Instead, let's prevent it from ever shooting itself into foot,
and let's just accumulate the instructions-to-be-deleted
in a vector, and collectively cleanup (those that are *actually* dead)
them all at the end.
All existing tests are not reporting any new garbage leftovers,
but maybe it's test coverage issue.
Even though wide vectors are legal they still cost more as we
will have to eventually split them. Not all operations can
be uniformly done on vector types.
Conservatively add the cost of splitting at least to 8 dwords,
which is our widest possible load.
We are more or less lying to cost mode with this change but
this can prevent vectorizer from creation of wide vectors which
results in RA problems for us.
Differential Revision: https://reviews.llvm.org/D83078
The new option works like the existing LLVM_TABLEGEN, and
LLVM_CONFIG_PATH options. Instead of building llvm-nm, the build uses
the executable defined by LLVM_NM.
This is useful for cross-compilation scenarios where the host cannot run
the cross-compiled tool, and recursing into another cmake build is not
an option (due to required DEFINE's, for example).
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D83022
Summary:
Aside from unifying the code a bit, this change smooths the
transition to use of future "opaque generic block references"
in the type-erased dominator tree base class.
Change-Id: If924b092cc8561c4b6a7450fe79bc96df0e12472
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: wdng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83086
Summary:
Avoid exposing details about how roots are stored. This enables subsequent
type-erasure changes.
v5:
- cleanup a unit test by using EXPECT_EQ instead of EXPECT_TRUE
Change-Id: I532b774cc71f2224e543bc7d79131d97f63f093d
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: jvesely, wdng, hiraditya, kuhar, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83085
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.
New methods are introduced to cover common access patterns.
Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83083
This covers both the existing memory functions as well as the new bulk memory proposal.
Added new test files since changes where also required in the inputs.
Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit.
Differential Revision: https://reviews.llvm.org/D82821
Summary:
Change MCExpr to support Aurora VE's modifiers. Change asmparser to use
existing MCExpr parser (parseExpression) to parse an expression contining
symbols with modifiers and offsets. Also add several regression tests
of MC layer.
Reviewers: simoll, k-ishizaka
Reviewed By: simoll
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #ve
Differential Revision: https://reviews.llvm.org/D83170
Summary: Change to use isa instead of dyn_cast to avoid a warning.
Reviewers: simoll, k-ishizaka
Reviewed By: simoll
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #ve
Differential Revision: https://reviews.llvm.org/D83200
This was resulting in a missing vreg def in the use select
instruction.
The output of the pseudo doesn't make sense, since it really shouldn't
have the vreg output in the first place, and instead an implicit scc
def to match the real scalar behavior.
We could have easier to understand tests if we selected scalar
versions of the [us]{add|sub}.with.overflow intrinsics.
This does still end up producing vector code in the end, since it gets
moved later.
Summary: This is a complementary patch to D82100 since the aix builbot is still running the unsupported test shtest-format-argv0. Add system-aix to the sub llvm-lit config.
Reviewers: daltenty, hubert.reinterpretcast
Reviewed By: hubert.reinterpretcast
Subscribers: delcypher, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82905
We can often fold an ADDI into the offset of load/store instructions:
(load (addi base, off1), off2) -> (load base, off1+off2)
(store val, (addi base, off1), off2) -> (store val, base, off1+off2)
This is possible when the off1+off2 continues to fit the 12-bit immediate.
We remove the previous restriction where we would never fold the ADDIs if
the load/stores had nonzero offsets. We now do the fold the the resulting
constant still fits a 12-bit immediate, or if off1 is a variable's address
and we know based on that variable's alignment that off1+offs2 won't overflow.
Differential Revision: https://reviews.llvm.org/D79690
Summary:
When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.
Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L
Differential Revision: https://reviews.llvm.org/D82481
* The getLine and getColumn functions need to update the position, or
they will return stale data for buffered streams. This fixes a bug in
the clang -analyzer-checker-option-help option, which was not wrapping
the help text correctly when stdout is not a TTY.
* If the stream contains multi-byte UTF-8 sequences, then the whole
sequence needs to be considered to be a single character. This has the
edge case that the buffer might fill up and be flushed part way
through a character.
* If the stream contains East Asian wide characters, these will be
rendered twice as wide as other characters, so we need to increase the
column count to match.
This doesn't attempt to handle everything unicode can do (combining
characters, right-to-left markers, ...), but hopefully covers most
things likely to be common in messages and source code we might want to
print.
Differential revision: https://reviews.llvm.org/D76291
This reverts commit d3e3f36ff1151f565730977ac4f663a2ccee48ae,
which reverter the original commit 2c16100e6f72075564ea1f67fa5a82c269dafcd3,
but with polly tests now actually passing.
This adjusts the MVE fp16 cost model, similar to how we already do for
integer casts. It uses the base cost of 1 per cvt for most fp extend /
truncates, but adjusts it for loads and stores where we know that a
extending load has been used to get the load into the correct lane, and
only an MVE VCVTB is then needed.
Differential Revision: https://reviews.llvm.org/D81813
This adds some default costs for fp extends and truncates, generally
costing them as 1 per lane. If the type is not legal then the cost will
include a call to an __aeabi_ function.
Some NEON code is also adjusted to make sure it applies to the expected
types, now that fp16 is a more common thing.
Differential Revision: https://reviews.llvm.org/D82458
This matches the DAG behavior where this is called after the loop
checking for calls. The AMDGPU implementation depends on knowing if
there are calls in the function or not, so move this later.
Another problem is finalizeLowering is actually called twice; I was
seeing weird inconsistencies since the first call would produce
unexpected results and the second run would correct them in some
contexts. Since this requires disabling the verifier, and it's useful
to serialize the MIR immediately after selection, FinalizeISel should
probably not be a real pass.
The default constructor wasn't setting isSet o the ArgDescriptor, so
while these had the value set, they were treated as missing. This only
ended up mattering in the indirect call case (and for regular calls in
GlobalISel, which current doesn't have a way to support the variable
ABI).
Summary: As Bugzilla-35090 reported, the rationale for using custom lowering SREM/UREM should no longer be true. At the IR level, the div-rem-pairs pass performs the transformation where the remainder is computed from the result of the division when both a required. We should now be able to lower these directly on P9. And the pass also fixed the problem that divide is in a different block than the remainder. This is a patch to remove redundant code and make SREM/UREM legal directly on P9.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D82145
Use a simpler code sequence when the shift amount is known not to be
zero modulo the bit width.
Nothing much uses this until D77152 changes the translation of fshl and
fshr intrinsics.
Differential Revision: https://reviews.llvm.org/D82540
Using a negation instead of a subtraction from a constant can save an
instruction on some targets.
Nothing much uses this until D77152 changes the translation of fshl and
fshr intrinsics.
Differential Revision: https://reviews.llvm.org/D82539
Adds implementation of getMainExecutable() and is_local_impl() to
Support/Unix/Path.inc. Both are needed to compile LLVM for z/OS.
Reviewed By: hubert.reinterpretcast, emaste
Differential Revision: https://reviews.llvm.org/D82544
This is needed to build LLVM on z/OS, as there is no header file
which provides these constants.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D82368
Compilers may evaluate call arguments in different order,
which would result in different order of IR, which would break the tests.
Spotted thanks to Dmitri Gribenko!
This expands the existing extend costs with a few extras for larger
types than legal, which will usually be split under MVE. It also adds
trunk support for the same thing. These should not have a large effect
on many things, but makes the costs explicit and keeps a certain balance
between the trunks and extends.
Differential Revision: https://reviews.llvm.org/D82457
Summary:
I'm interested in taking the original C++ input,
for which we currently are stuck with an alloca
and producing roughly the lower IR,
with neither an alloca nor a vector ops:
https://godbolt.org/z/cRRWaJ
For that, as intermediate step, i'd to somehow perform scalarization.
As per @arsenmn suggestion, i'm trying to see if scalarizer can help me
avoid writing a bicycle.
I'm not sure if it's really intentional that variable insert is not handled currently.
If it really is, and is supposed to stay that way (?), i guess i could guard it..
See [[ https://bugs.llvm.org/show_bug.cgi?id=46524 | PR46524 ]].
Reviewers: bjope, cameron.mcinally, arsenm, jdoerfert
Reviewed By: jdoerfert
Subscribers: arphaman, uabelho, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82961