This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and
is not the right patch according to the comments in D91724.
This reverts commit 42eaf4fe0adef3344adfd9fbccd49f325cb549ef.
(cherry picked from commit 34badc409cc452575c538c4b6449546adc38f121)
In case of a virtual register tied to a phys-def, the register class needs to
be computed. Make sure that this works generally also with fast regalloc by
using TLI.getRegClassFor() whenever possible, and use
getMinimalPhysRegClass() only in the 'Untyped' case.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51699.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109291
(cherry picked from commit 118997d8e931dcb4c6e972611a7e4febcc33a061)
When expanding a SMULFIXSAT ISD node (usually originating from
a smul.fix.sat intrinsic) we've applied some optimizations for
the special case when the scale is zero. The idea has been that
it would be cheaper to use an SMULO instruction (if legal) to
perform the multiplication and at the same time detect any overflow.
And in case of overflow we could use some SELECTs to replace the
result with the saturated min/max value. The only tricky part
is to know if we overflowed on the min or max value, i.e. if the
product is positive or negative. Unfortunately the implementation
has been incorrect as it has looked at the product returned by the
SMULO to determine the sign of the product. In case of overflow that
product is truncated and won't give us the correct sign bit.
This patch adds an extra XOR of the multiplication operands,
which is used to determine the sign of the non-truncated product.
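The sign issue can be illustrated outside of SelectionDAG with a minimal
sketch. This is not the code from the patch: the function name
smulSatScale0 is hypothetical, the width is fixed at 32 bits, and the
GCC/Clang builtin __builtin_mul_overflow stands in for SMULO.

```cpp
#include <cstdint>
#include <limits>

// Saturating signed 32-bit multiply, i.e. the scale == 0 case.
// Hypothetical helper for illustration only.
int32_t smulSatScale0(int32_t lhs, int32_t rhs) {
  int32_t product;
  // Stand-in for SMULO: yields the truncated product plus an overflow flag.
  bool overflow = __builtin_mul_overflow(lhs, rhs, &product);
  if (!overflow)
    return product;
  // On overflow `product` is truncated, so its sign bit is unreliable.
  // Neither operand can be zero here, so the true product is negative
  // exactly when the operand signs differ: the sign bit of (lhs ^ rhs).
  bool negative = (lhs ^ rhs) < 0;
  return negative ? std::numeric_limits<int32_t>::min()
                  : std::numeric_limits<int32_t>::max();
}
```

For 0x40000000 * 2 the truncated product has its sign bit set even though
the real product is positive, which is the case the old sign test got wrong.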
This patch fixes PR51677.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D108938
(cherry picked from commit 789f01283d52065b10049b58a3288c4abd1ef351)
This patch is a revert of e08f205f5c2c. In that patch, DW_TAG_subprograms
were permitted to be referenced across CU boundaries, to improve stack
trace construction using call site information. Unfortunately, as
documented in PR48790, the way that subprograms are "owned" by dwarf units
is sufficiently complicated that subprograms end up in unexpected units,
invalidating cross-unit references.
There's no obvious way to easily fix this, and several attempts have
failed. Revert this to ensure correct DWARF is always emitted.
Three tests change in addition to the reversion, but they're all very
light alterations.
Differential Revision: https://reviews.llvm.org/D107076
(cherry picked from commit d4ce9e463d51b18547dbd181884046abf77c5c91)
Signed-off-by: Jeremy Morse <jeremy.morse@sony.com>
Conflicts:
llvm/test/DebugInfo/X86/convert-loclist.ll
visitEXTRACT_SUBVECTOR can sometimes create illegal BITCASTs when
removing "redundant" INSERT_SUBVECTOR operations. This patch adds
an extra check to ensure such combines only occur after operation
legalisation if any resulting BITCAST is itself legal.
Differential Revision: https://reviews.llvm.org/D108086
(cherry picked from commit cd0e1964137f1cd7b508809ec80c7d9dcb3f0458)
The introduction of `SHF_GNU_RETAIN` has caused massive problems on Solaris.
Initially, as reported in Bug 49437, it caused dozens of testsuite failures
on both sparc and x86. The objects were marked as `ELFOSABI_NONE`, but
`SHF_GNU_RETAIN` is a GNU extension. In the native Solaris ABI, that flag
(in the range for OS-specific values) is `SHF_SUNW_ABSENT` with
completely different semantics, which confuses Solaris `ld` very much.
Later, the objects became (correctly) marked `ELFOSABI_GNU`, which Solaris
`ld` doesn't support, causing it to SEGV and break the build. The linker
is currently being hardened to not accept non-native OS ABIs to avoid this.
The need for linker support is already documented in
`clang/include/clang/Basic/AttrDocs.td`, but not currently checked.
This patch avoids all this by not emitting `SHF_GNU_RETAIN` on Solaris at all.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D107747
(cherry picked from commit 7bbbf2956181f375ab193321b37ea71c5fc44054)
This transform was added with D58874, but there were no tests for overflow ops.
We need to change this one way or another because it can crash as shown in:
https://llvm.org/PR51238
Note that if there are no uses of an overflow op's bool overflow result, we
reduce it to a regular math op, so we continue to fold that case either way.
If we have uses of both the math and the overflow bool, then we are likely
not saving anything by creating an independent sub instruction as seen in
the test diffs here.
This patch makes the behavior in SDAG consistent with what we do in
instcombine AFAICT.
Differential Revision: https://reviews.llvm.org/D106983
(cherry picked from commit fa6b2c9915ba27e1e97f8901ea4aa877f331fb9f)
This patch legalizes the Machine Value Type introduced in D94096 for loads
and stores. A new target hook named getAsmOperandValueType() is added which
maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization.
Differential Revision: https://reviews.llvm.org/D94097
Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly
operands which materialize a sequence of eight general purpose registers.
Differential Revision: https://reviews.llvm.org/D94096
When we have a terminator sequence (i.e. a tailcall or return),
MIIsInTerminatorSequence is used to work out where the preceding ABI-setup
instructions end, i.e. the parts that were glued to the terminator
instruction. This allows LLVM to split blocks safely without having to
worry about ABI stuff.
The function only ignores DBG_VALUE instructions, meaning that the two
debug instructions I recently added can end terminator sequences early,
causing various MachineVerifier errors. This patch promotes the test for
debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any
debug-info interfering with this function.
Differential Revision: https://reviews.llvm.org/D106660
(cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)
This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))`
if the SETCC is only used by BRCOND.
Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D105344
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target.
- Further improvements and additions will be made in future patches.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D106380
Avoid several crashes when DBG_INSTR_REF and DBG_PHI instructions are fed
to the instruction scheduler. DBG_INSTR_REFs should be treated like
DBG_LABELs, and just ignored for the purpose of scheduling [0].
DBG_PHIs however behave much more like DBG_VALUEs: they refer to register
operands, and if some register defs get shuffled around during instruction
scheduling, there's a risk that the debug instr will refer to the wrong
value. There's already a facility for updating DBG_VALUEs to reflect this;
add DBG_PHI to the list of instructions that it will update.
[0] Suboptimal, but it's what instr scheduling does right now.
Differential Revision: https://reviews.llvm.org/D106663
When working out which instruction defines a value, the
instruction-referencing variable location code has a few special cases for
physical registers:
* Arguments are never defined by instructions,
* Constant physical registers always read the same value and are never def'd.
This patch adds a third case for the llvm.frameaddress intrinsics: you can
read the framepointer in any block if you so choose, and use it as a
variable location, as shown in the added test.
This rather violates one of the assumptions behind instruction referencing,
that LLVM IR shouldn't be able to read from an arbitrary register at some
arbitrary point in the program. The solution for now is to just emit a
DBG_PHI that reads the register value: this works, but if we wanted to do
something clever with DBG_PHIs in the future then this would probably get
in the way. As it stands, this patch avoids a crash.
Differential Revision: https://reviews.llvm.org/D106659
This patch builds on top of D106575 in which scalable-vector splats were
supported in `ISD::matchBinaryPredicate`. It teaches the DAGCombiner how
to perform a variety of the pre-existing saturating add/sub combines on
scalable-vector types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106652
This reverts commit 0a37163d1d855a2db41e1f46ddbc3f4570bd7ca6.
Reason: Broke the sanitizer msan bots. More details are available in the
original Phabricator review: https://reviews.llvm.org/D106814.
This adds support for the case where
WideSize = DstSize + K * SrcSize
In this case, we can pad the G_MERGE_VALUES instruction with K extra undef
values with width SrcSize. Then the destination can be handled via
widenScalarDst.
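The size relation can be checked with a small sketch. The helper name
paddingSourceCount is hypothetical and is not part of the legalizer; it
only computes K and reports when the relation does not hold.

```cpp
// Returns K such that wideSize == dstSize + K * srcSize, or -1 if no such
// K exists. All sizes are in bits. Hypothetical helper for illustration.
int paddingSourceCount(unsigned dstSize, unsigned srcSize, unsigned wideSize) {
  if (srcSize == 0 || wideSize < dstSize)
    return -1;
  unsigned extra = wideSize - dstSize;
  // The padding must be a whole number of SrcSize-wide undef values.
  if (extra % srcSize != 0)
    return -1;
  return static_cast<int>(extra / srcSize);
}
```

For example, widening a 48-bit destination built from 16-bit sources to
64 bits gives K = 1, so one extra 16-bit undef source would be appended.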
Differential Revision: https://reviews.llvm.org/D106814
Use it in the AArch64 post-legal combiner. These don't always get folded because when
the instructions are created the constants are obscured by artifacts.
Differential Revision: https://reviews.llvm.org/D106776
Dominator trees were previously used for an optimization related to
`wasm.lsda` but the optimization was removed in D97309. Currently
dominators are not doing anything in this pass. Also removes some
`include` lines that are not needed for the file to compile.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D106811
This fixes an assert firing when compiling code which involves 128-bit
integral values.
This would trigger runtime checks similar to this:
```
Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646
```
To get around this, we just saturate those big values.
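As a rough illustration of saturating, and not the code from this patch,
the sketch below clamps a 128-bit value into the int64_t range. It assumes
the GCC/Clang `__int128` extension; the function name saturateToInt64 is
hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Clamp a signed 128-bit value to the range representable by int64_t.
// Hypothetical helper; relies on the __int128 compiler extension.
int64_t saturateToInt64(__int128 value) {
  const __int128 lo = std::numeric_limits<int64_t>::min();
  const __int128 hi = std::numeric_limits<int64_t>::max();
  if (value < lo)
    return std::numeric_limits<int64_t>::min();
  if (value > hi)
    return std::numeric_limits<int64_t>::max();
  // In range, so the narrowing conversion is lossless.
  return static_cast<int64_t>(value);
}
```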
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D105320
During tail duplication, SSA values may be updated and have their uses
replaced with a virtual register, and any debug instructions that use
that value are deleted. This patch fixes the implementation of the debug
instruction deletion to work correctly for debug instructions that use
the SSA value multiple times, by batching deletions so that we don't
attempt to delete the same instruction twice.
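The batching idea can be sketched independently of the MachineInstr API.
Everything here is a stand-in: DebugInstr and deleteDebugUsers are
hypothetical names, and the use list is modelled as a plain vector with
one entry per operand use.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Stand-in for a debug instruction; not an LLVM type.
struct DebugInstr {
  int id;
};

// `uses` has one entry per operand use, so an instruction that uses the
// value twice appears twice. Collect the users into a set first, then
// delete each one exactly once. Returns the number of deleted instructions.
std::size_t deleteDebugUsers(const std::vector<DebugInstr *> &uses) {
  std::unordered_set<DebugInstr *> toDelete(uses.begin(), uses.end());
  for (DebugInstr *instr : toDelete)
    delete instr;
  return toDelete.size();
}
```

Deleting directly while walking `uses` would free the same instruction
more than once whenever it appears in the list several times.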
Differential Revision: https://reviews.llvm.org/D106557
Late in SelectionDAG we join up instruction numbers with their defining
instructions, if it couldn't be done during the main part of SelectionDAG.
One exception is function arguments, where we have to point a DBG_PHI
instruction at the incoming live register, as they don't have a defining
instruction. This patch adds another exception, for constant physregs, like
aarch64 has.
It may seem wasteful to use two instructions where we could use a single
DBG_VALUE, however the whole point of instruction referencing is to
decouple the identification of values from the specification of where
variable location ranges start.
(Part of my aarch64 work to ease adoption of instruction referencing, as
in the meta comment on D104520)
Differential Revision: https://reviews.llvm.org/D104520
This patch extends support for (scalable-vector) splats in the
DAGCombiner via the `ISD::matchBinaryPredicate` function, which enables a
variety of simple combines of constants.
Users of this function may now have to distinguish between
`BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing
with this in-tree follows the approach added for
`ISD::matchUnaryPredicate` implemented in D94501.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106575
to encode the constants for DW_AT_data_member_location.
Summary: In DWARF v3, DW_FORM_data4/8 in
DW_AT_data_member_location are interpreted as location
list pointers. Interpreting constants as pointers is
not expected, so we use DW_FORM_udata to encode the
constants.
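DW_FORM_udata stores its value as an unsigned LEB128 number. The sketch
below shows that encoding in isolation; the function name encodeULEB128 is
chosen for this example and the code is not taken from the patch.

```cpp
#include <cstdint>
#include <vector>

// Encode `value` as unsigned LEB128: seven payload bits per byte, least
// significant group first, with the high bit set on every byte but the last.
std::vector<uint8_t> encodeULEB128(uint64_t value) {
  std::vector<uint8_t> bytes;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // more bytes follow
    bytes.push_back(byte);
  } while (value != 0);
  return bytes;
}
```

A small member offset such as 8 encodes to the single byte 0x08, so the
attribute carries a constant that cannot be mistaken for a location list
pointer.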
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D105687
If value tracking can confirm that the cttz/ctlz source is known non-zero then we don't need to create a branch (which DAG will struggle to recover from).
Differential Revision: https://reviews.llvm.org/D106685
I've set up the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it; further opcodes can be handled when we have test coverage.
I'm not aware of any vector freeze test coverage, so the DemandedElts (and the Depth) args are not being used yet, but they are in place.
SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added.
Differential Revision: https://reviews.llvm.org/D106668
This adds custom lowering for truncating stores when operating on
fixed length vectors in SVE. It also includes a DAG combine to
fold extends followed by truncating stores into non-truncating
stores in order to prevent this pattern appearing once truncating
stores are supported.
Currently truncating stores are not used in certain cases where
the size of the vector is larger than the target vector width.
Differential Revision: https://reviews.llvm.org/D104471
Reland of 31859f896.
This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and
lowering methods for loads and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104797
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
!srcloc is generated in clang codegen, and pulled back out by llvm
functions like AsmPrinter::emitInlineAsm that need to report errors in
the inline asm. From there it goes to LLVMContext::emitError, is
stored in DiagnosticInfoInlineAsm, and ends up back in clang, at
BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true
clang::SourceLocation from the integer cookie.
Throughout this code path, it's now 64-bit rather than 32, which means
that if SourceLocation is expanded to a 64-bit type, this error report
won't lose half of the data.
The compiler will tolerate both i32 and i64 !srcloc metadata in
input IR without faulting. Test added in llvm/MC. (The semantic
accuracy of the metadata is another matter, but I don't know of any
situation where that matters: if you're reading an IR file written by
a previous run of clang, you don't have the SourceManager that can
relate those source locations back to the original source files.)
Original version of the patch by Mikhail Maltsev.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105491
Prior to this patch, it skipped the instruction defining VNI when checking if the tainted lanes are used.
In the given example, VRGATHER is an illegal instruction because its DstReg overlaps with SrcReg.
Therefore we need to check the defining instruction as well when there is an earlyclobber constraint.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D105684
The coalescer does not check if register uses are available
at the point of rematerialization. If it attempts to rematerialize
an instruction with such uses it can end up with a use without a def.
LiveRangeEdit performs such a check during rematerialization, so just
call LiveRangeEdit::allUsesAvailableAt() to avoid the problem.
Differential Revision: https://reviews.llvm.org/D106396
The existing rule about the operand type is strange. Instead, just say
the operand is a TargetConstant with the right width. (Legalization
ignores TargetConstants, so it doesn't matter if that width is legal.)
Highlights:
1. I had to substantially rewrite the AArch64 isel patterns to expect a
TargetConstant. Nothing too exotic, but maybe a little hairy. Maybe
worth considering a target-specific node with some dagcombines instead
of this complicated nest of isel patterns.
2. Our behavior on RV32 for vectors of i64 has changed slightly. In
particular, we correctly preserve the width of the arithmetic through
legalization. This changes the DAG a bit. Maybe room for
improvement here.
3. I explicitly defined the behavior around overflow. This is necessary
to make the DAGCombine transforms legal, and I don't think it causes any
practical issues.
Differential Revision: https://reviews.llvm.org/D105673
This patch allows iterating over typed enums via the ADT/Sequence utility.
It also changes the original design to better separate concerns:
- `StrongInt` only deals with safe `intmax_t` operations,
- `SafeIntIterator` presents the iterator and reverse iterator
interface but only deals with safe `StrongInt` internally.
- `iota_range` only deals with `SafeIntIterator` internally.
This design ensures that operations are always valid. In particular,
"Out of bounds" assertions fire when:
- the `value_type` is not representable as an `intmax_t`
- iterator operations make internal computation underflow/overflow
- the internal representation cannot be converted back to `value_type`
Differential Revision: https://reviews.llvm.org/D106279
We have SelectionDAG patterns for 8 & 16-bit atomic operations, but they
assume the value types will have been legalized to 32 bits. So this adds
the ability to widen them to both AArch64 & generic GISel
infrastructure.
In the textual format, `noduplicates` means no COMDAT/section group
deduplication is performed. Therefore, if both sets of sections are retained, and
they happen to define strong external symbols with the same names,
there will be a duplicate definition linker error.
In PE/COFF, the selection kind lowers to `IMAGE_COMDAT_SELECT_NODUPLICATES`.
The name describes the corollary instead of the immediate semantics. The name
can cause confusion for other binary formats (ELF, wasm) which have implemented
or want to implement the "no deduplication" selection kind. Rename it to be clearer.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106319
ACC registers are a combination of four consecutive vector registers.
If the vector registers are assigned first this often forces a number
of copies to appear just before the ACC register is created. If the ACC
register is assigned first then fewer copies are generated when the vector
registers are assigned.
This patch tries to force the register allocator to assign the ACC registers first
and then the UACC registers and then the vector pair registers. It does this
by changing the priority of the register classes.
This patch also adds hints to help the register allocator assign UACC registers from
known ACC registers and vector pair registers from known UACC registers.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D105854