Summary:
The current implementation of AANoFreeFloating will incorrectly list floating
point loads and stores as may-free. This prevents other attributor instances
like HeapToStack from pushing some allocations to the stack.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D103975
* For the output, the attributes within the target slice should be
grouped by the input order, then sorted by value ordering.
This is to fix current ubuntu buildbot inconsistences.
This allows for using the frame record feature (which uses __hwasan_tls)
independently from however the user wants to access the shadow base, which
prior was only usable if shadow wasn't accessed through the TLS variable or ifuncs.
Frame recording can be explicitly set according to ShadowMapping::WithFrameRecord
in ShadowMapping::init. Currently, it is only enabled on Fuchsia and if TLS is
used, so this should mimic the old behavior.
Added an extra case to prologue.ll that covers this new case.
Differential Revision: https://reviews.llvm.org/D103841
I found the documentation of the various CMake variables difficult to
navigate, because they are unsorted. I can see they've grown
organically with new clusters of somewhat-related options, but the
result is hard to use. This collates them (treating '_' as space).
Differential Revision: https://reviews.llvm.org/D102481
This is relanding commit d1d36f7ad2ae82bea8a6fcc40d6c42a72e21f096 .
This patch additionally addresses failures found in buildbots & post review comments.
This patch introduces a new tool, llvm-tapi-diff, that compares and returns the diff of two TBD files.
Reviewed By: ributzka, JDevlieghere
Differential Revision: https://reviews.llvm.org/D101835
Currently, NoWrapFlags are dropped if we inline operands of SCEVAddExpr
operands. As a consequence, we always drop flags when building
expressions like `getAddExpr(A, getAddExpr(B, C, NUW), NUW)`.
We should be able to retain NUW flags common among all inlined
SCEVAddExpr and the original flags.
Reviewed By: nikic, mkazantsev
Differential Revision: https://reviews.llvm.org/D103877
The function __multi3() is undefined on 32-bit ARM, so a call to it
should never be emitted. Instead, plain instructions need to be
generated to perform 128-bit multiplications.
Differential Revision: https://reviews.llvm.org/D103906
Upon encountering loads/stores on types whose size is not a multiple of 8 bits the SROA pass would either trip an assertion or use logic that was not meant to work with such irregularly-sized types.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D99435
Fixes crash reported here https://reviews.llvm.org/D73607
Using a store to keep the trunc intact. Returning v16i24 would
cause the trunc to be optimized away in SelectionDAGBuilder.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D103940
As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.
This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.
This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c4d
D87112 / 7a06b166b1af
We could enhance this code to track individual bytes and
allow merging multiple sizes.
This is a fix for PR50481
Immediate values for AddrModeT2_i8s4 are already scaled in MCinst operand.
This patch changes the number of bits and scale factor to reflect that
state when checking stack offset status. AddrModeT2_i7s[2|4] also have
this particularity but since MVE instructions are not outlined, just move
these cases to the unhandled ones.
Differential Revision: https://reviews.llvm.org/D103167
1. Better sorting of scalars to be gathered. Trying to insert
constants/arguments/instructions-out-of-loop at first and only then
the instructions which are inside the loop. It improves hoisting of
invariant insertelements instructions.
2. Better detection of shuffle candidates in gathering function.
3. The cost of insertelement for constants is 0.
Part of D57059.
Differential Revision: https://reviews.llvm.org/D103458
Our initial motivating case was memcpy's with alignments > 16. The
loads/stores, to which small memcpy's expand, are kept together in
several places so that we get a sequence like this for a 64 bit copy:
LD w0
LD w1
ST w0
ST w1
The load/store optimiser can generate a LDP/STP w0, w1 from this because
the registers read/written are consecutive. In our case however, the
sequence is optimised during ISel, resulting in:
LD w0
ST w0
LD w0
ST w0
This instruction reordering allows reuse of registers. Since the registers
are no longer consecutive (i.e. they are the same), it inhibits LDP/STP
creation. The approach here is to perform renaming:
LD w0
ST w0
LD w1
ST w1
to enable the folding of the stores into a STP. We do not yet generate
the LDP due to a limitation in the renaming implementation, but plan to
look at that in a follow-up so that we fully support this case. While
this was initially motivated by certain memcpy's, this is a general
approach and thus is beneficial for other cases too, as can be seen
in some test changes.
Differential Revision: https://reviews.llvm.org/D103597
This patch changes RVV's policy for its supported list of fixed-length
vector types by capping by vector size rather than element count. Now
all 1024-byte vectors (of supported element types) are supported, rather
than all 256-element vectors.
This is a more natural fit for the architecture, and allows us to, for
example, improve the support for vector bitcasts.
This change necessitated the adding of some new simple types to avoid
"regressing" on the number of currently-supported vectors. We round out
the 1024-byte types by adding `v512i8`, `v1024i8`, `v512i16` and
`v512f16`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103884
Upon encountering loads/stores on types whose size is not a multiple of 8 bits the SROA pass would either trip an assertion or use logic that was not meant to work with such irregularly-sized types.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D99435
Loads in the first half of the chapter are missing the type argument.
Patched By: klao (Mihaly Barasz)
Reviewed By: Jim
Differential Revision: https://reviews.llvm.org/D90326
The C-string section splitting support added in f9649d123db triggered an assert
("Duplicate canonical symbol at address") when multiple symbols were defined at
the the same offset within a C-string block (this triggered on arm64, where we
always add a block start symbol). The bug was caused by a failure to update the
record of the last canonical symbol address. The fix was to maintain this record
correctly, and move the auto-generation of the block-start symbol above the
handling for symbols defined in the object itself so that all symbols
(auto-generated and defined) are processed in address order.
This patch adds initial support for using the new pass manager when
doing ThinLTO via libLTO.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D102627
These types are (presumably) never used in the generated TableGen files.
The `default` switch case silences any compiler warnings for these
missing types so it's easy to miss.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103883
This patch is a simple fix which registers CONCAT_VECTORS as
custom-lowered for scalable mask vectors. This follows the pattern of
all other scalable-vector types, as the default expansion of
CONCAT_VECTORS cannot handle scalable types, and even if it did it'd go
through the stack and generate worse code.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103896
Command to see the differences:
diff -u <(sed -n 's#^HANDLE_DI_FLAG *([^,]*, *\([^()]*\)) *\(//.*\)\?$#\1#p' <llvm/include/llvm/IR/DebugInfoFlags.def | grep -vw Largest) <(sed -n 's#^ *LLVMDIFlag\([^ ]*\) *= (\?[0-9].*$#\1#p' <llvm/include/llvm-c/DebugInfo.h)
OCaml binding is more seriously out of sync but I have not tried to sync it.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D103910
When `-fstack-clash-protection` is enabled and stack has to be realigned, some parts of redzone is written prior the probe, so probe might overwrite content already written in redzone. To avoid it, we have to make sure the first probe is at full probe size or is the last probe so that we can skip redzone.
It also fixes violation of ABI under PPC where `r1` isn't updated atomically.
This fixes https://bugs.llvm.org/show_bug.cgi?id=49903.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D100290
With Twine now ubiquitous after rG92a79dbe91413f685ab19295fc7a6297dbd6c824,
it needs support for string_view when building clang with newer C++ standards.
This is similar to how StringRef is handled.
Differential Revision: https://reviews.llvm.org/D103935
In most of cases, it has a single space after comma in assembly operands.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103790
Summary: This a minor patch to refactor the test file,
llvm-objdump/XCOFF/section-headers.test, to use yaml2obj
for this testing rather than a canned binary.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D103146
According to ELF V2 ABI, `0` should be the dwarf number of `r0`. Currently MMA's register also uses `0` as its dwarf number, this confuses `RegisterInfoEmitter` and generates wrong dwarf -> llvm mapping.
```
extern const MCRegisterInfo::DwarfLLVMRegPair PPCDwarfFlavour1Dwarf2L[] = {
{ 0U, PPC::VSRp31 },
```
This leads to wrong cfi output in https://reviews.llvm.org/D100290.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D103761