This patch adds some helper in the DirectiveLanguage wrapper to initialize it from
the RecordKeeper and validate the records. This simplify arguments in lots of function
since only the DirectiveLanguge is passed.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90358
As noted in D90554, there's an opcode typo in using an easily
misused cost model API: getCmpSelInstrCost(). Beyond that, the
assumed sequence of ops is questionable, but that would be
another patch.
My guess is that the x86 test diffs show that we are probably
wrong both before and after this change, so there will be no
practical difference.
As an example, I tried this test which shows a cost of '7'
either way:
define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
%V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
%ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
%r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
%z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
ret <4 x i32> %z
}
$ llc -o - sadd.ll -mattr=avx
vpaddd %xmm1, %xmm0, %xmm2
vpcmpgtd %xmm2, %xmm0, %xmm0
vpxor %xmm0, %xmm1, %xmm0
vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0a
Differential Revision: https://reviews.llvm.org/D90681
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.
Replaces https://reviews.llvm.org/D74158
Differential Revision: https://reviews.llvm.org/D89613
Adds a method called pop_back_n to SmallVector.
This is more readable and less error prone than the alternatives of using
```lang=c++
Vector.resize(Vector.size() - N);
Vector.erase(Vector.end() - N, Vector.end());
for (unsigned I = 0;I<N;++I) Vector.pop_back();
```
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90576
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.
Differential Revision: https://reviews.llvm.org/D90597
This patch adds a linear polynomial base class, called LinearPolyBase, which
serves as a base class for StackOffset. It tries to represent a linear
polynomial like:
c0 * scale0 + c1 * scale1 + ... + cK * scaleK
where the scale is implicit, meaning that only the coefficients are
encoded.
This patch also adds a univariate linear polynomial, which serves as
a base class for ElementCount and TypeSize. This tries to represent a
linear polynomial where only one dimension can be set at any one time,
i.e. a TypeSize is either fixed-sized, or scalable-sized, but cannot be
a combination of the two.
class LinearPolyBase
^
|
+---- class StackOffset (dimensions = 2 (fixed/scalable), type = int64_t)
class UnivariateLinearPolyBase
|
|
+---- class LinearPolySize (dimensions = 2 (fixed/scalable))
^
|
+-------- class ElementCount (type = unsigned)
|
|
+-------- class TypeSize (type = uint64_t)
Reviewed By: ctetreau, david-arm
Differential Revision: https://reviews.llvm.org/D88982
Currently it is impossible to create an instance of ELFObjectFile when the
SHT_SYMTAB_SHNDX can't be read. We error out when fail to parse the
SHT_SYMTAB_SHNDX section in the factory method.
This change delays reading of the SHT_SYMTAB_SHNDX section entries,
with it llvm-readobj is now able to work with such inputs.
Differential revision: https://reviews.llvm.org/D89379
parallelTransformReduce is modelled on the C++17 pstl API of
std::transform_reduce, except our wrappers do not use execution policy
parameters.
parallelForEachError allows loops that contain potentially failing
operations to propagate errors out of the loop. This was one of the
major challenges I encountered while parallelizing PDB type merging in
LLD. Parallelizing a loop with parallelForEachError is not behavior
preserving: the loop will no longer stop on the first error, it will
continue working and report all errors it encounters in a list.
I plan to use this to propagate errors out of LLD's
coff::TpiSource::remapTpiWithGHashes, which currently stores errors an
error in the TpiSource object.
Differential Revision: https://reviews.llvm.org/D90639
MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table:
* if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat`
* otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits`
This ensures that (a) non-prevailing copies are discarded and (b)
.gcc_except_table associated to discarded text sections can be discarded by a
.gcc_except_table-aware linker (GNU ld, but not gold or LLD)
This patches matches the GCC behavior. If -fno-unique-section-names is
specified, we don't append the suffix. If -ffunction-sections is additionally specified,
use `.section ...,unique`.
Note, if clang driver communicates that the linker is LLD and we know it
is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table
costs, at least in the -fno-unique-section-names case. We cannot use it on GNU
ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER &
non-SHF_LINK_ORDER components in an output section
https://sourceware.org/bugzilla/show_bug.cgi?id=26256
For RISC-V -mrelax, this patch additionally fixes an assembler-linker
interaction problem: because a section is shrinkable, the length of a call-site
code range is not a constant. Relocations referencing the associated text
section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a
discarded section group member from outside the group is disallowed by the ELF
specification (PR46675):
```
// a.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int main() { return comdat(); }
// b.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int foo() { return comdat(); }
clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax
ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section:
```
-fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations.
Reviewed By: jrtc27, rahmanl
Differential Revision: https://reviews.llvm.org/D83655
A SMLoc allows MCStreamer to report location-aware diagnostics, which
were previously done by adding SMLoc to various methods (e.g. emit*) in an ad-hoc way.
Since the file:line is most important, the column is less important and
the start token location suffices in many cases, this patch reverts
b7e7131af2dd7bdb03fa42a3bc1b4bc72ab95ce1
```
// old
symbol-binding-changed.s:6:8: error: local changed binding to STB_GLOBAL
.globl local
^
// new
symbol-binding-changed.s:6:1: error: local changed binding to STB_GLOBAL
.globl local
^
```
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D90511
Running `-fsyntax-only` on UniqueID.h is 2x faster with this patch
(which avoids calling `std::tie` for `operator<`). Since the transitive
includers of this file will go up as `FileEntryRef` gets used in more
places, avoid that compile-time hit. This is a follow-up to
23ed570af1cc165afea1b70a533a4a39d6656501 (suggested by Reid Kleckner).
Also drop the `<tuple>` include from FileSystem.h (which was vestigal
from before UniqueID.h was split out).
Differential Revision: https://reviews.llvm.org/D90471
This reverts the revert commit 408c4408facc3a79ee4ff7e9983cc972f797e176.
This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.
Original message:
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.
This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.
This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
For PS4 development we support dllimport/export annotations in
source code. This patch enables the dllimport/export attributes
on PS4 by adding a new function to query the triple for whether
dllimport/export are used and using that function to decide
whether these attributes are supported. This replaces the current
method of checking if the target is Windows.
This means we can drop the use of "TargetArch" in the .td file
(which is an improvement as dllimport/export support isn't really
a function of the architecture).
I have included a simple codgen test to show that the attributes
are accepted and have an effect on codegen for PS4. I have also
enabled the DLLExportStaticLocal and DLLImportStaticLocal
attributes, which we support downstream. However, I am unable to
write a test for these attributes until other patches for PS4
dllimport/export handling land upstream. Whilst writing this
patch I noticed that, as these attributes are internal, they do
not need to be target specific (when these attributes are added
internally in Clang the target specific checks have already been
run); however, I think leaving them target specific is fine
because it isn't harmful and they "really are" target specific
even if that has no functional impact.
Differential Revision: https://reviews.llvm.org/D90442
This reverts the revert commit a1b53db32418cb6ed6f5b2054d15a22b5aa3aeb9.
This patch includes a fix for a reported issue, caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases by looking to some connected casts.
For now, ensure integer instrinsics are only returned for selects of
ints or int vectors.
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
This reverts commit 19225704890632cd2552f41ada41600a20db1371.
This appears to cause a crash in the following example
a, b, c;
l() {
int e = a, f = l, g, h, i, j;
float *d = c, *k = b;
for (;;)
for (; g < f; g++) {
k[h] = d[i];
k[h - 1] = d[j];
h += e << 1;
i += e;
}
}
clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c
llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.
Add support for match-all tags and GOT-free runtime calls, which
are both required for the kernel to be able to support outlined
checks. This requires extending the access info to let the backend
know when to enable these features. To make the code easier to maintain
introduce an enum with the bit field positions for the access info.
Allow outlined checks to be enabled with -mllvm
-hwasan-inline-all-checks=0. Kernels that contain runtime support for
outlined checks may pass this flag. Kernels lacking runtime support
will continue to link because they do not pass the flag. Old versions
of LLVM will ignore the flag and continue to use inline checks.
With a separate kernel patch [1] I measured the code size of defconfig
+ tag-based KASAN, as well as boot time (i.e. time to init launch)
on a DragonBoard 845c with an Android arm64 GKI kernel. The results
are below:
code size boot time
before 92824064 6.18s
after 38822400 6.65s
[1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76
Depends on D90425
Differential Revision: https://reviews.llvm.org/D90426
Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.
Differential Revision: https://reviews.llvm.org/D90247
If more than a prefix is provided - e.g. --check-prefixes=CHECK,FOO - we
don't report if (say) FOO is never used. This may lead to a gap in our
test coverage.
This patch introduces a new option, --allow-unused-prefixes. It
currently is set to true, keeping today's behavior. After we explicitly
set it in tests where this behavior was actually intentional, we will
switch it to false by default.
Differential Revision: https://reviews.llvm.org/D90281
Make DebugLogging a member variable so that users of PassBuilder don't
need to pass it around so much.
Move call to TargetMachine::registerPassBuilderCallbacks() within
PassBuilder so users don't need to remember to call it.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D90437
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.
This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.
This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
Reviewed By: dmgreen, RKSimon
Differential Revision: https://reviews.llvm.org/D90070
`Link` is not an optional field currently.
Because of this it is not convenient to write macros.
This makes it optional and fixes corresponding test cases.
Differential revision: https://reviews.llvm.org/D90390
A common pattern when using SmallString is to repeatedly call append to build a larger string.
The issue here is the optimizer can't see through this and often has to check there is enough space in the storage for each string you try to append.
This results in lots of conditional branches and potentially multiple calls to grow needing to be emitted if the buffer wasn't large enough.
By taking an initializer_list of StringRefs, SmallString can preallocate the storage it needs for all of the StringRefs which only need to grow one time at most, then use a fast path of copying all the strings into its storage knowing there is guaranteed to be enough capacity.
By using StringRefs, this also means you can append different string like types in one go as they will all be implicitly converted to a StringRef.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90386
And use it to model LLVM IR's `ptrtoint` cast.
This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)
As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
to not loose pointer-ness of an expression in more cases
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.
But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89456
Some architectures do not have general vector select instructions (e.g.
AArch64). But some cmp/select patterns can be vectorized using other
instructions/intrinsics.
One example is using min/max instructions for certain patterns.
This patch updates the cost calculations for selects in the SLP
vectorizer to consider using min/max intrinsics.
This patch does not change SLP vectorizer's codegen itself to actually
generate those intrinsics, but relies on the backends to lower the
vector cmps & selects. This keeps things simple on the SLP side and
works well in practice for AArch64.
This exposes additional SLP vectorization opportunities in some
benchmarks on AArch64 (-O3 -flto).
Metric: SLP.NumVectorInstructions
Program base slp diff
test-suite...ications/JM/ldecod/ldecod.test 502.00 697.00 38.8%
test-suite...ications/JM/lencod/lencod.test 1023.00 1414.00 38.2%
test-suite...-typeset/consumer-typeset.test 56.00 65.00 16.1%
test-suite...6/464.h264ref/464.h264ref.test 804.00 822.00 2.2%
test-suite...006/453.povray/453.povray.test 3335.00 3357.00 0.7%
test-suite...CFP2000/177.mesa/177.mesa.test 2110.00 2121.00 0.5%
test-suite...:: External/Povray/povray.test 2378.00 2382.00 0.2%
Reviewed By: RKSimon, samparker
Differential Revision: https://reviews.llvm.org/D89969
All known instances in the code where we relied upon the TypeSize
comparison operators have now been changed to either use scalar
interger comparisons or one of the TypeSize::isKnownXY functions.
It is now safe to remove the comparison operators.
Differential Revision: https://reviews.llvm.org/D90160
The `Root` member of `ImmutableMapRef` was changed recently from a plain
pointer to `IntrusiveRefCntPtr`. However, the `Profile` member function
was not adjusted. This results in comilation error whenever the
`Profile` method is used on an `ImmutableMapRef`. This patch fixes this
issue and also adds unit tests for `ImmutableMapRef`.
Differential Revision: https://reviews.llvm.org/D89486
llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker
set, in order to emit .llvmcmd. EmbedMarker is really about embedding the
command line, so renamed the parameter accordingly, too.
This was not caught at test because the check-prefix was incorrect, but
FileCheck does not report that when multiple prefixes are provided. A
separate patch will address that.
Differential Revision: https://reviews.llvm.org/D90278
This method is at the core of the conversion from hex to binary, and using a lookup table great improves the compile time of hex conversions.
Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (>1 second -> 350 miliseconds).
Differential Revision: https://reviews.llvm.org/D90320
This revision adds a fail-able/checked version of `fromHex` that fails when the input string contains a non-hex character. This removes the need for users to have a separate check for if the string contains all hex digits. This becomes very costly for large hex strings given that checking if a string contains only hex digits is effectively the same as just converting it in the first place.
Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (2 seconds -> 1 second).
Differential Revision: https://reviews.llvm.org/D90265
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.
This changes CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47580
Differential Revision: https://reviews.llvm.org/D89072
This reverts commit 504615353f31136dd6bf7a971b6c236fd70582be.
Not sure why this worked for me, but some of the bots pointed out I
copied the wrong includes from FileSystem.h in
23ed570af1cc165afea1b70a533a4a39d6656501. Fixed.