This patch implements the instruction definition and MC tests for the vector
string isolate instructions.
Differential Revision: https://reviews.llvm.org/D84197
https://reviews.llvm.org/D84909
Patch adds two new GICombinerRules, one for G_INTTOPTR and one for
G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if
the cast is within the same address space. The G_PTRTOINT elides
int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
Patch by mkitzan
A function call can be replicated by optimizations like loop unroll and jump threading and the replicates end up sharing the sample nested callee profile. Therefore when it comes to merging samples for uninlined callees in the sample profile inliner, a callee profile can be merged multiple times which will cause an assert to fire.
This change avoids merging same callee profile for duplicate callsites by filtering out callee profiles with a non-zero head sample count.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D84997
Refactoring `Slice` class and function `createUniversalBinary` from
`llvm-lipo` into MachOUniversalWriter. This refactoring is necessary so
as to use the refactored code for creating universal binaries under
llvm-libtool-darwin.
Reviewed by alexshap, smeenai
Differential Revision: https://reviews.llvm.org/D84662
This patch makes the 'debug_aranges' entry optional. If the entry is
empty, yaml2obj will only emit the header for it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84921
In this patch, we add a helper function getDWARFEmitterByName(). This
function returns the proper DWARF section emitting method by the name.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84952
In this patch, emitDebugPubnames(), emitDebugPubtypes(),
emitDebugGNUPubnames(), emitDebugGNUPubtypes() are added.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85003
It turned out that the D78704 included a private LLVM header, which is excluded
from the LLVM install target.
I'm substituting that `#include` with the public one by moving the necessary
`#define` into that. There was a discussion about this at D78704 and on the
cfe-dev mailing list.
I'm also placing a note to remind others of this pitfall.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D84929
Scheduler will try to retrieve the offset and base addr to determine if two
loads/stores are disjoint memory access. PowerPC failed to handle this for
frame index which will bring extra memory dependency for loads/stores.
Reviewed By: jji
Differential Revision: https://reviews.llvm.org/D84308
This patch allows SimplifyPartiallyRedundantLoad work when
the branch condition was frozen.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84944
We can preserve make.implicit metadata in the split block if it is
guaranteed that after following the branch we always reach the block
where processing of null case happens, which is equivalent to
"initial condition must execute if the loop is entered".
Differential Revision: https://reviews.llvm.org/D84925
Reviewed By: asbirlea
Non-trivial unswitching simply moves terminator being unswitch from the loop
up to the switch block. It also preserves all metadata that was there. It might not
be a correct thing to do for `make.implicit` metadata. Consider case:
```
for (...) {
cond = // computed in loop
if (cond) return X;
if (p == null) throw_npe(); !make implicit
}
```
Before the unswitching, if `p` is null and we reach this check, we are guaranteed
to go to `throw_npe()` block. Now we unswitch on `p == null` condition:
```
if (p == null) !make implicit {
for (...) {
if (cond) return X;
throw_npe()
}
} else {
for (...) {
if (cond) return X;
}
}
```
Now, following `true` branch of `p == null` does not always lead us to
`throw_npe()` because the loop has side exit. Now, if we run ImplicitNullCheck
pass on this code, it may end up making the unswitch condition implicit. This may
lead us to turning normal path to `return X` into signal-throwing path, which is
not efficient.
Note that this does not happen during trivial unswitch: it guarantees that we do not
have side exits before condition being unswitched.
This patch fixes this situation by unconditional dropping of `make.implicit` metadata
when we perform non-trivial unswitch. We could preserve it if we could prove that the
condition always executes. This can be done as a follow-up.
Differential Revision: https://reviews.llvm.org/D84916
Reviewed By: asbirlea
is enabled.
When -sample-profile-merge-inlinee is enabled, new FunctionSamples may be
created during profile merge without GUIDToFuncNameMap being initialized.
That will occasionally cause compiler crash. The patch fixes it.
Differential Revision: https://reviews.llvm.org/D84994
Similar to what was recently done to ParseATTOperand. Make
ParseIntelOperand directly responsible for adding to the operand
vector instead of returning the operand. Return a bool for error.
Remove ErrorOperand since it is no longer used.
For consistency with legacy pass name.
Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D84967
If an analysis is actually invalidated, there's already a log statement
for that: 'Invalidating analysis: FooAnalysis'.
Otherwise the statement is not very useful.
Reviewed By: asbirlea, ychen
Differential Revision: https://reviews.llvm.org/D84981
findAllocaForValue uses AllocaForValue to cache resolved values.
The function is used only to resolve arguments of lifetime
intrinsic which usually are not fare for allocas. So result reuse
is likely unnoticeable.
In followup patches I'd like to replace the function with
GetUnderlyingObjects.
Depends on D84616.
Differential Revision: https://reviews.llvm.org/D84617
Fix for the issue raised in https://github.com/rust-lang/rust/issues/74632.
The current heuristic for inserting LFENCEs uses a quadratic-time algorithm. This can apparently cause substantial compilation slowdowns for building Rust projects, where functions > 5000 LoC are apparently common.
The updated heuristic in this patch implements a linear-time algorithm. On a set of benchmarks, the slowdown factor for the generated code was comparable (2.55x geo mean for the quadratic-time heuristic, vs. 2.58x for the linear-time heuristic). Both heuristics offer the same security properties, namely, mitigating LVI.
This patch also includes some formatting fixes.
Differential Revision: https://reviews.llvm.org/D84471
After the recent change to the tuning settings for pentium4 to improve our default 32-bit behavior, I've decided to see about implementing -mtune support. This way we could have a default architecture CPU of "pentium4" or "x86-64" and a default tuning cpu of "generic". And we could change our "pentium4" tuning settings back to what they were before.
As a step to supporting this, this patch separates all of the features lists for the CPUs into 2 lists. I'm using the Proc class and a new ProcModel class to concat the 2 lists before passing to the target independent ProcessorModel. Future work to truly support mtune would change ProcessorModel to take 2 lists separately. I've diffed the X86GenSubtargetInfo.inc file before and after this patch to ensure that the final feature list for the CPUs isn't changed.
Differential Revision: https://reviews.llvm.org/D84879
This patch addes time trace functionality to have a better understanding
of the analysis times.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84980
This includes basic support for computeKnownBits on abs. I've left FIXMEs for more complicated things we could do.
Differential Revision: https://reviews.llvm.org/D84963
Just the obvious implementation that rewrites the result type. Also fix
warning from EXTRACT_SUBVECTOR legalization that triggers on the test.
Differential Revision: https://reviews.llvm.org/D84706
The -harness option enables new testing use-cases for llvm-jitlink. It takes a
list of objects to treat as a test harness for any regular objects passed to
llvm-jitlink.
If any files are passed using the -harness option then the following
transformations are applied to all other files:
(1) Symbols definitions that are referenced by the harness files are promoted
to default scope. (This enables access to statics from test harness).
(2) Symbols definitions that clash with definitions in the harness files are
deleted. (This enables interposition by test harness).
(3) All other definitions in regular files are demoted to local scope.
(This causes untested code to be dead stripped, reducing memory cost and
eliminating spurious unresolved symbol errors from untested code).
These transformations allow the harness files to reference and interpose
symbols in the regular object files, which can be used to support execution
tests (including fuzz tests) of functions in relocatable objects produced by a
build.
This allows clients to detect invalid transformations applied by JITLink passes
(e.g. inserting or removing symbols in unexpected ways) and terminate linking
with an error.
This change is used to simplify the error propagation logic in
ObjectLinkingLayer.
Avoid recursively calling copyPhysReg for AGPR handling. This was
dropping the necessary super register implicit defs to avoid liveness
verifier errors.
Pass the abs poison flag to the underlying ConstantRange
implementation, allowing CVP to simplify based on it.
Importantly, this recognizes that abs with poison flag is actually
non-negative...
This fixes an assertion failure that was being triggered in
SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32>
to i64 (which should have been <2xi64>).
Fixes: rdar://66016901
Differential Revision: https://reviews.llvm.org/D84884