We did not have atomic flags on SMRD, did not copy TSFlags
to real instructions, and did not have ret/noret atomic map.
At the moment it is NFC, but needed for D96469.
Differential Revision: https://reviews.llvm.org/D96823
This adds an internal option -wholeprogramdevirt-check which if enabled
will guard each devirtualization with a runtime check against the
expected target, and an invocation of a debug trap if the check fails.
This is useful for debugging WPD failures involving undefined behavior
(e.g. casting to another class type not in the inheritance chain).
Differential Revision: https://reviews.llvm.org/D95969
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96831
Apply the patch for the third time after fixing buildbot failures.
Refactor SampleProfile.cpp to use the core code in CodeGen.
The main changes are:
(1) Move SampleProfileLoaderBaseImpl class to a header file.
(2) Split SampleCoverageTracker to a head file and a cpp file.
(3) Move the common codes (common options and callsiteIsHot())
to the common cpp file.
(4) Add inline keyword to avoid duplicated symbols -- they will
be removed later when the class is changed to a template.
Differential Revision: https://reviews.llvm.org/D96455
lld already marks shared library defs as ExportDynamic, which prevents
potentially unsafe devirtualization of symbols defined in shared
libraries. Match that behavior in the gold plugin, and add the same
test.
Depends on D96721.
Differential Revision: https://reviews.llvm.org/D96722
Same implementation as G_SEXT_INREG.
Add a testcase to combine-sext-inreg for a concrete example, and a testcase
to KnownBitsTest.
Differential Revision: https://reviews.llvm.org/D96897
Separate the LoZ ELF calling convention in tablegen.
This will make it easier to add the z/OS ABI in future patches.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D96867
This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction
signifies that an operation was already sign extended from a smaller type.
This is useful for functions with sign-extended parameters.
E.g.
```
define void @foo(i16 signext %x) {
...
}
```
This adds verifier, regbankselect, and instruction selection support for
G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT.
Differential Revision: https://reviews.llvm.org/D96890
Adds a lld test for a case that the handling added for dynamically
exported symbols in 1487747e990ce9f8851f3d92c3006a74134d7518 already
fixes. Because isExportDynamic returns true when the symbol is
SharedKind with default visibility, it will treat as dynamically
exported and block devirtualization when the definition of a vtable
comes from a shared library. This is desireable as it is dangerous to
devirtualize in that case, since there could be hidden overrides in the
shared library. Typically that happens when the shared library header
contains available externally definitions, which applications can
override. An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
The regular LTO case in the new test already worked, but there are
2 fixes in this patch needed for the index-only case and the hybrid
LTO case. For the index-only case, WPD should not simply ignore
available externally vtables. A follow on fix will be made to clang to
emit type metadata for those vtables, which the new test is modeling.
For the hybrid case, we need to ensure when the module is split that any
llvm.*used globals are cloned to the regular LTO split module so
available externally vtable definitions are not prematurely deleted.
Another follow on fix will add the equivalent gold test, which requires
a small fix to the plugin to treat symbols in dynamic libraries the same
way lld already is.
Differential Revision: https://reviews.llvm.org/D96721
Updating `EHPadStack` with respect to `TRY` and `CATCH` instructions
have to be done after checking all other conditions, not before. Because
we did this before checking other conditions, when we encounter `TRY`
and we want to record the current mismatching range, we already have
popped up the entry from `EHPadStack`, which we need to access to record
the range.
The `baz` call in the added test needs try-delegate because the previous
TRY marker placement for `quux` was placed before `baz`, because `baz`'s
return value was stackified in RegStackify. If this wasn't stackified
this try-delegate is not strictly necessary, but at the moment it is not
easy to identify cases like this. I plan to transfer `nounwind`
attributes from the LLVM IR to prevent cases like this. The call in the
test does not have `unwind` attribute in order to test this bug, but in
many cases of this pattern the previous call has `nounwind` attribute.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D96711
D94835 added support for WinEH to export public symbols pointing to
basic blocks which are catchret targets for use with Windows CET.
Wasm currently doesn't support public symbols to non-function code
addresses (they get treated like new functions in asm but then don't
lower to object files correctly).
It created them unconditionally for all catchret targets.
This change disables those symbols unless the exceptionHandlingType
is WinEH (since they aren't used with ExceptionHandling::Wasm)
Differential Revision: https://reviews.llvm.org/D96824
CMAKE_SYSROOT works fine here, and `sysroot.py make-fake`
borders on trivial here, but I suppose it's still nice
to have a consistent script to set these up across platforms.
And these are the platforms where we can do real sysroot management one
day.
Differential Revision: https://reviews.llvm.org/D96882
Revert "[SampleFDO] Add missing #includes to unbreak modules build after D96455"
This reverts commit c73cbf218a289029cc0b54183c3cf79454ecc76f.
Revert "[SampleFDO] Fix MSVC "namespace uses itself" warning (NFC)"
This reverts commit a23e6b321ca623b83252f8b1e06a2ad4fc441f89.
Revert "[SampleFDO] Reapply: Refactor SampleProfile.cpp"
This reverts commit 6fd5ccff72eeaffcb3b3ba2696282015aab755bc.
Still seeing link failures when building llc (or other tools), due to
the new SampleProfileLoaderBaseImpl.h containing definitions that get
duplicated across multiple TU's.
```
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::findEquivalenceClasses(llvm::Function&)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::buildEdges(llvm::Function&)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::computeDominanceAndLoopInfo(llvm::Function&)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::getFunctionLoc(llvm::Function&)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::getBlockWeight(llvm::BasicBlock const*)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::printBlockWeight(llvm::raw_ostream&, llvm::BasicBlock const*) const' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::printBlockEquivalence(llvm::raw_ostream&, llvm::BasicBlock const*)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
duplicate symbol 'llvm::SampleProfileLoaderBaseImpl::printEdgeWeight(llvm::raw_ostream&, std::__1::pair<llvm::BasicBlock const*, llvm::BasicBlock const*>)' in:
tools/llc/CMakeFiles/llc.dir/llc.cpp.o
lib/libLLVMInstCombine.a(InstCombineVectorOps.cpp.o)
```
A lot of the code for the masked and unmasked is the same. This
patch adds a boolean to handle the differences so we can share
the code.
Differential Revision: https://reviews.llvm.org/D96841
Bot: http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/28999
```
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h:124:19: error: missing '#include "llvm/Analysis/PostDominators.h"'; 'PostDominatorTree' must be declared before it is used
std::unique_ptr<PostDominatorTree> PDT;
^
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Analysis/PostDominators.h:28:7: note: declaration here is not visible
class PostDominatorTree : public PostDomTreeBase<BasicBlock> {
^
While building module 'LLVM_Transforms' imported from /Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/lib/Transforms/CFGuard/CFGuard.cpp:15:
In file included from <module-includes>:191:
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h:125:19: error: missing '#include "llvm/Analysis/LoopInfo.h"'; 'LoopInfo' must be declared before it is used
std::unique_ptr<LoopInfo> LI;
^
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Analysis/LoopInfo.h:1079:7: note: declaration here is not visible
class LoopInfo : public LoopInfoBase<BasicBlock, Loop> {
^
While building module 'LLVM_Transforms' imported from /Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/lib/Transforms/CFGuard/CFGuard.cpp:15:
In file included from <module-includes>:191:
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h:149:3: error: missing '#include "llvm/Analysis/OptimizationRemarkEmitter.h"'; 'OptimizationRemarkEmitter' must be declared before it is used
OptimizationRemarkEmitter *ORE = nullptr;
^
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:33:7: note: declaration here is not visible
class OptimizationRemarkEmitter {
^
/Users/buildslave/jenkins/workspace/lldb-cmake/llvm-project/llvm/lib/Transforms/CFGuard/CFGuard.cpp:15:10: fatal error: could not build module 'LLVM_Transforms'
```
Interval value
The II value was incremented before exiting the loop, and therefor when
used in the optimization remarks and debug dumps it did not reflect the
initiation interval actually used in Schedule.
Differential Revision: https://reviews.llvm.org/D95692
SROA does not correctly account for offsets in TBAA/TBAA struct metadata.
This patch creates functionality for generating new MD with the corresponding
offset and updates SROA to use this functionality.
Differential Revision: https://reviews.llvm.org/D95826
The NPM LTO pipeline has a lot of fixme's and missing passes, causing a
lot of regressions after the switch in c70737b. Notably unrolling and
vectorization were both disabled, but many other passes are missing
compared to the old pass manager. This attempt to enable the most
obvious missing passes like the unroller, vectorization and other loop
passes, fixing the existing FIXME comments.
Differential Revision: https://reviews.llvm.org/D96780
This is the preliminary patch of converting `LoopInterchange` pass to a loop-nest pass and has no intended functional change.
Changes that are not loop-nest related are split to D96650.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D96644
This adds a new flag -lsr-preferred-addressing-mode to override the target's
preferred addressing mode. It replaces flag -lsr-backedge-indexing, which is
equivalent to preindexed addressing that is one of the options that
-lsr-preferred-addressing-mode accepts.
Differential Revision: https://reviews.llvm.org/D96855
As discussed in:
https://llvm.org/PR49179
...this pattern shows up in library code.
There are several potential generalizations as noted,
but we need to be careful that we get FP special-values
right, and it's not clear how much variation we should
expect to see from this exact idiom.
Also add a script for sysroot management. For now, it can only create
fake sysroots that just symlink to local folders. This is useful for
testing.
Differential Revision: https://reviews.llvm.org/D96868
Summary:
Currently Shrinkwrap is not enabled on AIX.
This patch enables shrink wrap on 32 and 64 bit AIX, and 64 bit ELF.
Reviewed By: sfertile, nemanjai
Differential Revision: https://reviews.llvm.org/D95094
Do not defer to the base class when the register constraint is a
physical fpr. The base class will select SPILLTOVSRRC as the register
class and register allocation will fail on subtargets without VSX
registers.
Differential Revision: https://reviews.llvm.org/D91629
* Update skip-if-dead.ll with tests for wave32.
* Fix the crash in verifier in one newly enabled test by adding
missing fixImplicitOperands in branch insertion code.
```
*** Bad machine code: Using an undefined physical register ***
- function: test_kill_divergent_loop
- basic block: %bb.2 bb (0xad96308)
- instruction: S_CBRANCH_VCCNZ %bb.1, implicit $vcc_lo
- operand 1: implicit $vcc_lo
LLVM ERROR: Found 1 machine code errors.
```
* Simplify "cbranch_kill" to not use interp instructions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96793
The PrepareForLTO stage of LoopRotate tries to avoid unrolling loops
with calls that might be inlined later. See D94232 where this was
introduced.
We didn't catch all occurances of the LoopRotatePass in the New Pass
Manager, so the original regression in astar returned with the pass
manager switch.
Otherwise they are not allocated as a single bit field and take 4
bytes instead of 2.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D95954
Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d))) -> bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d))
Attempt to fold from a shuffle of a pair of binops to a binop of shuffles, as long as one/both of the binop sources are also shuffles that can be merged with the outer shuffle. This should guarantee that we remove one binop without introducing any additional shuffles.
Technically there's potential for a merged shuffle's lowering to be poorer than the original shuffle, but it could also be better, and I'm not seeing any regressions as long as we keep the 'don't merge splats' rule already present in MergeInnerShuffle.
This expands and generalizes an existing X86 combine and attempts to merge either of each binop's sources (with an on-the-fly commutation of the shuffle mask) - we couldn't do that in the x86 version as it had to stay in a form that DAGCombine's MergeInnerShuffle would still recognise.
Fixes issue raised by @saugustine in rG5aa8f4c0843a where we were failing to replace null shuffle operands from MergeInnerShuffle to UNDEFs.
Differential Revision: https://reviews.llvm.org/D96345
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D96594
The helper function isBoolSGPR is too aggressive when determining
when a v_cndmask can be skipped on a boolean value because the
function does not check the operands of and/or/xor.
This can be problematic for the Add/Sub combines that can leave
bits set even for inactive lanes leading to wrong results.
Fix this by inspecting the operands of and/or/xor recursively.
Differential Revision: https://reviews.llvm.org/D86878
This patch adds support for fixed-length vector vselect. It does so by
lowering them to a custom unmasked VSELECT_VL node with a vector length
operand.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D96768