This will ensure that passes that add new global variables will create them
in address space 1 once the passes have been updated to no longer default
to the implicit address space zero.
This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't
already to present to ensure bitcode backwards compatibility.
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D84345
This is similar to the existing alloca and program address spaces (D37052)
and should be used when creating/accessing global variables.
We need this in our CHERI fork of LLVM to place all globals in address space 200.
This ensures that values are accessed using CHERI load/store instructions
instead of the normal MIPS/RISC-V ones.
The problem this is trying to fix is that most of the time the type of
globals is created using a simple PointerType::getUnqual() (or ::get() with
the default address-space value of 0). This does not work for us and we get
assertion/compilation/instruction selection failures whenever a new call
is added that uses the default value of zero.
In our fork we have removed the default parameter value of zero for most
address space arguments and use DL.getProgramAddressSpace() or
DL.getGlobalsAddressSpace() whenever possible. If this change is accepted,
I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead
of relying on the default value of 0 for PointerType::get(), etc.
This patch and the follow-up changes will not have any functional changes
for existing backends with the default globals address space of zero.
A follow-up commit will change the default globals address space for
AMDGPU to 1.
Reviewed By: dylanmckay
Differential Revision: https://reviews.llvm.org/D70947
Summary:
Expand existing loopsink testing to also test loopsinking using new pass
manager. Enable memoryssa for loopsink with new pass manager. This
combination exposed a bug that was previously fixed for loopsink
without memoryssa. When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation. This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.
Respond to review comments. Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control. Update tests accordingly.
Respond to review comments. Add options controlling whether memoryssa is
used for loop sink, defaulting to off. Expand testing based on these
options.
Respond to review comments. Properly indicated preserved analyses.
This relanding addresses a compile-time performance problem by moving
test for profile data earlier to avoid unnecessary computations.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
The constrained intrinsics have metadata arguments, so the
tests here were crashing as noted in D90554 (and that was
reverted even though this bug exists independently of that
change).
Summary:
[NFC intended] Refactor the code for printChanged for reuse and to facilitate
subsequent reporters of changes to the IR in the new pass manager.
Create abstract template base classes for common functionality and give
classes more appropriate names. The base classes handle all of the
determination of when a function or pass is "interesting" and should be
reported or filtered out. They have pure virtual functions which are called
when a change by a pass has been recognized so the derived class need only
provide the overrides to present the information about the changing IR.
There are at least 2 more change reporters to come (which were presented
in my tutorial at the 2020 llvm developer's meeting) that derive from
these classes.
Respond to review comments: move function out of line, remove inline keyword,
remove unneeded qualifiers, simplify comparison.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks), madhur13490 (Madhur Amilkanthwar)
Differential Revision: https://reviews.llvm.org/D87000
Just something I forgot when I added the R82. Need to have a look
at crypto and fusing, but will do that as a follow up.
Differential Revision: https://reviews.llvm.org/D91848
This checks to see if the loop will likely become a tail predicated loop
and disables wls loop generation if so, as the likelihood for reverting
is currently too high. These should be fairly rare situations anyway due
to the way iterations and element counts are used during lowering. Just
not trying can alter how SCEV's are materialized however, leading to
different codegen.
It also adds a option to disable all while low overhead loops, for
debugging.
Differential Revision: https://reviews.llvm.org/D91663
This patch implements out of line atomics for LSE deployment
mechanism. Details how it works can be found in llvm/docs/Atomics.rst
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to clang driver. This is clang and llvm part of out-of-line atomics
interface, library part is already supported by libgcc. Compiler-rt
support is provided in separate patch.
Differential Revision: https://reviews.llvm.org/D91157
This is a partial un-revert of 32dd5870ee31 (originally df09f82599 ).
I'm adding back the baseline tests first, so we don't have
to back-track as much in case there are still problems.
See discussion in D90554.
This is a partial un-revert of 32dd5870ee31. I'm adding
back the baseline tests first, so we don't have to
back-track as much in case there are still problems.
Implement getMinimumJumpTableEntries() to specify threshold for jump
table genaration. We use 8 for the case of PIC mode to relieve the
impact of PIC calculation required to implement PIC mode jump table.
Update jump table regression test also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91785
Extract the scratch offset from the scratch buffer descriptor that is
stored in the global table.
Differential Revision: https://reviews.llvm.org/D91701
Our code that dumps groups has 3 noticeable issues:
1) It uses `unwrapOrError` in many places.
2) It doesn't allow reporting unique warnings, because the `getGroups` helper is not
a member of `DumpStyle<ELFT>`.
3) It might just crash. See the comment for `StrTableOrErr->data() + Sym.st_name` line.
In this patch I am starting addressing these points.
For start I've converted one of `unwrapOrError` calls to a unique warning.
Differential revision: https://reviews.llvm.org/D91798
Our `printStackSize` implementation currently uses
API like `RelocationRef`, `object::symbol_iterator`.
It is not ideal as it doesn't allow
to handle possible error conditions properly.
Some time ago I started rewriting it and this NFC patch is
a one more step toward to it. Here I am introducing the
`forEachRelocationDo` helper. With it it is possible to iterate
over all kinds of relocations, what is helpful for improving
the code in `printStackSize` and around.
Differential revision: https://reviews.llvm.org/D91530
For MASM syntax, the prefixes are not enclosed in braces.
The assembly code should like:
"evex vcvtps2pd xmm0, xmm1"
Differential Revision: https://reviews.llvm.org/D90441
This allows to reuse the RelocationResolver from the code
that doesn't want to deal with `RelocationRef` class.
I am going to use it in llvm-readobj. See the description
of D91530 for more details.
Differential revision: https://reviews.llvm.org/D91533
as it's causing crashes in the optimizer. A reduced testcase has been posted as a follow-up.
This reverts commit f7eac51b9b3f780c96ca41913293851c5acb465b.
Temporarily Revert "[CostModel] make default size cost for libcalls small (again)" as it depends upon the primary revert.
This reverts commit 8ec7ea3ddce7379e13e8dfb4a5260a6d2004aa1c.
Temporarily Revert "[CostModel] add tests for math library calls; NFC" as it depends upon the primary revert.
This reverts commit df09f825995b10da03f148133c119f52c94fd6e4.
Temporarily Revert "[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC" as it depends upon the primary revert.
This reverts commit 618d555e8d926a83161774df2035519c387269db.
Clang generates a '%' prefix for some registers in CFI directives. E.g.
".cfi_register lr, r12" becomes ".cfi_register lr, %r12" after
processing.
Differential Revision: https://reviews.llvm.org/D91735
The assertion logic in SmallVector::assertSafeToReferenceAfterResize is
hard to follow; split out SmallVector::isSafeToReferenceAfterResize and
add early returns and comments. No functionality change here.
This reuses the existing lower-matrix-intrinsics pass rather than going
the legacy pass route of creating a new pass.
Use this new variant in the NPM -O0 pipeline.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D91811
There's no need to check for reference invalidation when
`SmallVector::resize` is shrinking; the parameter isn't accessed.
Differential Revision: https://reviews.llvm.org/D91832
Unused since https://reviews.llvm.org/D91762 and triggering
-Wunused-private-field
```
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:365:13: error: private field 'GetArgTLS' is not used [-Werror,-Wunused-private-field]
Constant *GetArgTLS;
^
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:366:13: error: private field 'GetRetvalTLS' is not used [-Werror,-Wunused-private-field]
Constant *GetRetvalTLS;
```
Reviewed By: stephan.yichao.zhao
Differential Revision: https://reviews.llvm.org/D91820
One less instruction and reducing use count of zext.
As alive2 confirms, we're fine with all the weird combinations of
undef elts in constants, but unless the shift amount was undef
for a lane, we must sanitize undef mask to zero, since sign bits
are no longer zeros.
https://rise4fun.com/Alive/d7r
```
----------------------------------------
Optimization: zz
Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2))
%o0 = zext %x
%o1 = shl %o0, C1
%r = and %o1, C2
=>
%n0 = sext %x
%r = and %n0, C2
Done: 2016
Optimization is correct!
```
Followup to 7de7c40898a8f815d661781c92757f93fa4c6e5b. I previously
removed a number of == comparisons to LocationSize::unknown(), but
missed these != comparisons.
When constructing a MemoryLocation by hand, require that a
LocationSize is explicitly specified. D91649 will split up
LocationSize::unknown() into two different states, and callers
should make an explicit choice regarding the kind of MemoryLocation
they want to have.
Similarly to assumes and guards deoptimize intrinsics are
marked as writing to ensure proper control dependencies
but they never modify any particular memory location.
Differential Revision: https://reviews.llvm.org/D91658
Instead of separately passing pointer and size, make use of
MemoryLocation. This allows us to also reuse all the existing
logic for determining the MemoryLocation correponding to an
instruction or call argument.
Not quite NFC because used locations may be more precise in some
cases.
This moves handling of alwaysinline, coroutines, matrix lowering, PGO,
and LTO-required passes into PassBuilder. Much of this is replicated
between Clang and opt. Other out-of-tree users also replicate some of
this, such as Rust [1] replicating the alwaysinline, LTO, and PGO
passes.
The LTO passes are also now run in
build(Thin)LTOPreLinkDefaultPipeline() since they are semantically
required for (Thin)LTO.
[1]: f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D91585
In the current state, if getFromHash(0) is called and there's no CU with
dwo_id=0, the lookup will stop at an empty slot, then the check
`Rows[H].getSignature() != S` won't cause the lookup to fail and return
a nullptr (as it should), because the empty slot has a 0 in the
signature field, and a pointer to the empty slot will be incorrectly
returned.
This patch fixes this by using the index field in the hash entry to
check for empty slots: signature = 0 can match a valid hash but
according to the spec the index for an occupied slot will always be
non-zero.
Differential Revision: https://reviews.llvm.org/D91670