This is a basic How-To that describes:
- What Windows Itanium is.
- How to assemble a build environment.
Differential Revision: https://reviews.llvm.org/D89518
Clarify that the base type endianity is used when creating implicit
location storage.
Remove duplicate definition of the generic type.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D98137
In "DWARF Extensions For Heterogeneous Debugging" document that the
DWARF generic type has a target architecture defined endianity.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D98126
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).
This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).
Differential Revision: https://reviews.llvm.org/D88175
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.
This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.
explicitly emitting retainRV or claimRV calls in the IR
This reapplies ed4718eccb12bd42214ca4fb17d196d49561c0c7, which was reverted
because it was causing a miscompile. The bug that was causing the miscompile
has been fixed in 75805dce5ff874676f3559c069fcd6737838f5c0.
Original commit message:
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This patch adds a new instruction that can represent variadic debug values,
DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set
of basic code changes in MachineInstr and a few adjacent areas, but does not
correctly handle variadic debug values outside of these areas, nor does it
generate them at any point.
The new instruction is similar to the existing DBG_VALUE instruction, with the
following differences: the operands are in a different order, any number of
values may be used in the instruction following the Variable and Expression
operands (these are referred to in code as “debug operands”) and are indexed
from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a
DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the
expression.
The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a
DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the
index given by the argument onto the Expression stack. For example the
sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at
index 0 onto the expression stack.”
Differential Revision: https://reviews.llvm.org/D82363
This patch adds a pipeline to support in-order CPUs such as ARM
Cortex-A55.
In-order pipeline implements a simplified version of Dispatch,
Scheduler and Execute stages as a single stage. Entry and Retire
stages are common for both in-order and out-of-order pipelines.
Differential Revision: https://reviews.llvm.org/D94928
The help text and documentation for the --discard-all option failed to
mention that the option also causes the removal of debug sections. This
change fixes both for both llvm-objcopy and llvm-strip.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97662
This patch is an update to LangRef by describing lifetime intrinsics' behavior
by following the description of MIR's LIFETIME_START/LIFETIME_END markers
at StackColoring.cpp (eb44682d67/llvm/lib/CodeGen/StackColoring.cpp (L163)) and the discussion in llvm-dev.
In order to explicitly define the meaning of an object lifetime, I added 'Object Lifetime' subsection.
Reviewed By: nlopes
Differential Revision: https://reviews.llvm.org/D94002
See pr46990(https://bugs.llvm.org/show_bug.cgi?id=46990). LICM should not sink store instructions to loop exit blocks which cross coro.suspend intrinsics. This breaks semantic of coro.suspend intrinsic which return to caller directly. Also this leads to use-after-free if the coroutine is freed before control returns to the caller in multithread environment.
This patch disable promotion by check whether loop contains coro.suspend intrinsics.
This is a resubmit of D86190.
Disabling LICM for loops with coroutine suspension is a better option not only for correctness purpose but also for performance purpose.
In most cases LICM sinks memory operations. In the case of coroutine, sinking memory operation out of the loop does not improve performance since coroutien needs to get data from the frame anyway. In fact LICM would hurt coroutine performance since it adds more entries to the frame.
Differential Revision: https://reviews.llvm.org/D96928
This caused miscompiles of Chromium tests for iOS due clobbering of live
registers. See discussion on the code review for details.
> Background:
>
> This fixes a longstanding problem where llvm breaks ARC's autorelease
> optimization (see the link below) by separating calls from the marker
> instructions or retainRV/claimRV calls. The backend changes are in
> https://reviews.llvm.org/D92569.
>
> https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
>
> What this patch does to fix the problem:
>
> - The front-end adds operand bundle "clang.arc.attachedcall" to calls,
> which indicates the call is implicitly followed by a marker
> instruction and an implicit retainRV/claimRV call that consumes the
> call result. In addition, it emits a call to
> @llvm.objc.clang.arc.noop.use, which consumes the call result, to
> prevent the middle-end passes from changing the return type of the
> called function. This is currently done only when the target is arm64
> and the optimization level is higher than -O0.
>
> - ARC optimizer temporarily emits retainRV/claimRV calls after the calls
> with the operand bundle in the IR and removes the inserted calls after
> processing the function.
>
> - ARC contract pass emits retainRV/claimRV calls after the call with the
> operand bundle. It doesn't remove the operand bundle on the call since
> the backend needs it to emit the marker instruction. The retainRV and
> claimRV calls are emitted late in the pipeline to prevent optimization
> passes from transforming the IR in a way that makes it harder for the
> ARC middle-end passes to figure out the def-use relationship between
> the call and the retainRV/claimRV calls (which is the cause of
> PR31925).
>
> - The function inliner removes an autoreleaseRV call in the callee if
> nothing in the callee prevents it from being paired up with the
> retainRV/claimRV call in the caller. It then inserts a release call if
> claimRV is attached to the call since autoreleaseRV+claimRV is
> equivalent to a release. If it cannot find an autoreleaseRV call, it
> tries to transfer the operand bundle to a function call in the callee.
> This is important since the ARC optimizer can remove the autoreleaseRV
> returning the callee result, which makes it impossible to pair it up
> with the retainRV/claimRV call in the caller. If that fails, it simply
> emits a retain call in the IR if retainRV is attached to the call and
> does nothing if claimRV is attached to it.
>
> - SCCP refrains from replacing the return value of a call with a
> constant value if the call has the operand bundle. This ensures the
> call always has at least one user (the call to
> @llvm.objc.clang.arc.noop.use).
>
> - This patch also fixes a bug in replaceUsesOfNonProtoConstant where
> multiple operand bundles of the same kind were being added to a call.
>
> Future work:
>
> - Use the operand bundle on x86-64.
>
> - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
> calls with the operand bundles.
>
> rdar://71443534
>
> Differential Revision: https://reviews.llvm.org/D92808
This reverts commit ed4718eccb12bd42214ca4fb17d196d49561c0c7.
Document the default for the XNACK and SRAMECC target features for code object V2-V3 and V4.
Reviewed By: kzhuravl
Differential Revision: https://reviews.llvm.org/D97598
And clarify in the "writing a pass" docs that both the legacy and new
PMs are being used for the codegen/optimization pipelines.
Reviewed By: ychen, asbirlea
Differential Revision: https://reviews.llvm.org/D97515
This document was originally introduced in ab4648504b2, and was reverted in
912bc4980e9 while I investigated a number of shpinx bot errors. This commit
reintroduces the document with fixes for those errors, as well as some
improvements to the wording and formatting.
For some build configurations, `check-all` calls lit multiple times to
run multiple lit test suites. Most recently, I've found this to be
true when configuring openmp as part of `LLVM_ENABLE_RUNTIMES`, but
this is not the first time.
If one test suite fails, none of the remaining test suites run, so you
cannot determine if your patch has broken them. It can then be
frustrating to try to determine which `check-` targets will run the
remaining tests without getting stuck on the failing tests.
When such cases arise, it is probably best to adjust the cmake
configuration for `check-all` to run all test suites as part of one
lit invocation. Because that fix will likely not be implemented and
land immediately, this patch introduces `--ignore-fail` to serve as a
workaround for developers trying to see test results until it does
land:
```
$ LIT_OPTS=--ignore-fail ninja check-all
```
One problem with `--ignore-fail` is that it makes it challenging to
detect test failures in a script, perhaps in CI. This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use `--ignore-fail` indefinitely.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D96371
The current size of the llvm-project repository exceeds 1 GB. A shallow clone can save a lot of space and time. Some developers might not aware of this feature.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D97118
Enabled "bound_ctrl:1" and disabled "bound_ctrl:-1" syntax.
Corrected printer to output "bound_ctrl:1" instead of "bound_ctrl:0".
See bug 35397 for detailed issue description.
Differential Revision: https://reviews.llvm.org/D97048
In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D96662
As discussed on the RFC [0], I am sharing the set of patches that
enables checking of original Debug Info metadata preservation in
optimizations. The proof-of-concept/proposal can be found at [1].
The implementation from the [1] was full of duplicated code,
so this set of patches tries to merge this approach into the existing
debugify utility.
For example, the utility pass in the original-debuginfo-check
mode could be invoked as follows:
$ opt -verify-debuginfo-preserve -pass-to-test sample.ll
Since this is very initial stage of the implementation,
there is a space for improvements such as:
- Add support for the new pass manager
- Add support for metadata other than DILocations and DISubprograms
[0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
[1] https://github.com/djolertrk/llvm-di-checker
Differential Revision: https://reviews.llvm.org/D82545
The test that was failing is now forced to use the old PM.
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.
This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
ofDW_AT_comp_dir in DWARF.
Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.
Differential Revision: https://reviews.llvm.org/D95753
Rework template argument checking so that all arguments are type-checked
and cast if necessary.
Add a test.
Differential Revision: https://reviews.llvm.org/D96416
As discussed on the RFC [0], I am sharing the set of patches that
enables checking of original Debug Info metadata preservation in
optimizations. The proof-of-concept/proposal can be found at [1].
The implementation from the [1] was full of duplicated code,
so this set of patches tries to merge this approach into the existing
debugify utility.
For example, the utility pass in the original-debuginfo-check
mode could be invoked as follows:
$ opt -verify-debuginfo-preserve -pass-to-test sample.ll
Since this is very initial stage of the implementation,
there is a space for improvements such as:
- Add support for the new pass manager
- Add support for metadata other than DILocations and DISubprograms
[0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
[1] https://github.com/djolertrk/llvm-di-checker
Differential Revision: https://reviews.llvm.org/D82545
This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction
signifies that an operation was already sign extended from a smaller type.
This is useful for functions with sign-extended parameters.
E.g.
```
define void @foo(i16 signext %x) {
...
}
```
This adds verifier, regbankselect, and instruction selection support for
G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT.
Differential Revision: https://reviews.llvm.org/D96890
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D96594
Some test systems do not use lit for test discovery but only for its
substitution and test selection because they use another way of managing
test collections, e.g. CTest. This forces those tests to be invoked with
lit --no-indirectly-run-check. When a mix of lit version is in use, it
requires to detect the availability of that option.
This commit provides a new config option standalone_tests to signal a
directory made of tests meant to run as standalone. When this option is
set, lit skips test discovery and the indirectly run check. It also adds
the missing documentation for --no-indirectly-run-check.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94766
1. Emit warnings for files without symbols.
2. Add -no_warning_for_no_symbols.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D95843
The few options are niche. They solved a problem which was traditionally solved
with more shell commands (`llvm-readelf -n` fetches the Build ID. Then
`ln` is used to hard link the file to a directory derived from the Build ID.)
Due to limitation, they are no longer used by Fuchsia and they don't appear to
be used elsewhere (checked with Google Search and Debian Code Search). So delete
them without a transition period.
Announcement: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148446.html
Differential Revision: https://reviews.llvm.org/D96310
This patch adds a new intrinsic experimental.vector.reduce that takes a single
vector and returns a vector of matching type but with the original lane order
reversed. For example:
```
vector.reverse(<A,B,C,D>) ==> <D,C,B,A>
```
The new intrinsic supports fixed and scalable vectors types.
The fixed-width vector relies on shufflevector to maintain existing behaviour.
Scalable vector uses the new ISD node - VECTOR_REVERSE.
This new intrinsic is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].
Patch by Paul Walker (@paulwalker-arm).
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html
Differential Revision: https://reviews.llvm.org/D94883
In the past, it was stated in D87994 that it is allowed to dereference a pointer that is partially undefined
if all of its possible representations fit into a dereferenceable range.
The motivation of the direction was to make a range analysis helpful for assuring dereferenceability.
Even if a range analysis concludes that its offset is within bounds, the offset could still be partially undefined; to utilize the range analysis, this relaxation was necessary.
https://groups.google.com/g/llvm-dev/c/2Qk4fOHUoAE/m/KcvYMEgOAgAJ has more context about this.
However, this is currently blocking another optimization, which is annotating the noundef attribute for library functions' arguments. D95122 is the patch.
Currently, there are quite a few library functions which cannot have noundef attached to its pointer argument because it can be transformed from load/store.
For example, MemCpyOpt can convert stores into memset:
```
store p, i32 0
store (p+1), i32 0 // Since currently it is allowed for store to have partially undefined pointer..
->
memset(p, 0, 8) // memset cannot guarantee that its ptr argument is noundef.
```
A bigger problem is that this makes unclear which library functions are allowed to have 'noundef' and which functions aren't (e.g., strlen).
This makes annotating noundef almost impossible for this kind of functions.
This patch proposes that all memory operations should have well-defined pointers.
For memset/memcpy, it is semantically equivalent to running a loop until the size is met (and branching on undef is UB), so the size is also updated to be well-defined.
Strictly speaking, this again violates the implication of dereferenceability from range analysis result.
However, I think this is okay for the following reasons:
1. It seems the existing analyses in the LLVM main repo does not have conflicting implementation with the new proposal.
`isDereferenceableAndAlignedPointer` works only when the GEP offset is constant, and `isDereferenceableAndAlignedInLoop` is also fine.
2. A possible miscompilation happens only when the source has a pointer with a *partially* undefined offset (it's okay with poison because there is no 'partially poison' value).
But, at least I'm not aware of a language using LLVM as backend that has a well-defined program while allowing partially undefined pointers.
There might be such a language that I'm not aware of, but improving the performance of the mainstream languages like C and Rust is more important IMHO.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D95238
explicitly emitting retainRV or claimRV calls in the IR
Background:
This fixes a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.attachedcall" to calls,
which indicates the call is implicitly followed by a marker
instruction and an implicit retainRV/claimRV call that consumes the
call result. In addition, it emits a call to
@llvm.objc.clang.arc.noop.use, which consumes the call result, to
prevent the middle-end passes from changing the return type of the
called function. This is currently done only when the target is arm64
and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
claimRV is attached to the call since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since the ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if retainRV is attached to the call and
does nothing if claimRV is attached to it.
- SCCP refrains from replacing the return value of a call with a
constant value if the call has the operand bundle. This ensures the
call always has at least one user (the call to
@llvm.objc.clang.arc.noop.use).
- This patch also fixes a bug in replaceUsesOfNonProtoConstant where
multiple operand bundles of the same kind were being added to a call.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
Before, the first mention of LLVM's license on the developer policy page stated
that LLVM's license is Apache 2. This patch makes that more accurate by
mentioning the LLVM exception this first time the LLVM license is discussed on
that page, i.e. Apache-2.0 with LLVM-exception.
Technically, the correct SPDX identifier for LLVM's license is 'Apache-2.0 WITH
LLVM-exception', but I thought that writing the 'WITH' in lower case made the
paragraph easier to read without reducing clarity.
Differential Revision: https://reviews.llvm.org/D96482
This is a follow up patch to D83136 adding the align attribute to `cmpxchg`.
See also D83465 for `atomicrmw`.
Differential Revision: https://reviews.llvm.org/D87443
This reverts commit 4a64d8fe392449b205e59031aad5424968cf7446.
Makes clang crash when buildling trivial iOS programs, see comment
after https://reviews.llvm.org/D92808#2551401
emitting retainRV or claimRV calls in the IR
This reapplies 3fe3946d9a958b7af6130241996d9cfcecf559d4 without the
changes made to lib/IR/AutoUpgrade.cpp, which was violating layering.
Original commit message:
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
This reverts commit 3fe3946d9a958b7af6130241996d9cfcecf559d4.
The commit violates layering by including a header from Analysis in
lib/IR/AutoUpgrade.cpp.
emitting retainRV or claimRV calls in the IR
Background:
This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.
https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
What this patch does to fix the problem:
- The front-end adds operand bundle "clang.arc.rv" to calls, which
indicates the call is implicitly followed by a marker instruction and
an implicit retainRV/claimRV call that consumes the call result. In
addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
consumes the call result, to prevent the middle-end passes from changing
the return type of the called function. This is currently done only when
the target is arm64 and the optimization level is higher than -O0.
- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
with the operand bundle in the IR and removes the inserted calls after
processing the function.
- ARC contract pass emits retainRV/claimRV calls after the call with the
operand bundle. It doesn't remove the operand bundle on the call since
the backend needs it to emit the marker instruction. The retainRV and
claimRV calls are emitted late in the pipeline to prevent optimization
passes from transforming the IR in a way that makes it harder for the
ARC middle-end passes to figure out the def-use relationship between
the call and the retainRV/claimRV calls (which is the cause of
PR31925).
- The function inliner removes an autoreleaseRV call in the callee if
nothing in the callee prevents it from being paired up with the
retainRV/claimRV call in the caller. It then inserts a release call if
the call is annotated with claimRV since autoreleaseRV+claimRV is
equivalent to a release. If it cannot find an autoreleaseRV call, it
tries to transfer the operand bundle to a function call in the callee.
This is important since ARC optimizer can remove the autoreleaseRV
returning the callee result, which makes it impossible to pair it up
with the retainRV/claimRV call in the caller. If that fails, it simply
emits a retain call in the IR if the implicit call is a call to
retainRV and does nothing if it's a call to claimRV.
Future work:
- Use the operand bundle on x86-64.
- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
calls annotated with the operand bundles.
rdar://71443534
Differential Revision: https://reviews.llvm.org/D92808
Based on the comments in the code, the idea is that AsmPrinter is
unable to produce entry value blocks of arbitrary length, such as
DW_OP_entry_value [DW_OP_reg5 DW_OP_lit1 DW_OP_plus]. But the way the
Verifier check is written it also disallows DW_OP_entry_value
[DW_OP_reg5] DW_OP_lit1 DW_OP_plus which seems to overshoot the
target.
Note that this patch does not change any of the safety guards in
LiveDebugValues — there is zero behavior change for clang. It just
allows us to legalize more complex expressions in future patches.
rdar://73907559
Differential Revision: https://reviews.llvm.org/D95990
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
With the new PM imminent, bugpoint will diverge from opt, meaning it may
not reproduce a crash with the same arguments passed to opt. We need to
specify alternatives to bugpoint for reducing crashes.
I looked at the rest of the document to see if anything could be
improved. Major highlights:
* Run -Xclang -disable-llvm-passes instead of -O0 for skipping IR passes
* Mention the files that clang dumps on a crash
* Remove outdated reference to `delta` and plug `creduce` instead
* Mention llvm-reduce on top of bugpoint
* Mention --print-before-all --print-module-scope
* Mention sanitizers in addition to valgrind
* Mention opt-bisect for miscompiles
Reviewed By: fhahn, MaskRay
Differential Revision: https://reviews.llvm.org/D95578
So far, it was not specified what happens with the VGPRs of inactive
lanes when functions are called. This patch explicitely mentions that
the VGPR values of inactive lanes need to be preserved for all
registers.
This describes the current behavior, as only active lanes of registers
are saved to scratch. Also, as the multi-lane nature of VGPRs is not
properly modeled, we cannot determine the live VGPRs from inactive lanes
at calls. So we cannot save them, even if we intended to do so.
Differential Revision: https://reviews.llvm.org/D95610
To set non-default rounding mode user usually calls function 'fesetround'
from standard C library. This way has some disadvantages.
* It creates unnecessary dependency on libc. On the other hand, setting
rounding mode requires few instructions and could be made by compiler.
Sometimes standard C library even is not available, like in the case of
GPU or AI cores that execute small kernels.
* Compiler could generate more effective code if it knows that a particular
call just sets rounding mode.
This change introduces new IR intrinsic, namely 'llvm.set.rounding', which
sets current rounding mode, similar to 'fesetround'. It however differs
from the latter, because it is a lower level facility:
* 'llvm.set.rounding' does not return any value, whereas 'fesetround'
returns non-zero value in the case of failure. In glibc 'fesetround'
reports failure if its argument is invalid or unsupported or if floating
point operations are unavailable on the hardware. Compiler usually knows
what core it generates code for and it can validate arguments in many
cases.
* Rounding mode is specified in 'fesetround' using constants like
'FE_TONEAREST', which are target dependent. It is inconvenient to work
with such constants at IR level.
C standard provides a target-independent way to specify rounding mode, it
is used in FLT_ROUNDS, however it does not define standard way to set
rounding mode using this encoding.
This change implements only IR intrinsic. Lowering it to machine code is
target-specific and will be implemented latter. Mapping of 'fesetround'
to 'llvm.set.rounding' is also not implemented here.
Differential Revision: https://reviews.llvm.org/D74729
On z/OS, the following error message is not matched correctly in lit tests.
```
EDC5129I No such file or directory.
```
This patch uses a lit config substitution to check for platform specific error messages.
Reviewed By: muiez, jhenderson
Differential Revision: https://reviews.llvm.org/D95246
Add LLVM to the DW_CFA_LLVM_def_aspace_cfa and
DW_CFA_LLVM_def_aspace_cfa_sf DWARF extensions.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D95640
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.
This is intended to be similar to AssertZext in SelectionDAG.
For example,
```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```
Signifies that the top 48 bits of %x are known to be 0.
This is useful in cases like this:
```
define i1 @zeroext_param(i8 zeroext %x) {
%cmp = icmp ult i8 %x, -20
ret i1 %cmp
}
```
In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit
value.
If we know that `%x` is already zero-ed out in the relevant high bits, we can
avoid the truncate.
Currently, in GISel, this looks like this:
```
_zeroext_param:
and w8, w0, #0xff ; We don't actually need this!
cmp w8, #236
cset w0, lo
ret
```
While SDAG does not produce the truncation, since it knows that it's
unnecessary:
```
_zeroext_param:
cmp w0, #236
cset w0, lo
ret
```
This patch
- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it
It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)
This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.
Differential Revision: https://reviews.llvm.org/D95564
... and similarly for some other cases. This is for consistency and to
make it easier to search for mentions of a particular architecture.
Differential Revision: https://reviews.llvm.org/D95453
Commit be9f322e8dc530a56f03356aad31fa9031b27e26 moved the list of workers from
slaves.py to workers.py, but the documentation in "How To Add A Builder" was
never updated and now references a non-existing file. This fixes that.
Reviewed By: gkistanova
Differential Revision: https://reviews.llvm.org/D94886
The documentation for contributing to LLVM currently links to the section
explaining how to submit a Phabricator review using the web interface.
I believe it would be better to link to the general page for using
Phabricator instead, which explains how to sign up with Phabricator,
and also how to submit patches using either the web interface or the
command-line.
I think this is worth changing because what currently *appears* to be our
preferred way of submitting a patch (through the web interface) isn't
actually what we prefer. Indeed, patches submitted from the command-line
have more meta-data available (such as which repository the patch targets),
and also can't suffer from missing context.
Differential Revision: https://reviews.llvm.org/D94929