Summary:
This support is needed for the Fortran array variables with pointer/allocatable
attribute. This support enables debugger to identify the status of variable
whether that is currently allocated/associated.
for pointer array (before allocation/association)
without DW_AT_associated
(gdb) pt ptr
type = integer (140737345375288:140737354129776)
(gdb) p ptr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_associated
(gdb) pt ptr
type = integer (:)
(gdb) p ptr
$1 = <not associated>
for allocatable array (before allocation)
without DW_AT_allocated
(gdb) pt arr
type = integer (140737345375288:140737354129776)
(gdb) p arr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_allocated
(gdb) pt arr
type = integer, allocatable (:)
(gdb) p arr
$1 = <not allocated>
Testing
- unit test cases added
- check-llvm
- check-debuginfo
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D83544
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.
This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.
My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.
This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.
It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.
The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.
Additional context is in D79744.
This patch changes llvm-readelf (and llvm-readobj for consistency)
behavior to print an error when executed with no input files.
Reading from stdin can be achieved via a '-' for the input
object.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46400
Differential Revision: https://reviews.llvm.org/D83704
Reviewed by: jhenderson, MaskRay, sbc, jyknight
This diff starts the implementation of llvm-libtool-darwin
(an llvm based replacement of cctool's libtool).
Libtool is used for creating static and dynamic libraries
from a bunch of object files given as input.
Reviewed by alexshap, smeenai, jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D82923
From @erichkeane:
```
This patch doesn't seem to build for me:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp: In function ‘llvm::Error llvm::exegesis::parseDataBuffer(const char*, size_t, const void*, const void*, llvm::SmallVector<long int, 4>*)’:
/iusers/ekeane1/workspaces/llvm-project/llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp:99:37: error: ‘struct perf_branch_entry’ has no member named ‘cycles’
CycleArray->push_back(Entry.cycles);
I'm on RHEL7, so I have kernel 3.10, so it doesn't have 'cycles'.
According ot this: https://elixir.bootlin.com/linux/v4.3/source/include/uapi/linux/perf_event.h#L963 kernel 4.3 is the first time that 'cycles' appeared in this structure.
```
Make explicit that freeze does not touch paddings of an aggregate.
(Relevant comment: https://reviews.llvm.org/D83752#2152550)
This implies that `v = freeze(load p); store v, q` may still leave undef bits
or poison in memory if `v` is an aggregate, but it still happens for
non-byte integers such as i1.
Differential Revision: https://reviews.llvm.org/D83927
Starting with Skylake, the LBR contains the precise number of cycles between the two
consecutive branches.
Making use of this will hopefully make the measurements more precise than the
existing methods of using RDTSC.
Differential Revision: https://reviews.llvm.org/D77422
This came up in a recent review, someone was wondering were was
this all documented and I couldn't find a reference to provide.
Differential Revision: https://reviews.llvm.org/D83816
Like most readability rules, it isn't absolute and there is a matter of taste
to it. I think more recent part of the project may be more consistent in the
current application of the guideline. I suspect sources like
mlir/lib/Dialect/StandardOps/IR/Ops.cpp may be examples of this at the moment.
Differential Revision: https://reviews.llvm.org/D82594
Some of the system registers readable on AArch64 and ARM platforms
return different values with each read (for example a timer counter),
these shouldn't be hoisted outside loops or otherwise interfered with,
but the normal @llvm.read_register intrinsic is only considered to read
memory.
This introduces a separate @llvm.read_volatile_register intrinsic and
maps all system-registers on ARM platforms to use it for the
__builtin_arm_rsr calls. Registers declared with asm("r9") or similar
are unaffected.
This changes the matrix load/store intrinsic definitions to load/store from/to
a pointer, and not from/to a pointer to a vector, as discussed in D83477.
This also includes the recommit of "[Matrix] Tighten LangRef definitions and
Verifier checks" which adds improved language reference descriptions of the
matrix intrinsics and verifier checks.
Differential Revision: https://reviews.llvm.org/D83785
Loop metadata nodes do not adhere to the documented property:
(a) LoopIDs are not unique: Any pass that duplicates IR will do it
including its metadata (e.g. LoopVersioning) such that multiple
loops are linked with the same LoopID. There is even a test case
(Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple
loops with the same LoopID.
(b) LoopIDs are not persistent: Adding or removing an item from a LoopID
can only be done by creating a new MDNode and assigning it to the
loop's branch(es). Passes such as LoopUnroll (llvm.loop.unroll.disable)
and LoopVectorize (llvm.loop.isvectorized) use this to mark loops to
not be transformed multiple times or to avoid that a LoopVersioned
original loop is transformed.
Update the documentation according to how llvm.loop is used in practice.
Differential Revision: https://reviews.llvm.org/D55290
This tightens the matrix intrinsic definitions in LLVM LangRef and adds
correspondings checks to the IR Verifier.
Differential Revision: https://reviews.llvm.org/D83477
fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E)
This is only allowed when "reassoc" is present on the fadd.
As discussed in D80801, this transform goes beyond
what is allowed by "contract" FMF (-ffp-contract=fast).
That is because we are fusing the trailing add of 'E' with a
multiply, but without "reassoc", the code mandates that the
products A*B and C*D are added together before adding in 'E'.
I've added this example to the LangRef to try to clarify the
meaning of "contract". If that seems reasonable, we should
probably do something similar for the clang docs because
there does not appear to be any formal spec for the behavior
of -ffp-contract=fast.
Differential Revision: https://reviews.llvm.org/D82499
This doesn't appear used for anything, and is emitted incorrectly
based on the description. This also depends on the IR type, and
pointee element type.
In FileCheck.rst, add `-dump-input-context` and `-dump-input-filter`,
and fix some `-dump-input` documentation.
In `FileCheck -help`, `cl::value_desc("kind")` is being ignored for
`-dump-input-filter`, so just drop it.
Extend `-dump-input=help` to mention FILECHECK_OPTS.
Summary:
As per disscussion in D83351, using `for_each` is potentially confusing,
at least in regards to inconsistent style (there's less than 100 `for_each`
usages in LLVM, but ~100.000 `for` range-based loops
Therefore, it should be avoided.
Reviewers: dblaikie, nickdesaulniers
Reviewed By: dblaikie, nickdesaulniers
Subscribers: hubert.reinterpretcast, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83431
This adds the --debug-vars option to llvm-objdump, which prints
locations (registers/memory) of source-level variables alongside the
disassembly based on DWARF info. A vertical line is printed for each
live-range, with a label at the top giving the variable name and
location, and the position and length of the line indicating the program
counter range in which it is valid.
Differential revision: https://reviews.llvm.org/D70720
Summary:
Fixes two minor issues in the docs present under `ninja docs-llvm-html`:
1 - A header is too small:
```
Warning, treated as error:
llvm/llvm/docs/Passes.rst:70:Title underline too short.
``-basic-aa``: Basic Alias Analysis (stateless AA impl)
------------------------------------------------------
```
2 - Multiple definitions on a non-anonymous target (llvm-dev mailing list):
```
Warning, treated as error:
llvm/llvm/docs/DeveloperPolicy.rst:3:Duplicate explicit target name: "llvm-dev mailing list".
```
Reviewers: lattner
Reviewed By: lattner
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83416
LLVM currently does not require function parameters or return values
to be fully initialized, and does not care if they are poison. This can
be useful if the frontend ABI makes no such demands, but may prevent
helpful backend transformations in case they do. Specifically, the C
and C++ languages require all scalar function operands to be fully
determined.
Introducing this attribute is of particular use to MemorySanitizer
today, although other transformations may benefit from it as well.
We can modify MemorySanitizer instrumentation to provide modest (17%)
space savings where `frozen` is present.
This commit only adds the attribute to the Language Reference, and
the actual implementation of the attribute will follow in a separate
commit.
Differential Revision: https://reviews.llvm.org/D82316
This cleans up the stack allocated by a @llvm.call.preallocated.setup.
Should either call the teardown or the preallocated call to clean up the
stack. Calling both is UB.
Add LangRef.
Add verifier check that the token argument is a @llvm.call.preallocated.setup.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D83354
Do not enforce recommonmark dependency if sphinx is called to build
manpages. In order to do this, try to import recommonmark first
and do not configure it if it's not available. Additionally, declare
a custom tags for the selected builder via CMake, and ignore
recommonmark import failure when 'man' target is used.
This will permit us to avoid the problematic recommonmark dependency
for the majority of Gentoo users that do not need to locally build
the complete documentation but want to have tool manpages.
Differential Revision: https://reviews.llvm.org/D83161
Summary:
D80831 changed part of the prefix usage for AIX.
But there are other places getting prefix from DataLayout.
This patch intends to make prefix usage consistent on AIX.
Reviewed by: hubert.reinterpretcast, daltenty
Differential Revision: https://reviews.llvm.org/D81270
The "new way" of enabling recommonmark is only supported in recommonmark
0.5 and later. Use the deprecated approach with versions of Sphinx that
still support it.
If I understand correctly there's no way to use older versions of
recommonmark (<0.5) with newer versions of Sphinx (>3.0) because the old
approach got removed.
Differential revision: https://reviews.llvm.org/D75284
Every other value parameter attribute uses parentheses, so accept this
as the preferred modern syntax. Updating everything to use the new
syntax is left for a future change.
This is cleaning up comments (mostly in the bitcode handling) about
removing some backward compatibility aspect in the 4.0 release.
Historically, "4.0" was used during the development of the 3.x
versions as "this future major breaking change version". At the time
the major number was used to indicate the compatibility. When we
reached 3.9 we decided to change the numbering, instead of going to
3.10 we went to 4.0 but after changing the meaning of the major
number to not mean anything anymore with respect to bitcode backward
compatibility.
The current policy
(https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility)
indicates only now:
The current LLVM version supports loading any bitcode since version 3.0.
Differential Revision: https://reviews.llvm.org/D82514
Before the fix the build of docs-llvm-html would fail.
The D80959 introduced options that are not recognized, so we have
warning as:
llvm-project/llvm/docs/CommandGuide/llvm-dwarfdump.rst:40\
:unknown option: --debug-info
Differential Revision: https://reviews.llvm.org/D82460
Before the fix the build of docs-llvm-html would fail.
The rG8bc03d216824 introduced a reference to an undefined label,
so we have warning as:
llvm-project/llvm/docs/GlobalISel/GenericOpcode.rst:295:\
undefined label: i_intr_llvm_ptrmask (if the link has no\
caption the label must precede a section header)