The patch removes unnecessary members of DWARFDebugAddr and further
simplifies the implementation by separating parsing methods of tables
in the DWARFv5 and pre-standard formats.
Differential Revision: https://reviews.llvm.org/D74197
As a preparation for the subsequent patches, this updates the wordings
of some error messages in DWARFDebugAddr.
Differential Revision: https://reviews.llvm.org/D74196
This replaces a collocation "a .debug_addr table" with "an address table"
because the latter sounds more accurate.
Differential Revision: https://reviews.llvm.org/D74407
As there is no header in pre-DWARFv5 address tables, and we fill
the class data members with some artificial values, we should not
dump them as that might be misleading.
Differential Revision: https://reviews.llvm.org/D74195
As addresses in the address tables may have relocations, thus,
the relocations should be resolved to read the correct address.
That is especially important for targets that use RELA relocations
because in that case addends are stored in relocation sections.
Differential Revision: https://reviews.llvm.org/D74404
This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses
them to implement soft promotion for the half type. This is
enough to provide basic support for __fp16 with strictfp.
Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C
is enabled.
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.
There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
This was creating a select on true/false values, and then comparing
that later. This produced more work for later combines, which can be
avoided by just using the boolean values. This was copied from the
original DAG expansion, which also has the same problem. This doesn't
have a observable change using SelectionDAG, but since GlobalISel is
missing these optimizations, the final code was noticeably longer.
These have nicer expansions implemented in the DAG. Ideally we would
either directly implement all of these special expansions, or stop
expanding division in the IR.
Mark the CrashRecoveryContextImpl constructor noexcept, so that MSVC
won't emit an unwind helper to clean up the allocation from `new` if the
constructor throws an exception.
Otherwise, MSVC complains:
llvm\lib\Support\CrashRecoveryContext.cpp(220): error C2712: \
Cannot use __try in functions that require object unwinding
The other simple fix would be to wrap `new` in a static helper or
lambda.
Users have reported that Tensorflow builds LLVM with /EHsc.
This is apparently worse than 1-byte alignment. This does not attempt
to decompose 2-byte aligned wide stores, but will stop trying to
produce them.
Also fix bug in LoadStoreVectorizer which was decreasing the alignment
and vectorizing stack accesses. It was assuming a stack object was an
alloca that could have its base alignment changed, which is not true
if the pointer is derived from a function argument.
Since natural fdiv lowering is now more conservative even with
denormals disabled, we get a slower expansion from just a plain
1.0/fdiv. Directly emit the rcp intrinsic when using it to implement
integer division to avoid a pointlessly complex sequence.
https://reviews.llvm.org/D74273
Pad macho section data to pointer size bytes, so that relocation
table and symbol table following section data will be pointer size
aligned.
Patch by pguo.
We aren't doing a good job of optimizing AVX512 outside of this code. So remove the bail out for AVX512 and replace with a FIXME. This at least gets us the AVX2 codegen.
Differential Revision: https://reviews.llvm.org/D74431
Update test scripts were limited because they performed a single action
on the entire file and if that action was controlled by arguments, like
the one introduced in D68819, there was no record of it.
This patch introduces the capability of changing the arguments passed to
the script "on-the-fly" while processing a test file. In addition, an
"on/off" switch was added so that processing can be disabled for parts
of the file where the content is simply copied. The last extension is a
record of the invocation arguments in the auto generated NOTE. These
arguments are also picked up in a subsequent invocation, allowing
updates with special options enabled without user interaction.
To change the arguments the string `UTC_ARGS:` has to be present in a
line, followed by "additional command line arguments". That is
everything that follows `UTC_ARGS:` will be added to a growing list
of "command line arguments" which is reparsed after every update.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D69701
Instructions marked as FrameSetup do not cause requestLabelAfterInsn to
be called and so no such label is generated. Call instructions which
require call site entries to be generated require this label to be
present in order to calculate the return PC offset/address, but the
check for whether the call instruction is marked as FrameSetup was not
present.
Therefore in the case where a call instruction is marked as FrameSetup,
an assertion failure occurs if a call site entry is to be generated.
This is the case with RISC-V's implementation of save/restore via
library calls.
Differential Revision: https://reviews.llvm.org/D71593
This patch adds the support required for using the __riscv_save and
__riscv_restore libcalls to implement a size-optimization for prologue
and epilogue code, whereby the spill and restore code of callee-saved
registers is implemented by common functions to reduce code duplication.
Logic is also included to ensure that if both this optimization and
shrink wrapping are enabled then the prologue and epilogue code can be
safely inserted into the basic blocks chosen by shrink wrapping.
Differential Revision: https://reviews.llvm.org/D62686
As an approximation to a dead edge we can check if the terminator is
dead. If so, the corresponding operand use in a PHI node is dead even if
the PHI node itself is not.
ObjectLinkingLayer was not correctly propagating dependencies through local
symbols within an object. This could cause symbol lookup to return before a
searched-for symbol is ready if the following conditions are met:
(1) The definition of the symbol being searched for transitively depends on a
local symbol within the same object, and that local symbol in turn
transitively depends on an external symbol provided by some other module
in the JIT.
(2) Concurrent compilation is enabled.
(3) Thread scheduling causes the lookup of the searched-for symbol to return
before all transitive dependencies of the looked-up symbol are emitted.
This bug was found by inspection and has not been observed in practice.
A jitlink test case has been added to verify that symbol dependencies are
correctly propagated through local symbol definitions.
Summary:
SIInstrInfo::expandPostRAPseudo converts ENTER_WWM in-place into an
S_OR_SAVEEXEC instruction that needs certain implicit operands. Without
this patch I get errors like this that make it harder to use -stop-after
to bisect the pass pipeline:
$ llc -march=amdgcn test/CodeGen/AMDGPU/wqm.ll -stop-after=postrapseudos -o - | sed -E 's/ (from|into) custom "TargetCustom[0-9]+"//' | llc -march=amdgcn -x=mir
error: <stdin>:1295:70: missing implicit register operand 'implicit-def $scc'
renamable $sgpr2_sgpr3 = S_OR_SAVEEXEC_B64 -1, implicit-def $exec
^
Note that this error is currently only generated by MIParser but it
comes with a FIXME comment:
// FIXME: Move the implicit operand verification to the machine verifier.
Reviewers: critson, arsenm, rampitec, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74428
Summary:
Dwarf stores source-file names the three parts:
<compilation_directory><include_directory><filename>
Prior to this change, the code only allowed retrieving either all
three as the absolute path, or just the filename. But many
compile-command lines--especially those in hermetic build systems
don't specify an absolute path, nor just the filename, but rather the
path relative to the compilation directory. This features allows
retrieving them in that style.
Add tests for path printing styles.
Modify createBasicPrologue to handle include directories.
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73383
Summary:
Part of the changes in D44564 made BasicAA not CFG only due to it using
PhiAnalysisValues which may have values invalidated.
Subsequent patches (rL340613) appear to have addressed this limitation.
BasicAA should not be invalidated by non-CFG-altering passes.
A concrete example is MemCpyOpt which preserves CFG, but we are testing
it invalidates BasicAA.
llvm-dev RFC: https://groups.google.com/forum/#!topic/llvm-dev/eSPXuWnNfzM
Reviewers: john.brawn, sebpop, hfinkel, brzycki
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74353
Based on uops.info these should have 5 cycle latency as they did on Haswell/Broadwell. I have no additional internal information from Intel.
This was also shown as a discrepancy in the spreadsheet that was sent with an early llvm-dev post about llvm-exegesis.
It also matches Agner Fog.
Differential Revision: https://reviews.llvm.org/D74357
This restores commit 748bb5a0f1964d20dfb3891b0948ab6c66236c70, along
with a fix for a Chromium test suite build issue (and a new test for
that case).
Differential Revision: https://reviews.llvm.org/D73242
Summary:
* for <= tbd_v3, simulator platforms appear the same as the real
platform and we distinct the difference from the architecture.
fixes: rdar://problem/59161559
Reviewers: ributzka, steven_wu
Reviewed By: ributzka
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74416