The intrinsic int_hexagon_S2_asr_i_vh was mapped to S2_asr_r_vh, which
is wrong. The testcase vasrh.select.ll was using an invalid immediate
for that intrinsic. This is not a proper testcase, since at the MIR
level such use of this intrinsic should never appear.
Together with 824b25fc02, this completes the fix for llvm.org/PR44090.
Having a generic CPU removes a warning when creating a subtarget without
the CPU being explicitly specified.
Differential Revision: https://reviews.llvm.org/D70490
Follow-up of cb47b8783: don't query TTI->preferPredicateOverEpilogue when
option -prefer-predicate-over-epilog is set to false, i.e. when we prefer not
to predicate the loop.
Differential Revision: https://reviews.llvm.org/D70382
With updates to various LLVM tools that use SpecialCastList.
It was tempting to use RealFileSystem as the default, but that makes it
too easy to accidentally forget passing VFS in clang code.
Summary:
This patch removes manual location list handling in the statistics code
and replaces it with the new DWARFDie api, which provides access to a
"cooked" location list. This has the following effects:
- the code now properly handles split-dwarf location lists
- it will automatically support dwarf5 location lists once support for
those is added
- it properly handles location lists with base address selection entries
- it fixes a bug where the location list code was using the first
DW_AT_ranges range as a "base address" of the compile unit (it should
have used DW_AT_low_pc instead. The effect of this was that the
computation of the start address of a variable in its scope was broken
for these kinds of compile units. This only manifested itself on
linked files, since in object files the first DW_AT_ranges range
normally starts at 0.
Since pretty much every kind of location list was broken in some way,
it's hard to verify that the new implementation is correct -- the output
will be different in all non-trivial cases, and mostly with good reason.
Most of the existing statistics tests continue to pass though, and a
visual inspection of the statistics for non-trivial inputs shows that
the data is more "reasonable" now. I have updated the "dwo statistics"
test to include the new numbers, as the previous ones were completely
bogus, and I have added a targeted test for the "base address" bug.
Reviewers: dblaikie, cmtice, vsk
Subscribers: aprantl, SouraVX, JDevlieghere, djtodoro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70444
The PE32Header struct is only used by COFFYAML, for intermediate
storage. The struct doesn't match the on-disk struct layout as it
uses native integers instead of e.g. support::ulittle32_t, so just
widen the fields to fit values for object::pe32plus_header, in
addition to object::pe32_header.
This avoids truncating the 64 bit ImageBase for 64 bit executables.
Differential Revision: https://reviews.llvm.org/D70464
Summary:
This is a follow-up to 590f279c456bbde632eca8ef89a85c478f15a249, which
moved some of the callers to use VFS.
It turned out more code in Driver calls into real filesystem APIs and
also needs an update.
Reviewers: gribozavr2, kadircet
Reviewed By: kadircet
Subscribers: ormris, mgorny, hiraditya, llvm-commits, jkorous, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70440
Summary:
Add a function attribute to allow the target specific default loop unroll threshold
to be specified on a per-function basis. This allows a front-end to give guidance
where it has insight that is not available to the back-end, while still allowing the
target specific heuristics to also have an effect.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68873
Summary:
Also, replace the SmallVector with a normal C array.
Reviewers: vsk
Reviewed By: vsk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70498
Darwin lazily saves the AVX512 context on first use [1]: instead of checking
that it already does to figure out if the OS supports AVX512, trust that
the kernel will do the right thing and always assume the context save
support is available.
[1] https://github.com/apple/darwin-xnu/blob/xnu-4903.221.2/osfmk/i386/fpu.c#L174
Reviewers: ab, RKSimon, craig.topper
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D70453
This error was originally added a while(7 years) ago when
including multiple files was basically always an error. Tablegen
now has preprocessor support, which allows for building nice
c/c++ style include guards. With the current error being
reported, we unfortunately need to double guard when including
files:
* In user of MyFile.td
#ifndef MYFILE_TD
include MyFile.td
#endif
* In MyFile.td
#ifndef MYFILE_TD
#define MYFILE_TD
...
#endif
Differential Revision: https://reviews.llvm.org/D70410
In a34680a33eb OrcError.h and Orc/RPC/*.h were split out from the rest of
ExecutionEngine in order to eliminate false dependencies for remote JIT
targets (see https://reviews.llvm.org/D68732), however this broke modules
builds (see https://reviews.llvm.org/D69817).
This patch splits these headers out into a separate module, LLVM_OrcSupport,
in order to fix the modules build.
Fixes <rdar://56377508>.
Moving accesses in MemorySSA at InsertionPlace::End, when an instruction is
moved into a block, almost always means insert at the end of the block, but
before the block terminator. This matters when the block terminator is a
MemoryAccess itself (an invoke), and the insertion must be done before
the terminator for the update to be correct.
Insert an additional position: InsertionPlace:BeforeTerminator and update
current usages where this applies.
Resolves PR44027.
DwarfExpression::addMachineReg() knows how to build a larger register
that isn't expressible in DWARF by combining multiple
subregisters. However, if the entire value fits into just one
subregister, it would still emit the other subregisters, leading to
all sorts of inconsistencies down the line.
This patch fixes that by moving an already existing(!) check whether
the subregister's offset is before the end of the value to the right
place.
rdar://problem/57294211
Differential Revision: https://reviews.llvm.org/D70508
Nothing breaks, so this probably has zero impact, but it seems nice,
since to_vector is more like makeArrayRef, and should really live in
SmallVector.h.
After speaking with Sanjay - seeing a number of miscompiles and working
on tracking down a testcase. None of the follow on patches seem to
have helped so far.
This reverts commit 8a0aa5310bccbb42d16d11db090419fcefdd1376.
One of current peephole optimiations is to remove SLL/SRL if
the sub register has been zero extended. This phase has two bugs
and one limitations.
First, for the physical subregister used in pseudo insn COPY
like below, it permits incorrect optimization.
%0:gpr32 = COPY $w0
...
%4:gpr = MOV_32_64 %0:gpr32
%5:gpr = SLL_ri %4:gpr(tied-def 0), 32
%6:gpr = SRA_ri %5:gpr(tied-def 0), 32
The $w0 could be from the return value of a previous function call
and its upper 32-bit value might contain some non-zero values.
The same applies to function arguments.
Second, the current code may permits removing SLL/SRA like below:
%0:gpr32 = COPY $w0
%1:gpr32 = COPY %0:gpr32
...
%4:gpr = MOV_32_64 %1:gpr32
%5:gpr = SLL_ri %4:gpr(tied-def 0), 32
%6:gpr = SRA_ri %5:gpr(tied-def 0), 32
The reason is that it did not follow def-use chain to skip all
intermediate 32bit-to-32bit COPY instructions.
The current implementation is also very conservative for PHI
instructions. If any PHI insn component is another PHI or COPY insn,
it will just permit SLL/SRA.
This patch fixed the issue as follows:
- During def/use chain traversal, if any physical register is read,
SLL/SRA will be preserved as these physical registers are mostly
from function return values or current function arguments.
- Recursively visit all COPY and PHI instructions.
After speaking with Sanjay - seeing a number of miscompiles and working
on tracking down a testcase. None of the follow on patches seem to
have helped so far.
This reverts commit 7ff57705ba196ce649d6034614b3b9df57e1f84f.
FileError doesn't need direct access to Error's internals as it can access the
payload via handleErrors.
(ErrorList remains special: it is not possible to write a handler for it, due
to the special auto-unpacking treatment that it receives from handleErrors.)
Summary:
681454dae4
Clone+exec death test allocates a single page of stack to run chdir + exec on.
This is not enough when gtest is built with ASan and run on particular
hardware.
With ASan on x86_64, ExecDeathTestChildMain has frame size of 1728 bytes.
Call to chdir() in ExecDeathTestChildMain ends up in
_dl_runtime_resolve_xsavec, which attempts to save register state on the stack;
according to cpuid(0xd) XSAVE register save area size is 2568 on my machine.
This results in something like this in all death tests:
Result: died but not with expected error.
...
[ DEATH ] AddressSanitizer:DEADLYSIGNAL
[ DEATH ] =================================================================
[ DEATH ] ==178637==ERROR: AddressSanitizer: stack-overflow on address ...
PiperOrigin-RevId: 278709790
Reviewers: pcc
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70332
The Custom handler doesn't do anything for these nodes anyway.
SelectionDAGISel won't mutate them if they are Legal or Custom.
X86 has custom code for mutating them due to missing isel patterns.
When the isel patterns are added Legal will be the right answer.
So go ahead a change it now since that's where we'll end up.
This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call. Why? Because it was untested, and was producing an ill defined output as the declaration's convention wasn't been changed leaving a mismatch which is UB.