When GVN sets the incoming value for a phi to undef because the incoming block
is unreachable it needs to also invalidate the cached info for that phi in
MemoryDependenceAnalysis, otherwise later queries will return stale information.
Differential Revision: https://reviews.llvm.org/D51099
llvm-svn: 340529
Both DWARFDebugLine and DWARFDebugAddr used the same callback mechanism
for handling recoverable errors, and each implemented a similar warn() function
to serve as that callback.
In this revision we get rid of code duplication and move this warn() function
to DWARFContext as DWARFContext::dumpWarning().
Reviewers: lhames, jhenderson, aprantl, probinson, dblaikie, JDevlieghere
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D51033
llvm-svn: 340528
This version of the patch fixes cleaning up ssa_copy intrinsics, so it does not
crash for instructions in blocks that have been marked unreachable.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.
As a follow-up, we should be able to extend it to also propagate additional
facts about nonnull.
Reviewers: davide, mssimpso, dberlin, efriedma
Reviewed By: davide, dberlin
Differential Revision: https://reviews.llvm.org/D45330
llvm-svn: 340525
For the _WIN32 macro, it is the definedness that matters rather than
the value. Most uses of the macro already rely on the definedness.
This commit fixes the few remaining uses that relied on the value.
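To illustrate the distinction, here is a minimal sketch, not code from this
commit. DEMO_FLAG is a hypothetical stand-in for _WIN32: a macro that is
defined but expands to 0, so a definedness check and a value check disagree.

```cpp
// DEMO_FLAG is a hypothetical stand-in for _WIN32: defined, but with value 0.
#define DEMO_FLAG 0

// Tests definedness, which is the property that matters for _WIN32.
inline bool flagIsDefined() {
#ifdef DEMO_FLAG
  return true;
#else
  return false;
#endif
}

// Tests the value; disagrees with the check above when the macro expands to 0.
inline bool flagIsNonZero() {
#if DEMO_FLAG
  return true;
#else
  return false;
#endif
}
```

Here flagIsDefined() returns true while flagIsNonZero() returns false, which
is why uses that relied on the value were fragile.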
Differential Revision: https://reviews.llvm.org/D51105
llvm-svn: 340520
Most users won't have to worry about this as all of the
'getOrInsertFunction' functions on Module will default to the program
address space.
An overload has been added to Function::Create to abstract away the
details for most callers.
This is based on https://reviews.llvm.org/D37054 but without the changes to
make passing a Module to Function::Create() mandatory. I have also added
some more tests and fixed the LLParser to accept call instructions for
types in the program address space.
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D47541
llvm-svn: 340519
subtarget features for indirect calls and indirect branches.
This is in preparation for enabling *only* the call retpolines when
using speculative load hardening.
I've continued to use subtarget features for now as they continue to
seem the best fit given the lack of other retpoline like constructs so
far.
The LLVM side is pretty simple. I'd like to eventually get rid of the
old feature, but not sure what backwards compatibility issues that will
cause.
This does remove the "implies" from requesting an external thunk. This
always seemed somewhat questionable and is now clearly not desirable --
you specify a thunk the same way no matter which set of things are
getting retpolines.
I really want to keep this nicely isolated from end users and just an
LLVM implementation detail, so I've moved the `-mretpoline` flag in
Clang to no longer rely on a specific subtarget feature by that name and
instead to be directly handled. In some ways this is simpler, but in
order to preserve existing behavior I've had to add some fallback code
so that users who relied on merely passing -mretpoline-external-thunk
continue to get the same behavior. We should eventually remove this
I suspect (we have never tested that it works!) but I've not done that
in this patch.
Differential Revision: https://reviews.llvm.org/D51150
llvm-svn: 340515
Aligning section contents is not required, but only
recommended, by the specification. Microsoft's documentation says
(https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format#section-table-section-headers):
"For object files, the value should be aligned on a 4-byte boundary
for best performance."
However, according to my measurements, aligning section contents has
a neutral to negative effect on performance.
I measured the median run time of 100 links of Chromium's
base_unittests on Linux with lld-link and on Windows with link.exe with
both aligned and unaligned sections. On Linux I didn't see a measurable
performance difference, and on Windows the link was slightly faster
with unaligned sections (presumably because on Windows the bottleneck
is I/O).
Also, the sections created by cl.exe are unaligned, so we should expect
tools to broadly accept unaligned sections.
Differential Revision: https://reviews.llvm.org/D51149
llvm-svn: 340514
This patch's test case relies on debug prints, which isn't generally an
OK way to test stuff in LLVM, and it fails whenever asserts aren't enabled.
I've sent a heads-up to the commit and detailed comments on the review.
llvm-svn: 340513
When complaining that the triple is incompatible with all targets, print out the triple not just a generic error about triples not matching.
llvm-svn: 340509
lib/CodeGen/LiveDebugVariables.cpp uses std::prev(MBBI) to get a
DebugValue's SlotIndex. However, the previous instruction may also be a
debug instruction, and a debug instruction cannot be used to query
SlotIndex in mi2iMap.
Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().
Differential Revision: https://reviews.llvm.org/D50621
llvm-svn: 340508
Summary:
Reorganize WebAssemblyInstrSIMD.td to put all of the instruction
definitions together, making it easier to see which instructions have
been implemented already. Depends on D51143.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51113
llvm-svn: 340504
Summary:
WebAssemblyInstrFormats.td retains only multiclasses that are used in
multiple other tablegen files.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D51143
llvm-svn: 340503
The format is the same as in ELF: a sequence of ULEB128-encoded
symbol indexes.
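For readers unfamiliar with the encoding, the following is a rough sketch of
ULEB128 applied to a list of indexes. The function names are hypothetical and
this is not the writer code from the patch; it only shows the byte layout.

```cpp
#include <cstdint>
#include <vector>

// Appends the ULEB128 encoding of Value to Out: 7 payload bits per byte,
// least-significant group first, high bit set on every byte except the last.
void appendULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(Byte);
  } while (Value != 0);
}

// Encodes a list of symbol indexes back to back (hypothetical helper).
std::vector<uint8_t> encodeSymbolIndexes(const std::vector<uint64_t> &Indexes) {
  std::vector<uint8_t> Out;
  for (uint64_t Index : Indexes)
    appendULEB128(Index, Out);
  return Out;
}
```

Indexes below 128 take a single byte each, so a table of small symbol indexes
stays compact.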
Differential Revision: https://reviews.llvm.org/D51047
llvm-svn: 340499
If we have a min/max pair we can do a better job of counting sign bits if we look at them together. This is similar to what is done in the SelectionDAG version of computeNumSignBits for ISD::SMAX/SMIN.
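One way to read "look at them together", offered as an illustration with hypothetical helper names rather than the patch's actual code: when the min/max pair clamps a value to a constant range [Lo, Hi] with Lo <= Hi, the result has at least as many sign bits as the smaller of the two bounds' counts. The sketch assumes 32-bit values.

```cpp
#include <algorithm>
#include <cstdint>

// Counts how many of the top bits of V are copies of the sign bit
// (including the sign bit itself), for a 32-bit value.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1;
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// Lower bound on the sign bits of any value clamped to [Lo, Hi], Lo <= Hi.
// Hypothetical helper for illustration only.
unsigned numSignBitsOfClamp(int32_t Lo, int32_t Hi) {
  return std::min(numSignBits(Lo), numSignBits(Hi));
}
```

For a clamp to [-128, 127] this gives 25 sign bits regardless of what is known about the unclamped input.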
Differential Revision: https://reviews.llvm.org/D51112
llvm-svn: 340480
Previously we assumed a vector reduction add is part of a loop and one of the inputs is a phi. But the code in SelectionDAGBuilder that sets the vector reduction flag handles more cases than that. It just requires that the use chain ends in a horizontal reduction and that there are no other uses. This means it can handle unrolled reduction loops.
If the initial value of the reduction was 0, an unrolled loop would begin with a vector reduction add that has two sad inputs. Previously we would only transform one side of the add, but for this case we need to transform both sides.
I've created a lambda to reuse some of the code for both sides, and fixed the variable names to remove references to "phi".
Differential Revision: https://reviews.llvm.org/D50817
llvm-svn: 340478
Summary:
This CL adds support for arbitrary BUILD_VECTORS, i.e. not splats and
not consts. This is the last feature needed to properly lower v2i64
multiplies without an i64x2.mul instruction (which is not in the spec),
so i64x2.mul is removed as well.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51082
Remove unnecessary condition and fix whitespace
llvm-svn: 340472
This solves the motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38527
If we are legalizing an FP vector op that maps to one of the LLVM intrinsics that mimic libm calls,
but we're going to end up with scalar libcalls for that vector type anyway, then we should unroll
the vector op into scalars before widening. This avoids the extra libcalls that widening would
otherwise produce, because widening loses the knowledge that some of the scalar elements are undef.
Differential Revision: https://reviews.llvm.org/D50791
llvm-svn: 340469
We're currently getting this behavior implicitly, since we determine if
a Def's optimization is valid based on the ID of its defining access.
This is incorrect, though I wouldn't be surprised if this was masked in
part by the fact that we're using a WeakVH to track what Defs are optimized to.
(Not to mention that we don't move Defs super often, AFAICT). I'll
submit a patch to fix this shortly.
This also includes a minor refactor to reduce duplication a bit.
No test is included since, as noted above, this already happens to be our
behavior. I'll add a test for this with my fix to the other bug
mentioned above.
llvm-svn: 340461
The inline sequence is very long (about 70 bytes on Thumb1), so it's
not really a good idea to inline it, especially when optimizing for
size.
Differential Revision: https://reviews.llvm.org/D47917
llvm-svn: 340458
Add support for reading and writing MessagePack, a binary object serialization
format which aims to be more compact than text formats like JSON or YAML.
The specification can be found at
https://github.com/msgpack/msgpack/blob/master/spec.md
Will be used for encoding metadata in AMDGPU code objects.
Differential Revision: https://reviews.llvm.org/D44429
llvm-svn: 340457
Instead of asserting that the function doesn't have any unreachable
code, just ignore it for the purpose of computing liveness.
Differential Revision: https://reviews.llvm.org/D51070
llvm-svn: 340456
Fix bug https://bugs.llvm.org/show_bug.cgi?id=38643
In BPFAsmBackend applyFixup(), there is an assertion for FixedValue to be 0.
This may not be true, especially at optimization level 0.
For example, in the above bug, for the following two
static variables:
  @bpf_map_lookup_elem = internal global i8* (i8*, i8*)*
      inttoptr (i64 1 to i8* (i8*, i8*)*), align 8
  @bpf_map_update_elem = internal global i32 (i8*, i8*, i8*, i64)*
      inttoptr (i64 2 to i32 (i8*, i8*, i8*, i64)*), align 8
The static variable @bpf_map_update_elem will have a symbol
offset of 8 and a FK_SecRel_8 with FixupValue 8 will cause
the assertion if llvm is built with -DLLVM_ENABLE_ASSERTIONS=ON.
The above relocations will not exist if the program is compiled
with optimization level -O1 and above as the compiler optimizes
those static variables away. In the below error message, -O2
is suggested as this is the common practice.
Note that FixedValue = 0 in applyFixup() does exist and is valid,
e.g., for the global variable my_map in the above bug. The bpf
loader will process them properly for map_id's before loading
the program into the kernel.
The static variables, which are not optimized away by compiler,
may have FK_SecRel_8 relocation with non-zero FixedValue.
The patch removed the offending assertion and will issue
a hard error as below if the FixedValue in applyFixup()
is not 0.
  $ llc -march=bpf -filetype=obj fixup.ll
  LLVM ERROR: Unsupported relocation: try to compile with -O2 or above,
  or check your static variable usage
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 340455
Summary:
When we don't actually have stack-allocated variables but need SP only
to support EH, we don't need to write SP back in the epilog, because we
don't bump down the stack pointer.
Reviewers: dschuff
Subscribers: jgravelle-google, sbc100, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51114
llvm-svn: 340454
On Windows, movw+movt pairs with relocations are handled with a single
relocation that covers them both. Therefore we can't inject anything
between these instructions; otherwise the relocation (which LLVM treats
only as the movw instruction's relocation, dropping the movt
instruction's relocation) will end up bogus.
These instructions are bundled up until right before the constant
islands pass, making this effectively the only place that can split
them apart.
Differential Revision: https://reviews.llvm.org/D51032
llvm-svn: 340451
This avoids a potential infinite loop setting and unsetting bits in the
mask.
Reduced from a failure on the polly-aosp bot.
Differential Revision: https://reviews.llvm.org/D51066
llvm-svn: 340446
Summary:
Add MemorySSA as a dependency to LoopSimplifyCFG and preserve it.
Disabled by default until all passes preserve MemorySSA.
Reviewers: bogner, chandlerc
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D50911
llvm-svn: 340445
Summary:
Add MemorySSA as a dependency to LoopInstSimplify and preserve it.
Disabled by default until all passes preserve MemorySSA.
Reviewers: chandlerc
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D50906
llvm-svn: 340444
There's no need to track a separate variable for argmemonly aliasing. This falls out naturally from the modinfo union. Note that we may now return earlier than we did before if all arguments are explicitly readnone. The overall result doesn't change, just how we get there.
llvm-svn: 340443
Inspired by what AArch64 does for shifts, this patch attempts to replace shift amounts with neg if we can.
This is done directly as part of isel so it's as late as possible, to avoid breaking some BZHI patterns, since those patterns need an unmasked (32-n) to be correct.
To avoid manual load folding and custom instruction selection for the negate, I've inserted new nodes in the DAG above the shift node in topological order.
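The arithmetic behind the replacement can be sketched as follows. This is only a model, with hypothetical function names, of a 32-bit shift whose hardware count is masked to the low 5 bits; it is not the isel code. Under that masking, (32 - n) and -n select the same shift amount.

```cpp
#include <cstdint>

// Models a 32-bit shift where only the low 5 bits of the count are used.
uint32_t shl32(uint32_t X, uint32_t Count) { return X << (Count & 31); }

// Shift by (32 - N), as the source pattern writes it.
uint32_t shlBy32MinusN(uint32_t X, uint32_t N) { return shl32(X, 32 - N); }

// Shift by -N: a single negate replaces the subtract-from-constant.
// Unsigned wraparound makes (0 - N) and (32 - N) agree modulo 32.
uint32_t shlByNegN(uint32_t X, uint32_t N) { return shl32(X, 0u - N); }
```

The equivalence depends on the count being masked, which is why patterns that need the unmasked (32-n), such as BZHI, must be left alone.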
Differential Revision: https://reviews.llvm.org/D48789
llvm-svn: 340441
Summary:
There are several functions of the form `has***` or `needs***` in
`WebAssemblyFrameLowering` whose `MachineFrameInfo` argument can be
obtained from `MachineFunction`, so it does not necessarily have to be
passed in by the caller. Also, this is more in line with other overridden
functions like `hasBP` or `hasReservedCallFrame`, which also take only
a `MachineFunction` argument.
Reviewers: dschuff
Subscribers: sbc100, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51116
llvm-svn: 340438
When the key is not already in the map, the access operator[] creates an empty value and grows the map.
Resizing a map is very slow, so this needs to be avoided.
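The difference can be shown with a small sketch. It uses std::map as a stand-in, since this message does not name the container involved, and the helper names are hypothetical.

```cpp
#include <map>
#include <string>

// Lookup via operator[]: inserts a default-constructed value for a missing
// key, so what looks like a pure query grows the map.
int lookupWithSubscript(std::map<std::string, int> &M, const std::string &Key) {
  return M[Key];
}

// Lookup via find(): leaves the map untouched when the key is absent.
int lookupWithFind(const std::map<std::string, int> &M,
                   const std::string &Key) {
  auto It = M.find(Key);
  return It == M.end() ? 0 : It->second;
}
```

Both helpers return 0 for a missing key, but only the find() version leaves the map's size unchanged.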
Found with csmith + asserts.
May help with
https://bugs.llvm.org/show_bug.cgi?id=25843
Patch by Tom Rix.
Differential Revision: https://reviews.llvm.org/D50780
llvm-svn: 340434
Summary:
The `catch` instruction certainly has rather huge side effects and the flag
was missing. At the moment this does not change any unit tests we
currently have.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D50919
llvm-svn: 340433
We're calling these functions quite a bit from outside of MemorySSA.cpp
now. Given that they're relatively simple one-liners, I think the style
preference is to have them inline.
llvm-svn: 340430
wasm-lld expects relocation entries to be sorted by offset. In most
cases llvm produces them in order, but the CODE section (which combines
many MCSections) is an exception because we order the functions in
Symbol order, not in section order. What is more, it's not clear whether
`recordRelocation` is guaranteed to be called in offset order, so this
sort is most likely needed in the general case too.
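A minimal sketch of such a sort is shown below. The `Reloc` struct is a
hypothetical, simplified record, and the use of a stable sort is an assumption
made here to keep equal-offset entries in their original order; the patch's
actual code may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified relocation record; a real entry has more fields.
struct Reloc {
  uint64_t Offset;
  uint32_t SymbolIndex;
};

// Sorts by offset; stable_sort keeps the original order of equal offsets.
void sortByOffset(std::vector<Reloc> &Relocs) {
  std::stable_sort(Relocs.begin(), Relocs.end(),
                   [](const Reloc &A, const Reloc &B) {
                     return A.Offset < B.Offset;
                   });
}
```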
Differential Revision: https://reviews.llvm.org/D51065
llvm-svn: 340423
32-bit constant address space is declared as 6, so the
maximum number of address spaces is 6, not 5.
Fixes "LLVM ERROR: Pointer address space out of range".
v5: rename MAX_COMMON_ADDRESS to MAX_AMDGPU_ADDRESS
v4: - fix compilation issues
- fix out of bounds access
v3: use static_assert()
v2: add a very simple test for 32-bit addr space
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106630
llvm-svn: 340417
Constant and global may alias, also one rules table wasn't
ordered correctly.
Pinpointed by Matt.
v2: add a test with swapped parameters
llvm-svn: 340416
Add intrinsic isel patterns for sxtb16, sxtab16, uxtb16 and uxtab16
so that they can perform a ror.
Differential Revision: https://reviews.llvm.org/D51034
llvm-svn: 340405
This adds the plumbing for the Tiny code model for the AArch64 backend. Instead
of loading addresses through the normal ADRP;ADD pair used in the Small model,
the Tiny model uses a single ADR. The 21 bit range of an ADR means that the code and
its statically defined symbols need to be within 1MB of each other.
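The range constraint can be sketched as a simple check. `fitsInAdr` is a
hypothetical helper, not an LLVM API; it assumes ADR's immediate is a signed
21-bit byte offset relative to the instruction's address.

```cpp
#include <cstdint>

// A signed 21-bit byte offset covers [-2^20, 2^20 - 1], about +/-1MB.
const int64_t AdrMinOffset = -(int64_t(1) << 20);
const int64_t AdrMaxOffset = (int64_t(1) << 20) - 1;

// Returns true if a symbol at SymAddr is reachable from an ADR at PC.
// (Hypothetical helper for illustration.)
bool fitsInAdr(uint64_t PC, uint64_t SymAddr) {
  int64_t Delta = static_cast<int64_t>(SymAddr - PC);
  return Delta >= AdrMinOffset && Delta <= AdrMaxOffset;
}
```

Any symbol for which this check fails would need the longer ADRP;ADD sequence
of the Small model.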
This makes it mostly interesting for embedded applications where we want to fit
as much as we can in as small a space as possible.
Differential Revision: https://reviews.llvm.org/D49673
llvm-svn: 340397