Current implementation of emitPatchpoint() is very inefficient:
for every FrameIndex operand if creates new MachineInstr with
that operand expanded and all other copied as is.
Since PATCHPOINT/STATEPOINT instructions may have *a lot* of
FrameIndex operands, we end up creating and erasing many
machine instructions. But we can do it in single pass, with only
one new machine instruction generated.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D81181
Calls that are marked as @notoc do not require the extra nop after the call
for the TOC restore.
Differential Revision: https://reviews.llvm.org/D81081
Summary:
This patch adds legalisation of extensions where the operand
of the extend is a legal scalable type but the result is not.
EXTRACT_SUBVECTOR is used to split the result, before
being replaced by target-specific [S|U]UNPK[HI|LO] operations.
For example:
```
zext <vscale x 16 x i8> %a to <vscale x 16 x i16>
```
should emit:
```
uunpklo z2.h, z0.b
uunpkhi z1.h, z0.b
```
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: efriedma
Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79587
We can simplify
```
icmp <pred> phi(C1, C2, ...), C
```
with
```
phi(icmp(C1, C), icmp(C2, C), ...)
```
provided that all comparison of constants are constants themselves.
Differential Revision: https://reviews.llvm.org/D81151
Reviewed By: lebedev.ri
Summary:
Add regression tests of asmparser, mccodeemitter, and disassembler for
fixed-point operation instructions. In order to support them, we add
MImm parser to asmparser. Also add a new MPD instruction which is one
of multiply instructions.
Differential Revision: https://reviews.llvm.org/D81207
Use getMemoryOpCost from the generic implementation of getUserCost
and have getInstructionThroughput return the result of that for loads
and stores.
This also means that the X86 implementation of getUserCost can be
removed with the functionality folded into its getMemoryOpCost.
Differential Revision: https://reviews.llvm.org/D80984
This patch addresses the comment in [D80972](https://reviews.llvm.org/D80972#inline-744217).
Before this patch, the initial length field of .debug_aranges section should be declared as:
```
## 32-bit DWARF
debug_aranges:
- Length:
TotalLength: 0x20
Version: 2
...
## 64-bit DWARF
debug_aranges:
- Length:
TotalLength: 0xffffffff
TotalLength64: 0x20
Version: 2
...
```
After this patch:
```
## 32-bit DWARF
debug_aranges:
- [[Format: DWARF32]] ## Optional
Length: 0x20
Version: 2
...
## 64-bit DWARF
debug_aranges:
- Format: DWARF64
Length: 0x20
Version: 2
```
Current implementation of generating DWARF64 .debug_aranges section is buggy. A follow-up patch will improve it and add test cases for DWARF64.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D81063
It should not be necessary to use weak linkage for these. Doing so
implies interposablity and thus PIC generates indirections and
dynamic relocations, which are unnecessary and suboptimal. Aside
from this, ASan instrumentation never introduces GOT indirection
relocations where there were none before--only new absolute relocs
in RELRO sections for metadata, which are less problematic for
special linkage situations that take pains to avoid GOT generation.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D80605
Summary:
Cache the results from getMachineBasicBlocks in LexicalScopes to speed
up UserValueScopes::dominates queries. This replaces the caching done
in UserValueScopes. Compared to the old caching method, this reduces
memory traffic when a VarLoc is copied (e.g. when a VarLocMap grows),
and enables caching across basic blocks.
When compiling sqlite 3.5.7 (CTMark version), this patch reduces the
number of calls to getMachineBasicBlocks from 10,207 to 1,093. I also
measured a small compile-time reduction (~ 0.1% of total wall time, on
average, on my machine).
As a drive-by, I made the DebugLoc in UserValueScopes a const reference
to cut down on MetadataTracking traffic.
Reviewers: jmorse, Orlando, aprantl, nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80957
Now that we have an operand based form for the GC arguments to a statepoint intrinsic, update RS4GC to use it and update tests to reflect. This is pretty straight forward. I nearly landed without review, but figured a second set of eyes didn't hurt.
Differential Revision: https://reviews.llvm.org/D81121
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Differential Revision: https://reviews.llvm.org/D79835
- Change the reference to salvageDebugInfoOrUndef to salvageDebugInfo
(in accordance with https://reviews.llvm.org/D78369).
- Reorganize a few sections in preparation for an upcoming change that
attempts to specify rules for updating debug locations.
- Fix some intra-document links.
- Some spelling / wording fixes.
Global TableGen let override blocks are pretty dangerous and override
any local special cases. In this case, the broader HasFlatGlobalInsts
was overriding the more specific predicate for
FeatureAtomicFaddInsts. Make sure HasFlatGlobalInsts is implied by
FeatureAtomicFaddInsts, and make sure the right predicate is used.
One issue with independently setting the subtarget features on
incompatible targets is all of the encoding families do not define all
opcodes. This will hit an assert on gfx10 for example, since we set
the encoding independently based on the generation and not based on a
feature.
This logically belongs with 89d48ccabe6a950369b2bd922b1d8e987b856ac7,
but this order was needed to avoid regressions before adding
mayRaiseFPExceptions to relevant instructions.
This may be missing a few overrides to set it off still in some
special cases. Since the flags set during selection should now be
reliably preserved, this should not change codegen for non-strictfp
functions.
As mentioned in the post-commit comments of D81013 -
the mask check API has to assume the shuffle is
not length-changing, but we have not ruled that out
in this code. Use the ShuffleVectorInst call instead.
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D79835
There are a number of MIR tests using instructions on subtargets where
they don't really exist. These are some of the easy cases that don't
require splitting up test functions.
Summary:
Unlike normal traps, debug traps are allowed to return and can have
additional instructions in the same basic block. Without explicit
backend support for debug traps, they are lowered in ISel as normal
traps. Since normal traps are lowered in the WebAssembly backend to
the UNREACHABLE instruction, which is a terminator, using debug traps
could lead to invalid MBBs when there are additional instructions
after the trap. This patch fixes the issue by lowering debug traps to
a new version of the UNREACHABLE instruction, DEBUG_UNREACHABLE, that
is not a terminator.
An alternative approach would have been to make UNREACHABLE not a
terminator, but that breaks a large number of tests. In particular, it
would require removing the traps inserted after noreturn calls to
@llvm.wasm.throw because otherwise the terminator throw would be
followed by a non-terminator UNREACHABLE and we would be back to
having invalid MBBs. Overall the approach in this patch seems simpler.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81055
Just two paragraphs above it says:
"If the compiler does not support this [skipping code generation for a particular branch], it will fall back
to the "abort" implementation."
And that actually correctly describes llvm_unreachable implementation.
Differential Revision: https://reviews.llvm.org/D81130