Optionally allow the order of restoring the callee-saved registers in the
epilogue to be reversed.
The flag -reverse-csr-restore-seq generates the following code:
```
stp x26, x25, [sp, #-64]!
stp x24, x23, [sp, #16]
stp x22, x21, [sp, #32]
stp x20, x19, [sp, #48]
; [..]
ldp x24, x23, [sp, #16]
ldp x22, x21, [sp, #32]
ldp x20, x19, [sp, #48]
ldp x26, x25, [sp], #64
ret
```
Note how the CSRs are restored in the same order as they are saved.
One exception to this rule is the last `ldp`, which allows us to merge
the stack adjustment and the ldp into a post-index ldp. This is done by
first generating:
ldp x26, x27, [sp]
add sp, sp, #64
which gets merged by the arm64 load store optimizer into
ldp x26, x25, [sp], #64
The flag is disabled by default.
llvm-svn: 327569
Layer implementations typically mutate module state, and this is better
reflected by having layers own the Module they are operating on.
llvm-svn: 327566
Summary:
We already emit relocations in this case when the "incremental linker
compatible" flag is set, but it turns out these relocations are also
required for /guard:cf. Now that we have two use cases for this
behavior, let's make it unconditional to try to keep things simple.
We never hit this problem in Clang because it always sets the
"incremental linker compatible" flag when targeting MSVC. However, LLD
LTO doesn't set this flag, so we'd get CFG failures at runtime when
using ThinLTO and /guard:cf. We probably don't want LLD LTO to set the
"incremental linker compatible" assembler flag, since this has nothing
to do with incremental linking, and we don't need to timestamp LTO
temporary objects.
Fixes PR36624.
Reviewers: inglorion, espindola, majnemer
Subscribers: mehdi_amini, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44485
llvm-svn: 327557
I removed this in r316797 because the coverage report showed no coverage and I thought it should have been handled by the auto generated table. I now see that there is code that bypasses the table if the shift amount is out of bounds.
This adds back the code. We'll codegen out of bounds i8 shifts to effectively (amount & 0x1f). The 0x1f is a strange quirk of x86 that shift amounts are always masked to 5-bits(except 64-bits). So if the masked value is still out bounds the result will be 0.
Fixes PR36731.
llvm-svn: 327540
I had to modify the bswap recognition to allow unshrunk masks to make this work.
Fixes PR36689.
Differential Revision: https://reviews.llvm.org/D44442
llvm-svn: 327530
swifterror llvm values model the swifterror register as memory at the
LLVM IR level. ISel will perform adhoc mem-to-reg on them. swifterror
values are constraint in how they can be used. Spilling them to memory
is not allowed.
SjLjEHPrepare tried to lower swifterror values to memory which is
unecessary since the back-end will spill and reload the register as
neccessary (as long as clobbering calls are marked as such which is the
case here) and further leads to invalid IR because swifterror values
can't be stored to memory.
rdar://38164004
llvm-svn: 327521
I don't know how to expose this in a test. There are ARM / AArch64
sched classes that include zero latency instructions, but I'm not
seeing sched info printed for those targets. X86 will almost
certainly have these soon (see PR36671), but no model has
'let Latency = 0' currently.
llvm-svn: 327518
Support G_LSHR/G_ASHR/G_SHL. We have 3 variance for
shift instructions : shift gpr, shift imm, shift 1.
Currently GlobalIsel TableGen generate patterns for
shift imm and shift 1, but with shiftCount i8.
In G_LSHR/G_ASHR/G_SHL like LLVM-IR both arguments
has the same type, so for now only shift i8 can use
auto generated TableGen patterns.
The support of G_SHL/G_ASHR enables tryCombineSExt
from LegalizationArtifactCombiner.h to hit, which
results in different legalization for the following tests:
LLVM :: CodeGen/X86/GlobalISel/ext-x86-64.ll
LLVM :: CodeGen/X86/GlobalISel/gep.ll
LLVM :: CodeGen/X86/GlobalISel/legalize-ext-x86-64.mir
-; X64-NEXT: movsbl %dil, %eax
+; X64-NEXT: movl $24, %ecx
+; X64-NEXT: # kill: def $cl killed $ecx
+; X64-NEXT: shll %cl, %edi
+; X64-NEXT: movl $24, %ecx
+; X64-NEXT: # kill: def $cl killed $ecx
+; X64-NEXT: sarl %cl, %edi
+; X64-NEXT: movl %edi, %eax
..which is not optimal and should be addressed later.
Rework of the patch by igorb
Reviewed By: igorb
Differential Revision: https://reviews.llvm.org/D44395
llvm-svn: 327499
This could end up inititialized if someone called the function with a
null AsmPrinter. Right now this only happens in DIEHash unit tests,
presumably because it was hard to create an AsmPrinter in the context of
unit tests. This only worked before r327486 because those tests did not
use any dwarf forms whose size actually depended on the dwarf version
(otherwise, they would have crashed due to null dereference).
I fix the uninitialized error, by explicitly initializing FormParams to
an invalid value, which will cause getFixedFormByteSize to return None
if called with a form with version-dependent size. A more principled
solution might be to fix the DIEHash tests to always pass in a valid
AsmPrinter.
llvm-svn: 327498
These previously all failed one way or another, but now we produce a more
helpful error message.
Change-Id: I8ffd2e87c8e35a5134c3be289e0a1fecaa2bb8ca
Differential revision: https://reviews.llvm.org/D44115
llvm-svn: 327497
Additionally, allow more than two operands to !con, !add, !and, !or
in the same way as is already allowed for !listconcat and !strconcat.
Change-Id: I9659411f554201b90cd8ed7c7e004d381a66fa93
Differential revision: https://reviews.llvm.org/D44112
llvm-svn: 327494
This makes using !dag more convenient in some cases.
Change-Id: I0a8c35e15ccd1ecec778fd1c8d64eee38d74517c
Differential revision: https://reviews.llvm.org/D44111
llvm-svn: 327493
This allows constructing DAG nodes with programmatically determined
names, and can simplify constructing DAG nodes in other cases as
well.
Also, add documentation and some very simple tests for the already
existing !con.
Change-Id: Ida61cd82e99752548d7109ce8da34d29da56a5f7
Differential revision: https://reviews.llvm.org/D44110
llvm-svn: 327492
Summary:
This patch replaces the two switches which are deducing the size of
various forms with a single implementation. I have put the new
implementation into BinaryFormat, to avoid introducing dependencies
between the two independent libraries (DebugInfo and CodeGen) that need
this functionality.
Reviewers: aprantl, JDevlieghere, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44418
llvm-svn: 327486
Summary:
The old bindings should have used an enum instead of a boolean. This
deprecates LLVMHasUnnamedAddr and LLVMSetUnnamedAddr , replacing them
with LLVMGetUnnamedAddress and LLVMSetUnnamedAddress respectively that do.
Though it is unlikely LLVM will gain more supported global value linker
hints, the new API can scale to accommodate this.
Reviewers: deadalnix, whitequark
Reviewed By: whitequark
Subscribers: llvm-commits, harlanhaskins
Differential Revision: https://reviews.llvm.org/D43448
llvm-svn: 327479
The Error locals need to be protected by a mutex. (This could be fixed by
having the promises / futures contain Expected and Error values, but
MSVC's future implementation does not support this yet).
Hopefully this will fix some of the errors seen on the builders due to
r327474.
llvm-svn: 327477
This can be used to extract the symbol table from a RuntimeDyld instance prior
to disposing of it.
This patch also updates RTDyldObjectLinkingLayer to use the new method, rather
than requesting symbols one at a time via getSymbol.
llvm-svn: 327476
The lookup function takes a list of VSOs, a set of symbol names (or just one
symbol name) and a materialization function object. It returns an
Expected<SymbolMap> (if given a set of names) or an Expected<JITEvaluatedSymbol>
(if given just one name). The lookup method constructs an
AsynchronousSymbolQuery for the given names, applies that query to each VSO in
the list in turn, and then blocks waiting for the query to complete. If
threading is enabled then the materialization function object can be used to
execute the materialization on different threads. If threading is disabled the
MaterializeOnCurrentThread utility must be used.
llvm-svn: 327474
We now only create recursive concats if we have more than two non-zero values. This keeps our subvector broadcast DAG combine functioning.
llvm-svn: 327457
This better able to detect undef and zeros pieces in the concat. Or cases when only one subvector is non-zero. This allows us to avoid silly things like double inserts into progressively larger undefs.
This still builds 512 bit concats of 128 bits by building up through 256 bits first. But I don't know if that's best.
We probably want to merge this with the vXi1 concat code since they are very similar.
llvm-svn: 327454
BUILD_VECTORs aren't themselves legalized until LegalizeDAG so we should still be able to create an "illegal" one before that. This helps combine with BUILD_VECTORS that are introduced during LegalizeVectorOps due to unrolling.
llvm-svn: 327446
Nothing prevents us from having both frame-setup and frame-destroy on
the same instruction.
When merging:
* frame-setup OPCODE1
* frame-destroy OPCODE2
into
* frame-setup frame-destroy OPCODE3
we want to be able to print and parse both flags.
llvm-svn: 327442
Summary:
It is possible for LVI to encounter instructions that are not in valid
SSA form and reference themselves. One example is the following:
%tmp4 = and i1 %tmp4, undef
Before this patch LVI would recurse until running out of stack memory
and crashed. This patch marks these self-referential instructions as
Overdefined and aborts analysis on the instruction.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33357
Reviewers: craig.topper, anna, efriedma, dberlin, sebpop, kuhar
Reviewed by: dberlin
Subscribers: uabelho, spatel, a.elovikov, fhahn, eli.friedman, mzolotukhin, spop, evandro, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D34135
llvm-svn: 327432
Make sure that DWARF line information generated by Windows can be properly read by Posix OS and vice versa.
Differential Revision: https://reviews.llvm.org/D44290
llvm-svn: 327430
Injected sources are basically a way to add actual source file content
to your PDB. Presumably you could use this for shipping your source code
with your debug information, but in practice I can only find this being
used for embedding natvis files inside of PDBs.
In order to effectively test LLVM's natvis file injection, we need a way
to dump the injected sources of a PDB in a way that is authoritative
(i.e. based on Microsoft's understanding of the PDB format, and not
LLVM's). To this end, I've added support for dumping injected sources
via DIA. I made a PDB file that used the /natvis option to generate a
test case.
Differential Revision: https://reviews.llvm.org/D44405
llvm-svn: 327428
Under some circumstances the divrems won't have been combined together before getting to this code.
So replace the assertion with a if() guard to not expand to X-((X/C)*C) to give the other combine chance to happen.
Reduced from OSS-Fuzz #6883https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6883
llvm-svn: 327424
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SelectionDAGBuilder to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 327421
The goal is to make the reciprocal throughput computation accessible through the
MCSchedModel interface. This is particularly important for llvm-mca because it
can only query the MCSchedModel interface.
No functional change intended.
Differential Revision: https://reviews.llvm.org/D44392
llvm-svn: 327420