It is possible for two modules to define the same set of external
symbols without causing a duplicate symbol error at link time,
as long as each of the symbols is a comdat member. So we cannot
use them as part of a unique id for the module.
Differential Revision: https://reviews.llvm.org/D38602
llvm-svn: 315026
Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D38477
llvm-svn: 315016
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.
Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).
Reviewers: aprantl, friss
Subscribers: llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D38504
llvm-svn: 315014
Ensure the program_headers call will fail correctly if the program
headers are larger than the underlying buffer.
Patch by Parker Thompson!
llvm-svn: 315012
Summary:
Xcode's dsymutil emits a __swift_ast DWARF section, which is required for debugging,
and which contains a byte-for-byte dump of the swiftmodule file.
Add this feature to llvm-dsymutil.
Tested with `gobjdump --dwarf=info -s`, by verifying that the contents of
`__DWARF.__swift_ast` match between Xcode's dsymutil and llvm-dsymutil
(Xcode's dwarfdump and llvm-dwarfdump don't currently recognize the
__swift_ast section).
Reviewers: aprantl, friss
Subscribers: llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D38504
llvm-svn: 315004
AbstractLatticeFunction and SparseSolver are class templates parameterized by a
lattice value, so we need to move these member functions over to the header.
Differential Revision: https://reviews.llvm.org/D38561
llvm-svn: 314996
There is data racing to the static variable RecordIndex in index profile reader
when merging in multiple threads. Make it a member variable in
IndexedInstrProfReader to fix this.
Differential Revision: https://reviews.llvm.org/D38431
llvm-svn: 314990
Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.
With the above done, the remaining isNoopCast() could then be simplified
a bit more.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D38497
llvm-svn: 314969
Summary:
If the pointer width is 32 bits and the calculated GEP offset is
negative, we call APInt::getLimitedValue(), which does a
*zero*-extension of the offset. That's wrong -- we should do an sext.
Fixes a bug introduced in rL314362 and found by Evgeny Astigeevich.
Reviewers: efriedma
Subscribers: sanjoy, javed.absar, llvm-commits, eastig
Differential Revision: https://reviews.llvm.org/D38557
llvm-svn: 314935
But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.
Original message:
Use sched_getaffinity instead of std:🧵:hardware_concurrency.
The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.
With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.
This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.
llvm-svn: 314931
This is a follow-up to https://reviews.llvm.org/D38138.
I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping
any options by using default param values.
llvm-svn: 314930
Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.
Differential Revision: https://reviews.llvm.org/D38204
llvm-svn: 314927
Recommitting r314517 with the fix for handling ConstantExpr.
Original commit message:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
mode in the target. However, since it doesn't check its actual users, it will
return FREE even in cases where the GEP cannot be folded away as a part of
actual addressing mode. For example, if an user of the GEP is a call
instruction taking the GEP as a parameter, then the GEP may not be folded in
isel.
llvm-svn: 314923
Summary:
This reverts D38481. The change breaks systems with older versions of glibc. It
injects a use of CPU_COUNT() from sched.h without checking to ensure that the
function exists first.
Reviewers:
Subscribers:
llvm-svn: 314922
It broke the Chromium / SQLite build; see PR34830.
> Summary:
> 1/ Operand folding during complex pattern matching for LEAs has been
> extended, such that it promotes Scale to accommodate similar operand
> appearing in the DAG.
> e.g.
> T1 = A + B
> T2 = T1 + 10
> T3 = T2 + A
> For above DAG rooted at T3, X86AddressMode will no look like
> Base = B , Index = A , Scale = 2 , Disp = 10
>
> 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
> so that if there is an opportunity then complex LEAs (having 3 operands)
> could be factored out.
> e.g.
> leal 1(%rax,%rcx,1), %rdx
> leal 1(%rax,%rcx,2), %rcx
> will be factored as following
> leal 1(%rax,%rcx,1), %rdx
> leal (%rdx,%rcx) , %edx
>
> 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
> thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014
llvm-svn: 314919
Summary:
This patch teaches `DT.applyUpdates` to take the fast when applying zero or just one update and makes it not run the internal batch updater machinery.
With this patch, it should no longer make sense to have a special check in user's code that checks the update sequence size before applying them, e.g.
```
if (!MyUpdates.empty())
DT.applyUpdates(MyUpdates);
```
or
```
if (MyUpdates.size() == 1)
if (...)
DT.insertEdge(...)
else
DT.deleteEdge(...)
```
Reviewers: dberlin, brzycki, davide, grosser, sanjoy
Reviewed By: dberlin, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38541
llvm-svn: 314917
Before the patch this was in Analysis. Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager. Jessica is planning to use this to
report function size after each pass. The same could be used for time
reports.
Tested with BUILD_SHARED_LIBS=On.
llvm-svn: 314909
Summary:
1/ Operand folding during complex pattern matching for LEAs has been
extended, such that it promotes Scale to accommodate similar operand
appearing in the DAG.
e.g.
T1 = A + B
T2 = T1 + 10
T3 = T2 + A
For above DAG rooted at T3, X86AddressMode will no look like
Base = B , Index = A , Scale = 2 , Disp = 10
2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
so that if there is an opportunity then complex LEAs (having 3 operands)
could be factored out.
e.g.
leal 1(%rax,%rcx,1), %rdx
leal 1(%rax,%rcx,2), %rcx
will be factored as following
leal 1(%rax,%rcx,1), %rdx
leal (%rdx,%rcx) , %edx
3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
thus avoiding creation of any complex LEAs within a loop.
Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
Reviewed By: lsaba
Subscribers: jmolloy, spatel, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35014
llvm-svn: 314886
I found that llvm-mc does not like non-english characters even in comments,
which it tries to tokenize.
Problem happens because of functions like isdigit(), isalnum() which takes
int argument and expects it is not negative.
But at the same time MCParser uses char* to store input buffer poiner, char has signed value,
so it is possible to pass negative value to one of functions from above and
that triggers an assert.
Testcase for demonstration is provided.
To fix the issue helper functions were introduced in StringExtras.h
Differential revision: https://reviews.llvm.org/D38461
llvm-svn: 314883
This patch makes DT::eraseNode mark DFSInfo as invalid.
Not marking it as invalid leads to DFS numbers getting corrupted
and failing VerifyDFSNumbers check.
This patch also makes children iterator const (NFC).
llvm-svn: 314847
The test fails on Linux; see follow-up email on the llvm-commits list.
> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409
This also reverts the follow-up r314818:
> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test
llvm-svn: 314825
All the buildbots are red, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/
> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
>
> Reviewed By: Ayal
>
> Subscribers: hans, mzolotukhin
>
> Differential Revision: https://reviews.llvm.org/D36130
llvm-svn: 314824
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.
X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.
Differential revision: https://reviews.llvm.org/D38480
llvm-svn: 314821
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.
Differential revision: https://reviews.llvm.org/D38409
llvm-svn: 314817
The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.
With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.
This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.
llvm-svn: 314809
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.
This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
Reviewed By: Ayal
Subscribers: hans, mzolotukhin
Differential Revision: https://reviews.llvm.org/D36130
llvm-svn: 314806
This adds a DiagnosticString member to the AsmOperand tablegen class, so
that the diagnostic text to be used when an assembly operand is
incorrect can be stored in the tablegen description of the operand,
rather than in a separate switch statement in the AsmParser.
If DiagnosticString is used for any operands, tablegen will emit a
getMatchKindDiag function, to map from diagnostic enums to strings.
Differential revision: https://reviews.llvm.org/D31606
llvm-svn: 314803
Summary:
This patch teaches the DominatorTree verifier to check DFS In/Out numbers which are used to answer dominance queries.
DFS number verification is done in O(nlogn), so it shouldn't add much overhead on top of the O(n^3) sibling property verification.
This check should detect errors like the one spotted in PR34466 and related bug reports.
The patch also cleans up the DFS calculation a bit, as all constructed trees should have a single root now.
I see 2 new test failures when running check-all after this change:
```
Failing Tests (2):
Polly :: Isl/CodeGen/OpenMP/reference-argument-from-non-affine-region.ll
Polly :: Isl/CodeGen/OpenMP/two-parallel-loops-reference-outer-indvar.ll
```
which seem to happen just after `Create LLVM-IR from SCoPs` -- I XFAILed them in r314800.
Reviewers: dberlin, grosser, davide, zhendongsu, bollu
Reviewed By: dberlin
Subscribers: nandini12396, bollu, Meinersbur, brzycki, llvm-commits
Differential Revision: https://reviews.llvm.org/D38331
llvm-svn: 314801
The current table-generated assembly instruction matcher returns a
64-bit error code when matching fails. Since multiple instruction
encodings with the same mnemonic can fail for different reasons, it uses
some heuristics to decide which message is important.
This heuristic does not work well for targets that have many encodings
with the same mnemonic but different operands, or which have different
versions of instructions controlled by subtarget features, as it is hard
to know which encoding the user was intending to use.
Instead of trying to improve the heuristic in the table-generated
matcher, this patch changes it to report a list of near-miss encodings.
This list contains an entry for each encoding with the correct mnemonic,
but with exactly one thing preventing it from being valid. This thing
could be a single invalid operand, a missing target feature or a failed
target-specific validation function.
The target-specific assembly parser can then report an error message
giving multiple options for instruction variants that the user may have
been trying to use. For example, I am working on a patch to use this for
ARM, which can give this error for an invalid instruction for ARMv6-M:
<stdin>:8:3: error: invalid instruction, multiple near-miss encodings found
adds r0, r1, #0x8
^
<stdin>:8:3: note: for one encoding: instruction requires: thumb2
adds r0, r1, #0x8
^
<stdin>:8:16: note: for one encoding: expected an integer in range [0, 7]
adds r0, r1, #0x8
^
<stdin>:8:16: note: for one encoding: expected a register in range [r0, r7]
adds r0, r1, #0x8
^
This also allows the target-specific assembly parser to apply its own
heuristics to suppress some errors. For example, the error "instruction
requires: arm-mode" is never going to be useful when targeting an
M-profile architecture (which does not have ARM mode).
This patch just adds the target-independent mechanism for doing this,
all targets still use the old mechanism. I've added a bit in the
AsmParser tablegen class to allow targets to switch to this new
mechanism. To use this, the target-specific assembly parser will have to
be modified for the change in signature of MatchInstructionImpl, and to
report errors based on the list of near-misses.
Differential revision: https://reviews.llvm.org/D27620
llvm-svn: 314774
Summary:
When checking if a constant expression is a noop cast we fetched the
IntPtrType by doing DL->getIntPtrType(V->getType())). However, there can
be cases where V doesn't return a pointer, and then getIntPtrType()
triggers an assertion.
Now we pass DataLayout to isNoopCast so the method itself can determine
what the IntPtrType is.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D37894
llvm-svn: 314763
Call ConstantFoldSelectInstruction() to fold cases like below
select <2 x i1><i1 true, i1 false>, <2 x i8> <i8 0, i8 1>, <2 x i8> <i8 2, i8 3>
All operands are constants and the condition has mixed true and false conditions.
Differential Revision: https://reviews.llvm.org/D38369
llvm-svn: 314741
Summary:
This avoids using void * as the type of the lattice value and ugly casts needed to make that happen.
(If folks want to use references, etc, they can use a reference_wrapper).
Reviewers: davide, mssimpso
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D38476
llvm-svn: 314734
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
LiveIntervals::shrinkToUses() outside of loop iterating over
instructions to avoid instructions being deleted while pointed to by
iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
llvm-svn: 314729
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.
This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.
Differential Revision: https://reviews.llvm.org/D38184
llvm-svn: 314699
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.
Reviewers: dblaikie, aprantl
Reviewed By: dblaikie, aprantl
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D37877
llvm-svn: 314694
Summary: In SamplePGO ThinLTO compile phase, we will not invoke ICP as it may introduce confusion to the 2nd annotation. This patch extracted that logic and makes it clearer before profile annotation. In the mean time, we need to make function importing process both inlined callsites as well as not promoted indirect callsites.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: sanjoy, mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D38094
llvm-svn: 314619
Summary:
Demanglers such as libiberty know how to strip suffixes of the form
\.[a-zA-Z]+\.\d+, but our current promoted value suffixes are
.llvm.${modulehash}, where the module hash is in hex. Change the
module hash to decimal to allow demanglers to handle this.
Reviewers: danielcdh
Subscribers: llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D38405
llvm-svn: 314527
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).
While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.
Differential revision: https://reviews.llvm.org/D38395
llvm-svn: 314523
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.
Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng
Reviewed By: hfinkel
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38085
llvm-svn: 314517
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.
If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).
This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899https://bugs.llvm.org/show_bug.cgi?id=34610
llvm-svn: 314516
Summary:
This commit adds comments on how the AMDPAL OS type overloads the
existing AMDGPU_ calling conventions used by Mesa, and adds a couple of
new ones.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D37752
llvm-svn: 314502
Summary:
This operating system type represents the AMDGPU PAL runtime, and will
be required by the AMDGPU backend in order to generate correct code for
this runtime.
Currently it generates the same code as not specifying an OS at all.
That will change in future commits.
Patch from Tim Corringham.
Subscribers: arsenm, nhaehnle
Differential Revision: https://reviews.llvm.org/D37380
llvm-svn: 314500
This patch introduces 3 helper functions: error(), warn() and note() to
make printing during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.
Differential revision: https://reviews.llvm.org/D38368
llvm-svn: 314498
Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized.
supersedes D33278, D35774
Differential Revision: https://reviews.llvm.org/D37412
llvm-svn: 314493
This patch implements the dwarfdump option --find=<name>. This option
looks for a DIE in the accelerator tables and dumps it if found. This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.
Differential Revision: https://reviews.llvm.org/D38282
llvm-svn: 314439
JumpThreading now preserves dominance and lazy value information across the
entire pass. The pass manager is also informed of this preservation with
the goal of DT and LVI being recalculated fewer times overall during
compilation.
This change prepares JumpThreading for enhanced opportunities; particularly
those across loop boundaries.
Patch by: Brian Rzycki <b.rzycki@samsung.com>,
Sebastian Pop <s.pop@samsung.com>
Differential revision: https://reviews.llvm.org/D37528
llvm-svn: 314435
The second argument for Allocator::Deallocate is the number of elements,
not the size of a single element. In asan mode specifying a large number
of elements poisoned random memory regions, leading to crashes
everywhere.
llvm-svn: 314413
Summary:
This allows sharing the lattice value code between LVI and SCCP (D36656).
It also adds a `satisfiesPredicate` function, used by D36656.
Reviewers: davide, sanjoy, efriedma
Reviewed By: sanjoy
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D37591
llvm-svn: 314411
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.
Reviewers: chandlerc
Subscribers: mehdi_amini, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D38201
llvm-svn: 314375
concept.
Add a unit-test to make sure we don't backslide, and tweak the MockBaseLayer
utility to make it easier to test this kind of thing in the future.
llvm-svn: 314374
Summary:
This avoids C++ UB if the GEP is weird and the calculation overflows
int64_t, and it's also observable in the cost model's results.
Such GEPs are almost surely not valid pointers, but LLVM nonetheless
generates them sometimes.
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38337
llvm-svn: 314362
This commit allows the outliner to avoid saving and restoring the link register
on AArch64 when it is dead within an entire class of candidates.
This introduces changes to the way the outliner interfaces with the target.
For example, the target now interfaces with the outliner using a
MachineOutlinerInfo struct rather than by using getOutliningCallOverhead and
getOutliningFrameOverhead.
This also improves several comments on the outliner's cost model.
https://reviews.llvm.org/D36721
llvm-svn: 314341
Summary:
According to https://gcc.gnu.org/wiki/SplitStacks, the linker expects a zero-sized .note.GNU-split-stack section if split-stack is used (and also .note.GNU-no-split-stack section if it also contains non-split-stack functions), so it can handle the cases where a split-stack function calls non-split-stack function.
This change adds the sections if needed.
Fixes PR #34670.
Reviewers: thanm, rnk, luqmana
Reviewed By: rnk
Subscribers: llvm-commits
Patch by Cherry Zhang <cherryyz@google.com>
Differential Revision: https://reviews.llvm.org/D38051
llvm-svn: 314335
Summary:
Found when testing stage-2 build with D38101.
```
In file included from /build/llvm/lib/Support/Path.cpp:1045:
/build/llvm/lib/Support/Unix/Path.inc:648:14: error: comparison 'uint64_t' (aka 'unsigned long') > 18446744073709551615 is always false [-Werror,-Wtautological-constant-compare]
if (length > std::numeric_limits<size_t>::max()) {
~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
`size_t` is `uint64_t` here, apparently, thus any `uint64_t` value
always fits into `size_t`.
Initial patch was to use some preprocessor logic to
not check if the size is known to fit at compile time.
But Zachary Turner suggested using this approach.
Reviewers: Bigcheese, rafael, zturner, mehdi_amini
Reviewed by (via email): zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38132
llvm-svn: 314312
This was intended to be no-functional-change, but it's not - there's a test diff.
So I thought I should stop here and post it as-is to see if this looks like what was expected
based on the discussion in PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603
Notes:
1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried
through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'.
The parameter isn't passed down, so we pick up the default value from the function signature
after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the
'SimplifyCFG' calls.
2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops.
This would theoretically allow us to differentiate the transforms controlled by those params
independently.
3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too.
I just stopped here to minimize the diffs.
4. Similarly, I stopped short of messing with the pass manager layer. I have another question that
could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG
set to true no matter where in the pipeline it's creating SimplifyCFG passes?
// Create an early function pass manager to cleanup the output of the
// frontend.
EarlyFPM.addPass(SimplifyCFGPass());
-->
/// \brief Construct a pass with the default thresholds
/// and switch optimizations.
SimplifyCFGPass::SimplifyCFGPass()
: BonusInstThreshold(UserBonusInstThreshold),
LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form
If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG'
setting via recursion was masking this bug.
Differential Revision: https://reviews.llvm.org/D38138
llvm-svn: 314308
Summary:
A new FDR metadata record will support logging a function call argument;
appending multiple metadata records will represent a sequence of arguments
meaning that "holes" are not representable by the buffer format. Each
call argument is currently a 64-bit value (useful for "this" pointers and
synchronization objects).
If present, we put this argument to the function call "entry" record it
belongs to, and alter its type to notify the user of its presence.
Reviewers: dberris
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32840
llvm-svn: 314269
It is useful for the symbol to contain the index of the
function of global it represents in the function/global
index space.
For imports we also store the import index so that the
linker can find, for example, the signature of the
corresponding function, which is defined by the import
In the long run we need to decide whether this API
surface should be closer to binary (where imported
functions are seperate) or the wasm spec (where the
function index space is unified).
Differential Revision: https://reviews.llvm.org/D38189
llvm-svn: 314230
When dsymutil generates the companion file, its strips all unnecessary
sections by omitting their body and setting the offset in their
corresponding load command to zero.
One such section is the .eh_frame section, as it contains runtime
information rather than debug information and is part of the __TEXT
segment. When reading this section, we would just read the number of
bytes specified in the load command, starting from offset 0 (i.e. the
beginning of the file).
Rather than trying to parse this obviously invalid section, dwarfdump
now skips this.
Differential revision: https://reviews.llvm.org/D38135
llvm-svn: 314208
Removing X86 broadcast(f/i)32x2 intrinsics from llvm.
Adding autoUpgrade support.
Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll.
Differential Revision: https://reviews.llvm.org/D38220
llvm-svn: 314195
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better be safe and stop guessing and optimizing if the frontend didn't
communicate the size.
Differential Revision: https://reviews.llvm.org/D38106
llvm-svn: 314185
Summary:
Sanitizer blacklist entries currently apply to all sanitizers--there
is no way to specify that an entry should only apply to a specific
sanitizer. This is important for Control Flow Integrity since there are
several different CFI modes that can be enabled at once. For maximum
security, CFI blacklist entries should be scoped to only the specific
CFI mode(s) that entry applies to.
Adding section headers to SpecialCaseLists allows users to specify more
information about list entries, like sanitizer names or other metadata,
like so:
[section1]
fun:*fun1*
[section2|section3]
fun:*fun23*
The section headers are regular expressions. For backwards compatbility,
blacklist entries entered before a section header are put into the '[*]'
section so that blacklists without sections retain the same behavior.
SpecialCaseList has been modified to also accept a section name when
matching against the blacklist. It has also been modified so the
follow-up change to clang can define a derived class that allows
matching sections by SectionMask instead of by string.
Reviewers: pcc, kcc, eugenis, vsk
Reviewed By: eugenis, vsk
Subscribers: vitalybuka, llvm-commits
Differential Revision: https://reviews.llvm.org/D37924
llvm-svn: 314170