This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).
Consider this example
%vla = alloca i32, i64 2
call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())
Instcombine will turn it into
%vla1 = alloca [2 x i32]
%vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())
If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
%vla1 = alloca [2 x i32]
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())
The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))
I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid to, unneccessarily, turn a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).
Reviewers: aprantl, vsk
Reviewed By: aprantl, vsk
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48837
llvm-svn: 336191
The signature of setRegToConstant changed in r336171, so adjust the AArch64
unit test in a similar way to how the X86 unit test was changed in that commit.
llvm-svn: 336188
The target does just enough to be able to run llvm-exegesis in latency mode for
at least some opcodes.
Differential Revision: https://reviews.llvm.org/D48780
llvm-svn: 336187
Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D47934
llvm-svn: 336185
unswitching loops.
Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:
1) We must invalidate any info SCEV has cached before unswitching as we
may change (or destroy) the loop structure by the act of unswitching,
and make it hard to recover everything we want to invalidate within
SCEV.
2) We need to invalidate all of the loops whose CFGs are mutated by the
unswitching. Notably, this isn't the *entire* loop nest, this is
every loop contained by the outermost loop reached by an exit block
relevant to the unswitch.
And we need to do this even when doing trivial unswitching.
I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.
However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.
Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.
Differential Revision: https://reviews.llvm.org/D47624
llvm-svn: 336183
It appears that the function pointer we use there isn't reliably 4-byte
aligned. I have no idea why or how we could correct this, so for now we
just regress the Windows performance some.
Someone with access to Windows could try working on a fix. At the very
least we could use a double indirection rather than a table, but maybe
there is some way to fully restore this optimization. I don't want to
play too much with this when I don't have access to the platform and
this at least should restore the last bots.
llvm-svn: 336178
of libstdc++, not just certain versions of GCC. The original macros
broke when using Clang + libstdc++4.9 sadly.
Sadly, testing for versions of libstdc++ has been extremely problematic
in the past, so I'm just narrowing this down to Windows and when using
libc++ as that seems at least very unlikely to keep build bots broken.
llvm-svn: 336174
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.
Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri
llvm-svn: 336172
Putting `sizeof(T) <= 16` into the parameter of a `std::conditional`
causes every version of MSVC I've tried to crash:
https://godbolt.org/g/eqVULL
Really frustrating, but an extra layer of indirection through an
instantiated type gives a working way to access this computed constant.
llvm-svn: 336170
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Author: NutshellySima
Subscribers: vsk, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336163
introducing llvm::trivially_{copy,move}_constructible type traits.
This uses a completely portable implementation of these traits provided
by Richard Smith. You can see it on compiler explorer in all its glory:
https://godbolt.org/g/QEDZjW
I have transcribed it, clang-formatted it, added some comments, and made
the tests fit into a unittest file.
I have also switched llvm::unique_function over to use these new, much
more portable traits. =D
Hopefully this will fix the build bot breakage from my prior commit.
llvm-svn: 336161
Summary:
When we import an alias (which will import a copy of the aliasee), but
aren't going to import the aliasee directly, the distributed backend
index will not contain the aliasee summary. Handle this in the summary
assembly printer by printing "null" as the aliasee.
Reviewers: davidxl, dexonsmith
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits
Differential Revision: https://reviews.llvm.org/D48699
llvm-svn: 336160
supporting move-only closures.
Most of the core optimizations for std::function are here plus
a potentially novel one that detects trivially movable and destroyable
functors and implements those with fewer indirections.
This is especially useful as we start trying to add concurrency
primitives as those often end up with move-only types (futures,
promises, etc) and wanting them to work through lambdas.
As further work, we could add better support for things like const-qualified
operator()s to support more algorithms, and r-value ref qualified operator()s
to model call-once. None of that is here though.
We can also provide our own llvm::function that has some of the optimizations
used in this class, but with copy semantics instead of move semantics.
This is motivated by increasing usage of things like executors and the task
queue where it is useful to embed move-only types like a std::promise within
a type erased function. That isn't possible without this version of a type
erased function.
Differential Revision: https://reviews.llvm.org/D48349
llvm-svn: 336156
The verifier identified several modules that were broken due to incorrect
linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction
has been updated to change decls to external linkage.
llvm-svn: 336150
Summary:
In the individual index files emitted for distributed ThinLTO backends,
the module path ids are not contiguous. Assign slots to module paths in
order to handle this better and also to get contiguous numbering in the
summary assembly.
Reviewers: davidxl, dexonsmith
Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu
Differential Revision: https://reviews.llvm.org/D48698
llvm-svn: 336148
Different CodeBlocks don't overlap. The same MCInst cannot appear in more than
one code block because all blocks are instantiated before the simulation is run.
We should always clear the content of map VariantDescriptors before every
simulation, since VariantDescriptors cannot possibly store useful information
for the next blocks. It is also "safer" to clear its content because `MCInst*`
is used as the key type for map VariantDescriptors.
llvm-svn: 336142
Summary:
Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change
SCEV is able to prove that the loop doesn't wrap-self (due to zext i16
to i64), disabling the entire loop versioning pass. Removed the zext and
just use i64.
Reviewers: sanjoy
Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D48409
llvm-svn: 336140
LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
Fixes PR38019.
llvm-svn: 336138
Add registers still missing after r328016 (D43353):
- for bits 15-8 of SI, DI, BP, SP (*H), and R8-R15 (*BH),
- for bits 31-16 of R8-R15 (*WH).
Thanks to Craig Topper for pointing it out.
llvm-svn: 336134
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.
%1 = extractelement <2 x i32> %a, i32 0
%2 = extractelement <2 x i32> %a, i32 1
%cond = icmp sgt i32 %1, %2
%3 = extractelement <2 x i32> %a, i32 0
%4 = extractelement <2 x i32> %a, i32 1
%select = select i1 %cond, i32 %3, i32 %4
Author: FarhanaAleen
Reviewed By: ABataev, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D47608
llvm-svn: 336130
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)
Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving
to where other passes could use that functionality.
Differential Revision: https://reviews.llvm.org/D48662
llvm-svn: 336128
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
Summary:
This adds a new -no-weak flag to nm to hide weak symbols in its output.
This also adds a -W alias for this which is analogous to -U.
Patch by Keith Smiley
Reviewers: kastiglione, enderby, compnerd
Reviewed By: kastiglione
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48751
llvm-svn: 336126
Use an -implicit-check-not to make sure an error which should not occur
in fact does not occur before the first CHECK line.
Suggested by Paul Robinson in post-commit feedback for r335897.
llvm-svn: 336123
Similarily, don't fold fp128 loads into SSE instructions if the load isn't aligned. Unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions.
Should fix PR38001
llvm-svn: 336121
We currently don't any-extend vararg parameters before storing them to the stack
locations on Darwin. However, SelectionDAG however does this, and so user code
is in the wild which inadvertently relies on this extension. This can manifest
in cases where the value stored is (int)0, but the actual parameter is interpreted
by va_arg as a pointer, and so not extending to 64 bits causes the callee to
load additional undefined bits.
llvm-svn: 336120
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Subscribers: vsk, mgorny, llvm-commits
Author: NutshellySima
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336114
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
llvm-svn: 336113