Putting `sizeof(T) <= 16` into the parameter of a `std::conditional`
causes every version of MSVC I've tried to crash:
https://godbolt.org/g/eqVULL
Really frustrating, but an extra layer of indirection through an
instantiated type gives a working way to access this computed constant.
llvm-svn: 336170
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Author: NutshellySima
Subscribers: vsk, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336163
introducing llvm::trivially_{copy,move}_constructible type traits.
This uses a completely portable implementation of these traits provided
by Richard Smith. You can see it on compiler explorer in all its glory:
https://godbolt.org/g/QEDZjW
I have transcribed it, clang-formatted it, added some comments, and made
the tests fit into a unittest file.
I have also switched llvm::unique_function over to use these new, much
more portable traits. =D
Hopefully this will fix the build bot breakage from my prior commit.
llvm-svn: 336161
Summary:
When we import an alias (which will import a copy of the aliasee), but
aren't going to import the aliasee directly, the distributed backend
index will not contain the aliasee summary. Handle this in the summary
assembly printer by printing "null" as the aliasee.
Reviewers: davidxl, dexonsmith
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits
Differential Revision: https://reviews.llvm.org/D48699
llvm-svn: 336160
supporting move-only closures.
Most of the core optimizations for std::function are here plus
a potentially novel one that detects trivially movable and destroyable
functors and implements those with fewer indirections.
This is especially useful as we start trying to add concurrency
primitives as those often end up with move-only types (futures,
promises, etc) and wanting them to work through lambdas.
As further work, we could add better support for things like const-qualified
operator()s to support more algorithms, and r-value ref qualified operator()s
to model call-once. None of that is here though.
We can also provide our own llvm::function that has some of the optimizations
used in this class, but with copy semantics instead of move semantics.
This is motivated by increasing usage of things like executors and the task
queue where it is useful to embed move-only types like a std::promise within
a type erased function. That isn't possible without this version of a type
erased function.
Differential Revision: https://reviews.llvm.org/D48349
llvm-svn: 336156
The verifier identified several modules that were broken due to incorrect
linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction
has been updated to change decls to external linkage.
llvm-svn: 336150
Summary:
In the individual index files emitted for distributed ThinLTO backends,
the module path ids are not contiguous. Assign slots to module paths in
order to handle this better and also to get contiguous numbering in the
summary assembly.
Reviewers: davidxl, dexonsmith
Subscribers: mehdi_amini, inglorion, eraman, llvm-commits, steven_wu
Differential Revision: https://reviews.llvm.org/D48698
llvm-svn: 336148
Different CodeBlocks don't overlap. The same MCInst cannot appear in more than
one code block because all blocks are instantiated before the simulation is run.
We should always clear the content of map VariantDescriptors before every
simulation, since VariantDescriptors cannot possibly store useful information
for the next blocks. It is also "safer" to clear its content because `MCInst*`
is used as the key type for map VariantDescriptors.
llvm-svn: 336142
Summary:
Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change
SCEV is able to prove that the loop doesn't wrap-self (due to zext i16
to i64), disabling the entire loop versioning pass. Removed the zext and
just use i64.
Reviewers: sanjoy
Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D48409
llvm-svn: 336140
LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
Fixes PR38019.
llvm-svn: 336138
Add registers still missing after r328016 (D43353):
- for bits 15-8 of SI, DI, BP, SP (*H), and R8-R15 (*BH),
- for bits 31-16 of R8-R15 (*WH).
Thanks to Craig Topper for pointing it out.
llvm-svn: 336134
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.
%1 = extractelement <2 x i32> %a, i32 0
%2 = extractelement <2 x i32> %a, i32 1
%cond = icmp sgt i32 %1, %2
%3 = extractelement <2 x i32> %a, i32 0
%4 = extractelement <2 x i32> %a, i32 1
%select = select i1 %cond, i32 %3, i32 %4
Author: FarhanaAleen
Reviewed By: ABataev, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D47608
llvm-svn: 336130
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)
Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving
to where other passes could use that functionality.
Differential Revision: https://reviews.llvm.org/D48662
llvm-svn: 336128
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
Summary:
This adds a new -no-weak flag to nm to hide weak symbols in its output.
This also adds a -W alias for this which is analogous to -U.
Patch by Keith Smiley
Reviewers: kastiglione, enderby, compnerd
Reviewed By: kastiglione
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48751
llvm-svn: 336126
Use an -implicit-check-not to make sure an error which should not occur
in fact does not occur before the first CHECK line.
Suggested by Paul Robinson in post-commit feedback for r335897.
llvm-svn: 336123
Similarily, don't fold fp128 loads into SSE instructions if the load isn't aligned. Unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions.
Should fix PR38001
llvm-svn: 336121
We currently don't any-extend vararg parameters before storing them to the stack
locations on Darwin. However, SelectionDAG however does this, and so user code
is in the wild which inadvertently relies on this extension. This can manifest
in cases where the value stored is (int)0, but the actual parameter is interpreted
by va_arg as a pointer, and so not extending to 64 bits causes the callee to
load additional undefined bits.
llvm-svn: 336120
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Subscribers: vsk, mgorny, llvm-commits
Author: NutshellySima
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336114
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
llvm-svn: 336113
We have special case support for 2 shift values for basic blends, but irregular shift patterns end up using the generic lowering, despite shuffle lowering being good enough to handle more complex blends.
llvm-svn: 336112
Summary:
Replace use of a SmallPtrSet with a SmallSetVector to make the worklist
iteration order deterministic. This is done as the order the blocks are
removed may affect whether or not PHI nodes in successor blocks are
removed.
For example, consider the following case where %bb1 and %bb2 are
removed:
bb1:
br i1 undef, label %bb3, label %bb4
bb2:
br i1 undef, label %bb4, label %bb3
bb3:
pv1 = phi type [ undef, %bb1 ], [ undef, %bb2], [ v0, %other ]
br label %bb4
bb4:
pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ],
[ pv1, %bb3 ], [ v0, %other ]
If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2
to pv1 will be removed before %bb1 is removed as a predecessor to %bb4.
The pv1 node will thus be optimized out (to v0) at the time %bb1 is
removed as a predecessor to %bb4, leaving the blocks as following when
the incoming value from %bb1 has been removed:
bb3: ; pv1 optimized out, incoming value to pv2 is v0
br label %bb4
bb4:
pv2 = phi type [ v0, %bb3 ], [ v0, %other ]
The pv2 PHI node will be optimized away by removePredecessor() as all
incoming values are identical.
In case %bb2 is removed after %bb1, pv1 will not be optimized out at the
time %bb2 is removed as a predecessor to %bb4, leaving the blocks as
following when the incoming value from %bb2 to pv2 has been removed:
bb3:
pv1 = phi type [ undef, %bb2 ], [ v0, %other ]
br label %bb4
bb4:
pv2 = phi type [ pv1, %bb3 ], [ v0, %other ]
The pv2 PHI node will thus not be removed in this case, ultimately
leading to the following output
bb3: ; pv1 optimized out, incoming value to pv2 is v0
br label %bb4
bb4:
pv2 = phi type [ v0, %bb3 ], [ v0, %other ]
I have not looked into changing DeleteDeadBlock() so that the redundant
PHI nodes are removed.
I have not added a test case, as I was not able to create a particularly
small and (not messy) reproducer. This is likely due to SmallPtrSet
behaving deterministically when in small mode.
Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle
Reviewed By: fhahn
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D48369
llvm-svn: 336109
This was my mistake for only running test/MC/X86 and test/CodeGen/X86.
Arguably .word should be removed from this test, as it is not supported
universally.
llvm-svn: 336107
Currently the llvm-exegesis native architecture is determined by comparing the
llvm native architecture with X86, so to add a new target would mean adding a
new check. Change this to building up a list of the targets llvm-exegesis
supports then using that, as this means that when adding a new target you just
add the target to the list of supported targets.
Differential Revision: https://reviews.llvm.org/D48778
llvm-svn: 336105
The X86 asm parser currently has custom parsing logic for .word. Rather than
use this custom logic, we can just use addAliasForDirective to enable the
reuse of AsmParser::parseDirectiveValue.
See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon
(rL332607) backends.
Differential Revision: https://reviews.llvm.org/D47004
This is a fixed reland of rL336100. This should have been caught in
pre-commit testing so apologies for the noise.
llvm-svn: 336104
This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.
llvm-svn: 336102
Due to current limitations in constant analysis, we need flags
on add or mul to show propagation for the potential transform
suggested in these tests (no other binops currently report
identity constants).
llvm-svn: 336101
The X86 asm parser currently has custom parsing logic for .word. Rather than
use this custom logic, we can just use addAliasForDirective to enable the
reuse of AsmParser::parseDirectiveValue.
See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon
(rL332607) backends.
Differential Revision: https://reviews.llvm.org/D47004
llvm-svn: 336100
Currently the cycle counter is taken from the subtarget schedule model, which
isn't any use if the subtarget doesn't have one. Delegate the decision to the
target benchmark runner, as it may know better what to do in that case, with
the default being the current behaviour.
Differential Revision: https://reviews.llvm.org/D48779
llvm-svn: 336099
This version contains a fix to add values for which the state in ParamState change
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true, if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.
Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.
Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.
Reviewers: dberlin, mssimpso, davide, efriedma
Reviewed By: mssimpso
Differential Revision: https://reviews.llvm.org/D43762
llvm-svn: 336098