Summary:
getAsInstruction is the only non-const member method.
It is impossible to enforce const-correctness because of it.
Reviewers: jmolloy, majnemer
Reviewed By: jmolloy
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70113
Summary:
This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction.
ConstantExpr freeze is removed, as discussed in the previous review.
FreezeOperator is not added because there's no ConstantExpr freeze.
`freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed.
InstVisitor has visitFreeze now because freeze is not unaryop anymore.
Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri
Reviewed By: craig.topper, lebedev.ri
Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69932
Summary: A user can force a function to be inlined by specifying the always_inline attribute. Currently, thinlto implementation is not aware of always_inline functions and does not guarantee import of such functions, which in turn can prevent inlining of such functions.
Patch by Bharathi Seshadri <bseshadr@cisco.com>
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70014
Patch enables import of write-only variables with non-trivial initializers
to fix linker errors. Initializers of imported variables are converted to
'zeroinitializer' to avoid promotion of referenced objects.
Differential revision: https://reviews.llvm.org/D70006
ShuffleVectorInst::isExtractSubvectorMask, introduced in
[CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)
erroneously thought that
%340 = shufflevector <4 x float> %339, <4 x float> undef, <3 x i32> <i32 2, i32 3, i32 undef>
is a subvector extract, even though it goes off the end of the parent
vector with the undef index. That then caused an assert in
BasicTTIImplBase::getExtractSubvectorOverhead.
This commit fixes that, by not considering the above a subvector
extract.
Differential Revision: https://reviews.llvm.org/D70005
Change-Id: I87b8b00b24bef19ffc9a1b82ef4eca3b8a246eaf
Summary:
To be used in `ConstantRange::mulWithNoOverflow()`,
may in future be useful for when saturating shift/mul ops are added.
These are precise as far as i can tell.
I initially though i will need `APInt::[us]mul_sat()` for these,
but it turned out much simpler to do what `ConstantRange::multiply()`
does - perform multiplication in twice the bitwidth, and then truncate.
Though here we want saturating signed truncation.
Reviewers: nikic, reames, spatel
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69994
Summary:
To be used in `ConstantRange::shlWithNoOverflow()`,
may in future be useful for when saturating shift/mul ops are added.
Unlike `ConstantRange::shl()`, these are precise.
Reviewers: nikic, spatel, reames
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69960
Summary: The 'BB->getParent()' pointer was utilized before it was verified against nullptr. Check lines: 3567, 3581.
Reviewers: jyknight, RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69751
Patch allows importing declarations of functions and variables, referenced
by the initializer of some other readonly variable.
Differential revision: https://reviews.llvm.org/D69561
Summary:
When adjusting function entry counts after inlining, Funciton::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross module callee for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.
Reviewers: wmi, davidxl, tejohnson
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69736
Summary:
Much like D67339, adds ConstantRange handling for
when we know no-wrap behavior of the `sub`.
Unlike addWithNoWrap(), we only get lucky re returning empty set
for signed wrap. For unsigned, we must perform overflow check manually.
A patch that makes use of this in LVI (CVP) to be posted later.
Reviewers: nikic, shchenz, efriedma
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69918
As discussed in https://reviews.llvm.org/D69918
that happens to work as intended, and returns empty set if
there is always an overflow because we get lucky with intersection.
Since there's now an explicit test for that, let's prefer cleaner code.
As noted in follow-up to:
rGa1e8ad4f2fa7
It's not safe to assume that an element of the constant is always
non-null. It's definitely not an expected case for the current
instcombine user, but that may not hold if this function is
eventually called from arbitrary places.
DIExpression::isImplicit() did not handle DW_OP_LLVM_fragment
correctly. It was scanning the elements in the expression by
iterating from the end. But we do not know the position of
ops unless we iterate from the beginning of the expression,
since DW_OP:s and their operands are stored flat in the expression
list. The old code also assumed that a DW_OP_LLVM_fragment
only occupied one element in the expression list, but it actually
occupies three elements.
If module pass uses on-demand function analyses then structure is being
displayed incorrectly because FunctionPassManagerImpl can't dump contained
FPPassManager instances.
Differential revision: https://reviews.llvm.org/D69315
Summary:
Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation.
The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254.
In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager.
Reviewers: chandlerc, echristo
Subscribers: mehdi_amini, sanjoy.google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69121
This should trigger a dereference before null-check warning,
but I don't see it when building with clang. In any case, the
current and known future users of this helper require non-null
args, so I'm converting the 'if' to an assert.
This fixes a minor oversight mentioned in the review of D69379:
we should push extractelement into the operands of getelementptr
regardless of whether that enables further folding.
Emit a remarks section by default for the following formats:
* bitstream
* yaml-strtab
while still providing -remarks-section=<bool> to override the defaults.
Summary:
Getelementptr has vector type if any of its operands are vectors
(the scalar operands being implicitly broadcast to all vector elements).
Extractelement applied to a vector getelementptr can be folded by
applying the extractelement in turn to all of the vector operands.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69379
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
new functions are compatible before upgrading a function call to an
intrinsic call.
Sometimes users insert calls to ARC runtime functions that are not
compatible with the corresponding intrinsic functions (for example,
'i8* @objc_storeStrong' instead of 'void @objc_storeStrong'). Don't
upgrade those calls.
rdar://problem/56447127
Current implementation of Instruction::mayReadFromMemory()
returns !doesNotAccessMemory() which is !ReadNone. This
does not take into account that the writeonly attribute
also indicates that the call does not read from memory.
The patch changes the predicate to !doesNotReadMemory()
that reflects the intended behavior.
Differential Revision: https://reviews.llvm.org/D69086
llvm-svn: 375389
Summary:
If all the shifts amount are already poison-producing,
then we can add more poison-producing flags ontop:
https://rise4fun.com/Alive/Ocwi
Otherwise, we should only consider the possible range of shift amts that don't result in poison.
For unsigned range not not overflow, we must not shift out any set bits,
and the actual limit for `x` can be computed by backtransforming
the maximal value we could ever get out of the `shl` - `-1` through
`lshr`. If the `x` is any larger than that then it will overflow.
Likewise for signed range, but just in signed domain..
This is based on the general idea outlined by @nikic in https://reviews.llvm.org/D68672#1714990
Reviewers: nikic, sanjoy
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits, nikic
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69217
llvm-svn: 375370
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.
Original commit message:
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.
This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.
To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.
The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.
This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.
To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.
I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.
On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.
I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.
I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).
Differential revision: https://reviews.llvm.org/D63932
llvm-svn: 375094
Summary:
Internally in LLVM's metadata we use DW_OP_entry_value operations with
the same semantics as DWARF; that is, its operand specifies the number
of bytes that the entry value covers.
At the time of emitting entry values we don't know the emitted size of
the DWARF expression that the entry value will cover. Currently the size
is hardcoded to 1 in DIExpression, and other values causes the verifier
to fail. As the size is 1, that effectively means that we can only have
valid entry values for registers that can be encoded in one byte, which
are the registers with DWARF numbers 0 to 31 (as they can be encoded as
single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte
DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump
will print an operation "correctly", even if the byte size is less than
that, which may make it seem that we emit correct DWARF for registers
with DWARF numbers > 31. If you instead use readelf for such cases, it
will interpret the number of specified bytes as a DWARF expression. This
seems like a limitation in llvm-dwarfdump.
As suggested in D66746, a way forward would be to add an internal
variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand
instead specifies the number of operations that the entry value covers,
and we then translate that into the byte size at the time of emission.
In this patch that internal operation is added. This patch keeps the
limitation that a entry value can only be applied to simple register
locations, but it will fix the issue with the size operand being
incorrect for DWARF numbers > 31.
Reviewers: aprantl, vsk, djtodoro, NikolaPrica
Reviewed By: aprantl
Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D67492
llvm-svn: 374881
Summary:
The guard for printing function flags in the summary was not checking
the NoInline flag.
Reviewers: wmi
Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68948
llvm-svn: 374802