`FeatureMadd4` is used to disable `madd4`, and the corresponding feature
option is `(+-)nomadd4`. Renaming it to `FeatureNoMadd4` makes its
purpose clear.
Patch by YunQiang Su.
Differential Revision: https://reviews.llvm.org/D83780
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the ternary subset (zbt subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79875
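As an illustration only, the kind of source-level shapes a ternary (zbt) matcher could target might look like the sketch below. The function names are hypothetical, and the mapping to `cmix` and `cmov` is an assumption based on the 0.92 draft; the commit message does not list the exact patterns matched.

```cpp
#include <cstdint>

// Bitwise mix: take bits from b where mask is 1 and from c where mask is 0.
// Assumed candidate for `cmix` (hypothetical helper name).
uint32_t bitwise_mix(uint32_t mask, uint32_t b, uint32_t c) {
  return (mask & b) | (~mask & c);
}

// Conditional move on a non-zero condition.
// Assumed candidate for `cmov` (hypothetical helper name).
uint32_t select_nonzero(uint32_t cond, uint32_t a, uint32_t b) {
  return cond != 0 ? a : b;
}
```

Whether a given compiler build folds these into single instructions depends on the target features enabled and on the patterns actually implemented.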
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the single-bit subset (zbs subextension) of
the experimental B extension of RISC-V.
It also adds the corresponding codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79874
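For orientation, single-bit operations of the sort the zbs subset covers can be written in plain C++ as below. This is a sketch: the helper names are hypothetical, and the instruction names in the comments (`sbset`, `sbclr`, `sbinv`, `sbext`) are taken from the 0.92 draft as an assumption, not from this commit message.

```cpp
#include <cstdint>

// Set bit i of x (index taken modulo 32). Assumed candidate for `sbset`.
uint32_t set_bit(uint32_t x, uint32_t i) { return x | (1u << (i & 31)); }

// Clear bit i of x. Assumed candidate for `sbclr`.
uint32_t clear_bit(uint32_t x, uint32_t i) { return x & ~(1u << (i & 31)); }

// Invert bit i of x. Assumed candidate for `sbinv`.
uint32_t invert_bit(uint32_t x, uint32_t i) { return x ^ (1u << (i & 31)); }

// Extract bit i of x as 0 or 1. Assumed candidate for `sbext`.
uint32_t extract_bit(uint32_t x, uint32_t i) { return (x >> (i & 31)) & 1u; }
```

Masking the index with 31 keeps the shift well defined in C++ for any input.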
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions belonging to both the permutation and the base
subsets of the experimental B extension of RISC-V.
It also adds the corresponding codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79873
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the permutation subset (zbp subextension) of
the experimental B extension of RISC-V.
It also adds the corresponding codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79871
This patch provides optimization of bit manipulation operations by
enabling the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.
This patch is based on Claire Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf
Differential Revision: https://reviews.llvm.org/D79870
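A rough sketch of base-subset style operations, written as ordinary C++ expressions, is shown below. The helper names are hypothetical, and the association with `andn`, `xnor` and `rol` is an assumption drawn from the 0.92 draft rather than from this commit message.

```cpp
#include <cstdint>

// a AND (NOT b). Assumed candidate for `andn`.
uint32_t and_not(uint32_t a, uint32_t b) { return a & ~b; }

// NOT (a XOR b). Assumed candidate for `xnor`.
uint32_t xnor_bits(uint32_t a, uint32_t b) { return ~(a ^ b); }

// Rotate left by n (modulo 32). Assumed candidate for `rol`.
// The second shift amount is masked so that n == 0 does not shift by 32.
uint32_t rotate_left(uint32_t x, uint32_t n) {
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```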
This fixes an instance where MemorySSA-using Dead Store Elimination is failing
to do a transformation that the non-MemorySSA-using version does.
Differential Revision: https://reviews.llvm.org/D83783
The actual rotation happens in processLoop, so the second removed
call to verifyMemorySSA was unnecessary.
In fact, processLoop/rotateLoop already verify MemorySSA before
and after transforming each loop. Hence, both calls can be removed.
Pointed out by @lebedev.ri post-commit D51718.
Summary:
Without these, the generic branch relaxation pass will underestimate the
range required for branches spanning these and we can end up with
"fixup value out of range" errors rather than relaxing the branches.
Some of the instructions in the expansion may end up being compressed
but exactly determining that is awkward, and these conservative values
should be safe, if slightly suboptimal in rare cases.
Reviewers: asb, lenary, luismarques, lewis-revill
Reviewed By: asb, luismarques
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77443
In D83482 we agreed to name e_* fields that are used for overriding
values (like e_phoff) as EPh* (e.g. EPhOff).
Currently we have a set of e_sh* fields that are named inconsistently
with this rule. This patch renames all of them.
Differential revision: https://reviews.llvm.org/D83766
In 2b3c505, the pointer arguments for the matrix load and store
intrinsics were changed to always be the element type of the vector
argument.
This patch updates the MatrixBuilder to not add the pointer type to the
overloaded types and adjusts the clang/mlir tests.
This should fix a few build failures on GreenDragon, including
http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O0-g/7891/
This improves the condition in ELFFile::program_headers().
Previously it was possible to read the headers from the wrong place when
the value of e_phoff was so large that computation overflowed.
Differential revision: https://reviews.llvm.org/D83774
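The class of bug described here can be sketched in isolation. The helpers below are hypothetical and are not the code in ELFFile::program_headers(); they only show why a check written as an addition can pass for a huge offset, and one way to phrase the check so it cannot wrap.

```cpp
#include <cstdint>

// Naive check: offset + size may wrap around for a very large offset,
// so an out-of-range table can appear to fit.
bool naiveRangeFitsInFile(uint64_t offset, uint64_t size, uint64_t fileSize) {
  return offset + size <= fileSize;
}

// Overflow-safe check: compare by subtraction, which cannot wrap once
// offset <= fileSize has been established.
bool rangeFitsInFile(uint64_t offset, uint64_t size, uint64_t fileSize) {
  return offset <= fileSize && size <= fileSize - offset;
}
```

With an offset near UINT64_MAX, the naive version accepts the range because the sum wraps to a small value, while the second version rejects it.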
Fix incorrect use of the size of Path when accessing PathUTF16, as the
UTF-16 path can be shorter. Added a unit test to cover this case.
Thanks to Ding Fei (danix800) for the code fix, see
https://reviews.llvm.org/D83321.
Differential Revision: https://reviews.llvm.org/D83689
Some of the system registers readable on AArch64 and ARM platforms
return different values with each read (for example a timer counter).
These shouldn't be hoisted outside loops or otherwise interfered with,
but the normal @llvm.read_register intrinsic is only considered to read
memory.
This introduces a separate @llvm.read_volatile_register intrinsic and
maps all system-registers on ARM platforms to use it for the
__builtin_arm_rsr calls. Registers declared with asm("r9") or similar
are unaffected.
The existing code already considered this case. Unfortunately a typo in
the condition prevents it from triggering. Also the existing code, had
it run, forgot to do the folding.
This fixes PR42876.
Differential Revision: https://reviews.llvm.org/D65802
Summary:
This change avoids exposing tensorflow types when including TFUtils.h.
They are just an implementation detail, and don't need to be used
directly when implementing an analysis requiring ML model evaluation.
The TFUtils APIs, while generically typed, are still not exposed unless
the tensorflow C library is present, as they currently have no use
otherwise.
Reviewers: mehdi_amini, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83843
The dependencies in llvm/unittests/Transforms/IPO/CMakeLists.txt
introduced in revision 0750757e were incomplete, leading to link errors
for a -DBUILD_SHARED_LIBS=True build.
During code generation we might change/add basic blocks so keeping a
list of them is fairly easy to break. Nested parallel regions were
enough. The new scheme does recompute the list of blocks to be outlined
once it is needed.
Reviewed By: anchu-rajendran
Differential Revision: https://reviews.llvm.org/D82722
Since D82572, we keep "reference" edges for callback call sites. While
not strictly necessary they can improve the traversal order. However, we
did not update them properly in case a pass removed the callback call
site which caused a verification error (PR46687). With this patch we
update these reference edges properly during the invocation of
`CallGraphSCCPass::RefreshCallGraph` in non-checking mode.
Reviewed By: sdmitriev
Differential Revision: https://reviews.llvm.org/D83718
Since D83271 we can optimize the GPU state machine to avoid spurious
call edges that increase the register usage of kernels. With this patch
we inform the user whether this optimization is happening and, when it
is not, why.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D83707
Summary: This patch adds a dependency graph to the Attributor so that we can dump the dependencies between AAs more easily. We can also apply general graph algorithms to the graph, making it easier for us to create deep wrappers.
Reviewers: jdoerfert, sstefan1, uenoku, homerdin, baziotis
Reviewed By: jdoerfert
Subscribers: jfb, okura, mgrang, kuter, lebedev.ri, hiraditya, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78861
Rather than handling zlib manually, use find_package from CMake
to find zlib properly. Use this to normalize LLVM_ENABLE_ZLIB,
HAVE_ZLIB, and HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.
Differential Revision: https://reviews.llvm.org/D79219
Summary: This patch adds a dependency graph to the Attributor so that we can dump the dependencies between AAs more easily. We can also apply general graph algorithms to the graph, making it easier for us to create deep wrappers.
Reviewers: jdoerfert, sstefan1, uenoku, homerdin, baziotis
Reviewed By: jdoerfert
Subscribers: jfb, okura, mgrang, kuter, lebedev.ri, hiraditya, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78861
Summary: This patch introduces a basic unit test interface for the Attributor and a simple test case for casting.
Reviewers: jdoerfert, sstefan1, uenoku, homerdin, baziotis
Reviewed By: jdoerfert
Subscribers: mgorny, uenoku, kuter, okura, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83754
Add handling of s_andn2 and mask of 0.
This eliminates redundant instructions from uniform control flow.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D83641
Summary: The `getIdAddr()` function returns the address of the ID of the abstract attribute.
Reviewers: jdoerfert, sstefan1, uenoku, homerdin, baziotis
Reviewed By: jdoerfert
Subscribers: okura, hiraditya, uenoku, kuter, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83172
If no alignment is specified, we use the insert position to get the module and, from it, the datalayout. But if those are null, then we dereference a null pointer.
This patch adds asserts to make the failure a little more obvious than just seg faulting.
Differential Revision: https://reviews.llvm.org/D83829
This patch fixes the out-of-tree build of Flang after the introduction of acc_gen.
Reviewed By: sscalpone
Differential Revision: https://reviews.llvm.org/D83835
destructor via a pointer of the wrong static type.
This caused crashes during deallocation in C++14 builds when using a
deallocator whose sized delete requires the size argument to be correct.
Also make the LazyCallThroughManager destructor protected to catch this
sort of bug in the future.
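The protected-destructor technique can be illustrated with a generic sketch. The class names below are hypothetical stand-ins, not the actual ORC classes, and the sketch assumes a base class that is not meant to be deleted directly.

```cpp
#include <type_traits>

// Hypothetical base class. With a protected destructor, code outside the
// hierarchy cannot write `delete basePtr;`, so destroying an object through
// a pointer of the wrong static type becomes a compile-time error.
class ManagerBase {
protected:
  ~ManagerBase() = default;
};

// Hypothetical concrete class. Its public destructor may call the protected
// base destructor, so owning and deleting a LocalManager directly is fine.
class LocalManager final : public ManagerBase {
public:
  ~LocalManager() = default;
};

static_assert(!std::is_destructible<ManagerBase>::value,
              "ManagerBase must not be destructible from outside");
static_assert(std::is_destructible<LocalManager>::value,
              "LocalManager must be destructible by its owner");
```

This matters for sized deallocation because deleting through the base type would pass the base's size to the deallocator instead of the size of the object that was actually allocated.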
Summary:
This allows users of the llvm library to discover whether llvm was built
with the tensorflow c API dependency, which helps if using the TFUtils
wrapper, for example.
We don't do the same for the LLVM_HAVE_TF_AOT flag, because that does
not expose any API.
Reviewers: mehdi_amini, davidxl
Subscribers: mgorny, aaron.ballman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83746
Instead of detecting it automatically but also allowing for the setting
to be specified explicitly, always detect whether RTTI is enabled
based on whether -fno-rtti (or equivalent) is used. It's less confusing
to have a single way of tweaking that knob.
This change follows the lead of 71d88cebfb42.
For `.reloc offset, *, *`, offset can currently be a constant or a symbol.
This patch makes it support any expression which can be folded to sym+constant.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D83751