This patch adds support for UNIMP in both 32- and 16-bit forms. The 32-bit
form can be seen as a variant of the ECALL/EBREAK/etc. family of instructions.
The 16-bit form is just all zeroes, which isn't a valid RISC-V instruction,
but still follows the 16-bit instruction form (i.e. bits 0-1 != 11).
Until recently unimp was undocumented and supported just by binutils, which
printed unimp for either the 16 or 32-bit form. Both forms are now documented
<https://github.com/riscv/riscv-asm-manual/pull/20> and binutils now supports
c.unimp <https://sourceware.org/ml/binutils-cvs/2018-11/msg00179.html>.
Differential Revision: https://reviews.llvm.org/D54316
Patch by Luís Marques.
llvm-svn: 347988
For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the result of ISD::FLT_ROUNDS_.
Differential Revision: https://reviews.llvm.org/D53820
llvm-svn: 347986
For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the operands of ISD::PREFETCH.
Differential Revision: https://reviews.llvm.org/D53281
llvm-svn: 347980
Terminator folding transform lacks MemorySSA update for memory Phis,
while they exist within MemorySSA analysis. They need exactly the same
type of updates as regular Phis. Failing to update them properly ends up
with inconsistent MemorySSA and manifests in various assertion failures.
This patch adds Memory Phi updates to this transform.
Thanks to @jonpa for finding this!
Differential Revision: https://reviews.llvm.org/D55050
Reviewed By: asbirlea
llvm-svn: 347979
For targets where i32 is not a legal type (e.g. 64-bit RISC-V),
LegalizeIntegerTypes must promote the operand.
Differential Revision: https://reviews.llvm.org/D53279
llvm-svn: 347978
DAGTypeLegalizer::PromoteSetCCOperands currently prefers to zero-extend
operands when it is able to do so. For some targets this is more expensive
than a sign-extension, which is also a valid choice. Introduce the
isSExtCheaperThanZExt hook and use it in the new SExtOrZExtPromotedInteger
helper. On RISC-V, we prefer sign-extension for FromTy == MVT::i32 and ToTy ==
MVT::i64, as it can be performed using a single instruction.
Differential Revision: https://reviews.llvm.org/D52978
llvm-svn: 347977
As discussed in the RFC
<http://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>, 64-bit
RISC-V has i64 as the only legal integer type. This patch introduces patterns
to support codegen of the new instructions
introduced in RV64I: addiw, addiw, subw, sllw, slliw, srlw, srliw, sraw,
sraiw, ld, sd.
Custom selection code is needed for srliw as SimplifyDemandedBits will remove
lower bits from the mask, meaning the obvious pattern won't work:
def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
(SRLIW GPR:$rs1, uimm5:$shamt)>;
This is sufficient to compile and execute all of the GCC torture suite for
RV64I other than those files using frameaddr or returnaddr intrinsics
(LegalizeDAG doesn't know how to promote the operands - a future patch
addresses this).
When promoting i32 sltu/sltiu operands, it would be more efficient to use
sign-extension rather than zero-extension for RV64. A future patch adds a hook
to allow this.
Differential Revision: https://reviews.llvm.org/D52977
llvm-svn: 347973
D47882, D48130 and D48131 introduce a new lowering strategy for part-word
atomicrmw/cmpxchg and uses it to lower these operations for the RISC-V target.
Rather than having AtomicExpandPass produce the LL/SC loop in the IR level, it
instead calculates the necessary mask values and inserts a target-specific
intrinsic, which is lowered at a much later stage (after register allocation).
This ensures that architecture-specific restrictions for forward-progress in
LL/SC loops can be guaranteed.
This patch documents this new AtomicExpandPass functionality. See the previous
llvm-dev RFC for more info
<http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>.
Differential Revision: https://reviews.llvm.org/D52234
llvm-svn: 347971
Previously we emitted a punpcklbw/punpckhbw to move the byte elements into the upper half of 16 bit elements then shifted right by 8 to zero the upper bits. After DAG combine we end up with punpcklbw/punpckhbw into the lower bits with zeros in the uppers bits and no shifts. So just emit that directly.
llvm-svn: 347966
Don't expand SDIV with an immediate that is a power of 2 if we optimise for
minimum code size. For example:
sdiv %1, i32 4
gets expanded to a sequence of 3 instructions, but this is suboptimal for
minimum code size so instead we just generate a MOV and a SDIV if integer
division is supported.
Differential Revision: https://reviews.llvm.org/D54546
llvm-svn: 347965
Three minor changes to these extra costs:
* For ICmp instructions, instead of adding 2 all the time for extending each
operand, this is only done if that operand is neither a load or an
immediate.
* The operands extension costs for divides removed, because we now use a high
cost already for the divide (20).
* The costs for lhsr/ashr extra costs removed as this did not seem useful.
Review: Ulrich Weigand
https://reviews.llvm.org/D55053
llvm-svn: 347961
We had a EVT variable capturing the result of getSimpleValueType which returns an MVT. Another place using EVT that could have been MVT. And an 'int' that should be 'unsigned'.
llvm-svn: 347959
In this diff the elf-specific tests are moved into the subfolder llvm-objcopy/ELF
(the change was discussed in the comments on https://reviews.llvm.org/D54674).
A separate code reivew wasn't sent for this change
since Phabricator is failing to create such a large diff.
Test plan:
make check-all
make check-llvm-tools
make check-llvm-tools-llvm-objcopy
llvm-svn: 347958
Summary:
Suppressed warnings in release builds due to variable used
only in assert statement.
Subscribers: llvm-commits, eraman, mgorny
Differential Revision: https://reviews.llvm.org/D55100
llvm-svn: 347939
The add_llvm_symbol_exports function in AddLLVM.cmake creates command
line link flags with paths containing CMAKE_CURRENT_BINARY_DIR, but that
will break if CMAKE_CURRENT_BINARY_DIR contains whitespace. This patch
adds quotes to those paths.
Fixes PR39843.
Patch by John Garvin.
Differential Revision: https://reviews.llvm.org/D55081
llvm-svn: 347937
r320789 suppressed moving the insertion point of SCEV expressions with
dev/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
This fixes PR30806.
Differential Revision: https://reviews.llvm.org/D54713
llvm-svn: 347934
Also adds a boring build file for llvm/lib/BinaryFormat (needed by llvm/lib/IR).
lib/IR marks Attributes and IntrinsicsEnum as public_deps (because IR's public
headers include the generated .inc files), so projects depending on lib/IR will
implicitly depend on them being generated. As a consequence, most targets won't
have to explicitly list a dependency on these tablegen steps (contrast with
intrinsics_gen in the cmake build).
This doesn't yet have the optimization where tablegen's output is only updated
if it's changed.
Differential Revision: https://reviews.llvm.org/D55028#inline-486755
llvm-svn: 347927
Also fix a missing file in lib/Support/BUILD.gn found by the script.
The script is very stupid and assumes that CMakeLists.txt follow the standard
LLVM CMakeLists.txt formatting with one cpp source file per line. Despite its
simplicity, it works well in practice.
It would be nice if it also checked deps and maybe automatically applied its
suggestions.
Differential Revision: https://reviews.llvm.org/D54930
llvm-svn: 347925
Summary:
Expands for vector types all of the integer operations that are
expanded for scalars because they are not supported at all by
WebAssembly.
This CL has no tests because such tests would really be testing the
target-independent expansion, but I'm happy to add tests if reviewers
think it would be helpful.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D55010
llvm-svn: 347923
Scattered ARM relocations for Mach-O's only have 24 bits available to
encode the offset. This is not checked but just truncated and can result
in corrupt binaries after linking because the relocations are applied to
the wrong offset. This patch will check and error out in those
situations instead of emitting a wrong relocation.
Patch by: Sander Bogaert (dzn)
Differential revision: https://reviews.llvm.org/D54776
llvm-svn: 347922
The motivating case for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=32023
and the corresponding rot16.ll regression tests.
Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc
sequences that don't get folded in IR.
As the TODO comments suggest, there will be regressions if we extend this (for x86,
we mostly seem to be missing LEA opportunities, but there are likely vector folds
missing too). I think those should be considered existing bugs because this is the
same transform that we do as an IR canonicalization in instcombine. We just need
more tests to make those visible independent of this patch.
Differential Revision: https://reviews.llvm.org/D54640
llvm-svn: 347917
Utilise a similar ('late') lowering strategy to D47882. The changes to
AtomicExpandPass allow this strategy to be utilised by other targets which
implement shouldExpandAtomicCmpXchgInIR.
All cmpxchg are lowered as 'strong' currently and failure ordering is ignored.
This is conservative but correct.
Differential Revision: https://reviews.llvm.org/D48131
llvm-svn: 347914
Also revert fix r347876
One of the buildbots was reporting a failure in some relevant tests that I can't
repro or explain at present, so reverting until I can isolate.
llvm-svn: 347911
Currently CaptureTracker gives up if it encounters a value with more than 20
uses. The motivation for this cap is to keep it relatively cheap for
BasicAliasAnalysis use case, where the results can't be cached. Although, other
clients of CaptureTracker might be ok with higher cost. This patch introduces an
argument for PointerMayBeCaptured functions to specify the max number of uses to
explore. The motivation for this change is a downstream user of CaptureTracker,
but I believe upstream clients of CaptureTracker might also benefit from more
fine grained cap.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D55042
llvm-svn: 347910
It makes more sense to order FI-based memops in descending order when
the stack goes down. This allows offsets to stay "consecutive" and allow
easier pattern matching.
llvm-svn: 347906
I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG to run on the custom code before constant build_vectors are lowered in LegalizeDAG.
I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse.
Differential Revision: https://reviews.llvm.org/D54276
llvm-svn: 347902
This is another patch for -x86-experimental-vector-widening. This pre widens narrow division by constants so that we can get pass the legal type check in the generic DAG combiner. Otherwise we end up scalarizing.
I've restricted this to splats for now because it was easy to just call DAG.getConstant. Not sure what we should do for non-splat? Increase the element size?Widen the constant vector by padding with 1?
Differential Revision: https://reviews.llvm.org/D54919
llvm-svn: 347898
This is an almost direct move of the functionality from InstCombine to
InstSimplify. There's no reason not to do this in InstSimplify because
we never create a new value with this transform.
(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)
I've changed 1 of the conditions for the fold (1 of the blocks for the
branch must be the block we started with) into an assert because I'm not
sure how that could ever be false.
We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.
The 3-way compare changes show that InstCombine has some kind of
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.
llvm-svn: 347896
When tablegen detects that there exist two subregister compositions that
result in the same value for some register, it will emit a warning. This
kind of an overlap in compositions should only happen when it is caused
by a user-defined composition. It can happen, however, that the user-
defined composition is not identically equal to another one, but it does
produce the same value for one or more registers. In such cases suppress
the warning.
This patch is to silence the warning when building the System Z backend
after D50725.
Differential Revision: https://reviews.llvm.org/D50977
llvm-svn: 347894
Summary:
Replace `aext([asz]ext x)` with `aext/sext/zext x` in order to
reduce the number of instructions generated to clean up some
legalization artifacts.
Reviewers: aditya_nandakumar, dsanders, aemerson, bogner
Reviewed By: aemerson
Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D54174
llvm-svn: 347893
Summary: The original intention of !Config.xx.empty() was probably to emphasize the thing that is currently considered, but I feel the simplified form is actually easier to understand and it is also consistent with the call sites in other llvm components.
Reviewers: alexshap, rupprecht, jakehehrlich, jhenderson, espindola
Reviewed By: alexshap, rupprecht
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D55040
llvm-svn: 347891
This commit caused a large compile-time slowdown in some cases when NDEBUG is
off due to the dominator tree verification it added. Fix this by only doing
dominator tree and loop info verification when something has been hoisted.
Differential Revision: https://reviews.llvm.org/D52827
llvm-svn: 347889
Summary:
We can sometimes end up with multiple copies of a local variable that
have the same GUID in the index. This happens when there are local
variables with the same name that are in different source files having the
same name/path at compile time (but compiled into different bitcode objects).
In this case make sure we import the copy in the caller's module.
This enables importing both of the variables having the same GUID
(but which will have different promoted names since the module paths,
and therefore the module hashes, will be distinct).
Importing the wrong copy is particularly problematic for read only
variables, since we must import them as a local copy whenever
referenced. Otherwise we get undefs at link time.
Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed
for testing the distributed index case via clang, which will be sent as
a separate clang-side patch shortly. We were previously not doing the
dead code/read only computation before computing imports when testing
distributed index generation (like it was for testing importing and
other ThinLTO mechanisms alone).
Reviewers: evgeny777
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D55047
llvm-svn: 347886
"svn update --depth=..." is, annoyingly, not a specification of the
desired depth, but rather a _limit_ added on top of the "sticky" depth
in the working-directory. However, if the directory doesn't exist yet,
then it sets the sticky depth of the new directory entries.
Unfortunately, the svn command-line has no way of expanding the depth
of a directory from "empty" to "files", without also removing any
already-expanded subdirectories. The way you're supposed to increase
the depth of an existing directory is via --set-depth, but
--set-depth=files will also remove any subdirs which were already
requested.
This change avoids getting into the state of ever needing to increase
the depth of an existing directory from "empty" to "files" in the
first place, by:
1. Use svn update --depth=files, not --depth=immediates.
The latter has the effect of checking out the subdirectories and
marking them as depth=empty. The former excludes sub-directories from
the list of entries, which avoids the problem.
2. Explicitly populate missing parent directories.
Using --parents seemed nice and easy, but it marks the parent dirs as
depth=empty. Instead, check out parents explicitly if they're missing.
llvm-svn: 347883
This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit.
A new tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly).
Differential: https://reviews.llvm.org/D54714
llvm-svn: 347877
My change svn-id: 347871 caused a buildbot failure due to an unused
variable def (used in an assert).
Change-Id: Ia882d18bb6fa79b4d7bbfda422b9ea5d23eab336
llvm-svn: 347876