This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.
Differential Revision: https://reviews.llvm.org/D34806
llvm-svn: 315853
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.
Differential Revision: https://reviews.llvm.org/D34805
llvm-svn: 315852
This avoid code duplication and allow us to add the disable unroll metadata elsewhere.
Differential Revision: https://reviews.llvm.org/D38928
llvm-svn: 315850
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.
It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.
Reviewers: RKSimon, delena, zvi
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38932
llvm-svn: 315849
We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects.
llvm-svn: 315848
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.
This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.
Depends on D37443
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37445
llvm-svn: 315843
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
Depends on D36618
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37443
Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().
llvm-svn: 315841
Fixes cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl
test after r315808
We may hit few other similar issues, but I want to discuss good
solution offline.
llvm-svn: 315830
- Update docs to match llvm coding style
- Add missing FP16_OVFL bit for gfx9
- Fix the size of the kernel descriptor in the docs
Differential Revision: https://reviews.llvm.org/D38902
llvm-svn: 315822
- Do not allow amd_amdgpu_isa directives on non-amdgcn architectures
- Do not allow amd_amdgpu_hsa_metadata on non-amdhsa OSes
- Do not allow amd_amdgpu_pal_metadata on non-amdpal OSes
Differential Revision: https://reviews.llvm.org/D38750
llvm-svn: 315812
- Emit NT_AMD_AMDGPU_ISA
- Add assembler parsing for isa version directive
- If isa version directive does not match command line arguments, then return error
Differential Revision: https://reviews.llvm.org/D38748
llvm-svn: 315808
If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle.
Fixes the last issue with PR22415
llvm-svn: 315807
We don't need a bitconvert as a root pattern in these cases. The types in the other parts of the pattern are sufficient to express the behavior of these instructions.
llvm-svn: 315798
I believe these were added incorrectly under the belief that the load size was smaller than the input register size, but that's not true.
llvm-svn: 315795
"No such file or directory: C:\\...\\tests\\Output\\shared-output.py.tmp/Output/Shared/SHARED.tmp"
And yet other forward-slashes don't seem to be causing the same
problem. I'll see if I can get ahold of a Windows machine to poke at
this directly later.
llvm-svn: 315792
Currently llvm assembler emits parsing error for valid IR assembly
alloca i32, i32 9, addrspace(5)
when alloca addr space is 5.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D38713
llvm-svn: 315791
Summary:
This patch removes the `verifyNCD` check.
The reason for this is that the other checks are sufficient to prove or disprove correctness of any DominatorTree, and that `verifyNCD` doesn't provide (in my option) better error messages then the other ones.
Additionally, this should give a (small) improvement to the total verification time, as the check is O(n), and checking the sibling property takes O(n^3).
Reviewers: dberlin, grosser, davide, brzycki
Reviewed By: brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38802
llvm-svn: 315790
There were two copies of the logic needed to construct a line stats
object for each line in a range: this patch brings it down to one. In
the future, this will make it easier for IDE clients to display coverage
in-line in source editors. To do that, we just need to move the new
LineCoverageIterator class to libCoverage.
llvm-svn: 315789
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315781
Summary:
Operand variable lookups are now performed by the RuleMatcher rather than
searching the whole matcher hierarchy for a match. This revealed a wrong-code
bug that currently affects ARM and X86 where patterns that use a variable more
than once in the match pattern will be imported but won't check that the
operands are identical. This can cause the tablegen-erated matcher to
accept matches that should be rejected.
Depends on D36569
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: aemerson, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36618
llvm-svn: 315780