Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches.
llvm-svn: 363710
Add `IsGP64bit` and `IsPTR64bit` to the list of `UnsupportedFeatures`
of the P5600 scheduling definitions. Also mark some MIPS 64-bit
instructions by PTR_64 and GPR_64 predicates. This reduces number
of "No schedule information for" and "lacks information for" errors
in case of marking this scheduler model as complete.
This patch is one of a series of patches. The goal is to make P5600
scheduler model complete and turn on the `CompleteModel` flag.
Differential Revision: https://reviews.llvm.org/D63237
llvm-svn: 363702
Set the hasNoSchedulingInfo flag for the`MipsAsmPseudoInst`. These
pseudo-instructions are never used by codegen. This flag allows to
reduce number of "No schedule information for" and "lacks information
for" errors in case of marking a scheduler model as complete.
This patch is one of a series of patches. The goal is to make P5600
scheduler model complete and turn on the `CompleteModel` flag.
Differential Revision: https://reviews.llvm.org/D63236
llvm-svn: 363701
When running LLDB lit tests on Windows, the system selects a debug version
of Python, which was issuing lots of ResourceWarnings about files that
weren't closed. There are two kinds of them, and each test triggered one
of each.
This patch fixes one kind by ensuring TestRunner explicitly close the
temporary files created for routing stderr. This is important on Windows
but has no net effect on Posix systems.
The remaining ResourceWarnings are more elusive; the bug may lie in
the Python library subprocess.py, and it may be Windows-specific.
Differential Revision: https://reviews.llvm.org/D63102
llvm-svn: 363700
This includes saturating and non-saturating shifts, both with
immediate shift count and with the shift counts given by another
vector register; VSHLC (in which the bits shifted out of each active
vector lane are shifted in to the next active lane); and also VMOVL,
which is enough like an immediate shift that it didn't fit too badly
in this category.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62672
llvm-svn: 363696
Summary:
These form a small family of their own, to go with the floating-point
VMINNM/VMAXNM instructions added in a previous commit.
They introduce the first of many special cases in the mnemonic
recognition code, because VMIN with the E suffix used by the VPT
predication system needs to avoid being interpreted as the nonexistent
instruction 'VMI' with an ordinary 'NE' condition suffix.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62671
llvm-svn: 363695
Part of fixing the X86 regression noted in D63281 - I've split this into X86 and generic parts - the generic commit will be coming shortly and will fix the vector-reduce-mul-widen.ll regression introduced here.
llvm-svn: 363693
Summary:
Their names began with a mishmash of `MVE_`, `t2` and no prefix at
all. Now they all start with `MVE_`, which seems like a reasonable
choice on the grounds that (a) NEON is the thing they're most at risk
of being confused with, and (b) MVE implies Thumb-2, so a prefix
indicating MVE is strictly more specific than one indicating Thumb-2.
Reviewers: ostannard, SjoerdMeijer, dmgreen
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63492
llvm-svn: 363690
This patch adds support for generating calls through the procedure
linkage table where required for a given ExternalSymbol or GlobalAddress
callee.
Differential Revision: https://reviews.llvm.org/D55304
llvm-svn: 363686
1) `-x foo` currently dumps one `foo`. This change makes it dump all `foo`.
2) `-x foo -x foo` currently dumps `foo` twice. This change makes it dump `foo` once.
In addition, if foo has section index 9, `-x foo -x 9` dumps `foo` once.
3) Give a warning instead of an error if `foo` does not exist.
The new behaviors match GNU readelf.
Also, print a new line as a separator between two section dumps.
GNU readelf uses two lines, but one seems good enough.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D63475
llvm-svn: 363683
This patch slightly refactors data structures internally used by the bottleneck
analysis to track data and resource dependencies.
This patch also updates methods used to print out information about dependency
edges when in debug mode.
This is the last of a sequence of commits done in preparation for an upcoming
patch that fixes PR37494. No functional change intended.
llvm-svn: 363677
Invert the name and return value to better reflect the imprecise
nature.
Force passing in the DefMI, since it's known in the 2 users and could
possibly fail for an arbitrary vreg.
Allow specifying a specific user instruction. Scan through use
instructions, instead of use operands. Add scan thresholds instead of
searching infinitely.
Stop using a set to track seen uses. I didn't understand this usage,
or why it would not check the last use. I don't think the use list has
any particular order.
llvm-svn: 363675
This adds vector splitting for vaarg instructions during type legalization
Committed on behalf of @luke (Luke Lau)
Differential Revision: https://reviews.llvm.org/D60762
llvm-svn: 363671
Some more refactoring, like registering the IT Block pass, less cryptic
variable names, and some simplification of loops.
Differential Revision: https://reviews.llvm.org/D63419
llvm-svn: 363666
Recommit of D32530 with a few small changes:
- Stopped recursively walking through aggregates in
the verifier, so that we don't impose too much
overhead on large modules under LTO (see PR42210).
- Changed tests to match; the errors are slightly
different since they only report the array or
struct that actually contains a scalable vector,
rather than all aggregates which contain one in
a nested member.
- Corrected an older comment
Reviewers: thakis, rengolin, sdesmalen
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D63321
llvm-svn: 363658
Summary:
The prior behavior of the triple matcher would stop
in the first matched triple. It was not possible to
create specific matches for sub-sets of a triple
(e.g aarch64-apple-darwin would never be used after
aarch64 was matched).
This patch:
1) Allows that specialized triples take priority,
considering that the string lenght of the triple
indentifies how specialized a triple is. If two
triples of same lenght match, the one matched first
prevails, preserving the old behavior.
2) Remove 20 duplicated triples of arm, thumb,
aarch64 options with same arguments, matching
the common prefix (aarch64, arm, thumb) of them.
3) Creates three new function matching regexes and
five triple options for arm64-apple-ios,
(arm|thumb)-apple-ios and thumb(v5)?-macho
Reviewers: lebedev.ri, RKSimon, MaskRay, gbedwell
Reviewed By: MaskRay
Subscribers: javed.absar, kristof.beyls, llvm-commits, carwil
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63145
llvm-svn: 363656
First step toward addressing the vector-reduce-mul-widen.ll regression in D63281 - we should replace ANY_EXTEND/ANY_EXTEND_VECTOR_INREG in X86ISelDAGToDAG to avoid having to add duplicate patterns when treating any extensions as legal.
In future patches this will also allow us to keep any extension nodes around a lot longer in the DAG, which should mean that we can keep better track of undef elements that otherwise become zeros that we think we have to keep......
Differential Revision: https://reviews.llvm.org/D63326
llvm-svn: 363655
This patch documents that LLVM does not describe all changes in variable
locations during the prologue and the epilogue. The debugger doesn't /
shouldn't step through that portion of the function anyway, and describing
every location through such stages would bloat location lists.
Perform some minor cleanup at the same time,
* Fix an enumerated list
* Document that dbg.declare intrinsics have their variable location recorded
in a MachineFunction table, not with DBG_VALUE meta-insts
* Adds frame-indexes to the list of things that can be operands to
DBG_VALUEs.
Differential Revision: https://reviews.llvm.org/D63083
llvm-svn: 363654
Using the new SwitchInstProfUpdateWrapper this patch
simplifies 3 places of prof branch_weights handling.
Differential Revision: https://reviews.llvm.org/D62123
llvm-svn: 363652
The isel patterns for these use a bitcast and load/store, but
DAG combine should have canonicalized those away.
For the purposes of the memory folding table these opcodes can be
replaced by the MOVSSrm_alt/MOVSDrm_alt and MOVSSmr/MOVSDmr opcodes.
llvm-svn: 363644
Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt.
Use the new versions in patterns that previously used a COPY_TO_REGCLASS
to VR128. These patterns expect the upper bits to be zero. The
current set up appears to work, but I'm not sure we should be
enforcing upper bits being zero through a COPY_TO_REGCLASS.
I wanted to flip the arrangement and use a COPY_TO_REGCLASS to
FR32/FR64 for the patterns that need an f32/f64 result, but that
complicated fastisel and globalisel.
I've been doing some experiments with reducing some isel patterns
and ended up in a situation where I had a
(SUBREG_TO_REG (COPY_TO_RECLASS (VMOVSSrm), VR128)) and our
post-isel peephole was unable to avoid using an instruction for
the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128
instruction removes the COPY_TO_REGCLASS that was breaking this.
llvm-svn: 363643
Summary:
All the GlobalISel passes are initialized when the target calls
initializeGlobalISel(), so we don't need to call the initializers
from the pass constructors.
Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar
Reviewed By: aemerson
Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63235
llvm-svn: 363642
Summary: Implements bug [[ https://bugs.llvm.org/show_bug.cgi?id=42204 | 42204 ]]. llvm-strip now warns when the same input file is used more than once, and errors when stdin is used more than once.
Reviewers: jhenderson, rupprecht, espindola, alexshap
Reviewed By: jhenderson, rupprecht
Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63122
llvm-svn: 363638
This was ignoring the flag on fneg, and using the source instruction's
flags. Also fixes tests missing from r358702.
Note the expansion itself isn't correct without nnan, but that should
be fixed separately.
llvm-svn: 363637
This saves roughly 32 bytes of instructions per function with stack objects
and causes us to preserve enough information that we can recover the original
tags of all stack variables.
Now that stack tags are deterministic, we no longer need to pass
-hwasan-generate-tags-with-calls during check-hwasan. This also means that
the new stack tag generation mechanism is exercised by check-hwasan.
Differential Revision: https://reviews.llvm.org/D63360
llvm-svn: 363636
The goal is to improve hwasan's error reporting for stack use-after-return by
recording enough information to allow the specific variable that was accessed
to be identified based on the pointer's tag. Currently we record the PC and
lower bits of SP for each stack frame we create (which will eventually be
enough to derive the base tag used by the stack frame) but that's not enough
to determine the specific tag for each variable, which is the stack frame's
base tag XOR a value (the "tag offset") that is unique for each variable in
a function.
In IR, the tag offset is most naturally represented as part of a location
expression on the llvm.dbg.declare instruction. However, the presence of the
tag offset in the variable's actual location expression is likely to confuse
debuggers which won't know about tag offsets, and moreover the tag offset
is not required for a debugger to determine the location of the variable on
the stack, so at the DWARF level it is represented as an attribute so that
it will be ignored by debuggers that don't know about it.
Differential Revision: https://reviews.llvm.org/D63119
llvm-svn: 363635
Inter-block localization is the same as what currently happens, except now it
only runs on the entry block because that's where the problematic constants with
long live ranges come from.
The second phase is a new intra-block localization phase which attempts to
re-sink the already localized instructions further right before one of the
multiple uses.
One additional change is to also localize G_GLOBAL_VALUE as they're constants
too. However, on some targets like arm64 it takes multiple instructions to
materialize the value, so some additional heuristics with a TTI hook have been
introduced attempt to prevent code size regressions when localizing these.
Overall, these changes improve CTMark code size on arm64 by 1.2%.
Full code size results:
Program baseline new diff
------------------------------------------------------------------------------
test-suite...-typeset/consumer-typeset.test 1249984 1217216 -2.6%
test-suite...:: CTMark/ClamAV/clamscan.test 1264928 1232152 -2.6%
test-suite :: CTMark/SPASS/SPASS.test 1394092 1361316 -2.4%
test-suite...Mark/mafft/pairlocalalign.test 731320 714928 -2.2%
test-suite :: CTMark/lencod/lencod.test 1340592 1324200 -1.2%
test-suite :: CTMark/kimwitu++/kc.test 3853512 3820420 -0.9%
test-suite :: CTMark/Bullet/bullet.test 3406036 3389652 -0.5%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8017000 8016992 -0.0%
test-suite...TMark/7zip/7zip-benchmark.test 2856588 2856588 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 765704 765704 0.0%
Geomean difference -1.2%
Differential Revision: https://reviews.llvm.org/D63303
llvm-svn: 363632
Summary: This case is related to D63405 in that we need to be propagating FMF on negates.
Reviewers: volkan, spatel, arsenm
Reviewed By: arsenm
Subscribers: wdng, javed.absar
Differential Revision: https://reviews.llvm.org/D63458
llvm-svn: 363631
This is part of the approved D63204 pending parent revision.
This small change is in fact a part of the VOP2b legalization which
does not technically belong to wave32 support, so extracted
separately.
llvm-svn: 363625