This should allow users of the library to get a range to iterate through
all the subcommands that are registered to the global parser. This
allows users to define subcommands in libraries that self-register to
have dispatch done at a different stage (like main). It allows for
writing code like the following:
for (auto *S : cl::getRegisteredSubcommands()) {
if (*S) {
// Dispatch on S->getName().
}
}
This change also contains tests that show this usage pattern.
Reviewers: zturner, dblaikie, echristo
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24489
llvm-svn: 281290
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.
Fixes PR30362. A program for upgrading test cases is attached to that
bug.
Differential Revision: http://reviews.llvm.org/D20147
llvm-svn: 281284
This should make it easier to add cases that we currently don't cover,
like supporting more kinds of type mismatches and more than 2 input vectors.
llvm-svn: 281283
That confuses e.g. machine basic block placement, which then doesn't
realize that control can fall through a block that ends with a conditional
tail call. Instead, isBranch=1 should be set.
Also, mark EFLAGS as used by these instructions.
llvm-svn: 281281
Convert the previous introduced is-a relationship between the LVICache and LVIImple clases into a has-a relationship and hide all the implementation details of the cache from the lazy query layer.
The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue. As far as I can tell, this was effectively dead code. I think it *used* to be the case that getCachedValueInfo wasn't const and might end up inserting elements in the cache during lookup. That's no longer true and hasn't been for a while. I did fixup the const usage to make that more obvious.
llvm-svn: 281272
Seperate the caching logic from the implementation of the lazy analysis. For the moment, the lazy analysis impl has a is-a relationship with the cache; this will change to a has-a relationship shortly. This was done as two steps merely to keep the changes simple and the diff understandable.
llvm-svn: 281266
Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095
Reviewers: davidxl
Subscribers: vsk, llvm-commits
Differential Revision: https://reviews.llvm.org/D24147
llvm-svn: 281252
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.
As part of this many minor cleanups to the Parser:
* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
now fixed.
These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.
Reviewers: rnk, majnemer
Subscribers: aemerson, jyknight, llvm-commits
Differential Revision: https://reviews.llvm.org/D24047
llvm-svn: 281249
This patch moves symbol mangling from findSymbol to getSymbolAddress. The
findSymbol, findExistingSymbol and findModuleForSymbol methods now always take
a mangled name, allowing the 'demangle-and-retry' cruft to be removed from
findSymbol. See http://llvm.org/PR28699 for details.
Patch by James Holderness. Thanks very much James!
llvm-svn: 281238
r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers. This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.
Don't drop the imp-used parameters.
llvm-svn: 281223
Trying to infer the 'returned' attribute if an argument is already
'returned' can lead to verification failure: inference might determine
that a different argument is passed through which would result in two
different arguments marked as 'returned'.
This fixes PR30350.
llvm-svn: 281221
Summary: This removes disabled instructions from match tables so we will not match them at all.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov
Subscribers: wdng, nhaehnle, arsenm
Differential Revision: https://reviews.llvm.org/D24452
llvm-svn: 281216
For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).
1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.
1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
llvm-svn: 281215
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:
ldr r0, .CPI0
bl printf
bx lr
.CPI0: &format_string
format_string: .asciz "hello, world!\n"
We can emit:
adr r0, .CPI0
bl printf
bx lr
.CPI0: .asciz "hello, world!\n"
This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).
llvm-svn: 281213
Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking
a single byte offset) to preserve the pointer type information through
selection.
llvm-svn: 281205
Some generic instructions have multiple types. While in theory these always be
discovered by inspecting the single definition of each generic vreg, in
practice those definitions won't always be local and traipsing through a big
function to find them will not be fun.
So this changes MIRPrinter to print out the type of uses as well as defs, if
they're known to be different or not known to be the same.
On the parsing side, we're a little more flexible: provided each register is
given a type in at least one place it's mentioned (and all types are
consistent) we accept the MIR. This doesn't introduce ambiguity but makes
writing tests manually a bit less painful.
llvm-svn: 281204
- Add AllocatorList, a non-intrusive list that owns an LLVM-style
allocator and provides a std::list-like interface (trivially built on
top of simple_ilist),
- add a typedef (and unit tests) for BumpPtrList, and
- use BumpPtrList for the list of llvm::yaml::Token (i.e., TokenQueueT).
TokenQueueT has no need for the complexity of an intrusive list. The
only reason to inherit from ilist was to customize the allocator.
TokenQueueT was the only example in-tree of using ilist<> in a truly
non-intrusive way.
Moreover, this removes the final use of the non-intrusive
ilist_traits<>::createNode (after r280573, r281177, and r281181). I
have a WIP patch that removes this customization point (and the API that
relies on it) that I plan to commit soon.
Note: AllocatorList owns the allocator, which limits the viable API
(e.g., splicing must be on the same list). For now I've left out
any problematic API. It wouldn't be hard to split AllocatorList into
two layers: an Impl class that calls DerivedT::getAlloc (via CRTP), and
derived classes that handle Allocator ownership/reference/etc semantics;
and then implement splice with appropriate assertions; but TBH we should
probably just customize the std::list allocators at that point.
llvm-svn: 281182
Now that MachineBasicBlock::reverse_instr_iterator knows when it's at
the end (since r281168 and r281170), implement
MachineBasicBlock::reverse_iterator directly on top of an
ilist::reverse_iterator by adding an IsReverse template parameter to
MachineInstrBundleIterator. This replaces another hard-to-reason-about
use of std::reverse_iterator on list iterators, matching the changes for
ilist::reverse_iterator from r280032 (see the "out of scope" section at
the end of that commit message). MachineBasicBlock::reverse_iterator
now has a handle to the current node and has obvious invalidation
semantics.
r280032 has a more detailed explanation of how list-style reverse
iterators (invalidated when the pointed-at node is deleted) are
different from vector-style reverse iterators like std::reverse_iterator
(invalidated on every operation). A great motivating example is this
commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.
Note: If your out-of-tree backend deletes instructions while iterating
on a MachineBasicBlock::reverse_iterator or converts between
MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator,
you'll need to update your code in similar ways to r280032. The
following table might help:
[Old] ==> [New]
delete &*RI, RE = end() delete &*RI++
RI->erase(), RE = end() RI++->erase()
reverse_iterator(I) std::prev(I).getReverse()
reverse_iterator(I) ++I.getReverse()
--reverse_iterator(I) I.getReverse()
reverse_iterator(std::next(I)) I.getReverse()
RI.base() std::prev(RI).getReverse()
RI.base() ++RI.getReverse()
--RI.base() RI.getReverse()
std::next(RI).base() RI.getReverse()
(For more details, have a look at r280032.)
llvm-svn: 281172
This is a prep commit before fixing MachineBasicBlock::reverse_iterator
invalidation semantics, ala r281167 for ilist::reverse_iterator. This
changes MachineBasicBlock::Instructions to track which node is the
sentinel regardless of LLVM_ENABLE_ABI_BREAKING_CHECKS.
There's almost no functionality change (aside from ABI). However, in
the rare configuration:
#if !defined(NDEBUG) && !defined(LLVM_ENABLE_ABI_BREAKING_CHECKS)
the isKnownSentinel() assertions in ilist_iterator<>::operator* suddenly
have teeth for MachineInstr. If these assertions start firing for your
out-of-tree backend, have a look at the suggestions in the commit
message for r279314, and at some of the commits leading up to it that
avoid dereferencing the end() iterator.
llvm-svn: 281168
This should *actually* fix PR30244. This cranks up the workaround for PR30188 so that we never sink loads or stores of allocas.
The idea is that these should be removed by SROA/Mem2Reg, and any movement of them may well confuse SROA or just cause unwanted code churn. It's not ideal that the midend should be crippled like this, but that unwanted churn can really cause significant regressions in important workloads (tsan).
llvm-svn: 281162
Exposed by PR30244, we will split a block currently if we think we can sink at least one instruction. However this isn't right - the reason we split predecessors is so that we can sink instructions that otherwise couldn't be sunk because it isn't safe to do so - stores, for example.
So, change the heuristic to only split if it thinks it can sink at least one non-speculatable instruction.
Should fix PR30244.
llvm-svn: 281160
Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.
Reviewers: arsenm
Subscribers: tra, jholewinski, hfinkel, mzolotukhin
Differential Revision: https://reviews.llvm.org/D23479
llvm-svn: 281153
Summary:
With this change (plus some changes to prevent !invariant from being
clobbered within llvm), clang will be able to model the __ldg CUDA
builtin as an invariant load, rather than as a target-specific llvm
intrinsic. This will let the optimizer play with these loads --
specifically, we should be able to vectorize them in the load-store
vectorizer.
Reviewers: tra
Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc
Differential Revision: https://reviews.llvm.org/D23477
llvm-svn: 281152
Summary:
An IR load can be invariant, dereferenceable, neither, or both. But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.
This patch splits up the notions of invariance and dereferenceability at
the MI level. It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D23371
llvm-svn: 281151