This assumes that
a) finding the bucket containing the value is LIKELY
b) finding an empty bucket is LIKELY
c) growing the table is UNLIKELY
I also switched the a) and b) cases for SmallPtrSet as we seem to use
the set mostly more for insertion than for checking existence.
In a simple benchmark consisting of 2^21 insertions of 2^20 unique
pointers into a DenseMap or SmallPtrSet a few percent speedup on average,
but nothing statistically significant.
llvm-svn: 230232
This patch introduces a new mechanism that allows IR modules to co-operatively
build pointer sets corresponding to addresses within a given set of
globals. One particular use case for this is to allow a C++ program to
efficiently verify (at each call site) that a vtable pointer is in the set
of valid vtable pointers for the class or its derived classes. One way of
doing this is for a toolchain component to build, for each class, a bit set
that maps to the memory region allocated for the vtables, such that each 1
bit in the bit set maps to a valid vtable for that class, and lay out the
vtables next to each other, to minimize the total size of the bit sets.
The patch introduces a metadata format for representing pointer sets, an
'@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals
and builds the bitsets, and documents the new feature.
Differential Revision: http://reviews.llvm.org/D7288
llvm-svn: 230054
This fixes an error introduced in r228934 where None was converted to
an int instead of the int being converted to an Optional as intended.
We make that sort of mistake a compile error by changing NoneType into
a scoped enum.
Finally, provide a static NoneType called None to avoid forcing all
users to spell it NoneType::None.
llvm-svn: 229980
classes. We can't use template aliases because on MSVC they don't appear
to work correctly in the common usage such as Format.h.
Many thanks to Zach for doing all the testing and debugging here. I just
slotted the fix into the code.
llvm-svn: 229362
Introduces a subset of C++14 integer sequences in STLExtras. This is
just enough to support unpacking a std::tuple into the arguments of
snprintf, we can add more of it when it's actually needed.
Also removes an ancient macro hack that leaks a macro into the global
namespace. Clean up users that made use of the convenient hack.
llvm-svn: 229337
Original commit message:
SmallVector: Resolve a long-standing fixme by using the existing unitialized_copy dispatch.
This makes append() use memcpy for trivially copyable types.
llvm-svn: 229149
This commit makes the following changes:
- Stop issuing a warning when the triples' string representations do not match
exactly if the Triple objects generated from the strings compare equal.
- On Apple platforms, choose the triple that has the larger minimum version
number.
rdar://problem/16743513
Differential Revision: http://reviews.llvm.org/D7591
llvm-svn: 228999
I just realized that the specialized metadata node patch I'm about to
commit won't compile on old compilers. Bump `hash_combine()`'s support
for non-variadic templates to 18 (I tested this by reversing the logic
in the #ifdef).
llvm-svn: 228629
Add some API to `APSInt` to make it easier to compare with `int64_t`.
- `APSInt::compareValues(APSInt, APSInt)` returns 1, -1 or 0 for
greater, lesser, or equal, doing the right thing for mismatched
"has-sign" and bitwidths. This is just like `isSameValue()` (and is
now the implementation of it).
- `APSInt::get(int64_t)` gets a signed `APSInt`.
- `operator<(int64_t)`, etc., are implemented trivially via `get()`
and `compareValues()`.
- Also added `APSInt::getUnsigned(uint64_t)` to make it easier to test
`compareValues()`.
llvm-svn: 228239
Similar to the C++14 void specializations of these templates, useful as
a stop-gap until LLVM switches to '14.
Example use-cases in tblgen because I saw some functors that looked like
they could be simplified/refactored.
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7324
llvm-svn: 227828
Summary:
V8->V9:
- cleanup tests
V7->V8:
- addressed feedback from David:
- switched to range-based 'for' loops
- fixed formatting of tests
V6->V7:
- rebased and adjusted AsmPrinter args
- CamelCased .td, fixed formatting, cleaned up names, removed unused patterns
- diffstat: 3 files changed, 203 insertions(+), 227 deletions(-)
V5->V6:
- addressed feedback from Chandler:
- reinstated full verbose standard banner in all files
- fixed variables that were not in CamelCase
- fixed names of #ifdef in header files
- removed redundant braces in if/else chains with single statements
- fixed comments
- removed trailing empty line
- dropped debug annotations from tests
- diffstat of these changes:
46 files changed, 456 insertions(+), 469 deletions(-)
V4->V5:
- fix setLoadExtAction() interface
- clang-formated all where it made sense
V3->V4:
- added CODE_OWNERS entry for BPF backend
V2->V3:
- fix metadata in tests
V1->V2:
- addressed feedback from Tom and Matt
- removed top level change to configure (now everything via 'experimental-backend')
- reworked error reporting via DiagnosticInfo (similar to R600)
- added few more tests
- added cmake build
- added Triple::bpf
- tested on linux and darwin
V1 cover letter:
---------------------
recently linux gained "universal in-kernel virtual machine" which is called
eBPF or extended BPF. The name comes from "Berkeley Packet Filter", since
new instruction set is based on it.
This patch adds a new backend that emits extended BPF instruction set.
The concept and development are covered by the following articles:
http://lwn.net/Articles/599755/http://lwn.net/Articles/575531/http://lwn.net/Articles/603983/http://lwn.net/Articles/606089/http://lwn.net/Articles/612878/
One of use cases: dtrace/systemtap alternative.
bpf syscall manpage:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b4fc1a460f3017e958e6a8ea560ea0afd91bf6fe
instruction set description and differences vs classic BPF:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.txt
Short summary of instruction set:
- 64-bit registers
R0 - return value from in-kernel function, and exit value for BPF program
R1 - R5 - arguments from BPF program to in-kernel function
R6 - R9 - callee saved registers that in-kernel function will preserve
R10 - read-only frame pointer to access stack
- two-operand instructions like +, -, *, mov, load/store
- implicit prologue/epilogue (invisible stack pointer)
- no floating point, no simd
Short history of extended BPF in kernel:
interpreter in 3.15, x64 JIT in 3.16, arm64 JIT, verifier, bpf syscall in 3.18, more to come in the future.
It's a very small and simple backend.
There is no support for global variables, arbitrary function calls, floating point, varargs,
exceptions, indirect jumps, arbitrary pointer arithmetic, alloca, etc.
From C front-end point of view it's very restricted. It's done on purpose, since kernel
rejects all programs that it cannot prove safe. It rejects programs with loops
and with memory accesses via arbitrary pointers. When kernel accepts the program it is
guaranteed that program will terminate and will not crash the kernel.
This patch implements all 'must have' bits. There are several things on TODO list,
so this is not the end of development.
Most of the code is a boiler plate code, copy-pasted from other backends.
Only odd things are lack or < and <= instructions, specialized load_byte intrinsics
and 'compare and goto' as single instruction.
Current instruction set is fixed, but more instructions can be added in the future.
Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Subscribers: majnemer, chandlerc, echristo, joerg, pete, rengolin, kristof.beyls, arsenm, t.p.northover, tstellarAMD, aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D6494
llvm-svn: 227008
manager to support the actual uses of it. =]
When I ported instcombine to the new pass manager I discover that it
didn't work because TLI wasn't available in the right places. This is
a somewhat surprising and/or subtle aspect of the new pass manager
design that came up before but I think is useful to be reminded of:
While the new pass manager *allows* a function pass to query a module
analysis, it requires that the module analysis is already run and cached
prior to the function pass manager starting up, possibly with
a 'require<foo>' style utility in the pass pipeline. This is an
intentional hurdle because using a module analysis from a function pass
*requires* that the module analysis is run prior to entering the
function pass manager. Otherwise the other functions in the module could
be in who-knows-what state, etc.
A somewhat surprising consequence of this design decision (at least to
me) is that you have to design a function pass that leverages
a module analysis to do so as an optional feature. Even if that means
your function pass does no work in the absence of the module analysis,
you have to handle that possibility and remain conservatively correct.
This is a natural consequence of things being able to invalidate the
module analysis and us being unable to re-run it. And it's a generally
good thing because it lets us reorder passes arbitrarily without
breaking correctness, etc.
This ends up causing problems in one case. What if we have a module
analysis that is *definitionally* impossible to invalidate. In the
places this might come up, the analysis is usually also definitionally
trivial to run even while other transformation passes run on the module,
regardless of the state of anything. And so, it follows that it is
natural to have a hard requirement on such analyses from a function
pass.
It turns out, that TargetLibraryInfo is just such an analysis, and
InstCombine has a hard requirement on it.
The approach I've taken here is to produce an analysis that models this
flexibility by making it both a module and a function analysis. This
exposes the fact that it is in fact safe to compute at any point. We can
even make it a valid CGSCC analysis at some point if that is useful.
However, we don't want to have a copy of the actual target library info
state for each function! This state is specific to the triple. The
somewhat direct and blunt approach here is to turn TLI into a pimpl,
with the state and mutators in the implementation class and the query
routines primarily in the wrapper. Then the analysis can lazily
construct and cache the implementations, keyed on the triple, and
on-demand produce wrappers of them for each function.
One minor annoyance is that we will end up with a wrapper for each
function in the module. While this is a bit wasteful (one pointer per
function) it seems tolerable. And it has the advantage of ensuring that
we pay the absolute minimum synchronization cost to access this
information should we end up with a nice parallel function pass manager
in the future. We could look into trying to mark when analysis results
are especially cheap to recompute and more eagerly GC-ing the cached
results, or we could look at supporting a variant of analyses whose
results are specifically *not* cached and expected to just be used and
discarded by the consumer. Either way, these seem like incremental
enhancements that should happen when we start profiling the memory and
CPU usage of the new pass manager and not before.
The other minor annoyance is that if we end up using the TLI in both
a module pass and a function pass, those will be produced by two
separate analyses, and thus will point to separate copies of the
implementation state. While a minor issue, I dislike this and would like
to find a way to cleanly allow a single analysis instance to be used
across multiple IR unit managers. But I don't have a good solution to
this today, and I don't want to hold up all of the work waiting to come
up with one. This too seems like a reasonable thing to incrementally
improve later.
llvm-svn: 226981
There is no reason for this state to be exposed as public. The single element
constructor was superfulous in light of the single element ArrayRef
constructor.
llvm-svn: 226424
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
llvm-svn: 225974
This default constructor is a bit weird. It left the range in an invalid
state. That might be reasonable so that you can construct a local
iterator range and assign to it based on some logic to compute the range
you want. If folks would like to support that use case, I can add it
back, but in 238-odd usages none have actually wanted to do this. ;]
llvm-svn: 225592
r221973 changed SmallVector::operator[] to use size_t instead of unsigned.
Before that, on 64bit platforms, when a large index (say -1) was passed,
truncating it to unsigned avoided an overflow when computing 'begin() + idx',
and failed the range checking assertion, as expected.
With r221973, idx isn't truncated, so the addition wraps to
'(char*)begin() - 1', and doesn't fire anymore when it should have done so.
This commit changes the comparison to instead compute 'end() - begin()'
(i.e., 'size()'), which avoids potentially overflowing additions, and
correctly triggers the assertion when values such as -1 are passed.
Note that the problem already existed before that revision, on platforms
where sizeof(size_t) == sizeof(unsigned).
llvm-svn: 225338
This appears to have broken at least the windows build bots due to
compile errors in the predicate that didn't simply supress the overload.
I'm not sure what the fix is, and the bots have been broken for a long
time now so I'm just reverting until Michael can figure out a fix.
llvm-svn: 225064
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.
llvm-svn: 224255
Clang's static analyzer found several potential cases of undefined
behavior, use of un-initialized values, and potentially null pointer
dereferences in tablegen, Support, MC, and ADT. This cleans them up
with specific assertions on the assumptions of the code.
llvm-svn: 224154
`MDString`s can have arbitrary characters in them. Prevent an assertion
that fired in `BitcodeWriter` because of sign extension by copying the
characters into the record as `unsigned char`s.
Based on a patch by Keno Fischer; fixes PR21882.
llvm-svn: 224077
DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled
the memory footprint of the map. Now we use a compressed set so the second
element uses no memory at all. This required some surgery on DenseMap as
all accesses to the bucket now have to go through methods; this should
have no impact on the behavior of DenseMap though. The new default bucket
type for DenseMap is a slightly extended std::pair as we expose it through
DenseMap's iterator and don't want to break any existing users.
llvm-svn: 223588
Required some APInt massaging to get proper empty/tombstone values. Apart
from making the code a bit simpler this also reduces the bucket size of
the ConstantInt map from 32 to 24 bytes.
llvm-svn: 223478
This operating system type represents the AMD HSA runtime,
and will be required by the R600 backend in order to generate
correct code for this runtime.
llvm-svn: 223124
po_iterator_storage's insertEdge was updated to reflect the API
changes from many of our insert methods in r222334, however the
template specialization for external storage was not updated. This
updates the specialization.
llvm-svn: 222446
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.
This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...
llvm-svn: 222334
Having two ways to do this doesn't seem terribly helpful and
consistently using the insert version (which we already has) seems like
it'll make the code easier to understand to anyone working with standard
data structures. (I also updated many references to the Entry's
key and value to use first() and second instead of getKey{Data,Length,}
and get/setValue - for similar consistency)
Also removes the GetOrCreateValue functions so there's less surface area
to StringMap to fix/improve/change/accommodate move semantics, etc.
llvm-svn: 222319
StringSet is still a bit dodgy in that it exposes the raw iterator of
the StringMap parent, which exposes the weird detail that StringSet
actually has a 'value'... but anyway, this is useful for a handful of
clients that want to reference the newly inserted/persistent string data
in the StringSet/Map/Entry/thing.
llvm-svn: 222302
The other option would be to do something like
if (that.isSingleWord())
VAL = that.VAL;
else
pVal = that.pVal
This bug was causing 86TTI::getIntImmCost to be miscompiled in a LTO
bootstrap in stage2, causing the build of stage3 to fail.
LLVM is getting quiet good at exploiting this. Not sure if there is anything
a sanitizer could do to help
llvm-svn: 222294
This matches std::vector and is more efficient as it avoids
truncations.
With this the text segment of opt goes from 19705442 bytes
to 19703930 bytes.
llvm-svn: 221973
A subtle bug was found where attempting to copy a non-const function_ref
lvalue would actually invoke the generic forwarding constructor (as it
was a closer match - being T& rather than the const T& of the implicit
copy constructor). In the particular case this lead to a dangling
function_ref member (since it had referenced the function_ref passed by
value to its ctor, rather than the outer function_ref that was still
alive)
SFINAE the converting constructor to not be considered if the copy
constructor is available and demonstrate that this causes the copy to
refer to the original functor, not to the function_ref it was copied
from. (without the code change, the test would fail as Y would be
referencing X and Y() would see the result of the mutation to X, ie: 2)
llvm-svn: 221753
This operation is analogous to its counterpart in DenseMap: It allows lookup
via cheap-to-construct keys (provided that getHashValue and isEqual are
implemented for the cheap key-type in the DenseMapInfo specialization).
Thanks to Chandler for the review.
llvm-svn: 220168
We have a transform that changes:
(x lshr C1) udiv C2
into:
x udiv (C2 << C1)
However, it is unsafe to do so if C2 << C1 discards any of C2's bits.
This fixes PR21255.
llvm-svn: 219634
to what we actually want ilogb implementation. This makes everything
*much* easier to deal with and is actually what we want when using it
anyways.
llvm-svn: 219474
code using it more readable.
Also add a copySign static function that works more like the standard
function by accepting the value and sign-carying value as arguments.
No interesting logic here, but tests added to cover the basic API
additions and make sure they do something plausible.
llvm-svn: 219453
One of many steps to generalize subprogram emission to both the DWO and
non-DWO sections (to emit -gmlt-like data under fission). Once the
functions are pushed down into DwarfCompileUnit some of the data
structures will be pushed at least into DwarfFile so that they can be
unique per-file, allowing emission to both files independently.
llvm-svn: 219345
Fix http://llvm.org/PR21158 by adding a cast to unsigned long long,
so the comparison would be between two unsigned long longs instead
of bool and unsigned long long.
if (getAsUnsignedInteger(*this, Radix, ULLVal) ||
static_cast<unsigned long long>(static_cast<T>(ULLVal)) != ULLVal)
llvm-svn: 219065
This can be used for in-place initialization of non-moveable types.
For compilers that don't support variadic templates, only up to four
arguments are supported. We can always add more, of course, but this
should be good enough until we move to a later MSVC that has full
support for variadic templates.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
Reviewed by David Blaikie.
llvm-svn: 218732
This takes a single argument convertible to T, and
- if the Optional has a value, returns the existing value,
- otherwise, constructs a T from the argument and returns that.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
llvm-svn: 218618
MSVC's STL has a bug in `std::equal()`: it asserts on nullptr iterators,
causing a block revert in r215981. This works around that by re-writing
`ArrayRef::equals()` to do the work itself.
llvm-svn: 215986
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
It's not clear what the semantics of a self-move should be. The
consensus appears to be that a self-move should leave the object in a
moved-from state, which is what our existing move assignment operator
does.
However, the MSVC 2013 STL will perform self-moves in some cases. In
particular, when doing a std::stable_sort of an already sorted APSInt
vector of an appropriate size, one of the merge steps will self-move
half of the elements.
We don't notice this when building with MSVC, because MSVC will not
synthesize the move assignment operator for APSInt. Presumably MSVC
does this because APInt, the base class, has user-declared special
members that implicitly delete move special members. Instead, MSVC
selects the copy-assign operator, which defends against self-assignment.
Clang, on the other hand, selects the move-assign operator, and we get
garbage APInts.
llvm-svn: 215478
Remove the MinGW32 and Cygwin types from the OSType enumeration. These values
are represented via environments of Windows. It is a source of confusion and
needlessly clutters the code. The cost of doing this is that we must sink the
check for them into the normalization code path along with the spelling.
Addresses PR20592.
llvm-svn: 215303
checking whether the ArrayRef is equal to an explicit list of arguments.
This is particularly easy to implement even without variadic templates
because ArrayRef happens to be homogeneously typed. As a consequence we
can use a "clever" wrapper type and default arguments to capture in
a single method many arguments as well as *how many* arguments the user
specified.
Thanks to Dave Blaikie for helping me pull together this little helper.
Suggestions for how to improve or generalize it are of course welcome.
I'll be using it immediately in my follow-up patch. =D
llvm-svn: 214041
Having both Triple::arm64 and Triple::aarch64 is extremely confusing, and
invites bugs where only one is checked. In reality, the only legitimate
difference between the two (arm64 usually means iOS) is also present in the OS
part of the triple and that's what should be checked.
We still parse the "arm64" triple, just canonicalise it to Triple::aarch64, so
there aren't any LLVM-side test changes.
llvm-svn: 213743
This is a prerequisite for checking for 'mti' and 'img' in a consistent way in
clang. Previously 'img' could use Triple::getVendor() but 'mti' could only use
Triple::getVendorName().
llvm-svn: 213381
Add a `MapVector::remove_if()` that erases items in bulk in linear time,
as opposed to quadratic time for repeated calls to `MapVector::erase()`.
llvm-svn: 213090
Actually update the changed indexes in the map portion of `MapVector`
when erasing from the middle. Add a unit test that checks for this.
Note that `MapVector::erase()` is a linear time operation (it was and
still is). I'll commit a new method in a moment called
`MapVector::remove_if()` that deletes multiple entries in linear time,
which should be slightly less painful.
llvm-svn: 213084
The underlying function. utohex_buffer, already supports an argument for
deciding if the hex characters should be upper or lower case. Expose an
identical argument for utohexstr.
llvm-svn: 212991
Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang.
Differential Revision: http://reviews.llvm.org/D4435
llvm-svn: 212626
The slice(N, M) interface is powerful but not concise when wanting to
drop a few elements off of an ArrayRef, fix this by adding a drop_back
method.
llvm-svn: 212370
This better aligns with other LLVM-specific and C++ standard library smart
pointer types.
In particular there are at least a few uses of intrusive refcounting in the
frontend where it's worth investigating std::shared_ptr as a more appropriate
alternative.
llvm-svn: 212366
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.
COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.
This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.
Differential Revision: http://reviews.llvm.org/D4178
llvm-svn: 211920
Certain versions of GCC (~4.7) couldn't handle the SFINAE on access
control, but with "= delete" (hidden behind a macro for portability)
this issue is worked around/addressed.
Patch by Agustín Bergé
llvm-svn: 211525
Various places in LLVM assume that container size and count are unsigned
and do not use the container size_type. Therefore they break compilation
(or possibly executation) for LP64 systems where size_t is 64 bit while
unsigned is still 32 bit.
If we'll ever that many items in the container size_type could be made
size_t for a specific containers after reviweing its other uses.
llvm-svn: 211353
only 1/0 result like std::set. Some of the LLVM ADT already return unsigned
count(), while others still return bool count().
In continuation to r197879, this patch modifies DenseMap, DenseSet,
ScopedHashTable, ValueMap:: count() to return size_type instead of bool,
1 instead of true and 0 instead of false.
size_type is typedef-ed locally within each class to size_t.
http://reviews.llvm.org/D4018
Reviewed by dblaikie.
llvm-svn: 211350
Currently, when using llvm as an assembler, DWARF debug information is only
generated for the .text section. This patch modifies this so that DWARF info
is emitted for all executable sections.
llvm-svn: 211273
Unfortunately there's no way to elegantly do this with pre-canned
algorithms. Using a generating iterator doesn't work because you default
construct for each element, then move construct into the actual slot
(bad for copy but non-movable types, and a little unneeded overhead even
in the move-only case), so just write it out manually.
This solution isn't exception safe (if one of the element's ctors calls
we don't fall back, destroy the constructed elements, and throw on -
which std::uninitialized_fill does do) but SmallVector (and LLVM) isn't
exception safe anyway.
llvm-svn: 210495