Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11041
llvm-svn: 241888
It is meant to be used to record modules @imported by the current
compile unit, so a debugger an import the same modules to replicate this
environment before dropping into the expression evaluator.
DIModule is a sibling to DINamespace and behaves quite similarly.
In addition to the name of the module it also records the module
configuration details that are necessary to uniquely identify the module.
This includes the configuration macros (e.g., -DNDEBUG), the include path
where the module.map file is to be found, and the isysroot.
The idea is that the backend will turn this into a DW_TAG_module.
http://reviews.llvm.org/D9614
rdar://problem/20965932
llvm-svn: 241017
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).
The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.
Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.
This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:
- Add the safestack function attribute, similar to the ssp, sspstrong and
sspreq attributes.
- Add the SafeStack instrumentation pass that applies the safe stack to all
functions that have the safestack attribute. This pass moves all unsafe local
variables to the unsafe stack with a separate stack pointer, whereas all
safe variables remain on the regular stack that is managed by LLVM as usual.
- Invoke the pass as the last stage before code generation (at the same time
the existing cookie-based stack protector pass is invoked).
- Add unit tests for the safe stack.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6094
llvm-svn: 239761
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238602
As a space optimization, this instruction would just encode the pointer
type of the first operand and use the knowledge that the second and
third operands would be of the pointee type of the first. When typed
pointers go away, this assumption will no longer be available - so
encode the type of the second operand explicitly and rely on that for
the third.
Test case added to demonstrate the backwards compatibility concern,
which only comes up when the definition of the second operand comes
after the use (hence the weird basic block sequence) - at which point
the type needs to be explicitly encoded in the bitcode and the record
length changes to accommodate this.
llvm-svn: 235966
Summary:
Make sure the abbrev operands are valid and that we can read/skip them
afterwards.
Bug found with AFL fuzz.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9030
llvm-svn: 235595
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
llvm-svn: 235475
Without pointee types the space optimization of storing only the pointer
type and not the value type won't be viable - so add the extra type
information that would be missing.
Storeatomic coming soon.
llvm-svn: 235474
Summary:
If a pointer is marked as dereferenceable_or_null(N), LLVM assumes it
is either `null` or `dereferenceable(N)` or both. This change only
introduces the attribute and adds a token test case for the `llvm-as`
/ `llvm-dis`. It does not hook up other parts of the optimizer to
actually exploit the attribute -- those changes will come later.
For pointers in address space 0, `dereferenceable(N)` is now exactly
equivalent to `dereferenceable_or_null(N)` && `nonnull`. For other
address spaces, `dereferenceable(N)` is potentially weaker than
`dereferenceable_or_null(N)` && `nonnull` (since we could have a null
`dereferenceable(N)` pointer).
The motivating case for this change is Java (and other managed
languages), where pointers are either `null` or dereferenceable up to
some usually known-at-compile-time constant offset.
Reviewers: rafael, hfinkel
Reviewed By: hfinkel
Subscribers: nicholas, llvm-commits
Differential Revision: http://reviews.llvm.org/D8650
llvm-svn: 235132
Change the callers of `WriteToBitcodeFile()` to pass `true` or
`shouldPreserveBitcodeUseListOrder()` explicitly. I left the callers
that want to send `false` alone.
I'll keep pushing the bit higher until hopefully I can delete the global
`cl::opt` entirely.
llvm-svn: 234957
We only defer loading metadata inside ParseModule when ShouldLazyLoadMetadata
is true and we have not loaded any Metadata block yet.
This commit implements all-or-nothing loading of Metadata. If there is a
request to load any metadata block, we will load all deferred metadata blocks.
We make sure the deferred metadata blocks are loaded before we materialize any
function or a module.
The default value of the added parameter ShouldLazyLoadMetadata for
getLazyBitcodeModule is false, so the default behavior stays the same.
We only set the parameter to true when creating LTOModule in local contexts.
These can only really be used for parsing symbols, so it's unnecessary to ever
load the metadata blocks.
If we are going to enable lazy-loading of Metadata for other usages of
getLazyBitcodeModule, where deferred metadata blocks need to be loaded, we can
expose BitcodeReader::materializeMetadata to Module, similar to
Module::materialize.
rdar://19804575
llvm-svn: 232198
Like r230414, add bitcode support including backwards compatibility, for
an explicit type parameter to GEP.
At the suggestion of Duncan I tried coalescing the two older bitcodes into a
single new bitcode, though I did hit a wrinkle: I couldn't figure out how to
create an explicit abbreviation for a record with a variable number of
arguments (the indicies to the gep). This means the discriminator between
inbounds and non-inbounds gep is a full variable-length field I believe? Is my
understanding correct? Is there a way to create such an abbreviation? Should I
just use two bitcodes as before?
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7736
llvm-svn: 230415
Eventually we can make some of these pass the error along to the caller.
Reports a fatal error if:
We find an invalid abbrev record
We try to get an invalid abbrev number
We can't fill the current word due to an EOF
Fixed an invalid bitcode test to check for output with FileCheck
Bugs found with afl-fuzz
llvm-svn: 226986
This adds assembly and bitcode support for `MDLocation`. The assembly
side is rather big, since this is the first `MDNode` subclass (that
isn't `MDTuple`). Part of PR21433.
(If you're wondering where the mountains of testcase updates are, we
don't need them until I update `DILocation` and `DebugLoc` to actually
use this class.)
llvm-svn: 225830
The bitcode reading interface used std::error_code to report an error to the
callers and it is the callers job to print diagnostics.
This is not ideal for error handling or diagnostic reporting:
* For error handling, all that the callers care about is 3 possibilities:
* It worked
* The bitcode file is corrupted/invalid.
* The file is not bitcode at all.
* For diagnostic, it is user friendly to include far more information
about the invalid case so the user can find out what is wrong with the
bitcode file. This comes up, for example, when a developer introduces a
bug while extending the format.
The compromise we had was to have a lot of error codes.
With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.
This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
he issue is found. The code that detected the problem in alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.
Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.
llvm-svn: 225562
This reverts commit r225498 (but leaves r225499, which was a worthy
cleanup).
My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather
than its operands (patch was to go out this morning), but on reflection
it's not clear that it's strictly better. (I had missed that the
current code is unlikely to emit the `MDNode` at all.)
Conflicts:
lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499)
llvm-svn: 225531
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode). This adds the `distinct` keyword to assembly.
Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:
- self-references, which are never uniqued, and
- nodes whose operands are replaced that hit a uniquing collision.
The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.
Part of PR22111.
llvm-svn: 225474
units.
This was debated back and forth a bunch, but using references is now
clearly cleaner. Of all the code written using pointers thus far, in
only one place did it really make more sense to have a pointer. In most
cases, this just removes immediate dereferencing from the code. I think
it is much better to get errors on null IR units earlier, potentially
at compile time, than to delay it.
Most notably, the legacy pass manager uses references for its routines
and so as more and more code works with both, the use of pointers was
likely to become really annoying. I noticed this when I ported the
domtree analysis over and wrote the entire thing with references only to
have it fail to compile. =/ It seemed better to switch now than to
delay. We can, of course, revisit this is we learn that references are
really problematic in the API.
llvm-svn: 225145
This reflects the typelessness of `Metadata` in the bitcode format,
removing types from all metadata operands.
`METADATA_VALUE` represents a `ValueAsMetadata`, and always has two
fields: the type and the value.
`METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`,
doesn't store types. It stores operands at their ID+1 so that `0` can
reference `nullptr` operands.
Part of PR21532.
llvm-svn: 224073
They were producing the wrong result if NumBits == BitsInWord. The old mask
produced -1, the new mask 0.
This should fix the 32 bit bots.
llvm-svn: 222166
This avoids an issue where AtEndOfStream mistakenly returns true at the /start/ of
a stream.
(In the rare case that the size is known and actually 0, the slow path will still
handle it correctly.)
llvm-svn: 221840
Every MemoryObject is a StreamableMemoryObject since the removal of
StringRefMemoryObject, so just merge the two.
I will clean up the MemoryObject interface in the upcoming commits.
llvm-svn: 221766
This doesn't change the interface or gives additional safety but removes
a ton of retain/release boilerplate.
No functionality change.
llvm-svn: 217778
This forces callers to use std::move when calling it. It is somewhat odd to have
code with std::move that doesn't always move, but it is also odd to have code
without std::move that sometimes moves.
llvm-svn: 217049
This allows streams that only use BLOCKINFO for debugging purposes to omit
the block entirely. As long as another stream is available with the correct
BLOCKINFO, the first stream can still be analyzed and dumped.
As part of this commit, BitstreamReader gets a move constructor and move
assignment operator, as well as a takeBlockInfo method.
llvm-svn: 216826
The attached patch simplifies a few interfaces that don't need to take
ownership of a buffer.
For example, both parseAssembly and parseBitcodeFile will parse the
entire buffer before returning. There is no need to take ownership.
Using a MemoryBufferRef makes it obvious in the type signature that
there is no ownership transfer.
llvm-svn: 216488
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
`BlockAddress`es are interesting in that they can reference basic blocks
from *outside* the block's function. Since basic blocks are not global
values, this presents particular challenges for lazy parsing.
One corner case was found in PR11677 and fixed in r147425. In that
case, a global variable references a block address. It's necessary to
load the relevant function to resolve the forward reference before doing
anything with the module.
By inspection, I found (and have fixed here) two other cases:
- An instruction from one function references a block address from
another function, and only the first function is lazily loaded.
I fixed this the same way as PR11677: by eagerly loading the
referenced function.
- A function whose block address is taken is dematerialized, leaving
invalid references to it.
I fixed this by refusing to dematerialize functions whose block
addresses are taken (if you have to load it, you can't unload it).
llvm-svn: 214559
This will let users in other libraries know which error occurred. In particular,
it will be possible to check if the parsing failed or if the file is not
bitcode.
llvm-svn: 214209
Predict and serialize use-list order in bitcode. This makes the option
`-preserve-bc-use-list-order` work *most* of the time, but this is still
experimental.
- Builds a full value-table up front in the writer, sets up a list of
use-list orders to write out, and discards the table. This is a
simpler first step than determining the order from the various
overlapping IDs of values on-the-fly.
- The shuffles stored in the use-list order list have an unnecessarily
large memory footprint.
- `blockaddress` expressions cause functions to be materialized
out-of-order. For now I've ignored this problem, so use-list orders
will be wrong for constants used by functions that have block
addresses taken. There are a couple of ways to fix this, but I
don't have a concrete plan yet.
- When materializing functions lazily, the use-lists for constants
will not be correct. This use case is out of scope: what should the
use-list order be, if it's incomplete?
This is part of PR5680.
llvm-svn: 214125
This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).
llvm-svn: 213385
This was an oversight in the original support. As it is, I stuffed this
bit into the alignment. The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.
Reviewers: nicholas
Differential Revision: http://reviews.llvm.org/D3943
llvm-svn: 213118
This reverts commit r212342.
We can get a StringRef into the current Record, but not one in the bitcode
itself since the string is compressed in it.
llvm-svn: 212356
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.
COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.
This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.
Differential Revision: http://reviews.llvm.org/D4178
llvm-svn: 211920
This allows us to just use a std::unique_ptr to store the pointer to the buffer.
The flip side is that they have to support releasing the buffer back to the
caller.
Overall this looks like a more efficient and less brittle api.
llvm-svn: 211542
We do have use cases for the bitcode reader owning the buffer or not, but we
always know which one we have when we construct it.
It might be possible to simplify this further, but this is a step in the
right direction.
llvm-svn: 211205
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.
This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.
llvm-svn: 210280
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
This moves the old pass creation functionality to its own header and
updates the callers of that routine. Then it adds a new PM supporting
bitcode writer to the header file, and wires that up in the opt tool.
A test is added that round-trips code into bitcode and back out using
the new pass manager.
llvm-svn: 199078
The inalloca attribute is designed to support passing C++ objects by
value in the Microsoft C++ ABI. It behaves the same as byval, except
that it always implies that the argument is in memory and that the bytes
are never copied. This attribute allows the caller to take the address
of an outgoing argument's memory and execute arbitrary code to store
into it.
This patch adds basic IR support, docs, and verification. It does not
attempt to implement any lowering or fix any possibly broken transforms.
When this patch lands, a complete description of this feature should
appear at http://llvm.org/docs/InAlloca.html .
Differential Revision: http://llvm-reviews.chandlerc.com/D2173
llvm-svn: 197645
This function attribute indicates that the function is not optimized
by any optimization or code generator passes with the
exception of interprocedural optimization passes.
llvm-svn: 189101
The bitcode representation attribute kinds are encoded into / decoded from
should be independent of the current set of LLVM attributes and their position
in the AttrKind enum. This patch explicitly encodes attributes to fixed bitcode
values.
With this patch applied, LLVM does not silently misread attributes written by
LLVM 3.3. We also enhance the decoding slightly such that an error message is
printed if an unknown AttrKind encoding was dected.
Bonus: Dropping bitcode attributes from AttrKind is now easy, as old AttrKinds
do not need to be kept to support the Bitcode reader.
llvm-svn: 187186
Archive files (.a) can have a symbol table indicating which object
files in them define which symbols. The purpose of this symbol table
is to speed up linking by allowing the linker the read only the .o
files it is actually going to use instead of having to parse every
object's symbol table.
LLVM's archive library currently supports a LLVM specific format for
such table. It is hard to see any value in that now that llvm-ld is
gone:
* System linkers don't use it: GNU ar uses the same plugin as the
linker to create archive files with a regular index. The OS X ar
creates no symbol table for IL files, I assume the linker just parses
all IL files.
* It doesn't interact well with archives having both IL and native objects.
* We probably don't want to be responsible for yet another archive
format variant.
This patch then:
* Removes support for creating and reading such index from lib/Archive.
* Remove llvm-ranlib, since there is nothing left for it to do.
We should in the future add support for regular indexes to llvm-ar for
both native and IL objects. When we do that, llvm-ranlib should be
reimplemented as a symlink to llvm-ar, as it is equivalent to "ar s".
llvm-svn: 184019
There was exactly one caller using this API right, the others were relying on
specific behavior of the default implementation. Since it's too hard to use it
right just remove it and standardize on the default behavior.
Defines away PR16132.
llvm-svn: 182636
BitstreamWriter asserts that when blob data is written from the record
element vector, each element fits in a byte. However, if the record
elements are specified as a SmallVector of 'char', this causes a warning
from -Wtautological-constant-out-of-range-compare. Fix this by using
llvm::isUInt<8> instead of a plain comparison against 256.
llvm-svn: 181545
This is some initial code for emitting the attribute groups into the bitcode.
NOTE: This format *may* change! Do not rely upon the attribute groups' bitcode
not changing.
llvm-svn: 174845
bitcode writer would generate abbrev records saying that the abbrev should be
filled with fixed zero-bit bitfields (this happens in the .bc writer when
the number of types used in a module is exactly one, since log2(1) == 0).
In this case, just handle it as a literal zero. We can't "just fix" the writer
without breaking compatibility with existing bc files, so have the abbrev reader
do the substitution.
Strengthen the assert in read to reject reads of zero bits so we catch such
crimes in the future, and remove the special case designed to handle this.
llvm-svn: 174801
instead of always 32-bits at a time) with two changes:
1. Make Read(0) always return zero without affecting the state of our cursor.
2. Hack word_t to always be 32 bits, as staging.
These two caveats will change shortly.
llvm-svn: 174800
Rename the PARAMATTR_CODE_ENTRY to PARAMATTR_CODE_ENTRY_OLD. It will be replaced
by another encoding. Keep around the current LLVM attribute encoder/decoder
code, but move it to the bitcode directories so that no one's tempted to use
them.
llvm-svn: 174335
This cuts in half the number of virtual methods called to refill that word when compiling on a 64-bit
host, and will make 64-bit read operations faster.
llvm-svn: 173072
BLOB (i.e., large, performance intensive data) in a bitcode file was switched to
invoking one virtual method call per byte read. Now we do one virtual call per
BLOB.
llvm-svn: 173065
through a BitstreamCursor that produce it: advance() and
advanceSkippingSubblocks(), representing the two most common ways clients
want to walk through bitcode.
llvm-svn: 172919
AKA: Recompile *ALL* the source code!
This one went much better. No manual edits here. I spot-checked for
silliness and grep-checked for really broken edits and everything seemed
good. It all still compiles. Yell if you see something that looks goofy.
llvm-svn: 169133
Added in bitcode enum for the serializing of fast-math flags. Added in the reading/writing of fast-math flags from the OptimizationFlags record for BinaryOps.
llvm-svn: 168646
- Widespread trailing space removal
- A dash of OCD spacing to block align enums
- joined a line that probably needed 80 cols a while back
llvm-svn: 168566
to the instruction position. The old encoding would give an absolute
ID which counts up within a function, and only resets at the next function.
I.e., Instead of having:
... = icmp eq i32 n-1, n-2
br i1 ..., label %bb1, label %bb2
it will now be roughly:
... = icmp eq i32 1, 2
br i1 1, label %bb1, label %bb2
This makes it so that ids remain relatively small and can be encoded
in fewer bits.
With this encoding, forward reference operands will be given
negative-valued IDs. Use signed VBRs for the most common case
of forward references, which is phi instructions.
To retain backward compatibility we bump the bitcode version
from 0 to 1 to distinguish between the different encodings.
llvm-svn: 165739