1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

263 Commits

Author SHA1 Message Date
Reid Kleckner
c175e28bde [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

llvm-svn: 303665
2017-05-23 18:23:59 +00:00
Zachary Turner
b2aaa328db Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

llvm-svn: 303640
2017-05-23 15:50:37 +00:00
Zachary Turner
57dfa705b6 Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

llvm-svn: 303577
2017-05-22 21:07:43 +00:00
Zachary Turner
3eef49f50b Make TypeSerializer's StringMap use the same allocator.
llvm-svn: 303576
2017-05-22 21:07:14 +00:00
Zachary Turner
c669552325 Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

llvm-svn: 303446
2017-05-19 19:26:58 +00:00
Zachary Turner
4a5590fa6b Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

llvm-svn: 303409
2017-05-19 05:57:45 +00:00
Zachary Turner
92c35d4ced Don't crash if someone tries to visit an empty type stream.
llvm-svn: 303408
2017-05-19 05:18:09 +00:00
Zachary Turner
34aa3d3b80 [CodeView] Reduce memory usage in TypeSerializer.
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table.  If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary.  At the cost of an extra hash
computation, check first if it exists, and only if it does do
we allocate and insert.

llvm-svn: 303407
2017-05-19 04:56:48 +00:00
Zachary Turner
00a4a5626f Fix crasher in CodeView test.
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid.  But this exposed a real bug, which is that
we weren't looking in the right place.  So fix that, and also
fix this crash at the same time.

llvm-svn: 303397
2017-05-19 00:56:39 +00:00
Zachary Turner
cb522db549 Fix some build errors and warnings.
llvm-svn: 303391
2017-05-18 23:12:42 +00:00
Zachary Turner
4d00f79887 [CodeView] Raise the source to ID map out of the TypeStreamMerger.
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.

llvm-svn: 303390
2017-05-18 23:04:08 +00:00
Zachary Turner
05edea832d [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Zachary Turner
5f8e1427eb [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

llvm-svn: 303271
2017-05-17 16:39:06 +00:00
Zachary Turner
aaaf4b3ba3 [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

llvm-svn: 302936
2017-05-12 19:18:12 +00:00
Aaron Ballman
f74a6f060f Removing a file that is not necessary (and was causing link diagnostics with MSVC 2015); NFC.
llvm-svn: 302531
2017-05-09 14:22:48 +00:00
Zachary Turner
b78cfee1e6 [CodeView] Add support for random access type visitors.
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record.  This works fine for situations like dumping,
but not when you want to visit types in random order.  For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.

In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline.  This
patch adds such a mode.  In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.

Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.

Differential Revision: https://reviews.llvm.org/D32928

llvm-svn: 302454
2017-05-08 18:38:43 +00:00
Zachary Turner
f46b72f64d [CodeView] Reserve TypeDatabase records up front.
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database.  Previously we were
just using push_back() every time without reserving the space
up front in the vector.  This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degredation was quite noticeable.

llvm-svn: 302302
2017-05-05 22:02:37 +00:00
Zachary Turner
ac2522878b Remove unused private field.
llvm-svn: 302069
2017-05-03 19:42:06 +00:00
Davide Italiano
eed883ed5d [CodeView] Remove constructor initialization of a removed field.
I should've staged this with my last commit.

llvm-svn: 302059
2017-05-03 18:02:46 +00:00
Zachary Turner
7fb895351b [CodeView] Use actual strings for dealing with checksums and lines.
The raw CodeView format references strings by "offsets", but it's
confusing what table the offset refers to.  In the case of line
number information, it's an offset into a buffer of records,
and an indirection is required to get another offset into a
different table to find the final string.  And in the case of
checksum information, there is no indirection, and the offset
refers directly to the location of the string in another buffer.

This would be less confusing if we always just referred to the
strings by their value, and have the library be smart enough
to correctly resolve the offsets on its own from the right
location.

This patch makes that possible.  When either reading or writing,
all the user deals with are strings, and the library does the
appropriate translations behind the scenes.

llvm-svn: 302053
2017-05-03 17:11:40 +00:00
Zachary Turner
31cc3cca3a [llvm-readobj] Update readobj to re-use parsing code.
llvm-readobj hand rolls some CodeView parsing code for string
tables, so this patch updates it to re-use some of the newly
introduced parsing code in LLVMDebugInfoCodeView.

Differential Revision: https://reviews.llvm.org/D32772

llvm-svn: 302052
2017-05-03 17:11:11 +00:00
Zachary Turner
28d3fba1b3 Resubmit r301986 and r301987 "Add codeview::StringTable"
This was reverted due to a "missing" file, but in reality
what happened was that I renamed a file, and then due to
a merge conflict both the old file and the new file got
added to the repository.  This led to an unused cpp file
being in the repo and not referenced by any CMakeLists.txt
but #including a .h file that wasn't in the repo.  In an
even more unfortunate coincidence, CMake didn't report the
unused cpp file because it was in a subdirectory of the
folder with the CMakeLists.txt, and not in the same directory
as any CMakeLists.txt.

The presence of the unused file was then breaking certain
tools that determine file lists by globbing rather than
by what's specified in CMakeLists.txt

In any case, the fix is to just remove the unused file from
the patch set.

llvm-svn: 302042
2017-05-03 15:58:37 +00:00
Daniel Jasper
46c75d6ad5 Revert r301986 (and subsequent r301987).
The patch is failing to add StringTableStreamBuilder.h, but that isn't
even discovered because the corresponding StringTableStreamBuilder.cpp
isn't added to any CMakeLists.txt file and thus never built. I think
this patch is just incomplete.

llvm-svn: 302002
2017-05-03 07:29:25 +00:00
Zachary Turner
4fcb061c01 Fix use after free in BinaryStream library.
This was reported by the ASAN bot, and it turned out to be
a fairly fundamental problem with the design of VarStreamArray
and the way it passes context information to the extractor.

The fix was cumbersome, and I'm not entirely pleased with it,
so I plan to revisit this design in the future when I'm not
pressed to get the bots green again.  For now, this fixes
the issue by storing the context information by value instead
of by reference, and introduces some impossibly-confusing
template magic to make things "work".

llvm-svn: 301999
2017-05-03 05:34:00 +00:00
Zachary Turner
bfe2451c54 Fix type conversion error.
llvm-svn: 301987
2017-05-02 23:41:51 +00:00
Zachary Turner
fc2ee0d542 Make codeview::StringTable.
Previously we had knowledge of how to serialize and deserialize
a string table inside of DebugInfo/PDB, but the string table
that it serializes contains a piece that is actually considered
CodeView and can appear outside of a PDB.  We already have logic
in llvm-readobj and MCCodeView to read and write this format,
so it doesn't make sense to duplicate the logic in DebugInfoPDB
as well.

This patch makes codeview::StringTable (for writing) and
codeview::StringTableRef (for reading), updates DebugInfoPDB
to use these classes for its own writing, and updates llvm-readobj
to additionally use StringTableRef for reading.

It's a bit more difficult to get MCCodeView to use this for
writing, but it's a logical next step.

llvm-svn: 301986
2017-05-02 23:36:17 +00:00
Zachary Turner
33ba01f653 [PDB/CodeView] Read/write codeview inlinee line information.
Previously we wrote line information and file checksum
information, but we did not write information about inlinee
lines and functions.  This patch adds support for that.

llvm-svn: 301936
2017-05-02 16:56:09 +00:00
Zachary Turner
6a54e5a010 [CodeView] Write CodeView line information.
Differential Revision: https://reviews.llvm.org/D32716

llvm-svn: 301882
2017-05-01 23:27:42 +00:00
Zachary Turner
8652898f9b [PDB/CodeView] Rename some classes.
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo.  In other words, Foo is a writer and FooRef is a
reader.  This patch names some existing readers to conform to the
FooRef convention, while offering no functional change.

llvm-svn: 301810
2017-05-01 16:46:39 +00:00
Zachary Turner
75c6138373 [llvm-pdbdump] Abstract some of the YAML/Raw printing code.
There is a lot of duplicate code for printing line info between
YAML and the raw output printer.  This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.

llvm-svn: 301728
2017-04-29 01:13:21 +00:00
Zachary Turner
2af5323aab [CodeView] Isolate Debug Info Fragments into standalone classes.
Previously parsing of these were all grouped together into a
single master class that could parse any type of debug info
fragment.

With writing forthcoming, the complexity of each individual
fragment is enough to warrant them having their own classes so
that reading and writing of each fragment type can be grouped
together, but isolated from the code for reading and writing
other fragment types.

In doing so, I found a place where parsing code was duplicated
for the FileChecksums fragment, across llvm-readobj and the
CodeView library, and one of the implementations had a bug.
Now that the codepaths are merged, the bug is resolved.

Differential Revision: https://reviews.llvm.org/D32547

llvm-svn: 301557
2017-04-27 16:12:16 +00:00
Zachary Turner
6b9ca21731 [Support] Make BinaryStreamArray extractors stateless.
Instead, we now pass a context memeber through the extraction
process.

llvm-svn: 301556
2017-04-27 16:11:47 +00:00
Zachary Turner
93ec93772b Rename some PDB classes.
We have a lot of very similarly named classes related to
dealing with module debug info.  This patch has NFC, it just
renames some classes to be more descriptive (albeit slightly
more to type).  The mapping from old to new class names is as
follows:

   Old          |        New
ModInfo         | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream       | ModuleDebugStream

With the corresponding Builder classes renamed accordingly.

Differential Revision: https://reviews.llvm.org/D32506

llvm-svn: 301555
2017-04-27 16:11:19 +00:00
Vassil Vassilev
fee845a566 Remove unused functions. Remove static qualifier from functions in header files. NFC.
llvm-svn: 299947
2017-04-11 14:55:32 +00:00
Reid Kleckner
1a39114e26 [codeview] Cope with unsorted streams in type merging
Summary:
MASM can produce type streams that are not topologically sorted. It can
even produce type streams with circular references, but those are not
common in practice.

Reviewers: inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31629

llvm-svn: 299403
2017-04-03 23:58:15 +00:00
Reid Kleckner
b92def0094 [codeview] Add support for label type records
MASM can produce these type records.

llvm-svn: 299388
2017-04-03 21:25:20 +00:00
Reid Kleckner
f53e9a4e87 [codeview] Fix buggy BeginIndexMapSize assertion
This assert is just trying to test that processing each record adds
exactly one entry to the index map. The assert logic was wrong when the
first record in the type stream was a field list.

I've simplified the code by moving the LF_FIELDLIST-specific logic into
the callback for that record type.

llvm-svn: 299035
2017-03-29 22:51:22 +00:00
Reid Kleckner
5da414c396 [PDB] Split item and type records when merging type streams
Summary: MSVC does this when producing a PDB.

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31316

llvm-svn: 298717
2017-03-24 17:26:38 +00:00
Reid Kleckner
b3d6ecbb12 [PDB] Use two DBs when dumping the IPI stream
Summary:
When dumping these records from an object file section, we should use
only one type database. However, when dumping from a PDB, we should use
two: one for the type stream and one for the IPI stream.

Certain type records that normally live in the .debug$T object file
section get moved over to the IPI stream of the PDB file and they get
new indices.

So far, I've noticed that the MSVC linker always moves these records
into IPI:
- LF_FUNC_ID
- LF_MFUNC_ID
- LF_STRING_ID
- LF_SUBSTR_LIST
- LF_BUILDINFO
- LF_UDT_MOD_SRC_LINE

These records have index fields that can point into TPI or IPI. In
particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID
records to describe compilation command lines.

I've modified the dumper to have an optional pointer to the item DB, and
to do type name lookup of these fields in that DB. See printItemIndex.
The result is that our pdbdump-headers.test is more faithful to the PDB
contents and the output is less confusing.

Reviewers: ruiu

Subscribers: amccarth, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D31309

llvm-svn: 298649
2017-03-23 21:36:25 +00:00
Reid Kleckner
7702500e41 [codeview] Move type index remapping logic to type merger
Summary:
This removes the 'remapTypeIndices' method on every TypeRecord class. My
original idea was that this would be the beginning of some kind of
generic entry point that would enumerate all of the TypeIndices inside
of a TypeRecord, so that we could write generic graph algorithms for
them without duplicating the knowledge of which fields are type index
fields everywhere. This never happened, and nothing else uses this
method. I need to change the API to deal with merging into IPI streams,
so let's move it into the file that uses it first.

Reviewers: zturner, ruiu

Reviewed By: zturner, ruiu

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D31267

llvm-svn: 298564
2017-03-23 00:14:23 +00:00
Reid Kleckner
e576aa7fc5 [codeview] Use separate records for LF_SUBSTR_LIST and LF_ARGLIST
They are structurally the same, but now we need to distinguish them
because one record lives in the IPI stream and the other lives in TPI.

llvm-svn: 298474
2017-03-22 01:37:38 +00:00
Zachary Turner
ab975dc712 [pdb] Fix an uninitialized read, and add a test for it.
This was originally reported in pr32249, uncovered by PTVS-Studio.
There was no code coverage for this path because it was
difficult to construct odd-case PDB files that were not generated
by cl.

Now that we can write construct minimal PDB files from YAML,
it's easy to construct fragments that generate whatever we want.

In this patch I add a test that creates 2 type records.  One
with a unique name, and one without.  I verify that we can go
from PDB to Yaml with no errors.  In a future patch I'd like
to add something like llvm-pdbdump raw -lookup-type that will
just dump one record and nothing else, which should make it
a bit cleaner to find this kind of thing.

llvm-svn: 298017
2017-03-17 00:15:55 +00:00
Zachary Turner
2b205f11e1 [PDB] It is not an error getting the "Invalid" Annotation opcode.
The linker can insert invalid opcodes to indicate padding
bytes, and we should not fail in this case.

llvm-svn: 298016
2017-03-17 00:15:27 +00:00
Zachary Turner
7513c7f0d9 [llvm-pdbdump] Add support for dumping symbols from Yaml -> PDB.
Previously we could round-trip type records from PDB -> Yaml ->
PDB, but for symbols we could only go from PDB -> Yaml.  This
completes the round-tripping for symbols as well.

llvm-svn: 297625
2017-03-13 14:57:45 +00:00
Zachary Turner
2c68c634a9 [Support] Move Stream library from MSF -> Support.
After several smaller patches to get most of the core improvements
finished up, this patch is a straight move and header fixup of
the source.

Differential Revision: https://reviews.llvm.org/D30266

llvm-svn: 296810
2017-03-02 20:52:51 +00:00
Zachary Turner
cd226b0757 [PDB] Make streams carry their own endianness.
Before the endianness was specified on each call to read
or write of the StreamReader / StreamWriter, but in practice
it's extremely rare for streams to have data encoded in
multiple different endiannesses, so we should optimize for the
99% use case.

This makes the code cleaner and more general, but otherwise
has NFC.

llvm-svn: 296415
2017-02-28 00:04:07 +00:00
Zachary Turner
f7fd863005 [PDB] Partial resubmit of r296215, which improved PDB Stream Library.
This was reverted because it was breaking some builds, and
because of incorrect error code usage.  Since the CL was
large and contained many different things, I'm resubmitting
it in pieces.

This portion is NFC, and consists of:

1) Renaming classes to follow a consistent naming convention.
2) Fixing the const-ness of the interface methods.
3) Adding detailed doxygen comments.
4) Fixing a few instances of passing `const BinaryStream& X`.  These
   are now passed as `BinaryStreamRef X`.

llvm-svn: 296394
2017-02-27 22:11:43 +00:00
NAKAMURA Takumi
046a844fb7 Revert r296215, "[PDB] General improvements to Stream library." and followings.
r296215, "[PDB] General improvements to Stream library."
r296217, "Disable BinaryStreamTest.StreamReaderObject temporarily."
r296220, "Re-enable BinaryStreamTest.StreamReaderObject."
r296244, "[PDB] Disable some tests that are breaking bots."
r296249, "Add static_cast to silence -Wc++11-narrowing."

std::errc::no_buffer_space should be used for OS-oriented errors for socket transmission.
(Seek discussions around llvm/xray.)

I could substitute s/no_buffer_space/others/g, but I revert whole them ATM.

Could we define and use LLVM errors there?

llvm-svn: 296258
2017-02-25 17:04:23 +00:00
Zachary Turner
c0166260b8 [PDB] General improvements to Stream library.
This adds various new functionality and cleanup surrounding the
use of the Stream library.  Major changes include:

* Renaming of all classes for more consistency / meaningfulness
* Addition of some new methods for reading multiple values at once.
* Full suite of unit tests for reader / writer functionality.
* Full set of doxygen comments for all classes.
* Streams now store their own endianness.
* Fixed some bugs in a few of the classes that were discovered
  by the unit tests.

llvm-svn: 296215
2017-02-25 00:44:30 +00:00
Zachary Turner
5260228d29 [PDB] Rename Stream related source files.
This is part of a larger effort to get the Stream code moved
up to Support.  I don't want to do it in one large patch, in
part because the changes are so big that it will treat everything
as file deletions and add, losing history in the process.
Aside from that though, it's just a good idea in general to
make small changes.

So this change only changes the names of the Stream related
source files, and applies necessary source fix ups.

llvm-svn: 296211
2017-02-25 00:33:34 +00:00
Zachary Turner
a6279ab220 Don't assume little endian in StreamReader / StreamWriter.
In an effort to generalize this so it can be used by more than
just PDB code, we shouldn't assume little endian.

llvm-svn: 295525
2017-02-18 01:35:33 +00:00
Zachary Turner
f0a0b7f3ae [pdb] Add the ability to resolve TypeServer PDBs.
Some PDBs or object files can contain references to other PDBs
where the real type information lives.  When this happens,
all type indices in the original PDB are meaningless because
their records are not there.

With this patch we add the ability to pull type info from those
secondary PDBs.

Differential Revision: https://reviews.llvm.org/D29973

llvm-svn: 295382
2017-02-16 23:35:45 +00:00
Zachary Turner
1fd42286e1 Properly parse the TypeServer2 record.
llvm-svn: 294046
2017-02-03 21:22:27 +00:00
Rui Ueyama
e8d788b83b Re-submit r293820: Return Error instead of bool from mergeTypeStreams().
llvm-svn: 293847
2017-02-02 00:47:10 +00:00
Rui Ueyama
5084390489 Revert r293820: Return Error instead of bool from mergeTypeStreams().
It broke buildbots.

llvm-svn: 293824
2017-02-01 22:28:43 +00:00
Rui Ueyama
5eb47df814 Return Error instead of bool from mergeTypeStreams().
Previously, mergeTypeStreams returns only true or false, so it was
impossible to know the reason if it failed. This patch changes the
function signature so that it returns an Error object.

Differential Revision: https://reviews.llvm.org/D29362

llvm-svn: 293820
2017-02-01 22:09:34 +00:00
Zachary Turner
bef0faee96 [pdb] Add a new command for analyzing hash collisions.
This introduces the `analyze` subcommand.  For now there is only
one option, to analyze hash collisions in the type streams.  In
the future, however, we could add many more things here, such
as performing size analyses, compacting, and statistics about
the type of records etc.

llvm-svn: 293795
2017-02-01 18:30:22 +00:00
Benjamin Kramer
5fd769f791 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Zachary Turner
fa65a3c140 [CodeView] Finish decoupling TypeDatabase from TypeDumper.
Previously the type dumper itself was passed around to a lot of different
places and manipulated in ways that were more appropriate on the type
database. For example, the entire TypeDumper was passed into the symbol
dumper, when all the symbol dumper wanted to do was lookup the name of a
TypeIndex so it could print it. That's what the TypeDatabase is for --
mapping type indices to names.

Another example is how if the user runs llvm-pdbdump with the option to
dump symbols but not types, we still have to visit all types so that we
can print minimal information about the type of a symbol, but just without
dumping full symbol records. The way we did this before is by hacking it
up so that we run everything through the type dumper with a null printer,
so that the output goes to /dev/null. But really, we don't need to dump
anything, all we want to do is build the type database. Since
TypeDatabaseVisitor now exists independently of TypeDumper, we can do
this. We just build a custom visitor callback pipeline that includes a
database visitor but not a dumper.

All the hackery around printers etc goes away. After this patch, we could
probably even delete the entire CVTypeDumper class since really all it is
at this point is a thin wrapper that hides the details of how to build a
useful visitation pipeline. It's not a priority though, so CVTypeDumper
remains for now.

After this patch we will be able to easily plug in a different style of
type dumper by only implementing the proper visitation methods to dump
one-line output and then sticking it on the pipeline.

Differential Revision: https://reviews.llvm.org/D28524

llvm-svn: 291724
2017-01-11 23:24:22 +00:00
Zachary Turner
c1f7412cbe [CodeView/PDB] Rename a bunch of files.
We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage.  This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.

llvm-svn: 291627
2017-01-11 00:35:43 +00:00
Zachary Turner
60f2748d40 [CodeView] Add TypeDatabase class.
This creates a centralized class in which to store type records.
It stores types as an array of entries, which matches the
notion of a type stream being a topologically sorted DAG.
Logic to build up such a database was already being used in
CVTypeDumper, so CVTypeDumper is now updated to to read from
a TypeDatabase which is filled out by an earlier visitor in
the pipeline.

Differential Revision: https://reviews.llvm.org/D28486

llvm-svn: 291626
2017-01-11 00:35:08 +00:00
Zachary Turner
36a764a490 Delete unused file.
llvm-svn: 290021
2016-12-17 00:58:19 +00:00
Zachary Turner
4078aeb252 Resubmit "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."
The original patch was broken due to some undefined behavior
as well as warnings that were triggering -Werror.

llvm-svn: 290000
2016-12-16 22:48:14 +00:00
Zachary Turner
ab64e55c57 Revert "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."
This reverts commit r289978, which is failing due to some rebase/merge
issues.

llvm-svn: 289981
2016-12-16 19:25:23 +00:00
Zachary Turner
526ce01d27 [CodeView] Hook CodeViewRecordIO for reading/writing symbols.
This is the 3rd of 3 patches to get reading and writing of
CodeView symbol and type records to use a single codepath.

Differential Revision: https://reviews.llvm.org/D26427

llvm-svn: 289978
2016-12-16 19:20:35 +00:00
Zachary Turner
3b6151275c Fix some size_t / uint32_t ambiguity errors.
llvm-svn: 286305
2016-11-08 22:30:11 +00:00
Zachary Turner
064bbdf4f2 [CodeView] Hook up CodeViewRecordIO to type serialization path.
Previously support had been added for using CodeViewRecordIO
to read (deserialize) CodeView type records.  This patch adds
support for writing those same records.  With this patch,
reading and writing of CodeView type records finally uses a single
codepath.

Differential Revision: https://reviews.llvm.org/D26253

llvm-svn: 286304
2016-11-08 22:24:53 +00:00
Zachary Turner
efbbdd9c6c Add CodeViewRecordIO for reading and writing.
Using a pattern similar to that of YamlIO, this allows
us to have a single codepath for translating codeview
records to and from serialized byte streams.  The
current patch only hooks this up to the reading of
CodeView type records.  A subsequent patch will hook
it up for writing of CodeView type records, and then a
third patch will hook up the reading and writing of
CodeView symbols.

Differential Revision: https://reviews.llvm.org/D26040

llvm-svn: 285836
2016-11-02 17:05:19 +00:00
Bob Haarman
8163d702de [codeview] support emitting indirect virtual base class information
Summary:
Fixes PR28281.

MSVC lists indirect virtual base classes in the field list of a class,
using LF_IVBCLASS records. This change makes LLVM emit such records
when processing DW_TAG_inheritance tags with the DIFlagVirtual and
(newly introduced) DIFlagIndirect tags.

Reviewers: rnk, ruiu, zturner

Differential Revision: https://reviews.llvm.org/D25578

llvm-svn: 285130
2016-10-25 22:11:52 +00:00
Zachary Turner
2d85414032 [CodeView] Refactor serialization to use StreamInterface.
This was all using ArrayRef<>s before which presents a problem
when you want to serialize to or deserialize from an actual
PDB stream.  An ArrayRef<> is really just a special case of
what can be handled with StreamInterface though (e.g. by using
a ByteStream), so changing this to use StreamInterface allows
us to plug in a PDB stream and get all the record serialization
and deserialization for free on a MappedBlockStream.

Subsequent patches will try to remove TypeTableBuilder and
TypeRecordBuilder in favor of class that operate on
Streams as well, which should allow us to completely merge
the reading and writing codepaths for both types and symbols.

Differential Revision: https://reviews.llvm.org/D25831

llvm-svn: 284762
2016-10-20 18:31:19 +00:00
Reid Kleckner
2a26070bb8 Remove LLVM_NOEXCEPT and replace it with noexcept
Now that we have dropped MSVC 2013, all supported compilers support
noexcept and we can drop this portability macro.

llvm-svn: 284672
2016-10-19 23:52:38 +00:00
Reid Kleckner
4bfe93ae7f Truncate long names in type records
In the MS ABI, the frontend is supposed to MD5 such pathologically long
names. LLVM should still defend itself from long names, though.

Fixes part of PR29098.

llvm-svn: 284136
2016-10-13 17:33:22 +00:00
Zachary Turner
4b8a9c1349 Refactor Symbol visitor code.
Type visitor code had already been refactored previously to
decouple the visitor and the visitor callback interface.  This
was necessary for having the flexibility to visit in different
ways (for example, dumping to yaml, reading from yaml, dumping
to ScopedPrinter, etc).

This patch merely implements the same visitation pattern for
symbol records that has already been implemented for type records.

llvm-svn: 283609
2016-10-07 21:34:46 +00:00
Zachary Turner
4947d059e2 [pdb] Get rid of Data and RawData in CVType.
The `CVType` had two redundant fields which were confusing and
error-prone to fill out.  By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24432

llvm-svn: 281556
2016-09-14 23:00:16 +00:00
Zachary Turner
bca5e415e8 [pdb] Pass CVRecord's through the visitor as non-const references.
This simplifies a lot of code, and will actually be necessary for
an upcoming patch to serialize TPI record hash values.

The idea before was that visitors should be examining records, not
modifying them.  But this is no longer true with a visitor that
constructs a CVRecord from Yaml.  To handle this until now, we
were doing some fixups on CVRecord objects at a higher level, but
the code is really awkward, and it makes sense to just have the
visitor write the bytes into the CVRecord.  In doing so I uncovered
a few bugs related to `Data` and `RawData` and fixed those.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24362

llvm-svn: 281067
2016-09-09 18:03:39 +00:00
Zachary Turner
c2876ae1eb [pdb] Write PDB TPI Stream from Yaml.
This writes the full sequence of type records described in
Yaml to the TPI stream of the PDB file.

Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24316

llvm-svn: 281063
2016-09-09 17:46:17 +00:00
Reid Kleckner
09ac865a68 [codeview] Use the correct max CV record length of 0xFF00
Previously we were splitting our records at 0xFFFF bytes, which the
Microsoft tools don't like.

Should fix failure on the new Windows self-host buildbot.

This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h

llvm-svn: 280522
2016-09-02 18:43:27 +00:00
Zachary Turner
efca032046 [codeview] Have visitTypeBegin return the record type.
Previously we were assuming that any visitation of types would
necessarily be against a type we had binary data for.  Reasonable
assumption when were just reading PDBs and dumping them, but once
we start writing PDBs from Yaml this breaks down, because we have
no binary data yet, only Yaml, and from that we need to read the
record kind and perform the switch based on that.

So this patch does that.  Instead of having the visitor switch
on the kind that is already in the CVType record, we change the
visitTypeBegin() method to return the Kind, and switch on the
returned value.  This way, the default implementation can still
return the value from the CVType, but the implementation which
visits Yaml records and serializes binary PDB type records can
use the field in the Yaml as the source of the switch.

llvm-svn: 280307
2016-08-31 23:14:31 +00:00
Zachary Turner
b721d746bd [codeview] Add TypeVisitorCallbackPipeline.
We were kind of hacking this together before by embedding the
ability to forward requests into the TypeDeserializer.  When
we want to start adding more different kinds of visitor callback
interfaces though, this doesn't scale well and is very inflexible.

So introduce the notion of a pipeline, which itself implements
the TypeVisitorCallbacks interface, but which contains an internal
list of other callbacks to invoke in sequence.

Also update the existing uses of CVTypeVisitor to use this new
pipeline class for deserializing records before visiting them
with another visitor.

llvm-svn: 280293
2016-08-31 21:42:26 +00:00
Reid Kleckner
0669cf2688 [codeview] Emit vtable shape information
The shape of the vtable is passed down as the size of the
__vtbl_ptr_type. This special pointer type appears both as the pointee
type of the vptr type, and by itself in every dynamic class. For classes
with multiple vtables, only the shape of the primary vftable is
included, as the shape of all secondary vftables will be the same as in
the base class.

Fixes PR28150

llvm-svn: 280254
2016-08-31 15:59:30 +00:00
Zachary Turner
585b950a97 Remove unused translation unit.
llvm-svn: 279561
2016-08-23 20:08:02 +00:00
Vedant Kumar
3af6f746c8 Fix -Wpessimizing-move error, NFC
llvm-svn: 279095
2016-08-18 17:39:53 +00:00
Zachary Turner
3cf2ce528d Resubmit "Write the TPI stream from a PDB to Yaml."
The original patch was breaking some buildbots due to an
incorrect ordering of function definitions which caused some
compilers to recognize a definition but others to not.

llvm-svn: 279089
2016-08-18 16:49:29 +00:00
Justin Bogner
6fc8fa35ae Revert "Write the TPI stream from a PDB to Yaml."
This is hitting a "use of undeclared identifier 'skipPadding' error
locally and on some bots.

This reverts r278869.

llvm-svn: 278871
2016-08-16 23:37:10 +00:00
Zachary Turner
84ab1f4796 Write the TPI stream from a PDB to Yaml.
Reviewed By: ruiu, rnk
Differential Revision: https://reviews.llvm.org/D23226

llvm-svn: 278869
2016-08-16 23:28:54 +00:00
Justin Bogner
049f0b1295 CodeView: Remove an unused variable
It was breaking the -Werror build.

llvm-svn: 277878
2016-08-05 21:57:10 +00:00
Zachary Turner
d023c59def Fix non portable include path.
llvm-svn: 277876
2016-08-05 21:50:02 +00:00
Zachary Turner
a3ce9cabee [CodeView] Decouple record deserialization from visitor dispatch.
Until now, our use case for the visitor has been to take a stream of bytes
representing a type stream, deserialize the records in sequence, and do
something with them, where "something" is determined by how the user
implements a particular set of callbacks on an abstract class.

For actually writing PDBs, however, we want to do the reverse. We have
some kind of description of the list of records in their in-memory format,
and we want to process each one. Perhaps by serializing them to a byte
stream, or perhaps by converting them from one description format (Yaml)
to another (in-memory representation).

This was difficult in the current model because deserialization and
invoking the callbacks were tightly coupled.

With this patch we change this so that TypeDeserializer is itself an
implementation of the particular set of callbacks. This decouples
deserialization from the iteration over a list of records and invocation
of the callbacks.  TypeDeserializer is initialized with another
implementation of the callback interface, so that upon deserialization it
can pass the deserialized record through to the next set of callbacks. In
a sense this is like an implementation of the Decorator design pattern,
where the Deserializer is a decorator.

This will be useful for writing Pdbs from yaml, where we have a
description of the type records in Yaml format. In this case, the visitor
implementation would have each visitation callback method implemented in
such a way as to extract the proper set of fields from the Yaml, and it
could maintain state that builds up a list of these records. Finally at
the end we can pass this information through to another set of callbacks
which serializes them into a byte stream.

Reviewed By: majnemer, ruiu, rnk
Differential Revision: https://reviews.llvm.org/D23177

llvm-svn: 277871
2016-08-05 21:45:34 +00:00
Zachary Turner
aa1e2354eb [CodeView] Use llvm::Error instead of std::error_code.
This eliminates the remnants of std::error_code from the
DebugInfo libraries.

llvm-svn: 277758
2016-08-04 19:39:55 +00:00
Zachary Turner
2269779262 [msf] Resubmit "Rename Msf -> MSF".
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.

I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.

llvm-svn: 277213
2016-07-29 20:56:36 +00:00
Zachary Turner
a3b385eb1c Revert "[msf] Rename Msf to MSF."
This reverts commit 4d1557ffac41e079bcb1abbcf04f512474dcd6fe.

llvm-svn: 277194
2016-07-29 18:38:47 +00:00
Zachary Turner
27ff4cd2ce [msf] Rename Msf to MSF.
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.

llvm-svn: 277190
2016-07-29 18:24:26 +00:00
Zachary Turner
3f2fffba74 [pdb] Refactor library to more clearly separate reading/writing
Reviewed By: amccarth, ruiu
Differential Revision: https://reviews.llvm.org/D22693

llvm-svn: 277019
2016-07-28 19:12:28 +00:00
Vassil Vassilev
1a09c6fddc [modules] Add missing includes.
llvm-svn: 276970
2016-07-28 10:26:33 +00:00
Zachary Turner
de0ff2102f [msf] Create LLVMDebugInfoMsf
This provides a better layering of responsibilities among different
aspects of PDB writing code.  Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this.  Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF.  So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.

llvm-svn: 276458
2016-07-22 19:56:05 +00:00
Zachary Turner
b6837aec06 [pdb] Round-trip module & file info to/from YAML.
This implements support for writing compiland and compiland source
file info to a binary PDB.  This is tested by adding support for
dumping these fields from an existing PDB to yaml, reading them
back in, and dumping them again and verifying the values are as
expected.

llvm-svn: 276426
2016-07-22 15:46:37 +00:00
Rui Ueyama
62ac9546b0 Dump enum unique names.
llvm-svn: 275152
2016-07-12 03:33:48 +00:00
Rui Ueyama
eb8764db35 Re-enable TPI hash verification for enum records.
We didn't read unique names correctly. As a result, we computed
hashes on (non-)unique names instead of unique names.

llvm-svn: 275150
2016-07-12 03:25:03 +00:00
David Majnemer
6e3fb51f95 [CodeView] Emit an appropriate symbol kind for globals
We emitted debug info for globals/functions as if they all had external
linkage.  Instead, emit local symbol records when appropriate.

llvm-svn: 274676
2016-07-06 21:07:47 +00:00
Zachary Turner
05b0d33b0c [pdb] Re-add code to write PDB files.
Somehow all the functionality to write PDB files got removed,
probably accidentally when uploading the patch perhaps the wrong
one got uploaded.  This re-adds all the code, as well as the
corresponding test.

llvm-svn: 274248
2016-06-30 17:43:00 +00:00
David Majnemer
2bb75b48c6 [CodeView] Healthy paranoia around strings
Make sure strings don't get too big for a record, truncate them if
need-be.

llvm-svn: 273710
2016-06-24 19:34:41 +00:00
Reid Kleckner
e8d9ec6b43 [codeview] Use one byte for S_FRAMECOOKIE CookieKind and add flags byte
We bailed out while printing codeview for an MSVC compiled
SemaExprCXX.cpp that used this record. The MS reference headers look
incorrect here, which is probably why we had this bug. They use a 32-bit
enum as the field type, but the actual record appears to use one byte
for the cookie kind followed by a flags byte.

llvm-svn: 273691
2016-06-24 17:23:49 +00:00
Reid Kleckner
dcb890e1f1 [codeview] Fix the alignment padding that we add to list records
Tweak the big-types.ll test case to catch this bug. We just need an
enumerator name that doesn't have a length that is a multiple of 4.

llvm-svn: 273477
2016-06-22 20:59:17 +00:00
Reid Kleckner
805d357d67 [codeview] Add support for splitting field list records over 64KB
The basic structure is that once a list record goes over 64K, the last
subrecord of the list is an LF_INDEX record that refers to the next
record. Because the type record graph must be toplogically sorted, this
means we have to emit them in reverse order. We build the type record in
order of declaration, so this means that if we don't want extra copies,
we need to detect when we were about to split a record, and leave space
for a continuation subrecord that will point to the eventual split
top-level record.

Also adds dumping support for these records.

Next we should make sure that large method overload lists work properly.

llvm-svn: 273294
2016-06-21 18:33:01 +00:00
Reid Kleckner
62af8c4725 [codeview] Add DIFlags for pointer to member representations
Summary:
This seems like the least intrusive way to pass this information
through.

Fixes PR28151

Reviewers: majnemer, aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21444

llvm-svn: 273053
2016-06-17 21:31:33 +00:00
Zachary Turner
b871327aa8 Resubmit "[pdb] Change type visitor pattern to be dynamic."
There was a regression introduced during type stream merging when
visiting a field list record.  This has been fixed in this patch.

llvm-svn: 272929
2016-06-16 18:22:27 +00:00
Zachary Turner
9dbc164c30 Revert "[pdb] Change type visitor pattern to be dynamic."
This reverts commit fb0dd311e1ad945827b8ffd5354f4810e2be1579.

This breaks some llvm-readobj tests.

llvm-svn: 272927
2016-06-16 18:09:04 +00:00
Zachary Turner
9300409ecf [pdb] Change type visitor pattern to be dynamic.
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.

Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.

Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410

llvm-svn: 272926
2016-06-16 18:00:28 +00:00
Rui Ueyama
f7a0e93409 [codeview] Pass CVRecord to visitTypeBegin callback.
Both parameters to visitTypeBegin are actually members of CVRecord,
so we can just pass CVRecord instead of destructuring it.

Differential Revision: http://reviews.llvm.org/D21435

llvm-svn: 272899
2016-06-16 14:47:23 +00:00
Rui Ueyama
1b33098d46 [codeview] Remove unused parameter.
Differential Revision: http://reviews.llvm.org/D21433

llvm-svn: 272898
2016-06-16 14:41:22 +00:00
Rui Ueyama
44f2539d12 [Codeview] Add a class for LF_UDT_MOD_SRC_LINE.
Differential Revision: http://reviews.llvm.org/D21406

llvm-svn: 272843
2016-06-15 21:25:29 +00:00
Reid Kleckner
e8b7172caa Axe some trailing whitespace from my last commit
llvm-svn: 272830
2016-06-15 20:32:42 +00:00
Reid Kleckner
2eba5b5cb6 [codeview] Move deserialization methods out of line
They aren't performance critical and don't need to be inline.

llvm-svn: 272829
2016-06-15 20:30:34 +00:00
Zachary Turner
8110e5a8a3 Add support for writing through StreamInterface.
This adds method and tests for writing to a PDB stream.  With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157

llvm-svn: 272369
2016-06-10 05:09:12 +00:00
David Majnemer
08e81eef3f [CodeView] Fix a busted assert in TypeTableBuilder::writeClass
It was checking for Union when it should have checked for Interface.

llvm-svn: 271792
2016-06-04 15:40:31 +00:00
David Majnemer
3e22310a07 [TypeStreamMerger] visitUnknownMember was supposed to be visitUnknownType
llvm-svn: 271790
2016-06-04 15:40:27 +00:00
Reid Kleckner
14799a2f9b [codeview] Add basic record type translation
This only translates data members for now. Translating overloaded
methods is complicated, so I stopped short of doing that.

Reviewers: aaboud

Differential Revision: http://reviews.llvm.org/D20924

llvm-svn: 271680
2016-06-03 15:58:20 +00:00
Zachary Turner
6fb9f9896d [pdb] Dump file checksums from pdb codeview line info.
llvm-svn: 271622
2016-06-03 04:01:48 +00:00
Zachary Turner
9277831e4a [codeview] Dump line number and column information.
To facilitate this, a couple of changes had to be made:

1. `ModuleSubstream` got moved from `DebugInfo/PDB` to
`DebugInfo/CodeView`, and various codeview related types are defined
there.  It turns out `DebugInfo/CodeView/Line.h` already defines many of
these structures, but this is really old code that is not endian aware,
doesn't interact well with `StreamInterface` and not very helpful for
getting stuff out of a PDB.  Eventually we should migrate the old readobj
`COFFDumper` code to these new structures, or at least merge their
functionality somehow.

2. A `ModuleSubstream` visitor is introduced.  Depending on where your
module substream array comes from, different subsets of record types can
be expected.  We are already hand parsing these substream arrays in many
places especially in `COFFDumper.cpp`.  In the future we can migrate these
paths to the visitor as well, which should reduce a lot of code in
`COFFDumper.cpp`.

Differential Revision: http://reviews.llvm.org/D20936
Reviewed By: ruiu, majnemer

llvm-svn: 271621
2016-06-03 03:25:59 +00:00
Zachary Turner
8fad65b692 [llvm-pdbdump] Dump CodeView line information.
This first pass only splits apart the records and dumps the line
info kinds and binary data.  Subsequent patches will parse out
the binary data into more useful information and dump it in
detail.

llvm-svn: 271576
2016-06-02 20:11:22 +00:00
Zachary Turner
d356ecd7d3 [codeview] Fix a nasty use after free.
StreamRef was designed to be a thin wrapper over an abstract
stream interface that could itself be treated the same as any
other stream interface.  For this reason, it inherited publicly
from StreamInterface, and stored a StreamInterface* internally.

But StreamRef was also designed to be lightweight and easily
copyable, similar to ArrayRef.  This led to two misuses of
the classes.

1) When creating a StreamRef A from another StreamRef B, it was
   possible to end up with A storing a pointer to B, even when
   B was a temporary object, leading to use after free.
2) The above situation could be repeated ad nauseum, so that
   A stores a pointer to B, which itself stores a pointer to
   another StreamRef C, and so on and so on, creating an
   unnecessarily level of nesting depth.

This patch removes the public inheritance relationship between
StreamRef and StreamInterface, making it so that we can never
accidentally convert a StreamRef to a StreamInterface.

llvm-svn: 271570
2016-06-02 19:51:48 +00:00
David Majnemer
7085a0544f [CodeView] Use None instead of Void if there is no subprogram
llvm-svn: 271566
2016-06-02 18:51:24 +00:00
Zachary Turner
cc571053da [pdb] Parse and dump section map and section contribs
Differential Revision: http://reviews.llvm.org/D20876
Reviewed By: rnk, ruiu

llvm-svn: 271488
2016-06-02 05:07:49 +00:00
Reid Kleckner
05a06ad643 [codeview] Improve readability of type record assembly
Adds the method MCStreamer::EmitBinaryData, which is usually an alias
for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex
dump output like this:
        .byte   0x0e, 0x00, 0x08, 0x10
        .byte   0x03, 0x00, 0x00, 0x00
        .byte   0x00, 0x00, 0x00, 0x00
        .byte   0x00, 0x10, 0x00, 0x00

Also, when verbose asm comments are enabled, this patch prints the dump
output for each comment before its record, like this:
        # ArgList (0x1000) {
        #   TypeLeafKind: LF_ARGLIST (0x1201)
        #   NumArgs: 0
        #   Arguments [
        #   ]
        # }
        .byte   0x06, 0x00, 0x01, 0x12
        .byte   0x00, 0x00, 0x00, 0x00

This should make debugging easier and testing more convenient.

Reviewers: aaboud

Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits

Differential Revision: http://reviews.llvm.org/D20711

llvm-svn: 271313
2016-05-31 18:45:36 +00:00
Reid Kleckner
0021f4c0e0 [codeview] Add a CVTypeDumper::dump(ArrayRef<uint8_t>) overload
This is a convenient wrapper when the type record is already laid out as
bytes in memory.

llvm-svn: 271309
2016-05-31 18:15:23 +00:00
David Majnemer
6d741ef1bb Make sure we don't add an empty string to the stringmap
llvm-svn: 271172
2016-05-29 06:18:06 +00:00
David Majnemer
40cf622629 [SymbolDumper] Validate the string table offset before using it
llvm-svn: 271145
2016-05-28 20:04:46 +00:00
David Majnemer
c990a21322 [SymbolDumper] Validate the string table offset before using it
llvm-svn: 271142
2016-05-28 19:45:56 +00:00
Zachary Turner
94e5255730 [pdb] Finish conversion to zero copy pdb access.
This converts remaining uses of ByteStream, which was still
left in the symbol stream and type stream, to using the new
StreamInterface zero-copy classes.

RecordIterator is finally deleted, so this is the only way left
now.  Additionally, more error checking is added when iterating
the various streams.

With this, the transition to zero copy pdb access is complete.

llvm-svn: 271101
2016-05-28 05:21:57 +00:00
David Majnemer
517cb96201 Make sure data is available before dereferencing it
llvm-svn: 271032
2016-05-27 18:50:02 +00:00
Zachary Turner
0b5ee08758 Resubmit "[pdb] Allow zero-copy read support for symbol streams.""
Due to differences in template instantiation rules, it is not
portable to static_assert(false) inside of an invalid specialization
of a template.  Instead I just =delete the method so that it can't
be used, and leave a comment that it must be explicitly specialized.

llvm-svn: 271027
2016-05-27 18:47:20 +00:00
Chad Rosier
bdab1e9a71 Revert "[pdb] Allow zero-copy read support for symbol streams."
This reverts commit r271024 due to error: static_assert failed
"You must either provide a specialization of VarStreamArrayExtractor
or a custom extractor"

llvm-svn: 271026
2016-05-27 18:31:02 +00:00
Zachary Turner
f7482c3f63 [pdb] Allow zero-copy read support for symbol streams.
This reduces the amount of memory used by llvm-pdbdump by roughly
1/3 of the size of the PDB file.

Differential Revision: http://reviews.llvm.org/D20724
Reviewed By: ruiu

llvm-svn: 271025
2016-05-27 18:20:20 +00:00
Zachary Turner
f070dd590e [codeview,pdb] Try really hard to conserve memory when reading.
PDBs can be extremely large.  We're already mapping the entire
PDB into the process's address space, but to make matters worse
the blocks of the PDB are not arranged contiguously.  So, when
we have something like an array or a string embedded into the
stream, we have to make a copy.  Since it's convenient to use
traditional data structures to iterate and manipulate these
records, we need the memory to be contiguous.

As a result of this, we were using roughly twice as much memory
as the file size of the PDB, because every stream was copied
out and re-stitched together contiguously.

This patch addresses this by improving the MappedBlockStream
to allocate from a BumpPtrAllocator only when a read requires
a discontiguous read.  Furthermore, it introduces some data
structures backed by a stream which can iterate over both
fixed and variable length records of a PDB.  Since everything
is backed by a stream and not a buffer, we can read almost
everything from the PDB with zero copies.

Differential Revision: http://reviews.llvm.org/D20654
Reviewed By: ruiu

llvm-svn: 270951
2016-05-27 01:54:44 +00:00
Zachary Turner
493dc32ae8 [codeview] Move StreamInterface and StreamReader to libcodeview.
We have need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.

llvm-svn: 270751
2016-05-25 20:37:03 +00:00
Zachary Turner
1bbdf5dfd8 [codeview] Add support for new types and symbols.
This patch adds support for:

S_EXPORT
LF_BITFIELD

With this patch, I have run through a couple of gigabytes of PDB
files and cannot find a type or symbol that we do not understand.

llvm-svn: 270637
2016-05-25 00:12:48 +00:00
Zachary Turner
b8f5397a29 [codeview] Add support for S_EXPORT symbol.
llvm-svn: 270636
2016-05-25 00:12:40 +00:00
Zachary Turner
4aa4d6e21a [codeview] Add support for new type records.
This adds support for parsing and dumping the following
symbol types:

S_LPROCREF
S_ENVBLOCK
S_COMPILE2
S_REGISTER
S_COFFGROUP
S_SECTION
S_THUNK32
S_TRAMPOLINE

As of this patch, the test PDB files no longer have any unknown
symbol types.

llvm-svn: 270628
2016-05-24 22:58:46 +00:00
Zachary Turner
8f47ac2281 [codeview, pdb] Dump symbol records in publics stream
Differential Revision: http://reviews.llvm.org/D20580
Reviewed By: ruiu

llvm-svn: 270597
2016-05-24 18:55:14 +00:00
Zachary Turner
ad9a9a6f5e Fix build errors
llvm-svn: 270587
2016-05-24 17:44:29 +00:00
Zachary Turner
8a810b70e5 Dump symbol record details in llvm-pdbdump
This makes use of the newly introduced `CVSymbolVisitor` to dump details
of each type of symbol record in the symbol streams.  Future patches will
bring this visitor based dumping to the publics stream, as well as
creating a `SymbolDumpDelegate` to print more information about
relocations etc.

Differential Revision: http://reviews.llvm.org/D20545
Reviewed By: ruiu

llvm-svn: 270585
2016-05-24 17:30:25 +00:00
Zachary Turner
5844629626 Remove unused variable.
llvm-svn: 270516
2016-05-24 00:06:04 +00:00
Zachary Turner
3c95980335 Make a symbol visitor and use it to dump CV symbols.
Differential Revision: http://reviews.llvm.org/D20534
Reviewed By: rnk

llvm-svn: 270511
2016-05-23 23:41:13 +00:00
Reid Kleckner
1e664f585c [codeview] Test serialization of all known type records
This just checks that we emit all type records once, and then after
merging the type stream with no other type streams, we still emit every
kind of type record.

We could test the dumper output more closely, but that would make the
test very brittle. Currently we're just getting coverage.

llvm-svn: 269778
2016-05-17 16:20:35 +00:00
Reid Kleckner
79a13a2ef0 [codeview] Add type stream merging prototype
Summary:
This code is intended to be used as part of LLD's PDB writing. Until
that exists, this is exposed via llvm-readobj for testing purposes.

Type stream merging uses the following algorithm:

- Begin with a new empty stream, and a new empty hash table that maps
  from type record contents to new type index.
- For each new type stream, maintain a map from source type index to
  destination type index.
- For each record, copy it and rewrite its type indices to be valid in
  the destination type stream.
- If the new type record is not already present in the destination
  stream hash table, append it to the destination type stream, assign it
  the next type index, and update the two hash tables.
- If the type record already exists in the destination stream, discard
  it and update the type index map to forward the source type index to
  the existing destination type index.

Reviewers: zturner, ruiu

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20122

llvm-svn: 269521
2016-05-14 00:02:53 +00:00
Reid Kleckner
c52abd22d5 [codeview] Align class and print names of types
Summary: This way we can get rid of one of the fields in the .def file.

Reviewers: llvm-commits

Subscribers: zturner

Differential Revision: http://reviews.llvm.org/D20251

llvm-svn: 269461
2016-05-13 19:37:07 +00:00
Reid Kleckner
3dd0b43ba4 [codeview] Dump the type index on the first line of each record
This will make it easier to write FileCheck tests.

llvm-svn: 269444
2016-05-13 17:48:24 +00:00
Zachary Turner
30e5cd51f8 Get rid of CVLeafTypes.def and combine with TypeRecords.def
This merges the functionality of the macros in `CVLeafTypes.def` and the
macros in `TypeRecords.def` into a single set of macros.

Differential Revision: http://reviews.llvm.org/D20190
Reviewed By: rnk, amccarth

llvm-svn: 269316
2016-05-12 17:45:51 +00:00
Zachary Turner
26fcd8455c Make CodeView record serialization more generic.
This introduces a variadic template and some helper macros to
safely and correctly deserialize many types of common record
fields while maintaining error checking.

Differential Revision: http://reviews.llvm.org/D20183
Reviewed By: rnk, amccarth

llvm-svn: 269315
2016-05-12 17:45:44 +00:00
Zachary Turner
e4678c41e6 Fix build breakage in DebugInfoCodeview
llvm-svn: 269217
2016-05-11 17:54:20 +00:00