1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

462 Commits

Author SHA1 Message Date
Aaron Smith
a4bf47e131 Fix pretty printing the unspecified param of a variadic function
Summary:
 - Fix a bug in PrettyBuiltinDumper that returns "void" as the name for
  an unspecified builtin type. Since the unspecified param of a variadic
  function is considered a builtin of unspecified type in PDBs, we set
  "..." for its name.

  - Provide a method to determine if a PDBSymbolFunc is variadic in
  PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the
  last unspecified-type param.

  - Add a pretty-func-dumper.test to test pretty dumping of variadic
  functions.

Reviewers: zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D41801

llvm-svn: 322608
2018-01-17 01:22:03 +00:00
Michael Zolotukhin
b926a6e197 Remove redundant includes from lib/DebugInfo.
llvm-svn: 320620
2017-12-13 21:30:49 +00:00
Zachary Turner
5cd7c2a1c3 Don't #include MemoryBuffer.h from Host.h.
It turns out this #include isn't used from Host.h anyway,
but by having it it causes circular include dependencies.
This issues only surfaced while I was working on a separate
patch, so I'm submitting this first so that it's independent
of the other, unrelated patch.

llvm-svn: 318489
2017-11-17 01:00:35 +00:00
Reid Kleckner
edf5cde5e0 Fix my typo of PDB_TableType
llvm-svn: 318447
2017-11-16 19:41:12 +00:00
Reid Kleckner
3dbc869dca Fix -Wreturn-type falling off the end of a function in new DIA code
llvm-svn: 318444
2017-11-16 19:32:53 +00:00
Aaron Smith
9d7d2480c7 [DebugInfo/PDB] Adding getUndecoratedNameEx and IPDB interfaces for IDiaEnumTables and IDiaTable.
Initial changes to support debugging PE/COFF files with LLDB on Windows through DIA SDK.
There is another set of changes required on the LLDB side before this does anything.

Differential Revision: https://reviews.llvm.org/D39517

llvm-svn: 318403
2017-11-16 14:33:09 +00:00
Aaron Smith
b3a09d4903 Test commit. Add a missing dash to the standard llvm file header; NFC.
llvm-svn: 318400
2017-11-16 13:42:28 +00:00
Rafael Espindola
b0606b4221 Convert FileOutputBuffer to Expected. NFC.
llvm-svn: 317649
2017-11-08 01:05:44 +00:00
Reid Kleckner
06bac2866b [PDB] Handle an empty globals hash table with no buckets
llvm-svn: 316722
2017-10-27 00:45:51 +00:00
Peter Collingbourne
3760a48055 COFF: Add type server pdb files to linkrepro tar file.
Differential Revision: https://reviews.llvm.org/D38977

llvm-svn: 316233
2017-10-20 19:48:26 +00:00
Hans Wennborg
439468e183 Fix -Wcovered-switch-default warnings from r314821
llvm-svn: 314826
2017-10-03 18:44:12 +00:00
Hans Wennborg
abd9b7ecb5 CodeView: Provide a .def file with the register ids
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.

X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.

Differential revision: https://reviews.llvm.org/D38480

llvm-svn: 314821
2017-10-03 18:27:22 +00:00
Peter Collingbourne
8cd9088e66 COFF: PDB: Allow multiple modules with the same name.
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").

Differential Revision: https://reviews.llvm.org/D37589

llvm-svn: 312744
2017-09-07 20:39:46 +00:00
Peter Collingbourne
573d7ff23a Remove dead code. NFCI.
llvm-svn: 312740
2017-09-07 19:17:30 +00:00
Zachary Turner
fa9d44b037 [llvm-pdbutil] Support dumping CodeView from object files.
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB.  However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views.  Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match.  llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.

Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files.  In the future we could
perhaps rename this tool llvm-cvutil.

In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.

llvm-svn: 312358
2017-09-01 20:06:56 +00:00
Zachary Turner
389bcb458f [llvm-pdbutil] Print detailed S_UDT stats.
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records.  These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage.  This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type.  The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.

llvm-svn: 312276
2017-08-31 20:43:22 +00:00
Zachary Turner
f2828dec7b [lld/pdb] Speed up construction of publics & globals addr map.
computeAddrMap function calls std::stable_sort with a comparison
function that computes deserialized symbols every time its called.
In the result deserializeAs<PublicSym32> is called 20-30 times per
symbol. It's much faster to calculate it beforehand and pass a
pointer to it to the comparison function.

Patch by Alex Telishev
Differential Revision: https://reviews.llvm.org/D36941

llvm-svn: 311373
2017-08-21 20:08:40 +00:00
Zachary Turner
99c57c2876 [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

llvm-svn: 311338
2017-08-21 14:53:25 +00:00
Benjamin Kramer
b795ef1cb5 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Zachary Turner
6c0b0dd57f [LLD/PDB] Write actual records to the globals stream.
Previously we were writing an empty globals stream.  Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this.  Regardless,
without it we don't have information about global variables, so
we need to fix it anyway.  This patch does that.

With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.

Differential Revision: https://reviews.llvm.org/D36535

llvm-svn: 310743
2017-08-11 19:00:03 +00:00
Zachary Turner
d0823e0006 [PDB] Fix an issue writing the publics stream.
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader.  This caused debugging in WinDbg
to break.

We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.

llvm-svn: 310439
2017-08-09 04:23:59 +00:00
Zachary Turner
62cb11667a [PDB] Merge Global and Publics Builders.
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams.  PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.

Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.

Differential Revision: https://reviews.llvm.org/D36489

llvm-svn: 310438
2017-08-09 04:23:25 +00:00
Adrian McCarthy
ee6fb7079a Enable llvm-pdbutil to list enumerations using native PDB reader
This extends the native reader to enable llvm-pdbutil to list the enums in a
PDB and it includes a simple test. It does not yet list the values in the
enumerations, which requires an actual implementation of
NativeEnumSymbol::FindChildren.

To exercise this code, use a command like:

    llvm-pdbutil pretty -native -enums foo.pdb

Differential Revision: https://reviews.llvm.org/D35738

llvm-svn: 310144
2017-08-04 22:37:58 +00:00
Reid Kleckner
50f2ae649f [PDB] Fix section contributions
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.

This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.

Fixes PR34048

Reviewers: zturner

Subscribers: hiraditya, ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D36285

llvm-svn: 309987
2017-08-03 21:15:09 +00:00
Zachary Turner
1e25f820c0 [pdb/lld] Write a valid FPM.
The PDB reserves certain blocks for the FPM that describe which
blocks in the file are allocated and which are free.  We weren't
filling that out at all, and in some cases we were even stomping
it with incorrect data.  This patch writes a correct FPM.

Differential Revision: https://reviews.llvm.org/D36235

llvm-svn: 309896
2017-08-02 22:31:39 +00:00
Zachary Turner
311adaa4ee [pdbutil] Add a command to dump the FPM.
Recently problems have been discovered in the way we write the FPM
(free page map).  In order to fix this, we first need to establish
a baseline about what a correct FPM looks like using an MSVC
generated PDB, so that we can then make our own generated PDBs
match.  And in order to do this, the dumper needs a mode where it
can dump an FPM so that we can write tests for it.

This patch adds a command to dump the FPM, as well as a test against
a known-good PDB.

llvm-svn: 309894
2017-08-02 22:25:52 +00:00
Zachary Turner
6a895f2437 [lld/pdb] Add an empty globals stream.
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.

Differential Revision: https://reviews.llvm.org/D35290

llvm-svn: 309608
2017-07-31 19:36:08 +00:00
Reid Kleckner
8127f85824 [PDB] Initialize the std::array<ulittle32_t> used for the gsi bitmap
With ASan, we would write about 512 bytes of malloc fill value to the
PDB, with some random bits ORed in here and there. Dumping the PDB would
always fail reliably.

llvm-svn: 309331
2017-07-27 23:13:05 +00:00
Reid Kleckner
4941e61d5f [PDB] Write public symbol records and the publics hash table
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35871

llvm-svn: 309303
2017-07-27 18:25:59 +00:00
Reid Kleckner
162fd727ea [PDB] Remove stale GSI.h header that I intended to remove in the previous commit
llvm-svn: 309069
2017-07-26 00:58:49 +00:00
Reid Kleckner
427e306905 [PDB] Improve GSI hash table dumping for publics and globals
The PDB "symbol stream" actually contains symbol records for the publics
and the globals stream. The globals and publics streams are essentially
hash tables that point into a single stream of records. In order to
match cvdump's behavior, we need to only dump symbol records referenced
from the hash table. This patch implements that, and then implements
global stream dumping, since it's just a subset of public stream
dumping.

Now we shouldn't see S_PROCREF or S_GDATA32 records when dumping
publics, and instead we should see those record in the globals stream.

llvm-svn: 309066
2017-07-26 00:40:36 +00:00
Tim Northover
4aa5f3b819 Revert "Debug: handle dumping the D language."
Reid beat me to it.

llvm-svn: 308902
2017-07-24 17:47:46 +00:00
Tim Northover
5e187f3e4b Debug: handle dumping the D language.
Mostly just to silence a warning about an unhandled case. There don't seem to
be any tests for this operator (at least that I could find).

llvm-svn: 308901
2017-07-24 17:39:44 +00:00
Reid Kleckner
7933f7c5de Add missing case to switch
llvm-svn: 308894
2017-07-24 16:30:44 +00:00
Reid Kleckner
77d8531464 Fix DebugInfo/PDB build by adding missing changes
llvm-svn: 308765
2017-07-21 18:32:00 +00:00
Reid Kleckner
eb65629702 [PDB] Dump extra info about the publics stream
This includes the hash table, the address map, and the thunk table and
section offset table. The last two are only used for incremental
linking, which LLD doesn't support, so they are less interesting. The
hash table is particularly important to get right, since this is the one
of the streams that debuggers use to translate addresses to symbols.

llvm-svn: 308764
2017-07-21 18:28:55 +00:00
Reid Kleckner
247b925085 [PDB] Finish and simplify TPI hashing
Summary:
This removes the CVTypeVisitor updater and verifier classes. They were
made dead by the minimal type dumping refactoring. Replace them with a
single function that takes a type record and produces a hash. Call this
from the minimal type dumper and compare the hash.

I also noticed that the microsoft-pdb reference repository uses a basic
CRC32 for records that aren't special. We already have an implementation
of that CRC ready to use, because it's used in COFF for ICF.

I'll make LLD call this hashing utility in a follow-up change. We might
also consider using this same hash in type stream merging, so that we
don't have to hash our records twice.

Reviewers: inglorion, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D35515

llvm-svn: 308240
2017-07-18 00:33:45 +00:00
Reid Kleckner
9308fade75 [codeview] Fix YAML for LF_TYPESERVER2 by hoisting PDB_UniqueId
Summary:
We were treating the GUIDs in TypeServer2Record as strings, and the
non-ASCII bytes in the GUID would not round-trip through YAML.

We already had the PDB_UniqueId type portably represent a Windows GUID,
but we need to hoist that up to the DebugInfo/CodeView library so that
we can use it in the TypeServer2Record as well as in PDB parsing code.

Reviewers: inglorion, amccarth

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D35495

llvm-svn: 308234
2017-07-17 23:59:44 +00:00
Reid Kleckner
3ab9e840a9 [codeview] Remove TypeServerHandler and PDBTypeServerHandler
Summary:
Instead of wiring these through the CVTypeVisitor interface, clients
should inspect the CVTypeArray before visiting it and potentially load
up the type server's TPI stream if they need it.

No tests relied on this functionality because LLD was the only client.

Reviewers: ruiu

Subscribers: mgorny, hiraditya, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D35394

llvm-svn: 308212
2017-07-17 20:28:06 +00:00
Reid Kleckner
4d5e2af5ab [PDB] Fix type server handling for archives
Summary:
This fixes type indices for SDK or CRT static archives. Previously we'd
try to look next to the archive object file path, which would not exist
on the local machine.

Also error out if we can't resolve a type server record. Hypothetically
we can recover from this error by discarding debug info for this object,
but that is not yet implemented.

Reviewers: ruiu, amccarth

Subscribers: aprantl, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35369

llvm-svn: 307946
2017-07-13 20:12:23 +00:00
Reid Kleckner
d1ccae8f6d Fix non-Windows build after PDB native builtin type change
Some C++14 features slipped in along with an extra member qualification.

llvm-svn: 307835
2017-07-12 19:46:35 +00:00
Adrian McCarthy
22b5fc8160 [PDB] Enable NativeSession to create symbols for built-in types on demand
Summary:
There is a reserved range of type indexes for built-in types (like integers).
This will create a symbol for a built-in type if the caller askes for one by
type index.  This is also plumbing for being able to recall symbols by type
index in general, but user-defined types will come in subsequent patches.

Reviewers: rnk, zturner

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35163

llvm-svn: 307834
2017-07-12 19:38:11 +00:00
Zachary Turner
f4a9557a9b [lld/pdb] Create an empty public symbol record stream.
This is part of the continuing effort to increase parity between
LLD and MSVC PDBs.  link still doesn't like our PDBs, so the most
obvious thing to check was whether adding an empty publics stream
would get it to do something else.  It still fails in the same way
but at least this removes one more variable from the equation.
The next logical step would be to try creating an empty globals
stream.

Differential Revision: https://reviews.llvm.org/D35224

llvm-svn: 307598
2017-07-10 22:40:20 +00:00
Zachary Turner
6ff9ef15d3 [PDB] More changes to bring lld PDBs to parity with MSVC.
1) Don't write a /src/headerblock stream.  This appears to be
   written conditionally by MSVC, but it's not clear what the
   condition is.  For now, just remove it since we dont' know
   what it is anyway and the particular pdb we've checked in
   for the test doesn't have one.
2) Write a valid timestamp for the PDB file signature.  This
   leads to non-reproducible builds, but it matches the default
   behavior of link, so it should be out default as well.  If
   we need reproducibility, we should add a separate command
   line option for it that is off by default.
3) Write an empty FPO stream.  MSVC seems to always write an
   FPO stream.  This change makes the stream directory match
   up, although we still need to make the contents of the FPO
   stream match.

llvm-svn: 307436
2017-07-07 20:25:39 +00:00
Zachary Turner
ad59058f84 Fix some differences between lld and MSVC generated PDBs.
A couple of things were different about our generated PDBs.

1) We were outputting the wrong Version on the PDB Stream.
   The version we were setting was newer than what MSVC is setting.
   It's not clear what the implications are, but we change LLD
   to use PdbImplVC70, as MSVC does.
2) For the optional debug stream indices in the DBI Stream, we
   were outputting 0 to mean "the stream is not present".  MSVC
   outputs uint16_t(-1), which is the "correct" way to specify
   that a stream is not present.  So we fix that as well.
3) We were setting the PDB Stream signature to 0.  This is supposed
   to be the result of calling time(nullptr).  Although this leads
   to non-deterministic builds, a better way to solve that is by
   having a command line option explicitly for generating a
   reproducible build, and have the default behavior of lld-link
   match the default behavior of link.

To test this, I'm making use of the new and improved `pdb diff`
sub command.  To make it suitable for writing tests against, I had
to modify the diff subcommand slightly to print less verbose output.
Previously it would always print | <column> | <value1> | <value2> |
which is quite verbose, and the values are fragile.  All we really
want to know is "did we produce the same value as link?"  So I added
command line options to print a single character representing the
result status (different, identical, equivalent), and another to
hide the value display.  Note that just inspecting the diff output
used to write the test, you can see some things that are obviously
wrong.  That is just reflective of the fact that this is the state
of affairs today, not that we're asserting that this is "correct".
We can use this as a starting point to discover differences, fix
them, and update the test.

Differential Revision: https://reviews.llvm.org/D35086

llvm-svn: 307422
2017-07-07 18:45:56 +00:00
Zachary Turner
f0a7a434de [llvm-pdbutil] Improve diff mode.
We're getting to the point that some MS tools (e.g. DIA) can recognize
our PDBs but others (e.g. link.exe) cannot. I think the way forward is
to improve our tooling to help us find differences more easily. For
example, if we can compile the same program with clang-cl and cl and
have a tool tell us all the places where the PDBs differ, this could
tell us what we're doing wrong. It's tricky though, because there are a
lot of "benign" differences in a PDB. For example, if the string table
in one PDB consists of "foo" followed by "bar" and in the other PDB it
consists of "bar" followed by "foo", this is not necessarily a critical
difference, as long as the uses of these strings also refer to the
correct location. On the other hand, if the second PDB doesn't even
contain the string "foo" at all, this is a critical difference.

diff mode has been in llvm-pdbutil for quite a while, but because of the
above challenge along with some others, it's been hard to make it
useful. I think this patch addresses that. It looks for all the same
things, but it now prints the output in tabular format (carefully
formatted and aligned into tables and fields), and it highlights
critical differences in red, non-critical differences in yellow, and
identical fields in green.  This makes it easy to spot the places we
differ, and the general concept of outputting arbitrary fields in
tabular format can be extended to provide analysis into many of the
different types of information that show up in a PDB.

Differential Revision: https://reviews.llvm.org/D35039

llvm-svn: 307421
2017-07-07 18:45:37 +00:00
Zachary Turner
8ef979bdc8 [PDB] Teach libpdb to write DBI Stream ECNames.
Based strictly on the name, this seems to have something to do
width edit & continue.  The goal of this patch has nothing to do
with supporting edit and continue though.  msvc link.exe writes
very basic information into this area even when *not* compiling
with support for E&C, and so the goal here is to bring lld-link
to parity.  Since we cannot know what assumptions standard tools
make about the content of PDB files, we need to be as close as
possible.

This ECNames data structure is a standard PDB string hash table.
link.exe puts a single string into this hash table, which is the
full path to the PDB file on disk.  It then references this string
from the module descriptor for the compiler generated `* Linker *`
module.

With this patch, lld-link will generate the exact same sequence of
bytes as MSVC link for this subsection for a given object file
input (as reported by `llvm-pdbutil bytes -ec`).

llvm-svn: 307356
2017-07-07 05:04:36 +00:00
Eugene Zelenko
146f607502 [CodeView, PDB] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 306911
2017-06-30 23:06:03 +00:00
Adrian McCarthy
93bf3cda7e Introduce symbol cache to PDB NativeSession
Instead of creating symbols directly in the findChildren methods of the native
symbol implementations, they will rely on the NativeSession to act as a factory
for these types.  This lets NativeSession cache the NativeRawSymbols in its
new symbol cache and makes that cache the source of unique IDs for the symbols.

Right now, this affects only NativeCompilandSymbols.  There's no external
change yet, so I think the existing tests are still sufficient.  Coming soon
are patches to extend this to built-in types and enums.

llvm-svn: 306610
2017-06-28 22:47:40 +00:00
Zachary Turner
5c0b0c6d4d [pdb] Fix reading of llvm-generated PDBs by cvdump.
If you dump a pdb to yaml, and then round-trip it back to a pdb,
and run cvdump -l <file> on the new pdb, cvdump will generate
output such as this.

*** LINES

** Module: "d:\src\llvm\test\DebugInfo\PDB\Inputs\empty.obj"

Error: Line number corrupted: invalid file id 0
  <Unknown> (MD5), 0001:00000010-0000001A, line/addr pairs = 3

        5 00000010      6 00000013      7 00000018

Note the error message about the corrupted line number.

It turns out that the problem is that cvdump cannot find the
/names stream (e.g. the global string table), and the reason it
can't find the /names stream is because it doesn't understand
the NameMap that we serialize which tells pdb consumers which
stream has the string table.

Some experimentation shows that if we add items to the hash
table in a specific order before serializing it, cvdump can read
it. This suggests that either we're using the wrong hash function,
or we're serializing something incorrectly, but it will take some
deeper investigation to figure out how / why.  For now, this at
least allows cvdump to read our line information (and incidentally,
produces an identical byte sequence to what Microsoft tools
produce when writing the named stream map).

Differential Revision: https://reviews.llvm.org/D34491

llvm-svn: 306233
2017-06-25 03:51:42 +00:00
Zachary Turner
68fda3e094 [llvm-pdbutil] Dump raw bytes of module symbols and debug chunks.
llvm-svn: 306179
2017-06-23 23:08:57 +00:00
Zachary Turner
5bcb7f9566 [llvm-pdbutil] Dump raw bytes of type and id records.
llvm-svn: 306167
2017-06-23 21:50:54 +00:00
Zachary Turner
a5aa4dd5a1 [llvm-pdbutil] Dump raw bytes of various DBI stream subsections.
llvm-svn: 306160
2017-06-23 21:11:54 +00:00
Zachary Turner
086ec4ddf5 [llvm-pdbutil] Dump raw bytes of pdb name map.
This patch dumps the raw bytes of the pdb name map which contains
the mapping of stream name to stream index for the string table
and other reserved streams.

llvm-svn: 306148
2017-06-23 20:18:38 +00:00
Zachary Turner
287c1c9e0d [llvm-pdbutil] Add a function for formatting MSF data.
The goal here is to make it possible to display absolute
file offsets when dumping byets from an MSF.  The problem is
that when dumping bytes from an MSF, often the bytes will
cross a block boundary and encounter a discontinuity.  We
can't use the normal formatBinary() function for this because
this would just treat the sequence as entirely ascending, and
not account out-of-order blocks.

This patch adds a formatMsfData() function to our printer, and
then uses this function to improve the output of the -stream-data
command line option for dumping bytes from a particular stream.

Test coverage is also expanded to make sure to include all possible
scenarios of offsets, sizes, and crossing block boundaries.

llvm-svn: 306141
2017-06-23 18:52:13 +00:00
Adrian McCarthy
8990e637cd Fix build break by using llvm::make_unique instead of std::make_unique.
llvm-svn: 306043
2017-06-22 18:57:51 +00:00
Adrian McCarthy
bd1981ed79 Add IDs and clone methods to NativeRawSymbol
All NativeRawSymbols will have a unique symbol ID (retrievable via
getSymIndexId).  For now, these are initialized to 0, but soon the
NativeSession will be responsible for creating the raw symbols, and it will
assign unique IDs.

The symbol cache in the NativeSession will also require the ability to clone
raw symbols, so I've provided implementations for that as well.

llvm-svn: 306042
2017-06-22 18:43:18 +00:00
Adrian McCarthy
76db3aa83c Make IPDBSession::getGlobalScope a non-const method
There doesn't seem to be a compelling reason why this method should be const
other than it was possible with the DIA implementation.  The native session
is going to act as a symbol factory and cache.  This could be acheived with
mutable (and the existing const_cast), but it seems cleaner to accept that
this method affects the state of the session.

This change eliminates an existing const_cast.

llvm-svn: 306041
2017-06-22 18:42:23 +00:00
Zachary Turner
c6fbceb28e [PDB] Don't write uninitialized bytes to a PDB file.
There were certain fields that we didn't know how to write, as
well as various padding bytes that we would ignore.  This leads
to garbage data in the PDB.  While not strictly necessary, we
should initialize these bytes to something meaningful, as it
makes for easier binary comparison between PDBs.

llvm-svn: 305819
2017-06-20 18:50:55 +00:00
Reid Kleckner
6f888cd561 [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

llvm-svn: 305713
2017-06-19 17:21:45 +00:00
Zachary Turner
240abe68ab [llvm-pdbutil] Add support for dumping lines and inlinee lines.
llvm-svn: 305529
2017-06-15 23:56:19 +00:00
Zachary Turner
cc221d2ddd [llvm-pdbutil] Add back support for dumping file checksums.
When dumping module source files, also dump checksums.

llvm-svn: 305526
2017-06-15 23:12:41 +00:00
Zachary Turner
8eaa56ff25 [llvm-pdbutil] Add back the ability to dump hashes and index offsets.
This was regressed in a previous patch that re-wrote the dumper,
and I'm incrementally adding back the pieces that are missing.

llvm-svn: 305524
2017-06-15 23:04:42 +00:00
Zachary Turner
8b022e96fb Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

llvm-svn: 305517
2017-06-15 22:24:24 +00:00
Zachary Turner
e671337e5a Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

llvm-svn: 305505
2017-06-15 20:55:51 +00:00
Zachary Turner
f5c6b23b50 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

llvm-svn: 305495
2017-06-15 19:34:41 +00:00
Zachary Turner
14296a67da Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This was originally reverted because of some non-deterministic
failures on certain buildbots.  Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.

llvm-svn: 305393
2017-06-14 15:59:27 +00:00
Zachary Turner
fc271778ce Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This is causing failures on linux bots with an invalid stream
read.  It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.

llvm-svn: 305371
2017-06-14 06:24:24 +00:00
Zachary Turner
2461a178e5 [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.

Differential Revision: https://reviews.llvm.org/D34127

llvm-svn: 305366
2017-06-14 05:31:00 +00:00
Reid Kleckner
e5ca9f3c69 [PDB] Add a module descriptor for every object file
Summary:
Expose the module descriptor index and fill it in for section
contributions.

Reviewers: zturner

Subscribers: llvm-commits, ruiu, hiraditya

Differential Revision: https://reviews.llvm.org/D34126

llvm-svn: 305296
2017-06-13 15:49:13 +00:00
Zachary Turner
f1b7f3b2be Slightly better fix for dealing with no-id-stream PDBs.
The last fix required the user to manually add the required
feature.  This caused an LLD test to fail because I failed to
update LLD.  In practice we can hide this logic so it can just
be transparently added when we write the PDB.

llvm-svn: 305236
2017-06-12 21:46:51 +00:00
Zachary Turner
e433c4fcea [llvm-pdbdump] Don't fail on PDBs with no ID stream.
Older PDBs don't have this.  Its presence is detected by using
the various "feature" flags that come at the end of the PDB
Stream.  Detect this, and don't try to dump the ID stream if the
features tells us it's not present.

llvm-svn: 305235
2017-06-12 21:34:53 +00:00
Zachary Turner
f6333be573 Fix a null pointer dereference in llvm-pdbutil pretty.
Static data members were causing a problem because I mistakenly
assumed all members would affect a class's layout and so the
Layout member would be non-null.

llvm-svn: 305229
2017-06-12 20:46:35 +00:00
Zachary Turner
a92d8136a5 [PDB] Don't crash on /debug:fastlink PDBs.
Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value.  This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.

llvm-svn: 304982
2017-06-08 16:00:40 +00:00
Zachary Turner
c5632126fc Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Zachary Turner
d85641a0ca Fix uninitialized read.
llvm-svn: 304846
2017-06-06 23:54:23 +00:00
Chandler Carruth
eb66b33867 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Zachary Turner
5d88e16362 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

llvm-svn: 304738
2017-06-05 21:40:33 +00:00
Zachary Turner
a10e920415 [PDB] Fix use after free.
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary.  This
way it could still return the user a contiguous buffer of the
requested size.  However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.

Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view.  And this buffer would get destroyed
as soon as the strema was closed.

The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.

Differential Revision: https://reviews.llvm.org/D33858

llvm-svn: 304623
2017-06-03 00:33:35 +00:00
Zachary Turner
ea31ec8205 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

llvm-svn: 304588
2017-06-02 19:49:14 +00:00
Zachary Turner
a26ddca533 [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

llvm-svn: 304484
2017-06-01 21:52:41 +00:00
Adrian Prantl
2a294c6191 [DWARF] Introduce Dump Options
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.

Patch by Spyridoula Gravani!

Differential Revision: https://reviews.llvm.org/D33749

llvm-svn: 304446
2017-06-01 18:18:23 +00:00
Galina Kistanova
28073388d9 Added missing break.
llvm-svn: 304230
2017-05-30 19:02:49 +00:00
Zachary Turner
f028d10a98 [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

llvm-svn: 304218
2017-05-30 16:36:15 +00:00
Zachary Turner
342130c3c5 [lld] Fix a bug where we continually re-follow type servers.
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.

Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.

This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.

llvm-svn: 303920
2017-05-25 21:16:03 +00:00
Bob Haarman
07748f35ed [pdb] pad source file name buffer at the end instead of the beginning
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.

Reviewers: amccarth, rnk, zturner

Reviewed By: amccarth, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33475

llvm-svn: 303917
2017-05-25 21:12:15 +00:00
Zachary Turner
d610ac22c2 [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

llvm-svn: 303914
2017-05-25 21:06:28 +00:00
Zachary Turner
57dfa705b6 Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

llvm-svn: 303577
2017-05-22 21:07:43 +00:00
Zachary Turner
c669552325 Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

llvm-svn: 303446
2017-05-19 19:26:58 +00:00
Zachary Turner
4a5590fa6b Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

llvm-svn: 303409
2017-05-19 05:57:45 +00:00
Zachary Turner
1c726a40b8 [llvm-pdbdump] Add the ability to merge PDBs.
Merging PDBs is a feature that will be used heavily by
the linker.  The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools.  This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump.  It takes arbitrarily
many PDBs and outputs a single PDB.

Using this new functionality, a test is added for merging
type records.  Future patches will add the ability to merge
symbol records, module information, etc.

llvm-svn: 303389
2017-05-18 23:03:41 +00:00
Zachary Turner
05edea832d [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Zachary Turner
1a1eb3534c Fix some minor issues in PDB parsing library.
1) Until now I'd never seen a valid PDB where the DBI stream and
   the PDB Stream disagreed on the "Age" field.  Because of that,
   we had code to assert that they matched.  Recently though I was
   given a PDB where they disagreed, so this assumption has proven
   to be incorrect.  Remove this check.

2) We were walking the entire list of hash values for types up front
   and then throwing away the values.  For large PDBs this was a
   significant slow down.  Remove this.

With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.

llvm-svn: 303351
2017-05-18 15:14:44 +00:00
Bob Haarman
b6d21458cc [llvm-pdbdump] in yaml2pdb, generate default output filename if none given
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.

Reviewers: amccarth, zturner

Reviewed By: zturner

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D33296

llvm-svn: 303299
2017-05-17 20:46:48 +00:00
Zachary Turner
5f8e1427eb [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

llvm-svn: 303271
2017-05-17 16:39:06 +00:00
Zachary Turner
aaaf4b3ba3 [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

llvm-svn: 302936
2017-05-12 19:18:12 +00:00
Zachary Turner
5c99339579 [pdb] Don't verify TPI hash values up front.
Verifying the hash values as we are currently doing
results in iterating every type record before the user
even tries to access the first one, and the API user
has no control over, or ability to hook into this
process.

As a result, when the user wants to iterate over types
to print them or index them, this results in a second
iteration over the same list of types.  When there's
upwards of 1,000,000 type records, this is obviously
quite undesirable.

This patch raises the verification outside of TpiStream
, and llvm-pdbdump hooks a hash verification visitor
into the normal dumping process.  So we still verify
the hash records, but we can do it while not requiring
a second iteration over the type stream.

Differential Revision: https://reviews.llvm.org/D32873

llvm-svn: 302206
2017-05-04 23:53:54 +00:00
Zachary Turner
5f415496c7 [PDB] Don't build the entire source file list up front.
I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to
try and identify show-stopping performance problems.  This
patch addresses the first such problem.

When loading the DBI stream, before anyone has even tried to
access a single record, we build an in memory map of every
source file for every module.  In the particular PDB I was
using, this was over 85 million files.  Specifically, the
complexity is O(m*n) where m is the number of modules and
n is the average number of source files (including headers)
per module.

The whole reason for doing this was so that we could have
constant time access to any module and any of its source
file lists.  However, we can still get O(1) access to the
source file list for a given module with a simple O(m)
precomputation, and access to the list of modules is
already O(1) anyway.

So this patches reduces the O(m*n) up-front precomputation
to an O(m) one, where n is ~6,500 and n*m is about 85 million
in my pathological test case.

Differential Revision: https://reviews.llvm.org/D32870

llvm-svn: 302205
2017-05-04 23:53:29 +00:00
Zachary Turner
31cc3cca3a [llvm-readobj] Update readobj to re-use parsing code.
llvm-readobj hand rolls some CodeView parsing code for string
tables, so this patch updates it to re-use some of the newly
introduced parsing code in LLVMDebugInfoCodeView.

Differential Revision: https://reviews.llvm.org/D32772

llvm-svn: 302052
2017-05-03 17:11:11 +00:00
Zachary Turner
28d3fba1b3 Resubmit r301986 and r301987 "Add codeview::StringTable"
This was reverted due to a "missing" file, but in reality
what happened was that I renamed a file, and then due to
a merge conflict both the old file and the new file got
added to the repository.  This led to an unused cpp file
being in the repo and not referenced by any CMakeLists.txt
but #including a .h file that wasn't in the repo.  In an
even more unfortunate coincidence, CMake didn't report the
unused cpp file because it was in a subdirectory of the
folder with the CMakeLists.txt, and not in the same directory
as any CMakeLists.txt.

The presence of the unused file was then breaking certain
tools that determine file lists by globbing rather than
by what's specified in CMakeLists.txt

In any case, the fix is to just remove the unused file from
the patch set.

llvm-svn: 302042
2017-05-03 15:58:37 +00:00