Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.
Reviewer: dblaikie, JDevlieghere, vleschuk
Differential revision: https://reviews.llvm.org/D51081
llvm-svn: 342048
Before this patch, analyzeContext called getCanonicalDIEOffset(), for
which the result depends on the timings of the setCanonicalDIEOffset()
calls in the cloneLambda. This can lead to slightly different output
between runs due to threading.
To prevent this from happening, we now record the output debug info size
after importing the modules (before any concurrent processing takes
place). This value, named the ModulesEndOffset is used to compare the
canonical DIE offset against. If the value is greater than this offset,
the canonical DIE offset has been updated during cloning, and should
therefore not be considered for pruning.
Differential revision: https://reviews.llvm.org/D51443
llvm-svn: 341649
Keeping the compile units in memory is expensive. For the single
threaded case we allocate them in the analyze part and deallocate them
again once we've finished cloning. This poses a problem in the single
threaded case where we did all the analysis first followed by all the
cloning. This meant we had all the link context in memory right after
analyzing finished.
This patch changes the way we order work in the single threaded case.
Instead of doing all the analysis and cloning in serial, we now
interleave the two so we can deallocate the memory as soon as a file is
processed. The result is binary identical and peak memory usage went
down from 13.43GB to 5.73GB for a debug build of trunk clang.
Differential revision: https://reviews.llvm.org/D51618
llvm-svn: 341568
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.
llvm-svn: 341323
forward declarations.
Especially with template instantiations, there are legitimate reasons
why for declarations might be emitted into a DW_TAG_module skeleton /
forward-declaration sub-tree, that are not forward declarations in the
sense of that there is a more complete definition over in a .pcm file.
The example in the testcase is a constant DW_TAG_member of a
DW_TAG_class template instatiation.
rdar://problem/43623196
llvm-svn: 341123
This (partially) fixes a regression introduced by
https://reviews.llvm.org/D43945 / r327399, which parallelized
DwarfLinker. This patch avoids parsing and allocating the memory for
all input DIEs up front and instead only allocates them in the
concurrent loop in the AnalyzeLambda. At the end of the loop the
memory from the LinkContext is cleared again.
This reduces the peak memory needed to link the debug info of a
non-modular build of the Swift compiler by >3GB.
rdar://problem/43444464
Differential Revision: https://reviews.llvm.org/D51078
llvm-svn: 340650
Both DWARFDebugLine and DWARFDebugAddr used the same callback mechanism
for handling recoverable errors. They both implemented similar warn() function
to be used as such callbacks.
In this revision we get rid of code duplication and move this warn() function
to DWARFContext as DWARFContext::dumpWarning().
Reviewers: lhames, jhenderson, aprantl, probinson, dblaikie, JDevlieghere
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D51033
llvm-svn: 340528
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).
Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.
This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.
This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).
Reviewers: probinson, dblaikie, JDevlieghere
Subscribers: aprantl, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D49493
llvm-svn: 339122
The functions `lookForDIEsToKeep` and `keepDIEAndDependencies` can have
some very deep recursion. This tackles part of this problem by removing
the recursion from `lookForDIEsToKeep` by turning it into a worklist.
The difficulty in doing so is the computation of incompleteness, which
depends on the incompleteness of its children. To compute this, we
insert "continuation markers" into the worklist. This informs the work
loop to (re)compute the incompleteness property of the DIE associated
with it (i.e. the parent of the previously processed DIE).
This patch should generate byte-identical output. Unfortunately it also
has some impact of performance, regressing by about 4% when processing
clang on my machine.
Differential revision: https://reviews.llvm.org/D48899
llvm-svn: 338536
Dsymutil's update functionality was broken on Windows because we tried
to rename a file while we're holding open handles to that file. TempFile
provides a solution for this through its keep(Twine) method. This patch
changes dsymutil to make use of that functionality.
Differential revision: https://reviews.llvm.org/D49860
llvm-svn: 338216
This patch add support for emitting DWARF5 accelerator tables
(.debug_names) from dsymutil. Just as with the Apple style accelerator
tables, it's possible to update existing dSYMs. This patch includes a
test that show how you can convert back and forth between the two types.
If no kind of table is specified, dsymutil will default to generating
Apple-style accelerator tables whenever it finds those in its input. The
same is true when there are no accelerator tables at all. Finally, in
the remaining case, where there's at least one DWARF v5 table and no
Apple ones, the output will contains a DWARF accelerator tables
(.debug_names).
Differential revision: https://reviews.llvm.org/D49137
llvm-svn: 337980
When manually finishing the object writer in dsymutil, it's possible
that there are pending labels that haven't been resolved. This results
in an assertion when the assembler tries to fixup a label that doesn't
have an address yet.
Differential revision: https://reviews.llvm.org/D49131
llvm-svn: 336688
When implementing the DWARF accelerator tables in dsymutil I ran into an
assertion in the assembler. Debugging these kind of issues is a lot
easier when looking at the assembly instead of debugging the assembler
itself. Since it's only a matter of creating an AsmStreamer instead of a
MCObjectStreamer it made sense to turn this into a (hidden) dsymutil
feature.
Differential revision: https://reviews.llvm.org/D49079
llvm-svn: 336561
When emitting a CU, store the MCSymbol pointing to the beginning of the
CU. We'll need this information later when emitting the .debug_names
section (DWARF5 accelerator table).
llvm-svn: 336433
The original binary holder has an optimization where it caches a static
library (archive) between consecutive calls to GetObjects. However, the
actual memory buffer wasn't cached between calls.
This made sense when dsymutil was processing objects one after each
other, but when processing them in parallel, several binaries have to be
in memory at the same time. For this reason, every link context
contained a binary holder.
Having one binary holder per context is problematic, because the same
static archive was cached for every object file. Luckily, when the file
is mmap'ed, this was only costing us virtual memory.
This patch introduces a new BinaryHolder variant that is fully cached,
for all the object files it load, as well as the static archives. This
way, we don't have to give up on this optimization of bypassing the
file system.
Differential revision: https://reviews.llvm.org/D48501
llvm-svn: 335990
This patch splits off some abstractions used by dsymutil's dwarf linker
and moves them into separate header and implementation files. This
almost halves the number of LOC in DwarfLinker.cpp and makes it a lot
easier to understand what functionality lives where.
Differential revision: https://reviews.llvm.org/D48647
llvm-svn: 335749
After the recent refactoring that introduced parallel handling of
different object, the binary holder became unique per object file. This
defeats its optimization of caching archives, leading to an archive
being opened for every binary it contains. This is obviously unfortunate
and will need to be refactored soon.
Luckily in practice, the impact of this is limited as most files are
mmap'ed instead of memcopy'd. There's a caveat however: when the memory
buffer requires a null terminator and it's a multiple of the page size,
we allocate instead of mmap'ing. If this happens for a static archive,
we end up with N copies of it in memory, where N is the number of
objects in the archive, leading to exuberant memory usage. This provided
a stopgap solution to ensure that all the files it loads are mmap in
memory by removing the requirement for a terminating null byte.
Differential revision: https://reviews.llvm.org/D48397
llvm-svn: 335293
Errors found processing the DW_AT_ranges attribute are propagated by lower level
routines and reported by their callers.
Reviewer: JDevlieghere
Differential Revision: https://reviews.llvm.org/D48344
llvm-svn: 335188
This simplifies some code which had StringRefs to begin with, and
makes other code more complicated which had const char* to begin
with.
In the end, I think this makes for a more idiomatic and platform
agnostic API. Not all platforms launch process with null terminated
c-string arrays for the environment pointer and argv, but the api
was designed that way because it allowed easy pass-through for
posix-based platforms. There's a little additional overhead now
since on posix based platforms we'll be takign StringRefs which
were constructed from null terminated strings and then copying
them to null terminate them again, but from a readability and
usability standpoint of the API user, I think this API signature
is strictly better.
llvm-svn: 334518
As noted by Adrian on llvm-commits, PrintHTMLEscaped and PrintEscaped in
StringExtras did not conform to the LLVM coding guidelines. This commit
rectifies that.
llvm-svn: 333669
When printing string in the Plist, we weren't escaping the characters
which lead to invalid XML. This patch adds the escape logic to
StringExtras.
rdar://39785334
llvm-svn: 333565
This commit adds a color category so tools can document this option and
enables it for dwarfdump and dsymuttil.
rdar://problem/40498996
llvm-svn: 333176
Also clean up a couple of hacks where we were writing the section
contents to another stream by setting the object writer's stream,
writing and setting it back.
Part of PR37466.
Differential Revision: https://reviews.llvm.org/D47038
llvm-svn: 332858
Change the "recoverable" error callback to take an Error instaed of a
string.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D46831
llvm-svn: 332845
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.
Part of PR37466.
Differential Revision: https://reviews.llvm.org/D47050
llvm-svn: 332749
Reviewed by: dblaikie, JDevlieghere, espindola
Differential Revision: https://reviews.llvm.org/D44560
Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.
There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).
I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.
Known behaviour changes:
- The dump function in DWARFContext now does not attempt to read subsequent
tables when searching for a specific offset, if the unit length field of a
table before the specified offset is a reserved value.
- getOrParseLineTable now returns a useful Error if an invalid offset is
encountered, rather than simply a nullptr.
- The parse functions no longer use `WithColor::warning` directly to report
errors, allowing LLD to call its own warning function.
- The existing parse error messages have been updated to not specifically
include "warning" in their message, allowing consumers to determine what
severity the problem is.
- If the line table version field appears to have a value less than 2, an
informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative
error is returned, instead of just false.
- Dumping of .debug_line.dwo sections is now implemented the same as regular
.debug_line sections.
- Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
there is a prologue error, just like non-verbose dumping.
As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.
This change also requires a change to LLD, which will be committed separately.
llvm-svn: 331971
It used to symlink dsymutil to llvm-dsymutil, but after r327790 llvm's dsymutil
binary is now called dsymutil without prefix.
r327792 then reversed the direction of the symlink if
LLVM_INSTALL_CCTOOLS_SYMLINKS was set, but that looks like a buildfix and not
like something anyone should need.
https://reviews.llvm.org/D45966
llvm-svn: 330727
Create convenience functions for printing error, warning and note to
stdout. Previously we had similar functions being used in dsymutil, but
given that this pattern is so common it makes sense to make it available
globally.
llvm-svn: 330091
We have a few functions that virtually all command wants to run on
process startup/shutdown. This patch adds InitLLVM class to do that
all at once, so that we don't need to copy-n-paste boilerplate code
to each llvm command's main() function.
Differential Revision: https://reviews.llvm.org/D45602
llvm-svn: 330046
These aren't the .def style files used in LLVM that require a macro
defined before their inclusion - they're just basic non-modular includes
to stamp out command line flag variables.
llvm-svn: 329840
With the threading refactoring, loading of object files happens before
checking whether we're dealing with a swift AST. While that's not an
issue per se, it causes a warning to be printed:
warning: /path/to/a.swiftmodule: The file was not recognized as a valid object file
note: while processing /path/to/a.swiftmodule
This suppresses the warning by checking for a Swift AST before
attempting to load is as an object file.
rdar://39240444
llvm-svn: 329553
The DwarfLinker can have some very deep recursion that can max out the
(significantly smaller) stack when using threads. We don't want this
limitation when we only have a single thread. We already have this
workaround for the architecture-related threading. This patch applies
the same workaround to the parallel analysis and cloning.
Differential revision: https://reviews.llvm.org/D45172
llvm-svn: 329093
When running dsymutil as part of your build system, it can be desirable
for warnings to be part of the end product, rather than just being
emitted to the output stream. This patch upstreams that functionality.
Differential revision: https://reviews.llvm.org/D44639
llvm-svn: 328965
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: JDevlieghere, zturner, echristo, dberris, friss
Reviewed By: echristo
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D45141
llvm-svn: 328943
Now that almost all functionality of Apple's dsymutil has been
upstreamed, the open source variant can be used as a drop in
replacement. Hence we feel it's no longer necessary to have the llvm
prefix.
Differential revision: https://reviews.llvm.org/D44527
llvm-svn: 327790
Make the architecture part of the warning in the DebugMapParser. This
makes things consistent with the Apple's internal version of dsymutil.
llvm-svn: 327485
This is a follow-up to r327137 where we unified error handling for the
DwarfLinker. This replaces calls to errs() and outs() with the
appropriate ostream wrapper everywhere in dsymutil.
llvm-svn: 327411