These bugs were found by writing a Python script which spidered
the entire Chromium build directory tree demangling every symbol
in every object file. At the start, the tool printed:
Processed 27443 object files.
2926377/2936108 symbols successfully demangled (99.6686%)
9731 symbols could not be demangled (0.3314%)
14589 files crashed while demangling (53.1611%)
After this patch, it prints:
Processed 27443 object files.
41295518/41295617 symbols successfully demangled (99.9998%)
99 symbols could not be demangled (0.0002%)
0 files crashed while demangling (0.0000%)
The issues fixed in this patch are:
* Ignore empty parameter packs. Previously we would encounter
a mangling for an empty parameter pack and add a null node
to the AST. Since we don't print these anyway, we now just
don't add anything to the AST and ignore it entirely. This
fixes some of the crashes.
* Account for "incorrect" string literal demanglings. Apparently
an older version of clang would not truncate mangled string
literals to 32 bytes of encoded character data. The demangling
code however would allocate a 32 byte buffer thinking that it
would not encounter more than this, and overrun the buffer.
We now demangle up to 128 bytes of data, since the buggy
clang would encode up to 32 *characters* of data.
* Extended support for demangling init-fini stubs. If you had
something like
struct Foo {
static vector<string> S;
};
this would generate a dynamic atexit initializer *for the
variable*. We didn't handle this, but now we print something
nice. This is actually an improvement over undname, which will
fail to demangle this at all.
* Fixed one case of static this adjustment. We weren't handling
several thunk codes so we didn't recognize the mangling. These
are now handled.
* Fixed a back-referencing problem. Member pointer templates
should have their components considered for back-referencing
The remaining 99 symbols which can't be demangled are all symbols
which are compiler-generated and undname can't demangle either.
llvm-svn: 341000
A while back I submitted a patch to resolve backreferences
lazily, thinking this that it was not always possible to know
in advance what type you were looking at until you had completed
a full pass over the input, and therefore it would be impossible
to resolve backreferences eagerly.
This was mistaken though, and turned out to be an unrelated
problem. In fact, the reverse is true. You *must* resolve
backreferences eagerly. This is because certain types of nested
mangled symbols do not share a backreference context with their
parent symbol, and as such, if you try to resolve them lazily
their backreference context will have been lost by the time you
finish demangling the entire input. On the other hand, resolving
them eagerly appears to always work, and enables us to port
many more tests over.
llvm-svn: 340126
1) We print __restrict twice on member pointers. This is fixed
and relevant tests are re-enabled.
2) Several tests were disabled because of printing slightly
different output than undname. These were confirmed to be
bugs in undname, so we just re-enable the tests.
3) The test for printing reference temporaries is re-enabled. This
is a clang mangling extension, so we have some flexibility with
how we demangle it. The output currently looks fine, so we just
re-enable the test with no fixes.
llvm-svn: 339708
Function template names are not stored in the backref table,
but non-template function names are. The general pattern seems
to be that when you are demangling a symbol name, if the name
starts with '?' it does not go into the backreference table,
otherwise it does. Note that this even handles the general case
of operator names (template or otherwise) not going into the
back-reference table, anonymous namespaces not going into the
backreference table, etc.
It's important that we apply this check *only* for the
unqualified portion of a name, and only for symbol names.
For example, this does not apply to type names (such as class
templates) and we need to make sure that these still do go
into the backref table.
Differential Revision: https://reviews.llvm.org/D50394
llvm-svn: 339211
After we detected the presence of a template via ?$ we would proceed by
only demangling a simple unqualified name. This means we would fail on
templated operators (and perhaps other yet-to-be-determined things)
This was discovered while doing some refactoring to store richer
semantic information about the demangled types to pave the way for
overhauling the way we handle backreferences. (Specifically, we need to
defer recording or resolving back-references until a symbol has been
completely demangled, because we need to use information that only
occurs later in the mangled string to decide whether a back-reference
should be recorded.)
Differential Revision: https://reviews.llvm.org/D50145
llvm-svn: 338608
Properly demangle function parameter back-references.
Previously we treated lists of function parameters and template
parameters the same. There are some important differences with regards
to back-references, and some less important differences regarding which
characters can appear before or after the name.
The important differences are that with a given type T, all instances of
a function parameter list share the same global back-ref table.
Specifically, if X and Y are function pointers, then there are 3
entities in the declaration X func(Y) which all affect and are affected
by the master parameter back-ref table:
1) The parameter list of X's function type
2) the parameter list of func itself
3) The parameter list of Y's function type.
The previous code would create a back-reference table that was local to
a single parameter list, so it would not be shared across parameter
lists.
This was discovered when porting ms-back-references.test from clang's
mangling tests. All of these tests should now pass with the new changes.
In doing so, I split the function for parsing template and function
parameters into two separate functions. This makes the template
parameter list parsing code in particular very small and easy to
understand now.
Differential Revision: https://reviews.llvm.org/D49875
llvm-svn: 338075