1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

55 Commits

Author SHA1 Message Date
Nico Weber
d4f564fc30 llvm-undname: Fix more crashes and asserts on invalid inputs
For functions whose callers don't check that enough input is present,
add checks at the start of the function that enough input is there and
set Error otherwise.

For functions that return AST objects, return nullptr instead of
incomplete AST objects with nullptr fields if an error occurred during
the function.

Introduce a new function demangleDeclarator() for the sequence
demangleFullyQualifiedSymbolName(); demangleEncodedSymbol() and
use it in the two places that had this sequence. Let this new function
check that ConversionOperatorIdentifiers have a valid TargetType.

Some of the bad inputs found by oss-fuzz, others by inspection.

Differential Revision: https://reviews.llvm.org/D60354

llvm-svn: 357936
2019-04-08 19:46:53 +00:00
Nico Weber
ec9094441c llvm-undname: Fix a crash-on-invalid
Found by oss-fuzz, fixes issue 13260 on oss-fuzz.

Differential Revision: https://reviews.llvm.org/D60207

llvm-svn: 357649
2019-04-03 23:27:18 +00:00
Nico Weber
4370387226 llvm-undame: Fix an assert-on-invalid
Found by oss-fuzz, fixes issue 12432 on os-fuzz.

Differential Revision: https://reviews.llvm.org/D60206

llvm-svn: 357648
2019-04-03 23:23:32 +00:00
Nico Weber
b6b1db8acb llvm-undname: Fix an assert-on-invalid
Found by oss-fuzz, fixes issues 12428 and 12429 on oss-fuzz.

Differential Revision: https://reviews.llvm.org/D60204

llvm-svn: 357647
2019-04-03 23:19:39 +00:00
Nico Weber
d503567673 llvm-undname: Fix a crash-on-invalid
Found by oss-fuzz, fixes issues 12435 and 12438 on oss-fuzz.

Differential Revision: https://reviews.llvm.org/D60202

llvm-svn: 357646
2019-04-03 23:15:56 +00:00
Zachary Turner
3475d3dfe8 [llvm-undname] Add support for demangling msvc's noexcept types.
Starting in C++17, MSVC introduced a new mangling for function
parameters that are themselves noexcept functions.  This patch
makes llvm-undname properly demangle them.

Patch by Zachary Henkel
Differential Revision: https://reviews.llvm.org/D55769

llvm-svn: 350656
2019-01-08 21:05:51 +00:00
Zachary Turner
0a0be9b76e [MS Demangler] Fail gracefully on invalid pointer types.
Once we detect a 'P', we know we a pointer type is upcoming, so
we make some assumptions about the output that follows.  If those
assumptions didn't hold, we would assert.  Instead, we should
fail gracefully and propagate the error up.

llvm-svn: 349169
2018-12-14 18:10:13 +00:00
Zachary Turner
6d3b7f4c22 [MS Demangler] Add a regression test for an invalid mangled name.
llvm-svn: 349168
2018-12-14 17:59:27 +00:00
Nico Weber
0ad4ee4225 [MS Demangler] Print public:, protected:, private: if set in FunctionClass or a variable's StorageClass.
undname prints them, and the information is in the decorated name, so we probably shouldn't lose it when undecorating.

I spot-checked a few of the funnier-looking outputs, and undname has the same output.

Differential Revision: https://reviews.llvm.org/D54396

llvm-svn: 346791
2018-11-13 20:18:26 +00:00
Nico Weber
abe2a8d7cb [MS demangler] Use a slightly shorter unmangling for mangled strings.
Before: const wchar_t * {L"%"}
Now: L"%"

See also PR39593.
Differential Revision: https://reviews.llvm.org/D54294

llvm-svn: 346544
2018-11-09 19:28:50 +00:00
Zachary Turner
5c0fec24c6 [MS Demangler] Add support for $$Z parameter pack separator.
$$Z appears between adjacent expanded parameter packs in the
same template instantiation.  We don't need to print it, it's
only there to disambiguate between manglings that would otherwise
be ambiguous.  So we just need to parse it and throw it away.

llvm-svn: 341119
2018-08-30 20:53:29 +00:00
Zachary Turner
696ab67afc [MS Demangler] Fix several crashes and demangling bugs.
These bugs were found by writing a Python script which spidered
the entire Chromium build directory tree demangling every symbol
in every object file.  At the start, the tool printed:

  Processed 27443 object files.
  2926377/2936108 symbols successfully demangled (99.6686%)
  9731 symbols could not be demangled (0.3314%)
  14589 files crashed while demangling (53.1611%)

After this patch, it prints:

  Processed 27443 object files.
  41295518/41295617 symbols successfully demangled (99.9998%)
  99 symbols could not be demangled (0.0002%)
  0 files crashed while demangling (0.0000%)

The issues fixed in this patch are:

  * Ignore empty parameter packs.  Previously we would encounter
    a mangling for an empty parameter pack and add a null node
    to the AST.  Since we don't print these anyway, we now just
    don't add anything to the AST and ignore it entirely.  This
    fixes some of the crashes.

  * Account for "incorrect" string literal demanglings.  Apparently
    an older version of clang would not truncate mangled string
    literals to 32 bytes of encoded character data.  The demangling
    code however would allocate a 32 byte buffer thinking that it
    would not encounter more than this, and overrun the buffer.
    We now demangle up to 128 bytes of data, since the buggy
    clang would encode up to 32 *characters* of data.

  * Extended support for demangling init-fini stubs.  If you had
    something like
      struct Foo {
        static vector<string> S;
      };
    this would generate a dynamic atexit initializer *for the
    variable*.  We didn't handle this, but now we print something
    nice.  This is actually an improvement over undname, which will
    fail to demangle this at all.

  * Fixed one case of static this adjustment.  We weren't handling
    several thunk codes so we didn't recognize the mangling.  These
    are now handled.

  * Fixed a back-referencing problem.  Member pointer templates
    should have their components considered for back-referencing

The remaining 99 symbols which can't be demangled are all symbols
which are compiler-generated and undname can't demangle either.

llvm-svn: 341000
2018-08-29 23:56:09 +00:00
Zachary Turner
e121f14e93 Add support for various C++14 demanglings.
Mostly this includes <auto> and <decltype-auto> return values.
Additionally, this fixes a fairly obscure back-referencing bug
that was encountered in one of the C++14 tests, which is that
if you have something like Foo<&bar, &bar> then the `bar`
forms a backreference.

llvm-svn: 340896
2018-08-29 04:12:44 +00:00
Zachary Turner
b7710f8516 [MS Demangler] Re-write the Microsoft demangler.
This is a pretty large refactor / re-write of the Microsoft
demangler.  The previous one was a little hackish because it
evolved as I was learning about all the various edge cases,
exceptions, etc.  It didn't have a proper AST and so there was
lots of custom handling of things that should have been much
more clean.

Taking what was learned from that experience, it's now
re-written with a completely redesigned and much more sensible
AST.  It's probably still not perfect, but at least it's
comprehensible now to someone else who wants to come along
and make some modifications or read the code.

Incidentally, this fixed a couple of bugs, so I've enabled
the tests which now pass.

llvm-svn: 340710
2018-08-27 03:48:03 +00:00
Zachary Turner
9be9e38aa6 [MS Demangler] Print template constructor args.
Previously if you had something like this:

template<typename T>
struct Foo {
  template<typename U>
  Foo(U);
};

Foo F(3.7);

this would mangle as ??$?0N@?$Foo@H@@QEAA@N@Z

and this would be demangled as:

undname:      __cdecl Foo<int>::Foo<int><double>(double)
llvm-undname: __cdecl Foo<int>::Foo<int>(double)

Note the lack of the constructor template parameter in our
demangling.

This patch makes it so we print the constructor argument list.

llvm-svn: 340356
2018-08-21 22:52:52 +00:00
Zachary Turner
7a7fb4f525 [MS Demangler] Fix a few more edge cases.
I found these by running llvm-undname over a couple hundred
megabytes of object files generated as part of building chromium.
The issues fixed in this patch are:

  1) decltype-auto return types.
  2) Indirect vtables (e.g. const A::`vftable'{for `B'})
  3) Pointers, references, and rvalue-references to member pointers.

I have exactly one remaining symbol out of a few hundred MB of object
files that produces a name we can't demangle, and it's related to
back-referencing.

llvm-svn: 340341
2018-08-21 21:23:49 +00:00
Zachary Turner
234e63078e [MS Demangler] Demangle special operator 'dynamic initializer'.
This is encoded as __E and should print something like
"dynamic initializer for 'Foo'(void)"

This also adds support for dynamic atexit destructor, which is
basically identical but encoded as __F with slightly different
description.

llvm-svn: 340239
2018-08-20 23:59:21 +00:00
Zachary Turner
ac6c3f6394 [MS Demangler] Anonymous namespace hashes can be backreferenced.
Previously we were not remembering the key values of anonymous
namespaces, but we need to do this.

llvm-svn: 340238
2018-08-20 23:58:58 +00:00
Zachary Turner
41eec7abf4 [MS Demangler] Properly demangle anonymous namespaces.
llvm-svn: 340237
2018-08-20 23:58:35 +00:00
Zachary Turner
08e75fdaa7 [MS Demangler] Demangle member pointer template parameters.
llvm-svn: 340199
2018-08-20 19:15:35 +00:00
Zachary Turner
5f102a9ff2 [MS Demangler] Resolve backreferences eagerly, not lazily.
A while back I submitted a patch to resolve backreferences
lazily, thinking this that it was not always possible to know
in advance what type you were looking at until you had completed
a full pass over the input, and therefore it would be impossible
to resolve backreferences eagerly.

This was mistaken though, and turned out to be an unrelated
problem.  In fact, the reverse is true.  You *must* resolve
backreferences eagerly.  This is because certain types of nested
mangled symbols do not share a backreference context with their
parent symbol, and as such, if you try to resolve them lazily
their backreference context will have been lost by the time you
finish demangling the entire input.  On the other hand, resolving
them eagerly appears to always work, and enables us to port
many more tests over.

llvm-svn: 340126
2018-08-18 18:49:48 +00:00
Zachary Turner
491ab30567 [MS Demangler] Properly print all thunk types.
We were only printing the vtordisp thunk before as the previous
patch was more aimed at getting special operators working, one
of which was a thunk.  This patch gets all thunk types to print
properly, and adds a test for each one.

llvm-svn: 340088
2018-08-17 21:32:07 +00:00
Zachary Turner
0387f750b1 [MS Demangler] Demangle all remaining types of operators.
This demangles all remaining special operators including thunks,
RTTI Descriptors, and local static guard variables.

llvm-svn: 340083
2018-08-17 21:18:05 +00:00
Zachary Turner
75b475a1c8 [MS Demangler] Rework the way operators are demangled.
Previously, some of the code for actually parsing mangled
operator names was more like formatting code in nature,
and was interspersed with the demangling code which builds
the AST.  This means that by the time we got to the printing
code, we had lost all information about what type of operator
we had, and all we were left with was a string that we just
had to print.  However, not all operators are actually even
operators.  it's basically just a catch-all mangling for
"special names", and for some of the other types it helps
to know when we're actually doing the printing what it is.

This patch changes the way things work by introducing an
OperatorInfo structure and corresponding enumeration.  When
we demangle we store the enumeration value and demangled
components separately.  This gives more flexibility during
printing.

In doing so, some demanglings of special names which we didn't
previously support come out of this for free, so we now demangle
those.

A few are more complex and are better left for a followup patch
though.

An exhaustive test of every possible operator code is included,
with the ones that don't yet work commented out.

llvm-svn: 340046
2018-08-17 16:14:05 +00:00
Zachary Turner
98b577be85 [MS Demangler] Demangle string literals.
When demangling string literals, Microsoft's undname
simply prints 'string'.  This patch implements string
literal demangling while doing a bit better than this
by decoding as much of the string as possible and
trying to faithfully reproduce the original string
literal definition.

This is a bit tricky because the different character
types char, char16_t, and char32_t are not uniquely
identified by the mangling, so we have to use a
heuristic to try to guess the character type.  But
it works pretty well, and many tests are added to
illustrate the behavior.

Differential Revision: https://reviews.llvm.org/D50806

llvm-svn: 339892
2018-08-16 16:17:36 +00:00
Zachary Turner
8dd04a5968 [MS Demangler] Don't fail on MD5-mangled names.
When we have an MD5 mangled name, we shouldn't choke and say
that it's an invalid name.  Even though it's impossible to demangle,
we should just output the original name.

llvm-svn: 339891
2018-08-16 16:17:17 +00:00
Zachary Turner
609d36082b [MS Demangler] Fix some minor formatting bugs.
1) We print __restrict twice on member pointers.  This is fixed
   and relevant tests are re-enabled.

2) Several tests were disabled because of printing slightly
   different output than undname.  These were confirmed to be
   bugs in undname, so we just re-enable the tests.

3) The test for printing reference temporaries is re-enabled.  This
   is a clang mangling extension, so we have some flexibility with
   how we demangle it.  The output currently looks fine, so we just
   re-enable the test with no fixes.

llvm-svn: 339708
2018-08-14 18:54:28 +00:00
Zachary Turner
f5224914dd [MS Demangler] Support extern "C" functions.
There are two cases we need to support with extern "C"
functions.  The first is the case of a '9' indicating that
the function has no prototype.  This occurs when we mangle
a symbol inside of an extern "C" function, but not the
function itself.

The second case is when we have an overloaded extern "C"
functions.  In this case we emit $$J0 to indicate this.
This patch adds support for both of these cases.

llvm-svn: 339471
2018-08-10 21:09:05 +00:00
Zachary Turner
d563f303cb Resubmit r339450 - [MS Demangler] Add conversion operator tests
This was broken because of a malformed check line.  Incidentally,
this exposed a case where we crash when we should just be returning
an error, so we should fix that.  The demangler shouldn't crash due
to user input.

llvm-svn: 339466
2018-08-10 20:08:46 +00:00
Zachary Turner
808fcfa26f [MS Demangler] Demangle cv qualifiers on template args.
Before we wouldn't properly demangle something like
Foo<const int>.  Template args have a special escape sequence
'$$C' that is optional, but if it is present contains
qualifiers.  So we need to check for this and only if it
present, demangle qualifiers before demangling the type.

With this fix, we re-enable some tests that were previously
marked FIXME.

llvm-svn: 339465
2018-08-10 19:57:36 +00:00
Sanjay Patel
d0c6cf2273 revert r339450 - [MS Demangler] Add conversion operator tests
Something here causes an assertion failure that killed a bunch of bots.
Example:
http://lab.llvm.org:8011/builders/reverse-iteration/builds/7021/steps/check_all/logs/stdio

llvm-svn: 339463
2018-08-10 19:20:16 +00:00
Zachary Turner
ede3bf9219 [MS Demangler] Add conversion operator tests.
The mangled names were added in the original commit, but
the demangled equivalents weren't, so nothing was actually
being checked.

llvm-svn: 339450
2018-08-10 16:55:59 +00:00
Zachary Turner
47a3242786 [MS Demangler] Properly demangle conversion operators.
These were completely broken before.  We need to handle
the 'B' operator tag.

llvm-svn: 339436
2018-08-10 15:04:56 +00:00
Zachary Turner
a9b34c5937 [MS Demangler] Disable a couple of tests.
The check lines are marked FIXME but not the mangled names.
This is causing an error.

llvm-svn: 339435
2018-08-10 14:53:33 +00:00
Zachary Turner
42c8ed3fec [MS Demangler] Fix several issues related to templates.
These were uncovered when porting the mangling tests in
ms-templates.cpp from clang/CodeGenCXX over to demangling
tests.  The main issues fixed here are surrounding integer
literal signed and unsignedness, empty array dimensions,
and pointer and reference non-type template parameters.

Differential Revision: https://reviews.llvm.org/D50512

llvm-svn: 339434
2018-08-10 14:31:04 +00:00
Zachary Turner
21da3ecbe2 [MS Demangler] Create a new backref context for template instantiations.
Template manglings use a fresh back-referencing context, so we
need to do the same.  This fixes several existing tests which are
marked as FIXME, so those are now actually run.

llvm-svn: 339275
2018-08-08 17:17:04 +00:00
Zachary Turner
5651b20005 [MS Demangler] Properly handle backreferencing of special names.
Function template names are not stored in the backref table,
but non-template function names are.  The general pattern seems
to be that when you are demangling a symbol name, if the name
starts with '?' it does not go into the backreference table,
otherwise it does.  Note that this even handles the general case
of operator names (template or otherwise) not going into the
back-reference table, anonymous namespaces not going into the
backreference table, etc.

It's important that we apply this check *only* for the
unqualified portion of a name, and only for symbol names.
For example, this does not apply to type names (such as class
templates) and we need to make sure that these still do go
into the backref table.

Differential Revision: https://reviews.llvm.org/D50394

llvm-svn: 339211
2018-08-08 00:43:31 +00:00
Zachary Turner
770b562212 [MS Demangler] Fix some tests that are no longer broken.
These were fixed with earlier patches, but had not yet been
re-enabled.

llvm-svn: 338778
2018-08-02 22:37:40 +00:00
Zachary Turner
eb21f12ca7 [MS Demangler] Properly demangle templated operators.
After we detected the presence of a template via ?$ we would proceed by
only demangling a simple unqualified name. This means we would fail on
templated operators (and perhaps other yet-to-be-determined things)

This was discovered while doing some refactoring to store richer
semantic information about the demangled types to pave the way for
overhauling the way we handle backreferences. (Specifically, we need to
defer recording or resolving back-references until a symbol has been
completely demangled, because we need to use information that only
occurs later in the mangled string to decide whether a back-reference
should be recorded.)

Differential Revision: https://reviews.llvm.org/D50145

llvm-svn: 338608
2018-08-01 18:32:47 +00:00
Zachary Turner
b435cc5faa Resubmit r338340 "[MS Demangler] Better demangling of template arguments."
This broke the build with GCC, but has since been fixed.

llvm-svn: 338403
2018-07-31 17:16:44 +00:00
Reid Kleckner
a93cf55d3c Revert r338340 "[MS Demangler] Better demangling of template arguments."
Breaks the build with GCC, apparently.

llvm-svn: 338344
2018-07-31 01:08:42 +00:00
Zachary Turner
41bfb8ed38 [MS Demangler] Better demangling of template arguments.
This patch fixes demangling of template aliases as template-template
arguments, and also fixes function pointers and references as
not type template parameters.  All of these can be properly
demangled now, so I've ported over the test
clang/test/CodeGenCXX/ms-template-callbacks.cpp.  All of these
tests pass

llvm-svn: 338340
2018-07-31 00:26:52 +00:00
Zachary Turner
49c01d48fd [MS Demangler] Add ms-return-qualifiers.test.
This is a copy of the tests from
clang/CodeGenCXX/ms-return-qualifiers.cpp converted to demangling
tests.

llvm-svn: 338330
2018-07-30 23:22:39 +00:00
Zachary Turner
0a0d8f34cb [MS Demangler] Add rudimentary C++11 Support
This patch adds support for demangling r-value references, new
operators such as the ""_foo operator, lambdas, alias types,
nullptr_t, and various other C++11'isms.

There is 1 failing test remaining in this file, which appears to
be related to back-referencing. This type of problem has the
potential to get ugly so I'd rather fix it in a separate patch.

Differential Revision: https://reviews.llvm.org/D50013

llvm-svn: 338324
2018-07-30 23:02:10 +00:00
Zachary Turner
2b73642244 [MS Demangler] Demangle symbols in function scopes.
There are a couple of issues you run into when you start getting into
more complex names, especially with regards to function local statics.
When you've got something like:

    int x() {
      static int n = 0;
      return n;
    }

Then this needs to demangle to something like

    int `int __cdecl x()'::`1'::n

The nested mangled symbols (e.g. `int __cdecl x()` in the above
example) also share state with regards to back-referencing, so
we need to be able to re-use the demangler in the middle of
demangling a symbol while sharing back-ref state.

To make matters more complicated, there are a lot of ambiguities
when demangling a symbol's qualified name, because a function local
scope pattern (usually something like `?1??name?`) looks suspiciously
like many other possible things that can occur, such as `?1` meaning
the second back-ref and disambiguating these cases is rather
interesting.  The `?1?` in a local scope pattern is actually a special
case of the more general pattern of `? + <encoded number> + ?`, where
"encoded number" can itself have embedded `@` symbols, which is a
common delimeter in mangled names.  So we have to take care during the
disambiguation, which is the reason for the overly complicated
`isLocalScopePattern` function in this patch.

I've added some pretty obnoxious tests to exercise all of this, which
exposed several other problems related to back-referencing, so those
are fixed here as well. Finally, I've uncommented some tests that were
previously marked as `FIXME`, since now these work.

Differential Revision: https://reviews.llvm.org/D49965

llvm-svn: 338226
2018-07-30 03:12:34 +00:00
Zachary Turner
6861a55d08 [MS Demangler] Properly handle function parameter back-refs.
Properly demangle function parameter back-references.

Previously we treated lists of function parameters and template
parameters the same. There are some important differences with regards
to back-references, and some less important differences regarding which
characters can appear before or after the name.

The important differences are that with a given type T, all instances of
a function parameter list share the same global back-ref table.
Specifically, if X and Y are function pointers, then there are 3
entities in the declaration X func(Y) which all affect and are affected
by the master parameter back-ref table:
  1) The parameter list of X's function type
  2) the parameter list of func itself
  3) The parameter list of Y's function type.

The previous code would create a back-reference table that was local to
a single parameter list, so it would not be shared across parameter
lists.

This was discovered when porting ms-back-references.test from clang's
mangling tests. All of these tests should now pass with the new changes.

In doing so, I split the function for parsing template and function
parameters into two separate functions. This makes the template
parameter list parsing code in particular very small and easy to
understand now.

Differential Revision: https://reviews.llvm.org/D49875

llvm-svn: 338075
2018-07-26 22:13:39 +00:00
Zachary Turner
b81e08b31f [MS Demangler] Print calling convention inside parentheses.
For function pointers, we would print something like

int __cdecl (*)(int)

We need to move the calling convention inside, and print

int (__cdecl *)(int)

This patch implements this change for regular function pointers as
well as member function pointers.

llvm-svn: 338068
2018-07-26 20:33:48 +00:00
Zachary Turner
4fd5cfc7b8 [MS Demangler] Add ms-arg-qualifiers.test
This converts the arg qualifier mangling tests from
clang/CodeGenCXX/mangle-ms-arg-qualifiers.cpp to demangling tests.
Most tests already pass, so this patch doesn't come with any
functional change, just the addition of new tests.  The few tests
that don't pass are left in with a FIXME label so that they don't
run but serve as documentation about what still doesn't work.

llvm-svn: 338067
2018-07-26 20:25:35 +00:00
Zachary Turner
ae479b6222 Add missing tests from ms-mangle.cpp.
None of these tests pass yet so they are commented out, but I'm
adding them with a FIXME label so that they don't get lost when
copying tests over from clang's mangling tests.  Currently these
tests are all commented out.

llvm-svn: 338066
2018-07-26 20:20:29 +00:00
Zachary Turner
246526b1b4 [MS Demangler] Demangle pointers to member functions.
After this patch, we can now properly demangle pointers to member
functions.  The calling convention is located in the wrong place,
but this will be fixed in a followup since it also affects non
member function pointers.

Differential Revision: https://reviews.llvm.org/D49639

llvm-svn: 338065
2018-07-26 20:20:10 +00:00