mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

126525 Commits

Author SHA1 Message Date
Haicheng Wu
9d77533d54 [LIR] Add support for structs and hand unrolled loops
Now LIR can turn the following code into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(int *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}
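
As a rough illustration of the transformation described above, the sketch below pairs the field-by-field loop from bar() with the single memset call it is conceptually equivalent to. The function names zero_by_loop and zero_by_memset are made up for this example, and the equivalence assumes foo_t has no padding, which the static_assert checks.

```cpp
#include <cstring>

typedef struct foo {
  int a;
  int b;
} foo_t;

// Assumption for this sketch: the two int fields cover the whole struct,
// so zeroing every field is the same as zeroing every byte.
static_assert(sizeof(foo_t) == 2 * sizeof(int), "foo_t must have no padding");

// Field-by-field zeroing, written like bar() in the commit message.
void zero_by_loop(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

// The single call the loop above is conceptually equivalent to.
void zero_by_memset(foo_t *f, unsigned n) {
  std::memset(f, 0, n * sizeof(foo_t));
}
```

Both functions leave the array in the same state; the optimizer's job is to recognize that the stores in each iteration are adjacent and together cover a contiguous range.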

llvm-svn: 258620
2016-01-23 06:52:41 +00:00
Matthias Braun
f692315c07 Inline variable into assert
Seems like some compilers still give unused variable warnings for
bool var = ...;
(void)var;
so I have to inline the variable.
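
A minimal sketch of the two patterns this message contrasts, assuming a hypothetical helper compute_positive() that stands in for whatever expression was being asserted. Note that with NDEBUG defined the inlined expression is not evaluated at all, so this only suits expressions without needed side effects.

```cpp
#include <cassert>

// Hypothetical helper standing in for the asserted expression.
int compute_positive() { return 42; }

// Pattern that some compilers still warn about in release builds:
// the variable is only read inside assert().
void check_with_variable() {
  bool ok = compute_positive() > 0;
  (void)ok;
  assert(ok);
}

// Pattern after the change: no named variable exists to warn about.
void check_inlined() {
  assert(compute_positive() > 0);
}
```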

llvm-svn: 258619
2016-01-23 06:49:29 +00:00
NAKAMURA Takumi
c47ea12e00 AArch64ISelLowering.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 258618
2016-01-23 06:34:59 +00:00
Junmo Park
d7f46a4f6c Remove extra whitespace. NFC.
llvm-svn: 258617
2016-01-23 06:34:36 +00:00
David Majnemer
f62478a34a [PruneEH] Don't try to insert a terminator after another terminator
LLVM's BasicBlock has a single terminator; it is not valid to have two.

llvm-svn: 258616
2016-01-23 06:00:44 +00:00
Manuel Jacob
865681354b Put space after pointer type in test. NFC.
llvm-svn: 258615
2016-01-23 05:47:34 +00:00
Matt Arsenault
2d71f7b1ea AMDGPU: Replace some deprecated intrinsic uses in tests
llvm-svn: 258614
2016-01-23 05:42:49 +00:00
Matt Arsenault
cd6d4b4414 AMDGPU: Run instnamer on a few tests
This will make future test updates easier

llvm-svn: 258613
2016-01-23 05:42:43 +00:00
Matt Arsenault
fe8ee22547 AMDGPU: Remove more unused intrinsics
Replace tests with lrp with basic IR expansion

llvm-svn: 258612
2016-01-23 05:42:38 +00:00
David Majnemer
7a3addc91c [PruneEH] FuncletPads must not have undef operands
Instead of RAUW with undef, replace the first non-token instruction with
unreachable.

This fixes PR26263.

llvm-svn: 258611
2016-01-23 05:41:29 +00:00
David Majnemer
09858a3961 [PruneEH] Unify invoke and call handling in DeleteBasicBlock
No functionality change is intended.

llvm-svn: 258610
2016-01-23 05:41:27 +00:00
David Majnemer
0728f4a41f [PruneEH] Reuse code from removeUnwindEdge
PruneEH had functionality identical to removeUnwindEdge.
Consolidate around removeUnwindEdge.
No functionality change is intended.

llvm-svn: 258609
2016-01-23 05:41:22 +00:00
Matt Arsenault
38b08addbb AMDGPU: Move amdgcn intrinsic handling into SITargetLowering
llvm-svn: 258608
2016-01-23 05:32:20 +00:00
Matt Arsenault
f305746857 AMDGPU: Remove IntrNoMem from llvm.SI.sendmsg
This has side effects.

llvm-svn: 258607
2016-01-23 05:32:18 +00:00
Matt Arsenault
c3ec12b749 AMDGPU: Remove Feature64BitPtr
This is a leftover from AMDIL that doesn't do anything
and doesn't belong here.

llvm-svn: 258606
2016-01-23 05:32:14 +00:00
Matthias Braun
0892910f16 AArch64ISel: Fix ccmp code selection matching deep expressions.
Some of the conditions necessary to produce ccmp sequences were only
checked in recursive calls to emitConjunctionDisjunctionTree() after
some of the earlier expressions were already built. Move all checks over
to isConjunctionDisjunctionTree() so they are all checked before we
start emitting instructions.

Also rename some variables to better reflect their usage.

llvm-svn: 258605
2016-01-23 04:05:22 +00:00
Matthias Braun
da14179563 AArch64ISelLowering: Reduce maximum recursion depth of isConjunctionDisjunctionTree()
This function will exhibit exponential runtime (2**n) so we should
rather use a lower limit.
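
To see why a depth limit matters for a 2**n recursion, here is a small sketch that counts the visits made by a walk that recurses into two operands at every level. count_visits is a hypothetical stand-in, not the actual isConjunctionDisjunctionTree() code.

```cpp
#include <cstdint>

// Hypothetical stand-in for a check that recurses into both operands of
// every node. Returns how many nodes are visited down to max_depth.
std::uint64_t count_visits(unsigned depth, unsigned max_depth) {
  if (depth > max_depth)
    return 0;
  return 1 + count_visits(depth + 1, max_depth) +
         count_visits(depth + 1, max_depth);
}
```

For a limit of n the count is 2**(n+1) - 1, so each extra level roughly doubles the work.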

llvm-svn: 258604
2016-01-23 04:05:18 +00:00
Matthias Braun
a0ca239a6c Fix wrong indentation
llvm-svn: 258603
2016-01-23 04:05:16 +00:00
NAKAMURA Takumi
b5bfd9266e AlignOf.h: Appease g++-4.7 for now. Will fix later.
llvm-svn: 258600
2016-01-23 02:22:36 +00:00
Derek Schuff
0558165991 [WebAssembly] Fix RegNumbering for the stack pointer
Previously it failed to add NumArgRegs to the offset and so clobbered an
already-used register. Now just start the numbering after the arg regs
and don't duplicate the add. Test coverage for this coming shortly with
the implementation of byval.

llvm-svn: 258597
2016-01-23 01:20:43 +00:00
Kostya Serebryany
548cef831b [libFuzzer] add more fields to DictionaryEntry to count the number of uses and successes
llvm-svn: 258589
2016-01-22 23:55:14 +00:00
Reid Kleckner
b68cad5bbe [cmake] Disable manifest generation when LLD is the linker
Running mt.exe to make the manifest is really slow. Disabling manifest
generation doesn't seem to break anything.

llvm-svn: 258581
2016-01-22 23:27:13 +00:00
David Majnemer
a2ed036c0a [WinEH] Let cleanups post-dominated by unreachable get executed
Cleanups in C++ are a little weird.  They are guaranteed to be
reliably executed if, and only if, there is a viable catch handler which
can handle the exception.

This means that reachability of a cleanup is lexically determined by it
being nested within a try-block which unwinds to a catch.  It *cannot*
be reasoned about by examining the control flow edges leaving a cleanup.

Usually this is not a problem.  It becomes a problem when there are *no*
edges out of a cleanup because we believed that code post-dominated by
the cleanup is dead.  In LLVM's case, this code is what informs the
personality routine about the presence of a suitable catch handler.
However, the lack of edges to that catch handler makes the handler
become unreachable which causes us to remove it.  By removing the
handler, the cleanup becomes unreachable.

Instead, inject a catch-all handler with every cleanup that has no
unwind edges.  This will allow us to properly unwind the stack.
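
The source-level guarantee this message relies on can be sketched as follows. The names Cleanup, throws_after_setup, and cleanup_runs_when_caught are invented for the example. With a matching handler, the destructor (the cleanup) runs during unwinding. Without any handler, the C++ standard leaves it implementation-defined whether the stack is unwound before std::terminate is called.

```cpp
#include <stdexcept>

// Records in a flag that its destructor (the "cleanup") ran.
struct Cleanup {
  bool &ran;
  explicit Cleanup(bool &r) : ran(r) {}
  ~Cleanup() { ran = true; }
};

// Throws while a Cleanup object is live on the stack.
void throws_after_setup(bool &ran) {
  Cleanup c(ran);
  throw std::runtime_error("boom");
}

// A viable catch handler exists here, so unwinding must run ~Cleanup().
bool cleanup_runs_when_caught() {
  bool ran = false;
  try {
    throws_after_setup(ran);
  } catch (const std::runtime_error &) {
    // Handler exists only to make the exception catchable.
  }
  return ran;
}
```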

This fixes PR25997.

llvm-svn: 258580
2016-01-22 23:20:43 +00:00
Kevin Enderby
9b924af8f7 Fix the code that leads to the incorrect trigger of the report_fatal_error()
in MachOObjectFile::getSymbolByIndex() when a Mach-O file has
a symbol table load command but the number of symbols is zero.

The code in MachOObjectFile::symbol_begin_impl() should not
assume there is a symbol at index 0 when there is no symbol
table load command or the symbol count is zero.  So I also fixed
that.  I also needed to fix MachOObjectFile::symbol_end_impl() to
do the same thing for no symbol table or one with zero entries.

The code in MachOObjectFile::getSymbolByIndex() should trigger
the report_fatal_error() for programmatic errors for any index when
there is no symbol table load command and not return the end iterator.
So also fixed that. Note there is no test case as this is a programmatic
error.

The test case using the file macho-invalid-bad-symbol-index has
a symbol table load command whose number of symbols (nsyms)
is zero.  It was incorrectly testing the bad triggering of the
report_fatal_error() in MachOObjectFile::getSymbolByIndex().

This test case is an invalid Mach-O file but not for that reason.
It appears this Mach-O file used to have an nsyms value of 11,
and what makes this Mach-O file invalid is that the counts and
indexes into the symbol table of the dynamic load command
are now invalid because the number of symbol table entries
(nsyms) is now zero.  This can be seen with the existing
llvm-objdump:

% llvm-objdump -private-headers macho-invalid-bad-symbol-index
…
Load command 4
     cmd LC_SYMTAB
 cmdsize 24
  symoff 4216
   nsyms 0
  stroff 4392
 strsize 144
Load command 5
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 8 (past the end of the symbol table)
     iextdefsym 8 (greater than the number of symbols)
     nextdefsym 2 (past the end of the symbol table)
      iundefsym 10 (greater than the number of symbols)
      nundefsym 1 (past the end of the symbol table)
...

And the native darwin tools generate an error for this file:

% nm macho-invalid-bad-symbol-index
nm: object: macho-invalid-bad-symbol-index truncated or malformed object (ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table)

I added new checks for the indexes and sizes for these in the
constructor of MachOObjectFile.  And added comments for what
would be proper diagnostic messages.

And changed the test case using macho-invalid-bad-symbol-index
to test for the new error now produced.

Also added a test with a valid Mach-O file with a symbol table
load command where the number of symbols is zero that shows
the report_fatal_error() is not called.
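
The kind of index and size check described above might look like the sketch below. This is not the MachOObjectFile code: the structs, function names, and message wording are assumptions made for illustration, with field names taken from the llvm-objdump output and messages modeled on the nm diagnostic quoted earlier.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified views of the LC_SYMTAB and LC_DYSYMTAB fields.
struct SymtabCounts {
  std::uint32_t nsyms;
};
struct DysymtabRanges {
  std::uint32_t ilocalsym, nlocalsym;
  std::uint32_t iextdefsym, nextdefsym;
  std::uint32_t iundefsym, nundefsym;
};

// Checks one (index, count) pair against nsyms. Returns an empty string
// when the range is fine. Empty ranges are skipped (an assumption here).
static std::string check_range(const char *index_name, const char *count_name,
                               std::uint32_t index, std::uint32_t count,
                               std::uint32_t nsyms) {
  if (count == 0)
    return "";
  if (index > nsyms)
    return std::string(index_name) + " greater than the number of symbols";
  // Sum in 64 bits so index + count cannot wrap around.
  if (std::uint64_t(index) + count > nsyms)
    return std::string(index_name) + " plus " + count_name +
           " extends past the end of the symbol table";
  return "";
}

// Returns the first problem found, or an empty string if all ranges fit.
std::string validate_dysymtab(const SymtabCounts &st, const DysymtabRanges &d) {
  std::string err =
      check_range("ilocalsym", "nlocalsym", d.ilocalsym, d.nlocalsym, st.nsyms);
  if (err.empty())
    err = check_range("iextdefsym", "nextdefsym", d.iextdefsym, d.nextdefsym,
                      st.nsyms);
  if (err.empty())
    err = check_range("iundefsym", "nundefsym", d.iundefsym, d.nundefsym,
                      st.nsyms);
  return err;
}
```

Fed the values from the dump above (nsyms 0 with nlocalsym 8), the first check already fails; with nsyms restored to 11 all three ranges fit.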

llvm-svn: 258576
2016-01-22 22:49:55 +00:00
Ivan Krasin
ce1bcd8c31 Use std::piecewise_constant_distribution instead of ad-hoc binary search.
Summary:
Fix the issue with the most recently discovered unit receiving much less attention.
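
For readers unfamiliar with std::piecewise_constant_distribution, the sketch below shows one way it can replace a hand-written binary search over cumulative weights. How the patched code actually sets up its intervals and weights is not shown in this message, so pick_weighted and its unit-width intervals are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Picks an index in [0, weights.size()) with probability proportional to
// weights[i]. Interval i is [i, i+1), so truncating the sample gives i.
// Assumes weights is non-empty and its entries are positive.
std::size_t pick_weighted(const std::vector<double> &weights,
                          std::mt19937 &rng) {
  std::vector<double> boundaries(weights.size() + 1);
  for (std::size_t i = 0; i < boundaries.size(); ++i)
    boundaries[i] = static_cast<double>(i);
  std::piecewise_constant_distribution<double> dist(
      boundaries.begin(), boundaries.end(), weights.begin());
  std::size_t idx = static_cast<std::size_t>(dist(rng));
  // Defensive clamp in case rounding ever yields the upper boundary.
  return std::min(idx, weights.size() - 1);
}
```

Giving the newest entry a larger weight makes it proportionally more likely to be picked, which is the kind of attention bias the summary mentions.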

Note: this is the second attempt (prev: r258473). Now, libc++ build is fixed.

Reviewers: aizatsky, kcc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16487

llvm-svn: 258571
2016-01-22 22:28:27 +00:00
Weiming Zhao
e73494fde9 Fix LivePhysRegs::addLiveOuts
Summary:
The test for returnBB was flipped, which may cause the ARM ld/st opt pass to use callee-saved regs in returnBB when shrink-wrap is used.

Reviewers: t.p.northover, apazos, MatzeB

Subscribers: mcrosier, zzheng, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D16434

llvm-svn: 258569
2016-01-22 22:21:34 +00:00
Sanjay Patel
40baae90ce fixed to test features, not CPU models
llvm-svn: 258568
2016-01-22 22:20:56 +00:00
Sanjay Patel
6ef10254c1 fix typos; NFC
llvm-svn: 258567
2016-01-22 22:09:41 +00:00
Owen Anderson
28c07122cf Strip local symbols when using externalized debug info.
When we build LLVM with externalized debug info, all debugging and
symbolication related data is extracted into dSYM files prior to
stripping. As such, there is no need to preserve local symbols in LLVM
binaries after dSYM creation.

This shrinks libLLVM.dylib from 58MB to 55MB on my system.

llvm-svn: 258566
2016-01-22 22:07:24 +00:00
Davide Italiano
1b7063b775 [gold] Remove inconsistent llvm_unreachable().
Differential Revision: http://reviews.llvm.org/D16429

llvm-svn: 258561
2016-01-22 21:36:49 +00:00
Matt Arsenault
919da0fa79 AMDGPU: Remove GCCBuiltin from intrinsics that need mangling
If the intrinsic is overloaded and works on multiple types,
it cannot resolve to a single corresponding builtin and requires
handling in clang. This just causes crashes now.

llvm-svn: 258559
2016-01-22 21:30:46 +00:00
Matt Arsenault
3913a77bb9 AMDGPU: Add new name for barrier intrinsic
llvm-svn: 258558
2016-01-22 21:30:43 +00:00
Matt Arsenault
7a5e15697d AMDGPU: Rename intrinsics to use amdgcn prefix
The intrinsic target prefix should match the target name
as it appears in the triple.

This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatibility for now.

llvm-svn: 258557
2016-01-22 21:30:34 +00:00
Sergei Larin
7b219abac0 Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.
Summary:
Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object.

A good example of improper behavior in the current implementation is section information associated with the GlobalObject. If a section was set for it, and GlobalOpt is creating/modifying a new object based on this one (often copying the original name), then without this change the new object will be placed in the default section, resulting in inappropriate properties of the new variable.
The argument here is that if the customer specified a section for a variable, any changes the compiler makes to it should not change that section allocation.
Moreover, any other properties worth representing in copyAttributesFrom() should also be propagated.

Reviewers: jmolloy, joker-eph, joker.eph

Subscribers: slarin, joker.eph, rafael, tobiasvk, llvm-commits

Differential Revision: http://reviews.llvm.org/D16074

llvm-svn: 258556
2016-01-22 21:18:20 +00:00
Nico Weber
85ea6e2ccc Make InstProfWriter compile again after 258544 with MSVC.
\src\llvm-rw\include\llvm/Support/AlignOf.h(254) :
    error C2872: 'detail' : ambiguous symbol
        could be 'llvm::detail'
        or       'llvm::support::detail'

llvm-svn: 258553
2016-01-22 21:13:04 +00:00
Sanjay Patel
ada0c1bc05 function names start with a lowercase letter; NFC
llvm-svn: 258552
2016-01-22 21:11:47 +00:00
Sanjoy Das
26d6272ad2 [PlaceSafepoints] Introduce a -spp-no-statepoints flag
Summary:
This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that
bypasses the code that wraps newly introduced polls and existing calls
in gc.statepoint.  With `-spp-no-statepoints` enabled, PlaceSafepoints
effectively becomes a safepoint **poll** insertion pass.

The eventual goal is to "constant fold" this option, along with
`-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint
are okay doing so.

Reviewers: pgavlin, reames, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16439

llvm-svn: 258551
2016-01-22 21:02:55 +00:00
Xinliang David Li
42a10f05d4 [PGO] Remove use of static variable. /NFC
Make the variable a member of the writer trait object owned
now by the writer. Also use a different generator interface
to pass the infoObject from the writer.

llvm-svn: 258544
2016-01-22 20:25:56 +00:00
Ahmed Bougacha
8af301da92 [AArch64] Cleanup ccmp test check labels. NFC.
llvm-svn: 258541
2016-01-22 20:02:26 +00:00
Rafael Espindola
9b04f1173e Typo fix and simplification.
Thanks to Justin Bogner for the suggestion.

llvm-svn: 258540
2016-01-22 19:58:18 +00:00
Xinliang David Li
960dc56746 Revert 258486 -- for a better fix coming soon
llvm-svn: 258538
2016-01-22 19:53:31 +00:00
Matt Arsenault
2b88adb9bd AMDGPU: Fix crash with invariant markers
The promote alloca pass didn't handle these intrinsics and crashed.
These intrinsics should accept any address space, but for now just
erase them to avoid breaking.

llvm-svn: 258537
2016-01-22 19:47:54 +00:00
Jingyue Wu
90a1a65026 [NVPTX] expand mul_lohi to mul_lo and mul_hi
Summary: Fixes PR26186.

Reviewers: grosser, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D16479

llvm-svn: 258536
2016-01-22 19:47:26 +00:00
Rafael Espindola
c626c394ea Add ArrayRef support to EndianStream.
Using an array instead of ArrayRef would allow type inference, but
(short of using C99) one would still need to write

    typedef uint16_t VT[];
    LE.write(VT{0x1234, 0x5678});
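
As a self-contained illustration of writing a run of values in little-endian order, here is a sketch using a hypothetical LittleEndianWriter. It is not LLVM's EndianStream, and the pointer-plus-length overload only approximates what an ArrayRef overload provides.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal writer that appends values least-significant byte
// first. Not the LLVM class; names are made up for this sketch.
struct LittleEndianWriter {
  std::vector<std::uint8_t> bytes;

  // Writes a single 16-bit value.
  void write(std::uint16_t v) {
    bytes.push_back(static_cast<std::uint8_t>(v & 0xff));
    bytes.push_back(static_cast<std::uint8_t>(v >> 8));
  }

  // Writes a contiguous run of values, standing in for an ArrayRef overload.
  void write(const std::uint16_t *vals, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
      write(vals[i]);
  }
};
```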

llvm-svn: 258535
2016-01-22 19:44:46 +00:00
Ahmed Bougacha
7980e233f5 [AArch64] Simplify emitConditionalCompare calls. NFC.
Now that both callsites are identical, we can simplify the
prototype and make it easier to reason about the 2-CC case.

llvm-svn: 258534
2016-01-22 19:43:57 +00:00
Ahmed Bougacha
1c71a2aac6 [AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs.
The current behavior is incorrect, as the two CCs returned by
changeFPCCToAArch64CC, intended to be OR'ed, are instead used
in an AND ccmp chain.

Consider:
define i32 @t(float %a, float %b, float %c, float %d, i32 %e, i32 %f) {
  %cc1 = fcmp one float %a, %b
  %cc2 = fcmp olt float %c, %d
  %and = and i1 %cc1, %cc2
  %r = select i1 %and, i32 %e, i32 %f
  ret i32 %r
}

Assuming (%a < %b) and (%c < %d); we used to do:
  fcmp  s0, s1            # nzcv <- 1000
  orr   w8, wzr, #0x1     # w8 <- 1
  csel  w9, w8, wzr, mi   # w9 <- 1
  csel  w8, w8, w9, gt    # w8 <- 1
  fcmp  s2, s3            # nzcv <- 1000
  cset   w9, mi           # w9 <- 1
  tst    w8, w9           # (w8 & w9) == 1, so: nzcv <- 0000
  csel  w0, w0, w1, ne    # w0 <- w0

We now do:
  fcmp  s2, s3            # nzcv <- 1000
  fccmp s0, s1, #0, mi    #  mi, so: nzcv <- 1000
  fccmp s0, s1, #8, le    # !le, so: nzcv <- 1000
  csel  w0, w0, w1, pl    # !pl, so: w0 <- w1

In other words, we transformed:
  (c < d) &&  ((a < b) || (a > b))
into:
  (c < d) &&   (a u>= b) && (a u<= b)
whereas, per De Morgan's, we wanted:
  (c < d) && !((a u>= b) && (a u<= b))

Note that this problem doesn't occur in the test-suite.

changeFPCCToAArch64CC produces disjunct CCs; here, one -> mi/gt.
We can't represent that in the fccmp chain; it can't express
arbitrary OR sequences, as one comment explains:
  In general we can create code for arbitrary "... (and (and A B) C)"
  sequences.  We can also implement some "or" expressions, because
  "(or A B)" is equivalent to "not (and (not A) (not B))" and we can
  implement some  negation operations. [...] However there is no way
  to negate the result of a partial sequence.

Instead, introduce changeFPCCToANDAArch64CC, which produces the
conjunct cond codes:
- (a one b)
    == ((a olt b) || (a ogt b))
    == ((a ord b) && (a une b))
- (a ueq b)
    == ((a uno b) || (a oeq b))
    == ((a ule b) && (a uge b))
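
The two identities above can be checked directly with ordinary float comparisons, including the NaN cases. The helpers below are plain C++ models of the LLVM fcmp predicates, written for this sketch in a made-up fpcc namespace; they are not part of the patch.

```cpp
#include <cmath>

namespace fpcc {
// Ordered: neither operand is NaN. Unordered: at least one is NaN.
bool ord(float a, float b) { return !std::isnan(a) && !std::isnan(b); }
bool uno(float a, float b) { return !ord(a, b); }

// Ordered "not equal": false whenever either operand is NaN.
bool one(float a, float b) { return a < b || a > b; }
// Unordered "not equal": true whenever either operand is NaN.
bool une(float a, float b) { return !(a == b); }

// Unordered "equal", "less or equal", and "greater or equal".
bool ueq(float a, float b) { return uno(a, b) || a == b; }
bool ule(float a, float b) { return !(a > b); }
bool uge(float a, float b) { return !(a < b); }

// True when both conjunct forms agree with the original predicates.
bool identities_hold(float a, float b) {
  return one(a, b) == (ord(a, b) && une(a, b)) &&
         ueq(a, b) == (ule(a, b) && uge(a, b));
}
} // namespace fpcc
```

This assumes IEEE comparison semantics, so it should not be built with options such as -ffast-math that let the compiler ignore NaN.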

Note that, at first, one might think that, when PushNegate is true,
we should use the disjunct CCs, in effect doing:
  (a || b)
  = !(!a && !(b))
  = !(!a && !(b1 || b2))  <- changeFPCCToAArch64CC(b, b1, b2)
  = !(!a && !b1 && !b2)

However, we can take advantage of the fact that the CC is already
negated, which lets us avoid special-casing PushNegate and doing
the simpler to reason about:

  (a || b)
  = !(!a && (!b))
  = !(!a && (b1 && b2))   <- changeFPCCToANDAArch64CC(!b, b1, b2)
  = !(!a && b1 && b2)

This makes both emitConditionalCompare cases behave identically,
and produces correct ccmp sequences for the 2-CC fcmps.

llvm-svn: 258533
2016-01-22 19:43:54 +00:00
Ahmed Bougacha
3a901cfda8 [AArch64] Assert that CCMP isel didn't fail inconsistently.
We verify that the op tree is eligible for CCMP emission in
isConjunctionDisjunctionTree, but it's also possible that
emitConjunctionDisjunctionTree fails later.
The initial check is useful, as it avoids building nodes
that will get discarded.
Still, make sure that inconsistencies don't happen with
an assert.

llvm-svn: 258532
2016-01-22 19:43:43 +00:00
Sanjoy Das
a81b52c690 [RS4GC] Use OB_deopt instead of "deopt"
llvm-svn: 258529
2016-01-22 19:20:40 +00:00
Krzysztof Parzyszek
7ec3ade80f [Hexagon] Use general purpose registers to spill pred/mod registers into
Patch by Tobias Edler Von Koch.

llvm-svn: 258527
2016-01-22 19:15:58 +00:00
Matt Arsenault
8d0283f1a9 AMDGPU: Fix getArchTypePrefix
llvm-svn: 258525
2016-01-22 19:09:12 +00:00