mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

132322 Commits

Author SHA1 Message Date
Matt Arsenault
e21e61958d AMDGPU: Fix inconsistent lowering of select of vectors
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.

llvm-svn: 270731
2016-05-25 17:34:58 +00:00
Sanjay Patel
e582594538 [x86] avoid code explosion from LoopVectorizer for gather loop (PR27826)
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826

There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.

Differential Revision: http://reviews.llvm.org/D20601

llvm-svn: 270729
2016-05-25 17:27:54 +00:00
Xinliang David Li
e9df93e237 Use new triple API to check if comdat is supported
llvm-svn: 270727
2016-05-25 17:17:51 +00:00
Xinliang David Li
e9cf4364da Add a new helper API in triple /NFC
llvm-svn: 270726
2016-05-25 17:11:31 +00:00
Chris Bieneman
d8d2780ae0 [obj2yaml] [yaml2obj] MachO support for rebase opcodes
This is the first bit of support for MachO __LINKEDIT segment data.

llvm-svn: 270724
2016-05-25 17:09:07 +00:00
Chris Bieneman
2b1c02aa22 [CMake] LINK_LIBS need to be public for Darwin dylib targets
This should actually address PR27855. This results in adding references to the system libs inside generated dylibs so that they get correctly pulled in when linking against the dylib.

llvm-svn: 270723
2016-05-25 17:08:43 +00:00
Tim Shen
198e5cb8a0 Move and add comments to the top for tailcall-string-rvo.ll
Differential Revision: http://reviews.llvm.org/D20311

llvm-svn: 270722
2016-05-25 17:01:09 +00:00
Hal Finkel
c1f6823ee0 [SDAG] Add a fallback multiplication expansion
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

llvm-svn: 270720
2016-05-25 16:50:22 +00:00
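The half-word multiplication idea that c1f6823ee0 attributes to Hacker's Delight can be illustrated at a smaller scale. The following is a minimal sketch, not the code from the patch: it multiplies two 64-bit values into a 128-bit result using only 64-bit arithmetic. The names `U128` and `mulWide` are hypothetical and exist only for this illustration; the patch itself operates on SelectionDAG nodes for much wider integer types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 128-bit product stored as two 64-bit halves.
struct U128 {
  uint64_t lo;
  uint64_t hi;
};

// Schoolbook multiplication on 32-bit halves. Every intermediate sum is
// bounded by (2^32 - 1)^2 + (2^32 - 1) < 2^64, so nothing overflows
// before the carries are extracted.
inline U128 mulWide(uint64_t a, uint64_t b) {
  const uint64_t mask = 0xFFFFFFFFu;
  const uint64_t a0 = a & mask, a1 = a >> 32;
  const uint64_t b0 = b & mask, b1 = b >> 32;

  uint64_t t = a0 * b0;        // low half times low half
  const uint64_t w0 = t & mask;
  uint64_t k = t >> 32;        // carry into the next 32-bit column

  t = a1 * b0 + k;             // first cross term plus carry
  const uint64_t w1 = t & mask;
  const uint64_t w2 = t >> 32;

  t = a0 * b1 + w1;            // second cross term plus the middle word
  k = t >> 32;

  U128 r;
  r.lo = (t << 32) | w0;       // the two low 32-bit columns
  r.hi = a1 * b1 + w2 + k;     // high half times high half plus carries
  assert(a != 0 || (r.lo == 0 && r.hi == 0)); // sanity check: 0 * b == 0
  return r;
}
```

For example, `mulWide(UINT64_MAX, UINT64_MAX)` yields `hi == 0xFFFFFFFFFFFFFFFE` and `lo == 1`, which matches (2^64 - 1)^2 = 2^128 - 2^65 + 1.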
Teresa Johnson
de4624ad69 [ThinLTO] Fix test check prefix so that intended prefix tested
There aren't any checks with prefix PROMOTE, should be PROMOTE_MOD1
which wasn't being tested (but works as expected).

llvm-svn: 270719
2016-05-25 16:45:08 +00:00
Sanjay Patel
289425eb9f [x86, AVX] allow explicit calls to VZERO* to modify state in VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't fix the bug completely.

Differential Revision: http://reviews.llvm.org/D20529

llvm-svn: 270718
2016-05-25 16:39:47 +00:00
Lang Hames
0a4d39f9bf [RuntimeDyld] Call the SymbolResolver::findSymbolInLogicalDylib method when
searching for external symbols, and fall back to the SymbolResolver::findSymbol
method if the former returns null.

This makes RuntimeDyld behave more like a static linker: Symbol definitions
from within the current module's "logical dylib" will be preferred to
external definitions. We can build on this behavior in the future to properly
support weak symbol handling.

Custom symbol resolvers that override the findSymbolInLogicalDylib method may
notice changes due to this patch. Clients who have not overridden this method
should generally be unaffected, however users of the OrcMCJITReplacement class
may notice changes.

llvm-svn: 270716
2016-05-25 16:23:59 +00:00
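The lookup order described in 0a4d39f9bf can be modeled with a toy resolver. This is a sketch under stated assumptions, not the RuntimeDyld implementation: `ToySymbol`, `ToyResolver`, and `resolveExternal` are hypothetical names, and the two maps stand in for whatever storage a real resolver uses. Only the method names `findSymbolInLogicalDylib` and `findSymbol` come from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a resolved symbol; address 0 means "not found".
struct ToySymbol {
  uint64_t address;
  ToySymbol() : address(0) {}
  explicit ToySymbol(uint64_t a) : address(a) {}
  explicit operator bool() const { return address != 0; }
};

// Hypothetical two-level resolver mirroring the two methods named in the
// commit message.
struct ToyResolver {
  std::map<std::string, uint64_t> logicalDylib; // same "logical dylib"
  std::map<std::string, uint64_t> external;     // everything else

  ToySymbol findSymbolInLogicalDylib(const std::string &name) const {
    auto it = logicalDylib.find(name);
    return it == logicalDylib.end() ? ToySymbol() : ToySymbol(it->second);
  }

  ToySymbol findSymbol(const std::string &name) const {
    auto it = external.find(name);
    return it == external.end() ? ToySymbol() : ToySymbol(it->second);
  }
};

// Prefer the logical-dylib definition; fall back to the general lookup
// only when the first query returns a null symbol.
inline ToySymbol resolveExternal(const ToyResolver &r,
                                 const std::string &name) {
  assert(!name.empty() && "symbol name must be non-empty");
  if (ToySymbol s = r.findSymbolInLogicalDylib(name))
    return s;
  return r.findSymbol(name);
}
```

With `foo` defined in both maps, `resolveExternal` returns the logical-dylib address, which is the static-linker-like preference the message describes.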
Chad Rosier
8544c01533 Clarify that we match BSwap in InstCombine and BitReverse in CGP. NFC.
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments is
consistent with the function name.

llvm-svn: 270715
2016-05-25 16:22:14 +00:00
Simon Pilgrim
a16bac494b [X86][AVX] Sync with clang/test/CodeGen/avx2-builtins.c
Only tests for the gather intrinsic are still to be added

llvm-svn: 270710
2016-05-25 15:30:08 +00:00
Teresa Johnson
655b4d9f20 [ThinLTO] Refactor ODR resolution and internalization (NFC)
Move the now index-based ODR resolution and internalization routines out
of ThinLTOCodeGenerator.cpp and into either LTO.cpp (index-based
analysis) or FunctionImport.cpp (index-driven optimizations).
This is to enable usage by other linkers.

llvm-svn: 270698
2016-05-25 14:03:11 +00:00
Oleg Ranevskyy
34bf60ca68 [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
Summary:
**Description**

This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing inefficient code.

When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call.
`getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`.

Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr.

This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.

**Minimal reproducer:**
```
int foo(int a, int b, int c);
int baz();

void bar()
{
   int arr[20];
   int i = 0;

   for (i = 0; i < 4; ++i)
     arr[i] = baz();

   for (; i < 20; ++i)
     arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}
```

**Clang command line:**
```
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir
```

**Expected result:**
The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated.

Reviewers: sanjoy

Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20058

llvm-svn: 270695
2016-05-25 13:01:33 +00:00
Renato Golin
bde30e2034 [AArch64] Adding a TargetParser for AArch64
There's already an ARMTargetParser; this adds a similar one for AArch64
so we can use it to do ARCH/CPU/FPU parsing in clang and llvm, instead of
string comparison.

Patch by Jojo Ma.

llvm-svn: 270687
2016-05-25 12:02:33 +00:00
Simon Pilgrim
e806c3471c [X86][AVX2] Added more fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270685
2016-05-25 10:56:23 +00:00
Simon Pilgrim
eb7d07a957 [X86][AVX2] Begun adding fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270683
2016-05-25 10:15:06 +00:00
Simon Pilgrim
b4a440d8b8 [X86][SSE2] Use storeu intrinsics for _mm_storeu_pd/_mm_storeu_pd tests
Also fixed name of _mm_store1_pd test

llvm-svn: 270681
2016-05-25 09:42:29 +00:00
Simon Pilgrim
4950d6bb8d [X86][SSE] Use storeu intrinsics for _mm_storeu_ps test
llvm-svn: 270680
2016-05-25 09:28:06 +00:00
Simon Pilgrim
1a1ddc32da [X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.

Differential Revision: http://reviews.llvm.org/D20568

llvm-svn: 270678
2016-05-25 08:59:18 +00:00
Craig Topper
4710ab1424 [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
Gerolf Hoflehner
1352c2ef4f [Support] Reapply cleanup r270643
llvm-svn: 270674
2016-05-25 06:23:45 +00:00
David Majnemer
3c41824d16 [FunctionAttrs] Volatile loads should disable readonly
A volatile load has side effects beyond what callers expect readonly to
signify.  For example, it is not safe to reorder two function calls
which each perform a volatile load to the same memory location.

llvm-svn: 270671
2016-05-25 05:53:04 +00:00
Gerolf Hoflehner
3ef084d157 [Support] revert previous commit r270643
llvm-svn: 270670
2016-05-25 05:51:05 +00:00
Zachary Turner
d50b930aa6 [llvm-pdbdump] Decipher the remaining PDB streams.
We now at least know the meaning of every stream of the
PDB file.  Yay!

llvm-svn: 270669
2016-05-25 05:49:48 +00:00
Saleem Abdulrasool
f7d75a7227 Revert "llvm-objdump: support dumping AUX records for weak externals"
Revert it until we can figure out the endianness issue.

llvm-svn: 270667
2016-05-25 05:45:02 +00:00
Saleem Abdulrasool
fc437395b1 Object: ensure that structures are fully defined
Ensure that the unused fields are explicitly stated when defining the types.
Add some compile time assertions about the size requirements for the structure
types.

llvm-svn: 270663
2016-05-25 05:23:02 +00:00
Zachary Turner
7e718f26a3 [llvm-pdbdump] Dump the IPI stream and all records.
llvm-svn: 270661
2016-05-25 04:35:22 +00:00
Rui Ueyama
9930c53d56 pdbdump: fix bug in name hash table.
name_ids() did not return all IDs but only the first NameCount items.
The number of non-zero entries in IDs vector is NameCount, but it
does not mean that all non-zero entries are at the beginning of IDs
vector.

Differential Revision: http://reviews.llvm.org/D20611

llvm-svn: 270656
2016-05-25 04:07:17 +00:00
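The bug described in 9930c53d56 is easy to reproduce with a plain vector. The sketch below is illustrative only and is not taken from the patch: `nameIdsFirstN` and `nameIdsNonZero` are hypothetical names, and scanning for non-zero entries is just one way to return every ID; the actual fix may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the buggy behavior: assumes the nameCount non-zero IDs are
// packed at the front of the vector and returns only that prefix.
inline std::vector<uint32_t> nameIdsFirstN(const std::vector<uint32_t> &ids,
                                           std::size_t nameCount) {
  assert(nameCount <= ids.size());
  return std::vector<uint32_t>(
      ids.begin(), ids.begin() + static_cast<std::ptrdiff_t>(nameCount));
}

// One possible correction (an assumption, not the patch): walk the whole
// vector and keep every non-zero entry, wherever it sits.
inline std::vector<uint32_t> nameIdsNonZero(const std::vector<uint32_t> &ids) {
  std::vector<uint32_t> out;
  for (uint32_t id : ids)
    if (id != 0)
      out.push_back(id);
  return out;
}
```

For `ids = {0, 7, 0, 0, 12, 3}` with three names, the prefix version returns `{0, 7, 0}` and misses two IDs, while the scan returns `{7, 12, 3}`.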
Zachary Turner
2984a1ea1d [llvm-pdbdump] Stream 0 isn't actually the MSF superblock.
Oddly enough, I realized we don't actually know what stream
0 is (if anything).

llvm-svn: 270655
2016-05-25 03:53:16 +00:00
Saleem Abdulrasool
144a2028c6 test: use a binary file instead
Generate the obj rather than use yaml2obj.  Hopefully, this fixes the PPC64 test
failures.

llvm-svn: 270654
2016-05-25 03:48:07 +00:00
Zachary Turner
fff89b5e93 [llvm-pdbdump] Dump stream summary list.
Try to figure out what each stream is, and dump its name.

This gives us a better picture of what streams we still don't
understand.

llvm-svn: 270653
2016-05-25 03:43:17 +00:00
Saleem Abdulrasool
c48e959a2e Support: remove outdated comment
This information is in the latest version of the specification.

llvm-svn: 270649
2016-05-25 01:59:36 +00:00
Saleem Abdulrasool
73cac6f912 llvm-objdump: support dumping AUX records for weak externals
This is a supported COFF feature.  Ensure that we can display the weak externals
auxiliary symbol.  It contains useful information (such as the default binding
and how to resolve the symbol).

llvm-svn: 270648
2016-05-25 01:59:32 +00:00
Davide Italiano
c01e301072 [PM] Port BDCE to the new pass manager.
llvm-svn: 270647
2016-05-25 01:57:04 +00:00
Nirav Dave
9f0b74dd18 Soften assertion in AMDGPU emitPrologue.
[AMDGPU] emitPrologue looks for an unused unallocated SGPR that is not
the scratch descriptor. Continue search if unused register found fails
other requirements.

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D20526

llvm-svn: 270646
2016-05-25 01:45:42 +00:00
Eugene Zelenko
6fcb7c07b5 Fix some Include What You Use warnings in examples; other minor fixes.
Differential revision: http://reviews.llvm.org/D20607

llvm-svn: 270645
2016-05-25 01:18:36 +00:00
Matthias Braun
3b9611aad5 ScheduleDAGInstrs: Fix memory corruption
We have to modify V2SU before inserting new elements into the
CurrentVRegDefs set because that may move V2SU in memory invalidating
the reference.

llvm-svn: 270644
2016-05-25 01:18:00 +00:00
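The ordering constraint in 3b9611aad5 is the usual reference-invalidation hazard of containers with contiguous storage. The sketch below uses `std::vector` as a stand-in for the CurrentVRegDefs set; `Entry` and `updateThenInsert` are hypothetical names, and this is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type standing in for the entries that a reference
// such as V2SU would point at.
struct Entry {
  int vreg;
  int su;
};

// Safe ordering: finish every write through the reference first, then
// insert. push_back may reallocate the storage, after which `ref` would
// dangle, so `ref` is not touched once the insertion has happened.
inline void updateThenInsert(std::vector<Entry> &defs, std::size_t idx,
                             int newSU, Entry extra) {
  assert(idx < defs.size());
  Entry &ref = defs[idx];
  ref.su = newSU;        // last use of ref
  defs.push_back(extra); // may move elements in memory
  // Writing ref.su here instead would be undefined behavior whenever
  // push_back reallocated.
}
```

Swapping the two statements compiles without complaint and often appears to work, which is why this class of bug tends to surface only as sporadic memory corruption.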
Gerolf Hoflehner
ec9d7e176c [Support] Cleanup of an ancient Darwin work-around in Signals.inc (PR26174)
Patch by Jeremy Huddleston Sequoia

llvm-svn: 270643
2016-05-25 00:54:39 +00:00
Derek Bruening
605cdbc11c [esan|wset] EfficiencySanitizer working set tool fastpath
Summary:
Adds fastpath instrumentation for esan's working set tool.  The
instrumentation for an intra-cache-line load or store consists of an
inlined write to shadow memory bits for the corresponding cache line.

Adds a basic test for this instrumentation.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D20483

llvm-svn: 270640
2016-05-25 00:17:24 +00:00
Kostya Serebryany
10bbd18b43 [libFuzzer] print stats if we crash on empty input
llvm-svn: 270639
2016-05-25 00:15:36 +00:00
Richard Smith
9b2e9d0752 Revert r270569 (teach llvm-mc to generate compressed debug sections in zlib
style). It appears that current ELF linkers are not ready for this.

llvm-svn: 270638
2016-05-25 00:14:12 +00:00
Zachary Turner
1bbdf5dfd8 [codeview] Add support for new types and symbols.
This patch adds support for:

S_EXPORT
LF_BITFIELD

With this patch, I have run through a couple of gigabytes of PDB
files and cannot find a type or symbol that we do not understand.

llvm-svn: 270637
2016-05-25 00:12:48 +00:00
Zachary Turner
b8f5397a29 [codeview] Add support for S_EXPORT symbol.
llvm-svn: 270636
2016-05-25 00:12:40 +00:00
Dan Gohman
e46ddfaa34 [WebAssembly] Put __stack_pointer in the offset field of loads and stores.
Instead of this:

i32.const       $push10=, __stack_pointer
i32.load        $push11=, 0($pop10)

Emit this:

i32.const       $push10=, 0
i32.load        $push11=, __stack_pointer($pop10)

It's not currently clear which is better, though there's a chance the second
form may be better at overall compression. We can revisit this when we have
more data; for now it makes sense to make PEI consistent with isel.

Differential Revision: http://reviews.llvm.org/D20411

llvm-svn: 270635
2016-05-24 23:47:41 +00:00
Mike Aizatsky
785d92e27f [libfuzzer] Trying random unit prefixes during corpus load.
Differential Revision: http://reviews.llvm.org/D20301

llvm-svn: 270632
2016-05-24 23:14:29 +00:00
Michael Zolotukhin
82048571c5 Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time.
This reverts commit r270577.

llvm-svn: 270630
2016-05-24 23:00:05 +00:00
Michael Zolotukhin
7bf254c21f [LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst.
This fixes PR27847. Now for real.

llvm-svn: 270629
2016-05-24 22:59:58 +00:00
Zachary Turner
4aa4d6e21a [codeview] Add support for new type records.
This adds support for parsing and dumping the following
symbol types:

S_LPROCREF
S_ENVBLOCK
S_COMPILE2
S_REGISTER
S_COFFGROUP
S_SECTION
S_THUNK32
S_TRAMPOLINE

As of this patch, the test PDB files no longer have any unknown
symbol types.

llvm-svn: 270628
2016-05-24 22:58:46 +00:00