1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

169738 Commits

Author SHA1 Message Date
Luke Cheeseman
0b7f5a040e Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method

llvm-svn: 343317
2018-09-28 13:37:27 +00:00
Sven van Haastregt
557d36975e Fix and modernize StringMatcher comment; NFC
llvm-svn: 343316
2018-09-28 13:31:55 +00:00
Petar Jovanovic
4d0632639a [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409

llvm-svn: 343315
2018-09-28 13:28:47 +00:00
Simon Pilgrim
4ed0763c6c [X86][Btver2] CVTSS2I/CVTSD2I - add missing JFPU0 pipe
We issue JFPU1->JSTC then JFPU0->JFPA then -> JALU0 (integer pipe)

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343314
2018-09-28 13:19:22 +00:00
Jonas Devlieghere
2f26e57f1f Split invocations in CodeGen/X86/cpus.ll among multiple tests. (NFC)
On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan
and UBSan enabled. With the same configuration on my machine, the test
passes but takes more than 3 minutes to do so. I could increase the
timeout, but I believe it makes more sense to split up the test because
it allows for more parallelism.

Differential revision: https://reviews.llvm.org/D52603

llvm-svn: 343313
2018-09-28 12:08:51 +00:00
Andrea Di Biagio
09898692b3 [llvm-mca] Remove redundant namespace prefixes. NFC
We are already "using" namespace llvm in all the files modified by this change.

llvm-svn: 343312
2018-09-28 10:47:24 +00:00
Simon Pilgrim
c60b319823 [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

llvm-svn: 343311
2018-09-28 10:26:48 +00:00
Florian Hahn
bdb732a3d8 Revert r343308: [LoopInterchange] Turn into a loop pass.
llvm-svn: 343310
2018-09-28 10:20:07 +00:00
Florian Hahn
6e2559695c [LoopInterchange] Turn into a loop pass.
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308
2018-09-28 09:45:50 +00:00
Andrea Di Biagio
5d291cb05b [llvm-mca] Teach how to track zero registers in class RegisterFile.
This change is in preparation for a future work on improving support for
optimizable register moves.  We already know if a write is from a zero-idiom, so
we can propagate that bit of information to the PRF.  We use an APInt mask to
identify registers that are set to zero.

llvm-svn: 343307
2018-09-28 09:42:06 +00:00
Peter Smith
f7ab544ae0 [ARM] Remove non-existent cpu arm1176j-s and use mpcore for v6k
The ARMTargetParser.def contains an entry for arm1176j-s which is the
default for the ArmV6K architecture. This cpu does not exist, there are
only arm1176jz-s and arm1176jzf-s and they are both architecture ArmV6KZ.
The only CPUs that are actually ArmV6K are the mpcore, mpcore_nofpu and
later revisions of the arm1136 family r1px (which we don't have a table
entry for).

This patch removes the arm1176j-s and makes mpcore the default for armv6k.

Differential Revision: https://reviews.llvm.org/D52594

llvm-svn: 343303
2018-09-28 09:04:27 +00:00
David Spickett
bfb82efb28 [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551

llvm-svn: 343302
2018-09-28 08:55:19 +00:00
David Spickett
3cacab9d1f Remove extra whitespace. NFC. (test commit)
llvm-svn: 343301
2018-09-28 08:45:28 +00:00
Oliver Stannard
62a89909f4 [ARM][v8.5A] Add speculation barriers SSBB and PSSBB
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.

Patch by Pablo Barrio!

Differential revision: https://reviews.llvm.org/D52484

llvm-svn: 343300
2018-09-28 08:27:56 +00:00
Simon Pilgrim
a91fff5df1 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

llvm-svn: 343299
2018-09-28 08:21:39 +00:00
Hiroshi Inoue
b51c4baf85 [CodeGen] fix broken successor probability in MBB dump
When printing successor probabilities for a MBB, a human readable value is sometimes shown as 200.0%.
The human readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for unknown BranchProbability.
By using getSuccProbability as we do for the non-human readable part, we can avoid this problem.

Differential Revision: https://reviews.llvm.org/D52605

llvm-svn: 343297
2018-09-28 05:27:32 +00:00
Owen Rodley
cfcabbacc1 Test commit. NFC.
llvm-svn: 343296
2018-09-28 04:51:45 +00:00
Craig Topper
f13208ad31 [ScalarizeMaskedMemIntrin] Use MinAlign to calculate alignment for the scalar load/stores to handle element types that are byte-sized but not powers of 2.
This pass doesn't handle non-byte sized types correctly at all, but at least we can make byte sized types work.

llvm-svn: 343294
2018-09-28 03:35:37 +00:00
Aaron Smith
f88f846bf9 [pdb] Simplify the code by replacing a few string conversions with calls to invokeBstrMethod()
Reviewers: aleksandr.urakov, zturner, llvm-commits

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D52624

llvm-svn: 343291
2018-09-28 02:32:07 +00:00
Tom Stellard
857477a1f4 merge-request.sh: Add 7.0 metabug
llvm-svn: 343290
2018-09-28 02:30:42 +00:00
Lang Hames
9e09d1c199 [ORC] clang-format the ThreadSafeModule code.
Evidently I forgot to do this before committing r343055.

llvm-svn: 343288
2018-09-28 01:41:33 +00:00
Lang Hames
3c8d6a2f70 [ORC] Add a const version of ThreadSafeModule::getModule().
llvm-svn: 343287
2018-09-28 01:41:33 +00:00
Lang Hames
1accc97014 [ORC] Lock ThreadSafeContext during module destruction in ThreadSafeModule's
move constructor.

This is basically the same fix as r343261, but applied to the move constructor:
Failure to lock the context during module destruction can lead to data races if
other threads are operating on the context.

llvm-svn: 343286
2018-09-28 01:41:29 +00:00
Craig Topper
e624e3add2 [ScalarizeMaskedMemIntrin] Fix the alignment calculation for the scalar stores of a masked store expansion.
It should be the minimum of the original alignment and the scalar size.

llvm-svn: 343284
2018-09-28 01:06:13 +00:00
Craig Topper
9bf5908b24 [ScalarizeMaskedMemIntrin] Add test cases for masked store expansion. Increase alignment of one of the masked load test cases.
The masked store alignment is being miscalculated, but masked load is correct.

llvm-svn: 343283
2018-09-28 01:06:09 +00:00
Craig Topper
204f37aff8 [X86] Add the test case from PR38986.
The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch.

llvm-svn: 343281
2018-09-27 23:25:10 +00:00
Craig Topper
2606b3ee60 [ScalarizeMaskedMemIntrin] Ensure the mask is a vector of ConstantInts before generating the expansion without control flow.
Its possible the mask itself or one of the elements is a ConstantExpr and we shouldn't optimize in that case.

llvm-svn: 343278
2018-09-27 22:31:42 +00:00
Craig Topper
97b725ca13 [ScalarizeMaskedMemIntrin] Use cast instead of dyn_cast checked by an assert. Consistently make use of the element type variable we already have. NFCI
cast will take care of asserting internally.

llvm-svn: 343277
2018-09-27 22:31:40 +00:00
Derek Schuff
45713731a2 WebAssembly: Rename GetSignature to GetLibcallSignature [NFC]
llvm-svn: 343275
2018-09-27 22:20:33 +00:00
Craig Topper
776f6d52e2 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

llvm-svn: 343273
2018-09-27 21:28:59 +00:00
Craig Topper
a86a3d5210 [ScalarizeMaskedMemIntrin] Add some IR only test cases for masked gather expansion.
llvm-svn: 343272
2018-09-27 21:28:55 +00:00
Craig Topper
747c99f102 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

llvm-svn: 343271
2018-09-27 21:28:52 +00:00
Craig Topper
9358c66d5e [ScalarizeMaskedMemIntrin] Handle the case where the mask is an all zero vector.
This shouldn't really happen in practice I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that zero becomes ConstantAggregateZero instead.

So instead just check for Constant and use getAggregateElement which will do the dirty work for us.

llvm-svn: 343270
2018-09-27 21:28:46 +00:00
Craig Topper
4fb14c26de [ScalarizeMaskedMemIntrin] Add dedicated IR only tests for masked load expansion so I can begin making modifications.
llvm-svn: 343269
2018-09-27 21:28:43 +00:00
Craig Topper
b170bb9833 [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition.
llvm-svn: 343268
2018-09-27 21:28:41 +00:00
Craig Topper
2b3ef1c578 [ScalarizeMaskedMemIntrin] Cleanup comments. NFC
llvm-svn: 343267
2018-09-27 21:28:39 +00:00
Lang Hames
6e429798e4 [ORC] Add definition for IRLayer::setCloneToNewContextOnEmit, use it to set the
flag to true in LLJIT when running in multithreaded mode.

The IRLayer::setCloneToNewContextOnEmit method sets a flag within the IRLayer
that causes modules added to that layer to be moved to a new context (by
serializing to/from a memory buffer) when they are emitted. This allows modules
that were all loaded on the same context to be compiled in parallel.

llvm-svn: 343266
2018-09-27 21:13:07 +00:00
Konstantin Zhuravlyov
61a34ded66 AMDGPU: Split HasExt into HasExtDPP/SDWA/SDWA9
llvm-svn: 343264
2018-09-27 20:49:00 +00:00
Lang Hames
1a1acdc98b [ORC] Make LocalIndirectStubsManager's operations thread-safe.
Locks stub management operations and switches to atomic update for stub
pointers.

llvm-svn: 343262
2018-09-27 20:36:10 +00:00
Lang Hames
5e01951bba [ORC] Lock ThreadSafeContext during Module destructing in ThreadSafeModule.
Failure to lock the context can lead to data races if other threads are
operating on other ThreadSafeModules that share the same context.

llvm-svn: 343261
2018-09-27 20:36:08 +00:00
Konstantin Zhuravlyov
9ba542df7c AMDGPU: Split VOP2Inst into VOP2Inst_e32/e64/sdwa
llvm-svn: 343259
2018-09-27 19:46:41 +00:00
Lang Hames
b3a8125857 [ORC] Coalesce all of ORC's symbol renaming / linkage-promotion utilities into
one SymbolLinkagePromoter utility.

SymbolLinkagePromoter renames anonymous and private symbols, and bumps all
linkages to at least global/hidden-visibility. Modules whose symbols have been
promoted by this utility can be decomposed into sub-modules without introducing
link errors. This is used by the CompileOnDemandLayer to extract single-function
modules for lazy compilation.

llvm-svn: 343257
2018-09-27 19:27:20 +00:00
Lang Hames
6c741c3c40 [ORC] LastKey needs to be protected to prevent data races.
llvm-svn: 343256
2018-09-27 19:27:20 +00:00
Lang Hames
c9f389e152 [lli] Fix ArgV setup bug when running in -jit-kind=orc-lazy mode.
ArgV[ArgC] should be null.

llvm-svn: 343255
2018-09-27 19:27:19 +00:00
Konstantin Zhuravlyov
0364f816bd AMDGPU/NFC: Simplify VOP_MAC_F16/F32
llvm-svn: 343254
2018-09-27 19:24:05 +00:00
Stanislav Mekhanoshin
360400775c [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

llvm-svn: 343249
2018-09-27 18:55:20 +00:00
Craig Topper
5fa500a477 [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

llvm-svn: 343244
2018-09-27 18:01:48 +00:00
Simon Pilgrim
c5fd4681ff [X86] Remove BT/BTC/BTR/BTS rr/ri overrides
llvm-svn: 343241
2018-09-27 17:29:13 +00:00
Simon Pilgrim
7537a49d32 [X86][Btver2] (V)MPSADBW instructions take 3uops not 1
llvm-svn: 343238
2018-09-27 17:13:57 +00:00
Luke Cheeseman
8bf597ea01 Revert r343192 as an ubsan build is currently failing
llvm-svn: 343235
2018-09-27 16:47:30 +00:00