1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

169698 Commits

Author SHA1 Message Date
Clement Courbet
519dab966b [llvm-exegesis] Fix doc in r342947.
llvm-exegesis.rst was using invalid indentation for bullet points.

llvm-svn: 342948
2018-09-25 07:48:38 +00:00
Clement Courbet
0301f9855d [llvm-exegesis] Allow benchmarking arbitrary code snippets.
Summary:

This is a step towards fixing PR38048.

Note that right now the measurements are given per instruction. We'll
need to give measurements a per code snippet and update the analysis (PR38731).

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D52041

llvm-svn: 342947
2018-09-25 07:31:44 +00:00
Stefan Maksimovic
6defe0ebcf [mips] Correct MUL pattern for mips64
Guard existing pattern with a predicate, introduce a new one for revision 6.

Differential Revision: https://reviews.llvm.org/D51684

llvm-svn: 342946
2018-09-25 06:27:49 +00:00
Fangrui Song
556d8103ca Use unique_ptr to hold AsmInfo,MRI,MII,STI
Reviewers: pcc, dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52389

llvm-svn: 342945
2018-09-25 06:19:31 +00:00
Mikael Holmen
49af72e55b Use TRI->regsOverlap() in MachineBasicBlock::computeRegisterLiveness
Summary:
For the loop that used MCRegAliasIterator this should be NFC.

For the loop that previously used MCSubRegIterator we should
now detect more cases where the register is actually live out that
we previously missed.

Reviewers: MatzeB, arsenm

Reviewed By: MatzeB

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D52410

llvm-svn: 342944
2018-09-25 06:10:04 +00:00
Hsiangkai Wang
e307d30c46 [DebugInfo] Do not generate address info for removed debug labels.
In some senario, LLVM will remove llvm.dbg.labels in IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In the case, these debug labels will have
address zero as their address. It is not legal address for debugger to
set breakpoints or query sources. So, the patch inhibits the address info
(DW_AT_low_pc) of removed labels.

Differential Revision: https://reviews.llvm.org/D51908

llvm-svn: 342943
2018-09-25 06:09:50 +00:00
Justin Bogner
c175a1ad14 [MachineCopyPropagation] Reimplement CopyTracker in terms of register units
Change the copy tracker to keep a single map of register units instead
of 3 maps of registers. This gives a very significant compile time
performance improvement to the pass. I measured a 30-40% decrease in
time spent in MCP on x86 and AArch64 and much more significant
improvements on out of tree targets with more registers.

Differential Revision: https://reviews.llvm.org/D52374

llvm-svn: 342942
2018-09-25 05:16:44 +00:00
Lang Hames
67265a68fd Revert "[ORC] Switch to asynchronous resolution in JITSymbolResolver."
This reverts commit r342939.

MSVC's promise/future implementation does not like types that are not default
constructible. Reverting while I figure out a solution.

llvm-svn: 342941
2018-09-25 04:54:03 +00:00
Justin Bogner
48d4cf5b0d [MachineCopyPropagation] Rework how we manage RegMask clobbers
Instead of updating the CopyTracker's maps each time we come across a
RegMask, defer checking for this kind of interference until we're
actually trying to propagate a copy. This avoids the need to
repeatedly iterate over maps in the cases where we don't end up doing
any work.

This is a slight compile time improvement for MachineCopyPropagation
as is, but it also enables a much bigger improvement that I'll follow
up with soon.

Differential Revision: https://reviews.llvm.org/D52370

llvm-svn: 342940
2018-09-25 04:45:25 +00:00
Lang Hames
c1dc6f5684 [ORC] Switch to asynchronous resolution in JITSymbolResolver.
Asynchronous resolution (where the caller receives a callback once the requested
set of symbols are resolved) is a core part of the new concurrent ORC APIs. This
change extends the asynchronous resolution model down to RuntimeDyld, which is
necessary to prevent deadlocks when compiling/linking on a fixed number of
threads: If RuntimeDyld's linking process were a blocking operation, then any
complete K-graph in a program will require at least K threads to link in the
worst case, as each thread would block waiting for all the others to complete.
Using callbacks instead allows the work to be passed between dependent threads
until it is complete.

For backwards compatibility, all existing RuntimeDyld functions will continue
to operate in blocking mode as before. This change will enable the introduction
of a new async finalization process in a subsequent patch to enable asynchronous
JIT linking.

llvm-svn: 342939
2018-09-25 04:43:38 +00:00
Thomas Lively
3503ba6394 [WebAssembly] SIMD sqrt
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52387

llvm-svn: 342937
2018-09-25 03:39:28 +00:00
Stanislav Mekhanoshin
a3275cd945 [AMDGPU] Remove useless check from test. NFC.
The check for assignment of zero is practically useless
while the assignment moves around with different scheduling.

llvm-svn: 342935
2018-09-25 01:24:54 +00:00
Craig Topper
064d1312b6 [X86] Don't create FILD ISD nodes when X87 is disabled.
The included test case previously asserted because the type legalizer tried to soften the FILD ISD node.

Fixes PR38819.

llvm-svn: 342934
2018-09-25 00:16:57 +00:00
Craig Topper
1ab73060d7 [X86] Remove superfluous curly braces. NFC
llvm-svn: 342933
2018-09-25 00:16:54 +00:00
Craig Topper
4e7386800c [X86] Update comment. Use 'glued' instead of 'flagged' NFC
llvm-svn: 342932
2018-09-25 00:16:52 +00:00
Thomas Lively
71304ebc01 [WebAssembly][NFC] Fix hardcoded stack indices in tests
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52388

llvm-svn: 342928
2018-09-24 23:42:07 +00:00
Artem Belevich
926bd94135 [CUDA] Added basic support for compiling with CUDA-10.0
llvm-svn: 342924
2018-09-24 23:10:44 +00:00
Evgeniy Stepanov
169602960f [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342923
2018-09-24 23:03:34 +00:00
Evgeniy Stepanov
5f226d3967 Revert "[hwasan] Record and display stack history in stack-based reports."
This reverts commit r342921: test failures on clang-cmake-arm* bots.

llvm-svn: 342922
2018-09-24 22:50:32 +00:00
Evgeniy Stepanov
af30f7d7e8 [hwasan] Record and display stack history in stack-based reports.
Summary:
Display a list of recent stack frames (not a stack trace!) when
tag-mismatch is detected on a stack address.

The implementation uses alignment tricks to get both the address of
the history buffer, and the base address of the shadow with a single
8-byte load. See the comment in hwasan_thread_list.h for more
details.

Developed in collaboration with Kostya Serebryany.

Reviewers: kcc

Subscribers: srhines, kubamracek, mgorny, hiraditya, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52249

llvm-svn: 342921
2018-09-24 21:38:42 +00:00
Christy Lee
3ea0f6c83f Re-submitting changes in D51550 because it failed to patch.
Reviewers: javed.absar, trentxintong, courbet

Reviewed By: trentxintong

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52433

llvm-svn: 342919
2018-09-24 20:47:12 +00:00
Sanjay Patel
29014ecbf2 [InstCombine] add bitcast+extelt helper function; NFC
We can handle patterns where the elements have different 
sizes, so refactoring ahead of trying to add another blob
within these clauses.

llvm-svn: 342918
2018-09-24 20:41:22 +00:00
Simon Pilgrim
47d7a9344e [X86] Remove shift/rotate by CL memory (RMW) overrides
The uops are slightly different to the register variant, so requires a +1uop tweak

llvm-svn: 342916
2018-09-24 20:11:50 +00:00
Craig Topper
dd729ed393 [X86] Infer 64bit feature support from the CPUID results in getHostCPUFeatures.
After r341022, we more strictly check the 64bit feature in X86Subtargets constructor when a 64-bit triple is used. If we don't infer this feature for autodetected CPUs we might incorrectly report an error if the CPU name wasn't autodetected to a CPU that supports 64-bit.

llvm-svn: 342914
2018-09-24 18:55:41 +00:00
Stefan Pintilie
c79c70a413 [Power9] [LLVM] Add __float128 exponent GET and SET builtins
Added

__builtin_vsx_scalar_extract_expq
__builtin_vsx_scalar_insert_exp_qp

Builtins should behave the same way as in GCC.

Differential Revision: https://reviews.llvm.org/D48185

llvm-svn: 342910
2018-09-24 18:14:13 +00:00
Simon Pilgrim
ed44c7c9cd [X86][AVX] Add truncation as shuffle test for PR31451
llvm-svn: 342908
2018-09-24 17:26:31 +00:00
Christy Lee
610002d619 Reland r342494 after fixing LIT checks.
llvm-svn: 342907
2018-09-24 17:26:30 +00:00
Sanjay Patel
ab07822de8 [Analysis] add comment to generalize finding a scalar op from vector; NFC
llvm-svn: 342906
2018-09-24 17:18:32 +00:00
Sanjay Patel
83b3ac2234 [InstCombine] add/move tests for extractelement; NFC
llvm-svn: 342905
2018-09-24 17:17:16 +00:00
Simon Pilgrim
a4b4612ddc [X86] Remove WriteDiv/WriteIDiv schedule overrides - use classes directly. NFCI.
We're missing quite a bit of data for these instruction, removing the overrides makes this obvious - inconsistent reg/mem variants is a concern as well.

Also, we have Divider resources (HWDivider etc.) but they aren't actually used consistently.

llvm-svn: 342904
2018-09-24 16:58:26 +00:00
Sanjay Patel
bfdf7bad58 [InstCombine] improve variable name and use 'match'; NFC
'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.

llvm-svn: 342902
2018-09-24 16:39:03 +00:00
Evandro Menezes
a69b45a477 [ARM] Adjust the cost model for Exynos
Tune `MaxInterleaveFactor` and `LdStMultipleTiming`and remove
`PartialUpdateClearance` for the Exynos processors.

llvm-svn: 342900
2018-09-24 16:35:14 +00:00
Evandro Menezes
933d3c8ea9 [ARM] Adjust the feature set for Exynos
Enable crypto and literals fusion for the Exynos processors.

llvm-svn: 342899
2018-09-24 16:35:09 +00:00
Zhaoshi Zheng
14ebc5a569 [Thumb1] Any imm8 should have cost of 1
A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.

Differential Revision: https://reviews.llvm.org/D52257

llvm-svn: 342898
2018-09-24 16:15:23 +00:00
Fedor Sergeev
912a050cc8 [New PM][PassInstrumentation] IR printing support for New Pass Manager
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.

- PrintIR routines implement printing callbacks.

- StandardInstrumentations class provides a central place to manage all
  the "standard" in-tree pass instrumentations. Currently it registers
  PrintIR callbacks.

Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923

llvm-svn: 342896
2018-09-24 16:08:15 +00:00
Simon Pilgrim
74a918c83c [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)
Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.

This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.

llvm-svn: 342892
2018-09-24 15:21:57 +00:00
Luke Cheeseman
645b604217 [Arm][AsmParser] Restrict register list size for VSTM/VLDM
- The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified
- The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers
- This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52082

llvm-svn: 342891
2018-09-24 15:13:48 +00:00
Sanjay Patel
206b98671f [DAGCombiner] use UADDO to optimize saturated unsigned add
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for 
use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively 
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in 
CGP after D8889 without checking target lowering to see if the op is supported. 
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with 
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned 
saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we see only 
see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title 
(unlike x86 which sees improvements for all sizes because all sizes are 'custom'). 
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. 
So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, 
so this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom 
check altogether because we're assuming that a UADDO sequence is canonical/optimal 
before we ever reach here. But that seems like a bug to me. If the target doesn't 
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining 
using a UADDO node. This is similar justification for why we don't canonicalize IR to 
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first 
place.

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886
2018-09-24 14:47:15 +00:00
Petar Jovanovic
c7a010c7ae [Mips][FastISel] Fix selectBranch on icmp i1
The r337288 tried to fix result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.

Patch by Dragan Mladjenovic.

Differential Revision: https://reviews.llvm.org/D52045

llvm-svn: 342884
2018-09-24 14:14:19 +00:00
Zaara Syeda
14c2d842d3 [PowerPC] Support operand modifier 'x' in inline asm
gcc uses operand modifier 'x' in inline asm for VSX registers.
Without this modifier, instructions which use VSX numbering for their
operands are printed as VMX registers. This patch adds support for the
operand modifier 'x'.

Differential Revision: https://reviews.llvm.org/D52244

llvm-svn: 342882
2018-09-24 14:01:16 +00:00
Jonas Devlieghere
a2e27329ed [dsymutil] Set LSan blacklist whenever sanitizers are enabled.
LSan can be enabled by itself or as part of the address sanitizer.
Rather than checking the enabled sanitizers for both, just set the LSan
env options whenever a sanitizer is enabled.

llvm-svn: 342881
2018-09-24 13:56:36 +00:00
Roman Lebedev
c8948d6e37 [NFC][CodeGen][X86][AArch64] More tests for 'bit field extract' w/ constants
It would be best to introduce ISD::BitFieldExtract,
because clearly more than one backend faces the same problem.
But for now let's solve this in the x86-specific DAG combine.

https://bugs.llvm.org/show_bug.cgi?id=38938

llvm-svn: 342880
2018-09-24 13:24:20 +00:00
Matt Arsenault
5fb1110193 AMDGPU: Fix private handling for allowsMisalignedMemoryAccesses
If the alignment is at least 4, this should report true.

Something still seems off with how < 4-byte types are
handled here though.

Fixing this seems to change how some combines get
to where they get, but somehow isn't changing the net
result.

llvm-svn: 342879
2018-09-24 13:18:15 +00:00
Matt Arsenault
b6edb08837 Fix some missing opcodes in bcanalyzer
llvm-svn: 342878
2018-09-24 12:47:17 +00:00
Andrea Di Biagio
3ca5d7eab8 [llvm-mca] Improve code comments in LSUnit.{h, cpp}. NFC
llvm-svn: 342877
2018-09-24 12:45:26 +00:00
Sjoerd Meijer
d5015b6840 [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33
A sequence of VMUL and VADD instructions always give the same or better
performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33.
Executing the VMUL and VADD back-to-back requires the same cycles, but
having separate instructions allows scheduling to avoid the hazard between
these 2 instructions.

Differential Revision: https://reviews.llvm.org/D52289

llvm-svn: 342874
2018-09-24 12:02:50 +00:00
Hans Wennborg
058e12737a Revert r341932 "[ARM] Enable ARMCodeGenPrepare by default"
This caused miscompilation of WebRTC for Android: PR39060.

> We've had the pass enabled downstream for a couple of weeks and it
> seems to be okay, so enable it by default.
>
> Differential Revision: https://reviews.llvm.org/D51920

llvm-svn: 342873
2018-09-24 11:40:07 +00:00
Luke Cheeseman
77de0a1dfd [ARM][ARMLoadStoreOptimizer]
- The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers
- This is an UNPREDICTABLE instruction and shouldn't be done
- It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch
- This fixes https://bugs.llvm.org/show_bug.cgi?id=38389

Differential Revision: https://reviews.llvm.org/D52085

llvm-svn: 342872
2018-09-24 10:42:22 +00:00
Petar Jovanovic
103f0013e7 [deadargelim] Update dbg.value of 'unused' parameters
DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D51968

llvm-svn: 342871
2018-09-24 10:01:24 +00:00
Sam Parker
b4d8a969b6 [ARM] bottom-top mul support ARMParallelDSP
Originally committed in rL342210 but was reverted in rL342260 because
it was causing issues in vectorized code, because I had forgotten to
ensure that we're operating on scalar values.

Original commit message:

On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 342870
2018-09-24 09:34:06 +00:00