1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

162939 Commits

Author SHA1 Message Date
Michael Zolotukhin
d62fa2f45a Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time.
This reapplies commit r329644.

llvm-svn: 329865
2018-04-11 23:37:53 +00:00
Michael Zolotukhin
83ba2b464d [SSAUpdaterBulk] Fix linux bootstrap/sanitizer failures: explicitly specify order of evaluation.
The standard says that the order of evaluation of an expression
  s[x] = foo()
is unspecified. In our case, we first create an empty entry in the map,
then call foo(), then store its return value to the created entry. The
problem is that foo uses the map as a cache, so if it finds that there
is an entry in the map, it stops computation. This change explicitly
sets the order, thus fixing this heisenbug.

llvm-svn: 329864
2018-04-11 23:37:37 +00:00
Jake Ehrlich
e9823fde1e [llvm-objcopy] Switch over to using TableGen for parsing arguments
Swithces from using the command line library to using TableGen. This will allow
llvm-strip to exist and allow refinements of the command line syntax.

Differential Revision: https://reviews.llvm.org/D44236

llvm-svn: 329863
2018-04-11 23:37:03 +00:00
Simon Pilgrim
205d94af93 [X86] Remove unused itinerary argument from FMA3/FMA4/XOP instructions. NFCI.
llvm-svn: 329862
2018-04-11 23:24:38 +00:00
Weiming Zhao
ce3b1c3276 Add missing vtable anchors
Summary: This patch adds anchor() for MemoryBuffer, raw_fd_ostream, RTDyldMemoryManager, SectionMemoryManager, etc.

Reviewers: jlebar, eli.friedman, dblaikie

Reviewed By: dblaikie

Subscribers: mehdi_amini, mgorny, dblaikie, weimingz, llvm-commits

Differential Revision: https://reviews.llvm.org/D45244

llvm-svn: 329861
2018-04-11 23:09:20 +00:00
Simon Pilgrim
5fa636dda8 X86FoldTableEntry - avoid unnecessary std::string creation. NFCI.
llvm-svn: 329860
2018-04-11 23:08:30 +00:00
whitequark
e28ae58ff4 [LLVM-C] Add LLVMGetHostCPU{Name,Features}.
Without these functions it's hard to create a TargetMachine for
Orc JIT that creates efficient native code.

It's not sufficient to just expose LLVMGetHostCPUName(), because
for some CPUs there's fewer features actually available than
the CPU name indicates (e.g. AVX might be missing on some CPUs
identified as Skylake).

Differential Revision: https://reviews.llvm.org/D44861

llvm-svn: 329856
2018-04-11 22:40:42 +00:00
Simon Pilgrim
eec640564a Don't repeatedly evaluate size() in the for loop. NFCI.
llvm-svn: 329853
2018-04-11 22:24:48 +00:00
Nemanja Ivanovic
81e041e541 [PowerPC] Fix condition for 64-bit rotate when replacing r+r instr with r+i
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=37039
The condition only covers one of the two 64-bit rotate instructions. This just
adds the second (RLDICLo).

Patch by Josh Stone.

llvm-svn: 329852
2018-04-11 21:25:44 +00:00
Puyan Lotfi
cd9bfe0331 Attempting to work around a non-determinism issue.
The main thing that matters with this test is that the COPYs
are moved together not where the REG_SEQUENCES are.

llvm-svn: 329850
2018-04-11 20:29:32 +00:00
Yonghong Song
830a6fb07d bpf: signal error instead of silent drop for certain invalid asm insn
Currently, an invalid asm insn, either in an asm file or
in an inline asm format, might be silently dropped. This patch
fixed two places where this may happen by
signaling the error so user knows what goes wrong.

The following is an example to demonstrate error messages:

    -bash-4.2$ cat t.c
    int test(void *ctx) {
    #if defined(NO_ERROR)
      asm volatile("r0 = *(u16 *)skb[%0]" : : "i"(2));
    #elif defined(ERROR_1)
      asm volatile("r20 = *(u16 *)skb[%0]" : : "i"(2));
    #elif defined(ERROR_2)
      asm volatile("r0 = *(u16 *)(r1 + ?)" : :);
    #endif
      return 0;
    }
    -bash-4.2$ cat run.sh
    for macro in NO_ERROR ERROR_1 ERROR_2; do
      echo "===== compile for macro" $macro
      clang -D${macro} -O2 -target bpf -emit-llvm -S t.c
      echo "==llc=="
      llc -march=bpf -filetype=obj t.ll
    done
    -bash-4.2$ ./run.sh
    ===== compile for macro NO_ERROR
    ==llc==
    ===== compile for macro ERROR_1
    ==llc==
    <inline asm>:1:2: error: invalid register/token name
            r20 = *(u16 *)skb[2]
            ^
    note: !srcloc = 135
    ===== compile for macro ERROR_2
    ==llc==
    <inline asm>:1:21: error: unexpected token
            r0 = *(u16 *)(r1 + ?)
                               ^
    note: !srcloc = 210
    -bash-4.2$

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 329849
2018-04-11 20:24:52 +00:00
Gabor Buella
d11176227c [X86] Describe wbnoinvd instruction
Similar to the wbinvd instruction, except this
one does not invalidate caches. Ring 0 only.
The encoding matches a wbinvd instruction with
an F3 prefix.

Reviewers: craig.topper, zvi, ashlykov

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D43816

llvm-svn: 329847
2018-04-11 20:01:57 +00:00
Daniel Neilson
73a3287ceb [DSE] Add tests for atomic memory intrinsics (NFC)
Summary:
These tests show that DSE currently does nothing with the atomic memory
intrinsics. Future work will teach DSE how to simplify these.

llvm-svn: 329845
2018-04-11 19:46:02 +00:00
David Blaikie
5457a86a41 Rename *CommandFlags.def to *CommandFlags.inc
These aren't the .def style files used in LLVM that require a macro
defined before their inclusion - they're just basic non-modular includes
to stamp out command line flag variables.

llvm-svn: 329840
2018-04-11 18:49:37 +00:00
Daniel Neilson
fdafc56cf8 [DSE] Regenerate tests with update_test_checks.py (NFC)
Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll
test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll

llvm-svn: 329839
2018-04-11 18:43:10 +00:00
Peter Collingbourne
2047ab334e CodeGen: Don't try to canonicalize Unix-style paths in CodeView debug info.
Most importantly, we should not replace slashes with backslashes
because that would invalidate the path.

Differential Revision: https://reviews.llvm.org/D45473

llvm-svn: 329838
2018-04-11 18:24:03 +00:00
Simon Pilgrim
c09c832a8e [X86][Atom] Convert Atom scheduler model to SchedRW (PR32431)
Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550.

I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here,

There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers.

There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical.

NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329837
2018-04-11 18:23:01 +00:00
Andrea Di Biagio
aa4e3e3fd7 [llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources.
This patch moves part of the logic that notifies dispatch stall events from the
DispatchUnit to the Scheduler.

The main goal of this patch is to remove (yet another) dependency between the
DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know
about `Scheduler::Event` and how to classify stalls due to the lack of scheduling
resources. This patch removes that knowledge and simplifies the logic in
DispatchUnit::checkScheduler.

This is another change done in preparation for the work to fix PR36663.

No functional change intended.

llvm-svn: 329835
2018-04-11 18:05:23 +00:00
Simon Pilgrim
ac46827cba [X86] Generalize X86PadShortFunction to work with TargetSchedModel
Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call.

Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329834
2018-04-11 18:05:17 +00:00
Artem Belevich
3561cb9d71 [NVPTX] Removed 'satom' feature which is no longer used.
Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329830
2018-04-11 17:51:33 +00:00
Artem Belevich
3cfdfea288 [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829
2018-04-11 17:51:19 +00:00
Tim Renouf
e85f3a4fd1 [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Re-landed after noticing that the buildbot failure from 329808 seemed to
be unrelated.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329826
2018-04-11 17:18:36 +00:00
Daniel Neilson
7fd9ae73ae [DSE] Regenerate tests with update_test_checks.py (NFC)
Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/simple.ll
test/Transforms/DeadStoreElimination/memintrinsics.ll

llvm-svn: 329824
2018-04-11 16:50:04 +00:00
Reid Kleckner
277138a332 [FastISel] Disable local value sinking by default
This is causing compilation timeouts on code with long sequences of
local values and calls (i.e. foo(1); foo(2); foo(3); ...).  It turns out
that code coverage instrumentation is a great way to create sequences
like this, which how our users ran into the issue in practice.

Intel has a tool that detects these kinds of non-linear compile time
issues, and Andy Kaylor reported it as PR37010.

The current sinking code scans the whole basic block once per local
value sink, which happens before emitting each call. In theory, local
values should only be introduced to be used by instructions between the
current flush point and the last flush point, so we should only need to
scan those instructions.

llvm-svn: 329822
2018-04-11 16:03:07 +00:00
Sanjay Patel
75f6d31fd2 [InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse()
llvm-svn: 329821
2018-04-11 15:57:18 +00:00
Paul Robinson
2ee96af0ca [DWARFv5] Fuss with asm syntax for conveying MD5 checksum.
Previously the MD5 option of the .file directive provided the checksum
as a quoted hex string; now it's a normal hex number with 0x prefix,
same as the .octa directive accepts.

Differential Revision: https://reviews.llvm.org/D45459

llvm-svn: 329820
2018-04-11 15:14:05 +00:00
Petar Jovanovic
45805522db [MIPS GlobalISel] Select add i32, i32
Add the minimal support necessary to lower a function that returns the
sum of two i32 values.
Support argument/return lowering of i32 values through registers only.
Add tablegen for regbankselect and instructionselect.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D44304

llvm-svn: 329819
2018-04-11 15:12:32 +00:00
Haicheng Wu
2df8eb8d0e [SLP] update a test case. NFC.
llvm-svn: 329818
2018-04-11 15:09:49 +00:00
Yaxun Liu
c4af07c19e [AMDGPU] Fix lowering enqueue_kernel
Two issues were fixed:

runtime has difficulty to allocate memory for an external symbol of a
kernel and set the address of the external symbol, therefore make the runtime
handle of an enqueued kernel an ordinary global variable. Runtime only needs
to store the address of the loaded kernel to the handle and has verified
that this approach works.

handle the situation where __enqueue_kernel* gets inlined therefore
the enqueued kernel may be used through a constant expr instead
of an instruction.

Differential Revision: https://reviews.llvm.org/D45187

llvm-svn: 329815
2018-04-11 14:46:15 +00:00
Andrea Di Biagio
5d1dd9a325 Revert "[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS"
It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424)

llvm-svn: 329812
2018-04-11 14:35:23 +00:00
Tim Renouf
268bd738df Revert "[AMDGPU] Ensure there are enough registers for wave dispatch"
This reverts 329808. That change caused a report of a failure in
test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect
it is an expensive-check-only error.

Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0
llvm-svn: 329811
2018-04-11 14:27:41 +00:00
Sander de Smalen
73b5f3945d [AArch64][AsmParser] Split index parsing from vector list.
Summary:
Place parsing of a vector index into a separate function to reduce
duplication, since the code is duplicated in both the parsing of a
Neon vector register operand and a Neon vector list.

This is patch [2/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: rengolin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45428

llvm-svn: 329809
2018-04-11 14:10:37 +00:00
Tim Renouf
ea9df3592f [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808
2018-04-11 14:02:41 +00:00
Andrea Di Biagio
9b07b84d73 [llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS.
llvm-svn: 329807
2018-04-11 13:52:42 +00:00
Simon Pilgrim
e955d824eb [X86] Add variable shuffle schedule classes
Split variable index shuffles from immediate index shuffles

WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.)
WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.)

WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.)
WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.)

Differential Revision: https://reviews.llvm.org/D45404

llvm-svn: 329806
2018-04-11 13:49:19 +00:00
Francis Visoiu Mistrih
6218e0db93 [AArch64] Add test case for r329797
Forgot to add a test case in the previous commit.

llvm-svn: 329805
2018-04-11 13:37:25 +00:00
Simon Pilgrim
acaf1284c4 [X86][SSE] Tweak cmpps schedule test so that it works properly with just sse1
movhps/movlps test are still broken so we can't disable sse2 yet

llvm-svn: 329802
2018-04-11 13:15:36 +00:00
Dmitry Preobrazhensky
28dc2311b5 [AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32
See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845

Differential Revision: https://reviews.llvm.org/D45443

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329801
2018-04-11 13:13:30 +00:00
Francis Visoiu Mistrih
5194d11e51 [AArch64] Fix regression after r329691
In r329691, we would choose FP even if the offset wouldn't fit, just
because the offset is smaller than the one from BP. This made many
accesses through FP need to scavenge a register, which resulted in
slower and bigger code for no good reason.

This patch now always picks the offset that fits first, even if FP is
preferred.

llvm-svn: 329797
2018-04-11 12:36:55 +00:00
Andrea Di Biagio
74100cfc77 [llvm-mca] Minor code cleanup. NFC
llvm-svn: 329796
2018-04-11 12:31:44 +00:00
Andrea Di Biagio
d9344b8000 [llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics.
Also, removed flag -verbose in favor of flag -retire-stats.

llvm-svn: 329794
2018-04-11 12:12:53 +00:00
Andrea Di Biagio
b46fe1126f [llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view.
Added flag -scheduler-stats to print scheduler related statistics.

llvm-svn: 329792
2018-04-11 11:37:46 +00:00
Artur Gainullin
6167711768 Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max.
Bitwise 'not' of the min/max could be eliminated in the pattern:

%notx = xor i32 %x, -1
%cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y
%smax = select i1 %cmp1, i32 %notx, i32 %y
%res = xor i32 %smax, -1

https://rise4fun.com/Alive/lCN

Reviewers: spatel

Reviewed by: spatel

Subscribers: a.elovikov, llvm-commits

Differential Revision: https://reviews.llvm.org/D45317

llvm-svn: 329791
2018-04-11 10:29:37 +00:00
Sjoerd Meijer
10e7e8538f [ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.

More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first LowerConstantFP need to be adjusted for fp16 values.

Differential Revision: https://reviews.llvm.org/D45205

llvm-svn: 329788
2018-04-11 09:28:04 +00:00
Clement Courbet
512991d3bb [Build][NFC] Split off libpfm detection to a separate module.
llvm-svn: 329783
2018-04-11 07:39:00 +00:00
Sander de Smalen
8787081f3e [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors.
Summary:
Merged 'tryMatchVectorRegister' (specific to Neon) and
'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and
created a generic 'parseVectorKind()' function that returns the #Elements
and ElementWidth of a vector suffix. This reduces the duplication of
this functionality between two the vector implementations.

This is patch [1/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: fhahn

Subscribers: tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45427

llvm-svn: 329782
2018-04-11 07:36:10 +00:00
Clement Courbet
84516bef77 [llvm-exegesis] Add a flag to disable libpfm even if present.
Summary: Fixes PR37053.

Reviewers: uabelho, gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D45436

llvm-svn: 329781
2018-04-11 07:32:43 +00:00
Petr Hosek
ab1bf72e88 [CMake][runtimes] Process common options in runtimes build
This was removed in D39932 but turned out this is actually needed
because runtimes such as compiler-rt and libc++ rely on common options
processing for setting certain flags such as -ffunction-sections and
-fdata-sections.

Differential Revision: https://reviews.llvm.org/D45507

llvm-svn: 329778
2018-04-11 05:18:03 +00:00
Craig Topper
debdfcda33 [X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select.
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.

llvm-svn: 329774
2018-04-11 04:55:04 +00:00
Craig Topper
ece8d06b3f [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction.
Previously the code only knew how to handle setcc to a register.

This should fix a crash in the chromium build.

llvm-svn: 329771
2018-04-11 01:09:10 +00:00