1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

201476 Commits

Author SHA1 Message Date
Matt Arsenault
68cd67d10b AMDGPU/GlobalISel: Stop using G_EXTRACT in argument lowering
We really need to put this undef padding stuff into a helper
somewhere, but leave that for when this is moved to generic code.
2020-08-06 09:55:35 -04:00
Matt Arsenault
d9a567a83e AMDGPU/GlobalISel: Implement LLT version of allowsMisalignedMemoryAccesses 2020-08-06 09:50:36 -04:00
Matt Arsenault
c5c7f07aca AMDGPU/GlobalISel: Make s16 phi legal
If we were to have an operation with an s16 def that needs to be
executed in a waterfall loop, not having s16 legal would place an
avoidable burden on RegBankSelect to widen it.
2020-08-06 09:41:14 -04:00
Matt Arsenault
946c8876f1 AMDGPU/GlobalISel: Fix assert on copy to vcc
This was trying to constrain a physical register. By the verifier's
understanding, it's impossible to have a 1-bit copy to vcc/vcc_lo so
don't try to handle physregs.
2020-08-06 09:41:14 -04:00
Raphael Isemann
bc5b29dac3 Revert "PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI."
This reverts commit 87c5437afd273e909e0fed3389de7531d5452ea5.

The commit includes several headers in the middle of a function, which
breaks pretty much everything.
2020-08-06 15:15:43 +02:00
Xing GUO
0504b3725a [DWARFYAML][debug_info] Make the 'Values' field optional.
This patch makes the 'Values' field optional. This is useful when we
handcraft the terminating entry of DIEs.

```
debug_info:
  - Version:  4
    ...
    Entries:
      - AbbrCode: 1
        Values:
          - Value: 0x1234
      - AbbrCode: 0 ## Termination
```

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D85397
2020-08-06 20:43:52 +08:00
Xing GUO
b507631f80 [obj2yaml] Test dumping an empty .debug_aranges section.
This patch adds one test case that tests dumping an empty .debug_aranges
section.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D85405
2020-08-06 20:40:12 +08:00
Petar Avramovic
5de548eba7 [GlobalISel][InlineAsm] Fix matching input constraint to physreg
Add given input and mark it as tied.
Doesn't create additional copy compared to
matching input constraint to virtual register.

Differential Revision: https://reviews.llvm.org/D85122
2020-08-06 14:35:51 +02:00
Simon Pilgrim
a7159bc3c5 BitstreamRemarkParser.h - remove unnecessary includes. NFCI.
Remove unused includes, moving to the lib header or cpp file as necessary.
2020-08-06 13:17:53 +01:00
Simon Pilgrim
e07ee0759f Fix include sorting order. NFC 2020-08-06 11:46:53 +01:00
Simon Pilgrim
57a7b288f9 [X86] getX86MaskVec - replace mask limit from NumElts < 8 with NumElts <= 4
As noted on PR46885, the number of mask elements should always be a power of 2, so to fix the static analyzer warning we are better off replacing the condition to <= 4, and I've added a pow2 assertion as well.
2020-08-06 11:46:19 +01:00
Simon Pilgrim
b4317ecea4 PDBExtras.h - remove unnecessary raw_ostream forward declaration. NFCI.
We already need to include raw_ostream.h, also add missing StringRef.h and cstdint implicit dependencies.

Remove unnecessary includes from PDBExtras.cpp
2020-08-06 11:28:42 +01:00
Simon Pilgrim
78cdf5eff9 [X86][SSE] Expose all memory offsets in expand load tests
Since we're messing with individual element loads we need to expose this to show whats going on.

Part of the work to fix the masked_expandload.ll regressions in D66004
2020-08-06 11:28:42 +01:00
Paul Walker
abb5aa7429 [SVE] Lower scalable vector mul operations.
This allows us to remove extra patterns from AArch64SVEInstrInfo.td
because we can reuse those required for fixed length vectors.

Differential Revision: https://reviews.llvm.org/D85328
2020-08-06 11:15:35 +01:00
Paul Walker
a80884d5ec [SVE] Implement lowering for fixed length vector multiplication.
NOTE: Also uses SVE code generation for NEON size vectors, instead
of expanding i64 based vector multiplications.

Differential Revision: https://reviews.llvm.org/D85327
2020-08-06 11:01:39 +01:00
Alexey Lapshin
21ff6f71fc [dsymutil] Disable dsymutil/X86/reproducer.test on windows.
The dsymutil/X86/reproducer.test test could create paths
longer than MAX_PATH:

C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win
\build\test\tools\dsymutil\X86\Output\reproducer.test.tmp.repro
\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\build
\test\tools\dsymutil\X86\Output\reproducer.test.tmp\Inputs\
basic1.macho.x86_64.o

Disable this test on windows.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D85294
2020-08-06 12:49:35 +03:00
Juneyoung Lee
5a50b9b55e [InstCombine] Fold freeze(undef) into a proper constant
This is a simple patch that folds freeze(undef) into a proper constant after inspecting its uses.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84948
2020-08-06 18:40:04 +09:00
Juneyoung Lee
fa59636db3 [InstCombine] Add tests for D84948; NFC 2020-08-06 18:23:45 +09:00
David Green
77d21dcd3f [LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.

In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).

Differential Revision: https://reviews.llvm.org/D75069
2020-08-06 10:10:50 +01:00
Roman Lebedev
bd46a2901c [NFC][InstCombine] Refactor '(-NSW x) pred x' fold 2020-08-06 11:50:36 +03:00
Roman Lebedev
5a8108b5b6 [InstCombine] (-NSW x) u<= x --> x s<=0 (PR39480)
Name: (-x) u<= x  -->  x s<= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp ule i8 %neg_x, %x
  =>
%r = icmp sle i8 %x, 0

https://rise4fun.com/Alive/V22

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:36 +03:00
Roman Lebedev
eda43dea9f [InstCombine] (-NSW x) u< x --> x s< 0 (PR39480)
Name: (-x) u< x  -->  x s< 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp ult i8 %neg_x, %x
  =>
%r = icmp slt i8 %x, 0

https://rise4fun.com/Alive/zSuf

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:36 +03:00
Roman Lebedev
7d2785df31 [InstCombine] (-NSW x) u>= x --> x s>= 0 (PR39480)
Name: (-x) u>= x  -->  x s>= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp uge i8 %neg_x, %x
  =>
%r = icmp sge i8 %x, 0

https://rise4fun.com/Alive/LLHd

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev
29f0308c9f [InstCombine] (-NSW x) u> x --> x s> 0 (PR39480)
Name: (-x) u> x  -->  x s> 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp ugt i8 %neg_x, %x
  =>
%r = icmp sgt i8 %x, 0

https://rise4fun.com/Alive/Raea

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev
b055700d7c [InstCombine] (-NSW x) s<= x --> x s>= 0 (PR39480)
Name: (-x) s<= x  -->  x >= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sle i8 %neg_x, %x
  =>
%r = icmp sge i8 %x, 0

https://rise4fun.com/Alive/91k

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev
08ee2e8233 [InstCombine] (-NSW x) s< x --> x s> 0 (PR39480)
Name: (-x) s< x  -->  x > 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp slt i8 %neg_x, %x
  =>
%r = icmp sgt i8 %x, 0

https://rise4fun.com/Alive/3IXb

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:35 +03:00
Roman Lebedev
0e688c47d3 [InstCombine] (-NSW x) s>= x --> x s<= 0 (PR39480)
Name: (-x) s>= x  -->  x s<= 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sge i8 %neg_x, %x
  =>
%r = icmp sle i8 %x, 0

https://rise4fun.com/Alive/Hdip

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev
b24a32fe45 [InstCombine] (-NSW x) ==/!= x --> x ==/!= 0 (PR39480)
Name: (-x) == x  -->  x == 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp eq i8 %neg_x, %x
  =>
%r = icmp eq i8 %x, 0

Name: (-x) != x  -->  x != 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp ne i8 %neg_x, %x
  =>
%r = icmp ne i8 %x, 0

https://rise4fun.com/Alive/4slH

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev
3b33b0cf64 [InstCombine] (-NSW x) s> x --> x s< 0 (PR39480)
Name: (-x) s> x  -->  x s< 0
%neg_x = sub nsw i8 0, %x ; %x must not be INT_MIN
%r = icmp sgt i8 %neg_x, %x
  =>
%r = icmp slt i8 %x, 0

https://rise4fun.com/Alive/ZslD

https://bugs.llvm.org/show_bug.cgi?id=39480
2020-08-06 11:50:34 +03:00
Roman Lebedev
95a05d24ff [NFC][InstCombine] Add tests for comparisons between x and negation of x (PR39480) 2020-08-06 11:50:34 +03:00
Xing GUO
47c44c8042 [DWARFYAML][debug_info] Pull out dwarf::FormParams from DWARFYAML::Unit.
Unit.Format, Unit.Version and Unit.AddrSize are replaced with
dwarf::FormParams in D84496 to get rid of unnecessary functions
getOffsetSize() and getRefSize(). However, that change makes it
difficult to make AddrSize optional (Optional<uint8_t>). This change
pulls out dwarf::FormParams from DWARFYAML::Unit and use it as a helper
struct in DWARFYAML::emitDebugInfo().

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D85296
2020-08-06 16:39:00 +08:00
Alex Richardson
f82a1396d4 [lit] Remove ANSI control characters from xunit output
Failing test output sometimes contains control characters like \x1b (e.g.
if there was some -fcolor-diagnostics output) which are not allowed inside
XML files. This causes problems with CI systems: for example, the Jenkins
JUnit XML will throw an exception when ecountering those characters and
similar problems also occur with GitLab CI.

Reviewed By: yln, jdenny

Differential Revision: https://reviews.llvm.org/D84233
2020-08-06 09:16:52 +01:00
Craig Topper
7ca0daba97 [X86] Rename X86::getImpliedFeatures to X86::updateImpliedFeatures and pass clang's StringMap directly to it.
No point in building a vector of StringRefs for clang to apply to the
StringMap. Just pass the StringMap and modify it directly.
2020-08-06 00:20:46 -07:00
Martin Storsjö
355c62a2b7 [AArch64] [Windows] Error out on unsupported symbol locations
These might occur in seemingly generic assembly. Previously when
targeting COFF, they were silently ignored, which certainly won't
give the right result. Instead clearly error out, to make it clear
that the assembly needs to be adjusted for this target.

Also change a preexisting report_fatal_error into a proper error
message, pointing out the offending source instruction. This isn't
strictly an internal error, as it can be triggered by user input.

Differential Revision: https://reviews.llvm.org/D85242
2020-08-06 09:23:46 +03:00
Martin Storsjö
7712d0e79e [ARM, AArch64] Fix a comment typo. NFC. 2020-08-06 09:23:45 +03:00
Chuanqi Xu
dc97d0ff53 [Coroutines] Use to collect lifetime marker of in CoroFrame Differential Revision: https://reviews.llvm.org/D85279 2020-08-06 14:21:55 +08:00
Craig Topper
b9b5a7ec13 [X86] Remove incomplete custom handling of i128 sdivrem/udivrem on Windows.
We need to have special handling of i128 div/rem on Windows due
to a weird calling convention needed for the libcall. There was
also some code that made it look like we do the same for sdivrem/udiv,
but the code didn't account for multiple return values of those
functions so couldn't possibly work. I think this code never
triggers because we don't have libcall names defined for those
functions by default so DAGCombine never creates DIVREM nodes.
2020-08-05 23:01:07 -07:00
Douglas Yung
23b9768503 Fix typo in test. Thanks to Andrew Ng for spotting this! 2020-08-05 22:56:15 -07:00
Lang Hames
94684e8bbe [JITLink][MachO][AArch64] More PAGEOFF12 relocation fixes.
Correctly sign extend the addend, and fix implicit shift operand decoding
(it incorrectly returned 0 for some cases), and check that the initial
encoded immediate is 0.
2020-08-05 21:09:45 -07:00
Matt Arsenault
5d9720d33a AMDGPU: Remove ATOMIC_PK_FADD
The f32 and v2f16 cases should be handled the same way.
2020-08-05 22:00:52 -04:00
Arthur Eubanks
30eacbff3b [NewPM][opt] Add more codegen passes
Reduces number of failures by 92.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85381
2020-08-05 18:59:52 -07:00
Ruiling Song
de9c413633 [AMDGPU] add buffer_atomic_swap for float
The functionality is used when calling imageAtomicExhange() on float
type imageBuffer in Graphics shaders.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D85187
2020-08-06 09:45:48 +08:00
Juneyoung Lee
1112f9ad6f [JumpThreading] Allow duplicating a basic block into preds when its branch condition is freeze(phi)
This is the last JumpThreading patch for getting the performance numbers shown at
https://reviews.llvm.org/D84940#2184653 .

This patch makes ProcessBlock call ProcessBranchOnPHI when the branch condition
is freeze(phi) as well (originally it calls the function when the condition is
phi only).

Since what ProcessBranchOnPHI does is to duplicate the basic block into
predecessors if profitable, it is still valid when the condition is freeze(phi)
too.

```
    p = phi [a, pred1] [b, pred2]
    p.fr = freeze p
    br p.fr, ...
=>
  pred1:
    p.fr = freeze a
    br p.fr, ...
  pred2:
    p.fr2 = freeze b
    br p.fr2, ...
```

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85029
2020-08-06 09:51:17 +09:00
Juneyoung Lee
d2cb125458 [JumpThreading] Add a test that duplicates insts of a basic block with cond br to preds; NFC 2020-08-06 09:19:23 +09:00
Alina Sbirlea
d22d8b1c0c [MSSA] Update test with more detailed and resilient checks. [NFC] 2020-08-05 16:46:44 -07:00
LLVM GN Syncbot
0325b9905f [gn build] Port 820e8d8656e 2020-08-05 23:35:59 +00:00
Petr Hosek
70737c97db [CMake] Simplify CMake handling for zlib
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.

This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.

Differential Revision: https://reviews.llvm.org/D79219
2020-08-05 16:07:11 -07:00
Craig Topper
7af487b243 [X86] Rename mod128.ll to divmod128.ll and add test cases for sdiv/udiv/urem.
This improves code coverage on the switch in LowerWin64_i128OP.
2020-08-05 16:04:00 -07:00
Arthur Eubanks
d30f886c6b [MSSA][NewPM] Handle tests with -print-memoryssa
-print-memoryssa in legacy PM is print<memoryssa> in NPM.
Pin tests with -print-memoryssa to legacy PM.
Add corresponding tests for NPM where missing.
This fixes "unknown pass name 'print-memoryssa'".

Some tests still fail in Analysis/MemorySSA due to other passes that
haven't been ported.

pr43427.ll and pr43438.ll required adding -aa-pipeline=basic-aa,
-loop-simplify (since it doesn't run on legacy PM by default), and
decrementing some of the MemoryPhi numbers.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D85333
2020-08-05 15:59:45 -07:00
Craig Topper
a04fa612fc [X86] Disable copy elision in LowerMemArgument for scalarized vectors when the loc VT is a different size than the original element.
For example a v4f16 argument is scalarized to 4 i32 values. So
the values are spread out instead of being packed tightly like
in the original vector.

Fixes PR47000.
2020-08-05 15:44:54 -07:00