mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

182134 Commits

Author SHA1 Message Date
Daniel Sanders
60c2b0f7af Re-commit: r366610 and r366612: Expand pseudo-components before embedding in llvm-config
There were two main problems:
* The 'nativecodegen' pseudo-component was unconditionally adding
  ${native_tgt}CodeGen even though it conditionally added ${native_tgt}Info and
  ${native_tgt}Desc. This has been fixed by making ${native_tgt}CodeGen
  conditional too.
* The 'all' pseudo-component was causing library names like LLVMLLVMDemangle,
  as the expansion produced a library name rather than a component. There
  doesn't seem to be a list of available components anywhere, so this has been
  fixed by moving the expansion of 'all' back to where it was before. This
  manifested in different ways on different builders, but the root cause was
  the same.

llvm-svn: 366622
2019-07-19 22:46:47 +00:00
Matt Arsenault
0e9abc2fbd AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces
llvm-svn: 366621
2019-07-19 22:28:44 +00:00
Stanislav Mekhanoshin
1ace6d5780 [AMDGPU] Autogenerate register sequences in tuples
Differential Revision: https://reviews.llvm.org/D65007

llvm-svn: 366619
2019-07-19 21:43:42 +00:00
Stanislav Mekhanoshin
4f62c7dc7b [AMDGPU] Fixed occupancy calculation for gfx10
Differential Revision: https://reviews.llvm.org/D65010

llvm-svn: 366616
2019-07-19 21:29:51 +00:00
Daniel Sanders
47b2ce4000 Revert r366610 and r366612: Expand pseudo-components before embedding in llvm-config
Some targets are missing LLVMDemangle, one is adding the LLVM prefix twice, and two
are hitting the very error this patch fixes for my target. Reverting while I work
through the reports.

llvm-svn: 366615
2019-07-19 21:11:05 +00:00
Craig Topper
a0fef0a13e [InstCombine] Fix copy/paste mistake in the test cases I added for PR42691. NFC
llvm-svn: 366614
2019-07-19 21:09:21 +00:00
Matt Arsenault
e337bb98bb AMDGPU: Avoid custom predicates for stores with glue
llvm-svn: 366613
2019-07-19 21:01:30 +00:00
Daniel Sanders
eb9d60e13b Fix a latent bug discovered by r366610: nativecodegen includes X86CodeGen when X86 is not compiled
I believe this to have been a latent bug, as the same expansion checks for the
existence of ${native_tgt}Info and ${native_tgt}Desc and only adds them if
they were compiled, but unconditionally adds ${native_tgt}CodeGen.

This should fix llvm-clang-x86_64-win-fast (which builds only ARM on an X86 host) and similar builders.

llvm-svn: 366612
2019-07-19 20:58:11 +00:00
Craig Topper
192e68f782 [InstCombine] Add test cases for PR42691. NFC
llvm-svn: 366611
2019-07-19 20:48:52 +00:00
Daniel Sanders
70b0573626 Expand pseudo-components before embedding in llvm-config
Summary:
If you use pseudo-targets like AllTargetsCodeGens in LLVM_DYLIB_COMPONENTS,
then a test will fail because `./bin/llvm-config --shared-mode` can't
handle these targets. We can fix this by expanding them before embedding
the string into llvm-config.

Reviewers: bogner

Reviewed By: bogner

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65011

llvm-svn: 366610
2019-07-19 20:38:05 +00:00
Matt Arsenault
6e1ebba17d AMDGPU: Redefine setcc condition PatLeafs
Avoid using custom code predicates.

llvm-svn: 366609
2019-07-19 20:24:40 +00:00
Matt Arsenault
2724c8cef5 AMDGPU: Don't rely on m0 being -1 for GWS offsets
This only works if the high bits of m0 are also 0, so m0 would have to
be set to 0xffff.

llvm-svn: 366608
2019-07-19 20:01:24 +00:00
Matt Arsenault
b81a355be5 AMDGPU: Force s_waitcnt after GWS instructions
This is apparently required to be the immediately following
instruction, so force it into a bundle with a waitcnt.

llvm-svn: 366607
2019-07-19 19:47:30 +00:00
Matt Arsenault
f3cd8c40d1 LiveIntervals: Fix handleMove asserting on BUNDLE
The top-level BUNDLE instruction should behave as an ordinary
instruction. It is supposed to have all relevant registers as implicit
operands, so moving it should work like moving any other instruction. I
believe the assert was intended to avoid moving instructions inside bundles.

llvm-svn: 366605
2019-07-19 19:32:00 +00:00
Louis Dionne
747dd4b220 Revert "[libc++] Integrate the PSTL into libc++"
This reverts r366593, which caused unforeseen breakage on the build bots.
I'm reverting until the problems have been figured out and fixed.

llvm-svn: 366603
2019-07-19 18:52:46 +00:00
Michael Liao
5f9a32dcb3 [AMDGPU] Add test case for crash in si-lower-sgpr-spills pass
Reviewers: arsenm

Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64273

llvm-svn: 366602
2019-07-19 18:50:53 +00:00
Nick Desaulniers
c56d9ef695 Revert "Use the MachineBasicBlock symbol for a callbr target"
This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea.

Two regressions were immediately reported:
- https://github.com/ClangBuiltLinux/linux/issues/614
- https://github.com/ClangBuiltLinux/linux/issues/615

Reported-by: nathanchance
llvm-svn: 366600
2019-07-19 18:18:02 +00:00
Matt Morehouse
2bd8f0a2eb [RISCV] Disable tests failing on buildbots.
r366399 enabled a couple of tests that are failing on a few buildbots.

llvm-svn: 366599
2019-07-19 18:05:12 +00:00
Stanislav Mekhanoshin
0c9d743a5c [AMDGPU] Allow register tuples to set asm names
This change reverts most of the previous register name generation.
The real problem is that RegisterTuple does not generate asm names.
Added an optional operand to RegisterTuple. This way we can simplify
register name access and dramatically reduce the size of the static
tables for the backend.

Differential Revision: https://reviews.llvm.org/D64967

llvm-svn: 366598
2019-07-19 18:05:01 +00:00
Matt Arsenault
c815dbdcb1 AMDGPU/GlobalISel: Fix MMO flags for kernel argument loads
The DAG lowering sets dereferenceable and invariant, not nontemporal.

llvm-svn: 366597
2019-07-19 17:52:56 +00:00
Matt Arsenault
fbb71b0b9e GlobalISel: Add GINodeEquiv for fcopysign
I don't need this at the moment, but it should be here.

llvm-svn: 366596
2019-07-19 17:32:19 +00:00
Shoaib Meenai
5746665a65 [llvm-lipo] Remove trailing whitespace. NFC
llvm-svn: 366595
2019-07-19 17:19:57 +00:00
Louis Dionne
5a48459ac2 [libc++] Integrate the PSTL into libc++
Summary:
This commit allows specifying LIBCXX_ENABLE_PARALLEL_ALGORITHMS when
configuring libc++ in CMake. When that option is enabled, libc++ will
assume that the PSTL can be found somewhere on the CMake module path,
and it will provide the C++17 parallel algorithms based on the PSTL
(which is assumed to be available).

The commit also adds support for running the PSTL tests as part of
the libc++ test suite.

Reviewers: rodgert, EricWF

Subscribers: mgorny, christof, jkorous, dexonsmith, libcxx-commits, mclow.lists, EricWF

Tags: #libc

Differential Revision: https://reviews.llvm.org/D60480

llvm-svn: 366593
2019-07-19 17:02:42 +00:00
Matt Arsenault
59f385cd0c AMDGPU: Add some function return test cases
llvm-svn: 366591
2019-07-19 16:45:48 +00:00
Simon Pilgrim
caf2aa9106 [AMDGPU] Regenerate test file for upcoming patch. NFCI.
llvm-svn: 366589
2019-07-19 15:43:56 +00:00
Matt Arsenault
2217349a57 AMDGPU: Attempt to fix bot error
Manually remove the file name from the check line, since it somehow ends
up being different on an MSVC bot.

llvm-svn: 366586
2019-07-19 14:56:24 +00:00
Matt Arsenault
43bcdb17a1 AMDGPU/GlobalISel: Selection for fminnum/fmaxnum
The v2f16 case doesn't work yet because the VOP3P complex patterns haven't
been ported.

llvm-svn: 366585
2019-07-19 14:42:40 +00:00
Matt Arsenault
e76c843d30 AMDGPU/GlobalISel: Support arguments with multiple registers
Handles structs used directly in argument lists.

llvm-svn: 366584
2019-07-19 14:29:30 +00:00
Matt Arsenault
3b8fc1b930 AMDGPU/GlobalISel: Rewrite lowerFormalArguments
This should now handle everything except structs passed as multiple
registers.

I think most of the packing logic should be handled by
handleAssignments, but I'm unclear on what the contract is for
multiple registers. This is copying how x86 handles this.

This does change the behavior of the test_sgpr_alignment0 amdgpu_vs
test. I don't think shader arguments should try to follow the
alignment, and registers need to be repacked. I also don't think it
matters, since I think the pointers are packed to the beginning of the
argument list anyway.

llvm-svn: 366582
2019-07-19 14:15:18 +00:00
Matt Arsenault
a09c49dee2 AMDGPU: Decompose all values to 32-bit pieces for calling conventions
This is the more natural lowering, and presents more opportunities to
reduce 64-bit ops to 32-bit.

This should also help avoid issues graphics shaders have had with
64-bit values, and simplify argument lowering in globalisel.

llvm-svn: 366578
2019-07-19 13:57:44 +00:00
Nico Weber
ab9357c357 gn build: Set +x on symlink_or_copy.py
llvm-svn: 366576
2019-07-19 13:40:54 +00:00
Matt Arsenault
1ad1497bb5 DAG: Handle dbg_value for arguments split into multiple subregs
This was handled previously for arguments split due to not fitting in
an MVT, but the register was being dropped for argument registers split
due to TLI::getRegisterTypeForCallingConv.

llvm-svn: 366574
2019-07-19 13:36:46 +00:00
Than McIntosh
5b376040fd [NFC] include cstdint/string prior to using uint8_t/string
Summary: include the proper headers prior to the use of the uint8_t
typedef and std::string.
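
A minimal sketch of the fix described above; the two declarations below are
hypothetical uses of the affected types:

  #include <cstdint>  // declares uint8_t / std::uint8_t
  #include <string>   // declares std::string

  std::uint8_t tag = 0;            // hypothetical use of uint8_t
  std::string name = "example";    // hypothetical use of std::string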

Subscribers: llvm-commits

Reviewers: cherry

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64937

llvm-svn: 366572
2019-07-19 13:13:54 +00:00
Dmitry Preobrazhensky
29afa16795 [AMDGPU][MC] Corrected parsing of branch offsets
See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D64629

llvm-svn: 366571
2019-07-19 13:12:47 +00:00
Kai Luo
aa4508e98a [MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs.
Summary:
Currently, PRE hoists common computations into
CMBB = DT->findNearestCommonDominator(MBB, MBB1).
However, if CMBB is in a hot loop body, we might see performance
degradation.

Differential Revision: https://reviews.llvm.org/D64394

llvm-svn: 366570
2019-07-19 12:58:16 +00:00
Than McIntosh
fb3f89d50d [X86] For split stack, don't save/restore the nested arg if unused
Summary:
For split-stack, if the nested argument (i.e. R10) is not used, there is no need to save/restore it in the prologue.

Reviewers: thanm

Reviewed By: thanm

Subscribers: mstorsjo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64673

llvm-svn: 366569
2019-07-19 12:54:44 +00:00
Roman Lebedev
5d89c66ff4 [NFC][InstCombine] Tests for 'rem' formation from sub-of-mul-by-'div' (PR42673)
https://rise4fun.com/Alive/8Rp
https://bugs.llvm.org/show_bug.cgi?id=42673
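
A minimal sketch of the pattern these tests cover (function name hypothetical;
assumes y != 0):

  // x - ((x / y) * y) computes the remainder of the division, so the
  // hoped-for fold is to a single 'srem' (i.e. x % y).
  int subOfMulByDiv(int x, int y) {
    return x - ((x / y) * y);  // candidate for folding to 'x % y'
  }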

llvm-svn: 366565
2019-07-19 11:29:18 +00:00
Roman Lebedev
173550eaba [NFC][InstCombine] Redundant masking before left-shift: tests with assume
If the legality check is `(shiftNbits-maskNbits) s>= 0`,
then we can simplify it to `shiftNbits u>= maskNbits`,
which is easier to check for.

However, currently switching `dropRedundantMaskingOfLeftShiftInput()`
to `SimplifyICmpInst()` does not catch these cases and regresses
currently-handled cases, so I'll leave it as is for now.

https://rise4fun.com/Alive/25P
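
Assuming both values are valid shift amounts (i.e. less than the bit width),
the two checks agree; a minimal C++ sketch with hypothetical names:

  #include <cstdint>

  // For 0 <= maskNbits, shiftNbits < 32 these are equivalent
  // (two's-complement wraparound assumed for the cast).
  bool legalSignedForm(uint32_t shiftNbits, uint32_t maskNbits) {
    return (int32_t)(shiftNbits - maskNbits) >= 0;  // (shiftNbits-maskNbits) s>= 0
  }
  bool legalUnsignedForm(uint32_t shiftNbits, uint32_t maskNbits) {
    return shiftNbits >= maskNbits;                 // shiftNbits u>= maskNbits
  }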

llvm-svn: 366564
2019-07-19 11:29:04 +00:00
Simon Pilgrim
d94781f714 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 366563
2019-07-19 11:18:46 +00:00
Oliver Stannard
5b6f9348e5 Don't update NoTrappingFPMath and FPDenormalMode in resetTargetOptions
We'd like to remove this whole function, because these are properties of
functions, not the target as a whole. These two are easy to remove
because they are only used for emitting ARM build attributes, which
expects them to represent the defaults for the whole module, not just
the last function generated.

This is needed to get correct build attributes when using IPRA on ARM,
because IPRA causes resetTargetOptions to get called before
ARMAsmPrinter::emitAttributes.

Differential revision: https://reviews.llvm.org/D64929

llvm-svn: 366562
2019-07-19 10:37:37 +00:00
George Rimar
e8c8550933 [llvm-readelf] - A fix for: "--hash-symbols asserts for 64-bit ELFs"
Fixes https://bugs.llvm.org/show_bug.cgi?id=42622.
(The --hash-symbols switch is currently broken for 64-bit ELF files due to r352630.)

Differential revision: https://reviews.llvm.org/D64788

llvm-svn: 366558
2019-07-19 10:15:03 +00:00
Oliver Stannard
1900a7e6ef [IPRA] Don't rely on non-exact function definitions
If a function definition is not exact, then the linker could select a
differently-compiled version of it, which could use different registers.

https://reviews.llvm.org/D64909

llvm-svn: 366557
2019-07-19 09:59:26 +00:00
Mikhail Maltsev
3c59f45453 [ARM] Add <saturate> operand to SQRSHRL and UQRSHLL
Summary:
According to the new Armv8-M specification
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the
instructions SQRSHRL and UQRSHLL now have an additional immediate
operand <saturate>. The new assembly syntax is:

SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm
UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm

where <saturate> can be either 64 (the existing behavior) or 48, in
which case the result is saturated to 48 bits.

The new operand is encoded as follows:
  #64 Encoded as sat = 0
  #48 Encoded as sat = 1
sat is bit 7 of the instruction bit pattern.

This patch adds a new assembler operand class MveSaturateOperand which
implements parsing and encoding. Decoding is implemented in
DecodeMVEOverlappingLongShift.

Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64810

llvm-svn: 366555
2019-07-19 09:46:28 +00:00
Hubert Tong
8c8d2305a0 [sanitizers] Use covering ObjectFormatType switches
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
(`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added.

For `UnknownObjectFormat`, the effect of the action for the `default`
case is maintained; otherwise, where `llvm_unreachable` is called,
`report_fatal_error` is used instead.

Where the `default` case returns a default value, `report_fatal_error`
is used for XCOFF as a placeholder. For `Wasm`, the effect of the action
for the `default` case is maintained.

The code is structured to avoid strongly implying that the `Wasm` case
is present for any reason other than to make the switch cover all
`ObjectFormatType` enumerator values.
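
A minimal sketch of the covering-switch shape described above; the surrounding
function and its return values are hypothetical:

  #include "llvm/ADT/Triple.h"
  #include "llvm/Support/ErrorHandling.h"

  static bool isSupported(const llvm::Triple &T) {
    switch (T.getObjectFormat()) {
    case llvm::Triple::UnknownObjectFormat:
      return false;  // preserves the old `default` behaviour
    case llvm::Triple::XCOFF:
      llvm::report_fatal_error("XCOFF is not yet supported");  // placeholder
    case llvm::Triple::Wasm:
      return false;  // present only so the switch covers every enumerator
    case llvm::Triple::COFF:
    case llvm::Triple::ELF:
    case llvm::Triple::MachO:
      return true;
    }
    llvm_unreachable("fully covered switch over ObjectFormatType");
  }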

Reviewers: sfertile, jasonliu, daltenty

Reviewed By: sfertile

Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64222

llvm-svn: 366544
2019-07-19 08:46:18 +00:00
Jay Foad
70edc47b65 [AMDGPU] Simplify the exclusive scan used for optimized atomics
Summary:
Change the scan algorithm to use only power-of-two shifts (1, 2, 4, 8,
16, 32) instead of starting off shifting by 1, 2 and 3 and then doing
a 3-way ADD, because:

1. It simplifies the compiler a little.
2. It minimizes vgpr pressure because each instruction is now of the
   form vn = vn + vn << c.
3. It is more friendly to the DPP combiner, which currently can't
   combine into an ADD3 instruction.

Because of #2 and #3 the end result is improved from this:

  v_add_u32_dpp v4, v3, v3  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
  v_mov_b32_dpp v5, v3  row_shr:2 row_mask:0xf bank_mask:0xf
  v_mov_b32_dpp v1, v3  row_shr:3 row_mask:0xf bank_mask:0xf
  v_add3_u32 v1, v4, v5, v1
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:4 row_mask:0xf bank_mask:0xe
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:8 row_mask:0xf bank_mask:0xc
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:15 row_mask:0xa bank_mask:0xf
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:31 row_mask:0xc bank_mask:0xf

To this:

  v_add_u32_dpp v1, v1, v1  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:4 row_mask:0xf bank_mask:0xe
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:8 row_mask:0xf bank_mask:0xc
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:15 row_mask:0xa bank_mask:0xf
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:31 row_mask:0xc bank_mask:0xf

I.e. two fewer computational instructions, and one extra nop slot where we
could schedule something else.
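
A compact sketch of the new shape; laneShiftedDown is a hypothetical stand-in
for the DPP row_shr/row_bcast cross-lane moves shown above:

  #include <cstdint>

  uint32_t laneShiftedDown(uint32_t V, unsigned Lanes);  // hypothetical helper

  uint32_t scanOneLane(uint32_t V) {
    // Every step has the form v = v + (v << c) in lane space, with the
    // offset doubling each time: 1, 2, 4, 8, 16, 32.
    for (unsigned Offset = 1; Offset <= 32; Offset <<= 1)
      V += laneShiftedDown(V, Offset);
    return V;
  }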

Reviewers: arsenm, sheredom, critson, rampitec, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64411

llvm-svn: 366543
2019-07-19 08:40:37 +00:00
Serguei Katkov
8a582d259c [Loop Peeling] Enable peeling of multiple exits by default.
By default, enable loop peeling with multiple exits where all non-latch
exits end up with a deopt.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64619

llvm-svn: 366542
2019-07-19 08:35:45 +00:00
Roman Lebedev
6165112266 [InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

Normally, the inner pattern is a sign-extend, but for our purposes it's no
different from the other patterns:

alive proofs:
f: https://rise4fun.com/Alive/7U3

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563
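
A hedged C++ rendering of variant (f); the cast relies on the usual
arithmetic behaviour of signed right shift, and both shift amounts are
assumed to be less than 32:

  #include <cstdint>

  // ((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt  -->  x << ShiftShAmt
  // whenever ShiftShAmt u>= MaskShAmt: every bit the mask changed is
  // shifted out of the final result.
  uint32_t variantF(uint32_t x, unsigned MaskShAmt, unsigned ShiftShAmt) {
    uint32_t masked = (uint32_t)((int32_t)(x << MaskShAmt) >> MaskShAmt);
    return masked << ShiftShAmt;  // foldable to: x << ShiftShAmt
  }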

Differential Revision: https://reviews.llvm.org/D64524

llvm-svn: 366540
2019-07-19 08:26:58 +00:00
Roman Lebedev
58a3a62a9e [InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
e: https://rise4fun.com/Alive/0FT

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64521

llvm-svn: 366539
2019-07-19 08:26:47 +00:00
Roman Lebedev
07ecac9c32 [InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
d: https://rise4fun.com/Alive/I5Y

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64519

llvm-svn: 366538
2019-07-19 08:26:37 +00:00
Roman Lebedev
b21b3169de [InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
c: https://rise4fun.com/Alive/RgJh

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64517

llvm-svn: 366537
2019-07-19 08:26:25 +00:00