1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

59 Commits

Author SHA1 Message Date
Jon Roelofs
48d5dbf972 [MachineVerifier] Make INSERT_SUBREG diagnostic respect operand 2 subregs
This came out of post-commit review: https://reviews.llvm.org/D105953#inline-1012919

Thanks uabelho!
2021-07-21 08:47:17 -07:00
Jon Roelofs
5f81593d96 [MachineVerifier] Diagnose invalid INSERT_SUBREGs
Differential revision: https://reviews.llvm.org/D105953
2021-07-20 17:32:29 -07:00
Jon Roelofs
04c73eae43 Revert "[MachineVerifier] Diagnose invalid INSERT_SUBREGs"
This reverts commit dd57ba1a17b93dbe211d04cb2d4de5f6dc898d60.

It broke some tests: http://45.33.8.238/linux/51314/step_12.txt
2021-07-16 09:53:55 -07:00
Jon Roelofs
ac7796917a [MachineVerifier] Diagnose invalid INSERT_SUBREGs
Differential revision: https://reviews.llvm.org/D105953
2021-07-16 09:43:12 -07:00
Matt Arsenault
cc12b285b6 CodeGen: Print/parse LLTs in MachineMemOperands
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
2021-06-30 16:54:13 -04:00
Jon Roelofs
b3511ee3cf [GISel] Support llvm.memcpy.inline
Differential revision: https://reviews.llvm.org/D105072
2021-06-30 12:39:05 -07:00
Matt Arsenault
bb6d824b20 GlobalISel: Relax verification of physical register copy types
This was picking a concrete size for a physical register, and
enforcing exact match on the virtual register's type size. Some
targets add multiple types to a register class, and some are smaller
than the full bit width. For example x86 adds f32 to 128-bit xmm
registers, and AMDGPU adds i16/f16 to 32-bit registers.

It might be better to represent these cases as a copy of the full
register and an extraction of the subpart, but a lot of code assumes
you can directly copy. This will help fix the current usage of the DAG
calling convention infrastructure which is incompatible with how
GlobalISel is now using it.

The API is somewhat cumbersome here, but I just mirrored the existing
functions, except now with LLTs (and allow returning null on failure,
unlike the MVT version). I think the concept of selecting register
classes based on type is flawed to begin with, but I'm trying to keep
this compatible with the existing handling.
2021-04-28 08:45:41 -04:00
Amara Emerson
61b28d4f51 [GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates.
Differential Revision: https://reviews.llvm.org/D99383
2021-03-25 17:23:30 -07:00
Jessica Paquette
66363d109b [AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.

This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.

To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.

The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.

This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.

Code size improvements (Darwin):

- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)

Differential Revision: https://reviews.llvm.org/D99358
2021-03-25 17:14:25 -07:00
Jessica Paquette
e1c3b00d6a Add missing -march to runline in llvm/test/MachineVerifier/test_g_ubfx_sbfx.mir
This can cause failures if built without a default target triple.
2021-03-24 11:23:08 -07:00
Jessica Paquette
ae291b6dfb [GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes)
There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.

E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain
code that matches a bitfield extract from an and + right shift.

Rather than duplicating code in the same way, this adds two opcodes:

- G_UBFX (unsigned bitfield extract)
- G_SBFX (signed bitfield extract)

They work like this

```
%x = G_UBFX %y, %lsb, %width
```

Where `lsb` and `width` are

- The least-significant bit of the extraction
- The width of the extraction

This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends
the result, while G_SBFX sign-extends the result.

This should allow us to use the combiner to match the bitfield extraction
patterns rather than duplicating pattern-matching code in each target.

Differential Revision: https://reviews.llvm.org/D98464
2021-03-19 14:37:19 -07:00
Matt Arsenault
3e86dcbeed GlobalISel: Verify G_CONCAT_VECTORS has at least 2 sources 2021-03-01 09:10:36 -05:00
Jessica Paquette
840095b1e1 [GlobalISel] Add G_ASSERT_SEXT
This adds a G_ASSERT_SEXT opcode, similar to G_ASSERT_ZEXT. This instruction
signifies that an operation was already sign extended from a smaller type.

This is useful for functions with sign-extended parameters.

E.g.

```
define void @foo(i16 signext %x) {
 ...
}
```

This adds verifier, regbankselect, and instruction selection support for
G_ASSERT_SEXT equivalent to G_ASSERT_ZEXT.

Differential Revision: https://reviews.llvm.org/D96890
2021-02-17 13:10:34 -08:00
Jay Foad
d27645b72c [GlobalISel] Simpler verification of G_SEXT_INREG and G_ASSERT_ZEXT
There's no need to call verifyVectorElementMatch since we already know
that the source and destination types are identical.

Differential Revision: https://reviews.llvm.org/D96589
2021-02-12 21:33:27 +00:00
Jessica Paquette
3fcc23c823 [GlobalISel] Add G_ASSERT_ZEXT
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.

This is intended to be similar to AssertZext in SelectionDAG.

For example,

```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```

Signifies that the top 48 bits of %x are known to be 0.

This is useful in cases like this:

```
define i1 @zeroext_param(i8 zeroext %x) {
  %cmp = icmp ult i8 %x, -20
  ret i1 %cmp
}
```

In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit
value.

If we know that `%x` is already zero-ed out in the relevant high bits, we can
avoid the truncate.

Currently, in GISel, this looks like this:

```
_zeroext_param:
  and w8, w0, #0xff ; We don't actually need this!
  cmp w8, #236
  cset w0, lo
  ret
```

While SDAG does not produce the truncation, since it knows that it's
unnecessary:

```
_zeroext_param:
  cmp w0, #236
  cset w0, lo
  ret
```

This patch

- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it

It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)

This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.

Differential Revision: https://reviews.llvm.org/D95564
2021-01-28 13:58:37 -08:00
Serguei Katkov
187141b789 [Verifier] Add tied-ness verification to statepoint intsruction
Reviewers: reames, dantrushin
Reviewed By: reames, dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D94483
2021-01-13 14:40:44 +07:00
Amara Emerson
03855295f3 [GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions.
It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE
operations in the translator rather than carrying the scalar around. The
majority of the time it'll get simplified away as the scalars are probably
identity values.

Differential Revision: https://reviews.llvm.org/D89150
2020-10-15 15:51:44 -07:00
Amara Emerson
bbd25a9a88 [GlobalISel] Add G_VECREDUCE_* opcodes for vector reductions.
These mirror the IR and SelectionDAG intrinsics & nodes.

Opcodes added:
G_VECREDUCE_SEQ_FADD
G_VECREDUCE_SEQ_FMUL
G_VECREDUCE_FADD
G_VECREDUCE_FMUL
G_VECREDUCE_FMAX
G_VECREDUCE_FMIN
G_VECREDUCE_ADD
G_VECREDUCE_MUL
G_VECREDUCE_AND
G_VECREDUCE_OR
G_VECREDUCE_XOR
G_VECREDUCE_SMAX
G_VECREDUCE_SMIN
G_VECREDUCE_UMAX
G_VECREDUCE_UMIN

Differential Revision: https://reviews.llvm.org/D88750
2020-10-08 10:33:19 -07:00
Russell Gallop
16726ba1da [Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL
This hasn't been allowed as a build option since r309990

Remove leftover REQUIRES: global-isel

Differential Revision: https://reviews.llvm.org/D86714
2020-08-27 16:36:27 +01:00
Russell Gallop
c9d8b1aa6b Fix for PS4 bots after 0b7f6cc71a72a85f8a0cbee836a7a8e31876951a 2020-08-27 12:47:26 +01:00
Matt Arsenault
a03ce058a1 GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently directly consumes these and custom
lowering to produce a libcall, but really these should follow the
normal legalization process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Matt Arsenault
53cca02b90 GlobalISel: Verify G_BITCAST changes the type
Updated the AArch64 tests the best I could with my vague, inferred
understanding of AArch64 register banks. As far as I can tell, there
is only one 32-bit/64-bit type which will use the gpr register bank,
so we have to use the fpr bank for the other operand.
2020-07-08 17:16:27 -04:00
Matt Arsenault
58a9482211 GlobalISel: Disallow undef generic virtual register uses
With an undef operand, it's possible for getVRegDef to fail and return
null. This is an edge case very little code bothered to
consider. Proper gMIR should use G_IMPLICIT_DEF instead.

I initially tried to apply this restriction to all SSA MIR, so then
getVRegDef would never fail anywhere. However, ProcessImplicitDefs
does technically run while the function is in SSA. ProcessImplicitDefs
and DetectDeadLanes would need to either move, or a new pseudo-SSA
type of function property would need to be introduced.
2020-06-30 19:18:01 -04:00
Dominik Montada
e95e4419fe [NFC] Remove unnecessary require global-isel from tests 2020-06-15 16:35:18 +02:00
Dominik Montada
1fcb94df38 [MachineVerifier][GlobalISel] Check that branches have a MBB operand or are declared indirect. Add missing properties to G_BRJT, G_BRINDIRECT
Summary:
Teach MachineVerifier to check branches for MBB operands if they are not declared indirect.

Add `isBarrier`, `isIndirectBranch` to `G_BRINDIRECT` and `G_BRJT`.
Without these, `MachineInstr.isConditionalBranch()` was giving a
false-positive for those instructions.

Reviewers: aemerson, qcolombet, dsanders, arsenm

Reviewed By: dsanders

Subscribers: hiraditya, wdng, simoncook, s.egerton, arsenm, rovka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81587
2020-06-15 11:17:09 +02:00
James Y Knight
ae071ee9e2 Simplify MachineVerifier's block-successor verification.
There's two properties we want to verify:

1. That the successors returned by analyzeBranch are in the CFG
   successor list, and
2. That there are no extraneous successors are in the CFG successor
   list.

The previous implementation mostly accomplished this, but in a very
convoluted manner.

Differential Revision: https://reviews.llvm.org/D79793
2020-06-06 22:30:51 -04:00
Matt Arsenault
0d431b73a3 GlobalISel: Merge G_PTR_MASK with llvm.ptrmask intrinsic
Confusingly, these were unrelated and had different semantics. The
G_PTR_MASK instruction predates the llvm.ptrmask intrinsic, but has a
different format. G_PTR_MASK only allows clearing the low bits of a
pointer, and only a constant number of bits. The ptrmask intrinsic
allows an arbitrary mask. Replace G_PTR_MASK to match the intrinsic.

Only selects the cases that look like the old instruction. More work
is needed to select the general case. Also new legalization code is
still needed to deal with the case where the incoming mask size does
not match the pointer size, which has a specified behavior in the
langref.
2020-05-26 11:48:13 -04:00
Yuanfang Chen
dd53274771 Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
2020-02-13 10:16:06 -08:00
Yuanfang Chen
2dbac841f9 Revert "Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit""""
This reverts commit bb51d243308dbcc9a8c73180ae7b9e47b98e68fb.
2020-02-13 10:08:05 -08:00
Yuanfang Chen
93e82c22ef Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
2020-02-13 10:02:53 -08:00
Yuanfang Chen
c7fb4c55c4 Revert "Reland "[Support] make report_fatal_error abort instead of exit""
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.

There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
2020-02-11 20:41:53 -08:00
Yuanfang Chen
83a2f3c1ba Reland "[Support] make report_fatal_error abort instead of exit"
Summary:
Reland D67847 after D73742 is committed. Replace `sys::Process::Exit(1)`
with `abort` in `report_fatal_error`.

After this patch, for tools turning on `CrashRecoveryContext`,
crash handler installed by `CrashRecoveryContext` is called unless
they installed a non-returning handler using `llvm::install_fatal_error_handler`
like `cc1_main` currently does.

Reviewers: rnk, MaskRay, aganea, hans, espindola, jhenderson

Subscribers: jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, rupprecht, jocewei, jsji, Jim, dmgreen, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74456
2020-02-11 18:20:40 -08:00
Yuanfang Chen
b1c09bbef0 Revert "[Support] make report_fatal_error abort instead of exit"
This reverts commit 647c3f4e47de8a850ffcaa897db68702d8d2459a.

Got bots failure from sanitizer-windows and maybe others.
2020-01-15 17:52:25 -08:00
Yuanfang Chen
725cd0da61 [Support] make report_fatal_error abort instead of exit
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.

** The reason it fixes PR35547 is

`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).

Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola

Reviewed By: rnk, MaskRay

Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D67847
2020-01-15 17:05:13 -08:00
Jonas Paulsson
a1f306c9ad Recommit "[MachineVerifier] Improve verification of live-in lists."
MachineVerifier::visitMachineFunctionAfter() is extended to check the
live-through case for live-in lists. This is only done for registers without
aliases and that are neither allocatable or reserved, such as the SystemZ::CC
register.

The MachineVerifier earlier only catched the case of a live-in use without an
entry in the live-in list (as "using an undefined physical register").

A comment in LivePhysRegs.h has been added stating a guarantee that
addLiveOuts() can be trusted for a full register both before and after
register allocation.

Review: Quentin Colombet

Differential Revision: https://reviews.llvm.org/D68267
2020-01-08 16:58:54 -08:00
Jonas Paulsson
2a69f5997d [MachineVerifier] Improve checks of target instructions operands.
While working with a patch for instruction selection, the splitting of a
large immediate ended up begin treated incorrectly by the backend. Where a
register operand should have been created, it instead became an immediate. To
my surprise the machine verifier failed to report this, which at the time
would have been helpful.

This patch improves the verifier so that it will report this type of error.

This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at
https://bugs.llvm.org/show_bug.cgi?id=44091

Review: thegameg, arsenm, fhahn
https://reviews.llvm.org/D63973
2019-12-03 10:20:52 +01:00
Galina Kistanova
0ea4a2c505 Revert "[MachineVerifier] Improve verification of live-in lists.
This reverts commit b7b170c to give the author more time to address failing tests on the expensive checks buildbots.
2019-11-07 14:02:13 -08:00
Daniel Sanders
7a5b72e3a3 [globalisel] Rename G_GEP to G_PTR_ADD
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD

Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69734
2019-11-05 10:31:17 -08:00
Jonas Paulsson
fcb5e0f825 Fix buildbots troubled by b7b170c.
Add '# REQUIRES: systemz-registered-target' in the new tests.
2019-11-04 16:54:33 +01:00
Jonas Paulsson
1fa33bad39 [MachineVerifier] Improve verification of live-in lists.
MachineVerifier::visitMachineFunctionAfter() is extended to check the
live-through case for live-in lists. This is only done for registers without
aliases and that are neither allocatable or reserved, such as the SystemZ::CC
register.

The MachineVerifier earlier only catched the case of a live-in use without
an entry in the live-in list (as "using an undefined physical register").

A comment in LivePhysRegs.h has been added stating a guarantee that
addLiveOuts() can be trusted for a full register both before and after
register allocation.

Review: Quentin Colombet
https://reviews.llvm.org/D68267
2019-11-04 16:22:00 +01:00
Amara Emerson
bcfd2edd61 Add an operand to memory intrinsics to denote the "tail" marker.
We need to propagate this information from the IR in order to be able to safely
do tail call optimizations on the intrinsics during legalization. Assuming
it's safe to do tail call opt without checking for the marker isn't safe because
the mem libcall may use allocas from the caller.

This adds an extra immediate operand to the end of the intrinsics and fixes the
legalizer to handle it.

Differential Revision: https://reviews.llvm.org/D68151

llvm-svn: 373140
2019-09-28 05:33:21 +00:00
Amara Emerson
cdac4a1a3c Remove unnecessary REQUIRES from a test.
llvm-svn: 369835
2019-08-24 02:39:51 +00:00
Amara Emerson
b67ced190e [GlobalISel] Introduce a G_DYN_STACKALLOC opcode to represent dynamic allocas.
This just adds the opcode and verifier, it will be used to replace existing
dynamic alloca handling in a subsequent patch.

Differential Revision: https://reviews.llvm.org/D66677

llvm-svn: 369833
2019-08-24 02:25:56 +00:00
Aditya Nandakumar
5b8b42ce04 [GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources
https://reviews.llvm.org/D66171

llvm-svn: 368753
2019-08-13 21:49:11 +00:00
Matt Arsenault
3482b4fef4 GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR
llvm-svn: 368705
2019-08-13 15:52:21 +00:00
Matt Arsenault
284e8e1c63 GlobalISel: Change representation of shuffle masks
Currently shufflemasks get emitted as any other constant, and you end
up with a bunch of virtual registers of G_CONSTANT with a
G_BUILD_VECTOR. The AArch64 selector then asserts on anything that
doesn't fit this pattern. This isn't an ideal representation, and
should avoid legalization and have fewer opportunities for a
representational error.

Rather than invent a new shuffle mask operand type, similar to what
ShuffleVectorSDNode does, just track the original IR Constant mask
operand. I don't completely like the idea of adding another link to
the IR, but MIR is already quite dependent on IR constants already,
and this will allow sharing the shuffle mask utility functions with
the IR.

llvm-svn: 368704
2019-08-13 15:34:38 +00:00
Daniel Sanders
52b4df9ca4 Add missing REQUIRES to r368487
llvm-svn: 368494
2019-08-09 22:16:16 +00:00
Daniel Sanders
233ed83478 [globalisel] Add G_SEXT_INREG
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
  %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
  %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
  %3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 24
  %3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
  %1:_(s32) = G_SHL %0:_(s32), i32 24
  %2:_(s32) = G_ASHR %1:_(s32), i32 23

All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.

To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
  %1 = G_CONSTANT 16
  %2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
  %2 = G_SEXT_INREG %0, 16
or:
  %1 = G_CONSTANT 16
  %2:_(s32) = G_SHL %0, %1
  %3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.

Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61289

llvm-svn: 368487
2019-08-09 21:11:20 +00:00
David Zarzycki
eb6539b002 [Testing] Fix tests that break with read-only checkouts
Found with `mount --bind -o ro ...` on Linux.

llvm-svn: 367519
2019-08-01 06:41:40 +00:00
Matt Arsenault
f7e237142b GlobalISel: Verify G_MERGE_VALUES operand sizes
llvm-svn: 364822
2019-07-01 18:01:35 +00:00