1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

181396 Commits

Author SHA1 Message Date
Craig Topper
78b10f8bde [X86] Remove unnecessary isel pattern for MOVLPSmr.
This was identical to a pattern for MOVPQI2QImr with a bitcast
as an input. But we should be able to turn MOVPQI2QImr into
MOVLPSmr in the execution domain fixup pass so we shouldn't
need this.

llvm-svn: 365224
2019-07-05 17:31:25 +00:00
Christudasan Devadasan
e2c5ebd6e9 [NFC] A test commit to check the access permission. Removed a blank line.
llvm-svn: 365223
2019-07-05 17:07:42 +00:00
James Henderson
10e7ad4054 [docs][llvm-readobj] Add a note to options that do nothing in GNU output
--section-data, --section-relocations and --section-symbols have no
effect for GNU style ouput. This patch changes the docs to point this
out, as it has caught me out on a couple of occasions.

See also https://bugs.llvm.org/show_bug.cgi?id=42522.

llvm-svn: 365221
2019-07-05 16:38:52 +00:00
Thomas Preud'homme
997901b7e1 [FileCheck] Share variable instance among uses
Summary:
This patch changes expression support to use one instance of
FileCheckNumericVariable per numeric variable rather than one per
variable and per definition. The current system was only necessary for
the last patch of the numeric expression support patch series in order
to handle a line using a variable defined earlier on the same line from
the input text. However this can be dealt more efficiently.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64229

llvm-svn: 365220
2019-07-05 16:25:46 +00:00
Thomas Preud'homme
e54ac8e70d [FileCheck] Don't diagnose undef vars at parse time
Summary:
Diagnosing use of undefined variables takes place in
parseNumericVariableUse() and printSubstitutions() for numeric variables
but only takes place in printSubstitutions() for string variables. The
reason for the split location of diagnostics is that parsing is not
aware of the clearing of variables due to --enable-var-scope and thus
use of variables cleared in this way can only be catched by
printSubstitutions().

Beyond the code level inconsistency, there is also a user facing
inconsistency since diagnostics look different between the two
functions. While the diagnostic in printSubstitutions is more verbose,
doing the diagnostic there allows to diagnose all undefined variables
rather than just the first one and error out.

This patch create dummy variable definition when encountering a use of
undefined variable so that parsing can proceed and be diagnosed by
printSubstitutions() later. Tests that were testing whether parsing
fails in such case are thus modified accordingly.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64228

llvm-svn: 365219
2019-07-05 16:25:33 +00:00
Yaxun Liu
205ba6a068 [AMDGPU] Added a new metadata for multi grid sync implicit argument
Patch by Christudasan Devadasan.

Differential Revision: https://reviews.llvm.org/D63886

llvm-svn: 365217
2019-07-05 16:05:17 +00:00
Matt Arsenault
562e34ab2e ScheduleDAG: Fix incorrectly killing registers in bundles
When looking for uses/defs to add kill flags, the iterator was double
incremented, skipping the first instruction in the bundle. The use
register in the first bundle instruction was then incorrectly killed.
The "First" instruction should be the BUNDLE itself as the proper
reverse iterator endpoint.

llvm-svn: 365216
2019-07-05 15:32:28 +00:00
Eugene Leviant
1cec5950f5 [ThinLTO] Attempt to recommit r365188 after alignment fix
llvm-svn: 365215
2019-07-05 15:25:05 +00:00
David Green
3aa131e8d6 [ARM] MVE patterns for VMVN, VORR and VBIC
This add simple Q register forms of bitwise not instructions.

Differential Revision: https://reviews.llvm.org/D63983

llvm-svn: 365214
2019-07-05 15:21:29 +00:00
Nico Weber
a963a94a23 gn build: Merge r365203
llvm-svn: 365213
2019-07-05 15:14:06 +00:00
Jay Foad
7f13ea9dd2 [AMDGPU] DPP combiner: recognize identities for more opcodes
Summary:
This allows the DPP combiner to kick in more often. For example the
exclusive scan generated by the atomic optimizer for a divergent atomic
add used to look like this:

        v_mov_b32_e32 v3, v1
        v_mov_b32_e32 v5, v1
        v_mov_b32_e32 v6, v1
        v_mov_b32_dpp v3, v2  wave_shr:1 row_mask:0xf bank_mask:0xf
        s_nop 1
        v_add_u32_dpp v4, v3, v3  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
        v_mov_b32_dpp v5, v3  row_shr:2 row_mask:0xf bank_mask:0xf
        v_mov_b32_dpp v6, v3  row_shr:3 row_mask:0xf bank_mask:0xf
        v_add3_u32 v3, v4, v5, v6
        v_mov_b32_e32 v4, v1
        s_nop 1
        v_mov_b32_dpp v4, v3  row_shr:4 row_mask:0xf bank_mask:0xe
        v_add_u32_e32 v3, v3, v4
        v_mov_b32_e32 v4, v1
        s_nop 1
        v_mov_b32_dpp v4, v3  row_shr:8 row_mask:0xf bank_mask:0xc
        v_add_u32_e32 v3, v3, v4
        v_mov_b32_e32 v4, v1
        s_nop 1
        v_mov_b32_dpp v4, v3  row_bcast:15 row_mask:0xa bank_mask:0xf
        v_add_u32_e32 v3, v3, v4
        s_nop 1
        v_mov_b32_dpp v1, v3  row_bcast:31 row_mask:0xc bank_mask:0xf
        v_add_u32_e32 v1, v3, v1
        v_add_u32_e32 v1, v2, v1
        v_readlane_b32 s0, v1, 63

But now most of the dpp movs are combined into adds:

        v_mov_b32_e32 v3, v1
        v_mov_b32_e32 v5, v1
        s_nop 0
        v_mov_b32_dpp v3, v2  wave_shr:1 row_mask:0xf bank_mask:0xf
        s_nop 1
        v_add_u32_dpp v4, v3, v3  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
        v_mov_b32_dpp v5, v3  row_shr:2 row_mask:0xf bank_mask:0xf
        v_mov_b32_dpp v1, v3  row_shr:3 row_mask:0xf bank_mask:0xf
        v_add3_u32 v1, v4, v5, v1
        s_nop 1
        v_add_u32_dpp v1, v1, v1  row_shr:4 row_mask:0xf bank_mask:0xe
        s_nop 1
        v_add_u32_dpp v1, v1, v1  row_shr:8 row_mask:0xf bank_mask:0xc
        s_nop 1
        v_add_u32_dpp v1, v1, v1  row_bcast:15 row_mask:0xa bank_mask:0xf
        s_nop 1
        v_add_u32_dpp v1, v1, v1  row_bcast:31 row_mask:0xc bank_mask:0xf
        v_add_u32_e32 v1, v2, v1
        v_readlane_b32 s0, v1, 63

Reviewers: arsenm, vpykhtin

Subscribers: kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, MaskRay, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64207

llvm-svn: 365211
2019-07-05 14:52:48 +00:00
Eugene Leviant
1492333665 Reverted r365188 due to alignment problems on i686-android
llvm-svn: 365206
2019-07-05 13:26:05 +00:00
Graham Hunter
c25ec2cf30 Scalable Vector IR Type with further LTO fixes
Reintroduces the scalable vector IR type from D32530, after it was reverted
a couple of times due to increasing chromium LTO build times. This latest
incarnation removes the walk over aggregate types from the verifier entirely,
in favor of rejecting scalable vectors in the isValidElementType methods in
ArrayType and StructType. This removes the 70% degradation observed with
the second repro tarball from PR42210.

Reviewers: thakis, hans, rengolin, sdesmalen

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D64079

llvm-svn: 365203
2019-07-05 12:48:16 +00:00
Robert Lougher
996cf75e4f This reverts r365061 and r365062 (test update)
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.

llvm-svn: 365202
2019-07-05 12:42:06 +00:00
Sam Elliott
e4d790ca7d [RISCV] Support @llvm.readcyclecounter() Intrinsic
On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock
cycles executed by the core, from an arbitrary point in the past. This
matches the intended semantics of `@llvm.readcyclecounter()`, which we
currently leave to the default lowering (to the constant 0).

With this patch, we will now correctly lower this intrinsic to the
intended semantics, using the user-space instruction `rdcycle`. On
64-bit targets, we can directly lower to this instruction.

On 32-bit targets, we need to do more, as `rdcycle` only returns the low
32-bits of the `cycle` CSR. In this case, we perform a custom lowering,
based on the PowerPC lowering, using `rdcycleh` to obtain the high
32-bits of the `cycle` CSR. This custom lowering inserts a new basic
block which detects overflow in the high 32-bits of the `cycle` CSR
during reading (because multiple instructions are required to read). The
emitted assembly matches the suggested assembly in the RISC-V
specification.

Differential Revision: https://reviews.llvm.org/D64125

llvm-svn: 365201
2019-07-05 12:35:21 +00:00
Nico Weber
289a9cccdc lld, llvm-dlltool, llvm-lib: Use getAsString() instead of getSpelling() for printing unknown args
Since OPT_UNKNOWN args never have any values and consist only of
spelling (and are never aliased), this doesn't make any difference in
practice, but it's more consistent with Arg's guidance to use
getAsString() for diagnostics, and it matches what clang does.

Also tweak two tests to use an unknown option that contains '=' for
additional coverage while here. (The new tests pass fine with the old
code too though.)

llvm-svn: 365200
2019-07-05 12:31:32 +00:00
Robert Lougher
482875e832 Revert r365198 as this accidentally commited something that
should not have been added.

llvm-svn: 365199
2019-07-05 12:30:45 +00:00
Robert Lougher
76cec1c67e This reverts r365061 and r365062 (test update)
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.

llvm-svn: 365198
2019-07-05 12:20:21 +00:00
Sam Elliott
f0504afcda [RISCV][NFC] Replace hard-coded CSR duplication with symbolic references
Reviewers: asb, lenary

Reviewed By: asb, lenary

Subscribers: MaskRay, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64139

Patch by James Clarke (jrtc27)

llvm-svn: 365195
2019-07-05 12:16:40 +00:00
Simon Pilgrim
9df63d034d Fix MSVC/cppcheck Use::Next isn't initialized warning. NFCI.
llvm-svn: 365194
2019-07-05 12:12:23 +00:00
Eugene Leviant
6f9815974b [llvm-objcopy] Allow strip symtab from executables and DSOs
Differential revision: https://reviews.llvm.org/D61672

llvm-svn: 365193
2019-07-05 12:10:44 +00:00
Thomas Preud'homme
e7bb9ea3e0 [FileCheck] Fix comment in parseNumericVariableUse
Summary:
Comment explaining the interaction between parsing of numeric variable
definition and uses in parseNumericVariableUse is stale since it
suggests both use and definition parsing is done in the same function.
This was the case in a previous version of the patch committed as
71d3f227a790d6cf39d8c6267940e0dc0c237e11 but is no longer the case. This
patch updates the comment accordingly.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64227

llvm-svn: 365192
2019-07-05 12:01:12 +00:00
Thomas Preud'homme
0520d355f2 [FileCheck] Factor some parsing checks out
Summary:
Both callers of parseNumericVariableDefinition() perform the same extra
check that no character is found after the variable name. This patch
factors out this check into parseNumericVariableDefinition().

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64226

llvm-svn: 365191
2019-07-05 12:01:06 +00:00
Thomas Preud'homme
c7adf5822e [FileCheck] Add missing final dot in comment
llvm-svn: 365190
2019-07-05 12:00:56 +00:00
Eugene Leviant
f98174731f [ThinLTO] Attempt to recommit r365040 after caching fix
It's possible that some function can load and store the same
variable using the same constant expression:

store %Derived* @foo, %Derived** bitcast (%Base** @bar to %Derived**)
%42 = load %Derived*, %Derived** bitcast (%Base** @bar to %Derived**)

The bitcast expression was mistakenly cached while processing loads,
and never examined later when processing store. This caused @bar to
be mistakenly treated as read-only variable. See load-store-caching.ll.

llvm-svn: 365188
2019-07-05 12:00:10 +00:00
James Henderson
d000b963f9 [docs][llvm-objcopy] Improve some wording.
llvm-svn: 365187
2019-07-05 11:57:07 +00:00
Nico Weber
8896c28cc1 Make joined instances of JoinedOrSeparate flags point to the unaliased args, like all other arg types do
This fixes an 8-year-old regression. r105763 made it so that aliases
always refer to the unaliased option – but it missed the "joined" branch
of JoinedOrSeparate flags. (r162231 then made the Args classes
non-virtual, and r169344 moved them from clang to llvm.)

Back then, there was no JoinedOrSeparate flag that was an alias, so it
wasn't observable. Now /U in CLCompatOptions is a JoinedOrSeparate alias
in clang, and warn_slash_u_filename incorrectly used the aliased arg id
(using the unaliased one isn't really a regression since that warning
checks if the undefined macro contains slash or backslash and only then
emits the warning – and no valid use will pass "-Ufoo/bar" or similar).

Also, lld has many JoinedOrSeparate aliases, and due to this bug it had
to explicitly call `getUnaliasedOption()` in a bunch of places, even
though that shouldn't be necessary by design. After this fix in Option,
these calls really don't have an effect any more, so remove them.

No intended behavior change.

(I accidentally fixed this bug while working on PR29106 but then
wondered why the warn_slash_u_filename broke. When I figured it out, I
thought it would make sense to land this in a separate commit.)

Differential Revision: https://reviews.llvm.org/D64156

llvm-svn: 365186
2019-07-05 11:45:24 +00:00
Nico Weber
24ec7afabe gn build: Merge r365179
llvm-svn: 365185
2019-07-05 11:34:48 +00:00
George Rimar
7add0fa08e [Object/ELF.h] - Improve error reporting.
The errors coming from ELF.h are usually not very
useful because they are uninformative. This patch is a
first step to improve the situation.

I tested this patch with a run of check-llvm and found
that few messages are untested. In this patch, I did not
add more tests but marked all such cases with a "TODO" comment.

For all tested messages I extended the error text to
provide more details (see test cases changed).

Differential revision: https://reviews.llvm.org/D64014

llvm-svn: 365183
2019-07-05 11:28:49 +00:00
Nico Weber
41ec2302bf lld-link: Make /debugtype: option work better
- The code tried to pass false to split()'s KeepEmpty parameter, but
  instead passed it to MaxSplit. As a result, it would never split on
  commas. This has been broken since the flag was added in r278056.

- The code used getSpelling() for getting the argument's values, but
  getSpelling() always returns the `/debugtype:` prefix without any
  values. So if any /debugtype: flag was passed, it always resulted in
  an "unknown option:" warning. (The warning code then used the correct
  getValue() for printing the invalid option, so the warning looked
  kind of like it made sense.) This regressed in r342894.

Slightly improve the test coverage of this feature (but since I don't
know what this flag actually does, there's still no test for the correct
semantics), and add a comment to getSpelling() explaining what it does.

llvm-svn: 365182
2019-07-05 11:28:31 +00:00
Simon Pilgrim
a505ee9998 [X86][SSE] LowerINSERT_VECTOR_ELT - early out for out of range indices
Fixes OSS-Fuzz #15662

llvm-svn: 365180
2019-07-05 10:34:53 +00:00
David Green
1676daeb17 [ARM] MVE VMOV immediate handling
This adds some handling for VMOVimm, using the same method that NEON uses. We
create VMOVIMM/VMVNIMM/VMOVFPIMM nodes based on the immediate, and select them
using the now renamed ARMvmovImm/etc. There is also an extra 64bit immediate
mode that I have not yet added here.

Code by David Sherwood

Differential Revision: https://reviews.llvm.org/D63884

llvm-svn: 365178
2019-07-05 10:02:43 +00:00
David Green
1783c7f2bd [ARM] MVE fp to int conversions
This adds the patterns needed for fptosi and sitofp.

Differential Revision: https://reviews.llvm.org/D63729

llvm-svn: 365176
2019-07-05 09:34:30 +00:00
Fangrui Song
c2647006c3 [RISCV] Delete a ctor that is commented out. NFC
llvm-svn: 365175
2019-07-05 08:25:14 +00:00
Seiya Nuta
95574c7fc3 [llvm-objcopy][NFC] Refactor output target parsing v2
Summary:
Use an enum instead of string to hold the output file format in Config.InputFormat and Config.OutputFormat. It's essential to support other output file formats other than ELF.

This patch originally has been submitted as D63239. However, there was an use-of-uninitialized-value bug and reverted in r364379 (git commit 4ee933c).

This patch includes the fix for the bug by setting Config.InputFormat/Config.OutputFormat in parseStripOptions.

Reviewers: espindola, alexshap, rupprecht, jhenderson

Reviewed By: jhenderson

Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64170

llvm-svn: 365173
2019-07-05 05:28:38 +00:00
Fangrui Song
fc1281d728 [llvm-objcopy][test] Fix respect-umask.test after D62718/r365162
llvm-svn: 365172
2019-07-05 05:10:28 +00:00
Alex Brachet
f2933ff9d8 Fix patch not passing test cases
llvm-svn: 365170
2019-07-05 01:28:41 +00:00
Alex Brachet
ed81df11c0 Temporarily stop failing test case
llvm-svn: 365168
2019-07-05 01:13:09 +00:00
Peter Collingbourne
2b7fc385f9 gn build: Merge r365130.
llvm-svn: 365167
2019-07-05 01:11:20 +00:00
Peter Collingbourne
f3e1554185 gn build: Merge r365103.
llvm-svn: 365166
2019-07-05 01:11:18 +00:00
Peter Collingbourne
b5743da917 gn build: Merge r365007.
llvm-svn: 365165
2019-07-05 01:11:16 +00:00
Peter Collingbourne
d71bef0e75 gn build: Merge r365091.
llvm-svn: 365164
2019-07-05 01:11:14 +00:00
Craig Topper
14cc551da9 [X86] Add custom isel to select ADD/SUB/OR/XOR/AND to their non-immediate forms under optsize when the immediate has additional users.
Summary:
We attempt to prevent folding immediates with multiple users under optsize. But we only do this from store nodes and X86ISD::ADD/SUB/XOR/OR/AND patterns. We don't do it for ISD::ADD/SUB/XOR/OR/AND even though we count them as users when deciding whether to fold into other nodes. This leads to situations where we block folding to a compare for example, but still fold into an AND or OR as seen in PR27202.

Unfortunately touching the isel patterns in tablegen for the ISD::ADD/SUB/XOR/OR/AND opcodes will cause the patterns to be unusable for fast isel. And we don't have a way to make a fast isel only pattern.

To workaround this, this patch adds custom isel in front of the isel table that will select the non-immediate forms if the immediate has additional users. This may create some issues for ANDN and NOT matching. And there's room for improvement with unsigned 32 immediates on 64-bit AND.

This patch needs more thorough test cases, but I wanted to get feedback on the direction. Please send me any other test cases you've seen in the wild.

I think we probably have the same issue with the immediate matching when we fold RMW from X86ISD::ADD/SUB/XOR/OR/AND. And our TEST immedaite shrinking logic. Our cost modeling for immediates that can fit in a sign extended 8-bit immediate on a 16/32/64 bit operation is completely wrong.

I also wonder if we should update the ConstantHoisting cost model and block folding for "opaque" constants. But of course constants can still be created by DAG combine and lowering optimizations.

Fixes PR27202

Reviewers: spatel, RKSimon, andreadb

Reviewed By: RKSimon

Subscribers: jsji, hiraditya, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59909

llvm-svn: 365163
2019-07-04 22:53:57 +00:00
Alex Brachet
33790e584f [llvm-objcopy] Change handling of output file permissions
Summary: Address bug [[ https://bugs.llvm.org/show_bug.cgi?id=42082 | 42082 ]] where files were always outputted with 0775 permissions. Now, the output file is given either 0666 or 0777 if the object is executable.

Reviewers: espindola, alexshap, rupprecht, jhenderson, jakehehrlich, MaskRay

Reviewed By: rupprecht, jhenderson, jakehehrlich, MaskRay

Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62718

llvm-svn: 365162
2019-07-04 22:45:27 +00:00
Simon Atanasyan
1adce17c0f [mips] Refactor expandSeq and expandSeqI methods. NFC
llvm-svn: 365161
2019-07-04 22:45:07 +00:00
Hubert Tong
3bebe36404 [NFC] Make some ObjectFormatType switches covering
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
are then added.

For `UnknownObjectFormat`, the action (`llvm_unreachable`) for the
`default` case is kept.

For the other unhandled cases, `report_fatal_error` is used instead.

Reviewers: sfertile, jasonliu, daltenty

Reviewed By: sfertile

Subscribers: wuzish, aheejin, jsji, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D63767

llvm-svn: 365160
2019-07-04 21:40:28 +00:00
Alex Brachet
847203d76b [docs] [tools] Fix see also links
Summary: Changes "see also" links to use :manpage: instead of plain text or the form `name|name` which was being treated literally, not as a link.

Reviewers: jhenderson, rupprecht

Reviewed By: jhenderson

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63970

llvm-svn: 365159
2019-07-04 21:19:05 +00:00
Craig Topper
20330a7c7f [DAGCombiner] Don't combine (addcarry (uaddo X, Y), 0, Carry) -> (addcarry X, Y, Carry) if the Carry comes from the uaddo.
Summary:
The uaddo won't be removed and the addcarry will still be
dependent on the uaddo. So we'll just increase the use count
of X and Y and potentially require a COPY.

Reviewers: spatel, RKSimon, deadalnix

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64190

llvm-svn: 365149
2019-07-04 18:18:46 +00:00
Tim Renouf
dd229b271a [AMDGPU] Custom lower INSERT_SUBVECTOR v3, v4, v5, v8
Summary:
Since the changes to introduce vec3 and vec5, INSERT_VECTOR for these
sizes has been marked "expand", which made LegalizeDAG lower it to loads
and stores via a stack slot. The code got optimized a bit later, but the
now-unused stack slot was never deleted.

This commit avoids that problem by custom lowering INSERT_SUBVECTOR into
an EXTRACT_VECTOR_ELT and INSERT_VECTOR_ELT for each element in the
subvector to insert.

V2: Addressed review comments re test.

Differential Revision: https://reviews.llvm.org/D63160

Change-Id: I9e3c13e36f68cfa3431bb9814851cc1f673274e1
llvm-svn: 365148
2019-07-04 17:38:24 +00:00
Sanjay Patel
eab9da7e7d [InstCombine] allow undef elements when forming splat from chain of insertelements
We allow forming a splat (broadcast) shuffle, but we were conservatively limiting
that to cases where all elements of the vector are specified. It should be safe
from a codegen perspective to allow undefined lanes of the vector because the
expansion of a splat shuffle would become the chain of inserts again.

Forming splat shuffles can reduce IR and help enable further IR transforms.
Motivating bugs:
https://bugs.llvm.org/show_bug.cgi?id=42174
https://bugs.llvm.org/show_bug.cgi?id=16739

Differential Revision: https://reviews.llvm.org/D63848

llvm-svn: 365147
2019-07-04 16:45:34 +00:00