1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

61145 Commits

Author SHA1 Message Date
Simon Pilgrim
f357de2ce2 [X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold
Limiting scalar hadd/hsub generation to the lowest xmm looks to be unnecessary - we will be extracting one upper xmm whatever, and we can remove a shuffle by using the hop which is inline with what shouldUseHorizontalOp expects to happen anyway.

Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result:
https://godbolt.org/z/0R-U-K

Differential Revision: https://reviews.llvm.org/D61426

llvm-svn: 359786
2019-05-02 14:00:55 +00:00
James Henderson
cecec4697a [llvm-strip]Add --no-strip-all to disable --strip-all behaviour (including default stripping)
If certain switches are not specified, llvm-strip behaves as if
--strip-all were specified. This means that for testing, when we don't
want the stripping behaviour, we have to specify one of these switches,
which can be confusing. This change adds --no-strip-all to allow an
alternative way of suppressing the default stripping, in a less
confusing manner.

Reviewed by: jakehehrlich, MaskRay

Differential Revision: https://reviews.llvm.org/D61377

llvm-svn: 359781
2019-05-02 11:53:02 +00:00
Diana Picus
e8b91df048 [ARM GlobalISel] Select extensions to < 32 bits
Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in
the exact same way as 32 bits. This overwrites the higher bits, but that
should be ok since all legal users of types smaller than 32 bits ignore
those bits anyway.

llvm-svn: 359768
2019-05-02 09:28:00 +00:00
Diana Picus
3d1035c103 [ARM GlobalISel] Rename some inst selector tests. NFC
Prepare to add support for extensions to types smaller than 32 bits.

llvm-svn: 359767
2019-05-02 09:24:47 +00:00
Diana Picus
c8e9ac0231 [ARM GlobalISel] Legalize extensions to < 32 bits
Make it legal to extend from e.g. s1 to s8 or s16.

llvm-svn: 359766
2019-05-02 09:21:46 +00:00
Stanislav Mekhanoshin
e079e36506 [AMDGPU] gfx1010 lost VOP2 forms of some add/sub
Add legalization of V_ADD_I32, V_SUB_I32, V_SUBREV_I32.

Differential Revision:

llvm-svn: 359757
2019-05-02 04:26:35 +00:00
Stanislav Mekhanoshin
4dc73d8f2e [AMDGPU] gfx1010 allows VOP3 to have a literal
Differential Revision: https://reviews.llvm.org/D61413

llvm-svn: 359756
2019-05-02 04:01:39 +00:00
Craig Topper
e3bd3f27ee [X86] Remove the redundant suffix in vfpclassp[d,s]'s broadcasting variant
The broadcasting variant for instruction vfpclassp[d,s] shouldn't use suffix q/l. So remove them from the template.

Patch by Pengfei Wang

Differential Revision: https://reviews.llvm.org/D61295

llvm-svn: 359753
2019-05-02 03:25:50 +00:00
Bob Haarman
43271cda19 remove inalloca parameters in globalopt and simplify argpromotion
Summary:
Inalloca parameters require special handling in some optimizations.
This change causes globalopt to strip the inalloca attribute from
function parameters when it is safe to do so, removes the special
handling for inallocas from argpromotion, and replaces it with a
simple check that causes argpromotion to skip functions that receive
inallocas (for when the pass is invoked on code that didn't run
through globalopt first). This also avoids a case where argpromotion
would incorrectly try to pass an inalloca in a register.

Fixes PR41658.

Reviewers: rnk, efriedma

Reviewed By: rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61286

llvm-svn: 359743
2019-05-02 00:37:36 +00:00
Thomas Preud'homme
59d8151559 [FileCheck] Fix line-count.txt test
Summary: Enable currently skipped diagnostic test and fix column number

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61320

llvm-svn: 359742
2019-05-02 00:04:44 +00:00
Thomas Preud'homme
0cb0f08935 FileCheck [4/12]: Introduce @LINE numeric expressions
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces the @LINE numeric
expressions.

This commit introduces a new syntax to express a relation a numeric
value in the input text must have with the line number of a given CHECK
pattern: [[#<@LINE numeric expression>]]. Further commits build on that
to express relations between several numeric values in the input text.
To help with naming, regular variables are renamed into pattern
variables and old @LINE expression syntax is referred to as legacy
numeric expression.

Compared to existing @LINE expressions, this new syntax allow arbitrary
spacing between the component of the expression. It offers otherwise the
same functionality but the commit serves to introduce some of the data
structure needed to support more general numeric expressions.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60384

llvm-svn: 359741
2019-05-02 00:04:38 +00:00
Jessica Paquette
096618e493 Fix erroneous flag in GISel line for arm64-fast-isel-materialize.ll
Accidentally put a fast-isel-abort=2 instead of the GISel abort line.

This test doesn't actually fall back at all for GISel though, so remove the
fallback checks entirely.

llvm-svn: 359737
2019-05-01 22:50:11 +00:00
Hiroshi Yamauchi
a0f852be9b [PGO][CHR] A bug fix.
Summary: Fix a transformation bug where two scopes share a common instrution to hoist.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61405

llvm-svn: 359736
2019-05-01 22:49:52 +00:00
Jessica Paquette
62380e005c [GlobalISel][AArch64] Use fmov for G_FCONSTANT when possible
This adds support for using fmov rather than a standard mov to materialize
G_FCONSTANT when it's safe to do so.

Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the
selection is correct.

llvm-svn: 359734
2019-05-01 22:39:43 +00:00
Nikita Popov
12d161d0c6 [AArch64] Add tests for bool vector reductions; NFC
Baseline tests for PR41635.

llvm-svn: 359723
2019-05-01 20:18:36 +00:00
Sanjay Patel
821fcd03cc [PowerPC] add test that could infinite loop with reordered transforms; NFC
This is a slightly reduced version of the test from D61384.
Adding this as a preliminary step, so I can update D61149 with
the proposed fix.

llvm-svn: 359709
2019-05-01 17:34:30 +00:00
Simon Pilgrim
da80791169 [X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions
We already perform horizontal add/sub if we extract from elements 0 and 1, this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector).

Differential Revision: https://reviews.llvm.org/D61263

llvm-svn: 359707
2019-05-01 17:13:35 +00:00
Stanislav Mekhanoshin
460bd1eabe [AMDGPU] gfx1010 GCNRegBankReassign pass
Reassign registers to reduce register bank conflicts.

Differential Revision: https://reviews.llvm.org/D61344

llvm-svn: 359704
2019-05-01 16:49:31 +00:00
Stanislav Mekhanoshin
9a916e7529 [AMDGPU] gfx1010 GCNNSAReassign pass
Convert NSA into non-NSA images.

Differential Revision: https://reviews.llvm.org/D61341

llvm-svn: 359700
2019-05-01 16:40:49 +00:00
Stanislav Mekhanoshin
b19eb3bbce [AMDGPU] gfx1010 MIMG implementation
Differential Revision: https://reviews.llvm.org/D61339

llvm-svn: 359698
2019-05-01 16:32:58 +00:00
Teresa Johnson
1eace23e47 [ThinLTO] Fix unreachable code when parsing summary entries.
Summary:
Early returns were causing some code to be skipped. This was missed
since the summary entries are typically at the end of the llvm assembly
file.

Fixes PR41663.

Reviewers: RKSimon, wristow

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61355

llvm-svn: 359697
2019-05-01 16:26:59 +00:00
Stanislav Mekhanoshin
54445e6cec [AMDGPU] gfx1010 DS implementation
Differential Revision: https://reviews.llvm.org/D61332

llvm-svn: 359696
2019-05-01 16:11:11 +00:00
Sanjay Patel
8e3f62a813 Revert "[DAGCombiner] try repeated fdiv divisor transform before building estimate"
This reverts commit fb9a5307a94e6f1f850e4d89f79103b123f16279 (rL359398)
because it can cause an infinite loop due to opposing combines.

llvm-svn: 359695
2019-05-01 16:06:21 +00:00
Hubert Tong
77da182969 [tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples
Summary:
Triple components in `XFAIL` lines are tested against the target triple.
Various tests that are expected to fail on big-endian hosts are marked
as being `XFAIL` for big-endian targets. This patch corrects these tests
by having them test against a new `host-byteorder-big-endian` feature.

Reviewers: xingxue, sfertile, jasonliu

Reviewed By: xingxue

Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60551

llvm-svn: 359689
2019-05-01 15:36:18 +00:00
Fangrui Song
2822221971 [llvm-ar][llvm-nm][llvm-size] Change -long-option to --long-option in tests. NFC
llvm-svn: 359688
2019-05-01 15:31:15 +00:00
Simon Pilgrim
854c7023a5 [X86][SSE] Add demanded elts support X86ISD::PMULDQ\PMULUDQ
Add to SimplifyDemandedVectorEltsForTargetNode and SimplifyDemandedBitsForTargetNode

llvm-svn: 359686
2019-05-01 14:50:50 +00:00
Simon Pilgrim
4df2bed63e [X86][SSE] Add SSE vector shift support to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359680
2019-05-01 13:51:09 +00:00
Simon Pilgrim
d8ca974230 [X86][SSE] Split 512-bit -> 128-bit vector directly in SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 359678
2019-05-01 12:48:42 +00:00
Simon Pilgrim
55f2f015c8 [X86][SSE] Add 512-bit vector support to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359677
2019-05-01 12:37:41 +00:00
Simon Pilgrim
8ad8285736 [X86][SSE] Add X86ISD::PACKSS\PACKUS to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359673
2019-05-01 11:29:36 +00:00
Simon Pilgrim
baa9adae36 [X86][SSE] Add scalar horizontal add/sub tests for element extractions from upper lanes
As suggested on D61263

llvm-svn: 359671
2019-05-01 11:17:11 +00:00
Simon Pilgrim
2c48ed1f8e [X86][SSE] Add X86ISD::UNPCKL\UNPCK to SimplifyDemandedVectorEltsForTargetNode vector splitting
llvm-svn: 359670
2019-05-01 11:08:03 +00:00
Simon Pilgrim
1d92b61236 [X86][SSE] Move extract_subvector(pshufb) fold to SimplifyDemandedVectorEltsForTargetNode
This lets us hit more cases than combineExtractSubvector and allows us reuse more code.

llvm-svn: 359669
2019-05-01 10:58:38 +00:00
Fangrui Song
3707037ab1 [llvm-objdump] Print newlines before and after "Disassembly of section ...:"
This improves readability and the behavior is consistent with GNU objdump.

The new test test/tools/llvm-objdump/X86/disassemble-section-name.s
checks we print newlines before and after "Disassembly of section ...:"

Differential Revision: https://reviews.llvm.org/D61127

llvm-svn: 359668
2019-05-01 10:40:48 +00:00
Simon Pilgrim
b5161ed3a1 [X86][SSE] Extract i1 elements from vXi1 bool vectors
This is an alternative to D59669 which more aggressively extracts i1 elements from vXi1 bool vectors using a MOVMSK.

Differential Revision: https://reviews.llvm.org/D61189

llvm-svn: 359666
2019-05-01 10:02:22 +00:00
George Rimar
d4d3b123ca [yaml2obj] - Report when unknown section is referenced from program header declaration block.
Previously we did not report this.
Also this removes multiple lookups in the map
what cleanups the code.

Differential revision: https://reviews.llvm.org/D61322

llvm-svn: 359663
2019-05-01 09:45:55 +00:00
Fangrui Song
28e82b6565 [llvm-readobj] Change -t to --symbols in tests. NFC
-t is --symbols in llvm-readobj but --section-details (unimplemented) in readelf.
The confusing option should not be used since we aim for improving
compatibility.

Keep just one llvm-readobj -t use case in test/tools/llvm-readobj/symbols.test

llvm-svn: 359661
2019-05-01 09:28:24 +00:00
Fangrui Song
295ca47f17 [gold] Fix two readelf tests after rL359649
llvm-svn: 359660
2019-05-01 09:01:10 +00:00
Fangrui Song
93056357f1 Fix test/tools/llvm-readobj/mips-plt.test
llvm-svn: 359657
2019-05-01 06:46:34 +00:00
Fangrui Song
599bfe12d8 [llvm-readobj] llvm-readobj --elf-output-style=GNU => llvm-readelf. NFC
The latter is much more common.

A dedicated --elf-output-style=GNU test demonstrating it is the same as
llvm-readelf is sufficient.

llvm-svn: 359652
2019-05-01 05:55:22 +00:00
Fangrui Song
b2d3b0af04 [llvm-readobj] Change -long-option to --long-option in tests. NFC
We use both -long-option and --long-option in tests. Switch to --long-option for consistency.

In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf.

While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf).

llvm-svn: 359649
2019-05-01 05:27:20 +00:00
David L. Jones
91dc79cfb4 Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element"
This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.

llvm-svn: 359648
2019-05-01 05:01:03 +00:00
Fangrui Song
53adc7fe70 [llvm-objcopy] Simplify SHT_NOBITS -> SHT_PROGBITS promotion
GNU objcopy uses bfd_elf_get_default_section_type to decide the candidate section type,
which roughly translates to our [a] (I assume SEC_COMMON implies SHF_ALLOC):

  (!(Sec.Flags & ELF::SHF_ALLOC) || Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)))

Then, it updates the section type in bfd/elf.c:elf_fake_sections if:

  if (this_hdr->sh_type == SHT_NULL)
    this_hdr->sh_type = sh_type; // common case
  else if (this_hdr->sh_type == SHT_NOBITS
           && sh_type == SHT_PROGBITS
           && (asect->flags & SEC_ALLOC) != 0)  // uncommon case
    ...
    this_hdr->sh_type = sh_type;

If the following condition is met the uncommon branch is executed:

  if (elf_section_type (osec) == SHT_NULL
      && (osec->flags == isec->flags
	  || (final_link
	      && ((osec->flags ^ isec->flags)
		  & ~(SEC_LINK_ONCE | SEC_LINK_DUPLICATES | SEC_RELOC)) == 0)))

I suggest we just ignore this clause and follow the common case
behavior, which is done in this patch. Rationales to do so:

If --set-section-flags is a no-op (osec->flags == isec->flags)
(corresponds to the "readonly" test in set-section-flags.test), GNU
objcopy will require (Sec.Flags & ELF::SHF_ALLOC). [a] is essentially:

  Flags & (SectionFlag::SecContents | SectionFlag::SecLoad)

This special case is not really useful. Non-SHF_ALLOC SHT_NOBITS
sections do not make much sense and it doesn't matter if they are
SHT_NOBITS or SHT_PROGBITS.

For all other RUN lines in set-section-flags.test, the new behavior
matches GNU objcopy, i.e. this patch improves compatibility.

Differential Revision: https://reviews.llvm.org/D60189

llvm-svn: 359639
2019-05-01 00:39:31 +00:00
Philip Reames
d398cea2a1 [InstCombine] Limit a vector demanded elts rule which was producing invalid IR.
The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design).  It turns out that the LangRef disallows such cases when indexing structs.  The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.

This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.

llvm-svn: 359633
2019-04-30 23:09:26 +00:00
Alina Sbirlea
f0c01fb0d6 [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.
Summary:
MemorySSA keeps internal pointers of AA and DT.
If these get invalidated, so should MemorySSA.

Reviewers: george.burgess.iv, chandlerc

Subscribers: jlebar, Prazek, llvm-commits

Tags: LLVM

Differential Revision: https://reviews.llvm.org/D61043

llvm-svn: 359627
2019-04-30 22:43:55 +00:00
Alina Sbirlea
33469074e2 [AliasAnalysis/NewPassManager] Invalidate AAManager less often.
Summary:
This is a redo of D60914.

The objective is to not invalidate AAManager, which is stateless, unless
there is an explicit invalidate in one of the AAResults.

To achieve this, this patch adds an API to PAC, to check precisely this:
is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61284

llvm-svn: 359622
2019-04-30 22:15:47 +00:00
Stanislav Mekhanoshin
56ccad08f4 [AMDGPU] gfx1010 VMEM and SMEM implementation
Differential Revision: https://reviews.llvm.org/D61330

llvm-svn: 359621
2019-04-30 22:08:23 +00:00
Alina Sbirlea
e3be1edc9e [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.
Summary:
Match NewPassManager behavior: add option for interleaved loops in the
old pass manager, and use that instead of the flag used to disable loop unroll.
No changes in the defaults.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61030

llvm-svn: 359615
2019-04-30 21:29:20 +00:00
Rong Xu
d938c3cfb7 [llvm-profdata] Add overlap command to compute similarity b/w two profile files
Add overlap functionality to llvm-profdata tool to compute the similarity
between two profile files.

Differential Revision: https://reviews.llvm.org/D60977

llvm-svn: 359612
2019-04-30 21:19:12 +00:00
Simon Pilgrim
8caf94c860 [X86][SSE] Fold extract_subvector(extend(x)) -> extend_vector_inreg(x)
This adds any extend support - folding to zero_extend_vector_inreg (PMOVZX) for legality

Minor improvement for PR39709

llvm-svn: 359608
2019-04-30 20:31:07 +00:00