1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

211313 Commits

Author SHA1 Message Date
Sanne Wouda
81bbde389a Use LoopRotate PrepareForLTO stage in NPM
The PrepareForLTO stage of LoopRotate tries to avoid unrolling loops
with calls that might be inlined later.  See D94232 where this was
introduced.

We didn't catch all occurances of the LoopRotatePass in the New Pass
Manager, so the original regression in astar returned with the pass
manager switch.
2021-02-17 14:06:57 +00:00
David Green
37557faad8 [ARM] MVE abs intrinsic costs. NFC 2021-02-17 13:54:17 +00:00
Andrew Savonichev
db410b4394 [NFC] Use the same type for bit fields in MCSchedClassDesc
Otherwise they are not allocated as a single bit field and take 4
bytes instead of 2.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D95954
2021-02-17 15:54:22 +03:00
luxufan
227b23901d [RISCV] Simplify BP initialisation
We can re-use copyPhysReg rather than writing a specialised copy.

Differential Revision: https://reviews.llvm.org/D95227
2021-02-17 20:33:20 +08:00
Simon Pilgrim
7dc80861c2 [DAG] Pull out getTruncatedUSUBSAT helper from foldSubToUSubSat. NFCI.
This will simplify an incoming generic implementation of D25987.

I'll rebase D96703 shortly to support this.
2021-02-17 12:17:08 +00:00
Simon Pilgrim
ae34979c89 [DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d))) (REAPPLIED)
Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d))) -> bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d))

Attempt to fold from a shuffle of a pair of binops to a binop of shuffles, as long as one/both of the binop sources are also shuffles that can be merged with the outer shuffle. This should guarantee that we remove one binop without introducing any additional shuffles.

Technically there's potential for a merged shuffle's lowering to be poorer than the original shuffle, but it could also be better, and I'm not seeing any regressions as long as we keep the 'don't merge splats' rule already present in MergeInnerShuffle.

This expands and generalizes an existing X86 combine and attempts to merge either of each binop's sources (with an on-the-fly commutation of the shuffle mask) - we couldn't do that in the x86 version as it had to stay in a form that DAGCombine's MergeInnerShuffle would still recognise.

Fixes issue raised by @saugustine in rG5aa8f4c0843a where we were failing to replace null shuffle operands from MergeInnerShuffle to UNDEFs.

Differential Revision: https://reviews.llvm.org/D96345
2021-02-17 11:42:43 +00:00
Jay Foad
9a5c672484 [AMDGPU] Rename simplifyI24 to simplifyMul24
Also simplify one of its call sites. NFC.
2021-02-17 11:33:49 +00:00
David Zarzycki
4d99d51ea2 [lit] Add "early_tests" config option
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.

Reviewed By: jdenny, jhenderson

Differential Revision: https://reviews.llvm.org/D96594
2021-02-17 06:32:04 -05:00
Piotr Sobczak
2d47d0a999 [AMDGPU] Fix a miscompile with S_ADD/S_SUB
The helper function isBoolSGPR is too aggressive when determining
when a v_cndmask can be skipped on a boolean value because the
function does not check the operands of and/or/xor.

This can be problematic for the Add/Sub combines that can leave
bits set even for inactive lanes leading to wrong results.

Fix this by inspecting the operands of and/or/xor recursively.

Differential Revision: https://reviews.llvm.org/D86878
2021-02-17 12:24:58 +01:00
Fraser Cormack
73b0930caf [RISCV] Add support for fixed vector vselect
This patch adds support for fixed-length vector vselect. It does so by
lowering them to a custom unmasked VSELECT_VL node with a vector length
operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96768
2021-02-17 10:59:00 +00:00
Simon Pilgrim
3342619547 [X86][SSE] Add testcase for bug reported in D96345
Failure to handle binop(shuffle(x,undef),shuffle(a,b))
2021-02-17 10:57:00 +00:00
Thomas Preud'homme
b894f8c766 Add lit config for dir with standalone tests
Some test systems do not use lit for test discovery but only for its
substitution and test selection because they use another way of managing
test collections, e.g. CTest. This forces those tests to be invoked with
lit --no-indirectly-run-check. When a mix of lit version is in use, it
requires to detect the availability of that option.

This commit provides a new config option standalone_tests to signal a
directory made of tests meant to run as standalone. When this option is
set, lit skips test discovery and the indirectly run check. It also adds
the missing documentation for --no-indirectly-run-check.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94766
2021-02-17 10:38:58 +00:00
Sjoerd Meijer
b77d610c0f Follow up of rGdea4a63e6359, which committed a slightly different version than
intended.
2021-02-17 10:07:26 +00:00
Igor Kudrin
305dfa701a [DebugInfo] Keep the DWARF64 flag in the module metadata
This allows the option to affect the LTO output. Module::Max helps to
generate debug info for all modules in the same format.

Differential Revision: https://reviews.llvm.org/D96597
2021-02-17 17:03:34 +07:00
Sjoerd Meijer
8726f1acb2 [LSR] Cleanup of getPreferredAddresingMode. NFC.
This is a follow up D96600 and cleans up most calls to
getPreferredAddresingMode. I.e., we really don't need to query the same things
again and again, but get the preferred addressing mode once for each loop. So
this should be a lot friendlier for compile times, especially if we start
implementing getPreferredAddresingMode.

Differential Revision: https://reviews.llvm.org/D96772
2021-02-17 09:45:29 +00:00
Sam McCall
32aca15f02 [ADT] Add SFINAE guards to unique_function constructor.
We can't construct a working unique_function from an object that's not callable
with the right types, so don't allow deduction to succeed.
This avoids some ambiguous conversion cases, e.g. allowing to overload
on different unique_function types, and to conversion operators to
unique_function.

std::function and the any_invocable proposal have these.
This was added to llvm::function_ref in D88901 and followups

Differential Revision: https://reviews.llvm.org/D96794
2021-02-17 10:36:07 +01:00
Sjoerd Meijer
86d0604744 [MachineSink] Add a loop sink limit
To make sure compile-times don't regress, add an option to restrict the number
of instructions considered for sinking as alias analysis can be expensive and
for the same reason also skip large blocks.

Differential Revision: https://reviews.llvm.org/D96485
2021-02-17 08:50:53 +00:00
Cassie Jones
f638cb4edc [vim] Highlight most common MIR syntax not in LLVM IR
This adds highlighting for MIR instruction opcodes, physical registers,
and MIR types.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D95553
2021-02-17 02:39:03 -05:00
Cassie Jones
7f4b320ea8 [vim] Add initial syntax definition for .mir files
This initial definition handles the yaml container and the embedding of
the inner IRs. As a stopgap, this reuses the LLVM IR syntax highlighting
for the MIR function bodies--even though it's not technically correct,
it produces decent highlighting for a first pass.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D95552
2021-02-17 02:38:16 -05:00
Yang Fan
5421d477d4 [SampleFDO] Fix MSVC "namespace uses itself" warning (NFC)
MSVC warning:
```
SampleProfileLoaderBaseImpl.h(41): warning C4515: 'llvm': namespace uses itself
```
2021-02-17 15:27:30 +08:00
Kazu Hirata
2a6bf5d944 [CodeGen] Use range-based for loops (NFC) 2021-02-16 23:23:08 -08:00
Kazu Hirata
b2d40b94cd [llvm] Fix header guards (NFC)
Identified with llvm-header-guard.
2021-02-16 23:23:07 -08:00
Kazu Hirata
62d0340f62 [SCEV] Use ListSeparator (NFC) 2021-02-16 23:23:05 -08:00
Mircea Trofin
c2ccde3171 [mlgo] Fetch models from path / URL
Allow custom location for pre-trained models used when AOT-compiling
policies.

Differential Revision: https://reviews.llvm.org/D96796
2021-02-16 22:47:14 -08:00
Hsiangkai Wang
c1df0e7a08 [RISCV] Spilling for RISC-V V extension. (2nd version)
Differential Revision: https://reviews.llvm.org/D95148
2021-02-17 14:05:19 +08:00
Hsiangkai Wang
226d4b8ac5 [RISCV] Frame handling for RISC-V V extension.
This patch proposes how to deal with RISC-V vector frame objects. The
layout of RISC-V vector frame will look like

|---------------------------------|
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there is realignment or variable length array in the stack, we will use
frame pointer to access fixed objects and stack pointer to access
non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| ///// realignment /////         |
|---------------------------------|
| scalar outgoing arguments       |
|---------------------------------|
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------| <- end of frame (sp)

If there are both realignment and variable length array in the stack, we
will use frame pointer to access fixed objects and base pointer to access
non-fixed objects.

|---------------------------------| <- frame pointer (fp)
| scalar callee-saved registers   |
|---------------------------------|
| scalar local variables          |
|---------------------------------|
| ///// realignment /////         |
|---------------------------------| <- base pointer (bp)
| RVV local variables &&          |
| RVV outgoing arguments          |
|---------------------------------|
| /////////////////////////////// |
| variable length array           |
| /////////////////////////////// |
|---------------------------------| <- end of frame (sp)
| scalar outgoing arguments       |
|---------------------------------|

In this version, we do not save the addresses of RVV objects in the
stack. We access them directly through the polynomial expression
(a x VLENB + b). We do not reserve frame pointer when there is any RVV
object in the stack. So, we also access the scalar frame objects through the
polynomial expression (a x VLENB + b) if the access across RVV stack
area.

Differential Revision: https://reviews.llvm.org/D94465
2021-02-17 14:05:19 +08:00
Douglas Yung
4d011b4418 Fix gcc build after de3a485d9 due to a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92598
This should fix gcc based builders such as http://lab.llvm.org:8011/#/builders/76/builds/1683
2021-02-16 21:57:12 -08:00
Alexander Shaposhnikov
dc7498891d [llvm-libtool] Emit warnings for files without symbols
1. Emit warnings for files without symbols.
2. Add -no_warning_for_no_symbols.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D95843
2021-02-16 17:52:12 -08:00
Tony Tye
efe0f85e7b [AMDGPU] Correct rmw atomics s_waitcnt generation
The AMD GPU SIMemoryLegalizer was using the ordering address space
rather than the instruction address space when determining the
s_waitcnt to generate to ensure that a read-modify-write atomic has
completed. This resulted in additional unnecessary counters being
waited on.

Differential Revision: https://reviews.llvm.org/D96743
2021-02-17 01:32:29 +00:00
LLVM GN Syncbot
bb770362c3 [gn build] Port 6fd5ccff72ee 2021-02-17 00:53:56 +00:00
Rong Xu
0db6fe3021 [SampleFDO] Reapply: Refactor SampleProfile.cpp
Reapply patch after fixing buildbot failure.
Refactor SampleProfile.cpp to use the core code in CodeGen.
The main changes are:
(1) Move SampleProfileLoaderBaseImpl class to a header file.
(2) Split SampleCoverageTracker to a head file and a cpp file.
(3) Move the common codes (common options and callsiteIsHot())
to the common cpp file.

Differential Revision: https://reviews.llvm.org/D96455
2021-02-16 16:43:21 -08:00
Sriraman Tallam
7bb78c1965 Basic block sections should enable function sections implicitly.
Basic block sections enables function sections implicitly, this is not needed
and is inefficient with "=list" option.

We had basic block sections enable function sections implicitly in clang. This
is particularly inefficient with "=list" option as it places functions that do
not have any basic block sections in separate sections. This causes unnecessary
object file overhead for large applications.

This patch disables this implicit behavior. It only creates function sections
for those functions that require basic block sections.

Further, there was an inconistent behavior with llc as llc was not turning on
function sections by default. This patch makes llc and clang consistent and
tests are added to check the new behavior.

This is the first of two patches and this adds functionality in LLVM to
create a new section for the entry block if function sections is not
enabled.

Differential Revision: https://reviews.llvm.org/D93876
2021-02-16 16:27:16 -08:00
Petr Hosek
8007c5924c [MC][ELF] Support for zero flag section groups
This change introduces support for zero flag ELF section groups to LLVM.
LLVM already supports COMDAT sections, which in ELF are a special type
of ELF section groups. These are generally useful to enable linker GC
where you want a group of sections to always travel together, that is to
be either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF assemblers already support zero flag ELF section
groups and this change helps us reach feature parity.

Differential Revision: https://reviews.llvm.org/D95851
2021-02-16 14:23:40 -08:00
LLVM GN Syncbot
b5ada42319 [gn build] Port c761fe77bdca 2021-02-16 22:13:03 +00:00
Mehdi Amini
51c40d78da Revert "[SampleFDO][NFC] Refactor SampleProfile.cpp"
This reverts commit 310b35304cdf5a230c042904655583c5532d3e91.
The build is broken with -DBUILD_SHARED_LIBS=ON :

lib/ProfileData/CMakeFiles/LLVMProfileData.dir/SampleProfileLoaderBaseUtil.cpp.o: In function `llvm::sampleprofutil::callsiteIsHot(llvm::sampleprof::FunctionSamples const*, llvm::ProfileSummaryInfo*, bool)':
SampleProfileLoaderBaseUtil.cpp:(.text._ZN4llvm14sampleprofutil13callsiteIsHotEPKNS_10sampleprof15FunctionSamplesEPNS_18ProfileSummaryInfoEb+0x1a): undefined reference to `llvm::ProfileSummaryInfo::isColdCount(unsigned long) const'
SampleProfileLoaderBaseUtil.cpp:(.text._ZN4llvm14sampleprofutil13callsiteIsHotEPKNS_10sampleprof15FunctionSamplesEPNS_18ProfileSummaryInfoEb+0x28): undefined reference to `llvm::ProfileSummaryInfo::isHotCount(unsigned long) const'
...
2021-02-16 22:11:42 +00:00
David Blaikie
bce9ce9628 Effectively revert ba2aa5f49ebb since the object isn't destroyed polymorphically 2021-02-16 13:45:25 -08:00
Simonas Kazlauskas
3635306b39 [llvm-dwp] Join dwo paths correctly when DWOPath is absolute
When the `DWOPath` is absolute, we want to use `DWOPath` as is, without prepending any other
components to the path. The `sys::path::append` does not join, but rather unconditionally appends
the paths, so something like `sys::path::append("/tmp", "/tmp/banana")` will result in
`/tmp/tmp/banana` rather than the desired `/tmp/banana`.

This then causes `llvm-dwp` to fail in a following situation:

```
$ clang -gsplit-dwarf /tmp/banana/test.c -c -o /tmp/outdir/foo.o
$ clang outdir/foo.o -o outdir/hm
$ llvm-dwarfdump outdir/hm | grep -C2 foo.dwo
                  DW_AT_comp_dir    ("/tmp")
                  DW_AT_GNU_pubnames  (true)
                  DW_AT_GNU_dwo_name    ("/tmp/outdir/foo.dwo")
                                DW_AT_GNU_dwo_id    (0xde4d396f3bf0e257)
                  DW_AT_low_pc  (0x0000000000401100)
$ strace -o trace llvm-dwp -e outdir/hm -o outdir/hm.dwp
error: No such file or directory
$ cat trace | grep foo.dwo
openat(AT_FDCWD, "/tmp/tmp/outdir/foo.dwo", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
```

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D96678
2021-02-16 13:38:35 -08:00
David Blaikie
9fa64c89c8 Fix -Wnon-virtual-dtor by making the ctor protected 2021-02-16 13:38:28 -08:00
Victor Huang
a248bb5c65 [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol
We currently represent TOC entries by an MCSymbol. This is not enough in some situations.
For example, when accessing an initialized TLS variable v on AIX using the general dynamic
model, we need to generate the two following entries for v:

.tc .v[TC],v@m
.tc v[TC],v

One is for the region handle (with the @m relocation), the other is for the variable offset.
This refactoring allows storing several entries for the same symbol with different VariantKind
in the TOC. If the VariantKind is not specified, we default to VK_None.

The AIX TLS implementation using this refactoring to generate the two entries will be posted
in a subsequent patch.

Patched By: bsaleil
Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D96346
2021-02-16 21:32:16 +00:00
Kazu Hirata
987c549a5b [SampleFDO] Provide a virtual desructor for SampleProfileLoaderBaseImpl
This patch fixes a warning:

  llvm-project/llvm/include/llvm/ProfileData/SampleProfileLoaderBaseImpl.h:69:7:
  error: 'llvm::SampleProfileLoaderBaseImpl' has virtual functions but
  non-virtual destructor [-Werror,-Wnon-virtual-dtor]

Differential Revision: https://reviews.llvm.org/D96810
2021-02-16 13:17:33 -08:00
Sterling Augustine
8c549cf52d Revert "[DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),bop(shuffle(a,b),shuffle(c,d)))"
This reverts commit 5dfba562dd247f731528448ee83785b099f93629.

That commit causes an assertion failure with the following repro:

typedef long b __attribute__((__vector_size__(16)));
b *d;
b e;
b __attribute__((__always_inline__)) c(b h, b i) {
  return (__attribute__((__vector_size__(8 * sizeof(short)))) short)h + i;
}
j() {
  b k, l, m, n, o[6], p, q;
  m = d[5];
  b r = m;
  b s = f(r, 8);
  q = s;
  l = d[1];
  p = l;
  t(q);
  n = c(m, l);
  o[1] = c(s, f(p, 8));
  k = __builtin_shufflevector(n, o[1], 0, 2);
  e = __builtin_ia32_psrlwi128(k, j);
}

./bin/clang -cc1 -triple x86_64-grtev4-linux-gnu -emit-obj -O1 -std=c99 test.c
2021-02-16 12:48:15 -08:00
Craig Topper
a24cc1819d [RISCV] Add isel patterns for fixed vector fmsub/fnmadd/fnmsub. 2021-02-16 12:03:33 -08:00
LLVM GN Syncbot
5272bea13d [gn build] Port ecea7218fb9b 2021-02-16 19:23:52 +00:00
LLVM GN Syncbot
dc45992690 [gn build] Port 310b35304cdf 2021-02-16 19:23:52 +00:00
Raphael Isemann
8e3d373535 [FileCollector] Fix that the file system case-sensitivity check was inverted
real_path returns an `std::error_code` which evaluates to `true` in case an
error happens and `false` if not. This code was checking the inverse, so
case-insensitive file systems ended up being detected as case sensitive.

Tested using an LLDB reproducer test as we anyway need a real file system and
also some matching logic to detect whether the respective file system is
case-sensitive (which the test is doing via some Python checks that we can't
really emulate with the usual FileCheck logic).

Fixes rdar://67003004

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D96795
2021-02-16 20:21:32 +01:00
Craig Topper
c5e37509c1 [RISCV] Add add/sub saturation tests that exist on ARM/AArch64/X86
There have been some recent changes to the type legalization for
some of these intrinsics so I thought it would be good to have
coverage.
2021-02-16 11:19:57 -08:00
Rong Xu
a3be335242 [SampleFDO][NFC] Refactor SampleProfile.cpp
Refactor SampleProfile.cpp to use the core code in CodeGen.
The main changes are:
(1) Move SampleProfileLoaderBaseImpl class to a header file.
(2) Split SampleCoverageTracker to a head file and a cpp file.
(3) Move the common codes (common options and callsiteIsHot())
to the common cpp file.

Differential Revision: https://reviews.llvm.org/D96455
2021-02-16 11:18:21 -08:00
Jessica Paquette
36c97d7503 Revert "[AArch64][GlobalISel] Fold constants into G_GLOBAL_VALUE"
This reverts commit 61b4702a408834228c1c139b0e9af98616774db4.

We were seeing some test failures in SPECINT2006 due to this change. Reverting
to investigate.
2021-02-16 10:50:12 -08:00
Michael Kruse
2074156484 [OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur).
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.

This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.

A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.

I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).

Differential Revision: https://reviews.llvm.org/D76342
2021-02-16 09:45:07 -08:00
Simon Pilgrim
2409bec523 [DAG] PromoteIntRes_ADDSUBSHLSAT - promote ISD::UADDSAT as clamped add
Similar to D96622, we're better off just promoting uaddsat(x,y) -> umin(add(x,y),c) instead of trying to perform a shifted uaddsat.

I initially tried to just use shifted promotion in cases where we didn't have a legal/custom umin - but we don't appear to have any targets that have uaddsat but not umin, so imo we're better off always using the umin and avoid an untested shifted uaddsat code path.

Differential Revision: https://reviews.llvm.org/D96767
2021-02-16 17:37:44 +00:00