1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

197616 Commits

Author SHA1 Message Date
Amara Emerson
9e0493b789 [AArch64] Fix CollectLOH creating an AdrpAdd LOH when there's a live used reg
between the two instructions.

If there's a pattern like:
$xA = ADRP foo @PAGE
[some killing use of reg Xb]
$Xb = ADDXri $Xa, 0, @PAGEOFF

CollectLOH would create an AdrpAdd LOH that resulted in the linker optimizing
this sequence into:
$xB = ADR foo
[some killing use of reg $Xb]
... and therefore clobbers the live $Xb register that was used by the
instruction in between.

This was discovered by a GlobalISel patch D78465 which broke up global variable
accesses into two pseudos, which in some cases could be moved apart.

Differential Revision: https://reviews.llvm.org/D80834
2020-06-01 16:00:55 -07:00
Vedant Kumar
22f3fd7742 [LiveDebugValues] Remove early-exit when testing regmasks, NFC
In transferRegisterDef, if the instruction has a regmask attached, we'll
check if any currently used register is clobbered by the regmask.

The early exit in this scan isn't necessary, costs a set lookup, and is
almost never taken [1]. Delete it.

[1]
http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136
2020-06-01 15:16:10 -07:00
Matt Arsenault
3cd292c66e AMDGPU: Change internal tracking of wave size
Store the log2 wave size instead of forcing division and log2
operations when querying either.
2020-06-01 17:55:08 -04:00
Joseph Huber
76c68e0c0b [OpenMP] Replace Clang's OpenMP RTL Definitions with OMPKinds.def
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits

Tags: #openmp, #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80222
2020-06-01 16:23:10 -04:00
Sterling Augustine
d40cd3b3ce For --relativenames, ignore directory 0, which is the comp_dir.
Update for upstream comments. Improve test by writing all the debug
info by hand.

Reviewers: dblaikie, jhenderson

Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80168
2020-06-01 13:13:37 -07:00
Mircea Trofin
368ac03869 [llvm][NFC] Cache FAM in InlineAdvisor
Summary:
This simplifies the interface by storing the function analysis manager
with the InlineAdvisor, and, thus, not requiring it be passed each time
we inquire for an advice.

Reviewers: davidxl, asbirlea

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80405
2020-06-01 13:02:34 -07:00
Daniel Grumberg
cc866fecc7 Add DIAError.h to list of headers excluded from the LLVM_DebugInfo_PDB module
Differential Revision: https://reviews.llvm.org/D80808
2020-06-01 21:01:05 +01:00
Florian Hahn
2456f3f2db [Matrix] Implement matrix index expressions ([][]).
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).

It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create a incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx)

Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.

Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D76791
2020-06-01 20:08:49 +01:00
Sanjay Patel
1a3f9700f3 [InstCombine] fix use of base VectorType; NFC
SimplifyDemandedVectorElts() bails out on ScalableVectorType
anyway, but we can exit faster with the external check.

Move this to a helper function because there are likely other
vector folds that we can try here.
2020-06-01 14:28:31 -04:00
Matt Arsenault
6b892181f5 AMDGPU: Fix not emitting nofpexcept on fdiv expansion
In this awkward case, we have to emit custom pseudo-constrained FP
wrappers. InstrEmitter concludes that since a mayRaiseFPException
instruction had a chain, it can't add nofpexcept.

Test deferred until mayRaiseFPException is really set on everything.
2020-06-01 14:10:26 -04:00
Vedant Kumar
b5e5fd1027 [LiveDebugValues] Add LocIndex::u32_{location,index}_t types for readability, NFC
This is per Adrian's suggestion in https://reviews.llvm.org/D80684.
2020-06-01 11:02:36 -07:00
Vedant Kumar
6f84fd7763 [LiveDebugValues] Speed up removeEntryValue, NFC
Summary:
Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.

This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):

```
          ---User Time---   --System Time--   --User+System--   ---Wall Time--- --- Name ---
  Before: 2.5402 ( 18.8%)   0.0050 (  0.4%)   2.5452 ( 17.3%)   2.5452 ( 17.3%) Live DEBUG_VALUE analysis
   After: 0.2364 (  2.1%)   0.0034 (  0.3%)   0.2399 (  2.0%)   0.2398 (  2.0%) Live DEBUG_VALUE analysis
```

The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.

Reviewers: nikic, aprantl, jmorse, djtodoro

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80684
2020-06-01 11:02:36 -07:00
Matt Arsenault
0ba1ec26dd DAG: Fix getNode dropping flags if there's a glue output
The AMDGPU non-strict fdiv lowering needs to introduce an FP mode
switch in some cases, and has custom nodes to provide chain/glue for
the intermediate FP operations. We need to propagate nofpexcept here,
but getNode was dropping the flags.

Adding nofpexcept in the AMDGPU custom lowering is left to a future
patch.

Also fix a second case where flags were dropped, but in this case it
seems it just didn't handle this number of operands.

Test will be included in future AMDGPU patch.
2020-06-01 13:48:02 -04:00
Hiroshi Yamauchi
c65f25e192 [PGO] Improve the working set size heuristics under the partial sample PGO.
Summary:
The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize)
under the partial sample PGO may not be accurate because the profile is partial
and the number of hot profile counters in the ProfileSummary may not reflect the
actual working set size of the program being compiled.

To improve this, the (approximated) ratio of the the number of profile counters
of the program being compiled to the number of profile counters in the partial
sample profile is computed (which is called the partial profile ratio) and the
working set size of the profile is scaled by this ratio to reflect the working
set size of the program being compiled and used for the working set size
heuristics.

The partial profile ratio is approximated based on the number of the basic
blocks in the program and the NumCounts field in the ProfileSummary and computed
through the thin LTO indexing. This means that there is the limitation that the
scaled working set size is available to the thin LTO post link passes only.

Reviewers: davidxl

Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79831
2020-06-01 10:29:23 -07:00
Matt Arsenault
3fc69c930e AMDGPU: Fix test in code directory 2020-06-01 13:26:51 -04:00
Matt Arsenault
bf7398d4cf AMDGPU: Remove dead file 2020-06-01 13:26:51 -04:00
hsmahesha
ed5decd2c8 [AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes
Summary:
While clustering mem ops, AMDGPU target needs to consider number of clustered bytes
to decide on max number of mem ops that can be clustered. This patch adds support to pass
number of clustered bytes to target mem ops clustering logic.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80545
2020-06-01 22:52:34 +05:30
Stanislav Mekhanoshin
d47b2c2a4b Temporarily removed unstable test. NFC. 2020-06-01 10:18:54 -07:00
Matt Arsenault
7e6b33626b AMDGPU: Fix alignment for dynamic allocas
The alignment value also needs to be scaled by the wave size.
2020-06-01 13:06:37 -04:00
Stanislav Mekhanoshin
3f75cfd780 Update some names in test. NFC.
There seems to be some instability with IR nameing between
platforms. Attempted to fix it with replacing dot-numbered
names.
2020-06-01 09:11:18 -07:00
Fangrui Song
18064a73bd [Object] Add DF_1_PIE
This flag (and the whole field DT_FLAGS_1) originated from Solaris. I intend to use it in an LLD patch D80872.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D80871
2020-06-01 08:56:02 -07:00
Sanjay Patel
8cc0da7038 [InstCombine] add test for select-of-shuffle; NFC
This is based on an example in D80658
2020-06-01 11:52:07 -04:00
Stanislav Mekhanoshin
17a709861f Process gep (phi ptr1, ptr2) in SROA
Differential Revision: https://reviews.llvm.org/D79218
2020-06-01 08:41:05 -07:00
Sam Clegg
232d118233 [WebAssembly] Update test expectations
simd-2.C now compiles thanks to:
  https://github.com/WebAssembly/wasi-libc/pull/183

Differential Revision: https://reviews.llvm.org/D80930
2020-06-01 08:35:27 -07:00
Sanjay Patel
24f9324c29 [InstNamer] use 'i' for Instructions, not 'tmp'
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and
D80584, the name 'tmp' is almost always a bad choice, but we have
a legacy of regression tests with that name because it was baked
into utils/update_test_checks.py.

This change makes -instnamer more consistent (already using "arg"
and "bb", the common LLVM shorthand). And it avoids the conflict
in telling users of the FileCheck script to run "-instnamer" to
create a better regression test and having that cause a warn/fail
in update_test_checks.py.
2020-06-01 11:11:14 -04:00
Ehud Katz
1b0a29266b [StructurizeCFG] Fix an incorrect comment, NFC. 2020-06-01 17:42:09 +03:00
James Henderson
47c0f73fb2 [Support] Add more context to DataExtractor getLEB128 errors
Reviewed by: clayborg, dblaikie, labath

Differential Revision: https://reviews.llvm.org/D80799
2020-06-01 14:00:01 +01:00
James Henderson
b5cd4f0921 [DebugInfo] Add use of truncating data extractor to debug line parsing
This will ensure that nothing can ever start parsing data from a future
sequence and part-read data will be returned as 0 instead.

Reviewed by: aprantl, labath

Differential Revision: https://reviews.llvm.org/D80796
2020-06-01 12:33:21 +01:00
Sanjay Patel
804d4bd875 [utils] change default nameless value to "TMP"
This is effectively reverting rGbfdc2552664d to avoid test churn
while we figure out a better way forward.

We at least salvage the warning on name conflict from that patch
though.

If we change the default string again, we may want to mass update
tests at the same time. Alternatively, we could live with the poor
naming if we change -instnamer.

This also adds a test to LLVM as suggested in the post-commit
review. There's a clang test that is also affected. That seems
like a layering violation, but I have not looked at fixing that yet.

Differential Revision: https://reviews.llvm.org/D80584
2020-06-01 06:54:45 -04:00
James Henderson
129a1ceed7 [llvm-dwarfdump][test] Use verbose output to check expected opcodes
The debug_line_invalid.test test case was previously using the
interpreted line table dumping to identify which opcodes have been
parsed. This change moves to looking for the expected opcodes
explicitly. This is probably a little clearer and also allows for
testing some cases that wouldn't be easily identifiable from the
interpreted table.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D80795
2020-06-01 11:48:02 +01:00
Simon Pilgrim
61aa867bc8 ARMFrameLowering.h - remove unnecessary includes. NFC.
They are implicitly included in TargetFrameLowering.h and only ever used in TargetFrameLowering override methods.
2020-06-01 11:47:13 +01:00
Simon Pilgrim
b88b305492 MIPatternMatch.h - remove unused APFloat/APInt includes. NFC. 2020-06-01 11:47:13 +01:00
Igor Kudrin
515faba258 [DebugInfo] Separate fields with commas in headers of type units (3/3).
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.

Differential Revision: https://reviews.llvm.org/D80806
2020-06-01 17:40:28 +07:00
Igor Kudrin
b9c53c8d85 [DebugInfo] Separate fields with commas in headers of compile units (2/3).
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.

Differential Revision: https://reviews.llvm.org/D80806
2020-06-01 17:40:24 +07:00
Igor Kudrin
e1c94df8e4 [DebugInfo] Separate fields with commas in headers of .debug_pub* tables (1/3).
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.

Differential Revision: https://reviews.llvm.org/D80806
2020-06-01 17:39:48 +07:00
Georgii Rymar
516ffefcde [llvm-readelf] - Add explicit braces again. NFC.
Partially reverts feee98645dde4be31a70cc6660d2fc4d4b9d32d8.

Add explicit braces to a different place to fix
"error: add explicit braces to avoid dangling else [-Werror,-Wdangling-else]"
2020-06-01 13:10:16 +03:00
Georgii Rymar
c01c172f33 [llvm-readelf] - Add explicit braces. NFC.
Should fix the BB (http://lab.llvm.org:8011/builders/clang-ppc64le-rhel/builds/3907/steps/build%20stage%201/logs/stdio):

llvm-readobj/ELFDumper.cpp:4708:5: error: add explicit braces to avoid dangling else [-Werror,-Wdangling-else]
    else
    ^
2020-06-01 12:55:24 +03:00
Ehud Katz
9df1915a6d [StructurizeCFG] Fix region nodes ordering
This is a reimplementation of the `orderNodes` function, as the old
implementation didn't take into account all cases.
The new implementation uses SCCs instead of Loops to take account of
irreducible loops.

Fix PR41509

Differential Revision: https://reviews.llvm.org/D79037
2020-06-01 12:50:35 +03:00
Georgii Rymar
ac85d75310 [llvm-readobj] - Improve error reporting for hash tables.
This improves the next points for broken hash tables:

1) Use reportUniqueWarning to prevent duplication when
   --hash-table and --elf-hash-histogram are used together.

2) Dump nbuckets and nchain fields. It is often possible
   to dump them even when the table itself goes past the EOF etc.

Differential revision: https://reviews.llvm.org/D80373
2020-06-01 12:36:23 +03:00
Tim Northover
8b6ab03c03 AArch64: materialize large stack offset into xzr correctly.
When a stack offset was too big to materialize in a single instruction, we were
trying to do it in stages:

    adds xD, sp, #imm
    adds xD, xD, #imm

Unfortunately, if xD is xzr then the second instruction doesn't exist and
wouldn't do what was needed if it did. Instead we can use a temporary register
for all but the last addition.
2020-06-01 09:30:05 +01:00
serge-sans-paille
ffa794c4cb Improve SmallPtrSetImpl::count implementation
Relying on the find method implies a roundtrip to the iterator world, which is
not costless because iterator creation involves a few check to ensure the
iterator is in a valid position (through the SmallPtrSetIteratorImpl::AdvanceIfNotValid
method). It turns out that the result of SmallPtrSetImpl::find_imp is either
valid or the EndPointer, so there's no need to go through that abstraction,
and the compiler cannot guess it.

Differential Revision: https://reviews.llvm.org/D80708
2020-06-01 07:49:19 +02:00
Chen Zheng
c5c4a9bca7 [MachineCombine] add a hook for resource length limit 2020-05-31 23:21:04 -04:00
Li Rong Yi
80a001669f [PowerPC] Exploit vabsd on P9
Summary: Exploit vabsd* for for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
     t[i] = abs (p[i] - q[i]);
}
this case should be matched to the HW instruction vabsdub.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D80271
2020-06-01 02:30:27 +00:00
Nico Weber
6f003d58d9 [gn build] (semi-manually) port a8ca0ec2670 2020-05-31 22:06:11 -04:00
Matt Arsenault
ab446506d6 AMDGPU/GlobalISel: Add stub reg-bank aware combiner pass 2020-05-31 20:40:14 -04:00
Craig Topper
12253f5671 [X86] Rewrite how X86PartialReduction finds candidates to consider optimizing.
Previously we walked the users of any vector binop looking for
more binops with the same opcode or phis that eventually ended up
in a reduction. While this is simple it also means visiting the
same nodes many times since we'll do a forward walk for each
BinaryOperator in the chain. It was also far more general than what
we have tests for or expect to see.

This patch replaces the algorithm with a new method that starts at
extract elements looking for a horizontal reduction. Once we find
a reduction we walk through backwards through phis and adds to
collect leaves that we can consider for rewriting.

We only consider single use adds and phis. Except for a special
case if the Add is used by a phi that forms a loop back to the
Add. Including other single use Adds to support unrolled loops.

Ultimately, I want to narrow the Adds, Phis, and final reduction
based on the partial reduction we're doing. I still haven't
figured out exactly what that looks like yet. But restricting
the types of graphs we expect to handle seemed like a good first
step. As does having all the leaves and the reduction at once.

Differential Revision: https://reviews.llvm.org/D79971
2020-05-31 12:53:01 -07:00
Simon Pilgrim
885928e3e6 [X86][AVX] Reduce unary target shuffles width if the upper elements aren't demanded. 2020-05-31 20:19:24 +01:00
Simon Pilgrim
9e57ca5e0e [X86][AVX] combineX86ShufflesRecursively - peekThroughOneUseBitcasts subvector before widening.
This matches what we do for the full sized vector ops at the start of combineX86ShufflesRecursively, and helps getFauxShuffleMask extract more INSERT_SUBVECTOR patterns.
2020-05-31 19:58:33 +01:00
Matt Arsenault
354a94569e AArch64/GlobalISel: Fix incorrect ptrmask usage for alignment
I inverted the mask when I ported to the new form of G_PTRMASK in
8bc03d2168241f7b12265e9cd7e4eb7655709f34.

I don't think this really broke anything, since G_VASTART isn't
handled for types with an alignment higher than the stack alignment.
2020-05-31 10:56:55 -04:00
Sanjay Patel
2ca9a76699 [utils] change update_test_checks.py use of 'TMP' value names
As discussed in PR45951:
https://bugs.llvm.org/show_bug.cgi?id=45951

There's a potential name collision between update_test_checks.py and -instnamer
and/or manually-generated IR test files because all of them try to use the
variable name that should never be used: "tmp".

This patch proposes to reduce the odds of collision and adds a warning if we
detect the problem. This will cause regression test churn when regenerating
CHECK lines on existing files.

Differential Revision: https://reviews.llvm.org/D80584
2020-05-31 10:46:11 -04:00