mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

210128 Commits

Author SHA1 Message Date
Hsiangkai Wang
ae0f83f30e [RISCV] Add new V instructions in v1.0-08a0b46.
Add new V instructions.
vfrsqrte7.v
vfrece7.v
vrgatherei16.vv
vneg.v
vncvt.x.x.w
vfneg.v
2021-01-22 00:59:58 +08:00
Hsiangkai Wang
0774f4ad89 [RISCV] Make LMUL field in VTYPE continuous.
Upgrade RISC-V V extension to v1.0-08a0b46.
Update the VTYPE encoding. Make LMUL encoding in a continuous field.
2021-01-22 00:47:32 +08:00
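The VTYPE change in 0774f4ad89 can be pictured with a small encoder. This is only a sketch: the function names are hypothetical, and the bit positions are an assumption based on the published V extension drafts (LMUL split across bits 5 and 1:0 before, LMUL in bits 2:0 after), not something taken from the patch itself.

```cpp
#include <cstdint>

// Assumed contiguous layout (v1.0 draft):
//   bits 2:0 vlmul[2:0], bits 5:3 vsew[2:0], bit 6 vta, bit 7 vma.
constexpr std::uint8_t encodeVTypeContiguous(unsigned VLMul, unsigned VSEW,
                                             bool TailAgnostic,
                                             bool MaskAgnostic) {
  return static_cast<std::uint8_t>((VLMul & 0x7u) | ((VSEW & 0x7u) << 3) |
                                   (TailAgnostic ? 0x40u : 0u) |
                                   (MaskAgnostic ? 0x80u : 0u));
}

// Assumed earlier split layout (v0.9 draft):
//   bits 1:0 vlmul[1:0], bits 4:2 vsew[2:0], bit 5 vlmul[2], bit 6 vta,
//   bit 7 vma.
constexpr std::uint8_t encodeVTypeSplit(unsigned VLMul, unsigned VSEW,
                                        bool TailAgnostic, bool MaskAgnostic) {
  return static_cast<std::uint8_t>((VLMul & 0x3u) | ((VSEW & 0x7u) << 2) |
                                   ((VLMul & 0x4u) << 3) |
                                   (TailAgnostic ? 0x40u : 0u) |
                                   (MaskAgnostic ? 0x80u : 0u));
}
```

With the contiguous layout the LMUL value can be read back with a single mask, whereas the split layout needs two masks and a shift to reassemble it.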
Jay Foad
29c5895119 [AMDGPU][GlobalISel] Run SIAddImgInit
This pass is required to get correct codegen for image instructions with
the tfe or lwe bits set.

Differential Revision: https://reviews.llvm.org/D95132
2021-01-21 15:54:54 +00:00
Matt Arsenault
efeae7eafa AMDGPU: Remove v_rsq_f64 patterns
This isn't accurate enough without correction
2021-01-21 10:51:36 -05:00
Matt Arsenault
c5422499b7 AMDGPU: Use more accurate fast f64 fdiv
A raw v_rcp_f64 isn't accurate enough, so start applying correction.
2021-01-21 10:51:36 -05:00
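The idea of correcting a raw reciprocal estimate (c5422499b7) can be illustrated with a generic Newton-Raphson refinement. This is a sketch of the general technique only, not the sequence the AMDGPU backend emits: `approxReciprocal` is a hypothetical stand-in that uses single precision to imitate a low-accuracy hardware estimate, and the number of refinement steps is an assumption.

```cpp
#include <cmath>

// Hypothetical stand-in for an inaccurate hardware reciprocal estimate.
// Only meaningful for nonzero B within the range of float.
double approxReciprocal(double B) {
  return static_cast<double>(1.0f / static_cast<float>(B));
}

// Divide A by B starting from the rough estimate and applying corrections.
double fastDivide(double A, double B) {
  double R = approxReciprocal(B);
  // Two Newton-Raphson steps on the reciprocal: R <- R + R * (1 - B * R).
  for (int Step = 0; Step < 2; ++Step) {
    double Error = std::fma(-B, R, 1.0);
    R = std::fma(R, Error, R);
  }
  // Initial quotient, then one residual correction: Q <- Q + R * (A - B * Q).
  double Q = A * R;
  double Residual = std::fma(-B, Q, A);
  return std::fma(Residual, R, Q);
}
```

Each refinement step roughly doubles the number of correct bits, which is why a raw estimate alone falls short of double precision while a corrected one gets close to it.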
Matt Arsenault
a964cd0596 AArch64/GlobalISel: Factor out parametersInCSRMatch
Make this look more like the DAG handling and move to common code.

I also noticed AArch64 seems to not be properly adding the
physreg:virtreg mapping to the function live ins.
2021-01-21 10:32:48 -05:00
Sebastian Neubauer
23ec8fe2bd [AMDGPU] Implement mir parseCustomPseudoSourceValue
Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.

Differential Revision: https://reviews.llvm.org/D94768
2021-01-21 16:32:17 +01:00
David Green
a094599779 [ARM] Fix vector saddsat costs.
It turns out the vectorizer calls the getIntrinsicInstrCost functions
with a scalar return type and vector VF. This updates the costmodel to
handle that, still producing the correct vector costs.

A vectorizer test is added to show it vectorizing at the correct factor
again.
2021-01-21 15:30:39 +00:00
Joseph Huber
627ae5f2d2 [OpenMP] Add support for mapping names in mapper API
Summary:
The custom mapper API did not support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D94806
2021-01-21 09:26:44 -05:00
Matt Arsenault
6e817c8235 AMDGPU: Add occupancy to serialized MachineFunctionInfo
Not sure about the default value handling, and also not sure about
defaulting to a theoretically subtarget-dependent value.
2021-01-21 09:21:00 -05:00
Sanjay Patel
6d8b694072 [InstCombine] avoid crashing on attribute propagation
In https://llvm.org/PR48810 , we are crashing while trying to
propagate attributes from mempcpy (returns void*) to memcpy
(returns nothing - void).

We can avoid the crash by removing known incompatible
attributes for the void return type.

I'm not sure if this goes far enough (should we just drop all
attributes since this isn't the same function?). We also need
to audit other transforms in LibCallSimplifier to make sure
there are no other cases that have the same problem.

Differential Revision: https://reviews.llvm.org/D95088
2021-01-21 08:13:26 -05:00
Mikael Holmen
42ce53b2e1 [MC] Use std::make_tuple to make some toolchains happy again
My toolchain (LLVM 8.0, libstdc++ 5.4.0) complained with:

12:27:43 ../lib/MC/MCDwarf.cpp:814:10: error: chosen constructor is explicit in copy-initialization
12:27:43   return {Offset, Size, SetDelta};
12:27:43          ^~~~~~~~~~~~~~~~~~~~~~~~
12:27:43 /proj/flexasic/app/llvm/8.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
12:27:43         constexpr tuple(_UElements&&... __elements)
12:27:43                   ^
12:27:43 1 error generated.

This commit adds explicit calls to std::make_tuple to work around
the problem.
2021-01-21 14:05:14 +01:00
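The workaround in 42ce53b2e1 can be sketched as follows. The function `describeRange` and its parameters are hypothetical and are not the code in MCDwarf.cpp; the point is only the form of the return statement.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical helper that returns three values as a tuple.
std::tuple<std::uint64_t, std::uint64_t, bool>
describeRange(std::uint64_t Begin, std::uint64_t End) {
  assert(Begin <= End && "range must not be reversed");
  std::uint64_t Offset = Begin;
  std::uint64_t Size = End - Begin;
  bool IsEmpty = Size == 0;
  // `return {Offset, Size, IsEmpty};` is copy-list-initialization, which is
  // rejected when the standard library declares the matching tuple
  // constructor explicit, as in the error quoted above.
  // std::make_tuple constructs the tuple directly, so it builds either way.
  return std::make_tuple(Offset, Size, IsEmpty);
}
```

A caller can unpack the result with `std::get<N>` or `std::tie`, independent of which form the return statement uses.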
Simon Pilgrim
add6aab738 [DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE (REAPPLIED).
Add DemandedElts support inside the TRUNCATE analysis.

REAPPLIED - this was reverted by @hans at rGa51226057fc3 due to an issue with vector shift amount types, which was fixed in rG935bacd3a724 and an additional test case added at rG0ca81b90d19d

Differential Revision: https://reviews.llvm.org/D56387
2021-01-21 13:01:34 +00:00
Simon Pilgrim
fc0ca227d5 [X86][SSE] Add uitofp(trunc(and(lshr(x,c)))) vector test
Reduced from regression reported by @hans on D56387
2021-01-21 12:38:36 +00:00
Simon Pilgrim
bf43079680 [DAG] SimplifyDemandedBits - correctly adjust truncated shift amount type
As noticed on D56387, for vectors we must always correctly adjust the shift amount type during truncation (not just after legalization). We were getting away with it as we currently only accepted scalars via the dyn_cast<ConstantSDNode>.
2021-01-21 12:38:36 +00:00
Adhemerval Zanella
f85ca92142 MC: AArch64: Add support for gotpage_lo15
It is not used by LLVM itself, but it will be used in lld tests
to implement R_AARCH64_LD64_GOTPAGE_LO15 support.
2021-01-21 08:29:49 -03:00
Simon Pilgrim
49e90d18ab [DAG] CombineToPreIndexedLoadStore - use const APInt& for getAPIntValue(). NFCI.
Clean up some code to use auto* properly from cast, and use const APInt& for getAPIntValue() to avoid an unnecessary copy.
2021-01-21 11:04:09 +00:00
Simon Pilgrim
4968f25461 [X86] Avoid a std::string copy by replacing auto with const auto&. NFC.
Fixes msvc analyzer warning.
2021-01-21 11:04:07 +00:00
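The kind of copy that 4968f25461 avoids can be shown with a small example. The `Option` type and `totalNameLength` function are hypothetical and only illustrate how `auto` and `const auto&` deduce differently from a getter that returns a reference.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type whose getter returns a reference to a stored string.
struct Option {
  std::string Name;
  const std::string &getName() const { return Name; }
};

std::size_t totalNameLength(const std::vector<Option> &Opts) {
  std::size_t Total = 0;
  for (const Option &O : Opts) {
    // `auto N = O.getName();` would deduce std::string and copy the string.
    // `const auto &` binds to the string already stored in O, with no copy.
    const auto &N = O.getName();
    Total += N.size();
  }
  return Total;
}
```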
Luo, Yuanke
d25938d612 Revert "[X86][AMX] Fix tile config register spill issue."
This reverts commit 20013d02f3352a88d0838eed349abc9a2b0e9cc0.
2021-01-21 18:11:43 +08:00
Florian Hahn
3ba57cf8ef [LoopUnswitch] Implement first version of partial unswitching.
This patch applies the idea from D93734 to LoopUnswitch.

It adds support for unswitching on conditions that are only
invariant along certain paths through a loop.

In particular, it targets conditions in the loop header that
depend on values loaded from memory. If either path from
the true or false successor through the loop does not modify
memory, perform partial loop unswitching.

That is, duplicate the instructions feeding the condition in the pre-header.
Then unswitch on the duplicated condition. The condition is now known
in the unswitched version for the 'invariant' path through the original loop.

One caveat of this approach is that one of the loops created can be partially
unswitched again. To avoid this behavior, `llvm.loop.unswitch.partial.disable`
metadata is added to the unswitched loops, to avoid subsequent partial
unswitching.

If that's the approach to go, I can move the code handling the metadata kind
into separate functions.

This increases the cases we unswitch quite a bit in SPEC2006/SPEC2000 &
MultiSource. It also allows us to eliminate a dead loop in SPEC2017's omnetpp.

```
Tests: 236
Same hash: 170 (filtered out)
Remaining: 66
Metric: loop-unswitch.NumBranches

Program                                        base   patch  diff
 test-suite...000/255.vortex/255.vortex.test     2.00  23.00 1050.0%
 test-suite...T2006/401.bzip2/401.bzip2.test     7.00  55.00 685.7%
 test-suite :: External/Nurbs/nurbs.test         5.00  26.00 420.0%
 test-suite...s-C/unix-smail/unix-smail.test     1.00   3.00 200.0%
 test-suite.../Prolangs-C++/ocean/ocean.test     1.00   3.00 200.0%
 test-suite...tions/lambda-0.1.3/lambda.test     1.00   3.00 200.0%
 test-suite...yApps-C++/PENNANT/PENNANT.test     2.00   5.00 150.0%
 test-suite...marks/Ptrdist/yacr2/yacr2.test     1.00   2.00 100.0%
 test-suite...lications/viterbi/viterbi.test     1.00   2.00 100.0%
 test-suite...plications/d/make_dparser.test    12.00  24.00 100.0%
 test-suite...CFP2006/433.milc/433.milc.test    14.00  27.00 92.9%
 test-suite.../Applications/lemon/lemon.test     7.00  12.00 71.4%
 test-suite...ce/Applications/Burg/burg.test     6.00  10.00 66.7%
 test-suite...T2006/473.astar/473.astar.test    16.00  26.00 62.5%
 test-suite...marks/7zip/7zip-benchmark.test    78.00 121.00 55.1%
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D93764
2021-01-21 09:46:41 +00:00
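The partial unswitching described in 3ba57cf8ef can be pictured at the source level. The pass itself works on LLVM IR, so the two hypothetical functions below are only an analogy: the first is a loop whose header condition depends on a loaded value, the second is what the loop looks like after the condition has been duplicated into the pre-header.

```cpp
#include <cstddef>

// Original loop: the condition is reloaded from memory on every iteration.
int sumOrCopy(const int *Flag, const int *In, int *Out, std::size_t N) {
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    if (*Flag != 0)
      Sum += In[I]; // this path does not write memory
    else
      Out[I] = In[I]; // this path writes memory and may change *Flag
  }
  return Sum;
}

// Partially unswitched form of the same loop.
int sumOrCopyUnswitched(const int *Flag, const int *In, int *Out,
                        std::size_t N) {
  int Sum = 0;
  // Duplicated load and compare in the pre-header.
  if (*Flag != 0) {
    // The true path never writes memory, so the condition stays true for
    // the whole loop and the branch disappears.
    for (std::size_t I = 0; I < N; ++I)
      Sum += In[I];
    return Sum;
  }
  // The false path may modify *Flag, so this copy keeps the original check.
  for (std::size_t I = 0; I < N; ++I) {
    if (*Flag != 0)
      Sum += In[I];
    else
      Out[I] = In[I];
  }
  return Sum;
}
```

The second loop in the unswitched form is the one that could be partially unswitched again, which is the situation the `llvm.loop.unswitch.partial.disable` metadata is meant to prevent.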
Fangrui Song
260c3e1d18 MCDwarf: Delete unneeded parameter
And change signature
2021-01-21 00:55:07 -08:00
Georgii Rymar
e9ef9e434c [llvm-nm][ELF] - Make -D display symbol versions.
This fixes https://bugs.llvm.org/show_bug.cgi?id=48670.

Since binutils 2.35, nm -D displays symbol versions by default.
This patch teaches llvm-nm to do the same.

Differential revision: https://reviews.llvm.org/D94907
2021-01-21 11:23:45 +03:00
Luo, Yuanke
90681ff5c1 [X86][AMX] Fix tile config register spill issue.
The previous code modeled the tile config register as a user of
each AMX instruction. This causes a problem when the tile config register
is spilled. Across a function call, the ldtilecfg instruction may be inserted
at each AMX instruction that uses the tile config register, which clobbers
all tile data registers.
To fix this issue, we remove the tile config register from the model. We
analyze the regmask of each call instruction and insert ldtilecfg if
any tile data register is live across the call. Inserting sttilecfg
before the call is unnecessary, because the tile config doesn't change
and we can just reload the config.
We also need to check for tile config register interference. Since we
don't model the config register, we should check for interference from the
ldtilecfg to each tile data register def.
             ldtilecfg
             /       \
            BB1      BB2
            /         \
           call       BB3
           /           \
       %1=tileload   %2=tilezero
We can start from the instruction of each tile def and walk backward to
ldtilecfg. If there is any call instruction on the way and the tile data
register is not preserved, we should insert ldtilecfg after the call instruction.

Differential Revision: https://reviews.llvm.org/D94155
2021-01-21 16:01:50 +08:00
Georgii Rymar
fc516b0eae [yaml2obj/obj2yaml] - Improve dumping/creating of ELF versioning sections.
This makes the following improvements.

For `SHT_GNU_versym`:
 * yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
 * yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
 * yaml2obj: set `sh_info` field automatically.
 * obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.

Also simplifies a few test cases.

Differential revision: https://reviews.llvm.org/D94956
2021-01-21 10:36:48 +03:00
madhur13490
ec64cb3a8e [IndirectFunctions] Skip propagating attributes to address taken functions
In case of indirect calls or address taken functions,
skip propagating any attributes to them. We just
propagate features to such functions.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D94585
2021-01-21 07:04:28 +00:00
Kazu Hirata
cd6de67c94 [llvm] Use hasSingleElement (NFC)
2021-01-20 21:35:55 -08:00
Kazu Hirata
bc78fac755 [Transforms] Use llvm::append_range (NFC)
2021-01-20 21:35:54 -08:00
Kazu Hirata
3d7023e152 [llvm] Construct SmallVector with iterator ranges (NFC)
2021-01-20 21:35:52 -08:00
Max Kazantsev
18f0a21fe4 [X86] Add experimental option to separately tune alignment of innermost loops
We already have an experimental option to tune loop alignment. Its impact
is very wide (and there is a suspicion that it's not always profitable). We want
to have something narrower to play with. This patch adds a similar option that
overrides the preferred alignment for innermost loops. This is for experimental
purposes; the default values do not change the existing behavior.

Differential Revision: https://reviews.llvm.org/D94895
Reviewed By: pengfei
2021-01-21 11:15:16 +07:00
Hsiangkai Wang
ac0a4f4e3c [RISCV] Implement vssseg intrinsics.
Define vssseg intrinsics and pseudo instructions. Lower vssseg
intrinsics to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94863
2021-01-21 11:51:35 +08:00
Hsiangkai Wang
0051771904 [RISCV] Implement vlsseg intrinsics.
Define vlsseg intrinsics and pseudo instructions. Lower vlsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94763
2021-01-21 11:51:35 +08:00
Hsiangkai Wang
00fc860b80 [RISCV] Implement vsseg intrinsics.
Define vsseg intrinsics and pseudo instructions. Lower vsseg intrinsics
to pseudo instructions in RISCVDAGToDAGISel.

Differential Revision: https://reviews.llvm.org/D94688
2021-01-21 11:51:35 +08:00
Craig Topper
69813f9c57 [RISCV] Use update_llc_test_checks.py to regenerate check lines in vleff-rv32.ll and vleff-rv64.ll.
This should minimize change in a future patch.
2021-01-20 18:51:02 -08:00
Jonas Devlieghere
0ca2ad5bbe [dsymutil] Compare object modification times using second precision
The modification time in the debug map is expressed using second
precision, while the modification time returned by the filesystem could
be more precise. Avoid spurious warnings about timestamp mismatches by
truncating the modification time reported by the system to seconds.
2021-01-20 18:45:30 -08:00
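The comparison described in 0ca2ad5bbe can be sketched with `std::chrono`. The type aliases and the function `timestampsMatch` are hypothetical, and dsymutil uses LLVM's own time utilities, so this only illustrates the truncation step.

```cpp
#include <chrono>

using Clock = std::chrono::system_clock;
using PreciseTime = std::chrono::time_point<Clock, std::chrono::nanoseconds>;
using SecondTime = std::chrono::time_point<Clock, std::chrono::seconds>;

// Compare a filesystem timestamp with a debug map timestamp that only has
// second precision.
bool timestampsMatch(PreciseTime FromFilesystem, SecondTime FromDebugMap) {
  // time_point_cast drops the sub-second part (it truncates toward zero).
  SecondTime Truncated =
      std::chrono::time_point_cast<std::chrono::seconds>(FromFilesystem);
  return Truncated == FromDebugMap;
}
```

Without the truncation, a file modified at a fractional second would never compare equal to the value recorded in the debug map, which is the source of the spurious warnings.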
Guozhi Wei
4abb9aaecc [DAGCombiner] Precommit test case for D95086
This is the test case for D95086 with worse result.

Differential Revision: https://reviews.llvm.org/D95103
2021-01-20 17:15:47 -08:00
Varun Gandhi
d38b7dd07c [NFC] Minor cleanup for ValueHandle code.
Based on feedback in https://reviews.llvm.org/D93433.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D94238
2021-01-20 16:27:55 -08:00
Dávid Bolvanský
01c91735ba [BuildLibcalls, Attrs] Support more variants of C++'s new, add attributes for C++'s delete
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95095
2021-01-21 00:12:37 +01:00
Craig Topper
0af17f4b92 [RISCV] Add another isel pattern for slliu.w.
Previously we only matched (and (shl X, C1), 0xffffffff << C1)
which matches the InstCombine canonicalization order. But it's
possible to see (shl (and X, 0xffffffff), C1) if the pattern
is introduced in SelectionDAG. For example, through expansion of
a GEP.
2021-01-20 14:54:40 -08:00
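The two forms matched for slliu.w in 0af17f4b92 compute the same value, which can be checked on plain 64-bit integers. The function names are hypothetical, and the sketch assumes a shift amount below 64.

```cpp
#include <cstdint>

// (and (shl X, C1), 0xffffffff << C1): shift first, then mask.
constexpr std::uint64_t maskAfterShift(std::uint64_t X, unsigned C1) {
  return (X << C1) & (std::uint64_t{0xffffffff} << C1);
}

// (shl (and X, 0xffffffff), C1): zero-extend the low 32 bits, then shift.
constexpr std::uint64_t maskBeforeShift(std::uint64_t X, unsigned C1) {
  return (X & std::uint64_t{0xffffffff}) << C1;
}

static_assert(maskAfterShift(0xdeadbeefcafef00dULL, 4) ==
                  maskBeforeShift(0xdeadbeefcafef00dULL, 4),
              "both forms keep only the shifted low 32 bits of X");
```

Because the results agree, both DAG shapes can be selected to the same instruction; which shape appears depends on where the pattern was introduced.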
Craig Topper
f1cbef17c5 [RISCV] Add addu.w and slliu.w test that uses getelementptr with zero extended indices.
This is closer to the kind of code that these intrinsics are
targeted at. Note we fail to match slliu.w here because our pattern
looks for (and (shl X, C1), 0xffffffff << C1) rather than
(shl (and X, 0xffffffff), C1). I'll fix this in a follow up
commit.
2021-01-20 14:54:40 -08:00
Ryan Houdek
780b2cb4d4 D94954: Fixes Snapdragon Kryo CPU core detection
All of these families were claiming to be a73 based, which was causing
-mcpu/mtune=native to never use the newer features available to these
cores.

Goes through each and bumps the individual cores to their respective Big
counterparts. Since this code path doesn't support big.little detection,
there was already a precedent set with the Qualcomm line to choose the
big cores only.

Adds a comment on each line for the product's name that the part number
refers to. Confirmed on-device and through Linux header naming
conventions.

Additionally, newer SoCs mix CPU parts from multiple implementers:
both 0x41 (ARM) and 0x51 (Qualcomm) in the Snapdragon case.

This was causing a desync in information where the scan at the start to
find the implementer would mismatch the part scan later on.
Now scan for both implementer and part at the start so these stay in
sync.

Differential Revision: https://reviews.llvm.org/D94954
2021-01-20 22:23:43 +00:00
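The idea of scanning implementer and part together (780b2cb4d4) can be sketched as below. This is not the code in LLVM's host detection: `CoreId`, `parseHexField`, and `scanCores` are hypothetical names, and the sketch assumes the usual Linux `/proc/cpuinfo` layout on Arm, where a `CPU implementer` line precedes the `CPU part` line of the same core.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: one (implementer, part) pair per core entry.
struct CoreId {
  unsigned Implementer;
  unsigned Part;
};

// Parses a "<Key> : 0x<hex>" line. Returns false when the line does not
// start with Key. std::stoul throws if the value is not a number.
static bool parseHexField(const std::string &Line, const std::string &Key,
                          unsigned &Value) {
  if (Line.compare(0, Key.size(), Key) != 0)
    return false;
  std::size_t Colon = Line.find(':');
  if (Colon == std::string::npos)
    return false;
  Value = static_cast<unsigned>(
      std::stoul(Line.substr(Colon + 1), nullptr, 16));
  return true;
}

// Collects implementer and part in a single pass so the two values always
// come from the same core entry.
std::vector<CoreId> scanCores(const std::string &CpuInfo) {
  std::vector<CoreId> Cores;
  std::istringstream Stream(CpuInfo);
  std::string Line;
  unsigned Implementer = 0, Part = 0;
  bool HaveImplementer = false;
  while (std::getline(Stream, Line)) {
    if (parseHexField(Line, "CPU implementer", Implementer)) {
      HaveImplementer = true;
    } else if (HaveImplementer && parseHexField(Line, "CPU part", Part)) {
      Cores.push_back({Implementer, Part});
      HaveImplementer = false;
    }
  }
  return Cores;
}
```

Pairing the values per core avoids the mismatch described above, where the implementer found by an early scan belongs to a different core than the part number found later.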
Tony Tye
aa97298419 [NFC][AMDGPU] Document target ID syntax for code object V2 to V3
Differential Revision: https://reviews.llvm.org/D95018
2021-01-20 21:48:52 +00:00
Mircea Trofin
62a8a8cc0d Reland "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit d97f776be5f8cd3cd446fe73827cd355f6bab4e1.

The original problem was due to build failures in shared lib builds. D95079
moved ImportedFunctionsInliningStatistics under Analysis, unblocking
this.
2021-01-20 13:33:43 -08:00
LLVM GN Syncbot
9c6a63c9cf [gn build] Port 95ce32c7878d
2021-01-20 21:18:20 +00:00
Mircea Trofin
88d4cb48b4 [NFC] Move ImportedFunctionsInliningStatistics to Analysis
This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.

Differential Revision: https://reviews.llvm.org/D95079
2021-01-20 13:18:03 -08:00
Nikita Popov
7112a6a1c2 [PredicateInfo] Handle logical and/or
Teach PredicateInfo to handle logical and/or the same way as
bitwise and/or. This allows handling logical and/or inside IPSCCP
and NewGVN.
2021-01-20 21:03:07 +01:00
Nikita Popov
4d77906ee2 [PredicateInfo][SCCP][NewGVN] Add tests for logical and/or (NFC)
Duplicate some existing and/or tests using logical form.
2021-01-20 20:53:55 +01:00
Reid Kleckner
ea4ad8388b Reland "[PDB] Defer relocating .debug$S until commit time and parallelize it"
This reverts commit 5b7aef6eb4b2930971029b984cb2360f7682e5a5 and relands
6529d7c5a45b1b9588e512013b02f891d71bc134.

The ASan error was debugged and determined to be the fault of an invalid
object file input in our test suite, which was fixed by my last change.
LLD's project policy is that it assumes input objects are valid, so I
have added a comment about this assumption to the relocation bounds
check.
2021-01-20 11:53:43 -08:00
Nikita Popov
52fe8d4bda [PredicateInfo] Generalize processing of conditions
Branch/assume conditions in PredicateInfo are currently handled in
a rather ad-hoc manner, with some arbitrary limitations. For example,
an `and` of two `icmp`s will be handled, but an `and` of an `icmp`
and some other condition will not. That also includes the case where
more than two conditions are and'ed together.

This patch makes the handling more general by looking through and/ors
up to a limit and considering all kinds of conditions (though operands
will only be taken for cmps of course).

Differential Revision: https://reviews.llvm.org/D94447
2021-01-20 20:40:41 +01:00
Thomas Lively
652dd89674 [WebAssembly] Prototype new f64x2 conversions
As proposed in https://github.com/WebAssembly/simd/pull/383.

Differential Revision: https://reviews.llvm.org/D95012
2021-01-20 11:28:06 -08:00
dfukalov
f3ae5b9b8c [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
2021-01-20 22:22:45 +03:00