mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

212943 Commits

Author SHA1 Message Date
Vaivaswatha Nagaraj
566ea91075 [OCaml] Add (get/set)_module_identifer functions
Also:

- Fix a bug that crept in when fixing a buildbot failure in
f7be9db622
- Use mlsize_t for cstr_to_string as that is what
caml_alloc_string specifies.

Differential Revision: https://reviews.llvm.org/D98851
2021-03-20 20:41:51 +05:30
David Zarzycki
980d119a88 [lit] Sort testing summary output
As fallout from the record-and-reorder work, people asked that the
summary output be sorted to aid diffing.
2021-03-20 07:52:08 -04:00
Jeroen Dobbelaere
52195b6999 Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.
Now that intrinsic name mangling can cope with unnamed types, the custom name mangling in PredicateInfo (introduced by D49126) can be removed.
(See D91250, D48541)

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D91661
2021-03-20 11:37:09 +01:00
Wang, Pengfei
fff4eb8636 [X86] Fix a bug when calculating the ldtilecfg insertion points.
The BB in which we initialize the ldtilecfg is special. We don't need to check
whether its predecessor BBs need to insert ldtilecfg for calls.

We reused the flag HasCallBeforeAMX, so that the predecessors won't be
added to CfgNeedInsert.

This case happens only when the entry BB is in a loop. We need to hoist
the first tile config point out of the loop in future.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D98845
2021-03-20 17:48:59 +08:00
Juneyoung Lee
4bb78e2cb0 [CFLGraph] Fix a crash due to missing handling of freeze
https://reviews.llvm.org/D85534#2636321
2021-03-21 02:14:13 +09:00
Shao-Ce Sun
c789b50f4d [NFC][ValueTypes] Align code by column
Adjusted some whitespaces.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98975
2021-03-20 13:43:07 +08:00
Carl Ritson
c6760fad72 [AMDGPU] Add MDT update missing from D98915 2021-03-20 13:38:58 +09:00
Nemanja Ivanovic
b918a37b96 [PowerPC][NFC] Do not produce i64 constants in 32-bit mode
There are some instances where we produce constants of type MVT::i64
unconditionally in the target DAG combines. This is not actually
valid in 32-bit mode.
2021-03-19 22:54:47 -05:00
Craig Topper
ca727da695 [RISCV] Rename WriteShift/ReadShift scheduler classes to WriteShiftImm/ReadShiftImm. Move variable shifts from WriteIALU/ReadIALU to new WriteShiftReg/ReadShiftReg.
Previously only immediate shifts were in WriteShift. Register
shifts were grouped with IALU. Seems likely that immediate shifts
would be as fast or faster than register shifts. And that immediate
shifts wouldn't be any faster than IALU. So if any deserved to be in
their own group it should be register shifts not immediate shifts.

Rather than try to flip them let's just add more granularity
and give each kind their own class. I've used new names for both to
make them unambiguous and to force any downstream implementations to
put correct information in their scheduler models.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D98911
2021-03-19 20:39:49 -07:00
Carl Ritson
ab6ec1f384 [AMDGPU] Rename SIInsertSkips Pass
Pass no longer handles skips.  Pass now removes unnecessary
unconditional branches and lowers early termination branches.
Hence rename to SILateBranchLowering.

Move code to handle returns to epilog from SIPreEmitPeephole
into SILateBranchLowering. This means SIPreEmitPeephole only
contains optional optimisations, and all required transforms
are in SILateBranchLowering.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98915
2021-03-20 11:48:04 +09:00
Carl Ritson
6af08589c4 [AMDGPU] Merge SIRemoveShortExecBranches into SIPreEmitPeephole
SIRemoveShortExecBranches is an optimisation so fits well in the
context of SIPreEmitPeephole.

Test changes relate to early termination from kills which have now
been lowered prior to considering branches for removal.
As these use s_cbranch the execz skips are now retained instead.
Currently either behaviour is valid as kill with EXEC=0 is a nop;
however, if early termination is used differently in future then
the new behaviour is the correct one.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D98917
2021-03-20 11:26:42 +09:00
Lang Hames
8f653b1bbd [llvm-jitlink] Scan input files for first object to determine triple.
The previous logic would crash if the first input file was an archive rather
than an object.
2021-03-19 19:24:10 -07:00
Senran Zhang
d628808ad9 [Utils][vim] Highlight poison keyword
Reviewed By: awarzynski, MaskRay

Differential Revision: https://reviews.llvm.org/D98927
2021-03-19 19:09:11 -07:00
Lang Hames
ae897863a3 [JITLink] Remove redundant local variable definitions from a unit test. 2021-03-19 18:29:36 -07:00
Carl Ritson
fd73d84cf5 [AMDGPU] Allow index optimisation in SIPreEmitPeephole for bundles
Add code so duplicate index register changes can be removed from
inside bundles.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm.org/D98940
2021-03-20 10:26:23 +09:00
Anshil Gandhi
b6fa39152d [NFC] [PowerPC] Determine Endianness in PPCTargetMachine
The TargetMachine uses the triple to determine endianness. Just
use that logic rather than replicating it in PPCSubtarget.

Differential revision: https://reviews.llvm.org/D98674
2021-03-19 20:22:16 -05:00
Peter Collingbourne
b0fbcbc034 gn build: Unbreak Android cross-compilation.
- D96404 defaulted to libunwind which isn't provided by NDK r21
  (or r22), so specify -rtlib=libgcc on non-arm32.
- D97993 means that we need to use --gcc-toolchain instead of -B
  to let the driver find libgcc.
2021-03-19 16:28:24 -07:00
Ellis Hoag
434b17dd9a Port D97640 to llvm/include/llvm/ProfileData/InstrProfData.inc
Differential Revision: https://reviews.llvm.org/D98982
2021-03-19 16:24:16 -07:00
Lang Hames
263720c0fc [JITLink] Don't issue lookups for empty symbol sets.
Issuing a lookup for an empty symbol set is legal, but can actually result in
unrelated work being done if there was a work queue left over from the previous
lookup. We can avoid doing this unrelated work (reducing stack depth and
interleaving of debugging output) by not issuing these no-op lookups in the
first place.
2021-03-19 16:10:47 -07:00
Christoffer Lernö
3f24585141 Add type attributes to LLVM C API
The LLVM C API is missing type attributes, which are needed by attributes
such as sret and byval. This patch adds three missing wrapper
functions.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=48249

https://reviews.llvm.org/D97763
2021-03-19 19:07:04 -04:00
Arthur Eubanks
4b9d699155 [NewPM] Verify LoopAnalysisResults after a loop pass
All loop passes should preserve all analyses in LoopAnalysisResults. Add
checks for those when the checks are enabled (which is by default with
expensive checks on).

Note that due to PR44815, we don't check LAR's ScalarEvolution.
Apparently calling SE.verify() can change its results.

This is a reland of https://reviews.llvm.org/D98820 which was reverted
due to unacceptably large compile time regressions in normal debug
builds.
2021-03-19 14:56:37 -07:00
Jessica Paquette
ae291b6dfb [GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes)
There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.

E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain
code that matches a bitfield extract from an and + right shift.

Rather than duplicating code in the same way, this adds two opcodes:

- G_UBFX (unsigned bitfield extract)
- G_SBFX (signed bitfield extract)

They work like this

```
%x = G_UBFX %y, %lsb, %width
```

Where `lsb` and `width` are

- The least-significant bit of the extraction
- The width of the extraction

This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends
the result, while G_SBFX sign-extends the result.

This should allow us to use the combiner to match the bitfield extraction
patterns rather than duplicating pattern-matching code in each target.

Differential Revision: https://reviews.llvm.org/D98464
2021-03-19 14:37:19 -07:00
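The extraction semantics described in this commit message can be sketched in a few lines. This is an illustrative model only, not the GlobalISel implementation; the function names `ubfx` and `sbfx` are hypothetical helpers, and the sketch assumes `width >= 1`.

```python
def ubfx(value: int, lsb: int, width: int) -> int:
    """Extract `width` bits of `value` starting at bit `lsb`, zero-extended."""
    mask = (1 << width) - 1
    return (value >> lsb) & mask


def sbfx(value: int, lsb: int, width: int) -> int:
    """Extract `width` bits of `value` starting at bit `lsb`, sign-extended."""
    field = ubfx(value, lsb, width)
    sign_bit = 1 << (width - 1)
    # Flipping the sign bit and subtracting it maps the field to its
    # two's-complement value.
    return (field ^ sign_bit) - sign_bit


# Bits 2..5 of 0b10110110 are 0b1101.
print(ubfx(0b10110110, 2, 4))  # unsigned view of 0b1101
print(sbfx(0b10110110, 2, 4))  # signed view of 0b1101
```

Both helpers match the "and + right shift" pattern the message mentions; the signed variant only differs in how the extracted field is extended.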
Fangrui Song
e2c184371f [llvm-readobj] Remove legacy GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} and dump new GNU_PROPERTY_X86_ISA_1_{NEEDED,USED}
https://sourceware.org/bugzilla/show_bug.cgi?id=26703 deprecated the
previous GNU_PROPERTY_X86_ISA_1_{CMOV,SSE,*} values (renamed to `COMPAT`)
and added new values.

Since the legacy values are not used by compilers, having dumping support in
llvm-readobj is unnecessary. So just drop the legacy feature.

The new values are used by GCC 11
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250) `-march=x86-64-v[234]` to
indicate the micro-architecture ISA levels.

Differential Revision: https://reviews.llvm.org/D98818
2021-03-19 14:35:22 -07:00
Arthur Eubanks
36d2e0e3b3 Revert "[NewPM] Verify LoopAnalysisResults after a loop pass"
This reverts commit 94c269baf58330a5e303a4f86f64681f2f7a858b.

Still causes too large a compile-time regression in normal debug
builds. Will put under expensive checks instead.
2021-03-19 14:31:08 -07:00
Ella Ma
2a15d1e5e9 [llvm] Add assertions for the smart pointers with the possibility to be null in ModuleLazyLoaderCache::operator()
Split from D91844.

This concerns the return value of function `ModuleLazyLoaderCache::operator()` in file llvm/tools/llvm-link/llvm-link.cpp. According to the bug report of my static analyzer, the std::function variable `ModuleLazyLoaderCache::createLazyModule` points to function `loadFile`, which may return `nullptr` on error, and the pointer is dereferenced without a check.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97258
2021-03-19 13:52:34 -07:00
Arthur Eubanks
20e98ffdeb [NewPM] Verify LoopAnalysisResults after a loop pass
All loop passes should preserve all analyses in LoopAnalysisResults. Add
checks for those.

Note that due to PR44815, we don't check LAR's ScalarEvolution.
Apparently calling SE.verify() can change its results.

Only verify MSSA when VerifyMemorySSA is set, as it is normally very expensive.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98820
2021-03-19 13:26:45 -07:00
Sanjay Patel
32fe5823ff [SLP] remove unnecessary characters in test; NFC
Glitch that crept in with 62f9c3358b81
2021-03-19 15:09:53 -04:00
Sanjay Patel
9eda45d03e [SLP] add tests for min/max reductions that use intrinsics; NFC 2021-03-19 15:06:16 -04:00
Craig Topper
0b6dbbc1d5 [AArch64] Fix LowerMGATHER to return the chain result for floating point gathers.
Found by adding asserts to LegalizeDAG to make sure custom legalized
results had the right types.

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D98968
2021-03-19 11:53:46 -07:00
David Green
5d7ef4b589 [ARM] Tone down the MVE scalarization overhead
The scalarization overhead was set deliberately high for MVE, whilst the
codegen was new. It helps protect us against the negative ramifications
of mixing scalar and vector instructions. This decreases that,
especially for floating point where the cost of extracting/inserting
lane elements can be low. For integer the cost is still fairly high due
to the cross-register-bank copy, but is no longer n^2 in the length of
the vector.

In general, this will decrease the cost of scalarizing floats and long
integer vectors. i64 increases in cost, having a high cost both before and
after this patch. For floats this allows us to start doing things like
vectorizing fdiv instructions, even if they are scalarized.

Differential Revision: https://reviews.llvm.org/D98245
2021-03-19 18:30:11 +00:00
Philip Reames
eaf092af50 Update basic deref API to account for possibility of free [NFC]
This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree".

The point of this change is purely to simplify iteration on other pieces on the way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet.

Differential Revision: https://reviews.llvm.org/D98908
2021-03-19 11:17:19 -07:00
Alexey Bataev
1e97683a06 [Cost]Canonicalize the cost for logical or/and reductions.
The generic cost of logical or/and reductions should be cost of bitcast
<ReduxWidth x i1> to iReduxWidth + cmp eq|ne iReduxWidth.

Differential Revision: https://reviews.llvm.org/D97961
2021-03-19 11:01:58 -07:00
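The costed sequence (bitcast `<ReduxWidth x i1>` to `iReduxWidth`, then one integer compare) can be illustrated with a small model. This sketch assumes the usual lowering where an or-reduction compares the packed integer `ne 0` and an and-reduction compares it `eq` all-ones; the helper names are hypothetical and the lane-to-bit order is chosen arbitrarily, since it does not affect either result.

```python
def bitcast_i1_vector(lanes):
    """Pack a list of booleans (a <N x i1> vector) into an N-bit integer."""
    bits = 0
    for i, lane in enumerate(lanes):
        if lane:
            bits |= 1 << i
    return bits


def reduce_or(lanes):
    # Any lane set <=> packed integer is non-zero (cmp ne).
    return bitcast_i1_vector(lanes) != 0


def reduce_and(lanes):
    # All lanes set <=> packed integer equals all-ones (cmp eq).
    all_ones = (1 << len(lanes)) - 1
    return bitcast_i1_vector(lanes) == all_ones


lanes = [True, False, True, True]
print(reduce_or(lanes), reduce_and(lanes))
```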
Bjorn Pettersson
13603c344c [LangRef] Describe memory layout for vector types
There are a couple of caveats when it comes to how vectors are
stored to memory, and thereby also how bitcasts between vector
and integer types work, in LLVM IR, especially in relation to
endianness. This patch is an attempt to document such things.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D94964
2021-03-19 19:00:37 +01:00
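As a rough illustration of the endianness caveat, the sketch below models a vector-to-integer bitcast. It assumes the rule as documented in the LangRef: on little-endian targets element 0 occupies the least significant bits of the integer, and on big-endian targets element 0 occupies the most significant bits. The function name and parameters are hypothetical.

```python
def bitcast_vector_to_int(elements, elem_bits, big_endian):
    """Model bitcasting <N x iK> to an i(N*K) value."""
    mask = (1 << elem_bits) - 1
    count = len(elements)
    result = 0
    for index, element in enumerate(elements):
        # Element 0 goes to the top on big-endian, to the bottom on little-endian.
        position = (count - 1 - index) if big_endian else index
        result |= (element & mask) << (position * elem_bits)
    return result


vec = [1, 2, 3, 5]  # <4 x i4>
print(hex(bitcast_vector_to_int(vec, 4, big_endian=True)))
print(hex(bitcast_vector_to_int(vec, 4, big_endian=False)))
```

The same vector therefore yields two different integer values depending on the target's byte order, which is the caveat the commit sets out to document.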
Craig Topper
1508ec9d96 [RISCV] Add missing bitcasts to the results of lowerINSERT_SUBVECTOR and lowerEXTRACT_SUBVECTOR when handling mask vectors.
Found by adding asserts to LegalizeDAG to catch incorrect result
types being returned.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98964
2021-03-19 10:54:33 -07:00
Craig Topper
fa5127d61e [Hexagon] Return an i64 for result 0 from LowerREADCYCLECOUNTER instead of an i32.
As far as I can tell, the node coming in has an i64 result so the
return should have the same type. The HexagonISD node used for
this has a type profile that says the result is i64.

Found while trying to add asserts to LegalizeDAG to catch
result type mismatches.

Reviewed By: kparzysz

Differential Revision: https://reviews.llvm.org/D98962
2021-03-19 10:54:33 -07:00
Jianzhou Zhao
8e8ad7066e [dfsan] Turn on testing origin tracking at atomics.ll 2021-03-19 17:53:13 +00:00
Andrei Elovikov
e11e0993d0 [NFC][VPlan] Guard print routines with "#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D98897
2021-03-19 10:50:12 -07:00
Andrei Elovikov
c3beafcfc6 [VPlan] Add plain text (not DOT's digraph) dumps
I foresee two uses for this:
1) It's easier to use those in debugger.
2) Once we start implementing more VPlan-to-VPlan transformations (especially
   inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
   LIT test would become too obscure. I can imagine that we'd want to CHECK
   against VPlan dumps after multiple transformations instead. That would be
   easier with plain text dumps than with DOT format.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D96628
2021-03-19 10:50:12 -07:00
Craig Topper
041f7386ea [RISCV] Lower scalable vector masked loads to intrinsics to match fixed vectors and reduce isel patterns.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98840
2021-03-19 10:39:35 -07:00
Fraser Cormack
295f85ecc4 [RISCV] Maintain fixed-length info when optimizing BUILD_VECTORs
I'm not sure how I failed to notice this before, but when optimizing
dominant-element BUILD_VECTORs we would lower via the scalable container type,
which lost us the information about the fixed length of the vector types. By
lowering via the fixed-length type we can preserve that information and
eliminate redundant vsetvli instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98938
2021-03-19 17:21:06 +00:00
Philip Reames
1a629f23e4 [SCEV] Factor out a lambda for strict condition splitting [NFC] 2021-03-19 10:07:12 -07:00
Fraser Cormack
2e43a1f778 [RISCV] Add missing CHECKs to vector test
Since the "LMUL-MAX=2" output for some test functions differed between
RV32 and RV64, the update_llc_test_checks script failed to emit a
unified LMULMAX2 check for them. I'm not sure why it didn't warn about
this.

This patch also takes the opportunity to add unified RV32/RV64 checks to
help shorten the test file when the output for LMULMAX1 and LMULMAX2 is
identical but differs between the two ISAs.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98944
2021-03-19 16:52:16 +00:00
Fraser Cormack
f42a04d7a5 [RISCV] Fix missing scalable->fixed-length vector conversion
Returning the scalable-vector container type would present problems when
the fixed-length INSERT_VECTOR_ELT was used by later operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98776
2021-03-19 16:49:47 +00:00
Martin Storsjö
4ac2184d3a [cmake] Enable Clang warnings about redundant semicolons
This matches what GCC warns about when -pedantic is enabled.

This should avoid such redundant semicolons creeping into the codebase.

Differential Revision: https://reviews.llvm.org/D98941
2021-03-19 18:49:05 +02:00
Jay Foad
210266842f [AMDGPU] Rationalize some check prefixes and use more common prefixes. NFC. 2021-03-19 16:48:33 +00:00
Jay Foad
55e205a98a [AMDGPU] Remove weird target triples from tests. NFC. 2021-03-19 16:48:32 +00:00
Paul Robinson
05294a4e68 [RGT] Recode more unreachable assertions and tautologies
Count iterations of zero-trip loops and assert the count is zero,
rather than asserting inside the loop.
Unreachable functions should use llvm_unreachable.
Remove tautological 'if' statements, even when they're following a
pattern of checks.

Found by the Rotten Green Tests project.
2021-03-19 09:17:22 -07:00
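The first recoding rule can be sketched outside of C++ as follows. An assertion placed inside a loop that is expected to run zero times never executes on a passing run, so the test cannot distinguish "checked and fine" from "never checked". Counting iterations and asserting on the count afterwards always executes the assertion. The helper names here are hypothetical, not taken from the LLVM unit tests.

```python
def count_iterations(iterable):
    """Return how many times a loop over `iterable` actually executes."""
    count = 0
    for _ in iterable:
        count += 1
    return count


def assert_zero_trip(iterable):
    """Assert that iterating `iterable` executes the loop body zero times."""
    count = count_iterations(iterable)
    # This assertion runs on every call, including the passing case.
    assert count == 0, f"expected a zero-trip loop, saw {count} iteration(s)"


assert_zero_trip([])
```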
Simon Pilgrim
ea4ee76d88 [DAG] computeKnownBits - add ISD::MULHS/MULHU/SMUL_LOHI/UMUL_LOHI handling
Reuse the existing KnownBits multiplication code to handle the 'extend + multiply + extract high bits' pattern for multiply-high ops.

Noticed while looking at the codegen for D88785 / D98587 - the patch helps division-by-constant expansion code in particular, which suggests that we might have some further KnownBits div/rem cases we could handle - but this was far easier to implement.

Differential Revision: https://reviews.llvm.org/D98857
2021-03-19 16:02:31 +00:00
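The "extend + multiply + extract high bits" pattern behind multiply-high operations can be sketched as below. This only models the arithmetic of the operations, not the KnownBits analysis added by the patch; the helper names and the `bits` parameter are assumptions for illustration.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def mulhu(a, b, bits):
    """Unsigned multiply-high: zero-extend, multiply, keep the high half."""
    mask = (1 << bits) - 1
    return ((a & mask) * (b & mask)) >> bits


def mulhs(a, b, bits):
    """Signed multiply-high: sign-extend, multiply, keep the high half (signed)."""
    # Python's >> on negative integers is an arithmetic shift.
    return (to_signed(a, bits) * to_signed(b, bits)) >> bits


print(mulhu(200, 200, 8))
print(mulhs(0xFF, 0x02, 8))
```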
Jay Foad
a59cc2f1c8 [AMDGPU] Add atomic optimizer nouse tests
Add some atomic optimizer tests where there is no use of the result of
the atomic operation, which is a common case in real code. NFC.

Differential Revision: https://reviews.llvm.org/D98952
2021-03-19 15:39:42 +00:00
Stanislav Mekhanoshin
c6af60f877 [AMDGPU] Remove dead glc1 handling in asm parser. NFC. 2021-03-19 08:37:47 -07:00