mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 10:32:48 +02:00
219433 Commits

Author SHA1 Message Date
Joseph Huber
4f1fd1c209 [Attributor] Change function internalization to not replace uses in internalized callers
The current implementation of function internalization creates a copy of each
function and replaces every use. This has the downside that the external
versions of the functions will call into the internalized versions of the
functions. This prevents them from being fully independent of each other. This
patch replaces the current internalization scheme with a method that creates
all the copies of the functions intended to be internalized first and then
replaces the uses as long as their caller is not already internalized.
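
The two-phase scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the module is modeled as a dict from function name to a list of callee names, and `internalize` and the `.internalized` suffix are hypothetical names, not the Attributor's API. "Caller is not already internalized" is read here as "caller is not one of the originals that received a copy".

```python
def internalize(call_graph, to_internalize):
    """Return a new call graph with internalized copies added.

    call_graph: dict mapping function name -> list of callee names.
    to_internalize: set of function names that should get an internal copy.
    """
    suffix = ".internalized"  # hypothetical naming scheme
    copy_of = {name: name + suffix for name in to_internalize}

    # Phase 1: create every copy first; the copies still call the originals.
    result = {name: list(callees) for name, callees in call_graph.items()}
    for name, copy in copy_of.items():
        result[copy] = list(call_graph[name])

    # Phase 2: redirect uses to the copies, except inside the external
    # originals, which keep calling the external versions.
    for caller in list(result):
        if caller in to_internalize:
            continue
        result[caller] = [copy_of.get(c, c) for c in result[caller]]
    return result


graph = {"main": ["f"], "f": ["g"], "g": []}
new_graph = internalize(graph, {"f", "g"})
```

In this model the external `f` still calls the external `g`, while `main` and `f.internalized` call only internalized copies, so the two sets of functions stay independent.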

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106931

(cherry picked from commit adbaa39dfce7a8361d89b6a3b382fd8f50b94727)
2021-08-04 16:35:01 -07:00
Cullen Rhodes
710ae2bfd3 [ReleaseNotes] Add scalable matrix extension support to AArch64 changes
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D106853
2021-08-03 15:24:36 +00:00
Muhammad Omair Javaid
bafc7f359f [llvm][Release notes] Add AArch64 SVE, PAC and LLDB prebuilt binary
This patch updates the LLVM release notes to add an announcement about
AArch64 SVE, PAC and LLDB prebuilt binary.
2021-08-03 20:20:07 +05:00
David Spickett
aa2a6b072f [llvm][Release notes] Add memory tagging support to lldb changes
2021-08-03 12:25:36 +00:00
Sanjay Patel
d4eedb2312 [Analysis] improve function signature checking for snprintf
The check for size_t parameter 1 was already here for snprintf_chk,
but it wasn't applied to regular snprintf. This could lead to
mismatching and eventually crashing as shown in:
https://llvm.org/PR50885

(cherry picked from commit 7f5555776513f174729a686ed01270e23462aaf7)
2021-08-02 22:58:39 -07:00
Jose M Monsalve Diaz
9770d34891 [OpenMP] Fixing llvm-omp-device-info compilation with runtimes
When using `-DLLVM_ENABLE_RUNTIMES` instead of `-DLLVM_ENABLE_PROJECTS`,
the `llvm-omp-device-info` tool is not compiled or installed.
In general, no LLVM tool would be built under runtimes, because the
`-DLLVM_BUILD_TOOLS` flag is dropped when the runtimes build invokes
cmake again.

This patch is simple. Just forward the value of this flag to the
runtime cmake command.

I'm also removing an unnecessary comment in the compilation of the tool.

Differential Revision: https://reviews.llvm.org/D107177

(cherry picked from commit 5424ceeda0534ab382e2a6cb192099f76ee8b12c)
2021-08-02 20:05:19 -07:00
Simon Pilgrim
70f5e23577 [X86][AVX] Add test case for PR51281
(cherry picked from commit 6569b7f90239b5932465a1c6936632b4a9527d66)
2021-08-02 20:05:12 -07:00
Sanjay Patel
df3286259b [DAGCombiner] don't try to partially reduce add-with-overflow ops
This transform was added with D58874, but there were no tests for overflow ops.
We need to change this one way or another because it can crash as shown in:
https://llvm.org/PR51238

Note that if there are no uses of an overflow op's bool overflow result, we
reduce it to a regular math op, so we continue to fold that case either way.
If we have uses of both the math and the overflow bool, then we are likely
not saving anything by creating an independent sub instruction as seen in
the test diffs here.

This patch makes the behavior in SDAG consistent with what we do in
instcombine AFAICT.
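
For readers unfamiliar with add-with-overflow ops, here is a small Python model of what such an op produces. It is an assumption-laden sketch (unsigned, 32-bit by default, helper names made up for illustration) and not code from SDAG or instcombine. It only shows why an op whose overflow bit is unused is equivalent to a regular add.

```python
def uadd_with_overflow(a, b, bits=32):
    """Model an unsigned add-with-overflow: returns (wrapped sum, overflow bit)."""
    mask = (1 << bits) - 1
    full = (a & mask) + (b & mask)
    return full & mask, full > mask


def plain_add(a, b, bits=32):
    """Model a regular wrapping add of the same width."""
    mask = (1 << bits) - 1
    return (a + b) & mask


# If nobody reads the overflow bit, only the first result matters,
# and that result is exactly what the plain add computes.
total, overflowed = uadd_with_overflow(0xFFFFFFFF, 1)
```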

Differential Revision: https://reviews.llvm.org/D106983

(cherry picked from commit fa6b2c9915ba27e1e97f8901ea4aa877f331fb9f)
2021-08-02 13:52:48 -07:00
Sanjay Patel
a0686462c3 [AArch64][x86] add tests for add-with-overflow folds; NFC
There's a generic combine for these, but no test coverage.
It's not clear if this is actually a good fold.
The combine was added with D58874, but it has a bug that
can cause crashing ( https://llvm.org/PR51238 ).

(cherry picked from commit e427077ec10ea18ac21f5065342183481d87783a)
2021-08-02 13:52:42 -07:00
Sanjay Patel
b92c9f9565 [DivRemPairs] make sure we have a valid CFG for hoisting division
This transform was added with e38b7e894808ec2
and as shown in:
https://llvm.org/PR51241
...it could crash without an extra check of the blocks.

There might be a more compact way to write this constraint,
but we can't just count the successors/predecessors without
affecting a test that includes a switch instruction.

(cherry picked from commit 5b83261c1518a39636abe094123f1704bbfd972f)
2021-08-02 13:52:37 -07:00
Craig Topper
7c9c296915 [RISCV] Restrict performANY_EXTENDCombine to prevent an infinite loop.
The sign_extend we insert here can get turned into a zero_extend if
the sign bit is known zero. This can enable a setcc combine that
shrinks compares with zero_extend. This reduces the use count of
the zero_extend allowing other combines to turn it back into an
any_extend.

This restricts the combine to only cases where the result is used
by a CopyToReg. This works for my original motivating case. I
hope the CopyToReg use will prevent any converted extends from
turning back into an any_extend.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D106754

(cherry picked from commit 54588bcc052e5b08f90e672c33d0c1ad4eda2424)
2021-08-02 11:31:08 -07:00
Alexandros Lamprineas
276fcebbe0 [AArch64] Legalize MVT::i64x8 in DAG isel lowering
This patch legalizes the Machine Value Type introduced in D94096 for loads
and stores. A new target hook named getAsmOperandValueType() is added which
maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization.

Differential Revision: https://reviews.llvm.org/D94097
2021-08-02 15:45:58 +01:00
Alexandros Lamprineas
a50e569197 [AArch64] Add a Machine Value Type for 8 consecutive registers
Adds MVT::i64x8, a Machine Value Type needed for lowering inline assembly
operands which materialize a sequence of eight general purpose registers.

Differential Revision: https://reviews.llvm.org/D94096
2021-08-02 15:45:58 +01:00
Jeremy Morse
cd0096f439 [DebugInfo][InstrRef] Don't break up ret-sequences on debug-info instrs
When we have a terminator sequence (i.e. a tailcall or return),
MIIsInTerminatorSequence is used to work out where the preceding ABI-setup
instructions end, i.e. the parts that were glued to the terminator
instruction. This allows LLVM to split blocks safely without having to
worry about ABI stuff.

The function only ignores DBG_VALUE instructions, meaning that the two
debug instructions I recently added can end terminator sequences early,
causing various MachineVerifier errors. This patch promotes the test for
debug instructions from "isDebugValue" to "isDebugInstr", thus avoiding any
debug-info interfering with this function.
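
The effect of widening the predicate can be sketched with a toy model. Everything here is hypothetical: instructions are `(opcode, part_of_sequence)` tuples, `DBG_OTHER` stands in for any debug instruction that is not a `DBG_VALUE`, and `sequence_start` is an illustrative helper, not the real MIIsInTerminatorSequence logic.

```python
def sequence_start(block, is_ignorable):
    """Walk backwards from the terminator and return the index where the
    terminator sequence begins, looking past ignorable instructions."""
    i = len(block) - 1  # index of the terminator itself
    start = i
    while i > 0:
        opcode, part_of_sequence = block[i - 1]
        if is_ignorable(opcode):
            i -= 1  # skip it without ending the sequence
            continue
        if not part_of_sequence:
            break  # first unrelated instruction ends the sequence
        i -= 1
        start = i
    return start


def is_debug_value(opcode):
    return opcode == "DBG_VALUE"


def is_debug_instr(opcode):
    return opcode.startswith("DBG_")


block = [
    ("ADD", False),
    ("COPY", True),        # ABI setup glued to the return
    ("DBG_OTHER", False),  # hypothetical non-DBG_VALUE debug instruction
    ("COPY", True),        # ABI setup glued to the return
    ("RET", True),
]
old_start = sequence_start(block, is_debug_value)
new_start = sequence_start(block, is_debug_instr)
```

With the narrower predicate the walk stops at the debug instruction and misses the first `COPY`; with the wider one both copies stay attached to the return.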

Differential Revision: https://reviews.llvm.org/D106660

(cherry picked from commit 8612417e5a54cfef941ab45de55e48b4a0c4e8b4)
2021-07-29 15:08:13 +01:00
Bradley Smith
183b0c7c98 [AArch64][SVE] Fix incorrect mask type when lowering fixed type SVE gather/scatter
An incorrect mask type when lowering an SVE gather/scatter was causing
a codegen fault, which manifested as the incorrect predicate size being
used for an SVE gather/scatter (e.g. p0.b rather than p0.d).

Fixes PR51182.

Differential Revision: https://reviews.llvm.org/D106943

(cherry picked from commit 191831e380f317cd2baa5d48abe02d1d11cd44cb)
2021-07-29 07:03:40 -07:00
Diana Picus
923213f844 test-release.sh: Kill python2
Don't prefer python2's virtualenv when setting up the test-suite.
Always use python3 instead, since that's what we support everywhere else
anyway.

Differential Revision: https://reviews.llvm.org/D106941
2021-07-29 10:28:39 +02:00
Chris Jackson
9a10dd5b1c Revert "[DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR"
This was reverted due to a reported crash.
This reverts commit 796b84d26f4d461fb50e7b4e84e15a10eaca88fc.
2021-07-29 00:04:50 +01:00
Valentin Clement
c463fa6cad [mlir][openacc] Initial translation for DataOp to LLVM IR
Add basic translation of acc.data to LLVM IR with runtime calls.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104301
2021-07-27 22:04:04 -04:00
Jose M Monsalve Diaz
5b7208da36 [OpenMP] Folding threadLimit and numThreads when single value in kernels
The device runtime contains several calls to `__kmpc_get_hardware_num_threads_in_block`
and `__kmpc_get_hardware_num_blocks`. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.

In this patch we use the already introduced `AAFoldRuntimeCall` and the `NumTeams` and
`NumThreads` kernel attributes (to be introduced in a different patch) to fold these functions.
The code checks all the kernels, and if their attributes match, the functions are folded.

In the future we will explore specializing for multiple values of NumThreads and NumTeams.
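
The "fold only if all kernels agree" condition can be illustrated with a short sketch. It assumes kernels are plain dicts of attributes and uses a made-up helper name, `fold_if_uniform`; the actual `AAFoldRuntimeCall` logic is not shown in this message and may differ.

```python
def fold_if_uniform(kernels, attribute):
    """Return the shared constant if every kernel sets `attribute` to the
    same value; return None when the call cannot be folded."""
    values = {kernel.get(attribute) for kernel in kernels}
    if len(values) == 1:
        (value,) = values
        return value  # still None if no kernel sets the attribute
    return None


kernels = [
    {"NumThreads": 128, "NumTeams": 4},
    {"NumThreads": 128, "NumTeams": 8},
]
threads = fold_if_uniform(kernels, "NumThreads")
teams = fold_if_uniform(kernels, "NumTeams")
```

Here the thread count could be folded to 128, while the team count differs between kernels and is left alone.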

Depends on D106390

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D106033
2021-07-27 21:47:12 -04:00
George Burgess IV
66a78bc217 llvm/utils: guarantee revert_checker's revert ordering
At the moment, the revert ordering from this tool is unspecified (though
it happens to be in `git log` order, so newest reverts come first).

From the standpoint of tooling and users, this seems to be the opposite
of what we want by default: tools and users will generally try to apply
these reverts as cherry-picks. If two reverts in the list touch nearby
code and get applied out of order, we'll get a merge conflict.

Rather than having `reverse`s for all tools (and mental reverses for
manual users), just guarantee an oldest-first output ordering for this
function.
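
A minimal sketch of the ordering guarantee, assuming the reverts arrive newest first as in `git log`. The `Revert` tuple and `oldest_first` are illustrative names only and are not taken from the tool's source.

```python
from typing import List, NamedTuple


class Revert(NamedTuple):
    sha: str           # the commit that performs the revert
    reverted_sha: str  # the commit being reverted


def oldest_first(reverts_in_git_log_order: List[Revert]) -> List[Revert]:
    """Turn newest-first (`git log`) order into oldest-first order, so the
    reverts can be cherry-picked in the order they originally landed."""
    return list(reversed(reverts_in_git_log_order))


newest_first = [
    Revert("5d447ad589", "6ff73efea946"),
    Revert("96f821bd7b", "cbb709e25124"),
]
ordered = oldest_first(newest_first)
```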

Differential Revision: https://reviews.llvm.org/D106838
2021-07-28 00:51:05 +00:00
Juneyoung Lee
bab6d38daf [DAGCombiner] Fold SETCC(FREEZE(x),const) to FREEZE(SETCC(x,const)) if SETCC is used by BRCOND
This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))`
if the SETCC is only used by BRCOND.

Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105344
2021-07-28 09:22:15 +09:00
Juneyoung Lee
2e57fe1d7d Precommit test files for D105344 (NFC)
2021-07-28 09:21:55 +09:00
Xiang1 Zhang
a6d5003afd [X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT
Differential Revision: https://reviews.llvm.org/D106780
2021-07-28 08:16:59 +08:00
Johannes Doerfert
9b53e594b4 Reapply "[Attributor] Disable simplification AAs if a callback is present"
This reapplies commit cbb709e25124dc38ee593882051fc88c987fe591 and
includes the use of the lookup method instead of operator[] to avoid
accidentally setting (empty) simplification callbacks.
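
The pitfall is that a C++ map's `operator[]` default-constructs an entry for a missing key. The closest Python analogue is `collections.defaultdict`, used below purely as an analogy; the map name and keys are hypothetical and this is not the Attributor code.

```python
from collections import defaultdict

# Hypothetical stand-in for a map from IR positions to callback lists.
callbacks = defaultdict(list)

# Subscripting a missing key inserts an empty entry as a side effect.
_ = callbacks["position_a"]
inserted_by_subscript = "position_a" in callbacks

# .get() only reads; a missing key stays missing.
found = callbacks.get("position_b")
inserted_by_get = "position_b" in callbacks
```

Any later "does this position have a callback?" check that only tests for key presence would then be fooled by the empty entry created through subscripting.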

This reverts commit aa27430a625b2fd059707a87f8ba2df8f480ff11.
2021-07-27 19:14:50 -05:00
Xiang1 Zhang
5d447ad589 Revert "[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT"
This reverts commit 6ff73efea94621e74642e4d7a15cc86a5fb6d411.
2021-07-28 08:12:29 +08:00
Mehdi Amini
115164b68b Add llvm::equal, a convenience wrapper around std::equal for ranges
Differential Revision: https://reviews.llvm.org/D106913
2021-07-28 00:10:22 +00:00
Xiang1 Zhang
409f0eedd6 [X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT
2021-07-28 08:08:30 +08:00
Krzysztof Parzyszek
60850cdc6a [Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs
This fixes https://llvm.org/PR51229.
2021-07-27 18:36:28 -05:00
LLVM GN Syncbot
803630dbd1 [gn build] Port 8a48e6dda9f7
2021-07-27 23:10:20 +00:00
Johannes Doerfert
96f821bd7b Revert "[Attributor] Disable simplification AAs if a callback is present"
This reverts commit cbb709e25124dc38ee593882051fc88c987fe591 as it
breaks the tests, which was not supposed to happen. Investigating now.
2021-07-27 18:09:42 -05:00
Johannes Doerfert
0042081f58 [Attributor] Verify checkForAllUses return value properly
Also do not emit more than one remark after Heap2Stack failed.
2021-07-27 17:50:27 -05:00
Johannes Doerfert
0b6fb7b1ef [Attributor] Disable simplification AAs if a callback is present
AAValueSimplify, AAValueConstantRange, and AAPotentialValues all look at
the IR by default. If queried for an IR position which has a
simplification callback, we should either look at the callback's return
value or give up. We do the latter for now.
2021-07-27 17:50:26 -05:00
James Y Knight
9aacda281c Fix test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll.
It was writing to the source directory (which may not be writeable),
rather than using %t.

Fixes: a5dd6c6cf935 ("[LoopVectorize] Don't interleave scalar ordered reductions for inner loops")
2021-07-27 18:32:29 -04:00
Nico Weber
091e720912 [gn build] manually port 71909de37495
2021-07-27 18:23:28 -04:00
Mircea Trofin
506c974562 [MLGO] fix silly LLVM_DEBUG misuse
2021-07-27 15:10:28 -07:00
Mircea Trofin
f4b486c976 [NFC][MLGO] Debug messages for what inline advisor is selected
We already have an indication (error) if the desired inline advisor
cannot be enabled, but we don't have a positive indication. Added
LLVM_DEBUG messages for the latter.
2021-07-27 15:05:39 -07:00
Nemanja Ivanovic
8b3f85a32c [PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX
The dst/dstt/dstst/dststt instructions are nop's on all PowerPC
cores that AIX supports. The AIX assembler also does not accept
these mnemonics. Turn them into nop's on AIX (similar to dstall).
2021-07-27 15:50:02 -05:00
Sanjay Patel
11aa71a71d [x86] update stale code comment; NFC
The transform was generalized with:
1ce05ad619a5
2021-07-27 16:45:52 -04:00
Sanjay Patel
d571039392 [x86] add more tests for cmov and lea; NFC
2021-07-27 16:45:52 -04:00
Matt Arsenault
ece3299a71 AMDGPU/GlobalISel: Fix selecting G_SEXTLOAD/G_ZEXTLOAD pre-gfx9
The m0 glue patterns were failing to import.
2021-07-27 15:56:42 -04:00
Matt Arsenault
17f139f330 AMDGPU/GlobalISel: Fix wrong addrspace in test MMOs
2021-07-27 15:56:41 -04:00
Matt Arsenault
cfbd77a7f1 AMDGPU/GlobalISel: Add a few tests for unaligned truncating stores
2021-07-27 15:56:41 -04:00
Benjamin Kramer
c0d40a1ca3 Remove unused include that's also a layering violation. NFC.
2021-07-27 21:21:55 +02:00
Amara Emerson
0a69d4f31a Add test update for a11d9a1f480f which disables fallbacks.
2021-07-27 12:16:06 -07:00
Amara Emerson
6ce8f2f7c1 [AArch64][GlobalISel] Fix constraining LDXPX intrinsic selection.
The missing constraint causes a fallback because the vregs lack regclasses,
unless the build is without asserts, in which case we end up crashing later
in codegen.
2021-07-27 12:13:56 -07:00
Enna1
2a88f80da4 [ASAN] NFC: Remove redundant variable
`StackAlignment` has only one use: `StackAlignment = std::max(StackAlignment, AI.getAlignment());` So it is redundant.

Reviewed By: vitalybuka, MTC

Differential Revision: https://reviews.llvm.org/D106741
2021-07-27 12:02:37 -07:00
LLVM GN Syncbot
c563d2cbcd [gn build] Port 02077da7e7a8
2021-07-27 18:41:55 +00:00
Adam Nemet
d9613eb43c [Matrix] Fix shape for factored transpose
The shape of the input is C x R.

Differential Revision: https://reviews.llvm.org/D106722
2021-07-27 11:36:13 -07:00
Adam Nemet
add64be20a [Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo
As an instruction is replaced in optimizeTransposes, RAUW will replace it in
the ShapeMap (ShapeMap is a ValueMap so that uses are updated). In
finalizeLowering, however, we skip updating uses if they are in the ShapeMap,
since they will be lowered separately, at which point we pick up the lowered
operands.

In the testcase what happened was that since we replaced the doubled-transpose
with the shuffle, it ended up in the ShapeMap.  As we lowered the
columnwise-load the use in the shuffle was not updated.  Then as we removed
the original columnwise-load we changed that to an undef.  I.e. we ended up
with:

```
%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32>
                                   ^^^^^
                                  <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>
```

Besides the fix itself, I have fortified this last bit. As we change uses to
undef when removing an instruction, we track the undefed instructions to make
sure we eventually remove those too. This would have caught the issue at
compile time.

Differential Revision: https://reviews.llvm.org/D106714
2021-07-27 11:36:13 -07:00
Alexey Zhikhartsev
6516543c4b Add jump-threading optimization for deterministic finite automata
The current JumpThreading pass does not jump thread loops since it can
result in irreducible control flow that harms other optimizations. This
prevents switch statements inside a loop from being optimized to use
unconditional branches.

This code pattern occurs in the core_state_transition function of
Coremark. The state machine can be implemented manually with goto
statements resulting in a large runtime improvement, and this transform
makes the switch implementation match the goto version in performance.

This patch specifically targets switch statements inside a loop that
have the opportunity to be threaded. Once it identifies an opportunity,
it creates new paths that branch directly to the correct code block.
For example, the left CFG could be transformed to the right CFG:

```
          sw.bb                        sw.bb
        /   |   \                    /   |   \
   case1  case2  case3          case1  case2  case3
        \   |   /                /       |       \
        latch.bb             latch.2  latch.3  latch.1
         br sw.bb              /         |         \
                           sw.bb.2     sw.bb.3     sw.bb.1
                            br case2    br case3    br case1
```
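
The same idea can be shown at source level with a toy state machine. This is only a sketch: Python has no goto, so returning the next handler stands in for the unconditional branch, and the three-state cycle and function names are made up to mirror the diagram rather than taken from Coremark or the pass.

```python
def run_switch(steps):
    """Loop with a switch: every iteration re-dispatches on `state`."""
    state, trace = 1, []
    for _ in range(steps):
        if state == 1:
            trace.append("case1")
            state = 2
        elif state == 2:
            trace.append("case2")
            state = 3
        else:
            trace.append("case3")
            state = 1
    return trace


def run_threaded(steps):
    """Threaded form: each case names its successor directly, so no
    dispatch on a state variable is needed."""
    trace = []

    def case1():
        trace.append("case1")
        return case2

    def case2():
        trace.append("case2")
        return case3

    def case3():
        trace.append("case3")
        return case1

    handler = case1
    for _ in range(steps):
        handler = handler()
    return trace


switch_trace = run_switch(5)
threaded_trace = run_threaded(5)
```

Both versions visit the cases in the same order; the threaded one simply never consults a state variable, because the next state is a known constant on every path.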

Co-author: Justin Kreiner @jkreiner
Co-author: Ehsan Amiri @amehsan

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D99205
2021-07-27 14:34:04 -04:00