1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 10:32:48 +02:00
Commit Graph

219313 Commits

Author SHA1 Message Date
Juneyoung Lee
bab6d38daf [DAGCombiner] Fold SETCC(FREEZE(x),const) to FREEZE(SETCC(x,const)) if SETCC is used by BRCOND
This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))`
if the SETCC is only used by BRCOND.

Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105344
2021-07-28 09:22:15 +09:00
Juneyoung Lee
2e57fe1d7d Precommit test files for D105344 (NFC) 2021-07-28 09:21:55 +09:00
Xiang1 Zhang
a6d5003afd [X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT
Differential Revision: https://reviews.llvm.org/D106780
2021-07-28 08:16:59 +08:00
Johannes Doerfert
9b53e594b4 Reapply "[Attributor] Disable simplification AAs if a callback is present""
This reapplies commit cbb709e25124dc38ee593882051fc88c987fe591 and
includes the use of the lookup method instead of operator[] to avoid
accidentally setting (empty) simplification callbacks.

This reverts commit aa27430a625b2fd059707a87f8ba2df8f480ff11.
2021-07-27 19:14:50 -05:00
Xiang1 Zhang
5d447ad589 Revert "[X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT"
This reverts commit 6ff73efea94621e74642e4d7a15cc86a5fb6d411.
2021-07-28 08:12:29 +08:00
Mehdi Amini
115164b68b Add llvm::equal convenient wrapper for ranges around std::equal
Differential Revision: https://reviews.llvm.org/D106913
2021-07-28 00:10:22 +00:00
Xiang1 Zhang
409f0eedd6 [X86] Fix lowering to illegal type in LowerINSERT_VECTOR_ELT 2021-07-28 08:08:30 +08:00
Krzysztof Parzyszek
60850cdc6a [Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs
This fixes https://llvm.org/PR51229.
2021-07-27 18:36:28 -05:00
LLVM GN Syncbot
803630dbd1 [gn build] Port 8a48e6dda9f7 2021-07-27 23:10:20 +00:00
Johannes Doerfert
96f821bd7b Revert "[Attributor] Disable simplification AAs if a callback is present"
This reverts commit cbb709e25124dc38ee593882051fc88c987fe591 as it
breaks the tests, which was not supposed to happen. Investigating now.
2021-07-27 18:09:42 -05:00
Johannes Doerfert
0042081f58 [Attributor] Verify checkForAllUses return value properly
Also do not emit more than one remark after Heap2Stack failed.
2021-07-27 17:50:27 -05:00
Johannes Doerfert
0b6fb7b1ef [Attributor] Disable simplification AAs if a callback is present
AAValueSimplify, AAValueConstantRange, and AAPotentialValues all look at
the IR by default. If queried for a IR position which has a
simplification callback we should either look at the callback return, or
give up. We do the latter for now.
2021-07-27 17:50:26 -05:00
James Y Knight
9aacda281c Fix test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll.
It was writing to the source directory (which may not be writeable),
rather than using %t.

Fixes: a5dd6c6cf935 ("[LoopVectorize] Don't interleave scalar ordered reductions for inner loops")
2021-07-27 18:32:29 -04:00
Nico Weber
091e720912 [gn build] manually port 71909de37495 2021-07-27 18:23:28 -04:00
Mircea Trofin
506c974562 [MLGO] fix silly LLVM_DEBUG misuse 2021-07-27 15:10:28 -07:00
Mircea Trofin
f4b486c976 [NFC][MLGO] Debug messages for what inline advisor is selected
We already have an indication (error) if the desired inline advisor
cannot be enabled, but we don't have a positive indication. Added
LLVM_DEBUG messages for the latter.
2021-07-27 15:05:39 -07:00
Nemanja Ivanovic
8b3f85a32c [PowerPC] Turn deprecated altivec prefetch instrs to nops on AIX
The dst/dstt/dstst/dststt instructions are nop's on all PowerPC
cores that AIX supports. The AIX assembler also does not accept
these mnemonics. Turn them into nop's on AIX (similar to dstall).
2021-07-27 15:50:02 -05:00
Sanjay Patel
11aa71a71d [x86] update stale code comment; NFC
The transform was generalized with:
1ce05ad619a5
2021-07-27 16:45:52 -04:00
Sanjay Patel
d571039392 [x86] add more tests for cmov and lea; NFC 2021-07-27 16:45:52 -04:00
Matt Arsenault
ece3299a71 AMDGPU/GlobalISel: Fix selecting G_SEXTLOAD/G_ZEXTLOAD pre-gfx9
The patterns for the m0 glue patterns were failing to import.
2021-07-27 15:56:42 -04:00
Matt Arsenault
17f139f330 AMDGPU/GlobalISel: Fix wrong addrspace in test MMOs 2021-07-27 15:56:41 -04:00
Matt Arsenault
cfbd77a7f1 AMDGPU/GlobalISel: Add a few tests for unaligned truncating stores 2021-07-27 15:56:41 -04:00
Benjamin Kramer
c0d40a1ca3 Remove unused include that's also a layering violation. NFC. 2021-07-27 21:21:55 +02:00
Amara Emerson
0a69d4f31a Add test update for a11d9a1f480f which disables fallbacks. 2021-07-27 12:16:06 -07:00
Amara Emerson
6ce8f2f7c1 [AArch64][GlobalISel] Fix constraining LDXPX intrinsic selection.
Causes a fallback because of lack of regclasses on vregs, unless its without
asserts, where we end up crashing later in codegen.
2021-07-27 12:13:56 -07:00
Enna1
2a88f80da4 [ASAN] NFC: Remove redundant variable
`StackAlignment` has only one use: `StackAlignment = std::max(StackAlignment, AI.getAlignment());` So it is redundant.

Reviewed By: vitalybuka, MTC

Differential Revision: https://reviews.llvm.org/D106741
2021-07-27 12:02:37 -07:00
LLVM GN Syncbot
c563d2cbcd [gn build] Port 02077da7e7a8 2021-07-27 18:41:55 +00:00
Adam Nemet
d9613eb43c [Matrix] Fix shape for factored transpose
The shape of the input is C x R.

Differential Revision: https://reviews.llvm.org/D106722
2021-07-27 11:36:13 -07:00
Adam Nemet
add64be20a [Matrix] RAUW should only replace an instruction in ShapeMap if supportsShapeInfo
As an instruction is replaced in optimizeTransposes RAUW will replace it in
the ShapeMap (ShapeMap is ValueMap so that uses are updated).  In
finalizeLowering however we skip updating uses if they are in the ShapeMap
since they will be lowered separately at which point we pick up the lowered
operands.

In the testcase what happened was that since we replaced the doubled-transpose
with the shuffle, it ended up in the ShapeMap.  As we lowered the
columnwise-load the use in the shuffle was not updated.  Then as we removed
the original columnwise-load we changed that to an undef.  I.e. we ended up
with:

```
%shuf = shufflevector <8 x double> undef, <8 x double> poison, <6 x i32>
                                   ^^^^^
                                  <i32 0, i32 1, i32 2, i32 4, i32 5, i32 6>
```

Besides the fix itself, I have fortified this last bit.  As we change uses to
undef when removing instruction we track the undefed instruction to make sure
we eventually remove those too.  This would have caught the issue at compile
time.

Differential Revision: https://reviews.llvm.org/D106714
2021-07-27 11:36:13 -07:00
Alexey Zhikhartsev
6516543c4b Add jump-threading optimization for deterministic finite automata
The current JumpThreading pass does not jump thread loops since it can
result in irreducible control flow that harms other optimizations. This
prevents switch statements inside a loop from being optimized to use
unconditional branches.

This code pattern occurs in the core_state_transition function of
Coremark. The state machine can be implemented manually with goto
statements resulting in a large runtime improvement, and this transform
makes the switch implementation match the goto version in performance.

This patch specifically targets switch statements inside a loop that
have the opportunity to be threaded. Once it identifies an opportunity,
it creates new paths that branch directly to the correct code block.
For example, the left CFG could be transformed to the right CFG:

```
          sw.bb                        sw.bb
        /   |   \                    /   |   \
   case1  case2  case3          case1  case2  case3
        \   |   /                /       |       \
        latch.bb             latch.2  latch.3  latch.1
         br sw.bb              /         |         \
                           sw.bb.2     sw.bb.3     sw.bb.1
                            br case2    br case3    br case1
```

Co-author: Justin Kreiner @jkreiner
Co-author: Ehsan Amiri @amehsan

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D99205
2021-07-27 14:34:04 -04:00
Craig Topper
6bfc6b8665 [RISCV] Select vector shl by 1 to a vector add.
A vector add may be faster than a vector shift.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D106689
2021-07-27 10:57:28 -07:00
David Green
0f86a54b34 [AArch64] Update and expand min-max cost model test. NFC
This expands the cost model test for min/max to many more types,
including floating point minnum/maxnum and minimum/maximum, and FP16
with and without fullfp16.  The old llc run lines are removed, as those
are better tested by CodeGen tests.
2021-07-27 18:48:58 +01:00
Andy Kaylor
fd18a762a0 Enabling the copy-constant-to-alloca optimization in more instances
Patch by Mohammad Fawaz

This patch allows lifetime calls to be ignored (and later erased) if we
know that the copy-constant-to-alloca optimization is going to happen.
The case that is missed is when the global variable is in a different address
space than the alloca (as shown in the example added to the lit test.)

This used to work before 6da31fa4a6

Differential Revision: https://reviews.llvm.org/D106573
2021-07-27 10:11:43 -07:00
David Sherwood
cdd50ed2ff [LoopVectorize] Don't interleave scalar ordered reductions for inner loops
Consider the following loop:

  void foo(float *dst, float *src, int N) {
    for (int i = 0; i < N; i++) {
      dst[i] = 0.0;
      for (int j = 0; j < N; j++) {
        dst[i] += src[(i * N) + j];
      }
    }
  }

When we are not building with -Ofast we may attempt to vectorise the
inner loop using ordered reductions instead. In addition we also try
to select an appropriate interleave count for the inner loop. However,
when choosing a VF=1 the inner loop will be scalar and there is existing
code in selectInterleaveCount that limits the interleave count to 2
for reductions due to concerns about increasing the critical path.
For ordered reductions this problem is even worse due to the additional
data dependency, and so I've added code to simply disable interleaving
for scalar ordered reductions for now.

Test added here:

  Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll

Differential Revision: https://reviews.llvm.org/D106646
2021-07-27 17:41:01 +01:00
Anna Thomas
3ad8a0b712 Update reduction test. Remove standalone test file
Based on post commit review comments at 68ffed12b.
2021-07-27 12:35:04 -04:00
Matt Arsenault
3b4453b968 AMDGPU: Update tests for lower i1 change
I forgot to squash the test updates for b32d3d9e81cdd9275d19cd2a396c461edc9e7189
2021-07-27 12:14:58 -04:00
Matt Arsenault
8979bda8e8 AMDGPU: Treat IMPLICIT_DEF like a constant lanemask source
This is partially a workaround. SILowerI1Copies does not understand
unstructured loops. This would result in inserting instructions to
merge a mask register in the same block where it was defined in an
unstructured loop.
2021-07-27 11:44:38 -04:00
Thomas Lively
bb4e957ebf [WebAssembly] Codegen for extmul SIMD instructions
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.

Differential Revision: https://reviews.llvm.org/D106724
2021-07-27 08:41:30 -07:00
Anirudh Prasad
2b967460b4 [SystemZ][z/OS] Initial code to generate assembly files on z/OS
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D106380
2021-07-27 11:29:15 -04:00
Anna Thomas
23f3f90569 Strip undef implying attributes when moving calls
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.

This patch introduces such an API and uses it in relevant passes. See
updated tests.

Fix for PR50744.

Reviewed By: nikic, jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104641
2021-07-27 10:57:05 -04:00
Tres Popp
05308b27ea Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI."
This reverts commit 1cfecf4fc4278afb0005923f6dff595cd372da5c.

This commit broke LLVM code generated through XLA by removing a
conditional on Ld->getExtensionType() == ISD::NON_EXTLOAD

This is not a perfect revert. The new function is left as other uses of
it exist now.
2021-07-27 16:55:50 +02:00
Tres Popp
d809395fde Revert "Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI.""
This reverts commit d7bbb1230a94cb239aa4a8cb896c45571444675d.

There were follow up uses of a deleted method and I didn't run the
tests. Undo the revert, so I can do it properly.
2021-07-27 16:48:31 +02:00
Tres Popp
9d32182a3a Revert "[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI."
This reverts commit 1cfecf4fc4278afb0005923f6dff595cd372da5c.

This commit broke LLVM code generated through XLA by removing a
conditional on Ld->getExtensionType() == ISD::NON_EXTLOAD
2021-07-27 16:22:25 +02:00
Jeremy Morse
adf33c8470 [DebugInfo][InstrRef] Correctly update DBG_PHIs during instr scheduling
Avoid several crashes when DBG_INSTR_REF and DBG_PHI instructions are fed
to the instruction scheduler. DBG_INSTR_REFs should be treated like
DBG_LABELs, and just ignored for the purpose of scheduling [0].

DBG_PHIs however behave much more like DBG_VALUEs: they refer to register
operands, and if some register defs get shuffled around during instruction
scheduling, there's a risk that the debug instr will refer to the wrong
value. There's already a facility for updating DBG_VALUEs to reflect this;
add DBG_PHI to the list of instructions that it will update.

[0] Suboptimal, but it's what instr scheduling does right now.

Differential Revision: https://reviews.llvm.org/D106663
2021-07-27 15:12:46 +01:00
Tres Popp
64d9d45951 Handle unused variable when assertions are disabled 2021-07-27 15:43:06 +02:00
Anna Thomas
b8af1d6561 [IVDescriptors] Fix bug in checkOrderedReduction
The Exit instruction passed in for checking if it's an ordered reduction need not be
an FPAdd operation. We need to bail out at that point instead of
assuming it is an FPAdd (and hence has two operands). See added testcase.
It crashes without the patch because the Exit instruction is a phi with
exactly one operand.
This latent bug was exposed by 95346ba which added support for
multi-exit loops for vectorization.

Reviewed-By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D106843
2021-07-27 09:31:44 -04:00
Chris Jackson
fe743a5b25 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reapplies commit 76f3ffb2b285998f02639db8fd42fb0de8a540d0 that was
reverted due to buildbot failures.

- Update lit tests with REQUIRES condition.
- Abandon salvage attempt if SCEVUnknown::getValue() returns nullptr.

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 14:22:09 +01:00
Jeremy Morse
47a51741f3 [DebugInfo][InstrRef] Handle llvm.frameaddress intrinsics gracefully
When working out which instruction defines a value, the
instruction-referencing variable location code has a few special cases for
physical registers:
 * Arguments are never defined by instructions,
 * Constant physical registers always read the same value, are never def'd

This patch adds a third case for the llvm.frameaddress intrinsics: you can
read the framepointer in any block if you so choose, and use it as a
variable location, as shown in the added test.

This rather violates one of the assumptions behind instruction referencing,
that LLVM-ir shouldn't be able to read from an arbitrary register at some
arbitrary point in the program. The solution for now is to just emit a
DBG_PHI that reads the register value: this works, but if we wanted to do
something clever with DBG_PHIs in the future then this would probably get
in the way. As it stands, this patch avoids a crash.

Differential Revision: https://reviews.llvm.org/D106659
2021-07-27 13:44:37 +01:00
Chris Jackson
e13e00f7b3 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reverts commit 76f3ffb2b285998f02639db8fd42fb0de8a540d0 because
of a failure on sanitixer-X86-64-linux-autoconf.
2021-07-27 13:36:56 +01:00
Chris Jackson
2e863e863d [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This patch extends salvaging of debuginfo in the Loop Strength Reduction
(LSR) pass by translating Scalar Evaluations (SCEV) into DIExpressions.
The method is as follows:
- Cache dbg.value intrinsics that are salvageable.
- Obtain a loop Induction Variable (IV) from ScalarExpressionExpander or
  the loop header.
- Translate the IV SCEV into an expression that recovers the current
  loop iteration count. Combine this with the dbg.value's location
  op SCEV to create a DIExpression that salvages the value.

Review by: jmorse

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 13:00:36 +01:00