mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

136854 Commits

Author SHA1 Message Date
Nekotekina
5836324d64 X86: fixup (V)PMADDWD detection
Fix some bugs (missing checks).
Add constant support.
2021-05-17 08:18:22 +03:00
Nekotekina
a7dd06b0f0 X86: improve (V)PMADDWD detection
In function combineMulToPMADDWD, if 17 bits are sign bits,
not just zero bits, the optimization can be applied sometimes.
For now, detect and replace SRA pairs with SRL.
2021-05-13 20:20:14 +03:00
Nekotekina
8ed5423cd2 X86: modify PreserveAll CC to save full AVX-512 state 2021-05-13 11:24:49 +03:00
xddxd
4d88105d4a More C++20 fixes 2021-04-18 16:27:47 +03:00
xddxd
5d8643e8eb C++20 fixes 2021-03-28 16:25:43 +03:00
Ani
d74d689f19 Host: Add workaround for Zen3
Treat Zen3 as Zen2 until upstream adds Zen3 support
2020-11-06 08:16:10 +00:00
Nekotekina
ec657b923d X86: avoid vector-scalar shifts if splat amount is directly a vector ADD/SUB/AND op.
Prefer vector-vector shifts if available (AVX2+).
Improves code generated for rotate and funnel shifts.
Otherwise it would generate a shuffle + slower vector-scalar shift.
2020-11-02 16:33:21 +03:00
Nekotekina
9c0f762155 X86: add patterns for X86ISD::VSHLV and X86ISD::VSRLV
Replace VSELECT instructions that zero the result when the shift amount exceeds the legal SHL/SRL range.
2020-11-02 16:33:21 +03:00
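The VSHLV/VSRLV commit above relies on the fact that the AVX2 variable shifts already return zero for out-of-range counts, which makes a guarding select redundant. A minimal scalar sketch of that equivalence for one 32-bit lane follows; the function names are hypothetical and this is not the pattern code from the commit.

```cpp
#include <cstdint>

// Scalar model of one 32-bit lane of a variable vector shift left
// (VPSLLVD-style): a count of 32 or more yields zero.
inline uint32_t lane_shl_variable(uint32_t x, uint32_t n) {
    return n < 32 ? x << n : 0u;
}

// The guarded form a frontend might emit: shift, then select zero
// when the count is out of range. Masking keeps the C++ shift defined.
inline uint32_t select_guarded_shl(uint32_t x, uint32_t n) {
    uint32_t shifted = x << (n & 31u);
    return n > 31u ? 0u : shifted;
}
```

Both functions agree for every count, so the select can be folded into the single variable shift.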
Nekotekina
3223efa093 X86: add pattern for X86ISD::VSRAV
Detect clamping ashr shift amount to max legal value
2020-11-02 16:33:21 +03:00
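For the VSRAV commit above, the idea is that an arithmetic shift right whose amount is clamped to the maximum legal value behaves like the hardware variable arithmetic shift, which fills the lane with the sign bit for larger counts. A small scalar sketch, with hypothetical names and assuming `>>` on a negative `int32_t` is an arithmetic shift (guaranteed since C++20, typical before):

```cpp
#include <algorithm>
#include <cstdint>

// Source-level shape: ashr with the amount clamped to 31.
inline int32_t clamped_ashr(int32_t x, uint32_t n) {
    return x >> std::min<uint32_t>(n, 31u);
}

// Scalar model of one VPSRAVD-style lane: counts above 31 fill the
// lane with copies of the sign bit.
inline int32_t lane_sra_variable(int32_t x, uint32_t n) {
    if (n > 31u) return x < 0 ? -1 : 0;
    return x >> n;
}
```

Because the two agree for all counts, the clamp does not need to be materialized as a separate instruction.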
Nekotekina
b7587d5ebe X86: expand detectAVGPattern()
Allow all integer widths in the pattern, allow ashr
Handle signed and mixed cases, allowing the truncation to be replaced
2020-11-02 16:33:21 +03:00
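As background for the detectAVGPattern() commit above, the basic unsigned form of the averaging pattern is a rounding average computed in a wider type and truncated back. The sketch below shows only that base case for 8-bit lanes; the function name is hypothetical, and the signed and mixed variants the commit mentions are not modeled.

```cpp
#include <cstdint>

// Rounding average of two unsigned bytes, computed in a wider type so
// the intermediate sum cannot overflow, then truncated back to 8 bits.
inline uint8_t avg_round_up_u8(uint8_t a, uint8_t b) {
    uint32_t wide = uint32_t(a) + uint32_t(b) + 1u;
    return uint8_t(wide >> 1);
}
```

This add, add one, shift right by one, truncate sequence is the shape that maps onto a single PAVGB-style instruction per lane.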
Nekotekina
883d6df927 X86: optimize VSELECT for v16i8 with shl + sign bit test 2020-11-02 16:33:21 +03:00
Nekotekina
999da05669 X86: LowerShift: new algorithm for vector-vector shifts
Emit pair of shifts of double size if possible
2020-11-02 16:33:21 +03:00
Nekotekina
e9d729dcca X86: add RTM to Haswell+ features 2020-11-02 02:50:53 +03:00
Nekotekina
c9f0684580 Disable GDBRegistrationListener
It makes emitting objects extremely slow.
GDB doesn't work properly with it anyway.
GDB also often crashes because it cannot read the format.
2020-11-02 02:50:53 +03:00
Nekotekina
5a08b9c29b MCJIT: don't finalize modules on symbol lookup (workaround)
This is extremely slow yet unnecessary with manual finalization.
In LLVM 6 this wasn't a problem.
2020-11-02 02:50:53 +03:00
Nekotekina
391548e226 X86: Fix/workaround Small Code Model for JIT
Force RIP-relative jump tables and global values
Force RIP-relative all zeros / all ones constants
These things were causing crashes due to use of absolute addressing
2020-11-02 02:28:12 +03:00
Bill Wendling
1739071165 [CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR
Tail duplication of a block with an INLINEASM_BR may result in a PHI
node on the indirect branch. This is okay, but it also introduces a copy
for that PHI node *after* the INLINEASM_BR, which is not okay.

See: https://github.com/ClangBuiltLinux/linux/issues/1125

Differential Revision: https://reviews.llvm.org/D88823

(cherry picked from commit d2c61d2bf9bd1efad49acba2f2751112522686aa)
2020-10-07 12:10:48 +02:00
Qiu Chaofan
ce106590bf [SelectionDAG] Don't remove unused negated constant immediately
This partially reverts a2fb5446 (actually, 2508ef01), which removed a
negated FP constant immediately if it had no uses. However, as discussed
in bug 47517, there are cases where NegX is folded into a constant from
elsewhere while NegY is removed by that line of code, and NegX is
equal to NegY. In these cases, NegX is deleted before it is used and a
crash happens. So revert the code and add the necessary test case.

(cherry picked from commit b326d4ff946d2061a566a3fcce9f33b484759fe0)
2020-10-06 16:04:19 +02:00
Sanjay Patel
eaf26356d1 [APFloat] prevent NaN morphing into Inf on conversion (PR43907)
We shift the significand right on a truncation, but that needs to be made NaN-safe:
always set at least 1 bit in the significand.
https://llvm.org/PR43907

See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed).

Differential Revision: https://reviews.llvm.org/D87835

(cherry picked from commit e34bd1e0b03d20a506ada156d87e1b3a96d82fa2)
2020-09-30 13:28:43 +02:00
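The APFloat commit above can be illustrated at the bit level. Narrowing a binary64 NaN to binary32 by shifting its 52-bit significand right by 29 bits can shift out every set bit, leaving an all-ones exponent with a zero significand, which encodes infinity. The sketch below is an illustration under assumptions, not the APFloat code: the helper names are hypothetical, and the choice of which bit to set in the NaN-safe path (the quiet bit here) is an assumption.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 binary32 assumed");

// Narrow the bits of a binary64 NaN to binary32 bits by shifting the
// significand right. With nan_safe == false this reproduces the bug.
inline uint32_t narrow_nan_bits(uint64_t dbits, bool nan_safe) {
    uint64_t sig52 = dbits & 0x000FFFFFFFFFFFFFull;
    uint32_t sig23 = uint32_t(sig52 >> 29);
    if (nan_safe && sig23 == 0) {
        sig23 = 1u << 22;  // assumption: set the quiet bit to stay a NaN
    }
    uint32_t sign = uint32_t(dbits >> 63) << 31;
    return sign | 0x7F800000u | sig23;
}

// Reinterpret binary32 bits as a float without undefined behavior.
inline float float_from_bits(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

For the input `0x7FF0000000000001` (a NaN whose only payload bit is the lowest one), the unsafe path produces infinity while the NaN-safe path stays a NaN.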
Amara Emerson
b8f4c2339b [GlobalISel] Fix multiply with overflow intrinsics legalization generating invalid MIR.
During lowering of G_UMULO and friends, the previous code moved the builder's
insertion point to be after the legalizing instruction. When that happened, if
there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder
would try to find that constant during the buildConstant(zero) call, and since
it dominates itself would return the iterator unchanged, even though the def
of the constant was *after* the current insertion point. This resulted in the
compare being generated *before* the constant which it was using.

There's no need to modify the insertion point before building the mul-hi or
constant. Delaying moving the insert point ensures those are built/CSEd before
the G_ICMP is built.

Fixes PR47679

Differential Revision: https://reviews.llvm.org/D88514

(cherry picked from commit 1d54e75cf26a4c60b66659d5d9c62f4bb9452b03)
2020-09-30 13:05:48 +02:00
Robert Widmann
8a3d6aad2f [LLVM-C] Turn a ShuffleVector Constant Into a Getter.
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter.

Differential Revision: https://reviews.llvm.org/D88367

(cherry picked from commit 55f727306e727ea9f013d09c9b8aa70dbce6a1bd)
2020-09-28 12:36:28 +02:00
Craig Disselkoen
810086af1d C API: functions to get mask of a ShuffleVector
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM
C API.

Previously, commit 1ee6ec2bf removed the mask operand from the
ShuffleVector instruction, storing the mask data separately in the
instruction instead; this reduced the number of operands of
ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused
a regression in the LLVM C API. Specifically, it is no longer possible
to get the mask of a ShuffleVector instruction through the C API. This
patch introduces new functions which together allow a C API user to get
the mask of a ShuffleVector instruction, restoring the functionality
which was previously available through LLVMGetOperand().

This patch also adds tests for this change to the llvm-c-test
executable, which involved adding support for InsertElement,
ExtractElement, and ShuffleVector itself (as well as constant vectors)
to echo.cpp. Previously, vector operations weren't tested at all in
echo.ll.

I also fixed some typos in comments and help-text nearby these changes,
which I happened to spot while developing this patch. Since the typo
fixes are technically unrelated other than being in the same files, I'm
happy to take them out if you'd rather they not be included in the patch.

Differential Revision: https://reviews.llvm.org/D88190

(cherry picked from commit 51cad041e0cb26597c7ccc0fbfaa349b8fffbcda)
2020-09-28 12:36:16 +02:00
Simon Atanasyan
172c27d8f3 [CodeGen] Do not call emitGlobalConstantLargeInt for constants requiring 8 bytes to store
This is a fix for PR47630. The regression was caused by D78011. After
that change, the code started to call `emitGlobalConstantLargeInt` even
for constants which require eight bytes to store.

Differential revision: https://reviews.llvm.org/D88261

(cherry picked from commit c6c5629f2fb4ddabd376fbe7c218733283e91d09)
2020-09-28 11:52:59 +02:00
Matt Arsenault
8cd42ca0d2 AArch64/GlobalISel: Narrow stack passed argument access size
This fixes a verifier error in the testcase from bug 47619.

The stack passed s3 value was widened to 4 bytes, producing a
4-byte memory access with a < 1 byte result type. We need to either
widen the result type or narrow the access size. This copies the code
directly from the AMDGPU handling, which narrows the load size. I
don't like that every target has to handle this, but this is currently
broken on the 11 release branch and this is the simplest fix.

This reverts commit 42bfa7c63b85e76fe16521d1671afcafaf8f64ed.

(cherry picked from commit 6cb0d23f2ea6fb25106b0380797ccbc2141d71e1)
2020-09-28 11:43:55 +02:00
Matt Arsenault
2d82761ec3 AArch64/GlobalISel: Reduced patch for bug 47619
These are the relevant portions of an assert fixed by
b98f902f1877c3d679f77645a267edc89ffcd5d6.
2020-09-25 13:16:13 +02:00
Lucas Prates
8fcafdf787 [CodeGen] Fixing inconsistent ABI mangling of values in SelectionDAGBuilder
SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type conversions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
in spite of, the regular Calling Convention Lowering.

The issue could be observed in a scenario such as:

```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper conversion had taken place when lowering the return
; value of the first call.
```

This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contained in the Calling Convention Lowering.

This fixes Bugzilla #47454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87844

(cherry picked from commit 53d238a961d14eae46f6f2b296ce48026c7bd0a1)
2020-09-22 11:45:22 +02:00
James Y Knight
5f31397e39 PR47468: Fix findPHICopyInsertPoint, so that copies aren't incorrectly inserted after an INLINEASM_BR.
findPHICopyInsertPoint special cases placement in a block with a
callbr or invoke in it. In that case, we must ensure that the copy is
placed before the INLINEASM_BR or call instruction, if the register is
defined prior to that instruction, because it may jump out of the
block.

Previously, the code placed it immediately after the last def _or
use_. This is wrong if the use is the instruction which may jump. We
could correctly place it immediately after the last def (ignoring
uses), but that is non-optimal for register pressure.

Instead, place the copy after the last def, or before the
call/inlineasm_br, whichever is later.

Differential Revision: https://reviews.llvm.org/D87865

(cherry picked from commit f7a53d82c0902147909f28a9295a9d00b4b27d38)
2020-09-22 11:36:19 +02:00
Qiu Chaofan
b157b648ba [SelectionDAG] Check any use of negation result before removal
2508ef01 fixed a bug about constant removal in negation. But after a
sanitizer check I found there was still an issue with it, so it was
reverted.

Temporary nodes are removed in negation if they are useless. Before the
removal, they are checked for uses by any other nodes, so the removal
was moved after getNode. However, in rare cases the node to be removed is
the same as the result of getNode. We missed that case; this patch
fixes it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D87614

(cherry picked from commit a2fb5446be960ad164060b3c05fc268f7f72d67a)
2020-09-17 13:37:11 +02:00
Ben Dunbobbin
d8484b56e4 [X86][ELF] Prefer lowering MC_GlobalAddress operands to .Lfoo$local for STV_DEFAULT only
This patch restricts the behaviour of referencing via .Lfoo$local
local aliases, introduced in https://reviews.llvm.org/D73230, to
STV_DEFAULT globals only.

Hidden symbols via -fvisibility=hidden (https://gcc.gnu.org/wiki/Visibility)
are an important scenario.

Benefits:

- Improves the size of object files by using fewer STT_SECTION symbols.

- The code reads a bit better (it was not obvious to me without going
  back to the code reviews why the canBenefitFromLocalAlias function
  currently doesn't consider visibility).

- There is also a side benefit in restoring the effectiveness of the
  --wrap linker option and making the behavior of --wrap consistent
  between LTO and normal builds for references within a translation-unit.
  Note: this --wrap behavior (which is specific to LLD) should not be
  considered reliable. See comments on https://reviews.llvm.org/D73230
  for more.

Differential Revision: https://reviews.llvm.org/D85782

(cherry picked from commit 4cb016cd2d8467c572b2e5c5d34f376ee79e4ac1)
2020-09-17 13:23:29 +02:00
Hans Wennborg
068754a1d5 Revert "RegAllocFast: Record internal state based on register units"
This seems to have caused incorrect register allocation in some cases,
breaking tests in the Zig standard library (PR47278).

As discussed on the bug, revert back to green for now.

> Record internal state based on register units. This is often more
> efficient as there are typically fewer register units to update
> compared to iterating over all the aliases of a register.
>
> Original patch by Matthias Braun, but I've been rebasing and fixing it
> for almost 2 years and fixed a few bugs causing intermediate failures
> to make this patch independent of the changes in
> https://reviews.llvm.org/D52010.

This reverts commit 66251f7e1de79a7c1620659b7f58352b8c8e892e, and
follow-ups 931a68f26b9a3de853807ffad7b2cd0a2dd30922
and 0671a4c5087d40450603d9d26cf239f1a8b1367e. It also adjusts some
test expectations.

(cherry picked from commit a21387c65470417c58021f8d3194a4510bb64f46)
2020-09-15 19:12:48 +02:00
Qiu Chaofan
4daf36af28 Revert "[SelectionDAG] Remove unused FP constant in getNegatedExpression"
2508ef01 doesn't totally fix the issue since we did not handle the case
where the unused temporary negated result is the same as the result,
which was found by address sanitizer.

(cherry picked from commit e1669843f2aaf1e4929afdd8f125c14536d27664)
2020-09-15 16:57:03 +02:00
Hans Wennborg
d3f2114698 Revert "Double check that passes correctly set their Modified status"
This check fires during self-host.

> The approach is simple: if a pass reports that it's not modifying a
> Function/Module, compute a loose hash of that Function/Module and compare it
> with the original one. If we report no change but there's a hash change, then we
> have an error.
>
> This approach misses a lot of change but it's not super intrusive and can
> detect most of the simple mistakes.
>
> Differential Revision: https://reviews.llvm.org/D80916

This reverts commit 3667d87a33d3c8d4072a41fd84bb880c59347dc0.
2020-09-15 16:47:29 +02:00
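The check described in the quoted message of the reverted commit above can be sketched in a few lines. This is a toy model, not the LLVM implementation: `Unit` stands in for a Function/Module, the pass is any callable that returns whether it changed the unit, and the "loose hash" is deliberately crude (the body length), which also shows why such a check misses many changes.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a Function/Module: just its textual body.
struct Unit {
    std::string body;
};

// Deliberately loose hash: only the length of the body. Edits that keep
// the length unchanged go undetected.
inline std::size_t loose_hash(const Unit& u) {
    return u.body.size();
}

// Returns false when the pass reports "no change" but the hash moved.
inline bool report_is_consistent(Unit& u,
                                 const std::function<bool(Unit&)>& pass) {
    std::size_t before = loose_hash(u);
    bool reported_change = pass(u);
    std::size_t after = loose_hash(u);
    return reported_change || before == after;
}
```

A pass that appends to the body while returning `false` is flagged; a pass that reports `true` is never flagged, whatever it did.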
Craig Topper
7b6b353218 [FastISel] Bail out of selectGetElementPtr for vector GEPs.
The code that decomposes the GEP into ADD/MUL doesn't work properly
for vector GEPs. It can create bad COPY instructions or possibly
assert.

For now just bail out to SelectionDAG.

Fixes PR45906

(cherry picked from commit 4208ea3e19f8e3e8cd35e6f5a6c43f4aa066c6ec)
2020-09-15 13:55:38 +02:00
Qiu Chaofan
94913439b2 [SelectionDAG] Remove unused FP constant in getNegatedExpression
960cbc53 immediately removes nodes that won't be used to avoid
compilation time explosion. This patch adds the removal to constants to
fix PR47517.

Reviewed By: RKSimon, steven.zhang

Differential Revision: https://reviews.llvm.org/D87614

(cherry picked from commit 2508ef014e8b01006de4e5ee6fd451d1f68d550f)
2020-09-15 13:42:41 +02:00
Nikita Popov
6745ba46ae Fix incorrect SimplifyWithOpReplaced transform (PR47322)
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.

The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:

 * Export the SimplifyWithOpReplaced API from InstSimplify, with an
   added AllowRefinement flag. For replacements inside the TrueVal
   we don't actually care whether refinement occurs or not, the
   replacement is always legal. This part of the transform is now
   done in InstSimplify only. (It should be noted that the current
   AllowRefinement check is not sufficient -- that's an issue we
   need to address separately.)
 * Change the InstCombine fold to work by temporarily dropping
   poison generating flags, running the fold and then restoring the
   flags if it didn't work out. This will ensure that the InstCombine
   fold is correct as long as the InstSimplify fold is correct.

Differential Revision: https://reviews.llvm.org/D87445
2020-09-15 10:21:08 +02:00
Nikita Popov
c5c1bd4167 Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
2020-09-15 10:21:08 +02:00
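The canonicalization in the NFC commit above rests on a simple identity: a select on `icmp ne` equals a select on `icmp eq` with its two arms swapped, so every fold only has to be written for the `eq` form. A scalar sketch with hypothetical function names:

```cpp
// select (a != b), t, f
inline int select_ne(int a, int b, int t, int f) {
    return a != b ? t : f;
}

// Canonical form: select (a == b), f, t (arms swapped).
inline int select_ne_canonical(int a, int b, int t, int f) {
    return a == b ? f : t;
}
```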
dfukalov
f527c84080 [AMDGPU] Fix for folding v2.16 literals.
It was found that some packed immediate operands (e.g. `<half 1.0, half 2.0>`) were
incorrectly processed, so one of the two packed values was lost.

Introduced a new function to check whether an immediate 32-bit operand can be folded.
Converted the condition on the current op_sel flags value to a fall-through.

Fixes: SWDEV-247595

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D87158

(cherry picked from commit d03c4034dc80c944ec4a5833ba8f87d60183f866)
2020-09-14 15:18:44 +02:00
Alok Kumar Sharma
85e75ebd5c [DebugInfo] Fixing CodeView assert related to lowerBound field of DISubrange.
This fixes the CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287
that appeared after the DISubrange upgrade in D80197.

The assert condition is now removed, and Count is calculated when LowerBound
is absent or zero and Count or UpperBound is constant. If Count is unknown,
it is later handled as a VLA (currently Count is set to zero).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D87406

(cherry picked from commit e45b0708ae81ace27de53f12b32a80601cb12bf3)
2020-09-11 11:41:34 +02:00
Brad Smith
f540d300d8 [PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D86165

(cherry picked from commit 88b368a1c47bca536f03041f7464235b94ea98a1)
2020-09-09 10:03:23 +02:00
Craig Topper
6a6cc0be60 [X86] SSE4_A should only imply SSE3 not SSSE3 in the frontend.
SSE4_1 and SSE4_2 do imply SSSE3. So I guess I got confused when
switching the code to being table based in D83273.

Fixes PR47464

(cherry picked from commit e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac)
2020-09-08 20:55:52 +02:00
Serge Guelton
df4269f308 Provide anchor for compiler extensions
This patch is cherry-picked from 04b0a4e22e3b4549f9d241f8a9f37eebecb62a31, and
amended to prevent an undefined reference to `llvm::EnableABIBreakingChecks'

(cherry picked from commit 38778e1087b2825e91b07ce4570c70815b49dcdc)
2020-09-08 13:48:13 +02:00
Craig Topper
29f8bec823 [MachineCopyPropagation] In isNopCopy, check the destination registers match in addition to the source registers.
Previously, if the source matched, we asserted that the destination
matched. But GPR <-> mask register copies on X86 can violate this
since we use the same K-registers for multiple sizes.

Fixes this ISPC issue https://github.com/ispc/ispc/issues/1851

Differential Revision: https://reviews.llvm.org/D86507

(cherry picked from commit 4783e2c9c603ed6aeacc76bb1177056a9d307bd1)
2020-09-07 19:52:17 +02:00
Nemanja Ivanovic
054e5f0a5d [PowerPC] Fix broken kill flag after MI peephole
The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed
two bugs in the PPC back end. The first one was fixed in commit
27714075848e7f05a297317ad28ad2570d8e5a43 but the test case had to
be added without -verify-machineinstrs due to the second bug.
This commit fixes the use-after-kill that is left behind by the
PPC MI peephole optimization.

(cherry picked from commit 69289cc10ffd1de4d3bf05d33948e6b21b6e68db)
2020-09-07 19:37:26 +02:00
Nemanja Ivanovic
20da396774 [PowerPC] Do not legalize vector FDIV without VSX
Quite a while ago, we legalized these nodes as we added custom
handling for reciprocal estimates in the back end. We have since
moved to target-independent combines but neglected to turn off
legalization. As a result, we can now get selection failures on
non-VSX subtargets as evidenced in the listed PR.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373
(cherry picked from commit 27714075848e7f05a297317ad28ad2570d8e5a43)
2020-09-07 19:37:25 +02:00
Thomas Lively
7da9b1dbc7 [WebAssembly] Fix incorrect assumption of simple value types
Fixes PR47375, in which an assertion was triggering because
WebAssemblyTargetLowering::isVectorLoadExtDesirable was improperly
assuming the use of simple value types.

Differential Revision: https://reviews.llvm.org/D87110

(cherry picked from commit caee15a0ed52471bd329d01dc253ec9be3936c6d)
2020-09-07 19:23:56 +02:00
Kang Zhang
7664478cd7 [PowerPC] Set v1i128 to expand for SETCC to avoid crash
Summary:
PPC only supports instruction selection for v16i8, v8i16, v4i32,
v2i64, v4f32 and v2f64 for ISD::SETCC. v1i128 is not supported, so
ISD::SETCC on v1i128 crashes.

This patch sets v1i128 to expand to avoid the crash.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D84238

(cherry picked from commit 802c043078ad653aca131648a130b59f041df0b5)
2020-09-01 17:15:52 +02:00
Nikita Popov
42e283c87a [InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.

This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.

This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.

Differential Revision: https://reviews.llvm.org/D86834

(cherry picked from commit a5be86fde5de2c253aa19704bf4e4854f1936f8c)
2020-08-31 16:09:42 +02:00
QingShan Zhang
9e52bd71bf [DAGCombine] Don't delete the node if it has uses immediately
This is a follow-up patch for https://reviews.llvm.org/D86183: we missed the case where NegX == NegY, in which the deleted node still has a use once the new node is created.
```
    if (NegX && (CostX <= CostY)) {
      Cost = std::min(CostX, CostZ);
      RemoveDeadNode(NegY);
      return DAG.getNode(Opcode, DL, VT, NegX, Y, NegZ, Flags);  // <-- NegY is used here if NegY == NegX.
    }
```

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86689

(cherry picked from commit deb4b2580715810ecd5cb7eefa5ffbe65e5eedc8)
2020-08-31 14:17:22 +02:00
Brad Smith
d812075793 [SSP] Restore setting the visibility of __guard_local to hidden for better code generation.
Patch by: Philip Guenther

(cherry picked from commit d870e363263835bec96c83f51b20e64722cad742)
2020-08-28 11:45:57 +02:00
Sander de Smalen
43b4ea8954 [AArch64][SVE] Add missing debug info for ACLE types.
This patch adds type information for SVE ACLE vector types,
by describing them as vectors, with a lower bound of 0, and
an upper bound described by a DWARF expression using the
AArch64 Vector Granule register (VG), which contains the
runtime multiple of 64-bit granules in an SVE vector.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86101

(cherry picked from commit 4e9b66de3f046c1e97b34c938b0920fa6401f40c)
2020-08-28 11:14:10 +02:00
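The arithmetic behind the SVE debug-info commit above can be worked through directly. Since VG holds the number of 64-bit granules in a vector, a vector of elements that are `elt_bits` wide has `VG * 64 / elt_bits` elements, and with a lower bound of 0 the upper bound is one less than that. The helper below is a hypothetical illustration of this calculation; the exact DWARF expression emitted by the patch is not shown in this log.

```cpp
#include <cstdint>

// Upper bound of an SVE vector's element index range, given the value
// of the VG register (count of 64-bit granules) and the element width.
inline uint64_t sve_upper_bound(uint64_t vg, uint64_t elt_bits) {
    uint64_t element_count = vg * 64 / elt_bits;
    return element_count - 1;
}
```

For a 128-bit implementation VG is 2, so a vector of 32-bit elements has indices 0 through 3; for a 256-bit implementation VG is 4, so a vector of 8-bit elements has indices 0 through 31.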