SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.
The issue could be observed in a scenario such as:
```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```
This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.
This fixes Bugzilla #47454.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87844
(cherry picked from commit 53d238a961d14eae46f6f2b296ce48026c7bd0a1)
findPHICopyInsertPoint special cases placement in a block with a
callbr or invoke in it. In that case, we must ensure that the copy is
placed before the INLINEASM_BR or call instruction, if the register is
defined prior to that instruction, because it may jump out of the
block.
Previously, the code placed it immediately after the last def _or
use_. This is wrong, if the use is the instruction which may jump. We
could correctly place it immediately after the last def (ignoring
uses), but that is non-optimal for register pressure.
Instead, place the copy after the last def, or before the
call/inlineasm_br, whichever is later.
Differential Revision: https://reviews.llvm.org/D87865
(cherry picked from commit f7a53d82c0902147909f28a9295a9d00b4b27d38)
2508ef01 fixed a bug about constant removal in negation. But after
sanitizing check I found there's still some issue about it so it's
reverted.
Temporary nodes will be removed if useless in negation. Before the
removal, they'd be checked if any other nodes used it. So the removal
was moved after getNode. However in rare cases the node to be removed is
the same as result of getNode. We missed that and will be fixed by this
patch.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D87614
(cherry picked from commit a2fb5446be960ad164060b3c05fc268f7f72d67a)
This patch restricts the behaviour of referencing via .Lfoo$local
local aliases, introduced in https://reviews.llvm.org/D73230, to
STV_DEFAULT globals only.
Hidden symbols via --fvisiblity=hidden (https://gcc.gnu.org/wiki/Visibility)
is an important scenario.
Benefits:
- Improves the size of object files by using fewer STT_SECTION symbols.
- The code reads a bit better (it was not obvious to me without going
back to the code reviews why the canBenefitFromLocalAlias function
currently doesn't consider visibility).
- There is also a side benefit in restoring the effectiveness of the
--wrap linker option and making the behavior of --wrap consistent
between LTO and normal builds for references within a translation-unit.
Note: this --wrap behavior (which is specific to LLD) should not be
considered reliable. See comments on https://reviews.llvm.org/D73230
for more.
Differential Revision: https://reviews.llvm.org/D85782
(cherry picked from commit 4cb016cd2d8467c572b2e5c5d34f376ee79e4ac1)
This seems to have caused incorrect register allocation in some cases,
breaking tests in the Zig standard library (PR47278).
As discussed on the bug, revert back to green for now.
> Record internal state based on register units. This is often more
> efficient as there are typically fewer register units to update
> compared to iterating over all the aliases of a register.
>
> Original patch by Matthias Braun, but I've been rebasing and fixing it
> for almost 2 years and fixed a few bugs causing intermediate failures
> to make this patch independent of the changes in
> https://reviews.llvm.org/D52010.
This reverts commit 66251f7e1de79a7c1620659b7f58352b8c8e892e, and
follow-ups 931a68f26b9a3de853807ffad7b2cd0a2dd30922
and 0671a4c5087d40450603d9d26cf239f1a8b1367e. It also adjust some
test expectations.
(cherry picked from commit a21387c65470417c58021f8d3194a4510bb64f46)
This adds documentation for the options added / changed by D71913, which
enabled aggressive WPD under LTO. The lld release notes already
mentioned it, but I expanded the note.
Differential Revision: https://reviews.llvm.org/D86958
2508ef01 doesn't totally fix the issue since we did not handle the case
when unused temporary negated result is the same with the result, which
is found by address sanitizer.
(cherry picked from commit e1669843f2aaf1e4929afdd8f125c14536d27664)
This check fires during self-host.
> The approach is simple: if a pass reports that it's not modifying a
> Function/Module, compute a loose hash of that Function/Module and compare it
> with the original one. If we report no change but there's a hash change, then we
> have an error.
>
> This approach misses a lot of change but it's not super intrusive and can
> detect most of the simple mistakes.
>
> Differential Revision: https://reviews.llvm.org/D80916
This reverts commit 3667d87a33d3c8d4072a41fd84bb880c59347dc0.
The code that decomposes the GEP into ADD/MUL doesn't work properly
for vector GEPs. It can create bad COPY instructions or possibly
assert.
For now just bail out to SelectionDAG.
Fixes PR45906
(cherry picked from commit 4208ea3e19f8e3e8cd35e6f5a6c43f4aa066c6ec)
960cbc53 immediately removes nodes that won't be used to avoid
compilation time explosion. This patch adds the removal to constants to
fix PR47517.
Reviewed By: RKSimon, steven.zhang
Differential Revision: https://reviews.llvm.org/D87614
(cherry picked from commit 2508ef014e8b01006de4e5ee6fd451d1f68d550f)
This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.
The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:
* Export the SimplifyWithOpReplaced API from InstSimplify, with an
added AllowRefinement flag. For replacements inside the TrueVal
we don't actually care whether refinement occurs or not, the
replacement is always legal. This part of the transform is now
done in InstSimplify only. (It should be noted that the current
AllowRefinement check is not sufficient -- that's an issue we
need to address separately.)
* Change the InstCombine fold to work by temporarily dropping
poison generating flags, running the fold and then restoring the
flags if it didn't work out. This will ensure that the InstCombine
fold is correct as long as the InstSimplify fold is correct.
Differential Revision: https://reviews.llvm.org/D87445
It was found some packed immediate operands (e.g. `<half 1.0, half 2.0>`) are
incorrectly processed so one of two packed values were lost.
Introduced new function to check immediate 32-bit operand can be folded.
Converted condition about current op_sel flags value to fall-through.
Fixes: SWDEV-247595
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D87158
(cherry picked from commit d03c4034dc80c944ec4a5833ba8f87d60183f866)
This is to fix CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287
after DIsSubrange upgrade D80197
Assert condition is now removed and Count is calculated in case LowerBound
is absent or zero and Count or UpperBound is constant. If Count is unknown
it is later handled as VLA (currently Count is set to zero).
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D87406
(cherry picked from commit e45b0708ae81ace27de53f12b32a80601cb12bf3)
SSE4_1 and SSE4_2 due imply SSSE3. So I guess I got confused when
switching the code to being table based in D83273.
Fixes PR47464
(cherry picked from commit e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac)
This patch is cherry-picked from 04b0a4e22e3b4549f9d241f8a9f37eebecb62a31, and
amended to prevent an undefined reference to `llvm::EnableABIBreakingChecks'
(cherry picked from commit 38778e1087b2825e91b07ce4570c70815b49dcdc)
Previously if the source match we asserted that the destination
matched. But GPR <-> mask register copies on X86 can violate this
since we use the same K-registers for multiple sizes.
Fixes this ISPC issue https://github.com/ispc/ispc/issues/1851
Differential Revision: https://reviews.llvm.org/D86507
(cherry picked from commit 4783e2c9c603ed6aeacc76bb1177056a9d307bd1)
The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed
two bugs in the PPC back end. The first one was fixed in commit
27714075848e7f05a297317ad28ad2570d8e5a43 but the test case had to
be added without -verify-machineinstrs due to the second bug.
This commit fixes the use-after-kill that is left behind by the
PPC MI peephole optimization.
(cherry picked from commit 69289cc10ffd1de4d3bf05d33948e6b21b6e68db)
Quite a while ago, we legalized these nodes as we added custom
handling for reciprocal estimates in the back end. We have since
moved to target-independent combines but neglected to turn off
legalization. As a result, we can now get selection failures on
non-VSX subtargets as evidenced in the listed PR.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373
(cherry picked from commit 27714075848e7f05a297317ad28ad2570d8e5a43)
Fixes PR47375, in which an assertion was triggering because
WebAssemblyTargetLowering::isVectorLoadExtDesirable was improperly
assuming the use of simple value types.
Differential Revision: https://reviews.llvm.org/D87110
(cherry picked from commit caee15a0ed52471bd329d01dc253ec9be3936c6d)
Since the parameter is not used anywhere, and the default size of 16
apparently causes PR47359, remove it. This ensures that IntervalMap will
automatically determine the optimal size, using its NodeSizer struct.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D87044
(cherry picked from commit f26fc568402f84a94557cbe86e7aac8319d61387)
Summary:
PPC only supports the instruction selection for v16i8, v8i16, v4i32,
v2i64, v4f32 and v2f64 for ISD::SETCC, don't support the v1i128, so
v1i128 for ISD::SETCC will crash.
This patch is to set v1i128 to expand to avoid crash.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D84238
(cherry picked from commit 802c043078ad653aca131648a130b59f041df0b5)
Replace the check for poison-producing instructions in
SimplifyWithOpReplaced() with the generic helper canCreatePoison()
that properly handles poisonous shifts and thus avoids the problem
from PR47322.
This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which
previously could have caused this code to ignore poison flags.
Setting UseInstrInfo=false should reduce the possible optimizations,
not increase them.
This is not a full solution to the problem, as poison could be
introduced more indirectly. This is just a minimal, easy to backport
fix.
Differential Revision: https://reviews.llvm.org/D86834
(cherry picked from commit a5be86fde5de2c253aa19704bf4e4854f1936f8c)
Tests on Solaris/sparcv9 currently show about 250 failures when building
with gcc, most of them like the following:
FAIL: LLVM-Unit :: Support/./SupportTests/TaskQueueTest.UnOrderedFutures (4269 of 67884)
******************** TEST 'LLVM-Unit :: Support/./SupportTests/TaskQueueTest.UnOrderedFutures' FAILED ********************
Note: Google Test filter = TaskQueueTest.UnOrderedFutures
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from TaskQueueTest
[ RUN ] TaskQueueTest.UnOrderedFutures
0 SupportTests 0x0000000100753b20 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 32
1 SupportTests 0x0000000100752974 llvm::sys::RunSignalHandlers() + 68
2 SupportTests 0x0000000100752b18 SignalHandler(int) + 372
3 libc.so.1 0xffffffff7eedc800 __sighndlr + 12
4 libc.so.1 0xffffffff7eecf23c call_user_handler + 852
5 libc.so.1 0xffffffff7eecf594 sigacthandler + 84
6 SupportTests 0x00000001006f8cb8 std:🧵:_State_impl<std:🧵:_Invoker<std::tuple<llvm::ThreadPool::ThreadPool(llvm::ThreadPoolStrategy)::'lambda'()> > >::_M_run() + 512
7 libstdc++.so.6.0.28 0xfffffffc628117cc execute_native_thread_routine + 16
8 libc.so.1 0xffffffff7eedc6a0 _lwp_start + 0
Since it's effectively impossible to debug such a `SEGV` in a `Release`
build, I tried a `Debug` build instead, only to find that the failures had
gone away.
Further investigation revealed that most of the issue centers around
`llvm/lib/Support/ThreadPool.cpp`. That file is built with `-O3 -fPIC` in
a `Release` build. The failure vanishes if
- compiling without `-fPIC`
- compiling with `-O -fPIC`
- linking with GNU `ld` instead of Solaris `ld`
It has meanwhile been determined that `gcc` doesn't correctly heed some TLS
code sequences. To make things worse, Solaris `ld` doesn't properly
validate its assumptions against the input, generating wrong code.
`gld` like `gcc` is more liberal here and correctly deals with the code it
gets fed from `gcc`.
There's PR target/96607: GCC feeds SPARC/Solaris linker with unrecognized
TLS sequences <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96607> now.
An attempt to build with `-DLLVM_ENABLE_PIC=Off` initially failed since
neither `libRemarks.so` (D85626 <https://reviews.llvm.org/D85626>) nor
`LLVMPolly.so` (D85627 <https://reviews.llvm.org/D85627>) heed that option.
Even with that fixed, a few codegen failures remain.
Next I tried to build just `ThreadPool.cpp` with `-O -fPIC`. While that
fixed the vast majority of the failures, 16 `LLVM :: CodeGen/X86` failures
remained.
Given that that solution was both incomplete and fragile, I went for
building the whole tree with `-O -fPIC` for `Release` and `RelWithDebInfo`
builds.
As detailed in Bug 47304, 2-stage builds also show large numbers of
failures when building with `-O3` or `-O2`, which are likewise worked
around by building with `-O` until they are sufficiently analyzed and
fixed.
This way, all failures relative to a `Debug` build go away.
Tested on `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D85630
(cherry picked from commit 15c66b10114d239c96282cf8fc5330186178974b)
This is the follow up patch for https://reviews.llvm.org/D86183 as we miss to delete the node if NegX == NegY, which has use after we create the node.
```
if (NegX && (CostX <= CostY)) {
Cost = std::min(CostX, CostZ);
RemoveDeadNode(NegY);
return DAG.getNode(Opcode, DL, VT, NegX, Y, NegZ, Flags); #<-- NegY is used here if NegY == NegX.
}
```
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D86689
(cherry picked from commit deb4b2580715810ecd5cb7eefa5ffbe65e5eedc8)
This patch adds type information for SVE ACLE vector types,
by describing them as vectors, with a lower bound of 0, and
an upper bound described by a DWARF expression using the
AArch64 Vector Granule register (VG), which contains the
runtime multiple of 64bit granules in an SVE vector.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86101
(cherry picked from commit 4e9b66de3f046c1e97b34c938b0920fa6401f40c)
This fixes an issue where the restore point of callee-saves in the
function epilogues was incorrectly calculated when the basic block
consisted of only a RET instruction. This caused dealloc instructions
to be inserted in between the block of callee-save restore instructions,
rather than before it.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D86099
(cherry picked from commit 5f47d4456d192eaea8c56a2b4648023c8743c927)
When joining the legal parts of vector arguments into its original value
during the lower of Formal Arguments in SelectionDAGBuilder, the Calling
Convention information was not being propagated for the handling of each
individual parts. The same did not happen when lowering calls, causing a
mismatch.
This patch fixes the issue by properly propagating the Calling
Convention details.
This fixes Bugzilla #47001.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86715
(cherry picked from commit 3d943bcd223e5b97179840c2f5885fe341e51747)
When collecting `i1` values via `findAllDefs`, ignore Constant's
operands, since Constant's operands might not be `i1`.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46923 which causes ICE
```
llvm-project/llvm/lib/IR/Constants.cpp:1924: static llvm::Constant *llvm::ConstantExpr::getZExt(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->getScalarSizeInBits() < Ty->getScalarSizeInBits()&& "SrcTy must be smaller than DestTy for ZExt!"' failed.
```
Differential Revision: https://reviews.llvm.org/D85007
(cherry picked from commit cbea17568f4301582c1d5d43990f089ca6cff522)
The version of `st1d` that operates with vector plus immediate
addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for
rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was
generating `<Zn>.s` instead of `<Zn>.d>`.
Differential Revision: https://reviews.llvm.org/D86633
We hit the compiling time reported by https://bugs.llvm.org/show_bug.cgi?id=46877
and the reason is the same as D77319. So we need to remove the dead node we created
to avoid increase the problem size of DAGCombiner.
Reviewed By: Spatel
Differential Revision: https://reviews.llvm.org/D86183
(cherry picked from commit 960cbc53ca170c8c605bf83fa63b49ab27a56f65)
Replace the `ident_t` handling in Clang with the methods offered by the
OMPIRBuilder. This cuts down on the clang code as well as the
differences between the two, making further transitions easier. Tests
have changed but there should not be a real functional change. The most
interesting difference is probably that we stop generating local ident_t
allocations for now and just use globals. Given that this happens only
with debug info, the location part of the `ident_t` is probably bigger
than the test anyway. As the location part is already a global, we can
avoid the allocation, memcpy, and store in favor of a constant global
that is slightly bigger. This can be revisited if there are
complications.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D80735
D77531 has a type for mfsprg, it should be mtsprg. This patch is to fix
this typo.
(cherry picked from commit 95e18b2d9d5f93c209ea81df79c2e18ef77de506)
This fixes the "Unable to insert indirect branch" fatal error sometimes
seen when generating position-independent code.
Patch by msizanoen1
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D84833
(cherry picked from commit 5f9ecc5d857fa5d95f6ea36153be19db40576f8a)