Remappings involving extern "C" names were already supported in the
context of <local-name>s, but this support didn't work for remapping the
complete mangling itself. (Eg, we would remap X<foo> but not foo itself,
if foo is an extern "C" function.)
getTargetConstant prevents any optimizations from operating on the
value and basically says its already been iseled. But since we
want the index to be in a register, this isn't true.
Prior to this we were generating a vbroadcast with an immediate
argument which is illegal and was flagged by the expensive checks
bot.
Refactor the splatting of a constant to a vector so that common code is used
both for Power9 and Power8.
Patch by: Anil Mahmud
Differential Revision: https://reviews.llvm.org/D71481
Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics.
Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71614
(This commit restores the original branch (4272372c571) and applies an
additional change dropped from the original in a bad merge. This change
should address the previous bot failures. Both changes reviewed by pete.)
Summary:
This commit builds upon Derek Schuff's 2014 commit for attaching labels to
existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 )
When temporary labels appear ahead of a fragment, MCObjectStreamer will
track the temporary label symbol in a "Pending Labels" list. Labels are
associated with fragments when a real fragment arrives; otherwise, an empty
data fragment will be created if the streamer's section changes or if the
stream finishes.
This commit moves the "Pending Labels" list into each MCStream, so that
this label-fragment matching process is resilient to section changes. If
the streamer emits a label in a new section, switches to another section to
do other work, then switches back to the first section and emits a
fragment, that initial label will be associated with this new fragment.
Labels will only receive empty data fragments in the case where no other
fragment exists for that section.
The downstream effects of this can be seen in Mach-O relocations. The
previous approach could produce local section relocations and external
symbol relocations for the same data in an object file, and this mix of
relocation types resulted in problems in the ld64 Mach-O linker. This
commit ensures relocations triggered by temporary labels are consistent.
Reviewers: pete, ab, dschuff
Reviewed By: pete, dschuff
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71368
This reverts commit 1f3dd83cc1f2b8f72b9d59e2b4221b12fb7f9a95, reapplying
commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462.
The original commit failed on some builds seemingly due to the use of a
bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
This should eliminate a regression seen in D63815.
If we are FP extending the high half extract of a vector,
we should be able to peek through a bitcast sitting
between the extract and extend.
This replaces tablegen patterns with a more general
DAG to DAG override, so we can handle any casted type.
Differential Revision: https://reviews.llvm.org/D71515
Summary:
This is used by the extending_loads combine to tell the apply step which
use is the preferred one to fold and the other uses should be re-written
to consume.
Depends on D69117
Reviewers: volkan, bogner
Reviewed By: volkan
Subscribers: hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69147
This reverts commit e62e760f29567fe0841af870c65a4f8ef685d217.
The issue @uweigand raised should have been fixed by iterating over the
vector that owns the operand list data instead of the FoldingSet.
The MSVC issue raised by @thakis should have been fixed by relaxing the
regexes a little. I don't have a Windows machine available to test that so
I tested it by using `perl -p -e 's/0x([0-9a-f]+)/\U\1\E/g' to convert the
output of %p to the windows style.
I've guessed at the issue @phosek raised as there wasn't enough information
to investigate it. What I think is happening on that bot is the -debug
option isn't available because the second stage build is a release build.
I'm not sure why other release-mode bots didn't report it though.
Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop
casts impossible. This patch enables the salvaging of casts by using the
DW_OP_LLVM_convert operator for SExt and Trunc instructions.
There is another issue which is exposed by this fix, in which fragment
DIExpressions (which are preserved more readily by this patch) for
values that must be split across registers in ISel trigger an assertion,
as the 'split' fragments extend beyond the bounds of the fragment
DIExpression causing an error. This patch also fixes this issue by
checking the fragment status of DIExpressions which are to be split, and
dropping fragments that are invalid.
Summary:
Instead of generating two i64 instructions for each load or store of a
volatile i128 value (two LDRs or STRs), now emit a single LDP or STP.
Reviewers: labrinea, t.p.northover, efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69559
Summary:
r347747 added support for clustering mem ops with FI base operands
including support for fixed stack objects in shouldClusterFI, but
apparently this was never tested.
This patch fixes shouldClusterFI to work with scaled as well as
unscaled load/store instructions, and fixes the ordering of memory ops
in MemOpInfo::operator< to ensure that memory addresses always
increase, regardless of which direction the stack grows.
Subscribers: MatzeB, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71334
Add an extra parameter so alignment can be taken under
consideration in gather/scatter legalization.
Differential Revision: https://reviews.llvm.org/D71610
Summary:
Make sure that llvm-locstats is updated before running lit tests even
when it's not an explicit target.
Reviewers: djtodoro, krisb, spatel
Reviewed By: djtodoro
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71611
This is a natural clean-up after D71462/D71464.
It allows to define known section letters used for GNU style
in one place.
Differential revision: https://reviews.llvm.org/D71591
In this test case we use 3 precompiled objects to
test how we print a histogram for an GNU hash section.
It does not make sense to use precompiled objects
for that. Also we could have 2 tests: one for 32 and
another for 64 bits target.
This patch does this change. It is not possible to remove
these precompiled objects because they are used elsewhere.
Differential revision: https://reviews.llvm.org/D71606
GNU uses `l` for SHF_X86_64_LARGE and `y` for SHF_ARM_PURECODE.
Lets follow.
To do this I had to refactor and refine how we print the help flags description.
It was too generic and inconsistent with GNU readelf.
Differential revision: https://reviews.llvm.org/D71464
Our logic that dumped the flags was buggy.
For LLVM style it dumped SHF_MASKPROC/SHF_MASKOS named constants, though
they are not flags, but masks.
For GNU style it was just very inconsistent with GNU which has logic
that is not straightforward. Imagine we have sh_flags == 0x90000000.
SHF_EXCLUDE ("E") has a value of 0x80000000 and SHF_MASKPROC is 0xf0000000.
GNU readelf will not print "E" or "Ep" in this case, but will print just
"p". It only will print "E" when no other processor flag is set.
I had to investigate the GNU source to find the algorithm and now our logic should
match it.
Differential revision: https://reviews.llvm.org/D71462
Summary: Add calculation for elements in structures in getting uniform
base for the Gather/Scatter intrinsic.
Reviewers: craig.topper, c-rhodes, RKSimon
Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71442
We somehow missed doing this when we were working on Power9 exploitation.
This just adds the missing legalization and cost for producing the vector
intrinsics.
Differential revision: https://reviews.llvm.org/D70436
and follow-on patches.
This is breaking a few build bots and local builds with follow-up already
on the patch thread.
This reverts commits 390c8baa5440dda8907688d9ef860f6982bd925f and
520e3d66e7257c77f1226185504bbe1cb90afcfa.
Summary: AArch64 doesn't support uadd.with.overflow.i16 natively. This change adds a legalization rule to convert the 32bit add result to 16bit. This should fix PR43981.
Reviewers: arsenm, qcolombet, paquette, aemerson
Reviewed By: paquette
Subscribers: wdng, rovka, kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71587
The caller will assert for nodes with more than 2 results unless
we return a null SDValue.
I tried to test this by copying an AArch64 test for ScalarizeVecOp_FP_ROUND.
While it did hit the assert and this commited fixed that. It also
hit a later problem that couldn't be fixed without adding strict
FP support to AArch64.
Summary:
These instructions were added to the spec proposal in
https://github.com/WebAssembly/simd/pull/126. Their semantics are
equivalent to `(a + b + 1) / 2`. The opcode for the experimental
i32x4.dot_i16x8_s is also bumped due to a collision with the
i8x16.avgr_u opcode.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71628
This reverts commit 4272372c571cd33edc77a8844b0a224ad7339138.
Caused an MSan buildbot failure. More information available in the patch
that introduced the bug: https://reviews.llvm.org/D71368
This started with adding a test to support get code coverage on
ScalarizeVecOp_UnaryOp_StrictFP by copying an existing AArch64 test
and using constrained sitofp/uitofp intrinsics.
This found 3 separate issues:
-ScalarizeVecOp_UnaryOp_StrictFP needs to do its own replacement
because the caller can't handle replacing multiple results.
-Missing integer promotion support for sitofp/uitofp
-Chain result not always assigned in ExpandLegalINT_TO_FP.
Committing them together so I can add the test case.