Summary:
These instructions convert a vector of floats to a vector of integers
of the same size, with assorted non-default rounding modes.
Implemented in IR as target-specific intrinsics, because as far as I
can see there are no matches for that functionality in the standard IR
intrinsics list.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75255
Summary:
These instructions make a vector of `<4 x float>` by widening every
other lane of a vector of `<8 x half>`.
I wondered about representing these using standard IR, along the lines
of a shufflevector to extract elements of the input into a `<4 x half>`
followed by an `fpext` to turn that into `<4 x float>`. But it looks as
if that would take a lot of work in isel lowering to make it match any
pattern I could sensibly write in Tablegen, and also I haven't been
able to think of any other case where that pattern might be generated
in IR, so there wouldn't be any extra code generation win from doing
it that way.
Therefore, I've just used another target-specific intrinsic. We can
always change it to the other way later if anyone thinks of a good
reason.
(In order to put the intrinsic definition near similar things in
`IntrinsicsARM.td`, I've also lifted the definition of the
`MVEMXPredicated` multiclass higher up the file, without changing it.)
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75254
Summary:
The two MVE instructions that convert between v4f32 and v8f16 were
implemented as instances of the same class, with the same MC operand
list.
But that's not really appropriate, because the narrowing conversion
only partially overwrites its output register (it only has 4 f16
values to write into a vector of 8), so even when unpredicated, it
needs a $Qd_src input, a constraint tying that to the $Qd output, and
a vpred_n.
The widening conversion is better represented like any other
instruction that completely replaces its output when unpredicated: it
should have no $Qd_src operand, and instead, a vpred_r containing a
$inactive parameter. That's a better match to other similar
instructions, such as its integer analogue, the VMOVL instruction that
makes a v4i32 by sign- or zero-extending every other lane of a v8i16.
This commit brings the widening VCVT.F32.F16 into line with the other
instructions that behave like it. That means you can write isel
patterns that use it unpredicated, without having to add a pointless
undefined $QdSrc operand.
No existing code generation uses that instruction yet, so there should
be no functional change from this fix.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75253
Summary:
These instructions work like VMOVN (narrowing a vector of wide values
to half size, and overwriting every other lane of an output register
with the result), except that the narrowing conversion is saturating.
They come in three signedness flavours: signed to signed, unsigned to
unsigned, and signed to unsigned. All are represented in IR by a
target-specific intrinsic that takes two separate 'unsigned' flags.
Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D75252
Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.
Reviewers: jhenderson, dblaikie, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75119
Summary:
This could be considered obvious, but I am putting it up to illustrate
the usefulness/impact of the getInitialLength change.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75117
The MVE gather instructions smaller than 32bits zext extend the values
in the offset register, as opposed to sign extending them. We need to
make sure that the code that we select from is suitably extended, which
this patch attempts to fix by tightening up the offset checks.
Differential Revision: https://reviews.llvm.org/D75361
Move Base64 implementation from clangd/SemanticHighlighting to
llvm/Support/Base64, fix its implementation and provide a decent test suite.
Previous implementation code was using + operator instead of | to combine some
results, which is a problem when shifting signed values. (0xFF << 16) is
implicitly converted to a (signed) int, and thus results in 0xffff0000, which is
negative. Combining negative numbers with a + in that context is not what we
want to do.
This fixes https://github.com/llvm/llvm-project/issues/149.
Differential Revision: https://reviews.llvm.org/D75057
The Bitcode/DITemplateParameter-5.0.ll test is failing:
FAIL: LLVM :: Bitcode/DITemplateParameter-5.0.ll (5894 of 36324)
******************** TEST 'LLVM :: Bitcode/DITemplateParameter-5.0.ll' FAILED ********************
Script:
--
: 'RUN: at line 1'; /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/llvm-dis -o - /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll.bc | /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/FileCheck /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll
--
Exit Code: 2
Command Output (stderr):
--
It looks like the Bitcode/DITemplateParameter-5.0.ll.bc file was never checked in.
This reverts commit c2b437d53d40b6dc5603c97f527398f477d9c5f1.
in C++ templates.
Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.
Reviewers: probinson, aprantl, dblaikie
Reviewed By: aprantl, dblaikie
Differential Revision: https://reviews.llvm.org/D73462
We should be careful to allow count of re-materialization of operands to be less
then number of physical registers.
STATEPOINT instruction has a variable number of operands and potentially very big.
So re-materialization for all operands is disabled at the moment if restrict-statepoint-remat is true.
The patch relaxes the re-materialization restriction for STATEPOINT instruction allowing it for
fixed operands. Specifically it is about call target.
Reviewers: reames
Reviewed By: reames
Subscribers: llvm-commits, qcolombet, hiraditya
Differential Revision: https://reviews.llvm.org/D75335
Summary:
It should be normal constant instead of target constant.
Pattern CMPri can be matched if the constant can be fitted into immediate field.
Otherwise, pattern CMPrr will be matched.
This fixed bug https://bugs.llvm.org/show_bug.cgi?id=44091.
Reviewers: dcederman, jyknight
Reviewed By: jyknight
Subscribers: jonpa, hiraditya, fedor.sergeev, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75227
The address calculation for the offset assumes that you can calculate the offset by multiplying the index by the store size of the element. But that only works if the element's store size is exactly its real size since we store vectors tightly packed in memory. There are improvements we could make to this like special casing extracting element 0. I think we could also handle cases where the extracted VT is byte sized and the index is aligned with the extract element count.
Differential Revision: https://reviews.llvm.org/D75377
Summary:
Currently the boundaryalign fragment caches its size during the process
of layout and then it is relaxed and update the size in each iteration. This
behaviour is unnecessary and ugly.
Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75404
Select_cc isn't used by all targets. X86 doesn't have optimizations
for it.
Since we already know the input to the sint_to_fp/uint_to_fp is
a setcc we can just emit a plain select using that setcc as the
condition. Other DAG combines can turn that into a select_cc on
targets that support it.
Differential Revision: https://reviews.llvm.org/D75415
getFirstInsertionPt's return value must be checked for validity before
casting it to Instruction*. Don't attempt to insert casts after a phi in
a catchswitch block.
Fixes PR45033, introduced in D37832.
Reviewed By: davidxl, hfinkel
Differential Revision: https://reviews.llvm.org/D75381
We get the simple cases of this via demanded elements and other folds,
but that doesn't work if the values have >1 use, so add a dedicated
match for the pattern.
We already have this transform in IR, but it doesn't help the
motivating x86 tests (based on PR42024) because the shuffles don't
exist until after legalization and other combines have happened.
The AArch64 test shows a minimal IR example of the problem.
Differential Revision: https://reviews.llvm.org/D75348
These AddToWorklist calls were added in 84cd968f75bbd6e0fbabecc29d2c1090263adec7.
It's possible the SimplifyDemandedBits/SimplifyDemandedVectorElts
triggered CSE that deleted N. Detect that and avoid adding N
to the worklist.
Fixes PR45067.
Summary:
This patch helps getGuaranteedNonFullPoisonOp handle llvm.assume call.
Also, a comment about the semantics of branch is removed to prevent confusion.
As llvm.assume does, branching on poison directly raises UB (as LangRef says), and this allows transformations such as introduction of llvm.assume on branch condition at each successor, or freely replacing values after conditional branch (such as at loop exit).
Handling br is not addressed in this patch. It makes SCEV more accurate, causing existing LoopVectorize/IndVar/etc tests to fail.
Reviewers: spatel, lebedev.ri, nlopes
Reviewed By: nlopes
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75397
Summary:
This patch helps isGuaranteedNotToBeUndefOrPoison return true if the value
makes the program always undefined.
According to value tracking functions' comments, it is not still in consensus
whether a poison value can be bitwise or not, so conservatively only the case with
i1 is considered.
Reviewers: spatel, lebedev.ri, reames, nlopes, regehr
Reviewed By: nlopes
Subscribers: uenoku, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75396
Summary:
This patch defines two freeze instructions to have the same value number if they are equivalent.
This is allowed because GVN replaces all uses of a duplicated instruction with another.
If it partially rewrites use, it is not allowed. e.g)
```
a = freeze(x)
b = freeze(x)
use(a)
use(a)
use(b)
=>
use(a)
use(b) // This is not allowed!
use(b)
```
Reviewers: fhahn, reames, spatel, efriedma
Reviewed By: fhahn
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75398
Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.
Saves ~250 includes of Chrono.h & FileSystem.h:
$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...
This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.
Using INTERFACE prevents the use of imported libraries as we've done
in 00b3d49 because these aren't linked against the target, they're
only made part of the interface. This doesn't affect the output since
static libraries aren't being linked into, but it enables the use of
imported libraries.
Differential Revision: https://reviews.llvm.org/D74106
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.
Differential Revision: https://reviews.llvm.org/D75162
Summary: A function that creates JITSymbolFlags from a GlobalValueSummary. Similar functions exist: fromGlobalValue(), fromObjectSymbol()
Reviewers: lhames
Reviewed By: lhames
Subscribers: hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75082
MCObjectStreamer is more suitable to create fragments than
X86AsmBackend, for example, the function getOrCreateDataFragment is
defined in MCObjectStreamer.
Differential Revision: https://reviews.llvm.org/D75351
When bundle is enabled, data fragment itself has a space to emit NOP
to bundle-align instructions. The behaviour makes it impossible for
us to determine whether the macro fusion really happen when emitting
instructions. In addition, boundary-align fragment is also used to
emit NOPs to align instructions, currently using them together sometimes
makes code crazy.
Differential Revision: https://reviews.llvm.org/D75346