Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.
Declutters the output of D48190.
Reviewers: RKSimon, andreadb, courbet, craig.topper
Reviewed By: andreadb
Subscribers: javed.absar, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48209
llvm-svn: 334833
Summary:
There does not seem to be any other tests for this.
Split off from D47676.
Reviewers: RKSimon, craig.topper, courbet, andreadb
Reviewed By: andreadb
Subscribers: javed.absar, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D48190
llvm-svn: 334832
This is r334704 (which was reverted in r334732) with a fix for
types like x86_fp80. We need to use getTypeAllocSizeInBits and
not getTypeStoreSizeInBits to avoid dropping debug info for
such types.
Original commit msg:
> Summary:
> Do not convert a DbgDeclare to DbgValue if the store
> instruction only refer to a fragment of the variable
> described by the DbgDeclare.
>
> Problem was seen when for example having an alloca for an
> array or struct, and there were stores to individual elements.
> In the past we inserted a DbgValue intrinsics for each store,
> just as if the store wrote the whole variable.
>
> When handling store instructions we insert a DbgValue that
> indicates that the variable is "undefined", as we do not know
> which part of the variable that is updated by the store.
>
> When ConvertDebugDeclareToDebugValue is used with a load/phi
> instruction we assert that the referenced value is large enough
> to cover the whole variable. Afaict this should be true for all
> scenarios where those methods are used on trunk. If the assert
> blows in the future I guess we could simply skip to insert a
> dbg.value instruction.
>
> In the future I think we should examine which part of the variable
> that is accessed, and add a DbgValue instrinsic with an appropriate
> DW_OP_LLVM_fragment expression.
>
> Reviewers: dblaikie, aprantl, rnk
>
> Reviewed By: aprantl
>
> Subscribers: JDevlieghere, llvm-commits
>
> Tags: #debug-info
>
> Differential Revision: https://reviews.llvm.org/D48024
llvm-svn: 334830
Some instructions require of a limited set of FP immediates as operands,
for example '#0.5 or #1.0' for SVE's FADD instruction.
This patch adds support for parsing and printing such FP immediates as
exact values (e.g. #0.499999 is not accepted for #0.5).
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47711
llvm-svn: 334826
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.
The original commit was reverted because
it broke tests for amdgpu backend, which i didn't check.
Now, the backed was updated to recognize these new
patterns, so we are good.
https://bugs.llvm.org/show_bug.cgi?id=37603https://rise4fun.com/Alive/cplX
Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm
Reviewed By: spatel, rampitec, nhaehnle
Subscribers: wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D47980
llvm-svn: 334818
Summary: The same pattern as D48010, but this one is IR-canonical as of D47428.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48012
llvm-svn: 334817
Summary:
As a followup for D48007.
Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern,
which does not have ub for both the edge cases (`y == 0`, `y == bitwidth`),
i think also handling a pattern that is ub for `y == bitwidth` should be fine.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48010
llvm-svn: 334816
Summary:
D47980 will canonicalize the `x << (32 - y) >> (32 - y)`,
which is the pattern the AMDGPU expects to `x & (-1 >> (32 - y))`,
which is not recognized by AMDGPU.
Thus, it needs to be recognized, too.
Reviewers: nhaehnle, bogner, tstellar, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #amdgpu
Differential Revision: https://reviews.llvm.org/D48007
llvm-svn: 334815
Instruction bundling is only supported on descendants of the
MCEncodedFragment type. By moving the bundling functionality and
MCSubtargetInfo to this class it makes it easier to set and extract the
MCSubtargetInfo when it is necessary.
This is a refactoring change that will make it easier to pass the
MCSubtargetInfo through to writeNops when nop padding is required.
Differential Revision: https://reviews.llvm.org/D45959
llvm-svn: 334814
Summary:
On hover, the whole asm snippet is displayed, including operands.
This requires the actual assembly output instead of just the MCInsts:
This is because some pseudo-instructions get lowered to actual target
instructions during codegen (e.g. ABS_Fp32 -> SSE or X87).
Reviewers: gchatelet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D48164
llvm-svn: 334805
I think this covers most of the unmasked vector instructions. We're still missing a lot of the masked instructions.
There are some test changes here because of the new folding support. I don't think these particular cases should be folded because it creates an undef register dependency. I think the changes introduced in r334175 are not handling stack folding. They're only blocking the peephole pass.
llvm-svn: 334800
LLVM currently assumes that Apple platforms will always use ld64. In the
future, LLD Mach-O might also be supported, so add the beginnings of
linker detection support. ld64 is currently the only detected linker,
since `ld64.lld -v` doesn't yield any useful version output, but we can
add that detection later, and in the meantime it's still useful to have
the ld64 identification.
Switch clang's order file check to use this new detection rather than
just checking for the presence of an ld64 executable.
Differential Revision: https://reviews.llvm.org/D48201
llvm-svn: 334780
IEEE 754 defines the expected result on overflow. As far as I know,
hardware implementations (of f16), and compiler-rt (__floatuntisf)
correctly return +-Inf on overflow. And I can't think of any useful
transform that would take advantage of overflow being undefined here.
Differential Revision: https://reviews.llvm.org/D47807
llvm-svn: 334777
Once a symbol has been selected for materialization it can no longer be
overridden. Stripping the weak flag guarantees this (override attempts will
then be treated as duplicate definitions and result in a DuplicateDefinition
error).
llvm-svn: 334771
Summary:
Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar
Reviewed By: spatel
Subscribers: efriedma, wdng, tpr
Differential Revision: https://reviews.llvm.org/D48057
llvm-svn: 334769
The return value of TreePatternNode::getChild is never null. This patch also
updates various places that use return values of getChild to also use
references. Those changes were suggested post-commit for D47463.
llvm-svn: 334764
In particular, when asked to print a MemoryAccess, we'll now print where
defs are optimized to, and we'll print optimized access types.
This patch also introduces an operator<< to make printing AliasResults
easier.
Patch by Juneyoung Lee!
Differential Revision: https://reviews.llvm.org/D47860
llvm-svn: 334760
isVectorClearMaskLegal() is the TLI hook used by the generic
DAGCombiner::XformToShuffleWithZero().
We've grown to accomodate/expect this transform to shuffle
(disabling it more generally results in many regressions).
So I'm narrowly excluding the 256-bit types that clearly
are not worthwhile for AVX1.
I think in most cases we are able to recover by converting
the shuffle back into 'and' ops, but the cases in:
https://bugs.llvm.org/show_bug.cgi?id=37749
...show that there are cracks.
llvm-svn: 334759
This is r334750 (which was reverted in r334754) with a fix for an
uninitialized variable that was caught by msan.
Original commit message:
> If a copy bundle happens to involve overlapping registers, we can end
> up with emitting the copies in an order that ends up clobbering some
> of the subregisters. Since instructions in the copy bundle
> semantically happen at the same time, this is incorrect and we need to
> make sure we order the copies such that this doesn't happen.
llvm-svn: 334756
Summary: A FMF constraint is added to FADD with unsafe still available as the fallback
Reviewers: spatel, wristow, arsenm, hfinkel
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D48180
llvm-svn: 334753
WebAssembly doesn't support more than one function per section
and we rely on function sections being unique. This change ignores
the section provided by the function to avoid two functions being
in the same section.
Without this change the object writer produces the following
error for this test:
LLVM ERROR: section already has a defining function: baz
Differential Revision: https://reviews.llvm.org/D48178
llvm-svn: 334752
If a copy bundle happens to involve overlapping registers, we can end
up with emitting the copies in an order that ends up clobbering some
of the subregisters. Since instructions in the copy bundle
semantically happen at the same time, this is incorrect and we need to
make sure we order the copies such that this doesn't happen.
Differential Revision: https://reviews.llvm.org/D48154
llvm-svn: 334750
On MacOS, if CMAKE_OSX_SYSROOT is used and the user has command line tools
installed, we currently get the include path for libxml2 as
/usr/include/libxml2, instead of ${CMAKE_OSX_SYSROOT}/usr/include/libxml2.
Make it consistent on MacOS by prefixing ${CMAKE_OSX_SYSROOT} when
possible.
rdar://problem/41103601
llvm-svn: 334746
On x86-64, a write to register EAX implicitly clears the upper half or RAX.
128-bit AVX instructions clear the upper 128-bit of the YMM register that
aliases the XMM definition register.
llvm-mca doesn't know about register writes that implicitly clear the upper
portion of an aliasing super-register. This issue will be fixed in a future patch.
llvm-svn: 334742