The majority of the running time of this script tends to be spent in
running git blame on source files touched by patches under review.
By introducing a git blame output cache, some of the git blame commands
don't have to re-run, and the blame information can be retrieved from a
cache.
I've observed that in a typical run matching patches available for
review with potential reviewers, this speeds up the script's running
time by a factor of about 2.5x.
There was no way to set an unsupported or unknown OS ABI.
With this patch it is possible to use any numeric value.
Differential revision: https://reviews.llvm.org/D71765
As the extern_weak target might be missing, resolving to the absolute
address zero, we can't use the normal direct PC-relative branch
instructions (as that would result in relocations out of range).
Improve the classifyGlobalFunctionReference method to set
MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in
AArch64TargetLowering::LowerCall to use the return value from
classifyGlobalFunctionReference for these cases.
Add code in both AArch64FastISel and GlobalISel/IRTranslator to
bail out for function calls to extern weak functions on windows,
to let SelectionDAG handle them.
This matches what was done for X86 in 6bf108d77a3c.
Differential Revision: https://reviews.llvm.org/D71721
As the extern_weak target might be missing, resolving to the absolute
address zero, we can't use the normal direct PC-relative branch
instructions (as that would result in relocations out of range).
Instead check the shouldAssumeDSOLocal method and load the address
from a COFF stub.
This matches what was done for X86 in 6bf108d77a3c.
Differential Revision: https://reviews.llvm.org/D71720
This rewrites a few tests to stop using the
trivial.obj.elf-x86-64 precompiled object
and removes it.
Differential revision: https://reviews.llvm.org/D71662
The custom node PPCISD::XXREVERSE has completely the same semantics of generic node ISD::BSWAP.
We need to clean up it as we have the combine rules for bswap in the base class, while nothing for xxreverse.
Differential Revision: https://reviews.llvm.org/D70657
Summary:
This patch introduces the ROLBRd and RORBRd pseudo-instructions,
which implemenent the "traditional" rotate operations; instead of
the AVR rotate instructions that use the carry bit.
The code is not optimized at all. Especially when dealing with
loops of rotate instructions, this codegen should be improved some
day.
Related bug: 41358 <https://bugs.llvm.org/show_bug.cgi?id=41358>
//Note//: This is my first submitted patch.
Reviewers: dylanmckay, Jim
Reviewed By: dylanmckay
Subscribers: hiraditya, llvm-commits, dylanmckay, dsprenkels
Tags: #llvm
Patched by dsprenkels (Daan Sprenkels)
Differential Revision: https://reviews.llvm.org/D60365
Summary:
Currently, we set legalization action of `ISD::ROTL` vectors as
`Expand` in `PPCISelLowering`. However, we can exploit `vrl(b|h|w|d)`
to lower `ISD::ROTL` directly.
Differential Revision: https://reviews.llvm.org/D71324
Commit d77ae1552fc21a9f3877f3ed7e13d631f517c825
("[DebugInfo] Support to emit debugInfo for extern variables")
added deebugInfo for extern variables for BPF target.
The commit is reverted by 891e25b02d760d0de18c7d46947913b3166047e7
as the committed tests using %clang instead of %clang_cc1 causing
test failed in certain scenarios as reported by Reid Kleckner.
This patch fixed the tests by using %clang_cc1.
Differential Revision: https://reviews.llvm.org/D71818
Summary:
Without this check unnecessary FMA instructions are generated when the FSUB terms are reused.
This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation.
Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel
Reviewed By: arsenm
Subscribers: jvesely, wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71656
This reverts commit d77ae1552fc21a9f3877f3ed7e13d631f517c825.
The tests committed along with this change do not pass, and should be
changed to use %clang_cc1.
Summary:
We noticed in Julia that the sequence below no longer turned into
a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6.
```
%29 = fmul contract <4 x double> %wide.load, %wide.load16
%30 = fmul contract <4 x double> %wide.load13, %wide.load17
%31 = fmul contract <4 x double> %wide.load14, %wide.load18
%32 = fmul contract <4 x double> %wide.load15, %wide.load19
%33 = fadd fast <4 x double> %vec.phi, %29
%34 = fadd fast <4 x double> %vec.phi10, %30
%35 = fadd fast <4 x double> %vec.phi11, %31
%36 = fadd fast <4 x double> %vec.phi12, %32
```
Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function
attribute, but rather emits more local instruction flags.
This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further.
Reviewers: spatel, mcberg2017
Reviewed By: spatel
Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71495
This reverts commit ee7579409b7d940c4e1314d126e900db30c4edff.
It causes crashes during ThinLTO. I suspect the issue is related to
races on the global TypeSize variable, which is 80 at the time of the
crash.
These said test_f32_olt_s for the type of an overloaded intrinsic.
But the parser doesn't use that part of the name and just uses
the types of the arguments.
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Also removed the top-level const as requested by Aaron Ballman in similar
patches.
Differential Revision: https://reviews.llvm.org/D71812
This is in the context of the automatic padding work for the jcc erratum mitigation. These are example cases we need to *not* pad for correctness. Exact mechanism to suppress is still TBD, but saving the tests which have come up.
Summary:
This is documented as the appropriate template modifier for call operands.
Fixes PR44272, and adds a regression test.
Also adds support for operand modifiers in Intel-style inline assembly.
Reviewers: rnk
Reviewed By: rnk
Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71677
This is another potential regression exposed by D63815.
Here we peek through a bitcast to find an extract subvector and
scale the splat offset based on that:
splat (bitcast (extract X, C)), LaneC --> duplane (bitcast X), LaneC'
Differential Revision: https://reviews.llvm.org/D71672
This now defines HAVE_VCS_VERSION_INC for all files in Basic,
but now the BUILD.gn file has only a single "sources" field again,
and the automerger requires that. Having the automerger work for
clang/lib/Basic is a very nice to have, and the downside seems tiny.
As discussed in PR44330:
https://bugs.llvm.org/show_bug.cgi?id=44330
...the transform from pow(X, -0.5) libcall/intrinsic to
reciprocal square root can result in small deviations from
the expected result due to differences in the pow()
implementation and/or the extra rounding step from the division.
This patch proposes to allow that difference with either the
'approximate functions' or 'reassociate' FMF:
http://llvm.org/docs/LangRef.html#fast-math-flags
In practice, this likely means that the code is compiled with
all of 'fast' (-ffast-math), but I have preserved the existing
specializations for -0.0/-INF that enable generating safe code
if those special values are allowed simultaneously with
allowing approximation/reassociation.
The question about whether a similar restriction is needed for
the non-reciprocal case -- pow(X, 0.5) -- is deferred. That
transform is allowed without FMF currently, and this patch does
not change that behavior.
Differential Revision: https://reviews.llvm.org/D71706
This seems to have been relying on extra spills being inserted in
these blocks to increase the code size to trigger branch
relaxation. This broke when these spills were avoided. Add some asm to
pad the size of the blocks to make it not matter.
Confusingly, the intrinsic operands do not match the
instruction/custom node. The order is shuffled, and the 3rd operand is
an immediate to select operands.
I'm not 100% sure I did this right, but fdiv still doesn't select end
to end and it will be easier to tell when it does. This at least
avoids an assertion in RegBankSelect and allows hitting the fallback
on selection.