Attributes don't know their parent Context, adding this would make Attribute larger. Instead, we add hasParentContext that answers whether this Attribute belongs to a particular LLVMContext by checking for itself inside the context's FoldingSet. Same with AttributeSet and AttributeList. The Verifier checks them with the Module context.
Differential Revision: https://reviews.llvm.org/D99362
The include is for std::swap(), but that's in <utility> in C++11,
and Hashing.h already includes that.
Differential Revision: https://reviews.llvm.org/D100657
MathExtras.h is indirectly included in over 98% of LLVM's
translation units. It currently expands to over 1MB of stuff,
over which far more than half is due to <algorithm>. Since not
using <algorithm> is slightly less code, do that.
No behavior change.
Differential Revision: https://reviews.llvm.org/D100656
This patch prevents phi-node-elimination from generating a COPY
operation for the register defined by t2WhileLoopStartLR, as it is a
terminator that defines a value.
This happens because of the presence of phi-nodes in the loop body (the
Preheader of which is the block containing the t2WhileLoopStartLR). If
this is not done, the COPY is generated above/before the terminator
(t2WhileLoopStartLR here), and since it uses the value defined by
t2WhileLoopStartLR, MachineVerifier throws a 'use before define' error.
This essentially adds on to the change in differential D91887/D97729.
Differential Revision: https://reviews.llvm.org/D100376
The implementation supports static schedule for Fortran do loops. This
implements the dynamic variant of the same concept.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D97393
Add the `IsText` argument to `GetFile` and `GetFileOrSTDIN` which will help z/OS distinguish between text and binary correctly. This is an extension to [this patch](https://reviews.llvm.org/D97785)
Reviewed By: abhina.sreeskantharajan, amccarth
Differential Revision: https://reviews.llvm.org/D100488
This is an alternative to D99759 to avoid the compile-time explosion seen in:
https://llvm.org/PR49785
Another potential solution would make the exclusion logic stronger to avoid
blowing up, but note that we reduced the complexity of the exclusion mechanism
in D16204 because it was too costly.
So I'm questioning the need for recursion/exclusion entirely - what is the
optimization value vs. cost of recursively computing known bits based on
assumptions?
This was built into the implementation from the start with 60db058,
and we have kept adding code/cost to deal with that capability.
By clearing the query's AssumptionCache inside computeKnownBitsFromAssume(),
this patch retains all existing assume functionality except refining known
bits based on even more assumptions.
We have 1 regression test that shows a difference in optimization power.
Differential Revision: https://reviews.llvm.org/D100573
Sometimes LV has to produce really wide vectors,
and sometimes they end up being not powers of two.
As it can be seen from the diff, the cost computation
is currently completely non-sensical in those cases.
Instead of just scalarizing everything, split/factorize the wide vector
into a number of subvectors, each one having a power-of-two elements,
recurse to get the cost of op on this subvector. Also, check how we'd
legalize this subvector, and if the legalized type is scalar,
also account for the scalarization cost.
Note that for sub-vector loads, we might be able to do better,
when the vectors are properly aligned.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D100099
On Windows, we want to open a file in Binary mode if OF_CRLF bit is not set. On z/OS, we want to open a file in Binary mode if the OF_Text bit is not set.
This patch creates two new functions called ChangeStdinMode and ChangeStdoutMode which will take OpenFlags as an arg to determine which mode to set stdin and stdout to. This will enable patches like https://reviews.llvm.org/D100056 to not affect Windows when setting the OF_Text flag for raw_fd_streams.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D100130
D73568 removed the lit feature object-emission, because it was introduced for a
target which did not support the integrated assembler, and that target no longer
required the feature. XCore still does not support the integrated assembler,
so a build with XCore as the default target fails tests requiring
object-emission. This issue was not publicly visible because there was not a
buildbot for XCore as the default target. We fixed the failures downstream. We
now have builder clang-xcore-ubuntu-20-x64 on the staging buildmaster, which
shows the failures. We would like to make upstream build green.
Omit DebugInfo/Generic on XCore to avoid annotating 70 separate files.
Differential Revision: https://reviews.llvm.org/D98508
Combine sub 0, csinc X, Y, CC to csinv -X, Y, CC providing that the
negation of X is cheap, currently just handling constants. This comes up
during the splat of an i1 to a predicate, where we now generate csetm,
as opposed to cset; rsb.
Differential Revision: https://reviews.llvm.org/D99940
It has to save all caller-saved registers before a call in the handler.
So don't emit a call that save/restore registers.
Reviewed By: simoncook, luismarques, asb
Differential Revision: https://reviews.llvm.org/D100532
Part of the code related to ds_read/ds_write ISel is refactored, and the
corresponding comment is re-written for better readability, which would help
while implementing any future ds_read/ds_write ISel related modifications.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D100300
This patch clarifies the semantics of nocapture attribute.
A 'Pointer Capture' subsection is added to describe the semantics of pointer capture first.
For the nocapture example with two same pointer arguments, it is consistent with the semantics that Alive2 used to run lit tests.
Reviewed By: nlopes
Differential Revision: https://reviews.llvm.org/D97924
These were misleading, they're more of a "clear" than an "invalidate".
We shouldn't be individually clearing analysis results. Either we clear
all analyses when some IR becomes invalid, or we properly go through
invalidation.
There was only one use of this, which can be simulated with
AM.invalidate(F, PA).
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D100519
This allows for walking all nested locations of a given location, and is generally useful when processing locations.
Differential Revision: https://reviews.llvm.org/D100437
str(n)cat appends a copy of the second argument to the end of the first
argument. To find the end of the first argument, str(n)cat has to read
from it until it finds the terminating 0. So it should not be marked as
writeonly. I think this means the argument should not be marked as
writeonly.
(This is causing a mis-compile with legacy DSE, before it got removed)
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D100601
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example
struct S {
__attribute__ ((__aligned__(16))) double v[4];
};
Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)
Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.
This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.
The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.
For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.
On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.
Patch by Momchil Velikov and Lucas Prates.
Differential Revision: https://reviews.llvm.org/D98794
hasMode was looking up the map once. Then we'd either call get which
would look up again, or we'd insert into the map which requires
walking the map to find the insertion point.
I believe the hasMode was needed because get has a special case
to look for DefaultMode if the mode being asked for doesn't exist.
We don't want that here so we were using hasMode to make sure we
wouldn't hit that case.
Simplify to a regular operator[] access which will default
construct a SetType if the lookup fails.
Move some utility functions which are used within LDS lowering pass to a separate utils
file so that other LDS related passes can make use of them when required.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D100526
This generalizes RVInstIShift/RVInstIShiftW to take the upper
5 or 7 bits of the immediate as an input instead of only bit 30. Then
we can share them.
For RVInstIShift I left a hardcoded 0 at bit 26 where RV128 gets
a 7th bit for the shift amount.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D100424
Avoid visiting repeated instructions for processHeaderPhiOperands as it can cause a scenario of endless loop. Test case is attached and can be ran with `opt -basic-aa -tbaa -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=4`.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D97407
Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).
The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.
There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".
To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.
Reviewed By: jamieschmeiser
Differential Revision: https://reviews.llvm.org/D100231