This patch addresses assertion failure in case when the only found formula for LSR
is `1*reg => reg` which was supposed to be an impossible situation, however there
is a test that shows it is possible.
In this case, we can use scale register with scale of 1 as the missing base register.
Reviewed By: huihuiz, reames
Differential Revision: https://reviews.llvm.org/D105009
When setting Allocatable on a generated register class check all
superclasses and set Allocatable true if any superclass is
allocatable.
Without this change generated register classes based on an
allocatable class may end up unallocatable due to the topological
inheritance order.
This change primarily effects AMDGPU backend; however, there are
a few changes in MIPs GlobalISel register constraints as a result.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D105967
A common use of `ChangeStatus` is as follows:
```
ChangeStatus Changed = ChangeStatus::UNCHANGED;
Changed |= foo();
```
where `foo` returns `ChangeStatus` as well. Currently `ChangeStatus` doesn't
support compound assignment, we have to write as
```
Changed = Changed | foo();
```
which is not that convenient.
This patch add the support for compound assignment for `ChangeStatus`. Compound
assignment is usually implemented as a member function, and binary arithmetic
operator is therefore implemented using compound assignment. However, unlike
regular C++ class, enum class doesn't support member functions. As a result, they
can only be implemented in the way shown in the patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106109
We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
Use double pound at the start of the line to differentiate comments from
statements for Lit or FileCheck.
I will also use this small commit to check my commit access.
Differential Revision: https://reviews.llvm.org/D106103
Since we're still building on top of the MVT based infrastructure, we
need to track the pointer type/address space on the side so we can end
up with the correct pointer LLTs when interpreting CCValAssigns.
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and instrisics for population
count, reversed load and store related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106021
In the device runtime there are many function calls to `__kmpc_is_spmd_exec_mode`
to query the execution mode of current kernels. In many cases, user programs
only contain target region executing in one mode. As a consequence, those runtime
function calls will only return one value. If we can get rid of these function
calls during compliation, it can potentially improve performance.
In this patch, we use `AAKernelInfo` to analyze kernel execution. Basically, for
each kernel (device) function `F`, we collect all kernel entries `K` that can
reach `F`. A new AA, `AAFoldRuntimeCall`, is created for each call site. In each
iteration, it will check all reaching kernel entries, and update the folded value
accordingly.
In the future we will support more function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105787
This adds some level of type safety, allows helper functions to be added for
specific opcodes for free, and also allows us to succinctly check for class
membership with the usual dyn_cast/isa/cast functions.
To start off with, add variants for the different load/store operations with some
places using it.
Differential Revision: https://reviews.llvm.org/D105751
D104806 broke some uses of getMinusSCEV() in DependenceAnalysis:
subtraction with different pointer bases returns a SCEVCouldNotCompute.
Make sure we avoid cases involving such subtractions.
Differential Revision: https://reviews.llvm.org/D106099
The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit
variant, the former using EDI, the latter RDI, but the use of the
register is implicit. In 64-bit mode, a 0x67 prefix can be used to get
the version using EDI, but there is no way to express this in
assembly in a single instruction, the only way is with an explicit
addr32.
This change adds support for the instruction. When generating assembly
text, that explicit addr32 will be added. When not generating assembly
text, it will be kept as a single instruction and will be emitted with
that 0x67 prefix. When parsing assembly text, it will be re-parsed as
ADDR32 followed by MASKMOVDQU64, which still results in the correct
bytes when converted to machine code.
The same applies to vmaskmovdqu as well.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D103427
This bug was introduced with D105730 / 25ee55c0baff .
If we are not converting all of the operations of a reduction
into a vector op, we need to preserve the existing select form
of the remaining ops. Otherwise, we are potentially leaking
poison where it did not in the original code.
Alive2 agrees that the version that freezes some inputs
and then falls back to scalar is correct:
https://alive2.llvm.org/ce/z/erF4K2
Intrinsics can only be called directly, taking their address is not
legal. This is currently only enforced for intrinsics that have an
ID, rather than all intrinsics. Adjust the check to cover all
intrinsics.
This came up in D106013.
Differential Revision: https://reviews.llvm.org/D106095
Summary:
in the function PPCFunctionInfo::getParmsType(), there is if (Bits > 31 || (Bits > 30 && (Elt != FixedType || hasVectorParms())))
when the Bit is 31 and the Elt is not FixedType(for example the Elt is FloatingType) , the 31th bit will be not encoded, it leave the bit as zero, when the function Expected<SmallString<32>> XCOFF::parseParmsType() the original implement
**// unsigned ParmsNum = FixedParmsNum + FloatingParmsNum;
while (Bits < 32 && ParsedNum < ParmsNum) {
...
}//**
it will look the 31 bits (zero) as FixedType. which should be FloatingType, and get a error.
Reviewers: Jason Liu,ZarkoCA
Differential Revision: https://reviews.llvm.org/D105023
D55348 replaced @objc_msgSend with @llvm.objc.msgSend in tests
together with many other objc intrinsics. However, this is not a
recognized objc intrinsic (https://llvm.org/docs/LangRef.html#objective-c-arc-runtime-intrinsics)
and does not receive special treatment by LLVM. It's likely that
uses of this function were renamed by accident.
This came up in D106013, because the address of @llvm.objs.msgSend
is taken, something which is normally not allowed for intrinsics.
Differential Revision: https://reviews.llvm.org/D106094
`intermediate_commits` is a list of full SHAs, and `across_ref` may/may
not be a full SHA (or a SHA at all). We already have `across_sha`, which
is the resolved form of `across_ref`, so use that instead.
Thanks to probinson for catching this in post-commit review of
https://reviews.llvm.org/D105578!
The ceiling variant was recently added (due to the work towards D105216), and we're spending a lot of time trying to find optimizations for the expression. This patch brute forces the space of i8 unsigned divides and checks that we get a correct (well consistent with APInt) result for both udiv and udiv ceiling.
(This is basically what I've been doing locally in a hand rolled C++ program, and I realized there no good reason not to check it in as a unit test which directly exercises the logic on constants.)
Differential Revision: https://reviews.llvm.org/D106083
This patch makes vector spills valid for tail predication when all loads
from the same stack slot are within the loop
Differential Revision: https://reviews.llvm.org/D105443
This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins.
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D105360
This is split from D105216, it handles only a subset of the cases in that patch.
Specifically, the issue being fixed is that the code incorrectly assumed that (Start-Stide) < End implied that the backedge was taken at least once. This is not true when e.g. Start = 4, Stride = 2, and End = 3. Note that we often do produce the right backedge taken count despite the flawed reasoning.
The fix chosen here is to use an alternate form of uceil (ceiling of unsigned divide) lowering which is safe when max(RHS,Start) > Start - Stride. (Note that signedness of both max expression and comparison depend on the signedness of the comparison being analyzed, and that overflow in the Start - Stride expression is allowed.) Note that this is weaker than proving the backedge is taken because it allows start - stride < end < start. Some cases which can't be proven safe are sent down the generic path, and we do end up generating less optimal expressions in a few cases.
Credit for coming up with the approach goes entirely to Eli. I just split it off, tweaked the comments a bit, and did some additional testing.
Differential Revision: https://reviews.llvm.org/D105942
Details:
Switch all #includes to use <> because that is consistent with what happens in the cmake checks.
Otherwise, we could be in the situation where cmake checks see that headers exist at <perfmon/...>
but in llvm-exegesis code, we use "perfmon/...", which may not exist.
Related PR/revisions: D84076, PR51017+D105615
Differential Revision: https://reviews.llvm.org/D105861
The documentation and help messages have recommended the double-dash forms for
quite a while. Remove one-dash long options which are not recognized by GNU
style `getopt_long`.
`-arch` is kept as it is in the manpage of classic nm
https://keith.github.io/xcode-man-pages/nm.1.html
Note: the dyldinfo related options don't have a test.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D105948
This implements the elementtype attribute specified in D105407. It
just adds the attribute and the specified verifier rules, but
doesn't yet make use of it anywhere.
Differential Revision: https://reviews.llvm.org/D106008
This adds an elementtype(<ty>) attribute, which can be used to
attach an element type to a pointer typed argument. It is similar
to byval/byref in purpose, but unlike those does not carry any
specific semantics by itself. However, certain intrinsics may
require it and interpret it in specific ways.
The in-tree use cases for this that I'm currently aware of are:
call ptr @llvm.preserve.array.access.index.p0.p0(ptr elementtype(%ty) %base, i32 %dim, i32 %index)
call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%ty) %base, i32 %gep_index, i32 %di_index)
call token @llvm.experimental.gc.statepoint.p0(i64 0, i32 0, ptr elementtype(void ()) @foo, i32 0, i32 0, i32 0, i32 0, ptr addrspace(1) %obj)
Notably, the gc.statepoint case needs a function as element type,
in which case the workaround of adding a separate %ty undef
argument would not work, as arguments cannot be unsized.
Differential Revision: https://reviews.llvm.org/D105407
Fixes some regressions with -fstrict-vtable-pointers in llvm-test-suite.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D106017
This change enables vectorization of multiple exit loops when the exit count is statically computable. That requirement - shared with the rest of LV - in turn requires each exit to be analyzeable and to dominate the latch.
The majority of work to support this was done in a set of previous patches. In particular,, 72314466 avoids having multiple edges from the middle block to the exits, and 4b33b2387 which added support for non-latch single exit and multiple exits with a single exiting block. As a result, this change is basically just removing a bailout and adjusting some tests now that the prerequisite work is done and has stuck in tree for a bit.
Differential Revision: https://reviews.llvm.org/D105817
Continuing on from D105780, this should be the last major bit of
attribute cleanup. Currently, LLParser implements attribute parsing
for functions, parameters and returns separately, enumerating all
supported (and unsupported) attributes each time. This patch
extracts the common parsing logic, and performs a check afterwards
whether the attribute is valid in the given position. Parameters
and returns are handled together, while function attributes need
slightly different logic to support attribute groups.
Differential Revision: https://reviews.llvm.org/D105938
Similar to the folds performed in InstCombinerImpl::foldSelectOpOp, this attempts to push a select further up to help merge a pair of binops.
I'm primarily interested in select(cond,add(x,y),add(x,z)) folds to help expose pointer math (see https://bugs.llvm.org/show_bug.cgi?id=51069 etc.) but I've tried to use the more generic isBinOp().
Differential Revision: https://reviews.llvm.org/D106058
This reverts commit efaf3099c8cec1954831ee28a2f75a72096f50eb.
This reverts commit dc7bdc1e7121693df112f2fdb11cc6b88580ba4b.
Reverting patches due to buildbot failures.
This breaks out some (more) common llvm-specific
variables. Controlling the subprojects and target architectures, along
with clues about restricting build parallelism when linking. 'more
common' is somewhat subjective, of course.
Differential Revision: https://reviews.llvm.org/D105822