mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

212770 Commits

Author SHA1 Message Date
Josh Berdine
c64c5b211f [OCaml] Add missing TypeKinds, Opcode, and AtomicRMWBinOps
There are several enum values that have been added to LLVM-C that are
missing from the OCaml bindings. The types defined in
bindings/ocaml/llvm/llvm.ml should be in sync with the corresponding
enum definitions in include/llvm-c/Core.h. The enum values are passed
from C to OCaml unmodified, and clients of the OCaml bindings
interpret them as tags of the corresponding OCaml types. So the only
changes needed are to add the missing constructors to the type
definitions, and to change the name of the maximum opcode in an
assertion.

Differential Revision: https://reviews.llvm.org/D98578
2021-03-16 15:32:38 +00:00
Joe Ellis
859445ea3f [AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible
This commit folds sxtw'd or uxtw'd offsets into gather loads where
possible with a DAGCombine optimization.

As an example, the following code:

    #include <arm_sve.h>

    svuint64_t func(svbool_t pred, const int32_t *base, svint64_t offsets) {
      return svld1sw_gather_s64offset_u64(
        pred, base, svextw_s64_x(pred, offsets)
      );
    }

would previously lower to the following assembly:

    sxtw	z0.d, p0/m, z0.d
    ld1sw	{ z0.d }, p0/z, [x0, z0.d]
    ret

but now lowers to:

    ld1sw   { z0.d }, p0/z, [x0, z0.d, sxtw]
    ret

Differential Revision: https://reviews.llvm.org/D97858
2021-03-16 15:09:46 +00:00
Max Kazantsev
9f606251f1 [SCEV][NFC] Move check up the stack
The primary caller of isBasicBlockEntryGuardedByCond is
isKnownPredicateAt, which already makes an isKnownPredicate check before
calling it. isBasicBlockEntryGuardedByCond makes a non-recursive check
of its own, so on this execution path the check is made twice. The only
other caller is isLoopEntryGuardedByCond. Moving the check there should
save some compile time.
2021-03-16 22:09:17 +07:00
Craig Topper
eeed21c58e [RISCV] Look through copies when trying to find an implicit def in addVSetVL.
The InstrEmitter can sometimes insert a copy after an IMPLICIT_DEF
before connecting it to the vector instruction. This occurs when
constrainRegClass reduces to a class with fewer than 4 registers.
I believe LMUL8 on masked instructions triggers this since the
result can only use the v8, v16, or v24 register group as the mask
is using v0.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98567
2021-03-16 07:59:09 -07:00
David Zarzycki
f2b6f6c181 [lit testing] Mark reorder.py as unavailable on Windows
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
2021-03-16 10:54:06 -04:00
Joe Ellis
cab71b0250 [AArch64][SVEIntrinsicOpts] Factor out redundant SVE mul/fmul intrinsics
This commit implements an IR-level optimization to eliminate idempotent
SVE mul/fmul intrinsic calls. Currently, the following patterns are
captured:

    fmul  pg  (dup_x  1.0)  V  =>  V
    mul   pg  (dup_x  1)    V  =>  V

    fmul  pg  V  (dup_x  1.0)  =>  V
    mul   pg  V  (dup_x  1)    =>  V

    fmul  pg  V  (dup  v  pg  1.0)  =>  V
    mul   pg  V  (dup  v  pg  1)    =>  V

The result of this commit is that code such as:

    #include <arm_sve.h>

    svfloat64_t foo(svfloat64_t a) {
      svbool_t t = svptrue_b64();
      svfloat64_t b = svdup_f64(1.0);
      return svmul_m(t, a, b);
    }

will lower to a nop.

This commit does not capture all possibilities; only the simple cases
described above. There is still room for further optimisation.

Differential Revision: https://reviews.llvm.org/D98033
2021-03-16 14:50:17 +00:00
Craig Topper
a9b14b0ccd [RISCV] Improve i32 UADDSAT/USUBSAT on RV64.
The default promotion uses zero extends that become shifts. We
can use sign extends instead, which is better for RISCV.

I've used two different implementations based on whether we
have minu/maxu instructions.
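
A minimal sketch of why sign-extended operands are enough, written in portable C++ rather than taken from the patch. It assumes one common minu-based expansion, `a + umin(b, ~a)`; the function names are hypothetical. Sign extension preserves the unsigned ordering of 32-bit values and commutes with bitwise NOT, so the low 32 bits of the 64-bit result equal a plain 32-bit saturating add.

```cpp
#include <algorithm>
#include <cstdint>

// Reference: 32-bit unsigned saturating add.
uint32_t uaddsatRef(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;               // wraps modulo 2^32
  return sum < a ? UINT32_MAX : sum;  // a wrapped sum means we must saturate
}

// Hypothetical helper: models an i32 held sign-extended in a 64-bit register.
uint64_t signExtend32(uint32_t x) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(x)));
}

// Saturating add computed entirely on sign-extended 64-bit values.
// Assumption: the a + umin(b, ~a) form; the patch may use a different one.
uint32_t uaddsatSignExtended(uint32_t a, uint32_t b) {
  uint64_t wideA = signExtend32(a);
  uint64_t wideB = signExtend32(b);
  // ~wideA equals signExtend32(~a), and the unsigned 64-bit min picks the
  // same operand the unsigned 32-bit min would pick.
  uint64_t clamped = std::min(wideB, ~wideA);
  return static_cast<uint32_t>(wideA + clamped);  // keep the low 32 bits
}
```

No zero extension (and therefore no shift pair) is needed before the comparison, which is the point the message makes.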

Differential Revision: https://reviews.llvm.org/D98683
2021-03-16 07:44:06 -07:00
Aaron Puchert
b376ef963f Correct Doxygen syntax for inline code
There is no syntax like {@code ...} in Doxygen, @code is a block command
that ends with @endcode, and generally these are not enclosed in braces.
The correct syntax for inline code snippets is @c <code>.
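
As an illustration of the two forms described above, here is a small sketch; the function `isEven` is a made-up example and not part of the change.

```cpp
/// Returns @c true when @c value is even.
///
/// Inline identifiers use the @c command, which applies to the next word.
/// Multi-line snippets use the block form, which is not wrapped in braces:
/// @code
///   bool even = isEven(4);
/// @endcode
bool isEven(int value) { return value % 2 == 0; }
```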

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D98665
2021-03-16 15:17:45 +01:00
LLVM GN Syncbot
24ae2470e3 [gn build] Port 9a5af541ee05 2021-03-16 14:03:53 +00:00
Simon Pilgrim
acf6a1b2db [X86][SSE] canonicalizeShuffleWithBinOps - add PERMILPS/PERMILPD + PERMPD/PERMQ + INSERTPS handling.
Bail if the INSERTPS would introduce zeros across the binop.
2021-03-16 13:52:08 +00:00
RamNalamothu
5a521663a1 [AMDGPU, NFC] Refactor FP/BP spill index code in emitPrologue/emitEpilogue
Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D98617
2021-03-16 19:19:45 +05:30
Simonas Kazlauskas
533dfa60d2 [InstSimplify] Match PtrToInt more directly in a GEP transform (NFC)
In preparation for D98611, the upcoming change will need to apply additional checks to `P` and `V`,
and so this refactor paves the way for adding additional checks in a less awkward way.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98672
2021-03-16 15:45:19 +02:00
David Zarzycki
0a3d22f26f [lit testing] Fix Windows reliability? 2021-03-16 09:11:41 -04:00
Max Kazantsev
4eef8d0048 [Test] Add test with loops guarded by trivial conditions 2021-03-16 19:46:36 +07:00
Max Kazantsev
97be3b8226 [Test] Update auto-generated checks 2021-03-16 19:39:45 +07:00
serge-sans-paille
ee3e917269 [NFC] Use SmallString instead of std::string for the AttrBuilder
This avoids a few unnecessary conversions from StringRef to std::string, and a
bunch of extra allocations thanks to the SmallString.

Differential Revision: https://reviews.llvm.org/D98190
2021-03-16 13:34:14 +01:00
David Zarzycki
8f1e125e6b [llvm-exegesis testing] Workaround unreliable test
Picking an instruction at random is not perfectly reliable.
2021-03-16 08:00:14 -04:00
serge-sans-paille
9381bac6a8 [NFC] Replace loop by idiomatic llvm::find_if 2021-03-16 12:49:19 +01:00
Bjorn Pettersson
5e7e393b90 [TableGen/GlobalISel] Emit MI_predicate custom code for PatFrags (not only PatFrag)
When GlobalISelEmitter::emitCxxPredicateFns emitted code for MI
predicates it used "PatFrag" when searching for definitions. With
this patch it will search for all "PatFrags" instead. Since PatFrag
derives from PatFrags the difference is that we now include all
definitions using PatFrags directly as well. Thus making it possible
to use GISelPredicateCode together with a PatFrags definition.

It might be noted that the matcher code was emitted also for PatFrags
in the past. But then one ended up with errors since the custom code
in testMIPredicate_MI was missing.

Differential Revision: https://reviews.llvm.org/D98486
2021-03-16 12:44:09 +01:00
Sanjay Patel
0e17978276 [SLP] improve readability in reduction logic; NFC
We had 2 different and ambiguously-named 'I' variables.
2021-03-16 07:35:13 -04:00
Markus Böck
1a004a81cb [test] Make sure the test program in GetErrcMessages.cmake exits normally.
If for some reason the test program does not exit normally, it would currently lead to a false positive and its stdout output being assigned to the output variable.

Instead, check the test program exited normally before assigning the process output to the out variable.

Follow up on rGaf2796c76d2ff4b73165ed47959afd35a769beee
Fixes an issue discovered post commit in https://reviews.llvm.org/D98278
2021-03-16 12:22:40 +01:00
Dmitry Preobrazhensky
92405c8fe8 [AMDGPU][MC] Disabled lds_direct for GFX90a
Fixed bug 49382.

Differential Revision: https://reviews.llvm.org/D98626
2021-03-16 13:52:36 +03:00
Markus Böck
848f365839 [test][NFC] Minor formatting and comment adjustments in GetErrcMessages.cmake
These changes address post-commit review comments discussed in https://reviews.llvm.org/D98278
2021-03-16 11:08:57 +01:00
David Zarzycki
643090aa23 [lit] Sort test start times based on prior test timing data
Lit as it exists today has three hacks that allow users to run tests earlier:

1) An entire test suite can set the `is_early` boolean.
2) A very recently introduced "early_tests" feature.
3) The `--incremental` flag forces failing tests to run first.

All of these approaches have problems.

1) The `is_early` feature was until very recently undocumented. Nevertheless it still lacks testing and is an imprecise way of optimizing test starting times.
2) The `early_tests` feature requires manual updates and doesn't scale.
3) `--incremental` is undocumented, untested, and it requires modifying the *source* file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks).

This patch attempts to simplify and address all of the above problems.

This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so.

This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag.

Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite.
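
The reordering idea can be sketched as follows. This is not lit's implementation (lit is written in Python). The `TestRecord` type and the slowest-first tie-break are assumptions made for illustration; the message only states that failing tests run first and that tests are reordered from recorded timings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record of what was observed during the previous run.
struct TestRecord {
  std::string name;
  double previousSeconds;
  bool previouslyFailed;
};

// Orders tests for the next run: previous failures first, then (as an
// assumption) the slowest tests, so long tests start early on many cores.
void orderForNextRun(std::vector<TestRecord>& tests) {
  std::stable_sort(tests.begin(), tests.end(),
                   [](const TestRecord& a, const TestRecord& b) {
                     if (a.previouslyFailed != b.previouslyFailed)
                       return a.previouslyFailed;
                     return a.previousSeconds > b.previousSeconds;
                   });
}
```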

Reviewed By: jhenderson, yln

Differential Revision: https://reviews.llvm.org/D98179
2021-03-16 05:23:04 -04:00
serge-sans-paille
39f31f66e4 [NFC] Wisely nest dyn_cast in FunctionLoweringInfo
Take advantage of the inheritance tree to avoid a few comparisons.
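
A rough sketch of the nesting idea, using standard `dynamic_cast` and a made-up class hierarchy in place of LLVM's `dyn_cast` and instruction classes (LLVM's casts rely on `classof` rather than RTTI, so this only mirrors the control flow):

```cpp
#include <string>

// Hypothetical hierarchy standing in for LLVM's instruction classes.
struct Node { virtual ~Node() = default; };
struct Call : Node {};
struct DirectCall : Call {};
struct IndirectCall : Call {};
struct Load : Node {};

// Flat version: every non-call node pays for two failed casts.
std::string classifyFlat(const Node& n) {
  if (dynamic_cast<const DirectCall*>(&n)) return "direct call";
  if (dynamic_cast<const IndirectCall*>(&n)) return "indirect call";
  return "other";
}

// Nested version: one cast to the shared base rejects all non-calls, and
// the leaf checks only run for nodes that are known to be calls.
std::string classifyNested(const Node& n) {
  if (const auto* call = dynamic_cast<const Call*>(&n)) {
    if (dynamic_cast<const DirectCall*>(call)) return "direct call";
    if (dynamic_cast<const IndirectCall*>(call)) return "indirect call";
  }
  return "other";
}
```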
2021-03-16 10:22:44 +01:00
Caroline Concatto
cb0c62b65c [SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse
This patch adds support for reverse loop vectorization.
It is possible to vectorize the following loop:
```
  for (int i = n-1; i >= 0; --i)
    a[i] = b[i] + 1.0;
```
with fixed or scalable vector.
The loop-vectorizer will use 'reverse' on the loads/stores to make
sure the lanes themselves are also handled in the right order.
This patch adds support for scalable vectors to the IRBuilder interface for
creating a reverse vector. The IR function
CreateVectorReverse lowers to experimental.vector.reverse for scalable vectors
and keeps the original behavior for fixed vectors using a reverse shuffle.
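
To make the role of 'reverse' concrete, here is a scalar C++ model of what a fixed-width vectorized form of the loop does. It is a sketch under assumptions: the width `VF = 4`, the `double` element type, and the function name are chosen for illustration, and real scalable vectors have a width that is only known at run time.

```cpp
#include <algorithm>
#include <array>

constexpr int VF = 4;  // assumed vector width, for illustration only

// Models: for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0;
void addOneReverseChunked(double* a, const double* b, int n) {
  int i = n - 1;
  for (; i - (VF - 1) >= 0; i -= VF) {
    const int first = i - (VF - 1);  // lowest index of this contiguous block
    std::array<double, VF> lanes;
    // Contiguous load followed by a reverse: lane 0 now holds b[i].
    std::reverse_copy(b + first, b + first + VF, lanes.begin());
    for (double& lane : lanes) lane += 1.0;
    // Reverse again before the contiguous store, so a[i] receives lane 0.
    std::reverse_copy(lanes.begin(), lanes.end(), a + first);
  }
  // Scalar remainder for the elements that do not fill a whole block.
  for (; i >= 0; --i) a[i] = b[i] + 1.0;
}
```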

Differential Revision: https://reviews.llvm.org/D95363
2021-03-16 07:51:59 +00:00
Amara Emerson
7917eb608e [AArch64][GlobalISel] Fix crash on lowering <1 x half> types. 2021-03-15 23:27:43 -07:00
wlei
6bd02e9a2a [CSSPGO][llvm-profgen] Fix getCanonicalFnName usage in llvm-profgen
Previously, llvm-profgen did not support keeping the unique linkage name (-funique-internal-linkage-name). As discussed in https://reviews.llvm.org/D96932, we chose to canonicalize it.

Now since "selected" is set as the default parameter of getCanonicalFnName in `D96932`, we don't need to add any attribute here for the previous usage and only fix the missing usage in the pseudo probe decoding.

Differential Revision: https://reviews.llvm.org/D98226
2021-03-15 21:00:42 -07:00
Johannes Doerfert
60c629d360 [NVPTX] CUDA does provide malloc/free since compute capability 2.X
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D98606
2021-03-15 22:45:56 -05:00
Josh Berdine
984ad28c40 [OCaml][test] Fix Bindings/OCaml/executionengine.ml test
It seems that at some point it became necessary to pass `-thread` to
the ocaml compiler for this test.

Differential Revision: https://reviews.llvm.org/D98593
2021-03-16 02:48:36 +00:00
LLVM GN Syncbot
b19c601bda [gn build] Port 4f198b0c27b0 2021-03-16 02:41:16 +00:00
Bing1 Yu
ab2b029d8f [X86] Pass to transform amx intrinsics to scalar operation.
This pass is always scheduled, but we skip it when the optimization level
is not O0 and the function doesn't have the optnone attribute. With -O0,
the def of the shape passed to the amx intrinsics is near the amx
intrinsics code. We are not able to find a point which post-dominates all
the shape defs and dominates all the amx intrinsics. To decouple the
dependency on the shape, we transform amx intrinsics to scalar operations,
so that compiling doesn't fail. In the long term, we should improve fast
register allocation to allocate amx registers.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D93594
2021-03-16 10:40:22 +08:00
David Blaikie
11757a1309 Skip path separators to make the test portable across Win/Linux 2021-03-15 18:24:40 -07:00
Petr Hosek
fb12fb1321 [CMake] Clean up unnecessary dependency
The LINK_COMPONENTS dependency between DebugInfoCodeView and
DebugInfoMSF is unnecessary. Breaking them would allow a more
fine-controlled distribution.

Patch By: dangyi

Differential Revision: https://reviews.llvm.org/D98465
2021-03-15 16:29:16 -07:00
LLVM GN Syncbot
4791deff1c [gn build] Port ecf6466f01c5 2021-03-15 23:01:19 +00:00
Lang Hames
4b389e1c7b [JITLink][MachO][x86-64] Introduce generic x86-64 support.
This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64
backend to use these edge kinds. This simplifies the implementation of the
MachO/x86-64 backend and makes it possible to write generic x86-64 passes and
utilities.

The new edge kinds are different from the original set used in the MachO/x86-64
backend. Several edge kinds that were not meaningfully distinguished in that
backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in
the new scheme (these edge kinds can be reintroduced later if we find a use for
them). At the same time, new edge kinds have been introduced to convey extra
information about the state of the graph. E.g. The Request*AndTransformTo**
edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP
entries, and the 'Relaxable' suffix distinguishes edges that are candidates for
optimization from edges which should be left as-is (e.g. to enable runtime
redirection).

ELF/x86-64 will be refactored to use these generic edges at some point in the
future, and I anticipate a similar refactor to create a generic arm64 support
header too.

Differential Revision: https://reviews.llvm.org/D98305
2021-03-15 15:43:07 -07:00
Nico Weber
b7c9aa8059 [gn build] merge af2796c76d2f a bit more
The default is fine on non-Win, but on Win this needs an explicit
setting now that lit no longer has the right default.
2021-03-15 18:20:54 -04:00
Alexander Yermolovich
0440613b8e [DWARF] Check for AddrOffsetSectionBase to work with DWO Units.
Context: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148521.html

A fix for llvm-symbolizer, and other tools like BOLT, that allows retrieving addresses when the binary is built in -gsplit-dwarf=single mode.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D96827
2021-03-15 14:46:09 -07:00
Artem Belevich
7a17da6eb6 [NVPTX] Avoid temp copy of byval kernel parameters.
Avoid making a temporary copy of byval argument if all accesses are loads and
therefore the pointer to the parameter can not escape.

This avoids excessive global memory accesses when each kernel makes its own
copy.

Differential revision: https://reviews.llvm.org/D98469
2021-03-15 14:27:22 -07:00
Nick Lewycky
40de5cf96a NFC: Formatting changes.
Run clang-format over these files.

Capitalize some variable names per clang-tidy's request.

Pulled out to simplify review of D98302.
2021-03-15 14:26:39 -07:00
Stanislav Mekhanoshin
17050632cf [AMDGPU] Fix copyPhysReg to not produce unaligned vgpr access
RA can insert something like a sub1_sub2 COPY of a wide VGPR
tuple which results in an unaligned access with v_pk_mov_b32
after the copy is expanded. This is a regression after D97316.

Differential Revision: https://reviews.llvm.org/D98549
2021-03-15 14:14:30 -07:00
Florian Hahn
1aed9ced3b [AnnotationRemarks] Remove unneeded Function.h include (NFC). 2021-03-15 21:09:35 +00:00
Nico Weber
19f76964c4 [gn build] merge 9bcf0eff99 2021-03-15 17:05:05 -04:00
Nico Weber
f75a03ef22 [gn build] kind of merge af2796c76d2f
Good enough for now. If we need more, we'll do the usual
platform-dependent hardcoding that in practice works for everything else
too.
2021-03-15 17:01:00 -04:00
Stanislav Mekhanoshin
fc6febe595 [AMDGPU] Fixed msan failure with uninitialized value 2021-03-15 13:58:19 -07:00
Kirill Bobyrev
da5206f100 [clangd] Optionally add reflection for clangd-index-server
This was originally landed without the optional part and reverted later:

8080ea4c4b

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D98404
2021-03-15 21:07:25 +01:00
Markus Böck
9badf35f5b Revert line accidentally included in af2796c76d2ff4b73165ed47959afd35a769beee 2021-03-15 21:03:46 +01:00
Sanjay Patel
677d887642 [SLP] update stale test comments; NFC
These bugs were fixed with 0a8e7ca402eb
2021-03-15 16:02:46 -04:00
Stanislav Mekhanoshin
196e7f3138 [AMDGPU] Use single cache policy operand
Replace individual operands GLC, SLC, and DLC with a single cache_policy
bitmask operand. This will reduce the number of operands in MIR and I hope
the amount of code. These operands are mostly 0 anyway.

An additional advantage is that the parser will accept these flags in any
order, unlike now.
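
A small sketch of the bitmask idea, not the AMDGPU parser itself. The bit positions, the lowercase modifier spellings, and the `parseCachePolicy` helper are assumptions for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical bit assignments for the combined cache-policy operand.
enum CachePolicy : uint32_t { GLC = 1u << 0, SLC = 1u << 1, DLC = 1u << 2 };

// Parses whitespace-separated modifiers into one bitmask. Because each
// modifier just ORs in its bit, the order of the modifiers does not matter.
// Returns false when an unknown modifier is seen.
bool parseCachePolicy(const std::string& text, uint32_t& policy) {
  policy = 0;
  std::istringstream in(text);
  std::string token;
  while (in >> token) {
    if (token == "glc") policy |= GLC;
    else if (token == "slc") policy |= SLC;
    else if (token == "dlc") policy |= DLC;
    else return false;
  }
  return true;
}
```

With separate operands a parser typically expects each modifier at a fixed position; with a single mask, "glc slc" and "slc glc" produce the same value.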

Differential Revision: https://reviews.llvm.org/D96469
2021-03-15 13:00:59 -07:00
Markus Böck
1be4884f17 [test] Add ability to get error messages from CMake for errc substitution
Visual Studio's implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code, unlike other standard libraries such as libstdc++ or libc++ that might be used.

This patch adds a CMake script that runs a C++ program to get the error messages for the POSIX error codes and passes them on to lit through an optional config parameter.

If the config parameter is not set, or getting the messages failed (due to, say, a cross-compiling configuration without an emulator), it will fall back to using Python's strerror function.
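
The kind of query such a test program makes can be sketched like this. It is not the program from the patch, and the helper names are hypothetical. It only shows the two sources of message text that can disagree depending on the standard library in use.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <system_error>

// Message the C++ standard library reports for a POSIX error condition.
std::string errcMessage(std::errc code) {
  return std::make_error_code(code).message();
}

// Message the C library reports for the matching errno value.
std::string strerrorMessage(int errnoValue) {
  return std::strerror(errnoValue);
}
```

Comparing `errcMessage(std::errc::no_such_file_or_directory)` with `strerrorMessage(ENOENT)` on a given toolchain shows whether the two agree there, which is why the expected strings are obtained from a compiled program rather than hardcoded.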

Differential Revision: https://reviews.llvm.org/D98278
2021-03-15 20:56:08 +01:00