1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

217363 Commits

Author SHA1 Message Date
Matt Arsenault
6c04830c65 AMDGPU: Fix infinite loop in DAG combine with fneg + fma
We were not reporting isFNegFree for v2f32, although it is effectively
free after legalization. The generic combine was pulling fneg out of
the fma source operands, and the AMDGPU combine was doing the
opposite.
2021-06-18 19:09:03 -04:00
Matt Arsenault
bf6259aa0b AMDGPU: Fix assert on m0_lo16/m0_hi16
These get added (redundantly) to the bundle expanded for indirect
register accesses. We hit this path only when there is a call in the
function.
2021-06-18 18:48:53 -04:00
Hongtao Yu
7fbb587058 [CSSPGO] Undoing the concept of dangling pseudo probe
As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the danling probe related code in both the compiler and llvm-profgen.

I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Ciner. Certain benchmark such as 602.gcc has a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104477
2021-06-18 15:14:11 -07:00
Nikita Popov
7afbecd70d [LoopUnroll] Simplify optimization remarks
Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and
debug code. For debug code, print information about all exits.
For optimization remarks, only include the unroll count and the
type of unroll (complete, partial or runtime), but omit detailed
information about exit folding, now that more than one exit may
be folded.

Differential Revision: https://reviews.llvm.org/D104482
2021-06-18 23:47:03 +02:00
Hongtao Yu
d9e9fe620c [CSSPGO][llvm-profgen] Fix an issue in findDisjointRanges
We were using 0 as an indicator of invalid offset when computing disjoint ranges. In reality, 0 can be an valid code offset which stands for the first function in .text section. I'm using UINT64_MAX as an invalid code offset instead.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104497
2021-06-18 14:38:48 -07:00
Nick Desaulniers
de2155141d [GCOVProfiling] don't profile Fn's w/ noprofile attribute
Similar to D104475, the Linux kernel would like to avoid compiler
generated code in certain functions. The no_profile function
attribute can be used in C to generate the the noprofile fn attr in IR.
Respect that from GCOVProfiling.

Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D104257
2021-06-18 13:58:34 -07:00
Craig Topper
a2969894c5 [RISCV] Teach vsetvli insertion to remember when predecessors have same AVL and SEW/LMUL ratio if their VTYPEs otherwise mismatch.
Previously we went directly to unknown state on VTYPE mismatch.
If we instead remember the partial match, we can use this to
still use X0, X0 vsetvli in successors if AVL and needed SEW/LMUL
ratio match.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D104069
2021-06-18 12:16:07 -07:00
Hongtao Yu
f2d99918a7 [CSSPGO][llvm-profgen] Ignore LBR records after interrupt transition
If we have seen an inwards transition from external code to internal code, but not a following outwards transition, the inwards transition is likely due to interrupt which is usually unpaired. Ignore current  and subsequent entries since they are likely from an unrelated pre-interrupt context.

LBR records from different interrupt context are unrelated and they should not be mixed together. Currenlty the OS does this for task-scheduling interrupt but not for all interrupts.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D104276
2021-06-18 12:13:53 -07:00
Anshil Gandhi
7f0aeb0ac8 [AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask
Implemented the transformation of xor (llvm.amdgcn.class x, mask), -1 into
llvm.amdgcn.class(x, ~mask). Added LIT tests as well.

Differential Revision: https://reviews.llvm.org/D104049
2021-06-18 13:04:12 -06:00
Hongtao Yu
89f841e69b [CSSPGO] Fix an invalid hash table reference issue in the CS preinliner.
We were using a `StringMap` object to store all profiles to be emitted. The object is basically an unordered hash table, therefore updating it in the process of trasvering it may cause issue since the underlying bucket array could change.

I'm also moving the `csspgo-preinliner` switch around so that no context tri will be constructed (by the constructor of `CSPreInliner`) when the switch is off.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D104267
2021-06-18 11:54:23 -07:00
Andrew Browne
d75f39d6d7 [DFSan] Cleanup code for platforms other than Linux x86_64.
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.

Reviewed By: gbalats, stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104481
2021-06-18 11:21:46 -07:00
Krzysztof Parzyszek
f7732c65f0 Revert "Delay initialization of OptBisect"
This reverts commit ec91df8d8195b8b759a89734dba227da1eaa729f.

It was committed by accident.
2021-06-18 13:16:45 -05:00
Krzysztof Parzyszek
99d2423f69 XFAIL a testcase on Hexagon (missing-abstract-variable.ll)
This seems to be a common problem among several architectures.
2021-06-18 13:15:19 -05:00
Krzysztof Parzyszek
de5add3739 Delay initialization of OptBisect
When LLVM is used in other projects, it may happen that global cons-
tructors will execute before the call to ParseCommandLineOptions.
Since OptBisect is initialized via a constructor, and has no ability
to be updated at a later time, passing "-opt-bisect-limit" to the
parse function may have no effect.

To avoid this problem use a cl::cb (callback) to set the bisection
limit when the option is actually processed.

Differential Revision: https://reviews.llvm.org/D104551
2021-06-18 13:15:19 -05:00
Jingu Kang
aacb03ff49 [AArch64] Add TableGen patterns to generate uaddlv
uaddv(uaddlp(x)) ==> uaddlv(x)
addp(uaddlp(x))  ==> uaddlv(x)

Differential Revision: https://reviews.llvm.org/D104236
2021-06-18 17:23:26 +01:00
Saleem Abdulrasool
daedfc6daf RISCV: simplify a test case for RISCV (NFCI)
The output of the object file is unimportant and entirely discarded.
Simply redirect the output to `/dev/null` or `NUL` as the case may be.
Additionally, the space between the labels is unimportant.  There is no
need to add space between the labels.  Two labels at the same address
are sufficient to generate the difference expression and should still
test the same behaviour.
2021-06-18 08:19:16 -07:00
Simon Pilgrim
ccd7a98f97 [DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector knownbits. NFCI.
As noticed on D104472 we can use APInt::insertBits which will avoid a lot of temporary APInt creations
2021-06-18 14:59:01 +01:00
Lang Hames
d2647ecc04 [ORC][C-bindings] Re-order object transform function arguments.
ObjInOut is an in-out parameter not a return value argument, so by convention
it should come after the context value (Ctx).
2021-06-18 22:12:39 +10:00
Lang Hames
eb023bd323 [ORC] Use uint8_t rather than char for RPC wrapper-function calls.
This partially reverts 838490de7ed, which broke some Solaris bots. Apparently
Solaris defines int8_t as char rather than signed char, which made the
SerializationTypeName<char> specialization a redefinition.

This partial revert isolates use of uint8_t buffers to ORC-RPC handling of
wrapper functions only. The TargetProcessControl::runWrapper method will
continue to use char buffers.
2021-06-18 21:56:09 +10:00
Lang Hames
04c9dcdc5d [ORC] Add support for dumping objects to the C API.
Provides ObjectTransformLayer APIs, a getter to access the
ObjectTransformLayer member of LLJIT, and the DumpObjects utility
to make construction of a dump-to-disk transform easy.

An example showing how the new APIs can be used has been added in
llvm/examples/OrcV2Examples/OrcV2CBindingsDumpObjects.
2021-06-18 20:56:45 +10:00
Liqiang Tao
1584f0233e Revert D104028 "[llvm][Inliner] Add an optional PriorityInlineOrder" 2021-06-18 18:52:00 +08:00
Max Kazantsev
6b59cab001 [LoopDeletion] Break backedge if we can prove that the loop is exited on 1st iteration (try 3)
This patch handles one particular case of one-iteration loops for which SCEV
cannot straightforwardly prove BECount = 1. The idea of the optimization is to
symbolically execute conditional branches on the 1st iteration, moving in topoligical
order, and only visiting blocks that may be reached on the first iteration. If we find out
that we never reach header via the latch, then the backedge can be broken.

This implementation uses InstSimplify. SCEV version was rejected due to high
compile time impact.

Differential Revision: https://reviews.llvm.org/D102615
Reviewed By: nikic
2021-06-18 17:31:57 +07:00
Haojian Wu
2192be141f [Attributor] Fix UB behavior on uninitalized bool variables.
Found by ASAN.
2021-06-18 11:49:42 +02:00
Jay Foad
58f1e523e1 [AMDGPU] Update generated checks. NFC. 2021-06-18 10:49:02 +01:00
Daniil Seredkin
ce9c5da2dc [InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)
InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.

Differential Revision: https://reviews.llvm.org/D104193
2021-06-18 16:28:06 +07:00
Max Kazantsev
bc696a57a2 [Test] Add XFAIL unit test for PR50765 2021-06-18 16:25:42 +07:00
Liqiang Tao
921bbbb45e [llvm][Inliner] Add an optional PriorityInlineOrder
This patch adds an optional PriorityInlineOrder, which uses the heap to order inlining.
The callsite which size is smaller would have a higher priority.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D104028
2021-06-18 16:55:38 +08:00
Max Kazantsev
7e6d9eb18f [NFC] Assert non-zero factor before division
This is to ensure that zero denominator leads to controlled
assertion failure rather than UB.
2021-06-18 15:50:50 +07:00
Haojian Wu
a8d4c2a73b [Attributor] Don't print the call-graph in a hard-coded file.
This looks like not a practical pattern in our codebase (it could fail
in some sandbox environement).

Instead we print it via standard output, and it is controled by the
-attributor-print-call-graph, this follows a similiar pattern of attributor-print-dep.
2021-06-18 09:38:07 +02:00
Tomasz Miąsko
da9f3ec14e [Demangle][Rust] Parse dot suffix
Allow mangled names to include an arbitrary dot suffix, akin to vendor
specific suffix in Itanium mangling.

Primary motivation is a support for symbols renamed during ThinLTO
import / promotion (ThinLTO is the default configuration for optimized
builds in rustc).

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D104358
2021-06-18 09:29:45 +02:00
Daniil Seredkin
2f87452581 Revert "[InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)"
This reverts commit 31053338c97b36ffb582f9c04a74cec69cce3e70.
2021-06-18 14:21:02 +07:00
Daniil Seredkin
acf8b9a690 [InstCombine] Fold (sext bool X) * (sext bool X) to zext (and X, X)
InstCombine didn't perform (sext bool X) * (sext bool X) --> zext (and X, X) which can result in just (zext X). The patch adds regression tests to check this transformation and adds a check for equality of mul's operands for that case.

Differential Revision: https://reviews.llvm.org/D104193
2021-06-18 14:12:00 +07:00
Fangrui Song
b331e4f4c1 Revert D103717 "[InstrProfiling] Make __profd_ unconditionally private for ELF"
This reverts commit 76d0747e0807307780ba84cbd7e5c80b20c26bd7.

If a group has `__llvm_prf_vals` due to static value profiler counters
(`NS!=0`), we cannot make `__llvm_prf_data` private, because a prevailing text
section may reference `__llvm_prf_data` and will cause a `relocation refers to a
discarded section` linker error.

Note: while a `__profc_` group is non-prevailing, it may be referenced by a
prevailing text section due to inlining.

```
group section [   66] `.group' [__profc__ZN5clang20EmitClangDeclContextERN4llvm12RecordKeeperERNS0_11raw_ostreamE] contains 4 sections:
   [Index]    Name
   [   67]   __llvm_prf_cnts
   [   68]   __llvm_prf_vals
   [   69]   __llvm_prf_data
   [   70]   .rela__llvm_prf_data
```
2021-06-17 23:38:17 -07:00
Johannes Doerfert
363cd23ad2 [Attributor][FIX] Arguments of unknown functions can be undef
This should fix PR50683. The wrong assumption was that we
could always know what the callee is when we replace a call site
argument with undef. We wanted to know that to remove the `noundef`
that might be attached to the argument. Since no callee means we
did the propagation on the caller site, there is no need to remove
an attribute. It is only needed if we replace all uses and therefore
pass `undef` instead of the value that was passed in otherwise.
2021-06-18 01:07:53 -05:00
Johannes Doerfert
4b0523df3a [Attributor] Allow to skip the initial update for a new AA
Users might want to run initialize for a set of AAs without an
intermediate update step. Running update eagerly is not a requirement
anyway so we make it optional.
2021-06-18 01:07:53 -05:00
Johannes Doerfert
c50c80812f [Attributor] Use a centralized value simplification interface
To allow outside AAs that simplify values we need to ensure all value
simplification goes through the Attributor, not AAValueSimplify (or any
of the other AAs we have already like AAPotentialValues). This patch
also introduces an interface for the outside AAs to register
simplification callbacks for an IRPosition. To make this work as
expected we have to pass IRPositions instead of Values in
AAValueSimplify, which makes sense by itself.
2021-06-18 01:07:53 -05:00
Johannes Doerfert
73f1949d65 [Attributor] Introduce a helper do deal with constant type mismatches
If we simplify values we sometimes end up with type mismatches. If the
value is a constant we can often cast it though to still allow
propagation. The logic is now put into a helper and it replaces some
ad hoc things we did before.

This also introduces the AA namespace for abstract attribute related
functions and types.

Differential Revision: https://reviews.llvm.org/D103856
2021-06-18 01:07:52 -05:00
Johannes Doerfert
99cca18714 [Attributor] Make sure Heap2Stack works properly on a GPU target
If the target stack is not accessible between different running
"threads" we have to make sure not to create allocas for mallocs
that might be used by multiple "threads". The "use check" is
sufficient to prevent this but if we apply the "free check" we have
to make sure the pointer is not communicated to others before
the free is reached.

Differential Revision: https://reviews.llvm.org/D98608
2021-06-18 01:07:52 -05:00
Johannes Doerfert
1d462a0452 [OpenMP][NFC] Expose AAExecutionDomain and rename its getter
The initial use for AAExecutionDomain was to determine if a single
thread executes a block. While this is sometimes informative most
of the time, and for other reasons, we actually want to know if it
is the "initial thread". Thus, the thread that started execution on
the current device. The deduction needs to be adjusted in a follow
up as the methods we use right not are looking for the OpenMP thread
id which is resets whenever a thread enters a parallel region. What
we basically want is to look for `llvm.nvvm.read.ptx.sreg.ntid.x` and
equivalent functions.
2021-06-18 01:07:52 -05:00
Johannes Doerfert
4c4da5cf4e [Attributor][NFC] Add test from PR49606
It is not clear to me how we fixed this, I reverted a few candidates but
I couldn't make the test fail. Still worth having it in our regression
suite.
2021-06-18 01:07:52 -05:00
Johannes Doerfert
b3f7c318e7 [Attributor][NFC] Precommit a set of test cases for load simplification 2021-06-18 01:07:51 -05:00
Johannes Doerfert
ffbe2b6b3c [Attributor][NFC] AAReachability is currently stateless, don't invalidate it
We invalidated AAReachabilityImpl directly which is not helpful and
confusing as we still used it regardless. We now avoid invalidating it
(not needed anyway) and add checks for the state. This has by itself no
actual effect but prepares for later extensions.
2021-06-18 01:07:51 -05:00
George Balatsouras
d2b759a121 [dfsan] Replace dfs$ prefix with .dfsan suffix
The current naming scheme adds the `dfs$` prefix to all
DFSan-instrumented functions.  This breaks mangling and prevents stack
trace printers and other tools from automatically demangling function
names.

This new naming scheme is mangling-compatible, with the `.dfsan`
suffix being a vendor-specific suffix:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-structure

With this fix, demangling utils would work out-of-the-box.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104494
2021-06-17 22:42:47 -07:00
Luke
1271774638 [RISCV] Don't enable Interleaved Access Vectorization
The patch https://reviews.llvm.org/D101469 is intended to enable loop unrolling,
not interleaved access vectorization. The method bool enableInterleavedAccessVectorization()
should not be implemented.
2021-06-18 12:32:30 +08:00
Daniil Seredkin
981cf746b0 [InstCombine][NFC] Added tests for mul with zext/sext operands
Baseline tests for D104193
2021-06-18 11:14:50 +07:00
Igor Kudrin
ca414c6429 [objdump][ARM] Fix evaluating the target address of a Thumb BLX(i)
The instruction can be 16-bit aligned while targeting 32-bit aligned
code. To calculate the target address correctly, the address of the
instruction has to be adjusted.

Differential Revision: https://reviews.llvm.org/D104446
2021-06-18 10:40:55 +07:00
Carl Ritson
b219ce9cea [AMDGPU] Remove duplicate setOperationAction for v4i16/v4f16 (NFC) 2021-06-18 12:38:54 +09:00
Heejin Ahn
f7b0205560 [WebAssembly] Rename event to tag
We recently decided to change 'event' to 'tag', and 'event section' to
'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.

See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D104423
2021-06-17 20:34:19 -07:00
Xun Li
2576f49320 [Coroutine] Properly deal with byval and noalias parameters
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=48857.
Previous attempts can be found in D104007 and D101980.
A lot of discussions can be found in those two patches.
To summarize the bug:
When Clang emits IR for coroutines, the first thing it does is to make a copy of every argument to the local stack, so that uses of the arguments in the function will all refer to the local copies instead of the arguments directly.
However, in some cases we find that arguments are still directly used:
When Clang emits IR for a function that has pass-by-value arguments, sometimes it emits an argument with byval attribute. A byval attribute is considered to be local to the function (just like alloca) and hence it can be easily determined that it does not alias other values. If in the IR there exists a memcpy from a byval argument to a local alloca, and then from that local alloca to another alloca, MemCpyOpt will optimize out the first memcpy because byval argument's content will not change. This causes issues because after a coroutine suspension, the byval argument may die outside of the function, and latter uses will lead to memory use-after-free.
This is only a problem for arguments with either byval attribute or noalias attribute, because only these two kinds are considered local. Arguments without these two attributes will be considered to alias coro_suspend and hence we won't have this problem. So we need to be able to deal with these two attributes in coroutines properly.
For noalias arguments, since coro_suspend may potentially change the value of any argument outside of the function, we simply shouldn't mark any argument in a coroutiune as noalias. This can be taken care of in CoroEarly pass.
For byval arguments, if such an argument needs to live across suspensions, we will have to copy their value content to the frame, not just the pointer.

Differential Revision: https://reviews.llvm.org/D104184
2021-06-17 19:06:10 -07:00
Jim Lin
c8807d8093 [M68k][NFC] Fix indentation in M68kInstrArithmetic.td
Merely fix indentation

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D104434
2021-06-18 09:49:04 +08:00