mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00
213277 Commits

Author SHA1 Message Date
Nikita Popov
56b9e804f2 [ValueTracking] Handle non-zero shl recurrence
In this case we don't care about the step at all, and only require
that the starting value is non-zero.
2021-03-26 18:39:06 +01:00
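The reasoning in 56b9e804f2 can be checked exhaustively at a small bit width. The sketch below assumes the shift carries a no-unsigned-wrap guarantee, which the commit message does not state explicitly. The 8-bit width and all function names are illustrative, not LLVM APIs.

```python
BITS = 8                 # illustrative width, small enough to check exhaustively
MASK = (1 << BITS) - 1

def shl_nuw(value, amount):
    """Shift left; return None if a set bit would be shifted out (unsigned wrap)."""
    shifted = value << amount
    return shifted if shifted <= MASK else None

def recurrence_reaches_zero(start, step, max_iterations=BITS):
    """Follow x = shl nuw x, step from `start` and report whether x ever becomes 0."""
    current = start
    for _ in range(max_iterations):
        current = shl_nuw(current, step)
        if current is None:      # wrapped: the no-wrap assumption no longer holds
            return False
        if current == 0:
            return True
    return False

def any_nonzero_start_reaches_zero():
    """Try every non-zero start with every in-range shift amount."""
    return any(
        recurrence_reaches_zero(start, step)
        for start in range(1, MASK + 1)
        for step in range(BITS)
    )
```

Under that assumption the step never matters: a non-zero start stays non-zero. Without the no-wrap guarantee the claim fails, for example `(0x80 << 1) & MASK` is 0.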
Nikita Popov
07f0265381 [ValueTracking] Add tests for non-zero shl recurrences (NFC) 2021-03-26 18:35:38 +01:00
Nikita Popov
0926f8f2a9 [ValueTracking] Handle non-zero add/mul recurrences more precisely
This is mainly for clarity: It doesn't make sense to do any
negative/positive checks when dealing with a nuw add/mul. These
only make sense for nsw add/mul.
2021-03-26 18:30:07 +01:00
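The nuw/nsw distinction drawn in 0926f8f2a9 can be illustrated with a one-step check at 8 bits. This is a sketch under assumptions: the same-sign condition for nsw add is my reading of "negative/positive checks", and the helper names are hypothetical.

```python
BITS = 8
UMAX = (1 << BITS) - 1
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

def nuw_add_counterexamples():
    """Non-zero start plus any step, with no unsigned wrap, that still yields 0."""
    return [
        (start, step)
        for start in range(1, UMAX + 1)
        for step in range(UMAX + 1)
        if start + step <= UMAX and start + step == 0
    ]

def nsw_add_counterexamples(require_same_sign):
    """Non-zero start plus step, with no signed wrap, that yields 0."""
    found = []
    for start in range(SMIN, SMAX + 1):
        if start == 0:
            continue
        for step in range(SMIN, SMAX + 1):
            if require_same_sign and (start < 0) != (step < 0):
                continue
            total = start + step
            if SMIN <= total <= SMAX and total == 0:
                found.append((start, step))
    return found
```

For nuw add no sign check is needed because the sum can only grow. For nsw add, a step of the opposite sign can walk back to zero, e.g. `(1, -1)`.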
Nikita Popov
eb6c470c5f [ValueTracking] Add more non-zero add/mul recurrence tests (NFC) 2021-03-26 18:30:07 +01:00
Vaivaswatha Nagaraj
2e0c6019a3 [OCaml][DebugInfo][Test] Disable debuginfo tests as they fail on some machines 2021-03-26 22:56:38 +05:30
Simon Pilgrim
c49a846dbf [X86][AVX] combineHorizOpWithShuffle - improve SHUFFLE(HOP(LOSUBVECTOR(X),HISUBVECTOR(X))) folding
Peek through bitcasts to find subvector splits and use getTargetShuffleInputs to decode target shuffles as well as ShuffleVectorSDNode
2021-03-26 17:23:54 +00:00
Vaivaswatha Nagaraj
961e941c82 [OCaml][Test] Do not use Option, expand using match
Option seems to be unsupported on the buildbot version
of OCaml. So expand the statements using a match.

Fixes buildbot failure due to
c244cd7217
2021-03-26 22:41:29 +05:30
Florian Hahn
0e377c5502 [BasicAA] Add a few more interesting modulo tests. 2021-03-26 16:56:49 +00:00
Vaivaswatha Nagaraj
3a5a816a59 [OCaml][DebugInfo] Add tests for debug info API
In the process of adding the tests, several bugs were
found in the implementation and interface of the API
and they were fixed.

Some utilities from the core tests (core.ml) were moved
into a separate file for reuse.

The following new functions have been added:
`dibuild_create_global_variable_expression`,
`dibuild_create_constant_value_expression` and
`llmetadata_null`. The third one already existed but
is now exposed publicly.

Differential Revision: https://reviews.llvm.org/D99403
2021-03-26 22:06:48 +05:30
Aleksandr Platonov
64d851e46c [CMake][gRPC] Fix a typo in protobuf version variable name
Without this patch CMake log contains `Using protobuf` instead of `Using protobuf <version>`.

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D99405
2021-03-26 19:33:06 +03:00
Jay Foad
f21bfff407 [AMDGPU] Use reductions instead of scans in the atomic optimizer
If the result of an atomic operation is not used then it can be more
efficient to build a reduction across all lanes instead of a scan. Do
this for GFX10, where the permlanex16 instruction makes it viable. For
wave64 this saves a couple of dpp operations. For wave32 it saves one
readlane (which are generally bad for performance) and one dpp
operation.

Differential Revision: https://reviews.llvm.org/D98953
2021-03-26 15:38:14 +00:00
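The difference between a scan and a reduction in f21bfff407 can be sketched in plain Python. This models lanes as a list and is not AMDGPU code; the pairwise tree and the function names are assumptions chosen for illustration.

```python
from itertools import accumulate
from operator import add

def inclusive_scan(lanes):
    """Per-lane prefix sums: lane i receives the sum of lanes 0..i."""
    return list(accumulate(lanes, add))

def tree_reduce(lanes):
    """Pairwise reduction to a single total; assumes a power-of-two lane count."""
    values = list(lanes)
    while len(values) > 1:
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]

wave32 = list(range(32))          # hypothetical per-lane inputs
total = tree_reduce(wave32)       # all that is needed when the result is unused
prefixes = inclusive_scan(wave32) # needed only when each lane wants its own result
```

The last element of the scan equals the reduction, so when no lane consumes its own prefix, computing only the total is enough.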
Florian Hahn
00978dd33d [BasicAA] Add a few cases with overflows in index computations.
This patch adds a few test cases where currently NoAlias is returned,
but the pointers can alias if the multiply overflows while computing
a GEP index value.
2021-03-26 14:50:03 +00:00
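The overflow case described in 00978dd33d can be shown with a small wrapped multiply. The 8-bit index width and the scale of 2 are illustrative assumptions, not values from the tests.

```python
BITS = 8  # illustrative index width

def wrapped_index(x, scale):
    """Compute x * scale as a fixed-width integer that wraps on overflow."""
    return (x * scale) % (1 << BITS)

# Two different inputs that produce the same index once the multiply wraps.
a, b = 1, 129
same_offset = wrapped_index(a, 2) == wrapped_index(b, 2)
```

Because `a != b` yet both map to the same index, reasoning that "different inputs give different offsets" is unsound once the multiply can overflow.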
Sanjay Patel
3fade5f028 [SLP] move test for min/max crashing; NFC
This was originally just an XFAIL test, but I modified it
to check output. To make that bot-friendly, I'm moving it
to the x86 dir since it specified an x86 target.
2021-03-26 10:28:15 -04:00
Zakk Chen
bf5d542718 [RISCV] Add constraint for RVV indexed loads.
Add the constraint when the destination EEW does not equal the source EEW, for
correctness.

The RVV spec has three register overlap rules and I implement the first
stricter constraint because the others are difficult to enforce.

Reviewed By: frasercrmck, craig.topper

Differential Revision: https://reviews.llvm.org/D98920
2021-03-26 07:23:24 -07:00
Sanjay Patel
c9f70d389b Revert "[SLP] allow matching integer min/max intrinsics as reduction ops"
This reverts commit 3c8473ba534daa3 and includes test diffs to
maintain testing status.

There's at least 1 place that was not updated with 7202f47508 ,
so we can crash mismatching select and intrinsics as shown in
PR49730.
2021-03-26 09:59:14 -04:00
Nashe Mncube
3ac25d30ac [InstCombine]Generalise regression tests for sve
The tests, test/Transforms/InstCombine/AArch64/sve-*,
have been shown to not be AArch64 specific. These tests
have been renamed and moved to reflect this.

Differential Revision: https://reviews.llvm.org/D99253
2021-03-26 12:04:50 +00:00
Josh Berdine
094815c89f [OCaml] Fix a possible crash in llvm_struct_name
The implementation of `llvm_struct_name` before this diff calls
`caml_copy_string`, which allocates, while the `result` local variable
points to a block allocated by `caml_alloc_small` that has not yet
been initialized. If the allocation in `caml_copy_string` triggers a
garbage collection, then the GC root `result` contains a pointer to
uninitialized data, which may crash the GC or lead to a memory
corruption.

This diff fixes this by allocating and initializing the string first
and then allocating and initializing the option, thereby leaving no
dangling pointers when allocations are made.

The conversion from a C string to an OCaml string option is refactored
into a function, `cstr_to_string_option`. This function is also used
to simplify the definitions of `llvm_get_mdstring` and
`llvm_string_of_const`.

Differential Revision: https://reviews.llvm.org/D99393
2021-03-26 11:49:13 +00:00
Josh Berdine
b6f24fdff2 [NFC][OCaml] Resolve const and unsigned compilation warnings
There are a number of compilation warnings about discarded const
qualifiers and about casts between pointers to integer types with
different signedness.

The incompatible sign warnings are due to treating the result of
`LLVMGetModuleIdentifier` as `const unsigned char *`, but it is
declared as `const char *`.

The dropped const qualifiers are due to the code pattern
`memcpy(String_val(_),_,_)` which ought to be (following the
implementation of the OCaml runtime)
`memcpy((char *)String_val(_),_,_)`. The issue is that `String_val` is
usually used to get the value of an immutable string. But in the
context of the `memcpy` calls, the string is in the process of being
initialized, so is not yet constant.

Differential Revision: https://reviews.llvm.org/D99392
2021-03-26 11:49:13 +00:00
Josh Berdine
511d90318d [NFC][OCaml] Simplify llvm_global_initializer using ptr_to_option
This diff uses ptr_to_option to convert a nullable C pointer to an
OCaml option instead of the redundant implementation in
llvm_global_initializer.

Differential Revision: https://reviews.llvm.org/D99391
2021-03-26 11:49:13 +00:00
David Sherwood
1622731c2d Revert "[LoopVectorize] Simplify scalar cost calculation in getInstructionCost"
This reverts commit 240aa96cf25d880dde7a0db5d96918cfaa4b8891.
2021-03-26 11:36:53 +00:00
David Sherwood
4e4f3dfb9b [LoopVectorize] Simplify scalar cost calculation in getInstructionCost
This patch simplifies the calculation of certain costs in
getInstructionCost when isScalarAfterVectorization() returns a true value.
There are a few places where we multiply a cost by a number N, i.e.

  unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1;
  return N * TTI.getArithmeticInstrCost(...

After some investigation it seems that there are only these cases that occur
in practice:

1. VF is a scalar, in which case N = 1.
2. VF is a vector. We can only get here if: a) the instruction is a
GEP/bitcast with scalar uses, or b) this is an update to an induction variable
that remains scalar.

I have changed the code so that N is assumed to always be 1. For GEPs
the cost is always 0, since this is calculated later on as part of the
load/store cost. For all other cases I have added an assert that none of the
users needs scalarising, which didn't fire in any unit tests.

Only one test required fixing and I believe the original cost for the scalar
add instruction to have been wrong, since only one copy remains after
vectorisation.

Differential Revision: https://reviews.llvm.org/D98512
2021-03-26 11:27:12 +00:00
Abhina Sreeskantharajan
5e07382161 [Windows] Turn off text mode in TableGen and Rewriter to stop CRLF translation
This patch should fix the errors shown on the Windows bots by turning off text mode. I plan to investigate a better fix but this should unblock the buildbots for now.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99363
2021-03-26 07:12:46 -04:00
Max Kazantsev
5df4846d00 [Test] Add failing test for pr49730 2021-03-26 18:03:39 +07:00
Jay Foad
3d5ba03831 [AMDGPU] Inline FSHRPattern into its only use. NFC. 2021-03-26 09:32:02 +00:00
Craig Topper
468c9f39ae [RISCV] Optimize (and (shl GPR:, uimm5:), 0xffffffff) to use 2 shifts instead of 3.
The and would normally become SLLI+SRLI, giving us 2 SLLI+SRLI. We
can detect this and combine the 2 SLLIs into 1.
2021-03-25 23:31:01 -07:00
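The shift combination in 468c9f39ae can be verified with 64-bit arithmetic. The helpers below model RV64 SLLI/SRLI on Python integers; they are a sketch, not the actual selection pattern.

```python
MASK64 = (1 << 64) - 1

def slli(x, n):
    """Logical shift left on a 64-bit register."""
    return (x << n) & MASK64

def srli(x, n):
    """Logical shift right on a 64-bit register."""
    return (x & MASK64) >> n

def three_shifts(x, c):
    """shl by c, then the mask expressed as SLLI 32 + SRLI 32."""
    return srli(slli(slli(x, c), 32), 32)

def two_shifts(x, c):
    """The two SLLIs merged into one shift by c + 32."""
    return srli(slli(x, c + 32), 32)

def reference(x, c):
    """Direct evaluation of (and (shl x, c), 0xffffffff)."""
    return slli(x, c) & 0xFFFFFFFF
```

Since `c` is a uimm5 (0 to 31), the merged shift amount `c + 32` stays within 32 to 63 and remains a valid 64-bit shift.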
Craig Topper
c7d9711430 [RISCV] Don't call CheckAndMask from selectZExti32.
Now that targetShrinkDemandedConstant preserves 0xffffffff masks we
shouldn't need to call computeKnownBits here.
2021-03-25 22:07:41 -07:00
Kazu Hirata
5ac8ba2c65 Reapply [InlineCost] Enable the cost benefit analysis on FDO
This patch enables the cost-benefit-analysis-based inliner by default
if we have an instrumentation profile.

- SPEC CPU 2017 shows a 0.4% improvement.

- An internal large benchmark shows a 0.9% reduction in the cycle
  count along with 14.6% reduction in the number of call instructions
  executed.

Differential Revision: https://reviews.llvm.org/D98213
2021-03-25 21:51:38 -07:00
Kazu Hirata
3899383bc1 [InlineCost] Reject a zero entry count
This patch teaches the cost-benefit-analysis-based inliner to reject a
zero entry count so that we don't trigger a divide-by-zero.
2021-03-25 21:51:36 -07:00
Wenlei He
f5a657e80a [CSSPGO] Minor tweak for inline candidate priority tie breaker
When prioritizing call sites to consider for inlining in the sample loader, use the number of samples as a first tie breaker before using name/guid comparison. This favors smaller functions when hotness is the same (from the same block). We could try to retrieve the accurate function size if this turns out to be more important.

Differential Revision: https://reviews.llvm.org/D99370
2021-03-25 21:15:36 -07:00
Tony
382c4642c9 [NFC][AMDGPU] Corrections to AMD GPU initial kernel launch documentation
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D99223
2021-03-26 02:05:45 +00:00
Lang Hames
1f3047fd72 [JITLink][MachO] Use full <segment>,<section> names for MachO jitlink::Sections.
JITLink now requires section names to be unique. In MachO section names are only
guaranteed to be unique within their containing segment (e.g. a '__const' section
in the '__DATA' segment does not clash with a '__const' section in the '__TEXT'
segment), so we need to use the fully qualified <segment>,<section> section
names (e.g. '__DATA,__const' or '__TEXT,__const') when constructing
jitlink::Sections for MachO objects.
2021-03-25 18:31:18 -07:00
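The naming scheme in 1f3047fd72 can be sketched as a simple key construction. The function name is hypothetical; only the `<segment>,<section>` format comes from the commit message.

```python
def qualified_section_name(segment, section):
    """Build the fully qualified '<segment>,<section>' name."""
    return f"{segment},{section}"

sections = [("__DATA", "__const"), ("__TEXT", "__const")]

bare_names = {section for _, section in sections}
qualified_names = {qualified_section_name(seg, sec) for seg, sec in sections}
```

The bare names collapse to a single entry, while the qualified names stay distinct, which is what a uniqueness requirement on section names needs.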
Richard Smith
a95f40a147 Explicitly enable the new pass manager in this test.
Otherwise it fails under -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF.
2021-03-25 18:10:36 -07:00
Craig Topper
6958fd57ed [RISCV] Add Zbb+Zbt command lines to the signed saturating add/sub tests.
This will enable cmov to be used for select. I improve the codegen
of select_cc in D99021, but that patch doesn't work for cmov.
2021-03-25 17:25:36 -07:00
Amara Emerson
61b28d4f51 [GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates.
Differential Revision: https://reviews.llvm.org/D99383
2021-03-25 17:23:30 -07:00
Jessica Paquette
66363d109b [AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.

This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.

To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.

The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.

This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.

Code size improvements (Darwin):

- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)

Differential Revision: https://reviews.llvm.org/D99358
2021-03-25 17:14:25 -07:00
Richard Smith
7389d10425 Fix a miscompile introduced by 99203f2.
getPointersDiff would previously round down the difference between two
pointers to a multiple of the element size of the pointee, which could
result in a pointer value being decreased a little.

Alexey Bataev has graciously agreed to add a testcase for this;
submitting the bugfix now to unblock.
2021-03-25 16:53:58 -07:00
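The rounding problem described in 7389d10425 can be shown with integer division on byte offsets. This is a sketch restricted to non-negative differences; the strict variant is an assumption about one way to avoid the problem, not a description of the actual fix, and both function names are hypothetical.

```python
def element_distance_rounding(byte_diff, elem_size):
    """Distance in elements, silently discarding any remainder."""
    return byte_diff // elem_size

def element_distance_strict(byte_diff, elem_size):
    """Distance in elements, or None when the byte difference is not a whole multiple."""
    quotient, remainder = divmod(byte_diff, elem_size)
    return quotient if remainder == 0 else None

byte_diff, elem_size = 5, 4
rounded = element_distance_rounding(byte_diff, elem_size)
recovered_bytes = rounded * elem_size  # smaller than the real difference
```

Here a 5-byte difference is reported as one 4-byte element, so a pointer rebuilt from that distance lands one byte short.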
Rahman Lavaee
615bd4b10f Add missing 'CHECK' prefix to basic block labels test.
The `CHECK` prefix was dropped in e0bf2349303f. This led to all CHECK
lines having no effect.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D99316
2021-03-25 16:41:41 -07:00
Fangrui Song
02c14a6e9e [Triple][Driver] Add muslx32 environment and use /lib/ld-musl-x32.so.1 for -dynamic-linker
Differential Revision: https://reviews.llvm.org/D99308
2021-03-25 16:25:47 -07:00
Yonghong Song
4bd3f42304 BPF: add extern func to data sections if specified
This permits an extern function (BTF_KIND_FUNC) to be added
to BTF_KIND_DATASEC if a section name is specified.
For example,

-bash-4.4$ cat t.c
void foo(int) __attribute__((section(".kernel.funcs")));
int test(void) {
  foo(5);
  return 0;
}

The extern function foo (BTF_KIND_FUNC) will be put into
BTF_KIND_DATASEC with name ".kernel.funcs".

This will help to differentiate two kinds of external functions,
functions in kernel and functions defined in other bpf programs.

Differential Revision: https://reviews.llvm.org/D93563
2021-03-25 16:03:29 -07:00
Jingu Kang
a1f6b21702 [ValueTracking] Handle two PHIs in isKnownNonEqual()
loop:
  %cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop ]
  %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ]
  ...
  %inc = add i32 %cmp.0, 1
  br label %loop

In the example above, %pos.0 takes the previous iteration's %cmp.0 via the
backedge, per the definition of the PHI instruction. If %inc differs from one
iteration to the next, we can conclude that the two PHIs are never equal.

Differential Revision: https://reviews.llvm.org/D98422
2021-03-25 22:56:05 +00:00
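The loop from a1f6b21702 can be simulated directly to see why the two PHIs never hold the same value. The sketch models i32 wraparound with a mask; the function name and the iteration count are illustrative.

```python
MASK32 = (1 << 32) - 1

def simulate(iterations, cmp_start=3, pos_start=1):
    """Return the (%cmp.0, %pos.0) pair observed at the top of each iteration."""
    cmp0, pos0 = cmp_start, pos_start
    pairs = []
    for _ in range(iterations):
        pairs.append((cmp0, pos0))
        inc = (cmp0 + 1) & MASK32   # %inc = add i32 %cmp.0, 1
        cmp0, pos0 = inc, cmp0      # backedge: both PHIs update together
    return pairs
```

`simulate(3)` gives `[(3, 1), (4, 3), (5, 4)]`: after the first iteration `%pos.0` always trails `%cmp.0` by exactly one step, so the two never coincide.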
Leonard Chan
2587b4ccc7 [llvm][hwasan] Add Fuchsia shadow mapping configuration
Ensure that Fuchsia shadow memory starts at zero.

Differential Revision: https://reviews.llvm.org/D99380
2021-03-25 15:28:59 -07:00
Guozhi Wei
65f91cd8ed [DAE] Adjust param/arg attributes when changing parameter to undef
In the DeadArgumentElimination pass, if a function's argument is never used, the corresponding caller's parameter can be changed to undef. If the param/arg has the noundef attribute or other related attributes, the LLVM LangRef (https://llvm.org/docs/LangRef.html#parameter-attributes) says the behavior is undefined. SimplifyCFG (D97244) takes advantage of this behavior and performs a bad transformation on valid code.

To avoid this undefined behavior when changing the caller's parameter to undef, this patch removes the noundef attribute, and other attributes that imply noundef, from the param/arg.

Differential Revision: https://reviews.llvm.org/D98899
2021-03-25 14:53:22 -07:00
Philip Reames
cb159dd79e Mark gc.relocate and gc.result as readnone (try 2)
As noted in the LangRef, these are semantically readnone projections from the result value of the associated statepoint. However, it turned out we had a few latent bugs being covered up by the fact we were only marking them readonly (see PR49607 for context).

As of this change, all known issues are resolved. This is a deliberately minimal patch to make it easy to test downstream and revert with minimal change if that turns out to be necessary.

Differential Revision: https://reviews.llvm.org/D98729
2021-03-25 14:50:07 -07:00
Philip Reames
c967c78fcf [deref] Handle byval/byref/sret/inalloc/preallocated arguments for deref-at-point semantics
All of these are scoped allocations which remain dereferenceable during the lifetime of the callee.

Differential Revision: https://reviews.llvm.org/D99310
2021-03-25 14:47:31 -07:00
Philip Reames
22b57c9205 Autogen test to account for tool output format change 2021-03-25 14:41:08 -07:00
Philip Reames
a58d2e926a [test] Add test for hoisting to custom allocation function using allocsize
The first is currently demonstrating a miscompile.
2021-03-25 14:31:51 -07:00
Craig Topper
deeab5c08f [RISCV] Reorder checks in RISCVTTIImpl::getGatherScatterOpCost to avoid calling getMinRVVVectorSizeInBits() when V extension is not enabled.
getMinRVVVectorSizeInBits() asserts if the V extension isn't
enabled. So check that gather/scatter is legal first since it
already contains a check for V extension being enabled. It
also already checks getMinRVVVectorSizeInBits for fixed length
vectors so we don't need a check in getGatherScatterOpCost.
2021-03-25 14:20:47 -07:00
Andrew Savonichev
e73adcc6fa [MCA] Support carry-over instructions for in-order processors
Instructions that have more uops than the processor's IssueWidth are
issued in multiple cycles.

The patch fixes PR49712.

Differential Revision: https://reviews.llvm.org/D99339
2021-03-26 00:06:19 +03:00
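The carry-over behaviour in e73adcc6fa can be approximated with a ceiling division. This is a deliberately simplified model that assumes an instruction has the issue slots to itself; the function name is hypothetical and it does not reflect llvm-mca's internal bookkeeping.

```python
def issue_cycles(num_uops, issue_width):
    """Cycles needed to issue one instruction's uops at `issue_width` uops per cycle."""
    if issue_width <= 0:
        raise ValueError("issue_width must be positive")
    # Ceiling division; every instruction takes at least one cycle to issue.
    return max(1, -(-num_uops // issue_width))
```

For example, an instruction with 5 uops on a 2-wide in-order core is carried over into a third cycle, while one with 2 uops fits in a single cycle.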
Nico Weber
124a82a5ee Revert "[InlineCost] Enable the cost benefit analysis on FDO"
This reverts commit ef69aa961d12dee2141a79b05c9637d8cc9c0c74.
Makes clang assert in PGO builds, see repro tgz in
https://bugs.chromium.org/p/chromium/issues/detail?id=1192783#c6
2021-03-25 16:42:19 -04:00
Roman Lebedev
4def33398a [NFCI][SimplifyCFG] Don't pay for a Small{Map,Set}Vector when plain SmallSet will suffice
This *only* changes the cases where we *really* don't care
about the iteration order of the underlying container,
namely when we will use the values from it to form DTU updates.
2021-03-25 23:25:40 +03:00