1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

144283 Commits

Author SHA1 Message Date
Andy Wingo
4c62f39d2f [WebAssembly] call_indirect issues table number relocs
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via `TABLE_NUMBER`
relocations against a table symbol.

Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.

We abuse the used-in-reloc flag on symbols to indicate which tables
should end up in the symbol table.  We do this because unfortunately
older wasm-ld will carp if it see a table symbol.

Differential Revision: https://reviews.llvm.org/D90948
2021-02-22 10:13:36 +01:00
Amara Emerson
584c3e72c8 [AArch64][GlobalISel] Fix <16 x s8> G_DUP regbankselect to assign source to gpr.
We can only select this type if the source is on GPR, not FPR.
2021-02-21 21:17:29 -08:00
Kazu Hirata
38cc9ea5cc [CodeGen] Use range-based for loops (NFC) 2021-02-21 19:58:07 -08:00
Kazu Hirata
83ddc1026f [Analysis] Use ListSeparator (NFC) 2021-02-21 19:58:04 -08:00
Petr Hosek
0968fe7374 [InstrProfiling] Use ELF section groups for counters, data and values
__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.

The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
allows linker to discard counters, data and values associated with that
function symbol as well.

Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.

Differential Revision: https://reviews.llvm.org/D96757
2021-02-21 16:13:06 -08:00
Craig Topper
eeb855b166 [KnownBits][RISCV] Improve known bits for srem.
The result must be less than or equal to the LHS side, so any
leading zeros in the left hand side must also exist in the result.
This is stronger than the previous behavior where we only considered
the sign bit being 0.

The affected test case used the sign bit being known 0 to change
a sign extend to a zero extend pre type legalization. After type
legalization the types were promoted to i64, but we no longer
knew bit 31 was zero. This shifts are are the equivalent of an
AND with 0xffffffff or zext_inreg X, i32. This patch allows us to
see that bit 31 is zero and remove the shifts.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D97124
2021-02-21 14:48:29 -08:00
Simon Pilgrim
0606b4495b [X86] Add vector support to sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold. 2021-02-21 21:51:27 +00:00
Simon Pilgrim
b7c3719b71 [X86] Replace explicit constant handling in sub(C1, xor(X, C2)) -> add(xor(X, ~C2), C1+1) fold. NFCI.
NFC cleanup before adding vector support - rely on the SelectionDAG to handle everything for us.
2021-02-21 21:40:32 +00:00
Craig Topper
43d920b6bd [SelectionDAG][RISCV] Teach ComputeNumSignBits to handle SREM.
This also removes a pattern from RISCV that is no longer needed
since the sexti32 on the LHS of the srem in the pattern implies
the result is sign extended so the sign_extend_inreg should be
removed in DAG combine now.

Reviewed By: luismarques, RKSimon

Differential Revision: https://reviews.llvm.org/D97133
2021-02-21 11:13:36 -08:00
Simon Pilgrim
19966fb2bc [X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - remove unnecessary BITCASTs.
In conjunction with the 'vperm2x128(bitcast(x),bitcast(y),c) -> bitcast(vperm2x128(x,y,c))' fold in combineTargetShuffle, this should remove any unnecessary bitcasts around vperm2x128 lane shuffles.
2021-02-21 18:40:32 +00:00
Nikita Popov
ce78a3156f [Loads] Add optimized FindAvailableLoadedValue() overload (NFCI)
FindAvailableLoadedValue() accepts an iterator by reference. If no
available value is found, then the iterator will either be left
at a clobbering instruction or the beginning of the basic block.
This allows using FindAvailableLoadedValue() across multiple blocks.

If this functionality is not needed, as is the case in InstCombine,
then we can use a much more efficient implementation: First try
to find an available value, and only perform clobber checks if
we actually found one. As this function only looks at a very small
number of instructions (6 by default) and usually doesn't find an
available value, this saves many expensive alias analysis queries.
2021-02-21 18:42:56 +01:00
Sanjay Patel
3d791c7666 [IR] restrict vector reduction intrinsic types
The arguments in all cases should be vectors of exactly one of integer or FP.

All of the tests currently pass the verifier because we check for any vector
type regardless of the type of reduction.
This obviously can't work if we mix up integer and FP, and based on current
LangRef text it was not intended to work for pointers either.

The pointer case from https://llvm.org/PR49215 is what led me here. That
example was avoided with 5b250a27ec.

Differential Revision: https://reviews.llvm.org/D96904
2021-02-21 12:37:00 -05:00
Nikita Popov
80c50652f7 [Loads] Extract helper frunction for available load/store (NFC)
This contains the logic for extracting an available load/store
from a given instruction, to be reused in a following patch.
2021-02-21 18:24:58 +01:00
Kristina Bessonova
84eff7b913 [ThinLTO] Fix import of multiply defined global variables
Currently, if there is a module that contains a strong definition of
a global variable and a module that has both a weak definition for
the same global and a reference to it, it may result in an undefined symbol error
while linking with ThinLTO.

It happens because:
* the strong definition become internal because it is read-only and can be imported;
* the weak definition gets replaced by a declaration because it's non-prevailing;
* the strong definition failed to be imported because the destination module
  already contains another definition of the global yet this def is non-prevailing.

The patch adds a check to computeImportForReferencedGlobals() that allows
considering a global variable for being imported even if the module contains
a definition of it in the case this def has an interposable linkage type.

Note that currently the check is based only on the linkage type
(and this seems to be enough at the moment), but it might be worth to account
the information whether the def is prevailing or not.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95943
2021-02-21 18:34:12 +02:00
Simon Pilgrim
b2f83027f4 [DAG] Match USUBSAT patterns through zext/trunc
This patch handles usubsat patterns hidden through zext/trunc and uses the getTruncatedUSUBSAT helper to determine if the USUBSAT can be correctly performed in the truncated form:

zext(x) >= y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))
zext(x) >  y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))

Based on original examples:

void foo(unsigned short *p, int max, int n) {
    int i;
    unsigned m;
    for (i = 0; i < n; i++) {
        m = *--p;
        *p = (unsigned short)(m >= max ? m-max : 0);
    }
}

Differential Revision: https://reviews.llvm.org/D25987
2021-02-21 15:26:54 +00:00
Simon Pilgrim
dd799f16a2 [X86][AVX] Fold concat(extract_subvector(v0,c0), extract_subvector(v1,c1)) -> vperm2x128
Fixes regression exposed by removing bitcasts across logic-ops in D96206.

Differential Revision: https://reviews.llvm.org/D96206
2021-02-21 14:50:43 +00:00
Simon Pilgrim
1640de7ef3 [X86] Fold bitcast(logic(bitcast(X), Y)) --> logic'(X, bitcast(Y)) for int-int bitcasts
Extend the existing combine that handles bitcasting for fp-logic ops to also help remove logic ops across bitcasts to/from the same integer types.

This helps improve AVX512 predicate handling for D/Q logic ops and also allows DAGCombine's scalarizeExtractedBinop to remove some annoying gpr->simd->gpr transfers.

The concat_vectors regression in pr40891.ll will be addressed in a followup commit on this patch.

Differential Revision: https://reviews.llvm.org/D96206
2021-02-21 14:40:54 +00:00
Kazu Hirata
5d6ed75196 [CodeGen] Use range-based for loops (NFC) 2021-02-20 21:46:02 -08:00
Jianzhou Zhao
3bbcdc72f8 [dfsan] Comment out unused methods by D97087 temporarily 2021-02-21 03:31:19 +00:00
Petr Hosek
e5a6554cdf [InstrProfiling] Use nobits as __llvm_prf_cnts section type in ELF
This can reduce the binary size because counters will no longer occupy
space in the binary, instead they will be allocated by dynamic linker.

Differential Revision: https://reviews.llvm.org/D97110
2021-02-20 14:20:33 -08:00
Nikita Popov
81952c68f5 [ConstantRange] Handle wrapping ranges in min/max (PR48643)
When one of the inputs is a wrapping range, intersect with the
union of the two inputs. The union of the two inputs corresponds
to the result we would get if we treated the min/max as a simple
select.

This fixes PR48643.
2021-02-20 22:52:09 +01:00
Sanjay Patel
d173dde91f [InstCombine] fold fdiv with exp/exp2 divisor (PR49147)
Follow-up to:
D96648 / b40fde062
...for the special-case base calls.

From the earlier commit:
This is unusual in the general (non-reciprocal) case because we need
an extra instruction, but that should be better for general FP
reassociation and codegen. We conservatively check for "arcp" FMF
here as we do with existing fdiv folds, but it is not strictly
necessary to have that.
2021-02-20 16:02:58 -05:00
Nikita Popov
ca3345ac4e [ConstantRange] Handle wrapping range in binaryNot()
We don't need any special handling for wrapping ranges (or empty
ranges for that matter). The sub() call will already compute a
correct and precise range.

We only need to adjust the test expectation: We're now computing
an optimal result, rather than an unsigned envelope.
2021-02-20 21:45:59 +01:00
Teresa Johnson
6eca038caa [LTO] Fix cloning of llvm*.used when splitting module
Refines the fix in 3c4c205060c9398da705eb71b63ddd8a04999de9 to only
put globals whose defs were cloned into the split regular LTO module
on the cloned llvm*.used globals. This avoids an issue where one of the
attached values was a local that was promoted in the original module
after the module was cloned. We only need to have the values defined in
the new module on those globals.

Fixes PR49251.

Differential Revision: https://reviews.llvm.org/D97013
2021-02-20 09:46:43 -08:00
Fraser Cormack
c4642ac1ba [RISCV] Support extraction of misaligned subvectors
This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those
which don't align to a vector register boundary. It accomplishes this by
extracting the nearest register-sized subvector (a subregister
operation), then sliding the vector down with VSLIDEDOWN and extracting
the subvector from the first position (a COPY operation).

Since this procedure involves the use of VSCALE and multiplication, the
handling of such operations is done during lowering to simplify the
implementation and make use of DAG combining. This necessitated moving
some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96959
2021-02-20 15:43:54 +00:00
Fraser Cormack
676a71a31e [RISCV] Improve register allocation around vector masks
With vector mask registers only allocatable to V0 (VMV0Regs) it is
relatively simple to generate code which uses multiple masks and naively
requires spilling.

This patch aims to improve codegen in such cases by telling LLVM it can
use VRRegs to hold masks. This will prevent spilling in many cases by
having LLVM copy to an available VR register.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97055
2021-02-20 14:47:51 +00:00
Simon Pilgrim
df708ad856 [InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.
recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.

This is part of a cleanup towards letting matchBSwapOrBitReverse /recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular which can be prematurely created).

Differential Revision: https://reviews.llvm.org/D97056
2021-02-20 13:15:34 +00:00
Simon Pilgrim
7a307c4062 [DAG] foldSubToUSubSat - fold sub(a,trunc(umin(zext(a),b))) -> usubsat(a,trunc(umin(b,SatLimit)))
This moves the last custom x86 USUBSAT fold to generic DAGCombine.

Completes PR40111

Differential Revision: https://reviews.llvm.org/D96703
2021-02-20 12:02:07 +00:00
Juneyoung Lee
bae6e26472 Update BPFAdjustOpt.cpp to accept select form of or as well
This is a minor pattern-match update to BPFAdjustOpt.cpp to accept
not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'.
This resolves regression after SimplifyCFG's creating select form
of and/or instead (https://reviews.llvm.org/D95026).
This is a small change, and currently such select form isn't created
or doesn't reach to the late pipeline (because InstCombine eagerly
folds it into and/or i1), so I chose to commit without a review process.
2021-02-20 18:29:58 +09:00
Amara Emerson
eac24b00e5 [AArch64][GlobalISel] Add selection support for G_VECREDUCE of <2 x i32>
This selects to a pairwise add and a subreg copy.
2021-02-20 00:39:38 -08:00
Kazu Hirata
63f16fef3a [CodeGen] Use range-based for loops (NFC) 2021-02-19 22:44:14 -08:00
Dávid Bolvanský
a89bd24670 Reland "[Libcalls, Attrs] Annotate libcalls with noundef"
Fixed Clang tests.
2021-02-20 06:18:48 +01:00
Juneyoung Lee
dab31de0db [ValueTracking] Improve impliesPoison
This patch improves ValueTracking's impliesPoison(V1, V2) to do this reasoning:

```
  %res = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul      = extractvalue { i64, i1 } %res, 0

	; If %mul is poison, %overflow is also poison, and vice versa.
```

This improvement leads to supporting this optimization under `-instcombine-unsafe-select-transform=0`:

```
define i1 @test2_logical(i64 %a, i64 %b, i64* %ptr) {
; CHECK-LABEL: @test2_logical(
; CHECK-NEXT:    [[MUL:%.*]] = mul i64 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i64 [[A]], 0
; CHECK-NEXT:    [[TMP2:%.*]] = icmp ne i64 [[B]], 0
; CHECK-NEXT:    [[OVERFLOW_1:%.*]] = and i1 [[TMP1]], [[TMP2]]
; CHECK-NEXT:    [[NEG:%.*]] = sub i64 0, [[MUL]]
; CHECK-NEXT:    store i64 [[NEG]], i64* [[PTR:%.*]], align 8
; CHECK-NEXT:    ret i1 [[OVERFLOW_1]]
;

  %res = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul = extractvalue { i64, i1 } %res, 0
  %cmp = icmp ne i64 %mul, 0
  %overflow.1 = select i1 %overflow, i1 true, i1 %cmp
  %neg = sub i64 0, %mul
  store i64 %neg, i64* %ptr, align 8
  ret i1 %overflow.1
}
```

Previously, this didn't happen because the flag prevented `select i1 %overflow, i1 true, i1 %cmp` from being `or i1 %overflow, %cmp`.
Note that the select -> or conversion happens only when `impliesPoison(%cmp, %overflow)` returns true.
This improvement allows `impliesPoison` to do the reasoning.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96929
2021-02-20 13:22:34 +09:00
Wenlei He
e2d4d21201 [SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink
For ThinLTO, PreLink ICP is skipped to favor better profile annotation during LTO PostLink. This change applies the same tweak for MonoLTO. Note that PreLink ICP not only makes PostLink profile annotation harder, it is also uncoordinated with PostLink ICP so duplicated ICP could happen.

Differential Revision: https://reviews.llvm.org/D97028
2021-02-19 19:35:23 -08:00
Dávid Bolvanský
2b364cafc0 Revert "[Libcalls, Attrs] Annotate libcalls with noundef"
This reverts commit 33b0c63775ce58014c55e285671e3315104a6076. Bots are failing. Some Clang tests need to be updated too.
2021-02-20 04:18:42 +01:00
Craig Topper
5d242b3155 [RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them.
We don't currently create memory operands for these intrinsics,
but there was a suggestion of using the indexed load/store
intrinsics to implement isel for scalable vector gather/scatter.
That may propagate the memory operand from the gather/scatter
ISD nodes.
2021-02-19 19:12:20 -08:00
Dávid Bolvanský
6e69bbcd66 [Libcalls, Attrs] Annotate libcalls with noundef
I think we can use here same logic as for nonnull.

strlen(X) - X must be noundef => valid pointer.

for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones)

Reviewed By: jdoerfert, aqjune

Differential Revision: https://reviews.llvm.org/D95122
2021-02-20 04:10:07 +01:00
Dávid Bolvanský
8b62edc2be Revert "[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly"
This reverts commit 05d891a19e45687090edcfccfbad334911659eb0.
2021-02-20 03:58:53 +01:00
Dávid Bolvanský
0bb7f2be6e [BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94850
2021-02-20 03:56:01 +01:00
Pan, Tao
a6081b7f55 [CodeGen] Fix two dots between text section name and symbol name
There is a trailing dot in text section name if it has prefix, don't add
repeated dot when connect text section name and symbol name.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D96327
2021-02-20 10:15:48 +08:00
Craig Topper
e9a79b1cf0 [ValueTypes] Assert if changeVectorElementType is called on a simple type with an extended element type.
Previously we would use the extended implementation, but
the extended implementation requires the vector type to be extended
so that we can access the LLVMContext. In theory we could
detect this case and use the context from the element type instead,
but since I know of no cases hitting this in practice today
I've done the simplest thing.

Also add asserts to several extended EVT functions that assume
LLVMTy is non-null.

Follow from discussion in D97036

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D97070
2021-02-19 17:30:46 -08:00
Jianzhou Zhao
bb847f1e7e [dfsan] Add utils that get/set origins
This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D97087
2021-02-20 00:52:33 +00:00
Jacques Pienaar
3e8340f59a Different fix for gcc bug
Was still running into

from definition of 'template<class T> struct llvm::DenseMapInfo'
[-fpermissive]
 template <typename T> struct DenseMapInfo;
                               ^
2021-02-19 16:41:00 -08:00
Yusra Syeda
4857db55d1 [SystemZ/z/OS] Add XPLINK 64-bit calling convention to tablegen.
This commit adds the initial changes to the SystemZ target
description for the XPLINK 64-bit calling convention on z/OS.
Additions include:

 - a new predicate IsTargetXPLINK64
 - different register allocation order
 - generaton of nopr after a call

Reviewed-by: uweigand

Differential Revision: https://reviews.llvm.org/D96887
2021-02-19 18:39:49 -05:00
Philip Reames
ddb3bf5440 [ValueTracking] Add a two argument form of safeCtxI [NFC]
The existing implementation was relying on order of evaluation to achieve a particular result.  This got really confusing when wanting to change the handling for arguments in a later patch.
2021-02-19 14:52:51 -08:00
Amara Emerson
d83de2e66b [AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal. 2021-02-19 14:28:21 -08:00
Jianzhou Zhao
ff60cc1168 [dfsan] Add origin address calculation
This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D97065
2021-02-19 21:30:07 +00:00
Craig Topper
3867db163b [RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC 2021-02-19 13:23:08 -08:00
Tim Shen
e909c229f8 Patch by @wecing (Chenguang Wang).
The current getFoldedSizeOf() implementation uses naive recursion, which
could be really slow when the input structure type is too complex.

This issue was first brought up in
http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by
adding memoization.

Differential Revision: https://reviews.llvm.org/D6594
2021-02-19 12:44:17 -08:00
Jianzhou Zhao
fbcfd3cb70 [msan] Set cmpxchg shadow precisely
In terms of https://llvm.org/docs/LangRef.html#cmpxchg-instruction,
the return type of chmpxchg is a pair {ty, i1}, while I think we
only wanted to set the shadow for the address 0th op, and it has type
ty.

Reviewed-by: eugenis

Differential Revision: https://reviews.llvm.org/D97029
2021-02-19 20:23:23 +00:00