1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

211675 Commits

Author SHA1 Message Date
Nikita Popov
9bdd9a8130 [ConstantRangeTest] Make exhaustive testing more principled (NFC)
The current infrastructure for exhaustive ConstantRange testing is
somewhat confusing in what exactly it tests and currently cannot even
be used for operations that produce smallest-size results, rather than
signed/unsigned envelopes.

This patch makes the testing more principled by collecting the exact
set of results of an operation into a bit set and then comparing it
against the range approximation by:

 * Checking conservative correctness: All elements in the set must be
   in the range.
 * Checking optimality under a given preference function: None of the
   (slack-free) ranges that can be constructed from the set are
   preferred over the computed range.

Implemented preference functions are:

 * PreferSmallest: Smallest range regardless of signed/unsigned wrapping
   behavior. Probably what we would call "optimal" without further
   qualification.
 * PreferSmallestUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping. We use this if our calculation is precise
   only up to signed/unsigned envelope.
 * PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping -- but preferring a smaller wrapping range
   over a (non-wrapping) full range. We use this if we have a fully
   precise calculation but apply a sign preference to the result
   (union/intersection). Even with a sign preference, returning a
   wrapping range is still "strictly better" than returning a full one.

This also addresses PR49273 by replacing the fragile manual range
construction logic in testBinarySetOperationExhaustive() with generic
code that isn't specialized to the particular form of ranges that set
operations can produces.

Differential Revision: https://reviews.llvm.org/D88356
2021-02-20 12:37:31 +01:00
David Zarzycki
81606a10c8 [lit] Add --xfail and --filter-out (inverse of --filter)
In semi-automated environments,  XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D96662
2021-02-20 05:43:29 -05:00
Juneyoung Lee
bae6e26472 Update BPFAdjustOpt.cpp to accept select form of or as well
This is a minor pattern-match update to BPFAdjustOpt.cpp to accept
not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'.
This resolves regression after SimplifyCFG's creating select form
of and/or instead (https://reviews.llvm.org/D95026).
This is a small change, and currently such select form isn't created
or doesn't reach to the late pipeline (because InstCombine eagerly
folds it into and/or i1), so I chose to commit without a review process.
2021-02-20 18:29:58 +09:00
Amara Emerson
eac24b00e5 [AArch64][GlobalISel] Add selection support for G_VECREDUCE of <2 x i32>
This selects to a pairwise add and a subreg copy.
2021-02-20 00:39:38 -08:00
Juneyoung Lee
0bb24cab51 [InstCombine] Add more tests to nonnull-select.ll (NFC) 2021-02-20 16:59:52 +09:00
Kazu Hirata
63f16fef3a [CodeGen] Use range-based for loops (NFC) 2021-02-19 22:44:14 -08:00
Kazu Hirata
c411b22a9b [TableGen] Use ListSeparator (NFC) 2021-02-19 22:44:12 -08:00
Dávid Bolvanský
a89bd24670 Reland "[Libcalls, Attrs] Annotate libcalls with noundef"
Fixed Clang tests.
2021-02-20 06:18:48 +01:00
Juneyoung Lee
dab31de0db [ValueTracking] Improve impliesPoison
This patch improves ValueTracking's impliesPoison(V1, V2) to do this reasoning:

```
  %res = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul      = extractvalue { i64, i1 } %res, 0

	; If %mul is poison, %overflow is also poison, and vice versa.
```

This improvement leads to supporting this optimization under `-instcombine-unsafe-select-transform=0`:

```
define i1 @test2_logical(i64 %a, i64 %b, i64* %ptr) {
; CHECK-LABEL: @test2_logical(
; CHECK-NEXT:    [[MUL:%.*]] = mul i64 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i64 [[A]], 0
; CHECK-NEXT:    [[TMP2:%.*]] = icmp ne i64 [[B]], 0
; CHECK-NEXT:    [[OVERFLOW_1:%.*]] = and i1 [[TMP1]], [[TMP2]]
; CHECK-NEXT:    [[NEG:%.*]] = sub i64 0, [[MUL]]
; CHECK-NEXT:    store i64 [[NEG]], i64* [[PTR:%.*]], align 8
; CHECK-NEXT:    ret i1 [[OVERFLOW_1]]
;

  %res = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul = extractvalue { i64, i1 } %res, 0
  %cmp = icmp ne i64 %mul, 0
  %overflow.1 = select i1 %overflow, i1 true, i1 %cmp
  %neg = sub i64 0, %mul
  store i64 %neg, i64* %ptr, align 8
  ret i1 %overflow.1
}
```

Previously, this didn't happen because the flag prevented `select i1 %overflow, i1 true, i1 %cmp` from being `or i1 %overflow, %cmp`.
Note that the select -> or conversion happens only when `impliesPoison(%cmp, %overflow)` returns true.
This improvement allows `impliesPoison` to do the reasoning.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96929
2021-02-20 13:22:34 +09:00
Wenlei He
e2d4d21201 [SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink
For ThinLTO, PreLink ICP is skipped to favor better profile annotation during LTO PostLink. This change applies the same tweak for MonoLTO. Note that PreLink ICP not only makes PostLink profile annotation harder, it is also uncoordinated with PostLink ICP so duplicated ICP could happen.

Differential Revision: https://reviews.llvm.org/D97028
2021-02-19 19:35:23 -08:00
Dávid Bolvanský
2b364cafc0 Revert "[Libcalls, Attrs] Annotate libcalls with noundef"
This reverts commit 33b0c63775ce58014c55e285671e3315104a6076. Bots are failing. Some Clang tests need to be updated too.
2021-02-20 04:18:42 +01:00
Craig Topper
5d242b3155 [RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them.
We don't currently create memory operands for these intrinsics,
but there was a suggestion of using the indexed load/store
intrinsics to implement isel for scalable vector gather/scatter.
That may propagate the memory operand from the gather/scatter
ISD nodes.
2021-02-19 19:12:20 -08:00
Dávid Bolvanský
6e69bbcd66 [Libcalls, Attrs] Annotate libcalls with noundef
I think we can use here same logic as for nonnull.

strlen(X) - X must be noundef => valid pointer.

for libcalls with size arg, we add noundef only if size is known and greater than 0 - so pointers must be noundef (valid ones)

Reviewed By: jdoerfert, aqjune

Differential Revision: https://reviews.llvm.org/D95122
2021-02-20 04:10:07 +01:00
Dávid Bolvanský
8b62edc2be Revert "[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly"
This reverts commit 05d891a19e45687090edcfccfbad334911659eb0.
2021-02-20 03:58:53 +01:00
Dávid Bolvanský
0bb7f2be6e [BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94850
2021-02-20 03:56:01 +01:00
Pan, Tao
a6081b7f55 [CodeGen] Fix two dots between text section name and symbol name
There is a trailing dot in text section name if it has prefix, don't add
repeated dot when connect text section name and symbol name.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D96327
2021-02-20 10:15:48 +08:00
Craig Topper
e9a79b1cf0 [ValueTypes] Assert if changeVectorElementType is called on a simple type with an extended element type.
Previously we would use the extended implementation, but
the extended implementation requires the vector type to be extended
so that we can access the LLVMContext. In theory we could
detect this case and use the context from the element type instead,
but since I know of no cases hitting this in practice today
I've done the simplest thing.

Also add asserts to several extended EVT functions that assume
LLVMTy is non-null.

Follow from discussion in D97036

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D97070
2021-02-19 17:30:46 -08:00
Jianzhou Zhao
bb847f1e7e [dfsan] Add utils that get/set origins
This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D97087
2021-02-20 00:52:33 +00:00
Jacques Pienaar
3e8340f59a Different fix for gcc bug
Was still running into

from definition of 'template<class T> struct llvm::DenseMapInfo'
[-fpermissive]
 template <typename T> struct DenseMapInfo;
                               ^
2021-02-19 16:41:00 -08:00
Yusra Syeda
4857db55d1 [SystemZ/z/OS] Add XPLINK 64-bit calling convention to tablegen.
This commit adds the initial changes to the SystemZ target
description for the XPLINK 64-bit calling convention on z/OS.
Additions include:

 - a new predicate IsTargetXPLINK64
 - different register allocation order
 - generaton of nopr after a call

Reviewed-by: uweigand

Differential Revision: https://reviews.llvm.org/D96887
2021-02-19 18:39:49 -05:00
Philip Reames
ddb3bf5440 [ValueTracking] Add a two argument form of safeCtxI [NFC]
The existing implementation was relying on order of evaluation to achieve a particular result.  This got really confusing when wanting to change the handling for arguments in a later patch.
2021-02-19 14:52:51 -08:00
Amara Emerson
d83de2e66b [AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal. 2021-02-19 14:28:21 -08:00
Jianzhou Zhao
ff60cc1168 [dfsan] Add origin address calculation
This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D97065
2021-02-19 21:30:07 +00:00
Craig Topper
3867db163b [RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC 2021-02-19 13:23:08 -08:00
Philip Reames
56af664c20 Add datalayout to test added in 7e3183d73
Realized after pushing this would probably fail on bots for other than x86-64.
2021-02-19 13:10:19 -08:00
Philip Reames
ae521e31d1 Add test triggered by review discussion on D97077 2021-02-19 13:03:58 -08:00
Tim Shen
e909c229f8 Patch by @wecing (Chenguang Wang).
The current getFoldedSizeOf() implementation uses naive recursion, which
could be really slow when the input structure type is too complex.

This issue was first brought up in
http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by
adding memoization.

Differential Revision: https://reviews.llvm.org/D6594
2021-02-19 12:44:17 -08:00
Jianzhou Zhao
fbcfd3cb70 [msan] Set cmpxchg shadow precisely
In terms of https://llvm.org/docs/LangRef.html#cmpxchg-instruction,
the return type of chmpxchg is a pair {ty, i1}, while I think we
only wanted to set the shadow for the address 0th op, and it has type
ty.

Reviewed-by: eugenis

Differential Revision: https://reviews.llvm.org/D97029
2021-02-19 20:23:23 +00:00
Philip Reames
4fe33c803d precommit test cleanup for D97077 2021-02-19 12:19:39 -08:00
Sanjay Patel
eb2508cda0 [Verifier] remove dead code for saturating intrinsics; NFC
Test coverage shows that we assert with the string from the
tablegen defs file for these intrinsics, so these cases
should never be live.
2021-02-19 14:58:25 -05:00
Sanjay Patel
b2a0193b6b [Verifier] add tests for saturating intrinsics; NFC
As noted in D96904, we don't have direct tests for these malformed ops.
2021-02-19 14:58:25 -05:00
Haowei Wu
7c097db22c [elfabi] Fix a bug when .dynsym contains no non-local symbol
This patch fixed a bug when elbabi was supplied with a tbe file
contains no non-local symbol. Before this patch, it wrote 0 to
sh_info of the .dynsym section, making the ELF stub file invalid.
This patch fixed this issue.

Differential Revision: https://reviews.llvm.org/D96930
2021-02-19 11:36:53 -08:00
Sanjay Patel
11bf01464d [Analysis][LoopVectorize] do not form reductions of pointers
This is a fix for https://llvm.org/PR49215 either before/after
we make a verifier enhancement for vector reductions with D96904.

I'm not sure what the current thinking is for pointer math/logic
in IR. We allow icmp on pointer values. Therefore, we match min/max
patterns, so without this patch, the vectorizer could form a vector
reduction from that sequence.

But the LangRef definitions for min/max and vector reduction
intrinsics do not allow pointer types:
https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic

So we would crash/assert at some point - either in IR verification,
in the cost model, or in codegen. If we do want to allow this kind
of transform, we will need to update the LangRef and all of those
parts of the compiler.

Differential Revision: https://reviews.llvm.org/D97047
2021-02-19 14:01:57 -05:00
Craig Topper
5925943057 [RISCV] Use inheritance to reduce some repeated code in tablegen. NFC
The VLX and VSX searchable tables, share the same format so we
can have a common base class for them.
2021-02-19 10:42:18 -08:00
Simon Pilgrim
257b9a938c [X86] Regenerate 2007-06-28-X86-64-isel.ll 2021-02-19 18:35:15 +00:00
Simon Pilgrim
383882990f [X86] Remove unused intrinsic declaration 2021-02-19 18:35:14 +00:00
Simon Pilgrim
8fe41f7cf9 [X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll 2021-02-19 18:35:14 +00:00
Craig Topper
b4a550cc03 [RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.
We had more combinations of data and index lmuls than we needed.

Also add some asserts to verify that the IndexVT and data VT have
the same element count when we isel these pseudo instructions.
2021-02-19 10:28:48 -08:00
Craig Topper
eb3717e534 [RISCV] Use custom isel for vector indexed load/store intrinsics.
There are many legal combinations of index and data VTs supported
for these intrinsics. This results in a lot of isel patterns in
RISCVGenDAGISel.inc.

By adding a separate table similar to what we use for segment
load/stores, we can more efficiently manually select these
intrinsics. We should also be able to reuse this table scalable
vector gather/scatter.

This reduces the llc binary size by ~56K.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97033
2021-02-19 10:10:06 -08:00
Craig Topper
12d4bce8f8 [RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.
Just like we do for isel patterns, we need to call selectVLOp
to prevent 0 from being selected to X0 by the default isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97021
2021-02-19 10:07:12 -08:00
Craig Topper
0858057f74 [RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64
We previously used isel patterns for this, but that used quite
a bit of space in the isel table due to OR being associative
and commutative. It also wouldn't handle shifts/ands being in
reversed order.

This generalizes the shift/and matching from GREVI to
take the expected mask table as input so we can reuse it for
SHFLI.

There is no SHFLIW instruction, but we can promote a 32-bit
SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't
set, a 64-bit SHFLI will preserve 33 sign bits if the input had
at least 33 sign bits. ComputeNumSignBits has been updated to
account for that to avoid sext.w in the tests.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96661
2021-02-19 10:07:12 -08:00
Wei Mi
9407268caa [SampleFDO] Add PromotedInsns to prevent repeated ICP.
In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
We use 0 count value profile to memorize which target has been promoted
and prevent repeated ICP for the same target, so we delete PromotedInsns.
However, I found the implementation in the patch has some shortcomings
to be fixed otherwise there will still be repeated ICP. So I add
PromotedInsns back temorarily. Will remove it after I get a thorough fix.
2021-02-19 10:01:49 -08:00
Jessica Paquette
3471d7d959 [AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner
This is to ensure that we can eliminate G_ASSERT_SEXT.

In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for
signext parameters.

Differential Revision: https://reviews.llvm.org/D96913
2021-02-19 09:34:47 -08:00
Benjamin Kramer
ba5cce52d3 [LV] Fold single-use variable into assert. NFC. 2021-02-19 18:11:39 +01:00
Nikita Popov
92fca6922a [MemCopyOpt] Enable MemorySSA by default
This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376
2021-02-19 18:06:25 +01:00
Philip Reames
289a0fd30f [SCEV] Use both known bits and sign bits when computing range of SCEV unknowns
When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBots for signed ranges. This means we miss opportunities to improve range results.

One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size.

Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results.

Differential Revision: https://reviews.llvm.org/D96534
2021-02-19 08:29:12 -08:00
Mircea Trofin
dd4f4eac1d [NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit
VirtRegAuxInfo is an extensibility point, so the register allocator's
decision on which implementation to use should be communicated to the
other users - namely, LiveRangeEdit.

Differential Revision: https://reviews.llvm.org/D96898
2021-02-19 07:44:28 -08:00
madhur13490
00aa32634a Make fixed-abi default for AMD HSA OS
fixed-abi uses pre-defined and predictable
SGPR/VGPRs for passing arguments. This patch makes
this scheme default when HSA OS is specified in triple.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96340
2021-02-19 15:05:25 +00:00
David Green
7d67e26f6b [ARM] Correct vector predicate type in MVE getCmpSelInstrCost 2021-02-19 14:43:51 +00:00
Jay Foad
cfbb508b8c [AMDGPU] Add some GFX9 test coverage. NFC. 2021-02-19 14:38:52 +00:00