1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

42887 Commits

Author SHA1 Message Date
Nikita Popov
dd58c146a0 [BasicAA] Accept AATags by const reference (NFC)
Rather than swapping the value, the sizes, the AA tags and the
underlying objects multiple times, invoke the helper methods
with swapped arguments.
2020-10-18 18:19:01 +02:00
Nikita Popov
276c80db80 [AA] Add helper to update result (NFC)
This pattern was repeated a few times, and for some reason always
using insert or try_emplace, even though we know in advance that
we're looking for an existing entry and not trying to create a
new one.
2020-10-18 16:43:26 +02:00
Roman Lebedev
1e8f7ebb6c [SCEV] Model ashr exact x, C as (abs(x) EXACT/u (1<<C)) * signum(x)
It's not pretty, but probably better than modelling it
as an opaque SCEVUnknown, i guess.

It is relevant e.g. for the loop that was brought up in
https://bugs.llvm.org/show_bug.cgi?id=46786#c26
as an example of what we'd be able to better analyze
once SCEV handles `ptrtoint` (D89456).

But as it is evident, even if we deal with `ptrtoint` there,
we also fail to model such an `ashr`.
Also, modeling of mul-of-exact-shr/div could use improvement.

As per alive2:
https://alive2.llvm.org/ce/z/tnfZKd
```
define i8 @src(i8 %0) {
  %2 = ashr exact i8 %0, 4
  ret i8 %2
}

declare i8 @llvm.abs(i8, i1)
declare i8 @llvm.smin(i8, i8)
declare i8 @llvm.smax(i8, i8)

define i8 @tgt(i8 %x) {
  %abs_x = call i8 @llvm.abs(i8 %x, i1 false)
  %div = udiv exact i8 %abs_x, 16
  %t0 = call i8 @llvm.smax(i8 %x, i8 -1)
  %t1 = call i8 @llvm.smin(i8 %t0, i8 1)
  %r = mul nsw i8 %div, %t1
  ret i8 %r
}
```
Transformation seems to be correct!
2020-10-17 21:22:24 +03:00
Roman Lebedev
245883434c [NFC][SCEV] Refactor getAbsExpr() out of createSCEV() 2020-10-17 21:21:02 +03:00
Roman Lebedev
77dcef99bf [NFC][SCEV] Add 'getMinusOne()' method 2020-10-17 21:20:58 +03:00
Juneyoung Lee
e7de338270 Add support for !noundef metatdata on loads
This patch adds metadata !noundef and makes load instructions can optionally have it.
A load with !noundef always return a well-defined value (has no undef bit or isn't poison).
If the loaded value isn't well defined, the behavior is undefined.

This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values.
It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise.

The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead.
The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89050
2020-10-17 13:50:10 +09:00
Albion Fung
21b1fbd81e [PowerPC] Implementation of 128-bit Binary Vector Rotate builtins
This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10.

Differential Revision: https://reviews.llvm.org/D86819
2020-10-16 18:03:22 -04:00
Jameson Nash
fe8adca85b Revert "make the AsmPrinterHandler array public"
I messed up one of the tests.
2020-10-16 17:22:07 -04:00
Jameson Nash
310509685d make the AsmPrinterHandler array public
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.

Differential Revision: https://reviews.llvm.org/D74158
2020-10-16 16:27:31 -04:00
Nikita Popov
7ebc22e3f8 Revert "Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs""
This reverts commit 32b72c3165bf65cca2e8e6197b59eb4c4b60392a.

While better than before, this change still introduces a large
compile-time regression (>3% on mafft):
https://llvm-compile-time-tracker.com/compare.php?from=fbd62fe60fb2281ca33da35dc25ca3c87ec0bb51&to=32b72c3165bf65cca2e8e6197b59eb4c4b60392a&stat=instructions

Additionally, the logic here doesn't look quite right to me,
I will comment in more detail on the differential revision.
2020-10-16 21:36:33 +02:00
Arthur Eubanks
73f501c86d [CGSCC] Add -abort-on-max-devirt-iterations-reached option
Aborts if we hit the max devirtualization iteration.
Will be useful for testing that changes to devirtualization don't cause
devirtualization to repeat passes more times than necessary.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D89519
2020-10-16 12:34:52 -07:00
Jay Foad
850a1db38c [AMDGPU] Add new llvm.amdgcn.fma.legacy intrinsic
Differential Revision: https://reviews.llvm.org/D89558
2020-10-16 17:10:21 +01:00
Matt Arsenault
e3bfefd3cc Reapply "OpaquePtr: Add type to sret attribute"
This reverts commit eb9f7c28e5fe6d75fed3587023e17f2997c8024b.

Previously this was incorrectly handling linking of the contained
type, so this merges the fixes from D88973.
2020-10-16 11:05:02 -04:00
Max Kazantsev
dee6c89ce3 Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs"
It was reverted because of negative compile time impact. In this version,
less powerful proof methods are used (non-recursive reasoning only), and
scope limited to constant End values to avoid explision of complex proofs.

Differential Revision: https://reviews.llvm.org/D89381
2020-10-16 17:35:13 +07:00
Nikita Popov
70a47d630d Revert "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs"
This reverts commit 905101c36025fe1c8ecdf9a20cd59db036676073.

This causes a large compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=cc175c2cc8e638462bab74e0781e06f9b6eb5017&to=905101c36025fe1c8ecdf9a20cd59db036676073&stat=instructions
2020-10-16 09:47:38 +02:00
Max Kazantsev
53455f1c68 [SCEV][NFC] Split out type balancing in implication engine
We plan to introduce more advanced ways of dealing with different types.
2020-10-16 13:40:24 +07:00
Fangrui Song
2a9d1c6d10 [RISCV] Fix -Wbraced-scalar-init after D89025 2020-10-15 23:29:11 -07:00
Kito Cheng
56f8ee5d8e [RISCV] Add -mtune support
- The goal of this patch is improve option compatible with RISCV-V GCC,
   -mcpu support on GCC side will sent patch in next few days.

 - -mtune only affect the pipeline model and non-arch/extension related
   target feature, e.g. instruction fusion; in td file it called
   TuneFeatures, which is introduced by X86 back-end[1].

 - -mtune accept all valid option for -mcpu and extra alias processor
   option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
   option compatible with RISCV-V GCC.

 - Processor alias for -mtune will resolve according the current target arch,
   rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.

 - Interaction between -mcpu and -mtune:
   * -mtune has higher priority than -mcpu for pipeline model and
     TuneFeatures.

[1] https://reviews.llvm.org/D85165

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D89025
2020-10-16 13:55:08 +08:00
Max Kazantsev
993fc2cded [SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs
We can sharpen the range of a AddRec if we know that it does not
self-wrap and know the symbolic iteration count in the loop. If we can
evaluate the value of AddRec on the last iteration and prove that at least
one its intermediate value lies between start and end, then no-wrap flag
allows us to conclude that all of them also lie between start and end. So
the estimate of range can be improved to union of ranges of start and end.

Differential Revision: https://reviews.llvm.org/D89381
Reviewed By: efriedma
2020-10-16 12:00:39 +07:00
Vedant Kumar
cce078ae12 [PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).

To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.

Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
2020-10-15 23:13:33 +00:00
Amara Emerson
03855295f3 [GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions.
It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE
operations in the translator rather than carrying the scalar around. The
majority of the time it'll get simplified away as the scalars are probably
identity values.

Differential Revision: https://reviews.llvm.org/D89150
2020-10-15 15:51:44 -07:00
Thomas Lively
33c86faeda [WebAssembly] Prototype i8x16.popcnt
As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target
builtin and intrinsic rather than normal codegen patterns to make the
instruction opt-in until it is merged to the proposal and stabilized in engines.

Differential Revision: https://reviews.llvm.org/D89446
2020-10-15 21:18:22 +00:00
Florian Hahn
cea367bb2a [LoopVersion] Unify SCEVChecks and alias check handling (NFC).
This is an initial cleanup of the way LoopVersioning interacts with LAA.

Currently LoopVersioning has 2 ways of initializing things:

1. Passing LAI and passing UseLAIChecks = true
2. Passing UseLAIChecks = false, followed by calling setSCEVChecks and
   setAliasChecks.

Both ways of initializing lead to the same result and the duplication
seems more complicated than necessary.

This patch removes the UseLAIChecks flag from the constructor and the
setSCEVChecks & setAliasChecks helpers and move initialization
exclusively to the constructor.

This simplifies things, by providing a single way to initialize
LoopVersioning and reducing duplication.

Reviewed By: Meinersbur, lebedev.ri

Differential Revision: https://reviews.llvm.org/D84406
2020-10-15 22:02:17 +01:00
Evgenii Stepanov
36cc05959a [MTE] Pin the tagged base pointer to one of the stack slots.
Summary:
Pin the tagged base pointer to one of the stack slots, and (if
necessary) rewrite tag offsets so that an object that occupies that
slot has both address and tag offsets of 0. This allows ADDG
instructions for that object to be eliminated and their uses replaced
with the tagged base pointer itself.

This optimization must be done in machine instructions and not in the IR
instrumentation pass, because referring to a stack slot through an IRG
pointer would confuse the stack coloring pass.

The optimization makes a (pretty naive) attempt to find the slot that
would benefit the most by counting the uses of stack slots in the
function.

Reviewers: ostannard, pcc

Subscribers: merge_guards_bot, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72365
2020-10-15 12:50:16 -07:00
Stanislav Mekhanoshin
86aeb69232 [AMDGPU] gfx1032 target
Differential Revision: https://reviews.llvm.org/D89487
2020-10-15 12:41:18 -07:00
Thomas Lively
438da14930 Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c8385a352ba21cb388046290d93b53dc273cd9f with a typing fix
to an instruction selection pattern.
2020-10-15 19:32:34 +00:00
Anh Tuyen Tran
23c4b1ff43 [NFC][CaptureTracking] Move static function isNonEscapingLocalObject to llvm namespace
Function isNonEscapingLocalObject is a static one within BasicAliasAnalysis.cpp.
It wraps around PointerMayBeCaptured of CaptureTracking, checking whether a pointer
is to a function-local object, which never escapes from the function.

Although at the moment, isNonEscapingLocalObject is used only by BasicAliasAnalysis,
its functionality can be used by other pass(es), one of which I will put up for review
very soon. Instead of copying the contents of this static function, I move it to llvm
scope, and place it amongst other functions with similar functionality in CaptureTracking.

The rationale for the location are:
- Pointer escape and pointer being captured are actually two sides of the same coin
- isNonEscapingLocalObject is wrapping around another function in CaptureTracking

Reviewed By: jdoerfert (Johannes Doerfert)

Differential Revision: https://reviews.llvm.org/D89465
2020-10-15 18:37:29 +00:00
David Green
d80045e272 [LV] Add a getRecurrenceBinOp and make use of it. NFC 2020-10-15 18:21:41 +01:00
Sanjay Patel
21d0e78943 [CostModel] remove cost-kind predicate for ctlz/cttz intrinsics in basic TTI implementation
The cost modeling for intrinsics is a patchwork based on different
expectations from the callers, so it's a mess. I'm hoping to untangle
this to allow canonicalization to the new min/max intrinsics in IR.
The general goal is to remove the cost-kind restriction here in the
basic implementation class. Ie, if some intrinsic has throughput cost
of 104, assume that it has the same size, latency, and blended costs.
Effectively, an intrinsic with cost N is composed of N simple
instructions. If that's not correct, the target should provide a more
accurate override.

The x86-64 SSE2 subtarget cost diffs require explanation:

1. The scalar ctlz/cttz are assuming "BSR+XOR+CMOV" or
   "TEST+BSF+CMOV/BRANCH", so not cheap.
2. The 128-bit SSE vector width versions assume cost of 18 or 26
   (no explanation provided in the tables, but this corresponds to a
   bunch of shift/logic/compare).
3. The 512-bit vectors in the test file are scaled up by a factor of
   4 from the legal vector width costs.
4. The plain latency cost-kind is not affected in this patch because
   that calc is diverted before we get to getIntrinsicInstrCost().

Differential Revision: https://reviews.llvm.org/D89461
2020-10-15 13:14:41 -04:00
Hiroshi Yamauchi
7e9ad11889 [PGO] Remove the old memop value profiling buckets.
Following up D81682 and D83903, remove the code for the old value profiling
buckets, which have been replaced with the new, extended buckets and disabled by
default.

Also syncing InstrProfData.inc between compiler-rt and llvm.

Differential Revision: https://reviews.llvm.org/D88838
2020-10-15 10:09:49 -07:00
Thomas Lively
29bda9c7c9 Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c6bfd90ab2ddaa60de62878c8512db0645e8452.
2020-10-15 15:49:36 +00:00
Thomas Lively
1dd8fe9b9b [WebAssembly] v128.load{8,16,32,64}_lane instructions
Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.

Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.

Differential Revision: https://reviews.llvm.org/D89366
2020-10-15 15:33:10 +00:00
JonChesterfield
9e44ac573e [NFC] Fix license header from D87841 2020-10-15 15:41:11 +01:00
Paul C. Anagnostopoulos
15f7b61423 [TableGen] Add the !not and !xor operators.
Update the TableGen Programmer's Reference.
2020-10-15 10:12:59 -04:00
Jeremy Morse
023a53e89a [DebugInstrRef] Support recording of instruction reference substitutions
Add a table recording "substitutions" between pairs of <instruction,
operand> numbers, from old pairs to new pairs. Post-isel optimizations are
able to record the outcome of an optimization in this way. For example, if
there were a divide instruction that generated the quotient and remainder,
and it were replaced by one that only generated the quotient:

  $rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

Became:

  $rax = DIV $rdx, $rsi, debug-instr-num 2
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

We could enter a substitution from <1, 0> to <2, 0>, and no substitution
for <1, 1> as it's no longer generated.

This approach means that if an instruction or value is deleted once we've
left SSA form, all variables that used the value implicitly become
"optimized out", something that isn't true of the current DBG_VALUE
approach.

Differential Revision: https://reviews.llvm.org/D85749
2020-10-15 11:30:14 +01:00
Georgii Rymar
1e8da37adf [yaml2obj/obj2yaml] - Add support of 'Size' and 'Content' keys for all sections.
Many sections either do not have a support of `Size`/`Content` or support just a
one of them, e.g only `Content`.

`Section` is the base class for sections. This patch adds `Content` and `Size` members
to it and removes similar members from derived classes. This allows to cleanup and
generalize the code and adds a support of these keys for all sections (`SHT_MIPS_ABIFLAGS`
is a only exception, it requires unrelated specific changes to be done).

I had to update/add many tests to test the new functionality properly.

Differential revision: https://reviews.llvm.org/D89039
2020-10-15 11:11:41 +03:00
Luqman Aden
febfadd9f6 [LLD] Set alignment as part of Characteristics in TLS table.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46473

LLD wasn't previously specifying any specific alignment in the TLS table's Characteristics field so the loader would just assume the default value (16 bytes). This works most of the time except if you have thread locals that want specific higher alignments (e.g. 32 as in the bug) *even* if they specify an alignment on the thread local. This change updates LLD to take the max alignment from tls section.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D88637
2020-10-15 00:22:40 -07:00
Luqman Aden
54ac46e473 Revert "[LLD] Set alignment as part of Characteristics in TLS table."
Revert individual wip commits and will instead follow up with a
single commit with all the changes. Makes cherry-picking easier
and will contain all the right tags.

This reverts commit 32a4ad3b6ce6028a371b028cf06fa5feff9534bf.
This reverts commit 7fe13af676678815989a6d0ece684687953245e7.
This reverts commit 51fbc1bef657bb0f5808986555ec3517a84768c4.
This reverts commit f80950a8bb985c082b26534b0e157447bf803935.
This reverts commit 0778cad9f325df4d7b32b22f3dba201a16a0b8fe.
This reverts commit 8b70d527d7ec1c8b9e921177119a0d906ffad4f0.
2020-10-15 00:21:36 -07:00
David Blaikie
64302c9d6e llvm-symbolizer: Exit non-zero when DWARF parsing errors have been rendered 2020-10-14 23:42:00 -07:00
Luqman Aden
dcba0fbd53 Mask out existing alignment bits. 2020-10-14 19:34:32 -07:00
Luqman Aden
b2a8035b8d Fix style warnings. 2020-10-14 19:34:31 -07:00
Luqman Aden
0141307554 [LLD] Set alignment as part of Characteristics in TLS table.
Differential Revision: https://reviews.llvm.org/D88637
2020-10-14 19:34:31 -07:00
Reid Kleckner
4b5ed96964 [ADT] Use alignas + sizeof for inline storage, NFC
AlignedCharArrayUnion is really only needed to handle the "union" case
when we need memory of suitable size and alignment for multiple types.
SmallVector only needs storage for one type, so use that directly.
2020-10-14 16:16:02 -07:00
Konstantin Zhuravlyov
5f87057393 AMDGPU: Update AMDHSA code object version handling
Differential Revision: https://reviews.llvm.org/D89076
2020-10-14 13:04:27 -04:00
Matt Arsenault
781bfb732b InstCombine: Fix infinite loop in copy-constant-to-alloca transform
This was broken by 16295d521e294b27106e51fac29957c1aac8ff89, when
instructions started being handled and not just constant
expressions. This was re-inserting an equivalent bitcast to the
original memcpy operand, which made a non-functional IR change on
every iteration.

This also fixes a secondary problem where it was inserting
addrspacecasts which may not have been legal (i.e. it changed the
source address space). Start visiting all pointer users and fail out
if we can't process them. Also start handling the relevant memory
intrinsic users. These cases can be dealt with by running
InferAddressSpaces separately.
2020-10-14 12:55:25 -04:00
jasonliu
d77cbcb130 [AIX] Turn -fdata-sections on by default in Clang
Summary:

This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
 parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.

Reviewed By: MaskRay, DiggerLin

Differential Revision: https://reviews.llvm.org/D88737
2020-10-14 15:58:31 +00:00
Mircea Trofin
55a70d8e91 [NFC][MC] Use MCRegister in Machine{Sink|Pipeliner}.cpp
Differential Revision: https://reviews.llvm.org/D89328
2020-10-14 08:42:17 -07:00
Konstantin Zhuravlyov
5d60e1508a Remove Combine.td.rej file 2020-10-14 11:39:28 -04:00
Simon Pilgrim
29abe5e4b6 [InstCombine] Add m_SpecificIntAllowUndef pattern matcher
m_SpecificInt doesn't accept undef elements in a vector splat value - tweak specific_intval to optionally allow undefs and add the m_SpecificIntAllowUndef variants.

Allows us to remove the m_APIntAllowUndef + comparison hack inside matchFunnelShift
2020-10-14 16:15:53 +01:00
Juneyoung Lee
7534326bff [ValueTracking] Use assume's noundef operand bundle
This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89219
2020-10-14 20:16:33 +09:00