1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

212512 Commits

Author SHA1 Message Date
Nikita Popov
30e03f0628 [UnitTests] Remove uses of deprecated CreateLoad() API
Missed this usage inside OpenMPIRBuilderTest.
2021-03-11 19:05:53 +01:00
Craig Topper
dd58d36de0 [RISCV] Support fixed vector copysign.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98394
2021-03-11 09:57:24 -08:00
David Green
9b84945f7a [ARM] Move t2DoLoopStart reg alloc hint
This adjusts the place that the t2DoLoopStart reg allocation hint is
inserted, adding it in the ARMTPAndVPTOptimizaionPass in a similar place
as other tail predicated loop optimizations. This removes the need for
doing so in a custom inserter, and should make the hint more accurate,
only adding it where we expect to create a DLS (not DLSTP or WLS).
2021-03-11 17:56:19 +00:00
David Green
18fc27f084 [ARM] Improve WLS lowering
Recently we improved the lowering of low overhead loops and tail
predicated loops, but concentrated first on the DLS do style loops. This
extends those improvements over to the WLS while loops, improving the
chance of lowering them successfully. To do this the lowering has to
change a little as the instructions are terminators that produce a value
- something that needs to be treated carefully.

Lowering starts at the Hardware Loop pass, inserting a new
llvm.test.start.loop.iterations that produces both an i1 to control the
loop entry and an i32 similar to the llvm.start.loop.iterations
intrinsic added for do loops. This feeds into the loop phi, properly
gluing the values together:

  %wls = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %div)
  %wls0 = extractvalue { i32, i1 } %wls, 0
  %wls1 = extractvalue { i32, i1 } %wls, 1
  br i1 %wls1, label %loop.ph, label %loop.exit
...
loop:
  %lsr.iv = phi i32 [ %wls0, %loop.ph ], [ %iv.next, %loop ]
  ..
  %iv.next = call i32 @llvm.loop.decrement.reg.i32(i32 %lsr.iv, i32 1)
  %cmp = icmp ne i32 %iv.next, 0
  br i1 %cmp, label %loop, label %loop.exit

The llvm.test.start.loop.iterations need to be lowered through ISel
lowering as a pair of WLS and WLSSETUP nodes, which each get converted
to t2WhileLoopSetup and t2WhileLoopStart Pseudos. This helps prevent
t2WhileLoopStart from being a terminator that produces a value,
something difficult to control at that stage in the pipeline. Instead
the t2WhileLoopSetup produces the value of LR (essentially acting as a
lr = subs rn, 0), t2WhileLoopStart consumes that lr value (the Bcc).

These are then converted into a single t2WhileLoopStartLR at the same
point as t2DoLoopStartTP and t2LoopEndDec. Otherwise we revert the loop
to prevent them from progressing further in the pipeline. The
t2WhileLoopStartLR is a single instruction that takes a GPR and produces
LR, similar to the WLS instruction.

  %1:gprlr = t2WhileLoopStartLR %0:rgpr, %bb.3
  t2B %bb.1
...
bb.2.loop:
  %2:gprlr = PHI %1:gprlr, %bb.1, %3:gprlr, %bb.2
  ...
  %3:gprlr = t2LoopEndDec %2:gprlr, %bb.2
  t2B %bb.3

The t2WhileLoopStartLR can then be treated similar to the other low
overhead loop pseudos, eventually being lowered to a WLS providing the
branches are within range.

Differential Revision: https://reviews.llvm.org/D97729
2021-03-11 17:56:19 +00:00
Simon Wallis
fb37b7e70a [AArch64] Fix -Wunused-but-set-variable in GCC non-debug build
[AArch64] Fix -Wunused-but-set-variable in GCC -DLLVM_ENABLE_ASSERTIONS=off non-debug build.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D98431
2021-03-11 17:54:05 +00:00
Hiroshi Yamauchi
1295ae3ffe [PGO] Fix two issues in PGOMemOPSizeOpt.
1. PGOMemOPSizeOpt grabs only the first, up to five (by default) entries from
the value profile metadata and preserves the remaining entries for the fallback
memop call site. If there are more than five entries, the rest of the entries
would get dropped. This is fine for PGOMemOPSizeOpt itself as it only promotes
up to 3 (by default) values, but potentially not for other downstream passes
that may use the value profile metadata.

2. PGOMemOPSizeOpt originally assumed that only values 0 through 8 are kept
track of. When the range buckets were introduced, it was changed to skip the
range buckets, but since it does not grab all entries (only five), if some range
buckets exist in the first five entries, it could potentially cause fewer
promotion opportunities (eg. if 4 out of 5 were range buckets, it may be able to
promote up to one non-range bucket, as opposed to 3.) Also, combined with 1, it
means that wrong entries may be preserved, as it didn't correctly keep track of
which were entries were skipped.

To fix this, PGOMemOPSizeOpt now grabs all the entries (up to the maximum number
of value profile buckets), keeps track of which entries were skipped, and
preserves all the remaining entries.

Differential Revision: https://reviews.llvm.org/D97592
2021-03-11 09:53:05 -08:00
Nikita Popov
332a4fa84c [IRBuilder] Deprecate CreateLoad APIs with implicit type
These APIs are not compatible with opaque pointers. Deprecate them
to avoid the introduction of further uses.
2021-03-11 18:46:45 +01:00
Craig Topper
c201bc69e6 [RISCV] Handle vmv.x.s intrinsic for i64 vectors on RV32.
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98372
2021-03-11 09:39:50 -08:00
Craig Topper
2632043921 [RISCV] Support extract_vector_elt for fixed and scalable masked registers.
This uses a really simple approach of converting to an i8 vector
and extracting. This is probably not the best approach especially
if you know the index is constant.

Other ideas:
-Store to stack temporary using vse1, load as scalar and shift.
-Sort of bitcast the vector to a vector of i8, slide down the
 appropriate 8 bit element, copy to scalar, shift down the
 correct bit within the 8 bits we extracted. Not exactly sure
 how to describe such a bitcast from i1 vector to i8 vector
 within the type system for elements less than 8.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98310
2021-03-11 09:26:44 -08:00
Craig Topper
0a46c560b5 [ValueTypes][RISCV] Add MVT for v1f16.
RISCV makes all fixed vector MVTs with size less than or equal
to a command line option legal.

This didn't include v1f16 because it was missing but did include v1f32 and v1f64.

One test is affected where we did test this type, but it is a horizontal
reduction so it is non-sensical. Perhaps we should canonicalize that
away somewhere.

I'm not sure if we should be making v1 types legal, but this will at
least make RISCV consistent across all types.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D98365
2021-03-11 09:23:18 -08:00
Matt Arsenault
8d21dc7b08 AMDGPU/GlobalISel: Improve private addressing mode matching
This enables the look-through-copy to hack around not correctly
regbankselecting constants to match the use bank.
2021-03-11 10:23:35 -05:00
Matt Arsenault
d9bf8e78b9 GlobalISel: Fix off by one in finding explicit byval alignment
For attribute sets, the return index is at 0, and arguments start at
1. getParamAlignment adds the offset of 1, so we need to convert from
attribute index back to IR index.
2021-03-11 10:23:08 -05:00
Matt Arsenault
0d5c3bafbb AMDGPU/GlobalISel: Add more tests for byval arguments 2021-03-11 10:23:08 -05:00
Joseph Huber
ff5095970f [OpenMP] Restore backwards compatibility for libomptarget
Summary:
The changes introduced in D87946 changed the API for libomptarget
functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x
but was not given a backward-compatible interface. This change will
require people using Clang 13.x or 12.x to recompile their offloading
programs.

Reviewed By: jdoerfert cchen

Differential Revision: https://reviews.llvm.org/D98358
2021-03-11 09:52:11 -05:00
Stephen Tozer
617750cbc9 Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This reverts commit c0f3dfb9f119bb5f22dd8846f5502b6abaf026d3.

Reverted due to an error on the clang-x64-windows-msvc buildbot.
2021-03-11 14:48:01 +00:00
Simon Pilgrim
d3a46742cf [llvm-mca] Fix uninitialized variable in InOrderIssueStage constructor warning. NFCI. 2021-03-11 14:41:20 +00:00
Stefan Pintilie
04d6c68955 [PowerPC] Exploit paddi instruction on Power 10 for constant materialization
Starting with Power 10 the instruction paddi is available to use.
The instruction allows for immediates that are 34 bits.

This patch adds exploitation of the paddi instruction to allow us
to materialize constants.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D93300
2021-03-11 08:37:49 -06:00
Stefan Gränitz
36b73f9593 [Orc] Deallocate debug objects explicitly when destroying the DebugObjectManagerPlugin 2021-03-11 15:26:16 +01:00
Simon Pilgrim
3a70fa60a4 [Transforms] SampleProfileLoaderBaseImpl<BT>::getFunctionLoc - fix Wdocumentation warnings. NFCI. 2021-03-11 14:04:08 +00:00
Qiu Chaofan
dd1d18cacf [PowerPC] Fix multi-use case for swap reduction
4c973ae implemented reduction of vector swap for lane-insensitive
operations. This commit fixes it for checking number of uses of the
vector operation.
2021-03-11 21:58:33 +08:00
Nikita Popov
e1008acc0b [OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC)
Explicitly pass loaded type when creating loads, in preparation
for the deprecation of these APIs.

There are still a couple of uses left.
2021-03-11 14:40:57 +01:00
Bradley Smith
fcd766a749 [AArch64][SVE] Add fixed/scalable lowering of FMAXIMUM/FMINIMUM ISD nodes
Differential Revision: https://reviews.llvm.org/D98348
2021-03-11 13:37:47 +00:00
gbtozers
3b8eba58df [DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands
This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic
operations with two or more non-const operands; this includes the GetElementPtr
instruction, and most Binary Operator instructions. These salvages produce
DIArgList locations and are only valid for dbg.values, as currently variadic
DIExpressions must use DW_OP_stack_value. This functionality is also only added
for salvageDebugInfoForDbgValues; other functions that directly call
salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be
updated in a later patch.

Differential Revision: https://reviews.llvm.org/D91722
2021-03-11 13:33:49 +00:00
Bradley Smith
142ffc8786 Revert "[AArch64][SVE] Allow accesses to SVE stack objects to use frame pointer"
This patch introduced codegen faults.  An attempt to fix this was done
in https://reviews.llvm.org/D97193, but ultimately it was decided to
approach this differently.

This reverts commit 42635856ed3c9a05957640f9deb50cf865c03825.

Differential Revision: https://reviews.llvm.org/D98350
2021-03-11 13:32:35 +00:00
Nikita Popov
fbfc30d1ae [PowerPC] Fix infinite loop in peephole CR optimization (PR49509)
If we encounter a degenerate select node where both operands are
the same, then we can continue negating the condition while swapping
operands, resulting in an infinite loop. Avoid this by bailing out
if both operands are the same.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49509.

Differential Revision: https://reviews.llvm.org/D98340
2021-03-11 14:25:22 +01:00
Simon Pilgrim
9ddda34f05 Revert rGcd938ab162b0ac560dd0e9fee290980c7e0e47e5 "[X86] canonicalizeShuffleWithBinOps - add X86ISD::PSHUFB handling."
Investigating an issue reported by @bkramer, possibly when the PSHUFB mask generates zero elements.
2021-03-11 13:14:00 +00:00
Simon Pilgrim
32e46239ea [X86] Don't attempt to fold sub(C1, xor(X, C2)) with opaque constants
Fixes PR49451
2021-03-11 12:06:40 +00:00
Serguei Katkov
3b0fb3a8e1 [Statepoint Lowering] Handle the case with several gc.result
Recently gc.result has been marked with readnone instead of readonly and
this opens a door for different optimization to duplicate gc.result.
Statepoint lowering is not ready to see several gc.results.
The problem appears when there are gc.results with one located in the same
basic block and another located in other basic block.
In this case we need both export VR and fill local setValue.

Note that this case is not sufficient optimization done before CodeGen.
It is evident that local gc.result dominates all other gc.results and it is handled
by GVN and EarlyCSE.

But anyway, even if IR is not optimal Backend should not crash on a valid IR.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98393
2021-03-11 18:44:44 +07:00
Simon Pilgrim
2a408aa2f4 Fix MSVC "'type cast': conversion from 'unsigned int' to 'const llvm::CallBase *' of greater size" warning. NFCI. 2021-03-11 10:40:46 +00:00
Thomas Preud'homme
f41f493697 [FileCheck] Fix naming of OverflowErrorStr var
As pointed out by Joel E. Denny in D97845, the OverflowErrorStr variable
is misnamed because the error is raised for any parsing error. Note that
in FileCheck proper this only happens in case of (under|over)flow
because the regex will ensure a number in the correct format is matched.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D98342
2021-03-11 10:31:04 +00:00
Simon Pilgrim
ae6da12242 [IPO] Fix EXPENSIVE_CHECKS assert added at D83744. NFCI.
It wasn't taking into account that QueryingAA was a pointer.
2021-03-11 10:29:15 +00:00
Jay Foad
c6bd271cd0 [MCA] Support in-order CPUs with MicroOpBufferSize=1
Differential Revision: https://reviews.llvm.org/D98356
2021-03-11 10:12:54 +00:00
Nikita Popov
a833e60074 Reapply [LICM] Make promotion faster
Relative to the previous implementation, this always uses
aliasesUnknownInst() instead of aliasesPointer() to correctly
handle atomics. The added test case was previously miscompiled.

-----

Even when MemorySSA-based LICM is used, an AST is still populated
for scalar promotion. As the AST has quadratic complexity, a lot
of time is spent in this step despite the existing access count
limit. This patch optimizes the identification of promotable stores.

The idea here is pretty simple: We're only interested in must-alias
mod sets of loop invariant pointers. As such, only populate the AST
with loop-invariant loads and stores (anything else is definitely
not promotable) and then discard any sets which alias with any of
the remaining, definitely non-promotable accesses.

If we promoted something, check whether this has made some other
accesses loop invariant and thus possible promotion candidates.

This is much faster in practice, because we need to perform AA
queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
instead of O(NumTotal^2), and NumPromotable tends to be small.
Additionally, promotable accesses have loop invariant pointers,
for which AA is cheaper.

This has a signicant positive compile-time impact. We save ~1.8%
geomean on CTMark at O3, with 6% on lencod in particular and 25%
on individual files.

Conceptually, this change is NFC, but may not be so in practice,
because the AST is only an approximation, and can produce
different results depending on the order in which accesses are
added. However, there is at least no impact on the number of promotions
(licm.NumPromoted) in test-suite O3 configuration with this change.

Differential Revision: https://reviews.llvm.org/D89264
2021-03-11 10:50:28 +01:00
Augusto Noronha
7126fca5c3 Save and restore previous terminal after setting the terminal for checking if terminal supports colors.
The call to "set_curterm" inside the "terminalHasColors" function breaks
the EditLine configuration on some Linux distributions, causing certain
characters that have functions bound to them to not show up and
backspace to stop deleting characters (only visually). This patch
ensures that term struct is restored after the routine for cheking if
terminal supports colors is done, which fixes the aforementioned issue.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D95230
2021-03-11 10:47:06 +01:00
Djordje Todorovic
1e88deac13 [Debugify][OriginalDIMode] Export the report into JSON file
By using the original-di check with debugify in the combination with
the llvm/utils/llvm-original-di-preservation.py it becomes very user
friendly tool. An example of the HTML page with the issues
related to debug info can be found at [0].

[0] https://djolertrk.github.io/di-checker-html-report-example/

Differential Revision: https://reviews.llvm.org/D82546
2021-03-11 01:11:13 -08:00
David Blaikie
82a7e62f0f Fix unused lambda capture in a non-asserts build
For locally scoped lambdas like this there's no particular benefit to
explicitly listing captures - or avoiding capturing this. Switch to [&]
and make it all easier to maintain.

(& driveby change std::function to llvm::function_ref)
2021-03-11 00:22:18 -08:00
Petr Hosek
1de679fe12 [InstrProfiling] Don't generate __llvm_profile_runtime_user
This is no longer needed, we can add __llvm_profile_runtime directly
to llvm.compiler.used or llvm.used to achieve the same effect.

Differential Revision: https://reviews.llvm.org/D98325
2021-03-10 22:33:51 -08:00
Craig Topper
3a79afe74b [RISCV] Merge fixed-vectors-int-splat-rv32.ll and fixed-vectors-int-splat-rv64.ll.
The vXi64 test cases no longer crash on rv32.
2021-03-10 20:15:26 -08:00
Craig Topper
e43efbb1cc [RISCV] Add v2i64 _vi_ and _iv_ test cases to fixed-vectors-int.ll since we no longer crash.
I think we were missing some build_vector or other support and
skipped these test cases. They work now but don't generate
optimal code.
2021-03-10 19:19:47 -08:00
David Blaikie
6526719ee8 Revert "WIP"
Accidental commit.

This reverts commit 60238f29bf489dea7fabbcd5c69753d60a562b36.
2021-03-10 19:17:14 -08:00
David Blaikie
7f68408a6a WIP 2021-03-10 19:13:22 -08:00
Juneyoung Lee
f828df253e Resolve unused variable warning (NFC) 2021-03-11 12:03:03 +09:00
Nico Weber
0c13fbf3c6 [gn build] (manually) Port d6a0560bf258 2021-03-10 21:56:59 -05:00
Zakk Chen
da49c75c2f [Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Demonstrate how to generate vadd/vfadd intrinsic functions

1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D95016
2021-03-10 18:43:43 -08:00
Vy Nguyen
ec56220409 [llvm] Fix thinko in getVendorSignature(), where expected values of ECX and EDX were flipped for the AMD case.
Follow up to D97504

Differential Revision: https://reviews.llvm.org/D98322
2021-03-10 21:39:19 -05:00
Juneyoung Lee
2a52d0ee68 [InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC) 2021-03-11 11:13:46 +09:00
Ruiling Song
d4ef89cda8 [AMDGPU] Always create Stack Object for reserved VGPR
As we may overwrite inactive lanes of a caller-save-vgpr, we should
always save/restore the reserved vgpr for sgpr spill.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D98319
2021-03-11 10:06:07 +08:00
George Balatsouras
bff211afb1 [dfsan] Update atomics.ll test
Remove hard-coded shadow width references and remove irrelevant instructions.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D98376
2021-03-10 17:55:50 -08:00
Ruiling Song
648b01f347 [ValueMapper] Add debug output for metadata remapping
This is useful for debugging which pointers are updated during remapping
process.

Differential Revision: https://reviews.llvm.org/D95775
2021-03-11 09:54:55 +08:00
Daniel Sanders
4335051f8d [mir] Change 'undef' for MMO base addresses to 'unknown-address'
Differential Revision: https://reviews.llvm.org/D98100
2021-03-10 16:46:44 -08:00