1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

42489 Commits

Author SHA1 Message Date
Amara Emerson
635632451e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
Benjamin Kramer
46460daf85 [wasm] Move WasmTraits.h to BinaryFormat
There's no dependency on Object in there and this avoids a cyclic
dependency between libMC and libObject.
2020-09-28 22:07:28 +02:00
Sanjay Patel
068a0e3768 [CostModel] remove hack for intrinsic cost based on cost type
This hack seems to only have been necessary because of the
constructor bug noted in 33125cffd.

Once again, it's hard to prove NFC, but that's the hope...
2020-09-28 15:58:42 -04:00
Sanjay Patel
1d7afc6eeb [CostModel] move early exit for free intrinsics
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
2020-09-28 13:30:55 -04:00
Sanjay Patel
2915eafc46 [CostModel] split handling of intrinsics from other calls
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.

The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
2020-09-28 13:30:55 -04:00
Jessica Paquette
7a97485533 [GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:

```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```

Produce this:

```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```

as long as we are guaranteed to eliminate the original G_AND.

Also matches all commuted forms. E.g.

```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```

will be matched as well.

Differential Revision: https://reviews.llvm.org/D88104
2020-09-28 10:08:14 -07:00
Paul C. Anagnostopoulos
90d4bf8784 [TableGen] Improved messages in PseudoLoweringEmitter. 2020-09-28 10:18:22 -04:00
Georgii Rymar
b76a0e7b80 [yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section.
This adds the support for SHT_ARM_EXIDX sections to obj2yaml/yaml2obj tools.

SHT_ARM_EXIDX is a ARM specific index table filled with entries.
Each entry consists of two 4-bytes values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)

Differential revision: https://reviews.llvm.org/D88228
2020-09-28 11:45:49 +03:00
Benjamin Kramer
d83ca05fea [Coroutines] Remove unused includes. NFC. 2020-09-28 10:27:23 +02:00
Chuanqi Xu
5802e5931e [Coroutines] Reuse storage for local variables with non-overlapping lifetimes
bug 45566 shows the process of building coroutine frame won't consider
that the lifetimes of different local variables are not overlapped,
which means the compiler could generates smaller frame.

This patch calculate the lifetime range of each alloca by StackLifetime
class. Then the patch build non-overlapped sets for allocas whose
lifetime ranges are not overlapped. We use the largest type in a
non-overlapped set as the field type in the frame. In insertSpills
process, if we find the type of field is not the same with the alloca,
we cast the pointer to the field type to the pointer to the alloca type.
Since the lifetime range of alloca in one non-overlapped set is not
overlapped with each other, it should be ok to reuse the storage space
in the frame.

Test plan: check-llvm, check-clang, cppcoro, folly

Reviewers: junparser, lxfind, modocache

Differential Revision: https://reviews.llvm.org/D87596
2020-09-28 15:48:00 +08:00
David Sherwood
0927cfa9f6 [SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy
After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:

  (MinSize * Vscale) / SomeValue

This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explictly call divideCoefficientBy() on the class to perform the
operation.

Differential Revision: https://reviews.llvm.org/D87700
2020-09-28 08:03:00 +01:00
Arthur Eubanks
546a7d793e Revert "Reland [CodeGen] emit CG profile for COFF object file"
This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac.

This still causes link errors, see https://crbug.com/1130780.
2020-09-27 22:43:14 -07:00
Nikita Popov
9408648c10 [LVI][CVP] Use block value when simplifying icmps
Add a flag to getPredicateAt() that allows making use of the block
value. This allows us to take into account range information from
the current block, rather than only information that is threaded
over edges, making the icmp simplification in CVP a lot more
powerful.

I'm not changing getPredicateAt() to use the block value
unconditionally to avoid any impact on the JumpThreading pass,
which is somewhat picky about LVI query order.

Most test changes here are just icmps that now get dropped (while
previously only a result used in a return was replaced). The three
tests in icmp.ll show some representative improvements. Some of
the folds this enables have been covered by IPSCCP in the meantime,
but LVI can reason about some cases which are hard to support in
IPSCCP, such as in test_br_cmp_with_offset.

The compile-time time cost of doing this is fairly minimal, with
a ~0.05% CTMark regression for ReleaseThinLTO:
https://llvm-compile-time-tracker.com/compare.php?from=709d03f8af4da4204849a70f01798e7cebba2e32&to=6236fd503761f43c99f4537121e057a01056f185&stat=instructions

This is because the block values will typically already be queried
and cached by other CVP optimizations anyway.

Differential Revision: https://reviews.llvm.org/D69686
2020-09-27 20:25:16 +02:00
Fangrui Song
cd74c48503 [NewPM] Port ConstraintElimination to the new pass manager
If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline.
(-O1 uses a separate function now.)

Reviewed By: fhahn, aeubanks

Differential Revision: https://reviews.llvm.org/D88365
2020-09-27 11:12:26 -07:00
Nikita Popov
a436b4c09d [LVI] Require context instruction in external API (NFCI)
Require CxtI in getConstant() and getConstantRange() APIs.
Accordingly drop the BB parameter, as it is implied by
CxtI->getParent().

This makes sure we don't forget to pass the context instruction,
and makes the API contract clearer (also clean up the comments to
that effect -- the value holds at the context instruction, not
the end of the block).
2020-09-27 18:07:24 +02:00
Robert Widmann
8be02c632a [LLVM-C] Turn a ShuffleVector Constant Into a Getter.
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter.

Differential Revision: https://reviews.llvm.org/D88367
2020-09-26 17:32:57 -06:00
Paul C. Anagnostopoulos
2483a36836 [TableGen] Add/edit Doxygen comments to match "TableGen Backend Developer's Guide." 2020-09-26 09:09:22 -04:00
Qiu Chaofan
06278e0f7f [SelectionDAG] Add guard to automatically insert flags
This is like FastMathFlagGuard in IR. Since we use SDAG instance to get
values, it's with SelectionDAG. By creating a FlagInserter in current
scope, all values created by getNode will get the flags if no Flags
argument provided.

In this patch, I applied it to floating point operations folding part in
DAG combiner, and removed Flags passing to getNode to show its effect.
Other places in DAG combiner and other helper methods similar to getNode
also need this. They can be done in follow-up patches.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D87361
2020-09-26 13:57:52 +08:00
John Demme
789af735fa Common code preparation for tblgen-types patch
Cleanup and add methods which https://reviews.llvm.org/D86904 requires. Breaking up to lower review load.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D88267
2020-09-26 02:47:48 +00:00
Arthur Eubanks
67d8bcd8ba [LowerTypeTests][NewPM] Add constructor that uses command line flags
This matches the legacy PM pass by having one constructor use command
line flags, and the other use parameters to the pass.

This fixes all tests under Transforms/LowerTypeTests using NPM.

Reviewed By: ychen, pcc

Differential Revision: https://reviews.llvm.org/D87845
2020-09-25 17:39:59 -07:00
Michael Collison
6f3ffebafe [RISCV] Scheduler description for Bullet
Add the pipeline model for the RISC-V Bullet micro architecture.

Co-authored-by: Evandro Menezes <evandro.menezes@sifive.com>
2020-09-25 18:36:53 -05:00
Alexander Shaposhnikov
9a180ff58f [Object][MachO] Refine the interface of Slice
This patch performs a minor cleanup of the class Slice:
static methods and constructors which take a pointer but assume that
it's not null now take the argument by reference.
NFC.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D88320
2020-09-25 16:27:45 -07:00
Craig Topper
c8dd30699b [IR] Improve the description for Constant::isNormalFP to list all things that are not normal instead of just denormal. NFC 2020-09-25 16:26:46 -07:00
Craig Disselkoen
1cbae6c6e8 C API: functions to get mask of a ShuffleVector
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM
C API.

Previously, commit 1ee6ec2bf removed the mask operand from the
ShuffleVector instruction, storing the mask data separately in the
instruction instead; this reduced the number of operands of
ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused
a regression in the LLVM C API. Specifically, it is no longer possible
to get the mask of a ShuffleVector instruction through the C API. This
patch introduces new functions which together allow a C API user to get
the mask of a ShuffleVector instruction, restoring the functionality
which was previously available through LLVMGetOperand().

This patch also adds tests for this change to the llvm-c-test
executable, which involved adding support for InsertElement,
ExtractElement, and ShuffleVector itself (as well as constant vectors)
to echo.cpp. Previously, vector operations weren't tested at all in
echo.ll.

I also fixed some typos in comments and help-text nearby these changes,
which I happened to spot while developing this patch. Since the typo
fixes are technically unrelated other than being in the same files, I'm
happy to take them out if you'd rather they not be included in the patch.

Differential Revision: https://reviews.llvm.org/D88190
2020-09-25 16:01:05 -07:00
Eli Friedman
1a2bff22f4 [AArch64][SVE] Drop "argmemonly" from gather/scatter with vector base.
The intrinsics don't have any pointer arguments, so "argmemonly" makes
optimizations think they don't write to memory at all.

Differential Revision: https://reviews.llvm.org/D88186
2020-09-25 16:01:05 -07:00
Simon Pilgrim
effa5d6f54 Fix copy+paste typo in doxygen parameter name to fix Wdocumentation. NFCI. 2020-09-25 22:09:51 +01:00
Arthur Eubanks
f62987dedb [LoopReroll][NewPM] Port -loop-reroll to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87957
2020-09-25 12:09:06 -07:00
Thomas Lively
06c1a3680b [WebAssembly] Check features before making SjLj vars thread-local
1c5a3c4d3823 updated the variables inserted by Emscripten SjLj lowering to be
thread-local, depending on the CoalesceFeaturesAndStripAtomics pass to downgrade
them to normal globals if the target features did not support TLS. However, this
had the unintended side effect of preventing all non-TLS-supporting objects from
being linked into modules with shared memory, because stripping TLS marks an
object as thread-unsafe. This patch fixes the problem by only making the SjLj
lowering variables thread-local if the target machine supports TLS so that it
never introduces new usage of TLS that will be stripped. Since SjLj lowering
works on Modules instead of Functions, this required that the
WebAssemblyTargetMachine have its feature string updated to reflect the
coalesced features collected from all the functions so that a
WebAssemblySubtarget can be created without using any particular function.

Differential Revision: https://reviews.llvm.org/D88323
2020-09-25 11:45:16 -07:00
Matt Arsenault
0ec533bb8a OpaquePtr: Add type to sret attribute
Make the corresponding change that was made for byval in
b7141207a483d39b99c2b4da4eb3bb591eca9e1a. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
2020-09-25 14:07:30 -04:00
Hans Wennborg
dcc4c6d98a Move PassBuilder::registerParseTopLevelPipelineCallback out-of-line
For some mysterious reason it doesn't build with clang-cl when compiled
as part of the includes in clang's CodeGenAction.cpp
(crbug.com/1132292).
2020-09-25 19:55:40 +02:00
Dávid Bolvanský
9b814fe81c [SystemZ] Optimize bcmp calls (PR47420)
Solves https://bugs.llvm.org/show_bug.cgi?id=47420

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D87988
2020-09-25 17:55:39 +02:00
Snehasish Kumar
74d368792a [llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section.
This change adds an option to basic block sections to allow cold
clusters to be assigned a custom text prefix. With a custom prefix such
as ".text.split." (D87840), lld can place them in a separate output section.
The benefits are -

* Empirically shown to improve icache and itlb metrics by 3-5%
(absolute) compared to placing split parts in .text.unlikely.
* Mitigates against poor profiles, eg samplePGO profiles used with the
machine function splitter. Optimizations such as hugepage remapping can
make different decisions at the section granularity.
* Enables section granularity hotness monitoring (checking on the
decisions made during compilation vs sample data from production).

Differential Revision: https://reviews.llvm.org/D87813
2020-09-24 15:26:15 -07:00
Joseph Huber
7b42f2f140 [OpenMP] OpenMPOpt Support for Globalization Remarks
Summary:
This patch add support for printing analysis messages relating to data
globalization on the GPU. This occurs when data is shared between the
threads in a GPU context and must be pushed to global or shared memory.

Reviewers: jdoerfert

Subscribers: guansong hiraditya llvm-commits ormris sstefan1 yaxunl

Tags: #OpenMP #LLVM

Differential Revision: https://reviews.llvm.org/D88243
2020-09-24 18:23:12 -04:00
Vedant Kumar
20166f03cc [Instruction] Add dropLocation and updateLocationAfterHoist helpers
Introduce a helper which can be used to update the debug location of an
Instruction after the instruction is hoisted. This can be used to safely
drop a source location as recommended by the docs.

For more context, see the discussion in https://reviews.llvm.org/D60913.

Differential Revision: https://reviews.llvm.org/D85670
2020-09-24 15:00:04 -07:00
Zequan Wu
e7cf57fa10 Reland [CodeGen] emit CG profile for COFF object file
This reverts commit 90242caca2074dab5a9b76e5bc36d9fafd2179a7.

Error fixed at f5435399e823746bbe1737b95c853d77a42e1ac3

Differential Revision: https://reviews.llvm.org/D87811
2020-09-24 14:38:53 -07:00
Daniel Kiss
5f9dfa91a0 [AArch64] __builtin_return_address for PAuth.
This change adds the support for __builtin_return_address
for ARMv8.3A Pointer Authentication.
Location of the authentication code in the pointer depends on
the system configuration, therefore a dedicated instruction is used for
effectively removing the authentication code without
authenticating the pointer.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D75044
2020-09-24 23:23:49 +02:00
Andrew Litteken
c432043c07 [IRSim] Adding wrapper pass for IRSimilarityIdentfier
This introduces an analysis pass that wraps IRSimilarityIdentifier,
and adds a printer pass to examine in what function similarities are
being found.

Test for what the printer pass can find are in
test/Analysis/IRSimilarityIdentifier.

Reviewed by: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D86973
2020-09-24 14:59:41 -05:00
Matt Arsenault
8bd5d0338f OpaquePtr: Add helpers for sret to mirror byval
Sret should really have a type parameter like byval does.
2020-09-24 09:57:28 -04:00
Alexandre Ganea
e7d01f8d51 [Support] On Unix, let the CrashRecoveryContext return the signal code
Before this patch, the CrashRecoveryContext was returning -2 upon a signal, like ExecuteAndWait does. This didn't match the behavior on Windows, where the the exception code was returned.

We now return the signal's code, which optionally allows for re-throwing the signal later. Doing so requires all custom handlers to be removed first, through llvm::sys::unregisterHandlers() which we made a public API.

This is part of https://reviews.llvm.org/D70378
2020-09-24 08:21:43 -04:00
Alexandre Ganea
6aabad1fd2 [Support] On Windows, ensure abort() can be catched several times in a row with CrashRecoveryContext
Before this patch, the CrashRecoveryContext would only catch the first abort(). Any further calls to abort() inside subsquent CrashRecoveryContexts would not be catched. This is because the Windows CRT removes the abort() handler before calling it.

This is part of https://reviews.llvm.org/D70378
2020-09-24 08:21:42 -04:00
Florian Hahn
e863e6ec81 [SCEV] Use loop guard info when computing the max BE taken count in howFarToZero.
For some expressions, we can use information from loop guards when
we are looking for a maximum. This patch applies information from
loop guards to the expression used to compute the maximum backedge
taken count in howFarToZero. It currently replaces an unknown
expression X with UMin(X, Y), if the loop is guarded by
X ult Y.

This patch is minimal in what conditions it applies, and there
are a few TODOs to generalize.

This partly addresses PR40961. We will also need an update to
LV to address it completely.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D67178
2020-09-24 11:06:55 +01:00
David Sherwood
f6f274223b [SVE] Add new isKnownXX comparison functions to TypeSize
This patch introduces four new comparison functions:

  isKnownLT, isKnownLE, isKnownGT, isKnownGE

that return true if we know at compile time that a particular
condition is met, i.e. that one size is definitely greater than
another. The existing operators <,>,<=,>= remain in the code for
now, but over time we would like to remove them and change the
code to use the isKnownXY routines instead. These functions do
not assert like the existing operators because the caller is
expected to properly deal with cases where we return false by
analysing the scalable properties. I've made more of an effort
to deal with cases where there are mixed comparisons, i.e. between
fixed width and scalable types.

I've also added some knownBitsXY routines to the EVT and MVT
classes that call the equivalent TypeSize::isKnownXY routines.
I've changed the existing bitsXY functions to call their knownBitsXY
equivalents and added asserts that the scalable properties match.
Again, over time we expect to migrate callers to use knownBitsXY
and make the code more aware of the scalable nature of the sizes.

Differential revision: https://reviews.llvm.org/D88098
2020-09-24 10:22:57 +01:00
Andrew Litteken
b786a09fd0 [IRSim] Adding a basic similarity identifier.
This takes the mapped instructions from the IRInstructionMapper, and
passes it to the Suffix Tree to find the repeated substrings.  Within
each set of repeated substrings, the IRSimilarityCandidates are compared
against one another for structure, and ensuring that the operands in the
instructions are used in the same way.  Each of these structurally
similarity IRSimilarityCandidates are contained in a SimilarityGroup.

Tests checking for identifying identity of structure, different
isomorphic structure, and different
nonisomoprhic structure are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.

Differential Revision: https://reviews.llvm.org/D86972
2020-09-24 02:05:25 -05:00
Xing GUO
0719e72f68 [DWARFYAML] Make the ExtLen field of extended opcodes optional.
This patch makes the 'ExtLen' field of extended opcodes optional. We
don't need to manually calculate it in the future.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D88136
2020-09-24 14:13:26 +08:00
David Blaikie
c5476c7c57 DebugInfo: Filter DWARFv5 TUs out of the debug_info unit list when CUs requested
Since DWARFv5 places TUs in debug_info, some of DWARFContext's APIs have
become a bit erroneous, including TUs in the CU list by accident.
Correct that by providing compile_units (& dwo_compile_units) that
filter out the type units from the debug_info units.

Differential Revision: https://reviews.llvm.org/D87935
2020-09-23 22:15:53 -07:00
Andrew Litteken
3d12233786 [IRSim] Adding structural comparison to IRSimilarityCandidate.
Just because sequences of instructions are similar to one another,
doesn't mean they are doing the same thing.

This introduces a structural check for the IRSimilarityCandidate that
compares two IRSimilarityCandidates against one another, and in each
instruction creates a mapping between the operands and results, or
checks that the existing mapping is valid.  If this check passes, it
means we have structurally similar IRSimilarityCandidates.

Tests for whether the candidates are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.

Recommit of: b27db2bb68163fa5bcb4a8f631a305eb5adb44e5 for Differential
URL.

Differential Revision: https://reviews.llvm.org/D86971
2020-09-23 22:42:30 -05:00
Andrew Litteken
d5678d1cef Revert "[IRSim] Adding structural comparison to IRSimilarityCandidate."
This reverts commit b27db2bb68163fa5bcb4a8f631a305eb5adb44e5.
2020-09-23 22:40:37 -05:00
Andrew Litteken
5b31a525de [IRSim] Adding structural comparison to IRSimilarityCandidate.
Just because sequences of instructions are similar to one another,
doesn't mean they are doing the same thing.

This introduces a structural check for the IRSimilarityCandidate that
compares two IRSimilarityCandidates against one another, and in each
instruction creates a mapping between the operands and results, or
checks that the existing mapping is valid.  If this check passes, it
means we have structurally similar IRSimilarityCandidates.

Tests for whether the candidates are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.
2020-09-23 22:31:12 -05:00
Pushpinder Singh
9fec09e02f [GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH
Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D85653
2020-09-23 22:25:29 -04:00
Arthur Eubanks
c0f4b781c8 [NFC] Remove unnecessary default constructors 2020-09-23 18:54:10 -07:00