1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00
Commit Graph

218839 Commits

Author SHA1 Message Date
Nikita Popov
8eac42b1cf [Inline] Fix noalias addition on simplified instructions (PR50589)
When adding noalias/alias.scope metadata, we analyze the instructions
of the original callee, and then place metadata on the corresponding
inlined instructions in the caller as provided by VMap. However, this
assumes that this actually a clone of the instruction, rather than
the result of simplification. If simplification occurred, the
instruction that VMap points to may not have any relationship as far
as ModRef behavior is concerned.

Fix this by tracking simplified instructions during cloning and then
only processing instructions that have not been simplified. This is
done with an additional map form original to cloned instruction,
into which we only insert if no simplification is performed. The
mapping in VMap can then be compared to this map. If they're the
same, the instruction hasn't been simplified. (I originally wanted
to only track a set of simplified instructions, but that wouldn't
work if the instruction only gets simplified afterwards, e.g. based
on rewritten phis.)

Fixes https://bugs.llvm.org/show_bug.cgi?id=50589.

Differential Revision: https://reviews.llvm.org/D106242
2021-07-20 19:52:41 +02:00
Sami Tolvanen
8bbf0d9c0e ThinLTO: Fix inline assembly references to static functions with CFI
Create an internal alias with the original name for static functions
that are renamed in promoteInternals to avoid breaking inline
assembly references to them. This version uses module inline assembly
to avoid issues with LowerTypeTestsModule.

Relands commmit 8e3b5cb39eef462943ed7556469604ce25c07a1d with arch
specific tests fixed.

Link: https://github.com/ClangBuiltLinux/linux/issues/1354

Reviewed By: nickdesaulniers, pcc

Differential Revision: https://reviews.llvm.org/D104058
2021-07-20 10:30:02 -07:00
Arthur Eubanks
d577e2748f [NewPM] Print pre-transformation IR name in --print-after-all
Sometimes a transformation can change the name of some IR (e.g. an SCC
with functions added/removed). This can be confusing when debug logging
doesn't match the post-transformation name. The specific example I came
across was that --print-after-all said the inliner was working on an SCC
that only contained one function, but calls in multiple functions were
getting inlined. After all inlining, the current SCC only contained one
function.

Piggyback off of the existing logic to handle invalidated IR +
--print-module-scope. Simply always store the IR description and use
that.

Reviewed By: jamieschmeiser

Differential Revision: https://reviews.llvm.org/D106290
2021-07-20 10:20:10 -07:00
Giorgis Georgakoudis
43d4e670a4 [OpenMP] Set RequiresFullRuntime false in SPMDization
SPMDization in D102307 does not change the RequiresFullRuntime argument of kmpc_target_init/deinit calls. However, the constraints of SPMDization detection for converting a target region to SPMD mode should guarantee that the region does not require full runtime support. Hence, this patch sets RequiresFullRuntime to false for improved execution performance.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105556
2021-07-20 09:54:51 -07:00
Fangrui Song
dcddd93e08 [test] Avoid llvm-symbolizer/llvm-addr2line one-dash long options 2021-07-20 09:34:35 -07:00
Shimin Cui
74b9048a41 This patch extends the OptimizeGlobalAddressOfMalloc to handle the null check of global pointer variables. It is disabled with https://reviews.llvm.org/rGb7cd291c1542aee12c9e9fde6c411314a163a8ea. This PR is to reenable it while fixing the original problem reported. The fix is to set the store value correctly when creating store for the new created global init bool symbol.
Reviewed By: efriedma

Differential Revision:  https://reviews.llvm.org/D102711
2021-07-20 12:27:26 -04:00
Craig Topper
a2254d3fcc [RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants.
If we need to shift left anyway we might be able to take advantage
of LUI implicitly shifting its immediate left by 12 to cover part
of the shift. This allows us to use more bits of the LUI immediate
to avoid an ADDI.

isDesirableToCommuteWithShift now considers compressed instruction
opportunities when deciding if commuting should be allowed.

I believe this is the same or similar to one of the optimizations
from D79492.

Reviewed By: luismarques, arcbbb

Differential Revision: https://reviews.llvm.org/D105417
2021-07-20 09:22:06 -07:00
Craig Topper
0bcc5397bd [RISCV] Add -mattr=+c command lines to add-before-shl.ll to prepare for D105417. NFC 2021-07-20 09:22:06 -07:00
Quinn Pham
b0ba22b69f [PowerPC] Semachecking for XL compat builtin icbt
This patch is in a series of patches to provide builtins for compatibility with the XL compiler.
This patch adds semachecking for an already implemented builtin, `__icbt`. `__icbt` is only
valid for Power8 and up.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D105834
2021-07-20 11:05:22 -05:00
Craig Topper
95c86f6829 [RISCV] Add custom isel to select (and (srl X, C1), C2) and (and (shl X, C1), C2)
Replace some existing isel patterns that are covered by the new
code. SLLIUWPat has been removed in favor of folding its root case
into the new code. The other uses in isel patterns for shXadd.uw
have been switched to using hardcoded AND masks.

This is based on the original version of D49585 from ARM. The final
version of that was made a DAG combine, but I've chosen to keep it
as custom isel. I'm not convinced DAG combine is as good with
shift pairs as it is with and+shift. I saw some issues optimizing
the shifts created by vscale lowering if an and isn't created for
from a shift pair.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D106230
2021-07-20 08:53:55 -07:00
Stefan Pintilie
dbd743acfc [PowerPC] Inefficient register allocation of ACC registers results in many copies.
ACC registers are a combination of four consecutive vector registers.
If the vector registers are assigned first this often forces a number
of copies to appear just before the ACC register is created. If the ACC
register is assigned first then fewer copies are generated when the vector
registers are assigned.

This patch tries to force the register allocator to assign the ACC registers first
and then the UACC registers and then the vector pair registers. It does this
by changing the priority of the register classes.

This patch also adds hints to help the register allocator assign UACC registers from
known ACC registers and vector pair registers from known UACC registers.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D105854
2021-07-20 10:53:40 -05:00
Sterling Augustine
4a2706ecc3 Avoid keeping internal string_views in Twine.
This is a follow-up to https://reviews.llvm.org/D103935

A Twine's internal layout should not depend on which version of the
C++ standard is in use. Dynamically linking binaries compiled with two
different layouts (eg, --std=c++14 vs --std=c++17) ends up
problematic.

This change avoids that issue by immediately converting a
string_view to a pointer-and-length at the cost of an extra eight-bytes
in Twine.

Differential Revision: https://reviews.llvm.org/D106186
2021-07-20 08:46:53 -07:00
Craig Topper
9d554c2dc6 [RISCV] Use unordered indexed loads for MGATHER.
I don't think the semantics of the llvm masked gather intrinsic care
about the order the elements are loaded. For example, type legalization
by splitting will chain them in parallel. This is different than
scatter which we do chain in order.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D106025
2021-07-20 08:46:02 -07:00
David Green
b02e31546a [LV] Change interface of getReductionPatternCost to return Optional
Currently the Instruction cost of getReductionPatternCost returns an
Invalid cost to specify "did not find the pattern". This changes that to
return an Optional with None specifying not found, allowing Invalid to
mean an infinite cost as is used elsewhere.

Differential Revision: https://reviews.llvm.org/D106140
2021-07-20 16:44:50 +01:00
Joel E. Denny
5617b5612e [UpdateCCTestChecks] Implement --global-hex-value-regex
For example, in OpenMP offload codegen tests, global variables like
`.offload_maptypes*` are much easier to read in hex.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104743
2021-07-20 11:23:20 -04:00
Joel E. Denny
2be8d8a8b3 [UpdateCCTestChecks] Implement --global-value-regex
`--check-globals` activates checks for all global values, and
`--global-value-regex` filters them.  For example, I'd like to use it
in OpenMP offload codegen tests to check only global variables like
`.offload_maptypes*`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104742
2021-07-20 11:23:20 -04:00
LLVM GN Syncbot
83266a3de4 [gn build] Port 1a29403d2f8a 2021-07-20 15:13:51 +00:00
Kazu Hirata
025dd3da4e [SampleProfile] Remove ProfileIsValid (NFC)
The last use was removed on Jan 22, 2021 in commit
c9cd9a006632419ce7346e50564e6347a93181cc.
2021-07-20 08:07:04 -07:00
Caroline Concatto
4d32828f42 [NFC][LoopVectorizer] Remove VF.isScalable() assertion from collectInstsToScalarize and getInstructionCost
This patch removes the assertion when VF is scalable and replaces
getKnownMinValue() by getFixedValue(),  so it still guards the code against
scalable vector types.
The assertions were used to guarantee that getknownMinValue were not used for
scalable vectors.

Differential Revision: https://reviews.llvm.org/D106359
2021-07-20 15:56:30 +01:00
Anirudh Prasad
0dda2de3a7 [SystemZ][z/OS] Add GOFF support to file magic identification
- This patch adds in the GOFF format to the file magic identification logic in LLVM
- Currently, for the object file support, GOFF is marked as having as an error
- However, this is only temporary until https://reviews.llvm.org/D98437 is merged in

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D105993
2021-07-20 10:50:47 -04:00
Simon Pilgrim
41db4b6343 [CostModel] Templatize EntryCost::Cost to allow custom cost metrics
We currently use an unsigned value for our CostTblEntry and TypeConversionCostTblEntry cost tables which is limiting depending on how the target wishes to handle various CostKinds etc.

For instance, targets might wish to store separate instruction count, latency or throughput values etc. On D46276 we have been investigating storing a code snippet to improve latency/throughput cost calculations.

There is a slight problem in that template argument deduction was struggling to match the now templatized Costs[] tables in a ArrayRef constructor - I've added helper wrappers for CostTableLookup/ConvertCostTableLookup which avoids us having to update all existing calls with a template hint.

Differential Revision: https://reviews.llvm.org/D106351
2021-07-20 15:31:39 +01:00
Johannes Doerfert
1c3574eaae [Attributor] Initialize effectively unused value to appease UBSAN 2021-07-20 09:18:51 -05:00
Bradley Smith
b354b5b95c [AArch64][SVE] Move instcombine like transforms out of SVEIntrinsicOpts
Instead move them to the instcombine that happens in AArch64TargetTransformInfo.

Differential Revision: https://reviews.llvm.org/D106144
2021-07-20 14:17:30 +00:00
Florian Hahn
9c42154b7e [VPlan] Fix formatting glitch from d2a73fb44ea0b8. 2021-07-20 16:16:30 +02:00
Florian Hahn
de477ca54f [VPlan] Add recipe for first-order rec phis, make splicing explicit.
This patch adds a VPFirstOrderRecurrencePHIRecipe, to further untangle
VPWidenPHIRecipe into distinct recipes for distinct use cases/lowering.
See D104989 for a new recipe for reduction phis.

This patch also introduces a new `FirstOrderRecurrenceSplice`
VPInstruction opcode, which is used to make the forming of the vector
recurrence value explicit in VPlan. This more accurately models def-uses
in VPlan and also simplifies code-generation. Now, the vector recurrence
values are created at the right place during VPlan-codegeneration,
rather than during post-VPlan fixups.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D105008
2021-07-20 16:14:17 +02:00
Nico Weber
3a0fbebf9d [gn build] remove stray character in a comment 2021-07-20 10:13:48 -04:00
David Green
1560f33a55 [AArch64] Regenerate some tests checks. NFC 2021-07-20 14:52:36 +01:00
Simon Pilgrim
9e495cac81 [X86] X86InstCombineIntrinsic.cpp - silence clang-tidy warnings about incorrect uses of auto. NFCI.
We were using auto instead of auto* in a number of places which failed the llvm-qualified-auto check.

Additionally we were using auto in some places where the type wasn't immediately obvious - the style guide rule of thumb is only to use auto from casts etc. where the type is already explicitly stated.
2021-07-20 13:37:45 +01:00
Simon Pilgrim
fadcc816a1 [MIPS][MSA] Regenerate basic operations test checks
Cleanup the check prefixes to make refresh a lot easier
2021-07-20 13:37:44 +01:00
LLVM GN Syncbot
bcb32ee85a [gn build] Port 2b08f6af62af 2021-07-20 12:00:01 +00:00
Sebastian Neubauer
d89384c520 [AMDGPU] Improve register computation for indirect calls
First, collect the register usage in each function, then apply the
maximum register usage of all functions to functions with indirect
calls.

This is more accurate than guessing the maximum register usage without
looking at the actual usage.

As before, assume that indirect calls will hit a function in the
current module.

Differential Revision: https://reviews.llvm.org/D105839
2021-07-20 13:48:50 +02:00
Ulrich Weigand
8d80402b64 [SystemZ] Fix invalid assumption in getCPUNameFromS390Model
Code in getCPUNameFromS390Model currently assumes that the
numerical value of the model number always increases with
future hardware.  While this has happened to be the case
with the last few machines, it is not guaranteed -- that
assumption was violated with (much) older machines, and
it can be violated again with future machines.

Fix by explicitly listing model numbers for all supported
machine models.
2021-07-20 13:39:22 +02:00
Timm Bäder
55ff387ea9 [llvm][tools] Hide more unrelated tool options
Differential Revision: https://reviews.llvm.org/D106271
2021-07-20 13:27:33 +02:00
Jay Foad
70ace9e728 [AMDGPU] Pre-commit test case for D106284
This test case shows the scheduler wrongly reordering two buffer
accesses that might alias.
2021-07-20 12:05:33 +01:00
Jeremy Morse
7d0a83254c [DebugInfo][InstrRef] Fix a broken substitution method, add test coverage
This patch fixes a clearly-broken function that I absent-mindedly bodged
many months ago.

Over in D85749 I landed the substituteDebugValuesForInst, that creates
substitution records for all the def operands from one debug-labelled
instruction to the new one. Unfortunately it would crash if the two
instructions had different numbers of operands; I tried to fix this in
537f0fbe82 by adding a "max operand" parameter to the method, but then
didn't actually change the loop bound to take account of this. It passed
all the tests because.... well there wasn't any real test coverage of this
method.

This patch fixes up the loop to be bounded by the MaxOperand bound; and
adds test coverage for the x86-fixup-LEAs calls to this method, so that
it's actually tested.

Differential Revision: https://reviews.llvm.org/D105820
2021-07-20 11:45:13 +01:00
Nico Weber
125ab5ca8a [gn build] (manually) port bc1a2979fc70 2021-07-20 06:43:30 -04:00
Chen Zheng
06edcd023a [PowerPC][NFC] add more cases for lfiwzx/lfiwax 2021-07-20 10:29:56 +00:00
Stanislav Mekhanoshin
c99f31d90e [AMDGPU] Disable LDS lowering for GFX shaders
Apparently these need external LDS symbols to remain.

Fixes: SC1-3279

Differential Revision: https://reviews.llvm.org/D106288
2021-07-20 02:55:25 -07:00
Dawid Jurczak
a09883bcb0 [DSE] Transform memset + malloc --> calloc (PR25892)
After this change DSE can eliminate malloc + memset and emit calloc.
It's https://reviews.llvm.org/D101440 follow-up.

Differential Revision: https://reviews.llvm.org/D103009
2021-07-20 11:39:05 +02:00
Florian Mayer
be3736ea6f Revert "[hwasan] Use stack safety analysis."
This reverts commit e9c63ed10b3bdf6eb3fa76d1a3eb403d6fc6a118.
2021-07-20 10:36:46 +01:00
Kai Luo
eacf3837e5 [PowerPC] Add lit.local.cfg in AtomicExpand tests
Fixed build errors on other platforms.
2021-07-20 09:13:50 +00:00
Florian Mayer
1c1f625528 [hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.

Reviewed By: eugenis, vitalybuka

Differential Revision: https://reviews.llvm.org/D105703
2021-07-20 10:06:35 +01:00
Sander de Smalen
04cb04f74d [AArch64][SVE][InstCombine] last{a,b} of a splat vector
Replace last{a,b}(splat(X)) with X, irrespective of the predicate.

Patch by/Committing on behalf of: Usman Nadeem (mnadeem)

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D105520
2021-07-20 09:44:43 +01:00
Cullen Rhodes
6765229853 [AArch64][SME] Add system registers and related instructions
This patch adds the new system registers introduced in SME:

  - ID_AA64SMFR0_EL1 (ro) SME feature identifier.
  - SMCR_ELx (r/w) streaming mode control register for configuring
    effective SVE Streaming SVE Vector length when the PE is in
    Streaming SVE mode.
  - SVCR (r/w) streaming vector control register, visible at all
    exception levels. Provides access to PSTATE.SM and PSTATE.ZA
    using MSR and MRS instructions.
  - SMPRI_EL1 (r/w) streaming mode execution priority register.
  - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register.
  - SMIDR_EL1 (ro) streaming mode identification register.
  - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread
    SME context.
  - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for
    labelling memory accesses performed in streaming mode.

Also added in this patch are the SME mode change instructions.
Three MSR immediate instructions are implemented to set or clear
PSTATE.SM, PSTATE.ZA, or both respectively:

  - MSR SVCRSM, #<imm1>
  - MSR SVCRZA, #<imm1>
  - MSR SVCRSMZA, #<imm1>

The following smstart/smstop aliases are also implemented for
convenience:

  smstart    -> MSR SVCRSMZA, #1
  smstart sm -> MSR SVCRSM,   #1
  smstart za -> MSR SVCRZA,   #1

  smstop     -> MSR SVCRSMZA, #0
  smstop sm  -> MSR SVCRSM,   #0
  smstop za  -> MSR SVCRZA,   #0

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105576
2021-07-20 08:06:26 +00:00
Amara Emerson
c7647f3217 [AArch64][GlobalISel] Don't form truncstores in postlegalizer-lowering for s128.
We don't support truncating s128 stores, so don't form them.
2021-07-20 00:04:34 -07:00
Johannes Doerfert
afd4fe3364 [Attributor] Use set vector instead of vector to prevent duplicates 2021-07-20 01:39:34 -05:00
Johannes Doerfert
bce3c823b4 [Attributor] Simplify to values in the genericValueTraversal
We already simplified to a constant, given the new interface we can also
simplify to a generic value.
2021-07-20 01:39:34 -05:00
Johannes Doerfert
fc4701b4f4 [Attributor] Use checkForAllUses instead of custom use tracking
AAMemoryBehaviorFloating used a custom use tracking mechanism even
though checkForAllUses exists and is already more powerful. Further,
AAMemoryBehaviorFloating uses AANoCapture to guarantee that there are no
aliases and following the uses is sufficient. This is an OK assumption
if checkForAllUses is used but custom tracking is easily out of sync
with AANoCapture and problems follow.
2021-07-20 01:39:33 -05:00
Kai Luo
98d913fa91 [PowerPC] Fallback to base's implementation of shouldExpandAtomicCmpXchgInIR and shouldExpandAtomicCmpXchgInIR
If we can't decide `shouldExpandAtomicCmpXchgInIR` or `shouldExpandAtomicCmpXchgInIR` in PPC's implementation after https://reviews.llvm.org/rGb9c3941cd61de1e1b9e4f3311ddfa92394475f4b, resort to base's implementation.

This fixes internal build of OpenMP which uses atomic operations on float.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106234
2021-07-20 06:14:24 +00:00
Craig Topper
3baf156215 [RISCV] Add test cases to show an issue with our fcvt.wu isel patterns on RV64.
The pattern we match is (sext_inreg (assertzexti32 (fp_to_uint)), i32). If
the assertzexti32 has an additional user we'll end up emitting
an fcvt.wu and an fcvt.lu.

This can happen if the original fp_to_uint before type legalization
has one user that causes a sext_inreg to be emitted and one that
doesn't.
2021-07-19 22:58:42 -07:00