mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

207112 Commits

Author SHA1 Message Date
Alex Richardson
775dd2a2a2 [AMDGPU] Set the default globals address space to 1
This will ensure that passes that add new global variables will create them
in address space 1 once the passes have been updated to no longer default
to the implicit address space zero.
This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it isn't
already present, to ensure bitcode backwards compatibility.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D84345
2020-11-20 15:46:53 +00:00
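
Illustrative note on 775dd2a2a2: the -G1 upgrade described above can be sketched roughly as follows. This is a minimal, self-contained approximation and not the actual AutoUpgrade.cpp code; the helper name addDefaultGlobalsAddrSpace is hypothetical, and the sketch assumes only that data layout components are separated by '-' and that the globals address space component starts with 'G'.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not LLVM's real implementation): append a "G1"
// component to a data layout string unless a globals address space
// component is already present.
std::string addDefaultGlobalsAddrSpace(const std::string &Layout) {
  // A "G<n>" component is either the first component or follows a '-'.
  const bool HasGlobalsSpec =
      Layout.rfind("G", 0) == 0 || Layout.find("-G") != std::string::npos;
  if (HasGlobalsSpec)
    return Layout; // Already explicit, so leave the string untouched.
  return Layout.empty() ? std::string("G1") : Layout + "-G1";
}
```

Under these assumptions, older bitcode whose layout string lacks a G component would pick up address space 1 for globals, while a string that already names one is returned unchanged.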
Alex Richardson
9c96f39f77 Add a default address space for globals to DataLayout
This is similar to the existing alloca and program address spaces (D37052)
and should be used when creating/accessing global variables.
We need this in our CHERI fork of LLVM to place all globals in address space 200.
This ensures that values are accessed using CHERI load/store instructions
instead of the normal MIPS/RISC-V ones.

The problem this is trying to fix is that most of the time the type of
globals is created using a simple PointerType::getUnqual() (or ::get() with
the default address-space value of 0). This does not work for us and we get
assertion/compilation/instruction selection failures whenever a new call
is added that uses the default value of zero.

In our fork we have removed the default parameter value of zero for most
address space arguments and use DL.getProgramAddressSpace() or
DL.getGlobalsAddressSpace() whenever possible. If this change is accepted,
I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead
of relying on the default value of 0 for PointerType::get(), etc.

This patch and the follow-up changes will not have any functional changes
for existing backends with the default globals address space of zero.
A follow-up commit will change the default globals address space for
AMDGPU to 1.

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D70947
2020-11-20 15:46:52 +00:00
Anton Afanasyev
82c83de8d2 [SLP][Test] Update pr47269.ll test. NFC
Expand test for PR47269 to better demonstrate changes introduced by D90445.
2020-11-20 18:33:57 +03:00
Jamie Schmeiser
c6155047c0 Reland: Expand existing loopsink testing to also test loopsinking using new pass manager and fix LICM bug.
Summary:
Expand existing loopsink testing to also test loopsinking using new pass
manager.  Enable memoryssa for loopsink with new pass manager.  This
combination exposed a bug that was previously fixed for loopsink
without memoryssa.  When sinking an instruction into a loop, the source
block may not be part of the loop but still needs to be checked for
pointer invalidation.  This is the fix for bugzilla #39695 (PR 54659)
expanded to also work with memoryssa.

Respond to review comments.  Enable Memory SSA in legacy Loop Sink pass
under EnableMSSALoopDependency option control.  Update tests accordingly.

Respond to review comments.  Add options controlling whether memoryssa is
used for loop sink, defaulting to off.  Expand testing based on these
options.

Respond to review comments.  Properly indicate preserved analyses.

This relanding addresses a compile-time performance problem by moving
test for profile data earlier to avoid unnecessary computations.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90249
2020-11-20 10:26:33 -05:00
Sanjay Patel
7e28c46bfe [CostModel] avoid crashing while finding scalarization overhead
The constrained intrinsics have metadata arguments, so the
tests here were crashing as noted in D90554 (and that was
reverted even though this bug exists independently of that
change).
2020-11-20 10:18:29 -05:00
Jamie Schmeiser
71ead90dbb [NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager.
Summary:
[NFC intended] Refactor the code for printChanged for reuse and to facilitate
subsequent reporters of changes to the IR in the new pass manager.

Create abstract template base classes for common functionality and give
classes more appropriate names.  The base classes handle all of the
determination of when a function or pass is "interesting" and should be
reported or filtered out. They have pure virtual functions which are called
when a change by a pass has been recognized so the derived class need only
provide the overrides to present the information about the changing IR.
There are at least 2 more change reporters to come (which were presented
in my tutorial at the 2020 llvm developer's meeting) that derive from
these classes.

Respond to review comments:  move function out of line, remove inline keyword,
remove unneeded qualifiers, simplify comparison.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks), madhur13490 (Madhur Amilkanthwar)
Differential Revision: https://reviews.llvm.org/D87000
2020-11-20 09:43:06 -05:00
Sjoerd Meijer
22726558a1 [AArch64] Enable post RA scheduler for Cortex-R82
Just something I forgot when I added the R82. Need to have a look
at crypto and fusing, but will do that as a follow up.

Differential Revision: https://reviews.llvm.org/D91848
2020-11-20 14:04:26 +00:00
David Green
cb2c1b3b24 [ARM] Disable WLSTP loops
This checks to see if the loop will likely become a tail predicated loop
and disables wls loop generation if so, as the likelihood for reverting
is currently too high. These should be fairly rare situations anyway due
to the way iterations and element counts are used during lowering. Just
not trying can alter how SCEV's are materialized however, leading to
different codegen.

It also adds an option to disable all while low overhead loops, for
debugging.

Differential Revision: https://reviews.llvm.org/D91663
2020-11-20 13:30:44 +00:00
Pavel Iliin
2529cb73ff [AArch64] Out-of-line atomics (-moutline-atomics) implementation.
This patch implements the out-of-line atomics mechanism for LSE
deployment. Details of how it works can be found in llvm/docs/Atomics.rst.
The options -moutline-atomics and -mno-outline-atomics, which enable and
disable it, were added to the clang driver. This is the clang and llvm part
of the out-of-line atomics interface; the library part is already supported
by libgcc. Compiler-rt support is provided in a separate patch.

Differential Revision: https://reviews.llvm.org/D91157
2020-11-20 13:30:12 +00:00
Sanjay Patel
5c4c695a34 [CostModel] add tests for math library calls; NFC
This is a partial un-revert of 32dd5870ee31 (originally df09f82599).

I'm adding back the baseline tests first, so we don't have
to back-track as much in case there are still problems.
2020-11-20 08:24:49 -05:00
Sanjay Patel
e2de0b36e0 [LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC
See discussion in D90554.

This is a partial un-revert of 32dd5870ee31. I'm adding
back the baseline tests first, so we don't have to
back-track as much in case there are still problems.
2020-11-20 08:15:46 -05:00
Sanjay Patel
3ff3df306f [InstCombine] add test comments for negative tests; NFC 2020-11-20 07:59:46 -05:00
Kazushi (Jam) Marukawa
ae1e6e2a8d [VE] Change threshold for jump table generation
Implement getMinimumJumpTableEntries() to specify the threshold for jump
table generation.  We use 8 in PIC mode to reduce the impact of the
PIC calculation required to implement a PIC-mode jump table.
Also update the jump table regression test.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91785
2020-11-20 21:27:18 +09:00
Sebastian Neubauer
fe420f0603 [AMDGPU] Implement flat scratch init for pal
Extract the scratch offset from the scratch buffer descriptor that is
stored in the global table.

Differential Revision: https://reviews.llvm.org/D91701
2020-11-20 11:14:30 +01:00
QingShan Zhang
95e940fbb4 [NFC][Test] Update test for IEEE Long Double 2020-11-20 09:57:45 +00:00
Max Kazantsev
39c3beff62 [Test] Auto-update checks in a test 2020-11-20 16:53:51 +07:00
Georgii Rymar
78b3f55aa0 [llvm-readelf/obj] - Improve error reporting when dumping group sections.
Our code that dumps groups has 3 noticeable issues:
1) It uses `unwrapOrError` in many places.
2) It doesn't allow reporting unique warnings, because the `getGroups` helper is not
   a member of `DumpStyle<ELFT>`.
3) It might just crash. See the comment for `StrTableOrErr->data() + Sym.st_name` line.

In this patch I am starting to address these points.
To start, I've converted one of the `unwrapOrError` calls to a unique warning.

Differential revision: https://reviews.llvm.org/D91798
2020-11-20 12:40:23 +03:00
Georgii Rymar
55fbb7e6c9 [llvm-readobj] - Introduce forEachRelocationDo helper.
Our `printStackSize` implementation currently uses
APIs like `RelocationRef` and `object::symbol_iterator`.
This is not ideal, as it doesn't allow
handling possible error conditions properly.

Some time ago I started rewriting it, and this NFC patch is
one more step toward that. Here I am introducing the
`forEachRelocationDo` helper. With it, it is possible to iterate
over all kinds of relocations, which is helpful for improving
the code in and around `printStackSize`.

Differential revision: https://reviews.llvm.org/D91530
2020-11-20 12:21:42 +03:00
Max Kazantsev
c19e519d70 [Test] Add tests demonstrating a bug in SCEV, PR48225
Slightly simplified version of original test reported by Congzhe Cao.
2020-11-20 15:59:22 +07:00
Liu, Chen3
cdaf1b2f95 [X86] Add support for vex, vex2, vex3, and evex for MASM
For MASM syntax, the prefixes are not enclosed in braces.
The assembly code should look like:
  "evex vcvtps2pd xmm0, xmm1"

Differential Revision: https://reviews.llvm.org/D90441
2020-11-20 16:20:19 +08:00
Georgii Rymar
49ab0f3272 [lib/Object] - Generalize the RelocationResolver API.
This allows reusing the RelocationResolver from code
that doesn't want to deal with the `RelocationRef` class.

I am going to use it in llvm-readobj. See the description
of D91530 for more details.

Differential revision: https://reviews.llvm.org/D91533
2020-11-20 10:32:49 +03:00
Qiu Chaofan
448bcbf8d2 [NFC] Pre-commit test for flt_rounds on PowerPC 2020-11-20 15:14:58 +08:00
Arthur Eubanks
2bf127ff66 [PGO] Make -disable-preinline work with NPM
Fixes cspgo_profile_summary.ll under NPM.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D91826
2020-11-19 22:58:55 -08:00
Eric Christopher
e7321a134d Temporarily Revert "[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation"
as it's causing crashes in the optimizer. A reduced testcase has been posted as a follow-up.

This reverts commit f7eac51b9b3f780c96ca41913293851c5acb465b.

Temporarily Revert "[CostModel] make default size cost for libcalls small (again)" as it depends upon the primary revert.

This reverts commit 8ec7ea3ddce7379e13e8dfb4a5260a6d2004aa1c.

Temporarily Revert "[CostModel] add tests for math library calls; NFC" as it depends upon the primary revert.

This reverts commit df09f825995b10da03f148133c119f52c94fd6e4.

Temporarily Revert "[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC" as it depends upon the primary revert.

This reverts commit 618d555e8d926a83161774df2035519c387269db.
2020-11-19 22:10:23 -08:00
Kazu Hirata
24e4cf7420 [CodeGen] Use llvm::is_contained (NFC) 2020-11-19 22:07:56 -08:00
Bill Wendling
b8f5e920b8 [PowerPC] Allow a '%' prefix for registers in CFI directives
Clang generates a '%' prefix for some registers in CFI directives. E.g.
".cfi_register lr, r12" becomes ".cfi_register lr, %r12" after
processing.

Differential Revision: https://reviews.llvm.org/D91735
2020-11-19 18:19:51 -08:00
Arthur Eubanks
c0f9b031a4 [test] Fix multiply-minimal.ll 2020-11-19 18:16:35 -08:00
Duncan P. N. Exon Smith
7668c9150a ADT: Split out isSafeToReferenceAfterResize helper to use early returns, NFC
The assertion logic in SmallVector::assertSafeToReferenceAfterResize is
hard to follow; split out SmallVector::isSafeToReferenceAfterResize and
add early returns and comments. No functionality change here.
2020-11-19 17:55:04 -08:00
Arthur Eubanks
1e77e12c0e Port -lower-matrix-intrinsics-minimal to NPM
This reuses the existing lower-matrix-intrinsics pass rather than going
the legacy pass route of creating a new pass.

Use this new variant in the NPM -O0 pipeline.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91811
2020-11-19 17:42:48 -08:00
Duncan P. N. Exon Smith
82e253383a ADT: Use early returns in SmallVector::resize, NFC
Just a simple cleanup, no functionality change here.
2020-11-19 17:28:57 -08:00
Duncan P. N. Exon Smith
8605039d2b ADT: Weaken SmallVector::resize assertion from 5abf76fbe37380874a88cc9aa02164800e4e10f3
There's no need to check for reference invalidation when
`SmallVector::resize` is shrinking; the parameter isn't accessed.

Differential Revision: https://reviews.llvm.org/D91832
2020-11-19 17:25:36 -08:00
Sam Clegg
b9f542fc73 [lld][WebAssembly] Convert more tests to asm format. NFC.
Differential Revision: https://reviews.llvm.org/D91681
2020-11-19 16:57:00 -08:00
LLVM GN Syncbot
54ddb1ae8c [gn build] Port 8adc4d1ec76 2020-11-20 00:15:31 +00:00
Arthur Eubanks
030e65d1fe [test] Fix split-vfunc.ll under NPM
We need an AA pipeline under NPM.
This is a no-op if we are still using the legacy PM.
2020-11-19 14:59:05 -08:00
Florian Hahn
8ced7f6571 [ConstraintElimination] Decompose GEP with arbitrary offsets.
This patch decomposes `GEP %x, %offset` as 0 + 1 * %x + 1 * %offset.
2020-11-19 22:49:21 +00:00
Arthur Eubanks
bc714b0c12 [test] Fix globalaa-retained.ll under NPM
Just '-O2' didn't run the full AA pipeline under NPM.
2020-11-19 14:24:46 -08:00
Arthur Eubanks
a88609dcad [test] Fix pr39282.ll under NPM
Already has an NPM RUN line
2020-11-19 14:19:48 -08:00
Geoffrey Martin-Noble
f1deff2292 Remove unused private fields
Unused since https://reviews.llvm.org/D91762 and triggering
-Wunused-private-field

```
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:365:13: error: private field 'GetArgTLS' is not used [-Werror,-Wunused-private-field]
  Constant *GetArgTLS;
            ^
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:366:13: error: private field 'GetRetvalTLS' is not used [-Werror,-Wunused-private-field]
  Constant *GetRetvalTLS;
```

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D91820
2020-11-19 13:54:54 -08:00
Roman Lebedev
4f26d69f1c [InstCombine] Fold and(shl(zext(x), width(SIGNMASK) - width(%x)), SIGNMASK) to and(sext(%x), SIGNMASK)
One less instruction and reducing use count of zext.
As alive2 confirms, we're fine with all the weird combinations of
undef elts in constants, but unless the shift amount was undef
for a lane, we must sanitize undef mask to zero, since sign bits
are no longer zeros.

https://rise4fun.com/Alive/d7r
```
----------------------------------------
Optimization: zz
Precondition: ((C1 == (width(%r) - width(%x))) && isSignBit(C2))
  %o0 = zext %x
  %o1 = shl %o0, C1
  %r = and %o1, C2
=>
  %n0 = sext %x
  %r = and %n0, C2

Done: 2016
Optimization is correct!
```
2020-11-20 00:31:27 +03:00
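
Illustrative note on 4f26d69f1c: the fold above can be sanity-checked outside LLVM for one concrete pair of widths. The sketch below is not part of the commit; it assumes i8 -> i32 (so the shift amount is 32 - 8 = 24 and SIGNMASK is 0x80000000), uses hypothetical function names, and ignores the undef-lane handling that the commit message discusses.

```cpp
#include <cassert>
#include <cstdint>

// Original pattern: and(shl(zext(x), 24), SIGNMASK) for i8 -> i32.
uint32_t viaZextShl(uint8_t X) {
  const uint32_t Wide = X;           // zext i8 -> i32
  return (Wide << 24) & 0x80000000u; // shl by width(i32) - width(i8), then mask
}

// Folded pattern: and(sext(x), SIGNMASK).
uint32_t viaSext(uint8_t X) {
  // Reinterpreting the byte as signed wraps on mainstream compilers
  // (and is guaranteed to since C++20); the widening then sign-extends.
  const int32_t Wide = static_cast<int8_t>(X); // sext i8 -> i32
  return static_cast<uint32_t>(Wide) & 0x80000000u;
}

// Exhaustively compare both forms over all 256 i8 values.
bool foldHoldsForAllI8() {
  for (unsigned V = 0; V < 256; ++V) {
    const uint8_t X = static_cast<uint8_t>(V);
    if (viaZextShl(X) != viaSext(X))
      return false;
  }
  return true;
}
```

Both forms simply move the sign bit of the narrow value into the sign bit of the wide value, which is why the shift can be dropped once the extension is a sext.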
Roman Lebedev
01b8ffbe23 [NFC][InstCombine] Add test coverage for and (sext %x), SIGNMASK-like pattern 2020-11-20 00:31:26 +03:00
Nikita Popov
4f0554345f [MemLoc] Use hasValue() method more (NFC)
Followup to 7de7c40898a8f815d661781c92757f93fa4c6e5b. I previously
removed a number of == comparisons to LocationSize::unknown(), but
missed these != comparisons.
2020-11-19 22:29:44 +01:00
Jianzhou Zhao
53233de33f Remove deadcode from DFSanFunction::get*TLS*()
Clean up more dead code after D84704.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D91762
2020-11-19 21:10:37 +00:00
Nikita Popov
63d8e35b93 [MemLoc] Use hasValue() method (NFC)
Instead of comparing to LocationSize::unknown(), prefer calling
the hasValue() method instead, which is less reliant on
implementation details.
2020-11-19 21:53:50 +01:00
Nikita Popov
3a433f6057 [MemLoc] Specify LocationSize in unit test
Followup to 393b9e9db31a3f83bc8b813ee24b56bc8ed93a49,
where I missed updating one MemoryLocation use inside a unit test.
2020-11-19 21:50:44 +01:00
Nikita Popov
53b556c27d [MemLoc] Require LocationSize argument (NFC)
When constructing a MemoryLocation by hand, require that a
LocationSize is explicitly specified. D91649 will split up
LocationSize::unknown() into two different states, and callers
should make an explicit choice regarding the kind of MemoryLocation
they want to have.
2020-11-19 21:45:52 +01:00
Artur Pilipenko
d3849bdc93 [BasicAA] Deoptimize intrinsics don't modify memory
Similarly to assumes and guards, deoptimize intrinsics are
marked as writing to ensure proper control dependencies,
but they never modify any particular memory location.

Differential Revision: https://reviews.llvm.org/D91658
2020-11-19 12:08:33 -08:00
Nikita Popov
56288c7baf [Lint] Use MemoryLocation
Instead of separately passing pointer and size, make use of
MemoryLocation. This allows us to also reuse all the existing
logic for determining the MemoryLocation corresponding to an
instruction or call argument.

Not quite NFC because used locations may be more precise in some
cases.
2020-11-19 20:55:25 +01:00
Nico Weber
344a0904d8 [gn build] (manually) merge 1fb91fcf9cfe849 2020-11-19 14:24:35 -05:00
Arthur Eubanks
9dd02d7898 [NPM] Move more O0 pass building into PassBuilder
This moves handling of alwaysinline, coroutines, matrix lowering, PGO,
and LTO-required passes into PassBuilder. Much of this is replicated
between Clang and opt. Other out-of-tree users also replicate some of
this, such as Rust [1] replicating the alwaysinline, LTO, and PGO
passes.

The LTO passes are also now run in
build(Thin)LTOPreLinkDefaultPipeline() since they are semantically
required for (Thin)LTO.

[1]: f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D91585
2020-11-19 11:22:23 -08:00
Jorge Gorbe Moya
ac434d7750 Fix crash after looking up dwo_id=0 in CU index.
In the current state, if getFromHash(0) is called and there's no CU with
dwo_id=0, the lookup will stop at an empty slot, then the check
`Rows[H].getSignature() != S` won't cause the lookup to fail and return
a nullptr (as it should), because the empty slot has a 0 in the
signature field, and a pointer to the empty slot will be incorrectly
returned.

This patch fixes this by using the index field in the hash entry to
check for empty slots: signature = 0 can match a valid hash but
according to the spec the index for an occupied slot will always be
non-zero.

Differential Revision: https://reviews.llvm.org/D91670
2020-11-19 11:15:01 -08:00
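
Illustrative note on ac434d7750: the empty-slot check described above can be sketched with a toy open-addressing table. This is a simplified stand-in and not the LLVM implementation; the Slot struct and lookup function are hypothetical names, and the probing scheme (power-of-two table, odd secondary step) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one hash-table slot of a CU index.
struct Slot {
  uint64_t Signature = 0; // dwo_id; 0 is a legal value
  uint32_t Index = 0;     // 0 means "empty"; occupied slots are non-zero
};

// Open-addressing lookup. Returns nullptr when S is not in the table.
const Slot *lookup(const std::vector<Slot> &Rows, uint64_t S) {
  assert(!Rows.empty() && (Rows.size() & (Rows.size() - 1)) == 0 &&
         "table size must be a power of two");
  const uint64_t Mask = Rows.size() - 1;
  uint64_t H = S & Mask;
  const uint64_t Step = ((S >> 32) & Mask) | 1; // odd step visits every slot
  // Emptiness is decided by Index, not Signature, so S == 0 cannot be
  // confused with an empty slot. Probes are bounded by the table size.
  for (std::size_t Probes = 0; Probes < Rows.size() && Rows[H].Index != 0;
       ++Probes) {
    if (Rows[H].Signature == S)
      return &Rows[H];
    H = (H + Step) & Mask;
  }
  return nullptr;
}
```

With a signature-based emptiness check, looking up 0 in an empty table would stop at a zero-filled slot and return it as a match; keying the check on the index field avoids that case.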