mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
197848 Commits

Author SHA1 Message Date
Jay Foad
14dd772f1b [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics (fix)
Try to fix Windows buildbots.
2020-06-03 09:44:33 +01:00
Serge Pavlov
86f8cab821 Revert "[Support] Add file lock/unlock functions"
This reverts commit f51bc4fb60fbcef26d18eff549fc68307fd46489.
It broke the Solaris buildbots (Builder clang-solaris11-sparcv9 Build #5494
<http://lab.llvm.org:8014/builders/clang-solaris11-sparcv9/builds/54).
2020-06-03 15:40:12 +07:00
Vitaly Buka
f99a2b16cb [StackSafety,NFC] Convert to template internal stuff
It's going to be useful for ThinLTO.
2020-06-03 01:36:20 -07:00
Vitaly Buka
299555c7e7 [StackSafety,NFC] Rename internal class 2020-06-03 01:36:20 -07:00
Jay Foad
4a4ccbfa8d [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics
Differential Revision: https://reviews.llvm.org/D80702
2020-06-03 09:34:22 +01:00
hsmahesha
d03b426308 [AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width
Summary:
Clean up the width-computing logic for a memory operand, and rearrange code to avoid
code duplication.

Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar

Reviewed By: foad

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80946
2020-06-03 14:03:52 +05:30
LLVM GN Syncbot
d96290b617 [gn build] Port 755a8959152 2020-06-03 08:27:24 +00:00
Thomas Lively
21ce405fed Revert "[WebAssembly] Eliminate range checks on br_tables"
This reverts commit f99d5f8c32a822580a732d15a34e8197da55d22b.
The change was causing UBSan and other failures on some bots.
2020-06-03 01:26:53 -07:00
Vitaly Buka
5b5814a4b5 [StackSafety] Skip non-pointer parameters
Summary: Depends on D80908.

Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80956
2020-06-03 01:16:39 -07:00
Vitaly Buka
9a416f28c3 [NFC, StackSafety] Change type of internal container
Summary: Depends on D80771.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: mehdi_amini, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80847
2020-06-03 01:05:10 -07:00
QingShan Zhang
4237f2e791 [NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP
These two nodes were added by 69caef2b781130a7d0eeaf8898eb346b6423ae03 in 2005
and are no longer used by the PowerPC backend. ISD::FMA is the preferred
way to express VMADDFP if we really want to create that node. For VNMSUBFP, we will
also add a more generic node FNMSUB in D76585 if we really want it.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D80429
2020-06-03 06:36:30 +00:00
David Sherwood
5423c8ae53 [CodeGen] Fix warnings in getPackedVectorTypeFromPredicateType
Use getVectorElementCount() instead of getVectorNumElements().
The code changed in this patch is covered by an existing test:

  CodeGen/AArch64/sve-intrinsics-contiguous-prefetches.ll

Differential Revision: https://reviews.llvm.org/D80615
2020-06-03 07:01:20 +01:00
Craig Topper
bb3074f462 [X86] Add CLWB to Tremont CPU. Remove CLDEMOTE, MOVDIRI, MOVDIR64B, and WAITPKG to match gcc. 2020-06-02 22:38:51 -07:00
Serge Pavlov
0a558b962d [Support] Add file lock/unlock functions
New functions `lockFile`, `tryLockFile` and `unlockFile` implement
simple file locking. They lock or unlock the entire file. This should be
enough to support simultaneous writes to log files in parallel builds.
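
As a rough illustration of whole-file locking for shared log files, here is a minimal sketch that assumes a POSIX system providing flock(2). The helper names (`lockWholeFile`, `tryLockWholeFile`, `unlockWholeFile`, `appendLogLine`) are hypothetical and are not the functions added by this commit, whose signatures are not shown here.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <cassert>
#include <string>

// Hypothetical helpers built on flock(2); NOT the LLVM API from this commit.

// Blocks until an exclusive lock on the whole file is held.
bool lockWholeFile(int FD) { return flock(FD, LOCK_EX) == 0; }

// Fails immediately instead of blocking if the lock cannot be taken.
bool tryLockWholeFile(int FD) { return flock(FD, LOCK_EX | LOCK_NB) == 0; }

// Releases a lock taken by one of the functions above.
bool unlockWholeFile(int FD) { return flock(FD, LOCK_UN) == 0; }

// Appends one line to a shared log file while holding the lock, so lines
// written by cooperating processes are not interleaved.
bool appendLogLine(const std::string &Path, const std::string &Line) {
  assert(!Path.empty() && "expected a log file path");
  int FD = open(Path.c_str(), O_WRONLY | O_APPEND | O_CREAT, 0644);
  if (FD < 0)
    return false;
  bool OK = lockWholeFile(FD);
  if (OK) {
    std::string Text = Line + "\n";
    OK = write(FD, Text.data(), Text.size()) ==
         static_cast<ssize_t>(Text.size());
    unlockWholeFile(FD);
  }
  close(FD);
  return OK;
}
```

These locks are advisory: they only help when every writer takes the lock before writing.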

Differential Revision: https://reviews.llvm.org/D78896
2020-06-03 12:22:45 +07:00
Carl Ritson
8b90fe296e [AMDGPU] Make SGPR spills exec mask agnostic
Explicitly set the exec mask for SGPR spills and reloads.
This fixes a bug where SGPR spills to memory could be incorrect
if the exec mask was 0 (or differed between spill and reload).

Additionally pack scalar subregisters (up to 16/32 per VGPR),
so that the majority of scalar types can be spilt or reloaded
with a simple memory access.  This should amortize some of the
additional overhead of manipulating the exec mask.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D80282
2020-06-03 12:34:26 +09:00
Mehdi Amini
356c1d11d8 Revert "[NFC, StackSafety] Change type of internal container"
This reverts commit f62813e7eae148a6175de28bfa384524a9f2bf94.
GCC 5.3 build is broken.
2020-06-03 03:02:28 +00:00
Jessica Paquette
9e9596734c [AArch64][GlobalISel] Select zip1 and zip2
Port the code to recognize a zip1/zip2 shuffle mask from AArch64ISelLowering
and put it into the post-legalizer combiner.

Add G_ZIP1 and G_ZIP2 to AArch64InstrGISel.td and hook them up as equivalent
nodes to AArch64zip1 and AArch64zip2. This allows us to select them.

Minor code size improvements for SPECINT2000 at -O3 on 197.parser, 252.eon, and
186.crafty.

Differential Revision: https://reviews.llvm.org/D80969
2020-06-02 18:57:11 -07:00
Kazu Hirata
47434a0533 [JumpThreading] Simplify FindMostPopularDest (NFC)
Summary:
This patch simplifies FindMostPopularDest without changing the
functionality.

Given a list of jump threading destinations, the function finds the
most popular destination.  To ensure determinism when there are
multiple destinations with the highest popularity, the function picks
the first one in the successor list with the highest popularity.

Without this patch:

- The function populates DestPopularity -- a histogram mapping
  destinations to their respective occurrence counts.

- Then we iterate over DestPopularity, looking for the highest
  popularity while building a vector of destinations with the highest
  popularity.

- Finally, we iterate the successor list, looking for the destination
  with the highest popularity.

With this patch:

- We implement DestPopularity with MapVector instead of DenseMap.  We
  populate the map with popularity 0 for all successors in the order
  they appear in the successor list.

- We build the histogram in the same way as before.

- We simply use std::max_element on DestPopularity to find the most
  popular destination.  The use of MapVector ensures determinism.
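
The approach described under "With this patch" can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: an insertion-ordered `std::vector` of pairs stands in for `MapVector`, basic blocks are plain strings, and `findEntry` and the function signature are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in for llvm::MapVector: (destination, popularity) pairs kept in
// insertion order. Block names are plain strings in this sketch.
using DestEntry = std::pair<std::string, unsigned>;
using DestHistogram = std::vector<DestEntry>;

// Hypothetical helper: returns the entry for Name, or end() if absent.
static DestHistogram::iterator findEntry(DestHistogram &Hist,
                                         const std::string &Name) {
  return std::find_if(Hist.begin(), Hist.end(),
                      [&](const DestEntry &E) { return E.first == Name; });
}

// Returns the most popular destination. Ties go to the destination that
// appears first in the successor list.
std::string findMostPopularDest(const std::vector<std::string> &Successors,
                                const std::vector<std::string> &ThreadDests) {
  assert(!Successors.empty() && "expected at least one successor");

  // Seed the histogram with popularity 0, in successor order.
  DestHistogram DestPopularity;
  for (const std::string &Succ : Successors)
    if (findEntry(DestPopularity, Succ) == DestPopularity.end())
      DestPopularity.emplace_back(Succ, 0u);

  // Build the histogram from the threading destinations.
  for (const std::string &Dest : ThreadDests) {
    auto It = findEntry(DestPopularity, Dest);
    if (It != DestPopularity.end())
      ++It->second;
  }

  // std::max_element returns the first of several equal maxima, so the
  // insertion order alone decides ties.
  auto Best = std::max_element(
      DestPopularity.begin(), DestPopularity.end(),
      [](const DestEntry &L, const DestEntry &R) {
        return L.second < R.second;
      });
  return Best->first;
}
```

Because the container preserves successor order and `std::max_element` keeps the first maximum, no separate pass over the successor list is needed to break ties.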

Reviewers: wmi, efriedma

Reviewed By: wmi

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81030
2020-06-02 18:43:31 -07:00
Vitaly Buka
78160ddec5 [NFC, StackSafety] Change type of internal container
Summary: Depends on D80771.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80847
2020-06-02 18:27:22 -07:00
Vitaly Buka
4d44ddfc60 [MTE] Move tagging in pipeline
Summary:
This removes two analyses from pipeline.

Depends on D80771.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80780
2020-06-02 17:48:55 -07:00
Wei Mi
93d80a543f [SampleFDO] Add use-sample-profile function attribute.
When sampleFDO is enabled, people may expect that they can use
-fno-profile-sample-use to opt out of using the sample profile for a certain file.
That could be either for debugging or for performance tuning.
However, when thinlto is enabled, if a function in file A compiled with
-fno-profile-sample-use is imported to another file B compiled with
-fprofile-sample-use, the inlined copy of the function in file B may still
get its profile annotated.

The inconsistency may even introduce a profile unused warning, because if the
target is not compiled with an explicit debug information flag, the function
in file A won't have its debug information enabled (debug information is
enabled implicitly only when -fprofile-sample-use is used). After it is
imported into file B, which is compiled with -fprofile-sample-use, profile
annotation for the outline copy of the function will fail because the
function has no debug information, and that will trigger a profile unused
warning.

We add a new attribute use-sample-profile to control whether a function
uses its sample profile, for both its outline and inline copies.
That makes the behavior of -fno-profile-sample-use consistent.

Differential Revision: https://reviews.llvm.org/D79959
2020-06-02 17:23:17 -07:00
Guozhi Wei
223393b287 [X86] Add a flag to guard the wide load
As shown in http://lists.llvm.org/pipermail/llvm-dev/2020-May/141854.html,
a widened load can also cause a stall. Add a flag to guard the widening code,
so users can disable it and evaluate its performance impact.

Differential Revision: https://reviews.llvm.org/D80943
2020-06-02 16:16:13 -07:00
Vitaly Buka
69b767eb38 [MTE] Convert StackSafety into analysis
This lets us remove !stack-safe metadata and
better control when to perform StackSafety
analysis.

Reviewers: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80771
2020-06-02 16:08:14 -07:00
Vitaly Buka
eb471b1fc4 [StackSafety] Delete useless test 2020-06-02 16:08:14 -07:00
Nick Desaulniers
6cf6be73f3 [Clang][A32/T32][Linux] -O1 implies -fomit-frame-pointer
Summary:
An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of
errors related to writing to reserved registers for a Linux kernel's
arm64 compat vdso (which is an aarch32 image).

After a discussion on LKML [2], it was determined that
-f{no-}omit-frame-pointer was not being specified. Comparing GCC and
Clang [3], it becomes apparent that GCC defaults to omitting the frame
pointer implicitly when optimizations are enabled, and Clang does not.
i.e. setting -O1 (or above) implies -fomit-frame-pointer. Clang was
defaulting to -fno-omit-frame-pointer implicitly unless -fomit-frame-pointer
was set explicitly.

Why this becomes a problem is that the Linux kernel's arm64 compat vdso
contains code that uses r7. r7 is used sometimes for the frame pointer
(for example, when targeting thumb (-mthumb)). See useR7AsFramePointer()
in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly
for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM
AAPCS looks to standardize r11 as the frame pointer for aarch32, though
this is not yet implemented in LLVM.

Users who rely on the implicit default when optimizations are enabled
should explicitly choose -fomit-frame-pointer
(new behavior) or -fno-omit-frame-pointer (old behavior).

[0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
[1] https://reviews.llvm.org/D76848
[2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/
[3] https://godbolt.org/z/0oY39t

Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma

Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma

Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80828
2020-06-02 15:54:14 -07:00
Eric Christopher
995cc9c405 Undo initialization of TRI in CGP as this is unconditionally initialized
later.
2020-06-02 15:08:54 -07:00
Craig Topper
5a956160d0 [X86] Remove DeleteNode calls from PreprocessISelDAG. Rely on the RemoveDeadNodes call at the end.
Add a MadeChange flag so we don't call RemoveDeadNodes unless
something changed.
2020-06-02 14:10:20 -07:00
Craig Topper
5e8876d712 [X86] Cleanup inconsistencies in our zext/sext vector patterns.
-Fix one place where we had a X86vzload64 but should have had
 X86vzload32.
-Make sure all patterns that have scalar_to_vector+loadi64 also
have scalar_to_vector+f64 to match 32-bit codegen.
-Add some bitcasts that were missing from patterns.
-Make sure that if we have a scalar_to_vector+load pattern
 we also have a vzload pattern.

We probably need some better canonicalization to avoid having
so many patterns.
2020-06-02 13:50:16 -07:00
Kadir Cetinkaya
def2e34328 [llvm] Fix unused variable warning 2020-06-02 22:46:24 +02:00
LLVM GN Syncbot
0638ffd642 [gn build] Port f99d5f8c32a 2020-06-02 20:36:52 +00:00
Eric Christopher
2f90457f4d Fix up clang-tidy warnings around null and pointers. 2020-06-02 13:24:20 -07:00
Amy Kwan
97fd4517d5 [DAGCombiner] Combine shifts into multiply-high
This patch implements a target independent DAG combine to produce multiply-high
instructions from shifts. This DAG combine will combine shifts for any type as
long as the MULH on the narrow type is legal.

For now, it is enabled on PowerPC as PowerPC is the only target that has an
implementation of the isMulhCheaperThanMulShift TLI hook introduced in
D78271.

Moreover, this DAG combine focuses on catching the pattern:
(shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>)
to produce mulhs when we have a sign-extend, and mulhu when we have
a zero-extend.

The patch performs the following checks:
- Operation is a right shift arithmetic (sra) or logical (srl)
- Input to the shift is a multiply
- Both operands to the multiply are sext/zext nodes
- The extends into the multiply are both the same
- The narrow type is half the width of the wide type
- The shift amount is the width of the narrow type
- The respective mulh operation is legal
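
To make the matched pattern concrete, the following sketch writes out the wide form for 32-bit operands in C++. The function names are hypothetical. The value each function returns is the upper half of the product, which is what a multiply-high instruction produces directly.

```cpp
#include <cassert>
#include <cstdint>

// Signed case: (sra (mul (sext a), (sext b)), 32).
// Right-shifting a negative value is an arithmetic shift on mainstream
// compilers, and is guaranteed to be one from C++20 onwards.
std::int32_t mulhsViaWideShift(std::int32_t A, std::int32_t B) {
  std::int64_t Wide = static_cast<std::int64_t>(A) * static_cast<std::int64_t>(B);
  return static_cast<std::int32_t>(Wide >> 32);
}

// Unsigned case: (srl (mul (zext a), (zext b)), 32).
std::uint32_t mulhuViaWideShift(std::uint32_t A, std::uint32_t B) {
  std::uint64_t Wide =
      static_cast<std::uint64_t>(A) * static_cast<std::uint64_t>(B);
  return static_cast<std::uint32_t>(Wide >> 32);
}

// Spot checks of the high halves for a few operand pairs.
void checkMulhExamples() {
  // (2^32 - 1)^2 = 2^64 - 2^33 + 1, so the high half is 2^32 - 2.
  assert(mulhuViaWideShift(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);
  // -1 * 1 = -1, whose high half is all ones.
  assert(mulhsViaWideShift(-1, 1) == -1);
  // (-2^31)^2 = 2^62, so the high half is 2^30.
  assert(mulhsViaWideShift(INT32_MIN, INT32_MIN) == 0x40000000);
}
```

Here the narrow type is 32 bits, the wide type is 64 bits, and the shift amount equals the narrow width, matching the checks listed above.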

Differential Revision: https://reviews.llvm.org/D78272
2020-06-02 15:22:48 -05:00
Thomas Lively
dbbd248c77 [WebAssembly] Eliminate range checks on br_tables
Summary:
Jump tables for most targets cannot handle out of range indices by
themselves, so LLVM emits range checks to guard the jump
tables. WebAssembly, on the other hand, implements jump tables using
the br_table instruction, which takes a default branch target as an
operand, making the range checks redundant. This patch introduces a
new MachineFunction pass in the WebAssembly backend to find and
eliminate the redundant range checks.
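
The redundancy can be illustrated with a small model. This is a sketch, not backend code: branch targets are modeled as integer labels, and `brTable` and `guardedDispatch` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Branch targets are modeled as small integer labels in this sketch.
using Label = int;

// Model of br_table: an index outside the table selects the default target.
Label brTable(const std::vector<Label> &Targets, Label Default,
              std::uint32_t Index) {
  return Index < Targets.size() ? Targets[Index] : Default;
}

// Model of the generic lowering: an explicit range check guards the table.
// With br_table as the table, the check never changes the result.
Label guardedDispatch(const std::vector<Label> &Targets, Label Default,
                      std::uint32_t Index) {
  if (Index >= Targets.size()) // range check made redundant by br_table
    return Default;
  return brTable(Targets, Default, Index);
}

// Confirms that both forms pick the same target for in-range and
// out-of-range indices.
void checkRangeCheckIsRedundant() {
  const std::vector<Label> Targets = {10, 20, 30};
  const Label Default = 99;
  for (std::uint32_t Index = 0; Index < 8; ++Index)
    assert(guardedDispatch(Targets, Default, Index) ==
           brTable(Targets, Default, Index));
}
```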

Reviewers: aheejin, dschuff

Subscribers: mgorny, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80863
2020-06-02 13:14:27 -07:00
dstuttar
11327a08da [TableGen] Avoid generating switch with just default
Summary:
Switch with just default causes an MSVC warning (warning C4065: switch statement
contains 'default' but no 'case' labels).
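
A minimal sketch of the idea, assuming a hypothetical emitter helper (`emitDispatch` is not TableGen's actual code): when there are no cases, emit the default body on its own rather than wrapping it in a switch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical emitter helper, not TableGen's actual code. Each case is a
// (value, body) pair, and the generated code switches on a variable that
// is assumed to be named "Value".
std::string emitDispatch(const std::vector<std::pair<int, std::string>> &Cases,
                         const std::string &DefaultBody) {
  assert(!DefaultBody.empty() && "expected a default body");

  // With no cases, a switch would contain only 'default', which is what
  // MSVC warning C4065 reports. Emit the body directly instead.
  if (Cases.empty())
    return DefaultBody + "\n";

  std::string Out = "switch (Value) {\n";
  for (const auto &Case : Cases)
    Out += "case " + std::to_string(Case.first) + ": " + Case.second + "\n";
  Out += "default: " + DefaultBody + "\n}\n";
  return Out;
}
```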

Change-Id: I9ddeccdef93666256b5454b164b567b73b488461

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81021
2020-06-02 19:48:07 +01:00
Diego Caballero
986805054f Update 'git push' command in GettingStarted guide
The 'git push' command, without any other arguments, can do different
things depending on the local configuration of Git. This patch
updates the 'git push' command with extra arguments to make it more
resilient to any local configuration.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D79964
2020-06-02 21:25:29 +03:00
Jonas Devlieghere
2b24554801 [llvm-dwarfdump] Print [=<offset>] after --debug-* options in help output.
Some of the --debug-* options can take an optional offset. Although the
man page does a good job of making that clear, it's much harder to
discover from the help output.

Currently the only reference to this is the following sentence:

> Where applicable these parameters take an optional =<offset> argument
> to dump only the entry at the specified offset.

This patch changes the help output to print [=<offset>] after the
options that take an offset.

  --debug-info[=<offset>]    - Dump the .debug_info section

rdar://problem/63150066

Differential revision: https://reviews.llvm.org/D80959
2020-06-02 11:06:11 -07:00
Matt Arsenault
f2733aab9a AMDGPU: Fix a test to be more stable
The chained unconditional branches can be eliminated and it's not
relevant to the test.
2020-06-02 13:47:48 -04:00
Matt Arsenault
bf98af2851 AMDGPU: Don't run indexing mode switches with exec = 0
Add mode defs rather than special casing this like some of the other
instructions.
2020-06-02 13:47:48 -04:00
Matt Arsenault
58874f0270 AMDGPU: Don't run mode switches with exec 0
These are scalar instructions that change vector instructions, so they
should not be executed without any active lanes.

The implementation of -amdgpu-skip-threshold also seems to be backwards
from expected, since decreasing it prevents removal.
2020-06-02 13:47:48 -04:00
Hiroshi Yamauchi
22e9592a06 [PGO] Enable memcmp/bcmp size value profiling.
Summary: Following up D79751.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80578
2020-06-02 10:27:11 -07:00
Sanjay Patel
4bd1716bf2 [InstCombine] add tests for select-of-select-shuffle; NFC 2020-06-02 13:26:21 -04:00
Sanjay Patel
48da5ea648 [InstCombine] regenerate complete test checks; NFC 2020-06-02 13:26:21 -04:00
Simon Pilgrim
2ec7714439 TypeSymbolEmitter.h - reduce includes to forward declarations. NFC. 2020-06-02 16:30:17 +01:00
Alexey Bataev
1d1cfb66ce [OPENMP50]Initial codegen for 'affinity' clauses.
Summary:
Added initial codegen for 'affinity' clauses on task directives.
Emits the following code:
```
kmp_task_affinity_info_t affs[<num_elems>];

void *td = __kmpc_task_alloc(..);

affs[<i>].base = &data_i;
affs[<i>].size = sizeof(data_i);
__kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs);
```

The result returned by the call to the `__kmpc_omp_reg_task_with_affinity`
function is currently ignored, since the runtime currently ignores the args
and returns 0 unconditionally.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80240
2020-06-02 10:50:08 -04:00
Georgii Rymar
935ddb2639 [yaml2obj] - Allocate the file space for SHT_NOBITS sections in some cases.
This teaches yaml2obj to allocate file space for a no-bits section
when there is a non-nobits section in the same segment that follows it.

This was discussed in the D78005 thread and matches the behavior of GNU linkers and LLD.
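
The rule stated above can be sketched as a small function over the sections of one segment. This is an illustration only: `SectionDesc` and `occupiesFileSpace` are hypothetical names, and the sketch ignores everything else yaml2obj considers (offsets, alignment, sizes).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a section inside one segment.
struct SectionDesc {
  std::string Name;
  bool IsNoBits; // true for SHT_NOBITS sections such as .bss
};

// For each section of a segment (in layout order), reports whether it takes
// up file space: regular sections always do, and a no-bits section does only
// when a non-nobits section follows it in the same segment.
std::vector<bool> occupiesFileSpace(const std::vector<SectionDesc> &Segment) {
  std::vector<bool> Result(Segment.size(), false);
  bool NonNoBitsFollows = false;
  // Walk backwards so that "does a non-nobits section follow?" is known
  // by the time each section is visited.
  for (std::size_t I = Segment.size(); I-- > 0;) {
    Result[I] = !Segment[I].IsNoBits || NonNoBitsFollows;
    if (!Segment[I].IsNoBits)
      NonNoBitsFollows = true;
  }
  return Result;
}

// A trailing no-bits section needs no file space; one in the middle does.
void checkNoBitsExample() {
  std::vector<SectionDesc> Segment = {
      {".data", false}, {".bss", true}, {".tail", false}, {".bss2", true}};
  std::vector<bool> Expected = {true, true, true, false};
  assert(occupiesFileSpace(Segment) == Expected);
}
```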

Differential revision: https://reviews.llvm.org/D80629
2020-06-02 17:19:24 +03:00
serge-sans-paille
965b4cbc82 Use Pseudo Instruction to carry stack probing information
Instead of using a fake call and metadata to temporarily represent a probed
static alloca, use a pseudo instruction.

This is inspired by the SystemZ approach proposed in https://reviews.llvm.org/D78717.

Differential Revision: https://reviews.llvm.org/D80641
2020-06-02 16:14:06 +02:00
Matt Arsenault
e6f5e03023 AMDGPU: Fix not using scalar loads for global reads in shaders
The pass which infers when it's legal to load a global address space
as SMRD was only considering amdgpu_kernel, and ignoring the shader
entry type calling conventions.
2020-06-02 09:49:23 -04:00
Nico Weber
50523c63f4 [gn build] (manually) port 44f989e7809 2020-06-02 08:18:42 -04:00
Igor Kudrin
17735d83e5 Fix a failing test. 2020-06-02 18:50:36 +07:00
Djordje Todorovic
475384322f [CSInfo][NFC] Interpret loaded parameter value separately
The collectCallSiteParameters() method searches for instructions
that load values into registers used for parameter passing.
Previously, the interpretation of the values loaded by one such
instruction was implemented inside collectCallSiteParameters() itself.

This patch moves the interpretation code from collectCallSiteParameters()
into a separate static method named interpretValue. The new method is
called from collectCallSiteParameters() to process each instruction in the
targeted instruction scope.

collectCallSiteParameters() searches for a loaded parameter value
among the instructions that precede the call instruction, inside the same
basic block. When needed, the new method (interpretValue) could be used to
search any instruction scope.

This is preparation for searching for a parameter value loaded inside a call
delay slot.

Patch by Nikola Tesic

Differential revision: https://reviews.llvm.org/D78106
2020-06-02 13:05:04 +02:00