1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00
Commit Graph

614 Commits

Author SHA1 Message Date
Gulfem Savrun Yeniceri
54e2d4cdab Revert "[Passes] Add relative lookup table converter pass"
This reverts commit 5fd001a5ffbad403053c4a06bf4b2b76dc52bba8
because it broke clang-with-thin-lto-ubuntu bot.
2021-03-24 18:59:33 +00:00
Gulfem Savrun Yeniceri
93b265f8c0 [Passes] Add relative lookup table converter pass
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244

This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.

Differential Revision: https://reviews.llvm.org/D94355
2021-03-24 17:31:18 +00:00
Jamie Schmeiser
295b85edfb Revert "A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash."
This reverts commit 9544a32287eccb42a4367cff9b5888855d6b8756.
2021-03-23 10:09:27 -04:00
Jamie Schmeiser
c48e552d8e A new option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash.
Summary:
The IR is saved in its print form before each pass is started and a
signal handler is registered.  If the compilation crashes, the signal
handler will print the saved IR to dbgs().  This option
can be modified using -print-module-scope to get the IR for the complete
module.  Note that this option only works with the new pass manager.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D86657
2021-03-23 09:29:17 -04:00
Gulfem Savrun Yeniceri
61bfb34ac2 Revert "[Passes] Add relative lookup table converter pass"
This reverts commit 78a65cd945d006ff02f9d24d9cc20a302ed93b08 which
caused buildbot failures.
2021-03-23 00:43:16 +00:00
Gulfem Savrun Yeniceri
59cc51764b [Passes] Add relative lookup table converter pass
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244

This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.

Differential Revision: https://reviews.llvm.org/D94355
2021-03-22 22:09:02 +00:00
Sriraman Tallam
af0d8fe721 Remove original implementation of UniqueInternalLinkageNames pass.
D96109 was recently submitted which contains the refactored implementation of
-funique-internal-linakge-names by adding the unique suffixes in clang rather
than as an LLVM pass. Deleting the former implementation in this change.

Differential Revision: https://reviews.llvm.org/D98234
2021-03-10 11:57:40 -08:00
Ta-Wei Tu
87e3ae17fd [NPM] Add -enable-loopinterchange option to NPM
We have the `enable-loopinterchange` option in legacy pass manager but not in NPM.
Add `LoopInterchange` pass to the optimization pipeline (at the same position as before)
when `enable-loopinterchange` is turned on.

Reviewed By: aeubanks, fhahn

Differential Revision: https://reviews.llvm.org/D98116
2021-03-07 02:39:28 +08:00
Arthur Eubanks
e057f4f89a [ThinLTO][NewPM] Clean up dead code under -O0
We're running into undefined references using ThinLTO with -O0 on
Windows/Chrome. This fixes that.

This matches the legacy PM.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97414
2021-02-24 17:08:57 -08:00
Vitaly Buka
ab66a0320d [ThinLTO, NewPM] Run OptimizerLastEPCallbacks from buildThinLTOPreLinkDefaultPipeline
-O1 and above do dont call real optimizer pipeline in ThinLTO PreLink.
Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO.
This results in missing sanitizer passes with ThinLTO.

Simple working solution is just call OptimizerLastEPCallbacks
at the end of buildThinLTOPreLinkDefaultPipeline.

Differential Revision: https://reviews.llvm.org/D96320
2021-02-23 22:14:41 -08:00
Wenlei He
e2d4d21201 [SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink
For ThinLTO, PreLink ICP is skipped to favor better profile annotation during LTO PostLink. This change applies the same tweak for MonoLTO. Note that PreLink ICP not only makes PostLink profile annotation harder, it is also uncoordinated with PostLink ICP so duplicated ICP could happen.

Differential Revision: https://reviews.llvm.org/D97028
2021-02-19 19:35:23 -08:00
Nikita Popov
92fca6922a [MemCopyOpt] Enable MemorySSA by default
This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376
2021-02-19 18:06:25 +01:00
David Green
20be3bc420 [NPM][LTO] Do not enable MemorySSA with LoopFullUnrollPass
As with the standard opt pipeline, we disable the MemorySSA dependency
in the LTO LPM pipeline as not all passes preserve MemorySSA.
2021-02-19 08:35:11 +00:00
David Green
0e6287fa19 [NPM][LTO] Update buildLTODefaultPipeline to be more in-line with the old pass manager
The NPM LTO pipeline has a lot of fixme's and missing passes, causing a
lot of regressions after the switch in c70737b. Notably unrolling and
vectorization were both disabled, but many other passes are missing
compared to the old pass manager. This attempt to enable the most
obvious missing passes like the unroller, vectorization and other loop
passes, fixing the existing FIXME comments.

Differential Revision: https://reviews.llvm.org/D96780
2021-02-17 16:56:28 +00:00
Sanne Wouda
81bbde389a Use LoopRotate PrepareForLTO stage in NPM
The PrepareForLTO stage of LoopRotate tries to avoid unrolling loops
with calls that might be inlined later.  See D94232 where this was
introduced.

We didn't catch all occurances of the LoopRotatePass in the New Pass
Manager, so the original regression in astar returned with the pass
manager switch.
2021-02-17 14:06:57 +00:00
Sameer Sahasrabuddhe
76fb79614f [NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager
The GPUDivergenceAnalysis is now renamed to just "DivergenceAnalysis"
since there is no conflict with LegacyDivergenceAnalysis. In the
legacy PM, this analysis can only be used through the legacy DA
serving as a wrapper. It is now made available as a pass in the new
PM, and has no relation with the legacy DA.

The new DA currently cannot handle irreducible control flow; its
presence can cause the analysis to run indefinitely. The analysis is
now modified to detect this and report all instructions in the
function as divergent. This is super conservative, but allows the
analysis to be used without hanging the compiler.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D96615
2021-02-16 10:26:45 +05:30
Arthur Eubanks
31acdea7b2 [opt][NewPM] Add a --print-passes flag to print all available passes
It seems nicer to list passes given a flag rather than displaying all
passes in opt --help.

This is awkwardly structured because a PassBuilder is required, but
reusing the PassBuilder in runPassPipeline() doesn't work because we
read the input IR before getting to runPassPipeline(). So printing the
list of passes needs to happen before reading the input IR. If we remove
the legacy PM code in main() and move everything from NewPMDriver.cpp
into opt.cpp, we can create the PassBuilder before reading IR and check
if we should print the list of passes and exit. But until then this hack
seems fine.

Compared to the legacy PM, the new PM passes are lacking descriptions.
We'll need to figure out a way to add descriptions if we think this is
important.

Also, this only works for passes specified in PassRegistry.def. If we
want to print other custom registered passes, we'll need a different
mechanism.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D96101
2021-02-10 11:22:12 -08:00
Arthur Eubanks
213c28c3fc [NewPM][HelloWorld] Move HelloWorld to Utils
To prevent creating a new component, which creates a new library.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D95907
2021-02-03 12:59:40 -08:00
Hongtao Yu
3594bb4c1b [CSSPGO] Introducing distribution factor for pseudo probe.
Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count.

This change also introduces a pseudo probe verifier that can be run after each IR passes to detect duplicated pseudo probes.

A saturated distribution factor stands for 1.0. A pesudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead.

Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees’s probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callisites. The original samples should be distributed across them. This is fixed by adjusting the callisites' distribution factors.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D93264
2021-02-02 11:55:01 -08:00
Arthur Eubanks
47246e5e03 [NewPM][Unswitch] Add option to disable -O3 non-trivial unswitching
Some benchmarks regress with non-trivial unswitching, so add an option
to opt-out of performing non-trivial unswitching while investigating.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D95796
2021-02-01 11:11:59 -08:00
Bjorn Pettersson
3aa7f0e352 [NewPM] Add ExtraVectorizerPasses support
As it looks like NewPM generally is using SimpleLoopUnswitch
instead of LoopUnswitch, this patch also use SimpleLoopUnswitch
in the ExtraVectorizerPasses sequence (compared with LegacyPM
which use the LoopUnswitch pass).

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D95457
2021-01-26 22:59:10 +01:00
Florian Hahn
d596025713 [LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands.
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.

The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.

This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.

I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.

From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory, has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.

There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have be
vectorized earlier (by scalarzing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).

There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.

I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.

Reviewed By: sanwou01

Differential Revision: https://reviews.llvm.org/D94232
2021-01-19 10:15:29 +00:00
Mircea Trofin
d5b73b5222 [NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner
Expanding from D94808 - we ensure the same InlineAdvisor is used by both
InlinerPass instances. The notion of mandatory inlining is moved into
the core InlineAdvisor: advisors anyway have to handle that case, so
this change also factors out that a bit better.

Differential Revision: https://reviews.llvm.org/D94825
2021-01-15 17:59:38 -08:00
Arthur Eubanks
3ceef7fba8 [NewPM] Fix placement of LoopFlatten
https://reviews.llvm.org/D90402 was inconsistent with where it put
LoopFlatten between the two pass managers. It also missed adding it to
the non-O1 function simplification pipeline.

PR48738

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D94650
2021-01-14 09:49:31 -08:00
Wei Mi
31ab135817 [NFC] Rename ThinLTOPhase to ThinOrFullLTOPhase and move it from PassBuilder.h
to Pass.h.

In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communiate with each pass about current LTO phase.

Differential Revision: https://reviews.llvm.org/D94613
2021-01-13 15:55:40 -08:00
Arthur Eubanks
5a8b898a6d [NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions
This matches the legacy pipeline/pass.

Reviewed By: asbirlea, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D94559
2021-01-13 14:54:49 -08:00
Arthur Eubanks
d28c8be642 [NewPM] Run non-trivial loop unswitching under -O2/3/s/z
Fixes https://bugs.llvm.org/show_bug.cgi?id=48715.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D94448
2021-01-12 11:04:40 -08:00
Arthur Eubanks
17f2d36a06 [NewPM] Don't error when there's an unrecognized pass name
This currently blocks --print-before/after with a legacy PM pass, for
example when we use the new PM for the optimization pipeline but the
legacy PM for the codegen pipeline. Also in the future when the codegen
pipeline works with the new PM there will be multiple places to specify
passes, so even when everything is using the new PM, there will still be
multiple places that can accept different pass names.

Reviewed By: hoy, ychen

Differential Revision: https://reviews.llvm.org/D94283
2021-01-07 22:33:32 -08:00
Arthur Eubanks
d0c52bd0f8 [NFC] Rename registerAliasAnalyses -> registerDefaultAliasAnalyses
To clarify that this only affects the "default" AA.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D93980
2021-01-05 11:07:58 -08:00
Kazu Hirata
a5e31bfa6a [llvm] Use llvm::any_of (NFC) 2021-01-04 11:42:47 -08:00
Florian Hahn
1ba1b63d45 [SimplifyCFG] Enabled hoisting late in LTO pipeline.
bb7d3af1139c disabled hoisting in SimplifyCFG by default, but enabled it
late in the pipeline. But it appears as if the LTO pipelines got missed.

This patch adjusts the LTO pipelines to also enable hoisting in the
later stages.

Unfortunately there's no easy way to add a test for the change I think.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D93684
2021-01-04 16:26:58 +00:00
Hongtao Yu
44ed003650 Moving UniqueInternalLinkageNamesPass to the start of IR pipelines.
`UniqueInternalLinkageNamesPass` is useful to CSSPGO, especially when pseudo probe is used. It solves naming conflict for static functions which otherwise will share a merged profile and likely have a profile quality issue with mismatched CFG checksums. Since the pseudo probe instrumentation happens very early in the pipeline, I'm moving `UniqueInternalLinkageNamesPass` right before it. This is being done only to the new pass manager.

Reviewed By: dblaikie, aeubanks

Differential Revision: https://reviews.llvm.org/D93656
2021-01-02 14:26:21 -08:00
Arthur Eubanks
6a52d86d30 [NewPM] Port infer-address-spaces
And add it to the AMDGPU opt pipeline.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D93880
2020-12-28 19:58:12 -08:00
Arthur Eubanks
3fa4c73c5f [NewPM] Fix CGSCCOptimizerLateEPCallbacks place in pipeline
CGSCCOptimizerLateEPCallbacks are supposed to be run before the function
simplification pipeline, like in the legacy PM and as specified in the
comments for registerCGSCCOptimizerLateEPCallback().

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D93871
2020-12-28 14:03:10 -08:00
Arthur Eubanks
aac9763bc3 [NewPM] Add TargetMachine method to add alias analyses
AMDGPUTargetMachine::adjustPassManager() adds some alias analyses to the
legacy PM. We need a way to do the same for the new PM in order to port
AMDGPUTargetMachine::adjustPassManager() to the new PM.

Currently the new PM adds alias analyses by creating an AAManager via
PassBuilder and overriding the AAManager a PassManager uses via
FunctionAnalysisManager::registerPass().

We will continue to respect a custom AA pipeline that specifies an exact
AA pipeline to use, but for "default" we will now add alias analyses
that backends specify. Most uses of PassManager use the "default"
AAManager created by PassBuilder::buildDefaultAAPipeline(). Backends can
override the newly added TargetMachine::registerAliasAnalyses() to add custom
alias analyses.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D93261
2020-12-21 13:46:07 -08:00
Andrew Litteken
cdd15bf157 [IRSim][IROutliner] Adding the extraction basics for the IROutliner.
Extracting the similar regions is the first step in the IROutliner.

Using the IRSimilarityIdentifier, we collect the SimilarityGroups and
sort them by how many instructions will be removed.  Each
IRSimilarityCandidate is used to define an OutlinableRegion.  Each
region is ordered by their occurrence in the Module and the regions that
are not compatible with previously outlined regions are discarded.

Each region is then extracted with the CodeExtractor into its own
function.

We test that correctly extract in:
test/Transforms/IROutliner/extraction.ll
test/Transforms/IROutliner/address-taken.ll
test/Transforms/IROutliner/outlining-same-globals.ll
test/Transforms/IROutliner/outlining-same-constants.ll
test/Transforms/IROutliner/outlining-different-structure.ll

Recommit of bf899e891387d07dfd12de195ce2a16f62afd5e0 fixing memory
leaks.

Reviewers: paquette, jroelofs, yroux

Differential Revision: https://reviews.llvm.org/D86975
2020-12-17 11:27:26 -06:00
dfukalov
b7b67e3e9a [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Bardia Mahjour
b2e0054ca0 [DDG] Data Dependence Graph - DOT printer - recommit
This is being recommitted to try and address the MSVC complaint.

This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D90159
2020-12-16 12:37:36 -05:00
Bardia Mahjour
8db5fd0457 Revert "[DDG] Data Dependence Graph - DOT printer"
This reverts commit fd4a10732c8bd646ccc621c0a9af512be252f33a, to
investigate the failure on windows: http://lab.llvm.org:8011/#/builders/127/builds/3274
2020-12-14 16:54:20 -05:00
Bardia Mahjour
6d099be63b [DDG] Data Dependence Graph - DOT printer
This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.

Differential Revision: https://reviews.llvm.org/D90159
2020-12-14 16:41:14 -05:00
Zequan Wu
94ff08c70f [PGO] Enable preinline and cleanup when optimize for size
Differential Revision: https://reviews.llvm.org/D91673
2020-12-10 12:29:17 -08:00
Arthur Eubanks
093409b8fe [NPM] Support -fmerge-functions
I tried to put it in the same place in the pipeline as the legacy PM.

Fixes PR48399.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D93002
2020-12-10 11:45:08 -08:00
Anna Thomas
ea0d3b5f95 [ScalarizeMaskedMemIntrin] Add new PM support
This patch adds new PM support for the pass and the pass can be now used
during middle-end transforms. The old pass is remamed to
ScalarizeMaskedMemIntrinLegacyPass.

Reviewed-By: skatkov, aeubanks
Differential Revision: https://reviews.llvm.org/D92743
2020-12-08 17:15:22 -05:00
Arthur Eubanks
5bc5d5ec44 [NewPM] Support --print-before/after in NPM
This changes --print-before/after to be a list of strings rather than
legacy passes. (this also has the effect of not showing the entire list
of passes in --help-hidden after --print-before/after, which IMO is
great for making it less verbose).

Currently PrintIRInstrumentation passes the class name rather than pass
name to llvm::shouldPrintBeforePass(), meaning
llvm::shouldPrintBeforePass() never functions as intended in the NPM.
There is no easy way of converting class names to pass names outside of
within an instance of PassBuilder.

This adds a map of pass class names to their short names in
PassRegistry.def within PassInstrumentationCallbacks. It is populated
inside the constructor of PassBuilder, which takes a
PassInstrumentationCallbacks.

Add a pointer to PassInstrumentationCallbacks inside
PrintIRInstrumentation and use the newly created map.

This is a bit hacky, but I can't think of a better way since the short
id to class name only exists within PassRegistry.def. This also doesn't
handle passes not in PassRegistry.def but rather added via
PassBuilder::registerPipelineParsingCallback().

llvm/test/CodeGen/Generic/print-after.ll doesn't seem very useful now
with this change.

Reviewed By: ychen, jamieschmeiser

Differential Revision: https://reviews.llvm.org/D87216
2020-12-03 16:52:14 -08:00
Mircea Trofin
6fefe6148b [llvm][inliner] Reuse the inliner pass to implement 'always inliner'
Enable performing mandatory inlinings upfront, by reusing the same logic
as the full inliner, instead of the AlwaysInliner. This has the
following benefits:
- reduce code duplication - one inliner codebase
- open the opportunity to help the full inliner by performing additional
function passes after the mandatory inlinings, but before th full
inliner. Performing the mandatory inlinings first simplifies the problem
the full inliner needs to solve: less call sites, more contextualization, and,
depending on the additional function optimization passes run between the
2 inliners, higher accuracy of cost models / decision policies.

Note that this patch does not yet enable much in terms of post-always
inline function optimization.

Differential Revision: https://reviews.llvm.org/D91567
2020-11-30 12:03:39 -08:00
Hongtao Yu
37399ba912 [CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation.
This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient.

The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86502
2020-11-30 10:16:54 -08:00
Hongtao Yu
9aa70f5aa6 [CSSPGO] Pseudo probe instrumentation pass
This change introduces a pseudo probe instrumentation pass for block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
   br i1 %cmp, label %bb1, label %bb2
bb1:
   br label %bb3
bb2:
   br label %bb3
bb3:
   ret void
}
```

The instrumented IR will look like below. Note that each llvm.pseudoprobe intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
   br i1 %cmp, label %bb1, label %bb2
bb1:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
   br label %bb3
bb2:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
   br label %bb3
bb3:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
   ret void
}
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86499
2020-11-30 10:16:54 -08:00
Andrew Litteken
29b0588fca Revert "[IRSim][IROutliner] Adding the extraction basics for the IROutliner."
Reverting commit due to address sanitizer errors.

> Extracting the similar regions is the first step in the IROutliner.
> 
> Using the IRSimilarityIdentifier, we collect the SimilarityGroups and
> sort them by how many instructions will be removed.  Each
> IRSimilarityCandidate is used to define an OutlinableRegion.  Each
> region is ordered by their occurrence in the Module and the regions that
> are not compatible with previously outlined regions are discarded.
> 
> Each region is then extracted with the CodeExtractor into its own
> function.
> 
> We test that correctly extract in:
> test/Transforms/IROutliner/extraction.ll
> test/Transforms/IROutliner/address-taken.ll
> test/Transforms/IROutliner/outlining-same-globals.ll
> test/Transforms/IROutliner/outlining-same-constants.ll
> test/Transforms/IROutliner/outlining-different-structure.ll
> 
> Reviewers: paquette, jroelofs, yroux
> 
> Differential Revision: https://reviews.llvm.org/D86975

This reverts commit bf899e891387d07dfd12de195ce2a16f62afd5e0.
2020-11-27 19:55:57 -06:00
Andrew Litteken
4300f34d01 [IRSim][IROutliner] Adding the extraction basics for the IROutliner.
Extracting the similar regions is the first step in the IROutliner.

Using the IRSimilarityIdentifier, we collect the SimilarityGroups and
sort them by how many instructions will be removed.  Each
IRSimilarityCandidate is used to define an OutlinableRegion.  Each
region is ordered by their occurrence in the Module and the regions that
are not compatible with previously outlined regions are discarded.

Each region is then extracted with the CodeExtractor into its own
function.

We test that correctly extract in:
test/Transforms/IROutliner/extraction.ll
test/Transforms/IROutliner/address-taken.ll
test/Transforms/IROutliner/outlining-same-globals.ll
test/Transforms/IROutliner/outlining-same-constants.ll
test/Transforms/IROutliner/outlining-different-structure.ll

Reviewers: paquette, jroelofs, yroux

Differential Revision: https://reviews.llvm.org/D86975
2020-11-27 19:08:29 -06:00
Roman Lebedev
bb0e9c020e [PassManager] Run Induction Variable Simplification pass *after* Recognize loop idioms pass, not before
Currently, `-indvars` runs first, and then immediately after `-loop-idiom` does.
I'm not really sure if `-loop-idiom` requires `-indvars` to run beforehand,
but i'm *very* sure that `-indvars` requires `-loop-idiom` to run afterwards,
as it can be seen in the phase-ordering test.

LoopIdiom runs on two types of loops: countable ones, and uncountable ones.
For uncountable ones, IndVars obviously didn't make any change to them,
since they are uncountable, so for them the order should be irrelevant.
For countable ones, well, they should have been countable before IndVars
for IndVars to make any change to them, and since SCEV is used on them,
it shouldn't matter if IndVars have already canonicalized them.
So i don't really see why we'd want the current ordering.

Should this cause issues, it will give us a reproducer test case
that shows flaws in this logic, and we then could adjust accordingly.

While this is quite likely beneficial in-the-wild already,
it's a required part for the full motivational pattern
behind `left-shift-until-bittest` loop idiom (D91038).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91800
2020-11-25 19:20:07 +03:00
Arthur Eubanks
03d1745e44 [NewPM] Add pipeline EP callback after initial frontend cleanup
This matches the legacy PM's EP_ModuleOptimizerEarly. Some backends use
this extension point and adding the pass somewhere else like
PipelineStartEPCallback doesn't work.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D91804
2020-11-24 21:14:36 -08:00
Arthur Eubanks
f4ee2f482f Reland [CGSCC] Detect devirtualization in more cases
The devirtualization wrapper misses cases where if it wraps a pass
manager, an individual pass may devirtualize an indirect call created by
a previous pass. For example, inlining may create a new indirect call
which is devirtualized by instcombine. Currently the devirtualization
wrapper will not see that because it only checks cgscc edges at the very
beginning and end of the pass (manager) it wraps.

This fixes some tests testing this exact behavior in the legacy PM.

Instead of checking WeakTrackingVHs for CallBases at the very beginning
and end of the pass it wraps, check every time
updateCGAndAnalysisManagerForPass() is called.

check-llvm and check-clang with -abort-on-max-devirt-iterations-reached
on by default doesn't show any failures outside of tests specifically
testing it so it doesn't needlessly rerun passes more than necessary.
(The NPM -O2/3 pipeline run the inliner/function simplification pipeline
under a devirtualization repeater pass up to 4 times by default).

http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks
shows that 7zip has ~1% compile time regression. I looked at it and saw
that there indeed was devirtualization happening that was not previously
caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI.

The initial land assumed CallBase WeakTrackingVHs would always be
CallBases, but they can be RAUW'd with undef.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89587
2020-11-23 21:28:59 -08:00
Arthur Eubanks
0cdba06f91 Revert "[CGSCC] Detect devirtualization in more cases"
This reverts commit 14a68b4aa9732293ad7e16f105b0feb53dc8dbe2.

Causes building self hosted clang to crash when using NPM.
2020-11-23 13:21:05 -08:00
Arthur Eubanks
f4f69878dc [NPM] Share pass building options with legacy PM
We should share options when possible.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91741
2020-11-23 13:04:05 -08:00
Arthur Eubanks
1d64d9e07d Port -print-memderefs to NPM
There is lots of code duplication, but hopefully it won't matter soon.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D91683
2020-11-23 11:56:22 -08:00
Arthur Eubanks
9cf634ad08 [CGSCC] Detect devirtualization in more cases
The devirtualization wrapper misses cases where if it wraps a pass
manager, an individual pass may devirtualize an indirect call created by
a previous pass. For example, inlining may create a new indirect call
which is devirtualized by instcombine. Currently the devirtualization
wrapper will not see that because it only checks cgscc edges at the very
beginning and end of the pass (manager) it wraps.

This fixes some tests testing this exact behavior in the legacy PM.

Instead of checking WeakTrackingVHs for CallBases at the very beginning
and end of the pass it wraps, check every time
updateCGAndAnalysisManagerForPass() is called.

check-llvm and check-clang with -abort-on-max-devirt-iterations-reached
on by default doesn't show any failures outside of tests specifically
testing it so it doesn't needlessly rerun passes more than necessary.
(The NPM -O2/3 pipeline run the inliner/function simplification pipeline
under a devirtualization repeater pass up to 4 times by default).

http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks
shows that 7zip has ~1% compile time regression. I looked at it and saw
that there indeed was devirtualization happening that was not previously
caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89587
2020-11-23 11:55:20 -08:00
Arthur Eubanks
2bf127ff66 [PGO] Make -disable-preinline work with NPM
Fixes cspgo_profile_summary.ll under NPM.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D91826
2020-11-19 22:58:55 -08:00
Arthur Eubanks
1e77e12c0e Port -lower-matrix-intrinsics-minimal to NPM
This reuses the existing lower-matrix-intrinsics pass rather than going
the legacy pass route of creating a new pass.

Use this new variant in the NPM -O0 pipeline.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91811
2020-11-19 17:42:48 -08:00
Arthur Eubanks
9dd02d7898 [NPM] Move more O0 pass building into PassBuilder
This moves handling of alwaysinline, coroutines, matrix lowering, PGO,
and LTO-required passes into PassBuilder. Much of this is replicated
between Clang and opt. Other out-of-tree users also replicate some of
this, such as Rust [1] replicating the alwaysinline, LTO, and PGO
passes.

The LTO passes are also now run in
build(Thin)LTOPreLinkDefaultPipeline() since they are semantically
required for (Thin)LTO.

[1]: f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D91585
2020-11-19 11:22:23 -08:00
Florian Hahn
e24d5bd069 Add pass to add !annotate metadata from @llvm.global.annotations.
This patch adds a new pass to add !annotation metadata for entries in
@llvm.global.anotations, which is generated  using
__attribute__((annotate("_name"))) on functions in Clang.

This has been discussed on llvm-dev as part of
    RFC: Combining Annotation Metadata and Remarks
    http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D91195
2020-11-16 14:57:11 +00:00
Florian Hahn
041da6277f Add !annotation metadata and remarks pass.
This patch adds a new !annotation metadata kind which can be used to
attach annotation strings to instructions.

It also adds a new pass that emits summary remarks per function with the
counts for each annotation kind.

The intended uses cases for this new metadata is annotating
'interesting' instructions and the remarks should provide additional
insight into transformations applied to a program.

To motivate this, consider these specific questions we would like to get answered:

* How many stores added for automatic variable initialization remain after optimizations? Where are they?
* How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated?

Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)

Reviewed By: thegameg, jdoerfert

Differential Revision: https://reviews.llvm.org/D91188
2020-11-13 13:24:10 +00:00
Arthur Eubanks
8a8876f845 [NewPM] Provide method to run all pipeline callbacks, used for -O0
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.

This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.

Since callbacks may end up not adding passes, we need to check if the
pass managers are empty before adding them, so PassManager now has an
isEmpty() function. For example, polly adds callbacks but doesn't always
add passes in those callbacks, so this is necessary to keep
-debug-pass-manager tests' output from changing depending on if polly is
enabled or not.

Tests are a continuation of those added in
https://reviews.llvm.org/D89083.

Reviewed By: asbirlea, Meinersbur

Differential Revision: https://reviews.llvm.org/D89158
2020-11-11 15:10:27 -08:00
Sjoerd Meijer
2b47fb5801 [LoopFlatten] Run it earlier, just before IndVarSimplify
This is a prep step for widening induction variables in LoopFlatten if this is
posssible (D90640), to avoid having to perform certain overflow checks. Since
IndVarSimplify may already widen induction variables, we want to run
LoopFlatten just before IndVarSimplify. This is a minor reshuffle as both
passes were already close after each other.

Differential Revision: https://reviews.llvm.org/D90402
2020-11-10 20:22:41 +00:00
Sjoerd Meijer
79d4f8d97b [LoopFlatten] Make it a FunctionPass
This converts LoopFlatten from a LoopPass to a FunctionPass so that we don't
run into problems of a loop pass deleting a (inner)loop.

Differential Revision: https://reviews.llvm.org/D90940
2020-11-10 20:03:31 +00:00
Arthur Eubanks
280b8ab920 [NewPM] Port -separate-const-offset-from-gep
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91095
2020-11-09 17:42:36 -08:00
Arthur Eubanks
fe950b7e24 [NewPM] Add unique-internal-linkage-names to PassRegistry.def
Pass was already ported, just not properly hooked up.
2020-11-09 12:54:13 -08:00
Pedro Tammela
b59848c68a [Reg2Mem] add support for the new pass manager
This patch refactors the pass to accomodate the new pass manager
boilerplate.

Differential Revision: https://reviews.llvm.org/D91005
2020-11-08 11:14:05 +00:00
Arthur Eubanks
35a4ce12d4 Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0"
This reverts commit ae38540042668675dd16c642d850115f217ea59f.
As well as some follow-up test fixes.

The original change causes new-pass-manager.ll to fail when polly is enabled.
2020-11-08 00:32:35 -08:00
Arthur Eubanks
8a12301bab [NewPM] Provide method to run all pipeline callbacks, used for -O0
Some targets may add required passes via
    TargetMachine::registerPassBuilderCallbacks(). We need to run those even
    under -O0. As an example, BPFTargetMachine adds
    BPFAbstractMemberAccessPass, a required pass.

    This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
    usage of the NPM) by allowing us to share added passes like coroutines
    and sanitizers between -O0 and other optimization levels.

    Tests are a continuation of those added in
    https://reviews.llvm.org/D89083.

    In order to prevent TargetMachines from adding unnecessary optimization
    passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be
    changed to take an OptimizationLevel, but that will be done separately.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89158
2020-11-04 22:27:16 -08:00
Arthur Eubanks
b3f5096b36 Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback
This allows targets to skip optional optimization passes at -O0.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D90777
2020-11-04 13:11:40 -08:00
Arthur Eubanks
3a6a4e9f83 Revert "[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback"
This reverts commit 7a83aa0520d24ee5285a9c60b97b57a1db1d65e8.

Causing buildbot failures.
2020-11-04 12:57:32 -08:00
Arthur Eubanks
753c4830f9 [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback
This allows targets to skip optional optimization passes at -O0.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D90777
2020-11-04 12:53:30 -08:00
Arthur Eubanks
e22b9f13f5 Port print-must-be-executed-contexts and print-mustexecute to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90207
2020-11-03 21:06:46 -08:00
Arthur Eubanks
3102160c9b [NFC] Clean up PassBuilder
Make DebugLogging a member variable so that users of PassBuilder don't
need to pass it around so much.

Move call to TargetMachine::registerPassBuilderCallbacks() within
PassBuilder so users don't need to remember to call it.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90437
2020-10-30 10:03:59 -07:00
Serguei Katkov
fd4f36fcb9 [GVN LoadPRE] Add an option to disable splitting backedge
GVN Load PRE can split the backedge causing breaking the loop structure where the latch
contains the conditional branch with for example induction variable.

Different optimizations expect this form of the loop, so it is better to preserve it for some time.
This CL adds an option to control an ability to split backedge.

Default value is true so technically it is NFC and current behavior is not changed.

Reviewers: fedor.sergeev, mkazantsev, nikic, reames, fhahn
Reviewed By: mkazasntsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D89854
2020-10-27 11:59:52 +07:00
TaWeiTu
6b4f0b5bdf [NPM] Port -slsr to NPM
`-separate-const-offset-from-gep` has not yet be ported, so some tests are not updated.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90149
2020-10-27 09:21:40 +08:00
Arthur Eubanks
8eb1142964 Fix typo SSC -> SCC 2020-10-24 16:26:48 -07:00
TaWeiTu
39d46368c8 [NPM] Port -loop-versioning-licm to NPM
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89371
2020-10-24 21:51:18 +08:00
Arthur Eubanks
77ffdb3e3c [StructurizeCFG][NewPM] Port -structurizecfg to NPM
This doesn't support -structurizecfg-skip-uniform-regions since that
would require porting LegacyDivergenceAnalysis.

The NPM doesn't support adding a non-analysis pass as a dependency of
another, so I had to add -lowerswitch to some tests or pin them to the
legacy PM.

This is the only RegionPass in tree, so I simply copied the logic for
finding all Regions from the legacy PM's RGManager into
StructurizeCFG::run().

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D89026
2020-10-23 15:54:03 -07:00
Arthur Eubanks
e666973053 Port -instnamer to NPM
Some clang tests use this.

Reviewed By: akhuang

Differential Revision: https://reviews.llvm.org/D89931
2020-10-22 12:08:36 -07:00
Arthur Eubanks
fb8cc45e04 [LoopRotate][NPM] Disable header duplication under -Oz
It was already disabled under -Oz in
buildFunctionSimplificationPipeline(), but not in
buildModuleOptimizationPipeline()/addPGOInstrPasses().

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89927
2020-10-22 08:39:12 -07:00
Arthur Eubanks
7faaf7ec26 [BlockExtract][NewPM] Port -extract-blocks to NPM
Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89015
2020-10-21 12:51:11 -07:00
Florian Hahn
125f3aa1f8 [Passes] Move ADCE before DSE & LICM.
The adjustment seems to have very little impact on optimizations.
The only binary change with -O3 MultiSource/SPEC2000/SPEC2006 on X86 is
in consumer-typeset and the size there actually decreases by -0.1%, with
not significant changes in the stats.

On its own, it is mildly positive in terms of compile-time, most likely
due to LICM & DSE having to process slightly less instructions. It
should also be unlikely that DSE/LICM make much new code dead.

http://llvm-compile-time-tracker.com/compare.php?from=df63eedef64d715ce1f31843f7de9c11fe1e597f&to=e3bdfcf94a9eeae6e006d010464f0c1b3550577d&stat=instructions

With DSE & MemorySSA, it gives some nice compile-time improvements, due
to the fact that DSE can re-use the PDT from ADCE, if it does not make
any changes:

http://llvm-compile-time-tracker.com/compare.php?from=15fdd6cd7c24c745df1bb419e72ff66fd138aa7e&to=481f494515fc89cb7caea8d862e40f2c910dc994&stat=instructions

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D87322
2020-10-21 10:30:56 +01:00
Ta-Wei Tu
3a711be325 [NPM] port -unify-loop-exits to NPM
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89774
2020-10-20 10:46:57 -07:00
Ta-Wei Tu
516d548dc2 [NPM] Port -mergereturn to NPM
Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D89781
2020-10-20 10:33:58 -07:00
Amy Huang
4a9fcdb008 [NPM] Port module-debuginfo pass to the new pass manager
Port pass to NPM and update tests in DebugInfo/Generic.

Differential Revision: https://reviews.llvm.org/D89730
2020-10-19 14:31:17 -07:00
Hans Wennborg
f6eeeaf04e Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting"
This broke Chromium's PGO build, it seems because hot-cold-splitting got turned
on unintentionally. See comment on the code review for repro etc.

> This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
> the splitting pass to be toggled on/off. The current method of passing
> `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
> correctly (say, with `-O0` or `-Oz`).
>
> To implement the -fsplit-cold-code option, an attribute is applied to
> functions to indicate that they may be considered for splitting. This
> removes some complexity from the old/new PM pipeline builders, and
> behaves as expected when LTO is enabled.
>
> Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
> Differential Revision: https://reviews.llvm.org/D57265
> Reviewed By: Aditya Kumar, Vedant Kumar
> Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar

This reverts commit 273c299d5d649a0222fbde03c9a41e41913751b4.
2020-10-19 12:31:14 +02:00
Vedant Kumar
cce078ae12 [PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).

To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.

Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
2020-10-15 23:13:33 +00:00
Arthur Eubanks
cf44f794e4 [LoopExtract][NewPM] Port -loop-extract to NPM
-loop-extract-single is just -loop-extract on one loop.

-loop-extract depended on -break-crit-edges and -loop-simplify in the
legacy PM, but the NPM doesn't allow specifying pass dependencies like
that, so manually add those passes to the RUN lines where necessary.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89016
2020-10-13 22:55:42 -07:00
Arthur Eubanks
a70ba1098b [FixIrreducible][NewPM] Port -fix-irreducible to NPM
In the NPM, a pass cannot depend on another non-analysis pass. So pin
the test that tests that -lowerswitch is run automatically to legacy PM.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D89051
2020-10-09 09:22:09 -07:00
Arthur Eubanks
a2aedea790 [LoopInterchange][NewPM] Port -loop-interchange to NPM
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89058
2020-10-09 09:21:31 -07:00
Arthur Eubanks
8c7ac7cd6f [NewPM] Use PassInstrumentation for -verify-each
This removes "VerifyEachPass" parameters from a lot of functions which is nice.

Don't verify after special passes or VerifierPass.

This introduces verification on loop and cgscc passes, verifying the corresponding function/module.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88764
2020-10-07 19:24:25 -07:00
Reid Kleckner
4191643790 Port StripGCRelocates pass to NPM
Fixes one test under NPM

Differential Revision: https://reviews.llvm.org/D88766
2020-10-07 14:41:29 -07:00
Reid Kleckner
1ec8cebaf1 [NPM] Port strip nonlinetable debuginfo pass to the new pass manager
Fixes a few tests in llvm/test/Transforms/Utils.

Differential Revision: https://reviews.llvm.org/D88762
2020-10-07 14:35:36 -07:00
Arthur Eubanks
7b924b5661 [MetaRenamer][NewPM] Port metarenamer to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88690
2020-10-02 15:42:25 -07:00
Sjoerd Meijer
c208ece583 [LoopFlatten] Add a loop-flattening pass
This is a simple pass that flattens nested loops.  The intention is to optimise
loop nests like this, which together access an array linearly:

  for (int i = 0; i < N; ++i)
    for (int j = 0; j < M; ++j)
      f(A[i*M+j]);

into one loop:

  for (int i = 0; i < (N*M); ++i)
    f(A[i]);

It can also flatten loops where the induction variables are not used in the
loop. This can help with codesize and runtime, especially on simple cpus
without advanced branch prediction.

This is only worth flattening if the induction variables are only used in an
expression like i*M+j. If they had any other uses, we would have to insert a
div/mod to reconstruct the original values, so this wouldn't be profitable.

This partially fixes PR40581 as this pass triggers on one of the two cases. I
will follow up on this to learn LoopFlatten a few more (small) tricks. Please
note that LoopFlatten is not yet enabled by default.

Patch by Oliver Stannard, with minor tweaks from Dave Green and myself.

Differential Revision: https://reviews.llvm.org/D42365
2020-10-01 13:54:45 +01:00
Arthur Eubanks
88c0d65e22 [ObjCARCAA][NewPM] Add already ported objc-arc-aa to PassRegistry.def
Also add missing AnalysisKey definition.
2020-09-30 08:50:44 -07:00
Chuanqi Xu
5802e5931e [Coroutines] Reuse storage for local variables with non-overlapping lifetimes
bug 45566 shows the process of building coroutine frame won't consider
that the lifetimes of different local variables are not overlapped,
which means the compiler could generates smaller frame.

This patch calculate the lifetime range of each alloca by StackLifetime
class. Then the patch build non-overlapped sets for allocas whose
lifetime ranges are not overlapped. We use the largest type in a
non-overlapped set as the field type in the frame. In insertSpills
process, if we find the type of field is not the same with the alloca,
we cast the pointer to the field type to the pointer to the alloca type.
Since the lifetime range of alloca in one non-overlapped set is not
overlapped with each other, it should be ok to reuse the storage space
in the frame.

Test plan: check-llvm, check-clang, cppcoro, folly

Reviewers: junparser, lxfind, modocache

Differential Revision: https://reviews.llvm.org/D87596
2020-09-28 15:48:00 +08:00
Fangrui Song
cd74c48503 [NewPM] Port ConstraintElimination to the new pass manager
If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline.
(-O1 uses a separate function now.)

Reviewed By: fhahn, aeubanks

Differential Revision: https://reviews.llvm.org/D88365
2020-09-27 11:12:26 -07:00
Arthur Eubanks
f62987dedb [LoopReroll][NewPM] Port -loop-reroll to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87957
2020-09-25 12:09:06 -07:00