; The IR below was crafted so as to:
; 1) Have a loop, so we create a loop pass manager.
; 2) Be "immutable" in the sense that no pass in the standard
; pipeline will modify it.
; Since no transformations take place, we don't expect any analyses
; to be invalidated.
; Any invalidation that shows up here is a bug, unless we have started
; modifying the IR, in which case we need to make it more robustly immutable.
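; For illustration, IR of roughly the following shape satisfies both
; constraints (a sketch only; the names are placeholders, and the function
; actually used by this test is defined after the CHECK lines and may differ
; in its details):
;
;   declare void @bar() local_unnamed_addr
;
;   define void @foo(i32 %n) local_unnamed_addr {
;   entry:
;     br label %loop
;   loop:
;     %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
;     %iv.next = add i32 %iv, 1
;     tail call void @bar()
;     %cmp = icmp eq i32 %iv, %n
;     br i1 %cmp, label %exit, label %loop
;   exit:
;     ret void
;   }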
;
; Prelink pipelines:
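; ('thinlto-pre-link<Ox>' builds the ThinLTO pre-link pipeline, i.e. the module
; simplification run at compile time, before the ThinLTO link step.)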
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto-pre-link<O1>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto-pre-link<O2>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto-pre-link<O3>' -S -passes-ep-pipeline-start='no-op-module' %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS,CHECK-EP-PIPELINE-START
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto-pre-link<Os>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto-pre-link<Oz>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-O23SZ,CHECK-PRELINK-O,CHECK-PRELINK-O-NODIS
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager -new-pm-debug-info-for-profiling \
; RUN: -passes='thinlto-pre-link<O2>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-DIS,CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-PRELINK-O
;
; Postlink pipelines:
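; ('thinlto<Ox>' builds the ThinLTO post-link pipeline, i.e. the optimization
; run applied to each module in the ThinLTO backend after the link/import step.)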
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto<O1>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O1,CHECK-POSTLINK-O,%llvmcheckext
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto<O2>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-POSTLINK-O,%llvmcheckext,CHECK-POSTLINK-O2
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager -passes-ep-pipeline-start='no-op-module' \
; RUN: -passes='thinlto<O3>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O3,CHECK-O23SZ,CHECK-POSTLINK-O,%llvmcheckext,CHECK-POSTLINK-O3
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto<Os>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Os,CHECK-O23SZ,CHECK-POSTLINK-O,%llvmcheckext,CHECK-POSTLINK-Os
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
; RUN: -passes='thinlto<Oz>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-Oz,CHECK-O23SZ,CHECK-POSTLINK-O,%llvmcheckext
; RUN: opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager -new-pm-debug-info-for-profiling \
; RUN: -passes='thinlto<O2>' -S %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=CHECK-O,CHECK-O2,CHECK-O23SZ,CHECK-POSTLINK-O,%llvmcheckext,CHECK-POSTLINK-O2
; Give the otherwise-unused CHECK-NOEXT prefix a trivially matching directive
; to suppress FileCheck --allow-unused-prefixes=false diagnostics.
; CHECK-NOEXT: {{^}}
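; The CHECK lines below are a manifest of the expected pass and analysis
; execution order as reported by -debug-pass-manager.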
; CHECK-O: Running pass: Annotation2Metadata
; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass
; CHECK-EP-PIPELINE-START-NEXT: Running pass: NoOpModulePass
; CHECK-DIS-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-DIS-NEXT: Running pass: AddDiscriminatorsPass
; CHECK-POSTLINK-O-NEXT: Running pass: PGOIndirectCallPromotion
; CHECK-POSTLINK-O-NEXT: Running analysis: ProfileSummaryAnalysis
; CHECK-POSTLINK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-POSTLINK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
; CHECK-O-NEXT: Running pass: InferFunctionAttrsPass
; CHECK-PRELINK-O-NODIS-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis
; CHECK-O-NEXT: Running pass: LowerExpectIntrinsicPass
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running analysis: TargetIRAnalysis
; CHECK-O-NEXT: Running analysis: AssumptionAnalysis
; CHECK-O-NEXT: Running pass: SROA
; CHECK-O-NEXT: Running analysis: DominatorTreeAnalysis
; CHECK-O-NEXT: Running pass: EarlyCSEPass
; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis
; CHECK-O-NEXT: Running pass: CoroEarlyPass
; CHECK-O3-NEXT: Running pass: CallSiteSplittingPass
; CHECK-O-NEXT: Running pass: OpenMPOptPass
; CHECK-POSTLINK-O-NEXT: Running pass: LowerTypeTestsPass
; CHECK-O-NEXT: Running pass: IPSCCPPass
; CHECK-O-NEXT: Running pass: CalledValuePropagationPass
; CHECK-O-NEXT: Running pass: GlobalOptPass
; CHECK-O-NEXT: Running pass: PromotePass
; CHECK-O-NEXT: Running pass: DeadArgumentEliminationPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-PRELINK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
; CHECK-O-NEXT: Running analysis: AAManager
; CHECK-O-NEXT: Running analysis: BasicAA
; CHECK-O-NEXT: Running analysis: ScopedNoAliasAA
; CHECK-O-NEXT: Running analysis: TypeBasedAA
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass
; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
; CHECK-O-NEXT: Running analysis: GlobalsAA
; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
; CHECK-O-NEXT: Running pass: InvalidateAnalysisPass<{{.*}}AAManager
; CHECK-O-NEXT: Invalidating analysis: AAManager
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
; CHECK-PRELINK-O-NEXT: Running analysis: ProfileSummaryAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
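; The CGSCC phase of the simplification pipeline: the inliner and the
; function-attribute inference below run on each SCC under the
; devirtualization repeater above, ahead of the per-function passes.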
; CHECK-O-NEXT: Running pass: InlinerPass
; CHECK-O-NEXT: Running pass: InlinerPass
; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
; CHECK-O-NEXT: Running analysis: AAManager
; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
; CHECK-O2-NEXT: Running pass: OpenMPOptCGSCCPass on (foo)
; CHECK-O3-NEXT: Running pass: OpenMPOptCGSCCPass on (foo)
; CHECK-O-NEXT: Running pass: SROA
; CHECK-O-NEXT: Running pass: EarlyCSEPass
; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
; CHECK-O23SZ-NEXT: Running pass: SpeculativeExecutionPass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O3-NEXT: Running pass: AggressiveInstCombinePass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O1-NEXT: Running pass: LibCallsShrinkWrapPass
; CHECK-O2-NEXT: Running pass: LibCallsShrinkWrapPass
; CHECK-O3-NEXT: Running pass: LibCallsShrinkWrapPass
; CHECK-O23SZ-NEXT: Running pass: TailCallElimPass
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: ReassociatePass
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
; CHECK-O-NEXT: Running pass: LoopSimplifyPass
; CHECK-O-NEXT: Running analysis: LoopAnalysis
; CHECK-O-NEXT: Running pass: LCSSAPass
; CHECK-O-NEXT: Running analysis: ScalarEvolutionAnalysis
; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: LoopInstSimplifyPass
; CHECK-O-NEXT: Running pass: LoopSimplifyCFGPass
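; Loop canonicalization: LICM runs both before and after LoopRotate, so that
; loop-invariant loads are hoisted without LoopRotate first introducing PHI
; nodes for them (see https://reviews.llvm.org/D99249).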
; CHECK-O-NEXT: Running pass: LICM
; CHECK-O-NEXT: Running pass: LoopRotatePass
; CHECK-O-NEXT: Running pass: LICM
; CHECK-O-NEXT: Running pass: SimpleLoopUnswitchPass
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: LoopSimplifyPass
; CHECK-O-NEXT: Running pass: LCSSAPass
; CHECK-O-NEXT: Running pass: LoopIdiomRecognizePass
; CHECK-O-NEXT: Running pass: IndVarSimplifyPass
; CHECK-O-NEXT: Running pass: LoopDeletionPass
; CHECK-O-NEXT: Running pass: LoopFullUnrollPass
; CHECK-O-NEXT: Running pass: SROA on foo
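; Redundancy elimination is gated on the optimization level: O2, O3, Os and
; Oz run MergedLoadStoreMotion followed by GVN, while O1 only runs MemCpyOpt.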
; CHECK-Os-NEXT: Running pass: MergedLoadStoreMotionPass
; CHECK-Os-NEXT: Running pass: GVN
; CHECK-Os-NEXT: Running analysis: MemoryDependenceAnalysis
; CHECK-Os-NEXT: Running analysis: PhiValuesAnalysis
; CHECK-Oz-NEXT: Running pass: MergedLoadStoreMotionPass
; CHECK-Oz-NEXT: Running pass: GVN
; CHECK-Oz-NEXT: Running analysis: MemoryDependenceAnalysis
; CHECK-Oz-NEXT: Running analysis: PhiValuesAnalysis
; CHECK-O2-NEXT: Running pass: MergedLoadStoreMotionPass
; CHECK-O2-NEXT: Running pass: GVN
; CHECK-O2-NEXT: Running analysis: MemoryDependenceAnalysis
; CHECK-O2-NEXT: Running analysis: PhiValuesAnalysis
; CHECK-O3-NEXT: Running pass: MergedLoadStoreMotionPass
; CHECK-O3-NEXT: Running pass: GVN
; CHECK-O3-NEXT: Running analysis: MemoryDependenceAnalysis
; CHECK-O3-NEXT: Running analysis: PhiValuesAnalysis
; CHECK-O1-NEXT: Running pass: MemCpyOptPass
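; Late scalar cleanup: SCCP, BDCE, InstCombine and ADCE run at every level;
; O2 and above (including Os/Oz) additionally repeat JumpThreading and
; CorrelatedValuePropagation and run MemCpyOpt, DSE and a final LICM over the
; remaining loop; coroutine elision runs at every level.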
; CHECK-O-NEXT: Running pass: SCCPPass
; CHECK-O-NEXT: Running pass: BDCEPass
; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O23SZ-NEXT: Running pass: JumpThreadingPass
; CHECK-O23SZ-NEXT: Running analysis: LazyValueAnalysis
; CHECK-O23SZ-NEXT: Running pass: CorrelatedValuePropagationPass
; CHECK-O23SZ-NEXT: Invalidating analysis: LazyValueAnalysis
; CHECK-O1-NEXT: Running pass: CoroElidePass
; CHECK-O-NEXT: Running pass: ADCEPass
; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis
; CHECK-O23SZ-NEXT: Running pass: MemCpyOptPass
; CHECK-O23SZ-NEXT: Running pass: DSEPass
; CHECK-O23SZ-NEXT: Running pass: LoopSimplifyPass
; CHECK-O23SZ-NEXT: Running pass: LCSSAPass
; CHECK-O23SZ-NEXT: Running pass: LICMPass on Loop at depth 1 containing: %loop
; CHECK-O23SZ-NEXT: Running pass: CoroElidePass
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: CoroSplitPass
; CHECK-PRELINK-O-NEXT: Running pass: GlobalOptPass
; CHECK-POSTLINK-O-NEXT: Running pass: GlobalOptPass
; CHECK-POSTLINK-O-NEXT: Running pass: GlobalDCEPass
; CHECK-POSTLINK-O-NEXT: Running pass: EliminateAvailableExternallyPass
; CHECK-POSTLINK-O-NEXT: Running pass: ReversePostOrderFunctionAttrsPass
; CHECK-POSTLINK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
; CHECK-POSTLINK-O-NEXT: Running pass: Float2IntPass
; CHECK-POSTLINK-O-NEXT: Running pass: LowerConstantIntrinsicsPass
; CHECK-EXT: Running pass: {{.*}}::Bye
; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass
; CHECK-POSTLINK-O-NEXT: Running pass: LCSSAPass
; CHECK-POSTLINK-O-NEXT: Running pass: LoopRotatePass
; CHECK-POSTLINK-O-NEXT: Running pass: LoopDistributePass
; CHECK-POSTLINK-O-NEXT: Running pass: InjectTLIMappings
; CHECK-POSTLINK-O-NEXT: Running pass: LoopVectorizePass
; CHECK-POSTLINK-O-NEXT: Running analysis: BlockFrequencyAnalysis
; CHECK-POSTLINK-O-NEXT: Running analysis: BranchProbabilityAnalysis
; CHECK-POSTLINK-O-NEXT: Running pass: LoopLoadEliminationPass
; CHECK-POSTLINK-O-NEXT: Running analysis: LoopAccessAnalysis
; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-POSTLINK-O2-NEXT: Running pass: SLPVectorizerPass
; CHECK-POSTLINK-O3-NEXT: Running pass: SLPVectorizerPass
; CHECK-POSTLINK-Os-NEXT: Running pass: SLPVectorizerPass
; CHECK-POSTLINK-O-NEXT: Running pass: VectorCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: LoopUnrollPass
; CHECK-POSTLINK-O-NEXT: Running pass: WarnMissedTransformationsPass
; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
; CHECK-POSTLINK-O-NEXT: Running pass: LoopSimplifyPass
; CHECK-POSTLINK-O-NEXT: Running pass: LCSSAPass
; CHECK-POSTLINK-O-NEXT: Running pass: LICMPass
; CHECK-POSTLINK-O-NEXT: Running pass: AlignmentFromAssumptionsPass
; CHECK-POSTLINK-O-NEXT: Running pass: LoopSinkPass
; CHECK-POSTLINK-O-NEXT: Running pass: InstSimplifyPass
; CHECK-POSTLINK-O-NEXT: Running pass: DivRemPairsPass
; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: CoroCleanupPass
; CHECK-POSTLINK-O-NEXT: Running pass: CGProfilePass
; CHECK-POSTLINK-O-NEXT: Running pass: GlobalDCEPass
; CHECK-POSTLINK-O-NEXT: Running pass: ConstantMergePass
; CHECK-POSTLINK-O-NEXT: Running pass: RelLookupTableConverterPass
; CHECK-POSTLINK-O-NEXT: Running analysis: TargetIRAnalysis
; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
; CHECK-PRELINK-O-NEXT: Running pass: CanonicalizeAliasesPass
; CHECK-PRELINK-O-NEXT: Running pass: NameAnonGlobalPass
; CHECK-O-NEXT: Running pass: PrintModulePass
; Make sure we get the IR back out without changes when we print the module.
; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {
; CHECK-O-NEXT: entry:
; CHECK-O-NEXT: br label %loop
; CHECK-O: loop:
; CHECK-O-NEXT: %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
; CHECK-O-NEXT: %iv.next = add i32 %iv, 1
; CHECK-O-NEXT: tail call void @bar()
; CHECK-O-NEXT: %cmp = icmp eq i32 %iv, %n
; CHECK-O-NEXT: br i1 %cmp, label %exit, label %loop
; CHECK-O: exit:
; CHECK-O-NEXT: ret void
; CHECK-O-NEXT: }
;
declare void @bar() local_unnamed_addr
define void @foo(i32 %n) local_unnamed_addr {
entry:
br label %loop
loop:
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
%iv.next = add i32 %iv, 1
tail call void @bar()
%cmp = icmp eq i32 %iv, %n
br i1 %cmp, label %exit, label %loop
exit:
ret void
}
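;
; For hand inspection outside of lit, a command mirroring the RUN lines at the
; top of this file should reproduce the post-link pass listing checked above.
; This is a sketch, assuming an opt built from this tree and that the post-link
; RUN lines use the 'thinlto<O2>' pipeline name:
;
;   opt -disable-verify -verify-cfg-preserved=0 -debug-pass-manager \
;       -passes='thinlto<O2>' -S new-pm-thinlto-defaults.ll 2>&1
;
; The pass listing is printed on stderr, hence the 2>&1 redirection before
; piping into FileCheck or a pager.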