Allow disabling of vectorization using internal options

Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and -interleave-loops do not have much practical effect. This is because they are used to initialize the corresponding flags in the pass managers, and those flags are then unconditionally overwritten when compiling via clang or via LTO from the linkers. The only exception was -vectorize-loops via opt, because of some special hackery there.

While vectorization could still be disabled when compiling via clang, using -fno-[slp-]vectorize, this meant that there was no way to disable it when compiling in LTO mode via the linkers. This only affected ThinLTO, since for regular LTO vectorization is done during the compile step for scalability reasons. For ThinLTO it is invoked in the LTO backends. See also the discussion on PR45434.

This patch makes it so the internal options can actually be used to disable these optimizations. Ultimately, the best long-term solution is to mark the loops with metadata (similar to the approach used to fix -fno-unroll-loops in D77058), but this enables a shorter-term workaround, and actually makes these internal options useful.

I constant propagated the initial values of these internal flags into the pass manager flags (for some reason vectorize-loops and interleave-loops were initialized to true, while vectorize-slp was initialized to false). As mentioned above, they are overwritten unconditionally, so this doesn't have any real impact, and these initial values aren't particularly meaningful.

I then changed the passes to check the internal values and return without performing the associated optimization when false (I changed the default of -vectorize-slp to true so the options behave similarly). I was able to remove the hackery in opt used to get -vectorize-loops=false to work, as well as a special option there used to disable SLP vectorization.

Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is SLP vectorized when it is enabled, and check that instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.

The loop vectorization (and associated interleaving) testing I moved to a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.

Reviewers: fhahn, wmi

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77989
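Illustration (not part of the patch): the gating described above can be sketched as a small standalone model. This is a simplified sketch, not LLVM code. The plain globals stand in for the hidden cl::opt<bool> flags, and LoopVectorizeModel, mayVectorize, mayInterleave and runSLPModel are hypothetical names introduced only for this example. It assumes, per the header comments in LoopVectorize.h, that an "only when forced" setting still lets explicitly requested loops through.

```cpp
// Stand-ins for the hidden command-line flags (assumption: plain globals
// instead of cl::opt<bool>, so the sketch compiles without LLVM).
static bool EnableLoopVectorization = true; // models -vectorize-loops
static bool EnableLoopInterleaving = true;  // models -interleave-loops
static bool RunSLPVectorization = true;     // models -vectorize-slp

// Mirrors the two fields of LoopVectorizeOptions used by the patch.
struct LoopVectorizeOptions {
  bool InterleaveOnlyWhenForced = false;
  bool VectorizeOnlyWhenForced = false;
};

// Hypothetical model of the pass: the constructor folds the internal flags
// into the per-pass options, as the new LoopVectorizePass constructor does.
class LoopVectorizeModel {
  bool InterleaveOnlyWhenForced;
  bool VectorizeOnlyWhenForced;

public:
  explicit LoopVectorizeModel(LoopVectorizeOptions Opts = {})
      : InterleaveOnlyWhenForced(Opts.InterleaveOnlyWhenForced ||
                                 !EnableLoopInterleaving),
        VectorizeOnlyWhenForced(Opts.VectorizeOnlyWhenForced ||
                                !EnableLoopVectorization) {}

  // A loop is considered if it explicitly requests the transform, or if the
  // pass is not restricted to forced loops.
  bool mayVectorize(bool LoopIsForced) const {
    return LoopIsForced || !VectorizeOnlyWhenForced;
  }
  bool mayInterleave(bool LoopIsForced) const {
    return LoopIsForced || !InterleaveOnlyWhenForced;
  }
};

// Hypothetical model of the SLP early exit: returns false (no change made)
// when the flag is off, before doing any work.
bool runSLPModel() {
  if (!RunSLPVectorization)
    return false;
  return true; // placeholder for "the pass ran"
}
```

In this model, setting EnableLoopVectorization to false before constructing the pass leaves only explicitly forced loops eligible, which corresponds to passing -vectorize-loops=false. The pass managers can overwrite their own flags as before; the check now lives in the pass itself.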
2024-11-26 04:32:44 +01:00 · 2020-04-12 19:12:38 -07:00 · 2020-04-12 19:12:38 -07:00 · ca9019a39c
commit ca9019a39c
parent ca0ed79c22
9 changed files with 29 additions and 40 deletions
--- a/include/llvm/Passes/PassBuilder.h
+++ b/include/llvm/Passes/PassBuilder.h
@@ -73,16 +73,15 @@ public:
  /// can be set in the PassBuilder when using a LLVM as a library.
  PipelineTuningOptions();

-  /// Tuning option to set loop interleaving on/off. Its default value is that
-  /// of the flag: `-interleave-loops`.
+  /// Tuning option to set loop interleaving on/off, set based on opt level.
  bool LoopInterleaving;

-  /// Tuning option to enable/disable loop vectorization. Its default value is
-  /// that of the flag: `-vectorize-loops`.
+  /// Tuning option to enable/disable loop vectorization, set based on opt
+  /// level.
  bool LoopVectorization;

-  /// Tuning option to enable/disable slp loop vectorization. Its default value
-  /// is that of the flag: `vectorize-slp`.
+  /// Tuning option to enable/disable slp loop vectorization, set based on opt
+  /// level.
  bool SLPVectorization;

  /// Tuning option to enable/disable loop unrolling. Its default value is true.
--- a/include/llvm/Transforms/Vectorize/LoopVectorize.h
+++ b/include/llvm/Transforms/Vectorize/LoopVectorize.h
@@ -118,6 +118,7 @@ struct LoopVectorizeOptions {

 /// The LoopVectorize Pass.
 struct LoopVectorizePass : public PassInfoMixin<LoopVectorizePass> {
+private:
  /// If false, consider all loops for interleaving.
  /// If true, only loops that explicitly request interleaving are considered.
  bool InterleaveOnlyWhenForced;
@@ -126,9 +127,8 @@ struct LoopVectorizePass : public PassInfoMixin<LoopVectorizePass> {
  /// If true, only loops that explicitly request vectorization are considered.
  bool VectorizeOnlyWhenForced;

-  LoopVectorizePass(LoopVectorizeOptions Opts = {})
-      : InterleaveOnlyWhenForced(Opts.InterleaveOnlyWhenForced),
-        VectorizeOnlyWhenForced(Opts.VectorizeOnlyWhenForced) {}
+public:
+  LoopVectorizePass(LoopVectorizeOptions Opts = {});

  ScalarEvolution *SE;
  LoopInfo *LI;
--- a/include/llvm/Transforms/Vectorize/SLPVectorizer.h
+++ b/include/llvm/Transforms/Vectorize/SLPVectorizer.h
@@ -55,8 +55,6 @@ class BoUpSLP;

 } // end namespace slpvectorizer

-extern cl::opt<bool> RunSLPVectorization;
-
 struct SLPVectorizerPass : public PassInfoMixin<SLPVectorizerPass> {
  using StoreList = SmallVector<StoreInst *, 8>;
  using StoreListMap = MapVector<Value *, StoreList>;
--- a/lib/Passes/PassBuilder.cpp
+++ b/lib/Passes/PassBuilder.cpp
@@ -242,9 +242,9 @@ static cl::opt<bool> EnableCallGraphProfile(
    cl::desc("Enable call graph profile pass for the new PM (default = on)"));

 PipelineTuningOptions::PipelineTuningOptions() {
-  LoopInterleaving = EnableLoopInterleaving;
-  LoopVectorization = EnableLoopVectorization;
-  SLPVectorization = RunSLPVectorization;
+  LoopInterleaving = true;
+  LoopVectorization = true;
+  SLPVectorization = false;
  LoopUnrolling = true;
  ForgetAllSCEVInLoopUnroll = ForgetSCEVInLoopUnroll;
  Coroutines = false;
--- a/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ b/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -164,9 +164,9 @@ PassManagerBuilder::PassManagerBuilder() {
    LibraryInfo = nullptr;
    Inliner = nullptr;
    DisableUnrollLoops = false;
-    SLPVectorize = RunSLPVectorization;
-    LoopVectorize = EnableLoopVectorization;
-    LoopsInterleaved = EnableLoopInterleaving;
+    SLPVectorize = false;
+    LoopVectorize = true;
+    LoopsInterleaved = true;
    RerollLoops = RunLoopRerolling;
    NewGVN = RunNewGVN;
    LicmMssaOptCap = SetLicmMssaOptCap;
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1598,9 +1598,8 @@ struct LoopVectorize : public FunctionPass {

  explicit LoopVectorize(bool InterleaveOnlyWhenForced = false,
                         bool VectorizeOnlyWhenForced = false)
-      : FunctionPass(ID) {
-    Impl.InterleaveOnlyWhenForced = InterleaveOnlyWhenForced;
-    Impl.VectorizeOnlyWhenForced = VectorizeOnlyWhenForced;
+      : FunctionPass(ID),
+        Impl({InterleaveOnlyWhenForced, VectorizeOnlyWhenForced}) {
    initializeLoopVectorizePass(*PassRegistry::getPassRegistry());
  }

@@ -7672,6 +7671,12 @@ static bool processLoopInVPlanNativePath(
  return true;
 }

+LoopVectorizePass::LoopVectorizePass(LoopVectorizeOptions Opts)
+    : InterleaveOnlyWhenForced(Opts.InterleaveOnlyWhenForced ||
+                               !EnableLoopInterleaving),
+      VectorizeOnlyWhenForced(Opts.VectorizeOnlyWhenForced ||
+                              !EnableLoopVectorization) {}
+
 bool LoopVectorizePass::processLoop(Loop *L) {
  assert((EnableVPlanNativePath || L->empty()) &&
         "VPlan-native path is not enabled. Only process inner loops.");
--- a/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -108,8 +108,7 @@ using namespace slpvectorizer;

 STATISTIC(NumVectorInstructions, "Number of vector instructions generated");

-cl::opt<bool>
-    llvm::RunSLPVectorization("vectorize-slp", cl::init(false), cl::Hidden,
+cl::opt<bool> RunSLPVectorization("vectorize-slp", cl::init(true), cl::Hidden,
                                  cl::desc("Run the SLP vectorization passes"));

 static cl::opt<int>
@@ -5647,6 +5646,8 @@ bool SLPVectorizerPass::runImpl(Function &F, ScalarEvolution *SE_,
                                LoopInfo *LI_, DominatorTree *DT_,
                                AssumptionCache *AC_, DemandedBits *DB_,
                                OptimizationRemarkEmitter *ORE_) {
+  if (!RunSLPVectorization)
+    return false;
  SE = SE_;
  TTI = TTI_;
  TLI = TLI_;
--- a/test/Transforms/SLPVectorizer/X86/opt.ll
+++ b/test/Transforms/SLPVectorizer/X86/opt.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -O3 -S -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s --check-prefix=SLP
-; RUN: opt < %s -O3 -disable-slp-vectorization -S -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s --check-prefix=NOSLP
+; RUN: opt < %s -O3 -vectorize-slp=false -S -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s --check-prefix=NOSLP

 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx10.8.0"
--- a/tools/opt/opt.cpp
+++ b/tools/opt/opt.cpp
@@ -180,11 +180,6 @@ DisableLoopUnrolling("disable-loop-unrolling",
                     cl::desc("Disable loop unrolling in all relevant passes"),
                     cl::init(false));

-static cl::opt<bool>
-DisableSLPVectorization("disable-slp-vectorization",
-                        cl::desc("Disable the slp vectorization pass"),
-                        cl::init(false));
-
 static cl::opt<bool> EmitSummaryIndex("module-summary",
                                      cl::desc("Emit module summary index"),
                                      cl::init(false));
@@ -406,18 +401,9 @@ static void AddOptimizationPasses(legacy::PassManagerBase &MPM,
  Builder.DisableUnrollLoops = (DisableLoopUnrolling.getNumOccurrences() > 0) ?
                               DisableLoopUnrolling : OptLevel == 0;

-  // Check if vectorization is explicitly disabled via -vectorize-loops=false.
-  // The flag enables vectorization in the LoopVectorize pass, it is on by
-  // default, and if it was disabled, leave it disabled here.
-  // Another flag that exists: -loop-vectorize, controls adding the pass to the
-  // pass manager. If set, the pass is added, and there is no additional check
-  // here for it.
-  if (Builder.LoopVectorize)
  Builder.LoopVectorize = OptLevel > 1 && SizeLevel < 2;

-  // When #pragma vectorize is on for SLP, do the same as above
-  Builder.SLPVectorize =
-      DisableSLPVectorization ? false : OptLevel > 1 && SizeLevel < 2;
+  Builder.SLPVectorize = OptLevel > 1 && SizeLevel < 2;

  if (TM)
    TM->adjustPassManager(Builder);