llvm-mirror/lib/Transforms/Scalar/CMakeLists.txt

add_llvm_library(LLVMScalarOpts
  ADCE.cpp
  ConstantHoisting.cpp
  ConstantProp.cpp
  CorrelatedValuePropagation.cpp
  DCE.cpp
  DeadStoreElimination.cpp
  Scalarizer.cpp
  EarlyCSE.cpp
  GlobalMerge.cpp
  GVN.cpp
  IndVarSimplify.cpp
  JumpThreading.cpp
  LICM.cpp
  LoopDeletion.cpp
  LoopIdiomRecognize.cpp
  LoopInstSimplify.cpp
  LoopRotation.cpp
  LoopStrengthReduce.cpp
  LoopRerollPass.cpp
  LoopUnrollPass.cpp
  LoopUnswitch.cpp
  LowerAtomic.cpp
  MemCpyOptimizer.cpp
  PartiallyInlineLibCalls.cpp
  Reassociate.cpp
  Reg2Mem.cpp
  SampleProfile.cpp
  SCCP.cpp
  SROA.cpp
  Scalar.cpp
  ScalarReplAggregates.cpp
  SimplifyCFGPass.cpp
  FlattenCFGPass.cpp
  Sink.cpp
  StructurizeCFG.cpp
  TailRecursionElimination.cpp
  )

add_dependencies(LLVMScalarOpts intrinsics_gen)
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`add_llvm_library(LLVMScalarOpts`
			`ADCE.cpp`
Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062 2014-01-25 03:02:55 +01:00			`ConstantHoisting.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`ConstantProp.cpp`
Rename file to something more descriptive. llvm-svn: 112590 2010-08-31 09:41:39 +02:00			`CorrelatedValuePropagation.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`DCE.cpp`
			`DeadStoreElimination.cpp`
Add a Scalarizer pass. llvm-svn: 195471 2013-11-22 17:58:05 +01:00			`Scalarizer.cpp`
CMake: Add missing source file. llvm-svn: 122724 2011-01-03 03:13:05 +01:00			`EarlyCSE.cpp`
Fix CMake build. llvm-svn: 142204 2011-10-17 19:50:39 +02:00			`GlobalMerge.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`GVN.cpp`
			`IndVarSimplify.cpp`
			`JumpThreading.cpp`
			`LICM.cpp`
			`LoopDeletion.cpp`
Start of a pass for recognizing memset and memcpy idioms. No functionality yet. llvm-svn: 122562 2010-12-26 20:32:44 +01:00			`LoopIdiomRecognize.cpp`
Add a new loop-instsimplify pass, with the intention of replacing the instance of instcombine that is currently in the middle of the loop pass pipeline. This commit only checks in the pass; it will hopefully be enabled by default later. llvm-svn: 122719 2011-01-03 01:25:16 +01:00			`LoopInstSimplify.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`LoopRotation.cpp`
			`LoopStrengthReduce.cpp`
Add a loop rerolling pass This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this: for (int i = 0; i < 3200; i += 5) { a[i] += alpha * b[i]; a[i + 1] += alpha * b[i + 1]; a[i + 2] += alpha * b[i + 2]; a[i + 3] += alpha * b[i + 3]; a[i + 4] += alpha * b[i + 4]; } and turn them into this: for (int i = 0; i < 3200; ++i) { a[i] += alpha * b[i]; } and loops like this: for (int i = 0; i < 500; ++i) { x[3*i] = foo(0); x[3*i+1] = foo(0); x[3*i+2] = foo(0); } and turn them into this: for (int i = 0; i < 1500; ++i) { x[i] = foo(0); } There are two motivations for this transformation: 1. Code-size reduction (especially relevant, obviously, when compiling for code size). 2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target. The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases. This pass is not currently enabled by default at any optimization level. llvm-svn: 194939 2013-11-17 00:59:05 +01:00			`LoopRerollPass.cpp`
Update CMakeLists for recent renames. llvm-svn: 85660 2009-10-31 15:38:25 +01:00			`LoopUnrollPass.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`LoopUnswitch.cpp`
Add an atomic lowering pass llvm-svn: 110113 2010-08-03 18:19:16 +02:00			`LowerAtomic.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`MemCpyOptimizer.cpp`
Turn MipsOptimizeMathLibCalls into a target-independent scalar transform ...so that it can be used for z too. Most of the code is the same. The only real change is to use TargetTransformInfo to test when a sqrt instruction is available. The pass is opt-in because at the moment it only handles sqrt. llvm-svn: 189097 2013-08-23 12:27:02 +02:00			`PartiallyInlineLibCalls.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`Reassociate.cpp`
			`Reg2Mem.cpp`
SampleProfileLoader pass. Initial setup. This adds a new scalar pass that reads a file with samples generated by 'perf' during runtime. The samples read from the profile are incorporated and emitted as IR metadata reflecting that profile. The profile file is assumed to have been generated by an external profile source. The profile information is converted into IR metadata, which is later used by the analysis routines to estimate block frequencies, edge weights and other related data. External profile information files have no fixed format, each profiler is free to define its own. This includes both the on-disk representation of the profile and the kind of profile information stored in the file. A common kind of profile is based on sampling (e.g., perf), which essentially counts how many times each line of the program has been executed during the run. The SampleProfileLoader pass is organized as a scalar transformation. On startup, it reads the file given in -sample-profile-file to determine what kind of profile it contains. This file is assumed to contain profile information for the whole application. The profile data in the file is read and incorporated into the internal state of the corresponding profiler. To facilitate testing, I've organized the profilers to support two file formats: text and native. The native format is whatever on-disk representation the profiler wants to support, I think this will mostly be bitcode files, but it could be anything the profiler wants to support. To do this, every profiler must implement the SampleProfile::loadNative() function. The text format is mostly meant for debugging. Records are separated by newlines, but each profiler is free to interpret records as it sees fit. Profilers must implement the SampleProfile::loadText() function. Finally, the pass will call SampleProfile::emitAnnotations() for each function in the current translation unit. 
This function needs to translate the loaded profile into IR metadata, which the analyzer will later be able to use. This patch implements the first steps towards the above design. I've implemented a sample-based flat profiler. The format of the profile is fairly simplistic. Each sampled function contains a list of relative line locations (from the start of the function) together with a count representing how many samples were collected at that line during execution. I generate this profile using perf and a separate converter tool. Currently, I have only implemented a text format for these profiles. I am interested in initial feedback to the whole approach before I send the other parts of the implementation for review. This patch implements: - The SampleProfileLoader pass. - The base ExternalProfile class with the core interface. - A SampleProfile sub-class using the above interface. The profiler generates branch weight metadata on every branch instruction that matches the profiles. - A text loader class to assist the implementation of SampleProfile::loadText(). - Basic unit tests for the pass. Additionally, the patch uses profile information to compute branch weights based on instruction samples. This patch converts instruction samples into branch weights. It does a fairly simplistic conversion: Given a multi-way branch instruction, it calculates the weight of each branch based on the maximum sample count gathered from each target basic block. Note that this assignment of branch weights is somewhat lossy and can be misleading. If a basic block has more than one incoming branch, all the incoming branches will get the same weight. In reality, it may be that only one of them is the most heavily taken branch. I will adjust this assignment in subsequent patches. llvm-svn: 194566 2013-11-13 13:22:21 +01:00			`SampleProfile.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`SCCP.cpp`
Introduce a new SROA implementation. This is essentially a ground up re-think of the SROA pass in LLVM. It was initially inspired by a few problems with the existing pass: - It is subject to the bane of my existence in optimizations: arbitrary thresholds. - It is overly conservative about which constructs can be split and promoted. - The vector value replacement aspect is separated from the splitting logic, missing many opportunities where splitting and vector value formation can work together. - The splitting is entirely based around the underlying type of the alloca, despite this type often having little to do with the reality of how that memory is used. This is especially prevalent with unions and base classes where we tail-pack derived members. - When splitting fails (often due to the thresholds), the vector value replacement (again because it is separate) can kick in for preposterous cases where we simply should have split the value. This results in forming i1024 and i2048 integer "bit vectors" that tremendously slow down subsequent IR optimizations (due to large APInts) and impede the backend's lowering. The new design takes an approach that fundamentally is not susceptible to many of these problems. It is the result of a discussion between myself and Duncan Sands over IRC about how to preemptively avoid these types of problems and how to do SROA in a more principled way. Since then, it has evolved and grown, but this remains an important aspect: it fixes real world problems with the SROA process today. First, the transform of SROA actually has little to do with replacement. It has more to do with splitting. The goal is to take an aggregate alloca and form a composition of scalar allocas which can replace it and will be most suitable to the eventual replacement by scalar SSA values. The actual replacement is performed by mem2reg (and in the future SSAUpdater). The splitting is divided into four phases. The first phase is an analysis of the uses of the alloca. 
This phase recursively walks uses, building up a dense datastructure representing the ranges of the alloca's memory actually used and checking for uses which inhibit any aspects of the transform such as the escape of a pointer. Once we have a mapping of the ranges of the alloca used by individual operations, we compute a partitioning of the used ranges. Some uses are inherently splittable (such as memcpy and memset), while scalar uses are not splittable. The goal is to build a partitioning that has the minimum number of splits while placing each unsplittable use in its own partition. Overlapping unsplittable uses belong to the same partition. This is the target split of the aggregate alloca, and it maximizes the number of scalar accesses which become accesses to their own alloca and candidates for promotion. Third, we re-walk the uses of the alloca and assign each specific memory access to all the partitions touched so that we have dense use-lists for each partition. Finally, we build a new, smaller alloca for each partition and rewrite each use of that partition to use the new alloca. During this phase the pass will also work very hard to transform uses of an alloca into a form suitable for promotion, including forming vector operations, speculating loads through PHI nodes and selects, etc. After splitting is complete, each newly refined alloca that is a candidate for promotion to a scalar SSA value is run through mem2reg. There are lots of reasonably detailed comments in the source code about the design and algorithms, and I'm going to be trying to improve them in subsequent commits to ensure this is well documented, as the new pass is in many ways more complex than the old one. Some of this is still a WIP, but the current state is reasonably stable. It has passed bootstrap, the nightly test suite, and Duncan has run it successfully through the ACATS and DragonEgg test suites. 
That said, it remains behind a default-off flag until the last few pieces are in place, and full testing can be done. Specific areas I'm looking at next: - Improved comments and some code cleanup from reviews. - SSAUpdater and enabling this pass inside the CGSCC pass manager. - Some datastructure tuning and compile-time measurements. - More aggressive FCA splitting and vector formation. Many thanks to Duncan Sands for the thorough final review, as well as Benjamin Kramer for lots of review during the process of writing this pass, and Daniel Berlin for reviewing the data structures and algorithms and general theory of the pass. Also, several other people on IRC, over lunch tables, etc for lots of feedback and advice. llvm-svn: 163883 2012-09-14 11:22:59 +02:00			`SROA.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`Scalar.cpp`
			`ScalarReplAggregates.cpp`
			`SimplifyCFGPass.cpp`
Factor FlattenCFG out from SimplifyCFG Patch by: Mei Ye llvm-svn: 187764 2013-08-06 04:43:45 +02:00			`FlattenCFGPass.cpp`
Update CMake build. llvm-svn: 103266 2010-05-07 19:13:20 +02:00			`Sink.cpp`
Move StructurizeCFG out of R600 to generic Transforms. Register it with PassManager llvm-svn: 184343 2013-06-19 22:18:24 +02:00			`StructurizeCFG.cpp`
Initial support for the CMake build system. llvm-svn: 56419 2008-09-22 03:08:49 +02:00			`TailRecursionElimination.cpp`
			`)`
llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. llvm-svn: 159112 2012-06-24 15:32:01 +02:00
			`add_dependencies(LLVMScalarOpts intrinsics_gen)`
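
The commit message for `LoopRerollPass.cpp` (r194939) describes rerolling a loop that was manually unrolled by 5 into a unit-stride loop. The following is a minimal sketch of that before/after pair as plain C++ source, intended only to show that the two forms compute the same result. It is not the pass itself, which works on LLVM IR. The function names, the `std::vector<double>` element type, and the `reroll_preserves_result` helper are assumptions made for illustration; only the loop bodies and the trip count of 3200 come from the commit message.

```cpp
#include <cassert>
#include <vector>

// Trip count taken from the example in the commit message.
constexpr int kN = 3200;

// "Before" form: manually unrolled by 5, stepping i by 5 each iteration.
// (Hypothetical function name; the element type is an assumption.)
void axpy_unrolled(std::vector<double>& a, const std::vector<double>& b,
                   double alpha) {
  for (int i = 0; i < kN; i += 5) {
    a[i] += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// "After" form: the rerolled loop, one element per iteration.
void axpy_rerolled(std::vector<double>& a, const std::vector<double>& b,
                   double alpha) {
  for (int i = 0; i < kN; ++i) {
    a[i] += alpha * b[i];
  }
}

// Hypothetical helper: runs both forms on identical inputs and compares
// the outputs element by element. The inputs are multiples of 0.25, so
// for small integer alpha every intermediate value is exactly
// representable and the comparison does not depend on rounding.
bool reroll_preserves_result(double alpha) {
  std::vector<double> b(kN);
  for (int i = 0; i < kN; ++i) {
    b[i] = 0.25 * i;
  }
  std::vector<double> a1(kN, 1.0);
  std::vector<double> a2(kN, 1.0);
  axpy_unrolled(a1, b, alpha);
  axpy_rerolled(a2, b, alpha);
  return a1 == a2;
}
```

Calling `reroll_preserves_result(2.0)` from a test driver and asserting that it returns true checks the equivalence for one input. The sketch relies on the trip count being a multiple of the unroll factor (3200 = 5 * 640), which holds for the loop quoted in the commit message.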