Updating MergeFunctions.rst
Improving readability, removing redundant contents.

Reviewers: hiraditya

Differential Revision: https://reviews.llvm.org/D50686

llvm-svn: 340131

parent a55eeea976
commit dc85840415
@ -10,84 +10,71 @@ Introduction
|
||||
Sometimes code contains equal functions, or functions that do exactly the same
thing even though they are non-equal on the IR level (e.g.: multiplication by 2
and 'shl 1'). This can happen for several reasons: mainly, the usage of
|
||||
templates and automatic code generators. Though, sometimes user itself could
|
||||
templates and automatic code generators. Sometimes, though, users themselves
write the same thing twice :-)
|
||||
|
||||
The main purpose of this pass is to recognize such functions and merge them.
|
||||
|
||||
Why would I want to read this document?
|
||||
---------------------------------------
|
||||
Document is the extension to pass comments and describes the pass logic. It
|
||||
describes algorithm that is used in order to compare functions, it also
|
||||
explains how we could combine equal functions correctly, keeping module valid.
|
||||
This document extends the pass comments and describes the pass logic. It
describes the algorithm that is used to compare functions and explains how
equal functions can be combined correctly while keeping the module valid.
|
||||
|
||||
Material is brought in top-down form, so reader could start learn pass from
|
||||
ideas and end up with low-level algorithm details, thus preparing him for
|
||||
reading the sources.
|
||||
The material is presented in a top-down form, so the reader can start learning
the pass from high-level ideas and end with low-level algorithm details, thus
preparing him or her for reading the sources.
|
||||
|
||||
So main goal is do describe algorithm and logic here; the concept. This document
|
||||
is good for you, if you *don't want* to read the source code, but want to
|
||||
understand pass algorithms. Author tried not to repeat the source-code and
|
||||
cover only common cases, and thus avoid cases when after minor code changes we
|
||||
need to update this document.
|
||||
The main goal is to describe the algorithm and the logic here, that is, the
concept. If you *don't want* to read the source code, but want to understand
the pass algorithms, this document is good for you. The author tries not to
repeat the source code and covers only common cases, so that this document
does not need to be updated after every minor code change.
|
||||
|
||||
|
||||
What should I know to be able to follow along with this document?
|
||||
-----------------------------------------------------------------
|
||||
|
||||
Reader should be familiar with common compile-engineering principles and LLVM
|
||||
code fundamentals. In this article we suppose reader is familiar with
|
||||
`Single Static Assingment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
|
||||
concepts. Understanding of
|
||||
`IR structure <http://llvm.org/docs/LangRef.html#high-level-structure>`_ is
|
||||
also important.
|
||||
The reader should be familiar with common compiler-engineering principles and
LLVM code fundamentals. In this article, we assume the reader is familiar with
the `Static Single Assignment
<http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
concept and has an understanding of
`IR structure <http://llvm.org/docs/LangRef.html#high-level-structure>`_.
|
||||
|
||||
We will use such terms as
|
||||
We will use terms such as
|
||||
"`module <http://llvm.org/docs/LangRef.html#high-level-structure>`_",
|
||||
"`function <http://llvm.org/docs/ProgrammersManual.html#the-function-class>`_",
|
||||
"`basic block <http://en.wikipedia.org/wiki/Basic_block>`_",
|
||||
"`user <http://llvm.org/docs/ProgrammersManual.html#the-user-class>`_",
|
||||
"`value <http://llvm.org/docs/ProgrammersManual.html#the-value-class>`_",
|
||||
"`instruction <http://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_".
|
||||
"`instruction
|
||||
<http://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_".
|
||||
|
||||
As a good start point, Kaleidoscope tutorial could be used:
|
||||
As a good starting point, the Kaleidoscope tutorial can be used:
|
||||
|
||||
:doc:`tutorial/index`
|
||||
|
||||
Especially it's important to understand chapter 3 of tutorial:
|
||||
It's especially important to understand chapter 3 of the tutorial:
|
||||
|
||||
:doc:`tutorial/LangImpl03`
|
||||
|
||||
Reader also should know how passes work in LLVM, they could use next article as
|
||||
a reference and start point here:
|
||||
The reader should also know how passes work in LLVM. The following article
can be used as a reference and starting point:
|
||||
|
||||
:doc:`WritingAnLLVMPass`
|
||||
|
||||
What else? Well perhaps reader also should have some experience in LLVM pass
|
||||
What else? Well, perhaps the reader should also have some experience in LLVM
pass debugging and bug-fixing.
|
||||
|
||||
What I gain by reading this document?
|
||||
-------------------------------------
|
||||
Main purpose is to provide reader with comfortable form of algorithms
|
||||
description, namely the human reading text. Since it could be hard to
|
||||
understand algorithm straight from the source code: pass uses some principles
|
||||
that have to be explained first.
|
||||
|
||||
Author wishes to everybody to avoid case, when you read code from top to bottom
|
||||
again and again, and yet you don't understand why we implemented it that way.
|
||||
|
||||
We hope that after this article reader could easily debug and improve
|
||||
MergeFunctions pass and thus help LLVM project.
|
||||
|
||||
Narrative structure
|
||||
-------------------
|
||||
Article consists of three parts. First part explains pass functionality on the
|
||||
top-level. Second part describes the comparison procedure itself. The third
|
||||
part describes the merging process.
|
||||
The article consists of three parts. The first part explains the pass
functionality at the top level. The second part describes the comparison
procedure itself. The third part describes the merging process.
|
||||
|
||||
In every part author also tried to put the contents into the top-down form.
|
||||
First, the top-level methods will be described, while the terminal ones will be
|
||||
at the end, in the tail of each part. If reader will see the reference to the
|
||||
In every part, the author tries to present the contents in top-down form.
The top-level methods are described first, followed by the terminal ones at
the end, in the tail of each part. If the reader sees a reference to a
method that hasn't been described yet, its description can be found a bit below.
|
||||
|
||||
Basics
|
||||
@ -95,46 +82,46 @@ Basics
|
||||
|
||||
How to do it?
|
||||
-------------
|
||||
Do we need to merge functions? Obvious thing is: yes that's a quite possible
|
||||
case, since usually we *do* have duplicates. And it would be good to get rid of
|
||||
them. But how to detect such a duplicates? The idea is next: we split functions
|
||||
onto small bricks (parts), then we compare "bricks" amount, and if it equal,
|
||||
compare "bricks" themselves, and then do our conclusions about functions
|
||||
Do we need to merge functions? The obvious answer is: yes, that is quite a
possible case. We usually *do* have duplicates and it would be good to get rid
of them. But how do we detect duplicates? This is the idea: we split functions
into smaller bricks or parts and compare the number of "bricks". If the numbers
are equal, we compare the "bricks" themselves, and then draw our conclusions
about the functions themselves.
|
||||
|
||||
What the difference it could be? For example, on machine with 64-bit pointers
|
||||
(let's assume we have only one address space), one function stores 64-bit
|
||||
integer, while another one stores a pointer. So if the target is a machine
|
||||
What could the difference be? For example, on a machine with 64-bit pointers
|
||||
(let's assume we have only one address space), one function stores a 64-bit
|
||||
integer, while another one stores a pointer. If the target is the machine
|
||||
mentioned above, and if functions are identical, except the parameter type (we
|
||||
could consider it as a part of function type), then we can treat ``uint64_t``
|
||||
and``void*`` as equal.
|
||||
could consider it as a part of function type), then we can treat a ``uint64_t``
|
||||
and a ``void*`` as equal.
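
The pointer/integer case above can be sketched with a toy type model. This is
only an illustration under stated assumptions: ``ToyType``, ``coerce`` and
``treatedAsEqual`` are hypothetical names that do not exist in LLVM, and the
model assumes that only pointers in address space 0 are coerced to integers.

.. code-block:: c++

  // Hypothetical, simplified stand-in for an IR type; not LLVM's Type class.
  struct ToyType {
    enum Kind { Integer, Pointer } kind;
    unsigned bits;          // integer width, or pointer size in bits
    unsigned addressSpace;  // only meaningful for pointers
  };

  // Treat a pointer in address space 0 as an integer of pointer width.
  inline ToyType coerce(ToyType T) {
    if (T.kind == ToyType::Pointer && T.addressSpace == 0)
      return ToyType{ToyType::Integer, T.bits, 0};
    return T;
  }

  // Two types are treated as equal if they match after coercion.
  inline bool treatedAsEqual(ToyType L, ToyType R) {
    L = coerce(L);
    R = coerce(R);
    return L.kind == R.kind && L.bits == R.bits &&
           L.addressSpace == R.addressSpace;
  }

With this model, a 64-bit integer and a 64-bit pointer in address space 0
compare as equal, while a 32-bit integer and a 64-bit pointer do not.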
|
||||
|
||||
It was just an example; possible details are described a bit below.
|
||||
This is just an example; more possible details are described a bit below.
|
||||
|
||||
As another example reader may imagine two more functions. First function
|
||||
performs multiplication on 2, while the second one performs arithmetic right
|
||||
shift on 1.
|
||||
As another example, the reader may imagine two more functions. The first
function performs a multiplication by 2, while the second one performs a
left shift by 1.
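
Taking the ``shl 1`` example from the introduction (a left shift by one bit),
the two functions could look like the sketch below at the source level. The
function names are made up for illustration; the point is only that both
bodies produce the same result for every unsigned input even though their
instructions differ.

.. code-block:: c++

  #include <cstdint>

  // Computes 2 * X using a multiplication.
  inline std::uint32_t timesTwo(std::uint32_t X) { return X * 2u; }

  // Computes the same value using a left shift by one bit.
  inline std::uint32_t shiftLeftByOne(std::uint32_t X) { return X << 1; }
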
|
||||
|
||||
Possible solutions
|
||||
^^^^^^^^^^^^^^^^^^
|
||||
Let's briefly consider possible options about how and what we have to implement
|
||||
in order to create full-featured functions merging, and also what it would
|
||||
meant for us.
|
||||
mean for us.
|
||||
|
||||
Equal functions detection, obviously supposes "detector" method to be
|
||||
implemented, latter should answer the question "whether functions are equal".
|
||||
This "detector" method consists of tiny "sub-detectors", each of them answers
|
||||
Equal function detection obviously requires a "detector" method to be
implemented; the latter should answer the question "whether functions are equal".
This "detector" method consists of tiny "sub-detectors", each of which answers
exactly the same question, but for function parts.
|
||||
|
||||
As the second step, we should merge equal functions. So it should be a "merger"
|
||||
method. "Merger" accepts two functions *F1* and *F2*, and produces *F1F2*
|
||||
function, the result of merging.
|
||||
|
||||
Having such a routines in our hands, we can process whole module, and merge all
|
||||
Having such routines in our hands, we can process a whole module, and merge all
|
||||
equal functions.
|
||||
|
||||
In this case, we have to compare every function with every other function. As
|
||||
reader could notice, this way seems to be quite expensive. Of course we could
|
||||
the reader may notice, this way seems to be quite expensive. Of course we could
|
||||
introduce hashing and other helpers, but it is still just an optimization, and
the complexity remains at the level of O(N*N).
|
||||
|
||||
@ -143,44 +130,45 @@ access lookup? The answer is: "yes".
|
||||
|
||||
Random-access
|
||||
"""""""""""""
|
||||
How it could be done? Just convert each function to number, and gather all of
|
||||
them in special hash-table. Functions with equal hash are equal. Good hashing
|
||||
means, that every function part must be taken into account. That means we have
|
||||
to convert every function part into some number, and then add it into hash.
|
||||
Lookup-up time would be small, but such approach adds some delay due to hashing
|
||||
routine.
|
||||
How could this be done? Just convert each function to a number, and gather
all of them in a special hash-table. Functions with equal hashes are treated as
equal. Good hashing means that every function part must be taken into account.
That means we have to convert every function part into some number, and then
add it into the hash. The lookup time would be small, but such an approach adds
some delay due to the hashing routine.
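
A minimal sketch of the random-access idea, assuming a function can be modelled
as a sequence of numeric parts. ``ToyFunction``, ``hashFunction`` and
``groupByHash`` are hypothetical names, and the FNV-style mixing step is chosen
only for illustration; it is not what the pass uses.

.. code-block:: c++

  #include <cstdint>
  #include <string>
  #include <unordered_map>
  #include <vector>

  // Hypothetical model: a function is a name plus numeric "parts".
  struct ToyFunction {
    std::string name;
    std::vector<std::uint64_t> parts;
  };

  // Fold every part into one number, so that each part affects the hash.
  inline std::uint64_t hashFunction(const ToyFunction &F) {
    std::uint64_t H = 0xcbf29ce484222325ull;
    for (std::uint64_t P : F.parts) {
      H ^= P;
      H *= 0x100000001b3ull;
    }
    return H;
  }

  // Gather functions into buckets keyed by hash; a bucket holds candidates
  // that this scheme would treat as equal.
  inline std::unordered_map<std::uint64_t, std::vector<std::string>>
  groupByHash(const std::vector<ToyFunction> &Fns) {
    std::unordered_map<std::uint64_t, std::vector<std::string>> Buckets;
    for (const ToyFunction &F : Fns)
      Buckets[hashFunction(F)].push_back(F.name);
    return Buckets;
  }

Note that every part of every function is visited even when two functions
differ in their first part, which is the extra hashing cost mentioned above.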
|
||||
|
||||
Logarithmical search
|
||||
""""""""""""""""""""
|
||||
We could introduce total ordering among the functions set, once we had it we
|
||||
We could introduce a total ordering among the functions set; once ordered, we
could then implement a logarithmical search. Lookup time still depends on N,
but adds only a little delay (*log(N)*).
|
||||
|
||||
Present state
|
||||
"""""""""""""
|
||||
Both of approaches (random-access and logarithmical) has been implemented and
|
||||
tested. And both of them gave a very good improvement. And what was most
|
||||
surprising, logarithmical search was faster; sometimes up to 15%. Hashing needs
|
||||
some extra CPU time, and it is the main reason why it works slower; in most of
|
||||
cases total "hashing" time was greater than total "logarithmical-search" time.
|
||||
Both of the approaches (random-access and logarithmical) have been implemented
|
||||
and tested and both give a very good improvement. What was most
|
||||
surprising is that logarithmical search was faster; sometimes by up to 15%. The
|
||||
hashing method needs some extra CPU time, which is the main reason why it works
|
||||
slower; in most cases, total "hashing" time is greater than total
|
||||
"logarithmical-search" time.
|
||||
|
||||
So, preference has been granted to the "logarithmical search".
|
||||
|
||||
Though in the case of need, *logarithmical-search* (read "total-ordering") could
|
||||
be used as a milestone on our way to the *random-access* implementation.
|
||||
|
||||
Every comparison is based either on the numbers or on flags comparison. In
|
||||
*random-access* approach we could use the same comparison algorithm. During
|
||||
comparison we exit once we find the difference, but here we might have to scan
|
||||
whole function body every time (note, it could be slower). Like in
|
||||
"total-ordering", we will track every numbers and flags, but instead of
|
||||
comparison, we should get numbers sequence and then create the hash number. So,
|
||||
once again, *total-ordering* could be considered as a milestone for even faster
|
||||
(in theory) random-access approach.
|
||||
Every comparison is based either on the numbers or on the flags comparison. In
|
||||
the *random-access* approach, we could use the same comparison algorithm.
|
||||
During comparison, we exit once we find the difference, but here we might have
|
||||
to scan the whole function body every time (note, it could be slower). Like in
|
||||
"total-ordering", we will track every number and flag, but instead of
|
||||
comparison, we should get the numbers sequence and then create the hash number.
|
||||
So, once again, *total-ordering* could be considered as a milestone for an even
faster (in theory) random-access approach.
|
||||
|
||||
MergeFunctions, main fields and runOnModule
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
There are two most important fields in class:
|
||||
There are two main fields in the class:
|
||||
|
||||
``FnTree`` – the set of all unique functions. It keeps items that couldn't be
|
||||
merged with each other. It is defined as:
|
||||
@ -192,8 +180,8 @@ implemented “<” operator among the functions set (below we explain how it wo
|
||||
exactly; this is a key point in fast functions comparison).
|
||||
|
||||
``Deferred`` – merging process can affect bodies of functions that are in
|
||||
``FnTree`` already. Obviously such functions should be rechecked again. In this
|
||||
case we remove them from ``FnTree``, and mark them as to be rescanned, namely
|
||||
``FnTree`` already. Obviously, such functions should be rechecked. In this
case, we remove them from ``FnTree`` and mark them to be rescanned, namely
put them into the ``Deferred`` list.
|
||||
|
||||
runOnModule
|
||||
@ -205,28 +193,30 @@ The algorithm is pretty simple:
|
||||
2. Scan *worklist*'s functions twice: first enumerate only strong functions and
|
||||
then only weak ones:
|
||||
|
||||
2.1. Loop body: take function from *worklist* (call it *FCur*) and try to
|
||||
2.1. Loop body: take a function from *worklist* (call it *FCur*) and try to
|
||||
insert it into *FnTree*: check whether *FCur* is equal to one of the functions
|
||||
in *FnTree*. If there *is* equal function in *FnTree* (call it *FExists*):
|
||||
merge function *FCur* with *FExists*. Otherwise add function from *worklist*
|
||||
to *FnTree*.
|
||||
in *FnTree*. If there *is* an equal function in *FnTree*
|
||||
(call it *FExists*): merge function *FCur* with *FExists*. Otherwise add
|
||||
the function from the *worklist* to *FnTree*.
|
||||
|
||||
3. Once *worklist* scanning and merging operations is complete, check *Deferred*
|
||||
list. If it is not empty: refill *worklist* contents with *Deferred* list and
|
||||
do step 2 again, if *Deferred* is empty, then exit from method.
|
||||
3. Once the *worklist* scanning and merging operations are complete, check the
*Deferred* list. If it is not empty, refill the *worklist* contents with the
*Deferred* list and redo step 2; if the *Deferred* list is empty, exit
from the method.
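
The core of step 2.1 can be sketched with an ordered set. This is a simplified
model, not the pass itself: ``ToyFunction``, ``BodyLess`` and ``mergeEqual`` are
hypothetical names, function bodies are modelled as integer sequences, and the
*Deferred* list and the separate strong/weak scans are left out.

.. code-block:: c++

  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical model of a function: a name and a body encoded as numbers.
  struct ToyFunction {
    std::string name;
    std::vector<int> body;
  };

  // Total order on bodies only; functions with equal bodies compare as equal.
  struct BodyLess {
    bool operator()(const ToyFunction &L, const ToyFunction &R) const {
      return L.body < R.body;  // lexicographic comparison of the sequences
    }
  };

  // Returns (FCur, FExists) name pairs for every function that would be merged.
  inline std::vector<std::pair<std::string, std::string>>
  mergeEqual(const std::vector<ToyFunction> &Worklist) {
    std::set<ToyFunction, BodyLess> FnTree;
    std::vector<std::pair<std::string, std::string>> Merged;
    for (const ToyFunction &FCur : Worklist) {
      auto Res = FnTree.insert(FCur);  // O(log N) lookup plus insertion
      if (!Res.second)                 // an equal function is already in the tree
        Merged.emplace_back(FCur.name, Res.first->name);
    }
    return Merged;
  }

A single ``insert`` call performs both the lookup and the insertion, which is
why a total ordering on functions is enough to avoid pairwise comparison.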
|
||||
|
||||
Comparison and logarithmical search
|
||||
"""""""""""""""""""""""""""""""""""
|
||||
Let's recall our task: for every function *F* from module *M*, we have to find
|
||||
equal functions *F`* in shortest time, and merge them into the single function.
|
||||
equal functions *F`* in the shortest time possible, and merge them into a
single function.
|
||||
|
||||
Defining total ordering among the functions set allows to organize functions
|
||||
into the binary tree. The lookup procedure complexity would be estimated as
|
||||
O(log(N)) in this case. But how to define *total-ordering*?
|
||||
Defining total ordering among the functions set allows us to organize
|
||||
functions into a binary tree. The lookup procedure complexity would be
|
||||
estimated as O(log(N)) in this case. But how do we define *total-ordering*?
|
||||
|
||||
We have to introduce a single rule applicable to every pair of functions, and
|
||||
following this rule then evaluate which of them is greater. What kind of rule
|
||||
it could be? Let's declare it as "compare" method, that returns one of 3
|
||||
following this rule, evaluate which of them is greater. What kind of rule
could it be? Let's declare it as the "compare" method that returns one of 3
possible values:
|
||||
|
||||
-1, left is *less* than right,
|
||||
@ -243,52 +233,52 @@ Of course it means, that we have to maintain
|
||||
* transitivity (``a <= b`` and ``b <= c``, then ``a <= c``)
|
||||
* asymmetry (if ``a < b``, then neither ``a > b`` nor ``a == b``).
|
||||
|
||||
As it was mentioned before, comparison routine consists of
|
||||
"sub-comparison-routines", each of them also consists
|
||||
"sub-comparison-routines", and so on, finally it ends up with a primitives
|
||||
As mentioned before, the comparison routine consists of
|
||||
"sub-comparison-routines", with each of them also consisting of
|
||||
"sub-comparison-routines", and so on. Finally, it ends up with primitive
|
||||
comparison.
|
||||
|
||||
Below, we will use the next operations:
|
||||
Below, we will use the following operations:
|
||||
|
||||
#. ``cmpNumbers(number1, number2)`` is method that returns -1 if left is less
|
||||
#. ``cmpNumbers(number1, number2)`` is a method that returns -1 if left is less
|
||||
than right; 0, if left and right are equal; and 1 otherwise.
|
||||
|
||||
#. ``cmpFlags(flag1, flag2)`` is hypothetical method that compares two flags.
|
||||
#. ``cmpFlags(flag1, flag2)`` is a hypothetical method that compares two flags.
|
||||
The logic is the same as in ``cmpNumbers``, where ``true`` is 1, and
|
||||
``false`` is 0.
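
The two primitive operations can be sketched as follows. The signatures are
assumptions made for this illustration (the text itself calls ``cmpFlags``
hypothetical), so they may differ from what *MergeFunctions.cpp* declares.

.. code-block:: c++

  #include <cstdint>

  // Returns -1 if L is less than R, 0 if they are equal, and 1 otherwise.
  inline int cmpNumbers(std::uint64_t L, std::uint64_t R) {
    if (L < R)
      return -1;
    if (L > R)
      return 1;
    return 0;
  }

  // Hypothetical helper: compares flags as numbers, true being 1 and false 0.
  inline int cmpFlags(bool L, bool R) {
    return cmpNumbers(L ? 1u : 0u, R ? 1u : 0u);
  }
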
|
||||
|
||||
The rest of article is based on *MergeFunctions.cpp* source code
|
||||
(*<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like to ask
|
||||
reader to keep this file open nearby, so we could use it as a reference for
|
||||
further explanations.
|
||||
The rest of the article is based on the *MergeFunctions.cpp* source code
(found in *<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like
to ask the reader to keep this file open, so we can use it as a reference
for further explanations.
|
||||
|
||||
Now we're ready to proceed to the next chapter and see how it works.
|
||||
Now, we're ready to proceed to the next chapter and see how it works.
|
||||
|
||||
Functions comparison
|
||||
====================
|
||||
First, let's define how exactly we compare complex objects.
|
||||
|
||||
Complex objects comparison (function, basic-block, etc) is mostly based on its
|
||||
sub-objects comparison results. So it is similar to the next "tree" objects
|
||||
Complex object comparison (function, basic-block, etc.) is mostly based on the
comparison results of its sub-objects. It is similar to the following "tree"
objects comparison:
|
||||
|
||||
#. For two trees *T1* and *T2* we perform *depth-first-traversal* and have
|
||||
two sequences as a product: "*T1Items*" and "*T2Items*".
|
||||
|
||||
#. Then compare chains "*T1Items*" and "*T2Items*" in
|
||||
most-significant-item-first order. Result of items comparison would be the
|
||||
result of *T1* and *T2* comparison itself.
|
||||
#. We then compare chains "*T1Items*" and "*T2Items*" in
|
||||
the most-significant-item-first order. The result of items comparison
|
||||
would be the result of *T1* and *T2* comparison itself.
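
The tree comparison described above can be sketched as follows. ``ToyTree``,
``flatten`` and ``compareTrees`` are hypothetical names; as an extra assumption,
each node also emits its child count so that trees of different shape never
produce the same item sequence.

.. code-block:: c++

  #include <algorithm>
  #include <cstddef>
  #include <vector>

  // Hypothetical tree node, used only for this illustration.
  struct ToyTree {
    int value;
    std::vector<ToyTree> children;
  };

  // Depth-first (pre-order) traversal: emit the node value and its child count.
  inline void flatten(const ToyTree &T, std::vector<int> &Items) {
    Items.push_back(T.value);
    Items.push_back(static_cast<int>(T.children.size()));
    for (const ToyTree &C : T.children)
      flatten(C, Items);
  }

  // Compare the two item chains, most significant (earliest) item first.
  inline int compareTrees(const ToyTree &T1, const ToyTree &T2) {
    std::vector<int> T1Items, T2Items;
    flatten(T1, T1Items);
    flatten(T2, T2Items);
    const std::size_t N = std::min(T1Items.size(), T2Items.size());
    for (std::size_t I = 0; I != N; ++I)
      if (T1Items[I] != T2Items[I])
        return T1Items[I] < T2Items[I] ? -1 : 1;
    if (T1Items.size() != T2Items.size())
      return T1Items.size() < T2Items.size() ? -1 : 1;
    return 0;
  }

The first differing item decides the result, so the comparison can stop early
without looking at the remaining items.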
|
||||
|
||||
FunctionComparator::compare(void)
|
||||
---------------------------------
|
||||
Brief look at the source code tells us, that comparison starts in
|
||||
A brief look at the source code tells us that the comparison starts in the
|
||||
“``int FunctionComparator::compare(void)``” method.
|
||||
|
||||
1. First parts to be compared are function's attributes and some properties that
|
||||
outsides “attributes” term, but still could make function different without
|
||||
changing its body. This part of comparison is usually done within simple
|
||||
*cmpNumbers* or *cmpFlags* operations (e.g.
|
||||
``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is full list of function's
|
||||
1. The first parts to be compared are the function's attributes and some
properties that fall outside the “attributes” term, but still could make the
function different without changing its body. This part of the comparison is
usually done with simple *cmpNumbers* or *cmpFlags* operations (e.g.
``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is a full list of the function's
properties to be compared at this stage:
|
||||
|
||||
* *Attributes* (those are returned by ``Function::getAttributes()``
|
||||
@ -333,7 +323,7 @@ arguments (see ``cmpValues`` method below).
|
||||
|
||||
FunctionComparator::cmpType
|
||||
---------------------------
|
||||
Consider how types comparison works.
|
||||
Consider how type comparison works.
|
||||
|
||||
1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the
|
||||
integer type. It could be done if its address space is 0, or if address spaces
|
||||
@ -343,7 +333,7 @@ are ignored at all. Do the same thing for the right type.
|
||||
preference to one of them. So proceed to the next step.
|
||||
|
||||
3. If types are of different kinds (different type IDs), return the result of type
|
||||
IDs comparison, treating them as a numbers (use ``cmpNumbers`` operation).
|
||||
IDs comparison, treating them as numbers (use ``cmpNumbers`` operation).
|
||||
|
||||
4. If types are vectors or integers, return result of their pointers comparison,
|
||||
comparing them as numbers.
|
||||
@ -378,21 +368,21 @@ technique (see the very first paragraph of this chapter). Both *left* and
|
||||
way. If we get -1 or 1 on some stage, return it. Otherwise return 0.
|
||||
|
||||
8. Steps 1-6 describe all the possible cases, if we passed steps 1-6 and didn't
|
||||
get any conclusions, then invoke ``llvm_unreachable``, since it's quite
|
||||
get any conclusions, then invoke ``llvm_unreachable``, since it's quite an
unexpected case.
|
||||
|
||||
cmpValues(const Value*, const Value*)
|
||||
-------------------------------------
|
||||
Method that compares local values.
|
||||
|
||||
This method gives us an answer on a very curious quesion: whether we could treat
|
||||
local values as equal, and which value is greater otherwise. It's better to
|
||||
start from example:
|
||||
This method gives us an answer to a very curious question: whether we could
treat local values as equal, and which value is greater otherwise. It's
better to start with an example:
|
||||
|
||||
Consider situation when we're looking at the same place in left function "*FL*"
|
||||
and in right function "*FR*". And every part of *left* place is equal to the
|
||||
corresponding part of *right* place, and (!) both parts use *Value* instances,
|
||||
for example:
|
||||
Consider the situation when we're looking at the same place in the left
function "*FL*" and in the right function "*FR*". Every part of the *left* place
is equal to the corresponding part of the *right* place, and (!) both parts use
*Value* instances, for example:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
@ -401,13 +391,13 @@ for example:
|
||||
|
||||
So, now our conclusion depends on *Value* instances comparison.
|
||||
|
||||
Main purpose of this method is to determine relation between such values.
|
||||
The main purpose of this method is to determine the relation between such values.
|
||||
|
||||
What we expect from equal functions? At the same place, in functions "*FL*" and
|
||||
"*FR*" we expect to see *equal* values, or values *defined* at the same place
|
||||
in "*FL*" and "*FR*".
|
||||
What can we expect from equal functions? At the same place, in functions
|
||||
"*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at
|
||||
the same place in "*FL*" and "*FR*".
|
||||
|
||||
Consider small example here:
|
||||
Consider a small example here:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
@ -421,20 +411,20 @@ Consider small example here:
|
||||
instr0 i32 %pg0 instr1 i32 %pg0 instr2 i32 123
|
||||
}
|
||||
|
||||
In this example, *pf0* is associated with *pg0*, *pf1* is associated with *pg1*,
|
||||
and we also declare that *pf0* < *pf1*, and thus *pg0* < *pf1*.
|
||||
In this example, *pf0* is associated with *pg0*, *pf1* is associated with
*pg1*, and we also declare that *pf0* < *pf1*, and thus *pg0* < *pg1*.
|
||||
|
||||
Instructions with opcode "*instr0*" would be *equal*, since their types and
|
||||
opcodes are equal, and values are *associated*.
|
||||
|
||||
Instruction with opcode "*instr1*" from *f* is *greater* than instruction with
|
||||
opcode "*instr1*" from *g*; here we have equal types and opcodes, but "*pf1* is
|
||||
greater than "*pg0*".
|
||||
The instruction with opcode "*instr1*" from *f* is *greater* than the
instruction with opcode "*instr1*" from *g*; here we have equal types and
opcodes, but "*pf1*" is greater than "*pg0*".
|
||||
|
||||
And instructions with opcode "*instr2*" are equal, because their opcodes and
|
||||
Instructions with opcode "*instr2*" are equal, because their opcodes and
|
||||
types are equal, and the same constant is used as a value.
|
||||
|
||||
What we assiciate in cmpValues?
|
||||
What we associate in cmpValues?
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
* Function arguments. *i*-th argument from left function associated with
|
||||
*i*-th argument from right function.
|
||||
@ -444,23 +434,22 @@ What we assiciate in cmpValues?
|
||||
* Instructions.
|
||||
* Instruction operands. Note, we can meet *Value* here we have never seen
|
||||
before. In this case it is not a function argument, nor *BasicBlock*, nor
|
||||
*Instruction*. It is global value. It is constant, since its the only
|
||||
supposed global here. Method also compares:
|
||||
* Constants that are of the same type.
|
||||
* If right constant could be losslessly bit-casted to the left one, then we
|
||||
also compare them.
|
||||
*Instruction*. It is a global value. It is a constant, since that is the only
kind of global expected here. The method also compares constants that are of
the same type, and if the right constant can be losslessly bitcast to the left
one, we compare them as well.
|
||||
|
||||
How to implement cmpValues?
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
*Association* is a case of equality for us. We just treat such values as equal.
|
||||
But, in general, we need to implement antisymmetric relation. As it was
|
||||
mentioned above, to understand what is *less*, we can use order in which we
|
||||
meet values. If both of values has the same order in function (met at the same
|
||||
time), then treat values as *associated*. Otherwise – it depends on who was
|
||||
*Association* is a case of equality for us. We just treat such values as equal,
but, in general, we need to implement an antisymmetric relation. As mentioned
above, to understand what is *less*, we can use the order in which we
meet values. If both values have the same order in a function (met at the same
time), we treat them as *associated*. Otherwise, it depends on which one was
met first.
|
||||
|
||||
Every time we run top-level compare method, we initialize two identical maps
|
||||
(one for the left side, another one for the right side):
|
||||
Every time we run the top-level compare method, we initialize two identical
|
||||
maps (one for the left side, another one for the right side):
|
||||
|
||||
``map<Value, int> sn_mapL, sn_mapR;``
|
||||
|
||||
@ -471,11 +460,11 @@ To add value *V* we need to perform the next procedure:
|
||||
|
||||
``sn_map.insert(std::make_pair(V, sn_map.size()));``
|
||||
|
||||
For the first *Value*, map will return *0*, for second *Value* map will return
|
||||
*1*, and so on.
|
||||
For the first *Value*, the map will return *0*, for the second *Value* the map
will return *1*, and so on.
|
||||
|
||||
Then we can check whether left and right values met at the same time with simple
|
||||
comparison:
|
||||
We can then check whether the left and right values were met at the same time
with a simple comparison:
|
||||
|
||||
``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);``
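
The serial-number scheme can be sketched in a self-contained way. The names
``ValueHandle``, ``SerialMap``, ``serialNumber`` and ``cmpBySerial`` are
hypothetical, and a plain pointer stands in for an LLVM *Value*; this is an
illustration of the idea, not the code from the pass.

.. code-block:: c++

  #include <map>
  #include <utility>

  // A plain pointer stands in for an LLVM Value in this sketch.
  using ValueHandle = const void *;
  using SerialMap = std::map<ValueHandle, int>;

  // Returns V's serial number, assigning the next free one on first sight.
  inline int serialNumber(SerialMap &SnMap, ValueHandle V) {
    return SnMap.insert(std::make_pair(V, static_cast<int>(SnMap.size())))
        .first->second;
  }

  // Returns -1, 0 or 1, following the cmpNumbers convention.
  inline int cmpNumbers(int L, int R) { return L < R ? -1 : (L > R ? 1 : 0); }

  // 0 means both values were met at the same time, i.e. they are associated.
  inline int cmpBySerial(SerialMap &SnMapL, SerialMap &SnMapR, ValueHandle L,
                         ValueHandle R) {
    return cmpNumbers(serialNumber(SnMapL, L), serialNumber(SnMapR, R));
  }

Because ``insert`` leaves an existing entry untouched, a value keeps the number
it received the first time it was met.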
|
||||
|
||||
@ -490,7 +479,7 @@ Of course, we can combine insertion and comparison:
|
||||
|
||||
Let's look at how the whole method could be implemented.
|
||||
|
||||
1. we have to start from the bad news. Consider function self and
|
||||
1. We have to start with the bad news. Consider function self and
|
||||
cross-referencing cases:
|
||||
|
||||
.. code-block:: c++
|
||||
@ -507,7 +496,7 @@ cross-referencing cases:
|
||||
This comparison has been implemented in initial *MergeFunctions* pass
|
||||
version. But, unfortunately, it is not transitive. And this is the only case
|
||||
we can't convert to a less-equal-greater comparison. It is a rare case, 4-5
|
||||
functions of 10000 (checked on test-suite), and, we hope, reader would
|
||||
functions of 10000 (checked in test-suite), and, we hope, the reader would
|
||||
forgive us for such a sacrifice in order to get the O(log(N)) pass time.
|
||||
|
||||
2. If left/right *Value* is a constant, we have to compare them. Return 0 if it
|
||||
@ -518,8 +507,8 @@ comparison.
|
||||
|
||||
4. Explicit association of *L* (left value) and *R* (right value). We need to
|
||||
find out whether values met at the same time, and thus are *associated*. Or we
|
||||
need to put the rule: when we treat *L* < *R*. Now it is easy: just return
|
||||
result of numbers comparison:
|
||||
need to set the rule for when we treat *L* < *R*. Now it is easy: we just return
the result of numbers comparison:
|
||||
|
||||
.. code-block:: c++
|
||||
|
||||
@@ -530,16 +519,16 @@ result of numbers comparison:
 if (LeftRes.first->second < RightRes.first->second) return -1;
 return 1;

-Now when *cmpValues* returns 0, we can proceed comparison procedure. Otherwise,
-if we get (-1 or 1), we need to pass this result to the top level, and finish
-comparison procedure.
+Now when *cmpValues* returns 0, we can proceed the comparison procedure.
+Otherwise, if we get (-1 or 1), we need to pass this result to the top level,
+and finish comparison procedure.

 cmpConstants
 ------------
 Performs constants comparison as follows:

-1. Compare constant types using ``cmpType`` method. If result is -1 or 1, goto
-step 2, otherwise proceed to step 3.
+1. Compare constant types using ``cmpType`` method. If the result is -1 or 1,
+goto step 2, otherwise proceed to step 3.

 2. If types are different, we still can check whether constants could be
 losslessly bitcasted to each other. The further explanation is modification of
@@ -581,10 +570,10 @@ bitcastable:
 if (int Res = cmpNumbers(L->getValueID(), R->getValueID()))
 return Res;

-5. Compare the contents of constants. The comparison depends on kind of
+5. Compare the contents of constants. The comparison depends on the kind of
 constants, but on this stage it is just a lexicographical comparison. Just see
 how it was described in the beginning of "*Functions comparison*" paragraph.
-Mathematically it is equal to the next case: we encode left constant and right
+Mathematically, it is equal to the next case: we encode left constant and right
 constant (with similar way *bitcode-writer* does). Then compare left code
 sequence and right code sequence.

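Step 5 in the hunk above reduces constant comparison to a lexicographical comparison of two encoded sequences. A rough sketch of that idea, assuming the constants have already been encoded into plain integer vectors (the encoding itself and the name ``cmpCodeSequences`` are hypothetical, and the real pass may never materialize such sequences):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Three-way comparison of two numbers: returns -1, 0 or 1.
int cmpNumbers(std::uint64_t L, std::uint64_t R) {
  if (L < R) return -1;
  if (L > R) return 1;
  return 0;
}

// Three-way lexicographical comparison of two hypothetical code sequences.
// The first differing element decides the result; if one sequence is a
// prefix of the other, the shorter one is treated as smaller.
int cmpCodeSequences(const std::vector<std::uint64_t> &L,
                     const std::vector<std::uint64_t> &R) {
  std::size_t N = std::min(L.size(), R.size());
  for (std::size_t I = 0; I != N; ++I)
    if (int Res = cmpNumbers(L[I], R[I]))
      return Res;
  return cmpNumbers(L.size(), R.size());
}
```

Returning -1, 0 or 1 instead of a boolean is what lets the result be passed up unchanged to the top-level comparison.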
@@ -598,7 +587,7 @@ It enumerates instructions from left *BB* and right *BB*.
 ``cmpValues`` method.

 2. If one of left or right is *GEP* (``GetElementPtr``), then treat *GEP* as
-greater than other instructions, if both instructions are *GEPs* use ``cmpGEP``
+greater than other instructions. If both instructions are *GEPs* use ``cmpGEP``
 method for comparison. If result is -1 or 1, pass it to the top-level
 comparison (return it).

@@ -618,11 +607,11 @@ comparison (return it).
 4. We can finish instruction enumeration in 3 cases:

 4.1. We reached the end of both left and right basic-blocks. We didn't
-exit on steps 1-3, so contents is equal, return 0.
+exit on steps 1-3, so contents are equal, return 0.

 4.2. We have reached the end of the left basic-block. Return -1.

-4.3. Return 1 (the end of the right basic block).
+4.3. Return 1 (we reached the end of the right basic block).

 cmpGEP
 ------
@@ -652,8 +641,8 @@ method, and compare it like a numbers.

 5. Compare operand types.

-6. For some particular instructions check equivalence (relation in our case) of
-some significant attributes. For example we have to compare alignment for
+6. For some particular instructions, check equivalence (relation in our case) of
+some significant attributes. For example, we have to compare alignment for
 ``load`` instructions.

 O(log(N))
@@ -692,7 +681,7 @@ call wrapper around *F* and replace *G* with that call.
 change the callers: call *F* instead of *G*. That's what
 ``replaceDirectCallers`` does.

-Below is detailed body description.
+Below is a detailed body description.

 If “F” may be overridden
 ------------------------
@@ -736,17 +725,17 @@ also have alias to *F*.

 No global aliases, replaceDirectCallers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-If global aliases are not supported. We call ``replaceDirectCallers`` then. Just
+If global aliases are not supported. We call ``replaceDirectCallers``. Just
 go through all calls of *G* and replace it with calls of *F*. If you look into
-method you will see that it scans all uses of *G* too, and if use is callee (if
-user is call instruction and *G* is used as what to be called), we replace it
-with use of *F*.
+the method you will see that it scans all uses of *G* too, and if use is callee
+(if user is call instruction and *G* is used as what to be called), we replace
+it with use of *F*.

 If “F” could not be overridden, fix it!
 """""""""""""""""""""""""""""""""""""""

 We call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace
-*G* with alias to *F* first. Next conditions are essential:
+*G* with alias to *F* first. The next conditions are essential:

 * target should support global aliases,
 * the address itself of *G* should be not significant, not named and not
@@ -761,7 +750,7 @@ so *G* could be replaced with this wrapper.
 As follows from *llvm* reference:

 “Aliases act as *second name* for the aliasee value”. So we just want to create
-second name for *F* and use it instead of *G*:
+a second name for *F* and use it instead of *G*:

 1. create global alias itself (*GA*),

@@ -793,10 +782,4 @@ it instead of *G*.

 3. Get rid of *G*.

-That's it.
-==========
-We have described how to detect equal functions, and how to merge them, and in
-first chapter we have described how it works all-together. Author hopes, reader
-have some picture from now, and it helps him improve and debug this pass.

-Reader is welcomed to send us any questions and proposals ;-)