This could be simplified further, but Hal has a specific feature for
ignoring TTI, and so I preserved that.
Also, I needed to use it because a number of tests fail when switching
from a null TTI to the NoTTI nonce implementation. That seems suspicious
to me and so may be something that you need to look into Hal. I worked
it by preserving the old behavior for these tests with the flag that
ignores all target info.
llvm-svn: 171722
BBVectorize would, except for loads and stores, always fuse instructions
so that the first instruction (in the current source order) would always
represent the low part of the input vectors and the second instruction
would always represent the high part. This lead to too many shuffles
being produced because sometimes the opposite order produces fewer of them.
With this change, BBVectorize tracks the kind of pair connections that form
the DAG of candidate pairs, and uses that information to reorder the pairs to
avoid excess shuffles. Using this information, a future commit will be able
to add VTTI-based shuffle costs to the pair selection procedure. Importantly,
the number of remaining shuffles can now be estimated during pair selection.
There are some trivial instruction reorderings in the test cases, and one
simple additional test where we certainly want to do a reordering to
avoid an unnecessary shuffle.
llvm-svn: 167122
The powi intrinsic requires special handling because it always takes a single
integer power regardless of the result type. As a result, we can vectorize
only if the powers are equal. Fixes PR12364.
llvm-svn: 153797
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
llvm-svn: 149468