1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Go to file
Tobias Grosser 1b2b3c1ea1 [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))
Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).

As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22511

Contributed-by: Matthias Reisinger
llvm-svn: 275989
2016-07-19 16:39:17 +00:00
bindings [OCaml] Add functions for accessing metadata nodes. 2016-06-22 03:30:24 +00:00
cmake [cmake] Create the LLVM_BUILD_UTILS option. 2016-07-10 02:43:47 +00:00
docs Retry: [llvm-profdata] Speed up merging by using a thread pool 2016-07-19 01:17:20 +00:00
examples [Kaleidoscope][BuildingAJIT] Start filling in text for chapter 3. 2016-07-15 01:39:49 +00:00
include [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR 2016-07-19 15:07:43 +00:00
lib [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) 2016-07-19 16:39:17 +00:00
projects Remove autoconf support 2016-01-26 21:29:08 +00:00
resources
runtimes [CMake] Add LLVM runtimes directory 2016-06-23 22:07:21 +00:00
test [InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) 2016-07-19 16:39:17 +00:00
tools Retry: [llvm-profdata] Speed up merging by using a thread pool 2016-07-19 01:17:20 +00:00
unittests Retry: [llvm-profdata] Speed up merging by using a thread pool 2016-07-19 01:17:20 +00:00
utils TableGen: Allow custom register operand decoder method 2016-07-18 23:20:46 +00:00
.arcconfig Upgrade all the .arcconfigs to https. 2016-07-14 13:15:37 +00:00
.clang-format
.clang-tidy Don't use misc-unused-parameters check on LLVM. 2016-04-13 08:58:52 +00:00
.gitignore [CMake] Add LLVM runtimes directory 2016-06-23 22:07:21 +00:00
CMakeLists.txt Bump the trunk version to 4.0.0svn. 2016-07-18 17:51:04 +00:00
CODE_OWNERS.TXT Transfer ownership of the gold plugin. 2016-07-05 20:49:50 +00:00
configure Remove autoconf support 2016-01-26 21:29:08 +00:00
CREDITS.TXT Update my email address. 2016-05-10 16:23:54 +00:00
LICENSE.TXT Update copyright year to 2016. 2016-03-30 22:41:06 +00:00
llvm.spec.in [Sparc] Implement i64 load/store support for 32-bit sparc. 2015-08-10 19:11:39 +00:00
LLVMBuild.txt
README.txt Revert previous test commit. 2016-01-04 19:13:29 +00:00

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.txt.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you are writing a package for LLVM, see docs/Packaging.rst for our
suggestions.