mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-22 18:54:02 +01:00
6a49dbd1a3
This adds a function specialization pass to LLVM. Constant parameters like function pointers and constant globals are propagated to the callee by specializing the function.

This is a first version with a number of limitations:
- The pass is off by default, so needs to be enabled on the command line,
- It does not handle specialization of recursive functions,
- It does not yet handle constants and constant ranges,
- Only 1 argument per function is specialised,
- The cost-model could be further looked into, and perhaps related,
- We are not yet caching analysis results.

This is based on earlier work by Matthew Simpson (D36432) and Vinay Madhusudan. More recently this was also discussed on the list, see: https://lists.llvm.org/pipermail/llvm-dev/2021-March/149380.html.

The motivation for this work is that function specialisation often comes up as a reason for performance differences of generated code between LLVM and GCC, which has this enabled by default from optimisation level -O3 and up. And while this certainly helps a few CPU benchmark cases, it also triggers in real-world code and is thus a generally useful transformation to have in LLVM.

Function specialisation has great potential to increase compile-times and code-size. The summary from some investigations with this patch is:
- The compile-time increase for short compile jobs is relatively high, but the increase in absolute numbers is still low.
- For longer compile jobs, the extra compile time is around 1%, and very much in line with GCC.
- It is difficult to blame one thing for compile-time increases: it looks like everywhere a little bit more time is spent processing more functions and instructions.
- But the function specialisation pass itself is not very expensive; it doesn't show up very high in the profile of the optimisation passes.

The goal of this work is to reach parity with GCC, which means that eventually we would like to get this enabled by default. But first we would like to address some of the limitations.

Differential Revision: https://reviews.llvm.org/D93838
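The propagation described in the commit message can be pictured with a small IR sketch. This is an illustration under stated assumptions, not output captured from the pass: the clone name @compute.plus and the wrapper @caller are hypothetical, and the pass chooses its own names for the functions it creates. The sketch uses the same typed-pointer syntax as the test in this file.

```llvm
; Sketch only: @compute.plus and @caller are hypothetical names used for
; illustration; they are not taken from the pass or from the test.

define internal i64 @plus(i64 %x) {
entry:
  %r = add i64 %x, 1
  ret i64 %r
}

; Generic version: the callee is only known through the %binop argument.
define internal i64 @compute(i64 %x, i64 (i64)* %binop) {
entry:
  %r = call i64 %binop(i64 %x)
  ret i64 %r
}

; Clone specialized for %binop == @plus: the indirect call becomes a direct
; call, and %binop is now unused inside the body.
define internal i64 @compute.plus(i64 %x, i64 (i64)* %binop) {
entry:
  %r = call i64 @plus(i64 %x)
  ret i64 %r
}

define i64 @caller(i64 %x) {
entry:
  ; Before specialization: call i64 @compute(i64 %x, i64 (i64)* @plus)
  %r = call i64 @compute.plus(i64 %x, i64 (i64)* @plus)
  ret i64 %r
}
```

Once the call is direct, later passes can finish the job. In the test's RUN line, -deadargelim is there to drop the now-unused argument and -inline to fold the small callees into the caller.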
51 lines
1.2 KiB
LLVM
; RUN: opt -function-specialization -deadargelim -inline -S < %s | FileCheck %s

; After specialization, dead-argument elimination and inlining, @main should
; contain the add/sub directly instead of calls to @compute.

; CHECK-LABEL: @main(i64 %x, i1 %flag) {
; CHECK:      entry:
; CHECK-NEXT:   br i1 %flag, label %plus, label %minus
; CHECK:      plus:
; CHECK-NEXT:   [[TMP0:%.+]] = add i64 %x, 1
; CHECK-NEXT:   br label %merge
; CHECK:      minus:
; CHECK-NEXT:   [[TMP1:%.+]] = sub i64 %x, 1
; CHECK-NEXT:   br label %merge
; CHECK:      merge:
; CHECK-NEXT:   [[TMP2:%.+]] = phi i64 [ [[TMP0]], %plus ], [ [[TMP1]], %minus ]
; CHECK-NEXT:   ret i64 [[TMP2]]
; CHECK-NEXT: }
;
define i64 @main(i64 %x, i1 %flag) {
entry:
  br i1 %flag, label %plus, label %minus

plus:
  %tmp0 = call i64 @compute(i64 %x, i64 (i64)* @plus)
  br label %merge

minus:
  %tmp1 = call i64 @compute(i64 %x, i64 (i64)* @minus)
  br label %merge

merge:
  %tmp2 = phi i64 [ %tmp0, %plus ], [ %tmp1, %minus ]
  ret i64 %tmp2
}

define internal i64 @compute(i64 %x, i64 (i64)* %binop) {
entry:
  %tmp0 = call i64 %binop(i64 %x)
  ret i64 %tmp0
}

define internal i64 @plus(i64 %x) {
entry:
  %tmp0 = add i64 %x, 1
  ret i64 %tmp0
}

define internal i64 @minus(i64 %x) {
entry:
  %tmp0 = sub i64 %x, 1
  ret i64 %tmp0
}