mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-23 11:13:28 +01:00
923d744cb9
Summary: If there's a callees metadata attached to the indirect call instruction, add CallGraphEdges to the callees mentioned in the metadata when computing FunctionSummary. * Why this is necessary: Consider following code example: ``` (foo.c) static int f1(int x) {...} static int f2(int x); static int (*fptr)(int) = f2; static int f2(int x) { if (x) fptr=f1; return f1(x); } int foo(int x) { (*fptr)(x); // !callees metadata of !{i32 (i32)* @f1, i32 (i32)* @f2} would be attached to this call. } (bar.c) int bar(int x) { return foo(x); } ``` At LTO time when `foo.o` is imported into `bar.o`, function `foo` might be inlined into `bar` and PGO-guided indirect call promotion will run after that. If the profile data tells that the promotion of `@f1` or `@f2` is beneficial, the optimizer will check if the "promoted" `@f1` or `@f2` (such as `@f1.llvm.0` or `@f2.llvm.0`) is available. Without this patch, importing `!callees` metadata would only add promoted declarations of `@f1` and `@f2` to the `bar.o`, but still the optimizer will assume that the function is available and perform the promotion. The result of that is link failure with `undefined reference to @f1.llvm.0`. This patch fixes this problem by adding callees in the `!callees` metadata to CallGraphEdges so that their definition would be properly imported into. One may ask that there already is a logic to add indirect call promotion targets to be added to CallGraphEdges. However, if profile data says "indirect call promotion is only beneficial under a certain inline context", the logic wouldn't work. In the code example above, if profile data is like ``` bar:1000000:100000 1:100000 1: foo:100000 1: 100000 f1:100000 ``` , Computing FunctionSummary for `foo.o` wouldn't add `foo->f1` to CallGraphEdges. (Also, it is at least "possible" that one can provide profile data to only link step but not to compilation step). Reviewers: tejohnson, mehdi_amini, pcc Reviewed By: tejohnson Subscribers: inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D44399 llvm-svn: 327358
23 lines
668 B
LLVM
23 lines
668 B
LLVM
; Do setup work: generate bitcode and combined index
|
|
; RUN: opt -module-summary %s -o %t1.bc
|
|
; RUN: opt -module-summary %p/Inputs/callees-metadata.ll -o %t2.bc
|
|
|
|
; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps \
|
|
; RUN: -r=%t1.bc,bar,plx \
|
|
; RUN: -r=%t1.bc,foo,l \
|
|
; RUN: -r=%t2.bc,foo,pl
|
|
; RUN: llvm-dis %t.o.1.3.import.bc -o - | FileCheck %s
|
|
; CHECK: define {{.*}} i32 @f1.llvm.0
|
|
; CHECK: define {{.*}} i32 @f2.llvm.0
|
|
|
|
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
|
|
target triple = "x86_64-unknown-linux-gnu"
|
|
|
|
define dso_local i32 @bar(i32 %x) {
|
|
entry:
|
|
%call = call i32 @foo(i32 %x)
|
|
ret i32 %call
|
|
}
|
|
|
|
declare dso_local i32 @foo(i32)
|