mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
llvm-mirror/lib
Eric Christopher 5d29fd1be5 Fix a problem where the TwoAddressInstructionPass generates redundant register moves in a loop.
From:
int M, total;
void foo() {
  int i;
  for (i = 0; i < M; i++) {
    total = total + i / 2;
  }
}

This is the kernel loop:

.LBB0_2: # %for.body
         # =>This Inner Loop Header: Depth=1
movl %edx, %esi
movl %ecx, %edx
shrl $31, %edx
addl %ecx, %edx
sarl %edx
addl %esi, %edx
incl %ecx
cmpl %eax, %ecx
jl .LBB0_2
--------------------------
The first mov insn "movl %edx, %esi" could be removed if we change "addl %esi, %edx" to "addl %edx, %esi".
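
To make the claim above concrete, here is a minimal register-level sketch in C of the loop as emitted and of the loop with the final add commuted. It is an illustration only: the function names, the `moves` counter, and the assumption that the running total already lives in %esi on loop entry in the commuted version are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: each variable mirrors the x86 register of the same
   name in the listing. Assumes 0 <= start < m, so the loop runs at least once. */
static int32_t loop_as_emitted(int32_t total, int32_t start, int32_t m, int *moves) {
    assert(start >= 0 && start < m);
    int32_t edx = total, ecx = start, esi;
    do {
        esi = edx; (*moves)++;                  /* movl %edx, %esi (redundant copy) */
        edx = ecx; (*moves)++;                  /* movl %ecx, %edx */
        edx = (int32_t)((uint32_t)edx >> 31);   /* shrl $31, %edx */
        edx += ecx;                             /* addl %ecx, %edx */
        edx >>= 1;                              /* sarl %edx (value is non-negative here) */
        edx += esi;                             /* addl %esi, %edx */
        ecx++;                                  /* incl %ecx */
    } while (ecx < m);                          /* cmpl %eax, %ecx ; jl .LBB0_2 */
    return edx;
}

/* Same loop with the last add commuted. Assumption: the accumulator is
   kept in %esi across iterations, so the first mov is no longer needed. */
static int32_t loop_commuted(int32_t total, int32_t start, int32_t m, int *moves) {
    assert(start >= 0 && start < m);
    int32_t esi = total, ecx = start, edx;
    do {
        edx = ecx; (*moves)++;                  /* movl %ecx, %edx */
        edx = (int32_t)((uint32_t)edx >> 31);   /* shrl $31, %edx */
        edx += ecx;                             /* addl %ecx, %edx */
        edx >>= 1;                              /* sarl %edx */
        esi += edx;                             /* addl %edx, %esi */
        ecx++;                                  /* incl %ecx */
    } while (ecx < m);
    return esi;
}
```

Both functions return the same total for the same inputs; the commuted version executes one register move per iteration instead of two.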

The IR before TwoAddressInstructionPass is:
BB#2: derived from LLVM BB %for.body
    Predecessors according to CFG: BB#1 BB#2
    %vreg3<def> = COPY %vreg12<kill>; GR32:%vreg3,%vreg12
    %vreg2<def> = COPY %vreg11<kill>; GR32:%vreg2,%vreg11
    %vreg7<def,tied1> = SHR32ri %vreg3<tied0>, 31, %EFLAGS<imp-def,dead>; GR32:%vreg7,%vreg3
    %vreg8<def,tied1> = ADD32rr %vreg3<tied0>, %vreg7<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg3,%vreg7
    %vreg9<def,tied1> = SAR32r1 %vreg8<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg9,%vreg8
    %vreg4<def,tied1> = ADD32rr %vreg9<kill,tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg4,%vreg9,%vreg2
    %vreg5<def,tied1> = INC64_32r %vreg3<kill,tied0>, %EFLAGS<imp-def,dead>; GR32:%vreg5,%vreg3
    CMP32rr %vreg5, %vreg0, %EFLAGS<imp-def>; GR32:%vreg5,%vreg0
    %vreg11<def> = COPY %vreg4; GR32:%vreg11,%vreg4
    %vreg12<def> = COPY %vreg5<kill>; GR32:%vreg12,%vreg5
    JL_4 <BB#2>, %EFLAGS<imp-use,kill>
Now TwoAddressInstructionPass chooses vreg9 to be tied with vreg4. However, it does not see that there is a copy from vreg4 to vreg11 and another copy from vreg11 to vreg2 inside the loop body. To remove those copies, vreg2 must be tied with vreg4 instead of vreg9. This code pattern commonly appears when a loop contains a reduction operation.

So check for a reversed copy chain and, if one is found, commute the add instruction so that the copy can be avoided.
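
The idea of a reversed copy chain can be sketched with a toy model in C. This is not the code from the patch: `struct copy`, `copy_chain_reaches`, and `should_commute` are hypothetical names, virtual registers are plain integers, and the search follows only the first matching COPY at each step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of "dst = COPY src" (hypothetical, not an LLVM type). */
struct copy { int dst, src; };

/* Returns true if following COPYs forward from `from` eventually reaches
   `to`. Simplification: takes the first COPY whose source matches, and
   stops after n hops so that a cycle of copies cannot loop forever. */
static bool copy_chain_reaches(const struct copy *copies, size_t n, int from, int to) {
    int cur = from;
    for (size_t hops = 0; hops < n; hops++) {
        bool advanced = false;
        for (size_t i = 0; i < n; i++) {
            if (copies[i].src == cur) {
                cur = copies[i].dst;
                advanced = true;
                break;
            }
        }
        if (!advanced) return false;
        if (cur == to) return true;
    }
    return false;
}

/* For "def = ADD tied, other": commute when the result is copied back
   into `other` (a reversed copy chain) but not into `tied`. */
static bool should_commute(const struct copy *copies, size_t n,
                           int def, int tied, int other) {
    return copy_chain_reaches(copies, n, def, other) &&
           !copy_chain_reaches(copies, n, def, tied);
}
```

With the four COPYs of BB#2 as input, the ADD32rr defining vreg4 (tied operand vreg9, other operand vreg2) is reported as worth commuting, because vreg4 flows to vreg11 and then back to vreg2. The ADD32rr defining vreg8 is not, since no COPY reads vreg8.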

Patch by Wei Mi.
http://reviews.llvm.org/D7806

llvm-svn: 231148
2015-03-03 22:03:03 +00:00
..
Analysis Remove getDataLayout() from Instruction/GlobalValue/BasicBlock/Function 2015-03-03 22:01:13 +00:00
AsmParser Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default" 2015-03-03 21:18:16 +00:00
Bitcode Add missing includes. make_unique proliferated everywhere. 2015-03-01 21:28:53 +00:00
CodeGen Fix a problem where the TwoAddressInstructionPass generates redundant register moves in a loop. 2015-03-03 22:03:03 +00:00
DebugInfo [llvm-pdbdump] Many minor fixes and improvements 2015-03-02 04:39:56 +00:00
ExecutionEngine Add missing includes. make_unique proliferated everywhere. 2015-03-01 21:28:53 +00:00
Fuzzer [fuzzer] one more experimental search mode: -use_coverage_pairs=1 2015-02-20 03:02:37 +00:00
IR Remove getDataLayout() from Instruction/GlobalValue/BasicBlock/Function 2015-03-03 22:01:13 +00:00
IRReader Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. 2015-02-11 03:28:02 +00:00
LineEditor Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. 2015-02-11 03:28:02 +00:00
Linker Restore LLVMLinkModules C API until it is properly deprecated. 2015-03-02 18:59:38 +00:00
LTO [LTO API] fix memory leakage introduced at r230290. 2015-02-25 21:20:53 +00:00
MC Remove useless .debug_macinfo section setup. 2015-03-02 19:52:42 +00:00
Object Use read{16,32,64}{le,be}() instead of *reinterpret_cast<u{little,big}{16,32,64}_t>(). 2015-03-02 21:19:12 +00:00
Option Remove explicit no-op dtor in favor of the implicit dtor so as not to disable/deprecate the copy operations. 2015-03-03 19:53:02 +00:00
ProfileData Add missing includes. make_unique proliferated everywhere. 2015-03-01 21:28:53 +00:00
Support Make Triple::getOSVersion make sense for Android. 2015-03-03 18:23:51 +00:00
TableGen Add missing includes. make_unique proliferated everywhere. 2015-03-01 21:28:53 +00:00
Target Revert "Remove the explicit SDNodeIterator::operator= in favor of the implicit default" 2015-03-03 21:18:16 +00:00
Transforms RewriteStatepointsForGC::PhiState: Remove explicit copy ctor in favor of the Rule of Zero 2015-03-03 21:49:07 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile