llvm-mirror/test/Transforms/LoopUnroll/peel-loop-inner.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -S -passes='require<opt-remark-emit>,loop-unroll<peeling;no-runtime>,simplify-cfg,instcombine' -unroll-force-peel-count=3 -verify-dom-info | FileCheck %s
;
; Force-peel three iterations of the inner loop of a two-level loop nest.
; After simplify-cfg and instcombine, the peeled copies collapse into a single
; guard (K > 3) in the outer loop, and the remaining inner loop starts at j = 3.
; The !llvm.loop metadata must stay attached to the inner loop's latch branch.

define void @basic(i32 %K, i32 %N) {
; CHECK-LABEL: @basic(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    br label [[OUTER:%.*]]
; CHECK:       outer:
; CHECK-NEXT:    [[I:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[I_INC:%.*]], [[OUTER_BACKEDGE:%.*]] ]
; CHECK-NEXT:    [[CMP_INNER_PEEL8:%.*]] = icmp sgt i32 [[K:%.*]], 3
; CHECK-NEXT:    br i1 [[CMP_INNER_PEEL8]], label [[INNER:%.*]], label [[OUTER_BACKEDGE]]
; CHECK:       inner:
; CHECK-NEXT:    [[J:%.*]] = phi i32 [ [[J_INC:%.*]], [[INNER]] ], [ 3, [[OUTER]] ]
; CHECK-NEXT:    [[J_INC]] = add nuw nsw i32 [[J]], 1
; CHECK-NEXT:    [[CMP_INNER:%.*]] = icmp slt i32 [[J_INC]], [[K]]
; CHECK-NEXT:    br i1 [[CMP_INNER]], label [[INNER]], label [[OUTER_BACKEDGE]], [[LOOP0:!llvm.loop !.*]]
; CHECK:       outer.backedge:
; CHECK-NEXT:    [[I_INC]] = add i32 [[I]], 1
; CHECK-NEXT:    [[CMP_OUTER:%.*]] = icmp slt i32 [[I_INC]], [[N:%.*]]
; CHECK-NEXT:    br i1 [[CMP_OUTER]], label [[OUTER]], label [[END:%.*]]
; CHECK:       end:
; CHECK-NEXT:    ret void
;
entry:
  br label %outer

outer:
  %i = phi i32 [ 0, %entry ], [ %i.inc, %outer.backedge ]
  br label %inner

inner:
  %j = phi i32 [ 0, %outer ], [ %j.inc, %inner ]
  %j.inc = add i32 %j, 1
  %cmp.inner = icmp slt i32 %j.inc, %K
  br i1 %cmp.inner, label %inner, label %outer.backedge, !llvm.loop !1

outer.backedge:
  %i.inc = add i32 %i, 1
  %cmp.outer = icmp slt i32 %i.inc, %N
  br i1 %cmp.outer, label %outer, label %end

end:
  ret void
}

!1 = distinct !{!1}