llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

History

Craig Topper 456baa52f2 [CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost We were overcounting the number of arithmetic operations needed at each level before we reach a legal type. We were using the full vector type for that level, but we are going to split the input vector at that level in half. So the effective arithmetic operation cost at that level is half the width. So for example on 8i32 on an sse target. Were were calculating the cost of an 8i32 op which is likely 2 for basic integer. Then after the loop we count 2 more v4i32 ops. For a total arith cost of 4. But if you look at the assembly there would only be 3 arithmetic ops. There are still more bugs in this code that I'm going to work on next. The non pairwise code shouldn't count extract subvectors in the loop. There are no extracts, the types are split in registers. For pairwise we need to use 2 two src permute shuffles. Differential Revision: https://reviews.llvm.org/D55397 llvm-svn: 348621	2018-12-07 18:20:56 +00:00
..
llvm	[CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost	2018-12-07 18:20:56 +00:00
llvm-c	[IR] Add a dedicated FNeg IR Instruction	2018-11-13 18:15:47 +00:00

Craig Topper 456baa52f2 [CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost

We were overcounting the number of arithmetic operations needed at each level before we reach a legal type. We were using the full vector type for that level, but we are going to split the input vector at that level in half. So the effective arithmetic operation cost at that level is half the width.

So for example on 8i32 on an sse target. Were were calculating the cost of an 8i32 op which is likely 2 for basic integer. Then after the loop we count 2 more v4i32 ops. For a total arith cost of 4. But if you look at the assembly there would only be 3 arithmetic ops.

There are still more bugs in this code that I'm going to work on next. The non pairwise code shouldn't count extract subvectors in the loop. There are no extracts, the types are split in registers. For pairwise we need to use 2 two src permute shuffles.

Differential Revision: https://reviews.llvm.org/D55397

llvm-svn: 348621

2018-12-07 18:20:56 +00:00

llvm

[CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost

2018-12-07 18:20:56 +00:00

llvm-c

[IR] Add a dedicated FNeg IR Instruction

2018-11-13 18:15:47 +00:00