mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2025-02-01 05:01:59 +01:00
4f713f8431
Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672
Analysis Opportunities: //===---------------------------------------------------------------------===// In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the ScalarEvolution expression for %r is this: {1,+,3,+,2}<loop> Outside the loop, this could be evaluated simply as (%n * %n), however ScalarEvolution currently evaluates it as (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n)) In addition to being much more complicated, it involves i65 arithmetic, which is very inefficient when expanded into code. //===---------------------------------------------------------------------===// In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is forming this expression: ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32))) This could be folded to (-1 * (trunc i64 undef to i32)) //===---------------------------------------------------------------------===//