mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
llvm-mirror/lib
Ahmed Bougacha 1c71a2aac6 [AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs.
The current behavior is incorrect, as the two CCs returned by
changeFPCCToAArch64CC, intended to be OR'ed, are instead used
in an AND ccmp chain.

Consider:
define i32 @t(float %a, float %b, float %c, float %d, i32 %e, i32 %f) {
  %cc1 = fcmp one float %a, %b
  %cc2 = fcmp olt float %c, %d
  %and = and i1 %cc1, %cc2
  %r = select i1 %and, i32 %e, i32 %f
  ret i32 %r
}

Assuming (%a < %b) and (%c < %d), we used to do:
  fcmp  s0, s1            # nzcv <- 1000
  orr   w8, wzr, #0x1     # w8 <- 1
  csel  w9, w8, wzr, mi   # w9 <- 1
  csel  w8, w8, w9, gt    # w8 <- 1
  fcmp  s2, s3            # nzcv <- 1000
  cset  w9, mi            # w9 <- 1
  tst   w8, w9            # (w8 & w9) == 1, so: nzcv <- 0000
  csel  w0, w0, w1, ne    # w0 <- w0

We now do (incorrectly):
  fcmp  s2, s3            # nzcv <- 1000
  fccmp s0, s1, #0, mi    #  mi, so: nzcv <- 1000
  fccmp s0, s1, #8, le    #  le, so: nzcv <- 1000
  csel  w0, w0, w1, pl    # !pl, so: w0 <- w1

In other words, we transformed:
  (c < d) &&  ((a < b) || (a > b))
into:
  (c < d) &&   (a u>= b) && (a u<= b)
whereas, per De Morgan's, we wanted:
  (c < d) && !((a u>= b) && (a u<= b))
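
The fcmp/fccmp chain above can be replayed with a small flag model to
see the wrong select. This is only a sketch: it assumes the usual
AArch64 FP compare flag encodings (less 1000, equal 0110, greater 0010,
unordered 0011) and models only the mi, pl and le condition codes. The
helper names (fcmp, fccmp, chained_select, intended_select) are
hypothetical and exist only for this illustration.

```python
import math

def fcmp(a, b):
    """Model the (N, Z, C, V) flags an AArch64 FCMP sets."""
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 1, 1)   # unordered
    if a < b:
        return (1, 0, 0, 0)   # less than
    if a == b:
        return (0, 1, 1, 0)   # equal
    return (0, 0, 1, 0)       # greater than

# Only the condition codes used by the sequence are modelled.
COND = {
    "mi": lambda n, z, c, v: n == 1,
    "pl": lambda n, z, c, v: n == 0,
    "le": lambda n, z, c, v: not (z == 0 and n == v),
}

def fccmp(flags, a, b, imm, cond):
    """If cond holds on flags, compare a and b; otherwise load imm."""
    return fcmp(a, b) if COND[cond](*flags) else imm

def chained_select(a, b, c, d, e, f):
    flags = fcmp(c, d)                                # fcmp  s2, s3
    flags = fccmp(flags, a, b, (0, 0, 0, 0), "mi")    # fccmp s0, s1, #0, mi
    flags = fccmp(flags, a, b, (1, 0, 0, 0), "le")    # fccmp s0, s1, #8, le
    return e if COND["pl"](*flags) else f             # csel  w0, w0, w1, pl

def intended_select(a, b, c, d, e, f):
    one = (a < b) or (a > b)   # fcmp one: ordered and not equal
    return e if (one and c < d) else f

print(chained_select(1.0, 2.0, 3.0, 4.0, 10, 20))    # 20 (picks %f)
print(intended_select(1.0, 2.0, 3.0, 4.0, 10, 20))   # 10 (picks %e)
```

With (%a < %b) and (%c < %d), the chain selects %f although the IR
semantics call for %e.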

Note that this problem doesn't occur in the test-suite.

changeFPCCToAArch64CC produces disjunct CCs; here, one -> mi/gt.
We can't represent that in the fccmp chain; it can't express
arbitrary OR sequences, as one comment explains:
  In general we can create code for arbitrary "... (and (and A B) C)"
  sequences.  We can also implement some "or" expressions, because
  "(or A B)" is equivalent to "not (and (not A) (not B))" and we can
  implement some negation operations. [...] However there is no way
  to negate the result of a partial sequence.

Instead, introduce changeFPCCToANDAArch64CC, which produces the
conjunct cond codes:
- (a one b)
    == ((a olt b) || (a ogt b))
    == ((a ord b) && (a une b))
- (a ueq b)
    == ((a uno b) || (a oeq b))
    == ((a ule b) && (a uge b))
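
The two identities above can be checked exhaustively, since a
floating-point compare has only four possible outcomes. The sketch
below is illustrative only: it encodes each fcmp predicate as a
function of the outcome ("lt", "eq", "gt", or "un" for unordered), and
the table and function names are hypothetical.

```python
# The four possible outcomes of comparing a with b; "un" = unordered (NaN).
RELATIONS = ("lt", "eq", "gt", "un")

# fcmp predicates, expressed as functions of the comparison outcome.
PRED = {
    "olt": lambda r: r == "lt",
    "ogt": lambda r: r == "gt",
    "oeq": lambda r: r == "eq",
    "ord": lambda r: r != "un",
    "uno": lambda r: r == "un",
    "une": lambda r: r != "eq",   # unordered or not equal
    "ule": lambda r: r != "gt",   # unordered, less, or equal
    "uge": lambda r: r != "lt",   # unordered, greater, or equal
}

def one_disjunct(r):
    return PRED["olt"](r) or PRED["ogt"](r)

def one_conjunct(r):
    return PRED["ord"](r) and PRED["une"](r)

def ueq_disjunct(r):
    return PRED["uno"](r) or PRED["oeq"](r)

def ueq_conjunct(r):
    return PRED["ule"](r) and PRED["uge"](r)

identities_hold = all(
    one_disjunct(r) == one_conjunct(r) and ueq_disjunct(r) == ueq_conjunct(r)
    for r in RELATIONS
)
print(identities_hold)  # True
```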

Note that, at first, one might think that, when PushNegate is true,
we should use the disjunct CCs, in effect doing:
  (a || b)
  = !(!a && !(b))
  = !(!a && !(b1 || b2))  <- changeFPCCToAArch64CC(b, b1, b2)
  = !(!a && !b1 && !b2)

However, we can take advantage of the fact that the CC is already
negated, which lets us avoid special-casing PushNegate and use the
simpler-to-reason-about:

  (a || b)
  = !(!a && (!b))
  = !(!a && (b1 && b2))   <- changeFPCCToANDAArch64CC(!b, b1, b2)
  = !(!a && b1 && b2)

This makes both emitConditionalCompare cases behave identically,
and produces correct ccmp sequences for the 2-CC fcmps.
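
As a sanity check of the PushNegate derivation, take b = (x one y), so
that !b = (x ueq y), whose conjunct form is (x ule y) && (x uge y). The
sketch below compares (a || b) against !(!a && b1 && b2) over every
value of a and every comparison outcome. It models only the boolean
algebra, not the actual lowering code, and all names in it are
hypothetical.

```python
from itertools import product

# The four possible outcomes of comparing x with y; "un" = unordered (NaN).
RELATIONS = ("lt", "eq", "gt", "un")

def one(r):
    return r in ("lt", "gt")   # ordered and not equal

def ule(r):
    return r != "gt"           # unordered, less, or equal

def uge(r):
    return r != "lt"           # unordered, greater, or equal

def direct(a, r):
    # (a || b) with b = (x one y)
    return a or one(r)

def ccmp_form(a, r):
    # !b = (x ueq y) = b1 && b2, with b1 = ule and b2 = uge
    b1, b2 = ule(r), uge(r)
    return not ((not a) and b1 and b2)

equivalent = all(
    direct(a, r) == ccmp_form(a, r)
    for a, r in product((False, True), RELATIONS)
)
print(equivalent)  # True
```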

llvm-svn: 258533
2016-01-22 19:43:54 +00:00
Analysis [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType. 2016-01-22 03:08:27 +00:00
AsmParser Implemented Support of IA interrupt and exception handlers: 2015-12-21 14:07:14 +00:00
Bitcode [ThinLTO] Avoid unnecessary hash lookups during metadata linking (NFC) 2016-01-21 16:46:40 +00:00
CodeGen [WinEH] Make collectFuncletMembers non-recursive 2016-01-22 18:49:50 +00:00
DebugInfo Fix instance of -Wcovered-switch-default 2016-01-13 20:39:22 +00:00
ExecutionEngine [RuntimeDyld][AArch64] Add support for the MachO ARM64_RELOC_SUBTRACTOR reloc. 2016-01-21 21:59:50 +00:00
Fuzzer Revert r258473 as it's breaking the build with libc++ 2016-01-22 03:21:52 +00:00
IR Replace Type::getInt32Ty() and comparison by isIntegerTy(32). NFC. 2016-01-22 03:30:27 +00:00
IRReader [ThinLTO] Metadata linking for imported functions 2015-12-17 17:14:09 +00:00
LibDriver
LineEditor
Linker [ThinLTO] Do metadata linking during batch function importing 2016-01-22 00:15:53 +00:00
LTO [LTO] Fix error reporting when a file passed to libLTO is invalid or non-existent 2016-01-20 09:03:42 +00:00
MC Rename MCLineEntry to MCDwarfLineEntry 2016-01-21 01:59:03 +00:00
Object Fix MachOObjectFile::getSymbolName() to not call report_fatal_error() 2016-01-22 18:47:14 +00:00
Option Convert Arg, ArgList, and Option to dump() to dbgs() rather than errs(). 2015-12-18 18:55:26 +00:00
Passes [attrs] Extract the pure inference of function attributes into 2015-12-27 08:41:34 +00:00
ProfileData [PGO] eliminate use of static variable 2016-01-22 05:48:40 +00:00
Support AMDGPU: Fix getArchTypePrefix 2016-01-22 19:09:12 +00:00
TableGen [TableGen] Use FoldingSets instead of DenseMaps to unique UnOpInit, BinOpInit and TernOpInit. This removes the memory needed to store the key for the DenseMap. NFC 2016-01-18 20:36:06 +00:00
Target [AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs. 2016-01-22 19:43:54 +00:00
Transforms [RS4GC] Use OB_deopt instead of "deopt" 2016-01-22 19:20:40 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile