1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

16 Commits

Author SHA1 Message Date
Pete Cooper
9f89f00988 DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Lang Hames
7d298105e5 Rename fp-op fusion option (yet again) for compatibility with GCC option.
llvm-svn: 159042
2012-06-22 22:31:00 +00:00
Lang Hames
68cf87e3ef Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Lang Hames
662801dbc8 Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
llvm-svn: 158902
2012-06-21 06:10:00 +00:00
Lang Hames
f0b9601a6d Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Owen Anderson
060176473a Make this testcase independent of register allocation.
llvm-svn: 157761
2012-05-31 18:07:02 +00:00
Owen Anderson
38e510d831 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson
49ddb93c30 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Owen Anderson
e3d41b44cc Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
llvm-svn: 156029
2012-05-02 22:17:40 +00:00
Lang Hames
7d83af4ed0 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Evan Cheng
3499593c7e On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly.
llvm-svn: 154689
2012-04-13 18:59:28 +00:00
Evan Cheng
f138fb4599 Add more fused mul+add/sub patterns. rdar://10139676
llvm-svn: 154484
2012-04-11 06:59:47 +00:00
Evan Cheng
b5291aea18 Match (fneg (fma) to vfnma. rdar://10139676
llvm-svn: 154469
2012-04-11 01:21:25 +00:00
Evan Cheng
eaf8eba8c4 Merge fma.ll into fusedMAC.ll
llvm-svn: 154466
2012-04-11 01:03:11 +00:00
Sebastian Pop
e6eeed8151 updated patch for the ARM fused multiply add/sub
In this update:
- I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2.
- I kept setting .fpu=neon-vfpv4 code attribute because that is what the
assembler understands.

Patch by Ana Pazos <apazos@codeaurora.org>

llvm-svn: 152036
2012-03-05 17:39:52 +00:00
Anton Korobeynikov
76b0745f6c Add fused multiple+add instructions from VFPv4.
Patch by Ana Pazos!

llvm-svn: 148658
2012-01-22 12:07:33 +00:00