llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00

Evan Cheng fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons.

1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-A8) cycles can cause an additional pipeline stall. So it's frequently better to simply codegen vmul + vadd.
2. A vmla followed by a vmul, vadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster.

Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough, but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute both an fmul and an fmla.
C. Add additional isel checks for vmla to avoid cases where a vmla feeds into fp instructions (except for the exceptional case #3 above).
D. Add an ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960		2010-12-05 22:04:16 +00:00
..
Analysis	Also ignore '()' while creating mdnode name from ObjC symbol name.	2010-12-03 23:40:45 +00:00
Archive	Merge System into Support.	2010-11-29 18:16:10 +00:00
AsmParser	Add a new 'hotpatch' attribute. This attribute will insert a two-byte no-op	2010-10-25 15:37:09 +00:00
Bitcode	Generalize the darwin wrapper hack to work with generic macho triples as well as darwin ones.	2010-11-29 23:29:54 +00:00
CodeGen	Remove the PHIElimination.h header, as it is no longer needed.	2010-12-05 21:39:42 +00:00
CompilerDriver	Now to chant the magical incantation that will exorcise the System library	2010-11-29 19:44:50 +00:00
ExecutionEngine	Remove unneeded zero arrays.	2010-12-04 15:28:22 +00:00
Linker	Merge System into Support.	2010-11-29 18:16:10 +00:00
MC	Once the layout is done we don't need to keep updating which fragments are	2010-12-04 22:47:22 +00:00
Object	Merge System into Support.	2010-11-29 18:16:10 +00:00
Support	Silence 'may be used uninitialized in this function' warnings. Static analysis	2010-12-04 20:20:34 +00:00
Target	Making use of VFP / NEON floating point multiply-accumulate / subtraction is	2010-12-05 22:04:16 +00:00
Transforms	Refactor jump threading.	2010-12-05 19:06:41 +00:00
VMCore	Fix PR 4170 by having ExtractValueInst::getIndexedType() reject out-of-bounds indexing.	2010-12-05 20:50:26 +00:00
Makefile	Add LLVMObject Library.	2010-11-15 03:21:41 +00:00