1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00
llvm-mirror/test/CodeGen
James Molloy c2be58dbb0 [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.

llvm-svn: 215199
2014-08-08 12:33:21 +00:00
..
AArch64 [AArch64] Add an FP load balancing pass for Cortex-A57 2014-08-08 12:33:21 +00:00
ARM Make these regexes stricter by disallowing any additional characters in the output. 2014-08-07 23:04:07 +00:00
CPP
Generic Use "weak alias" instead of "alias weak" 2014-07-30 22:51:54 +00:00
Hexagon DebugInfo: Assert that any CU for which debug_loc lists are emitted, has at least one range. 2014-08-06 00:21:25 +00:00
Inputs
Mips fix materialization of one bit constants and global values which are accessed through 2014-08-07 22:09:01 +00:00
MSP430
NVPTX
PowerPC [PowerPC] Swap arguments and adjust shift count for vsldoi on little endian 2014-08-05 20:47:25 +00:00
R600 R600: Cleanup fadd and fsub tests 2014-08-06 20:27:55 +00:00
SPARC
SystemZ
Thumb [ARM] In dynamic-no-pic mode, ARM's post-RA pseudo expansion was incorrectly 2014-08-02 05:40:40 +00:00
Thumb2 ARM: do not generate BLX instructions on Cortex-M CPUs. 2014-08-06 11:13:14 +00:00
X86 [pr19635] Revert most of r170537, and add new testcase. 2014-08-08 08:21:19 +00:00
XCore