1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

21466 Commits

Author SHA1 Message Date
Hal Finkel
1424f01791 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

llvm-svn: 158223
2012-06-08 19:19:53 +00:00
Hal Finkel
44387d4528 Mark the PPC CTRRC and CTRRC8 register classes as non-allocatable.
Marking these classes as non-alocatable allows CTR loop generation to
work correctly with the block placement passes, etc. These register
classes are currently used only by some unused TCRETURN patterns.
In future cleanup, these will be removed.

Thanks again to Jakob for suggesting this fix to the CTR loop problem!

llvm-svn: 158221
2012-06-08 19:02:08 +00:00
Manman Ren
186346ff90 Enable optimization for integer ABS on X86 if Subtarget has CMOV.
llvm-svn: 158220
2012-06-08 18:58:26 +00:00
Andrew Trick
d4053e5a11 Fix Target->Codegen dependence.
Bulk move of TargetInstrInfo implementation into
TargetInstrInfoImpl. This is dirty because the code isn't part of
TargetInstrInfoImpl class, nor should it be, because the methods are
not target hooks. However, it's the current mechanism for keeping
libTarget useful outside the backend. You'll get a not-so-nice link
error if you invoke a TargetInstrInfo method that depends on CodeGen.

The TargetInstrInfoImpl class should probably be removed since it
doesn't really solve this problem.

To really fix this, we probably need separate interfaces for the
CodeGen/nonCodeGen sides of TargetInstrInfo.

llvm-svn: 158212
2012-06-08 17:23:27 +00:00
Hal Finkel
d05ff520b8 Disable the PPC CTR-Loops pass by default.
The pass itself works well, but the something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

llvm-svn: 158206
2012-06-08 15:38:25 +00:00
Hal Finkel
a6629c556e Fix a bug in the new PPC CTR-Loops pass.
The code which tests for an induction operation cannot assume that any
ADDI instruction will have a register operand because the operand could
also be a frame index; for example:
    %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16

llvm-svn: 158205
2012-06-08 15:38:23 +00:00
Hal Finkel
bb4e499e94 Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon
pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are
no longer otherwise used. Also, invalid preheader DebugLoc is not used.

llvm-svn: 158204
2012-06-08 15:38:21 +00:00
Manman Ren
f51a6d5fae X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237

llvm-svn: 158175
2012-06-07 22:39:10 +00:00
Nadav Rotem
7d996eafba Do not optimize the used bits of the x86 vselect condition operand, when the condition operand is a vector of 1-bit predicates.
This may happen on MIC devices.

llvm-svn: 158168
2012-06-07 20:53:48 +00:00
Andrew Trick
4fe40f02fd Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency.
llvm-svn: 158164
2012-06-07 19:42:04 +00:00
Andrew Trick
1429abc731 ARM getOperandLatency rewrite.
Match expectations of the new latency API. Cleanup and make the logic consistent.

llvm-svn: 158163
2012-06-07 19:42:00 +00:00
Andrew Trick
cbd1d0a130 ARM getOperandLatency should return -1 for unknown, consistent with API
llvm-svn: 158162
2012-06-07 19:41:58 +00:00
Andrew Trick
d0f85a1d12 Fix ARM getInstrLatency logic to work with the current API.
llvm-svn: 158161
2012-06-07 19:41:55 +00:00
Manman Ren
c8e46bcf47 PR13046: we can't replace usage of SUB with CMP in the lowering phase.
It will cause assertion failure later on.

llvm-svn: 158160
2012-06-07 19:27:33 +00:00
Rafael Espindola
4c9d611360 Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

llvm-svn: 158158
2012-06-07 18:39:19 +00:00
Manman Ren
1d91fc3342 X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 158126
2012-06-07 00:42:47 +00:00
Manman Ren
f591de61da Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Benjamin Kramer
58b98297ac Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer
d93c18846c Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Chad Rosier
5a354cd5e8 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434

llvm-svn: 158087
2012-06-06 17:37:40 +00:00
Richard Barton
eedaf01042 Correct decoder for T1 conditional B encoding
llvm-svn: 158055
2012-06-06 09:12:53 +00:00
Craig Topper
8d23b98818 Mark several instructions SSE2 instead of SSE3 as they should be.
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Andrew Trick
24cce40009 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Yuan Lin
00b608400c Fix header file include order in NVPTX backend NV_CONTRIB
llvm-svn: 158013
2012-06-05 19:06:13 +00:00
Roman Divacky
f8e2e4beaa PPC32 uses R2 as the TLS register. Fix the copy and paste.
llvm-svn: 158004
2012-06-05 17:14:17 +00:00
Andrew Trick
6d6fa07808 X86 itinerary properties.
llvm-svn: 157981
2012-06-05 03:44:46 +00:00
Andrew Trick
80ddb55a53 ARM itinerary properties.
llvm-svn: 157980
2012-06-05 03:44:43 +00:00
Andrew Trick
e7159e6731 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick
8b333df134 whitespace
llvm-svn: 157976
2012-06-05 03:44:29 +00:00
Joel Jones
6fb688ff18 Revert commit r157966
llvm-svn: 157972
2012-06-05 00:47:21 +00:00
Joel Jones
fdd5284d04 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>

llvm-svn: 157966
2012-06-04 23:38:57 +00:00
Akira Hatanaka
f860bcf5fc Fix a bug in MipsTargetLowering::LowerLOAD. A shift-right-logical node is
inserted after the shift-left-logical node.

llvm-svn: 157937
2012-06-04 17:46:29 +00:00
Roman Divacky
0daa2c0556 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Hans Wennborg
ff55c96bc2 MIPS TLS: use the model selected by TargetMachine::getTLSModel().
This was mostly done already in r156162, but I missed one place.

llvm-svn: 157929
2012-06-04 14:02:08 +00:00
Hans Wennborg
60ee5bbc4f Better comments for TLS-related X86 MachineOperand flags.
llvm-svn: 157920
2012-06-04 09:55:36 +00:00
Craig Topper
52bf0cfb27 Add intrinsic forms for FMA instructions to opcode folding tables.
llvm-svn: 157917
2012-06-04 07:46:16 +00:00
Craig Topper
eb2d859f52 Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions.
llvm-svn: 157914
2012-06-04 07:08:21 +00:00
Hal Finkel
c1daf7daf8 Fix a copy-and-paste duplication error in the PPC 440 and A2 schedules (no functionality change).
llvm-svn: 157912
2012-06-04 02:39:52 +00:00
Hal Finkel
c1fe73fae2 Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

llvm-svn: 157911
2012-06-04 02:21:00 +00:00
Craig Topper
5837bcfc02 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper
8d3031fa46 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Manman Ren
c3a6de9953 Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Craig Topper
685b86b007 Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Craig Topper
e783584ea7 Add neverHasSideEffects and mayLoad to FMA3 instructions.
llvm-svn: 157894
2012-06-03 00:30:49 +00:00
Benjamin Kramer
bb30e1face Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Chris Lattner
68489fe6b7 remove an unused variable.
llvm-svn: 157872
2012-06-02 01:03:42 +00:00
Akira Hatanaka
378d42a0b3 Remove code which is no longer needed in MipsAsmPrinter and MipsMCInstLower.
llvm-svn: 157867
2012-06-02 00:05:11 +00:00
Akira Hatanaka
334dbca66f Set operation actions for load/store nodes in the Mips backend.
llvm-svn: 157866
2012-06-02 00:04:42 +00:00
Akira Hatanaka
b2bdf54ad5 Add definitions of 32/64-bit unaligned load/store instructions for Mips.
llvm-svn: 157865
2012-06-02 00:04:19 +00:00
Akira Hatanaka
b7342fea3a Define functions MipsTargetLowering::LowerLOAD and LowerSTORE which
custom-lower unaligned load and store nodes.

llvm-svn: 157864
2012-06-02 00:03:49 +00:00