1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

64 Commits

Author SHA1 Message Date
Benjamin Kramer
aff6a9e50a Initialize HasVMLxForwarding.
llvm-svn: 128709
2011-04-01 09:20:31 +00:00
Evan Cheng
6e3d087477 available_externally (hidden or not) GVs are always accessed via stubs. rdar://9027648.
llvm-svn: 126191
2011-02-22 06:58:34 +00:00
Evan Cheng
0dfe28a9b5 Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.

llvm-svn: 123991
2011-01-21 18:55:51 +00:00
Evan Cheng
7e2b414953 Don't forget to emit the load from indirect symbol when using movw + movt to materialize GA indirect symbols.
llvm-svn: 123809
2011-01-19 02:16:49 +00:00
Evan Cheng
53ec6fc591 Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.
movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0

It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.

llvm-svn: 123619
2011-01-17 08:03:18 +00:00
Evan Cheng
05ef00f4dc Clean up ARM subtarget code by using Triple ADT.
llvm-svn: 123276
2011-01-11 21:46:47 +00:00
Andrew Trick
3637733170 Fix the ARM IIC_iCMPsi itinerary and add an important assert.
llvm-svn: 122794
2011-01-04 00:32:57 +00:00
Andrew Trick
134b2a5907 Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541
2010-12-24 05:03:26 +00:00
Andrew Trick
53f4556c64 whitespace
llvm-svn: 122539
2010-12-24 04:28:06 +00:00
Evan Cheng
fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Bob Wilson
03a878a419 Define the subtarget feature for the architecture version,
as derived from the target triple.  This is important for enabling
features that are implied based on the architecture version.

llvm-svn: 118643
2010-11-09 22:50:47 +00:00
Evan Cheng
eab7251695 Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing.
llvm-svn: 118160
2010-11-03 06:34:55 +00:00
Bob Wilson
bbb91c6a1c PR8359: The ARM backend may end up allocating registers D16 to D31 when
"-mattr=+vfp3" is specified. However, this will not work for hardware that
only supports 16 registers.  Add a new flag to support -"mattr=+vfp3,+d16".
Patch by Jan Voung!

llvm-svn: 116310
2010-10-12 16:22:47 +00:00
Owen Anderson
c34e1296b8 Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise
cost modeling for if-conversion.  Now if only we had a way to estimate the misprediction probability.

Adjsut CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.

llvm-svn: 114995
2010-09-28 21:57:50 +00:00
Bob Wilson
dc396388cb Add a command line option "-arm-strict-align" to disallow unaligned memory
accesses for ARM targets that would otherwise allow it.  Radar 8465431.

llvm-svn: 114941
2010-09-28 04:09:35 +00:00
Evan Cheng
c9cb37516d Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.

llvm-svn: 113570
2010-09-10 01:29:16 +00:00
Jim Grosbach
1128a47289 cortex m4 has floating point support, but only single precision.
llvm-svn: 110810
2010-08-11 15:44:15 +00:00
Evan Cheng
f8604b772e Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
Evan Cheng
5fca4ca5f9 - Add subtarget feature -mattr=+db which determine whether an ARM cpu has the
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.

llvm-svn: 110785
2010-08-11 06:22:01 +00:00
Evan Cheng
b6b08dfca1 Explicitly initialize SlowFPBrcc and Pref32BitThumb to false.
llvm-svn: 110587
2010-08-09 19:19:36 +00:00
Jim Grosbach
e04cc6cb43 Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack
instructions to subtarget features and update tests to reflect.
PR5717.

llvm-svn: 103136
2010-05-05 23:44:43 +00:00
Jim Grosbach
3630aff780 Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by
Jordy <snhjordy@gmail.com>.

Followup patches will add some tests and adjust to use Subtarget features
for the instructions.

llvm-svn: 103119
2010-05-05 20:44:35 +00:00
Dan Gohman
0e0b8cf9fd Add const qualifiers to CodeGen's use of LLVM IR constructs.
llvm-svn: 101334
2010-04-15 01:51:59 +00:00
Jim Grosbach
2a0b14a387 switch the flag for using NEON for SP floating point to a subtarget 'feature'.
Re-commit. This time complete with testsuite updates.

llvm-svn: 99570
2010-03-25 23:47:34 +00:00
Jim Grosbach
97d5bc2b86 need to fix 'make check' tests first. revert for a moment.
llvm-svn: 99569
2010-03-25 23:34:05 +00:00
Jim Grosbach
7e87ba79e6 switch the flag for using NEON for SP floating point to a subtarget 'feature'
llvm-svn: 99568
2010-03-25 23:32:19 +00:00
Jim Grosbach
b97ff2a4c1 switch the use-vml[as] instructions flag to a subtarget 'feature'
llvm-svn: 99565
2010-03-25 23:11:16 +00:00
Jim Grosbach
0975d55c8e ARM cortex-a8 doesn't do vmla/vmls well. disable them by default for that cpu
llvm-svn: 99549
2010-03-25 20:48:50 +00:00
Jim Grosbach
d285f71b9a Make the use of the vmla and vmls VFP instructions controllable via cmd line.
Preliminary testing shows significant performance wins by not using these
instructions.

llvm-svn: 99436
2010-03-24 22:31:46 +00:00
Anton Korobeynikov
90fcfccc91 Add substarget feature for FP16
llvm-svn: 98503
2010-03-14 18:42:38 +00:00
Anton Korobeynikov
e0e616a74d Initial bits of ARMv4-only support.
Patch by John Tytgat!

llvm-svn: 97886
2010-03-06 19:39:36 +00:00
Jeffrey Yasskin
fb10587e50 Kill ModuleProvider and ghost linkage by inverting the relationship between
Modules and ModuleProviders. Because the "ModuleProvider" simply materializes
GlobalValues now, and doesn't provide modules, it's renamed to
"GVMaterializer". Code that used to need a ModuleProvider to materialize
Functions can now materialize the Functions directly. Functions no longer use a
magic linkage to record that they're materializable; they simply ask the
GVMaterializer.

Because the C ABI must never change, we can't remove LLVMModuleProviderRef or
the functions that refer to it. Instead, because Module now exposes the same
functionality ModuleProvider used to, we store a Module* in any
LLVMModuleProviderRef and translate in the wrapper methods.  The bindings to
other languages still use the ModuleProvider concept.  It would probably be
worth some time to update them to follow the C++ more closely, but I don't
intend to do it.

Fixes http://llvm.org/PR5737 and http://llvm.org/PR5735.

llvm-svn: 94686
2010-01-27 20:34:15 +00:00
Bob Wilson
b293fe32cb Remove isProfitableToDuplicateIndirectBranch target hook. It is profitable
for all the processors where I have tried it, and even when it might not help
performance, the cost is quite low.  The opportunities for duplicating
indirect branches are limited by other factors so code size does not change
much due to tail duplicating indirect branches aggressively.

llvm-svn: 90144
2009-11-30 18:35:03 +00:00
Anton Korobeynikov
0f885eb7fd Materialize global addresses via movt/movw pair, this is always better
than doing the same via constpool:
1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
2. Load from constpool might stall up to 300 cycles due to cache miss.
3. Movt/movw does not use load/store unit.
4. Less constpool entries => better compiler performance.

This is only enabled on ELF systems, since darwin does not have needed
relocations (yet).

llvm-svn: 89720
2009-11-24 00:44:37 +00:00
Bob Wilson
6b68bd153a Add a target hook to allow changing the tail duplication limit based on the
contents of the block to be duplicated.  Use this for ARM Cortex A8/9 to
be more aggressive tail duplicating indirect branches, since it makes it
much more likely that they will be predicted in the branch target buffer.
Testcase coming soon.

llvm-svn: 89187
2009-11-18 03:34:27 +00:00
David Goodwin
e1d06f2239 Allow target to specify regclass for which antideps will only be broken along the critical path.
llvm-svn: 88682
2009-11-13 19:52:48 +00:00
David Goodwin
93a4f29c67 Fixed to address code review. No functional changes.
llvm-svn: 86634
2009-11-10 00:48:55 +00:00
Evan Cheng
4df1ac5722 I am no spelling bee.
llvm-svn: 84250
2009-10-16 06:18:09 +00:00
Evan Cheng
b1580b5c48 Enable post-alloc scheduling for all ARM variants except for Thumb1.
llvm-svn: 84249
2009-10-16 06:11:08 +00:00
Evan Cheng
8685a8ec0c Add comment.
llvm-svn: 84246
2009-10-16 05:33:58 +00:00
David Goodwin
a4b73e486e Remove neonfp attribute and instead set default based on CPU string. Add -arm-use-neon-fp to override the default.
llvm-svn: 83218
2009-10-01 22:19:57 +00:00
David Goodwin
d0edce4c0d Restore the -post-RA-scheduler flag as an override for the target specification. Remove -mattr for setting PostRAScheduler enable and instead use CPU string.
llvm-svn: 83215
2009-10-01 21:46:35 +00:00
David Goodwin
a282690f82 Remove -post-RA-schedule flag and add a TargetSubtarget method to enable post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8.
llvm-svn: 83122
2009-09-30 00:10:16 +00:00
Evan Cheng
41e87f2f13 Reference to hidden symbols do not have to go through non-lazy pointer in non-pic mode. rdar://7187172.
llvm-svn: 80904
2009-09-03 07:04:02 +00:00
Evan Cheng
d7a07ab112 Let Darwin linker auto-synthesize stubs and lazy-pointers. This deletes a bunch of nasty code in ARM asm printer.
llvm-svn: 80404
2009-08-28 23:18:09 +00:00
Daniel Dunbar
3d2ac751da Remove some dead code.
llvm-svn: 78219
2009-08-05 18:12:37 +00:00
David Goodwin
99adffe5f2 Initial support for single-precision FP using NEON. Added "neonfp" attribute to enable. Added patterns for some binary FP operations.
llvm-svn: 78081
2009-08-04 17:53:06 +00:00
Daniel Dunbar
0b82c938fe Normalize Subtarget constructors to take a target triple string instead of
Module*.

Also, dropped uses of TargetMachine where unnecessary. The only target which
still takes a TargetMachine& is Mips, I would appreciate it if someone would
normalize this to match other targets.

llvm-svn: 77918
2009-08-02 22:11:08 +00:00
Evan Cheng
5ef6928dff Fix Thumb2 function call isel. Thumb1 and Thumb2 should share the same
instructions for calls since BL and BLX are always 32-bit long and BX is always
16-bit long.

Also, we should be using BLX to call external function stubs.

llvm-svn: 77756
2009-08-01 00:16:10 +00:00
Bob Wilson
6e81f12950 Use thumb2 for ARM architectures V6T2 and later. Fix a bug in checking
for "thumb" and add a check for V6T2.

llvm-svn: 73905
2009-06-22 21:28:22 +00:00