llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
David Meyer	c50cb2f15a	Remove NaClMode llvm-svn: 142338	2011-10-18 05:29:23 +00:00
James Molloy	c4fcff419c	Check in a patch that has already been code reviewed by Owen that I'd forgotten to commit. Build on previous patches to successfully distinguish between an M-series and A/R-series MSR and MRS instruction. These take different mask names and have a slightly different opcode format. Add decoder and disassembler tests. Improvement on the previous patch - successfully distinguish between valid v6m and v7m masks (one is a subset of the other). The patch had to be edited slightly to apply to ToT. llvm-svn: 140696	2011-09-28 14:21:38 +00:00
Evan Cheng	ead45e2ba6	Fix a bug introduced during refactoring a couple of months ago. Cortex-M3 does not support Thumb2 dsp instructions. rdar://10152911. llvm-svn: 140181	2011-09-20 21:38:18 +00:00
Nick Lewycky	9b5a242546	Add a new MC bit for NaCl (Native Client) mode. NaCl requires that certain instructions are more aligned than the CPU requires, and adds some additional directives, to follow in future patches. Patch by David Meyer! llvm-svn: 139125	2011-09-05 21:51:43 +00:00
Nick Lewycky	59cb9e0d85	Remove stray fullstop. llvm-svn: 138589	2011-08-25 21:46:20 +00:00
Evan Cheng	faa59a86f1	Rename attribute 'thumb' to a more descriptive 'thumb-mode'. llvm-svn: 134626	2011-07-07 19:05:12 +00:00
Evan Cheng	ac7afa4009	Sink feature IsThumb into MC layer. llvm-svn: 134608	2011-07-07 08:26:46 +00:00
Evan Cheng	952943f744	Change some ARM subtarget features to be single bit yes/no in order to sink them down to MC layer. Also fix tests. llvm-svn: 134590	2011-07-07 03:55:05 +00:00
Evan Cheng	9581f645b3	Factor ARM triple parsing out of ARMSubtarget. Another step towards making ARM subtarget info available to MC. llvm-svn: 134569	2011-07-07 00:08:19 +00:00
Jim Grosbach	461adc233e	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* architecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Bob Wilson	3daeb462cb	This patch combines several changes from Evan Cheng for rdar://8659675. Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. llvm-svn: 129775	2011-04-19 18:11:57 +00:00
Bob Wilson	56f64ab701	Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637. llvm-svn: 129774	2011-04-19 18:11:52 +00:00
Bob Wilson	0cbbc50f26	Avoid some 's' 16-bit instruction which partially update CPSR (and add false dependency) when it isn't dependent on last CPSR defining instruction. rdar://8928208 llvm-svn: 129773	2011-04-19 18:11:49 +00:00
Evan Cheng	64850406cf	Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 llvm-svn: 128665	2011-03-31 19:38:48 +00:00
Bob Wilson	438a9a1367	Add Neon VCVT instructions for f32 <-> f16 conversions. Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427. llvm-svn: 121902	2010-12-15 22:14:12 +00:00
Evan Cheng	12561e250d	Code clean up. llvm-svn: 120965	2010-12-05 23:03:45 +00:00
Evan Cheng	fc78767730	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Evan Cheng	b565d1acf9	Add some missing isel predicates on def : pat patterns to avoid generating VFP vmla / vmls (they cause stalls). Disabling them in isel is probably not the right solution, I'll look into a proper solution next. llvm-svn: 118922	2010-11-12 20:32:20 +00:00
Evan Cheng	eab7251695	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. llvm-svn: 118160	2010-11-03 06:34:55 +00:00
Bob Wilson	bbb91c6a1c	PR8359: The ARM backend may end up allocating registers D16 to D31 when "-mattr=+vfp3" is specified. However, this will not work for hardware that only supports 16 registers. Add a new flag to support "-mattr=+vfp3,+d16". Patch by Jan Voung! llvm-svn: 116310	2010-10-12 16:22:47 +00:00
Jim Grosbach	efad965653	Nuke it from orbit. It's the only way to be sure. (Kill the dead non-MC asm printer for the ARM target.) llvm-svn: 115127	2010-09-30 01:57:53 +00:00
Evan Cheng	c9cb37516d	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should not treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Jim Grosbach	1d9631950f	80 column cleanup. llvm-svn: 111266	2010-08-17 18:39:16 +00:00
Chris Lattner	a556264e06	fix emacs language spec's, patch by Edmund Grimley-Evans! llvm-svn: 111241	2010-08-17 16:20:04 +00:00
Jim Grosbach	1128a47289	cortex m4 has floating point support, but only single precision. llvm-svn: 110810	2010-08-11 15:44:15 +00:00
Evan Cheng	f8604b772e	Report error if codegen tries to instantiate an ARM target when the cpu does not support it. e.g. cortex-m* processors. llvm-svn: 110798	2010-08-11 07:17:46 +00:00
Evan Cheng	4929ba9d20	ArchV7M implies HW division instructions. llvm-svn: 110797	2010-08-11 07:00:16 +00:00
Evan Cheng	31e15214c6	ArchV6T2, V7A, and V7M implies Thumb2; Archv7A implies NEON. llvm-svn: 110796	2010-08-11 06:57:53 +00:00
Evan Cheng	273160895e	Add ARM Archv6M and let it imply FeatureDB (having dmb, etc.) llvm-svn: 110795	2010-08-11 06:51:54 +00:00
Evan Cheng	e5bab36c75	Add Cortex-M0 support. It's an ARMv6m device (no ARM mode) with some 32-bit instructions: dmb, dsb, isb, msr, and mrs. llvm-svn: 110786	2010-08-11 06:30:38 +00:00
Evan Cheng	5fca4ca5f9	- Add subtarget feature -mattr=+db which determines whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. llvm-svn: 110785	2010-08-11 06:22:01 +00:00
Evan Cheng	15d23d4966	Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. llvm-svn: 110584	2010-08-09 18:35:19 +00:00
Evan Cheng	67743f2057	Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles). llvm-svn: 108256	2010-07-13 19:21:50 +00:00
Jim Grosbach	e04cc6cb43	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136	2010-05-05 23:44:43 +00:00
Jim Grosbach	3630aff780	Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by Jordy <snhjordy@gmail.com>. Followup patches will add some tests and adjust to use Subtarget features for the instructions. llvm-svn: 103119	2010-05-05 20:44:35 +00:00
Anton Korobeynikov	7a3b393469	Some bits of A9 scheduling: VFP llvm-svn: 100643	2010-04-07 18:19:18 +00:00
Jakob Stoklund Olesen	4c043c50fd	Replace TSFlagsFields and TSFlagsShifts with a simpler TSFlags field. When a target instruction wants to set target-specific flags, it should simply set bits in the TSFlags bit vector defined in the Instruction TableGen class. This works well because TableGen resolves member references late: class I : Instruction { AddrMode AM = AddrModeNone; let TSFlags{3-0} = AM.Value; } let AM = AddrMode4 in def ADD : I; TSFlags gets the expected bits from AddrMode4 in this example. llvm-svn: 100384	2010-04-05 03:10:20 +00:00
Jim Grosbach	8ed11c8bed	vml[as] are slow on 1136jf-s also. llvm-svn: 100066	2010-04-01 00:13:43 +00:00
Jim Grosbach	2a0b14a387	switch the flag for using NEON for SP floating point to a subtarget 'feature'. Re-commit. This time complete with testsuite updates. llvm-svn: 99570	2010-03-25 23:47:34 +00:00
Jim Grosbach	97d5bc2b86	need to fix 'make check' tests first. revert for a moment. llvm-svn: 99569	2010-03-25 23:34:05 +00:00
Jim Grosbach	7e87ba79e6	switch the flag for using NEON for SP floating point to a subtarget 'feature' llvm-svn: 99568	2010-03-25 23:32:19 +00:00
Jim Grosbach	b97ff2a4c1	switch the use-vml[as] instructions flag to a subtarget 'feature' llvm-svn: 99565	2010-03-25 23:11:16 +00:00
Anton Korobeynikov	90fcfccc91	Add subtarget feature for FP16 llvm-svn: 98503	2010-03-14 18:42:38 +00:00
David Goodwin	6b56e77397	Add ARMv6 itineraries. llvm-svn: 89218	2009-11-18 18:39:57 +00:00
Anton Korobeynikov	3ba3789153	Use NEON reg-reg moves, where profitable. This reduces "domain-cross" stalls, when we used to mix vfp and neon code (the former were used for reg-reg moves) llvm-svn: 85764	2009-11-02 00:10:38 +00:00
David Goodwin	a4b73e486e	Remove neonfp attribute and instead set default based on CPU string. Add -arm-use-neon-fp to override the default. llvm-svn: 83218	2009-10-01 22:19:57 +00:00
David Goodwin	d0edce4c0d	Restore the -post-RA-scheduler flag as an override for the target specification. Remove -mattr for setting PostRAScheduler enable and instead use CPU string. llvm-svn: 83215	2009-10-01 21:46:35 +00:00
David Goodwin	a282690f82	Remove -post-RA-schedule flag and add a TargetSubtarget method to enable post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8. llvm-svn: 83122	2009-09-30 00:10:16 +00:00
David Goodwin	1d72b88015	Checkpoint NEON scheduling itineraries. llvm-svn: 82657	2009-09-23 21:38:08 +00:00
David Goodwin	2686178489	Allow a zero cycle stage to reserve/require a FU without advancing the cycle counter. llvm-svn: 78736	2009-08-11 22:38:43 +00:00
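
The TSFlags change in commit 4c043c50fd packs a per-instruction field into bits 3-0 of a flags vector and relies on the field being resolved late, so that a later `let AM = AddrMode4` still reaches `TSFlags`. The Python sketch below is only a loose analogy of that idea, not LLVM or TableGen code: the helper names, the classes, and the numeric value chosen for `ADDR_MODE_4` are all assumptions made for illustration.

```python
# Hypothetical illustration of "let TSFlags{3-0} = AM.Value".
# None of these names are LLVM APIs; the enum values are assumed.
ADDR_MODE_NONE = 0
ADDR_MODE_4 = 4


def set_bits(flags, hi, lo, value):
    """Return flags with bits hi..lo (inclusive) replaced by value."""
    width = hi - lo + 1
    if value >> width:
        raise ValueError("value does not fit in the field")
    mask = ((1 << width) - 1) << lo
    return (flags & ~mask) | (value << lo)


def get_bits(flags, hi, lo):
    """Extract bits hi..lo (inclusive) from flags."""
    width = hi - lo + 1
    return (flags >> lo) & ((1 << width) - 1)


class Instruction:
    # Default addressing mode, analogous to "AddrMode AM = AddrModeNone;".
    am = ADDR_MODE_NONE

    @property
    def ts_flags(self):
        # Computed on access, so an override of `am` in a subclass is seen
        # here; this mimics late resolution of the member reference.
        return set_bits(0, 3, 0, self.am)


class Add(Instruction):
    # Analogous to "let AM = AddrMode4 in def ADD : I;".
    am = ADDR_MODE_4


add_flags = Add().ts_flags
base_flags = Instruction().ts_flags
```

Because `ts_flags` is evaluated when it is read rather than when the base class is defined, `Add` picks up its own addressing mode without redefining the flag layout, which is the property the commit message attributes to TableGen.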


71 Commits