mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

157 Commits

Author SHA1 Message Date
Evan Cheng
9e8f90a020 Rename createAsmInfo to createMCAsmInfo and move registration code to MCTargetDesc to prepare for next round of changes.
llvm-svn: 135219
2011-07-14 23:50:31 +00:00
Evan Cheng
50f2d8d304 Eliminate asm parser's dependency on TargetMachine:
- Each target asm parser now creates its own MCSubtargetInfo (if needed).
- Changed AssemblerPredicate to take subtarget features which tablegen uses
  to generate asm matcher subtarget feature queries. e.g.
  "ModeThumb,FeatureThumb2" is translated to
  "(Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0".

llvm-svn: 134678
2011-07-08 01:53:10 +00:00
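
A minimal standalone sketch of the feature test that tablegen generates for such a predicate; the bit values and function name here are illustrative, not the generated code:

    #include <cstdint>

    // Illustrative feature bits; the real values are assigned by tablegen.
    enum : uint64_t {
      ModeThumb     = 1ULL << 0,
      FeatureThumb2 = 1ULL << 1,
    };

    // The query the asm matcher performs for the AssemblerPredicate
    // "ModeThumb,FeatureThumb2".
    static bool matchesPredicate(uint64_t Bits) {
      return (Bits & ModeThumb) != 0 && (Bits & FeatureThumb2) != 0;
    }
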
Evan Cheng
9581f645b3 Factor ARM triple parsing out of ARMSubtarget. Another step towards making ARM subtarget info available to MC.
llvm-svn: 134569
2011-07-07 00:08:19 +00:00
Evan Cheng
034261674b Fix the ridiculous SubtargetFeatures API where it implicitly expects the CPU name
to be encoded as the first feature. It then uses the CPU name to look up
features / scheduling itineraries even though clients know full well the CPU name
being used to query these properties.

The fix is to just have the clients explicitly pass the CPU name!

llvm-svn: 134127
2011-06-30 01:53:36 +00:00
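
A schematic before/after of the API change, with deliberately simplified signatures (initSubtarget is a hypothetical stand-in, not the LLVM entry point):

    #include <iostream>
    #include <string>

    // After the fix: the CPU name is an explicit parameter instead of
    // being smuggled in as the first "feature" of the feature string.
    static void initSubtarget(const std::string &CPU, const std::string &FS) {
      // Scheduling itineraries can be looked up directly by CPU name.
      std::cout << "cpu=" << CPU << " features=" << FS << "\n";
    }

    int main() {
      // Before the fix this would have been a single string along the
      // lines of "cortex-a8,+neon,+thumb2".
      initSubtarget("cortex-a8", "+neon,+thumb2");
    }
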
Evan Cheng
76b1239b24 Remove TargetOptions.h dependency from ARMSubtarget.
llvm-svn: 133738
2011-06-23 18:15:17 +00:00
Daniel Dunbar
1c0a0fde5c ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows()
predicates.

llvm-svn: 129816
2011-04-19 21:14:45 +00:00
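
A small sketch of what a call site looks like after moving to the predicates; this compiles against an LLVM tree of roughly that era (the Triple header has since moved):

    #include "llvm/ADT/Triple.h"

    static bool wantsDarwinBehavior(const llvm::Triple &T) {
      // Before: clients compared OS enum values by hand, e.g.
      //   T.getOS() == llvm::Triple::Darwin
      // After: one predicate covers the whole Darwin OS family.
      return T.isOSDarwin();
    }
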
Bob Wilson
3daeb462cb This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first few cycles (4? on Cortex-A8) can
   cause an additional pipeline stall. So it's frequently better to simply codegen
   vmul + vadd.
2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel has simply avoided codegen'ing fp vmla / vmls. This works well
enough but it isn't the optimal solution. This patch attempts to make it possible
to use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoiding cases where vmla feeds into
   fp instructions (except for the exceptional case in #3).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

llvm-svn: 129775
2011-04-19 18:11:57 +00:00
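
A toy standalone model of the expansion decision in points D/E; the opcode names and dependence test are illustrative only:

    #include <cstdio>

    enum Opcode { VMUL, VADD, VMLA, OTHER };

    // Back-to-back RAW vmla + vmla (case 3 above) is the pattern worth
    // expanding: turn the second vmla into vmul + vadd.
    static bool shouldExpandVMLA(Opcode Prev, bool PrevFeedsThis) {
      return Prev == VMLA && PrevFeedsThis;
    }

    int main() {
      if (shouldExpandVMLA(VMLA, /*PrevFeedsThis=*/true))
        std::puts("expand second vmla into vmul + vadd");
    }
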
Jim Grosbach
0510dc2765 Tidy up.
llvm-svn: 129034
2011-04-06 22:35:47 +00:00
NAKAMURA Takumi
00228d0c2c Triple::MinGW64 is deprecated and removed. We can use Triple::MinGW32 generally.
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. On the llvm side, i686 and x86_64 can be treated in a similar way.

llvm-svn: 125747
2011-02-17 12:24:17 +00:00
Rafael Espindola
547873da60 Add support for the --noexecstack option.
llvm-svn: 124077
2011-01-23 17:55:27 +00:00
Anton Korobeynikov
cf5967630b Rename TargetFrameInfo into TargetFrameLowering. Also, put a couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Evan Cheng
fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first few cycles (4? on Cortex-A8) can
   cause an additional pipeline stall. So it's frequently better to simply codegen
   vmul + vadd.
2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed by a vmla is a special case. Obviously, issuing back-to-back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel has simply avoided codegen'ing fp vmla / vmls. This works well
enough but it isn't the optimal solution. This patch attempts to make it possible
to use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoiding cases where vmla feeds into
   fp instructions (except for the exceptional case in #3).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
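
A toy model of the hazard-recognizer idea in point D: report a hazard when a dependent fp instruction would issue too soon after a vmla. The class, its state, and the 4-cycle window sketch the scheme, not the LLVM implementation:

    enum class Hazard { None, Stall };

    class VMLxHazardModel {
      int CyclesSinceVMLA = 1000; // large sentinel: no recent vmla
    public:
      void advanceCycle() { ++CyclesSinceVMLA; }
      void emittedVMLA() { CyclesSinceVMLA = 0; }
      Hazard check(bool IsFPOp, bool DependsOnVMLA) const {
        // A dependent fp op within ~4 cycles of a vmla stalls (point 2).
        if (IsFPOp && DependsOnVMLA && CyclesSinceVMLA < 4)
          return Hazard::Stall;
        return Hazard::None;
      }
    };
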
Chris Lattner
4e53e6e198 tidy up
llvm-svn: 119462
2010-11-17 05:41:32 +00:00
Anton Korobeynikov
76c52dcf44 First step of a huge frame-related refactoring: move emit{Prologue,Epilogue} out of TargetRegisterInfo to TargetFrameInfo, which is definitely a much more suitable place.
llvm-svn: 119097
2010-11-15 00:06:54 +00:00
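
A schematic of where the hooks end up; the class and method signatures are simplified from the LLVM interfaces of that era:

    class MachineFunction; // opaque for this sketch

    // Prologue/epilogue emission now lives on the frame-info class
    // instead of TargetRegisterInfo.
    class TargetFrameInfoSketch {
    public:
      virtual ~TargetFrameInfoSketch() = default;
      virtual void emitPrologue(MachineFunction &MF) const = 0;
      virtual void emitEpilogue(MachineFunction &MF) const = 0;
    };
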
Eric Christopher
474a93ff76 Revert the accidental commit I made reverting the previous commit.
llvm-svn: 118835
2010-11-11 20:50:14 +00:00
Eric Christopher
e7f27cf66a Revert this temporarily.
llvm-svn: 118827
2010-11-11 19:47:02 +00:00
Rafael Espindola
49d508fd19 Jim asked us to move DataLayout on ARM back to the most specialized classes. Do
so and also change X86 for consistency.

Investigating if this can be improved a bit.

llvm-svn: 115469
2010-10-03 18:59:45 +00:00
Jason W Kim
6d7784e5f5 I added a new file, ARMAsmBackend, which is stubbed out in a similar way to
the equivalent X86 class.
For now, I split ELFARMAsmBackend from DarwinARMAsmBackend
(also mimicking X86).

Tested against -r115126

llvm-svn: 115129
2010-09-30 02:17:26 +00:00
Nick Lewycky
b9daeb2fd0 Resolve this GCC warning:
ARMTargetMachine.cpp:53: error: control reaches end of non-void function

llvm-svn: 114992
2010-09-28 21:40:26 +00:00
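
The usual shape of this warning and its fix, in an illustrative function (not the actual ARMTargetMachine.cpp site): every case of the switch returns, but GCC cannot prove the enum is exhaustive:

    enum class FloatABI { Soft, Hard };

    static const char *abiName(FloatABI ABI) {
      switch (ABI) {
      case FloatABI::Soft: return "soft";
      case FloatABI::Hard: return "hard";
      }
      return "unknown"; // unreachable, but satisfies GCC
    }
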
Rafael Espindola
3e325614db Add additional stub framework for ARM MC ELF emission.
llc now recognizes the "intent" to support MC/obj emission for ARM, but
given that these are all stubs, it asserts on --filetype=obj --march=arm.

Patch by Jason Kim.

llvm-svn: 114856
2010-09-27 18:31:37 +00:00
Bob Wilson
ba02d5b620 Convert some VTBL and VTBX instructions to use pseudo instructions prior to
register allocation.  Remove the NEONPreAllocPass, which is no longer needed.
Yeah!!

llvm-svn: 113818
2010-09-13 23:55:10 +00:00
Evan Cheng
f8604b772e Report an error if codegen tries to instantiate an ARM target when the CPU does not support it, e.g. cortex-m* processors.
llvm-svn: 110798
2010-08-11 07:17:46 +00:00
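
A hedged sketch of such a guard; the function, the error mechanism, and the cortex-m check are illustrative, not the LLVM code:

    #include <stdexcept>
    #include <string>

    static void checkARMModeSupported(const std::string &CPU, bool ThumbMode) {
      // M-class cores (cortex-m*) execute Thumb only.
      const bool ThumbOnlyCPU = CPU.rfind("cortex-m", 0) == 0;
      if (ThumbOnlyCPU && !ThumbMode)
        throw std::runtime_error("CPU " + CPU + " does not support ARM mode");
    }
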
Evan Cheng
15d23d4966 Change -prefer-32bit-thumb to the attribute -mattr=+32bit instead, to disable more 32-bit to 16-bit optimizations.
llvm-svn: 110584
2010-08-09 18:35:19 +00:00
Evan Cheng
a04ba7588a Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation.
llvm-svn: 110579
2010-08-09 17:16:10 +00:00
Anton Korobeynikov
7ae895e007 Hook in GlobalMerge pass
llvm-svn: 109359
2010-07-24 21:52:08 +00:00
Evan Cheng
8c4e6ba789 Remove early IT block formation. It's not used.
llvm-svn: 107513
2010-07-02 21:07:09 +00:00
Bob Wilson
a779332a5e Add missing ARM and Thumb data layout info for vector types.
Radar 8128745.

llvm-svn: 106820
2010-06-25 04:41:08 +00:00
Evan Cheng
c3976de390 Oops. IT block formation pass needs to be run at any optimization level.
llvm-svn: 106775
2010-06-24 19:10:14 +00:00
Evan Cheng
e9ba3241a3 Move ARM if-conversion before post-ra scheduling.
llvm-svn: 106355
2010-06-18 23:32:07 +00:00
Evan Cheng
b5fadc47e0 Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in the if-converter, tail merging, and the
  post-allocation scheduler. The if-converter now runs branch folding / tail
  merging first to maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.

llvm-svn: 106344
2010-06-18 23:09:54 +00:00
Evan Cheng
46b89e05fd Make post-ra scheduling, anti-dep breaking, and the register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before the post-ra scheduler.
llvm-svn: 106091
2010-06-16 07:35:02 +00:00
Evan Cheng
9f80ad5683 Typo.
llvm-svn: 105677
2010-06-09 03:49:12 +00:00
Evan Cheng
a1940f03f6 Thumb2 IT blocks are fairly expensive. When there are multiple selects using
the same condition, it's important to make sure they are scheduled together
to avoid forming multiple IT blocks. I'm adding a pre-regalloc pass that forms
IT blocks early (by re-scheduling instructions and splitting basic blocks) to
attempt to fix this. This is not turned on by default since I am not sure this
is the right fix.

Another issue is that llvm selects are modeled as two-address conditional moves.
This can be very bad when the copies before the conditional moves are not
coalesced away. Teach IT formation pass to move the copies above the IT block
(when legal) to avoid breaking the IT block.

llvm-svn: 105669
2010-06-09 01:46:50 +00:00
Dan Gohman
fb6f4da0e0 Implement a bunch more TargetSelectionDAGInfo infrastructure.
Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and
EmitTargetCodeForMemmove out of TargetLowering and into
SelectionDAGInfo to exercise this.

llvm-svn: 103481
2010-05-11 17:31:57 +00:00
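
A schematic of the hook family being moved; parameter lists are heavily reduced from the real EmitTargetCodeFor* signatures:

    class SelectionDAG;      // opaque for this sketch
    struct SDValueSketch {}; // stand-in for llvm::SDValue

    class TargetSelectionDAGInfoSketch {
    public:
      virtual ~TargetSelectionDAGInfoSketch() = default;
      // Targets override these to emit custom code for the mem intrinsics;
      // a default-constructed value means "fall back to the generic path".
      virtual SDValueSketch emitMemcpy(SelectionDAG &) const { return {}; }
      virtual SDValueSketch emitMemset(SelectionDAG &) const { return {}; }
      virtual SDValueSketch emitMemmove(SelectionDAG &) const { return {}; }
    };
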
Anton Korobeynikov
e5c1563130 Remove a late ARM codegen optimization pass committed by accident.
It is not ready for the public yet.

llvm-svn: 100673
2010-04-07 18:23:27 +00:00
Anton Korobeynikov
51d00a7f48 Move the NEON-VFP domain fixer earlier, so the post-RA scheduler benefits from it.
llvm-svn: 100668
2010-04-07 18:21:46 +00:00
Anton Korobeynikov
aa8b0d11a5 An initial version of the global merger.
llvm-svn: 100641
2010-04-07 18:19:07 +00:00
Daniel Dunbar
2c99d47702 TargetRegistry: Fix create{AsmInfo,MCDisassembler} to return non-const objects.
llvm-svn: 99097
2010-03-20 22:36:22 +00:00
Chris Lattner
ec61994b1d remove dead code.
llvm-svn: 95134
2010-02-02 21:38:59 +00:00
Chris Lattner
21bcba21e7 eliminate all the dead addSimpleCodeEmitter implementations.
eliminate random "code emitter" stuff in Alpha, except for
the JIT path.  Next up, remove the template cruft.

llvm-svn: 95131
2010-02-02 21:31:47 +00:00
Jim Grosbach
034f69e0aa For aligned load/store instructions, it's only required to know whether a
function can support dynamic stack realignment. That's a much easier question
to answer at the instruction selection stage than whether the function actually
will have dynamic alignment prologue. This allows the removal of the
stack alignment heuristic pass, and improves code quality for cases where
the heuristic would result in dynamic alignment code being generated when
it was not strictly necessary.

llvm-svn: 93885
2010-01-19 18:31:11 +00:00
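
A sketch of the easier question isel now asks, with hypothetical names; the point is that "could this function be realigned?" depends only on static properties:

    struct FrameFactsSketch {
      bool HasVarSizedObjects;  // dynamic allocas pin the frame layout
      bool CanUseFramePointer;  // realignment needs a frame pointer
    };

    // Answerable at instruction selection time, unlike "will the prologue
    // actually realign?", which depends on the final stack layout.
    static bool canRealignStack(const FrameFactsSketch &F) {
      return F.CanUseFramePointer && !F.HasVarSizedObjects;
    }
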
Jim Grosbach
0e1230b23b Factor the stack alignment calculations out into a target independent pass.
No functionality change.

llvm-svn: 90336
2009-12-02 19:30:24 +00:00
Jim Grosbach
1aa571da3c Detect the need for stack auto-alignment earlier to catch spills more
conservatively. The eliminateFrameIndex() machinery is adjusted to handle addressing
mode 6 (vld1/vst1) used for spills. Fix tests to expect aligned Q-reg spilling.

llvm-svn: 88874
2009-11-15 21:45:34 +00:00
Chris Lattner
25421b6954 indicate what the native integer types for the target are.
Please verify.

llvm-svn: 86397
2009-11-07 19:07:32 +00:00
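
For context, this is the "n" component of an LLVM data-layout string; the surrounding fields below are abbreviated, not ARM's exact layout string at this revision:

    #include <string>

    static std::string dataLayoutSketch() {
      // "-n32" declares i32 as the target's native integer width;
      // a 64-bit target might use "-n32:64".
      return std::string("e-p:32:32") + "-n32";
    }
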
Evan Cheng
6e3e66375a - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which do a pc-relative
  load of a GV from the constant pool and then add pc. This makes the code sequence
  rematerializable so it can be hoisted by machine LICM.
- Add a late pass to break these pseudo instructions into a number of real
  instructions. Also move the code in the Thumb2 IT pass that breaks up t2MOVi32imm
  to this pass. This is done before post-regalloc scheduling to allow the
  scheduler to properly schedule these instructions. It also allows them to be
  if-converted and shrunk by later passes.

llvm-svn: 86304
2009-11-06 23:52:48 +00:00
Daniel Dunbar
4daaf9d3f4 Pass StringRef by value.
llvm-svn: 86251
2009-11-06 10:58:06 +00:00
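
The rationale, in a two-line example: StringRef is just a (pointer, length) pair, so copying it is as cheap as passing a reference and avoids an indirection:

    #include "llvm/ADT/StringRef.h"

    static bool startsWithDot(llvm::StringRef S) { // preferred: by value
      return !S.empty() && S.front() == '.';
    }
    // rather than: static bool startsWithDot(const llvm::StringRef &S);
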
Anton Korobeynikov
de7cbab064 Move the subtarget check earlier for the NEON reg-reg fixup pass.
llvm-svn: 85914
2009-11-03 18:46:11 +00:00
Anton Korobeynikov
ff29071cc6 Turn the NEON reg-reg move fixup code into a separate pass. This should reduce compile time.
llvm-svn: 85850
2009-11-03 01:04:26 +00:00
Bob Wilson
fc1194919b Revert r85346 change to control tail merging by CodeGenOpt::Level.
I'm going to redo this using the OptimizeForSize function attribute.

llvm-svn: 85426
2009-10-28 20:46:46 +00:00
Bob Wilson
98c9fb94ab Record CodeGen optimization level in the BranchFolding pass so that we can
use it to control tail merging when there is a tradeoff between performance
and code size.  When there is only 1 instruction in the common tail, we have
been merging.  That can be good for code size but is a definite loss for
performance.  Now we will avoid tail merging in that case when the
optimization level is "Aggressive", i.e., "-O3".  Radar 7338114.

Since the IfConversion pass invokes BranchFolding, it too needs to know
the optimization level.  Note that I removed the RegisterPass instantiation
for IfConversion because it required a default constructor.  If someone
wants to keep that for some reason, we can add a default constructor with
a hard-wired optimization level.

llvm-svn: 85346
2009-10-27 23:49:38 +00:00
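
A toy model of the policy described above; the enum and the length-one threshold sketch the size/speed tradeoff, not the BranchFolding code itself:

    enum class OptLevelSketch { None, Less, Default, Aggressive };

    static bool shouldTailMerge(unsigned CommonTailLen, OptLevelSketch OL) {
      // A one-instruction common tail saves size but costs performance,
      // so skip merging it at -O3 (Aggressive).
      if (CommonTailLen == 1 && OL == OptLevelSketch::Aggressive)
        return false;
      return CommonTailLen >= 1;
    }
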