llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Bill Schmidt	049ba390f5	Large code model support for PowerPC. Large code model is identical to medium code model except that the addis/addi sequence for "local" accesses is never used. All accesses use the addis/ld sequence. The coding changes are straightforward; most of the patch is taken up with creating variants of the medium model tests for large model. llvm-svn: 175767	2013-02-21 17:12:27 +00:00
Bill Schmidt	c895316923	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. llvm-svn: 170209	2012-12-14 17:02:38 +00:00
Bill Schmidt	07e3927917	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. llvm-svn: 170149	2012-12-13 20:57:10 +00:00
Bill Schmidt	bd4902c44a	This is just a clean-up patch that simplifies the initial-exec TLS logic by avoiding use of machine operand flags. No change in observable behavior, so no new test cases. llvm-svn: 170141	2012-12-13 18:45:54 +00:00
Bill Schmidt	7a93daad1a	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Bill Schmidt	45b56f7632	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Bill Schmidt	9d8cdcda41	This patch introduces initial-exec model support for thread-local storage on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. llvm-svn: 169281	2012-12-04 16:18:08 +00:00
Bill Schmidt	0975882ed4	This patch implements medium code model support for 64-bit PowerPC. The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer. With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit. Consider a load of an external 4-byte integer. With small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei With medium model, it instead generates: addis 3, 2, .LC1@toc@ha ld 3, .LC1@toc@l(3) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example. Consider next a load of a function-scope static integer. For small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc test_fn_static.si[TC],test_fn_static.si .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 For medium code model, the compiler generates: addis 3, 2, test_fn_static.si@toc@ha addi 3, 3, test_fn_static.si@toc@l lwz 4, 0(3) .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient. Note that it would be more efficient for the compiler to generate: addis 3, 2, test_fn_static.si@toc@ha lwz 4, test_fn_static.si@toc@l(3) The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch. For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly. I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI compatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression." Here are a few comments on how the patch works, since the selection code can be difficult to follow: The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above. The addis/ld sequence is generated for the following cases: * Jump table addresses * Function addresses * External global variables * Tentative definitions of global variables (common linkage) The addis/addi sequence is generated for the following cases: * Constant pool entries * File-scope static global variables * Function-scope static variables Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling. The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this). I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions. Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations. The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand. The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. llvm-svn: 168708	2012-11-27 17:35:46 +00:00
Ulrich Weigand	ee889d3c11	Fix wrong PowerPC instruction opcodes for: - lwaux - lhzux - stbu llvm-svn: 167863	2012-11-13 19:21:31 +00:00
Ulrich Weigand	2bbf543b9d	Fix instruction encoding for "bd(n)z" on PowerPC, by using a new instruction format BForm_1. llvm-svn: 167861	2012-11-13 19:15:52 +00:00
Ulrich Weigand	9eb8ea0836	Fix instruction encoding for "isel" on PowerPC, using a new instruction format AForm_4. llvm-svn: 167860	2012-11-13 19:14:19 +00:00
Adhemerval Zanella	0c35c65808	PowerPC: Fix for rldcl/rldicl/rldicr MC emission This patch fixes the rldcl/rldicl/rldicr instruction emission. The issue is the MDForm_1 instruction defines the PowerISA MB field from 'rldicl' with the name MBE, but RLDCL/RLDICL/RLDICR definition uses as 'MB'. It end up by generatint the 'rldicl' enconding at 'lib/Target/PowerPC/PPCGenMCCodeEmitter.inc' to use the fourth argument as the third. The patch changes it by adjusting to use the fourth argument as intended. Fixes PR14180. llvm-svn: 166770	2012-10-26 12:09:58 +00:00
Adhemerval Zanella	cd87aee529	This patch fixes the MC object emission of 'nop' for external function calls and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset. The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed by the linker to correct restore the TOC of previous function. Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits a MC intruction using an uint32_t value while the PPC::BL8_NOP_ELF and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop'). This patch corrects the remaining ExecutionEngine using MCJIT: ExecutionEngine/2002-12-16-ArgTest.ll ExecutionEngine/2003-05-07-ArgumentTest.ll ExecutionEngine/2005-12-02-TailCallBug.ll ExecutionEngine/hello.ll ExecutionEngine/hello2.ll ExecutionEngine/test-call.ll llvm-svn: 166682	2012-10-25 14:29:13 +00:00
Will Schmidt	686f66629c	- add tokens to PPCInstrInfo.td and PPCInstr64Bit.td to resolve "Instruction 'foo' has no tokens" errors during llvm-tblgen -gen-asm-matcher attempts. At this time, the added tokens are "#comment" style rather than the actual mnemonic. This will be revisited once the rest of the base asmparser bits get straightened out for ppc64-elf-linux. llvm-svn: 165237	2012-10-04 18:14:28 +00:00
Hal Finkel	a414a44e22	Move the PPC TOC defs into the PPC64 InstrInfo file. Since TOC is just defined for PPC64, move its definition to PPC64 td file. Patch by Adhemerval Zanella. llvm-svn: 163234	2012-09-05 19:22:27 +00:00
Hal Finkel	a65f8ac557	Split several PPC instruction classes. Slight reorganisation of PPC instruction classes for scheduling. No functionality change for existing subtargets. - Clearly separate load/store-with-update instructions from regular loads and stores. - Split IntRotateD -> IntRotateD and IntRotateDI - Split out fsub and fadd from FPGeneral -> FPAddSub - Update existing itineraries Patch by Tobias von Koch. llvm-svn: 162729	2012-08-28 02:49:14 +00:00
Hal Finkel	367c494415	Allow remat of LI on PPC. Allow load-immediates to be rematerialised in the register coalescer for PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail, because it relies on a register move getting emitted. The immediate load is equivalent, so change this test case. Patch by Tobias von Koch. llvm-svn: 162727	2012-08-28 02:10:33 +00:00
Roman Divacky	eab620e38c	Lower constant pools and jump tables via TOC on PPC64/SVR4. In collaboration with Adhemerval Zanella. llvm-svn: 162562	2012-08-24 16:26:02 +00:00
Hal Finkel	aa174abb14	Add a comment about mftb vs. mfspr on PPC. Thanks to Alex Rosenberg for the suggestion. llvm-svn: 161428	2012-08-07 17:04:20 +00:00
Hal Finkel	15265edebe	MFTB on PPC64 should really be encoded using MFSPR. The MFTB instruction itself is being phased out, and its functionality is provided by MFSPR. According to the ISA docs, using MFSPR works on all known chips except for the 601 (which did not have a timebase register anyway) and the POWER3. Thanks to Adhemerval Zanella for pointing this out! llvm-svn: 161346	2012-08-06 21:21:44 +00:00
Hal Finkel	aadd19de06	Add readcyclecounter lowering on PPC64. On PPC64, this can be done with a simple TableGen pattern. To enable this, I've added the (otherwise missing) readcyclecounter SDNode definition to TargetSelectionDAG.td. llvm-svn: 161302	2012-08-04 14:10:46 +00:00
Jakob Stoklund Olesen	b8af245a15	Remove variable_ops from call instructions in most targets. Call instructions are no longer required to be variadic, and variable_ops should only be used for instructions that encode a variable number of arguments, like the ARM stm/ldm instructions. llvm-svn: 160189	2012-07-13 20:44:29 +00:00
Hal Finkel	ebe9ea8bd7	Add support for the PPC isel instruction. The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045	2012-06-22 23:10:08 +00:00
Hal Finkel	a94da28a6d	Add support for generating reg+reg (indexed) pre-inc loads on PPC. llvm-svn: 158823	2012-06-20 15:43:03 +00:00
Hal Finkel	42b797225a	Add support for generating reg+reg preinc stores on PPC. PPC will now generate STWUX and friends. llvm-svn: 158698	2012-06-19 02:34:32 +00:00
Hal Finkel	a005be7ae7	Split out the PPC instruction class IntSimple from IntGeneral. On the POWER7, adds and logical operations can also be handled in the load/store pipelines. We'll call these IntSimple. llvm-svn: 158366	2012-06-12 19:01:24 +00:00
Hal Finkel	a9b329fcf1	Improve ext/trunc patterns on PPC64. The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. llvm-svn: 158283	2012-06-09 22:10:19 +00:00
Hal Finkel	bb4e499e94	Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204	2012-06-08 15:38:21 +00:00
Roman Divacky	0daa2c0556	Implement local-exec TLS on PowerPC. llvm-svn: 157935	2012-06-04 17:36:38 +00:00
Hal Finkel	9fad4cf803	Add a missing PPC 64-bit stwu pattern. This seems to fix the remaining compile-time failures on PPC64 when compiling with -enable-ppc-preinc. llvm-svn: 157159	2012-05-20 17:11:24 +00:00
Hal Finkel	42a487282a	Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore. Loads and stores can have different pipeline behavior, especially on embedded chips. This change allows those differences to be expressed. Except for the 440 scheduler, there are no functionality changes. On the 440, the latency adjustment is only by one cycle, and so this probably does not affect much. Nevertheless, it will make a larger difference in the future and this removes a FIXME from the 440 itin. llvm-svn: 153821	2012-04-01 04:44:16 +00:00
Hal Finkel	548d6f1ad0	Fix dynamic linking on PPC64. Dynamic linking on PPC64 has had problems since we had to move the top-down hazard-detection logic post-ra. For dynamic linking to work there needs to be a nop placed after every call. It turns out that it is really hard to guarantee that nothing will be placed in between the call (bl) and the nop during post-ra scheduling. Previous attempts at fixing this by placing logic inside the hazard detector only partially worked. This is now fixed in a different way: call+nop codegen-only instructions. As far as CodeGen is concerned the pair is now a single instruction and cannot be split. This solution works much better than previous attempts. The scoreboard hazard detector is also renamed to be more generic, there is currently no cpu-specific logic in it. llvm-svn: 153816	2012-03-31 14:45:15 +00:00
Roman Divacky	9bfd7cd1ad	Convert PowerPC to register mask operands. llvm-svn: 152122	2012-03-06 16:41:49 +00:00
Hal Finkel	784c4bf068	X11/X2 loads around indirect calls on ppc64 should not be deleted. llvm-svn: 151374	2012-02-24 17:54:01 +00:00
Jia Liu	b077b6085d	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Hal Finkel	01f7c7a17e	make CR spill and restore 64-bit clean (no functional change), and fix some other problems found with -verify-machineinstrs llvm-svn: 146024	2011-12-07 06:34:06 +00:00
Roman Divacky	3624922127	Fix wrong usages of CTR/MCTR where CTR8/MCTR8 was meant. - Check for MTCTR8 in addition to MTCTR when looking up a hazard. - When lowering an indirect call use CTR8 when targeting 64bit. - Introduce BCTR8 that uses CTR8 and use it on 64bit when expanding ISD::BRIND. The last change fixes PR8487. With those changes, we are able to compile a running "ls" and "sh" on FreeBSD/PowerPC64. llvm-svn: 132552	2011-06-03 15:47:49 +00:00
Cameron Zwarich	b5755a9dc2	Fix PR8828 by removing the explicit def in MovePCToLR as well as the pointless piclabel operand. The operand in the tablegen definition doesn't actually turn into an MI operand, so it just confuses anything checking the TargetInstrDesc for the number of operands. It suffices to just have an implicit def of LR. llvm-svn: 131626	2011-05-19 02:56:28 +00:00
Jakob Stoklund Olesen	8eef3feba8	PowerPC atomic pseudos clobber CR0, they don't read it. llvm-svn: 128829	2011-04-04 17:07:09 +00:00
Chris Lattner	51793c5b92	split out an encoder for memri operands, allowing a relocation to be plopped into the immediate field. This allows us to encode stuff like this: lbz r3, lo16(__ZL4init)(r4) ; globalopt.cpp:5 ; encoding: [0x88,0x64,A,A] ; fixup A - offset: 0, value: lo16(__ZL4init), kind: fixup_ppc_lo16 stw r3, lo16(__ZL1s)(r5) ; globalopt.cpp:6 ; encoding: [0x90,0x65,A,A] ; fixup A - offset: 0, value: lo16(__ZL1s), kind: fixup_ppc_lo16 With this, we should have a completely function MCCodeEmitter for PPC, wewt. llvm-svn: 119134	2010-11-15 08:22:03 +00:00
Chris Lattner	96f8078924	add support for encoding the lo14 forms used for a few PPC64 addressing modes. For example, we now get: ld r3, lo16(_G)(r3) ; encoding: [0xe8,0x63,A,0bAAAAAA00] ; fixup A - offset: 0, value: lo16(_G), kind: fixup_ppc_lo14 llvm-svn: 119133	2010-11-15 08:02:41 +00:00
Chris Lattner	ac5f6ff408	implement the start of support for lo16 and ha16, allowing us to get stuff like: lis r4, ha16(__ZL4init) ; encoding: [0x3c,0x80,A,A] ; fixup A - offset: 0, value: ha16(__ZL4init), kind: fixup_ppc_ha16 llvm-svn: 119127	2010-11-15 06:33:39 +00:00
Chris Lattner	8fa38e3222	remove asmstrings (which can never be printed) from pseudo instructions, allowing is to eliminate some dead operand printing methods from the instprinter. llvm-svn: 119113	2010-11-15 03:48:58 +00:00
Chris Lattner	51168d6510	move the pic base symbol stuff up to MachineFunction since it is trivial and will be shared between ppc and x86. This substantially simplifies the X86 backend also. llvm-svn: 119089	2010-11-14 22:48:15 +00:00
Chris Lattner	fc626aa37d	reimplement ppc asmprinter "toc" handling to use a VariantKind on the operand, required for .o file writing and fixing the PowerPC/mult-alt-generic-powerpc64.ll failure with the new instprinter. llvm-svn: 119087	2010-11-14 22:22:59 +00:00
Chris Lattner	a574e6c6cf	remove a bogus pattern, which had the same pattern as STDU but codegen'd differently. This really wanted to use some sort of subreg to get the low 4 bytes of the G8RC register or something. However, it's invalid and nothing is testing it, so I'm just zapping the bogosity. llvm-svn: 97345	2010-02-27 21:15:32 +00:00
Chris Lattner	ce7be2638a	Eliminate some uses of immAllOnes, just use -1, it does the same thing and is more efficient for the matcher. llvm-svn: 96712	2010-02-21 03:12:16 +00:00
Tilmann Scheller	29361c46ac	Add support for calls through function pointers in the 64-bit PowerPC SVR4 ABI. Patch contributed by Ken Werner of IBM! llvm-svn: 91680	2009-12-18 13:00:15 +00:00
Bob Wilson	25738f9e79	Add PowerPC codegen for indirect branches. llvm-svn: 86050	2009-11-04 21:31:18 +00:00
Dan Gohman	3393a4c997	Rename usesCustomDAGSchedInserter to usesCustomInserter, and update a bunch of associated comments, because it doesn't have anything to do with DAGs or scheduling. This is another step in decoupling MachineInstr emitting from scheduling. llvm-svn: 85517	2009-10-29 18:10:34 +00:00

1 2 3

123 Commits