llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Bill Schmidt	9f4da44752	This patch makes medium code model the default for 64-bit PowerPC ELF. When the CodeGenInfo is to be created for the PPC64 target machine, a default code-model selection is converted to CodeModel::Medium provided we are not targeting the Darwin OS. Defaults for Darwin are unaffected. llvm-svn: 168747	2012-11-27 23:36:26 +00:00
Chad Rosier	9a90d62b0b	Add -verify-machineinstrs to these fast-isel test cases. llvm-svn: 168723	2012-11-27 20:49:56 +00:00
Manman Ren	c45c0a304b	CSE: allow PerformTrivialCoalescing to check copies across basic block boundaries. Given the following case:
    BB0
      %vreg1<def> = SUBrr %vreg0, %vreg7
      %vreg2<def> = COPY %vreg7
    BB1
      %vreg10<def> = SUBrr %vreg0, %vreg2
We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 llvm-svn: 168717	2012-11-27 18:58:41 +00:00
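The case in c45c0a304b can be illustrated with a toy model. The sketch below is not LLVM's MachineCSE pass; it is a simplified illustration that assumes the blocks are listed in dominance order and that every register is defined once. The names `resolve` and `cse_across_blocks` are hypothetical.

```python
def resolve(vreg, copy_of):
    """Follow COPY chains back to the original source register."""
    while vreg in copy_of:
        vreg = copy_of[vreg]
    return vreg


def cse_across_blocks(blocks):
    """Return {redundant_def: earlier_def} for a list of (name, instructions).

    Assumption: blocks appear in dominance order, so a value computed in an
    earlier block is available in every later block.
    """
    copy_of = {}       # vreg defined by COPY -> its source vreg
    available = {}     # (opcode, resolved operands) -> vreg holding the value
    replacements = {}
    for _name, instructions in blocks:
        for dest, opcode, operands in instructions:
            if opcode == "COPY":
                copy_of[dest] = operands[0]
                continue
            # Look through copies, including ones made in earlier blocks.
            key = (opcode, tuple(resolve(op, copy_of) for op in operands))
            if key in available:
                replacements[dest] = available[key]
            else:
                available[key] = dest
    return replacements


EXAMPLE = [
    ("BB0", [("%vreg1", "SUBrr", ("%vreg0", "%vreg7")),
             ("%vreg2", "COPY", ("%vreg7",))]),
    ("BB1", [("%vreg10", "SUBrr", ("%vreg0", "%vreg2"))]),
]

print(cse_across_blocks(EXAMPLE))
```

Because %vreg2 resolves to %vreg7, the SUBrr in BB1 has the same key as the one in BB0, so %vreg10 can reuse %vreg1.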
Manman Ren	cbcf2bcc8a	X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions when the destination register is wider than the memory load. These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 llvm-svn: 168710	2012-11-27 18:09:26 +00:00
Bill Schmidt	0975882ed4	This patch implements medium code model support for 64-bit PowerPC.
The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer.
With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section .toc" for variables that are wholly internal to the compilation unit.
Consider a load of an external 4-byte integer. With small code model, the compiler generates:
    ld 3, .LC1@toc(2)
    lwz 4, 0(3)
    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei
With medium model, it instead generates:
    addis 3, 2, .LC1@toc@ha
    ld 3, .LC1@toc@l(3)
    lwz 4, 0(3)
    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei
Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example.
Consider next a load of a function-scope static integer. For small code model, the compiler generates:
    ld 3, .LC1@toc(2)
    lwz 4, 0(3)
    .section .toc,"aw",@progbits
    .LC1:
    .tc test_fn_static.si[TC],test_fn_static.si
    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4
For medium code model, the compiler generates:
    addis 3, 2, test_fn_static.si@toc@ha
    addi 3, 3, test_fn_static.si@toc@l
    lwz 4, 0(3)
    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4
Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient.
Note that it would be more efficient for the compiler to generate:
    addis 3, 2, test_fn_static.si@toc@ha
    lwz 4, test_fn_static.si@toc@l(3)
The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch.
For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will link and execute correctly.
I've tested the regression suite and the application/benchmark test suite in two ways: once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI incompatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression."
Here are a few comments on how the patch works, since the selection code can be difficult to follow:
The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above.
The addis/ld sequence is generated for the following cases:
 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)
The addis/addi sequence is generated for the following cases:
 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables
Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling.
The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this).
I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions.
Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations.
The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand.
The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. llvm-svn: 168708	2012-11-27 17:35:46 +00:00
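The split of a 32-bit TOC offset into @toc@ha and @toc@l halves in 0975882ed4 can be worked through numerically. The sketch below follows the usual PowerPC ELF convention, in which the high half is adjusted by 0x8000 because the low half is sign-extended when used as a displacement. The function names are hypothetical and this is not the linker's or LLVM's code.

```python
def toc_ha(offset):
    """High-adjusted upper 16 bits, as requested by sym@toc@ha."""
    return ((offset + 0x8000) >> 16) & 0xFFFF


def toc_l(offset):
    """Lower 16 bits, as requested by sym@toc@l."""
    return offset & 0xFFFF


def sign_extend16(value):
    """Interpret a 16-bit field as a signed quantity."""
    return value - 0x10000 if value & 0x8000 else value


def addis_then_displacement(offset):
    """Offset reached by 'addis rX, 2, ha' followed by a 'l(rX)' displacement."""
    high = sign_extend16(toc_ha(offset)) << 16   # addis shifts its immediate left by 16
    low = sign_extend16(toc_l(offset))           # the displacement is sign-extended
    return high + low


def fits_in_16_bits(offset):
    """True when a single 16-bit displacement from the TOC pointer suffices."""
    return -0x8000 <= offset <= 0x7FFF


for offset in (0x10, 0x7FFF, 0x8000, 0x12348000, -4, -0x9000):
    assert addis_then_displacement(offset) == offset
    print(hex(offset), hex(toc_ha(offset)), hex(toc_l(offset)), fits_in_16_bits(offset))
```

Whenever the offset fits in 16 bits, the high-adjusted half is zero, so the "addis" adds nothing. This is consistent with the linker being able to replace it with a "nop".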
Ulrich Weigand	d899cee68f	Never use .lcomm on platforms where it does not accept an alignment argument. Instead, use a pair of .local and .comm directives. This avoids spurious differences between binaries built by the integrated assembler vs. those built by the external assembler, since the external assembler may impose alignment requirements on .lcomm symbols where the integrated assembler does not. llvm-svn: 168704	2012-11-27 16:11:16 +00:00
Craig Topper	46a57ca7fa	Revert accidental commit. llvm-svn: 168687	2012-11-27 08:17:04 +00:00
Craig Topper	63381d45be	Make PrintReg constructor explicit to prevent weird implicit conversions from accidentally being triggered. llvm-svn: 168686	2012-11-27 08:14:24 +00:00
Craig Topper	7092a97454	Add test cases for r168417. llvm-svn: 168681	2012-11-27 07:19:54 +00:00
Chad Rosier	0001e972e0	Extend test case for r168657. llvm-svn: 168658	2012-11-27 01:10:48 +00:00
NAKAMURA Takumi	9402a552ad	llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Loosen expression corresponding to r168627. Win32 and *bsd were affected. llvm-svn: 168651	2012-11-27 00:48:27 +00:00
Chad Rosier	ad2ee03384	Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary. This pass was conservative in that it always reserved the FP to enable dynamic stack realignment, which allowed the RA to use aligned spills for vector registers. This happened even when spills were not necessary. The RA has since been improved to use unaligned spills when necessary. The new behavior is to realign the stack if the frame pointer was already reserved for some other reason, but not to reserve the frame pointer just because a function contains vector virtual registers. Part of rdar://12719844 llvm-svn: 168627	2012-11-26 22:55:05 +00:00
Jakub Staszak	12c307c8fd	Normalize splat 256bit vectors with 8 elements. llvm-svn: 168600	2012-11-26 19:24:31 +00:00
Eli Bendersky	d85e96be00	Rewrite test to not use a FileCheck variable and redefine it on the same line. In preparation for the FileCheck functionality change which will allow using a variable later on the same line. No functionality change. llvm-svn: 168588	2012-11-26 14:09:46 +00:00
Benjamin Kramer	42c6896fe3	PPC: MCize most of the darwin PIC emission. The last remaining bit is "bcl 20, 31, AnonSymbol", which I couldn't find the instruction definition for. Only whitespace changes in assembly output. llvm-svn: 168541	2012-11-24 13:18:25 +00:00
Akira Hatanaka	0f8303f1e5	[mips] Generate big GOT code. llvm-svn: 168460	2012-11-21 20:40:38 +00:00
Anton Korobeynikov	a96a1c8e42	Add support for varargs functions for msp430. Patch by Job Noorman! llvm-svn: 168440	2012-11-21 17:28:27 +00:00
Anton Korobeynikov	1a8ff7b99a	Add support for byval args. Patch by Job Noorman! llvm-svn: 168439	2012-11-21 17:23:03 +00:00
Tim Northover	3556e52a02	Fix physical register liveness calculations:
 + Take account of clobbers
 + Give outputs priority over inputs since they happen later.
llvm-svn: 168360	2012-11-20 09:56:11 +00:00
Elena Demikhovsky	9f52a3ef84	Intel OCL built-ins calling conventions now support MacOS 32-bit. llvm-svn: 168359	2012-11-20 09:37:57 +00:00
Anton Korobeynikov	7a285e97e2	Factor out type info emission into a separate routine. It turned out that ARM wants a different layout of type infos. This is yet another patch in an attempt to fix PR7187. llvm-svn: 168325	2012-11-19 21:06:26 +00:00
Jakob Stoklund Olesen	42930d7c54	Handle mixed normal and early-clobber defs on inline asm. PR14376. llvm-svn: 168320	2012-11-19 19:31:10 +00:00
Andrew Trick	ab75b8798c	Use a full triple for a PPC test case for asm syntax. llvm-svn: 168283	2012-11-18 06:21:03 +00:00
Andrew Trick	d4358df73b	Silence the buildbots for this test while I figure out the triple llvm-svn: 168249	2012-11-17 03:39:26 +00:00
Andrew Trick	52f84ce773	Broaden isSchedulingBoundary to check aliases of SP. On PPC the stack pointer is X1, but ADJCALLSTACK writes R1. Fixes PR14315: Register regmask dependency problem with misched. llvm-svn: 168248	2012-11-17 03:35:11 +00:00
Eli Friedman	d7496f6688	Mark FP_EXTEND from v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. llvm-svn: 168240	2012-11-17 01:52:46 +00:00
Chad Rosier	7aa7c0d952	[fast-isel] Add the -verify-machineinstrs to these test cases. The remaining test cases require fixes to fast-isel before the verifier can be enabled. Part of rdar://12594152 llvm-svn: 168233	2012-11-17 00:42:06 +00:00
Akira Hatanaka	869eb1acb9	Initial implementation of MipsTargetLowering::isLegalAddressingMode. llvm-svn: 168230	2012-11-17 00:25:41 +00:00
Weiming Zhao	85dce59506	Remove hard coded registers in ARM ldrexd and strexd instructions. This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. llvm-svn: 168207	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	3cd85d754d	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON. This fixes PR14359. llvm-svn: 168200	2012-11-16 21:15:20 +00:00
Richard Osborne	c8f73df738	Fix handling of aliases to functions. An alias to a function should use pc relative addressing. llvm-svn: 168199	2012-11-16 21:12:38 +00:00
Justin Holewinski	a794462d5b	[NVPTX] Order global variables in def-use order before emitting them in the final assembly. llvm-svn: 168198	2012-11-16 21:03:51 +00:00
NAKAMURA Takumi	6c26c8f4b6	llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu; otherwise they are not expected to pass on Atom. llvm-svn: 168171	2012-11-16 16:07:37 +00:00
Duncan Sands	98b6a4f4b5	Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris. llvm-svn: 168166	2012-11-16 12:36:39 +00:00
Craig Topper	ad33f996a6	Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types. llvm-svn: 168141	2012-11-16 06:37:56 +00:00
Akira Hatanaka	ba0e266eb2	[mips] Fix delay slot filler so that instructions with register operand $1 are allowed in branch delay slot. llvm-svn: 168131	2012-11-16 02:39:34 +00:00
Eli Friedman	79932a2f77	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. llvm-svn: 168107	2012-11-15 22:44:27 +00:00
Adhemerval Zanella	c159b16933	PowerPC: Lowering floor intrinsic for Altivec. This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and llvm.nearbyint intrinsics to Altivec instructions when using vectors of 4 single-precision floats. llvm-svn: 168086	2012-11-15 20:56:03 +00:00
Bill Schmidt	f294eb980a	This patch is in preparation for adding medium code model support to the PPC64 target. The five tests modified herein test code generation that is sensitive to the code model selected. So I've added -code-model=small to the RUN commands for each. Since small code model is the default, this has no effect for now; but this prepares us for eventually changing the default to medium code model for PPC64. Test changes verified with small and medium code model as default on powerpc64-unknown-linux-gnu. All tests continue to pass. llvm-svn: 167999	2012-11-14 23:23:27 +00:00
Jakub Staszak	8c20275ebf	Make sure to not get AVX code on an AVX-capable host. Revealed in r167967. llvm-svn: 167989	2012-11-14 22:24:01 +00:00
NAKAMURA Takumi	8adf86a12e	test/CodeGen/Hexagon/postinc-load.ll: Suppress it for now. It triggered the failure on i686 hosts. llvm-svn: 167988	2012-11-14 22:22:37 +00:00
Eric Christopher	caf5a23d81	Remove the CellSPU port. Approved by Chris Lattner. llvm-svn: 167984	2012-11-14 22:09:20 +00:00
NAKAMURA Takumi	e40f4623cd	llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx. llvm-svn: 167975	2012-11-14 21:01:40 +00:00
Jyotsna Verma	a472ef54f3	Added multiclass for post-increment load instructions. llvm-svn: 167974	2012-11-14 20:38:48 +00:00
Benjamin Kramer	27983167e3	Force CPU in test so we don't accidentally get AVX code on an AVX-capable host. llvm-svn: 167973	2012-11-14 20:31:42 +00:00
Benjamin Kramer	0006b33581	X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes. The stack realignment code was fixed to work when there is stack realignment and a dynamic alloca is present, so this shouldn't cause correctness issues anymore. Note that this also enables generation of AVX instructions for memset under the assumptions:
 - Unaligned loads/stores are always fast on CPUs supporting AVX
 - AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to be true. Effectively reverts r58317. Part of PR2962. llvm-svn: 167967	2012-11-14 20:08:40 +00:00
Nadav Rotem	b339c55cd3	The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number. This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag. rdar://12028498 llvm-svn: 167963	2012-11-14 19:39:15 +00:00
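The reason zero has to be excluded in b339c55cd3 can be checked with a small model of the ARM carry flag. The sketch assumes, as an illustration only, that the affected patterns rewrite a flag-setting add of a negative immediate as a flag-setting subtract of its magnitude. The helper names are hypothetical; the carry rules are the standard ARM ones (ADDS sets C on unsigned overflow, SUBS sets C when no borrow occurs).

```python
MASK32 = 0xFFFFFFFF


def adds_carry(x, imm):
    """Carry flag after ADDS: set on unsigned 32-bit overflow."""
    return ((x & MASK32) + (imm & MASK32)) > MASK32


def subs_carry(x, imm):
    """Carry flag after SUBS: set when the subtraction does not borrow."""
    return (x & MASK32) >= (imm & MASK32)


def is_imm0_255_neg(value):
    """Range after the patch: -1..-255, with zero excluded."""
    return -255 <= value <= -1


def rewrite_preserves_carry(x, imm):
    """Does 'adds x, #imm' leave the same carry as 'subs x, #-imm'?"""
    return adds_carry(x, imm) == subs_carry(x, -imm)


samples = (0, 1, 200, 255, 256, 0x7FFFFFFF, 0xFFFFFFFF)
for imm in range(-255, 0):
    assert is_imm0_255_neg(imm)
    assert all(rewrite_preserves_carry(x, imm) for x in samples)

# Zero is the odd one out: 'adds x, #0' clears carry, 'subs x, #0' sets it.
print(adds_carry(5, 0), subs_carry(5, 0))
```

Under this model the rewrite is flag-preserving for every immediate in -1..-255 and only breaks at zero.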
Justin Holewinski	3f79944ac9	[NVPTX] Implement custom lowering of loads/stores for i1. Loads from i1 become loads from i8 followed by trunc. Stores to i1 become zext to i8 followed by store to i8. Fixes PR13291. llvm-svn: 167948	2012-11-14 19:19:16 +00:00
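The i1 lowering in 3f79944ac9 amounts to widening on store and narrowing on load. The sketch below models memory as a byte array; the helper names are hypothetical and the code only illustrates the zext/trunc pairing, not the NVPTX backend itself.

```python
def store_i1(memory, address, bit):
    """Store an i1: zero-extend to i8, then do an ordinary 8-bit store."""
    if bit not in (0, 1):
        raise ValueError("an i1 value must be 0 or 1")
    memory[address] = bit & 0xFF   # zext i1 -> i8


def load_i1(memory, address):
    """Load an i1: do an 8-bit load, then truncate to the low bit."""
    byte = memory[address]         # load i8
    return byte & 1                # trunc i8 -> i1


memory = bytearray(4)
store_i1(memory, 2, 1)
print(load_i1(memory, 2), load_i1(memory, 3))
```

Truncation keeps only the low bit, so any stale upper bits in the byte do not affect the loaded i1.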
Anton Korobeynikov	c8df249529	Fix really stupid ARM EHABI info generation bug: we should not emit eh table and handler data if there are no landing pads in the function. Patch by Logan Chien with some cleanups from me. llvm-svn: 167945	2012-11-14 19:13:30 +00:00
Rafael Espindola	1fb628bc96	Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333. llvm-svn: 167912	2012-11-14 05:08:56 +00:00


6646 Commits