llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00

Author	SHA1	Message	Date
Bill Schmidt	2b1221e06b	[PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up). I've added one new test that verifies that the problem with the original code has been fixed. llvm-svn: 221703	2014-11-11 20:44:09 +00:00
Rafael Espindola	d10e16c7f3	Use a 8 bit immediate when possible. This fixes pr21529. llvm-svn: 221700	2014-11-11 19:46:36 +00:00
Dario Domizioli	138d3c8b1e	[X86][ELF] Fix PR20243 - leaf frame pointer bug with TLS access The ISel lowering for global TLS access in PIC mode was creating a pseudo instruction that is later expanded to a call, but the code was not setting the hasCalls flag in the MachineFrameInfo alongside the adjustsStack flag. This caused some functions to be mistakenly recognized as leaf functions, and this in turn affected the decision to eliminate the frame pointer. With the fix, hasCalls is properly set and the leaf frame pointer is correctly preserved. llvm-svn: 221695	2014-11-11 18:44:49 +00:00
Vasileios Kalintiris	d0bab7c3e7	[mips] Add preliminary support for the MIPS II target. Summary: This patch enables code generation for the MIPS II target. Pre-Mips32 targets don't have the MUL instruction, so we add the correspondent pattern that uses the MULT/MFLO combination in order to retrieve the product. This is WIP as we don't support code generation for select nodes due to the lack of conditional-move instructions. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6150 llvm-svn: 221686	2014-11-11 11:43:55 +00:00
Vasileios Kalintiris	9d62a870f7	[mips] Add hardware register name "hwr_ulr" ($29) The canonical name when printing assembly is still $29. The reason is that GAS does not accept "$hwr_ulr" at the moment. This addresses the comments from r221307, which reverted the original commit r221299. llvm-svn: 221685	2014-11-11 11:22:39 +00:00
Andrea Di Biagio	8872877147	[X86] Add missing check for 'isINSERTPSMask' in method 'isShuffleMaskLegal'. This helps the DAGCombiner to identify more opportunities to fold shuffles. llvm-svn: 221684	2014-11-11 11:20:31 +00:00
Vasileios Kalintiris	88cc5e39d4	Recommit "[mips] Add names and tests for the hardware registers" The original commit r221299 was reverted in r221307. I removed the name "hrw_ulr" ($29) from the original commit because two tests were failing. llvm-svn: 221681	2014-11-11 10:31:31 +00:00
Craig Topper	b267ba2986	Use uint64_t as the type for the X86 TSFlag format enum. Allows removal of the VEXShift hack that was used to access the higher bits of TSFlags. llvm-svn: 221673	2014-11-11 07:32:32 +00:00
Michael Kuperstein	cd941e090d	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Recommitting - This time, with a hopefully working test. Differential Revision: http://reviews.llvm.org/D6128 llvm-svn: 221672	2014-11-11 07:07:40 +00:00
Jingyue Wu	3d5e891a7e	[NVPTX] Remove dead code in NVPTXTargetTransformInfo (NFC) llvm-svn: 221668	2014-11-11 05:24:04 +00:00
Rafael Espindola	476a83ecd4	MCAsmParserExtension has a copy of the MCAsmParser. Use it. Base classes were storing a second copy. llvm-svn: 221667	2014-11-11 05:18:41 +00:00
Quentin Colombet	6564d569c9	[X86] Custom lower UINT_TO_FP from v4f32 to v4i32, and for v8f32 to v8i32 if AVX2 is available. According to IACA, the new lowering has a throughput of 8 cycles instead of 13 with the previous one. Althought this lowering kicks in some SPECs benchmarks, the performance improvement was within the noise. Correctness testing has been done for the whole range of uint32_t with the following program: uint4 v = (uint4) {0,1,2,3}; uint32_t i; //Check correctness over entire range for uint4 -> float4 conversion for( i = 0; i < 1U << (32-2); i++ ) { float4 t = test(v); float4 c = correct(v); if( 0xf != _mm_movemask_ps( t == c )) { printf( "Error @ %vx: %vf vs. %vf\n", v, c, t); return -1; } v += 4; } Where "correct" is the old lowering and "test" the new one. The patch adds a test case for the two custom lowering instruction. It also modifies the vector cost model, which is why cast.ll and uitofp.ll are modified. 2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7 instructions instead of 4 (3 more constant loads). rdar://problem/18153096> llvm-svn: 221657	2014-11-11 02:23:47 +00:00
Michael Kuperstein	c9f353294c	Reverting r221626 due to a too-strict test. llvm-svn: 221629	2014-11-10 21:07:41 +00:00
Juergen Ributzka	500ec86b5d	[AArch64][FastISel] Fix kill flags for integer extends. In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses. This is necessary, because the orignal IR instruction might be trivially dead, but we replaced it with a nop at MI level. llvm-svn: 221628	2014-11-10 21:05:31 +00:00
Michael Kuperstein	db34b4e22e	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Differential Revision: http://reviews.llvm.org/D6128 llvm-svn: 221626	2014-11-10 20:40:21 +00:00
Jingyue Wu	74a580e42d	[NVPTX] Add an NVPTX-specific TargetTransformInfo Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 llvm-svn: 221619	2014-11-10 18:38:25 +00:00
Rafael Espindola	8cb53479d4	Misc style fixes. NFC. This fixes a few cases of: * Wrong variable name style. * Lines longer than 80 columns. * Repeated names in comments. * clang-format of the above. This make the next patch a lot easier to read. llvm-svn: 221615	2014-11-10 18:11:10 +00:00
Zoran Jovanovic	557d63136e	[mips][microMIPS] Fix issue with delay slot filler and microMIPS Differential Revision: http://reviews.llvm.org/D6193 llvm-svn: 221612	2014-11-10 17:27:56 +00:00
Daniel Sanders	141b848f86	[mips] Fix sret arguments for N32/N64 which were accidentally broken in r221534. llvm-svn: 221604	2014-11-10 15:57:53 +00:00
Matt Arsenault	fe816ec6cd	R600: Remove unused define llvm-svn: 221543	2014-11-07 20:45:00 +00:00
Daniel Sanders	a0cd02ace9	[mips] Promote i32 arguments to i64 for the N32/N64 ABI and fix <64-bit structs... Summary: ... and after all that refactoring, it's possible to distinguish softfloat floating point values from integers so this patch no longer breaks softfloat to do it. Remove direct handling of i32's in the N32/N64 ABI by promoting them to i64. This more closely reflects the ABI documentation and also fixes problems with stack arguments on big-endian targets. We now rely on signext/zeroext annotations (already generated by clang) and the Assert[SZ]ext nodes to avoid the introduction of unnecessary sign/zero extends. It was not possible to convert three tests to use signext/zeroext. These tests are bswap.ll, ctlz-v.ll, ctlz-v.ll. It's not possible to put signext on a vector type so we just accept the sign extends here for now. These tests don't pass the vectors the same way clang does (clang puts multiple elements in the same argument, these map 1 element to 1 argument) so we don't need to worry too much about it. With this patch, all known N32/N64 bugs should be fixed and we now pass the first 10,000 tests generated by ABITest.py. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6117 llvm-svn: 221534	2014-11-07 16:54:21 +00:00
Daniel Sanders	412a9e70ae	[mips] Removed the remainder of MipsCC. NFC. Summary: One of the calls to AllocateStack (the one in LowerCall) doesn't look like it should be there but it was there before and removing it breaks the frame size calculation. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6116 llvm-svn: 221529	2014-11-07 15:33:08 +00:00
Daniel Sanders	5bfeec4042	[mips] Remove MipsCC::reservedArgArea() in favour of MipsABIInfo::GetCalleeAllocdArgSizeInBytes(). NFC. Summary: Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6115 llvm-svn: 221528	2014-11-07 15:03:53 +00:00
NAKAMURA Takumi	092d3a0319	MipsCCState.h: Use LLVM_DELETED_FUNCTION for msc17. llvm-svn: 221527	2014-11-07 14:56:31 +00:00
Daniel Sanders	a45773664f	[mips] Move MipsCCState to a separate file and clang-formatted it. Summary: Depends on D6113 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6114 llvm-svn: 221525	2014-11-07 14:24:31 +00:00
Daniel Sanders	2b361cb67c	[mips] Fix unused variable warnings introduced in r221521 llvm-svn: 221522	2014-11-07 12:43:01 +00:00
Daniel Sanders	9db1197993	[mips] Remove remaining use of MipsCC::intArgRegs() in favour of MipsABIInfo::GetByValArgRegs() and MipsABIInfo::GetVarArgRegs() Summary: Depends on D6112 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6113 llvm-svn: 221521	2014-11-07 12:21:37 +00:00
Daniel Sanders	0e05d1ee29	[mips] Remove MipsCC::getRegVT(). NFC Summary: It's no longer used. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6112 llvm-svn: 221519	2014-11-07 12:02:59 +00:00
Daniel Sanders	1b0b5063fe	[mips] Remove MipsCC::analyzeCallOperands in favour of CCState::AnalyzeCallOperands. NFC Summary: In addition to the usual f128 workaround, it was also necessary to provide a means of accessing ArgListEntry::IsFixed. Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6111 llvm-svn: 221518	2014-11-07 11:43:49 +00:00
Daniel Sanders	95f569ae9e	[mips] Move SpecialCallingConv to MipsCCState and use it from tablegen-erated code. NFC Summary: In the long run, it should probably become a calling convention in its own right but for now just move it out of MipsISelLowering::analyzeCallOperands() so that we can drop this function in favour of CCState::AnalyzeCallOperands(). Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6085 llvm-svn: 221517	2014-11-07 11:10:48 +00:00
Daniel Sanders	1e7ac2a9c7	[mips] Removed IsVarArg from MipsISelLowering::analyzeCallOperands(). NFC. Summary: CCState objects already carry this information in their isVarArg() method. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6084 llvm-svn: 221516	2014-11-07 10:45:16 +00:00
Ahmed Bougacha	b887a994f2	[AArch64] Keep flags on condition vreg when instantiating a CB branch. Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this lead to a read from an undefined vreg. Using addOperand keeps the flags, here <undef>. Differential Revision: http://reviews.llvm.org/D6159 llvm-svn: 221507	2014-11-07 02:50:00 +00:00
Simon Pilgrim	e2d60b3127	[X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq) Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions. Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables. Differential Revision: http://reviews.llvm.org/D6001 llvm-svn: 221489	2014-11-06 22:15:41 +00:00
Ahmed Bougacha	d69ce819ea	[X86] Add VFMADDSUB cases for the 213->231 custom inserter. Also add tests for vfmadd/vfmsub. llvm-svn: 221488	2014-11-06 22:04:15 +00:00
Ahmed Bougacha	97c6b57f51	[X86] Add missing FMA3 VFMADDSUB in the emitter. Also reuse the fma4 intrinsic test to cover fma3 instructions too. llvm-svn: 221487	2014-11-06 21:58:11 +00:00
Colin LeMahieu	02f427b6fb	[Hexagon] Adding basic Hexagon ELF object emitter. llvm-svn: 221465	2014-11-06 17:05:51 +00:00
Eli Bendersky	39bb0a448e	Clean up NVPTXLowerStructArgs.cpp. NFC * Remove unnecessary const_casts and C-style casts * Simplify attribute access code * Simplify ArrayRef creation * 80-col and clang-format llvm-svn: 221464	2014-11-06 17:05:49 +00:00
Daniel Sanders	427838b573	[mips] Removed IsSoftFloat from MipsISelLowering::analyzeCallOperands(). NFC Summary: It isn't used anymore. Depends on D6081 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6083 llvm-svn: 221463	2014-11-06 16:48:57 +00:00
Daniel Sanders	c33a78ead1	[mips] Removed MipsISelLowering::analyzeFormalArguments() in favour of CCState::AnalyzeFormalArguments() Summary: As with returns, we must be able to identify f128 arguments despite them being lowered away. We do this with a pre-analyze step that builds a vector and then we use this vector from the tablegen-erated code. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6081 llvm-svn: 221461	2014-11-06 16:36:30 +00:00
Andrea Di Biagio	a697bfdcaf	[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8. Example: define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) { %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3> ret <4 x i32> %shuffle } Before llc (-mattr=+sse4.1), produced the following assembly instruction: pblendw $4294967103, %xmm1, %xmm0 After pblendw $63, %xmm1, %xmm0 llvm-svn: 221455	2014-11-06 14:36:45 +00:00
Aaron Ballman	b7915a3ee5	Fixing some -Wcast-qual warnings; NFC. llvm-svn: 221454	2014-11-06 14:32:30 +00:00
Toma Tabacu	8e2e22ba28	[mips] Tolerate the use of the %z inline asm operand modifier with non-immediates. Summary: Currently, we give an error if %z is used with non-immediates, instead of continuing as if the %z isn't there. For example, you use the %z operand modifier along with the "Jr" constraints ("r" makes the operand a register, and "J" makes it an immediate, but only if its value is 0). In this case, you want the compiler to print "$0" if the inline asm input operand turns out to be an immediate zero and you want it to print the register containing the operand, if it's not. We give an error in the latter case, and we shouldn't (GCC also doesn't). Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6023 llvm-svn: 221453	2014-11-06 14:25:42 +00:00
Sasa Stankovic	e20257b78e	[mips] Add the following MIPS options that control gp-relative addressing of small data items: -mgpopt, -mlocal-sdata, -mextern-sdata. Implement gp-relative addressing for constants. Differential Revision: http://reviews.llvm.org/D4903 llvm-svn: 221450	2014-11-06 13:20:12 +00:00
Toma Tabacu	5d123a7fbb	[mips] Improve error/warning messages and testing for the .cpload assembler directive. Summary: Improved warning message when using .cpload inside a reorder section and added an error message for using .cpload with Mips16 enabled. Modified the tests to fit with the changes mentioned above, added a test-case for the N32 ABI in cpload.s and did some reformatting to make the tests easier to read. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5465 llvm-svn: 221447	2014-11-06 10:02:45 +00:00
David Majnemer	8c2f8688d2	X86, MC: Tidy up some whitespace in GetRelocType No functionality change intended. llvm-svn: 221443	2014-11-06 08:10:37 +00:00
Quentin Colombet	d01225e402	[X86] Lower VSELECT into SHRUNKBLEND when we shrink the bits used into the condition to match a blend. This prevents optimizations that work on VSELECT to perform invalid transformations. Indeed, the optimized condition does not match the vector boolean content that is expected and bad things may happen. This patch yields the exact same code on the whole test-suite + specs (-O3 and -O3 -march=core-avx2), it improves one test case (vector-blend.ll) and fixes a bug reduced in vselect-avx.ll. <rdar://problem/18819506> llvm-svn: 221429	2014-11-06 02:25:03 +00:00
Simon Pilgrim	c2b395ce18	[X86][SSE] Vector integer to float conversion memory folding Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works. Differential Revision: http://reviews.llvm.org/D5981 llvm-svn: 221407	2014-11-05 22:28:25 +00:00
Matt Arsenault	92d039adc4	R600/SI: Fix omod display for VOP3b llvm-svn: 221387	2014-11-05 19:35:00 +00:00
Derek Schuff	2c22caff55	[x86 fast-isel] Materialize allocas with the correct-sized lea for ILP32 Summary: X86FastISel::fastMaterializeAlloca was incorrectly conditioning its opcode selection on subtarget bitness rather than pointer size. Differential Revision: http://reviews.llvm.org/D6136 llvm-svn: 221386	2014-11-05 19:27:21 +00:00
Matt Arsenault	3bdbaa2185	R600/SI: Move all rsrc building functions to SIISelLowering llvm-svn: 221383	2014-11-05 19:01:19 +00:00

1 2 3 4 5 ...

30661 Commits