llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Rafael Espindola	753d7ba297	Use __literal16. It has been supported by the linker since 2005. llvm-svn: 201365	2014-02-13 23:16:11 +00:00
Rafael Espindola	73f6bffbc1	Add triples to try to fix the windows bots. llvm-svn: 201345	2014-02-13 16:49:47 +00:00
Rafael Espindola	367df40fdc	.file is only available on ELF, use a triple instead of -march. llvm-svn: 201337	2014-02-13 15:38:16 +00:00
Rafael Espindola	f20c560b40	"foo" is not a ppc instruction, don't try to parse it. llvm-svn: 201336	2014-02-13 15:33:35 +00:00
Rafael Espindola	71737927b7	Specify a triple. MachO AArch64 support is missing. llvm-svn: 201335	2014-02-13 15:30:06 +00:00
Daniel Sanders	7a3a160940	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333	2014-02-13 14:44:26 +00:00
NAKAMURA Takumi	74e8d4a7db	llvm/test/CodeGen/AArch64/cpus.ll: Tweak to use -mtriple=aarch64-unknown-unknown, or this would crash for targeting pecoff like *-mingw32. llvm-svn: 201315	2014-02-13 11:06:23 +00:00
Tim Northover	5e3ca32797	ARM: remove floating-point patterns for @llvm.arm.neon.vabs The front-end is now generating the generic @llvm.fabs for this operation, so the extra patterns are no longer needed. llvm-svn: 201314	2014-02-13 10:44:30 +00:00
Oliver Stannard	f7cc40a705	Add Cortex-A53 and Cortex-A57 cores to the AArch64 backend llvm-svn: 201305	2014-02-13 09:46:11 +00:00
Hao Liu	386fc0d8ae	[AArch64]Fix the problems that can't select mul/add/sub of v1i8/v1i16/v1i32 types. As these problems are similar to shl/sra/srl, also add patterns for shift nodes. llvm-svn: 201298	2014-02-13 05:42:33 +00:00
Juergen Ributzka	19fe55f203	[DAG] Fix the recognition of opaque constants in the SelectionDAGBuilder. This fix checks the original LLVM IR node to identify opaque constants by looking for the bitcast-constant pattern. Originally we looked at the generated SDNode, but this might lead to incorrect results. The SDNode could have been generated by a constant expression that was folded to a constant. This fixes <rdar://problem/16050719> llvm-svn: 201291	2014-02-13 04:19:26 +00:00
Hao Liu	ee04163cfe	[AArch64]Add support for spilling FPR8/FPR16. llvm-svn: 201287	2014-02-13 02:36:58 +00:00
Andrea Di Biagio	b682c0a265	[X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it. Instead of expanding a packed shift into a sequence of scalar shifts, the backend now tries (when possible) to convert the vector shift into a vector multiply. Before this change, a shift of a MVT::v8i16 vector by a build_vector of constants was always scalarized into a long sequence of "vector extracts + scalar shifts + vector insert". With this change, if there is SSE2 support, we emit a single vector multiply. This change also affects SSE4.1, AVX, AVX2 shifts: - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants is now lowered when possible into a single SSE4.1 vector multiply. - Packed v16i16 shift left by constant build_vector are now expanded when possible into a single AVX2 vpmullw. This change also improves the lowering of AVX512f vector shifts. Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected by this change. llvm-svn: 201271	2014-02-12 23:42:28 +00:00
Akira Hatanaka	b360215008	Pass edges weights to MachineBasicBlock::addSuccessor in TailDuplicatePass to preserve branch probability information. <rdar://problem/15893208> llvm-svn: 201245	2014-02-12 18:09:18 +00:00
Daniel Sanders	656c4d360b	Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241	2014-02-12 15:39:20 +00:00
Daniel Sanders	e647d6441b	Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237	2014-02-12 14:44:54 +00:00
Evan Cheng	f8b059795f	Tweak ARM fastcc by adopting these two AAPCS rules: * CPRCs may be allocated to co-processor registers or the stack – they may never be allocated to core registers * When a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable The difference is only noticeable in rare cases where there are a large number of floating point arguments (e.g. 7 doubles + additional float, double arguments). Although it's probably still better to avoid vmov as it can cause stalls in some older ARM cores. The other, more subtle benefit, is to minimize difference between the various calling conventions. rdar://16039676 llvm-svn: 201193	2014-02-11 23:49:31 +00:00
David Blaikie	4af7429b68	DebugInfo: Remove dependence on file numbering in the line table. These tests were unnecessarily sensitive to the presence and ordering of elements in the line table file_names list which will break on a future change I'm working on. llvm-svn: 201185	2014-02-11 21:46:46 +00:00
Matt Arsenault	cc13cc04ab	R600/SI: Fix assertion on infinite loops. This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177	2014-02-11 21:12:38 +00:00
Robert Lougher	834d664400	Teach the DAGCombiner how to fold concat_vector nodes when the input is two BUILD_VECTOR nodes, e.g.: (concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4)) -> (BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4) This fixes an issue with AVX, where a sequence was not recognized as a 256-bit vbroadcast due to the concat_vectors. llvm-svn: 201158	2014-02-11 15:42:46 +00:00
Robert Lytton	604b5e52e1	XCore target: fix const section handling Xcore target ABI requires const data that is externally visible to be handled differently if it has C-language linkage rather than C++ language linkage. Clang now emits ".cp.rodata" section information. All other externally visible constant data will be placed in the DP section. llvm-svn: 201144	2014-02-11 10:36:26 +00:00
Robert Lytton	6ac9a5d013	XCore target: Lower ATOMIC_LOAD & ATOMIC_STORE llvm-svn: 201143	2014-02-11 10:36:18 +00:00
Elena Demikhovsky	31d965117b	AVX: fixed a bug in LowerVECTOR_SHUFFLE llvm-svn: 201140	2014-02-11 10:21:53 +00:00
Elena Demikhovsky	ac4dfca982	AVX-512: Optimized BUILD_VECTOR pattern; fixed encoding of VEXTRACTPS instruction. llvm-svn: 201134	2014-02-11 07:25:59 +00:00
Quentin Colombet	a29a390b9b	[CodeGenPrepare] Test case for the promotions that bypass the profitability check due to some other checks in the addressing mode matcher. I.e., test case for commit r201121. <rdar://problem/16020230> llvm-svn: 201132	2014-02-11 06:55:43 +00:00
Tom Stellard	2712f90019	R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are used DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097	2014-02-10 16:58:30 +00:00
Tim Northover	68738948e7	ARM: use natural LLVM IR for vshll instructions Similarly to the vshrn instructions, these are simple zext/sext + trunc operations. Using normal LLVM IR should allow for better code, and more sharing with the AArch64 backend. llvm-svn: 201093	2014-02-10 16:20:29 +00:00
Oliver Stannard	d01574e188	ARM: r12 is callee-saved for interrupt handlers For A- and R-class processors, r12 is not normally callee-saved, but is for interrupt handlers. See AAPCS, 5.3.1.1, "Use of IP by the linker". llvm-svn: 201089	2014-02-10 14:24:23 +00:00
Tim Northover	f94bdee15e	ARM: use LLVM IR to represent the vshrn operation vshrn is just the combination of a right shift and a truncate (and the limits on the immediate value actually mean the signedness of the shift doesn't matter). Using that representation allows us to get rid of an ARM-specific intrinsic, share more code with AArch64 and hopefully get better code out of the mid-end optimisers. llvm-svn: 201085	2014-02-10 14:04:07 +00:00
Robert Lougher	3ef707fb80	Test commit - added a new line to vec_shuf-insert.ll. llvm-svn: 201083	2014-02-10 12:42:13 +00:00
Matheus Almeida	a37a49cc06	[mips][msa] Add DLSA instruction. llvm-svn: 201081	2014-02-10 12:05:17 +00:00
Matheus Almeida	856616a320	[mips][msa] Update FileCheck prefix in preparation for the addition of Mips64 tests. No functional changes. llvm-svn: 201080	2014-02-10 11:30:09 +00:00
Elena Demikhovsky	110d93ce93	AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors. llvm-svn: 201066	2014-02-10 07:02:39 +00:00
Hao Liu	636db9c0e6	[AArch64]Implement the copy of two FPR8 registers by using FMOVss of two FPR32 registers in copyPhysReg. llvm-svn: 201061	2014-02-10 03:16:22 +00:00
Rafael Espindola	edad45fa11	Always create a temporary symbol to use with the cfi frame. This is a small simplification and a small step in fixing pr18743 since private functions on MachO should be using a 'l' prefix. llvm-svn: 200994	2014-02-07 21:23:18 +00:00
Rafael Espindola	a9692dd935	Use FileCheck variables to simplify this test. llvm-svn: 200992	2014-02-07 21:11:33 +00:00
Renato Golin	4169534e5e	Fix Darwin bots from EHABI change llvm-svn: 200990	2014-02-07 20:32:32 +00:00
Matt Arsenault	accd6717ba	R600/SI: Add failing test for 3 x i64 vectors. Stores of <4 x i64> do work (although they do expand to 4 stores instead of 2), but 3 x i64 vectors fail to select. llvm-svn: 200989	2014-02-07 20:29:40 +00:00
Renato Golin	9fd4f091dc	Remove -arm-disable-ehabi option llvm-svn: 200988	2014-02-07 20:12:49 +00:00
Sasa Stankovic	d15975817e	[mips] Forbid the use of registers t6, t7 and t8 if the target is NaCl. Differential Revision: http://llvm-reviews.chandlerc.com/D2694 llvm-svn: 200978	2014-02-07 17:16:40 +00:00
Rafael Espindola	89e2f80cac	Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it. Thanks to John McCall for noticing it. llvm-svn: 200977	2014-02-07 16:21:30 +00:00
Oliver Stannard	690aee262c	LLVM-1163: AAPCS-VFP violation when CPRC allocated to stack According to the AAPCS, when a CPRC is allocated to the stack, all other VFP registers should be marked as unavailable. I have also modified the rules for allocating non-CPRCs to the stack, to make it more explicit that all GPRs must be made unavailable. I cannot think of a case where the old version would produce incorrect answers, so there is no test for this. llvm-svn: 200970	2014-02-07 11:19:53 +00:00
Venkatraman Govindaraju	b475c5619b	[Sparc] Emit relocations for Thread Local Storage (TLS) when integrated assembler is used. llvm-svn: 200962	2014-02-07 05:54:20 +00:00
Venkatraman Govindaraju	3eb62616b0	[Sparc] Emit correct relocations for PIC code when integrated assembler is used. llvm-svn: 200961	2014-02-07 04:24:35 +00:00
Manman Ren	5fdca739ee	PGO branch weight: fix PR18752. Fix a bug triggered in IfConverterTriangle when CvtBB has multiple predecessors by getting the weights before removing a successor. llvm-svn: 200958	2014-02-07 00:38:56 +00:00
Jim Grosbach	f2f14a2d43	X86: Resolve a long standing FIXME and properly isel pextr[bw]. Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use them to match the relevant pextr store instructions. The test widen_load-2.ll requires a slight change because with the stores gone, the remaining instructions are scheduled in a different order. Add test cases for SSE4 and AVX variants. Resolves rdar://13414672. Patch by Adam Nemet <anemet@apple.com>. llvm-svn: 200957	2014-02-07 00:16:33 +00:00
Rafael Espindola	77433ac346	Convert test to FileCheck. llvm-svn: 200955	2014-02-06 23:35:22 +00:00
Quentin Colombet	f0d12dd9ee	[CodeGenPrepare] Move away sign extensions that get in the way of addressing mode. Basically the idea is to transform code like this: %idx = add nsw i32 %a, 1 %sextidx = sext i32 %idx to i64 %gep = gep i8* %myArray, i64 %sextidx load i8* %gep Into: %sexta = sext i32 %a to i64 %idx = add nsw i64 %sexta, 1 %gep = gep i8* %myArray, i64 %idx load i8* %gep That way the computation can be folded into the addressing mode. This transformation is done as part of the addressing mode matcher. If the matching fails (not profitable, addressing mode not legal, etc.), the matcher will revert the related promotions. <rdar://problem/15519855> llvm-svn: 200947	2014-02-06 21:44:56 +00:00
Tom Stellard	1906c48d55	R600/SI: Add a MUBUF store pattern for Reg+Imm offsets llvm-svn: 200935	2014-02-06 18:36:41 +00:00
Tom Stellard	c690406420	R600/SI: Add a MUBUF store pattern for Imm offsets llvm-svn: 200934	2014-02-06 18:36:39 +00:00
Tom Stellard	2e3a1cc4d8	R600/SI: Add a MUBUF load pattern for Reg+Imm offsets llvm-svn: 200933	2014-02-06 18:36:38 +00:00
Tom Stellard	879ab71511	R600/SI: Use immediate offsets for SMRD instructions whenever possible There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932	2014-02-06 18:36:34 +00:00
Juergen Ributzka	a5769c5abc	[DAG] Don't pull the binary operation through the shift if the operands have opaque constants. During DAGCombine visitShiftByConstant assumes that certain binary operations with only constant operands can always be folded successfully. This is no longer true when the constant is opaque. This commit fixes visitShiftByConstant by not performing the optimization for opaque constants. Otherwise we would end up in an infinite DAGCombine loop. llvm-svn: 200900	2014-02-06 04:09:06 +00:00
Quentin Colombet	b3e9fcb39b	[RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register. The idea is to choose a color for the variable that cannot be allocated and recolor its interferences around. Unlike the current register allocation scheme, it is allowed to change the color of an already assigned (but maybe not splittable or spillable) live interval while propagating this change to its neighbors. In other words, there are two things that may help finding an available color: - Already assigned variables (RS_Done) can be recolored to a different color. - The recoloring can find solutions that need to touch more than just the neighbors of the current allocated variable. E.g., vA can use {R1, R2 } vB can use { R2, R3} vC can use {R1 } Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and they all interfere. vA is assigned R1 vB is assigned R2 vC tries to evict vA but vA is already done. => Regular register allocation heuristic fails. Last chance recoloring kicks in: vC does as if vA was evicted => vC uses R1. vC is marked as fixed. vA needs to find a color. None are available. vA cannot evict vC: vC is a fixed virtual register now. vA does as if vB was evicted => vA uses R2. vB needs to find a color. R3 is available. Recoloring => vC = R1, vA = R2, vB = R3. <rdar://problem/15947839> llvm-svn: 200883	2014-02-05 22:13:59 +00:00
Rafael Espindola	98165a6a91	Remove support for not using .loc directives. Clang itself was not using this. The only way to access it was via llc. llvm-svn: 200862	2014-02-05 18:00:21 +00:00
Petar Jovanovic	c17768616f	[mips] Add NaCl target and forbid indexed loads and stores for it This patch adds NaCl target for Mips. It also forbids indexed loads and stores if the target is NaCl. Patch by Sasa Stankovic. Differential Revision: http://llvm-reviews.chandlerc.com/D2690 llvm-svn: 200855	2014-02-05 17:19:30 +00:00
Elena Demikhovsky	f8035c205a	AVX-512: optimized icmp -> sext -> icmp pattern llvm-svn: 200849	2014-02-05 16:17:36 +00:00
Michel Danzer	ed3052cc54	R600/SI: Add pattern for zero-extending i1 to i32 Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830	2014-02-05 09:48:05 +00:00
Elena Demikhovsky	2e0202b75e	AVX-512: Added intrinsic for cvtph2ps. Added VPTESTNM instruction. Added a pattern to vselect (lit tests will follow). llvm-svn: 200823	2014-02-05 07:05:03 +00:00
David Peixotto	a6dfad28da	Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly This patch fixes the ldr-pseudo implementation to work when used in inline assembly. The fix is to move arm assembler constant pools from the ARMAsmParser class to the ARMTargetStreamer class. Previously we kept the assembler generated constant pools in the ARMAsmParser object. This does not work for inline assembly because a new parser object is created for each blob of inline assembly. This patch moves the constant pools to the ARMTargetStreamer class so that the constant pool will remain alive for the entire code generation process. An ARMTargetStreamer class is now required for the arm backend. There was no existing implementation for MachO, only Asm and ELF. Instead of creating an empty MachO subclass, we decided to make the ARMTargetStreamer a non-abstract class and provide default (llvm_unreachable) implementations for the non constant-pool related methods. Differential Revision: http://llvm-reviews.chandlerc.com/D2638 llvm-svn: 200777	2014-02-04 17:22:40 +00:00
Tom Stellard	279daf2506	R600/SI: Custom lower i64 ISD::SELECT llvm-svn: 200774	2014-02-04 17:18:40 +00:00
Tom Stellard	f4a180e50b	R600: Enable vector fpow. The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773	2014-02-04 17:18:37 +00:00
Tim Northover	cf3928b4f7	ARM & AArch64: merge NEON absolute compare intrinsics There was an extremely confusing proliferation of LLVM intrinsics to implement the vacge & vacgt instructions. This combines them all into two polymorphic intrinsics, shared across both backends. llvm-svn: 200768	2014-02-04 14:55:42 +00:00
Tim Northover	4c3de0c83d	ARM: fix fast-isel assertion failure Missing braces on if meant we inserted both ARM and Thumb load for a litpool entry. This didn't end well. rdar://problem/15959157 llvm-svn: 200752	2014-02-04 10:38:46 +00:00
Michel Danzer	292dfb1151	R600/SI: Fix fneg for 0.0 V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743	2014-02-04 07:12:38 +00:00
David Blaikie	96c0bb7cc5	DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} A bunch of test cases needed to be cleaned up for this, many my fault - when implementing imported modules I updated test cases by simply duplicating the prior metadata field - which wasn't always the empty metadata entry. llvm-svn: 200731	2014-02-04 01:23:52 +00:00
Tim Northover	0b6ea5de72	AArch64 & ARM: refactor crypto intrinsics to take scalars Some of the SHA instructions take a scalar i32 as one argument (largely because they work on 160-bit hash fragments). This wasn't reflected in the IR previously, with ARM and AArch64 choosing different types (<4 x i32> and <1 x i32> respectively) which was ugly. This makes all the affected intrinsics take a uniform "i32", allowing them to become non-polymorphic at the same time. llvm-svn: 200706	2014-02-03 17:27:49 +00:00
Hal Finkel	9f2b5f3ecd	Expand vector bswap in LegalizeVectorOps ISD::BSWAP was missing from the list of node types that should be expanded element-wise. llvm-svn: 200705	2014-02-03 17:27:25 +00:00
Matt Arsenault	f89553a645	Add some xfailed R600 tests for 64-bit private accesses. llvm-svn: 200620	2014-02-02 00:13:12 +00:00
Matt Arsenault	4198d962c0	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju	9c2c79ab67	[Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly. llvm-svn: 200617	2014-02-01 18:54:16 +00:00
Josh Magee	d0e03ee88f	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 llvm-svn: 200601	2014-02-01 01:36:16 +00:00
Reid Kleckner	239e9806ff	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596	2014-01-31 23:50:57 +00:00
Reid Kleckner	80a8045bb4	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593	2014-01-31 23:45:12 +00:00
Reid Kleckner	1d1b284b77	Set -mcpu to make this test pass on atom bots llvm-svn: 200588	2014-01-31 22:58:10 +00:00
Lang Hames	884a7dc676	Replace X86 FMA intrinsic pseudo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Reid Kleckner	8ff8b30e4d	[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret' MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 llvm-svn: 200561	2014-01-31 17:41:22 +00:00
Matheus Almeida	489791e923	[mips][msa] Add insert.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543	2014-01-31 13:31:20 +00:00
Matheus Almeida	5c17e14d3e	Update FileCheck prefixes in preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200541	2014-01-31 13:05:56 +00:00
Manman Ren	0552af6547	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506	2014-01-31 01:10:35 +00:00
Manman Ren	7760d41e27	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Chad Rosier	156f3a2a96	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Juergen Ributzka	ead2eaed6f	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Re-applying the patch, but this time without using AsmPrinter methods. Reviewed by Andy llvm-svn: 200481	2014-01-30 18:58:27 +00:00
Evgeniy Stepanov	000eb0d51d	Reenable ARM EHABI on Android. Broken in r200388. llvm-svn: 200466	2014-01-30 14:18:25 +00:00
Jakob Stoklund Olesen	412a5b3d9b	Implement SPARCv9 atomic_swap_64 with a pseudo. The SWAP instruction only exists in a 32-bit variant, but the 64-bit atomic swap can be implemented in terms of CASX, like the other atomic rmw primitives. llvm-svn: 200453	2014-01-30 04:48:46 +00:00
Juergen Ributzka	88f69803a7	Revert "[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic." This reverts commit r200444 to unbreak buildbots. llvm-svn: 200445	2014-01-30 03:34:02 +00:00
Juergen Ributzka	6ef42913cf	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Reviewed by Andy llvm-svn: 200444	2014-01-30 03:06:14 +00:00
Manman Ren	a4c69e4cda	Revert r200431 due to bot failures. llvm-svn: 200434	2014-01-30 00:53:27 +00:00
Manman Ren	a49dcc98e7	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431	2014-01-30 00:24:37 +00:00
Manman Ren	e23a689faf	PGO branch weight: update edge weights in IfConverter. This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428	2014-01-29 23:18:47 +00:00
Matheus Almeida	67244395fb	[mips][msa] Add fill.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400	2014-01-29 15:12:02 +00:00
Matheus Almeida	6fd8deacb5	[mips][msa] CHECK-DAG-ize MSA 2r_vector_scalar.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200399	2014-01-29 14:32:03 +00:00
Matheus Almeida	3e07e293c7	[mips][msa] Add copy_{u,s}.d. These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398	2014-01-29 14:05:28 +00:00
Matheus Almeida	8cac709f4b	[mips][msa] CHECK-DAG-ize MSA elm_copy.ll test. This update is a preparation for the addition of Mips64 MSA tests. No functional changes. llvm-svn: 200395	2014-01-29 13:51:34 +00:00
Renato Golin	6ca0034624	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also removes the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and updates some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
Venkatraman Govindaraju	a50ca1f645	[Sparc] Use %r_disp32 for pc_rel entries in FDE as well. This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo. llvm-svn: 200376	2014-01-29 06:59:20 +00:00
Venkatraman Govindaraju	1541483555	[Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame. Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373	2014-01-29 04:51:35 +00:00
Venkatraman Govindaraju	416f0c2389	[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368	2014-01-29 03:35:08 +00:00
Rafael Espindola	f6087fc40c	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367	2014-01-29 02:30:38 +00:00
Kevin Qin	379441a4e6	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare to generate fewer instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
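
The last chance recoloring commit (b3e9fcb39b, r200883) walks through a three-register example. The sketch below replays that example in Python as a rough illustration only; it is not LLVM's implementation. The function name `recolor`, the dictionaries, and the backtracking strategy are all assumptions made for this sketch. Only the virtual registers, their allowed sets, and the expected result come from the commit message.

```python
def recolor(vreg, allowed, interferes, assignment, fixed):
    """Try to assign a register to vreg, evicting and recoloring
    interfering neighbors when needed (hypothetical helper)."""
    for reg in allowed[vreg]:
        # Interfering neighbors that currently hold this register.
        conflicts = [n for n in interferes[vreg] if assignment.get(n) == reg]
        # A neighbor fixed earlier in this recoloring chain cannot be evicted.
        if any(n in fixed for n in conflicts):
            continue
        saved = dict(assignment)
        for n in conflicts:
            del assignment[n]
        assignment[vreg] = reg
        # vreg is now fixed for the rest of this recoloring attempt.
        fixed_next = fixed | {vreg}
        if all(recolor(n, allowed, interferes, assignment, fixed_next)
               for n in conflicts):
            return True
        # Recoloring a neighbor failed: roll back and try the next register.
        assignment.clear()
        assignment.update(saved)
    return False


# Setup taken from the commit message: all three virtual registers interfere.
allowed = {"vA": ["R1", "R2"], "vB": ["R2", "R3"], "vC": ["R1"]}
interferes = {"vA": ["vB", "vC"], "vB": ["vA", "vC"], "vC": ["vA", "vB"]}
assignment = {"vA": "R1", "vB": "R2"}

ok = recolor("vC", allowed, interferes, assignment, frozenset())
print(ok, assignment)
```

The run reaches the same assignment the commit message describes (vC = R1, vA = R2, vB = R3). The real allocator bounds the search depth and the number of interferences it considers; this sketch omits those limits.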


9190 Commits
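
The X86 vector shift commit (b682c0a265, r201271) relies on the identity that a left shift by a constant equals a multiply by the matching power of two, lane by lane. The sketch below checks that identity for eight 16-bit lanes in plain Python. The helper names and the sample values are hypothetical, and nothing here models the actual SelectionDAG lowering.

```python
MASK16 = 0xFFFF  # each lane is a 16-bit integer, as in MVT::v8i16


def shl_lanes(vec, amounts):
    """Scalarized form: shift every lane separately."""
    return [(x << s) & MASK16 for x, s in zip(vec, amounts)]


def shl_as_mul(vec, amounts):
    """Multiply form: turn the constant shift amounts into a vector of
    powers of two, then do one lane-wise multiply."""
    multipliers = [(1 << s) & MASK16 for s in amounts]
    return [(x * m) & MASK16 for x, m in zip(vec, multipliers)]


# Hypothetical sample data: non-uniform constant shift amounts below 16.
vec = [1, 2, 3, 0x7FFF, 0x8001, 255, 4096, 65535]
amounts = [0, 1, 2, 3, 4, 8, 12, 15]

shifted = shl_lanes(vec, amounts)
multiplied = shl_as_mul(vec, amounts)
print(shifted == multiplied)
```

Both forms wrap modulo 2^16, so they agree whenever every shift amount is smaller than the lane width.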