llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Jyotsna Verma	5307666fe8	Reverting r181331. Missing file, HexagonSplitConst32AndConst64.cpp, from lib/Target/Hexagon/CMakeLists.txt. llvm-svn: 181334	2013-05-07 17:12:35 +00:00
Jyotsna Verma	af0c734e1b	Hexagon: Fix Small Data support to handle -G 0 correctly. llvm-svn: 181331	2013-05-07 16:42:15 +00:00
Bill Wendling	0c1e625af5	Reduce attributes. llvm-svn: 181245	2013-05-06 20:57:23 +00:00
Tom Stellard	fb8e73f3af	R600: Emit config values in register / value pairs Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181228	2013-05-06 17:50:51 +00:00
Tom Stellard	6c3f6e1b02	R600: Stop emitting the instruction type byte before each instruction Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181225	2013-05-06 17:50:44 +00:00
Tom Stellard	ebe049fd75	R600: Emit ISA for CALL_FS_* instructions Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181223	2013-05-06 17:50:26 +00:00
Ulrich Weigand	1431b3c2f5	[SystemZ] Add CodeGen test cases This adds all CodeGen tests for the SystemZ target. This version of the patch incorporates feedback from a review by Sean Silva. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181204	2013-05-06 16:17:29 +00:00
Michael Kuperstein	74685eff73	Fix slightly too aggressive concat_vector optimization. (Would sometimes optimize away concats used to extend a vector with undef values) llvm-svn: 181186	2013-05-06 08:06:13 +00:00
Bill Wendling	454b468436	Add a testcase that checks that we generate functions with frame pointers or not depending upon the function attributes. llvm-svn: 181180	2013-05-06 05:45:57 +00:00
Evan Cheng	2fcdb53946	Test case for r181160 and r181161. rdar://13782395 llvm-svn: 181162	2013-05-05 18:07:15 +00:00
Stepan Dyatkovskiy	c06cd03f6e	For ARM backend, fixed "byval" attribute support. Now even the small structures could be passed within byval (small enough to be stored in GPRs). In regression tests next function prototypes are checked: PR15293: %artz = type { i32 } define void @foo(%artz* byval %s) define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2) foo: "s" stored in R0 foo2: "s" stored in R0, "s2" stored in R2. Next AAPCS rules are checked: 5.5 Parameters Passing, C.4 and C.5, "ParamSize" is parameter size in 32bit words: -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4. Parameter should be sent to the stack; NCRN := R4. -- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4. Parameter stored in GPRs; NCRN += ParamSize. llvm-svn: 181148	2013-05-05 07:48:36 +00:00
David Majnemer	d5ba4da281	Remove a recently redundant transform from X86ISelLowering. X86ISelLowering has support to treat: (icmp ne (and (xor %flags, -1), (shl 1, flag)), 0) as if it were actually: (icmp eq (and %flags, (shl 1, flag)), 0) However, r179386 has code at the InstCombine level to handle this. llvm-svn: 181145	2013-05-05 02:00:10 +00:00
Tim Northover	d4f2cac7b6	AArch64: support literal pool access in large memory model. llvm-svn: 181120	2013-05-04 16:54:07 +00:00
Tim Northover	4ef2500d01	AArch64: support large code model for jump-tables llvm-svn: 181119	2013-05-04 16:54:00 +00:00
Tim Northover	ece66eacb2	AArch64: implement support for blockaddress in large code model llvm-svn: 181118	2013-05-04 16:53:53 +00:00
Tim Northover	87645e02c0	AArch64: implement large code model access to global variables. The MOVZ/MOVK instruction sequence may not be the most efficient (a literal-pool load could be better) but adding that would require reinstating the ConstantIslands pass. For now the sequence is correct, and that's enough. Beware, as of commit GNU ld does not appear to support the relocations needed for this. Its primary purpose (for now) will be to support JITed code, since in that case there is no guarantee of where your code will end up in memory relative to external symbols it references. llvm-svn: 181117	2013-05-04 16:53:46 +00:00
Reed Kotler	b89d9a0181	Remove some unneeded pseudos in the presence of the naked function attribute. llvm-svn: 181072	2013-05-03 23:17:24 +00:00
Akira Hatanaka	5f295bccfc	[mips] Split the DSP control register and define one register for each of its fields. This removes false dependencies between DSP instructions which access different fields of the control register. Implicit register operands are added to instructions RDDSP and WRDSP after instruction selection, depending on the value of the mask operand. llvm-svn: 181041	2013-05-03 18:37:49 +00:00
Tom Stellard	2165728987	R600: Expand vector or, shl, srl, and xor nodes llvm-svn: 181035	2013-05-03 17:21:31 +00:00
Tom Stellard	f2fd0109a0	R600: Add pattern for SHA-256 Ma function This can be optimized using the BFI_INT instruction. llvm-svn: 181033	2013-05-03 17:21:20 +00:00
Akira Hatanaka	ab6ee99fe0	[mips] Handle reading, writing or copying of ccond field of DSP control register. - Define pseudo instructions which store or load ccond field of the DSP control register. - Emit the pseudos in MipsSEInstrInfo::storeRegToStack and loadRegFromStack. - Expand the pseudos before callee-saved scan. - Emit instructions RDDSP or WRDSP to copy between ccond field and GPRs. llvm-svn: 180969	2013-05-02 23:07:05 +00:00
Vincent Lejeune	5d7b2a4aea	R600: Signed literals are 64bits wide llvm-svn: 180960	2013-05-02 21:53:03 +00:00
Vincent Lejeune	3ff31b75b3	R600: If previous bundle is dot4, PV valid chan is always X llvm-svn: 180959	2013-05-02 21:52:55 +00:00
Vincent Lejeune	97fb65a788	R600: Add a test to check that use_kill is emitted llvm-svn: 180958	2013-05-02 21:52:46 +00:00
Vincent Lejeune	62da1453e1	R600: Prettier asmPrint of Alu llvm-svn: 180956	2013-05-02 21:52:30 +00:00
Pranav Bhandarkar	520d26e773	Hexagon - Add peephole optimizations for zero extends. * lib/Target/Hexagon/HexagonInstrInfo.td: Add patterns to combine a sequence of a pair of i32->i64 extensions followed by a "bitwise or" into COMBINE_rr. * lib/Target/Hexagon/HexagonPeephole.cpp: Copy propagate Rx in the instruction Rp = COMBINE_Ir_V4(0, Rx) to the uses of Rp:subreg_loreg. * test/CodeGen/Hexagon/union-1.ll: New test. * test/CodeGen/Hexagon/combine_ir.ll: Fix test. llvm-svn: 180946	2013-05-02 20:22:51 +00:00
Manman Ren	0e2c14381d	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180935	2013-05-02 18:11:35 +00:00
Michael Liao	ca62375d96	Rewrite X86 codegen regression test with FileCheck llvm-svn: 180910	2013-05-02 06:20:42 +00:00
Michael Liao	ec28235c2a	Avoid generating tempfile(s) never used As DejaGNU is deprecated, it seems pipe-jam issue doesn't exist any more. llvm-svn: 180892	2013-05-01 22:46:50 +00:00
Bill Wendling	218b457a2f	Revert r180737. The companion patch was reverted, and this is not relevant right now. llvm-svn: 180889	2013-05-01 22:32:08 +00:00
Nadav Rotem	d62966b79d	Optimize away nop CONCAT_VECTOR nodes. Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector. rdar://13402653 PR15866 llvm-svn: 180871	2013-05-01 19:18:51 +00:00
Rafael Espindola	056abc9e26	Put VMOVPQIto64rr in the VRPDI class. Patch by Joshua Magee. llvm-svn: 180842	2013-05-01 13:00:16 +00:00
Michael Liao	cff6c527b8	Forgot to remove the tempfile argument llvm-svn: 180838	2013-05-01 05:45:57 +00:00
Michael Liao	349e772e4f	More rewrites of x86 codegen regression tests with FileCheck llvm-svn: 180837	2013-05-01 05:34:30 +00:00
Akira Hatanaka	f5c940dea8	[mips] Fix handling of instructions which copy to/from accumulator registers. Expand copy instructions between two accumulator registers before callee-saved scan is done. Handle copies between integer GPR and hi/lo registers in MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not needed. llvm-svn: 180827	2013-04-30 23:22:09 +00:00
Stephen Lin	84b2d4dbd4	Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved. llvm-svn: 180825	2013-04-30 22:49:28 +00:00
Akira Hatanaka	0bca7f3584	[mips] Instruction selection patterns for DSP-ASE vector select and compare instructions. llvm-svn: 180820	2013-04-30 22:37:26 +00:00
Adrian Prantl	7482401c1d	Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a" because it breaks some buildbots. This reverts commit 180816. llvm-svn: 180819	2013-04-30 22:35:14 +00:00
Adrian Prantl	baf0a98faa	Change the informal convention of DBG_VALUE so that we can express a register-indirect address with an offset of 0. It used to be that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain registers use the combination reg, reg. rdar://problem/13658587 llvm-svn: 180816	2013-04-30 22:16:46 +00:00
Hal Finkel	2ac40ae0d3	LocalStackSlotAllocation improvements First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created. Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway. Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll Jim has okayed this off-list. llvm-svn: 180799	2013-04-30 20:04:37 +00:00
Manman Ren	0b37dd0efc	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180796	2013-04-30 17:52:57 +00:00
Vincent Lejeune	25352bd54f	R600: fix loop-address.ll test Texture cache is now used when shader type is not specified llvm-svn: 180785	2013-04-30 12:47:56 +00:00
Michael Liao	c67e0fc9ea	Rewrite X86 codegen regression test with FileCheck llvm-svn: 180776	2013-04-30 07:51:08 +00:00
Vincent Lejeune	29f24e0ce8	R600: use native for alu llvm-svn: 180761	2013-04-30 00:14:38 +00:00
Vincent Lejeune	e641cd06c9	R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache llvm-svn: 180755	2013-04-30 00:13:39 +00:00
Michael Liao	3db1a24464	Rewrite test in FileCheck instead of grep in X86 codegen llvm-svn: 180754	2013-04-30 00:13:38 +00:00
Manman Ren	180923f053	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 180745	2013-04-29 22:58:55 +00:00
Bill Wendling	e24a89874b	Duplicate a testcase. llvm-svn: 180744	2013-04-29 22:42:47 +00:00
Michael Liao	4e03c7690d	Rewrite some tests with FileCheck in X86 codegen - Revise previous patches of the same purpose by fixing *) grep <PA> \| not grep <PB> semantically is not the same as CHECK: <PA>{{^<PB>.$}} as the former will check all occurrences of <PA> while the latter only checks the first match. As a result, CHECK needs to be put in every place where <PA> occurs. *) grep <PA> \| count <N> needs a final CHECK-NOT of the same pattern. (As 'CHECK-<N>' is proposed for discussion, converting 'grep \| count <N>' where N > 1 is postponed.) llvm-svn: 180742	2013-04-29 22:41:29 +00:00
Tom Stellard	33e7a52e1c	R600: Use correct CF_END instruction on Northern Island GPUs llvm-svn: 180735	2013-04-29 22:23:58 +00:00
Tom Stellard	a22d2b47f3	R600: Fix encoding of CF_END_{EG, R600} instructions The EOP bit was not being encoded. llvm-svn: 180734	2013-04-29 22:23:54 +00:00
Rafael Espindola	ef355ff0c1	Make all darwin ppc stubs local. This fixes pr15763. Patch by David Fang. llvm-svn: 180657	2013-04-27 00:43:16 +00:00
Benjamin Kramer	583d3f2591	Make CHECK lines a bit less strict so they also match code generated for win64. Hopefully brings the windows buildbots back to life. llvm-svn: 180630	2013-04-26 21:04:21 +00:00
Tom Stellard	de2ad0a8f1	R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE We need to initialize this to something and since clang does not set the shader type attribute and clang is used only for compute shaders, initializing it to COMPUTE seems like the best choice. Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 180620	2013-04-26 18:32:24 +00:00
Benjamin Kramer	40a2d53c85	ARM/NEON: Pattern match vector integer abs to vabs. llvm-svn: 180604	2013-04-26 15:00:57 +00:00
Benjamin Kramer	11723aa321	X86: Now that we have a canonical form for vector integer abs, match it into pabs. llvm-svn: 180600	2013-04-26 12:05:21 +00:00
Benjamin Kramer	7ce75fb032	DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars. This already helps SSE2 x86 a lot because it lacks an efficient way to represent a vector select. The long term goal is to enable the backend to match a canonicalized pattern into a single instruction (e.g. vabs or pabs). llvm-svn: 180597	2013-04-26 09:19:19 +00:00
Preston Gurd	0547d81fdb	This patch adds the X86FixupLEAs pass, which will reduce instruction latency for certain models of the Intel Atom family, by converting instructions into their equivalent LEA instructions, when it is both useful and possible to do so. llvm-svn: 180573	2013-04-25 20:29:37 +00:00
Chad Rosier	3030f76a0d	[inline asm] Add a test case for r180226. The specific issue is that the inline assembly is requesting a 64-bit register, which is invalid for i386. rdar://13731657 llvm-svn: 180445	2013-04-25 17:10:21 +00:00
Silviu Baranga	c656dd9bdd	Fix constant folding for one lane vector types. Constant folding one lane vector types now returns a vector instead of a scalar. llvm-svn: 180254	2013-04-25 09:32:33 +00:00
Tom Stellard	48d161332e	R600: Use SHT_PROGBITS for the .AMDGPU.config section The libelf implementation that is distributed here: http://www.mr511.de/software/english.html will not parse sections that are marked SHT_NULL. llvm-svn: 180230	2013-04-24 23:56:14 +00:00
Andrew Trick	73014520d6	MI Sched: eliminate local vreg copies. For now, we just reschedule instructions that use the copied vregs and let regalloc eliminate it. I would really like to eliminate the copies on-the-fly during scheduling, but we need a complete implementation of repairIntervalsInRange() first. The general strategy is for the register coalescer to eliminate as many global copies as possible and shrink live ranges to be extended-basic-block local. The coalescer should not have to worry about resolving local copies (e.g. it shouldn't attempt to reorder instructions). The scheduler is a much better place to deal with local interference. The coalescer side of this equation needs work. llvm-svn: 180193	2013-04-24 15:54:43 +00:00
Jyotsna Verma	4a5d195942	Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions. llvm-svn: 180145	2013-04-23 21:17:40 +00:00
Stephen Lin	44a24c9593	Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) llvm-svn: 180138	2013-04-23 19:42:25 +00:00
Jyotsna Verma	624f2e0434	Hexagon: Remove assembler mapped instruction definitions. llvm-svn: 180133	2013-04-23 19:15:55 +00:00
Vincent Lejeune	3666f07489	R600: Use .AMDGPU.config section to emit stacksize llvm-svn: 180124	2013-04-23 17:34:12 +00:00
Vincent Lejeune	e5ba5f1b14	R600: Add CF_END llvm-svn: 180123	2013-04-23 17:34:00 +00:00
Jyotsna Verma	20903a7aba	Hexagon: Remove duplicate instructions to handle global/immediate values for absolute/absolute-set addressing modes. llvm-svn: 180120	2013-04-23 17:11:46 +00:00
Rafael Espindola	f7c86d97a1	Move test from grep to FileCheck. llvm-svn: 180092	2013-04-23 12:03:27 +00:00
Akira Hatanaka	913bf6194a	[mips] In performDSPShiftCombine, check that all elements in the vector are shifted by the same amount and the shift amount is smaller than the element size. llvm-svn: 180039	2013-04-22 19:58:23 +00:00
Stephen Lin	4e394628e7	Extra paranoid test for r179925 (verify that tail calls are not generated to 'this'-returning constructors of objects with different 'this' pointers than the caller) llvm-svn: 180032	2013-04-22 17:23:49 +00:00
Stepan Dyatkovskiy	8adaf54376	Fix for 5.5 Parameter Passing --> Stage C: -- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch "NSAA != 0" meant "don't use GPRs anymore". But there are some exceptions in AAPCS. 1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated CPRCs would be sent to stack, while non CPRCs may be still allocated in GPRs. 2. Check that for VA functions all params use GPRs and then stack. No exceptions, no CPRCs here. llvm-svn: 180011	2013-04-22 13:06:52 +00:00
Arnaud A. de Grandmaison	087fe129d8	Cleanup: test source files do not need to be executable llvm-svn: 180003	2013-04-22 08:02:43 +00:00
David Blaikie	9bfe15c313	Revert "Revert "PR14606: debug info imported_module support"" This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even though the debug info was clearly invalid on all of them, but this ought to fix it. llvm-svn: 179996	2013-04-22 06:12:31 +00:00
Jim Grosbach	3104dcf2ca	Legalize vector truncates by parts rather than just splitting. Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result type, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequence of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. 
Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we 
generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr llvm-svn: 179989	2013-04-21 23:47:41 +00:00
Jim Grosbach	2582e2e539	ARM: Split out cost model vcvt testcases. They had a separate RUN line already, so may as well be in a separate file. llvm-svn: 179988	2013-04-21 23:47:37 +00:00
Jakob Stoklund Olesen	c9f30e9065	Passing arguments to varargs functions under the SPARC v9 ABI. Arguments after the fixed arguments never use the floating point registers. llvm-svn: 179987	2013-04-21 21:36:49 +00:00
Jakob Stoklund Olesen	d8a2b84611	Fix the SETHIimm pattern for 64-bit code. Don't ignore the high 32 bits of the immediate. llvm-svn: 179985	2013-04-21 21:18:03 +00:00
Tim Northover	593f76e08e	ARM: fix part of test which actually needed an asserts build This should fix a buildbot failure that occurred after r179977. llvm-svn: 179978	2013-04-21 12:20:19 +00:00
Tim Northover	943f2a9234	ARM: Use ldrd/strd to spill 64-bit pairs when available. This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too. llvm-svn: 179977	2013-04-21 11:57:07 +00:00
Bill Wendling	61eb6957c5	Remove tbaa metadata. llvm-svn: 179970	2013-04-21 01:38:25 +00:00
Jakob Stoklund Olesen	cbfbef04da	Compile varargs functions for SPARCv9. With a little help from the frontend, it looks like the standard va_* intrinsics can do the job. Also clean up an old bitcast hack in LowerVAARG that dealt with unaligned double loads. Load SDNodes can specify an alignment now. Still missing: Calling varargs functions with float arguments. llvm-svn: 179961	2013-04-20 22:49:16 +00:00
Tim Northover	de5285eb6f	ARM: don't add FrameIndex offset for LDMIA (has no immediate) Previously, when spilling 64-bit paired registers, an LDMIA with both a FrameIndex and an offset was produced. This kind of instruction shouldn't exist, and the extra operand was being confused with the predicate, causing aborts later on. This removes the invalid 0-offset from the instruction being produced. llvm-svn: 179956	2013-04-20 19:31:00 +00:00
Stephen Lin	98df7358cd	Minor renaming of tests (for consistency with an in-development patch) llvm-svn: 179954	2013-04-20 16:21:26 +00:00
Benjamin Kramer	a7e8f887fe	Don't litter .s files in test directory. llvm-svn: 179937	2013-04-20 10:43:40 +00:00
Stephen Lin	9d99ba2071	Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). llvm-svn: 179925	2013-04-20 05:14:40 +00:00
Stephen Lin	65c1101eba	Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection llvm-svn: 179924	2013-04-20 04:27:51 +00:00
Akira Hatanaka	11b4211d68	[mips] Instruction selection patterns for DSP-ASE vector shifts. llvm-svn: 179906	2013-04-19 23:21:32 +00:00
Hal Finkel	9e44a50443	Fix PPC optimizeCompareInstr swapped-sub argument handling When matching a compare with a subtract where the arguments of the compare are swapped w.r.t. the arguments of the subtract, we need to negate the predicates (or CR bit indices) of the users. This, however, is not the same as inverting the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM backend seems to do this correctly, but when I adapted the code for the PPC backend, I introduced an error in this logic. Comparison optimization is now enabled again by default. llvm-svn: 179899	2013-04-19 22:08:38 +00:00
Anton Korobeynikov	f95220dd8b	Do not mangle in MS-way the globals with magic \001 in the name. Based on the patch by David Nadlinger! llvm-svn: 179889	2013-04-19 21:20:56 +00:00
Bill Wendling	e8c6d1cb09	Make test slightly more readable. llvm-svn: 179888	2013-04-19 21:14:59 +00:00
Bill Wendling	7256108f6f	Add a testcase to make sure we generate the proper compact unwind section for a function that cannot produce a compact unwind encoding. llvm-svn: 179887	2013-04-19 21:07:11 +00:00
Eric Christopher	88bdd26cc9	Revert "PR14606: debug info imported_module support" This reverts commit r179836 as it seems to have caused test failures. llvm-svn: 179840	2013-04-19 07:47:16 +00:00
David Blaikie	46f35f8e56	PR14606: debug info imported_module support Adding another CU-wide list, in this case of imported_modules (since they should be relatively rare, it seemed better to add a list where each element had a "context" value, rather than add a (usually empty) list to every scope). This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll need to expand this to cover DW_TAG_imported_declaration too. llvm-svn: 179836	2013-04-19 06:57:04 +00:00
Tom Stellard	017c53ebbd	R600: Add pattern for the BFI_INT instruction llvm-svn: 179830	2013-04-19 02:11:06 +00:00
Tom Stellard	b767059700	R600: Reorganize lit tests and document how they should be organized llvm-svn: 179828	2013-04-19 02:10:53 +00:00
Hal Finkel	b96e61374f	Disable PPC comparison optimization by default This seems to cause a stage-2 LLVM compile failure (by crashing TableGen); so I'm disabling this for now. llvm-svn: 179807	2013-04-18 22:54:25 +00:00
Hal Finkel	44190578df	Implement optimizeCompareInstr for PPC Many PPC instructions have a so-called 'record form' which stores to a specific condition register the result of comparing the result of the instruction with zero (always as a signed comparison). For integer operations on PPC64, this is always a 64-bit comparison. This implementation is derived from the implementation in the ARM backend; there are some differences because PPC condition registers are allocatable virtual registers (although the record forms always use a specific one), and we look for a matching subtraction instruction after the compare (but before the first use) in addition to before it. llvm-svn: 179802	2013-04-18 22:15:08 +00:00
Benjamin Kramer	aeff9e581b	X86: Add an SSE2 lowering for 64 bit compares when pcmpgtq (SSE4.2) isn't available. This pattern started popping up in vectorized min/max reductions. llvm-svn: 179797	2013-04-18 21:37:45 +00:00
Derek Schuff	c55a3d43a9	Allow misaligned stores in x86 fast-isel. In X86FastISel::X86SelectStore(), improperly aligned stores are rejected and handled by the DAG-based ISel. However, X86FastISel::X86SelectLoad() makes no such requirement. There doesn't appear to be an x86 architectural correctness issue with allowing potentially unaligned store instructions. This patch removes this restriction. Patch by Jim Stichnoth. llvm-svn: 179774	2013-04-18 17:41:08 +00:00
Hao Liu	ca09ec237c	Fix for PR14824, An ARM Load/Store Optimization bug llvm-svn: 179751	2013-04-18 09:11:08 +00:00
Eli Bendersky	802610971f	This patch teaches x86 fast-isel to generate the native div/idiv instructions for the sdiv/srem/udiv/urem bitcode instructions. This is done for the i8, i16, and i32 types, as well as i64 for the x86_64 target. Patch by Jim Stichnoth llvm-svn: 179715	2013-04-17 20:10:13 +00:00
Vincent Lejeune	cd0483fb18	R600: Make Export Instruction not duplicable llvm-svn: 179686	2013-04-17 15:17:39 +00:00
Richard Osborne	3bc2e6cf63	[XCore] Extend test to check positive offsets are folded into addresses. llvm-svn: 179621	2013-04-16 20:05:52 +00:00
Richard Osborne	d8d60d4b61	[XCore] Give test more generic name. I intend to extend the test with more offset folding checks llvm-svn: 179620	2013-04-16 19:56:55 +00:00
Richard Osborne	53ec25c8fa	[XCore] Convert a couple of tests to FileCheck. llvm-svn: 179619	2013-04-16 19:41:19 +00:00
Logan Chien	6f13ff357d	Implement ARM unwind opcode assembler. llvm-svn: 179591	2013-04-16 12:02:21 +00:00
Jakob Stoklund Olesen	b4edc00933	Add 64-bit multiply and divide instructions for SPARC v9. llvm-svn: 179582	2013-04-16 02:57:02 +00:00
Tom Stellard	bd67f8cd81	R600/SI: Emit config values in register value pairs. Instead of emitting config values in a predefined order, the code emitter will now emit a 32-bit register index followed by the 32-bit config value. llvm-svn: 179546	2013-04-15 17:51:35 +00:00
Tom Stellard	a44e2e18a1	R600/SI: Emit configuration value in the .AMDGPU.config ELF section llvm-svn: 179545	2013-04-15 17:51:30 +00:00
Tom Stellard	cb4468b00a	R600: Emit ELF formatted code rather than raw ISA. llvm-svn: 179544	2013-04-15 17:51:21 +00:00
Tim Northover	b5dc8bb136	Avoid outputting temporary test file into source tree. llvm-svn: 179532	2013-04-15 15:49:13 +00:00
Hal Finkel	371be65604	Fix PPC64 CR spill location for callee-saved registers This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame). So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation. In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32). llvm-svn: 179500	2013-04-15 02:07:05 +00:00
Jakob Stoklund Olesen	3b790b7f2e	Use i32 for all SPARC shift amounts, even in 64-bit mode. Test case by llvm-stress. llvm-svn: 179477	2013-04-14 05:48:50 +00:00
Jakob Stoklund Olesen	8fafe8cd31	Add support for the abs64 SPARC v9 code model. For when 16 TB just isn't enough. llvm-svn: 179474	2013-04-14 05:10:36 +00:00
Jakob Stoklund Olesen	d29f125f5b	Add support for the SPARC v9 abs44 code model. This is the default model for non-PIC 64-bit code. It supports text+data+bss linked anywhere in the low 16 TB of the address space. llvm-svn: 179473	2013-04-14 04:57:51 +00:00
Jakob Stoklund Olesen	b5173ad8fb	Also put target flags on SPARC constant pool references. Constant pool entries are accessed exactly the same way as global variables. llvm-svn: 179471	2013-04-14 04:35:16 +00:00
Jakob Stoklund Olesen	c23fada5f9	Fix patterns for 64-bit pointers. This fixes the pic32 code model for SPARC v9. llvm-svn: 179469	2013-04-14 01:53:23 +00:00
Jakob Stoklund Olesen	6182cc630a	Define SPARC code models. Currently, only abs32 and pic32 are implemented. Add a test case for abs32 with 64-bit code. 64-bit PIC code is currently broken. llvm-svn: 179463	2013-04-13 19:02:23 +00:00
Hal Finkel	978a847acb	Spill and restore PPC CR registers using the FP when we have one For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP). llvm-svn: 179457	2013-04-13 08:09:20 +00:00
Andrew Trick	2bd87ad8d4	Further generalize this scheduler test. The order of copies depends on queue order, which is not very stable. llvm-svn: 179456	2013-04-13 07:37:27 +00:00
Andrew Trick	fb2a8d10f8	Fix a dyslexic regex. llvm-svn: 179455	2013-04-13 07:29:21 +00:00
Andrew Trick	1ef71359cd	Add a missing REQUIRES: asserts llvm-svn: 179453	2013-04-13 06:12:46 +00:00
Andrew Trick	861493bc4f	MI-Sched: schedule physreg copies. The register allocator expects minimal physreg live ranges. Schedule physreg copies accordingly. This is slightly tricky when they occur in the middle of the scheduling region. For now, this is handled by rescheduling the copy when its associated instruction is scheduled. Eventually we may instead bundle them, but only if we can preserve the bundles as parallel copies during regalloc. llvm-svn: 179449	2013-04-13 06:07:40 +00:00
Akira Hatanaka	e0468ce3e1	[mips] Reapply r179420 and r179421. llvm-svn: 179434	2013-04-13 00:55:41 +00:00
Akira Hatanaka	b0b85e00d8	Revert r179420 and r179421. llvm-svn: 179422	2013-04-12 22:40:07 +00:00
Akira Hatanaka	737648f84c	[mips] Instruction selection patterns for carry-setting and using add instructions. llvm-svn: 179421	2013-04-12 22:24:52 +00:00
Akira Hatanaka	d809bc8eeb	[mips] v4i8 and v2i16 add, sub and mul instruction selection patterns. llvm-svn: 179420	2013-04-12 22:14:24 +00:00
Nico Rieck	1162bb7a1d	Replace coff-/elf-dump with llvm-readobj llvm-svn: 179361	2013-04-12 04:06:46 +00:00
Nadav Rotem	f96cc4976d	Fix the test on linux by setting the triple and the align format llvm-svn: 179354	2013-04-12 01:07:16 +00:00
Nadav Rotem	662256bafa	Add a flag to align all basic blocks in the function. When debugging performance regressions we often ask ourselves if the regression that we see is due to poor isel/sched/ra or due to some micro-architectural problem. When comparing two code sequences one good way to rule out front-end bottlenecks (and other issues) is to force code alignment. This pass adds a flag that forces the alignment of all of the basic blocks in the program. llvm-svn: 179353	2013-04-12 00:48:32 +00:00
Preston Gurd	8e4196dfe6	Use FileCheck instead of grep. llvm-svn: 179322	2013-04-11 21:39:01 +00:00
Jack Carter	834846d7a8	Mips specific inline asm memory operand modifier test case These changes are based on commit responses for r179135. llvm-svn: 179315	2013-04-11 19:39:19 +00:00
Eli Bendersky	0ce49fd520	Add a CHECK-NOT for a more faithful translation of the original grep \| count 2. Thanks to Reid Kleckner for catching this. llvm-svn: 179289	2013-04-11 14:43:19 +00:00
Benjamin Kramer	f15ba24b8d	Add missing colons to check lines. llvm-svn: 179277	2013-04-11 12:41:41 +00:00
Benjamin Kramer	4413e71a39	FileCheckize a bunch of tests. llvm-svn: 179276	2013-04-11 12:32:23 +00:00
Michael Liao	877d1576e6	Optimize vector select from all 0s or all 1s As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane, vector select could be simplified to AND/OR or removed if one or both values being selected is all 0s or all 1s. llvm-svn: 179267	2013-04-11 05:15:54 +00:00
Michael Liao	87125582e9	Enhance bool simplification in X86 to handle more cases This patch is revised based on patch from Victor Umansky <victor.umansky@intel.com>. More cases are handled in X86's bool simplification, i.e. - SETCC_CARRY - value is truncated to i1 with AND As a by-product, PR5443 is also fixed. llvm-svn: 179265	2013-04-11 04:43:09 +00:00
Eli Bendersky	90daaa543a	Rewrite some of the test/CodeGen/X86 tests to use FileCheck instead of grep llvm-svn: 179241	2013-04-10 23:30:20 +00:00
Hal Finkel	03d47320aa	Manually remove successors in if conversion when CopyAndPredicateBlock is used In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is used because the to-be-predicated block has other predecessors, we need to explicitly remove the old copied block from the successors list. Normally if conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges to clean up the successors list, but if the predicated block contained an un-analyzable branch (such as a now-predicated return), then this will fail. These extra successors were causing a problem on PPC because it was causing later passes (such as PPCEarlyReturn) to leave dead return-only basic blocks in the code. llvm-svn: 179227	2013-04-10 22:05:25 +00:00
Jack Carter	8e319ed798	Mips specific inline asm memory operand modifier test case These changes are based on commit responses for r179135. llvm-svn: 179225	2013-04-10 22:02:32 +00:00
Michel Danzer	c1562afdde	R600/SI: Add pattern for AMDGPUurecip 21 more little piglits with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 179186	2013-04-10 17:17:56 +00:00
Reed Kotler	68e5128508	This is for an experimental option -mips-os16. The idea is to compile all Mips32 code as Mips16 unless it can't be compiled as Mips 16. For now this would happen as long as floating point instructions are not needed. Probably it would also make sense to compile as mips32 if atomic operations are needed too. There may be other cases too. A module pass prescans the IR and adds the mips16 or nomips16 attribute to functions depending on the function's needs. Mips 16 mode can result in a 40% code compression by utilizing 16 bit encoding of many instructions. The hope is for this to replace the traditional gcc way of dealing with Mips16 code using floating point which involves essentially using soft float but with a library implemented using mips32 floating point. This gcc method also requires creating stubs so that Mips32 code can interact with these Mips 16 functions that have floating point needs. My conjecture is that in reality this traditional gcc method would never win over this new method. I will be implementing the traditional gcc method also. Some of it is already done but I needed to do the stubs to finish the work and those required this mips16/32 mixed mode capability. I have more ideas to make this new method much better and I think the old method will just live in llvm for anyone that needs the backward compatibility but I don't know for what reason that would be needed. llvm-svn: 179185	2013-04-10 16:58:04 +00:00
Vincent Lejeune	daa1e69206	R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr llvm-svn: 179174	2013-04-10 13:29:20 +00:00
Christian Konig	f40f671bab	R600/SI: dynamically figure out the reg class of MIMG Depending on the number of bits set in the writemask. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 179166	2013-04-10 08:39:16 +00:00
Christian Konig	76cd1a76c2	R600/SI: adjust writemask to only the used components Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 179165	2013-04-10 08:39:08 +00:00
Christian Konig	ffddac18a4	R600/SI: remove image sample writemask Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 179164	2013-04-10 08:39:01 +00:00
Evan Cheng	9f82233851	__sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in xmm0 / xmm1. rdar://13599493 llvm-svn: 179141	2013-04-10 01:26:07 +00:00
Jack Carter	03f8f98410	Mips specific inline asm operand modifier 'D' Modifier 'D' is to use the second word of a double integer. We had previously implemented the pure register variant of the modifier and this patch implements the memory reference. #include "stdio.h" int b[8] = {0,1,2,3,4,5,6,7}; void main() { int i; // The first word. Notice, no 'D' {asm ( "lw %0,%1;" : "=r" (i) : "m" ((b+4)) );} printf("%d\n",i); // The second word {asm ( "lw %0,%D1;" : "=r" (i) : "m" ((b+4)) );} printf("%d\n",i); } llvm-svn: 179135	2013-04-09 23:19:50 +00:00
Hal Finkel	8b05494b58	Allow PPC B and BLR to be if-converted into some predicated forms This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. llvm-svn: 179134	2013-04-09 22:58:37 +00:00

7457 Commits