llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-30 23:42:52 +01:00

Author	SHA1	Message	Date
Elena Demikhovsky	139f25ed2c	AVX-512: implemented extractelement with variable index. Added parsing of mask register and "zeroing" semantic, like {%k1} {z}. llvm-svn: 190595	2013-09-12 08:55:00 +00:00
Hal Finkel	6164109851	PPC: Enable aggressive anti-dependency breaking Aggressive anti-dependency breaking is enabled by default for all PPC cores. This provides a general speedup on the P7 and other platforms (among other factors, the instruction group formation for the non-embedded PPC cores is done during post-RA scheduling). In order to do this safely, the incompatibility between uses of the MFOCRF instruction and anti-dependency breaking is resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed FIXME, the problem was that MFOCRF's output is sensitive to the identity of the source register, and always paired with a shift to undo this effect. Because anti-dependency breaking is unaware of this hidden dependency of the shift amount on the source register of the MFOCRF instruction, changing that register must be inhibited. Two test cases were adjusted: The SjLj test was made more insensitive to register choices and scheduling; the saveCR test disabled anti-dependency breaking because part of what it is testing is proper register reuse. llvm-svn: 190587	2013-09-12 05:24:49 +00:00
Tom Stellard	6a507da088	R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist. The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take a resource descriptor might be nicer. The maximum number of input SGPRs is bumped to 17. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190575	2013-09-12 02:55:14 +00:00
Bill Wendling	911391c864	Try to fix the atom buildbots by adding an explicit 'cpu' to the 'llc' command. llvm-svn: 190541	2013-09-11 19:06:04 +00:00
Daniel Sanders	16a6e0ac3d	[mips][msa] Added test cases that were supposed to be part of r190507, r190509, r190512, and r190518. llvm-svn: 190522	2013-09-11 12:39:25 +00:00
Daniel Sanders	e3f2e5de18	[mips][msa] Added support for matching mulv, nlzc, sll, sra, srl, and subv from normal IR (i.e. not intrinsics) llvm-svn: 190518	2013-09-11 11:58:30 +00:00
Daniel Sanders	96466ff8b4	[mips][msa] Added support for matching fadd, fdiv, flog2, fmul, frint, fsqrt, and fsub from normal IR (i.e. not intrinsics) llvm-svn: 190512	2013-09-11 10:51:30 +00:00
Daniel Sanders	a52c7f09dc	[mips][msa] Added support for matching div_[su] from normal IR (i.e. not intrinsics) llvm-svn: 190509	2013-09-11 10:38:58 +00:00
Daniel Sanders	f68b00e629	[mips][msa] Added support for matching addv from normal IR (i.e. not intrinsics) The corresponding intrinsic is now lowered into equivalent IR (ISD::ADD) before instruction selection. llvm-svn: 190507	2013-09-11 10:28:16 +00:00
Daniel Sanders	534d28aa11	[mips][msa] Corrected the definition of the dotp_[su].[hwd] intrinsics The elements of the operands should be half the width of the elements of the result. llvm-svn: 190505	2013-09-11 09:59:17 +00:00
Richard Sandiford	bfcf129b8e	[SystemZ] Add TM and TMY The main complication here is that TM and TMY (the memory forms) set CC differently from the register forms. When the tested bits contain some 0s and some 1s, the register forms set CC to 1 or 2 based on the value of the uppermost bit. The memory forms instead set CC to 1 regardless of the uppermost bit. Until now, I've tried to make it so that a branch never tests for an impossible CC value. E.g. NR only sets CC to 0 or 1, so branches on the result will only test for 0 or 1. Originally I'd tried to do the same thing for TM and TMY by using custom matching code in ISelDAGToDAG. That ended up being very ugly though, and would have meant duplicating some of the chain checks that the common isel code does. I've therefore gone for the simpler alternative of adding an extra operand to the TM DAG opcode to say whether a memory form would be OK. This means that the inverse of a "TM;JE" is "TM;JNE" rather than the more precise "TM;JNLE", just like the inverse of "TMLL;JE" is "TMLL;JNE". I suppose that's arguably less confusing though... llvm-svn: 190400	2013-09-10 10:20:32 +00:00
Daniel Sanders	32227b7995	[mips][msa] Removed unsupported dot product instructions (dotp_[su].b) The dotp_[su].b instructions never existed in any revision of the MSA spec. llvm-svn: 190398	2013-09-10 09:51:43 +00:00
Bill Wendling	5e97475233	Another attempt to fix windows buildbots. llvm-svn: 190350	2013-09-09 20:29:32 +00:00
Bill Wendling	5dadcab742	Attempt to fix buildbots by giving an explicit output to the llvm-mc command. llvm-svn: 190349	2013-09-09 20:22:38 +00:00
Bill Wendling	10cf877e75	Expand test to make sure that we can generate compact unwind from an ASM file. llvm-svn: 190348	2013-09-09 20:12:36 +00:00
Bill Wendling	a2bc7420c8	Expand test to make sure that we can generate compact unwind from an ASM file. llvm-svn: 190347	2013-09-09 20:10:54 +00:00
Joey Gouly	03af45ccfe	[ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode. IT blocks can only be one instruction long, and can only contain a subset of the 16 instructions. Patch by Artyom Skrobov! llvm-svn: 190309	2013-09-09 14:21:49 +00:00
Robert Lytton	b73a61715b	XCore handling of thread local lowering Fix XCoreLowerThreadLocal trying to initialise globals which have no initializer. Add handling of const expressions containing thread local variables. These need to be replaced with instructions, as the thread ID is used to access the thread local variable. llvm-svn: 190300	2013-09-09 10:42:11 +00:00
Robert Lytton	dc8d32008e	XCore target: change to Sched::Source This sidesteps a bug in PrescheduleNodesWithMultipleUses() which does not check if callResources will be affected by the transformation. llvm-svn: 190299	2013-09-09 10:42:05 +00:00
Robert Lytton	4a5772968b	XCore target: fix weak linkage attribute handling llvm-svn: 190298	2013-09-09 10:41:57 +00:00
Bill Wendling	2c532e9c9b	Generate compact unwind encoding from CFI directives. We used to generate the compact unwind encoding from the machine instructions. However, this had the problem that if the user used `-save-temps' or compiled their hand-written `.s' file (with CFI directives), we wouldn't generate the compact unwind encoding. Move the algorithm that generates the compact unwind encoding into the MCAsmBackend. This way we can generate the encoding whether the code is from a `.ll' or `.s' file. <rdar://problem/13623355> llvm-svn: 190290	2013-09-09 02:37:14 +00:00
Jiangning Liu	b2cc9767e4	Implement aarch64 neon instruction set AdvSIMD (3V Diff), covering the following 26 instructions, SADDL, UADDL, SADDW, UADDW, SSUBL, USUBL, SSUBW, USUBW, ADDHN, RADDHN, SABAL, UABAL, SUBHN, RSUBHN, SABDL, UABDL, SMLAL, UMLAL, SMLSL, UMLSL, SQDMLAL, SQDMLSL, SMULL, UMULL, SQDMULL, PMULL llvm-svn: 190288	2013-09-09 02:20:27 +00:00
Manman Ren	edc0da266c	Debug Info Testing: use null instead of an empty string in context field. llvm-svn: 190284	2013-09-09 00:12:17 +00:00
Manman Ren	fa420c3e35	Debug Info Testing: update context from empty string to null. Context should be either null or MDNode. llvm-svn: 190267	2013-09-08 03:11:54 +00:00
Akira Hatanaka	3fb22c57eb	[mips] Fix typos. llvm-svn: 190236	2013-09-07 01:14:42 +00:00
Akira Hatanaka	3eef445630	[mips] Enhance command line option "-mno-ldc1-sdc1" to expand base+index double precision loads and stores as well as reg+imm double precision loads and stores. Previously, expansion of loads and stores was done after register allocation, but now it takes place during legalization. As a result, users will see double precision stores and loads being emitted to spill and restore 64-bit FP registers. llvm-svn: 190235	2013-09-07 00:52:30 +00:00
Akira Hatanaka	b84769b3d7	[mips] Set instruction itineraries of loads, stores and conditional moves. llvm-svn: 190219	2013-09-06 23:28:24 +00:00
Manman Ren	450526b5a9	Debug Info Testing: updated to use NULL instead of "i32 0" in a few fields. Field 2 of DIType (Context), field 9 of DIDerivedType (TypeDerivedFrom), field 12 of DICompositeType (ContainingType), fields 2, 7, 12 of DISubprogram (Context, Type, ContainingType). llvm-svn: 190205	2013-09-06 21:03:58 +00:00
Aaron Watry	e4512c5eff	R600: Add support for LDS atomic subtract Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 190200	2013-09-06 20:17:42 +00:00
Manman Ren	3a3457bef0	Debug Info Testing: Updated to use null instead of "i32 0" for containing-type field of DICompositeType. This will help the follow-on patch of using DITypeRef for containing-type field. llvm-svn: 190187	2013-09-06 18:13:59 +00:00
Tim Northover	5e921518f7	SelectionDAG: create correct BooleanContent constants Occasionally DAGCombiner can spot that a SETCC operation is completely redundant and reduce it to "all true" or "all false". If this happens to a vector, the value produced has to take account of what a normal comparison would have produced, which may be an all-1s bitmask. The fix in SelectionDAG.cpp is tested, however, as far as I can see the code in TargetLowering.cpp is possibly unreachable and almost certainly irrelevant when triggered so there are no tests. However, I believe it's still clearly the right change and may save someone else some hassle if it suddenly becomes reachable. So I'm doing it anyway. llvm-svn: 190147	2013-09-06 12:38:12 +00:00
Richard Sandiford	8d6edc5218	[SystemZ] Tweak integer comparison code The architecture has many comparison instructions, including some that extend one of the operands. The signed comparison instructions use sign extensions and the unsigned comparison instructions use zero extensions. In cases where we had a free choice between signed or unsigned comparisons, we were trying to decide at lowering time which would best fit the available instructions, taking things like extension type into account. The code to do that was getting increasingly hairy and was also making some bad decisions. E.g. when comparing the result of two LLCs, it is better to use CR rather than CLR, since CR can be fused with a branch while CLR can't. This patch removes the lowering code and instead adds an operand to integer comparisons to say whether signed comparison is required, whether unsigned comparison is required, or whether either is OK. We can then leave the choice of instruction up to the normal isel code. llvm-svn: 190138	2013-09-06 11:51:39 +00:00
Richard Sandiford	ea5b4917b9	[SystemZ] Use XC for a memset of 0 llvm-svn: 190130	2013-09-06 10:25:07 +00:00
Matt Arsenault	f658af2617	Teach CodeGenPrepare about address spaces llvm-svn: 190112	2013-09-06 00:18:43 +00:00
Juergen Ributzka	4554c8ed29	[X86] Perform VSELECT DAG combines also before DAG type legalization. If the DAG already has only legal types, then the second round of DAG combines is skipped. In this case VSELECT+SETCC patterns that match a more efficient instruction (e.g. min/max) are never recognized. This fix allows VSELECT+SETCC combines if the types are already legal before DAG type legalization. Reviewer: Nadav llvm-svn: 190105	2013-09-05 23:02:56 +00:00
Matt Arsenault	071be273be	R600: Fix i64 to i32 trunc on SI llvm-svn: 190091	2013-09-05 19:41:10 +00:00
Tom Stellard	ce0432a0c3	R600: Add support for local memory atomic add llvm-svn: 190080	2013-09-05 18:38:09 +00:00
Tom Stellard	6c1db18560	R600: Expand SELECT nodes rather than custom lowering them llvm-svn: 190079	2013-09-05 18:38:03 +00:00
Tom Stellard	8f7c5a681a	R600: Fix incorrect LDS size calculation GlobalAddress nodes that appeared in more than one basic block were being counted twice. llvm-svn: 190078	2013-09-05 18:37:57 +00:00
Tom Stellard	d2fff2dd99	R600/SI: Don't emit S_WQM_B64 instruction for compute shaders llvm-svn: 190077	2013-09-05 18:37:52 +00:00
Joey Gouly	071ca2ff6d	[ARMv8] Implement the new DMB/DSB operands. This removes the custom ISD Node: MEMBARRIER and replaces it with an intrinsic. llvm-svn: 190055	2013-09-05 15:35:24 +00:00
Tilmann Scheller	31cc184566	Reverting 190043 for now. Solution is not sufficient to prevent 'mov pc, lr' being emitted for jump table code. Test case doesn't trigger the added functionality. llvm-svn: 190047	2013-09-05 11:59:43 +00:00
Tilmann Scheller	14c2ce0a1e	ARM: Add GPR register class excluding LR for use with the ADR instruction. This improves code generation for jump tables by avoiding the emission of "mov pc, lr" which could fool the processor into believing this is a return from a function causing mispredicts. The code generation logic for jump tables uses ADR to materialize the address of the jump target. Patch by Daniel Stewart! llvm-svn: 190043	2013-09-05 11:10:31 +00:00
Richard Sandiford	399318ba38	[SystemZ] Add NC, OC and XC For now these are just used to handle scalar ANDs, ORs and XORs in which all operands are memory. llvm-svn: 190041	2013-09-05 10:36:45 +00:00
Venkatraman Govindaraju	b3ea970660	[Sparc] Correctly handle call to functions with ReturnsTwice attribute. In sparc, setjmp stores only the registers %fp, %sp, %i7 and %o7. longjmp restores the stack, and the callee-saved registers (all local/in registers: %i0-%i7, %l0-%l7) using the stored %fp and register windows. However, this does not guarantee that the longjmp will restore the registers as they were when the setjmp was called. This is because these registers may be clobbered after returning from setjmp, but before calling longjmp. This patch prevents the registers %i0-%i5, %l0-%l7 from being live across the setjmp call using the register mask. llvm-svn: 190033	2013-09-05 05:32:16 +00:00
Andrew Trick	330821bce0	mi-sched: Force bottom up scheduling for generic targets. Fast register pressure tracking currently only takes effect during bottom up scheduling. Forcing this is a bit faster and simpler for targets that don't have many scheduling constraints and don't need top-down scheduling. llvm-svn: 190014	2013-09-04 23:54:00 +00:00
Eric Christopher	fd11e8a82d	Expand and rewrite comment. llvm-svn: 189998	2013-09-04 21:23:23 +00:00
Arnold Schwaighofer	22610acac7	Change swift/vldm test case to be less dependent on allocation order 'Force' values in registers using the calling convention. Now, we only depend on the calling convention and that the allocator performs copy coalescing. llvm-svn: 189985	2013-09-04 20:51:06 +00:00
Vincent Lejeune	4fd20e35e6	R600: Use shared op optimization when checking cycle compatibility llvm-svn: 189981	2013-09-04 19:53:54 +00:00
Vincent Lejeune	4a8c23c168	R600: Non vector only instruction can be scheduled on trans unit llvm-svn: 189980	2013-09-04 19:53:46 +00:00
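
The [SystemZ] TM and TMY entry (r190400) in the log above describes how the memory and register forms of the test-under-mask instructions set the condition code differently when the tested bits are mixed. A minimal Python sketch of that rule follows. The function name `tm_condition_code` is hypothetical, and the CC values 0 (all tested bits zero) and 3 (all tested bits one) are assumptions based on the usual z/Architecture convention; the commit message only states the mixed case.

```python
def tm_condition_code(value: int, mask: int, memory_form: bool) -> int:
    """Model the CC produced by a test-under-mask of `value` against `mask`.

    Hypothetical helper for illustration only; it is not LLVM code.
    """
    selected = value & mask
    if selected == 0:
        return 0  # assumption: all tested bits are zero (or the mask is empty)
    if selected == mask:
        return 3  # assumption: all tested bits are one
    # Mixed case: the tested bits contain some 0s and some 1s.
    if memory_form:
        return 1  # memory forms (TM, TMY): CC 1 regardless of the uppermost bit
    # Register forms: CC depends on the uppermost tested bit.
    uppermost = 1 << (mask.bit_length() - 1)
    return 2 if value & uppermost else 1


if __name__ == "__main__":
    # Mixed bits with the uppermost tested bit set.
    print(tm_condition_code(0b1000, 0b1100, memory_form=False))
    print(tm_condition_code(0b1000, 0b1100, memory_form=True))
```

Under this model a branch on the memory form never needs to test for CC 2, which is why the commit adds an operand saying whether a memory form is acceptable.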


8140 Commits
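
The SelectionDAG BooleanContent entry (r190147) in the log above notes that when a redundant SETCC is folded to "all true", the constant has to match what a normal comparison would have produced, which for vectors may be an all-1s bitmask. The sketch below models that choice in Python. The enum members loosely mirror the kinds of boolean content the commit refers to, but the class, the function `true_constant`, and the handling of the undefined case are illustrative assumptions rather than LLVM's implementation.

```python
from enum import Enum


class BooleanContent(Enum):
    # Hypothetical model of how a target represents comparison results.
    UNDEFINED = 0
    ZERO_OR_ONE = 1
    ZERO_OR_NEGATIVE_ONE = 2


def true_constant(content: BooleanContent, bits: int) -> int:
    """Return the unsigned bit pattern representing 'true' for an element
    that is `bits` wide under the given boolean-content convention."""
    if content is BooleanContent.ZERO_OR_NEGATIVE_ONE:
        return (1 << bits) - 1  # all-1s bitmask, i.e. -1 in two's complement
    # Assumption: treat the undefined case like zero-or-one for this sketch.
    return 1


if __name__ == "__main__":
    print(hex(true_constant(BooleanContent.ZERO_OR_NEGATIVE_ONE, 32)))
    print(true_constant(BooleanContent.ZERO_OR_ONE, 32))
```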