llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Bill Wendling	be9af41475	Emit an error message if the value passed to __builtin_returnaddress isn't a constant. __builtin_returnaddress requires that the value passed into it be a constant. However, at -O0 even a constant expression may not be converted to a constant. Emit an error message instead of crashing. llvm-svn: 198531	2014-01-05 01:47:20 +00:00
Venkatraman Govindaraju	bfcef24b91	[SparcV9]: Implement RETURNADDR and FRAMEADDR lowering in SPARC64. Fixes PR18356. llvm-svn: 198480	2014-01-04 07:17:21 +00:00
Quentin Colombet	23080225fa	[RegAlloc] Make tryInstructionSplit less aggressive. The greedy register allocator tries to split a live-range around each instruction where it is used or defined to relax the constraints on the entire live-range (this is a last chance split before falling back to spill). The goal is to have a big live-range that is unconstrained (i.e., that can use the largest legal register class) and several small local live-range that carry the constraints implied by each instruction. E.g., Let csti be the constraints on operation i. V1= op1 V1(cst1) op2 V1(cst2) V1 live-range is constrained on the intersection of cst1 and cst2. tryInstructionSplit relaxes those constraints by aggressively splitting each def/use point: V1= V2 = V1 V3 = V2 op1 V3(cst1) V4 = V2 op2 V4(cst2) Because of how the coalescer infrastructure works, each new variable (V3, V4) that is alive at the same time as V1 (or its copy, here V2) interfere with V1. Thus, we end up with an uncoalescable copy for each split point. To make tryInstructionSplit less aggressive, we check if the split point actually relaxes the constraints on the whole live-range. If it does not, we do not insert it. Indeed, it will not help the global allocation problem: - V1 will have the same constraints. - V1 will have the same interference + possibly the newly added split variable VS. - VS will produce an uncoalesceable copy if alive at the same time as V1. <rdar://problem/15570057> llvm-svn: 198369	2014-01-02 22:47:22 +00:00
Rafael Espindola	4b38156778	Make the ARM ABI selectable via SubtargetFeature. This patch makes it possible to select the ABI with -mattr. It will be used to forward clang's -target-abi option to llvm's CodeGen. llvm-svn: 198304	2014-01-02 13:40:08 +00:00
Venkatraman Govindaraju	cb22135b23	[Sparc] Handle atomic loads/stores in sparc backend. llvm-svn: 198286	2014-01-01 22:11:54 +00:00
Venkatraman Govindaraju	e8745ffca1	[SparcV9]: Custom lower UMULO/SMULO so that the arguments are sent to __multi3() in the correct order. llvm-svn: 198281	2014-01-01 20:22:45 +00:00
Venkatraman Govindaraju	2fc6090f42	[SparcV9]: Use SRL instead of SLL to clear top 32-bits in ctpop:i32. SLL does not clear top 32 bit, only SRL does. llvm-svn: 198280	2014-01-01 19:00:10 +00:00
Elena Demikhovsky	7174584583	AVX-512: Added intrinsics for vcvt, vcvtt, vrndscale, vcmp. Printing rounding control. Encoding for EVEX_RC (rounding control). llvm-svn: 198277	2014-01-01 15:12:34 +00:00
Jiangning Liu	583b8a7116	For AArch64 Neon, simplify scalar dup by lane0 for fp. llvm-svn: 198194	2013-12-30 02:44:35 +00:00
Hao Liu	ab32d54fad	[AArch64]Add code to spill/fill Q register tuples such as QPair/QTriple/QQuad. llvm-svn: 198193	2013-12-30 02:38:12 +00:00
Hao Liu	8bef865160	[AArch64]Can't select shift left 0 of type v1i64 llvm-svn: 198192	2013-12-30 02:12:46 +00:00
Kevin Qin	cbb0be4bee	Fix a bug in DAGcombiner about zero-extend after setcc. For AArch64 backend, if DAGCombiner see "sext(setcc)", it will combine them together to a single setcc with extended value type. Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1, DAGcombiner will create wrong node and get wrong code emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Hao Liu	e8d49c2088	[AArch64]Fix the problem that can't select mul of v1i64/v2i64 types. E.g. Can't select such IR: %tmp = mul <2 x i64> %a, %b llvm-svn: 198188	2013-12-30 01:38:41 +00:00
Bill Wendling	984fb2bf17	Un-XFAILify some tests which are now passing. llvm-svn: 198184	2013-12-29 23:09:14 +00:00
Venkatraman Govindaraju	451c278cbc	[SparcV9] Use separate instruction patterns for 64 bit arithmetic instructions instead of reusing 32 bit instruction patterns. This is done to avoid spilling the result of the 64-bit instructions to a 4-byte slot. llvm-svn: 198157	2013-12-29 07:15:09 +00:00
Venkatraman Govindaraju	d46a491054	[SparcV9] For codegen generated library calls that return float, set inreg flag manually in LowerCall(). This makes the sparc backend to generate Sparc64 ABI compliant code. llvm-svn: 198149	2013-12-29 04:27:21 +00:00
Venkatraman Govindaraju	05510dd426	[SparcV9]: Implement lowering of long double (fp128) arguments in Sparc64 ABI. Also, pass fp128 arguments to varargs through integer registers if necessary. llvm-svn: 198145	2013-12-29 01:20:36 +00:00
Andrew Trick	ed2d925c84	New machine model for cortex-a9. Schedule for resources and latency. Schedule more conservatively to account for stalls on floating point resources and latency. Use the AGU resource to model latency stalls since it's shared between FP and LD/ST instructions. This might not be completely accurate but should work well in practice. llvm-svn: 198125	2013-12-28 21:57:05 +00:00
NAKAMURA Takumi	a213dd2dcc	llvm/test/CodeGen/X86/vselect.ll: Unbreak Windows x64 targets to add -mtriple=x86_64-unknown-unknown. llvm-svn: 198114	2013-12-28 13:04:29 +00:00
Andrea Di Biagio	b2f4969e98	[X86] Teach the backend how to fold target specific dag node for packed vector shift by immediate count (VSHLI/VSRLI/VSRAI) into a build_vector when the vector in input to the shift is a build_vector of all constants or UNDEFs. Target specific nodes for packed shifts by immediate count are in general introduced by function 'getTargetVShiftByConstNode' (in X86ISelLowering.cpp) when lowering shift operations, SSE/AVX immediate shift intrinsics and (only in very few cases) SIGN_EXTEND_INREG dag nodes. This patch adds extra rules for simplifying vector shifts inside function 'getTargetVShiftByConstNode'. Added file test/CodeGen/X86/vec_shift5.ll to verify that packed shifts by immediate are correctly folded into a build_vector when the input vector to the shift dag node is a vector of constants or undefs. llvm-svn: 198113	2013-12-28 11:11:52 +00:00
Andrea Di Biagio	86fc6e8bd5	Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR of ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR. For example, given the following sequence of dag nodes: i32 C = Constant<1> v4i32 V = BUILD_VECTOR C, C, C, C v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1 The SIGN_EXTEND_INREG node can be folded into a build_vector since the vector in input is a BUILD_VECTOR of constants. The optimized sequence is: i32 C = Constant<-1> v4i32 Result = BUILD_VECTOR C, C, C, C llvm-svn: 198084	2013-12-27 20:20:28 +00:00
Venkatraman Govindaraju	8c2d10768d	[Sparc] Lower and MachineInstr to MC and print assembly using MCInstPrinter. llvm-svn: 198030	2013-12-26 01:49:59 +00:00
Simon Atanasyan	f306a50db4	[Mips] Does not take in account 'use-soft-float' attribute's value when consider to generate stubs for mips16 hard-float mode. The patch reviewed by Reed Kotler. llvm-svn: 198019	2013-12-25 17:00:27 +00:00
Hao Liu	8ed49e0c42	[AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect. E.g. the codegen result is fmls v1.2s, v0.2s, v2.s[3] which is expected to be fmls v0.2s, v1.2s, v2.s[3] llvm-svn: 198001	2013-12-25 07:12:34 +00:00
Jiangning Liu	ca1d69d4c2	Add missing pattern matches to support ACLE intrinsics of AArch64 NEON. llvm-svn: 197993	2013-12-25 01:22:51 +00:00
Richard Sandiford	99ae48f5bb	[SystemZ] Use interlocked-access 1 instructions for CodeGen ...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND EXCLUSIVE OR. LOAD AND ADD LOGICAL isn't really separately useful for LLVM. I'll look at adding reusing the CC results in new year. llvm-svn: 197985	2013-12-24 15:18:04 +00:00
Elena Demikhovsky	2d23dc9650	AVX-512: fixed some patterns for MVT::i1 llvm-svn: 197981	2013-12-24 14:24:07 +00:00
Hao Liu	8ef969c4a0	[AArch64]Add patterns to match normal shift nodes: shl, sra and srl. llvm-svn: 197969	2013-12-24 09:00:21 +00:00
Kevin Qin	3993f1cd71	[AArch64 NEON] Fix a bug when lowering BUILD_VECTOR. DAG.getVectorShuffle() doesn't always return a vector_shuffle node. If the mask is the exact sequence of its operand (for example, operand_0 is v8i8, and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly return that operand. So a check is added here. llvm-svn: 197967	2013-12-24 08:16:06 +00:00
Kevin Qin	8f86911897	[AArch64 NEON] Fix a pattern match failure with NEON_VDUP. This failure caused by improper condition when lowering shuffle_vector to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not be generated. llvm-svn: 197966	2013-12-24 08:11:47 +00:00
Ana Pazos	85f191fc73	[AArch64] Check fmul node single use in fused multiply patterns. Check for single use of fmul node in fused multiply patterns to allow generation of fused multiply add/sub instructions. Otherwise fmul operation ends up being repeated more than once which does not help performance on targets with only one MAC unit, as for example cortex-a53. llvm-svn: 197929	2013-12-24 00:47:29 +00:00
Ana Pazos	8821a9ef6b	[AArch64 NEON] Fixed fused multiply negate add/sub patterns The correct pattern matching should be: - fnmadd is (-Ra) + (-Rn)Rm which should be matched as: fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm)))) - fnmsub is (-Ra) + RnRm which should be matched as fma node:$Rn, node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra)))) llvm-svn: 197928	2013-12-24 00:40:10 +00:00
Hao Liu	3ae1e13884	[AArch64]The compare to zero intrinsics should be implemented by 'icmp/fcmp' and 'sext' not 'zext'. Modify the test cases. llvm-svn: 197897	2013-12-23 02:42:10 +00:00
Elena Demikhovsky	39275c48ca	AVX512: SETCC returns i1 for AVX-512 and i8 for all others llvm-svn: 197876	2013-12-22 10:13:18 +00:00
Roman Divacky	513296cd04	Implement initial-exec TLS for PPC32. llvm-svn: 197824	2013-12-20 18:08:54 +00:00
Richard Sandiford	8daaabe4c3	[SystemZ] Optimize comparisons with truncated extended loads If the extension of a loaded value is compared against zero and used in other arithmetic, InstCombine will change the comparison to use the unextended load. It's also possible that the comparison could be against the unextended load from the outset. In DAG form this becomes a truncation of an extending load. We want to strip the truncation if possible so that we can use load-and-test instructions. llvm-svn: 197804	2013-12-20 11:56:02 +00:00
Richard Sandiford	48a0b2f8e3	[SystemZ] Extend RISBG optimization The handling of ANY_EXTEND and ZERO_EXTEND was too strict. In this context we can treat ZERO_EXTEND in much the same way as an AND and then also handle outermost ZERO_EXTENDs. I couldn't find a test that benefited from the ANY_EXTEND change, but it's more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND are handled differently. llvm-svn: 197802	2013-12-20 11:49:48 +00:00
Tom Stellard	b39ac07c09	R600: Allow ftrunc v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc v3: move ftrunc pattern next to TRUNC definition, it's available since R600 Patch By: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 197783	2013-12-20 05:11:55 +00:00
Quentin Colombet	884367d931	[X86][fast-isel] Fix select lowering. The condition in selects is supposed to be i1. Make sure we are just reading the less significant bit of the 8 bits width value to match this constraint. <rdar://problem/15651765> llvm-svn: 197712	2013-12-19 18:32:04 +00:00
Josh Magee	f3c5790260	Unbreak ARM buildbots after r197653 by forcing the target triple on this test. llvm-svn: 197709	2013-12-19 18:14:42 +00:00
Rafael Espindola	64a77ceb5f	Add a triple so that this passes on OS X. I am surprised I am the first one to notice this. llvm-svn: 197689	2013-12-19 16:06:33 +00:00
NAKAMURA Takumi	e67a3fe0ef	Add REQUIRES:asserts to 3 tests in llvm/test/CodeGen/R600 added in r192212. They are failing in assertions. llvm-svn: 197669	2013-12-19 10:41:12 +00:00
Matt Arsenault	e64331a159	R600/SI: Make private pointers be 32-bit. Different sized address spaces should theoretically work most of the time now, and since 64-bit add is currently disabled, using more 32-bit pointers fixes some cases. llvm-svn: 197659	2013-12-19 05:32:55 +00:00
Josh Magee	86d29cffa7	[stackprotector] Use analysis from the StackProtector pass for stack layout in PEI and LocalStackSlot passes. This changes the MachineFrameInfo API to use the new SSPLayoutKind information produced by the StackProtector pass (instead of a boolean flag) and updates a few pass dependencies (to preserve the SSP analysis). The stack layout follows the same approach used prior to this change - i.e., only LargeArray stack objects will be placed near the canary and everything else will be laid out normally. After this change, structures containing large arrays will also be placed near the canary - a case previously missed by the old implementation. Out of tree targets will need to update their usage of MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. The next patch will implement the rules for sspstrong and sspreq. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D2158 llvm-svn: 197653	2013-12-19 03:17:11 +00:00
Reid Kleckner	f795c3e4a9	Begin adding docs and IR-level support for the inalloca attribute The inalloca attribute is designed to support passing C++ objects by value in the Microsoft C++ ABI. It behaves the same as byval, except that it always implies that the argument is in memory and that the bytes are never copied. This attribute allows the caller to take the address of an outgoing argument's memory and execute arbitrary code to store into it. This patch adds basic IR support, docs, and verification. It does not attempt to implement any lowering or fix any possibly broken transforms. When this patch lands, a complete description of this feature should appear at http://llvm.org/docs/InAlloca.html . Differential Revision: http://llvm-reviews.chandlerc.com/D2173 llvm-svn: 197645	2013-12-19 02:14:12 +00:00
Reed Kotler	012c0a0f79	Fix a problem with mips16 stubs when calls are transformed during tail call optimization. Some more work may be needed for indirect calls but this patch fixes the current regression in Prolangc++/trees. S2 optimization as part of the general cleanup and optimization of prolog and epilog was not saving S2 in this case and needed to. llvm-svn: 197630	2013-12-18 23:57:48 +00:00
Andrew Trick	e73fd60399	Revert "Add -mcpu=z10 to SystemZ tests." This reverts commit r197466. The MachineCSE fix that required the -mcpu flag has been disabled until more work can be done to fix downstream issues. Adding -mcpu wasn't the right workaround anyway. llvm-svn: 197624	2013-12-18 23:04:37 +00:00
Weiming Zhao	628bf03d65	[aarch32] fix bug 18268: Incorrect condition of vsel Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for such selection, it needs to inverse cc and swap op1 and op2. To inverse cc, both L/G and E bits should be flipped. llvm-svn: 197615	2013-12-18 22:25:17 +00:00
Rafael Espindola	6dc5fe883a	Correctly handle the degenerated triple "thumb". Fixes a crash in llc where some parts think the target is thumb and others think it is ARM. llvm-svn: 197607	2013-12-18 21:29:44 +00:00
Rafael Espindola	e1792e72e1	On ppc32-darwin, an i64 inside a structure can have 32-bit alignment. Thanks to Iain Sandoe for testing this with the original gcc. Clang was already getting this right. llvm-svn: 197572	2013-12-18 14:35:37 +00:00


8867 Commits