llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Changpeng Fang	fc2b5c3fe3	AMDGPU/SI: Define S_GETREG Intrinsic Summary: Define s_getreg intrinsic to generate s_getreg instruction to read hardware registers. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17892 llvm-svn: 263124	2016-03-10 16:47:15 +00:00
Saleem Abdulrasool	b9035e7dcd	ARM: follow up improvements for SVN r263118 The initial change was insufficiently complete for always getting the semantics of __builtin_longjmp correct. The builtin is translated into a `tInt_eh_sjlj_longjmp` DAG node. This node set R7 as clobbered. However, the code would then follow up with a clobber of R11. I had failed to notice the imp-def,kill on R7 in the isel. Unfortunately, it seems that it is not possible to conditionalise the Defs list via an !if. Instead, construct a new parallel WIN node and prefer that when targeting windows. This ensures that we now both correctly model the __builtin_longjmp as well as construct the frame in a more ABI conformant manner. llvm-svn: 263123	2016-03-10 16:26:37 +00:00
David L Kreitzer	b2fd0878a4	Unified the handling of returns in the X87 stackifier so that the stackifier runs successfully on routines containing IRETs. This fixes PR26410. Differential Revision: http://reviews.llvm.org/D17643 llvm-svn: 263120	2016-03-10 15:14:02 +00:00
Saleem Abdulrasool	33390fc262	ARM: correct __builtin_longjmp on WoA WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast to AAPCS which states r7. This adjusts __builtin_longjmp to not clobber r7 and to properly restore the frame pointer on execution. llvm-svn: 263118	2016-03-10 15:11:09 +00:00
Elena Demikhovsky	2f71b82e0e	AVX-512: Fixed a bug in i1 vector zero extending. (Skylake-avx512) (failed on instruction selection phase) Differential Revision: http://reviews.llvm.org/D17924 llvm-svn: 263111	2016-03-10 13:44:22 +00:00
Simon Pilgrim	57e48aebb3	[X86][AVX] Improve target shuffle combining of BLEND+zero The BLEND+zero combine was failing to combine equivalent BLEND masks. Follow up to D17483 and D17858 llvm-svn: 263105	2016-03-10 11:50:15 +00:00
Simon Pilgrim	4fad44ad43	[X86][SSE] Basic combining of unary target shuffles of binary target shuffles. This patch reorders the combining of target shuffle masks so that when a unary shuffle takes a binary shuffle as its input but only references one of its inputs it can correctly combine into a unary shuffle mask. This is starting to encroach on the purpose of resolveTargetShuffleInputs, but I don't want to remove it until we definitely know we won't need it for full binary shuffle combining. There is a lot more work before we can properly support binary target shuffle masks but this was an easy case to add support for. Differential Revision: http://reviews.llvm.org/D17858 llvm-svn: 263102	2016-03-10 11:23:51 +00:00
Elena Demikhovsky	ef21e26acb	AVX-512: Fixed a bug in shuffle for v64i8 type Operation SCALAR_TO_VECTOR for v64i8 and v32i16 should be lowered if BW feature is "on". Differential Revision: http://reviews.llvm.org/D17994 llvm-svn: 263097	2016-03-10 08:32:09 +00:00
Roman Levenstein	26a5894f0f	Add support for a preserve_most calling convention to the AArch64 backend. This change adds support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add tail-call support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092	2016-03-10 04:35:09 +00:00
Sanjay Patel	468f494438	[x86, AVX] optimize masked loads with constant masks Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper. Differential Revision: http://reviews.llvm.org/D17899 llvm-svn: 263067	2016-03-09 22:12:08 +00:00
Kit Barton	0f693412b6	[PPC] backend changes to generate xvabs[s,d]p and xvnabs[s,d]p instructions This has to be committed before the FE changes Phabricator: http://reviews.llvm.org/D17837 llvm-svn: 263035	2016-03-09 17:48:01 +00:00
Tom Stellard	d319e18a36	SelectionDAG: Fix a crash on inline asm when output register supports multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022	2016-03-09 16:02:52 +00:00
Sam Kolton	96f1d9ee4d	[AMDGPU] Assembler: Support DPP instructions. Support DPP syntax as used in SP3 (except for the syntax of several operands). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008	2016-03-09 12:29:31 +00:00
Dan Gohman	639ae6b08b	[WebAssembly] Implement irreducible control flow. This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982	2016-03-09 02:01:14 +00:00
Chad Rosier	5c9777923f	[AArch64] Disable the MI scheduler to turn bots green after r262942. llvm-svn: 262944	2016-03-08 17:33:34 +00:00
Quentin Colombet	12ca34d7b0	Revert r262759 and r262760. The fix, which uses the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect. Indeed, the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943	2016-03-08 17:29:11 +00:00
Hans Wennborg	a95f4d4d22	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935	2016-03-08 16:21:41 +00:00
Igor Breger	ad6e830865	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1. Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929	2016-03-08 15:21:25 +00:00
Simon Pilgrim	a416f76807	[X86] Regenerated vector float extension tests llvm-svn: 262919	2016-03-08 09:17:12 +00:00
Dan Gohman	8e7d4c3823	[WebAssembly] Update for spec change from tableswitch to br_table. Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903	2016-03-08 03:18:12 +00:00
Quentin Colombet	dc021403fd	[AArch64][GlobalISel] Add a test case for the IRTranslator. llvm-svn: 262898	2016-03-08 01:48:08 +00:00
Quentin Colombet	52df828cbf	[MIR] Teach the parser/printer that generic virtual registers do not need a register class. llvm-svn: 262893	2016-03-08 01:17:03 +00:00
Quentin Colombet	ee9e67f422	[MIR] Teach the parser how to parse complex types of generic machine instructions. By complex types, I mean aggregate or vector types. llvm-svn: 262890	2016-03-08 00:57:31 +00:00
Quentin Colombet	c906e3b8ba	[MIR] Print the type of generic machine instructions. llvm-svn: 262880	2016-03-08 00:29:15 +00:00
Quentin Colombet	2dd7c9e7bb	[MIR] Teach the mir parser about types on generic machine instructions. llvm-svn: 262879	2016-03-08 00:20:48 +00:00
Sanjay Patel	092d64512e	[x86] add test to show missing optimization This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875	2016-03-07 23:13:06 +00:00
Sanjay Patel	b845f96623	[x86] simplify test and tighten checks I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874	2016-03-07 22:53:23 +00:00
Quentin Colombet	a1fc09c260	[MIR] Teach the MIPrinter about size for generic virtual registers. llvm-svn: 262867	2016-03-07 21:57:52 +00:00
Matt Arsenault	d89dec289d	AMDGPU: Match more med3 integer patterns llvm-svn: 262864	2016-03-07 21:54:48 +00:00
Quentin Colombet	adaa42e9e7	[MIR] Teach the parser how to handle the size of generic virtual registers. llvm-svn: 262862	2016-03-07 21:48:43 +00:00
Simon Pilgrim	2df8497807	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809	2016-03-06 21:54:52 +00:00
Igor Breger	c376e5b7a2	AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803	2016-03-06 12:38:58 +00:00
Igor Breger	d0d6119cbd	AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797	2016-03-06 07:46:03 +00:00
Simon Pilgrim	0c991e17ab	[X86][AVX] Improved VPERMILPS variable shuffle mask decoding. Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization Followup to D17681 llvm-svn: 262784	2016-03-05 22:53:31 +00:00
Matthias Braun	d7e6a2dcfd	RegisterCoalescer: Remap subregister lanemasks before exchanging operands Rematerializing and merging into a bigger register class at the same time, requires the subregister range lanemasks getting remapped to the new register class. This fixes http://llvm.org/PR26805 llvm-svn: 262768	2016-03-05 04:36:13 +00:00
Quentin Colombet	32c807a283	[X86] Fix the lowering of setjmp intrinsic on i386. When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762	2016-03-05 00:31:04 +00:00
Quentin Colombet	7f05c4b72c	Add missing triple in my previous commit! llvm-svn: 262760	2016-03-04 23:36:32 +00:00
Quentin Colombet	248c4c35f1	[X86] Do not use cmpxchgXXb when we need the base pointer (RBX). cmpxchgXXb uses RBX as one of its implicit arguments. I.e., when we use that instruction we need to clobber RBX. This is generally fine, except when RBX is a reserved register because in that case, the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759	2016-03-04 23:29:39 +00:00
Sanjay Patel	a167c8861d	[x86] add tests for masked loads with constant masks llvm-svn: 262758	2016-03-04 23:28:07 +00:00
David Majnemer	0db3c7acce	[X86] Support cleaning more than 2**16 bytes of stack The x86 ret instruction has a 16 bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack: pop %ecx ; return address is now in %ecx add $n, %esp ; clean the stack push %ecx ; bring the return address back on the stack ret ; pop the return address and jmp to its value llvm-svn: 262755	2016-03-04 22:56:17 +00:00
Michael Kuperstein	f8caea219d	[DAGCombine] Fix divrem combine not to assume div/rem type is simple. The divrem combine assumed the type of the div/rem is simple, which isn't necessarily true. This probably worked fine until r250825, since it only saw legal types, but now breaks when it runs as a pre-type-legalization combine. This fixes PR26835. Differential Revision: http://reviews.llvm.org/D17878 llvm-svn: 262746	2016-03-04 21:23:29 +00:00
Renato Golin	52bc44295a	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bit values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long-standing FIXME in the tests). llvm-svn: 262738	2016-03-04 19:19:36 +00:00
Tom Stellard	c656bfbad2	AMDGPU/SI: Add support for spilling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732	2016-03-04 18:31:18 +00:00
Zoran Jovanovic	ed3c27bd7d	[mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725	2016-03-04 17:34:31 +00:00
Simon Pilgrim	5516ba8111	[X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining llvm-svn: 262718	2016-03-04 15:19:42 +00:00
Simon Pilgrim	910b95e254	[X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw) llvm-svn: 262710	2016-03-04 11:15:23 +00:00
Nikolay Haustov	3529c0cbe0	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701	2016-03-04 10:39:50 +00:00
NAKAMURA Takumi	57f709ace4	llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple. We will see it for targeting win32; LLVM ERROR: CPU: 'generic' does not support ARM mode execution! llvm-svn: 262668	2016-03-03 22:38:39 +00:00
Simon Pilgrim	c831916cbc	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test. PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661	2016-03-03 21:55:01 +00:00
Simon Pilgrim	a9b6ea15aa	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635	2016-03-03 18:13:53 +00:00
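
Commit 0db3c7acce above describes choosing between `ret $n` and a longer pop/add/push/ret sequence, depending on whether the byte count fits in the 16-bit immediate of `ret`. The decision can be sketched roughly as follows. This is an illustrative model only, not the LLVM implementation: the function name `emit_stack_cleanup_return` and the constant `RET_IMM_MAX` are hypothetical, and the exact threshold LLVM uses is an assumption based on the 16-bit immediate width mentioned in the commit message.

```python
# Hypothetical model of the return-sequence choice described in commit 0db3c7acce.
# Assumption: `ret imm16` can pop at most 0xFFFF bytes beyond the return address.
RET_IMM_MAX = 0xFFFF


def emit_stack_cleanup_return(bytes_to_pop):
    """Return AT&T-syntax x86-32 instructions for a callee-cleanup return.

    bytes_to_pop is the number of argument bytes the callee must remove
    from the stack in addition to the return address.
    """
    if bytes_to_pop < 0:
        raise ValueError("bytes_to_pop must be non-negative")
    if bytes_to_pop == 0:
        # Nothing to clean: a plain return is enough.
        return ["ret"]
    if bytes_to_pop <= RET_IMM_MAX:
        # The count fits in the 16-bit immediate of `ret`.
        return [f"ret ${bytes_to_pop}"]
    # Too large for the immediate: use the sequence from the commit message.
    return [
        "pop %ecx",                   # return address is now in %ecx
        f"add ${bytes_to_pop}, %esp",  # clean the stack
        "push %ecx",                  # put the return address back
        "ret",                        # pop the return address and jump to it
    ]
```

For example, `emit_stack_cleanup_return(65536)` yields the four-instruction sequence, while any count up to 65535 yields a single `ret` with an immediate.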


15180 Commits