llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00

Author	SHA1	Message	Date
Nemanja Ivanovic	eaa9a4f81a	Fix for PR 26378 This patch corresponds to review: http://reviews.llvm.org/D17712 We were not clearing the TOC vector in PPCAsmPrinter when initializing it. This caused duplicate definition asserts when the pass is reused on the module (i.e. with -compile-twice or in JIT contexts). llvm-svn: 263338	2016-03-12 10:23:07 +00:00
Quentin Colombet	aaf2db6c80	[X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer. cmpxchg[8|16]b uses RBX as one of its arguments. In other words, using this instruction clobbers RBX, as it is defined to hold one of the inputs. When the backend uses a dynamically allocated stack, RBX is used as a reserved register for the base pointer. Reserved registers have special semantics that only the target understands and enforces. Because of that, the register allocator doesn't use them, but it also doesn't try to make sure they are used properly (remember, it does not know how they are supposed to be used). Therefore, when RBX is used as a reserved register but defined by something that is not compatible with that use, the register allocator will not fix the surrounding code to make sure it gets saved and restored properly around the broken code. It is the responsibility of the target to do the right thing with its reserved register. To fix that, when the base pointer needs to be preserved, we use a different pseudo instruction for cmpxchg that saves RBX. That pseudo takes two more arguments than the regular instruction: - One is the value to be copied into RBX to set the proper value for the comparison. - The other is the virtual register holding the save of the value of RBX as the base pointer. This saving is done as part of isel (i.e., we emit a copy from rbx). cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp This gets expanded into: rbx = copy input_for_rbx_reg cmpxchg <regular cmpxchg args> rbx = save_of_rbx_as_bp Note: The actual modeling of the pseudo is a bit more complicated to make sure the interferences that appear after the pseudo gets expanded are properly modeled before that expansion. This fixes PR26883. llvm-svn: 263325	2016-03-12 02:25:27 +00:00
Simon Pilgrim	d62ab3da09	[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask patterns - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303	2016-03-11 22:18:05 +00:00
Ahmed Bougacha	a2af74a580	[AArch64] Don't blindly lower f16/f128 FCCMPs. Instead, extend f16 (like we do when lowering a standalone SETCC), and let f128 be legalized to the RT calls. Fixes PR26803. llvm-svn: 263301	2016-03-11 22:02:58 +00:00
Chad Rosier	8faf1c7ea8	Update test case to appease bots after 263255. I'll follow up with Matt to confirm this is the correct fix. llvm-svn: 263268	2016-03-11 17:33:36 +00:00
Quentin Colombet	8cb0c9cc03	[IRTranslator] Translate unconditional branches. llvm-svn: 263265	2016-03-11 17:28:03 +00:00
Simon Pilgrim	d1894c7f7a	[X86][AVX] Fixed issue where a long chain of shuffles could attempt to combine to a single (illegal) PSHUFB instruction. It's not enough that we test for SSSE3 - that's only OK for 128-bit vectors - we also need to test for AVX2 / AVX512BW for 256/512 bit vector cases. llvm-svn: 263239	2016-03-11 14:39:10 +00:00
Vasileios Kalintiris	f63ff65eda	[mips] MIPSR6 Instruction itineraries Summary: Defines instruction itineraries for common MIPSR6 instructions. Patch by Simon Dardis. Reviewers: vkalintiris Subscribers: MatzeB, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17198 llvm-svn: 263229	2016-03-11 13:05:06 +00:00
Nikolay Haustov	941b85256d	[AMDGPU] Assembler: change v_madmk operands to have same order as mad. The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 llvm-svn: 263212	2016-03-11 09:27:25 +00:00
Matt Arsenault	f422acdb0e	AMDGPU: Materialize sign bits with bfrev If a constant is the same as the reverse of an inline immediate, this is 4 bytes smaller than having to embed a 32-bit literal. llvm-svn: 263201	2016-03-11 07:42:49 +00:00
Tim Northover	594e5f0364	AArch64: only try to use scaled fcvt ops on legal vector types. Before we ended up calling getSimpleVectorType on a <3 x float>, which asserted. llvm-svn: 263169	2016-03-10 23:02:21 +00:00
Simon Pilgrim	74609b7c8b	[X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159	2016-03-10 20:40:26 +00:00
Artur Pilipenko	7bad97e2f6	Support arbitrary addrspace pointers in masked load/store intrinsics This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 263158	2016-03-10 20:39:22 +00:00
Balaram Makam	635be27aa9	Fix test case to turn buildbot green. NFC. llvm-svn: 263154	2016-03-10 19:07:50 +00:00
Nicolai Haehnle	b8a3bf37c5	AMDGPU/SI: add llvm.amdgcn.buffer.load/store.format intrinsics Summary: They correspond to BUFFER_LOAD/STORE_FORMAT_XYZW and will be used by Mesa to implement the GL_ARB_shader_image_load_store extension. The intention is that for llvm.amdgcn.buffer.load.format, LLVM will decide whether one of the _X/_XY/_XYZ opcodes can be used (similar to image sampling and loads). However, this is not currently implemented. For llvm.amdgcn.buffer.store, LLVM cannot decide to use one of the "smaller" opcodes and therefore the intrinsic is overloaded. Currently, only the v4f32 is actually implemented since GLSL also only has a vec4 variant of the store instructions, although it's conceivable that Mesa will want to be smarter about this in the future. BUFFER_LOAD_FORMAT_XYZW is already exposed via llvm.SI.vs.load.input, which has a legacy name, pretends not to access memory, and does not capture the full flexibility of the instruction. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17277 llvm-svn: 263140	2016-03-10 18:43:50 +00:00
Michael Kuperstein	a8857efd95	[X86] Correctly select registers to pop into for x86_64 When trying to replace an add to esp with pops, we need to choose dead registers to pop into. Registers clobbered by the call and not imp-def'd by it should be safe. Except that it's not enough to check the register itself isn't defined, we also need to make sure no overlapping registers are defined either. This fixes PR26711. Differential Revision: http://reviews.llvm.org/D18029 llvm-svn: 263139	2016-03-10 18:43:21 +00:00
Balaram Makam	a2655214ad	[AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2 Summary: Peephole optimization that generates a single TBZ/TBNZ instruction for test and branch sequences like in the example below. This handles the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC Examples: and w8, w8, #0x400 cbnz w8, L1 to tbnz w8, #10, L1 Reviewers: MatzeB, jmolloy, mcrosier, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17942 llvm-svn: 263136	2016-03-10 17:54:55 +00:00
Sanjay Patel	7332ee563f	give regression test a meaningful name llvm-svn: 263135	2016-03-10 17:52:19 +00:00
Alexandros Lamprineas	00a64ba9ec	[ARM] Cortex-R8 support This patch adds Cortex-R8 to Target Parser and TableGen. It also adds CodeGen tests for the build attributes. Patch by Pablo Barrio. Differential Revision: http://reviews.llvm.org/D17925 llvm-svn: 263132	2016-03-10 17:38:41 +00:00
Changpeng Fang	fc2b5c3fe3	AMDGPU/SI: Define S_GETREG Intrinsic Summary: Define s_getreg intrinsic to generate s_getreg instruction to read hardware registers. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17892 llvm-svn: 263124	2016-03-10 16:47:15 +00:00
Saleem Abdulrasool	b9035e7dcd	ARM: follow up improvements for SVN r263118 The initial change was insufficiently complete for always getting the semantics of __builtin_longjmp correct. The builtin is translated into a `tInt_eh_sjlj_longjmp` DAG node. This node set R7 as clobbered. However, the code would then follow up with a clobber of R11. I had failed to notice the imp-def,kill on R7 in the isel. Unfortunately, it seems that it is not possible to conditionalise the Defs list via an !if. Instead, construct a new parallel WIN node and prefer that when targeting windows. This ensures that we now both correctly model the __builtin_longjmp as well as construct the frame in a more ABI conformant manner. llvm-svn: 263123	2016-03-10 16:26:37 +00:00
David L Kreitzer	b2fd0878a4	Unified the handling of returns in the X87 stackifier so that the stackifier runs successfully on routines containing IRETs. This fixes PR26410. Differential Revision: http://reviews.llvm.org/D17643 llvm-svn: 263120	2016-03-10 15:14:02 +00:00
Saleem Abdulrasool	33390fc262	ARM: correct __builtin_longjmp on WoA WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast to AAPCS which states r7. This adjusts __builtin_longjmp to not clobber r7 and to properly restore the frame pointer on execution. llvm-svn: 263118	2016-03-10 15:11:09 +00:00
Elena Demikhovsky	2f71b82e0e	AVX-512: Fixed a bug in i1 vector zero extending. (Skylake-avx512) (failed on instruction selection phase) Differential Revision: http://reviews.llvm.org/D17924 llvm-svn: 263111	2016-03-10 13:44:22 +00:00
Simon Pilgrim	57e48aebb3	[X86][AVX] Improve target shuffle combining of BLEND+zero The BLEND+zero combine was failing to combine equivalent BLEND masks. Follow up to D17483 and D17858 llvm-svn: 263105	2016-03-10 11:50:15 +00:00
Simon Pilgrim	4fad44ad43	[X86][SSE] Basic combining of unary target shuffles of binary target shuffles. This patch reorders the combining of target shuffle masks so that when a unary shuffle takes a binary shuffle as its input but only references one of its inputs it can correctly combine into a unary shuffle mask. This is starting to encroach on the purpose of resolveTargetShuffleInputs, but I don't want to remove it until we definitely know we won't need it for full binary shuffle combining. There is a lot more work before we can properly support binary target shuffle masks but this was an easy case to add support for. Differential Revision: http://reviews.llvm.org/D17858 llvm-svn: 263102	2016-03-10 11:23:51 +00:00
Elena Demikhovsky	ef21e26acb	AVX-512: Fixed a bug in shuffle for v64i8 type Operation SCALAR_TO_VECTOR for v64i8 and v32i16 should be lowered if BW feature is "on". Differential Revision: http://reviews.llvm.org/D17994 llvm-svn: 263097	2016-03-10 08:32:09 +00:00
Roman Levenstein	26a5894f0f	Add support for a preserve_most calling convention to the AArch64 backend. This change adds support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add tail-call support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092	2016-03-10 04:35:09 +00:00
Sanjay Patel	468f494438	[x86, AVX] optimize masked loads with constant masks Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper. Differential Revision: http://reviews.llvm.org/D17899 llvm-svn: 263067	2016-03-09 22:12:08 +00:00
Kit Barton	0f693412b6	[PPC] backend changes to generate xvabs[s,d]p and xvnabs[s,d]p instructions This has to be committed before the FE changes Phabricator: http://reviews.llvm.org/D17837 llvm-svn: 263035	2016-03-09 17:48:01 +00:00
Tom Stellard	d319e18a36	SelectionDAG: Fix a crash on inline asm when output register supports multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022	2016-03-09 16:02:52 +00:00
Sam Kolton	96f1d9ee4d	[AMDGPU] Assembler: Support DPP instructions. Support DPP syntax as used in SP3 (except the syntax of several operands). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008	2016-03-09 12:29:31 +00:00
Dan Gohman	639ae6b08b	[WebAssembly] Implement irreducible control flow. This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982	2016-03-09 02:01:14 +00:00
Chad Rosier	5c9777923f	[AArch64] Disable the MI scheduler to turn bots green after r262942. llvm-svn: 262944	2016-03-08 17:33:34 +00:00
Quentin Colombet	12ca34d7b0	Revert r262759 and r262760. The fix, which consists of using the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect. Indeed, the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943	2016-03-08 17:29:11 +00:00
Hans Wennborg	a95f4d4d22	Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935	2016-03-08 16:21:41 +00:00
Igor Breger	ad6e830865	AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1. Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929	2016-03-08 15:21:25 +00:00
Simon Pilgrim	a416f76807	[X86] Regenerated vector float extension tests llvm-svn: 262919	2016-03-08 09:17:12 +00:00
Dan Gohman	8e7d4c3823	[WebAssembly] Update for spec change from tableswitch to br_table. Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903	2016-03-08 03:18:12 +00:00
Quentin Colombet	dc021403fd	[AArch64][GlobalISel] Add a test case for the IRTranslator. llvm-svn: 262898	2016-03-08 01:48:08 +00:00
Quentin Colombet	52df828cbf	[MIR] Teach the parser/printer that generic virtual registers do not need a register class. llvm-svn: 262893	2016-03-08 01:17:03 +00:00
Quentin Colombet	ee9e67f422	[MIR] Teach the parser how to parse complex types of generic machine instructions. By complex types, I mean aggregate or vector types. llvm-svn: 262890	2016-03-08 00:57:31 +00:00
Quentin Colombet	c906e3b8ba	[MIR] Print the type of generic machine instructions. llvm-svn: 262880	2016-03-08 00:29:15 +00:00
Quentin Colombet	2dd7c9e7bb	[MIR] Teach the mir parser about types on generic machine instructions. llvm-svn: 262879	2016-03-08 00:20:48 +00:00
Sanjay Patel	092d64512e	[x86] add test to show missing optimization This should make it clearer how this proposed patch: http://reviews.llvm.org/D11393 ...will change codegen. llvm-svn: 262875	2016-03-07 23:13:06 +00:00
Sanjay Patel	b845f96623	[x86] simplify test and tighten checks I noticed this test as part of: http://reviews.llvm.org/D11393 ...which is confusing enough as-is. Let's show the exact codegen, so the changes will be more obvious. llvm-svn: 262874	2016-03-07 22:53:23 +00:00
Quentin Colombet	a1fc09c260	[MIR] Teach the MIPrinter about size for generic virtual registers. llvm-svn: 262867	2016-03-07 21:57:52 +00:00
Matt Arsenault	d89dec289d	AMDGPU: Match more med3 integer patterns llvm-svn: 262864	2016-03-07 21:54:48 +00:00
Quentin Colombet	adaa42e9e7	[MIR] Teach the parser how to handle the size of generic virtual registers. llvm-svn: 262862	2016-03-07 21:48:43 +00:00
Simon Pilgrim	2df8497807	[X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809	2016-03-06 21:54:52 +00:00
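
Commit a2655214ad above describes folding an `and` with a power-of-two mask followed by `cbnz` into a single `tbnz`. The sketch below illustrates only the arithmetic behind that fold (mask `0x400` corresponds to bit 10). It is not the LLVM implementation: the function names `tbz_bit_index` and `fold_and_cbnz` are hypothetical, and the instructions are modeled as plain strings.

```python
from typing import List, Optional


def tbz_bit_index(mask: int) -> Optional[int]:
    """Return the bit number to test if `mask` has exactly one bit set.

    Returns None when the mask is zero, negative, or has several bits set,
    in which case a single test-bit-and-branch cannot replace the sequence.
    """
    if mask <= 0 or mask & (mask - 1):
        return None
    return mask.bit_length() - 1


def fold_and_cbnz(mask: int, reg: str, label: str) -> List[str]:
    """Model the peephole: `and reg, reg, #mask; cbnz reg, label` -> `tbnz`.

    Hypothetical helper for illustration; registers and labels are strings.
    """
    bit = tbz_bit_index(mask)
    if bit is None:
        # Not a power of two: keep the original two-instruction sequence.
        return [f"and {reg}, {reg}, #{mask:#x}", f"cbnz {reg}, {label}"]
    return [f"tbnz {reg}, #{bit}, {label}"]


if __name__ == "__main__":
    print(fold_and_cbnz(0x400, "w8", "L1"))
```

With the mask from the commit message, `fold_and_cbnz(0x400, "w8", "L1")` yields the single `tbnz w8, #10, L1` instruction quoted there.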

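Commit f422acdb0e above notes that a constant equal to the bit-reverse of an inline immediate can be materialized with `bfrev` instead of a 32-bit literal. The following is a rough sketch of that check, not the backend's code. It assumes the integer inline immediates are the range -16..64 and ignores the floating-point inline constants; the helper names are hypothetical.

```python
def bitreverse32(value: int) -> int:
    """Reverse the order of the 32 bits of `value`."""
    value &= 0xFFFFFFFF
    return int(f"{value:032b}"[::-1], 2)


def is_inline_int(value: int) -> bool:
    """Assumed inline-immediate test: signed 32-bit value in -16..64."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    return -16 <= signed <= 64


def can_use_bfrev(constant: int) -> bool:
    """True if `constant` is not inline itself but its bit-reverse is."""
    return not is_inline_int(constant) and is_inline_int(bitreverse32(constant))


if __name__ == "__main__":
    # The sign bit 0x80000000 is the bit-reverse of the inline immediate 1.
    print(can_use_bfrev(0x80000000))
```

The sign-bit case from the commit title falls out directly: `0x80000000` reverses to `1`, which is within the assumed inline range.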

15199 Commits