llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	7647026685	[X86][AVX512] Add support for AVX512 VMOVQ (load) shuffle decoding llvm-svn: 259496	2016-02-02 13:32:56 +00:00
Sjoerd Meijer	e59db362af	Removed FeatureVFPOnlySP from the Cortex-R7 processor model description and changed the regression test accordingly. The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive. llvm-svn: 259480	2016-02-02 09:28:20 +00:00
David Majnemer	1619ed9924	[RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDef When rematerializing a computation by replacing the copy, use the copy's location. The location of the copy is more representative of the original program. This partially fixes PR10003. llvm-svn: 259469	2016-02-02 06:41:55 +00:00
Sanjoy Das	6b9e756a71	[X86] Fix a bug in getMemOpBaseRegImmOfs Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of `MemOp` is a frame index memory operand. The fix is to have `getMemOpBaseRegImmOfs` bail out in such cases. We can possibly be more clever here, if needed. llvm-svn: 259456	2016-02-02 02:32:43 +00:00
Ahmed Bougacha	7e17e58964	[X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. FastISel counterpart to r259448. llvm-svn: 259449	2016-02-02 01:44:03 +00:00
Ahmed Bougacha	d732a878e7	[X86] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. Officially, we don't acknowledge non-default configurations of MXCSR, as getting there would require usage of the FENV_ACCESS pragma (at least insofar as rounding mode is concerned). We don't support the pragma, so we can assume that the default rounding mode - round to nearest, ties to even - is always used. However, it's inconsistent with the rest of the instruction set, where MXCSR is always effective (unless otherwise specified). Also, it's an unnecessary obstacle to the few brave souls that use fenv.h with LLVM. Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead. llvm-svn: 259448	2016-02-02 01:32:50 +00:00
Simon Pilgrim	50130a8ffb	[X86][AVX512] Add support for AVX512 VMOVD (load) shuffle decoding llvm-svn: 259430	2016-02-01 23:04:05 +00:00
Simon Pilgrim	176f061ffa	[X86][AVX512] Add support for AVX512 VMOVSD/VMOVSS shuffle decoding llvm-svn: 259427	2016-02-01 22:26:28 +00:00
Simon Pilgrim	f6407af598	[X86][AVX512] Add support for AVX512 VINSERTPS shuffle decoding llvm-svn: 259420	2016-02-01 22:05:50 +00:00
Simon Pilgrim	44c36a2766	[X86][SSE] Regenerated load vector + element extraction tests. llvm-svn: 259416	2016-02-01 21:46:12 +00:00
Simon Pilgrim	b3498bc10a	[X86][SSE] Add AVX512 merge consecutive load tests Add AVX512F/AVX512BW 512-bit tests. Add AVX512F tests to existing 128/256-bit tests. llvm-svn: 259410	2016-02-01 21:30:50 +00:00
Simon Pilgrim	85d3fd7a2c	Regenerate vector blend tests. llvm-svn: 259406	2016-02-01 21:06:32 +00:00
Simon Pilgrim	5a878d6ad7	Regenerate vector sext/zext constant folding tests. llvm-svn: 259405	2016-02-01 21:01:29 +00:00
Balaram Makam	a423972bd5	AArch64: Implement missed conditional compare sequences. Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionaly handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387	2016-02-01 19:13:07 +00:00
Ulrich Weigand	c791b42454	[SystemZ] Fix wrong-code generation for certain always-false conditions We've found another bug in the code generation logic conditions for a certain class of always-false conditions, those of the form if ((a & 1) < 0) These only reach the back end when compiling without optimization. The bug was introduced by the choice of using TEST UNDER MASK to implement a check for if ((a & MASK) < VAL) as if ((a & MASK) == 0) where VAL is less than the the lowest bit of MASK. This is correct in all cases except for VAL == 0, in which case the original condition is always false, but the replacement isn't. Fixed by excluding that particular case. llvm-svn: 259381	2016-02-01 18:31:19 +00:00
Asaf Badouh	7d5bdf84bb	[X86][AVX512VBMI] add encoding and intrinsics for Multishift Differential Revision: http://reviews.llvm.org/D16399 llvm-svn: 259363	2016-02-01 15:48:21 +00:00
Vasileios Kalintiris	2d209abe7a	[mips] Split large test file into 3 smaller ones. Remove the old select.ll file and use select-int.ll, select-flt.ll, select-dbl.ll for testing selects on integers, floats & doubles respectivelly. llvm-svn: 259361	2016-02-01 15:19:35 +00:00
Igor Breger	15632eed43	AVX512: fix mask handling for gather/scatter/prefetch intrinsics. Differential Revision: http://reviews.llvm.org/D16755 llvm-svn: 259346	2016-02-01 09:57:15 +00:00
Simon Pilgrim	f62b33f8d4	[X86][SSE] Find source of the inserted element of INSERTPS Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle. Differential Revision: http://reviews.llvm.org/D16652 llvm-svn: 259343	2016-02-01 08:59:30 +00:00
Igor Breger	fa62fb9857	AVX512 : Fix SETCCE lowering for KNL 32 bit. Differential Revision: http://reviews.llvm.org/D16752 llvm-svn: 259342	2016-02-01 07:56:09 +00:00
David Majnemer	feb8745705	Revert r258580 and r258581. Those commits created an artificial edge from a cleanup to a synthesized catchswitch in order to get the MSVC personality routine to execute cleanups which don't cleanupret and are not wrapped by a catchswitch. This worked well enough but is not a complete solution in situations where there the cleanup infinite loops. However, the real deal breaker behind this approach comes about from a degenerate case where the cleanup is post-dominated by unreachable and throws an exception. This ends poorly because the catchswitch will inadvertently catch the exception. Because of this we should go back to our previous behavior of not executing certain cleanups (identical behavior with the Itanium ABI implementation in clang, GCC and ICC). N.B. I think this could be salvaged by making the catchpad rethrow the exception and properly transforming throwing calls in the cleanup into invokes. llvm-svn: 259338	2016-02-01 03:29:38 +00:00
Derek Schuff	2f77371cea	[WebAssembly] Fix uses of FrameIndex as store values Previously the code assumed all uses of FI on loads and stores were as addresses. This checks whether the use is the address or a value and handles the latter case as it does for non-memory instructions. llvm-svn: 259306	2016-01-30 21:43:08 +00:00
JF Bastien	d89bb7340c	WebAssembly: don't optimize frameindex store The previous code was incorrect (can't getReg a frameindex). We could instead optimize it to reduce tree height, but I'm not sure that's worthwhile yet because we then try to eliminate the frameindex. This patch also fixes frame index elimination for operations which may load or store: it used to assume the base was operand 2 and immediate offset operand 1. That's not true for stores, where they're 4 and 3. llvm-svn: 259305	2016-01-30 14:11:26 +00:00
Matt Arsenault	2699008644	AMDGPU: Fix emitting invalid workitem intrinsics for HSA The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. llvm-svn: 259297	2016-01-30 05:19:45 +00:00
Matt Arsenault	bcaaea3448	AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr Only the dispatch.ptr intrinsic is supposed to be used now to get the workgroup size, and the read.local.size intrinsics do not work correctly. llvm-svn: 259296	2016-01-30 05:10:59 +00:00
Dan Gohman	3f3cb842c4	[WebAssembly] Refine block placement to insert blocks between trees. Refine the test for whether an instruction is in an expression tree so that it detects when one tree ends and another begins, so we can place a block at that point, rather than continuing to find the first instruction not in a tree at all. llvm-svn: 259294	2016-01-30 05:01:06 +00:00
Matt Arsenault	2f96fe904d	AMDGPU: Add new amdgcn workitem intrinsics These use the correct prefix and follow the HSA naming convention rather than the config register option names. llvm-svn: 259293	2016-01-30 04:25:19 +00:00
Justin Lebar	e107deb822	[CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. Summary: Previously we'd just silently skip these. Reviewers: tra, jholewinski Subscribers: llvm-commits, jhen, echristo, Differential Revision: http://reviews.llvm.org/D16739 llvm-svn: 259279	2016-01-30 01:07:38 +00:00
Tim Northover	81271b4305	ARM: don't mangle DAG constant if it has more than one use The basic optimisation was to convert (mul $LHS, $complex_constant) into roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected to be cheaper. The original logic checks that the mul only has one use (since we're mangling $complex_constant), but when used in even more complex addressing modes there may be an outer addition that can pick up the wrong value too. I think the ARM addressing-mode problem is actually unreachable at the moment, but that depends on complex assessments of the profitability of pre-increment addressing modes so I've put a real check in there instead of an assertion. llvm-svn: 259228	2016-01-29 19:18:46 +00:00
Derek Schuff	385a651d43	[WebAssembly] Support frame pointer Add support for frame pointer use in prolog/epilog. Supports dynamic allocas but not yet over-aligned locals. Target-independend CG generates SP updates, but we still need to write back the SP value to memory when necessary. llvm-svn: 259220	2016-01-29 18:37:49 +00:00
Ahmed Bougacha	6ca6e9769b	[X86] Add missing "CHECK" colon in r259065 test. llvm-svn: 259219	2016-01-29 18:25:33 +00:00
Alexandros Lamprineas	1eca4c99e9	[ARM] Emit trap instruction using .inst directive The trap instruction is emitted as a data-in-text rather than an instruction. This patch uses the .inst directive for emitting trap. Differential Revision: http://reviews.llvm.org/D16684 llvm-svn: 259182	2016-01-29 10:23:32 +00:00
Matt Arsenault	b119805900	AMDGPU: Remove 24-bit intrinsics The known bit matching code seems to work reasonably well, so these shouldn't really be needed. llvm-svn: 259180	2016-01-29 10:05:16 +00:00
Eric Christopher	14a36607b8	Since LI/LIS sign extend the constant passed into the instruction we should check that the sign extended constant fits into 16-bits if we want a zero extended value, otherwise go ahead and put it together piecemeal. Fixes PR26356. llvm-svn: 259177	2016-01-29 07:20:01 +00:00
David Majnemer	d6f63d5999	[WinEH] Don't perform state stores in cleanups Our cleanups do not support true lexical nesting of funclets which obviates the need to perform state stores. This fixes PR26361. llvm-svn: 259161	2016-01-29 05:33:15 +00:00
Ahmed Bougacha	94c2d7d066	[AArch64] Fix i64 nontemporal high-half extraction. Since we only have pair - not single - nontemporal store instructions, we have to extract the high part into a separate register to be able to use them. When the initial nontemporal codegen support was added, I wrote the extract using the nonsensical UBFX [0,32[. Use the correct LSR form instead. llvm-svn: 259134	2016-01-29 01:08:41 +00:00
Simon Pilgrim	9da50f1ac6	[X86][AVX] Added more thorough 256-bit vector consecutive load tests. llvm-svn: 259100	2016-01-28 22:29:51 +00:00
Matt Arsenault	750cfb2e8d	AMDGPU: Match fmed3 patterns with legacy fmin/fmax llvm-svn: 259090	2016-01-28 20:53:48 +00:00
Matt Arsenault	28a667546a	AMDGPU: Match some med3 patterns llvm-svn: 259089	2016-01-28 20:53:42 +00:00
Matt Arsenault	2c5f486332	AMDGPU: Set DX10Clamp bit llvm-svn: 259088	2016-01-28 20:53:35 +00:00
David Majnemer	444856c58e	Address buildbot fallout from r259065 llvm-svn: 259074	2016-01-28 18:59:04 +00:00
David Majnemer	05fc1c63f9	[X86] Don't transform X << 1 to X + X during type legalization While legalizing a 64-bit shift left by 1, the following occurs: We split the shift operand in half: a high half and a low half. We then create an ADDC with the low half and a ADDE with the high half + the carry bit from the ADDC. This is problematic if X is any_ext'd because the high half computation is now undef + undef + carry bit and there is no way to ensure that the two undef values had the same bitwise representation. This results in the lowest bit in the high half turning into garbage. Instead, do not try to turn shifts into arithmetic during type legalization. This fixes PR26350. llvm-svn: 259065	2016-01-28 18:20:05 +00:00
Oliver Stannard	20e26aa0e1	Revert r259035, it introduces a cyclic library dependency llvm-svn: 259045	2016-01-28 13:19:47 +00:00
Igor Breger	707a1dde86	AVX512: Fix truncate v32i8 to v32i1 lowering implementation. Enable truncate 128/256bit packed byte/word with AVX512BW but without AVX512VL, use 512bit instructions. Differential Revision: http://reviews.llvm.org/D16531 llvm-svn: 259044	2016-01-28 13:19:25 +00:00
Zoran Jovanovic	935bb972c6	[mips][microMIPS] Disable FastISel for microMIPS Author: milena.vujosevic.janicic Reviewers: dsanders FastIsel is not supported for microMIPS, thus it needs to be disabled. Test micromips-zero-mat-uses.ll is deleted since the tested sequence of instructions is not generated for microMIPS without FastISel. Differential Revision: http://reviews.llvm.org/D15892 llvm-svn: 259039	2016-01-28 11:08:03 +00:00
Oliver Stannard	8eec8bac0d	Add backend dignostic printer for unsupported features Re-commit of r258951 after fixing layering violation. The related LLVM patch adds a backend diagnostic type for reporting unsupported features, this adds a printer for them to clang. In the case where debug location information is not available, I've changed the printer to report the location as the first line of the function, rather than the closing brace, as the latter does not give the user any information. This also affects optimisation remarks. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 259035	2016-01-28 10:07:27 +00:00
Asaf Badouh	547a7d4edb	[X86][AVX512] small fix in ptestm intrinsics move ptestm{q\|d} intrinsics from patterns form (in td file) to the intrinsics table Differential Revision: http://reviews.llvm.org/D16633 llvm-svn: 259029	2016-01-28 08:33:22 +00:00
Junmo Park	f494fe59f5	[DAGCombiner] Don't add volatile or indexed stores to ChainedStores Summary: findBetterNeighborChains does not handle volatile or indexed stores. However, it did not check when adding stores to ChainedStores. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D16463 llvm-svn: 259024	2016-01-28 06:23:33 +00:00
NAKAMURA Takumi	a814b67e03	Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features" It broke layering violation in LLVMIR. clang r258950 "Add backend dignostic printer for unsupported features" llvm r258951 "Refactor backend diagnostics for unsupported features" llvm-svn: 259016	2016-01-28 04:41:32 +00:00
Dan Gohman	120265029b	[WebAssembly] Don't stackify a register def past a get_local use in the same tree. llvm-svn: 259013	2016-01-28 03:59:09 +00:00

1 2 3 4 5 ...

14838 Commits