llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00

Author	SHA1	Message	Date
Matt Arsenault	9094a66425	AMDGPU: Move AMDGPU intrinsics only used by R600 llvm-svn: 258790	2016-01-26 04:49:24 +00:00
Matt Arsenault	f5c703425b	AMDGPU: Tidy minor td file issues Make comments and indentation more consistent. Rearrange a few things to be in a more consistent order, such as organizing subtarget features from those describing an actual device property, and those used as options. llvm-svn: 258789	2016-01-26 04:49:22 +00:00
Matt Arsenault	51a14cbbc7	AMDGPU: Make v32i8/v64i8 illegal types Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken. llvm-svn: 258788	2016-01-26 04:43:48 +00:00
Matt Arsenault	b7742acaf4	AMDGPU: Remove old sample intrinsics I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters. llvm-svn: 258787	2016-01-26 04:38:08 +00:00
Matt Arsenault	581518df24	AMDGPU: Add new amdgcn intrinsics for cube instructions More cleanup to try to get all intrinsics using the correct amdgcn prefix that are as close to the instruction as possible. llvm-svn: 258786	2016-01-26 04:29:56 +00:00
Matt Arsenault	667cd15c1c	AMDGPU: Implement read_register and write_register intrinsics Some of the special intrinsics that now correspond to an instruction also have special setting of some registers, e.g. llvm.SI.sendmsg sets m0 as well as using s_sendmsg. Using these explicit register intrinsics may be a better option. Reading the exec mask and others may be useful for debugging. For this I'm not sure this is entirely correct because we would want this to be convergent, although it's possible this is already treated sufficiently conservatively. llvm-svn: 258785	2016-01-26 04:29:24 +00:00
Matt Arsenault	31798fd428	AMDGPU: Note mesa version in release notes llvm-svn: 258784	2016-01-26 04:29:15 +00:00
Matt Arsenault	97a3b39dcb	AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now Also move into backend intrinsics to discourage use of the old name. llvm-svn: 258783	2016-01-26 04:14:16 +00:00
Dan Gohman	bf8e0c60a6	[WebAssembly] Optimize memcpy/memmove/memset calls. These calls return their first argument, but because LLVM uses an intrinsic with a void return type, they can't use the returned attribute. Generalize the store results pass to optimize these calls too. llvm-svn: 258781	2016-01-26 04:01:11 +00:00
Dan Gohman	82741c1205	[WebAssembly] Remove a completed entry from the README.txt. llvm-svn: 258780	2016-01-26 03:43:48 +00:00
Dan Gohman	e694253126	[WebAssembly] Implement unaligned loads and stores. Differential Revision: http://reviews.llvm.org/D16534 llvm-svn: 258779	2016-01-26 03:39:31 +00:00
Haicheng Wu	5302d65f58	[LIR] Add support for structs and hand unrolled loops This is a recommit of r258620 which causes PR26293. The original message: Now LIR can turn the following code into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258777	2016-01-26 02:27:47 +00:00
Reid Kleckner	0081990bfb	Use binary search for intrinsic ID lookups This improves compile time of Function.cpp from 57s to 37s for me locally. Intrinsic IDs are cached on the Function object, so this shouldn't regress performance. llvm-svn: 258774	2016-01-26 02:06:41 +00:00
Matthias Braun	036b32dcef	LiveIntervalAnalysis: Improve some comments As recommended by Justin. llvm-svn: 258771	2016-01-26 01:40:48 +00:00
Reid Kleckner	b834dd994c	Sort intrinsics by LLVM intrinsic name, rather than tablegen def name Step one towards using a simple binary search to lookup intrinsic IDs instead of our crazy table generated switch+memcmp+startswith code that makes Function.cpp take about a minute to compile. See PR24785 and PR11951 for why we should do this. The X86 backend contains tables that need to be sorted on intrinsic ID, so reorder those. llvm-svn: 258757	2016-01-26 00:55:00 +00:00
Matthias Braun	7d3caf5306	LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC These two functions are hard to reason about. This commit makes the code more comprehensible: - Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut) with a fixed value instead of a changing iterator I that points to different things during the function. - Remove the early explanation before the function in favor of more detailed comments inside the function. Should have more/clearer comments now stating which conditions are tested and which invariants hold at different points in the functions. The behaviour of the code was not changed. I hope that this will make it easier to review the changes in http://reviews.llvm.org/D9067 which I will adapt next. Differential Revision: http://reviews.llvm.org/D16379 llvm-svn: 258756	2016-01-26 00:43:50 +00:00
Dan Gohman	2bd89d3994	Followup to 258750; update more tests to use .p2align . llvm-svn: 258755	2016-01-26 00:35:07 +00:00
Dan Gohman	39414df67d	Followup to 258750; update all MC tests to use .p2align . llvm-svn: 258754	2016-01-26 00:27:59 +00:00
Dan Gohman	afdd4e6630	Followup to 258750; update this test to use .p2align . llvm-svn: 258752	2016-01-26 00:17:24 +00:00
Dan Gohman	a72e83c26e	[MC] Use .p2align instead of .align For historic reasons, the behavior of .align differs between targets. Fortunately, there are alternatives, .p2align and .balign, which make the interpretation of the parameter explicit, and which behave consistently across targets. This patch teaches MC to use .p2align instead of .align, so that people reading code for multiple architectures don't have to remember which way each platform does its .align directive. Differential Revision: http://reviews.llvm.org/D16549 llvm-svn: 258750	2016-01-26 00:03:25 +00:00
Philip Reames	75dc59c9a3	[GVN] Rearrange code to make local vs non-local cases more obvious [NFCI] llvm-svn: 258747	2016-01-25 23:37:53 +00:00
Evgeniy Stepanov	258db6665b	[cfi] Cross-DSO CFI diagnostic mode (LLVM part). * __cfi_check gets a 3rd argument: ubsan handler data * Instead of trapping on failure, call __cfi_check_fail which must be present in the module (generated in the frontend). llvm-svn: 258746	2016-01-25 23:35:03 +00:00
Philip Reames	9298d3408c	[GVN] Factor out common code [NFCI] We had the same code duplicated for each type of Def. We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup. llvm-svn: 258740	2016-01-25 23:19:12 +00:00
Vedant Kumar	b5f67b4e31	[docs] Document how to merge patches into release branches llvm-svn: 258736	2016-01-25 22:47:54 +00:00
Matthias Braun	69950e68ba	X86ISelLowering: Fix cmov(cmov) special lowering bug There's a special case in EmitLoweredSelect() that produces an improved lowering for cmov(cmov) patterns. However this special lowering is currently broken if the inner cmov has multiple users so this patch stops using it in this case. If you wonder why this wasn't fixed by continuing to use the special lowering and inserting a 2nd PHI for the inner cmov: I believe this would incur additional copies/register pressure so the special lowering does not improve upon the normal one anymore in this case. This fixes http://llvm.org/PR26256 (= rdar://24329747) llvm-svn: 258729	2016-01-25 22:08:25 +00:00
Teresa Johnson	a52cfbb4e3	[ThinLTO] Find all needed metadata when linking metadata as postpass For metadata postpass linking, after importing all functions, we need to recursively walk through any nodes reached via imported functions to locate needed subprogram metadata. Some might only be reached indirectly via the variable list for an inlined function. llvm-svn: 258728	2016-01-25 22:04:56 +00:00
Simon Pilgrim	a63717e4c0	[X86][AVX] Add commutation support for VPERM2X128 instructions Its main use is to allow memory folding of the 1st operand Differential Revision: http://reviews.llvm.org/D16521 llvm-svn: 258726	2016-01-25 21:51:34 +00:00
Teresa Johnson	8b2ed4fc5e	[ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntity Extend fix for PR26037 to identify DISubprogram reached from a DIImportedEntity via a DILexicalBlock. llvm-svn: 258722	2016-01-25 21:29:55 +00:00
Xinliang David Li	50d58058da	Fix a typo llvm-svn: 258716	2016-01-25 20:38:13 +00:00
Lawrence Hu	baafd4c214	Enable loopreroll to reroll loops with a pointer induction variable. Example: while (buf != end) { S += buf[0]; S += buf[1]; buf += 2; } Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258709	2016-01-25 19:43:45 +00:00
Lawrence Hu	0572a631ee	Undo commit 258700 due to missing commit message llvm-svn: 258708	2016-01-25 19:36:30 +00:00
Matthew Simpson	d9e4b63bf8	Reapply commit r25804 with fix We were hitting an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239. llvm-svn: 258705	2016-01-25 19:24:29 +00:00
Quentin Colombet	06230e1d45	Speculatively revert r258620 as it is the likely culprit of PR26293. llvm-svn: 258703	2016-01-25 19:12:49 +00:00
Ivan Krasin	09873095d3	Temporarily disable broken fuzzer/timeout tests. Reviewers: kcc Differential Revision: http://reviews.llvm.org/D16543 llvm-svn: 258702	2016-01-25 19:05:45 +00:00
Rafael Espindola	3bf30c1acf	Add a test showing we can write a vector of floats. llvm-svn: 258701	2016-01-25 19:02:20 +00:00
Lawrence Hu	1cf7c9fba6	Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258700	2016-01-25 18:53:39 +00:00
Sanjay Patel	3bb30646d9	don't repeat function names in documentation comments; NFC llvm-svn: 258699	2016-01-25 18:38:38 +00:00
Dan Gohman	d5075eb344	[WebAssembly] Fix unbalanced register stack code in the case of late DCE. Instructions can be DCE'd after the RegStackify pass. If the instruction which would be the pop for what would be a push is removed, don't use a push. llvm-svn: 258694	2016-01-25 16:48:44 +00:00
Dan Gohman	d1cae1c975	[WebAssembly] Add tests for negative offsets with global variable addresses. llvm-svn: 258693	2016-01-25 15:19:39 +00:00
Dan Gohman	6aa3a18666	[WebAssembly] Minor code formatting cleanups. NFC. llvm-svn: 258692	2016-01-25 15:12:05 +00:00
Dan Gohman	105451ecf0	[SelectionDAG] Use the correct return type for memcpy, memmove, and memset. When generating calls to memcpy, memmove, and memset, use void* as the return type rather than void, to match the standard signatures for these functions. This has no practical effect for most targets, since the return values of these calls aren't being used anyway, and most calling conventions tolerate this kind of mismatch. However, this change will help support future optimizations to utilize the return value to avoid holding the argument value live across a call. llvm-svn: 258691	2016-01-25 15:05:56 +00:00
James Molloy	fe66a6200d	[DemandedBits] Fix computation of demanded bits for ICmps The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands. We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating operand 1, we were taking the minimum leading zeroes of it and itself. This should fix PR26266. llvm-svn: 258690	2016-01-25 14:49:36 +00:00
Michael Zuckerman	847379aa25	[AVX512] Adding PTESTNMB/D/W/Q instruction Differential Revision: http://reviews.llvm.org/D16520 llvm-svn: 258688	2016-01-25 14:43:23 +00:00
Aaron Ballman	aabad88851	Reapplying r256836 with a fix for MSVC 14 support. Enable more strict standards conformance in MSVC for rvalue casting and string literal type conversion to non-const types. Also enables generation of intrinsics for more functions. Patch by Alexander Riccio llvm-svn: 258687	2016-01-25 14:17:39 +00:00
Michael Zuckerman	5131ab0907	[AVX512] Adding PTESTMB/W/D/Q instruction Differential Revision: http://reviews.llvm.org/D16519 llvm-svn: 258686	2016-01-25 13:27:32 +00:00
Bradley Smith	849b958836	[ARM] Add DSP build attribute and extension targeting This patch was originally committed as r257885, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258683	2016-01-25 11:26:11 +00:00
Bradley Smith	28db0fcf02	[ARM] Add new system registers to ARMv8-M Baseline/Mainline This patch was originally committed as r257884, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258682	2016-01-25 11:25:36 +00:00
Bradley Smith	bb33c3478a	[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline This patch was originally committed as r257883, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258681	2016-01-25 11:24:47 +00:00
Asaf Badouh	ec3729528a	[X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned 52bit integer VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators Differential Revision: http://reviews.llvm.org/D16407 llvm-svn: 258680	2016-01-25 11:14:24 +00:00
Oliver Stannard	43a6ec7452	[ARM] Add ARMv8.2-A FP16 scalar instructions This was originally committed as r255762, but reverted as it broke windows bots. Re-committing the exact same patch, as the underlying cause was fixed by r258677. ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero. These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11. Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register. New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes. Differential Revision: http://reviews.llvm.org/D15038 llvm-svn: 258678	2016-01-25 10:26:26 +00:00
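
The [LIR] commit in the log above (llvm-svn: 258777) says loop idiom recognition can now turn struct-field stores and hand-unrolled stores into memset. The following is a minimal sketch of what that equivalence means at the source level, not the pass itself. The names `bar_loop`, `test_loop`, `bar_memset`, `test_memset`, and `loops_match_memset` are hypothetical, the hand-unrolled case is written over plain `unsigned` elements as an assumption, and the struct is assumed to have no padding.

```c
#include <assert.h>
#include <string.h>

typedef struct foo {
  int a;
  int b;
} foo_t;

/* Struct case: the two field stores together cover every byte of f[i]. */
void bar_loop(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

/* Hand-unrolled case (assumed element type: unsigned). n must be even. */
void test_loop(unsigned *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i + 1] = 0;
  }
}

/* What the recognized idiom amounts to: one memset over the whole range. */
void bar_memset(foo_t *f, unsigned n) {
  memset(f, 0, n * sizeof(foo_t));
}

void test_memset(unsigned *f, unsigned n) {
  memset(f, 0, n * sizeof(unsigned));
}

/* Returns 1 when the loop and memset versions leave identical bytes. */
int loops_match_memset(void) {
  foo_t a[4], b[4];
  unsigned c[8], d[8];

  /* The byte comparison below relies on foo_t having no padding. */
  assert(sizeof(foo_t) == 2 * sizeof(int));

  /* Fill with non-zero garbage so the zeroing is observable. */
  memset(a, 0xAB, sizeof a);
  memset(b, 0xAB, sizeof b);
  memset(c, 0xCD, sizeof c);
  memset(d, 0xCD, sizeof d);

  bar_loop(a, 4);
  bar_memset(b, 4);
  test_loop(c, 8);
  test_memset(d, 8);

  return memcmp(a, b, sizeof a) == 0 && memcmp(c, d, sizeof c) == 0;
}
```

The stores in each iteration are adjacent and jointly contiguous, which is why a single memset over `n` elements can stand in for the loop in this sketch.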

126614 Commits