llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Andrea Di Biagio	afbe21e2b8	x86 interrupt calling convention: only save xmm registers if the target supports SSE The existing code always saves the xmm registers for 64-bit targets even if the target doesn't support SSE (which is common for kernels). Thus, the compiler inserts movaps instructions which lead to CPU exceptions when an interrupt handler is invoked. This commit fixes this bug by returning a register set without xmm registers from getCalleeSavedRegs and getCallPreservedMask for such targets. Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D29959 llvm-svn: 295347	2017-02-16 18:25:37 +00:00
Sanjay Patel	0e46a1b5cf	[x86] add more tests of select of constants; NFC llvm-svn: 295346	2017-02-16 18:15:16 +00:00
Artur Pilipenko	17b23dc26c	[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336	2017-02-16 17:07:27 +00:00
Sjoerd Meijer	1f1704e85c	[AArch64] AArch64AsmParser clean up of isImmediate functions. NFC Regression test neon-diagnostics.s needed changing because it now produces a more specific diagnostic about the immediate ranges. One change in the expected error message is not obvious, but there are multiple candidates and it happens to pick the immediate diagnostic. Differential Revision: https://reviews.llvm.org/D29939 llvm-svn: 295331	2017-02-16 15:52:22 +00:00
Dan Gohman	5b9fa0f519	[WebAssembly] Add a cast to void to fix an unused private member warning, for now. llvm-svn: 295327	2017-02-16 15:21:37 +00:00
Simon Pilgrim	687e21ed00	[X86] Remove local areOnlyUsersOf helper and use SDNode::areOnlyUsersOf instead. llvm-svn: 295326	2017-02-16 15:11:49 +00:00
Marshall Clow	0eaf285460	Remove uses of deprecated std::random_shuffle in the LLVM code base. Reviewed as https://reviews.llvm.org/D29780 . llvm-svn: 295325	2017-02-16 14:37:03 +00:00
Diana Picus	10445ac029	[ARM] GlobalISel: Select floating point loads llvm-svn: 295321	2017-02-16 14:10:50 +00:00
Artur Pilipenko	310ec69c66	Revert -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316	2017-02-16 13:04:46 +00:00
Artur Pilipenko	409c061e49	[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patterns. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314	2017-02-16 12:53:26 +00:00
Diana Picus	e647a7a202	[ARM] GlobalISel: Select G_SEQUENCE and G_EXTRACT Since they're only used for passing around double precision floating point values into the general purpose registers, we'll lower them to VMOVDRR and VMOVRRD. llvm-svn: 295310	2017-02-16 12:19:57 +00:00
Diana Picus	ad8a798a52	[ARM] GlobalISel: Select double G_FADD and copies Just use VADDD if available, bail out if not. llvm-svn: 295309	2017-02-16 12:19:52 +00:00
Diana Picus	e6063ce05b	[ARM] GlobalISel: Assert that we don't use the FPR bank if we don't have VFP llvm-svn: 295308	2017-02-16 11:25:09 +00:00
Diana Picus	bf4b3c8bca	[ARM] GlobalISel: Add reg bank mappings for G_SEQUENCE and G_EXTRACT Support G_SEQUENCE and G_EXTRACT as needed for passing double precision floating point values in the soft-fp float mode. llvm-svn: 295306	2017-02-16 11:00:31 +00:00
Diana Picus	2eba84f7a1	[ARM] GlobalISel: Make the FPR bank 64-bit wide Also add mappings for single and double precision FP, and use them for G_FADD and G_LOAD. llvm-svn: 295302	2017-02-16 10:12:49 +00:00
Diana Picus	dc8724964f	[ARM] GlobalISel: Legalize 64-bit G_FADD and G_LOAD For now we just mark them as legal all the time and let the other passes bail out if they can't handle it. In the future, we'll want to move more of the brains into the legalizer. llvm-svn: 295300	2017-02-16 09:09:49 +00:00
NAKAMURA Takumi	beec9280ea	RWMutex.h: Use llvm-config.h instead of config.h in installed headers. llvm-svn: 295297	2017-02-16 08:22:08 +00:00
Diana Picus	070c98b2bf	[ARM] GlobalISel: Lower double precision FP args For the hard float calling convention, we just use the D registers. For the soft-fp calling convention, we use the R registers and move values to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we make sure to honor the endianness of the target, since the CCAssignFn doesn't do that for us. For pure soft float targets, we still bail out because we don't support the libcalls yet. llvm-svn: 295295	2017-02-16 07:53:07 +00:00
Craig Topper	aeb7a3dc72	[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit. llvm-svn: 295294	2017-02-16 07:35:23 +00:00
Craig Topper	19ca5c2ad5	[AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics. The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO. llvm-svn: 295290	2017-02-16 06:31:54 +00:00
Rui Ueyama	3da8848889	Split WinCOFFObjectWriter::writeSection. llvm-svn: 295276	2017-02-16 02:56:06 +00:00
Rui Ueyama	d6795f3c20	Split WinCOFFObjectWriter::writeObject function. llvm-svn: 295273	2017-02-16 02:35:48 +00:00
Matt Arsenault	97a1843703	AMDGPU: Remove llvm.SI.sendmsg llvm-svn: 295270	2017-02-16 02:01:17 +00:00
Matt Arsenault	daf5e675f7	AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics Update test uses with expansion in terms of new intrinsics. llvm-svn: 295269	2017-02-16 02:01:13 +00:00
Rui Ueyama	b4a640174f	Remove useless local variable. llvm-svn: 295268	2017-02-16 01:41:04 +00:00
Rui Ueyama	53b4d6e9de	Rename variables to match the LLVM style. llvm-svn: 295265	2017-02-16 01:06:45 +00:00
Hans Wennborg	babbd26fe3	[X86] Re-enable conditional tail calls and fix PR31257. This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262	2017-02-16 00:04:05 +00:00
Peter Collingbourne	4a4d05f793	PMB: Add an importing WPD pass to the start of the ThinLTO backend pipeline. Differential Revision: https://reviews.llvm.org/D30008 llvm-svn: 295260	2017-02-15 23:48:38 +00:00
Teresa Johnson	b2c109488e	Collapse my two entries in CODE_OWNERS.txt llvm-svn: 295259	2017-02-15 23:45:21 +00:00
Tim Northover	5b6f2d0cd7	GlobalISel: legalize va_arg on AArch64. Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255	2017-02-15 23:22:50 +00:00
Tim Northover	675ff280b6	GlobalISel: support translating va_arg Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. llvm-svn: 295254	2017-02-15 23:22:33 +00:00
Daniel Berlin	d8a24ee5b1	Implement intrinsic mangling for literal struct types. Fixes PR 31921 Summary: Predicateinfo requires an ugly workaround to try to avoid literal struct types due to the intrinsic mangling not being implemented. This workaround actually does not work in all cases (you can hit the assert by bootstrapping with -print-predicateinfo), and can't be made to work without DFS'ing the type (i.e. copying getMangledStr and using a version that detects if it would crash). Rather than do that, I just implemented the mangling. It seems simple, since they are unified structurally. Looking at the overloaded-mangling testcase we have, it actually turns out the gc intrinsics will also crash if you try to use a literal struct. Thus, the testcase added fails before this patch, and works after, without needing to resort to predicateinfo. Reviewers: chandlerc, davide Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29925 llvm-svn: 295253	2017-02-15 23:16:20 +00:00
Matt Arsenault	f68954636d	AMDGPU: Remove dead node definitions llvm-svn: 295247	2017-02-15 22:23:04 +00:00
Matt Arsenault	46abf021e5	Fix typos llvm-svn: 295246	2017-02-15 22:19:06 +00:00
Matt Arsenault	d0625484c6	AMDGPU: Consolidate sendmsg/sendmsghalt handling and tests llvm-svn: 295244	2017-02-15 22:17:09 +00:00
Eugene Zelenko	50e5259010	[Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295243	2017-02-15 22:17:02 +00:00
Matt Arsenault	4d2de9e5d1	DAG: Do not scalarize fsub if fneg is legal Tests will be included with future commit. llvm-svn: 295242	2017-02-15 22:02:42 +00:00
Peter Collingbourne	d91df946cb	Re-apply r295110 and r295144 with a fix for the ASan issue. llvm-svn: 295241	2017-02-15 21:56:51 +00:00
Matt Arsenault	332d674a45	AMDGPU: Replace assert with report_fatal_error Also use a more refined condition. llvm-svn: 295239	2017-02-15 21:50:34 +00:00
Keno Fischer	c568d201c5	[GlobalObject] Fix setSection("") Summary: In rL291613, the section name was interned in LLVMContext. However, this broke the ability to remove the section from a GlobalObject, because it tried to intern empty strings, which is not allowed. Fix that and add an appropriate regression test. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D29795 llvm-svn: 295238	2017-02-15 21:42:42 +00:00
Sanjay Patel	5dc8de15ec	[InstCombine] improve formatting; NFC llvm-svn: 295237	2017-02-15 21:31:34 +00:00
Peter Collingbourne	fccb6e3a69	AssumptionCache: Disable the verifier by default, move it behind a hidden cl::opt and verify from releaseMemory(). This is a short term solution to the problem that many passes currently fail to update the assumption cache. In the long term the verifier should not be controllable with a flag. We should either fix all passes to correctly update the assumption cache and enable the verifier unconditionally or somehow arrange for the assumption list to be updated automatically by passes. Differential Revision: https://reviews.llvm.org/D30003 llvm-svn: 295236	2017-02-15 21:10:09 +00:00
Simon Pilgrim	9fc30279d8	[X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing. Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow. llvm-svn: 295235	2017-02-15 21:09:00 +00:00
Arnold Schwaighofer	8e9cd89822	AddressSanitizer: don't track swifterror memory addresses They are register promoted by ISel and so it makes no sense to treat them as memory. Inserting calls to the thread sanitizer would also generate invalid IR. You would hit: "swifterror value can only be loaded and stored from, or as a swifterror argument!" llvm-svn: 295230	2017-02-15 20:43:43 +00:00
Ahmed Bougacha	6d1de4abe7	[AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish. am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT operand type. It made sense for branch targets, as those are represented as MVT::Other in SDAG. But loads operate on pointers. This shouldn't have an observable effect on any in-tree code, but helps make the patterns consistent for external users. llvm-svn: 295229	2017-02-15 20:38:31 +00:00
Ahmed Bougacha	b0c2ac7a60	[OptDiag] Pass const Values/Types to Argument. NFC. llvm-svn: 295228	2017-02-15 20:38:28 +00:00
Ahmed Bougacha	233dd4cec3	[IR] Accept 'const Type &' in the Type operator<<. NFC. Type::print is const; there's no reason for the operator not to be. llvm-svn: 295227	2017-02-15 20:38:22 +00:00
Tobias Edler von Koch	fddccdb0f9	[LTO] Add ability to emit assembly to new LTO API Summary: Add a field to LTO::Config, CGFileType, to select the file type to emit (object or assembly). This is useful for testing and to implement -save-temps. Reviewers: tejohnson, mehdi_amini, pcc Reviewed By: mehdi_amini Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D29475 llvm-svn: 295226	2017-02-15 20:36:36 +00:00
Kyle Butt	96c1e7e4f0	Codegen: Make chains from trellis-shaped CFGs

    Lay out trellis-shaped CFGs optimally.
    A trellis of the shape below:

      A     B
      |\   /|
      | \ / |
      |  X  |
      | / \ |
      |/   \|
      C     D

    would be laid out A; B->C ; D by the current layout algorithm. Now we
    identify trellises and lay them out either A->C; B->D or A->D; B->C.
    This scales with an increasing number of predecessors. A trellis is a
    group of 2 or more predecessor blocks that all have the same successors.

    Because of this we can tail duplicate to extend existing trellises.

    As an example consider the following CFG:

        B   D   F   H
       / \ / \ / \ / \
      A---C---E---G---Ret

    Where A,C,E,G are all small (Currently 2 instructions).

    The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret.

    The current code will copy C into B, E into D and G into F and yield the
    layout A,C,B(C),E,D(E),F(G),G,H,ret

      define void @straight_test(i32 %tag) {
      entry:
        br label %test1
      test1: ; A
        %tagbit1 = and i32 %tag, 1
        %tagbit1eq0 = icmp eq i32 %tagbit1, 0
        br i1 %tagbit1eq0, label %test2, label %optional1
      optional1: ; B
        call void @a()
        br label %test2
      test2: ; C
        %tagbit2 = and i32 %tag, 2
        %tagbit2eq0 = icmp eq i32 %tagbit2, 0
        br i1 %tagbit2eq0, label %test3, label %optional2
      optional2: ; D
        call void @b()
        br label %test3
      test3: ; E
        %tagbit3 = and i32 %tag, 4
        %tagbit3eq0 = icmp eq i32 %tagbit3, 0
        br i1 %tagbit3eq0, label %test4, label %optional3
      optional3: ; F
        call void @c()
        br label %test4
      test4: ; G
        %tagbit4 = and i32 %tag, 8
        %tagbit4eq0 = icmp eq i32 %tagbit4, 0
        br i1 %tagbit4eq0, label %exit, label %optional4
      optional4: ; H
        call void @d()
        br label %exit
      exit:
        ret void
      }

    here is the layout after D27742:

      straight_test:                  # @straight_test
      ; ... Prologue elided
      ; BB#0:                         # %entry ; A (merged with test1)
      ; ... More prologue elided
        mr 30, 3
        andi. 3, 30, 1
        bc 12, 1, .LBB0_2
      ; BB#1:                         # %test2 ; C
        rlwinm. 3, 30, 0, 30, 30
        beq 0, .LBB0_3
        b .LBB0_4
      .LBB0_2:                        # %optional1 ; B (copy of C)
        bl a
        nop
        rlwinm. 3, 30, 0, 30, 30
        bne 0, .LBB0_4
      .LBB0_3:                        # %test3 ; E
        rlwinm. 3, 30, 0, 29, 29
        beq 0, .LBB0_5
        b .LBB0_6
      .LBB0_4:                        # %optional2 ; D (copy of E)
        bl b
        nop
        rlwinm. 3, 30, 0, 29, 29
        bne 0, .LBB0_6
      .LBB0_5:                        # %test4 ; G
        rlwinm. 3, 30, 0, 28, 28
        beq 0, .LBB0_8
        b .LBB0_7
      .LBB0_6:                        # %optional3 ; F (copy of G)
        bl c
        nop
        rlwinm. 3, 30, 0, 28, 28
        beq 0, .LBB0_8
      .LBB0_7:                        # %optional4 ; H
        bl d
        nop
      .LBB0_8:                        # %exit ; Ret
        ld 30, 96(1)                  # 8-byte Folded Reload
        addi 1, 1, 112
        ld 0, 16(1)
        mtlr 0
        blr

    The tail-duplication has produced some benefit, but it has also produced a
    trellis which is not laid out optimally. With this patch, we improve the
    layouts of such trellises, and decrease the cost calculation for
    tail-duplication accordingly.

    This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have
    back edges, which is a negative, but it has a bigger compensating
    positive, which is that it handles the case where there are long strings
    of skipped blocks much better than the original layout. Both layouts
    handle runs of executed blocks equally well. Branch prediction also
    improves if there is any correlation between subsequent optional blocks.

    Here is the resulting concrete layout:

      straight_test:                  # @straight_test
      ; BB#0:                         # %entry ; A (merged with test1)
        mr 30, 3
        andi. 3, 30, 1
        bc 12, 1, .LBB0_4
      ; BB#1:                         # %test2 ; C
        rlwinm. 3, 30, 0, 30, 30
        bne 0, .LBB0_5
      .LBB0_2:                        # %test3 ; E
        rlwinm. 3, 30, 0, 29, 29
        bne 0, .LBB0_6
      .LBB0_3:                        # %test4 ; G
        rlwinm. 3, 30, 0, 28, 28
        bne 0, .LBB0_7
        b .LBB0_8
      .LBB0_4:                        # %optional1 ; B (Copy of C)
        bl a
        nop
        rlwinm. 3, 30, 0, 30, 30
        beq 0, .LBB0_2
      .LBB0_5:                        # %optional2 ; D (Copy of E)
        bl b
        nop
        rlwinm. 3, 30, 0, 29, 29
        beq 0, .LBB0_3
      .LBB0_6:                        # %optional3 ; F (Copy of G)
        bl c
        nop
        rlwinm. 3, 30, 0, 28, 28
        beq 0, .LBB0_8
      .LBB0_7:                        # %optional4 ; H
        bl d
        nop
      .LBB0_8:                        # %exit

    Differential Revision: https://reviews.llvm.org/D28522 llvm-svn: 295223	2017-02-15 19:49:14 +00:00
Xinliang David Li	06359df83c	include function name in dot filename Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220	2017-02-15 19:21:04 +00:00
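
The trellis commit listed above (96c1e7e4f0) defines a trellis as a group of 2 or more predecessor blocks that all have the same successors. The following is only a rough illustration of that definition, not the code from the patch: the `CFG` alias and the `findTrellises` function are hypothetical names invented here, blocks are identified by plain strings, and the requirement of at least two shared successors is an assumption taken from the diagram in the commit message.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical CFG representation: block name -> set of successor names.
using CFG = std::map<std::string, std::set<std::string>>;

// Groups blocks by their exact successor set and returns every group that has
// at least two members. Blocks with fewer than two successors are skipped
// (an assumption for this sketch, matching the A/B -> C/D diagram).
std::vector<std::vector<std::string>> findTrellises(const CFG &G) {
  std::map<std::set<std::string>, std::vector<std::string>> BySuccessors;
  for (const auto &Entry : G) {
    if (Entry.second.size() >= 2)
      BySuccessors[Entry.second].push_back(Entry.first);
  }

  std::vector<std::vector<std::string>> Trellises;
  for (const auto &Group : BySuccessors) {
    if (Group.second.size() >= 2)
      Trellises.push_back(Group.second);
  }
  return Trellises;
}
```

For the four-block diagram in the commit message (A and B each branching to C and D), this sketch would report a single group containing A and B. Choosing between the A->C; B->D and A->D; B->C layouts is a separate step that the sketch does not attempt.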

144949 Commits