llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00

Author	SHA1	Message	Date
Tim Northover	5b6f2d0cd7	GlobalISel: legalize va_arg on AArch64. Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255	2017-02-15 23:22:50 +00:00
Tim Northover	675ff280b6	GlobalISel: support translating va_arg Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. llvm-svn: 295254	2017-02-15 23:22:33 +00:00
Matt Arsenault	46abf021e5	Fix typos llvm-svn: 295246	2017-02-15 22:19:06 +00:00
Matt Arsenault	4d2de9e5d1	DAG: Do not scalarize fsub if fneg is legal Tests will be included with future commit. llvm-svn: 295242	2017-02-15 22:02:42 +00:00
Kyle Butt	96c1e7e4f0	Codegen: Make chains from trellis-shaped CFGs Lay out trellis-shaped CFGs optimally. A trellis of the shape below: A B \|\ /\| \| \ / \| \| X \| \| / \ \| \|/ \\| C D would be laid out A; B->C ; D by the current layout algorithm. Now we identify trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an increasing number of predecessors. A trellis is a a group of 2 or more predecessor blocks that all have the same successors. because of this we can tail duplicate to extend existing trellises. As an example consider the following CFG: B D F H / \ / \ / \ / \ A---C---E---G---Ret Where A,C,E,G are all small (Currently 2 instructions). The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret. The current code will copy C into B, E into D and G into F and yield the layout A,C,B(C),E,D(E),F(G),G,H,ret define void @straight_test(i32 %tag) { entry: br label %test1 test1: ; A %tagbit1 = and i32 %tag, 1 %tagbit1eq0 = icmp eq i32 %tagbit1, 0 br i1 %tagbit1eq0, label %test2, label %optional1 optional1: ; B call void @a() br label %test2 test2: ; C %tagbit2 = and i32 %tag, 2 %tagbit2eq0 = icmp eq i32 %tagbit2, 0 br i1 %tagbit2eq0, label %test3, label %optional2 optional2: ; D call void @b() br label %test3 test3: ; E %tagbit3 = and i32 %tag, 4 %tagbit3eq0 = icmp eq i32 %tagbit3, 0 br i1 %tagbit3eq0, label %test4, label %optional3 optional3: ; F call void @c() br label %test4 test4: ; G %tagbit4 = and i32 %tag, 8 %tagbit4eq0 = icmp eq i32 %tagbit4, 0 br i1 %tagbit4eq0, label %exit, label %optional4 optional4: ; H call void @d() br label %exit exit: ret void } here is the layout after D27742: straight_test: # @straight_test ; ... Prologue elided ; BB#0: # %entry ; A (merged with test1) ; ... More prologue elided mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_2 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_3 b .LBB0_4 .LBB0_2: # %optional1 ; B (copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_4 .LBB0_3: # %test3 ; E rlwinm. 
3, 30, 0, 29, 29 beq 0, .LBB0_5 b .LBB0_6 .LBB0_4: # %optional2 ; D (copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_5: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 b .LBB0_7 .LBB0_6: # %optional3 ; F (copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit ; Ret ld 30, 96(1) # 8-byte Folded Reload addi 1, 1, 112 ld 0, 16(1) mtlr 0 blr The tail-duplication has produced some benefit, but it has also produced a trellis which is not laid out optimally. With this patch, we improve the layouts of such trellises, and decrease the cost calculation for tail-duplication accordingly. This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have back edges, which is a negative, but it has a bigger compensating positive, which is that it handles the case where there are long strings of skipped blocks much better than the original layout. Both layouts handle runs of executed blocks equally well. Branch prediction also improves if there is any correlation between subsequent optional blocks. Here is the resulting concrete layout: straight_test: # @straight_test ; BB#0: # %entry ; A (merged with test1) mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_4 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_5 .LBB0_2: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_3: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 bne 0, .LBB0_7 b .LBB0_8 .LBB0_4: # %optional1 ; B (Copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_2 .LBB0_5: # %optional2 ; D (Copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_3 .LBB0_6: # %optional3 ; F (Copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit Differential Revision: https://reviews.llvm.org/D28522 llvm-svn: 295223	2017-02-15 19:49:14 +00:00
Xinliang David Li	06359df83c	include function name in dot filename Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220	2017-02-15 19:21:04 +00:00
Michael Kuperstein	8c57546852	[DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source We currently can't legalize those, but we should really not be creating them in the first place, since legalization would probably look similar to the way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD. This fixes PR311956. Differential Revision: https://reviews.llvm.org/D29961 llvm-svn: 295213	2017-02-15 18:37:26 +00:00
Sagar Thakur	5602c70c9c	[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit. Reviewed by sdardis, dberris Differential: D27697 llvm-svn: 295164	2017-02-15 10:48:11 +00:00
Craig Topper	a77b0bd57e	[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in an extract and creates a properly aligned start index for the extract. This range finding is unnecessary; we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152	2017-02-15 05:57:16 +00:00
Reid Kleckner	96b6dea648	[BranchFolding] Tail common all identical unreachable blocks Summary: Blocks ending in unreachable are typically cold because they end the program or throw an exception, so merging them with other identical blocks is usually profitable because it reduces the size of cold code. MachineBlockPlacement generally does not arrange to fall through to such blocks, so commoning these blocks will not introduce additional unconditional branches. Reviewers: hans, iteratee, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29153 llvm-svn: 295105	2017-02-14 21:02:24 +00:00
Tim Northover	f6d277ab26	GlobalISel: introduce G_PTR_MASK to simplify alloca handling. This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. llvm-svn: 295103	2017-02-14 20:56:18 +00:00
Eric Christopher	6cba8bbd4f	Reformat slightly. llvm-svn: 295096	2017-02-14 19:43:50 +00:00
Wolfgang Pieb	5bbf2372ac	Reapply r294532, reverted in r294787. Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093	2017-02-14 19:08:45 +00:00
Aditya Nandakumar	b5cce55dd5	[Tablegen] Instrumenting table gen DAGGenISelDAG To assist in debugging ISEL or prioritizing GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc: one containing the patterns that are used during selection and the other containing the include source locations of those patterns. Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV. llvm-svn: 295081	2017-02-14 18:32:41 +00:00
Adam Nemet	337f461009	Add new pass LazyMachineBlockFrequencyInfo And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072	2017-02-14 17:21:09 +00:00
Artyom Skrobov	7174382d29	Removing a redundant assignment llvm-svn: 295055	2017-02-14 14:44:01 +00:00
Eugene Zelenko	c7114c5b3a	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009	2017-02-14 00:33:36 +00:00
Tim Northover	979f34e8cc	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Tim Northover	291842e005	MIR: parse & print the atomic parts of a MachineMemOperand. We're going to need them very soon for GlobalISel. llvm-svn: 294992	2017-02-13 22:14:08 +00:00
Taewook Oh	f644f79154	Address post-commit comments for https://reviews.llvm.org/D29596 . NFCI. llvm-svn: 294985	2017-02-13 21:12:27 +00:00
Arnold Schwaighofer	51dc444272	swiftcc: Don't emit tail calls from callers with swifterror parameters Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982	2017-02-13 19:58:28 +00:00
Taewook Oh	c49af0f212	Make MachineBasicBlock::updateTerminator to update DebugLoc as well Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example: ``` 1 extern int bar(int x); 2 3 int foo(int begin, int end) { 4 int i; 5 int ret = 0; 6 for ( 7 i = begin ; 8 i != end ; 9 i++) 10 { 11 ret += bar(i); 12 } 13 return ret; 14 } ``` Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3: ``` define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 { entry: %cmp6 = icmp eq i32* %begin, %end, !dbg !9 br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12 for.body.preheader: ; preds = %entry br label %for.body, !dbg !13 for.body: ; preds = %for.body.preheader, %for.body %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ] %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15 %call = tail call i32 @bar(i32 %0), !dbg !19 %add = add nsw i32 %call, %ret.08, !dbg !20 %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21 %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9 br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22 for.end.loopexit: ; preds = %for.body br label %for.end, !dbg !24 for.end: ; preds = %for.end.loopexit, %entry %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ] ret i32 %ret.0.lcssa, !dbg !24 } ``` where ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` . As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. 
However, after the execution of the 'MachineBasicBlock::updateTerminator' function, which is triggered by the MachineSinking pass, the DebugLoc info is dropped as shown below (note that there is no debug-location for JNE_1): ``` bb.0.entry: successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000) liveins: %rdi, %rsi %6 = COPY %rsi %5 = COPY %rdi %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9 JNE_1 %bb.1.for.body.preheader, implicit %eflags ``` This patch addresses this issue and makes newly created branch instructions keep their debug-location info. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D29596 llvm-svn: 294976	2017-02-13 18:15:31 +00:00
Quentin Colombet	e407a5d391	[FastISel] Add a diagnostic to warn on fallback. This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970	2017-02-13 17:38:59 +00:00
Sanne Wouda	1c46d953e0	[Assembler] Improve diagnostics for inline assembly. Summary: Keep a vector of LocInfos around; one for each call to EmitInlineAsm. Since each call to EmitInlineAsm creates a new buffer in the inline asm SourceMgr, we can use the buffer number to map to the right LocInfo. Reviewers: rengolin, grosbach, rnk, echristo Reviewed By: rnk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29769 llvm-svn: 294947	2017-02-13 13:58:00 +00:00
Andrew V. Tischenko	78f7599134	Compile time decreasing in the case we're dealing with Machine Combiner. Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see below). Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz DAGCombiner - trunk time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.685s DAGCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.655s MachineCombiner w/o Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m21.614s MachineCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.593s The test spill_fdiv.ll is attached to D29627 D29627 should be closed. llvm-svn: 294936	2017-02-13 09:43:37 +00:00
Craig Topper	0828b50f44	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of an undef vector can just use the original input to the extract. llvm-svn: 294932	2017-02-13 04:53:33 +00:00
Craig Topper	b43e610c0d	[DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930	2017-02-12 23:49:49 +00:00
Sanjay Patel	c0558e8c9e	[TargetLowering] fix SETCC SETLT folding with FP types The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924	2017-02-12 23:07:52 +00:00
Craig Topper	c3fd393098	[DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more generic to support larger concats. llvm-svn: 294875	2017-02-11 22:57:09 +00:00
Sanjay Patel	286263086f	[TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863	2017-02-11 18:01:55 +00:00
Nico Weber	4b68201166	Revert r294532, it caused PR31935 llvm-svn: 294787	2017-02-10 21:57:30 +00:00
Tim Shen	564288cf4a	[XRay] Implement powerpc64le xray. Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc. Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how. Reviewers: dberris, echristo, iteratee, kbarton, hfinkel Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29742 llvm-svn: 294781	2017-02-10 21:03:24 +00:00
Tim Northover	e0ca173664	GlobalISel: drop lifetime intrinsics during translation. We don't use them yet and they just cause problems. llvm-svn: 294770	2017-02-10 19:10:38 +00:00
Simon Pilgrim	91218fd943	[DAGCombine] Allow vector constant folding of any value type before type legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749	2017-02-10 14:37:25 +00:00
Craig Topper	eef1a7854a	[SelectionDAG] Dump the DAG after legalizing vector ops and after the second type legalization Summary: With -debug, we aren't dumping the DAG after legalizing vector ops. In particular, on X86 with AVX1 only, we don't dump the DAG after we split 256-bit integer ops into pairs of 128-bit ADDs since this occurs during vector legalization. I'm only dumping if the legalize vector ops changes something since we don't print anything during legalize vector ops. So this dump shows up right after the first type-legalization dump happens. So if nothing changed this second dump is unnecessary. Having said that though, I think we should probably fix legalize vector ops to log what it's doing. Reviewers: RKSimon, eli.friedman, spatel, arsenm, chandlerc Reviewed By: RKSimon Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D29554 llvm-svn: 294711	2017-02-10 05:05:57 +00:00
Eric Fiselier	db10580e9b	[CMake] Fix pthread handling for out-of-tree builds LLVM defines `PTHREAD_LIB` which is used by AddLLVM.cmake and various projects to correctly link the threading library when needed. Unfortunately `PTHREAD_LIB` is defined by LLVM's `config-ix.cmake` file which isn't installed and therefore can't be used when configuring out-of-tree builds. This causes such builds to fail since `pthread` isn't being correctly linked. This patch attempts to fix that problem by renaming and exporting `LLVM_PTHREAD_LIB` as part of `LLVMConfig.cmake`. I renamed `PTHREAD_LIB` because it seemed likely to cause collisions with downstream users of `LLVMConfig.cmake`. llvm-svn: 294690	2017-02-10 01:59:20 +00:00
Geoff Berry	06b077336c	[SelectionDAG] Fix bugs in inverted condition splitting code. Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605	2017-02-09 18:28:17 +00:00
David Bozier	1b4cfb5426	Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593	2017-02-09 15:40:14 +00:00
David Bozier	4feda20555	[Stack Protection] Add diagnostic information for why stack protection was applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590	2017-02-09 15:08:40 +00:00
Artur Pilipenko	f15639bed9	[DAGCombiner] Support non-zero offset in load combine Enable folding patterns which load the value from non-zero offset: i8 *a = ... i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24) => i32 val = *((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582	2017-02-09 12:06:01 +00:00
Wolfgang Pieb	d9e1b39f5d	Reapply r294356 ("Keep track of spilled variables in LiveDebugValues"). Was reverted with r294447 due to undefined behavior with negative offsets in DBG_VALUE instructions. llvm-svn: 294532	2017-02-08 23:46:59 +00:00
Tim Northover	32ff691d90	GlobalISel: legalize G_FPOW to a libcall on AArch64. There's no instruction to implement it. llvm-svn: 294531	2017-02-08 23:23:39 +00:00
Tim Northover	be5e6870cb	GlobalISel: translate @llvm.pow intrinsic to G_FPOW. It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it so we'd just as well enable that flexibility from the start. llvm-svn: 294530	2017-02-08 23:23:32 +00:00
Tim Northover	2ea0d1ebe3	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. llvm-svn: 294519	2017-02-08 21:22:15 +00:00
Simon Dardis	4b8ff08f4f	[DebugInfo] Rename EmitDebugValue to EmitDebugThreadLocal (NFC) As pointed out by David Blaikie in the post commit review of r292624, EmitDebugValue should be called EmitDebugThreadLocal. llvm-svn: 294500	2017-02-08 19:03:46 +00:00
Artur Pilipenko	ec27d1db39	[DAGCombiner] NFC. Mark ByteProvider accessors as const llvm-svn: 294494	2017-02-08 17:59:34 +00:00
Tim Northover	7032d34569	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. llvm-svn: 294492	2017-02-08 17:57:20 +00:00
Sanne Wouda	22ae21fb30	[Assembler] Enable nicer diagnostics for inline assembly. Fixed test. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294458	2017-02-08 14:48:05 +00:00
Diana Picus	ac31e77632	Revert "[Assembler] Enable nicer diagnostics for inline assembly." This reverts commit r294433 because it seems it broke the buildbots. llvm-svn: 294448	2017-02-08 14:02:16 +00:00
NAKAMURA Takumi	fe7417d149	Revert r294356, "DebugInfo: Track spilled variables in LiveDebugValues" It caused undefined behavior in VarLoc. As far as I investigated, - VarLoc::VarLoc() treats negative offset value as InvalidKind. Consider the case that (int64_t)MI.getOperand(1).getImm() is negative and whether it satisfies ((uint64_t)Offset < (1ULL << 32)). - Comparison operators in VarLoc behave undefined since VarLoc::Loc.Hash is uninitialized in case of InvalidKind. I guess Offset (in VarLoc) could be made aware of signed, but I am not sure. So I have reverted it for now. llvm-svn: 294447	2017-02-08 13:49:28 +00:00
Sanne Wouda	2ff17740cf	[Assembler] Enable nicer diagnostics for inline assembly. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294433	2017-02-08 10:20:07 +00:00
Matt Arsenault	25233f6591	TargetLowering: Remove AddrSpace parameter from GetAddrModeArguments It doesn't make any sense to pass in to what is supposed to be parsing the call, and this can be inferred from the pointer output. llvm-svn: 294412	2017-02-08 07:09:03 +00:00
Amaury Sechet	95fe0b1163	[DAGCombiner] Push truncate through adde when the carry isn't used. Summary: As per title. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29528 llvm-svn: 294394	2017-02-08 00:32:36 +00:00
Wolfgang Pieb	7a6ce00dea	DebugInfo: Track spilled variables in LiveDebugValues When variables are spilled to the stack by the register allocator, keep track of their debug locations in LiveDebugValues and insert DBG_VALUE instructions at the appropriate place. Ensure that the locations are propagated down the dominator tree via the existing mechanisms. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D29500 llvm-svn: 294356	2017-02-07 21:23:15 +00:00
Hans Wennborg	2f0688f909	[X86] Disable conditional tail calls (PR31257) They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. llvm-svn: 294348	2017-02-07 20:37:45 +00:00
Tim Northover	f2b2f0a4d5	GlobalISel: translate @llvm.va_end intrinsic. Turns out no-one actually cares about this one (at least) in tree so we can just drop it entirely. llvm-svn: 294345	2017-02-07 20:08:59 +00:00
Sanjoy Das	50e258bbdb	[ImplicitNullCheck] Extend Implicit Null Check scope by using stores Summary: This change allows usage of store instructions for implicit null checks. Memory Aliasing Analysis is not used, and the change conservatively assumes that any store and load may access the same memory. As a result, re-ordering of store-store, store-load and load-store is prohibited. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: atrick, llvm-commits Differential Revision: https://reviews.llvm.org/D29400 llvm-svn: 294338	2017-02-07 19:19:49 +00:00
Reid Kleckner	b7e06ea0d8	[SDAGISel] Simplify some SDAGISel code, NFC Hoist entry block code for arguments and swift error values out of the basic block instruction selection loop. Lowering arguments once up front seems much more readable than doing it conditionally inside the loop. It also makes it clear that argument lowering can update StaticAllocaMap because no instructions have been selected yet. Also use range-based for loops where possible. llvm-svn: 294329	2017-02-07 18:42:53 +00:00
Sanjay Patel	5128bb8442	[TargetLowering] fix formatting and comments for ShrinkDemandedConstant; NFC llvm-svn: 294325	2017-02-07 18:04:26 +00:00
Igor Laevsky	d32716a2c2	[CodeGenPrepare] Hoist all getSubtargetImpl calls to the beginning of the pass Differential Revision: https://reviews.llvm.org/D29456 llvm-svn: 294301	2017-02-07 13:27:20 +00:00
Daniel Jasper	53cbac5d95	Revert "[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)" This reverts commit r294186. On an internal test, this triggers an out-of-memory error on PPC, presumably because there is another dagcombine that does the exact opposite, triggering an endless loop consuming more and more memory. Chandler has started creating a reduced test case and we'll attach it as soon as possible. llvm-svn: 294288	2017-02-07 08:57:50 +00:00
Matthias Braun	9d774631eb	RegisterCoalescer: Fix joinReservedPhysReg() joinReservedPhysReg() can only deal with a liverange in a single basic block when copying from a vreg into a physreg. See also rdar://30306405 Differential Revision: https://reviews.llvm.org/D29436 llvm-svn: 294268	2017-02-07 01:59:39 +00:00
Tim Northover	84d9dc082a	GlobalISel: legalize narrow G_SELECTS on AArch64. Otherwise there aren't any patterns to select them. llvm-svn: 294261	2017-02-06 23:41:27 +00:00
Tim Northover	6c1671c039	GlobalISel: legalize G_INSERT instructions We don't handle all cases yet (see arm64-fallback.ll for an example), but this is enough to cover most common C++ code so it's a good place to start. llvm-svn: 294247	2017-02-06 21:56:47 +00:00
Artur Pilipenko	45810c9c2e	[DAGCombiner] Support bswap as a part of load combine patterns Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29397 llvm-svn: 294201	2017-02-06 17:48:08 +00:00
Amaury Sechet	d379414d1c	Add ADDC to SelectionDAG::computeKnownBits and ComputeNumSignBits. Summary: As per title. Reviewers: bkramer, sunfish, lattner, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29521 llvm-svn: 294188	2017-02-06 14:59:06 +00:00
Amaury Sechet	cc55016d9e	[DAGCombiner] Make DAGCombiner smarter about overflow Summary: Leverage it to transform addc into add. Reviewers: mkuper, spatel, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29524 llvm-svn: 294187	2017-02-06 14:54:49 +00:00
Amaury Sechet	9df9894472	[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry) Summary: This is extracted from D29443 . Reviewers: mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29564 llvm-svn: 294186	2017-02-06 14:28:39 +00:00
Simon Pilgrim	54efd46261	[X86][SSE] Combine shuffle nodes with multiple uses if all the users are being combined. Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines. We don't take into account that in some cases we will already have combined all the users during recursively calling up the shuffle tree. This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all its users are in that list. Differential Revision: https://reviews.llvm.org/D29399 llvm-svn: 294183	2017-02-06 13:44:45 +00:00
Kamil Rytarowski	aa9de58af3	Revamp llvm::once_flag to be closer to std::once_flag Summary: Make this interface reusable similarly to std::call_once and std::once_flag interface. This makes porting LLDB to NetBSD easier as there was in the original approach a portable way to specify a non-static once_flag. With this change translating std::once_flag to llvm::once_flag is mechanical. Sponsored by <The NetBSD Foundation> Reviewers: mehdi_amini, labath, joerg Reviewed By: mehdi_amini Subscribers: emaste, clayborg Differential Revision: https://reviews.llvm.org/D29566 llvm-svn: 294143	2017-02-05 21:13:06 +00:00
Geoff Berry	60ab01989c	[SelectionDAG] In InstrEmitter, handle EXTRACT_SUBREG of a physical register. Summary: Without this change, the getVR() call would hit an assert since it was being passed a physical register. Update the AArch64/ldst-opt.ll test with a case that triggers this behavior by adding a run with strict-align, which causes an unaligned STR XZR instruction to be split into byte stores, creating an EXTRACT_SUBREG of XZR that triggers the original problem. Reviewers: bogner, qcolombet, MatzeB, atrick Subscribers: aemerson, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D29495 llvm-svn: 294129	2017-02-05 18:28:14 +00:00
Amaury Sechet	0d47da337e	[DAGCombiner] Leverage add's commutativity Summary: This avoids the need to duplicate every pattern and actually ends up exposing some opportunities to optimize existing patterns that did not exist in both directions on an existing test case. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29541 llvm-svn: 294125	2017-02-05 14:22:20 +00:00
Craig Topper	4afd14d090	[DAGCombiner] Canonicalize the order of a chain of INSERT_SUBVECTORs. Based on similar code for INSERT_VECTOR_ELT. llvm-svn: 294110	2017-02-04 23:26:39 +00:00
Craig Topper	4de93cedb8	[DAGCombiner] Use DAG.getAnyExtOrTrunc to simplify some code. NFC llvm-svn: 294109	2017-02-04 23:26:37 +00:00
Craig Topper	1c21c7148f	[DAGCombiner] In visitINSERT_VECTOR_ELT, move check for BUILD_VECTOR being legal below code that just canonicalizes INSERT_VECTOR_ELT without creating BUILD_VECTORS. llvm-svn: 294108	2017-02-04 23:26:34 +00:00
Amaury Sechet	58d066d9b0	Formatting in DAGCombiner. NFC llvm-svn: 294091	2017-02-04 13:01:53 +00:00
Matthias Braun	95a43d873a	MachineCopyPropagation: Respect implicit operands of COPY The code failed to check implicit operands of COPY instructions for defs/uses. Differential Revision: https://reviews.llvm.org/D29522 llvm-svn: 294088	2017-02-04 02:27:20 +00:00
Matthias Braun	6977c6a133	MachineCopyPropagation: Do not consider undef operands as clobbers This was originally introduced in r278321 to work around correctness problems in the ExecutionDepsFix pass; Probably also to keep the performance benefits of breaking the false dependencies which of course also affect undef operands. ExecutionDepsFix has been improved here recently (see for example r278321) so we should not need this exception any longer. Differential Revision: https://reviews.llvm.org/D29525 llvm-svn: 294087	2017-02-04 02:27:13 +00:00
Kyle Butt	0dd8e04bae	[CodeGen]: BlockPlacement: Skip extraneous logging. Move a check for blocks that are not candidates for tail duplication up before the logging. Reduces logging noise. No non-logging changes intended. llvm-svn: 294086	2017-02-04 02:26:34 +00:00
Kyle Butt	1d06a6ceba	[CodeGen]: BlockPlacement: Apply const liberally. NFC Anything that needs to be passed to AnalyzeBranch unfortunately can't be const, or more would be const. Added const_iterator to BlockChain to allow BlockChain to be const when we don't expect to change it. llvm-svn: 294085	2017-02-04 02:26:32 +00:00
Eugene Zelenko	4405dc9364	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce TargetInstrInfo.h dependencies. llvm-svn: 294084	2017-02-04 02:00:53 +00:00
Craig Topper	2cfb460c9f	[TwoAddressInstruction] Fix typo in comment. NFC llvm-svn: 294083	2017-02-04 01:58:10 +00:00
Brendon Cahoon	e92b0e4047	[RegisterCoalescer] Do not call getInstructionIndex with DBG_VALUE An assert occurs when calling SlotIndexes::getInstructionIndex with a DBG_VALUE instruction because the function expects an instruction with a slot index. However, there is no slot index for a DBG_VALUE instruction. Differential Revision: https://reviews.llvm.org/D29048 llvm-svn: 294070	2017-02-04 00:10:22 +00:00
Ahmed Bougacha	86ba72cd49	[TLI] Robustize SDAG LibFunc proto checking by merging it into TLI. This re-applies commit r292189, reverted in r292191. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to a few tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. - a handful of SystemZ tests check the SDAG support for lax prototype checking: Ulrich agrees on removing them. I don't think it's worth supporting any of these (IMO) invalid testcases. Instead, fix them to be more meaningful. llvm-svn: 294028	2017-02-03 19:11:19 +00:00
Tim Northover	7804ec0aaa	GlobalISel: translate dynamic alloca instructions. llvm-svn: 294022	2017-02-03 18:22:45 +00:00
Alexey Bataev	a3c2206d19	[SelectionDAG] Fix for PR30775: Assertion `NodeToMatch->getOpcode() != ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"' failed. NodeToMatch can be modified during matching, but code does not handle this situation. Differential Revision: https://reviews.llvm.org/D29292 llvm-svn: 294003	2017-02-03 12:28:40 +00:00
David Blaikie	75e64d5564	DebugInfo: ensure type and namespace names are included in pubnames/pubtypes even when they are only present in type units While looking to add support for placing singular types (types that will only be emitted in one place (such as attached to a strong vtable or explicit template instantiation definition)) not in type units (since type units have overhead) I stumbled across that change causing an increase in pubtypes. Turns out we were missing some types from type units if they were only referenced from other type units and not from the debug_info section. This fixes that, following GCC's line of describing the offset of such entities as the CU die (since there's no compile unit-relative offset that would describe such an entity - they aren't in the CU). Also like GCC, this change prefers to describe the type stub within the CU rather than the "just use the CU offset" fallback where possible. This may give the DWARF consumer some opportunity to find the extra info in the type stub - though I'm not sure GDB does anything with this currently. The size of the pubnames/pubtypes sections now match exactly with or without type units enabled. This nearly triples (+189%) the pubtypes section for a clang self-host and grows pubnames by 0.07% (without compression). For a total of 8% increase in debug info sections of the objects of a Split DWARF build when using type units. llvm-svn: 293971	2017-02-03 00:44:18 +00:00
Bob Haarman	b5c776edaf	[lto] add getLinkerOpts() Summary: Some compilers, including MSVC and Clang, allow linker options to be specified in source files. In the legacy LTO API, there is a getLinkerOpts() method that returns linker options for the bitcode module being processed. This change adds that method to the new API, so that the COFF linker can get the right linker options when using the new LTO API. Reviewers: pcc, ruiu, mehdi_amini, tejohnson Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D29207 llvm-svn: 293950	2017-02-02 23:00:49 +00:00
Reid Kleckner	b9a68db4f5	[CodeGen] Remove dead call-or-prologue enum from CCState This enum has been dead since Olivier Stannard re-implemented ARM byval handling in r202985 (2014). llvm-svn: 293943	2017-02-02 21:58:22 +00:00
Xinliang David Li	0bca950cbd	[PGO] internal option cleanups 1. Added comments for options 2. Added missing option cl::desc field 3. Unified function filter option for graph viewing. Now PGO count/raw-counts share the same filter option: -view-bfi-func-name=. llvm-svn: 293938	2017-02-02 21:29:17 +00:00
Quentin Colombet	780b512e18	[LiveRangeEdit] Don't mess up with LiveInterval when a new vreg is created. In r283838, we added the capability of splitting unspillable registers. When doing so we had to make sure the split live-ranges were also unspillable and we did that by marking the related live-ranges in the delegate method that is called when a new vreg is created. However, by accessing the live-range there, we also triggered their lazy computation (LiveIntervalAnalysis::getInterval) which is not what we want in general. Indeed, later code in LiveRangeEdit is going to build the live-ranges, and this lazy computation may mess up that computation, resulting in assertion failures. Namely, the createEmptyIntervalFrom method expects that the live-range is going to be empty, not computed. Thanks to Mikael Holmén <mikael.holmen@ericsson.com> for noticing and reporting the problem. llvm-svn: 293934	2017-02-02 20:44:36 +00:00
Xinliang David Li	83423510b2	[PGO] make graph view internal options available for all builds Differential Revision: https://reviews.llvm.org/D29259 llvm-svn: 293921	2017-02-02 19:18:56 +00:00
Nirav Dave	63300d8c5e	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915	2017-02-02 18:24:55 +00:00
Amaury Sechet	7aea955fa2	Use N0 instead of N->getOperand(0) in DagCombiner::visitAdd. NFC llvm-svn: 293903	2017-02-02 16:07:44 +00:00
Nirav Dave	d4909b474b	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893	2017-02-02 14:39:42 +00:00
Matthias Braun	0d46e1f86e	RegisterCoalescer: Cleanup joinReservedPhysReg(); NFC - Factor out a common subexpression - Add some helpful comments - Fix printing of a register in a debug message llvm-svn: 293856	2017-02-02 02:23:27 +00:00
Paul Robinson	3a545364e4	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Replace it with a related assertion, ensuring that abstract variables appear only in abstract scopes. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29430 llvm-svn: 293841	2017-02-01 23:51:56 +00:00
Dehao Chen	efda70ec5a	Change debug-info-for-profiling from a TargetOption to a function attribute. Summary: LTO requires the debug-info-for-profiling to be a function attribute. Reviewers: echristo, mehdi_amini, dblaikie, probinson, aprantl Reviewed By: mehdi_amini, dblaikie, aprantl Subscribers: aprantl, probinson, ahatanak, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29203 llvm-svn: 293833	2017-02-01 22:45:09 +00:00
Paul Robinson	24fe9eabb6	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29310 llvm-svn: 293818	2017-02-01 21:54:50 +00:00
Sanjoy Das	f98a07f15c	[ImplicitNullCheck] Extend canReorder scope Summary: This change allows a re-order of two instructions if their uses are overlapped. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29120 llvm-svn: 293775	2017-02-01 16:04:21 +00:00

22082 Commits