llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Matthew Simpson	ca1b8b1ae6	[LV] Correct misleading comments in test (NFC) llvm-svn: 285402	2016-10-28 14:27:45 +00:00
Davide Italiano	aada583c07	[Reassociate] Removing instructions mutates the IR. Fixes PR 30784. Discussed with Justin, who pointed out that in the new PassManager infrastructure we can have more fine-grained control on which analyses we want to preserve, but this is the best we can do with the current infrastructure. llvm-svn: 285380	2016-10-28 02:47:09 +00:00
Davide Italiano	63658c3a0c	[ConstantFold] Get the correct vector type when folding a getelementptr. Differential Revision: https://reviews.llvm.org/D26014 llvm-svn: 285371	2016-10-28 00:53:16 +00:00
Davide Italiano	380df773bb	Remove accidentally commited test. llvm-svn: 285366	2016-10-27 23:40:19 +00:00
Davide Italiano	62de06c0e3	[IR] Reintroduce getGEPReturnType(), it will be used in a later patch. llvm-svn: 285365	2016-10-27 23:38:51 +00:00
Sanjay Patel	fc59e659b5	[InstCombine] fix foldSPFofSPF() to handle vector splats llvm-svn: 285345	2016-10-27 21:19:40 +00:00
Sanjay Patel	01ffcf5a12	[InstCombine] add vector tests for foldSPFofSPF to show missing folds llvm-svn: 285340	2016-10-27 20:51:03 +00:00
Sanjay Patel	0e5308d054	[InstCombine] auto-generate checks for min/max tests llvm-svn: 285336	2016-10-27 19:54:15 +00:00
Sanjay Patel	6b6d270c74	[InstCombine] handle simple vector integer constants in IsFreeToInvert llvm-svn: 285318	2016-10-27 17:30:50 +00:00
Dehao Chen	ecb41605f5	Add Loop Sink pass to reverse the LICM based of basic block frequency. Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency. Reviewers: hfinkel, davidxl, chandlerc Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22778 llvm-svn: 285308	2016-10-27 16:30:08 +00:00
Sanjay Patel	87ccc5a816	[ValueTracking] fix matchSelectPattern to allow vector splat folds of min/max/abs/nabs llvm-svn: 285303	2016-10-27 15:26:10 +00:00
Sanjay Patel	e121eccc6f	[InstCombine] add tests for missing folds of vector abs/nabs/min/max llvm-svn: 285299	2016-10-27 15:02:45 +00:00
Sanjay Patel	6941aa6e91	[InstCombine] auto-generate better checks; NFC llvm-svn: 285293	2016-10-27 13:55:37 +00:00
Alexey Bataev	b6c0260e86	[SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. After successfull horizontal reduction vectorization attempt for PHI node vectorizer tries to update root binary op by combining vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in new binary operation, causing "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286	2016-10-27 12:02:28 +00:00
Sanjoy Das	4802c96296	Simplify `x >=u x >> y` and `x >=u x udiv y` Summary: Extends InstSimplify to handle both `x >=u x >> y` and `x >=u x udiv y`. This is a folloup of rL258422 and https://github.com/rust-lang/rust/pull/30917 where llvm failed to optimize away the bounds checking in a binary search. Patch by Arthur Silva! Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25941 llvm-svn: 285228	2016-10-26 19:18:43 +00:00
Dehao Chen	291651da88	Introduce updateDiscriminator interface to DILocation to make it cleaner assigning discriminators. Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later. Reviewers: dnovillo, dblaikie, aprantl, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25959 llvm-svn: 285207	2016-10-26 15:48:45 +00:00
Sanjay Patel	8a25f27095	[InstCombine] consolidate zext tests and auto-generate checks; NFC llvm-svn: 285195	2016-10-26 14:08:49 +00:00
Sanjay Patel	6d8f326d58	[InstCombine] auto-generate better checks; NFC llvm-svn: 285194	2016-10-26 13:58:22 +00:00
Rong Xu	95093c2f32	[PGO] Fix select instruction annotation Summary: Select instruction annotation in IR PGO uses the edge count to infer the branch count. It's currently placed in setInstrumentedCounts() where no all the BB counts have been computed. This leads to wrong branch weights. Move the annotation after all BB counts are populated. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25961 llvm-svn: 285128	2016-10-25 21:47:24 +00:00
Guozhi Wei	14476b73f7	[InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996 The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996. The problem is with following code xB = load (type B); xA = load (type A); +yA = (A)xB; B -> A +zAn = PHI[yA, xA]; PHI +zBn = (B)zAn; // A -> B store zAn; store zBn; optimizeBitCastFromPhi generates +zBn = (B)zAn; // A -> B and expects it will be combined with the following store instruction to another store zAn Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely. optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions. Differential Revision: https://reviews.llvm.org/D23896 llvm-svn: 285116	2016-10-25 20:43:42 +00:00
Sanjay Patel	c4399434a9	[InstCombine] Ensure that truncated int types are legal. Fixes the FIXMEs in D25952 and rL285075. Patch by bryant! Differential Revision: https://reviews.llvm.org/D25955 llvm-svn: 285108	2016-10-25 20:11:47 +00:00
Matthew Simpson	47bcdf8c60	[LV] Sink scalar operands of predicated instructions When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097	2016-10-25 18:59:45 +00:00
Sanjay Patel	22ed496741	[InstCombine] add tests for missing icmp + shl nuw fold Patch by bryant! Differential Revision: https://reviews.llvm.org/D25952 llvm-svn: 285095	2016-10-25 18:47:56 +00:00
Michael Ilseman	eba5480140	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. Thanks to Adrian Prantl for stewarding this patch! llvm-svn: 285094	2016-10-25 18:44:13 +00:00
Geoff Berry	80e7303b39	[EarlyCSE] Make MemorySSA memory dependency check more aggressive. Now that MemorySSA keeps track of whether MemoryUses are optimized, use getClobberingMemoryAccess() to check MemoryUse memory dependencies since it should no longer be so expensive. This is a follow-up change to https://reviews.llvm.org/D25881 llvm-svn: 285080	2016-10-25 16:18:47 +00:00
Sanjay Patel	87ae2b5c21	[InstCombine] add test and code comment to show potentially misguided icmp trunc transform llvm-svn: 285075	2016-10-25 15:16:39 +00:00
Sanjay Patel	802f621594	[InstCombine] fix checks for previous commit (r285069) Accidentally put in the hoped-for checks ahead of the transform! llvm-svn: 285070	2016-10-25 13:30:19 +00:00
Sanjay Patel	24a71b8bd7	[InstCombine] add tests for bitcast interference with min/max (PR28001) llvm-svn: 285069	2016-10-25 13:27:56 +00:00
Sanjay Patel	73022fbc15	[InstCombine] auto-generate checks llvm-svn: 285046	2016-10-25 00:44:02 +00:00
Sanjay Patel	c4fe292ebd	[InstCombine] auto-generate checks llvm-svn: 285045	2016-10-25 00:41:00 +00:00
Sanjay Patel	f7b399052c	[InstCombine] regenerate some checks llvm-svn: 285036	2016-10-24 22:50:26 +00:00
Mandeep Singh Grang	54f45ece9e	[llvm] Remove redundant --check-prefix=CHECK from tests Reviewers: MatzeB, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D25894 llvm-svn: 285003	2016-10-24 18:57:55 +00:00
Adrian Prantl	1f1c90553c	add-discriminators: Fix handling of lexical scopes. This fixes a bug in the handling of lexical scopes, when more than one scope is defined on the same line or functions are inlined into call sites that are on the same line as the function definition. This situation can easily happen in macro expansions. The problem is solved by introducing a SmallDenseMap<DIScope , DILexicalBlockFile , 1> that keeps track of all the different lexical scopes that share a line/file location. Fixes PR30681. llvm-svn: 284998	2016-10-24 18:23:51 +00:00
Geoff Berry	4cf33cb292	[EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSA Summary: When using MemorySSA, re-optimize MemoryPhis when removing a store since this may create MemoryPhis with all identical arguments. Also, when using MemorySSA to check if two MemoryUses are reading from the same version of the heap, use the defining access instead of calling getClobberingAccess, since the latter can currently result in many more AA calls. Once the MemorySSA use optimization tracking changes are done, we can remove this limitation, which should result in more loads being CSE'd. Reviewers: dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25881 llvm-svn: 284984	2016-10-24 15:54:00 +00:00
Nico Weber	18b9f876a0	Revert 284971. It seems to break selfhost on some bots, see e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22 llvm-svn: 284979	2016-10-24 14:52:04 +00:00
Pablo Barrio	d45a0eaca9	[JumpThreading] Unfold selects that depend on the same condition Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop Subscribers: jojo, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D25477 llvm-svn: 284971	2016-10-24 13:04:45 +00:00
Anna Thomas	0e0d440281	[StripGCRelocates] New pass to remove gc.relocates added by RS4GC Summary: Utility pass to remove gc.relocates created by rewrite statepoints for GC. With respect to safepoint verification, the IR generated would be incorrect, and cannot run as such. This would be a single transformation on the final optimized IR. The benefit of the pass is for easy analysis when the IRs are 'polluted' by too many gc.relocates. Added tests. test run: All RS4GC tests with -verify option. Local downstream tests on large IR files. This also works when the pointer being gc.relocated is another gc.relocate. Reviewers: sanjoy, reames Subscribers: beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25096 llvm-svn: 284855	2016-10-21 18:43:16 +00:00
Artur Pilipenko	d79ec47437	[LVI] Fix a bug with a guard being the very first instruction in a BB not taken into account While looking for guards use reverse iterator and scan up to rend() not to begin() llvm-svn: 284827	2016-10-21 15:02:21 +00:00
John Brawn	c944a4af03	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Davide Italiano	d7b3d1566d	Revert "[GVN/PRE] Hoist global values outside of loops." There's no agreement about this patch. I personally find the PRE machinery of the current GVN hard enough to reason about that I'm not sure I'll try to land this again, instead of working on the rewrite). llvm-svn: 284796	2016-10-21 01:37:02 +00:00
Michael Kuperstein	ae8887d98b	[X86] Enable interleaved memory access by default This lets the loop vectorizer generate interleaved memory accesses on x86. Differential Revision: https://reviews.llvm.org/D25350 llvm-svn: 284779	2016-10-20 21:04:31 +00:00
Sanjay Patel	4b2cca44ae	[InstSimplify] fold negation of sign-bit 0 - X --> X, if X is 0 or the minimum signed value 0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW I noticed this pattern might be created in the backend after the change from D25485, so we'll want to add a similar fold for the DAG. The use of computeKnownBits in InstSimplify may be something to investigate if the compile time of InstSimplify is noticeable. We could replace computeKnownBits with specific pattern matchers or limit the recursion. Differential Revision: https://reviews.llvm.org/D25785 llvm-svn: 284649	2016-10-19 21:23:45 +00:00
Artur Pilipenko	51d1f548b9	[IndVarSimplify] Teach calculatePostIncRange to take guards into account Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25739 llvm-svn: 284632	2016-10-19 19:43:54 +00:00
Matthew Simpson	f64cb1c3a7	[LV] Avoid emitting trivially dead instructions Some instructions from the original loop, when vectorized, can become trivially dead. This happens because of the way we structure the new loop. For example, we create new induction variables and induction variable "steps" in the new loop. Thus, when we go to vectorize the original induction variable update, it may no longer be needed due to the instructions we've already created. This patch prevents us from creating these redundant instructions. This reduces code size before simplification and allows greater flexibility in code generation since we have fewer unnecessary instruction uses. Differential Revision: https://reviews.llvm.org/D25631 llvm-svn: 284631	2016-10-19 19:22:02 +00:00
Artur Pilipenko	2bac40ee1c	[IndVarSimplify] Use control-dependent range information to prove non-negativity This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example. for.body: %i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ] %within_limits = icmp ult i32 %i, 64 br i1 %within_limits, label %continue, label %for.end continue: %i.i64 = zext i32 %i to i64 %arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64 %val = load i32, i32* %arrayidx, align 4 br label %for.inc for.inc: %i.inc = add nsw nuw i32 %i, 1 %cmp = icmp slt i32 %i.inc, %limit br i1 %cmp, label %for.body, label %for.end There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D25738 llvm-svn: 284629	2016-10-19 18:59:03 +00:00
Sanjay Patel	054119b163	[InstSimplify] move one and add more tests for potential negation folds llvm-svn: 284627	2016-10-19 18:42:12 +00:00
Dehao Chen	48514ff998	Update the section.ll to fix non-x86 failure. llvm-svn: 284566	2016-10-19 03:53:41 +00:00
Dehao Chen	9da0fdce12	Revert r284545 again as the regression in ppc still exists. There is bug in MBPI exposed by th patch. Also update the section.ll to fix non-x86 failure. llvm-svn: 284563	2016-10-19 01:18:25 +00:00
Rong Xu	40080ca01c	Conditionally eliminate library calls where the result value is not used Summary: This pass shrink-wraps a condition to some library calls where the call result is not used. For example: sqrt(val); is transformed to if (val < 0) sqrt(val); Even if the result of library call is not being used, the compiler cannot safely delete the call because the function can set errno on error conditions. Note in many functions, the error condition solely depends on the incoming parameter. In this optimization, we can generate the condition can lead to the errno to shrink-wrap the call. Since the chances of hitting the error condition is low, the runtime call is effectively eliminated. These partially dead calls are usually results of C++ abstraction penalty exposed by inlining. This optimization hits 108 times in 19 C/C++ programs in SPEC2006. Reviewers: hfinkel, mehdi_amini, davidxl Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz Differential Revision: https://reviews.llvm.org/D24414 llvm-svn: 284542	2016-10-18 21:36:27 +00:00
Dehao Chen	b90f841254	Add target for test to fix regression introduced by r284533. llvm-svn: 284538	2016-10-18 21:13:31 +00:00
Dehao Chen	fdbd269422	Use profile info to set function section prefix to group hot/cold functions. Summary: The original implementation is in r261607, which was reverted in r269726 to accomendate the ProfileSummaryInfo analysis pass. The new implementation: 1. add a new metadata for function section prefix 2. query against ProfileSummaryInfo in CGP to set the correct section prefix for each function 3. output the section prefix set by CGP Reviewers: davidxl, eraman Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24989 llvm-svn: 284533	2016-10-18 20:42:47 +00:00
Simon Pilgrim	293c1a3deb	[X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32 As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types. This patch adds support for fptosi to 2i32 from both 2f64 and 2f32. It also recognises that cvttpd2dq zeroes the upper 64-bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch. Differential Revision: https://reviews.llvm.org/D23808 llvm-svn: 284459	2016-10-18 07:42:15 +00:00
Davide Italiano	d7b0922d20	[opt] Strip coverage if debug info is not present. If -coverage is passed, but -g is not, clang populates the PassManager pipeline with StripSymbols(debugOnly = true). The stripSymbol pass therefore scans the list of named metadata, drops !llvm.dbg.cu, but leaves !llvm.gcov and !0 (the compileUnit MD) around. The verifier runs, and finds out that there's a CU not listed in !llvm.dbg.cu (as it was previously dropped) -> crash. When we strip debug info, so, check if there's coverage data, and strip it as well, in order to avoid pending metadata left around. Differential Revision: https://reviews.llvm.org/D25689 llvm-svn: 284418	2016-10-17 20:05:35 +00:00
Dehao Chen	fbd5229bfe	Ignore debug info when making optimization decisions in SimplifyCFG. Summary: Debug info should not affect code generation. This patch properly handles debug info to make sure the generated code are the same with or without debug info. Reviewers: davidxl, mzolotukhin, jmolloy Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D25286 llvm-svn: 284415	2016-10-17 19:28:44 +00:00
Oliver Stannard	7e6c6a8a5a	[SimplifyCFG] Don't lower complex ConstantExprs to lookup tables Not all ConstantExprs can be represented by a global variable, for example most pointer arithmetic other than addition of a constant, so we can't convert these values from switch statements to lookup tables. Differential Revision: https://reviews.llvm.org/D25550 llvm-svn: 284379	2016-10-17 12:00:24 +00:00
Davide Italiano	d7c0f18c12	[GVN/PRE] Hoist global values outside of loops. In theory this could be generalized to move anything where we prove the operands are available, but that would require rewriting PRE. As NewGVN will hopefully come soon, and we're trying to rewrite PRE in terms of NewGVN+MemorySSA, it's probably not worth spending too much time on it. Fix provided by Daniel Berlin! llvm-svn: 284311	2016-10-15 21:35:23 +00:00
David L Kreitzer	1233356719	Add a pass to optimize patterns of vectorized interleaved memory accesses for X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260	2016-10-14 18:20:41 +00:00
Tom Stellard	19c9255423	AMDGPU/SI: Don't allow unaligned scratch access Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257	2016-10-14 18:10:39 +00:00
David L Kreitzer	d68b19752a	[safestack] Use non-thread-local unsafe stack pointer for Contiki OS Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D19852 llvm-svn: 284254	2016-10-14 17:56:00 +00:00
Sanjay Patel	4538ac9f97	[InstCombine] use m_APInt to allow sub with constant folds for splat vectors llvm-svn: 284247	2016-10-14 16:31:54 +00:00
Sanjay Patel	037f524f36	[InstCombine] add tests for missing vector folds llvm-svn: 284245	2016-10-14 15:55:34 +00:00
Sanjay Patel	8c7be2e26d	[InstCombine] auto-generate checks llvm-svn: 284244	2016-10-14 15:41:25 +00:00
Sanjay Patel	b1fc9d9f65	[InstCombine] remove redundant test This test was apparently checking for 2 independent folds, but we have plenty of tests for those individual folds already. We are lacking vector tests, however, because we don't have the shift folds for vectors. llvm-svn: 284243	2016-10-14 15:36:28 +00:00
Sanjay Patel	e5bc6982b9	[InstCombine] update test to use FileCheck and auto-generate checks llvm-svn: 284242	2016-10-14 15:30:31 +00:00
Sanjay Patel	8024bbbb08	[InstCombine] sub X, sext(bool Y) -> add X, zext(bool Y) Prefer add/zext because they are better supported in terms of value-tracking. Note that the backend should be prepared for this IR canonicalization (including vector types) after: https://reviews.llvm.org/rL284015 Differential Revision: https://reviews.llvm.org/D25135 llvm-svn: 284241	2016-10-14 15:24:31 +00:00
David L Kreitzer	eaea365433	[safestack] Move X86-targeted tests into the X86 subdirectory. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D25340 llvm-svn: 284139	2016-10-13 17:51:59 +00:00
Matthew Simpson	fc6d2c527b	[LV] Account for predicated stores in instruction costs This patch ensures that we scale the estimated cost of predicated stores by block probability. This is a follow-on patch for r284123. llvm-svn: 284126	2016-10-13 14:54:31 +00:00
Matthew Simpson	c7b4269af0	[LV] Avoid rounding errors for predicated instruction costs This patch modifies the cost calculation of predicated instructions (div and rem) to avoid the accumulation of rounding errors due to multiple truncating integer divisions. The calculation for predicated stores will be addressed in a follow-on patch since we currently don't scale the cost of predicated stores by block probability. Differential Revision: https://reviews.llvm.org/D25333 llvm-svn: 284123	2016-10-13 14:19:48 +00:00
Sebastian Pop	6c08f40f51	commit back "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This is with an extra change to avoid calling MemoryLocation::get() on a call instruction. Differential Revision: https://reviews.llvm.org/D25542 llvm-svn: 284098	2016-10-13 01:39:10 +00:00
Reid Kleckner	95149fb393	Revert "GVN-hoist: fix store past load dependence analysis (PR30216, PR30499)" This CL didn't actually address the test case in PR30499, and clang still crashes. Also revert dependent change "Memory-SSA cleanup of clobbers interface, NFC" Reverts r283965 and r283967. llvm-svn: 284093	2016-10-13 00:18:26 +00:00
Haicheng Wu	5b13afc1d2	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053	2016-10-12 21:29:38 +00:00
Haicheng Wu	9079316128	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop" This reverts commit r284044. llvm-svn: 284051	2016-10-12 21:02:22 +00:00
Krzysztof Parzyszek	3ebda330a6	Fix testcases failing after r284036 The codegen has changed slightly between my tests and the commit. llvm-svn: 284049	2016-10-12 20:39:33 +00:00
Haicheng Wu	3e43a84017	[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044	2016-10-12 20:24:32 +00:00
Krzysztof Parzyszek	9741038bd4	Do not remove implicit defs in BranchFolder Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036	2016-10-12 19:50:57 +00:00
Sanjoy Das	54e3386b64	[SimplifyCFG] Don't create PHI nodes for constant bundle operands Summary: Constant bundle operands may need to retain their constant-ness for correctness. I'll admit that this is slightly odd, but it looks like SimplifyCFG already does this for things like @llvm.frameaddress and @llvm.stackmap, so I suppose adding one more case is not a big deal. It is possible to add a mechanism to denote bundle operands that need to remain constants, but that's probably too complicated for the time being. Reviewers: jmolloy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25502 llvm-svn: 284028	2016-10-12 18:15:33 +00:00
Artur Pilipenko	6245b73221	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Since this change is known to cause performance degradations in some cases it's commited under a temporary flag which is turned off by default. Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 284022	2016-10-12 16:18:43 +00:00
Chad Rosier	3984bf527a	[CVP] Convert an AShr to a LShr if 1st operand is known to be nonnegative. An arithmetic shift can be safely changed to a logical shift if the first operand is known positive. This allows ComputeKnownBits (and similar analysis) to determine the sign bit of the shifted value in some cases. In turn, this allows InstCombine to canonicalize a signed comparison (a > 0) into an equality check (a != 0). PR30577 Differential Revision: https://reviews.llvm.org/D25119 llvm-svn: 284013	2016-10-12 13:41:38 +00:00
Simon Pilgrim	72f99d6988	[InstCombine] Fix constexpr issue in select combining As discussed by Andrea on PR30486, we have an unsafe cast to an Instruction type in the select combine which doesn't take into account that it could be a ConstantExpr instead. Differential Revision: https://reviews.llvm.org/D25466 llvm-svn: 284000	2016-10-12 10:20:15 +00:00
Sebastian Pop	fdf5952343	GVN-hoist: fix store past load dependence analysis (PR30216, PR30499) This is a refreshed version of a patch that was reverted: it fixes the problems reported in both PR30216 and PR30499, and contains all the test-cases from both bugs. To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Tested on x86_64-linux with check and a test-suite run. Differential Revision: https://reviews.llvm.org/D25476 llvm-svn: 283965	2016-10-12 02:23:39 +00:00
David Majnemer	e00f20b94a	[InstCombine] Transform !range metadata to !nonnull when combining loads When combining an integer load with !range metadata that does not include 0 to a pointer load, make sure emit !nonnull metadata on the newly-created pointer load. This prevents the !nonnull metadata from being dropped during a ptrtoint/inttoptr pair. This fixes PR30597. Patch by Ariel Ben-Yehuda! Differential Revision: https://reviews.llvm.org/D25215 llvm-svn: 283836	2016-10-11 01:00:45 +00:00
Adrian Prantl	95da712509	Teach llvm::StripDebugInfo() about global variable !dbg attachments. This is a regression introduced by the global variable ownership reversal performed in r281284. rdar://problem/28448075 llvm-svn: 283784	2016-10-10 17:53:33 +00:00
Simon Pilgrim	4df8e37155	[SLPVectorizer][X86] Add 512-bit sitofp/uitofp tests llvm-svn: 283756	2016-10-10 14:28:06 +00:00
Simon Pilgrim	1dbb87da59	[SLPVectorizer][X86] Add avx512 sitofp/uitofp tests llvm-svn: 283751	2016-10-10 14:14:31 +00:00
Simon Pilgrim	8047ad9428	[SLPVectorizer][X86] Fixed alignments of scalar loads in sitofp/uitofp tests Fixed copy+paste vector alignment to correct for per-element scalar loads Increased to 512-bit data sizes in preparation of avx512 tests llvm-svn: 283748	2016-10-10 14:10:41 +00:00
Adam Nemet	bedb129f6f	[OptRemarks] Remove non-printable chars from function name Value names may be prefixed with a binary '1' to indicate that the backend should not modify the symbols due to any platform naming convention. This should not show up in the YAML opt record file because it breaks the YAML parser. llvm-svn: 283656	2016-10-08 04:47:20 +00:00
Gor Nishanov	680815e118	[coroutines] Store an address of destroy OR cleanup part in the coroutine frame. Summary: If heap allocation of a coroutine is elided, we need to make sure that we will update an address stored in the coroutine frame from f.destroy to f.cleanup. Before this change, CoroSplit synthesized these stores after coro.begin: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr store void (%f.Frame) @f.destroy, void (%f.Frame)* %destroy.addr ``` In those cases where we did heap elision, but were not able to devirtualize all indirect calls, destroy call will attempt to "free" the coroutine frame stored on the stack. Oops. Now we use select to put an appropriate coroutine subfunction in the destroy slot. As bellow: ``` store void (%f.Frame) @f.resume, void (%f.Frame)* %resume.addr %0 = select i1 %need.alloc, void (%f.Frame) @f.destroy, void (%f.Frame) @f.cleanup store void (%f.Frame) %0, void (%f.Frame)* %destroy.addr ``` Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25377 llvm-svn: 283625	2016-10-08 00:22:50 +00:00
Davide Italiano	99e8d201a0	[InstCombine] Don't unpack arrays that are too large (part 2). This is similar to r283599, but for store instructions. Thanks to David for pointing out! llvm-svn: 283612	2016-10-07 21:53:09 +00:00
Davide Italiano	c6c98e0346	[InstCombine] Don't unpack arrays that are too large Differential Revision: https://reviews.llvm.org/D25376 llvm-svn: 283599	2016-10-07 20:57:42 +00:00
Anna Thomas	99e41a98d7	[RS4GC] Strengthen coverage: add more tests Summary: Add tests for cases where we have zero coverage in RS4GC. Reviewers: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25341 llvm-svn: 283591	2016-10-07 20:34:00 +00:00
Sanjay Patel	4b9d19b1ec	[InstCombine] fold select X, (ext X), C If we're going to canonicalize IR towards select of constants, try harder to create those. Also, don't lose the metadata. This is actually 4 related transforms in one patch: // select X, (sext X), C --> select X, -1, C // select X, (zext X), C --> select X, 1, C // select X, C, (sext X) --> select X, C, 0 // select X, C, (zext X) --> select X, C, 0 Differential Revision: https://reviews.llvm.org/D25126 llvm-svn: 283575	2016-10-07 17:53:07 +00:00
Matthew Simpson	c4d75790e7	[LV] Don't mark multi-use branch conditions uniform Previously, we marked the branch conditions of latch blocks uniform after vectorization if they were instructions contained in the loop. However, if a condition instruction has users other than the branch, it may not remain uniform. This patch ensures the conditions we mark uniform are only used by the branch. This should fix PR30627. Reference: https://llvm.org/bugs/show_bug.cgi?id=30627 llvm-svn: 283563	2016-10-07 15:20:13 +00:00
Tom Stellard	6fa6d1d712	[ValueTracking] Fix crash in GetPointerBaseWithConstantOffset() Summary: While walking defs of pointer operands we were assuming that the pointer size would remain constant. This is not true, because addresspacecast instructions may cast the pointer to an address space with a different pointer width. This partial reverts r282612, which was a more conservative solution to this problem. Reviewers: reames, sanjoy, apilipenko Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24772 llvm-svn: 283557	2016-10-07 14:23:29 +00:00
Alexey Bataev	0a23402e2a	[SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. The next code is not vectorized by the SLPVectorizer: ``` int test(unsigned int *p) { int sum = 0; for (int i = 0; i < 8; i++) sum += p[i]; return sum; } ``` During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem. Differential Revision: https://reviews.llvm.org/D24796 llvm-svn: 283535	2016-10-07 09:39:22 +00:00
Oliver Stannard	7be9f2d236	[ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI With the ROPI and RWPI relocation models we can't always have pointers to global data or functions in constant data, so don't try to convert switches into lookup tables if any value in the lookup table would require a relocation. We can still safely emit lookup tables of other values, such as simple constants. Differential Revision: https://reviews.llvm.org/D24462 llvm-svn: 283530	2016-10-07 08:48:24 +00:00
David Majnemer	7890f416e2	[SimplifyCFG] Correctly test for unconditional branches in GetCaseResults GetCaseResults assumed that a terminator with one successor was an unconditional branch. This is not necessarily the case, it could be a cleanupret. Strengthen the check by querying whether or not the terminator is exceptional. llvm-svn: 283517	2016-10-07 01:38:35 +00:00
Michael Kuperstein	e1ab454f4d	[LV] Remove triples from target-independent vectorizer tests. NFC. Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512	2016-10-06 23:57:25 +00:00
Rong Xu	ebf7db8774	[PGO] Create weak alias for the renamed Comdat function Add a weak alias to the renamed Comdat function in IR level instrumentation, using it's original name. This ensures the same behavior w/ and w/o IR instrumentation, even for non standard conforming code. Differential Revision: http://reviews.llvm.org/D25339 llvm-svn: 283490	2016-10-06 20:38:13 +00:00
Michael Ilseman	65a382e8fc	Revert "Add -strip-nonlinetable-debuginfo capability" This reverts commit r283473. Reverted until review is completed. llvm-svn: 283478	2016-10-06 18:30:26 +00:00
Michael Ilseman	ae58e6368d	Add -strip-nonlinetable-debuginfo capability This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. llvm-svn: 283473	2016-10-06 17:58:38 +00:00
Hal Finkel	95d4e72334	Don't filter diagnostics written as YAML to the output file The purpose of the YAML diagnostic output file is to collect information on optimizations performed, or not performed, for later processing by tools that help users (and compiler developers) understand how code was optimized. As such, the diagnostics that appear in the file should not be coupled to what a user might want to see summarized for them as the compiler runs, and in fact, because the user likely does not know what optimization diagnostics their tools might want to use, the user cannot provide a useful filter regardless. As such, we shouldn't filter the diagnostics going to the output file. Differential Revision: https://reviews.llvm.org/D25224 llvm-svn: 283236	2016-10-04 18:13:45 +00:00
Adam Nemet	9b5496d078	Serialize remark argument as a mapping to get proper quotation for the value. llvm-svn: 283231	2016-10-04 17:05:04 +00:00
Alexey Bataev	c9e7a8f58f	[SLPVectorizer] Add a test with non-vectorizable IR. llvm-svn: 283225	2016-10-04 15:07:23 +00:00
Anna Thomas	ded4f59371	[RS4GC] Handle ShuffleVector instruction in findBasePointer Summary: This patch modifies the findBasePointer to handle the shufflevector instruction. Tests run: RS4GC tests, local downstream tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25197 llvm-svn: 283219	2016-10-04 13:48:37 +00:00
Sanjoy Das	b4ec359753	[PruneEH] Be correct in the face IPO This fixes one spot I had missed in r265762. Credit goes to Philip Reames for spotting this one! llvm-svn: 283137	2016-10-03 19:35:30 +00:00
Hans Wennborg	c250a100e2	Jump threading: avoid trying to split edge into landingpad block (PR27840) Splitting the edge is nontrivial because of the landing pad, and we would currently assert trying to do it. Differential Revision: https://reviews.llvm.org/D24680 llvm-svn: 283129	2016-10-03 18:18:04 +00:00
Sanjay Patel	2e8c69fd7b	[x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119	2016-10-03 16:38:27 +00:00
Simon Pilgrim	15c3bf66e9	[SLPVectorizer][X86] Added fptosi/fptoui tests llvm-svn: 283048	2016-10-01 19:35:59 +00:00
Simon Pilgrim	b14252ccf4	[SLPVectorizer][X86] Added fcopysign tests llvm-svn: 283046	2016-10-01 17:00:26 +00:00
Simon Pilgrim	04d9993d37	[SLPVectorizer][X86] Added fabs tests llvm-svn: 283045	2016-10-01 16:54:01 +00:00
Sanjay Patel	6f740e7b76	[InstCombine] allow non-splat folds of select cond (ext X), C llvm-svn: 282906	2016-09-30 19:49:22 +00:00
Dehao Chen	9a6d19a0c6	Revert test change in r282894 as it's broken in some platforms. llvm-svn: 282903	2016-09-30 19:25:23 +00:00
Gor Nishanov	f845a3821b	[Coroutines] Part15c: Fix coro-split to correctly handle definitions between coro.save and coro.suspend Summary: In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame. ``` %save = call token @llvm.coro.save(i8* null) %Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) switch i8 %suspend, label %exit [ i8 0, label %await.ready i8 1, label %exit ] await.ready: %val = load i32, i32* %Result.i19 ``` Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24418 llvm-svn: 282902	2016-09-30 19:24:19 +00:00
Sanjay Patel	373ee08f45	[InstCombine] add tests for non-splat select(ext) llvm-svn: 282901	2016-09-30 19:15:41 +00:00
Gor Nishanov	04e0faa3b3	[Coroutines] Part15b: Fix dbg information handling in coro-split. Summary: Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /ModuleLevelChanges=/true, Returns); would duplicate all of the debug information including the DICompileUnit. We know use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we creating clones that will become f.resume, f.destroy and f.cleanup. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24417 llvm-svn: 282899	2016-09-30 19:05:06 +00:00
Gor Nishanov	9f360633cd	[Coroutines] Part 15a: Lower coro.subfn.addr in CoroCleanup Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup. Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24412 llvm-svn: 282897	2016-09-30 18:41:35 +00:00
Sanjay Patel	3cb99ee21f	clean up tests and auto-generate checks llvm-svn: 282896	2016-09-30 18:37:34 +00:00
Dehao Chen	766ad13bec	Update loop unroller cost model to make sure debug info does not affect optimization decisions. Summary: Debug info should not affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894	2016-09-30 18:30:04 +00:00
Sanjay Patel	6e0052ca56	[InstCombine] add tests for select X, (ext X), C llvm-svn: 282891	2016-09-30 18:10:14 +00:00
Matthew Simpson	f6e1d8dedb	[LV] Build all scalar steps for non-uniform induction variables When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863	2016-09-30 15:13:52 +00:00
Piotr Padlewski	98cb07e1d2	[thinlto] Don't decay threshold for hot callsites Summary: We don't want to decay hot callsites to import chains of hot callsites. The same mechanism is used in LIPO. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24976 llvm-svn: 282833	2016-09-30 03:01:17 +00:00
Piotr Padlewski	85b64e4eaf	[thinlto] Add cold-callsite import heuristic Summary: Not tunned up heuristic, but with this small heuristic there is about +0.10% improvement on SPEC 2006 Reviewers: tejohnson, mehdi_amini, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24940 llvm-svn: 282733	2016-09-29 17:32:07 +00:00
Evgeny Stupachenko	837a069ffd	Wisely choose sext or zext when widening IV. Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650	2016-09-28 23:39:39 +00:00
Sanjay Patel	a78562d801	[InstCombine] update to use FileCheck Also, remove unnecessary function attributes, parameters, and comments. It looks like at least some of these tests are not minimal though... llvm-svn: 282620	2016-09-28 19:10:16 +00:00
Artur Pilipenko	e07af89244	Don't look through addrspacecast in GetPointerBaseWithConstantOffset Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612	2016-09-28 17:57:16 +00:00
Sanjay Patel	51c21d223b	[InstSimplify] allow or-of-icmps folds with vector splat constants llvm-svn: 282592	2016-09-28 14:27:21 +00:00
Sanjay Patel	b28b4a20f7	[InstSimplify] add vector splat tests for or-of-icmps llvm-svn: 282591	2016-09-28 14:17:35 +00:00
Sanjay Patel	aac597925f	[InstSimplify] allow and-of-icmps folds with vector splat constants llvm-svn: 282590	2016-09-28 13:53:13 +00:00
Adam Nemet	ec2292c80c	[Inliner] Port all opt remarks to new streaming API llvm-svn: 282559	2016-09-27 23:47:03 +00:00
Adam Nemet	c99191ee9f	Pass -S to opt in this test to avoid printing binary on mismatch The purpose of the test is to verify diagnostics. llvm-svn: 282558	2016-09-27 23:46:59 +00:00
Adam Nemet	e23cf79174	[Inliner] Fold the analysis remark into the missed remark There is really no reason for these to be separate. The vectorizer started this pretty bad tradition that the text of the missed remarks is pretty meaningless, i.e. vectorization failed. There, you have to query analysis to get the full picture. I think we should just explain the reason for missing the optimization in the missed remark when possible. Analysis remarks should provide information that the pass gathers regardless whether the optimization is passing or not. llvm-svn: 282542	2016-09-27 21:58:17 +00:00
Michael Zolotukhin	38f796095c	[LoopSimplify] When simplifying phis in loop-simplify, do it only if it preserves LCSSA form. llvm-svn: 282541	2016-09-27 21:03:45 +00:00
Adam Nemet	f602aa8cdd	Output optimization remarks in YAML (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539	2016-09-27 20:55:07 +00:00
Adam Nemet	5058aadaf2	Revert "Output optimization remarks in YAML" This reverts commit r282499. The GCC bots are failing llvm-svn: 282503	2016-09-27 16:39:24 +00:00
Adam Nemet	b1d6f940c4	Output optimization remarks in YAML This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282499	2016-09-27 16:15:16 +00:00
Ivan Krasin	920c8236df	Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation Summary: We don't currently need this facility for CFI. Disabling individual hot methods proved to be a better strategy in Chrome. Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne. Reviewers: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D24948 llvm-svn: 282461	2016-09-27 00:29:53 +00:00
Piotr Padlewski	3152e057e1	[thinlto] Basic thinlto fdo heuristic Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437	2016-09-26 20:37:32 +00:00
Daniel Berlin	1c6e9ba785	Remove pruning of phi nodes in MemorySSA - it makes updating harder Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24923 llvm-svn: 282419	2016-09-26 17:22:54 +00:00
Matthew Simpson	894b941fee	[LV] Scalarize instructions marked scalar after vectorization This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418	2016-09-26 17:08:37 +00:00
Gor Nishanov	5e19e9e23d	[Coroutines] Part14: Handle coroutines with no suspend points. Summary: If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function. Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24408 llvm-svn: 282414	2016-09-26 15:49:28 +00:00
Alexey Bataev	a554bff930	[InstCombine] Fixed bug introduced in r282237 The index of the new insertelement instruction was evaluated in the wrong way, it was considered as the index of the inserted value instead of index of the position, where the value should be inserted. llvm-svn: 282401	2016-09-26 13:18:59 +00:00
Andrea Di Biagio	c32aed95bc	[InstCombine] Teach the udiv folding logic how to handle constant expressions. This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398	2016-09-26 12:07:23 +00:00
Sanjay Patel	6d495bce9c	[TLI] isdigit / isascii / toascii param type should match return type (PR30484) We crash in LibCallSimplifier if we don't check the validity of the function signature properly. llvm-svn: 282278	2016-09-23 18:44:09 +00:00
Alexey Bataev	5e200a4d72	[InstCombine] Fix for PR29124: reduce insertelements to shufflevector If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 llvm-svn: 282237	2016-09-23 09:14:08 +00:00
Sanjay Patel	3fb981ce63	[InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672) We already have the udiv variant of this transform, so I think this is ok for InstCombine too even though there is an increase in IR instructions. As the tests and TODO comments show, the transform can lead to follow-on combines. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672 Differential Revision: https://reviews.llvm.org/D24527 llvm-svn: 282209	2016-09-22 22:36:26 +00:00
Hans Wennborg	397edc9070	Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)" and also the dependent r282175 "GVN-hoist: do not dereference null pointers" It's causing compiler crashes building Harfbuzz (PR30499). llvm-svn: 282199	2016-09-22 21:20:53 +00:00
Sebastian Pop	0a87104c7e	GVN-hoist: fix store past load dependence analysis (PR30216) To hoist stores past loads, we used to search for potential conflicting loads on the hoisting path by following a MemorySSA def-def link from the store to be hoisted to the previous defining memory access, and from there we followed the def-use chains to all the uses that occur on the hoisting path. The problem is that the def-def link may point to a store that does not alias with the store to be hoisted, and so the loads that are walked may not alias with the store to be hoisted, and even as in the testcase of PR30216, the loads that may alias with the store to be hoisted are not visited. The current patch visits all loads on the path from the store to be hoisted to the hoisting position and uses the alias analysis to ask whether the store may alias the load. I was not able to use the MemorySSA functionality to ask for whether load and store are clobbered: I'm not sure which function to call, so I used a call to AA->isNoAlias(). Store past store is still working as before using a MemorySSA query: I added an extra test to pr30216.ll to make sure store past store does not regress. Differential Revision: https://reviews.llvm.org/D24517 llvm-svn: 282168	2016-09-22 15:33:51 +00:00
Sebastian Pop	1b14464087	GVN-hoist: move hoist testcase to GVNHoist dir llvm-svn: 282161	2016-09-22 14:45:46 +00:00
Sebastian Pop	628c0b289f	GVN-hoist: only hoist relevant scalar instructions Without this patch, GVN-hoist would think that a branch instruction is a scalar instruction and would try to value number it. The patch filters out all such kind of irrelevant instructions. A bit frustrating is that there is no easy way to discard all those very infrequent instructions, a bit like isa<TerminatorInst> that stands for a large family of instructions. I'm thinking that checking for those very infrequent other instructions would cost us more in compilation time than just letting those instructions getting numbered, so I'm still thinking that a simpler check: if (isa<TerminatorInst>(I)) return false; is better than listing all the other less frequent instructions. Differential Revision: https://reviews.llvm.org/D23929 llvm-svn: 282160	2016-09-22 14:45:40 +00:00
Anna Thomas	16cf546b7c	[RS4GC] Remat in presence of phi and use live value Summary: Reviewers: Subscribers: llvm-svn: 282150	2016-09-22 13:13:06 +00:00

1 2 3 4 5 ...

7836 Commits