llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Craig Topper	d5cf04da1c	[InstCombine] Support ~(c-X) --> X+(-c-1) and ~(X-c) --> (-c-1)-X for splat vectors. llvm-svn: 310195	2017-08-06 06:28:41 +00:00
Craig Topper	f043cfbb9a	[InstCombine] Fold (C - X) ^ signmask -> (C + signmask - X). llvm-svn: 310186	2017-08-05 20:00:44 +00:00
Craig Topper	c19e9b9071	[InstCombine] Teach the code that pulls logical operators through constant shifts to handle vector splats too. llvm-svn: 310185	2017-08-05 20:00:42 +00:00
Craig Topper	eb05f8fcb3	[InstCombine] Support vector splats in foldSelectICmpAnd. Unfortunately, it looks like there's some other missed optimizations in the generated code for some of these cases. I'll try to look at some of those next. llvm-svn: 310184	2017-08-05 20:00:41 +00:00
Dinar Temirbulatov	a7733cfe8c	[SLPVectorizer] Add extra parameter to setInsertPointAfterBundle to handle different opcodes, NFCI. Differential Revision: https://reviews.llvm.org/D35769 llvm-svn: 310183	2017-08-05 18:43:52 +00:00
Sanjay Patel	2eb16c0389	[InstCombine] refactor trunc(binop) transforms; NFCI In addition to moving the shift transforms over, we may want to detect too-wide rotate patterns here (PR34046). llvm-svn: 310181	2017-08-05 15:19:18 +00:00
Craig Topper	0f832f8e82	[InstCombine] In foldSelectICmpAnd, if we need to truncate from the 'and' type to the 'select' type, do it after shifting right instead of just bailing. Previously we were always trying to emit the zext or truncate before any shift. This meant if the 'and' mask was larger than the size of the truncate we would skip the transformation. Now we shift the result of the 'and' right first, leaving the bit within the range of the truncate. This matches what we are doing in foldSelectICmpAndOr for the same problem. llvm-svn: 310159	2017-08-05 01:45:17 +00:00
Sanjay Patel	23bd4a9364	[InstCombine] narrow truncated add/sub/mul with constant Name: narrow_sub %sub = sub i32 C1, %x %r = trunc i32 %sub to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = sub i8 %narrowC, %xn Name: narrow_add %add = add i32 %x, C1 %r = trunc i32 %add to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = add i8 %xn, %narrowC Name: narrow_mul %mul = mul i32 %x, C1 %r = trunc i32 %mul to i8 => %xn = trunc i32 %x to i8 %narrowC = trunc i32 C1 to i8 %r = mul i8 %xn, %narrowC http://rise4fun.com/Alive/QpS This doesn't solve PR34046 (failure to recognize rotate): https://bugs.llvm.org/show_bug.cgi?id=34046 ...but it reduces an extra complication in the description examples to a form that we can more easily match. llvm-svn: 310141	2017-08-04 22:30:34 +00:00
Nico Weber	1ee42ff794	Revert r310055, it caused PR34074. llvm-svn: 310123	2017-08-04 20:40:38 +00:00
Evgeny Stupachenko	2be9c5c55b	Fix PR33514 Summary: The bug was uncovered after the fix for PR23384 (part 3 of 3). The patch restricts pointer multiplication in SCEV computation for ICmpZero. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D36170 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310092	2017-08-04 18:46:13 +00:00
Reid Kleckner	e29583d0d2	[ArgPromotion] Preserve alignment of byval argument in new alloca The frontend may have requested a higher alignment for any reason, and downstream optimizations may already have taken advantage of it. We should keep the same alignment when moving the allocation from the parameter area to the local variable area. Fixes PR34038 llvm-svn: 310071	2017-08-04 17:09:11 +00:00
Benjamin Kramer	423f2d0b11	[InstCombine] Fold single-use variable into assert. Avoids unused variable warnings in Release builds. No functional change. llvm-svn: 310064	2017-08-04 16:08:41 +00:00
Craig Topper	bcfbcd1b18	[InstCombine] Remove the (not (sext)) case from foldBoolSextMaskToSelect and inline the remaining code to match visitOr Summary: The (not (sext)) case is really (xor (sext), -1) which should have been simplified to (sext (xor, 1)) before we got here. So we shouldn't need to handle it. With that taken care of we only need two cases, so we don't need the swap anymore. This makes us in sync with the equivalent code in visitOr so inline this to match. Reviewers: spatel, eli.friedman, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36240 llvm-svn: 310063	2017-08-04 16:07:20 +00:00
Craig Topper	fe1a0c7e67	[InstCombine] Use ConstantInt::getFalse to reduce some code. NFC llvm-svn: 310062	2017-08-04 16:07:18 +00:00
Sanjay Patel	0e224c7256	[InstCombine] narrow lshr with constant Name: narrow_shift Pre: C1 < 8 %zx = zext i8 %x to i32 %l = lshr i32 %zx, C1 => %narrowC = trunc i32 C1 to i8 %ns = lshr i8 %x, %narrowC %l = zext i8 %ns to i32 http://rise4fun.com/Alive/jIV This isn't directly applicable to PR34046 as written, but we need to have more narrowing folds like this to be sure that rotate patterns are recognized. llvm-svn: 310060	2017-08-04 15:42:47 +00:00
Filipe Cabecinhas	4058d9a4ca	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to. Summary: This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 310055	2017-08-04 12:28:36 +00:00
Nikolai Bozhenov	899aec6301	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNs, and the former NAN/NUM nodes are illegal, so selection does not happen. So I decided to do this kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 310054	2017-08-04 12:22:17 +00:00
Max Kazantsev	9209a4521a	Do not declare a variable which is used only in assert. NFC llvm-svn: 310034	2017-08-04 07:41:24 +00:00
Max Kazantsev	ffa0782bec	[IRCE] Handle loops with step different from 1/-1 This patch generalizes IRCE to handle IV steps that are not equal to 1 or -1. Differential Revision: https://reviews.llvm.org/D35539 llvm-svn: 310032	2017-08-04 07:01:04 +00:00
Max Kazantsev	1aed4a4dfc	[IRCE] Recognize loops with unsigned latch conditions This patch enables recognition of loops with ult/ugt latch conditions. Differential Revision: https://reviews.llvm.org/D35302 llvm-svn: 310027	2017-08-04 05:40:20 +00:00
Craig Topper	14179d8998	[InstCombine] Move the call to foldSelectICmpAnd into foldSelectInstWithICmp. NFCI llvm-svn: 310025	2017-08-04 05:12:37 +00:00
Craig Topper	0f297d09b1	[InstCombine] Remove unnecessary casts. NFC We're calling an overload of getOpcode that already returns Instruction::CastOps. llvm-svn: 310024	2017-08-04 05:12:35 +00:00
Victor Leschuk	eabd98601c	Un-revert r310014: false revert, it wasn't the cause of build break llvm-svn: 310021	2017-08-04 04:51:15 +00:00
Victor Leschuk	fe0e5c87b4	Revert r310014 as it breaks build lld-x86_64-darwin13 llvm-svn: 310020	2017-08-04 04:43:54 +00:00
Adrian Prantl	d3acfe5504	Teach GlobalSRA to update the debug info for split-up globals. This is similar to what we are doing in "regular" SROA and creates DW_OP_LLVM_fragment operations to describe the resulting variables. rdar://problem/33654891 llvm-svn: 310014	2017-08-04 01:19:54 +00:00
Teresa Johnson	cde6934bb7	Use profile summary to disable peeling for huge working sets Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further. Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K). Reviewers: davidxl Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36288 llvm-svn: 310005	2017-08-03 23:42:58 +00:00
Davide Italiano	6806b06bf7	[NewGVN] Fix the case where we have a phi-of-ops which goes away. Patch by Daniel Berlin, fixes PR33196 (and probably something else). llvm-svn: 309988	2017-08-03 21:17:49 +00:00
Teresa Johnson	ffba812867	Disable loop peeling during full unrolling pass. Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling. Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled. Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks. Reviewers: chandlerc, davidxl Subscribers: mzolotukhin, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36258 llvm-svn: 309966	2017-08-03 17:52:38 +00:00
Sanjay Patel	ef97ac86ad	[NewGVN] fix typos; NFC llvm-svn: 309946	2017-08-03 15:18:27 +00:00
Ewan Crawford	ae5418b4ef	[Cloning] Move distinct GlobalVariable debug info metadata in CloneModule Duplicating the distinct Subprogram and CU metadata nodes seems like the incorrect thing to do in CloneModule for GlobalVariable debug info. As it results in the scope of the GlobalVariable DI no longer being consistent with the rest of the module, and the new CU is absent from llvm.dbg.cu. Fixed by adding RF_MoveDistinctMDs to MapMetadata flags for GlobalVariables. Current unit test IR after clone: ``` @gv = global i32 1, comdat($comdat), !dbg !0, !type !5 define private void @f() comdat($comdat) personality void ()* @persfn !dbg !14 { !llvm.dbg.cu = !{!10} !0 = !DIGlobalVariableExpression(var: !1) !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5) !3 = !DIFile(filename: "filename.c", directory: "/file/dir/") !4 = !DISubroutineType(types: !5) !5 = !{} !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8) !7 = !DIFile(filename: "filename.c", directory: "/file/dir") !8 = !{!0} !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)") !10 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !11) !11 = !{!12} !12 = !DIGlobalVariableExpression(var: !13) !13 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !14, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !14 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !10, variables: !5) ``` Patched IR after clone: ``` @gv = global i32 1, comdat($comdat), !dbg !0, !type !5 define private void @f() comdat($comdat) personality void ()* @persfn !dbg !2 { !llvm.dbg.cu = !{!6} !0 = !DIGlobalVariableExpression(var: !1) !1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true) !2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5) !3 = !DIFile(filename: "filename.c", directory: "/file/dir/") !4 = !DISubroutineType(types: !5) !5 = !{} !6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8) !7 = !DIFile(filename: "filename.c", directory: "/file/dir") !8 = !{!0} !9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)") ``` Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36082 llvm-svn: 309928	2017-08-03 09:23:03 +00:00
Matt Arsenault	3207fbe899	LV: Don't insert runtime ptr checks on divergent targets llvm-svn: 309890	2017-08-02 21:43:08 +00:00
Craig Topper	e87eb175b2	[InstCombine] Remove unnecessary temporary APInt. NFCI llvm-svn: 309887	2017-08-02 21:05:40 +00:00
Teresa Johnson	f4af38fa5f	[PM] Split LoopUnrollPass and make partial unroller a function pass Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886	2017-08-02 20:35:29 +00:00
Craig Topper	d4646fd843	[InstCombine] Remove explicit code for folding (xor(zext(cmp)), 1) and (xor(sext(cmp)), -1) to ext(!cmp). As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor. Differential Revision: https://reviews.llvm.org/D36214 llvm-svn: 309882	2017-08-02 20:30:27 +00:00
Craig Topper	b8ffa0a173	[InstCombine] Support sext in foldLogicCastConstant This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214. Differential Revision: https://reviews.llvm.org/D36234 llvm-svn: 309880	2017-08-02 20:25:56 +00:00
Jakub Kuderski	8f78266b9f	[Dominators] Teach LoopDeletion to use the new incremental API Summary: This patch makes LoopDeletion use the incremental DominatorTree API. We modify LoopDeletion to perform the deletion in 5 steps: 1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch. 2. Inform the DomTree about the new edge. 3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable. 4. Inform the DomTree about the deleted edge. 5. Remove the unreachable block from the function. Creating the dummy conditional branch is necessary to perform incremental DomTree update. We should consider using the batch updater when it's ready. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35391 llvm-svn: 309850	2017-08-02 18:17:52 +00:00
Alexey Bataev	6c386e33ac	[SLPVectorizer] Generalize interface of functions, NFC. llvm-svn: 309816	2017-08-02 14:38:07 +00:00
Alexey Bataev	abac065cef	[SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements Summary: Currently most of the time vectors of extractelement instructions are treated as scalars that must be gathered into vectors. But in some cases, like when we have extractelement instructions from a single vector with different constant indices or from 2 vectors of the same size, we can treat these operations as a shuffle of a single vector or a blend of 2 vectors. ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %x0 = extractelement <2 x i8> %x, i32 0 %y1 = extractelement <2 x i8> %y, i32 1 %x0x0 = mul i8 %x0, %x0 %y1y1 = mul i8 %y1, %y1 %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0 %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1 ret <2 x i8> %ins2 } ``` can be converted to something like ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3> %2 = mul <2 x i8> %1, %1 ret <2 x i8> %2 } ``` Currently this type of conversion is considered a high-cost transformation. Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon Subscribers: ashahid, RKSimon, spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D30200 llvm-svn: 309812	2017-08-02 13:25:26 +00:00
Davide Italiano	205a09d9d6	[NewGVN] Fold single-use variables. NFCI. llvm-svn: 309790	2017-08-02 04:05:49 +00:00
Davide Italiano	b829a675a1	[NewGVN] Remove a (now stale) comment. NFCI. llvm-svn: 309789	2017-08-02 03:51:40 +00:00
Craig Topper	1142dd4c6d	[SimplifyCFG] Fix typo in comment. NFC llvm-svn: 309785	2017-08-02 02:34:16 +00:00
Chandler Carruth	eb7a769c09	[PM] Fix a bug where through CGSCC iteration we can get infinite-inlining across multiple runs of the inliner by keeping a tiny history of internal-to-SCC inlining decisions. This is still a bit gross, but I don't yet have any fundamentally better ideas and numerous people are blocked on this to use new PM and ThinLTO together. The core of the idea is to detect when we are about to do an inline that has a chance of re-splitting an SCC which we have split before with a similar inlining step. That is a critical component in the inlining forming a cycle and so far detects all of the various cyclic patterns I can come up with as well as the original real-world test case (which comes from a ThinLTO build of libunwind). I've added some tests that I think really demonstrate what is going on here. They are essentially state machines that march the inliner through various steps of a cycle and check that we stop when the cycle is closed and that we actually did do inlining to form that cycle. A lot of thanks go to Eric Christopher and Sanjoy Das for the help understanding this issue and improving the test cases. The biggest "yuck" here is the layering issue -- the CGSCC pass manager is providing somewhat magical state to the inliner for it to use to make itself converge. This isn't great, but I don't honestly have a lot of better ideas yet and at least seems nicely isolated. I have tested this patch, and it doesn't block any inlining on the entire LLVM test suite and SPEC, so it seems sufficiently narrowly targeted to the issue at hand. We have come up with hypothetical issues that this patch doesn't cover, but so far none of them are practical and we don't have a viable solution yet that covers the hypothetical stuff, so proceeding here in the interim. Definitely an area that we will be back and revisiting in the future. Differential Revision: https://reviews.llvm.org/D36188 llvm-svn: 309784	2017-08-02 02:09:22 +00:00
Chad Rosier	e36216c004	[Value Tracking] Default argument to true and rename accordingly. NFC. IMHO this is a bit more readable. llvm-svn: 309739	2017-08-01 20:18:54 +00:00
Craig Topper	83316a77e5	[InstCombine] Remove explicit check for impossible condition. Replace with assert Summary: As far as I can tell the earlier call to getLimitedValue guarantees ShiftAmt is saturated to BitWidth-1, preventing it from ever being equal to or greater than BitWidth. At one point in the past the getLimitedValue call was only passed BitWidth, not BitWidth - 1. This would have allowed the equality case to get here. And in fact this check was initially added as just BitWidth == ShiftAmt, but was changed shortly after to include >, which should never have been possible. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36123 llvm-svn: 309690	2017-08-01 15:10:25 +00:00
Max Kazantsev	f2bbb1101e	[IRCE][NFC] Add another assert that AddRecExpr's step is not zero One more assertion of this kind. It is a preparation step for generalizing to the case of stride not equal to +1/-1. llvm-svn: 309663	2017-08-01 06:49:29 +00:00
Max Kazantsev	3d9db69707	[IRCE][NFC] Add assert that AddRecExpr's step is not zero We should never return zero steps, ensure this fact by adding a sanity check when we are analyzing the induction variable. llvm-svn: 309661	2017-08-01 06:27:51 +00:00
Davide Italiano	da86042768	[MetaRenamer] Leave `@main` alone. To the best of my knowledge -metarenamer is used in two cases: 1) Obfuscate names, e.g. when they contain information that can't be shared. 2) Improve clarity of the textual IR for testcases. One of the use cases is getting the output of `opt` and passing it to the lli interpreter to run the test. If metarenamer renames @main, lli can't find an entry point. llvm-svn: 309657	2017-08-01 05:14:45 +00:00
Alina Sbirlea	73dde02cf6	Default MemoryLocation passed to getModRefInfo should be None (D35441) llvm-svn: 309645	2017-08-01 00:47:17 +00:00
Kostya Serebryany	a154c261b9	[sanitizer-coverage] relax an assertion llvm-svn: 309644	2017-08-01 00:44:05 +00:00
Alina Sbirlea	7b373d280b	Allow None as a MemoryLocation to getModRefInfo Summary: Adding part of the changes in D30369 (needed to make progress): Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA. Original summary from D30369, by dberlin: Currently, we have instructions which affect memory but have no memory location. If you call, for example, MemoryLocation::get on a fence, it asserts. This means things specifically have to avoid that. It also means we end up with a copy of each API, one taking a memory location, one not. This starts to fix that. We add MemoryLocation::getOrNone as a new call, and reimplement the old asserting version in terms of it. We make MemoryLocation optional in the (Instruction, MemoryLocation) version of getModRefInfo, and kill the old one argument version in favor of passing None (it had one caller). Now both can handle fences because you can just use MemoryLocation::getOrNone on an instruction and it will return a correct answer. We use all this to clean up part of MemorySSA that had to handle this difference. Note that literally every actual getModRefInfo interface we have could be made private and replaced with: getModRefInfo(Instruction, Optional<MemoryLocation>) and getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>) and delegating to the right ones, if we wanted to. I have not attempted to do this yet. Reviewers: dberlin, davide, dblaikie Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D35441 llvm-svn: 309641	2017-08-01 00:28:29 +00:00
Sanjay Patel	9716dfe75b	[InstCombine] allow mask hoisting transform for vector types llvm-svn: 309627	2017-07-31 21:01:53 +00:00
Peter Collingbourne	da82874983	Update phi nodes in LowerTypeTests control flow simplification D33925 added a control flow simplification for -O2 --lto-O0 builds that manually splits blocks and reassigns conditional branches but does not correctly update phi nodes. If the else case being branched to had incoming phi nodes the control-flow simplification would leave phi nodes in that BB with an unhandled predecessor. Patch by Vlad Tsyrklevich! Differential Revision: https://reviews.llvm.org/D36012 llvm-svn: 309621	2017-07-31 20:43:07 +00:00
Kostya Serebryany	434bcf9183	[sanitizer-coverage] don't instrument available_externally functions llvm-svn: 309611	2017-07-31 20:00:22 +00:00
Kostya Serebryany	7923f1a0ff	[sanitizer-coverage] ensure minimal alignment for coverage counters and guards llvm-svn: 309610	2017-07-31 19:49:45 +00:00
Davide Italiano	3c9ab0171a	[SLPVectorizer] Unbreak the build with -Werror. GCC was complaining about `&&` within `||` without explicit parentheses. NFCI. llvm-svn: 309606	2017-07-31 19:14:19 +00:00
Craig Topper	d5daa59720	[X86][InstCombine] Add some simplifications for BZHI intrinsics This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications. This could be in InstSimplify, but we don't handle any target specific intrinsics there today. Differential Revision: https://reviews.llvm.org/D36069 llvm-svn: 309604	2017-07-31 18:52:15 +00:00
Craig Topper	80686037d0	[X86][InstCombine] Add basic simplification support for BEXTR/BEXTRI intrinsics. This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either. I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify. Differential Revision: https://reviews.llvm.org/D36063 llvm-svn: 309603	2017-07-31 18:52:13 +00:00
David Majnemer	fb412dedc3	[IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer We are not allowed to reason about an initializer value without first consulting hasDefinitiveInitializer. llvm-svn: 309594	2017-07-31 17:47:07 +00:00
Florian Hahn	f80450c0bb	Extend ifdefs to more unused helper functions. This fixes a buildbot failure with -Werror introduced by r309553 llvm-svn: 309572	2017-07-31 16:11:43 +00:00
Alexey Bataev	2aa430ee58	[SLP] Initial rework for min/max horizontal reduction vectorization, NFC. Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations. Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D29402 llvm-svn: 309566	2017-07-31 14:36:05 +00:00
Alexey Bataev	55309303be	[Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC. llvm-svn: 309563	2017-07-31 14:19:32 +00:00
Ayal Zaks	dca1c68e5c	[LV] Avoid redundant operations manipulating masks The Loop Vectorizer generates redundant operations when manipulating masks: AND with true, OR with false, compare equal to true. Instead of relying on a subsequent pass to clean them up, this patch avoids generating them. Use null (no-mask) to represent all-one full masks, instead of a constant all-one vector, following the convention of masked gathers and scatters. Preparing for a follow-up VPlan patch in which these mask manipulating operations are modeled using recipes. Differential Revision: https://reviews.llvm.org/D35725 llvm-svn: 309558	2017-07-31 13:21:42 +00:00
Florian Hahn	3d431b22ac	Guard print() functions only used by dump() functions. Summary: Since r293359, most dump() functions are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553	2017-07-31 10:07:49 +00:00
Florian Hahn	e48aac838f	[LoopInterchange] Do not interchange loops with function calls. Summary: Without any information about the called function, we cannot be sure that it is safe to interchange loops which contain function calls. For example there could be dependences that prevent interchanging between accesses in the called function and the loops. Even functions without any parameters could cause problems, as they could access memory using global pointers. For now, I think it is only safe to interchange loops with calls marked as readnone. With this patch, the LLVM test suite passes with `-O3 -mllvm -enable-loopinterchange` and LoopInterchangeProfitability::isProfitable returning true for all loops. check-llvm and check-clang also pass when bootstrapped in a similar fashion, although only 3 loops got interchanged. Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35489 llvm-svn: 309547	2017-07-31 09:00:52 +00:00
Sam Elliott	410ed659bc	Migrate PGOMemOptSizeOpt to use new OptimizationRemarkEmitter Pass Summary: Fixes PR33790. This patch still needs a yaml-style test, which I shall write tomorrow Reviewers: anemet Reviewed By: anemet Subscribers: anemet, llvm-commits Differential Revision: https://reviews.llvm.org/D35981 llvm-svn: 309497	2017-07-30 00:35:33 +00:00
Sumanth Gundapaneni	5ecc6dbfd3	[SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables Differential Revision: https://reviews.llvm.org/D35579 llvm-svn: 309444	2017-07-28 22:25:40 +00:00
Adrian Prantl	c83c29a7b7	Remove the obsolete offset parameter from @llvm.dbg.value There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D35951 llvm-svn: 309426	2017-07-28 20:21:02 +00:00
Alexey Bataev	078766df94	[SLP] Allow vectorization of the instruction from the same basic blocks only, NFC. Summary: After some changes in SLP vectorizer we missed some additional checks to limit the instructions for vectorization. We should not perform analysis of the instructions if the parent of instruction is not the same as the parent of the first instruction in the tree or it was analyzed already. Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D34881 llvm-svn: 309425	2017-07-28 20:11:16 +00:00
Wei Mi	d9cf09c389	[GVN] Recommit the patch "Add phi-translate support in scalarpre" Recommit after working around bug PR31652. Three bugs fixed in previous recommits: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of the loop in the last iteration cannot be regarded as available. The third one is an out-of-bounds array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. In the following testcase, current scalarpre cannot recognize that the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 309397	2017-07-28 15:47:25 +00:00
Davide Italiano	6e5954a153	[JumpThreading] Stop falsely preserving LazyValueInfo. JumpThreading claims to preserve LVI, but it doesn't preserve the analyses which LVI holds a reference to (e.g. the Dominator). In the current pass manager infrastructure, after JT runs, the PM frees these analyses (including DominatorTree) but preserves LVI. CorrelatedValuePropagation runs immediately after and queries a corrupted domtree, causing weird miscompiles. This commit disables the preservation of LVI for the time being. Eventually, we should either move LVI to a proper dependency tracking mechanism (i.e. an analyses shouldn't hold references to other analyses and compute them on demand if needed), or we should teach all the passes preserving LVI to preserve the analyses LVI depends on. The new pass manager has a mechanism to invalidate LVI in case one of the analyses it depends on becomes invalid, so this problem shouldn't exist (at least not in this immediate form), but handling of analyses holding references is still a very delicate subject. Fixes PR33917 (and rustc). llvm-svn: 309355	2017-07-28 03:10:43 +00:00
Davide Italiano	8c7e46a022	[JumpThreading] Add an option to dump LazyValueInfo after the run. Differential Revision: https://reviews.llvm.org/D35973 llvm-svn: 309353	2017-07-28 02:57:43 +00:00
Dehao Chen	8742093089	Increase the ImportHotMultiplier to 10.0 Summary: The original 3.0 hot multiplier is too small, and would prevent hot callsites from being inlined. This patch increases the hot multiplier to 10.0 Reviewers: davidxl, tejohnson Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D35969 llvm-svn: 309344	2017-07-28 01:02:34 +00:00
Kostya Serebryany	62523c8dcf	[sanitizer-coverage] rename sanitizer-coverage-create-pc-table into sanitizer-coverage-pc-table and add plumbing for a clang flag llvm-svn: 309337	2017-07-28 00:09:29 +00:00
Kostya Serebryany	68db4867dd	[sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time llvm-svn: 309335	2017-07-27 23:36:49 +00:00
whitequark	ab0ae1ff66	[MergeFunctions] Remove alias support. The alias support was dead code since 2011. It was last touched in r124182, where it was reintroduced after being removed in r110434, and since then it was gated behind a HasGlobalAliases flag that was permanently stuck as `false`. It is also broken. I'm not sure if it bitrotted or was just broken in the first place because it appears to have never been tested, but the following IR results in a crash: define internal i32 @a(i32 %a, i32 %b) unnamed_addr { %c = add i32 %a, %b %d = xor i32 %a, %c ret i32 %c } define internal i32 @b(i32 %a, i32 %b) unnamed_addr { %c = add i32 %a, %b %d = xor i32 %a, %c ret i32 %c } It seems safe to remove buggy untested code that no one cared about for seven years. Differential Revision: https://reviews.llvm.org/D34802 llvm-svn: 309313	2017-07-27 19:36:13 +00:00
Davide Italiano	64a06c8cad	[FunctionImport] Prefer isa<> to dyn_cast<> as the value is not used. This change makes GCC7 happy again. llvm-svn: 309305	2017-07-27 18:38:09 +00:00
Hiroshi Yamauchi	a7d6028861	[InstCombine] Simplify pointer difference subtractions (GEP-GEP) where GEPs have other uses and one non-constant index Summary: Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic. This patch enables cases with exactly one non-constant index among input GEPs to happen where there is no duplicated arithmetic or code size increase even if input GEPs have other uses. For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses. Reviewers: sanjoy, bkramer Reviewed By: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35499 llvm-svn: 309304	2017-07-27 18:27:11 +00:00
Adam Nemet	e1bfd295b2	[ICP] Migrate to OptimizationRemarkEmitter This is a module pass so for the old PM, we can't use ORE, the function analysis pass. Instead ORE is created on the fly. A few notes: - isPromotionLegal is folded in the caller since we want to emit the Function in the remark but we can only do that if the symbol table look-up succeeded. - There was good test coverage for remarks in this pass. - promoteIndirectCall uses ORE conditionally since it's also used from SampleProfile which does not use ORE yet. Fixes PR33792. Differential Revision: https://reviews.llvm.org/D35929 llvm-svn: 309294	2017-07-27 16:54:15 +00:00
Daniel Neilson	0d6908f2bf	All libcalls should be considered to be GC-leaf functions. Summary: It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc), but these passes will not mark the call as a gc-leaf-function. All libcalls are actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that available libcalls are equivalent to gc-leaf-function calls. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35840 llvm-svn: 309291	2017-07-27 16:49:39 +00:00
Alexey Bataev	a5b418f4ae	[SLP] Outline code for the check that instruction users are part of vectorization tree, NFC. llvm-svn: 309284	2017-07-27 15:48:44 +00:00
David Blaikie	582df1a02a	Fix assert from r309278 llvm-svn: 309281	2017-07-27 15:28:10 +00:00
David Blaikie	e204899c29	ThinLTO: Don't import aliases of any kind (even linkonce_odr) Summary: Until a more advanced version of importing can be implemented for aliases (one that imports an alias as an available_externally definition of the aliasee), skip the narrow subset of cases that was possible but came at a cost: aliases of linkonce_odr functions could be imported because the linkonce_odr function could be safely duplicated from the source module. This came/comes at the cost of not being able to 'home' imported linkonce functions (they had to be emitted linkonce_odr in all the destination modules (even if they weren't used by an alias) rather than as available_externally - causing extra object size). Tangentially, this also was the only reason ThinLTO would emit multiple CUs in to the resulting DWARF - which happens to be a problem for Fission (there's a fix for this in GDB but not released yet, etc). (actually it's not the only reason - but I'm sending a patch to fix the other reason shortly) There's no reason to believe this particularly narrow alias importing was especially/meaningfully important, only that it was /possible/ to implement in this way. When a more general solution is done, it should still satisfy the DWARF concerns above, since the import will still be available_externally, and thus not create extra CUs. Since now all aliases are treated the same, I removed/simplified some test cases since they were testing corner cases where there are no longer any corners. Reviewers: tejohnson, mehdi_amini Differential Revision: https://reviews.llvm.org/D35875 llvm-svn: 309278	2017-07-27 15:09:06 +00:00
Hiroshi Yamauchi	4147ce4079	Fix a comment (test commit). llvm-svn: 309192	2017-07-26 21:54:43 +00:00
Adam Nemet	7b7de60cec	Migrate SimplifyLibCalls to new OptimizationRemarkEmitter Summary: This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter API. In fact, as SimplifyLibCalls is only ever called via InstCombine, (as far as I can tell) the OptimizationRemarkEmitter is added there, and then passed through to SimplifyLibCalls later. I have avoided changing any remark text. This closes PR33787 Patch by Sam Elliott! Reviewers: anemet, davide Reviewed By: anemet Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35608 llvm-svn: 309158	2017-07-26 19:03:18 +00:00
Wei Mi	60633f66f7	Disable loop unswitching for some patterns containing equality comparison with undef. This is a workaround for the bug described in PR31652 and http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for some simple patterns we can know the equality comparison may contain undef, so we regard such comparison as unsafe and will not do loop-unswitching for them. We also need to disable the select simplification when one of the select operands is undef and its result feeds into equality comparison. The patch cannot clear the safety issue caused by the bug, but it can suppress the issue from happening to some extent. Differential Revision: https://reviews.llvm.org/D35811 llvm-svn: 309059	2017-07-25 23:37:17 +00:00
Chandler Carruth	87057f138b	[LIR] Teach LIR to avoid extending the BE count prior to adding one to it when safe. Very often the BE count is the trip count minus one, and the plus one here should fold with that minus one. But because the BE count might in theory be UINT_MAX or some such, adding one before we extend could in some cases wrap to zero and break when we scale things. This patch checks to see if it would be safe to add one because the specific case that would cause this is guarded for prior to entering the preheader. This should handle essentially all of the common loop idioms coming out of C/C++ code once canonicalized by LLVM. Before this patch, both forms of loop in the added test cases ended up subtracting one from the size, extending it, scaling it up by 8 and then adding 8 back onto it. This is really silly, and it turns out made it all the way into generated code very often, so this is a surprisingly important cleanup to do. Many thanks to Sanjoy for showing me how to do this with SCEV. Differential Revision: https://reviews.llvm.org/D35758 llvm-svn: 308968	2017-07-25 10:48:32 +00:00
Kostya Serebryany	7f64536d37	[sanitizer-coverage] simplify the code, NFC llvm-svn: 308944	2017-07-25 02:07:38 +00:00
Florian Hahn	2d7c76cc7a	[LoopInterchange] Update code to use range-based for loops (NFC). Summary: The remaining non range-based for loops do not iterate over full ranges, so leave them as they are. Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson Reviewed By: aemerson Subscribers: aemerson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35777 llvm-svn: 308872	2017-07-24 11:41:30 +00:00
Xinliang David Li	a5e8a71ae3	[PGOInstr] Add a debug print llvm-svn: 308785	2017-07-21 21:36:25 +00:00
Haojie Wang	13f5c24893	ThinLTO Minimized Bitcode File Size Reduction Summary: Currently the ThinLTO minimized bitcode file only strips the debug info, but there is still a lot of information in the minimized bitcode file that will not be used by the thin linker. In this patch, most of the extra information is stripped to reduce the minimized bitcode file. Now only ModuleVersion, ModuleInfo, ModuleGlobalValueSummary, ModuleHash, Symtab and Strtab are left. Now the minimized bitcode file size is reduced to 15%-30% of the debug info stripped bitcode file size. Reviewers: danielcdh, tejohnson, pcc Reviewed By: pcc Subscribers: mehdi_amini, aprantl, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D35334 llvm-svn: 308760	2017-07-21 17:25:20 +00:00
Anna Thomas	d9589bfdfd	[RuntimeUnroll] NFC: Add a profitability function for multiexit loop Separated out the profitability from the safety analysis for multiexit loop unrolling. Currently, this is an NFC because profitability is true only if the unroll-runtime-multi-exit is set to true (off-by-default). This is to ease adding the profitability heuristic up for review at D35380. llvm-svn: 308753	2017-07-21 16:30:38 +00:00
Dinar Temirbulatov	50b5e7be49	[SLPVectorizer] Replace E->Scalars to VL0 at vectorizeTree and move comment, NFCI. llvm-svn: 308750	2017-07-21 16:02:56 +00:00
Dinar Temirbulatov	27a9d21730	[SLPVectorizer] buildTree_rec replace cast<Instruction>(VL[0]) to VL0, NFCI. llvm-svn: 308745	2017-07-21 15:31:54 +00:00
Dinar Temirbulatov	030b7675c5	[SLPVectorizer] Change canReuseExtract function parameter Opcode from unsigned to Value *, NFCI. llvm-svn: 308739	2017-07-21 13:32:36 +00:00
Jonas Paulsson	c38a4eb7d4	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Davide Italiano	215ad2e876	[PGO] Move the PGOInstrumentation pass to new OptRemark API. This fixes PR33791. llvm-svn: 308668	2017-07-20 20:43:05 +00:00
Peter Collingbourne	5d041bb166	LowerTypeTests: Drop function type metadata only if we're going to replace it. Previously we were (mis)handling jump table members with a prevailing definition in a full LTO module and a non-prevailing definition in a ThinLTO module by dropping type metadata on those functions entirely, which would cause type tests involving such functions to fail. This patch causes us to drop metadata only if we are about to replace it with metadata from cfi.functions. We also want to replace metadata for available_externally functions, which can arise in the opposite scenario (prevailing ThinLTO definition, non-prevailing full LTO definition). The simplest way to handle that is to remove the definition; there's little value in keeping it around at this point (i.e. after most optimization passes have already run) and later code will try to use the function's linkage to create an alias, which would result in invalid IR if the function is available_externally. Fixes PR33832. Differential Revision: https://reviews.llvm.org/D35604 llvm-svn: 308642	2017-07-20 18:02:05 +00:00
David Majnemer	00aacae685	[LICM] Make sinkRegion and hoistRegion non-recursive Large CFGs can cause us to blow up the stack because we would have a recursive step for each basic block in a region. Instead, create a worklist and iterate it. This limits the stack usage to something more manageable. Differential Revision: https://reviews.llvm.org/D35609 llvm-svn: 308582	2017-07-20 03:27:02 +00:00
Davide Italiano	dc30551029	[TRE] Move to the new OptRemark API. Fixes PR33788. Differential Revision: https://reviews.llvm.org/D35570 llvm-svn: 308524	2017-07-19 21:13:22 +00:00
Peter Collingbourne	3d4c92eec4	ThinLTOBitcodeWriter: Do not rewrite intrinsic functions when splitting modules. Changing the type of an intrinsic may invalidate the IR. Differential Revision: https://reviews.llvm.org/D35593 llvm-svn: 308500	2017-07-19 17:54:29 +00:00
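
The [LICM] entry above (00aacae685) describes replacing a recursive step per basic block with a worklist that is iterated. The general idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from D35609: `Node`, `preorderWorklist`, and `visitDeepChain` are hypothetical names invented here, and `Node` is a simplified stand-in rather than any LLVM type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tree of regions/blocks; not an LLVM class.
struct Node {
    int id;
    std::vector<Node*> children;
};

// Visit every node reachable from root in pre-order without recursion.
// Pending nodes live in a heap-allocated vector, so the depth of the tree
// no longer determines how much call stack is consumed.
std::vector<int> preorderWorklist(Node* root) {
    std::vector<int> order;
    if (!root) return order;
    std::vector<Node*> worklist{root};
    while (!worklist.empty()) {
        Node* n = worklist.back();
        worklist.pop_back();
        order.push_back(n->id);
        // Push children in reverse so the first child is processed next.
        for (auto it = n->children.rbegin(); it != n->children.rend(); ++it)
            worklist.push_back(*it);
    }
    return order;
}

// Build a chain of `depth` nodes (each node has one child) and count how
// many nodes the worklist traversal visits. A recursive traversal would
// need one stack frame per node on such an input.
std::size_t visitDeepChain(std::size_t depth) {
    std::vector<Node> nodes(depth);  // sized up front, so pointers stay valid
    for (std::size_t i = 0; i < depth; ++i) {
        nodes[i].id = static_cast<int>(i);
        if (i + 1 < depth) nodes[i].children.push_back(&nodes[i + 1]);
    }
    return depth == 0 ? 0 : preorderWorklist(&nodes[0]).size();
}
```

Calling `visitDeepChain` with a large depth exercises the case the commit message is concerned with: a very deep structure is traversed with constant call-stack depth, at the cost of an explicit vector of pending nodes.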

18523 Commits