llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Sanjoy Das	b20d278ebd	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
David Majnemer	0cf2c938ab	[InstCombine] Don't sink an instr after a catchswitch A catchswitch is a terminator, instructions cannot be inserted after it. llvm-svn: 265158	2016-04-01 17:28:17 +00:00
Matt Arsenault	2a2b3fa835	AMDGPU: Add frexp_exp intrinsic llvm-svn: 264944	2016-03-30 22:28:52 +00:00
Matt Arsenault	c182eae9e4	AMDGPU: Constant folding for frexp_mant llvm-svn: 264943	2016-03-30 22:28:26 +00:00
Junmo Park	f9c43d451f	Minor code cleanup. NFC. llvm-svn: 264124	2016-03-23 01:38:35 +00:00
David Majnemer	2669789037	[InstCombine] Don't insert instructions before a catch switch CatchSwitches are not splittable, we cannot insert casts, etc. before them. This fixes PR26992. llvm-svn: 263874	2016-03-19 04:39:52 +00:00
Guozhi Wei	9cfdcc479b	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734	2016-03-17 18:47:20 +00:00
Bjorn Steinbrink	a31b1fe444	Also handle the new Rust pers fn to isCatchAll() llvm-svn: 263585	2016-03-15 20:57:07 +00:00
Justin Lebar	7e369d0dca	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Originally landed as r261544, then reverted in r261544 for (incidental) build breakage. Re-landed here with no changes. Reviewers: chandlerc, jingyue Subscribers: llvm-commits, tra, jhen, hfinkel Differential Revision: http://reviews.llvm.org/D17739 llvm-svn: 263481	2016-03-14 20:18:54 +00:00
Mehdi Amini	12d624a26a	Remove PreserveNames template parameter from IRBuilder This reapplies r263258, which was reverted in r263321 because of issues on Clang side. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263393	2016-03-13 21:05:13 +00:00
Sanjay Patel	60bcb48851	[x86, InstCombine] delete x86 SSE2 masked store with zero mask This follows up on the related AVX instruction transforms, but this one is too strange to do anything more with. Intel's behavioral description of this instruction in its Software Developer's Manual is tragi-comic. llvm-svn: 263340	2016-03-12 15:16:59 +00:00
Eric Christopher	559b0be562	Temporarily revert: commit ae14bf6488e8441f0f6d74f00455555f6f3943ac Author: Mehdi Amini <mehdi.amini@apple.com> Date: Fri Mar 11 17:15:50 2016 +0000 Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8 until we can figure out what to do about clang and Release build testing. This reverts commit 263258. llvm-svn: 263321	2016-03-12 01:47:22 +00:00
Mehdi Amini	1f82b794e4	Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263258	2016-03-11 17:15:50 +00:00
Chandler Carruth	6150530377	[PM] Make the AnalysisManager parameter to run methods a reference. This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219	2016-03-11 11:05:24 +00:00
Benjamin Kramer	af02270708	[InstCombine] Use Twines to generate names. Since the names are used in a loop this does more work in debug builds. In release builds value names are generally discarded so we don't have to do the concatenation at all. It's also simpler code, no functional change intended. llvm-svn: 263215	2016-03-11 10:20:56 +00:00
Philip Reames	64fa288117	[ValueTracking] Extract isKnownPositive [NFCI] Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem. llvm-svn: 263062	2016-03-09 21:31:47 +00:00
Philip Reames	a1cf015229	[InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0) When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min. Differential Revision: http://reviews.llvm.org/D17873 llvm-svn: 263059	2016-03-09 21:05:07 +00:00
Matthias Braun	3757169d38	InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2 As part of r251146 InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compiletime by 1-3% (5% in irgen+opt time) in my measurements. On the other hand this case did not trigger once in the whole llvm-testsuite. This patch introduces the notion of ExpensiveCombines which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify as that is called from various places where the OptLevel is not known but given the rarity of the situation I think a check in InstCombine is enough. Differential Revision: http://reviews.llvm.org/D16835 llvm-svn: 263047	2016-03-09 18:47:11 +00:00
Petar Jovanovic	d649b976d0	Reland r262337 "calculate builtin_object_size if arg is a removable pointer" Original commit message: calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 Reland the original change with a small modification (first do a null check and then do the cast) to satisfy ubsan. llvm-svn: 263011	2016-03-09 14:12:47 +00:00
Junmo Park	4c71496a1c	Revert "[InstCombine] Combine A->B->A BitCast" This reverts commit r262670 due to compile failure. llvm-svn: 262916	2016-03-08 07:09:46 +00:00
Guozhi Wei	a7fc8e012e	[InstCombine] Combine A->B->A BitCast This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670	2016-03-03 23:21:38 +00:00
Sanjay Patel	fe4b2c4bc6	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1> %bc = bitcast <4 x i32> %not to <2 x i64> %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1> %bc2 = bitcast <2 x i64> %notnot to <4 x i32> ret <4 x i32> %bc2 } Simplifies to the expected: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> ret <4 x i32> %lobit } Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645	2016-03-03 19:19:04 +00:00
Amaury Sechet	09dedb75e8	Explode store of arrays in instcombine Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530	2016-03-02 22:36:45 +00:00
Amaury Sechet	7169395ec4	Unpack array of all sizes in InstCombine Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements. Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15890 llvm-svn: 262521	2016-03-02 21:28:30 +00:00
Sanjay Patel	afbc4f1551	revert r262424 because there's a clang test for AArch64 that checks -O3 asm output that is broken by this change llvm-svn: 262440	2016-03-02 01:04:09 +00:00
Sanjay Patel	7266aa16e2	[InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) As noted in the code comment, I don't think we can do the same transform that we do for scalar integers comparisons to vector integers comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424	2016-03-01 23:55:18 +00:00
Dehao Chen	9581e2d337	Perform InstructioinCombiningPass before SampleProfile pass. Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419	2016-03-01 22:53:02 +00:00
Owen Anderson	999a9f171c	Fix an issue where fast math flags were dropped during scalarization. Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376	2016-03-01 19:35:52 +00:00
Petar Jovanovic	0302dd0999	Revert "calculate builtin_object_size if argument is a removable pointer" Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349	2016-03-01 16:50:08 +00:00
Petar Jovanovic	a1fb751763	calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 llvm-svn: 262337	2016-03-01 14:39:55 +00:00
Sanjay Patel	aee22b5eed	[x86, InstCombine] transform more x86 masked loads to LLVM intrinsics Continuation of: http://reviews.llvm.org/rL262269 llvm-svn: 262273	2016-02-29 23:59:00 +00:00
Sanjay Patel	4d045a2e91	[x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant: __m128 mload_zeros(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(0)); } __m128 mload_fakeones(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(1)); } __m128 mload_ones(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000)); } __m128 mload_oneset(float f) { return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0)); } ...so none of the above will actually generate a masked load for optimized code. This is the masked load counterpart to: http://reviews.llvm.org/rL262064 llvm-svn: 262269	2016-02-29 23:16:48 +00:00
Reid Kleckner	43aeb743c9	[InstCombine] Be more conservative about removing stackrestore We ended up removing a save/restore pair around an inalloca call, leading to a miscompile in Chromium. llvm-svn: 262095	2016-02-27 00:53:54 +00:00
Sanjay Patel	014ad4d329	[x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics Replicate everything for integers...because x86. Continuation of: http://reviews.llvm.org/rL262064 llvm-svn: 262077	2016-02-26 21:51:44 +00:00
Sanjay Patel	d5185a0950	[x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant: void mstore_zero_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0), v); } void mstore_fake_ones_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(1), v); } void mstore_ones_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v); } void mstore_one_set_elt_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v); } ...so none of the above will actually generate a masked store for optimized code. Differential Revision: http://reviews.llvm.org/D17485 llvm-svn: 262064	2016-02-26 21:04:14 +00:00
Sanjay Patel	97801c676a	[InstCombine] enable optimization of casted vector xor instructions This is part of the payoff for the refactoring in: http://reviews.llvm.org/rL261649 http://reviews.llvm.org/rL261707 In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing. This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702 llvm-svn: 261750	2016-02-24 17:00:34 +00:00
Artur Pilipenko	5746dd289e	NFC. Move isDereferenceable to Loads.h/cpp This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736	2016-02-24 12:49:04 +00:00
Sanjay Patel	4f966611aa	[InstCombine] refactor visitOr() to use foldCastedBitwiseLogic() Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra check from the nearly identical 'or' case: if ((!isa<ICmpInst>(Cast0Src) \|\| !isa<ICmpInst>(Cast1Src)) But I'm not sure how to expose that difference in a regression test. Without that check, the 'or' path will infinite loop on: test/Transforms/InstCombine/zext-or-icmp.ll because the zext-or-icmp fold is attempting a reverse transform. The refactoring should extend to the 'xor' case next to solve part of PR26702. llvm-svn: 261707	2016-02-23 23:56:23 +00:00
Sanjay Patel	e4ec3d519e	[InstCombine] improve readability ; NFCI Less indenting, named local variables, more descriptive names. llvm-svn: 261659	2016-02-23 17:41:34 +00:00
Sanjay Patel	4e682fcfe8	[InstCombine] less indenting; NFC llvm-svn: 261652	2016-02-23 16:59:21 +00:00
Sanjay Patel	70dd21aa9d	[InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI This is a straight cut and paste of the existing code and is intended to be the first step in solving part of PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 We should be able to reuse most of this and delete the nearly identical existing code in visitOr(). Then, we can enhance visitXor() to use the same code too. llvm-svn: 261649	2016-02-23 16:36:07 +00:00
Justin Lebar	f84464712c	Revert "[attrs] Handle convergent CallSites." This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549	2016-02-22 18:24:43 +00:00
Justin Lebar	ca379cda9f	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544	2016-02-22 17:51:35 +00:00
Sanjay Patel	ff9ee24191	fix inaccurate comment; NFC llvm-svn: 261484	2016-02-21 17:33:31 +00:00
Sanjay Patel	67805931c5	[InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC Originally part of: http://reviews.llvm.org/D17485 We need this when simplifying masked memory ops too. llvm-svn: 261483	2016-02-21 17:29:33 +00:00
Simon Pilgrim	f781386f81	[InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element llvm-svn: 261460	2016-02-20 23:17:35 +00:00
Chandler Carruth	3d4a43dca8	[AA] Preserve the AA results wrapper pass as well as BasicAA in a few more places to prevent gratuitous re-"runs" of these passes. The passes themselves don't do any work when run, but we keep spending time scheduling and running these needlessly when we really don't need to do so. This is the first patch towards fixing the really horrible loop pass pipeline fragmentation pointed out by Sanjoy in PR24804. llvm-svn: 261302	2016-02-19 03:12:14 +00:00
Richard Trieu	5a759985de	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Amaury Sechet	ea7abf7b10	NFC: Fix formating llvm-svn: 261156	2016-02-17 21:21:29 +00:00
Amaury Sechet	14d2c58ecf	Fix load alignement when unpacking aggregates structs Summary: Store and loads unpacked by instcombine do not always have the right alignement. This explicitely compute the alignement and set it. Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17326 llvm-svn: 261139	2016-02-17 19:21:28 +00:00

1 2 3 4 5 ...

1651 Commits