llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 13:33:37 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	c8cdf126d3	[CostModel][X86] Fix AVX2 v16i16 shift 'splat' costs. llvm-svn: 291366	2017-01-07 22:08:09 +00:00
Simon Pilgrim	269a611dc0	[CostModel][X86] Match 256-bit vector shift 'splat' costs for AVX2 and above We were matching against general vector shift costs before the uniform splat costs llvm-svn: 291365	2017-01-07 21:47:10 +00:00
Simon Pilgrim	5eef6afa17	[CostModel][AVX512BW] Add v32i16 vector shift costs for avx512bw targets. llvm-svn: 291354	2017-01-07 17:54:10 +00:00
Simon Pilgrim	271b5ff570	[X86][AVX512] Use lowerShuffleAsRepeatedMaskAndLanePermute for non-VBMI v64i8 shuffles (PR31470) llvm-svn: 291347	2017-01-07 15:37:50 +00:00
Simon Pilgrim	b938c434f9	[CostModel][X86] Add AVX512 and 512-bit vector shift cost tests. llvm-svn: 291269	2017-01-06 19:41:26 +00:00
Chad Rosier	2e3e81f00b	[AArch64] Reduce vector insert/extract cost for Falkor. Differential Revision: https://reviews.llvm.org/D28403 llvm-svn: 291254	2017-01-06 18:03:26 +00:00
Simon Pilgrim	f29ec731bd	[CostModel][X86] Fix 512-bit SDIV/UDIV 'big' costs. Set the costs on the lowest target that supports the type. llvm-svn: 291229	2017-01-06 11:12:53 +00:00
Simon Pilgrim	6597231ee4	[CostModel][X86] Add SDIV/UDIV cost tests for a wider range of targets Added a test demonstrating bug in AVX512 division costs llvm-svn: 291228	2017-01-06 11:02:40 +00:00
Simon Pilgrim	87a1dcdf6d	[CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL Matches other MUL/ADD/SUB 256-bit case on AVX1 llvm-svn: 291149	2017-01-05 18:20:25 +00:00
Chad Rosier	54356c6a0e	[AArch64][CostModel] Add coverage for bswap intrinsics. llvm-svn: 291140	2017-01-05 16:55:32 +00:00
Simon Pilgrim	e00b6e9f42	[CostModel][X86] Add support for broadcast shuffle costs Currently only for broadcasts with input and output of the same width. Differential Revision: https://reviews.llvm.org/D27811 llvm-svn: 291122	2017-01-05 15:56:08 +00:00
Chad Rosier	2caaab6e8f	[AArch64] Remove mcpu option as this test is not target specific. NFC. llvm-svn: 291117	2017-01-05 15:05:03 +00:00
Chad Rosier	4891053009	[AArch64] Remove unused arguments from tests. NFC. llvm-svn: 291112	2017-01-05 14:48:53 +00:00
Tobias Grosser	08b29b5d00	Add missing CHECK: line to test case added in 29097 Without this CHECK line, we may not detect incorrectly detected additional regions at the end of the region tree. llvm-svn: 290994	2017-01-04 19:35:38 +00:00
Tobias Grosser	6152532380	RegionInfo: add new test case This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and provides us with a minimal example of a test case which caused problems while working on an improved version of the RegionInfo analysis. We upstream this test case, as it certainly can be helpful in future debugging and optimization tests. Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in> llvm-svn: 290974	2017-01-04 17:50:15 +00:00
Simon Pilgrim	f1fa399ee0	[CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs Actual codegen is much better than the extract+insert patterns that was assumed. llvm-svn: 290962	2017-01-04 14:01:33 +00:00
Elena Demikhovsky	49038bfad5	Fixed shuffle-reverse cost on AVX-512. (This changed was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately). llvm-svn: 290812	2017-01-02 11:44:10 +00:00
Elena Demikhovsky	ac6dc0e1f0	AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns. X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost. In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426). * Shiffle-broadcast cost will be changed in Simon's upcoming patch. Differential Revision: https://reviews.llvm.org/D28118 llvm-svn: 290810	2017-01-02 10:37:52 +00:00
Sanjay Patel	6927f00f21	[ValueTracking] add tests for known-nonnull-at; NFC llvm-svn: 290790	2016-12-31 19:23:26 +00:00
Sanjoy Das	8c493b5ff3	[TBAAVerifier] Be stricter around verifying scalar nodes This fixes the issue exposed in PR31393, where we weren't trying sufficiently hard to diagnose bad TBAA metadata. This does reduce the variety in the error messages we print out, but I think the tradeoff of verifying more, simply and quickly overrules the need for more helpful error messags here. llvm-svn: 290713	2016-12-29 15:47:05 +00:00
Chandler Carruth	888ae91182	[PM] Teach MemDep to invalidate its result object when its cached analysis handles become invalid. Add a test case for its invalidation logic. llvm-svn: 290620	2016-12-27 19:33:04 +00:00
Chandler Carruth	9a136f0e41	[PM] Add more dedicated testing to cover the invalidation logic added to BasicAA in r290603. I've kept the basic testing in the new PM test file as that also covers the AAManager invalidation logic. If/when there is a good place for broader AA testing it could move there. This test is somewhat unsatisfying as I can't get it to fail even with ASan outside of explicit checks of the invalidation. Apparently we don't yet have any test coverage of the BasicAA code paths using either the domtree or loopinfo -- I made both of them always be null and check-llvm passed. llvm-svn: 290612	2016-12-27 17:59:22 +00:00
Bryant Wong	b0749ef47d	[AliasAnalysis] Teach BasicAA about memcpy. Differential Revision: https://reviews.llvm.org/D27034 llvm-svn: 290526	2016-12-25 22:42:27 +00:00
Simon Pilgrim	2882ed164e	[X86][SSE] Improve lowering of vXi64 multiplies As mentioned on PR30845, we were performing our vXi64 multiplication as: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi, 32)+ psllqi(AhiBlo, 32); when we could avoid one of the upper shifts with: AloBlo = pmuludq(a, b); AloBhi = pmuludq(a, psrlqi(b, 32)); AhiBlo = pmuludq(psrlqi(a, 32), b); return AloBlo + psllqi(AloBhi + AhiBlo, 32); This matches the lowering on gcc/icc. Differential Revision: https://reviews.llvm.org/D27756 llvm-svn: 290267	2016-12-21 20:00:10 +00:00
Michael Kuperstein	e4bd47a0d4	[ConstantFolding] Fix vector GEPs harder For vector GEPs, CastGEPIndices can end up in an infinite recursion, because we compare the vector type to the scalar pointer type, find them different, and then try to cast a type to itself. Differential Revision: https://reviews.llvm.org/D28009 llvm-svn: 290260	2016-12-21 17:34:21 +00:00
Daniel Jasper	4c6537eccb	Add files I seem to have dropped in my revert (r290086). Sorry! llvm-svn: 290087	2016-12-19 08:32:13 +00:00
Daniel Jasper	162ffcacd6	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Matthew Simpson	bf784fec18	[AArch64] Guard Misaligned 128-bit store penalty by subtarget feature This patch checks that the SlowMisaligned128Store subtarget feature is set when penalizing such stores in getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D27677 llvm-svn: 289845	2016-12-15 18:36:59 +00:00
Simon Pilgrim	97b5f6a46b	[CostModel][X86] Updated reverse shuffle costs llvm-svn: 289819	2016-12-15 14:24:07 +00:00
Simon Pilgrim	6219e36a4f	[CostModel] Fix long standing bug with reverse shuffle mask detection Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles llvm-svn: 289811	2016-12-15 12:12:45 +00:00
Simon Pilgrim	d5264e8f7c	[CostModel][X86] Add tests for reverse shuffle costs llvm-svn: 289800	2016-12-15 10:45:53 +00:00
Hal Finkel	f224db75d2	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Hal Finkel	502475d4f3	Make processing @llvm.assume more efficient by using operand bundles There was an efficiency problem with how we processed @llvm.assume in ValueTracking (and other places). The AssumptionCache tracked all of the assumptions in a given function. In order to find assumptions relevant to computing known bits, etc. we searched every assumption in the function. For ValueTracking, that means that we did O(#assumes * #values) work in InstCombine and other passes (with a constant factor that can be quite large because we'd repeat this search at every level of recursion of the analysis). Several of us discussed this situation at the last developers' meeting, and this implements the discussed solution: Make the values that an assume might affect operands of the assume itself. To avoid exposing this detail to frontends and passes that need not worry about it, I've used the new operand-bundle feature to add these extra call "operands" in a way that does not affect the intrinsic's signature. I think this solution is relatively clean. InstCombine adds these extra operands based on what ValueTracking, LVI, etc. will need and then those passes need only search the users of the values under consideration. This should fix the computational-complexity problem. At this point, no passes depend on the AssumptionCache, and so I'll remove that as a follow-up change. Differential Revision: https://reviews.llvm.org/D27259 llvm-svn: 289755	2016-12-15 02:53:42 +00:00
Sanjoy Das	a0e8011216	[Verifier] Add verification for TBAA metadata Summary: This change adds some verification in the IR verifier around struct path TBAA metadata. Other than some basic sanity checks (e.g. we get constant integers where we expect constant integers), this checks: - That by the time an struct access tuple `(base-type, offset)` is "reduced" to a scalar base type, the offset is `0`. For instance, in C++ you can't start from, say `("struct-a", 16)`, and end up with `("int", 4)` -- by the time the base type is `"int"`, the offset better be zero. In particular, a variant of this invariant is needed for `llvm::getMostGenericTBAA` to be correct. - That there are no cycles in a struct path. - That struct type nodes have their offsets listed in an ascending order. - That when generating the struct access path, you eventually reach the access type listed in the tbaa tag node. Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26438 llvm-svn: 289402	2016-12-11 20:07:15 +00:00
Keno Fischer	ab41662e58	ConstantFolding: Don't crash when encountering vector GEP ConstantFolding tried to cast one of the scalar indices to a vector type. Instead, use the vector type only for the first index (which is the only one allowed to be a vector) and use its scalar type otherwise. Fixes PR31250. Reviewers: majnemer Differential Revision: https://reviews.llvm.org/D27389 llvm-svn: 289073	2016-12-08 17:22:35 +00:00
Haicheng Wu	20ce778776	[AArch64] Correct the check of signed 9-bit imm in isLegalAddressingMode() In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511]. Differential Revision: https://reviews.llvm.org/D27480 llvm-svn: 288876	2016-12-07 01:45:04 +00:00
Haicheng Wu	c157a2a55a	[TTI/CostModel] Correct the way getGEPCost() calls isLegalAddressingMode() Fix a bug when we call isLegalAddressingMode() from getGEPCost(). Differential Revision: https://reviews.llvm.org/D27357 llvm-svn: 288569	2016-12-03 01:57:24 +00:00
Guozhi Wei	27571a5d37	[ppc] Correctly compute the cost of loading 32/64 bit memory into VSR VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial. Differential Revision: https://reviews.llvm.org/D26713 llvm-svn: 288560	2016-12-03 00:41:43 +00:00
Alexey Bataev	7135b5ad24	[SLP] Fixed cost model for horizontal reduction. Currently when cost of scalar operations is evaluated the vector type is used for scalar operations. Patch fixes this issue and fixes evaluation of the vector operations cost. Several test showed that vector cost model is too optimistic. It allowed vectorization of 8 or less add/fadd operations, though scalar code is faster. Actually, only for 16 or more operations vector code provides better performance. Differential Revision: https://reviews.llvm.org/D26277 llvm-svn: 288398	2016-12-01 18:42:42 +00:00
Alexey Bataev	68437dbc6c	[SLP] Additional tests with the cost of vector operations. llvm-svn: 288377	2016-12-01 17:26:54 +00:00
Alexey Bataev	ef66b3c144	Revert "[SLP] Additional tests with the cost of vector operations." This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e. llvm-svn: 288371	2016-12-01 16:45:04 +00:00
Alexey Bataev	7f4c68a45a	[SLP] Additional tests with the cost of vector operations. llvm-svn: 288369	2016-12-01 16:11:48 +00:00
Sanjay Patel	b2bc8129f9	[InstSimplify] allow integer vector types to use computeKnownBits Note that the non-splat lshr+lshr test folded, but that does not work in general. Something is missing or wrong in computeKnownBits as the non-splat shl+shl test still shows. llvm-svn: 288005	2016-11-27 21:07:28 +00:00
Sanjay Patel	7b435e81e5	add tests to show missing analysis; NFC llvm-svn: 287998	2016-11-27 15:54:45 +00:00
Simon Pilgrim	b2804b00f4	[X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287882	2016-11-24 14:46:55 +00:00
Simon Pilgrim	5df905e2e1	[X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances llvm-svn: 287762	2016-11-23 14:01:18 +00:00
Simon Pilgrim	1f99fea9ae	[CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs llvm-svn: 287760	2016-11-23 13:42:09 +00:00
Simon Pilgrim	648b765007	[CostModel][X86] Add v2f32 -> v2i64 fptosi/fptoui cost tests llvm-svn: 287756	2016-11-23 11:43:00 +00:00
Simon Pilgrim	b94dcea1ae	[CostModel][X86] Updated sitofp/uitofp scalar/vector cost tests Better coverage of all legal types + special cases. Removed old fptoui tests which are all handled in fptoui.ll llvm-svn: 287678	2016-11-22 18:55:49 +00:00
Yaxun Liu	7bae0ef103	Fix known zero bits for addrspacecast. Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1. This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand. Differential Revision: https://reviews.llvm.org/D26803 llvm-svn: 287545	2016-11-21 15:42:31 +00:00
Craig Topper	d5ba7319d4	[AVX-512] Support FCOPYSIGN for v16f32 and v8f64 Summary: This extends FCOPYSIGN support to 512-bit vectors. I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads. Reviewers: delena, zvi, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26791 llvm-svn: 287298	2016-11-18 02:25:34 +00:00
Simon Pilgrim	50c2f1dd68	[CostModel][X86] Added mul costs for vXi8 vectors More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result llvm-svn: 286838	2016-11-14 15:54:24 +00:00
Simon Pilgrim	bf629ee6cc	[X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason. This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit) llvm-svn: 286832	2016-11-14 14:45:16 +00:00
Peter Collingbourne	fbb7ea5270	IR: Introduce inrange attribute on getelementptr indices. If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. This can be used, e.g. for alias analysis or to split globals at element boundaries where beneficial. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html Differential Revision: https://reviews.llvm.org/D22793 llvm-svn: 286514	2016-11-10 22:34:55 +00:00
Tobias Grosser	68e42a688d	[RegionInfo] Add three tests that include infinite loops These examples are variations that were inspired from a small subgraph taken from paper.ll which are interesting as they show certain issues with infinite loops. llvm-svn: 286450	2016-11-10 13:56:19 +00:00
Andrew Kaylor	c136ea50fc	[BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes Differential Revision: https://reviews.llvm.org/D26382 llvm-svn: 286294	2016-11-08 21:07:42 +00:00
Simon Pilgrim	f59c806c0f	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233	2016-11-08 14:10:28 +00:00
Chad Rosier	f56813d057	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D26252 llvm-svn: 286108	2016-11-07 14:11:45 +00:00
Alexey Bataev	40602a91a6	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 llvm-svn: 285564	2016-10-31 12:10:53 +00:00
Tom Stellard	390e5c25aa	[Loads] Fix crash in is isDereferenceableAndAlignedPointer() Summary: We were trying to add APInt values with different bit sizes after visiting an addrspacecast instruction which changed the bit width of the pointer. Reviewers: majnemer, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24774 llvm-svn: 285407	2016-10-28 15:32:28 +00:00
Simon Pilgrim	68d206a9d2	[X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets llvm-svn: 285329	2016-10-27 18:32:06 +00:00
Simon Pilgrim	268d797e5a	[X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64 With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq). Updated cost table accordingly. Differential Revision: https://reviews.llvm.org/D26011 llvm-svn: 285304	2016-10-27 15:27:00 +00:00
Chad Rosier	7c0f8709c5	Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory." This reverts commit r285191. LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent code from being moved outside of the original scope. llvm-svn: 285227	2016-10-26 19:18:19 +00:00
Chad Rosier	5c8105a75f	[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D25969 llvm-svn: 285191	2016-10-26 12:42:11 +00:00
Eli Friedman	245e1a53f5	Fix regression from my recent GlobalsAA fix. There are two fixes here: one, AnalyzeUsesOfPointer can't return false until it has checked all the uses of the pointer. Two, if a global uses another global, we have to assume the address of the first global escapes. Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 . Differential Revision: https://reviews.llvm.org/D25798 llvm-svn: 285034	2016-10-24 21:47:44 +00:00
Simon Pilgrim	464fee80a0	[CostModel][X86] Added tests for current integer signed/unsigned remainder costs llvm-svn: 284940	2016-10-23 18:35:02 +00:00
Simon Pilgrim	9a5c4b864b	[X86][SSE] Add SSE41/AVX1 costs for vector shifts. We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results. llvm-svn: 284939	2016-10-23 16:49:04 +00:00
Simon Pilgrim	6405c6dd36	[CostModel][X86] Added tests for current integer trunc costs llvm-svn: 284938	2016-10-23 15:17:52 +00:00
Gerolf Hoflehner	c01563abdd	[BasicAA] Fix - missed alias in GEP expressions In BasicAA GEP operand values get adjusted ("wrap-around") based on the pointersize. Otherwise, in non-64b modes, AA could report false negatives. However, a wrap-around is valid only for a fully evaluated expression. It had been introduced to fix an alias problem in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html. This commit restricts the wrap-around to constant gep operands only where the value is known at compile-time. llvm-svn: 284908	2016-10-22 02:41:39 +00:00
Li Huang	6f568d5434	[SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor. Summary: When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV multiple times if this SCEV is referenced by multiple other SCEVs. This has exponential time complexity in the worst case. Memoizing the results will avoid re-visiting the same SCEV. Add a map to save the results, and override the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each SCEV once and thus returns the same result for the same input SCEV. This patch fixes PR18606, PR18607. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25810 llvm-svn: 284868	2016-10-21 20:05:21 +00:00
John Brawn	c944a4af03	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Li Huang	0a4f3b84af	[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV This is to avoid inlining too many multiplication operands into a SCEV, which could take exponential time in the worst case. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25794 llvm-svn: 284784	2016-10-20 21:38:39 +00:00
Simon Pilgrim	6773ac2510	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv uniformconst costs for 256/512 bit integer vectors We weren't checking for uniform const costs before the general cost, resulting in very high estimates. llvm-svn: 284755	2016-10-20 18:00:35 +00:00
Simon Pilgrim	aaf8c094e3	[CostModel][X86] Added tests for sdiv/udiv costs for uniform const and uniform const power-of-2 Shows poor costings in AVX1/AVX512BW for certain vector types llvm-svn: 284748	2016-10-20 17:16:38 +00:00
Simon Pilgrim	9091f77856	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv general costs for 256/512 bit integer vectors We weren't accounting for legal types on every subtarget, meaning that many of the costs were using defaults. We still don't correctly cost (or test) the 512-bit sdiv/udiv by uniform const cases, nor the power-of-2 cases. llvm-svn: 284744	2016-10-20 16:39:11 +00:00
Simon Pilgrim	696a79808d	[CostModel][X86] Added tests for sdiv/udiv costs for scalar and 128/256/512 bit integer vectors Shows current bug in AVX1/AVX512BW costs for 256 bit vector types llvm-svn: 284723	2016-10-20 12:34:00 +00:00
Chad Rosier	d8727d0208	[AliasSetTracker] Add support for memcpy and memmove. Differential Revision: https://reviews.llvm.org/D25776 llvm-svn: 284630	2016-10-19 19:09:03 +00:00
John Brawn	fc3f42231b	[SCEV] More accurate calculation of max backedge count of some less-than loops In loops that look something like i = n; do { ... } while(i++ < n+k); where k is a constant, the maximum backedge count is k (in fact the backedge count will be either 0 or k, depending on whether n+k wraps). More generally for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will iterate either 0 or that constant number of times. This allows for more loop unrolling with the recent upper bound loop unrolling changes, and I'm working on a patch that will let loop unrolling additionally make use of the loop being executed either 0 or k times (we need to retain the loop comparison only on the first unrolled iteration). Differential Revision: https://reviews.llvm.org/D25607 llvm-svn: 284465	2016-10-18 10:10:53 +00:00
Simon Pilgrim	293c1a3deb	[X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32 As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types. This patch adds support for fptosi to 2i32 from both 2f64 and 2f32. It also recognises that cvttpd2dq zeroes the upper 64-bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch. Differential Revision: https://reviews.llvm.org/D23808 llvm-svn: 284459	2016-10-18 07:42:15 +00:00
Tobias Grosser	0307223760	[SCEV] Consider delinearization pattern with extension with identity factor Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor. Reviewers: sanjoy, grosser Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser Differential Revision: https://reviews.llvm.org/D16492 llvm-svn: 284378	2016-10-17 11:56:26 +00:00
Tom Stellard	6fa6d1d712	[ValueTracking] Fix crash in GetPointerBaseWithConstantOffset() Summary: While walking defs of pointer operands we were assuming that the pointer size would remain constant. This is not true, because addresspacecast instructions may cast the pointer to an address space with a different pointer width. This partial reverts r282612, which was a more conservative solution to this problem. Reviewers: reames, sanjoy, apilipenko Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24772 llvm-svn: 283557	2016-10-07 14:23:29 +00:00
Bjorn Pettersson	403539ab64	[ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement. Summary: The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24955 llvm-svn: 283434	2016-10-06 09:56:21 +00:00
Eli Friedman	6579a1a109	Make GlobalsAA ignore dead constant expressions. Slightly improves the precision of GlobalsAA in certain situations, and makes the behavior of optimization passes more predictable. Differential Revision: https://reviews.llvm.org/D24104 llvm-svn: 283165	2016-10-04 00:03:55 +00:00
Sanjay Patel	2e8c69fd7b	[x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119	2016-10-03 16:38:27 +00:00
Simon Pilgrim	85cdcda83a	[CostModel][X86] Added tests for current fptosi/fptoui costs llvm-svn: 283047	2016-10-01 19:09:59 +00:00
Simon Pilgrim	781f5cf02c	[CostModel][X86] Added fcopysign costs llvm-svn: 283044	2016-10-01 16:41:52 +00:00
Simon Pilgrim	b90adcf288	[CostModel][X86] Added fabs costs llvm-svn: 283042	2016-10-01 16:30:13 +00:00
Jun Bum Lim	bd9da0f624	Enhance calcColdCallHeuristics for InvokeInst Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst. Reviewers: mcrosier, hfinkel, davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D24868 llvm-svn: 282262	2016-09-23 17:26:14 +00:00
Simon Pilgrim	38157a81bd	[CostModel][X86] Added scalar float op costs llvm-svn: 281864	2016-09-18 21:01:20 +00:00
Sanjay Patel	5e1cecf28d	auto-generate checks llvm-svn: 281756	2016-09-16 17:54:52 +00:00
David L Kreitzer	d923ca0cb8	Reapplying r278731 after fixing the problem that caused it to be reverted. Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 281732	2016-09-16 14:38:13 +00:00
Wei Mi	c1cf1864f4	Create a getelementptr instead of sub expr for ValueOffsetPair if the value is a pointer. This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair, if the value is of pointer type, we can only create a getelementptr instead of sub expr. Differential Revision: https://reviews.llvm.org/D24088 llvm-svn: 281439	2016-09-14 04:39:50 +00:00
Elena Demikhovsky	eed0bb3d71	[Loop Vectorizer] Fixed memory confilict checks. Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930	2016-08-28 08:53:53 +00:00
Wei Mi	d25dea67a3	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748	2016-08-25 16:17:18 +00:00
Evgeny Stupachenko	a52e9dab82	The patch improves ValueTracking on left shift with nsw flag. Summary: The patch fixes PR28946. Reviewers: majnemer, sanjoy Differential Revision: http://reviews.llvm.org/D23296 From: Li Huang llvm-svn: 279684	2016-08-24 23:01:33 +00:00
Artur Pilipenko	d6ee117fbb	Remove missing file from r279433 reversal llvm-svn: 279434	2016-08-22 13:18:19 +00:00
Simon Pilgrim	e8edc2ae9d	[CostModel][X86] Removed shift tests There are more thorough tests found in vshift-*-cost.ll llvm-svn: 279406	2016-08-21 19:56:02 +00:00
Simon Pilgrim	be23c3cc0f	[CostModel][X86] Added costs for vXi16 and vXi8 vectors for add/sub/mul/and/or/xor tests llvm-svn: 279405	2016-08-21 19:44:44 +00:00
Simon Pilgrim	7d6c48f7b2	[CostModel][X86] Replaced SSSE3 with SSE2 costs to create a better baseline llvm-svn: 279404	2016-08-21 19:14:48 +00:00
Simon Pilgrim	36b78d8e6a	[CostModel][X86] Added fsqrt and fma costs llvm-svn: 279403	2016-08-21 19:06:25 +00:00
Simon Pilgrim	e77cbaf718	[CostModel][X86] Split off float arithmetic cost tests llvm-svn: 279402	2016-08-21 18:34:47 +00:00
Simon Pilgrim	70428fcec0	[CostModel][X86] Added sub, or, and, fadd and fsub costs and missing 512-bit mul costs llvm-svn: 279301	2016-08-19 19:07:10 +00:00
Simon Pilgrim	1c46dc174f	[CostModel][X86] Added some AVX512 and 512-bit vector cost tests llvm-svn: 279291	2016-08-19 18:24:10 +00:00
Simon Pilgrim	e189b72731	[CostModel][X86] Add fdiv + frem cost tests llvm-svn: 279283	2016-08-19 17:39:00 +00:00
Michael Kuperstein	2a0f74a4bf	[AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. Repeated inserts into AliasSetTracker have quadratic behavior - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets and running an alias check vs. every pointer in the set. We can avoid this by tracking the total number of pointers in "may" sets, and when that number exceeds a threshold, declare the tracker "saturated". This lumps all pointers into a single "may" set that aliases every other pointer. (This is a stop-gap solution until we migrate to MemorySSA) This fixes PR28832. Differential Revision: https://reviews.llvm.org/D23432 llvm-svn: 279274	2016-08-19 17:05:22 +00:00
Hans Wennborg	2c5ccecba6	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 llvm-svn: 278999	2016-08-17 22:50:18 +00:00
Reid Kleckner	bffc0a3155	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 llvm-svn: 278853	2016-08-16 21:02:04 +00:00
Sanjoy Das	2169ec7470	Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag" This reverts commit r278172. It causes PR28946. llvm-svn: 278740	2016-08-15 21:01:31 +00:00
David L Kreitzer	dbb1c574cf	Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 278731	2016-08-15 20:21:41 +00:00
Eli Friedman	9889f09e8d	[DSE] Don't remove stores made live by a call which unwinds. Issue exposed by noalias or more aggressive alias analysis. Fixes http://llvm.org/PR25422. Differential revision: https://reviews.llvm.org/D21007 llvm-svn: 278451	2016-08-12 01:09:53 +00:00
Andrew Kaylor	6c20f4ff06	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 278267	2016-08-10 18:47:19 +00:00
Andrew Kaylor	3ceabfd932	[ValueTracking] Improve ValueTracking on left shift with nsw flag Patch by Li Huang Differential Revison: https://reviews.llvm.org/D23296 llvm-svn: 278172	2016-08-09 22:41:35 +00:00
Wei Mi	c62339e54c	Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The patch is to fix the bug in PR28705. It was caused by setting wrong return value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion have different meanings when the function is used in different ways so it is easy to make mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion, and specifies where each interface is expected to be used. Differential Revision: https://reviews.llvm.org/D22942 llvm-svn: 278161	2016-08-09 20:40:03 +00:00
Wei Mi	3da1a46d30	Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The fix for PR28705 will be committed consecutively. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 278160	2016-08-09 20:37:50 +00:00
Sanjoy Das	4e9513d4ff	[SCEV] Un-grep'ify tests; NFC llvm-svn: 277861	2016-08-05 20:33:49 +00:00
Sanjoy Das	1c4709dd53	[SCEV] Don't infinitely recurse on unreachable code llvm-svn: 277848	2016-08-05 18:34:14 +00:00
Michael Kuperstein	7044572182	[LV, X86] Be more optimistic about vectorizing shifts. Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to to take this into account by making shifts by a uniform cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782	2016-08-04 22:48:03 +00:00
Simon Pilgrim	f889b79772	[X86] Dropped XOP ctbits checks - they match the AVX checks llvm-svn: 277718	2016-08-04 11:04:13 +00:00
Simon Pilgrim	70f3672799	[X86][SSE] Add initial costs for vector CTTZ/CTLZ llvm-svn: 277716	2016-08-04 10:51:41 +00:00
George Burgess IV	648b4cdc02	[CFLAA] Remove modref queries from CFLAA. As it turns out, modref queries are broken with CFLAA. Specifically, the data source we were using for determining modref behaviors explicitly ignores operations on non-pointer values. So, it wouldn't note e.g. storing an i32 to an i32* (or loading an i64 from an i64). It also ignores external function calls, rather than acting conservatively for them. (N.B. These operations, where necessary, are* tracked by CFLAA; we just use a different mechanism to do so. Said mechanism is relatively imprecise, so it's unlikely that we can provide reasonably good modref answers with it as implemented.) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22978 llvm-svn: 277366	2016-08-01 18:47:28 +00:00
Adam Nemet	9449a00cc1	[BPI] Add new LazyBPI analysis Summary: The motivation is the same as in D22141: In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. BFI depends on BPI so unless we make this lazy as well we would still compute BPI unconditionally. The solution is to use the new LazyBPI pass in LazyBFI and only compute BPI when computation of BFI is requested by the client. I extended the laziness test using a LoopDistribute test to also cover BPI. Reviewers: hfinkel, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22835 llvm-svn: 277083	2016-07-28 23:31:12 +00:00
George Burgess IV	628ac58686	[CFLAA] Add getModRefBehavior to CFLAnders. This patch lets CFLAnders respond to mod-ref queries. It also includes a small bugfix to CFLSteens. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22823 llvm-svn: 276939	2016-07-27 23:07:07 +00:00
Hans Wennborg	1ff36cfcf2	Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion." It causes Clang tests to fail after Windows self-host (PR28705). (Also reverts follow-up r276139.) llvm-svn: 276822	2016-07-26 23:25:13 +00:00
Wei Mi	5534e22fe5	Remove useless pass from the pipeline in test/Analysis/Dominators/2007-01-14-BreakCritEdges.ll. llvm-svn: 276644	2016-07-25 16:27:34 +00:00
Sanjoy Das	d06246dfe7	[SCEV] Make isImpliedCondOperandsViaRanges smarter This change lets us prove things like "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow" It does this by replacing replacing getConstantDifference by computeConstantDifference (which is smarter) in isImpliedCondOperandsViaRanges. llvm-svn: 276505	2016-07-23 00:54:36 +00:00
Wei Mi	95c770abfc	[PM] Port BreakCriticalEdges to the new PM. Differential Revision: https://reviews.llvm.org/D22688 llvm-svn: 276449	2016-07-22 18:04:25 +00:00
Wei Mi	b7c8cbfa86	Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136. The content in this testcase was accidentally duplicated. Fix the error. llvm-svn: 276139	2016-07-20 16:54:58 +00:00
Wei Mi	6fe94448f1	Use ValueOffsetPair to enhance value reuse during SCEV expansion. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 276136	2016-07-20 16:40:33 +00:00
Simon Pilgrim	c494279f63	[X86][SSE] Add cost model values for CTPOP of vectors This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104	2016-07-20 10:41:28 +00:00
George Burgess IV	48af022908	[CFLAA] Make a test tell the truth. NFC. Dishonesty noted by Jia Chen. llvm-svn: 276028	2016-07-19 20:56:41 +00:00
George Burgess IV	fc60bb603d	[CFLAA] Add some interproc. analysis to CFLAnders. This patch adds function summary support to CFLAnders. It also comes with a lot of tests! Woohoo! Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22450 llvm-svn: 276026	2016-07-19 20:47:15 +00:00
Simon Pilgrim	6883102e64	[X86] Add CTPOP/CTLZ/CTTZ scalar cost tests llvm-svn: 275725	2016-07-17 18:29:19 +00:00
George Burgess IV	1820150537	[CFLAA] Add attributes handling for CFLAnders. This patch adds proper handling of stratified attributes into our anders-style CFLAA implementation. It also comes bundled with more CFLAnders tests. :) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22325 llvm-svn: 275604	2016-07-15 20:02:49 +00:00
George Burgess IV	8b6295a5d8	[CFLAA] Add an initial CFLAnders implementation. This adds an incomplete anders-style implementation for CFLAA. It's incomplete in that it's missing interprocedural analysis, attrs handling, etc. and that it needs more tests. More tests and features will be added in future commits. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22291 llvm-svn: 275602	2016-07-15 19:53:25 +00:00
Tom Stellard	f071f36723	GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals Summary: In preparation for changing GlobalsAA to stop assuming that intrinsics can't read arbitrary globals, we need to make sure GlobalsAA is querying function attributes rather than relying on this assumption. This patch was inspired by: http://reviews.llvm.org/D20206 Reviewers: jmolloy, hfinkel Subscribers: eli.friedman, llvm-commits Differential Revision: https://reviews.llvm.org/D21318 llvm-svn: 275433	2016-07-14 15:50:27 +00:00
Adam Nemet	071e00e973	[BFI] Add new LazyBFI analysis pass Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added. Reviewers: hfinkel, dexonsmith, davidxl Subscribers: davidxl, dexonsmith, llvm-commits Differential Revision: http://reviews.llvm.org/D22141 llvm-svn: 275250	2016-07-13 05:01:48 +00:00
Keno Fischer	7919bf891e	Fix ScalarEvolutionExpander step scaling bug The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com). Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D16505 llvm-svn: 275239	2016-07-13 01:28:12 +00:00
Michael Kuperstein	7e6a08b33c	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00
Hal Finkel	392961ad04	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	f9c4041c84	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	9dd10de0f9	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't loose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Adam Nemet	50b66ccab5	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Justin Bogner	b7a198f7fd	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Sean Silva	09ccac554e	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sanjay Patel	cd5177a158	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. llvm-svn: 274658	2016-07-06 19:15:54 +00:00
Michael Kuperstein	d59d4568d1	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642	2016-07-06 17:30:56 +00:00
George Burgess IV	9f9488ba33	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 llvm-svn: 274589	2016-07-06 00:26:41 +00:00
Nemanja Ivanovic	f7fafe4ce2	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Nicolai Haehnle	fe1657d8ae	Add writeonly IR attribute Summary: This complements the earlier addition of IntrWriteMem and IntrWriteArgMem LLVM intrinsic properties, see D18291. Also start using the attribute for memset, memcpy, and memmove intrinsics, and remove their special-casing in BasicAliasAnalysis. Reviewers: reames, joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18714 llvm-svn: 274485	2016-07-04 08:01:29 +00:00
Xinliang David Li	ad67ad9771	[PM] Port LoopAccessInfo analysis to new PM It is implemented as a LoopAnalysis pass as discussed and agreed upon. llvm-svn: 274452	2016-07-02 21:18:40 +00:00

1 2 3 4 5 ...

1242 Commits