llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00

Author	SHA1	Message	Date
Sanjoy Das	8c493b5ff3	[TBAAVerifier] Be stricter around verifying scalar nodes This fixes the issue exposed in PR31393, where we weren't trying sufficiently hard to diagnose bad TBAA metadata. This does reduce the variety in the error messages we print out, but I think the tradeoff of verifying more, simply and quickly overrules the need for more helpful error messages here. llvm-svn: 290713	2016-12-29 15:47:05 +00:00
Sanjoy Das	2d9d8d6e59	[TBAAVerifier] Make things const-consistent; NFC llvm-svn: 290712	2016-12-29 15:47:01 +00:00
Sanjoy Das	f2cdbf0522	[TBAAVerifier] Memoize validity of scalar tbaa nodes; NFCI llvm-svn: 290711	2016-12-29 15:46:57 +00:00
Artem Tamazov	7abf7635ee	[AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive Among other stuff, this allows using predefined .option.machine_version_major /minor/stepping symbols in the directive. Relevant test expanded at once (also file renamed for clarity). Differential Revision: https://reviews.llvm.org/D28140 llvm-svn: 290710	2016-12-29 15:41:52 +00:00
Igor Laevsky	64a0622c9c	Fix documentation generator warnings after rL290708. llvm-svn: 290709	2016-12-29 15:08:57 +00:00
Igor Laevsky	cde17c8a3e	Introduce element-wise atomic memcpy intrinsic This change adds a new intrinsic which is intended to provide memcpy functionality with additional atomicity guarantees. Please refer to the review thread or language reference for further details. Differential Revision: https://reviews.llvm.org/D27133 llvm-svn: 290708	2016-12-29 14:31:07 +00:00
Craig Topper	c31995bf02	[InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC llvm-svn: 290707	2016-12-29 07:03:18 +00:00
Craig Topper	7e34057ad3	[InstCombine] Fix typo in comment. NFC llvm-svn: 290706	2016-12-29 05:38:31 +00:00
Craig Topper	5b7df5e739	[InstCombine] Use 32 bits instead of 64 bits for storing the number of elements in VectorType for a ShuffleVector. While there, use getVectorNumElements to avoid an explicit cast. NFC llvm-svn: 290705	2016-12-29 04:24:32 +00:00
Craig Topper	84af66a480	[InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner. We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This can allow DCE to find it sooner and remove it. Similar was done for InsertElement when the inserted element isn't demanded. llvm-svn: 290704	2016-12-29 03:30:17 +00:00
Kostya Serebryany	4ee731d6c9	[libFuzzer] make __sanitizer_cov_trace_switch more predictable llvm-svn: 290703	2016-12-29 02:50:35 +00:00
Craig Topper	412163fcd4	[InstCombine] Fix some of the AVX-512 scalar arithmetic test cases to do a better job of testing what they intended to test. They accidentally had trivially dead code. Also needed to adjust the rounding mode to not CUR_DIRECTION so the intrinsics don't get converted to native operations before going through SimplifyDemandedVectorElts. llvm-svn: 290702	2016-12-29 02:29:04 +00:00
Mehdi Amini	cff3bece5c	Remove BitstreamWriter::Emit64(), it was never called (NFC) llvm-svn: 290701	2016-12-29 01:40:53 +00:00
Reid Kleckner	158f53fe75	Fix mingw build by moving the static const data member before the bitfields Apparently GCC targeting Windows breaks bitfields on static data members: struct Foo { unsigned X : 16; static const int M = 42; unsigned Y : 16; }; static_assert(sizeof(Foo) == 4, "asdf"); // fails Who knew. llvm-svn: 290700	2016-12-29 01:14:41 +00:00
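The workaround named in the commit above (moving the static const data member before the bitfields) can be sketched as follows. This is a minimal illustration, not the actual LLVM change: `FooFixed` and `packsIntoOneWord` are hypothetical names, and the size check assumes that two adjacent 16-bit `unsigned` bitfields share one 32-bit unit, which holds on common targets but is not guaranteed by the standard.

```cpp
#include <cassert>

// Hypothetical reordering of the struct from the commit message: the static
// const data member is declared first, so nothing sits between the two
// 16-bit bitfields.
struct FooFixed {
  static const int M = 42;
  unsigned X : 16;
  unsigned Y : 16;
};

// Assumes a 32-bit unsigned; X and Y are then expected to share one unit.
static_assert(sizeof(FooFixed) == sizeof(unsigned),
              "X and Y should pack into a single unsigned");

// Run-time view of the same layout check (hypothetical helper).
inline bool packsIntoOneWord() { return sizeof(FooFixed) == sizeof(unsigned); }
```

Keeping a `static_assert` next to the struct means any future layout regression shows up at compile time on every toolchain that builds the code.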
Daniel Berlin	23bf9929d1	NewGVN: Sort Dominator Tree in RPO order, and use that for generating order. Summary: The optimal iteration order for this problem is RPO order. We want to process as many preds of a backedge as we can before we process the backedge. At the same time, as we add predicate handling, we want to be able to touch instructions that are dominated by a given block by ranges (because a change in value numbering a predicate possibly affects all users we dominate that are using that predicate). If we don't do it this way, we can't do value inference over backedges (the paper covers this in depth). The newgvn branch currently overshoots the last part, and guarantees that it will touch at least the right set of instructions, but it does touch more. This is because the bitvector instruction ranges are currently generated in RPO order (so we take the max and the min of the ranges of dominated blocks, which means there are some in the middle we didn't have to touch that we did). We can do better by sorting the dominator tree, and then just using dominator tree order. As a preliminary, the dominator tree has some RPO guarantees, but not enough. It guarantees that for a given node, your idom must come before you in the RPO ordering. It guarantees no relative RPO ordering for siblings. We add siblings in whatever order they appear in the module. So that is what we fix. We sort the children array of the domtree into RPO order, and then use the dominator tree for ordering, instead of RPO, since the dominator tree is now a valid RPO ordering. Note: This would help any other pass that iterates a forward problem in dominator tree order. Most of them are single pass. It will still maximize whatever result they compute. We could also build the dominator tree in this order, but our incremental updates would still put it out of sort order, and recomputing the sort order is almost as hard as general incremental updates of the domtree. 
Also note that the sorting does not affect any tests, etc. Nothing depends on domtree order, including the verifier, the equals functions for domtree nodes, etc. How much could this matter, you ask? Here are the current numbers. This is generated by running NewGVN over all files in LLVM. Note that once we propagate equalities, the differences go up by an order of magnitude or two (IE instead of 29, the max ends up in the thousands, since the worst case we add a factor of N, where N is the number of branch predicates). So while it doesn't look that stark for the default ordering, it gets much much worse. There are also programs in the wild where the difference is already pretty stark (2 iterations vs hundreds).
RPO ordering:
759040 Number of iterations is 1
112908 Number of iterations is 2
Default dominator tree ordering:
755081 Number of iterations is 1
116234 Number of iterations is 2
603 Number of iterations is 3
27 Number of iterations is 4
2 Number of iterations is 5
1 Number of iterations is 7
Dominator tree sorted:
759040 Number of iterations is 1
112908 Number of iterations is 2
<yay!>
Really bad ordering (sort domtree siblings in postorder. not quite the worst possible, but yeah):
754008 Number of iterations is 1
21 Number of iterations is 10
8 Number of iterations is 11
6 Number of iterations is 12
5 Number of iterations is 13
2 Number of iterations is 14
2 Number of iterations is 15
3 Number of iterations is 16
1 Number of iterations is 17
2 Number of iterations is 18
96642 Number of iterations is 2
1 Number of iterations is 20
2 Number of iterations is 21
1 Number of iterations is 22
1 Number of iterations is 29
17266 Number of iterations is 3
2598 Number of iterations is 4
798 Number of iterations is 5
273 Number of iterations is 6
186 Number of iterations is 7
80 Number of iterations is 8
42 Number of iterations is 9
Reviewers: chandlerc, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28129 llvm-svn: 290699	2016-12-29 01:12:36 +00:00
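The idea in the NewGVN commit above, sorting each dominator tree node's children by their reverse post-order (RPO) number, can be sketched on a toy graph. This is an illustration under stated assumptions, not LLVM code: the `Graph` alias and every function name below are hypothetical, and blocks are plain integers with block 0 as the entry.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy graph type (hypothetical): Adj[N] lists the neighbours of node N.
using Graph = std::vector<std::vector<int>>;

// RPONum[B] is the position of block B in a reverse post-order walk of the
// CFG starting at block 0 (the entry). Unreachable blocks keep -1.
std::vector<int> computeRPONumbers(const Graph &Succs) {
  std::vector<int> PostOrder;
  std::vector<bool> Visited(Succs.size(), false);
  std::function<void(int)> DFS = [&](int B) {
    Visited[B] = true;
    for (int S : Succs[B])
      if (!Visited[S])
        DFS(S);
    PostOrder.push_back(B);
  };
  DFS(0);
  std::vector<int> RPONum(Succs.size(), -1);
  int Next = 0;
  for (auto It = PostOrder.rbegin(); It != PostOrder.rend(); ++It)
    RPONum[*It] = Next++;
  return RPONum;
}

// Sort every node's dominator-tree children by RPO number.
void sortDomTreeChildren(Graph &DomChildren, const std::vector<int> &RPONum) {
  for (std::vector<int> &Children : DomChildren)
    std::sort(Children.begin(), Children.end(),
              [&](int A, int B) { return RPONum[A] < RPONum[B]; });
}

// Pre-order walk of the dominator tree rooted at node 0.
std::vector<int> domTreePreorder(const Graph &DomChildren) {
  std::vector<int> Order;
  std::function<void(int)> Walk = [&](int N) {
    Order.push_back(N);
    for (int C : DomChildren[N])
      Walk(C);
  };
  Walk(0);
  return Order;
}

// Diamond CFG: 0 -> {1, 2}, 1 -> 3, 2 -> 3. Block 0 immediately dominates
// 1, 2 and 3; the children start out in an arbitrary order.
std::vector<int> diamondExample() {
  Graph Succs = {{1, 2}, {3}, {3}, {}};
  Graph DomChildren = {{3, 1, 2}, {}, {}, {}};
  sortDomTreeChildren(DomChildren, computeRPONumbers(Succs));
  return domTreePreorder(DomChildren);
}
```

In the diamond example the unsorted tree would visit block 3 before its predecessors 1 and 2. After sorting, the pre-order walk visits 3 last, matching the RPO of the CFG.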
Reid Kleckner	36f848cda1	Add a static_assert about the sizeof(GlobalValue) I added one for Value back in r262045, and I'm starting to think we should have these for any class with bitfields whose memory efficiency really matters. llvm-svn: 290698	2016-12-29 00:55:51 +00:00
Daniel Berlin	231815c580	Update equalsStoreHelper for the fact that only one branch can be true llvm-svn: 290697	2016-12-29 00:49:32 +00:00
Justin Lebar	85f5dbd306	[GlobalValue] Move HasLLVMReservedName into existing bitfield. NFC Summary: Follow-up to r290691, where I introduced HasLLVMReservedName. rnk pointed out that that patch added an extra word to GlobalValue on MSVC, because it doesn't pack bitfields with different types. This patch moves HasLLVMReservedName into the existing bitfield, where we appear to have plenty of bits to spare. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28149 llvm-svn: 290696	2016-12-29 00:30:46 +00:00
Justin Lebar	a399eb32c6	[IR] Clarify that Value::getName() is not actually cheap. It involves a hashtable lookup when the Value has a name. llvm-svn: 290695	2016-12-29 00:30:42 +00:00
Reid Kleckner	551887db33	[COFF] Use 32-bit jump table entries in .rdata for Win64 Summary: We were already using 32-bit jump table entries, but this was a consequence of the default PIC model on Win64, and not an intentional design decision. This patch ensures that we always use 32-bit label difference jump table entries on Win64 regardless of the PIC model. This is a good idea because it saves executable size and object file size. Moving the jump tables to .rdata cleans up the disassembled object code and reduces the available ROP targets, but it requires adding one more RIP-relative lea to the code. COFF doesn't have relocations to express the difference between two arbitrary symbols, so we can't use the jump table label in the label difference like we do elsewhere. Fixes PR31488 Reviewers: majnemer, compnerd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28141 llvm-svn: 290694	2016-12-29 00:12:39 +00:00
Mehdi Amini	16e26549a6	Change Metadata Index emission in the bitcode to use 2x32 bits for the placeholder The Bitstream reader and writer are limited to handle a "size_t" at most, which means that we can't backpatch and read back a 64bits value on 32 bits platform. llvm-svn: 290693	2016-12-28 23:45:54 +00:00
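A 64-bit value stored as two 32-bit words, as the commit above describes for the placeholder, can be sketched like this. The function names and the low-word-first order are assumptions of this sketch. The commit message does not say which half comes first, and none of this is the BitstreamWriter API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Split a 64-bit value into (low, high) 32-bit halves. The low-first order
// is an assumption of this sketch.
std::pair<uint32_t, uint32_t> splitOffset(uint64_t Value) {
  return {static_cast<uint32_t>(Value & 0xFFFFFFFFu),
          static_cast<uint32_t>(Value >> 32)};
}

// Reassemble the 64-bit value from its two 32-bit halves.
uint64_t joinOffset(uint32_t Low, uint32_t High) {
  return (static_cast<uint64_t>(High) << 32) | Low;
}
```

Each half fits in a 32-bit `size_t`, so a writer limited to `size_t`-sized values can patch the two words separately.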
Piotr Padlewski	84cc9a1e8e	Revert "[NewGVN] replace emplace_back with push_back" llvm-svn: 290692	2016-12-28 23:24:02 +00:00
Justin Lebar	e38b309355	Speed up Function::isIntrinsic() by adding a bit to GlobalValue. NFC Summary: Previously isIntrinsic() called getName(). This involves a hashtable lookup, so is nontrivially expensive. And isIntrinsic() is called frequently, particularly by dyn_cast<IntrinsicInstr>. This patch steals a bit of IntID and uses that to store whether or not getName() starts with "llvm." Reviewers: bogner, arsenm, joker-eph Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D22949 llvm-svn: 290691	2016-12-28 22:59:45 +00:00
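The caching trick described in the commit above, recording once whether a name starts with "llvm." so that later queries avoid the name lookup, can be sketched with a toy class. `ToyGlobal` and its members are hypothetical stand-ins, not LLVM's `GlobalValue`. The 31/1 bit split is an assumption chosen only to show one bit being taken from an existing field.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a named global; not LLVM's GlobalValue.
class ToyGlobal {
  std::string Name;
  unsigned IntID : 31;              // assumed width, for illustration only
  unsigned HasLLVMReservedName : 1; // cached "name starts with llvm." bit

public:
  ToyGlobal() : IntID(0), HasLLVMReservedName(0) {}

  // Update the cached bit whenever the name changes.
  void setName(std::string NewName) {
    Name = std::move(NewName);
    HasLLVMReservedName = Name.compare(0, 5, "llvm.") == 0;
  }

  unsigned getIntrinsicID() const { return IntID; }

  // Constant-time query: reads the cached bit instead of inspecting Name.
  bool isIntrinsic() const { return HasLLVMReservedName; }
};
```

The cost moves from every query to every rename, which pays off when queries are far more frequent than renames.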
Mehdi Amini	57004ae3ac	Add an index for Module Metadata record in the bitcode This index records the position for each metadata record in the bitcode, so that the reader will be able to lazy-load on demand each individual record. We also make sure that every abbrev is emitted upfront so that the block can be skipped while reading. I don't plan to commit this before having the reader counterpart, but I figured this can be reviewed mostly independently. Recommit r290684 (was reverted in r290686 because a test was broken) after adding a threshold to avoid emitting the index when unnecessary (little amount of metadata). This optimization "hides" a limitation of the ability to backpatch in the bitstream: we can only backpatch safely when the position has been flushed. So if we emit an index for one metadata, it is possible that (part of) the offset placeholder hasn't been flushed and the backpatch will fail. Differential Revision: https://reviews.llvm.org/D28083 llvm-svn: 290690	2016-12-28 22:30:28 +00:00
Saleem Abdulrasool	4242753013	Revert "Add an index for Module Metadata record in the bitcode" This reverts commit a0ca6ae2d38339e4ede0dfa588086fc23d87e836. Revert at Mehdi's request as it is breaking bots. llvm-svn: 290686	2016-12-28 20:37:22 +00:00
Piotr Padlewski	a0e01c6faf	[NewGVN] replace emplace_back with push_back emplace_back is not faster if it is equivalent to push_back. In these cases the emplaced value had the same type as the one stored in the container. It is ugly and it might be even slower (see Scott Meyers presentation about emplacement). llvm-svn: 290685	2016-12-28 20:36:08 +00:00
Mehdi Amini	93db4a9bcc	Add an index for Module Metadata record in the bitcode Summary: This index records the position for each metadata record in the bitcode, so that the reader will be able to lazy-load on demand each individual record. We also make sure that every abbrev is emitted upfront so that the block can be skipped while reading. I don't plan to commit this before having the reader counterpart, but I figured this can be reviewed mostly independently. Reviewers: pcc, tejohnson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28083 llvm-svn: 290684	2016-12-28 19:44:19 +00:00
Piotr Padlewski	2f2307753a	[NewGVN] Simplify loop NFC llvm-svn: 290683	2016-12-28 19:42:49 +00:00
Mehdi Amini	39e06ea43b	[ThinLTO] Honor -O{0,1,2,4} passed through the libLTO interface for ThinLTO This was hardcoded to be O3 till now, without any way to change it without changing the code. llvm-svn: 290682	2016-12-28 19:37:16 +00:00
Piotr Padlewski	d0c9c03d55	[NewGVN] replace typedefs with usings llvm-svn: 290680	2016-12-28 19:29:26 +00:00
Piotr Padlewski	14aed02753	[NewGVN] NFC fixes llvm-svn: 290679	2016-12-28 19:17:17 +00:00
Reid Kleckner	911a079ad9	[WinEH] Don't assume endFunction is called while in .text Jump table emission can switch to .rdata before WinException::endFunction gets called. Just remember the appropriate text section we started in and reset back to it when we end the function. We were already switching sections back from .xdata anyway. Fixes the first problem in PR31488, so that now COFF switch tables can live in .rdata if we want them to. llvm-svn: 290678	2016-12-28 19:05:12 +00:00
Davide Italiano	339bff25b9	[NewGVN] Global sweep replacing NULL with nullptr. NFCI. llvm-svn: 290670	2016-12-28 14:00:11 +00:00
Davide Italiano	77ff190de6	[NewGVN] Remove redundant code. NFCI. llvm-svn: 290669	2016-12-28 13:54:16 +00:00
Davide Italiano	019dd7e862	[NewGVN] equals() for loads/stores is the same. Unify. Differential Revision: https://reviews.llvm.org/D28116 llvm-svn: 290667	2016-12-28 13:37:17 +00:00
Chandler Carruth	176c94be9c	[PM] Introduce a devirtualization iteration layer for the new PM. This is an orthogonal and separated layer instead of being embedded inside the pass manager. While it adds a small amount of complexity, it is fairly minimal and the composability and control seems worth the cost. The logic for this ends up being nicely isolated and targeted. It should be easy to experiment with different iteration strategies wrapped around the CGSCC bottom-up walk using this kind of facility. The mechanism used to track devirtualization is the simplest one I came up with. I think it handles most of the cases the existing iteration machinery handles, but I haven't done a very in depth analysis. It does however match the basic intended semantics, and we can tweak or tune its exact behavior incrementally as necessary. One thing that we may want to revisit is freshly building the value handle set on each iteration. While I don't think this will be a significant cost (it is strictly fewer value handles but more churn of value handles than the old call graph), it is conceivable that we'll want a somewhat more clever tracking mechanism. My hope is to layer that on as a follow up patch with data supporting any implementation complexity it adds. This code also provides for a basic count heuristic: if the number of indirect calls decreases and the number of direct calls increases for a given function in the SCC, we assume devirtualization is responsible. This matches the heuristics currently used in the legacy pass manager. Differential Revision: https://reviews.llvm.org/D23114 llvm-svn: 290665	2016-12-28 11:07:33 +00:00
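The count heuristic stated at the end of the commit above can be written down directly. This is only a sketch of the stated rule: `CallCounts` and `looksDevirtualized` are hypothetical names, and the real pass tracks calls through value handles rather than two plain integers.

```cpp
#include <cassert>

// Hypothetical per-function call counts sampled before and after a pass runs.
struct CallCounts {
  int Direct;
  int Indirect;
};

// Rule from the commit message: fewer indirect calls and more direct calls
// is taken as a sign that devirtualization happened.
bool looksDevirtualized(const CallCounts &Before, const CallCounts &After) {
  return After.Indirect < Before.Indirect && After.Direct > Before.Direct;
}
```

Both conditions must hold together, so merely deleting an indirect call does not count as devirtualization under this rule.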
Chandler Carruth	90f7376682	[PM] Teach the CGSCC's CG update utility to more carefully invalidate analyses when we're about to break apart an SCC. We can't wait until after breaking apart the SCC to invalidate things: 1) Which SCC do we then invalidate? All of them? 2) Even if we invalidate all of them, a newly created SCC may not have a proxy that will convey the invalidation to functions! Previously we only invalidated one of the SCCs and too late. This led to stale analyses remaining in the cache. And because the caching strategy actually works, they would get used and chaos would ensue. Doing invalidation early is somewhat pessimizing though if we know that the SCC structure won't change. So it turns out that the design to make the mutation API force the caller to know the kind of mutation in advance was indeed 100% correct and we didn't do enough of it. So this change also splits two cases of switching a call edge to a ref edge into two separate APIs so that callers can clearly test for this and take the easy path without invalidating when appropriate. This is particularly important in this case as we expect most inlines to be between functions in separate SCCs and so the common case is that we don't have to so aggressively invalidate analyses. The LCG API change in turn needed some basic cleanups and better testing in its unittest. No interesting functionality changed there other than more coverage of the returned sequence of SCCs. While this seems like an obvious improvement over the current state, I'd like to revisit the core concept of invalidating within the CG-update layer at all. I'm wondering if we would be better served forcing the callers to handle the invalidation beforehand in the cases that they can handle it. An interesting example is when we want to teach the inliner to update and preserve analyses. But we can cross that bridge when we get there. 
With this patch, the new pass manager can build all of the LLVM test suite at -O3 and everything passes. =D I haven't bootstrapped yet and I'm sure there are still plenty of bugs, but this gives a nice baseline so I'm going to increasingly focus on fleshing out the missing functionality, especially the bits that are just turned off right now in order to let us establish this baseline. llvm-svn: 290664	2016-12-28 10:34:50 +00:00
Gadi Haber	51f2170fda	This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible. There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers. The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled. Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky Differential Revision: https://reviews.llvm.org/D27901 llvm-svn: 290663	2016-12-28 10:12:48 +00:00
Chandler Carruth	69a65a83e5	[PM] Teach the inliner's call graph update to handle inserting new edges when they are call edges at the leaf but may (transitively) be reached via ref edges. It turns out there is a simple rule: insert everything as a ref edge which is a safe conservative default. Then we let the existing update logic handle promoting some of those to call edges. Note that it would be fairly cheap to make these call edges right away if that is desirable by testing whether there is some existing call path from the source to the target. It just seemed like slightly more complexity in this code path that isn't strictly necessary. If anyone feels strongly about handling this differently I'm happy to change it. llvm-svn: 290649	2016-12-28 03:13:12 +00:00
Craig Topper	3c08d598fe	[InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC llvm-svn: 290648	2016-12-28 03:12:42 +00:00
Chandler Carruth	870a2669f0	[PM] Actually commit the test update that was supposed to accompany r290644. Sorry for this. llvm-svn: 290646	2016-12-28 02:31:24 +00:00
Chandler Carruth	ff1b18d787	[LCG] Teach the ref edge removal to handle a ref edge that is trivial due to a call cycle. This actually crashed the ref removal before. I've added a unittest that covers this kind of interesting graph structure and mutation. llvm-svn: 290645	2016-12-28 02:24:58 +00:00
Chandler Carruth	34272b757e	[PM] Disable the loop vectorizer from the new PM's pipeline as it currently relies on the old PM's dependency system forming LCSSA. The new PM will require a different design for this, and for now this is causing most of the issues I'm currently seeing in testing. I'd like to get to a testable baseline and then work on re-enabling things one at a time. llvm-svn: 290644	2016-12-28 02:24:55 +00:00
Michael Kuperstein	04dff9bc7a	[InstCombine] Canonicalize insert splat sequences into an insert + shuffle This adds a combine that canonicalizes a chain of inserts which broadcasts a value into a single insert + a splat shufflevector. This fixes PR31286. Differential Revision: https://reviews.llvm.org/D27992 llvm-svn: 290641	2016-12-28 00:18:08 +00:00
Kostya Serebryany	d6593db5e1	[libFuzzer] add an experimental flag -experimental_len_control=1 that sets max_len to 1M and tries to increase the actual max sizes of mutations very gradually (second attempt) llvm-svn: 290637	2016-12-27 23:24:55 +00:00
Eric Fiselier	03477e9665	Mark comparator call operator as const llvm-svn: 290636	2016-12-27 23:15:58 +00:00
Kostya Serebryany	b6d58e94d4	[libFuzzer] don't create large random mutations when given an empty seed llvm-svn: 290634	2016-12-27 22:15:04 +00:00
Kostya Serebryany	aece9ad2f5	[sanitizer-coverage] sort the switch cases llvm-svn: 290628	2016-12-27 21:20:06 +00:00
Hemant Kulkarni	2cd4d50d15	llvm-readobj: ELF: Make DT tags machine aware llvm-svn: 290623	2016-12-27 19:59:29 +00:00
Kostya Serebryany	647bec73f9	[libFuzzer] fix UB and simplify the computation of the RNG seed (https://llvm.org/bugs/show_bug.cgi?id=31456 ) llvm-svn: 290622	2016-12-27 19:51:34 +00:00

142736 Commits