llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Diego Novillo	372bf7dc64	SamplePGO - Do not count never-executed inlined functions when computing coverage. If a function was originally inlined but not actually hot at runtime, its samples will not be counted inside the parent function. This throws off the coverage calculation because it expects to find more used records than it should. Fixed by ignoring functions that will not be inlined into the parent. Currently, this is inlined functions with 0 samples. In subsequent patches, I'll change this to mean "cold" functions. llvm-svn: 253716	2015-11-20 21:46:38 +00:00
Tilmann Scheller	a99f5d534e	Revert "[FunctionAttrs] Remove redundant assignment." This reverts r253661. Turns out that the assignment is not redundant (despite the Clang static analyzer claiming the opposite). The variable is being used by the lambda function AddUsersToWorklistIfCapturing(). llvm-svn: 253696	2015-11-20 19:17:10 +00:00
Diego Novillo	4255fabac8	SamplePGO - Add line offset and discriminator information to sample reports. While debugging some sampling coverage problems, I found this useful: When applying samples from a profile, it helps to also know what line offset and discriminator the sample belongs to. This makes it easy to correlate against the input profile. llvm-svn: 253670	2015-11-20 15:39:42 +00:00
Tilmann Scheller	baba8378a4	[FunctionAttrs] Remove redundant assignment. Identified by the Clang static analyzer. llvm-svn: 253661	2015-11-20 12:51:58 +00:00
James Molloy	2208ca52dd	[GlobalOpt] Localize some globals that have non-instruction users We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive. llvm-svn: 253584	2015-11-19 18:04:33 +00:00
James Molloy	b585b0aee8	[FunctionAttrs] Provide a mechanism for adding function attributes from the command line This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc). The syntax is -force-attribute=function_name:attribute_name All function attributes are parsed except alignstack as it requires an argument. llvm-svn: 253550	2015-11-19 08:49:57 +00:00
James Molloy	b92ba28077	[LTO] Add an early run of functionattrs Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt. llvm-svn: 253451	2015-11-18 11:24:42 +00:00
Elena Demikhovsky	0600baa03e	Vector of pointers in function attributes calculation While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers. I added vector-of-pointers to the call arguments types that should be checked. Differential Revision: http://reviews.llvm.org/D14693 llvm-svn: 253363	2015-11-17 19:30:51 +00:00
James Molloy	63f33470bb	[GlobalOpt] Address post-commit review comments on r253168 Address Duncan Exon Smith's comments on D14148, which was added after the patch had been LGTM'd and committed: * clang-format one area where whitespace diffs occurred. * Add a threshold to limit the store/load dominance checks as they are quadratic. llvm-svn: 253192	2015-11-16 10:16:22 +00:00
Benjamin Kramer	c51f76e89b	Move helper classes into anonymous namespaces. NFC. llvm-svn: 253189	2015-11-16 09:01:28 +00:00
James Molloy	85bd37fc58	[GlobalOpt] Demote globals to locals more aggressively Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized. For a global to be demoted, it must only be accessed by one function and that function: 1. Must never recurse directly or indirectly, else the GV would be clobbered. 2. Must never rely on the value in GV at the start of the function (apart from the initializer). GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once. In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason. The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing. llvm-svn: 253168	2015-11-15 14:21:37 +00:00
James Molloy	7d379efeea	[GlobalOpt] Make sure all debug lines end with '\n' GlobalVariable::print() used to emit a newline. It hasn't for a while now, but these debug lines weren't updated. llvm-svn: 253030	2015-11-13 11:05:13 +00:00
James Molloy	e22291bd4d	[GlobalOpt] Coding style - remove function names from doxygen comments Suggested by Mehdi in the review of D14148. llvm-svn: 253029	2015-11-13 11:05:07 +00:00
Akira Hatanaka	a41bf2744e	Revert r252990. Some of the buildbots are still failing. llvm-svn: 252999	2015-11-13 01:44:32 +00:00
Akira Hatanaka	5df31fbb8f	Provide a way to specify inliner's attribute compatibility and merging. This reapplies r252949. I've changed the type of FuncName to be std::string instead of StringRef in emitFnAttrCompatCheck. Original commit message for r252949: Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252990	2015-11-13 01:23:11 +00:00
Akira Hatanaka	8642e85c17	Revert r252949. It broke some of the bots including clang-x64-ninja-win7. llvm-svn: 252951	2015-11-12 21:19:18 +00:00
Akira Hatanaka	ca7dc7a319	Provide a way to specify inliner's attribute compatibility and merging rules using table-gen. NFC. This commit adds new classes CompatRule and MergeRule to Attributes.td, which are used to generate code to check attribute compatibility and merge attributes of the caller and callee. rdar://problem/19836465 llvm-svn: 252949	2015-11-12 20:59:43 +00:00
James Molloy	64e756bd07	Revert "Revert "[FunctionAttrs] Identify norecurse functions"" This reapplies this patch, with test fixes. llvm-svn: 252871	2015-11-12 10:55:20 +00:00
James Molloy	9cd723ec63	Revert "[FunctionAttrs] Identify norecurse functions" This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened. llvm-svn: 252863	2015-11-12 09:05:43 +00:00
James Molloy	4da975f03a	[FunctionAttrs] Identify norecurse functions A function can be marked as norecurse if: * The SCC to which it belongs has cardinality 1; and either a) It does not call any non-norecurse function. This includes self-recursion; or b) It only has one callsite and the function that callsite is within is marked norecurse. a) is best propagated bottom-up and b) is best propagated top-down. We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down. llvm-svn: 252862	2015-11-12 08:53:04 +00:00
David Majnemer	3deb8be573	[IR] Add support for empty tokens When working with tokens, it is often the case that one has instructions which consume a token and produce a new token. Currently, we have no mechanism to represent an initial token state. Instead, we can create a notional "empty token" by inventing a new constant which captures the semantics we would like. This new constant is called ConstantTokenNone and is written textually as "token none". Differential Revision: http://reviews.llvm.org/D14581 llvm-svn: 252811	2015-11-11 21:57:16 +00:00
Oliver Stannard	989496fc9c	GlobalOpt should maintain externally_initialized when splitting aggregates When GlobalOpt splits an internal, global variable with an aggregate type, it should propagate the externally_initialized flag to the newly created globals. This makes the pass safe for our downstream use of this flag, while still allowing some useful optimisations (such as removing dead parts of the split aggregate) to be performed. Differential Revision: http://reviews.llvm.org/D13382 llvm-svn: 252490	2015-11-09 16:47:16 +00:00
Sanjoy Das	62bf9f3dd6	Unbreak the build My code clashed with some ilist iterator changes upstream. Fix by adding an explicit "&*" coercion. llvm-svn: 252392	2015-11-07 02:26:53 +00:00
Sanjoy Das	c550b2c217	[FunctionAttrs] Add comment and clarify assertion message; NFC llvm-svn: 252389	2015-11-07 01:56:07 +00:00
Sanjoy Das	6b6a5c9388	[FunctionAttrs] Add handling for operand bundles Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387	2015-11-07 01:56:00 +00:00
Sanjoy Das	9dbcbc0397	[FunctionAttrs] Fix an iterator wraparound bug Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386	2015-11-07 01:55:53 +00:00
Duncan P. N. Exon Smith	2c4e1a2f50	ADT: Remove last implicit ilist iterator conversions, NFC Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371	2015-11-07 00:01:16 +00:00
Peter Collingbourne	5b721561aa	DI: Reverse direction of subprogram -> function edge. Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219	2015-11-05 22:03:56 +00:00
Sanjoy Das	3c5b8b4566	[FunctionAttrs] Remove a loop, NFC refactor Summary: Remove the loop over the uses of the CallSite in ArgumentUsesTracker. Since we have the `Use *` for actual argument operand, we can just use pointer subtraction. The time complexity remains the same though (except for a vararg argument) -- `std::advance` is O(UseIndex) for the ArgumentList iterator. The real motivation is to make a later change adding support for operand bundles simpler. Reviewers: reames, chandlerc, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14363 llvm-svn: 252141	2015-11-05 03:04:40 +00:00
Adam Nemet	933537bc6b	LLE 6/6: Add LoopLoadElimination pass Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.: for (i) A[i + 1] = A[i] + B[i] => T = A[0] for (i) T = T + B[i] A[i + 1] = T The pass relies on loop dependence analysis via LoopAccessAnalisys to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation. This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence. In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-indepedent redundant loads and store that require versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this. The main motiviation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register depedence (even though the HW does perform store-to-load fowarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%. Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT The pass is scheduled after the loop vectorizer (which is after loop distribution). The rational is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding to kick in. LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in. Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 llvm-svn: 252017	2015-11-03 23:50:08 +00:00
Teresa Johnson	90b0eac682	Restore "Support for ThinLTO function importing and symbol linking." This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures. llvm-svn: 251866	2015-11-03 00:14:15 +00:00
Teresa Johnson	077216c4b1	Revert "Support for ThinLTO function importing and symbol linking." This reverts commit r251837, due to a number of bot failures of the form: /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::Module const, bool)' /home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()' I'm not sure why these are happening - I added Object to the requred libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these symbols come out of libLLVMObject.a. What am I missing? llvm-svn: 251841	2015-11-02 22:17:32 +00:00
Teresa Johnson	30954fb923	Support for ThinLTO function importing and symbol linking. Summary: Support for necessary linkage changes and symbol renaming during ThinLTO function importing. Also includes llvm-link support for manually importing functions and associated llvm-link based tests. Note that this does not include support for intelligently importing metadata, which is currently imported duplicate times. That support will be in the follow-on patch, and currently is ignored by the tests. Reviewers: dexonsmith, joker.eph, davidxl Subscribers: tobiasvk, tejohnson, llvm-commits Differential Revision: http://reviews.llvm.org/D13515 llvm-svn: 251837	2015-11-02 21:39:10 +00:00
David Blaikie	a98b9cfa54	StringRef-ify DiagnosticInfoSampleProfile::Filename llvm-svn: 251823	2015-11-02 20:01:13 +00:00
Teresa Johnson	1cf56803da	Clang format a few prior patches (NFC) I had clang formatted my earlier patches using the wrong style. Reformatted with the LLVM style. llvm-svn: 251812	2015-11-02 18:02:11 +00:00
Diego Novillo	b170ccb76f	SamplePGO - Count sample records in embedded profiles when computing coverage. The initial coverage checking code for sample records failed to count records inside inlined profiles. This change fixes the oversight. llvm-svn: 251752	2015-10-31 21:53:58 +00:00
Chandler Carruth	cabded8f2b	[FunctionAttrs] Inline the prototype attribute inference to an existing loop over the SCC. The separate function wasn't really adding much, NFC. llvm-svn: 251728	2015-10-31 00:28:37 +00:00
Justin Bogner	4a77d84be2	[PM] Port StripDeadPrototypes to the new pass manager This is a really straightforward port. Also adds a test for the pass, since it only seemed to be tested tangentially before. llvm-svn: 251726	2015-10-30 23:28:12 +00:00
Justin Bogner	7d746e1b5b	Whitespace. NFC llvm-svn: 251724	2015-10-30 23:02:38 +00:00
Chandler Carruth	5b1a50aeaa	[FunctionAttrs] Separate another chunk of the logic for functionattrs from its pass harness by providing a lambda to query for AA results. This allows the legacy pass to easily provide a lambda that uses the special helpers to construct function AA results from a legacy CGSCC pass. With the new pass manager (the next patch) the lambda just directly wraps the intuitive query API. llvm-svn: 251715	2015-10-30 16:48:08 +00:00
Chandler Carruth	f6cef1566e	[FunctionAttrs] Provide a single SCC node set to all of the transformations in FunctionAttrs rather than building a new one each time. This isn't trivial because there are different heuristics from different passes for exactly what set they want. The primary difference is whether an overridable function completely disables the synthesis of attributes. I've modeled this by directly testing for overridable, and using the common set that excludes external and opt-none functions. This does cause some changes by disabling more optimizations in the face of opt-none. Specifically, we were still optimizing calls to opt-none functions based on their attributes, just not the bodies. It seems better to be conservative on both fronts given the intended semanticas here (best effort to not assume or disturb anything). I've not tried to test this change as it seems complex, brittle, and not important to the implicit contract of opt-none. Instead, it seems more like a choice that should be dictated by the simplified implementation and the change to be acceptable differences within the space of opt-none. A big benefit here is that these transformations no longer rely on the legacy pass manager's SCC types, they just work on generic sets of function pointers. This will make it easy to re-use their logic in the new pass manager. I've also made the transforms static functions instead of members where trivial while I was touching the signatures. llvm-svn: 251640	2015-10-29 18:29:15 +00:00
Daniel Jasper	d024d95356	Fix use-after-free. Thanks ASAN for giving me a detailed report :-). llvm-svn: 251623	2015-10-29 12:49:37 +00:00
Diego Novillo	9692d53ba5	SamplePGO - Add flag to check sampling coverage. This adds the flag -mllvm -sample-profile-check-coverage=N to the SampleProfile pass. N is the percent of input sample records that the user expects to apply. If the pass does not use N% (or more) of the sample records in the input, it emits a warning. This is useful to detect some forms of stale profiles. If the code has drifted enough from the original profile, there will be records that do not match the IR anymore. This will not detect cases where a sample profile record for line L is referring to some other instructions that also used to be at line L. llvm-svn: 251568	2015-10-28 22:30:25 +00:00
Diego Novillo	512c00aec3	SamplePGO - Clear per-function data after applying a profile. The pass was keeping around a lot of per-function data (visited blocks, edges, dominance, etc) that is just taking up memory for no reason. In fact, from function to function it could potentially confuse the propagator since some maps are indexed by line offsets which can be common between functions. llvm-svn: 251531	2015-10-28 17:40:22 +00:00
James Molloy	b82cf99fec	[GlobalOpt] Add newlines to DEBUG messages I think these were affected by a change way back when to stop printing newlines in Value::dump() by default. This change simply allows the debug output to be readable. NFC. llvm-svn: 251517	2015-10-28 14:30:53 +00:00
Diego Novillo	0ad41ba447	Tidy a comment. NFC. llvm-svn: 251434	2015-10-27 18:41:46 +00:00
Diego Novillo	13365b7f1f	Fix SamplePGO segfault when debug info is missing. When emitting a remark for a conditional branch annotation, the remark uses the line location information of the conditional branch in the message. In some cases, that information is unavailable and the optimization would segfaul. I'm still not sure whether this is a bug or WAI, but the optimizer should not die because of this. llvm-svn: 251420	2015-10-27 17:37:00 +00:00
Chandler Carruth	33adb879ee	[function-attrs] Refactor code to handle shorter code with early exits. No functionality changed here, but the indentation is substantially reduced and IMO the code is much easier to read. I've also added some helpful comments. This is just a clean-up I wrote while studying the code, and that has been in my backlog for a while. llvm-svn: 251381	2015-10-27 01:41:43 +00:00
Diego Novillo	4d3e41c2db	Remove unused local variable. NFC. llvm-svn: 251344	2015-10-26 20:50:26 +00:00
Diego Novillo	73fa815dcb	SamplePGO - Add optimization reports. This adds a couple of optimization remarks to the SamplePGO transformation. When it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment: 6 if (i < 1000) 7 sum -= i; 8 else 9 sum += -i * rand(); If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile] sum += -i * rand(); ^ llvm-svn: 251330	2015-10-26 18:52:53 +00:00

1 2 3 4 5 ...

2272 Commits