llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Hongtao Yu	7761552c73	[CSSPGO] Tweaking inlining with pseudo probes. Fixing up a couple places where `getCallSiteIdentifier` is needed to support pseudo-probe-based callsites. Also fixing an issue in the extbinary profile reader where the metadata section is not fully scanned based on the number of profiles loaded only for the current module. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D95791	2021-02-01 13:56:40 -08:00
Hongtao Yu	9b80fe63e4	[CSSPGO] Support of CS profiles in extended binary format. This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D95547	2021-01-27 21:29:46 -08:00
Kazu Hirata	69d44af40a	[llvm] Use isDigit (NFC)	2021-01-21 19:59:50 -08:00
Wei Mi	c2ff32c8b0	[SampleFDO] Add support for splitting function profiles with context into separate sections. For ThinLTO, all function profiles without context have already been annotated onto outline functions, where possible, in the prelink phase. In the postlink phase, profile annotation is only meaningful for function profiles with context. If the profile is large, it is better to split it into two parts, one with context and one without, so that profile reading in the postlink phase only has to read the part with context. To support profile splitting, we extend the ExtBinary format to support different section arrangements. This makes it easy to add other section layouts in the future without creating a new class inheriting from the ExtBinary class. Differential Revision: https://reviews.llvm.org/D94435	2021-01-19 15:16:19 -08:00
Kazu Hirata	56ec2fba1e	[llvm] Populate std::vector at construction time (NFC)	2021-01-18 10:16:33 -08:00
Kazu Hirata	3109d596ee	[llvm] Use llvm::append_range (NFC)	2021-01-06 18:27:33 -08:00
Alan Phipps	f85fa6973c	[Coverage] Add support for Branch Coverage in LLVM Source-Based Code Coverage This is an enhancement to LLVM Source-Based Code Coverage in clang to track how many times individual branch-generating conditions are taken (evaluate to TRUE) and not taken (evaluate to FALSE). Individual conditions may comprise larger boolean expressions using boolean logical operators. This functionality is very similar to what is supported by GCOV except that it is very closely anchored to the ASTs. Differential Revision: https://reviews.llvm.org/D84467	2021-01-05 09:51:51 -06:00
Simon Pilgrim	b4343abe68	[ProfileData] GCOVFile::readGCNO - silence undefined pointer warning. NFCI. Silence clang static analyzer warning that 'fn' could still be in an undefined state - this shouldn't happen depending on the likely tag order, but the analyzer can't know that.	2021-01-04 16:50:05 +00:00
Wei Mi	74da8abf7f	[NFC][SampleFDO] Preparation to support multiple sections with the same type in ExtBinary format. Currently the ExtBinary format doesn't support multiple sections with the same type in the profile. We add that support in this patch. Previously we used the section type to identify a section uniquely. Now we introduce a LayoutIndex in the SecHdrTableEntry and use the LayoutIndex to locate the target section. The allocations of NameTable and FuncOffsetTable are adjusted accordingly. Currently it works as an NFC because it won't change anything for the current layout. The test for multiple-section support will be included in another patch where a new type of profile containing multiple sections with the same type is introduced. Differential Revision: https://reviews.llvm.org/D93254	2020-12-16 22:28:45 -08:00
Hongtao Yu	66121fdf6a	[CSSPGO] Consume pseudo-probe-based AutoFDO profile This change enables pseudo-probe-based sample counts to be consumed by the sample profile loader under the regular `-fprofile-sample-use` switch with minimal adjustments to the existing sample file formats. After the counts are imported, a probe helper, aka, a `PseudoProbeManager` object, is automatically launched to verify the CFG checksum of every function in the current compilation against the corresponding checksum from the profile. Mismatched checksums will cause a function profile to be skipped. A `SampleProfileProber` pass is scheduled before any of the `SampleProfileLoader` instances so that the CFG checksums as well as probe mappings are available during the profile loading time. The `PseudoProbeManager` object is set up right after the profile reading is done. In the future a CFG-based fuzzy matching could be done in `PseudoProbeManager`. Samples will be applied only to pseudo probe instructions as well as probed callsites once the checksum verification goes through. Those instructions are processed in the same way that regular instructions would be processed in the line-number-based scenario. In other words, a function is processed in a regular way as if it was reduced to just containing pseudo probes (block probes and callsites). Adjustment to profile format: A CFG checksum field is being added to the existing AutoFDO profile formats. So far only the text format and the extended binary format are supported. For the text format, a new line like ``` !CFGChecksum: 12345 ``` is added to the end of the body sample lines. For the extended binary profile format, we introduce a metadata section to store the checksum map from function names to their CFG checksums. Differential Revision: https://reviews.llvm.org/D92347	2020-12-16 15:57:18 -08:00
Fangrui Song	506848f563	[llvm-cov gcov] Replace Donald B. Johnson's cycle enumeration with iterative cycle finding gcov computes the line execution count as the sum of (a) counts from predecessors on other lines and (b) the sum of loop execution counts of blocks on the same line (think of loops on one line). For (b), we use Donald B. Johnson's cycle enumeration algorithm and perform cycle cancelling for each cycle. The number of candidate cycles was exponential, and D93036 made it polynomial by skipping zero count cycles. The time complexity is high, O(VE^2) (it could be O(E^2), but the linear `Blocks` check made it higher), and the implementation is complex. We could just identify loops and sum all back edges. However, this requires a dominator tree construction which is more complex. The time complexity can be decreased to almost linear, though. This patch just performs cycle cancelling iteratively. Add two members `traversable` and `incoming` to GCOVArc. There are 3 states: * `!traversable`: blocks not on this line or explored blocks * `traversable && incoming == nullptr`: unexplored blocks * `traversable && incoming != nullptr`: blocks which are being explored (on the stack) If an arc points to a block being explored, a cycle has been found. Let E be the number of arcs. Every time a cycle is found, at least one arc is saturated (`edgeCount` reduced to 0), so there are at most E cycles. Finding one cycle takes O(E) time, so the overall time complexity is O(E^2). Note that we always augment through a back edge and never need to augment its reverse edge so reverse edges in traditional flow networks are not needed. Reviewed By: xinhaoyuan Differential Revision: https://reviews.llvm.org/D93073	2020-12-11 18:28:16 -08:00
Xinhao Yuan	38310b229d	[llvm-cov][gcov] Optimize the cycle counting algorithm by skipping zero count cycles This change is similar to http://gcc.gnu.org/PR90380 This reduces the complexity from exponential to polynomial of the arcs. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93036	2020-12-10 15:22:29 -08:00
Wei Mi	2b916be8fe	[SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used. Currently during sample profile loading, NameTable has to be loaded entirely up front before any name string is retrieved. That is because NameTable is stored using ULEB128 encoding and cannot be directly accessed like an array. However, if MD5 is used to represent name in the NameTable, it has fixed length. If MD5 names are stored in uint64_t type instead of ULEB128, NameTable can be accessed like an array then in many cases only part of the NameTable has to be read. This is helpful for reducing compile time especially when small source file is compiled. We find that after this change, the elapsed time to build a large application distributively is reduced by 5% and the accumulative cpu time used for building is also reduced by 5%. The size of the profile is slightly reduced with this change by ~0.2%, and that also indicates encoding MD5 in ULEB128 doesn't save the storage space. Differential Revision: https://reviews.llvm.org/D92621	2020-12-08 16:21:01 -08:00
wlei	db7fa377e4	[CSSPGO][llvm-profgen] Context-sensitive profile data generation This stack of changes introduces the `llvm-profgen` utility, which generates a profile data file from given perf script data files for sample-based PGO. It is part of (but not limited to) the CSSPGO work. Specifically, to support context-sensitive profiles with or without pseudo probes, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. High throughput is achieved by multiple levels of sample aggregation, and a compatible format is generated in one stop at the end. Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC. This change adds context-sensitive profile data generation to llvm-profgen. With simultaneous sampling of LBR and call stack, we can identify the leaf of an LBR sample with the calling context from the stack sample. During the process of deriving the fall-through path from LBR entries, we unwind the LBR by replaying all the calls and returns (including implicit calls/returns due to inlining) backwards on top of the sampled call stack. The state of the call stack as we unwind through the LBR then always represents the calling context of the current fall-through path. We have two types of virtual unwinding: 1) LBR unwinding and 2) linear range unwinding. Specifically, for each LBR entry, which can be classified as a call, return, or regular branch, LBR unwinding replays the operation by pushing, popping, or switching the leaf frame of the call stack. Since the initial call stack is the most recently sampled, the replay should be in anti-execution order, i.e. for the regular case, pop the call stack when the LBR entry is a call and push a frame onto the call stack when the LBR entry is a return. After each LBR entry is processed, it also needs to align with the next LBR entry by going through the instructions from the previous LBR's target to the current LBR's source, which we call linear unwinding. 
As instructions in a linear range can come from different functions because of inlining, linear unwinding splits the range and records counters for each sub-range with the same inline context. With each fall-through path from LBR unwinding, we aggregate each sample into counters by calling context and eventually generate a fully context-sensitive profile (without relying on inlining) to drive the compiler's PGO/FDO. A breakdown of noteworthy changes: - Added the `HybridSample` class as the abstraction of a perf sample including LBR stack and call stack * Extended `PerfReader` to auto-detect whether the input perf script output contains a CS profile, then do the parsing. Multiple `HybridSample` are extracted * Sped up by aggregating `HybridSample` into `AggregatedSamples` * Added VirtualUnwinder, which consumes aggregated `HybridSample` and implements unwinding of calls, returns, and linear paths that contain implicit calls/returns from inlining. Range and branch counters are aggregated by calling context. Here the calling context is a string; each context is a pair of function name and callsite location info, and the whole context looks like `main:1 @ foo:2 @ bar`. * Added ProfileGenerator, which accumulates counters by range unfolding or branch target mapping, then generates the context-sensitive function profile including function body, inferred callee head sample, and callsite target samples, and eventually records it into ProfileMap. * Leveraged LLVM's built-in writer (`SampleProfWriter`) to support different serialization formats with no stop - `getCanonicalFnName` for callee name and name from ELF section - Added regression tests for both unwinding and profile generation Test Plan: ninja & ninja check-llvm Reviewed By: hoy, wenlei, wmi Differential Revision: https://reviews.llvm.org/D89723	2020-12-07 13:48:58 -08:00
Wenlei He	6ab8756fe0	[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining This change adds the context-sensitive sample PGO infrastructure described in the CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduces an abstraction between the input profile and the profile loader that queries the input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with the top-down profile-guided inliner in the profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to the CGSCC inliner in order for it to take advantage of context-sensitive profiles. This change is the consumption part of context-sensitive profiles (the generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with Pseudo-probe are coming up soon. Currently the new infrastructure kicks in when the input profile contains the new context-sensitive profile; otherwise it's a no-op and does not affect existing AutoFDO. Interface There are two sets of interfaces, for query and tracking respectively, exposed from SampleContextTracker. For query, instead of simply getting a profile from input for a function, we can now explicitly query the base profile or the context profile for a given call path of a function. For tracking, there are separate APIs for marking a context profile as inlined, or promoting and merging a not-inlined context profile. - Query base profile (`getBaseSamplesFor`) Base profile is the merged synthetic profile for a function's CFG profile from any outstanding (not inlined) context. We can query base profile by function. 
- Query context profile (`getContextSamplesFor`) Context profile is a function's CFG profile for a given calling context. We can query context profile by context string. - Track inlined context profile (`markContextSamplesInlined`) When a function is inlined for a given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include the inlined context profile when synthesizing the base profile for that inlined function. - Track not-inlined context profile (`promoteMergeContextSamplesTree`) When a function is not inlined for a given calling context, we need to promote the context profile tree so the not-inlined context becomes a top-level context. This preserves the sub-context under that function so a later inline decision for that not-inlined function will still have a context profile for its call tree. Note that profiles will be merged if needed when promoting a context profile tree if any of the nodes already exists at its promoted destination. Implementation Implementation-wise, `SampleContext` is created as the abstraction for context. Currently it's a string for the call path, and we can later optimize it to something more efficient, e.g. a context id. Each `SampleContext` also has a `ContextState` indicating whether it's a raw context profile from input, whether it's inlined or merged, and whether it's a synthetic profile created by the compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's a base profile or a context profile, and for a context profile what the context and state are. On top of the above context representation, a custom trie tree is implemented to track and manage context profiles. Specifically, `SampleContextTracker` is implemented that encapsulates a trie tree with `ContextTrieNode` as node. Each node of the trie tree represents a frame in a calling context, thus the path from root to a node represents a valid calling context. 
We also track `FunctionSamples` for each node, so this trie tree can serve efficient query for context profile. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of entire tree, and merge nodes for subtree if this move encounters existing nodes. Integration `SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detected input profile contains context-sensitive profile, `SampleContextTracker` will be used to track profiles, and all profile query will go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`. Differential Revision: https://reviews.llvm.org/D90125	2020-12-06 11:49:18 -08:00
serge-sans-paille	82b6e6053d	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Wei Mi	9479e29c3c	[NFC][SampleFDO] Move some common stuff from SampleProfileReaderExtBinary/WriterExtBinary to their parent classes. SampleProfileReaderExtBinary/SampleProfileWriterExtBinary specify the typical section layout currently used by SampleFDO. Currently a lot of section reader/writer stay in the two classes. However, as we expect to have more types of SampleFDO profiles, we hope those new types of profiles can share the common sections while configuring their own sections easily with minimal change. That is why I move some common stuff from SampleProfileReaderExtBinary/SampleProfileWriterExtBinary to SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase so new profiles class inheriting from the base class can reuse them. Differential Revision: https://reviews.llvm.org/D89524	2020-10-22 15:56:55 -07:00
Hubert Tong	fc872167c3	Fix various format specifier mismatches Format specifiers of incorrect length are replaced with format specifier macros from `<cinttypes>` matching the typedefs used to declare the type of the value being printed. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D89637	2020-10-18 12:39:15 -04:00
Hiroshi Yamauchi	7e9ad11889	[PGO] Remove the old memop value profiling buckets. Following up D81682 and D83903, remove the code for the old value profiling buckets, which have been replaced with the new, extended buckets and disabled by default. Also syncing InstrProfData.inc between compiler-rt and llvm. Differential Revision: https://reviews.llvm.org/D88838	2020-10-15 10:09:49 -07:00
Vedant Kumar	267d2a2041	[llvm-cov] Warn when -arch spec is missing/invalid for universal binary (reland) llvm-cov reports a poor error message when the -arch specifier is missing or invalid, and a binary has multiple slices. Make the error message more specific. (This version of the patch avoids using llvm::none_of -- the way I used the utility caused compile errors on many bots, possibly because the wrong overload of `none_of` was selected.) rdar://40312677	2020-10-13 16:46:03 -07:00
Vedant Kumar	65259aae54	Revert "[llvm-cov] Warn when -arch spec is missing/invalid for universal binary" This reverts commit b81d4bfb44c14575130bb06c047728b69c3213aa. It's causing some bots to fail to build due to: "error: no matching function for call to ‘__iterator_category".	2020-10-13 16:32:31 -07:00
Vedant Kumar	40fbc09049	[llvm-cov] Warn when -arch spec is missing/invalid for universal binary llvm-cov reports a poor error message when the -arch specifier is missing or invalid, and a binary has multiple slices. Make the error message more specific. rdar://40312677	2020-10-13 16:29:26 -07:00
Zequan Wu	f60564004b	[Coverage] Add empty line regions to SkippedRegions Differential Revision: https://reviews.llvm.org/D84988	2020-09-21 12:42:53 -07:00
Fangrui Song	1bd4869627	[llvm-cov gcov] Add --demangled-names (-m) gcov 4.9 introduced the option.	2020-09-16 23:18:50 -07:00
Fangrui Song	c8bd947872	[llvm-cov gcov] Refactor counting and reporting The current organization of FileInfo and its referenced utility functions of (GCOVFile, GCOVFunction, GCOVBlock) is messy. Some members of FileInfo are just copied from GCOVFile. FileInfo::print (.gcov output and --intermediate output) is interleaved with branch statistics and computation of line execution counts. --intermediate has to do redundant .gcov output to gather branch statistics. This patch deletes lots of code and introduces a clearer work flow:
```
fn collectFunction
  for each block b
    for each line lineNum
      let line be LineInfo of the file on lineNum
      line.exists = 1
      increment function's lines & linesExec if necessary
      increment line.count
      line.blocks.push_back(&b)

fn collectSourceLine
  compute cycle counts
  count = incoming_counts + cycle_counts
  if line.exists
    ++summary->lines
    if line.count
      ++summary->linesExec

fn collectSource
  for each line
    call collectSourceLine

fn main
  for each function
    call collectFunction
    print function summary
  for each source file
    call collectSource
    print file summary
    annotate the source file with line execution counts
    if -i
      print intermediate file
```
The output order of functions and files now follows the original order in .gcno files.	2020-09-13 23:00:59 -07:00
Fangrui Song	0637c5d6a0	[llvm-cov gcov] Add -r (--relative-only) && -s (--source-prefix) gcov 4.7 introduced the two options. https://sourceware.org/pipermail/gcc-patches/2011-November/328782.html -r only dumps files with relative paths or absolute paths with the prefix specified by -s. The two options are useful filtering out system header files.	2020-09-13 14:54:20 -07:00
Fangrui Song	f6898867d5	[llvm-cov gcov] Improve accuracy when some edges are not measured Also guard against infinite recursion if GCOV_ARC_ON_TREE edges contain a cycle.	2020-09-12 22:33:41 -07:00
Fangrui Song	e83f76a177	[llvm-cov gcov] Simplify computation of line counts and exit block counter	2020-09-08 23:15:37 -07:00
Fangrui Song	789e9aff28	[llvm-cov gcov] Compute unmeasured arc counts by Kirchhoff's circuit law For a CFG G=(V,E), Knuth describes that by Kirchhoff's circuit law, the minimum number of counters necessary is |E|-(|V|-1). The emitted edges form a spanning tree. libgcov-emitted .gcda files leverage this optimization while those emitted by clang --coverage do not. Propagate counts by Kirchhoff's circuit law so that llvm-cov gcov can correctly print line counts of gcc --coverage emitted files and enable the future improvement of clang --coverage.	2020-09-08 18:45:11 -07:00
Wei Mi	d9e172d0fb	[SampleFDO] Enhance profile remapping support for searching inline instance and indirect call promotion candidate. Profile remapping is a feature to match a function in the module with its profile in sample profile if the function name and the name in profile look different but are equivalent using given remapping rules. This is a useful feature to keep the performance stable by specifying some remapping rules when sampleFDO targets are going through some large scale function signature change. However, currently profile remapping support is only valid for outline function profile in SampleFDO. It cannot match a callee with an inline instance profile if they have different but equivalent names. We found that without the support for inline instance profile, remapping is less effective for some large scale change. To add that support, before any remapping lookup happens, all the names in the profile will be inserted into remapper and the Key to the name mapping will be recorded in a map called NameMap in the remapper. During name lookup, a Key will be returned for the given name and it will be used to extract an equivalent name in the profile from NameMap. So with the help of the NameMap, we can translate any given name to an equivalent name in the profile if it exists. Whenever we try to match a name in the module to a name in the profile, we will try the match with the original name first, and if it doesn't match, we will use the equivalent name got from remapper to try the match for another time. In this way, the patch can enhance the profile remapping support for searching inline instance and searching indirect call promotion candidate. In a planned large scale change of int64 type (long long) to int64_t (long), we found the performance of a google internal benchmark degraded by 2% if nothing was done. If existing profile remapping was enabled, the performance degradation dropped to 1.2%. 
If the profile remapping with the current patch was enabled, the performance degradation further dropped to 0.14% (Note the experiment was done before searching indirect call promotion candidate was added. We hope with the remapping support of searching indirect call promotion candidate, the degradation can drop to 0% in the end. It will be evaluated post commit). Differential Revision: https://reviews.llvm.org/D86332	2020-08-26 11:07:35 -07:00
Zequan Wu	425e233cb4	[llvm-cov] reset execution count to 0 after wrapped segment Fix the bug: https://bugs.llvm.org/show_bug.cgi?id=36979. It also fixes this bug: https://bugs.llvm.org/show_bug.cgi?id=35404, which I think is caused by the same problem. Differential Revision: https://reviews.llvm.org/D85036	2020-08-04 18:38:44 -07:00
Hiroshi Yamauchi	0b0a5993c1	[PGO] Extend the value profile buckets for mem op sizes. Extend the memop value profile buckets to be more flexible (could accommodate a mix of individual values and ranges) and to cover more value ranges (from 11 to 22 buckets). Disabled behind a flag (to be enabled separately) and the existing code to be removed later. Differential Revision: https://reviews.llvm.org/D81682	2020-08-03 11:04:32 -07:00
Wei Mi	51d4708437	Supplement instr profile with sample profile. PGO profile is usually more precise than sample profile. However, PGO profile needs to be collected from loadtest and loadtest may not be representative enough to the production workload. Sample profile collected from production can be used as a supplement -- for functions cold in loadtest but warm/hot in production, we can scale up the related function in PGO profile if the function is warm or hot in sample profile. The implementation contains changes in compiler side and llvm-profdata side. Given an instr profile and a sample profile, for a function cold in PGO profile but warm/hot in sample profile, llvm-profdata will either mark all the counters in the profile to be -1 or scale up the max count in the function to be above hot threshold, depending on the zero counter ratio in the profile. The assumption is if there are too many counters being zero in the function profile, the profile is more likely to cause harm than good, then llvm-profdata will mark all the counters to be -1 indicating the function is hot but the profile is unaccountable. In compiler side, if a function profile with all -1 counters is seen, the function entry count will be set to be above hot threshold but its internal profile will be dropped. In the long run, it may be useful to let compiler support using PGO profile and sample profile at the same time, but that requires more careful design and more substantial changes to make two profiles work seamlessly. The patch here serves as a simple intermediate solution. Differential Revision: https://reviews.llvm.org/D81981	2020-07-27 20:17:40 -07:00
Fangrui Song	4e9b56ee13	Revert D81682 "[PGO] Extend the value profile buckets for mem op sizes." This reverts commit 4a539faf74b9b4c25ee3b880e4007564bd5139b0. There is a __llvm_profile_instrument_range related crash in PGO-instrumented clang: ``` (gdb) bt llvm::ConstantRange const&, llvm::APInt const&, unsigned int, bool) () llvm::ScalarEvolution::getRangeForAffineAR(llvm::SCEV const, llvm::SCEV const, llvm::SCEV const*, unsigned int) () ``` (The body of __llvm_profile_instrument_range is inlined, so we can only find__llvm_profile_instrument_target in the trace) ``` 23│ 0x000055555dba0961 <+65>: nopw %cs:0x0(%rax,%rax,1) 24│ 0x000055555dba096b <+75>: nopl 0x0(%rax,%rax,1) 25│ 0x000055555dba0970 <+80>: mov %rsi,%rbx 26│ 0x000055555dba0973 <+83>: mov 0x8(%rsi),%rsi # %rsi=-1 -> SIGSEGV 27│ 0x000055555dba0977 <+87>: cmp %r15,(%rbx) 28│ 0x000055555dba097a <+90>: je 0x55555dba0a76 <__llvm_profile_instrument_target+342> ```	2020-07-22 16:08:25 -07:00
Rong Xu	005085c634	[PGO] Supporting code for always instrumenting entry block This patch includes the supporting code that enables always instrumenting the function entry block by default. This patch will NOT change the default behavior. It adds a variant bit in the profile version, adds new directives in text profile format, and changes llvm-profdata tool accordingly. This patch is a split of D83024 (https://reviews.llvm.org/D83024). Many test changes from D83024 are also included. Differential Revision: https://reviews.llvm.org/D84261	2020-07-22 15:01:53 -07:00
Fangrui Song	adac7ac5fb	[llvm-cov gcov] Don't require NUL terminator when reading files .gcno, .gcda and source files can be modified while we are reading them. If the concurrent modification of a file being read nullifies the NUL terminator assumption, llvm-cov can trip over an assertion failure in MemoryBuffer::init. This is not so rare - the source files can be in an editor and .gcda can be written by a running process (if the process forks, when .gcda gets written is probably more unpredictable). There is no accompanying test because an assertion failure requires data races with some involved setting.	2020-07-19 00:31:52 -07:00
Hiroshi Yamauchi	a85cda4f5a	[PGO] Extend the value profile buckets for mem op sizes. Extend the memop value profile buckets to be more flexible (could accommodate a mix of individual values and ranges) and to cover more value ranges (from 11 to 22 buckets). Disabled behind a flag (to be enabled separately) and the existing code to be removed later.	2020-07-15 10:26:15 -07:00
Wei Mi	7fc0e8b3ed	[NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder. Change file static function getEntryForPercentile to be a static member function in ProfileSummaryBuilder so it can be used by other files. Differential Revision: https://reviews.llvm.org/D83439	2020-07-09 16:38:19 -07:00
Hiroshi Yamauchi	b3de353064	Revert "[PGO] Extend the value profile buckets for mem op sizes." This reverts commit 63a89693f09f6b24ce4f2350d828150bd9c4f3e8. Due to a build failure like http://lab.llvm.org:8011/builders/sanitizer-windows/builds/65386/steps/annotate/logs/stdio	2020-06-25 11:13:49 -07:00
Hiroshi Yamauchi	754259b7af	[PGO] Extend the value profile buckets for mem op sizes. Extend the memop value profile buckets to be more flexible (could accommodate a mix of individual values and ranges) and to cover more value ranges (from 11 to 22 buckets). Disabled behind a flag (to be enabled separately) and the existing code to be removed later. Differential Revision: https://reviews.llvm.org/D81682	2020-06-25 10:22:56 -07:00
Fangrui Song	78ddb2f901	[llvm-cov gcov] Support clang<11 fake 4.2 format Test cases are restored from a3bed4bd3743b5fee1e66116a63089df742bcae1	2020-06-17 10:17:15 -07:00
Fangrui Song	3ac41a5d21	[llvm-cov gcov] Don't suppress .gcov output if .gcda is corrupted If .gcda is corrupted, gcov continues to produce a .gcov and just assumes execution counts are zeros. This is reasonable, because the program can corrupt its .gcda output. The code path should be similar to the code path without .gcda.	2020-06-16 14:55:38 -07:00
Fangrui Song	d1d0909fd1	[gcov] Add -i --intermediate-format Between gcov 4.9~8, `gcov -i $file` prints coverage information to $file.gcov in an intermediate text format (single file, instead of $source.gcov for each source file). lcov newer than 2019-05-24 detects -i support and uses it to increase processing speed. gcov 9 (GCC r265587) removed --intermediate-format and -i was changed to mean --json-format. However, we consider this format still useful and support it. geninfo (part of lcov) supports this format even if we announce that we are compatible with gcov 9.0.0	2020-06-16 14:14:28 -07:00
Fangrui Song	69837fa7b5	[gcov] Refactor llvm-cov gcov and add SourceInfo	2020-06-16 14:14:26 -07:00
Fangrui Song	a7a8160485	[gcov] Improve tests and lower the minimum supported version to gcov 3.4 global-ctor.ll no longer checks what it intended to check (@_GLOBAL__sub_I_global-ctor.ll needs a !dbg to work). Rewrite it. gcov 3.4 and gcov 4.2 use the same format, thus we can lower the version requirement to 3.4	2020-06-06 23:11:32 -07:00
Fangrui Song	b7f9cc057b	[gcov] Don't error 'unexpected end of memory buffe'	2020-06-03 22:05:15 -07:00
Fangrui Song	bc4d185797	[gcov] Make `Creating 'filename'` compatible with gcov And clean up llvm-cov.test a bit	2020-06-03 21:48:01 -07:00
Fangrui Song	9b29d98c47	[gcov] Improve .gcno compatibility with gcov and use DataExtractor llvm-cov.test and many Inputs/test* files contain wrong tests. This patch rewrites a large portion of these files. The pre-canned .gcno & .gcda are replaced by binaries produced by clang --coverage (compatible with gcov 4.8~7) (after some GCDAProfiling.c bugs were fixed by my previous commits). Also make llvm-cov gcov on a little-endian host capable to parse big-endian .gcno and .gcda, and make llvm-cov gcov on big-endian host capable to parse little-endian .gcno and .gcda	2020-06-03 19:29:21 -07:00
Fangrui Song	b7042493f2	[gcov] Emit GCOV_TAG_OBJECT_SUMMARY/GCOV_TAG_PROGRAM_SUMMARY correctly and fix llvm-cov's decoding of runcount gcov 9 (r264462) started to use GCOV_TAG_OBJECT_SUMMARY. Before, GCOV_TAG_PROGRAM_SUMMARY was used. libclang_rt.profile should emit just one tag according to the version. Another bug introduced by rL194499 is that the wrong runcount field was selected. Fix the two bugs so that gcov can correctly decode "Runs:" from libclang_rt.profile produced .gcda files, and llvm-cov gcov can correctly decode "Runs:" from libgcov produced .gcda files.	2020-05-11 21:53:53 -07:00
Fangrui Song	5ac915ed2c	[gcov] Implement --stdout -t gcov by default prints to a .gcov file. With --stdout, stdout is used. Some summary information is omitted. There is no separator for multiple source files.	2020-05-10 21:02:38 -07:00
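
Commit 2b916be8fe in the log above argues that ULEB128-encoded entries cannot be indexed like an array, while fixed-length 64-bit MD5 values can. The sketch below illustrates that trade-off in Python. It is not LLVM's reader code: the helper names, the sample hash values, and the little-endian 8-byte layout are all assumptions made for illustration.

```python
import struct


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(buf: bytes, pos: int = 0):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def nth_uleb128(buf: bytes, index: int) -> int:
    """Variable-length entries: every earlier entry must be decoded first."""
    pos = 0
    for _ in range(index):
        _, pos = decode_uleb128(buf, pos)
    return decode_uleb128(buf, pos)[0]


def nth_fixed(buf: bytes, index: int) -> int:
    """Fixed 8-byte entries (assumed little-endian): direct offset 8 * index."""
    return struct.unpack_from("<Q", buf, 8 * index)[0]


# Hypothetical 64-bit name hashes, standing in for truncated MD5 values.
hashes = [0x0123456789ABCDEF, 0xFFFFFFFFFFFFFFFF, 0x8000000000000001]
uleb_table = b"".join(encode_uleb128(h) for h in hashes)
fixed_table = b"".join(struct.pack("<Q", h) for h in hashes)

print(len(uleb_table), len(fixed_table))
print(nth_uleb128(uleb_table, 2) == nth_fixed(fixed_table, 2))
```

Because hash values are spread over the full 64-bit range, most of them need 9 or 10 ULEB128 bytes, which is consistent with the commit's remark that ULEB128 does not save space for MD5 names.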

566 Commits
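
Commit 506848f563 in the log above describes cycle cancelling: find a cycle among the blocks on one line, subtract the smallest arc count on that cycle from every arc in it, and repeat until no cycle with positive counts remains. The following is a minimal Python sketch of that idea, not the GCOV.cpp implementation. The function names and the `(src, dst, count)` arc tuples are hypothetical, the three block states stand in for the `traversable`/`incoming` members mentioned in the commit, and the depth-first search is written recursively for brevity.

```python
def find_cycle(num_blocks, arcs):
    """Return a list of arc indices forming a cycle of positive-count arcs, or None."""
    out = [[] for _ in range(num_blocks)]
    for i, (src, _dst, count) in enumerate(arcs):
        if count > 0:
            out[src].append(i)

    UNEXPLORED, ON_STACK, DONE = 0, 1, 2
    state = [UNEXPLORED] * num_blocks
    incoming = [None] * num_blocks  # arc through which each block was entered

    def dfs(v):
        state[v] = ON_STACK
        for i in out[v]:
            w = arcs[i][1]
            if state[w] == ON_STACK:
                # The arc points to a block being explored: a cycle is found.
                cycle = [i]
                u = v
                while u != w:
                    a = incoming[u]
                    cycle.append(a)
                    u = arcs[a][0]
                return cycle
            if state[w] == UNEXPLORED:
                incoming[w] = i
                found = dfs(w)
                if found is not None:
                    return found
        state[v] = DONE
        return None

    for start in range(num_blocks):
        if state[start] == UNEXPLORED:
            found = dfs(start)
            if found is not None:
                return found
    return None


def cycle_count(num_blocks, arcs):
    """Sum of the counts cancelled over all cycles among blocks on one line."""
    arcs = [list(a) for a in arcs]  # work on a copy; counts are mutated
    total = 0
    while True:
        cycle = find_cycle(num_blocks, arcs)
        if cycle is None:
            return total
        bottleneck = min(arcs[i][2] for i in cycle)
        for i in cycle:
            arcs[i][2] -= bottleneck  # at least one arc drops to zero
        total += bottleneck


print(cycle_count(3, [(0, 1, 5), (1, 2, 4), (2, 0, 4), (1, 0, 1)]))
```

Each round saturates at least one arc, so the loop in `cycle_count` runs at most once per arc, which matches the bound of at most E cycles stated in the commit message.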