llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00

Author	SHA1	Message	Date
James Henderson	f75dcd7d67	[docs][tools] Add missing "program" tags to rst files Sphinx allows for definitions of command-line options using `.. option <name>` and references to those options via `:option:<name>`. However, it looks like there is no scoping of these options by default, meaning that links can end up pointing to incorrect documents. See for example the llvm-mca document, which contains references to -o that, prior to this patch, pointed to a different document. What's worse is that these links appear to be non-deterministic in which one is picked (on my machine, some references end up pointing to opt, whereas on the live docs, they point to llvm-dwarfdump, for example). The fix is to add the .. program <name> tag. This essentially namespaces the options (definitions and references) to the named program, ensuring that the links are kept correct. Reviewed by: andreadb Differential Revision: https://reviews.llvm.org/D63873 llvm-svn: 364538	2019-06-27 13:24:46 +00:00
Andrea Di Biagio	c160d26a4d	[llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models. Differential Revision: https://reviews.llvm.org/D63556 llvm-svn: 363830	2019-06-19 16:10:58 +00:00
Matt Davis	25329df600	[Docs] [llvm-mca] Point out a caveat for using llvm-mca markers in source code. Summary: See: https://bugs.llvm.org/show_bug.cgi?id=42173 Reviewers: andreadb, mattd, RKSimon, spatel Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits, andreadb Tags: #llvm Patch by Max Marrone (maxpm)! Thanks! Differential Revision: https://reviews.llvm.org/D63040 llvm-svn: 362979	2019-06-10 20:38:56 +00:00
Andrea Di Biagio	98e0298cb2	[MCA] Add support for nested and overlapping region markers This patch fixes PR41523 https://bugs.llvm.org/show_bug.cgi?id=41523 Regions can now nest/overlap provided that they have different names. Anonymous regions cannot overlap. Region end markers must specify the region name. The only exception is for when there is only one user-defined region; in that particular case, the region end marker doesn't need to specify a name. Incorrect region end markers are no longer ignored. Instead, the tool reports an error and we exit with an error code. Added test cases to verify the new diagnostic error messages. Updated the llvm-mca docs to reflect this feature change. Differential Revision: https://reviews.llvm.org/D61676 llvm-svn: 360351	2019-05-09 15:18:09 +00:00
Andrea Di Biagio	c08b465d91	[llvm-mca][scheduler-stats] Print issued micro opcodes per cycle. NFCI It makes more sense to print out the number of micro opcodes that are issued every cycle rather than the number of instructions issued per cycle. This behavior is also consistent with the dispatch-stats: numbers from the two views can now be easily compared. llvm-svn: 357919	2019-04-08 16:05:54 +00:00
Andrea Di Biagio	c5a150eca8	[MCA] Highlight kernel bottlenecks in the summary view. This patch adds a new flag named -bottleneck-analysis to print out information about throughput bottlenecks. MCA knows how to identify and classify dynamic dispatch stalls. However, it doesn't know how to analyze and highlight kernel bottlenecks. The goal of this patch is to teach MCA how to correlate increases in backend pressure to backend stalls (and therefore, the loss of throughput). From a Scheduler point of view, backend pressure is a function of the scheduler buffer usage (i.e. how the number of uOps in the scheduler buffers changes over time). Backend pressure increases (or decreases) when there is a mismatch between the number of opcodes dispatched, and the number of opcodes issued in the same cycle. Since buffer resources are limited, continuous increases in backend pressure would eventually lead to dispatch stalls. So, there is a strong correlation between dispatch stalls, and how backpressure changed over time. This patch teaches how to identify situations where backend pressure increases due to: - unavailable pipeline resources. - data dependencies. Data dependencies may delay execution of instructions and therefore increase the time that uOps have to spend in the scheduler buffers. That often translates to an increase in backend pressure which may eventually lead to a bottleneck. Contention on pipeline resources may also delay execution of instructions, and lead to a temporary increase in backend pressure. Internally, the Scheduler classifies instructions based on whether register / memory operands are available or not. An instruction is marked as "ready to execute" only if data dependencies are fully resolved. Every cycle, the Scheduler attempts to execute all instructions that are ready to execute. 
If an instruction cannot execute because of unavailable pipeline resources, then the Scheduler internally updates a BusyResourceUnits mask with the ID of each unavailable resource. ExecuteStage is responsible for tracking changes in backend pressure. If backend pressure increases during a cycle because of contention on pipeline resources, then ExecuteStage sends a "backend pressure" event to the listeners. That event would contain information about instructions delayed by resource pressure, as well as the BusyResourceUnits mask. Note that ExecuteStage also knows how to identify situations where backpressure increased because of delays introduced by data dependencies. The SummaryView observes "backend pressure" events and prints out a "bottleneck report". Example of bottleneck report: ``` Cycles with backend pressure increase [ 99.89% ] Throughput Bottlenecks: Resource Pressure [ 0.00% ] Data Dependencies: [ 99.89% ] - Register Dependencies [ 0.00% ] - Memory Dependencies [ 99.89% ] ``` A bottleneck report is printed out only if increases in backend pressure eventually caused backend stalls. About the time complexity: Time complexity is linear in the number of instructions in the Scheduler::PendingSet. The average slowdown tends to be in the range of ~5-6%. For memory intensive kernels, the slowdown can be significant if flag -noalias=false is specified. In the worst case scenario I have observed a slowdown of ~30% when flag -noalias=false was specified. We can definitely recover part of that slowdown if we optimize class LSUnit (by doing extra bookkeeping to speedup queries). For now, this new analysis is disabled by default, and it can be enabled via flag -bottleneck-analysis. Users of MCA as a library can enable the generation of pressure events through the constructor of ExecuteStage. This patch partially addresses https://bugs.llvm.org/show_bug.cgi?id=37494 Differential Revision: https://reviews.llvm.org/D58728 llvm-svn: 355308	2019-03-04 11:52:34 +00:00
Andrea Di Biagio	57fcc40fb8	[llvm-mca][View] Improved Retire Control Unit Statistics. RetireControlUnitStatistics now reports extra information about the ROB and the avg/maximum number of entries consumed over the entire simulation. Example: Retire Control Unit - number of cycles where we saw N instructions retired: [# retired], [# cycles] 0, 109 (17.9%) 1, 102 (16.7%) 2, 399 (65.4%) Total ROB Entries: 64 Max Used ROB Entries: 35 ( 54.7% ) Average Used ROB Entries per cy: 32 ( 50.0% ) Documentation in llvm/docs/CommandGuide/llvm-mca.rst has been updated to reflect this change. llvm-svn: 347493	2018-11-23 12:12:57 +00:00
Andrea Di Biagio	7e5d9331c7	[llvm-mca] Report the number of dispatched micro opcodes in the DispatchStatistics view. This patch introduces the following changes to the DispatchStatistics view: * DispatchStatistics now reports the number of dispatched opcodes instead of the number of dispatched instructions. * The "Dynamic Dispatch Stall Cycles" table now also reports the percentage of stall cycles against the total simulated cycles. This change allows users to easily compare dispatch group sizes with the processor DispatchWidth. Before this change, it was difficult to correlate the two numbers, since DispatchStatistics view reported numbers of instructions (instead of opcodes). DispatchWidth defines the maximum size of a dispatch group in terms of number of micro opcodes. The other change introduced by this patch is related to how DispatchStage generates "instruction dispatch" events. In particular: * There can be multiple dispatch events associated with the same instruction * Each dispatch event now encapsulates the number of dispatched micro opcodes. The number of micro opcodes declared by an instruction may exceed the processor DispatchWidth. Therefore, we cannot assume that instructions are always fully dispatched in a single cycle. DispatchStage knows already how to handle instructions declaring a number of opcodes bigger than DispatchWidth. However, DispatchStage always emitted a single instruction dispatch event (during the first simulated dispatch cycle) for instructions dispatched. With this patch, DispatchStage now correctly notifies multiple dispatch events for instructions that cannot be dispatched in a single cycle. A few views had to be modified. Views can no longer assume that there can only be one dispatch event per instruction. Tests (and docs) have been updated. Differential Revision: https://reviews.llvm.org/D51430 llvm-svn: 341055	2018-08-30 10:50:20 +00:00
Andrea Di Biagio	80b01d0203	[llvm-mca] Add fields "Total uOps" and "uOps Per Cycle" to the report generated by the SummaryView. This patch adds two new fields to the perf report generated by the SummaryView. Fields are now logically organized into two small groups; only the second group contains throughput indicators. Example: ``` Iterations: 100 Instructions: 300 Total Cycles: 414 Total uOps: 700 Dispatch Width: 4 uOps Per Cycle: 1.69 IPC: 0.72 Block RThroughput: 4.0 ``` This patch also updates the docs for llvm-mca. Due to the nature of this change, several tests in the tools/llvm-mca directory were affected, and had to be updated using script `update_mca_test_checks.py`. llvm-svn: 340946	2018-08-29 17:56:39 +00:00
Andrea Di Biagio	f707cd4166	[llvm-mca] Improved report generated by the SchedulerStatistics view. Before this patch, the SchedulerStatistics only printed the maximum number of buffer entries consumed in each scheduler's queue at a given point of the simulation. This patch restructures the reported table, and adds an extra field named "Average number of used buffer entries" to it. This patch also uses different colors to help identify bottlenecks caused by high scheduler buffer pressure. llvm-svn: 340746	2018-08-27 14:52:52 +00:00
Matt Davis	b0535f09cc	[llvm-mca][docs] Move the code marker text into its own subsection. NFC. Also fixed a few undecorated 'llvm-mca' references to be highlighted with the 'program' emphasis. llvm-svn: 338900	2018-08-03 15:56:07 +00:00
Andrea Di Biagio	3d390604c7	[llvm-mca] Speed up the computation of the wait/ready/issued sets in the Scheduler. This patch is a follow-up to r338702. We don't need to use a map to model the wait/ready/issued sets. It is much more efficient to use a vector instead. This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained after r338702). llvm-svn: 338883	2018-08-03 12:55:28 +00:00
Andrea Di Biagio	1aca2c2e82	[llvm-mca][docs] Improve the CommandLine documentation. This patch replaces all the remaining occurrences of string "MCA" with ":program:`llvm-mca`". Somehow I missed those strings when I committed r338394. This patch also improves section "Instruction Dispatch". llvm-svn: 338881	2018-08-03 12:44:56 +00:00
Matt Davis	7b7e97dd13	[llvm-mca][docs] Replace "temporary" with "physical registers". NFC. llvm-svn: 338415	2018-07-31 18:59:46 +00:00
Andrea Di Biagio	4a150d2528	[llvm-mca][docs] Improve the "How LLVM-MCA works" section. llvm-svn: 338410	2018-07-31 18:19:15 +00:00
Andrea Di Biagio	159d252dca	[llvm-mca][docs] Always use `llvm-mca` in place of `MCA`. llvm-svn: 338394	2018-07-31 15:29:10 +00:00
Matt Davis	2fb231bb42	[llvm-mca][docs] Add instruction flow documentation. NFC. Summary: This patch mostly copies the existing Instruction Flow, and stage descriptions from the mca README. I made a few text tweaks, but no semantic changes, and made reference to the "default pipeline." I also removed the internals references (e.g., reference to class names and header files). I did leave the LSUnit name around, but only as an abbreviated word for the load-store unit. Reviewers: andreadb, courbet, RKSimon, gbedwell, filcab Reviewed By: andreadb Subscribers: tschuett, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D49692 llvm-svn: 338319	2018-07-30 22:30:14 +00:00
Matt Davis	c79a134b78	[llvm-mca][docs] Define IPC where it is first mentioned. NFC. Expand the abbreviation where it is first used, and use IPC elsewhere. llvm-svn: 337739	2018-07-23 21:10:50 +00:00
Matt Davis	9597587b97	[llvm-mca][docs] Add documentation for the statistic outputs from mca. NFC Summary: The original text was lifted from the MCA README. I re-ran the dot-product example and updated the output seen in the docs. I also added a few paragraphs discussing the instruction issued and retired histograms, as well as discussing the register file stats. Reviewers: andreadb, RKSimon, courbet, gbedwell, filcab Reviewed By: andreadb Subscribers: tschuett Differential Revision: https://reviews.llvm.org/D49614 llvm-svn: 337648	2018-07-21 18:32:47 +00:00
Matt Davis	3fa9be1fea	[llvm-mca][docs] Add Timeline and How MCA works. For the most part, these changes were from the RFC. I made a few minor word/structure changes, but nothing significant. I also regenerated the example output, and adjusted the text accordingly. Differential Revision: https://reviews.llvm.org/D49527 llvm-svn: 337496	2018-07-19 20:33:59 +00:00
Matt Davis	7447fc96e1	[llvm-mca][docs] Revert mca internals docs. We're going to work on this in a separate review focusing more on documenting the View and probably removing some of the less-interesting/less-useful pieces. This reverts r337219,337225 llvm-svn: 337295	2018-07-17 16:11:54 +00:00
Matt Davis	4c4b8e599d	[llvm-mca][docs] Add notes about cycle and resource callbacks. NFC. llvm-svn: 337225	2018-07-16 23:50:53 +00:00
Matt Davis	c3a63ec6c0	[llvm-mca][docs] Initial description of mca internals. NFC This patch introduces a brief description of the components of MCA. The main focus is on Views. This is a work in progress, and more descriptions will be introduced later. I want to flesh-out the Views section more and provide a detailed description of eventing in MCA. Eventually a brief code example of a View should accompany the description. Also, we should consider moving the MCA internals guide elsewhere at some point. llvm-svn: 337219	2018-07-16 21:42:58 +00:00
Simon Pilgrim	5d64fe4125	Fix typo in declaring code-block snippet llvm-svn: 332630	2018-05-17 16:58:42 +00:00
Andrea Di Biagio	74d366f0da	[llvm-mca] Add an example showing how to get Intel assembly syntax Patch by Jeff Muizelaar. llvm-svn: 332627	2018-05-17 16:48:53 +00:00
Andrea Di Biagio	8bbb45c175	[llvm-mca] add flag -all-views and flag -all-stats. Flag -all-views enables all the views. Flag -all-stats enables all the views that print hardware statistics. llvm-svn: 332602	2018-05-17 12:27:03 +00:00
Andrea Di Biagio	517a29f95b	[llvm-mca] Default to the native host cpu if flag -mcpu is not specified. llvm-svn: 330809	2018-04-25 10:18:25 +00:00
Andrea Di Biagio	9a3d82e1c1	[llvm-mca][CommandGuide] Fix typo in example. llvm-svn: 330703	2018-04-24 10:09:32 +00:00
Andrea Di Biagio	d9344b8000	[llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics. Also, removed flag -verbose in favor of flag -retire-stats. llvm-svn: 329794	2018-04-11 12:12:53 +00:00
Andrea Di Biagio	b46fe1126f	[llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view. Added flag -scheduler-stats to print scheduler related statistics. llvm-svn: 329792	2018-04-11 11:37:46 +00:00
Sanjay Patel	65e7849dc2	[llvm-mca] reorder text On 2nd reading, putting the C example after the bit about multiple regions makes this flow better. llvm-svn: 329732	2018-04-10 18:10:14 +00:00
Sanjay Patel	1ce0d0fab1	[llvm-mca] fix formatting llvm-svn: 329729	2018-04-10 17:56:24 +00:00
Sanjay Patel	1455c44d73	[llvm-mca] add example workflow for source code This is copied from Andrea's text in PR36875: https://bugs.llvm.org/show_bug.cgi?id=36875 As noted there, this is a hack...but it's a good one! It's important to show potential workflows up-front with examples, so customers can copy and experiment with them. llvm-svn: 329726	2018-04-10 17:49:45 +00:00
Andrea Di Biagio	7af89ed924	[llvm-mca] Move the logic that prints dispatch unit statistics from BackendStatistics to its own view. This patch moves the logic that collects and analyzes dispatch events to the DispatchStatistics view. Added flag -dispatch-stats to print statistics related to the dispatch logic. llvm-svn: 329708	2018-04-10 14:55:14 +00:00
Andrea Di Biagio	af24ba4a16	[llvm-mca] Increase the default number of iterations to 100. llvm-svn: 329694	2018-04-10 12:50:03 +00:00
Andrea Di Biagio	66e474bb74	[llvm-mca] Add the ability to mark regions of code for analysis (PR36875) This patch teaches llvm-mca how to parse code comments in search for special "markers" used to select regions of code. Example: # LLVM-MCA-BEGIN My Code Region .... # LLVM-MCA-END The MCAsmLexer now delegates to an object of class MCACommentParser (i.e. an AsmCommentConsumer) the parsing of code comments to search for begin/end code region markers. A comment starting with substring "LLVM-MCA-BEGIN" marks the beginning of a new region of code. A comment starting with substring "LLVM-MCA-END" marks the end of the last region. This implementation doesn't allow regions to overlap. Each region can have an optional description; internally, each region is identified by a range of source code locations (SMLoc). MCInst objects are added to a region R only if the source location for the MCInst is in the range of locations specified by R. By default, the tool allocates an implicit "Default" code region which contains every source location. See new tests llvm-mca-marker-*.s for a few examples. A new Backend object is created for every region. So, the analysis is conducted on every parsed code region. The final report is the union of the reports generated for every code region. Note that empty regions are skipped. Special "[#] Code Region - ..." strings are used in the report to mark the portion which is specific to a code region only. For example, see llvm-mca-markers-5.s. Differential Revision: https://reviews.llvm.org/D45433 llvm-svn: 329590	2018-04-09 16:39:52 +00:00
Andrea Di Biagio	ee5b4f2814	[documentation][llvm-mca] Update the documentation. Scheduling models can now describe processor register files and retire control units. This updates the existing documentation and the README file. llvm-svn: 329311	2018-04-05 16:42:32 +00:00
Andrea Di Biagio	56f9bc1f61	[llvm-mca] Remove flag -max-retire-per-cycle, and update the docs. This is done in preparation for D45259. With D45259, models can specify the size of the reorder buffer, and the retire throughput directly via tablegen. llvm-svn: 329274	2018-04-05 11:36:50 +00:00
Andrea Di Biagio	e2bef9a902	[llvm-mca] Move the logic that prints register file statistics to its own view. NFCI Before this patch, the "BackendStatistics" view was responsible for printing the register file usage (as well as many other statistics). Now users can enable register file usage statistics using the command line flag `-register-file-stats`. By default, the tool doesn't print register file statistics. llvm-svn: 329083	2018-04-03 16:46:23 +00:00
Andrea Di Biagio	49f7508452	[llvm-mca] Add a flag -instruction-info to enable/disable the instruction info view. llvm-svn: 328493	2018-03-26 13:44:54 +00:00
Andrea Di Biagio	7168afd322	[llvm-mca] Update the commandline docs after r328305. Document that flag -resource-pressure can be used to enable/disable the resource pressure view. This change should have been part of r328305. llvm-svn: 328492	2018-03-26 13:21:48 +00:00
Andrea Di Biagio	a309a8b1e0	[llvm-mca] Add flag -instruction-tables to print the theoretical resource pressure distribution for instructions (PR36874) The goal of this patch is to address most of PR36874. To fully fix PR36874 we need to split the "InstructionInfo" view from the "SummaryView". That would make it easy to check the latency and rthroughput as well. The patch reuses all the logic from ResourcePressureView to print out the "instruction tables". We have an entry for every instruction in the input sequence. Each entry reports the theoretical resource pressure distribution. Resource pressure is uniformly distributed across all the processor resource units of a group. At the moment, the backend pipeline is not configurable, so the only way to fix this is by creating a different driver that simply sends instruction events to the resource pressure view. That means, we don't use the Backend interface. Instead, it is simpler to just have a different code-path for when flag -instruction-tables is specified. Once Clement addresses bug 36663, then we can port the "instruction tables" logic into a stage of our configurable pipeline. Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag -instruction-tables to each modified test. Differential Revision: https://reviews.llvm.org/D44839 llvm-svn: 328487	2018-03-26 12:04:53 +00:00
Andrea Di Biagio	45f0e5261e	[llvm-mca] LLVM Machine Code Analyzer. llvm-mca is an LLVM based performance analysis tool that can be used to statically measure the performance of code, and to help triage potential problems with target scheduling models. llvm-mca uses information which is already available in LLVM (e.g. scheduling models) to statically measure the performance of machine code in a specific cpu. Performance is measured in terms of throughput as well as processor resource consumption. The tool currently works for processors with an out-of-order backend, for which there is a scheduling model available in LLVM. The main goal of this tool is not just to predict the performance of the code when run on the target, but also help with diagnosing potential performance issues. Given an assembly code sequence, llvm-mca estimates the IPC (instructions per cycle), as well as hardware resources pressure. The analysis and reporting style were mostly inspired by the IACA tool from Intel. This patch is related to the RFC on llvm-dev visible at this link: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html Differential Revision: https://reviews.llvm.org/D43951 llvm-svn: 326998	2018-03-08 13:05:02 +00:00
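
Note on region markers: the rules stated in 98e0298cb2 (regions may nest or overlap if their names differ, anonymous regions cannot overlap, end markers must name their region unless only one region is in play, and bad end markers are errors) can be sketched as a small parser. This is an illustrative Python sketch, not llvm-mca's implementation. The names `parse_regions` and `RegionError` are hypothetical, `#` is assumed to be the comment character, and the "only one user-defined region" exception is interpreted here as "exactly one region is currently open".

```python
import re

# Assumption: comments start with '#'; real targets may use other comment characters.
MARKER = re.compile(r"#\s*LLVM-MCA-(BEGIN|END)\b\s*(.*)")


class RegionError(Exception):
    """Raised for a marker that violates the rules sketched here (hypothetical)."""


def parse_regions(lines):
    """Return a list of (name, begin_line, end_line) in the order regions close.

    Line numbers are 1-based. Regions still open at the end of the input
    are simply not reported by this sketch.
    """
    active = {}  # region name -> line number of its BEGIN marker
    closed = []
    for lineno, line in enumerate(lines, start=1):
        match = MARKER.search(line)
        if match is None:
            continue
        kind, name = match.group(1), match.group(2).strip()
        if kind == "BEGIN":
            # Same-named regions (including two anonymous ones) cannot overlap.
            if name in active:
                raise RegionError(f"line {lineno}: region {name!r} is already open")
            active[name] = lineno
        else:
            # An unnamed END is accepted only when exactly one region is open.
            if not name and len(active) == 1:
                name = next(iter(active))
            if name not in active:
                raise RegionError(f"line {lineno}: no open region named {name!r}")
            closed.append((name, active.pop(name), lineno))
    return closed
```

With this sketch, an `inner` region opened and closed inside an `outer` region yields two entries, while a second anonymous `LLVM-MCA-BEGIN` before the first one ends raises `RegionError`.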

43 Commits
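
Note on the example reports: the figures quoted in 80b01d0203 (SummaryView) and 57fcc40fb8 (retire statistics) are consistent with simple ratios. The sketch below recomputes them in Python under that assumption; it does not show how llvm-mca computes these fields internally, and the helper names `summary_ratios` and `percent` are hypothetical.

```python
def summary_ratios(instructions, total_cycles, total_uops):
    """Return (IPC, uOps per cycle), assuming both are plain ratios over total cycles."""
    return instructions / total_cycles, total_uops / total_cycles


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Values copied from the SummaryView example in 80b01d0203.
ipc, uops_per_cycle = summary_ratios(instructions=300, total_cycles=414, total_uops=700)
print(f"IPC: {ipc:.2f}")
print(f"uOps Per Cycle: {uops_per_cycle:.2f}")

# Values copied from the retire statistics example in 57fcc40fb8 (64 ROB entries).
print(f"Max Used ROB Entries: {percent(35, 64):.1f}%")
print(f"Average Used ROB Entries: {percent(32, 64):.1f}%")
```

Rounded to the precision used in the reports, these ratios match the quoted values (0.72, 1.69, 54.7% and 50.0%).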