llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Pavel Labath	e7b8d7b648	Refactor dwarfdump -apple-names output Summary: This modifies the dwarfdump output to align it with the new .debug_names dump. It also renames two header fields to match similar fields in the dwarf5 header. A couple of tests needed to be updated to match new output. The changes were fairly straight-forward, although not really automatable. Reviewers: JDevlieghere, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42415 llvm-svn: 323641	2018-01-29 11:33:17 +00:00
Pavel Labath	4e33537215	[DebugInfo] Basic .debug_names dumping support Summary: This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up the first name as an interface for the different accelerator tables. Then I add a DWARFDebugNames class for the dwarf5 table. Presently, the only common functionality of the two classes is the dump() method, because this is the only method that was necessary to implement dwarfdump -debug-names; and because the rest of the AppleAcceleratorTable interface does not directly transfer to the dwarf5 tables (the main reason for that is that the present interface assumes the tables are homogeneous, but the dwarf5 tables can have different keys associated with each entry). I expect to make the common interface richer as I add more functionality to the new class (and invent a way to represent it in generic way). In terms of sharing the implementation, I found the format of the two tables sufficiently different to frustrate any attempts to have common parsing or dumping code, so presently the implementations share just low level code for formatting dwarf constants. Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42297 llvm-svn: 323638	2018-01-29 11:08:32 +00:00
George Rimar	0a6d36f5a8	[ThinLTO] - Stop internalizing and drop non-prevailing symbols. Implementation marks non-prevailing symbols as not live in the summary. Then they are dropped in backends. Fixes https://bugs.llvm.org/show_bug.cgi?id=35938 Differential revision: https://reviews.llvm.org/D42107 llvm-svn: 323633	2018-01-29 08:03:30 +00:00
Hiroshi Inoue	628b900993	[NFC] fix trivial typos in comments and documents "to to" -> "to" llvm-svn: 323628	2018-01-29 05:17:03 +00:00
Jonas Devlieghere	924e1d99fc	[Support] Move DJB hash to support. NFC This patch moves the DJB hash to support. This is consistent with other hashing algorithms living there. The hash is used by the DWARF accelerator tables. We're doing this now because the hashing function is needed by dsymutil and we don't want to link against libBinaryFormat. Differential revision: https://reviews.llvm.org/D42594 llvm-svn: 323616	2018-01-28 11:05:10 +00:00
Daniel Neilson	dabc84eeb7	Add IRBuilder API to create memcpy/memmove calls with differing source and dest alignments Summary: This change is step two in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323597	2018-01-27 17:59:10 +00:00
Daniil Fukalov	28862346c2	[AMDGPU] fix LDS f32 intrinsics - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling - changed too common atomic mangling to ds - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic Reviewed by: b-sumner Differential Revision: https://reviews.llvm.org/D42383 llvm-svn: 323516	2018-01-26 11:09:38 +00:00
Hiroshi Inoue	7f54536b89	[NFC] fix trivial typos in comments and documents "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508	2018-01-26 08:15:29 +00:00
Aditya Nandakumar	f2df36f3f1	Fix buildfailure by making some MIPatternMatchers inline llvm-svn: 323487	2018-01-26 00:50:56 +00:00
Paul Robinson	fa778bec05	[DWARFv5] Support DW_FORM_line_strp in llvm-dwarfdump. This form is like DW_FORM_strp, but points to .debug_line_str instead of .debug_str as the string section. It's intended to be used from the line-table header, and allows string-pooling of directory and filenames across compilation units. Differential Revision: https://reviews.llvm.org/D42553 llvm-svn: 323476	2018-01-25 22:02:36 +00:00
Easwaran Raman	bd3a1aab59	[SyntheticCounts] Rewrite the code using only graph traits. Summary: The intent of this is to allow the code to be used with ThinLTO. In Thinlink phase, a traditional Callgraph can not be computed even though all the necessary information (nodes and edges of a call graph) is available. This is due to the fact that CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic callgraph. Reviewers: davidxl Subscribers: mehdi_amini, tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D42311 llvm-svn: 323475	2018-01-25 22:02:29 +00:00
Vedant Kumar	d07c26287f	[Debug] Add a utility to propagate dbg.value to new PHIs, NFC This simply moves an existing utility to Utils for reuse. Split out of: https://reviews.llvm.org/D42551 Patch by Matt Davis! llvm-svn: 323471	2018-01-25 21:37:05 +00:00
Easwaran Raman	3831ff54c8	Re-land "[ThinLTO] Add call edges' relative block frequency to per-module summary." It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460	2018-01-25 19:27:17 +00:00
Benjamin Kramer	2bb5afc3db	[ADT] Make moving Optional not reset the Optional it moves from. This brings it in line with std::optional. My recent changes to make Optional of trivial types trivially copyable introduced diverging behavior depending on the type, which is bad. Now all types have the same moving behavior. llvm-svn: 323445	2018-01-25 17:24:22 +00:00
George Rimar	7489087151	[LTO] - Introduce GlobalResolution::Prevailing flag. It is NFC refactoring change that will make D42107 a bit smaller. Differential revision: https://reviews.llvm.org/D42528 llvm-svn: 323444	2018-01-25 17:23:27 +00:00
Sam McCall	45da3cf0f3	Give scope_exit helper correct move semantics llvm-svn: 323442	2018-01-25 16:55:48 +00:00
Amjad Aboud	ba09d82dc0	Another try to commit 323321 (aggressive instruction combine). llvm-svn: 323416	2018-01-25 12:06:32 +00:00
George Rimar	ba9f28eca0	[LTO] - Get rid of friend 'computeDeadSymbols'. NFC. computeDeadSymbols accessed isLive() which was not public before. It does not make much sense to keep isLive() private because flags are available via flags() public member anyways. llvm-svn: 323415	2018-01-25 11:45:02 +00:00
Jonas Devlieghere	33eed2b799	[Dwarf] Add dsymutil Atom extensions. NFC This patch extends the atom types used by the Apple accelerator tables with two dsymutil extensions: - DW_ATOM_type_type_flags - DW_ATOM_qual_name_hash llvm-svn: 323414	2018-01-25 11:19:08 +00:00
Aditya Nandakumar	af9f61606c	Add support for pattern matching MachineInsts. https://reviews.llvm.org/D42439 Add Instcombine like matchers for MachineInstructions. There are only globalISel matchers for now. llvm-svn: 323400	2018-01-25 02:53:06 +00:00
Lang Hames	0943412040	[ORC] Refactor the various lookupFlags methods to return the flags map via the first argument. This makes lookupFlags more consistent with lookup (which takes the query as the first argument) and composes better in practice, since lookups are usually linearly chained: Each lookupFlags can populate the result map based on the symbols not found in the previous lookup. (If the maps were returned rather than passed by reference there would have to be a merge step at the end). llvm-svn: 323398	2018-01-25 01:43:00 +00:00
Aditya Nandakumar	44de88f9fc	[GISel]: Fix modules build by including <cassert> llvm-svn: 323394	2018-01-25 01:16:14 +00:00
Lang Hames	c647b54b9d	[ORC] Try to silence compiler error at http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/17264 NFC. llvm-svn: 323393	2018-01-25 01:05:29 +00:00
Aditya Nandakumar	65202b9329	[GISel]: Implement GlobalISel combiner API. https://reviews.llvm.org/D41373 The various components are GICombinerHelper contains transformations that are common to all targets. Targets can pick and choose which transformations (at function/opcode granularity) each pass uses via configuring a GICombinerInfo. GICombiner contains some common code and it does the traversal, driving of combines, worklist management and iterating until convergence. GICombinerInfo is an interface with a virtual method called combine. The combiner info will allow targets to pick and choose (or implement their own specific combines). CombineInfos can make use of available combines in GICombineHelper to configure the transformations for a particular pass. Currently this approach allows cherry picking transformations from helpers (at function/opcode granularity) and also allows early returning on specific transformations. Targets also get to prioritize whether target specific combines run before/after the opt-in generic combines. Ideally we would like this part to be configured by both C++ and Tablegen. The CombinerInfo also has a field which indicates how to deal with IllegalOps (ie - should we allow to create them/or legalize them?). A CombinerPass would configure a CombinerInfo, create the GICombiner with the Info, and call GICombiner::combineMachineInstrs(MachineFunction&). This organization is very similar to the GISelLegalizer. llvm-svn: 323392	2018-01-25 00:41:58 +00:00
Lang Hames	e2eabc82ff	[ORC] Add helpers for building orc::SymbolResolvers from legacy findSymbol-style functions/methods that return JITSymbols. lookupFlagsWithLegacyFn takes a SymbolNameSet and a legacy lookup function and returns a LookupFlagsResult. It uses the legacy lookup function to search for each symbol. If found, getFlags is called on the symbol and the flags added to the SymbolFlags map. If not found, the symbol is added to the SymbolsNotFound set. lookupWithLegacyFn takes an AsynchronousSymbolQuery, a SymbolNameSet and a legacy lookup function. Each symbol in the SymbolNameSet is searched for via the legacy lookup function. If it is found, its getAddress function is called (triggering materialization if it has not happened already) and the resulting mapping stored in the query. If it is not found the symbol is added to the unresolved symbols set which is returned at the end of the function. If an error occurs during legacy lookup or materialization it is passed to the query via setFailed and the function returns immediately. llvm-svn: 323388	2018-01-24 23:09:07 +00:00
Lang Hames	6e0cc41ad1	[ORC] Add a LambdaSymbolResolver convenience class and docs for SymbolResolver. This patch adds a LambdaSymbolResolver convenience utility that can create an orc::SymbolResolver from a pair of function objects that supply the behavior for the lookupFlags and lookup methods. This class plays the same role for orc::SymbolResolver as the legacy LambdaResolver class plays for LegacyJITSymbolResolver, and will replace the latter class once all ORC APIs are migrated to orc::SymbolResolver. This patch also adds some documentation for the orc::SymbolResolver class as this was left out of the original commit. llvm-svn: 323375	2018-01-24 21:21:10 +00:00
Easwaran Raman	144f3acb63	Revert "[ThinLTO] Add call edges' relative block frequency to per-module summary." Causes buildbot regressions. llvm-svn: 323358	2018-01-24 18:15:29 +00:00
Easwaran Raman	e7546e2838	[ThinLTO] Add call edges' relative block frequency to per-module summary. Summary: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. Reviewers: tejohnson, pcc Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D42212 llvm-svn: 323349	2018-01-24 17:51:23 +00:00
Daniel Sanders	99f8a8b118	[globalisel] Introduce LegalityQuery to better encapsulate the legalizer decisions. NFC. Summary: `getAction(const InstrAspect &) const` breaks encapsulation by exposing the smaller components that are used to decide how to legalize an instruction. This is a problem because we need to change the implementation of LegalizerInfo so that it's able to describe particular type combinations rather than just cartesian products of types. For example, declaring the following setAction({..., 0, s32}, Legal) setAction({..., 0, s64}, Legal) setAction({..., 1, s32}, Legal) setAction({..., 1, s64}, Legal) currently declares these type combinations as legal: {s32, s32} {s64, s32} {s32, s64} {s64, s64} but we currently have no means to say that, for example, {s64, s32} is not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/ G_UNMERGE_VALUES has relationships between the types that are currently described incorrectly. Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics differently to atomics. The necessary information is in the MMO but we have no way to use this in the legalizer. Similarly, there is currently no way for the register type and the memory type to differ so there is no way to cleanly represent extending-load/truncating-store in a way that can't be broken by optimizers (resulting in illegal MIR). This patch introduces LegalityQuery which provides all the information needed by the legalizer to make a decision on whether something is legal and how to legalize it. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner Reviewed By: bogner Subscribers: bogner, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D42244 llvm-svn: 323342	2018-01-24 17:17:46 +00:00
Jonas Devlieghere	bab446772d	[NFC] Make magic number for DJB hash function customizable. This allows us to specify the magic number for the DJB hash function. This feature is needed by dsymutil to emit Apple types accelerator table. llvm-svn: 323341	2018-01-24 16:53:14 +00:00
Sanjay Patel	373af89ec1	[ValueTracking] add recursion depth param to matchSelectPattern We're getting bug reports: https://bugs.llvm.org/show_bug.cgi?id=35807 https://bugs.llvm.org/show_bug.cgi?id=35840 https://bugs.llvm.org/show_bug.cgi?id=36045 ...where we blow up the stack in value tracking because other passes are sending in selects that have an operand that is itself the select. We don't currently have a reliable way to avoid analyzing dead code that may take non-standard forms, so bail out when things go too far. This mimics the recursion depth limitations in other parts of value tracking. Unfortunately, this pushes the underlying problems for other passes (jump-threading, simplifycfg, correlated-propagation) into hiding. If someone wants to uncover those again, the first draft of this patch on Phab would do that (it would assert rather than bail out). Differential Revision: https://reviews.llvm.org/D42442 llvm-svn: 323331	2018-01-24 15:20:37 +00:00
Amjad Aboud	bed9def2b0	Reverted 323321. llvm-svn: 323326	2018-01-24 14:48:49 +00:00
Amjad Aboud	5a41bfbb07	[InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduces the width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321	2018-01-24 12:42:42 +00:00
Malcolm Parsons	12e0bc3d59	Fix typos of occurred and occurrence llvm-svn: 323318	2018-01-24 10:33:39 +00:00
Sander de Smalen	ee2cc50e7b	[Metadata] Extend 'count' field of DISubrange to take a metadata node Summary: This patch extends the DISubrange 'count' field to take either a (signed) constant integer value or a reference to a DILocalVariable or DIGlobalVariable. This is patch [1/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41695 llvm-svn: 323313	2018-01-24 09:56:07 +00:00
Jakub Kuderski	2a246b4a78	[Dominators] Introduce DomTree verification levels Summary: Currently, there are 2 ways to verify a DomTree: * `DT.verify()` -- runs full tree verification and checks all the properties and gives a reason why the tree is incorrect. This is run when EXPENSIVE_CHECKS are enabled or when `-verify-dom-info` flag is set. * `DT.verifyDominatorTree()` -- constructs a fresh tree and compares it against the old one. This does not check any other tree properties (DFS number, levels), nor ensures that the construction algorithm is correct. Used by some passes inside assertions. This patch introduces DomTree verification levels, that try to close the gap between the two ways of checking trees by introducing 3 verification levels: - Full -- checks all properties, but can be slow (O(N^3)). Used when manually requested (e.g. `assert(DT.verify())`) or when `-verify-dom-info` is set. - Basic -- checks all properties except the sibling property, and compares the current tree with a freshly constructed one instead. This should catch almost all errors, but does not guarantee that the construction algorithm is correct. Used when EXPENSIVE checks are enabled. - Fast -- checks only basic properties (reachability, dfs numbers, levels, roots), and compares with a fresh tree. This is meant to replace the legacy `DT.verifyDominatorTree()` and in my tests doesn't cause any noticeable performance impact even in the most pessimistic examples. When used to verify dom tree wrapper pass analysis on sqlite3, the 3 new levels make `opt -O3` take the following amount of time on my machine: - no verification: 8.3s - `DT.verify(VerificationLevel::Fast)`: 10.1s - `DT.verify(VerificationLevel::Basic)`: 44.8s - `DT.verify(VerificationLevel::Full)`: 1m 46.2s (and the previous `DT.verifyDominatorTree()` is within the noise of the Fast level) This patch makes `DT.verifyDominatorTree()` pick between the 3 verification levels depending on EXPENSIVE_CHECKS and `-verify-dom-info`.
Reviewers: dberlin, brzycki, davide, grosser, dmgreen Reviewed By: dberlin, brzycki Subscribers: MatzeB, llvm-commits Differential Revision: https://reviews.llvm.org/D42337 llvm-svn: 323298	2018-01-24 02:40:35 +00:00
Sam Clegg	d27dd1200a	[WebAssembly] Add minor helper functions to WasmObjectFile Also, fix crash when exporting an imported function. Differential Revision: https://reviews.llvm.org/D42454 llvm-svn: 323290	2018-01-24 01:27:17 +00:00
Benjamin Kramer	0cc88116ee	[TblGen] Inline an (almost) trivial accessor. No functionality change. llvm-svn: 323276	2018-01-23 23:03:50 +00:00
Volkan Keles	2152a6d815	Add missing include to fix the failure caused by r323266 llvm-svn: 323274	2018-01-23 22:55:28 +00:00
Volkan Keles	4c29cfd3e4	[llvm-extract] Support extracting basic blocks Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266	2018-01-23 21:51:34 +00:00
Nico Weber	a51f55c712	Introduce errorToBool() helper and use it. errorToBool() converts an Error to a bool and puts the Error in a checked state. No behavior change. https://reviews.llvm.org/D42422 llvm-svn: 323238	2018-01-23 19:03:13 +00:00
Dan Gohman	5878b3b620	[WebAssembly] Add mem.* intrinsics. The grow_memory and current_memory instructions are expected to be officially renamed to mem.grow and mem.size. Introduce new intrinsics with the new names. These new names aren't yet official, so for now, use them at your own risk. Also, take this opportunity to add arguments for the currently unused immediate field in those instructions. llvm-svn: 323222	2018-01-23 17:02:02 +00:00
Ashutosh Nema	2a3b64fc69	This change adds optimization remarks in LoopVersioning LICM pass. Summary: This patch is adding remark messages to the LoopVersioning LICM pass, which will be useful for optimization remark emitter (ORE) infrastructure. Patch by: Deepak Porwal Reviewers: anemet, ashutosh.nema, eastig Subscribers: eastig, vivekvpandya, fhahn, llvm-commits llvm-svn: 323183	2018-01-23 09:47:28 +00:00
Hiroshi Inoue	2f37cdd743	[NFC] fix trivial typos in comments "the the" -> "the" llvm-svn: 323176	2018-01-23 05:49:30 +00:00
David Blaikie	07fc4b0c75	NewPM: Add an extension point for the start of the pipeline. This applies to most pipelines except the LTO and ThinLTO backend actions - it is for use at the beginning of the overall pipeline. This extension point will be used to add the GCOV pass when enabled in Clang. llvm-svn: 323166	2018-01-23 01:25:20 +00:00
Chandler Carruth	5c3f34f10b	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way.
It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. In this case, on x86-64 the thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT.
These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually applying similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches.
If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155	2018-01-22 22:05:25 +00:00
Reid Kleckner	0f305649d8	[CodeGen] Shrink MachineOperand by 8 bytes on Windows Use 'unsigned' for these bitfields so they actually pack together. Previously it used three words for these bits instead of one. Add some static_asserts to prevent this from being undone. llvm-svn: 323135	2018-01-22 17:50:20 +00:00
Eugene Leviant	828889d031	[ThinLTO] Re-commit of dot dumper after test fix llvm-svn: 323116	2018-01-22 13:35:40 +00:00
Pavel Labath	1dd5338ab2	Rename DwarfAcceleratorTable to AppleAcceleratorTable. NFC This frees up the first name to be used as a base class for the apple table and the dwarf5 .debug_names accel table. The rename was split off from D42297 (adding of debug_names support), which is still under review. llvm-svn: 323113	2018-01-22 13:17:23 +00:00
Haojian Wu	d8056b452f	[YAML] Plain scalars can not begin with most indicators. Summary: Discovered when clangd loads YAML symbols, some symbol documentations start with indicators (e.g. "-"), but YAML prints them as plain scalars (no quotes), which makes the YAML parser fail to parse. For these kinds of strings, we need quotes. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, ioeric, llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D42362 llvm-svn: 323097	2018-01-22 10:20:48 +00:00
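
Two of the entries above (924e1d99fc and bab446772d) concern the DJB hash used by the DWARF accelerator tables and its customizable magic number. As a rough illustration of what such a hash computes, here is a minimal standalone sketch. It does not use LLVM headers, and the name `djbHashSketch` and its signature are hypothetical, chosen for this example rather than taken from the commits. It assumes the common DJB formulation, in which each byte updates the hash as `H = H * 33 + byte` and the conventional starting value is 5381.

```cpp
#include <cstdint>
#include <string>

// Illustrative sketch only: djbHashSketch is a hypothetical name, not the
// function added by the commits listed above.
// Each step computes H = H * 33 + byte, wrapping modulo 2^32.
// Seed stands in for the customizable "magic number"; 5381 is the
// conventional DJB starting value.
inline std::uint32_t djbHashSketch(const std::string &Buffer,
                                   std::uint32_t Seed = 5381) {
  std::uint32_t H = Seed;
  for (unsigned char C : Buffer)
    H = (H << 5) + H + C; // (H << 5) + H is H * 33
  return H;
}
```

With the default seed, `djbHashSketch("a")` is 5381 * 33 + 97 = 177670. Passing a different seed, for example `djbHashSketch("a", 0)`, changes only the starting value and gives 97, which is the kind of customization the second commit message describes.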


33413 Commits