llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Eugene Zelenko	4d66583321	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 310766	2017-08-11 21:30:02 +00:00
Eli Friedman	28e7964c1c	[OptDiag] Updating Remarks in SampleProfile Update the remark API to the newer OptimizationDiagnosticInfo API. This allows remarks to show up in the diagnostic YAML file and enables use of the opt-viewer tool. Two remarks (L505 and L751) do not display hotness information, most likely because profile information has not been propagated yet. Unsure if this is the desired outcome. Patch by Tarun Rajendran. Differential Revision: https://reviews.llvm.org/D36127 llvm-svn: 310763	2017-08-11 21:12:04 +00:00
Zachary Turner	6c0b0dd57f	[LLD/PDB] Write actual records to the globals stream. Previously we were writing an empty globals stream. Windows tools interpret this as "private symbols are not present in this PDB", even when they are, so we need to fix this. Regardless, without it we don't have information about global variables, so we need to fix it anyway. This patch does that. With this patch, the "lm" command in WinDbg correctly reports that we have private symbols available, but the "dv" command still refuses to display local variables. Differential Revision: https://reviews.llvm.org/D36535 llvm-svn: 310743	2017-08-11 19:00:03 +00:00
Craig Topper	27d22fe4a6	[AVX512] Remove and autoupgrade many of the broadcast intrinsics Summary: This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time. This leaves the 32x2 intrinsics because they are still used in clang. Reviewers: RKSimon, zvi, igorb Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36606 llvm-svn: 310725	2017-08-11 16:22:45 +00:00
Nirav Dave	c5dcb602f8	[X86][DAG] Switch X86 Target to post-legalized store merge Move store merge to happen after intrinsic lowering to allow lowered stores to be merged. Some regressions in MergeConsecutiveStores due to missing insert_subvector are addressed in a follow-up patch. Reviewers: craig.topper, efriedma, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34559 llvm-svn: 310710	2017-08-11 13:21:35 +00:00
Sjoerd Meijer	f4051e8c3e	[AArch64] Remove dotprod from base extension list Dot product is an optional ARMv8.2a extension; remove it from the ARMv8.2a base extension list. This was introduced in commit r310480. Differential Revision: https://reviews.llvm.org/D36609 llvm-svn: 310708	2017-08-11 13:12:49 +00:00
Chandler Carruth	10f8380767	[PM] Switch the CGSCC debug messages to use the standard LLVM debug printing techniques with a DEBUG_TYPE controlling them. It was a mistake to start re-purposing the pass manager `DebugLogging` variable for generic debug printing -- those logs are intended to be very minimal and primarily used for testing. More detailed and comprehensive logging doesn't make sense there (it would only make for brittle tests). Moreover, we kept forgetting to propagate the `DebugLogging` variable to various places making it also ineffective and/or unavailable. Switching to `DEBUG_TYPE` makes this a non-issue. llvm-svn: 310695	2017-08-11 05:47:13 +00:00
Craig Topper	2d53908029	[DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use. This makes it consistent with STATISTIC, which it will often appear near. While there, move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable, so the namespace is unnecessary. llvm-svn: 310637	2017-08-10 17:48:11 +00:00
Krzysztof Parzyszek	3786e693af	Add "Restored" flag to CalleeSavedInfo The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM. Differential Revision: https://reviews.llvm.org/D36160 llvm-svn: 310619	2017-08-10 16:17:32 +00:00
Nirav Dave	910ea9034d	[X86] Keep dependencies when constructing loads in combineStore Summary: Preserve chain dependencies between old and new loads constructed to prevent loads from reordering below later stores. Fixes PR34088. Reviewers: craig.topper, spatel, RKSimon, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36528 llvm-svn: 310604	2017-08-10 15:12:32 +00:00
Sam Parker	ae56166b53	[ARM][AArch64] ARMv8.3-A enablement The beta ARMv8.3 ISA specifications have been released for AArch64 and AArch32, these can be found at: https://developer.arm.com/products/architecture/a-profile/exploration-tools An introduction to this architecture update can be found at: https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions This patch is the first in a series which will add ARM v8.3-A support in LLVM and Clang. It adds the necessary changes that create targets for both the ARM and AArch64 backends. Differential Revision: https://reviews.llvm.org/D36514 llvm-svn: 310561	2017-08-10 09:41:00 +00:00
Coby Tayree	c6ce400322	[X86][Asm] Allow negative immediate to appear before bracketed expression Currently, only a non-negative immediate is allowed prior to a bracketed expression (memory reference). MASM / GAS have no problem coping with the left side of the real line, so we should be able to as well. Differential Revision: https://reviews.llvm.org/D36229 llvm-svn: 310528	2017-08-09 21:49:17 +00:00
Lang Hames	08671757e4	[RuntimeDyld][ORC] Add support for Thumb mode to RuntimeDyldMachOARM. This patch adds support for thumb relocations to RuntimeDyldMachOARM, and adds a target-specific flags field to JITSymbolFlags (so that on ARM we can record whether each symbol is Thumb-mode code). RuntimeDyldImpl::emitSection is modified to ensure that stubs memory is correctly aligned based on the size returned by getStubAlignment(). llvm-svn: 310517	2017-08-09 20:19:27 +00:00
David Blaikie	0b02c3b093	PointerLikeTypeTraits: class->struct & remove the base definition This simplifies implementations and removing the base definition paves the way for detecting whether a type is 'pointer like'. llvm-svn: 310507	2017-08-09 18:34:21 +00:00
Mandeep Singh Grang	f347410ced	[COFF, ARM64] Add MS builtins __dmb, __dsb, __isb Reviewers: mstorsjo, rnk, ruiu, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36110 llvm-svn: 310502	2017-08-09 17:58:39 +00:00
Nuno Lopes	5a8769dcd9	CFLAA: return MustAlias when pointers p, q are equal, i.e., must-alias(p, sz_p, p, sz_q) irrespective of access sizes sz_p, sz_q As discussed a couple of weeks ago on the ML. This makes the behavior consistent with that of BasicAA. AA clients already check the obj size themselves and may not require the obj size to match exactly the access size (e.g., in case of store forwarding) llvm-svn: 310495	2017-08-09 17:02:18 +00:00
Sjoerd Meijer	185fe0c128	[AArch64] Assembler support for the ARMv8.2a dot product instructions Dot product is an optional ARMv8.2a extension, see also the public architecture specification here: https://developer.arm.com/products/architecture/a-profile/exploration-tools. This patch adds AArch64 assembler support for these dot product instructions. Differential Revision: https://reviews.llvm.org/D36515 llvm-svn: 310480	2017-08-09 14:59:54 +00:00
Benoit Belley	2452e8e7fe	[Support] PR33388 - Fix formatv_object move constructor formatv_object currently uses the implicitly defined move constructor, but it is buggy. In typical use cases, the problem doesn't show up because all calls to the move constructor are elided. Thus, the buggy constructors are never invoked. The issue especially shows up when code is compiled using the -fno-elide-constructors compiler flag. For instance, this is useful when attempting to collect accurate code coverage statistics. The exact issue is the following: The Parameters data member is correctly moved, thus making the parameters occupy a new memory location in the target object. Unfortunately, the default copying of the Adapters blindly copies the vector of pointers, leaving each of these pointers referencing the parameters in the original object instead of the copied one. These pointers quickly become dangling when the original object is deleted. This quickly leads to crashes. The solution is to update the Adapters pointers when performing a move. The copy constructor isn't useful for format objects and can thus be deleted. This resolves PR33388. Differential Revision: https://reviews.llvm.org/D34463 llvm-svn: 310475	2017-08-09 13:47:01 +00:00
Jonas Paulsson	54a000e514	[LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess() isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere. Review: Quentin Colombet, Ulrich Weigand https://reviews.llvm.org/D35933 llvm-svn: 310463	2017-08-09 11:28:01 +00:00
Chandler Carruth	d7fd660b9a	[LCG] Switch one of the update methods for the LazyCallGraph to support limited batch updates. Specifically, allow removing multiple reference edges starting from a common source node. There are a few constraints that play into supporting this form of batching: 1) The way updates occur during the CGSCC walk, about the most we can functionally batch together are those with a common source node. This also makes the batching simpler to implement, so it seems a worthwhile restriction. 2) The far and away hottest function for large C++ files I measured (generated code for protocol buffers) showed a huge amount of time was spent removing ref edges specifically, so it seems worth focusing there. 3) The algorithm for removing ref edges is very amenable to this restricted batching. There are just both API and implementation special casing for the non-batch case that gets in the way. Once removed, supporting batches is nearly trivial. This does modify the API in an interesting way -- now, we only preserve the target RefSCC when the RefSCC structure is unchanged. In the face of any splits, we create brand new RefSCC objects. However, all of the users were OK with it that I could find. Only the unittest needed interesting updates here. How much does batching these updates help? I instrumented the compiler when run over a very large generated source file for a protocol buffer and found that the majority of updates are intrinsically updating one function at a time. However, nearly 40% of the total ref edges removed are removed as part of a batch of removals greater than one, so these are the cases batching can help with. When compiling the IR for this file with 'opt' and 'O3', this patch reduces the total time by 8-9%. Differential Revision: https://reviews.llvm.org/D36352 llvm-svn: 310450	2017-08-09 09:05:27 +00:00
Zachary Turner	022830a97d	Fix -Wreorder-fields warning. llvm-svn: 310440	2017-08-09 04:34:11 +00:00
Zachary Turner	d0823e0006	[PDB] Fix an issue writing the publics stream. In the refactor to merge the publics and globals stream, a bug was introduced that wrote the wrong value for one of the fields of the PublicsStreamHeader. This caused debugging in WinDbg to break. We had no way of dumping any of these fields, so in addition to fixing the bug I've added dumping support for them along with a test that verifies the correct value is written. llvm-svn: 310439	2017-08-09 04:23:59 +00:00
Zachary Turner	62cb11667a	[PDB] Merge Global and Publics Builders. The publics stream and globals stream are very similar. They both contain a list of hash buckets that refer into a single shared stream, the symbol record stream. Because of the need for each builder to manage both an independent hash stream as well as a single shared record stream, making the two builders be independent entities is not the right design. This patch merges them into a single class, of which only a single instance is needed to create all 3 streams. PublicsStreamBuilder and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder class, which writes all 3 streams at once. Note that this patch does not contain any functionality change. So we're still not yet writing any records to the globals stream. All we're doing is making it so that when we do start writing records to the globals, this refactor won't have to be part of that patch. Differential Revision: https://reviews.llvm.org/D36489 llvm-svn: 310438	2017-08-09 04:23:25 +00:00
Quentin Colombet	3f63039f98	Revert "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310115. It causes a linker failure for one of the AArch64 unittests on one of the Linux bots: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON. However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425	2017-08-08 22:22:30 +00:00
Wei Mi	6484ea55e4	[GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in the phiTranslate cache become stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time. The stale entries in the phitranslate cache will not affect correctness but will cause missed PRE opportunities for later instructions. Differential Revision: https://reviews.llvm.org/D36124 llvm-svn: 310421	2017-08-08 21:40:14 +00:00
Connor Abbott	1a5a919d2d	[AMDGPU] Add llvm.amdgcn.update.dpp intrinsic Summary: Now that we've made all the necessary backend changes, we can add a new intrinsic which exposes the new capabilities to IR producers. Since llvm.amdgcn.update.dpp is a strict superset of llvm.amdgcn.mov.dpp, we should deprecate the latter. We also add tests for all the functionality that was added in previous changes, now that we can access it via an IR construct. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34718 llvm-svn: 310399	2017-08-08 18:52:22 +00:00
Zachary Turner	ae9d9f3bb3	[PDB] Fix linking of function symbols and local variables. The compiler outputs PROC32_ID symbols into the object files for functions, and these symbols have an embedded type index which, when copied to the PDB, refers to the IPI stream. However, the symbols themselves are also converted into regular symbols (e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular symbol records refer to the TPI stream. So this patch applies two fixes to function records. 1. It converts ID symbols to the proper non-ID record type. 2. After remapping the type index from the object file's index space to the PDB file/IPI stream's index space, it then remaps that index to the TPI stream's index space. Besides functions, during the remapping process we were also discarding symbol record types which we did not recognize. In particular, we were discarding S_BPREL32 records, which is what MSVC uses to describe local variables on the stack. So this patch fixes that as well by copying them to the PDB. Differential Revision: https://reviews.llvm.org/D36426 llvm-svn: 310394	2017-08-08 18:34:44 +00:00
Sanjoy Das	902454db3f	[DomTree] Use a non-recursive DFS instead of a recursive one; NFC Summary: The recursive DFS can stack overflow in pathological cases. Reviewers: kuhar Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36442 llvm-svn: 310383	2017-08-08 17:15:29 +00:00
Craig Topper	a6da68c365	[KnownBits][ValueTracking] Move the math for calculating known bits for add/sub into a static method in KnownBits object I want to reuse this code in SimplifyDemandedBits handling of Add/Sub. This will make that easier. Wonder if we should use it in SelectionDAG's computeKnownBits too. Differential Revision: https://reviews.llvm.org/D36433 llvm-svn: 310378	2017-08-08 16:29:35 +00:00
Daniel Sanders	364773e4e5	[globalisel][tablegen] Add support for importing 'imm' operands. Summary: This patch enables the import of rules containing 'imm' operands that do not constrain the acceptable values using predicates. Support for ImmLeaf will arrive in a later patch. Depends on D35681 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35833 llvm-svn: 310343	2017-08-08 10:44:31 +00:00
Craig Topper	ad78181353	[KnownBits] Fix copy pasto in comment. NFC llvm-svn: 310320	2017-08-07 22:35:55 +00:00
Reid Kleckner	9c35a9d645	[Object] Initialize LoadConfig member to null Executables may not contain a load config, and clients should be able to test for nullability. Previously we'd return uninitialized memory. Now getLoadConfig32/64 return valid pointers or null. Fixes PR34108 llvm-svn: 310308	2017-08-07 21:23:38 +00:00
Alexey Bataev	9f19665894	[SLP] General improvements of SLP vectorization process. This patch tries to improve the two-pass vectorization analysis in the SLP vectorizer. What it does: 1. Defines key nodes, which are the vectorization roots. Previously, vectorization started if a StoreInst or ReturnInst was found. Now vectorization starts for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in an array. This array is processed only after vectorization of the first key node following these instructions is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310260	2017-08-07 15:25:49 +00:00
Matt Arsenault	cee1e1c818	Fix typo in comment llvm-svn: 310259	2017-08-07 14:58:43 +00:00
Alexey Bataev	92afaf2479	Revert "[SLP] General improvements of SLP vectorization process." This reverts commit r310255. llvm-svn: 310257	2017-08-07 14:51:52 +00:00
Alexey Bataev	ec62fc0fc9	[SLP] General improvements of SLP vectorization process. Summary: This patch tries to improve the two-pass vectorization analysis in the SLP vectorizer. What it does: 1. Defines key nodes, which are the vectorization roots. Previously, vectorization started if a StoreInst or ReturnInst was found. Now vectorization starts for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in an array. This array is processed only after vectorization of the first key node following these instructions is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310255	2017-08-07 14:03:17 +00:00
Chandler Carruth	49c5e16507	[ADT] Add a much simpler loop to DenseMap::clear when the types are POD-like and we can just splat the empty key across memory. Sadly we can't optimize the normal loop well enough because we can't turn the conditional store into an unconditional store according to the memory model. This loop actually showed up in a profile of code that was calling clear as a serious source of time. =[ llvm-svn: 310189	2017-08-05 22:48:37 +00:00
Chandler Carruth	c777bdad5b	[LCG] Completely remove the parent set and leaf tracking for RefSCCs. After the previous series of patches, this is now trivial and deletes a pretty astonishing amount of complexity. This has been a long time coming, as the move toward a PO sequence of RefSCCs started eroding the underlying use cases for this half of the data structure. Among the biggest advantages here is that now there aren't two independent data structures that need to stay in sync. Some of my profiling has also indicated that updating the parent sets was among the most expensive parts of the lazy call graph. Eliminating it wholesale is likely to be a nice win in terms of compile time. Last but not least, I had discussed with some folks previously keeping it around for asserts and other correctness checking, but once the fundamentals of the parent and child checking were implemented without the parent sets their value in correctness checking was tiny and nowhere near worth the cost of the complexity required to keep everything up-to-date. llvm-svn: 310171	2017-08-05 07:37:00 +00:00
Chandler Carruth	50e3192084	[LCG] Re-implement the basic isParentOf, isAncestorOf, isChildOf, and isDescendantOf methods on RefSCCs in terms of the forward edges rather than the parent sets. This is technically slower, but probably not interestingly slower, and all of these routines were already so expensive that they're guarded behind both !NDEBUG and EXPENSIVE_CHECKS. This removes another non-critical usage of parent sets. I've also added some comments to try and help clarify to any potential users the costs of these routines. They're mostly useful for debugging, asserts, or other queries. llvm-svn: 310170	2017-08-05 06:24:09 +00:00
Chandler Carruth	be2977a4ea	[LCG] Add the concept of a "dead" node and use it to avoid a complex walk over the parent set. When removing a single function from the call graph, we previously would walk the entire RefSCC's parent set and then walk every outgoing edge just to find the ones to remove. In addition to this being quite high complexity in theory, it is also the last fundamental use of the parent sets. With this change, when we remove a function we transform the node containing it to be recognizably "dead" and then teach the edge iterators to recognize edges to such nodes and skip them the same way they skip null edges. We can't move fully to using "dead" nodes -- when disconnecting two live nodes we need to null out the edge. But the complexity this adds to the edge sequence isn't too bad and the simplification of lazily handling this seems like a significant win. llvm-svn: 310169	2017-08-05 05:47:37 +00:00
Joel Jones	cee6711d56	[AArch64] LSE Atomics reorg - part 1 Add memory synchronization semantics to LSE Atomics. The memory semantics feature will be added in a subsequent patch. In this patch, several corrections were added to the existing LSE Atomics implementation, based on the ARM Errata D11904 from 05/12/2017. Patch by: steleman Differential Revision: https://reviews.llvm.org/D35319 llvm-svn: 310167	2017-08-05 04:30:55 +00:00
Chandler Carruth	cd13c2cca4	[LCG] Replace an implicit bool operator with a named function. (NFC) The definition of 'false' here was already pretty vague and debatable, and I'm about to add another potential 'false' that would actually make much more sense in a bool operator. Especially given how rarely this is used, a nicely named method seems better. llvm-svn: 310165	2017-08-05 04:04:06 +00:00
Adrian McCarthy	ee6fb7079a	Enable llvm-pdbutil to list enumerations using native PDB reader This extends the native reader to enable llvm-pdbutil to list the enums in a PDB and it includes a simple test. It does not yet list the values in the enumerations, which requires an actual implementation of NativeEnumSymbol::FindChildren. To exercise this code, use a command like: llvm-pdbutil pretty -native -enums foo.pdb Differential Revision: https://reviews.llvm.org/D35738 llvm-svn: 310144	2017-08-04 22:37:58 +00:00
Adrian Prantl	1bed051b67	Remove unused include directive and un-break the module build. llvm-svn: 310124	2017-08-04 20:41:37 +00:00
Quentin Colombet	0a7c56803e	[GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. llvm-svn: 310115	2017-08-04 20:15:46 +00:00
Connor Abbott	277c5ff889	[AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic Summary: This intrinsic lets us set inactive lanes to an identity value when implementing wavefront reductions. In combination with Whole Wavefront Mode, it lets inactive lanes be skipped over as required by GLSL/Vulkan. Lowering the intrinsic needs to happen post-RA so that RA knows that the destination isn't completely overwritten due to the EXEC shenanigans, so we need another pseudo-instruction to represent the un-lowered intrinsic. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34719 llvm-svn: 310088	2017-08-04 18:36:54 +00:00
Connor Abbott	c83a4aedcc	[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Mode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 llvm-svn: 310087	2017-08-04 18:36:52 +00:00
Connor Abbott	547b308884	[AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM Summary: Previously, we assumed that certain types of instructions needed WQM in pixel shaders, particularly DS instructions and image sampling instructions. This was ok because with OpenGL, the assumption was correct. But we want to start using DPP instructions for derivatives as well as other things, so the assumption that we can infer whether to use WQM based on the instruction won't continue to hold. This intrinsic lets frontends like Mesa indicate what things need WQM based on their knowledge of the API, rather than second-guessing them in the backend. We need to keep around the old method of enabling WQM, but eventually we should remove it once Mesa catches up. For now, this will let us use DPP instructions for computing derivatives correctly. Reviewers: arsenm, tpr, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35167 llvm-svn: 310085	2017-08-04 18:36:49 +00:00
Marcello Maggioni	7e0b2ca4fd	[MachineOperand] Add ChangeToTargetIndex method. NFC Differential Revision: https://reviews.llvm.org/D36301 llvm-svn: 310083	2017-08-04 18:24:09 +00:00
Reid Kleckner	c3417d6f71	[Support] Remove getPathFromOpenFD, it was unused Summary: It was added to support clang warnings about includes with case mismatches, but it ended up not being necessary. Reviewers: twoh, rafael Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36328 llvm-svn: 310078	2017-08-04 17:43:49 +00:00
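
The formatv_object entry in the log (r310475) describes a class that stores its parameters by value and also keeps a vector of pointers to them, so a defaulted move leaves the new object's pointers aimed at the old object. The following is a minimal sketch of that failure mode and the described fix, not the actual LLVM code: `ParamPack`, `rebindAdapters`, and `adaptersPointIntoSelf` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an object that owns parameters by value and keeps
// pointers to them (this is NOT the real formatv_object).
class ParamPack {
  int First;
  int Second;
  std::vector<const int *> Adapters; // points at First and Second

  // Re-point the adapters at this object's own members.
  void rebindAdapters() { Adapters = {&First, &Second}; }

public:
  ParamPack(int A, int B) : First(A), Second(B) { rebindAdapters(); }

  // A defaulted move constructor would move the pointer vector verbatim,
  // leaving the pointers aimed at Other's members. Rebinding fixes that.
  ParamPack(ParamPack &&Other) : First(Other.First), Second(Other.Second) {
    rebindAdapters();
  }

  // Copying is not needed for this kind of object, so it is deleted.
  ParamPack(const ParamPack &) = delete;
  ParamPack &operator=(const ParamPack &) = delete;

  // True when every adapter points into this object rather than another one.
  bool adaptersPointIntoSelf() const {
    return Adapters.size() == 2 && Adapters[0] == &First &&
           Adapters[1] == &Second;
  }

  // Read the parameters through the adapters, as a formatter would.
  int sum() const {
    assert(adaptersPointIntoSelf() && "adapters must not dangle");
    return *Adapters[0] + *Adapters[1];
  }
};
```

Constructing `ParamPack B(std::move(A))` and then calling `B.sum()` reads only `B`'s own members, so `A` can be destroyed without leaving `B` with dangling pointers.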

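
The DomTree entry in the log (r310383) replaces a recursive DFS with a non-recursive one because recursion can overflow the stack in pathological cases. As a rough illustration of the general technique only, here is a sketch of an iterative preorder DFS over a plain adjacency list; the `Graph` alias and `dfsPreorder` are assumptions for this example and do not mirror the dominator tree implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical graph representation: G[N] lists the successors of node N.
using Graph = std::vector<std::vector<int>>;

// Preorder DFS using an explicit stack of (node, next-successor-index) pairs,
// so traversal depth is bounded by heap memory rather than the call stack.
std::vector<int> dfsPreorder(const Graph &G, int Root) {
  assert(Root >= 0 && static_cast<std::size_t>(Root) < G.size());
  std::vector<int> Order;
  std::vector<bool> Visited(G.size(), false);
  std::vector<std::pair<int, std::size_t>> Stack;

  Visited[Root] = true;
  Order.push_back(Root);
  Stack.emplace_back(Root, 0);

  while (!Stack.empty()) {
    const int Node = Stack.back().first;
    const std::size_t Next = Stack.back().second;
    if (Next == G[Node].size()) {
      Stack.pop_back(); // all successors handled; "return" from this node
      continue;
    }
    Stack.back().second = Next + 1; // remember where to resume
    const int Succ = G[Node][Next];
    if (Visited[Succ])
      continue;
    Visited[Succ] = true;
    Order.push_back(Succ);
    Stack.emplace_back(Succ, 0); // "recurse" into the successor
  }
  return Order;
}
```

Keeping the next-successor index on the explicit stack makes the visit order match what the recursive formulation would produce, which matters when a change is meant to be NFC.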

32207 Commits