llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Lang Hames	310a62559f	[ORC] Add a "lazy call-through" utility based on the same underlying trampoline implementation as lazy compile callbacks, and a "lazy re-exports" utility that builds lazy call-throughs. Lazy call-throughs are similar to lazy compile callbacks (and are based on the same underlying state saving/restoring trampolines) but resolve their targets by performing a standard ORC lookup rather than invoking a user supplied compiler callback. This allows them to inherit the thread-safety of ORC lookups while blocking only the calling thread (whereas compile callbacks also block one compile thread). Lazy re-exports provide a simple way of building lazy call-throughs. Unlike a regular re-export, a lazy re-export generates a new address (a stub entry point) that will act like the re-exported symbol when called. The first call via a lazy re-export will trigger compilation of the re-exported symbol before calling through to it. llvm-svn: 343061	2018-09-26 04:18:30 +00:00
Lang Hames	4a600aef42	[ORC] Fix BuildingAJIT tutorial examples that were broken by r343059. createLocalCompileCallbackManager now returns an Expected value. This commit wraps the call with cantFail to unwrap it. llvm-svn: 343060	2018-09-26 04:00:58 +00:00
Lang Hames	fceb425485	[ORC] Refactor trampoline pool management out of JITCompileCallbackManager. This will allow trampoline pools to be re-used for a new lazy-reexport utility that looks up function bodies using the standard symbol lookup process (rather than using a user provided compile function). This new utility provides the same capabilities (since MaterializationUnits already allow user supplied compile functions to be run) as JITCompileCallbackManager, but can use the new asynchronous lookup functions to avoid blocking a compile thread. This patch also updates createLocalCompileCallbackManager to return an error if a callback manager cannot be created, and updates clients of that API to account for the change. Finally, the OrcCBindingsStack is updated so that if a callback manager is not available for the target platform a valid stack (without support for lazy compilation) can still be constructed. llvm-svn: 343059	2018-09-26 03:32:12 +00:00
Lang Hames	49b25f7a75	[ORC] Add support for multithreaded compiles to LLJIT and LLLazyJIT. LLJIT and LLLazyJIT can now be constructed with an optional NumCompileThreads argument. If this is non-zero then a thread-pool will be created with the given number of threads, and compile tasks will be dispatched to the thread pool. To enable testing of this feature, two new flags are added to lli: (1) -compile-threads=N (N = 0 by default) controls the number of compile threads to use. (2) -thread-entry can be used to execute code on additional threads. For each -thread-entry argument supplied (multiple are allowed) a new thread will be created and the given symbol called. These additional thread entry points are called after static constructors are run, but before main. llvm-svn: 343058	2018-09-26 02:39:42 +00:00
Lang Hames	de6b20d3dc	[ORC] Include-what-you-use fixes. llvm-svn: 343057	2018-09-26 02:01:39 +00:00
Lang Hames	80e388bda7	[ORC] Fix a missing include in r343055. llvm-svn: 343056	2018-09-26 01:54:13 +00:00
Lang Hames	7d9758f33a	[ORC] Add ThreadSafeModule and ThreadSafeContext wrappers to support concurrent compilation of IR in the JIT. ThreadSafeContext is a pair of an LLVMContext and a mutex that can be used to lock that context when it needs to be accessed from multiple threads. ThreadSafeModule is a pair of a unique_ptr<Module> and a shared_ptr<ThreadSafeContext>. This allows the lifetime of a ThreadSafeContext to be managed automatically in terms of the ThreadSafeModules that refer to it: Once all modules using a ThreadSafeContext are destructed, and providing the client has not held on to a copy of the shared context pointer, the context will be automatically destructed. This scheme is necessary due to the following constraints: (1) We need multiple contexts for multithreaded compilation (at least one per compile thread plus one to store any IR not currently being compiled, though one context per module is simpler). (2) We need to free contexts that are no longer being used so that the JIT does not leak memory over time. (3) Module lifetimes are not predictable (modules are compiled as needed depending on the flow of JIT'd code) so there is no single point where contexts could be reclaimed. JIT clients not using concurrency can safely use one ThreadSafeContext for all ThreadSafeModules. JIT clients who want to be able to compile concurrently should use a different ThreadSafeContext for each module, or call setCloneToNewContextOnEmit on their top-level IRLayer. The former reduces compile latency (since no clone step is needed) at the cost of additional memory overhead for uncompiled modules (as every uncompiled module will duplicate the LLVM types, constants and metadata that have been shared). llvm-svn: 343055	2018-09-26 01:24:12 +00:00
Vyacheslav Zakharin	0bb50c0ba6	Remove LoopID metadata from the branch instruction that follows the peeled iterations. Differential Revision: https://reviews.llvm.org/D52176 llvm-svn: 343054	2018-09-26 01:03:21 +00:00
Zhaoshi Zheng	b303e00c9b	Revert "Revert "[ConstHoist] Do not rebase single (or few) dependent constant"" This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346. Reland r342994: disabled the optimization and explicitly enable it in test. -mllvm -consthoist-min-num-to-rebase<unsigned>=0 [ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 343053	2018-09-26 00:59:09 +00:00
Thomas Lively	557b2fcc24	[WebAssembly] SIMD conversions Summary: Lowers (s|u)itofp and fpto(s|u)i instructions for vectors. The fp to int conversions produce poison values if their arguments are out of the convertible range, so a future CL will have to add an LLVM intrinsic to make the saturating behavior of this conversion usable. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52372 llvm-svn: 343052	2018-09-26 00:34:36 +00:00
Craig Topper	173aa6dc3e	[DAGCombiner] Remove unnecessary check for visitSDIVLike/visitUDIVLike returning a UDIVREM or SDIVREM node. This shouldn't be possible and is a leftover from when we used to recursively call combine here. llvm-svn: 343049	2018-09-25 23:52:07 +00:00
Stanislav Mekhanoshin	011d08c69a	[AMDGPU] Fix ds combine with subregs Differential Revision: https://reviews.llvm.org/D52522 llvm-svn: 343047	2018-09-25 23:33:18 +00:00
Craig Topper	f8c4c3a58a	[X86] Allow movmskpd/ps ISD nodes to be created and selected with integer input types. This removes an int->fp bitcast between the surrounding code and the movmsk. I had already added a hack to combineMOVMSK to try to look through this bitcast to improve the SimplifyDemandedBits there. But I found an additional issue where the bitcast was preventing combineMOVMSK from being called again after earlier nodes in the DAG are optimized. The bitcast gets revisited, but not the user of the bitcast. By using integer types throughout, the bitcast doesn't get in the way. llvm-svn: 343046	2018-09-25 23:28:27 +00:00
Craig Topper	759d13ef35	[X86] Add some more movmsk test cases. NFC These IR patterns represent the exact behavior of a movmsk instruction using (zext (bitcast (icmp slt X, 0))). For the v4i32/v8i32/v2i64/v4i64 we currently emit a PCMPGT for the icmp slt which is unnecessary since we only care about the sign bit of the result. This is because of the int->fp bitcast we put on the input to the movmsk nodes for these cases. I'll be fixing this in a future patch. llvm-svn: 343045	2018-09-25 23:28:24 +00:00
Lang Hames	be5c285309	[ORC] Add an asynchronous jit-link function, jitLinkForORC, to RuntimeDyld and switch RTDyldObjectLinkingLayer2 to use it. RuntimeDyld::loadObject is currently a blocking operation. This means that any JIT'd code whose call-graph contains an embedded complete K graph will require at least K threads to link, which precludes the use of a fixed sized thread pool for concurrent JITing of arbitrary code (whatever K the thread-pool is set at, any code with a K+1 complete subgraph will deadlock at JIT-link time). To address this issue, this commit introduces a function called jitLinkForORC that uses continuation-passing style to pass the fix-up and finalization steps to the asynchronous symbol resolver interface so that linking can be performed without blocking. llvm-svn: 343043	2018-09-25 22:57:44 +00:00
Sanjay Patel	4db9800f13	[InstCombine] add fneg variation of shuffle-binop fold; NFC If the fsub in this pattern was replaced by an actual fneg instruction, we would need to add a fold to recognize that because fneg would not be a binop. llvm-svn: 343041	2018-09-25 22:48:58 +00:00
Changpeng Fang	1e204ff173	AMDGPU: Add Selection patterns to support add of one bit. Summary: We generate s_xor to lower add of i1s in general cases, and s_not to lower add with a one-bit imm of -1 (true). Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D52518 llvm-svn: 343030	2018-09-25 21:21:18 +00:00
Anna Thomas	146a9e87c5	[LV][LAA] Vectorize loop invariant values stored into loop invariant address Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 llvm-svn: 343028	2018-09-25 20:57:20 +00:00
Craig Topper	ff15ca143c	[MCAsmParser] Move AltMacroMode tracking out of MCAsmLexer The Lexer doesn't use this state itself. It is only set and used by AsmParser so it seems like it should just be part of AsmParser. Differential Revision: https://reviews.llvm.org/D52515 llvm-svn: 343027	2018-09-25 20:55:55 +00:00
Simon Pilgrim	7b9017da12	[X86] combineUIntToFP - Fix UINT_TO_FP(vXi1) comment (PR39078) llvm-svn: 343026	2018-09-25 20:52:08 +00:00
Lang Hames	bd45c5f4db	Remove 'orc' namespace from MSVCErrorWorkarounds.h, fix some typos that were breaking windows builds. The 'orc' namespace was accidentally left in when the workarounds were moved out of orc in r343011. llvm-svn: 343025	2018-09-25 20:48:57 +00:00
Lang Hames	5324e0c217	Fix a missing include and a use of the MSVC promise/future workaround that were left out of r343011/r343012. llvm-svn: 343022	2018-09-25 20:16:06 +00:00
Teresa Johnson	44ebb78e5e	[ThinLTO] Efficiency fix for writing type id records in per-module indexes Summary: In D49565/r337503, the type id record writing was fixed so that only referenced type ids were emitted into each per-module index for ThinLTO distributed builds. However, this still left an efficiency issue: each per-module index checked all type ids for membership in the referenced set, yielding O(M*N) performance (M indexes and N type ids). Change the TypeIdMap in the summary to be indexed by GUID, to facilitate correlating with type identifier GUIDs referenced in the function summary TypeIdInfo structures. This allowed simplifying other places where a map from type id GUID to type id map entry was previously being used to aid this correlation. Also fix AsmWriter code to handle the rare case of type id GUID collision. For a large internal application, this reduced the thin link time by almost 15%. Reviewers: pcc, vitalybuka Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51330 llvm-svn: 343021	2018-09-25 20:14:40 +00:00
Craig Topper	2008068a9a	[MC] Return a std::string instead of taking it as an out parameter. Make two parser methods into static functions at file scope. NFC llvm-svn: 343020	2018-09-25 20:13:55 +00:00
Heejin Ahn	ab02666df3	Unify landing pad information adding routines (NFC) Summary: We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`, both of which add landing pad information to populate `LandingPadInfo` but are called from different locations, which was confusing. This patch unifies them with one `MachineFunction::addLandingPad` function, which now has the functionality of both functions. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52428 llvm-svn: 343018	2018-09-25 19:56:44 +00:00
Lang Hames	276d90bd36	[ORC] Reapply r342939 with a fix for MSVC's promise/future restrictions. llvm-svn: 343012	2018-09-25 19:48:46 +00:00
Lang Hames	d1f4173f73	Move MSVC workarounds for future<Error>/future<Expected<T>> out of ORC and into a header in support. MSVC's std::future implementation requires types to be default constructible, but Error and Expected are not. This issue came up once before in ORC's RPCUtils.h header and was worked around there but came up again in r342939, so I am moving the workaround to Support to make it available to other clients. llvm-svn: 343011	2018-09-25 19:48:44 +00:00
Craig Topper	4a4034fc60	[MC] Fix bad indentation and 80 column violations. Use StringRef::front instead of dereferencing StringRef::begin. NFC llvm-svn: 343010	2018-09-25 19:37:35 +00:00
Sanjay Patel	50d6ec057c	[x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449) This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749 We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. Ie, splitting to 128-bit does not look like a win in most cases with >1 256-bit op. The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence: t29: v8i32 = concat_vectors t27, t28 t30: v4i64 = bitcast t29 t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ... t31: v4i64 = bitcast t18 t32: v4i64 = xor t30, t31 t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ... t34: v4i64 = bitcast t9 t35: v4i64 = and t32, t34 t36: v8i32 = bitcast t35 t37: v4i32 = extract_subvector t36, Constant:i64<0> t38: v4i32 = extract_subvector t36, Constant:i64<4> Differential Revision: https://reviews.llvm.org/D52318 llvm-svn: 343008	2018-09-25 19:09:34 +00:00
Yury Delendik	da8773b8d1	[WebAssembly] Move/clone DBG_VALUE during WebAssemblyRegStackify pass Summary: The MoveForSingleUse or MoveAndTeeForMultiUse functions move wasm instructions, but the associated DBG_VALUE instructions stayed unchanged. This patch moves or clones them as well. Reviewers: dschuff Reviewed By: dschuff Subscribers: mattd, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits, aardappel Tags: #debug-info Differential Revision: https://reviews.llvm.org/D49034 llvm-svn: 343007	2018-09-25 18:59:34 +00:00
Jessica Paquette	2539089260	Revert "[ConstHoist] Do not rebase single (or few) dependent constant" This caused a couple test failures on a bot: CodeGen/X86/constant-hoisting-bfi.ll Transforms/ConstantHoisting/X86/ehpad.ll Example: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/ llvm-svn: 343005	2018-09-25 18:41:40 +00:00
Daniil Fukalov	991f9ea5b7	[RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled For the AMDGPU target, if an MBB contains an exec mask restore preamble, SplitEditor may reach a state in which it cannot insert a spill instruction. E.g. for a MIR bb.100: %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec if the regalloc tries to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert a spill instruction before the S_OR_SAVEEXEC_B64 instruction. But this is not possible, since it can generate incorrect code in terms of the exec mask. The change makes regalloc ignore such physreg candidates. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D52052 llvm-svn: 343004	2018-09-25 18:37:38 +00:00
Craig Topper	3bf4ee9d1e	[MC] Replace NULL constant in code with nullptr. llvm-svn: 343003	2018-09-25 18:33:00 +00:00
Daniel Sanders	091f90c030	[globalisel][tblgen] Table optimization should consider the C++ code in C++ predicates This fixes PR39045 llvm-svn: 342997	2018-09-25 17:59:02 +00:00
Zhaoshi Zheng	bf32af5c9a	[ConstHoist] Do not rebase single (or few) dependent constant If an instance (InsertionPoint or IP) of Base constant A has only one or few rebased constants depending on it, do NOT rebase. One extra ADD instruction is required to materialize each rebased constant, assuming A and the rebased have the same materialization cost. Differential Revision: https://reviews.llvm.org/D52243 llvm-svn: 342994	2018-09-25 17:45:37 +00:00
Justin Bogner	245caff12b	Revert "[DebugInfo] Do not generate address info for removed debug labels." The added test is failing on macOS: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/ This reverts r342943. llvm-svn: 342993	2018-09-25 17:29:30 +00:00
Craig Topper	e41e84b57b	[X86] Add AVX512 support to combineVectorSizedSetCCEquality. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52424 llvm-svn: 342989	2018-09-25 16:27:12 +00:00
Sanjay Patel	55fffb978e	[InstCombine] narrow binops on concatenated vectors (PR33026) The motivating case from: https://bugs.llvm.org/show_bug.cgi?id=33026 ...has no shuffles now. This kind of pattern may occur during vectorization when targets have lumpy ISAs like SSE/AVX. llvm-svn: 342988	2018-09-25 15:57:37 +00:00
Nirav Dave	36a936ebf7	[ARM] Share predecessor bookkeeping in CombineBaseUpdate. NFCI. llvm-svn: 342987	2018-09-25 15:30:47 +00:00
Nirav Dave	c26720b88c	[AArch64] Share search bookkeeping in combines. NFCI. Share predecessor search bookkeeping in both performPostLD1Combine and performNEONPostLDSTCombine. This should be approximately a 4x and 2x performance improvement. llvm-svn: 342986	2018-09-25 15:30:22 +00:00
Nirav Dave	9c9fb30cc3	[LegalizeDAG] Prune Predecessor check in ExpandExtractFromVectorThroughStack. NFCI. llvm-svn: 342985	2018-09-25 15:29:57 +00:00
Nirav Dave	b68f8ddcb9	[DAGCombine] Improve Predecessor check in SimplifySelectOps. NFCI. Reuse search space bookkeeping across multiple predecessor checks to avoid redundancy. This should cut search cost by ~4x. llvm-svn: 342984	2018-09-25 15:29:30 +00:00
Nirav Dave	049e71fd4e	[DAGCombine] Share predecessor bookkeeping in CombineToPostIndexedLoadStore. NFCI. llvm-svn: 342983	2018-09-25 15:29:04 +00:00
Guillaume Chatelet	d867442cfe	[llvm-exegesis] Serializes registers initial values. Summary: Adds the registers initial values to the YAML output of llvm-exegesis. Reviewers: courbet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52460 llvm-svn: 342982	2018-09-25 15:15:54 +00:00
Guillaume Chatelet	ad6af5f3a5	[llvm-exegesis] Fix missing document separator in YAML output. Reviewers: courbet Subscribers: tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52496 llvm-svn: 342981	2018-09-25 14:48:24 +00:00
Nirav Dave	ba65b2d43a	[DAGCombine] Don't fold dependent loads across SELECT_CC. DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC after the select, resulting in a select of an address and a single load after. If either of the loads depend on the other, this is not legal as it could introduce cycles. However, it only checked this if the opcode was a SELECT, and not for a SELECT_CC. Unfortunately, the only reproducer I have for this is for our downstream target. I've tried getting it to trigger on an upstream one but haven't been successful. Patch thanks to Bevin Hansson. llvm-svn: 342980	2018-09-25 14:43:05 +00:00
Clement Courbet	4d1bc5dc43	[llvm-exegesis] Add lit tests (v2). Summary: This revisits rL342953 by adding detection of host support. Reviewers: gchatelet, lebedev.ri, alexshap Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52464 llvm-svn: 342975	2018-09-25 13:59:35 +00:00
Guillaume Chatelet	b798312f4c	[llvm-exegesis] Fix broken test. llvm-svn: 342971	2018-09-25 13:18:10 +00:00
Simon Pilgrim	4a581314fe	Revert rL342916: [X86] Remove shift/rotate by CL memory (RMW) overrides As suggested by Craig Topper - I'm going to look at cleaning up the RMW sequences instead. The uops are slightly different to the register variant, so requires a +1uop tweak llvm-svn: 342969	2018-09-25 13:01:26 +00:00
Guillaume Chatelet	d1656d7dab	[llvm-exegesis][NFC] Rewrite of the YAML serialization. Summary: This is a NFC in preparation of exporting the initial registers as part of the YAML dump Reviewers: courbet Reviewed By: courbet Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D52427 llvm-svn: 342967	2018-09-25 12:18:08 +00:00
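
The ownership scheme described in 7d9758f33a (a context paired with a mutex, and a module holding a shared_ptr to that pair) can be sketched in plain C++. This is only an illustration under stated assumptions: `FakeContext`, `FakeModule`, `SharedContext`, `OwnedModule`, `LiveContexts` and `runLifetimeDemo` are hypothetical stand-ins invented here, not the LLVM classes or their API.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-ins for LLVMContext and Module (NOT the LLVM classes).
static int LiveContexts = 0; // counts contexts that are currently alive

struct FakeContext {
  FakeContext() { ++LiveContexts; }
  ~FakeContext() { --LiveContexts; }
};

struct FakeModule {
  std::string Name;
};

// A context paired with the mutex that guards access to it.
class SharedContext {
public:
  SharedContext() : Ctx(std::make_unique<FakeContext>()) {}
  std::unique_lock<std::mutex> lock() {
    return std::unique_lock<std::mutex>(Mutex);
  }
  FakeContext &context() { return *Ctx; }

private:
  std::unique_ptr<FakeContext> Ctx;
  std::mutex Mutex;
};

// A module plus shared ownership of the context it lives in.
class OwnedModule {
public:
  OwnedModule(std::unique_ptr<FakeModule> M, std::shared_ptr<SharedContext> C)
      : Ctx(std::move(C)), Mod(std::move(M)) {}
  FakeModule &module() { return *Mod; }
  SharedContext &sharedContext() { return *Ctx; }

private:
  // Declared before Mod so the module is destroyed before its context.
  std::shared_ptr<SharedContext> Ctx;
  std::unique_ptr<FakeModule> Mod;
};

// Returns {live contexts while one module remains, live contexts at the end}.
std::pair<int, int> runLifetimeDemo() {
  auto Ctx = std::make_shared<SharedContext>();
  auto A = std::make_unique<OwnedModule>(
      std::make_unique<FakeModule>(FakeModule{"a"}), Ctx);
  auto B = std::make_unique<OwnedModule>(
      std::make_unique<FakeModule>(FakeModule{"b"}), Ctx);
  Ctx.reset(); // the client drops its own copy of the shared pointer

  {
    // Lock the context before touching a module that lives in it.
    auto Lock = A->sharedContext().lock();
    A->module().Name += ".opt";
  }

  A.reset(); // B still refers to the context, so it stays alive
  int WhileBAlive = LiveContexts;
  B.reset(); // last module gone: the context is destroyed with it
  return {WhileBAlive, LiveContexts};
}
```

Calling `runLifetimeDemo()` from a `main` should show the context surviving until the last module that refers to it is destroyed, which is the reclamation behaviour the commit message describes for contexts whose modules have unpredictable lifetimes.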


169605 Commits