llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Lang Hames	907267f02a	[JITLink] Re-apply 6884fbc2c4f (ELF eh support) with fix for broken test case.	2021-01-26 11:55:41 +11:00
Craig Topper	0571d7eb26	[TargetLowering][RISCV] Don't transform (seteq/ne (sext_inreg X, VT), C1) -> (seteq/ne (zext_inreg X, VT), C1) if the sext_inreg is cheaper RISCV has to use 2 shifts for (i64 (zext_inreg X, i32)), but we can use addiw rd, rs1, x0 for sext_inreg. We already understood this when type legalizing i32 seteq/ne on rv64. But this transform in SimplifySetCC would sometimes undo it. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95289	2021-01-25 16:37:21 -08:00
David Blaikie	fd9d683351	DebugInfo: Generalize the .debug_addr minimization flag to pave the way for including other strategies	2021-01-25 16:24:35 -08:00
Mitch Phillips	587eafbc21	Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method"" This reverts commit 554b3211fefd09b56b64357b9edd66c78ae200b5. Differential Revision: https://reviews.llvm.org/D95035	2021-01-25 16:22:22 -08:00
Craig Topper	d64f83a48c	[RISCV] Add isel patterns to optimize slli.uw patterns without Zba extension. This pattern can occur when an unsigned is used to index an array on RV64. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95290	2021-01-25 16:12:08 -08:00
Changpeng Fang	117234d675	AMDGPU: Reduce the number of expensive calls in SIFormMemoryClause Summary: RPTracker::reset(MI) is a very expensive call when the number of virtual registers is huge. We observed a long compilation time issue when RPT::reset() is called once for each cluster. In this work, we call RPT.reset() only at the first seen cluster, and use advance() to get the register pressure for the later clusters in the same basic block. This could effectively reduce the number of the expensive calls and thus reduce the compile time. Reviewers: rampitec Fixes: SWDEV-239161 Differential Revision: https://reviews.llvm.org/D95273	2021-01-25 16:08:08 -08:00
Stanislav Mekhanoshin	0b7b07f683	[AMDGPU] Added -mcpu=tahiti to 3 tests. NFC.	2021-01-25 15:50:59 -08:00
modimo	46d7a90a64	[InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks This change leverages the work done in D83743 to replay in the SampleProfile inliner to also be used in the CGSCC inliner. NOTE: currently restricted to non-ML advisors only. The added switch `-cgscc-inline-replay=<remarks file>` will replay the inlining decisions in that file where the remarks file is generated via `-Rpass=inline`. The aim here is to make it easier to analyze changes that would modify inlining heuristics to be separated from this behavior. Doing so allows easier examination of assembly and runtime behavior compared to the baseline rather than trying to dig through the large churn caused by inlining. In LTO compilation, since inlining is done twice you can separately specify replay by passing the flag to the FE (`-cgscc-inline-replay=`) and to the linker (`-Wl,cgscc-inline-replay=`) with the remarks generated from their respective places. Testing on mysqld by comparing the inline decisions between base (generates remarks.txt) and diff (replay using identical input/tools with remarks.txt) and examining the inlining sites with `diff` shows 14,000 mismatches out of 247,341 for a ~94% replay accuracy. I believe this gap can be narrowed further though for the general case we may never achieve full accuracy. For my personal use, this is close enough to be representative: I set the baseline as the one generated by the replay on identical input/toolset and compare that to my modified input/toolset using the same replay. Testing: ninja check-llvm newly added test correctly replays CGSCC inlining decisions Reviewed By: mtrofin, wenlei Differential Revision: https://reviews.llvm.org/D94334	2021-01-25 15:38:57 -08:00
Duncan P. N. Exon Smith	d8f7c22241	Support: Remove duplicated code in {File,clang::ModulesDependency}Collector, NFC Refactor the duplicated canonicalize-path logic in `FileCollector` and `ModulesDependencyCollector` into a new utility called `PathCanonicalizer` that's shared. This popped up when tracking down a bug common to both in https://reviews.llvm.org/D95202. As drive-bys, update a few names and comments to better reflect the effect of the code, delay removal of `..`s to avoid an unnecessary extra string copy, and leave behind a couple of FIXMEs for future consideration. Differential Revision: https://reviews.llvm.org/D95279	2021-01-25 15:09:00 -08:00
Nikita Popov	694eb3a53f	[LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943) When LSR converts a branch on the pre-inc IV into a branch on the post-inc IV, the nowrap flags on the addition may no longer be valid. Previously, a poison result of the addition might have been ignored, in which case the program was well defined. After branching on the post-inc IV, we might be branching on poison, which is undefined behavior. Fix this by discarding nowrap flags which are not present on the SCEV expression. Nowrap flags on the SCEV expression are proven by SCEV to always hold, independently of how the expression will be used. This is essentially the same fix we applied to IndVars LFTR, which also performs this kind of pre-inc to post-inc conversion. I believe a similar problem can also exist for getelementptr inbounds, but I was not able to come up with a problematic test case. The inbounds case would have to be addressed differently anyway (as SCEV does not track this property). Fixes https://bugs.llvm.org/show_bug.cgi?id=46943. Differential Revision: https://reviews.llvm.org/D95286	2021-01-25 23:13:48 +01:00
Fraser Cormack	a4f74d403a	[RISCV] Add RVV insertelt/extractelt scalable-vector patterns Original patch by @rogfer01. This patch adds support for insertelt and extractelt operations on scalable vectors. Special care must be taken on RV32 when dealing with i64 vectors as there are no straightforward ways to insert a 64-bit element without a register of that size. To that end, both are custom-lowered to different sequences. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94615	2021-01-25 22:03:52 +00:00
Cassie Jones	caaddba8b7	Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow" Implement widening for G_SADDO and G_SSUBO. Add legalize-add/sub tests for narrow overflowing add/sub on AArch64. Differential Revision: https://reviews.llvm.org/D95034	2021-01-25 16:57:20 -05:00
Richard Smith	b9cfb936a1	Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV" This reverts commit 53176c168061d6f26dcf3ce4fa59288b7d67255e, which introduced a layering violation. LLVM's IR library can't include headers from Analysis.	2021-01-25 13:53:38 -08:00
Jonas Devlieghere	ba9adaa9dd	[YAML I/O] Fix bug in emission of empty sequence Don't emit an output dash for an empty sequence. Take emitting a vector of strings for example: std::vector<std::string> Strings = {"foo", "bar"}; LLVM_YAML_IS_SEQUENCE_VECTOR(std::string) yout << Strings; This emits the following YAML document. --- - foo - bar ... When the vector is empty, this generates the following result: --- - [] ... Although this is valid YAML, it does not match what we meant to emit. The result is a one-element sequence consisting of an empty list. Indeed, if we were to try to read this again we get an error: YAML:2:4: error: not a mapping - [] The problem is the output dash before the empty list. The correct output would be: --- [] ... This patch fixes that by not emitting the output dash for an empty sequence. Differential revision: https://reviews.llvm.org/D95280	2021-01-25 13:35:36 -08:00
Julian Lettner	b139438d21	Revert "[lit] Use os.cpu_count() to cleanup TODO" A bot owner contacted me. I will re-land after confirming that this doesn't break anyone (since it's low priority). This reverts commit 9946b169c379daee603436a4753acfef8be373dd.	2021-01-25 13:32:30 -08:00
Konstantin Zhuravlyov	b36722c726	Revert "[IndirectFunctions] Skip propagating attributes to address taken functions" This reverts commit dd8ae42674b494e46ec40a22f40068db2b4a8b60. This commit causes infinite loop when compiling rocThrust and hipCUB. Differential Revision: https://reviews.llvm.org/D95389	2021-01-25 15:58:06 -05:00
LLVM GN Syncbot	37f3352001	[gn build] Port e123cd674c02	2021-01-25 20:11:10 +00:00
Akira Hatanaka	96195f7442	[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end annotates calls with attribute "clang.arc.rv"="retain" or "clang.arc.rv"="claim", which indicates the call is implicitly followed by a marker instruction and a retainRV/claimRV call that consumes the call result. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the annotated calls in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the annotated calls. It doesn't remove the attribute on the call since the backend needs it to emit the marker instruction. The retainRV/claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes the autoreleaseRV call in the callee that returns the result if nothing in the callee prevents it from being paired up with the calls annotated with "clang.arc.rv"="retain/claim" in the caller. If the call is annotated with "claim", a release call is inserted since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the attributes to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV call returning the callee result, which makes it impossible to pair it up with the retainRV or claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the call is annotated with "retain" and does nothing if it's annotated with "claim". - This patch teaches dead argument elimination pass not to change the return type of a function if any of the calls to the function are annotated with attribute "clang.arc.rv". This is necessary since the pass can incorrectly determine nothing in the IR uses the function return, which can happen since the front-end no longer explicitly emits retainRV/claimRV calls in the IR, and change its return type to 'void'. Future work: - Use the attribute on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the attributes. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-01-25 11:57:08 -08:00
Julian Lettner	08fcf6f580	[lit] Use os.cpu_count() to cleanup TODO We can now use Python3. Let's use `os.cpu_count()` to cleanup this helper. Differential Revision: https://reviews.llvm.org/D94734	2021-01-25 11:44:18 -08:00
Florian Hahn	1010e3b222	[VPlan] Replace uses with new value in VPInstructionsToVPRecipe (NFC). Now that VPRecipeBase inherits from VPDef, we can always use the new VPValue for replacement, if the recipe defines one. Given the recipes that are supported at the moment, all new recipes must have either 0 or 1 defined values.	2021-01-25 19:38:08 +00:00
Nick Desaulniers	87bfbbe74e	[GVN] do not repeat PRE on failure to split critical edge Fixes an infinite loop encountered in GVN. GVN will delay PRE if it encounters critical edges, attempt to split them later via calls to SplitCriticalEdge(), then restart. The caller of GVN::splitCriticalEdges() assumed a return value of true meant that critical edges were split, that the IR had changed, and that PRE should be re-attempted, upon which we loop infinitely. This was exposed after D88438, by compiling the Linux kernel for s390, but the test case is reproducible on x86. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261 Reviewed By: void Differential Revision: https://reviews.llvm.org/D94996	2021-01-25 11:23:44 -08:00
Craig Topper	d674a5ff2d	[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322	2021-01-25 10:47:22 -08:00
Reid Kleckner	d8778f44e4	[Win64] Ensure all stack frames are 8 byte aligned The unwind info format requires that all adjustments are 8 byte aligned, and the bottom three bits are masked out. Most Win64 calling conventions have 32 bytes of shadow stack space for spilling parameters, and I believe that constructing these fixed stack objects had the side effect of ensuring an alignment of 8. However, the Intel regcall convention does not have this shadow space, so when using that convention, it was possible to make a 4 byte stack frame, which was impossible to describe with unwind info. Fixes pr48867	2021-01-25 10:39:27 -08:00
Wei Mi	6c082dbc2c	[SampleFDO] Report error when reading a bad/incompatible profile instead of turning off SampleFDO silently. Currently the sample loader pass turns off SampleFDO optimization silently when it sees an error in reading the profile. This behavior will defeat the tests which could have caught those bad/incompatible profile problems. This patch changes the behavior to report an error. Differential Revision: https://reviews.llvm.org/D95269	2021-01-25 10:28:23 -08:00
Nemanja Ivanovic	e9b75e2e14	[PowerPC] Add missing negate for VPERMXOR on little endian subtargets This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004	2021-01-25 12:23:33 -06:00
David Green	2d325e321e	[ARM] Use half directly for args/return types in test. NFC Until fairly recently the calling convention for IR half was not handled correctly in the ARM backend, meaning we needed to pass pointers that were loaded/stored. Now that that is fixed we can switch to using the type directly instead.	2021-01-25 17:50:19 +00:00
Craig Topper	125d2de1e0	[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. As far as I know 32 bits arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285	2021-01-25 09:33:48 -08:00
Xun Li	3542809d3f	Revert "Fix unused variable in CoroFrame.cpp when building Release with GCC 10" This reverts commit ff5e896425577f445ed080d88b582aab0896fba0.	2021-01-25 08:37:45 -08:00
Dmitry Preobrazhensky	e0eb602697	[AMDGPU][MC] Improved errors handling for SDWA operands Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D95212	2021-01-25 19:02:53 +03:00
Nico Weber	a018238088	Revert "[JITLink] Enable exception handling for ELF." This reverts commit 6884fbc2c4fb46d0528c02d16d510f4f725fac11. Breaks tests on Windows: http://45.33.8.238/win/31981/step_11.txt	2021-01-25 11:00:38 -05:00
Jeroen Dobbelaere	099d05c508	[Verifier] disable llvm.experimental.noalias.scope.decl dominance check. This was enabled in https://reviews.llvm.org/D95335 but it breaks the stage2 fuchsia build (See http://lab.llvm.org:8011/#/builders/98/builds/4105/steps/9/logs/stdio)	2021-01-25 16:43:08 +01:00
Simon Pilgrim	6390ed0760	[X86][AVX] Generalize vperm2f128/vperm2i128 patterns to support all legal 256-bit vector types Remove bitcasts to/from v4x64 types through vperm2f128/vperm2i128 ops to help improve shuffle combining and demanded vector elts folding.	2021-01-25 15:35:36 +00:00
Jeroen Dobbelaere	8d875453b4	[Verifier] enable and limit llvm.experimental.noalias.scope.decl dominance checking Checking the llvm.experimental.noalias.scope.decl dominance can be worstcase O(N^2). Limit the dominance check to N=32. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D95335	2021-01-25 16:19:12 +01:00
xgupta	0d37ea875a	[Doc][NFC] Fix Kaleidoscope links, typos and add blog posts for MCJIT	2021-01-25 19:59:36 +05:30
Florian Hahn	804ecf1ca5	[VPlan] Handle scalarized values in VPTransformState. This patch adds plumbing to handle scalarized values directly in VPTransformState. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D92282	2021-01-25 14:21:56 +00:00
xgupta	c232049547	[NFC] Fix title comment typo and provide description for LLJIT example.	2021-01-25 19:46:02 +05:30
Simon Pilgrim	1c92cd102a	[X86][AVX] combineX86ShuffleChainWithExtract - widen to at least original root size. NFCI. We're relying on the source inputs for shuffle combining having already been widened to the root size (otherwise the offset logic falls over) - we're going to be supporting different sized shuffle inputs soon, so we need to explicitly make the minimum widened width the original root size.	2021-01-25 13:45:37 +00:00
Abhina Sreeskantharajan	559ef0a573	Revert "[SystemZ][z/OS] Fix No such file or directory expression error" This reverts commit 06f8a49693957bc27b83e0ab5f429ff874941a07.	2021-01-25 08:29:38 -05:00
Abhina Sreeskantharajan	d5e5913794	Revert "[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued" This reverts commit 520b5ecf856152f35ee38207eec39f5674dd2bd4.	2021-01-25 08:29:38 -05:00
Sanjay Patel	ca34d6ec29	[InstCombine] narrow min/max intrinsics with extended inputs We can sink extends after min/max if they match and would not change the sign-interpreted compare. The only combo that doesn't work is zext+smin/smax because the zexts could change a negative number into positive: https://alive2.llvm.org/ce/z/D6sz6J Sext+umax/umin works: define i32 @src(i8 %x, i8 %y) { %0: %sx = sext i8 %x to i32 %sy = sext i8 %y to i32 %m = umax i32 %sx, %sy ret i32 %m } => define i32 @tgt(i8 %x, i8 %y) { %0: %m = umax i8 %x, %y %r = sext i8 %m to i32 ret i32 %r } Transformation seems to be correct!	2021-01-25 07:52:50 -05:00
Sanjay Patel	5835cf484b	[InstCombine] add tests for min/max intrinsics with extended values; NFC	2021-01-25 07:52:50 -05:00
Sander de Smalen	d25eddef40	[SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost. This change also changes getReductionCost to return InstructionCost, and it simplifies two expressions by removing a redundant 'isValid' check.	2021-01-25 12:27:01 +00:00
Simon Pilgrim	ca94b9a73a	[X86][AVX] LowerTRUNCATE - avoid bitcasts around extract_subvectors. We allow extract_subvector lowering of all legal types, so pre-bitcast the source type to try and reduce bitcast pollution.	2021-01-25 12:10:36 +00:00
Simon Pilgrim	1c1001043f	[X86][AVX] combineX86ShuffleChain - avoid bitcasts around insert_subvector() shuffle patterns. We allow insert_subvector lowering of all legal types, so don't always cast to the vXi64/vXf64 shuffle types - this is only necessary for X86ISD::SHUF128/X86ISD::VPERM2X128 patterns later.	2021-01-25 11:35:45 +00:00
Simon Pilgrim	efb80b4284	[TableGen] RuleMatcher::defineComplexSubOperand avoid std::string copy. NFCI. Use const reference to avoid std::string copy - according to the style guide we shouldn't be using auto anyway. Fixes MSVC analyzer warning.	2021-01-25 11:35:44 +00:00
Sander de Smalen	110f0110d8	[InstructionCost] Prevent InstructionCost being created with CostState. For a function that returns InstructionCost, it is very tempting to write: return InstructionCost::Invalid; But that actually returns InstructionCost(1 /* int value of Invalid */)) which has a totally different meaning. By marking this constructor as `delete`, this can no longer happen.	2021-01-25 11:26:56 +00:00
Fraser Cormack	c88b1ceeef	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Philip Pfaffe	42b43f8006	[llvm-dwp] Automatically set the target triple The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the target triple from the object files being read. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93749	2021-01-25 11:58:54 +01:00
Georgii Rymar	561d962889	[ObjectYAML] - An attempt to fix BB after commit of D95140. D95140 introduced `static constexpr StringRef TypeStr = "SectionHeaderTable";` member of `SectionHeaderTable` with an in-class initializer. BB reports the link error: /usr/bin/ld: lib/libLLVMObjectYAML.a(ELFYAML.cpp.o): in function `llvm::yaml::MappingTraits<std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> > >::mapping(llvm::yaml::IO&, std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> >&)': ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x58): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' /usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x353): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' /usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x6e5): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' This patch adds a definition to the cpp file, I guess it should fix the issue.	2021-01-25 13:26:06 +03:00
Georgii Rymar	fc47fdd498	[yaml2obj, obj2yaml] - Implement section header table as a special Chunk. This was discussed in the D93678 thread. Currently we have one special chunk - Fill. This patch reimplements the "SectionHeaderTable" key to become a special chunk too. With that we are able to place the section header table at any location, just like we place sections. Differential revision: https://reviews.llvm.org/D95140	2021-01-25 13:08:08 +03:00
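
The 110f0110d8 entry above (deleting the `CostState` constructor of `InstructionCost`) describes a pitfall that is easy to reproduce outside LLVM. The following is a minimal sketch under stated assumptions, not LLVM's actual class: `Cost`, `getInvalid`, `isValid`, and `getValue` here are hypothetical stand-ins. It assumes an unscoped enum, so that without the deleted overload an enumerator would silently convert to the integer 1 and produce a valid cost of 1.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a cost type; this is not LLVM's InstructionCost.
class Cost {
public:
  // Unscoped enum: enumerators convert implicitly to int (Valid == 0, Invalid == 1).
  enum CostState { Valid, Invalid };

  // Ordinary constructor: a valid cost holding the given value.
  Cost(int Val) : Value(Val), State(Valid) {}

  // Deleted overload: it is the exact match for a CostState argument, so
  // `return Cost::Invalid;` becomes a compile error instead of meaning Cost(1).
  Cost(CostState) = delete;

  // Explicit, named way to obtain an invalid cost.
  static Cost getInvalid() {
    Cost C(0);
    C.State = Invalid;
    return C;
  }

  bool isValid() const { return State == Valid; }

  int getValue() const {
    assert(isValid() && "value requested from an invalid cost");
    return Value;
  }

private:
  int Value;
  CostState State;
};

// Compile-time checks: an int still constructs a Cost, a CostState no longer does.
static_assert(std::is_constructible<Cost, int>::value, "int must be accepted");
static_assert(!std::is_constructible<Cost, Cost::CostState>::value,
              "CostState must be rejected");
```

With this shape, a function returning `Cost` has to write `return Cost::getInvalid();`, and the tempting `return Cost::Invalid;` is rejected by the compiler rather than compiled into a valid cost of 1.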

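
The ca34d6ec29 entry above states that sext + umax can be narrowed while zext + smin/smax cannot. A brute-force sketch over all i8 pairs illustrates the claim for the 8-bit to 32-bit case only; it is not a substitute for the Alive2 proof the commit links to, and every helper name below is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit pattern to 32 bits.
int32_t sext8(uint8_t U) { return U >= 0x80 ? int32_t(U) - 256 : int32_t(U); }

// Zero-extend an 8-bit pattern to 32 bits.
int32_t zext8(uint8_t U) { return int32_t(U); }

// Count i8 pairs where umax(sext x, sext y) differs from sext(umax(x, y)).
int countSextUmaxMismatches() {
  int Mismatches = 0;
  for (int X = 0; X < 256; ++X)
    for (int Y = 0; Y < 256; ++Y) {
      uint8_t A = uint8_t(X), B = uint8_t(Y);
      // Wide form: unsigned max of the sign-extended values.
      uint32_t Wide = std::max(uint32_t(sext8(A)), uint32_t(sext8(B)));
      // Narrow form: unsigned max on i8, then sign-extend the result.
      uint32_t Narrow = uint32_t(sext8(std::max(A, B)));
      if (Wide != Narrow)
        ++Mismatches;
    }
  return Mismatches;
}

// Count i8 pairs where smax(zext x, zext y) differs from zext(smax(x, y)).
int countZextSmaxMismatches() {
  int Mismatches = 0;
  for (int X = 0; X < 256; ++X)
    for (int Y = 0; Y < 256; ++Y) {
      uint8_t A = uint8_t(X), B = uint8_t(Y);
      // Wide form: signed max of the zero-extended values.
      int32_t Wide = std::max(zext8(A), zext8(B));
      // Narrow form: signed max on i8 (compare as signed), then zero-extend.
      uint8_t M = sext8(A) >= sext8(B) ? A : B;
      int32_t Narrow = zext8(M);
      if (Wide != Narrow)
        ++Mismatches;
    }
  return Mismatches;
}
```

The first function finds no mismatches, because sign extension preserves the unsigned ordering of 8-bit patterns. The second finds one whenever the two inputs differ in sign bit: for x = -1 and y = 0, the wide form yields 255 while the narrow form yields 0.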

210246 Commits