llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Julian Lettner	b139438d21	Revert "[lit] Use os.cpu_count() to cleanup TODO" A bot owner contacted me. I will re-land after confirming that this doesn't break anyone (since it's low priority). This reverts commit 9946b169c379daee603436a4753acfef8be373dd.	2021-01-25 13:32:30 -08:00
Konstantin Zhuravlyov	b36722c726	Revert "[IndirectFunctions] Skip propagating attributes to address taken functions" This reverts commit dd8ae42674b494e46ec40a22f40068db2b4a8b60. This commit causes an infinite loop when compiling rocThrust and hipCUB. Differential Revision: https://reviews.llvm.org/D95389	2021-01-25 15:58:06 -05:00
LLVM GN Syncbot	37f3352001	[gn build] Port e123cd674c02	2021-01-25 20:11:10 +00:00
Akira Hatanaka	96195f7442	[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end annotates calls with attribute "clang.arc.rv"="retain" or "clang.arc.rv"="claim", which indicates the call is implicitly followed by a marker instruction and a retainRV/claimRV call that consumes the call result. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the annotated calls in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the annotated calls. It doesn't remove the attribute on the call since the backend needs it to emit the marker instruction. The retainRV/claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes the autoreleaseRV call in the callee that returns the result if nothing in the callee prevents it from being paired up with the calls annotated with "clang.arc.rv"="retain/claim" in the caller. If the call is annotated with "claim", a release call is inserted since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the attributes to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV call returning the callee result, which makes it impossible to pair it up with the retainRV or claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the call is annotated with "retain" and does nothing if it's annotated with "claim". - This patch teaches dead argument elimination pass not to change the return type of a function if any of the calls to the function are annotated with attribute "clang.arc.rv". This is necessary since the pass can incorrectly determine nothing in the IR uses the function return, which can happen since the front-end no longer explicitly emits retainRV/claimRV calls in the IR, and change its return type to 'void'. Future work: - Use the attribute on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the attributes. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-01-25 11:57:08 -08:00
Julian Lettner	08fcf6f580	[lit] Use os.cpu_count() to cleanup TODO We can now use Python3. Let's use `os.cpu_count()` to cleanup this helper. Differential Revision: https://reviews.llvm.org/D94734	2021-01-25 11:44:18 -08:00
Florian Hahn	1010e3b222	[VPlan] Replace uses with new value in VPInstructionsToVPRecipe (NFC). Now that VPRecipeBase inherits from VPDef, we can always use the new VPValue for replacement, if the recipe defines one. Given the recipes that are supported at the moment, all new recipes must have either 0 or 1 defined values.	2021-01-25 19:38:08 +00:00
Nick Desaulniers	87bfbbe74e	[GVN] do not repeat PRE on failure to split critical edge Fixes an infinite loop encountered in GVN. GVN will delay PRE if it encounters critical edges, attempt to split them later via calls to SplitCriticalEdge(), then restart. The caller of GVN::splitCriticalEdges() assumed a return value of true meant that critical edges were split, that the IR had changed, and that PRE should be re-attempted, upon which we loop infinitely. This was exposed after D88438, by compiling the Linux kernel for s390, but the test case is reproducible on x86. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261 Reviewed By: void Differential Revision: https://reviews.llvm.org/D94996	2021-01-25 11:23:44 -08:00
Craig Topper	d674a5ff2d	[RISCV] Custom type legalize i8/i16 UDIV/UREM/SDIV on RV64 so we can use divuw/remuw/divw. This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322	2021-01-25 10:47:22 -08:00
Reid Kleckner	d8778f44e4	[Win64] Ensure all stack frames are 8 byte aligned The unwind info format requires that all adjustments are 8 byte aligned, and the bottom three bits are masked out. Most Win64 calling conventions have 32 bytes of shadow stack space for spilling parameters, and I believe that constructing these fixed stack objects had the side effect of ensuring an alignment of 8. However, the Intel regcall convention does not have this shadow space, so when using that convention, it was possible to make a 4 byte stack frame, which was impossible to describe with unwind info. Fixes pr48867	2021-01-25 10:39:27 -08:00
Wei Mi	6c082dbc2c	[SampleFDO] Report error when reading a bad/incompatible profile instead of turning off SampleFDO silently. Currently the sample loader pass silently turns off SampleFDO optimization when it sees an error in reading the profile. This behavior defeats the tests that could have caught those bad/incompatible profile problems. This patch changes the behavior to report an error. Differential Revision: https://reviews.llvm.org/D95269	2021-01-25 10:28:23 -08:00
Nemanja Ivanovic	e9b75e2e14	[PowerPC] Add missing negate for VPERMXOR on little endian subtargets This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004	2021-01-25 12:23:33 -06:00
David Green	2d325e321e	[ARM] Use half directly for args/return types in test. NFC Until fairly recently the calling convention for IR half was not handled correctly in the ARM backend, meaning we needed to pass pointers that were loaded/stored. Now that that is fixed we can switch to using the type directly instead.	2021-01-25 17:50:19 +00:00
Craig Topper	125d2de1e0	[RISCV] Use sign extend for i32 arguments and returns in makeLibCall on RV64. As far as I know 32-bit arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285	2021-01-25 09:33:48 -08:00
Xun Li	3542809d3f	Revert "Fix unused variable in CoroFrame.cpp when building Release with GCC 10" This reverts commit ff5e896425577f445ed080d88b582aab0896fba0.	2021-01-25 08:37:45 -08:00
Dmitry Preobrazhensky	e0eb602697	[AMDGPU][MC] Improved errors handling for SDWA operands Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D95212	2021-01-25 19:02:53 +03:00
Nico Weber	a018238088	Revert "[JITLink] Enable exception handling for ELF." This reverts commit 6884fbc2c4fb46d0528c02d16d510f4f725fac11. Breaks tests on Windows: http://45.33.8.238/win/31981/step_11.txt	2021-01-25 11:00:38 -05:00
Jeroen Dobbelaere	099d05c508	[Verifier] disable llvm.experimental.noalias.scope.decl dominance check. This was enabled in https://reviews.llvm.org/D95335 but it breaks the stage2 fuchsia build (See http://lab.llvm.org:8011/#/builders/98/builds/4105/steps/9/logs/stdio)	2021-01-25 16:43:08 +01:00
Simon Pilgrim	6390ed0760	[X86][AVX] Generalize vperm2f128/vperm2i128 patterns to support all legal 256-bit vector types Remove bitcasts to/from v4x64 types through vperm2f128/vperm2i128 ops to help improve shuffle combining and demanded vector elts folding.	2021-01-25 15:35:36 +00:00
Jeroen Dobbelaere	8d875453b4	[Verifier] enable and limit llvm.experimental.noalias.scope.decl dominance checking Checking the llvm.experimental.noalias.scope.decl dominance can be worst case O(N^2). Limit the dominance check to N=32. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D95335	2021-01-25 16:19:12 +01:00
xgupta	0d37ea875a	[Doc][NFC] Fix Kaleidoscope links, typos and add blog posts for MCJIT	2021-01-25 19:59:36 +05:30
Florian Hahn	804ecf1ca5	[VPlan] Handle scalarized values in VPTransformState. This patch adds plumbing to handle scalarized values directly in VPTransformState. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D92282	2021-01-25 14:21:56 +00:00
xgupta	c232049547	[NFC] Fix title comment typo and provide description for LLJIT example.	2021-01-25 19:46:02 +05:30
Simon Pilgrim	1c92cd102a	[X86][AVX] combineX86ShuffleChainWithExtract - widen to at least original root size. NFCI. We're relying on the source inputs for shuffle combining having already been widened to the root size (otherwise the offset logic falls over) - we're going to be supporting different sized shuffle inputs soon, so we need to explicitly make the minimum widened width the original root size.	2021-01-25 13:45:37 +00:00
Abhina Sreeskantharajan	559ef0a573	Revert "[SystemZ][z/OS] Fix No such file or directory expression error" This reverts commit 06f8a49693957bc27b83e0ab5f429ff874941a07.	2021-01-25 08:29:38 -05:00
Abhina Sreeskantharajan	d5e5913794	Revert "[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued" This reverts commit 520b5ecf856152f35ee38207eec39f5674dd2bd4.	2021-01-25 08:29:38 -05:00
Sanjay Patel	ca34d6ec29	[InstCombine] narrow min/max intrinsics with extended inputs We can sink extends after min/max if they match and would not change the sign-interpreted compare. The only combo that doesn't work is zext+smin/smax because the zexts could change a negative number into positive: https://alive2.llvm.org/ce/z/D6sz6J Sext+umax/umin works: define i32 @src(i8 %x, i8 %y) { %0: %sx = sext i8 %x to i32 %sy = sext i8 %y to i32 %m = umax i32 %sx, %sy ret i32 %m } => define i32 @tgt(i8 %x, i8 %y) { %0: %m = umax i8 %x, %y %r = sext i8 %m to i32 ret i32 %r } Transformation seems to be correct!	2021-01-25 07:52:50 -05:00
Sanjay Patel	5835cf484b	[InstCombine] add tests for min/max intrinsics with extended values; NFC	2021-01-25 07:52:50 -05:00
Sander de Smalen	d25eddef40	[SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost. This change also changes getReductionCost to return InstructionCost, and it simplifies two expressions by removing a redundant 'isValid' check.	2021-01-25 12:27:01 +00:00
Simon Pilgrim	ca94b9a73a	[X86][AVX] LowerTRUNCATE - avoid bitcasts around extract_subvectors. We allow extract_subvector lowering of all legal types, so pre-bitcast the source type to try and reduce bitcast pollution.	2021-01-25 12:10:36 +00:00
Simon Pilgrim	1c1001043f	[X86][AVX] combineX86ShuffleChain - avoid bitcasts around insert_subvector() shuffle patterns. We allow insert_subvector lowering of all legal types, so don't always cast to the vXi64/vXf64 shuffle types - this is only necessary for X86ISD::SHUF128/X86ISD::VPERM2X128 patterns later.	2021-01-25 11:35:45 +00:00
Simon Pilgrim	efb80b4284	[TableGen] RuleMatcher::defineComplexSubOperand avoid std::string copy. NFCI. Use const reference to avoid std::string copy - according to the style guide we shouldn't be using auto anyway. Fixes MSVC analyzer warning.	2021-01-25 11:35:44 +00:00
Sander de Smalen	110f0110d8	[InstructionCost] Prevent InstructionCost being created with CostState. For a function that returns InstructionCost, it is very tempting to write: return InstructionCost::Invalid; But that actually returns InstructionCost(1 /* int value of Invalid */) which has a totally different meaning. By marking this constructor as `delete`, this can no longer happen.	2021-01-25 11:26:56 +00:00
Fraser Cormack	c88b1ceeef	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Philip Pfaffe	42b43f8006	[llvm-dwp] Automatically set the target triple The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the target triple from the object files being read. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93749	2021-01-25 11:58:54 +01:00
Georgii Rymar	561d962889	[ObjectYAML] - An attempt to fix BB after commit of D95140. D95140 introduced `static constexpr StringRef TypeStr = "SectionHeaderTable";` member of `SectionHeaderTable` with an in-class initializer. BB reports the link error: /usr/bin/ld: lib/libLLVMObjectYAML.a(ELFYAML.cpp.o): in function `llvm::yaml::MappingTraits<std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> > >::mapping(llvm::yaml::IO&, std::unique_ptr<llvm::ELFYAML::Chunk, std::default_delete<llvm::ELFYAML::Chunk> >&)': ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x58): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' /usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x353): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' /usr/bin/ld: ELFYAML.cpp:(.text._ZN4llvm4yaml13MappingTraitsISt10unique_ptrINS_7ELFYAML5ChunkESt14default_deleteIS4_EEE7mappingERNS0_2IOERS7_+0x6e5): undefined reference to `llvm::ELFYAML::SectionHeaderTable::TypeStr' This patch adds a definition to the cpp file, I guess it should fix the issue.	2021-01-25 13:26:06 +03:00
Georgii Rymar	fc47fdd498	[yaml2obj, obj2yaml] - Implement section header table as a special Chunk. This was discussed in D93678 thread. Currently we have one special chunk - Fill. This patch reimplements the "SectionHeaderTable" key to become a special chunk too. With that we are able to place the section header table at any location, just like we place sections. Differential revision: https://reviews.llvm.org/D95140	2021-01-25 13:08:08 +03:00
Sjoerd Meijer	a2eb0c1564	[AArch64] Add Cortex CPU subtarget features for instruction fusion. This adds subtarget features for AES, literal, and compare and branch instruction fusion for different Cortex CPUs. Patch by: Cassie Jones. Differential Revision: https://reviews.llvm.org/D94457	2021-01-25 09:11:29 +00:00
Simon Cook	4e27f3f274	[RISCV] Add attribute support for all supported extensions This adds support for ".attribute arch" for all extensions that are currently supported by the compiler. Differential Revision: https://reviews.llvm.org/D94931	2021-01-25 08:58:53 +00:00
Fangrui Song	7cc3572b85	[XRay] Support DW_TAG_call_site and delete unneeded PATCHABLE_EVENT_CALL/PATCHABLE_TYPED_EVENT_CALL lowering	2021-01-25 00:49:18 -08:00
Fangrui Song	18f43f5ac9	[XRay] Make __xray_customevent support non-Linux	2021-01-25 00:48:21 -08:00
Andre Vieira	378009030a	[AArch64] Merge [US]MULL with half adds and subs into [US]ML[AS]L This patch adds patterns to teach the AArch64 backend to merge [US]MULL instructions and adds/subs of half the size into [US]ML[AS]L where we don't use the top half of the result. Differential Revision: https://reviews.llvm.org/D95218	2021-01-25 07:58:12 +00:00
Lang Hames	f7963e4e27	[JITLink] Enable exception handling for ELF. Adds the EHFrameSplitter and EHFrameEdgeFixer passes to the default JITLink pass pipeline for ELF/x86-64, and teaches EHFrameEdgeFixer to handle some new pointer encodings. Together these changes enable exception handling (at least for the basic cases that I've tested so far) for ELF/x86-64 objects loaded via JITLink.	2021-01-25 15:31:27 +11:00
QingShan Zhang	d5b70bbb38	[NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero For now, we correct the result for sqrt if iteration > 0. This doesn't make sense, as the two are not strictly related. Reviewed By: dmgreen, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D94480	2021-01-25 04:02:44 +00:00
David Blaikie	ac191922c9	Fix sign-comparison warnings in unit test EXPECTs	2021-01-24 18:38:16 -08:00
Chen Zheng	db58448497	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-24 21:28:21 -05:00
Carl Ritson	8b5995e559	[AMDGPU] Fix llvm.amdgcn.init.exec and frame materialization Frame-base materialization may insert vector instructions before EXEC is initialised. Fix this by moving lowering of llvm.amdgcn.init.exec later in backend. Also remove SI_INIT_EXEC_LO pseudo as this is not necessary. Reviewed By: ruiling Differential Revision: https://reviews.llvm.org/D94645	2021-01-25 08:31:17 +09:00
Craig Topper	f6cb3fe42b	[RISCV] Use bitsLE instead of strict == MVT::i32 in assertsexti32 and assertzexti32. The patterns that use this really want to know if the operand has at least 32 sign/zero bits. This increases opportunities to use W instructions when the original source used i8/i16. Not sure how much this matters for performance, but it makes i8/i16 code more consistent with i32.	2021-01-24 13:58:14 -08:00
Craig Topper	6b0c3267a4	[RISCV] Add test cases for missed opportunities to use *W instructions for div/rem when inputs are sign/zero extended from i8/16 instead of i32.	2021-01-24 13:56:38 -08:00
Craig Topper	90bfb79ab7	[RISCV] Add test cases for missed opportunities to use fcvt.*.w(u) instructions on RV64 when input is known to be extended from i8/i16.	2021-01-24 13:48:29 -08:00
David Green	59b0539afb	[ARM] Extra MVE unaligned VLDn tests. NFC	2021-01-24 21:39:00 +00:00
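
The [InstCombine] commit ca34d6ec29 listed above claims that extends can be sunk after min/max for sext+umax, but not for zext+smax. The following is a minimal sketch that checks both claims by brute force over all i8 input pairs. It models the IR operations with plain Python integers; the helper names (`sext`, `zext`, `smax`, `umax`, `counterexamples`) are hypothetical and are not LLVM APIs.

```python
from itertools import product

NARROW, WIDE = 8, 32  # i8 inputs extended to i32, as in the commit message


def to_signed(v, bits):
    """Interpret an unsigned bit pattern as a two's-complement value."""
    sign = 1 << (bits - 1)
    return (v ^ sign) - sign


def sext(v):
    """Sign-extend an i8 bit pattern to an i32 bit pattern."""
    return to_signed(v, NARROW) & ((1 << WIDE) - 1)


def zext(v):
    """Zero-extend: the bit pattern is unchanged, upper bits are zero."""
    return v


def umax(a, b):
    """Unsigned max on bit patterns."""
    return max(a, b)


def smax(a, b, bits):
    """Signed max on bit patterns of the given width."""
    return a if to_signed(a, bits) >= to_signed(b, bits) else b


def counterexamples(wide, narrow):
    """All i8 pairs where the wide and narrowed forms disagree."""
    pairs = product(range(1 << NARROW), repeat=2)
    return [(x, y) for x, y in pairs if wide(x, y) != narrow(x, y)]


# umax(sext x, sext y)  vs  sext(umax x, y)
sext_umax_bad = counterexamples(
    lambda x, y: umax(sext(x), sext(y)),
    lambda x, y: sext(umax(x, y)),
)

# smax(zext x, zext y)  vs  zext(smax x, y)
zext_smax_bad = counterexamples(
    lambda x, y: smax(zext(x), zext(y), WIDE),
    lambda x, y: zext(smax(x, y, NARROW)),
)

print("sext+umax counterexamples:", len(sext_umax_bad))
print("zext+smax has counterexamples:", bool(zext_smax_bad))
```

Under this model the sext+umax list is empty, while zext+smax fails for inputs such as `(0x80, 1)`, where the zero-extended value is positive but the narrow value is negative.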

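
The [Win64] commit d8778f44e4 listed above notes that unwind info masks out the bottom three bits of a stack adjustment, so a 4 byte frame cannot be described. As a rough illustration of that arithmetic only (not the LLVM implementation; `align_to` and `encodable_adjustment` are hypothetical names), rounding a frame size up to a multiple of 8 can be sketched as:

```python
def align_to(size, alignment=8):
    """Round size up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


def encodable_adjustment(size):
    """What survives if the bottom three bits are masked out."""
    return size & ~7


for frame in (0, 4, 8, 9, 32):
    print(frame, "->", align_to(frame), "masked:", encodable_adjustment(frame))
```

A 4 byte frame masks to 0, which is why the frame size has to be rounded up to 8 before it can be represented.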

210482 Commits