llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	173d94e2bc	LanaiTargetMachine.h - remove unnecessary includes. NFCI.	2020-09-29 18:15:31 +01:00
Simon Pilgrim	3ce97c9901	LanaiSubtarget.h - remove unnecessary includes. NFCI. TargetFrameLowering.h is guaranteed to be covered by LanaiFrameLowering.h	2020-09-29 18:15:31 +01:00
Juneyoung Lee	5f4bfccd18	[BuildLibCalls] Add noundef to the returned pointers of allocators and argument of free This patch adds noundef to the returned pointers of allocators (malloc, calloc, ...) and the pointer argument of free. The returned pointer of allocators cannot be poison or (partially) undef. Since the pointer that is given to free should precisely have zero offset, it cannot be poison or (partially) undef too. For the size arguments of allocators, noundef wasn't attached simply because I wasn't sure whether attaching it is okay or not. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87984	2020-09-30 02:13:48 +09:00
Zequan Wu	3358ffa747	[COFF][CG Profile] set undefined symbol to external Differential Revision: https://reviews.llvm.org/D88456	2020-09-29 09:49:51 -07:00
Simon Pilgrim	3d0b5a22ca	[InstCombine] Fix the outofrange tests and add exact shift tests for D88429	2020-09-29 17:15:16 +01:00
Simon Pilgrim	2352264886	[InstCombine] visitTrunc - remove dead trunc(lshr (zext A), C) combine. NFCI. I added additional test coverage at rG7a55989dc4305 - but all are handled independently of this combine and http://lab.llvm.org:8080/coverage/coverage-reports/ indicates the code is never used. Differential revision: https://reviews.llvm.org/D88492	2020-09-29 17:15:16 +01:00
Simon Pilgrim	08a0e9a912	MSP430TargetMachine.h - remove unused includes. NFCI.	2020-09-29 16:41:59 +01:00
Simon Pilgrim	0626d95654	NVPTXTargetMachine.h - remove unused includes. NFCI.	2020-09-29 16:41:59 +01:00
Simon Pilgrim	6301392398	SparcSubtarget.h - cleanup include dependencies. NFCI. TargetFrameLowering.h is guaranteed to be covered by SparcFrameLowering.h Fix missing implicit Triple.h dependency.	2020-09-29 16:41:58 +01:00
Cameron McInally	8895b924c4	[SVE] Fix typo in CHECK lines for sve-fixed-length-int-reduce.ll	2020-09-29 10:12:58 -05:00
Sanjay Patel	bf8ef9fdc3	[InstCombine] use redirect of input file in regression tests; NFC This is a repeat of 1880092722 from 2009. We should have less risk of hitting bugs at this point because we auto-generate positive CHECK lines only, but this makes things consistent. Copying the original commit msg: "Change tests from "opt %s" to "opt < %s" so that opt doesn't see the input filename so that opt doesn't print the input filename in the output so that grep lines in the tests don't unintentionally match strings in the input filename."	2020-09-29 11:06:25 -04:00
Simon Pilgrim	1dd688eff2	[InstCombine] Add some basic trunc(lshr(zext(x),c)) tests Copied from the sext equivalents	2020-09-29 15:49:57 +01:00
Simon Pilgrim	46a5992f75	[InstCombine] Inherit exact flags on extended shifts in trunc (lshr (sext A), C) --> (ashr A, C) This was missed in D88475	2020-09-29 15:32:09 +01:00
Simon Pilgrim	487233e67b	[InstCombine] Add exact shift tests missed in D88475 I missed the post-LGTM comment from @lebedev.ri	2020-09-29 15:24:59 +01:00
Krzysztof Parzyszek	6796541b93	[SDAG] Do not convert undef to 0 when folding CONCAT/BUILD_VECTOR Differential Revision: https://reviews.llvm.org/D88273	2020-09-29 09:12:26 -05:00
Simon Pilgrim	11e1b3f795	[InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch I thought I would do it for the only other user of the APInt first. I've added a ConstantExpr::getUMin helper - its trivial to add UMAX/SMIN/SMAX but thought I'd wait until we have use cases. Differential Revision: https://reviews.llvm.org/D88475	2020-09-29 15:01:16 +01:00
Dominik Montada	83da9d3852	[GlobalISel] fix widenScalarUnmerge if widen type is not a multiple of destination type Fix creation of illegal unmerge when widen was requested to a type which is not a multiple of the destination type. E.g. when trying to widen an s48 unmerge to s64 the existing code would create an illegal unmerge from s64 to s48. Instead, create further unmerges to a GCD type, then use this to remerge these intermediate results to the actual destinations. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88422	2020-09-29 15:52:20 +02:00
Mirko Brkusanin	367c918b83	Revert "[AMDGPU] Reorganize GCN subtarget features for unaligned access" This reverts commit f5cd7ec9f3fc969ff5e1feed961996844333de3b. Certain rocPRIM/rocThrust/hipCUB tests were failing because of this change.	2020-09-29 15:33:34 +02:00
Jay Foad	e606604b45	[SDag] Verify DAG divergence after dumping. NFC. When debugging, it's useful to be able to see the DAG that has just failed divergence verification.	2020-09-29 14:05:07 +01:00
Jay Foad	938bf31fa3	[SDag] Refactor and simplify divergence calculation and checking. NFC.	2020-09-29 14:05:07 +01:00
Jonas Paulsson	d53723ece8	[SystemZ] Don't emit PC-relative memory accesses to unaligned symbols. In the presence of packed structures (#pragma pack(1)) where elements are referenced through pointers, there will be stores/loads with alignment values matching the default alignments for the element types while the elements are in fact unaligned. Strictly speaking this is incorrect source code, but is unfortunately part of existing code and therefore now addressed. This patch improves the pattern predicate for PC-relative loads and stores by not only checking the alignment value of the instruction, but also making sure that the symbol (and element) itself is aligned. Fixes https://bugs.llvm.org/show_bug.cgi?id=44405 Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D87510	2020-09-29 14:51:13 +02:00
Florian Hahn	5656134a0c	[LoopUtils] Only verify SE in builds with assertions. Follow up to 60b852092c98.	2020-09-29 13:39:23 +01:00
Daniel Kiss	2497820205	[AArch64] Add BTI to CFI jumptables. With branch protection the jump to the jump table entries requires a landing pad. Reviewed By: eugenis, tamas.petz Differential Revision: https://reviews.llvm.org/D81251	2020-09-29 13:50:23 +02:00
David Stenberg	669dbd44d7	[IndVarSimplify] Fix Modified status for removal of overflow intrinsics When removing an overflow intrinsic the Changed status in SimplifyIndvar was not set, leading to the IndVarSimplify pass returning an incorrect status. This was caught using the check introduced by D80916. As pointed out in the code review, a similar bug may exist for eliminateTrunc(). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D85971	2020-09-29 13:20:59 +02:00
Vitaly Buka	01af5a47c9	[msan] Fix llvm.abs.v intrinsic The last argument of the intrinsic is a boolean flag to control INT_MIN handling and does not affect msan metadata.	2020-09-29 03:52:27 -07:00
Vitaly Buka	72c59279c4	[msan] Add test for vector abs intrinsic	2020-09-29 03:52:27 -07:00
sstefan1	e47fd785ee	[OpenMPOpt][Fix] Only initialize ICV initial values once. Reviewers: jdoerfert, ggeorgakoudis Differential Revision: https://reviews.llvm.org/D88441	2020-09-29 12:22:58 +02:00
Simon Pilgrim	5159bb7b3f	[InstCombine] Add trunc(lshr(sext(x),c)) non-uniform vector tests	2020-09-29 10:56:15 +01:00
Florian Hahn	02a3467af0	[LoopDeletion] Forget loop before setting values to undef After D71539, we need to forget the loop before setting the incoming values of phi nodes in exit blocks, because we are looking through those phi nodes now and the SCEV expression could depend on the loop phi. If we update the phi nodes before forgetting the loop, we miss those users during invalidation. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D88167	2020-09-29 10:38:44 +01:00
Max Kazantsev	41907f58c6	[SCEV][NFC] Introduce isBasicBlockEntryGuardedByCond Currently, we have `isLoopEntryGuardedByCond` method in SCEV, which checks that some fact is true if we enter the loop. In fact, this is just a particular case of more general concept `isBasicBlockEntryGuardedByCond` applied to given loop's header. In fact, the logic if this code is largely independent on the given loop and only cares code above it. This patch makes this generalization. Now we can query it for any block, and `isBasicBlockEntryGuardedByCond` is just a particular case. Differential Revision: https://reviews.llvm.org/D87828 Reviewed By: fhahn	2020-09-29 15:53:45 +07:00
Tres Popp	59b6daf823	Revert "OpaquePtr: Add type to sret attribute" This reverts commit 55c4ff91bd820d72014f63dcf7f3d5a0d3397986. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.	2020-09-29 10:31:04 +02:00
Serguei Katkov	fcb17e5e03	[IsKnownNonZero] Handle the case with non-constant phi nodes Handle the case when all inputs of phi are proven to be non zero. Constants are checked in beginning of this method before check for depth of recursion, so it is a partial case of non-constant phi. Recursion depth is already handled by the function. Reviewers: aqjune, nikic, efriedma Reviewed By: nikic Subscribers: dantrushin, hiraditya, jdoerfert, llvm-commits Differential Revision: https://reviews.llvm.org/D88276	2020-09-29 15:22:10 +07:00
Florian Hahn	8112c564bf	Revert "Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one."" Looks like there is still another remaining issue: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/22273/steps/build%20libcxx%2Fmsan/logs/stdio This reverts commit 86a20d9e34f5a9989da72097f23f3b0a44157e73.	2020-09-29 09:18:19 +01:00
Florian Hahn	6a5bcd3f18	Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one." This version includes an small fix allowing function pointers to be unconditionally replaced for now. This reverts commit 4c5e4aa89b11ec3253258b8df5125833773d1b1e.	2020-09-29 09:10:27 +01:00
Sam Parker	46ff74493f	[NFC][ARM] Comments and lambdas Add some comments in LowOverheadLoops and make some lambda variables explicit arguments instead of capturing.	2020-09-29 08:41:53 +01:00
Craig Topper	c69069f9b0	[X86] Add computeKnownBits support for PEXT. The number of zeros in the mask provides a lower bound on the number of leading zeros in the result.	2020-09-28 22:54:07 -07:00
Craig Topper	3f90911e46	[X86] Add known bits test for PEXT. NFC	2020-09-28 22:54:07 -07:00
Arthur Eubanks	ee468fc3e5	[Docs][NewPM] Add note about required passes Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88342	2020-09-28 21:45:14 -07:00
Max Kazantsev	4bab0c4016	[NFC] Use assert instead of checking the guaranteed condition From preconditions it is known that either A dominates B or B dominates A. If A does not dominate B, we do not really need to check it. Assert should be enough. Should save some compile time.	2020-09-29 11:38:45 +07:00
Max Kazantsev	ec6ab63143	[IndVars] Remove exiting conditions that are trivially true/false When removing exiting loop conditions, we only consider checks for which we know the exact exit count. We could also eliminate checks for which the condition is always true/false. Differential Revision: https://reviews.llvm.org/D87344 Reviewed By: lebedev.ri, reames	2020-09-29 11:35:32 +07:00
Yonghong Song	7ba626add6	BPF: explicitly specify bpfel triple for certain tests Commit 54d9f743c8b0 ("BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible") changed most of CORE tests with opt run followed by llc and opt requires the target triple specified in the IR. There are few tests where little endian and big endian will report different result and for little endian versions of tests, "target triple = "bpf"" will produce wrong results if the test executed in a big endian machine, e.g. PowerPC big endian machine, since target "bpf" represents host endian and will resolve to "bpfeb". The builtbot reported such failures when build-and-run on a PowerPC big endian machine. To fix the issue, using "target triple = "bpfel"" instead.	2020-09-28 20:25:25 -07:00
Amara Emerson	09394476cd	[AArch64][GlobalISel] Scalarize <2 x s64> G_MUL since we don't have native support for it. Differential Revision: https://reviews.llvm.org/D88437	2020-09-28 19:29:45 -07:00
LLVM GN Syncbot	117b16b4b6	[gn build] Port 54d9f743c8b	2020-09-29 00:24:06 +00:00
Ruiling Song	62e7593653	[RegisterCoalescer] Pass Undefs to extendToIndices() When extending the subranges, the reaching-def may be an undefs. When extending such kind of subrange, it will try to search for the reaching def first. If the reaching def is an undef and we did not provide 'Undefs', The findReachingDefs() will fail with message: "Use of $noreg does not have a corresponding definition on every path: LLVM ERROR: Use not jointly dominated by defs." So we computeSubRangeUndefs() and pass the result to extendToIndices(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87744	2020-09-29 08:14:24 +08:00
Yonghong Song	b27683b417	BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible Move abstractMemberAccess and PreserveDIType passes as early as possible, right after clang code generation. Currently, compiler may transform the above code p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); bpf_probe_read(buf, buf_size, p2); } to p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { bpf_probe_read(buf, buf_size, p2); } and eventually assembly code looks like reloc_exist = 1; reloc_member_offset = 10; //calculate member offset from base p2 = base + reloc_member_offset; if (reloc_exist) { bpf_probe_read(bpf, buf_size, p2); } if during libbpf relocation resolution, reloc_exist is actually resolved to 0 (not exist), reloc_member_offset relocation cannot be resolved and will be patched with illegal instruction. This will cause verifier failure. This patch attempts to address this issue by do chaining analysis and replace chains with special globals right after clang code gen. This will remove the cse possibility described in the above. The IR typically looks like %6 = load @llvm.sk_buff:0:50$0:0:0:2:0 %7 = bitcast %struct.sk_buff* %2 to i8* %8 = getelementptr i8, i8* %7, %6 for a particular address computation relocation. But this transformation has another consequence, code sinking may happen like below: PHI = <possibly different @preserve__access_globals> %7 = bitcast %struct.sk_buff %2 to i8* %8 = getelementptr i8, i8* %7, %6 For such cases, we will not able to generate relocations since multiple relocations are merged into one. This patch introduced a passthrough builtin to prevent such optimization. Looks like inline assembly has more impact for optimizaiton, e.g., inlining. Using passthrough has less impact on optimizations. A new IR pass is introduced at the beginning of target-dependent IR optimization, which does: - report fatal error if any reloc global in PHI nodes - remove all bpf passthrough builtin functions Changes for existing CORE tests: - for clang tests, add "-Xclang -disable-llvm-passes" flags to avoid builtin->reloc_global transformation so the test is still able to check correctness for clang generated IR. - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> \| llvm-dis" command before "llc" command since "opt" is needed to call newly-placed builtin->reloc_global transformation. Add target triple in the IR file since "opt" requires it. - Since target triple is added in IR file, if a test may produce different results for different endianness, two tests will be created, one for bpfeb and another for bpfel, e.g., some tests for relocation of lshift/rshift of bitfields. - field-reloc-bitfield-1.ll has different relocations compared to old codes. This is because for the structure in the test, new code returns struct layout alignment 4 while old code is 8. Align 8 is more precise and permits double load. With align 4, the new mechanism uses 4-byte load, so generating different relocations. - test intrinsic-transforms.ll is removed. This is used to test cse on intrinsics so we do not lose metadata. Now metadata is attached to global and not instruction, it won't get lost with cse. Differential Revision: https://reviews.llvm.org/D87153	2020-09-28 16:56:22 -07:00
Mehdi Amini	8fedbbff8a	Guard `find_library(tensorflow_c_api ...)` by checking for TENSORFLOW_C_LIB_PATH to be set by the user Also have CMake fails if the user provides a TENSORFLOW_C_LIB_PATH but we can't find TensorFlow at this path. At the moment the CMake script tries to figure if TensorFlow is available on the system and enables support for it. This is in general not desirable to customize build features this way and instead it is preferable to let the user opt-in explicitly into the features they want to enable. This is in line with other optional external dependencies like Z3. There are a few reasons to this but amongst others: - reproducibility: making features "magically" enabled based on whether we find a package on the system or not makes it harder to handle bug reports from users. - user control: they can't have TensorFlow on the system and build LLVM without TensorFlow right now. They also would suddenly distribute LLVM with a different set of features unknowingly just because their build machine environment would change subtly. Right now this is motivated by a user reporting build failures on their system: .../mesa-git/llvm-git/src/llvm-project/llvm/lib/Analysis/TFUtils.cpp:23:10: fatal error: tensorflow/c/c_api.h: No such file or directory 23 \| #include "tensorflow/c/c_api.h" \| ^~~~~~ It looks like we detected TensorFlow at configure time but couldn't set all the paths correctly. Differential Revision: https://reviews.llvm.org/D88371	2020-09-28 22:15:55 +00:00
Philip Reames	c4e9c9a455	[CVP] Allow two transforms in one invocation For a call site which had both constant deopt operands and nonnull arguments, we were missing the opportunity to recognize the later by bailing early. This is somewhat of a speculative fix. Months ago, I'd had a private report of performance and compile time regressions from the deopt operand folding. I never received a test case. However, the only possibility I see was that after that change CVP missed the nonnull fold, and we end up with a pass ordering/missed simplification issue. So, since it's a real issue, fix it and hope.	2020-09-28 15:11:42 -07:00
Fangrui Song	e0ac770663	[EHStreamer] Simplify sharedTypeIDs with std::mismatch (Note that EMStreamer.cpp is largely under tested. The only test checking the prefix sharing is CodeGen/WebAssembly/eh-lsda.ll)	2020-09-28 15:05:59 -07:00
Craig Topper	369d12cf0e	[X86] Add support for calling SimplifyDemandedBits on the input of PDEP with a constant mask. We can do several optimizations for PDEP using computeKnownBits and SimplifyDemandedBits -If the MSBs of the output aren't demanded, those MSBs of the mask input aren't demanded either. We need to keep the most significant demanded bit of the mask and any mask bits before it. -The number of possible ones in the mask determines how many bits of the lsbs of the other operand are demanded. Any bits of the mask we don't demand by the previous rule should not be counted. -The result will have zeros in any position that the mask is zero. -Since non-mask input bits can only be output in the original position or a higher bit position, the result will have at least as many trailing zeroes as the non-mask input. Differential Revision: https://reviews.llvm.org/D87883	2020-09-28 14:21:30 -07:00
Craig Topper	a9c2ef5d71	[X86] Add tests for D87883. NFC	2020-09-28 14:21:29 -07:00

1 2 3 4 5 ...

204355 Commits