llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Roman Lebedev	c607b2fbf8	[NFC][lit] Extract 'test time' reading/writing into standalone functions Simply refactor code into reusable functions, to allow read_test_times() to be reused later.	2021-03-22 15:25:32 +03:00
Roman Lebedev	f090e49be4	[NFC][lit] Add a test showing that timing data for tests not executed is lost I.e. when you first run lit on a directory, and then on a single test, the timing knowledge about anything else other than that single test is lost. This isn't right.	2021-03-22 15:25:32 +03:00
Roman Lebedev	43b198106a	[NFCI][lit] Unbreak more lit self-tests after D98179 All of these depend on the order of tests, so if one runs them twice, the tests within them will naturally be reordered using the previous run times, which breaks them.	2021-03-22 15:25:32 +03:00
Roman Lebedev	866f214e22	[NFC][lit] discovery: find_tests_for_inputs: avoid py warning when no suites found If lit was run on a directory that contained no suites, then naturally suite[0] will not be there, and that line would cause python warnings. So just predicate it with a check that it is there in the first place.	2021-03-22 15:25:32 +03:00
Florian Hahn	aa081d2756	[ConstraintElimination] Add gep tests without inbounds. Add a set of interesting test cases for GEPs without inbounds for upcoming patches.	2021-03-22 12:23:30 +00:00
Bradley Smith	839304c777	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Sjoerd Meijer	093c73b470	[AArch64] Add some float -> int -> float conversion patterns This adds some conversion match patterns for which we want to keep the int values in FP registers using the corresponding NEON instructions (not the FP instructions) to avoid more costly int <-> fp register transfers. Differential Revision: https://reviews.llvm.org/D98956	2021-03-22 11:06:08 +00:00
Stefan Gränitz	32f9a2f66f	[llvm-jitlink] Fix Windows build after 4a8161fe40cc	2021-03-22 11:42:05 +01:00
Florian Hahn	e59b5c1dd4	[ConstraintElimination] Add multi-dimension GEP tests. Add a set of interesting test cases with multi-dimensional GEPs for upcoming patches.	2021-03-22 10:37:12 +00:00
Stefan Gränitz	c5ba233284	[llvm-jitlink] Add diagnostic output and port executor to getaddrinfo(3) as well Add diagnostic output for TCP connections on both sides, llvm-jitlink and llvm-jitlink-executor. Port the executor to use getaddrinfo(3) as well. This makes the code more symmetric and seems to be the recommended way for implementing the server side. Reviewed By: rzurob Differential Revision: https://reviews.llvm.org/D98581	2021-03-22 11:20:23 +01:00
Stefan Gränitz	35d95eba5c	[llvm-jitlink] Fix use of getaddrinfo(3) when connecting remote executor via TCP socket Since llvm-jitlink moved from gethostbyname to getaddrinfo in D95477, it seems to no longer connect to llvm-jitlink-executor via TCP. I can reproduce this behavior on both Debian 10 and macOS 10.15.7: ``` > llvm-jitlink-executor listen=localhost:10819 -- > llvm-jitlink --oop-executor-connect=localhost:10819 /path/to/obj.o Failed to resolve localhost:10819 ``` Reviewed By: rzurob Differential Revision: https://reviews.llvm.org/D98579	2021-03-22 11:20:23 +01:00
serge-sans-paille	45601b4fd8	[NFC] Simpler and faster key computation for getSubtargetImpl memoization There's no use in computing a large key that's only used for a memoization optimization.	2021-03-22 10:02:51 +01:00
Kristof Beyls	917f15dbfc	[docs] GettingInvolved: split out flang and openmp meeting series Split out the flang and openmp meeting series, as each has a separate canonical page where the information is maintained. As part of that, also call out the alias analysis series separately as it doesn't seem to be relevant for just flang. Differential Revision: https://reviews.llvm.org/D99012	2021-03-22 09:25:57 +01:00
Qiu Chaofan	7c18c33042	[PowerPC] Enable redundant TOC save removal on AIX Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D97039	2021-03-22 14:29:22 +08:00
Bing1 Yu	edd216b650	[X86] Pass to transform tdpbf16ps intrinsics to scalar operation. In previous patch https://reviews.llvm.org/D93594, we only scalarize tilezero, tileload, tilestore and tiledpbssd. In this patch we scalarize tdpbf16ps intrinsic. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D96110	2021-03-22 13:00:40 +08:00
Max Kazantsev	54694d897b	[IndVars] Sharpen context in eliminateIVComparison When eliminating comparisons, we can use common dominator of all its users as context. This gives better results when ICMP is not computed right before the branch that uses it. Differential Revision: https://reviews.llvm.org/D98924 Reviewed By: lebedev.ri	2021-03-22 11:55:57 +07:00
Lang Hames	15f4aaa81e	[JITLink][ELF/x86-64] Add support for R_X86_64_GOTPC64 and R_X86_64_GOT64. Start adding support for ELF x86-64 large code model, PIC relocations.	2021-03-21 21:52:54 -07:00
Lang Hames	5f6f892eda	[JITLink] Start laying the groundwork for ELF x86-64 large code model support. Introduces DefineExternalSectionStartAndEndSymbols.h, which defines a template for a JITLink pass that transforms external symbols meeting a user-supplied predicate into defined symbols pointing at the start and end of a Section identified by the predicate. JITLink.h is updated with a new makeAbsolute function to support this pass. Also renames BasicGOTAndStubsBuilder to PerGraphGOTAndPLTStubsBuilder -- the new name better describes the intent of this GOT and PLT stubs builder, and will help to distinguish it from future GOT and PLT stub builders that build entries that may be shared between multiple graphs.	2021-03-21 20:56:47 -07:00
Lang Hames	0ff9692159	[JITLink][ELF/x86-64] Add Delta32, NegDelta32, NegDelta64 support. These were missing, but are used in eh-frame section support.	2021-03-21 20:15:40 -07:00
Luo, Yuanke	9147f4aeba	[X86][AMX] Add test cases for AMX load/store lowering. Differential Revision: https://reviews.llvm.org/D99030	2021-03-22 09:14:52 +08:00
Roman Lebedev	eaf93dd2b1	[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights 08196e0b2e1f8aaa8a854585335c17ba479114df exposed LowerExpectIntrinsic's internal implementation detail in the form of LikelyBranchWeight/UnlikelyBranchWeight options to the outside. While this isn't incorrect from the results viewpoint, this is suboptimal from the layering viewpoint, and causes confusion - should transforms also use those weights, or should they use something else, D98898? So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight internal again, and fixing all the code that used it directly, which currently is only clang codegen, thankfully, to emit proper @llvm.expect intrinsics instead.	2021-03-21 22:50:21 +03:00
Roman Lebedev	5ae449cd3e	Revert "[BranchProbability] move options for 'likely' and 'unlikely'" Upon reviewing D98898 I've come to the realization that these are implementation details of LowerExpectIntrinsicPass, and they should not be exposed outside of it. This reverts commit ee8b53815ddf6f6f94ade0068903cd5ae843fafa.	2021-03-21 22:50:21 +03:00
Craig Topper	11c658c4e5	[DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization. Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No matter what answer we get back this will be true: (N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits) So we might as well save the computation. This makes the code more consistent with the similar (sext_in_reg (sext x)) handling above.	2021-03-21 11:16:41 -07:00
Nikita Popov	60f065900e	[ValueTracking] Improve mul handling in isKnownNonEqual() X != X * C is true if: * C is not 0 or 1 * X is not 0 * mul is nsw or nuw Proof: https://alive2.llvm.org/ce/z/uwF29z This is motivated by one of the cases in D98422.	2021-03-21 18:41:35 +01:00
Nikita Popov	312855ffec	[ValueTracking] Add more tests for isKnownNonEqual() of mul (NFC) This is for the case of (x * C) == x, rather than the (x * C1) == (x * C2) variant that we already cover.	2021-03-21 18:41:35 +01:00
Matt Arsenault	34dde63750	MIR: Fix missing serialization for HasTailCall	2021-03-21 13:14:04 -04:00
Matt Arsenault	8767e9f9fd	AMDGPU: Fix allowing immediates for tail call pseudo. The pseudo was using SSrc_b64, so it allowed folding immediates into the destination operand for a tail call to null. However, this is not a valid operand for the s_setpc_b64 this will be lowered to. Avoids printing the operand as an invalid immediate. Avoids a regression when tail calls are enabled in GlobalISel (somehow tail calls to null get deleted in the DAG).	2021-03-21 13:14:04 -04:00
Nikita Popov	4766db4e38	Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast() There seems to be an impedance mismatch between what the type system considers an aggregate (structs and arrays) and what constants consider an aggregate (structs, arrays and vectors). Adjust the type check to consider vectors as well. The previous version of the patch dropped the type check entirely, but it turns out that getAggregateElement() does require the constant to be an aggregate in some edge cases: For Poison/Undef the getNumElements() API is called, without checking in advance that we're dealing with an aggregate. Possibly the implementation should avoid doing that, but for now I'm adding an assert so the next person doesn't fall into this trap.	2021-03-21 17:48:21 +01:00
Nikita Popov	b3cc8585de	[InstSimplify] Add load of undef aggregate test (NFC) To make sure this doesn't crash the following commit.	2021-03-21 17:42:26 +01:00
Nikita Popov	d872e52c16	[InstSimplify] Regenerate test checks (NFC)	2021-03-21 17:41:21 +01:00
Nikita Popov	259cab0fcc	[InstSimplify] Add additional select operand replacement tests (NFC) This tests for binops with identity elements.	2021-03-21 15:30:30 +01:00
Nikita Popov	befc0b53a8	[InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI) Replace Op with RepOp up-front, and then always work with the new operands, rather than checking for replacement in various places.	2021-03-21 15:30:30 +01:00
Matt Arsenault	2eda243ee8	GlobalISel: Avoid unnecessary truncation to i64 We can just directly pass through the APInt to create a new constant.	2021-03-21 10:07:41 -04:00
Matt Arsenault	140b871148	AMDGPU/GlobalISel: Enable CSE in pre-legalizer combiner	2021-03-21 10:07:37 -04:00
Simon Pilgrim	4d0156a36f	[DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.	2021-03-21 14:01:37 +00:00
Simon Pilgrim	d0bf75b218	[X86][AVX] ComputeNumSignBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting. Added as an extension to PR49658	2021-03-21 12:22:51 +00:00
Simon Pilgrim	a9cc1a6025	[X86] Add 'mulhs' variant of PR49658 test case	2021-03-21 12:09:05 +00:00
David Green	3dae9e9960	[ARM] VINS f16 pattern This adds an extra pattern for inserting an f16 into an odd vector lane via a VINS. If the dual-insert-lane pattern does not happen to apply, this can help with some simple cases. Differential Revision: https://reviews.llvm.org/D95471	2021-03-21 12:00:06 +00:00
luxufan	01bf694073	[RISCV] remove redundant instruction when eliminate frame index The mv a0, a0 instruction is generated when the stack object offset is larger than int<12>. To deal with this situation, the eliminateFrameIndex function creates a virtual register, which needs the register scavenger to scavenge it. If the machine instruction that contains the stack object has the opcode ADDI (the addi was generated by frameindexNode), and this instruction's destination register is the same as the register produced by the register scavenger, then mv a0, a0 is generated. So to eliminate this instruction, the eliminateFrameIndex function does not create the virtual register when the instruction opcode is ADDI. Differential Revision: https://reviews.llvm.org/D92479	2021-03-21 18:54:00 +08:00
Simon Pilgrim	db3cbc0a8e	[X86][AVX] computeKnownBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting. Suggested by @craig.topper on PR49658	2021-03-21 10:40:57 +00:00
Simon Pilgrim	ee17e81726	[X86] Add PR49658 test case	2021-03-21 10:16:55 +00:00
Simon Pilgrim	a2fff44e8e	[X86] computeKnownBitsForTargetNode - add X86ISD::PMULUDQ handling Reuse the existing KnownBits multiplication code to handle what is effectively an ISD::UMUL_LOHI variant	2021-03-21 09:57:20 +00:00
Craig Topper	a29c1e206d	[RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization does not improve code. If the mul had two users, one of which was a sext.w, the mul would also be selected to a MULW before our pattern runs. This causes the ANDs to now be used by the already selected MULW and the mul we still need to select. They are unneeded on the MULW since MULW only reads the lower bits. So they get selected to SLLI+SRLI for the MULW use. The use for the (mul (and X, 0xffffffff), (and Y, 0xffffffff)) manages to reuse the SLLI. The end result is increased register pressure and no improvement to how soon we can start the MULW.	2021-03-20 17:54:28 -07:00
Andrew Litteken	4514855e45	Revert "[IRSim] Adding basic implementation of llvm-sim." Causing build errors on the Windows Buildbots. This reverts commit 5155dff2784a47583d432d796b7cf47a0bed9f20.	2021-03-20 18:03:09 -05:00
Jessica Clarke	42f6c00a37	[RISCV] Update comment in RISCVInstrInfoM.td Missed in 07ed62b7d551.	2021-03-20 22:35:40 +00:00
Craig Topper	00a544d23a	[RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled. This optimization is trying to save SRLI instructions needed to implement the ANDs. If we have zext.w we won't save anything. Because we don't check that the multiply is the only user of the AND we might even increase instruction count.	2021-03-20 15:31:45 -07:00
Craig Topper	dd7835bd09	[RISCV] Add Zba command lines to xaluo.ll. NFC Some of the patterns end up with 32 to 64 bit zero extends on RV64 which can be handled by zext.w.	2021-03-20 15:31:45 -07:00
Craig Topper	e094271bc1	[RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64 This pattern computes the full 64 bit product of a 32x32 unsigned multiply. This requires two pairs of SLLI+SRLI to zero the upper 32 bits of the inputs. We can do better than this by using two SLLI to move the lower bits to the upper bits then use MULHU to compute the product. This is the high half of a full 64x64 product. Since we put 32 0s in the lower bits of the inputs we know the 128-bit product will have zeros in the lower 64 bits. So the upper 64 bits, which MULHU computes, will contain the original 64 bit product we were after. The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32)) using MULHS, but sext_inreg is sext.w which is already one instruction so we wouldn't save anything. Differential Revision: https://reviews.llvm.org/D99026	2021-03-20 14:55:46 -07:00
Andrew Litteken	db0fe80a86	[IRSim] Adding basic implementation of llvm-sim. This is a similarity visualization tool that accepts a Module and passes it to the IRSimilarityIdentifier. The resulting SimilarityGroups are output in a JSON file. Tests are found in test/tools/llvm-sim and check for the file not found, a bad module, and that the JSON is created correctly. Reviewers: paquette, jroelofs, MaskRay Recommit of: 15645d044bcfe2a0f63156048b302f997a717688 to fix linking errors. Differential Revision: https://reviews.llvm.org/D86974	2021-03-20 16:47:50 -05:00
Jinsong Ji	7a834ebbd0	[AIX] Update rpath for BUILD_SHARED_LIBS BUILD_SHARED_LIBS builds LLVM components as shared libraries, which can reduce the size a lot. Normally, the binary uses ORIGIN../lib to load component libraries; unfortunately, ORIGIN is not supported by AIX ld. We hardcode the build lib and install lib paths in rpath for now to enable the BUILD_SHARED_LIBS build. We understand that this is not a perfect solution; we can update this when we find a better one. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D98901	2021-03-20 20:31:43 +00:00
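
Note on commit e094271bc1 in the list above: the arithmetic behind the SLLI+SLLI+MULHU trick can be checked with a small model. This is only a sketch in Python, not the LLVM isel pattern itself; the helper names (`slli`, `mulhu`, `mul_masked_reference`, `mul_masked_via_mulhu`) are hypothetical and exist purely for illustration. It assumes 64-bit registers, models SLLI as a left shift truncated to 64 bits, and models MULHU as the upper 64 bits of the unsigned 128-bit product.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def slli(x, shamt):
    """Model of a 64-bit logical left shift (bits shifted past bit 63 are lost)."""
    return (x << shamt) & MASK64


def mulhu(a, b):
    """Model of MULHU: upper 64 bits of the unsigned 64x64 -> 128-bit product."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def mul_masked_reference(x, y):
    """(mul (and X, 0xffffffff), (and Y, 0xffffffff)) computed directly."""
    return ((x & MASK32) * (y & MASK32)) & MASK64


def mul_masked_via_mulhu(x, y):
    """Same value using two shifts and one high multiply.

    Shifting each input left by 32 discards its upper 32 bits and leaves
    32 zero bits at the bottom, so the 128-bit product has 64 low zero
    bits and the wanted 64-bit product sits entirely in the high half.
    """
    return mulhu(slli(x, 32), slli(y, 32))


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    for _ in range(1000):
        x = rng.getrandbits(64)
        y = rng.getrandbits(64)
        assert mul_masked_reference(x, y) == mul_masked_via_mulhu(x, y)
    print("ok")
```

The model only confirms that both forms yield the same value; whether the shorter sequence is profitable depends on the surrounding code, as the later commits a29c1e206d and 00a544d23a point out.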


212995 Commits