llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Georgii Rymar	437027774a	[lib/Object, tools] - Make ELFObjectFile::getELFFile return reference. We always have an object, so we don't have to return a pointer. Differential revision: https://reviews.llvm.org/D92560	2020-12-04 16:02:29 +03:00
Kazushi (Jam) Marukawa	2f154fa157	[VE] Add vfadd, vfsub, vfmul, and vfdiv intrinsic instructions Add vfadd, vfsub, vfmul, and vfdiv intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92649	2020-12-04 21:58:51 +09:00
Simon Pilgrim	b7f5cff25b	[X86] LowerRotate - enable custom lowering of ROTL/ROTR vXi16 on VBMI2 targets.	2020-12-04 12:16:59 +00:00
Simon Pilgrim	9eb0d1777d	[X86] LowerRotate - VBMI2 targets can lower vXi16 rotates using funnel shifts. Ideally we'd do this inside DAGCombine but until we can make the FSHL/FSHR opcodes legal for VBMI2 it won't help us.	2020-12-04 11:29:23 +00:00
Simon Pilgrim	b8f498327f	[X86] Let VBMI2 non-VLX targets still use funnel shifts instructions	2020-12-04 11:06:43 +00:00
Georgii Rymar	93e3e13a09	[yaml2obj,obj2yaml] - Make Symbol::Section field optional. This is similar to what we did earlier for fields of the Section class. When a field is optional we can use the =<none> syntax in macros. This was splitted from D92478. Differential revision: https://reviews.llvm.org/D92565	2020-12-04 13:45:47 +03:00
QingShan Zhang	047b9115cc	[PowerPC] Fix the regression caused by commit 9c588f53fc42 Add a TypeLegal check for MVT::i1 and add the test.	2020-12-04 10:22:13 +00:00
Simon Pilgrim	a24449ee17	[DAGCombiner] Use const APInt& for getConstantOperandAPInt results. NFCI. Avoid unnecessary instantiation. Noticed while removing unnecessary autos	2020-12-04 09:44:58 +00:00
Simon Pilgrim	da97e1a88c	[X86] Remove unnecessary bitcast. NFC. The X86ISD::SUBV_BROADCAST node is already VT	2020-12-04 09:44:57 +00:00
Evgeniy Brevnov	c94df02b1d	[NFC][NARY-REASSOCIATE] Restructure code to aviod isPotentiallyReassociatable Currently we have to duplicate the same checks in isPotentiallyReassociatable and tryReassociate. With simple pattern like add/mul this may be not a big deal. But the situation gets much worse when I try to add support for min/max. Min/Max may be represented by several instructions and can take different forms. In order reduce complexity for upcoming min/max support we need to restructure the code a bit to avoid mentioned code duplication. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88286	2020-12-04 16:19:43 +07:00
Evgeniy Brevnov	6aaf92e546	[NARY-REASSOCIATE] Simplify traversal logic by post deleting dead instructions Currently we delete optimized instructions as we go. That has several negative consequences. First it complicates traversal logic itself. Second if newly generated instruction has been deleted the traversal is repeated from scratch. But real motivation for the change is upcoming change with support for min/max reassociation. Here we employ SCEV expander to generate code. As a result newly generated instructions may be inserted not right before original instruction (because SCEV may do hoisting) and there is no way to know 'next' instruction. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88285	2020-12-04 16:17:50 +07:00
Kazu Hirata	9e95dea61d	[JumpThreading] Call eraseBlock when folding a conditional branch This patch teaches the jump threading pass to call BPI->eraseBlock when it folds a conditional branch. Without this patch, BranchProbabilityInfo could end up with stale edge probabilities for the basic block containing the conditional branch -- one edge probability with less than 1.0 and the other for a removed edge. Differential Revision: https://reviews.llvm.org/D92608	2020-12-03 23:50:17 -08:00
Max Kazantsev	6a8ff5d40d	Return "[IndVars] ICmpInst should not prevent IV widening" This reverts commit 4bd35cdc3ae1874c6d070c5d410b3f591de54ee6. The patch was reverted during the investigation. The investigation shown that the patch did not cause any trouble, but just exposed the existing problem that is addressed by the previous patch "[IndVars] Quick fix LHS/RHS bug". Returning without changes.	2020-12-04 12:34:43 +07:00
Max Kazantsev	da72560ea4	[IndVars] Quick fix LHS/RHS bug The code relies on fact that LHS is the NarrowDef but never really checks it. Adding the conservative restrictive check, will follow-up with handling of case where RHS is a NarrowDef.	2020-12-04 12:34:42 +07:00
Jianzhou Zhao	37af161d61	[dfsan] Support passing non-i16 shadow values in TLS mode This is a child diff of D92261. It extended TLS arg/ret to work with aggregate types. For a function t foo(t1 a1, t2 a2, ... tn an) Its arguments shadow are saved in TLS args like a1_s, a2_s, ..., an_s TLS ret simply includes r_s. By calculating the type size of each shadow value, we can get their offset. This is similar to what MSan does. See __msan_retval_tls and __msan_param_tls from llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp. Note that this change does not add test cases for overflowed TLS arg/ret because this is hard to test w/o supporting aggregate shdow types. We will be adding them after supporting that. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D92440	2020-12-04 02:45:07 +00:00
Arthur Eubanks	4e9e752fe4	Use isIgnored instead of checking pass name In preparation for https://reviews.llvm.org/D92616 which will remove angle brackets from pass manager/adaptor names. Reviewed By: dexonsmith, thakis Differential Revision: https://reviews.llvm.org/D92625	2020-12-03 18:37:57 -08:00
Duncan P. N. Exon Smith	fac28722eb	Support: Change InMemoryFileSystem::addFileNoOwn to take a MemoryBufferRef, NFC Found this by chance when looking at the InMemoryFileSystem API, seems like an easy cleanup. Differential Revision: https://reviews.llvm.org/D90893	2020-12-03 18:09:52 -08:00
Xiang1 Zhang	47e9765193	[X86] Unbind the ebx with GOT address in regcall calling convention No register can be allocated for indirect call when it use regcall calling convention and passed 5/5+ args. For example: call vreg (ag1, ag2, ag3, ag4, ag5, ...) --> 5 regs (EAX, ECX, EDX, ESI, EDI) used for pass args, 1 reg (EBX )used for hold GOT point, so no regs can be allocated to vreg. The Intel386 architecture provides 8 general purpose 32-bit registers. RA mostly use 6 of them (EAX, EBX, ECX, EDX, ESI, EDI). 5 of this regs can be used to pass function arguments (EAX, ECX, EDX, ESI, EDI). EBX used to hold the GOT pointer when making function calls via the PLT. ESP and EBP usually be "reserved" in register allocation. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D91020	2020-12-04 10:00:13 +08:00
Arthur Eubanks	5bc5d5ec44	[NewPM] Support --print-before/after in NPM This changes --print-before/after to be a list of strings rather than legacy passes. (this also has the effect of not showing the entire list of passes in --help-hidden after --print-before/after, which IMO is great for making it less verbose). Currently PrintIRInstrumentation passes the class name rather than pass name to llvm::shouldPrintBeforePass(), meaning llvm::shouldPrintBeforePass() never functions as intended in the NPM. There is no easy way of converting class names to pass names outside of within an instance of PassBuilder. This adds a map of pass class names to their short names in PassRegistry.def within PassInstrumentationCallbacks. It is populated inside the constructor of PassBuilder, which takes a PassInstrumentationCallbacks. Add a pointer to PassInstrumentationCallbacks inside PrintIRInstrumentation and use the newly created map. This is a bit hacky, but I can't think of a better way since the short id to class name only exists within PassRegistry.def. This also doesn't handle passes not in PassRegistry.def but rather added via PassBuilder::registerPipelineParsingCallback(). llvm/test/CodeGen/Generic/print-after.ll doesn't seem very useful now with this change. Reviewed By: ychen, jamieschmeiser Differential Revision: https://reviews.llvm.org/D87216	2020-12-03 16:52:14 -08:00
Craig Topper	2eefcbd100	[RISCV] Rename FPCCToExtend->FPOpToExpand and FPOpToExtend->FPOpToExpand. NFC These are used to call setOperationAction/setCondCodeAction with the Expand action so it seems that Expand is a better name than Extend.	2020-12-03 16:00:49 -08:00
Philip Reames	bb48347d74	Use deref facts derived from minimum object size of allocations This change should be fairly straight forward. If we've reached a call, check to see if we can tell the result is dereferenceable from information about the minimum object size returned by the call. To control compile time impact, I'm only adding the call for base facts in the routine. getObjectSize can also do recursive reasoning, and we don't want that general capability here. As a follow up patch (without separate review), I will plumb through the missing TLI parameter. That will have the effect of extending this to known libcalls - malloc, new, and the like - whereas currently this only covers calls with the explicit allocsize attribute. Differential Revision: https://reviews.llvm.org/D90341	2020-12-03 15:01:14 -08:00
Philip Reames	17055bcff3	[LoopVec] Support non-instructions as argument to uniform mem ops The initial step of the uniform-after-vectorization (lane-0 demanded only) analysis was very awkwardly written. It would revisit use list of each pointer operand of a widened load/store. As a result, it was in the worst case O(N^2) where N was the number of instructions in a loop, and had restricted operand Value types to reduce the size of use lists. This patch replaces the original algorithm with one which is at most O(2N) in the number of instructions in the loop. (The key observation is that each use of a potentially interesting pointer is visited at most twice, once on first scan, once in the use list of it's operand. Only instructions within the loop have their uses scanned.) In the process, we remove a restriction which required the operand of the uniform mem op to itself be an instruction. This allows detection of uniform mem ops involving global addresses. Differential Revision: https://reviews.llvm.org/D92056	2020-12-03 14:51:44 -08:00
Fangrui Song	0924137113	Revert D90844 "[TableGen][SchedModels] Fix read/write variant substitution" This reverts commit 112b3cb6ba49aacd821440d0913f15b32131480e. D90844 made lib/Target/AArch64/AArch64GenSubtargetInfo.inc non-deterministic.	2020-12-03 14:24:29 -08:00
Craig Topper	6d987ac858	[RISCV] Merge FMV_H_X_RV32/FMV_H_X_RV64 into a single opcode. Same with FMV_X_ANYEXTH_RV32/RV64 Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes. Differential Revision: https://reviews.llvm.org/D92538	2020-12-03 11:12:40 -08:00
serge-sans-paille	8db0e1abee	Speedup some unicode rendering Use a fast path for column width computation for ascii characters. Especially relevant for llvm-objdump. before: % time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null ./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.75s user 0.01s system 99% cpu 0.757 total after: % time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null ./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.37s user 0.01s system 99% cpu 0.378 total Differential Revision: https://reviews.llvm.org/D92180	2020-12-03 20:11:11 +01:00
Anna Thomas	5d70260f30	[ScalarizeMaskedMemIntrin] NFC: Pass args by reference	2020-12-03 14:04:21 -05:00
Fangrui Song	972a573aa5	[Metadata] Fix layer violation in D91576 There is a library layering issue. LLVMAnalysis provides llvm/Analysis/ScopedNoAliasAA.h and depends on LLVMCore. LLVMCore provides llvm/IR/Metadata.cpp and it should not include a header file in LLVMAnalysis	2020-12-03 10:58:46 -08:00
Craig Topper	8ad0bb8601	[RISCV] Remove RISCVMergeBaseOffsetOpt from the -O0 pass pipeline. Internally the pass skips any function with the optnone attribute. But that still requires checking each function. If the opt level is set to None we might as well just skip putting in the pipeline at all. This what is already done for many of the passes added by TargetPassConfig. Differential Revision: https://reviews.llvm.org/D92511	2020-12-03 09:58:25 -08:00
modimo	6b81c5b99d	[MemCpyOpt] Correctly merge alias scopes during call slot optimization When MemCpyOpt performs call slot optimization it will concatenate the `alias.scope` metadata between the function call and the memcpy. However, scoped AA relies on the domains in metadata to be maintained in a caller-callee relationship. Naive concatenation breaks this assumption leading to bad AA results. The fix is to take the intersection of domains then union the scopes within those domains. The original bug came from a case of rust bad codegen which uses this bad aliasing to perform additional memcpy optimizations. As show in the added test case `%src` got forwarded past its lifetime leading to a dereference of garbage data. Testing ninja check-llvm Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D91576	2020-12-03 09:23:37 -08:00
Kazu Hirata	b51993db92	[X86] Remove DecodeVPERMVMask and DecodeVPERMV3Mask This patch removes the variants of DecodeVPERMVMask and DecodeVPERMV3Mask that take "const Constant C" as they are not used anymore. They were introduced on Sep 8, 2015 in commit e88038f23517ffc741acfd307ff92e2b1af136d8. The last use of DecodeVPERMVMask(const Constant C, ...) was removed on Feb 7, 2016 in commit 73fc26b44a8591b15f13eaffef17e67161c69388. The last use of DecodeVPERMV3Mask(const Constant *C, ...) was removed on May 28, 2018 in commit dcfcfdb0d166fff8388bdd2edc5a2948054c9da1. Differential Revision: https://reviews.llvm.org/D91926	2020-12-03 09:12:02 -08:00
Anna Thomas	b963f5b680	[ScalarizeMaskedMemIntrin] NFC: Convert member functions to static This will make it easier to add new PM support once the pass is moved into transforms (D92407).	2020-12-03 11:46:38 -05:00
Ahmed Bougacha	fe6a3c2668	[Triple][MachO] Define "arm64e", an AArch64 subarch for Pointer Auth. This also teaches MachO writers/readers about the MachO cpu subtype, beyond the minimal subtype reader support present at the moment. This also defines a preprocessor macro to allow users to distinguish __arm64__ from __arm64e__. arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing pointer-authentication codegen. It also currently defaults to ios14 and macos11. Differential Revision: https://reviews.llvm.org/D87095	2020-12-03 07:53:59 -08:00
Baptiste Saleil	e737a423ce	[PowerPC] Fix for excessive ACC copies due to PHI nodes When using accumulators in loops, they are passed around in PHI nodes of unprimed accumulators, causing the generation of additional prime/unprime instructions. This patch detects these cases and changes these PHI nodes to primed accumulator PHI nodes. We also add IR and MIR test cases for several PHI node cases. Differential Revision: https://reviews.llvm.org/D91391	2020-12-03 09:51:23 -06:00
Yonghong Song	2b2723c653	[BPF] support atomic instructions Implement fetch_<op>/fetch_and_<op>/exchange/compare-and-exchange instructions for BPF. Specially, the following gcc intrinsics are implemented. __sync_fetch_and_add (32, 64) __sync_fetch_and_sub (32, 64) __sync_fetch_and_and (32, 64) __sync_fetch_and_or (32, 64) __sync_fetch_and_xor (32, 64) __sync_lock_test_and_set (32, 64) __sync_val_compare_and_swap (32, 64) For __sync_fetch_and_sub, internally, it is implemented as a negation followed by __sync_fetch_and_add. For __sync_lock_test_and_set, despite its name, it actually does an atomic exchange and return the old content. https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html For intrinsics like __sync_{add,sub}_and_fetch and __sync_bool_compare_and_swap, the compiler is able to generate codes using __sync_fetch_and_{add,sub} and __sync_val_compare_and_swap. Similar to xadd, atomic xadd, xor and xxor (atomic_<op>) instructions are added for atomic operations which do not have return values. LLVM will check the return value for __sync_fetch_and_{add,and,or,xor}. If the return value is used, instructions atomic_fetch_<op> will be used. Otherwise, atomic_<op> instructions will be used. All new instructions only support 64bit and 32bit with alu32 mode. old xadd instruction still supports 32bit without alu32 mode. For encoding, please take a look at test atomics_2.ll. Differential Revision: https://reviews.llvm.org/D72184	2020-12-03 07:38:00 -08:00
dfukalov	b944ac9e0a	[NFC] Reduce include files dependency. 1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489	2020-12-03 18:25:05 +03:00
Paul C. Anagnostopoulos	15379f2764	[TableGen] Eliminate the 'code' type Update the documentation. Rework various backends that relied on the code type. Differential Revision: https://reviews.llvm.org/D92269	2020-12-03 10:19:11 -05:00
Kazushi (Jam) Marukawa	6ffbb84f85	[VE] Add vsll, vsrl, vsla, vsra, and vsfa intrinsic instructions Add vsll, vsrl, vsla, vsra, and vsfa intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92550	2020-12-03 23:19:58 +09:00
Joe Ellis	3e31000cc2	[DAGCombine] Fix TypeSize warning in DAGCombine::visitLIFETIME_END Bail out early if we encounter a scalable store. Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D92392	2020-12-03 12:12:41 +00:00
Tim Northover	a8373fee73	arm64: count Triple::aarch64_32 as an aarch64 target and enable leaf frame pointers	2020-12-03 11:09:44 +00:00
Max Kazantsev	78b51b6b79	Revert "[IndVars] ICmpInst should not prevent IV widening" This reverts commit 0c9c6ddf17bb01ae350a899b3395bb078aa0c62e. We are seeing some failures with this patch locally. Not clear if it's causing them or just triggering a problem in another place. Reverting while investigating.	2020-12-03 18:01:41 +07:00
Kazushi (Jam) Marukawa	4516e4161b	[VE] Add veqv and vseq intrinsic instructions Add veqv and vseq intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92527	2020-12-03 17:39:24 +09:00
modimo	5b1e62daa4	[NFC] Fix typo	2020-12-02 22:23:57 -08:00
Jianzhou Zhao	cb92e3d61f	[dfsan] Rename ShadowTy/ZeroShadow with prefix Primitive This is a child diff of D92261. After supporting field/index-level shadow, the existing shadow with type i16 works for only primitive types. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D92459	2020-12-03 05:31:01 +00:00
Craig Topper	a584f7033c	[RISCV] Add f16 to isFMAFasterThanFMulAndFAdd now that the Zfh extension is supported	2020-12-02 20:31:43 -08:00
QingShan Zhang	563c9f0908	[PowerPC] Add the hw sqrt test for vector type v4f32/v2f64 PowerPC ISA support the input test for vector type v4f32 and v2f64. Replace the software compare with hw test will improve the perf. Reviewed By: ChenZheng Differential Revision: https://reviews.llvm.org/D90914	2020-12-03 03:19:18 +00:00
Kazu Hirata	4dd4adf3d5	[SelectionDAG] Use is_contained (NFC)	2020-12-02 19:09:45 -08:00
Craig Topper	5b5b0951c9	[RISCV] Initialize MergeBaseOffsetOptPass so it will work with print-before/after-all. If its not in the PassRegistry it's not recognized as a pass when we print before/after. Happened to notice while I was working on a new pass.	2020-12-02 18:04:22 -08:00
Hsiangkai Wang	39e5617bc0	[RISCV] Support Zfh half-precision floating-point extension. Support "Zfh" extension according to https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex Differential Revision: https://reviews.llvm.org/D90738	2020-12-03 09:16:33 +08:00
Xun Li	5243362b90	Small improvements to Intrinsic::getName While I was adding a new intrinsic instruction (not overloaded), I accidentally used CreateUnaryIntrinsic to create the intrinsics, which turns out to be passing the type list to getName, and ended up naming the intrinsics function with type suffix, which leads to wierd bugs latter on. It took me a long time to debug. It seems a good idea to add an assertion in getName so that it fails if types are passed but it's not a overloaded function. Also, the overloade version of getName is less efficient because it creates an std::string. We should avoid calling it if we know that there are no types provided. Differential Revision: https://reviews.llvm.org/D92523	2020-12-02 16:49:12 -08:00
Mircea Trofin	611de20466	[NFC][MC] TargetRegisterInfo::getSubReg is a MCRegister. Typing the API appropriately. Differential Revision: https://reviews.llvm.org/D92341	2020-12-02 15:46:38 -08:00

1 2 3 4 5 ...

141685 Commits