llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	32e46239ea	[X86] Don't attempt to fold sub(C1, xor(X, C2)) with opaque constants Fixes PR49451	2021-03-11 12:06:40 +00:00
Serguei Katkov	3b0fb3a8e1	[Statepoint Lowering] Handle the case with several gc.result Recently gc.result has been marked with readnone instead of readonly and this opens a door for different optimization to duplicate gc.result. Statepoint lowering is not ready to see several gc.results. The problem appears when there are gc.results with one located in the same basic block and another located in other basic block. In this case we need both export VR and fill local setValue. Note that this case is not sufficient optimization done before CodeGen. It is evident that local gc.result dominates all other gc.results and it is handled by GVN and EarlyCSE. But anyway, even if IR is not optimal Backend should not crash on a valid IR. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98393	2021-03-11 18:44:44 +07:00
Thomas Preud'homme	f41f493697	[FileCheck] Fix naming of OverflowErrorStr var As pointed out by Joel E. Denny in D97845, the OverflowErrorStr variable is misnamed because the error is raised for any parsing error. Note that in FileCheck proper this only happens in case of (under|over)flow because the regex will ensure a number in the correct format is matched. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D98342	2021-03-11 10:31:04 +00:00
Jay Foad	c6bd271cd0	[MCA] Support in-order CPUs with MicroOpBufferSize=1 Differential Revision: https://reviews.llvm.org/D98356	2021-03-11 10:12:54 +00:00
Nikita Popov	a833e60074	Reapply [LICM] Make promotion faster Relative to the previous implementation, this always uses aliasesUnknownInst() instead of aliasesPointer() to correctly handle atomics. The added test case was previously miscompiled. ----- Even when MemorySSA-based LICM is used, an AST is still populated for scalar promotion. As the AST has quadratic complexity, a lot of time is spent in this step despite the existing access count limit. This patch optimizes the identification of promotable stores. The idea here is pretty simple: We're only interested in must-alias mod sets of loop invariant pointers. As such, only populate the AST with loop-invariant loads and stores (anything else is definitely not promotable) and then discard any sets which alias with any of the remaining, definitely non-promotable accesses. If we promoted something, check whether this has made some other accesses loop invariant and thus possible promotion candidates. This is much faster in practice, because we need to perform AA queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable) instead of O(NumTotal^2), and NumPromotable tends to be small. Additionally, promotable accesses have loop invariant pointers, for which AA is cheaper. This has a significant positive compile-time impact. We save ~1.8% geomean on CTMark at O3, with 6% on lencod in particular and 25% on individual files. Conceptually, this change is NFC, but may not be so in practice, because the AST is only an approximation, and can produce different results depending on the order in which accesses are added. However, there is at least no impact on the number of promotions (licm.NumPromoted) in test-suite O3 configuration with this change. Differential Revision: https://reviews.llvm.org/D89264	2021-03-11 10:50:28 +01:00
Augusto Noronha	7126fca5c3	Save and restore previous terminal after setting the terminal for checking if terminal supports colors. The call to "set_curterm" inside the "terminalHasColors" function breaks the EditLine configuration on some Linux distributions, causing certain characters that have functions bound to them to not show up and backspace to stop deleting characters (only visually). This patch ensures that term struct is restored after the routine for checking if terminal supports colors is done, which fixes the aforementioned issue. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D95230	2021-03-11 10:47:06 +01:00
Djordje Todorovic	1e88deac13	[Debugify][OriginalDIMode] Export the report into JSON file By using the original-di check with debugify in combination with llvm/utils/llvm-original-di-preservation.py, it becomes a very user-friendly tool. An example of the HTML page with the issues related to debug info can be found at [0]. [0] https://djolertrk.github.io/di-checker-html-report-example/ Differential Revision: https://reviews.llvm.org/D82546	2021-03-11 01:11:13 -08:00
David Blaikie	82a7e62f0f	Fix unused lambda capture in a non-asserts build For locally scoped lambdas like this there's no particular benefit to explicitly listing captures - or avoiding capturing this. Switch to [&] and make it all easier to maintain. (& driveby change std::function to llvm::function_ref)	2021-03-11 00:22:18 -08:00
Petr Hosek	1de679fe12	[InstrProfiling] Don't generate __llvm_profile_runtime_user This is no longer needed, we can add __llvm_profile_runtime directly to llvm.compiler.used or llvm.used to achieve the same effect. Differential Revision: https://reviews.llvm.org/D98325	2021-03-10 22:33:51 -08:00
Juneyoung Lee	f828df253e	Resolve unused variable warning (NFC)	2021-03-11 12:03:03 +09:00
Vy Nguyen	ec56220409	[llvm] Fix thinko in getVendorSignature(), where expected values of ECX and EDX were flipped for the AMD case. Follow up to D97504 Differential Revision: https://reviews.llvm.org/D98322	2021-03-10 21:39:19 -05:00
Juneyoung Lee	2a52d0ee68	[InstSimplify] Pass SimplifyQuery to computePointerICmp (NFC)	2021-03-11 11:13:46 +09:00
Ruiling Song	d4ef89cda8	[AMDGPU] Always create Stack Object for reserved VGPR As we may overwrite inactive lanes of a caller-save-vgpr, we should always save/restore the reserved vgpr for sgpr spill. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D98319	2021-03-11 10:06:07 +08:00
Ruiling Song	648b01f347	[ValueMapper] Add debug output for metadata remapping This is useful for debugging which pointers are updated during remapping process. Differential Revision: https://reviews.llvm.org/D95775	2021-03-11 09:54:55 +08:00
Daniel Sanders	4335051f8d	[mir] Change 'undef' for MMO base addresses to 'unknown-address' Differential Revision: https://reviews.llvm.org/D98100	2021-03-10 16:46:44 -08:00
Reid Kleckner	007699e694	Re-land "[PDB] Defer relocating .debug$S until commit time and parallelize it" This reverts commit bacf9cf2c5cdec3567580e5030c4c82f42b3d745 and reinstates commit 1a9bd5b81328adf0dd5a8b4f3ad5949463e66da3. Reverting this commit did not appear to make the problem go away, so we can go ahead and reland it.	2021-03-10 15:14:09 -08:00
Wael Yehia	4079c2fd64	llvm-lto: default Relocation Model should be selected by the TargetMachine. Right now, the createTargetMachine function in LTOBackend.cpp (used by llvm-lto, and other components) selects the default Relocation Model when none is specified in the module. Other components (such as opt and llc) that construct a TargetMachine delegate the decision on the default value to the polymorphic TargetMachine's constructor. This commit aligns llvm-lto with other components. Reviewed By: daltenty, fhahn Differential Revision: https://reviews.llvm.org/D97507	2021-03-10 17:31:26 -05:00
David Green	931c7a5ab6	[AArch64] Extend vecreduce -> udot handling to mla reductions We previously have lowering for: vecreduce.add(zext(X)) to vecreduce.add(UDOT(zero, X, one)) This extends that to also handle: vecreduce.add(mul(zext(X), zext(Y)) to vecreduce.add(UDOT(zero, X, Y)) It extends the existing code to optionally handle a mul with equal extends. Differential Revision: https://reviews.llvm.org/D97280	2021-03-10 22:25:12 +00:00
kuterd	682dd42aa3	[Attributor] Attributor call site specific AAValueConstantRange This patch makes uses of the context bridges introduced in D83299 to make AAValueConstantRange call site specific. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83744	2021-03-11 01:19:44 +03:00
David Green	d76d8c5fde	[AArch64] Extend vecreduce -> udot handling to v8i8 https://reviews.llvm.org/D88577 added v16i8 vecreduce to udot/sdot lowering. This extends that to v8i8 too, generalizing the pattern to handle the extra types. Differential Revision: https://reviews.llvm.org/D97279	2021-03-10 21:03:15 +00:00
Mauri Mustonen	3e49aa8f87	[VPlan] Support to widen select instructions in VPlan native path Add support to widen select instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer. Previously select instructions got handled by the wrong recipe and resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D97136	2021-03-10 20:59:53 +00:00
Alexey Lapshin	c75c8e75b3	[llvm-objcopy][NFC] replace class Buffer/MemBuffer/FileBuffer with streams. During D88827 it was requested to remove the local implementation of Memory/File Buffers: // TODO: refactor the buffer classes in LLVM to enable us to use them here // directly. This patch uses raw_ostream instead of Buffers. Generally, using streams could allow us to reduce memory usages. No need to load all data into the memory - the data could be streamed through a smaller buffer. Thus, this patch uses raw_ostream as an interface for output data: Error executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out); Note 1. This patch does not change the implementation of Writers so that data would be directly stored into raw_ostream. This is assumed to be done later. Note 2. It would be better if Writers would be implemented in a such way that data could be streamed without seeking/updating. If that would be inconvenient then raw_ostream could be replaced with raw_pwrite_stream to have a possibility to seek back and update file headers. This is assumed to be done later if necessary. Note 3. Current FileOutputBuffer allows using a memory-mapped file. The raw_fd_ostream (which could be used if data should be stored in the file) does not allow us to use a memory-mapped file. Memory map functionality could be implemented for raw_fd_ostream: It is possible to add resize() method into raw_ostream. class raw_ostream { void resize(uint64_t size); } That method, implemented for raw_fd_ostream, could create a memory-mapped file. The streamed data would be written into that memory file then. Thus we would be able to use memory-mapped files with raw_fd_ostream. This is assumed to be done later if necessary. Differential Revision: https://reviews.llvm.org/D91028	2021-03-10 23:50:04 +03:00
Stanislav Mekhanoshin	056503ce50	[AMDGPU] Disable SCC bit on fp atomics Differential Revision: https://reviews.llvm.org/D98221	2021-03-10 12:36:09 -08:00
Stanislav Mekhanoshin	2ed90deb94	[AMDGPU] Always expand system scope fp atomics on gfx90a FP atomics in system scope cannot be used and shall always be expanded in a CAS loop. Differential Revision: https://reviews.llvm.org/D98085	2021-03-10 12:35:23 -08:00
Matteo Favaro	2d8f67490d	[DSE] Extending isOverwrite to support offsetted fully overlapping stores The isOverwrite function is making sure to identify if two stores are fully overlapping and ideally we would like to identify all the instances of OW_Complete as they'll yield possibly killable stores. The current implementation is incapable of spotting instances where the earlier store is offsetted compared to the later store, but still fully overlapped. The limitation seems to lie on the computation of the base pointers with the GetPointerBaseWithConstantOffset API that often yields different base pointers even if the stores are guaranteed to partially overlap (e.g. the alias analysis is returning AliasResult::PartialAlias). The patch relies on the offsets computed and cached by BatchAAResults (available after D93529) to determine if the offsetted overlapping is OW_Complete. Differential Revision: https://reviews.llvm.org/D97676	2021-03-10 21:09:33 +01:00
Sriraman Tallam	af0d8fe721	Remove original implementation of UniqueInternalLinkageNames pass. D96109 was recently submitted which contains the refactored implementation of -funique-internal-linkage-names by adding the unique suffixes in clang rather than as an LLVM pass. Deleting the former implementation in this change. Differential Revision: https://reviews.llvm.org/D98234	2021-03-10 11:57:40 -08:00
Rafael Auler	84a67d070d	[RuntimeDyld] Support more relocations This patch introduces functionality used by BOLT when re-linking the final binary. It adds new relocation types that are currently unsupported by RuntimeDyldELF. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D97899	2021-03-10 11:19:38 -08:00
Quentin Colombet	68c832f40a	[NFC] Fix compiler warnings Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298	2021-03-10 11:03:50 -08:00
Amy Kwan	009e68a8c7	[PowerPC] Implement patterns for PC-Rel zextload/extload byte loads This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads. Differential Revision: https://reviews.llvm.org/D98042	2021-03-10 12:18:13 -06:00
gbtozers	bed8d760fe	[DebugInfo][NFC] Refactor BinOp+GEP salvaging in salvageDebugInfoImpl This patch refactors out the salvaging of GEP and BinOp instructions into separate functions, in preparation for further changes to the salvaging of these instructions coming in another patch; there should be no functional change as a result of this refactor. Differential Revision: https://reviews.llvm.org/D92851	2021-03-10 18:03:12 +00:00
Craig Topper	49c7edfa1f	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Craig Topper	03a9d9f036	[RISCV] Start fixing issues that prevent us from testing vXi64 intrinsics on RV32. Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen, just that it is functional. This first version handles 3 cases: the vmv.v.x intrinsic, the vmv.s.x intrinsic, and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a really unoptimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand and now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895	2021-03-10 09:45:38 -08:00
Daniil Seredkin	db24a5bc2f	[InstCombine][SimplifyLibCalls] An extra sqrtf was produced because of transformations in optimizePow function See: https://bugs.llvm.org/show_bug.cgi?id=47613 There was an extra sqrt call because shrinking emitted a new powf and at the same time optimizePow replaces the previous pow with sqrt, and as a result we have two instructions that will be in the worklist of InstCombine despite the fact that %powf is not used by anyone (it is alive because of errno). As a result we have two instructions: %powf = call fast float @powf(float %x, float 5.000000e-01) %sqrt = call fast double @sqrt(double %dx) %powf will be converted to %sqrtf on a later iteration. As a quick fix for that I moved shrinking to the end of optimizePow so that pow is replaced with sqrt first, which avoids emitting a new shrunk powf. Differential Revision: https://reviews.llvm.org/D98235	2021-03-10 12:33:05 -05:00
Craig Topper	0cac1b0df2	[RISCV] Manually split vector operands to VECREDUCE when handling vXi64 vectors on RV32. The type legalizer will visit the result before the operands. To avoid creating an illegal target specific node or falling back to scalarization, we need to manually split vector operands. This still doesn't handle the case of non-power of 2 operands which need to be widened. I'm not sure the type legalizer is ready for it. I think we would need to insert an INSERT_SUBVECTOR with the power of 2 type we want, with an undef first operand, and the non-power of 2 original operand as the vector to insert. Then fill in the neutral elements into the padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector. From there we carry on splitting if needed to get to a legal type then do the target specific code. The problem with this is the type legalizer doesn't know how to widen an insert_subvector yet. We would need to add that including the handling for a non-undef first vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98292	2021-03-10 09:27:38 -08:00
Ta-Wei Tu	95819d3c63	Revert "[LoopInterchange] Replace tightly-nesting-ness check with the one from `LoopNest`" This reverts commit df9158c9a45a6902c2b0394f9bd6512e3e441f31.	2021-03-11 01:24:43 +08:00
Stephen Tozer	727772e7af	[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578	2021-03-10 17:15:24 +00:00
Jianzhou Zhao	419c64802a	[dfsan] Tracking origins at phi nodes This is a part of https://reviews.llvm.org/D95835. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D98268	2021-03-10 17:02:58 +00:00
Dávid Bolvanský	6294fcabc7	[DSE] Handle memmove with equal non-const sizes Follow up for fhahn's D98284. Also fixes a case from PR47644. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D98346	2021-03-10 17:52:00 +01:00
Jay Foad	d51dcd4c3d	[AMDGPU] Fix isReallyTriviallyReMaterializable for V_MOV_* D57708 changed SIInstrInfo::isReallyTriviallyReMaterializable to reject V_MOVs with extra implicit operands, but it accidentally rejected all V_MOVs because of their implicit use of exec. Fix it but avoid adding a moderately expensive call to MI.getDesc().getNumImplicitUses(). In real graphics shaders this changes quite a few vgpr copies into move- immediates, which is good for avoiding stalls on GFX10. Differential Revision: https://reviews.llvm.org/D98347	2021-03-10 16:18:12 +00:00
Stephen Tozer	10d5b90e89	Reapply "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST" This reverts commit 429c6ecbb302e2beedd8694378ae5be456206209.	2021-03-10 15:59:24 +00:00
Yusra Syeda	b994fddd41	[SystemZ][NFC] Renaming of ELF specific variables. Rename ELF specific variables, making it easier to add the XPLink variables in future patches. Reviewed By: abhina.sreeskantharajan, Kai Differential Revision: https://reviews.llvm.org/D98199	2021-03-10 10:15:01 -05:00
Stephen Tozer	fb96dd1917	Revert "[DebugInfo] Add DWARF emission for DBG_VALUE_LIST" This reverts commit 0da27ba56c9f5e3f534a65401962301189eac342. This revision was causing an error on the sanitizer-x86_64-linux-autoconf build.	2021-03-10 14:35:33 +00:00
gbtozers	e2ad584fe3	[DebugInfo] Add DWARF emission for DBG_VALUE_LIST This patch allows DBG_VALUE_LIST instructions to be emitted to DWARF with valid DW_AT_locations. This change mainly affects DbgEntityHistoryCalculator, which now tracks multiple registers per value, and DwarfDebug+DwarfExpression, which can now emit multiple machine locations as part of a DWARF expression. Differential Revision: https://reviews.llvm.org/D83495	2021-03-10 13:46:20 +00:00
Jingu Kang	0747239149	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
Christudasan Devadasan	aa8030a6bf	GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM It is good to have a combined `divrem` instruction when the `div` and `rem` are computed from identical input operands. Some targets can lower them through a single expansion that computes both division and remainder. It effectively reduces the number of instructions than individually expanding them. Reviewed By: arsenm, paquette Differential Revision: https://reviews.llvm.org/D96013	2021-03-10 18:46:07 +05:30
Jinzheng Tu	e25add7483	[NFC] Unify FIME with FIXME in comments There are 5 occurrences of FIME and 15333 of FIXME. All of them should be FIXME. Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D98321	2021-03-10 14:00:51 +01:00
Serguei Katkov	26844947fb	[Statepoint Lowering] Fix the crash with gc.relocate in a separate block If it was decided to relocate a derived pointer using the spill, its value is not exported in the general case. When gc.relocate is located in a different block than the statepoint, we cannot get an SD for the derived value, but for the spill case it is not required at all. However, the implementation of gc.relocate lowering unconditionally requests the SD value, triggering the assert. The CL fixes this by handling the spill case before the SD is really required. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98324	2021-03-10 19:51:04 +07:00
gbtozers	308759bb3b	[DebugInfo] Process DBG_VALUE_LIST in LiveDebugVariables This patch adds support for DBG_VALUE_LIST in the LiveDebugVariables pass. The changes are mostly in computeIntervals, extendDef, and addDefsFromCopies; when extending the def of a DBG_VALUE_LIST the live ranges of every used register must be considered, and when such a def is killed by more than one of its used registers being killed at the same time it is necessary to find valid copies of all of those registers to create a new def with. The DebugVariableValue class has also been changed to reference multiple location numbers instead of just one. This has been accomplished by using a C-style array with a unique_ptr and an array length packed into 6 bits, to minimize the size of the class (which must be kept low to be used with IntervalMap). This may not be the most efficient solution possible, and should be looked at if performance issues arise. Differential Revision: https://reviews.llvm.org/D83895	2021-03-10 12:37:59 +00:00
Alex Richardson	6ae0a3f9c3	[SLC] Simplify strcpy and friends with non-zero address spaces The current logic in TargetLibraryInfoImpl::getLibFunc() was only treating strcpy, etc. with i8* arguments in address space zero as a valid library function. However, in the CHERI and Morello targets we expect all libc functions to use address space 200 arguments. This commit updates isValidProtoForLibFunc() to check that the argument is a pointer type. This also drops the check for i8* since we should not be checking the pointee type any more. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D95142	2021-03-10 11:17:34 +00:00
Florian Hahn	9450d09863	[DSE] Handle memcpy/memset with equal non-const sizes. Currently DSE misses cases where the size is a non-const IR value, even if they match. For example, this means that llvm.memcpy/llvm.memset calls are not eliminated, even if they write the same number of bytes. This patch extends isOverwrite to try to get IR values for the number of bytes written from the analyzed instructions. If the values match, alias checks are performed and the result is returned. At the moment this only covers llvm.memcpy/llvm.memset. In the future, we may enable MemoryLocation to also track variable sizes, but this simple approach should allow us to cover the important cases in DSE. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98284	2021-03-10 10:13:58 +00:00


145095 Commits