llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

Author	SHA1	Message	Date
Kai Luo	24e6d035e6	[DAG][PowerPC] Fix dropped `nsw` flag in `SimplifySetCC` by adding `doesNodeExist` helper `SimplifySetCC` invokes `getNodeIfExists` without passing `Flags` argument and `getNodeIfExists` uses a default `SDNodeFlags` to intersect the original flags, as a consequence, flags like `nsw` is dropped. Added a new helper function `doesNodeExist` to check if a node exists without modifying its flags. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D89938	2020-11-25 04:39:03 +00:00
Zarko Todorovski	d5cb6187da	[PPC][AIX] Add vector callee saved registers for AIX extended vector ABI This patch is the initial patch for support of the AIX extended vector ABI. The extended ABI treats vector registers V20-V31 as non-volatile and we add them as callee saved registers in this patch. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D88676	2020-11-24 23:01:51 -05:00
Hongtao Yu	e6841e5f41	[SelectionDAG] Add PseudoProbeSDNode to LargestSDNode to fix 32-bt build break.	2020-11-24 15:35:08 -08:00
Zarko Todorovski	42371b1ea0	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Janek van Oirschot	96555c9f4f	[HardwareLoops] Change order of SCEV expression construction for InitLoopCount. Putting the +1 before the zero-extend will allow scalar evolution to fold the expression in some cases such as the one shown in PowerPC's `shrink-wrap.ll` test. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D91724	2020-11-24 18:01:42 +00:00
diggerlin	d8d8dfe63b	[NFC][AIX][XCOFF] change function name from getNumofGPRsSaved to getNumOfGPRsSaved change function name from getNumofGPRsSaved to getNumOfGPRsSaved for class XCOFFTracebackTable Reviewers: Jason Liu Differential Revision: https://reviews.llvm.org/D91882	2020-11-24 10:23:57 -05:00
Max Kazantsev	d09dfce52e	Revert "[NFC][SCEV] Generalize monotonicity check for full and limited iteration space" This reverts commit 2734a9ebf4a31df0131acdfc739395a5e692c342. This patch appeared to not be a NFC. It introduced an execution path where monotonicity check on limited space started relying in existing nsw/nuw flags, which is illegal. The motivating test will follow-up.	2020-11-24 17:56:59 +07:00
Arthur Eubanks	f4ee2f482f	Reland [CGSCC] Detect devirtualization in more cases The devirtualization wrapper misses cases where if it wraps a pass manager, an individual pass may devirtualize an indirect call created by a previous pass. For example, inlining may create a new indirect call which is devirtualized by instcombine. Currently the devirtualization wrapper will not see that because it only checks cgscc edges at the very beginning and end of the pass (manager) it wraps. This fixes some tests testing this exact behavior in the legacy PM. Instead of checking WeakTrackingVHs for CallBases at the very beginning and end of the pass it wraps, check every time updateCGAndAnalysisManagerForPass() is called. check-llvm and check-clang with -abort-on-max-devirt-iterations-reached on by default doesn't show any failures outside of tests specifically testing it so it doesn't needlessly rerun passes more than necessary. (The NPM -O2/3 pipeline run the inliner/function simplification pipeline under a devirtualization repeater pass up to 4 times by default). http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks shows that 7zip has ~1% compile time regression. I looked at it and saw that there indeed was devirtualization happening that was not previously caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI. The initial land assumed CallBase WeakTrackingVHs would always be CallBases, but they can be RAUW'd with undef. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89587	2020-11-23 21:28:59 -08:00
Jameson Nash	1c1949a4cb	fix some Wundef warnings in public headers Differential Revision: https://reviews.llvm.org/D91094	2020-11-23 20:17:46 -05:00
Amy Huang	2504c5bf49	Revert "[llvm-symbolizer] Switch to using native symbolizer by default on Windows" Breaks some asan tests on the buildbot. This reverts commit c74b427cb2a90309ee0c29df21ad1ca26390263c.	2020-11-23 16:29:45 -08:00
Amy Huang	f6737ef448	[llvm-symbolizer] Switch to using native symbolizer by default on Windows llvm-symbolizer used to use the DIA SDK for symbolization on Windows; this patch switches to using native symbolization, which was implemented recently. Users can still make the symbolizer use DIA by adding the `-dia` flag in the LLVM_SYMBOLIZER_OPTS environment variable. Differential Revision: https://reviews.llvm.org/D91814	2020-11-23 15:57:08 -08:00
Arthur Eubanks	0cdba06f91	Revert "[CGSCC] Detect devirtualization in more cases" This reverts commit 14a68b4aa9732293ad7e16f105b0feb53dc8dbe2. Causes building self hosted clang to crash when using NPM.	2020-11-23 13:21:05 -08:00
Martin Storsjö	0f20d48a09	Reapply "[CodeGen] [WinException] Only produce handler data at the end of the function if needed" This reapplies 36c64af9d7f97414d48681b74352c9684077259b in updated form. Emit the xdata for each function at .seh_endproc. This keeps the exact same output header order for most code generated by the LLVM CodeGen layer. (Sections still change order for code built from assembly where functions lack an explicit .seh_handlerdata directive, and functions with chained unwind info.) The practical effect should be that assembly output lacks superfluous ".seh_handlerdata; .text" pairs at the end of functions that don't handle exceptions, which allows such functions to use the AArch64 packed unwind format again. Differential Revision: https://reviews.llvm.org/D87448	2020-11-23 23:17:03 +02:00
Haowei Wu	9267952bee	[llvm-elfabi] Emit ELF header and string table sections This change serves to create the initial framework for outputting ELF files from llvm-elfabi. Differential Revision: https://reviews.llvm.org/D61767	2020-11-23 12:18:58 -08:00
Haowei Wu	a30b09c09d	Revert "[llvm-elfabi] Emit ELF header and string table sections" This reverts commit 53c5fdd59a5cf7fbb4dcb7a7e84c9c4a40d32a84. Reason of revert: Some builders failed to build with ld.	2020-11-23 11:58:51 -08:00
Arthur Eubanks	1d64d9e07d	Port -print-memderefs to NPM There is lots of code duplication, but hopefully it won't matter soon. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D91683	2020-11-23 11:56:22 -08:00
Arthur Eubanks	9cf634ad08	[CGSCC] Detect devirtualization in more cases The devirtualization wrapper misses cases where if it wraps a pass manager, an individual pass may devirtualize an indirect call created by a previous pass. For example, inlining may create a new indirect call which is devirtualized by instcombine. Currently the devirtualization wrapper will not see that because it only checks cgscc edges at the very beginning and end of the pass (manager) it wraps. This fixes some tests testing this exact behavior in the legacy PM. Instead of checking WeakTrackingVHs for CallBases at the very beginning and end of the pass it wraps, check every time updateCGAndAnalysisManagerForPass() is called. check-llvm and check-clang with -abort-on-max-devirt-iterations-reached on by default doesn't show any failures outside of tests specifically testing it so it doesn't needlessly rerun passes more than necessary. (The NPM -O2/3 pipeline run the inliner/function simplification pipeline under a devirtualization repeater pass up to 4 times by default). http://llvm-compile-time-tracker.com/?config=O3&stat=instructions&remote=aeubanks shows that 7zip has ~1% compile time regression. I looked at it and saw that there indeed was devirtualization happening that was not previously caught, so now it reruns the CGSCC pipeline on some SCCs, which is WAI. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89587	2020-11-23 11:55:20 -08:00
Haowei Wu	c90bb24fa0	[llvm-elfabi] Emit ELF header and string table sections This change serves to create the initial framework for outputting ELF files from llvm-elfabi. Differential Revision: https://reviews.llvm.org/D61767	2020-11-23 11:31:57 -08:00
Craig Topper	f7ac298f12	[SelectionDAG][ARM][AArch64][Hexagon][RISCV][X86] Add SDNPCommutative to fma and fmad nodes in tablegen. Remove explicit commuted patterns from targets. X86 was already specially marking fma as commutable which allowed tablegen to autogenerate commuted patterns. This moves it to the target independent definition and fix up the targets to remove now unneeded patterns. Unfortunately, the tests change because the commuted version of the patterns are generating operands in a different than the explicit patterns. Differential Revision: https://reviews.llvm.org/D91842	2020-11-23 10:09:20 -08:00
Paul C. Anagnostopoulos	58226c6585	[TableGen] Eliminte source location from CodeInit Step 1 in eliminating the 'code' type. Differential Revision: https://reviews.llvm.org/D91932	2020-11-23 11:30:13 -05:00
Jay Foad	1ac4a188a1	Fix speling in comments. NFC.	2020-11-23 14:43:24 +00:00
Kerry McLaughlin	1a23665577	[APInt] Add the truncOrSelf resizing operator to APInt Truncates the APInt if the bit width is greater than the width specified, otherwise do nothing Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91445	2020-11-23 11:27:30 +00:00
Max Kazantsev	9b9c5b87c1	[NFC] Reduce code duplication in binop processing in computeExitLimitFromCondCached Handling of `and` and `or` vastly uses copy-paste. Factored out into a helper function as preparation step for further fix (see PR48225). Differential Revision: https://reviews.llvm.org/D91864 Reviewed By: nikic	2020-11-23 13:18:12 +07:00
Sanjay Patel	9de62dc3bf	[CostModel] add basic handling for FP maximum/minimum intrinsics This might be a regression for some ARM targets, but that should be changed in the target-specific overrides. There is apparently still no default lowering for these nodes, so I am assuming these intrinsics are not in common use. X86, PowerPC, and RISC-V for example, just crash given the most basic IR.	2020-11-22 13:43:53 -05:00
Simon Pilgrim	a6d420d6cf	[DAG] LowerMINMAX - move default expansion to generic TargetLowering::expandIntMINMAX This is part of the discussion on D91876 about trying to reduce custom lowering of MIN/MAX ops on older SSE targets - if we can improve generic vector expansion we should be able to relax the limitations in SelectionDAGBuilder when it will let MIN/MAX ops be generated, and avoid having to flag so many ops as 'custom'.	2020-11-22 13:02:27 +00:00
Nikita Popov	6db5c2de20	[BasicAA] Return DecomposedGEP (NFC) Instead of requiring the caller to initialize the DecomposedGEP structure and then passing it in by reference, make DecomposeGEPExpression() responsible for initializing and returning the structure.	2020-11-21 21:05:26 +01:00
Simon Pilgrim	818f8b6f08	MachineDominators.h - remove unused <vector> include	2020-11-21 17:11:26 +00:00
Simon Pilgrim	4199692948	DominanceFrontier - remove unused <vector> includes	2020-11-21 17:11:26 +00:00
sameeran joshi	2ce817a34c	[Flang][OpenMP][NFC][2/2] Reorder OmpStructureChecker and simplify it. `OmpStructureChecker` has too much boilerplate code in source file. This patch: 1. Use helpers from `check-directive-structure.h` and reduces the boilerplate. 2. Use TableGen infrastructure as much as possible. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D90834	2020-11-21 22:37:35 +05:30
sameeran joshi	739932e963	[flang][openmp] Fix bug in `OmpClause::Hint` clause which was missing to generate inside in OMP.cpp.inc file. Before this patch "Hint" isn't found inside the generated file. ./bin/llvm-tblgen --gen-directive-gen ../llvm-project/llvm/include/llvm/Frontend/OpenMP/OMP.td -I ../llvm-project/llvm/include/ > OMP.cpp.in Reviewed By: clementval Differential Revision: https://reviews.llvm.org/D91909	2020-11-21 19:02:34 +05:30
Matt Arsenault	12c5d5881d	OpaquePtr: Make byval/sret types mandatory	2020-11-20 21:23:33 -05:00
Richard Smith	9400251e2c	Demangling support for class type non-type template parameter extensions. The extensions in question are described in: https://github.com/itanium-cxx-abi/cxx-abi/issues/47 https://github.com/itanium-cxx-abi/cxx-abi/issues/63 Differential Revision: https://reviews.llvm.org/D90003	2020-11-20 13:45:08 -08:00
Hongtao Yu	a385d0e6a4	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Hongtao Yu	db4396f62a	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Sanjay Patel	0d725bd74e	[CostModel] mostly remove cost-kind predicate for intrinsics in basic TTI implementation This is re-applying a combination of f7eac51b9b3f and 8ec7ea3ddce7 as one patch to avoid regressions now that we have better testing in place. Those were reverted with 32dd5870ee31 because of crashing in experimental intrinsics. That bug should be fixed with 7ae346434. Paraphrased original commit messages: This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is a improvement. Exception: in general, we want the size cost for a scalar call to be cheap even if the other costs are expensive - we expect it to just be a branch with some optional stack manipulation. It is likely that we will want to carve out some exceptions/overrides to this rule as follow-up patches for calls that have some general and/or target-specific difference to the expected lowering. This was noticed as a regression in unrolling, so we have a test for that now along with a couple of direct cost model tests. If the assumed scalarization costs for the oversized vector calls are not realistic, that would be another follow-up refinement of the cost models. Differential Revision: https://reviews.llvm.org/D90554	2020-11-20 11:21:10 -05:00
Alex Richardson	9c96f39f77	Add a default address space for globals to DataLayout This is similar to the existing alloca and program address spaces (D37052) and should be used when creating/accessing global variables. We need this in our CHERI fork of LLVM to place all globals in address space 200. This ensures that values are accessed using CHERI load/store instructions instead of the normal MIPS/RISC-V ones. The problem this is trying to fix is that most of the time the type of globals is created using a simple PointerType::getUnqual() (or ::get() with the default address-space value of 0). This does not work for us and we get assertion/compilation/instruction selection failures whenever a new call is added that uses the default value of zero. In our fork we have removed the default parameter value of zero for most address space arguments and use DL.getProgramAddressSpace() or DL.getGlobalsAddressSpace() whenever possible. If this change is accepted, I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead of relying on the default value of 0 for PointerType::get(), etc. This patch and the follow-up changes will not have any functional changes for existing backends with the default globals address space of zero. A follow-up commit will change the default globals address space for AMDGPU to 1. Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D70947	2020-11-20 15:46:52 +00:00
Sanjay Patel	7e28c46bfe	[CostModel] avoid crashing while finding scalarization overhead The constrained intrinsics have metadata arguments, so the tests here were crashing as noted in D90554 (and that was reverted even though this bug exists independently of that change).	2020-11-20 10:18:29 -05:00
Jamie Schmeiser	71ead90dbb	[NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager. Summary: [NFC intended] Refactor the code for printChanged for reuse and to facilitate subsequent reporters of changes to the IR in the new pass manager. Create abstract template base classes for common functionality and give classes more appropriate names. The base classes handle all of the determination of when a function or pass is "interesting" and should be reported or filtered out. They have pure virtual functions which are called when a change by a pass has been recognized so the derived class need only provide the overrides to present the information about the changing IR. There are at least 2 more change reporters to come (which were presented in my tutorial at the 2020 llvm developer's meeting) that derive from these classes. Respond to review comments: move function out of line, remove inline keyword, remove unneeded qualifiers, simplify comparison. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: aeubanks (Arthur Eubanks), madhur13490 (Madhur Amilkanthwar) Differential Revision: https://reviews.llvm.org/D87000	2020-11-20 09:43:06 -05:00
Pavel Iliin	2529cb73ff	[AArch64] Out-of-line atomics (-moutline-atomics) implementation. This patch implements out of line atomics for LSE deployment mechanism. Details how it works can be found in llvm/docs/Atomics.rst Options -moutline-atomics and -mno-outline-atomics to enable and disable it were added to clang driver. This is clang and llvm part of out-of-line atomics interface, library part is already supported by libgcc. Compiler-rt support is provided in separate patch. Differential Revision: https://reviews.llvm.org/D91157	2020-11-20 13:30:12 +00:00
Georgii Rymar	49ab0f3272	[lib/Object] - Generalize the RelocationResolver API. This allows to reuse the RelocationResolver from the code that doesn't want to deal with `RelocationRef` class. I am going to use it in llvm-readobj. See the description of D91530 for more details. Differential revision: https://reviews.llvm.org/D91533	2020-11-20 10:32:49 +03:00
Eric Christopher	e7321a134d	Temporarily Revert "[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation" as it's causing crashes in the optimizer. A reduced testcase has been posted as a follow-up. This reverts commit f7eac51b9b3f780c96ca41913293851c5acb465b. Temporarily Revert "[CostModel] make default size cost for libcalls small (again)" as it depends upon the primary revert. This reverts commit 8ec7ea3ddce7379e13e8dfb4a5260a6d2004aa1c. Temporarily Revert "[CostModel] add tests for math library calls; NFC" as it depends upon the primary revert. This reverts commit df09f825995b10da03f148133c119f52c94fd6e4. Temporarily Revert "[LoopUnroll] add test for full unroll that is sensitive to cost-model; NFC" as it depends upon the primary revert. This reverts commit 618d555e8d926a83161774df2035519c387269db.	2020-11-19 22:10:23 -08:00
Duncan P. N. Exon Smith	7668c9150a	ADT: Split out isSafeToReferenceAfterResize helper to use early returns, NFC The assertion logic in SmallVector::assertSafeToReferenceAfterResize is hard to follow; split out SmallVector::isSafeToReferenceAfterResize and add early returns and comments. No functionality change here.	2020-11-19 17:55:04 -08:00
Arthur Eubanks	1e77e12c0e	Port -lower-matrix-intrinsics-minimal to NPM This reuses the existing lower-matrix-intrinsics pass rather than going the legacy pass route of creating a new pass. Use this new variant in the NPM -O0 pipeline. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91811	2020-11-19 17:42:48 -08:00
Duncan P. N. Exon Smith	82e253383a	ADT: Use early returns in SmallVector::resize, NFC Just a simple cleanup, no functionality change here.	2020-11-19 17:28:57 -08:00
Duncan P. N. Exon Smith	8605039d2b	ADT: Weaken SmallVector::resize assertion from 5abf76fbe37380874a88cc9aa02164800e4e10f3 There's no need to check for reference invalidation when `SmallVector::resize` is shrinking; the parameter isn't accessed. Differential Revision: https://reviews.llvm.org/D91832	2020-11-19 17:25:36 -08:00
Nikita Popov	53b556c27d	[MemLoc] Require LocationSize argument (NFC) When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.	2020-11-19 21:45:52 +01:00
Arthur Eubanks	9dd02d7898	[NPM] Move more O0 pass building into PassBuilder This moves handling of alwaysinline, coroutines, matrix lowering, PGO, and LTO-required passes into PassBuilder. Much of this is replicated between Clang and opt. Other out-of-tree users also replicate some of this, such as Rust [1] replicating the alwaysinline, LTO, and PGO passes. The LTO passes are also now run in build(Thin)LTOPreLinkDefaultPipeline() since they are semantically required for (Thin)LTO. [1]: `f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D91585	2020-11-19 11:22:23 -08:00
Leonard Chan	c24d9d2b01	[llvm][IR] Add dso_local_equivalent Constant The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit. In this patch: - Addition of `dso_local_equivalent` and handling it - Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details. Differential Revision: https://reviews.llvm.org/D77248	2020-11-19 10:26:17 -08:00
Adhemerval Zanella	3f1ee6c0a7	[AArch64] Lower fptrunc/fpext from/to FP128t to/from FP16 The compiler-rt part which adds the emitted symbols is handled in a subsequent patch. Differential Revision: https://reviews.llvm.org/D91731	2020-11-19 15:14:50 -03:00
Joseph Huber	c0b5fdf2f0	[OpenMP] Add Location Fields to Libomptarget Runtime for Debugging Summary: Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values. Reviewers: jdoerfert grokos Differential Revision: https://reviews.llvm.org/D87946	2020-11-19 12:01:53 -05:00

1 2 3 4 5 ...

43111 Commits