llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Kazushi (Jam) Marukawa	12d012b50c	[VE] Add vsum and vfsum intrinsic instructions Add vsum and vfsum intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92938	2020-12-10 01:11:53 +09:00
Paul C. Anagnostopoulos	02220af447	[TableGen] Cache the vectors of records returned by getAllDerivedDefinitions(). Differential Revision: https://reviews.llvm.org/D92674	2020-12-09 10:54:04 -05:00
Sanjay Patel	d6e6ff92ec	[VectorCombine] allow peeking through an extractelt when creating a vector load This is an enhancement to load vectorization that is motivated by a pattern in https://llvm.org/PR16739. Unfortunately, it's still not enough to make a difference there. We will have to handle multi-use cases in some better way to avoid creating multiple overlapping loads. Differential Revision: https://reviews.llvm.org/D92858	2020-12-09 10:36:14 -05:00
Roman Lebedev	ae9bbdd2bb	[InstCombine] canonicalizeSaturatedAdd(): last fold is only valid for strict comparison (PR48390) We could create uadd.sat under incorrect circumstances if a select with -1 as the false value was canonicalized by swapping the T/F values. Unlike the other transforms in the same function, it is not invariant to equality. Some alive proofs: https://alive2.llvm.org/ce/z/emmKKL Based on original patch by David Green! Fixes https://bugs.llvm.org/show_bug.cgi?id=48390 Differential Revision: https://reviews.llvm.org/D92717	2020-12-09 18:19:09 +03:00
Roman Lebedev	be06a88cc2	[NFC][InstCombine] Add test coverage for @llvm.uadd.sat canonicalization The non-strict variants are already handled because they are canonicalized to strict variants by swapping hands in both the select and icmp, and the fold simply considers that strictness is irrelevant here. But that isn't actually true for the last pattern, as PR48390 reports.	2020-12-09 18:19:08 +03:00
Kazushi (Jam) Marukawa	ccf91aa076	[VE] Add vfmk intrinsic instructions Add vfmk intrinsic instructions, a few pseudo instructions to expand vfmk intrinsic using VM512 correctly, and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92758	2020-12-10 00:08:20 +09:00
Simon Pilgrim	7df5c0cf1d	[X86] Fold CONCAT(VPERMV3(X,Y,M0),VPERMV3(Z,W,M1)) -> VPERMV3(CONCAT(X,Z),CONCAT(Y,W),CONCAT(M0,M1)) Further prep work toward supporting different subvector sizes in combineX86ShufflesRecursively	2020-12-09 14:29:32 +00:00
Anton Afanasyev	3423fa8d78	[SLP] Use the width of value truncated just before storing For stores chain vectorization we choose the size of vector elements to ensure we fit to minimum and maximum vector register size for the number of elements given. This patch corrects vector element size choosing the width of value truncated just before storing instead of the width of value stored. Fixes PR46983 Differential Revision: https://reviews.llvm.org/D92824	2020-12-09 16:38:45 +03:00
Djordje Todorovic	52a4141c7f	[Debuginfo] [CSInfo] Do not create CSInfo for undef arguments If a function parameter is marked as "undef", prevent creation of CallSiteInfo for that parameter. Without this patch, the parameter's call_site_value would be incorrect. The incorrect call_value case reported in PR39716, addressed in D85111. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D92471	2020-12-09 12:54:59 +01:00
Kerry McLaughlin	a778bf874d	[SVE][CodeGen] Add DAG combines for s/zext_masked_gather This patch adds the following DAGCombines, which apply if isVectorLoadExtDesirable() returns true: - fold (and (masked_gather x)) -> (zext_masked_gather x) - fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x) LowerMGATHER has also been updated to fetch the LoadExtType associated with the gather and also use this value to determine the correct masked gather opcode to use. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92230	2020-12-09 11:53:19 +00:00
Sander de Smalen	ca602d5346	[LoopVectorizer][SVE] Vectorize a simple loop with a scalable VF. * Steps are scaled by `vscale`, a runtime value. * Changes to circumvent the cost-model for now (temporary) so that the cost-model can be implemented separately. This can vectorize the following loop [1]: void loop(int N, double *a, double *b) { #pragma clang loop vectorize_width(4, scalable) for (int i = 0; i < N; i++) { a[i] = b[i] + 1.0; } } [1] This source-level example is based on the pragma proposed separately in D89031. This patch only implements the LLVM part. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91077	2020-12-09 11:25:21 +00:00
Sander de Smalen	05c67c93f8	[LoopVectorizer] NFC: Remove unnecessary asserts that VF cannot be scalable. This patch removes a number of asserts that VF is not scalable, even though the code where this assert lives does nothing that prevents VF being scalable. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91060	2020-12-09 11:25:21 +00:00
Kerry McLaughlin	d324dad642	[SVE][CodeGen] Add the ExtensionType flag to MGATHER Adds the ExtensionType flag, which reflects the LoadExtType of a MaskedGatherSDNode. Also updated SelectionDAGDumper::print_details so that details of the gather load (is signed, is scaled & extension type) are printed. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91084	2020-12-09 11:19:08 +00:00
Joe Ellis	eb525ef991	[SelectionDAG] Add llvm.vector.{extract,insert} intrinsics This commit adds two new intrinsics. - llvm.experimental.vector.insert: used to insert a vector into another vector starting at a given index. - llvm.experimental.vector.extract: used to extract a subvector from a larger vector starting from a given index. The codegen work for these intrinsics has already been completed; this commit is simply exposing the existing ISD nodes to LLVM IR. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D91362	2020-12-09 11:08:41 +00:00
Cullen Rhodes	d85b4494d3	[IR] Support scalable vectors in CastInst::CreatePointerCast Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92482	2020-12-09 10:39:36 +00:00
Simon Moll	358f33d2af	[VP] Build VP SDNodes Translate VP intrinsics to VP_* SDNodes. The tests check whether a matching vp_* SDNode is emitted. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91441	2020-12-09 11:36:51 +01:00
Alex Zinenko	f6c53da76a	[OpenMPIRBuilder] Put the barrier in the exit block in createWorkshapeLoop The original code was inserting the barrier at the location given by the caller. Make sure it is always inserted at the end of the loop exit block instead. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D92849	2020-12-09 11:33:04 +01:00
Tim Northover	47fa1c8edc	AArch64: use correct operand for ubsantrap immediate. I accidentally pushed the wrong patch originally.	2020-12-09 10:17:16 +00:00
Roman Lebedev	65652d512a	[NFC][Instructions] Refactor CmpInst::getFlippedStrictnessPredicate() in terms of is{,Non}StrictPredicate()/get{Non,}StrictPredicate() In particular, this creates getStrictPredicate() method, to be symmetrical with already-existing getNonStrictPredicate().	2020-12-09 12:43:08 +03:00
Fraser Cormack	1174544011	[RISCV] Fix missing def operand when creating VSETVLI pseudos The register operand was not being marked as a def when it should be. No tests for this in the main branch as there are not yet any pseudos without a non-negative VLIndex. Also change the type of a virtual register operand from unsigned to Register and adjust formatting. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92823	2020-12-09 09:35:28 +00:00
Georgii Rymar	c56960224e	[llvm-readelf/obj] - Improve diagnostics when printing NT_FILE notes. This changes the `printNotesHelper` to report warnings on its side when there are errors when dumping notes. With that we can provide more content when reporting warnings about broken notes. Differential revision: https://reviews.llvm.org/D92636	2020-12-09 12:31:46 +03:00
Georgii Rymar	fa15fb5f0a	[obj2yaml] - Support dumping objects that have multiple SHT_SYMTAB_SHNDX sections. It is allowed to have multiple `SHT_SYMTAB_SHNDX` sections, though we currently don't implement it. The current implementation assumes that there is a maximum of one SHT_SYMTAB_SHNDX section and that it is always linked with the .symtab section. This patch drops these limitations. Differential revision: https://reviews.llvm.org/D92644	2020-12-09 12:14:58 +03:00
Siddhesh Poyarekar	70c1820d15	Fix typo in llvm/lib/Target/README.txt Trivial typo, replace __builtin_objectsize with __builtin_object_size. Differential Revision: https://reviews.llvm.org/D92914	2020-12-09 10:12:26 +01:00
David Green	2573f332ed	[ARM] Common inverse constant predicates to VPNOT This scans through blocks looking for constants used as predicates in MVE instructions. When two constants are found which are the inverse of one another, the second can be replaced by a VPNOT of the first, potentially allowing that not to be folded away into an else predicate of a vpt block. Differential Revision: https://reviews.llvm.org/D92470	2020-12-09 07:56:44 +00:00
David Green	30000e84d9	[ARM] Constant Mask VPT block tests. NFC	2020-12-09 07:44:49 +00:00
Craig Topper	27e1f6d8c2	[RISCV] Use SDLoc created early in RISCVDAGToDAGISel::Select instead of recreating it in multiple cases in the switch. NFC	2020-12-08 21:13:25 -08:00
Craig Topper	dfde01efb0	[RISCV] Add a table showing the layout of the fields in VTYPE. Rename MaskedOffAgnostic->MaskAgnostic. NFC	2020-12-08 20:41:57 -08:00
Jinsong Ji	e974df0b6f	[PowerPC] Set SubRegIndex offset for sub_vsx1/sub_pair1 We defined SubRegIndex for 256/512 regs, but we did not set the offset for the higher part, so the offsets of the lower and higher parts are the same. This may cause problems in assessing ranges of SubReg. It is great that this hasn't affected any testcases, but I think we should fix it to avoid hidden bugs in the future. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D92864	2020-12-08 22:56:44 -05:00
Jinsong Ji	538edff0b8	[PowerPC] Precommit testcases for regpressure compute fix	2020-12-09 03:37:00 +00:00
Kazu Hirata	5a3bc245ea	[MemorySSA] Remove unused declaration determineInsertionPoint (NFC) The declaration was introduced without a corresponding definition on Feb 2, 2016 in commit e1100f533f0a48f55e80e1152b06f5deab5f9b30.	2020-12-08 19:21:44 -08:00
Kazu Hirata	a7ca5fc6a4	[IR] Use llvm::is_contained (NFC)	2020-12-08 19:06:37 -08:00
Sam Clegg	e7c9a10ed2	[WebAssembly] Fix code generated for atomic operations in PIC mode The main thing this change does is add the `IsNotPIC` predicate to all the atomic instruction patterns that directly refer to `tglobaladdr`. This is because in PIC mode we need to generate a separate instruction sequence (either a direct global.get, or __memory_base + offset) for accessing global addresses. As part of this change I noticed that many of the `Requires` attributes added to the instructions in `WebAssemblyInstrAtomics.td` were not being honored. This is because they are wrapped in a `let Predicates = [HasAtomics]` block and it seems that that outer wrapping overrides any `Requires` on defs within it. As a workaround I removed the outer `let` and added `HasAtomics` to all the inner `Requires`. I believe that all the instructions that don't have an explicit `Requires` bottom out in `ATOMIC_I` and `ATOMIC_NRI`, which have `HasAtomics`, so this should not remove this predicate from any patterns (at least that is the idea). The alternative to this approach looks like implementing something like `PredicateControl` in `Mips.td` where we can split the predicates into groups so they don't clobber each other. Differential Revision: https://reviews.llvm.org/D92744	2020-12-08 18:41:32 -08:00
Dávid Bolvanský	8983260832	[NFC] Added test for PR33549	2020-12-09 03:21:52 +01:00
Chen Zheng	d8e252f75a	[PowerPC] prepare more dq form for P10 pair load/store Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92393	2020-12-08 21:01:40 -05:00
Duncan P. N. Exon Smith	8f8ff1a394	Support: Add RedirectingFileSystem::create from simple list of redirections Add an overload of `RedirectingFileSystem::create` that builds a redirecting filesystem off of a simple vector of string pairs. This is intended to be used to support `clang::arcmt::FileRemapper` and `clang::PreprocessorOptions::RemappedFiles`. Differential Revision: https://reviews.llvm.org/D91317	2020-12-08 17:53:30 -08:00
Duncan P. N. Exon Smith	cb4f6a1d60	VFS: Return new file systems as uniquely owned when possible, almost NFC Uniformly return uniquely-owned filesystems from VFS creation APIs. The one exception is `getRealFileSystem`, which has a single instance and needs to be shared. This is almost NFC, except that it fixes a memory leak in `vfs::collectVFSFromYAML()`. Depends on https://reviews.llvm.org/D92888 Differential Revision: https://reviews.llvm.org/D92890	2020-12-08 17:33:46 -08:00
Duncan P. N. Exon Smith	7308ff63ab	ADT: Allow IntrusiveRefCntPtr construction from std::unique_ptr, NFC Allow a `std::unique_ptr` to be moved into an `IntrusiveRefCntPtr`, and remove a couple of now-unnecessary `release()` calls. Differential Revision: https://reviews.llvm.org/D92888	2020-12-08 17:33:19 -08:00
Wei Mi	2b916be8fe	[SampleFDO] Store fixed length MD5 in NameTable instead of using ULEB128 if MD5 is used. Currently during sample profile loading, NameTable has to be loaded entirely up front before any name string is retrieved. That is because NameTable is stored using ULEB128 encoding and cannot be directly accessed like an array. However, if MD5 is used to represent names in the NameTable, each entry has a fixed length. If MD5 names are stored as uint64_t instead of ULEB128, NameTable can be accessed like an array, so in many cases only part of the NameTable has to be read. This is helpful for reducing compile time, especially when a small source file is compiled. We find that after this change, the elapsed time to build a large application distributively is reduced by 5% and the cumulative CPU time used for building is also reduced by 5%. The size of the profile is slightly reduced with this change by ~0.2%, which also indicates that encoding MD5 in ULEB128 doesn't save storage space. Differential Revision: https://reviews.llvm.org/D92621	2020-12-08 16:21:01 -08:00
Craig Topper	bf52a3f18f	[RISCV] Share VTYPE encoding code between the assembler and the CustomInserter for adding VSETVLI before vector instructions This merges the SEW and LMUL enums that each used into singles enums in RISCVBaseInfo.h. The patch also adds a new encoding helper to take SEW, LMUL, tail agnostic, mask agnostic and turn it into a vtype immediate. I also stopped storing the Encoding in the VTYPE operand in the assembler. It is easy to calculate when adding the operand which should only happen once per instruction. Differential Revision: https://reviews.llvm.org/D92813	2020-12-08 16:04:20 -08:00
Ilya Leoshkevich	942af666c6	Prevent FENTRY_CALL reordering FEntryInserter prepends FENTRY_CALL to the first basic block. In case there are other instructions, PostRA Machine Instruction Scheduler can move FENTRY_CALL call around. This actually occurs on SystemZ (see the testcase). This is bad for the following reasons: * FENTRY_CALL clobbers registers. * Linux Kernel depends on whatever FENTRY_CALL expands to to be the very first instruction in the function. Fix by adding isCall attribute to FENTRY_CALL, which prevents reordering by making it a scheduling boundary for PostRA Machine Instruction Scheduler. Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D91218	2020-12-09 00:59:01 +01:00
Philip Reames	88c99ea84d	[indvars] Common a bit of code [NFC]	2020-12-08 15:25:48 -08:00
Duncan P. N. Exon Smith	0282c36f00	ADT: Add hash_value overload for Optional Add a `hash_value` for Optional so that other data structures with optional fields can easily hash them. I have a use for this in an upcoming patch. Differential Revision: https://reviews.llvm.org/D92676	2020-12-08 15:25:03 -08:00
Duncan P. N. Exon Smith	1b58820fff	ADT: Remove the unused explicit `OptionalTest` fixture, NFC `OptionalTest` was empty; drop it and switch all the tests to use the shorter `TEST` instead of `TEST_F`. Differential Revision: https://reviews.llvm.org/D92675	2020-12-08 15:25:03 -08:00
Arthur Eubanks	1309eee42f	[gold][NPM] Use NPM with ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D92869	2020-12-08 15:13:34 -08:00
Jessica Paquette	c6dc17550a	[AArch64][GlobalISel] Swap select operands when inverting condition code This was not obvious when reading the imported tablegen patterns in AArch64GenDAGISel. Update select-select.mir.	2020-12-08 14:17:26 -08:00
Anna Thomas	ea0d3b5f95	[ScalarizeMaskedMemIntrin] Add new PM support This patch adds new PM support for the pass, and the pass can now be used during middle-end transforms. The old pass is renamed to ScalarizeMaskedMemIntrinLegacyPass. Reviewed-By: skatkov, aeubanks Differential Revision: https://reviews.llvm.org/D92743	2020-12-08 17:15:22 -05:00
Arthur Eubanks	0c5928d1bc	Pin -loop-reduce to legacy PM LSR currently only runs in the codegen pass manager. There are a couple issues with LSR and the NPM. 1) Lots of tests assume that LCSSA isn't run before LSR. This breaks a bunch of tests' expected output. This is fixable with some time put in. 2) LSR doesn't preserve LCSSA. See llvm/test/Analysis/MemorySSA/update-remove-deadblocks.ll. LSR's use of SCEVExpander is the only use of SCEVExpander where the PreserveLCSSA option is off. Turning it on causes some code sinking out of loops to fail due to SCEVExpander's inability to handle the newly created trivial PHI nodes in the broken critical edge (I was looking at llvm/test/Transforms/LoopStrengthReduce/X86/2011-11-29-postincphi.ll). I also tried simply just calling formLCSSA() at the end of LSR, but the extra PHI nodes cause regressions in codegen tests. We'll delay figuring these issues out until later. This causes the number of check-llvm failures with -enable-new-pm true by default to go from 60 to 29. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D92796	2020-12-08 13:48:23 -08:00
Jessica Paquette	398ebddfba	[AArch64][GlobalISel] Check if G_SELECT has been optimized when folding binops `TryFoldBinOpIntoSelect` didn't have a check for `Optimized`, meaning you could end up folding twice. (e.g. a select with a G_ADD on the true side, and a G_SUB on the false side) Add in the missing `if` and a test.	2020-12-08 13:47:08 -08:00
Arthur Eubanks	ee18a1da49	[NFC] Rename IsCodeGenPass to ShouldPinPassToLegacyPM Codegen-specific passes are being ported to the NPM. Rename for better clarity and note that ported passes that fully work with the NPM should be removed from these lists. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D92818	2020-12-08 13:38:56 -08:00
Kazushi (Jam) Marukawa	c3b7c2e861	[VE] Correct LVLGen (LVL instruction insert pass) SX Aurora VE uses an intermediate representation similar to VP as its MIR. VE itself uses an individual VL register as its own vector length register at the hardware level. So, LLVM needs to insert a load VL (LVL) instruction just before vector instructions if the value of VL is changed. This LVLGen pass generates LVL instructions for that purpose. A bug was previously pointed out in D91416. This patch corrects that bug and adds a regression test. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92716	2020-12-09 06:33:53 +09:00
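
Note on commit 2b916be8fe ([SampleFDO] fixed length MD5 in NameTable): the message argues that a ULEB128-encoded table cannot be indexed like an array, while a table of fixed-width 64-bit values can. The sketch below illustrates only that general point. It is not LLVM's profile reader; the helper names (`encode_uleb128`, `nth_uleb128`, `nth_fixed64`), the sample hash values, and the little-endian layout are all assumptions made for illustration.

```python
import struct


def encode_uleb128(value):
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(buf, pos):
    """Decode one ULEB128 value at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def nth_uleb128(buf, index):
    """Entries have variable width, so every earlier entry must be decoded."""
    pos = 0
    for _ in range(index):
        _, pos = decode_uleb128(buf, pos)
    value, _ = decode_uleb128(buf, pos)
    return value


def nth_fixed64(buf, index):
    """Entries are 8 bytes each, so the offset is computed directly."""
    return struct.unpack_from("<Q", buf, index * 8)[0]


# Hypothetical 64-bit hash values standing in for MD5-derived names.
hashes = [0x0123456789ABCDEF, 0xFFFFFFFFFFFFFFFF, 0x7F, 0x8000000000000000]
uleb_table = b"".join(encode_uleb128(h) for h in hashes)
fixed_table = b"".join(struct.pack("<Q", h) for h in hashes)
```

With the fixed-width layout, `nth_fixed64(fixed_table, 3)` reads only the eight bytes of the requested entry, whereas `nth_uleb128(uleb_table, 3)` has to walk past the three entries before it.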

208142 Commits