llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
David Green	0d22518864	[ARM] Extra tests for MVE vhadd and vmulh. NFC	2021-05-20 14:13:39 +01:00
David Sherwood	f86b85d85e	[CodeGen] Add support for widening the result of EXTRACT_SUBVECTOR When trying to return a type such as <vscale x 1 x i32> from a function we crash in DAGTypeLegalizer::WidenVecRes_EXTRACT_SUBVECTOR when attempting to get the fixed number of elements in the vector. For the simple case we are dealing with, i.e. extracting <vscale x 1 x i32> from index 0 of input vector <vscale x 4 x i32> we can simply rely upon existing code that just returns the input. Differential Revision: https://reviews.llvm.org/D102605	2021-05-20 12:27:08 +01:00
Simon Pilgrim	145caddc0a	[CostModel][X86][AVX2] Improve 256-bit vector non-uniform shifts costs Haswell, Excavator and early Ryzen all have slower 256-bit non-uniform vector shifts (confirmed on AMDSoG/Agner/instlatx64 and llvm models) - so bump the worst case costs accordingly. Noticed while investigating PR50364	2021-05-20 12:16:16 +01:00
David Truby	80b1235577	[llvm][sve] Lowering for VLS MLOAD/MSTORE This adds custom lowering for the MLOAD and MSTORE ISD nodes when passed fixed length vectors in SVE. This is done by converting the vectors to VLA vectors and using the VLA code generation. Fixed length extending loads and truncating stores currently produce correct code, but do not use the built in extend/truncate in the load and store instructions. This will be fixed in a future patch. Differential Revision: https://reviews.llvm.org/D101834	2021-05-20 10:50:59 +00:00
Roman Lebedev	07d20f9eb2	[NFC][Coroutines] Autogenerate a few tests for ease of further updates	2021-05-20 13:37:44 +03:00
Sergey Dmitriev	7095312ff1	[llvm-strip] Add support for '--' for delimiting options from input files This allows llvm-strip to be used with file names that begin with dashes. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102825	2021-05-20 03:33:51 -07:00
David Green	eb198d6426	[AArch64] Add extra codegen tests. NFC This adds some extra codegen tests for abs and hadd, regenerating some of the existing tests with updated check lines.	2021-05-20 11:32:51 +01:00
LLVM GN Syncbot	75ce07ec7f	[gn build] Port 081c62501e4f	2021-05-20 10:17:56 +00:00
Alexey Lapshin	2bc33a6598	[llvm-objcopy] Refactor CopyConfig structure. This patch prepares llvm-objcopy to move its implementation into a separate library. To make it possible it is necessary to minimize internal dependencies. Differential Revision: https://reviews.llvm.org/D99055	2021-05-20 13:14:51 +03:00
Roman Lebedev	029935da14	[NFC][CHR] Autogenerate checklines in a few tests for ease of updates	2021-05-20 13:12:45 +03:00
Roman Lebedev	3605ab1b0c	[NFC][PruneEH] Autogenerate checklines in a few tests for ease of updates	2021-05-20 13:12:45 +03:00
Roman Lebedev	e8395509b1	[NFC][SimplifyCFG] Autogenerate checklines in a few tests for ease of updates	2021-05-20 13:12:44 +03:00
Simon Pilgrim	0e2845d424	[X86][AVX] Don't scrub pointer math in avx-vperm2x128.ll This will make it easier to track address offsets in folded loads/broadcasts of subvectors	2021-05-20 10:53:20 +01:00
Luke	7de9fb29f4	[RISCV] Add legality check for vectorizing reduction Check if it is legal to vectorize reduction. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99509	2021-05-20 17:45:46 +08:00
David Sherwood	3bda651940	[CodeGen] Add support for widening INSERT_SUBVECTOR operands When attempting to return something like a <vscale x 1 x i32> type from a function we end up trying to widen the vector by inserting a <vscale x 1 x i32> subvector into an undefined <vscale x 4 x i32> vector. However, during legalisation we then attempt to widen the INSERT_SUBVECTOR operands and hit an error in WidenVectorOperand. This patch adds a new WidenVecOp_INSERT_SUBVECTOR function that currently only supports inserting subvectors into undefined vectors. Differential Revision: https://reviews.llvm.org/D102501	2021-05-20 10:37:03 +01:00
Heejin Ahn	05d69ddc38	[WebAssembly] Ignore filters in Emscripten EH landingpads We have been handling filters and landingpads incorrectly all along. We pass clauses' (catches') types to `__cxa_find_matching_catch` in JS glue code, which returns the thrown pointer and sets the selector using `setTempRet0()`. We apparently have been doing the same for filters' (exception specs') types; we pass them to `__cxa_find_matching_catch` just the same way as clauses. And `__cxa_find_matching_catch` treats all given types as clauses. So it is a little surprising; maybe we intended to do something from the JS side and didn't end up doing it? So anyway, I don't think supporting exception specs in Emscripten EH is a priority, but this can actually cause incorrect results for normal catches when functions are inlined and the inlined spec type has a parent-child relationship with the catch's type. --- The below is an example of a bug that can happen when inlining and class hierarchy are mixed. If you are busy you can skip this part: ``` struct A {}; struct B : A {}; void bar() throw (B) { throw B(); } void foo() { try { bar(); } catch (A &) { fputs ("Expected result\n", stdout); } } ``` In the unoptimized code, `bar`'s landingpad will have a filter for `B` and `foo`'s landingpad will have a clause for `A`. But when `bar` is inlined into `foo`, `foo`'s landingpad has both a filter for `B` and a clause for `A`, and it passes both types to `__cxa_find_matching_catch`: ``` __cxa_find_matching_catch(typeinfo for B, typeinfo for A) ``` `__cxa_find_matching_catch` thinks both are clauses, and looks at the first type `B`, which belongs to a filter. And the thrown type is `B`, so it thinks the first type `B` is caught. But this makes it return an incorrect selector, because it is supposed to catch the exception using the second type `A`, which is a parent of `B`. As a result, the `foo` in the example program above does not print "Expected result" but just throws the exception to the caller. 
(This wouldn't have happened if `A` and `B` are completely disjoint types, such as `float` and `int`) Fixes https://bugs.llvm.org/show_bug.cgi?id=50357. Reviewed By: dschuff, kripken Differential Revision: https://reviews.llvm.org/D102795	2021-05-20 01:28:16 -07:00
Caroline Concatto	85e935efe6	[CostModel][AArch64] Add missing costs for getShuffleCost with scalable vectors Differential Revision: https://reviews.llvm.org/D102490	2021-05-20 09:08:31 +01:00
serge-sans-paille	26806aa0f7	Force visibility of llvm::Any to external llvm::Any::TypeId::Id relies on the uniqueness of the address of a static variable defined in a template function. hidden visibility implies vague linkage for that variable, which does not guarantee the uniqueness of the address across a binary and a shared library. This totally breaks the implementation of llvm::Any. Ideally, setting visibility to llvm::Any::TypeId::Id should be enough, unfortunately this doesn't work as expected and we lack time (before 12.0.1 release) to understand why setting the visibility to llvm::Any does work. See https://gcc.gnu.org/wiki/Visibility and https://gcc.gnu.org/onlinedocs/gcc/Vague-Linkage.html for more information on that topic. Differential Revision: https://reviews.llvm.org/D101972	2021-05-20 10:06:00 +02:00
Andrew Savonichev	607f7c19c2	[AArch64] Combine vector shift instructions in SelectionDAG bswap.v2i16 + sitofp in LLVM IR generate a sequence of: - REV32 + USHR for bswap.v2i16 - SHL + SSHR + SCVTF for sext to v2i32 and scvt The shift instructions are excessive as noted in PR24820, and they can be optimized to just SSHR. Differential Revision: https://reviews.llvm.org/D102333	2021-05-20 10:50:13 +03:00
Amara Emerson	e9e3784d95	[GlobalISel] Fix div+rem -> divrem combine causing use-def violation.	2021-05-19 23:13:41 -07:00
Simon Giesecke	aa649d36a0	Add option to llvm-gsymutil to read addresses from stdin. Differential Revision: https://reviews.llvm.org/D102224	2021-05-20 06:10:35 +00:00
Xiang1 Zhang	652257ec88	Revert "[HWASAN] Update the tag info for X86_64." This reverts commit 81c18ce03cd8199cc4f2c817e31b42a191a0fe7d.	2021-05-20 13:12:59 +08:00
Xiang1 Zhang	de91b3f2fe	[HWASAN] Update the tag info for X86_64. In LAM model X86_64 will use bits 57-62 (of 0-63) as HWASAN tag. So here we make sure the tag shift position and tag mask is correct for x86-64. Differential Revision: https://reviews.llvm.org/D102472	2021-05-20 11:22:12 +08:00
Sergey Dmitriev	5281788ba1	[llvm-objcopy] Update LIT test to resolve bot failure [NFC] Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D102823	2021-05-19 19:56:35 -07:00
Zhiwei Chen	8ecb4a2780	[sanitizer] Reduce redzone size for small size global objects Currently a 1-byte global object has a ridiculous 63-byte redzone. This patch reduces the redzone size to less than 32 bytes if the size of the global object is less than or equal to half of 32 (the minimal size of a redzone). A 12-byte object has a 20-byte redzone, a 20-byte object has a 44-byte redzone. Reviewed By: MaskRay, #sanitizers, vitalybuka Differential Revision: https://reviews.llvm.org/D102469	2021-05-19 19:18:50 -07:00
Jon Roelofs	94fbaca3d2	Fix warnings in windows bots. NFC	2021-05-19 17:42:34 -07:00
LLVM GN Syncbot	077e051b22	[gn build] Port 4bf69fb52b3c	2021-05-19 22:27:27 +00:00
Ahmed Bougacha	50227f56b6	[docs] Describe reporting security issues on the chromium tracker. To track security issues, we're starting with the chromium bug tracker (using the llvm project there). We considered using Github Security Advisories. However, they are currently intended as a way for project owners to publicize their security advisories, and aren't well-suited to reporting issues. This also moves the issue-reporting paragraph to the beginning of the document, in part to make it more discoverable, in part to allow the anchor-linking to actually display the paragraph at the top of the page. Note that this doesn't update the concrete list of security-sensitive areas, which is still an open item. When we do, we may want to move the list of security-sensitive areas next to the issue-reporting paragraph as well, as it seems like relevant information needed in the reporting process. Finally, when describing the discussion medium, this splits the topics discussed into two: the concrete security issues, discussed in the issue tracker, and the logistics of the group, in our mailing list, as patches on public lists, and in the monthly sync-up call. While there, add a SECURITY.md page linking to the relevant paragraph. Differential Revision: https://reviews.llvm.org/D100873	2021-05-19 15:21:50 -07:00
Jon Roelofs	ad30e385ed	[Remarks] Add analysis remarks for memset/memcpy/memmove lengths Differential revision: https://reviews.llvm.org/D102452	2021-05-19 15:09:18 -07:00
Ryan Prichard	dc2b1a16fe	[MC][ARM] Reject Thumb "ror rX, #0" The ROR instruction can only handle immediates between 1 and 31. The would-be encoding for ROR #0 is actually the RRX instruction. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D102455	2021-05-19 15:05:39 -07:00
Petr Hosek	2054565e98	[CMake] Don't LTO optimize targets that aren't part of any distribution When using distributions, targets that aren't included in any distribution don't need to be as optimized as targets that are included since those targets are typically only used for tests. We might consider avoiding LTO for these targets altogether, see https://lists.llvm.org/pipermail/llvm-dev/2021-April/149843.html Differential Revision: https://reviews.llvm.org/D102732	2021-05-19 15:02:11 -07:00
wlei	aa88690f46	[CSSPGO] Avoid deleting probe instruction in FoldValueComparisonIntoPredecessors This change tries to fix a place missing `moveAndDanglePseudoProbes`. FoldValueComparisonIntoPredecessors folds the BB into its predecessors and then marks the BB unreachable. However, the original logic from the BB is still alive, so deleting the probe would mislead the SampleLoader into marking it as a zero-count sample. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D102721	2021-05-19 13:39:05 -07:00
Lang Hames	0e09eb0cf2	[ORC] Add a CPU getter to JITTargetMachineBuilder.	2021-05-19 13:31:25 -07:00
Arthur Eubanks	63f5e603f7	[OpaquePtr] Make atomicrmw work with opaque pointers FullTy is only necessary when we need to figure out what type an instruction works with given a pointer's pointee type. However, we just end up using the value operand's type, so FullTy isn't necessary. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102788	2021-05-19 12:49:28 -07:00
Arthur Eubanks	208107dd2c	[OpaquePtr] Make cmpxchg work with opaque pointers Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102745	2021-05-19 12:44:10 -07:00
Arthur Eubanks	764e5745a3	[OpaquePtr] Make GEPs work with opaque pointers No verifier changes needed, the verifier currently doesn't check that the pointer operand's pointee type matches the GEP type. There is a similar check in GetElementPtrInst::Create() though. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102744	2021-05-19 12:39:37 -07:00
Joseph Huber	a0d824fa55	[Diagnostics] Allow emitting analysis and missed remarks on functions Summary: Currently, only `OptimizationRemarks` can be emitted using a Function. Add constructors to allow this for `OptimizationRemarksAnalysis` and `OptimizationRemarkMissed` as well. Reviewed By: jdoerfert thegameg Differential Revision: https://reviews.llvm.org/D102784	2021-05-19 15:10:20 -04:00
Sanjay Patel	f66389763f	[x86] add tests for fma folds with fast-math-flags; NFC Part of prep work for D90901	2021-05-19 14:28:57 -04:00
Sanjay Patel	f47bf23962	[x86] propagate FMF from x86-specific intrinsic nodes to others during combining This is another FMF gap exposed by D90901, but I don't see a way to show the difference in a regression test as with: f66ba4c 6025663 We will see an asm difference if we add a test as part of D90901.	2021-05-19 14:25:09 -04:00
Andrea Di Biagio	af6ea1e0bd	[MCA] Unbreak the buildbots by passing flag -mcpu=generic to the new test added by commit e5d59db469. This should unbreak buildbot clang-ppc64le-linux-lnt.	2021-05-19 19:12:33 +01:00
Sanjay Patel	ce4fb4a2a8	[x86] update fma test with deprecated intrinsics; NFC Similar to 8854b27 - All of the CHECK lines should be identical to before, but without any of the x86-specific calls that were replaced with generic FMA long ago. The file still has value because it shows a miscompile as demonstrated in D90901, but we probably need to add tests with FMF to make that explicit without losing coverage.	2021-05-19 13:52:08 -04:00
Pirama Arumuga Nainar	7b5dc69f32	[CoverageMapping] Handle gaps in counter IDs for source-based coverage For source-based coverage, the frontend sets the counter IDs and the constraints on counter IDs are not defined. For example, the Rust frontend until recently had a reserved counter #0 (https://github.com/rust-lang/rust/pull/83774). Rust coverage instrumentation also creates counters on edges in addition to basic blocks. Some functions may have more counters than regions. This breaks an assumption in CoverageMapping.cpp where the number of counters in a function is assumed to be bounded by the number of regions: Counts.assign(Record.MappingRegions.size(), 0); This assumption causes CounterMappingContext::evaluate() to fail since there are not enough counter values created in the above call to `Counts.assign`. Consequently, some uncovered functions are not reported in coverage reports. This change walks a Function's CoverageMappingRecord to find the maximum counter ID, and uses it to initialize the counter array when instrprof records are missing for a function in sparse profiles. Differential Revision: https://reviews.llvm.org/D101780	2021-05-19 10:46:38 -07:00
Roman Lebedev	22691d2522	[NFCI][Local] TryToSimplifyUncondBranchFromEmptyBlock(): use DeleteDeadBlocks()	2021-05-19 20:38:30 +03:00
Roman Lebedev	9892711a85	[NFCI][Local] MergeBlockIntoPredecessor(): use DeleteDeadBlocks()	2021-05-19 20:38:30 +03:00
Roman Lebedev	8b0f054cf7	[NFCI][Local] removeUnreachableBlocks(): use DeleteDeadBlocks()	2021-05-19 20:38:30 +03:00
Patrick Holland	81da9f4819	[MCA] llvm-mca MCTargetStreamer segfault fix In order to create the code regions for llvm-mca to analyze, llvm-mca creates an AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions(). Within this function, both an MCAsmParser and MCTargetAsmParser are created so that MCAsmParser::Run() can be used to create the code regions for us. These parser classes were created for llvm-mc so they are designed to emit code with an MCStreamer and MCTargetStreamer that are expected to be setup and passed into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any code, an MCStreamerWrapper class gets created instead and passed into the MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides many of the emit methods to just do nothing. The exception is the emitInstruction() method which calls Regions.addInstruction(Inst). This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build our code regions, however there are a few directives which rely on the MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of those directives, a segfault will occur. In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca: .cv_fpo_proc .cv_fpo_setframe .cv_fpo_pushreg .cv_fpo_stackalloc .cv_fpo_stackalign .cv_fpo_endprologue .cv_fpo_endproc I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault. My proposed solution is to simply initialize an MCTargetStreamer after we initialize the MCStreamerWrapper. 
The MCTargetStreamer requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the nulls() function. Since this needs to happen after the MCStreamerWrapper has been initialized, it needs to happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize within the main() function of llvm-mca. So this MCInstPrinter gets constructed within main() then passed into the parseCodeRegions() function as a parameter. (If you feel like it would be appropriate and possible to create the MCInstPrinter within the parseCodeRegions() function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.) My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it). As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools. Differential Revision: https://reviews.llvm.org/D102709	2021-05-19 18:36:10 +01:00
Philip Reames	a4f9bca98e	Do actual DCE in LoopUnroll (try 4) Turns out simplifyLoopIVs sometimes returns a non-dead instruction in its DeadInsts out param. I had done a bit of NFC cleanup which was only NFC if simplifyLoopIVs obeyed its documentation. I'm simply dropping that part of the change. Commit message from try 3: Recommitting after fixing a bug found post commit. Amusingly, try 1 had been correct, and by reverting to incorporate last minute review feedback, I introduced the bug. Oops. :) Original commit message: The problem was that recursively deleting an instruction can delete instructions beyond the current iterator (via a dead phi), thus invalidating iteration. Test case added in LoopUnroll/dce.ll to cover this case. LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions. Differential Revision: https://reviews.llvm.org/D102511	2021-05-19 10:25:31 -07:00
Sanjay Patel	faf74c5c9d	[x86] propagate FMF from x86-specific intrinsic nodes to others during lowering This is another fast-math-flags failure exposed by D90901.	2021-05-19 13:11:15 -04:00
Sanjay Patel	a3f02702bf	[x86] add test check lines to demonstrate FMF propagation failure; NFC	2021-05-19 13:11:15 -04:00
Nikita Popov	ae95b48b2e	[ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC) We only use BackedgeTakenInfo::hasOperand().	2021-05-19 18:42:14 +02:00
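
The redzone arithmetic quoted in commit 8ecb4a2780 above (a 12-byte object gets a 20-byte redzone, a 20-byte object gets a 44-byte redzone) can be checked with a small standalone sketch. The function name `redzoneSizeForGlobal`, the `kMaxRZ` cap, and the rule used for objects larger than half the minimal redzone are assumptions made for illustration; only the small-object rule and the quoted figures come from the commit message, and this is not presented as the code in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM's API): returns a redzone size such that
// object size + redzone is a multiple of minRZ (32 in the commit message).
uint64_t redzoneSizeForGlobal(uint64_t sizeInBytes, uint64_t minRZ = 32) {
  constexpr uint64_t kMaxRZ = 1 << 18;  // assumed upper bound on the redzone
  uint64_t rz = 0;
  if (sizeInBytes <= minRZ / 2) {
    // Small objects: pad only up to minRZ in total, so the redzone is < minRZ
    // for any non-empty object.
    rz = minRZ - sizeInBytes;
  } else {
    // Assumed rule for larger objects: roughly a quarter of the object size,
    // clamped to [minRZ, kMaxRZ].
    rz = std::max(minRZ, std::min(kMaxRZ, (sizeInBytes / minRZ / 4) * minRZ));
    // Round the total (object + redzone) up to a multiple of minRZ.
    if (sizeInBytes % minRZ)
      rz += minRZ - (sizeInBytes % minRZ);
  }
  assert((rz + sizeInBytes) % minRZ == 0);
  return rz;
}
```

Under these assumptions, `redzoneSizeForGlobal(12)` yields 20 and `redzoneSizeForGlobal(20)` yields 44, matching the figures in the commit message, while a 1-byte object gets 31 bytes instead of the 63 bytes mentioned as the previous behaviour.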


216227 Commits