llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Sanjay Patel	2b34c4298e	[PhaseOrdering] add test for missing vector/CSE transforms (PR45015); NFC	2020-02-25 09:13:49 -05:00
Sanjay Patel	82c3c898a1	[VectorCombine] add tests for possible extract->shuffle; NFC	2020-02-25 08:41:59 -05:00
Sanjay Patel	653c6caf15	[VectorCombine] make cost calc consistent for binops and cmps Code duplication (subsequently removed by refactoring) allowed a logic discrepancy to creep in here. We were being conservative about creating a vector binop -- but not a vector cmp -- in the case where a vector op has the same estimated cost as the scalar op. We want to be more aggressive here because that can allow other combines based on reduced instruction count/uses. We can reverse the transform in DAGCombiner (potentially with a more accurate cost model) if this causes regressions. AFAIK, this does not conflict with InstCombine. We have a scalarize transform there, but it relies on finding a constant operand or a matching insertelement, so that means it eliminates an extractelement from the sequence (so we won't have 2 extracts by the time we get here if InstCombine succeeds). Differential Revision: https://reviews.llvm.org/D75062	2020-02-25 08:41:59 -05:00
Florian Hahn	ac6a52e368	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Nico Weber	c0713c8977	[gn build] (manually) merge fee41517fe0f	2020-02-25 07:19:49 -05:00
whitequark	b9bac03900	Remove myself from CODE_OWNERS.	2020-02-25 11:59:29 +00:00
Hans Wennborg	96d1f4da3f	build_llvm_package.bat: Produce zip files in addition to the installers Now that the Windows installer no longer does anything besides self-extract, maybe it would make sense to distribute the toolchain as a plain zip file in addition to the current installer. Differential revision: https://reviews.llvm.org/D74896	2020-02-25 12:14:07 +01:00
Andrzej Warzynski	4be34a5506	[AArch64][SVE] Update names and comments for gathers/scatters (NFC) Summary: This patch renames functions and TableGen classes for SVE gathers and scatters. The original names implied that the corresponding methods/classes are only suited for regular gathers/scatters (i.e. LD1 and ST1), which is not the case. Indeed, we will be re-using them for non-temporal and first-faulting gathers/scatters in the forthcoming patches. The new names also highlight the split into Vector-Scalar (VS) and Scalar-Vector (SV) cases. List of changes: * `performLD1GatherCombine` and `performST1ScatterCombine` are renamed as `performGatherLoadCombine` and `performScatterStoreCombine`, respectively. * Selection DAG types for scatters and gathers from AArch64SVEInstrInfo.td are renamed. For example, `SDT_AArch64_GLD1` is renamed as `SDT_AArch64_GATHER_SV`. SV stands for Scalar-Vector, as opposed to Vector-Scalar (VS). * The intrinsic classes from IntrinsicsAArch64.td are renamed. For example, `AdvSIMD_GatherLoad_64bitOffset_Intrinsic` is renamed as `AdvSIMD_GatherLoad_SV_64b_Offsets_Intrinsic`. * Updated comments in `performGatherLoadCombine` and `performScatterStoreCombine`. Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75035	2020-02-25 11:09:01 +00:00
Hans Wennborg	2671045ba2	Don't generate libcalls for wide shift on Windows ARM (PR42711) The previous patch (cff90f07cb5cc3c3bc58277926103af31caef308) didn't cover ARM.	2020-02-25 11:54:07 +01:00
Georgii Rymar	11c96ce249	[yaml2obj] - Address post commit comments for D74764 It removes a stale comment and fixes the comment in the test and section names related accordingly.	2020-02-25 13:26:46 +03:00
Cullen Rhodes	cf01d0410b	[AArch64][SVE] Add predicate reinterpret intrinsics Summary: Implements the following intrinsics: * llvm.aarch64.sve.convert.to.svbool * llvm.aarch64.sve.convert.from.svbool For converting the ACLE svbool_t type (<n x 16 x i1>) to and from the other predicate types: <n x 8 x i1>, <n x 4 x i1> and <n x 2 x i1>. Reviewers: sdesmalen, kmclaughlin, efriedma, dancgr, rengolin Reviewed By: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74471	2020-02-25 10:24:06 +00:00
Igor Kudrin	84c56f2c7c	[DebugInfo] Fix printing CIE offsets in EH FDEs. While the value of the CIE pointer field in a DWARF FDE record is an offset to the corresponding CIE record from the beginning of the section, for EH FDE records it is relative to the current offset. Previously, we did not make that distinction when dumped both kinds of FDE records and just printed the same value for the CIE pointer field and the CIE offset; that was acceptable for DWARF FDEs but was wrong for EH FDEs. This patch fixes the issue by explicitly printing the offset of the linked CIE object. Differential Revision: https://reviews.llvm.org/D74613	2020-02-25 17:10:29 +07:00
Hans Wennborg	88d39a207c	Add llvm-cov to LLVM_TOOLCHAIN_TOOLS See https://github.com/llvm/llvm-project/issues/141	2020-02-25 10:59:55 +01:00
Jay Foad	4dace44b9b	GlobalISel: NFC minor cleanup to avoid a couple of fixed size local arrays	2020-02-25 09:49:19 +00:00
Jay Foad	6007dfcb1d	AMDGPU/GlobalISel: add legalize tests for s64 max/min	2020-02-25 09:49:19 +00:00
Kang Zhang	40015f35b4	[NFC][PowerPC] Add a new test case scalar_cmp.ll	2020-02-25 09:19:27 +00:00
Craig Topper	fe184bc7b6	[X86] Pass parameters into selectVectorAddr to remove dependency on X86MaskedGatherScatterSDNode. Might be able to get rid of X86ISD::SCATTER and some uses of X86ISD::GATHER. Which require isel to use ISD::SCATTER and ISD::GATHER as well.	2020-02-24 23:56:34 -08:00
Craig Topper	c65a52bcc1	[X86] Remove mask output from X86 gather/scatter ISD opcodes. Instead add it when we make the machine nodes during instruction selections. This makes this ISD node closer to ISD::MGATHER. Trying to see if we remove the X86 specific ones.	2020-02-24 23:56:28 -08:00
Jim Lin	eb8838c3f2	[Sparc][NFC] Remove trailing space	2020-02-25 14:38:58 +08:00
Hideto Ueno	bfbcd7c284	[Attributor] Use AssumptionCache in AANonNullFloating::initialize	2020-02-25 13:00:03 +09:00
Matt Arsenault	d7d89ae055	GlobalISel: Remove unneeded initialiation Removes implicit unsigned->Register conversion.	2020-02-24 22:42:53 -05:00
Matt Arsenault	b754a8cb27	AMDGPU/GlobalISel: Introduce post-legalize combiner The current set of custom combines are only really useful after legalization, so move them there. There is a lot of overlap in the boilerplate here, but I think we do want a pretty different set of combines before and after legalize. I think we will want a lot of overlap between the post-legalize and a post-regbankselect combiner.	2020-02-24 22:12:12 -05:00
Roman Tereshin	eba472875b	[MachineVerifier] Doing ::calcRegsPassed over faster sets: ~15-20% faster MV, NFC MachineVerifier still takes 45-50% of total compile time with -verify-machineinstrs, with calcRegsPassed dataflow taking ~50-60% of MachineVerifier. The majority of that time is spent in BBInfo::addPassed, mostly within DenseSet implementing the sets the dataflow is operating over. In particular, 1/4 of that DenseSet time is spent just iterating over it (operator++), 40-50% on insertions, and most of the rest in ::count. Given that, we're implementing custom sets just for this analysis here, focusing on cheap insertions and O(n) iteration time (as opposed to O(U), where U is the universe). As it's based _mostly_ on BitVector for sparse and SmallVector for dense, it may remotely resemble SparseSet. The difference is, our solution is a lot less clever, doesn't have constant time `clear` that we won't use anyway as reusing these sets across analyses is cumbersome, and thus more space efficient and safer (got a resizable Universe and a fallback to DenseSet for sparse if it gets too big). With this patch MachineVerifier gets ~15-20% faster, its contribution to total compile time drops from 45-50% to ~35%, while contribution of calcRegsPassed to MachineVerifier drops from 50-60% to ~35% as well. calcRegsPassed itself gets another 2x faster here. All measured on a large suite of shaders targeting a number of GPUs. Reviewers: bogner, stoklund, rudkx, qcolombet Reviewed By: rudkx Tags: #llvm Differential Revision: https://reviews.llvm.org/D75033	2020-02-24 19:01:21 -08:00
Bill Wendling	ffaf238dd0	Allow "callbr" to return non-void values Summary: Terminators in LLVM aren't prohibited from returning values. This means that the "callbr" instruction, which is used for "asm goto", can support "asm goto with outputs." This patch removes all restrictions against "callbr" returning values. The heavy lifting is done by the code generator. The "INLINEASM_BR" instruction's a terminator, and the code generator doesn't allow non-terminator instructions after a terminator. In order to correctly model the feature, we need to copy outputs from "INLINEASM_BR" into virtual registers. Of course, those copies aren't terminators. To get around this issue, we split the block containing the "INLINEASM_BR" right before the "COPY" instructions. This results in two cheats: - Any physical registers defined by "INLINEASM_BR" need to be marked as live-in into the block with the "COPY" instructions. This violates an assumption that physical registers aren't marked as "live-in" until after register allocation. But it seems as if the live-in information only needs to be correct after register allocation. So we're able to get away with this. - The indirect branches from the "INLINEASM_BR" are moved to the "COPY" block. This is to satisfy PHI nodes. I've been told that MLIR can support this handily, but until we're able to use it, we'll have to stick with the above. Reviewers: jyknight, nickdesaulniers, hfinkel, MaskRay, lattner Reviewed By: nickdesaulniers, MaskRay, lattner Subscribers: rriddle, qcolombet, jdoerfert, MatzeB, echristo, MaskRay, xbolva00, aaron.ballman, cfe-commits, JonChesterfield, hiraditya, llvm-commits, rnk, craig.topper Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D69868	2020-02-24 18:29:06 -08:00
Sourabh Singh Tomar	d95c06c4ad	[DebugInfo]: Refactored Macinfo section consumption part to allow future macro section dumping. Summary: Previously macinfo infrastructure was using functions names that were ambiguous i.e `getMacro/getMacroDWO` in a sense of conveying stated intentions. This patch refactored them into more reasonable `getDebugMacinfo/getDebugMacinfoDWO` names thus making room for macro implementation. Reviewers: aprantl, probinson, jini.susan.george, dblaikie Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D75037	2020-02-25 07:56:48 +05:30
Matt Arsenault	8e2bc18686	AMDGPU/GlobalISel: Fix incorrect VOP3P fneg folding We use some s32 values in VOP3P operands, and won't see any intervening casts from a 32-bit fneg. Make sure it's really a packed fneg before folding.	2020-02-24 21:20:35 -05:00
Matt Arsenault	089cb64937	GlobalISel: Reimplement fewerElementsVectorBasic Changes the handling of odd breakdowns, and avoids using G_EXTRACT/G_INSERT. Pad with undef to a wider size, and unmerge. Also avoid introducing instructions for the fully undef components.	2020-02-24 21:19:47 -05:00
Eli Friedman	8e5fb180ad	[AArch64] SVE implies fullfp16 This is explicitly guaranteed in ARMARM. And it makes reasoning about vectors easier: we can assume that if a vector operation is legal, the corresponding scalar operation is also legal. Differential Revision: https://reviews.llvm.org/D74993	2020-02-24 17:19:35 -08:00
Joerg Sonnenberger	bede12ada9	Prefer PATH_MAX to MAXPATHLEN The former is part of POSIX and requires less heavy headers. They are practically functionally equivalent.	2020-02-25 01:37:29 +01:00
Shoaib Meenai	88b0fca2ac	[arcconfig] Delete subproject arcconfigs From https://secure.phabricator.com/book/phabricator/article/arcanist_new_project/: > An .arcconfig file is a JSON file which you check into your project's root. I've done some experimentation, and it looks like the subproject .arcconfigs just get ignored, as the documentation says. Given that we're fully on the monorepo now, it's safe to remove them. Differential Revision: https://reviews.llvm.org/D74996	2020-02-24 16:20:36 -08:00
Craig Topper	f173a7a659	[LegalizeTypes] Scalarize non-byte sized loads in WidenRecRes_Load and SplitVecResLoad Should fix PR42803 and PR44902 Differential Revision: https://reviews.llvm.org/D74590	2020-02-24 15:14:33 -08:00
Jay Foad	2dc02e28bd	AMDGPU/GlobalISel: Lower 64-bit uaddo/usubo Summary: Add more test cases for signed and unsigned add/sub with overflow. Reviewers: arsenm, rampitec, kerbowa Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75051	2020-02-24 23:08:14 +00:00
Bardia Mahjour	10f87c06af	[NFC] [DA] Refactoring getIndexExpressionsFromGEP Summary: This patch moves the getIndexExpressionsFromGEP function from polly into ScalarEvolution so that both polly and DependenceAnalysis can use it for the purpose of subscript delinearization when the array sizes are not parametric. Authored By: bmahjour Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu Reviewed By: Meinersbur Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73995	2020-02-24 17:32:30 -05:00
Greg Clayton	6beeaecd76	Add methods to data extractor for extracting bytes and fixed length C strings. Summary: These modificaitons will be used in D74883. Fixed length C strings can have trailing NULLs or sometimes spaces (BSD archive files), so the fixed length C string defaults to stripping trailing NULLs, but can have the arguments specify to remove one or more kinds of spaces if needed. This is used to extract fixed length C strings from ELF NOTEs in D74883. Reviewers: labath, dblaikie, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74991	2020-02-24 14:17:43 -08:00
Ikhlas Ajbar	e5455cb803	[Hexagon] Lower vector predicate store This patch lowers store of vector predicate of type v128i1.	2020-02-24 15:43:04 -06:00
Eric Astor	f393949a0c	Reland "[ms] [llvm-ml] Improve data support, adding names and complex initializers." This reverts commit 9fe769a961dc8e3ce7d967ea0e07a4f0e5fac6e9, and re-lands commit c2e272f8cf76ec97f675e0dfdada75445bbee5c5. Summary: Add support for ?, DUP, and string initializers, as well as MASM syntax for named data locations. This version avoids the use of a C++17-only feature, if-statements with initializer. Reviewers: rnk, thakis Reviewed By: thakis Tags: #llvm Differential Revision: https://reviews.llvm.org/D73226	2020-02-24 16:40:25 -05:00
Roman Tereshin	b34d8342eb	[MachineVerifier] Doing ::calcRegsPassed in RPO: ~35% faster MV, NFC Depending on the target, test suite, pipeline config and perhaps other factors machine verifier when forced on with -verify-machineinstrs can increase compile time 2-2.5 times over (Release, Asserts On), taking up ~60% of the time. An invaluable tool, it significantly slows down machine verifier-enabled testing. Nearly 75% of its time MachineVerifier spends in the calcRegsPassed method. It's a classic forward dataflow analysis executed over sets, but visiting MBBs in arbitrary order. We switch that to RPO here. This speeds up MachineVerifier by about 35%, decreasing the overall compile time with -verify-machineinstrs by 20-25% or so. calcRegsPassed itself gets 2x faster here. All measured on a large suite of shaders targeting a number of GPUs. Reviewers: bogner, stoklund, rudkx, qcolombet Reviewed By: bogner Tags: #llvm Differential Revision: https://reviews.llvm.org/D75032	2020-02-24 13:30:01 -08:00
Ikhlas Ajbar	e16adb830a	[Hexagon] Lower bitcast of a vector predicate This patch lowers bitcast of vector predicate of type v32i1/v64i1 to i32/i64 type.	2020-02-24 15:25:51 -06:00
Eric Astor	97dadf463b	Revert "[ms] [llvm-ml] Improve data support, adding names and complex initializers." This reverts commit c2e272f8cf76ec97f675e0dfdada75445bbee5c5, which broke builds.	2020-02-24 16:08:40 -05:00
Eric Astor	408e3579d1	[ms] [llvm-ml] Improve data support, adding names and complex initializers. Summary: Add support for ?, DUP, and string initializers, as well as MASM syntax for named data locations. Reviewers: rnk, thakis Reviewed By: thakis Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73226	2020-02-24 15:40:04 -05:00
Craig Topper	22fda509e2	[X86] Add back fmaddsub intrinsics to work towards fixing the strict fp implementation Previously we emitted an fmadd and a fmadd+fneg and combined them with a shufflevector. But this doesn't follow the correct exception behavior for unselected elements so the backend can't merge them into the fmaddsub/fmsubadd instructions. This patch restores the the fmaddsub intrinsics so we don't have two arithmetic operations. We lose out on optimization opportunity in the non-strict FP case, but I don't think this is a big loss. If someone gives us a test case we can look into adding instcombine/dagcombine improvements. I'd rather not have the frontend do completely different things for strict and non-strict. This still has problems because target specific intrinsics don't support strict semantics yet. We also still have all of the problems with masking. But we at least generate the right instruction in constrained mode now. Differential Revision: https://reviews.llvm.org/D74268	2020-02-24 12:07:21 -08:00
Stanislav Mekhanoshin	82801db7f3	[AMDGPU] use llvm_unreachable instead of default for rp set GCC 9.2 seems to incorrectly issue warning about out of bounds access. This situation should not happen in any way. Differential Revision: https://reviews.llvm.org/D75071	2020-02-24 12:02:12 -08:00
Ayke van Laethem	00a4c69ae3	[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder. Differential Revision: https://reviews.llvm.org/D51642	2020-02-24 20:15:51 +01:00
Simon Pilgrim	f8f1dd101b	[SelectionDAG] Merge constant SDNode arithmetic into foldConstantArithmetic This is the second patch as part of https://bugs.llvm.org/show_bug.cgi?id=36544 Merging in the ConstantSDNode variant of FoldConstantArithmetic. After this, I will begin merging in FoldConstantVectorArithmetic I've ensured this patch can build & pass all lit tests in Windows and Linux environments. Patch by @justice_adams (Justice Adams) Differential Revision: https://reviews.llvm.org/D74881	2020-02-24 18:54:22 +00:00
Francis Visoiu Mistrih	3646289c1d	[MachO] Add cpu(sub)type tests and improve error handling Add checks for triples that don't use mach-o, and unit tests for everything.	2020-02-24 10:44:42 -08:00
Ayke van Laethem	3509611238	[bindings/go] Add RemoveFromParentAsInstruction This allows removing instructions without erasing them. They can then be added somewhere else in the IR using Builder.Insert().	2020-02-24 19:38:47 +01:00
Ayke van Laethem	159dabd5ed	[AVR] Disassemble register operands Simply by implementing a few functions I was able to correctly disassemble a much larger amount of instructions. Differential Revision: https://reviews.llvm.org/D74045	2020-02-24 19:35:51 +01:00
Simon Pilgrim	91b05f7d3e	[X86] combineX86ShuffleChain - select X86ISD::FAND/ISD::AND based on MaskVT Noticed by inspection, we shouldn't use FloatDomain directly, we've already bitcast both inputs to MaskVT so select the opcode using that.	2020-02-24 18:24:44 +00:00
Ayke van Laethem	ec95f4b35a	[AVR] Don't assert on an undefined operand Not all operands are correctly disassembled at the moment. This means that some machine instructions won't have all the necessary operands set. To avoid asserting, print an error instead until the necessary support has been implemented. Differential Revision: https://reviews.llvm.org/D73958	2020-02-24 19:22:52 +01:00
Ayke van Laethem	a33e7d00b3	[AVR] Use correct register class for mul instructions A number of multiplication instructions (muls, mulsu, fmul, fmuls, fmulsu) had the wrong register class for an operand. This resulted in the wrong register being used for the instruction. Example: target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8" target triple = "avr-atmel-none" define i16 @sliceAppend(i16, i16, i16, i16, i16, i16) addrspace(1) { %d = mul i16 %0, %5 ret i16 %d } The first instruction would be muls r24, r31 before this patch. The r31 should have been r15 if you look at the intermediate forms during instruction selection / register allocation, but the generated instruction uses r31. After this patch, an extra movw is inserted to get %5 in range for muls. To make sure this bug is fixed everywhere, I checked all instructions and found that most multiplication instructions suffered from this bug, which I have fixed with this patch. No other instructions appear to be affected. Differential Revision: https://reviews.llvm.org/D74281	2020-02-24 19:19:56 +01:00

1 2 3 4 5 ...

192465 Commits