llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Yonghong Song	c572b095fb	[BPF] do not generate unused local/global types The kernel currently has a limit for # of types to be 64KB and the size of string subsection to be 64KB. A simple bcc tool runqlat.py generates: . the size of ~33KB type section, roughly ~10K types . the size of ~17KB string section The majority type is from the types referenced by local variables in the bpf program. For example, the kernel "task_struct" itself recursively brings in ~900 other types. This patch did the following optimization to avoid generating unused types: . do not generate types for local variables unless they are function arguments. . do not generate types for external globals. If an external global is not used in the program, llvm already removes it from IR, so global variable saving is typically small. For runqlat.py, only one variable "llvm.used" is the external global. The types for locals and external globals can be added back once there is a usage for them. After the above optimization, the runqlat.py generates: . the size of ~1.5KB type section, roughly 500 types . the size of ~0.7KB string section Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356232	2019-03-15 04:42:01 +00:00
Sam Clegg	cdfca3b241	[WebAssembly] Remove unused load/store patterns that use texternalsym Differential Revision: https://reviews.llvm.org/D59395 llvm-svn: 356221	2019-03-15 00:20:13 +00:00
Matt Arsenault	571a94ef5d	MIR: Allow targets to serialize MachineFunctionInfo This has been a very painful missing feature that has made producing reduced testcases difficult. In particular the various registers determined for stack access during function lowering were necessary to avoid undefined register errors in a large percentage of cases. Implement a subset of the important fields that need to be preserved for AMDGPU. Most of the changes are to support targets parsing register fields and properly reporting errors. The biggest sort-of bug remaining is for fields that can be initialized from the IR section will be overwritten by a default initialized machineFunctionInfo section. Another remaining bug is the machineFunctionInfo section is still printed even if empty. llvm-svn: 356215	2019-03-14 22:54:43 +00:00
Jessica Paquette	5fc1e87953	[AArch64][GlobalISel] Add isel support for G_UADDO on s32s and s64s This adds instruction selection support for G_UADDO on s32s and s64s. Also - Add an instruction selection test - Update the arm64-xaluo.ll test to show that we generate the correct assembly Differential Revision: https://reviews.llvm.org/D58734 llvm-svn: 356214	2019-03-14 22:54:29 +00:00
Amara Emerson	6da5735cc0	[AArch64][GlobalISel] Implement selection for G_UNMERGE of vectors to vectors. This re-uses the previous support for extract vector elt to extract the subvectors. Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356213	2019-03-14 22:48:18 +00:00
Amara Emerson	90869d8494	[AArch64][GlobalISel] Add some support for G_CONCAT_VECTORS. Handles concatenating 2 x v2s32 and 2 x v4s16 Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356212	2019-03-14 22:48:15 +00:00
Matt Arsenault	559ab603a2	AMDGPU: Correct type for waitcnt debug flag llvm-svn: 356206	2019-03-14 21:23:59 +00:00
Pete Couperus	84f781f880	[ARC] Add more load/store variants. On ARC ISA, general format of load instruction is this: LD<zz><.x><.aa><.di> a, [b,c] And general format of store is this: ST<zz><.aa><.di> c, [b,s9] Where: <zz> is data size field and can be one of <empty> (bits 00) - Word (32-bit), default behavior B (bits 01) - Byte H (bits 10) - Half-word (16-bit) <.x> is data extend mode: <empty> (bit 0) - If size is not Word(32-bit), then data is zero extended X (bit 1) - If size is not Word(32-bit), then data is sign extended <.aa> is address write-back mode: <empty> (bits 00) - no write-back .AW (bits 01) - Preincrement, base register updated pre memory transaction .AB (bits 10) - Postincrement, base register updated post memory transaction <.di> is cache bypass mode: <empty> (bit 0) - Cached memory access, default mode .DI (bit 1) - Non-cached data memory access This patch adds these load/store instruction variants to the ARC backend. Patch By Denis Antrushin! <denis@synopsys.com> Differential Revision: https://reviews.llvm.org/D58980 llvm-svn: 356200	2019-03-14 20:50:54 +00:00
Jessica Paquette	5e305f80d3	[GlobalISel][AArch64] Add partial selection support for G_INSERT_VECTOR_ELT This adds support for inserting elements into packed vectors. It also adds two tests: one for selection, and one for regbank select. Unpacked vectors will come in a follow-up. Differential Revision: https://reviews.llvm.org/D59325 llvm-svn: 356182	2019-03-14 18:01:30 +00:00
Pete Couperus	f3147c517c	[ARC] Better classify add/sub immediate instructions in frame lowering. Summary: Some operations have multiple ARC instructions that are applicable. For instance, "add r0, r0, 123" can be encoded as a "LImm" instruction with a 32-bit immediate (8-bytes), or as a signed 12-bit immediate instruction for the case where the source and destination register are the same (4-bytes). The ARC assembler will choose the shortest encoding, but we should track the correct instruction in the compiler. This patch fixes the instruction used in some cases from ARCFrameLowering. Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59326 llvm-svn: 356179	2019-03-14 17:50:46 +00:00
Craig Topper	29ffc6485f	[X86] Fix the pattern changes from r356121 so that the RORr1/RORm1 pattern use the rotr opcode. These instructions used to use rotl with a bitwidth-1 immediate. I changed the immediate to 1, but failed to change the opcode. Thankfully this seems to have not caused a functional issue because we now had two rotl by 1 patterns, but the correct ones were earlier and took priority. So we just missed some optimization. llvm-svn: 356164	2019-03-14 16:53:24 +00:00
Sanjay Patel	322f3c3c52	[x86] prevent infinite looping from vselect commutation (PR41066) This is an immediate fix for: https://bugs.llvm.org/show_bug.cgi?id=41066 ...but as noted there and the code comments, we should do better by stubbing this out sooner. llvm-svn: 356158	2019-03-14 15:32:34 +00:00
Matt Arsenault	557ae7b7e6	AMDGPU: Scavenge register instead of findUnusedReg llvm-svn: 356149	2019-03-14 14:19:01 +00:00
Matt Arsenault	fd11c30e7f	AMDGPU: Don't add unnecessary convergent attributes These are redundant with the intrinsic declaration. llvm-svn: 356143	2019-03-14 13:46:09 +00:00
Sam Parker	923be8eb23	[ARM][ParallelDSP] Enable multiple uses of loads When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 llvm-svn: 356132	2019-03-14 11:14:13 +00:00
Sam Parker	df3e0498f9	[ARM] Run ARMParallelDSP in the IRPasses phase Run EarlyCSE before ParallelDSP and do this in the backend IR opt phase. Differential Revision: https://reviews.llvm.org/D59257 llvm-svn: 356130	2019-03-14 10:57:40 +00:00
Alex Bradbury	fcca23e948	[RISCV] Fix rL356123 The wrong version of the patch was committed. This fixes typos that broke the build. llvm-svn: 356124	2019-03-14 08:31:35 +00:00
Alex Bradbury	d124e6db2b	[RISCV][NFC] Rename callee saved regs 'CSR' to CSR_ILP32_LP64 and minor RISCVRegisterInfo refactoring The CSR renaming further prepares the way for an upcoming patch adding support for more RISC-V ABIs. Modify RISCVRegisterInfo::getCalleeSavedRegs and RISCVRegisterInfo::getReservedRegs to do MF->getSubtarget<RISCVSubtarget>() once rather than multiple times. llvm-svn: 356123	2019-03-14 08:28:48 +00:00
Craig Topper	013902e369	[X86] Add patterns for rotr by immediate to fix PR41057. Prior to the introduction of funnel shift intrinsics we could count on rotate by immediates preferring to use rotl since that's what MatchRotate would check first. The or+shift pattern doesn't have a direction so one must be chosen arbitrarily. With funnel shift, there is a direction and fshr will try to use rotr first. While fshl will try to use rotl first. This patch adds the isel patterns for rotr to complement the rotl patterns. I've put the rotr by 1 patterns in the instruction patterns. And moved the rotl by bitwidth-1 patterns to separate Pat patterns. Fixes PR41057. llvm-svn: 356121	2019-03-14 07:07:26 +00:00
Jessica Paquette	9a354720e0	[AArch64][GlobalISel] Gardening: Simplify subregister copy in selectBuildVector NFC. Some more preliminary factoring for G_INSERT_VECTOR_ELT. Also better code-reuse, etc., etc. Differential Revision: https://reviews.llvm.org/D59323 llvm-svn: 356107	2019-03-13 23:29:54 +00:00
Jessica Paquette	4117eab6b5	[GlobalISel][AArch64] Gardening: Factor out vector inserts Factor out the vector insert code in `selectBuildVector`. Replace part of it with `emitScalarToVector`, since it was pretty much equivalent. This will make implementing G_INSERT_VECTOR_ELT easier. Differential Revision: https://reviews.llvm.org/D59322 llvm-svn: 356106	2019-03-13 23:22:23 +00:00
Jessica Paquette	fcc568af0c	[GlobalISel][AArch64] Gardening: Factor out code to find lane indices Some more refactoring for G_INSERT_VECTOR_ELT. Factor out the code used to find a lane index from `selectExtractElt`. Put it into a more general-purpose `getConstantValueForReg` function. This will be shared with the code for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59324 llvm-svn: 356101	2019-03-13 21:19:29 +00:00
Stanislav Mekhanoshin	0210f14ed7	[AMDGPU] Silence gcc 7 warnings Differential Revision: https://reviews.llvm.org/D59330 llvm-svn: 356100	2019-03-13 21:15:52 +00:00
Tim Renouf	0740b7d5b8	[AMDGPU] Switched HSA metadata to use MsgPackDocument Summary: MsgPackDocument is the lighter-weight replacement for MsgPackTypes. This commit switches AMDGPU HSA metadata processing to use MsgPackDocument instead of MsgPackTypes. Differential Revision: https://reviews.llvm.org/D57024 Change-Id: I0751668013abe8c87db01db1170831a76079b3a6 llvm-svn: 356081	2019-03-13 18:55:50 +00:00
Craig Topper	53b1ed4a97	[X86] Check for 64-bit mode in X86Subtarget::hasCmpxchg16b() The feature flag alone can't be trusted since it can be passed via -mattr. Need to ensure 64-bit mode as well. We had a 64 bit mode check on the instruction to make the assembler work correctly. But we weren't guarding any of our lowering code or the hooks for the AtomicExpandPass. I've added 32-bit command lines to atomic128.ll with and without cx16. The tests there would all previously fail if -mattr=cx16 was passed to them. I had to move one test case for f128 to a new file as it seems to have a different 32-bit mode or possibly sse issue. Differential Revision: https://reviews.llvm.org/D59308 llvm-svn: 356078	2019-03-13 18:48:50 +00:00
Simon Pilgrim	268ff3b066	[X86][AVX] Add X86ISD::VTRUNC handling to SimplifyDemandedVectorEltsForTargetNode llvm-svn: 356067	2019-03-13 17:00:18 +00:00
Simon Pilgrim	cbbfb5ea0e	[X86][AVX] Add combineConcatVectors support to improve subvector handling Attempt to combine CONCAT_VECTORS nodes, which we only really have pre-legalization. This encourages a lot of X86ISD::SUBV_BROADCAST generation, so I've added SimplifyDemandedVectorEltsForTargetNode handling for this at the same time. The X86ISD::VTRUNC regression in shuffle-vs-trunc-256-widen.ll will be handled in a future commit. llvm-svn: 356064	2019-03-13 16:37:30 +00:00
Alex Bradbury	a678d607a6	[RISCV] Only mark fp as reserved if the function has a dedicated frame pointer This follows similar logic in the ARM and Mips backends, and allows the free use of s0 in functions without a dedicated frame pointer. The changes in callee-saved-gprs.ll most clearly show the effect of this patch. llvm-svn: 356063	2019-03-13 16:33:45 +00:00
Simon Atanasyan	27f5465731	[mips] Join some adjacent `let DecoderNamespace` blocks. NFC llvm-svn: 356059	2019-03-13 16:00:42 +00:00
Sanjay Patel	996b4d3924	[x86] limit extractelement of setcc to pre-legalization A fuzzer found the crasher: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13700 The bug was introduced recently here: rL355741 This is the quick fix. If we need to do this transform later, then we'd have to extend/truncate the vector setcc element type to the scalar setcc type (i8). llvm-svn: 356053	2019-03-13 14:49:52 +00:00
Simon Atanasyan	e4c4aa3874	[mips] Fix encoding of the `mov.d` command for microMIPS R6 Before this change LLVM emits non-microMIPS variant of the `mov.d` command for microMIPS code. Differential Revision: http://reviews.llvm.org/D59045 llvm-svn: 356052	2019-03-13 14:23:12 +00:00
Simon Atanasyan	97f0b70b96	[mips] Define `mov.d` instructions using `ABSS_M` multiclass. NFC llvm-svn: 356051	2019-03-13 14:22:58 +00:00
Simon Pilgrim	f505cb7adc	Fix signed/unsigned mismatch warning. NFCI. llvm-svn: 356046	2019-03-13 13:14:14 +00:00
Simon Atanasyan	b8ca8b155a	[mips] Map SW instruction to its microMIPS R6 variant To provide mapping between standard and microMIPS R6 variants of the `sw` command we have to rename SWSP_xxx commands from "sw" to "swsp". Otherwise `tablegen` starts to show the error `Multiple matches found for `SW'`. After that to restore printing SWSP command as `sw`, I add an appropriate `MipsInstAlias` instance. We also need to implement "size reduction" for microMIPS R6. But this task is for separate patch. After that the `micromips-lwsp-swsp.ll` test case will be extended. Differential Revision: http://reviews.llvm.org/D59046 llvm-svn: 356045	2019-03-13 13:09:30 +00:00
Simon Pilgrim	ac6f6352c9	[X86][AVX] lowerShuffleAsBroadcast - improve load folding by avoiding bitcasts AVX1 broadcasts were failing as we were adding bitcasts that caused MayFoldLoad's hasOneUse to return false. This patch stops introducing bitcasts so early and also replaces the broadcast index scaling through bitcasts (which can't succeed in some cases) to instead just keep track of the bitoffset which can be converted back to the broadcast index later on. Differential Revision: https://reviews.llvm.org/D58888 llvm-svn: 356043	2019-03-13 12:20:39 +00:00
Simon Atanasyan	19e953f0bb	[MIPS][microMIPS] Fix PseudoMTLOHI_MM matching and expansion On micromips MipsMTLOHI is always matched to PseudoMTLOHI_DSP regardless of +dsp argument. This patch checks if HasDSP predicate is present for PseudoMTLOHI_DSP so PseudoMTLOHI_MM can be matched when appropriate. Add expansion of PseudoMTLOHI_MM instruction into a mtlo/mthi pair. Patch by Mirko Brkusanin. Differential Revision: http://reviews.llvm.org/D59203 llvm-svn: 356039	2019-03-13 11:04:38 +00:00
Jonas Hahnfeld	d1e932f9c7	[ELF] Fix GCC8 warnings about "fall through", NFCI Add break statements in Object/ELF.cpp since the code should consider the generic tags for Hexagon, MIPS, and PPC. Add a test (copied from llvm-readobj) to show that this works correctly (earlier versions of this patch would have asserted). The warnings in X86ELFObjectWriter.cpp are actually false-positives since the nested switch() handles all possible values and returns in all cases. Make this explicit by adding llvm_unreachable's. Differential Revision: https://reviews.llvm.org/D58837 llvm-svn: 356037	2019-03-13 10:38:17 +00:00
Alex Bradbury	ba45169928	[RISCV] Replace incorrect use of sizeof with array_lengthof RISCVDisassembler was incorrectly using sizeof(Arr) when it should have used sizeof(Arr)/sizeof(Arr[0]). Update to use array_lengthof instead. llvm-svn: 356035	2019-03-13 09:22:57 +00:00
Craig Topper	66e946eb0f	[X86] Enable printAliasInstr for the Intel assembly printer so that AAM and AAD will print without an immediate when the immediate is 10. llvm-svn: 355997	2019-03-13 00:43:03 +00:00
Heejin Ahn	fa32545e1d	[WebAssembly] Place 'try' and 'catch' correctly wrt EH_LABELs Summary: After instruction selection phase, possibly-throwing calls, which were previously invoke, are wrapped in `EH_LABEL` instructions. For example: ``` EH_LABEL <mcsymbol .Ltmp0> CALL_VOID @foo ... EH_LABEL <mcsymbol .Ltmp1> ``` `EH_LABEL` is placed also in the beginning of EH pads: ``` bb.1 (landing-pad): EH_LABEL <mcsymbol .Ltmp2> ... ``` And we'd like to maintain this relationship, so when we place a `try`, ``` TRY ... EH_LABEL <mcsymbol .Ltmp0> CALL_VOID @foo ... EH_LABEL <mcsymbol .Ltmp1> ``` When we place a `catch`, ``` bb.1 (landing-pad): EH_LABEL <mcsymbol .Ltmp2> %0:except_ref = CATCH ... ... ``` Previously we didn't treat EH_LABELs specially, so `try` was placed right before a call, and `catch` was placed in the beginning of an EH pad. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58914 llvm-svn: 355996	2019-03-13 00:37:31 +00:00
Jason Liu	65a0422be5	Add XCOFF triple object format type for AIX This patch adds an XCOFF triple object format type into LLVM. This XCOFF triple object file type will be used later by object file and assembly generation for the AIX platform. Differential Revision: https://reviews.llvm.org/D58930 llvm-svn: 355989	2019-03-12 22:01:10 +00:00
Philip Reames	5c708a3b00	For faulting ops, include a comment w/the fault destination A faulting_op is one that has specified behavior when a fault occurs, generally redirecting control flow to another location. This change just adds a comment to the assembly output which makes it both human readable, and machine checkable w/o having to parse the FaultMap section. This is used to split a test file into two parts, so that I can (in a near future commit) easily extend the test file to demonstrate another case. llvm-svn: 355982	2019-03-12 21:05:31 +00:00
Matt Arsenault	01d726ce5c	IR: Add immarg attribute This indicates an intrinsic parameter is required to be a constant, and should not be replaced with a non-constant value. Add the attribute to all AMDGPU and generic intrinsics that comments indicate it should apply to. I scanned other target intrinsics, but I don't see any obvious comments indicating which arguments are intended to be only immediates. This breaks one questionable testcase for the autoupgrade. I'm unclear on whether the autoupgrade is supposed to really handle declarations which were never valid. The verifier fails because the attributes now refer to a parameter past the end of the argument list. llvm-svn: 355981	2019-03-12 21:02:54 +00:00
Sanjay Patel	7439052886	[x86] scalarize extractelement 0 of FP vselect llvm-svn: 355955	2019-03-12 19:20:45 +00:00
Jinsong Ji	e0d2002cc3	Set useful flags for vector imm setting instructions Vector imm setting instructions like XXLXORz/XXLXORspz/XXLXORdpz Should behave like LI8. We should set corresponding flags to allow rematerialization and other opts in LICM, RA, Scheduling etc. Differential Revision: https://reviews.llvm.org/D58645 llvm-svn: 355948	2019-03-12 18:27:09 +00:00
Eli Friedman	5d07ab0e8a	[RISCV][MC] Find matching pcrel_hi fixup in more cases. If a symbol points to the end of a fragment, instead of searching for fixups in that fragment, search in the next fragment. Fixes spurious assembler error with subtarget change next to "la" pseudo-instruction, or expanded equivalent. Alternate proposal to fix the problem discussed in https://reviews.llvm.org/D58759. Testcase by Ana Pazos. Differential Revision: https://reviews.llvm.org/D58943 llvm-svn: 355946	2019-03-12 18:14:16 +00:00
Craig Topper	6114726daa	[X86] Arrange more CPU features to inherit from earlier CPUs. NFCI This makes SandyBridge inherit back to Westmere/Nehalem. Make bdver1-4 inherit from each other and btver2 inherit from btver1. llvm-svn: 355935	2019-03-12 16:35:30 +00:00
Jinsong Ji	83e9fa63ad	[NFC][PowerPC]Assert when trying to generate directmove below P8. This was found when we generated COPY from G8RC to F8RC in EmitInstrWithCustomInserter without checking proper architecture, we silently generated mtvsrd, which requires P8 and up. This is a NFC patch to add assert when we call copyPhysReg, in case someone accidentally generates COPY between G8RC to F8RC for P7 and below. llvm-svn: 355920	2019-03-12 14:01:29 +00:00
David Stuttard	bbae73ce2c	[AMDGPU] Add support for immediate operand for S_ENDPGM Summary: Add support for immediate operand in S_ENDPGM Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6 Reviewers: alexshap Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59213 llvm-svn: 355902	2019-03-12 09:52:58 +00:00
David Blaikie	feb6347be0	Hexagon RDF: Replace function template (plus explicit specializations) with non-template overloads For the design in question, overloads seem to be a much simpler and less subtle solution. This removes ODR issues, and errors of the kind where code that uses the specialization in question will accidentally and erroneously specialize the primary template. This only "works" by accident; the program is ill-formed NDR. (Found with -Wundefined-func-template.) Patch by Thomas Köppe! Differential Revision: https://reviews.llvm.org/D58998 llvm-svn: 355880	2019-03-11 23:10:33 +00:00
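
The Hexagon RDF commit above (feb6347be0) replaces a function template plus explicit specializations with plain overloads. A minimal sketch of why overloads are the safer design, assuming hypothetical names (`DefNode`, `UseNode`, `describe`, `selfCheck`) that do not come from the actual RDF code:

```cpp
#include <cassert>
#include <string>

// Hypothetical node types, standing in for the real RDF classes.
struct DefNode {};
struct UseNode {};

// Non-template overloads: each declaration is visible wherever this code is
// included, so overload resolution always picks the intended function.
inline std::string describe(const DefNode &) { return "def"; }
inline std::string describe(const UseNode &) { return "use"; }

// The pattern the commit removed looks roughly like this (illustration only):
//   template <typename T> std::string describe(const T &);   // primary
//   template <> std::string describe(const DefNode &);       // specialization
// If a translation unit calls describe(DefNode{}) before the explicit
// specialization is declared, it implicitly instantiates the primary
// template instead, which is ill-formed, no diagnostic required.

// Small self-check showing that each call resolves to the matching overload.
inline void selfCheck() {
  assert(describe(DefNode{}) == "def");
  assert(describe(UseNode{}) == "use");
}
```

Calling `selfCheck()` from any function exercises both overloads. With overloads, a missing declaration is an ordinary compile error rather than a silent ODR problem.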


51217 Commits