llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	1b7526b7e8	[X86][AVX] Attempt to lower v16i32/v16f32 shuffles with lowerShuffleAsRepeatedMaskAndLanePermute Avoids prematurely creating permps/permd variable shuffles. Fixes PR46249	2020-06-23 18:33:50 +01:00
Momchil Velikov	631e350306	[ARM] Describe defs/uses of VLLDM and VLSTM The VLLDM and VLSTM instructions are incompletely specified. They (potentially) write (or read, respectively) registers Q0-Q7, VPR, and FPSCR, but the compiler is unaware of it. In the new test case `cmse-vlldm-no-reorder.ll` the compiler missed an anti-dependency and reordered a `VLLDM` ahead of the instruction that stashed the return value from the non-secure call, effectively clobbering said value. This test case does not fail with upstream LLVM, because of scheduling differences and I couldn't find a test case for the VLSTM either. Differential Revision: https://reviews.llvm.org/D81586	2020-06-23 16:04:23 +01:00
Mikhail Maltsev	14bad468ca	[BFloat] Add convert/copy intrinsic support This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a Specifically it adds intrinsic support in clang and llvm for Arm and AArch64. The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Alexandros Lamprineas - Luke Cheeseman - Mikhail Maltsev - Momchil Velikov - Luke Geeson Differential Revision: https://reviews.llvm.org/D80928	2020-06-23 14:27:05 +00:00
Matt Arsenault	a5ea3b8c6a	AMDGPU/GlobalISel: Fix asserts on non-s32 sitofp/uitofp sources The combine to form cvt_f32_ubyte0 was assuming the source type was always 32-bit, but this needs to tolerate any legal source type.	2020-06-23 10:00:35 -04:00
Mikhail Maltsev	069aeafb71	[ARM] BFloat MatMul Intrinsics&CodeGen Summary: This patch adds support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman - Simon Tatham Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81740	2020-06-23 12:06:37 +00:00
hsmahesha	f764419b10	[AMDGPU/MemOpsCluster] Compute `width` for `MIMG` instruction class. Summary: `width` computation is missing for newly added `MIMG` instruction class. Add it. Reviewers: foad, rampitec, arsenm Reviewed By: foad Subscribers: MatzeB, javed.absar, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81649	2020-06-23 17:32:17 +05:30
Sander de Smalen	0b0c0d92c0	[AArch64][SVE] ACLE: Add bfloat16 to struct load/stores. This patch contains: - Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4. - New bfloat16 ACLE builtins for svld(2\|3\|4)[_vnum] and svst(2\|3\|4)[_vnum] Reviewers: stuij, efriedma, c-rhodes, fpetrogalli Reviewed By: fpetrogalli Tags: #clang, #lldb, #llvm Differential Revision: https://reviews.llvm.org/D82187	2020-06-23 12:12:35 +01:00
Chen Zheng	4cc65c325c	[PowerPC] fold addi's imm operand to its imm form consumer's displacement This patch adds a function to do following transformation: %0:g8rc_and_g8rc_nox0 = ADDI8 %5:g8rc_and_g8rc_nox0, 144 STD killed %7:g8rc, 16, %0:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8) ------> STD killed %7:g8rc, 160, %5:g8rc_and_g8rc_nox0 :: (store 8 into %ir.8) Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81723	2020-06-23 06:28:18 -04:00
Simon Pilgrim	bbe9672a5e	[X86] truncateVectorWithPACK - fix outdated comment. NFC. We perform PACKSS/PACKUS on AVX512 targets if the calling function wants to.	2020-06-23 10:45:27 +01:00
Paul Walker	833bdd906f	[SVE] Code generation for fixed length vector loads & stores. Summary: This patch adds base support for code generating fixed length vector operations targeting a known SVE vector length. To achieve this we lower fixed length vector operations to equivalent scalable vector operations, whereby SVE predication is used to limit the elements processed to those present within the fixed length vector. Specifically this patch implements load and store operations, which get lowered to their masked counterparts thusly: V = load(Addr) => V = extract_fixed_vector(masked_load(make_pred(V.NumElts), Addr)) store(V, (Addr)) => masked_store(insert_fixed_vector(V), make_pred(V.NumElts), Addr)) Reviewers: rengolin, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80385	2020-06-23 09:39:03 +00:00
Dylan McKay	d82ee5e9f8	[AVR] Rewrite the function calling convention. Summary: The previous version relied on the standard calling convention using std::reverse() to try to force the AVR ABI. But this only works for simple cases; it fails, for example, with aggregate types. This patch rewrites the calling convention with custom C++ code that implements the ABI defined in https://gcc.gnu.org/wiki/avr-gcc. To do that it adds a few 16-bit pseudo registers for unaligned argument passing, such as R24R23. For example this function: define void @fun({ i8, i16 } %a) will pass %a.0 in R22 and %a.1 in R24R23. There are no instructions that can use these pseudo registers, so a new register class, DREGSMOVW, is defined to set them apart. Also the ArgCC_AVR_BUILTIN_DIV is no longer necessary, as it is identical to the C++ behavior (actually the clobber list is more strict for __div* functions, but that is currently unimplemented). Reviewers: dylanmckay Subscribers: Gaelan, Sh4rK, indirect, jwagen, efriedma, dsprenkels, hiraditya, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68524 Patch by Rodrigo Rivas Costa.	2020-06-23 21:36:18 +12:00
Michael Liao	a3d92e3de4	[SDAG] Add new AssertAlign ISD node. Summary: - AssertAlign node records the guaranteed alignment on its source node, where these alignments are retrieved from alignment attributes in LLVM IR. These tracked alignments could help DAG combining and lowering generating efficient code. - In this patch, the basic support of AssertAlign node is added. So far, we only generate AssertAlign nodes on return values from intrinsic calls. - Addressing selection in AMDGPU is revised accordingly to capture the new (base + offset) patterns. Reviewers: arsenm, bogner Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81711	2020-06-23 00:51:11 -04:00
Amy Kwan	999d7986df	[PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang This patch implements builtins for the following prototypes for the VSX Permute Control Vector Generate with Mask Instructions: vector unsigned char vec_genpcvm (vector unsigned char, const int); vector unsigned short vec_genpcvm (vector unsigned short, const int); vector unsigned int vec_genpcvm (vector unsigned int, const int); vector unsigned long long vec_genpcvm (vector unsigned long long, const int); Differential Revision: https://reviews.llvm.org/D81774	2020-06-22 21:09:34 -05:00
Ayke van Laethem	482cfa14e2	[AVR] Disassemble double register instructions Add disassembly support for the movw, adiw, and sbiw instructions. I had previously committed test cases for the adiw and sbiw instructions, but had accidentally made them not runnable so they were skipped all this time. Oops. This patch fixes that by adding support for disassembling those instructions. Differential Revision: https://reviews.llvm.org/D82093	2020-06-23 02:18:04 +02:00
Ayke van Laethem	75eff6e3b0	[AVR] Disassemble instructions with fixed Z operand Some instructions have a fixed Z register and don't have an explicit register operand. This can be worked around by simply printing the operand directly if the particular register class is detected. The LPM and ELPM instructions also needed a custom decoder, which is also included in this patch. Differential Revision: https://reviews.llvm.org/D82088	2020-06-23 02:17:53 +02:00
Ayke van Laethem	af22684b77	[AVR] Disassemble multiplication instructions These can often only use a limited range of registers, and apparently need special decoding support. Differential Revision: https://reviews.llvm.org/D81971	2020-06-23 02:17:37 +02:00
Ayke van Laethem	81a28f0089	[AVR] Decode single register instructions This is a set of instructions that take just a single register as an operand, with no immediates. Because all instructions share the same format, I haven't added exhaustive bit testing to all instructions but just to the inc instruction. Differential Revision: https://reviews.llvm.org/D81968	2020-06-23 02:17:15 +02:00
Ayke van Laethem	69a35debec	[AVR] Don't adjust for instruction size I'm not entirely sure why this was ever needed, but when I remove both adjustments all tests still pass. This fixes a bug where a long branch (using the `jmp` instead of the `rjmp` instruction) was incorrectly adjusted by 2 because it jumps to an absolute address instead of a PC-relative address. I could have added AVR::fixup_call to the list of exceptions, but it seemed more sensible to me to just remove this code. Differential Revision: https://reviews.llvm.org/D78459	2020-06-23 02:15:42 +02:00
Sam Clegg	41b638ec50	[WebAssembly] Add support for externref to MC and wasm-ld This allows code for handling externref values to be processed by the assembler and linker. Differential Revision: https://reviews.llvm.org/D81977	2020-06-22 15:57:24 -07:00
Christopher Tetreault	682e8217e0	[SVE] Remove calls to VectorType::getNumElements from ARM Reviewers: efriedma, greened, c-rhodes, david-arm, dmgreen Reviewed By: dmgreen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82216	2020-06-22 15:18:58 -07:00
Hans Wennborg	462b895849	Revert "[X86][SSE] MatchVectorAllZeroTest - handle OR vector reductions" This caused a Chromium test to miscompile. See discussion on the Phabricator review. > This patch extends MatchVectorAllZeroTest to handle OR vector reduction patterns where the result is compared against zero. > > Fixes PR45378 > > Differential Revision: https://reviews.llvm.org/D81547 This reverts 057c9c7ee00b7f7696065a3fc26a3df5ce3ebe96	2020-06-22 21:27:11 +02:00
Christopher Tetreault	2b2009995c	[SVE] Remove calls to VectorType::getNumElements from WebASM Summary: The getNumElements in base VectorType is being deprecated. See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html Reviewers: efriedma, tlively, fpetrogalli, c-rhodes, dschuff Reviewed By: tlively, dschuff Subscribers: dschuff, sbc100, tschuett, jgravelle-google, hiraditya, aheejin, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82217	2020-06-22 12:25:08 -07:00
Francesco Petrogalli	aa3627ac28	[sve][acle] Add SVE BFloat16 extensions. Summary: List of intrinsics: svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op) For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4" Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82141	2020-06-22 16:53:02 +00:00
stozer	3078d95513	[DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions Following on from this RFC[0] from a while back, this is the first patch towards implementing variadic debug values. This patch specifically adds a set of functions to MachineInstr for performing operations specific to debug values, and replacing uses of the more general functions where appropriate. The most prevalent of these is replacing getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the operands corresponding to values will no longer be at index 0, but index 2 and upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been added for the other operands, along with some helper functions to replace oft-repeated code and operate on a variable number of value operands. [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html Differential Revision: https://reviews.llvm.org/D81852	2020-06-22 16:01:12 +01:00
Guillaume Chatelet	14e49035e3	[ARC] Add missing return statement	2020-06-22 14:57:29 +00:00
Simon Pilgrim	0e6d2e0cc0	[DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI. We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.	2020-06-22 14:24:39 +01:00
Jay Foad	f7e3d3a72e	[AMDGPU] Update more live intervals in SIWholeQuadMode This fixes various assertion failures that would otherwise be triggered by a later patch to move SIWholeQuadMode later in the pass pipeline. Differential Revision: https://reviews.llvm.org/D82190	2020-06-22 13:50:15 +01:00
Tim Corringham	db80bc93e7	[AMDGPU] clang-format of SIModeRegister.cpp Ran clang-format just to ease future reviews. No functional changes.	2020-06-22 13:31:52 +01:00
Anton Korobeynikov	2379b4e67b	Revert "[MSP430] Update register names" This reverts commit 8f6620f663031da2bb35b788239f4b607271af84.	2020-06-22 13:37:22 +03:00
Anatoly Trosinenko	0ebad7ad63	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. Differential Revision: https://reviews.llvm.org/D82184	2020-06-22 13:24:03 +03:00
Anatoly Trosinenko	b2e4c430fd	[MSP430] Enable some basic support for debug information This commit technically permits LLVM to emit the debug information for ELF files for MSP430 architecture. Aside from this, it only defines the register numbers as defined by part 10.1 of MSP430 EABI specification (assuming the 1-byte subregisters share the register numbers with corresponding full-size registers). This commit was basically tested by me with TI-provided GCC 8.3.1 toolchain by compiling an example program with `clang` (please note manual linking may be required due to upstream `clang` not yet handling the `-msim` option necessary to run binaries on the GDB-provided simulator) and then running it and single-stepping with `msp430-elf-gdb` like this: ``` $sysroot/bin/msp430-elf-gdb ./test -ex "target sim" -ex "load ./test" (gdb) ... traditional GDB commands follow ... ``` While this implementation is most probably far from completeness and is considered experimental, it can already help with debugging MSP430 programs as well as finding issues in LLVM debug info support for MSP430 itself. One of the use cases includes trying to find a point where UBSan check in a trap-on-error mode was triggered. The expected debug information format is described in the [MSP430 Embedded Application Binary Interface](http://www.ti.com/lit/an/slaa534/slaa534.pdf) specification, part 10. Differential Revision: https://reviews.llvm.org/D81488	2020-06-22 13:14:07 +03:00
Djordje Todorovic	1133d59d10	[CSInfo][MIPS] Don't describe parameters loaded by sub/super reg copy When describing parameter value loaded by a COPY instruction, consider case where needed Reg value is a sub- or super- register of the COPY instruction's destination register. Without this patch, compile process will crash with the assertion "TargetInstrInfo::describeLoadedValue can't describe super- or sub-regs for copy instructions". Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D82000	2020-06-22 10:49:02 +02:00
Michael Liao	da79991dc4	[amdgpu] Fix REL32 relocations with negative offsets. Summary: - The offset should be treated as a signed one. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82234	2020-06-21 23:09:03 -04:00
David Green	0987ead525	[CGP] Convert phi types If a collection of interconnected phi nodes is only ever loaded, stored or bitcast then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phis are passed across basic block boundaries. This has to be done in CodegenPrepare as it naturally straddles basic blocks. The algorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes to not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and Stores are bitcast to the correct type, which should then be folded into the load/store, changing its type. This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from aarch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extract element handling which increased the potency in some cases. This adds it with an option that defaults to off, and disabled for 32bit X86 due to potential issues around canonicalizing NaNs. Differential Revision: https://reviews.llvm.org/D81827	2020-06-21 15:54:17 +01:00
Simon Pilgrim	5724ae0b1e	[X86][SSE] Add SimplifyDemandedVectorEltsForTargetShuffle to handle target shuffle variable masks Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries. This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements. To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code. Differential Revision: https://reviews.llvm.org/D81791	2020-06-21 11:16:07 +01:00
Amy Kwan	048b03aa1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector signed char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Eric Christopher	396d126465	Typos around a -> an.	2020-06-20 14:04:48 -07:00
Simon Pilgrim	5d9eb15567	[X86] combineSetCCMOVMSK - consistently use CmpBits variable. NFCI. The comparison value should be the same size - I've added an assert to be absolutely certain.	2020-06-20 12:35:24 +01:00
Simon Pilgrim	92c7d023b2	[X86][SSE] Fold MOVMSK(PCMPEQ(X,0)) != -1 -> !PTESTZ(X,X) allof patterns	2020-06-20 12:17:32 +01:00
Eric Christopher	4f849f4d44	[Target] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist. This change affects an internal llvm command line option.	2020-06-20 00:06:39 -07:00
Craig Topper	0d00716d4c	[X86] Correct the implementation of ud1(a.k.a. ud2b) instruction. We were missing the modrm byte this instruction has according to current Intel SDM. Experiments with gcc indicate that different modrm values are chosen based on 2 operands so I've added those as well. I think our previous implementation was based on an older behavior of binutils that has since been changed.	2020-06-19 23:57:48 -07:00
Craig Topper	ef35532c13	[X86] Ignore bits 2:0 of the modrm byte when disassembling lfence, mfence, and sfence. These are documented as using modrm byte of 0xe8, 0xf0, and 0xf8 respectively. But hardware ignores bits 2:0. So 0xe9-0xef is treated the same as 0xe8. Similar for the other two. Fixing this required adding 8 new formats to the X86 instructions to convey this information. Could have gotten away with 3, but adding all 8 made for a more logical conversion from format to modrm encoding. I renumbered the format encodings to keep the register modrm formats grouped together.	2020-06-19 22:24:24 -07:00
Wang Rui	070b3d06c0	[Mips] Error if a non-immediate operand is used while an immediate is expected The 32-bit type relocation (R_MIPS_32) cannot be used for instructions below: ori $4, $4, start ori $4, $4, (start - .) We should print an error instead. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D81908	2020-06-19 22:08:59 -07:00
Carl Ritson	f2504628bd	[AMDGPU] Avoid use of V_READLANE into EXEC in SGPR spills Always prefer to clobber input SGPRs and restore them after the spill. This applies to both spills to VGPRs and scratch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81914	2020-06-20 12:10:47 +09:00
Heejin Ahn	3298d1e5b3	[WebAssembly] Remove TEEs when dests are unstackified When created in RegStackify pass, `TEE` has two destinations, where op0 is stackified and op1 is not. But it is possible that op0 becomes unstackified in `fixUnwindMismatches` function in CFGStackify pass when a nested try-catch-end is introduced, violating the invariant of `TEE`s destinations. In this case we convert the `TEE` into two `COPY`s, which will eventually be resolved in ExplicitLocals. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D81851	2020-06-19 14:55:21 -07:00
Amara Emerson	e3c37200f9	[AArch64][GlobalISel] Make G_SEXT_INREG legal and add selection support. We were defaulting to the lower action for this, resulting in SHL+ASHR sequences. On AArch64 we can do this in one instruction for an arbitrary extension using SBFM as we do for G_SEXT. Differential Revision: https://reviews.llvm.org/D81992	2020-06-19 13:20:41 -07:00
Stanislav Mekhanoshin	ca27e2e8a4	[AMDGPU] Some formatting fixes. NFC.	2020-06-19 09:02:59 -07:00
Piotr Sobczak	b9c5311c77	Revert "[AMDGPU] Select s_cselect" This caused some failures detected by the buildbot with expensive checks enabled. This reverts commit 4067de569f119a81419fbf2e79d5f3307dfdda5b.	2020-06-19 16:41:04 +02:00
dfukalov	c53b1ee136	[AMDGPU][CostModel] Add fneg cost estimation Summary: The estimation uses AMDGPUTargetLowering::isFNegFree() Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82065	2020-06-19 17:31:35 +03:00
Piotr Sobczak	3a8e847d39	[AMDGPU] Select s_cselect Summary: Add patterns to select s_cselect in the isel. Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81925	2020-06-19 16:17:46 +02:00


58084 Commits