llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

Author	SHA1	Message	Date
Peter Collingbourne	4679dc84ef	AArch64: Prefer FP-relative debug locations in HWASANified functions. To help produce better diagnostics for stack use-after-return, we'd like to be able to determine the addresses of each HWASANified function's local variables given a small amount of information recorded on entry to the function. Currently we require all HWASANified functions to use frame pointers and record (PC, FP) on function entry. This works better than recording SP because FP cannot change during the function, unlike SP which can change e.g. due to dynamic alloca. However, most variables currently end up using SP-relative locations in their debug info. This prevents us from recomputing the address of most variables because the distance between SP and FP isn't recorded in the debug info. To address this, make the AArch64 backend prefer FP-relative debug locations when producing debug info for HWASANified functions. Differential Revision: https://reviews.llvm.org/D63300 llvm-svn: 364117	2019-06-22 00:06:51 +00:00
Tom Tan	b713d2117e	[COFF, ARM64] Fix encoding of debugtrap for Windows On Windows ARM64, intrinsic __debugbreak is compiled into brk #0xF000 which is mapped to llvm.debugtrap in Clang. Instruction brk #0xF000 is the defined break point instruction on ARM64 which is recognized by Windows debugger and exception handling code, so llvm.debugtrap should map to it instead of redirecting to llvm.trap (brk #1) as the default implementation. Differential Revision: https://reviews.llvm.org/D63635 llvm-svn: 364115	2019-06-21 23:38:05 +00:00
Matt Arsenault	47811338cc	AMDGPU: Fix not using s33 for scratch wave offset in kernels Fixes missing piece from r363990. llvm-svn: 364099	2019-06-21 20:04:02 +00:00
Craig Topper	4f5a568045	[X86] Add DAG combine to turn (vzmovl (insert_subvector undef, X, 0)) into (insert_subvector allzeros, (vzmovl X), 0) 128/256 bit scalar_to_vectors are canonicalized to (insert_subvector undef, (scalar_to_vector), 0). We have isel patterns that try to match this pattern being used by a vzmovl to use a 128-bit instruction and a subreg_to_reg. This patch detects the insert_subvector undef portion of this and pulls it through the vzmovl, creating a narrower vzmovl and an insert_subvector allzeros. We can then match the insert_subvector into a subreg_to_reg operation by itself. Then we can fall back on existing (vzmovl (scalar_to_vector)) patterns. Note, while the scalar_to_vector case is the motivating case I didn't restrict to just that case. I'm also wondering about shrinking any 256/512 vzmovl to an extract_subvector+vzmovl+insert_subvector(allzeros) but I fear that would have bad implications to shuffle combining. I also think there is more canonicalization we can do with vzmovl with loads or scalar_to_vector with loads to create vzload. Differential Revision: https://reviews.llvm.org/D63512 llvm-svn: 364095	2019-06-21 19:10:21 +00:00
Craig Topper	5aa82d65e9	[X86] Don't mark v64i8/v32i16 ISD::SELECT as custom unless they are legal types. We don't have any Custom handling during type legalization. Only operation legalization. Fixes PR42355 llvm-svn: 364093	2019-06-21 18:50:00 +00:00
Craig Topper	aa14b9b4c2	[X86] Add a debug print of the node in the default case for unhandled opcodes in ReplaceNodeResults. This should be unreachable, but bugs can make it reachable. This adds a debug print so we can see the bad node in the output when the llvm_unreachable triggers. llvm-svn: 364091	2019-06-21 18:49:21 +00:00
Simon Pilgrim	1d92268266	[X86][AVX] Combine INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) as shuffle Subvector shuffling often ends up as insert/extract subvector. llvm-svn: 364090	2019-06-21 18:35:04 +00:00
Amara Emerson	98ded02108	[AArch64][GlobalISel] Implement selection support for the new G_JUMP_TABLE and G_BRJT ops. With this we can now fully code generate jump tables, which is important for code size. Differential Revision: https://reviews.llvm.org/D63223 llvm-svn: 364086	2019-06-21 18:10:41 +00:00
Craig Topper	5cca3be196	[X86] Use vmovq for v4i64/v4f64/v8i64/v8f64 vzmovl. We already use vmovq for v2i64/v2f64 vzmovl. But we were using a blendpd+xorpd for v4i64/v4f64/v8i64/v8f64 under opt speed. Or movsd+xorpd under optsize. I think the blend with 0 or movss/d is only needed for vXi32 where we don't have an instruction that can move 32 bits from one xmm to another while zeroing upper bits. movq is no worse than blendpd on any known CPUs. llvm-svn: 364079	2019-06-21 17:24:21 +00:00
Amara Emerson	8eb83e55cc	[AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal. We sometimes get poor code size because constants of types < 32b are legalized as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the localizer can no longer sink them (although it's possible to extend it to do so). On AArch64 however s8 and s16 constants can be selected in the same way as s32 constants, with a mov pseudo into a W register. If we make s8 and s16 constants legal then we can avoid unnecessary truncates, they can be CSE'd, and the localizer can sink them as normal. There is a caveat: if the user of a smaller constant has to widen the sources, we end up with an anyext of the smaller typed G_CONSTANT. This can cause regressions because of the additional extend and missed pattern matching. To remedy this, there's a new artifact combiner to generate the wider G_CONSTANT if it's legal for the target. Differential Revision: https://reviews.llvm.org/D63587 llvm-svn: 364075	2019-06-21 16:43:50 +00:00
Stanislav Mekhanoshin	7c12bcccbf	[AMDGPU] hazard recognizer for fp atomic to s_denorm_mode This requires 3 wait states unless there is a wait or VALU in between. Differential Revision: https://reviews.llvm.org/D63619 llvm-svn: 364074	2019-06-21 16:30:14 +00:00
Simon Pilgrim	b8b304222c	[X86] isBinOp - move commutative ops to isCommutativeBinOp. NFCI. TargetLoweringBase::isBinOp checks isCommutativeBinOp as a fallback, so don't duplicate. llvm-svn: 364072	2019-06-21 16:23:28 +00:00
Simon Pilgrim	b6bee639c8	Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 364068	2019-06-21 16:11:18 +00:00
Sam Elliott	75a022ea2b	[RISCV] Add RISCV-specific TargetTransformInfo Summary: LLVM allows targets to provide information that guides optimisations made to LLVM IR. This is done with callbacks on a TargetTransformInfo object. This patch adds a TargetTransformInfo class for RISC-V. This will allow us to implement RISC-V specific callbacks as they become necessary. This commit also adds the getIntImmCost callbacks, and tests them with a simple constant hoisting test. Our immediate costs are on the conservative side, for the moment, but we prevent hoisting in most circumstances anyway. Previous review was on D63007 Reviewers: asb, luismarques Reviewed By: asb Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny Tags: #llvm Differential Revision: https://reviews.llvm.org/D63433 llvm-svn: 364046	2019-06-21 13:36:09 +00:00
Simon Tatham	3b314c392b	[ARM] Add MVE 64-bit GPR <-> vector move instructions. These instructions let you load half a vector register at once from two general-purpose registers, or vice versa. The assembly syntax for these instructions mentions the vector register name twice. For the move _into_ a vector register, the MC operand list also has to mention the register name twice (once as the output, and once as an input to represent where the unchanged half of the output register comes from). So we can conveniently assign one of the two asm operands to be the output $Qd, and the other $QdSrc, which avoids confusing the auto-generated AsmMatcher too much. For the move _from_ a vector register, there's no way to get round the fact that both instances of that register name have to be inputs, so we need a custom AsmMatchConverter to avoid generating two separate output MC operands. (And even that wouldn't have worked if it hadn't been for D60695.) Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62679 llvm-svn: 364041	2019-06-21 13:17:23 +00:00
Simon Tatham	e16469b097	[ARM] Add MVE vector instructions that take a scalar input. This adds the `MVE_qDest_rSrc` superclass and all its instances, plus a few other instructions that also take a scalar input register or two. I've also belatedly added custom diagnostic messages to the operand classes for odd- and even-numbered GPRs, which required matching changes in two of the existing MVE assembly test files. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62678 llvm-svn: 364040	2019-06-21 13:17:08 +00:00
Simon Pilgrim	d3e7305789	[X86] X86ISD::ANDNP is a (non-commutative) binop The sat add/sub tests still have unnecessary extract_subvector((vandnps ymm, ymm), 0) uses that should be split to (vandnps (extract_subvector(ymm, 0), extract_subvector(ymm, 0)), but it's getting better. llvm-svn: 364038	2019-06-21 12:42:39 +00:00
Simon Tatham	f48c4c76b3	[ARM] Add a batch of similarly encoded MVE instructions. Summary: This adds the `MVE_qDest_qSrc` superclass and all instructions that inherit from it. It's not the complete class of _everything_ with a q-register as both destination and source; it's a subset of them that all have similar encodings (but it would have been hopelessly unwieldy to call it anything like MVE_111x11100). This category includes add/sub with carry; long multiplies; halving multiplies; multiply and accumulate, and some more complex instructions. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62677 llvm-svn: 364037	2019-06-21 12:13:59 +00:00
Simon Pilgrim	63f0bd16c5	[X86] createMMXBuildVector - call with BuildVectorSDNode directly. NFCI. llvm-svn: 364030	2019-06-21 11:25:06 +00:00
Fangrui Song	4cfa8e915b	[ARM] Fix -Wimplicit-fallthrough after D62675 llvm-svn: 364028	2019-06-21 11:19:11 +00:00
Simon Tatham	84e2268946	[ARM] Add MVE vector compare instructions. Summary: These take a pair of vector registers to compare, and a comparison type (written in the form of an Arm condition suffix); they output a vector of booleans in the VPR register, where predication can conveniently use them. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62676 llvm-svn: 364027	2019-06-21 11:14:51 +00:00
Simon Pilgrim	a2428a85c5	[X86] combineAndnp - use isNOT instead of manually checking for (XOR x, -1) llvm-svn: 364026	2019-06-21 11:13:15 +00:00
Simon Pilgrim	c40e3f2037	[X86] foldVectorXorShiftIntoCmp - use isConstOrConstSplat. NFCI. Use the isConstOrConstSplat helper instead of inspecting the build vector manually. llvm-svn: 364024	2019-06-21 10:54:30 +00:00
Simon Pilgrim	d91d3ace18	[X86][AVX] isNOT - handle concat_vectors(xor X, -1, xor Y, -1) pattern llvm-svn: 364022	2019-06-21 10:44:15 +00:00
Simon Tatham	30a95c7cbf	[ARM] Add a batch of MVE floating-point instructions. Summary: This includes floating-point basic arithmetic (add/sub/multiply), complex add/multiply, unary negation and absolute value, rounding to integer value, and conversion to/from integer formats. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62675 llvm-svn: 364013	2019-06-21 09:35:07 +00:00
Fangrui Song	8219c5373b	Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC llvm-svn: 364006	2019-06-21 05:40:31 +00:00
Fangrui Song	ff4b9bc74b	[MIPS GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63541 llvm-svn: 364003	2019-06-21 01:51:50 +00:00
Matt Arsenault	21f8c99ea6	AMDGPU: Always use s33 for global scratch wave offset Every called function could possibly need this to calculate the absolute address of stack objects, and this avoids inserting a copy around every call site in the kernel. It's also somewhat cleaner to keep this in a callee saved SGPR. llvm-svn: 363990	2019-06-20 21:58:24 +00:00
Eli Friedman	5cb67c7c9d	[ARM GlobalISel] Add support for s64 G_ADD and G_SUB. Teach RegisterBankInfo to use the correct register class, and tell the legalizer it's legal. Everything else just works. The one thing that's slightly weird about this compared to SelectionDAG isel is that legalization can't distinguish between i64 and <1 x i64>, so we might end up with more NEON instructions than the user expects. Differential Revision: https://reviews.llvm.org/D63585 llvm-svn: 363989	2019-06-20 21:56:47 +00:00
Jinsong Ji	b770a2d90a	[PowerPC][NFC] Fix comments for AltVSXFMARel mapping. llvm-svn: 363987	2019-06-20 21:36:06 +00:00
Matt Arsenault	2b192f15b1	AMDGPU: Add intrinsics for DS GWS semaphore instructions llvm-svn: 363983	2019-06-20 21:11:42 +00:00
Matt Arsenault	d44a022d4e	AMDGPU: Insert mem_viol check loop around GWS pre-GFX9 It is necessary to emit this loop around GWS operations in case the wave is preempted pre-GFX9. llvm-svn: 363979	2019-06-20 20:54:32 +00:00
Craig Topper	e1c3ab10ae	[X86] Add BLSI to isUseDefConvertible. Summary: BLSI sets the C flag if the input is not zero. So if it's followed by a TEST of the input where only the Z flag is consumed, we can replace it with the opposite check of the C flag. We should be able to do the same for BLSMSK and BLSR, but the naive test case for those is being optimized to a subo by CodeGenPrepare. Reviewers: spatel, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63589 llvm-svn: 363957	2019-06-20 17:52:53 +00:00
Matt Arsenault	8ae5308ac9	AMDGPU: Fix ignoring DisableFramePointerElim in leaf functions The attribute can specify elimination for leaf or non-leaf, so it should always be considered. I copied this bug from AArch64, which probably should also be fixed. llvm-svn: 363949	2019-06-20 17:03:23 +00:00
Matt Arsenault	abe33e0352	AMDGPU: Treat undef as an inline immediate This should only matter in vectors with an undef component, since a full undef vector would have been folded out. llvm-svn: 363941	2019-06-20 16:01:09 +00:00
Simon Tatham	7bde4ccbd2	[ARM] Add a batch of MVE integer instructions. This includes integer arithmetic of various kinds (add/sub/multiply, saturating and not), and the immediate forms of VMOV and VMVN that load an immediate into all lanes of a vector. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62674 llvm-svn: 363936	2019-06-20 15:16:56 +00:00
Stanislav Mekhanoshin	2653a95667	[AMDGPU] gfx1010 core wave32 changes Differential Revision: https://reviews.llvm.org/D63204 llvm-svn: 363934	2019-06-20 15:08:34 +00:00
Simon Pilgrim	df0cf35680	[X86] LowerAVXExtend - handle ANY_EXTEND_VECTOR_INREG lowering as well. llvm-svn: 363922	2019-06-20 11:31:54 +00:00
Petar Avramovic	26e6dcea4c	[MIPS GlobalISel] Select integer to floating point conversions Select G_SITOFP and G_UITOFP for MIPS32. Differential Revision: https://reviews.llvm.org/D63542 llvm-svn: 363912	2019-06-20 09:05:02 +00:00
Petar Avramovic	7b95eb82be	[MIPS GlobalISel] Select floating point to integer conversions Select G_FPTOSI and G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D63541 llvm-svn: 363911	2019-06-20 08:52:53 +00:00
Craig Topper	d83e9a7c80	[X86] Remove memory instructions from isUseDefConvertible. The caller of this is looking for comparisons of the input to these instructions with 0. But the memory instructions' input is an address, not a value input in a register. llvm-svn: 363907	2019-06-20 04:58:40 +00:00
Craig Topper	8b3f514179	[X86] Add v64i8/v32i16 to several places in X86CallingConv.td where they seemed obviously missing. llvm-svn: 363906	2019-06-20 04:29:00 +00:00
Matt Arsenault	4534094a9d	AMDGPU: Don't clobber VCC in MUBUF addr64 emulation Introducing VCC defs during SIFixSGPRCopies is generally problematic. Avoid it by starting with the VOP3 form with the general condition register. This is the easiest to fix instance, but doesn't solve any specific problems I'm looking at. llvm-svn: 363904	2019-06-20 00:51:28 +00:00
Eli Friedman	dbb5894a25	[llvm-objdump] Switch between ARM/Thumb based on mapping symbols. The ARMDisassembler changes allow changing between ARM and Thumb mode based on the MCSubtargetInfo, rather than the Target, which simplifies the other changes a bit. I'm not really happy with adding more target-specific logic to tools/llvm-objdump/, but there isn't any easy way around it: the logic in question specifically applies to disassembling an object file, and that code simply isn't located in lib/Target, at least at the moment. Differential Revision: https://reviews.llvm.org/D60927 llvm-svn: 363903	2019-06-20 00:29:40 +00:00
Matt Arsenault	a9abe64167	AMDGPU: Consolidate some getGeneration checks This is incomplete, and ideally these would all be removed, but it's better to localize them to the subtarget first with comments about what they're for. llvm-svn: 363902	2019-06-19 23:54:58 +00:00
Matt Arsenault	ed29247314	AMDGPU: Undo sub x, c canonicalization for v2i16 Should avoid regression from D62341 llvm-svn: 363899	2019-06-19 23:37:43 +00:00
Simon Atanasyan	6d00d0274a	[mips] Mark the `lwupc` instruction as MIPS64 R6 only The "The MIPS64 Instruction Set Reference Manual" [1] states that the `lwupc` is MIPS64 Release 6 only. It should not be supported for 32-bit CPUs. [1] https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00087-2B-MIPS64BIS-AFP-6.06.pdf llvm-svn: 363886	2019-06-19 22:08:06 +00:00
Simon Atanasyan	c6a8025590	[mips] Add (GPR|PTR)_64 predicates to PseudoReturn64 and PseudoIndirectHazardBranch64 This patch is one of a series of patches. The goal is to make P5600 scheduler model complete and turn on the `CompleteModel` flag. llvm-svn: 363885	2019-06-19 22:07:46 +00:00
Matt Arsenault	495c7b3f90	AMDGPU: Fix folding immediate into readfirstlane through reg_sequence The def instruction for the vreg may not match, because it may be folding through a reg_sequence. The assert was overly conservative and not necessary. It's not actually important if DefMI really defined the register, because the fold that will be done cares about the def of the value that will be folded. For some reason copies aren't making it through the reg_sequence, although they should. llvm-svn: 363876	2019-06-19 20:44:15 +00:00
Peter Collingbourne	9936afd522	hwasan: Shrink outlined checks by 1 instruction. Turns out that we can save an instruction by folding the right shift into the compare. Differential Revision: https://reviews.llvm.org/D63568 llvm-svn: 363874	2019-06-19 20:40:03 +00:00

52598 Commits