llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

Author	SHA1	Message	Date
Florian Hahn	a84d68c5a7	[ConstraintElimination] Add test with pointer bitcast.	2021-02-02 17:36:05 +00:00
Fangrui Song	b110101bd6	[MC] Support SHF_GNU_RETAIN as section flag 'R' On Linux target triples, GNU as sets EI_OSABI to ELFOSABI_GNU when SHF_GNU_RETAIN is used. On `--freebsd`, it usually sets EI_OSABI to ELFOSABI_FREEBSD. GNU ld respects SHF_GNU_RETAIN only for ELFOSABI_FREEBSD/ELFOSABI_GNU. https://sourceware.org/bugzilla/show_bug.cgi?id=27282 MC doesn't set ELFOSABI_GNU for SHF_GNU_RETAIN/STB_GNU_UNIQUE/STT_GNU_IFUNC. MC assembled object files do not have special semantics in GNU ld. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D95730	2021-02-02 09:34:09 -08:00
Fangrui Song	4be3ad3853	[yaml2obj/obj2yaml/llvm-readobj] Support SHF_GNU_RETAIN In binutils, the flag is defined for ELFOSABI_GNU and ELFOSABI_FREEBSD. It can be used to mark a section as a GC root. In practice, the flag has generic semantics and can be applied to many EI_OSABI values, so we consider it generic. Differential Revision: https://reviews.llvm.org/D95728	2021-02-02 09:19:53 -08:00
Sanjay Patel	a9c32b94c7	[ExpandReductions] add test for fmin with FMF; NFC	2021-02-02 12:17:08 -05:00
Jeroen Dobbelaere	378758ee45	[InlineFunction] Only update noalias scopes once for an instruction. Inlining sometimes maps different instructions to be inlined onto the same instruction. We must ensure to only remap the noalias scopes once. Otherwise the scope might disappear (at best). This patch ensures that we only replace scopes for which the mapping is known. This approach is preferred over tracking which instructions we already handled in a SmallPtrSet, as that one will need more memory. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95862	2021-02-02 17:57:10 +01:00
David Green	e7a4a9e940	[ARM] Correct some tablegen operand types. NFC	2021-02-02 16:55:31 +00:00
Florian Hahn	4314d26857	[ConstraintElimination] Add nicer way to dump constraints (NFC). Use ConstraintSystem::dump(Names) to display the result of decomposing a condition.	2021-02-02 16:36:45 +00:00
David Green	91ae3d7f3f	[ARM] Mark MVE_VMOV_to_lane_32 as isInsertSubregLike This allows the peephole optimizer to know that a MVE_VMOV_to_lane_32 is the same as an insert subreg, allowing it to optimize some redundant lane moves. Differential Revision: https://reviews.llvm.org/D95433	2021-02-02 16:35:47 +00:00
Sebastian Neubauer	e79a30d836	[AMDGPU] Remove unused tmp register The temporary register is only used to compute the frame pointer. The frame pointer is overwritten and not used in between, so we can reuse the frame pointer for the computation, saving one register. Differential Revision: https://reviews.llvm.org/D95865	2021-02-02 17:17:54 +01:00
Sebastian Neubauer	1d5127c810	[AMDGPU] Save fp/bp after csr saves Saving callee-save registers happens in whole wave mode. Exec is saved to a free register, which can be reused to save the frame pointer. Therefore, saving the fp needs to happen after saving csrs. Differential Revision: https://reviews.llvm.org/D95861	2021-02-02 17:17:54 +01:00
Wenlei He	eddb3ffdfb	[CSSPGO] Factor out common part for CSSPGO inline and AFDO inline Refactoring SampleProfileLoader::inlineHotFunctions to use helpers from CSSPGO inlining and reduce similar code in the inlining loop, plus minor cleanup for AFDO path. This is a resubmit of D95024, with the build break and the overtightened assertion fixed. Test Plan:	2021-02-02 07:55:08 -08:00
Stefan Pintilie	020aa3c460	[PowerPC] Materialize 34 bit constants with pli on Power 10. NOTE: This patch was originally written by Anil Mahmud. His code has been rebased but otherwise left mostly unchanged. A new instruction on Power 10 allows for the materialization of 34 bit immediate values. This patch allows the compiler to take advantage of the new instruction in this situation. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D92879	2021-02-02 09:49:22 -06:00
David Green	bee1ac1e53	[ARM] Add MVE insert-of-extract pattern A v4i32 insert of an extract can become a simple lane move, as opposed to round-tripping via a GPR. This adds a pattern that turns a v4i32 insert-extract pair into an EXTRACT_SUBREG/INSERT_SUBREG, with the required COPY_TO_REGCLASS. These get better optimized into a simple lane move by the rest of the backend. Differential Revision: https://reviews.llvm.org/D95428	2021-02-02 15:15:04 +00:00
Roman Lebedev	0136cec13c	[InstCombine] Hoist inversion out of ashr's value operand (PR48995) This is yet another hint that we will eventually need InstCombineInverter, which would consistently sink inversions, but for that we'll need to consistently hoist inversions where possible, so let's do that here. Example of a proof: https://alive2.llvm.org/ce/z/78SbDq See https://bugs.llvm.org/show_bug.cgi?id=48995	2021-02-02 17:56:43 +03:00
Roman Lebedev	27b7c783ff	[NFC][InstCombine] Add tests for (~x) a>> y --> ~(x a>> y) fold (PR48995) See https://bugs.llvm.org/show_bug.cgi?id=48995	2021-02-02 17:56:31 +03:00
Tom Weaver	99a2c6afdf	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit df3e39f60b356ca9dbfc11e96e5fdda30afa7acb. introduced failing test instrprof-gc-sections.c causing build bot to fail: http://lab.llvm.org:8011/#/builders/53/builds/1184	2021-02-02 14:19:31 +00:00
David Green	b8cb37b9c8	[ARM] Extra shuffle tests. NFC	2021-02-02 14:16:42 +00:00
David Green	dc98bdfb7d	[ARM] Select VINS from vector inserts This patch adds tablegen patterns for pairs of i16/f16 insert/extracts. If we are inserting into two adjacent vector lanes (0 and 1 for example), we can use either a vmov;vins or vmovx;vins to insert the pair together, avoiding a round-trip from GPR registers. This is quite a large pattern with a number of EXTRACT_SUBREG/INSERT_SUBREG/ COPY_TO_REGCLASS nodes, but hopefully as most of those become copies all that will be cleaned up by further optimizations. The VINS pattern was also adjusted to allow it to represent that it is inserting into the top half of an existing register. Differential Revision: https://reviews.llvm.org/D95381	2021-02-02 13:50:02 +00:00
Simon Pilgrim	347d983c7a	[X86][SSE] LowerINSERT_VECTOR_ELT - pull out repeated EltSizeInBits calls. NFCI.	2021-02-02 13:45:18 +00:00
Sander de Smalen	45b8ee9aa4	NFC: Migrate SpeculateAroundPHIs to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: ctetreau Differential Revision: https://reviews.llvm.org/D95353	2021-02-02 13:32:45 +00:00
Sander de Smalen	53e3448991	NFC: Migrate SimpleLoopUnswitch to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D95352	2021-02-02 13:32:44 +00:00
Dmitry Preobrazhensky	e22eed53cb	[AMDGPU][MC] Corrected parsing of optional modifiers Fixed bugs in parsing of "no*" modifiers and improved errors handling. See https://bugs.llvm.org/show_bug.cgi?id=41282. Differential Revision: https://reviews.llvm.org/D95675	2021-02-02 14:52:29 +03:00
Simon Pilgrim	8caaa7e0d2	[X86][AVX512] Support variable-index vector insertion on AVX512 targets (PR47924) With predicate masks, AVX512 can efficiently perform variable-index vector insertion with 2 broadcasts + 1 comparison, avoiding a lot of aliased memory traffic. Differential Revision: https://reviews.llvm.org/D95779	2021-02-02 11:41:18 +00:00
Andrew Ng	dd96c8fb71	[X86] Fix disassembly of x86-64 GDTLS code sequence For x86-64 the REX.w prefix takes precedence over any other size override (i.e. 0x66). Therefore, for x86-64 when REX.w is present set 'hasOpSize' to false to ensure that any size override is ignored. Fixes PR48901. Differential Revision: https://reviews.llvm.org/D95682	2021-02-02 11:35:00 +00:00
Simon Pilgrim	0e7f51461d	[X86][AVX] Add missing VEX_WIG tags from VPACKUSDW/VPHSUBD/VPCMPISTRI/VPCMPISTRM/VPCMPESTRI/VPCMPESTRM Fixes PR48877 Differential Revision: https://reviews.llvm.org/D95801	2021-02-02 11:25:44 +00:00
David Green	505ef9e320	[ARM] Remove DLS lr, lr A DLS lr, lr instruction only moves lr to itself. It need not be emitted on its own, which saves an instruction in the loop preheader. Differential Revision: https://reviews.llvm.org/D78916	2021-02-02 11:09:31 +00:00
Adrian Kuegel	306c653648	Revert "[CSSPGO] Factor out common part for CSSPGO inline and AFDO inline" This reverts commit 9a03058d6322edb8abc803ba3e436cc62647d979.	2021-02-02 11:51:04 +01:00
Adrian Kuegel	930ba0d06e	Revert "Fix build break from D95024" This reverts commit 09cd849fdef2b2d3de2d0b0a5c512100957e0ef6.	2021-02-02 11:51:04 +01:00
David Green	2ad4eade18	[ARM] Regenerate LowOverheadLoops mir tests. NFC	2021-02-02 10:28:58 +00:00
David Sherwood	81a8f9f3aa	[SVE][LoopVectorize] Add masked load/store and gather/scatter support for SVE This patch updates IRBuilder::CreateMaskedGather/Scatter to work with ScalableVectorType and adds isLegalMaskedGather/Scatter functions to AArch64TargetTransformInfo. In addition I've fixed up isLegalMaskedLoad/Store to return true for supported scalar types, since this is what the vectorizer asks for. In LoopVectorize.cpp I've changed LoopVectorizationCostModel::getInterleaveGroupCost to return an invalid cost for scalable vectors, since currently this relies upon using shuffle vector for reversing vectors. In addition, in LoopVectorizationCostModel::setCostBasedWideningDecision I have assumed that the cost of scalarising memory ops is infinitely expensive. I have added some simple masked load/store and gather/scatter tests, including cases where we use gathers and scatters for conditional invariant loads and stores. Differential Revision: https://reviews.llvm.org/D95350	2021-02-02 09:52:39 +00:00
Benjamin Kramer	96428d51bc	Fold one-use variable into assert. NFCI. Avoids a warning in Release builds.	2021-02-02 10:50:48 +01:00
Sebastian Neubauer	8175a7d5b7	[AMDGPU] Mark epilog restores as frame-destroy I guess instructions were marked as frame-setup by accident, they are restores as part of the epilog. Differential Revision: https://reviews.llvm.org/D95783	2021-02-02 10:24:37 +01:00
Sebastian Neubauer	32c7642ef3	[AMDGPU] Clarify calling conv about inactive lanes So far, it was not specified what happens with the VGPRs of inactive lanes when functions are called. This patch explicitly mentions that the VGPR values of inactive lanes need to be preserved for all registers. This describes the current behavior, as only active lanes of registers are saved to scratch. Also, as the multi-lane nature of VGPRs is not properly modeled, we cannot determine the live VGPRs from inactive lanes at calls. So we cannot save them, even if we intended to do so. Differential Revision: https://reviews.llvm.org/D95610	2021-02-02 10:15:09 +01:00
Wenlei He	41fcc00b36	Fix build break from D95024	2021-02-02 01:01:12 -08:00
Wenlei He	60e5015150	[CSSPGO] Factor out common part for CSSPGO inline and AFDO inline Refactoring SampleProfileLoader::inlineHotFunctions to use helpers from CSSPGO inlining and reduce similar code in the inlining loop, plus minor cleanup for AFDO path. Test Plan: Differential Revision: https://reviews.llvm.org/D95024	2021-02-02 00:34:06 -08:00
Thomas Symalla	af31f24b1c	Fixed includes. Differential Revision: https://reviews.llvm.org/D93708	2021-02-02 09:14:54 +01:00
Thomas Symalla	298055ddcd	Fixed includes.	2021-02-02 09:14:54 +01:00
Thomas Symalla	a7e61e92bb	Reverted whitespace changes. Differential Revision: https://reviews.llvm.org/D90968	2021-02-02 09:14:54 +01:00
Thomas Symalla	b7a94c0dc2	Added missing includes.	2021-02-02 09:14:54 +01:00
Thomas Symalla	ba95ff2c1c	Renamed med3 opcode, removed superfluous copy.	2021-02-02 09:14:54 +01:00
Thomas Symalla	087ff79f1a	Removed the generic virtual register creations. Reworked the tests.	2021-02-02 09:14:54 +01:00
Thomas Symalla	f548399826	Implemented a MED3_S32 GIR opcode.	2021-02-02 09:14:53 +01:00
Thomas Symalla	ee664c032b	Added and used new target pseudo for v_cvt_pk_i16_i32, changes due to code review.	2021-02-02 09:14:53 +01:00
Thomas Symalla	43278a1cb3	Formatting changes	2021-02-02 09:14:53 +01:00
Thomas Symalla	46f1f49a56	Formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	8aacc5adcd	Updating formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	8955159f2d	Resolve formatting changes.	2021-02-02 09:14:53 +01:00
Thomas Symalla	38676d07e0	Code changes yielded from review.	2021-02-02 09:14:53 +01:00
Thomas Symalla	4c17035470	Fixed tests.	2021-02-02 09:14:53 +01:00
Thomas Symalla	ea43201600	Move step to PreLegalizer	2021-02-02 09:14:53 +01:00

210647 Commits