llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00

Author	SHA1	Message	Date
Jessica Paquette	8e6b545e6e	[GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks In a few places in getInstrMapping, we check if use/def instructions for the instruction we're mapping have floating point constraints. We can improve this check and reduce the number of copies in GISel-compiled code if we make a couple observations: - For a def instruction, it only matters if the def instruction must always output a value stored on a FPR - For a use instruction, it only matters if the use instruction must always only take in values stored in FPRs This adds two new functions: - onlyUsesFP - onlyDefinesFP Then we can use those when we're checking the uses/defs instead. Without this patch, the load, unmerge, store, and select in the added test would have unnecessary copies. Differential Revision: https://reviews.llvm.org/D62426 llvm-svn: 361679	2019-05-24 23:08:45 +00:00
Jessica Paquette	64b8570062	[GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function Factor it out into a function, and replace places where we had the same check with the new function. Differential Revision: https://reviews.llvm.org/D62421 llvm-svn: 361677	2019-05-24 22:12:21 +00:00
Jessica Paquette	905dc0f2a9	[GlobalISel][AArch64] Improve register bank mappings for G_SELECT The fcsel and csel instructions differ in only the register banks they work on. So, they're entirely interchangeable otherwise. With this in mind, this does two things: - Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as the outputs. - Teach it to choose the best register bank mapping based off the constraints of the inputs and outputs. The "best" in this case means the one that requires the smallest number of copies to properly emit a fcsel/csel. For example, if the inputs are all already going to be on FPRs, we should emit a fcsel, even if the output is a GPR. This costs one copy to produce the result, but saves us from copying the inputs into GPRs. Also update the regbank-select.mir to check that we end up with the right select instruction. Differential Revision: https://reviews.llvm.org/D62267 llvm-svn: 361665	2019-05-24 19:35:25 +00:00
Nick Desaulniers	e9fdb3a112	[AArch64] check for INLINEASM_BR along w/ INLINEASM Summary: It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Reviewers: t.p.northover, peter.smith Reviewed By: peter.smith Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62402 llvm-svn: 361661	2019-05-24 19:00:13 +00:00
Cullen Rhodes	7236b9fc52	[AArch64][SVE2] Asm: support SVE2 String Processing Group Summary: Patch adds support for the SVE2 character match instructions MATCH and NMATCH. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62206 llvm-svn: 361627	2019-05-24 10:32:01 +00:00
Cullen Rhodes	c8f153b14d	[AArch64][SVE2] Asm: support SVE2 Narrowing Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift right narrow: * SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT, SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB, UQRSHRNT SVE2 integer add/subtract narrow high part: * ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT SVE2 saturating extract narrow: * SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62205 llvm-svn: 361624	2019-05-24 10:22:30 +00:00
Cullen Rhodes	0c30c890bf	[AArch64][SVE2] Asm: support SVE2 Accumulate Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift and insert: * SRI, SLI SVE2 bitwise shift right and accumulate: * SSRA, USRA, SRSRA, URSRA SVE2 complex integer add: * CADD, SQCADD SVE2 integer absolute difference and accumulate: * SABA, UABA SVE2 integer absolute difference and accumulate long: * SABALB, SABALT, UABALB, UABALT SVE2 integer add/subtract long with carry: * ADCLB, ADCLT, SBCLB, SBCLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62204 llvm-svn: 361622	2019-05-24 10:10:34 +00:00
Cullen Rhodes	8c9c640bf8	[AArch64][SVE2] Asm: add PMULLB/PMULLT instructions Summary: This patch adds support for the polynomial multiplication instructions PMULLB/PMULLT. The 64-bit source and 128-bit destination element variants are enabled with crypto extensions (+sve2-aes), similar to the NEON PMULL2 instruction. All other variants are enabled with +sve2. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62145 llvm-svn: 361619	2019-05-24 09:56:23 +00:00
Cullen Rhodes	79a36d0606	[AArch64][SVE2] Asm: add integer add/sub long/wide instructions Summary: Patch adds support for the following instructions: SVE2 integer add/subtract long: * SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT, SABDLB, SABDLT, UABDLB, UABDLT SVE2 integer add/subtract wide: * SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62142 llvm-svn: 361615	2019-05-24 09:28:27 +00:00
Cullen Rhodes	a2bc20e66c	[AArch64][SVE2] Asm: add various bitwise shift instructions Summary: This patch adds support for the SVE2 saturating/rounding bitwise shift left (predicated) group of instructions: * SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL, SQSHLR, UQSHLR, SQRSHLR, UQRSHLR Immediate forms of the SQSHL and UQSHL instructions are also added to the existing SVE bitwise shift by immediate (predicated) group, as well as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in this group are encoded similarly and are implemented using the same TableGen class with a minimal change (1 bit in encoding). The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62140 llvm-svn: 361612	2019-05-24 09:17:23 +00:00
Cullen Rhodes	5d9e48d62f	[AArch64][SVE2] Asm: add saturating add/sub instructions Summary: Patch adds support for the following instructions: * SQADD, UQADD, SUQADD, USQADD * SQSUB, UQSUB, SQSUBR, UQSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62130 llvm-svn: 361611	2019-05-24 09:06:37 +00:00
Cullen Rhodes	72174ee169	[AArch64][SVE2] Asm: fix overlapping bit Summary: Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The encodings are not affected as bit 20 is defined by the opc bits and this was overwriting the earlier error of setting bit 20 to 0. Raised by Momchil: https://reviews.llvm.org/D62130 Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62292 llvm-svn: 361609	2019-05-24 08:45:37 +00:00
Tim Northover	3994d25957	GlobalISel: support swifterror attribute on AArch64. swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608	2019-05-24 08:40:13 +00:00
Reid Kleckner	c46660a5e0	[AArch64] Preserve X8 for thunks ending in variadic musttail calls Summary: On Windows, X8 may be used to pass in the address of an aggregate that is returned indirectly. Therefore, it should be forwarded to variadic musttail calls and preserved in thunks. Fixes PR41997 Reviewers: mgrang, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62344 llvm-svn: 361585	2019-05-24 01:27:20 +00:00
Serge Pavlov	7d595fa212	[AArch64] Add nvcast patterns for v2f32 -> v1f64 Summary: Constant stores of f32 values can create such NvCast nodes. Reviewers: t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62285 llvm-svn: 361584	2019-05-24 01:20:34 +00:00
Alexey Lapshin	1664547e1d	[DebugInfo][AArch64] Recognise target specific instruction as mov instr This fix is for the problem from https://bugs.llvm.org/show_bug.cgi?id=38714. Specifically, Simple Register Coalescing creates the following conversion: undef %0.sub_32:gpr64 = ORRWrs $wzr, %3:gpr32common, 0, debug-location !24; It copies a 32-bit value from gpr32 into gpr64, but the Live DEBUG_VALUE analysis is not able to create a debug location record for that instruction. As a result, the debug info for the argc variable is incorrect. The fix is to write a custom isCopyInstrImpl() that recognizes the ORRWrs instruction. llvm-svn: 361417	2019-05-22 18:48:58 +00:00
Sjoerd Meijer	ef37506e4d	[AArch64] Subtarget crypto extension defaults The Armv8.2-A crypto extensions all defaulted to true, but should default to false, like all the other extensions. Differential Revision: https://reviews.llvm.org/D62180 llvm-svn: 361354	2019-05-22 07:10:27 +00:00
Florian Hahn	fc5d6589f6	[AArch64] Skip mask checks for masks with an odd number of elements. Some checks in isShuffleMaskLegal expect an even number of elements, e.g. isTRN_v_undef_Mask or isUZP_v_undef_Mask, otherwise they access invalid elements and crash. This patch adds checks to the impacted functions. Fixes PR41951 Reviewers: t.p.northover, dmgreen, samparker Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D60690 llvm-svn: 361235	2019-05-21 10:05:26 +00:00
Cullen Rhodes	861f8b4a86	[AArch64][SVE2] Asm: add integer unary instructions (predicated) Summary: Patch adds support for the following instructions: * URECPE, URSQRTE, SQABS, SQNEG The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62129 llvm-svn: 361230	2019-05-21 09:06:51 +00:00
Cullen Rhodes	855ae54f60	[AArch64][SVE2] Asm: add integer pairwise arithmetic instructions Summary: Patch adds support for the following instructions: ADDP, SMAXP, UMAXP, SMINP, UMINP The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62128 llvm-svn: 361229	2019-05-21 08:59:00 +00:00
Martin Storsjo	d4561248c0	[AArch64] Handle lowering lround on windows, where long is 32 bit Differential Revision: https://reviews.llvm.org/D62108 llvm-svn: 361192	2019-05-20 19:53:28 +00:00
Cullen Rhodes	31fcfd404f	[AArch64][SVE2] Asm: add SADALP and UADALP instructions Summary: This patch adds support for the integer pairwise add and accumulate long instructions SADALP/UADALP. These instructions are predicated. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62001 llvm-svn: 361154	2019-05-20 13:50:15 +00:00
Cullen Rhodes	9d12f27a52	[AArch64][SVE2] Asm: add int halving add/sub (predicated) instructions Summary: This patch adds support for the predicated integer halving add/sub instructions: * SHADD, UHADD, SRHADD, URHADD * SHSUB, UHSUB, SHSUBR, UHSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D62000 llvm-svn: 361136	2019-05-20 10:35:23 +00:00
Cullen Rhodes	54ce3bc691	[AArch64][SVE2] Asm: add saturating multiply-add interleaved long instructions Summary: Patch adds support for SQDMLALBT and SQDMLSLBT instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61998 llvm-svn: 361135	2019-05-20 10:29:48 +00:00
Cullen Rhodes	7adb348bc8	[AArch64][SVE2] Asm: add saturating multiply-add long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SQDMLALB, SQDMLALT, SQDMLSLB, SQDMLSLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D61997 llvm-svn: 361005	2019-05-17 09:29:43 +00:00
Cullen Rhodes	2272f32d5a	[AArch64][SVE2] Asm: add integer multiply-add long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SMLALB, SMLALT, UMLALB, UMLALT, SMLSLB, SMLSLT, UMLSLB, UMLSLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61951 llvm-svn: 361003	2019-05-17 09:19:41 +00:00
Cullen Rhodes	6bda113a4e	[AArch64][SVE2] Asm: add integer multiply long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SMULLB, SMULLT, UMULLB, UMULLT, SQDMULLB, SQDMULLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61936 llvm-svn: 361002	2019-05-17 09:04:44 +00:00
Fangrui Song	f78f3148bd	[AArch64] Support .reloc , R_AARCH64_NONE, Summary: This can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D61973 llvm-svn: 360981	2019-05-17 03:05:07 +00:00
Adhemerval Zanella	101145b734	[AArch64] Handle ISD::LROUND and ISD::LLROUND This patch optimizes ISD::LROUND and ISD::LLROUND to fcvtas instruction. It currently only handles the scalar version. llvm-svn: 360894	2019-05-16 13:30:18 +00:00
Cullen Rhodes	7ef8584543	[AArch64][SVE2] Asm: implement CMLA/SQRDCMLAH instructions Summary: This patch adds support for the indexed and unpredicated vectors forms of the CMLA and SQRDCMLAH instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D61906 llvm-svn: 360871	2019-05-16 09:42:22 +00:00
Cullen Rhodes	2eaf0460bc	[AArch64][SVE2] Asm: implement CDOT instruction Summary: The complex DOT instructions perform a dot-product on quad-tuplets from two source vectors and the resulting wide real or wide imaginary is accumulated into the destination register. The instructions come in two forms: Vector form, e.g. cdot z0.s, z1.b, z2.b, #90 - complex dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. The complex numbers in the second source vector are rotated by 90 degrees. cdot z0.d, z1.h, z2.h, #180 - complex dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. The complex numbers in the second source vector are rotated by 180 degrees. Indexed form, e.g. cdot z0.s, z1.b, z2.b[3], #0 - complex dot product on four 8-bit quad-tuplets, with specified quad-tuplet from second source vector, accumulating results in 32-bit elements. cdot z0.d, z1.h, z2.h[1], #0 - complex dot product on four 16-bit quad-tuplets, with specified quad-tuplet from second source vector, accumulating results in 64-bit elements. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer, rovka Differential Revision: https://reviews.llvm.org/D61903 llvm-svn: 360870	2019-05-16 09:33:44 +00:00
Cullen Rhodes	f96af16cbf	[AArch64][SVE2] Asm: add unpredicated integer multiply instructions Summary: Add support for the following instructions: * MUL (indexed and unpredicated vectors forms) * SQDMULH (indexed and unpredicated vectors forms) * SQRDMULH (indexed and unpredicated vectors forms) * SMULH (unpredicated, predicated form added in SVE) * UMULH (unpredicated, predicated form added in SVE) * PMUL (unpredicated) The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer, rovka Differential Revision: https://reviews.llvm.org/D61902 llvm-svn: 360867	2019-05-16 09:07:26 +00:00
Mandeep Singh Grang	4fdc3d2e09	[AArch64] only indicate CFI on Windows if we emitted CFI Summary: Otherwise, we emit directives for CFI without any actual CFI opcodes to go with them, which causes tools to malfunction. The technique is similar to what the x86 backend already does. Fixes https://bugs.llvm.org/show_bug.cgi?id=40876 Patch by: froydnj (Nathan Froyd) Reviewers: mstorsjo, eli.friedman, rnk, mgrang, ssijaric Reviewed By: rnk Subscribers: javed.absar, kristof.beyls, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D61960 llvm-svn: 360816	2019-05-15 21:23:41 +00:00
Richard Trieu	4e494af6c5	[AArch64] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360709	2019-05-14 21:33:53 +00:00
Cullen Rhodes	81830575d0	[AArch64][SVE2] Asm: add SQRDMLAH/SQRDMLSH instructions Summary: This patch adds support for the indexed and unpredicated vectors forms of the SQRDMLAH and SQRDMLSH instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61515 llvm-svn: 360683	2019-05-14 15:10:16 +00:00
Cullen Rhodes	dc081e21c0	[AArch64][SVE2] Asm: add integer multiply-add/subtract (indexed) instructions Summary: This patch adds support for the following instructions: MLA mul-add, writing addend (Zda = Zda + Zn * Zm[idx]) MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm[idx]) Predicated forms of these instructions were added in SVE. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61514 llvm-svn: 360682	2019-05-14 15:01:00 +00:00
Tim Northover	18a8d29140	AArch64: support binutils-like things on arm64_32. This adds support for the arm64_32 watchOS ABI to LLVM's low level tools, teaching them about the specific MachO choices and constants needed to disassemble things. llvm-svn: 360663	2019-05-14 11:25:44 +00:00
Cullen Rhodes	f378d68edf	[AArch64][SVE2] Add SVE2 target features to backend and TargetParser Summary: This patch adds the following features defined by Arm SVE2 architecture extension: sve2, sve2-aes, sve2-sm4, sve2-sha3, bitperm For existing CPUs these features are declared as unsupported to prevent scheduler errors. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewers: SjoerdMeijer, sdesmalen, ostannard, rovka Reviewed By: SjoerdMeijer, rovka Subscribers: rovka, javed.absar, tschuett, kristof.beyls, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61513 llvm-svn: 360573	2019-05-13 10:10:24 +00:00
Richard Trieu	29ecf8ea25	[AArch64] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360486	2019-05-10 23:50:01 +00:00
Bill Wendling	3bb1a9e50f	Add ".dword" directive Summary: The ".dword" directive is a synonym for ".xword" and is used by klibc, a minimalistic libc subset for initramfs. Reviewers: t.p.northover, nickdesaulniers Reviewed By: nickdesaulniers Subscribers: nickdesaulniers, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61719 llvm-svn: 360381	2019-05-09 21:57:44 +00:00
Simon Pilgrim	3a3c43bf2f	[AArch64] Remove scan-build "Value stored during its initialization is never read" warnings. NFCI. llvm-svn: 360268	2019-05-08 16:29:39 +00:00
Simon Pilgrim	9a42a780eb	[AArch64] Fix scan-build null/uninitialized pointer warnings. NFCI. llvm-svn: 360267	2019-05-08 16:27:24 +00:00
Martin Storsjo	ed4e27a076	[AArch64] Default to SEH exception handling on MinGW The SEH implementation is pretty mature at this point. Differential Revision: https://reviews.llvm.org/D61590 llvm-svn: 360080	2019-05-06 21:18:15 +00:00
Amara Emerson	79073a3227	[GlobalISel] Handle <1 x T> vector return types properly. After support for dealing with types that need to be extended in some way was added in r358032 we didn't correctly handle <1 x T> return types. These types don't have a GISel direct representation, instead we just see them as scalars. When we need to pad them into <2 x T> types however we need to use a G_BUILD_VECTOR instead of trying to do a G_CONCAT_VECTOR. This fixes PR41738. llvm-svn: 360068	2019-05-06 19:41:01 +00:00
Jessica Paquette	db19e9797c	[AArch64][GlobalISel] Use fcsel instead of csel for G_SELECT on FPRs This saves us some unnecessary copies. If the inputs to a G_SELECT are floating point, we should use fcsel rather than csel. Changes here are... - Teach selectCopy about s1-to-s1 copies across register banks. - Teach AArch64RegisterBankInfo about G_SELECT in general. - Teach the instruction selector about the FCSEL instructions. Also add two tests: - select-select.mir to show that we get the expected FCSEL - regbank-select.mir (unfortunately named) to show the register banks on G_SELECT are properly preserved And update fast-isel-select.ll to show that we do the same thing as other instruction selectors in these cases. llvm-svn: 359940	2019-05-03 22:37:46 +00:00
Mandeep Singh Grang	ab504d3276	[COFF, ARM64] Fix ABI implementation of struct returns Summary: Refer to the ABI doc at: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values Related clang patch: D60349 Reviewers: rnk, efriedma, TomTan, ssijaric Reviewed By: rnk, efriedma Subscribers: mstorsjo, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60348 llvm-svn: 359934	2019-05-03 21:12:36 +00:00
Simon Pilgrim	2628d88f52	Avoid cppcheck operator precedence warnings. NFCI. Prefer ((X & Y) ? A : B) to (X & Y ? A : B) llvm-svn: 359884	2019-05-03 13:50:38 +00:00
Eli Friedman	f714f84cd7	[AArch64][MC] Reject "add x0, x1, w2, lsl #1 " etc. Looks like just a minor oversight in the parsing code. Fixes https://bugs.llvm.org/show_bug.cgi?id=41504. Differential Revision: https://reviews.llvm.org/D60840 llvm-svn: 359855	2019-05-03 00:59:52 +00:00
Evandro Menezes	46d8190766	[AArch64] Update for Exynos Fix the forwarding of multiplication results for Exynos M4. llvm-svn: 359834	2019-05-02 22:01:39 +00:00
Jessica Paquette	62380e005c	[GlobalISel][AArch64] Use fmov for G_FCONSTANT when possible This adds support for using fmov rather than a standard mov to materialize G_FCONSTANT when it's safe to do so. Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the selection is correct. llvm-svn: 359734	2019-05-01 22:39:43 +00:00
Sjoerd Meijer	8238653a0e	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537	2019-04-30 08:38:12 +00:00
Jessica Paquette	c024feffc8	[GlobalISel][AArch64] Select llvm.aarch64.crypto.sha1h This was falling back and gives us a reason to create a selectIntrinsic function which we would need eventually anyway. Update arm64-crypto.ll to show that we correctly select it. Also factor out the code for finding an intrinsic ID. llvm-svn: 359501	2019-04-29 20:58:17 +00:00
Cullen Rhodes	ffbe24d53e	[AArch64][SVE] Asm: add aliases for unpredicated bitwise logical instructions This patch adds aliases for element sizes .B/.H/.S to the AND/ORR/EOR/BIC bitwise logical instructions. The assembler now accepts these instructions with all element sizes up to 64-bit (.D). The preferred disassembly is .D. llvm-svn: 359457	2019-04-29 15:27:27 +00:00
Jessica Paquette	2be787ea59	[GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for extracts getConstantVRegValWithLookThrough does the same thing as the getConstantValueForReg function, and has more visibility across GISel. Plus, it supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code reuse and more functionality for free by using it. Add some test cases to select-extract-vector-elt.mir to show that we can now look through those instructions. llvm-svn: 359351	2019-04-26 21:53:13 +00:00
Nick Desaulniers	02a0e7f7fc	[AsmPrinter] refactor to support %c w/ GlobalAddress' Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overridden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337	2019-04-26 18:45:04 +00:00
Jessica Paquette	1b1fd6193b	[AArch64][GlobalISel] Select G_BSWAP for vectors of s32 and s64 There are instructions for these, so mark them as legal. Select the correct instruction in AArch64InstructionSelector.cpp. Update select-bswap.mir and arm64-rev.ll to reflect the changes. llvm-svn: 359331	2019-04-26 18:00:01 +00:00
Hans Wennborg	83fd9bda79	Fix alignment in AArch64InstructionSelector::emitConstantPoolEntry() The code was using the alignment of a pointer to the value, not the alignment of the constant itself. Maybe we got away with it so far because the pointer alignment is fairly high, but we did end up under-aligning <16 x i8> vectors, which was caught in the Chromium build after lld stopped over-aligning the .rodata.cst16 section in r356428. (See crbug.com/953815) Differential revision: https://reviews.llvm.org/D61124 llvm-svn: 359287	2019-04-26 08:31:00 +00:00
Jessica Paquette	620cbaeb4e	[GlobalISel][AArch64] Make G_EXTRACT_VECTOR_ELT legal for v8s16s This case was missing before, so we couldn't legalize it. Add it to AArch64LegalizerInfo.cpp and update select-extract-vector-elt.mir. llvm-svn: 359231	2019-04-25 20:00:57 +00:00
Jessica Paquette	a49fa69bbb	[GlobalISel][AArch64] Add generic legalization rule for extends This adds a legalization rule for G_ZEXT, G_ANYEXT, and G_SEXT which allows extends whenever the types will fit in registers (or the source is an s1). Update tests. Add GISel checks throughout all of arm64-vabs.ll, where we now select a good portion of the code. Add GISel checks to arm64-subvector-extend.ll, which has a good number of vector extends in it. Differential Revision: https://reviews.llvm.org/D60889 llvm-svn: 359222	2019-04-25 18:42:00 +00:00
Jessica Paquette	fd3dd81348	[GlobalISel][AArch64] Legalize G_FNEARBYINT Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc. Since the importer allows us to automatically select this after legalization, also add tests for selection etc. Also update arm64-vfloatintrinsics.ll. llvm-svn: 359204	2019-04-25 16:44:40 +00:00
Jessica Paquette	a799766d83	[AArch64][GlobalISel] Select G_INTRINSIC_ROUND Add selection support for G_INTRINSIC_ROUND, add a selection test, and add check lines to arm64-vfloatintrinsics.ll and f16-instructions.ll. llvm-svn: 359046	2019-04-23 23:03:03 +00:00
Jessica Paquette	6cf54d8ec9	[AArch64][GlobalISel] Mark G_INTRINSIC_ROUND as a pre-isel floating point opcode Add G_INTRINSIC_ROUND to isPreISelGenericFloatingPointOpcode to ensure that its input and output are assigned the correct register bank. Add a regbankselect test to verify that we get what we expect here. llvm-svn: 359044	2019-04-23 22:47:00 +00:00
Jessica Paquette	b31b0d9e64	[AArch64][GlobalISel] Legalize G_INTRINSIC_ROUND Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing switch case to AArch64LegalizerInfo.cpp. llvm-svn: 359033	2019-04-23 21:11:57 +00:00
Jessica Paquette	d298f73700	[AArch64][GlobalISel] Actually select G_INTRINSIC_TRUNC Apparently FileCheck wasn't actually matching the fallback check lines in arm64-vfloatintrinsics.ll properly. So, there were selection fallbacks for G_INTRINSIC_TRUNC there. Actually hook it up into AArch64InstructionSelector.cpp and write a proper selection test. I guess I'll figure out the FileCheck magic to make the fallback checks work properly in arm64-vfloatintrinsics.ll. llvm-svn: 359030	2019-04-23 20:46:19 +00:00
Jessica Paquette	0651875195	[AArch64][GlobalISel] Teach regbankselect about G_INTRINSIC_TRUNC Add it to isPreISelGenericFloatingPointOpcode, and add a regbankselect test. Update arm64-vfloatintrinsics.ll now that we can select it. llvm-svn: 359022	2019-04-23 18:20:47 +00:00
Jessica Paquette	ff3cf1d228	[AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC Same patch as G_FCEIL etc. Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test. llvm-svn: 359021	2019-04-23 18:20:44 +00:00
Jessica Paquette	d1184e3a9f	[AArch64][GlobalISel] Legalize G_FMA for more vector types Same as G_FCEIL, G_FABS, etc. Just move it into that rule. Add a legalizer test for G_FMA, which we didn't have before and update arm64-vfloatintrinsics.ll. llvm-svn: 359015	2019-04-23 17:37:56 +00:00
Jessica Paquette	2b697e981f	[AArch64][GlobalISel] Add G_FMA to isPreISelGenericFloatingPointOpcode Noticed an unnecessary fallback in arm64-vmul caused by this. Also add a regbankselect test for G_FMA. llvm-svn: 359013	2019-04-23 17:17:06 +00:00
Simon Pilgrim	f3464e0685	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 358969	2019-04-23 11:11:34 +00:00
Javed Absar	39dfbd5229	[AArch64] Add support for MTE intrinsics This patch provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. The intrinsics are described in detail in the latest ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed by: David Spickett Differential Revision: https://reviews.llvm.org/D60486 llvm-svn: 358963	2019-04-23 09:39:58 +00:00
Amara Emerson	4cebcd0399	Revert r358800. Breaks Obsequi from the test suite. The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with a different issue. llvm-svn: 358829	2019-04-20 21:25:00 +00:00
Amara Emerson	8abbc39fc7	Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"" We were shifting the wrong component of a split load when trying to combine them back into a single value. llvm-svn: 358800	2019-04-19 23:54:44 +00:00
Jessica Paquette	d57f5959df	[GlobalISel][AArch64] Legalize + select G_FRINT Exactly the same as G_FCEIL, G_FABS, etc. Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc. Differential Revision: https://reviews.llvm.org/D60895 llvm-svn: 358799	2019-04-19 23:41:52 +00:00
Eli Friedman	1ee278d235	[AArch64] Fix checks for AArch64MCExpr::VK_SABS flag. VK_SABS is part of the SymLoc bitfield in the variant kind which should be compared for equality, not by checking the VK_SABS bit. As far as I know, the existing code happened to produce the correct results in all cases, so this is just a cleanup. Patch by Stephen Crane. Differential Revision: https://reviews.llvm.org/D60596 llvm-svn: 358788	2019-04-19 21:58:10 +00:00
Amara Emerson	771972a5c7	Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores" This introduces some runtime failures which I'll need to investigate further. llvm-svn: 358771	2019-04-19 17:42:13 +00:00
Jessica Paquette	26fc1186e9	[GlobalISel][AArch64] Legalize vector G_FPOW This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc. Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change. Differential Revision: https://reviews.llvm.org/D60218 llvm-svn: 358764	2019-04-19 16:28:08 +00:00
Bjorn Pettersson	e9abad235c	[CodeGen] Add "const" to MachineInstr::mayAlias Summary: The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const). The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetTransformInfo::getMemOperandWithOffset. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60856 llvm-svn: 358744	2019-04-19 09:08:38 +00:00
Jessica Paquette	b1d320f69e	[GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8s This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s. We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic. This adds legalizer support for those 3, which gives us selection via the importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and add a GISel line to arm64-vabs.ll. Differential Revision: https://reviews.llvm.org/D60881 llvm-svn: 358715	2019-04-18 21:15:48 +00:00
Jessica Paquette	872bc88b70	[GlobalISel][AArch64] Legalize v8s8 loads Add legalizer support for loads of v8s8 and update legalize-load-store.mir. Differential Revision: https://reviews.llvm.org/D60877 llvm-svn: 358714	2019-04-18 21:13:58 +00:00
Nick Desaulniers	85ce9e7ccd	[AsmPrinter] hoist %a output template to base class for ARM+Aarch64 Summary: X86 is quite complicated; so I intend to leave it as is. ARM+Aarch64 do basically the same thing (Aarch64 did not correctly handle immediates, ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses %a with an immediate) for a flag that should be target independent anyways. Reviewers: echristo, peter.smith Reviewed By: echristo Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60841 llvm-svn: 358618	2019-04-17 22:21:10 +00:00
Amara Emerson	c8667e9456	[GlobalISel] Add legalization support for non-power-2 loads and stores Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613	2019-04-17 21:30:07 +00:00
Amara Emerson	5d4c0a60e2	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Amara Emerson	442ab2552d	[GlobalISel] Introduce a CSEConfigBase class to allow targets to define their own CSE configs. Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE configs that can be passed between TargetPassConfig and the targets' custom pass configs. This CSEConfigBase allows targets to create custom CSE configs which is then used by the GISel passes for the CSEMIRBuilder. This support will be used in a follow up commit to allow constant-only CSE for -O0 compiles in D60580. llvm-svn: 358368	2019-04-15 04:53:46 +00:00
Amara Emerson	249244f9ca	[AArch64][GlobalISel] Enable copy elision in the pre-legalizer combine and fix a crash. This enables the simple copy combine that already exists in the CombinerHelper. However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a set of MIs to process, and so would end up causing a crash when deleted MIs were being added to the combiner worklist again. Differential Revision: https://reviews.llvm.org/D60579 llvm-svn: 358318	2019-04-13 00:33:25 +00:00
Amara Emerson	f16c13f916	[AArch64][GlobalISel] Fix a crash when selecting shufflevectors with an undef mask element. If a shufflevector's mask vector has an element with "undef" then the generic instruction defining that element register is a G_IMPLICIT_DEF instead of G_CONSTANT. This fixes the selector to handle this case, and for now assumes that undef just means zero. In future we'll optimize this case properly. llvm-svn: 358312	2019-04-12 21:31:21 +00:00
Eric Christopher	b7a05e858c	Add explicit dependencies on MCSection.h and MCDwarf.h to the .cpp files rather than rely on transitive includes from MCStreamer.h. llvm-svn: 358263	2019-04-12 07:40:01 +00:00
Amara Emerson	19a1782c85	[AArch64][GlobalISel] Flesh out vector load/store support for more types. Some of these were legalizing into smaller vector types unnecessarily, others were simply not supported yet. llvm-svn: 358223	2019-04-11 20:40:01 +00:00
Amara Emerson	a14e5c054e	[AArch64][GlobalISel] Legalization and ISel support for load/stores of vectors of pointers. Loads and store of values with type like <2 x p0> currently don't get imported because SelectionDAG has no knowledge of pointer types. To leverage the existing support for vector load/stores, we can bitcast the value to have s64 element types instead. We do this as a custom legalization. This patch also adds support for general loads of <2 x s64>, and relaxes some type conditions on selecting G_BITCAST. Differential Revision: https://reviews.llvm.org/D60534 llvm-svn: 358221	2019-04-11 20:32:24 +00:00
Diogo N. Sampaio	db0243a51e	[AArch64] Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64 Summary: Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64 Reviewers: pbarrio, DavidSpickett, LukeGeeson Reviewed By: LukeGeeson Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60259 llvm-svn: 358171	2019-04-11 14:19:43 +00:00
Amara Emerson	81b0599afa	[AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal. The existing isel support already works for p0 once the legalizer accepts it. llvm-svn: 358144	2019-04-10 23:06:14 +00:00
Amara Emerson	b7a6d81ca1	[AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD. llvm-svn: 358143	2019-04-10 23:06:11 +00:00
Amara Emerson	ba18fe7b6a	[AArch64][GlobalISel] Scalarize vector SDIV. llvm-svn: 358142	2019-04-10 23:06:08 +00:00
Craig Topper	441f15067b	[AArch64] Teach getTestBitOperand to look through ANY_EXTENDS This patch teaches getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based on what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression. Differential Revision: https://reviews.llvm.org/D60482 llvm-svn: 358108	2019-04-10 17:27:29 +00:00
Nick Desaulniers	ba87dcb4ad	[AsmPrinter] refactor to remove AsmVariant. NFC Summary: The InlineAsm::AsmDialect is only required for X86; no other architecture makes use of it and as such it gets passed around between arch-specific and general code while being unused for all architectures but X86. Since the AsmDialect is queried from a MachineInstr, which we also pass around, remove the additional AsmDialect parameter and query for it deep in the X86AsmPrinter only when needed/as late as possible. This refactor should help later planned refactors to AsmPrinter, as this difference in the X86AsmPrinter makes it harder to make AsmPrinter more generic. Reviewers: craig.topper Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60488 llvm-svn: 358101	2019-04-10 16:38:43 +00:00
Diogo N. Sampaio	072614629e	[AArch64] Add lowering pattern for scalar fp16 facge and facgt Summary: The fp16 scalar version of facge and facgt requires custom pattern matching, as the result type is not the same width as the operands. Reviewers: olista01, javed.absar, pbarrio Reviewed By: javed.absar Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60212 llvm-svn: 358083	2019-04-10 13:34:18 +00:00
Clement Courbet	2c7520f781	[NFC] Fix unused variable warning. llvm-svn: 358080	2019-04-10 13:18:05 +00:00
Amara Emerson	98071b2dbb	[AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHL The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have a GISel native selection patterns that we can write in tablegen to improve on this. For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does. Differential Revision: https://reviews.llvm.org/D60436 llvm-svn: 358035	2019-04-09 21:22:43 +00:00
Amara Emerson	de0bcea47f	[AArch64][GlobalISel] Legalize vector G_ICMP. Selection support will be coming in a later patch. Differential Revision: https://reviews.llvm.org/D60435 llvm-svn: 358034	2019-04-09 21:22:40 +00:00
Amara Emerson	cd87034109	[AArch64][GlobalISel] Add legalization for some vector G_SHL and G_ASHR. This is needed for some future support for vector ICMP. Differential Revision: https://reviews.llvm.org/D60433 llvm-svn: 358033	2019-04-09 21:22:37 +00:00
Amara Emerson	5b2864cb25	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032	2019-04-09 21:22:33 +00:00
Evandro Menezes	1fd81231ce	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Sander de Smalen	b0dbd990db	[AArch64][AsmParser] Fix .arch_extension directive parsing This patch fixes .arch_extension directive parsing to handle a wider range of architecture extension options. The existing parser was parsing extensions as an identifier which breaks for extensions containing a "-", such as the "tlb-rmi" extension. The extension is now parsed as a string. This is consistent with the extension parsing in the .arch and .cpu directive parsing. Patch by Cullen Rhodes (c-rhodes) Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D60118 llvm-svn: 357677	2019-04-04 09:11:17 +00:00
Jessica Paquette	1bef4155b3	[AArch64][GlobalISel] Legalize G_FEXP2 Same as G_EXP. Add a test, and update legalizer-info-validation.mir and f16-instructions.ll. Differential Revision: https://reviews.llvm.org/D60165 llvm-svn: 357605	2019-04-03 16:58:32 +00:00
Javed Absar	185877db73	[AArch64] Update v8.5a MTE LDG/STG instructions The latest MTE specification adds register Xt to the STG instruction family: STG [Xn, #offset] -> STG Xt, [Xn, #offset] The tag written to memory is taken from Xt rather than Xn. The LDG instruction was also changed to read return address from Xt: LDG Xt, [Xn, #offset]. This patch includes those changes and tests. Specification is at: https://developer.arm.com/docs/ddi0596/c Differential Revision: https://reviews.llvm.org/D60188 llvm-svn: 357583	2019-04-03 14:12:13 +00:00
Jessica Paquette	7443b6a2d9	[AArch64][GlobalISel] Select llvm.aarch64.stlxr(i64, i64*) This adds partial instruction selection support for llvm.aarch64.stlxr. It also factors out selection for G_INTRINSIC_W_SIDE_EFFECTS into its own function. The new function removes the restriction that the intrinsic ID on the G_INTRINSIC_W_SIDE_EFFECTS be on operand 0. Also add a test, and add a GISel line to arm64-ldxr-stxr.ll. Differential Revision: https://reviews.llvm.org/D60100 llvm-svn: 357518	2019-04-02 19:57:26 +00:00
Jessica Paquette	cc01c3459a	[AArch64][GlobalISel] Select STRQui for stores into v2s64s instead of scalarizing This improves selection for vector stores into v2s64s. Before we just scalarized them, but we can just use a STRQui instead. Differential Revision: https://reviews.llvm.org/D60083 llvm-svn: 357432	2019-04-01 22:19:13 +00:00
David Spickett	163262b85e	[AArch64] Add v8.5-a Memory Tagging STZGM instruction This instruction writes a block of allocation tags and stores zero to the associated data locations. It differs from STGM by 1 bit and has the same arguments. The specification can be found here: https://developer.arm.com/docs/ddi0596/c Differential Revision: https://reviews.llvm.org/D60065 llvm-svn: 357397	2019-04-01 14:56:37 +00:00
David Spickett	6761055449	[AArch64] Add v8.5-a Memory Tagging STGM/LDGM instructions The STGV/LDGV instructions were replaced with STGM/LDGM. The encodings remain the same but there is no longer writeback so there are no unpredictable encodings to check for. The specification can be found here: https://developer.arm.com/docs/ddi0596/c Differential Revision: https://reviews.llvm.org/D60064 llvm-svn: 357395	2019-04-01 14:52:18 +00:00
David Spickett	3cba39ed76	[AArch64] Add v8.5-a Memory Tagging GMID_EL1 register The latest version of the MTE spec added a system register 'GMID_EL1'. It contains the block size used by the LDGM and STGM instructions and is read only. The specification can be found here: https://developer.arm.com/docs/ddi0596/c llvm-svn: 357392	2019-04-01 14:41:14 +00:00
Jessica Paquette	e5318e0de2	[GlobalISel][AArch64] Add isel support for G_INSERT_VECTOR_ELT on v2s32s This adds support for v2s32 vector inserts, and updates the selection + regbankselect tests for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59910 llvm-svn: 357318	2019-03-29 21:39:36 +00:00
Evandro Menezes	4a0ab6cec5	[CodeGen] Refactor the option for the maximum jump table size Refactor the option `max-jump-table-size` to default to the maximum representable number. Essentially, NFC. llvm-svn: 357280	2019-03-29 17:28:11 +00:00
Amara Emerson	6dd746aa24	[AArch64][GlobalISel] Make G_PHI of v2s64, v4s32, v2s32 legal. llvm-svn: 357108	2019-03-27 18:31:46 +00:00
Sander de Smalen	6fb2b88ccc	[AArch64][SVE] Asm: error on unexpected SVE vector register type suffix This patch fixes an assembler bug that allowed SVE vector registers to contain a type suffix when not expected. The SVE unpredicated movprfx instruction is the only instruction affected. The following are examples of what was previously valid: movprfx z0.b, z0.b movprfx z0.b, z0.s movprfx z0, z0.s These instructions are now erroneous. Patch by Cullen Rhodes (c-rhodes) Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357094	2019-03-27 17:23:38 +00:00
Sander de Smalen	e4a32ea8e7	[AArch64] NFC: Cleanup isAArch64FrameOffsetLegal Cleanup isAArch64FrameOffsetLegal by: - Merging the large switch statement to reuse AArch64InstrInfo::getMemOpInfo(). - Using AArch64InstrInfo::getUnscaledLdSt() to determine whether an instruction has an unscaled variant. - Simplifying the logic that calculates the offset to fit the immediate. Reviewers: paquette, evandro, eli.friedman, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357064	2019-03-27 13:16:19 +00:00
Sander de Smalen	e4c844b90f	[AArch64] Adds cases for LDRSHWui and LDRSHXui to getMemOpInfo This patch also adds cases PRFUMi and PRFMui. This change was discussed in https://reviews.llvm.org/D59635. llvm-svn: 357059	2019-03-27 10:39:03 +00:00
Eli Friedman	ea5a6285ae	[AArch64] Prefer "mov" over "orr" to materialize constants. This is generally more readable due to the way the assembler aliases work. (This causes a lot of test changes, but it's not really as scary as it looks at first glance; it's just mechanically changing a bunch of checks for orr to check for mov instead.) Differential Revision: https://reviews.llvm.org/D59720 llvm-svn: 356954	2019-03-25 21:25:28 +00:00
Evandro Menezes	c19776dc48	[AArch64, ARM] Add support for Exynos M5 Add Exynos M5 support and test cases. llvm-svn: 356793	2019-03-22 18:42:14 +00:00
Amara Emerson	de0bb2f505	[AArch64] Split the neon.addp intrinsic into integer and fp variants. This is the result of discussions on the list about how to deal with intrinsics which require codegen to disambiguate them via only the integer/fp overloads. It causes problems for GlobalISel as some of that information is lost during translation, while with other operations like IR instructions the information is encoded into the instruction opcode. This patch changes clang to emit the new faddp intrinsic if the vector operands to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to upgrade existing calls to aarch64.neon.addp with fp vector arguments, and we remove the workarounds introduced for GlobalISel in r355865. This is a more permanent solution to PR40968. Differential Revision: https://reviews.llvm.org/D59655 llvm-svn: 356722	2019-03-21 22:31:37 +00:00
Evandro Menezes	3bd1a2cbe6	[AArch64] Update for Exynos Fix the feature set for Exynos M4 by removing support for `+fp16fml` and fix test case. llvm-svn: 356698	2019-03-21 18:54:58 +00:00
Oliver Stannard	b5c44abcf5	[AArch64] Allow -mattr=tpidr-el[1\|2\|3] Added subtarget features for AArch64 to use TPIDR_EL[1\|2\|3] as the TLS base register, rather than the default TPIDR_EL0. Patch by Philip Derrin! Differential revision: https://reviews.llvm.org/D54685 llvm-svn: 356657	2019-03-21 11:30:17 +00:00
Amara Emerson	36fb50659e	[AArch64][GlobalISel] Add an optimization to select vector DUP instructions. This adds pattern matching for the insert+shufflevector sequence so we can generate dup instructions instead of the current TBL sequence. Differential Revision: https://reviews.llvm.org/D59558 llvm-svn: 356526	2019-03-19 21:43:05 +00:00
Amara Emerson	3e478dfc75	[AArch64][GlobalISel] Make v4s32 G_IMPLICIT_DEF legal. llvm-svn: 356525	2019-03-19 21:43:02 +00:00
Amara Emerson	48f1898b5d	Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396	2019-03-18 19:20:10 +00:00
Adhemerval Zanella	badc888cad	[AArch64] Small fix for getIntImmCost It uses the generic AArch64_IMM::expandMOVImm to get the correct number of instruction used in immediate materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58461 llvm-svn: 356391	2019-03-18 18:50:58 +00:00
Adhemerval Zanella	b2a4fe0946	[AArch64] Optimize floating point materialization This patch follows some ideas from r352866 to optimize the floating point materialization even further. It changes isFPImmLegal to consider up to 2 mov instructions or up to 5 in case subtarget has fused literals. The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but the mov+fmov sequence is always better because of the reduced d-cache pressure. The timings are still the same if you consider movw+movk+fmov vs. adrp+ldr will be fused (although one instruction longer). Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58460 llvm-svn: 356390	2019-03-18 18:45:57 +00:00
Adhemerval Zanella	3f35ea2992	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389	2019-03-18 18:40:07 +00:00
Adhemerval Zanella	7dcbe2d92b	[AArch64] Refactor floating point materialization. NFC It splits the logic of actual instruction emission away from the logic that figures out the appropriate sequence on AArch64ExpandPseudo::expandMOVImm. The new function AArch64_IMM::expandMOVImm, which returns the list of the instructions to materialize the immediate constant, is implemented on a separate unit because it will be used in a subsequent patch to optimize floating point materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58915 llvm-svn: 356387	2019-03-18 18:23:23 +00:00
Christof Douma	59deedef1c	[AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse Fixes https://bugs.llvm.org/show_bug.cgi?id=35094 The Dead register definition pass should leave alone the atomicrmw instructions on AArch64 (LSE extension). The reason is the following statement in the Arm ARM: "The ST<OP> instructions, and LD<OP> instructions where the destination register is WZR or XZR, are not regarded as doing a read for the purpose of a DMB LD barrier." A good example was given in the gcc thread by Will Deacon (linked in the bugzilla ticket 35094): P0 (atomic_int* y,atomic_int* x) { atomic_store_explicit(x,1,memory_order_relaxed); atomic_thread_fence(memory_order_release); atomic_store_explicit(y,1,memory_order_relaxed); } P1 (atomic_int* y,atomic_int* x) { atomic_fetch_add_explicit(y,1,memory_order_relaxed); // STADD atomic_thread_fence(memory_order_acquire); int r0 = atomic_load_explicit(x,memory_order_relaxed); } P2 (atomic_int* y) { int r1 = atomic_load_explicit(y,memory_order_relaxed); } My understanding is that it is forbidden for r0 == 0 and r1 == 2 after this test has executed. However, if the relaxed add in P1 compiles to STADD and the subsequent acquire fence is compiled as DMB LD, then we don't have any ordering guarantees in P1 and the forbidden result could be observed. Change-Id: I419f9f9df947716932038e1100c18d10a96408d0 llvm-svn: 356360	2019-03-18 09:21:06 +00:00
Amara Emerson	3d592fdef6	[GlobalISel] Allow MachineIRBuilder to build subregister copies. This relaxes some asserts about sizes, and adds an optional subreg parameter to buildCopy(). Also update AArch64 instruction selector to use this in places where we previously used MachineInstrBuilder manually. Differential Revision: https://reviews.llvm.org/D59434 llvm-svn: 356304	2019-03-15 21:59:50 +00:00
Nikita Popov	b965ccb2fd	[AArch64] Turn BIC immediate creation into a DAG combine Switch BIC immediate creation for vector ANDs from custom lowering to a DAG combine, which gives generic DAG combines a chance to apply first. In particular this avoids (and x, -1) being turned into a (bic x, 0) instead of being eliminated entirely. Differential Revision: https://reviews.llvm.org/D59187 llvm-svn: 356299	2019-03-15 21:04:34 +00:00
Amara Emerson	af364080e2	[AArch64][GlobalISel] Regbankselect: Fix G_BUILD_VECTOR trying to use s16 gpr sources. Since we can't insert s16 gprs as we don't have 16 bit GPR registers, we need to teach RBS to assign them to the FPR bank so our selector works. llvm-svn: 356282	2019-03-15 18:00:01 +00:00
Jessica Paquette	5fc1e87953	[AArch64][GlobalISel] Add isel support for G_UADDO on s32s and s64s This adds instruction selection support for G_UADDO on s32s and s64s. Also - Add an instruction selection test - Update the arm64-xaluo.ll test to show that we generate the correct assembly Differential Revision: https://reviews.llvm.org/D58734 llvm-svn: 356214	2019-03-14 22:54:29 +00:00
Amara Emerson	6da5735cc0	[AArch64][GlobalISel] Implement selection for G_UNMERGE of vectors to vectors. This re-uses the previous support for extract vector elt to extract the subvectors. Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356213	2019-03-14 22:48:18 +00:00
Amara Emerson	90869d8494	[AArch64][GlobalISel] Add some support for G_CONCAT_VECTORS. Handles concatenating 2 x v2s32 and 2 x v4s16 Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356212	2019-03-14 22:48:15 +00:00
Jessica Paquette	5e305f80d3	[GlobalISel][AArch64] Add partial selection support for G_INSERT_VECTOR_ELT This adds support for inserting elements into packed vectors. It also adds two tests: one for selection, and one for regbank select. Unpacked vectors will come in a follow-up. Differential Revision: https://reviews.llvm.org/D59325 llvm-svn: 356182	2019-03-14 18:01:30 +00:00
Jessica Paquette	9a354720e0	[AArch64][GlobalISel] Gardening: Simplify subregister copy in selectBuildVector NFC. Some more preliminary factoring for G_INSERT_VECTOR_ELT. Also better code-reuse, etc., etc. Differential Revision: https://reviews.llvm.org/D59323 llvm-svn: 356107	2019-03-13 23:29:54 +00:00
Jessica Paquette	4117eab6b5	[GlobalISel][AArch64] Gardening: Factor out vector inserts Factor out the vector insert code in `selectBuildVector`. Replace part of it with `emitScalarToVector`, since it was pretty much equivalent. This will make implementing G_INSERT_VECTOR_ELT easier. Differential Revision: https://reviews.llvm.org/D59322 llvm-svn: 356106	2019-03-13 23:22:23 +00:00
Jessica Paquette	fcc568af0c	[GlobalISel][AArch64] Gardening: Factor out code to find lane indices Some more refactoring for G_INSERT_VECTOR_ELT. Factor out the code used to find a lane index from `selectExtractElt`. Put it into a more general-purpose `getConstantValueForReg` function. This will be shared with the code for G_INSERT_VECTOR_ELT. Differential Revision: https://reviews.llvm.org/D59324 llvm-svn: 356101	2019-03-13 21:19:29 +00:00
Jessica Paquette	9202e8eb2a	Recommit "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT" After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without running into any problematic intrinsics. Also add a fix for lane copies, which don't support index 0. llvm-svn: 355871	2019-03-11 22:18:01 +00:00
Jessica Paquette	1d2b488ef7	[GlobalISel][AArch64] Always fall back on aarch64.neon.addp.* Overloaded intrinsics aren't necessarily safe for instruction selection. One such intrinsic is aarch64.neon.addp.*. This is a temporary workaround to ensure that we always fall back on that intrinsic. Eventually this will be replaced with a proper solution. https://bugs.llvm.org/show_bug.cgi?id=40968 Differential Revision: https://reviews.llvm.org/D59062 llvm-svn: 355865	2019-03-11 20:51:17 +00:00
Nikita Popov	b4de4b44fe	[SDAG][AArch64] Legalize VECREDUCE Fixes https://bugs.llvm.org/show_bug.cgi?id=36796. Implement basic legalizations (PromoteIntRes, PromoteIntOp, ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes. There are more legalizations missing (esp float legalizations), but there's no way to test them right now, so I'm not adding them. This also includes a few more changes to make this work somewhat reasonably: * Add support for expanding VECREDUCE in SDAG. Usually experimental.vector.reduce is expanded prior to codegen, but if the target does have native vector reduce, it may of course still be necessary to expand due to legalization issues. This uses a shuffle reduction if possible, followed by a naive scalar reduction. * Allow the result type of integer VECREDUCE to be larger than the vector element type. For example we need to be able to reduce a v8i8 into an (nominally) i32 result type on AArch64. * Use the vector operand type rather than the scalar result type to determine the action, so we can control exactly which vector types are supported. Also change the legalize vector op code to handle operations that only have vector operands, but no vector results, as is the case for VECREDUCE. * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE), explicitly specify for which vector types the reductions are supported. This does not handle anything related to VECREDUCE_STRICT_*. Differential Revision: https://reviews.llvm.org/D58015 llvm-svn: 355860	2019-03-11 20:22:13 +00:00
Stanislav Mekhanoshin	9260748488	Use bitset for assembler predicates AMDGPU target run out of Subtarget feature flags hitting the limit of 64. AssemblerPredicates uses at most uint64_t for their representation. At the same time CodeGen has exhausted this a long time ago and switched to a FeatureBitset with the current limit of 192 bits. This patch completes transition to the bitset for feature bits extending it to asm matcher and MC code emitter. Differential Revision: https://reviews.llvm.org/D59002 llvm-svn: 355839	2019-03-11 17:04:35 +00:00
Amara Emerson	17778b627c	[AArch64][GlobalISel] Fix i1 arguments not being zero-extended as required by ABI. Fixes PR41001. llvm-svn: 355745	2019-03-08 22:17:00 +00:00
Mitch Phillips	5c6e533e01	[HWASan] Save + print registers when tag mismatch occurs in AArch64. Summary: This change changes the instrumentation to allow users to view the registers at the point at which tag mismatch occurred. Most of the heavy lifting is done in the runtime library, where we save the registers to the stack and emit unwind information. This allows us to reduce the overhead, as very little additional work needs to be done in each __hwasan_check instance. In this implementation, the fast path of __hwasan_check is unmodified. There are an additional 4 instructions (16B) emitted in the slow path in every __hwasan_check instance. This may increase binary size somewhat, but as most of the work is done in the runtime library, it's manageable. The failure trace now contains a list of registers at the point of which the failure occurred, in a format similar to that of Android's tombstones. It currently has the following format: Registers where the failure occurred (pc 0x0055555561b4): x0 0000000000000014 x1 0000007ffffff6c0 x2 1100007ffffff6d0 x3 12000056ffffe025 x4 0000007fff800000 x5 0000000000000014 x6 0000007fff800000 x7 0000000000000001 x8 12000056ffffe020 x9 0200007700000000 x10 0200007700000000 x11 0000000000000000 x12 0000007fffffdde0 x13 0000000000000000 x14 02b65b01f7a97490 x15 0000000000000000 x16 0000007fb77376b8 x17 0000000000000012 x18 0000007fb7ed6000 x19 0000005555556078 x20 0000007ffffff768 x21 0000007ffffff778 x22 0000000000000001 x23 0000000000000000 x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000 x28 0000000000000000 x29 0000007ffffff6f0 x30 00000055555561b4 ... and prints after the dump of memory tags around the buggy address. Every register is saved exactly as it was at the point where the tag mismatch occurs, with the exception of x16/x17. These registers are used in the tag mismatch calculation as scratch registers during __hwasan_check, and cannot be saved without affecting the fast path. As these registers are designated as scratch registers for linking, there should be no important information in them that could aid in debugging. Reviewers: pcc, eugenis Reviewed By: pcc, eugenis Subscribers: srhines, kubamracek, mgorny, javed.absar, krytarowski, kristof.beyls, hiraditya, jdoerfert, llvm-commits, #sanitizers Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58857 llvm-svn: 355738	2019-03-08 21:22:35 +00:00
Abderrazek Zaafrani	e9a93d92e1	[AArch64] Improve FP16 instruction selection for vector round and vector convert from half instructions https://reviews.llvm.org/D58855 llvm-svn: 355545	2019-03-06 20:30:06 +00:00
Amara Emerson	db1da90c33	[AArch64] Remove a stray test from the AArch64 directory. llvm-svn: 355534	2019-03-06 18:54:07 +00:00
Jessica Paquette	1b3a1b13fd	Revert "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT" This broke test-suite::aarch64_neon_intrinsics.test Reverting while I look into it. Example failure: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740 llvm-svn: 355408	2019-03-05 15:47:00 +00:00
Jessica Paquette	dd5c010238	[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases where the index is defined by a G_CONSTANT. It also factors out the lane copy opcode selection part into its own function, `getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and `selectExtractElt`. Differential Revision: https://reviews.llvm.org/D58469 llvm-svn: 355344	2019-03-04 22:35:32 +00:00
Jessica Paquette	aff4683104	[GlobalISel][AArch64] Legalize vector G_SELECT Just scalarize it, and add a test showing it works. Differential Revision: https://reviews.llvm.org/D58747 llvm-svn: 355339	2019-03-04 21:12:46 +00:00
Amara Emerson	5792128251	Re-commit r355104: "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1." The code to materialize a mask from a constant pool load tried to use a 128 bit LDR to load a 64 bit constant pool entry, which was 8 byte aligned. This resulted in a link failure in the NEON tests in the test suite since the LDR address was unaligned. This change fixes that to instead emit a 64 bit LDR if the entry is 64 bit, before converting back to a 128 bit register for the TBL. llvm-svn: 355326	2019-03-04 19:16:00 +00:00
Jonas Hahnfeld	a58dcdaa5e	[AArch64/ARM] Fix two compiler warnings in InstructionSelector, NFCI 1) GCC complains that KnownValid is set but not used. 2) In ARMInstructionSelector::selectGlobal() the code is mixing "enumeral and non-enumeral type in conditional expression". Solve this by casting to unsigned which is the final type anyway. Differential Revision: https://reviews.llvm.org/D58834 llvm-svn: 355304	2019-03-04 08:51:32 +00:00
Eli Friedman	e8f9092ebb	[AArch64] [Windows] Don't skip constructing UnwindHelp. In certain cases, the first non-frame-setup instruction in a function is a branch. For example, it could be a cbz on an argument. Make sure we correctly allocate the UnwindHelp, and find an appropriate register to use to initialize it. Fixes https://bugs.llvm.org/show_bug.cgi?id=40184 Differential Revision: https://reviews.llvm.org/D58752 llvm-svn: 355136	2019-02-28 20:38:45 +00:00
Abderrazek Zaafrani	3581c4535e	[AArch64] Improve FP16 vector convert from short instructions. https://reviews.llvm.org/D58563 llvm-svn: 355134	2019-02-28 20:21:46 +00:00
Amara Emerson	09b6b37f1b	Revert "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1." Seems to break some neon intrinsics tests. llvm-svn: 355115	2019-02-28 18:47:29 +00:00
Amara Emerson	5268203d9c	[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1. This extends the existing support for shufflevector to handle cases like <2 x float>, which we can implement by concating the vectors and using a TBL1. Differential Revision: https://reviews.llvm.org/D58684 llvm-svn: 355104	2019-02-28 16:43:11 +00:00
Abderrazek Zaafrani	7c74725f8a	[AArch64] Generate FP16 vector compare instructions. https://reviews.llvm.org/D58561 llvm-svn: 355050	2019-02-28 00:31:38 +00:00
Amara Emerson	1e2d6950a1	[AArch64][GlobalISel] Refactor selectBuildVector to use MachineIRBuilder. NFC. This is a preparatory change as I want to use emitScalarToVector() elsewhere, and in general we want to transition to MIRBuilder instead of using BuildMI directly. Differential Revision: https://reviews.llvm.org/D58528 llvm-svn: 354807	2019-02-25 18:52:54 +00:00
Luke Cheeseman	6af96f2e37	[AArch64] Add support for Cortex-A76 and Cortex-A76AE - Add LLVM backend support for Cortex-A76 and Cortex-A76AE - Documentation can be found at https://developer.arm.com/products/processors/cortex-a/cortex-a76 llvm-svn: 354788	2019-02-25 15:08:27 +00:00
Amara Emerson	0697bcf2e4	Re-land "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR" Thanks to Richard Trieu for pointing out that the failures were due to a use-after-free of an ArrayRef. llvm-svn: 354616	2019-02-21 20:20:16 +00:00
David Spickett	886462b5ef	[AArch64] Print instruction before atomic semantic annotations Commit r353303 added annotations when acquire semantics were dropped from an instruction. printAnnotation was called before printInstruction. So if you didn't set a separate comment output stream you got <comment><instr> instead of <instr><comment> as expected. To fix this move the new printAnnotation to after the instruction is printed. Differential Revision: https://reviews.llvm.org/D58059 llvm-svn: 354565	2019-02-21 10:42:49 +00:00
Amara Emerson	d0b15a3f1e	Revert "[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR" This reverts r354521 because it broke the bots, but passes on Darwin somehow. llvm-svn: 354532	2019-02-21 00:31:13 +00:00
Amara Emerson	117293b921	[AArch64][GlobalISel] Implement partial support for G_SHUFFLE_VECTOR This change makes some basic type combinations for G_SHUFFLE_VECTOR legal, and implements them with a very pessimistic TBL2 instruction in the selector. For TBL2, support is also needed to generate constant pool entries and load from them in order to materialize the mask register. Currently supports <2 x s64> and <4 x s32> result types. Differential Revision: https://reviews.llvm.org/D58466 llvm-svn: 354521	2019-02-20 22:11:39 +00:00
Jessica Paquette	f7efb97beb	[GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics Legalize/select llvm.ctlz.* Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and arm64-vclz.ll to show that we perform valid transformations in optimized builds, and document where GISel can improve. Differential Revision: https://reviews.llvm.org/D58155 llvm-svn: 354299	2019-02-18 23:33:24 +00:00
Matt Arsenault	3ca7deb4b1	GlobalISel: Add alignment to LegalityQuery MMOs This allows targets to specify the minimum alignment required for the load/store. llvm-svn: 354071	2019-02-14 22:41:09 +00:00
Petr Hosek	469dd5010d	[AArch64] Support reserving arbitrary general purpose registers This is a follow up to D48580 and D48581 which allows reserving arbitrary general purpose registers with the exception of registers with special purpose (X8, X16-X18, X29, X30) and registers used by LLVM (X0, X19). This change also generalizes some of the existing logic to rely entirely on values generated from tablegen. Differential Revision: https://reviews.llvm.org/D56305 llvm-svn: 353957	2019-02-13 17:28:47 +00:00
Nikita Popov	fad5d2671a	[AArch64] Expand v8i8 cttz (PR39729) Fix for https://bugs.llvm.org/show_bug.cgi?id=39729. Rather than adding just a case for v8i8 I'm setting cttz to expand for all vector types. Differential Revision: https://reviews.llvm.org/D58008 llvm-svn: 353872	2019-02-12 18:55:53 +00:00
Jessica Paquette	bc25529f39	[GlobalISel][AArch64] Select llvm.bswap* for non-vector types This teaches the IRTranslator to emit G_BSWAP when it runs into Intrinsic::bswap. This allows us to select G_BSWAP for non-vector types in AArch64. Add a select-bswap.mir test, and add global isel checks to a couple existing tests in test/CodeGen/AArch64. This doesn't handle every bswap case, since some of these rely on known bits stuff. This just lets us handle the naive case. Differential Revision: https://reviews.llvm.org/D58081 llvm-svn: 353861	2019-02-12 17:28:17 +00:00
Jessica Paquette	c562e11af8	[AArch64][GlobalISel] Add isel support for a couple vector exts/truncs Add support for - v4s16 <-> v4s32 - v2s64 <-> v2s32 And update tests that use them to show that we generate the correct instructions. Differential Revision: https://reviews.llvm.org/D57832 llvm-svn: 353732	2019-02-11 18:56:39 +00:00
Jessica Paquette	5a40815c0d	[GlobalISel][AArch64] Select G_FFLOOR This teaches the legalizer about G_FFLOOR, and lets us select G_FFLOOR in AArch64. It updates the existing floating point tests, and adds a select-floor.mir test. Differential Revision: https://reviews.llvm.org/D57486 llvm-svn: 353722	2019-02-11 17:22:58 +00:00
Benjamin Kramer	f0a0f80db1	Move some classes into anonymous namespaces. NFC. llvm-svn: 353710	2019-02-11 15:16:21 +00:00
Eli Friedman	b87d675297	[AArch64] Fix condition for "high-vector" DUP optimizations. AArch64 NEON has a bunch of instructions with a "2" suffix that extract the top half of the source vectors, instead of the bottom half. We have some DAGCombines to try to take advantage of that. However, they assumed that any EXTRACT_VECTOR was extracting the high half of the vector in question. This issue has apparently existed since the AArch64 backend was merged. Fixes https://bugs.llvm.org/show_bug.cgi?id=40632 . Differential Revision: https://reviews.llvm.org/D57862 llvm-svn: 353486	2019-02-08 00:23:35 +00:00
Matt Arsenault	e739762e43	GlobalISel: Implement narrowScalar for shift main type This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a GlobalISel version of computeKnownBits yet. This shows a disadvantage of targets not specifying which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0. llvm-svn: 353455	2019-02-07 19:37:44 +00:00
Tim Northover	77f59ae9f5	AArch64: implement copy for paired GPR registers. When doing 128-bit atomics using CASP we might need to copy a GPRPair to a different register, but that was unimplemented up to now. llvm-svn: 353383	2019-02-07 10:35:34 +00:00
Tim Northover	009fdea45c	AArch64: enforce even/odd register pairs for CASP instructions. ARMv8.1a CASP instructions need the first of the pair to be an even register (otherwise the encoding is unallocated). We enforced this during assembly, but not CodeGen before. llvm-svn: 353308	2019-02-06 15:26:35 +00:00
Tim Northover	f20f21ec5e	AArch64: annotate atomics with dropped acquire semantics when printing. A quirk of the v8.1a spec is that when the writeback register for an atomic read-modify-write instruction is wzr/xzr, the instruction no longer enforces acquire ordering. However, it's still written with the misleading 'a' mnemonic. So this adds an annotation when disassembling such instructions, mentioning the change. llvm-svn: 353303	2019-02-06 15:07:59 +00:00
Aditya Nandakumar	bba2c76cb2	[NFC][GlobalISel]: Add a convenience method to MachineInstrBuilder to simplify getOperand(i).getReg() https://reviews.llvm.org/D57608 It's a common pattern in GISel to have a MachineInstrBuilder from which we get various regs (commonly MIB->getOperand(0).getReg()). This adds a helper method and the above can be replaced with MIB.getReg(0). llvm-svn: 353223	2019-02-05 22:14:40 +00:00
Oliver Stannard	a390364af8	[AArch64][Outliner] Don't outline BTI instructions We can't outline BTI instructions, because they need to be the very first instruction executed after an indirect call or branch. If we outline them, then an indirect call might go to the branch to the outlined function, which will fault. Differential revision: https://reviews.llvm.org/D57753 llvm-svn: 353190	2019-02-05 17:21:57 +00:00
Matt Arsenault	eaac595e10	AArch64/GlobalISel: Don't clamp from 2 to 2 This is equivalent to clampMaxNumElements, but saves a check. llvm-svn: 353188	2019-02-05 16:57:18 +00:00
Florian Hahn	ad99b95580	[CGP] Add support for sinking operands to their users, if they are free. This patch improves code generation for some AArch64 ACLE intrinsics. It adds support to CGP to duplicate and sink operands to their user, if they can be folded into a target instruction, like zexts and sub into usubl. It adds a TargetLowering hook shouldSinkOperands, which looks at the operands of instructions to see if sinking is profitable. I decided to add a new target hook, as for the sinking to be profitable, at least on AArch64, we have to look at multiple operands of an instruction, instead of looking at the users of a zext for example. The sinking is done in CGP, because it works around an instruction selection limitation. If instruction selection is not limited to a single basic block, this patch should not be needed any longer. Alternatively this could be done in the LoopSink pass, which tries to undo LICM for instructions in blocks that are not executed frequently. Note that we do not force the operands to sink to have a single user, because we duplicate them before sinking. Therefore this is only desirable if they really can be done for free. Additionally we could consider the impact on live ranges later on. This should fix https://bugs.llvm.org/show_bug.cgi?id=40025. As for performance, we have internal code that uses intrinsics and can be sped up by 10% by this change. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D57377 llvm-svn: 353152	2019-02-05 10:27:40 +00:00
Andrea Di Biagio	2820258583	[AsmPrinter] Remove hidden flag -print-schedule. This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043	2019-02-04 12:51:26 +00:00
Mandeep Singh Grang	553513010b	[AArch64] Fix unused variable [NFC] llvm-svn: 352940	2019-02-01 23:42:34 +00:00
Mandeep Singh Grang	9cf3ca95c2	[COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present. Reviewers: rnk, efriedma, ssijaric, TomTan Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D57183 llvm-svn: 352923	2019-02-01 21:41:33 +00:00
James Y Knight	d34c1cbe9e	[opaque pointer types] Pass value type to GetElementPtr creation. This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913	2019-02-01 20:44:47 +00:00
James Y Knight	c8b30de05f	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
James Y Knight	31a2057127	[opaque pointer types] Pass function types to CallInst creation. This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909	2019-02-01 20:43:25 +00:00
Adhemerval Zanella	36b7b3c0fa	[AArch64] Optimize floating point materialization This patch changes isFPImmLegal to also return true if the value can be encoded as the immediate operand of a logical instruction, in addition to checking the immediate field for fmov. This optimizes some floating point materialization, including values used in isinf lowering. Reviewed By: rengolin, efriedma, evandro Differential Revision: https://reviews.llvm.org/D57044 llvm-svn: 352866	2019-02-01 12:26:06 +00:00
James Y Knight	846be29e5e	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc doesn't choke on it, hopefully. Original Message: The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352827	2019-02-01 02:28:03 +00:00
James Y Knight	06da6dcca4	Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it." This reverts commit f47d6b38c7a61d50db4566b02719de05492dcef1 (r352791). Seems to run into compilation failures with GCC (but not clang, where I tested it). Reverting while I investigate. llvm-svn: 352800	2019-01-31 21:51:58 +00:00
James Y Knight	fa51e33345	[opaque pointer types] Add a FunctionCallee wrapper type, and use it. The FunctionCallee type is effectively a {FunctionType,Value} pair, and is a useful convenience to enable code to continue passing the result of getOrInsertFunction() through to EmitCall, even once pointer types lose their pointee-type. Then: - update the CallInst/InvokeInst instruction creation functions to take a Callee, - modify getOrInsertFunction to return FunctionCallee, and - update all callers appropriately. One area of particular note is the change to the sanitizer code. Previously, they had been casting the result of `getOrInsertFunction` to a `Function*` via `checkSanitizerInterfaceFunction`, and storing that. That would report an error if someone had already inserted a function declaration with a mismatching signature. However, in general, LLVM allows for such mismatches, as `getOrInsertFunction` will automatically insert a bitcast if needed. As part of this cleanup, cause the sanitizer code to do the same. (It will call its functions using the expected signature, however they may have been declared.) Finally, in a small number of locations, callers of `getOrInsertFunction` actually were expecting/requiring that a brand new function was being created. In such cases, I've switched them to Function::Create instead. Differential Revision: https://reviews.llvm.org/D57315 llvm-svn: 352791	2019-01-31 20:35:56 +00:00
Sjoerd Meijer	068d715728	[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS And instead just generate a libcall. My motivating example on ARM was a simple: shl i64 %A, %B for which the code bloat is quite significant. For other targets that also accept __int128/i128 such as AArch64 and X86, it is also beneficial for these cases to generate a libcall when optimising for minsize. On these 64-bit targets, the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS lowering operation action is not set to custom/expand. Differential Revision: https://reviews.llvm.org/D57386 llvm-svn: 352736	2019-01-31 08:07:30 +00:00
Matt Arsenault	1067bf29d3	GlobalISel: Fix creating MMOs with align 0 llvm-svn: 352712	2019-01-31 01:38:47 +00:00
Jessica Paquette	156cec86e5	[GlobalISel][AArch64] Select G_FEXP This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also allows us to select G_FEXP. It... - Updates the legalizer-info tests - Adds a test for legalizing exp - Updates the existing fp tests to show that we can now select G_FEXP https://reviews.llvm.org/D57483 llvm-svn: 352692	2019-01-30 23:46:15 +00:00
Jessica Paquette	d46e179ece	[GlobalISel][AArch64] Select G_FABS This adds instruction selection support for G_FABS in AArch64. It also updates the existing basic FP tests, adds a selection test for G_FABS. https://reviews.llvm.org/D57418 llvm-svn: 352684	2019-01-30 22:54:21 +00:00
Jessica Paquette	d5349f419b	[GlobalISel][AArch64] Add instruction selection support for @llvm.log2 This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters it. It updates the existing floating point tests to show that we don't fall back on the intrinsic, and select the correct instructions. It also adds a legalizer test for G_FLOG2. https://reviews.llvm.org/D57357 llvm-svn: 352673	2019-01-30 21:16:04 +00:00
Jessica Paquette	d03d1c2ace	[GlobalISel][AArch64] Add instruction selection support for @llvm.sqrt This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer test for G_FSQRT, a selection test for it, and updates existing floating point tests. https://reviews.llvm.org/D57361 llvm-svn: 352671	2019-01-30 21:03:52 +00:00
Amara Emerson	478ae74dcd	[AArch64][GlobalISel] Unmerge into scalars from a vector should use FPR bank. This currently shows up as a selection fallback since the dest regs were given GPR banks but the source was a vector FPR reg. Differential Revision: https://reviews.llvm.org/D57408 llvm-svn: 352545	2019-01-29 21:19:33 +00:00
Martin Storsjo	30a71b482a	[COFF, ARM64] Don't put jump table into a separate COFF section for EK_LabelDifference32 Windows ARM64 has PIC relocation model and uses jump table kind EK_LabelDifference32. This produces a jump table entry as ".word LBB123 - LJTI1_2" which represents the distance between the block and the jump table. A new relocation type (IMAGE_REL_ARM64_REL32) is needed to do the fixup correctly if they are in different COFF sections. This change saves the jump table to the same COFF section as the associated code. An ideal fix could be utilizing the IMAGE_REL_ARM64_REL32 relocation type. Patch by Tom Tan! Differential Revision: https://reviews.llvm.org/D57277 llvm-svn: 352465	2019-01-29 09:36:48 +00:00
Reid Kleckner	18f71dad93	[AArch64] Include AArch64GenCallingConv.inc once Summary: Avoids duplicating generated static helpers for calling convention analysis. This also means you can modify AArch64CallingConv.td without recompiling the AArch64ISelLowering.cpp monolith, so it provides faster incremental rebuilds. Saves 12K in llc.exe, but adds a new object file, which is large. Reviewers: efriedma, t.p.northover Subscribers: mgorny, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56948 llvm-svn: 352430	2019-01-28 21:28:40 +00:00
Jessica Paquette	91e5ed536e	[GlobalISel][AArch64] Add legalization for G_FLOG This adds support for legalizing G_FLOG into a RTLib call. It adds a legalizer test, and updates the existing floating point tests. https://reviews.llvm.org/D57347 llvm-svn: 352429	2019-01-28 21:27:23 +00:00
Jessica Paquette	ff99f81513	[GlobalISel][AArch64] Add instruction selection support for @llvm.log10 This adds instruction selection support for @llvm.log10 in AArch64. It teaches GISel to lower it to a library call, updates the relevant tests, and adds a legalizer test for log10. https://reviews.llvm.org/D57341 llvm-svn: 352418	2019-01-28 19:53:14 +00:00
Francis Visoiu Mistrih	2379ffe16f	[AArch64] Add 'apple-latest' CPU alias The 'apple-latest' alias is supposed to provide a CPU that contains the latest Apple processor model supported by LLVM. This is supposed to be used by tools like lldb to provide a target that supports most of the CPU features. For now, this is mapped to Cyclone. Differential Revision: https://reviews.llvm.org/D56384 llvm-svn: 352412	2019-01-28 19:27:33 +00:00
Jessica Paquette	207e020957	[GlobalISel][AArch64] Add instruction selection support for G_FCOS and G_FSIN This contains all of the legalizer changes from D57197 necessary to select G_FCOS and G_FSIN. It also updates several existing IR tests in test/CodeGen/AArch64 that verify that we correctly lower the G_FCOS and G_FSIN instructions. https://reviews.llvm.org/D57197 3/3 llvm-svn: 352402	2019-01-28 18:34:18 +00:00
Amara Emerson	c61a90aead	[AArch64][GlobalISel] Teach RBS about G_FNEG default mapping. llvm-svn: 352340	2019-01-28 03:21:14 +00:00
Amara Emerson	5ebce2e19a	[AArch64][GlobalISel] Add some missing vector support for FP arithmetic ops. Moved the fneg lowering legalization test from AArch64 to X86, as we want to specify that it's already legal. llvm-svn: 352338	2019-01-28 02:28:22 +00:00
Amara Emerson	ba9fd8068d	[AArch64][GlobalISel] Add some vector support for fp <-> int conversions. Some unrelated, but benign, test changes as well due to the test update script. llvm-svn: 352337	2019-01-28 02:27:59 +00:00
Jessica Paquette	00aa216e32	[GlobalISel][AArch64][NFC] Fix incorrect comment in selectUnmergeValues s/scalar/vector/ llvm-svn: 352243	2019-01-25 21:28:27 +00:00
Simon Pilgrim	504bf36a53	Fix gcc -Wparentheses warning. NFCI. llvm-svn: 352193	2019-01-25 11:38:40 +00:00
Matt Arsenault	f3ebfd83c4	GlobalISel: Add convenience mutations to scalarize llvm-svn: 352143	2019-01-25 00:51:00 +00:00
Benjamin Kramer	1d47eb5cbb	[GlobalISel][AArch64] Avoid unused variable warning for variable only used in assert llvm-svn: 352133	2019-01-24 23:45:07 +00:00
Benjamin Kramer	fe37caf3d2	[GlobalISel][AArch64] Avoid unused function warnings in Release builds llvm-svn: 352129	2019-01-24 23:39:47 +00:00
Jessica Paquette	52154c0f89	Suppress unused capture warning in CheckCopy Werror bots didn't like the lambda + assert thing in my previous commit. Capture everything to suppress the error. Example failure here: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/29393 llvm-svn: 352124	2019-01-24 22:51:31 +00:00
Jessica Paquette	af0b0f857f	[GlobalISel][AArch64] Add isel support for FP16 vector @llvm.ceil This patch adds support for vector @llvm.ceil intrinsics when full 16 bit floating point support isn't available. To do this, this patch... - Implements basic isel for G_UNMERGE_VALUES - Teaches the legalizer about 16 bit floats - Teaches AArch64RegisterBankInfo to respect floating point registers on G_BUILD_VECTOR and G_UNMERGE_VALUES - Teaches selectCopy about 16-bit floating point vectors It also adds - A legalizer test for the 16-bit vector ceil which verifies that we create a G_UNMERGE_VALUES and G_BUILD_VECTOR when full fp16 isn't supported - An instruction selection test which makes sure we lower to G_FCEIL when full fp16 is supported - A test for selecting G_UNMERGE_VALUES And also updates arm64-vfloatintrinsics.ll to show that the new ceiling types work as expected. https://reviews.llvm.org/D56682 llvm-svn: 352113	2019-01-24 22:00:41 +00:00
Benjamin Kramer	ca8adadeee	[AArch64] Fix out of bounds strlen CFIInst is not zero-terminated. This is one of more annoying functional differences between StringRef and ArrayRef. Found by asan. llvm-svn: 351955	2019-01-23 14:51:21 +00:00
Kristof Beyls	1b3bd06827	[SLH] AArch64: correctly pick temporary register to mask SP As part of speculation hardening, the stack pointer gets masked with the taint register (X16) before a function call or before a function return. Since there are no instructions that can directly mask writing to the stack pointer, the stack pointer must first be transferred to another register, where it can be masked, before that value is transferred back to the stack pointer. Before, that temporary register was always picked to be x17, since the ABI allows clobbering x17 on any function call, resulting in the following instruction pattern being inserted before function calls and returns/tail calls: mov x17, sp and x17, x17, x16 mov sp, x17 However, x17 can be live in those locations, for example when the call is an indirect call, using x17 as the target address (blr x17). To fix this, this patch looks for an available register just before the call or terminator instruction and uses that. In the rare case when no register turns out to be available (this situation is only encountered twice across the whole test-suite), just insert a full speculation barrier at the start of the basic block where this occurs. Differential Revision: https://reviews.llvm.org/D56717 llvm-svn: 351930	2019-01-23 08:18:39 +00:00
Peter Collingbourne	2818e607ab	hwasan: Move memory access checks into small outlined functions on aarch64. Each hwasan check requires emitting a small piece of code like this: https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses The problem with this is that these code blocks typically bloat code size significantly. An obvious solution is to outline these blocks of code. In fact, this has already been implemented under the -hwasan-instrument-with-calls flag. However, as currently implemented this has a number of problems: - The functions use the same calling convention as regular C functions. This means that the backend must spill all temporary registers as required by the platform's C calling convention, even though the check only needs two registers on the hot path. - The functions take the address to be checked in a fixed register, which increases register pressure. Both of these factors can diminish the code size effect and increase the performance hit of -hwasan-instrument-with-calls. The solution that this patch implements is to involve the aarch64 backend in outlining the checks. An intrinsic and pseudo-instruction are created to represent a hwasan check. The pseudo-instruction is register allocated like any other instruction, and we allow the register allocator to select almost any register for the address to check. A particular combination of (register selection, type of check) triggers the creation in the backend of a function to handle the check for specifically that pair. The resulting functions are deduplicated by the linker. The pseudo-instruction (really the function) is specified to preserve all registers except for the registers that the AAPCS specifies may be clobbered by a call. To measure the code size and performance effect of this change, I took a number of measurements using Chromium for Android on aarch64, comparing a browser with inlined checks (the baseline) against a browser with outlined checks. 
Code size: Size of .text decreases from 243897420 to 171619972 bytes, or a 30% decrease. Performance: Using Chromium's blink_perf.layout microbenchmarks I measured a median performance regression of 6.24%. The fact that a perf/size tradeoff is evident here suggests that we might want to make the new behaviour conditional on -Os/-Oz. But for now I've enabled it unconditionally, my reasoning being that hwasan users typically expect a relatively large perf hit, and ~6% isn't really adding much. We may want to revisit this decision in the future, though. I also tried experimenting with varying the number of registers selectable by the hwasan check pseudo-instruction (which would result in fewer variants being created), on the hypothesis that creating fewer variants of the function would expose another perf/size tradeoff by reducing icache pressure from the check functions at the cost of register pressure. Although I did observe a code size increase with fewer registers, I did not observe a strong correlation between the number of registers and the performance of the resulting browser on the microbenchmarks, so I conclude that we might as well use ~all registers to get the maximum code size improvement. My results are below:
Regs | .text size | Perf hit
-----+------------+---------
~all | 171619972  | 6.24%
  16 | 171765192  | 7.03%
   8 | 172917788  | 5.82%
   4 | 177054016  | 6.89%
Differential Revision: https://reviews.llvm.org/D56954 llvm-svn: 351920	2019-01-23 02:20:10 +00:00
Matt Arsenault	cfaee823da	GlobalISel: Allow shift amount to be a different type For AMDGPU the shift amount is never 64-bit, and this needs to use a 32-bit shift. X86 uses i8, but seemed to be hacking around this before. llvm-svn: 351882	2019-01-22 21:42:11 +00:00
Matt Arsenault	fd94cae16d	Reapply "IR: Add fp operations to atomicrmw" This reapplies commits r351778 and r351782 with RISCV test fixes. llvm-svn: 351850	2019-01-22 18:18:02 +00:00
Chandler Carruth	3764443332	Revert r351778: IR: Add fp operations to atomicrmw This broke the RISCV build, and even with that fixed, one of the RISCV tests behaves surprisingly differently with asserts than without, leaving no clear test pattern to use. Generally it seems bad for the IR to differ substantially due to asserts (as in, an alloca is used with asserts that isn't needed without!) and nothing I did would simply fix it, so I'm reverting back to green. This also required reverting the RISCV build fix in r351782. llvm-svn: 351796	2019-01-22 10:29:58 +00:00
Matt Arsenault	44582e29c8	IR: Add fp operations to atomicrmw Add just fadd/fsub for now. llvm-svn: 351778	2019-01-22 03:32:36 +00:00
Eli Friedman	1d6f130191	[AArch64] Add patterns for zext/sext of shift amount. Not sure this is the best fix, but it saves an instruction for certain constructs involving variable shifts. Differential Revision: https://reviews.llvm.org/D55572 llvm-svn: 351768	2019-01-22 00:21:35 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Sanjin Sijaric	19c7db09aa	Fix the buildbot failure introduced by r351404 EXPENSIVE_CHECKS buildbots are failing due to r351404. Add x1 as live in to the funclet basic block for SEH funclets, as well as -verify-machineinstrs to the test case that triggered the failure. llvm-svn: 351472	2019-01-17 20:24:14 +00:00
Matt Arsenault	afd1f8cb4f	Allow FP types for atomicrmw xchg llvm-svn: 351427	2019-01-17 10:49:01 +00:00
Sanjin Sijaric	10c2d07dfd	[SEH] [ARM64] Retrieve the frame pointer from SEH funclets The Windows ARM64 runtime passes the establisher frame to funclets as the first argument. llvm-svn: 351404	2019-01-17 00:24:38 +00:00
Mandeep Singh Grang	9f9869fa2f	[COFF, ARM64] Implement support for SEH extensions __try/__except/__finally Summary: This patch supports MS SEH extensions __try/__except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler. We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape. Reviewers: rnk, compnerd, mstorsjo, TomTan, efriedma, ssijaric Reviewed By: rnk, efriedma Subscribers: smeenai, jrmuizel, alex, majnemer, ssijaric, ehsan, dmajor, kristina, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53540 llvm-svn: 351370	2019-01-16 19:52:59 +00:00
Aditya Nandakumar	c969a48f8c	[GISel]: Add support for CSEing continuously during GISel passes. https://reviews.llvm.org/D52803 This patch adds support to continuously CSE instructions during each of the GISel passes. It consists of a GISelCSEInfo analysis pass that can be used by the CSEMIRBuilder. llvm-svn: 351283	2019-01-16 00:40:37 +00:00
Evandro Menezes	ed44540ada	[AArch64] Adjust the feature set for Exynos Enable the fusion of arithmetic and logic instructions for Exynos M4. llvm-svn: 351149	2019-01-15 01:53:49 +00:00
Eli Friedman	d0fe48e3c6	[AArch64] Explicitly use v1i64 type for llvm.aarch64.neon.abs.i64 . Otherwise, with D56544, the intrinsic will be expanded to an integer csel, which is probably not what the user expected. This matches the general convention of using "v1" types to represent scalar integer operations in vector registers. While I'm here, also add some error checking so we don't generate illegal ABS nodes. Differential Revision: https://reviews.llvm.org/D56616 llvm-svn: 351141	2019-01-15 00:15:24 +00:00
Evandro Menezes	7744c8a2b5	[AArch64] Add new target feature to fuse arithmetic and logic operations This feature enables the fusion of some arithmetic and logic instructions together. Differential revision: https://reviews.llvm.org/D56572 llvm-svn: 351139	2019-01-14 23:54:36 +00:00
Evandro Menezes	949fd90c8b	[AArch64] Improve Exynos predicates Expand the predicate using shifted arithmetic and logic instructions to also consider the respective not shifted instructions. llvm-svn: 350976	2019-01-11 22:39:47 +00:00
Evandro Menezes	85b71933f5	[AArch64] Add pipeline model for Exynos M4 Add the scheduling and cost model for Exynos M4. llvm-svn: 350960	2019-01-11 19:36:25 +00:00
Evandro Menezes	1df9d07d1f	[AArch64] Create feature set for Exynos M4 Complete the feature set for Exynos M4 and update test cases. llvm-svn: 350953	2019-01-11 18:54:25 +00:00
Bryan Chan	7440d37913	[AArch64] Fix operation actions for FP16 vector intrinsics Summary: This patch changes the legalization action for some half-precision floating- point vector intrinsics (FSIN, FLOG, etc.) from Promote to Expand. These ops are not supported in hardware for half-precision vectors, but promotion is not always possible (for v8f16 operands). Changing the action to Expand fixes an assertion failure in the legalizer when the frontend produces such ops. In addition, a quick microbenchmark shows that, in the v4f16 case, expanding introduces fewer spills and is therefore slightly faster than promoting. Reviewers: t.p.northover, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56296 llvm-svn: 350825	2019-01-10 15:02:37 +00:00
Mandeep Singh Grang	9bf821bd2f	[AArch64] Emit the correct MCExpr relocations specifiers like VK_ABS_G0, etc Summary: D55896 and D56029 add support to emit fixups for :abs_g0: , :abs_g1_s: , etc. This patch adds the necessary enums and MCExpr needed for lowering these. Reviewers: rnk, mstorsjo, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56037 llvm-svn: 350798	2019-01-10 04:59:44 +00:00
Kristof Beyls	e117265e0e	Initial AArch64 SLH implementation. This is an initial implementation for Speculative Load Hardening for AArch64. It builds on top of the recently introduced AArch64SpeculationHardening pass. This doesn't implement (yet) some of the optimizations implemented for the X86SpeculativeLoadHardening pass. I thought introducing the optimizations incrementally in follow-up patches should make this easier to review. Differential Revision: https://reviews.llvm.org/D55929 llvm-svn: 350729	2019-01-09 15:13:34 +00:00
Diogo N. Sampaio	dfa50c3062	[AArch64] Move feature predctrl to predres Follow-up patch to rL350385, which added the predres command line option. This patch renames the feature to keep it aligned with the option passed by/to clang. Differential Revision: https://reviews.llvm.org/D56484 llvm-svn: 350702	2019-01-09 11:24:15 +00:00
Matt Arsenault	ad57c9218e	GlobalISel: Implement fewerElements for implicit_def llvm-svn: 350697	2019-01-09 07:51:52 +00:00
Evandro Menezes	2cf1366fcb	[AArch64] Adjust the cost model for Exynos Improve the modeling of ALU instructions. llvm-svn: 350663	2019-01-08 22:29:58 +00:00
Tim Northover	19ac7a1fe6	AArch64: avoid splitting vector truncating stores. We have code to split vector splats (of zero and non-zero) for performance reasons, but it ignores the fact that a store might be truncating. Actually, truncating stores are formed for vNi8 and vNi16 types. Since the truncation is from a legal type, the size of the store is always <= 64-bits and so they don't actually benefit from being split up anyway, so this patch just disables that transformation. llvm-svn: 350620	2019-01-08 13:30:27 +00:00
Mandeep Singh Grang	ab3dbc4187	[MC] [AArch64] Support resolving signed fixups for :abs_g0_s: etc. Summary: This patch is a follow-up to D55896. Reviewers: efriedma, mstorsjo Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56029 llvm-svn: 350606	2019-01-08 04:48:00 +00:00
Evandro Menezes	30be00653a	[AArch64] Adjust the cost model for Exynos M3 Improve the modeling of ASIMD loads and stores. llvm-svn: 350434	2019-01-04 21:02:25 +00:00
Evandro Menezes	41c27199d9	[AArch64] Add new scheduling predicates Add new scheduling predicates to identify the ASIMD loads and stores using the post indexed addressing mode. llvm-svn: 350332	2019-01-03 17:28:09 +00:00
Martin Storsjo	9c2f92fb12	[AArch64] Accept "sve" as arch feature in assembler Differential Revision: https://reviews.llvm.org/D56128 llvm-svn: 350174	2018-12-31 10:22:04 +00:00
Martin Storsjo	4ebb5f7268	[AArch64] Implement the .arch_extension directive Differential Revision: https://reviews.llvm.org/D56131 llvm-svn: 350169	2018-12-30 21:06:32 +00:00
Diogo N. Sampaio	4f295fed38	[AArch64] Add command-line option for SB SB (Speculative Barrier) is only mandatory from 8.5 onwards but is optional from Armv8.0-A. This patch adds a command line option to enable SB, as it was previously only possible to enable by selecting -march=armv8.5-a. This patch also moves to FeatureSB the old FeatureSpecRestrict. Reviewers: pbarrio, olista01, t.p.northover, LukeCheeseman Differential Revision: https://reviews.llvm.org/D55921 llvm-svn: 350126	2018-12-28 17:14:58 +00:00
Jessica Paquette	481e7777a1	[GlobalISel][AArch64] Add support for widening G_FCEIL This adds support for widening G_FCEIL in LegalizerHelper and AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't support full FP 16. This also updates AArch64/f16-instructions.ll to show that we perform the correct transformation. llvm-svn: 349927	2018-12-21 17:05:26 +00:00
Evandro Menezes	36af2563a9	[AArch64] Refactor Exynos predicate (NFC) Change order of conditions in predicate. llvm-svn: 349918	2018-12-21 15:51:34 +00:00
Simon Pilgrim	0a3fdb6a75	[AArch64] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version. llvm-svn: 349908	2018-12-21 15:05:10 +00:00
Luke Cheeseman	e503aa9de4	[Dwarf/AArch64] Return address signing B key dwarf support - When signing return addresses with -msign-return-address=<scope>{+<key>}, either the A key instructions or the B key instructions can be used. To correctly authenticate the return address, the unwinder/debugger must know which key was used to sign the return address. - When an exception is thrown or a breakpoint is reached, it may be necessary to unwind the stack. To accomplish this, the unwinder/debugger must be able to first authenticate the return address if it has been signed. - To enable this, the augmentation string of CIEs has been extended to allow inclusion of a 'B' character. Functions that are signed using the B key variant of the instructions should have an FDE whose associated CIE has a 'B' in the augmentation string. - One must also be able to preserve these semantics when first stepping from a high level language into assembly and then, as a second step, into an object file. To achieve this, I have introduced a new assembly directive '.cfi_b_key_frame', that tells the assembler the current frame uses return address signing with the B key. - This ensures that the FDE is associated with a CIE that has 'B' in the augmentation string. Differential Revision: https://reviews.llvm.org/D51798 llvm-svn: 349895	2018-12-21 10:45:08 +00:00
Jessica Paquette	be246a61d1	[GlobalISel][AArch64] Add G_FCEIL to isPreISelGenericFloatingPointOpcode If you don't do this, then if you hit a G_LOAD in getInstrMapping, you'll end up with GPRs on the G_FCEIL instead of FPRs. This causes a fallback. Add it to the switch, and add a test verifying that this happens. llvm-svn: 349822	2018-12-20 21:14:15 +00:00

3627 Commits