llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Lucas Prates	a515304bf7	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, as causing arguments to be stored on the wrong bits on big-endian architectures and incurring in missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored on the proper bits of the registers and performing the necessary floating point convertions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
Vedant Kumar	3358e53af8	[ARM] Mark some tests as not safe for -debugify-and-strip-all, NFC These tests contain debug instructions which get checked, so we can't insert synthetic debug info and expect the tests to pass. The rest of the ARM backend tests appear to be fair game.	2020-04-22 17:03:39 -07:00
Eli Friedman	50e1810fbf	[SelectionDAG] Always preserve offset in MachinePointerInfo Previously, getWithOffset() would drop the offset if the base was null. Because of this, MachineMemOperand would return the wrong result from getAlign() in these cases. MachineMemOperand stores the alignment of the pointer without the offset. A bunch of MIR tests changed because we print the offset now. Split off from D77687. Differential Revision: https://reviews.llvm.org/D78049	2020-04-14 15:29:41 -07:00
Sjoerd Meijer	bbf5be9d55	[MIR][ARM] MachineOperand comments This adds infrastructure to print and parse MIR MachineOperand comments. The motivation for the ARM backend is to print condition code names instead of magic constants that are difficult to read (for human beings). For example, instead of this: dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14, $noreg t2Bcc %bb.4, 0, killed $cpsr we now print this: dead renamable $r2, $cpsr = tEOR killed renamable $r2, renamable $r1, 14 /* CC::always /, $noreg t2Bcc %bb.4, 0 / CC:eq /, killed $cpsr This shows that MachineOperand comments are enclosed between / and /. In this example, the EOR instruction is not conditionally executed (i.e. it is "always executed"), which is encoded by the 14 immediate machine operand. Thus, now this machine operand has / CC::always / as a comment. The 0 on the next conditional branch instruction represents the equal condition code, thus now this operand has / CC:eq */ as a comment. As it is a comment, the MI lexer/parser completely ignores it. The benefit is that this keeps the change in the lexer extremely minimal and no target specific parsing needs to be done. The changes on the MIPrinter side are also minimal, as there is only one target hooks that is used to create the machine operand comments. Differential Revision: https://reviews.llvm.org/D74306	2020-02-24 14:19:21 +00:00
Matt Arsenault	b740707aac	GlobalISel: Fix lowering of G_CTLZ/G_CTTZ The type passed to lower was invalid, so I'm not sure how this was even working before. The source and destination type also do not have to match, so make sure to use the right ones.	2020-02-07 06:54:12 -08:00
Jay Foad	57f65e8370	[GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister Summary: As a side effect some redundant copies of constant values are removed by CSEMIRBuilder. Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73789	2020-01-31 17:07:16 +00:00
Diogo Sampaio	69646a28e6	[ARM][Thumb2] Fix ADD/SUB invalid writes to SP Summary: This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80". The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable. To enforce that the instruction t2(ADD\|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP). Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations. When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant ) It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt). Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb Reviewed By: efriedma Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70680	2020-01-14 11:47:19 +00:00
Diogo Sampaio	4c96f162e8	Reverting, broke some bots. Need further investigation. Summary: This reverts commit 8c12769f3046029e2a9b4e48e1645b1a77d28650. Reviewers: Subscribers:	2020-01-10 13:40:41 +00:00
Diogo Sampaio	82699e5bd9	[ARM][Thumb2] Fix ADD/SUB invalid writes to SP Summary: This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80". The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable. To enforce that the instruction t2(ADD\|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP). Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations. When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant ) It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt). Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma Reviewed By: efriedma Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70680	2020-01-10 11:25:44 +00:00
Aditya Nandakumar	a264814e2a	[GlobalISel]: Allow targets to override how to widen constants during legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.	2019-12-03 10:41:10 -08:00
Daniel Sanders	7a5b72e3a3	[globalisel] Rename G_GEP to G_PTR_ADD Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734	2019-11-05 10:31:17 -08:00
Eli Friedman	1031830a08	[ARM] VFPv2 only supports 16 D registers. r361845 changed the way we handle "D16" vs. "D32" targets; there used to be a negative "d16" which removed instructions from the instruction set, and now there's a "d32" feature which adds instructions to the instruction set. This is good, but there was an oversight in the implementation: the behavior of VFPv2 was changed. In particular, the "vfp2" feature was changed to imply "d32". This is wrong: VFPv2 only supports 16 D registers. In practice, this means if you specify -mfpu=vfpv2, the compiler will generate illegal instructions. This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2" and "vfp2sp" so they don't imply "d32". Differential Revision: https://reviews.llvm.org/D67375 llvm-svn: 372186	2019-09-17 21:42:38 +00:00
Volkan Keles	e1dc48d1b9	[GlobalISel] CSEMIRBuilder: Add support for G_GEP Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070	2019-08-15 23:45:45 +00:00
Matt Arsenault	3482b4fef4	GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR llvm-svn: 368705	2019-08-13 15:52:21 +00:00
Matt Arsenault	284e8e1c63	GlobalISel: Change representation of shuffle masks Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704	2019-08-13 15:34:38 +00:00
Daniel Sanders	233ed83478	[globalisel] Add G_SEXT_INREG Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487	2019-08-09 21:11:20 +00:00
Diana Picus	54d3122693	[GlobalISel] Accept multiple vregs for lowerCall's args Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512	2019-06-27 09:18:03 +00:00
Diana Picus	0d78943f2c	[GlobalISel] Accept multiple vregs for lowerCall's result Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511	2019-06-27 09:15:53 +00:00
Diana Picus	7313013525	[GlobalISel] Accept multiple vregs in lowerFormalArgs Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510	2019-06-27 08:54:17 +00:00
Eli Friedman	ee12e24642	[ARM GlobalISel] Tests for s64 G_ADD and G_SUB. Forgot to commit these in r363989 (https://reviews.llvm.org/D63585) llvm-svn: 363991	2019-06-20 22:00:07 +00:00
Matt Arsenault	23c462a8b5	Rename ExpandISelPseudo->FinalizeISel, delay register reservation This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757	2019-06-19 00:25:39 +00:00
Simon Tatham	a1d7f2fdc1	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845	2019-05-28 16:13:20 +00:00
Diana Picus	6bce97dbb8	[ARM GlobalISel] Un-XFAIL some tests. NFC It turns out we support big endian now (probably since r332449, but I haven't bisected to confirm). llvm-svn: 361756	2019-05-27 10:32:34 +00:00
Diana Picus	d9cee1e985	[IRTranslator] Don't hardcode GEP index type When breaking up loads and stores of aggregates, the IRTranslator uses LLT::scalar(64) for the index type of the G_GEP instructions that compute the addresses. This is unnecessarily large for 32-bit targets. Use the int ptr type provided by the DataLayout instead. Note that we're already doing the right thing when translating getelementptr instructions from the IR. This is just an oversight when generating new ones while translating loads/stores. Both x86 and AArch64 already have tests confirming that the old behaviour is preserved for 64-bit targets. Differential Revision: https://reviews.llvm.org/D61852 llvm-svn: 360656	2019-05-14 09:25:17 +00:00
Diana Picus	c0879829c6	[ARM GlobalISel] Map DBG_VALUE for types != s32 ...and make sure we fail elegantly for unsupported values. s64 goes into DPR, anything <= 32 into GPR. llvm-svn: 360321	2019-05-09 09:49:36 +00:00
Tim Northover	72a3b6b9e2	ARM: disallow SP as Rn for Thumb2 TST & TEQ instructions Using SP in this position is unpredictable in ARMv7. CMP and CMN are not affected, and of course v8 relaxes this requirement, but that's handled elsewhere. llvm-svn: 360242	2019-05-08 10:59:08 +00:00
Diana Picus	86cf873810	[ARM GlobalISel] Widen G_SELECT operands ...except for the condition operand. llvm-svn: 360135	2019-05-07 11:39:30 +00:00
Diana Picus	9a5fab0d4a	[ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINT We actually have a couple of G_PTRTOINT to s8 when building clang, so we should do something about them. llvm-svn: 360130	2019-05-07 10:48:01 +00:00
Diana Picus	0dbf594bb0	[ARM GlobalISel] Widen G_GEP index operand llvm-svn: 360127	2019-05-07 10:11:57 +00:00
Diana Picus	e8b91df048	[ARM GlobalISel] Select extensions to < 32 bits Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in the exact same way as 32 bits. This overwrites the higher bits, but that should be ok since all legal users of types smaller than 32 bits ignore those bits anyway. llvm-svn: 359768	2019-05-02 09:28:00 +00:00
Diana Picus	3d1035c103	[ARM GlobalISel] Rename some inst selector tests. NFC Prepare to add support for extensions to types smaller than 32 bits. llvm-svn: 359767	2019-05-02 09:24:47 +00:00
Diana Picus	c8e9ac0231	[ARM GlobalISel] Legalize extensions to < 32 bits Make it legal to extend from e.g. s1 to s8 or s16. llvm-svn: 359766	2019-05-02 09:21:46 +00:00
Diana Picus	70251b88cb	[ARM GlobalISel] Widen small shift operands The legalizer was already widening the shift amount. Add tests for that behaviour, and also support widening the shifted value. llvm-svn: 359542	2019-04-30 09:24:43 +00:00
Diana Picus	2e869a616a	[ARM GlobalISel] Be more careful about bailing out Bail out on function arguments/returns with types aggregating an unsupported type. This fixes cases where we would happily and incorrectly lower functions taking e.g. [1 x i64] parameters, when we don't even support plain i64 yet. llvm-svn: 359540	2019-04-30 09:05:25 +00:00
Amara Emerson	5d4c0a60e2	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 llvm-svn: 358369	2019-04-15 05:04:20 +00:00
Diana Picus	6d99a106a0	[ARM GlobalISel] Select G_FCONSTANT for VFP3 Make it possible to TableGen code for FCONSTS and FCONSTD. We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively: * add GISelPredicateCode to check that the immediate fits in 8 bits; * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them. There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm. llvm-svn: 358063	2019-04-10 09:14:32 +00:00
Diana Picus	019cdf61f3	[ARM GlobalISel] Select G_FCONSTANT into pools Put all floating point constants into constant pools and load their values from there. llvm-svn: 358062	2019-04-10 09:14:24 +00:00
Diana Picus	2868e35d3e	[ARM GlobalISel] Map G_FCONSTANT llvm-svn: 358061	2019-04-10 09:14:16 +00:00
Amara Emerson	5b2864cb25	[GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able truncate it back. Likewise, when dealing with returns of these types, they need to be widened in the appropriate way back. Differential Revision: https://reviews.llvm.org/D60425 llvm-svn: 358032	2019-04-09 21:22:33 +00:00
Diana Picus	2c02c7f9b0	[ARM GlobalISel] Support DBG_VALUE Make sure we can map and select DBG_VALUE. llvm-svn: 357681	2019-04-04 10:24:51 +00:00
Diana Picus	e42279e7ac	[ARM GlobalISel] Run regbankselect test for Thumb. NFCI This should just work, since ARM mode and Thumb2 mode are at the same level of support now and should map the same to GPR and FPR. llvm-svn: 357159	2019-03-28 10:57:29 +00:00
Diana Picus	74bf817add	[ARM GlobalISel] Fix G_STORE with s1 G_STORE for 1-bit values uses a STRBi12, which stores the whole byte. Zero out the undefined bits before writing. llvm-svn: 357154	2019-03-28 09:09:36 +00:00
Diana Picus	284b4fed5e	[ARM GlobalISel] Fix selection of G_SELECT G_SELECT uses a 1-bit scalar for the condition, and is currently implemented with a plain CMPri against 0. This means that values such as 0x1110 are interpreted as true, when instead the higher bits should be treated as undefined and therefore ignored. Replace the CMPri with a TSTri against 0x1, which performs an implicit AND, yielding the expected result. llvm-svn: 357153	2019-03-28 09:09:27 +00:00
Diana Picus	d9476b9432	[ARM GlobalISel] 64-bit memops should be aligned We currently use only VLDR/VSTR for all 64-bit loads/stores, so the memory operands must be word-aligned. Mark aligned operations as legal and narrow non-aligned ones to 32 bits. While we're here, also mark non-power-of-2 loads/stores as unsupported. llvm-svn: 356872	2019-03-25 08:54:29 +00:00
Diana Picus	eb83e39778	[ARM GlobalISel] Support G_CTLZ for Thumb2 Same as ARM mode but with different opcode. llvm-svn: 355191	2019-03-01 10:12:28 +00:00
Diana Picus	b56a0077bb	[ARM GlobalISel] Check target flags in test. NFCI There was a time when we couldn't dump target-specific flags such as arm-sbrel etc, so the tests didn't check for them. We can now be more specific in our tests. llvm-svn: 355189	2019-03-01 10:01:22 +00:00
Diana Picus	5ee2be1ef2	[ARM GlobalISel] Support global variables for Thumb2 Add the same level of support as for ARM mode (i.e. still no TLS support). In most cases, it is sufficient to replace the opcodes with the t2-equivalent, but there are some idiosyncrasies that I decided to preserve because I don't understand the full implications: * For ARM we use LDRi12 to load from constant pools, but for Thumb we use t2LDRpci (I'm not sure if the ideal would be to use t2LDRi12 for Thumb as well, or to use LDRcp for ARM). * For Thumb we don't have an equivalent for MOV\|LDRLIT_ga_pcrel_ldr, so we have to generate MOV\|LDRLIT_ga_pcrel plus a load from GOT. The tests are in separate files because they're hard enough to read even without doubling the number of checks. llvm-svn: 355077	2019-02-28 10:42:47 +00:00
Diana Picus	b059d0c690	[ARM GlobalISel] Support floating point for Thumb2 This is exactly the same as arm mode, so for the instruction selector tests we just extract them to a new file and run with the same checks for both arm and thumb mode. For the legalizer we need to update the tests for soft float a bit, but only because BL and tBL are slightly different. We could be pedantic and check that we get a well-formed BL for arm mode and a tBL for thumb, but for the purposes of the legalizer test it's sufficient to just skip over the predicate operands in the checks. Also note that we have the pedantic checks in the divmod test, so we're covered. llvm-svn: 354665	2019-02-22 09:54:54 +00:00
Diana Picus	f043f25c01	[ARM GlobalISel] Support G_FRAME_INDEX for Thumb2 Same as arm mode. llvm-svn: 354579	2019-02-21 13:00:02 +00:00
Diana Picus	7078facb64	[ARM GlobalISel] Support G_PHI for Thumb2 Same as arm mode. llvm-svn: 354310	2019-02-19 10:26:47 +00:00

1 2 3 4 5 ...

257 Commits