llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Diana Picus	9bc7ab4910	[ARM GlobalISel] Support G_CONSTANT for Thumb2 All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610	2018-12-19 09:55:10 +00:00
Tim Northover	173f0949e5	FastIsel: take care to update iterators when removing instructions. We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointing at dangling memory. llvm-svn: 349365	2018-12-17 17:25:53 +00:00
Tim Northover	29b969dda5	ARM: use acquire/release instruction variants when available. These features (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features. llvm-svn: 349355	2018-12-17 15:05:32 +00:00
Diana Picus	47cdb935e0	[ARM GlobalISel] Thumb2: casts between int and ptr Mark as legal and add tests. Nothing special to do. llvm-svn: 349147	2018-12-14 13:45:38 +00:00
Diana Picus	36a62f59bc	[ARM GlobalISel] Minor refactoring. NFCI Refactor the ARMInstructionSelector to cache some opcodes in the constructor instead of checking all the time if we're in ARM or Thumb mode. llvm-svn: 349143	2018-12-14 12:37:24 +00:00
Diana Picus	1f523d85de	[ARM GlobalISel] Allow simple binary ops in Thumb2 Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2. Extract the legalizer tests for these opcodes into another file. Add tests for the instruction selector. llvm-svn: 349142	2018-12-14 11:58:14 +00:00
Diana Picus	5026fe4d53	[ARM GlobalISel] Support exts and truncs for Thumb2 Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM. llvm-svn: 349026	2018-12-13 12:06:54 +00:00
Diana Picus	8868d7ab14	[ARM GlobalISel] Select load/store for Thumb2 Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file. llvm-svn: 348920	2018-12-12 10:32:15 +00:00
Tim Northover	906154f438	ARM: use correct offset from base pointer (r6) in call frame regions. When we had dynamic call frames (i.e. sp adjustment around each call) we were including that adjustment into offsets calculated based on r6, even though it's only sp that changes. This led to incorrect stack slot accesses. llvm-svn: 348591	2018-12-07 13:43:55 +00:00
David Green	76448ad394	[Targets] Add errors for tiny and kernel codemodel on targets that don't support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585	2018-12-07 12:10:23 +00:00
Diana Picus	4a4e7c454b	[ARM GlobalISel] Nothing is legal for Thumb ...yet! A lot of the current code should be shared for arm and thumb mode, but until we add tests and work out some of the details (e.g. checking the correct subtarget feature for G_SDIV) it's safer to bail out as early as possible for thumb targets. This should have arguably been part of r348347, which allowed Thumb functions to be handled by the IR Translator. llvm-svn: 348472	2018-12-06 09:26:14 +00:00
Aditya Nandakumar	ba54e27cac	[GISel]: Provide standard interface to observe changes in GISel passes https://reviews.llvm.org/D54980 This provides a standard API across GISel passes to observe and notify passes about changes (insertions/deletions/mutations) to MachineInstrs. This patch also removes the recordInsertion method in MachineIRBuilder and instead provides method to setObserver. Reviewed by: vkeles. llvm-svn: 348406	2018-12-05 20:14:52 +00:00
Diana Picus	38eb1743f5	[ARM GlobalISel] Implement call lowering for Thumb2 The only things that are different from arm are: * different opcodes for calls and returns * Thumb calls take predicate operands llvm-svn: 348347	2018-12-05 10:35:28 +00:00
Tim Northover	5daefbd8b2	ARM: use target-specific SUBS node when combining cmp with cmov. This has two positive effects. First, using a custom node prevents recombination leading to an infinite loop since the output DAG is notionally a little more complex than the input one. Using a flag-setting instruction also allows the subtraction to be folded with the related comparison more easily. https://reviews.llvm.org/D53190 llvm-svn: 348122	2018-12-03 11:16:21 +00:00
Oliver Stannard	eb66331216	[ARM][MC] Move information about variadic register defs into tablegen Currently, variadic operands on an MCInst are assumed to be uses, because they come after the defs. However, this is not always the case, for example the Arm/Thumb LDM instructions write to a variable number of registers. This adds a property of instruction definitions which can be used to mark variadic operands as defs. This only affects MCInst, because MachineInstruction already tracks use/def per operand in each instance of the instruction, so can already represent this. This property can then be checked in MCInstrDesc, allowing us to remove some special cases in ARMAsmParser::isITBlockTerminator. Differential revision: https://reviews.llvm.org/D54853 llvm-svn: 348114	2018-12-03 10:32:42 +00:00
Oliver Stannard	a7553313be	[ARM][Asm] Debug trace for the processInstruction loop In the Arm assembly parser, we first match an instruction, then call processInstruction to possibly change it to a different encoding, to match rules in the architecture manual which can't be expressed by the table-generated matcher. This adds debug printing so that this process is visible when using the -debug option. To support this, I've added a new overload of MCInst::dump_pretty which takes the opcode name as a StringRef, since we don't have an InstPrinter instance in the assembly parser. Instead, we can get the same information directly from the MCInstrInfo. Differential revision: https://reviews.llvm.org/D54852 llvm-svn: 348113	2018-12-03 10:21:28 +00:00
Sjoerd Meijer	129cbf3900	[ARM] FP16: support vld1.16 for vector loads with post-increment Differential Revision: https://reviews.llvm.org/D55112 llvm-svn: 348110	2018-12-03 08:26:34 +00:00
Sjoerd Meijer	532a78148a	[ARM] Don't expand sdiv when optimising for minsize Don't expand SDIV with an immediate that is a power of 2 if we optimise for minimum code size. For example: sdiv %1, i32 4 gets expanded to a sequence of 3 instructions, but this is suboptimal for minimum code size so instead we just generate a MOV and a SDIV if integer division is supported. Differential Revision: https://reviews.llvm.org/D54546 llvm-svn: 347965	2018-11-30 08:14:28 +00:00
Jonas Devlieghere	48fe864a41	Produce an error on non-encodable offsets for darwin ARM scattered relocations. Scattered ARM relocations for Mach-O's only have 24 bits available to encode the offset. This is not checked but just truncated and can result in corrupt binaries after linking because the relocations are applied to the wrong offset. This patch will check and error out in those situations instead of emitting a wrong relocation. Patch by: Sander Bogaert (dzn) Differential revision: https://reviews.llvm.org/D54776 llvm-svn: 347922	2018-11-29 21:58:23 +00:00
Sterling Augustine	567aedab5a	Notify the linker when a TU compiled with split-stack has a function without a prologue. More context here: https://go-review.googlesource.com/c/go/+/148819/ llvm-svn: 347614	2018-11-26 23:26:31 +00:00
Diana Picus	7511d6e3fd	[ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF We can now select CLZ via the TableGen'erated code, so support G_CTLZ and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32. Legalizer: If the CLZ instruction is available, use it for both G_CTLZ and G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and lower G_CTLZ in terms of it. In order to achieve this we need to add support to the LegalizerHelper for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2). We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF if that is supported as a libcall, as opposed to just if it is Legal or Custom. Due to a minor refactoring of the helper function in charge of this, we will also allow the same behaviour for G_CTTZ and G_CTPOP. This is not going to be a problem in practice since we don't yet have support for treating G_CTTZ and G_CTPOP as libcalls (not even in DAGISel). Reg bank select: Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point. Instruction select: Nothing to do. llvm-svn: 347545	2018-11-26 11:07:02 +00:00
Sam Parker	2cf70e40fa	[ARM] Prevent parallel macs for unsigned values Both zext and sext are currently allowed during the search for narrow sequences and sexts operands are later added to the mac candidates. But operands of muls are also added, without checking whether they're sext or zext, which means we can generate a signed smlad when we shouldn't. Differential Revision: https://reviews.llvm.org/D54790 llvm-svn: 347542	2018-11-26 10:22:55 +00:00
Fangrui Song	a4304aa848	[ARM] Add dependency from ARMAsmParser to ARMAsmPrinter after r347494 This fixes -DBUILD_SHARED_LIBS=on llvm-svn: 347506	2018-11-23 23:43:46 +00:00
Oliver Stannard	878c27c4ea	[ARM][AsmParser] Improve debug printing of parsed asm operands In ARMOperand::print: - Print human-readable register names, instead of numbers. - Print the correct names for IT condition masks (these were in the wrong order before). - Print all parts of memory operands, not just the base register. This makes the output of llvm-mc -show-inst-operands more readable. Differential revision: https://reviews.llvm.org/D54850 llvm-svn: 347494	2018-11-23 14:27:21 +00:00
Sam Parker	9ada10a3ae	[ARM] Remove trunc sinks in ARM CGP Truncs are treated as sources if they produce a value of the same type as the one we are currently trying to promote. Truncs used to be considered as a sink if their operand was the same value type. We now allow smaller types in the search, so we should search through truncs that produce a smaller value. These truncs can then be converted to an AND mask. This leaves sinks as being: - points where the value in the register is being observed, such as an icmp, switch or store. - points where value types have to match, such as calls and returns. - zext are included to ease the transformation and are generally removed later on. During this change, it also became apparent that truncating sinks was broken: if a sink used a source, its type information had already been lost by the time the truncation happens. So I've changed the method of caching the type information. Differential Revision: https://reviews.llvm.org/D54515 llvm-svn: 347191	2018-11-19 11:34:40 +00:00
Eli Friedman	d24c268544	[ARM] Add MemOperand to LDRcp to enable DCE. LDRcp should be deleted when the dest register is dead in register coalescing. Without a MemOperand, a dead LDRcp leaves behind a dead constant pool value that references a non-existent label. Patch by Yin Ma. Differential Revision: https://reviews.llvm.org/D54173 llvm-svn: 346563	2018-11-09 23:09:17 +00:00
Sam Parker	dfb73cdc7f	[ARM] Don't promote i1 types in ARM CGP Now that we have mixed type sizes, i1 values need to be explicitly handled as we want to avoid promoting these values. Differential Revision: https://reviews.llvm.org/D54308 llvm-svn: 346499	2018-11-09 15:06:33 +00:00
Sam Parker	7ab006801a	[ARM] Enable mixed types in ARM CGP Previously, during the search, all values had to have the same 'TypeSize', which is equal to number of bits of the integer type of the icmp operand. All values in the tree had to match this size; meaning that, if we searched from i16, we wouldn't accept i8s. A change in type size requires zext and truncs to perform the casts so, to allow mixed narrow types, the handling of these instructions is now slightly different: - we allow casts if their result or operand is <= TypeSize. - zexts are sinks if their result > TypeSize. - truncs are still sinks if their operand == TypeSize. - truncs are still sources if their result == TypeSize. The transformation bails on finding an icmp that operates on data smaller than the current TypeSize. Differential Revision: https://reviews.llvm.org/D54108 llvm-svn: 346480	2018-11-09 09:28:27 +00:00
Sam Parker	beba9684fb	[ARM] Small reorganisation in ARMParallelDSP A few code movement things: - AreSymmetrical is now a method of BinOpChain. - Created a lambda in CreateParallelMACPairs to reduce loop nesting. - A Reduction object now gets passed in a couple of places instead, including CreateParallelMACPairs so it doesn't need to return a value. I've also added RecordSequentialLoads, which is run before the transformation begins, and caches the interesting loads. This can then be queried later instead of cross checking many load values. Differential Revision: https://reviews.llvm.org/D54254 llvm-svn: 346479	2018-11-09 09:18:00 +00:00
Petr Pavlu	d6dcfd1b38	[ARM] Enable spilling of the hGPR register class in Thumb2 Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and loadRegToStackSlot() to allow the GPR class or any of its sub-classes (including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12. Differential Revision: https://reviews.llvm.org/D51927 llvm-svn: 346401	2018-11-08 13:02:10 +00:00
Eli Friedman	5c366741bf	[ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering. The lowering was missing live-ins in certain cases, like a sequence of multiple tMOVCCr_pseudo instructions. This would lead to a verifier failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered. For reasons I don't completely understand, it's hard to get a sequence of multiple tMOVCCr_pseudo instructions; the issue only seems to show up with 64-bit comparisons where the result is zero-extended. I added some extra testcases in case that changes in the future. Probably some optimization opportunities here if anyone is interested. (@test_slt_not is the case that was getting miscompiled.) The code to check the liveness of CPSR was stolen from X86ISelLowering.cpp; maybe it could be refactored into a common helper, but I have no idea where to put it. Differential Revision: https://reviews.llvm.org/D54192 llvm-svn: 346355	2018-11-07 21:08:13 +00:00
Sam Parker	60779a051d	[ARM] Turn assert into condition in ARMCGP Turn the assert in PrepareConstants into a condition so that we can handle mul instructions with negative immediates. Differential Revision: https://reviews.llvm.org/D54094 llvm-svn: 346126	2018-11-05 11:26:04 +00:00
Sam Parker	c3a5f0c518	[ARM][ARMCGP] Remove unnecessary zexts and truncs r345840 slightly changed the way promotion happens which could result in zext and truncs having the same source and destination types. This fixes that issue. We can now also remove the zext and trunc in the following case: (zext (trunc (promoted op)), i32) This means that we can no longer treat a value that is only used by a sink as safe to promote. I've also added in some extra asserts and replaced a cast with a dyn_cast. Differential Revision: https://reviews.llvm.org/D54032 llvm-svn: 346125	2018-11-05 10:58:37 +00:00
Matthias Braun	85fed59b03	ARMExpandPseudoInsts: Fix CMP_SWAP expansion adding a kill flag to a def llvm-svn: 346026	2018-11-02 18:22:15 +00:00
Sam Parker	504c9c7ac2	[ARM] Attempt to fix ppc64be buildbot llvm-svn: 345850	2018-11-01 16:44:45 +00:00
Sam Parker	0b95995066	[ARM][CGP] Negative constant operand handling While mutating instructions, we sign extended negative constant operands for binary operators that can safely overflow. This was to allow instructions, such as add nuw i8 %a, -2, to still be able to perform a subtraction. However, the code to handle constants doesn't take into consideration that instructions, such as sub nuw i8 -2, %a, require the i8 -2 to be converted into i32 254. This is a relatively simple fix, but I've taken the time to reorganise the code a bit - mainly that instructions that can be promoted are cached and splitting up the Mutate function. Differential Revision: https://reviews.llvm.org/D53972 llvm-svn: 345840	2018-11-01 15:23:42 +00:00
Eli Friedman	ac1b7ad93f	[ARM] Add missing pseudo-instruction for Thumb1 RSBS. Shows up rarely for 64-bit arithmetic, more frequently for the compare patterns added in r325323. Differential Revision: https://reviews.llvm.org/D53848 llvm-svn: 345782	2018-10-31 21:45:48 +00:00
Dorit Nuzman	a2771a93ac	[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705	2018-10-31 09:57:56 +00:00
Eli Friedman	d2bd14842f	[ARM] Make InstrEmitter mark CPSR defs dead for Thumb1. The "dead" markings allow existing target-independent optimizations, like MachineSink, to trigger more frequently. The CPSR defs would have eventually been marked dead by LiveVariables, so this only affects optimizations before regalloc. The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible with this change: the transform adds a use to an otherwise dead def of CPSR. This is covered by existing regression tests. thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the generated code; I'll fix it in D53452. Differential Revision: https://reviews.llvm.org/D53453 llvm-svn: 345420	2018-10-26 19:32:24 +00:00
Sam Parker	e4b39c84d1	[ARM] Use Cortex-A57 sched model for Cortex-A72 This mirrors what we already do for AArch64 as the cores are similar. As discussed in the review, enabling the machine scheduler causes more variations in performance changes so it is not enabled for now. This patch improves LNT scores by a geomean of 1.57% at -O3. Differential Revision: https://reviews.llvm.org/D53562 llvm-svn: 345272	2018-10-25 15:08:29 +00:00
Simon Pilgrim	afbfdd6bd6	[TTI] Add generic SK_Broadcast shuffle costs I noticed while fixing PR39368 that we don't have generic shuffle costs for broadcast style shuffles. This patch adds SK_BROADCAST handling, but exposes ARM/AARCH64 lack of handling of this type, which I've added a fix for at the same time. Differential Revision: https://reviews.llvm.org/D53570 llvm-svn: 345253	2018-10-25 10:52:36 +00:00
Thomas Lively	e649dc137e	[NFC] Rename minnan and maxnan to minimum and maximum Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218	2018-10-24 22:49:55 +00:00
Peter Collingbourne	3f38e31261	ARM: Use BKPT instead of TRAP to implement llvm.debugtrap. The BKPT instruction is specified to cause a software breakpoint, and at least on Linux results in a SIGTRAP. This makes it more suitable for implementing debugtrap than TRAP (aka UDF #254), which is specified to cause an undefined instruction exception and results in a SIGILL on Linux. Moreover, BKPT is not marked as a terminator, which is not only consistent with the IR instruction but allows the analyzeBlock function to correctly analyze a basic block containing the instruction, which fixes an assertion failure in the machine block placement pass previously triggered by the included test case. Because BKPT is only supported starting with ARMv5T, we continue to use UDF #254 when targeting v4T. Differential Revision: https://reviews.llvm.org/D53614 llvm-svn: 345171	2018-10-24 18:10:38 +00:00
Saleem Abdulrasool	eca209183e	ARM: handle checking aliases with out-of-bounds GEPs A global alias may use indices which are not considered in bounds. In such a case, accessing the base object will fail as it only peers through inbounds accesses. This pattern is used by the swift compiler to create references to preceding members in the type metadata. This would cause the code generation to fail when targeting a platform that used ELF as the object file format. Be conservative and fail the read-only check if we run into an alias that we cannot peer through. llvm-svn: 345107	2018-10-24 00:00:52 +00:00
Simon Pilgrim	df74e03322	[TTI] Add generic cost handling of SK_Reverse shuffles These can be treated as a general permute. This required a fix for missing reverse patterns on ARM llvm-svn: 345015	2018-10-23 09:42:10 +00:00
Eli Friedman	ec411cfda2	Revert r344693 ("[ARM] bottom-top mul support in ARMParallelDSP") Still causing failures on the polly-aosp buildbot; I'll follow up with a reduced testcase. llvm-svn: 344752	2018-10-18 19:34:30 +00:00
Sam Parker	967e905c0d	[ARM] bottom-top mul support in ARMParallelDSP Previously reverted in rL343082. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 344693	2018-10-17 13:02:48 +00:00
Sjoerd Meijer	3e73839b39	[ARM] Do not fuse VADD and VMUL, continued (2/2) This is patch 2/2, following up on D53314, and is the functional change to prevent fusing mul + add sequences into VFMAs. Differential revision: https://reviews.llvm.org/D53315 llvm-svn: 344683	2018-10-17 10:05:44 +00:00
Sjoerd Meijer	e06ebcf09e	[ARM] Follow up of rL344671, attempt to pacify a buildbot It was rightfully complaining about an unpretty logical expression. llvm-svn: 344677	2018-10-17 07:51:24 +00:00
Sjoerd Meijer	dbb2ea77e4	[ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2) This is a follow up of rL342874, which stopped fusing muls and adds into VMLAs for performance reasons on the Cortex-M4 and Cortex-M33. This is a series of 2 patches that tries to achieve the same for VFMA. The second column in the table below shows what we were generating before rL342874, the third column what changed with rL342874, and the last column what we want to achieve with these 2 patches:

| Opt    | < rL342874 | >= rL342874 | these 2 patches |
|--------|------------|-------------|-----------------|
| -O3    | vmla       | vmul + vadd | vmul + vadd     |
| -Ofast | vfma       | vfma        | vmul + vadd     |
| -Oz    | vmla       | vmla        | vmla            |

This patch, 1/2, is a cleanup of the spaghetti predicate logic on the different VMLA and VFMA codegen rules, so that we can make the final functional change in patch 2/2. This also fixes a typo in the regression test added in rL342874. Differential revision: https://reviews.llvm.org/D53314 llvm-svn: 344671	2018-10-17 07:26:35 +00:00
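
Commit 532a78148a above mentions that `sdiv %1, i32 4` is expanded into a sequence of 3 instructions. The commit message does not spell that sequence out; the sketch below assumes the usual shift-based expansion (extract the sign, derive a rounding bias from it, then shift right) and models it on 32-bit values in Python. The names `sdiv_pow2` and `trunc_div` are hypothetical and do not appear in LLVM.

```python
def sdiv_pow2(x: int, k: int) -> int:
    """Signed 32-bit division of x by 2**k using shifts only (illustrative)."""
    assert 0 < k < 32 and -(1 << 31) <= x < (1 << 31)
    sign = x >> 31                          # arithmetic shift: -1 if x < 0, else 0
    bias = (sign & 0xFFFFFFFF) >> (32 - k)  # logical shift: 2**k - 1 if x < 0, else 0
    return (x + bias) >> k                  # biased arithmetic shift rounds toward zero


def trunc_div(x: int, d: int) -> int:
    """Reference result: division rounding toward zero, as sdiv does."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q


if __name__ == "__main__":
    for x in (-8, -7, -1, 0, 1, 7, 8):
        print(x, sdiv_pow2(x, 2), trunc_div(x, 4))
```

The bias is what makes negative inputs round toward zero rather than toward negative infinity; without it a plain arithmetic shift would turn -7 / 4 into -2.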


10103 Commits
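
Commit 0b95995066 above notes that for `sub nuw i8 -2, %a` the constant `i8 -2` has to become `i32 254` when the operation is promoted. A minimal sketch of why, modelling the i8 and i32 arithmetic with masks in Python; the helper names are hypothetical and this is not the ARM CGP implementation.

```python
MASK8 = 0xFF
MASK32 = 0xFFFFFFFF


def zext_i8_to_i32(v: int) -> int:
    """Zero-extend the low 8 bits of v to a 32-bit value."""
    return v & MASK8


def sext_i8_to_i32(v: int) -> int:
    """Sign-extend the low 8 bits of v to a 32-bit bit pattern."""
    v &= MASK8
    return (v - 0x100 if v & 0x80 else v) & MASK32


def sub_i8(lhs: int, rhs: int) -> int:
    """Wrapping 8-bit subtraction."""
    return (lhs - rhs) & MASK8


def sub_i32(lhs: int, rhs: int) -> int:
    """Wrapping 32-bit subtraction."""
    return (lhs - rhs) & MASK32


if __name__ == "__main__":
    a = 3
    print(sub_i8(-2, a))                       # narrow result
    print(sub_i32(zext_i8_to_i32(-2), a))      # promoted, zero-extended constant
    print(hex(sub_i32(sext_i8_to_i32(-2), a))) # promoted, sign-extended constant
```

With the zero-extended constant the promoted subtraction agrees with the narrow one; with the sign-extended constant the upper 24 bits of the result are set, so a later comparison on the 32-bit value would see a different number.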