llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

Author	SHA1	Message	Date
David Green	2cffd5437a	[ARM] Generate 8.1-m CSINC, CSNEG and CSINV instructions. Arm 8.1-M adds a number of related CSEL instructions, including CSINC, CSNEG and CSINV. These choose between two values given the content in CPSR and a condition, performing an increment, negation or inverse of the false value. This adds some selection for them, either from constant values or patterns. It does not include CSEL directly, which is currently not always making code better. It is still useful, but we will have to check more carefully where it should and shouldn't be used. Code by Ranjeet Singh and Simon Tatham, with some modifications from me. Differential revision: https://reviews.llvm.org/D66483 llvm-svn: 370739	2019-09-03 10:53:07 +00:00
Jian Cai	fa10d00255	Reland "[ARM] push LR before __gnu_mcount_nc" This relands r369147 with fixes to unit tests. https://reviews.llvm.org/D65019 llvm-svn: 369173	2019-08-16 23:30:16 +00:00
Jian Cai	f9176646df	Revert "[ARM] push LR before __gnu_mcount_nc" This reverts commit f4cf3b959333f62b7a7b2d7771f7010c9d8da388. llvm-svn: 369149	2019-08-16 20:40:21 +00:00
Jian Cai	c24f6bc936	[ARM] push LR before __gnu_mcount_nc Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32. Differential Revision: https://reviews.llvm.org/D65019 llvm-svn: 369147	2019-08-16 20:21:08 +00:00
David Green	c40c592399	[ARM] Don't pretend we know how to generate MVE VLDn We don't yet know how to generate these instructions for MVE. And in the case of VLD3, we don't even have the instruction. For the moment don't tell the vectoriser that we have VLD4, just to end up serialising the results. Differential Revision: https://reviews.llvm.org/D66009 llvm-svn: 369101	2019-08-16 13:06:49 +00:00
Eli Friedman	910badf6bd	[ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1. This is extremely specific, but saves three instructions when it's legal. I don't think the code can be usefully generalized. Differential Revision: https://reviews.llvm.org/D65351 llvm-svn: 367492	2019-07-31 23:19:21 +00:00
David Green	e0ad51cc5a	[ARM] Rewrite how VCMP are lowered, using a single node This removes the VCEQ/VCNE/VCGE/VCEQZ/etc nodes, just using two called VCMP and VCMPZ with an extra operand as the condition code. I believe this will make some combines simpler, allowing us to just look at these codes and not the operands. It also helps fill in a missing VCGTUZ MVE selection without adding extra nodes for it. Differential Revision: https://reviews.llvm.org/D65072 llvm-svn: 366934	2019-07-24 17:36:47 +00:00
David Green	2ff5f4ceeb	[ARM] Better OR's for MVE compares This adds a DeMorgan combine for OR's of compares to turn them into AND's, helping prevent them from going into and out of gpr registers. It also fills in the VCLE and VCLT nodes that MVE can select, allowing it to invert more compares. Differential Revision: https://reviews.llvm.org/D65059 llvm-svn: 366920	2019-07-24 16:42:09 +00:00
David Green	2a0f923462	[ARM] MVE predicate register support This adds support code for building and shuffling i1 predicate registers. It generally uses two basic principles, either converting the predicate into an scalar (through a PREDICATE_CAST) and doing scalar operations on it there, or by converting the register to an full vector register and back. Some of the code here is a not super efficient but will hopefully cover most cases of moving i1 vectors around and can be improved in subsequent patches. Some code by David Sherwood. Differential Revision: https://reviews.llvm.org/D65052 llvm-svn: 366890	2019-07-24 11:51:36 +00:00
David Green	8eb4894ad4	[ARM] MVE integer compares and selects This adds the very basics for MVE vector predication, adding integer VCMP and VSEL instruction support. This is done through predicate registers (MVT::v16i1, MVT::v8i1, MVT::v4i1), but otherwise using same mechanics as NEON to custom lower setcc's through ARMISD::VCXX nodes (VCEQ, VCGT, VCEQZ, etc). An extra VCNE was added, as this can be handled sensibly by MVE's expanded number of VCMP condition codes. (There are also VCLE and VCLT which are added later). VPSEL is also added here, simply selecting on the vselect. Original code by David Sherwood. Differential Revision: https://reviews.llvm.org/D65051 llvm-svn: 366885	2019-07-24 11:08:14 +00:00
Sam Parker	9e06b6a0db	[ARM][LowOverheadLoops] Fix branch target codegen While lowering test.set.loop.iterations, it wasn't checked how the brcond was using the result and so the wls could branch to the loop preheader instead of not entering it. The same was true for loop.decrement.reg. So brcond and br_cc and now lowered manually when using the hwloop intrinsics. During this we now check whether the result has been negated and whether we're using SETEQ or SETNE and 0 or 1. We can then figure out which basic block the WLS and LE should be targeting. Differential Revision: https://reviews.llvm.org/D64616 llvm-svn: 366809	2019-07-23 14:08:46 +00:00
David Green	6fde6fb1ae	[ARM] Rename NEONModImm to VMOVModImm. NFC Rename NEONModImm to VMOVModImm as it is used in both NEON and MVE. llvm-svn: 366790	2019-07-23 09:19:24 +00:00
David Green	8241a13714	[ARM] Adjust how NEON shifts are lowered This adjusts the way that we lower NEON shifts to use a DAG target node, not via a neon intrinsic. This is useful for handling MVE shifts operations in the same the way. It also renames some of the immediate shift nodes for consistency, and moves some of the processing of immediate shifts into LowerShift allowing it to capture more cases. Differential Revision: https://reviews.llvm.org/D64426 llvm-svn: 366051	2019-07-15 10:44:50 +00:00
Martin Storsjo	caf16fb54f	[ARM] Add support for MSVC stack cookie checking Heavily based on the same for AArch64, from SVN r346469. Differential Revision: https://reviews.llvm.org/D64292 llvm-svn: 365283	2019-07-07 18:57:31 +00:00
David Green	1676daeb17	[ARM] MVE VMOV immediate handling This adds some handling for VMOVimm, using the same method that NEON uses. We create VMOVIMM/VMVNIMM/VMOVFPIMM nodes based on the immediate, and select them using the now renamed ARMvmovImm/etc. There is also an extra 64bit immediate mode that I have not yet added here. Code by David Sherwood Differential Revision: https://reviews.llvm.org/D63884 llvm-svn: 365178	2019-07-05 10:02:43 +00:00
Roman Lebedev	57458dafa4	[Codegen][X86][AArch64][ARM][PowerPC] Inc-of-add vs sub-of-not (PR42457) Summary: This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]]. In middle-end, we'd want to prefer the form with two adds - D63992, but as this diff shows, not every target will prefer that pattern. Out of 4 targets for which i added tests all seem to be ok with inc-of-add for scalars, but only X86 prefer that same pattern for vectors. Here i'm adding a new TLI hook, always defaulting to the inc-of-add, but adding AArch64,ARM,PowerPC overrides to prefer inc-of-add only for scalars. Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel Reviewed By: efriedma Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64090 llvm-svn: 365010	2019-07-03 09:41:35 +00:00
Simon Tatham	b7791e7f2e	[ARM] MVE: allow soft-float ABI to pass vector types. Passing a vector type over the soft-float ABI involves it being split into four GPRs, so the first thing that has to happen at the start of the function is to recombine those into a vector register. The ABI types all vectors as v2f64, so we need to support BUILD_VECTOR for that type, which I do in this patch by allowing it to be expanded in terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a returned vector can be marshalled back into GPRs. While I'm here, I've also added ISD::UNDEF to the list of operations we turn back on in `setAllExpand`, because I noticed that otherwise it gets expanded into a BUILD_VECTOR with explicit zero inputs, leading to pointless machine instructions to zero out a vector register that's about to have every lane overwritten of in any case. Reviewers: dmgreen, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63937 llvm-svn: 364910	2019-07-02 11:26:11 +00:00
Simon Tatham	4152cccce8	[ARM] Stop using scalar FP instructions in integer-only MVE mode. If you compile with `-mattr=+mve` (enabling integer MVE instructions but not floating-point ones), then the scalar FP //registers// exist and it's legal to move things in and out of them, load and store them, but it's not legal to do arithmetic on them. In D60708, the calls to `addRegisterClass` in ARMISelLowering that enable use of the scalar FP registers became conditionalised on `Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so that loads, stores and moves of those registers would work. But I didn't realise that that would also enable all the operations on those types by default. Now, if the target doesn't have basic VFP, we follow up those `addRegisterClass` calls by turning back off all the nontrivial operations you can perform on f32 and f64. That causes several knock-on failures, which are fixed by allowing the `VMOVDcc` and `VMOVScc` instructions to be selected even if all you have is `HasFPRegs`, and adjusting several checks for 'is this a double in a single-precision-only world?' to the more general 'is this any FP type we can't do arithmetic on?'. Between those, the whole of the `float-ops.ll` and `fp16-instructions.ll` tests can now run in MVE-without-FP mode and generate correct-looking code. One odd side effect is that I had to relax the check lines in that test so that they permit test functions like `add_f` to be generated as tailcalls to software FP library functions, instead of ordinary calls. Doing that is entirely legal, but the mystery is why this is the first RUN line that's needed the relaxation: on the usual kind of non-FP target, no tailcalls ever seem to be generated. Going by the llc messages, I think `SoftenFloatResult` must be perturbing the code generation in some way, but that's as much as I can guess. Reviewers: dmgreen, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63938 llvm-svn: 364909	2019-07-02 11:26:00 +00:00
Sam Parker	5f8f6e4068	[ARM] WLS/LE Code Generation Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733	2019-07-01 08:21:28 +00:00
Sam Tebbs	9cb186cbd7	[ARM] Add support for the MVE long shift instructions MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers. The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl. test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions. Differential Revision: https://reviews.llvm.org/D63430 llvm-svn: 364654	2019-06-28 15:43:31 +00:00
David Green	a82cfd92f0	[ARM] Widening loads and narrowing stores MVE has instructions to widen as it loads, and narrow as it stores. This adds the required patterns and legalisation to make them work including specifying that they are legal, patterns to select them and test changes. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63839 llvm-svn: 364636	2019-06-28 09:47:55 +00:00
David Green	3cb5671eb3	[ARM] MVE vector shuffles This patch adds necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like upcoming patches). This mostly uses the same code from Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific. Most code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63567 llvm-svn: 364626	2019-06-28 07:08:42 +00:00
Simon Tatham	cf702a78f5	[ARM] Code-generation infrastructure for MVE. This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and if you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329	2019-06-25 16:48:46 +00:00
Simon Pilgrim	b087bfb569	[TargetLowering] Add MachineMemOperand::Flags to allowsMemoryAccess tests (PR42123) As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space. This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them. If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores. Differential Revision: https://reviews.llvm.org/D63075 llvm-svn: 363179	2019-06-12 17:14:03 +00:00
Diogo N. Sampaio	23ccb5de58	[ARM][FIX] Ran out of registers due tail recursion Summary: - pr42062 When compiling for MinSize, ARMTargetLowering::LowerCall decides to indirect multiple calls to a same function. However, it disconsiders the limitation that thumb1 indirect calls require the callee to be in a register from r0 to r3 (llvm limiation). If all those registers are used by arguments, the compiler dies with "error: run out of registers during register allocation". This patch tells the function IsEligibleForTailCallOptimization if we intend to perform indirect calls, as to avoid tail call optimization. Reviewers: dmgreen, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62683 llvm-svn: 362366	2019-06-03 08:58:05 +00:00
Alexander Timofeev	96c0cb0bff	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was mlformed patch. Build failure fixed. llvm-svn: 361741	2019-05-26 20:33:26 +00:00
Peter Collingbourne	855b0e1cfe	Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688	2019-05-25 01:52:38 +00:00
Alexander Timofeev	6481982aa2	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644	2019-05-24 15:32:18 +00:00
Sjoerd Meijer	8238653a0e	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537	2019-04-30 08:38:12 +00:00
Simon Pilgrim	1af5220084	[TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526	2019-04-16 20:57:28 +00:00
Simon Pilgrim	3be8fb12c8	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468	2019-03-19 16:24:55 +00:00
Adhemerval Zanella	3f35ea2992	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389	2019-03-18 18:40:07 +00:00
Florian Hahn	5351cc7086	[ARM] Sink zext/sext operands for add and sub to enable vsubl generation. This uses the infrastructure added in rL353152 to sink zext and sexts to sub/add users, to enable vsubl/vaddl generation when NEON is available. See https://bugs.llvm.org/show_bug.cgi?id=40025. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D58063 llvm-svn: 355460	2019-03-06 00:10:03 +00:00
Sam Parker	81bbef609c	[ARM] Add OptMinSize to ARMSubtarget In many places in the backend, we like to know whether we're optimising for code size and this is performed by checking the current machine function attributes. A subtarget is created on a per-function basis, so it's possible to know when we're compiling for code size on construction so record this in the new object. Differential Revision: https://reviews.llvm.org/D57812 llvm-svn: 353501	2019-02-08 07:57:42 +00:00
Sjoerd Meijer	068d715728	[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS And instead just generate a libcall. My motivating example on ARM was a simple: shl i64 %A, %B for which the code bloat is quite significant. For other targets that also accept __int128/i128 such as AArch64 and X86, it is also beneficial for these cases to generate a libcall when optimising for minsize. On these 64-bit targets, the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS lowering operation action is not set to custom/expand. Differential Revision: https://reviews.llvm.org/D57386 llvm-svn: 352736	2019-01-31 08:07:30 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Eli Friedman	d7ff7fc94b	[ARM] Complete the Thumb1 shift+and->shift+shift transforms. This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857	2018-12-20 23:39:54 +00:00
Tim Northover	5daefbd8b2	ARM: use target-specific SUBS node when combining cmp with cmov. This has two positive effects. First, using a custom node prevents recombination leading to an infinite loop since the output DAG is notionally a little more complex than the input one. Using a flag-setting instruction also allows the subtraction to be folded with the related comparison more easily. https://reviews.llvm.org/D53190 llvm-svn: 348122	2018-12-03 11:16:21 +00:00
Sjoerd Meijer	532a78148a	[ARM] Don't expand sdiv when optimising for minsize Don't expand SDIV with an immediate that is a power of 2 if we optimise for minimum code size. For example: sdiv %1, i32 4 gets expanded to a sequence of 3 instructions, but this is suboptimal for minimum code size so instead we just generate a MOV and a SDIV if integer division is supported. Differential Revision: https://reviews.llvm.org/D54546 llvm-svn: 347965	2018-11-30 08:14:28 +00:00
Alex Bradbury	b88141f5a6	[AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550	2018-09-19 14:51:42 +00:00
Tim Northover	9530f416eb	ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4. The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127	2018-09-13 10:28:05 +00:00
Eli Friedman	8b94932846	[ARM] Enable GEP offset splitting for 32-bit ARM. It has essentially the same benefit it has on 64-bit ARM: it substantially reduces the number of constants used by large GEP operations. Seems to be generally helpful across a few different codebases I've tried. Differential Revision: https://reviews.llvm.org/D51462 llvm-svn: 341136	2018-08-30 22:18:27 +00:00
Eli Friedman	23b65e265f	[ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly. Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734	2018-08-14 22:10:25 +00:00
Eli Friedman	251919f579	[ARM] Adjust AND immediates to make them cheaper to select. LLVM normally prefers to minimize the number of bits set in an AND immediate, but that doesn't always match the available ARM instructions. In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer a two-instruction sequence movs+ands or movs+bics. Some potential improvements outlined in ARMTargetLowering::targetShrinkDemandedConstant, but seems to work pretty well already. The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX instruction due to a larger-than-expected mask. (It's orthogonal, in some sense, but as far as I can tell it's either impossible or nearly impossible to reproduce the bug without this change.) According to my testing, this seems to consistently improve codesize by a small amount by forming bic more often for ISD::AND with an immediate. Differential Revision: https://reviews.llvm.org/D50030 llvm-svn: 339472	2018-08-10 21:21:53 +00:00
Tim Northover	9ef696c849	ARM: don't try to over-align large vectors as arguments. By default LLVM thinks very large vectors get aligned to their size when passed across functions. Unfortunately no-one told the ARM backend so it doesn't trigger stack realignment and so accesses can cause the usual misalignment issues (e.g. a data abort). This changes the ABI alignment to the stack alignment, which in practice (and as a bonus) also coincides with the alignment "natural" vectors get. llvm-svn: 331451	2018-05-03 12:54:25 +00:00
Adrian Prantl	076a6683eb	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Craig Topper	0681be8ccc	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
David Blaikie	6fc7a3075e	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	938d7fe477	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
Christof Douma	d98f4487e4	[ARM] Support float literals under XO When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into a integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313	2018-03-23 13:02:03 +00:00

1 2 3 4 5 ...

462 Commits