llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

Author	SHA1	Message	Date
Tim Northover	173f0949e5	FastIsel: take care to update iterators when removing instructions. We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointint at dangling memory. llvm-svn: 349365	2018-12-17 17:25:53 +00:00
Alexandros Lamprineas	ff2cabbc44	[AArch64] Re-run load/store optimizer after aggressive tail duplication The Load/Store Optimizer runs before Machine Block Placement. At O3 the Tail Duplication Threshold is set to 4 instructions and this can create new opportunities for the Load/Store Optimizer. It seems worthwhile to run it once again. llvm-svn: 349338	2018-12-17 10:45:43 +00:00
Evandro Menezes	6f096d4f5a	[AArch64] Simplify the scheduling predicates (NFC) The instruction encodings make it unnecessary to distinguish extended W-form from X-form instructions. llvm-svn: 349185	2018-12-14 20:04:58 +00:00
Evandro Menezes	6b61a418d6	[AArch64] Fix Exynos predicates (NFC) Fix the logic in the definition of the `ExynosShiftExPred` as a more specific version of `ExynosShiftPred`. But, since `ExynosShiftExPred` is not used yet, this change has NFC. llvm-svn: 349091	2018-12-13 23:19:46 +00:00
Arnaud A. de Grandmaison	9ed5c53397	[AArch64] Catch some more CMN opportunities. Fixes https://bugs.llvm.org/show_bug.cgi?id=33486 llvm-svn: 349022	2018-12-13 10:31:32 +00:00
Mandeep Singh Grang	cb7f7e69ee	[COFF, ARM64] Emit COFF function header Summary: Emit COFF header when printing out the function. This is important as the header contains two important pieces of information: the storage class for the symbol and the symbol type information. This bit of information is required for the linker to correctly identify the type of symbol that it is dealing with. This patch mimics X86 and ARM COFF behavior for function header emission. Reviewers: rnk, mstorsjo, compnerd, TomTan, ssijaric Reviewed By: mstorsjo Subscribers: dmajor, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55535 llvm-svn: 348875	2018-12-11 18:36:14 +00:00
Aditya Nandakumar	4750d08150	[GISel]: Refactor MachineIRBuilder to allow passing additional parameters to build Instrs https://reviews.llvm.org/D55294 Previously MachineIRBuilder::buildInstr used to accept variadic arguments for sources (which were either unsigned or MachineInstrBuilder). While this worked well in common cases, it doesn't allow us to build instructions that have multiple destinations. Additionally passing in other optional parameters in the end (such as flags) is not possible trivially. Also a trivial call such as B.buildInstr(Opc, Reg1, Reg2, Reg3) can be interpreted differently based on the opcode (2defs + 1 src for unmerge vs 1 def + 2srcs). This patch refactors the buildInstr to buildInstr(Opc, ArrayRef<DstOps>, ArrayRef<SrcOps>) where DstOps and SrcOps are typed unions that know how to add itself to MachineInstrBuilder. After this patch, most invocations would look like B.buildInstr(Opc, {s32, DstReg}, {SrcRegs..., SrcMIBs..}); Now all the other calls (such as buildAdd, buildSub etc) forward to buildInstr. It also makes it possible to build instructions with multiple defs. Additionally in a subsequent patch, we should make it possible to add flags directly while building instructions. Additionally, the main buildInstr method is now virtual and other builders now only have to override buildInstr (for say constant folding/cseing) is straightforward. Also attached here (https://reviews.llvm.org/F7675680) is a clang-tidy patch that should upgrade the API calls if necessary. llvm-svn: 348815	2018-12-11 00:48:50 +00:00
Amara Emerson	97cf0a563b	[GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes. This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788	2018-12-10 18:44:58 +00:00
Evandro Menezes	c839f35d10	[AArch64] Refactor the Exynos scheduling predicates Refactor the scheduling predicates based on `MCInstPredicate`. In this case, for the Exynos processors. Differential revision: https://reviews.llvm.org/D55345 llvm-svn: 348774	2018-12-10 17:17:26 +00:00
Evandro Menezes	96ed90a002	[AArch64] Refactor the scheduling predicates Refactor the scheduling predicates based on `MCInstPredicate`. Augment the number of helper predicates used by processor specific predicates. Differential revision: https://reviews.llvm.org/D55375 llvm-svn: 348768	2018-12-10 16:24:30 +00:00
David Green	76448ad394	[Targets] Add errors for tiny and kernel codemodel on targets that don't support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585	2018-12-07 12:10:23 +00:00
Evandro Menezes	ee698790d2	[AArch64] Fix Exynos predicate Fix predicate for arithmetic instructions with shift and/or extend. llvm-svn: 348510	2018-12-06 18:25:37 +00:00
Diogo N. Sampaio	cd73ce3566	[NFC][AArch64] Split out backend features This patch splits backend features currently hidden behind architecture versions. For example, currently the only way to activate complex numbers extension is targeting an v8.3 architecture, where after the patch this extension can be added separately. This refactoring is required by the new command lines proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html Reviewers: DavidSpickett, olista01, t.p.northover Subscribers: kristof.beyls, bryanpkc, javed.absar, pbarrio Differential revision: https://reviews.llvm.org/D54633 -- It was reverted in rL348249 due a build bot failure in one of the regression tests: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/14386 The problem seems to be that FileCheck behaves different in windows and linux. This new patch splits the test file in multiple, and does more exact pattern matching attempting to circumvent the issue. llvm-svn: 348493	2018-12-06 15:39:17 +00:00
Matthias Braun	8707b0918b	AArch64: Fix invalid CCMP emission The code emitting AND-subtrees used to check whether any of the operands was an OR in order to figure out if the result needs to be negated. However the OR could be hidden in further subtrees and not immediately visible. Change the code so that canEmitConjunction() determines whether the result of the generated subtree needs to be negated. Cleanup emission logic to use this. I also changed the code a bit to make all negation decisions early before we actually emit the subtrees. This fixes http://llvm.org/PR39550 Differential Revision: https://reviews.llvm.org/D54137 llvm-svn: 348444	2018-12-06 01:40:23 +00:00
Aditya Nandakumar	ba54e27cac	[GISel]: Provide standard interface to observe changes in GISel passes https://reviews.llvm.org/D54980 This provides a standard API across GISel passes to observe and notify passes about changes (insertions/deletions/mutations) to MachineInstrs. This patch also removes the recordInsertion method in MachineIRBuilder and instead provides method to setObserver. Reviewed by: vkeles. llvm-svn: 348406	2018-12-05 20:14:52 +00:00
Evandro Menezes	b2f8f727c4	[AArch64] Reword description of feature (NFC) Reword the description of the feature that enables custom handling of cheap instructions. llvm-svn: 348398	2018-12-05 18:42:57 +00:00
Saleem Abdulrasool	dc3c5dd6f9	AArch64: support funclets in fastcall and swift_call Functions annotated with `__fastcall` or `__attribute__((__fastcall__))` or `__attribute__((__swiftcall__))` may contain SEH handlers even on Win64. This matches the behaviour of cl which allows for `__try`/`__except` inside a `__fastcall` function. This was detected while trying to self-host clang on Windows ARM64. llvm-svn: 348337	2018-12-05 07:09:20 +00:00
Amara Emerson	84e4cfc565	[AArch64][GlobalISel] Re-enable selection of volatile loads. We previously disabled this in r323371 because of a bug where we selected an extending load, but didn't delete the old G_LOAD, resulting in two loads being generated for volatile loads. Since we now have dedicated G_SEXTLOAD/G_ZEXTLOAD operations, and that the tablegen patterns should no longer be able to select (ext(load x)) patterns, it should be safe to re-enable it. The old test case should still work as expected. llvm-svn: 348320	2018-12-05 00:03:09 +00:00
Saleem Abdulrasool	f28e4aa042	AArch64: clean up some whitespace in Windows CC (NFC) Drive by clean up for Windows ARM64 variadic CC (NFC). llvm-svn: 348310	2018-12-04 22:19:29 +00:00
Simon Pilgrim	dbfdba9405	Fix -Wparentheses warning. NFCI. llvm-svn: 348254	2018-12-04 12:24:10 +00:00
Simon Pilgrim	5f2f923973	Revert rL348121 from llvm/trunk: [NFC][AArch64] Split out backend features This patch splits backend features currently hidden behind architecture versions. For example, currently the only way to activate complex numbers extension is targeting an v8.3 architecture, where after the patch this extension can be added separately. This refactoring is required by the new command lines proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html Reviewers: DavidSpickett, olista01, t.p.northover Subscribers: kristof.beyls, bryanpkc, javed.absar, pbarrio Differential revision: https://reviews.llvm.org/D54633 ........ This has been causing buildbots failures for the past 24 hours: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/14386 llvm-svn: 348249	2018-12-04 10:55:48 +00:00
Sanjin Sijaric	96650a3083	[ARM64][Windows] Fix local stack size for funclets The comment was misplaced, and the code didn't do what the comment indicated, namely ignoring the varargs portion when computing the local stack size of a funclet in emitEpilogue. This results in incorrect offset computations within funclets that are contained in vararg functions. Differential Revision: https://reviews.llvm.org/D55096 llvm-svn: 348222	2018-12-04 00:54:52 +00:00
Jessica Paquette	387345645f	[MachineOutliner] Move stack instr check logic to getOutliningCandidateInfo This moves the stack check logic into a lambda within getOutliningCandidateInfo. This allows us to be less conservative with stack checks. Whether or not a stack instruction is safe to outline is dependent on the frame variant and call variant of the outlined function; only in cases where we modify the stack can these be unsafe. So, if we move that logic later, when we're looking at an individual candidate, we can make better decisions here. This gives some code size savings as a result. llvm-svn: 348220	2018-12-04 00:31:55 +00:00
Jessica Paquette	7e1eb1b5c3	[MachineOutliner][AArch64][NFC] Add early exit to candidate discarding logic If we dropped too many candidates to be beneficial when dropping candidates that modify the stack, there's no reason to check for other cost model qualities. llvm-svn: 348219	2018-12-04 00:31:47 +00:00
Jessica Paquette	7d1c45a131	[MachineOutliner] Drop candidates that require fixups if it's beneficial If it's a bigger code size win to drop candidates that require stack fixups than to demote every candidate to that variant, the outliner should do that. This happens if the number of bytes taken by calls to functions that don't require fixups, plus the number of bytes that'd be left is less than the number of bytes that it'd take to emit a save + restore for all candidates. Also add tests for each possible new behaviour. - machine-outliner-compatible-candidates shows that when we have candidates that don't use the stack, we can use the default call variant along with the no save/regsave variant. - machine-outliner-all-stack shows that when it's better to fix up the stack, we still will demote all candidates to that case - machine-outliner-drop-stack shows that we can discard candidates that require stack fixups when it would be beneficial to do so. llvm-svn: 348168	2018-12-03 19:11:27 +00:00
Pablo Barrio	72d3164a16	[AArch64] Add command-line option for SSBS Summary: SSBS (Speculative Store Bypass Safe) is only mandatory from 8.5 onwards but is optional from Armv8.0-A. This patch adds a command line option to enable SSBS, as it was previously only possible to enable by selecting -march=armv8.5-a. Similar patch upstream in GNU binutils: https://sourceware.org/ml/binutils/2018-09/msg00274.html Reviewers: olista01, samparker, aemerson Reviewed By: samparker Subscribers: javed.absar, kristof.beyls, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D54629 llvm-svn: 348137	2018-12-03 14:00:47 +00:00
Diogo N. Sampaio	61a6678d57	[NFC][AArch64] Split out backend features This patch splits backend features currently hidden behind architecture versions. For example, currently the only way to activate complex numbers extension is targeting an v8.3 architecture, where after the patch this extension can be added separately. This refactoring is required by the new command lines proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html Reviewers: DavidSpickett, olista01, t.p.northover Subscribers: kristof.beyls, bryanpkc, javed.absar, pbarrio Differential revision: https://reviews.llvm.org/D54633 llvm-svn: 348121	2018-12-03 11:08:13 +00:00
Jessica Paquette	c877e03376	[MachineOutliner][AArch64] Improve checks for stack instructions If we know that we'll definitely save LR to a register, there's no reason to pre-check whether or not a stack instruction is unsafe to fix up. This makes it so that we check for that condition before mapping instructions. This allows us to outline more, since we don't pessimise as many instructions. Also update some tests, since we outline more. llvm-svn: 348081	2018-12-01 21:24:06 +00:00
Jessica Paquette	2534c6723e	[MachineOutliner] Outline both register save calls + no LR save calls together Instead of treating the outlined functions for these as distinct frames, they should be combined into one case. Neither allows for stack fixups, and both generate the same frame. Thus, they ought to be considered one case. This makes the code far easier to understand, for one thing. It also offers some small code size improvements. It's fairly rare to see a class of outlined functions that doesn't fall entirely into one variant (on CTMark anyway). It does happen from time to time though. This mostly offers some serious simplification. Also update the test to show the added functionality. llvm-svn: 348036	2018-11-30 21:14:58 +00:00
Peter Collingbourne	64dd39903c	AArch64: Don't emit CFI for SCS register in nounwind functions. All that you can legitimately do with the CFI for a nounwind function is get a backtrace, and adjusting the SCS register is not (currently) required for this purpose. Differential Revision: https://reviews.llvm.org/D54988 llvm-svn: 348035	2018-11-30 21:04:25 +00:00
Francis Visoiu Mistrih	b1262b3845	[MachineScheduler] Order FI-based memops based on stack direction It makes more sense to order FI-based memops in descending order when the stack goes down. This allows offsets to stay "consecutive" and allow easier pattern matching. llvm-svn: 347906	2018-11-29 20:03:19 +00:00
Craig Topper	dfe7e315ea	[SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG to LegalizeVectorOps I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG to run on the custom code before constant build_vectors are lowered in LegalizeDAG. I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse. Differential Revision: https://reviews.llvm.org/D54276 llvm-svn: 347902	2018-11-29 19:36:17 +00:00
Petr Pavlu	a73160fd34	[GlobalISel] Make EnableGlobalISel always set when GISel is enabled Change meaning of TargetOptions::EnableGlobalISel. The flag was previously set only when a target switched on GlobalISel but it is now always set when the GlobalISel pipeline is enabled. This makes the flag consistent with TargetOptions::EnableFastISel and allows its use in other parts of the compiler to determine when GlobalISel is enabled. The EnableGlobalISel flag had previouly only one use in TargetPassConfig::isGlobalISelAbortEnabled(). The method used its value to determine if GlobalISel was enabled by a target and returned false in such a case. To preserve the current behaviour, a new flag TargetOptions::GlobalISelAbort is introduced to separately record the abort behaviour. Differential Revision: https://reviews.llvm.org/D54518 llvm-svn: 347861	2018-11-29 12:56:32 +00:00
Francis Visoiu Mistrih	c7da15d985	[MachineScheduler] Add support for clustering mem ops with FI base operands Before this patch, the following stores in `merge_fail` would fail to be merged, while they would get merged in `merge_ok`: ``` void use(unsigned long long ); void merge_fail(unsigned key, unsigned index) { unsigned long long args[8]; args[0] = key; args[1] = index; use(args); } void merge_ok(unsigned long long dst, unsigned a, unsigned b) { dst[0] = a; dst[1] = b; } ``` The reason is that `getMemOpBaseImmOfs` would return false for FI base operands. This adds support for this. Differential Revision: https://reviews.llvm.org/D54847 llvm-svn: 347747	2018-11-28 12:00:28 +00:00
Francis Visoiu Mistrih	6683b9c236	[CodeGen][NFC] Make `TII::getMemOpBaseImmOfs` return a base operand Currently, instructions doing memory accesses through a base operand that is not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`. This means that functions such as `TII::shouldClusterMemOps` will bail out on instructions using an FI as a base instead of a register. The goal of this patch is to refactor all this to return a base operand instead of a base register. Then in a separate patch, I will add FI support to the mem op clustering in the MachineScheduler. Differential Revision: https://reviews.llvm.org/D54846 llvm-svn: 347746	2018-11-28 12:00:20 +00:00
Evandro Menezes	a13afb10aa	[TableGen] Refactor macro names (NFC) Make the names for the macros for `TargetInstrInfo` uniform. llvm-svn: 347706	2018-11-27 20:58:27 +00:00
David Blaikie	8a379c9d75	AArch64ISelLowering: Remove a return-of-assignment to allow NRVO Patch by Arthur O'Dwyer! llvm-svn: 347609	2018-11-26 22:57:18 +00:00
Evandro Menezes	c0e9308c17	[AArch64] Refactor the scheduling predicates (3/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::hasExtendedReg()`. Differential revision: https://reviews.llvm.org/D54822 llvm-svn: 347599	2018-11-26 21:47:46 +00:00
Evandro Menezes	12a8ee69b4	[AArch64] Refactor the scheduling predicates (2/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::hasShiftedReg()`. Differential revision: https://reviews.llvm.org/D54820 llvm-svn: 347598	2018-11-26 21:47:41 +00:00
Evandro Menezes	7caf28a957	[AArch64] Refactor the scheduling predicates (1/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::isScaledAddr()` Differential revision: https://reviews.llvm.org/D54777 llvm-svn: 347597	2018-11-26 21:47:28 +00:00
Than McIntosh	19bdd4a2bb	[CodeGen] Support custom format of stack maps Summary: Add a hook to the GCMetadataPrinter for emitting stack maps in custom format. The hook will be called at stack map generation time. The default stack map format is used if there is no hook. For this to be useful a few data structures and accessors are exposed from the StackMaps class, so the custom printer can access the stack map data. This patch authored by Cherry Zhang <cherryyz@google.com>. Reviewers: thanm, apilipenko, reames Reviewed By: reames Subscribers: reames, apilipenko, nemanjai, javed.absar, kbarton, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D53892 llvm-svn: 347584	2018-11-26 18:43:48 +00:00
Luke Cheeseman	610670577d	Revert r347490 as it breaks address sanitizer builds llvm-svn: 347499	2018-11-23 17:13:06 +00:00
Luke Cheeseman	30437c2af7	Revert r343341 - Cannot reproduce the build failure locally and the build logs have been deleted. llvm-svn: 347490	2018-11-23 11:01:47 +00:00
John Brawn	0cab72eb19	[AArch64] Fix SelectionDAG infinite loop for v1i64 SCALAR_TO_VECTOR A consequence of r347274 is that SCALAR_TO_VECTOR can be converted into BUILD_VECTOR by SimplifyDemandedBits, but LowerBUILD_VECTOR can turn BUILD_VECTOR into SCALAR_TO_VECTOR so we get an infinite loop. Fix this by making LowerBUILD_VECTOR not do this transformation for those vectors that would get transformed back, i.e. BUILD_VECTOR of a single-element constant vector. Doing that means we get a DUP, which we then need to recognise in ISel as a copy. llvm-svn: 347456	2018-11-22 11:45:23 +00:00
Simon Pilgrim	d77e74948a	Fix MSVC 'truncation of constant value' warning. NFCI. llvm-svn: 347308	2018-11-20 14:29:40 +00:00
Martin Elshuber	b3759e54f4	Subject: [PATCH] [CodeGen] Add pass to combine interleaved loads. This patch defines an interleaved-load-combine pass. The pass searches for ShuffleVector instructions that represent interleaved loads. Matches are converted such that they will be captured by the InterleavedAccessPass. The pass extends LLVMs capabilities to use target specific instruction selection of interleaved load patterns (e.g.: ld4 on Aarch64 architectures). Differential Revision: https://reviews.llvm.org/D52653 llvm-svn: 347208	2018-11-19 14:26:10 +00:00
Peter Collingbourne	a752a87637	AArch64: Emit a call frame instruction for the shadow call stack register. When unwinding past a function that uses shadow call stack, we must subtract 8 from the value of the x18 register. This patch causes us to emit a call frame instruction that causes that to happen. Differential Revision: https://reviews.llvm.org/D54609 llvm-svn: 347089	2018-11-16 20:08:54 +00:00
Jessica Paquette	d867e75f8e	[MachineOutliner][NFC] Don't compute liveness if X16/X17/NZCV are unused Using the MBB flags, we can tell if X16/X17/NZCV are unused in a block, and also not live out. If this holds for all MBBs, then we can avoid checking for liveness on that candidate. Furthermore, if it holds for an individual candidate's MBB, then we can avoid checking for liveness on that candidate. llvm-svn: 346901	2018-11-14 22:23:38 +00:00
Jessica Paquette	dbef4ec37e	[MachineOutliner][NFC] Use flags set in all candidates to check for calls If we keep track of if the ContainsCalls bit is set in the MBB flags for each candidate, then we have a better chance of not checking the candidate for calls at all. This saves quite a few checks in some CTMark tests (~200 in Bullet, for example.) llvm-svn: 346816	2018-11-13 23:41:31 +00:00
Jessica Paquette	a05f516cdf	[MachineOutliner][NFC] Use MBB flags to avoid call checks in getOutliningInfo We already determine a bunch of information about an MBB in getMachineOutlinerMBBFlags. We can reuse that information to avoid calculating things that must be false/true. The first thing we can easily check is if an outlined sequence could ever contain calls. There's no reason to walk over the outlined range, checking for calls, if we already know that there are no calls in the block containing the sequence. llvm-svn: 346809	2018-11-13 23:01:34 +00:00
Jessica Paquette	76d6c36f51	[MachineOutliner][NFC] Exit getOutliningType if there are < 2 candidates Since we never outline anything with fewer than 2 occurrences, there's no reason to compute cost model information if there's less than that. llvm-svn: 346803	2018-11-13 22:16:27 +00:00
Jessica Paquette	184774cd2c	[MachineOutliner][NFC] Simplify isMBBSafeToOutlineFrom check in AArch64 outliner Turns out it's way simpler to do this check with one LRU. Instead of maintaining two, just keep one. Check if each of the registers is available, and then check if it's a live out from the block. If it's a live out, but available in the block, we know we're in an unsafe case. llvm-svn: 346721	2018-11-13 00:32:09 +00:00
Jessica Paquette	909f3e35d2	[MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom Instead of returning Flags, return true if the MBB is safe to outline from. This lets us check for unsafe situations, like say, in AArch64, X17 is live across a MBB without being defined in that MBB. In that case, there's no point in performing an instruction mapping. llvm-svn: 346718	2018-11-12 23:51:32 +00:00
Sanjay Patel	47f4c40e25	[x86] allow vector load narrowing with multi-use values This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs. Apart from 2-3 strange cases, these are all wins. I've structured this to be no-functional-change-intended for any target except for x86 because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those targets have existing regression tests (4, 4, 10 files respectively) that would be affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show any regression test diffs. The trade-off is deciding if an extra vector load is better than a single wide load + extract_subvector. For x86, this is almost always better (on paper at least) because we often can fold loads into subsequent ops and not increase the official instruction count. There's also some unknown -- but potentially large -- benefit from using narrower vector ops if wide ops are implemented with multiple uops and/or frequency throttling is avoided. Differential Revision: https://reviews.llvm.org/D54073 llvm-svn: 346595	2018-11-10 20:05:31 +00:00
Eli Friedman	f70ae59019	[ARM64] [Windows] Handle funclets This patch adds support for funclets in frame lowering and ISel lowering. Together with D50288 and D50166, it enables C++ exception handling. Patch by Sanjin Sijaric, with some fixes by me. Differential Revision: https://reviews.llvm.org/D51524 llvm-svn: 346568	2018-11-09 23:33:30 +00:00
Bryan Chan	21e4975915	[AArch64] Support HiSilicon's TSV110 processor Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls Reviewed By: kristof.beyls Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D53908 llvm-svn: 346546	2018-11-09 19:32:08 +00:00
Clement Courbet	21390a9b77	[llvm-exegesis][NFC] Add a way to declare the default counter binding for unbound CPUs for a target. Summary: This simplifies the code and moves everything to tablegen for consistency. This also prepares the ground for adding issue counters. Reviewers: gchatelet, john.brawn, jsji Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54297 llvm-svn: 346489	2018-11-09 13:15:32 +00:00
Mandeep Singh Grang	4404a659c5	[COFF, ARM64] Add support for MSVC buffer security check Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan Reviewed By: rnk Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D54248 llvm-svn: 346469	2018-11-09 02:48:36 +00:00
Eli Friedman	cb7473de1d	[AArch64] [Windows] Address post-commit review comment on r346358. In this context, usesWindowsCFI() is basically the same thing as isOSWindows(), but it makes the relevant property of the target more explicit. llvm-svn: 346366	2018-11-07 22:30:56 +00:00
Eli Friedman	2b52060063	[AArch64] [Windows] Trap after noreturn calls. Like the comment says, this isn't the most efficient fix in terms of codesize, but it works. Differential Revision: https://reviews.llvm.org/D54129 llvm-svn: 346358	2018-11-07 21:31:14 +00:00
Evandro Menezes	a7659b716b	[PATCH] [AArch64] Refactor helper functions (NFC) Refactor helper functions in AArch64InstrInfo to be static methods. llvm-svn: 346273	2018-11-06 22:17:14 +00:00
Matthias Braun	d0d3b55c57	AArch64: Cleanup CCMP code; NFC Cleanup CCMP pattern matching code in preparation for review/bugfix: - Rename `isConjunctionDisjunctionTree()` to `canEmitConjunction()` (it won't accept arbitrary disjunctions and is really about whether we can transform the subtree into a conjunction that we can emit). - Rename `emitConjunctionDisjunctionTree()` to `emitConjunction()` llvm-svn: 346203	2018-11-06 03:15:22 +00:00
Craig Topper	487ffb554f	[TargetLowering] Change TargetLoweringBase::getPreferredVectorAction to take an MVT instead of an EVT. NFC The main caller of this already has an MVT and several targets called getSimpleVT inside without checking isSimple. This makes the simpleness explicit. llvm-svn: 346180	2018-11-05 23:26:13 +00:00
Mandeep Singh Grang	f318e7e734	[COFF, ARM64] Implement Intrinsic.sponentry for AArch64 Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Patch by: Yin Ma (yinma@codeaurora.org) Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53996 llvm-svn: 345909	2018-11-01 23:22:25 +00:00
Mandeep Singh Grang	6cc0c853bd	[COFF, ARM64] Implement llvm.addressofreturnaddress intrinsic Reviewers: rnk, mstorsjo, efriedma, TomTan Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53962 llvm-svn: 345892	2018-11-01 21:23:47 +00:00
Volkan Keles	2011a121c4	[GlobalISel] Fix a bug in LegalizeRuleSet::clampMaxNumElements Summary: This function was causing a crash when `MaxElements == 1` because it was trying to create a single element vector type. Reviewers: dsanders, aemerson, aditya_nandakumar Reviewed By: dsanders Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53734 llvm-svn: 345875	2018-11-01 19:01:53 +00:00
Reid Kleckner	efe1340d16	[AArch64] Fix unintended fallthrough and strengthen cast This was added in r330630. GCC's -Wimplicit-fallthrough seems to not fire when the previous case contains a switch itself. This fallthrough was bening because the helper function implementing the case used dyn_cast to re-check the type of the node in question. After fixing the fallthrough, we can strengthen the cast. llvm-svn: 345864	2018-11-01 18:02:27 +00:00
Mandeep Singh Grang	7e05f4f778	Revert "[COFF, ARM64] Implement Intrinsic.sponentry for AArch64" This reverts commit 585b6667b4712e3c7f32401e929855b3313b4ff2. llvm-svn: 345863	2018-11-01 17:53:57 +00:00
Chad Rosier	623ce6503d	[AArch64] Add support for ARMv8.4 in Saphira. llvm-svn: 345827	2018-11-01 13:45:16 +00:00
Mandeep Singh Grang	0878b7e6bd	[COFF, ARM64] Implement Intrinsic.sponentry for AArch64 Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform. Reviewers: mgrang, TomTan, rnk, compnerd, mstorsjo, efriedma Reviewed By: efriedma Subscribers: majnemer, chrib, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53673 llvm-svn: 345791	2018-10-31 23:16:20 +00:00
Evandro Menezes	7d9c8c534b	[AArch64] Sort switch cases (NFC) llvm-svn: 345786	2018-10-31 21:56:49 +00:00
Dorit Nuzman	a2771a93ac	[LV] Support vectorization of interleave-groups that require an epilog under optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705	2018-10-31 09:57:56 +00:00
Sanjin Sijaric	a41bbe57fe	[ARM64] [Windows] Exception handling support in frame lowering Emit pseudo instructions indicating unwind codes corresponding to each instruction inside the prologue/epilogue. These are used by the MCLayer to populate the .xdata section. Differential Revision: https://reviews.llvm.org/D50288 llvm-svn: 345701	2018-10-31 09:27:01 +00:00
Martin Storsjo	378940c00f	[AArch64] Mark condition flags and x16/x17 as clobbered when calling __chkstk This is similar to SVN r311061 for ARM. Differential Revision: https://reviews.llvm.org/D53878 llvm-svn: 345698	2018-10-31 08:14:09 +00:00
Mandeep Singh Grang	597f80e768	[COFF, ARM64] Make sure to forward arguments from vararg to musttail vararg Summary: Thunk functions in Windows are varag functions that call a musttail function to pass the arguments after the fixup is done. We need to make sure that we forward the arguments from the caller vararg to the callee vararg function. This is the same mechanism that is used for Windows on X86. Reviewers: ssijaric, eli.friedman, TomTan, mgrang, mstorsjo, rnk, compnerd, efriedma Reviewed By: efriedma Subscribers: efriedma, kristof.beyls, chrib, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53843 llvm-svn: 345641	2018-10-30 20:46:10 +00:00
Eli Friedman	7d3a8434aa	[AArch64] [Windows] SEH opcodes should be scheduling boundaries. Prevents the post-RA scheduler from modifying the prologue sequences emitting by frame lowering. This is roughly similar to what we do for other targets: TargetInstrInfo::isSchedulingBoundary checks isPosition(), which checks for CFI_INSTRUCTION. isSEHInstruction is taken from D50288; it'll land with whatever patch lands first. Differential Revision: https://reviews.llvm.org/D53851 llvm-svn: 345634	2018-10-30 19:24:51 +00:00
David Greene	38d898663d	[AArch64] Create proper memoperand for multi-vector stores Re-apply r345315 with testcase fixes. Include all of the store's source vector operands when creating the MachineMemOperand. Previously, we were missing the first operand, making the store size seem smaller than it really is. Differential Revision: https://reviews.llvm.org/D52816 llvm-svn: 345631	2018-10-30 19:17:51 +00:00
Diogo N. Sampaio	14e7c7404e	[AArch64] Add support for UDF instruction Summary: Add support for AArch64 UDF instruction. UDF - Permanently Undefined generates an Undefined Instruction exception (ESR_ELx.EC = 0b000000). Reviewers: DavidSpickett, javed.absar, t.p.northover Reviewed By: javed.absar Subscribers: nhaehnle, kristof.beyls Differential Revision: https://reviews.llvm.org/D53319 llvm-svn: 345581	2018-10-30 11:06:50 +00:00
Reid Kleckner	8995c313e1	Remove unneeded friend declarations that clang-cl warns on llvm-svn: 345549	2018-10-29 22:38:13 +00:00
Bryan Chan	86cd8f734f	[AArch64] Rename FP16FML instruction format (NFC) Rename SIMDThreeSameMult (etc.) to SIMDThreeSameVectorFML (etc.) to follow usual naming convention, and add some comments in the .td files. llvm-svn: 345515	2018-10-29 17:27:34 +00:00
Luke Cheeseman	0066760239	[AArch64] Return address signing B key support - Add support to generate AUTIBSP, PACIBSP, RETAB instructions for return address signing - The key used to sign the function is controlled by the function attribute "sign-return-address-key" Differential Revision: https://reviews.llvm.org/D51427 llvm-svn: 345511	2018-10-29 16:26:58 +00:00
Sanjin Sijaric	b45305d59c	[ARM64][Windows] MCLayer support for exception handling Add ARM64 unwind codes to MCLayer, as well SEH directives that will be emitted by the frame lowering patch to follow. We only emit unwind codes into object object files for now. Differential Revision: https://reviews.llvm.org/D50166 llvm-svn: 345450	2018-10-27 06:13:06 +00:00
Vlad Tsyrklevich	4d6b75f373	Revert "[AArch64] Create proper memoperand for multi-vector stores" This reverts commit r345315, it was causing test failures on sanitizer-x86_64-linux-fast. llvm-svn: 345356	2018-10-26 02:00:14 +00:00
Bryan Chan	7c08c135e3	[AArch64] Implement FP16FML intrinsics Add LLVM intrinsics for the ARMv8.2-A FP16FML vector-form instructions. Add a DAG pattern to define the indexed-form intrinsics in terms of the vector-form ones, similarly to how the Dot Product intrinsics were implemented. Based on a patch by Gao Yiling. Differential Revision: https://reviews.llvm.org/D53632 llvm-svn: 345337	2018-10-25 23:36:41 +00:00
David Greene	e14c99cea5	[AArch64] Create proper memoperand for multi-vector stores Include all of the store's source vector operands when creating the MachineMemOperand. Previously, we were missing the first operand, making the store size seem smaller than it really is. Differential Revision: https://reviews.llvm.org/D52816 llvm-svn: 345315	2018-10-25 21:10:39 +00:00
Volkan Keles	8cde7f6e08	[GISel] LegalizerInfo: Rename MemDesc::Size to SizeInBits to make the value clearer Requested in D53679. llvm-svn: 345288	2018-10-25 17:37:07 +00:00
Volkan Keles	095217b618	[AArch64][GlobalISel] Fix the LegalityPredicate for lowerIf for G_LOAD/G_STORE Summary: Currently, Legalizer is trying to lower G_LOAD with a vector type that has more than two elements due to the incorrect LegalityPredicate. This patch fixes the issue by removing the multiplication by 8 as `MemDesc.Size` already contains the size in bits. Reviewers: dsanders, aemerson Reviewed By: dsanders Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53679 llvm-svn: 345282	2018-10-25 17:23:25 +00:00
Evandro Menezes	4c88c8fc5b	[AArch64] Refactor Exynos feature sets (NFC) llvm-svn: 345279	2018-10-25 16:45:46 +00:00
John Brawn	9a9d09f98c	[AArch64] Add EXT patterns for 64-bit EXT of a subvector of a 128-bit vector If we have a 64-bit EXT where one of the operands is a subvector of a 128-bit vector then in some cases we can eliminate an extract_subvector by converting to a 128-bit EXT of the 128-bit vector. Differential Revision: https://reviews.llvm.org/D53582 llvm-svn: 345275	2018-10-25 15:31:51 +00:00
John Brawn	e1f764cb2a	[AArch64] Refactor definition of EXT patterns to use a multiclass Using a multiclass reduces duplication, and makes it easier to add new patterns later. This refactoring does add some new patterns, but as far as I can tell there's no IR that will end up triggering them so this is effectively NFC. Differential Revision: https://reviews.llvm.org/D53580 llvm-svn: 345271	2018-10-25 15:00:10 +00:00
John Brawn	5f32a8dbc4	[AArch64] Do 64-bit vector move of 0 and -1 by extracting from the 128-bit move Currently a vector move of 0 or -1 will use different instructions depending on the size of the vector. Using a single instruction (the 128-bit one) for both gives more opportunity for Machine CSE to eliminate instructions. Differential Revision: https://reviews.llvm.org/D53579 llvm-svn: 345270	2018-10-25 14:56:48 +00:00
Amara Emerson	aa8a544e25	[GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index. Allows for better imported pattern re-use. llvm-svn: 345265	2018-10-25 14:04:54 +00:00
Simon Pilgrim	afbfdd6bd6	[TTI] Add generic SK_Broadcast shuffle costs I noticed while fixing PR39368 that we don't have generic shuffle costs for broadcast style shuffles. This patch adds SK_BROADCAST handling, but exposes ARM/AARCH64 lack of handling of this type, which I've added a fix for at the same time. Differential Revision: https://reviews.llvm.org/D53570 llvm-svn: 345253	2018-10-25 10:52:36 +00:00
Thomas Lively	e649dc137e	[NFC] Rename minnan and maxnan to minimum and maximum Summary: Changes all uses of minnan/maxnan to minimum/maximum globally. These names emphasize that the semantic difference between these operations is more than just NaN-propagation. Reviewers: arsenm, aheejin, dschuff, javed.absar Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D53112 llvm-svn: 345218	2018-10-24 22:49:55 +00:00
Evandro Menezes	b1d0aef74d	[AArch64] Refactor Exynos machine model Effectively, NFC. llvm-svn: 345201	2018-10-24 21:40:43 +00:00
Tim Northover	f7d57b84b7	AArch64: add a pass to compress jump-table entries when possible. llvm-svn: 345188	2018-10-24 20:19:09 +00:00
Evandro Menezes	ecadbaba69	[AArch64] Refactor Exynos machine model (NFC) llvm-svn: 345187	2018-10-24 20:03:24 +00:00
Evandro Menezes	cb4099b53e	[AArch64] Fix overlapping instructions Fix overlapping instruction descriptions in the machine model for Exynos M3. Effectively, NFC. llvm-svn: 345186	2018-10-24 20:03:20 +00:00
Fangrui Song	db2f6ced8d	Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC llvm-svn: 344774	2018-10-19 06:12:02 +00:00
Evandro Menezes	63ce5f2a6b	[PATCH] [NFC][AArch64] Fix refactoring of macro fusion Fix compiler error. llvm-svn: 344632	2018-10-16 17:41:45 +00:00
Evandro Menezes	c3f7ade6e7	[NFC][AArch64] Refactor macro fusion Simplify API of checking functions. llvm-svn: 344624	2018-10-16 17:19:28 +00:00
Simon Pilgrim	1d92260763	[AARCH64] Improve vector popcnt lowering with ADDLP AARCH64 equivalent to D53257 - uses widening pairwise adds on vXi8 CTPOP to support i16/i32/i64 vectors. This is a blocker for generic vector CTPOP expansion (P32655) - this will remove the aarch64 diff from D53258. Differential Revision: https://reviews.llvm.org/D53259 llvm-svn: 344554	2018-10-15 21:15:58 +00:00
Dorit Nuzman	a3df726c55	recommit 344472 after fixing build failure on ARM and PPC. llvm-svn: 344475	2018-10-14 08:50:06 +00:00
Dorit Nuzman	70052d5053	revert 344472 due to failures. llvm-svn: 344473	2018-10-14 07:21:20 +00:00
Dorit Nuzman	c4c9199631	[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472	2018-10-14 07:06:16 +00:00
Arnaud A. de Grandmaison	66eb737214	[AArch64] Swap comparison operands if that enables some folding. Summary: AArch64 can fold some shift+extend operations on the RHS operand of comparisons, so swap the operands if that makes sense. This provides a fix for https://bugs.llvm.org/show_bug.cgi?id=38751 Reviewers: efriedma, t.p.northover, javed.absar Subscribers: mcrosier, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D53067 llvm-svn: 344439	2018-10-13 07:43:56 +00:00
Diogo N. Sampaio	8218713482	[AARCH64][FIX] Emit data symbol for constant pool data The ARM64 elf emitter would omit printing data symbol for zero filled constant data. This patch overrides the emitFill method as to enforce that the symbol is correctly printed. Differential revision: https://reviews.llvm.org/D53132 llvm-svn: 344248	2018-10-11 14:10:32 +00:00
Oliver Stannard	db32f39ce4	[AArch64][v8.5A] Don't create BR instructions in outliner when BTI enabled When branch target identification is enabled, we can only do indirect tail-calls through x16 or x17. This means that the outliner can't transform a BLR instruction at the end of an outlined region into a BR. Differential revision: https://reviews.llvm.org/D52869 llvm-svn: 343969	2018-10-08 14:12:08 +00:00
Oliver Stannard	886a23647c	[AArch64][v8.5A] Restrict indirect tail calls to use x16/17 only when using BTI When branch target identification is enabled, all indirectly-callable functions start with a BTI C instruction. this instruction can only be the target of certain indirect branches (direct branches and fall-through are not affected): - A BLR instruction, in either a protected or unprotected page. - A BR instruction in a protected page, using x16 or x17. - A BR instruction in an unprotected page, using any register. Without BTI, we can use any non call-preserved register to hold the address for an indirect tail call. However, when BTI is enabled, then the code being compiled might be loaded into a BTI-protected page, where only x16 and x17 can be used for indirect tail calls. Legacy code withiout this restriction can still indirectly tail-call BTI-protected functions, because they will be loaded into an unprotected page, so any register is allowed. Differential revision: https://reviews.llvm.org/D52868 llvm-svn: 343968	2018-10-08 14:09:15 +00:00
Oliver Stannard	b3f32dd524	[AArch64][v8.5A] Branch Target Identification code-generation pass The Branch Target Identification extension, introduced to AArch64 in Armv8.5-A, adds the BTI instruction, which is used to mark valid targets of indirect branches. When enabled, the processor will trap if an instruction in a protected page tries to perform an indirect branch to any instruction other than a BTI. The BTI instruction uses encodings which were NOPs in earlier versions of the architecture, so BTI-enabled code will still run on earlier hardware, just without the extra protection. There are 3 variants of the BTI instruction, which are valid targets for different kinds or branches: - BTI C can be targeted by call instructions, and is inteneded to be used at function entry points. These are the BLR instruction, as well as BR with x16 or x17. These BR instructions are allowed for use in PLT entries, and we can also use them to allow indirect tail-calls. - BTI J can be targeted by BR only, and is intended to be used by jump tables. - BTI JC acts ab both a BTI C and a BTI J instruction, and can be targeted by any BLR or BR instruction. Note that RET instructions are not restricted by branch target identification, the reason for this is that return addresses can be protected more effectively using return address signing. Direct branches and calls are also unaffected, as it is assumed that an attacker cannot modify executable pages (if they could, they wouldn't need to do a ROP/JOP attack). This patch adds a MachineFunctionPass which: - Adds a BTI C at the start of every function which could be indirectly called (either because it is address-taken, or externally visible so could be address-taken in another translation unit). - Adds a BTI J at the start of every basic block which could be indirectly branched to. This could be either done by a jump table, or by taking the address of the block (e.g. the using GCC label values extension). We only need to use BTI JC when a function is indirectly-callable, and takes the address of the entry block. I've not been able to trigger this from C or IR, but I've included a MIR test just in case. Using BTI C at function entries relies on the fact that no other code in BTI-protected pages uses indirect tail-calls, unless they use x16 or x17 to hold the address. I'll add that code-generation restriction as a separate patch. Differential revision: https://reviews.llvm.org/D52867 llvm-svn: 343967	2018-10-08 14:04:24 +00:00
Oliver Stannard	72a42ff46d	[AArch64] Fix verifier error when outlining indirect calls The MachineOutliner for AArch64 transforms indirect calls into indirect tail calls, replacing the call with the TCRETURNri pseudo-instruction. This pseudo lowers to a BR, but has the isCall and isReturn flags set. The problem is that TCRETURNri takes a tcGPR64 as the register argument, to prevent indiret tail-calls from using caller-saved registers. The indirect calls transformed by the outliner could use caller-saved registers. This is fine, because the outliner ensures that the register is available at all call sites. However, this causes a verifier failure when the register is not in tcGPR64. The fix is to add a new pseudo-instruction like TCRETURNri, but which accepts any GPR. Differential revision: https://reviews.llvm.org/D52829 llvm-svn: 343959	2018-10-08 09:18:48 +00:00
Matthias Braun	4acd274986	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895	2018-10-05 22:00:13 +00:00
Jonas Paulsson	d362ca6156	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints() Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851	2018-10-05 14:23:11 +00:00
Matthias Braun	1f941370f1	AArch64: Fix XSeqPairs/WSeqPairs problems - Fix spill/reloads of XSeqPairs failing with vregs (only physregs worked correctly) - Add missing spill/reload code for WSeqPairs class Differential Revision: https://reviews.llvm.org/D52761 llvm-svn: 343799	2018-10-04 17:02:53 +00:00
Daniel Sanders	0cc3eb8f8b	Add the missing new files from r343654 llvm-svn: 343655	2018-10-03 02:21:30 +00:00
Daniel Sanders	111851a7ec	Re-commit: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 The previous commit failed portions of the test-suite on GreenDragon due to duplicate COPY instructions and iterator invalidation. Both issues have now been fixed. To assist with this, a helper (cloneVirtualRegister) has been added to MachineRegisterInfo that can be used to get another register that has the same type and class/bank as an existing one. llvm-svn: 343654	2018-10-03 02:12:17 +00:00
Matt Morehouse	de1f22a41d	Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions" This reverts r343520 due to breakage of HWASan tests on Android. llvm-svn: 343616	2018-10-02 18:35:44 +00:00
Oliver Stannard	f2ee7fe9fe	[AArch64][v8.5A] Add Memory Tagging instructions This adds new instructions to manipluate tagged pointers, and to load and store the tags associated with memory. Patch by Pablo Barrio, David Spickett and Oliver Stannard! Differential revision: https://reviews.llvm.org/D52490 llvm-svn: 343572	2018-10-02 10:04:39 +00:00
Oliver Stannard	4924de9fcd	[AArch64][v8.5A] Add Memory Tagging system registers This adds new system registers introduced by the Memory Tagging extension. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52488 llvm-svn: 343571	2018-10-02 09:54:35 +00:00
Oliver Stannard	88bee226e6	[AArch64][v8.5A] Add MTE system instructions The Memory Tagging Extension adds system instructions for data cache maintenance, implemented as new operands to the DC instruction. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52487 llvm-svn: 343570	2018-10-02 09:48:43 +00:00
Oliver Stannard	5a50eac9f3	[AArch64][v8.5A] Add MTE as an optional AArch64 extension This adds the memory tagging extension, which is an optional extension introduced in v8.5A. The new instructions and registers will be added by subsequent patches. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52486 llvm-svn: 343563	2018-10-02 09:36:28 +00:00
Daniel Sanders	3d284c2113	Revert: r343521 and r343541: [globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 There's a strange assertion on two of the Green Dragon bots that goes away when this is reverted. The assertion is in RegBankAlloc and if it is this commit then -verify-machine-instrs should have caught it earlier in the pipeline. llvm-svn: 343546	2018-10-01 22:32:08 +00:00
Daniel Sanders	684f49116c	[globalisel] Add a combiner helpers for extending loads and use them in a pre-legalize combiner for AArch64 Summary: Depends on D45541 Reviewers: ab, aditya_nandakumar, bogner, rtereshin, volkan, rovka, javed.absar, aemerson Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45543 llvm-svn: 343521	2018-10-01 18:56:47 +00:00
Matthias Braun	9c73b0eb23	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343520	2018-10-01 18:56:39 +00:00
Evandro Menezes	a59148b265	[AArch64] Refactor cheap cost model Refactor the order in `TII::isAsCheapAsAMove()` to ease future development and maintenance. Practically NFC. llvm-svn: 343489	2018-10-01 16:11:19 +00:00
Evandro Menezes	fdd7b1d490	[AArch64] Split zero cycle feature more granularly Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp` and `zcz-fp`, respectively, while retaining the original feature option to mean both. Differential revision: https://reviews.llvm.org/D52621 llvm-svn: 343354	2018-09-28 19:05:09 +00:00
Luke Cheeseman	c58e7107ed	Revert r343317 - asan buildbots are breaking and I need to investigate the issue llvm-svn: 343341	2018-09-28 17:01:50 +00:00
Luke Cheeseman	0b7f5a040e	Reapply changes reverted by r343235 - Add fix so that all code paths that create DWARFContext with an ObjectFile initialise the target architecture in the context - Add an assert that the Arch is known in the Dwarf CallFrameString method llvm-svn: 343317	2018-09-28 13:37:27 +00:00
David Spickett	3cacab9d1f	Remove extra whitespace. NFC. (test commit) llvm-svn: 343301	2018-09-28 08:45:28 +00:00
Luke Cheeseman	8bf597ea01	Revert r343192 as an ubsan build is currently failing llvm-svn: 343235	2018-09-27 16:47:30 +00:00
Oliver Stannard	0dbca0c67d	[AArch64] Refactor immediate details out of add/sub tblgen class (NFCI) Bits [23-22] are used in Add and Sub to specify the shift. The value of the shift field must be 0x; values of 1x are unallocated. MTE adds some instructions that use such encodings, and this patch refactors the Add/Sub class so that another class could derive from this one to implement other encodings and other formats of bitfields. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52489 llvm-svn: 343231	2018-09-27 16:19:04 +00:00
Oliver Stannard	ed351c7109	[AArch64][v8.5A] Add speculation barriers SSBB and PSSBB This adds two new barrier instructions which can be used to restrict speculative execution of load instructions. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52483 llvm-svn: 343229	2018-09-27 16:09:05 +00:00
Oliver Stannard	5be239fc50	[AArch64][v8.5A] Add Branch Target Identification instructions This adds new instructions used by the Branch Target Identification feature. When this is enabled, these are the only instructions which can be targeted by indirect branch instructions. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52485 llvm-svn: 343225	2018-09-27 14:54:33 +00:00
Oliver Stannard	c499c7b046	[AArch64][v8.5A] Add speculation restriction system registers This adds some new system registers which can be used to restrict certain types of speculative execution. Patch by Pablo Barrio and David Spickett! Differential revision: https://reviews.llvm.org/D52482 llvm-svn: 343218	2018-09-27 14:05:46 +00:00
Oliver Stannard	d597eb64ea	[AArch64][v8.5A] Add Armv8.5-A random number instructions This adds two new system registers, used to generate random numbers. This is an optional extension to v8.5-A, and will be controlled by the "+rng" modifier of the -march= and -mcpu= options. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52481 llvm-svn: 343217	2018-09-27 14:01:40 +00:00
Oliver Stannard	2d45ef92b2	[AArch64][v8.5A] Add Armv8.5-A "DC CVADP" instruction This adds a new variant of the DC system instruction for persistent memory. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52480 llvm-svn: 343216	2018-09-27 13:53:35 +00:00
Oliver Stannard	e0b46dae60	[AArch64][v8.5A] Add prediction invalidation instructions to AArch64 This adds new system instructions which act as barriers to speculative execution based on earlier execution within a particular execution context. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52479 llvm-svn: 343214	2018-09-27 13:47:40 +00:00
Oliver Stannard	959c7d0025	[AArch64][v8.5A] Add speculation barrier to AArch64 instruction set This is a new barrier which limits speculative execution of the instructions following it. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52476 llvm-svn: 343211	2018-09-27 13:39:06 +00:00
Oliver Stannard	634d96e078	[AArch64][v8.5A] Add FRINT[32,64][Z,X] instructions These are some new variants of the "Floating-point Round to Integral" family of instructions, which round to the nearest floating-point value which fits in a 32- or 64-bit integer. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52475 llvm-svn: 343209	2018-09-27 13:32:06 +00:00
Luke Cheeseman	a0fae94979	Reapply changes reverted in r343114, lldb patch to follow shortly llvm-svn: 343192	2018-09-27 10:39:20 +00:00
Oliver Stannard	1790db7411	[AArch64][v8.5A] Add PSTATE manipulation instructions XAFlag and AXFlag These new instructions manipluate the NZCV bits, to convert between the regular Arm floating-point comare format and an alternative format. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52473 llvm-svn: 343187	2018-09-27 09:11:27 +00:00
Fangrui Song	c2791239be	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Oliver Stannard	bc87e250e3	[AArch64] Extend single-operand FP insns to match Arm ARM (NFCI) The Armv8.3-A reference manual defines floating-point data-processing instructions with one source operand to have an opcode of 6 bits [20:15]. The current class in tablegen, BaseSingleOperandFPData, only allows [18:15]. This was ok because [20:19] could only be '00', with other encodings unallocated. Armv8.5-A brings in the FRINT group of instructions which use other values for these bits. This patch refactors the existing class a bit to allow using the full 6 bits of the opcode, as defined in the Arm ARM. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52474 llvm-svn: 343120	2018-09-26 15:42:47 +00:00
Luke Cheeseman	aa2bb63fda	Revert r343112 as CallFrameString API change has broken lldb builds llvm-svn: 343114	2018-09-26 14:48:03 +00:00
Oliver Stannard	b7590e3407	[AArch64] Refactor instructions that write PSTATE (NFCI) Reuse some code in preparation for the v8.5A XAFlag/AXFlag instructions, which shares part of the encoding of the MSR-immediate. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52472 llvm-svn: 343113	2018-09-26 14:42:59 +00:00
Luke Cheeseman	adcedeea67	[AArch64] - Return address signing dwarf support - Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll llvm-svn: 343112	2018-09-26 14:30:29 +00:00
Oliver Stannard	19d022549d	[AArch64][AsmParser] Show name of missing feature for system instructions Parsing of the system instructions (IC, DC, AT and TLBI) uses this function to show the required architecture when the operand is valid, but the architecture is not enabled. Armv8.5A adds a few different system instructions as part of optional features, so we need to extend it to show individual features, not just base architectures. This is NFC for now, but will be used by three different features added in v8.5A, and will be tested by them. Patch by David Spickett! Differential revision: https://reviews.llvm.org/D52478 llvm-svn: 343109	2018-09-26 13:52:27 +00:00
Hans Wennborg	905bc345c4	Revert r343089 "[AArch64] - Return address signing dwarf support" This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail. > Functions that have signed return addresses need additional dwarf support: > - After signing the LR, and before authenticating it, the LR register is in a > state the is unusable by a debugger or unwinder > - To account for this a new directive, .cfi_negate_ra_state, is added > - This directive says the signed state of the LR register has now changed, > i.e. unsigned -> signed or signed -> unsigned > - This directive has the same CFA code as the SPARC directive GNU_window_save > (0x2d), adding a macro to account for multiply defined codes > - This patch matches the gcc implementation of this support: > https://patchwork.ozlabs.org/patch/800271/ > > Differential Revision: https://reviews.llvm.org/D50136 llvm-svn: 343103	2018-09-26 12:57:45 +00:00
Oliver Stannard	347bc189fb	[ARM/AArch64][v8.5A] Add Armv8.5-A target This patch allows targeting Armv8.5-A, adding the architecture to tablegen and setting the options to be identical to Armv8.4-A for the time being. Subsequent patches will add support for the different features included in the Armv8.5-A Reference Manual. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52470 llvm-svn: 343102	2018-09-26 12:48:21 +00:00
Luke Cheeseman	05f11dbf44	[AArch64] - Return address signing dwarf support Functions that have signed return addresses need additional dwarf support: - After signing the LR, and before authenticating it, the LR register is in a state the is unusable by a debugger or unwinder - To account for this a new directive, .cfi_negate_ra_state, is added - This directive says the signed state of the LR register has now changed, i.e. unsigned -> signed or signed -> unsigned - This directive has the same CFA code as the SPARC directive GNU_window_save (0x2d), adding a macro to account for multiply defined codes - This patch matches the gcc implementation of this support: https://patchwork.ozlabs.org/patch/800271/ Differential Revision: https://reviews.llvm.org/D50136 llvm-svn: 343089	2018-09-26 10:14:15 +00:00
Nirav Dave	c26720b88c	[AArch64] Share search bookkeeping in combines. NFCI. Share predecessor search bookkeeping in both perform PostLD1Combine and performNEONPostLDSTCombine. This should be approximately a 4x and 2x performance improvement. llvm-svn: 342986	2018-09-25 15:30:22 +00:00
Benjamin Kramer	0b160a4ed1	[Aarch64] Fix memcpy that was copying 4x too many bytes Found by asan. llvm-svn: 342845	2018-09-23 18:43:28 +00:00
Tri Vo	ae0244420d	[AArch64] Support adding X[8-15,18] registers as CSRs. Summary: Specifying X[8-15,18] registers as callee-saved is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. As part of this patch we: - use custom CSR list/mask when user specifies custom CSRs - update Machine Register Info's list of CSRs with additional custom CSRs in LowerCall and LowerFormalArguments. Reviewers: srhines, nickdesaulniers, efriedma, javed.absar Reviewed By: nickdesaulniers Subscribers: kristof.beyls, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52216 llvm-svn: 342824	2018-09-22 22:17:50 +00:00
Matthias Braun	5b52412a7c	AArch64FastISel: Abort if we failed to select operand of intrinsic rdar://44642447 Differential Revision: https://reviews.llvm.org/D52335 llvm-svn: 342742	2018-09-21 15:47:41 +00:00
Matthias Braun	5df786d5c0	AArch64: Add FuseCryptoEOR fusion rules There's some additional rules available on newer apple CPUs. rdar://41235346 llvm-svn: 342590	2018-09-19 20:50:51 +00:00
Alex Bradbury	b88141f5a6	[AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550	2018-09-19 14:51:42 +00:00
Calixte Denizet	1697eec17c	Verify commit access in fixing typo llvm-svn: 342538	2018-09-19 11:26:20 +00:00
Matthias Braun	a36913825c	AArch64MacroFusion: Factor out some opcode handling code; NFC llvm-svn: 342521	2018-09-19 00:23:37 +00:00
David Green	4fe2509252	[AArch64] Attempt to parse more operands as expressions This tries to make use of evaluateAsRelocatable in AArch64AsmParser::classifySymbolRef to parse more complex expressions as relocatable operands. It is hopefully better than the existing code which only handles Symbol +- Constant. This allows us to parse more complex adr/adrp, mov, ldr/str and add operands. It also loosens the requirements on parsing addends in ld/st and mov's and adds a number of tests. Differential Revision: https://reviews.llvm.org/D51792 llvm-svn: 342455	2018-09-18 09:44:53 +00:00
Sander de Smalen	1e20d76347	[AArch64] Implement aarch64_vector_pcs codegen support. This patch adds codegen support for the saving/restoring V8-V23 for functions specified with the aarch64_vector_pcs calling convention attribute, as added in patch D51477. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51479 llvm-svn: 342049	2018-09-12 12:10:22 +00:00
Sander de Smalen	85610c1d39	[AArch64] NFC: Refactoring to prepare for vector PCS. This patch refactors several parts of AArch64FrameLowering so that it can be easily extended to support saving/restoring of FPR128 (Q) registers. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51478 llvm-svn: 342038	2018-09-12 09:44:46 +00:00
Sander de Smalen	26f87b9829	[AArch64] Add parsing of aarch64_vector_pcs attribute. This patch adds parsing support for the 'aarch64_vector_pcs' calling convention attribute to calls and function declarations. More information describing the vector ABI and procedure call standard can be found here: https://developer.arm.com/products/software-development-tools/\ hpc/arm-compiler-for-hpc/vector-function-abi Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D51477 llvm-svn: 342030	2018-09-12 08:54:06 +00:00
JF Bastien	37d0f6f182	NFC: use bit_cast more in AArch64AddressingModes The was previously committed as r341749 then reverted as r341750 because bit_cast needed to do its own thing to check is_trivially_copyable on GCC 4.x. This is now done and std;:array should now get accepted. llvm-svn: 341897	2018-09-11 04:08:05 +00:00
Benjamin Kramer	6dd0efd86d	[Target] Untangle disassemblers Disassemblers cannot depend on main target headers. The same is true for MCTargetDesc, but there's a lot more cleanup needed for that. llvm-svn: 341822	2018-09-10 12:53:46 +00:00
JF Bastien	977e3b3f7b	Revert "NFC: use bit_cast more in AArch64AddressingModes" It seems some bots think std::array is either not trivially-copyable, or isn't the right size. llvm-svn: 341750	2018-09-08 16:50:56 +00:00
JF Bastien	49ac004a33	NFC: use bit_cast more in AArch64AddressingModes llvm-svn: 341749	2018-09-08 16:43:49 +00:00
JF Bastien	cce5caaaa1	ADT: add <bit> header, implement C++20 bit_cast, use Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning. This was originally committed as r341728 and reverted in r341730. Reviewers: javed.absar, steven_wu, srhines Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51693 llvm-svn: 341741	2018-09-08 03:55:25 +00:00
JF Bastien	5cfa301c8f	Revert "ADT: add <bit> header, implement C++20 bit_cast, use" Bots sad. Looks like missing std::is_trivially_copyable. llvm-svn: 341730	2018-09-07 23:23:47 +00:00
JF Bastien	b3b918823d	ADT: add <bit> header, implement C++20 bit_cast, use Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning. Reviewers: javed.absar Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51693 llvm-svn: 341728	2018-09-07 23:08:26 +00:00
Nick Desaulniers	14c4d9c7b3	[AArch64] Support reserving x1-7 registers. Summary: Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7. Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma Reviewed By: nickdesaulniers, efriedma Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D48580 llvm-svn: 341706	2018-09-07 20:58:57 +00:00
JF Bastien	a500948b04	ARM64: improve non-zero memset isel by ~2x Summary: I added a few ARM64 memset codegen tests in r341406 and r341493, and annotated where the generated code was bad. This patch fixes the majority of the issues by requesting that a 2xi64 vector be used for memset of 32 bytes and above. The patch leaves the former request for f128 unchanged, despite f128 materialization being suboptimal: doing otherwise runs into other asserts in isel and makes this patch too broad. This patch hides the issue that was present in bzero_40_stack and bzero_72_stack because the code now generates in a better order which doesn't have the store offset issue. I'm not aware of that issue appearing elsewhere at the moment. <rdar://problem/44157755> Reviewers: t.p.northover, MatzeB, javed.absar Subscribers: eraman, kristof.beyls, chrib, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51706 llvm-svn: 341558	2018-09-06 16:03:32 +00:00
JF Bastien	19a1c00664	NFC: improve ARM64 isFPImmLegal debug print Forking this change from D51706. This just made it easier to understand llc output with -debug. llvm-svn: 341504	2018-09-05 23:38:11 +00:00
Martin Storsjo	3bc21d8a2b	[MinGW] Move code for indicating "potentially not DSO local" into shouldAssumeDSOLocal. NFC. On Windows, if shouldAssumeDSOLocal returns false, it's either a dllimport reference, or a reference that we should treat as non-local and create a stub for. Clean up AArch64Subtarget::ClassifyGlobalReference a little while touching the flag handling relating to dllimport. Differential Revision: https://reviews.llvm.org/D51590 llvm-svn: 341402	2018-09-04 20:56:28 +00:00
Martin Storsjo	3d5c7d84ef	[MinGW] [AArch64] Add stubs for potential automatic dllimported variables The runtime pseudo relocations can't handle the AArch64 format PC relative addressing in adrp+add/ldr pairs. By using stubs, the potentially dllimported addresses can be touched up by the runtime pseudo relocation framework. Differential Revision: https://reviews.llvm.org/D51452 llvm-svn: 341401	2018-09-04 20:56:21 +00:00
Martin Storsjo	92f0889ba0	[AArch64] Simplify code in LowerGlobalAddress. NFCI. When initial support for dllimport was added for aarch64 in SVN r316555, ClassifyGlobalReference didn't set the MO_DLLIMPORT flag - that was only completed in SVN r323810. Reuse the return value from ClassifyGlobalReference for this purpose as well. llvm-svn: 341310	2018-09-03 11:59:23 +00:00
Martin Storsjo	e0151ea917	[AArch64] Hook up the missed machine operand flag name for MO_DLLIMPORT llvm-svn: 341178	2018-08-31 08:00:34 +00:00
Ties Stuij	43c46fbb66	[CodeGen] emit inline asm clobber list warnings for reserved (cont) Summary: This is a continuation of https://reviews.llvm.org/D49727 Below the original text, current changes in the comments: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } Compiled for thumb, this gives: $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @APP mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ Reviewers: efriedma, olista01, javed.absar Reviewed By: efriedma Subscribers: eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D51165 llvm-svn: 341062	2018-08-30 12:52:35 +00:00
David Green	b3ee777d21	[AArch64] Optimise load(adr address) to ldr address Providing that the load is known to be 4 byte aligned, we can optimise a ldr(adr address) to just ldr address. Differential Revision: https://reviews.llvm.org/D51030 llvm-svn: 341058	2018-08-30 11:55:16 +00:00
Eli Friedman	4bc71970fe	[AArch64] Reject inline asm with FP registers when FP is disabled. Otherwise, we would crash trying to deal with an illegal input. Differential Revision: https://reviews.llvm.org/D51202 llvm-svn: 340637	2018-08-24 19:12:13 +00:00
Joel Galenson	d9f62d7cbe	Find PLT entries for x86, x86_64, and AArch64. This adds a new method to ELFObjectFileBase that returns the symbols and addresses of PLT entries. This design was suggested by pcc and eugenis in https://reviews.llvm.org/D49383. Differential Revision: https://reviews.llvm.org/D50203 llvm-svn: 340610	2018-08-24 15:21:56 +00:00
David Green	61ff754920	[AArch64] Add Tiny Code Model for AArch64 This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other. This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible. Differential Revision: https://reviews.llvm.org/D49673 llvm-svn: 340397	2018-08-22 11:31:39 +00:00
Daniel Sanders	11bfc8c381	[aarch64][mc] Don't lookup symbols when there is no symbol lookup callback Summary: When run under llvm-mc-disassemble-fuzzer, there is no symbol lookup callback so tryAddingSymbolicOperand() must fail gracefully instead of crashing Reviewers: aemerson, javed.absar Reviewed By: aemerson Subscribers: lhames, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D51005 llvm-svn: 340287	2018-08-21 15:47:25 +00:00
Sander de Smalen	63fb4fcb81	[AArch64][SVE] Asm: Add SVE System registers This patch adds system registers for controlling aspects of SVE: - ZCR_EL1 (r/w) visible at EL1 and EL0. - ZCR_EL2 (r/w) visible at EL2 and Non-secure EL1 and EL0. - ZCR_EL3 (r/w) visible at all exception levels. and a system register identifying SVE: - ID_AA64ZFR0_EL1 (r) SVE Feature identifier. Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D50885 llvm-svn: 340158	2018-08-20 09:16:59 +00:00
Luke Cheeseman	079a490c8b	[AArch64] - Generate pointer authentication instructions - Generate pointer authentication instructions - The functions instrumented depend on function attribtues: all (all functions instrumentent) non-leaf (only those that spill LR) none - Function epilogues sign the LR before spilling to the stack and authenticate the LR once restored - If the target is v8.3a or greater than can use the combined authenticate and return instruction Differential revision: https://reviews.llvm.org/D49793 llvm-svn: 340018	2018-08-17 12:53:22 +00:00
Bernard Ogden	1a1a8110d3	[ARM/AArch64] Support FP16 +fp16fml instructions Add +fp16fml feature for new FP16 instructions, which are a mandatory part of FP16 from v8.4-A and an optional part of FP16 from v8.2-A. It doesn't seem to be possible to model this in LLVM, but the relationship between the options is handled by the related clang patch. In keeping with what I think is the usual practice, the fp16fml extension is accepted regardless of base architecture version. Builds on/replaces Sjoerd Meijer's patch to add these instructions at https://reviews.llvm.org/D49839. Differential Revision: https://reviews.llvm.org/D50228 llvm-svn: 340013	2018-08-17 11:29:49 +00:00
Chandler Carruth	8079aabce6	[MI] Change the array of `MachineMemOperand` pointers to be a generically extensible collection of extra info attached to a `MachineInstr`. The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so chat we can change how they are allocated. Then we introduce an extra info object that using the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here. I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works). Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models. This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else. The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this. Differential Revision: https://reviews.llvm.org/D50701 llvm-svn: 339940	2018-08-16 21:30:05 +00:00
Chandler Carruth	81e2f0deb5	[SDAG] Remove the reliance on MI's allocation strategy for `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be surprised at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740	2018-08-14 23:30:32 +00:00
Eli Friedman	23b65e265f	[ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly. Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734	2018-08-14 22:10:25 +00:00
Bryan Chan	2b6ebc3031	[AArch64] Fix assertion failure on widened f16 BUILD_VECTOR Summary: Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the same type. This fixes an assertion failure in VerifySDNode. Reviewers: SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50202 llvm-svn: 339013	2018-08-06 14:14:41 +00:00
Alexander Ivchenko	67b31b9526	[GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per Value This is logical continuation of https://reviews.llvm.org/D46018 (r332449) Differential Revision: https://reviews.llvm.org/D49660 llvm-svn: 338685	2018-08-02 08:33:31 +00:00
David Green	7170871c24	[AArch64] Add support for got relocated LDR's As a part of adding the tiny codemodel, we need to support ldr's with :got: relocations on them. This seems to be mostly already done, just needs the relocation type support. Differential Revision: https://reviews.llvm.org/D50137 llvm-svn: 338673	2018-08-02 06:24:40 +00:00
Lei Liu	4779c3a4b6	[AArch64] DWARF: do not generate AT_location for thread local AArch64 ELF ABI does not define a static relocation type for TLS offset within a module, which makes it impossible for compiler to generate a valid DW_AT_location content for thread local variables. Currently LLVM generates an invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS variable. That causes trouble for linker because thread local variable does not have an absolute address at link time. AArch64 GCC solves the problem by not generating DW_AT_location for thread local variables. We should do the same in LLVM. Differential Revision: https://reviews.llvm.org/D43860 llvm-svn: 338655	2018-08-01 23:46:49 +00:00
Bryan Chan	67e7e5f934	[AArch64] Fix FCCMP with FP16 operands Summary: This patch adds support for FCCMP instruction with FP16 operands, avoiding an assertion during instruction selection. Reviewers: olista01, SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50115 llvm-svn: 338554	2018-08-01 13:50:29 +00:00
Martin Storsjo	676e2b8028	[AArch64] Disallow the MachO specific .loh directive for windows Also add a test for it being unsupported for linux. Differential Revision: https://reviews.llvm.org/D49929 llvm-svn: 338493	2018-08-01 06:50:18 +00:00
Martin Storsjo	2a982f7d52	[AArch64] Support the .inst directive for MachO and COFF targets Contrary to ELF, we don't add any markers that distinguish data generated with .long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49935 llvm-svn: 338355	2018-07-31 09:26:52 +00:00
Amara Emerson	6a47e74b23	[AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR. Also refactors some existing code to materialize addresses for the large code model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR. This implements PR36390. Differential Revision: https://reviews.llvm.org/D49903 llvm-svn: 338337	2018-07-31 00:09:02 +00:00
Amara Emerson	d6b9e05a84	[AArch64][GlobalISel] Make G_BLOCK_ADDR legal. Differential Revision: https://reviews.llvm.org/D49902 llvm-svn: 338336	2018-07-31 00:08:56 +00:00
Craig Topper	225a9ee41b	[DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329	2018-07-30 23:22:00 +00:00
Craig Topper	ea2a3152fb	[DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to BuildSDIVPow2. llvm-svn: 338303	2018-07-30 21:04:34 +00:00
Fangrui Song	121474a01b	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00

... 2 3 4 5 6 ...

3314 Commits