llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Bradley Smith	aa602e5e4d	[AArch64] Add workaround for Cortex-A53 erratum (835769) Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is possible for a 64-bit multiply-accumulate instruction in AArch64 state to generate an incorrect result. The details are quite complex and hard to determine statically, since branches in the code may exist in some circumstances, but all cases end with a memory (load, store, or prefetch) instruction followed immediately by the multiply-accumulate operation. The safest work-around for this issue is to make the compiler avoid emitting multiply-accumulate instructions immediately after memory instructions and the simplest way to do this is to insert a NOP. This patch implements such work-around in the backend, enabled via the option -aarch64-fix-cortex-a53-835769. The work-around code generation is not enabled by default. llvm-svn: 219603	2014-10-13 10:12:35 +00:00
Lang Hames	e0ce993042	[PBQP] Add missing headers from r219421. llvm-svn: 219425	2014-10-09 18:36:59 +00:00
Lang Hames	1ac2927c37	[PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint). This patch removes the PBQPBuilder class and its subclasses and replaces them with a composable constraints class: PBQPRAConstraint. This allows constraints that are only required for optimisation (e.g. coalescing, soft pairing) to be mixed and matched. This patch also introduces support for target writers to supply custom constraints for their targets by overriding a TargetSubtargetInfo method: std::unique_ptr<PBQPRAConstraints> getCustomPBQPConstraints() const; This patch should have no effect on allocations. llvm-svn: 219421	2014-10-09 18:20:51 +00:00
Kevin Qin	33566b5879	[AArch64] Enable partial & runtime unrolling on cortex-a57. llvm-svn: 219401	2014-10-09 10:13:27 +00:00
Chad Rosier	75e17097bb	[AArch64] Generate vector signed/unsigned mul and mla/mls long. Phabricator Revision: http://reviews.llvm.org/D5589 Patch by Balaram Makam <bmakam@codeaurora.org>!! llvm-svn: 219276	2014-10-08 02:31:24 +00:00
Juergen Ributzka	06c32d25cd	[FastISel][AArch64] Teach the address computation code to also fold sign-/zero-extends. The code already folds sign-/zero-extends, but only if they are arguments to mul and shift instructions. This extends the code to also fold them when they are direct inputs. llvm-svn: 219187	2014-10-07 03:40:06 +00:00
Juergen Ributzka	6619d4d4b3	[FastISel][AArch64] Teach the address computation to also fold sub instructions. Tiny enhancement to the address computation code to also fold sub instructions if the rhs is constant and can be folded into the offset. llvm-svn: 219186	2014-10-07 03:40:03 +00:00
Juergen Ributzka	d396480465	[FastISel][AArch64] Fix "Fold sign-/zero-extends into the load instruction." This commit fixes an issue with sign-/zero-extending loads that was discovered by Richard Barton. We now use the correct load instructions for sign-extending loads to 64bit. Also updated and added more unit tests. llvm-svn: 219185	2014-10-07 03:39:59 +00:00
Eric Christopher	faca264c55	Add subtarget caches to aarch64, arm, ppc, and x86. These will make it easier to test further changes to the code generation and optimization pipelines as those are moved to subtargets initialized with target feature and target cpu. llvm-svn: 219106	2014-10-06 06:45:36 +00:00
Benjamin Kramer	860521c88b	Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it. llvm-svn: 219068	2014-10-04 22:44:29 +00:00
Jingyue Wu	4a186967a9	Add fake use to suppress defined-but-unused warnings llvm-svn: 219045	2014-10-04 03:50:10 +00:00
Benjamin Kramer	4c9fb3d669	Eliminate some deep std::vector copies. NFC. llvm-svn: 218999	2014-10-03 18:33:16 +00:00
Eric Christopher	47c04156aa	constify TargetMachine parameter. llvm-svn: 218934	2014-10-03 00:42:41 +00:00
Juergen Ributzka	2eab1ef77e	[Stackmaps] Make the frame-pointer required for stackmaps. Do not eliminate the frame pointer if there is a stackmap or patchpoint in the function. All stackmap references should be FP relative. This fixes PR21107. llvm-svn: 218920	2014-10-02 22:21:49 +00:00
Adrian Prantl	2b1df58ebe	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787	2014-10-01 18:55:02 +00:00
Adrian Prantl	0959156fa3	Revert r218778 while investigating buildbot breakage. "Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782	2014-10-01 18:10:54 +00:00
Adrian Prantl	229943585f	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778	2014-10-01 17:55:39 +00:00
Tom Coxon	50ff005894	[AArch64] Allow access to all system registers with MRS/MSR instructions. The A64 instruction set includes a generic register syntax for accessing implementation-defined system registers. The syntax for these registers is: S<op0>_<op1>_<CRn>_<CRm>_<op2>. The encoding space permitted for implementation-defined system registers is: op0=11, op1=xxx, CRn=1x11, CRm=xxxx, op2=xxx. The full encoding space can now be accessed: op0=xx, op1=xxx, CRn=xxxx, CRm=xxxx, op2=xxx. This is useful to anyone needing to write assembly code supporting new system registers before the assembler has learned the official names for them. llvm-svn: 218753	2014-10-01 10:13:59 +00:00
Asiri Rathnayake	0e12aa2ad9	Add missing natural vector cast. Summary: The natural vector cast node (similar to bitcast) AArch64ISD::NVCAST was introduced in r217159 and r217138. This patch adds a missing cast from v2f32 to v1i64 which is causing some compilation failures. Also added test cases to cover various modimm types and BUILD_VECTORs with i64 elements. llvm-svn: 218751	2014-10-01 09:59:45 +00:00
Juergen Ributzka	4b2325a925	Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. Note: This version fixed an issue with the TBZ/TBNZ instructions that were generated in FastISel. The issue was that the 64bit version of TBZ (TBZX) automagically sets the upper bit of the immediate field that is used to specify the bit we want to test. To test for any of the lower 32bits we have to first extract the subregister and use the 32bit version of the TBZ instruction (TBZW). Original commit message: Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). llvm-svn: 218693	2014-09-30 19:59:35 +00:00
Tom Coxon	856ab42e33	[AArch64] Remove unnecessary whitespace. (Test commit) llvm-svn: 218680	2014-09-30 16:23:16 +00:00
Juergen Ributzka	040a60a3d3	[FastISel][AArch64] Fold sign-/zero-extends into the load instruction. The sign-/zero-extension of the loaded value can be performed by the memory instruction for free. If the result of the load has only one use and the use is a sign-/zero-extend, then we emit the proper load instruction. The extend is only a register copy and will be optimized away later on. Other instructions that consume the sign-/zero-extended value are also made aware of this fact, so they don't fold the extend too. This fixes rdar://problem/18495928. llvm-svn: 218653	2014-09-30 00:49:58 +00:00
Juergen Ributzka	d6d5162a97	[FastISel][AArch64] Factor out scale factor calculation. NFC. Factor out the code that determines the implicit scale factor of memory operations for a given value type. llvm-svn: 218652	2014-09-30 00:49:54 +00:00
Dave Estes	a9a3195105	[AArch64] Refines the Cortex-A57 Machine Model Primarily refines all of the instructions with accurate latency and micro-op information. Refinements largely focus on the NEON instructions. Additionally, a few advanced features are modeled, including forwarding for MAC instructions and hazards for floating point SQRT and DIV. Lastly, the issue-width is reduced to three so that the scheduler will better accommodate the narrower decode and dispatch width. llvm-svn: 218627	2014-09-29 21:27:36 +00:00
Chad Rosier	df7883744f	[AArch64] Improve cost model to handle sdiv by a pow-of-two. This patch improves the target-specific cost model to better handle signed division by a power of two. The immediate result is that this enables the SLP vectorizer to do a better job. http://reviews.llvm.org/D5469 PR20714 llvm-svn: 218607	2014-09-29 13:59:31 +00:00
Jim Grosbach	41b2a508dd	AArch64: allow constant expressions for shifted reg literals e.g., add w1, w2, w3, lsl #(2 - 1) This sort of thing comes up in pre-processed assembly playing macro games. Still validate that it's an assembly time constant. The early exit error check was just a bit overzealous and disallowed a left paren. rdar://18430542 llvm-svn: 218336	2014-09-23 22:16:02 +00:00
Oliver Stannard	d5ddd9f5ee	Fix segfault in AArch64 backend with -g and -mbig-endian Fix a null pointer dereference when trying to swap the endianness of fixups in the .eh_frame section in the AArch64 backend. llvm-svn: 218311	2014-09-23 15:38:11 +00:00
Juergen Ributzka	96a3a7534a	[FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1). Shift-left immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. This should fix a bug found by Chad. llvm-svn: 218275	2014-09-22 21:08:53 +00:00
Juergen Ributzka	c86ece1ad5	[FastISel][AArch64] Fix a think-o in address computation. When looking through sign/zero-extensions the code would always assume there is such an extension instruction and use the wrong operand for the address. There was also a minor issue in the handling of 'AND' instructions. I accidentally used a 'cast' instead of a 'dyn_cast'. llvm-svn: 218161	2014-09-19 22:23:46 +00:00
Aaron Ballman	c9d2119dc2	Reverting NFC changes from r218050. Instead, the warning was disabled for GCC in r218059, so these changes are no longer required. llvm-svn: 218062	2014-09-18 17:34:23 +00:00
Aaron Ballman	2e4b3f3dca	Fixing a bunch of -Woverloaded-virtual warnings due to hiding getSubtargetImpl from the base class. NFC. llvm-svn: 218050	2014-09-18 13:27:14 +00:00
Juergen Ributzka	675ee57091	Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ." Reverting it until I have time to investigate a regression. llvm-svn: 218035	2014-09-18 08:07:40 +00:00
Juergen Ributzka	1686c27b86	Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies. When folding the intrinsic flag into the branch or select we also have to consider the fact if the intrinsic got simplified, because it changes the flag we have to check for. llvm-svn: 218034	2014-09-18 07:26:26 +00:00
Juergen Ributzka	1f5c139173	[FastISel][AArch64] Simplify XALU multiplies. Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible. llvm-svn: 218033	2014-09-18 07:04:54 +00:00
Juergen Ributzka	f3375ce58f	[FastISel][AArch64] Followup commit for 218031 to handle negative offsets too. llvm-svn: 218032	2014-09-18 07:04:49 +00:00
Juergen Ributzka	4cc0932882	[FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address. Small optimization in 'simplifyAddress'. When the offset cannot be encoded in the load/store instruction, then we need to materialize the address manually. The add instruction can encode a wider range of immediates than the load/store instructions. This change tries to fold the offset into the add instruction first before materializing the offset in a register. llvm-svn: 218031	2014-09-18 05:40:47 +00:00
Juergen Ributzka	0d3e02d8bb	[FastISel][AArch64] Fold 'AND' instruction during the address computation. The 'AND' instruction could be used to mask off all but the lower 32 bits of a register. If this is done inside an address computation we might be able to fold the instruction into the memory instruction itself. and x1, x1, #0xffffffff; ldrb x0, [x0, x1] ---> ldrb x0, [x0, w1, uxtw] llvm-svn: 218030	2014-09-18 05:40:41 +00:00
Juergen Ributzka	9d3f8d17bd	[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). llvm-svn: 218010	2014-09-18 02:44:13 +00:00
Juergen Ributzka	3e34cdac5c	[FastISel][AArch64] Custom lower sdiv by power-of-2. Emit an optimized instruction sequence for sdiv by power-of-2 depending on the exact flag. This fixes rdar://problem/18224511. llvm-svn: 217986	2014-09-17 21:55:55 +00:00
Juergen Ributzka	49a4f8311b	[FastISel][AArch64] Simplify mul to shift when possible. This is related to rdar://problem/18369687. llvm-svn: 217980	2014-09-17 20:35:41 +00:00
Juergen Ributzka	df7d94ca78	[FastISel][AArch64] Fold mul into add/sub and logical operations. Try to fold the multiply into the add/sub or logical operations (when possible). This is related to rdar://problem/18369687. llvm-svn: 217978	2014-09-17 19:51:38 +00:00
Juergen Ributzka	6305202d76	[FastISel][AArch64] Fold mul into the address computation of memory operations. Teach 'computeAddress' to also fold multiplies into the address computation (when possible). This fixes rdar://problem/18369443. llvm-svn: 217977	2014-09-17 19:19:31 +00:00
Juergen Ributzka	27d8a0df16	[FastISel][AArch64] Fold compare with zero and branch into CBZ and CBNZ. This takes advantage of the CBZ and CBNZ instruction to further optimize the common null check pattern into a single instruction. This is related to rdar://problem/18358882. llvm-svn: 217972	2014-09-17 18:05:34 +00:00
Juergen Ributzka	06b1780a0b	[FastISel][AArch64] Improve branch selection to support all FP conditions. This adds the last two missing floating-point condition codes (FCMP_UEQ and FCMP_ONE) also to the branch selection. In these two cases an additional branch instruction is required. This also adds unit tests to check all the different condition codes. This is related to rdar://problem/18358882. llvm-svn: 217966	2014-09-17 17:46:47 +00:00
Robin Morisset	4c9d292205	[X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928	2014-09-17 00:06:58 +00:00
Juergen Ributzka	22a43c26cd	[FastISel][AArch64] Add vector support to argument lowering. Lower the first 8 vector arguments too. llvm-svn: 217850	2014-09-16 00:25:30 +00:00
Juergen Ributzka	795aadd45c	[FastISel][AArch64] Allow handling of vectors during return lowering for little endian machines. Allow handling of vectors during return lowering at least for little endian machines. This was restricted in r208200 to fix it for big endian machines (according to the comment), but it also disabled it for little endian too. llvm-svn: 217846	2014-09-15 23:40:10 +00:00
Juergen Ributzka	1b300e160c	[FastISel][AArch64] Update function and variable names to follow the coding standard. NFC. llvm-svn: 217845	2014-09-15 23:20:17 +00:00
Juergen Ributzka	4740a11e7c	[FastISel][AArch64] Make AArch64FastISel class final. NFC. llvm-svn: 217840	2014-09-15 22:33:11 +00:00
Juergen Ributzka	25497b3f2d	[FastISel][AArch64] Lower sin/cos/pow to runtime lib calls. Also lower sin/cos/pow to runtime lib calls. This fixes rdar://problem/18343468. llvm-svn: 217839	2014-09-15 22:33:06 +00:00
Juergen Ributzka	e596897b8b	[FastISel][AArch64] Add lowering support for frem. This lowers frem to a runtime libcall inside fast-isel. The test case also checks the CallLoweringInfo bug that was exposed by this change. This fixes rdar://problem/18342783. llvm-svn: 217833	2014-09-15 22:07:49 +00:00
Juergen Ributzka	48cecd5f82	[FastISel][AArch64] Refactor selectAddSub, selectLogicalOp, and SelectShift. NFC. Small refactor to tidy up the code a little. llvm-svn: 217827	2014-09-15 21:27:56 +00:00
Juergen Ributzka	10569f4764	[FastISel][AArch64] Refactor code to use isTypeSupported. NFC. Gets rid of isLoadStoreTypeLegal and replace it with isTypeSupported. llvm-svn: 217826	2014-09-15 21:27:54 +00:00
Juergen Ributzka	dd6e5e3f62	[FastISel][AArch64] Improve floating-point compare support. Add support for the last two missing fcmp condition codes: UEQ and ONE. This fixes rdar://problem/18341575. llvm-svn: 217823	2014-09-15 20:47:16 +00:00
James Molloy	0b5d57a103	[A57FPLoadBalancing] Modify r217689 - actually we do need to check defs ... Just make sure we check uses first so we see the kill first. It turns out ignoring defs gives some pretty nasty runtime failures. I'm certain this is the fix but I'm still reducing a testcase. llvm-svn: 217735	2014-09-14 18:24:26 +00:00
Juergen Ributzka	e238e394d2	[FastISel][AArch64] Add support for non-native types for logical ops. Extend the logical ops selection to also support non-native types such as i1, i8, and i16. Fixes rdar://problem/18330589. llvm-svn: 217732	2014-09-13 23:46:28 +00:00
Chad Rosier	4762471838	[AArch64] Don't enable the post-RA MI scheduler at OptNone. Hopefully, this will appease the bots. llvm-svn: 217712	2014-09-12 22:17:28 +00:00
Chad Rosier	e7b10df26b	[AArch64] Enable post-RA MI scheduler. Phabricator Revision: http://reviews.llvm.org/D5278 Patch by Sanjin Sijaric! llvm-svn: 217693	2014-09-12 17:40:39 +00:00
James Molloy	168ae87629	[A57FPLoadBalancing] Remove support for vector types Vector MUL/MLAs have tied operands, which gives us extra constraints that we currently can't handle. Instead of silently doing the wrong thing, remove support to be readded later properly. llvm-svn: 217690	2014-09-12 16:55:32 +00:00
James Molloy	95dc4092ea	[A57FPLoadBalancing] Ignore <def>s when checking if a chain may be killed. Defs are seen before uses, so a def without the kill flag doesn't necessarily mean that the register is not killed on that instruction. It may be killed in a later use operand. llvm-svn: 217689	2014-09-12 16:55:26 +00:00
James Molloy	ce321b6608	[A57LoadBalancing] unique_ptr-ify. Thanks to David Blaikie for the in-depth review! llvm-svn: 217682	2014-09-12 14:35:17 +00:00
Patrik Hagglund	469c227cfc	Fix gcc -Wpedantic. llvm-svn: 217669	2014-09-12 12:32:08 +00:00
Gerolf Hoflehner	3ae826c32b	[AArch64] Revert r216141 for cyclone. The increase of the interleave factor to 4 has side-effects like performance losses, e.g. due to remainder loops being executed more frequently, and may increase code size. It requires more analysis and careful heuristic tuning. Expect double digit gains in small benchmarks like lowercase.c and losses in puzzle.c. llvm-svn: 217540	2014-09-10 20:31:57 +00:00
Sanjay Patel	8030ed3639	Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528	2014-09-10 17:58:16 +00:00
Arnaud A. de Grandmaison	2dff70f158	[AArch64] Address Chad's post commit review comments for r217504 (PBQP experimental support) llvm-svn: 217518	2014-09-10 17:03:25 +00:00
Arnaud A. de Grandmaison	9e3c616e11	[AArch64] Pacify lld buildbot complaining about an unused static function in release build. llvm-svn: 217505	2014-09-10 14:24:02 +00:00
Arnaud A. de Grandmaison	99989cbd0c	[AArch64] Add experimental PBQP support. This adds target specific support for using the PBQP register allocator on the AArch64, for the A57 cpu. By default, the PBQP allocator is not used, unless explicitly required on the command line with "-aarch64-pbqp". llvm-svn: 217504	2014-09-10 14:06:10 +00:00
Asiri Rathnayake	245674ae13	[AArch64] Use a constant pool load for weak symbol references when using static relocation model and small code model. Summary: Currently we generate GOT based relocations for weak symbol references regardless of the underlying relocation model. This should be changed so that in static relocation model we use a constant pool load instead. Patch from: Keith Walker Reviewers: Renato Golin, Tim Northover llvm-svn: 217503	2014-09-10 13:54:38 +00:00
Chad Rosier	6796f48338	[AArch64] Enabled AA support for Cortex-A57. llvm-svn: 217381	2014-09-08 15:34:16 +00:00
Chad Rosier	7f50169e13	[AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator Review: http://reviews.llvm.org/D5103 llvm-svn: 217371	2014-09-08 14:43:48 +00:00
Chad Rosier	1d31d7e6bb	[AArch64] Enabled AA support for Cortex-A53. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator Review: http://reviews.llvm.org/D5103 llvm-svn: 217370	2014-09-08 14:31:49 +00:00
Jiangning Liu	ea6ab99806	[AArch64] Add pass to enable additional comparison optimizations by CSE. Patched by Sergey Dmitrouk. This pass tries to make consecutive compares of values use same operands to allow CSE pass to remove duplicated instructions. For this it analyzes branches and adjusts comparisons with immediate values by converting: GE -> GT, GT -> GE, LT -> LE, LE -> LT, and adjusting immediate values appropriately. It basically corrects two immediate values towards each other to make them equal. llvm-svn: 217220	2014-09-05 02:55:24 +00:00
Tim Northover	2ca258e6f9	AArch64: fix vector-immediate BIC/ORR on big-endian devices. Follow up to r217138, extending the logic to other NEON-immediate instructions. As before, the instruction already performs the correct operation and we're just using a different type for convenience, so we want a true nop-cast. Patch by Asiri Rathnayake. llvm-svn: 217159	2014-09-04 15:05:24 +00:00
Tim Northover	54d4e0e00b	AArch64: fix big-endian immediate materialisation We were materialising big-endian constants using DAG nodes with types different from what was requested, followed by a bitcast. This is fine on little-endian machines where bitcasting is a nop, but we need a slightly different representation for big-endian. This adds a new set of NVCAST (natural-vector cast) operations which are always nops. Patch by Asiri Rathnayake. llvm-svn: 217138	2014-09-04 09:46:14 +00:00
Juergen Ributzka	328ca3c6c8	[FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC. llvm-svn: 217119	2014-09-04 01:29:21 +00:00
Juergen Ributzka	db9e90956f	[FastISel][AArch64] Add target-specific lowering for logical operations. This change adds support for immediate and shift-left folding into logical operations. This fixes rdar://problem/18223183. llvm-svn: 217118	2014-09-04 01:29:18 +00:00
Robin Morisset	c2d1634d13	Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction Summary: Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs. Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving the X86 backend to this pass. This requires a way of easily detecting which instructions are atomic. I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making isAtomic() a method of Instruction implemented by a switch on the opcodes. Test Plan: make check Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5035 llvm-svn: 217080	2014-09-03 21:29:59 +00:00
Juergen Ributzka	76dd2e3da7	[FastISel][tblgen] Rename tblgen generated FastISel functions. NFC. This is the final round of renaming. This changes tblgen to emit lower-case function names for FastEmitInst_* and FastEmit_*, and updates all its uses in the source code. Reviewed by Eric llvm-svn: 217075	2014-09-03 20:56:59 +00:00
Juergen Ributzka	fa7bc008ce	[FastISel] Rename public visible FastISel functions. NFC. This commit renames the following public FastISel functions: LowerArguments -> lowerArguments, SelectInstruction -> selectInstruction, TargetSelectInstruction -> fastSelectInstruction, FastLowerArguments -> fastLowerArguments, FastLowerCall -> fastLowerCall, FastLowerIntrinsicCall -> fastLowerIntrinsicCall, FastEmitZExtFromI1 -> fastEmitZExtFromI1, FastEmitBranch -> fastEmitBranch, UpdateValueMap -> updateValueMap, TargetMaterializeConstant -> fastMaterializeConstant, TargetMaterializeAlloca -> fastMaterializeAlloca, TargetMaterializeFloatZero -> fastMaterializeFloatZero, LowerCallTo -> lowerCallTo. Reviewed by Eric llvm-svn: 217074	2014-09-03 20:56:52 +00:00
Eric Christopher	0fba2245b3	Remove unnecessary getTarget call now that the subtarget is cached on the machine function. llvm-svn: 217070	2014-09-03 20:36:26 +00:00
Juergen Ributzka	89b8a87a22	[FastISel] Some long overdue spring cleaning of FastISel. Things got a little bit messy over the years and it is time for a little bit of spring cleaning. This first commit is focused on the FastISel base class itself. It doxyfies all comments, C++11fies the code where it makes sense, renames internal methods to adhere to the coding standard, and clang-formats the files. Reviewed by Eric llvm-svn: 217060	2014-09-03 18:46:45 +00:00
Juergen Ributzka	addc44901e	[FastISel][AArch64] Move unconditional branch handling into 'SelectBranch'. NFC. llvm-svn: 217054	2014-09-03 17:58:10 +00:00
Benjamin Kramer	e991977346	Add override to overridden virtual methods, remove virtual keywords. No functionality change. Changes made by clang-tidy + some manual cleanup. llvm-svn: 217028	2014-09-03 11:41:21 +00:00
Juergen Ributzka	af69801c85	Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR." This reapplies r216805 with a fix to a copy-paste error, which resulted in an incorrect register class. Original commit message: Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. llvm-svn: 217019	2014-09-03 07:07:10 +00:00
Juergen Ributzka	c26a394afc	[FastISel][AArch64] Add target-dependent instruction selection for Add/Sub. There is already target-dependent instruction selection support for Adds/Subs to support compares and the intrinsics with overflow check. This takes advantage of the existing infrastructure to also support Add/Sub, which allows the folding of immediates, sign-/zero-extends, and shifts. This fixes rdar://problem/18207316. llvm-svn: 217007	2014-09-03 01:38:36 +00:00
Juergen Ributzka	0622b98f0b	[FastISel][AArch64] Use the target-dependent selection code for shifts first. This uses the target-dependent selection code for shifts first, which allows us to create better code for shifts with immediates and sign-/zero-extend folding. Vector type are not handled yet and the code falls back to target-independent instruction selection for these cases. This fixes rdar://problem/17907920. llvm-svn: 216985	2014-09-02 22:33:57 +00:00
Juergen Ributzka	69be8004af	[FastISel][AArch64] Use a new helper function to determine if a value type is supported. NFCI. FastISel for AArch64 supports more value types than are actually legal. Use a dedicated helper function to reflect this. It is very similar to the isLoadStoreTypeLegal function, with the exception that vector types are not supported yet. llvm-svn: 216984	2014-09-02 22:33:53 +00:00
Eric Christopher	2f6f860aaa	Reinstate "Nuke the old JIT." Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982	2014-09-02 22:28:02 +00:00
Juergen Ributzka	b33a0a8941	[FastISel][AArch64] Move over to target-dependent instruction selection only. This change moves FastISel for AArch64 to target-dependent instruction selection only. This change replicates the existing target-independent behavior, therefore there are no changes to the unit tests or new tests. Future changes will take advantage of this change and update functionality and unit tests. llvm-svn: 216955	2014-09-02 21:32:54 +00:00
Pete Cooper	92fc86558d	Change MCSchedModel to be a struct of statically initialized data. This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour Reviewed by Andy Trick and Chandler C llvm-svn: 216919	2014-09-02 17:43:54 +00:00
Alexey Samsonov	4c992226cd	Fix left shifts of negative integers in AArch64 InstPrinter/Disassembler Summary: Left shift of negative integer is an undefined behavior, and is reported by UBSan. It's ok for imm values to be negative, so we can just replace left shifts with multiplications. Test Plan: check-llvm test suite Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: aemerson, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5132 llvm-svn: 216910	2014-09-02 16:19:41 +00:00
Aaron Ballman	debd1a67f5	Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC. llvm-svn: 216902	2014-09-02 12:19:02 +00:00
David Xu	c1288ad655	Merge Extend and Shift into a UBFX llvm-svn: 216899	2014-09-02 09:33:56 +00:00
Craig Topper	57c93cf3ef	Remove 'virtual' keyword from methods marked with 'override' keyword. llvm-svn: 216823	2014-08-30 16:48:34 +00:00
Juergen Ributzka	0421e5dc75	Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR." I think this broke the build bot. Reverting it for now until I have time to take a closer look. llvm-svn: 216813	2014-08-30 06:16:26 +00:00
Juergen Ributzka	f2d945e9c8	[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR. Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. llvm-svn: 216805	2014-08-29 23:48:09 +00:00
Juergen Ributzka	40528aa400	[FastISel][AArch64] Use the correct register class for branches. Also constrain the register class for branches. This fixes rdar://problem/18181496. llvm-svn: 216804	2014-08-29 23:48:06 +00:00
Alexey Samsonov	835beeaa34	Make isValidMCLOHType take unsigned instead of enum to avoid loading invalid enum values llvm-svn: 216797	2014-08-29 22:34:28 +00:00
Reid Kleckner	ef4ab8e118	AArch64: Silence -Wabsolute-value warning with std::abs llvm-svn: 216794	2014-08-29 22:14:26 +00:00
Robin Morisset	e583310c3b	Fix typos in comments, NFC Summary: Just fixing comments, no functional change. Test Plan: N/A Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5130 llvm-svn: 216784	2014-08-29 21:53:01 +00:00
Louis Gerbarg	914b23f552	Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values This patch checks for DAG patterns that are an add or a sub followed by a compare on 16 and 8 bit inputs. Since AArch64 does not support those types natively they are legalized into 32 bit values, which means that mask operations are inserted into the DAG to emulate overflow behaviour. In many cases those masks do not change the result of the processing and just introduce a dependent operation, often in the middle of a hot loop. This patch detects the relevant DAG patterns and then tests to see if the transforms are equivalent with and without the mask, removing the mask if possible. The exact mechanism of this patch was discussed in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html There is a reasonably good chance there are missed opportunities due to similar (but not identical) DAG patterns that could be funneled into this test, adding them should be simple if we see test cases. Tests included. rdar://13754426 llvm-svn: 216776	2014-08-29 21:00:22 +00:00
Juergen Ributzka	69d86a518a	[FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc. When we select a trunc instruction we don't emit any code if the type is already i32 or smaller. This is because the instruction that uses the truncated value will deal with it. This behavior can incorrectly transfer a kill flag, which was meant for the result of the truncate, onto the source register. %2 = trunc i32 %1 to i16 ... = ... %2 -> ... = ... vreg1 <kill> ... = ... %1 ... = ... vreg1 This commit fixes this by emitting a COPY instruction, so that the result and source register are distinct virtual registers. This fixes rdar://problem/18178188. llvm-svn: 216750	2014-08-29 17:58:16 +00:00
Tim Northover	4af4565d10	AArch64: only try to get operand of a known node. A bug in r216725 meant we tried to discover the type of a SETCC before confirming the node actually was a SETCC. llvm-svn: 216734	2014-08-29 15:34:58 +00:00
Tim Northover	6087cb1e5a	AArch64: skip select/setcc combine in complex case. In an llvm-stress generated test, we were trying to create a v0iN type and asserting when that failed. This case could probably be handled by the function, but not without added complexity and the situation it arises in is sufficiently odd that there's probably no benefit anyway. Should fix PR20775. llvm-svn: 216725	2014-08-29 13:05:18 +00:00
Arnaud A. de Grandmaison	3b20311baf	[AArch64] FPLoadBalancing: move ownership of the chain to its current accumulator register and forget about the previously used accumulator. Coming up with a simple testcase is not easy, as this highly depends on what the register allocator is doing: this issue showed up while working with the PBQP allocator, which produced a different allocation scheme. A testcase would need to come up with chain starting in D[0-7], then moving to D[8-15], followed by a call to a function whose regmask clobbers the starting accumulator in D[0-7], then another use of the chain. Fixed some formatting, added some invariant checks while there. llvm-svn: 216721	2014-08-29 09:54:11 +00:00
Jiangning Liu	bffae55891	[AArch64] Fix some failures exposed by value type v4f16 and v8f16. 1) Add some missing bitcast patterns for v8f16. 2) Add type promotion for operand of ld/st operations. llvm-svn: 216706	2014-08-29 01:31:42 +00:00
Juergen Ributzka	629f4d87cb	[FastISel][AArch64] Don't fold instructions that are not in the same basic block. This fix checks first if the instruction to be folded (e.g. sign-/zero-extend, or shift) is in the same machine basic block as the instruction we are folding into. Not doing so can result in incorrect code, because the value might not be live-out of the basic block, where the value is defined. This fixes rdar://problem/18169495. llvm-svn: 216700	2014-08-29 00:19:21 +00:00
Jim Grosbach	399c8fd8d4	AArch64: More correctly constrain target vector extend lowering. The AArch64 target lowering for [zs]ext of vectors is set up to handle input simple types and expects the generic SDag path to do something reasonable with anything that's not a simple type. The code, however, was only checking that the result type was a simple type and assuming that implied that the source type would also be a simple type. That's not a valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>" demonstrate. The fix is to simply explicitly validate the source type as well as the result type. PR20791 llvm-svn: 216689	2014-08-28 22:08:28 +00:00
David Xu	94f87246ca	Generate CMN when comparing a short int with minus llvm-svn: 216651	2014-08-28 04:59:53 +00:00
Juergen Ributzka	b377816322	Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation." Quentin pointed out that this is not the correct approach and there is a better and easier solution. llvm-svn: 216632	2014-08-27 23:09:40 +00:00
Juergen Ributzka	f229c0d353	[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation. Currently instructions are folded very aggressively into the memory operation, which can lead to the use of killed operands: %vreg1<def> = ADDXri %vreg0<kill>, 2 %vreg2<def> = LDRBBui %vreg0, 2 ... = ... %vreg1 ... This usually happens when the result is also used by another non-memory instruction in the same basic block, or any instruction in another basic block. If the computed address is used by only memory operations in the same basic block, then it is safe to fold them. This is because all memory operations will fold the address computation and the original computation will never be emitted. This fixes rdar://problem/18142857. llvm-svn: 216629	2014-08-27 22:52:33 +00:00
Juergen Ributzka	7e51cfaccc	[FastISel][AArch64] Fix a comment in my previous commit (r216617). llvm-svn: 216622	2014-08-27 21:40:50 +00:00
Juergen Ributzka	e52b6901f5	[FastISel][AArch64] Fix simplify address when the address comes from a shift. When the address comes directly from a shift instruction then the address computation cannot be folded into the memory instruction, because the zero register is not available as a base register. Simplify address needs to emit the shift instruction and use the result as base register. llvm-svn: 216621	2014-08-27 21:38:33 +00:00
Juergen Ributzka	8c8c692bd7	[FastISel][AArch64] Use the zero register for stores. Use the zero register directly when possible to avoid an unnecessary register copy and a wasted register at -O0. This also uses integer stores to store a positive floating-point zero. This saves us from materializing the positive zero in a register and then storing it. llvm-svn: 216617	2014-08-27 21:04:52 +00:00
Oliver Stannard	8901713bd2	Teach the AArch64 backend about v4f16 and v8f16 This teaches the AArch64 backend to deal with the operations required to deal with the operations on v4f16 and v8f16 which are exposed by NEON intrinsics, plus the add, sub, mul and div operations. llvm-svn: 216555	2014-08-27 16:16:04 +00:00
Craig Topper	43cee2f5fc	Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. llvm-svn: 216525	2014-08-27 05:25:25 +00:00
Juergen Ributzka	069b8c9f42	[FastISel][AArch64] Fix address simplification. When a shift with extension or an add with shift and extension cannot be folded into the memory operation, then the address calculation has to be materialized separately. While doing so the code forgot to consider a possible sign-/zero- extension. This fix folds now also the sign-/zero-extension into the add or shift instruction which is used to materialize the address. This fixes rdar://problem/18141718. llvm-svn: 216511	2014-08-27 00:58:30 +00:00
Juergen Ributzka	d16dcac638	[FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction. llvm-svn: 216510	2014-08-27 00:58:26 +00:00
James Molloy	b5144baff8	Change the return value of "getEnd()" from a MachineInstr* to a MachineBasicBlock::iterator. It seems on Darwin the illegal round-trip ::iterator -> MachineInstr* -> ::iterator breaks execution horribly when the iterator is not a real MachineInstr, like ::end(). llvm-svn: 216455	2014-08-26 13:41:31 +00:00
Dylan Noblesmith	70b930984e	AArch64: use std::fill instead of memset Followup based on review. llvm-svn: 216436	2014-08-26 03:33:26 +00:00
Dylan Noblesmith	8c4eed9349	Revert "AArch64: use std::vector for temp array" This reverts commit r216365. llvm-svn: 216433	2014-08-26 02:03:43 +00:00
Juergen Ributzka	3b8a0ea6f2	[FastISel][AArch64] Refactor float zero materialization. NFCI. llvm-svn: 216403	2014-08-25 19:58:05 +00:00
Karthik Bhat	d94045aa5a	Allow vectorization of division by uniform power of 2. This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates Cost model for Loop and SLP Vectorizer. The cost table is currently only updated for X86 backend. Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971) llvm-svn: 216371	2014-08-25 04:56:54 +00:00
Dylan Noblesmith	ad548ab13e	AArch64: unique_ptr-ify map structures llvm-svn: 216366	2014-08-25 01:59:38 +00:00
Dylan Noblesmith	81767bf6f1	AArch64: use std::vector for temp array llvm-svn: 216365	2014-08-25 01:59:36 +00:00
Juergen Ributzka	c1382c6cd1	[FastISel][AArch64] Add support for variable shift. This adds the missing variable shift support for value type i8, i16, and i32. This fixes <rdar://problem/18095685>. llvm-svn: 216242	2014-08-21 23:06:07 +00:00
Robin Morisset	b2dd60f27d	Rename AtomicExpandLoadLinked into AtomicExpand AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of a group that aim at making it more target-independent. See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html for details The command line option is "atomic-expand" llvm-svn: 216231	2014-08-21 21:50:01 +00:00
Juergen Ributzka	98be3942ed	[FastISel][AArch64] Use the correct register class to make the MI verifier happy. This is mostly achieved by providing the correct register class manually, because getRegClassFor always returns the GPRAllRegClass for MVT::i32 and MVT::i64. Also cleanup the code to use the FastEmitInst_ method whenever possible. This makes sure that the operands' register class is properly constrained. For all the remaining cases this adds the missing constrainOperandRegClass calls for each operand. llvm-svn: 216225	2014-08-21 20:57:57 +00:00
Quentin Colombet	ca131ef450	[AArch64] Run a peephole pass right after AdvSIMD pass. The AdvSIMD pass may produce copies that are not coalescer-friendly. The peephole optimizer knows how to fix that as demonstrated in the test case. <rdar://problem/12702965> llvm-svn: 216200	2014-08-21 18:10:07 +00:00
Juergen Ributzka	3f0a3fc649	[FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI. llvm-svn: 216199	2014-08-21 18:02:25 +00:00
James Molloy	65aa6c84f5	[LoopVectorize] Up the maximum unroll factor to 4 for AArch64 Only for Cortex-A57 and Cyclone for now, where it has shown wins. llvm-svn: 216141	2014-08-21 00:02:51 +00:00
Juergen Ributzka	7c8f6aa104	[FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare. This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero- extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both operands have to be sign-/zero-extended with separate instructions. Related to <rdar://problem/17913111>. llvm-svn: 216073	2014-08-20 16:34:15 +00:00
Aaron Ballman	85f7f5057f	Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC. llvm-svn: 216067	2014-08-20 12:14:35 +00:00
Juergen Ributzka	ce5953230a	[FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0. Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register class to be used with the zero register. This makes the MachineInstruction verifier happy again. This is related to <rdar://problem/18027157>. llvm-svn: 216040	2014-08-20 01:10:36 +00:00
Juergen Ributzka	21b19be38f	[FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding. Factor out the ADDS/SUBS instruction emission code into helper functions and make the helper functions more clever to support most of the different ADDS/SUBS instructions the architecture supports. This includes better immediate support, shift folding, and sign-/zero-extend folding. This fixes <rdar://problem/17913111>. llvm-svn: 216033	2014-08-19 22:29:55 +00:00
Alexey Samsonov	8dee78d45c	Delete unused argument in AArch64MCInstLower constructor: it doesn't use Mangler, and Mangler is in fact not even created when AArch64MCInstLower is constructed. This bug is reported by UBSan. llvm-svn: 216030	2014-08-19 21:51:08 +00:00
Juergen Ributzka	f39a032c8b	Reapply [FastISel][AArch64] Add support for more addressing modes (r215597). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: FastISel didn't take much advantage of the different addressing modes available to it on AArch64. This commit allows the ComputeAddress method to recognize more addressing modes that allows shifts and sign-/zero-extensions to be folded into the memory operation itself. For Example: lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3] ldr x0, [x0, x1] sxtw x1, w1 lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3] ldr x0, [x0, x1] llvm-svn: 216013	2014-08-19 19:44:17 +00:00
Juergen Ributzka	1cb2d0a61e	Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: This change now materializes the value "0" from the zero register. The zero register can be folded by several instructions, so no materialization is needed at all. Fixes <rdar://problem/17924413>. llvm-svn: 216009	2014-08-19 19:44:02 +00:00
Alexey Samsonov	155b633a98	Hide two different AlignMode enums in anonymous namespaces. This bug is reported by UBSan. llvm-svn: 216001	2014-08-19 18:40:39 +00:00
Juergen Ributzka	9261bd7fe9	[FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register. This fixes a few BuildMI callsites where the result register was added by using addReg, which is per default a use and therefore an operand register. Also use the zero register as result register when emitting a compare instruction (SUBS with unused result register). llvm-svn: 215997	2014-08-19 17:41:53 +00:00
Robin Morisset	92b539f285	Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends Summary: Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends These helper functions are introduced in D4844. Depends D4844 Test Plan: make check-all passes Reviewers: jfb Subscribers: aemerson, llvm-commits, mcrosier, reames Differential Revision: http://reviews.llvm.org/D4937 llvm-svn: 215902	2014-08-18 16:48:58 +00:00
Oliver Stannard	0f36700d69	Teach the AArch64 backend to handle f16 This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. llvm-svn: 215891	2014-08-18 14:22:39 +00:00
Oliver Stannard	159a549ea3	[ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage Externally-defined functions with weak linkage should not be tail-called on ARM or AArch64, as the AAELF spec requires normal calls to undefined weak functions to be replaced with a NOP or jump to the next instruction. The behaviour of branch instructions in this situation (as used for tail calls) is implementation-defined, so we cannot rely on the linker replacing the tail call with a return. llvm-svn: 215890	2014-08-18 12:42:15 +00:00
Tim Northover	9127b613b1	TableGen: allow use of uint64_t for available features mask. ARM in particular is getting dangerously close to exceeding 32 bits worth of possible subtarget features. When this happens, various parts of MC start to fail inexplicably as masks get truncated to "unsigned". Mostly just refactoring at present, and there's probably no way to test. llvm-svn: 215887	2014-08-18 11:49:42 +00:00
Robin Morisset	8881e6ec1a	Fix typos in comments llvm-svn: 215777	2014-08-15 22:17:28 +00:00
Juergen Ributzka	1b56a877d8	[FastISel][AArch64] Fix a latent bug in floating-point materialization. The floating-point value positive zero (+0.0) is a valid immediate value according to isFPImmLegal. As a result AArch64 FastISel went ahead and used the immediate version of fmov to materialize the constant. The problem is that the immediate version of fmov cannot encode an immediate for positive zero. Instead a fmov from the zero register was supposed to be used in this case. This fix adds handling for this special case and uses fmov from the zero register to materialize a positive zero (negative zeroes go to the constant pool). There is no test case for this, because this code is currently dead. It will be enabled in a future commit and I will add a test case in a separate commit after that. This fixes <rdar://problem/18027157>. llvm-svn: 215753	2014-08-15 18:55:55 +00:00
Juergen Ributzka	5a7aa49232	Reapplying [FastISel][AArch64] Cleanup constant materialization code. NFCI. Note: This reapplies r215582 without any modifications. The refactoring wasn't responsible for the buildbot failures. Original commit message: Cleanup and prepare constant materialization code for future commits. llvm-svn: 215752	2014-08-15 18:55:52 +00:00
Amara Emerson	03cfd262eb	[AArch64] Narrow arguments passed in wrong position on the stack in big-endian mode. Patch by Asiri Rathnayake. Differential Revision: http://reviews.llvm.org/D4922 llvm-svn: 215716	2014-08-15 14:29:57 +00:00
Rafael Espindola	80b0c0c71c	Remove HasLEB128. We already require CFI, so it should be safe to require .leb128 and .uleb128. llvm-svn: 215712	2014-08-15 14:01:07 +00:00
Juergen Ributzka	a981de1e50	Revert several FastISel commits to track down a buildbot error. This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X86] Use XOR to materialize the "0" value." r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization." r215591 "[FastISel][AArch64] Make use of the zero register when possible." r215588 "[FastISel] Let the target decide first if it wants to materialize a constant." r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI." llvm-svn: 215673	2014-08-14 19:56:28 +00:00


805 Commits