llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00

Author	SHA1	Message	Date
Quentin Colombet	a179ebf75a	[ARM][TEST] Strengthen test against smarter reg alloc. Follow-up of r236247. rdar://problem/20770899 llvm-svn: 236296	2015-05-01 00:45:55 +00:00
Pete Cooper	9851cd43b2	[ARM] optimizeSelect should clear kill flags. If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved into a loop. In this case, it's invalid for any kill flags to remain on there. Fails with -verify-machineinstrs. rdar://problem/20752113 llvm-svn: 236290	2015-04-30 23:57:47 +00:00
Pete Cooper	2c871b35a8	Commute the internal flag on MachineOperands. When commuting a thumb instruction in the size reduction pass, thumb instructions are represented as a bundle and so some operands may be marked as internal. The internal flag has to move with the operand when commuting. This test is sensitive to register allocation so can't specifically check that this error was happening, but so long as it continues to pass with -verify then hopefully it's still ok. rdar://problem/20752113 llvm-svn: 236282	2015-04-30 23:14:14 +00:00
Quentin Colombet	43702a31a2	[AArch64] Fix bad register class constraint in fast-isel for TST instruction. rdar://problem/20748715 llvm-svn: 236273	2015-04-30 22:27:20 +00:00
Pete Cooper	90ff562042	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Reid Kleckner	6e67433faf	[X86] Use 4 byte preferred aggregate alignment on Win32 This helps reduce the frequency of stack realignment prologues in 32-bit X86 Windows code. Before this change and the corresponding clang change, we would take the max of the type preferred alignment and the explicit alignment on the alloca. If you don't override aggregate alignment in datalayout, you get a default of 8. This dates back to 2007 / r34356, and changing it seems prohibitively difficult at this point. llvm-svn: 236270	2015-04-30 22:11:59 +00:00
Andrea Di Biagio	50f551d703	Fix comment in test. NFC. llvm-svn: 236262	2015-04-30 21:22:28 +00:00
Andrea Di Biagio	2fddd75212	Fix for PR23103. Correctly propagate the 'IsUndef' flag to the register operands of a commuted instruction. Revision 220239 exposed a latent bug in method 'TargetInstrInfo::commuteInstruction'. When commuting the operands of a machine instruction, method 'commuteInstruction' didn't correctly propagate the 'IsUndef' flag to the register operands of the new (commuted) instruction. Before this patch, the following instruction: %vreg4<def> = VADDSDrr %vreg14, %vreg5<undef>; FR64:%vreg4,%vreg14,%vreg5 was wrongly converted by method 'commuteInstruction' into: %vreg4<def> = VADDSDrr %vreg5, %vreg14<undef>; FR64:%vreg4,%vreg5,%vreg14 The correct instruction should have been: %vreg4<def> = VADDSDrr %vreg5<undef>, %vreg14; FR64:%vreg4,%vreg5,%vreg14 This patch fixes the problem in method 'TargetInstrInfo::commuteInstruction'. When swapping the operands of a machine instruction, we now make sure that 'IsUndef' flags are correctly set. Added test case 'pr23103.ll'. Differential Revision: http://reviews.llvm.org/D9406 llvm-svn: 236258	2015-04-30 21:03:29 +00:00
Pete Cooper	54ecccfd00	Don't rewrite jumps to empty BBs to landing pads. In the test case here, the 'unreachable' BB was removed by BranchFolding because it's empty. It then rewrote the jump from 'entry' to jump to its fallthrough, which was a landing pad. This results in 'entry' jumping to 2 different landing pads, which fails the machine verifier. rdar://problem/20750162 llvm-svn: 236248	2015-04-30 18:58:23 +00:00
Quentin Colombet	bc9c5a5af1	[ARM] Do not generate invalid encoding for stack adjust, even if this is just temporary. Because of that: 1. The machine verifier was complaining on such code. 2. The generated code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247	2015-04-30 18:52:49 +00:00
Jan Vesely	72f354c922	Reinstate revisions r234755, r234759, r234760 changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240	2015-04-30 17:15:56 +00:00
Daniel Sanders	0bbd3e3f2c	[mips][msa] Rename main check prefix to 'ALL' in basic operations tests. NFC Summary: The majority of the checks are subtarget independent. The few that aren't will be corrected shortly. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9340 llvm-svn: 236220	2015-04-30 09:57:37 +00:00
Daniel Sanders	95fe30da62	[mips][msa] Use CHECK-LABEL where missing, and remove checks matching the .size directive. NFC. Summary: Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9339 llvm-svn: 236219	2015-04-30 09:56:30 +00:00
Daniel Sanders	32a98df5fb	[mips] Add missing signext attributes to MSA basic operations tests. NFC. Summary: This doesn't make much difference to MIPS32, but it will simplify a MIPS64r6 bugfix which will follow shortly by removing unnecessary sign-extension of parameters. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9338 llvm-svn: 236216	2015-04-30 09:24:09 +00:00
Simon Pilgrim	07a58b66cf	[SSE] Fix for MUL v16i8 on pre-SSE41 targets (PR23369). Sign extension of i8 to i16 was placing the unpacked bytes in the lower byte instead of the upper byte. llvm-svn: 236209	2015-04-30 08:23:16 +00:00
Owen Anderson	da14e7665d	Semantically revert r236031, which is not a good idea for in-order targets. At the least it should be guarded by some kind of target hook. It also introduced catastrophic compile time and code quality regressions on some out of tree targets (test case still being reduced/sanitized). Sanjay agreed with reverting this patch until these issues can be resolved. llvm-svn: 236199	2015-04-30 04:06:32 +00:00
Hans Wennborg	847f84eef8	XFAIL test/CodeGen/Generic/MachineBranchProb.ll on Hexagon (PR23377) llvm-svn: 236196	2015-04-30 01:59:04 +00:00
Hans Wennborg	1476a3b8b7	Switch lowering: use profile info to build weight-balanced binary search trees This will cause hot nodes to appear closer to the root. The literature says building the tree like this makes it a near-optimal (in terms of search time given key frequencies) binary search tree. In LLVM's case, we can do up to 3 comparisons in each leaf node, so it might be better to opt for lower tree height in some cases; that's something to look into in the future. Differential Revision: http://reviews.llvm.org/D9318 llvm-svn: 236192	2015-04-30 00:57:37 +00:00
Ahmed Bougacha	bfe392760e	Flip r236172 testcase RUN option ordering for BSD sed(1). NFC. llvm-svn: 236186	2015-04-30 00:07:34 +00:00
Pete Cooper	00f179e7a1	Change x86 CMOVE_F to read its source, not write it. This was breaking sqlite with the machine verifier because operand 0 was a def according to tablegen, but didn't have the 'isDef' flag set. Looking at the ISA, it's clear that this operand is a source as writing to st(0) is implicit. So move the operand to the correct place in the td file. rdar://problem/20751584 llvm-svn: 236183	2015-04-29 23:51:33 +00:00
Reid Kleckner	3691294c25	[WinEH] Start EH preparation for 32-bit x86, it uses no arguments 32-bit x86 MSVC-style exceptions are functionally similar to 64-bit, but they take no arguments. Instead, they implicitly use the value of EBP passed in by the caller as a pointer to the parent's frame. In LLVM, we can represent this as llvm.frameaddress(1), and feed that into all of our calls to llvm.framerecover. The next steps are: - Add an alloca to the fs:00 linked list of handlers - Add something like llvm.sjlj.lsda or generalize it to store in the alloca - Move state number calculation to WinEHPrepare, arrange for FunctionLoweringInfo to call it - Use the state numbers to insert explicit loads and stores in the IR llvm-svn: 236172	2015-04-29 22:49:54 +00:00
Reid Kleckner	97ba8a8b4c	[X86] Avoid mangling frameescape labels x86 Windows uses the '_' prefix for all global symbols, and this was mistakenly being applied to frameescape labels, which are not externally visible global symbols. They use the private global prefix 'L'. The right way to fix this is probably to stop masquerading this label as an ExternalSymbol and create a new SDNode type. These labels are not "external", and we know they will be resolved by assembly time. Having a custom SDNode type would allow us to do better X86 address mode matching, so it's probably worth doing eventually. llvm-svn: 236123	2015-04-29 16:46:01 +00:00
Duncan P. N. Exon Smith	09b5c9c24d	IR: Give 'DI' prefix to debug info metadata Finish off PR23080 by renaming the debug info IR constructs from `MD` to `DI`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the previous commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to this commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120	2015-04-29 16:38:44 +00:00
Vasileios Kalintiris	6f47a197aa	Mips fast-isel - handle functions which return i8 or i16. Summary: Allow Mips fast-isel to handle functions which return i8/i16 signed/unsigned. Test Plan: Make check tests are forthcoming. Already passes test-suite at O0/O2 for Mips 32 r1/r2 Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6765 llvm-svn: 236103	2015-04-29 14:17:14 +00:00
Daniel Sanders	d40d7b9594	[mips] Correct 128-bit shifts on 64-bit targets. Summary: The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now accounts for both cases. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits, mohit.bhakkad, sagar Differential Revision: http://reviews.llvm.org/D9337 llvm-svn: 236099	2015-04-29 12:28:58 +00:00
Tim Northover	5bf87deee6	ARM: fix peephole optimisation of TST We were trying to look through COPY instructions, but only to the next instruction in a BB and incorrectly anyway. The cases where that would actually be a good idea are rare enough (and not even tested!) that it's not worth trying to get right. rdar://20721342 llvm-svn: 236050	2015-04-28 22:03:55 +00:00
Andrew Kaylor	9f3b999556	[WinEH] Split blocks at calls to llvm.eh.begincatch Differential Revision: http://reviews.llvm.org/D9311 llvm-svn: 236046	2015-04-28 21:54:14 +00:00
Sanjay Patel	9b05b26bc6	transform fadd chains to increase parallelism This is a compromise: with this simple patch, we should always handle a chain of exactly 3 operations optimally, but we're not generating the optimal balanced binary tree for a longer sequence. In general, this transform will reduce the dependency chain for a sequence of instructions using N operands from a worst case N-1 dependent operations to N/2 dependent operations. The optimal balanced binary tree would reduce the chain to log2(N). The trade-off for not dealing with longer sequences is: (1) we have less complexity in the compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we don't need to worry about the increased register pressure required to parallelize longer sequences. It also seems unlikely that we would ever encounter really long strings of dependent ops like that in the wild, but I'm not sure how to verify that speculation. FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math and this patch. We can extend this patch to cover other associative operations such as fmul, fmax, fmin, integer add, integer mul. This is a partial fix for: https://llvm.org/bugs/show_bug.cgi?id=17305 and if extended: https://llvm.org/bugs/show_bug.cgi?id=21768 https://llvm.org/bugs/show_bug.cgi?id=23116 The issue also came up in: http://reviews.llvm.org/D8941 Differential Revision: http://reviews.llvm.org/D9232 llvm-svn: 236031	2015-04-28 21:03:22 +00:00
Tom Stellard	347b82f6fc	R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr Fixes a crash with basically any OpenGL application using the radeonsi driver. Patch by: Michel Dänzer Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 236004	2015-04-28 17:37:03 +00:00
Justin Holewinski	41800386f1	[NVPTX] Handle addrspacecast constant expressions in aggregate initializers We need to track if an AddrSpaceCast expression was seen when generating an MCExpr for a ConstantExpr. This change introduces a custom lowerConstant method to the NVPTX asm printer that will create NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode the information that a given symbol needs to be casted to a generic address. llvm-svn: 236000	2015-04-28 17:18:30 +00:00
Elena Demikhovsky	224807ff06	Fixed crash of variable shift inst on AVX2 https://llvm.org/bugs/show_bug.cgi?id=22955 llvm-svn: 235993	2015-04-28 14:46:35 +00:00
Elena Demikhovsky	23e33119e3	AVX-512: Added "pandn" intrinsics set by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 235971	2015-04-28 08:12:42 +00:00
Hans Wennborg	cc333a9a05	Switch lowering: Take branch weight into account when ordering for fall-through Previously, the code would try to put a fall-through case last, even if that meant moving a case with much higher branch weight further down the chain. Ordering by branch weight is most important, putting a fall-through block last is secondary. llvm-svn: 235942	2015-04-27 23:35:22 +00:00
Ahmed Bougacha	54c4fb4a3f	[AArch64] Also combine vector selects fed by non-i1 SETCCs. After legalization, scalar SETCC has an i32 result type on AArch64. The i1 requirement seems too conservative, replace it with an assert. This also means that we now can run after legalization. That should also be fine, since the ops legalizer runs again after each combine, and all types created all have the same sizes as the (legal) inputs. Exposed by r235917; while there, robustize its tests (bsl also uses the register it defines). llvm-svn: 235922	2015-04-27 21:43:12 +00:00
Ahmed Bougacha	5f0f3e8528	[AArch64] Don't assert when combining (v3f32 select (setcc f64)). When the setcc has f64 operands, we can't build a vector setcc mask to feed a vselect, because f64 doesn't divide v3f32 evenly. Just bail out when that happens. llvm-svn: 235917	2015-04-27 21:01:20 +00:00
Hans Wennborg	7103f56991	Switch lowering: order bit tests by branch weight. llvm-svn: 235912	2015-04-27 20:21:17 +00:00
Bill Schmidt	6661e2ddb2	[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations This patch adds a new SSA MI pass that runs on little-endian PPC64 code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors without alignment constraints are accomplished for little-endian using lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional xxswapd instructions hurts performance in comparison with big-endian code, but they are necessary in the general case to support correct semantics. However, the general case does not apply to most vector code. Many vector instructions are lane-insensitive; they do not "care" which lanes the parallel computations are performed within, provided that the resulting data is stored into the correct locations. Thus this pass looks for computations that perform only lane-insensitive operations, and removes the unnecessary swaps from loads and stores in such computations. Future improvements will allow computations using certain lane-sensitive operations to also be optimized in this manner, by modifying the lane-sensitive operations to account for the permuted order of the lanes. However, this patch only adds the infrastructure to permit this; no lane-sensitive operations are optimized at this time. This code is heavily exercised by the various vectorizing applications in the projects/test-suite tree. For the time being, I have only added one simple test case to demonstrate what the pass is doing. Although it is quite simple, it provides coverage for much of the code, including the special case handling of copies and subreg-to-reg operations feeding the swaps. I plan to add additional tests in the future as I fill in more of the "special handling" code. Two existing tests were affected, because they expected the swaps to be present, but they are now removed. llvm-svn: 235910	2015-04-27 19:57:34 +00:00
Elena Demikhovsky	489127abd6	AVX-512: added calling conventions for i1 vectors. Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724 llvm-svn: 235889	2015-04-27 15:11:19 +00:00
Brendon Cahoon	37b8b0d293	[Hexagon] Use constant extenders to fix up hardware loops Use a loop instruction with a constant extender for a hardware loop instruction that is too far away from the start of the loop. This is cheaper than changing the SA register value. Differential Revision: http://reviews.llvm.org/D9262 llvm-svn: 235882	2015-04-27 14:16:43 +00:00
Vasileios Kalintiris	50a170aec4	Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel." This reapplies r235194, which was reverted in r235495 because it was causing a failure in our out-of-tree buildbots for MIPS. With the sign-extension patch in r235718, this patch doesn't cause any problem any more. llvm-svn: 235878	2015-04-27 13:28:05 +00:00
Elena Demikhovsky	3485573818	AVX-512: Extend/Truncate operations for SKX, SETCC for bit-vectors llvm-svn: 235875	2015-04-27 12:57:59 +00:00
Simon Pilgrim	8174379b38	[X86][SSE] Add v16i8/v32i8 multiplication support Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized. The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result. Differential Revision: http://reviews.llvm.org/D9115 llvm-svn: 235837	2015-04-27 07:55:46 +00:00
Matt Arsenault	363e46ec74	R600: Remove / merge redundant testcases llvm-svn: 235813	2015-04-26 00:53:33 +00:00
Sanjay Patel	6a762d07bf	add SSE run to check non-AVX codegen llvm-svn: 235809	2015-04-25 20:41:51 +00:00
Simon Pilgrim	85f3a113c7	line endings fix llvm-svn: 235800	2015-04-25 12:12:43 +00:00
Reid Kleckner	3a59b4e3a9	[SEH] Implement GetExceptionCode in __except blocks This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered by copying the EAX value live into whatever basic block it is called from. Obviously, this only works if you insert it late during codegen, because otherwise mid-level passes might reschedule it. llvm-svn: 235768	2015-04-24 20:25:05 +00:00
David Blaikie	2fcc0180e4	[opaque pointer type] Add textual IR support for explicit type parameter to the invoke instruction Same as r235145 for the call instruction - the justification, tradeoffs, etc are all the same. The conversion script worked the same without any false negatives (after replacing 'call' with 'invoke'). llvm-svn: 235755	2015-04-24 19:32:54 +00:00
Sundeep Kushwaha	17d5168fea	[PATCH] [Hexagon] Adding a test case for calling convention. http://reviews.llvm.org/D9241 llvm-svn: 235754	2015-04-24 19:22:02 +00:00
Yaron Keren	b412c387f7	Teach AArch64\lit.local.cfg the new triple names windows-gnu and windows-msvc. Tests were failing when built with -DLLVM_DEFAULT_TARGET_TRIPLE=i686-pc-windows-gnu. llvm-svn: 235733	2015-04-24 17:14:16 +00:00
Hans Wennborg	2223cdaf41	Switch lowering: fix APInt overflow causing infinite loop / OOM llvm-svn: 235729	2015-04-24 16:53:55 +00:00
Reid Kleckner	a6adb4cb4e	[WinEH] Split the landingpad BB instead of cloning it This means we don't have to RAUW the landingpad instruction and landingpad BB, which is a nice win. llvm-svn: 235725	2015-04-24 16:22:19 +00:00
Jingyue Wu	b822e43272	[NVPTX] Emits "generic()" depending on the original address space Summary: Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()" on aggregates of addrspacecasts. Test Plan: addrspacecast-gvar.ll Reviewers: eliben, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9130 llvm-svn: 235689	2015-04-24 02:57:30 +00:00
Matt Arsenault	5c45e4a835	R600/SI: Fix verifier error when producing v_madmk_f32 Copy the kill flags when swapping the operands. llvm-svn: 235687	2015-04-24 01:57:58 +00:00
Matthias Braun	3cae2dc2b2	R600/RegisterCoalescer: Enable more rematerialization/add missing testcase This enables the rematerialization of some R600 MOV instructions in the RegisterCoalescer and adds a testcase for r235668. llvm-svn: 235675	2015-04-24 00:25:50 +00:00
Reid Kleckner	40c6671601	Re-commit "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works" This reverts commit r235617. r235649 should have addressed the problems. llvm-svn: 235667	2015-04-23 23:22:33 +00:00
Hal Finkel	acf4a0f1ca	[PowerPC] Use sync inst alias when printing So long as the choice between printing msync and sync is not ambiguous, we can print 'sync 0' and just 'sync'. llvm-svn: 235663	2015-04-23 23:05:08 +00:00
Tom Stellard	5903dce77b	R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands llvm-svn: 235662	2015-04-23 22:59:24 +00:00
Andrew Kaylor	ea1f0f8058	[WinEH] Ignore filter clauses while mapping landing pad blocks. llvm-svn: 235656	2015-04-23 22:38:36 +00:00
Reid Kleckner	ffabe51b14	[WinEH] Replace more lpad value uses with undef We were asserting on code like this: extern "C" unsigned long _exception_code(); void might_crash(unsigned long); void foo() { __try { might_crash(0); } __except(1) { might_crash(_exception_code()); } } Gtest and many other libraries get the exception code from the __except block. What's supposed to happen here is that EAX is live into the __except block, and it contains the exception code. Eventually we'll represent that as a use of the landingpad ehptr value, but for now we can replace it with undef. llvm-svn: 235649	2015-04-23 21:22:30 +00:00
Quentin Colombet	329e6c2b9b	[MachineCopyPropagation] Handle undef flags conservatively so that we do not remove copies that are useful after breaking some hardware dependencies. In other words, handle this kind of situations conservatively by assuming reg2 is redefined by the undef flag. reg1 = copy reg2 = inst reg2<undef> reg2 = copy reg1 Copy propagation used to remove the last copy. This is incorrect because the undef flag on reg2 in inst, allows next passes to put whatever trashed value in reg2 that may help. In practice we end up with this code: reg1 = copy reg2 reg2 = 0 = inst reg2<undef> reg2 = copy reg1 This fixes PR21743. llvm-svn: 235647	2015-04-23 21:17:39 +00:00
Tom Stellard	289922ac39	R600/SI: Fix indirect addressing with a negative constant offset When the base register index of the vector plus the constant offset was less than zero, we were passing the wrong base register to the indirect addressing instruction. In this case, we need to set the base register to v0 and then add the computed (negative) index to m0. llvm-svn: 235641	2015-04-23 20:32:01 +00:00
Peter Collingbourne	249a230d23	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	685edd3002	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	e778bcdbc1	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	2930fae4b8	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
Peter Collingbourne	337509326a	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Reid Kleckner	7a1a1e4e4e	Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works" We still have some "uses remain after removal" issues in -O0 builds. This reverts commit r235557. llvm-svn: 235617	2015-04-23 18:34:01 +00:00
Hal Finkel	4e2def1840	[PowerPC] Enable printing instructions using aliases TableGen had been nicely generating code to print a number of instructions using shorter aliases (and PowerPC has plenty of short mnemonics), but we were not calling it. For some of the aliases we support in the parser, TableGen can't infer the "inverse" alias relationship, so there is still more to do. Thus, after some hours of updating test cases... llvm-svn: 235616	2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar	17c8db4460	[AArch64] Add nvcast patterns for v4f16 and v8f16 Summary: Constant stores of f16 vectors can create NvCast nodes from various operand types to v4f16 or v8f16 depending on patterns in the stored constants. This patch adds nvcast rules with v4f16 and v8f16 values. AArchISelLowering::LowerBUILD_VECTOR has the details on which constant patterns generate the nvcast nodes. Reviewers: jmolloy, srhines, ab Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9201 llvm-svn: 235610	2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar	8f58d5c8e1	[AArch64] Handle vec4, vec8, vec16 *itofp for half Summary: Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32, v8i8, v8i16 inputs to allow promotion of v4f16 results. Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32, and i64 vectors. Only missing tests are for v16i8 and v16i16 as the shift operations are too complicated to write a proper check sequence. The conversions from v4i64 to v4f16 do not depend on this patch - v4i64 is split and the conversion gets handled while lowering v2i64. I am adding a test here for completeness. Reviewers: aemerson, rengolin, ab, jmolloy, srhines Subscribers: rengolin, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D9166 llvm-svn: 235609	2015-04-23 17:16:27 +00:00
Hans Wennborg	8823c80ce0	Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608	2015-04-23 16:45:24 +00:00
Sanjay Patel	d205a174d9	use update_llc_test_checks.py to tighten checking; remove unnecessary CPU param llvm-svn: 235604	2015-04-23 16:07:50 +00:00
Krzysztof Parzyszek	a9abedec24	[Hexagon] Shrink-wrap stack frame (Hexagon-specific) llvm-svn: 235603	2015-04-23 16:05:39 +00:00
Krzysztof Parzyszek	7916d2dce4	[Hexagon] Add testcases for stack alignment and variable-sized objects llvm-svn: 235602	2015-04-23 15:12:49 +00:00
Aaron Ballman	be6ee771e3	Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597	2015-04-23 13:41:59 +00:00
Simon Pilgrim	72a57ec317	[DAGCombiner] Remove extra bitcasts surrounding vector shuffles Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) Differential Revision: http://reviews.llvm.org/D9097 llvm-svn: 235578	2015-04-23 08:43:13 +00:00
Andrew Kaylor	3715b60690	[WinEH] Removing seh-filter.ll until I can determine its validity llvm-svn: 235566	2015-04-23 00:38:22 +00:00
Andrew Kaylor	1ed92e06d5	[WinEH] Don't skip landing pads that end with an unreachable instruction. llvm-svn: 235563	2015-04-23 00:20:44 +00:00
Hans Wennborg	d4bc2d86b6	Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tree, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be a major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0.
> > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560	2015-04-22 23:14:56 +00:00
Reid Kleckner	dbbb7b8ac8	[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works This removes the -sehprepare flag and makes __C_specific_handler functions always use WinEHPrepare. This was tested by building all of chromium_builder_tests and running a few tests that use SEH, but if something breaks, we can revert this. llvm-svn: 235557	2015-04-22 22:13:09 +00:00
Krzysztof Parzyszek	77fd8a054e	[Hexagon] Some cleanup of instruction selection code llvm-svn: 235552	2015-04-22 21:17:00 +00:00
Reid Kleckner	37da701417	[WinEH] Demote values and phis live across exception handlers up front In particular, this handles SSA values that are live out of a handler. The existing code only handles values that are live in to a handler. It also handles phi nodes in the block where normal control should resume after the end of a catch handler. When EH return points have phi nodes, we need to split the return edge. It is impossible for phi elimination to emit copies in the previous block if that block gets outlined. The indirectbr that we leave in the function is only notional, and is eliminated from the MachineFunction CFG early on. Reviewers: majnemer, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D9158 llvm-svn: 235545	2015-04-22 21:05:21 +00:00
Krzysztof Parzyszek	8605e42505	[Hexagon] Use A2_tfrsi for constant pool and jump table addresses llvm-svn: 235535	2015-04-22 18:25:53 +00:00
Pirama Arumuga Nainar	6ab4b5544e	Fix correctness check for test_vec_fpextend_double Summary: Remove the CHECK-DAG calls introduced in r235341, and add a comment that this test may break due to scheduling variations. This patch completes the fix discussed in http://reviews.llvm.org/D8804 Reviewers: dsanders, srhines Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9178 llvm-svn: 235530	2015-04-22 18:04:12 +00:00
Matt Arsenault	dca0e83270	R600: Fix always inline pass breaking noinline functions No test since calls are not actually supported yet. llvm-svn: 235524	2015-04-22 17:10:44 +00:00
Sanjay Patel	6006202c04	[x86] Add store-folded memop patterns for vcvtps2ph Differential Revision: http://reviews.llvm.org/D7296 llvm-svn: 235517	2015-04-22 16:11:19 +00:00
Andrea Di Biagio	7008d5a01f	[X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST nodes (PR23259). This fixes a regression introduced at revision 218263. On AVX, if we optimize for size, a splat build_vector of a load is lowered into a VBROADCAST node. This is done even if the value type of the splat build_vector node is v2i64. Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast where the scalar element comes from a loadi64/loadf64. However, revision 218263 forgot to add an extra fallback pattern for the case where we have a X86VBroadcast of a loadi64 with multiple uses. This patch adds the missing tablegen pattern in X86InstrSSE.td. This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern to select v2i64 X86ISD::BROADCAST nodes. llvm-svn: 235509	2015-04-22 14:53:39 +00:00
Hal Finkel	df195a3bea	[DAGCombine] Disable select(c, load,load) for indexed loads This turned up after r235333, but was a pre-existing bug. The optimization which transforms select(c, load, load) into a load of a select of the addresses does not handle indexed loads (pre/post inc/dec). However, it did not check for them either, leading to a crash if it tried to transform one of them. llvm-svn: 235497	2015-04-22 11:32:25 +00:00
Vasileios Kalintiris	7d80549d5b	Revert "[mips][FastISel] Implement shift ops for Mips fast-isel." This reverts commit r235194. It was causing a failure in FastISel buildbots due to sign-extension issues. llvm-svn: 235495	2015-04-22 10:08:46 +00:00
James Molloy	ffea00f649	[AArch64] Disable complex GEP optimization by default. Enough concerns were raised that this optimization is pessimising some code patterns. The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher. llvm-svn: 235491	2015-04-22 09:11:38 +00:00
Lang Hames	76014c544d	[patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the X86 backend. The code generated for symbolic targets is identical to the code generated for constant targets, except that a relocation is emitted to fix up the actual target address at link-time. This allows IR and object files containing patchpoints to be cached across JIT-invocations where the target address may change. llvm-svn: 235483	2015-04-22 06:02:31 +00:00
Sanjay Patel	b365b56ba8	[x86] allow 64-bit extracted vector element integer stores on a 32-bit system With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system even though 64-bit integers are not legal types. So instead of producing this: pshufd $229, %xmm0, %xmm1 ## xmm1 = xmm0[1,1,2,3] movd %xmm0, (%eax) movd %xmm1, 4(%eax) We can do: movq %xmm0, (%eax) This is a fix for the problem noted in D7296. Differential Revision: http://reviews.llvm.org/D9134 llvm-svn: 235460	2015-04-22 00:24:30 +00:00
Reid Kleckner	2e3b8976cc	[WinEH] Correctly handle inlined __finally blocks with captures We should also teach the inliner to collapse framerecover of frameaddress of the current frame down to an alloca, but that can happen later. llvm-svn: 235459	2015-04-22 00:07:52 +00:00
Krzysztof Parzyszek	0759b1ad49	[Hexagon] Patterns for frame index with offset for isel llvm-svn: 235418	2015-04-21 21:28:03 +00:00
Reid Kleckner	3b4014368f	Re-land r235154-r235156 under the existing -sehprepare flag Keep the old SEH fan-in lowering on by default for now, since projects rely on it. This will make it easy to test this change with a simple flag flip. llvm-svn: 235399	2015-04-21 18:23:57 +00:00
Matthias Braun	d300744bcd	X86: Match for X86ISD nodes in LowerBUILD_VECTOR instead of BUILD_VECTORCombine There doesn't seem to be a reason to perform this target ISD node matching in a DAGCombine; moving it to lowering fixes PR23296. Differential Revision: http://reviews.llvm.org/D9137 llvm-svn: 235394	2015-04-21 17:21:36 +00:00
Vasileios Kalintiris	ea11455700	[mips] Optimize code generation for 64-bit variable shift instructions. Summary: The 64-bit version of the variable shift instructions uses the shift_rotate_reg class which uses a GPR32Opnd to specify the variable shift amount. With this patch we avoid the generation of a redundant SLL instruction for the variable shift instructions in 64-bit targets. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7413 llvm-svn: 235376	2015-04-21 10:49:03 +00:00
Elena Demikhovsky	abf0138a81	AVX-512: Added logical and arithmetic instructions for SKX by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 235375	2015-04-21 10:27:40 +00:00
Simon Pilgrim	5ad623ef36	[X86][SSE] Provide execution domains for scalar floating point operations This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since. I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests. Differential Revision: http://reviews.llvm.org/D9095 llvm-svn: 235372	2015-04-21 08:40:22 +00:00
Simon Pilgrim	d5bdbd921b	CONCAT_VECTOR of BUILD_VECTOR - minor fix Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation. Test case spotted by Patrik Hagglund llvm-svn: 235371	2015-04-21 08:05:43 +00:00
Pawel Bylica	b046734bb1	Fix generic shift expansion when shift amount is 0 Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=16439. This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since we only need to care about Amt up to nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type. Patch by Keno Fischer! Test Plan: regression test from http://reviews.llvm.org/D7752 Reviewers: chfast, resistor Reviewed By: chfast, resistor Subscribers: sanjoy, resistor, chfast, llvm-commits Differential Revision: http://reviews.llvm.org/D4978 llvm-svn: 235370	2015-04-21 06:28:36 +00:00
Matthias Braun	3df6448424	X86: Do not select X86 custom vector nodes if operand types don't match X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected if the operand types do not match the result type because vector type legalization cannot deal with this for custom nodes. Testcase X86ISD::ADDSUB is attached. I could not create a testcase for the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296 Differential Revision: http://reviews.llvm.org/D9120 llvm-svn: 235367	2015-04-21 01:13:41 +00:00
Pirama Arumuga Nainar	c8c950f572	Fix flakiness in fp16-promote.ll Summary: In the f16-promote test, make the checks for native conversion instructions similar to the libcall checks: - Remove hard coded register names - Do not check exact instruction sequences. This fixes test flakiness due to non-determinism in instruction scheduling and register allocation. I also fixed a few minor things in the CHECK-LIBCALL checks. I'll try to find a way to check that unnecessary loads, stores, or conversions don't happen. Reviewers: mzolotukhin, srhines, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9112 llvm-svn: 235363	2015-04-20 23:54:41 +00:00
Sanjay Patel	3391f7c43c	use update_llc_test_checks.py to tighten checking Also, replace win and linux runs with a generic run because that makes no difference in what this test is checking. llvm-svn: 235361	2015-04-20 23:31:53 +00:00
Andrew Kaylor	721c6b75e7	[WinEH] Fix problem with mapping shared empty handler blocks. Differential Revision: http://reviews.llvm.org/D9125 llvm-svn: 235354	2015-04-20 22:04:09 +00:00
Olivier Sallenave	f748efd9fd	Refactoring and enhancement to FMA combine. llvm-svn: 235344	2015-04-20 20:29:40 +00:00
Andrew Kaylor	7f5330d3c0	Fixing line endings llvm-svn: 235342	2015-04-20 20:27:28 +00:00
Pirama Arumuga Nainar	a463349e8a	[MIPS] OperationAction for FP_TO_FP16, FP16_TO_FP Summary: Set operation action for FP16 conversion opcodes, so the Op legalizer can choose the gnu_* libcalls for Mips. Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to prevent (fpext (load )) and (store (fptrunc)) from getting combined into unsupported operations. Added test cases to test that these operations are handled correctly for f16 scalars and vectors. This patch depends on http://reviews.llvm.org/D8755. Reviewers: srhines Subscribers: llvm-commits, ab Differential Revision: http://reviews.llvm.org/D8804 llvm-svn: 235341	2015-04-20 20:15:36 +00:00
Tom Stellard	68ab711728	DAGCombine: Remove redundant NaN checks around ISD::FSQRT This folds: (select (setcc x, -0.0, *lt), NaN, (fsqrt x)) -> ( fsqrt x) llvm-svn: 235333	2015-04-20 19:38:27 +00:00
Andrea Di Biagio	b10889be03	[X86][FastIsel] Fix assertion failure when selecting int-to-double conversion (PR23273). This fixes a regression introduced at revision 231243. The target-independent selection algorithm in FastISel knows how to select a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr. Method X86FastISel::X86SelectSIToFP was therefore working under the wrong assumption that the target was AVX. That assumption was incorrect since we can have a target that is neither AVX nor SSE. So, rather than asserting for the presence of AVX, we should have had an early exit from 'X86SelectSIToFP' if the target was not AVX. This patch fixes the issue replacing the invalid assertion with an early exit. Thanks to Dimitry Andric for reporting this problem and for providing a small reproducible testcase. Added test pr23273.ll. llvm-svn: 235295	2015-04-20 11:56:59 +00:00
Hal Finkel	40b204ea4b	[InlineAsm] Remove EarlyClobber on registers that are also inputs When an inline asm call has an output register marked as early-clobber, but that same register is also an input operand, what should we do? GCC accepts this, and is documented to accept this for read/write operands saying, "Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it's used." For write-only operands, the situation seems less clear, but I have at least one existing codebase that assumes this will work, in part because it has syscall macros like this: ({ \ register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \ register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \ register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \ register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \ __asm__ __volatile__ \ ("sc" \ : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \ : "0"(r0), "1"(r3), "2"(r4), "3"(r5) \ : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \ r3; \ }) Furthermore, with register aliases and subregister relationships that only the backend knows about, rejecting this in the frontend seems like a difficult proposition (if we wanted to do so). However, keeping the early-clobber flag on the INLINEASM MI does not work for us, because it will cause the register's live interval to end too soon (so it will not appear defined to be used as an input). Fortunately, fixing this does not seem hard: When forming the INLINEASM MI, check to see if any of the early-clobber outputs are also inputs, and if so, remove the early-clobber flag. llvm-svn: 235283	2015-04-20 00:01:30 +00:00
Simon Pilgrim	292688507a	[X86][SSE] Fix for getScalarValueForVectorElement to detect scalar sources requiring truncation. The fix ensures that scalar sources inserted into a vector are the correct bit size. Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support. llvm-svn: 235281	2015-04-19 22:16:49 +00:00
Simon Pilgrim	1a7066b9c2	[X86][SSE] Extended copysign tests to include llvm intrinsic implementation and constant folding. llvm-svn: 235279	2015-04-19 21:34:57 +00:00
Simon Pilgrim	cdf694c820	[X86][AVX2] Force execution domain on broadcast folding tests. llvm-svn: 235260	2015-04-18 21:24:16 +00:00
Simon Pilgrim	048a4e3644	[X86][SSE] Force execution domain on float/double unpack shuffle tests. llvm-svn: 235259	2015-04-18 18:50:55 +00:00
Ahmed Bougacha	fa8edc9a41	[GlobalMerge] Look at uses to create smaller global sets. Instead of merging everything together, look at the users of GlobalVariables, and try to group them by function, to create sets of globals used "together". Using that information, a less-aggressive alternative is to keep merging everything together except globals that are only ever used alone, that is, those for which it's clearly non-profitable to merge with others. In my testing, grouping by Function is too aggressive, but grouping by BasicBlock is too conservative. Anything in-between isn't trivially available, so stick with Function grouping for now. cl::opts are added for testing; both enabled by default. A few of the testcases aren't testing the merging proper, but just various edge cases when merging does occur. Update them to use the previous grouping behavior. Also, one of the tests is unrelated to GlobalMerge; change it accordingly. While there, switch to r234666' flags rather than the brutal -O3. Differential Revision: http://reviews.llvm.org/D8070 llvm-svn: 235249	2015-04-18 01:21:58 +00:00
Ahmed Bougacha	6c08d94243	[AArch64] Don't force MVT::Untyped when selecting LD1LANEpost. The result is either an Untyped reg sequence, on ldN with N > 1, or just the type of the input vector, on ld1. Don't force Untyped. Instead, just use the type of the reg sequence. This mirrors the behavior of createTuple, which feeds the LD1*_POST. The narrow code path wasn't actually covered by tests, because V64 insert_vector_elt are widened to V128 before the LD1LANEpost combine has the chance to run, usually. The only case where it does run on V64 vectors is if the vector ops legalizer ran. So, tickle the code with a ctpop. Fixes PR23265. llvm-svn: 235243	2015-04-17 23:43:33 +00:00
Ahmed Bougacha	2286313fee	Fix another typo in r235224 testcase. NFC. Third time's the charm! llvm-svn: 235242	2015-04-17 23:38:46 +00:00
Andrew Kaylor	dcac5320d4	[WinEH] Fixes for a few cppeh failures. Differential Review: http://reviews.llvm.org/D9065 llvm-svn: 235239	2015-04-17 23:05:43 +00:00
Pete Cooper	2a4be5132d	AArch64: Add test for returning [2 x i64] in registers. NFC. llvm-svn: 235228	2015-04-17 21:31:25 +00:00
Ahmed Bougacha	0f72cb9a76	Fix typo in r235224 testcase. NFC. llvm-svn: 235226	2015-04-17 21:11:58 +00:00
Ahmed Bougacha	ab193cb218	[AArch64] Avoid vector->load dependency cycles when creating LD1post. They would break the SelectionDAG. Note that the opposite load->vector dependency is already obvious in: (LD1post vec, ..) llvm-svn: 235224	2015-04-17 21:02:30 +00:00
Pirama Arumuga Nainar	f8369b5437	Add support to promote f16 to f32 Summary: This patch adds legalization support to operate on FP16 as a load/store type and do operations on it as floats. Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll Reviewers: srhines, t.p.northover Differential Revision: http://reviews.llvm.org/D8755 llvm-svn: 235215	2015-04-17 18:36:25 +00:00
Vasileios Kalintiris	029b9d94fb	[mips][FastISel] Implement FastMaterializeAlloca in Mips fast-isel. Summary: Implement the method FastMaterializeAlloca in Mips fast-isel Based on a patch by Reed Kotler. Test Plan: Passes test-suite at O0/O2 for mips32 r1/r2 fastalloca.ll Reviewers: dsanders, rkotler Subscribers: rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6742 llvm-svn: 235213	2015-04-17 17:29:58 +00:00
Sanjay Patel	d7195a62c2	[X86, AVX] add an exedepfix entry for vmovq == vmovlps == vmovlpd This is the AVX extension of r235014: http://llvm.org/viewvc/llvm-project?view=revision&revision=235014 Review: http://reviews.llvm.org/D8691 llvm-svn: 235210	2015-04-17 17:02:37 +00:00
Vasileios Kalintiris	8c6dc71fa1	[mips][FastISel] Implement shift ops for Mips fast-isel. Summary: Add shift operators implementation to fast-isel for Mips. These are shift ops for non legal forms, i.e. i8 and i16. Based on a patch by Reed Kotler. Test Plan: Reviewers: dsanders Subscribers: echristo, rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6726 llvm-svn: 235194	2015-04-17 14:29:21 +00:00
James Molloy	8614d8be5f	Fix TRUNCATE splitting helper logic. This is a follow-on to r233681 - I'd misunderstood the semantics of FTRUNC, and had confused it with (FP_ROUND ..., 0). Thanks to Ahmed Bougacha for his post-commit review! llvm-svn: 235191	2015-04-17 13:51:40 +00:00
Vasileios Kalintiris	e73d807a5c	[mips] Teach the delay slot filler to remove needless KILL instructions. Summary: Previously, the presence of KILL instructions would block valid candidates from filling a specific delay slot. With the elimination of the KILL instructions, in the appropriate range, we are able to fill more slots and keep the information from future def/use analysis consistent. Reviewers: dsanders Reviewed By: dsanders Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D7724 llvm-svn: 235183	2015-04-17 12:01:02 +00:00
Nico Weber	ffd0269e17	Revert r235154-r235156, they cause asserts when building win64 code (http://crbug.com/477988 ) llvm-svn: 235170	2015-04-17 09:10:43 +00:00
Reid Kleckner	560ebfc81d	Fix test failure due to racing commits It looks like r235145 changed the .ll syntax for variadic calls. Update tests to use the new syntax. llvm-svn: 235156	2015-04-17 01:09:53 +00:00
Reid Kleckner	52ead7bcab	[SEH] Reimplement x64 SEH using WinEHPrepare This now emits simple, unoptimized xdata tables for __C_specific_handler based on the handlers listed in @llvm.eh.actions calls produced by WinEHPrepare. This adds support for running __finally blocks when exceptions are thrown, and removes the old landingpad fan-in codepath. I ran some manual execution tests on small basic test cases with and without optimization, as well as on Chrome base_unittests, which uses a small amount of SEH. I'm sure there are bugs, and we may need to revert. llvm-svn: 235154	2015-04-17 01:01:27 +00:00
Ahmed Bougacha	99c7f5d42e	[AArch64] Don't assert on f16 in DUP PerfectShuffle generator. Found by code inspection, but breaking i16 at least breaks other tests. They aren't checking this in particular though, so also add some explicit tests for the already working types. llvm-svn: 235148	2015-04-16 23:57:07 +00:00
David Blaikie	dfadb4e9ee	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in r230786 to migrate your out of tree tests. 
Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

llvm-svn: 235145	2015-04-16 23:24:18 +00:00
Pete Cooper	7099480464	Disable AArch64 fast-isel on big-endian call vector returns. A big-endian vector return needs a byte-swap which we aren't doing right now. For now just bail on these cases to get correctness back. llvm-svn: 235133	2015-04-16 21:19:36 +00:00
Reid Kleckner	01ceef5d17	[WinEH] Handle a landingpad, resume, and cleanup all rolled into a BB This happens a lot with simple cleanups after SimplifyCFG. llvm-svn: 235117	2015-04-16 17:02:23 +00:00
Hans Wennborg	ff837f8fc0	Revert the switch lowering change (r235101, r235103, r235106) Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now. llvm-svn: 235108	2015-04-16 15:43:26 +00:00
Hans Wennborg	78be2559e7	Add a triple to switch.ll test. llvm-svn: 235103	2015-04-16 15:09:33 +00:00
Hans Wennborg	bc33cd14d7	Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) This is a major rewrite of the SelectionDAG switch lowering. The previous code would lower switches as a binary tree, discovering clusters of cases suitable for lowering by jump tables or bit tests as it went along. To increase the likelihood of finding jump tables, the binary tree pivot was selected to maximize case density on both sides of the pivot. By not selecting the pivot in the middle, the binary trees would not always be balanced, leading to performance problems in the generated code. This patch rewrites the lowering to search for clusters of cases suitable for jump tables or bit tests first, and then builds the binary tree around those clusters. This way, the binary tree will always be balanced. This has the added benefit of decoupling the different aspects of the lowering: tree building and jump table or bit tests finding are now easier to tweak separately. For example, this will enable us to balance the tree based on profile info in the future. The algorithm for finding jump tables is O(n^2), whereas the previous algorithm was O(n log n) for common cases, and quadratic only in the worst-case. This doesn't seem to be a major problem in practice, e.g. compiling a file consisting of a 10k-case switch was only 30% slower, and such large switches should be rare in practice. Compiling e.g. gcc.c showed no compile-time difference. If this does turn out to be a problem, we could limit the search space of the algorithm. This commit also disables all optimizations during switch lowering in -O0. Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235101	2015-04-16 14:49:23 +00:00
Simon Pilgrim	969718c5af	TRUNCATE constant folding - minor fix for rL233224 Fix for test case found by James Molloy - TRUNCATE of constant build vectors can be more simply achieved by simply replacing with a new build vector node with the truncated value type - no need to touch the scalar operands at all. llvm-svn: 235079	2015-04-16 08:21:09 +00:00
Ahmed Bougacha	697a3e0440	[CodeGen] Re-apply r234809 (concat of scalars), with an x86_mmx fix. The only type that isn't an integer, isn't floating point, and isn't a vector; ladies and gentlemen, the gift that keeps on giving: x86_mmx! Fixes PR23246. Original message (reverted in r235062): [CodeGen] Combine concat_vectors of scalars into build_vector. Combine something like: (v8i8 concat_vectors (v2i8 bitcast (i16)) x4) into: (v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4))) If any of the scalars are floating point, use that throughout. Differential Revision: http://reviews.llvm.org/D8948 llvm-svn: 235072	2015-04-16 02:39:14 +00:00
Nick Lewycky	a203d8de5f	Revert r234809 because it caused PR23246. llvm-svn: 235062	2015-04-16 00:56:20 +00:00
Reid Kleckner	3183c50ddf	[SEH] Deal with users of the old lpad for SEH catch-all blocks The way we split SEH catch-all blocks can leave some dead EH values behind at -O0. Try to remove them, and if we fail, replace them all with undef. Fixes a crash when removing the old unreachable landingpad which is still used by extractvalue instructions in the catch-all block. llvm-svn: 235061	2015-04-16 00:02:04 +00:00
Duncan P. N. Exon Smith	380b5bd2b0	DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory (variables with it seem to be the single largest `Metadata` contributor to memory usage right now in -g -flto builds), this stops optimization and backend passes from having to change local variables. The 'inlinedAt:' field was used by the backend in two ways: 1. To tell the backend whether and into what a variable was inlined. 2. To create a unique id for each inlined variable. Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg` attachment, and change the DWARF backend to use a typedef called `InlinedVariable` which is `std::pair<MDLocalVariable, MDLocation>`. This `DebugLoc` is already passed reliably through the backend (as verified by r234021). This commit removes the check from r234021, but I added a new check (that will survive) in r235048, and changed the `DIBuilder` API in r235041 to require a `!dbg` attachment whose 'scope:' is in the same `MDSubprogram` as the variable's. If this breaks your out-of-tree testcases, perhaps the script I used (mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778 in a moment. llvm-svn: 235050	2015-04-15 22:29:27 +00:00
Duncan P. N. Exon Smith	565a460434	DebugInfo: Add missing !dbg attachments to intrinsics Add missing `!dbg` attachments to `@llvm.dbg.*` intrinsics. I updated these using a script (add-dbg-to-intrinsics.sh) that I'll attach to PR22778 for posterity. llvm-svn: 235040	2015-04-15 21:04:10 +00:00
Reid Kleckner	9ab21ae989	[WinEH] Try to make the MachineFunction CFG more accurate This avoids emitting code for unreachable landingpad blocks that contain calls to llvm.eh.actions and indirectbr. It's also a first step towards unifying the SEH and WinEH lowering codepaths. I'm keeping the old fan-in lowering of SEH around until the preparation version works well enough that we can switch over without breaking existing users. llvm-svn: 235037	2015-04-15 18:48:15 +00:00
Reid Kleckner	7698801502	Reland "[WinEH] Use the parent function when computing frameescape labels" Fixed the test by removing extraneous quotes. llvm-svn: 235028	2015-04-15 17:47:26 +00:00
Reid Kleckner	8de582220d	Revert "[WinEH] Use the parent function when computing frameescape labels" This reverts commit r235025. The test isn't passing yet. llvm-svn: 235027	2015-04-15 17:43:54 +00:00
Reid Kleckner	9aae652557	[WinEH] Use the parent function when computing frameescape labels Fixes assertions in MC when a local label wasn't defined. llvm-svn: 235025	2015-04-15 17:32:01 +00:00
Rafael Espindola	518baef93e	Update tests to not be as dependent on section numbers. Many of these predate llvm-readobj. With elf-dump we had to match a relocation to symbol number and symbol number to symbol name or section number. llvm-svn: 235015	2015-04-15 15:59:37 +00:00
Sanjay Patel	e7d64b577c	[X86] add an exedepfix entry for movq == movlps == movlpd This is a 1-line patch (with a TODO for AVX because that will affect even more regression tests) that lets us substitute the appropriate 64-bit store for the float/double/int domains. It's not clear to me exactly what the difference is between the 0xD6 (MOVPQI2QImr) and 0x7E (MOVSDto64mr) opcodes, but this is apparently the right choice. Differential Revision: http://reviews.llvm.org/D8691 llvm-svn: 235014	2015-04-15 15:47:51 +00:00

12694 Commits