llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 11:33:24 +02:00

Author	SHA1	Message	Date
Hans Wennborg	7103f56991	Switch lowering: order bit tests by branch weight. llvm-svn: 235912	2015-04-27 20:21:17 +00:00
Bill Schmidt	6661e2ddb2	[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations This patch adds a new SSA MI pass that runs on little-endian PPC64 code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors without alignment constraints are accomplished for little-endian using lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional xxswapd instructions hurts performance in comparison with big-endian code, but they are necessary in the general case to support correct semantics. However, the general case does not apply to most vector code. Many vector instructions are lane-insensitive; they do not "care" which lanes the parallel computations are performed within, provided that the resulting data is stored into the correct locations. Thus this pass looks for computations that perform only lane-insensitive operations, and removes the unnecessary swaps from loads and stores in such computations. Future improvements will allow computations using certain lane-sensitive operations to also be optimized in this manner, by modifying the lane-sensitive operations to account for the permuted order of the lanes. However, this patch only adds the infrastructure to permit this; no lane-sensitive operations are optimized at this time. This code is heavily exercised by the various vectorizing applications in the projects/test-suite tree. For the time being, I have only added one simple test case to demonstrate what the pass is doing. Although it is quite simple, it provides coverage for much of the code, including the special case handling of copies and subreg-to-reg operations feeding the swaps. I plan to add additional tests in the future as I fill in more of the "special handling" code. Two existing tests were affected, because they expected the swaps to be present, but they are now removed. llvm-svn: 235910	2015-04-27 19:57:34 +00:00
Zachary Turner	3aea053fae	Make llvm-symbolizer work on Windows. Differential Revision: http://reviews.llvm.org/D9234 Reviewed By: Alexey Samsonov llvm-svn: 235900	2015-04-27 17:19:51 +00:00
Elena Demikhovsky	489127abd6	AVX-512: added calling conventions for i1 vectors. Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724 llvm-svn: 235889	2015-04-27 15:11:19 +00:00
Brendon Cahoon	37b8b0d293	[Hexagon] Use constant extenders to fix up hardware loops Use a loop instruction with a constant extender for a hardware loop instruction that is too far away from the start of the loop. This is cheaper than changing the SA register value. Differential Revision: http://reviews.llvm.org/D9262 llvm-svn: 235882	2015-04-27 14:16:43 +00:00
Toma Tabacu	9931ebcb0f	[mips] [IAS] Improve warning for using AT with .set noat. Summary: Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name. I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8479 llvm-svn: 235881	2015-04-27 14:05:04 +00:00
Vasileios Kalintiris	50a170aec4	Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel." This reapplies r235194, which was reverted in r235495 because it was causing a failure in our out-of-tree buildbots for MIPS. With the sign-extension patch in r235718, this patch doesn't cause any problem any more. llvm-svn: 235878	2015-04-27 13:28:05 +00:00
Elena Demikhovsky	3485573818	AVX-512: Extend/Truncate operations for SKX, SETCC for bit-vectors llvm-svn: 235875	2015-04-27 12:57:59 +00:00
Toma Tabacu	1e19472701	[MC] [IAS] Add support for the \@ .macro pseudo-variable. Summary: When used, it is substituted with the number of .macro instantiations we've done up to that point in time. So if this is the 1st time we've instantiated a .macro (any .macro, regardless of name), \@ will instantiate to 0, if it's the 2nd .macro instantiation, it will instantiate to 1 etc. It can only be used inside a .macro definition, an .irp definition or an .irpc definition (those last 2 uses are undocumented). Reviewers: echristo, rafael Reviewed By: rafael Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D9197 llvm-svn: 235862	2015-04-27 10:50:29 +00:00
Pawel Bylica	59a86f528f	Constfold insertelement to undef when index is out-of-bounds Summary: This patch adds constant folding of insertelement instruction to undef value when index operand is constant and is not less than vector size or is undef. InstCombine does not support this case, but I'm happy to add it there also if this change is accepted. Test Plan: Unittests and regression tests for ConstProp pass. Reviewers: majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9287 llvm-svn: 235854	2015-04-27 09:30:49 +00:00
Simon Pilgrim	8174379b38	[X86][SSE] Add v16i8/v32i8 multiplication support Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized. The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result. Differential Revision: http://reviews.llvm.org/D9115 llvm-svn: 235837	2015-04-27 07:55:46 +00:00
Philip Reames	ddbf5c48a9	[RewriteStatepointsForGC] Exclude constant values from being considered live at a safepoint There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code. To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work. Differential Revision: http://reviews.llvm.org/D9236 llvm-svn: 235821	2015-04-26 19:48:03 +00:00
Philip Reames	95df870e77	Don't Place Entry Safepoints Before the llvm.frameescape() Intrinsic llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block. Patch by: Swaroop.Sridhar@microsoft.com Differential Revision: http://reviews.llvm.org/D8910 llvm-svn: 235820	2015-04-26 19:41:23 +00:00
Matt Arsenault	363e46ec74	R600: Remove / merge redundant testcases llvm-svn: 235813	2015-04-26 00:53:33 +00:00
Sanjay Patel	6e05431540	[x86] instcombine more cases of insertps into a shufflevector This is a follow-on to D8833 (insertps optimization when the zero mask is not used). In this patch, we check for the case where the zmask is used, but both input vectors to the insertps intrinsic are the same operand or the zmask overrides the destination lane. This lets us replace the 2nd shuffle input operand with the zero vector. Differential Revision: http://reviews.llvm.org/D9257 llvm-svn: 235810	2015-04-25 20:55:25 +00:00
Sanjay Patel	6a762d07bf	add SSE run to check non-AVX codegen llvm-svn: 235809	2015-04-25 20:41:51 +00:00
Simon Pilgrim	85f3a113c7	line endings fix llvm-svn: 235800	2015-04-25 12:12:43 +00:00
Duncan P. N. Exon Smith	9cabb129a0	Linker: Copy over function metadata attachments Update `lib/Linker` to handle `Function` metadata attachments. The attachments stick with the function body. llvm-svn: 235786	2015-04-24 22:07:31 +00:00
Duncan P. N. Exon Smith	c4adf5ea45	IR: Add assembly/bitcode support for function metadata attachments Add serialization support for function metadata attachments (added in r235783). The syntax is: define @foo() !attach !0 { Metadata attachments are only allowed on functions with bodies. Since they come before the `{`, they're not really part of the body; since they require a body, they're not really part of the header. In `LLParser` I gave them a separate function called from `ParseDefine()`, `ParseOptionalFunctionMetadata()`. In bitcode, I'm using the same `METADATA_ATTACHMENT` record used by instructions. Instruction metadata attachments are included in a special "attachment" block at the end of a `Function`. The attachment records are laid out like this: InstID (KindID MetadataID)+ Note that these records always have an odd number of fields. The new code takes advantage of this to recognize function attachments (which don't need an instruction ID): (KindID MetadataID)+ This means we can use the same attachment block already used for instructions. This is part of PR23340. llvm-svn: 235785	2015-04-24 22:04:41 +00:00
Hans Wennborg	0a37a86ca9	SimplifyCFG: Correctly handle switch lookup tables which fully cover the input type and use bit tests to check for holes When using bit tests for hole checks, we call AddPredecessorToBlock to give the phi node a value from the bit test block. This would break if we've previously called removePredecessor on the default destination because the switch is fully covered. Test case by Mark Lacey. llvm-svn: 235771	2015-04-24 20:57:56 +00:00
Reid Kleckner	3a59b4e3a9	[SEH] Implement GetExceptionCode in __except blocks This introduces an intrinsic called llvm.eh.exceptioncode. It is lowered by copying the EAX value live into whatever basic block it is called from. Obviously, this only works if you insert it late during codegen, because otherwise mid-level passes might reschedule it. llvm-svn: 235768	2015-04-24 20:25:05 +00:00
David Blaikie	2fcc0180e4	[opaque pointer type] Add textual IR support for explicit type parameter to the invoke instruction Same as r235145 for the call instruction - the justification, tradeoffs, etc are all the same. The conversion script worked the same without any false negatives (after replacing 'call' with 'invoke'). llvm-svn: 235755	2015-04-24 19:32:54 +00:00
Sundeep Kushwaha	17d5168fea	[PATCH] [Hexagon] Adding a test case for calling convention. http://reviews.llvm.org/D9241 llvm-svn: 235754	2015-04-24 19:22:02 +00:00
David Blaikie	d53179affb	Revert changes to LTO test case since llvm-lto can't handle textual IR inputs llvm-svn: 235738	2015-04-24 18:13:27 +00:00
David Blaikie	a674c3cd83	Skip extra LLVM IR assemble/disassemble steps in some tests llvm-svn: 235736	2015-04-24 18:06:09 +00:00
David Blaikie	196bfb60ad	[opaque pointer type] bitcode: add explicit callee type to invoke instructions llvm-svn: 235735	2015-04-24 18:06:06 +00:00
Yaron Keren	b412c387f7	Teach AArch64\lit.local.cfg the new triple names windows-gnu and windows-msvc. Tests were failing when built with -DLLVM_DEFAULT_TARGET_TRIPLE=i686-pc-windows-gnu. llvm-svn: 235733	2015-04-24 17:14:16 +00:00
Duncan P. N. Exon Smith	3d54d64f27	Linker: Update -override testcase to check callers Check that `@main` is calling `@foo2` (the renamed internal function), not the `@foo` with external linkage that's been pulled in from the override file. llvm-svn: 235730	2015-04-24 16:56:24 +00:00
Hans Wennborg	2223cdaf41	Switch lowering: fix APInt overflow causing infinite loop / OOM llvm-svn: 235729	2015-04-24 16:53:55 +00:00
Reid Kleckner	a6adb4cb4e	[WinEH] Split the landingpad BB instead of cloning it This means we don't have to RAUW the landingpad instruction and landingpad BB, which is a nice win. llvm-svn: 235725	2015-04-24 16:22:19 +00:00
Filipe Cabecinhas	6a560937ff	[BitcodeReader] Fix asserts when we read a non-vector type for insert/extract/shuffle Added some additional checking for vector types + tests. Bug found with AFL fuzz. llvm-svn: 235710	2015-04-24 11:30:15 +00:00
Jingyue Wu	f2297fbf03	Resurrect r235688 We should skip vector types which are not SCEVable. test/CodeGen/NVPTX/sched2.ll passes llvm-svn: 235695	2015-04-24 04:22:39 +00:00
Jingyue Wu	3bc256d7df	Revert r235688 Seems to be breaking builds llvm-svn: 235690	2015-04-24 03:26:11 +00:00
Jingyue Wu	b822e43272	[NVPTX] Emits "generic()" depending on the original address space Summary: Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()" on aggregates of addrspacecasts. Test Plan: addrspacecast-gvar.ll Reviewers: eliben, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9130 llvm-svn: 235689	2015-04-24 02:57:30 +00:00
Jingyue Wu	6e67398da9	[NVPTX] enable NaryReassociate in NVPTX Summary: We run NaryReassociate right after SLSR because SLSR enables many opportunities for NaryReassociate. For example, in nary-slsr.ll foo((a + b) + c); foo((a + b * 2) + c); foo((a + b * 3) + c); // 2 muls and 6 adds after SLSR: ab = a + b; foo(ab + c); ab2 = ab + b; foo(ab2 + c); ab3 = ab2 + b; foo(ab3 + c); // 6 adds after NaryReassociate: abc = (a + b) + c; foo(abc); ab2c = abc + b; foo(ab2c); ab3c = ab2c + b; foo(ab3c); // 4 adds Test Plan: nary-slsr.ll Reviewers: jholewinski, eliben Reviewed By: eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9066 llvm-svn: 235688	2015-04-24 02:54:06 +00:00
Matt Arsenault	5c45e4a835	R600/SI: Fix verifier error when producing v_madmk_f32 Copy the kill flags when swapping the operands. llvm-svn: 235687	2015-04-24 01:57:58 +00:00
Matthias Braun	3cae2dc2b2	R600/RegisterCoalescer: Enable more rematerialization/add missing testcase This enables the rematerialization of some R600 MOV instructions in the RegisterCoalescer and adds a testcase for r235668. llvm-svn: 235675	2015-04-24 00:25:50 +00:00
Reid Kleckner	40c6671601	Re-commit "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works" This reverts commit r235617. r235649 should have addressed the problems. llvm-svn: 235667	2015-04-23 23:22:33 +00:00
Hal Finkel	8d20785e6c	[PowerPC] Support register name prefixes for vector registers Match binutils by supporting the optional register name prefix for new vector registers ("vs" for VSX registers and "q" for QPX registers). llvm-svn: 235665	2015-04-23 23:16:22 +00:00
Hal Finkel	acf4a0f1ca	[PowerPC] Use sync inst alias when printing So long as the choice between printing msync and sync is not ambiguous, we can print 'sync 0' and just 'sync'. llvm-svn: 235663	2015-04-23 23:05:08 +00:00
Tom Stellard	5903dce77b	R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands llvm-svn: 235662	2015-04-23 22:59:24 +00:00
Hal Finkel	5faf4416ba	[PowerPC] Add asm/disasm support for dcbt with hint Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint field specified (non-zero). Unfortunately, the syntax for this instruction is special in that it differs for server vs. embedded cores: dcbt ra, rb, th [server] dcbt th, ra, rb [embedded] where th can be omitted when it is 0. dcbtst is the same. Thus we need to play games in the parser and the printer to flip the operands around on the embedded cores. We'll use the server syntax as the default (binutils currently uses the embedded form by default, but IBM is changing that). We also stop marking dcbtst as having unmodeled side effects (this is not necessary, it is just a hint like dcbt -- noticed by inspection, so no separate test case). llvm-svn: 235657	2015-04-23 22:47:57 +00:00
Andrew Kaylor	ea1f0f8058	[WinEH] Ignore filter clauses while mapping landing pad blocks. llvm-svn: 235656	2015-04-23 22:38:36 +00:00
Reid Kleckner	ffabe51b14	[WinEH] Replace more lpad value uses with undef We were asserting on code like this: extern "C" unsigned long _exception_code(); void might_crash(unsigned long); void foo() { __try { might_crash(0); } __except(1) { might_crash(_exception_code()); } } Gtest and many other libraries get the exception code from the __except block. What's supposed to happen here is that EAX is live into the __except block, and it contains the exception code. Eventually we'll represent that as a use of the landingpad ehptr value, but for now we can replace it with undef. llvm-svn: 235649	2015-04-23 21:22:30 +00:00
Quentin Colombet	329e6c2b9b	[MachineCopyPropagation] Handle undef flags conservatively so that we do not remove copies that are useful after breaking some hardware dependencies. In other words, handle this kind of situation conservatively by assuming reg2 is redefined by the undef flag. reg1 = copy reg2 = inst reg2<undef> reg2 = copy reg1 Copy propagation used to remove the last copy. This is incorrect because the undef flag on reg2 in inst allows later passes to put whatever trashed value in reg2 that may help. In practice we end up with this code: reg1 = copy reg2 reg2 = 0 = inst reg2<undef> reg2 = copy reg1 This fixes PR21743. llvm-svn: 235647	2015-04-23 21:17:39 +00:00
Tom Stellard	289922ac39	R600/SI: Fix indirect addressing with a negative constant offset When the base register index of the vector plus the constant offset was less than zero, we were passing the wrong base register to the indirect addressing instruction. In this case, we need to set the base register to v0 and then add the computed (negative) index to m0. llvm-svn: 235641	2015-04-23 20:32:01 +00:00
Peter Collingbourne	249a230d23	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	685edd3002	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	e778bcdbc1	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	2930fae4b8	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
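
Note on commit 8174379b38 ([X86][SSE] Add v16i8/v32i8 multiplication support): the message describes widening each i8 lane to i16, multiplying with pmullw, then keeping the low byte of each result. The following is only a scalar Python model of that arithmetic, not the LLVM lowering code. The function names are hypothetical, and it assumes pmullw keeps the low 16 bits of each lane product.

```python
def to_i8(value):
    """Interpret the low 8 bits of value as a signed 8-bit integer."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def mul_i8_lanes_via_i16(a, b):
    """Model an i8 vector multiply done through 16-bit lanes.

    a and b are equal-length lists of signed 8-bit values (hypothetical
    stand-ins for the v16i8/v32i8 operands).
    """
    result = []
    for x, y in zip(a, b):
        # Sign extension: Python ints already carry the sign of x and y.
        # Assumed pmullw behaviour: keep the low 16 bits of the product.
        product16 = (x * y) & 0xFFFF
        # Pack step: keep only the low byte of each 16-bit result.
        result.append(to_i8(product16))
    return result


lanes = mul_i8_lanes_via_i16([127, -128, -1, 5], [2, -1, -1, 7])
```

The low byte of the widened product matches ordinary wraparound i8 multiplication, which is why the pack step can discard the high byte.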


29747 Commits
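
Note on commit c4adf5ea45 (IR: Add assembly/bitcode support for function metadata attachments): the message says instruction attachment records are laid out as InstID (KindID MetadataID)+ and so always have an odd number of fields, while function attachments are (KindID MetadataID)+. A minimal sketch of that parity check, assuming a record is already available as a plain list of integers; the function name and return shape are hypothetical and this is not the BitcodeReader code.

```python
def decode_attachment_record(fields):
    """Split a METADATA_ATTACHMENT-style record into (inst_id, pairs).

    inst_id is None for a function attachment (even field count) and the
    leading field for an instruction attachment (odd field count).
    """
    if not fields:
        raise ValueError("empty attachment record")
    if len(fields) % 2 == 1:
        # Odd length: InstID followed by (KindID, MetadataID) pairs.
        inst_id, rest = fields[0], fields[1:]
    else:
        # Even length: only (KindID, MetadataID) pairs, attached to the function.
        inst_id, rest = None, fields
    pairs = [(rest[i], rest[i + 1]) for i in range(0, len(rest), 2)]
    return inst_id, pairs


instruction_record = decode_attachment_record([7, 1, 42])
function_record = decode_attachment_record([1, 42, 2, 43])
```

Because the two layouts can never have the same length parity, both kinds of record can share one attachment block without an extra tag field.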