llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	877d6ce5ff	[X86] Split ctpop/ctlz/cttz cost tests This will make things a lot easier to test all the permutations of avx512 llvm-svn: 303290	2017-05-17 19:57:20 +00:00
Matt Arsenault	b41d61b11a	AMDGPU: Fix min3/max3 combines for f16/i16 Fix missing instruction definitions for min3/max3. llvm-svn: 303284	2017-05-17 19:25:06 +00:00
Simon Pilgrim	52e72fd51b	[X86][AVX512] Add 512-bit vector bitreverse costs + tests llvm-svn: 303283	2017-05-17 19:20:20 +00:00
Sanjay Patel	66f3d22962	[InstCombine] add isCanonicalPredicate() helper function and use it; NFCI There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261	2017-05-17 14:21:19 +00:00
Daniel Sanders	c6132632b6	[globalisel][tablegen] Import rules containing intrinsic_wo_chain. Summary: As of this patch, 1018 out of 3938 rules are currently imported. Depends on D32275 Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar Reviewed By: qcolombet Subscribers: dberris, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32278 llvm-svn: 303259	2017-05-17 13:39:49 +00:00
Sanjay Patel	8c90f78395	[x86] Update tests in psubus.ll; NFC Remove unnecessary memops to minimize tests. Patch by Yulia Koval! Differential Revision: https://reviews.llvm.org/D32643 llvm-svn: 303258	2017-05-17 13:39:16 +00:00
Krzysztof Parzyszek	3fd2e20288	[PPC] Properly update register save area offsets The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase led to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257	2017-05-17 13:25:09 +00:00
Igor Breger	259f70612a	[GlobalISel][X86] Support add i64 in IA32. Summary: support G_UADDE instruction selection. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D33096 llvm-svn: 303255	2017-05-17 12:48:08 +00:00
Jonas Paulsson	4273fbf74e	[SystemZ] Modelling of costs of divisions with a constant power of 2. Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254	2017-05-17 12:46:26 +00:00
Daniel Sanders	fb194eb754	[globalisel][tablegen] Require that all registers between instructions of a match are virtual. Summary: Without this, it's possible to encounter multiple defs for a register. This is triggered by the current version of D32868 when applied to trunk. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32869 llvm-svn: 303253	2017-05-17 12:43:30 +00:00
Daniel Cederman	e286942bf0	[Sparc] Remove execute permissions from non-executable text files Reviewers: jyknight, lero_chris, venkatra Reviewed By: jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27127 llvm-svn: 303245	2017-05-17 11:05:20 +00:00
Diana Picus	3318f6498e	[GlobalISel][TableGen] Fix handling of default operands When looping through a destination pattern's operands to decide how many default operands we need to introduce, we used to count the "expanded" number of operands. So if one default operand would be rendered as 2 values, we'd count it as 2 operands, when in fact it needs to count as only 1 operand regardless of how many values it expands to. This turns out to be a problem only in some very specific cases, e.g. when we have one operand with multiple default values followed by more operands with default values (see the new test). In such a situation we'd stop looping before looking at all the operands, and then error out assuming that we don't have enough default operands to make up the shortfall. At the moment this only affects ARM. The patch removes the loop counting default operands entirely and assumes that we'll have to introduce values for any default operand that we find (i.e. we're assuming it cannot be given as a child at all). It also extracts the code for adding renderers for default operands into a helper method. Differential Revision: https://reviews.llvm.org/D33031 llvm-svn: 303240	2017-05-17 08:57:28 +00:00
Pavel Labath	79e9360669	[RuntimeDyld] Fix debug section relocation (pr20457) Summary: Debug info sections, (or non-SHF_ALLOC sections in general) should be linked as if their load address was zero to emulate the behavior of the static linker. This bug was discovered because it was breaking lldb expression evaluation on linux. Reviewers: lhames Subscribers: aprantl, eugene, clayborg, lldb-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D32899 llvm-svn: 303239	2017-05-17 08:47:28 +00:00
Gor Nishanov	ef81a105e5	[coroutines] Handle spills before catchswitch If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232	2017-05-17 03:09:22 +00:00
Davide Italiano	e6fa4e685f	[NewGVN] Re-enable test now that the nondeterminism has been fixed. llvm-svn: 303217	2017-05-16 22:27:06 +00:00
NAKAMURA Takumi	0fc3371c12	llvm/test/Transforms/InstCombine/debuginfo-skip.ll REQUIRES +asserts. llvm-svn: 303216	2017-05-16 22:19:56 +00:00
Sanjay Patel	5c904052a0	[InstSimplify] add folds for constant mask of value shifted by constant We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213	2017-05-16 21:51:04 +00:00
Amara Emerson	b4afa9c73c	Re-commit r302678, fixing PR33053. The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. llvm-svn: 303211	2017-05-16 21:29:22 +00:00
Easwaran Raman	d91313ddb5	[Inliner] Do not mix callsite and callee hotness based updates. Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210	2017-05-16 21:18:09 +00:00
Tim Shen	82dcf06a2b	[PPC] Add -ppc-asm-full-reg-names to atomic-2.ll. NFC. Differential Revisions: https://reviews.llvm.org/D32763 llvm-svn: 303209	2017-05-16 20:58:55 +00:00
Matthias Braun	b5e4bc434a	Test for r303197 llvm-svn: 303208	2017-05-16 20:53:27 +00:00
Tim Shen	d0970ab97a	[PPC] Lower load acquire/seq_cst trailing fence to cmp + bne + isync. Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205	2017-05-16 20:18:06 +00:00
Sanjay Patel	95166c2c5b	[InstCombine] auto-generate better checks; NFC llvm-svn: 303203	2017-05-16 20:09:32 +00:00
Dmitry Mikulin	2eecd88090	In debug builds non-trivial amount of time is spent in InstCombine processing @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202	2017-05-16 20:08:49 +00:00
Reid Kleckner	6ef635c682	Revert "[X86] Replace slow LEA instructions in X86" This reverts commit r303183, it broke various buildbots and introduced sanitizer errors. llvm-svn: 303199	2017-05-16 19:55:03 +00:00
Nirav Dave	3633380341	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198	2017-05-16 19:43:56 +00:00
Renato Golin	03c513ec21	Revert "[ARM] Mark LEApcrel instructions as isAsCheapAsAMove" Revert "[ARM] Mark LEApcrel as not having side effects" This reverts commit r303054 and r303053, as they broke the ARM self-hosting buildbots: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/1550 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/1349 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/1845 Offline investigation on course. llvm-svn: 303193	2017-05-16 17:59:07 +00:00
Sanjay Patel	483e8a2253	[InstCombine] add motivational comment for tests; NFC The referenced tests are derived from: https://bugs.llvm.org/show_bug.cgi?id=32791 and: https://reviews.llvm.org/D33172 The motivation for including negative tests may not be clear, so I'm adding an explanatory comment here. In the post-commit thread for r303133: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170515/453793.html ...it was mentioned that we don't want to add redundant tests. This is a valid point. But in this case, we have a patch under review (D33172) that demonstrates that no existing regression tests are affected by a proposed code change, but these are. Therefore, I think these tests have value not visible in any existing regression tests regardless of whether they show a transform. Differential Revision: https://reviews.llvm.org/D33242 llvm-svn: 303185	2017-05-16 16:30:46 +00:00
Lama Saba	9f9269fa35	[X86] Replace slow LEA instructions in X86 According to Intel's Optimization Reference Manual for SNB+: " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1: - LEA that has all three source operands: base, index, and offset - LEA that uses base and index registers where the base is EBP, RBP,or R13 - LEA that uses RIP relative addressing mode - LEA that uses 16-bit addressing mode " This patch currently handles the first 2 cases only. Differential Revision: https://reviews.llvm.org/D32277 llvm-svn: 303183	2017-05-16 16:01:36 +00:00
Matthew Simpson	be5fce863d	Revert 303174, 303176, and 303178 These commits are breaking the bots. Reverting to investigate. llvm-svn: 303182	2017-05-16 15:50:30 +00:00
Matthew Simpson	f9ca5aa639	Make test target-specific llvm-svn: 303178	2017-05-16 15:33:22 +00:00
Matthew Simpson	2b38ec1ab7	Fix test case to unbreak bots llvm-svn: 303176	2017-05-16 15:20:27 +00:00
Matthew Simpson	fdeda43e2f	[LV] Avoid potential division by zero when selecting IC llvm-svn: 303174	2017-05-16 14:43:55 +00:00
Gor Nishanov	e2a5e02b38	[coroutines] Handle unwind edge splitting Summary: The RewritePHIs algorithm used in building the CoroFrame inserts a placeholder ``` %placeholder = phi [%val] ``` on every edge leading to a block starting with a PHI node with multiple incoming edges, so that if one of the incoming values was spilled and needs to be reloaded, we have a place to insert a reload. We use the SplitEdge helper function to split the incoming edge. The SplitEdge function does not deal with unwind edges coming into a block with an EHPad. This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge. For landing pads, we clone the landing pad into every edge block and replace the original landing pad with a PHI collecting the values from all incoming landing pads. For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleanupret in the edge blocks. Reviewers: majnemer, rnk Reviewed By: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31845 llvm-svn: 303172	2017-05-16 14:11:39 +00:00
Igor Breger	7b3a99c110	[GlobalISel][X86] Split memop test file. NFC llvm-svn: 303169	2017-05-16 13:37:31 +00:00
Max Kazantsev	66886e6d12	[SCEV] Fix sorting order for AddRecExprs The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in the operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrence in the operands order, it will be the bottom-most in terms of the domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with the old logic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Davide Italiano	c968f4e36b	Revert "[NewGVN] Replace predicate info leftovers." It's breaking the bots. llvm-svn: 303142	2017-05-16 05:51:21 +00:00
Davide Italiano	08da355c1b	[NewGVN] Replace predicate info leftovers. Fixes PR32945. Differential Revision: https://reviews.llvm.org/D33226 llvm-svn: 303141	2017-05-16 05:23:23 +00:00
Sanjay Patel	3f32289f72	[InstCombine] add tests for PR32791; NFC llvm-svn: 303133	2017-05-15 23:59:28 +00:00
Francis Visoiu Mistrih	6212dd268d	[ShrinkWrapping] Handle restores on no-return paths Shrink-wrapping uses post-dominators to find a restore point that post-dominates all the uses of CSR / stack. The way dominator trees are modeled in LLVM today is that unreachable blocks are not present in a generic dominator tree, so, an unreachable node is dominated by anything: include/llvm/Support/GenericDomTree.h:467. Since for post-dominators, a no-return block is considered "unreachable", calling findNearestCommonDominator on an unreachable node A and a non-unreachable node B, will return B, which can be false. If we find such node, we bail out since there is no good restore point available. rdar://problem/30186931 llvm-svn: 303130	2017-05-15 23:13:35 +00:00
Sanjay Patel	6a5f74bb94	[InstSimplify] add tests for unnecessary mask of shifted values; NFC llvm-svn: 303127	2017-05-15 22:54:37 +00:00
Justin Bogner	e9f4448b87	Add "REQUIRES:" to the last few tests that use target specific intrinsics llvm-svn: 303123	2017-05-15 22:15:22 +00:00
Tim Northover	cfd3f40830	AArch64: use linker-private symbols for globals in MachO. We don't use section-relative relocations on AArch64, so all symbols must be at least visible to the linker (i.e. properly global or l_whatever, but not L_whatever). llvm-svn: 303118	2017-05-15 21:51:38 +00:00
David Blaikie	cbf90e5613	PR32288: Describe a bool parameter's DWARF location with a simple register There's no need (& a bit incorrect) to mask off the high bits of the register reference when describing a simple bool value. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D31062 llvm-svn: 303117	2017-05-15 21:34:01 +00:00
Adam Nemet	f9607f0660	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. 
* MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Hans Wennborg	247e13c637	Revert r302678 "[AArch64] Enable use of reduction intrinsics." This caused PR33053. Original commit message: > The new experimental reduction intrinsics can now be used, so I'm enabling this > for AArch64. We will need this for SVE anyway, so it makes sense to do this for > NEON reductions as well. > > The existing code to match shufflevector patterns are replaced with a direct > lowering of the reductions to AArch64-specific nodes. Tests updated with the > new, simpler, representation. > > Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 303115	2017-05-15 20:59:32 +00:00
Tim Northover	77c86e2d11	AArch64: diagnose unrecognized features in .cpu directive. We were silently ignoring any features we couldn't match up, which led to errors in an inline asm block missing the conventional "\n\t". llvm-svn: 303108	2017-05-15 19:42:15 +00:00
Sanjay Patel	612c21f9a9	[InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949) This is the InstCombine counterpart to D32954. I added some comments about the code duplication in: rL302436 Alive-based verification: http://rise4fun.com/Alive/dPw This is a 2nd fix for the problem reported in: https://bugs.llvm.org/show_bug.cgi?id=32949 Differential Revision: https://reviews.llvm.org/D32970 llvm-svn: 303105	2017-05-15 19:27:53 +00:00
Sanjay Patel	6116bcb0ac	[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949) These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving: https://bugs.llvm.org/show_bug.cgi?id=9343 As shown here: http://rise4fun.com/Alive/C8 ...however, the sdiv exact case needs a stronger predicate. I opted for duplicated code instead of adding another fallthrough because I think that's easier to read (and edit in case we need/want to restrict/loosen the predicates any more). This should fix: https://bugs.llvm.org/show_bug.cgi?id=32949 https://bugs.llvm.org/show_bug.cgi?id=32948 Differential Revision: https://reviews.llvm.org/D32954 llvm-svn: 303104	2017-05-15 19:16:49 +00:00
Evgeny Stupachenko	d11ab9e578	The patch adds CTLZ idiom recognition. Summary: The following loops should be recognized: i = 0; while (n) { n = n >> 1; i++; body(); } use(i); And replaced with builtin_ctlz(n) if body() is empty or for CPUs that have CTLZ instruction converted to countable: for (j = 0; j < builtin_ctlz(n); j++) { n = n >> 1; i++; body(); } use(builtin_ctlz(n)); Reviewers: rengolin, joerg Differential Revision: http://reviews.llvm.org/D32605 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303102	2017-05-15 19:08:56 +00:00
Davide Italiano	a3335b259b	[NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency(). verifyMemoryCongruency() filters out trivially dead MemoryDef(s), as we find them immediately dead, before moving from TOP to a new congruence class. This fixes the same problem for PHI(s) skipping MemoryPhis if all the operands are dead. Differential Revision: https://reviews.llvm.org/D33044 llvm-svn: 303100	2017-05-15 18:50:53 +00:00
Teresa Johnson	b00b861ff8	Add support for handling ifuncs to GlobalValue::getBaseObject Summary: All GlobalIndirectSymbol types (not just GlobalAlias) should return their base object. Without this patch LTO would warn "Unable to determine comdat of alias!" for an ifunc. Reviewers: pcc Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D33202 llvm-svn: 303096	2017-05-15 18:28:29 +00:00
Kyle Butt	0bcf661a2a	CodeGen: BlockPlacement: Increase tail duplication size for O3. At O3 we are more willing to increase size if we believe it will improve performance. The current threshold for tail-duplication of 2 instructions is conservative, and can be relaxed at O3. Benchmark results: llvm test-suite: 6% improvement in aha, due to duplication of loop latch 3% improvement in hexxagon 2% slowdown in lpbench. Seems related, but couldn't completely diagnose. Internal google benchmark: Produces 4% improvement on internal google protocol buffer serialization benchmarks. Differential-Revision: https://reviews.llvm.org/D32324 llvm-svn: 303084	2017-05-15 17:30:47 +00:00
Simon Pilgrim	5188a44a1b	[NVPTX] Don't flag StoreParam/LoadParam memory chain operands as ReadMem/WriteMem (PR32146) Follow up to D33147 NVPTXTargetLowering::LowerCall was trusting the default argument values. Fixes another 17 of the NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33189 llvm-svn: 303082	2017-05-15 17:17:44 +00:00
Rafael Espindola	7369fbc825	Add an extra test for archive symbol tables. The table should include only defined symbols. llvm-svn: 303075	2017-05-15 15:56:23 +00:00
Simon Pilgrim	a09d9c5af0	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 add/sub/mul llvm-svn: 303074	2017-05-15 15:48:15 +00:00
Florian Hahn	c8227d3439	[AArch64] Enable FeatureFuseAES on Cortex-A72. This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10. llvm-svn: 303073	2017-05-15 15:15:22 +00:00
Dmitry Preobrazhensky	5a5f736ba9	[AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64 See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 llvm-svn: 303070	2017-05-15 14:28:23 +00:00
Simon Pilgrim	61ca3be831	[SLPVectorizer][X86] Add vectorization tests for vXi64/vXi32/vXi16/VXi8 shifts llvm-svn: 303069	2017-05-15 14:27:11 +00:00
Dinar Temirbulatov	41867e5c63	Test commit. llvm-svn: 303059	2017-05-15 13:14:04 +00:00
Dmitry Preobrazhensky	5cc1148d9a	[AMDGPU][MC] Removed V_MQSAD_U16_U8 This instruction does not really exist See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018 Reviewers: vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D33126 llvm-svn: 303055	2017-05-15 12:37:03 +00:00
John Brawn	3f252f0e9f	[ARM] Mark LEApcrel instructions as isAsCheapAsAMove Doing this means that if an LEApcrel is used in two places we will rematerialize instead of generating two MOVs. This is particularly useful for printfs using the same format string, where we want to generate an address into a register that's going to get corrupted by the call. Differential Revision: https://reviews.llvm.org/D32858 llvm-svn: 303054	2017-05-15 11:57:54 +00:00
John Brawn	1f37e6bdfc	[ARM] Mark LEApcrel as not having side effects Doing this lets us hoist it out of loops, and I've also marked it as rematerializable the same as the thumb1 and thumb2 counterparts. It looks like it being marked as such was just a mistake, as the commit that made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the LEApcrelJT instructions were marked as having side-effects, so it looks like the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was accidentally marked as such also. Differential Revision: https://reviews.llvm.org/D32857 llvm-svn: 303053	2017-05-15 11:50:21 +00:00
Ayman Musa	46c492dbd0	[X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional). CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 llvm-svn: 303050	2017-05-15 11:30:54 +00:00
Sam Kolton	396cd8fa77	[TableGen] Add EncoderMethod to RegisterOperand Reviewers: stoklund, grosbach, vpykhtin Differential Revision: https://reviews.llvm.org/D32493 llvm-svn: 303044	2017-05-15 10:13:07 +00:00
Arnaud A. de Grandmaison	5d50c74fd1	MCObjectStreamer : fail with a diagnostic when emitting an out of range value. We were previously silently emitting bogus data in release mode, making it very hard to diagnose the error, or crashing with an assert in debug mode. A proper diagnostic is now always emitted when the value to be emitted is out of range. llvm-svn: 303041	2017-05-15 08:43:27 +00:00
Igor Breger	bbf2eba3b0	[GlobalISel][X86] G_BR instruction select test llvm-svn: 303036	2017-05-15 07:03:38 +00:00
Daniel Jasper	72029fd63e	Add '#' to test regex that I forgot in r303025. llvm-svn: 303034	2017-05-15 04:58:27 +00:00
Daniel Jasper	9ff7dd68c1	Fix two tests that weren't correctly copied. One didn't correctly define the regex variable, the other still had a RUN line for FNOBUILTIN-checks, which weren't copied to the file. llvm-svn: 303025	2017-05-14 22:07:50 +00:00
Simon Pilgrim	58549a36e1	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts llvm-svn: 303023	2017-05-14 20:52:11 +00:00
Simon Pilgrim	3d4f03d434	[X86][AVX2] Fix costs for v4i64 ashr by splat llvm-svn: 303022	2017-05-14 20:25:42 +00:00
Simon Pilgrim	662344253d	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat llvm-svn: 303021	2017-05-14 20:02:34 +00:00
Craig Topper	d441caa7a7	[X86] Add avx512vl command lines to the 128/256-bit vector-lzcnt tests so we can see what compare instructions are being used in the lookup table code. I noticed the 512-bit lzcnts don't use the X86 specific lookup table code and instead use the EXPAND case in LegalizeDAG. I was toying around with fixing this and noticed it would require compare instructions that generate i1 masks and then converting from mask to vector. Then I noticed that we don't test which compares are used with avx512vl and no avx512cd. llvm-svn: 303020	2017-05-14 19:38:11 +00:00
Craig Topper	7bf8b47c7e	[X86] Cleanup some of the check-prefixes in the vector-lzcnt tests. Remove an unneeded prefix from the 32-bit command line. Make all the 64-bit triples match. Replace ALL with X64 and remove it from the 32-bit test. llvm-svn: 303019	2017-05-14 19:38:09 +00:00
Simon Pilgrim	9b072aaee8	[X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences llvm-svn: 303017	2017-05-14 18:52:15 +00:00
Shoaib Meenai	3230d2780d	[COFF] Gracefully handle empty .drectve sections Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error: Invalid data was encountered while parsing the file This happened because some of the object files in the archive have empty `.drectve` sections. These empty sections result in a `parse_failed` error being returned from `COFFObjectFile::getSectionContents()`, which in turn caused `llvm-readobj` to stop. With this change, `getSectionContents` now returns success, and like before the resulting array is empty. Patch by Dave Lee. Differential Revision: https://reviews.llvm.org/D32652 llvm-svn: 303014	2017-05-14 18:34:56 +00:00
Simon Pilgrim	bd07f4d3ed	[X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask. Tweak cost model to match what lowering actually does. llvm-svn: 303013	2017-05-14 17:59:46 +00:00
Simon Pilgrim	d440f01082	[X86][SSE] Account for cost of extract/insert of v32i8 vector shifts llvm-svn: 303012	2017-05-14 17:36:07 +00:00
Simon Pilgrim	75f3dd7e94	[X86][XOP] Account for cost of extract/insert of 256-bit vector shifts llvm-svn: 303010	2017-05-14 13:38:53 +00:00
Simon Pilgrim	42caa7142b	[X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant splats for vXi64 shifts. llvm-svn: 303009	2017-05-14 11:46:26 +00:00
Simon Pilgrim	3b5e5ceefe	[X86][AVX] Add additional 32-bit target vector shift tests Shows issue with 32-bits not being able to peek through subvectors to extract constant splats llvm-svn: 303008	2017-05-14 11:13:03 +00:00
Craig Topper	6c2eb35d5e	[InstSimplify] Add patterns for folding (A & B) \| (~A ^ B) -> (~A ^ B) and its commuted variants. We already had (A & ~B) \| (A ^ B), but we missed the cases where the not was part of the xor. llvm-svn: 303004	2017-05-14 07:54:43 +00:00
Craig Topper	3da63abd9f	foo llvm-svn: 303003	2017-05-14 07:54:40 +00:00
Xinliang David Li	5f25d42892	Re-enable test that was disabled due to cost analysis llvm-svn: 303000	2017-05-14 02:58:39 +00:00
Zachary Turner	30b209f262	[llvm-pdbdump] Add the option to sort functions and data. llvm-svn: 302998	2017-05-14 01:13:40 +00:00
Simon Pilgrim	fb7acbc016	[SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits llvm-svn: 302997	2017-05-13 22:10:58 +00:00
Simon Pilgrim	e3e00995f1	[X86][SSE] Test showing missing EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts support in ComputeNumSignBits llvm-svn: 302994	2017-05-13 21:50:18 +00:00
Simon Pilgrim	dd6308c45b	[SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits llvm-svn: 302993	2017-05-13 19:57:10 +00:00
Simon Pilgrim	64ce41df9a	[X86][SSE] Test showing inability of ComputeNumSignBits to resolve shuffles llvm-svn: 302992	2017-05-13 17:41:07 +00:00
Justin Bogner	0e68367839	MSan: Mark MemorySanitizer tests that use x86 intrinsics as REQUIRES: x86 Tests that use target intrinsics are inherently target specific. Mark them as such. llvm-svn: 302990	2017-05-13 16:24:38 +00:00
Simon Pilgrim	91451ab4d4	[x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization) Further perf tests on Jaguar indicate that: vxorps %ymm0, %ymm0, %ymm0 vcmpps $15, %ymm0, %ymm0, %ymm0 is consistently faster (by about 9%) than: vpcmpeqd %xmm0, %xmm0, %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D32416 llvm-svn: 302989	2017-05-13 13:42:35 +00:00
Simon Pilgrim	288ebc9253	[LoopOptimizer][Fix]PR32859, PR24738 The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form. Here it is: before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ] and after this change we have: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ] Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 llvm-svn: 302988	2017-05-13 13:25:57 +00:00
Craig Topper	d39e7102a2	[InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation Summary: If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now. This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shouldn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this. This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that. Reviewers: davide, majnemer, spatel, chandlerc Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32453 llvm-svn: 302982	2017-05-13 06:56:04 +00:00
Justin Bogner	104b204907	ConstProp: Split x86 SSE intrinsic tests out of calls.ll This allows us to mark this as `REQUIRES: x86`, since it uses x86 target specific intrinsics. llvm-svn: 302980	2017-05-13 05:52:17 +00:00
Justin Bogner	2d310588a2	InstCombine: Move tests that use target intrinsics into subdirectories Tests with target intrinsics are inherently target specific, so it doesn't actually make sense to run them if we've excluded their target. llvm-svn: 302979	2017-05-13 05:39:46 +00:00
NAKAMURA Takumi	0474b0e5ba	Disable llvm/test/Transforms/NewGVN/pr32934.ll while Davide is investigating. llvm-svn: 302977	2017-05-13 03:05:38 +00:00
Davide Italiano	c5771fbe04	[NewGVN] XFAIL a flaky test until I find out what's going on. I bet the change is correct but this test seems to expose some underlying problem that manifest only on some buildbots, and I'm not able to reproduce locally. Unfortunately I can't debug right now but I don't want to annoy people with spurious failures, so I'll XFAIL until I can take a look (over the weekend). llvm-svn: 302976	2017-05-13 02:45:47 +00:00
Dylan McKay	c826f672fe	[AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot Contributed by Dr. Gergő Érdi. Fixes a bug. Raised from (https://github.com/avr-rust/rust/issues/49). llvm-svn: 302973	2017-05-13 00:22:34 +00:00
Justin Bogner	9e6beb3722	AA: Use generic intrinsics for tests instead of target specific ones Update a few tests to use llvm.masked.load/store instead of arm neon vector loads and stores, and move the tests that are actually specific to those arm intrinsics to their own files. This lets us mark the tests that use target specific intrinsics as requiring those targets. llvm-svn: 302972	2017-05-13 00:12:52 +00:00
Xinliang David Li	28a5d9c340	[PartialInlining] Profile based cost analysis Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turned on in the pipeline (in follow up). Differential Revision: http://reviews.llvm.org/D32783 llvm-svn: 302967	2017-05-12 23:41:43 +00:00
Andrew Kaylor	392b1353f7	[TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 llvm-svn: 302957	2017-05-12 22:11:26 +00:00
Andrew Kaylor	0792f83d68	[ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31788 llvm-svn: 302956	2017-05-12 22:11:20 +00:00
Andrew Kaylor	95445317bd	[TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite' as functions Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31787 llvm-svn: 302955	2017-05-12 22:11:12 +00:00
Sanjay Patel	f353bb583f	[x86] add vector tests for demanded bits; NFC llvm-svn: 302949	2017-05-12 20:53:48 +00:00
Changpeng Fang	c5587a9cbd	AMDGPU/SI: Don't promote to vector if the load/store is volatile. Summary: We should not change volatile loads/stores in promoting alloca to vector. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D33107 llvm-svn: 302943	2017-05-12 20:31:12 +00:00
Simon Pilgrim	86a30cc8d4	[NVPTX] Don't flag StoreRetVal memory chain operands as ReadMem (PR32146) This fixes 47 of the 75 NVPTX '-verify-machineinstrs with EXPENSIVE_CHECKS' errors in PR32146. Differential Revision: https://reviews.llvm.org/D33147 llvm-svn: 302942	2017-05-12 19:56:43 +00:00
Dehao Chen	d7d29ebf8d	Add LiveRangeShrink pass to shrink live range within BB. Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB. Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb Reviewed By: MatzeB, andreadb Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D32563 llvm-svn: 302938	2017-05-12 19:29:27 +00:00
Tom Stellard	31c4d52ea9	AMDGPU: Add lit.local.cfg to disable global-isel tests when global-isel is disabled This should fix bots broken by r302919. llvm-svn: 302928	2017-05-12 17:59:30 +00:00
Reid Kleckner	ca8b6180fe	[codeview] Fix assertion failure introduced in r295354 refactoring CodeViewDebug sets Asm to nullptr to disable debug info generation. You can get a .ll file like no-cus.ll from 'clang -gcodeview -g0', which happens in the ubsan test suite. llvm-svn: 302923	2017-05-12 17:02:40 +00:00
Tom Stellard	e65dcab676	AMDGPU/GlobalISel: Mark 32-bit integer constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33115 llvm-svn: 302919	2017-05-12 16:46:46 +00:00
James Y Knight	9afced58c2	[SPARC] Support 'f' and 'e' inline asm constraints. Based on patch by Patrick Boettcher and Chris Dewhurst. Differential Revision: https://reviews.llvm.org/D29116 llvm-svn: 302911	2017-05-12 15:59:10 +00:00
Sanjay Patel	4f6e2e4397	[x86] add tests for potential vector narrowing optimization (PR32790) llvm-svn: 302910	2017-05-12 15:56:39 +00:00
Davide Italiano	c3a517dde1	[LoopUnroll] Fix a test. REQUIRE should be REQUIRES. Found by inspection. llvm-svn: 302909	2017-05-12 15:30:58 +00:00
Davide Italiano	4d6c0b489d	[NewGVN] Don't incorrectly reset the memory leader. This code was missing a check for stores, so we were thinking the congruency class didn't have any memory members, and reset the memory leader. Differential Revision: https://reviews.llvm.org/D33056 llvm-svn: 302905	2017-05-12 15:22:45 +00:00
Serguei Katkov	4f8d50be22	[BPI] Ignore remainder while distributing the remaining probability from unreachable This is a follow up patch for https://reviews.llvm.org/rL300440 to address a comment. To make implementation to be consistent with other cases we just ignore the remainder after distribution of remaining probability between reachable edges. If we reduced the probability of some edges coming to unreachable blocks we should distribute the remaining part across other edges coming to reachable blocks to satisfy the condition that sum of all probabilities should be equal to one. If this remaining part is not divided by number of "reachable" edges then we get this remainder. This remainder probability should be pretty small. Other cases just ignore if the sum of probabilities is not equal to one so we do the same. Reviewers: chandlerc, sanjoy, vsk, junbuml, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D32124 llvm-svn: 302883	2017-05-12 07:50:06 +00:00
Jonas Paulsson	9727d7a8ef	Handle a COPY with undef source operand in LowerCopy() Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy() with an undef source operand. It is not possible for the target to handle this, as this flag is not passed to TII->copyPhysReg(). This patch solves this by treating such a COPY as an identity COPY. Review: Matthias Braun https://reviews.llvm.org/D32892 llvm-svn: 302877	2017-05-12 06:32:03 +00:00
Mikael Holmen	3048bfca17	[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle Summary: Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot always be trusted) at the end to fixup the CFG we keep the CFG updated as we go along and remove or add branches and merge blocks. This way we won't have any problems if the involved MBBs contain unanalyzable instructions. This fixes PR32721. In that case we had a triangle EBB | \ | | | TBB | / FBB where FBB didn't have any successors at all since it ended with an unconditional return. Then TBB and FBB were merged into EBB, but EBB would still keep its successors, and the use of analyzeBranch and CorrectExtraCFGEdges wouldn't help to remove them since the return instruction is not analyzable (at least not on ARM). Reviewers: kparzysz, iteratee, MatzeB Reviewed By: iteratee Subscribers: aemerson, rengolin, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33037 llvm-svn: 302876	2017-05-12 06:28:58 +00:00
Chandler Carruth	a360b737fa	[PM/Unswitch] Teach the new simple loop unswitch to handle loop invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867	2017-05-12 02:19:59 +00:00
David Blaikie	ea27535943	DWARF: Avoid cross-CU references under Fission Turns out that the Fission/Split DWARF package format (DWP) is currently insufficient to handle cross-CU (ref_addr) references. So for now, duplicate any debug info needed in these situations: * inlined_subroutine's abstract_origin * inlined variable's abstract_origin * types Keep the ref_addr behavior in general, including in the split DWARF inline debug info that can be emitted into the object files for online symbolication. Keep a flag to use the old (ref_addr) behavior for testing ways of addressing this limitation in the DWP tool (& for those not using DWP packaging). llvm-svn: 302858	2017-05-12 01:13:45 +00:00
Dehao Chen	228587901e	Change sample profile writer to make it deterministic. Summary: This patch changes the function profile output order to be deterministic. In order to make it easier to understand, hottest functions (with most total samples) are ordered first. Reviewers: dnovillo, davidxl Reviewed By: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33111 llvm-svn: 302851	2017-05-11 23:43:44 +00:00
Teresa Johnson	81aa3660f7	Restrict call metadata based hotness detection to Sample PGO mode Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844	2017-05-11 23:18:05 +00:00
Guozhi Wei	37cf363f24	[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834	2017-05-11 22:17:35 +00:00
Easwaran Raman	7fb8c288b9	Decrease inlinecold-threshold to 45 I ran the test-suite (including SPEC 2006) in PGO mode comparing cold thresholds of 225 and 45. Here are some stats on the text size: Out of 904 tests that ran, 197 see a change in text size. The average text size reduction (of all the 904 binaries) is 1.07%. Of the 197 binaries, 19 see a text size increase, as high as 18%, but most of them are small single source benchmarks. There are 3 multisource benchmarks with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On the other side of the spectrum, 31 benchmarks see >10% size reduction and 6 of them are MultiSource. I haven't run the test-suite with other values of inlinecold-threshold. Since we have a cold callsite threshold of 45, I picked this value. Differential revision: https://reviews.llvm.org/D33106 llvm-svn: 302829	2017-05-11 21:36:28 +00:00
Chad Rosier	4962115cec	[AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD. Differential Revision: http://reviews.llvm.org/D33101. llvm-svn: 302822	2017-05-11 20:07:24 +00:00
Vadzim Dambrouski	80fb90f30d	[MSP430] Generate EABI-compliant libcalls Updates the MSP430 target to generate EABI-compatible libcall names. As a byproduct, adjusts the hardware multiplier options available in the MSP430 target, adds support for promotion of the ISD::MUL operation for 8-bit integers, and correctly marks R11 as used by call instructions. Patch by Andrew Wygle. Differential Revision: https://reviews.llvm.org/D32676 llvm-svn: 302820	2017-05-11 19:56:14 +00:00
Matt Arsenault	4f44d0e3e0	AMDGPU: Remove tfe bit from flat instruction definitions We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. llvm-svn: 302814	2017-05-11 17:38:33 +00:00
Matt Arsenault	fd9981d4ab	AMDGPU: Pull fneg out of extract_vector_elt This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers. llvm-svn: 302813	2017-05-11 17:26:25 +00:00
Adam Nemet	73703d12a4	[SLP] Emit optimization remarks The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense how far the tree is spanning I've included the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811	2017-05-11 17:06:17 +00:00
Nemanja Ivanovic	58096023de	[PowerPC] Eliminate integer compare instructions - vol. 1 This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810	2017-05-11 16:54:23 +00:00
Simon Pilgrim	8ee014b2e7	[X86][AVX] Added zeroall/zeroupper scheduler tests Missing on SandyBridge and Btver2 models llvm-svn: 302804	2017-05-11 15:02:49 +00:00
Javed Absar	a6a50d93e8	[IR] Allow attributes with global variables This patch extends llvm-ir to allow attributes to be set on global variables. An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053100.html A key part of that proposal was to extend LLVM-IR to carry attributes on global variables. This generic feature could be useful for multiple purposes. In our present context, it would be useful to carry user specified sections for bss/rodata/data. Reviewed by: Jonathan Roelofs, Reid Kleckner Differential Revision: https://reviews.llvm.org/D32009 llvm-svn: 302794	2017-05-11 12:28:08 +00:00
Alexander Potapenko	d099c88f0a	[msan] Fix PR32842 It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless. This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero). Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0. This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact. For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code: orq %rcx, %rax jne .LBB0_6 , instead of: orl %ecx, %eax testb $1, %al jne .LBB0_6 llvm-svn: 302787	2017-05-11 11:07:48 +00:00
Chandler Carruth	7ba7a0f05a	[x86] Fix a failure to select with AVX-512 when the type legalizer manages to form a VSELECT with a non-i1 element type condition. Those are technically allowed in SDAG (at least, the generic type legalization logic will form them and I wouldn't want to try to audit everything to preclude forming them) so we need to be able to lower them. This isn't too hard to implement. We mark VSELECT as custom so we get a chance in C++, add a fast path for i1 conditions to get directly handled by the patterns, and a fallback when we need to manually force the condition to be an i1 that uses the vptestm instruction to turn a non-mask into a mask. This, unsurprisingly, generates awful code. But it at least doesn't crash. This was actually impacting open source packages built with LLVM for AVX-512 in the wild, so quickly landing a patch that at least stops the immediate bleeding. I think I've found where to fix the codegen quality issue, but less confident of that change so separating it out from the thing that doesn't change the result of any existing test case but causes mine to not crash. llvm-svn: 302785	2017-05-11 10:52:16 +00:00
Diana Picus	aa41f21260	[ARM][GlobalISel] Legalize narrow scalar ops by widening This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB and MUL to 32 bits since we only have TableGen patterns for 32 bits. See the commit message for r292827 for more details. At this point we could just remove some of the tests for regbankselect and instruction-select, since we're not going to see any narrow operations at those levels anymore. Instead I decided to update them with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences generated by the legalizer. llvm-svn: 302782	2017-05-11 09:45:57 +00:00
Diana Picus	97796d8b55	[ARM][GlobalISel] Support for G_ANYEXT G_ANYEXT can be introduced by the legalizer when widening scalars. Add support for it in the register bank info (same mapping as everything else) and in the instruction selector. When selecting it, we treat it as a COPY, just like G_TRUNC. On this occasion we get rid of some assertions in selectCopy so we can reuse it. This shouldn't be a problem at the moment since we're not supporting any complicated cases (e.g. FPR, different register banks). We might want to separate the paths when we do. llvm-svn: 302778	2017-05-11 08:28:31 +00:00
Igor Breger	d4690d9a2f	[GlobalISel][X86] G_ICMP support. Summary: support G_ICMP for scalar types i8/i16/i64. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits, krytarowski Differential Revision: https://reviews.llvm.org/D32995 llvm-svn: 302774	2017-05-11 07:17:40 +00:00
David L. Jones	3e3254804c	Revert "[SDAG] Relax conditions under stores of loaded values can be merged" This reverts r302712. The change fails with ASAN enabled: ERROR: AddressSanitizer: use-after-poison on address ... at ... READ of size 2 at ... thread T0 #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42 #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3 #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17 #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7 Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33081 llvm-svn: 302746	2017-05-10 23:56:21 +00:00
Sanjay Patel	c4d21ee61b	[InstCombine] remove fold that swaps xor/or with constants; NFCI // (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2) This canonicalization was added at: https://reviews.llvm.org/rL7264 By moving xors out/down, we can more easily combine constants. I'm adding tests that do not change with this patch, so we can verify that those kinds of transforms are still happening. This is no-functional-change-intended because there's a later fold: // (X^C)|Y -> (X|Y)^C iff Y&C == 0 ...and demanded-bits appears to guarantee that any fold that would have hit the fold we're removing here would be caught by that 2nd fold. Similar reasoning was used in: https://reviews.llvm.org/rL299384 The larger motivation for removing this code is that it could interfere with the fix for PR32706: https://bugs.llvm.org/show_bug.cgi?id=32706 Ie, we're not checking if the 'xor' is actually a 'not', so we could reverse a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'. Differential Revision: https://reviews.llvm.org/D33050 llvm-svn: 302733	2017-05-10 21:33:55 +00:00
Matt Arsenault	46f6718d8f	AMDGPU: Make some packed shuffles free VOP3P instructions can encode access to either half of the register. llvm-svn: 302730	2017-05-10 21:29:33 +00:00
Nirav Dave	0603cde0c8	[SDAG] Relax conditions under stores of loaded values can be merged Summary: Allow consecutive stores whose values come from consecutive loads to be merged in the presence of other uses of the loads. Previously this was disallowed as in general the merged load cannot be shared with the other uses. Merging N stores into 1 may cause as many as N redundant loads. However in the context of caching this should have negligible effect on memory pressure and reduce instruction count making it almost always a win. Fixes PR32086. Reviewers: spatel, jyknight, andreadb, hfinkel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30471 llvm-svn: 302712	2017-05-10 19:53:41 +00:00
Sanjay Patel	9a17eb1902	[InstSimplify, InstCombine] move 'or' simplification tests; NFC Surprisingly, I don't think these are redundant for InstSimplify. They were just misplaced as InstCombine tests. llvm-svn: 302684	2017-05-10 15:57:47 +00:00
Simon Pilgrim	3c3c9c0eef	[X86][SSE] Check vec_set BUILD_VECTOR tests on both 32 and 64-bit targets llvm-svn: 302683	2017-05-10 15:52:59 +00:00
Quentin Colombet	625bc4b0e8	[AArch64][RegisterBankInfo] Change the default mapping of fp stores. For stores, check if the stored value is defined by a floating point instruction and if yes, we return a default mapping with FPR instead of GPR. llvm-svn: 302679	2017-05-10 15:19:41 +00:00
Amara Emerson	08ca9bd16b	[AArch64] Enable use of reduction intrinsics. The new experimental reduction intrinsics can now be used, so I'm enabling this for AArch64. We will need this for SVE anyway, so it makes sense to do this for NEON reductions as well. The existing code to match shufflevector patterns is replaced with a direct lowering of the reductions to AArch64-specific nodes. Tests updated with the new, simpler, representation. Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 302678	2017-05-10 15:15:38 +00:00
Sanjay Patel	d548d9e52a	[InstCombine] remove redundant tests The first test in this file is duplicated exactly in and.ll -> test33. We have commuted and vector variants there too. The second test is a composite of 2 folds. The first fold is tested independently in add.ll -> flip_and_mask (including vector variant). After that transform fires, the IR is identical to the first transform. llvm-svn: 302676	2017-05-10 14:54:49 +00:00
Sanjay Patel	984fd017a0	[InstCombine] fix auto-generated FileCheck-captured variable refs The script at utils/update_test_checks.py has (had?) a bug when variables start with the same sequence of letters (clearly, not all of the time). llvm-svn: 302674	2017-05-10 14:40:04 +00:00
Sanjay Patel	cc782735dc	[InstCombine] fix typo in test comment; NFC llvm-svn: 302669	2017-05-10 14:25:23 +00:00
Ulrich Weigand	bb3bddd00e	[SystemZ] Add miscellaneous instructions This adds a few missing instructions for the assembler and disassembler. Those should be the last missing general- purpose (Chapter 7) instructions for the z10 ISA. llvm-svn: 302667	2017-05-10 14:20:15 +00:00
Ulrich Weigand	1ef7c92f6f	[SystemZ] Add missing arithmetic instructions This adds the remaining general arithmetic instructions for assembler / disassembler use. Most of these are not useful for codegen; a few might be, and those are listed in the README.txt for future improvements. llvm-svn: 302665	2017-05-10 14:18:47 +00:00
Sam Clegg	aaf3055813	[llvm-readobj] Improve errors on invalid binary The previous code was discarding the error message from createBinary() by calling errorToErrorCode(). This meant that such errors were always reported unhelpfully as "Invalid data was encountered while parsing the file". Other tools such as llvm-objdump already produce a more helpful error message in this case. Differential Revision: https://reviews.llvm.org/D32985 llvm-svn: 302664	2017-05-10 14:18:11 +00:00
Sanjay Patel	28b1842d00	[InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1 This is another step towards favoring 'not' ops over random 'xor' in IR: https://bugs.llvm.org/show_bug.cgi?id=32706 This transformation may have occurred in longer IR sequences using computeKnownBits, but that could be much more expensive to calculate. As the scalar result shows, we do not currently favor 'not' in all cases. The 'not' created by the transform is transformed again (unnecessarily). Vectors don't have this problem because vectors are (wrongly) excluded from several other combines. llvm-svn: 302659	2017-05-10 13:56:52 +00:00
Michael Zuckerman	0456ea13b1	[LLVM][inline-asm] Altmacro string escape character '!' This patch is the fourth patch in a series of reviews for the Altmacro feature. This patch introduces a new escape character '!' and it depends on D32701. According to https://sourceware.org/binutils/docs/as/Altmacro.html: "single-character string escape To include any single character literally in a string (even if the character would otherwise have some special meaning), you can prefix the character with `!' (an exclamation mark). For example, you can write `<4.3 !> 5.4!!>' to get the literal text `4.3 > 5.4!'. " Differential Revision: https://reviews.llvm.org/D32792 llvm-svn: 302652	2017-05-10 13:08:11 +00:00
Mikael Holmen	0aa88ec197	[IfConversion] Add missing check in IfConversion/canFallThroughTo Summary: When trying to figure out if MBB could fallthrough to ToMBB (possibly by falling through a bunch of other MBBs) we didn't actually check if there was fallthrough between the last two blocks in the chain. Reviewers: kparzysz, iteratee, MatzeB Reviewed By: kparzysz, iteratee Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32996 llvm-svn: 302650	2017-05-10 13:06:13 +00:00
Jonas Paulsson	d401c2a29b	[SystemZ] Implement getRepRegClassFor() This method must return a valid register class, or the list-ilp isel scheduler will crash. For MVT::Untyped nullptr was previously returned, but now ADDR128BitRegClass is returned instead. This is needed just as long as list-ilp (and probably also list-hybrid) is still there. Review: Ulrich Weigand, A Trick https://reviews.llvm.org/D32802 llvm-svn: 302649	2017-05-10 13:03:25 +00:00
Dmitry Preobrazhensky	299fc6910a	[AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D32913 llvm-svn: 302648	2017-05-10 13:00:28 +00:00
Igor Breger	6263596e23	[GlobalISel][X86] Split test file. NFC llvm-svn: 302647	2017-05-10 12:58:31 +00:00
Ulrich Weigand	55d6a88557	[SystemZ] Add decimal integer instructions This adds the set of decimal integer (BCD) instructions for assembler / disassembler use. llvm-svn: 302646	2017-05-10 12:42:45 +00:00
Ulrich Weigand	250144573e	[SystemZ] Add crypto instructions This adds the set of message-security assist instructions for assembler / disassembler use. llvm-svn: 302645	2017-05-10 12:42:00 +00:00
Ulrich Weigand	f0971128ba	[SystemZ] Add translate/convert instructions This adds the set of character-set translate and convert instructions for assembler / disassembler use. llvm-svn: 302644	2017-05-10 12:41:12 +00:00
Ulrich Weigand	7c859f5897	[SystemZ] Add missing memory/string instructions This adds a number of missing memory and string instructions for assembler / disassembler use. llvm-svn: 302643	2017-05-10 12:40:15 +00:00
Ulrich Weigand	934386e031	[SystemZ] Reformat assembler/disassembler tests The assembler and disassembler test cases started out formatted and sorted in a particular way, but this got lost over time as patches were added. Reformat them again. NFC. llvm-svn: 302642	2017-05-10 12:39:11 +00:00
Simon Pilgrim	ac3e69a0aa	[DAGCombiner] Add vector support to fold (shl/srl 0, x) -> 0 llvm-svn: 302641	2017-05-10 12:34:27 +00:00
Chandler Carruth	20b358db9d	Revert r301950: SpeculativeExecution: Stop using whitelist for costs This pass doesn't correctly handle testing for when it is legal to hoist arbitrary instructions. The whitelist happens to make it safe, so before it is removed the pass's legality checks will need to be enhanced. Details have been added to the code review thread for the patch. llvm-svn: 302640	2017-05-10 12:30:07 +00:00
Amara Emerson	668fbd4cf5	Add a late IR expansion pass for the experimental reduction intrinsics. This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631	2017-05-10 09:42:49 +00:00
Igor Breger	7350527066	[GlobalISel][X86] G_ZEXT i1 to i32/i64 support. Summary: Support G_ZEXT i1 to i32/i64 instruction selection. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D32965 llvm-svn: 302623	2017-05-10 06:52:58 +00:00
Ahmed Bougacha	d7292f3867	[CodeGen] Don't require AA in TwoAddress at -O0. This is a follow-up to r302611, which moved an -O0 computation of DT from SDAGISel to TwoAddress. Don't use it here either, and avoid computing it completely. The only use was forwarding the analysis as an optional argument to utility functions. Differential Revision: https://reviews.llvm.org/D32766 llvm-svn: 302612	2017-05-10 00:56:00 +00:00
Ahmed Bougacha	d74b69039b	[CodeGen] Don't require AA in SDAGISel at -O0. Before r247167, the pass manager builder controlled which AA implementations were used, exporting them all in the AliasAnalysis analysis group. Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA implementations if made available in the pass pipeline. But regardless, SDAGISel is required at O0, and really doesn't need to be doing fancy optimizations based on useful AA results. Don't require AA at CodeGenOpt::None, and only use it otherwise. This does have a functional impact (and one testcase is pessimized because we can't reuse a load). But I think that's desirable no matter what. Note that this alone doesn't result in less DT computations: TwoAddress was previously able to reuse the DT we computed for SDAG. That will be fixed separately. Differential Revision: https://reviews.llvm.org/D32766 llvm-svn: 302611	2017-05-10 00:39:30 +00:00
Ahmed Bougacha	8c31c46df1	[CodeGen] Compute DT/LI lazily in SafeStackLegacyPass. NFC. We currently require SCEV, which requires DT/LI. Those are expensive to compute, but the pass only runs for functions that have the safestack attribute. Compute DT/LI to build SCEV lazily, only when the pass is actually going to transform the function. Differential Revision: https://reviews.llvm.org/D31302 llvm-svn: 302610	2017-05-10 00:39:25 +00:00
Ahmed Bougacha	c5a347f087	[CodeGen] Add an -O0 backend pipeline test. NFC. This should hopefully make changes to the O0 pipeline obvious; it's easy to require expensive passes, and this helps make informed decisions. Case in point: in the few weeks separating the time when I initially wrote this patch from the time when I committed it, the test regressed as r302103 added another use of DT! llvm-svn: 302608	2017-05-10 00:39:17 +00:00
Sam Clegg	19dcd69096	[WebAssembly] Improve libObject support for wasm imports and exports Previously we had only supported the importing and exporting of functions and globals. Also, add useful overload of getWasmSymbol() and getNumberOfSymbols() in support of lld port. Differential Revision: https://reviews.llvm.org/D33011 llvm-svn: 302601	2017-05-09 23:48:41 +00:00
Sanjay Patel	bba8bf8184	[InstCombine] add tests for andn; NFC llvm-svn: 302599	2017-05-09 23:40:13 +00:00
Keno Fischer	64e8b703ce	[GVN] Fix a crash on encountering non-integral pointers Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587	2017-05-09 21:07:20 +00:00
Sanjay Patel	9807e9af6a	[InstCombine] update test file to use FileCheck; NFC llvm-svn: 302585	2017-05-09 20:46:12 +00:00
Zvi Rackover	8253c91a9b	DAGCombine: Combine shuffles of splat-shuffles Summary: Reapply r299047, but this time handle correctly splat-masks with undef elements. Reviewers: spatel, RKSimon, eli.friedman, andreadb Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31961 llvm-svn: 302583	2017-05-09 20:25:38 +00:00
Matthew Simpson	383f076bfc	[AArch64] Consider widening instructions in cost calculations The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582	2017-05-09 20:18:12 +00:00
Reid Kleckner	1b8d831812	[codeview] Check for a DIExpression offset for local variables Fixes inalloca parameters, which previously all pointed to the same offset. Extend the test to use llvm-readobj so that we can test the offset in a readable way. llvm-svn: 302578	2017-05-09 19:59:29 +00:00
Adrian Prantl	05ee77f883	Make it illegal for two Functions to point to the same DISubprogram As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 This reapplies r302469 with a fix for a bot failure (reparentDebugInfo now checks for the case the orig and new function are identical). llvm-svn: 302576	2017-05-09 19:47:37 +00:00
Wolfgang Pieb	9b56d2f6b8	[DWARF] Fix a parsing issue with type unit headers. Reviewers: dblaikie Differential Revision: https://reviews.llvm.org/D32987 llvm-svn: 302574	2017-05-09 19:38:38 +00:00
Jacques Pienaar	d55c78bcc5	[lanai] Add computeKnownBitsForTargetNode for Lanai. Summary: computeKnownBitsForTargetNode was not defined for Lanai which resulted in additional AND's with 0x1 for the output of SETCC instructions. Reviewers: eliben, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29605 llvm-svn: 302568	2017-05-09 18:35:26 +00:00
Sam Clegg	c698370461	[WebAssembly] Fix validation of start function The check for valid start function was inverted. Added a new test in test/Object to check this case and fixed the existing tests for ObjectYAML. Differential Revision: https://reviews.llvm.org/D32986 llvm-svn: 302560	2017-05-09 17:51:38 +00:00
Davide Italiano	334449f848	[NewGVN] Fix a consistent order for phi nodes operands. The way we currently define congruency for two PHIExpression(s) is: 1) The operands to the phi functions are congruent 2) The PHIs are defined in the same BasicBlock. NewGVN works under the assumption that phi operands are in predecessor order, or at least in some consistent order. OTOH, the following is valid IR: patatino: %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ] %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ] br label %end and the in-memory representations of the two SSA registers have an inconsistent order. This violation of NewGVN assumptions results in two PHIs being found congruent when they're not. While we think it's useful to always have a consistent order enforced, let's fix this in NewGVN by sorting uses in predecessor order before creating a PHI expression. Differential Revision: https://reviews.llvm.org/D32990 llvm-svn: 302552	2017-05-09 16:58:28 +00:00
Craig Topper	4cf65193e1	[X86] Add more patterns for BZHI isel This patch adds more patterns that a reasonable person might write that can be compiled to BZHI. This adds support for (~0U >> (32 - b)) & a; and a << (32 - b) >> (32 - b); This was inspired by the code in APInt::clearUnusedBits. This can pass an index of 32 to the bzhi instruction which a quick test of Haswell hardware shows will not mask any bits. Though the description text in the Intel manual says the "index is saturated to OperandSize-1". The pseudocode in the same manual indicates no bits will be zeroed for this case. I think this is still missing cases where the subtract portion is an 8-bit operation. Differential Revision: https://reviews.llvm.org/D32616 llvm-svn: 302549	2017-05-09 16:32:11 +00:00
Sanjay Patel	4f2714d90b	[InstCombineCasts] Fix checks in sext->lshr->trunc pattern. The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D32285 llvm-svn: 302548	2017-05-09 16:24:59 +00:00
Guy Blank	52106f6b42	[AVX512] Only look at lower bit in constant scalar masks For scalar masked instructions only the lower bit of the mask is relevant, so for constant masks we should either do an unmasked operation or no operation, depending on the value of the lower bit. This patch handles cases where the lower bit is '1'. Differential Revision: https://reviews.llvm.org/D32805 llvm-svn: 302546	2017-05-09 16:16:48 +00:00
Reid Kleckner	bed1389ae3	Re-land "Use the frame index side table for byval and inalloca arguments" This re-lands r302483. It was not the cause of PR32977. llvm-svn: 302544	2017-05-09 16:02:20 +00:00
Reid Kleckner	fc145824a1	Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare" This re-lands commit r302461. It was not the cause of PR32977. llvm-svn: 302543	2017-05-09 16:01:47 +00:00
Hans Wennborg	1ddac6ae37	Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram" This caused PR32977. Original commit message: > Make it illegal for two Functions to point to the same DISubprogram > > As recently discussed on llvm-dev [1], this patch makes it illegal for > two Functions to point to the same DISubprogram and updates > FunctionCloner to also clone the debug info of a function to conform > to the new requirement. To simplify the implementation it also factors > out the creation of inlineAt locations from the Inliner into a > general-purpose utility in DILocation. > > [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html > <rdar://problem/31926379> > > Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302533	2017-05-09 14:44:15 +00:00
Anna Thomas	3580c4d010	[LV] Fix insertion point for shuffle vectors in first order recurrence Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 llvm-svn: 302532	2017-05-09 14:29:33 +00:00
Guy Blank	31e1f1d978	[X86][AVX512] Refine some avx512er intrinsics tests. NFC. The modified tests should test the masked intrinsics. Currently the mask is constant, which with a future patch (https://reviews.llvm.org/D32805) will cause the intrinsics to be replaced with an unmasked version. This patch changes the constant mask to be a variable one. llvm-svn: 302529	2017-05-09 14:03:51 +00:00
Serge Pavlov	b8ce9ec478	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Simon Dardis	76a4991023	Revert "[MIPS] Add support to match more patterns for DINS instruction" This reverts commit rL302512. This broke the mips buildbots. llvm-svn: 302526	2017-05-09 13:18:48 +00:00
Simon Pilgrim	50affcce7b	[X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X) Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign instead of using the PSRAD+PSHUFD. Avoiding bitcasts this improves combines that utilize computeNumSignBits, permits memory folding and reduces pipe pressure. Although it does require a second register, given that this is a (cheap) zero register the impact is minimal. Differential Revision: https://reviews.llvm.org/D32973 llvm-svn: 302525	2017-05-09 13:14:40 +00:00
Guy Blank	03bb27e056	[X86][AVX512] Add test for masking of scalar instructions. llvm-svn: 302519	2017-05-09 12:32:48 +00:00
Nikolai Bozhenov	3789a9bfa0	[X86] Clang option -fuse-init-array has no effect when generating for MCU target Reviewers: Eugene.Zelenko, dschuff, craig.topper Reviewed By: craig.topper Subscribers: ahatanak, aaboud, DavidKreitzer, llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D32543 Patch by AndreiGrischenko <andrei.l.grischenko@intel.com> llvm-svn: 302513	2017-05-09 10:14:03 +00:00
Strahinja Petrovic	d020c9cb48	[MIPS] Add support to match more patterns for DINS instruction This patch adds support for recognizing patterns to match DINS instruction. Differential Revision: https://reviews.llvm.org/D31465 llvm-svn: 302512	2017-05-09 10:02:00 +00:00
Reid Kleckner	1a48591876	Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare" This reverts commit r302461. It appears to be causing failures compiling gtest with debug info on the Linux sanitizer bot. I was unable to reproduce the failure locally, however. llvm-svn: 302504	2017-05-09 01:57:44 +00:00
Teresa Johnson	7ff9f7abb3	Fix code section prefix for proper layout Summary: r284533 added hot and cold section prefixes based on profile information, to enable grouping of hot/cold functions at link time. However, it used "cold" as the prefix for cold sections, but gold only recognizes "unlikely" (which is used by gcc for cold sections). Therefore, cold sections were not properly being grouped. Switch to using "unlikely" Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32983 llvm-svn: 302502	2017-05-09 01:43:24 +00:00
Reid Kleckner	e98eae6da6	Revert "Use the frame index side table for byval and inalloca arguments" This reverts r302483 and its follow-up fix. llvm-svn: 302493	2017-05-09 01:14:39 +00:00
Evgeniy Stepanov	49f6da0167	Ignore !associated metadata with null argument. Fixes PR32577 (comment 10). Such metadata may legitimately appear in LTO. llvm-svn: 302485	2017-05-08 23:46:20 +00:00
Reid Kleckner	944adda3ae	Relax Dwarf filecheck test for 32-bit hosts llvm-svn: 302484	2017-05-08 23:27:52 +00:00

44986 Commits