llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Sanjay Patel	891bf2b484	[InstSimplify] vector div/rem with any zero element in divisor is undef This was suggested as a DAG simplification in the review for rL297026 : http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435253.html ...but let's start with IR since we have actual docs for IR (LangRef). Differential Revision: https://reviews.llvm.org/D30665 llvm-svn: 297390	2017-03-09 16:20:52 +00:00
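The rule in the entry above (r297390) can be illustrated with a small standalone sketch. It is not LLVM code: `FoldResult` and `foldVectorUDiv` are hypothetical names, and the `isUndef` flag merely stands in for an IR `undef` value. It assumes both operands have the same number of lanes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical result type: isUndef stands in for an IR `undef` value.
struct FoldResult {
  bool isUndef;
  std::vector<std::uint32_t> lanes;
};

// Hypothetical helper (not an LLVM API): folds a lane-wise unsigned division.
FoldResult foldVectorUDiv(const std::vector<std::uint32_t> &lhs,
                          const std::vector<std::uint32_t> &rhs) {
  if (lhs.size() != rhs.size())
    throw std::invalid_argument("operands must have the same lane count");

  // Any zero lane in the divisor makes the whole operation undefined,
  // so the entire result is reported as undef, not just that lane.
  for (std::uint32_t d : rhs)
    if (d == 0)
      return {true, {}};

  std::vector<std::uint32_t> out;
  out.reserve(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    out.push_back(lhs[i] / rhs[i]);
  return {false, out};
}
```

With a divisor of `{2, 0}` the sketch reports undef for the whole vector; with `{2, 3}` it divides lane by lane.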
Sam Parker	e2243cc305	[ARM] Remove t2xtpk feature from tests I previously removed the T2XtPk feature from the ARM backend, but it looks like I missed some of the tests that were using the feature. Differential Revision: https://reviews.llvm.org/D30778 llvm-svn: 297386	2017-03-09 15:14:32 +00:00
Sanjay Patel	9d801bd5a1	[DAG] recognize div/rem by 0 as undef before trying constant folding As discussed in the review thread for rL297026, this is actually 2 changes that would independently fix all of the test cases in the patch: 1. Return undef in FoldConstantArithmetic for div/rem by 0. 2. Move basic undef simplifications for div/rem (simplifyDivRem()) before foldBinopIntoSelect() as a matter of efficiency. I will handle the case of vectors with any zero element as a follow-up. That change is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic(). I'm deleting the test for PR30693 because it does not test for the actual bug any more (dangers of using bugpoint). Differential Revision: https://reviews.llvm.org/D30741 llvm-svn: 297384	2017-03-09 15:02:25 +00:00
Simon Pilgrim	aa5013fecf	[X86][SSE] Speed up constant pool shuffle mask decoding with direct copy (PR32037). If the constants are already the correct size, we can copy them directly into the shuffle mask. llvm-svn: 297381	2017-03-09 14:06:39 +00:00
Simon Dardis	5556ec2bac	[mips] Revert fixes for PR32020. The fix introduces segfaults and clobbers the value to be stored when the atomic sequence loops. Revert "[Target/MIPS] Kill dead code, no functional change intended." This reverts commit r296153. Revert "Recommit "[mips] Fix atomic compare and swap at O0."" This reverts commit r296134. llvm-svn: 297380	2017-03-09 14:03:26 +00:00
Simon Pilgrim	966ef0146a	Fixed typos in comments. NFCI. llvm-svn: 297379	2017-03-09 13:57:04 +00:00
Nuno Lopes	611b0a719b	fix build on Cygwin llvm-svn: 297378	2017-03-09 13:43:31 +00:00
Joey Gouly	14e9c77536	[SelectionDAG] Make SelectCode return void SelectCode has been returning nullptr since `182dac0` ("SDAG: Make SelectCodeCommon return void", 2016-05-10). Make SelectCode also return void instead, as all callers have been updated. Patch by Sven van Haastregt. Review: https://reviews.llvm.org/D30497 llvm-svn: 297377	2017-03-09 13:38:06 +00:00
Sjoerd Meijer	f7f11a5353	[ARM] remove FIXMEs and add vcmp MC test Minor cleanup in ARMInstrVFP.td: removed some FIXMEs and added a MC test for vcmp that was actually missing. Differential Revision: https://reviews.llvm.org/D30745 llvm-svn: 297376	2017-03-09 13:28:37 +00:00
Chandler Carruth	ef741a9d22	[PM/Inliner] Make the new PM's inliner process call edges across an entire SCC before iterating on newly-introduced call edges resulting from any inlined function bodies. This more closely matches the behavior of the old PM's inliner. While it wasn't really clear to me initially, this behavior is actually essential to the inliner behaving reasonably in its current design. Because the inliner is fundamentally a bottom-up inliner and all of its cost modeling is designed around that, it often runs into trouble within an SCC where we don't have any meaningful bottom-up ordering to use. In addition to potentially cyclic, infinite inlining that we block with the inline history mechanism, it can also take seemingly simple call graph patterns within an SCC and turn them into insanely large functions by accidentally working top-down across the SCC without any of the threshold limitations that traditional top-down inliners use. Consider this diabolical monster.cpp file that Richard Smith came up with to help demonstrate this issue:
```
template <int N> extern const char *str;

void g(const char *);

template <bool K, int N> void f(bool *B, bool *E) {
  if (K)
    g(str<N>);
  if (B == E)
    return;
  if (*B)
    f<true, N + 1>(B + 1, E);
  else
    f<false, N + 1>(B + 1, E);
}
template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); }
template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); }

extern bool *arr, *end;

void test() { f<false, 0>(arr, end); }
```
When compiled with '-DMAX=N' for various values of N, this will create an SCC with a reasonably large number of functions. Previously, the inliner would try to exhaust the inlining candidates in a single function before moving on. This, unfortunately, turns it into a top-down inliner within the SCC. Because our thresholds were never built for that, we will incrementally decide that it is always worth inlining and proceed to flatten the entire SCC into that one function. 
What's worse, we'll then proceed to the next function, and do the exact same thing except we'll skip the first function, and so on. And at each step, we'll also make some of the constant factors larger, which is awesome. The fix in this patch is the obvious one which makes the new PM's inliner use the same technique used by the old PM: consider all the call edges across the entire SCC before beginning to process call edges introduced by inlining. The result of this is essentially to distribute the inlining across the SCC so that every function incrementally grows toward the inline thresholds rather than allowing the inliner to grow one of the functions vastly beyond the threshold. The code for this is a bit awkward, but it works out OK. We could consider in the future doing something more powerful here such as prioritized order (via lowest cost and/or profile info) and/or a code-growth budget per SCC. However, both of those would require really substantial work both to design the system in a way that wouldn't break really useful abstraction decomposition properties of the current inliner and to be tuned across a reasonably diverse set of code and workloads. It also seems really risky in many ways. I have only found a single real-world file that triggers the bad behavior here and it is generated code that has a pretty pathological pattern. I'm not worried about the inliner not doing an awesome job here as long as it does ok. On the other hand, the cases that will be tricky to get right in a prioritized scheme with a budget will be more common and idiomatic for at least some frontends (C++ and Rust at least). So while these approaches are still really interesting, I'm not in a huge rush to go after them. Staying even closer to the existing PM's behavior, especially when this is easy to do, seems like the right short to medium term approach. I don't really have a test case that makes sense yet... 
I'll try to find a variant of the IR produced by the monster template metaprogram that is both small enough to be sane and large enough to clearly show when we get this wrong in the future. But I'm not confident this exists. And the behavior change here should be unobservable without snooping on debug logging. So there isn't really much to test. The test case updates come from two incidental changes: 1) We now visit functions in an SCC in the opposite order. I don't think there really is a "right" order here, so I just update the test cases. 2) We no longer compute some analyses when an SCC has no call instructions that we consider for inlining. llvm-svn: 297374	2017-03-09 11:35:40 +00:00
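The ordering described in the entry above (r297374) can be sketched as a toy worklist. This is not the inliner's code: `Call`, `CallGraph`, and `sccWideVisitOrder` are hypothetical names, cost thresholds are not modeled, and a simple depth cap stands in for the inline history mechanism. The point is only that every call edge already in the SCC is queued before any edge introduced by inlining.

```cpp
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one call edge inside an SCC.
struct Call {
  std::string caller;
  std::string callee;
  int depth; // 0 for calls that existed before any inlining
};

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the order in which call edges are visited, as "caller->callee".
std::vector<std::string> sccWideVisitOrder(const CallGraph &graph,
                                           int maxDepth) {
  std::deque<Call> worklist;

  // Seed the worklist with every original call edge across the whole SCC.
  for (const auto &entry : graph)
    for (const auto &callee : entry.second)
      worklist.push_back({entry.first, callee, 0});

  std::vector<std::string> order;
  while (!worklist.empty()) {
    Call c = worklist.front();
    worklist.pop_front();
    order.push_back(c.caller + "->" + c.callee);

    // Depth cap standing in for the inline history mechanism.
    if (c.depth >= maxDepth)
      continue;

    // "Inlining" copies the callee's calls into the caller. The new edges go
    // to the back of the queue, behind every edge that was already pending.
    auto it = graph.find(c.callee);
    if (it == graph.end())
      continue;
    for (const auto &next : it->second)
      worklist.push_back({c.caller, next, c.depth + 1});
  }
  return order;
}
```

For a three-function cycle `f0 -> f1 -> f2 -> f0` with `maxDepth` 1, the sketch visits the three original edges first and only then the three edges created by inlining, so no single function is grown ahead of the others.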
Simon Dardis	dcab2a544c	[mips] Fix return lowering Fix a machine verifier issue where an instruction was using an invalid register. The return pseudo is expanded and has the return address register added to it. The return register may have been spuriously marked as killed earlier. This partially resolves PR/27458. Thanks to Quentin Colombet for reporting the issue! llvm-svn: 297372	2017-03-09 11:19:48 +00:00
Adam Nemet	ecad7d67fb	[SSP] In opt remarks, stream Function directly With this, it shows up as an attribute in YAML and non-printable characters are properly removed by GlobalValue::getRealLinkageName. llvm-svn: 297362	2017-03-09 06:10:27 +00:00
Adam Nemet	d57520ae79	[SLP] Mark values in Dot that need to be extracted llvm-svn: 297361	2017-03-09 05:48:03 +00:00
Matt Arsenault	2d30bf1fc6	DAG: Check no signed zeros instead of unsafe math attribute llvm-svn: 297354	2017-03-09 01:36:39 +00:00
Peter Collingbourne	ec2596eea1	WholeProgramDevirt: Implement importing for uniform ret val opt. Differential Revision: https://reviews.llvm.org/D29854 llvm-svn: 297350	2017-03-09 01:11:15 +00:00
Konstantin Zhuravlyov	faffafd5fa	AMDGPU: add missing lit.local.cfg to test/DebugInfo/AMDGPU llvm-svn: 297334	2017-03-09 00:21:36 +00:00
Peter Collingbourne	8ab4ca6dbf	WholeProgramDevirt: Implement importing for single-impl devirtualization. Differential Revision: https://reviews.llvm.org/D29844 llvm-svn: 297333	2017-03-09 00:21:25 +00:00
Teresa Johnson	bce1b9a58d	Perform symbol binding for .symver versioned symbols Summary: In a .symver assembler directive like: .symver name, name2@@nodename "name2@@nodename" should get the same symbol binding as "name". While the ELF object writer is updating the symbol binding for .symver aliases before emitting the object file, not doing so when the module inline assembly is handled by the RecordStreamer is causing the wrong behavior in LTO mode. E.g. when "name" is global, "name2@@nodename" must also be marked as global. Otherwise, the symbol is skipped when iterating over the LTO InputFile symbols (InputFile::Symbol::shouldSkip). So, for example, when performing any LTO via the gold-plugin, the versioned symbol definition is not recorded by the plugin and passed back to the linker. If the object was in an archive, and there were no other symbols needed from that object, the object would not be included in the final link and references to the versioned symbol are undefined. The llvm-lto2 tests added will give an error about an unused symbol resolution without the fix. Reviewers: rafael, pcc Reviewed By: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D30485 llvm-svn: 297332	2017-03-09 00:19:49 +00:00
Changpeng Fang	31a069303d	AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D30719 llvm-svn: 297328	2017-03-09 00:07:00 +00:00
Evgeniy Stepanov	4759b38338	Don't merge global constants with non-dbg metadata. !type metadata can not be dropped. An alternative to this is adding !type metadata from the replaced globals to the replacement, but that may weaken type tests and make them slower at the same time. The merged global gets !dbg metadata from replaced globals, and can end up with multiple debug locations. llvm-svn: 297327	2017-03-09 00:03:37 +00:00
Konstantin Zhuravlyov	086cf9de9d	[DebugInfo] Emit address space with DW_AT_address_class attribute for pointer and reference types Differential Revision: https://reviews.llvm.org/D29670 llvm-svn: 297320	2017-03-08 23:55:44 +00:00
Jessica Paquette	a6024d5e77	[Outliner] Fix memory leak in suffix tree. This commit changes the BumpPtrAllocator for suffix tree nodes to a SpecificBumpPtrAllocator. Before, node construction was leaking memory because of the DenseMap in SuffixTreeNodes. Changing this to a SpecificBumpPtrAllocator allows this memory to properly be released. llvm-svn: 297319	2017-03-08 23:55:33 +00:00
Javed Absar	0a1e320a6e	[ConstantFold] Fix defect in constant folding computation for GEP When the array indexes are all determined by GVN to be constants, a call is made to constant-folding to optimize/simplify the address computation. The constant-folding, however, makes a mistake in that it sometimes reads back stale Idxs instead of NewIdxs, that it re-computed in previous iteration. This leads to incorrect addresses coming out of constant-folding to GEP. A test case is included. The error is only triggered when indexes have particular patterns that the stale/new index updates interplay matters. Reviewers: Daniel Berlin Differential Revision: https://reviews.llvm.org/D30642 llvm-svn: 297317	2017-03-08 23:01:50 +00:00
Zachary Turner	7dad21d20f	[Support] Add llvm::sys::fs::remove_directories. We already have a function create_directories() which can create an entire tree, and remove() which can remove an empty directory, but we do not have remove_directories() which can remove an entire tree. This patch adds such a function. Because removing a directory tree can have dangerous consequences when the tree contains a directory symlink, the patch here updates the existing directory_iterator construct to optionally not follow symlinks (previously it would always follow symlinks). The delete algorithm uses this flag so that for symlinks, only the links are removed, and not the targets. On Windows this is implemented with SHFileOperation, which also does not recurse into symbolic links or junctions. Differential Revision: https://reviews.llvm.org/D30676 llvm-svn: 297314	2017-03-08 22:49:32 +00:00
Jordan Rose	cd4e96bd9d	Add red zones to BumpPtrAllocator under ASan To help catch buffer overruns, this patch changes BumpPtrAllocator to insert an extra unused byte between allocations when building with ASan. SpecificBumpPtrAllocator opts out of this behavior, since it needs to destroy its items later by walking the allocated memory. Reviewed by Pete Cooper. llvm-svn: 297310	2017-03-08 21:53:12 +00:00
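The red-zone idea in the entry above (r297310) can be sketched with a toy bump allocator. This is not `llvm::BumpPtrAllocator`: `ToyBumpAllocator` and its members are hypothetical, alignment is ignored, and the gap is only skipped over here, whereas under ASan it would also need to be marked as inaccessible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy allocator illustrating a gap between allocations.
class ToyBumpAllocator {
public:
  ToyBumpAllocator(std::size_t capacity, std::size_t redZone)
      : buffer_(capacity), redZone_(redZone) {}

  // Returns nullptr when the slab cannot hold the request plus its red zone.
  void *allocate(std::size_t size) {
    std::size_t remaining = buffer_.size() - offset_;
    if (size > remaining || redZone_ > remaining - size)
      return nullptr;
    void *p = buffer_.data() + offset_;
    // Bump past the allocation and the unused red-zone bytes that follow it.
    offset_ += size + redZone_;
    return p;
  }

  std::size_t used() const { return offset_; }

private:
  std::vector<unsigned char> buffer_;
  std::size_t redZone_;
  std::size_t offset_ = 0;
};
```

With a one-byte red zone, two 4-byte allocations start 5 bytes apart, so a one-byte overrun of the first lands in the gap instead of in the second object. The gap also shows why an allocator that later walks its memory as a dense array of objects, as the entry says of SpecificBumpPtrAllocator, would opt out.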
George Burgess IV	de90a6b85f	[MemCpyOpt] clang-format + trim the legacy pass. NFC. None of the declarations below `// Helper functions` seem to have definitions anymore. llvm-svn: 297309	2017-03-08 21:28:19 +00:00
Tim Northover	c488fed4a7	GlobalISel: correctly handle trivial fcmp predicates. It makes sense to only do them once in IRTranslator rather than making everyone deal with them. llvm-svn: 297304	2017-03-08 18:49:54 +00:00
Adam Nemet	c956f416cc	[SLP] Visualize SLP trees with -view-slp-tree Analyzing larger trees is extremely difficult with the current debug output so this adds GraphTraits and DOTGraphTraits on top of the VectorizableTree data structure. We can now display the SLP trees with Graphviz as in https://reviews.llvm.org/F3132765. I decorated the graph where a value needs to be gathered for one reason or another. These are the red nodes. There are other improvements I am planning to make as I work through my case here. For example, I would also like to mark nodes that need to be extracted. Differential Revision: https://reviews.llvm.org/D30731 llvm-svn: 297303	2017-03-08 18:47:50 +00:00
Matthew Simpson	9d666ee035	[LV] Select legal insert point when fixing first-order recurrences Because IRBuilder performs constant-folding, it's not guaranteed that an instruction in the original loop maps to an instruction in the vector loop. It could map to a constant vector instead. The handling of first-order recurrences was incorrectly making this assumption when setting the IRBuilder's insert point. llvm-svn: 297302	2017-03-08 18:18:20 +00:00
Volkan Keles	79375edeb2	[GlobalISel] Add default action for G_FNEG Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets. Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D30721 llvm-svn: 297301	2017-03-08 18:09:14 +00:00
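The entry above (r297301) relies on `FSUB -0.0, X` being a negation of `X`. A minimal sketch of why the constant has to be negative zero, assuming IEEE 754 doubles, the default rounding mode, and no fast-math flags; the function names are hypothetical:

```cpp
#include <cmath>

// Models `fsub -0.0, x`: matches negation for every input, including zeros.
double subFromNegZero(double x) { return -0.0 - x; }

// Models `fsub +0.0, x`: differs from negation when x is +0.0, because
// (+0.0) - (+0.0) is +0.0 while the negation of +0.0 is -0.0.
double subFromPosZero(double x) { return 0.0 - x; }
```

Checking the sign bit with `std::signbit` shows the difference: `subFromNegZero(0.0)` has its sign bit set, as a negation of `+0.0` should, while `subFromPosZero(0.0)` does not.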
Zachary Turner	28d9b5adef	Resubmit FileSystem changes. This was originally reverted due to some test failures in ModuleCache and TestCompDirSymlink. These issues have all been resolved and the code now passes all tests. Differential Revision: https://reviews.llvm.org/D30698 llvm-svn: 297300	2017-03-08 17:56:08 +00:00
Sanjay Patel	658cbdf5ad	[x86] regenerate checks; NFC This test could be reduced? The check fails for a seemingly unrelated change, so I'm adding full checks to see what is happening. llvm-svn: 297296	2017-03-08 17:19:56 +00:00
Matthew Simpson	d0f8fe1fd0	[LV] Make the test case for PR30183 less fragile This patch also renames the PR number the test points to. The previous reference was PR29559, but that bug was somehow deleted and recreated under PR30183. llvm-svn: 297295	2017-03-08 17:03:38 +00:00
Matthew Simpson	8498ddbfe8	[LV] Add missing check labels to tests and reformat llvm-svn: 297294	2017-03-08 16:55:34 +00:00
Krzysztof Parzyszek	a611af501f	[Hexagon] Use correct offset when extracting from the high word When extracting a bitfield from the high register in a register pair, the final offset should be relative to the high register (for 32-bit extracts). llvm-svn: 297288	2017-03-08 15:46:28 +00:00
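The offset adjustment in the entry above (r297288) can be illustrated outside the Hexagon backend. In this sketch a 64-bit register pair is modeled as two 32-bit words, and the field is assumed to lie entirely within one word with a width between 1 and 32 bits. `WordField`, `locateField`, and `extractFromPair` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical description of where a field sits within a register pair.
struct WordField {
  bool high;       // true if the field lives in the high 32-bit word
  unsigned offset; // bit offset relative to the selected word
};

// Precondition: offset64 < 64.
WordField locateField(unsigned offset64) {
  bool high = offset64 >= 32;
  // For the high word, the offset must be rebased to that word.
  return {high, high ? offset64 - 32 : offset64};
}

// Preconditions: 1 <= width <= 32 and the field does not straddle the words.
std::uint32_t extractFromPair(std::uint32_t lo, std::uint32_t hi,
                              unsigned offset64, unsigned width) {
  WordField f = locateField(offset64);
  std::uint32_t word = f.high ? hi : lo;
  std::uint32_t mask = width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return (word >> f.offset) & mask;
}
```

An 8-bit field at bit 40 of the pair is read from the high word at offset 8. Using 40 directly against a 32-bit word would be out of range.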
Daniel Cederman	f99e079bc7	[Sparc] Check register use with isPhysRegUsed() instead of reg_nodbg_empty() Summary: By using reg_nodbg_empty() to determine if a function can be treated as a leaf function or not, we miss the case when the register pair L0_L1 is used but not L0 by itself. This has the effect that use_all_i32_regs(), a test in reserved-regs.ll which tries to use all registers, gets treated as a leaf function. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: davide, RKSimon, sepavloff, llvm-commits Differential Revision: https://reviews.llvm.org/D27089 llvm-svn: 297285	2017-03-08 15:23:10 +00:00
Jun Bum Lim	c7dfda0d78	[JumpThread] Use AA in SimplifyPartiallyRedundantLoad() Summary: Use AA when scanning to find an available load value. Reviewers: rengolin, mcrosier, hfinkel, trentxintong, dberlin Reviewed By: rengolin, dberlin Subscribers: aemerson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D30352 llvm-svn: 297284	2017-03-08 15:22:30 +00:00
Sanjay Patel	13fa58d399	[InstCombine] avoid crashing on shuffle shrinkage when input type is not same as result type llvm-svn: 297280	2017-03-08 15:02:23 +00:00
John Brawn	803e5cc63e	[ARM] Split up lsl-zero test into two tests On Windows stderr and stdout happen to get interleaved in a way that causes the test to fail, so split it up into a test that checks for errors and a test that doesn't. llvm-svn: 297273	2017-03-08 12:49:18 +00:00
Sam Parker	c4cc86708e	[LoopRotate] Propagate dbg.value intrinsics Recommitting patch which was previously reverted in r297159. These changes should address the casting issues. The original patch enables dbg.value intrinsics to be attached to newly inserted PHI nodes. Differential Review: https://reviews.llvm.org/D30701 llvm-svn: 297269	2017-03-08 09:56:22 +00:00
Simon Pilgrim	ca0651aa18	[X86][SSE] combineX86ShufflesRecursively can handle shuffle masks up to 64 elements wide By defining the mask types as SmallVector<int, 16> we were causing a lot of unnecessary heap usage. llvm-svn: 297267	2017-03-08 09:36:39 +00:00
Jonas Hahnfeld	c9a8650f00	[Support] Remove unit test for fs::is_local rL295768 introduced this test that fails if LLVM is built and tested on an NFS share. Delete the test as discussed on the corresponding commit thread. The only feasible solution would have been to introduce environment variables and to en/disable the test conditionally. llvm-svn: 297260	2017-03-08 08:36:21 +00:00
Amjad Aboud	0533b05f81	[SLP] Fixed non-deterministic behavior in Loop Vectorizer. Differential Revision: https://reviews.llvm.org/D30638 llvm-svn: 297257	2017-03-08 05:09:10 +00:00
Tim Shen	cb40c6d037	Revert "Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size"" After inspection, it's an UB in our code base. Someone cast a var-arg function pointer to a non-var-arg one. :/ Re-commit r296771 to continue testing on the patch. Sorry for the trouble! llvm-svn: 297256	2017-03-08 02:41:35 +00:00
Zachary Turner	b21c035434	Work around an ICE on MSVC 2017. MSVC 2017 was released today, and I found one bug in the compiler which prevents a successful build of LLVM. This patch works around the bug in a fairly benign way. llvm-svn: 297255	2017-03-08 01:57:40 +00:00
Sebastian Pop	b86663486c	Handle UnreachableInst in isGuaranteedToTransferExecutionToSuccessor A block with an UnreachableInst does not transfer execution to a successor. The problem was exposed by GVN-hoist. This patch fixes bug 32153. Patch by Aditya Kumar. Differential Revision: https://reviews.llvm.org/D30667 llvm-svn: 297254	2017-03-08 01:54:50 +00:00
Davide Italiano	cab9494f2f	[SCCP] Merge markOverdefined and markAnythingOverdefined. There's no need to have two separate APIs. llvm-svn: 297253	2017-03-08 01:26:37 +00:00
Justin Lebar	571441fc3d	[NVPTX] Remove unnecessary isImageReadOnly(), isImageWriteOnly(), & isImageReadWrite() calls These calls repeat what the isImage() function in NVPTXUtilities.cpp already does. Patch by Briana Grace! Differential Revision: https://reviews.llvm.org/D30706 llvm-svn: 297252	2017-03-08 01:14:15 +00:00
Matt Arsenault	b3bd0133e4	AMDGPU: Don't wait at end of block with a trivial successor If there is only one successor, and that successor only has one predecessor the wait can obviously be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. llvm-svn: 297251	2017-03-08 01:06:58 +00:00
Eli Friedman	c2710dc724	[DAGCombine] Simplify ISD::AND in GetDemandedBits. This helps in cases involving bitfields where an AND is exposed by legalization. Differential Revision: https://reviews.llvm.org/D30472 llvm-svn: 297249	2017-03-08 00:56:35 +00:00
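One plausible reading of the entry above (r297249) is that an AND with a constant can be looked through when the constant keeps every bit the user actually demands. The sketch below shows that bit-level condition on plain integers. `andIsRedundant` is a hypothetical name and this is not the DAGCombiner code.

```cpp
#include <cstdint>

// True when (x & andMask) and x agree on every demanded bit for all x,
// i.e. the AND clears no bit that the user of the value looks at.
bool andIsRedundant(std::uint32_t andMask, std::uint32_t demanded) {
  return (andMask & demanded) == demanded;
}
```

For example, if only the low 4 bits are demanded, masking with `0xFF` first changes nothing observable, while masking with `0x03` does.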


146174 Commits