mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

146174 Commits

Author SHA1 Message Date
Oliver Stannard
0d63fd3476 [ARM] Diagnose ARM MOVT without :lower16: or :upper16: expression
This instruction was missing from the list of opcodes that we check, so we were
hitting an llvm_unreachable in ARMMCCodeEmitter.cpp for the ARM MOVT
instruction, rather than the diagnostic that is emitted for the other MOVW/MOVT
instructions.

Differential revision: https://reviews.llvm.org/D30936

llvm-svn: 297739
2017-03-14 13:50:10 +00:00
Artyom Skrobov
1dcd7b3239 De-duplicate the two implementations of ARMBaseInstrInfo::isProfitableToIfCvt() [NFC]
Reviewers: congh, rengolin

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30934

llvm-svn: 297738
2017-03-14 13:38:45 +00:00
Ayal Zaks
91ca0c753e [LV] Refactor Cost Model's selectVectorizationFactor(); NFC
Refactor the Cost Model's selectVectorizationFactor() so that it handles only the
selection of the best VF from a pre-computed range of candidate VFs, extracting the
early-exit criteria and the computation of a MaxVF upper bound into other methods,
all driven by a newly introduced LoopVectorizationPlanner.
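
As a rough standalone sketch (not the actual LoopVectorize code; the VFCost struct and the cost numbers are made up for illustration), the refactored selectVectorizationFactor() is left with a single job: pick the cheapest VF per lane from the pre-computed candidate range, with MaxVF and the early exits decided by the planner:

    #include <cstdio>
    #include <vector>

    // Hypothetical per-VF cost, normally produced by the cost model.
    struct VFCost {
      unsigned VF;  // vectorization factor (number of lanes)
      float Cost;   // estimated cost of one (vector) iteration
    };

    // Pick the VF with the lowest cost per scalar lane; VF=1 is the scalar
    // fallback. MaxVF and the early-exit checks are assumed to have been
    // handled by the caller (the planner) already.
    unsigned selectVectorizationFactor(const std::vector<VFCost> &Candidates) {
      unsigned BestVF = Candidates.front().VF;
      float BestCostPerLane = Candidates.front().Cost / Candidates.front().VF;
      for (const VFCost &C : Candidates) {
        float CostPerLane = C.Cost / C.VF;
        if (CostPerLane < BestCostPerLane) {
          BestCostPerLane = CostPerLane;
          BestVF = C.VF;
        }
      }
      return BestVF;
    }

    int main() {
      std::vector<VFCost> Candidates = {{1, 4.0f}, {2, 5.0f}, {4, 6.0f}};
      std::printf("chosen VF = %u\n", selectVectorizationFactor(Candidates));
      return 0;
    }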

Differential Revision: https://reviews.llvm.org/D30653

llvm-svn: 297737
2017-03-14 13:07:04 +00:00
Simon Pilgrim
70b25e7d4e [X86][MMX] Update FIXME comment. NFCI.
llvm-svn: 297736
2017-03-14 12:13:41 +00:00
Daniel Berlin
9a45dae8e1 Make PredIteratorCache size() logically const. Do not require copying predecessors to get size.
Summary:
Every single benchmark I can run, on large and small CFGs, fully
connected, etc., across 3 different platforms (x86, ARM, and PPC) says
that the current pred iterator cache is a losing proposition.

I can't find a case where it's faster than just walking preds, and in some cases, it's 5-10% slower.

This is due to copying the preds.
It also degrades into copying the entire CFG.

The one operation that is occasionally faster is the cached size.
This makes that operation faster by not relying on having the copies available.

I'm not even sure that is enough of a win to be worth it. I, again, have
trouble finding cases where this takes long enough in a pass to be
worth caching, compared to a million other things a pass could cache or
improve.

My suggestion:
We next remove the get() interface.
We do stronger benchmarking of size().
We probably end up killing this entire cache.
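
As a minimal standalone model (the Block and PredSizeCache types below are invented for illustration; this is not the LLVM class itself), a size-only cache can remember just the predecessor count per block, so size() stays logically const and never forces a copy of the predecessor list:

    #include <cstdio>
    #include <unordered_map>
    #include <vector>

    // Toy CFG: each block just knows its predecessors.
    struct Block {
      std::vector<Block *> Preds;
    };

    // Caches only the predecessor count, not a copy of the predecessors.
    class PredSizeCache {
      mutable std::unordered_map<const Block *, unsigned> Sizes;

    public:
      unsigned size(const Block *BB) const {
        auto It = Sizes.find(BB);
        if (It != Sizes.end())
          return It->second;
        unsigned N = static_cast<unsigned>(BB->Preds.size());
        Sizes.emplace(BB, N);
        return N;
      }
    };

    int main() {
      Block A, B, C;
      C.Preds = {&A, &B};
      PredSizeCache Cache;
      std::printf("preds of C: %u\n", Cache.size(&C)); // prints 2
      return 0;
    }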

Reviewers: chandlerc

Subscribers: aemerson, llvm-commits, trentxintong

Differential Revision: https://reviews.llvm.org/D30873

llvm-svn: 297733
2017-03-14 11:25:45 +00:00
James Henderson
821ed055a3 Test commit.
llvm-svn: 297731
2017-03-14 10:51:14 +00:00
Benjamin Kramer
def6f9d002 [CodeGen] Fix -Wreorder warning.
llvm-svn: 297729
2017-03-14 10:29:47 +00:00
Tobias Grosser
66a68536f4 Fix typos in ADCE comments
llvm-svn: 297726
2017-03-14 10:18:11 +00:00
Oliver Stannard
63381d7b41 [ValueTracking] Out of range shifts might be undef
If it is possible for the RHS of a shift operation to be greater than or equal
to the bit-width, then the result might be undef, and we can't report any known
bits.

In some cases, this was allowing a transformation in instcombine which widened
an undef value from i1 to i32, increasing the range of values that a function
could return.
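
A minimal standalone sketch of the rule (a simplified stand-in for computeKnownBits, not the real ValueTracking code): if the shift amount might be greater than or equal to the bit width, report nothing as known.

    #include <cstdint>
    #include <cstdio>

    // Simplified known-bits record for a 32-bit value.
    struct KnownBits32 {
      uint32_t Zero = 0; // bits known to be 0
      uint32_t One = 0;  // bits known to be 1
    };

    // Known bits of (Val << ShiftAmt). If the shift amount could be >= 32,
    // the result may be undef, so nothing can be reported as known.
    KnownBits32 knownBitsForShl(KnownBits32 Val, KnownBits32 ShiftAmt) {
      // Largest shift amount consistent with the known-zero bits.
      uint32_t MaxShift = ~ShiftAmt.Zero;
      if (MaxShift >= 32)
        return KnownBits32{}; // possibly out of range: no known bits
      // For simplicity, only handle a fully known shift amount here.
      if ((ShiftAmt.Zero | ShiftAmt.One) != ~0u)
        return KnownBits32{};
      uint32_t Amt = ShiftAmt.One;
      KnownBits32 Result;
      Result.Zero = (Val.Zero << Amt) | ((1u << Amt) - 1); // low bits become 0
      Result.One = Val.One << Amt;
      return Result;
    }

    int main() {
      KnownBits32 Val; // nothing known about the value
      KnownBits32 Amt;
      Amt.Zero = ~33u; // shift amount known to be exactly 33 (>= bit width)
      Amt.One = 33u;
      KnownBits32 K = knownBitsForShl(Val, Amt);
      std::printf("known zero = 0x%x, known one = 0x%x\n",
                  (unsigned)K.Zero, (unsigned)K.One); // both 0x0
      return 0;
    }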

Differential revision: https://reviews.llvm.org/D30781

llvm-svn: 297724
2017-03-14 10:13:17 +00:00
Sam Parker
1cba526c4c [ARM] Move SMULW[B|T] isel to DAG Combine
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using TableGen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandedBits. Added a couple of
extra tests.
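
For reference, this is roughly the source-level computation the instructions implement (a plain C++ model of the semantics, not the DAG patterns or the combine code): SMULWB multiplies a 32-bit value by the bottom 16 bits of the other operand and keeps the top 32 bits of the 48-bit product; SMULWT does the same with the top 16 bits.

    #include <cstdint>
    #include <cstdio>

    // smulwb: 32 x 16 signed multiply, returning the top 32 bits of the
    // 48-bit product (i.e. the full product shifted right by 16).
    int32_t smulwb(int32_t a, int32_t b) {
      int16_t lo = static_cast<int16_t>(b & 0xFFFF);
      return static_cast<int32_t>((static_cast<int64_t>(a) * lo) >> 16);
    }

    // smulwt: same, but using the top 16 bits of b as the 16-bit operand.
    int32_t smulwt(int32_t a, int32_t b) {
      int16_t hi = static_cast<int16_t>((b >> 16) & 0xFFFF);
      return static_cast<int32_t>((static_cast<int64_t>(a) * hi) >> 16);
    }

    int main() {
      int32_t a = 0x00010000; // 65536
      int32_t b = 0x00030002; // top half 3, bottom half 2
      // smulwb: (65536 * 2) >> 16 = 2;  smulwt: (65536 * 3) >> 16 = 3
      std::printf("smulwb = %d, smulwt = %d\n", smulwb(a, b), smulwt(a, b));
      return 0;
    }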

Differential Revision: https://reviews.llvm.org/D30708

llvm-svn: 297716
2017-03-14 09:13:22 +00:00
Oren Ben Simhon
5cdf2fcc64 Disable Callee Saved Registers
Each calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use an additional condition: if a register is used for passing or returning arguments, the caller needs to save it, even if it is part of the Callee Saved Registers (CSR) list.
The current LLVM implementation doesn't support this: it saves a register if it is part of the static CSR list, regardless of whether the register is used to pass or return a value.
The solution is to dynamically allocate the CSR lists (only for these CCs). The lists are updated with the actual registers that should be saved by the callee.
Since we need the allocated lists to live as long as the function, they reside in the Machine Register Info (MRI), which is a property of the Machine Function, is managed by it, and has the same life span.
The lists are saved in the MRI and populated in LowerCall and LowerFormalArguments.
This patch will also help implement the future no_caller_saved_registers attribute intended for the interrupt handler CC.
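
A very rough standalone model of the idea (the type and method names below are invented for illustration and are not the actual MachineRegisterInfo API): the per-function register info owns an editable copy of the static CSR list, and argument/return lowering removes registers that end up carrying values.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    using Register = unsigned;

    // Stand-in for the per-function register info that owns the dynamically
    // allocated CSR list (same lifetime as the function).
    struct FunctionRegInfo {
      std::vector<Register> CalleeSavedRegs;

      // Start from the calling convention's static CSR list.
      void initCalleeSavedRegs(const std::vector<Register> &StaticCSRs) {
        CalleeSavedRegs = StaticCSRs;
      }

      // Called from argument/return lowering: a register used to pass or
      // return a value must be saved by the caller, so drop it from the
      // callee-saved list.
      void removeArgumentReg(Register Reg) {
        CalleeSavedRegs.erase(
            std::remove(CalleeSavedRegs.begin(), CalleeSavedRegs.end(), Reg),
            CalleeSavedRegs.end());
      }
    };

    int main() {
      FunctionRegInfo MRI;
      MRI.initCalleeSavedRegs({/*RBX*/ 3, /*RBP*/ 6, /*R12*/ 12});
      MRI.removeArgumentReg(12); // R12 carries an argument in this CC
      for (Register R : MRI.CalleeSavedRegs)
        std::printf("callee-saved: %u\n", R);
      return 0;
    }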

Differential Revision: https://reviews.llvm.org/D28566

llvm-svn: 297715
2017-03-14 09:09:26 +00:00
Craig Topper
ce8e621808 [AVX-512] Use iPTR instead of i64 in patterns for extract_subvector/insert_subvector index.
llvm-svn: 297707
2017-03-14 06:40:04 +00:00
Craig Topper
e4a6576174 [AVX-512] Add test cases that demonstrate some patterns that don't work correctly in 32-bit mode. NFC
llvm-svn: 297706
2017-03-14 06:40:00 +00:00
Jonas Paulsson
42e7a2d74b [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.
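
A small standalone sketch of the operand-aware part (toy types, not the real TargetTransformInfo interface): the scalarization estimate only charges for unique, non-constant arguments, so constants and repeated values add nothing.

    #include <cstdio>
    #include <set>
    #include <vector>

    // Toy operand: either a constant or a value identified by an id.
    struct Operand {
      bool IsConstant;
      int ValueID;
    };

    // Count how many distinct non-constant operands would actually need to
    // be scalarized; constants and repeated values add no extra cost.
    unsigned uniqueNonConstantOperands(const std::vector<Operand> &Args) {
      std::set<int> Seen;
      for (const Operand &Op : Args)
        if (!Op.IsConstant)
          Seen.insert(Op.ValueID);
      return static_cast<unsigned>(Seen.size());
    }

    int main() {
      // e.g. pow(x, 2.0) and fma(x, x, y): constants and duplicates are free.
      std::vector<Operand> PowArgs = {{false, 1}, {true, 0}};
      std::vector<Operand> FmaArgs = {{false, 1}, {false, 1}, {false, 2}};
      std::printf("pow: %u unique operand(s), fma: %u unique operand(s)\n",
                  uniqueNonConstantOperands(PowArgs),
                  uniqueNonConstantOperands(FmaArgs));
      return 0;
    }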

Test updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

llvm-svn: 297705
2017-03-14 06:35:36 +00:00
Craig Topper
9982fc8657 [AVX-512] Pre-emptively fix more places in fastisel where we might copy a VK1 register into an AH/BH/CH/DH register.
llvm-svn: 297704
2017-03-14 04:18:25 +00:00
Daniel Berlin
c19cdac06d Add missing condprop-xfail.ll that contains the remaining xfail'd tests
llvm-svn: 297699
2017-03-14 01:46:51 +00:00
Nirav Dave
8d60f2fd82 Recommitting Craig Topper's patch now that r296476 has been recommitted.
When checking if a chain node is foldable, make sure the intermediate nodes have a single use across all results, not just the result that was used to reach the chain node.

This recovers a test case that was severely broken by r296476, by making sure we don't create an ADD/ADC that loads and stores when there is also a flag dependency.

llvm-svn: 297698
2017-03-14 01:42:23 +00:00
Nirav Dave
889cd22a6a In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommitting with compile-time improvements

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    a simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner,
    as it separates the handling of non-interfering loads/stores from the
    store-merging logic.

    When merging stores, search up the chain through a single load, and
    find all possible stores by looking down through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly construct
    wider loads, but then promote them to float operations which appear
    to require more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyFromReg
         nodes as these are captured by data dependence

      6. Forward load-store values through TokenFactors containing
          {CopyToReg,CopyFromReg} values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit, as
         in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 297695
2017-03-14 00:34:14 +00:00
Vitaly Buka
8e3839c39d [libFuzzer] Reorder includes in test
llvm-svn: 297692
2017-03-13 23:49:00 +00:00
Vitaly Buka
5c4b250aa0 [libFuzzer] Fix compilation of CustomCrossOverAndMutateTest on Windows
llvm-svn: 297690
2017-03-13 23:46:30 +00:00
Zachary Turner
71089e8136 Add the beginning of PDB diffing support.
For now this only diffs the stream directory and the MSF
Superblock.  Future patches will drill down into individual
streams to find out where the differences lie.

Differential Revision: https://reviews.llvm.org/D30908

llvm-svn: 297689
2017-03-13 23:28:25 +00:00
Adrian Prantl
d3f451c0b0 Revert "Debug Info: Add basic support for external types references."
This reverts commit r242302. External type refs of this form were
never used by any LLVM frontend so this is effectively dead code.
(They were introduced to support clang module debug info, but in the
end we came up with a better design that doesn't use this feature at
all.)

rdar://problem/25897929

Differential Revision: https://reviews.llvm.org/D30917

llvm-svn: 297684
2017-03-13 22:56:14 +00:00
Daniel Berlin
b96c517f43 NewGVN: We pass rle-nonlocal, we just perform the replacement in a way that keeps the old name instead of the new one
llvm-svn: 297683
2017-03-13 22:43:30 +00:00
Artyom Skrobov
0ccdb41dc1 [Thumb1] combine ADDC/SUBC with a negative immediate
Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400

Reviewers: efriedma, jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297682
2017-03-13 22:36:14 +00:00
Rui Ueyama
034096e25d Make FileOutputBuffer fail early if you pass a directory.
Previously, it created a temporary file and then failed when
FileOutputBuffer tried to rename that file to the destination file
(which is actually a directory name).
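
A minimal model of the early check using std::filesystem (the helper name is invented and this is not the actual llvm::sys::fs code): refuse up front when the destination is a directory rather than failing later at the rename.

    #include <cstdio>
    #include <filesystem>
    #include <string>
    #include <system_error>

    // Fail early if the requested output path is a directory; otherwise a
    // temporary file would be created and the final rename onto the
    // directory name would fail much later.
    std::error_code checkOutputPath(const std::string &Path) {
      std::error_code EC;
      if (std::filesystem::is_directory(Path, EC))
        return std::make_error_code(std::errc::is_a_directory);
      return {}; // ok to proceed with creating the output buffer
    }

    int main() {
      std::error_code EC = checkOutputPath("/tmp");
      std::printf("/tmp: %s\n", EC ? EC.message().c_str() : "ok");
      return 0;
    }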

Differential Revision: https://reviews.llvm.org/D30912

llvm-svn: 297679
2017-03-13 22:19:05 +00:00
Craig Topper
d664044360 [AVX-512] Fix another case where we are copying from a mask register using AH/BH/CH/DH with fastisel.
Fixes PR32256. Still planning to do an audit for other possible cases.

llvm-svn: 297678
2017-03-13 21:58:54 +00:00
David Blaikie
2316ccd2f3 Fix llvm-symbolizer to navigate both DW_AT_abstract_origin and DW_AT_specification in a single chain
In a recent refactoring (r291959) this regressed to only following one
or the other, not both, in a single chain.
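
A toy model of the fixed walk (the Die type is invented; this is not the real DWARF parser): a single loop follows whichever of DW_AT_abstract_origin or DW_AT_specification is present, so a chain that mixes both still reaches the entry that carries the name.

    #include <cstdio>
    #include <string>

    // Toy debug-info entry: a DIE may point at another DIE through either
    // DW_AT_abstract_origin or DW_AT_specification, and the two can be
    // mixed along one chain.
    struct DIE {
      std::string Name; // empty if the name lives further up the chain
      const DIE *AbstractOrigin = nullptr;
      const DIE *Specification = nullptr;
    };

    // Follow both attributes in a single loop until a name is found.
    std::string resolveName(const DIE *D) {
      while (D) {
        if (!D->Name.empty())
          return D->Name;
        D = D->AbstractOrigin ? D->AbstractOrigin : D->Specification;
      }
      return "<unknown>";
    }

    int main() {
      DIE Decl{"my_function", nullptr, nullptr};
      DIE Def{"", nullptr, &Decl};    // definition -> declaration (specification)
      DIE Inlined{"", &Def, nullptr}; // inlined copy -> definition (abstract origin)
      std::printf("resolved: %s\n", resolveName(&Inlined).c_str());
      return 0;
    }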

llvm-svn: 297676
2017-03-13 21:46:37 +00:00
David Blaikie
2dbd650938 Remove unused lambda capture
llvm-svn: 297675
2017-03-13 21:46:14 +00:00
David Blaikie
aed67710d2 Fix sign compare warning in unit test by using an explicit unsigned literal suffix
llvm-svn: 297674
2017-03-13 21:46:12 +00:00
Marcello Maggioni
7b495cf55a [IPRA] Change algorithm for RegUsageInfoCollector.
The previous algorithm for RegUsageInfoCollector had pretty bad
performance on architectures with a lot of registers that alias
one another heavily, because we potentially iterate for every register
over all the aliasing registers. This costs even more if the function
is small and doesn't define a lot of registers.
This patch changes the algorithm to one that, while iterating over
all the registers, iterates over the aliasing registers only
if the register itself is defined.
This should be faster based on the assumption that only a subset
of the whole LLVM register set is actually defined in the function.
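
A standalone sketch of the new iteration order (the data structures are invented for illustration and are not the pass's real ones): only registers that the function actually defines pay for walking their alias sets.

    #include <cstdio>
    #include <set>
    #include <vector>

    using Register = unsigned;

    // Toy register file: for each register, the registers it aliases.
    struct RegisterInfo {
      std::vector<std::vector<Register>> Aliases;
    };

    // Collect every register clobbered by the function: iterate over all
    // registers, but only expand the alias set of registers the function
    // actually defines.
    std::set<Register> collectClobberedRegs(const RegisterInfo &RI,
                                            const std::set<Register> &Defined) {
      std::set<Register> Clobbered;
      for (Register Reg = 0; Reg < RI.Aliases.size(); ++Reg) {
        if (!Defined.count(Reg))
          continue; // skip the expensive alias walk for undefined registers
        Clobbered.insert(Reg);
        for (Register A : RI.Aliases[Reg])
          Clobbered.insert(A);
      }
      return Clobbered;
    }

    int main() {
      // Registers 0..3, where 0 aliases 1 (think AX/EAX) and 2 aliases 3.
      RegisterInfo RI{{{1}, {0}, {3}, {2}}};
      std::set<Register> Defined = {0};
      for (Register R : collectClobberedRegs(RI, Defined))
        std::printf("clobbered: %u\n", R); // prints 0 and 1
      return 0;
    }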

Differential Revision: https://reviews.llvm.org/D30880

llvm-svn: 297673
2017-03-13 21:42:53 +00:00
Juergen Ributzka
054e583082 [Support] Follow-up for "Test directory iterators and recursive directory iterators with broken symlinks."
Fix the test by sorting the result vector.

llvm-svn: 297672
2017-03-13 21:40:20 +00:00
Volkan Keles
f603fe7e92 GlobalISel: Translate ConstantDataVector
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab

Reviewed By: qcolombet, dsanders, ab

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30216

llvm-svn: 297670
2017-03-13 21:36:19 +00:00
Juergen Ributzka
036b9677ed [Support] Test directory iterators and recursive directory iterators with broken symlinks.
This commit adds a unit test to the file system tests to verify the behavior of
the directory iterator and recursive directory iterator with broken symlinks.

This test is Unix only.

llvm-svn: 297669
2017-03-13 21:34:07 +00:00
Tim Northover
70406195b7 Revert "GlobalISel: move vector extract/insert inside generic opcode region."
I was writing against an earlier branch and Volkan had already fixed this.

llvm-svn: 297668
2017-03-13 21:25:10 +00:00
Simon Pilgrim
57d2266552 [X86][MMX] Fix folding of shift value loads to cover whole 64-bits
rL230225 made the assumption that only the lower 32 bits of an MMX register load are used as a shift value, when in fact the whole 64 bits are reloaded and treated as an i64 to determine the shift value.

This patch reverts rL230225 to ensure that the whole 64 bits of memory are folded and ensures that the upper 32 bits are zeroed for cases where the shift value has come from a scalar source.

Found during fuzz testing.

Differential Revision: https://reviews.llvm.org/D30833

llvm-svn: 297667
2017-03-13 21:23:29 +00:00
Tim Northover
96a7930f52 GlobalISel: move vector extract/insert inside generic opcode region.
Otherwise they won't be legalized or selected, causing instruction selection to
fail horribly.

llvm-svn: 297666
2017-03-13 21:18:59 +00:00
Andrew Kaylor
5b14a90ddd Revert r295004 (Add MXCSR) due to errors reported by MachineVerifier
I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.

llvm-svn: 297664
2017-03-13 20:35:10 +00:00
Volkan Keles
55c5038c53 [GlobalISel] Update PRE_ISEL_GENERIC_OPCODE_END marker
llvm-svn: 297663
2017-03-13 20:31:45 +00:00
Matt Arsenault
7b61ea7700 AMDGPU: Re-use TM.getNullPointerValue
llvm-svn: 297662
2017-03-13 20:18:14 +00:00
Rafael Espindola
4358d27b64 Bring back r297624.
The issue was just a missing REQUIRES in the test.

llvm-svn: 297661
2017-03-13 20:00:25 +00:00
Sanjay Patel
f688b567a8 [SimplifyCFG] move tests for PR31028 from CGP
Hopefully, this will make sense with a forthcoming patch. If not, we can move these back.

llvm-svn: 297660
2017-03-13 19:59:14 +00:00
Matt Arsenault
78e7d66a36 AMDGPU: Treat 0 as private null pointer in addrspacecast lowering
llvm-svn: 297658
2017-03-13 19:47:31 +00:00
Rafael Espindola
b89a1dbe86 Revert "Fix crash when multiple raw_fd_ostreams to stdout are created."
This reverts commit r297624.
It was failing on the bots.

llvm-svn: 297657
2017-03-13 19:38:32 +00:00
Daniel Berlin
58f8a88d1e Fix some indenting and line-wrapping issues identified in ProgrammersManual. Make description of debugCounters a little clearer
llvm-svn: 297656
2017-03-13 19:09:23 +00:00
Jessica Paquette
ad5647725b [Outliner] Add tail call support
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.

Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail-called
outlined function saves one instruction since no return has to be inserted.

llvm-svn: 297653
2017-03-13 18:39:33 +00:00
Craig Topper
c2088276f3 [X86] Lower AVX2 gather intrinsics similar to AVX-512. Apply the same input source optimizations to break execution dependencies.
For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2.
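
A simplified sketch of the decision (toy types, not the actual X86 lowering code): when the pass-through operand is undef, or the mask selects every lane anyway, substitute an explicit zero vector so the gather does not inherit a false dependency on whatever register last held the pass-through value.

    #include <cstdio>

    // Simplified view of the gather operands that matter for the decision.
    struct GatherSrc {
      bool PassThruIsUndef; // no meaningful pass-through value
      bool MaskIsAllOnes;   // every lane will be overwritten by the gather
    };

    enum class SrcChoice { KeepPassThru, UseZeroVector };

    // If the pass-through is undef, or the mask is all ones (so the
    // pass-through can never be observed), use a zero vector instead and
    // break the execution dependency.
    SrcChoice selectGatherSource(const GatherSrc &G) {
      if (G.PassThruIsUndef || G.MaskIsAllOnes)
        return SrcChoice::UseZeroVector;
      return SrcChoice::KeepPassThru;
    }

    int main() {
      GatherSrc AllOnesMask{false, true};
      std::printf("all-ones mask -> %s\n",
                  selectGatherSource(AllOnesMask) == SrcChoice::UseZeroVector
                      ? "zero vector"
                      : "keep pass-through");
      return 0;
    }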

llvm-svn: 297652
2017-03-13 18:34:46 +00:00
Craig Topper
3cb591fd44 [AVX-512] If gather mask is all ones, force the input to a zero vector.
We were already forcing undef inputs to become a zero vector; this now catches an all-ones mask too.

Ideally we'd use undef and let the execution dependency fix pass pick the best register/clearance for the undef, but I don't think it can handle the early clobber today.

llvm-svn: 297651
2017-03-13 18:17:46 +00:00
Matt Arsenault
9763d98635 AMDGPU: Fold icmp/fcmp into icmp intrinsic
The typical use is a library vote function which
compares the result to 0. Fold the user's condition into the intrinsic.

llvm-svn: 297650
2017-03-13 18:14:02 +00:00
Jonas Devlieghere
9840ae61ae [Linker] Provide callback for internalization
Differential Revision: https://reviews.llvm.org/D30738

llvm-svn: 297649
2017-03-13 18:08:11 +00:00
Craig Topper
d35723bd11 [SelectionDAG] Enhance SDTCisSameNumEltsAs to work with scalar types and use it on extend/trunc/round operations.
Currently we don't enforce that ISD::ANY_EXTEND, ZERO_EXTEND, SIGN_EXTEND, TRUNC, FP_ROUND, and FP_EXTEND have the same number of elements (including scalar) between their input and output, even though we document them as such. Up until a few months ago x86 created nodes that violated this rule. That's all been fixed now, and we should enforce the rule going forward.

In order to do this we need to allow SDTCisSameNumEltsAs to support scalar types and not enforce being a vector. If one type is scalar we will force the other type to also be scalar.

Differential Revision: https://reviews.llvm.org/D30878

llvm-svn: 297648
2017-03-13 17:37:14 +00:00