llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Sanjay Patel	967565769d	[InstCombine] revert r280637 because it causes test failures on an ARM bot http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll llvm-svn: 280676	2016-09-05 22:36:32 +00:00
Craig Topper	e5f780472a	[AVX-512] Integrate mask register copying more completely into X86InstrInfo::copyPhysReg and simplify. No functional change intended. The code is now written in terms of source and dest classes with feature checks inside each type of copy instead of having separate functions for each feature set. llvm-svn: 280673	2016-09-05 20:34:50 +00:00
Benjamin Kramer	48d2159c40	[WebAssembly] Unbreak the build. Not sure why ADL isn't working here. llvm-svn: 280656	2016-09-05 12:06:47 +00:00
Valery Pykhtin	616e586d42	[AMDGPU] Refactor FLAT TD instructions Differential revision: https://reviews.llvm.org/D24072 llvm-svn: 280655	2016-09-05 11:22:51 +00:00
James Molloy	a0cf4d86b2	[Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc} These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively. Fixes PR30279. llvm-svn: 280651	2016-09-05 08:29:15 +00:00
Igor Breger	ba087d916d	[AVX512] Fix v8i1/v16i1 zext + bitcast lowering pattern. Explicitly zero upper bits. Differential Revision: http://reviews.llvm.org/D23983 llvm-svn: 280650	2016-09-05 08:26:51 +00:00
Craig Topper	4ac30e51b2	[X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC llvm-svn: 280649	2016-09-05 07:14:21 +00:00
Craig Topper	b6c15620f8	[AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator to not use the extended XMM/YMM register space. Previously we were extending to copying the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available. llvm-svn: 280648	2016-09-05 06:43:06 +00:00
Gor Nishanov	026990c96b	[Coroutines] Part11: Add final suspend handling. Summary: A frontend may designate a particular suspend to be final, by setting the second argument of the coro.suspend intrinsic to true. Such a suspend point has two properties: * it is possible to check whether a suspended coroutine is at the final suspend point via coro.done intrinsic; * a resumption of a coroutine stopped at the final suspend point leads to undefined behavior. The only possible action for a coroutine at a final suspend point is destroying it via coro.destroy intrinsic. This patch adds final suspend handling logic to CoroEarly and CoroSplit passes. Now, the final suspend point example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex5.ll). Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24068 llvm-svn: 280646	2016-09-05 04:44:30 +00:00
Craig Topper	fef4374983	[X86] Remove FsVMOVAPSrm/FsVMOVAPDrm/FsMOVAPSrm/FsMOVAPDrm. Due to their placement in the td file they had lower precedence than (V)MOVSS/SD and could almost never be selected. The only way to select them was in AVX512 mode because EVEX VMOVSS/SD was below them and the patterns weren't qualified properly for AVX only. So if you happened to have an aligned FR32/FR64 load in AVX512 you could get a VEX encoded VMOVAPS/VMOVAPD. I tried to search back through history and it seems like these instructions were probably unselectable for at least 5 years, at least to the time the VEX versions were added. But I can't prove they ever were. llvm-svn: 280644	2016-09-05 02:20:49 +00:00
Sanjay Patel	4127705ad2	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors The code to calculate 'UsesRemoved' could be simplified. As-is, that code is a victim of PR30273: https://llvm.org/bugs/show_bug.cgi?id=30273 llvm-svn: 280637	2016-09-04 20:58:27 +00:00
Craig Topper	8d44c07194	[AVX-512] Add EVEX encoded scalar FMA intrinsic instructions to isNonFoldablePartialRegisterLoad. llvm-svn: 280636	2016-09-04 19:33:47 +00:00
Craig Topper	e6bea68a97	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div intrinsics and upgrade to native IR. llvm-svn: 280633	2016-09-04 18:13:33 +00:00
Lang Hames	d1046d5a1d	[ORC] Clone module flags metadata into the globals module in the CompileOnDemandLayer. Also contains a tweak to the orc-lazy jit in LLI to enable the test case. llvm-svn: 280632	2016-09-04 17:53:30 +00:00
Sanjay Patel	3d8049bdba	[InstCombine] recode icmp fold in a vector-friendly way; NFC The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627	2016-09-04 14:32:15 +00:00
Hal Finkel	c40902b979	[PowerPC] During branch relaxation, recompute padding offsets before each iteration We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626	2016-09-04 14:18:29 +00:00
Igor Breger	a37a45a263	revert r279960. https://llvm.org/bugs/show_bug.cgi?id=30249 llvm-svn: 280625	2016-09-04 14:03:52 +00:00
Simon Pilgrim	ddb7b99be6	Strip trailing whitespace llvm-svn: 280623	2016-09-04 13:28:46 +00:00
Chandler Carruth	6a72044ca6	[LCG] Clean up and make NDEBUG verify calls more rigorous with make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619	2016-09-04 08:34:31 +00:00
Chandler Carruth	2f848e0e0f	[LCG] A NFC refactoring to extract the logic for doing a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this in an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618	2016-09-04 08:34:24 +00:00
Dorit Nuzman	f6954572bd	[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617	2016-09-04 07:49:39 +00:00
Lang Hames	b0cac60a81	[ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine. ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The practical impact of this change is that ORC users no longer need to link MCJIT to use ObjectCaches. llvm-svn: 280616	2016-09-04 07:24:11 +00:00
Dorit Nuzman	e09e716ef5	Test commit. llvm-svn: 280615	2016-09-04 07:06:00 +00:00
Hal Finkel	bf0592c975	[PowerPC] Zero-extend constants in FastISel As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614	2016-09-04 06:07:19 +00:00
Craig Topper	7bf68ae691	[AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR. llvm-svn: 280611	2016-09-04 02:09:53 +00:00
Joseph Tremoulet	8e8eb5eaa8	Fix inliner funclet unwind memoization Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610	2016-09-04 01:23:20 +00:00
Craig Topper	6fe1d5984a	[X86] Combine some of the strings in autoupgrade code. llvm-svn: 280603	2016-09-03 23:55:13 +00:00
Xinliang David Li	42e5b2ce44	Cleanup: Use metadata preserving API for branch creation Use the wrapper API in IRBuilder that does metadata copy to create new branch in LoopUnswitch. llvm-svn: 280602	2016-09-03 22:26:11 +00:00
Xinliang David Li	4abbd8e898	[Profile] preserve branch metadata lowering select in CGP CGP currently drops select's MD_prof profile data when generating conditional branch which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600	2016-09-03 21:26:36 +00:00
Mehdi Amini	0debeb1eea	Fix ThinLTO crash with debug info Because of the recent change about ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions in case we IR link without starting from an empty module. To alleviate that, we can self-map metadata defined in the destination module so that they won't be visited. Differential Revision: https://reviews.llvm.org/D23841 llvm-svn: 280599	2016-09-03 21:12:33 +00:00
Simon Pilgrim	9eb9c00241	Strip trailing whitespace llvm-svn: 280598	2016-09-03 20:36:05 +00:00
Matt Arsenault	ecba57a1e7	AMDGPU: Set sizes of spill pseudos llvm-svn: 280595	2016-09-03 17:25:44 +00:00
Matt Arsenault	2751d90018	AMDGPU: Fix adding duplicate implicit exec uses I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied. llvm-svn: 280594	2016-09-03 17:25:39 +00:00
Craig Topper	b662036043	[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. llvm-svn: 280593	2016-09-03 17:20:07 +00:00
Craig Topper	8b7f2911fc	[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent. llvm-svn: 280592	2016-09-03 16:28:03 +00:00
Nicolai Haehnle	bfd5fc8a84	AMDGPU: Reduce the duration of whole-quad-mode Summary: This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads: 1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly. 2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock. This should always be a win or at best neutral. There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22092 llvm-svn: 280590	2016-09-03 12:26:38 +00:00
Nicolai Haehnle	472bdb08df	AMDGPU: Fix an interaction between WQM and polygon stippling Summary: This fixes a rare bug in polygon stippling with non-monolithic pixel shaders. The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling. Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23131 llvm-svn: 280589	2016-09-03 12:26:32 +00:00
Matt Arsenault	53e1580587	AMDGPU: Do basic folding of class intrinsic This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586	2016-09-03 07:06:58 +00:00
Matt Arsenault	7f6114dde5	AMDGPU: Fix spilling of m0 readlane/writelane do not support using m0 as the output/input. Constrain the register class of spill vregs to try to avoid this, but also handle spilling of the physreg when necessary by inserting an additional copy to a normal SGPR. llvm-svn: 280584	2016-09-03 06:57:55 +00:00
Matt Arsenault	2fc763f3ca	Improve debug error message with register name llvm-svn: 280583	2016-09-03 06:57:49 +00:00
Craig Topper	5a9e876f61	[AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables. llvm-svn: 280581	2016-09-03 04:37:50 +00:00
Hal Finkel	68985b72e3	[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics PowerPC assembly code in the wild, so it seems, has things like this: bc+ 12, 28, .L9 This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid. This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases). Fixes PR23646. llvm-svn: 280571	2016-09-03 02:31:44 +00:00
Duncan P. N. Exon Smith	3dccc3d478	ADT: Do not inherit from std::iterator in ilist_iterator Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator. This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++. llvm-svn: 280570	2016-09-03 02:27:35 +00:00
Duncan P. N. Exon Smith	57584a7379	ADT: Remove external uses of ilist_iterator, NFC Delete the dead code for Write(ilist_iterator) in the IR Verifier, inline report(ilist_iterator) at its call sites in the MachineVerifier, and use simple_ilist<>::iterator in SymbolTableListTraits. The only remaining reference to ilist_iterator outside of the ilist implementation is from MachineInstrBundleIterator. I'll get rid of that in a follow-up. llvm-svn: 280565	2016-09-03 01:22:56 +00:00
Hal Finkel	a0698b2905	[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev These few book-III instructions are used by the Linux kernel. Partially fixes PR24796. llvm-svn: 280560	2016-09-02 23:42:01 +00:00
Hal Finkel	4e3ca08f33	[PowerPC] Add support for the extended dcbf form and mnemonics dcbf has an optional hint-like field, add support for the extended form and the associated mnemonics (dcbfl and dcbflp). Partially fixes PR24796. llvm-svn: 280559	2016-09-02 23:41:54 +00:00
Yunzhong Gao	18fe9799ec	(LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block: 1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix. 2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix. Differential Revision: https://reviews.llvm.org/D22112 llvm-svn: 280555	2016-09-02 23:15:29 +00:00
Ron Lieberman	bb7aee52af	Make sure to maintain register liveness when generating predicated instructions. Author: Krzysztof Parzyszek <kparzysz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D24209 llvm-svn: 280552	2016-09-02 22:56:24 +00:00
Xinliang David Li	ed50966900	[Profile] handle select instruction in 'expect' lowering Builtin expect lowering currently ignores select. This patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24166 llvm-svn: 280547	2016-09-02 22:03:40 +00:00
Hal Finkel	3b1203e54e	[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha When we have an offset into a global, etc. that is accessed relative to the TOC base pointer, and the offset is larger than the minimum alignment of the global itself and the TOC base pointer (which is 8-byte aligned), we can still fold the @toc@ha into the memory access, but we must update the addis instruction's symbol reference with the offset as the symbol addend. When there is only one use of the addi to be folded and only one use of the addis that would need its symbol's offset adjusted, then we can make the adjustment and fold the @toc@l into the memory access. llvm-svn: 280545	2016-09-02 21:37:07 +00:00

94598 Commits