llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Sanjay Patel	59d186a190	[AArch64][x86] add tests for ctpop != 1; NFC This is the inverted predicate pattern for D63004. llvm-svn: 364314	2019-06-25 13:37:16 +00:00
Simon Pilgrim	f796e31c0b	[X86] lowerShuffleAsSpecificZeroOrAnyExtend - add ANY_EXTEND TODO. lowerShuffleAsSpecificZeroOrAnyExtend should be able to lower to ANY_EXTEND_VECTOR_INREG as well as ZERO_EXTEND_VECTOR_INREG. llvm-svn: 364313	2019-06-25 13:36:53 +00:00
Fangrui Song	42567bb50b	[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D60692 llvm-svn: 364312	2019-06-25 13:28:44 +00:00
Simon Pilgrim	aec0c38270	[TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support Add 'lowest' demanded elt -> bitcast fold to all *_EXTEND_VECTOR_INREG cases. Reapplies rL363856. llvm-svn: 364311	2019-06-25 13:25:57 +00:00
Whitney Tsang	3e1847f0ff	Expand cloneLoopWithPreheader() to support cloning loop nest Summary: cloneLoopWithPreheader() currently only supports the innermost loop, and asserts otherwise. Reviewers: Meinersbur, fhahn, kbarton Reviewed By: Meinersbur Subscribers: hiraditya, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63446 llvm-svn: 364310	2019-06-25 13:23:13 +00:00
Matt Arsenault	f2f9a89033	AMDGPU/GlobalISel: Fix duplicated test Somehow ended up with copies of the same tests in AMDGPU and AMDGPU/GlobalISel llvm-svn: 364309	2019-06-25 13:23:08 +00:00
Matt Arsenault	891c1876b8	AMDGPU: Select G_SEXT/G_ZEXT/G_ANYEXT llvm-svn: 364308	2019-06-25 13:18:11 +00:00
James Henderson	2a551ce03b	[llvm-objcopy][llvm-strip] Fix help text typo for --allow-broken-links llvm-svn: 364307	2019-06-25 13:14:18 +00:00
James Henderson	dc8e33b402	[docs][llvm-readobj] Improve llvm-readobj documentation There were a number of issues with the llvm-readobj documentation. The following points were raised in https://bugs.llvm.org/show_bug.cgi?id=42255, and have been fixed in this patch: 1. The description section claimed "The tool and its output is primarily designed for use in FileCheck-based tests" which is not really the case any more. 2. The documentation used single-dash long options for option names, but references in the help text to other options exclusively used double-dashes. Fixed by standardising on double-dashes for all long-form options. 3. The majority of options available and in the help text were not present in the documentation. This patch adds them. 4. Several aliases, both long and short, were missing, e.g. --relocs. Additionally, this patch improves the documentation by: 1. Splitting the options into categories based on the file format they are specific to. 2. Updating the Exit Status section to correctly mention that errors lead to a non-zero exit code. 3. Adding a See Also section referencing other similar LLVM tools. 4. Improving/correcting some of the descriptions of options that did not quite match up with what llvm-readobj does. Reviewed by: peter.smith, MaskRay, mtrent Differential Revision: https://reviews.llvm.org/D63719 llvm-svn: 364306	2019-06-25 13:12:38 +00:00
Simon Tatham	3d50da41f5	[ARM] Re-enable misspelled RUN: lines in fullfp16.s. rL364293 committed a couple of lines that just said "// RUN llvm-mc ..." without the all-important ':' after RUN, so those test lines weren't actually running anything. llvm-svn: 364305	2019-06-25 13:10:29 +00:00
Matt Arsenault	8f42855556	AMDGPU: Make amdgcn.s.get.waveid.in.workgroup inaccessiblememonly This should probably be readnone, even though the instruction looks like a load. llvm-svn: 364304	2019-06-25 13:03:06 +00:00
Simon Pilgrim	f92e9771b2	[TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required. Matches what we already do for ZERO_EXTEND. Reapplies rL363850 but now with legality checks added at rL364290 llvm-svn: 364303	2019-06-25 12:57:43 +00:00
Sanjay Patel	b956364009	[SDAG] improve expansion of ctpop+setcc This should not cause any visible change in output, but it's more efficient because we were producing non-canonical 'sub x, 1' and 'setcc ugt x, 0'. As mentioned in the TODO, we should also be handling the inverse predicate. llvm-svn: 364302	2019-06-25 12:49:35 +00:00
Simon Pilgrim	9e195c63c6	Fix frame.s test dir-separator checks Handle / and \ separators llvm-svn: 364301	2019-06-25 12:35:38 +00:00
Simon Tatham	17fd136d10	[ARM] Fix buildbot failure due to -Werror. Including both 'case ARM_AM::uxtw' and 'default' in the getShiftOp switch caused a buildbot to fail with error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] llvm-svn: 364300	2019-06-25 12:23:46 +00:00
Simon Pilgrim	f701700461	[TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero. Matches what we already do for SIGN_EXTEND. Reapplies rL363802 but now with legality checks added at rL364290 llvm-svn: 364299	2019-06-25 12:19:12 +00:00
Sjoerd Meijer	8e5fde9271	[ARM] MVE VPT Blocks A minor iteration on the MVE VPT Block pass to enable more efficient VPT Block code generation: consecutive VPT predicated statements, predicated on the same condition, will be placed within the same VPT Block. This essentially is also an exercise to write some more tests for the next step, which should be more generic also merging instructions when they are not consecutive. Differential Revision: https://reviews.llvm.org/D63711 llvm-svn: 364298	2019-06-25 12:04:31 +00:00
Nicolai Haehnle	216a9fc9a1	AMDGPU: Write LDS objects out as global symbols in code generation Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile time, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297	2019-06-25 11:52:30 +00:00
Nicolai Haehnle	a42afe2f42	AMDGPU/MC: Add .amdgpu_lds directive Summary: The directive defines a symbol as a group/local memory (LDS) symbol. LDS symbols behave similarly to common symbols for the purposes of ELF, using the processor-specific SHN_AMDGPU_LDS as section index. It is the linker and/or runtime loader's job to "instantiate" LDS symbols and resolve relocations that reference them. It is not possible to initialize LDS memory (not even zero-initialize as for .bss). We want to be able to link together objects -- starting with relocatable objects, but possibly expanding to shared objects in the future -- that access LDS memory in a flexible way. LDS memory is in an address space that is entirely separate from the address space that contains the program image (code and normal data), so having program segments for it doesn't really make sense. Furthermore, we want to be able to compile multiple kernels in a compilation unit which have disjoint use of LDS memory. In that case, we may want to place LDS symbols differently for different kernels to save memory (LDS memory is very limited and physically private to each kernel invocation), so we can't simply place LDS symbols in a .lds section. Hence this solution where LDS symbols always stay undefined. Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61493 llvm-svn: 364296	2019-06-25 11:51:35 +00:00
Simon Pilgrim	340fe790bd	[VectorLegalizer] ExpandANY_EXTEND_VECTOR_INREG/ExpandZERO_EXTEND_VECTOR_INREG - widen source vector The *_EXTEND_VECTOR_INREG opcodes were relaxed back around rL346784 to support source vector widths that are smaller than the output - it looks like the legalizers were never updated to account for this. This patch inserts the smaller source vector into an undef vector of the same width of the result before performing the shuffle+bitcast to correctly handle this. Part of the yak shaving to solve the crashes from rL364264 and rL364272 llvm-svn: 364295	2019-06-25 11:31:37 +00:00
Simon Tatham	bbffa3e0d1	[ARM] Explicit lowering of half <-> double conversions. If an FP_EXTEND or FP_ROUND isel dag node converts directly between f16 and f64 when the target CPU has no instruction to do it in one go, it has to be done in two steps instead, going via f32. Previously, this was done implicitly, because all such CPUs had the storage-only implementation of f16 (i.e. the only thing you can do with one at all is to convert it to/from f32). So isel would legalize the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp node (or vice versa), and then the fp_extend would already be f32->f64 rather than f16->f64. But that technique can't support a target CPU which has full f16 support but _not_ f64, such as some variants of Arm v8.1-M. So now we provide custom lowering for FP_EXTEND and FP_ROUND, which checks support for f16 and f64 and decides on the best thing to do given the combination of flags it gets back. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60692 llvm-svn: 364294	2019-06-25 11:24:50 +00:00
Simon Tatham	ba1b1937ee	[ARM] Extra MVE-related testing. This adds some extra RUN lines to existing test files, to check that things that worked in previous architecture versions haven't accidentally stopped working in 8.1-M. Also we add some new tests: a test of scalar floating point instructions that could be easily confused with the similar-looking vector ones at assembly time, a test of basic load/store/move access to the FP registers (which has to work even in integer-only MVE); and one final check of the really obvious case where turning off MVE should make sure MVE instructions really are rejected. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62682 llvm-svn: 364293	2019-06-25 11:24:42 +00:00
Simon Tatham	d12f1d5d0c	[ARM] Add remaining miscellaneous MVE instructions. This final batch includes the tail-predicated versions of the low-overhead loop instructions (LETP); the VPSEL instruction to select between two vector registers based on the predicate mask without having to open a VPT block; and VPNOT which complements the predicate mask in place. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62681 llvm-svn: 364292	2019-06-25 11:24:33 +00:00
Simon Tatham	84fe6af65f	[ARM] Add MVE vector load/store instructions. This adds the rest of the vector memory access instructions. It includes contiguous loads/stores, with an ordinary addressing mode such as [r0,#offset] (plus writeback variants); gather loads and scatter stores with a scalar base address register and a vector of offsets from it (written [r0,q1] or similar); and gather/scatters with a vector of base addresses (written [q0,#offset], again with writeback). Additionally, some of the loads can widen each loaded value into a larger vector lane, and the corresponding stores narrow them again. To implement these, we also have to add the addressing modes they need. Also, in AsmParser, the `isMem` query function now has subqueries `isGPRMem` and `isMVEMem`, according to which kind of base register is used by a given memory access operand. I've also had to add an extra check in `checkTargetMatchPredicate` in the AsmParser, without which our last-minute check of `rGPR` register operands against SP and PC was failing an assertion because Tablegen had inserted an immediate 0 in place of one of a pair of tied register operands. (This matches the way the corresponding check for `MCK_rGPR` in `validateTargetOperandClass` is guarded.) Apparently the MVE load instructions were the first to have ever triggered this assertion, but I think only because they were the first to have a combination of the usual Arm pre/post writeback system and the `rGPR` class in particular. Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62680 llvm-svn: 364291	2019-06-25 11:24:18 +00:00
Simon Pilgrim	263a203b09	[TargetLowering] SimplifyDemandedBits - legal checks for SIGN/ZERO_EXTEND -> ZERO/ANY_EXTEND As part of the fix for rL364264 + rL364272 - limit the *_EXTEND conversion to !TLO.LegalOperations || isOperationLegal cases. We'll improve X86 legality in future commits. llvm-svn: 364290	2019-06-25 10:51:15 +00:00
Nemanja Ivanovic	230906bf49	[PowerPC] Emit XXSEL for vec_sel and code that has the same pattern As pointed out in https://bugs.llvm.org/show_bug.cgi?id=41777 we do not emit a vector select even when the code pretty much asks for one. This patch changes that. Differential revision: https://reviews.llvm.org/D61658 llvm-svn: 364289	2019-06-25 10:46:13 +00:00
Sam Parker	2f44405590	[ARM] DLS/LE low-overhead loop code generation Introduce three pseudo instructions to be used during DAG ISel to represent v8.1-m low-overhead loops. One maps to set_loop_iterations while loop_decrement_reg is lowered to two, so that we can separate the decrement and branching operations. The pseudo instructions are expanded pre-emission, where we can still decide whether we actually want to generate a low-overhead loop, in a new pass: ARMLowOverheadLoops. The pass currently bails, reverting to a sub, icmp and br, in the cases where a call or stack spill/restore happens between the decrement and branching instructions, or if the loop is too large. Differential Revision: https://reviews.llvm.org/D63476 llvm-svn: 364288	2019-06-25 10:45:51 +00:00
James Henderson	1ad83965e2	[docs][llvm-cxxfilt] Write llvm-cxxfilt documentation There was a stub for llvm-cxxfilt, but it didn't describe the options. Additionally, it was in markdown, which was causing issues, so as discussed in https://reviews.llvm.org/D63211, this change replaces the existing stub with an RST file. Reviewed by: MaskRay, mattd Differential Revision: https://reviews.llvm.org/D63722 llvm-svn: 364287	2019-06-25 10:36:15 +00:00
Roman Lebedev	60843c0aeb	[Codegen] TargetLowering::SimplifySetCC(): omit urem when possible Summary: This addresses the regression that is being exposed by D50222 in `test/CodeGen/X86/jump_sign.ll` The missing fold, at least partially, looks trivial: https://rise4fun.com/Alive/Zsln i.e. if we are comparing with zero, and comparing the `urem`-by-non-power-of-two, and the `urem` is of something that may at most have a single bit set (or no bits set at all), the `urem` is not needed. Reviewers: RKSimon, craig.topper, xbolva00, spatel Reviewed By: xbolva00, spatel Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63390 llvm-svn: 364286	2019-06-25 10:01:42 +00:00
George Rimar	5646c326e2	[yaml2obj/obj2yaml] - Allow having the symbols and sections with duplicated names. The patch teaches yaml2obj/obj2yaml to support parsing/dumping the sections and symbols with the same name. A special suffix is added to a name to make it unique. Differential revision: https://reviews.llvm.org/D63596 llvm-svn: 364282	2019-06-25 08:22:57 +00:00
Clement Courbet	5b2aa25659	[ExpandMemCmp] Move all options to TargetTransformInfo. Split off from D60318. llvm-svn: 364281	2019-06-25 08:04:13 +00:00
Hiroshi Inoue	7b7fe56333	[NFC] fix trivial typos in documents llvm-svn: 364278	2019-06-25 07:24:27 +00:00
Hans Wennborg	17d9bd7dd5	Add llvm-symbolizer to LLVM_TOOLCHAIN_TOOLS (PR40152) So that it gets installed in LLVM_INSTALL_TOOLCHAIN_ONLY builds. llvm-svn: 364277	2019-06-25 07:15:41 +00:00
Hans Wennborg	abf917a67c	[LLVM-C] Add LLVM-C.dll to Windows installer package This is a follow up to D56781, D56774 and D35077 to make the LLVM-C.dll file and LLVM-C.lib be installed on Windows, just like LTO.dll and LTO.lib are. Patch by Jakob Bornecrantz! Differential revision: https://reviews.llvm.org/D63717 llvm-svn: 364275	2019-06-25 07:05:00 +00:00
Craig Topper	6cde116fa8	[X86] Add test case that led to the revert of r363802, r363850, and r363856 in r364264 I've been trying to fix this, but hit some roadblocks. So I'm committing the test case for now so we'll at least avoid recreating that failure. llvm-svn: 364272	2019-06-25 06:40:28 +00:00
Craig Topper	34c96d4546	Revert r363802, r363850, and r363856 "[TargetLowering] SimplifyDemandedBits..." This reverts the following patches. "[TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG" "[TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG" "[TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support" We can end up with an any_extend_vector_inreg with a 256 bit result type and a 128 bit input type. This is allowed by the ISD opcode, but the generic operation legalizer is only able to expand cases where the total vector width is the same. The X86 backend creates these mismatched cases for zext_vec_inreg/sext_vec_inreg. The SimplifyDemandedBits changes are allowing those nodes to become aext_vec_inreg. For the zext/sext cases, the X86 backend has Custom handling and never lets them get to the generic legalizer. We need to do the same for aext_vec_inreg. llvm-svn: 364264	2019-06-25 01:32:42 +00:00
Seiya Nuta	df06ece11b	[llvm-objcopy][NFCI] Fix build failure with GCC Here is unreachable since the switch statement above is exhaustive. llvm-svn: 364263	2019-06-25 01:08:21 +00:00
Matt Arsenault	8c8f23ce52	AMDGPU/GlobalISel: Fix regbankselect for amdgcn.class llvm-svn: 364262	2019-06-25 01:07:22 +00:00
Huihui Zhang	2ff76f422f	[InstCombine][NFC] Add test to show missing fold for icmp ult/uge (shl %x, C2), C1. Summary: 'shl' inequality test ``` icmp ult/uge (shl %x, C2), C1 iff C1 is power of two ``` can be simplified as 'and' equality test ``` icmp eq/ne (and %x, (lshr -C1, C2)), 0. ``` Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63670 llvm-svn: 364256	2019-06-25 00:14:02 +00:00
Huihui Zhang	46e227f1f5	[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier. Summary: To generate simplified IR, make sure fold (X & ~C) ==/!= 0 --> X u</u>= C+1 is scheduled before fold ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. https://rise4fun.com/Alive/7ZN Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63505 llvm-svn: 364255	2019-06-25 00:09:10 +00:00
Seiya Nuta	5d86310fb5	[llvm-objcopy][NFC] Refactor output target parsing Summary: Use an enum instead of string to hold the output file format in Config.InputFormat and Config.OutputFormat. It's essential to support other output file formats other than ELF. Reviewers: espindola, alexshap, rupprecht, jhenderson Reviewed By: rupprecht, jhenderson Subscribers: jyknight, compnerd, emaste, arichardson, fedor.sergeev, jakehehrlich, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63239 llvm-svn: 364254	2019-06-25 00:02:04 +00:00
David Blaikie	6feabba84d	DataExtractor: use decodeSLEB128 to implement getSLEB128 Should've been NFC, but turns out DataExtractor had better test coverage for decoding SLEB128 than the decodeSLEB128 did - revealing a couple of bugs (one in the error handling, another in sign extension). So fixed those to get the DataExtractor tests passing again. llvm-svn: 364253	2019-06-24 23:45:18 +00:00
Seiya Nuta	7dadcad7b6	[llvm-objcopy][MachO] Fix strict-aliasing warning. NFCI Summary: Use MachOObjectFile::isRelocationScattered instead of reinterpret_cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42360 Reviewers: alexshap, rupprecht, jhenderson Reviewed By: alexshap Subscribers: dendibakh, bjope, uabelho, jakehehrlich, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63699 llvm-svn: 364252	2019-06-24 23:39:01 +00:00
Tim Shen	e327392ab9	Revert "[NVPTX][NFC] Fix documentation for shfl instructions." The original documentation is correct as it matches the C++ builtins. llvm-svn: 364250	2019-06-24 23:29:20 +00:00
Douglas Yung	527742cd31	[NFC] Fix tests added in r364225 which failed on Windows due to incorrect path separators. llvm-svn: 364249	2019-06-24 23:16:32 +00:00
Tim Shen	d89bfe68ec	[NVPTX][NFC] Fix documentation for shfl instructions. llvm-svn: 364248	2019-06-24 23:16:32 +00:00
Vitaly Buka	57f0d515a8	[NFC] Add missing consts into memoryaccess_def_iterator llvm-svn: 364247	2019-06-24 22:42:53 +00:00
Sanjay Patel	7dcf742952	[InstCombine] squash is-not-power-of-2 using ctpop This is the Demorgan'd 'not' of the pattern handled in: D63660 / rL364153 This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is not a power-of-2 using ctpop(X) > 1, so combining that with an is-zero check of the input is the same as testing if not exactly 1 bit is set: (X == 0) || (ctpop(X) u> 1) --> ctpop(X) != 1 llvm-svn: 364246	2019-06-24 22:35:26 +00:00
Matt Arsenault	d2660f447c	AMDGPU/GlobalISel: Add tests for regbankselect of v2s16 and/or/xor llvm-svn: 364244	2019-06-24 22:21:02 +00:00
Vasileios Porpodas	dbd03cf1a4	[SLP] NFC: Fixed typo in comment llvm-svn: 364237	2019-06-24 21:40:48 +00:00
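
Several of the InstCombine commits listed above (llvm-svn 364246, 364255, 364256) describe integer folds that are small enough to check exhaustively. The following is a minimal sketch, not LLVM code: it models 8-bit unsigned values in Python and compares each original pattern with its folded form. The 8-bit width and every function name here are assumptions made for illustration only.

```python
BITS = 8                   # assumed width; the folds are stated for any width
MASK = (1 << BITS) - 1     # wrap results to BITS bits, like an iN value


def ctpop(x):
    """Population count of x treated as a BITS-wide unsigned value."""
    return bin(x & MASK).count("1")


# llvm-svn 364246: (X == 0) || (ctpop(X) u> 1)  -->  ctpop(X) != 1
def not_single_bit_original(x):
    return x == 0 or ctpop(x) > 1


def not_single_bit_folded(x):
    return ctpop(x) != 1


# llvm-svn 364256: icmp ult (shl %x, C2), C1, with C1 a power of two
#                  -->  icmp eq (and %x, (lshr -C1, C2)), 0
def shl_ult_original(x, c1, c2):
    return ((x << c2) & MASK) < c1


def shl_ult_folded(x, c1, c2):
    return (x & (((-c1) & MASK) >> c2)) == 0


# llvm-svn 364255: icmp eq (and %x, C), 0, with -C a power of two
#                  -->  %x u< (-C)
def and_eq_original(x, c):
    return (x & c) == 0


def and_eq_folded(x, c):
    return x < ((-c) & MASK)


def check_all_folds():
    """Return True if every fold agrees with its original on all inputs."""
    powers_of_two = [1 << k for k in range(BITS)]
    for x in range(MASK + 1):
        if not_single_bit_original(x) != not_single_bit_folded(x):
            return False
        for c1 in powers_of_two:
            for c2 in range(BITS):
                if shl_ult_original(x, c1, c2) != shl_ult_folded(x, c1, c2):
                    return False
            c = (-c1) & MASK  # so that -C == c1 is a power of two
            if and_eq_original(x, c) != and_eq_folded(x, c):
                return False
    return True
```

Calling `check_all_folds()` walks every 8-bit input and every power-of-two constant. The `uge` and `ne` variants named in the commit titles are the logical negations of the predicates checked here, so they are covered by the same comparison.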


180802 Commits