llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00

Author	SHA1	Message	Date
David Green	13ea5d8f72	[ARM] More MVE load/store tests for offsets around the negative limit. NFC llvm-svn: 370726	2019-09-03 09:42:16 +00:00
Bjorn Pettersson	e93d54f960	[LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit Summary: Fold-tail currently supports reduction last-vector-value live-out's, but has yet to support last-scalar-value live-outs, including non-header phi's. As it relies on AllowedExit in order to detect them and bail out we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles. Solves https://bugs.llvm.org/show_bug.cgi?id=43166 Reviewers: fhahn, Ayal Reviewed By: fhahn, Ayal Subscribers: anna, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67074 llvm-svn: 370721	2019-09-03 09:33:55 +00:00
Bjorn Pettersson	f4d7c30670	[LV] Precommit test case showing miscompile from PR43166. NFC Summary: Precommit test case showing miscompile from PR43166. Reviewers: fhahn, Ayal Reviewed By: fhahn Subscribers: rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67072 llvm-svn: 370720	2019-09-03 09:33:40 +00:00
Sjoerd Meijer	654ab28c6e	[LV] Tail-folding, runtime scev checks Now that we allow tail-folding, not only when we optimise for size, make sure we do not run in this assert. Differential revision: https://reviews.llvm.org/D66932 llvm-svn: 370711	2019-09-03 08:53:02 +00:00
Sjoerd Meijer	7b5470772a	[LV] Tail-folding with runtime memory checks The loop vectorizer was running in an assert when it tried to fold the tail and had to emit runtime memory disambiguation checks. Differential revision: https://reviews.llvm.org/D66803 llvm-svn: 370707	2019-09-03 08:38:24 +00:00
James Molloy	33edb0a6f1	[MachinePipeliner] Add a way to unit-test the schedule emitter Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705	2019-09-03 08:20:31 +00:00
Sam Tebbs	f88ca36af5	[ARM] Select vmla This patch adds vmla selection. Differential revision: https://reviews.llvm.org/D66297 llvm-svn: 370704	2019-09-03 08:17:46 +00:00
Craig Topper	26d299992f	[X86] Simplify the setOperationAction handling for fp_to_uint by improving the Custom handler a bit. This merges the 32-bit and 64-bit mode code to just use Custom for both i32 and i64. We already had most of the handling in the custom handling due to the AVX512 having legal fp_to_uint. Just needed to add the i32->i64 promotion handling. Refactor the fp_to_uint code in the custom handler to simplify the number of times we check things. Tweak cost model tables to match the default handling we were getting due to Expand before. llvm-svn: 370700	2019-09-03 05:57:22 +00:00
Craig Topper	65cc8b4a7f	[X86] Don't use Expand for i32 fp_to_uint on SSE1/2 targets on 32-bit target. Use Custom lowering instead. Fall back to default expansion only when the scalar FP type belongs in an XMM register. This improves lowering for i32 to fp80, and also i32 to double on SSE1 only. llvm-svn: 370699	2019-09-03 05:57:18 +00:00
Craig Topper	01e84c517f	[X86] Add an exhaustive test for i32 fptosi/fptoui across different triples and features. llvm-svn: 370698	2019-09-03 05:57:14 +00:00
Craig Topper	434a360032	[LegalizeDAG] Pass DAG to two calls to SDNode::dump in debug prints so that they will print target specific nodes correctly. The dump methods can only print target node names correctly if they can get access to the TLI object. llvm-svn: 370694	2019-09-03 02:51:14 +00:00
Craig Topper	af345a0f8e	[X86] Custom promote i32->f80 uint_to_fp on AVX512 64-bit targets. Reuse the same code to promote all i32 uint_to_fp on 64-bit targets to simplify the X86ISelLowering constructor. llvm-svn: 370693	2019-09-03 02:51:10 +00:00
Simon Pilgrim	76d3d5fb33	[CostModel][X86] Add scalar sext/zext cost tests llvm-svn: 370684	2019-09-02 21:02:51 +00:00
Craig Topper	4194f27f3f	[X86] Enable fp128 as a legal type with SSE1 rather than with MMX. FP128 values are passed in xmm registers so should be asssociated with an SSE feature rather than MMX which uses a different set of registers. llc enables sse1 and sse2 by default with x86_64. But does not enable mmx. Clang enables all 3 features by default. I've tried to add command lines to test with -sse where possible, but any test that returns a value in an xmm register fails with a fatal error with -sse since we have no defined ABI for that scenario. llvm-svn: 370682	2019-09-02 20:16:30 +00:00
David Green	b3389c16e8	[ARM] MVE predicate bitcast test and VPSEL adjustment. NFC llvm-svn: 370678	2019-09-02 19:03:35 +00:00
David Green	f5fd08fc0d	[ARM] Use MQPR not QPR for MVE registers We should be using MQPR, and if we don't we can get COPYs and PHIs created for QPR. These get folded into instructions, failing verification checks. Differential revision: https://reviews.llvm.org/D66214 llvm-svn: 370676	2019-09-02 17:18:23 +00:00
Robert Lougher	a7d1d8c4ff	[TargetLowering][PS4] Add sincos(f) lib functions when target is PS4 PS4 supports sincosf and sincos. Adding the library functions enables the sin(f)+cos(f) -> sincos(f) optimization. Differential Revision: https://reviews.llvm.org/D67009 llvm-svn: 370675	2019-09-02 16:53:32 +00:00
Ulrich Weigand	68d6f27a24	[SystemZ] Support constrained fpto[su]i intrinsics Now that constrained fpto[su]i intrinsic are available, add codegen support to the SystemZ backend. In addition to pure back-end changes, I've also needed to add the strict_fp_to_[su]int and any_fp_to_[su]int pattern fragments in the obvious way. llvm-svn: 370674	2019-09-02 16:49:29 +00:00
Kerry McLaughlin	acca14efdf	[SVE][Inline-Asm] Support for SVE asm operands Summary: Adds the following inline asm constraints for SVE: - w: SVE vector register with full range, Z0 to Z31 - x: Restricted to registers Z0 to Z15 inclusive. - y: Restricted to registers Z0 to Z7 inclusive. This change also adds the "z" modifier to interpret a register as an SVE register. Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness. Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, rengolin, cameron.mcinally, greened Reviewed By: sdesmalen Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66302 llvm-svn: 370673	2019-09-02 16:12:31 +00:00
Simon Pilgrim	3e7cad2294	[X86] getPMOVMSKB - add MVT::v64i8 handling and remove from combineBitcastvxi1. NFCI. llvm-svn: 370670	2019-09-02 15:10:35 +00:00
George Rimar	2e52b72fcb	Recommit r370661 "[llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name." Fix: add a 'consumeError()' call to ObjectFile.cpp. This error was never checked. Original commit message: It adds a test case for a problem fixed by D66976 <https://reviews.llvm.org/D66976>. It was introduced by me in D66089 <https://reviews.llvm.org/D66089>. The error reported was never consumed because of a wrong variable name used, so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used. Differential revision: https://reviews.llvm.org/D67002 llvm-svn: 370669	2019-09-02 14:57:35 +00:00
Sanjay Patel	6a053c6ace	[DAGCombiner] try to form test+set out of shift+mask patterns The motivating bugs are: https://bugs.llvm.org/show_bug.cgi?id=41340 https://bugs.llvm.org/show_bug.cgi?id=42697 As discussed there, we could view this as a failure of IR canonicalization, but then we would need to implement a backend fixup with target overrides to get this right in all cases. Instead, we can just view this as a codegen opportunity. It's not even clear for x86 exactly when we should favor test+set; some CPUs have better theoretical throughput for the ALU ops than bt/test. This patch is made more complicated than I expected because there's an early DAGCombine for 'and' that can change types of the intermediate ops via trunc+anyext. Differential Revision: https://reviews.llvm.org/D66687 llvm-svn: 370668	2019-09-02 14:52:09 +00:00
Jay Foad	1f578036a2	Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0" Summary: D61491 caused us to use relocs when they're not strictly necessary, to refer to symbols in the text section. This is a pessimization and it's a problem for some loaders that don't support relocs yet. Reviewers: nhaehnle, arsenm, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65813 llvm-svn: 370667	2019-09-02 14:40:57 +00:00
Dmitry Preobrazhensky	011ce9a859	[AMDGPU][MC][GFX10] Corrected constant bus checks to exclude null See AMD SWDEV-157286 Reviewers: atamazov, arsenm Differential Revision: https://reviews.llvm.org/D65229 llvm-svn: 370665	2019-09-02 14:19:52 +00:00
Thomas Preud'homme	f7825f2b74	[FileCheck] Make NumericVariable ctor explicit Summary: Make FileCheckNumericVariable constructor explicit to avoid implicit conversions from StringRef. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66640 llvm-svn: 370664	2019-09-02 14:04:05 +00:00
Thomas Preud'homme	393fb2cc2e	[FileCheck] Forbid using var defined on same line Summary: Commit r366897 introduced the possibility to set a variable from an expression, such as [[#VAR2:VAR1+3]]. While introducing this feature, it introduced extra logic to allow using such a variable on the same line later on. Unfortunately that extra logic is flawed as it relies on a mapping from variable to expression defining it when the mapping is from variable definition to expression. This flaw causes among other issues PR42896. This commit avoids the problem by forbidding all use of a variable defined on the same line, and removes the now useless logic. Redesign will be done in a later commit because it will require some amount of refactoring first for the solution to be clean. One example is the need for some sort of transaction mechanism to set a variable temporarily and from an expression and rollback if the CHECK pattern does not match so that diagnostics show the right variable values. Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D66141 llvm-svn: 370663	2019-09-02 14:04:00 +00:00
George Rimar	59a79d45c8	Revert r370661 "[llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name" It broke BB: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16955/steps/test/logs/stdio Expected<T> must be checked before access or destruction. Unchecked Expected<T> contained error: a section [index 1] has an invalid sh_name (0xffff) offset which goes past the end of the section name string tableStack dump: 0. Program arguments: /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/test/tools/llvm-nm/Output/format-sysv-section.test.tmp2.o --format=sysv #0 0x00000000008af7c4 PrintStackTraceSignalHandler(void) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8af7c4) #1 0x00000000008ad8be llvm::sys::RunSignalHandlers() (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8ad8be) #2 0x00000000008afbd8 SignalHandler(int) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8afbd8) #3 0x00007f0a6b989730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730) #4 0x00007f0a6b48d7bb raise (/lib/x86_64-linux-gnu/libc.so.6+0x377bb) #5 0x00007f0a6b478535 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22535) #6 0x000000000042004b llvm::Expected<llvm::StringRef>::fatalUncheckedExpected() const (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x42004b) #7 0x00000000008367f5 (/sv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8367f5) #8 0x0000000000817b80 llvm::object::IRObjectFile::findBitcodeInObject(llvm::object::ObjectFile const&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x817b80) #9 0x0000000000838416 llvm::object::SymbolicFile::createSymbolicFile(llvm::MemoryBufferRef, llvm::file_magic, llvm::LLVMContext) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x838416) #10 0x00000000007f36cb llvm::object::createBinary(llvm::MemoryBufferRef, llvm::LLVMContext*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x7f36cb) #11 0x0000000000413123 dumpSymbolNamesFromFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x413123) #12 0x0000000000412e38 main (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x412e38) #13 0x00007f0a6b47a09b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409b) #14 0x00000000004120da _start (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x4120da) FileCheck error: '-' is empty. FileCheck command line: /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.src/test/tools/llvm-nm/format-sysv-section.test --check-prefix=ERR -- llvm-svn: 370662	2019-09-02 14:03:50 +00:00
George Rimar	8942653f23	[llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name. It adds a test case for a problem fixed by D66976. It was introduced by me in D66089. The error reported was never consumed because of a wrong variable name used, so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used. Differential revision: https://reviews.llvm.org/D67002 llvm-svn: 370661	2019-09-02 13:54:45 +00:00
Dmitry Preobrazhensky	8fbb623003	[AMDGPU][MC][GFX10] Enabled null with 64-bit operands See Bug 42745: https://bugs.llvm.org/show_bug.cgi?id=42745 Reviewers: atamazov, arsenm https://reviews.llvm.org/D65231 llvm-svn: 370660	2019-09-02 13:42:25 +00:00
Sanjay Patel	ed830e5e7b	[InstCombine] recognize bswap disguised as shufflevector bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N8} --> bswap (bitcast X to i{N8}) In PR43146: https://bugs.llvm.org/show_bug.cgi?id=43146 ...we have a more complicated case where SLP is making a mess of bswap. This patch won't do anything for that currently, but we need to improve bswap recognition in instcombine, SLP, and/or a standalone pass to avoid that problem. This is limited using the data-layout so we don't try to do this transform with actual vector types. The backend does not appear to have folds to convert in either direction, so we don't want to mess up something that is actually better lowered as a shuffle. On x86, we're trading something like this: vmovd %edi, %xmm0 vpshufb LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u] vmovd %xmm0, %eax For: movl %edi, %eax bswapl %eax Differential Revision: https://reviews.llvm.org/D66965 llvm-svn: 370659	2019-09-02 13:33:20 +00:00
Martin Storsjo	eec6c132ce	[test] [llvm-dlltool] Improve test strictness a little. NFC. llvm-svn: 370657	2019-09-02 13:28:21 +00:00
Martin Storsjo	0b651a90a6	[llvm-dlltool] Handle external and internal names with differing decoration Also add a missed part of the test from SVN r369747. Differential Revision: https://reviews.llvm.org/D66996 llvm-svn: 370656	2019-09-02 13:28:16 +00:00
Martin Storsjo	d718a53c2e	[llvm-dlltool] Remove support for implying output name I don't see GNU dlltool supporting doing this; with only a -d option and no -l option, GNU dlltool runs successfully but doesn't write any output file. Differential Revision: https://reviews.llvm.org/D65645 llvm-svn: 370655	2019-09-02 13:28:07 +00:00
Dmitry Preobrazhensky	93effe4818	[AMDGPU][MC][GFX10] Corrected constant bus limit for 64-bit shift instructions See bug 42744: https://bugs.llvm.org/show_bug.cgi?id=42744 Reviewers: atamazov, arsenm Differential Revision: https://reviews.llvm.org/D65228 llvm-svn: 370652	2019-09-02 12:50:05 +00:00
Andrea Di Biagio	08b9b0a5ac	[X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions. On BtVer2 conditional SIMD stores are heavily microcoded. The latency is directly proportional to the number of packed elements extracted from the input vector. Also, according to micro-benchmarks, most of the computation seems to be done in the integer unit. Only a minority of the uOPs is executed by the FPU. The observed behaviour on the FPU looks similar to this: - The input MASK value is moved to the Integer Unit -- [ a VMOVMSK-like uOP-executed on JFPU0]. - In parallel, each element of the input XMM/YMM is extracted and then sent to the IntegerUnit through JFPU1. As expected, a (conditional) store is executed for every extracted element. Interestingly, a (speculative) load is executed for every extracted element too. It is as-if a "LOAD - BIT_EXTRACT- CMOV" sequence of uOPs is repeated by the integer unit for every contionally stored element. VMASKMOVDQU is a special case: the number of speculative loads is always 2 (presumably, one load per quadword). That means, extra shifts and masking is performed on (one of) the loaded quadwords before each conditional store (that also explains the big number of non-FP uOPs retired). This patch replaces the existing writes for conditional SIMD stores (i.e. WriteFMaskedStore, and WriteFMaskedStoreY) with the following new writes: WriteFMaskedStore32 [ XMM Packed Single ] WriteFMaskedStore32Y [ YMM Packed Single ] WriteFMaskedStore64 [ XMM Packed Double ] WriteFMaskedStore64Y [ YMM Packed Double ] Added a wrapper class named X86SchedWriteMaskMove in X86Schedule.td to describe both RM and MR variants for conditional SIMD moves in a single tablegen definition. Instances of that class are then passed in input to multiclass avx_movmask_rm when constructing MASKMOVPS/PD definitions. Since this patch introduces new writes, I had to update all the X86 scheduling models. Differential Revision: https://reviews.llvm.org/D66801 llvm-svn: 370649	2019-09-02 12:32:28 +00:00
Jeremy Morse	7698ed73e1	[DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union. The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details. Un x-fail a test that was suffering from this from a previous patch. Differential Revision: https://reviews.llvm.org/D66895 llvm-svn: 370648	2019-09-02 12:28:36 +00:00
James Henderson	277b2b47c0	[llvm-strings][test] Merge two closely related tests This is a follow-up to feedback on D66015. Reviewed by: grimar Differential Revision: https://reviews.llvm.org/D67069 llvm-svn: 370643	2019-09-02 11:42:30 +00:00
Nandor Licker	bd1a6d1125	Revert [Clang Interpreter] Initial patch for the constexpr interpreter This reverts r370636 (git commit 8327fed9475a14c3376b4860c75370c730e08f33) llvm-svn: 370642	2019-09-02 11:34:47 +00:00
Simon Pilgrim	a05c487c5f	[X86] combineHorizontalPredicateResult - pull out repeated getTargetLoweringInfo() calls. NFCI. llvm-svn: 370637	2019-09-02 10:42:48 +00:00
Nandor Licker	aec22e274e	[Clang Interpreter] Initial patch for the constexpr interpreter Summary: This patch introduces the skeleton of the constexpr interpreter, capable of evaluating a simple constexpr functions consisting of if statements. The interpreter is described in more detail in the RFC. Further patches will add more features. Reviewers: Bigcheese, jfb, rsmith Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64146 llvm-svn: 370636	2019-09-02 10:38:08 +00:00
Piotr Sobczak	57bd509a8c	[AMDGPU] Add test Summary: Add test checking that the redundant immediate MOV instruction (by-product of handling phi nodes) is not found in the generated code. Reviewers: arsenm, anton-afanasyev, craig.topper, rtereshin, bogner Reviewed By: arsenm Subscribers: kzhuravl, yaxunl, dstuttard, tpr, t-tye, wdng, jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63860 llvm-svn: 370634	2019-09-02 10:02:54 +00:00
George Rimar	207e4c4399	[yaml2obj] - Allow overriding sh_name fields of the sections. This is in line with the previous changes which allowed to override the sh_offset/sh_size and useful for writing test cases. Differential revision: https://reviews.llvm.org/D66998 llvm-svn: 370633	2019-09-02 09:47:17 +00:00
Djordje Todorovic	4e774ff0ea	[DWARFVerifier] Verify GNU extensions of call site DWARF symbols Verify that the call site DWARF symbols (added during the implementation of the debug entry values feature) are generated properly. Differential Revision: https://reviews.llvm.org/D66865 llvm-svn: 370631	2019-09-02 09:20:46 +00:00
Amara Emerson	947ae930cb	[AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating the merges. Fixes PR43171. llvm-svn: 370627	2019-09-02 08:18:55 +00:00
Craig Topper	e2950db97a	[X86] Add initial support for unfolding broadcast loads from arithmetic instructions to enable LICM hoisting of the load MachineLICM can hoist an invariant load, but if that load is folded it needs to be unfolded. On AVX512 sometimes this load is an broadcast load which we were previously unable to unfold. This patch adds initial support for that with a very basic list of supported instructions as a starting point. Differential Revision: https://reviews.llvm.org/D67017 llvm-svn: 370620	2019-09-01 22:14:36 +00:00
Sanjay Patel	10ab80f054	[DAGCombiner] improve throughput of shift+logic+shift The motivating case for this is a long way from here: https://bugs.llvm.org/show_bug.cgi?id=43146 ...but I think this is where we have to start. We need to canonicalize/optimize sequences of shift and logic to ease pattern matching for things like bswap and improve perf in general. But without the artificial limit of '!LegalTypes' (early combining), there are a lot of test diffs, and not all are good. In the minimal tests added for this proposal, x86 should have better throughput in all cases. AArch64 is neutral for scalar tests because it can fold shifts into bitwise logic ops. There are 3 shift opcodes and 3 logic opcodes for a total of 9 possible patterns: https://rise4fun.com/Alive/VlI https://rise4fun.com/Alive/n1m https://rise4fun.com/Alive/1Vn Differential Revision: https://reviews.llvm.org/D67021 llvm-svn: 370617	2019-09-01 18:38:15 +00:00
Simon Pilgrim	0e0bc115cf	Fix MSVC unreferenced formal parameter warning. NFCI. llvm-svn: 370615	2019-09-01 16:04:51 +00:00
Simon Pilgrim	ed54aaac69	Fix MSVC unreferenced formal parameter warning. NFCI. llvm-svn: 370614	2019-09-01 16:04:38 +00:00
Simon Pilgrim	48da3eae88	[X86][AVX] Rename + cleanup lowerShuffleAsLanePermuteAndBlend. NFCI. Rename to lowerShuffleAsLanePermuteAndShuffle to make it clear that not just blends are performed. Cleanup the in-lane shuffle mask generation to make it more obvious what's going on. Some prep work noticed while investigating the poor shuffle code mentioned in D66004. llvm-svn: 370613	2019-09-01 16:04:28 +00:00
Simon Pilgrim	1b1ef4b3b0	Fix shadow variable warning. NFCI. llvm-svn: 370610	2019-09-01 13:10:18 +00:00

... 3 4 5 6 7 ...

184453 Commits