llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Benjamin Kramer	bfaa3adccc	Move symbols from the global namespace into (anonymous) namespaces. NFC. llvm-svn: 294837	2017-02-11 11:06:55 +00:00
Craig Topper	6e91ba68a2	[AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables. llvm-svn: 294830	2017-02-11 07:01:40 +00:00
Craig Topper	139a0c4709	[AVX-512] Fix apparent typo in instruction name VMOVSSDrr_REV->VMOVSDZrr_REV. llvm-svn: 294829	2017-02-11 07:01:38 +00:00
Craig Topper	90a5367a80	[AVX-512] Add VPSADBW instructions to load folding tables. llvm-svn: 294827	2017-02-11 06:24:03 +00:00
Evgeny Stupachenko	cf81c93057	The patch fixes r294821 Summary: Update register match for windows testing From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294825	2017-02-11 05:39:00 +00:00
Craig Topper	1c93099805	[X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available. Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available. This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp. Overall I think this produces better results in the modified test cases. llvm-svn: 294824	2017-02-11 05:32:57 +00:00
Peter Collingbourne	fd1115698d	Address Mehdi's post-commit review comments on r294795. llvm-svn: 294822	2017-02-11 03:19:22 +00:00
Evgeny Stupachenko	3bbd6ed985	Fix PR23384 (under "-lsr-insns-cost" option) Summary: The patch adds instructions number generated by a solution to LSR cost under "-lsr-insns-cost" option. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D28307 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294821	2017-02-11 02:57:43 +00:00
Ahmed Bougacha	6d1fcfb76b	[ARM] Make f16 interleaved accesses expensive. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Teach the cost model to consider f16 interleaved operations as expensive. Otherwise, we are all but guaranteed to end up with a large block of scalarized vector code. llvm-svn: 294819	2017-02-11 01:53:04 +00:00
Ahmed Bougacha	c367f3652b	[ARM] Don't lower f16 interleaved accesses. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818	2017-02-11 01:53:00 +00:00
Ahmed Bougacha	a2c63859b5	[ARM] Unique some redundant CHECK lines. NFC. llvm-svn: 294817	2017-02-11 01:52:57 +00:00
Wei Mi	4f28dc3b0c	[LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop. The recommit includes some changes of testcases. No functional change to the patch. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handle inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 294814	2017-02-11 00:50:23 +00:00
Eugene Zelenko	f7c046da8b	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 294813	2017-02-11 00:27:28 +00:00
Matthias Braun	63a2ffaff0	config-ix.cmake: Search for CMAKE_XCRUN before using it. This was previously searched in CMakeLists.txt unconditionally but as of r294371 it is only searched in some circumstances. Repeating the search in config-ix.cmake to make this robust and hopefully fix the macOS Asan+Ubsan jenkins build. llvm-svn: 294811	2017-02-11 00:14:01 +00:00
Chandler Carruth	86f0af3b25	[PM] Fix a bug in how I ported LoopDeletion to the new PM. This was marking the loop for deletion after the loop was deleted. This almost works, except that when we do any kind of debug logging it starts reading the name of the loop from deleted memory or otherwise blowing up. This can fail in a bunch of ways. I recently added a test that always does this, and it started failing on the sanitizer bots. The fix is to mark the loop as deleted in the loop PM infrastructure before we remove the loop. We can do this by passing the updater into the routine. That also lets us simplify a bunch of other interface components here for a net win. llvm-svn: 294810	2017-02-11 00:09:30 +00:00
Dan Gohman	5c8e46c308	[WebAssembly] Remove old experimental disassembler code. Remove support for disassembling an old experimental wasm binary format, which is no longer in use anywhere. llvm-svn: 294809	2017-02-11 00:02:23 +00:00
Saleem Abdulrasool	7641b1278e	vim: add `returned` keyword The `returned` keyword was added in SVN r179925. Update the vim syntax rules. llvm-svn: 294808	2017-02-10 23:57:11 +00:00
Davide Italiano	777d6d3e01	[LTO] Share the optimization remarks setup between Thin/Full LTO. llvm-svn: 294807	2017-02-10 23:49:38 +00:00
Krzysztof Parzyszek	5f0b4b1ff0	[Hexagon] Introduce Hexagon V62 llvm-svn: 294805	2017-02-10 23:46:45 +00:00
Davide Italiano	306e44c0af	[tests] Be explicit about the files we want to remove. Hopefully Windows will stop whining after this change. llvm-svn: 294801	2017-02-10 22:55:37 +00:00
Peter Collingbourne	33fe886dfb	IR: Function summary extensions for whole-program devirtualization pass. The summary information includes all uses of llvm.type.test and llvm.type.checked.load intrinsics that can be used to devirtualize calls, including any constant arguments for virtual constant propagation. Differential Revision: https://reviews.llvm.org/D29734 llvm-svn: 294795	2017-02-10 22:29:38 +00:00
Benjamin Kramer	dde561f416	[InstCombine] Move class into anonymous namespace. NFC. This is necessary to avoid warnings from GCC. InstCombineLoadStoreAlloca.cpp:238:7: error: 'PointerReplacer' declared with greater visibility than the type of its field 'PointerReplacer::IC' llvm-svn: 294794	2017-02-10 22:26:35 +00:00
Davide Italiano	a96a2e7ce2	[lib/LTO] Rework optimization remarkers setup. This makes this code much more similar to what ThinLTO is using (also API wise), so now we can probably use a single code path instead of copying stuff around. llvm-svn: 294792	2017-02-10 22:16:17 +00:00
Benjamin Kramer	3166dbbf8b	[PPC] Silence warning in Release builds. llvm-svn: 294791	2017-02-10 22:13:34 +00:00
Davide Italiano	d6da2984fc	[LTO] Make these tests robust across multiple iterations. Same as r294784, but for regular LTO. llvm-svn: 294789	2017-02-10 22:11:06 +00:00
Benjamin Kramer	d184f0b9df	[InstCombine] Silence unused variable warning in Release builds. llvm-svn: 294788	2017-02-10 22:04:17 +00:00
Nico Weber	4b68201166	Revert r294532, it caused PR31935 llvm-svn: 294787	2017-02-10 21:57:30 +00:00
Yaxun Liu	5434749cfc	Fix invalid addrspacecast due to combining alloca with global var For function-scope variables with large initialisation list, FE usually generates a global variable to hold the initializer, then generates memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst identifies such allocas which are accessed only by reading and replaces them with the global variable. This is done by casting the global variable to the type of the alloca and replacing all references. However, when the global variable is in a different address space which is disjoint with addr space 0 (e.g. for IR generated from OpenCL, global variable cannot be in private addr space i.e. addr space 0), casting the global variable to addr space 0 results in invalid IR for certain targets (e.g. amdgpu). To fix this issue, when the global variable is not in addr space 0, instead of casting it to addr space 0, this patch chases down the uses of alloca until reaching the load instructions, then replaces load from alloca with load from the global variable. If during the chasing bitcast and GEP are encountered, new bitcast and GEP based on the global variable are generated and used in the load instructions. Differential Revision: https://reviews.llvm.org/D27283 llvm-svn: 294786	2017-02-10 21:46:07 +00:00
Davide Italiano	d4d29a84d1	[ThinLTO] Make this test more robust across multiple runs. The yaml emitter files are left around otherwise. llvm-svn: 294784	2017-02-10 21:35:31 +00:00
Tim Shen	6d8321dcc4	Fix a silly syntax error. llvm-svn: 294783	2017-02-10 21:17:35 +00:00
Dehao Chen	a75059ebaa	Encode duplication factor from loop vectorization and loop unrolling to discriminator. Summary: This patch starts the implementation as discuss in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimization that fall into this category is loop vectorization and loop unroll. This patch records the duplication factor for these 2 optimizations. The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default. Reviewers: probinson, aprantl, davidxl, hfinkel, echristo Reviewed By: hfinkel Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26420 llvm-svn: 294782	2017-02-10 21:09:07 +00:00
Tim Shen	564288cf4a	[XRay] Implement powerpc64le xray. Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc. Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how. Reviewers: dberris, echristo, iteratee, kbarton, hfinkel Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29742 llvm-svn: 294781	2017-02-10 21:03:24 +00:00
Krzysztof Parzyszek	70d0b6caa5	[Hexagon] Remove unused .td files llvm-svn: 294775	2017-02-10 19:54:00 +00:00
Ahmed Bougacha	905783e0c0	[X86] Bitcast subvector before broadcasting it. Since r274013, we've been looking through bitcasts on broadcast inputs. In the scalar-folding case (from a load, build_vector, or sc2vec), the input type didn't matter, as we'd simply bitcast the resulting scalar back. However, when broadcasting a 128-bit-lane-aligned element, we create an EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector of the original input type. llvm-svn: 294774	2017-02-10 19:51:47 +00:00
Kevin Enderby	1435f11dff	Yet another fix llvm-objdump so it picks a good CPU based for Mach-O files, in this case for CPU_SUBTYPE_ARM64_ALL. For this cpusubtype it should default to a cyclone CPU to give proper disassembly without a -mcpu= flag. rdar://27767188 llvm-svn: 294771	2017-02-10 19:27:10 +00:00
Tim Northover	e0ca173664	GlobalISel: drop lifetime intrinsics during translation. We don't use them yet and they just cause problems. llvm-svn: 294770	2017-02-10 19:10:38 +00:00
Marcos Pividori	3f6b34a274	[libFuzzer] Use stoull instead of stol to ensure 64 bits. Differential revision: https://reviews.llvm.org/D29831 llvm-svn: 294769	2017-02-10 18:44:14 +00:00
Simon Pilgrim	6cca34ea15	[X86][AVX512] Add vector rotate tests for AVX512 targets AVX512 does have vector rotate instructions, but we don't lower to them yet llvm-svn: 294766	2017-02-10 18:06:11 +00:00
Amaury Sechet	975b70d555	Autogenerate results for test/CodeGen/X86/peep-test-4.ll . NFC llvm-svn: 294765	2017-02-10 17:57:48 +00:00
Amaury Sechet	2016af35f2	Autogenerate results for test/CodeGen/X86/pr14314.ll . NFC llvm-svn: 294764	2017-02-10 17:57:46 +00:00
John Brawn	b7fc3ce6ba	[ARM] Fix incorrect mask bits in MSR encoding for write_register intrinsic In the encoding of system registers in the M-class MSR instruction the mask bits should be 2 for registers that don't take a _<bits> qualifier (the instruction is unpredictable otherwise), and should also be 2 if the register takes a _<bits> qualifier but it's not present as no _<bits> is an alias for _nzcvq. Differential Revision: https://reviews.llvm.org/D29828 llvm-svn: 294762	2017-02-10 17:41:08 +00:00
Amaury Sechet	64c6b32ce8	Use autogenerate check in CodeGen/X86/pr16031.ll . NFC llvm-svn: 294761	2017-02-10 17:26:21 +00:00
Mehdi Amini	ee53e4e0c5	Fix doc for `-opt-bisect-limit`: the LTO option prefix for lld is -mllvm Thanks Davide to catch it in my previous patch. llvm-svn: 294759	2017-02-10 17:16:00 +00:00
Alexander Kornienko	58c8076a16	Add a virtual destructor for LegalizerInfo. lib/Target/X86/X86TargetMachine.cpp has a code that deletes an instance of a LegalizerInfo descendant via a pointer to base. llvm-svn: 294757	2017-02-10 17:00:27 +00:00
Amaury Sechet	3ca36299b2	Check full codegen in CodeGen/X86/i256-add.ll NFC llvm-svn: 294756	2017-02-10 16:34:17 +00:00
Matthew Simpson	d972d92018	[LV] Remove type restriction for vector phi creation We previously only created a vector phi node for an induction variable if its type matched the type of the canonical induction variable. Differential Revision: https://reviews.llvm.org/D29776 llvm-svn: 294755	2017-02-10 16:15:26 +00:00
Krzysztof Parzyszek	45ad4809a2	[Hexagon] Replace instruction definitions with auto-generated ones llvm-svn: 294753	2017-02-10 15:33:13 +00:00
Rafael Espindola	26a1636cb7	Move some error handling down to MCStreamer. This makes sure we get the same redefinition rules regardless of who is printing (asm parser, codegen) and to what (asm, obj). This fixes an unintentional regression in r293936. llvm-svn: 294752	2017-02-10 15:13:12 +00:00
Simon Pilgrim	1393a74165	[X86][SSE] Added chained FDIV test cases for D26855 Tests to demonstrate throughput-latency decision between div and rcp on faster hardware such as Haswell llvm-svn: 294750	2017-02-10 14:56:12 +00:00
Simon Pilgrim	91218fd943	[DAGCombine] Allow vector constant folding of any value type before type legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749	2017-02-10 14:37:25 +00:00

144701 Commits