llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Craig Topper	f724cae1ff	[SelectionDAG] Add a isel matcher op to check the type of node results other than result 0. I plan to use this to check the type of the mask result of masked gathers in the X86 backend. llvm-svn: 318820	2017-11-22 07:11:01 +00:00
Max Kazantsev	40279b1003	[SCEV] Strengthen variance condition in calculateLoopDisposition Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively. When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if `L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are consecutive sibling loops within the parent loop. In this case, `AR2` is also varying w.r.t. `L1`, but we don't correctly identify it. It can lead, for example, to attempt of incorrect folding. Consider: AR1 = {a,+,b}<L1> AR2 = {c,+,d}<L2> EXAR2 = sext(AR1) MUL = mul AR1, EXAR2 If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect because `AR2` is not available on entrance of `L1`. Both situations "`L1` contains `L2`" and "`L1` precedes sibling loop `L2`" can be handled with one check: "header of `L1` dominates header of `L2`". This patch replaces the old insufficient check with this one. Differential Revision: https://reviews.llvm.org/D39453 llvm-svn: 318819	2017-11-22 06:21:39 +00:00
Davide Italiano	b6d9bc9a26	[SCCP] Pick the right lattice value for constants. After the dataflow algorithm proves that an argument is constant, it replaces it value with the integer constant and drops the lattice value associated to the DEF. e.g. in the example we have @f() that's called twice: call @f(undef, ...) call @f(2, ...) `undef` MEET 2 = 2 so we replace the argument and all its uses with the constant 2. Shortly after, tryToReplaceWithConstantRange() tries to get the lattice value for the argument we just replaced, causing an assertion. This function is a little peculiar as it runs when we're doing replacement and not as part of the solver but still queries the solver. The fix is that of checking whether we replaced the value already and get a temporary lattice value for the constant. Thanks to Zhendong Su for the report! Fixes PR35357. llvm-svn: 318817	2017-11-22 03:04:55 +00:00
Craig Topper	b4d4208116	[X86] Move the information about the feature bits used by compiler-rt and shared by Host.cpp to a .def file and TargetParser.h so clang can make use of it. Since we keep Host.cpp and compiler-rt relatively in sync, clang can use this information as a proxy. llvm-svn: 318814	2017-11-21 23:36:42 +00:00
Krzysztof Parzyszek	fab2fb509e	[Hexagon] Add HexagonSubtarget::getVectorLength() llvm-svn: 318807	2017-11-21 22:13:16 +00:00
Peter Collingbourne	b05610e9e4	Object: Improve COFF irsymtab comdat representation. Change the representation of COFF comdats so that a COFF linker is able to accurately resolve comdats between IR and native object files. Specifically, apply name mangling to comdat names consistently with native object files, and do not export comdats with an internal leader because they do not affect symbol resolution. Differential Revision: https://reviews.llvm.org/D40278 llvm-svn: 318805	2017-11-21 22:06:20 +00:00
Krzysztof Parzyszek	10ae105d9b	[Hexagon] Make sure that RDF does not remove EH_LABELs Since EH_LABELs (and other labels) no longer have "side-effects", they should be checked for separately. llvm-svn: 318801	2017-11-21 21:05:51 +00:00
Craig Topper	8295be1621	[X86] Allow vpclmulqdq instructions to be commuted during isel to allow load folding. The commuting patterns for the AVX version actually still had priority over the new patterns. llvm-svn: 318800	2017-11-21 21:05:21 +00:00
Craig Topper	dd308dec13	[X86] Add BITALG, VAES, VBMI2, VNNI, VPCLMULQDQ, and VPOPCNTDQ instructions to icelake CPU. This is based on table 1-1 of the October 2017 revision of Intel® Architecture Instruction Set Extensions and Future Features Programming Reference llvm-svn: 318799	2017-11-21 21:05:18 +00:00
Nirav Dave	71ae010599	Avoid unnecessary opsize byte in segment move to memory Segment moves to memory are always 16-bit. Remove invalid 32 and 64 bit variants. Recommitting with missing clang inline assembly test change. Fixes PR34478. Reviewers: rnk, craig.topper Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39847 llvm-svn: 318797	2017-11-21 19:28:13 +00:00
Craig Topper	4812755896	[X86] Sort bits in getHostCPUFeatures again. llvm-svn: 318792	2017-11-21 18:50:41 +00:00
Chad Rosier	d593be8413	[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as having side effects. This partially reverts r298851. The underlying issue is that we don't currently model the dependency between mrs (read system register) and msr (write system register) instructions. Something like the below should never be reordered: msr TPIDR_EL0, x0 ;; set thread pointer mrs x8, TPIDR_EL0 ;; read thread pointer but was being reordered after r298851. The functional part of the patch that wasn't reverted needed to remain in place in order to not break r299462. PR35317 llvm-svn: 318788	2017-11-21 18:08:34 +00:00
Hans Wennborg	bdbe363676	Fix r318786 llvm-svn: 318787	2017-11-21 18:00:01 +00:00
Nuno Lopes	7eb86bf989	removed unused private method decl. NFC llvm-svn: 318786	2017-11-21 17:53:19 +00:00
Hans Wennborg	2ebede8b36	EntryExitInstrumenter: support __cyg_profile_func_enter_bare It works just like __cyg_profile_func_enter but takes no arguments. llvm-svn: 318783	2017-11-21 17:22:19 +00:00
Oliver Stannard	1e82259f07	[ARM] Remove pre-UAL FLDM/FSTM aliases These are pre-UAL syntax, and we don't support any other pre-UAL instructions, with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore there's no reason to keep them or their AsmParser hacks around. With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same operand diagnostics as the UAL instructions. Differential revision: https://reviews.llvm.org/D39196 llvm-svn: 318777	2017-11-21 16:20:25 +00:00
Alina Sbirlea	77a10244b2	Add MemorySSA as loop dependency, disabled by default [NFC]. Summary: First step in adding MemorySSA as dependency for loop pass manager. Adding the dependency under a flag. New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null. Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default. Reviewers: sanjoy, davide, gberry Subscribers: mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D40274 llvm-svn: 318772	2017-11-21 15:45:46 +00:00
Oliver Stannard	33706d76ee	[ARM] Don't omit non-default predication code This was causing the (invalid) predicated versions of the NEON VRINTX and VRINTZ instructions to be accepted, with the condition code being ignored. Also, there is no NEON VRINTR instruction, so that part of the check was not necessary. Differential revision: https://reviews.llvm.org/D39193 llvm-svn: 318771	2017-11-21 15:34:15 +00:00
Oliver Stannard	bdf17e56f9	[Asm] Improve "too few operands" errors - We can still emit this error if the actual instruction has two or more operands missing compared to the expected one. - We should only emit this error once per instruction. Differential revision: https://reviews.llvm.org/D36746 llvm-svn: 318770	2017-11-21 15:16:50 +00:00
Oliver Stannard	b6bf719f2a	[ARM] Add diagnostics for SPR/DPR lists Differential revision: https://reviews.llvm.org/D39195 llvm-svn: 318766	2017-11-21 15:06:01 +00:00
Alex Bradbury	164d043067	[RISCV][NFC] Remove unnecessary {} around single statement if block Almost too trivial to worry about, but it seems worth having consistency with upcoming commits. llvm-svn: 318760	2017-11-21 12:41:41 +00:00
Simon Pilgrim	17fc3553e8	[X86][XOP] Add missing scheduler classes to XOP instructions All match equivalent basic classes (WritePHAdd, WriteFAdd etc.) according to both the AMD 15h SOG and Agner's tables. llvm-svn: 318758	2017-11-21 12:02:18 +00:00
Alex Bradbury	c88d30059b	[RISCV][NFC] Clean up RISCVDAGToDAGISel::Select As pointed out in post-commit review of r318738, `return ReplaceNode(..)` when both ReplaceNode and the current function return void is confusing. This patch moves to using a more obvious early return, and moves to just using an if to catch the one case we currently care about. A future patch that adds further custom instruction selection can introduce a switch. llvm-svn: 318757	2017-11-21 12:00:19 +00:00
Martell Malone	1b00723cb5	[ARM] Use SEH exceptions on thumbv7-windows Reviewers: mstorsjo Differential Revision: https://reviews.llvm.org/D40286 llvm-svn: 318756	2017-11-21 11:30:20 +00:00
Simon Pilgrim	7d84f55b6d	[X86][LWP] Add missing LWP itinerary class to lwpins instructions It's on all other LWP instruction but I missed it from lwpins, despite similar scheduling behaviour. llvm-svn: 318751	2017-11-21 11:17:11 +00:00
Eugene Leviant	db429d866b	[MI scheduler] Fix VADD and VSUB in cortex-a57 model This patch fixes instregex for integer vector add/sub instructions Differential revision: https://reviews.llvm.org/D40254 llvm-svn: 318749	2017-11-21 11:01:28 +00:00
Coby Tayree	fe22c86371	[x86][icelake]BITALG vpopcnt{b,w} Differential Revision: https://reviews.llvm.org/D40213 llvm-svn: 318748	2017-11-21 10:32:42 +00:00
Coby Tayree	194b252eca	[x86][icelake]VNNI Introducing Vector Neural Network Instructions, consisting of: vpdpbusd{s} vpdpwssd{s} Differential Revision: https://reviews.llvm.org/D40208 llvm-svn: 318746	2017-11-21 10:04:28 +00:00
Coby Tayree	c6c4bff339	[x86][icelake]vbmi2 introducing vbmi2, consisting of vpcompress{b,w} vpexpand{b,w} vpsh{l,r}d{w,d,q} vpsh{l,r}dv{w,d,q} Differential Revision: https://reviews.llvm.org/D40206 llvm-svn: 318745	2017-11-21 09:48:44 +00:00
NAKAMURA Takumi	be536e28e1	SLPVectorizer.cpp: Avoid std::stable_sort(properlyDominates()). properlyDominates() shouldn't be used as sort key. It causes different output between stdlibc++ and libc++. Instead, I introduced RPOT. In most cases, it works for CSE. llvm-svn: 318743	2017-11-21 09:41:01 +00:00
Coby Tayree	836d1e6a37	[x86][icelake]vpclmulqdq introduction an icelake promotion of pclmulqdq Differential Revision: https://reviews.llvm.org/D40101 llvm-svn: 318741	2017-11-21 09:30:33 +00:00
Coby Tayree	48de83a1a7	[x86][icelake]VAES introduction an icelake promotion of AES Differential Revision: https://reviews.llvm.org/D40078 llvm-svn: 318740	2017-11-21 09:11:41 +00:00
Alex Bradbury	43a1fee3ed	[RISCV] Use register X0 (ZERO) for constant 0 The obvious approach of defining a pattern like the one below actually doesn't work: `def : Pat<(i32 0), (i32 X0)>;` As was noted when Lanai made this change (https://reviews.llvm.org/rL288215), attempting to handle the constant 0 in tablegen leads to assertions due to a physical register being used where a virtual register is expected. llvm-svn: 318738	2017-11-21 08:23:08 +00:00
Alex Bradbury	cc724f15ca	[RISCV] Support and tests for a variety of additional LLVM IR constructs Previous patches primarily ensured that codegen was possible for the standard RISC-V instructions. However, there are a number of IR inputs that wouldn't be appropriately lowered. This patch both adds test cases and supports lowering for a number of these cases: * Improved sext/zext/trunc support * Support for setcc variants that don't map directly to RISC-V instructions * Lowering mul, and hence support for external symbols * addc, adde, subc, sube * mulhs, srem, mulhu, urem, udiv, sdiv * {srl,sra,shl}_parts * brind * br_jt * bswap, ctlz, cttz, ctpop * rotl, rotr * BlockAddress operands Differential Revision: https://reviews.llvm.org/D29938 llvm-svn: 318737	2017-11-21 08:11:03 +00:00
Alex Bradbury	dba566bb03	[RISCV] Implement lowering of ISD::SELECT Although ISD::SELECT_CC is a more natural match for RISCVISD::SELECT_CC (and ultimately the integer RISC-V conditional branch instructions), we choose to expand ISD::SELECT_CC and lower ISD::SELECT. The appropriate compare+branch will be created in the case where an ISD::SELECT condition value is created by an ISD::SETCC node, which operates on XLen types. Other datatypes such as floating point don't have conditional branch instructions, and lowering ISD::SELECT allows more flexibility for handling these cases. Differential Revision: https://reviews.llvm.org/D29937 llvm-svn: 318735	2017-11-21 07:51:32 +00:00
Dean Michael Berris	57f2739291	[XRay] Use optimistic logging model for FDR mode Summary: Before this change, the FDR mode implementation relied on at thread-exit handling to return buffers back to the (global) buffer queue. This introduces issues with the initialisation of the thread_local objects which, even through the use of pthread_setspecific(...) may eventually call into an allocation function. Similar to previous changes in this line, we're finding that there is a huge potential for deadlocks when initialising these thread-locals when the memory allocation implementation is also xray-instrumented. In this change, we limit the call to pthread_setspecific(...) to provide a non-null value to associate to the key created with pthread_key_create(...). While this doesn't completely eliminate the potential for the deadlock(s), it does allow us to still clean up at thread exit when we need to. The change is that we don't need to do more work when starting and ending a thread's lifetime. We also have a test to make sure that we actually can safely recycle the buffers in case we end up re-using the buffer(s) available from the queue on multiple thread entry/exits. This change cuts across both LLVM and compiler-rt to allow us to update both the XRay runtime implementation as well as the library support for loading these new versions of the FDR mode logging. Version 2 of the FDR logging implementation makes the following changes: * Introduction of a new 'BufferExtents' metadata record that's outside of the buffer's contents but are written before the actual buffer. This data is associated to the Buffer handed out by the BufferQueue rather than a record that occupies bytes in the actual buffer. * Removal of the "end of buffer" records. This is in-line with the changes we described above, to allow for optimistic logging without explicit record writing at thread exit. 
The optimistic logging model operates under the following assumptions: * Threads writing to the buffers will potentially race with the thread attempting to flush the log. To avoid this situation from occurring, we make sure that when we've finalized the logging implementation, that threads will see this finalization state on the next write, and either choose to not write records the thread would have written or write the record(s) in two phases -- first write the record(s), then update the extents metadata. * We change the buffer queue implementation so that once it's handed out a buffer to a thread, that we assume that buffer is marked "used" to be able to capture partial writes. None of this will be safe to handle if threads are racing to write the extents records and the reader thread is attempting to flush the log. The optimism comes from the finalization routine being required to complete before we attempt to flush the log. This is a fairly significant semantics change for the FDR implementation. This is why we've decided to update the version number for FDR mode logs. The tools, however, still need to be able to support older versions of the log until we finally deprecate those earlier versions. Reviewers: dblaikie, pelikan, kpw Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39526 llvm-svn: 318733	2017-11-21 07:16:57 +00:00
Craig Topper	0604471011	[X86] Simplify type constraints for AVX2 masked gather. We don't need separate 32 and 64 node types. We can use SDTCisInt and SDTCisSameSizeAs to ensure the mask matches the size of the result type and is integer. llvm-svn: 318732	2017-11-21 06:28:15 +00:00
Serguei Katkov	6fb481ba59	Revert "[CGP] Enable complex addr mode (2nd attempt)" Revert the patch rL318728 causing buildbot hang-ups. llvm-svn: 318731	2017-11-21 06:03:43 +00:00
Craig Topper	b0d4cbb3b0	[X86] Simplify the predicates for avx2 masked gather patterns. We don't need a dyn_cast and we only need to check the type of the index. The base ptr is guaranteed to be scalar. llvm-svn: 318730	2017-11-21 06:01:20 +00:00
Rafael Espindola	709b1b61bb	move static function. NFC llvm-svn: 318729	2017-11-21 05:35:45 +00:00
Serguei Katkov	e8b6e750ba	[CGP] Enable complex addr mode (2nd attempt) 2nd attempt to enable complex addr modes after fix of the crash by rL318638. llvm-svn: 318728	2017-11-21 05:31:47 +00:00
Yaxun Liu	5069545c52	[AMDGPU] Fix DAGTypeLegalizer::SplitInteger for shift amount type DAGTypeLegalizer::SplitInteger uses default pointer size as shift amount constant type, which causes less performant ISA in amdgcn---amdgiz target since the default pointer type is i64 whereas the desired shift amount type is i32. This patch fixes that by using TLI.getScalarShiftAmountTy in DAGTypeLegalizer::SplitInteger. The X86 change is necessary since splitting i512 requires shifting amount of 256, which cannot be held by i8. Differential Revision: https://reviews.llvm.org/D40148 llvm-svn: 318727	2017-11-21 02:29:54 +00:00
Rafael Espindola	e5791cf034	Split a rename_handle out of rename on windows. llvm-svn: 318725	2017-11-21 01:52:44 +00:00
Richard Trieu	1e6ea96860	Add default values for member functions. Initialize IsVis2 and IsVis3 in SparcSubtarget::initializeSubtargetDependencies. MSan detected uninitialized read of IsVis3 after r318704. Initializing the variables to false will prevent undefined behavior. llvm-svn: 318724	2017-11-21 01:45:17 +00:00
Davide Italiano	7f9b83e34d	[SCCP] If we replace with a constant, we can't replace with a range. This microoptimization is NFC. llvm-svn: 318711	2017-11-21 00:21:52 +00:00
Richard Trieu	afc94c08cb	Revert r318678 to fix Clang test r318678 caused the Clang test CodeGen/ms-inline-asm.c to start failing. llvm-svn: 318710	2017-11-21 00:12:18 +00:00
Vitaly Buka	d4eaab2abf	[msan] Don't sanitize "nosanitize" instructions Reviewers: eugenis Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40205 llvm-svn: 318708	2017-11-20 23:37:56 +00:00
Craig Topper	8239c8d9fe	[SelectionDAG] When promoting the result of a VSELECT, make sure we promote the condition to the SetCC type for the final result type not the original type. Normally this would be cleaned up by promoting the condition operand next. But in the attached case we promoted the result from v2i48 to v2i64 and the condition from v2i1 to v2i48. Then we tried to "promote" the v2i48 condition back to v2i1 because that's what the SetCC result type for v2i64 is on X86 with VLX. But promote is either a NOP or SIGN_EXTEND and this would need a truncation. With the change here we now get the SetCC type of v2i1 when we're handling the result promotion and the operand no longer needs to be promoted itself. Fixes PR35272. llvm-svn: 318706	2017-11-20 23:08:50 +00:00
Fedor Sergeev	a8799e84b4	[Sparc] efficient pattern for UINT_TO_FP conversion Summary: While investigating performance degradation of the imagick benchmark, an inefficient pattern for UINT_TO_FP conversion was found. That pattern causes a RAW hazard in assembly code. Specifically, the uitofp IR operator results in poor assembler: st %i0, [%fp - 952] ldd [%fp - 952], %f0 It stores a 32-bit integer register into a memory location and then loads 64-bit floating point data from that location. That is exactly the RAW hazard case. To optimize that case it is possible to use SPISD::ITOF and SPISD::XTOF for conversion from integer to floating point data type and to use ISD::BITCAST to copy from integer register into floating point register. The fix is to write a custom UINT_TO_FP pattern using SPISD::ITOF, SPISD::XTOF, ISD::BITCAST. Patch by Alexey Lapshin Reviewers: fedor.sergeev, jyknight, dcederman, lero_chris Reviewed By: jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36875 llvm-svn: 318704	2017-11-20 22:33:58 +00:00
Hiroshi Yamauchi	0596ae1e4a	Add heuristics for irreducible loop metadata under PGO Summary: Add the following heuristics for irreducible loop metadata: - When an irreducible loop header is missing the loop header weight metadata, give it the minimum weight seen among other headers. - Annotate indirectbr targets with the loop header weight metadata (as they are likely to become irreducible loop headers after indirectbr tail duplication.) These greatly improve the accuracy of the block frequency info of the Python interpreter loop (eg. from ~3-16x off down to ~40-55% off) and the Python performance (eg. unpack_sequence from ~50% slower to ~8% faster than GCC) due to better register allocation under PGO. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39980 llvm-svn: 318693	2017-11-20 21:03:38 +00:00
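
Note on be536e28e1 (SLPVectorizer.cpp): the message says properlyDominates() should not be used as a sort key. One plausible reading, sketched below and not taken from the LLVM sources, is that dominance is only a partial order, so "neither dominates the other" is not a transitive equivalence, and std::stable_sort requires a strict weak ordering. Sorting by a reverse post-order number gives a total order instead. The ToyBlock type, the four-block dominator tree, and the RPO numbers are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy block: IDom is the index of the immediate dominator
// (-1 for the entry block), RPO is a precomputed reverse post-order number.
struct ToyBlock {
  int IDom;
  int RPO;
};

// Toy dominator tree: 0 is the entry, 0 idom 1, 1 idom 2, 0 idom 3.
std::vector<ToyBlock> makeToyTree() {
  return {{-1, 0}, {0, 1}, {1, 2}, {0, 3}};
}

// True if block A strictly dominates block B in the toy tree.
bool properlyDominates(const std::vector<ToyBlock> &Blocks, int A, int B) {
  for (int Cur = Blocks[B].IDom; Cur != -1; Cur = Blocks[Cur].IDom)
    if (Cur == A)
      return true;
  return false;
}

// Two blocks are "equivalent" to a dominance comparator when neither
// dominates the other. A strict weak ordering needs this to be transitive.
bool incomparable(const std::vector<ToyBlock> &Blocks, int A, int B) {
  return !properlyDominates(Blocks, A, B) && !properlyDominates(Blocks, B, A);
}

// Sorting by RPO number uses a total order, so the result does not depend
// on which standard library implements std::stable_sort.
std::vector<int> sortByRPO(const std::vector<ToyBlock> &Blocks,
                           std::vector<int> Ids) {
  std::stable_sort(Ids.begin(), Ids.end(), [&](int L, int R) {
    return Blocks[L].RPO < Blocks[R].RPO;
  });
  return Ids;
}
```

In the toy tree, blocks 1 and 3 are incomparable and so are 3 and 2, yet 1 dominates 2, so a dominance comparator breaks the transitivity that the sort relies on.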


108200 Commits
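
Note on 57f2739291 ([XRay] optimistic logging for FDR mode): the message describes writers that check a finalization state and then write in two phases, record bytes first and extents metadata second. The sketch below illustrates that idea only; it is not the compiler-rt implementation. ToyBuffer, writeRecord, and finalizeAndFlush are hypothetical names, one writer per buffer is assumed, and finalization and flushing are folded into one function for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical buffer: Extents counts the bytes that are safe to read and
// lives outside the record bytes themselves.
struct ToyBuffer {
  explicit ToyBuffer(std::size_t Capacity) : Data(Capacity) {}
  std::vector<char> Data;
  std::atomic<std::size_t> Extents{0};
  std::atomic<bool> Finalizing{false};
};

// Writer side (assumes a single writer per buffer). Returns false if the
// log is being finalized or the record does not fit.
bool writeRecord(ToyBuffer &B, const void *Rec, std::size_t Size) {
  if (B.Finalizing.load(std::memory_order_acquire))
    return false; // observe finalization on the next write and stop
  std::size_t Offset = B.Extents.load(std::memory_order_relaxed);
  if (Size > B.Data.size() - Offset)
    return false;
  std::memcpy(B.Data.data() + Offset, Rec, Size);            // phase 1: bytes
  B.Extents.store(Offset + Size, std::memory_order_release); // phase 2: extents
  return true;
}

// Reader side: mark finalization, then copy only the bytes covered by the
// published extents, so a partially written record is never included.
std::vector<char> finalizeAndFlush(ToyBuffer &B) {
  B.Finalizing.store(true, std::memory_order_release);
  std::size_t N = B.Extents.load(std::memory_order_acquire);
  return std::vector<char>(B.Data.data(), B.Data.data() + N);
}
```

A writer that passed the finalization check just before the flush can still be mid-write; in this sketch its bytes lie beyond the extents the reader observed and are simply left out, which is the "optimistic" part.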