Summary:
This patch adds support for a shadow memory with a
dynamically allocated address range.
compiler-rt needs to export a symbol containing the shadow
memory range.
This is required to support ASan on 64-bit Windows.
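For context, a minimal sketch (not the patch itself) of how a dynamically
allocated shadow base changes the classic ASan mapping; the symbol name
below is a placeholder, not necessarily the one compiler-rt exports:
  // Shadow = (Mem >> Scale) + Base, with Scale = 3; Base is now read from
  // a global exported by the runtime instead of a link-time constant.
  extern "C" unsigned long long __asan_shadow_base; // placeholder name
  static unsigned char *MemToShadow(unsigned long long Addr) {
    return (unsigned char *)((Addr >> 3) + __asan_shadow_base);
  }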
Reviewers: kcc, rnk, vitalybuka
Subscribers: zaks.anna, kubabrecka, dberris, llvm-commits, chrisha
Differential Revision: https://reviews.llvm.org/D23354
llvm-svn: 282881
When building the steps for scalar induction variables, we previously attempted
to determine if all the scalar users of the induction variable were uniform. If
they were, we would only emit the step corresponding to vector lane zero. This
optimization was too aggressive. We generally don't know the entire set of
induction variable users that will be scalar. We have
isScalarAfterVectorization, but this is only a conservative estimate of the
instructions that will be scalarized. Thus, an induction variable may have
scalar users that aren't already known to be scalar. To avoid emitting unused
steps, we can only check that the induction variable is uniform. This should
fix PR30542.
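As an illustration (a sketch with assumed names, not code from the
patch): for VF = 4, scalarized users need the per-lane steps i, i+1,
i+2, i+3, and emitting only the lane-0 step is valid solely for a
uniform induction:
  // Emit per-lane scalar steps for an induction variable (sketch).
  void buildScalarSteps(long Start, long Step, unsigned VF,
                        bool IsUniform, long *ScalarSteps) {
    unsigned Lanes = IsUniform ? 1 : VF; // lane 0 alone only if uniform
    for (unsigned Lane = 0; Lane < Lanes; ++Lane)
      ScalarSteps[Lane] = Start + (long)Lane * Step;
  }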
Reference: https://llvm.org/bugs/show_bug.cgi?id=30542
llvm-svn: 282863
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last-ditch graph
recoloring.
This patch changes this by attempting to split all live intervals before
performing recoloring.
This fixes LLVM bug PR14879.
I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.
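Schematically, the ordering change looks like this (all helper names are
hypothetical stand-ins for the real RAGreedy stages, not code from the
patch):
  class LiveInterval;
  unsigned tryAssign(LiveInterval &);
  unsigned tryEvict(LiveInterval &);
  unsigned trySplit(LiveInterval &);
  unsigned tryLastChanceRecoloring(LiveInterval &);
  bool isSpillable(LiveInterval &);
  void spill(LiveInterval &);

  unsigned selectOrSplit(LiveInterval &LI) {
    if (unsigned PhysReg = tryAssign(LI)) return PhysReg;
    if (unsigned PhysReg = tryEvict(LI))  return PhysReg;
    // New: attempt splitting before resorting to recoloring, even for
    // live intervals that cannot be spilled.
    if (unsigned PhysReg = trySplit(LI))  return PhysReg;
    if (!isSpillable(LI))
      return tryLastChanceRecoloring(LI); // last resort, as before
    spill(LI);
    return 0;
  }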
Reviewers: qcolombet
Subscribers: MatzeB
Differential Revision: https://reviews.llvm.org/D25070
llvm-svn: 282852
If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of.
I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change.
llvm-svn: 282835
Summary:
We don't want to decay the threshold for hot callsites, so that chains
of hot callsites get imported. The same mechanism is used in LIPO.
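Roughly, the idea is the following (a sketch; the names and the decay
factor are illustrative, not taken from the patch):
  // The import threshold normally decays at each step down a call
  // chain; hot callsites skip the decay so hot chains stay importable.
  unsigned nextImportThreshold(unsigned Threshold, bool CalleeIsHot) {
    const double DecayFactor = 0.7; // illustrative value
    return CalleeIsHot ? Threshold
                       : (unsigned)(Threshold * DecayFactor);
  }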
Reviewers: tejohnson, eraman, mehdi_amini
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D24976
llvm-svn: 282833
For some reason both of these are available, except
for scalar 64-bit compares, which only have the u64 form. I'm not sure
why both exist (I'm guessing it's for the one-bit inputs we
don't use), but for consistency, always use the
unsigned one.
llvm-svn: 282832
load command that uses the MachO::entry_point_command type,
which is not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_MAIN load command.
llvm-svn: 282766
The VS debugger doesn't appear to understand the 0x68 or 0x69 type
indices, which were probably intended for use on a platform where a C
'int' is 8 bits. So, use the character types instead. Clang was already
using the character types because '[u]int8_t' is usually defined in
terms of 'char'.
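For reference, the relevant simple type indices (values per the CodeView
spec; the enumerator names mirror LLVM's SimpleTypeKind and are shown
here only for illustration):
  enum class SimpleTypeKind : unsigned {
    SignedCharacter   = 0x0010, // now used for int8_t
    UnsignedCharacter = 0x0020, // now used for uint8_t
    SByte             = 0x0068, // previously used; VS can't display it
    Byte              = 0x0069, // previously used; VS can't display it
  };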
See the Rust issue for screenshots of what VS does:
https://github.com/rust-lang/rust/issues/36646
Fixes PR30552
llvm-svn: 282739
load command that uses the MachO::source_version_command type,
which is not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_SOURCE_VERSION load command.
llvm-svn: 282736
Summary:
The heuristic is not tuned up yet, but even this small heuristic gives
about a +0.10% improvement on SPEC 2006.
Reviewers: tejohnson, mehdi_amini, eraman
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24940
llvm-svn: 282733
This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used.
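As a sketch of that expansion (all helper names are hypothetical, not
from the patch):
  class MachineInstr;
  bool isLowSSERegister(unsigned Reg);              // XMM0-15 / YMM0-15?
  void rewriteToVEXMove(MachineInstr &MI);          // plain VEX move
  void rewriteToEVEXExtractStore(MachineInstr &MI); // VEXTRACT to memory
  void rewriteToEVEXBroadcastLoad(MachineInstr &MI);// VBROADCAST load

  // After register allocation, rewrite the pseudo based on the register
  // that was actually assigned.
  void expandNonVLXMovPseudo(MachineInstr &MI, unsigned Reg, bool IsStore) {
    if (isLowSSERegister(Reg))
      rewriteToVEXMove(MI);
    else if (IsStore)
      rewriteToEVEXExtractStore(MI);
    else
      rewriteToEVEXBroadcastLoad(MI);
  }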
Fixes one of the cases from PR29112.
llvm-svn: 282690
Fixes to allow spilling all registers at the end of the block to
work with exec modifications. Don't emit s_and_saveexec_b64 for
if lowering, and instead emit copies. Mark control flow mask
instructions as terminators to get correct spill code placement
with fast regalloc, and then have a separate optimization pass
form the saveexec.
This should work if SGPRs are spilled to VGPRs, but
will likely fail in the case that an SGPR spills to memory
and no workitem takes a divergent branch.
llvm-svn: 282667
Summary: The AArch64 LLVM assembler emits an add instruction without the shift bit when calculating the higher 12 bits of the address of TLS variables in the local-exec model. This generates a wrong code sequence for accessing TLS variables with a thread offset larger than 0x1000.
Reviewers: t.p.northover, peter.smith, rovka
Subscribers: salim.nasser, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D24702
llvm-svn: 282661
Summary:
The patch fixes a regression caused by two earlier patches, D18777 and D18867.
Reviewers: reames, sanjoy
Differential Revision: http://reviews.llvm.org/D24280
From: Li Huang
llvm-svn: 282650
load command that uses the MachO::rpath_command type,
which is not used in llvm libObject code but is used in llvm tool code.
This includes just the LC_RPATH load command.
llvm-svn: 282649
Summary:
Answering any meaningful questions about .sancov files requires
accessing symbol information from the corresponding binary.
This change introduces a separate intermediate data structure and
format: symbolized coverage. It contains all symbol information that
is required to answer common queries:
- merging
- covered/uncovered files and functions
- line status.
Also remove the HTML report functionality from sancov: the generated
HTML files are too huge, a different approach is required, and
maintaining this half-working approach in C++ is painful.
Differential Revision: https://reviews.llvm.org/D24947
llvm-svn: 282639
other load commands that use the MachO::version_min_command type,
which are not used in llvm libObject code but are used in llvm tool code.
This includes LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS,
LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands.
llvm-svn: 282635
Normally, if-conversion would add implicit uses for redefined registers,
e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of
R0 are known to be live but not R0 itself, such implicit uses will not be
added, causing prior definitions of such subregisters and R0 itself to
become dead.
llvm-svn: 282626
Also, remove unnecessary function attributes, parameters, and comments.
It looks like at least some of these tests are not minimal though...
llvm-svn: 282620
Pointers in different addrspaces can have different sizes, so it's not valid to look through an addrspacecast when calculating the base and offset of a value.
This is similar to D13008.
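A sketch of the constraint (a simplified base-and-offset walk, not the
exact code touched by the patch):
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/Operator.h"
  using namespace llvm;

  // The offset APInt is sized for the pointer width of the original
  // address space, so the walk must stop at an addrspacecast.
  static Value *getBaseWithOffset(Value *V, APInt &Offset,
                                  const DataLayout &DL) {
    while (auto *GEP = dyn_cast<GEPOperator>(V)) {
      if (!GEP->accumulateConstantOffset(DL, Offset))
        break;
      V = GEP->getPointerOperand();
      if (isa<AddrSpaceCastInst>(V)) // pointer size may change here,
        break;                       // so do not look through it
    }
    return V;
  }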
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D24729
llvm-svn: 282612
This addresses PR26055 "LiveDebugValues is very slow".
Contrary to the old LiveDebugVariables pass LiveDebugValues currently
doesn't look at the lexical scopes before inserting a DBG_VALUE
intrinsic. This means that we often propagate DBG_VALUEs much further
down than necessary. This is especially noticeable in large C++
functions with many inlined method calls that all use the same
"this"-pointer.
For example, in the following code it makes no sense to propagate the
inlined variable a from the first inlined call to f() into any of the
subsequent basic blocks, because the variable will always be out of
scope:
  void sink(int a);
  void __attribute((always_inline)) f(int a) { sink(a); }
  void foo(int i) {
    f(i);
    if (i)
      f(i);
    f(i);
  }
This patch reuses the LexicalScopes infrastructure we have for
LiveDebugVariables to take this into account.
The effect on compile time and memory consumption is quite noticeable:
I tested a benchmark that is a large C++ source with an enormous
amount of inlined "this"-pointers that would previously eat >24GiB
(most of them for DBG_VALUE intrinsics) and whose compile time was
dominated by LiveDebugValues. With this patch applied the memory
consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues.
https://reviews.llvm.org/D24994
Thanks to Daniel Berlin and Keith Walker for reviewing!
llvm-svn: 282611
Implement 'retn' simply by aliasing it to the relevant 'ret' instruction
Commit on behalf of coby
Differential Revision: https://reviews.llvm.org/D24346
llvm-svn: 282601
Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through a
simplified store-merging search that only checks for parallel stores
through the chain subgraph. This is cleaner, as it separates the
handling of non-interfering loads/stores from the store-merging logic.
When merging stores, search up the chain through a single load, and
find all candidate stores by looking down through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).
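For illustration (example not from the commit), this is the shape of
code whose parallel stores the new search finds through the chain
subgraph:
  // Both stores are reached through the loads' chains and a
  // TokenFactor; they become merge candidates when merging is legal.
  void copyPair(int *a, const int *b) {
    a[0] = b[0];
    a[1] = b[1];
  }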
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across
code paths
3. Re-adds the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increases GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seemed sufficient to not cause regressions in
tests.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations.
Noteworthy tests:
CodeGen/AArch64/argument-blocks.ll -
It's not entirely clear what the test_varargs_stackalign test is
supposed to be asserting, but the new code looks right.
CodeGen/AArch64/arm64-memset-inline.ll -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll -
The backend now generates *worse* code due to store merging
succeeding, as we don't do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll -
Improved, but there still seems to be an extraneous vector insert
from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll -
Worse code emitted in this case, due to the improved store->load
forwarding.
CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll -
Restored correct merging of non-aligned stores
CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
Improved. Correctly merges buffer_store_dword calls
CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
Improved. Sidesteps loading a stored value and merges two stores
CodeGen/X86/pr18023.ll -
This test has been removed, as it was asserting incorrect
behavior. Non-volatile stores *CAN* be moved past volatile loads,
and now are.
CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll -
It's basically impossible to tell what these tests are actually
testing. But, looks like the code got better due to the memory
operations being recognized as non-aliasing.
CodeGen/X86/win32-eh.ll -
Both loads of the securitycookie are now merged.
CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
This test appears to work but no longer exhibits the spill
behavior.
Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight
Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel
Differential Revision: https://reviews.llvm.org/D14834
llvm-svn: 282600
The KORTEST was introduced due to a bug where a TEST instruction used a K register.
But it turns out that the opposite case, of a KORTEST using a GPR, is now happening.
The change removes the KORTEST flow and adds a COPY instruction from the K register to a GPR.
Differential Revision: https://reviews.llvm.org/D24953
llvm-svn: 282580
This check currently doesn't seem to do anything useful on any in-tree target:
On non-x86, it always evaluates to false, so we never hit the code path that
creates the shuffle with zero.
On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to
query in general, but doesn't make sense if only restricted to zero blends.
Differential Revision: https://reviews.llvm.org/D24625
llvm-svn: 282567
other load commands that use the MachO::dylinker_command type,
which are not used in llvm libObject code but are used in llvm tool code.
This includes LC_ID_DYLINKER, LC_LOAD_DYLINKER
and LC_DYLD_ENVIRONMENT load commands.
llvm-svn: 282553
The 'or' case shows up in copysign. The copysign code also had
redundant checking for a scalar zero operand with 'and', so I
removed that.
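For context, the standard bit-level identity behind that lowering (a
generic sketch, not the code changed here):
  #include <cstdint>
  #include <cstring>

  // copysign as logic ops: clear x's sign ('and'), isolate y's sign
  // ('and'), then combine ('or'). The 'or' is the case being fixed.
  float copysignBits(float x, float y) {
    uint32_t xb, yb;
    std::memcpy(&xb, &x, sizeof xb);
    std::memcpy(&yb, &y, sizeof yb);
    uint32_t bits = (xb & 0x7fffffffu) | (yb & 0x80000000u);
    std::memcpy(&x, &bits, sizeof x);
    return x;
  }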
I'm not sure how to test vector 'and', 'andn', and 'xor' yet,
but it seems better to just include all of the logic ops since
we're fixing 'or' anyway.
llvm-svn: 282546
Summary:
The current implementation of isConstantPhysReg() checks for defs of
physical registers to determine if they are constant. Some
architectures (e.g. AArch64 XZR/WZR) have registers that are constant
and may be used as destinations to indicate the generated value is
discarded, preventing isConstantPhysReg() from returning true. This
change adds a TargetRegisterInfo hook that overrides the no defs check
for cases such as this.
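A hedged sketch of what such an override might look like for AArch64
(a fragment only; names assumed rather than copied from the patch):
  bool AArch64RegisterInfo::isConstantPhysReg(unsigned PhysReg) const {
    // XZR/WZR always read as zero; a def merely discards the result.
    return PhysReg == AArch64::XZR || PhysReg == AArch64::WZR;
  }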
Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy
Subscribers: junbuml, aemerson, mcrosier, rengolin
Differential Revision: https://reviews.llvm.org/D24570
llvm-svn: 282543
There is really no reason for these to be separate.
The vectorizer started the pretty bad tradition that the text of a
missed remark is fairly meaningless, i.e. "vectorization failed"; there,
you have to query the analysis to get the full picture.
I think we should just explain the reason for missing the optimization
in the missed remark when possible. Analysis remarks should provide
information that the pass gathers regardless of whether the
optimization is performed or not.
llvm-svn: 282542