llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
Sanjay Patel	3a58c4be6f	[DAG] use isConstOrConstSplat in ComputeNumSignBits to optimize SRA The scalar version of this pattern was noted in: https://reviews.llvm.org/D25485 and fixed with: https://reviews.llvm.org/rL284395 More refactoring of the constant/splat helpers is needed and will happen in follow-up patches. Differential Revision: https://reviews.llvm.org/D25685 llvm-svn: 284424	2016-10-17 20:41:39 +00:00
Sanjay Patel	d74ae92c30	[DAG] make isConstOrConstSplat and isConstOrConstSplatFP more accessible; NFC As noted in: https://reviews.llvm.org/D25685 This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch. In a minor attempt to keep some structure, we're pulling the FP helper over along with its integer sibling, but clearly we can and should do more refactoring of the similar helper functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality. llvm-svn: 284421	2016-10-17 20:26:46 +00:00
Michael LeMay	9fd255e199	Test commit. llvm-svn: 284411	2016-10-17 19:09:19 +00:00
Sanjay Patel	f0b7ab4378	[DAG] optimize away an arithmetic-right-shift of a 0 or -1 value This came up as part of: https://reviews.llvm.org/D25485 Note that the vector case is missed because ComputeNumSignBits() is deficient for vectors. llvm-svn: 284395	2016-10-17 15:58:28 +00:00
James Molloy	c6be02fbf0	[SDAG] Use ABI type alignment for constant pools when optimizing for size SelectionDAG::getConstantPool will automatically determine an appropriate alignment if one is not specified. It does this by querying the type's preferred alignment. This can end up creating quite a lot of padding when the preferred alignment for vectors is 128. In optimize-for-size mode, it makes sense to instead query the ABI type alignment which is often smaller and causes less padding. llvm-svn: 284381	2016-10-17 12:54:07 +00:00
Andrea Di Biagio	e0e599e202	[CodeGenPrepare] When moving a zext near to its associated load, do not retain the original debug location. CodeGenPrepare knows how to move a zext of a load into the same basic block where the load lives. The goal is to help ISel match a zero-extending load instead of two separated instructions. CGP attempts to move a zext computation even if it lives in a basic block that does not post-dominate the load's basic block. That means, the hoisted zext may be speculated. Preserving the zext location would hurt the debugging experience and the quality of sample pgo. With this patch, when moving a zext near to its associated load, CGP no longer propagates the zext's debug location. Instead, CGP conservatively reuses the same debug location for the load and the zext. An alternative approach would be to assign an artificial line-0 location to the zext. However we don't want to over-use the 'line-0' for this particular case because it would have a size cost in the line-table section for no additional benefit. Differential Revision: https://reviews.llvm.org/D25611 llvm-svn: 284377	2016-10-17 11:32:26 +00:00
Konstantin Zhuravlyov	1aefc15a1a	[MachineMemOperand] Move synchronization scope and atomic orderings from SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG. Differential Revision: https://reviews.llvm.org/D24577 llvm-svn: 284312	2016-10-15 22:01:18 +00:00
Tim Northover	dc91ae935f	GlobalISel: rename legalizer components to match others. The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine" prefix). This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287	2016-10-14 22:18:18 +00:00
Sanjay Patel	6ea6210357	[DAG] avoid creating illegal node when transforming negated shifted sign bit Eli noted this potential bug in the post-commit thread for: https://reviews.llvm.org/rL284239 ...but I'm not sure how to trigger it, so there's no test case yet. llvm-svn: 284268	2016-10-14 19:46:31 +00:00
Tom Stellard	0eccb3e2ea	TargetLowering: Add SimplifyDemandedBits() helper to TargetLoweringOpt Summary: The main purpose of this new helper is to enable simplifying operations that have multiple uses. SimplifyDemandedBits does not handle multiple uses currently, and this new function makes it possible to optimize: and v1, v0, 0xffffff mul24 v2, v1, v1 ; Multiply ignoring high 8-bits. To: mul24 v2, v0, v0 Where before this would not be optimized, because v1 has multiple uses. Reviewers: bogner, arsenm Subscribers: nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24964 llvm-svn: 284266	2016-10-14 19:14:26 +00:00
David L Kreitzer	1233356719	Add a pass to optimize patterns of vectorized interleaved memory accesses for X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260	2016-10-14 18:20:41 +00:00
David L Kreitzer	d68b19752a	[safestack] Use non-thread-local unsafe stack pointer for Contiki OS Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D19852 llvm-svn: 284254	2016-10-14 17:56:00 +00:00
Eric Christopher	259a7b3aba	Revert "In preparation for removing getNameWithPrefix off of TargetMachine," as it's causing sanitizer/memory issues until I can track down this set. This reverts commit r284203 llvm-svn: 284252	2016-10-14 17:28:23 +00:00
Sanjay Patel	54524ed76a	[DAG] add folds for negated shifted sign bit The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 llvm-svn: 284239	2016-10-14 14:26:47 +00:00
Nicolai Haehnle	0d87d9745d	Fix use-after-frees Extracted from D25313, as suggested by Justin Bogner. llvm-svn: 284220	2016-10-14 09:49:51 +00:00
Craig Topper	938c990f79	[DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64, which requires a sign/zero_extend_vector_inreg to be created, which requires v8i8 to be concatenated up to v64i8 and goes through this code. llvm-svn: 284204	2016-10-14 06:00:42 +00:00
Eric Christopher	bf50905153	In preparation for removing getNameWithPrefix off of TargetMachine, sink the current behavior into the callers and sink TargetMachine::getNameWithPrefix into TargetMachine::getSymbol. llvm-svn: 284203	2016-10-14 05:47:41 +00:00
Eric Christopher	403df02984	Tidy the calls to getCurrentSection().first -> getCurrentSectionOnly to help readability a bit. llvm-svn: 284202	2016-10-14 05:47:37 +00:00
Sanjay Patel	f9abb54589	[DAG] hoist DL(N) and fix formatting; NFC llvm-svn: 284170	2016-10-13 22:27:10 +00:00
Tom Stellard	4e05d887b7	LegalizeDAG: Implement PROMOTE for ISD::BITREVERSE Summary: This operation is promoted the same way as ISD::BSWAP. This will prevent a regression in test/Target/AMDGPU/bitreverse.ll when i16 support is implemented. Reviewers: bogner, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D25202 llvm-svn: 284163	2016-10-13 21:03:49 +00:00
David L Kreitzer	654c1b13ba	[safestack] Reapply r283248 after moving X86-targeted SafeStack tests into the X86 subdirectory. Original commit message: Requires a valid TargetMachine to be passed to the SafeStack pass. Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 284161	2016-10-13 20:57:51 +00:00
Nirav Dave	7c2dd71bf8	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r284151, which appears to be triggering LTO failures on Hexagon. llvm-svn: 284157	2016-10-13 20:23:25 +00:00
Quentin Colombet	727043ec80	[RAGreedy] Empty live-ranges always succeed in last chance recoloring. Relax the constraint for empty live-ranges while doing last chance recoloring. Indeed, those live-ranges do not need an actual color to be found for the recoloring to work. Empty live-ranges may happen as a result of splitting/spilling. Unfortunately no test case for in-tree targets. llvm-svn: 284152	2016-10-13 19:27:48 +00:00
Nirav Dave	01829c947a	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding.
CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 284151	2016-10-13 19:20:16 +00:00
Simon Pilgrim	0c8d40e3e3	[DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines llvm-svn: 284122	2016-10-13 14:04:35 +00:00
Simon Pilgrim	71f238cfd7	[DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding llvm-svn: 284117	2016-10-13 12:49:31 +00:00
Simon Pilgrim	ca998fd02e	[DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization Improves commutation potential llvm-svn: 284113	2016-10-13 12:05:20 +00:00
Krzysztof Parzyszek	9b355ba01b	Handle lane masks in LivePhysRegs when adding live-ins Differential Revision: https://reviews.llvm.org/D25533 llvm-svn: 284076	2016-10-12 22:53:41 +00:00
Albert Gutowski	14303dbdfa	Create llvm.addressofreturnaddress intrinsic Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25293 llvm-svn: 284061	2016-10-12 22:13:19 +00:00
Krzysztof Parzyszek	91061ff7c3	[MIRParser] Parse lane masks for register live-ins Differential Revision: https://reviews.llvm.org/D25530 llvm-svn: 284052	2016-10-12 21:06:45 +00:00
Krzysztof Parzyszek	9741038bd4	Do not remove implicit defs in BranchFolder Branch folder removes implicit defs if they are the only non-branching instructions in a block, and the branches do not use the defined registers. The problem is that in some cases these implicit defs are required for the liveness information to be correct. Differential Revision: https://reviews.llvm.org/D25478 llvm-svn: 284036	2016-10-12 19:50:57 +00:00
Matt Arsenault	f80a9a7346	BranchRelaxation: Unique live ins when creating block llvm-svn: 284018	2016-10-12 15:32:04 +00:00
Simon Pilgrim	da31facd2c	[DAGCombiner] Update most ADD combines to support general vector combines Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 llvm-svn: 284015	2016-10-12 13:48:10 +00:00
Konstantin Zhuravlyov	da8ee8e7dc	[DAGCombiner] Do not remove the load of stored values when optimizations are disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 llvm-svn: 284014	2016-10-12 13:44:24 +00:00
Michael Kuperstein	0d331811ac	[DAG] Fix crash in build_vector -> vector_shuffle combine Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. llvm-svn: 283953	2016-10-11 22:44:31 +00:00
Tim Northover	573697e8ef	MIRParser: allow types on registers with a RegBank. This fixes some GlobalISel regression tests. llvm-svn: 283936	2016-10-11 20:50:04 +00:00
Kyle Butt	abb4b9d9ee	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283934	2016-10-11 20:36:43 +00:00
Arnold Schwaighofer	7430f50b4d	Silence -Wunused-but-set-variable warning llvm-svn: 283927	2016-10-11 19:49:29 +00:00
Sanjay Patel	237ffdc433	[DAG] add fold for masked negated sign-extended bool This enhances the fold added with: https://reviews.llvm.org/rL283900 llvm-svn: 283905	2016-10-11 17:05:52 +00:00
Sanjay Patel	61a6f1d61e	[DAG] add fold for masked negated extended bool The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. llvm-svn: 283900	2016-10-11 16:26:36 +00:00
Sanjay Patel	dc0ba19eca	[DAG] simplify logic; NFC llvm-svn: 283885	2016-10-11 14:14:30 +00:00
Sanjay Patel	fbe8e33811	[DAG] hoist DL(N) and fix formatting; NFC llvm-svn: 283884	2016-10-11 14:04:24 +00:00
Sanjay Patel	edf39dd95a	[DAG] fix formatting; NFC llvm-svn: 283878	2016-10-11 13:47:43 +00:00
Fraser Cormack	33445c78d3	Fix formatting in findRegisterUseOperandIdx. NFC. llvm-svn: 283860	2016-10-11 09:09:21 +00:00
Daniel Jasper	80e95dce89	Revert "Codegen: Tail-duplicate during placement." This reverts commit r283842. test/CodeGen/X86/tail-dup-repeat.ll causes an llc crash with our internal testing. I'll share a link with you. llvm-svn: 283857	2016-10-11 07:36:11 +00:00
Matthias Braun	ec260ba34b	Fix warning; NFC llvm-svn: 283851	2016-10-11 04:32:03 +00:00
Matthias Braun	a38a444c10	MIRParser: generic register operands with types This should fix the fallout of r283848. llvm-svn: 283850	2016-10-11 04:22:29 +00:00
Matthias Braun	666af50fcf	MIRParser: Rewrite register info initialization; mostly NFC This changes MachineRegisterInfo to be initialized after parsing all instructions. This is in preparation for upcoming commits that allow the register class specification on the operand or deduce them from the MCInstrDesc. This commit removes the unused feature of having nonsequential register numbers. This was confusing anyway as the vreg numbers would be different after parsing when you had "holes" in your numbering. This patch also introduces the concept of an incomplete virtual register. An incomplete virtual register may be used during .mir parsing to construct MachineOperands without knowing the exact register class (or register bank) yet. NFC except for some error messages. Differential Revision: https://reviews.llvm.org/D22397 llvm-svn: 283848	2016-10-11 03:13:01 +00:00
Kyle Butt	f6ea6695ce	Codegen: Tail-duplicate during placement. The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, especially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Issue was worklist/scheduling/taildup issue in layout. Issue from 2nd rollback fixed, with 2 additional tests. Issue was tail merging/loop info/tail-duplication causing issue with loops that share a header block. Issue with early tail-duplication of blocks that branch to a fallthrough predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283842	2016-10-11 01:20:33 +00:00
Dylan McKay	a14534711a	[RegAllocGreedy] Attempt to split unspillable live intervals Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 283838	2016-10-11 01:04:36 +00:00

21411 Commits
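
Several of the DAGCombiner commits listed above (r284113, r284117, r284122, r284395) name algebraic folds directly in their subject lines. The sketch below is not LLVM code; it is a small Python model, assuming 32-bit wrapping integers, that checks those identities numerically. The helper names (wrap, to_signed, sra, check_folds, check_sra_of_all_sign_bits) are hypothetical and exist only for this illustration.

```python
MASK = 0xFFFFFFFF  # assumption: model values as 32-bit wrapping integers


def wrap(v):
    """Reduce a Python int to its 32-bit two's-complement bit pattern."""
    return v & MASK


def to_signed(v):
    """Interpret a 32-bit bit pattern as a signed value."""
    v &= MASK
    return v - (1 << 32) if v & 0x80000000 else v


def sra(v, amount):
    """Arithmetic right shift; Python's >> sign-extends negative ints."""
    return wrap(to_signed(v) >> amount)


def check_folds(x, y, z, c1, c2):
    """Check the folds named in r284113, r284117 and r284122 for one input."""
    # (sub -1, x) -> (xor x, -1)
    assert wrap(-1 - x) == wrap(x ^ MASK)
    # C2-(A+C1) -> (C2-C1)-A, with A played by x
    assert wrap(c2 - wrap(x + c1)) == wrap(wrap(c2 - c1) - x)
    # (mul (shl X, Y), Z) -> (shl (mul X, Z), Y)
    assert wrap(wrap(x << y) * z) == wrap(wrap(x * z) << y)
    return True


def check_sra_of_all_sign_bits():
    """r284395: an arithmetic right shift of 0 or -1 returns its input."""
    for v in (0, MASK):  # MASK is the bit pattern of -1
        for amount in range(32):
            assert sra(v, amount) == v
    return True
```

Calling check_folds with shift amounts from 0 to 31 and arbitrary 32-bit operands, then check_sra_of_all_sign_bits(), exercises each identity. This only demonstrates that the rewrites preserve the value under wrapping arithmetic; it says nothing about when the combiner considers a rewrite legal or profitable.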