llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Eli Friedman	4fd32329a6	Revert r312724 ("[ARM] Remove redundant vcvt patterns."). It leads to some improvements, but also a regression for the simple case, so it's not clearly a good idea. test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference. Ultimately, the right solution is probably to custom-lower fp-to-int conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast. It's hard to do the right thing when the implicit bitcast isn't visible to DAG transforms. llvm-svn: 314169	2017-09-25 22:07:33 +00:00
Saleem Abdulrasool	6256c01685	X86: remove R12 from CSR on Windows x64 SwiftCC R12 is used for the SwiftError parameter. It is no longer a CSR, as it is used to transfer the SwiftError, and the caller must preserve it if they need to. llvm-svn: 314165	2017-09-25 22:00:17 +00:00
Eli Friedman	08f50597ed	[ARM] Fix tests for vcvt+store to return void. This is what I meant to do in r314161; I didn't realize I'd messed up because the generated assembly is currently identical. llvm-svn: 314163	2017-09-25 21:55:27 +00:00
Eli Friedman	2da10bf0b6	[ARM] Add tests for vcvt followed by store. llvm-svn: 314161	2017-09-25 21:37:52 +00:00
Eli Friedman	f06e0331f8	[ARM] Regenerate vcvt test checks. llvm-svn: 314160	2017-09-25 21:34:29 +00:00
Craig Topper	854c77fbb7	[X86] Don't select anyext GR32->GR64 to SUBREG_TO_REG. Use INSERT_SUBREG instead. As far as I know SUBREG_TO_REG is stating that the upper bits are 0. But if we are just converting the GR32 with no checks, then we have no reason to say the upper bits are 0. I don't really know how to test this today since I can't find anything that looks that closely at SUBREG_TO_REG. The test changes here seem to be some perturbation of register allocation. Differential Revision: https://reviews.llvm.org/D38001 llvm-svn: 314152	2017-09-25 21:14:59 +00:00
Craig Topper	34dfaa97d8	[X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like the NOREX version of TEST. llvm-svn: 314151	2017-09-25 21:14:55 +00:00
Sanjay Patel	115f2aa1d5	[InstCombine] remove extract-of-select vector transform (2nd try) The 1st attempt at this: https://reviews.llvm.org/rL314117 was reverted at: https://reviews.llvm.org/rL314118 because of bot fails for clang tests that were checking optimized IR. That should be fixed with: https://reviews.llvm.org/rL314144 ...so try again. Original commit message: The transform to convert an extract-of-a-select-of-vectors was added at: https://reviews.llvm.org/rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT. Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314147	2017-09-25 20:30:53 +00:00
Justin Lebar	a89ad847a6	Revert "[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.", rL314135. Causing assertion failures on macos: > Assertion failed: (Num < NumOperands && "Invalid child # of SDNode!"), > function getOperand, file > /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/include/llvm/CodeGen/SelectionDAGNodes.h, > line 835. http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42739/testReport/LLVM/CodeGen_NVPTX/surf_read_cuda_ll/ llvm-svn: 314142	2017-09-25 19:41:56 +00:00
Konstantin Belochapka	e991d1a8a7	[X86] [ASM INTEL SYNTAX] fix for incorrect assembler code generation when x86-asm-syntax=intel (PR34617). Fix for incorrect code generation when x86-asm-syntax=intel. Differential Revision: https://reviews.llvm.org/D37945 llvm-svn: 314140	2017-09-25 19:26:48 +00:00
Craig Topper	1cf3918f57	[SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors This teaches simplifyDemandedBits to handle constant splat vector shifts. This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount. I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here. I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change. Differential Revision: https://reviews.llvm.org/D37665 llvm-svn: 314139	2017-09-25 19:26:08 +00:00
Alexey Bataev	85ed5e891b	[SLP] Add a test for PR32086, NFC. llvm-svn: 314137	2017-09-25 19:12:59 +00:00
Artem Belevich	8a55fc8708	[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins. Differential Revision: https://reviews.llvm.org/D38191 llvm-svn: 314135	2017-09-25 18:53:57 +00:00
Hongbin Zheng	8c8d69325e	[SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative. Since SCEV can now handle 'urem', a 'urem' is a better canonical form than an 'srem' because it has well-defined behavior. This is a follow-up to D34598. Differential Revision: https://reviews.llvm.org/D38072 llvm-svn: 314125	2017-09-25 17:39:40 +00:00
Arnold Schwaighofer	92ea97589e	ARM: Use the proper swifterror CSR list on platforms other than darwin Noticed by inspection llvm-svn: 314121	2017-09-25 17:19:50 +00:00
Sanjay Patel	e61651e45e	revert r314117 because there are bogus clang tests that depend on the optimizer llvm-svn: 314118	2017-09-25 17:00:04 +00:00
Sanjay Patel	4fd934685a	[InstCombine] remove extract-of-select vector transform The transform to convert an extract-of-a-select-of-vectors was added at: rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT. Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314117	2017-09-25 16:41:34 +00:00
Reid Kleckner	6ad67a38d2	[DebugInfo] Sort the SDDbgValue list before assuming it is in IR order Summary: This code iterates the 'Orders' vector in parallel with the DbgValue list, emitting all DBG_VALUEs that occurred between the last IR order insertion point and the next insertion point. This assumes the SDDbgValue list is sorted in IR order, which it usually is. However, it is not sorted when a node with a debug value is replaced with another one. When this happens, TransferDbgValues is called, and the new value is added to the end of the list. The problem can be solved by stably sorting the list by IR order. Reviewers: aprantl, Ka-Ka Reviewed By: aprantl Subscribers: MatzeB, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D38197 llvm-svn: 314114	2017-09-25 16:14:53 +00:00
Michael Zuckerman	14e5a5b466	[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF8 stride 4): This patch expands the support of lowerInterleavedStore to 8x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4 VF=8) and we plan to include more patterns in the future. The patch goal is to optimize the following sequence: At the end of the computation, we have xmm2, xmm0, xmm12 and xmm3 each holding 8 chars: c0, c1, ..., c7; m0, m1, ..., m7; y0, y1, ..., y7; k0, k1, ..., k7. And these need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Reviewers: DavidKreitzer Farhana zvi igorb guyblank RKSimon Ayal Differential Revision: https://reviews.llvm.org/D36058 Change-Id: I3cc5c2ca5d6318901c192a4428493b99ef424c32 llvm-svn: 314109	2017-09-25 14:50:38 +00:00
Nemanja Ivanovic	48de75d5c6	[PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGT As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314106	2017-09-25 14:05:46 +00:00
Chad Rosier	78be6742d5	[AArch64] Add basic support for Qualcomm's Saphira CPU. llvm-svn: 314105	2017-09-25 14:05:00 +00:00
Alexey Bataev	0b1859ac23	[SLP] Support for horizontal min/max reduction. Summary: SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Reviewers: spatel, mkuper, hfinkel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27846 llvm-svn: 314101	2017-09-25 13:34:59 +00:00
Craig Topper	7464cec0af	[X86] Make IFMA instructions during isel so we can fold broadcast loads. This required changing the ISD opcode for these instructions to have the commutable operands first and the addend last. This way tablegen can autogenerate the additional patterns for us. llvm-svn: 314083	2017-09-24 19:30:55 +00:00
Craig Topper	1821ddc87b	[X86] Add tests to show missed opportunities to fold broadcast loads into IFMA instructions when the load is on operand1 of the intrinsic. We need to enable commuting during isel to catch this since the load folding tables can't handle broadcasts. llvm-svn: 314082	2017-09-24 19:30:54 +00:00
Craig Topper	58d5232b50	[X86] Add IFMA instructions to the load folding tables and make them commutable for the multiply operands. llvm-svn: 314080	2017-09-24 17:28:14 +00:00
Simon Pilgrim	ccb6bb45fc	[X86][SSE] Add more tests for shuffle combining with extracted vector elements (PR22415) llvm-svn: 314077	2017-09-24 13:45:49 +00:00
Simon Pilgrim	60bcaca225	[X86][SSE] Add support for extending bool vectors bitcasted from scalars This patch acts as a reverse to combineBitcastvxi1 - bitcasting a scalar integer to a boolean vector and extending it 'in place' to the requested legal type. Currently this doesn't handle AVX512 at all - but the current mask register approach is lacking for some cases. Differential Revision: https://reviews.llvm.org/D35320 llvm-svn: 314076	2017-09-24 13:42:31 +00:00
Nemanja Ivanovic	c7eeab54f7	[PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGE As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential review. llvm-svn: 314073	2017-09-24 05:48:11 +00:00
Craig Topper	bff345b46e	[AVX-512] Add pattern for selecting masked version of v8i32/v8f32 compare instructions when VLX isn't available. We use a v16i32/v16f32 compare instead and truncate the result. We already did this for the unmasked version, but were missing the version with 'and'. llvm-svn: 314072	2017-09-24 05:24:52 +00:00
Davide Italiano	e1060417c3	[Verifier] Stop accepting broken DIGlobalVariable(s). The code wasn't yelling at the user when there's a reference from a DIGlobalVariableExpression. Thanks to Adrian for the reduced testcase. Fixes PR34672. llvm-svn: 314069	2017-09-24 01:06:35 +00:00
Simon Pilgrim	60b112db3a	[X86] Regenerate i64 to v2f32 bitcast test llvm-svn: 314068	2017-09-23 19:18:29 +00:00
Sanjay Patel	8fe52dcc51	[x86] reduce 64-bit mask constant to 32-bits by right shifting This is a follow-up from D38181 (r314023). We have to put 64-bit constants into a register using a separate instruction, so we should try harder to avoid that. From what I see, we're not likely to encounter this pattern in the DAG because the upstream setcc combines from this don't (usually?) produce this pattern. If we fix that, then this will become more relevant. Since the cost of handling this case is just loosening the predicate of the existing fold, we might as well do it now. llvm-svn: 314064	2017-09-23 14:32:07 +00:00
Sanjay Patel	47c506979b	[x86] add an add+shift test for follow-up suggestion from D38181; NFC llvm-svn: 314063	2017-09-23 14:24:07 +00:00
Nemanja Ivanovic	468e2da8ae	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGT As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314062	2017-09-23 12:53:03 +00:00
Nemanja Ivanovic	ce5d3e6fbd	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGE As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314060	2017-09-23 09:50:12 +00:00
Nemanja Ivanovic	118a68dea9	[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGT As mentioned in https://reviews.llvm.org/D33718, this simply adds another pattern to the compare elimination sequence and is committed without a differential revision. llvm-svn: 314055	2017-09-23 04:41:34 +00:00
Konstantin Belochapka	96e1c71c44	[X86] [MC] fixed non optimal encoding of instruction memory operand (PR24038). Fixed suboptimal encoding of instruction memory operand when assembler is used to select 32 bit fixup rather than 8 bit immediate for encoding memory offset value. Differential Revision: https://reviews.llvm.org/D38117 llvm-svn: 314044	2017-09-22 23:37:48 +00:00
Craig Topper	b01d5413e4	[InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value. This is the inverse of what we do for SGT/SLT/UGT/ULT. llvm-svn: 314032	2017-09-22 21:47:22 +00:00
Craig Topper	fb5f5f7b5b	[InstCombine] Add test cases for known bits simplifications for comparisons that don't depend on constant RHS. NFC This shows some missing simplifications for sge/sle/uge/ule relative to their non-equality counterparts. llvm-svn: 314031	2017-09-22 21:47:21 +00:00
Craig Topper	7889014853	[InstCombine] Remove a FIXME from a test that was fixed in r314025. llvm-svn: 314030	2017-09-22 21:47:20 +00:00
Sanjay Patel	a0ebd7c32a	[x86] remove over-specified platform from test config llvm-svn: 314027	2017-09-22 21:07:13 +00:00
Craig Topper	a151e88501	[InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits. llvm-svn: 314025	2017-09-22 19:54:15 +00:00
Sanjay Patel	d30aaf33b0	[x86] swap order of srl (and X, C1), C2 when it saves size The (non-)obvious win comes from saving 3 bytes by using the 0x83 'and' opcode variant instead of 0x81. There are also better improvements based on known-bits that allow us to eliminate the mask entirely. As noted, this could be extended. There are potentially other wins from always shifting first, but doing that reveals a tangle of problems in other pattern matching. We do this transform generically in instcombine, but we often have icmp IR that doesn't match that pattern, so we must account for this in the backend. Differential Revision: https://reviews.llvm.org/D38181 llvm-svn: 314023	2017-09-22 19:37:21 +00:00
Rafael Espindola	c4167be768	llvm-ar: align the first archive member consistently. Before we were aligning the member after the symbol table to 4 but other members to 8. llvm-svn: 314010	2017-09-22 18:36:00 +00:00
Tim Shen	7c634910d0	[XRay] support conditional return on PPC. Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest. Reviewers: dberris, echristo Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D38102 llvm-svn: 314005	2017-09-22 18:30:02 +00:00
Guozhi Wei	af71947aaf	[TargetTransformInfo] Handle intrinsic call in getInstructionLatency() Usually an intrinsic is a simple target instruction, so it should have a small latency. A real function call has much larger latency. So handle the intrinsic call in function getInstructionLatency(). Differential Revision: https://reviews.llvm.org/D38104 llvm-svn: 314003	2017-09-22 18:25:53 +00:00
Rafael Espindola	76c70d6c2c	llvm-ar: Don't add an unnecessary alignment in gnu mode. This is mostly for getting stricter testing in preparation for future changes. llvm-svn: 314000	2017-09-22 18:16:13 +00:00
Pranav Bhandarkar	36cbef5143	Check vector elements for equivalence in the HexagonVectorLoopCarriedReuse pass If the two instructions being compared for equivalence have corresponding operands that are integer constants, then check their values to determine equivalence. Patch by Suyog Sarda! llvm-svn: 313993	2017-09-22 16:43:31 +00:00
Sanjay Patel	969c9899ca	[x86] remove unnecessary OS specifier from test llvm-svn: 313986	2017-09-22 14:38:57 +00:00
Sanjay Patel	91c9495a15	[x86] auto-generate complete checks; NFC llvm-svn: 313985	2017-09-22 14:30:52 +00:00
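
The reordering described in r314023 (d30aaf33b0), swapping `srl (and X, C1), C2` to mask after the shift, can be sanity-checked outside the compiler. The following is only an illustrative sketch in Python, not LLVM code: the function names are hypothetical, `X` is modelled as a non-negative integer, and the "fits in a sign-extended 8-bit immediate" test is an assumed stand-in for when the shorter 0x83 `and` encoding mentioned in the commit message would apply.

```python
def srl_of_and(x: int, c1: int, c2: int) -> int:
    """Original order: mask with C1 first, then logical shift right by C2."""
    return (x & c1) >> c2


def and_of_srl(x: int, c1: int, c2: int) -> int:
    """Swapped order: shift first, then mask with the pre-shifted constant."""
    return (x >> c2) & (c1 >> c2)


def fits_signed_imm8(c: int) -> bool:
    """True if the 32-bit constant c is representable as a sign-extended imm8."""
    c &= 0xFFFFFFFF
    signed = c - (1 << 32) if c & 0x80000000 else c
    return -128 <= signed <= 127


if __name__ == "__main__":
    c1, c2 = 0x3F0, 4  # hypothetical constants chosen for illustration
    # Both orders agree for every 16-bit input.
    assert all(srl_of_and(x, c1, c2) == and_of_srl(x, c1, c2)
               for x in range(1 << 16))
    # The mask needs a 32-bit immediate before the swap and an 8-bit one after.
    print(fits_signed_imm8(c1), fits_signed_imm8(c1 >> c2))
```

With these constants the original mask `0x3F0` does not fit in an 8-bit immediate while the shifted mask `0x3F` does, which is the kind of case where the commit message reports a 3-byte saving.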


47727 Commits