llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	b3bf67b5cd	[X86][AVX] Pad small shuffle inputs in combineX86ShufflesRecursively As detailed on PR45974 and D79987, getFauxShuffleMask is creating nodes on the fly to create shuffles with inputs the same size as the result, causing problems for hasOneUse() checks in later simplification stages. Currently only combineX86ShufflesRecursively benefits from these widened inputs so I've begun moving the functionality there, and out of getFauxShuffleMask. This allows us to remove the widening from VBROADCAST and EXTEND faux shuffle cases. This just leaves the INSERT_SUBVECTOR case in getFauxShuffleMask still creating nodes, which will require more extensive refactoring.	2020-05-31 11:43:47 +01:00
Florian Hahn	4c3ac27019	[ScheduleDAG] Avoid unnecessary recomputation of topological order. In some cases ScheduleDAGRRList has to add new nodes to resolve problems with interfering physical registers. When new nodes are added, it completely re-computes the topological order, which can take a long time, but is unnecessary. We only add nodes one by one, and initially they do not have any predecessors. So we can just insert them at the end of the vector. Later we add predecessors, but the helper function properly updates the topological order much more efficiently. With this change, the compile time for the program below drops from 300s to 30s on my machine. define i11129 @test1() { %L1 = load i11129, i11129* undef %B30 = ashr i11129 %L1, %L1 store i11129 %B30, i11129* undef ret i11129 %L1 } This should be generally beneficial, as we can skip a large amount of work. Theoretically there are some scenarios where we might not save much, e.g. when we add a dependency between the first and last node. Then we would have to shift all nodes. But we still do not have to spend the time re-computing the initial order. Reviewers: MatzeB, atrick, efriedma, niravd, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D59722	2020-05-31 11:04:35 +01:00
Kang Zhang	ad15cce54c	Revert "[NFC][PowerPC] Add a new case to test phi-node-elimination pass" This case will fail on some machines that enable expensive-checks. This reverts commit af3abbf7bd2213003a133c361c212ac6efb1bd2b.	2020-05-31 09:24:21 +00:00
Kang Zhang	eb5c774879	[NFC][PowerPC] Add a new case to test phi-node-elimination pass	2020-05-31 08:05:27 +00:00
Jay Foad	8cf2c72f7d	[AMDGPU] Propagate fast-math flags when lowering FSIN and FCOS Differential Revision: https://reviews.llvm.org/D80813	2020-05-31 05:21:55 +01:00
Jay Foad	52203cdb2c	[AMDGPU] Precommit tests for D80813	2020-05-31 05:21:55 +01:00
Changpeng Fang	ac14e61af3	AMDGPU: Add setTruncStoreAction for vector i64 types made legal recently Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D80853	2020-05-30 20:45:27 -07:00
Craig Topper	b8eb4d4348	[X86] Remove unneeded bitconverts from isel patterns. NFC The types already match so TableGen is removing the bitconvert.	2020-05-30 20:24:52 -07:00
Craig Topper	38293fccd1	[X86] Add DAG combine to turn (v2i64 (scalar_to_vector (i64 (bitconvert (mmx))))) to MOVQ2DQ. Remove unneeded isel patterns. We already had a DAG combine for (mmx (bitconvert (i64 (extractelement v2i64)))) to MOVDQ2Q. Remove patterns for MMX_MOVQ2DQrr/MMX_MOVDQ2Qrr that use scalar_to_vector/extractelement involving i64 scalar type with v2i64 and x86mmx.	2020-05-30 19:47:08 -07:00
Craig Topper	87390781b5	[DAGCombiner] Move debug message and statistic update into CommitTargetLoweringOpt. This code was repeated in two callers of CommitTargetLoweringOpt. But CommitTargetLoweringOpt is also called from TargetLowering. We should print a message for those calls too. So sink the repeated code into CommitTargetLoweringOpt to catch those calls.	2020-05-30 19:47:07 -07:00
Craig Topper	dd848c4d13	[X86] Teach computeKnownBitsForTargetNode that the upper half of X86ISD::MOVQ2DQ is all zero.	2020-05-30 19:47:07 -07:00
Craig Topper	ca3c1fc0ed	[X86] Fix a place where we created MOVQ2DQ with a DstVT other than v2i64. The type profile and isel pattern have this type declared as being MVT::v2i64. But isel skips the explicit type check due to the type profile.	2020-05-30 19:47:07 -07:00
Craig Topper	b800414c5f	[X86] Autogenerate complete checks. NFC	2020-05-30 19:47:07 -07:00
Craig Topper	4c2e815193	[X86] Move MMX_SET0 pattern into the instruction definition. NFC	2020-05-30 19:47:07 -07:00
Fangrui Song	16f18fc060	[llvm-objdump] Delete unneeded namespace llvm {}	2020-05-30 18:03:43 -07:00
Fangrui Song	2b3d6eb517	[llvm-objdump] Move llvm:: to llvm::objdump:: and qualifying definitions with objdump:: Or adding `static`. Qualifying definitions with `objdump::` conforms to the coding standards https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions	2020-05-30 18:00:15 -07:00
Fangrui Song	5d4d16ee4e	[llvm-objdump] Simplify reportError() and prepend outs().flush() As noticed by dblaikie. I don't know what code paths using reportError can cause stdout output to be interleaved with stderr, so no test is added now. Also drop an unneeded use of errs().fflush() in reportWarning(). I requested this in D64165.	2020-05-30 17:25:59 -07:00
Craig Topper	bcfbda1501	[X86] Add pseudo instructions to use MULX with a single destination when the low result isn't used. The instruction is defined to only produce high result if both destinations are the same. We can exploit this to avoid unnecessarily clobbering a register. In order to hide this from register allocation we use a pseudo instruction and expand the result during MCInst creation. Differential Revision: https://reviews.llvm.org/D80500	2020-05-30 16:01:01 -07:00
Craig Topper	1f7fa77b87	[X86] Minor cleanups to addShuffleComments in X86MCInstPrinter.cpp. NFCI -Replace some ifs that should be impossible with asserts. -Use X86::AddrDisp and X86::AddrNumOperands to make code more readable -Use X86II::isKMasked/isKMergeMasked to do some operand skipping to remove or simplify switches	2020-05-30 13:51:48 -07:00
Craig Topper	c891a2ee19	[X86] Factor constant pool comment printing out of the switch in X86AsmPrinter::emitInstruction. NFC Pull the verbose asm check out of the cases and move it up to the call of the new function.	2020-05-30 13:51:37 -07:00
Whitney Tsang	1badd023ca	[LoopUnroll] Add a test case for rG7873376bb36b. rG7873376bb36b fixes a build failure for allyesconfig. The problem happened when the single exiting block doesn't dominate the loop latch, then the immediate dominator of the exit block should not be the exiting block after unrolling. As the exiting block of different unrolled iteration can branch to the exit block, and the ith exiting block doesn't dominate (i+1)th exiting block, the immediate dominator of the exit block should not be the nearest common dominator of the exiting block and the loop latch of the same iteration. Differential Revision: https://reviews.llvm.org/D80477	2020-05-30 20:34:27 +00:00
Philip Reames	727c874d62	[Tests] Convert last statepoint lowering tests to bundle format	2020-05-30 12:59:34 -07:00
Whitney Tsang	1fa84624ca	[LoopUnroll] Fix build failure for allyesconfig. Differential Revision: https://reviews.llvm.org/D80477.	2020-05-30 18:32:47 +00:00
zoecarver	4651d724fb	[DSE] Remove noop stores in MSSA. Adds a simple fast-path check for the pattern: v = load ptr store v to ptr I took the tests from the bugzilla post; I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391	2020-05-30 09:57:30 -07:00
Florian Hahn	1cc7eb0b48	[BasicAA] Use known lower bounds for index values for size based check. Currently, BasicAA does not exploit information about value ranges of indexes. For example, consider the 2 pointers %a = %base and %b = %base + %stride below, assuming they are used to access 4 elements. If we know that %stride >= 4, we know the accesses do not alias. If %stride is a constant, BasicAA currently gets that. But if the >= 4 constraint is encoded using an assume, it misses the NoAlias. This patch extends DecomposedGEP to include an additional MinOtherOffset field, which tracks the constant offset similar to the existing OtherOffset, with the difference that it also includes non-negative lower bounds on the range of the index value. When checking if the distance between 2 accesses exceeds the access size, we can use this improved bound. For now this is limited to using non-negative lower bounds for indices, as this conveniently skips cases where we do not have a useful lower bound (because it is not constrained). We potentially miss out in cases where the lower bound is constrained but negative, but that can be exploited in the future. Reviewers: sanjoy, hfinkel, reames, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D76194	2020-05-30 16:20:42 +01:00
Simon Pilgrim	abf14921b4	SafeStackColoring.h - reduce Instructions.h include to forward declaration. NFC. SafeStackColoring.cpp - remove includes directly defined in SafeStackColoring.h header. NFC.	2020-05-30 14:38:02 +01:00
Simon Pilgrim	787f97132a	CriticalAntiDepBreaker.cpp - remove includes directly defined in CriticalAntiDepBreaker.h header. NFC.	2020-05-30 14:32:36 +01:00
Simon Pilgrim	656274f67b	SafeStackLayout.cpp - remove includes directly defined in SafeStackLayout.h module header. NFC.	2020-05-30 14:30:19 +01:00
Simon Pilgrim	220d9ef6ed	[TargetLowering] SimplifyDemandedBits - remove shift amount clamps from getValidShiftAmountConstant calls. NFC. getValidShiftAmountConstant only returns a value if the shift amount is in range, so we don't need to check it again.	2020-05-30 14:04:55 +01:00
Simon Pilgrim	440c8825f9	[SelectionDAG] ComputeNumSignBits - use Valid Min/Max shift amount helpers directly. NFCI. We are calling getValidShiftAmountConstant first followed by getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant if that failed. But both are used in the same way in ComputeNumSignBits and the Min/Max variants call getValidShiftAmountConstant internally anyhow.	2020-05-30 14:02:14 +01:00
Simon Pilgrim	6012a879ba	PackedVersion.h - reduce includes to forward declarations. NFC.	2020-05-30 13:17:47 +01:00
Simon Pilgrim	ecc68c128d	TBEHandler.h - remove unnecessary VersionTuple forward declaration. NFC. We already have to include VersionTuple.h	2020-05-30 13:07:57 +01:00
Simon Pilgrim	c5d4a3f68b	ArchitectureSet.h - add missing <tuple> include. MSVC seems to implicitly include this from <utility> but other toolchains don't	2020-05-30 12:48:46 +01:00
Simon Pilgrim	5273f96cb9	ArchitectureSet.h - reduce raw_ostream.h include to forward declaration. NFC. Move raw_ostream.h include to ArchitectureSet.cpp.	2020-05-30 12:36:16 +01:00
Simon Pilgrim	02dff70f76	Architecture.h - reduce includes to forward declarations. NFC. Move includes to Architecture.cpp.	2020-05-30 12:17:13 +01:00
Simon Pilgrim	e30d327962	IPDBRawSymbol.h - remove already declared forward declarations. NFC. PDBTypes.h holds most PDB forward declarations already, move IPDBSession in there as well.	2020-05-30 12:00:17 +01:00
Simon Pilgrim	5c7d1e4f95	IPDBRawSymbol.h - reduce StringRef.h include to forward declaration. NFC.	2020-05-30 11:29:57 +01:00
Simon Pilgrim	10547eab71	[SelectionDAG] Remove repeated getOperand() call. NFC.	2020-05-30 10:21:36 +01:00
Craig Topper	1c05648385	[X86] Autogenerate complete checks. NFC	2020-05-29 23:45:04 -07:00
Martin Storsjö	ea1ebf9d95	[test] Regenerate checks in aarch64_win64cc_vararg.ll with update_llc_test_checks.py. NFC.	2020-05-30 09:22:09 +03:00
Martin Storsjö	6e7aff07a4	[AArch64] Treat x18 as callee-saved in functions with windows calling convention on non-windows OSes Treat it as callee-saved, and always back it up. When windows code calls entry points in unix code, marked with the windows calling convention, that unix code can call other functions that aren't compiled with -ffixed-x18 which may clobber x18 freely. By backing it up and restoring it on return, we preserve the register across the function call, fulfilling this part of the windows calling convention on another OS. This isn't enough for making sure that x18 is preserved when non-windows code does a callback to windows code, but is a clear improvement over the current status quo. Additionally, wine is nowadays building many modules as PE DLLs, which avoids the callback issue altogether for those DLLs. Differential Revision: https://reviews.llvm.org/D61892	2020-05-30 09:22:09 +03:00
Sourabh Singh Tomar	9015c15439	[DWARF5] Added support for emission of .debug_macro.dwo section This patch adds support for emission of following DWARFv5 macro forms in .debug_macro.dwo section: - DW_MACRO_start_file - DW_MACRO_end_file - DW_MACRO_define_strx - DW_MACRO_undef_strx Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D78866	2020-05-30 11:13:23 +05:30
Eric Christopher	8c2e933de8	NFC: Simplify O1 pass pipeline construction. Pull O1 pass pipeline out into a separate function and simplify buildFunctionSimplificationPipeline accordingly.	2020-05-29 20:08:22 -07:00
Eric Christopher	aceb767487	Fix full unrolling with new pass manager. Last we looked at this and couldn't come up with a reason to change it, but with a pragma for full loop unrolling we bypass every other loop unroll and then fail to fully unroll a loop when the pragma is set. Move the OnlyWhenForced out of the check and into the initialization of the full unroll pass in the new pass manager. This doesn't show up with the old pass manager. Add a new option to opt so that we can turn off loop unrolling manually since this is a difference between clang and opt. Tested with check-clang and check-llvm.	2020-05-29 20:08:21 -07:00
Fangrui Song	dca087a575	[ValueLattice] Fix uninitialized-value after D79036 Many check-clang-codegen tests failed.	2020-05-29 19:52:29 -07:00
Carl Ritson	39acbc605a	[AMDGPU] Remove assertion on S1024 SGPR to VGPR spill Summary: Replace an assertion that blocks S1024 SGPR to VGPR spill. The assertion pre-dates S1024 and is not wave size dependent. Reviewers: arsenm, sameerds, rampitec Reviewed By: arsenm Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80783	2020-05-30 11:16:19 +09:00
Matt Arsenault	e05423dc7a	AMDGPU: Optimize s_setreg_b32 to s_denorm_mode/s_round_mode This is a custom inserter because it was less work than teaching tablegen a way to indicate that it is sometimes OK to have a no side effect instruction in the output of a side effecting pattern. The asm is needed to look like a read of the mode register to prevent it from being deleted. However, there seems to be a bug where the mode register def instructions are moved across the asm sideeffect by the post-RA scheduler. Another oddity is the immediate is formatted differently between s_denorm_mode and s_round_mode.	2020-05-29 21:11:36 -04:00
Matt Arsenault	c709b1e4db	AMDGPU: Add new baseline tests for setreg handling Most of these should be identical and use a common prefix, but update_llc_test_checks is failing to generate shared checks for some reason.	2020-05-29 21:00:30 -04:00
Matt Arsenault	9fa36c2fa4	AMDGPU: Move MIMG MMO check to verifier	2020-05-29 20:58:23 -04:00
Christopher Tetreault	41198e7d2b	[SVE] Eliminate calls to default-false VectorType::get() from AMDGPU Reviewers: efriedma, david-arm, fpetrogalli, arsenm Reviewed By: david-arm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, tschuett, hiraditya, rkruppe, psnobl, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80328	2020-05-29 17:54:17 -07:00
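
The size-based check described in the [BasicAA] entry above (1cc7eb0b48) can be illustrated with a minimal sketch. This is not the code from D76194: the helper name `provablyNoAlias` and its parameters are hypothetical, and the model is simplified to two accesses that start at `%base` and `%base + offset` (counted in elements), each touching the same number of elements. It only shows why a non-negative lower bound on the offset is enough to conclude that the accesses do not overlap.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM. Models two accesses starting at
// base and base + offset, each touching accessElems elements.
// minOffset is a proven lower bound on offset, for example derived from
// an assume that %stride >= 4.
constexpr bool provablyNoAlias(std::int64_t minOffset,
                               std::uint64_t accessElems) {
  // A negative lower bound gives no useful information in this model,
  // mirroring the commit's restriction to non-negative bounds.
  if (minOffset < 0)
    return false;
  // [0, accessElems) and [offset, offset + accessElems) are disjoint
  // whenever offset >= accessElems, so the lower bound is sufficient.
  return static_cast<std::uint64_t>(minOffset) >= accessElems;
}

// The example from the commit message: 4-element accesses, stride >= 4.
static_assert(provablyNoAlias(4, 4), "stride >= 4 rules out overlap");
// A weaker bound (stride >= 3) cannot rule out overlap.
static_assert(!provablyNoAlias(3, 4), "stride >= 3 is not enough");
```

With a constant stride the bound is simply the stride itself; the commit's point is that the same conclusion should follow when the bound is only known through an assume.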


197559 Commits