llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Florian Hahn	f9400b47ee	[SCEV] Add test cases where the max BTC is imprecise, due to step != 1. Add a test case where we fail to compute a tight max backedge taken count, due to the step being != 1. This is part of the issue with PR40961.	2020-10-10 16:39:48 +01:00
Florian Hahn	dd97e149d5	[SCEV] Handle ULE in applyLoopGuards. Handle ULE predicate in similar fashion to ULT predicate in applyLoopGuards.	2020-10-10 16:26:28 +01:00
Florian Hahn	82b474d0d8	[SCEV] Add a test case with ULE loop guard.	2020-10-10 15:58:26 +01:00
Nikita Popov	6d9e2cd7e0	[MemCpyOpt] Add test for incorrect memset DSE (NFC) We can't shorten the memset if there's a throwing call in between and the destination is non-local.	2020-10-10 16:11:14 +02:00
David Green	995d885e43	[ARM] Attempt to make Tail predication / RDA more resilient to empty blocks There are a number of places in RDA where we assume the block will not be empty. This isn't necessarily true for tail predicated loops where we have removed instructions. This attempt to make the pass more resilient to empty blocks, not casting pointers to machine instructions where they would be invalid. The test contains a case that was previously failing, but recently been hidden on trunk. It contains an empty block to begin with to show a similar error. Differential Revision: https://reviews.llvm.org/D88926	2020-10-10 14:50:25 +01:00
Alok Kumar Sharma	0a8029e199	[DebugInfo] Support for DWARF attribute DW_AT_rank This patch adds support for DWARF attribute DW_AT_rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF attribute DW_AT_rank is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89141	2020-10-10 17:51:12 +05:30
David Green	d5dc3dc1f4	[AArch64][LV] Move vectorizer test to Transforms/LoopVectorize/AArch64. NFC	2020-10-10 10:15:43 +01:00
David Green	08b29d77c1	[TblGen][Scheduling] Fix debug output. NFC This just moves some newlines to the expected places.	2020-10-10 10:04:28 +01:00
Nikita Popov	12c9ccbc9d	[MemCpyOpt] Don't hoist store that's not guaranteed to execute MemCpyOpt can hoist stores while load+store pairs into memcpy. This hoisting can currently result in stores being executed that weren't guaranteed to execute in the original problem. Differential Revision: https://reviews.llvm.org/D89154	2020-10-10 10:26:28 +02:00
Denis Antrushin	2cab9515a5	[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty. Currently we allow passing pointers from deopt bundle on VReg only if they were seen in list of gc-live pointers passed on VRegs. This means that for the case of empty gc-live bundle we spill deopt bundle's pointers. This change allows lowering deopt pointers to VRegs in case of empty gc-live bundle. In case of non-empty gc-live bundle, behavior does not change. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D88999	2020-10-10 14:58:08 +07:00
Zi Xuan Wu	e3b36cdf43	[CSKY 1/n] Add basic stub or infra of csky backend This patch introduce files that just enough for lib/Target/CSKY to compile. Notably a basic CSKYTargetMachine and CSKYTargetInfo. Differential Revision: https://reviews.llvm.org/D88466	2020-10-10 10:44:08 +08:00
Fangrui Song	9d5d82fe09	[PowerPC] Fix signed overflow in decomposeMulByConstant after D88201 Caught by multipliers LONG_MAX (after +1) and LONG_MIN (after -1) in CodeGen/PowerPC/mul-const-i64.ll	2020-10-09 18:29:12 -07:00
Xiang1 Zhang	ddc18e0952	[X86] Add CET test, NFC	2020-10-10 09:13:04 +08:00
Fangrui Song	d0417e8192	[bugpoint] Delete -safe-llc and make -run-llc work like -run-llc -safe-run-llc	2020-10-09 16:38:30 -07:00
Changpeng Fang	7c619d4ea7	Sink: Handle instruction sink when a user is dead Summary: The current instruction sink pass uses findNearestCommonDominator of all users to find block to sink the instruction to. However, a user may be in a dead block, which will result in unexpected behavior. This patch handles such cases by skipping dead blocks. This patch fixes: https://bugs.llvm.org/show_bug.cgi?id=47415 Reviewers: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D89166	2020-10-09 16:20:26 -07:00
Joao Moreira	3755e8a85b	[X86] Check if call is indirect before emitting NT_CALL The notrack prefix is a relaxation of CET policies which makes it possible to indirectly call targets which do not have an ENDBR instruction in the landing address. To emit a call with this prefix, the special attribute "nocf_check" is used. When used as a function attribute, a CallInst targeting the respective function will return true for the method "doesNoCfCheck()", no matter if it is a direct call (and such should remain like this, as the information that the to-be-called function won't perform control-flow checks is useful in other contexts). Yet, when emitting an X86ISD::NT_CALL, the respective CallInst should be verified for its indirection, allowing that the prefixed calls are only emitted in the right situations. Update the respective testing unit to also verify for direct calls to functions with ''nocf_check'' attributes. The bug can also be reproduced through compiling the following C code using the -fcf-protection=full flag. int __attribute__((nocf_check)) foo(int a) {}; int main() { foo(42); } Differential Revision: https://reviews.llvm.org/D87320	2020-10-09 15:54:23 -07:00
Fangrui Song	17ccae5b61	[X86][test] Add a regression test for lock cmpxchg16b on a global variable with offset Add a test for a bug (uncovered by D88808) fixed by f34bb06935aa3bab353d70d515b767fdd2f5625c. Also delete cmpxchg16b.ll which is covered by atomic128.ll Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89163	2020-10-09 15:44:32 -07:00
Eli Friedman	8b737bdfec	[SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules. If a module has many values that need to be resolved by ResolvedUndefsIn, compilation takes quadratic time overall. Solve should do a small amount of work, since not much is added to the worklists each time markOverdefined is called. But ResolvedUndefsIn is linear over the length of the function/module, so resolving one undef at a time is quadratic in general. To solve this, make ResolvedUndefsIn resolve every undef value at once, instead of resolving them one at a time. This loses a little optimization power, but can be a lot faster. We still need a loop around ResolvedUndefsIn because markOverdefined could change the set of blocks that are live. That should be uncommon, hopefully. We could optimize it by tracking which blocks transition from dead to live, instead of iterating over the whole module to find them. But I'll leave that for later. (The whole function will become a lot simpler once we start pruning branches on undef.) The regression test changes seem minor. The specific cases in question could probably be optimized with a bit more work, but they seem like edge cases that don't really matter. Fixes an "infinite" compile issue my team found on an internal workoad. Differential Revision: https://reviews.llvm.org/D89080	2020-10-09 15:24:16 -07:00
Steven Wu	c84011f079	[IRMover] Add missing open quote in the warning message Fix the missing single quotation mark in the warning message for target triple mismatch.	2020-10-09 15:17:16 -07:00
Jordan Rupprecht	5f9071dec6	Temporarily revert "[ThinLTO] Re-order modules for optimal multi-threaded processing" This reverts commit 6537004913f3009d896bc30856698e7d22199ba7. This is causing test failures internally, and while a few of the cases turned out to be bad user code (relying on a specific order of static initialization across translation units), some cases are less clear. Temporarily reverting for now, and Teresa is going to follow up with more details.	2020-10-09 14:36:20 -07:00
Thomas Lively	1457667113	[WebAssembly] Prototype i16x8.q15mulr_sat_s This saturating, rounding, Q-format multiplication instruction is proposed in https://github.com/WebAssembly/simd/pull/365. Differential Revision: https://reviews.llvm.org/D88968	2020-10-09 21:17:53 +00:00
Mircea Trofin	b0f57bde9d	[NFC][Regalloc] VirtRegAuxInfo::Hint does not need to be a field It is only used in weightCalcHelper, and cleared upon its finishing its job there. The patch further cleans up style guide discrepancies, and simplifies CopyHint by removing duplicate 'IsPhys' information (it's what the Reg field would report).	2020-10-09 13:42:23 -07:00
Krzysztof Parzyszek	3600e70de4	[Hexagon] Remove ISD node VSPLATW, use VSPLAT instead This is a step towards improving HVX codegen for splat.	2020-10-09 15:38:02 -05:00
Krzysztof Parzyszek	0065c2c37b	[Hexagon] Generalize handling of SDNodes created during ISel The selection of HVX shuffles can produce more nodes in the DAG, which need special handling, or otherwise they would be left unselected by the main selection code. Make the handling of such nodes more general.	2020-10-09 15:38:02 -05:00
Arthur Eubanks	b450556f70	[Reg2Mem][NewPM] Pin test to legacy PM This pass hasn't been touched in a long time and isn't used in tree.	2020-10-09 12:36:08 -07:00
Mircea Trofin	b67aae58ba	[NFC][Regalloc] Fix coding style in CalcSpillWeights	2020-10-09 12:22:12 -07:00
Craig Topper	fb5825dfe3	[X86] When expanding LCMPXCHG16B_NO_RBX in EmitInstrWithCustomInserter, directly copy address operands instead of going through X86AddressMode. I suspect getAddressFromInstr and addFullAddress are not handling all addresses cases properly based on a report from MaskRay. So just copy the operands directly. This should be more efficient anyway.	2020-10-09 11:55:24 -07:00
Craig Topper	db78542b9d	[X86] Don't copy kill flag when expanding LCMPXCHG16B_SAVE_RBX The expansion code creates a copy to RBX before the real LCMPXCHG16B. It's possible this copy uses a register that is also used by the real LCMPXCHG16B. If we set the kill flag on the use in the copy, then we'll fail the machine verifier on the use on the LCMPXCHG16B. Differential Revision: https://reviews.llvm.org/D89151	2020-10-09 11:55:24 -07:00
Nikita Popov	b9a4cf95b0	[MemCpyOpt] Add test for incorrectly hoisted store (NFC)	2020-10-09 20:52:08 +02:00
Arthur Eubanks	fb1aea563c	[BPF] Make BPFAbstractMemberAccessPass required Or else on optnone functions we get the following during instruction selection: fatal error: error in backend: Cannot select: intrinsic %llvm.preserve.struct.access.index Currently the -O0 pipeline doesn't properly run passes registered via TargetMachine::registerPassBuilderCallbacks(), so don't add that RUN line yet. That will be fixed after this. Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D89083	2020-10-09 11:26:37 -07:00
Simon Pilgrim	cf6dd3759a	[ARM][MIPS] Add funnel shift test coverage Based on offline discussions regarding D89139 and D88783 - we want to make sure targets aren't doing anything particularly dumb Tests copied from aarch64 which has a mixture of general, legalization and special case tests	2020-10-09 19:19:47 +01:00
Giorgis Georgakoudis	5c60ef51ef	[OpenMPOpt] Merge parallel regions There are cases that generated OpenMP code consists of multiple, consecutive OpenMP parallel regions, either due to high-level programming models, such as RAJA, Kokkos, lowering to OpenMP code, or simply because the programmer parallelized code this way. This optimization merges consecutive parallel OpenMP regions to: (1) reduce the runtime overhead of re-activating a team of threads; (2) enlarge the scope for other OpenMP optimizations, e.g., runtime call deduplication and synchronization elimination. This implementation defensively merges parallel regions, only when they are within the same BB and any in-between instructions are safe to execute in parallel. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83635	2020-10-09 09:59:04 -07:00
Arthur Eubanks	a70ba1098b	[FixIrreducible][NewPM] Port -fix-irreducible to NPM In the NPM, a pass cannot depend on another non-analysis pass. So pin the test that tests that -lowerswitch is run automatically to legacy PM. Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D89051	2020-10-09 09:22:09 -07:00
Arthur Eubanks	a2aedea790	[LoopInterchange][NewPM] Port -loop-interchange to NPM Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D89058	2020-10-09 09:21:31 -07:00
Jay Foad	0cf623e390	[AMDGPU] Only enable mad/mac legacy f32 patterns if denormals may be flushed Following on from D88890, this makes the newly added patterns conditional on NoFP32Denormals. mad/mac f32 instructions always flush denormals regardless of the MODE register setting, and I believe the legacy variants do the same. Differential Revision: https://reviews.llvm.org/D89123	2020-10-09 17:08:38 +01:00
Simon Pilgrim	99440aa5b8	[InstCombine] Support lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) uniform vector tests FoldShiftByConstant is hardcoded for scalar/uniform outer shift amounts atm so that needs to be fixed first to support non-uniform cases	2020-10-09 16:54:46 +01:00
Simon Pilgrim	9735e12152	[InstCombine] Add lshr(trunc(lshr(x,c1)), c2) -> trunc(lshr(lshr(x,c1),c2)) vector tests	2020-10-09 16:54:46 +01:00
David Green	863d69bc9d	[ARM] Add MVE vecreduce costmodel tests. NFC There were some existing tests that were not super useful. New ones are added for testing MVE specific patterns.	2020-10-09 16:25:25 +01:00
Scott Linder	644ae87b23	[NFC] Reformat MILexer.cpp:getIdentifierKind Reformat to avoid unrelated changes in diff of future patch. Committed as obvious.	2020-10-09 15:21:24 +00:00
Simon Pilgrim	23e4436662	[InstCombine] commonShiftTransforms - add support for pow2 nonuniform constant vectors in srem fold Note: we already fold srem to undef if any denominator vector element is undef.	2020-10-09 15:59:33 +01:00
Krzysztof Parzyszek	9ee6426bce	[Hexagon] Return 1 instead of 0 from getMaxInterleaveFactor	2020-10-09 09:46:18 -05:00
Sanjay Patel	45b43d4e29	[InstCombine] allow vector splats for add+and with high-mask There might be a better way to specify the pre-conditions, but this is hopefully clearer than the way it was written: https://rise4fun.com/Alive/Jhk3 Pre: C2 < 0 && isShiftedMask(C2) && (C1 == C1 & C2) %a = and %x, C2 %r = add %a, C1 => %a2 = add %x, C1 %r = and %a2, C2	2020-10-09 10:39:11 -04:00
Simon Pilgrim	f2ba86985b	[InstCombine] Add tests for X shift (A srem B) -> X shift (A and B-1) pow2 nonuniform constant vectors	2020-10-09 15:33:06 +01:00
LLVM GN Syncbot	eaed84a6bd	[gn build] Port 0741a2c9cac	2020-10-09 13:54:24 +00:00
Florian Hahn	9cb6002d04	[SCEV] Do not apply info from loop guards in AddRecs. We cannot guarantee that the replacement expression is loop-invariant in all AddRecs in the source expression. Use a rewriter that skips AddRecExpr for now. Fixes PR47776.	2020-10-09 14:47:26 +01:00
Simon Pilgrim	af14bbf8c2	[InstCombine] foldShiftOfShiftedLogic - add support for nonuniform constant vectors	2020-10-09 14:25:12 +01:00
Simon Pilgrim	5215c542c4	[InstCombine] foldShiftOfShiftedLogic - replace cast<BinaryOperator> with m_BinOp matcher. NFCI. Allows us to drop the !isa<ConstantExpr> check.	2020-10-09 14:10:12 +01:00
Simon Pilgrim	6d17aca7ab	[InstCombine] Add nonuniform/undef vector tests for shift(binop(shift(x,c1),y),c2) patterns	2020-10-09 13:42:11 +01:00
Roman Lebedev	7fd4fe466c	Reland "[NFC][SCEV] Improve tests for ptrtoint modelling (D88806)" I messed up runlines in the original commit.	2020-10-09 14:50:05 +03:00
Max Kazantsev	2cb9b29674	[NFC] Add option to disable IV widening if needed IV widening is sometimes a strictly harmful transform (some examples of this are shown in tests 11, 12 in widen-loop-comp.ll). One of the reasons of this is that sometimes SCEV fails to prove some facts after part of guards has been widened. Though each single such case looks like a bug that can be addressed, it seems that disabling of IV widening may be profitable in some cases. We want to have an option to do so. By default, existing behavior is preserved and IV widening is on.	2020-10-09 18:32:03 +07:00

... 5 6 7 8 9 ...

205207 Commits