llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Kazushi (Jam) Marukawa	b9417501fa	[VE] Add vrcp, vrsqrt, vcvt, vmrg, and vshf intrinsic instructions Add vrcp, vrsqrt, vcvt, vmrg, and vshf intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92750	2020-12-07 20:30:12 +09:00
Cullen Rhodes	143e05ecbb	[IR] Bail out for scalable vectors in ShuffleVectorInst::isConcat Shuffle mask for concat can't be expressed for scalable vectors, so we should bail out. A test has been added that previously crashed, also tested isIdentityWithPadding and isIdentityWithExtract where we already bail out. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92475	2020-12-07 10:48:35 +00:00
Cullen Rhodes	250db46c35	[IR] Support scalable vectors in ShuffleVectorInst::increasesLength Since the length of the llvm::SmallVector shufflemask is related to the minimum number of elements in a scalable vector, it is fine to just get the Min field of the ElementCount. This is already done for the similar function changesLength, tests have been added for both. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92472	2020-12-07 10:42:48 +00:00
Kazushi (Jam) Marukawa	92f08c12fd	[VE] Add vfmad, vfmsb, vfnmad, and vfnmsb intrinsic instructions Add vfmad, vfmsb, vfnmad, and vfnmsb intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92697	2020-12-07 19:28:17 +09:00
Oliver Stannard	c3a121d51a	[Lit] Fix flaky test on heavily loaded bots On some of the slow or heavily-loaded bots, this test was failing intermittently because the infinite_loop.py script might not emit anything to stdout before the 1 second timeout, so the "Command Output" line isn't present in the output. That output isn't really important to this test, we just care that the process is killed, so we can just remove that check line from the test. Differential revision: https://reviews.llvm.org/D92563	2020-12-07 09:05:55 +00:00
Evgeny Leviant	74be868dc4	[TableGen][SchedModels] Simplify the code. NFC Differential revision: https://reviews.llvm.org/D92304	2020-12-07 11:53:33 +03:00
Martin Storsjö	2d977c839d	[CodeGen] Restore accessing __stack_chk_guard via a .refptr stub on mingw after 2518433f861fcb87 Add tests for this particular detail for x86 and arm (similar tests already existed for x86_64 and aarch64). The libssp implementation may be located in a separate DLL, and in those cases, the references need to be in a .refptr stub, to avoid needing to touch up code in the text section at runtime (which is supported but inefficient for x86, and unsupported for arm). Differential Revision: https://reviews.llvm.org/D92738	2020-12-07 09:35:12 +02:00
Esme-Yi	5ecbd7b777	[PowerPC] Add support for intrinsics dcbfps and dcbstps in P10. Summary: This patch added support for the intrinsics llvm.ppc.dcbfps and llvm.ppc.dcbstps. dcbfps and dcbstps are actually extended mnemonics of dcbf. dcbfps RA,RB ---> dcbf RA,RB,4 dcbstps RA,RB ---> dcbf RA,RB,6 Reviewed By: amyk, steven.zhang Differential Revision: https://reviews.llvm.org/D91323	2020-12-07 05:19:06 +00:00
Zi Xuan Wu	76f7fdd9e3	[CSKY 2/n] Add basic tablegen infra for CSKY This introduces basic tablegen infra such as CSKY{InstrFormats,InstrInfo,RegisterInfo,}.td. For now, only add instruction definitions for basic CSKY ISA operations, and the instruction format and register info are almost complete. Our initial target is a working MC layer rather than codegen, so appropriate SelectionDAG patterns will come later. Differential Revision: https://reviews.llvm.org/D89180	2020-12-07 11:56:09 +08:00
Qiu Chaofan	b78ec5b825	[PowerPC] Fix chain for i1-to-fp operation A simple SELECT is used for converting i1 to floating types on ppc32, but in constrained cases, the chain is not handled properly. This patch will fix that. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92365	2020-12-07 10:38:56 +08:00
Jun Ma	253199f455	[Coroutines] Add DW_OP_deref for transformed dbg.value intrinsic. Differential Revision: https://reviews.llvm.org/D92462	2020-12-07 10:24:44 +08:00
Bing1 Yu	118fe14792	[CodeGen] Modify the refineIndexType(...)'s code to fix a bug in D90942. In previous code, when refineIndexType(...) is called and Index is undef, Index.getOperand(0) will raise an assertion failure. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D92548	2020-12-07 08:49:07 +08:00
Fangrui Song	9bf35cadb5	[llvm-readobj] Delete unused declaration	2020-12-06 15:54:17 -08:00
Fangrui Song	db5a260b9c	[MC] Delete unused declarations Notes: * llvm::createAsmStreamer: it has been moved to TargetRegistry.h * (anon ns)::WasmObjectWriter::updateCustomSectionRelocations: remnant of D46335 * COFFAsmParser::ParseSEHRegisterNumber: remnant of D66625 * llvm::CodeViewContext::isValidCVFileNumber: accidentally added by r279847	2020-12-06 15:36:39 -08:00
Craig Topper	374bd34552	[LoopIdiomRecognize] Merge a conditional operator with an earlier if and remove an extra temporary variable. NFC The CountPrev variable was only used to forward a value from the if statement to the conditional operator under the same condition. While there move some variable declarations to their first assignment.	2020-12-06 15:23:18 -08:00
Fangrui Song	7e009b1df7	[llvm-c] Delete unimplemented llvm-c/LinkTimeOptimizer.h The file was added in 2007 but the functions have never been implemented. Having the file can only cause confusion to existing C API (llvm-c/lto.h) users.	2020-12-06 15:18:25 -08:00
Fangrui Song	f1fa2fa707	[X86] Delete 3 unused declarations	2020-12-06 15:13:39 -08:00
Fangrui Song	c75990caf0	[CodeGen] Delete 4 unused declarations	2020-12-06 15:02:18 -08:00
Fangrui Song	479679999b	[CodeGen] Delete 15 unused declarations Notes about a few declarations: * LiveVariables::RegisterDefIsDead: deleted by r47927 * createForwardControlFlowIntegrityPass, createJumpInstrTablesPass: deleted by r230780 * RegScavenger::setLiveInsUsed: deleted by r292543 * ScheduleDAGInstrs::{toggleKillFlag,startBlockForKills}: deleted by r304055 * Localizer::shouldLocalize: remnant of D75207 * DwarfDebug::addSectionLabel: deleted by r373273	2020-12-06 14:55:04 -08:00
Fangrui Song	4a079239ba	[TableGen] Delete 11 unused declarations	2020-12-06 13:21:07 -08:00
Fangrui Song	4cd07b6bd6	[Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper	2020-12-06 13:04:01 -08:00
Florian Hahn	0a2bfdbba4	[ConstraintElimination] Bail out if system gets too big. For some inputs, the constraint system can grow quite large during solving, because it replaces complex constraints with one or more simpler constraints. This adds a cut-off to avoid compile-time explosion on problematic inputs.	2020-12-06 20:19:15 +00:00
LLVM GN Syncbot	2fc06a6cc2	[gn build] Port 6b989a17107	2020-12-06 20:12:22 +00:00
Wenlei He	6ab8756fe0	[CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining This change adds the context-sensitive sample PGO infrastructure described in CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduced an abstraction between input profile and profile loader that queries input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with top-down profile-guided inliner in profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to CGSCC inliner in order for it to take advantage of context-sensitive profile. This change is the consumption part of context-sensitive profile (The generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with Pseudo-probe coming up soon. Currently the new infrastructure kicks in when input profile contains the new context-sensitive profile; otherwise it's no-op and does not affect existing AutoFDO. Interface There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile. - Query base profile (`getBaseSamplesFor`) Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.
- Query context profile (`getContextSamplesFor`) Context profile is a function's CFG profile for a given calling context. We can query context profile by context string. - Track inlined context profile (`markContextSamplesInlined`) When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function. - Track not-inlined context profile (`promoteMergeContextSamplesTree`) When a function is not inlined for given calling context, we need to promote the context profile tree so the not inlined context becomes top-level context. This preserves the sub-context under that function so later inline decision for that not inlined function will still have context profile for its call tree. Note that profile will be merged if needed when promoting a context profile tree if any of the node already exists at its promoted destination. Implementation Implementation-wise, `SampleContext` is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each `SampleContext` also has a `ContextState` indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's base profile or context profile, and for context profile what is the context and state. On top of the above context representation, a custom trie tree is implemented to track and manage context profiles. Specifically, `SampleContextTracker` is implemented that encapsulates a trie tree with `ContextTrieNode` as node. Each node of the trie tree represents a frame in calling context, thus the path from root to a node represents a valid calling context.
We also track `FunctionSamples` for each node, so this trie tree can serve efficient query for context profile. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of entire tree, and merge nodes for subtree if this move encounters existing nodes. Integration `SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detected input profile contains context-sensitive profile, `SampleContextTracker` will be used to track profiles, and all profile query will go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`. Differential Revision: https://reviews.llvm.org/D90125	2020-12-06 11:49:18 -08:00
Kazu Hirata	ee8c0f1a72	[InstCombine] Remove replacePointer (NFC) The declaration was introduced on Feb 10, 2017 in commit ba01ed00fef32c48d8e2787a6feaf33568a80bfe without a corresponding definition.	2020-12-06 10:24:08 -08:00
Kazu Hirata	360257b8f5	[Mips] Use llvm::is_contained (NFC)	2020-12-06 10:12:55 -08:00
Simon Pilgrim	0980444743	[X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) Noticed while triaging PR37506	2020-12-06 17:56:41 +00:00
Simon Pilgrim	2dcb452014	[X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) fold Noticed while triaging PR37506	2020-12-06 17:48:27 +00:00
Layton Kifer	bd58f59001	[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1) Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets. Differential Revision: https://reviews.llvm.org/D91589	2020-12-06 11:52:10 -05:00
Paul C. Anagnostopoulos	ec390c899f	[TableGen] [CodeGenTarget] Cache the target's instruction namespace. Differential Revision: https://reviews.llvm.org/D92722	2020-12-06 11:08:30 -05:00
Sanjay Patel	a64549da09	[InstCombine] avoid crash on phi with unreachable incoming block (PR48369)	2020-12-06 09:31:47 -05:00
Simon Pilgrim	f44bc6bb58	[CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.	2020-12-06 14:08:26 +00:00
Nikita Popov	0f3bf80439	[BasicAA] Migrate "same base pointer" logic to decomposed GEPs BasicAA has some special bit of logic for "same base pointer" GEPs that performs a structural comparison: It only looks at two GEPs with the same base (as opposed to two GEP chains with a MustAlias base) and compares their indexes in a limited way. I generalized part of this code in D91027, and this patch merges the remainder into the normal decomposed GEP logic. What this code ultimately wants to do is to determine that gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2, and the access size fits within the stride. We can express this in terms of a decomposed GEP expression with two indexes scale%idx1 + -scale%idx2 where %idx1 != %idx2, and some appropriate checks for sizes and offsets. This makes the reasoning slightly more powerful, and more importantly brings all the GEP logic under a common umbrella. Differential Revision: https://reviews.llvm.org/D92723	2020-12-06 10:27:35 +01:00
Fangrui Song	9b8a5e5ef8	[TargetMachine] Delete asan workaround 687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug. We can revert the workaround now. Actually, the commit introduced a bug that ppc64 should be excluded.	2020-12-06 00:33:11 -08:00
Fangrui Song	ac9c226631	[X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model This fixes the bug referenced by 5582a7987662a92eda5d883b88fc4586e755acf5 which was exposed by 961f31d8ad14c66829991522d73e14b5a96ff6d4. With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`	2020-12-05 23:13:28 -08:00
Fangrui Song	f09209d278	[TargetMachine] Don't imply dso_local for memprof in static relocation model The workaround is no longer needed with my previous commit to MemProfiler.cpp	2020-12-05 21:39:03 -08:00
Fangrui Song	68d910197d	[MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model The x86-64 backend currently has a bug which uses a wrong register for the GOTPCREL reference. The program will crash without the dso_local specifier.	2020-12-05 21:36:31 -08:00
Vitaly Buka	60c4aec932	[TargetMachine] Set dso_local for memprof Similar to 5582a7987662a92eda5d883b88fc4586e755acf5	2020-12-05 21:11:04 -08:00
Lang Hames	369d397bc4	[ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator.	2020-12-06 15:42:45 +11:00
Craig Topper	0d8bae1bf2	[RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here. On the surface this would be slightly less optimal for the isel table, but due to a tablegen issue with HW mode this ends up generating a smaller isel table.	2020-12-05 20:18:22 -08:00
Fangrui Song	7bce8d652b	[TargetMachine] Set dso_local if asan is detected AddressSanitizer instrumentation does not set dso_local on non-thread-local global variables in -fno-pic and it seems to rely on implied dso_local to work. Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as appropriate. Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.	2020-12-05 17:51:10 -08:00
Kazu Hirata	d0d687e606	[ConstantHoisting] Remove unused declaration optimizeConstants (NFC) The function was renamed to runImpl on Jul 2, 2016 in commit 071d8306b0d9d1345c1da84ae3e1c1b231ffd29d, but the old declaration has remained since.	2020-12-05 16:22:12 -08:00
Philip Reames	2255584563	Add recursive decomposition reasoning to isKnownNonEqual The basic idea is that by looking through operand instructions which don't change the equality result that we can push the existing known bits comparison down past instructions which would obscure them. We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.) The BasicAA change will be posted separately for review. Differential Revision: https://reviews.llvm.org/D92698	2020-12-05 15:58:19 -08:00
Fangrui Song	43af914fd3	[TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden) This does not deserve special handling. The code should be added to Clang instead if deemed useful. With this simplification, we can additionally delete the PIC extern_weak special case.	2020-12-05 15:52:33 -08:00
Kazu Hirata	2a17ab7e6e	[CodeGen] llvm::erase_if (NFC)	2020-12-05 15:44:40 -08:00
Aditya Kumar	1a3db3fca7	Remove memory allocation with string Differential Revision: https://reviews.llvm.org/D92506	2020-12-05 15:14:44 -08:00
Fangrui Song	9ea76a985a	[TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget With my previous commit, X86Subtarget::classifyGlobalReference has learned to use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply dso_local for function declarations, we can drop the ppc64 special case as well. This is NFC in terms of Clang emitted assembly.	2020-12-05 15:13:42 -08:00
Fangrui Song	556dfb9bf9	[TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations, we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal. (Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.) By not implying dso_local, we will respect dso_local/dso_preemptable specifiers set by the frontend. This allows the proposed -fno-direct-access-external-data option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value) when taking the address of a function symbol. This patch should be NFC in terms of the Clang emitted assembly because the case we don't set dso_local is a case Clang sets dso_local. However, some tests don't set dso_local on some function declarations and expose some differences. Most tests have been fixed to be more robust in the previous commit.	2020-12-05 14:54:37 -08:00
Fangrui Song	66d138040b	[test] Add explicit dso_local to function declarations in static relocation model tests They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies dso_local. For such function declarations, clang -fno-pic emits the dso_local specifier. Adding explicit dso_local makes these tests align with the clang behavior and helps implementing an option to use GOT indirection when taking the address of a function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with non-zero st_value)).	2020-12-05 14:54:37 -08:00
Philip Reames	e27a387f8e	[BasicAA] Fix a bug with relational reasoning across iterations Due to the recursion through phis basicaa does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrably incorrect. Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue. Differential Revision: https://reviews.llvm.org/D92694	2020-12-05 14:10:21 -08:00
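
Two of the folds listed above, (sext (not i1 x)) -> (add (zext i1 x), -1) in bd58f59001 and MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) in 0980444743, are bit-level identities that can be checked by brute force. The following is a minimal sketch in Python, not LLVM code: every function name is hypothetical, integers are modelled as unsigned Python ints of a fixed width, and the NOT is assumed to apply only to the low bits that correspond to vector lanes.

```python
from itertools import product

def sext_i1(bit, width):
    """Sign-extend a 1-bit value to `width` bits: 1 becomes all ones, 0 stays 0."""
    return (1 << width) - 1 if bit else 0

def fold_lhs(x, width):
    """(sext (not i1 x)): flip the single bit, then sign-extend."""
    return sext_i1(x ^ 1, width)

def fold_rhs(x, width):
    """(add (zext i1 x), -1): zero-extend, then add -1 modulo 2**width."""
    minus_one = (1 << width) - 1
    return (x + minus_one) % (1 << width)

def movmsk(lanes, lane_bits):
    """Model of MOVMSK: gather the sign bit of lane i into bit i of the result."""
    mask = 0
    for i, lane in enumerate(lanes):
        mask |= ((lane >> (lane_bits - 1)) & 1) << i
    return mask

def icmp_sgt_minus1(lanes, lane_bits):
    """Per-lane signed compare X > -1: all ones when the sign bit is clear."""
    all_ones = (1 << lane_bits) - 1
    sign = 1 << (lane_bits - 1)
    return [all_ones if (lane & sign) == 0 else 0 for lane in lanes]

def check_sext_fold(width=8):
    """Both sides must agree for the only two i1 values."""
    return all(fold_lhs(x, width) == fold_rhs(x, width) for x in (0, 1))

def check_movmsk_fold(num_lanes=4, lane_bits=8):
    """Compare both sides over lane values around the sign boundary."""
    samples = (0x00, 0x01, 0x7F, 0x80, 0xFF)
    lane_mask = (1 << num_lanes) - 1  # assumption: NOT covers only the lane bits
    return all(
        movmsk(icmp_sgt_minus1(lanes, lane_bits), lane_bits)
        == (~movmsk(lanes, lane_bits) & lane_mask)
        for lanes in product(samples, repeat=num_lanes)
    )

if __name__ == "__main__":
    print(check_sext_fold(8), check_sext_fold(32), check_movmsk_fold())
```

Both identities rest on the same observation: a signed value is greater than -1 exactly when its sign bit is clear, so comparing first and then extracting sign bits yields the complement of extracting the sign bits directly.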

207833 Commits