mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

207833 Commits

Author SHA1 Message Date
Kazushi (Jam) Marukawa
b9417501fa [VE] Add vrcp, vrsqrt, vcvt, vmrg, and vshf intrinsic instructions
Add vrcp, vrsqrt, vcvt, vmrg, and vshf intrinsic instructions and
regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92750
2020-12-07 20:30:12 +09:00
Cullen Rhodes
143e05ecbb [IR] Bail out for scalable vectors in ShuffleVectorInst::isConcat
Shuffle mask for concat can't be expressed for scalable vectors, so we
should bail out. A test that previously crashed has been added; it also
covers isIdentityWithPadding and isIdentityWithExtract, where we already
bail out.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92475
2020-12-07 10:48:35 +00:00
Cullen Rhodes
250db46c35 [IR] Support scalable vectors in ShuffleVectorInst::increasesLength
Since the length of the llvm::SmallVector shufflemask is related to the
minimum number of elements in a scalable vector, it is fine to just get
the Min field of the ElementCount. This is already done for the similar
function changesLength, tests have been added for both.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D92472
2020-12-07 10:42:48 +00:00
Kazushi (Jam) Marukawa
92f08c12fd [VE] Add vfmad, vfmsb, vfnmad, and vfnmsb intrinsic instructions
Add vfmad, vfmsb, vfnmad, and vfnmsb intrinsic instructions and
regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92697
2020-12-07 19:28:17 +09:00
Oliver Stannard
c3a121d51a [Lit] Fix flaky test on heavily loaded bots
On some of the slow or heavily-loaded bots, this test was failing
intermittently because the infinite_loop.py script might not emit
anything to stdout before the 1 second timeout, so the "Command Output"
line isn't present in the output. That output isn't really important to
this test; we only care that the process is killed, so we can remove
that check line from the test.

Differential revision: https://reviews.llvm.org/D92563
2020-12-07 09:05:55 +00:00
Evgeny Leviant
74be868dc4 [TableGen][SchedModels] Simplify the code. NFC
Differential revision: https://reviews.llvm.org/D92304
2020-12-07 11:53:33 +03:00
Martin Storsjö
2d977c839d [CodeGen] Restore accessing __stack_chk_guard via a .refptr stub on mingw after 2518433f861fcb87
Add tests for this particular detail for x86 and arm (similar tests
already existed for x86_64 and aarch64).

The libssp implementation may be located in a separate DLL, and in
those cases, the references need to be in a .refptr stub, to avoid
needing to touch up code in the text section at runtime (which is
supported but inefficient for x86, and unsupported for arm).

Differential Revision: https://reviews.llvm.org/D92738
2020-12-07 09:35:12 +02:00
Esme-Yi
5ecbd7b777 [PowerPC] Add support for intrinsics dcbfps and dcbstps in P10.
Summary: This patch added support for the intrinsics llvm.ppc.dcbfps and llvm.ppc.dcbstps.
dcbfps and dcbstps are actually extended mnemonics of dcbf.
dcbfps RA,RB ---> dcbf RA,RB,4
dcbstps RA,RB ---> dcbf RA,RB,6

Reviewed By: amyk, steven.zhang

Differential Revision: https://reviews.llvm.org/D91323
2020-12-07 05:19:06 +00:00
Zi Xuan Wu
76f7fdd9e3 [CSKY 2/n] Add basic tablegen infra for CSKY
This introduces basic tablegen infra such as CSKY{InstrFormats,InstrInfo,RegisterInfo,}.td.
For now, this only adds instruction definitions for basic CSKY ISA operations; the instruction format and register info are almost complete.

Our initial target is a working MC layer rather than codegen, so appropriate SelectionDAG patterns will come later.

Differential Revision: https://reviews.llvm.org/D89180
2020-12-07 11:56:09 +08:00
Qiu Chaofan
b78ec5b825 [PowerPC] Fix chain for i1-to-fp operation
A simple SELECT is used for converting i1 to floating types on ppc32,
but in constrained cases, the chain is not handled properly. This patch
will fix that.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92365
2020-12-07 10:38:56 +08:00
Jun Ma
253199f455 [Coroutines] Add DW_OP_deref for transformed dbg.value intrinsic.
Differential Revision: https://reviews.llvm.org/D92462
2020-12-07 10:24:44 +08:00
Bing1 Yu
118fe14792 [CodeGen] Modify the refineIndexType(...)'s code to fix a bug in D90942.
In the previous code, when refineIndexType(...) is called and Index is undef, Index.getOperand(0) triggers an assertion failure.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D92548
2020-12-07 08:49:07 +08:00
Fangrui Song
9bf35cadb5 [llvm-readobj] Delete unused declaration
2020-12-06 15:54:17 -08:00
Fangrui Song
db5a260b9c [MC] Delete unused declarations
Notes:

* llvm::createAsmStreamer: it has been moved to TargetRegistry.h
* (anon ns)::WasmObjectWriter::updateCustomSectionRelocations: remnant of D46335
* COFFAsmParser::ParseSEHRegisterNumber: remnant of D66625
* llvm::CodeViewContext::isValidCVFileNumber: accidentally added by r279847
2020-12-06 15:36:39 -08:00
Craig Topper
374bd34552 [LoopIdiomRecognize] Merge a conditional operator with an earlier if and remove an extra temporary variable. NFC
The CountPrev variable was only used to forward a value from
the if statement to the conditional operator under the same
condition.

While there, move some variable declarations to their first
assignment.
2020-12-06 15:23:18 -08:00
Fangrui Song
7e009b1df7 [llvm-c] Delete unimplemented llvm-c/LinkTimeOptimizer.h
The file was added in 2007 but the functions have never been implemented.
Having the file can only cause confusion to existing C API (llvm-c/lto.h) users.
2020-12-06 15:18:25 -08:00
Fangrui Song
f1fa2fa707 [X86] Delete 3 unused declarations
2020-12-06 15:13:39 -08:00
Fangrui Song
c75990caf0 [CodeGen] Delete 4 unused declarations
2020-12-06 15:02:18 -08:00
Fangrui Song
479679999b [CodeGen] Delete 15 unused declarations
Notes about a few declarations:

* LiveVariables::RegisterDefIsDead: deleted by r47927
* createForwardControlFlowIntegrityPass, createJumpInstrTablesPass: deleted by r230780
* RegScavenger::setLiveInsUsed: deleted by r292543
* ScheduleDAGInstrs::{toggleKillFlag,startBlockForKills}: deleted by r304055
* Localizer::shouldLocalize: remnant of D75207
* DwarfDebug::addSectionLabel: deleted by r373273
2020-12-06 14:55:04 -08:00
Fangrui Song
4a079239ba [TableGen] Delete 11 unused declarations
2020-12-06 13:21:07 -08:00
Fangrui Song
4cd07b6bd6 [Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper
2020-12-06 13:04:01 -08:00
Florian Hahn
0a2bfdbba4 [ConstraintElimination] Bail out if system gets too big.
For some inputs, the constraint system can grow quite large during
solving, because it replaces complex constraints with one or more
simpler constraints. This adds a cut-off to avoid compile-time explosion
on problematic inputs.
2020-12-06 20:19:15 +00:00
LLVM GN Syncbot
2fc06a6cc2 [gn build] Port 6b989a17107
2020-12-06 20:12:22 +00:00
Wenlei He
6ab8756fe0 [CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining
This change adds the context-sensitive sample PGO infrastructure described in the CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduces an abstraction between the input profile and the profile loader that queries the input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with the top-down profile-guided inliner in the profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to the CGSCC inliner so that it can take advantage of context-sensitive profiles. This change is the consumption part of context-sensitive profile (the generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Patches for integration with Pseudo-probe are coming up soon.

Currently the new infrastructure kicks in when the input profile contains the new context-sensitive profile; otherwise it's a no-op and does not affect existing AutoFDO.

**Interface**

There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile.

- Query base profile (`getBaseSamplesFor`)
Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.

- Query context profile (`getContextSamplesFor`)
Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.

- Track inlined context profile (`markContextSamplesInlined`)
When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function.

- Track not-inlined context profile (`promoteMergeContextSamplesTree`)
When a function is not inlined for a given calling context, we need to promote the context profile tree so the not-inlined context becomes a top-level context. This preserves the sub-context under that function, so a later inline decision for that not-inlined function will still have a context profile for its call tree. Note that profiles will be merged if needed when promoting a context profile tree, if any of the nodes already exists at its promoted destination.

**Implementation**

Implementation-wise, `SampleContext` is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each `SampleContext` also has a `ContextState` indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's base profile or context profile, and for context profile what is the context and state.

On top of the above context representation, a custom trie tree is implemented to track and manage context profiles. Specifically, `SampleContextTracker` is implemented that encapsulates a trie tree with `ContextTrieNode` as node. Each node of the trie tree represents a frame in a calling context, thus the path from root to a node represents a valid calling context. We also track `FunctionSamples` for each node, so this trie tree can serve efficient queries for context profiles. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of the entire tree, and merging nodes of the subtree if this move encounters existing nodes.
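The promotion-and-merge step described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual `SampleContextTracker` code: the names `ContextNode`, `ContextTracker`, `add`, `lookup` and `promote` are hypothetical, nodes are keyed by function name alone, and a single integer stands in for `FunctionSamples`.

```python
class ContextNode:
    """One frame of a calling context (hypothetical stand-in for a trie node)."""

    def __init__(self, name):
        self.name = name
        self.samples = 0      # a single counter stands in for FunctionSamples
        self.children = {}    # callee name -> ContextNode


def merge_into(dst, src):
    """Add src's samples to dst and recursively merge src's children into dst."""
    dst.samples += src.samples
    for name, sub in src.children.items():
        if name in dst.children:
            merge_into(dst.children[name], sub)
        else:
            dst.children[name] = sub


class ContextTracker:
    """Toy trie of calling contexts; the path from the root is the context."""

    def __init__(self):
        self.root = ContextNode("<root>")

    def add(self, path, samples):
        node = self.root
        for name in path:
            if name not in node.children:
                node.children[name] = ContextNode(name)
            node = node.children[name]
        node.samples += samples

    def lookup(self, path):
        node = self.root
        for name in path:
            node = node.children.get(name)
            if node is None:
                return None
        return node

    def promote(self, path):
        """Move the subtree at `path` directly under the root, merging on collision."""
        if len(path) < 2:
            return  # empty path, or already a top-level context
        parent = self.lookup(path[:-1])
        if parent is None or path[-1] not in parent.children:
            return
        sub = parent.children.pop(path[-1])
        existing = self.root.children.get(sub.name)
        if existing is None:
            self.root.children[sub.name] = sub
        else:
            merge_into(existing, sub)


# Example: foo was not inlined into main, so its context is promoted.
tracker = ContextTracker()
tracker.add(["main", "foo"], 10)
tracker.add(["main", "foo", "bar"], 5)
tracker.add(["foo"], 3)
tracker.add(["foo", "bar"], 1)
tracker.promote(["main", "foo"])
```

After the promotion, the top-level `foo` node carries the merged counts of both contexts, while the `bar` sub-context stays attached below it for later inline decisions.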

**Integration**

`SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detect that the input profile contains a context-sensitive profile, `SampleContextTracker` is used to track profiles, and all profile queries go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`.

Differential Revision: https://reviews.llvm.org/D90125
2020-12-06 11:49:18 -08:00
Kazu Hirata
ee8c0f1a72 [InstCombine] Remove replacePointer (NFC)
The declaration was introduced on Feb 10, 2017 in commit
ba01ed00fef32c48d8e2787a6feaf33568a80bfe without a corresponding
definition.
2020-12-06 10:24:08 -08:00
Kazu Hirata
360257b8f5 [Mips] Use llvm::is_contained (NFC)
2020-12-06 10:12:55 -08:00
Simon Pilgrim
0980444743 [X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))
Noticed while triaging PR37506
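
The fold can be sanity-checked with a small model. This is a sketch under assumptions, not the X86 lowering code: lanes are modelled as unsigned 8-bit integers, the vector has 4 lanes, and the NOT is restricted to the low `LANES` bits of the mask. `ICMP_SGT(X,-1)` yields an all-ones lane exactly when the sign bit is clear, so taking the sign bits of the comparison result is the same as inverting the sign bits of `X`.

```python
from itertools import product

LANES = 4   # hypothetical vector width for the model
BITS = 8    # hypothetical lane size in bits
ALL_ONES = (1 << BITS) - 1


def to_signed(value):
    """Interpret an unsigned BITS-wide lane as a two's-complement integer."""
    return value - (1 << BITS) if value >> (BITS - 1) else value


def movmsk(lanes):
    """Pack the sign bit of lane i into bit i of the result."""
    mask = 0
    for i, value in enumerate(lanes):
        mask |= (value >> (BITS - 1)) << i
    return mask


def icmp_sgt_minus_one(lanes):
    """Lane-wise signed compare against -1: all-ones if greater, else zero."""
    return [ALL_ONES if to_signed(v) > -1 else 0 for v in lanes]


def fold_holds(lanes):
    """Compare the original pattern with the folded form on one input."""
    lane_mask = (1 << LANES) - 1
    original = movmsk(icmp_sgt_minus_one(lanes))
    folded = ~movmsk(lanes) & lane_mask
    return original == folded


# Check every combination of a few boundary lane values.
SAMPLES = (0x00, 0x01, 0x7F, 0x80, 0xFF)
CHECKED = all(fold_holds(lanes) for lanes in product(SAMPLES, repeat=LANES))
```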
2020-12-06 17:56:41 +00:00
Simon Pilgrim
2dcb452014 [X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)) fold
Noticed while triaging PR37506
2020-12-06 17:48:27 +00:00
Layton Kifer
bd58f59001 [DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets.
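
A quick way to see why the fold is valid is to evaluate both sides for the two possible i1 values. The sketch below assumes fixed-width wraparound arithmetic modelled with Python integers; the helper names are illustrative and not part of DAGCombiner.

```python
def sext_i1(bit, width):
    """Sign-extend a 1-bit value to `width` bits (result as an unsigned int)."""
    return (1 << width) - 1 if bit else 0


def zext_i1(bit):
    """Zero-extend a 1-bit value; the result is just 0 or 1."""
    return bit & 1


def fold_holds(width):
    """Check (sext (not i1 x)) == (add (zext i1 x), -1) for x in {0, 1}."""
    mask = (1 << width) - 1          # also the bit pattern of -1 at this width
    for x in (0, 1):
        lhs = sext_i1(x ^ 1, width)          # sext (not x)
        rhs = (zext_i1(x) + mask) & mask     # add (zext x), -1, with wraparound
        if lhs != rhs:
            return False
    return True


ALL_WIDTHS_OK = all(fold_holds(w) for w in (8, 16, 32, 64))
```

For x = 0 both sides are all-ones, and for x = 1 both sides are zero, which is why the transform holds independently of the target.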

Differential Revision: https://reviews.llvm.org/D91589
2020-12-06 11:52:10 -05:00
Paul C. Anagnostopoulos
ec390c899f [TableGen] [CodeGenTarget] Cache the target's instruction namespace.
Differential Revision: https://reviews.llvm.org/D92722
2020-12-06 11:08:30 -05:00
Sanjay Patel
a64549da09 [InstCombine] avoid crash on phi with unreachable incoming block (PR48369)
2020-12-06 09:31:47 -05:00
Simon Pilgrim
f44bc6bb58 [CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.
2020-12-06 14:08:26 +00:00
Nikita Popov
0f3bf80439 [BasicAA] Migrate "same base pointer" logic to decomposed GEPs
BasicAA has some special bit of logic for "same base pointer" GEPs
that performs a structural comparison: It only looks at two GEPs
with the same base (as opposed to two GEP chains with a MustAlias
base) and compares their indexes in a limited way. I generalized
part of this code in D91027, and this patch merges the remainder
into the normal decomposed GEP logic.

What this code ultimately wants to do is to determine that
gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2,
and the access size fits within the stride.

We can express this in terms of a decomposed GEP expression with
two indexes scale*%idx1 + -scale*%idx2 where %idx1 != %idx2, and
some appropriate checks for sizes and offsets.
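
The stride argument can be illustrated with a brute-force check. This is a simplified model, not the BasicAA implementation: accesses are byte ranges at `scale * idx` from a common base, both accesses have the same size, and overflow or wraparound is ignored. The function names are hypothetical.

```python
def accesses_overlap(off1, size1, off2, size2):
    """True if byte ranges [off1, off1+size1) and [off2, off2+size2) intersect."""
    return off1 < off2 + size2 and off2 < off1 + size1


def no_alias_for_distinct_indexes(scale, access_size, index_range):
    """Check that accesses at scale*idx1 and scale*idx2 never overlap
    for any pair of distinct indexes drawn from index_range."""
    for idx1 in index_range:
        for idx2 in index_range:
            if idx1 == idx2:
                continue
            if accesses_overlap(scale * idx1, access_size,
                                scale * idx2, access_size):
                return False
    return True


# Access size fits within the stride: distinct indexes cannot alias.
FITS = no_alias_for_distinct_indexes(scale=4, access_size=4,
                                     index_range=range(-8, 9))
# Access size exceeds the stride: neighbouring elements overlap.
TOO_WIDE = no_alias_for_distinct_indexes(scale=4, access_size=8,
                                         index_range=range(-8, 9))
```

In the decomposed form, the difference of the two offsets is `scale*%idx1 + -scale*%idx2`, whose magnitude is at least `scale` when the indexes differ, so an access no larger than `scale` cannot reach the other element.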

This makes the reasoning slightly more powerful, and more
importantly brings all the GEP logic under a common umbrella.

Differential Revision: https://reviews.llvm.org/D92723
2020-12-06 10:27:35 +01:00
Fangrui Song
9b8a5e5ef8 [TargetMachine] Delete asan workaround
687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the commit introduced a
bug that ppc64 should be excluded.
2020-12-06 00:33:11 -08:00
Fangrui Song
ac9c226631 [X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model
This fixes the bug referenced by 5582a7987662a92eda5d883b88fc4586e755acf5
which was exposed by 961f31d8ad14c66829991522d73e14b5a96ff6d4.

With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`
2020-12-05 23:13:28 -08:00
Fangrui Song
f09209d278 [TargetMachine] Don't imply dso_local for memprof in static relocation model
The workaround is no longer needed with my previous commit to MemProfiler.cpp
2020-12-05 21:39:03 -08:00
Fangrui Song
68d910197d [MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model
The x86-64 backend currently has a bug which uses the wrong register for the GOTPCREL reference.
The program will crash without the dso_local specifier.
2020-12-05 21:36:31 -08:00
Vitaly Buka
60c4aec932 [TargetMachine] Set dso_local for memprof
Similar to 5582a7987662a92eda5d883b88fc4586e755acf5
2020-12-05 21:11:04 -08:00
Lang Hames
369d397bc4 [ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator.
2020-12-06 15:42:45 +11:00
Craig Topper
0d8bae1bf2 [RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here.
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
2020-12-05 20:18:22 -08:00
Fangrui Song
7bce8d652b [TargetMachine] Set dso_local if asan is detected
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.

Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
2020-12-05 17:51:10 -08:00
Kazu Hirata
d0d687e606 [ConstantHoisting] Remove unused declaration optimizeConstants (NFC)
The function was renamed to runImpl on Jul 2, 2016 in commit
071d8306b0d9d1345c1da84ae3e1c1b231ffd29d, but the old declaration has
remained since.
2020-12-05 16:22:12 -08:00
Philip Reames
2255584563 Add recursive decomposition reasoning to isKnownNonEqual
The basic idea is that by looking through operand instructions which don't change the equality result, we can push the existing known bits comparison down past instructions which would obscure them.

We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.) The BasicAA change will be posted separately for review.

Differential Revision: https://reviews.llvm.org/D92698
2020-12-05 15:58:19 -08:00
Fangrui Song
43af914fd3 [TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden)
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
2020-12-05 15:52:33 -08:00
Kazu Hirata
2a17ab7e6e [CodeGen] llvm::erase_if (NFC)
2020-12-05 15:44:40 -08:00
Aditya Kumar
1a3db3fca7 Remove memory allocation with string
Differential Revision: https://reviews.llvm.org/D92506
2020-12-05 15:14:44 -08:00
Fangrui Song
9ea76a985a [TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.

This is NFC in terms of Clang emitted assembly.
2020-12-05 15:13:42 -08:00
Fangrui Song
556dfb9bf9 [TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
so we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)

By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.

This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
2020-12-05 14:54:37 -08:00
Fangrui Song
66d138040b [test] Add explicit dso_local to function declarations in static relocation model tests
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.

For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implementing an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).
2020-12-05 14:54:37 -08:00
Philip Reames
e27a387f8e [BasicAA] Fix a bug with relational reasoning across iterations
Due to the recursion through phis that BasicAA does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrably incorrect.

Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.

Differential Revision: https://reviews.llvm.org/D92694
2020-12-05 14:10:21 -08:00