1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

207819 Commits

Author SHA1 Message Date
Craig Topper
374bd34552 [LoopIdiomRecognize] Merge a conditional operator with an earlier if and remove an extra temporary variable. NFC
The CountPrev variable was only used to forward a value from
the if statement to the conditional operator under the same
condition.

While there move some variable declarations to their first
assignment.
2020-12-06 15:23:18 -08:00
Fangrui Song
7e009b1df7 [llvm-c] Delete unimplemented llvm-c/LinkTimeOptimizer.h
The file was added in 2007 but the functions have never been implemented.
Having the file can only cause confusion to existing C API (llvm-c/lto.h) users.
2020-12-06 15:18:25 -08:00
Fangrui Song
f1fa2fa707 [X86] Delete 3 unused declarations 2020-12-06 15:13:39 -08:00
Fangrui Song
c75990caf0 [CodeGen] Delete 4 unused declarations 2020-12-06 15:02:18 -08:00
Fangrui Song
479679999b [CodeGen] Delete 15 unused declarations
Notes about a few declarations:

* LiveVariables::RegisterDefIsDead: deleted by r47927
* createForwardControlFlowIntegrityPass, createJumpInstrTablesPass: deleted by r230780
* RegScavenger::setLiveInsUsed: deleted by r292543
* ScheduleDAGInstrs::{toggleKillFlag,startBlockForKills}: deleted by r304055
* Localizer::shouldLocalize: remnant of D75207
* DwarfDebug::addSectionLabel: deleted by r373273
2020-12-06 14:55:04 -08:00
Fangrui Song
4a079239ba [TableGen] Delete 11 unused declarations 2020-12-06 13:21:07 -08:00
Fangrui Song
4cd07b6bd6 [Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper 2020-12-06 13:04:01 -08:00
Florian Hahn
0a2bfdbba4 [ConstraintElimination] Bail out if system gets too big.
For some inputs, the constraint system can grow quite large during
solving, because it replaces complex constraints with one or more
simpler constraints. This adds a cut-off to avoid compile-time explosion
on problematic inputs.
2020-12-06 20:19:15 +00:00
LLVM GN Syncbot
2fc06a6cc2 [gn build] Port 6b989a17107 2020-12-06 20:12:22 +00:00
Wenlei He
6ab8756fe0 [CSSPGO] Infrastructure for context-sensitive Sample PGO and Inlining
This change adds the context-senstive sample PGO infracture described in CSSPGO RFC (https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s). It introduced an abstraction between input profile and profile loader that queries input profile for functions. Specifically, there's now the notion of base profile and context profile, and they are managed by the new SampleContextTracker for adjusting and merging profiles based on inline decisions. It works with top-down profiled guided inliner in profile loader (https://reviews.llvm.org/D70655) for better inlining with specialization and better post-inline profile fidelity. In the future, we can also expose this infrastructure to CGSCC inliner in order for it to take advantage of context-sensitive profile. This change is the consumption part of context-sensitive profile (The generation part is in this stack: https://reviews.llvm.org/D89707). We've seen good results internally in conjunction with Pseudo-probe (https://reviews.llvm.org/D86193). Pacthes for integration with Pseudo-probe coming up soon.

Currently the new infrastructure kick in when input profile contains the new context-sensitive profile; otherwise it's no-op and does not affect existing AutoFDO.

**Interface**

There're two sets of interfaces for query and tracking respectively exposed from SampleContextTracker. For query, now instead of simply getting a profile from input for a function, we can explicitly query base profile or context profile for given call path of a function. For tracking, there're separate APIs for marking context profile as inlined, or promoting and merging not inlined context profile.

- Query base profile (`getBaseSamplesFor`)
Base profile is the merged synthetic profile for function's CFG profile from any outstanding (not inlined) context. We can query base profile by function.

- Query context profile (`getContextSamplesFor`)
Context profile is a function's CFG profile for a given calling context. We can query context profile by context string.

- Track inlined context profile (`markContextSamplesInlined`)
When a function is inlined for given calling context, we need to mark the context profile for that context as inlined. This is to make sure we don't include inlined context profile when synthesizing base profile for that inlined function.

- Track not-inlined context profile (`promoteMergeContextSamplesTree`)
When a function is not inlined for given calling context, we need to promote the context profile tree so the not inlined context becomes top-level context. This preserve the sub-context under that function so later inline decision for that not inlined function will still have context profile for its call tree. Note that profile will be merged if needed when promoting a context profile tree if any of the node already exists at its promoted destination.

**Implementation**

Implementation-wise, `SampleContext` is created as abstraction for context. Currently it's a string for call path, and we can later optimize it to something more efficient, e.g. context id. Each `SampleContext` also has a `ContextState` indicating whether it's raw context profile from input, whether it's inlined or merged, whether it's synthetic profile created by compiler. Each `FunctionSamples` now has a `SampleContext` that tells whether it's base profile or context profile, and for context profile what is the context and state.

On top of the above context representation, a custom trie tree is implemented to track and manager context profiles. Specifically, `SampleContextTracker` is implemented that encapsulates a trie tree with `ContextTireNode` as node. Each node of the trie tree represents a frame in calling context, thus the path from root to a node represents a valid calling context. We also track `FunctionSamples` for each node, so this trie tree can serve efficient query for context profile. Accordingly, context profile tree promotion now becomes moving a subtree to be under the root of entire tree, and merge nodes for subtree if this move encounters existing nodes.

**Integration**

`SampleContextTracker` is now also integrated with AutoFDO, `SampleProfileReader` and `SampleProfileLoader`. When we detected input profile contains context-sensitive profile, `SampleContextTracker` will be used to track profiles, and all profile query will go to `SampleContextTracker` instead of `SampleProfileReader` automatically. Tracking APIs are called automatically for each inline decision from `SampleProfileLoader`.

Differential Revision: https://reviews.llvm.org/D90125
2020-12-06 11:49:18 -08:00
Kazu Hirata
ee8c0f1a72 [InstCombine] Remove replacePointer (NFC)
The declaration was introduced on Feb 10, 2017 in commit
ba01ed00fef32c48d8e2787a6feaf33568a80bfe without a corresponding
definition.
2020-12-06 10:24:08 -08:00
Kazu Hirata
360257b8f5 [Mips] Use llvm::is_contained (NFC) 2020-12-06 10:12:55 -08:00
Simon Pilgrim
0980444743 [X86] Fold MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X)))
Noticed while triaging PR37506
2020-12-06 17:56:41 +00:00
Simon Pilgrim
2dcb452014 [X86] Add tests for missing MOVMSK(ICMP_SGT(X,-1)) -> NOT(MOVMSK(X))) fold
Noticed while triaging PR37506
2020-12-06 17:48:27 +00:00
Layton Kifer
bd58f59001 [DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets.

Differential Revision: https://reviews.llvm.org/D91589
2020-12-06 11:52:10 -05:00
Paul C. Anagnostopoulos
ec390c899f [TableGen] [CodeGenTarget] Cache the target's instruction namespace.
Differential Revision: https://reviews.llvm.org/D92722
2020-12-06 11:08:30 -05:00
Sanjay Patel
a64549da09 [InstCombine] avoid crash on phi with unreachable incoming block (PR48369) 2020-12-06 09:31:47 -05:00
Simon Pilgrim
f44bc6bb58 [CostModel][X86] getGatherScatterOpCost - use default implementation for alt costkinds
Noticed while looking at D92701 - we only really handle TCK_RecipThroughput gather/scatter costs - for now drop back to the default implementation for non-legal gathers/scatters.
2020-12-06 14:08:26 +00:00
Nikita Popov
0f3bf80439 [BasicAA] Migrate "same base pointer" logic to decomposed GEPs
BasicAA has some special bit of logic for "same base pointer" GEPs
that performs a structural comparison: It only looks at two GEPs
with the same base (as opposed to two GEP chains with a MustAlias
base) and compares their indexes in a limited way. I generalized
part of this code in D91027, and this patch merges the remainder
into the normal decomposed GEP logic.

What this code ultimately wants to do is to determine that
gep %base, %idx1 and gep %base, %idx2 don't alias if %idx1 != %idx2,
and the access size fits within the stride.

We can express this in terms of a decomposed GEP expression with
two indexes scale*%idx1 + -scale*%idx2 where %idx1 != %idx2, and
some appropriate checks for sizes and offsets.

This makes the reasoning slightly more powerful, and more
importantly brings all the GEP logic under a common umbrella.

Differential Revision: https://reviews.llvm.org/D92723
2020-12-06 10:27:35 +01:00
Fangrui Song
9b8a5e5ef8 [TargetMachine] Delete asan workaround
687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the commit introduced a
bug that ppc64 should be excluded.
2020-12-06 00:33:11 -08:00
Fangrui Song
ac9c226631 [X86FastISel] Fix MO_GOTPCREL GlobalValue reference in static relocation model
This fixes the bug referenced by 5582a7987662a92eda5d883b88fc4586e755acf5
which was exposed by 961f31d8ad14c66829991522d73e14b5a96ff6d4.

With this change, `movq src@GOTPCREL, %rcx` => `movq src@GOTPCREL(%rip), %rcx`
2020-12-05 23:13:28 -08:00
Fangrui Song
f09209d278 [TargetMachine] Don't imply dso_local for memprof in static relocation model
The workaround is no longer needed with my previous commit to MemProfiler.cpp
2020-12-05 21:39:03 -08:00
Fangrui Song
68d910197d [MemProf] Make __memprof_shadow_memory_dynamic_address dso_local in static relocation model
The x86-64 backend currently has a bug which uses a wrong register when for the GOTPCREL reference.
The program will crash without the dso_local specifier.
2020-12-05 21:36:31 -08:00
Vitaly Buka
60c4aec932 [TargetMachine] Set dso_local for memprof
Similar to 5582a7987662a92eda5d883b88fc4586e755acf5
2020-12-05 21:11:04 -08:00
Lang Hames
369d397bc4 [ORC] Fix missing forward of Allow filter in TPCDynamicLibrarySearchGenerator. 2020-12-06 15:42:45 +11:00
Craig Topper
0d8bae1bf2 [RISCV] Replace a custom SDTypeProfile with SDTIntBinOp which should be sufficient here.
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
2020-12-05 20:18:22 -08:00
Fangrui Song
7bce8d652b [TargetMachine] Set dso_local if asan is detected
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.

Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
2020-12-05 17:51:10 -08:00
Kazu Hirata
d0d687e606 [ConstantHoisting] Remove unused declaration optimizeConstants (NFC)
The function was renamed to runImpl on Jul 2, 2016 in commit
071d8306b0d9d1345c1da84ae3e1c1b231ffd29d, but the old declaration has
remained since.
2020-12-05 16:22:12 -08:00
Philip Reames
2255584563 Add recursive decomposition reasoning to isKnownNonEqual
The basic idea is that by looking through operand instructions which don't change the equality result that we can push the existing known bits comparison down past instructions which would obscure them.

We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.)  The BasicAA change will be posted separately for review.

Differential Revision: https://reviews.llvm.org/D92698
2020-12-05 15:58:19 -08:00
Fangrui Song
43af914fd3 [TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden)
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
2020-12-05 15:52:33 -08:00
Kazu Hirata
2a17ab7e6e [CodeGen] llvm::erase_if (NFC) 2020-12-05 15:44:40 -08:00
Aditya Kumar
1a3db3fca7 Remove memory allocation with string
Differential Revision: https://reviews.llvm.org/D92506
2020-12-05 15:14:44 -08:00
Fangrui Song
9ea76a985a [TargetMachine] Clean up TargetMachine::shouldAssumeDSOLocal after x86-32 specific hack is moved to X86Subtarget
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.

This is NFC in terms of Clang emitted assembly.
2020-12-05 15:13:42 -08:00
Fangrui Song
556dfb9bf9 [TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine:shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)

By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.

This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
2020-12-05 14:54:37 -08:00
Fangrui Song
66d138040b [test] Add explicit dso_local to function declarations in static relocation model tests
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.

For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implementing an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).
2020-12-05 14:54:37 -08:00
Philip Reames
e27a387f8e [BasicAA] Fix a bug with relational reasoning across iterations
Due to the recursion through phis basicaa does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrateable incorrect.

Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.

Differential Revision: https://reviews.llvm.org/D92694
2020-12-05 14:10:21 -08:00
Fangrui Song
e24be6085b [X86] Emit @PLT for x86-64 and keep unadorned symbols for x86-32
This essentially reverts the x86-64 side effect of r327198.

For x86-32, @PLT (R_386_PLT32) is not suitable in -fno-pic mode so the
code forces MO_NO_FLAG (like a forced dso_local) (https://bugs.llvm.org//show_bug.cgi?id=36674#c6).

For x86-64, both `call/jmp foo` and `call/jmp foo@PLT` emit R_X86_64_PLT32
(https://sourceware.org/bugzilla/show_bug.cgi?id=22791) so there is no
difference using @PLT. Using @PLT is actually favorable because this drops
a difference with -fpie/-fpic code and makes it possible to avoid a canonical
PLT entry when taking the address of an undefined function symbol.
2020-12-05 13:17:47 -08:00
Chris Sears
ee3732c1ab [llvmbuildectomy] removed vestigial LLVMBuild.txt files
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.

llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt

Differential Revision: https://reviews.llvm.org/D92693
2020-12-05 22:00:22 +01:00
Fangrui Song
073082d2af [TargetMachine] Move X86 specific shouldAssumeDSOLocal logic to X86Subtarget::classifyGlobalFunctionReference 2020-12-05 12:32:50 -08:00
Nikita Popov
333311e9f5 [BasicAA] Add more tests for non-equal index (NFC) 2020-12-05 21:22:57 +01:00
Fangrui Song
226fe3b848 [TargetMachine] Simplify shouldAssumeDSOLocal by processing ExternalSymbolSDNode early
The function accrues many `GV` nullness checks. Process `!GV`
(ExternalSymbolSDNode) early to simplify code.

Also improve a comment added in r327198 (intrinsics is a subset of
ExternalSymbolSDNode).

Intended to be NFC.
2020-12-05 11:40:18 -08:00
Benjamin Kramer
46b183a1e7 [X86] Autodetect znver3 2020-12-05 19:08:20 +01:00
Florian Hahn
757dad386a [ConstraintElimination] Wrap dump() call in LLVM_DEBUG (NFC).
ConstraintSystem::dump only generates output with -debug, but there's no
need to call it without -debug.
2020-12-05 13:14:53 +00:00
Florian Hahn
efea313a11 [ConstraintElimination] Handle constraints with all zero var coeffs.
Constraints where all variable coefficients are 0 do not add any useful
information. When checking, we can check if they are always true/false.
2020-12-05 12:06:53 +00:00
Dmitry Preobrazhensky
9ab946b56b [AMDGPU][MC] Improved diagnostics message for sym/expr operands
See bug 48295 (https://bugs.llvm.org/show_bug.cgi?id=48295)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92088
2020-12-05 14:05:53 +03:00
Nikita Popov
837200292a [AA] Initialize Depth member
Fix mistake introduced in f8afba5f7a25a69c12191d979d78d40fa6e5b684:
I failed to initialize the Depth member to zero.
2020-12-05 11:37:36 +01:00
Dmitry Preobrazhensky
d8b718388d [AMDGPU][MC] Corrected error position for invalid MOVREL src
See bug 47518 (https://bugs.llvm.org/show_bug.cgi?id=47518)

Reviewers: rampitec

Differential Revision: https://reviews.llvm.org/D92084
2020-12-05 13:23:14 +03:00
Nikita Popov
98424363ba [AA] Add statistics for alias results (NFC)
Count how many NoAlias/MustAlias/MayAlias we get from top-level
queries.
2020-12-05 11:09:15 +01:00
Nikita Popov
83eec1088d [BasicAA] Add recphi tests with nested loops (NFC) 2020-12-05 11:09:15 +01:00
Fangrui Song
57942737c1 [TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead.
Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.

This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).
2020-12-05 00:42:07 -08:00