1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00
Commit Graph

212770 Commits

Author SHA1 Message Date
Alexey Lapshin
da19cde461 [llvm-objcopy][NFC] Move ownership keeping code into restoreStatOnFile().
The D93881 added functionality which preserve ownership for output file
if llvm-objcopy is called under root. That code was added into the place
where output file is created. The llvm-objcopy already has a function which
sets/restores rights/permissions for the output file.
That is the restoreStatOnFile() function. This patch moves code
(preserving ownershipping) into the restoreStatOnFile() function.

Differential Revision: https://reviews.llvm.org/D98511
2021-03-17 17:27:00 +03:00
Timotej Kapus
b99ed0163e [OCaml] Handle nullptr in Llvm.global_initializer
LLVMGetInitializer returns nullptr in case there is no initializer.
There is not much that can be done with nullptr in OCaml, not even
test if it is null. Also, there does not seem to be a C or OCaml API
to test if there is an initializer. So this diff changes
Llvm.global_initializer to return an option.

Reviewed By: whitequark

Differential Revision: https://reviews.llvm.org/D65195
2021-03-17 13:39:35 +00:00
Hans Wennborg
d0e43622c0 Revert "[DebugInfo] Handle multiple variable location operands in IR"
This caused non-deterministic compiler output; see comment on the
code review.

> This patch updates the various IR passes to correctly handle dbg.values with a
> DIArgList location. This patch does not actually allow DIArgLists to be produced
> by salvageDebugInfo, and it does not affect any pass after codegen-prepare.
> Other than that, it should cover every IR pass.
>
> Most of the changes simply extend code that operated on a single debug value to
> operate on the list of debug values in the style of any_of, all_of, for_each,
> etc. Instances of setOperand(0, ...) have been replaced with with
> replaceVariableLocationOp, which takes the value that is being replaced as an
> additional argument. In places where this value isn't readily available, we have
> to track the old value through to the point where it gets replaced.
>
> Differential Revision: https://reviews.llvm.org/D88232

This reverts commit df69c69427dea7f5b3b3a4d4564bc77b0926ec88.
2021-03-17 13:36:48 +01:00
Jason Hu
4d21b9cc78 [NFC][OCaml] Fix documentation for verify_function and const_of_int64
Documentation of verify_function is incorrect and that of
const_of_int64 is incomplete.

Reviewed By: whitequark

Differential Revision: https://reviews.llvm.org/D77884
2021-03-17 12:09:28 +00:00
Simon Pilgrim
2d3ee0e485 Revert rG3b635253ddd0106c88051cff3540d8eb90bee22f "[AMDGPU] Regenerate wave32.ll test checks"
Breaks on some buildbots.
2021-03-17 11:47:09 +00:00
David Zarzycki
b6009ddce9 [lit] Harmonize test timing data between Unix and Windows
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.

Reviewed By: jhenderson, jmorse

Differential Revision: https://reviews.llvm.org/D98767
2021-03-17 07:42:40 -04:00
Bradley Smith
85ceade375 [AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE
Previously NEON used a target specific intrinsic for frintn, given that
the FROUNDEVEN ISD node now exists, move over to that instead and add
codegen support for that node for both NEON and fixed length SVE.

Differential Revision: https://reviews.llvm.org/D98487
2021-03-17 11:41:22 +00:00
Simon Pilgrim
773dd67e1c [AMDGPU] Regenerate wave32.ll test checks
This is to help simplify the diff on an upcoming patch
2021-03-17 11:27:11 +00:00
David Green
b0820d90be [LV] Account for the cost of predication of scalarized load/store
This adds the cost of an i1 extract and a branch to the cost in
getMemInstScalarizationCost when the instruction is predicated. These
predicated loads/store would generate blocks of something like:

    %c1 = extractelement <4 x i1> %C, i32 1
    br i1 %c1, label %if, label %else
  if:
    %sa = extractelement <4 x i32> %a, i32 1
    %sb = getelementptr inbounds float, float* %pg, i32 %sa
    %sv = extractelement <4 x float> %x, i32 1
    store float %sa, float* %sb, align 4
  else:

So this increases the cost by the extract and branch. This is probably
still too low in many cases due to the cost of all that branching, but
there is already an existing hack increasing the cost using
useEmulatedMaskMemRefHack. It will increase the cost of a memop if it is
a load or there are more than one store. This patch improves the cost
for when there is only a single store, and hopefully at some point in
the future the hack can be removed.

Differential Revision: https://reviews.llvm.org/D98243
2021-03-17 10:57:50 +00:00
Bu Le
b2b1c4104c [SLP] Fix the trunc instruction insertion problem
Current SLP pass has this piece of code that inserts a trunc instruction
after the vectorized instruction. In the case that the vectorized instruction
is a phi node and not the last phi node in the BB, the trunc instruction
will be inserted between two phi nodes, which will trigger verify problem
in debug version or unpredictable error in another pass.
This patch changes the algorithm to 'if the last vectorized instruction
is a phi, insert it after the last phi node in current BB' to fix this problem.
2021-03-17 13:51:08 +03:00
Fraser Cormack
f84a9cd429 [RISCV] Optimize "dominant element" BUILD_VECTORs
This patch adds an optimization path for BUILD_VECTOR nodes where the
majority of the elements are identical. These can be splatted, with the
remaining elements patched up with INSERT_VECTOR_ELTs. The threshold can
be tweaked as required - it is currently conservative. Undef elements
are disregarded when judging the dominance of a particular element. This
allows them to be covered by the splat value.

In addition, vectors of 2 elements are always optimized to a splat (for
the upper element) and an insert at element zero.

This optimization is disabled when optimizing for size.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98700
2021-03-17 10:09:04 +00:00
Jay Foad
9d2225fc35 [AMDGPU] Split dot2-insts feature
Split out some of the instructions predicated on the dot2-insts target
feature into a new dot7-insts, in preparation for subtargets that have
some but not all of these instructions. NFCI.

Differential Revision: https://reviews.llvm.org/D98717
2021-03-17 09:42:21 +00:00
Jay Foad
fdda5b21a5 [TableGen] Fix excessive compile time issue in FixedLenDecoderEmitter
This patch reduces the time taken for clang to compile the generated
disassembler for an out-of-tree target with InsnType bigger than 64 bits
from 4m30s to 48s.

D67686 did a similar thing for CodeEmitterGen.

The idea is to tweak the API of the APInt-like InsnType class so that
we don't need so many temporary InsnTypes. This takes advantage of the
rule stated in D52100 that currently "no string of bits extracted
from the encoding may exceeed 64-bits", so we can use uint64_t for some
temporaries.

D52100 goes on to say that "fields are still permitted to exceed 64-bits
so long as they aren't one contiguous string of bits". This patch breaks
that by always using a "uint64_t tmp" in the generated decodeToMCInst,
but it should be easy to fix in FilterChooser::emitBinaryParser by
choosing to use a different type of tmp based on the known total field
width.

Differential Revision: https://reviews.llvm.org/D98046
2021-03-17 09:28:50 +00:00
Bu Le
315ebb4028 [SLP][Test] Precommit test for D98423 2021-03-17 12:11:50 +03:00
edwin-wang
e5bb9a33f8 [NFC] [XCOFF] Update PowerPC readobj test case with expression
This patch is to replace the fixed value with expression.
Keep .file section as fixed values as it might be changed. The
remaining sections will hardly be modified. So the Index values
are sequential. By using expression, we can avoid the fixed value
changes in coming patches.

This is a follow-up of patch D97117.

Reviewed By: hubert.reinterpretcast, shchenz

Differential Revision: https://reviews.llvm.org/D98620
2021-03-17 16:02:50 +08:00
Fangrui Song
dfd32b8a1c [MC] Delete unused MCOperand::{create,is,get}FPImm 2021-03-17 00:30:38 -07:00
Praveen
291c355e10 [Flang][OpenMP][OpenACC] Add function for mapping parser clause classes with the corresponding clause kind.
1. Generate the mapping for clauses between the parser class and the
   corresponding clause kind for OpenMP and OpenACC using tablegen.

2. Add a common function to get the OmpObjectList from the OpenMP
   clauses to avoid repetition of code.

Reviewed by: Kiranchandramohan @kiranchandramohan , Valentin Clement @clementval

Differential Revision: https://reviews.llvm.org/D98603
2021-03-17 12:20:43 +05:30
Vaivaswatha Nagaraj
7fbfc910ee [OCaml] Fix buildbot failure in OCaml tests
The commit 506df1bbfd16233134a6ddea96ed2d49077840fd introduced
a call to `caml_alloc_initialized_string` which seems to be
unavailable on older OCaml versions. So I'm now switching to
using `caml_alloc_string` and using a `memcpy` after that, as
is done in the rest of the file.

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/16/builds/7919
2021-03-17 11:29:55 +05:30
Arthur Eubanks
45770e01eb [Unswitch] Guard dbgs logging with LLVM_DEBUG 2021-03-16 22:31:57 -07:00
Vaivaswatha Nagaraj
63a1fa0826 [OCaml] DebugInfo support for OCaml bindings
Many (but not all) DebugInfo functions are now added to the
OCaml bindings, and rest can be safely added incrementally.

Differential Revision: https://reviews.llvm.org/D90831
2021-03-17 10:15:56 +05:30
Max Kazantsev
c38d6febb5 [BasicAA] Drop dependency on Loop Info. PR43276
BasicAA stores a reference to LoopInfo inside. This imposes an implicit
requirement of keeping it up to date whenever we modify the IR (in particular,
whenever we modify terminators of blocks that belong to loops). Failing
to do so leads to incorrect state of the LoopInfo.

Because general AA does not require loop info updates and provides to API to
update it properly, the users of AA reasonably assume that there is no need to
update the loop info. It may be a reason of bugs, as example in PR43276 shows.

This patch drops dependence of BasicAA on LoopInfo to avoid this problem.

This may potentially pessimize the result of queries to BasicAA.

Differential Revision: https://reviews.llvm.org/D98627
Reviewed By: nikic
2021-03-17 11:43:44 +07:00
Anirudh Prasad
1c455c1c4d Revert "[AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax""
This reverts commit b605cfb336989705f391d255b7628062d3dfe9c3.

Differential Revision: https://reviews.llvm.org/D98744
2021-03-16 18:39:04 -04:00
Zequan Wu
380ae2df93 Revert "[ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast()"
That commit caused chromium build to crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1188885

This reverts commit edf7004851519464f86b0f641da4d6c9506decb1.
2021-03-16 14:36:21 -07:00
Sanjay Patel
0d0126a35e [SLP] separate min/max matching from its instruction-level implementation; NFC
The motivation is to handle integer min/max reductions independently
of whether they are in the current cmp+sel form or the planned intrinsic
form.

We assumed that min/max included a select instruction, but we can
decouple that implementation detail by checking the instructions
themselves rather than relying on the recurrence (reduction) type.
2021-03-16 17:16:11 -04:00
Fangrui Song
73df85ab79 [RISCV] Make empty name symbols SF_FormatSpecific so that llvm-symbolizer ignores them for symbolization
On RISC-V, clang emits empty name symbols used for label differences. (In GCC the symbols are typically `.L0`)
After D95916, the empty name symbols can show up in llvm-symbolizer's symbolization output.
They have no names and thus not useful. Set `SF_FormatSpecific` so that llvm-symbolizer will ignore them.

`SF_FormatSpecific` is also used in LTO but that case should not matter.

Corresponding addr2line problem: https://sourceware.org/bugzilla/show_bug.cgi?id=27585

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D98669
2021-03-16 14:12:18 -07:00
Anirudh Prasad
efc20bdcf7 [AsmParser][SystemZ][z/OS] Reland "Introduce HLASM Comment Syntax"
- Previously, https://reviews.llvm.org/D97703 was [[ https://reviews.llvm.org/D98543 | reverted ]] as it broke when building the unit tests when shared libs on.
- This patch reverts the "revert" and makes two minor changes
- The first is it also links in the MCParser lib when building the unittest. This should resolve the issue when building with with shared libs on and off
- The second renames the name of the unit test from `SystemZAsmLexer` to `SystemZAsmLexerTests` since the convention for unittest binaries is to suffix the name of the unit test with "Tests"

Reviewed By: Kai

Differential Revision: https://reviews.llvm.org/D98666
2021-03-16 17:11:46 -04:00
Roland McGrath
0f93f116e0 [AArch64] Parse "rng" feature flag in .arch directive
Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D98566
2021-03-16 14:10:19 -07:00
Mohammad Hadi Jooybar
0efd7ce0e4 [InstCombine] Avoid Bitcast-GEP fusion for pointers directly from allocation functions
Elimination of bitcasts with void pointer arguments results in GEPs with pure byte indexes. These GEPs do not preserve struct/array information and interrupt phi address translation in later pipeline stages.

Here is the original motivation for this patch:

```
#include<stdio.h>
#include<malloc.h>

typedef struct __Node{

  double f;
  struct __Node *next;

} Node;

void foo () {
  Node *a = (Node*) malloc (sizeof(Node));
  a->next = NULL;
  a->f = 11.5f;

  Node *ptr = a;
  double sum = 0.0f;
  while (ptr) {
    sum += ptr->f;
    ptr = ptr->next;
  }
  printf("%f\n", sum);
}
```
By explicit assignment  `a->next = NULL`, we can infer the length of the link list is `1`. In this case we can eliminate while loop traversal entirely. This elimination is supposed to be performed by GVN/MemoryDependencyAnalysis/PhiTranslation .

The final IR before this patch:

```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
  %f = bitcast i8* %call to double*
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.lr.ph

while.body.lr.ph:                                 ; preds = %entry
  %1 = bitcast i8* %call to %struct.__Node*
  br label %while.body

while.body:                                       ; preds = %while.body.lr.ph, %while.body
  %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
  %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %2 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %2
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %3, null
  br i1 %tobool, label %while.end, label %while.body

while.end:                                        ; preds = %while.body, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add, %while.body ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```

Final IR after this patch:
```
; Function Attrs: nofree nounwind
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
while.end:
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double 1.150000e+01)
  ret void
}
```

IR before GVN before this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
  %f = bitcast i8* %call to double*
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.lr.ph

while.body.lr.ph:                                 ; preds = %entry
  %1 = bitcast i8* %call to %struct.__Node*
  br label %while.body

while.body:                                       ; preds = %while.body.lr.ph, %while.body
  %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
  %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %2 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %2
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %3, null
  br i1 %tobool, label %while.end.loopexit, label %while.body

while.end.loopexit:                               ; preds = %while.body
  %add.lcssa = phi double [ %add, %while.body ]
  br label %while.end

while.end:                                        ; preds = %while.end.loopexit, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```
IR before GVN after this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %0 = bitcast i8* %call to %struct.__Node*
  %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
  store %struct.__Node* null, %struct.__Node** %next, align 8, !tbaa !2
  %f = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 0
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.preheader

while.body.preheader:                             ; preds = %entry
  br label %while.body

while.body:                                       ; preds = %while.body.preheader, %while.body
  %sum.014 = phi double [ %add, %while.body ], [ 0.000000e+00, %while.body.preheader ]
  %ptr.013 = phi %struct.__Node* [ %2, %while.body ], [ %0, %while.body.preheader ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %1 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %1
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %2 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %2, null
  br i1 %tobool, label %while.end.loopexit, label %while.body

while.end.loopexit:                               ; preds = %while.body
  %add.lcssa = phi double [ %add, %while.body ]
  br label %while.end

while.end:                                        ; preds = %while.end.loopexit, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```

The phi translation fails before this patch and it prevents GVN to remove the loop. The reason for this failure is in InstCombine. When the Instruction combining pass decides to convert:
```
 %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
  %0 = bitcast i8* %call to %struct.__Node*
  %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
  store %struct.__Node* null, %struct.__Node** %next
```
to
```
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0

```

GEP instructions with pure byte indexes (e.g. `getelementptr inbounds i8, i8* %call, i64 8`) are obstacles for address translation. address translation is looking for structural similarity between GEPs and these GEPs usually do not match since they have different structure.

This change will cause couple of failures in LLVM-tests. However, in all cases we need to change expected result by the test. I will update those tests as soon as I get green light on this patch.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96881
2021-03-16 17:05:44 -04:00
Ricky Taylor
ad0007c85e [M68k] Add more specific operand classes
This change adds an operand class for each addressing mode, which can then
be used as part of the assembler to match instructions.

Differential Revision: https://reviews.llvm.org/D98535
2021-03-16 13:37:50 -07:00
Min-Yih Hsu
eae33fdf31 [M68k] Fixed incorrect extract-section command substitution
Fix Bug 49485 (https://bugs.llvm.org/show_bug.cgi?id=49485). Which was
caused by incorrect invocation of `extract-section.py` on Windows.
Replacing it with more general python script invocation.

Differential Revision: https://reviews.llvm.org/D98661
2021-03-16 13:37:50 -07:00
Philip Reames
422700f45c [rs4gc] Simplify code by cloning existing instructions when inserting base chain [NFC]
Previously we created a new node, then filled in the pieces. Now, we clone the existing node, then change the respective fields. The only change in handling is with phis since we have to handle multiple incoming edges from the same block a bit differently.

Differential Revision: https://reviews.llvm.org/D98316
2021-03-16 13:10:32 -07:00
Philip Reames
bedda2fc60 [rs4gc] don't force a conflict for a canonical broadcast
A broadcast is a shufflevector where only one input is used. Because of the way we handle constants (undef is a constant), the canonical shuffle sees a meet of (some value) and (nullptr). Given this, every broadcast gets treated as a conflict and a new base pointer computation is added.

The other way to tackle this would be to change constant handling specifically for undefs, but this seems easier.

Differential Revision: https://reviews.llvm.org/D98315
2021-03-16 12:59:06 -07:00
Philip Reames
93763f7273 [rs4gc] don't duplicate existing values which are provably base pointers
RS4GC needs to rewrite the IR to ensure that every relocated pointer has an associated base pointer. The existing code isn't particularly smart about avoiding duplication of existing IR when it turns out the original pointer we were asked to materialize a base pointer for is itself a base pointer.

This patch adds a stage to the algorithm which prunes nodes proven (with a simple forward dataflow fixed point) to be base pointers from the list of nodes considered for duplication. This does require changing some of the later invariants slightly, that's probably the riskiest part of the change.

Differential Revision: D98122
2021-03-16 12:51:21 -07:00
Nikita Popov
52b2bd3243 Revert "[regalloc] Ensure Query::collectInterferringVregs is called before interval iteration"
This reverts commit d40b4911bd9aca0573752e065f29ddd9aff280e1.

This causes a large compile-time regression:
https://llvm-compile-time-tracker.com/compare.php?from=0aa637b2037d882ddf7861284169abf63f524677&to=d40b4911bd9aca0573752e065f29ddd9aff280e1&stat=instructions
2021-03-16 20:41:26 +01:00
Liam Keegan
6c9c7f0375 [MemCpyOpt] Add missing MemorySSAWrapperPass dependency macro
Add MemorySSAWrapperPass as a dependency to MemCpyOptLegacyPass,
since MemCpyOpt now uses MemorySSA by default.

Differential Revision: https://reviews.llvm.org/D98484
2021-03-16 20:30:00 +01:00
Mircea Trofin
d68c0a9cbc [regalloc] Ensure Query::collectInterferringVregs is called before interval iteration
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.

The rest of the patch offers support that is the case: instead of  clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).

Without the change in RegAllocGreedy.cpp, the compiler ices.

This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.

I will follow up with a subsequent patch to improve the usability and maintainability of Query.

Differential Revision: https://reviews.llvm.org/D98232
2021-03-16 12:10:10 -07:00
Nick Lewycky
e529b62823 Add ConstantDataVector::getRaw() to create a constant data vector from raw data.
This parallels ConstantDataArray::getRaw() and can be used with ConstantDataSequential::getRawDataValues() in the base class for both types.

Update BuildConstantData{Array,Vector} tests to test the getRaw API. Also removes its unused Module.

In passing, update some comments to include the support for half and bfloat. Update tests to include testing for bfloat.

Differential Revision: https://reviews.llvm.org/D98302
2021-03-16 11:57:53 -07:00
Maksym Wezdecki
abe2655fe0 Fix for memory leak reported by Valgrind
If llvm so lib is dlopened and dlclosed several times, then memory leak can be observed, reported by Valgrind.

This patch fixes the issue.

Reviewed By: lattner, dblaikie

Differential Revision: https://reviews.llvm.org/D83372
2021-03-16 11:01:31 -07:00
Philip Reames
3be79b1237 [gvn] CSE gc.relocates based on meaning, not spelling (try 2)
This was (partially) reverted in cfe8f8e0 because the conversion from readonly to readnone in Intrinsics.td exposed a couple of problems.  This change has been reworked to not need that change (via some explicit checks in client code).  This is being done to address the original optimization issue and simplify the testing of the readonly changes.  I'm working on that piece under 49607.

Original commit message follows:

The last two operands to a gc.relocate represent indices into the associated gc.statepoint's gc bundle list. (Effectively, gc.relocates are projections from the gc.statepoints multiple return values.)

We can use this to recognize when two gc.relocates are equivalent (and can be CSEd), even when the indices are non-equal. This is particular useful when considering a chain of multiple statepoints as it lets us eliminate all duplicate gc.relocates in a single pass.

Differential Revision: https://reviews.llvm.org/D97974
2021-03-16 10:59:31 -07:00
Florian Hahn
e31c38aacf [VPlan] Remove PredInst2Recipe, use VP operands instead. (NFC)
Instead of maintaining a separate map from predicated instructions to
recipes, we can instead directly look at the VP operands. If the operand
comes from a predicated instruction, the operand will be a
VPPredInstPHIRecipe with a VPReplicateRecipe as its operand.
2021-03-16 17:40:35 +00:00
Giorgis Georgakoudis
854868cb21 [Utils] Support lit-like substitutions in update_cc_test_checks
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D98712
2021-03-16 10:36:22 -07:00
Vaivaswatha Nagaraj
062bb2a6ef [Docs] Mention linking to reviews page when committing
Differential Revision: https://reviews.llvm.org/D98695
2021-03-16 23:04:22 +05:30
Fangrui Song
d3961f8ad2 [llvm-nm] Add --format=just-symbols and make --just-symbol-name its alias
https://sourceware.org/bugzilla/show_bug.cgi?id=27487 binutils will have
--format=just-symbols/-j as well.

Arbitrarily prefer `-j` to `--format=sysv`. Previously `--format=sysv -j` prints
in the sysv format while `-j` takes precedence over other formats.

Differential Revision: https://reviews.llvm.org/D98569
2021-03-16 10:07:01 -07:00
Adrian Prantl
7fb4439f85 Support !heapallocsite attachments in StripDebugInfo().
They point into the DIType type system, so they need to be stripped as well.

rdar://75341300

Differential Revision: https://reviews.llvm.org/D98668
2021-03-16 10:05:13 -07:00
Adrian Prantl
2a7cba934d Support !heapallocsite attachments in stripNonLineTableDebugInfo().
They point into the DIType type system, so they need to be stripped as well.

rdar://75341300

Differential Revision: https://reviews.llvm.org/D98667
2021-03-16 10:05:12 -07:00
Fangrui Song
7a78fa03a9 [RISCV] Support clang -fpatchable-function-entry && GNU function attribute 'patchable_function_entry'
Similar to D72215 (AArch64) and D72220 (x86).

```
% clang -target riscv32 -march=rv64g -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
0000000000000000 <main>:
       0: 13 00 00 00   nop
       4: 13 00 00 00   nop

% clang -target riscv32 -march=rv64gc -c -fpatchable-function-entry=2 a.c && llvm-objdump -dr a.o
...
00000002 <main>:
       2: 01 00         nop
       4: 01 00         nop
```

Recently the mainline kernel started to use -fpatchable-function-entry=8 for riscv (https://git.kernel.org/linus/afc76b8b80112189b6f11e67e19cf58301944814).

Differential Revision: https://reviews.llvm.org/D98610
2021-03-16 10:02:35 -07:00
Simonas Kazlauskas
59b63b74d5 [InstSimplify] Restrict a GEP transform to avoid provenance changes
This is a follow-up to D98588, and fixes the inline `FIXME` about a GEP-related simplification not
preserving the provenance.

https://alive2.llvm.org/ce/z/qbQoAY

Additional tests were added in {rGf125f28afdb59eba29d2491dac0dfc0a7bf1b60b}

Depends on D98672

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D98611
2021-03-16 18:53:05 +02:00
Jeremy Morse
66e0244e31 Tweak spelling of system-windows UNSUPPORTED line 2021-03-16 16:52:00 +00:00
Sanjay Patel
d76ebc24d7 [LoopVectorize] add FP induction test with minimal FMF; NFC 2021-03-16 12:05:34 -04:00
Thomas Preud'homme
efde4985b2 [MemDepAnalysis] Remove redundant comment.
Exact same comment is found 2 lines above.
2021-03-16 15:51:17 +00:00