1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

215353 Commits

Author SHA1 Message Date
Juneyoung Lee
83fcabb534 [InstCombine] Precommit tests for D101807 (NFC) 2021-05-05 13:44:57 +09:00
Serguei Katkov
2281491db1 [GreedyRA] Add support for invoke statepoint with tied-defs.
statepoint instruction uses tied-def registers to represent live gc value which
is use and def at the same time on a call.
At the same time invoke statepoint instruction is a last split point which can throw and
jump to landing pad.
As a result we have instructon which is last split point with tied-defs registers and
we need to teach Greedy RA to work with it.

The option -use-registers-for-gc-values-in-landing-pad controls whether statepoint lowering
will generate tied-defs for invoke statepoint and is off by default now.

To resolve all issues the following changes has been done.
1) Last Split point for invoke statepoint should be statepoint itself

If statepoint has a def it is a relocated gc pointer and it should be available in landing pad.
So we cannot split interval after statepoint at end of basic block.

2) Do not split interval on tied-def

If end of interval for overlap utility is a use which has tied-def we
should not split interval on this instruction due to in this case use
and def may have different registers and it breaks tied-def property.

3) Take into account Last Split Point for enterIntvAtEnd

If the use after Last Split Point is a def so it should be tied-def and
we can take the def of the tied-use as ParentVNI and thus
tied-use and tied-def will be live in resulting interval.

4) Handle the case when def is after LIP in InlineSpiller

If def of LI is after last insertion point of basic block we cannot hoist in this BB.

The example of such instruction is invoke statepoint where def represents the
relocated live gc pointer. Invoke is a last insertion point and its def is located after it.
In this case there is no place to insert spill and we bail out.

5) Fix removeBackCopies to account empty copies

RegAssignMap cannot hold empty interval, so do not set stop
to kill value if it produces empty interval.

This can happen if we remove back-copy and right before that we have another
back-copy.

For example, for parent %0 we can get
%1 = COPY %0
%2 = COPY %0
while we removing %2 we cannot set kill for %1 due to its empty.

6) Do not hoist copy to BB if its def is after LSP

If the parent def is a LastSplitPoint or later we cannot hoist copy to this basic block
because inserted copy (or re-materialization) will be located before the def.

All parts have been reviewed separately as follows:
https://reviews.llvm.org/D100747
https://reviews.llvm.org/D100748
https://reviews.llvm.org/D100750
https://reviews.llvm.org/D100927
https://reviews.llvm.org/D100945
https://reviews.llvm.org/D101028

Reviewers: reames, rnk, void, MatzeB, wmi, qcolombet
Reviewed By: reames, qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D101150
2021-05-05 11:13:35 +07:00
LLVM GN Syncbot
795d87c4aa [gn build] Port f2018d6c16d1 2021-05-05 03:54:38 +00:00
Lang Hames
43ea1b5279 [ORC] Reintroduce the ORC C API test.
This test was removed in 51495fd285 due to broken bots. Its reintroduction is
expected to trigger failures on some builders. The test has been modified to
print error messages in full, which should aid in tracking these down.
2021-05-04 20:46:00 -07:00
Fangrui Song
98bf7fd6fd [llvm-objcopy] --dump-section: error if '=' is missing or filename is empty
Fix PR45416: the diagnostic when '=' is missing is misleading.
`FileOutputBuffer::create` returns successfully when the filename is empty
(the temporary file is `.tmp%%%%%%%`), but `FileOutputBuffer::commit` will error when
renaming `.tmp%%%%%%%` to the empty name).

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D101697
2021-05-04 17:30:57 -07:00
Han Zhu
33552ebc41 [loop-idiom] Hoist loop memcpys to loop preheader
For a simple loop like:
```
struct S {
  int x;
  int y;
  char b;
};

unsigned foo(S* __restrict__ a, S* b, int n) {
  for (int i = 0; i < n; i++)
    a[i] = b[i];

  return sizeof(a[0]);
}
```
We could eliminate the loop and convert it to a large memcpy of 12*n bytes. Currently this is not handled. Output of `opt -loop-idiom -S < memcpy_before.ll`
```
%struct.S = type { i32, i32, i8 }

define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
  %cmp7 = icmp sgt i32 %n, 0
  br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup

for.body.preheader:                               ; preds = %entry
  br label %for.body

for.cond.cleanup.loopexit:                        ; preds = %for.body
  br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond.cleanup.loopexit, %entry
  ret i32 12

for.body:                                         ; preds = %for.body, %for.body.preheader
  %i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
  %idxprom = zext i32 %i.08 to i64
  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
  %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
  %0 = bitcast %struct.S* %arrayidx2 to i8*
  %1 = bitcast %struct.S* %arrayidx to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 dereferenceable(12) %0, i8* nonnull align 4 dereferenceable(12) %1, i64 12, i1 false)
  %inc = add nuw nsw i32 %i.08, 1
  %cmp = icmp slt i32 %inc, %n
  br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0

attributes #0 = { argmemonly nofree nosync nounwind willreturn }

```
The loop idiom pass currently only handles load and store instructions. Since struct S is too big to fit in a register, the loop body contains a memcpy intrinsic.

With this change, re-run `opt -loop-idiom -S < memcpy_before.ll`. The loop memcpy is promoted to loop preheader. For this trivial case, the loop is dead and will be removed by another pass.
```
%struct.S = type { i32, i32, i8 }

define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
  %a1 = bitcast %struct.S* %a to i8*
  %b2 = bitcast %struct.S* %b to i8*
  %cmp7 = icmp sgt i32 %n, 0
  br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup

for.body.preheader:                               ; preds = %entry
  %0 = zext i32 %n to i64
  %1 = mul nuw nsw i64 %0, 12
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a1, i8* align 4 %b2, i64 %1, i1 false)
  br label %for.body

for.cond.cleanup.loopexit:                        ; preds = %for.body
  br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond.cleanup.loopexit, %entry
  ret i32 12

for.body:                                         ; preds = %for.body, %for.body.preheader
  %i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
  %idxprom = zext i32 %i.08 to i64
  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
  %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
  %2 = bitcast %struct.S* %arrayidx2 to i8*
  %3 = bitcast %struct.S* %arrayidx to i8*
  %inc = add nuw nsw i32 %i.08, 1
  %cmp = icmp slt i32 %inc, %n
  br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0

attributes #0 = { argmemonly nofree nosync nounwind willreturn }
```

Reviewed By: zino

Differential Revision: https://reviews.llvm.org/D97667
2021-05-04 17:05:04 -07:00
Baptiste Saleil
58e38260d0 [AMDGPU] Add rm line to lit test to cleanup bots 2021-05-04 18:27:50 -04:00
Florian Hahn
e85c53712f [VPlan] Properly handle sinking of replicate regions.
This patch updates the code that sinks recipes required for first-order
recurrences to properly handle replicate-regions. At the moment, the
code would just move the replicate recipe out of its replicate-region,
producing an invalid VPlan.

When sinking a recipe in a replicate-region, we have to sink the whole
region. To do that, we first need to split the block at the target
recipe and move the region in between.

This patch also adds a splitAt helper to VPBasicBlock to split a
VPBasicBlock at a given iterator.

Fixes PR50009.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D100751
2021-05-04 22:36:01 +01:00
Baptiste Saleil
6bfeb5cb1d [AMDGPU] Fix lit failure introduced by 6a17609157196878b9cd9aa9ce71bde247ca14db 2021-05-04 17:25:58 -04:00
Fangrui Song
c70d7b131b [MC] Add MCAsmParser::parseComma to improve diagnostics
llvm-mc will error "expected comma" instead of "unexpected token".
2021-05-04 14:13:19 -07:00
Dávid Bolvanský
3389648c5f Revert "[InstSimplify] Added tests for PR50173, NFC"
This reverts commit 4e7a4c73dab6605f4fcc7bf09c2ee85e7925f6d7. Not needed, pattern is handled by instcombine already.
2021-05-04 23:04:05 +02:00
Baptiste Saleil
a29ed3554d [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1
This patch disables some of the passes at -O1. These passes have a significant
impact on compilation time, so we only want them to be enabled starting from -O2.

Differential Revision: https://reviews.llvm.org/D101414
2021-05-04 16:44:39 -04:00
Fangrui Song
72e39e686f [MC] Don't capitalize a floating point diagnostic 2021-05-04 13:40:26 -07:00
Matt Arsenault
36b99228ee GlobalISel: Fix missing newline in debug printing 2021-05-04 16:36:37 -04:00
Matt Arsenault
ea50755f9c X86/GlobalISel: Rely on default assignValueToReg
The resulting output is semantically closer to what the DAG emits and
is more compatible with the existing CCAssignFns.

The returns of f32 in f80 are clearly broken, but they were broken
before when using G_ANYEXT to go from f32 to f80.
2021-05-04 16:36:37 -04:00
Fangrui Song
dcc729c605 [MC] Remove unneeded "in '.xxx' directive" from diagnostics
The directive name is not useful because the next line replicates the error line
which includes the directive.
2021-05-04 13:30:29 -07:00
Thomas Lively
4986a0d7fa [WebAssembly] Mark abs of v2i64 as legal
We previously had an ISel pattern for i64x2.abs, but because the ISDNode was not
marked legal for v2i64, the instruction was not being selected.

Differential Revision: https://reviews.llvm.org/D101803
2021-05-04 13:25:32 -07:00
Alina Sbirlea
f3aac64c1f Add cal entry for MemorySSA syncs. 2021-05-04 12:56:06 -07:00
Xun Li
cb79e5070e [Coroutines] Do not add alloca to the frame if the size is 0
This patch is to address https://bugs.llvm.org/show_bug.cgi?id=49916.
When the size of an alloca is 0, it will trigger an assertion in OptimizedStructLayout when being added to the frame.
Fix it by not adding it at all. We return index 0 (beginning of the frame) for all 0-sized allocas.

Differential Revision: https://reviews.llvm.org/D101841
2021-05-04 12:55:40 -07:00
Martin Storsjö
4d2b659c90 [llvm-readobj] [ARMWinEH] Try to resolve label symbols into regular ones
Unwind info generated by MSVC tends to have relocations pointing at
static "label" symbols like "$LN4" instead of regular ones based on
the actual function's name. Try to resolve such symbols to a non-label
symbol if possible (ideally to an external symbol), to improve
the readability.

Differential Revision: https://reviews.llvm.org/D101567
2021-05-04 22:22:18 +03:00
Giorgis Georgakoudis
608838933b [Utils] Run non-filecheck runlines in-order in update_cc_test_checks
The script update_cc_test_checks runs all non-filechecked runlines before the filechecked ones. This creates problems since outputs of those non-filechecked runlines may conflict and that will fail the execution of update_cc_test_checks. This patch executes non-filechecked in the order specified in the test file to avoid this issue.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101683
2021-05-04 12:06:03 -07:00
Alina Sbirlea
91efc4eb34 Add monthly MemorySSA sync. 2021-05-04 11:23:36 -07:00
Fangrui Song
0161f373d6 [llvm-objdump] Delete temporary Hexagon workaround options 2021-05-04 11:05:11 -07:00
Fangrui Song
ed74043acb [Hexagon][test] Migrate llvm-objdump --mv6[0567]t?/--mhvx to --mcpu=hexagonv*/--mattr=+hvx 2021-05-04 11:00:01 -07:00
Nikita Popov
92b46047e1 [SimplifyCFG] Create logical or in SimplifyCondBranchToCondBranch()
We need to use a logical or instead of a bitwise or to preserve
poison behavior. Poison from the second condition should not
propagate if the first condition is true.

We were already handling this correctly in FoldBranchToCommonDest(),
but not in this fold. (There are still other folds with this issue.)
2021-05-04 19:51:30 +02:00
Nikita Popov
eaabc88f7a [SimplifyCFG] Regenerate test checks (NFC)
Regenerate the branch weight test using --check-globals.
2021-05-04 19:51:30 +02:00
Nikita Popov
d814af217f [SimplifyCFG] Extract helper for creating logical op (NFC) 2021-05-04 19:51:30 +02:00
Fangrui Song
49f9c0a8a0 [llvm-objdump] Delete temporary workaround option --riscv-no-aliases
Use the user-facing `-M no-aliases` instead.
2021-05-04 10:41:40 -07:00
Fangrui Song
62a6c6bb17 [RISCV][test] Migrate llvm-objdump --riscv-no-aliases to -M no-aliases
--riscv-no-aliases is an internal cl::opt option not intended to be exported.
Use the user-facing -M no-aliases instead.
2021-05-04 10:38:33 -07:00
Dávid Bolvanský
acbdea3582 [InstSimplify] Added tests for PR50173, NFC 2021-05-04 19:32:44 +02:00
Arthur Eubanks
d13aef09f1 [docs] Fix some wording 2021-05-04 10:21:38 -07:00
Fangrui Song
d61b087845 [llvm-objdump] Fix -a after D100433
-a is alias for --archive-headers, not --all-headers
2021-05-04 10:17:36 -07:00
Fraser Cormack
33fa15f2b2 [ValueTypes] Add MVTs for v256i16 and v256f16
This patch adds the two MVTs to fix a legalizer crash when using vector
shuffles of <256 x i16> and <128 x i16> on RISC-V. The legalizer can't
promote the operand of `v256i32 = any_extend_vector_inreg v128i16`.

Reviewed By: craig.topper, RKSimon

Differential Revision: https://reviews.llvm.org/D101769
2021-05-04 18:06:13 +01:00
Ahsan Saghir
46a30bedec [PowerPC] Prevent argument promotion of types with size greater than 128 bits
This patch prevents argument promotion of types having
type size greater than 128 bits.

Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=49952

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D101188
2021-05-04 12:09:25 -05:00
Bjorn Pettersson
4d8434ed17 Revert "Make dependency between certain analysis passes transitive"
This reverts commit 3655f0757f2b4b61419446b326410118658826ba.

It caused assertion failures related to setLastUser in polly builds.
2021-05-04 19:08:41 +02:00
Wei Mi
a40dc394da [SampleFDO] Fix a bug when appending function symbol into the Callees set of
Root node in ProfiledCallGraph.

In ProfiledCallGraph::addProfiledFunction, to add a function symbol into the
ProfiledCallGraph, currently an uninitialized ProfiledCallGraphNode node is
created by ProfiledFunctions[Name] and inserted into Callees set of Root node
before the node is initialized. The Callees set use
ProfiledCallGraphNodeComparer as its comparator so the uninitialized
ProfiledCallGraphNode may fail to be inserted into Callees set if it happens
to contain a name in memory which has been inserted into the Callees set
before. The problem will prevent some function symbols from being annotated
with profiles and cause performance regression. The patch fixes the problem.

Differential Revision: https://reviews.llvm.org/D101815
2021-05-04 10:05:59 -07:00
Fangrui Song
79e86a2cab [llvm-objdump] Improve newline consistency between different pieces of information
When dumping multiple pieces of information (e.g. --all-headers),
there is sometimes no separator between two pieces.
This patch uses the "\nheader:\n" style, which generally improves
compatibility with GNU objdump.

Note: objdump -t/-T does not add a newline before "SYMBOL TABLE:" and "DYNAMIC SYMBOL TABLE:".
We add a newline to be consistent with other information.

`objdump -d` prints two empty lines before the first 'Disassembly of section'.
We print just one with this patch.

Differential Revision: https://reviews.llvm.org/D101796
2021-05-04 09:56:07 -07:00
gbreynoo
daae3b3476 [llvm-objdump] Remove Generic Options group from help text output
Reapply 7368624 after revert and fix

Looking at other tools using tablegen for help output, general options
like --help are not separated from other options. This change removes
the "Generic Options" option group so the options are listed together.
the macho specific option group is left unaffected.

The test help.test was modified to reflect this change.

Differential Revision: https://reviews.llvm.org/D101652
2021-05-04 17:42:20 +01:00
Dimitry Andric
fcb5b8b828 Revert "[llvm-objdump] Remove Generic Options group from help text output"
This reverts commit 73686247ac3e60c91fa5943c98956093df5e49ff, as there
were git stash conflict markers left unresolved.
2021-05-04 18:28:31 +02:00
Christudasan Devadasan
122c3d5088 DAG: Cleanup assertion in EmitFuncArgumentDbgValue
Removing an assertion introduced with D68945. The
patch was later reverted with 6531a78ac4b5, but failed
to remove this assertion. It causes a problem while
trying to split a 64-bit argument into sub registers.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D101594
2021-05-04 21:48:58 +05:30
Dimitry Andric
b9df55d1d3 Reland "[MC][ELF] Work around R_MIPS_LO16 relocation handling problem"
This fixes PR49821, and avoids "ld.lld: error: test.o:(.rodata.str1.1):
offset is outside the section" errors when linking MIPS objects with
negative R_MIPS_LO16 implicit addends.

ld.lld handles R_MIPS_HI16/R_MIPS_LO16 separately, not as a whole, so it
doesn't know that an R_MIPS_HI16 with implicit addend 1 and an
R_MIPS_LO16 with implicit addend -32768 represents 32768, which is in
range of a MergeInputSection. We could introduce a new RelExpr member
(like R_RISCV_PC_INDIRECT for R_RISCV_PCREL_HI20 / R_RISCV_PCREL_LO12)
but the complexity is unnecessary given that GNU as keeps the original
symbol for this case as well.

Adds a new test case for PR49821, and also updates two other test cases
that are affected by this change.

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D101773
2021-05-04 18:16:09 +02:00
Sanjay Patel
30e39a65e1 [InstCombine] avoid infinite loops with select/icmp transforms
This fixes https://llvm.org/PR48900 , but as seen in the
regression tests prevents some optimizations.

There are a few options to restore those (switch to min/max
intrinsics, add larger pattern matching for select with
dominating condition, improve CVP), but we need to prevent
the bug 1st.
2021-05-04 11:54:06 -04:00
gbreynoo
506fd61a96 [llvm-objdump] Remove Generic Options group from help text output
Looking at other tools using tablegen for help output, general options
like --help are not separated from other options. This change removes
the "Generic Options" option group so the options are listed together.
the macho specific option group is left unaffected.

The test help.test was modified to reflect this change.

Differential Revision: https://reviews.llvm.org/D101652
2021-05-04 16:48:03 +01:00
Amy Kwan
379bc7c651 [PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation
This patch updates the scalar atomic patterns to use the refactored load/store
implementation introduced in D93370.
All existing test cases pass with when the refactored patterns are utilized.

Differential Revision: https://reviews.llvm.org/D94498
2021-05-04 10:46:45 -05:00
gbreynoo
5291a705c1 [llvm-objdump] Remove --cfg option from command guide
The llvm-objdump command guide has the option --cfg which was removed
from the tool by 888320e9fa5eb33194c066f68d50f1e73c5fff5e in 2014. This
change updates the command guide to reflect this.

Differential Revision: https://reviews.llvm.org/D101648
2021-05-04 16:42:13 +01:00
Florian Hahn
31a10de716 [VPlan] Representing backedge def-use feeding reduction phis.
This patch updates the code handling reduction recipes to also keep
track of the incoming value from the latch in the recipe. This is needed
to model the def-use chains completely in VPlan, so that it is possible
to replace the incoming value with an arbitrary VPValue.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D99294
2021-05-04 16:33:22 +01:00
LLVM GN Syncbot
77a6a2e20b [gn build] Port 2021d272ad6c 2021-05-04 15:06:28 +00:00
Brock Wyma
1389385414 [CodeView] Truncate Long Type Names With An MD5 Hash
Replace long CodeView type names with an MD5 hash of the name.

Differential Revision: https://reviews.llvm.org/D97881
2021-05-04 10:51:21 -04:00
Fraser Cormack
1c9a07e9b0 [LangRef] Fix a typo in the vector-type memory layout section 2021-05-04 15:40:53 +01:00
Sander de Smalen
5c92b9bcf1 Reland "[LV] Calculate max feasible scalable VF."
Relands https://reviews.llvm.org/D98509

This reverts commit 51d648c119d7773ce6fb809353bd6bd14bca8818.
2021-05-04 15:44:41 +01:00