mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
204350 Commits

Author SHA1 Message Date
Simon Pilgrim
2352264886 [InstCombine] visitTrunc - remove dead trunc(lshr (zext A), C) combine. NFCI.
I added additional test coverage at rG7a55989dc4305 - but all are handled independently of this combine and http://lab.llvm.org:8080/coverage/coverage-reports/ indicates the code is never used.

Differential revision: https://reviews.llvm.org/D88492
2020-09-29 17:15:16 +01:00
Simon Pilgrim
08a0e9a912 MSP430TargetMachine.h - remove unused includes. NFCI. 2020-09-29 16:41:59 +01:00
Simon Pilgrim
0626d95654 NVPTXTargetMachine.h - remove unused includes. NFCI. 2020-09-29 16:41:59 +01:00
Simon Pilgrim
6301392398 SparcSubtarget.h - cleanup include dependencies. NFCI.
TargetFrameLowering.h is guaranteed to be covered by SparcFrameLowering.h

Fix missing implicit Triple.h dependency.
2020-09-29 16:41:58 +01:00
Cameron McInally
8895b924c4 [SVE] Fix typo in CHECK lines for sve-fixed-length-int-reduce.ll 2020-09-29 10:12:58 -05:00
Sanjay Patel
bf8ef9fdc3 [InstCombine] use redirect of input file in regression tests; NFC
This is a repeat of 1880092722 from 2009. We should have less risk
of hitting bugs at this point because we auto-generate positive CHECK
lines only, but this makes things consistent.

Copying the original commit msg:
"Change tests from "opt %s" to "opt < %s" so that opt doesn't see the
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename."
2020-09-29 11:06:25 -04:00
Simon Pilgrim
1dd688eff2 [InstCombine] Add some basic trunc(lshr(zext(x),c)) tests
Copied from the sext equivalents
2020-09-29 15:49:57 +01:00
Simon Pilgrim
46a5992f75 [InstCombine] Inherit exact flags on extended shifts in trunc (lshr (sext A), C) --> (ashr A, C)
This was missed in D88475
2020-09-29 15:32:09 +01:00
Simon Pilgrim
487233e67b [InstCombine] Add exact shift tests missed in D88475
I missed the post-LGTM comment from @lebedev.ri
2020-09-29 15:24:59 +01:00
Krzysztof Parzyszek
6796541b93 [SDAG] Do not convert undef to 0 when folding CONCAT/BUILD_VECTOR
Differential Revision: https://reviews.llvm.org/D88273
2020-09-29 09:12:26 -05:00
Simon Pilgrim
11e1b3f795 [InstCombine] visitTrunc - trunc (lshr (sext A), C) --> (ashr A, C) non-uniform support
This came from @lebedev.ri's suggestion to use m_SpecificInt_ICMP for D88429 - since I was going to change the m_APInt to m_Constant for that patch I thought I would do it for the only other user of the APInt first.

I've added a ConstantExpr::getUMin helper - it's trivial to add UMAX/SMIN/SMAX but I thought I'd wait until we have use cases.

Differential Revision: https://reviews.llvm.org/D88475
2020-09-29 15:01:16 +01:00
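
To make the trunc (lshr (sext A), C) --> (ashr A, C) fold concrete, here is a minimal standalone sketch that models it on 8-bit values extended to 32 bits. It does not use LLVM. The function names, the fixed i8/i32 widths, and the clamp of the shift amount to 7 are assumptions made for illustration; the clamp is only my reading of why a umin helper is useful here, not a statement about what the patch emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Literal evaluation of trunc(lshr(sext(A to i32), C)) back to i8.
std::uint8_t truncLshrSext(std::int8_t A, unsigned C) {
  assert(C < 32 && "shift amount must be smaller than the wide type");
  // Sign-extend to 32 bits, then reinterpret as unsigned for a logical shift.
  std::uint32_t Wide =
      static_cast<std::uint32_t>(static_cast<std::int32_t>(A));
  return static_cast<std::uint8_t>(Wide >> C);
}

// Arithmetic shift right on the narrow i8 value, amount clamped to 7.
std::uint8_t ashrNarrow(std::int8_t A, unsigned C) {
  unsigned Amt = std::min(C, 7u);
  std::uint8_t Bits = static_cast<std::uint8_t>(A);
  std::uint8_t Shifted = static_cast<std::uint8_t>(Bits >> Amt);
  if (Bits & 0x80u) {
    // Fill the vacated high bits with copies of the sign bit.
    Shifted |= static_cast<std::uint8_t>(0xFFu << (8 - Amt));
  }
  return Shifted;
}

// Compare both forms for every i8 input and every shift amount up to 24,
// the number of bits the sext adds in this sketch.
bool foldHoldsForAllInputs() {
  for (int A = -128; A <= 127; ++A)
    for (unsigned C = 0; C <= 24; ++C)
      if (truncLshrSext(static_cast<std::int8_t>(A), C) !=
          ashrNarrow(static_cast<std::int8_t>(A), C))
        return false;
  return true;
}
```

In this model the two forms agree for shift amounts 0 through 24. For larger amounts the logical shift pulls zero bits into the truncated result, so the forms can differ. With vector operands, each lane could carry its own shift amount and the same per-lane reasoning would apply.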
Dominik Montada
83da9d3852 [GlobalISel] fix widenScalarUnmerge if widen type is not a multiple of destination type
Fix creation of illegal unmerge when widen was requested to a type which
is not a multiple of the destination type. E.g. when trying to widen
an s48 unmerge to s64 the existing code would create an illegal unmerge
from s64 to s48.

Instead, create further unmerges to a GCD type, then use this to remerge
these intermediate results to the actual destinations.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88422
2020-09-29 15:52:20 +02:00
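
The arithmetic behind the GCD-type idea can be sketched as follows. This is a standalone illustration, not the LegalizerHelper code. `gcdBits`, `UnmergePlan` and `planUnmerge` are hypothetical names, and the plan only computes sizes under the assumption that pieces are split to the GCD width and then regrouped.

```cpp
#include <cassert>

// Euclid's algorithm on bit widths.
unsigned gcdBits(unsigned A, unsigned B) {
  assert(A != 0 && B != 0 && "bit widths must be non-zero");
  while (B != 0) {
    unsigned R = A % B;
    A = B;
    B = R;
  }
  return A;
}

// Sizes involved when a widened value is split into GCD-sized pieces that
// are then merged back into the requested destinations.
struct UnmergePlan {
  unsigned PieceBits;     // width of the intermediate GCD type
  unsigned PiecesPerWide; // pieces produced from one widened value
  unsigned PiecesPerDst;  // pieces merged into one destination value
};

UnmergePlan planUnmerge(unsigned WideBits, unsigned DstBits) {
  unsigned G = gcdBits(WideBits, DstBits);
  return {G, WideBits / G, DstBits / G};
}
```

For the s48/s64 case from the message, `planUnmerge(64, 48)` yields 16-bit pieces: four per s64 value and three per s48 destination.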
Mirko Brkusanin
367c918b83 Revert "[AMDGPU] Reorganize GCN subtarget features for unaligned access"
This reverts commit f5cd7ec9f3fc969ff5e1feed961996844333de3b.

Certain rocPRIM/rocThrust/hipCUB tests were failing because of this change.
2020-09-29 15:33:34 +02:00
Jay Foad
e606604b45 [SDag] Verify DAG divergence after dumping. NFC.
When debugging, it's useful to be able to see the DAG that has just
failed divergence verification.
2020-09-29 14:05:07 +01:00
Jay Foad
938bf31fa3 [SDag] Refactor and simplify divergence calculation and checking. NFC. 2020-09-29 14:05:07 +01:00
Jonas Paulsson
d53723ece8 [SystemZ] Don't emit PC-relative memory accesses to unaligned symbols.
In the presence of packed structures (#pragma pack(1)) where elements are
referenced through pointers, there will be stores/loads with alignment values
matching the default alignments for the element types while the elements are
in fact unaligned. Strictly speaking this is incorrect source code, but is
unfortunately part of existing code and therefore now addressed.

This patch improves the pattern predicate for PC-relative loads and stores by
not only checking the alignment value of the instruction, but also making
sure that the symbol (and element) itself is aligned.

Fixes https://bugs.llvm.org/show_bug.cgi?id=44405

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D87510
2020-09-29 14:51:13 +02:00
Florian Hahn
5656134a0c [LoopUtils] Only verify SE in builds with assertions.
Follow up to 60b852092c98.
2020-09-29 13:39:23 +01:00
Daniel Kiss
2497820205 [AArch64] Add BTI to CFI jumptables.
With branch protection the jump to the jump table entries requires a landing pad.

Reviewed By: eugenis, tamas.petz

Differential Revision: https://reviews.llvm.org/D81251
2020-09-29 13:50:23 +02:00
David Stenberg
669dbd44d7 [IndVarSimplify] Fix Modified status for removal of overflow intrinsics
When removing an overflow intrinsic the Changed status in SimplifyIndvar
was not set, leading to the IndVarSimplify pass returning an incorrect
status.

This was caught using the check introduced by D80916.

As pointed out in the code review, a similar bug may exist for
eliminateTrunc().

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D85971
2020-09-29 13:20:59 +02:00
Vitaly Buka
01af5a47c9 [msan] Fix llvm.abs.v intrinsic
The last argument of the intrinsic is a boolean
flag to control INT_MIN handling and does
not affect msan metadata.
2020-09-29 03:52:27 -07:00
Vitaly Buka
72c59279c4 [msan] Add test for vector abs intrinsic 2020-09-29 03:52:27 -07:00
sstefan1
e47fd785ee [OpenMPOpt][Fix] Only initialize ICV initial values once.
Reviewers: jdoerfert, ggeorgakoudis

Differential Revision: https://reviews.llvm.org/D88441
2020-09-29 12:22:58 +02:00
Simon Pilgrim
5159bb7b3f [InstCombine] Add trunc(lshr(sext(x),c)) non-uniform vector tests 2020-09-29 10:56:15 +01:00
Florian Hahn
02a3467af0 [LoopDeletion] Forget loop before setting values to undef
After D71539, we need to forget the loop before setting the incoming
values of phi nodes in exit blocks, because we are looking through those
phi nodes now and the SCEV expression could depend on the loop phi. If
we update the phi nodes before forgetting the loop, we miss those users
during invalidation.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D88167
2020-09-29 10:38:44 +01:00
Max Kazantsev
41907f58c6 [SCEV][NFC] Introduce isBasicBlockEntryGuardedByCond
Currently, we have the `isLoopEntryGuardedByCond` method in SCEV, which
checks that some fact is true if we enter the loop. In fact, this is just a
particular case of the more general concept `isBasicBlockEntryGuardedByCond`
applied to the given loop's header. The logic of this code is largely
independent of the given loop and only cares about the code above it.

This patch makes this generalization. Now we can query it for any block,
and `isLoopEntryGuardedByCond` is just a particular case.

Differential Revision: https://reviews.llvm.org/D87828
Reviewed By: fhahn
2020-09-29 15:53:45 +07:00
Tres Popp
59b6daf823 Revert "OpaquePtr: Add type to sret attribute"
This reverts commit 55c4ff91bd820d72014f63dcf7f3d5a0d3397986.

Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
2020-09-29 10:31:04 +02:00
Serguei Katkov
fcb17e5e03 [IsKnownNonZero] Handle the case with non-constant phi nodes
Handle the case when all inputs of phi are proven to be non zero.

Constants are checked at the beginning of this method, before the check for depth of recursion,
so they are a partial case of non-constant phi.

Recursion depth is already handled by the function.

Reviewers: aqjune, nikic, efriedma
Reviewed By: nikic
Subscribers: dantrushin, hiraditya, jdoerfert, llvm-commits
Differential Revision: https://reviews.llvm.org/D88276
2020-09-29 15:22:10 +07:00
Florian Hahn
8112c564bf Revert "Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one.""
Looks like there is still another remaining issue:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/22273/steps/build%20libcxx%2Fmsan/logs/stdio

This reverts commit 86a20d9e34f5a9989da72097f23f3b0a44157e73.
2020-09-29 09:18:19 +01:00
Florian Hahn
6a5bcd3f18 Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one."
This version includes a small fix allowing function pointers to be
unconditionally replaced for now.

This reverts commit 4c5e4aa89b11ec3253258b8df5125833773d1b1e.
2020-09-29 09:10:27 +01:00
Sam Parker
46ff74493f [NFC][ARM] Comments and lambdas
Add some comments in LowOverheadLoops and make some lambda variables
explicit arguments instead of capturing.
2020-09-29 08:41:53 +01:00
Craig Topper
c69069f9b0 [X86] Add computeKnownBits support for PEXT.
The number of zeros in the mask provides a lower bound on the number
of leading zeros in the result.
2020-09-28 22:54:07 -07:00
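
The leading-zero bound can be checked against a small software model of PEXT. This sketch is standalone and does not use the hardware instruction or any LLVM API. All function names are hypothetical, and the 32-bit width is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits in V.
unsigned popcount32(std::uint32_t V) {
  unsigned N = 0;
  while (V != 0) {
    V &= V - 1; // clear the lowest set bit
    ++N;
  }
  return N;
}

// Number of zero bits above the highest set bit (32 for V == 0).
unsigned countLeadingZeros32(std::uint32_t V) {
  unsigned N = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Software model of PEXT: gather the bits of Src selected by Mask and pack
// them into the low bits of the result.
std::uint32_t pext32(std::uint32_t Src, std::uint32_t Mask) {
  std::uint32_t Result = 0;
  unsigned OutBit = 0;
  for (unsigned Bit = 0; Bit < 32; ++Bit) {
    if (Mask & (1u << Bit)) {
      if (Src & (1u << Bit))
        Result |= 1u << OutBit;
      ++OutBit;
    }
  }
  assert(OutBit == popcount32(Mask) && "one output bit per mask bit");
  return Result;
}

// Lower bound on the leading zeros of pext32(x, Mask) for any x: the number
// of zero bits in Mask.
unsigned minLeadingZerosOfPext(std::uint32_t Mask) {
  return 32 - popcount32(Mask);
}
```

With the mask 0x0000FF00 only eight bits can ever be set in the result, so at least 24 leading zeros are known regardless of the source operand.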
Craig Topper
3f90911e46 [X86] Add known bits test for PEXT. NFC 2020-09-28 22:54:07 -07:00
Arthur Eubanks
ee468fc3e5 [Docs][NewPM] Add note about required passes
Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88342
2020-09-28 21:45:14 -07:00
Max Kazantsev
4bab0c4016 [NFC] Use assert instead of checking the guaranteed condition
From preconditions it is known that either A dominates B or
B dominates A. If A does not dominate B, we do not really need
to check it. Assert should be enough. Should save some compile
time.
2020-09-29 11:38:45 +07:00
Max Kazantsev
ec6ab63143 [IndVars] Remove exiting conditions that are trivially true/false
When removing exiting loop conditions, we only consider checks for
which we know the exact exit count. We could also eliminate checks for
which the condition is always true/false.

Differential Revision: https://reviews.llvm.org/D87344
Reviewed By: lebedev.ri, reames
2020-09-29 11:35:32 +07:00
Yonghong Song
7ba626add6 BPF: explicitly specify bpfel triple for certain tests
Commit 54d9f743c8b0 ("BPF: move AbstractMemberAccess and
PreserveDIType passes to EP_EarlyAsPossible") changed most
of the CORE tests to run opt followed by llc, and opt requires
the target triple to be specified in the IR.

There are a few tests where little endian and big endian
report different results. For the little endian versions of
these tests, "target triple = "bpf"" will produce wrong results
if the test is executed on a big endian machine, e.g. a
PowerPC big endian machine, since target "bpf" represents
host endian and will resolve to "bpfeb".
The buildbot reported such failures when building and running
on a PowerPC big endian machine.

To fix the issue, use "target triple = "bpfel"" instead.
2020-09-28 20:25:25 -07:00
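
The host-endian behaviour described above can be modelled with a tiny helper. `resolveBpfArch` is a hypothetical function written for this illustration, not an LLVM API. It only mirrors the rule stated in the message: "bpf" follows the host, while "bpfel" and "bpfeb" are fixed.

```cpp
#include <cassert>
#include <string>

// Map a BPF arch name to the concrete endianness it denotes on a given host.
std::string resolveBpfArch(const std::string &Arch, bool HostIsBigEndian) {
  assert((Arch == "bpf" || Arch == "bpfel" || Arch == "bpfeb") &&
         "unexpected arch name in this sketch");
  if (Arch == "bpf")
    return HostIsBigEndian ? "bpfeb" : "bpfel"; // follows the host
  return Arch; // explicit endianness is host-independent
}
```

A test that expects little endian output is stable only with the explicit name, since `resolveBpfArch("bpf", true)` gives "bpfeb".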
Amara Emerson
09394476cd [AArch64][GlobalISel] Scalarize <2 x s64> G_MUL since we don't have native support for it.
Differential Revision: https://reviews.llvm.org/D88437
2020-09-28 19:29:45 -07:00
LLVM GN Syncbot
117b16b4b6 [gn build] Port 54d9f743c8b 2020-09-29 00:24:06 +00:00
Ruiling Song
62e7593653 [RegisterCoalescer] Pass Undefs to extendToIndices()
When extending the subranges, the reaching def may be an undef. When
extending such a subrange, it will try to search for the reaching
def first. If the reaching def is an undef and we did not provide 'Undefs',
findReachingDefs() will fail with the message:
"Use of $noreg does not have a corresponding definition on every path:
 LLVM ERROR: Use not jointly dominated by defs."
So we call computeSubRangeUndefs() and pass the result to extendToIndices().

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87744
2020-09-29 08:14:24 +08:00
Yonghong Song
b27683b417 BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible
Move abstractMemberAccess and PreserveDIType passes as early as
possible, right after clang code generation.

Currently, the compiler may transform the following code
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
    p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
    bpf_probe_read(buf, buf_size, p2);
  }
to
  p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
  p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
  a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
  if (a) {
    bpf_probe_read(buf, buf_size, p2);
  }
and eventually assembly code looks like
  reloc_exist = 1;
  reloc_member_offset = 10; //calculate member offset from base
  p2 = base + reloc_member_offset;
  if (reloc_exist) {
    bpf_probe_read(buf, buf_size, p2);
  }
If, during libbpf relocation resolution, reloc_exist is actually
resolved to 0 (does not exist), the reloc_member_offset relocation cannot
be resolved and will be patched with an illegal instruction.
This will cause verifier failure.

This patch attempts to address this issue by doing chaining
analysis and replacing chains with special globals right
after clang code gen. This removes the CSE possibility
described above. The IR typically looks like
  %6 = load @llvm.sk_buff:0:50$0:0:0:2:0
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6
for a particular address computation relocation.

But this transformation has another consequence, code sinking
may happen like below:
  PHI = <possibly different @preserve_*_access_globals>
  %7 = bitcast %struct.sk_buff* %2 to i8*
  %8 = getelementptr i8, i8* %7, %6

For such cases, we will not be able to generate relocations since
multiple relocations are merged into one.

This patch introduces a passthrough builtin
to prevent such optimization. Inline assembly appears to have more
impact on optimization, e.g., inlining; using a passthrough has
less impact on optimizations.

A new IR pass is introduced at the beginning of target-dependent
IR optimization, which does:
  - report fatal error if any reloc global in PHI nodes
  - remove all bpf passthrough builtin functions

Changes for existing CORE tests:
  - for clang tests, add "-Xclang -disable-llvm-passes" flags to
    avoid builtin->reloc_global transformation so the test is still
    able to check correctness for clang generated IR.
  - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command
    before "llc" command since "opt" is needed to call newly-placed
    builtin->reloc_global transformation. Add target triple in the IR
    file since "opt" requires it.
  - Since target triple is added in IR file, if a test may produce
    different results for different endianness, two tests will be
    created, one for bpfeb and another for bpfel, e.g., some tests
    for relocation of lshift/rshift of bitfields.
  - field-reloc-bitfield-1.ll has different relocations compared to
    old codes. This is because for the structure in the test,
    new code returns struct layout alignment 4 while old code
    is 8. Align 8 is more precise and permits double load. With align 4,
    the new mechanism uses 4-byte load, so generating different
    relocations.
  - test intrinsic-transforms.ll is removed. This is used to test
    cse on intrinsics so we do not lose metadata. Now metadata is attached
    to global and not instruction, it won't get lost with cse.

Differential Revision: https://reviews.llvm.org/D87153
2020-09-28 16:56:22 -07:00
Mehdi Amini
8fedbbff8a Guard find_library(tensorflow_c_api ...) by checking for TENSORFLOW_C_LIB_PATH to be set by the user
Also have CMake fail if the user provides a TENSORFLOW_C_LIB_PATH but
we can't find TensorFlow at this path.

At the moment the CMake script tries to figure out whether TensorFlow is
available on the system and enables support for it. In general it is
not desirable to customize build features this way; instead it is
preferable to let the user explicitly opt in to the features they want
to enable. This is in line with other optional external dependencies
like Z3.
There are a few reasons for this, among others:
- reproducibility: making features "magically" enabled based on whether
  we find a package on the system or not makes it harder to handle bug
  reports from users.
- user control: they can't have TensorFlow on the system and build LLVM
  without TensorFlow right now. They also would suddenly distribute LLVM
  with a different set of features unknowingly just because their build
  machine environment would change subtly.

Right now this is motivated by a user reporting build failures on their system:

.../mesa-git/llvm-git/src/llvm-project/llvm/lib/Analysis/TFUtils.cpp:23:10: fatal error: tensorflow/c/c_api.h: No such file or directory
   23 | #include "tensorflow/c/c_api.h"
      |          ^~~~~~

It looks like we detected TensorFlow at configure time but couldn't set all the paths correctly.

Differential Revision: https://reviews.llvm.org/D88371
2020-09-28 22:15:55 +00:00
Philip Reames
c4e9c9a455 [CVP] Allow two transforms in one invocation
For a call site which had both constant deopt operands and nonnull arguments, we were missing the opportunity to recognize the latter by bailing early.

This is somewhat of a speculative fix.  Months ago, I'd had a private report of performance and compile time regressions from the deopt operand folding.  I never received a test case.  However, the only possibility I see was that after that change CVP missed the nonnull fold, and we end up with a pass ordering/missed simplification issue.  So, since it's a real issue, fix it and hope.
2020-09-28 15:11:42 -07:00
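
The control-flow difference behind this fix can be sketched in isolation. Everything below is hypothetical: `CallSiteFacts`, `AppliedFolds` and both functions are stand-ins invented for illustration and do not reflect the actual CVP code. The sketch only contrasts returning after the first fold with attempting both folds.

```cpp
#include <cassert>

// Hypothetical summary of what is known about one call site.
struct CallSiteFacts {
  bool HasConstantDeoptOperand;
  bool HasProvablyNonNullArg;
};

// Which folds were applied; changed() plays the role of the pass's
// "modified" status.
struct AppliedFolds {
  bool DeoptFolded = false;
  bool NonNullMarked = false;
  bool changed() const { return DeoptFolded || NonNullMarked; }
};

// Shape of the problem: the first successful fold ends processing.
AppliedFolds processEarlyReturn(const CallSiteFacts &Facts) {
  AppliedFolds Out;
  if (Facts.HasConstantDeoptOperand) {
    Out.DeoptFolded = true;
    return Out; // bails out before looking at nonnull arguments
  }
  if (Facts.HasProvablyNonNullArg)
    Out.NonNullMarked = true;
  return Out;
}

// Shape of the fix: try both folds and accumulate the result.
AppliedFolds processBoth(const CallSiteFacts &Facts) {
  AppliedFolds Out;
  if (Facts.HasConstantDeoptOperand)
    Out.DeoptFolded = true;
  if (Facts.HasProvablyNonNullArg)
    Out.NonNullMarked = true;
  assert(Out.changed() ==
         (Facts.HasConstantDeoptOperand || Facts.HasProvablyNonNullArg));
  return Out;
}
```

For a call site where both facts hold, the early-return shape never marks the nonnull argument, while the accumulating shape applies both folds in one invocation.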
Fangrui Song
e0ac770663 [EHStreamer] Simplify sharedTypeIDs with std::mismatch
(Note that EHStreamer.cpp is largely undertested. The only test checking the prefix sharing is CodeGen/WebAssembly/eh-lsda.ll)
2020-09-28 15:05:59 -07:00
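
Counting a shared prefix with std::mismatch can look like the sketch below. It works on plain vectors of ints rather than landing pad records. `sharedPrefixLength` is a hypothetical name, and the four-iterator overload used here requires C++14.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Length of the common prefix of two ID lists.
std::size_t sharedPrefixLength(const std::vector<int> &L,
                               const std::vector<int> &R) {
  // std::mismatch returns iterators to the first position where the ranges
  // differ, or to the end of the shorter range if one is a prefix of the other.
  auto Diff = std::mismatch(L.begin(), L.end(), R.begin(), R.end());
  std::size_t Length = static_cast<std::size_t>(Diff.first - L.begin());
  assert(Length <= std::min(L.size(), R.size()));
  return Length;
}
```

Passing both end iterators keeps the call safe when the second list is shorter than the first.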
Craig Topper
369d12cf0e [X86] Add support for calling SimplifyDemandedBits on the input of PDEP with a constant mask.
We can do several optimizations for PDEP using computeKnownBits and SimplifyDemandedBits:

- If the MSBs of the output aren't demanded, those MSBs of the mask input aren't demanded either. We need to keep the most significant demanded bit of the mask and any mask bits before it.
- The number of possible ones in the mask determines how many bits of the lsbs of the other operand are demanded. Any bits of the mask we don't demand by the previous rule should not be counted.
- The result will have zeros in any position that the mask is zero.
- Since non-mask input bits can only be output in the original position or a higher bit position, the result will have at least as many trailing zeroes as the non-mask input.

Differential Revision: https://reviews.llvm.org/D87883
2020-09-28 14:21:30 -07:00
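
Three of the PDEP properties listed in this message can be observed on a software model of the instruction. The sketch below is standalone. The function names are hypothetical and the 32-bit width is an assumption. It does not model the demanded-bits rule for the mask's MSBs.

```cpp
#include <cassert>
#include <cstdint>

// Number of zero bits below the lowest set bit (32 for V == 0).
unsigned countTrailingZeros32(std::uint32_t V) {
  if (V == 0)
    return 32;
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Software model of PDEP: scatter the low bits of Src to the positions of
// the set bits in Mask, lowest first.
std::uint32_t pdep32(std::uint32_t Src, std::uint32_t Mask) {
  std::uint32_t Result = 0;
  unsigned InBit = 0;
  for (unsigned Bit = 0; Bit < 32; ++Bit) {
    if (Mask & (1u << Bit)) {
      if (Src & (1u << InBit))
        Result |= 1u << Bit;
      ++InBit;
    }
  }
  assert((Result & ~Mask) == 0 && "result is zero wherever the mask is zero");
  return Result;
}
```

With the mask 0xF0 only the low four bits of the source are read, so 0xFFFFFFF5 and 0x5 deposit to the same result. A source bit can only move to its own position or higher, which is why the trailing zero count never shrinks.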
Craig Topper
a9c2ef5d71 [X86] Add tests for D87883. NFC 2020-09-28 14:21:29 -07:00
Amara Emerson
635632451e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
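
One common way to expand a vector select is with AND, OR and NOT on a per-lane mask. The sketch below shows that shape on four 32-bit lanes. Whether the patch emits exactly these operations is an assumption. `Vec4` and `selectViaBitOps` are hypothetical names, and each mask lane is assumed to be all-ones or all-zeros.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Lane-wise select built from bitwise operations: an all-ones mask lane
// picks the lane from A, an all-zeros mask lane picks the lane from B.
Vec4 selectViaBitOps(const Vec4 &Mask, const Vec4 &A, const Vec4 &B) {
  Vec4 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I) {
    assert((Mask[I] == 0u || Mask[I] == 0xFFFFFFFFu) &&
           "mask lane must be all-zeros or all-ones");
    Result[I] = (A[I] & Mask[I]) | (B[I] & ~Mask[I]);
  }
  return Result;
}
```

A boolean condition vector would first need each lane extended to the full lane width for this form to be valid.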
David Tenty
c82257db65 [CMake][AIX] Limit tools in external project build
This is a follow on to D85329 which disabled some llvm tools in the
runtimes build due to XCOFF64 limitations. This change disables them
in other external project builds as well, when no list of tools is
specified in the arguments.

Reviewed By: hubert.reinterpretcast, stevewan

Differential Revision: https://reviews.llvm.org/D88310
2020-09-28 16:59:25 -04:00
Nico Weber
cce683e16e [gn build] Re-run CompletionModelCodegen when input json files change 2020-09-28 16:58:00 -04:00
Amara Emerson
34b690d29f Revert "Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar.""
This isn't a real issue with the codegen, it's a previously known bug in clang which
causes non-deterministic failures due to garbage bits in undef registers being
used in saturating instructions.

I'm disabling the result checking for the test until this issue is resolved.

This reverts commit 6c8168324b5329c94fe7e8f9a1619802091b9bec.
2020-09-28 13:44:51 -07:00
Craig Topper
1804ba7f5b [X86] Use inlineasm flag output for the _bittest* intrinsics.
Instead of explicitly emitting a setc in the inline asm instructions,
we can use flag output. This allows the backend to use the flag
directly if it is needed by a branch. Previously we needed a test
instruction to convert the register back to a flag.

If the flag can't be used directly, the backend will emit a setcc.

Differential Revision: https://reviews.llvm.org/D87888
2020-09-28 13:33:22 -07:00