mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00
Commit Graph

213908 Commits

Author SHA1 Message Date
Jonas Hahnfeld
8976f41d83 [AArch64] Materialize FP constant in code for large code model
When using the large code model with FastISel (for example via
clang -O0 which adds the optnone attribute), FP constants could
still be materialized using adrp + ldr. Unconditionally enable
the existing path for MachO to materialize the constant in code.

For testing, restore literal_pools_float.ll to exercise the constant
pool and add two optnone-functions that return a float and a double,
respectively. Consolidate fpimm.ll and add a new fast-isel-fpimm.ll
to check the code paths taken with FastISel.

Differential Revision: https://reviews.llvm.org/D99607
2021-04-07 21:02:05 +02:00
Arthur Eubanks
44766d7c10 Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()"
This reverts commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c.

This wasn't NFC as initially thought. Needed for D99707.
2021-04-07 11:40:44 -07:00
Abhina Sreeskantharajan
99295a0db0 [Windows] Remove global OF_None flag for Windows in ToolOutputFiles
Since we have created a new OF_TextWithCRLF flag, we no longer need to worry about OF_Text flag turning on CRLF translation. I can remove this workaround I added to globally open all ToolOutputFiles as binary on Windows.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D100034
2021-04-07 14:10:04 -04:00
Craig Topper
7bc472c724 [RISCV] Support vslide1up/down intrinsics for SEW=64 on RV32.
This can't use our normal strategy of splatting the scalar and using
a .vv operation instead of .vx.

Instead this patch bitcasts the vector to the equivalent SEW=32
vector and inserts the scalar parts using two vslide1up/down. We
do that unmasked and apply the mask separately at the end with
a vmerge.

For vslide1up there may be some other options here like getting
i64 into element 0 and using vslideup.vi with this vector as
vd and the original source as vs1. Masking would still need to
be done afterwards.

That idea doesn't work for vslide1down. We need to slidedown and
then insert a single scalar at vl-1 which we could do with a
vslideup, but that assumes vl > 0 which I don't think we can assume.

The i32 double slide1down implemented here is the best I could come
up with and I just made vslide1up consistent.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D99910
2021-04-07 10:44:53 -07:00
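The double-slide approach described in 7bc472c724 above can be modelled on plain lists. This is only a sketch of the idea, not the code from the patch: the helper names are hypothetical, vectors are Python lists, and a little-endian layout is assumed (the low 32-bit half of each 64-bit element comes first). The slides run unmasked, and a separate `vmerge`-style select applies the mask at the end.

```python
MASK32 = 0xFFFFFFFF

def to_sew32(vec64):
    """Reinterpret 64-bit elements as pairs of 32-bit elements (low half first)."""
    out = []
    for x in vec64:
        out += [x & MASK32, (x >> 32) & MASK32]
    return out

def to_sew64(vec32):
    """Reinterpret pairs of 32-bit elements as 64-bit elements."""
    return [vec32[i] | (vec32[i + 1] << 32) for i in range(0, len(vec32), 2)]

def slide1up(vec, scalar):
    """Model of an unmasked slide1up: element 0 becomes the scalar."""
    if not vec:
        return []
    return [scalar] + vec[:-1]

def slide1down(vec, scalar):
    """Model of an unmasked slide1down: the last element becomes the scalar."""
    if not vec:
        return []
    return vec[1:] + [scalar]

def slide1up_i64(vec64, scalar):
    """Slide a 64-bit scalar in at the front using two 32-bit slides."""
    v = to_sew32(vec64)
    v = slide1up(v, (scalar >> 32) & MASK32)  # high half goes in first
    v = slide1up(v, scalar & MASK32)          # low half ends up at index 0
    return to_sew64(v)

def slide1down_i64(vec64, scalar):
    """Slide a 64-bit scalar in at the back using two 32-bit slides."""
    v = to_sew32(vec64)
    v = slide1down(v, scalar & MASK32)          # low half goes in first
    v = slide1down(v, (scalar >> 32) & MASK32)  # high half ends up last
    return to_sew64(v)

def vmerge(mask, on_true, on_false):
    """Apply the mask afterwards: pick on_true where the mask bit is set."""
    return [t if m else f for m, t, f in zip(mask, on_true, on_false)]
```

The order of the two halves is what differs between the two directions: slide1up inserts the high half first, slide1down inserts the low half first.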
Craig Topper
2ccd678696 [SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR
This allows FoldConstantArithmetic to handle SPLAT_VECTOR in
addition to BUILD_VECTOR. This allows it to support scalable
vectors. I'm also allowing fixed length SPLAT_VECTOR which is
used by some targets, but I'm not familiar enough to write tests
for those targets.

I had to block this function from running on CONCAT_VECTORS to
avoid calling getNode for a CONCAT_VECTORS of 2 scalars.
This can happen because the 2 operand getNode calls this
function for any opcode. Previously we were protected because
CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR
before that call. But it's not always possible to fold a CONCAT_VECTORS
of SPLAT_VECTORs, and we don't even try.

This fixes PR49781 where DAG combine thought constant folding
should be possible, but FoldConstantArithmetic couldn't do it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D99682
2021-04-07 10:03:33 -07:00
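As a rough illustration of what 2ccd678696 above enables, the toy model below folds an add of two constant vectors. It is not LLVM's implementation: the tuple encoding (`"build"` with an element list, `"splat"` with a single scalar) and the function name are assumptions made for this sketch. The point is that a splat can be folded by combining the two scalars once, without knowing the element count, which is why it also suits scalable vectors.

```python
def fold_add(lhs, rhs, bits=32):
    """Fold lhs + rhs for constant vectors in a toy encoding.

    ("build", [c0, c1, ...]) models a BUILD_VECTOR of constants.
    ("splat", c)             models a SPLAT_VECTOR of one constant.
    Returns the folded node, or None when this toy model cannot fold.
    """
    mask = (1 << bits) - 1
    if lhs[0] == "splat" and rhs[0] == "splat":
        # One scalar fold covers every lane, whatever the vector length is.
        return ("splat", (lhs[1] + rhs[1]) & mask)
    if lhs[0] == "build" and rhs[0] == "build" and len(lhs[1]) == len(rhs[1]):
        # Fixed-length vectors are folded lane by lane.
        return ("build", [(a + b) & mask for a, b in zip(lhs[1], rhs[1])])
    return None  # every other combination is left alone in this sketch
```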
Craig Topper
004663cc80 [LoopIdiomRecognize] Minor cleanups to the FFS idiom matching. NFC
-Make use of the CreateShl/LShr/AShr methods that take a uint64_t
instead of creating a ConstantInt for 1 ourselves.
-Use Builder.getInt1 or ConstantInt::getBool instead of a conditional.
-Pull out repeated calls to getType.
2021-04-07 10:03:14 -07:00
Dimitry Andric
c36025cd71 Avoid testing for libc++ internal macros after D99834
As D99834 was meant specifically for FreeBSD, which still uses the older
non-trivial std::pair copy constructors, test for `__FreeBSD__` instead
of relying on a macro which is an internal detail of libc++.

Noted by Louis Dionne.
2021-04-07 18:52:41 +02:00
Roman Lebedev
3d5a1f14f3 [InstCombine] foldAddWithConstant(): don't deal with non-immediate constants
All of the code that handles general constant here (other than the more
restrictive APInt-dealing code) expects that it is an immediate,
because otherwise we won't actually fold the constants, and increase
instruction count. And it isn't obvious why we'd be okay with
increasing the number of constant expressions,
those still will have to be run.

But after 2829094a8e252d04f13aabdf6f416c42a06af695
this could also cause endless combine loops.
So actually properly restrict this code to immediates.
2021-04-07 19:50:19 +03:00
Sanjay Patel
7d1726e81c [InstCombine] avoid infinite loop from partial undef vectors
This fixes the examples from
D99674 and
https://llvm.org/PR49878

The matchers succeed on partial undef/poison vector constants,
but the transform creates a full 'not' (-1) constant, so it
would undo a demanded vector elements change triggered by the
extractelement.

Differential Revision: https://reviews.llvm.org/D100044
2021-04-07 12:18:12 -04:00
wlei
d923efafdf [CSSPGO] Fix incorrect probe distribution factor computation in top-down inliner
We see a regression related to a low probe factor (0.01) which prevents some callsites from being promoted in ICPPass and later causes a missed inline in the CGSCC inliner. The root cause is a redundant (second) multiplication by the probe factor, which this change fixes.

`Sum` is multiplied by the factor right after findCallSamples, but when it is later used as the parameter to setProbeDistributionFactor, it is multiplied once again.

This change could get ~2% perf back on the mcf benchmark. In mcf, the corresponding factor was previously 1; a recent feature introduced the <1 factor, which then triggered this bug.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D99787
2021-04-07 08:48:59 -07:00
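The arithmetic behind d923efafdf above can be shown with a toy calculation. The function names are hypothetical and this is not the inliner's code; it only shows why applying the same factor twice goes unnoticed when the factor is 1 and becomes severe when the factor is small. Exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

def apply_factor_once(sample_count, factor):
    """Intended behaviour: the count is scaled by the probe factor once."""
    return sample_count * factor

def apply_factor_twice(sample_count, factor):
    """Buggy behaviour: the already scaled count is scaled a second time."""
    return apply_factor_once(sample_count, factor) * factor
```

With a factor of 1 both functions agree, which matches the observation that the bug only appeared once a factor below 1 was introduced. With a factor of 1/100 the doubled scaling shrinks the value by a further factor of 100.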
Simon Pilgrim
25459ad683 [X86][AVX] Add HADD lane crossing test
This used to work before rG77d625f8d8aa, but we now merge the shuffles across the fadd, resulting in a hadd that requires a lane-crossing post shuffle, which we don't permit on AVX1 targets.
2021-04-07 16:43:47 +01:00
Abhina Sreeskantharajan
2f65469a10 [SystemZ][z/OS][TableGen] TableGen files should be text
This patch sets tablegen files as text. It should have no effect on Windows after this patch landed https://reviews.llvm.org/rG82b3e28e836d2f5c8cfd6e1047b93c088522365a.

Reviewed By: anirudhp

Differential Revision: https://reviews.llvm.org/D100036
2021-04-07 11:23:00 -04:00
Sander de Smalen
4bfea803ed [SVE] Remove checks for warnings in scalable-vector tests.
After D98856 these tests will by default break (fatal_error) if any of
the wrong interfaces are used, so there's no longer a need to have a
RUN line that checks for a warning message emitted by the compiler.
2021-04-07 15:59:32 +01:00
Sam Clegg
d2a91d13b3 [WebAssembly] Improve error messages regarding missing indirect function table. NFC
Use report_fatal_error here since this is an internal error, and not
something the user can/should be trying to fix.

Also distinguish between the symbol being missing and the symbol having
the wrong type.

We have a failure internally where the symbol is missing.  Currently
trying to reduce the test case to something we can attach to an llvm
bug.

Differential Revision: https://reviews.llvm.org/D99960
2021-04-07 07:58:43 -07:00
Sebastian Neubauer
b2479f1ec0 [AMDGPU] Update SGPRSpillVGPRCSR name. NFC
The struct is now used for both callee- and caller-save registers.
The frame index is not set for entrypoints, as we do not need to save
the registers then.
Update the struct name to reflect that.

Differential Revision: https://reviews.llvm.org/D99722
2021-04-07 16:30:40 +02:00
Jingu Kang
d5db750efa [NPM] Fix typo in isLTOPreLink for loop rotate
Differential Revision: https://reviews.llvm.org/D100033
2021-04-07 15:08:37 +01:00
Nico Weber
81631e4f6c Revert "[clang] Speedup line offset mapping computation"
This reverts commit 6951b72334bbe4c189c71751edc1e361d7b5632c.
Breaks several bots, see comments on https://reviews.llvm.org/D99409
2021-04-07 09:42:11 -04:00
Simon Pilgrim
4e8702a6e4 [X86] Improve optimizeCompareInstr for signed comparisons after AND/OR/XOR instructions
Extend D94856 to handle 'and', 'or' and 'xor' instructions as well

We still fail on many i8/i16 cases as the test and the logic-op are performed on different widths
2021-04-07 14:28:42 +01:00
Alexey Bataev
21489b2314 [SLP]Avoid multiple attempts to vectorize CmpInsts.
There is no need to look through and/or try to vectorize operands of the
CmpInst instructions during attempts to find/vectorize min/max
reductions. The compiler implements post-analysis of the CmpInsts, so we can
skip extra attempts in tryToVectorizeHorReductionOrInstOperands and save
compile time.

Differential Revision: https://reviews.llvm.org/D99950
2021-04-07 06:15:42 -07:00
Jay Foad
1f77909f4d [AMDGPU] SIFoldOperands: don't dump extra '\n' after MachineInstr. NFC. 2021-04-07 14:13:00 +01:00
Sanjay Patel
58d17b652b [InstCombine] move abs transform to helper function; NFC
The swap of the operands can affect later transforms that
are expecting a constant as operand 1. I don't think we
can trigger a bug with the current code, but I hit that
problem while drafting a new transform for min/max intrinsics.
2021-04-07 08:35:07 -04:00
Sanjay Patel
de9270d0bc [InstCombine] add tests for not-of-min/max; NFC 2021-04-07 08:35:06 -04:00
Simon Pilgrim
95e478b56d [X86] Add AND/OR/XOR signed-comparison overflow test cases for PR48768
D94856 covered the BMI cases where we had existing tests, this adds missing AND/OR/XOR test cases
2021-04-07 13:31:54 +01:00
serge-sans-paille
926c5d1cc5 [clang] Speedup line offset mapping computation
Clang spends a decent amount of time in the LineOffsetMapping::get(...)
function. This function used to be vectorized (through SSE2), then the
optimization got dropped because the sequential version was on par
performance-wise.

This provides an optimization of the sequential version that works on a word at
a time, using (documented) bithacks to provide a portable vectorization.

When preprocessing the sqlite amalgamation, this yields a sweet 3% speedup.

Differential Revision: https://reviews.llvm.org/D99409
2021-04-07 14:04:32 +02:00
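The word-at-a-time idea from 926c5d1cc5 above can be sketched with the classic "has zero byte" bithack. This is an illustration under assumptions, not the patch itself: it uses Python integers masked to 64 bits, only looks for `\n` (no `\r` handling), and the function names are made up for the example. Words that contain no newline are skipped with a single test; only words that do contain one are inspected byte by byte.

```python
ONES = 0x0101010101010101   # 0x01 in every byte
HIGHS = 0x8080808080808080  # 0x80 in every byte

def has_zero_byte(word):
    """True if any of the 8 bytes of a 64-bit word is zero."""
    return ((word - ONES) & ~word & HIGHS) != 0

def has_byte(word, value):
    """True if any byte of the word equals value (XOR turns matches into zeros)."""
    return has_zero_byte(word ^ (ONES * value))

def line_offsets(buf):
    """Return the offset at which each line of buf starts."""
    offsets = [0]
    i = 0
    n = len(buf)
    while i + 8 <= n:
        word = int.from_bytes(buf[i:i + 8], "little")
        if has_byte(word, 0x0A):
            # Slow path: at least one newline in this word, find them all.
            for j in range(i, i + 8):
                if buf[j] == 0x0A:
                    offsets.append(j + 1)
        i += 8
    # Tail shorter than one word is scanned byte by byte.
    for j in range(i, n):
        if buf[j] == 0x0A:
            offsets.append(j + 1)
    return offsets
```

The bithack needs no vector instructions, which is what makes it portable.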
Simon Pilgrim
9ae167a79c [X86] Improve optimizeCompareInstr for signed comparisons after BZHI instructions
Extend D94856 to handle 'bzhi' instructions as well
2021-04-07 12:07:26 +01:00
Yevgeny Rouban
69596e5124 [Statepoint Lowering] Allow other than N byte sized types in deopt bundle
I do not see any bit-width restriction from the point of the
LLVM Lang Ref - Operand Bundles on the types of the deopt bundle
operands. Statepoint Lowering seems to be able to work with any
types.
This patch relaxes the two related assertions and adds a new test
for this change.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100006
2021-04-07 17:48:31 +07:00
Simon Pilgrim
52c5a6d304 [X86] Add BZHI test case for PR48768
D94856 covered the BMI cases where we had existing tests, this adds a missing BZHI test case
2021-04-07 11:38:47 +01:00
Kirill Bobyrev
f1dff186b8 [CMake] try creating symlink first on windows
`-E create_symlink` is available on Windows since CMake 3.13 (LLVM now uses 3.13.4).
It may need administrator privileges or Developer Mode to be enabled (Windows 10).
See https://cmake.org/cmake/help/latest/release/3.13.html

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D99170
2021-04-07 11:23:10 +02:00
Stefan Gränitz
9c76412adb [Orc][examples] Add missing FileCheck for lit test and polish output 2021-04-07 11:12:20 +02:00
Roman Lebedev
4792a641c6 Reland [InstCombine] Fold ((X - Y) - Z) to X - (Y + Z) (PR49858)
This reverts commit a547b4e26b311e417cd51100e379693f51a3f448,
relanding commit 31d219d2997fed1b7dc97e0adf170d5aaf65883e,
which was reverted because there was a conflicting inverse transform,
which was causing an endless combine loop, which has now been adjusted.

Original commit message:

https://alive2.llvm.org/ce/z/67w-wQ

We prefer `add`s over `sub`, and this particular xform
allows further folds to happen:

Fixes https://bugs.llvm.org/show_bug.cgi?id=49858
2021-04-07 12:06:25 +03:00
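The fold in 4792a641c6 above relies on the identity ((X - Y) - Z) = X - (Y + Z) holding under wrapping integer arithmetic. The commit message points to an Alive2 proof; the snippet below is merely a brute-force check on a small bit width, with helper names chosen for this sketch.

```python
def wrap_sub(a, b, bits):
    """Subtraction modulo 2**bits, as for a fixed-width integer."""
    return (a - b) & ((1 << bits) - 1)

def wrap_add(a, b, bits):
    """Addition modulo 2**bits, as for a fixed-width integer."""
    return (a + b) & ((1 << bits) - 1)

def sub_of_sub_fold_holds(bits):
    """Check ((x - y) - z) == x - (y + z) for every value of the given width."""
    values = range(1 << bits)
    return all(
        wrap_sub(wrap_sub(x, y, bits), z, bits) == wrap_sub(x, wrap_add(y, z, bits), bits)
        for x in values
        for y in values
        for z in values
    )
```

Calling `sub_of_sub_fold_holds(4)` walks all 4096 combinations of 4-bit operands. The check says nothing about flags such as nsw/nuw, which a real transform also has to consider.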
Roman Lebedev
692350022e [InstCombine] Restrict "C-(X+C2) --> (C-C2)-X" fold to immediate constants
I.e., if any/all of the constants is an expression, don't do it.
Since those constants won't reduce into an immediate,
but would be left as a constant expression, they could cause
endless combine loops after 31d219d2997fed1b7dc97e0adf170d5aaf65883e
added an inverse transformation.
2021-04-07 12:06:24 +03:00
Roman Lebedev
a4b8059ee2 [NFC][InstCombine] Add sub-of-sub tests with constant expressions
These would cause endless combine loop after 31d219d2997fed1b7dc97e0adf170d5aaf65883e.
2021-04-07 12:06:24 +03:00
Thomas Preud'homme
c1f557c32d [PowerPC, test] Fix use of undef FileCheck var
LLVM test CodeGen/PowerPC/ppc-disable-non-volatile-cr.ll tries to check
for the absence of a sequence of instructions with several CHECK-NOT
with one of those directives using a variable defined in another.
However CHECK-NOT are checked independently so that is using a variable
defined in a pattern that should not occur in the input.

This commit replaces the occurrence of the variable with the regex used in its
definition, thereby making each CHECK-NOT independent.

Reviewed By: NeHuang, nemanjai

Differential Revision: https://reviews.llvm.org/D99880
2021-04-07 09:45:21 +01:00
Thomas Preud'homme
a4ad4af918 [Coroutines, test] Fix use of var defined in CHECK-NOT
LLVM test Transforms/Coroutine/coro-split-sink-lifetime-O2.ll tries to
check for the absence of a sequence of instructions with several
CHECK-NOT with one of those directives using a variable defined in
another. However CHECK-NOT are checked independently so that is using a
variable defined in a pattern that should not occur in the input.

This commit simplifies the CHECK-NOT block to only check for the
presence of any lifetime start marker since that is effectively what
the test was testing at the moment.

Reviewed By: junparser

Differential Revision: https://reviews.llvm.org/D99856
2021-04-07 09:42:58 +01:00
Qiu Chaofan
0a05471d5c [PowerPC] Fix use check of swap-reduction
This will fix swap-reduction in DAGISel for cases where COPY_TO_REGCLASS
has multiple uses.
2021-04-07 15:55:52 +08:00
Stefan Gränitz
60b56b2c6a [Orc][examples] Add lit ToolSubst for LLJITWithRemoteDebugging example
The test case added in 258f055ed936 was lacking two important details for the test infrastructure. ae217bf1f327 added the executable to LLVM_TEST_DEPENDS in CMake to make sure the executable gets built before we run the test suite. This patch adds a ToolSubst for the executable in LIT, which replaces the tool invocation in the RUN line with an absolute path. It makes sure we don't accidentally run some other tool from the user's PATH. The test works without it in case LLVM's main binary directory happens to be the working directory (which is default apparently). Configurations that don't build the examples ignore failures for this ToolSubst (and won't run the test).

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D99931
2021-04-07 09:47:04 +02:00
LemonBoy
67de6f40d0 [X86] Initialize TargetOptions::StackProtectorGuardOffset member to its default value
D88631 introduced a set of knobs to tweak how the stack protector is codegen'd for x86 targets, including the offset from the base register where the stack cookie is located. The `StackProtectorGuardOffset` field in `TargetOptions` was left uninitialized instead of being reset to its neutral value -1, making it possible to emit nonsensical code if the frontend doesn't change the field value at all before feeding the `TargetOptions` to the target machine initializer.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D99952
2021-04-07 09:04:25 +02:00
Max Kazantsev
c4958f4b06 [SCEV] Fix false-positive recognition of simple recurrences. PR49856
A value from a reachable block may come to a Phi node as its input from an
unreachable block. This may confuse matchSimpleRecurrence, which
has no access to DomTree and can falsely recognize something as a recurrence
because of this effect, as the attached test shows.

Patch `ae7b1e` deals with half of this problem, but it only accounts for
the case when an unreachable instruction comes to Phi as an input.

This patch provides a generalization by checking that no Phi block's
predecessor is unreachable (no matter what the input is).

Differential Revision: https://reviews.llvm.org/D99929
Reviewed By: reames
2021-04-07 13:55:17 +07:00
Petr Hosek
1f23f9cde8 Revert "[InstCombine] Fold ((X - Y) - Z) to X - (Y + Z) (PR49858)"
This reverts commit 31d219d2997fed1b7dc97e0adf170d5aaf65883e which
causes an infinite loop when compiling the XRay runtime.
2021-04-06 22:30:28 -07:00
Jonas Devlieghere
0567faf9a5 [dsymutil] Stop emulating dsymutil-classic CIE caching behavior
Stop emulating dsymutil-classic which only cached the last used CIE for
reuse.
2021-04-06 20:15:41 -07:00
Jonas Devlieghere
0bfa635758 [dsymutil] Don't keep old abbreviations
Don't keep the old abbreviations around. This code existed for
compatibility with dsymutil-classic.
2021-04-06 19:50:17 -07:00
Jonas Devlieghere
34cea44862 [dsymutil] Don't emit .debug_pubnames and .debug_pubtypes
Consider the .debug_pubnames and .debug_pubtypes their own kind of
accelerator and stop emitting them together with the Apple-style
accelerator tables. The only reason we were still emitting both was for
(byte-for-byte) compatibility with dsymutil-classic.

 - This patch adds a new accelerator table kind "Pub" which can be
   specified with --accelerator=Pub.
 - This patch removes the ability to emit both pubnames/types and apple
   style accelerator tables. I don't think anyone is relying on that but
   it's worth pointing out.
 - This patch removes the --minimize option and makes this behavior the
   default. Specifying the flag will result in a warning but won't abort
   the program.

Differential revision: https://reviews.llvm.org/D99907
2021-04-06 19:01:45 -07:00
Alex Orlov
d60a74e837 Removed redundant code. 2021-04-07 05:37:46 +04:00
Yevgeny Rouban
2070f722a2 [NewPM] Set verify-cfg-preserved=1 by default for debug builds 2021-04-07 08:34:30 +07:00
Craig Topper
8c93a46569 [RISCV] Add an assertion to the ReplaceNodeResults handling of bitcasts to make sure the VT is always a scalar integer. 2021-04-06 16:48:40 -07:00
Nicolás Alvarez
0bad578cf3 [docs] Fix doxygen comments wrongly attached to the llvm namespace
Looking at the Doxygen-generated documentation for the llvm namespace
currently shows all sorts of random comments from different parts of the
codebase. These are mostly caused by:

- File doc comments that aren't marked with \file, so they're attached to
  the next declaration, which is usually "namespace llvm {".
- Class doc comments placed before the namespace rather than before the
  class.
- Code comments before the namespace that (in my opinion) shouldn't be
  extracted by doxygen at all.

This commit fixes these comments. The generated doxygen documentation now
has proper docs for several classes and files, and the docs for the llvm
and llvm::detail namespaces are now empty.

Reviewed By: thakis, mizvekov

Differential Revision: https://reviews.llvm.org/D96736
2021-04-07 01:20:18 +02:00
Craig Topper
9cb5c40827 [RISCV] Don't custom type legalize fixed vector to scalar integer bitcasts if the fixed vector type isn't legal.
We encountered a hang in our internal code base. I'm having trouble
creating a test case because the test that hit it was testing some
code that is not upstream.
2021-04-06 15:00:33 -07:00
Craig Topper
70fa9bf05d [MachineValueTypes] Add blank lines between floating point vectors with different element types. NFC 2021-04-06 14:51:56 -07:00
Sidharth Baveja
de093310b0 [SplitEdge] Update SplitCriticalEdge to return a nullptr only when the edge is not critical
Summary:
The function SplitCriticalEdge (called by SplitEdge) can return a nullptr in
cases where the edge is critical. SplitEdge uses SplitCriticalEdge assuming it
can always split all critical edges, which is an incorrect assumption.

The three cases where the function SplitCriticalEdge will return a nullptr are:
1. DestBB is an exception block
2. Options.IgnoreUnreachableDests is set to true and
isa(DestBB->getFirstNonPHIOrDbgOrLifetime()) is not equal to a nullptr
3. LoopSimplify form must be preserved (Options.PreserveLoopSimplify is true)
and it cannot be maintained for a loop due to indirect branches

These situations are handled in the following way:
1. Modified the function ehAwareSplitEdge originally from
llvm/lib/Transforms/Coroutines/CoroFrame.cpp to handle the cases when the DestBB
is an exception block. This function is called directly in SplitEdge.
SplitEdge does not call SplitCriticalEdge in this case
2. Options.IgnoreUnreachableDests is set to false by default, so this situation
does not apply.
3. Return a nullptr in this situation since the SplitCriticalEdge also returned
nullptr. Nothing we can do in this case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D94619
2021-04-06 21:24:40 +00:00
Philip Reames
ecf630041a Replace calls to IntrinsicInst::Create with CallInst::Create [nfc]
There is no IntrinsicInst::Create. These are binding to the method in the super type. Be explicit about which method is being called.
2021-04-06 13:23:58 -07:00