1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

208449 Commits

Author SHA1 Message Date
Kazu Hirata
49434d3cd6 [IR] Remove isPowerOf2ByteWidth
The predicate used to be used with the C backend, which was removed on
Mar 23, 2012 in commit 64a232343aa649fdacf78698da3e4d5737dee56a.  It
seems to be unused since then.
2020-12-14 23:00:17 -08:00
Max Kazantsev
81955ada9c [Test] Test on assertion failure with expensive SCEV range inference 2020-12-15 13:47:19 +07:00
Kazu Hirata
c82f0756af [Analysis] Use llvm::erase_value (NFC) 2020-12-14 22:40:13 -08:00
Hsiangkai Wang
f039b9862f [RISCV] Define vadd/vsub/vrsub intrinsics and lower to V instructions.
This patch is based on the proposal from Roger Ferrer Ibanez.
http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html

Differential Revision: https://reviews.llvm.org/D93013
2020-12-15 12:56:49 +08:00
LLVM GN Syncbot
851d64b940 [gn build] Port d2ed9d6b7ec 2020-12-15 03:35:00 +00:00
Nico Weber
42a753fa7c Reland "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 841f9c937f6e593c926a26aedf054436eb807fe6.
The change landed many months ago; something else broke those tests.
2020-12-14 22:34:23 -05:00
Nico Weber
e3a964236e Revert "[MachineDebugify] Insert synthetic DBG_VALUE instructions"
This reverts commit 2a5675f11d3bc803a245c0e2a3b47491c8f8a065.
The tests it adds fail: https://reviews.llvm.org/D78135#2453736
2020-12-14 22:14:48 -05:00
Nico Weber
f3e89af5f2 Revert "[Debugify] Support checking Machine IR debug info"
This reverts commit c4d2d4337d50bed3cafd564daece1a197005b22b.
Necessary to revert 2a5675f11d3bc803a245c0e.
2020-12-14 22:14:48 -05:00
Luo, Yuanke
fec37b307b [X86] Add test case for commit e52bc1d2bba794b.
Differential Revision: https://reviews.llvm.org/D93173
2020-12-15 11:14:16 +08:00
Nico Weber
d107b74361 Revert "[amdgpu] Default to code object v3"
This reverts commit 4b2e7d0215021d0d1df1a6319884b21d33936265.
Breaks check-clang, see https://reviews.llvm.org/D93258#2453600
2020-12-14 22:01:26 -05:00
Qiu Chaofan
8f2e4bf3b0 [NFC] [Legalizer] Use common method for expanding fp-to-int operands
Reviewed By: RKSimon, steven.zhang

Differential Revision: https://reviews.llvm.org/D92481
2020-12-15 10:45:40 +08:00
River Riddle
28caccdfb8 [mlir][Inliner] Refactor the inliner to use nested pass pipelines instead of just canonicalization
Now that passes have support for running nested pipelines, the inliner can now allow for users to provide proper nested pipelines to use for optimization during inlining. This revision also changes the behavior of optimization during inlining to optimize before attempting to inline, which should lead to a more accurate cost model and prevents the need for users to schedule additional duplicate cleanup passes before/after the inliner that would already be run during inlining.

Differential Revision: https://reviews.llvm.org/D91211
2020-12-14 18:09:47 -08:00
Xiang1 Zhang
6d8bb495f3 [Debugify] Support checking Machine IR debug info
Add mir-check-debug pass to check MIR-level debug info.

For IR-level, currently, LLVM have debugify + check-debugify to generate
and check debug IR. Much like the IR-level pass debugify, mir-debugify
inserts sequentially increasing line locations to each MachineInstr in a
Module, But there is no equivalent MIR-level check-debugify pass, So now
we support it at "mir-check-debug".

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D91595
2020-12-14 17:53:46 -08:00
Xiang1 Zhang
4721afcdaa Revert "[Debugify] Support checking Machine IR debug info"
This reverts commit 57a3d9ec4a8c1422f07264bed9f12a4ea416707e.
2020-12-14 17:48:49 -08:00
Xiang1 Zhang
437fc18cdb [Debugify] Support checking Machine IR debug info
Add mir-check-debug pass to check MIR-level debug info.

For IR-level, currently, LLVM have debugify + check-debugify to generate
and check debug IR. Much like the IR-level pass debugify, mir-debugify
inserts sequentially increasing line locations to each MachineInstr in a
Module, But there is no equivalent MIR-level check-debugify pass, So now
we support it at "mir-check-debug".

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D95195
2020-12-14 17:38:01 -08:00
Craig Topper
990821de64 [RISCV] Prevent assertion in the assembler if vmerge or vfmerge are given a V0 destination. 2020-12-14 17:22:55 -08:00
Craig Topper
3d9efd1db0 [RISCV] Handle Match_InvalidSImm5 in RISCVAsmParser::MatchAndEmitInstruction 2020-12-14 17:22:55 -08:00
Craig Topper
cc343aaa63 [RISCV] Teach debug output from assembly parser to print register names instead of enum values. 2020-12-14 17:22:55 -08:00
Jon Chesterfield
499614b729 [amdgpu] Default to code object v3
[amdgpu] Default to code object v3
v4 is not yet readily available, and doesn't appear
to be implemented in the back end

Reviewed By: t-tye

Differential Revision: https://reviews.llvm.org/D93258
2020-12-15 01:11:09 +00:00
Reid Kleckner
b945e1014a Revert "ADT: Migrate users of AlignedCharArrayUnion to std::aligned_union_t, NFC"
We determined that the MSVC implementation of std::aligned* isn't suited
to our needs. It doesn't support 16 byte alignment or higher, and it
doesn't really guarantee 8 byte alignment. See
https://github.com/microsoft/STL/issues/1533

Also reverts "ADT: Change AlignedCharArrayUnion to an alias of std::aligned_union_t, NFC"

Also reverts "ADT: Remove AlignedCharArrayUnion, NFC" to bring back
AlignedCharArrayUnion.

This reverts commit 4d8bf870a82765eb0d4fe53c82f796b957c05954.

This reverts commit d10f9863a5ac1cb681af07719650c44b48f289ce.

This reverts commit 4b5dc150b9862271720b3d56a3e723a55dd81838.
2020-12-14 17:04:06 -08:00
Changpeng Fang
23ab920e40 AMDGPU: If a store defines (alias) a load, it clobbers the load.
Summary:
 If a store defines (must alias) a load, it clobbers the load.

Fixes: SWDEV-258915

Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D92951
2020-12-14 16:34:32 -08:00
Rong Xu
be3f0f958e [PGO] Verify BFI counts after loading profile data
This patch adds the functionality to compare BFI counts with real
profile
counts right after reading the profile. It will print remarks under
-Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo.

Differential Revision: https://reviews.llvm.org/D91813
2020-12-14 15:56:10 -08:00
Harald van Dijk
9906752fd3 [X86] Fix variadic argument handling for x32
The X86-64 ABI defines va_list as

  typedef struct {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
  } va_list[1];

This means the size, alignment, and reg_save_area offset will depend on
whether we are in LP64 or in ILP32 mode, so this commit adds the checks.
Additionally, the VAARG_64 pseudo-instruction assumed 64-bit pointers, so
this commit adds a VAARG_X32 pseudo-instruction that behaves just like
VAARG_64, except for assuming 32-bit pointers.

Some of these changes were originally done by
Michael Liao <michael.hliao@gmail.com>.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48428.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93160
2020-12-14 23:47:27 +00:00
Sanjay Patel
05dd4fb241 [VectorCombine] add alignment test for gep load; NFC 2020-12-14 18:31:19 -05:00
Nico Weber
24b5f0a405 [gn build] (semi-manually) port 19d57b5c42b 2020-12-14 18:23:15 -05:00
Nico Weber
93ca28c075 [gn build] (semi-manually) port 7ad49aec125 2020-12-14 18:22:54 -05:00
Gulfem Savrun Yeniceri
196c46aa87 [clang][IR] Add support for leaf attribute
This patch adds support for leaf attribute as an optimization hint
in Clang/LLVM.

Differential Revision: https://reviews.llvm.org/D90275
2020-12-14 14:48:17 -08:00
Sanjay Patel
56023bebcf [VectorCombine] make load transform poison-safe
As noted in D93229, the transform from scalar load to vector load
potentially leaks poison from the extra vector elements that are
being loaded.

We could use freeze here (and x86 codegen at least appears to be
the same either way), but we already have a shuffle in this logic
to optionally change the vector size, so let's allow that
instruction to serve both purposes.

Differential Revision: https://reviews.llvm.org/D93238
2020-12-14 17:42:01 -05:00
Craig Topper
0f8c2c0fdc [LoopIdiomRecognize] Teach detectShiftUntilZeroIdiom to recognize loops where the counter is decrementing.
This adds support for loops like

unsigned clz(unsigned x) {
    unsigned w = sizeof (x) * CHAR_BIT;
    while (x) {
        w--;
        x >>= 1;
    }

    return w;
}

and

unsigned clz(unsigned x) {
    unsigned w = sizeof (x) * CHAR_BIT - 1;
    while (x >>= 1) {
        w--;
    }

    return w;
}

To support these we look for add x, -1 as well as add x, 1 that
we already matched. If the value was -1 we need to subtract from
the initial counter value instead of adding to it.

Fixes PR48404.

Differential Revision: https://reviews.llvm.org/D92745
2020-12-14 14:25:05 -08:00
Stanislav Mekhanoshin
23c9769258 [AMDGPU] Use multi-dword flat scratch for spilling
Differential Revision: https://reviews.llvm.org/D93067
2020-12-14 14:19:29 -08:00
Bardia Mahjour
8db5fd0457 Revert "[DDG] Data Dependence Graph - DOT printer"
This reverts commit fd4a10732c8bd646ccc621c0a9af512be252f33a, to
investigate the failure on windows: http://lab.llvm.org:8011/#/builders/127/builds/3274
2020-12-14 16:54:20 -05:00
Bardia Mahjour
6d099be63b [DDG] Data Dependence Graph - DOT printer
This patch implements a DDG printer pass that generates a graph in
the DOT description language, providing a more visually appealing
representation of the DDG. Similar to the CFG DOT printer, this
functionality is provided under an option called -dot-ddg and can
be generated in a less verbose mode under -dot-ddg-only option.

Differential Revision: https://reviews.llvm.org/D90159
2020-12-14 16:41:14 -05:00
Matt Arsenault
4c16866a59 OpaquePtr: Require byval on x86_intrcc parameter 0
Currently the backend special cases x86_intrcc and treats the first
parameter as byval. Make the IR require byval for this parameter to
remove this special case, and avoid the dependence on the pointee
element type.

Fixes bug 46672.

I'm not sure the IR is enforcing all the calling convention
constraints. clang seems to ignore the attribute for empty parameter
lists, but the IR tolerates it.
2020-12-14 16:34:37 -05:00
Zequan Wu
2bd2c2e5a1 [NFC] cleanup cg-profile emission on TargetLowerinng
Differential Revision: https://reviews.llvm.org/D93150
2020-12-14 13:07:44 -08:00
Guozhi Wei
43d0dff6c4 [MBP] Prevent rotating a chain contains entry block
The entry block should always be the first BB in a function.
So we should not rotate a chain contains the entry block.

Differential Revision: https://reviews.llvm.org/D92882
2020-12-14 12:48:55 -08:00
Philip Reames
569f60bb57 [LAA] Relax restrictions on early exits in loop structure
his is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist).  Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression.

Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so.

Differential Revision: https://reviews.llvm.org/D92066
2020-12-14 12:44:01 -08:00
Sanjay Patel
9e252bc0c0 [VectorCombine] add test for load with offset; NFC 2020-12-14 14:40:06 -05:00
Reid Kleckner
0ab72ed3c5 [Hexagon] Tweak _MSC_VER workaround version
My bot runs VS 2019, but it could not compile this code.

Message:
[55/2465] Building CXX object lib\Target\Hexagon\CMakeFiles\LLVMHexagonCodeGen.dir\HexagonVectorCombine.cpp.obj
FAILED: lib/Target/Hexagon/CMakeFiles/LLVMHexagonCodeGen.dir/HexagonVectorCombine.cpp.obj
...
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): error C2976: 'std::map': too few template arguments
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): note: see declaration of 'std::map'

The version in the path, 14.23, corresponds to _MSC_VER 1923, so raise
the version floor to 1924.

I have not tested with versions between 1924 and 1928 (latest), but the
latest works with the variadic version.
2020-12-14 11:26:36 -08:00
Alina Sbirlea
5d7e73c1ea [NFC] Remove stray comment. 2020-12-14 11:19:17 -08:00
Craig Topper
40953f64d3 [RISCV] Move vtype decoding and printing from RISCVInstPrinter to RISCVBaseInfo. Share with the assembly parser's debug output
This moves the vtype decoding and printing to RISCVBaseInfo. This keeps all of
the decoding code in the same area as the encoding code. This will make it
easier to change the decoding for the 1.0 spec in the future.

We're now sharing the printing with the debug output for operands in the
assembler. This also fixes that debug output to include the tail and mask
agnostic bits. Since the printing code works on the vtype immediate value, we
now encode the immediate during parsing and store just the immediate in the
operand.
2020-12-14 10:50:26 -08:00
Jonas Paulsson
39dd827634 [SystemZ] Improve handling of backchain offset.
- New function SDValue getBackchainAddress() used by
  lowerDYNAMIC_STACKALLOC() and lowerSTACKRESTORE() to properly handle the
  backchain offset also with packed-stack.

- Make a common function getBackchainOffset() for the computation of the
  backchain offset and use in some places (NFC).

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D93171
2020-12-14 12:39:38 -06:00
Michael Liao
d0d0262443 [amdgpu] Fix a crash case when V_CNDMASK could be simplified.
- Once an instruction is simplified, foldable candidates from it should
  be invalidated or skipped as the operand index is no longer valid.

Differential Revision: https://reviews.llvm.org/D93174
2020-12-14 13:08:13 -05:00
Roman Lebedev
c362d4d27f [NFCI][Thumb2] Regenerate MVE tests i missed in 59560e85897afc50090b6c3d920bacfd28b49d06 2020-12-14 21:01:00 +03:00
Tony
66c131f86a [NFC] Remove trailing whitespace in llvm/CMakeLists.txt
Differential Revision: https://reviews.llvm.org/D93234
2020-12-14 17:48:16 +00:00
Cameron Desrochers
783e812623 [TableGen] Fixed 64-bit filters being sliced to 32 bits in FixedLenDecoderEmitter
When using the FixedLenDecoderEmitter, llvm-tblgen emits tables with (OPC_ExtractField, OPC_ExtractFilterValue) opcode sequences to match the contiguous fixed bits of a given instruction's encoding. This encoding is represented in a 64-bit integer. However, the filter values were represented in a 32-bit integer. As such, instructions with fixed 64-bit encodings resulted in a table with an OPC_ExtractField for all 64 bits, followed by an OPC_ExtractFilterValue containing just the low 32 bits of their encoding, causing the filter never to match.

The exact point at which the slicing occurred was during the map insertion at line 630.

Differential Revision: https://reviews.llvm.org/D92423
2020-12-14 12:42:35 -05:00
Nemanja Ivanovic
6da8209a9a [PowerPC] Restore stack ptr from frame ptr with setjmp
If a function happens to:

- call setjmp
- do a 16-byte stack allocation
- call a function that sets up a stack frame and longjmp's back

The stack pointer that is restores by setjmp will no longer point to a valid
back chain. According to the ABI, stack accesses in such a function are to be
frame pointer based - so it is an error (quite obviously) to restore the stack
from the back chain.
We already restore the stack from the frame pointer when there are calls to
fast_cc functions. We just need to also do that when there are calls to setjmp.
This patch simply does that.

This was pointed out by the Julia team.

Differential revision: https://reviews.llvm.org/D92906
2020-12-14 11:34:16 -06:00
Roman Lebedev
2516110411 [SimplifyCFG] FoldBranchToCommonDest(): temporairly put back restrictions on liveout uses of bonus instructions (PR48450)
Even though d38205144febf4dc42c9270c6aa3d978f1ef65e1 was mostly a correct
fix for the external non-PHI users, it's not a *generally* correct fix,
because the 'placeholder' values in those trivial PHI's we create
shouldn't be *always* 'undef', but the PHI itself for the backedges,
else we end up with wrong value, as the `@pr48450_2` test shows.

But we can't just do that, because we can't check that the PHI
can be it's own incoming value when coming from certain predecessor,
because we don't have a dominator tree.

So until we can address this correctness problem properly,
ensure that we don't perform the transformation
if there are such problematic external uses.

Making dominator tree available there is going to be involved,
since `-simplifycfg` pass currently does not preserve/update domtree...
2020-12-14 20:14:31 +03:00
Roman Lebedev
4292a82bd2 [NFC][SimplifyCFG] FoldBranchToCommonDest(): pull out 'common successor' into a variable
Makes it easier to use it elsewhere
2020-12-14 20:14:31 +03:00
Roman Lebedev
aef5ea6d96 [NFC][SimplifyCFG] Add another miscompiled test for PR48450 2020-12-14 20:14:31 +03:00
Stanislav Mekhanoshin
4e48d88543 [SLP] Control maximum vectorization factor from TTI
D82227 has added a proper check to limit PHI vectorization to the
maximum vector register size. That unfortunately resulted in at
least a couple of regressions on SystemZ and x86.

This change reverts PHI handling from D82227 and replaces it with
a more general check in SLPVectorizerPass::tryToVectorizeList().
Moved to tryToVectorizeList() it allows to restart vectorization
if initial chunk fails.

However, this function is more general and handles not only PHI
but everything which SLP handles. If vectorization factor would
be limited to maximum vector register size it would limit much
more vectorization than before leading to further regressions.
Therefore a new TTI callback getMaximumVF() is added with the
default 0 to preserve current behavior and limit nothing. Then
targets can decide what is better for them.

The callback gets ElementSize just like a similar getMinimumVF()
function and the main opcode of the chain. The latter is to avoid
regressions at least on the AMDGPU. We can have loads and stores
up to 128 bit wide, and <2 x 16> bit vector math on some
subtargets, where the rest shall not be vectorized. I.e. we need
to differentiate based on the element size and operation itself.

Differential Revision: https://reviews.llvm.org/D92059
2020-12-14 08:49:40 -08:00