1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

212966 Commits

Author SHA1 Message Date
Nikita Popov
d872e52c16 [InstSimplify] Regenerate test checks (NFC) 2021-03-21 17:41:21 +01:00
Nikita Popov
259cab0fcc [InstSimplify] Add additional select operand replacement tests (NFC)
This tests for binops with identity elements.
2021-03-21 15:30:30 +01:00
Nikita Popov
befc0b53a8 [InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI)
Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.
2021-03-21 15:30:30 +01:00
Matt Arsenault
2eda243ee8 GlobalISel: Avoid unnecessary truncation to i64
We can just directly pass through the APInt to create a new constant.
2021-03-21 10:07:41 -04:00
Matt Arsenault
140b871148 AMDGPU/GlobalISel: Enable CSE in pre-legalizer combiner 2021-03-21 10:07:37 -04:00
Simon Pilgrim
4d0156a36f [DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension
As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.
2021-03-21 14:01:37 +00:00
Simon Pilgrim
d0bf75b218 [X86][AVX] ComputeNumSignBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Added as an extension to PR49658
2021-03-21 12:22:51 +00:00
Simon Pilgrim
a9cc1a6025 [X86] Add 'mulhs' variant of PR49658 test case 2021-03-21 12:09:05 +00:00
David Green
3dae9e9960 [ARM] VINS f16 pattern
This adds an extra pattern for inserting an f16 into a odd vector lane
via an VINS. If the dual-insert-lane pattern does not happen to apply,
this can help with some simple cases.

Differential Revision: https://reviews.llvm.org/D95471
2021-03-21 12:00:06 +00:00
luxufan
01bf694073 [RISCV] remove redundant instruction when eliminate frame index
The reason for generating mv a0, a0 instruction is when the stack object offset is large then int<12>. To deal this situation, in the elimintateFrameIndex function, it will
create a virtual register, which needs the register scavenger to scavenge it. If the machine instruction that contains the stack object and the opcode is ADDI(the addi
was generated by frameindexNode), and then this instruction's destination register was the same as the register that was generated by the register scavenger, then the
mv a0, a0 was generated. So to eliminnate this instruction, in the eliminateFrameIndex function, if the instrution opcode is ADDI, then the virtual register can't be created.

Differential Revision: https://reviews.llvm.org/D92479
2021-03-21 18:54:00 +08:00
Simon Pilgrim
db3cbc0a8e [X86][AVX] computeKnownBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources
The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Suggested by @craig.topper on PR49658
2021-03-21 10:40:57 +00:00
Simon Pilgrim
ee17e81726 [X86] Add PR49658 test case 2021-03-21 10:16:55 +00:00
Simon Pilgrim
a2fff44e8e [X86] computeKnownBitsForTargetNode - add X86ISD::PMULUDQ handling
Reuse the existing KnownBits multiplication code to handle what is effectively a ISD::UMUL_LOHI varient
2021-03-21 09:57:20 +00:00
Craig Topper
a29c1e206d [RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization does not improve code.
If the mul add two users, one of which was a sext.w, the mul
would also be selected to a MULW before our pattern runs. This
causes the ANDs to now be used by the already selected MULW and
the mul we still need to select. They are unneeded on the MULW
since MULW only reads the lower bits. So they get selected to
SLLI+SRLI for the MULW use. The use for the
(mul (and X, 0xffffffff), (and Y, 0xffffffff)) manages to reuse
the SLLI.

The end result is increased register pressure and no improvement
to how soon we can start the MULW.
2021-03-20 17:54:28 -07:00
Andrew Litteken
4514855e45 Revert "[IRSim] Adding basic implementation of llvm-sim."
Causing build errors on the Windows Buildbots.

This reverts commit 5155dff2784a47583d432d796b7cf47a0bed9f20.
2021-03-20 18:03:09 -05:00
Jessica Clarke
42f6c00a37 [RISCV] Update comment in RISCVInstrInfoM.td
Missed in 07ed62b7d551.
2021-03-20 22:35:40 +00:00
Craig Topper
00a544d23a [RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled.
This optimization is trying to save SRLI instructions needed to
implement the ANDs. If we have zext.w we won't save anything.
Because we don't check that the multiply is the only user of the
AND we might even increase instruction count.
2021-03-20 15:31:45 -07:00
Craig Topper
dd7835bd09 [RISCV] Add Zba command lines to xaluo.ll. NFC
Some of the patterns end up with 32 to 64 bit zero extends on RV64
which can be handled by zext.w.
2021-03-20 15:31:45 -07:00
Craig Topper
e094271bc1 [RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64
This patterns computes the full 64 bit product of a 32x32 unsigned
multiply. This requires a two pairs of SLLI+SRLI to zero the
upper 32 bits of the inputs.

We can do better than this by using two SLLI to move the lower
bits to the upper bits then use MULHU to compute the product. This
is the high half of a full 64x64 product. Since we put 32 0s in the lower
bits of the inputs we know the 128-bit product will have zeros in the
lower 64 bits. So the upper 64 bits, which MULHU computes, will contain
the original 64 bit product we were after.

The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32))
using MULHS, but sext_inreg is sext.w which is already one instruction so we
wouldn't save anything.

Differential Revision: https://reviews.llvm.org/D99026
2021-03-20 14:55:46 -07:00
Andrew Litteken
db0fe80a86 [IRSim] Adding basic implementation of llvm-sim.
This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier.  The resulting SimilarityGroups
are output in a JSON file.

Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.

Reviewers: paquette, jroelofs, MaskRay

Recommit of: 15645d044bcfe2a0f63156048b302f997a717688 to fix linking
errors.

Differential Revision: https://reviews.llvm.org/D86974
2021-03-20 16:47:50 -05:00
Jinsong Ji
7a834ebbd0 [AIX] Update rpath for BUILD_SHARED_LIBS
BUILD_SHARED_LIBS build llvm component as shared library,
which can reduce the size a lot.

Normally, the binary use ORIGIN../lib to load component libraries,
unfortunatly, ORIGIN is not supported by AIX ld.

We hardcoded the build lib and install lib path in rpath for now
to enable BUILD_SHARED_LIBS build.

Understand that this is not perfect solution,
we can update this when we find better solution.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D98901
2021-03-20 20:31:43 +00:00
Sanjay Patel
2a598ef9d2 [BranchProbability] move options for 'likely' and 'unlikely'
This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.

See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).

Differential Revision: https://reviews.llvm.org/D98945
2021-03-20 14:46:46 -04:00
Fangrui Song
faf0152f73 [VE] Fix types of multiclass template arguments in TableGen files
There were not properly checked before `[TableGen] Improve handling of template arguments`.
2021-03-20 10:36:51 -07:00
Vaivaswatha Nagaraj
566ea91075 [OCaml] Add (get/set)_module_identifer functions
Also:

- Fix a bug that crept in when fixing a buildbot failure in
f7be9db622
- Use mlsize_t for cstr_to_string as that is what
caml_alloc_string specifies.

Differential Revision: https://reviews.llvm.org/D98851
2021-03-20 20:41:51 +05:30
David Zarzycki
980d119a88 [lit] Sort testing summary output
As fallout from from the record-and-reorder work, people asked that the
summary output be sorted to aid diffing.
2021-03-20 07:52:08 -04:00
Jeroen Dobbelaere
52195b6999 Revert of D49126 [PredicateInfo] Use custom mangling to support ssa_copy with unnamed types.
Now that intrinsic name mangling can cope with unnamed types, the custom name mangling in PredicateInfo (introduced by D49126) can be removed.
(See D91250, D48541)

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D91661
2021-03-20 11:37:09 +01:00
Wang, Pengfei
fff4eb8636 [X86] Fix a bug when calculating the ldtilecfg insertion points.
The BB we initialized the ldtilecfg is special. We don't need to check
if its predecessor BBs need to insert ldtilecfg for calls.

We reused the flag HasCallBeforeAMX, so that the predecessors won't be
added to CfgNeedInsert.

This case happens only when the entry BB is in a loop. We need to hoist
the first tile config point out of the loop in future.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D98845
2021-03-20 17:48:59 +08:00
Juneyoung Lee
4bb78e2cb0 [CFLGraph] Fix a crash due to missing handling of freeze
https://reviews.llvm.org/D85534#2636321
2021-03-21 02:14:13 +09:00
Shao-Ce Sun
c789b50f4d [NFC][ValueTypes] Align code by column
Adjusted some whitespaces.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98975
2021-03-20 13:43:07 +08:00
Carl Ritson
c6760fad72 [AMDGPU] Add MDT update missing from D98915 2021-03-20 13:38:58 +09:00
Nemanja Ivanovic
b918a37b96 [PowerPC][NFC] Do not produce i64 constants in 32-bit mode
There are some instances where we produce constants of type MVT::i64
unconditionally in the target DAG combines. This is not actually
valid in 32-bit mode.
2021-03-19 22:54:47 -05:00
Craig Topper
ca727da695 [RISCV] Rename WriteShift/ReadShift scheduler classes to WriteShiftImm/ReadShiftImm. Move variable shifts from WriteIALU/ReadIALU to new WriteShiftReg/ReadShiftReg.
Previously only immediate shifts were in WriteShift. Register
shifts were grouped with IALU. Seems likely that immediate shifts
would be as fast or faster than register shifts. And that immediate
shifts wouldn't be any faster than IALU. So if any deserved to be in
their own group it should be register shifts not immediate shifts.

Rather than try to flip them let's just add more granularity
and give each kind their own class. I've used new names for both to
make them unambiguous and to force any downstream implementations to
be forced to put correct information in their scheduler models.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D98911
2021-03-19 20:39:49 -07:00
Carl Ritson
ab6ec1f384 [AMDGPU] Rename SIInsertSkips Pass
Pass no longer handles skips.  Pass now removes unnecessary
unconditional branches and lowers early termination branches.
Hence rename to SILateBranchLowering.

Move code to handle returns to epilog from SIPreEmitPeephole
into SILateBranchLowering. This means SIPreEmitPeephole only
contains optional optimisations, and all required transforms
are in SILateBranchLowering.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98915
2021-03-20 11:48:04 +09:00
Carl Ritson
6af08589c4 [AMDGPU] Merge SIRemoveShortExecBranches into SIPreEmitPeephole
SIRemoveShortExecBranches is an optimisation so fits well in the
context of SIPreEmitPeephole.

Test changes relate to early termination from kills which have now
been lowered prior to considering branches for removal.
As these use s_cbranch the execz skips are now retained instead.
Currently either behaviour is valid as kill with EXEC=0 is a nop;
however, if early termination is used differently in future then
the new behaviour is the correct one.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D98917
2021-03-20 11:26:42 +09:00
Lang Hames
8f653b1bbd [llvm-jitlink] Scan input files for first object to determine triple.
The previous logic would crash if the first input file was an archive rather
than an object.
2021-03-19 19:24:10 -07:00
Senran Zhang
d628808ad9 [Utils][vim] Highlight poison keyword
Reviewed By: awarzynski, MaskRay

Differential Revision: https://reviews.llvm.org/D98927
2021-03-19 19:09:11 -07:00
Lang Hames
ae897863a3 [JITLink] Remove redundant local variable definitions from a unit test. 2021-03-19 18:29:36 -07:00
Carl Ritson
fd73d84cf5 [AMDGPU] Allow index optimisation in SIPreEmitPeephole for bundles
Add code so duplication index register changes can be removed from
inside bundles.

Reviewed By: rampitec, foad

Differential Revision: https://reviews.llvm.org/D98940
2021-03-20 10:26:23 +09:00
Anshil Gandhi
b6fa39152d [NFC] [PowerPC] Determine Endianness in PPCTargetMachine
The TargetMachine uses the triple to determine endianness. Just
use that logic rather than replicating it in PPCSubtarget.

Differential revision: https://reviews.llvm.org/D98674
2021-03-19 20:22:16 -05:00
Peter Collingbourne
b0fbcbc034 gn build: Unbreak Android cross-compilation.
- D96404 defaulted to libunwind which isn't provided by NDK r21
  (or r22), so specify -rtlib=libgcc on non-arm32.
- D97993 means that we need to use --gcc-toolchain instead of -B
  to let the driver find libgcc.
2021-03-19 16:28:24 -07:00
Ellis Hoag
434b17dd9a Port D97640 to llvm/include/llvm/ProfileData/InstrProfData.inc
Differential Revision: https://reviews.llvm.org/D98982
2021-03-19 16:24:16 -07:00
Lang Hames
263720c0fc [JITLink] Don't issue lookups for empty symbol sets.
Issuing a lookup for an empty symbol set is legal, but can actually result in
unrelated work being done if there was a work queue left over from the previous
lookup. We can avoid doing this unrelated work (reducing stack depth and
interleaving of debugging output) by not issuing these no-op lookups in the
first place.
2021-03-19 16:10:47 -07:00
Christoffer Lernö
3f24585141 Add type attributes to LLVM C API
The LLVM C API is missing type attributes as is needed by attributes
such as sret and byval. This patch adds three missing wrapper
functions.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=48249

https://reviews.llvm.org/D97763
2021-03-19 19:07:04 -04:00
Arthur Eubanks
4b9d699155 [NewPM] Verify LoopAnalysisResults after a loop pass
All loop passes should preserve all analyses in LoopAnalysisResults. Add
checks for those when the checks are enabled (which is by default with
expensive checks on).

Note that due to PR44815, we don't check LAR's ScalarEvolution.
Apparently calling SE.verify() can change its results.

This is a reland of https://reviews.llvm.org/D98820 which was reverted
due to unacceptably large compile time regressions in normal debug
builds.
2021-03-19 14:56:37 -07:00
Jessica Paquette
ae291b6dfb [GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes)
There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.

E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain
code that matches a bitfield extract from an and + right shift.

Rather than duplicating code in the same way, this adds two opcodes:

- G_UBFX (unsigned bitfield extract)
- G_SBFX (signed bitfield extract)

They work like this

```
%x = G_UBFX %y, %lsb, %width
```

Where `lsb` and `width` are

- The least-significant bit of the extraction
- The width of the extraction

This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends
the result, while G_SBFX sign-extends the result.

This should allow us to use the combiner to match the bitfield extraction
patterns rather than duplicating pattern-matching code in each target.

Differential Revision: https://reviews.llvm.org/D98464
2021-03-19 14:37:19 -07:00
Fangrui Song
e2c184371f [llvm-readobj] Remove legacy GNU_PROPERTY_X86_ISA_1_{NEEDED,USED} and dump new GNU_PROPERTY_X86_ISA_1_{NEEDED,USED}
https://sourceware.org/bugzilla/show_bug.cgi?id=26703 deprecated the
previous GNU_PROPERTY_X86_ISA_1_{CMOV,SSE,*} values (renamed to `COMPAT`)
and added new values.

Since the legacy values are not used by compilers, having dumping support in
llvm-readobj is unnecessary. So just drop the legacy feature.

The new values are used by GCC 11
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97250) `-march=x86-64-v[234]` to
indicate the micro-architecture ISA levels.

Differential Revision: https://reviews.llvm.org/D98818
2021-03-19 14:35:22 -07:00
Arthur Eubanks
36d2e0e3b3 Revert "[NewPM] Verify LoopAnalysisResults after a loop pass"
This reverts commit 94c269baf58330a5e303a4f86f64681f2f7a858b.

Still causes too large of compile time regression in normal debug
builds. Will put under expensive checks instead.
2021-03-19 14:31:08 -07:00
Ella Ma
2a15d1e5e9 [llvm] Add assertions for the smart pointers with the possibility to be null in ModuleLazyLoaderCache::operator()
Split from D91844.

The return value of function `ModuleLazyLoaderCache::operator()` in file llvm/tools/llvm-link/llvm-link.cpp. According to the bug report of my static analyzer, the std::function variable `ModuleLazyLoaderCache::createLazyModule` points to function `loadFile`, which may return `nullptr` when error. And the pointer is dereferenced without a check.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D97258
2021-03-19 13:52:34 -07:00
Arthur Eubanks
20e98ffdeb [NewPM] Verify LoopAnalysisResults after a loop pass
All loop passes should preserve all analyses in LoopAnalysisResults. Add
    checks for those.

    Note that due to PR44815, we don't check LAR's ScalarEvolution.
    Apparently calling SE.verify() can change its results.

    Only verify MSSA when VerifyMemorySSA, normally it's very expensive.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98820
2021-03-19 13:26:45 -07:00
Sanjay Patel
32fe5823ff [SLP] remove unnecessary characters in test; NFC
Glitch that crept in with 62f9c3358b81
2021-03-19 15:09:53 -04:00