1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 02:33:06 +01:00
Commit Graph

45641 Commits

Author SHA1 Message Date
Johannes Doerfert
9b53e594b4 Reapply "[Attributor] Disable simplification AAs if a callback is present""
This reapplies commit cbb709e25124dc38ee593882051fc88c987fe591 and
includes the use of the lookup method instead of operator[] to avoid
accidentally setting (empty) simplification callbacks.

This reverts commit aa27430a625b2fd059707a87f8ba2df8f480ff11.
2021-07-27 19:14:50 -05:00
Mehdi Amini
115164b68b Add llvm::equal convenient wrapper for ranges around std::equal
Differential Revision: https://reviews.llvm.org/D106913
2021-07-28 00:10:22 +00:00
Johannes Doerfert
96f821bd7b Revert "[Attributor] Disable simplification AAs if a callback is present"
This reverts commit cbb709e25124dc38ee593882051fc88c987fe591 as it
breaks the tests, which was not supposed to happen. Investigating now.
2021-07-27 18:09:42 -05:00
Johannes Doerfert
0b6fb7b1ef [Attributor] Disable simplification AAs if a callback is present
AAValueSimplify, AAValueConstantRange, and AAPotentialValues all look at
the IR by default. If queried for a IR position which has a
simplification callback we should either look at the callback return, or
give up. We do the latter for now.
2021-07-27 17:50:26 -05:00
Alexey Zhikhartsev
6516543c4b Add jump-threading optimization for deterministic finite automata
The current JumpThreading pass does not jump thread loops since it can
result in irreducible control flow that harms other optimizations. This
prevents switch statements inside a loop from being optimized to use
unconditional branches.

This code pattern occurs in the core_state_transition function of
Coremark. The state machine can be implemented manually with goto
statements resulting in a large runtime improvement, and this transform
makes the switch implementation match the goto version in performance.

This patch specifically targets switch statements inside a loop that
have the opportunity to be threaded. Once it identifies an opportunity,
it creates new paths that branch directly to the correct code block.
For example, the left CFG could be transformed to the right CFG:

```
          sw.bb                        sw.bb
        /   |   \                    /   |   \
   case1  case2  case3          case1  case2  case3
        \   |   /                /       |       \
        latch.bb             latch.2  latch.3  latch.1
         br sw.bb              /         |         \
                           sw.bb.2     sw.bb.3     sw.bb.1
                            br case2    br case3    br case1
```

Co-author: Justin Kreiner @jkreiner
Co-author: Ehsan Amiri @amehsan

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D99205
2021-07-27 14:34:04 -04:00
Thomas Lively
bb4e957ebf [WebAssembly] Codegen for extmul SIMD instructions
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.

Differential Revision: https://reviews.llvm.org/D106724
2021-07-27 08:41:30 -07:00
Anirudh Prasad
2b967460b4 [SystemZ][z/OS] Initial code to generate assembly files on z/OS
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D106380
2021-07-27 11:29:15 -04:00
Anna Thomas
23f3f90569 Strip undef implying attributes when moving calls
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.

This patch introduces such an API and uses it in relevant passes. See
updated tests.

Fix for PR50744.

Reviewed By: nikic, jdoerfert, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104641
2021-07-27 10:57:05 -04:00
Jeremy Morse
adf33c8470 [DebugInfo][InstrRef] Correctly update DBG_PHIs during instr scheduling
Avoid several crashes when DBG_INSTR_REF and DBG_PHI instructions are fed
to the instruction scheduler. DBG_INSTR_REFs should be treated like
DBG_LABELs, and just ignored for the purpose of scheduling [0].

DBG_PHIs however behave much more like DBG_VALUEs: they refer to register
operands, and if some register defs get shuffled around during instruction
scheduling, there's a risk that the debug instr will refer to the wrong
value. There's already a facility for updating DBG_VALUEs to reflect this;
add DBG_PHI to the list of instructions that it will update.

[0] Suboptimal, but it's what instr scheduling does right now.

Differential Revision: https://reviews.llvm.org/D106663
2021-07-27 15:12:46 +01:00
Chris Jackson
fe743a5b25 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reapplies commit 76f3ffb2b285998f02639db8fd42fb0de8a540d0 that was
reverted due to buildbot failures.

- Update lit tests with REQUIRES condition.
- Abandon salvage attempt if SCEVUnknown::getValue() returns nullptr.

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 14:22:09 +01:00
Chris Jackson
e13e00f7b3 [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This reverts commit 76f3ffb2b285998f02639db8fd42fb0de8a540d0 because
of a failure on sanitixer-X86-64-linux-autoconf.
2021-07-27 13:36:56 +01:00
Chris Jackson
2e863e863d [DebugInfo][LoopStrengthReduction] SCEV-based salvaging for LSR
This patch extends salvaging of debuginfo in the Loop Strength Reduction
(LSR) pass by translating Scalar Evaluations (SCEV) into DIExpressions.
The method is as follows:
- Cache dbg.value intrinsics that are salvageable.
- Obtain a loop Induction Variable (IV) from ScalarExpressionExpander or
  the loop header.
- Translate the IV SCEV into an expression that recovers the current
  loop iteration count. Combine this with the dbg.value's location
  op SCEV to create a DIExpression that salvages the value.

Review by: jmorse

Differential Revision: https://reviews.llvm.org/D105207
2021-07-27 13:00:36 +01:00
Jay Foad
3bc8cd6a0b [GlobalISel] Constant fold G_SITOFP and G_UITOFP in CSEMIRBuilder
Differential Revision: https://reviews.llvm.org/D104528
2021-07-27 11:27:58 +01:00
Esme-Yi
e8ead0fbb2 [Debug-Info][llvm-dwarfdump] Don't try to dump location
list for attributes that don't have the loclist class.

Summary: The overflow error occurs when we try to dump
location list for those attributes that do not have the
loclist class, like DW_AT_count and DW_AT_byte_size.
After re-reviewed the entire list, I sorted those
attributes into two parts, one for dumping location list
and one for dumping the location expression.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D105613
2021-07-27 07:28:59 +00:00
Lang Hames
d08c2fabb8 [ORC] Require ExecutorProcessControl when constructing an ExecutionSession.
Wrapper function call and dispatch handler helpers are moved to
ExecutionSession, and existing EPC-based tools are re-written to take an
ExecutionSession argument instead.

Requiring an ExecutorProcessControl instance simplifies existing EPC based
utilities (which only need to take an ES now), and should encourage more
utilities to use the EPC interface. It also simplifies process termination,
since the session can automatically call ExecutorProcessControl::disconnect
(previously this had to be done manually, and carefully ordered with the
rest of JIT tear-down to work correctly).
2021-07-27 16:53:49 +10:00
Johannes Doerfert
e9073c9b78 [OpenMP] Try to simplify all loads in device code
Eliminating loads/stores in the device code is worth the extra effort,
especially for the new device runtime.

At the same time we do not compute AAExecutionDomain for non-device code
anymore, there is no point.

Differential Revision: https://reviews.llvm.org/D106845
2021-07-27 01:44:15 -05:00
Johannes Doerfert
756885168c [Attributor][FIX] Copy all members in the assignment operator
Also improve debug output slightly.
2021-07-27 01:44:13 -05:00
Johannes Doerfert
86cba7bf5e [InstSimplify] Expose generic interface for replaced operand simplification
Users, especially the Attributor, might replace multiple operands at
once. The actual implementation of simplifyWithOpReplaced is able to
handle that just fine, the interface was simply not allowing to replace
more than one operand at a time. This is exposing a more generic
interface without intended changes for existing code.

Differential Revision: https://reviews.llvm.org/D106189
2021-07-27 00:56:12 -05:00
Johannes Doerfert
4ce16022d3 [Local] Do not introduce a new llvm.trap before unreachable
This is the second attempt to remove the `llvm.trap` insertion after
https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6
reverted the first one. It is not clear what the exact issue was back
then and it might already be gone by now, it has been >5 years after
all.

Replaces D106299.

Differential Revision: https://reviews.llvm.org/D106308
2021-07-26 23:33:36 -05:00
Johannes Doerfert
95f415bd34 [Attributor] Delete dead stores
D106185 allows us to determine if a store is needed easily. Using that
knowledge we can start to delete dead stores.

In AAIsDead we now track more state as an instruction can be dead (= the
old optimisitc state) or just "removable". A store instruction can be
removable while being very much alive, e.g., if it stores a constant
into an alloca or internal global. If we would pretend it was dead
instead of only removablewe we would ignore it when we determine what
values a load can see, so that is not what we want.

Differential Revision: https://reviews.llvm.org/D106188
2021-07-26 23:33:36 -05:00
Johannes Doerfert
5ee3942ae7 [Attributor] Introduce getPotentialCopiesOfStoredValue and use it
This patch introduces `getPotentialCopiesOfStoredValue` which uses
AAPointerInfo to determine all "aliases" or "potential copies" of a
value that is stored into memory. This operation can fail but if it
succeeds it means we can visit all "uses" of a value even if it is
temporarily stored in memory.

There are two users for the function:
  1) `Attributor::checkForAllUses` which will now ignore the value use
     in a store if all "potential copies" can be identified and instead
     be visited. This allows various AAs, including AAPointerInfo
     itself, to look through memory.
  2) `AANoCapture` which uses a custom use tracking through the
     CaptureTracker interface and therefore needs to be thought
     explicitly.

Differential Revision: https://reviews.llvm.org/D106185
2021-07-26 23:33:36 -05:00
Mitch Phillips
fcd0279b17 Revert "[GlobalISel] Add scalar widening for G_MERGE_VALUES destination"
This reverts commit 0a37163d1d855a2db41e1f46ddbc3f4570bd7ca6.

Reason: Broke the sanitizer msan bots. More details are available in the
original Phabricator review: https://reviews.llvm.org/D106814.
2021-07-26 19:52:12 -07:00
Jessica Paquette
1c2cdfcee3 [GlobalISel] Add scalar widening for G_MERGE_VALUES destination
This adds support for the case where

WideSize = DstSize + K * SrcSize

In this case, we can pad the G_MERGE_VALUES instruction with K extra undef
values with width SrcSize. Then the destination can be handled via
widenScalarDst.

Differential Revision: https://reviews.llvm.org/D106814
2021-07-26 17:00:00 -07:00
Masoud Ataei
8bfb4cf745 [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX
Before MASSV only supported P8 and P9 on AIX ans Linux . This patch proposes
MASSV to add support of P7 and P10 only on AIX too.

Differential: https://reviews.llvm.org/D106678
2021-07-26 23:21:38 +00:00
Amara Emerson
5e3045ed0c [GlobalISel] Add a constant folding combine.
Use it AArch64 post-legal combiner. These don't always get folded because when
the instructions are created the constants are obscured by artifacts.

Differential Revision: https://reviews.llvm.org/D106776
2021-07-26 14:53:33 -07:00
Matheus Izvekov
33d35b0a79 [CodeView] Saturate values bigger than supported by APInt.
This fixes an assert firing when compiling code which involves 128 bit
integrals.

This would trigger runtime checks similar to this:
```
Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646
```

To get around this, we just saturate those big values.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D105320
2021-07-26 22:15:26 +02:00
Reid Kleckner
a85a7951e1 Fix clang debug info irgen of i128 enums
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.

Split from D105320, thanks to Matheus Izvekov for the test case and
report.

Differential Revision: https://reviews.llvm.org/D106585
2021-07-26 12:25:29 -07:00
Amara Emerson
b09f2e63d9 [GlobalISel] Add combine for merge(unmerge) and use AArch64 postlegal-combiner.
Differential Revision: https://reviews.llvm.org/D106761
2021-07-26 10:37:31 -07:00
Florian Hahn
3ef6cc6cad [LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.

RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.

For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105481
2021-07-26 17:38:10 +01:00
Simon Pilgrim
e67bfc0f97 [Analysis] Fix getOrderedReductionCost to call target's getArithmeticInstrCost implementation
The getOrderedReductionCost implementation introduced in D105432 calls the CRTP base version getArithmeticInstrCost instead of the redirecting to the target version.

Differential Revision: https://reviews.llvm.org/D106795
2021-07-26 17:15:43 +01:00
Fangrui Song
f0aeef47e5 [yaml2obj][MachO] Rename PayloadString to Content
The new name is conciser and matches yaml2obj ELF & DWARF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D106759
2021-07-26 09:04:51 -07:00
Kazu Hirata
24ba96f75f [AsmParser] Remove MDRef (NFC)
The last use was removed on Jan 12, 2015 in commit
ab617d597708fcf3c4b829bf595e9d990ca66c07.
2021-07-26 08:29:33 -07:00
Ulrich Weigand
81afdbc83c [SystemZ] Add support for new cpu architecture - arch14
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10304.

Note: No currently available Z system supports the arch14
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2021-07-26 16:57:28 +02:00
Nikita Popov
bab200ac44 [IR] Consider non-willreturn as side effect (PR50511)
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.

I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.

Differential Revision: https://reviews.llvm.org/D106749
2021-07-26 16:35:14 +02:00
Benjamin Kramer
9ae7d5aa56 Simplify away some SmallVector copies. NFCI.
The lifetime of the initializer list is the full expression, so we can
skip storing it in a temporary vector.
2021-07-26 16:33:38 +02:00
Paul Walker
fc75021aa7 [NFC] Change VFShape so it contains an ElementCount rather than seperate VF and IsScalable properties.
Differential Revision: https://reviews.llvm.org/D106750
2021-07-26 12:25:46 +01:00
Philipp Krones
d7917544a3 [Inliner] Make the CallPenalty configurable
Tests with multiple benchmarks, like Embench [1], showed that the
CallPenalty magic number has the most influence on inlining decisions
when optimizing for size.

On the other hand, there was no good default value for this parameter.
Some benchmarks profited strongly from a reduced call penalty. On
example is the picojpeg benchmark compiled for RISC-V, which got 6%
smaller with a CallPenalty of 10 instead of 12. Other benchmarks
increased in size, like matmult.

This commit makes the compromise of turning the magic number constant of
CallPenalty into a configurable value. This introduces the flag
`--inline-call-penalty`. With that flag users can fine tune the inliner
to their needs.

The CallPenalty constant was also used for loops. This commit replaces
the CallPenalty constant with a new LoopPenalty constant that is now
used instead.

This is a slimmed down version of https://reviews.llvm.org/D30899

[1]: https://github.com/embench/embench-iot

Differential Revision: https://reviews.llvm.org/D105976
2021-07-26 12:07:49 +01:00
Dylan Fleming
6f6b3d4f7a [SVE] Add support for folding for select + masked loads
Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).

Patch originally authored by @paulwalker-arm

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106376
2021-07-26 11:58:41 +01:00
David Sherwood
2d2e4a1b17 [Analysis] Add simple cost model for strict (in-order) reductions
I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves
  continually splitting a vector up into halves and adding each
  half together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we use the worst case maximum vscale value.

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432
2021-07-26 10:26:06 +01:00
Lang Hames
480ddba43e [ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform.
This allows ORC to execute code containing Objective-C and Swift classes and
methods (provided that the language runtime is loaded into the executor).
2021-07-26 18:02:01 +10:00
Nikita Popov
ed5a269ff1 [Attributes] Clean up handling of UB implying attributes (NFC)
Rather than adding methods for dropping these attributes in
various places, add a function that returns an AttrBuilder with
these attributes, which can then be used with existing methods
for dropping attributes. This is with an eye on D104641, which
also needs to drop them from returns, not just parameters.

Also be more explicit about the semantics of the method in the
documentation. Refer to UB rather than Undef, which is what this
is actually about.
2021-07-25 18:21:13 +02:00
Kazu Hirata
697f5408f6 [GlobalISel] Remove FlagsOp (NFC)
The class was introduced without a use on Dec 11, 2018 in commit
cef44a234219e38e1c28c902ff24586150eef682.
2021-07-25 07:05:07 -07:00
Kazu Hirata
3e06079a92 [Inline] Fix a warning by removing an explicit copy constructor
This patches fixes the warning:

  llvm/include/llvm/Analysis/InlineCost.h:62:3: error: definition of
  implicit copy assignment operator for 'CostBenefitPair' is
  deprecated because it has a user-declared copy constructor
  [-Werror,-Wdeprecated-copy]

by removing the explicit copy constructor.
2021-07-25 06:56:47 -07:00
Liqiang Tao
4952863892 [llvm][Inline] Add interface to return cost-benefit stuff
Return cost-benefit stuff which is computed by cost-benefit analysis.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D105349
2021-07-25 20:18:19 +08:00
Kazu Hirata
45ba93f745 [ADT] Remove WrappedPairNodeDataIterator (NFC)
The last use was removed on Jul 16, 2020 in commit
f1d4db4f0cdcbfeaee0840bf8a4fb5dc1b9b56fd.
2021-07-24 08:02:57 -07:00
Sander de Smalen
1f523effc3 [BasicTTI] Set scalarization cost of scalable vector casts to Invalid.
When BasicTTIImpl::getCastInstrCost can't determine the cost of a
vector cast operation when the types need legalization, it falls
back to calculating scalarization costs. Instead of crashing on
`cast<FixedVectorType>(DstVTy)` when the type is a scalable vector,
return an Invalid cost.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106655
2021-07-24 14:13:21 +01:00
Simon Pilgrim
cb0d04c29e [DAG] Add initial SelectionDAG::isGuaranteedNotToBeUndefOrPoison framework (PR51129)
I've setup the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it, further Opcodes can be handled when we have test coverage.

I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place.

SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added.

Differential Revision: https://reviews.llvm.org/D106668
2021-07-24 11:36:35 +01:00
Amara Emerson
c79055fae4 [GlobalISel] Add GUnmerge, GMerge, GConcatVectors, GBuildVector abstractions. NFC.
Use these to slightly simplify some code in the artifact combiner.
2021-07-23 22:32:26 -07:00
Lang Hames
500a10cb5e Re-re-re-apply "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."
The ccache builders have recevied a config update that should eliminate the
build issues seen previously.
2021-07-24 13:16:12 +10:00
Kuter Dinel
a502a514d2 [AMDGPU] Deduce attributes with the Attributor
This patch introduces a pass that uses the Attributor to deduce AMDGPU specific attributes.

Reviewed By: jdoerfert, arsenm

Differential Revision: https://reviews.llvm.org/D104997
2021-07-24 06:07:15 +03:00