llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00

Author	SHA1	Message	Date
Florian Hahn	d00a7edec9	[LV] Add test to store a first-order rec via interleave group. This is a reduced version of the reproducer from https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2	2021-07-26 15:20:04 +01:00
Alexey Bataev	37023003ea	[SLP]Fix costs calculations. Need to fix several cost-related problems. The final type may be defined incorrectly because it is defined too early (we may end up with the wider type), the CommonCost should not be redefined in ExtractElements cost related calculations and the shuffle of the final insertelements vectors should be calculated as a cost of single vector permutations + costs of two vector permutations for other n-1 incoming vectors. Differential Revision: https://reviews.llvm.org/D106578	2021-07-26 07:14:03 -07:00
gbreynoo	40ac5ec4fa	[llvm-readobj] Display multiple function names for stack size entries The current implementation of displaying .stack_size information presumes that each entry represents a single function but this is not always the case. For example with the use of ICF multiple functions can be represented with the same code, meaning that the address found in a .stack_size entry corresponds to multiple function symbols. This change allows multiple function names to be displayed when appropriate. Differential Revision: https://reviews.llvm.org/D105884	2021-07-26 14:49:53 +01:00
Jay Foad	453349344b	[AMDGPU][GISel] Fix MMO for raw/struct buffer access with non-constant offset Codegen for the raw/struct buffer access intrinsics would update the offset in the MMO to reflect the combined offset, if it was known to be constant. If the combined offset was not known to be constant, or if there was an index, it would set the offset in the MMO to 0. This is unsafe because it makes it look like the access does not alias with another access with a fixed non-zero offset. Fix these cases by setting the pointer in the MMO to null, to reflect the fact that we do not have any known IR value pointer + constant offset for the access. D106284 did this for SelectionDAG. This is the corresponding fix for GlobalISel. Differential Revision: https://reviews.llvm.org/D106451	2021-07-26 14:27:30 +01:00
Jay Foad	eaeb8d4f70	[AMDGPU] Pre-commit global-isel test case for D106451 This test case shows the scheduler wrongly reordering two buffer accesses that might alias.	2021-07-26 14:27:30 +01:00
Jay Foad	7f0f3d6b7b	[AMDGPU] Fix MMO for raw/struct buffer access with non-constant offset Codegen for the raw/struct buffer access intrinsics would update the offset in the MMO to reflect the combined offset, if it was known to be constant. If the combined offset was not known to be constant, or if there was an index, it would set the offset in the MMO to 0. This is unsafe because it makes it look like the access does not alias with another access with a fixed non-zero offset. Fix these cases by setting the pointer in the MMO to null, to reflect the fact that we do not have any known IR value pointer + constant offset for the access. Differential Revision: https://reviews.llvm.org/D106284	2021-07-26 14:27:30 +01:00
David Green	1cb5f9c5d7	[ARM] Ensure correct regclass in distributing postinc The register class required for some MVE loads/stores is more constrained than the register we use when creating postinc. Make sure we constrain the register class to keep the code correct.	2021-07-26 14:26:38 +01:00
Tim Northover	c8cc09ffa5	AArch64: support i128 (& larger) returns in GlobalISel	2021-07-26 14:16:35 +01:00
Nikita Popov	845ad210b0	[SimplifyCFG] Improve store speculation check isSafeToSpeculateStore() looks for a preceding store to the same location to make sure that introducing a new store of the same value is safe. It currently bails on intervening mayHaveSideEffect() instructions. However, I believe just checking mayWriteToMemory() is sufficient there -- we just need to make sure that we know which value was stored, we don't care if we can unwind in the meantime. While looking into this, I started having some doubts about the correctness of the transform with regard to thread safety. While we don't try to hoist non-simple stores, I believe we also need to make sure that the preceding store is simple as well. Otherwise we could introduce a spurious non-atomic write after an atomic write -- under our memory model this would result in a subsequent undef atomic read, even if the second write stores the same value as the first. Example: https://alive2.llvm.org/ce/z/q_3YAL Differential Revision: https://reviews.llvm.org/D106742	2021-07-26 15:01:00 +02:00
Kerry McLaughlin	d43867c3c3	[SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths Fixes more casts to `<FixedVectorType>` for the cases where the instruction is an Insert/ExtractElementInst. For fixed-width, this part of truncateToMinimalBitWidths is tested by AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part of truncateToMinimalBitWidths which uses scalable vectors, but was unable to add one. The tests in type-shrinkage-insertelt.ll rely on scalarization to create extract element instructions for instance, which is not possible for scalable vectors. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106163	2021-07-26 13:44:51 +01:00
Alexey Bataev	7d6020c9b9	Revert "[SLP]Fix costs calculations." This reverts commit a053afed49897aa34e08287f91c5255efa4e5131 to fix buildbots.	2021-07-26 05:42:34 -07:00
Caroline Concatto	7e3be1e953	[AArch64][SVE] Remove vector_splice from AddedComplexity pattern The pattern for vector_splice with Index equal or bigger than zero was misplaced in the AddedComplexity = 1 pattern in the AArch64 tablegen file. This patch fixes it by removing vector_splice pattern from inside AddedComplexity = 1.	2021-07-26 13:35:51 +01:00
Alexey Bataev	d88801e0e6	[SLP]Fix costs calculations. Need to fix several cost-related problems. The final type may be defined incorrectly because it is defined too early (we may end up with the wider type), the CommonCost should not be redefined in ExtractElements cost related calculations and the shuffle of the final insertelements vectors should be calculated as a cost of single vector permutations + costs of two vector permutations for other n-1 incoming vectors. Differential Revision: https://reviews.llvm.org/D106578	2021-07-26 04:37:22 -07:00
Paul Walker	fc75021aa7	[NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsScalable properties. Differential Revision: https://reviews.llvm.org/D106750	2021-07-26 12:25:46 +01:00
Philipp Krones	d7917544a3	[Inliner] Make the CallPenalty configurable Tests with multiple benchmarks, like Embench [1], showed that the CallPenalty magic number has the most influence on inlining decisions when optimizing for size. On the other hand, there was no good default value for this parameter. Some benchmarks profited strongly from a reduced call penalty. One example is the picojpeg benchmark compiled for RISC-V, which got 6% smaller with a CallPenalty of 10 instead of 12. Other benchmarks increased in size, like matmult. This commit makes the compromise of turning the magic number constant of CallPenalty into a configurable value. This introduces the flag `--inline-call-penalty`. With that flag users can fine tune the inliner to their needs. The CallPenalty constant was also used for loops. This commit replaces the CallPenalty constant with a new LoopPenalty constant that is now used instead. This is a slimmed down version of https://reviews.llvm.org/D30899 [1]: https://github.com/embench/embench-iot Differential Revision: https://reviews.llvm.org/D105976	2021-07-26 12:07:49 +01:00
Florian Hahn	a0b07d2e54	[VPlan] Use stored value from recipes for interleave groups. Instead of getting the VPValue for the stored IR values through the current plan, use the stored value of the recipes directly. This way, the correct VPValues are used if the store recipes have been modified in the VPlan and the IR value is not correct any longer. This can happen, e.g. due to D105008.	2021-07-26 12:05:23 +01:00
Dylan Fleming	6f6b3d4f7a	[SVE] Add support for folding for select + masked loads Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0). Patch originally authored by @paulwalker-arm Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106376	2021-07-26 11:58:41 +01:00
Caroline Concatto	9af238dfc2	[SVE][AArch64] Improve code generation for vector_splice for Imm > 0 This patch implements vector_splice in tablegen for all cases when the Immediate is positive and lower than the known minimum value of a scalable vector. Vector_splice can be implemented using SVE instruction EXT. For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> EXT Vector_1, Vector_2, Imm // Vector_1 = B, C, D + Vector_2 = E Depends on D105633 Differential Revision: https://reviews.llvm.org/D106273	2021-07-26 11:45:46 +01:00
David Sherwood	fd6a38b569	Fix test failures caused by 0aff1798b5721d5f95d16f465b99d357012bb8d1	2021-07-26 11:40:26 +01:00
Caroline Concatto	d9b910d9d2	[AArch64][SVE] Improve code generation for vector_splice for Imm == -1 This patch implements vector_splice in tablegen for: a) when the immediate is equal to -1 (Imm==-1) and uses: INSR + LASTB For instance : @llvm.experimental.vector.splice(Vector_1, Vector_2, -1) @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -1) ==> <D, E, F, G> LASTB RegLast, Vector_1 // RegLast = D INSR Res, (Vector_1 >> 1), RegLast // Res = D + E, F, G Differential Revision: https://reviews.llvm.org/D105633	2021-07-26 11:25:01 +01:00
Simon Pilgrim	96fce2fe63	[X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets Splatting the lower xmm with vinsertf128 is at least as quick as vperm2f128, and a lot faster on some AMD targets. First step towards PR50053	2021-07-26 11:11:56 +01:00
Simon Pilgrim	f76b3abd62	[X86][SSE] Don't scrub address math from interleaved shuffle tests	2021-07-26 11:03:31 +01:00
Cullen Rhodes	f81ad3ab04	[AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc() Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D106635	2021-07-26 09:36:34 +00:00
David Sherwood	2d2e4a1b17	[Analysis] Add simple cost model for strict (in-order) reductions I have added a new FastMathFlags parameter to getArithmeticReductionCost to indicate what type of reduction we are performing: 1. Tree-wise. This is the typical fast-math reduction that involves continually splitting a vector up into halves and adding each half together until we get a scalar result. This is the default behaviour for integers, whereas for floating point we only do this if reassociation is allowed. 2. Ordered. This now allows us to estimate the cost of performing a strict vector reduction by treating it as a series of scalar operations in lane order. This is the case when FP reassociation is not permitted. For scalable vectors this is more difficult because at compile time we do not know how many lanes there are, and so we use the worst case maximum vscale value. I have also fixed getTypeBasedIntrinsicInstrCost to pass in the FastMathFlags, which meant fixing up some X86 tests where we always assumed the vector.reduce.fadd/mul intrinsics were 'fast'. New tests have been added here: Analysis/CostModel/AArch64/reduce-fadd.ll Analysis/CostModel/AArch64/sve-intrinsics.ll Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll Differential Revision: https://reviews.llvm.org/D105432	2021-07-26 10:26:06 +01:00
Fraser Cormack	2df597c58a	[SelectionDAG] Support scalable-vector splats in yet more cases This patch extends support for (scalable-vector) splats in the DAGCombiner via the `ISD::matchBinaryPredicate` function, which enables a variety of simple combines of constants. Users of this function may now have to distinguish between `BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing with this in-tree follows the approach added for `ISD::matchUnaryPredicate` implemented in D94501. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106575	2021-07-26 10:15:08 +01:00
Lang Hames	480ddba43e	[ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform. This allows ORC to execute code containing Objective-C and Swift classes and methods (provided that the language runtime is loaded into the executor).	2021-07-26 18:02:01 +10:00
Yuanfang Chen	0b2bb4e657	[Object] make SourceMgr available to MCContext during inline asm symbols collection Fixes PR51210.	2021-07-25 21:23:03 -07:00
Esme-Yi	b278eba445	[Debug-Info][llvm-dwarfdump] Don't use DW_FORM_data4/8 to encode the constants for DW_AT_data_member_location. Summary: In DWARF v3, DW_FORM_data4/8 in DW_AT_data_member_location are interpreted as location list pointers. Interpreting constants as pointers is not expected, so we use DW_FORM_udata to encode the constants. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D105687	2021-07-26 03:47:02 +00:00
Mehdi Amini	f99a775b8b	Revert "Build libSupport with -Werror=global-constructors (NFC)" This reverts commit 579cc9ad2e2db6c3f1670b9f42c2cfe67bc5722c. This breaks on Windows.	2021-07-26 03:08:26 +00:00
Mehdi Amini	77f87f1745	Build libSupport with -Werror=global-constructors (NFC) Ensure that libSupport does not carry any static global initializer. libSupport can be embedded in use cases where we don't want to load all cl::opt unless we want to parse the command line. ManagedStatic can be used to enable lazy-initialization of globals.	2021-07-26 03:04:31 +00:00
Esme-Yi	d0cd0162ac	[yaml2obj] Do not write the string table if there is no string entry. Summary: yaml2obj shouldn't create the string table that isn't needed - doing so wastes time and disk space. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D106420	2021-07-26 02:37:49 +00:00
Mehdi Amini	a60a1f2f04	Revert "Build libSupport with -Werror=global-constructors (NFC)" This reverts commit 5eb2e9aa64b7be7cd8ed7f36de19c2c9bdf1977c. This broke MacOS builds, needs to have a safer check guarding the flag addition.	2021-07-26 00:55:36 +00:00
Mehdi Amini	61041e8ca3	Build libSupport with -Werror=global-constructors (NFC) Ensure that libSupport does not carry any static global initializer. libSupport can be embedded in use cases where we don't want to load all cl::opt unless we want to parse the command line. ManagedStatic can be used to enable lazy-initialization of globals.	2021-07-26 00:21:09 +00:00
Mehdi Amini	ecdc653aaa	Remove the NotUnderValgrind caching flag The motivation for this caching wasn't clear, remove it in an effort to simplify the code and make libSupport free of global dynamic constructor. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D106206	2021-07-26 00:21:09 +00:00
Roman Lebedev	aaa4bd0e18	[SimplifyCFG] Fold branch to common dest: if branch is unpredictable, prefer to speculate This is consistent with the two other usages of prof md in this pass.	2021-07-26 02:57:19 +03:00
Roman Lebedev	519c4b9ca7	[SimplifyCFG] Don't speculatively execute BB[s] if they are predictably not taken Same as D106650, but for `FoldTwoEntryPHINode()` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D106717	2021-07-26 02:55:15 +03:00
Roman Lebedev	cc6967667d	[SimplifyCFG] Don't speculatively execute BB if it's predictably not taken If the branch isn't `unpredictable`, and it is predicted to not branch to the block we are considering speculatively executing, then it seems counter-productive to execute the code that is predicted not to be executed. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D106650	2021-07-26 02:55:14 +03:00
Roman Lebedev	ddc2eed16c	[NFC][SimplifyCFG] Add more negative tests for profmd-induced speculation avoidance	2021-07-26 02:55:08 +03:00
Stefan Gränitz	c0784ed26d	[docs] Update release notes to mention lli JIT engine switch	2021-07-25 23:58:43 +02:00
Nico Weber	ef8d46554d	Revert "[VPlan] Add recipe for first-order rec phis, make splicing explicit." Makes clang crash: https://reviews.llvm.org/D105008#2903350 This reverts commit d2a73fb44ea0b8c981e4b923f811f18793fc4770. Also revert a minor formatting follow-up: This reverts commit 82834a673246f27a541ffcc57e0eb65b008102ef.	2021-07-25 17:39:28 -04:00
Fangrui Song	56c255e563	[LangRef] Reorder two paragraphs for comdat so that IMAGE_COMDAT_SELECT_LARGEST refers to the correct example.	2021-07-25 12:53:14 -07:00
Simon Pilgrim	a499e7c2d0	[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI. Begin replacing individual getMemIntrinsicNode calls and setup (for X86ISD::VBROADCAST_LOAD + X86ISD::SUBV_BROADCAST_LOAD opcodes) with this getBROADCAST_LOAD helper.	2021-07-25 20:37:58 +01:00
Joseph Huber	daea2dd14a	[OpenMP] Introduce RAII to protect certain RTL calls from DCE This patch introduces a new RAII struct that will temporarily make an OpenMP RTL function have external linkage. This is done before the attributor is invoked to prevent it from incorrectly removing some function definitions that we will use later. For example, if we determine all calls to one function are dead, because it has internal linkage it can safely be removed. Later when we try to get an instance to that function to modify the source using `getOrCreateRuntimeFunction` we will then get an empty declaration for that function that won't be defined anywhere. This patch prevents this from occurring. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106707	2021-07-25 14:15:47 -04:00
Roman Lebedev	549bfa55a4	[NFC][Codegen][X86] Improve test coverage for insertions into XMM vector	2021-07-25 21:08:03 +03:00
Kyungwoo Lee	e13913351b	[AArch64] Fix Local Deallocation for Homogeneous Prolog/Epilog The stack adjustment for local deallocation was incorrectly ported. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D106760	2021-07-25 10:51:11 -07:00
Simon Pilgrim	4b99bcdef6	[X86][SSE] LowerRotate - perform modulo on the amount splat source directly. If the rotation amount is a known splat, perform the modulo on the splat source, and then perform the splat. That way the amount-extension performed later by LowerScalarVariableShift can fold the splats away without any multiple-use issues. Fixes one of the concerns raised on D104156	2021-07-25 17:30:32 +01:00
Nikita Popov	ed5a269ff1	[Attributes] Clean up handling of UB implying attributes (NFC) Rather than adding methods for dropping these attributes in various places, add a function that returns an AttrBuilder with these attributes, which can then be used with existing methods for dropping attributes. This is with an eye on D104641, which also needs to drop them from returns, not just parameters. Also be more explicit about the semantics of the method in the documentation. Refer to UB rather than Undef, which is what this is actually about.	2021-07-25 18:21:13 +02:00
Nikita Popov	adc5107b73	[Attributes] Remove nonnull from UB-implying attributes From LangRef: > if the parameter or return pointer is null, poison value is > returned or passed instead. The nonnull attribute should be > combined with the noundef attribute to ensure a pointer is not > null or otherwise the behavior is undefined. Dropping noundef is sufficient to prevent UB. Including nonnull in this method just muddies the semantics.	2021-07-25 18:07:31 +02:00
Simon Pilgrim	4c407a3b35	Revert rG939291041bb35b8088e3b61be2b8b3bc950f64a7 "[AMDGPU] Regenerate wave32.ll test checks" This still breaks buildbots	2021-07-25 15:59:26 +01:00
Nico Weber	e2e1559a48	[JITLink][RISCV] Run new test from 0ad562b48 only if the RISCV backend is enabled	2021-07-25 10:47:26 -04:00


219170 Commits