llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Martin Storsjö	b44324fc5c	[COFF] Fix ARM and ARM64 REL32 relocations to be relative to the end of the relocation This matches how they are defined on X86. This should fix the relative lookup tables pass for COFF, allowing it to be reenabled. Differential Revision: https://reviews.llvm.org/D102217	2021-05-12 09:53:43 +03:00
Vitaly Buka	79d2416a46	[symbolizer] Fix leak after D96883	2021-05-11 22:51:36 -07:00
Qiu Chaofan	3be56445eb	[VectorCombine] Restrict single-element-store index to inbounds constant Vector single element update optimization landed in 2db4979, but its scope needs restriction. This patch restricts the index to inbounds and requires the vector to be fixed-size. In the future, we may use value tracking to relax the constant restrictions. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D102146	2021-05-12 13:18:20 +08:00
Congzhe Cao	6e3a5acf29	[LoopInterchange] Handle lcssa PHIs with multiple predecessors This is a bugfix in the transformation phase. If the original outer loop header branches to both the inner loop (header) and the outer loop latch, and if there is an lcssa PHI node outside the loop nest, then after interchange the new outer latch will have an lcssa PHI node inserted which has two predecessors, i.e., the original outer header and the original outer latch. Currently the transformation assumes it has only one predecessor (the original outer latch) and crashes, since the inserted lcssa PHI node does not take both predecessors as incoming BBs. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D100792	2021-05-11 21:30:54 -04:00
Matt Arsenault	836042a211	AMDGPU: Fix SILoadStoreOptimizer for gfx90a This was hardcoding the register class to use for the newly created pointer registers, violating the aligned VGPR requirement.	2021-05-11 21:26:43 -04:00
Matt Arsenault	71baa5ae09	GlobalISel: Don't hardcode varargs=false in resultsCompatible	2021-05-11 20:22:06 -04:00
Matt Arsenault	a97e83f7fc	AMDGPU: Fix assert on constant load from addrspacecasted pointer This was trying to create a bitcast between different address spaces.	2021-05-11 20:12:20 -04:00
Matt Arsenault	b97c2a8a8f	GlobalISel: Make constant fields const	2021-05-11 20:10:55 -04:00
Matt Arsenault	d8bc7ef86e	GlobalISel: Split ValueHandler into assignment and emission classes Currently the ValueHandler handles both selecting the type and location for arguments, as well as inserting instructions needed to handle them. Split this so that the determination of the argument handling is independent of the function state. Currently the checks for tail call compatibility do not follow the full assignment logic, so it misses cases where arguments require nontrivial legalization. This should help avoid targets ending up in a buggy state where the argument evaluation may change in different contexts.	2021-05-11 19:50:12 -04:00
Matt Arsenault	e77e0d934c	GlobalISel: Move AArch64 AssignFnVarArg to base class We can handle the distinction easily enough in the generic code, and this makes it easier to abstract the selection of type/location from the code to insert code.	2021-05-11 19:50:12 -04:00
Jordan Rupprecht	f19959f1c4	Revert "[GVN] Clobber partially aliased loads." This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
Lang Hames	705819c25d	[JITLink] Fix bogus format string.	2021-05-11 16:04:00 -07:00
Congzhe Cao	819056fa35	[LoopInterchange] Fix legality for triangular loops This is a bug fix in legality check. When we encounter triangular loops such as the following form: for (int i = 0; i < m; i++) for (int j = 0; j < i; j++), or for (int i = 0; i < m; i++) for (int j = 0; j*i < n; j++), we should not perform interchange since the number of executions of the loop body will be different before and after interchange, resulting in incorrect results. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D101305	2021-05-11 18:36:53 -04:00
Petr Hosek	2018822774	[Coverage] Support overriding compilation directory When making compilation relocatable, for example in distributed compilation scenarios, we want to set compilation dir to a relative value like `.` but this presents a problem when generating reports because if the file path is relative as well, for example `..`, you may end up writing files outside of the output directory. This change introduces a flag that allows overriding the compilation directory that's stored inside the profile with a different value that is absolute. Differential Revision: https://reviews.llvm.org/D100232	2021-05-11 15:26:45 -07:00
Lang Hames	9ef00d4d71	[JITLink][MachO/x86_64] Expose API for creating eh-frame fixing passes. These can be used to create eh-frame section fixing passes outside the usual linker pipeline, which can be useful for tests and tools that just want to verify or dump graphs.	2021-05-11 15:26:16 -07:00
Lang Hames	ab764e6d8c	[JITLink][x86-64] Add an x86_64 PointerSize constexpr. This can be used in place of magic '8' values in generic x86-64 utilities.	2021-05-11 15:26:15 -07:00
Lang Hames	a86e5116c1	[JITLink] Make LinkGraph debug dumps more readable. This commit reorders some fields and fixes the width of others to try to maintain more consistent columns. It also switches to long-hand scope and linkage names, since LinkGraph dumps aren't read often enough for single-character codes to be memorable.	2021-05-11 15:26:15 -07:00
Congzhe Cao	ec779e7f79	Revert "[LoopInterchange] Fix legality for triangular loops" This reverts commit 29342291d25b83da97e74d75004b177ba41114fc. The test case requires an assert build. Will add REQUIRES and re-commit.	2021-05-11 18:10:58 -04:00
Petr Hosek	71ea50f517	[llvm-cov] Support for v4 format in convert-for-testing v4 moves function records to a dedicated section so we need to write and read it separately. https://reviews.llvm.org/D100535	2021-05-11 14:41:55 -07:00
Evandro Menezes	40043ea4ca	[RISCV] Move instruction information into the RISCVII namespace (NFC) Move instruction attributes into the `RISCVII` namespace and add associated helper functions. Differential Revision: https://reviews.llvm.org/D102268	2021-05-11 16:32:42 -05:00
Nikita Popov	6c5d74ff9e	[InstCombine] Clean up one-hot merge optimization (NFC) Remove the requirement that the instruction is a BinaryOperator, make the predicate check more compact and use slightly more meaningful naming for the and operands.	2021-05-11 23:22:11 +02:00
Alex Orlov	5360a8fe94	Removed unnecessary introduction of semi-colons.	2021-05-12 00:46:00 +04:00
Austin Kerbow	15bcde0b51	[AMDGPU] Fix extra waitcnt being added with BUFFER_INVL2 The waitcnt pass would increment the number of vmem events for some buffer invalidates that were not handled by the pass. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D102252	2021-05-11 13:17:33 -07:00
Craig Topper	a735716218	[RISCV] Regenerate stepvector.ll. NFC It looks like the RV32 and RV64 prefixes were removed from the RUN lines while another patch was in review that added check lines that used them.	2021-05-11 13:04:57 -07:00
Albion Fung	64e5b1ce09	[PowerPC] Improve codegen for int-to-fp conversion of subword vector extract When an integer is converted into floating point in subword vector extract, it can be done in 2 instructions instead of the 3+ instructions it generates right now. This patch removes the unnecessary generation. Differential: https://reviews.llvm.org/D100604	2021-05-11 15:00:11 -05:00
Amara Emerson	0127b51db3	[AArch64][GlobalISel] Mark target generic instructions as HasNoSideEffects. One test needed updating because the newly side-effect-free instructions were now being DCE'd.	2021-05-11 12:38:53 -07:00
Roman Lebedev	c0202e5e23	[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): canonicalize to integer type This way we don't have to duplicate i32/f32 and i64/f64 entries, which was already forgotten to be done for a few tuples.	2021-05-11 21:35:58 +03:00
Fangrui Song	0ba4f918b6	[GlobalOpt] Remove heap SROA GlobalOpt implements a heap SROA (SROA for a malloc-allocated struct or array of structs) which is largely undertested (heap-sra-[1234].ll are basically the same test with very little difference) and does not trigger at all when bootstrapping clang (it only supports the case of one single store). The heap SROA implementation causes PR50027 (GEP is not properly handled; crash or miscompile). Just drop the implementation. I have deleted some obviously duplicated tests but kept `heap-sra-[12]{,-no-nullopt}.ll`. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102257	2021-05-11 11:34:37 -07:00
Amara Emerson	ee460561e0	[AArch64][GlobalISel] Support truncstorei8/i16 w/ combine to form truncating G_STOREs. This needs some tablegen changes so that we can actually import the patterns properly. Differential Revision: https://reviews.llvm.org/D102204	2021-05-11 11:33:03 -07:00
Fangrui Song	555e731f7d	[RISCV] Prefer to lower MC_GlobalAddress operands to .Lfoo$local Similar to X86 D73230 and AArch64 D101872 With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode, for default visibility external linkage non-ifunc-non-COMDAT definitions. For such dso_local definitions, variable access/taking the address of a function/calling a function will go through a local alias to avoid GOT/PLT. Reviewed By: jrtc27, luismarques Differential Revision: https://reviews.llvm.org/D101875	2021-05-11 11:29:45 -07:00
Eli Friedman	f0db96b22f	[ArgumentPromotion] Fix byval alignment handling. Make sure the alignment of the generated operations matches the alignment of the byval argument. Previously, we were just ignoring alignment and getting lucky. While I'm here, also delete the unnecessary "tail" handling. Passing a pointer to a byval argument to a "tail" call is UB, so rewriting to an alloca doesn't require any special handling. Differential Revision: https://reviews.llvm.org/D89819	2021-05-11 11:22:18 -07:00
Sam Powell	948b51f8f8	[TextAPI] Reformat llvm_unreachable message Change llvm_unreachable message from "Unknown llvm.MachO.PlatformKind enum" to "Unknown llvm::MachO::PlatformKind enum". Differential revision: https://reviews.llvm.org/D102250	2021-05-11 09:59:26 -07:00
Alan Phipps	0403b73f90	Reland "[Coverage] Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation" Originally landed in: 6400905a615282c83a2fc6e49e57ff716aa8b4de Reverted in: 668dccc396da4f593ac87c92dc0eb7bc983b5762 Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation groups. This change corrects the implementation for the branch coverage summary to do the same thing for branches that is done for lines and regions. That is, across function instantiations in an instantiation group, the maximum branch coverage found in any of those instantiations is returned, with the total number of branches being the same across instantiations. Differential Revision: https://reviews.llvm.org/D102193	2021-05-11 11:48:23 -05:00
Simon Pilgrim	a973cc5256	[X86][SSE] Add tests for permute(phaddw(phaddw(x,y),phaddw(z,w))) -> phaddw(phaddw(),phaddw()) folds. We currently only fold if NumEltsPerLane == 4	2021-05-11 17:47:10 +01:00
Craig Topper	ad9c0709f2	[RISCV] Use fractional LMULs for fixed length types smaller than riscv-v-vector-bits-min. My thought process is that if v2i64 is an LMUL=1 type then v2i32 should be an LMUL=1/2 type. We limit the fractional LMUL so that SEW=64 clips to LMUL=1, SEW=32 clips to LMUL=1/2, etc. This ensures there's always a fractional LMUL available to truncate a type. This does reduce the number of vsetvlis in some cases. Some tests increase vsetvlis because the best container type for a mask type is dependent on the LMUL+SEW that the mask was produced from, but you can't tell that from the type. I think this is something we need to solve in the machine IR when optimizing vsetvlis. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D101215	2021-05-11 09:42:48 -07:00
Roman Lebedev	99f6701720	[X86][Codegen] Shift amount mod: sh? i64 x, (32-y) --> sh? i64 x, -(y+32) I've seen this in the RawSpeed's BitPumpMSB*::push() hotpath, after fixing the buffer abstraction to a more sane one, when looking into a +5% runtime regression. I was hoping that this would fix it, but it does not look like it does. This seems to be at least not worse than the original pattern. But I'm actually mainly interested in the case where we already compute `(y+32)` (see last test), https://alive2.llvm.org/ce/z/ZCzJio Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D101944	2021-05-11 19:39:41 +03:00
Craig Topper	7e541465b7	[RISCV] Match trunc_vector_vl+sra_vl/srl_vl with splat shift amount to vnsra/vnsrl. Limited to splats because we would need to truncate the shift amount vector otherwise. I tried to do this with new ISD nodes and a DAG combine to avoid such a large pattern, but we don't form the splat until LegalizeDAG and need DAG combine to remove a scalable->fixed->scalable cast before it becomes visible to the shift node. By the time that happens we've already visited the truncate node and won't revisit it. I think I have an idea for how to improve i64 on RV32 that I'll save for a follow-up. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D102019	2021-05-11 09:29:31 -07:00
Alan Phipps	7436a81785	Revert "Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation" This reverts commit 6400905a615282c83a2fc6e49e57ff716aa8b4de.	2021-05-11 11:26:19 -05:00
Alan Phipps	f0f466bdea	Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation groups. This change corrects the implementation for the branch coverage summary to do the same thing for branches that is done for lines and regions. That is, across function instantiations in an instantiation group, the maximum branch coverage found in any of those instantiations is returned, with the total number of branches being the same across instantiations. Differential Revision: https://reviews.llvm.org/D102193	2021-05-11 10:42:40 -05:00
Roman Lebedev	46454a565e	[NFC][X86] Precommit another testcase for D101944	2021-05-11 18:34:43 +03:00
Steven Wu	5c8193b398	[IR][AutoUpgrade] Drop align attribute from void return types Since D87304, `align` has been an invalid attribute on non-pointer types, and the verifier rejects bitcode that has an invalid `align` attribute. The problem is that, before the change, DeadArgumentElimination could easily turn a pointer return type into a void return type without removing the `align` attribute. Teach AutoUpgrade to remove the invalid `align` attribute from return types to maintain bitcode compatibility. rdar://77022993 Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D102201	2021-05-11 08:23:55 -07:00
Tony Tye	c65c7c3cfc	[NFC][AMDGPU] Correct product name for gfx908 The product name for gfx908 is "AMD Instinct MI100 Accelerator". Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D102209	2021-05-11 15:17:04 +00:00
Congzhe Cao	8fbcfbcf27	[LoopInterchange] Fix legality for triangular loops This is a bug fix in legality check. When we encounter triangular loops such as the following form: for (int i = 0; i < m; i++) for (int j = 0; j < i; j++), or for (int i = 0; i < m; i++) for (int j = 0; j*i < n; j++), we should not perform interchange since the number of executions of the loop body will be different before and after interchange, resulting in incorrect results. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D101305	2021-05-11 11:00:46 -04:00
Aakanksha Patil	b4cc496168	Fix typo "Execpt" in comments Differential Revision: https://reviews.llvm.org/D101858	2021-05-11 10:47:01 -04:00
Paul C. Anagnostopoulos	afbc30f908	Revert "[TableGen] Make the NUL character invalid in .td files" At least one build uses a 'sed' that does not understand \x00. This reverts commit cf9647011c4f05e1eb4423c6637d84e2f26b2042.	2021-05-11 10:43:13 -04:00
Florian Hahn	bf0bfd4187	[VPlan] Register recipe for instr if the simplified value is recipe. If the simplified VPValue is a recipe, we need to register it for Instr, in case it needs to be recorded. The way this is handled in general may change soon, following some post-commit comments. This fixes PR50298.	2021-05-11 14:32:34 +01:00
Roman Lebedev	a7f61f4671	[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost() Now that getMemoryOpCost() correctly handles all the vector variants, we should no longer hand-roll our own version of it, but use it directly. The AVX512 variant probably needs a similar change, but there it is less obvious.	2021-05-11 16:28:00 +03:00
Paul C. Anagnostopoulos	e1a045d607	[TableGen] Make the NUL character invalid in .td files Differential Revision: https://reviews.llvm.org/D101923	2021-05-11 09:20:42 -04:00
Simon Pilgrim	bfc3f661a4	[X86] Replace repeated isa/cast<ConstantSDNode> calls with a single dyn_cast<>. NFCI. Noticed while looking at D101944	2021-05-11 14:18:45 +01:00
Simon Pilgrim	9fb9f6bd50	[X86][SSE] Replace foldShuffleOfHorizOp with generalized version in canonicalizeShuffleMaskWithHorizOp foldShuffleOfHorizOp only handled basic shufps(hop(x,y),hop(z,w)) folds - by moving this to canonicalizeShuffleMaskWithHorizOp we can work with more general/combined v4x32 shuffle masks, float/integer domains and support shuffle-of-packs as well. The next step will be to support 256/512-bit vector cases.	2021-05-11 14:18:45 +01:00
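
Note on 99f6701720 ([X86][Codegen] Shift amount mod): the rewrite of `(32-y)` into `-(y+32)` can be sanity-checked with a small model. The sketch below is not taken from the commit. It assumes the x86 behaviour that a 64-bit shift uses only the low 6 bits of its count, and the helper names (`shl64`, `lshr64`, `amounts_agree`) are hypothetical. Under that assumption the two amounts differ by exactly 64, so they select the same shift.

```python
MASK64 = (1 << 64) - 1  # keep results within 64 bits


def shl64(x: int, amount: int) -> int:
    """Model a 64-bit left shift whose count is taken modulo 64 (assumed x86 behaviour)."""
    return (x << (amount & 63)) & MASK64


def lshr64(x: int, amount: int) -> int:
    """Model a 64-bit logical right shift whose count is taken modulo 64."""
    return (x & MASK64) >> (amount & 63)


def amounts_agree(y: int) -> bool:
    """True if (32 - y) and -(y + 32) are the same shift count modulo 64."""
    return (32 - y) & 63 == (-(y + 32)) & 63


if __name__ == "__main__":
    x = 0x0123456789ABCDEF
    # The two counts differ by 64, so they agree modulo 64 for every y.
    assert all(amounts_agree(y) for y in range(-256, 257))
    # Consequently both forms produce the same shifted value.
    assert all(shl64(x, 32 - y) == shl64(x, -(y + 32)) for y in range(64))
    assert all(lshr64(x, 32 - y) == lshr64(x, -(y + 32)) for y in range(64))
    print("shift amounts agree modulo 64")
```

This models the target-level masking only. In LLVM IR a shift by an amount greater than or equal to the bit width yields poison, so the equivalence is a property of the masked machine shift, not of arbitrary IR shifts.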


215564 Commits