llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Eli Friedman	05a71b0a6d	[ScalarEvolution] Fix overflow in computeBECount. The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. To deal with this, use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. Differential Revision: https://reviews.llvm.org/D105216	2021-07-16 16:15:18 -07:00
Nemanja Ivanovic	b425bc6346	[PowerPC] Implement intrinsics for mtfsf[i] This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`). The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands. Reviewed By: #powerpc, qiucf Differential Revision: https://reviews.llvm.org/D105957	2021-07-16 16:26:11 -05:00
Lei Huang	a6ca36648b	[PowerPC] Implement XL compact math builtins Implement a subset of builtins required for compatiblilty with AIX XL compiler. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D105930	2021-07-16 13:21:13 -05:00
Joseph Huber	c2bfd1f7ef	[OpenMP] Add IDs to OpenMP remarks This patch adds unique idenfitiers to the existing OpenMP remarks. This makes it easier to identify the corresponding documentation for each remark that will be hosted in the OpenMP webpage. Depends on D105898 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105939	2021-07-16 14:07:03 -04:00
Guozhi Wei	7d6ba24baf	[X86FixupLEAs] Try again to transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 Similar optimization can also be applied to LEA/ADD sequence. The modifications to TwoAddressInstructionPass is to ensure the operands of ADD instruction has expected order (the dest register of LEA should be src register of ADD). Differential Revision: https://reviews.llvm.org/D104684	2021-07-16 10:16:03 -07:00
madhur13490	b3e3f87671	[NFC] Fix typo intrinisic Differential Revision: https://reviews.llvm.org/D106161	2021-07-16 21:45:11 +05:30
Matt Arsenault	5a8526607f	GlobalISel: Remove dead function	2021-07-16 08:59:25 -04:00
Simon Giesecke	d4960d7c98	Reformat files. Differential Revision: https://reviews.llvm.org/D105982	2021-07-16 07:39:21 +00:00
Mehdi Amini	7d809bb14e	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 07:38:16 +00:00
Mehdi Amini	b708f244c7	Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer" This reverts commit af9321739b20becf170e6bb5060b8d780e1dc8dd. Still some specific config broken in some way that requires more investigation.	2021-07-16 07:35:13 +00:00
Mehdi Amini	64ec18abb6	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 06:54:26 +00:00
Shilei Tian	08c004d674	[Attributor] Add support for compound assignment for ChangeStatus A common use of `ChangeStatus` is as follows: ``` ChangeStatus Changed = ChangeStatus::UNCHANGED; Changed \|= foo(); ``` where `foo` returns `ChangeStatus` as well. Currently `ChangeStatus` doesn't support compound assignment, we have to write as ``` Changed = Changed \| foo(); ``` which is not that convenient. This patch add the support for compound assignment for `ChangeStatus`. Compound assignment is usually implemented as a member function, and binary arithmetic operator is therefore implemented using compound assignment. However, unlike regular C++ class, enum class doesn't support member functions. As a result, they can only be implemented in the way shown in the patch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106109	2021-07-15 23:51:46 -04:00
Mehdi Amini	0fd38b8415	Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer" This reverts commit 42f588f39c5ce6f521e3709b8871d1fdd076292f. Broke some buildbots	2021-07-16 03:46:53 +00:00
Mehdi Amini	a9a8a9a361	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 03:33:20 +00:00
Matt Arsenault	ef17052770	GlobalISel: Surface offsets parameter from ComputeValueVTs	2021-07-15 19:11:40 -04:00
Matt Arsenault	240dff7427	GlobalISel: Track argument pointeriness with arg flags Since we're still building on top of the MVT based infrastructure, we need to track the pointer type/address space on the side so we can end up with the correct pointer LLTs when interpreting CCValAssigns.	2021-07-15 19:11:40 -04:00
Victor Huang	61ce66a632	[PowerPC] Add PowerPC population count, reversed load and store related builtins and instrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and instrisics for population count, reversed load and store related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D106021	2021-07-15 17:23:56 -05:00
Amara Emerson	e11b55a90a	GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI. This adds some level of type safety, allows helper functions to be added for specific opcodes for free, and also allows us to succinctly check for class membership with the usual dyn_cast/isa/cast functions. To start off with, add variants for the different load/store operations with some places using it. Differential Revision: https://reviews.llvm.org/D105751	2021-07-15 15:21:57 -07:00
Harald van Dijk	f675df37ba	[X86] Fix handling of maskmovdqu in X32 The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit variant, the former using EDI, the latter RDI, but the use of the register is implicit. In 64-bit mode, a 0x67 prefix can be used to get the version using EDI, but there is no way to express this in assembly in a single instruction, the only way is with an explicit addr32. This change adds support for the instruction. When generating assembly text, that explicit addr32 will be added. When not generating assembly text, it will be kept as a single instruction and will be emitted with that 0x67 prefix. When parsing assembly text, it will be re-parsed as ADDR32 followed by MASKMOVDQU64, which still results in the correct bytes when converted to machine code. The same applies to vmaskmovdqu as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103427	2021-07-15 22:56:08 +01:00
Artem Belevich	0226799a56	[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction. That should allow clang to compile mma.h from CUDA-11.3. Differential Revision: https://reviews.llvm.org/D105384	2021-07-15 12:02:09 -07:00
Philip Reames	a74c4e37ae	[unittest] Exercise SCEV's udiv and udiv ceiling routines The ceiling variant was recently added (due to the work towards D105216), and we're spending a lot of time trying to find optimizations for the expression. This patch brute forces the space of i8 unsigned divides and checks that we get a correct (well consistent with APInt) result for both udiv and udiv ceiling. (This is basically what I've been doing locally in a hand rolled C++ program, and I realized there no good reason not to check it in as a unit test which directly exercises the logic on constants.) Differential Revision: https://reviews.llvm.org/D106083	2021-07-15 11:55:00 -07:00
Quinn Pham	ba35dd5a19	[PowerPC] Fix popcntb XL Compat Builtin for 32bit This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D105360	2021-07-15 13:19:47 -05:00
Nikita Popov	d58a8fbeab	[IR] Add elementtype attribute This implements the elementtype attribute specified in D105407. It just adds the attribute and the specified verifier rules, but doesn't yet make use of it anywhere. Differential Revision: https://reviews.llvm.org/D106008	2021-07-15 18:04:26 +02:00
Nikita Popov	929097793e	[AsmParser] Unify parsing of attributes Continuing on from D105780, this should be the last major bit of attribute cleanup. Currently, LLParser implements attribute parsing for functions, parameters and returns separately, enumerating all supported (and unsupported) attributes each time. This patch extracts the common parsing logic, and performs a check afterwards whether the attribute is valid in the given position. Parameters and returns are handled together, while function attributes need slightly different logic to support attribute groups. Differential Revision: https://reviews.llvm.org/D105938	2021-07-15 17:51:11 +02:00
Simon Pilgrim	87679ebe78	[TTI] Consistently make getMinVectorRegisterBitWidth() methods const. NFCI. The underlying getMinVectorRegisterBitWidth() methods are const, but it was missed in a couple of TargetTransformInfo wrappers. Noticed while working on D103925	2021-07-15 13:27:55 +01:00
Tony Tye	57b2fbab2e	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106034	2021-07-15 06:46:27 +00:00
Chuanqi Xu	ca13ea7edf	[Coroutines] Run coroutine passes by default This patch make coroutine passes run by default in LLVM pipeline. Now the clang and opt could handle IR inputs containing coroutine intrinsics without special options. It should be fine. On the one hand, the coroutine passes seems to be stable since there are already many projects using coroutine feature. On the other hand, the coroutine passes should do nothing for IR who doesn't contain coroutine intrinsic. Test Plan: check-llvm Reviewed by: lxfind, aeubanks Differential Revision: https://reviews.llvm.org/D105877	2021-07-15 14:33:40 +08:00
Kuter Dinel	b86b597e6c	[Attributor] AACallEdges, Add a way to ask nonasm unknown callees This patch adds a feature to AACallEdges AbstractAttribute that allows users to ask if there is a unknown callee that isn't a inline assembly. This feature is needed by some of it's users. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105992	2021-07-15 06:10:42 +03:00
Kai Luo	bb52bc77a5	[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614	2021-07-15 01:12:09 +00:00
Thomas Lively	3c50e4a7a7	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Stanislav Mekhanoshin	1dae83cabb	[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization Any def of EXEC prevents rematerialization of any VOP instruction because of the physreg use. Create a callback to check if the physreg use can be ingored to allow rematerialization. Differential Revision: https://reviews.llvm.org/D105836	2021-07-14 13:03:58 -07:00
Eli Friedman	0af449d2a7	[SelectionDAG] Add an overload of getStepVector that assumes step 1. This is mostly a minor convenience, but the pattern seems frequent enough to be worthwhile (and we'll probably add more uses in the future). Differential Revision: https://reviews.llvm.org/D105850	2021-07-14 11:37:01 -07:00
Thomas Lively	cf44692539	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Djordje Todorovic	c793732c01	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) variables assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with the respect to that. This first patch performs bacward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D105279	2021-07-14 04:29:42 -07:00
Arthur Eubanks	78dcef41e1	[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch To help with debugging non-trivial unswitching issues. Don't care about the legacy pass, nobody is using it. If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't default to the empty constructor for the pass params. We should still let the parser take care of it in case the parser has its own defaults. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D105933	2021-07-13 16:09:42 -07:00
Matt Arsenault	74be1319be	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will be needed to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQB is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Victor Huang	6d07c374d0	[PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and instrisics for compare and multiply related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D102875	2021-07-13 16:55:09 -05:00
Philip Reames	d721cdd01e	[ScalarEvolution] Fix overflow when computing max trip counts This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me. The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.	2021-07-13 10:01:10 -07:00
Guillaume Chatelet	fb0a7c4525	Revert "[llvm] Add enum iteration to Sequence" This reverts commit a006af5d6ec6280034ae4249f6d2266d726ccef4.	2021-07-13 16:44:42 +00:00
Guillaume Chatelet	79316cfa46	[llvm] Add enum iteration to Sequence This patch allows iterating typed enum via the ADT/Sequence utility. Differential Revision: https://reviews.llvm.org/D103900	2021-07-13 16:22:19 +00:00
Albion Fung	6272d2fc4a	[PowerPC] Fix L[D\|W]ARX Implementation LDARX and LWARX sometimes gets optimized out by the compiler when it is critical to the correctness of the code. This inline asm generation ensures that it preserved. Differential Revision: https://reviews.llvm.org/D105754	2021-07-13 11:02:07 -05:00
Matt Arsenault	b1584af557	GlobalISel: Remove getIntrinsicID utility function This is redundant with a method directly on MachineInstr	2021-07-13 11:04:10 -04:00
Matt Arsenault	2398c72d1f	Mips/GlobalISel: Use more standard call lowering infrastructure This also fixes some missing implicit uses on call instructions, adds missing G_ASSERT_SEXT/ZEXT annotations, and some missing outgoing sext/zexts. This also fixes not respecting tablegen requested type promotions. This starts treating f64 passed in i32 GPRs as a type of custom assignment, which restores some previously XFAILed tests. This is due to getNumRegistersForCallingConv returns a static value, but in this case it is context dependent on other arguments. Most of the ugliness is reproducing a hack CC_MipsO32 uses in SelectionDAG. CC_MipsO32 depends on a bunch of vectors populated from the original IR argument types in MipsCCState. The way this ends up working in GlobalISel is it only ends up inspecting the most recently added vector element. I'm pretty sure there are cleaner ways to do this, but this seemed easier than fixing up the current DAG handling. This is another case where it would be easier of the CCAssignFns were passed the original type instead of only the pre-legalized ones. There's still a lot of junk here that shouldn't be necessary. This also likely breaks big endian handling, but it wasn't complete/tested anyway since the IRTranslator gives up on big endian targets.	2021-07-13 11:04:10 -04:00
Hafiz Abid Qadeer	5ad78e6343	[AMDGPU] Handle s_branch to another section. Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label. Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned. This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get an crash for this scenario. The crash is fixed now but the we still get an undefined label error. Jumps to other section can arise with hold/cold splitting. A patch to handle the relocation in lld will follow shortly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D105760	2021-07-13 12:17:47 +01:00
Jeroen Dobbelaere	9fe5660816	[remangleIntrinsicFunction] Detect and resolve name clash It is possible that the remangled name for an intrinsic already exists with a different (and wrong) prototype within the module. As the bitcode reader keeps both versions of all remangled intrinsics around for a longer time, this can result in a crash, as can be seen in https://bugs.llvm.org/show_bug.cgi?id=50923 This patch makes 'remangleIntrinsicFunction' aware of this situation. When it is detected, it moves the version with the wrong prototype to a different name. That version will be removed anyway once the module is completely loaded. With thanks to @asbirlea for reporting this issue when trying out an lto build with the full restrict patches, and @efriedma for suggesting a sane resolution mechanism. Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D105118	2021-07-13 11:21:12 +02:00
Nikita Popov	495f2550b0	[Attributes] Determine attribute properties from TableGen data Continuing from D105763, this allows placing certain properties about attributes in the TableGen definition. In particular, we store whether an attribute applies to fn/param/ret (or a combination thereof). This information is used by the Verifier, as well as the ForceFunctionAttrs pass. I also plan to use this in LLParser, which also duplicates info on which attributes are valid where. This keeps metadata about attributes in one place, and makes it more likely that it stays in sync, rather than in various functions spread across the codebase. Differential Revision: https://reviews.llvm.org/D105780	2021-07-12 22:13:38 +02:00
Nikita Popov	2812298c44	[Attributes] Replace doesAttrKindHaveArgument() (NFC) This is now the same as isIntAttrKind(), so use that instead, as it does not require manual maintenance. The naming is also more accurate in that both int and type attributes have an argument, but this method was only targeting int attributes. I initially wanted to tighten the AttrBuilder assertion, but we have some in-tree uses that would violate it.	2021-07-12 21:57:26 +02:00
Nikita Popov	4966784718	[Attributes] Assert correct attribute constructor is used (NFCI) Assert that enum/int/type attributes go through the constructor they are supposed to use. To make sure this can't happen via invalid bitcode, explicitly verify that the attribute kind if correct there.	2021-07-12 21:11:59 +02:00
Nikita Popov	28e27a194e	[Attributes] Make type attribute handling more generic (NFCI) Followup to D105658 to make AttrBuilder automatically work with new type attributes. TableGen is tweaked to emit First/LastTypeAttr markers, based on which we can handle type attributes programmatically. Differential Revision: https://reviews.llvm.org/D105763	2021-07-12 20:49:38 +02:00
Thomas Lively	eb7eabd7be	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00

1 2 3 4 5 ...

45469 Commits