llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Stuart Brady	1df98c77f3	[demangler] Fix demangling of 'half' Demangle 'Dh' as 'half' (as per GCC), and not 'decimal16' (which doesn't make sense, as there is no IEEE 754 decimal16 format). The Itanium C++ ABI specification describes 'Dh' as: > IEEE 754r half-precision floating point (16 bits) (https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin) Reviewed By: ldionne, jyknight Differential Revision: https://reviews.llvm.org/D103833	2021-07-19 21:21:34 +01:00
Tony Tye	e4ba84ff98	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x45 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106249	2021-07-19 20:17:35 +00:00
Petr Hosek	1aff1e2660	[InstrProfiling] Use weak alias for bias variable We need the compiler generated variable to override the weak symbol of the same name inside the profile runtime, but using LinkOnceODRLinkage results in weak symbol being emitted in which case the symbol selected by the linker is going to depend on the order of inputs which can be fragile. This change replaces the use of weak definition inside the runtime with a weak alias. We place the compiler generated symbol inside a COMDAT group so dead definition can be garbage collected by the linker. We also disable the use of runtime counter relocation on Darwin since Mach-O doesn't support weak external references, but Darwin already uses a different continuous mode that relies on overmapping so runtime counter relocation isn't needed there. Differential Revision: https://reviews.llvm.org/D105176	2021-07-19 12:23:51 -07:00
Artem Belevich	d635785a03	[MemCpyOpt] Enable memcpy optimizations unconditionally. The patch does not depend on the availability of the library functions for memcpy/memset as it operates on LLVM intrinsics. The optimizations are useful on the targets that have these functions disabled (e.g. NVPTX & AMDGPU). Differential Revision: https://reviews.llvm.org/D104801	2021-07-19 11:58:02 -07:00
Haowei Wu	5403a31116	[ifs][elfabi] Merge llvm-ifs/elfabi tools This change merges llvm-elfabi and llvm-ifs tools. Differential Revision: https://reviews.llvm.org/D100139	2021-07-19 11:23:19 -07:00
Haowei Wu	bcce63dd25	[ifs] Prepare llvm-ifs for elfabi/ifs merging. This diff changes llvm-ifs to use unified IFS file format and perform other renaming changes in preparation for the merging between elfabi/ifs. Differential Revision: https://reviews.llvm.org/D99810	2021-07-19 11:23:00 -07:00
Haowei Wu	db5e2f303b	[elfabi] Prepare elfabi/ifs merging. This change implements unified text stub format and command line interface proposed in the elfabi/ifs merge plan. Differential Revision: https://reviews.llvm.org/D99399	2021-07-19 11:22:43 -07:00
Amy Huang	f2ec69fb59	Revert "[llvm][sve] Lowering for VLS truncating stores" because it causes a seg fault (see https://reviews.llvm.org/D104471). This reverts commit c305557acdaad453e32309d575fe9c6c7090c099.	2021-07-19 11:03:33 -07:00
Amara Emerson	49ad61c372	[GlobalISel] Fix load-or combine moving loads across potential aliasing stores. Although this combine checks that there's no load folding barriers between the loads that it's trying to merge, it was inserting the load at the MIRBuilder's default insertion point, which is the G_OR use inst. This was causing a miscompile in the test suite's SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2 Differential Revision: https://reviews.llvm.org/D106251	2021-07-19 10:23:23 -07:00
Wouter van Oortmerssen	8a42745952	[WebAssembly] Support R_WASM_MEMORY_ADDR_TLS_SLEB64 for wasm64 Also fixed TLS tests swapping addr & value in store op Differential Revision: https://reviews.llvm.org/D106096	2021-07-19 10:22:43 -07:00
Victor Campos	33327c4fc5	[NewPM] Fix wrong perfect forwardings Some template functions were missing '&&' in function arguments, therefore these were always taken by value after template instantiation. This patch adds the double ampersand to introduce proper perfect forwarding. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D106148	2021-07-19 17:21:32 +01:00
Simon Pilgrim	bcad7af28b	[ISD] Add disclaimer comments to AssertSext/Zext/Align opcodes about poison values As encountered on D106053, we need to be very explicit that the Assertion nodes don't hold true for a poison value (or for specific poisoned vector elements). Differential Revision: https://reviews.llvm.org/D106257	2021-07-19 17:15:28 +01:00
maekawatoshiki	10a3d40457	[LICM] Create LoopNest Invariant Code Motion (LNICM) pass This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works. Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D104180	2021-07-20 00:31:18 +09:00
Matt Arsenault	9fc3416fcc	GlobalISel: Preserve LLT when bitcasting loads and stores This also avoids improperly legalizing some truncating vector stores.	2021-07-19 11:30:14 -04:00
Jeremy Morse	6df259fb43	[InstrRef][X86] Drop debug instruction numbers from x87 instructions Avoid a crash when using instruction referencing if x87 floating point instructions are used. These instructions are significantly mutated when they're rewritten from referring to registers, to referring to floating-point-stack positions. As a result, their operands are re-ordered, and (InstrRef) LiveDebugValues asserts when it sees a DBG_INSTR_REF referring to a non-reg non-def register operand. To fix this, drop the instruction numbers, and thus variable locations. This patch adds a helper utility to do that. Dropping the variable locations is sub-optimal, but applying DBG_VALUEs to the $fp0 and similar registers is dropped on emission too. It seems we've never done well at describing variables that live in x87 registers, at all. Differential Revision: https://reviews.llvm.org/D105657	2021-07-19 15:08:27 +01:00
Kazu Hirata	297f95d1cb	[CodeGen] Remove isNON_TRUNCStore and isTRUNCStore (NFC) The last use of isNON_TRUNCStore was removed on Oct 10, 2018 in commit 07acc992dc39edfccc5a4b773c3dcf8a5bf6d893. isTRUNCStore seems to be unused for at least 10 years.	2021-07-19 06:56:04 -07:00
Florian Mayer	c24e3effdd	Revert "[hwasan] Use stack safety analysis." This reverts commit 12268fe14a1a65d4b62f0b6e5beab46ba8501ae7.	2021-07-19 12:08:32 +01:00
Florian Mayer	b61613dac7	[hwasan] Use stack safety analysis. This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703	2021-07-19 11:54:44 +01:00
Florian Mayer	070bb38c2e	[NFC] [MTE] helper for stack tagging lifetimes. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D106135	2021-07-19 11:09:16 +01:00
Lang Hames	a020f7f14c	[ORC][ORC-RT] Introduce ORC-runtime based MachO-Platform. Adds support for MachO static initializers/deinitializers and eh-frame registration via the ORC runtime. This commit introduces cooperative support code into the ORC runtime and ORC LLVM libraries (especially the MachOPlatform class) to support macho runtime features for JIT'd code. This commit introduces support for static initializers, static destructors (via cxa_atexit interposition), and eh-frame registration. Near-future commits will add support for MachO native thread-local variables, and language runtime registration (e.g. for Objective-C and Swift). The llvm-jitlink tool is updated to use the ORC runtime where available, and regression tests for the new MachOPlatform support are added to compiler-rt. Notable changes on the ORC runtime side: 1. The new macho_platform.h / macho_platform.cpp files contain the bulk of the runtime-side support. This includes eh-frame registration; jit versions of dlopen, dlsym, and dlclose; a cxa_atexit interpose to record static destructors, and an '__orc_rt_macho_run_program' function that defines running a JIT'd MachO program in terms of the jit-dlopen/dlsym/dlclose functions. 2. Replaces JITTargetAddress (and casting operations) with ExecutorAddress (copied from LLVM) to improve type-safety of address management. 3. Adds serialization support for ExecutorAddress and unordered_map types to the runtime-side Simple Packed Serialization code. 4. Adds orc-runtime regression tests to ensure that static initializers and cxa-atexit interposes work as expected. Notable changes on the LLVM side: 1. The MachOPlatform class is updated to: 1.1. Load the ORC runtime into the ExecutionSession. 1.2. Set up standard aliases for macho-specific runtime functions. E.g. ___cxa_atexit -> ___orc_rt_macho_cxa_atexit. 1.3. Install the MachOPlatformPlugin to scrape LinkGraphs for information needed to support MachO features (e.g. eh-frames, mod-inits), and communicate this information to the runtime. 1.4. Provide entry-points that the runtime can call to request initializers, perform symbol lookup, and request deinitializers (the latter is implemented as an empty placeholder as macho object deinits are rarely used). 1.5. Create a MachO header object for each JITDylib (defining the __mh_header and __dso_handle symbols). 2. The llvm-jitlink tool (and llvm-jitlink-executor) are updated to use the runtime when available. 3. A `lookupInitSymbolsAsync` method is added to the Platform base class. This can be used to issue an async lookup for initializer symbols. The existing `lookupInitSymbols` method is retained (the GenericIRPlatform code is still using it), but is deprecated and will be removed soon. 4. JIT-dispatch support code is added to ExecutorProcessControl. The JIT-dispatch system allows handlers in the JIT process to be associated with 'tag' symbols in the executor, and allows the executor to make remote procedure calls back to the JIT process (via __orc_rt_jit_dispatch) using those tags. The primary use case is ORC runtime code that needs to call back to handlers in orc::Platform subclasses. E.g. __orc_rt_macho_jit_dlopen calling back to MachOPlatform::rt_getInitializers using __orc_rt_macho_get_initializers_tag. (The system is generic however, and could be used by non-runtime code). The new ExecutorProcessControl::JITDispatchInfo struct provides the address (in the executor) of the jit-dispatch function and a jit-dispatch context object, and implementations of the dispatch function are added to SelfExecutorProcessControl and OrcRPCExecutorProcessControl. 5. OrcRPCTPCServer is updated to support JIT-dispatch calls over ORC-RPC. 6. Serialization support for StringMap is added to the LLVM-side Simple Packed Serialization code. 7. A JITLink::allocateBuffer operation is introduced to allocate writable memory attached to the graph. This is used by the MachO header synthesis code, and will be generically useful for other clients who want to create new graph content from scratch.	2021-07-19 19:50:16 +10:00
Eli Friedman	5fd061997c	[X86] Remove incorrect use of known bits in shuffle simplification. This reverts commit 2a419a0b9957ebac9e11e4b43bc9fbe42a9207df. The result of a shufflevector must not propagate poison from any element other than the one noted in the shuffle mask. The regressions outside of fptoui-may-overflow.ll can probably be recovered some other way; for example, using isGuaranteedNotToBePoison. See discussion on https://reviews.llvm.org/D106053 for more background. Differential Revision: https://reviews.llvm.org/D106222	2021-07-18 18:13:11 -07:00
Valentin Churavy	a01ce5e73a	Reland [Orc] Add verylazy example for C-bindings This patch relands https://reviews.llvm.org/D104799, but fixes the memory handling causing leak sanitizer failures. This reverts commit a56fe117e04f7d4b953a4226af412dad59425fb5.	2021-07-18 21:17:49 +02:00
Nikita Popov	b733414f37	[Cloning] Remove unused parameter from CloneAndPruneFunctionInto() (NFC)	2021-07-18 18:38:06 +02:00
Kazu Hirata	52cf78c743	[Analysis] Remove getLoopPackage (NFC) The last use was removed on Apr 28, 2014 in commit c5a3139ebd0d60617629da83c6c66261b66c75e5.	2021-07-18 08:16:29 -07:00
Valentin Churavy	814653b1a2	Revert "[Orc] Add verylazy example for C-bindings" Broke ASAN buildbot, will reland with fixes This reverts commit b5a6ad8c893a642bcb08ab81b251952c545405d9.	2021-07-18 16:21:37 +02:00
Simon Pilgrim	291305b767	[Orc] Remove unnecessary <string> include dependency from Orc headers. NFC. At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string. Move the include down to any cpp implementation where std::string is actually used.	2021-07-18 12:31:13 +01:00
Valentin Churavy	3f3bee8a2c	[Orc] Add verylazy example for C-bindings Still WIP, based on the Kaleidoscope/BuildingAJIT/Chapter4. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D104799	2021-07-18 12:07:16 +02:00
Nikita Popov	055fa09f2c	[IRBuilder] Deprecate CreateGEP() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type. Also remove the separate overload for a single index, as this is already covered by the ArrayRef overload.	2021-07-17 22:57:51 +02:00
Nikita Popov	0ad9d5a362	[IRBuilder] Deprecate CreateInBoundsGEP() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 21:27:16 +02:00
Nikita Popov	34f02d0dc5	[IRBuilder] Deprecate CreateStructGEP() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 18:48:22 +02:00
Nikita Popov	d5f69c8caa	[IRBuilder] Deprecate CreateConstGEP1_32() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 18:32:36 +02:00
Simon Pilgrim	cacf8c90b2	[DebugInfo] Remove unnecessary <string> include dependency from DebugInfo headers. NFC. At most these use the StringRef/Twine wrappers and don't have any implicit uses of std::string. Move the include down to any cpp implementation where std::string is actually used.	2021-07-17 16:56:06 +01:00
Nikita Popov	db4c85d791	[IRBuilder] Deprecate CreateConstInBoundsGEP1_64() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 17:07:48 +02:00
Nikita Popov	ffc0273ec4	[IRBuilder] Deprecate CreateConstGEP1_64() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 16:43:42 +02:00
Nikita Popov	923a0b4995	[IRBuilder] Deprecate CreateConstInBoundsGEP2_64() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 16:42:39 +02:00
Nikita Popov	0911b5db54	[IRBuilder] Deprecate CreateConstGEP2_64() without element type This API is incompatible with opaque pointers and deprecated in favor of the version that accepts an explicit element type.	2021-07-17 16:41:51 +02:00
Kazu Hirata	a471532d2a	[Analysis, CodeGen] Remove getHotSucc (NFC) These functions seem to be unused for at least 5 years.	2021-07-17 07:31:36 -07:00
Nikita Popov	dd3e030cca	[BPF] Use elementtype attribute for preserve.array/struct.index intrinsics Use the elementtype attribute introduced in D105407 for the llvm.preserve.array/struct.index intrinsics. It carries the element type of the GEP these intrinsics effectively encode. This patch: * Adds a verifier check that the attribute is required. * Adds it in the IRBuilder methods for these intrinsics. * Autoupgrades old bitcode without the attribute. * Updates the lowering code to use the attribute rather than the pointer element type. * Updates lots of tests to specify the attribute. * Adds -force-opaque-pointers to the intrinsic-array.ll test to demonstrate they work now. https://reviews.llvm.org/D106184	2021-07-17 11:09:18 +02:00
Lang Hames	7bb42fce5b	[ORC] Fix typo in declaration	2021-07-17 16:10:15 +10:00
Lang Hames	81a5e12cd4	[ORC] Remove LLVM-side MachO Platform runtime support. Support for this functionality is moving to the ORC runtime.	2021-07-17 14:25:31 +10:00
Kazu Hirata	54c34b405b	[Analysis] Remove isJoinDivergent (NFC) The last use was removed on Sep 30, 2020 in commit 05ae04c396519cca9ef50d3b9cafb0cd9c87d1d7.	2021-07-16 18:23:17 -07:00
Eli Friedman	05a71b0a6d	[ScalarEvolution] Fix overflow in computeBECount. The current implementation of computeBECount doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. To deal with this, use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. Differential Revision: https://reviews.llvm.org/D105216	2021-07-16 16:15:18 -07:00
Nemanja Ivanovic	b425bc6346	[PowerPC] Implement intrinsics for mtfsf[i] This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`). The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands. Reviewed By: #powerpc, qiucf Differential Revision: https://reviews.llvm.org/D105957	2021-07-16 16:26:11 -05:00
Lei Huang	a6ca36648b	[PowerPC] Implement XL compact math builtins Implement a subset of builtins required for compatibility with AIX XL compiler. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D105930	2021-07-16 13:21:13 -05:00
Joseph Huber	c2bfd1f7ef	[OpenMP] Add IDs to OpenMP remarks This patch adds unique identifiers to the existing OpenMP remarks. This makes it easier to identify the corresponding documentation for each remark that will be hosted in the OpenMP webpage. Depends on D105898 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105939	2021-07-16 14:07:03 -04:00
Guozhi Wei	7d6ba24baf	[X86FixupLEAs] Try again to transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 A similar optimization can also be applied to the LEA/ADD sequence. The modifications to TwoAddressInstructionPass ensure that the operands of the ADD instruction have the expected order (the dest register of LEA should be the src register of ADD). Differential Revision: https://reviews.llvm.org/D104684	2021-07-16 10:16:03 -07:00
madhur13490	b3e3f87671	[NFC] Fix typo intrinisic Differential Revision: https://reviews.llvm.org/D106161	2021-07-16 21:45:11 +05:30
Matt Arsenault	5a8526607f	GlobalISel: Remove dead function	2021-07-16 08:59:25 -04:00
Simon Giesecke	d4960d7c98	Reformat files. Differential Revision: https://reviews.llvm.org/D105982	2021-07-16 07:39:21 +00:00
Mehdi Amini	7d809bb14e	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 07:38:16 +00:00
Mehdi Amini	b708f244c7	Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer" This reverts commit af9321739b20becf170e6bb5060b8d780e1dc8dd. Still some specific config broken in some way that requires more investigation.	2021-07-16 07:35:13 +00:00
Mehdi Amini	64ec18abb6	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 06:54:26 +00:00
Shilei Tian	08c004d674	[Attributor] Add support for compound assignment for ChangeStatus A common use of `ChangeStatus` is as follows: ``` ChangeStatus Changed = ChangeStatus::UNCHANGED; Changed |= foo(); ``` where `foo` returns `ChangeStatus` as well. Currently `ChangeStatus` doesn't support compound assignment, so we have to write ``` Changed = Changed | foo(); ``` which is not that convenient. This patch adds support for compound assignment for `ChangeStatus`. Compound assignment is usually implemented as a member function, and the binary arithmetic operator is therefore implemented using compound assignment. However, unlike a regular C++ class, an enum class doesn't support member functions. As a result, they can only be implemented in the way shown in the patch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106109	2021-07-15 23:51:46 -04:00
Mehdi Amini	0fd38b8415	Revert "Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer" This reverts commit 42f588f39c5ce6f521e3709b8871d1fdd076292f. Broke some buildbots	2021-07-16 03:46:53 +00:00
Mehdi Amini	a9a8a9a361	Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer We can build it with -Werror=global-constructors now. This helps in situation where libSupport is embedded as a shared library, potential with dlopen/dlclose scenario, and when command-line parsing or other facilities may not be involved. Avoiding the implicit construction of these cl::opt can avoid double-registration issues and other kind of behavior. Reviewed By: lattner, jpienaar Differential Revision: https://reviews.llvm.org/D105959	2021-07-16 03:33:20 +00:00
Matt Arsenault	ef17052770	GlobalISel: Surface offsets parameter from ComputeValueVTs	2021-07-15 19:11:40 -04:00
Matt Arsenault	240dff7427	GlobalISel: Track argument pointeriness with arg flags Since we're still building on top of the MVT based infrastructure, we need to track the pointer type/address space on the side so we can end up with the correct pointer LLTs when interpreting CCValAssigns.	2021-07-15 19:11:40 -04:00
Victor Huang	61ce66a632	[PowerPC] Add PowerPC population count, reversed load and store related builtins and intrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and intrinsics for population count, reversed load and store related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D106021	2021-07-15 17:23:56 -05:00
Amara Emerson	e11b55a90a	GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI. This adds some level of type safety, allows helper functions to be added for specific opcodes for free, and also allows us to succinctly check for class membership with the usual dyn_cast/isa/cast functions. To start off with, add variants for the different load/store operations with some places using it. Differential Revision: https://reviews.llvm.org/D105751	2021-07-15 15:21:57 -07:00
Harald van Dijk	f675df37ba	[X86] Fix handling of maskmovdqu in X32 The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit variant, the former using EDI, the latter RDI, but the use of the register is implicit. In 64-bit mode, a 0x67 prefix can be used to get the version using EDI, but there is no way to express this in assembly in a single instruction, the only way is with an explicit addr32. This change adds support for the instruction. When generating assembly text, that explicit addr32 will be added. When not generating assembly text, it will be kept as a single instruction and will be emitted with that 0x67 prefix. When parsing assembly text, it will be re-parsed as ADDR32 followed by MASKMOVDQU64, which still results in the correct bytes when converted to machine code. The same applies to vmaskmovdqu as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103427	2021-07-15 22:56:08 +01:00
Artem Belevich	0226799a56	[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction. That should allow clang to compile mma.h from CUDA-11.3. Differential Revision: https://reviews.llvm.org/D105384	2021-07-15 12:02:09 -07:00
Philip Reames	a74c4e37ae	[unittest] Exercise SCEV's udiv and udiv ceiling routines The ceiling variant was recently added (due to the work towards D105216), and we're spending a lot of time trying to find optimizations for the expression. This patch brute forces the space of i8 unsigned divides and checks that we get a correct (well, consistent with APInt) result for both udiv and udiv ceiling. (This is basically what I've been doing locally in a hand rolled C++ program, and I realized there's no good reason not to check it in as a unit test which directly exercises the logic on constants.) Differential Revision: https://reviews.llvm.org/D106083	2021-07-15 11:55:00 -07:00
Quinn Pham	ba35dd5a19	[PowerPC] Fix popcntb XL Compat Builtin for 32bit This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D105360	2021-07-15 13:19:47 -05:00
Nikita Popov	d58a8fbeab	[IR] Add elementtype attribute This implements the elementtype attribute specified in D105407. It just adds the attribute and the specified verifier rules, but doesn't yet make use of it anywhere. Differential Revision: https://reviews.llvm.org/D106008	2021-07-15 18:04:26 +02:00
Nikita Popov	929097793e	[AsmParser] Unify parsing of attributes Continuing on from D105780, this should be the last major bit of attribute cleanup. Currently, LLParser implements attribute parsing for functions, parameters and returns separately, enumerating all supported (and unsupported) attributes each time. This patch extracts the common parsing logic, and performs a check afterwards whether the attribute is valid in the given position. Parameters and returns are handled together, while function attributes need slightly different logic to support attribute groups. Differential Revision: https://reviews.llvm.org/D105938	2021-07-15 17:51:11 +02:00
Simon Pilgrim	87679ebe78	[TTI] Consistently make getMinVectorRegisterBitWidth() methods const. NFCI. The underlying getMinVectorRegisterBitWidth() methods are const, but it was missed in a couple of TargetTransformInfo wrappers. Noticed while working on D103925	2021-07-15 13:27:55 +01:00
Tony Tye	57b2fbab2e	[AMDGPU] Reserve AMDGPU ELF e_flags machine 0x44 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D106034	2021-07-15 06:46:27 +00:00
Chuanqi Xu	ca13ea7edf	[Coroutines] Run coroutine passes by default This patch makes the coroutine passes run by default in the LLVM pipeline. Now clang and opt can handle IR inputs containing coroutine intrinsics without special options. It should be fine. On the one hand, the coroutine passes seem to be stable since there are already many projects using the coroutine feature. On the other hand, the coroutine passes should do nothing for IR that doesn't contain coroutine intrinsics. Test Plan: check-llvm Reviewed by: lxfind, aeubanks Differential Revision: https://reviews.llvm.org/D105877	2021-07-15 14:33:40 +08:00
Kuter Dinel	b86b597e6c	[Attributor] AACallEdges, Add a way to ask nonasm unknown callees This patch adds a feature to AACallEdges AbstractAttribute that allows users to ask if there is an unknown callee that isn't inline assembly. This feature is needed by some of its users. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105992	2021-07-15 06:10:42 +03:00
Kai Luo	bb52bc77a5	[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expands atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614	2021-07-15 01:12:09 +00:00
Thomas Lively	3c50e4a7a7	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Stanislav Mekhanoshin	1dae83cabb	[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization Any def of EXEC prevents rematerialization of any VOP instruction because of the physreg use. Create a callback to check if the physreg use can be ignored to allow rematerialization. Differential Revision: https://reviews.llvm.org/D105836	2021-07-14 13:03:58 -07:00
Eli Friedman	0af449d2a7	[SelectionDAG] Add an overload of getStepVector that assumes step 1. This is mostly a minor convenience, but the pattern seems frequent enough to be worthwhile (and we'll probably add more uses in the future). Differential Revision: https://reviews.llvm.org/D105850	2021-07-14 11:37:01 -07:00
Thomas Lively	cf44692539	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Djordje Todorovic	c793732c01	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) a variable's assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having a redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with respect to that. This first patch performs a backward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D105279	2021-07-14 04:29:42 -07:00
Arthur Eubanks	78dcef41e1	[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch To help with debugging non-trivial unswitching issues. Don't care about the legacy pass, nobody is using it. If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't default to the empty constructor for the pass params. We should still let the parser take care of it in case the parser has its own defaults. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D105933	2021-07-13 16:09:42 -07:00
Matt Arsenault	74be1319be	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will need to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQP is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Victor Huang	6d07c374d0	[PowerPC] Add PowerPC compare and multiply related builtins and intrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and intrinsics for compare and multiply related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D102875	2021-07-13 16:55:09 -05:00
Philip Reames	d721cdd01e	[ScalarEvolution] Fix overflow when computing max trip counts This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me. The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.	2021-07-13 10:01:10 -07:00
Guillaume Chatelet	fb0a7c4525	Revert "[llvm] Add enum iteration to Sequence" This reverts commit a006af5d6ec6280034ae4249f6d2266d726ccef4.	2021-07-13 16:44:42 +00:00
Guillaume Chatelet	79316cfa46	[llvm] Add enum iteration to Sequence This patch allows iterating typed enum via the ADT/Sequence utility. Differential Revision: https://reviews.llvm.org/D103900	2021-07-13 16:22:19 +00:00
Albion Fung	6272d2fc4a	[PowerPC] Fix L[D|W]ARX Implementation LDARX and LWARX sometimes get optimized out by the compiler when they are critical to the correctness of the code. This inline asm generation ensures that they are preserved. Differential Revision: https://reviews.llvm.org/D105754	2021-07-13 11:02:07 -05:00
Matt Arsenault	b1584af557	GlobalISel: Remove getIntrinsicID utility function This is redundant with a method directly on MachineInstr	2021-07-13 11:04:10 -04:00
Matt Arsenault	2398c72d1f	Mips/GlobalISel: Use more standard call lowering infrastructure This also fixes some missing implicit uses on call instructions, adds missing G_ASSERT_SEXT/ZEXT annotations, and some missing outgoing sext/zexts. This also fixes not respecting tablegen requested type promotions. This starts treating f64 passed in i32 GPRs as a type of custom assignment, which restores some previously XFAILed tests. This is because getNumRegistersForCallingConv returns a static value, but in this case it is context dependent on other arguments. Most of the ugliness is reproducing a hack CC_MipsO32 uses in SelectionDAG. CC_MipsO32 depends on a bunch of vectors populated from the original IR argument types in MipsCCState. The way this ends up working in GlobalISel is it only ends up inspecting the most recently added vector element. I'm pretty sure there are cleaner ways to do this, but this seemed easier than fixing up the current DAG handling. This is another case where it would be easier if the CCAssignFns were passed the original type instead of only the pre-legalized ones. There's still a lot of junk here that shouldn't be necessary. This also likely breaks big endian handling, but it wasn't complete/tested anyway since the IRTranslator gives up on big endian targets.	2021-07-13 11:04:10 -04:00
Hafiz Abid Qadeer	5ad78e6343	[AMDGPU] Handle s_branch to another section. Currently, if the target of an s_branch instruction is in another section, it will fail with an undefined label error, although in this case the label is not undefined but present in another section. This patch tries to handle this issue. So while handling the fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned. This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get a crash for this scenario. The crash is fixed now but we still get an undefined label error. Jumps to another section can arise with hot/cold splitting. A patch to handle the relocation in lld will follow shortly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D105760	2021-07-13 12:17:47 +01:00
Jeroen Dobbelaere	9fe5660816	[remangleIntrinsicFunction] Detect and resolve name clash It is possible that the remangled name for an intrinsic already exists with a different (and wrong) prototype within the module. As the bitcode reader keeps both versions of all remangled intrinsics around for a longer time, this can result in a crash, as can be seen in https://bugs.llvm.org/show_bug.cgi?id=50923 This patch makes 'remangleIntrinsicFunction' aware of this situation. When it is detected, it moves the version with the wrong prototype to a different name. That version will be removed anyway once the module is completely loaded. With thanks to @asbirlea for reporting this issue when trying out an lto build with the full restrict patches, and @efriedma for suggesting a sane resolution mechanism. Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D105118	2021-07-13 11:21:12 +02:00
Nikita Popov	495f2550b0	[Attributes] Determine attribute properties from TableGen data Continuing from D105763, this allows placing certain properties about attributes in the TableGen definition. In particular, we store whether an attribute applies to fn/param/ret (or a combination thereof). This information is used by the Verifier, as well as the ForceFunctionAttrs pass. I also plan to use this in LLParser, which also duplicates info on which attributes are valid where. This keeps metadata about attributes in one place, and makes it more likely that it stays in sync, rather than in various functions spread across the codebase. Differential Revision: https://reviews.llvm.org/D105780	2021-07-12 22:13:38 +02:00
Nikita Popov	2812298c44	[Attributes] Replace doesAttrKindHaveArgument() (NFC) This is now the same as isIntAttrKind(), so use that instead, as it does not require manual maintenance. The naming is also more accurate in that both int and type attributes have an argument, but this method was only targeting int attributes. I initially wanted to tighten the AttrBuilder assertion, but we have some in-tree uses that would violate it.	2021-07-12 21:57:26 +02:00
Nikita Popov	4966784718	[Attributes] Assert correct attribute constructor is used (NFCI) Assert that enum/int/type attributes go through the constructor they are supposed to use. To make sure this can't happen via invalid bitcode, explicitly verify that the attribute kind is correct there.	2021-07-12 21:11:59 +02:00
Nikita Popov	28e27a194e	[Attributes] Make type attribute handling more generic (NFCI) Followup to D105658 to make AttrBuilder automatically work with new type attributes. TableGen is tweaked to emit First/LastTypeAttr markers, based on which we can handle type attributes programmatically. Differential Revision: https://reviews.llvm.org/D105763	2021-07-12 20:49:38 +02:00
Thomas Lively	eb7eabd7be	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00
Jinsong Ji	41284718d3	[AIX] Emit version string in .file directive The AIX .file directive supports including a compiler version string. https://www.ibm.com/docs/en/aix/7.2?topic=ops-file-pseudo-op This patch adds that support so that it will be easier to identify the build compiler in objects. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D105743	2021-07-12 17:03:52 +00:00
Albion Fung	9cfb434c49	[PowerPC] Implement trap and conversion builtins for XL compatibility This patch implements trap and FP to and from double conversions. The builtins generate code that mirrors what is generated by the XL compiler. Intrinsics are named conventionally with builtin_ppc, but are aliased to provide the same builtin names as the XL compiler. Differential Revision: https://reviews.llvm.org/D103668	2021-07-12 11:04:17 -05:00
Simon Tatham	2dfe33c4d3	Remove unused parameter from parseMSInlineAsm. No implementation uses the `LocCookie` parameter at all. Errors are reported from inside that function by `llvm::SourceMgr`, and the instance of that at the clang call site arranges to pass the error messages back to a `ClangAsmParserCallback`, which is where the clang SourceLocation for the error is computed. (This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. But this particular change seems beneficial in its own right.) Reviewed By: miyuki Differential Revision: https://reviews.llvm.org/D105490	2021-07-12 15:07:03 +01:00
Cullen Rhodes	e25a1a8f41	[AArch64] Add target features for Armv9-A Scalable Matrix Extension (SME) First patch in a series adding MC layer support for the Arm Scalable Matrix Extension. This patch adds the following features: sme, sme-i64, sme-f64 The sme-i64 and sme-f64 flags are for the optional I16I64 and F64F64 features. If a target supports I16I64 then the following instructions are implemented: * 64-bit integer ADDHA and ADDVA variants (D105570). * SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, and USMOPS instructions that accumulate 16-bit integer outer products into 64-bit integer tiles. If a target supports F64F64 then the FMOPA and FMOPS instructions that accumulate double-precision floating-point outer products into double-precision tiles are implemented. Outer products are implemented in D105571. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D105569	2021-07-12 13:28:10 +00:00
David Truby	b1f94d70cf	[llvm][sve] Lowering for VLS truncating stores This adds custom lowering for truncating stores when operating on fixed length vectors in SVE. It also includes a DAG combine to fold extends followed by truncating stores into non-truncating stores in order to prevent this pattern appearing once truncating stores are supported. Currently truncating stores are not used in certain cases where the size of the vector is larger than the target vector width. Differential Revision: https://reviews.llvm.org/D104471	2021-07-12 11:14:17 +01:00
Johannes Doerfert	f4830fc58d	[Attributor][NFCI] Add UsedAssumedInformation to more interfaces As with other Attributor interfaces we often want to know if assumed information was used to answer a query. This is important if only known information is allowed or if known information can lead to an early fixpoint. The users have been adjusted but none of them utilizes the new information yet.	2021-07-11 19:18:03 -05:00
Kazu Hirata	e60451b029	[Analysis] Remove unused declaration isPotentiallyReachableFromMany (NFC)	2021-07-11 07:10:11 -07:00
David Green	81ae14cd79	[IfCvt] Don't use pristine register for counting liveins for predicated instructions. The test case here hits machine verifier problems. There are volatile long loads whose results do not get used, loading into two dead registers. IfCvt will predicate them and as it does will add implicit uses of the predicating registers due to thinking they are live in. As nothing has used the register, the machine verifier disagrees that they are really live and we end up with a failure. The registers come from Pristine regs that LivePhysRegs counts as live. This patch adds an addLiveInsNoPristines method to be used instead in IfCvt, so that only really live in regs need to be added as implicit operands. Differential Revision: https://reviews.llvm.org/D90965	2021-07-11 14:45:54 +01:00
David Blaikie	3f1f672468	Reapply "llvm-symbolizer: Fix "start file" to work with Split DWARF" Originally committed as 04c203e310bd3fb58e16c936c0200d680100526e Reverted in 768510632c5ddbf9438693d9c7db1903e39295ad due to the test failing when encountering windows directory separators. Fix the path separator platform issue with a FileCheck pattern {{[/\\]}} Original commit message: A followup to the feature added in 69da27c7496ea373567ce5121e6fe8613846e7a5 that added the optional "start file name" to match "start line" - but this didn't work with Split DWARF because of the need for the decl file number resolution code to refer back to the skeleton unit to find its .debug_line contribution. So this patch adds the necessary infrastructure to track the skeleton unit corresponding to a split full unit for the purpose of this lookup.	2021-07-10 18:50:55 -07:00
Kazu Hirata	a6c7669e73	[Analysis] Remove changeCondBranchToUnconditionalTo (NFC) The last use was removed on Jan 21, 2021 in commit 0895b836d74ed333468ddece2102140494eb33b6.	2021-07-10 17:31:43 -07:00
Johannes Doerfert	3839fcc5cf	[OpenMP] Detect SPMD compatible kernels and execute them as such In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the arguments of the __kmpc_target_init and deinit call to enable the mode. We also update the `<kernel>_exec_mode` flag to indicate to the runtime we changed the mode to SPMD. The code analysis is done interprocedurally by extending the AAKernelInfo abstract attribute to track SPMD compatibility as well. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 Differential Revision: https://reviews.llvm.org/D102307	2021-07-10 18:44:25 -05:00
Johannes Doerfert	51153424db	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 17:53:56 -05:00
Johannes Doerfert	510ec2aa11	[Attributor] Reorganize AAHeapToStack In order to simplify future extensions, e.g., the merge of AAHeapToShared into AAHeapToStack, we reorganize AAHeapToStack and the state we keep for each malloc-like call. The result is also less confusing as we only track malloc-like calls, not all calls. Further, we only perform the updates necessary for a malloc-like to argue it can go to the stack, e.g., we won't check all uses if we moved on to the "must-be-freed" argument. This patch also uses Attributor helpers to simplify the allocated size, alignment, and the potentially freed objects. Overall, this is mostly a reorganization and only the use of the optimistic helpers should change (=improve) the capabilities a bit. Differential Revision: https://reviews.llvm.org/D104993	2021-07-10 16:32:24 -05:00
Johannes Doerfert	0b9a71956c	[Attributor][FIX] Do not replace a value with a non-dominating instruction We have to be careful when we replace values to not use a non-dominating instruction. It makes sense that simplification offers those as "simplified values" but we can't manifest them in the IR without PHI nodes. In the future we should consider potentially adding those PHI nodes.	2021-07-10 16:09:30 -05:00
Johannes Doerfert	8ee9d51790	[Attributor] Use AAValueSimplify to simplify returned values We should use AAValueSimplify for all value simplification, however there was some leftover logic that predates AAValueSimplify in AAReturnedValues. This removes the AAReturnedValues part and provides a replacement by making AAValueSimplifyReturned strong enough to handle all previously covered cases. Further, this improves AAValueSimplifyCallSiteReturned to handle returned arguments. AAReturnedValues is now much easier and the collected returned values/instructions are now from the associated function only, making it much more sane. We also do not have the brittle logic anymore that looks for unresolved calls. Instead, we use AAValueSimplify to handle recursion. Useful code has been split into helper functions, e.g., an Attributor interface to get a simplified value. Differential Revision: https://reviews.llvm.org/D103860	2021-07-10 15:52:36 -05:00
Nico Weber	b314064dc7	Revert Attributor patch series Broke check-clang, see https://reviews.llvm.org/D102307#2869065 Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`	2021-07-10 16:15:55 -04:00
Nico Weber	3a309005df	Revert "llvm-symbolizer: Fix "start file" to work with Split DWARF" This reverts commit 04c203e310bd3fb58e16c936c0200d680100526e. Test fails on Windows.	2021-07-10 13:35:05 -04:00
Johannes Doerfert	90355478cc	[Attributor][NFCI] Add UsedAssumedInformation to more interfaces As with other Attributor interfaces we often want to know if assumed information was used to answer a query. This is important if only known information is allowed or if known information can lead to an early fixpoint. The users have been adjusted but none of them utilizes the new information yet.	2021-07-10 12:32:51 -05:00
Johannes Doerfert	df82045809	[OpenMP] Detect SPMD compatible kernels and execute them as such In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the arguments of the __kmpc_target_init and deinit call to enable the mode. We also update the `<kernel>_exec_mode` flag to indicate to the runtime we changed the mode to SPMD. The code analysis is done interprocedurally by extending the AAKernelInfo abstract attribute to track SPMD compatibility as well. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 Differential Revision: https://reviews.llvm.org/D102307	2021-07-10 12:32:51 -05:00
Johannes Doerfert	0e688881f2	[Attributor][FIX] Do not replace a value with a non-dominating instruction We have to be careful when we replace values to not use a non-dominating instruction. It makes sense that simplification offers those as "simplified values" but we can't manifest them in the IR without PHI nodes. In the future we should consider potentially adding those PHI nodes.	2021-07-10 12:32:50 -05:00
Johannes Doerfert	63e4735bba	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 12:32:50 -05:00
Johannes Doerfert	1e74d94d7c	[Attributor] Reorganize AAHeapToStack In order to simplify future extensions, e.g., the merge of AAHeapToShared into AAHeapToStack, we reorganize AAHeapToStack and the state we keep for each malloc-like call. The result is also less confusing as we only track malloc-like calls, not all calls. Further, we only perform the updates necessary for a malloc-like to argue it can go to the stack, e.g., we won't check all uses if we moved on to the "must-be-freed" argument. This patch also uses Attributor helpers to simplify the allocated size, alignment, and the potentially freed objects. Overall, this is mostly a reorganization and only the use of the optimistic helpers should change (=improve) the capabilities a bit. Differential Revision: https://reviews.llvm.org/D104993	2021-07-10 12:32:50 -05:00
Johannes Doerfert	330a2a1821	[Attributor] Use AAValueSimplify to simplify returned values We should use AAValueSimplify for all value simplification, however there was some leftover logic that predates AAValueSimplify in AAReturnedValues. This removes the AAReturnedValues part and provides a replacement by making AAValueSimplifyReturned strong enough to handle all previously covered cases. Further, this improves AAValueSimplifyCallSiteReturned to handle returned arguments. AAReturnedValues is now much easier and the collected returned values/instructions are now from the associated function only, making it much more sane. We also do not have the brittle logic anymore that looks for unresolved calls. Instead, we use AAValueSimplify to handle recursion. Useful code has been split into helper functions, e.g., an Attributor interface to get a simplified value. Differential Revision: https://reviews.llvm.org/D103860	2021-07-10 12:32:50 -05:00
Sander de Smalen	b4ab982f78	[LV] NFCI: Do cost comparison on InstructionCost directly. Instead of performing the isMoreProfitable() operation on InstructionCost::CostTy the operation is performed on InstructionCost directly, so that it can handle the case where one of the costs is Invalid. This patch also changes the CostTy to be int64_t, so that the type is wide enough to deal with multiplications with e.g. `unsigned MaxTripCount`. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D105113	2021-07-10 11:57:16 +01:00
Sander de Smalen	5d8e9991c3	[InstructionCost] Add saturation support. This patch makes the operations on InstructionCost saturate, so that when costs are accumulated they saturate to <max value>. One of the compelling reasons for wanting to have saturation support is because in various places, arbitrary values are used to represent a 'high' cost, but when accumulating the cost of some set of operations or a loop, overflow is not taken into account, which may lead to unexpected results. By defining the operations to saturate, we can express the cost of something 'very expensive' as InstructionCost::getMax(). Reviewed By: kparzysz, dmgreen Differential Revision: https://reviews.llvm.org/D105108	2021-07-10 11:28:42 +01:00
Amara Emerson	5a37bc3d1b	[AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR. Differential Revision: https://reviews.llvm.org/D103301	2021-07-10 00:25:26 -07:00
Amara Emerson	4e83442b4d	[GlobalISel] Add a new artifact combiner for unmerge which looks through general artifact expressions. The original motivation for this was to implement moreElementsVector of shuffles on AArch64, which resulted in complex sequences of artifacts like unmerge(unmerge(concat...)) which the combiner couldn't handle. It seemed here that the better option, instead of writing ever-more-complex combines, was to have a way to find the original "non-artifact" source registers for a given definition, walking through arbitrary expressions of unmerge/concat/insert. As long as the bits aren't extended or truncated, this is a pretty simple algorithm that avoids the need for lots of combines and instead jumps straight to the final result we want. I've only used this new technique in 2 places within tryCombineUnmerge, using it in more general situations resulted in infinite loops in AMDGPU. So for now it's used when we would otherwise fail to combine and that seems to work. In order to support looking through G_INSERTs, I also had to add it as an artifact in isArtifact(), which caused a whole lot of issues in tests. AMDGPU started infinite looping since full legalization of G_INSERT doesn't seem to be there. To work around this, I've temporarily added a CLI option to use the old behaviour so that the MIR tests will still run and terminate. Other minor changes include no longer making >128b G_MERGE/UNMERGE legal. We never had isel support for that anyway and it was a remnant of the legacy legalizer rules. However being legal prevented the combiner from checking if it was dead and deleting them. Differential Revision: https://reviews.llvm.org/D104355	2021-07-09 22:35:00 -07:00
Lang Hames	a6848c8034	[ORC] Flesh out ExecutorAddress, rename CommonOrcRuntimeTypes header. Renames CommonOrcRuntimeTypes.h to ExecutorAddress.h and moves ExecutorAddress into the 'orc' namespace (rather than orc::shared). Also makes ExecutorAddress a class, adds an ExecutorAddrDiff type and some arithmetic operations on the pair (subtracting two addresses yields an addrdiff, adding an addrdiff and an address yields an address).	2021-07-10 13:53:52 +10:00
Thomas Lively	20f8f245ff	[WebAssembly] Custom combines for f64x2.promote_low_f32x4 Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232. Differential Revision: https://reviews.llvm.org/D105675	2021-07-09 18:59:29 -07:00
David Blaikie	4607449b18	llvm-symbolizer: Fix "start file" to work with Split DWARF A followup to the feature added in 69da27c7496ea373567ce5121e6fe8613846e7a5 that added the optional "start file name" to match "start line" - but this didn't work with Split DWARF because of the need for the decl file number resolution code to refer back to the skeleton unit to find its .debug_line contribution. So this patch adds the necessary infrastructure to track the skeleton unit corresponding to a split full unit for the purpose of this lookup.	2021-07-09 18:31:32 -07:00
Wouter van Oortmerssen	538b137e0b	[WebAssembly] Added initial type checker to MC Assembler This is to protect against nonsensical instruction sequences being assembled, which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate. Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input. Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly. A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that suppresses type errors, as a quick escape hatch for tests that were not intended to be type correct. This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future. Differential Revision: https://reviews.llvm.org/D104945	2021-07-09 14:07:25 -07:00
David Blaikie	6212b5a386	PR51018: A few more explicit conversions from SmallString to StringRef Follow-up to 1def2579e10dd84405465f403e8c31acebff0c97 with a few more obscure cases.	2021-07-09 13:54:02 -07:00
Nikita Popov	103545107e	[IR] Add GEPOperator::indices() (NFC) In order to mirror the GetElementPtrInst::indices() API. Wanted to use this in the IRForTarget code, and was surprised to find that it didn't exist yet.	2021-07-09 21:41:20 +02:00
Nikita Popov	c66ba11a1e	Reapply [IR] Don't accept nullptr as GEP element type Reapply after fixing another occurrence in lldb that was relying on this in the preceding commit. ----- GetElementPtrInst::Create() (and IRBuilder methods based on it) currently accept nullptr as the element type, and will fetch the element type from the pointer in that case. Remove this fallback, as it is incompatible with opaque pointers. I've removed a handful of leftover calls using this behavior as a preliminary step. Out-of-tree code affected by this change should either pass a proper type, or can temporarily explicitly call getPointerElementType(), if the newly added assertion is encountered. Differential Revision: https://reviews.llvm.org/D105653	2021-07-09 21:14:41 +02:00
Nikita Popov	575750b257	Reapply [IR] Don't mark mustprogress as type attribute Reapply with fixes for clang tests. ----- This is a simple enum attribute. Test changes are because enum attributes are sorted before type attributes, so mustprogress is now in a different position.	2021-07-09 20:57:44 +02:00
Varun Gandhi	d697536ac9	[Clang] Introduce Swift async calling convention. This change is intended as initial setup. The plan is to add more semantic checks later. I plan to update the documentation as more semantic checks are added (instead of documenting the details up front). Most of the code closely mirrors that for the Swift calling convention. Three places are marked as [FIXME: swiftasynccc]; those will be addressed once the corresponding convention is introduced in LLVM. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D95561	2021-07-09 11:50:10 -07:00
Stella Stamenova	aff94971f7	Revert "[IR] Don't accept nullptr as GEP element type" This reverts commit 5035e7be1a8ab923e1a82def7e313cc11c0b176f. This change broke several lldb bots.	2021-07-09 11:32:39 -07:00
Nikita Popov	ac1ee01737	Revert "[IR] Don't mark mustprogress as type attribute" This reverts commit 84ed3a794b4ffe7bd673f1e5a17d507aa3113d12. A number of clang tests are also affected by this change. Revert until I can update them.	2021-07-09 18:46:00 +02:00
Nikita Popov	50da677d32	[AttrBuilder] Try to fix build Some buildbots fail with undefined references to ByValTypeIndex etc. Replace static consts with an enum to ensure the address is not taken.	2021-07-09 18:27:57 +02:00
Nikita Popov	7845932811	[IR] Don't mark mustprogress as type attribute This is a simple enum attribute. Test changes are because enum attributes are sorted before type attributes.	2021-07-09 18:24:16 +02:00
Nikita Popov	4449e34e33	[AttrBuilder] Make handling of type attributes more generic (NFCI) While working on the elementtype attribute, I felt that the type attribute handling in AttrBuilder is overly repetitive. This patch converts the separate Type* members into an std::array<Type*>, so that all type attribute kinds can be handled generically. There's more room for improvement here (especially when it comes to converting the AttrBuilder to an Attribute), but this seems like a good starting point. Differential Revision: https://reviews.llvm.org/D105658	2021-07-09 17:48:09 +02:00
Nikita Popov	b2f9152456	[IR] Don't accept nullptr as GEP element type GetElementPtrInst::Create() (and IRBuilder methods based on it) currently accept nullptr as the element type, and will fetch the element type from the pointer in that case. Remove this fallback, as it is incompatible with opaque pointers. I've removed a handful of leftover calls using this behavior as a preliminary step. Out-of-tree code affected by this change should either pass a proper type, or can temporarily explicitly call getPointerElementType(), if the newly added assertion is encountered. Differential Revision: https://reviews.llvm.org/D105653	2021-07-09 17:37:43 +02:00
Kevin P. Neal	1696270de0	[FPEnv][InstSimplify] Constrained FP support for NaN Currently InstructionSimplify.cpp knows how to simplify floating point instructions that have a NaN operand. It does not know how to handle the matching constrained FP intrinsic. This patch teaches it how to simplify so long as the exception handling is not "fpexcept.strict". Differential Revision: https://reviews.llvm.org/D103169	2021-07-09 11:26:28 -04:00
zhijian	606d7e2aeb	[AIX][XCOFF] Use bit order of has_vec and longtbtable bits as defined in AIX header debug.h Summary: The bit order of the has_vec and longtbtable bits in the traceback table generated by the XL compiler flipped at some point after v12.1. This is different from the definition in the AIX header debug.h. The change in the XL compiler that caused the deviation from the OS header definition was unintentional. Since both orderings are extant and the XL compiler runtime also expects the ordering defined by the OS, we will correct the output from LLVM to match the defined ordering given by the OS (which is also consistent with the Assembler Language Reference). Mitigation for traceback tables encoded with the wrong ordering is required for either ordering. Reviewers: XingXue, HubertTong Differential Revision: https://reviews.llvm.org/D105487	2021-07-09 11:06:46 -04:00
Jeremy Morse	fe3b28eeca	[Debug-info][InstrRef] Avoid an unnecessary map ordering We keep a record of substitutions between debug value numbers post-isel, however we never actually look them up until the end of compilation. As a result, there's nothing gained by the collection being a std::map. This patch downgrades it to being a vector, that's then sorted at the end of compilation in LiveDebugValues. Differential Revision: https://reviews.llvm.org/D105029	2021-07-09 15:43:13 +01:00
Martin Storsjö	28e1cf3bb9	Revert "[ScalarEvolution] Fix overflow in computeBECount." This reverts commit 5b350183cdabd83573bc760ddf513f3e1d991bcb (and also "[NFC][ScalarEvolution] Cleanup howManyLessThans.", 009436e9c1fee1290d62bc0faafe0c0295542f56, to make it apply). See https://reviews.llvm.org/D105216 for discussion on various miscompilations caused by that commit.	2021-07-09 14:26:48 +03:00
David Green	fd61052e59	[TTI] Remove IsPairwiseForm from getArithmeticReductionCost This patch removes the IsPairwiseForm flag from the Reduction Cost TTI hooks, along with some accompanying code for pattern matching reductions from trees starting at extract elements. IsPairwiseForm is now assumed to be false, which was the predominant way that the value was used from both the Loop and SLP vectorizers. Since the adjustments such as D93860, the SLP vectorizer has not relied upon this distinction between pairwise and non-pairwise reductions. This also removes some code that was detecting reduction trees starting from extract elements inside the costmodel. This case was double-counting costs though, adding the individual costs on the individual instruction _and_ the total cost of the reduction. Removing it changes the costs in llvm/test/Analysis/CostModel/X86/reduction.ll to not double count. The cost of reduction intrinsics is still tested through the various tests in llvm/test/Analysis/CostModel/X86/reduce-xyz.ll. Differential Revision: https://reviews.llvm.org/D105484	2021-07-09 11:51:16 +01:00
Eli Friedman	aa27065cdf	[NFC][ScalarEvolution] Cleanup howManyLessThans. In preparation for D104075. Some NFC cleanup, and some test coverage for planned changes.	2021-07-08 17:56:26 -07:00
David Blaikie	54e05361bc	Revert "PR51018: Disallow explicit construction of StringRef from SmallString due to ambiguity in C++23" This reverts commit e2d30846327c7ec5cc9d2a46aa9bcd9c2c4eff93. MSVC doesn't seem to resolve the intended ambiguity in implicit conversion contexts correctly: https://godbolt.org/z/ee16aqv4v	2021-07-08 13:46:36 -07:00
David Blaikie	0a392cfdcf	PR51018: Disallow explicit construction of StringRef from SmallString due to ambiguity in C++23 See bug for full details, but basically there's an upcoming ambiguity in the conversion in `StringRef(SomeSmallString)` - either the implicit conversion operator (SmallString::operator StringRef) could be used, or the std::string_view range-based ctor (& then `StringRef(std::string_view)` would be used) To address this, make such a conversion invalid up-front - most uses are more tersely written as `SomeSmallString.str()` anyway, or more clearly written as `StringRef x = y;` rather than `StringRef x(y);` - so if you hit this in out-of-tree code, please update in one of those ways. Hopefully I've fixed everything in tree prior to this patch landing.	2021-07-08 13:37:57 -07:00
Michael Liao	bee0b38da8	[Metadata] Decorate methods with 'const'. NFC. - Minor coding style fix.	2021-07-08 14:11:14 -04:00
Matt Arsenault	fc47c36984	GlobalISel: Track original argument index in ArgInfo SelectionDAG's equivalents in ISD::InputArg/OutputArg track the original argument index. Mips relies on this, and it's currently reinventing its own parallel CallLowering infrastructure which tracks these indexes on the side. Add this to help move towards deleting the custom Mips handling.	2021-07-08 13:39:02 -04:00
Eli Friedman	915fc454ff	[ScalarEvolution] Fix overflow in computeBECount. There are two issues with the current implementation of computeBECount: 1. It doesn't account for the possibility that adding "Stride - 1" to Delta might overflow. For almost all loops, it doesn't, but it's not actually proven anywhere. 2. It doesn't account for the possibility that Stride is zero. If Delta is zero, the backedge is never taken; the value of Stride isn't relevant. To handle this, we have to make sure that the expression returned by computeBECount evaluates to zero. To deal with this, add two new checks: 1. Use a variety of tricks to try to prove that the addition doesn't overflow. If the proof is impossible, use an alternate sequence which never overflows. 2. Use umax(Stride, 1) to handle the possibility that Stride is zero. Differential Revision: https://reviews.llvm.org/D105216	2021-07-08 10:09:55 -07:00
Nikita Popov	4f2df3c6f8	[IR] Restore vector support for deprecated CreateGEP methods As pointed out in post-commit review on rG8e22539067d9, it's necessary to call getScalarType() to support GEPs with a vector base. Dropping that call was an oversight on my side.	2021-07-08 18:15:56 +02:00
Tim Northover	7c89253a7a	Recommit: Support: add llvm::thread class that supports specifying stack size. This adds a new llvm::thread class with the same interface as std::thread except there is an extra constructor that allows us to set the new thread's stack size. On Darwin even the default size is boosted to 8MB to match the main thread. It also switches all users of the older C-style `llvm_execute_on_thread` API family over to `llvm::thread` followed by either a `detach` or `join` call and removes the old API. Moved definition of DefaultStackSize into the .cpp file to hopefully fix the build on some (GCC-6?) machines.	2021-07-08 16:22:26 +01:00
Tim Northover	1b885b1ce7	Revert "Support: add llvm::thread class that supports specifying stack size." It's causing build failures because DefaultStackSize isn't defined everywhere it should be and I need time to investigate.	2021-07-08 14:59:47 +01:00
Tim Northover	43bfac999c	Support: add llvm::thread class that supports specifying stack size. This adds a new llvm::thread class with the same interface as std::thread except there is an extra constructor that allows us to set the new thread's stack size. On Darwin even the default size is boosted to 8MB to match the main thread. It also switches all users of the older C-style `llvm_execute_on_thread` API family over to `llvm::thread` followed by either a `detach` or `join` call and removes the old API.	2021-07-08 14:51:53 +01:00
xndcn	4bda00e90e	[NFC] Mark Expected<T>::assertIsChecked() as const Some const methods of Expected<T> invoke assertIsChecked(), so we should mark it as const too. Differential Revision: https://reviews.llvm.org/D105292	2021-07-08 21:30:23 +08:00
Moritz Sichert	2f6870edd6	[IR] Added operator delete to subclasses of User to avoid UB Several subclasses of User override operator new without also overriding operator delete. This means that delete expressions fall back to using operator delete of the base class, which would be User. However, this is only allowed if the base class has a virtual destructor which is not the case for User, so this is UB. See also [expr.delete] (3) for the exact wording. This is actually detected in some cases by GCC 11's -Wmismatched-new-delete now which is how I found this error. Differential Revision: https://reviews.llvm.org/D103143	2021-07-08 11:59:22 +02:00
Lang Hames	2d682bd2a2	[ORC] Introduce ExecutorAddress type, fix broken LLDB bot. ExecutorAddressRange depended on JITTargetAddress, but JITTargetAddress is defined in ExecutionEngine, which OrcShared should not depend on. This seems like as good a time as any to introduce a new ExecutorAddress type to eventually replace JITTargetAddress. For now it's just another uint64_t alias, but it will soon be changed to a class type to provide greater type safety.	2021-07-08 16:31:59 +10:00
Lang Hames	bee25fbe59	[ORC] Improve computeLocalDeps / computeNamedSymbolDependencies performance. The computeNamedSymbolDependencies and computeLocalDeps methods on ObjectLinkingLayerJITLinkContext are responsible for computing, for each symbol in the current MaterializationResponsibility, the set of non-locally-scoped symbols that are depended on. To calculate this we have to consider the effect of chains of dependence through locally scoped symbols in the LinkGraph. E.g. .text .globl foo foo: callq bar ## foo depends on external 'bar' movq Ltmp1(%rip), %rcx ## foo depends on locally scoped 'Ltmp1' addl (%rcx), %eax retq .data Ltmp1: .quad x ## Ltmp1 depends on external 'x' In this example symbol 'foo' depends directly on 'bar', and indirectly on 'x' via 'Ltmp1', which is locally scoped. Performance of the existing implementations appears to have been mediocre: Based on flame graphs posted by @drmeister (in #jit on the LLVM discord server) the computeLocalDeps function was taking up a substantial amount of time when starting up Clasp (https://github.com/clasp-developers/clasp). This commit attempts to address the performance problems in three ways: 1. Using jitlink::Blocks instead of jitlink::Symbols as the nodes of the dependencies-introduced-by-locally-scoped-symbols graph. Using either Blocks or Symbols as nodes provides the same information, but since there may be more than one locally scoped symbol per block the block-based version of the dependence graph should always be a subgraph of the Symbol-based version, and so faster to operate on. 2. Improved worklist management. The older version of computeLocalDeps used a fixed worklist containing all nodes, and iterated over this list propagating dependencies until no further changes were required. The worklist was not sorted into a useful order before the loop started. The new version uses a variable work-stack, visiting nodes in DFS order and only adding nodes when there is meaningful work to do on them. Compared to the old version the new version avoids revisiting nodes which haven't changed, and I suspect it converges more quickly (due to the DFS ordering). 3. Laziness and caching. Mappings of... jitlink::Symbol* -> Interned Name (as SymbolStringPtr) jitlink::Block* -> Immediate dependencies (as SymbolNameSet) jitlink::Block* -> Transitive dependencies (as SymbolNameSet) are all built lazily and cached while running computeNamedSymbolDependencies. According to @drmeister these changes reduced Clasp startup time in his test setup (averaged over a handful of starts) from 4.8 to 2.8 seconds (with ORC/JITLink linking ~11,000 object files in that time), which seems like enough to justify switching to the new algorithm in the absence of any other perf numbers.	2021-07-08 16:31:59 +10:00
Lang Hames	760f860c3a	[ORC] Replace MachOJITDylibInitializers::SectionExtent with ExecutorAddressRange MachOJITDylibInitializers::SectionExtent represented the address range of a section as an (address, size) pair. The new ExecutorAddressRange type generalizes this to an address range (for any object, not necessarily a section) represented as a (start-address, end-address) pair. The aim is to express more of ORC (and the ORC runtime) in terms of simple types that can be serialized/deserialized via SPS. This will simplify SPS-based RPC involving arguments/return-values of these types.	2021-07-08 14:15:44 +10:00
Lang Hames	4c6599a274	[ORC] Fix file comments.	2021-07-08 14:15:44 +10:00
Stanislav Mekhanoshin	dc43bb3409	[AMDGPU] Disable garbage collection passes Differential Revision: https://reviews.llvm.org/D105593	2021-07-07 15:47:57 -07:00
Arthur Eubanks	266a9a84be	[OpaquePtr] Use ArgListEntry::IndirectType for lowering ABI attributes Consolidate PreallocatedType and ByValType into IndirectType, and use that for inalloca.	2021-07-07 14:58:38 -07:00
Arthur Eubanks	b3ffc2a93b	[OpaquePtr] Remove checking pointee type for byval/preallocated type These currently always require a type parameter. The bitcode reader already upgrades old bitcode without the type parameter to use the pointee type. In cases where the caller does not have byval but the callee does, we need to follow CallBase::paramHasAttr() and also look at the callee for the byval type so that CallBase::isByValArgument() and CallBase::getParamByValType() are in sync. Do the same for preallocated. While we're here add a corresponding version for inalloca since we'll need it soon. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104663	2021-07-07 14:28:55 -07:00
Nikita Popov	aec8b8bed1	[IR] Make some pointer element type accesses explicit (NFC) Explicitly fetch the pointer element type in various deprecated methods, so we can hopefully remove support from this from the base GEP constructor.	2021-07-07 22:05:30 +02:00
Martin Storsjö	5f3a753cf4	[COFF] [CodeView] Add a few new enum values These are undocumented, but are visible in the SDK headers since some versions ago. Differential Revision: https://reviews.llvm.org/D105513	2021-07-07 22:00:18 +03:00
Sander de Smalen	3bbfdfb241	[CostModel] Express cost(urem) as cost(div+mul+sub) when set to Expand. The Legalizer expands the operations of urem/srem into a div+mul+sub or divrem when those are legal/custom. This patch changes the cost-model to reflect that cost. Since there is no 'divrem' Instruction in LLVM IR, the cost of divrem is assumed to be the same as div+mul+sub since the three operations will need to be executed at runtime regardless. Patch co-authored by David Sherwood (@david-arm) Reviewed By: RKSimon, paulwalker-arm Differential Revision: https://reviews.llvm.org/D103799	2021-07-07 14:40:28 +01:00
Johannes Doerfert	2f34f28211	[Attributor][FIX] Replace uses first, then values Previously we replaced values by registering all their uses. However, as we replace a value, old uses become stale. We now replace values explicitly and keep track of "new values" when doing so to avoid replacing only uses in stale/old values but not their replacements.	2021-07-06 22:43:51 -05:00
Johannes Doerfert	4f0b565d46	[Attributor] Introduce a helper function to deal with undef + none We often need to deal with the value lattice that contains none and undef as special values. A simple helper makes this much nicer. Differential Revision: https://reviews.llvm.org/D103857	2021-07-06 22:41:21 -05:00
Johannes Doerfert	13dc82700d	[Attributor] Simplify operands inside of simplification AAs first When we do simplification via AAPotentialValues or AAValueConstantRange we need to simplify the operands of an instruction we deconstruct first. This does not only improve the result, see for example range.ll, but is required as we allow outside AAs to provide simplification rules via callbacks. If we do ignore the simplification rules and base other simplifications on the IR instead we can create an inconsistent state.	2021-07-06 22:41:18 -05:00
Eli Friedman	b83eae9454	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 12:16:05 -07:00
Eli Friedman	61b59d3278	Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers." This reverts commit 74d6ce5d5f169e9cf3fac0eb1042602e286dd2b9. Seeing crashes on buildbots in MemoryDepChecker::isDependent.	2021-07-06 11:17:13 -07:00
Eli Friedman	b011bc0424	[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 10:54:41 -07:00
Jeremy Morse	409363cd51	[DebugInfo][InstrRef][3/4] Produce DBG_INSTR_REFs for all variable locations This patch emits DBG_INSTR_REFs for two remaining flavours of variable locations that weren't supported: copies, and inter-block VRegs. There are still some locations that must be represented by DBG_VALUE such as constants, but they're mostly independent of optimisations. For variable locations that refer to values defined in different blocks, vregs are allocated before isel begins, but the defining instruction might not exist until late in isel. To get around this, emit DBG_INSTR_REFs in a "half done" state, where the first operand refers to a VReg. Then at the end of isel, patch these back up to refer to instructions, using the finalizeDebugInstrRefs method. Copies are something that I complained about in the original RFC, and I really don't want to have to put instruction numbers on copies. They don't define a value: they move them. To address this in isel, salvageCopySSA interprets: * COPYs, * SUBREG_TO_REG, * Anything that isCopyInstr thinks is a copy. And follows chains of copies back to the defining instruction that they read from. This relies on any physical registers that COPYs read being defined in the same block, or being entry-block arguments. For the former we can put an instruction number on the defining instruction; for the latter we can drop a DBG_PHI that reads the incoming value. Differential Revision: https://reviews.llvm.org/D88896	2021-07-06 18:31:38 +01:00
Kerry McLaughlin	8dd39c43b3	[LV] Prevent vectorization with unsupported element types. This patch adds a TTI function, isElementTypeLegalForScalableVector, to query whether it is possible to vectorize a given element type. This is called by isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if any of the instruction types in the loop are unsupported, e.g: int foo(__int128_t* ptr, int N) #pragma clang loop vectorize_width(4, scalable) for (int i=0; i<N; ++i) ptr[i] = ptr[i] + 42; This example currently crashes if we attempt to vectorize since i128 is not a supported type for scalable vectorization. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D102253	2021-07-06 13:06:21 +01:00
Albion Fung	2776c1ab5d	[PowerPC] Implement Load and Reserve and Store Conditional Builtins This patch implements the load and reserve and store conditional builtins for the PowerPC target, in order to have feature parity with xlC on AIX. Differential revision: https://reviews.llvm.org/D105236	2021-07-05 21:35:41 -05:00
Caroline Concatto	1631d2fbaa	[AArch64][CostModel] Add cost model for experimental.vector.splice This patch adds a new ShuffleKind SK_Splice and then handle the cost in getShuffleCost, as in experimental.vector.reverse. Differential Revision: https://reviews.llvm.org/D104630	2021-07-05 14:30:24 +01:00
Esme-Yi	11bbb4a8e4	[llvm-readobj][XCOFF] Add support for printing the String Table. Summary: The patch adds the StringTable dumping to llvm-readobj. Currently only XCOFF is supported. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D104613	2021-07-05 04:16:58 +00:00
Nikita Popov	3cc10c45ba	[IR] Deprecate GetElementPtrInst::CreateInBounds without element type This API is not compatible with opaque pointers, the method accepting an explicit pointer element type should be used instead. Thankfully there were few in-tree users. The BPF case still ends up using the pointer element type for now and needs something like D105407 to avoid doing so.	2021-07-04 16:49:30 +02:00
Paul Walker	ba16635997	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Nikita Popov	ecd2dc975e	[IRBuilder] Add type argument to CreateMaskedLoad/Gather Same as other CreateLoad-style APIs, these need an explicit type argument to support opaque pointers. Differential Revision: https://reviews.llvm.org/D105395	2021-07-04 12:17:59 +02:00
Christopher Di Bella	b463417679	[llvm][iwyu] explicitly includes <functional> and <utility> Compiling LLVM with Clang modules and libc++ identified that `Support/Printable.h` and `ADT/SmallVector.h` were using features that live in these headers. Differential Revision: https://reviews.llvm.org/D105402	2021-07-04 06:02:11 +00:00
Simon Pilgrim	f73cebf7e4	[KnownBits] Merge const/non-const KnownBits::extractBits implementations. NFC. These are identical and can be just const.	2021-07-03 19:00:25 +01:00
Craig Topper	1ddc2a3bd1	[SelectionDAG] Rename memory VT argument for getMaskedGather/getMaskedScatter from VT to MemVT. Use getMemoryVT() in MGATHER/MSCATTER DAG combines instead of using the passthru or store value VT for this argument.	2021-07-02 17:37:40 -07:00
Jonas Devlieghere	3020664b33	Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes" This reverts commit 8cd35ad854ab4458fd509447359066ea3578b494. It breaks `TestMembersAndLocalsWithSameName.py` on GreenDragon and Mikael Holmén points out in D104827 that bitcode files created with the patch cannot be parsed with binaries built before it.	2021-07-02 15:57:07 -07:00
Amara Emerson	128d2d791b	[GlobalISel] Clean up CombinerHelper::apply* functions to return void. For some reason we/I started writing these as returning bool when the return value is actually ignored by the combiner.	2021-07-02 13:17:06 -07:00
Amara Emerson	b1924533d9	[GlobalISel] Add re-association combine for G_PTR_ADD to allow better addressing mode usage. We're trying to match a few pointer computation patterns here for re-association opportunities. 1) Isolating a constant operand to be on the RHS, e.g.: G_PTR_ADD(BASE, G_ADD(X, C)) -> G_PTR_ADD(G_PTR_ADD(BASE, X), C) 2) Folding two constants in each sub-tree as long as such folding doesn't break a legal addressing mode. G_PTR_ADD(G_PTR_ADD(BASE, C1), C2) -> G_PTR_ADD(BASE, C1+C2) AArch64 code size improvements on CTMark with -Os: Program before after diff pairlocalalign 251048 251044 -0.0% consumer-typeset 421820 421812 -0.0% kc 431348 431320 -0.0% SPASS 413404 413300 -0.0% clamscan 384396 384220 -0.0% tramp3d-v4 370640 370412 -0.1% lencod 432096 431772 -0.1% bullet 479400 478796 -0.1% sqlite3 288504 288072 -0.1% 7zip-benchmark 573796 570768 -0.5% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D105069	2021-07-02 12:31:21 -07:00
Krzysztof Parzyszek	9abc810a43	[OpaquePtr] Add type parameter to emitLoadLinked Differential Revision: https://reviews.llvm.org/D105353	2021-07-02 13:07:40 -05:00
Jon Roelofs	fa1c32679f	[Intrinsics] Make MemCpyInlineInst a MemCpyInst This opens up more optimization opportunities in passes that already handle MemCpyInst's. Differential revision: https://reviews.llvm.org/D105247	2021-07-02 10:25:24 -07:00
Jacob Hegna	721423a975	Unpack the CostEstimate feature in ML inlining models. This change yields an additional 2% size reduction on an internal search binary, and an additional 0.5% size reduction on fuchsia. Differential Revision: https://reviews.llvm.org/D104751	2021-07-02 16:57:16 +00:00
Jinsong Ji	1ed15bd392	[AIX] Use AsmParser to do inline asm parsing Add a flag so that a target can choose to use AsmParser for parsing inline asm, and set the flag by default for AIX. -no-integrated-as will override this default if specified explicitly. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D105314	2021-07-02 16:12:21 +00:00
Alex Richardson	a73a5b4199	Place the BlockAddress type in the address space of the containing function While this should not matter for most architectures (where the program address space is 0), it is important for CHERI (and therefore Arm Morello). We use address space 200 for all of our code pointers and without this change we assert in the SelectionDAG handling of BlockAddress nodes. It is also useful for AVR: previously programs targeting AVR that attempt to read their own machine code via a pointer to a label would instead read from RAM using a pointer relative to the start of program flash. Reviewed By: dylanmckay, theraven Differential Revision: https://reviews.llvm.org/D48803	2021-07-02 12:17:55 +01:00
Roman Lebedev	5bd901b404	Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR" This reverts commit 4facbf213c51e4add2e8c19b08d5e58ad71c72de. ``` ****************** FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468) **************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ****************** Script: -- : 'RUN: at line 1'; /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types \| /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll -- Exit Code: 2 Command Output (stderr): -- llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed. ```	2021-07-02 11:49:51 +03:00
Paulo Matos	e346ccc104	[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR Reland of 31859f896. This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Differential Revision: https://reviews.llvm.org/D104797	2021-07-02 09:46:28 +02:00
Lang Hames	85a8d3c7b3	[ORC] Rename SPSTargetAddress to SPSExecutorAddress. Also removes SPSTagTargetAddress, which was accidentally introduced at some point (and never used).	2021-07-02 12:40:14 +10:00
Valentin Churavy	0b1b7443f1	[Orc] Add C bindings for LazyReexports Add C bindings and an example for LLJIT with lazy reexports Differential Revision: https://reviews.llvm.org/D104672	2021-07-01 21:52:05 +02:00
Nikita Popov	6203a57d08	[OpaquePtr] Support opaque pointers in intrinsic type check This adds support for opaque pointers in intrinsic type checks of IIT kind Pointer and PtrToElt. This is less straight-forward than it might initially seem, because we should only accept opaque pointers here in --force-opaque-pointers mode. Otherwise, there would be more than one valid type signature for a given intrinsic name. Differential Revision: https://reviews.llvm.org/D105155	2021-07-01 18:26:41 +02:00
Matt Arsenault	20d89b9242	GlobalISel: Use LLT in call lowering callbacks This preserves the memory type so the lowerings can rely on them.	2021-07-01 12:15:54 -04:00
Hussain Kadhem	7eddb43fa0	[VP] Implementation of intrinsic and SDNode definitions for VP load, store, gather, scatter. This patch adds intrinsic definitions and SDNodes for predicated load/store/gather/scatter, based on the work done in D57504. Reviewed By: simoll, craig.topper Differential Revision: https://reviews.llvm.org/D99355	2021-07-01 13:34:44 +02:00
Jeremy Morse	7a8e30eba0	[DebugInfo][InstrRef][1/4] Support transformations that widen values Very late in compilation, backends like X86 will perform optimisations like this: $cx = MOV16rm $rax, ... -> $rcx = MOV64rm $rax, ... Widening the load from 16 bits to 64 bits. Seeing how the lower 16 bits remain the same, this doesn't affect execution. However, any debug instruction reference to the defined operand now refers to a 64 bit value, not a 16 bit one, which might be unexpected. Elsewhere in codegen, there's often this pattern: CALL64pcrel32 @foo, implicit-def $rax %0:gr64 = COPY $rax %1:gr32 = COPY %0.sub_32bit Where we want to refer to the definition of $eax by the call, but don't want to refer to the copies (they don't define values in the way LiveDebugValues sees it). To solve this, add a subregister field to the existing "substitutions" facility, so that we can describe a field within a larger value definition. I would imagine that this would be used most often when a value is widened, and we need to refer to the original, narrower definition. Differential Revision: https://reviews.llvm.org/D88891	2021-07-01 11:19:27 +01:00
Christian Kühnel	70970300a3	added some example code for llvm::Expected<T> Since I had some fun understanding how to properly use llvm::Expected<T> I added some code examples that I would have liked to see when learning to use it. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D105014	2021-07-01 09:57:20 +00:00
Andrzej Warzynski	41a27c03c7	[flang] Revert "PoC for Flang Driver Plugins" This patch has not been reviewed and was committed by accident. This reverts commit 788a5d4afe6407e647454a9832a7b4a27fba06bf.	2021-07-01 08:27:31 +00:00
Lang Hames	6567b76038	[ORC] Add wrapper-function support methods to ExecutorProcessControl. Adds support for both synchronous and asynchronous calls to wrapper functions using SPS (Simple Packed Serialization). Also adds support for wrapping functions on the JIT side in SPS-based wrappers that can be called from the executor. These new methods simplify calls between the JIT and Executor, and will be used in upcoming ORC runtime patches to enable communication between ORC and the runtime.	2021-07-01 18:21:49 +10:00
Stuart Ellis	c930f37268	PoC for Flang Driver Plugins	2021-07-01 08:10:40 +00:00
Roman Lebedev	e9c11e84f4	[NFC][PassBuilder] addVectorPasses(): clarify that 'IsLTO' is actually 'IsFullLTO' I.e. it will be `false` for thin lto.	2021-07-01 10:09:24 +03:00
Qiu Chaofan	a315353f43	[NFC][Scheduler] Refactor tryCandidate to return boolean This patch changes the return type of tryCandidate from void to bool: 1. Methods in some targets already follow this convention. 2. This would help if some target wants to re-use generic code. 3. It looks more intuitive if these try-methods return the same type. We may need to change their return type from bool to some enum later, to make it less confusing. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103951	2021-07-01 14:31:47 +08:00
Lang Hames	a397416183	[ORC] Rename TargetProcessControl to ExecutorProcessControl. NFC. This is a first step towards consistently using the term 'executor' for the process that executes JIT'd code. I've opted for 'executor' as the preferred term over 'target' as target is already heavily overloaded ("the target machine for the executor" is much clearer than "the target machine for the target").	2021-07-01 13:31:12 +10:00

45660 Commits