llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Artur Pilipenko	31af2fa7ed	GC-parseable element atomic memcpy/memmove This change introduces a GC parseable lowering for element atomic memcpy/memmove intrinsics. This way runtime can provide an implementation which can take a safepoint during copy operation. See "GC-parseable element atomic memcpy/memmove" thread on llvm-dev for the background and details: https://groups.google.com/g/llvm-dev/c/NnENHzmX-b8/m/3PyN8Y2pCAAJ Differential Revision: https://reviews.llvm.org/D88861	2020-10-23 14:06:09 -07:00
Michael Liao	59287b0fc6	Fix shared build. NFC.	2020-10-23 15:53:05 -04:00
Florian Hahn	9f772c5b7d	[AArch64] Add vector compare/select cost-model tests.	2020-10-23 20:43:04 +01:00
Geoffrey Martin-Noble	91a4ce9865	Unconditionally #include <future> This unbreaks building with `LLVM_ENABLE_THREADS=0`. Since https://github.com/llvm/llvm-project/commit/069919c9ba33 usage of `std::promise` is not guarded by `LLVM_ENABLE_THREADS`, so this header must be unconditionally included. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D89758	2020-10-23 19:17:37 +00:00
Arthur Eubanks	d924016b11	[gn build] Add missing comma	2020-10-23 12:01:23 -07:00
Nick Desaulniers	e95a065d26	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. Code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) commonly wants to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Stanislav Mekhanoshin	270ab79b5a	[AMDGPU] Fixed isLegalRegOperand() with physregs This does not change anything at the moment, but needed for D89170. In that change I am probing a physical SGPR to see if it is legal. RC is SReg_32, but DRC for scratch instructions is SReg_32_XEXEC_HI and test fails. That is sufficient just to check if DRC contains a register here in case of physreg. Physregs also do not use subregs so the subreg handling below is irrelevant for these. Differential Revision: https://reviews.llvm.org/D90064	2020-10-23 11:33:34 -07:00
Hubert Tong	528517aa74	[AIX][cmake] Adjust management of `-G` for linking The change in 0ba98433971f changed the behaviour of the build when using an XL build compiler because `-G` is not a pure linker option: it also implies `-shared`. This was accounted for in the base CMake configuration, so an analysis of the change from 0ba98433971f in relation to a build using Clang (where `-shared` is introduced by CMake) would not identify the issue. This patch resolves this particular issue by adding `-shared` alongside `-Wl,-G`. At the same time, the investigation reveals that several aspects of the various build configurations are not operating in the manner originally intended. The other issue related to the `-G` linker option in the build is that the removal of it (to avoid unnecessary use of run-time linking) is not effective for the build using the Clang compiler. This patch addresses this by adjusting the regular expressions used to remove the broadly- applied `-G`. Finally, the issue of specifying the export list with `-Wl,` instead of a compiler option is flagged with a FIXME comment. Reviewed By: daltenty, amyk Differential Revision: https://reviews.llvm.org/D90041	2020-10-23 14:32:36 -04:00
Nikita Popov	88e55394ae	[BasicAA] Add additional phi cycle test (NFC) This is a variation of the BatchAA problem that also applies without BatchAA. We may have a cached result from earlier in the same query.	2020-10-23 20:31:20 +02:00
Mircea Trofin	fbc18d8c7b	[NFC] Use [MC]Register in RegAllocGreedy This was initiated from the uses of MCRegUnitIterator, so while likely not exhaustive, it's a step forward. Differential Revision: https://reviews.llvm.org/D89975	2020-10-23 11:30:53 -07:00
Baptiste Saleil	e248116f2c	[PowerPC] Add intrinsics for MMA This patch adds support for MMA intrinsics. Authored by: Baptiste Saleil Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D89345	2020-10-23 13:16:02 -05:00
Nikita Popov	bac894562a	[PhiValues] Use SetVector to avoid non-determinism I'm not sure whether this can cause actual non-determinism in the compiler output, but at least it causes non-determinism in the statistics collected by BasicAA. Use SetVector to have a predictable iteration order.	2020-10-23 20:14:02 +02:00
Mircea Trofin	55c9f68e17	[MLInliner] Disable always inliner in bounds tests That changes the threshold calculation.	2020-10-23 10:24:51 -07:00
Amara Emerson	a5ff88d2db	[AArch64][GlobalISel] Introduce a new post-isel optimization pass. There are two optimizations here: 1. Consider the following code: FCMPSrr %0, %1, implicit-def $nzcv %sel1:gpr32 = CSELWr %_, %_, 12, implicit $nzcv %sub:gpr32 = SUBSWrr %_, %_, implicit-def $nzcv FCMPSrr %0, %1, implicit-def $nzcv %sel2:gpr32 = CSELWr %_, %_, 12, implicit $nzcv This kind of code where we have 2 FCMPs each feeding a CSEL can happen when we have a single IR fcmp being used by two selects. During selection, to ensure that there can be no clobbering of nzcv between the fcmp and the csel, we have to generate an fcmp immediately before each csel is selected. However, often we can essentially CSE these together later in MachineCSE. This doesn't work though if there are unrelated flag-setting instructions in between the two FCMPs. In this case, the SUBS defines NZCV but it doesn't have any users, being overwritten by the second FCMP. Our solution here is to try to convert flag setting operations between an interval of identical FCMPs, so that CSE will be able to eliminate one. 2. SelectionDAG imported patterns for arithmetic ops currently select the flag-setting ops for CSE reasons, and add the implicit-def $nzcv operand to those instructions. However if those impdef operands are not marked as dead, the peephole optimizations are not able to optimize them into non-flag setting variants. The optimization here is to find these dead imp-defs and mark them as such. This pass is only enabled when optimizations are enabled. Differential Revision: https://reviews.llvm.org/D89415	2020-10-23 10:18:36 -07:00
LLVM GN Syncbot	df8c67668f	[gn build] Port dbbc4f4e226	2020-10-23 17:06:41 +00:00
Arthur Eubanks	44db41329f	Revert "[CGSCC] Detect devirtualization in more cases" This reverts commit 3024fe5b55ed72633915f613bd5e2826583c396f. Causes major compile time regressions: https://llvm-compile-time-tracker.com/compare.php?from=3b8d8954bf2c192502d757019b9fe434864068e9&to=3024fe5b55ed72633915f613bd5e2826583c396f&stat=instructions	2020-10-23 09:53:52 -07:00
Alex Orlov	41781efce1	Added utility to launch tests on a target remotely. Runs an executable on a remote host. This is meant to be used as an executor when running the LLVM and the Libraries tests on a target. Reviewed By: vvereschaka Differential Revision: https://reviews.llvm.org/D89349	2020-10-23 20:52:30 +04:00
Lang Hames	a156cf61cb	Re-apply "[JITLink][ELF] Add support for ELF::R_X86_64_REX_GOTPCRELX relocation" This re-applies e2fceec2fd1 with fixes. Apparently we already do support relaxation for ELF, so we need to make sure the test case allocates a slab at a fixed address, and that the R_X86_64_REX_GOTPCRELX test references an external that is guaranteed to be out of range.	2020-10-23 09:48:05 -07:00
Huihui Zhang	f5744161af	[AArch64][SVE] Fix umin/umax lowering to handle out of range imm. Immediate must be in an integer range [0,255] for umin/umax instruction. Extend pattern matching helper SelectSVEArithImm() to take in value type bitwidth when checking immediate value is in range or not. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D89831	2020-10-23 09:42:56 -07:00
Victor Huang	e4448695d1	[PowerPC] Fix the Predicates for enabling pcrelative-memops and PLXVP/PSTXVP definitions In this patch, Predicates fix added for the following: * disable prefix-instrs will disable pcrelative-memops * set two predicates PairedVectorMemops and PrefixInstrs for PLXVP/PSTXVP definitions Differential Revision: https://reviews.llvm.org/D89727 Reviewed by: amyk, steven.zhang	2020-10-23 11:33:20 -05:00
LLVM GN Syncbot	5b19aaf8b1	[gn build] Port 00255f41929	2020-10-23 16:19:55 +00:00
vpykhtin	9d887381ef	[AMDGPU] Fix access beyond the end of the basic block in execMayBeModifiedBeforeAnyUse. I was wrong in thinking that MRI.use_instructions returns unique instructions and misled Jay in his previous patch D64393. The first loop counted more instructions than there were in reality and the second loop went beyond the basic block with that counter. I used Jay's previous code that relied on MRI.use_operands to constrain the number of instructions to check among. modifiesRegister is inlined to reduce the number of passes over instruction operands, and an assert is added on the BB end boundary. Differential Revision: https://reviews.llvm.org/D89386	2020-10-23 19:17:48 +03:00
Paulo Matos	f7fc888024	[WebAssembly] Implementation of (most) table instructions Implementation of instructions table.get, table.set, table.grow, table.size, table.fill, table.copy. Missing instructions are table.init and elem.drop as they deal with element sections which are not yet implemented. Added more tests to tables.s Differential Revision: https://reviews.llvm.org/D89797	2020-10-23 08:42:54 -07:00
Jeremy Morse	0c18b2fb0d	[DebugInstrRef] Handle DBG_INSTR_REFs use-before-defs in LiveDebugValues Deciding where to place debugging instructions when normal instructions sink between blocks is difficult -- see PR44117. Dealing with this with instruction-referencing variable locations is simple: we just tolerate DBG_INSTR_REFs referring to values that haven't been computed yet. This patch adds support into InstrRefBasedLDV to record when a variable value appears in the middle of a block, and should have a DBG_VALUE added when it appears (a debug use before def). While described simply, this relies heavily on the value-propagation algorithm in InstrRefBasedLDV. The implementation doesn't attempt to verify the location of a value unless something non-trivial occurs to merge variable values in vlocJoin. This means that a variable with a value that has no location can retain it across all control flow (including loops). It's only when another debug instruction specifies a different variable value that we have to check, and find there's no location. This property means that if a machine value is defined in a block dominated by a DBG_INSTR_REF that refers to it, all the successor blocks can automatically find a location for that value (if it's not clobbered). Thus in a sense, InstrRefBasedLDV is already supporting and implementing use-before-defs. This patch allows us to specify a variable location in the block where it's defined. When loading live-in variable locations, TransferTracker currently discards those where it can't find a location for the variable value. However, we can tell from the machine value number whether the value is defined in this block. If it is, add it to a set of use-before-def records. Then, once the relevant instruction has been processed, emit a DBG_VALUE immediately after it. Differential Revision: https://reviews.llvm.org/D85775	2020-10-23 16:33:23 +01:00
Jay Foad	9321aed101	[AMDGPU] Add simplification/combines for llvm.amdgcn.fma.legacy This follows on from D89558 which added the new intrinsic and D88955 which added similar combines for llvm.amdgcn.fmul.legacy. Differential Revision: https://reviews.llvm.org/D90028	2020-10-23 16:16:13 +01:00
Denis Antrushin	91be48b03e	Revert "[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty." Downstream testing revealed some problems with this patch. Reverting while investigating. This reverts commit 2b96dcebfae65485859d956954f10f409abaae79.	2020-10-23 21:55:06 +07:00
Nicolai Hähnle	12fb986d8e	CfgInterface: rename interface() to getInterface() Apparently there are some Microsoft headers which `#define interface struct`. This method is only used in pending changes so far. Change-Id: Ic68fe8e1958ec9b015f817ee218431f4146b888a	2020-10-23 16:52:10 +02:00
Simon Pilgrim	2c1de86bff	[InstCombine] Add i8 bitreverse by multiplication test patterns Pulled from bit twiddling hacks webpage	2020-10-23 15:39:57 +01:00
Simon Pilgrim	0f2f9e2cf6	[InstCombine] Add 8/16/32/64 bitreverse test coverage Use typical codegen for the traditional pairwise lgN bitreverse algorithm	2020-10-23 15:39:56 +01:00
Simon Pilgrim	124244322c	[InstCombine] Add initial bitreverse test coverage	2020-10-23 15:39:56 +01:00
Paul C. Anagnostopoulos	a16b7dbf16	[TableGen] Change !getop and !setop to !getdagop and !setdagop. Differential Revision: https://reviews.llvm.org/D89814	2020-10-23 10:36:05 -04:00
Matt Arsenault	46f491ab64	AMDGPU: Don't query for TII in TII	2020-10-23 10:34:24 -04:00
Matt Arsenault	ae7605c97d	AMDGPU: Increase branch size estimate with offset bug This will be relaxed to insert a nop if the offset hits the bad value, so over estimate branch instruction sizes.	2020-10-23 10:34:24 -04:00
Evgeny Leviant	84c45b2024	[llvm-mca] Extend cortex-a57 memory instructions test Patch adds a few load/store instructions which have custom sched classes in the cortex-a57 model.	2020-10-23 17:02:20 +03:00
Jeremy Morse	7b0445fed2	[DebugInstrRef] Convert DBG_INSTR_REFs into variable locations Handle DBG_INSTR_REF instructions in LiveDebugValues, to determine and propagate variable locations. The logic is fairly straightforward: Collect a map of debug-instruction-number to the machine value numbers generated in the first walk through the function. When building the variable value transfer function and we see a DBG_INSTR_REF, look up the instruction it refers to, and pick the machine value number it generates. That's it; the rest of LiveDebugValues continues as normal. Awkwardly, there are two kinds of instruction numbering happening here: the offset into the block (which is how machine value numbers are determined), and the numbers that we label instructions with when generating DBG_INSTR_REFs. I've also restructured the TransferTracker redefVar code a little, to separate some DBG_VALUE specific operations into its own method. The changes around redefVar should be largely NFC, while allowing DBG_INSTR_REFs to specify a value number rather than just a location. Differential Revision: https://reviews.llvm.org/D85771	2020-10-23 14:50:02 +01:00
Nico Weber	45a7f324f8	[gn build] port 48e4b0f (__config_site revert) This reverts commit b3ca53e14274642274be8fe7db8b43dc3c146366. This reverts commit 8b7dac81d378c339d3e55f6f51cd0c42803903ad. This reverts commit 37c030f81a9fdd7a7e1b6fa5407b277c1ab1afa1.	2020-10-23 09:45:34 -04:00
Louis Dionne	5216b4c86c	[runtimes] Revert the libc++ __config_site change This is a massive revert of the following commits (from most recent to oldest): 2b9b7b5775a1d8fcd7aa5abaa8fc0bc303434f1a. 529ac33197f6408952ae995075ac5e2dc5287e81 28270234f1478047e35879f4ba8838b47edfcc14 69c2087283cf7b17ca75f69daebf4ffc158b754a b5aa67446e01bd277727b05710a42e69ac41e74b 5d796645d6c8cadeb003715c33e231a8ba05b6de After checking-in the __config_site change, a lot of things started breaking due to widespread reliance on various aspects of libc++'s build, notably the fact that we can include the headers from the source tree, but also reliance on various "internal" CMake variables used by the runtimes build and compiler-rt. These were unintended consequences of the change, and after two days, we still haven't restored all the bots to being green. Instead, now that I understand what specific areas this will blow up in, I should be able to chop up the patch into smaller ones that are easier to digest. See https://reviews.llvm.org/D89041 for more details on this adventure.	2020-10-23 09:41:48 -04:00
Chen Zheng	44da140a08	[LSR] ignore profitable chain when reg num is not major cost. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D89665	2020-10-23 09:35:48 -04:00
Sanjay Patel	954849c6b1	[ValueTracking] add range limits for cttz As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of cttz to process any "icmp pred cttz(X), C" pattern (the min value is initialized to zero automatically). https://alive2.llvm.org/ce/z/Z_SLWZ Follow-up to D89976.	2020-10-23 08:43:45 -04:00
Sanjay Patel	57190e997a	[ValueTracking] add range limits for ctlz As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of ctlz to process any "icmp pred ctlz(X), C" pattern (the min value is initialized to zero automatically). Follow-up to D89976.	2020-10-23 08:43:45 -04:00
Sanjay Patel	e96f82baa9	[InstSimplify] add tests for cttz constant range; NFC This is a search-and-replace of f6cb7f3	2020-10-23 08:43:45 -04:00
Sanjay Patel	7ca952e6d1	[InstSimplify] add tests for ctlz constant range; NFC This is a search-and-replace of f6cb7f3.	2020-10-23 08:43:45 -04:00
Sam McCall	dd753f1c7d	[CMake] Fix hardcoding of protobuf output basename. NFC Differential Revision: https://reviews.llvm.org/D90030	2020-10-23 14:29:57 +02:00
Sam McCall	1777cfccdd	[CMake] generate_grpc_protos -> generate_protos(... GRPC). NFC Differential Revision: https://reviews.llvm.org/D90027	2020-10-23 14:28:07 +02:00
Sanjay Patel	68dbafe4a4	[ValueTracking] add range limits for ctpop As discussed in D89952, instcombine can sometimes find a way to reduce similar patterns, but it is incomplete. InstSimplify uses the computeConstantRange() ValueTracking analysis via simplifyICmpWithConstant(), so we just need to fill in the max value of ctpop to process any "icmp pred ctpop(X), C" pattern (the min value is initialized to zero automatically). Differential Revision: https://reviews.llvm.org/D89976	2020-10-23 08:17:54 -04:00
Simon Pilgrim	a52d320c19	[InstCombine] matchBSwapOrBitReverse - expose bswap/bitreverse matching flags. matchBSwapOrBitReverse was hardcoded to just match bswaps - we're going to need to expose the ability to match bitreverse as well, so make this part of the function call.	2020-10-23 12:35:28 +01:00
Simon Pilgrim	66105f50cc	[InstCombine] Rename InstCombinerImpl::matchBSwap to matchBSwapOrBitReverse. NFCI. This matches bswap and bitreverse intrinsics, so we should make that clear in the function name.	2020-10-23 12:35:27 +01:00
Simon Pilgrim	839cb17113	[X86] lowerShuffleWithPERMV - use MVT::changeTypeToInteger helper. NFCI.	2020-10-23 12:35:27 +01:00
Evgeny Leviant	4038f285b5	[ARM][SchedModels] Convert IsR1P0AndLaterPred to MCSchedPredicate. NFC Differential revision: https://reviews.llvm.org/D90017	2020-10-23 14:27:49 +03:00
Florian Hahn	785022033e	[AArch64] Implement getIntrinsicInstrCost, handle min/max intrinsics. This patch adds a specialized implementation of getIntrinsicInstrCost and adds initial cost-modeling for min/max vector intrinsics. AArch64 NEON supports umin/smin/umax/smax for vectors <8 x i8>, <16 x i8>, <4 x i16>, <8 x i16>, <2 x i32> and <4 x i32>. Notably, it does not support vectors with i64 elements. This change by itself should have very little impact on codegen, but in follow-up patches I plan to teach the vectorizers to consider using those intrinsics on platforms where it is profitable, e.g. because there is no general 'select'-like instruction. The current cost returned should be better for throughput, latency and size. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D89953	2020-10-23 11:32:42 +01:00

205623 Commits