llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Sjoerd Meijer	215e8fb34a	[MachineLICM][MachineSink] Move SinkIntoLoop to MachineSink. This moves SinkIntoLoop from MachineLICM to MachineSink. The motivation for this work is that hoisting is a canonicalisation transformation, but we do not really have a good story to sink instructions back if that is better, e.g. to reduce live-ranges, register pressure and spilling. This has been discussed a few times on the list, the latest thread is: https://lists.llvm.org/pipermail/llvm-dev/2020-December/147184.html There it was pointed out that we have the LoopSink IR pass, but that works on IR, lacks register pressure information, and is focused on profile guided optimisations, and then we have MachineLICM and MachineSink that both perform sinking. MachineLICM is more about hoisting and CSE'ing of hoisted instructions. It also contained a very incomplete and disabled-by-default SinkIntoLoop feature, which we now move to MachineSink. Getting loop-sinking to do something useful is going to be at least a 3-step approach: 1) This is just moving the code and is almost a NFC, but contains a bug fix. This uses helper function `isLoopInvariant` that was factored out in D94082 and added to MachineLoop. 2) A first functional change to make loop-sink a little bit less restrictive, which it really is at the moment, is the change in D94308. This lets it do more (alias) analysis using functions in MachineSink, making it a bit more powerful. Nothing changes much: still off by default. But it shows that MachineSink is a better home for this, and it starts using its functionality like `hasStoreBetween`, and in the next step we can use `isProfitableToSinkTo`. 3) This is going to be the interesting step: decision making when and how many instructions to sink. This will be driven by the register pressure, and deciding if reducing live-ranges and loop sinking will help in better performance. 4) Once we are happy with 3), this should be enabled by default, that should be the end goal of this exercise. Differential Revision: https://reviews.llvm.org/D93694	2021-01-27 10:49:56 +00:00
David Green	5c8cfcc0df	[AArch64] Add vector saturating add intrinsic costs This adds sadd.sat, uadd.sat, ssub.sat and usub.sat costs for AArch64, similar to how they were recently added for ARM. Differential Revision: https://reviews.llvm.org/D95292	2021-01-27 10:38:32 +00:00
Fraser Cormack	0f5c801164	[RISCV] Fix a codegen crash in getSetCCResultType This patch fixes some crashes coming from `RISCVISelLowering::getSetCCResultType`, which would occasionally return an EVT constructed from an invalid MVT, which has a null Type pointer. The attached test shows this happening currently for some fixed-length vectors, which hit this issue when the V extension was enabled, even though they're not legal types under the V extension. The fix was also pre-emptively extended to scalable vectors which can't be represented as an MVT, even though a test case couldn't be found for them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D95434	2021-01-27 10:22:54 +00:00
Jay Foad	fbb73224ea	[AMDGPU] Write "GFX6-GFX9" instead of "GFX6-9" in docs ... and similarly for some other cases. This is for consistency and to make it easier to search for mentions of a particular architecture. Differential Revision: https://reviews.llvm.org/D95453	2021-01-27 10:07:07 +00:00
David Green	cfa651ce67	[ARM] Add neon FP16 scalar_to_vector patterns. This adds some simple fp16 scalar_to_vector patterns, preventing a selection failure if this came up. Differential Revision: https://reviews.llvm.org/D95427	2021-01-27 09:59:15 +00:00
Cassie Jones	211141f4c9	[Test][AArch64] Use named vregs in overflow legalization tests. NFC	2021-01-27 04:40:49 -05:00
Cassie Jones	cf15864409	[AArch64][GlobalISel] Make G_SADDE and G_SSUBE legal This makes G_SADDE and G_SSUBE legal in preparation for further work legalizing overflowing operations. It's fine that they don't have an instruction selector implementation yet, because G_UADDE and G_USUBE are already legal on AArch64 without an instruction selector implementation. This completes the set of G_[SU]{ADD,SUB}[EO] operations on AArch64. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D95325	2021-01-27 04:36:17 -05:00
Alexey Bader	efed872d6e	Fix an error about implicit fallthrough during self build - new tag for ittapi. A fix has been implemented in the ittapi repo to fix an error about implicit fallthrough in a switch that was occurring during self build. A new tag has been created for that fix. This is to update the tag. Reviewed By: bader Differential Revision: https://reviews.llvm.org/D95462 Patch by Zahira Ammarguellat.	2021-01-27 08:55:52 +03:00
Kazu Hirata	923c60906b	[llvm-objdump] Use append_range (NFC)	2021-01-26 20:00:19 -08:00
Kazu Hirata	8f3f48c295	[MemorySSA] Use ListSeparator (NFC)	2021-01-26 20:00:18 -08:00
Kazu Hirata	d493c03566	[AMDGPU] Forward-declare TargetRegisterClass (NFC) AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a forward declaration of TargetRegisterClass in InstructionSelector.h. This patch adds a forward declaration right in AMDGPUInstructionSelector.h. While we are at it, this patch removes the one in InstructionSelector.h, where it is unnecessary.	2021-01-26 20:00:16 -08:00
Craig Topper	69039e7730	[TableGen] Add isContradictoryImpl implementation to CheckCondCodeMatcher and CheckChild2CondCodeMatcher. This enables better pattern factoring in the RISCV ISel table.	2021-01-26 19:44:57 -08:00
Tom Stellard	1a410d196a	Bump the trunk major version to 13 and clear the release notes.	2021-01-26 19:37:55 -08:00
LLVM GN Syncbot	8cadd79a8e	[gn build] Port bb9eb1982980	2021-01-27 01:23:23 +00:00
Craig Topper	87d87d0dfc	[RISCV] Add rv64 run lines to rv32 MC layer tests for B extension Remove common instructions from rv64 tests since they are now covered by the rv64 run lines in the rv32 tests. Add rv32-only* tests for a few cases that aren't common between rv32 and rv64. Addresses review feedback from D95150. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D95272	2021-01-26 17:20:05 -08:00
Petr Hosek	3cfd18b793	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Nico Weber	696b1678eb	[gn build] fix get.py change	2021-01-26 19:20:23 -05:00
Nico Weber	5eefc4a62f	[gn build] restore build command removed in 9595a7ff55b6 for platforms without prebuilts	2021-01-26 19:19:31 -05:00
Nico Weber	1e7cca27a4	llvm-lib: Pull error printing code out of two functions Slightly changes the output in error code, but no behavior change in normal use. This is for preparation for using these two functions elsewhere.	2021-01-26 19:13:30 -05:00
Fangrui Song	9459868630	[llc] Add reportError helper and canonicalize error messages	2021-01-26 15:33:37 -08:00
Duncan P. N. Exon Smith	795af9c7de	Frontend: Simplify handling of non-seeking streams in CompilerInstance, NFC Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which is like `buffer_ostream` but with unique ownership of the stream it's wrapping. Use this in CompilerInstance to simplify the ownership of non-seeking output streams, avoiding logic sprawled around to deal with them specially. This also simplifies future work to encapsulate output files in a different class. Differential Revision: https://reviews.llvm.org/D93260	2021-01-26 15:20:43 -08:00
Jessica Paquette	ed1a930649	[GlobalISel] Implement computeKnownBits for G_SEXT_INREG Just use the existing `Known.sextInReg` implementation. - Update KnownBitsTest.cpp. - Update combine-redundant-and.mir for a more concrete example. Differential Revision: https://reviews.llvm.org/D95484	2021-01-26 15:01:38 -08:00
Adrian Prantl	73fb46a3c5	Salvage debug info for function arguments in coro-split funclets. This patch improves the availability for variables stored in the coroutine frame by emitting an alloca to hold the pointer to the frame object and rewriting dbg.declare intrinsics to point inside the frame object using salvaged DIExpressions. Finally, a new alloca is created in the funclet to hold the FramePtr pointer to ensure that it is available throughout the entire function at -O0. This patch also effectively reverts D90772. The testcase updates highlight nicely how every removed CHECK for a dbg.value is preceded by a new CHECK for a dbg.declare. Thanks to JunMa, Yifeng, and Bruno for their thoughtful reviews! Differential Revision: https://reviews.llvm.org/D93497 rdar://71866936	2021-01-26 15:01:26 -08:00
Zhuojia Shen	822d9e26e1	[ARM] Fix STRT/STRHT/STRBT input/output operands. STRT, STRHT, and STRBT are store instructions and their source register $Rt should be treated as an input operand instead of an output operand. This should fix things (e.g., liveness tracking in LivePhysRegs) if these instructions were used in CodeGen. Differential Revision: https://reviews.llvm.org/D95074	2021-01-26 14:00:58 -08:00
Bjorn Pettersson	3aa7f0e352	[NewPM] Add ExtraVectorizerPasses support As it looks like NewPM generally is using SimpleLoopUnswitch instead of LoopUnswitch, this patch also uses SimpleLoopUnswitch in the ExtraVectorizerPasses sequence (compared with LegacyPM, which uses the LoopUnswitch pass). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D95457	2021-01-26 22:59:10 +01:00
Valery N Dmitriev	6413cc64a9	[InstCombine] Preserve FMF for powi simplifications. Differential Revision: https://reviews.llvm.org/D95455	2021-01-26 13:26:06 -08:00
Valery N Dmitriev	6cc92601ad	[NFC] Show instcombine powi simplifications drop FMF Differential Revision: https://reviews.llvm.org/D95454	2021-01-26 13:26:06 -08:00
Craig Topper	60dd07a304	[X86] In shrinkAndImmediate, place the new constant into the topological sort. Revert the change to use APInt::isSignedIntN from 5ff5cf8e057782e3e648ecf5ccf1d9990b53ee90. It's clear that the games we were playing to avoid the topological sort aren't working. So just fix it once and for all. Fixes PR48888.	2021-01-26 13:18:04 -08:00
Julian Lettner	f11e7efb47	[NFC][lit] Cleanup code using string interpolation LLVM now requires Python 3.6, so we can use string interpolation to make code more readable.	2021-01-26 13:04:31 -08:00
Amara Emerson	1eb88d40b9	[GlobalISel][IRTranslator] Ignore the llvm.experimental.noalias.scope.decl intrinsic. These don't generate any code.	2021-01-26 13:04:11 -08:00
LLVM GN Syncbot	80d5e143a6	[gn build] Port 1e634f3952aa	2021-01-26 20:48:31 +00:00
Fangrui Song	3aa07124e5	[llvm-elfabi] Fix test after D95140	2021-01-26 12:45:45 -08:00
Haowei Wu	4cba78c0ca	[llvm-elfabi] Support ELF file that lacks .gnu.hash section Before this change, when reading an ELF file, elfabi determines the number of entries in .dynsym by reading the .gnu.hash section. This change makes elfabi read section headers directly first. This change allows elfabi to work on ELF files which do not have .gnu.hash sections. Differential Revision: https://reviews.llvm.org/D93362	2021-01-26 12:31:52 -08:00
Fangrui Song	71834ae8ab	Add -fbinutils-version= to gate ELF features on the specified binutils version There are two use cases. Assembler We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some features are supported by latest GNU as, but we have to use MCAsmInfo::useIntegratedAs() because the newer versions have not been widely adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26). Linker We want to use features supported only by LLD or very new GNU ld, or don't want to work around older GNU ld. We currently can't represent that "we don't care about old GNU ld". You can find such workarounds in a few other places, e.g. Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276), R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969) Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001; GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available). This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table). This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc. It changes one codegen place in SHF_MERGE to demonstrate its usage. `-fbinutils-version=2.35` means the produced object file does not care about GNU ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced assembly can be consumed by GNU as>=2.35, but older versions may not work. `-fbinutils-version=none` means that we can use all ELF features, regardless of GNU as/ld support. Both clang and llc need `parseBinutilsVersion`. Such command line parsing is usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen), however, ClangCodeGen does not depend on LLVMCodeGen. So I add `parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget). Differential Revision: https://reviews.llvm.org/D85474	2021-01-26 12:28:23 -08:00
Petr Hosek	ef4906bd9c	Revert "Support for instrumenting only selected files or functions" This reverts commit 4edf35f11a9e20bd5df3cb47283715f0ff38b751 because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Austin Kerbow	d1f23a1772	[AMDGPU] Update subtarget features for new target ID support Support for XNACK and SRAMECC is not static on some GPUs. We must be able to differentiate between different scenarios for these dynamic subtarget features. The possible settings are: - Unsupported: The GPU has no support for XNACK/SRAMECC. - Any: Preference is unspecified. Use conservative settings that can run anywhere. - Off: Request support for XNACK/SRAMECC Off - On: Request support for XNACK/SRAMECC On GCNSubtarget will track the four options based on the following criteria. If the subtarget does not support XNACK/SRAMECC we say the setting is "Unsupported". If no subtarget features for XNACK/SRAMECC are requested we must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the feature string when initializing the subtarget, the settings are "On/Off". The defaults are updated to be conservatively correct, meaning if no setting for XNACK or SRAMECC is explicitly requested, defaults will be used which generate code that can be run anywhere. This corresponds to the "Any" setting. Differential Revision: https://reviews.llvm.org/D85882	2021-01-26 11:25:51 -08:00
LLVM GN Syncbot	364bd7857e	[gn build] Port 4edf35f11a9e	2021-01-26 19:12:09 +00:00
Petr Hosek	a8839c989d	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Adhemerval Zanella	7ab636be94	[ARM] [ELF] Fix ARMMaterializeGV for Indirect calls Recent shouldAssumeDSOLocal changes (introduced by 961f31d8ad14c66) do not take into consideration the relocation model anymore. The ARM fast-isel pass uses the function return to set whether a global symbol is loaded indirectly or not, and without the expected information llvm now generates an extra load for the following code: ``` $ cat test.ll @__asan_option_detect_stack_use_after_return = external global i32 define dso_local i32 @main(i32 %argc, i8** %argv) #0 { entry: %0 = load i32, i32* @__asan_option_detect_stack_use_after_return, align 4 %1 = icmp ne i32 %0, 0 br i1 %1, label %2, label %3 2: ret i32 0 3: ret i32 1 } attributes #0 = { noinline optnone } $ llc test.ll -o - [...] main: .fnstart [...] movw r0, :lower16:__asan_option_detect_stack_use_after_return movt r0, :upper16:__asan_option_detect_stack_use_after_return ldr r0, [r0] ldr r0, [r0] cmp r0, #0 [...] ``` And without 'optnone' it produces: ``` [...] main: .fnstart [...] movw r0, :lower16:__asan_option_detect_stack_use_after_return movt r0, :upper16:__asan_option_detect_stack_use_after_return ldr r0, [r0] clz r0, r0 lsr r0, r0, #5 bx lr [...] ``` This triggered a lot of invalid memory access in sanitizers for arm-linux-gnueabihf. I checked this patch both a stage1 built with gcc and a stage2 bootstrap and it fixes all the Linux sanitizers issues. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D95379	2021-01-26 15:57:55 -03:00
Craig Topper	4599117eed	[RISCV] Have customLegalizeToWOp truncate to the original type instead of i32 now that we use it for i8/i16 as well. 239cfbccb0509da1a08d9e746706013b732e646b added support for legalizing i8/i16 UDIV/UREM/SDIV to use *W instructions. So we need to truncate to i8/i16 if we're legalizing one of those.	2021-01-26 10:50:03 -08:00
Julian Lettner	0aac51e35d	Reland "[lit] Use os.cpu_count() to cleanup TODO" The initial problem with the remaining bot config was resolved. We can now use Python3. Let's use `os.cpu_count()` to cleanup this helper. Differential Revision: https://reviews.llvm.org/D94734	2021-01-26 10:19:26 -08:00
Matt Arsenault	6be06cd224	AMDGPU: Fix redundant FP spilling/assert in some functions If a function has stack objects, and a call, we require an FP. If we did not initially have any stack objects, and only introduced them during PrologEpilogInserter for CSR VGPR spills, SILowerSGPRSpills would end up spilling the FP register as if it were a normal register. This would result in an assert in a debug build, or redundant handling of the FP register in a release build. Try to predict that we will have an FP later, although this is ugly.	2021-01-26 13:01:45 -05:00
Matt Arsenault	df7c58a46d	AMDGPU: Add assertion to determineCalleeSaves Make sure this isn't getting called multiple times. I was surprised we were modifying the function here, which I think is a bit questionable.	2021-01-26 13:01:45 -05:00
Sanjay Patel	da6d2a1054	[LoopVectorize] add test for fmin/fmax FMF propagation; NFC The existing test has less FMF than we might expect if our FMF was fixed (on all FP values), so this additional test is intended to check propagation in a more "normal" example.	2021-01-26 11:22:51 -05:00
Sanjay Patel	549e4519b0	[LoopUtils] do not initialize Cmp predicate unnecessarily; NFC The switch must set the predicate correctly; anything else should lead to unreachable/assert. I'm trying to fix FMF propagation here and the callers, so this is a preliminary cleanup.	2021-01-26 11:22:51 -05:00
Simon Pilgrim	95a58e9335	[AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI. Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions	2021-01-26 16:09:39 +00:00
Simon Pilgrim	345ddf4259	[AMDGPU] Fix null-dereference static analysis warnings. NFCI. Avoid repeated calls to isZeroValue() and check for a null pointer before dereferencing a dyn_cast<>.	2021-01-26 15:43:59 +00:00
Matt Arsenault	0db9cfc2ab	AMDGPU: Clear IsSSA property in SIFormMemoryClauses Fixes verifier error when writing MIR testcases	2021-01-26 10:40:41 -05:00
Florian Hahn	acb2114d07	[LoopUnswitch] Avoid partially unswitching too aggressively. This patch adds additional checks to avoid partial unswitching in cases where it won't be profitable, e.g. because the path directly exits the loop anyways.	2021-01-26 15:18:41 +00:00
Florian Hahn	edc697c1b2	[LoopUnswitch] Add some additional tests. Add a few additional tests where partial unswitching is not really profitable and should be avoided.	2021-01-26 15:12:45 +00:00

210379 Commits