llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Coplin, Jared	1b05246eda	[Hexagon] Masked and unmasked load to same base -> load and two selects	2021-04-22 08:44:01 -05:00
Simon Pilgrim	aa49d8a879	[X86] Regenerate atomic-eflags-reuse.ll	2021-04-22 14:07:12 +01:00
Simon Pilgrim	6870b3d5f3	[LTO] Caching.h - remove unused <string> include. NFCI.	2021-04-22 14:07:12 +01:00
Dawid Jurczak	a221d1af2d	[InstCombine][NFC] Use --check-globals flag in tests. This patch adds string content checking to printf-2.ll via --check-globals flag. Split off from D100724. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D101034	2021-04-22 15:06:42 +02:00
Dawid Jurczak	c59a909768	[SimplifyLibCalls][NFC] Use StringRef::back instead of explicit indexing. Split off from D100724. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D101032	2021-04-22 15:02:47 +02:00
Jun Ma	684cf0c2d5	[DAGCombiner] Allow operand of step_vector to be negative. It is proper to relax non-negative limitation of step_vector. Also this patch adds more combines for step_vector: (sub X, step_vector(C)) -> (add X, step_vector(-C)) Differential Revision: https://reviews.llvm.org/D100812	2021-04-22 20:58:03 +08:00
Wang, Pengfei	429405cf52	[X86][AMX][NFC] Remove assert for comparison between different BBs. SmallSet may use operator `<` when we insert MIRef elements, so we cannot limit the comparison between different BBs. We allow MIRef() to be less than any initialized MIRef object; otherwise, we always return false when comparing between different BBs. Differential Revision: https://reviews.llvm.org/D101039	2021-04-22 20:41:59 +08:00
Nico Weber	78b7018813	[gn build] (manually) port aee6c86c4d better "EmptyNodeIntrospection.inc.in" needs to be a source of the action, so that ninja knows to rerun this action if that input changes.	2021-04-22 08:41:40 -04:00
Nico Weber	5c2f53e58f	[gn build] (manually) port aee6c86c4d	2021-04-22 08:36:19 -04:00
Jay Foad	f7148011b7	Fix typo "beneficiates" in comments	2021-04-22 12:30:16 +01:00
Stephen Tozer	80d2f76226	[Bitcode] Ensure DIArgList in bitcode has no null or forward metadata refs This patch fixes an issue in which ConstantAsMetadata arguments to a DIArgList, as well as the Constant values referenced by that metadata, would not always be emitted correctly into bitcode. This patch fixes this issue firstly by searching for ConstantAsMetadata in DIArgLists (previously we would only search for them when directly wrapped in MetadataAsValue), and secondly by enumerating all of a DIArgList's arguments directly prior to enumerating the DIArgList itself. This patch also adds a number of asserts, and no longer treats the arguments to a DIArgList as optional fields when reading/writing to bitcode. Differential Revision: https://reviews.llvm.org/D100572	2021-04-22 12:03:33 +01:00
Simon Pilgrim	a2c998d322	MipsSEFrameLowering.h - remove unused headers. NFCI.	2021-04-22 11:32:29 +01:00
Simon Pilgrim	421eafbbab	[X86][AVX] Add PR49971 test case This is a llvm12 only bug, and is already avoided in trunk, but we should keep track of it.	2021-04-22 11:32:29 +01:00
Nemanja Ivanovic	422f9c37c7	[PowerPC] Improve codegen for vector fp to int widening conversions We currently do not utilize instructions that convert single precision vectors to doubleword integer vectors. These conversions come up in code occasionally and this improvement allows us to open code some functions that need to be added to altivec.h.	2021-04-22 05:04:06 -05:00
Martin Storsjö	1c71ec1903	[AArch64] Fix calling windows varargs with floats in fixed args from non-windows functions When inspecting the calling convention, for calling windows functions from a non-windows function, inspect the calling convention of the called function, not the caller. Also remove an unnecessary parameter to AArch64CallLowering OutgoingArgHandler. Differential Revision: https://reviews.llvm.org/D100890	2021-04-22 12:02:49 +03:00
Jay Foad	72550bf43c	[AMDGPU] SIWholeQuadMode: don't add duplicate implicit $exec operands STRICT_WWM and STRICT_WQM are already defined with Uses = [EXEC], so there is no need to add another implicit use of $exec when lowering them to V_MOV_B32 instructions. Differential Revision: https://reviews.llvm.org/D100969	2021-04-22 09:19:47 +01:00
Serge Pavlov	830cd58476	[RISCV] Custom lowering of SET_ROUNDING Differential Revision: https://reviews.llvm.org/D91242	2021-04-22 15:04:55 +07:00
David Sherwood	e6e9b5dddc	[LoopVectorize] Don't create unnecessary vscale intrinsic calls In quite a few cases in LoopVectorize.cpp we call createStepForVF with a step value of 0, which leads to unnecessary generation of llvm.vscale intrinsic calls. I've optimised IRBuilder::CreateVScale and createStepForVF to return 0 when attempting to multiply vscale by 0. Differential Revision: https://reviews.llvm.org/D100763	2021-04-22 09:01:52 +01:00
Wenlei He	e734b4c21b	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles The change adds support for trimming and merging cold context when merging CSSPGO profiles using llvm-profdata. This is similar to the context profile trimming in llvm-profgen, however the flexibility to trim cold context after profile is generated can be useful. Differential Revision: https://reviews.llvm.org/D100528	2021-04-22 00:42:37 -07:00
Arthur Eubanks	a2acab380b	[NewPM] Mark some more wrapper passes as ignored We shouldn't print IR when seeing these passes.	2021-04-21 23:55:02 -07:00
Craig Topper	c35e5975c3	[RISCV] Use TargetConstant for condition code of RISCVISD::SELECT_CC. The value is always an immediate and can never be in a register. This is the kind of thing TargetConstant is for. Saves a step in GenDAGISel to convert a Constant to a TargetConstant.	2021-04-21 23:08:52 -07:00
Max Kazantsev	9fbf6639f4	[GVN] Introduce loop load PRE This patch allows PRE of the following type of loads: ``` preheader: br label %loop loop: br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p br label %merge merge: ... br i1 ..., label %loop, label %exit ``` Into ``` preheader: %x0 = load %p br label %loop loop: %x.pre = phi(x0, x2) br i1 ..., label %merge, label %clobber clobber: call foo() // Clobbers %p %x1 = load %p br label %merge merge: x2 = phi(x.pre, x1) ... br i1 ..., label %loop, label %exit ``` So instead of loading from %p on every iteration, we load only when the actual clobber happens. The typical pattern which it is trying to address is: hot loop, with all code inlined and provably having no side effects, and some side-effecting calls on cold path. The worst overhead from it is, if we always take clobber block, we make 1 more load overall (in preheader). It only matters if the loop has very few iterations. If clobber block is not taken at least once, the transform is neutral or profitable. Several prospective improvements open up: - We can sometimes be smarter in loop-exiting blocks via split of critical edges; - If we have block frequency info, we can handle multiple clobbers. The only obstacle now is that we don't know if their sum is colder than the header. Differential Revision: https://reviews.llvm.org/D99926 Reviewed By: reames	2021-04-22 12:50:38 +07:00
Snehasish Kumar	5f0fcf1af8	[CodeGen] Do not split functions with attr "implicit-section-name". The #pragma clang section can be used at a coarse granularity to specify the section used for bss/data/text/rodata for global objects. When split functions is enabled, the function may be split into two parts violating user expectations. Reference: https://clang.llvm.org/docs/LanguageExtensions.html#specifying-section-names-for-global-objects-pragma-clang-section Differential Revision: https://reviews.llvm.org/D101004	2021-04-21 21:51:33 -07:00
Serge Pavlov	4dafa2234f	[RISCV] Custom lowering of FLT_ROUNDS_ Differential Revision: https://reviews.llvm.org/D90854	2021-04-22 11:39:15 +07:00
Craig Topper	acf9ae5608	[RISCV] Teach lowerSPLAT_VECTOR_PARTS to detect cases where Hi is sign extended from Lo. This recognizes the case when Hi is (sra Lo, 31). We can use SPLAT_VECTOR_I64 rather than splatting the high bits and combining them in the vector register.	2021-04-21 20:24:23 -07:00
Chuanqi Xu	da5ec05222	[Coroutine] Collect CoroBegin if all of terminators are dominated by one coro.destroy Summary: The original logic seems to be that we could collect a CoroBegin if one of the terminators could be dominated by one of coro.destroy, which doesn't make sense. This patch rewrites the logic to collect CoroBegin if all of terminators are dominated by one coro.destroy. If there is no such coro.destroy, we would call hasEscapePath to evaluate if we should collect it. Test Plan: check-llvm Reviewed by: lxfind Differential Revision: https://reviews.llvm.org/D100614	2021-04-22 11:21:37 +08:00
Evgeniy Brevnov	d5e146fe48	Wordsmith the semantics of invariant.load Don't phrase the semantics in terms of the optimizer. Instead have a more straightforward execution based semantic. Reviewed By: ebrevnov Differential Revision: https://reviews.llvm.org/D63439	2021-04-22 10:06:13 +07:00
Matt Arsenault	3f7e76669e	AMDGPU: Fix assert when trying to fold reg_sequence of physreg copies	2021-04-21 21:58:18 -04:00
Giorgis Georgakoudis	fcb81a364e	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00
Fangrui Song	dd466a7214	Delete le32/le64 targets They are unused now. Note: NaCl is still used and is currently expected to be needed until 2022-06 (https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html). Differential Revision: https://reviews.llvm.org/D100981	2021-04-21 18:44:12 -07:00
Jessica Paquette	2faffc9dab	[AArch64][GlobalISel] Fix regbankselect for G_FCMP with vector destinations These should always go to a FPR, since they always use the vector registers. Differential Revision: https://reviews.llvm.org/D100885	2021-04-21 18:11:30 -07:00
Jessica Paquette	11b294d082	[AArch64][GlobalISel] Mark some vector G_ABS cases as legal Each of the cases marked as legal here have an imported pattern in AArch64GenGlobalISel.inc. So, if we mark them as legal, we get selection for free. Technically this is only supposed to happen if we have NEON support. But, we fall back if we don't have that in the legalizer right now. I suppose it'd be better to have a FIXME so we can write the testcase when the time comes. (Plus, it'd just fall back in selection if NEON isn't available, so it's not wrong, I guess?) This fixes some fallbacks in the test suite. (Also use `isScalar` from LegalityPredicates.cpp while we're here just to tidy things a little bit.) Differential Revision: https://reviews.llvm.org/D100916	2021-04-21 18:10:40 -07:00
Hongtao Yu	29e34b908b	[CSSPGO][llvm-profgen] Always report dangling probes for frames with real samples. Report dangling probes for frames that have real samples collected. Dangling probes are the probes associated to an empty block. When reported, sample count on a dangling probe will not be trusted by the compiler and we will rely on the counts inference algorithm to get the probe a reasonable count. This actually fixes a bug where previously only those dangling probes with samples collected were reported. This patch also fixes two existing issues. Pseudo probes are stored in `Address2ProbesMap` and their pointers are used in `PseudoProbeInlineTree`. Previously `std::vector` was used to store probes and the pointers to probes may get obsolete as the vector grows. I'm changing `std::vector` to `std::list` instead. The other issue is that all outlined functions shared the same inline frame previously due to the unchanged `Index` value as the dummy inlineSite identifier. Good results seen for SPEC2017 in general regarding profile quality. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D100235	2021-04-21 18:07:58 -07:00
Fangrui Song	40b4beda19	[IR] Add doc about Function::createWithDefaultAttr. NFC	2021-04-21 16:20:50 -07:00
Fangrui Song	b632d8aff2	[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module. Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.) Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame. This patch * sets the nounwind attribute on sanitizer module ctor/dtor. * lets Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute. The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics. Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc. Differential Revision: https://reviews.llvm.org/D100251	2021-04-21 15:58:20 -07:00
Michael Holman	e4e8dcd990	[CodeView] Add CodeView support for PGO debug information This change adds debug information about whether PGO is being used or not. Microsoft performance tooling (e.g. xperf, WPA) uses this information to show whether functions are optimized with PGO or not, as well as whether PGO information is invalid. This information is useful for validating whether training scenarios are providing good coverage of real world scenarios, showing if profile data is out of date, etc. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99994	2021-04-21 15:29:19 -07:00
Craig Topper	c445dfb378	[RISCV] Clean up the spec version references around fmaxnum/fminnum. This previously made references to 2.3-draft which was a short-lived version number in 2017. It was replaced by date based versions leading up to ratification. This patch uses the latest ratified version number and just says what the behavior is. Nothing here is in flux. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100878	2021-04-21 14:50:29 -07:00
Craig Topper	2f23110bec	[RISCV] Temporary in vmsge(u).vx pseudo instructions can't be V0. This was checked in some asserts, but not enforced by the instruction matching. There's still a second bug that we don't check that vt and vd are different registers, but that will require custom checking. Differential Revision: https://reviews.llvm.org/D100928	2021-04-21 14:50:29 -07:00
Petr Hosek	92218c28cd	[MC] Use COMDAT for LSDA only if IR comdat type is any This fixes an issue introduced in 16af97393346ad636298605930a8b503a55eb40a and 796feb61637c407aefcc0d462f24a1cc41f350d8. Differential Revision: https://reviews.llvm.org/D100909	2021-04-21 14:41:39 -07:00
Olle Fredriksson	16a7ccaf6b	[MemCpyOpt] Allow variable lengths in memcpy optimizer This makes the memcpy-memcpy and memcpy-memset optimizations work for variable sizes as long as they are equal, relaxing the old restriction that they are constant integers. If they're not equal, the old requirement that they are constant integers with certain size restrictions is used. The implementation works by pushing the length tests further down in the code, which reveals some places where it's enough that the lengths are equal (but not necessarily constant). Differential Revision: https://reviews.llvm.org/D100870	2021-04-21 23:23:38 +02:00
Arthur Eubanks	8a2dd1008a	[Evaluator] Bitcast result of pointer stripping Trying to evaluate a GEP would assert with "Ty == cast<PointerType>(C->getType()->getScalarType())->getElementType()" because the type of the pointer we would evaluate the GEP argument to would be a different type than the GEP was expecting. We should treat pointer stripping as a bitcast. The test adds a redundant GEP that would crash due to type mismatch. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D100970	2021-04-21 13:32:29 -07:00
Alexey Bataev	7d8daf5c45	[SLP]Add a test with broadcast shuffle kind in SLP, NFC.	2021-04-21 13:16:31 -07:00
Arthur Eubanks	12af0f159d	[LLParser] Print mismatched types in error message Helps with debugging invalid handcrafted IR. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D100990	2021-04-21 13:10:37 -07:00
Dávid Bolvanský	655dee38a7	[LoopIdiom] Added testcase from PR44378; NFC	2021-04-21 22:00:32 +02:00
Nikita Popov	7d32930751	Revert "[InstCombine] Fold multiuse shr eq zero" This reverts commit 9423f78240a216e3f38b394a41fe3427dee22c26. A performance regression with this patch has been reported at https://reviews.llvm.org/rG9423f78240a2#990953. Reverting for now.	2021-04-21 21:40:52 +02:00
sstefan1	c15c1eb8f9	[FuncAttrs] Don't infer willreturn for nonexact definitions Discovered during attributor testing comparing stats with and without the attributor. Willreturn should not be inferred for nonexact definitions. Differential Revision: https://reviews.llvm.org/D100988	2021-04-21 21:26:09 +02:00
sstefan1	b6d664ee22	[SimplifyLibCalls] Don't change alignment when creating memset Fix for PR49984 This was discovered during Attributor testing. Memset was always created with alignment of 1 and in case when strncpy alignment was changed it triggered an assertion in the AttrBuilder. Memset will now be created with appropriate alignment. Differential Revision: https://reviews.llvm.org/D100875	2021-04-21 20:34:13 +02:00
Sanjay Patel	b3bc645e79	[InstSimplify] generalize ctlz-of-shifted-constant https://alive2.llvm.org/ce/z/zWL_VQ	2021-04-21 14:23:55 -04:00
Sanjay Patel	6d9b6618de	[InstSimplify] add tests for ctlz-of-shift-constant; NFC	2021-04-21 14:23:55 -04:00
Simon Pilgrim	411f75a025	[X86][SSE] getFauxShuffleMask - don't decode OR(SHUFFLE,SHUFFLE) containing UNDEFs. (PR50049) PR50049 demonstrated an infinite loop between OR(SHUFFLE,SHUFFLE) <-> BLEND(SHUFFLE,SHUFFLE) patterns. The UNDEF elements were allowing a combined shuffle mask to be widened which lost the undef element, resulting in us needing to use the BLEND pattern (as the undef element would need to be zero for the OR pattern). But then bitcast folds would re-expose the undef element allowing us to use OR again.	2021-04-21 18:47:00 +01:00
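
The loop load PRE transform described in commit 9fbf6639f4 above can be illustrated with a small simulation. This is only a sketch in Python, not LLVM code. The names `run_original` and `run_with_pre`, the `memory` dict, and the value written by the stand-in for `foo()` are all hypothetical, and the sketch assumes the original loop loads `%p` once per iteration in the `merge` block.

```python
def run_original(clobber_taken, memory):
    """Original loop: load %p in the merge block on every iteration."""
    loads = 0
    seen = []
    for i, taken in enumerate(clobber_taken):
        if taken:
            memory["p"] = 100 + i  # stand-in for `call foo()`, which clobbers %p
        x = memory["p"]            # load in the merge block
        loads += 1
        seen.append(x)
    return seen, loads


def run_with_pre(clobber_taken, memory):
    """Transformed loop: load in the preheader, reload only after a clobber."""
    loads = 0
    seen = []
    x_pre = memory["p"]            # %x0 = load %p (preheader)
    loads += 1
    for i, taken in enumerate(clobber_taken):
        if taken:
            memory["p"] = 100 + i  # stand-in for `call foo()`
            x = memory["p"]        # %x1 = load %p (clobber block)
            loads += 1
        else:
            x = x_pre              # value carried by the phi in the loop header
        seen.append(x)             # x2 = phi(x.pre, x1)
        x_pre = x                  # back edge feeds x2 into %x.pre
    return seen, loads


# Hot loop with a single cold clobber on the last iteration.
cold_path = [False] * 9 + [True]
orig_seen, orig_loads = run_original(cold_path, {"p": 7})
pre_seen, pre_loads = run_with_pre(cold_path, {"p": 7})

# Worst case from the commit message: the clobber block is always taken.
always = [True] * 3
worst_orig = run_original(always, {"p": 7})
worst_pre = run_with_pre(always, {"p": 7})
```

Both versions observe the same values. In this model the transformed loop performs one load plus one per clobber taken, which matches the commit message's claim of one extra load in the worst case and a neutral or profitable result otherwise.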

214714 Commits
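
The `step_vector` combine listed in commit 684cf0c2d5, `(sub X, step_vector(C)) -> (add X, step_vector(-C))`, relies on wrapping integer arithmetic. A minimal Python sketch of why the two forms agree, assuming `step_vector(C)` denotes the lanes `<0, C, 2C, ...>` and modelling 8-bit lanes. The helper names `step_vector`, `vsub` and `vadd` are hypothetical and are not LLVM APIs.

```python
BITS = 8                 # assumed lane width for this illustration
MASK = (1 << BITS) - 1


def step_vector(step, lanes):
    """Lanes <0, step, 2*step, ...>, wrapped to BITS bits."""
    return [(i * step) & MASK for i in range(lanes)]


def vsub(a, b):
    """Lane-wise wrapping subtraction."""
    return [(x - y) & MASK for x, y in zip(a, b)]


def vadd(a, b):
    """Lane-wise wrapping addition."""
    return [(x + y) & MASK for x, y in zip(a, b)]


x = [5, 0, 200, 17]
c = 3
before = vsub(x, step_vector(c, len(x)))   # (sub X, step_vector(C))
after = vadd(x, step_vector(-c, len(x)))   # (add X, step_vector(-C))
```

Here `before` and `after` are equal lane by lane, because subtracting `i * C` and adding `i * (-C)` are the same operation modulo 2^BITS.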