llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Sanjay Patel	af9b68aef4	[ConstantFolding] fold integer min/max intrinsics If both operands are undef, return undef. If one operand is undef, clamp to limit constant.	2020-07-29 11:01:13 -04:00
Sanjay Patel	b62b20180b	[ConstantFolding] add tests for integer min/max intrinsics; NFC	2020-07-29 11:01:13 -04:00
Simon Pilgrim	e0d56f3aef	[CostModel][X86] Add SSE costs for SMAX/SMIN/UMAX/UMIN intrinsics	2020-07-29 15:55:43 +01:00
Juneyoung Lee	3c2cfe5a9c	[InstSimplify] add tests for expandCommutativeBinOp; NFC	2020-07-29 23:21:39 +09:00
Florian Hahn	63c884fbb4	[SCEVExpander] Add option to preserve LCSSA directly. This patch teaches SCEVExpander to directly preserve LCSSA. As it is currently, SCEV does not look through PHI nodes in loops, as it might break LCSSA form. Once SCEVExpander can preserve LCSSA form, it should be safe for SCEV to look through PHIs. To preserve LCSSA form, this patch uses formLCSSAForInstructions on operands of newly created instructions, if the definition is inside a different loop than the new instruction. The final value we return from expandCodeFor may also need LCSSA phis, depending on the insert point. As no user for it exists there yet, create a temporary instruction at the insert point, which can be passed to formLCSSAForInstructions. This temporary instruction is removed after LCSSA construction. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D71538	2020-07-29 15:07:37 +01:00
Sanjay Patel	8d20a499b4	[ConstantFolding] update test checks FP min/max intrinsics There's a slight difference in functionality with the new CHECK lines: before, we allowed either -0.0 or 0.0 for maxnum/minnum. That matches the definition, but we should always get a deterministic result from constant folding within the compiler, so now we assert that we got the single expected result in all cases.	2020-07-29 09:43:33 -04:00
Simon Pilgrim	99ab22ce47	[CostModel][X86] Add SSE costs for ABS intrinsics	2020-07-29 14:33:59 +01:00
Victor Campos	9c2a0c2f38	[Driver][ARM] Disable unsupported features when nofp arch extension is used A list of target features is disabled when there is no hardware floating-point support. This is the case when one of the following options is passed to clang: - -mfloat-abi=soft - -mfpu=none This option list is missing, however, the extension "+nofp" that can be specified in -march flags, such as "-march=armv8-a+nofp". This patch also disables unsupported target features when nofp is passed to -march. Differential Revision: https://reviews.llvm.org/D82948	2020-07-29 14:13:22 +01:00
Simon Pilgrim	44836ba0cf	[TTI] Move abs/smax/smin/umax/umin cost expansion to ICA getIntrinsicInstrCost variant This will simplify target overrides, and matches what we do for most integer intrinsic costs.	2020-07-29 13:44:38 +01:00
David Green	acc205118e	[ARM] Tune getCastInstrCost for extending masked loads and truncating masked stores This patch uses the feature added in D79162 to fix the cost of a sext/zext of a masked load, or a trunc for a masked store. Previously, those were considered cheap or even free, but it's not the case as we cannot split the load in the same way we would for normal loads. This updates the costs to better reflect reality, and adds a test for it in test/Analysis/CostModel/ARM/cast.ll. It also adds a vectorizer test that showcases the improvement: in some cases, the vectorizer will now choose a smaller VF when tail-predication is enabled, which results in better codegen. (Because if it were to use a higher VF in those cases, the code we see above would be generated, and the vmovs would block tail-predication later in the process, resulting in very poor codegen overall) Original Patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79163	2020-07-29 13:41:34 +01:00
David Green	49873f2449	[Analysis] TTI: Add CastContextHint for getCastInstrCost Currently, getCastInstrCost has limited information about the cast it's rating, often just the opcode and types. Sometimes there is a context instruction as well, but it isn't trustworthy: for instance, when the vectorizer is rating a plan, it calls getCastInstrCost with the old instructions when, in fact, it's trying to evaluate the cost of the instruction post-vectorization. Thus, the current system can get the cost of certain casts incorrect as the correct cost can vary greatly based on the context in which it's used. For example, if the vectorizer queries getCastInstrCost to evaluate the cost of a sext(load) with tail predication enabled, getCastInstrCost will think it's free most of the time, but it's not always free. On ARM MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar situations can come up with how masked loads can be extended when being split. To fix that, this patch adds a new parameter to getCastInstrCost to give it a hint about the context of the cast. It adds a CastContextHint enum which contains the type of the load/store being created by the vectorizer - one for each of the types it can produce. Original patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79162	2020-07-29 13:32:53 +01:00
David Sherwood	db7d870d70	[SVE][CodeGen] Add simple integer add tests for SVE tuple types I have added tests to: CodeGen/AArch64/sve-intrinsics-int-arith.ll for doing simple integer add operations on tuple types. Since these tests introduced new warnings due to incorrect use of getVectorNumElements() I have also fixed up these warnings in the same patch. These fixes are: 1. In narrowExtractedVectorBinOp I have changed the code to bail out early for scalable vector types, since we've not yet hit a case that proves the optimisations are profitable for scalable vectors. 2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced calls to getVectorNumElements with getVectorMinNumElements in cases that work with scalable vectors. For the other cases I have added asserts that the vector is not scalable because we should not be using shuffle vectors and build vectors in such cases. Differential revision: https://reviews.llvm.org/D84016	2020-07-29 13:32:10 +01:00
Sjoerd Meijer	8b22bd7d24	[ARM] Optimize immediate selection Optimize some specific immediates selection by materializing them with sub/mvn instructions as opposed to loading them from the constant pool. Patch by Ben Shi, powerman1st@163.com. Differential Revision: https://reviews.llvm.org/D83745	2020-07-29 13:29:17 +01:00
Matt Arsenault	e11e91bb11	AMDGPU/GlobalISel: Refactor special argument management	2020-07-29 08:27:31 -04:00
Matt Arsenault	72b7484d1d	AMDGPU: Make saturating add/sub legal for DAG path	2020-07-29 08:27:31 -04:00
Matt Arsenault	2a62cae903	AMDGPU/GlobalISel: Select llvm.amdgcn.global.atomic.csub Remove the custom node boilerplate. Not sure why this tried to handle the LDS atomic stuff.	2020-07-29 08:27:31 -04:00
David Sherwood	6c7452123a	[SVE] Add checks for no warnings in CodeGen/AArch64/sve-sext-zext.ll Previous patches fixed up all the warnings in this test: llvm/test/CodeGen/AArch64/sve-sext-zext.ll and this change simply checks that no new warnings are added in future. Differential revision: https://reviews.llvm.org/D83205	2020-07-29 13:06:39 +01:00
David Sherwood	3a4cd9337d	[CodeGen] Remove calls to getVectorNumElements in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR In DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR I have replaced calls to getVectorNumElements with getVectorMinNumElements, since this code path works for both fixed and scalable vector types. For scalable vectors the index will be multiplied by VSCALE. Fixes warnings in this test: sve-sext-zext.ll Differential revision: https://reviews.llvm.org/D83198	2020-07-29 13:05:39 +01:00
Florian Hahn	d2a571702a	[NewGVN] Require asserts for crashing tests. Without asserts, it might take a long time for the tests to crash. Only run them with assert builds.	2020-07-29 12:41:05 +01:00
Yevgeny Rouban	a2a9ffd73d	[LoopSimplifyCFG] Delete landing pads in dead exit blocks In addition to removing phi nodes this patch removes any landing pad that the dead exit block might have. Without this fix Verifier complains that a new switch instruction jumps to a block with a landing pad. Differential Revision: https://reviews.llvm.org/D84320	2020-07-29 18:36:51 +07:00
Pushpinder Singh	62153b67c4	[CMAKE] Fix 'clean' target not working cmake was still considering the empty value of ${fake_version_inc} even if it was not defined. Reviewed By: vsapsai Differential Revision: https://reviews.llvm.org/D82847	2020-07-29 07:34:24 -04:00
Simon Pilgrim	58535386bb	[TTI] Add default cost expansion for abs/smax/smin/umax/umin intrinsics	2020-07-29 12:13:06 +01:00
Georgii Rymar	21d06fcc5e	[llvm-readobj] - Move out the common code from printRelocations() methods. This introduces the printRelocationsHelper() which now contains the common code used by both GNU and LLVM output styles. Differential revision: https://reviews.llvm.org/D83935	2020-07-29 13:52:02 +03:00
Simon Pilgrim	21418c85a1	[X86][SSE] getV4X86ShuffleImm8 - canonicalize broadcast masks If the mask input to getV4X86ShuffleImm8 only refers to a single source element (+ undefs) then canonicalize to a full broadcast. getV4X86ShuffleImm8 defaults to inline values for undefs, which can be useful for shuffle widening/narrowing but does leave SimplifyDemanded* calls thinking the shuffle depends on unnecessary elements. I'm still investigating what we should do more generally to avoid these undemanded elements, but broadcast cases was a simpler win.	2020-07-29 11:32:44 +01:00
Xing GUO	2028670e5f	[DWARFYAML][test] Make the check lines stricter. NFC. This patch makes the check lines stricter.	2020-07-29 17:31:38 +08:00
Xing GUO	8f65f802cb	[DWARFYAML] Replace uint_t with yaml::Hex in the 'debug_aranges' entry. Normally, we use yaml::Hex* to describe the length, offsets, address/segment size. NFC.	2020-07-29 16:43:21 +08:00
Juneyoung Lee	73676829de	[InstCombine] Add tests for select(freeze(undef)); NFC	2020-07-29 15:27:09 +09:00
Azharuddin Mohammed	986289ad08	[ThinLTO] [test] cache.ll: Prevent Spotlight indexing of the output dir The test output files whose atime is altered in the test were getting accessed by Spotlight indexing on macOS, causing them to get an updated atime and leading to the test not behaving as expected. Reviewed By: jhenderson, steven_wu Differential Revision: https://reviews.llvm.org/D84700	2020-07-28 21:21:58 -07:00
Ikhlas Ajbar	bba33c567e	[Hexagon] Correct the order of operands when lowering funnel shift-left This patch corrects the order of operands in the pattern that lowers fshl in Hexagon.	2020-07-28 21:22:41 -05:00
Chuanqi Xu	6f59cc588e	[NFC] Edit the comment in User::replaceUsesOfWith	2020-07-29 10:02:04 +08:00
Xing GUO	b2f650ab28	[llvm-readelf][test] Improve wording in the comments. NFC. This patch addresses comments in D84640 (https://reviews.llvm.org/D84640#2178475).	2020-07-29 09:59:28 +08:00
Stefanos Baziotis	9255f8a1f0	[ADT][BitVector][NFC] Merge find_first_in() / find_first_unset_in() We can implement find_first_unset_in() in the same function if every BitWord we use is first flipped. Differential Revision: https://reviews.llvm.org/D84717	2020-07-29 04:51:22 +03:00
Kang Zhang	6613a2ce88	[PowerPC] Add Def CR1 for MTFSFI_rec and MTFSF_rec	2020-07-29 01:47:23 +00:00
Matt Arsenault	f7d5ce97f6	AMDGPU: Optimize copies to exec with other insts after exec def It's possible to have terminator instructions after a write to exec, so skip over them to find it.	2020-07-28 21:34:50 -04:00
Matt Arsenault	e7d2d74837	AMDGPU/GlobalISel: Fix selecting llvm.amdgcn.s.getreg This introduces the same bug llvm.amdgcn.s.setreg has where if the user specified an immediate outside of the valid 16-bit range, it will select into a verifier error.	2020-07-28 21:34:50 -04:00
Thomas Lively	e2b0ae5192	[WebAssembly] Remove intrinsics for SIMD widening ops Instead, pattern match extends of extract_subvectors to generate widening operations. Since extract_subvector is not a legal node, this is implemented via a custom combine that recognizes extract_subvector nodes before they are legalized. The combine produces custom ISD nodes that are later pattern matched directly, just like the intrinsic was. Also removes the clang builtins for these operations since the instructions can now be generated from portable code sequences. Differential Revision: https://reviews.llvm.org/D84556	2020-07-28 18:25:55 -07:00
Craig Topper	c95d9699fd	[X86] Add FeatureCMPXCHG8B and FeatureSlowUAMem16 to 'lakemont' in X86.td We already had CMPXCHG8B feature on this CPU for the frontend so this doesn't have much effect. The FeatureSlowUAMem16 only matters if someone compiles with -march=lakemont -msse which doesn't make sense, but is consistent with all our pre-sse4.2 CPUs. Maybe the feature flag should be FeatureFastUAMem16 and set on the newer CPUs instead.	2020-07-28 18:24:46 -07:00
Matt Arsenault	36ab9584ef	AMDGPU: Don't assert in canInsertSelect Currently GlobalISel doesn't force all VGPR phi operands to VGPRs, so this hit a case where it was queried with a VGPR and SGPR. This could arguably be a verifier error, but it's currently not.	2020-07-28 21:01:06 -04:00
Valentin Clement	a18d313f65	[openmp][openacc][NFC] Add wrapper for records in DirectiveEmitter Add wrapper classes to access record's fields. This makes it easier to pass record information to the diverse functions for code generation. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D84612	2020-07-28 20:47:40 -04:00
Thomas Lively	adbadea361	[WebAssembly] Implement truncating vector stores Rather than expanding truncating stores so that vectors are stored one lane at a time, lower them to a sequence of instructions using narrowing operations instead, when possible. Since the narrowing operations have saturating semantics, but truncating stores require truncation, mask the stored value to manually truncate it before narrowing. Also, since narrowing is a binary operation, pass in the original vector as the unused second argument. Differential Revision: https://reviews.llvm.org/D84377	2020-07-28 17:46:45 -07:00
Matt Arsenault	3c21944921	AMDGPU: Don't assume call targets are registers GlobalISel let through a call to null, which would then fold into the source operand like any other inline immediate. The SelectionDAG lowering deletes calls to null and undef as a workaround from before calls were supported. We should probably drop the special handling case in the DAG lowering now, since the middle end optimizers delete null calls anyway.	2020-07-28 20:46:06 -04:00
Matt Arsenault	f9805ce38d	AMDGPU: Handle a few missing cases in getAddrModeArguments	2020-07-28 20:22:38 -04:00
Matt Arsenault	9df19f7114	AMDGPU: Don't assume there is only one terminator copy This would stop on the first in reverse order, failing the verifier if there were more earlier in the block.	2020-07-28 20:22:38 -04:00
Matt Arsenault	a1d4d5badf	AMDGPU: Fix verifier error on spilling partially defined SGPRs This needs an implicit def of the super-register in case one of the lanes isn't defined, similar to copyPhysReg (or the not-VGPR spill case below). This showed up in GlobalISel testing since it currently doesn't fold out many undef instructions.	2020-07-28 20:01:57 -04:00
Matt Arsenault	aaf6327167	AMDGPU: Serialize MFI spill fields These should probably be inferred from the function on parse, but the target specific infrastructure currently does not give you a way to do this. SILowerSGPRSpills early exits without this reporting spills, which makes it difficult to write a MIR test for.	2020-07-28 20:01:57 -04:00
Joel E. Denny	12c5f043b8	[FileCheck] Report captured variables Report captured variables in input dumps and traces. For example: ``` $ cat check CHECK: hello [[WHAT:[a-z]+]] CHECK: goodbye [[WHAT]] $ FileCheck -dump-input=always -vv check < input |& tail -8 <<<<<< 1: hello world check:1'0 ^~~~~~~~~~~ check:1'1 ^~~~~ captured var "WHAT" 2: goodbye world check:2'0 ^~~~~~~~~~~~~ check:2'1 with "WHAT" equal to "world" >>>>>> $ FileCheck -dump-input=never -vv check < input check2:1:8: remark: CHECK: expected string found in input CHECK: hello [[WHAT:[a-z]+]] ^ <stdin>:1:1: note: found here hello world ^~~~~~~~~~~ <stdin>:1:7: note: captured var "WHAT" hello world ^~~~~ check2:2:8: remark: CHECK: expected string found in input CHECK: goodbye [[WHAT]] ^ <stdin>:2:1: note: found here goodbye world ^~~~~~~~~~~~~ <stdin>:2:1: note: with "WHAT" equal to "world" goodbye world ^ ``` Reviewed By: thopre Differential Revision: https://reviews.llvm.org/D83651	2020-07-28 19:15:18 -04:00
Joel E. Denny	7663136ac6	[FileCheck] Extend -dump-input with substitutions Substitutions are already reported in the diagnostics appearing before the input dump in the case of failed directives, and they're reported in traces (produced by `-vv -dump-input=never`) in the case of successful directives. However, those reports are not always convenient to view while investigating the input dump, so this patch adds the substitution report to the input dump too. For example: ``` $ cat check CHECK: hello [[WHAT:[a-z]+]] CHECK: [[VERB]] [[WHAT]] $ FileCheck -vv -DVERB=goodbye check < input |& tail -8 <<<<<< 1: hello world check:1 ^~~~~~~~~~~ 2: goodbye word check:2'0 X~~~~~~~~~~~ error: no match found check:2'1 with "VERB" equal to "goodbye" check:2'2 with "WHAT" equal to "world" >>>>>> ``` Without this patch, the location reported for a substitution for a directive match is the directive's full match range. This location is misleading as it implies the substitution itself matches that range. This patch changes the reported location to just the match range start to suggest the substitution is known at the start of the match. (As in the above example, input dumps don't mark any range for substitutions. The location info in that case simply identifies the right line for the annotation.) Reviewed By: mehdi_amini, thopre Differential Revision: https://reviews.llvm.org/D83650	2020-07-28 19:15:18 -04:00
Zahira Ammarguellat	916f7080e7	On Windows build, making the /bigobj flag global, instead of passing it per file. To avoid having this flag be passed in a per-file manner, we are instead passing it globally. This fixes this bug: https://bugs.llvm.org/show_bug.cgi?id=46733 Reviewed-by: aaron.ballman, beanz, meinersbur Differential Revision: https://reviews.llvm.org/D84038	2020-07-28 18:04:36 -05:00
Alina Sbirlea	fe169bb30a	[DominatorTree] Simplify ChildrenGetter. Summary: Simplify ChildrenGetter to a simple wrapper around a GraphDiff call. GraphDiff already handles nullptr in children, so the special casing in clang can also be removed. Reviewers: kuhar, dblaikie Subscribers: llvm-commits, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D84713	2020-07-28 15:44:20 -07:00
Johannes Doerfert	2e0011cdf0	[SROA][Mem2Reg] Use efficient droppable use API (after D83976) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84804	2020-07-28 17:41:01 -05:00

201071 Commits
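
Commit 9255f8a1f0 in the listing above says that find_first_unset_in() can share an implementation with find_first_in() if every BitWord is flipped before it is examined. A minimal sketch of that idea follows. It is not LLVM's BitVector code: the function name `findFirstIn`, the helper `countTrailingZeros64`, the `std::vector<uint64_t>` storage, and the `set` flag are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of trailing zero bits in a non-zero word.
static unsigned countTrailingZeros64(uint64_t w) {
  unsigned n = 0;
  while ((w & 1) == 0) {
    w >>= 1;
    ++n;
  }
  return n;
}

// Hypothetical sketch (not LLVM's API): return the index of the first bit in
// [begin, end) whose value equals `set`, or -1 if there is none.
// Searching for unset bits reuses the same loop by flipping each word first.
int findFirstIn(const std::vector<uint64_t> &words, unsigned begin,
                unsigned end, bool set) {
  constexpr unsigned kBits = 64;
  assert(end <= words.size() * kBits && "range exceeds storage");
  if (begin >= end)
    return -1;

  const unsigned firstWord = begin / kBits;
  const unsigned lastWord = (end - 1) / kBits;
  for (unsigned i = firstWord; i <= lastWord; ++i) {
    uint64_t w = words[i];
    if (!set)
      w = ~w; // unset bits now look like set bits

    if (i == firstWord) // drop bits below `begin`
      w &= ~uint64_t(0) << (begin % kBits);
    if (i == lastWord) // drop bits at or above `end`
      w &= ~uint64_t(0) >> (kBits - 1 - (end - 1) % kBits);

    if (w != 0)
      return static_cast<int>(i * kBits + countTrailingZeros64(w));
  }
  return -1;
}
```

Because the range masks are applied after the flip, bits outside [begin, end) are never reported, whichever polarity is being searched.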