llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00

Author	SHA1	Message	Date
dfukalov	9779f666df	[TTI] NFC: Use InstructionCost to store ScalarizationCost in IntrinsicCostAttributes. This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D101151	2021-04-23 18:02:00 +03:00
Daniil Fukalov	3d4fc1f0f6	[TTI] Fix ScalarizationCost initialization. In cases when ScalarizationCostPassed has no value, UINT_MAX is actually used for cost estimation in `return ScalarCalls * ScalarCost + ScalarizationCost`. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D101099	2021-04-23 17:59:59 +03:00
Joe Ellis	0afb9c183a	[AArch64][SVE] Fix bug in lowering of fixed-length integer vector divides The function AArch64TargetLowering::LowerFixedLengthVectorIntDivideToSVE previously assumed the operands were full vectors, but this is not always true. This function would produce bogus code if the division operands are not full vectors, resulting in miscompiles when dividing 8-bit or 16-bit vectors. The fix is to perform an extend + div + truncate for non-full vectors, instead of the usual unpacking and unzipping logic. This is an additive change which reduces the non-full integer vector divisions to a pattern recognised by the existing lowering logic. For future reference, an example of code that would miscompile before this patch is below: int8_t foo(unsigned N, int8_t *a, int8_t *b, int8_t *c) { int8_t result = 0; for (int i = 0; i < N; ++i) { result += (a[i] / b[i]) / c[i]; } return result; } Differential Revision: https://reviews.llvm.org/D100370	2021-04-23 14:55:10 +00:00
Jay Foad	982c01a843	[AMDGPU] Fix typo in implicit operand lists Several tests had a typo where they mentioned sgpr17 twice instead of sgpr17 and sgpr27. This had a significant effect on the "scavenge_sgpr_pei_no_sgprs" tests because there was actually an sgpr available, namely sgpr27. Differential Revision: https://reviews.llvm.org/D100960	2021-04-23 15:44:17 +01:00
Sebastian Neubauer	8b5cb86ad7	Revert "[AMDGPU] Save WWM registers in functions" This reverts commit 91464c30bfcf731ccb7f9d6ef6d26e8c1657a6e6. Seems to break tests on windows.	2021-04-23 16:38:50 +02:00
Piotr Sobczak	099aac7b88	[AMDGPU][NFC] Update auto-gen test Most likely the "glc" was not added to the test when the volatile loads started generating those bits.	2021-04-23 16:33:16 +02:00
Krzysztof Parzyszek	614a9f9f3d	[Hexagon] Remove redundant HVX intrinsic selection patterns, NFC Deleted HexagonMapAsm2IntrinV65.gen.td that wasn't included anywhere, moved V6_vrmpy_rtt patterns to HexagonIntrinsics.td. Touch CMakeLists.txt to force re-cmake (somehow the unused file was listed as a dependency in the generated makefiles).	2021-04-23 09:28:08 -05:00
Sebastian Neubauer	ec74f9a23a	[AMDGPU] Save WWM registers in functions The values of registers in inactive lanes need to be saved during function calls. Save all registers used for whole wave mode, similar to how it is done for VGPRs that are used for SGPR spilling. Differential Revision: https://reviews.llvm.org/D99429	2021-04-23 16:09:31 +02:00
Paul C. Anagnostopoulos	14426551a0	[TableGen] Correct some comments in the TableGen parser [NFC] Differential Revision: https://reviews.llvm.org/D101088	2021-04-23 09:53:31 -04:00
Simon Pilgrim	a7d1c869bb	[X86] Add Win32/64 mulo test coverage Part of an investigation to solve the windows regressions caused by rG13ec913bdf50	2021-04-23 14:51:42 +01:00
Paul C. Anagnostopoulos	3d33c078f3	[TableGen] [docs] Improve description of NAME in Programmer's Reference Also use "parent class" consistently and add a note about the term. Differential Revision: https://reviews.llvm.org/D100867	2021-04-23 09:49:17 -04:00
Dávid Bolvanský	0e439477d4	[InstCombine] Added tests for PR50096; NFC	2021-04-23 15:25:44 +02:00
Simon Pilgrim	20e240d4eb	[X86] combineSetCCAtomicArith - pull out repeated ops. NFCI. Reduces diff in D101074	2021-04-23 14:19:24 +01:00
Matt Arsenault	182934f750	AMDGPU: Fix assert on inline asm on gfx90a This was assuming all mayLoad instructions have one def.	2021-04-23 09:00:25 -04:00
Timm Bäder	438fece2aa	[llvm][NFC] Fix assert indentation This triggers GCC's misleading-indentation checker.	2021-04-23 14:44:05 +02:00
Dávid Bolvanský	816c12d30c	[InstCombine] Fixed newly added tests; NFC	2021-04-23 14:42:37 +02:00
Dawid Jurczak	83615768f2	[InstCombine][NFC] add tests for printf("%s", str) --> puts(str)/noop transformation. Split off from D100724. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D101149	2021-04-23 14:27:16 +02:00
Dávid Bolvanský	30eb4998c0	[InstCombine] Fixed crash when setting align attr for memalign	2021-04-23 14:04:08 +02:00
Fraser Cormack	23b998863a	[RISCV] Custom lower vector F(MIN|MAX)NUM to vf(min|max) This patch adds support for both scalable- and fixed-length vector code lowering of the llvm.minnum and llvm.maxnum intrinsics to the equivalent RVV instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101035	2021-04-23 12:22:15 +01:00
Thomas Preud'homme	71007d77ca	[doc] Clarify constrained fcmps behavior Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D101053	2021-04-23 11:55:20 +01:00
Florian Hahn	baa2054364	Recommit "[NewGVN] Track simplification dependencies for phi-of-ops." This recommits 4f5da356ff35a218f23f0b0c4d08aee90da7de6e, including explicit implementations of a move constructor and deleted copy constructors/assignment operators, to fix failures with some compilers. This reverts the revert 74854d00e854196445727a49df58fe5768d9ed5b.	2021-04-23 11:27:43 +01:00
Stephen Tozer	8b8275b1fc	Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" Previous build failures were caused by an error in bitcode reading and writing for DIArgList metadata, which has been fixed in e5d844b587. There were also some unnecessary asserts that were being triggered on certain builds, which have been removed. This reverts commit dad5caa59e6b2bde8d6cf5b64a972c393c526c82.	2021-04-23 10:54:01 +01:00
Wang, Pengfei	cf03157335	[X86][AMX][NFC] Make comparison operators to be complete The previous D101039 didn't fix the SmallSet insertion issue, because we always return false for the comparison between two different non-null BBs. This patch makes the comparison complete by comparing `MBB` first, so that we can always get an invariant order from a single operator.	2021-04-23 17:38:54 +08:00
LLVM GN Syncbot	a890a88aa7	[gn build] Port c623945d707c	2021-04-23 09:26:02 +00:00
Tim Northover	12f69b73cd	llvm-objdump: refactor SourcePrinter into separate file. NFC. Preparatory patch for MachO feature.	2021-04-23 10:21:52 +01:00
Florian Hahn	3a8b302d64	Revert "[NewGVN] Track simplification dependencies for phi-of-ops." This reverts commit 4f5da356ff35a218f23f0b0c4d08aee90da7de6e. This causes some buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/139/builds/3019	2021-04-23 09:56:17 +01:00
Florian Hahn	a088ff213f	[NewGVN] Track simplification dependencies for phi-of-ops. If we are using a simplified value, we need to add an extra dependency on this value, because changes to the class of the simplified value may require us to invalidate any decision based on that value. This is done by adding such values as additional users, however the current code does not exclude temporary instructions. At the moment, this means that we miss those dependencies for phi-of-ops, because they are temporary instructions at this point. We instead need to add the extra dependencies to the root instruction of the phi-of-ops. This patch pushes the responsibility of adding extra users to the callers of createExpression & performSymbolicEvaluation. At those points, it is clearer which real instruction to pick. Alternatively we could either pass the 'real' instruction as additional argument or use another map, but I think the approach in the patch makes things a bit easier to follow. Fixes PR35074. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D99987	2021-04-23 09:48:38 +01:00
Chen Zheng	474057f4c1	[Debug-Info] change return type to void for attribute adding functions. Make the following functions return void: addLabel(), addSectionLabel(), addSectionDelta(). This aligns with the other attribute-adding functions. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101022	2021-04-23 04:16:55 -04:00
Jay Foad	899f1c90ad	[GlobalISel] Remove ConstantFoldingMIRBuilder ConstantFoldingMIRBuilder was an experiment which is not used for anything. The constant folding functionality is now part of CSEMIRBuilder. Differential Revision: https://reviews.llvm.org/D101050	2021-04-23 09:13:27 +01:00
Daniel Kiss	30b326d46e	[AArch64] Fix for BTI landing pad insertion with PAC-RET+bkey. EMITBKEY is emitted for PAC-RET+bkey, which is not a machine instruction. PR: 49957 Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D100996	2021-04-23 10:07:25 +02:00
KAWASHIMA Takahiro	47186c3ead	[LoopReroll] Fix rerolling loop with extra instructions Fixes PR47627 This fix suppresses rerolling a loop which has an unrerollable instruction. Sample IR for the explanation below:
```
define void @foo([2 x i32]* nocapture %a) {
entry:
  br label %loop

loop:
  ; base instruction
  %indvar = phi i64 [ 0, %entry ], [ %indvar.next, %loop ]

  ; unrerollable instructions
  %stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %indvar, i64 0
  store i32 999, i32* %stptrx, align 4

  ; extra simple arithmetic operations, used by root instructions
  %plus20 = add nuw nsw i64 %indvar, 20
  %plus10 = add nuw nsw i64 %indvar, 10

  ; root instruction 0
  %ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
  %value0 = load i32, i32* %ldptr0, align 4
  %stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
  store i32 %value0, i32* %stptr0, align 4

  ; root instruction 1
  %ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
  %value1 = load i32, i32* %ldptr1, align 4
  %stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
  store i32 %value1, i32* %stptr1, align 4

  ; loop-increment and latch
  %indvar.next = add nuw nsw i64 %indvar, 1
  %exitcond = icmp eq i64 %indvar.next, 5
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}
```
In the loop rerolling pass, `%indvar` and `%indvar.next` are appended to the `LoopIncs` vector in the `LoopReroll::DAGRootTracker::findRoots` function. Before this fix, the two instructions under the `unrerollable instructions` comment above were marked as `IL_All` at the end of the `LoopReroll::DAGRootTracker::collectUsedInstructions` function, as were the instructions under the `extra simple arithmetic operations` and `loop-increment and latch` comments. This is incorrect because `IL_All` means that the instruction should be executed in all iterations of the rerolled loop, but the `store` instruction should not.
This fix rejects instructions which may have side effects and don't belong to def-use chains of any root instructions and reductions. See https://bugs.llvm.org/show_bug.cgi?id=47627 for more information.	2021-04-23 15:14:46 +09:00
Wang, Pengfei	83a6f34489	[X86][AMX][NFC] Avoid assert for the same immediate value The previous condition in the assert was overly strict. We ought to allow the same immediate value to be loaded more than once. The intention of the assert is to check whether the same AMX register uses multiple different immediate shapes. So this fix is supposed to be NFC. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D101124	2021-04-23 12:17:00 +08:00
Wang, Pengfei	d73d62d45f	[X86][AMX] Try to hoist AMX shapes' def We request no intersections between AMX instructions and their shapes' def when we insert ldtilecfg. However, this is not always true, both because users may not follow the AMX API model and because of optimizations. This patch adds a mechanism that tries to hoist AMX shapes' def as well. It only hoists shapes inside a BB; we can improve it for cases across BBs in the future. Currently, it only hoists shapes whose sources are all defined above the first AMX instruction. We can improve this for the case where the only source is a move of an immediate value to a register below the AMX instruction. Differential Revision: https://reviews.llvm.org/D101067	2021-04-23 12:17:00 +08:00
Wang, Pengfei	d7776e3283	[X86] Enable compilation of user interrupt handlers. Add __uintr_frame structure and use UIRET instruction for functions with x86 interrupt calling convention when UINTR is present. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D99708	2021-04-23 11:43:57 +08:00
Serguei Katkov	126d78bad9	[InlineSpiller] Clean-up isSpillCandBB This is mostly NFC except that for end of BB not previous slot is used. Idx is used to find a def of sibling live interval in that slot. The def on end of MBB and on previous slot of end MBB should be the same, so it should be NFC. Reviewers: reames, qcolombet, MatzeB, wmi, rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D100922	2021-04-23 10:16:02 +07:00
Nico Weber	412c04b610	[gn build] (manually) port 0b2bc69ba29	2021-04-22 22:40:53 -04:00
Matt Arsenault	d002a57ca7	AMDGPU: Restore atomic fp feature on FP atomic instruction definitions 9931b1f7a4785b6a17fb87b81a3546d61d0cbca1 switched this to checking for the two specific subtargets, instead of the dedicated feature. This broke supporting functions which force added the feature when emitting targets that do not actually support them. This still does not work for the targets that use the gfx6/7 or gfx10 encodings.	2021-04-22 21:32:01 -04:00
Fangrui Song	c83fe04e08	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arises in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	04656c7e3e	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Keith Smiley	82ce0102d8	llvm-objdump: add --rpaths to macho support This prints the rpaths for the given binary Reviewed By: kastiglione Differential Revision: https://reviews.llvm.org/D100681	2021-04-22 16:01:10 -07:00
Heejin Ahn	5217fbac0b	[WebAssembly] Fix fixEndsAtEndOfFunction for delegate Background: CFGStackify's [[ `398f253400/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (L1481-L1540)` | fixEndsAtEndOfFunction ]] fixes block/loop/try's return type when the end of function is unreachable and the function return type is not void. So if a function returns i32 and `block`-`end` wraps the whole function, i.e., the `block`'s `end` is the last instruction of the function, the `block`'s return type should be i32 too:
```
block i32
  ...
end
end_function
```
If there are consecutive `end`s, this signature has to be propagated to those blocks too, like:
```
block i32
  ...
  block i32
    ...
  end
end
end_function
```
This applies to `try`-`end` too:
```
try i32
  ...
catch
  ...
end
end_function
```
In case of `try`, we not only follow consecutive `end`s but also follow `catch`, because for the type of the whole `try` to be i32, both `try` and `catch` parts have to be i32:
```
try i32
  ...
  block i32
    ...
  end
catch
  ...
  block i32
    ...
  end
end
end_function
```
---
Previously we only handled consecutive `end`s or `end` before a `catch`. But now we have `delegate`, which serves like `end` for `try`-`delegate`. So we have to follow `delegate` too and mark its corresponding `try` as i32 (the function's return type):
```
try i32
  ...
catch
  ...
  try i32    ;; Here
    ...
  delegate N
end
end_function
```
Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D101036	2021-04-22 15:32:00 -07:00
Heejin Ahn	4405bf5794	[WebAssembly] Serialize params/results in MachineFunctionInfo This adds support for YAML serialization of `Params` and `Results` fields in `WebAssemblyMachineFunctionInfo`. Types are printed as `MVT`'s string representation. This makes writing MIR tests easier. The tests added are testing simple parsing and printing of `params` / `results` fields under `machineFunctionInfo`. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D101029	2021-04-22 15:31:09 -07:00
Heejin Ahn	37702c2638	[WebAssembly] Put utility functions in Utils directory (NFC) This CL 1. Creates Utils/ directory under lib/Target/WebAssembly 2. Moves existing WebAssemblyUtilities.cpp|h into the Utils/ directory 3. Creates Utils/WebAssemblyTypeUtilities.cpp|h and puts type declarations and type conversion functions scattered in various places into this single place. It has been suggested several times that it is not easy to share utility functions between subdirectories (AsmParser, Disassembler, MCTargetDesc, ...). Sometimes we ended up [[ https://reviews.llvm.org/D92840#2478863 | duplicating ]] the same function because of this. There are already other targets doing this: AArch64, AMDGPU, and ARM have a Utils/ subdirectory under their target directory. This extracts the utility functions into a single directory Utils/ and makes them sharable among all passes in WebAssembly/ and its subdirectories. Also I believe gathering all type-related conversion functionalities into a single place makes it more usable. (Actually I was working on another CL that uses various type conversion functions scattered in multiple places, which became the motivation for this CL.) Reviewed By: dschuff, aardappel Differential Revision: https://reviews.llvm.org/D100995	2021-04-22 15:29:43 -07:00
Craig Topper	7a82be4f0f	[RISCV] Fix crash with fptosi.sat/fptoui.sat intrinsics on RV64. Add test cases. Add PromoteIntOp_FP_TO_XINT_SAT to type legalize the bit width operand from i32 to i64 for RV64. Add test cases for the saturating intrinsics for half/float/double and i32/i64. CodeGen is definitely not optimal. We can probably make use of the native behavior of fcvt instructions in many cases. Fixes PR50083	2021-04-22 15:18:15 -07:00
Krzysztof Parzyszek	9949cdf248	[Hexagon] Improve lowering of returns of i1 Emit explicit any-extend to avoid weird tstbit sequences.	2021-04-22 16:47:52 -05:00
Elia Geretto	99885567cb	[dfsan] Fix Len argument type in call to __dfsan_mem_transfer_callback This patch is supposed to solve: https://bugs.llvm.org/show_bug.cgi?id=50075 The function `__dfsan_mem_transfer_callback` takes a `Len` argument of type `i64`; however, when processing a `MemTransferInst` such as `llvm.memcpy.p0i8.p0i8.i32`, the `len` argument has type `i32`. In order to make the type of `len` compatible with the one of the callback argument, this change zero-extends it when necessary. Reviewed By: stephan.yichao.zhao, gbalats Differential Revision: https://reviews.llvm.org/D101048	2021-04-22 21:12:20 +00:00
Nikita Popov	5320e959d4	[GVN] Generate LE and BE check lines (NFC) I accidentally dropped some check lines in my previous commit. Apparently update_test_checks no longer warns on label conflicts???	2021-04-22 22:44:08 +02:00
Nikita Popov	b67e56152d	[GVN] Regenerate test checks (NFC)	2021-04-22 22:38:41 +02:00
Krzysztof Parzyszek	b7fbfec6c7	[Hexagon] Use 'vnot' instead of 'not' in patterns with vectors 'not' expands to checking for an xor with a -1 constant. Since this looks for a ConstantSDNode it will never match for a vector. Co-authored-by: Craig Topper <craig.topper@sifive.com> Differential Revision: https://reviews.llvm.org/D100687	2021-04-22 15:36:20 -05:00
Arthur Eubanks	21048e7590	[GlobalOpt] Don't replace alias with aliasee if aliasee is interposable Both the alias and aliasee linkage are important. PR27866 provides some background. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99629	2021-04-22 13:12:34 -07:00
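
The failure mode described in the `[TTI] Fix ScalarizationCost initialization` entry above can be illustrated with a minimal C++ sketch. This is not the LLVM code: `buggyCost`, `fixedCost`, and `ComputedScalarizationCost` are hypothetical names, and `std::optional` is only an assumed stand-in for however the real code tracks whether a cost was passed. The sketch shows why letting a `UINT_MAX` sentinel reach the quoted expression yields a meaningless estimate, and one possible way to avoid it.

```cpp
#include <cassert>
#include <climits>
#include <optional>

// Hypothetical illustration only; names and structure are assumptions,
// not the actual TTI implementation.

// Problem: when no cost was passed, the UINT_MAX sentinel is used
// directly in the arithmetic, so the unsigned sum wraps around.
unsigned buggyCost(unsigned ScalarCalls, unsigned ScalarCost,
                   std::optional<unsigned> ScalarizationCostPassed) {
  unsigned ScalarizationCost = ScalarizationCostPassed.value_or(UINT_MAX);
  return ScalarCalls * ScalarCost + ScalarizationCost;
}

// One possible remedy: fall back to a separately computed cost
// instead of letting the sentinel take part in the estimate.
unsigned fixedCost(unsigned ScalarCalls, unsigned ScalarCost,
                   std::optional<unsigned> ScalarizationCostPassed,
                   unsigned ComputedScalarizationCost) {
  assert(ScalarCalls > 0 && "expected at least one scalar call");
  unsigned ScalarizationCost =
      ScalarizationCostPassed.value_or(ComputedScalarizationCost);
  return ScalarCalls * ScalarCost + ScalarizationCost;
}
```

With four scalar calls of cost 2 and no cost passed, `buggyCost` wraps around to 7, whereas `fixedCost` with a computed scalarization cost of 3 yields 11.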

214645 Commits