llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Matt Arsenault	5a867ce09b	AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load	2020-01-27 13:05:55 -05:00
Matt Arsenault	b11071aa6a	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load.format	2020-01-27 13:02:19 -05:00
Matt Arsenault	940fc835e4	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load Use intermediate instructions, unlike with buffer stores. This is necessary because of the need to have an internal way to distinguish between signed and unsigned extloads. This introduces some duplication and near duplication with the buffer store selection path. The store handling should maybe be moved into legalization to match and eliminate the duplication.	2020-01-27 12:49:23 -05:00
Matt Arsenault	42b291cc3e	AMDGPU/GlobalISel: Handle VOP3NoMods	2020-01-27 09:03:44 -08:00
Matt Arsenault	f22b9884b2	AMDGPU/GlobalISel: Add baseline tests for fma/fmad selection	2020-01-27 09:02:13 -08:00
Matt Arsenault	995d910de2	AMDGPU/GlobalISel: Minor refactor of MUBUF complex patterns This will make it easier to support the small variants in the complex patterns for atomics.	2020-01-27 09:00:00 -08:00
Matt Arsenault	14b1e59c2d	AMDGPU: Fix not using f16 fsin/fcos I noticed this because this accidentally started working for GlobalISel.	2020-01-27 08:59:59 -08:00
Jay Foad	8a902e6ea3	[AMDGPU] Simplify test and extend to gfx9 and gfx10 Summary: This is in preparation for adding more test cases for D69661 and other bug fixes in the same area. Reviewers: tpr, dstuttard, critson, nhaehnle, arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70708	2020-01-27 16:56:40 +00:00
Simon Pilgrim	e72c285895	[X86][AVX] Add a more aggressive SimplifyMultipleUseDemandedBits to simplify masked store masks. Fixes a poor codegen issue noticed in PR11210.	2020-01-27 16:44:25 +00:00
Matt Arsenault	6723ae0205	AMDGPU/GlobalISel: Custom legalize v2s16 G_SHUFFLE_VECTOR Try to keep simple v2s16 cases as-is. This will more naturally map to how the VOP3P op_sel modifiers work compared to the expansion involving bitcasts and bitshifts. This could maybe try harder with wider source vector types, although that could be handled with a pre-legalize combine.	2020-01-27 08:28:05 -08:00
Christian Sigg	4789f36750	Add pretty printers for llvm::PointerIntPair and llvm::PointerUnion. Reviewers: aprantl, dblaikie, jdoerfert, nicolasvasilache Reviewed By: dblaikie Subscribers: jpienaar, dexonsmith, merge_guards_bot, llvm-commits Tags: #llvm, #clang, #lldb, #openmp Differential Revision: https://reviews.llvm.org/D72557	2020-01-27 17:23:59 +01:00
Nico Weber	86bbb1153b	Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" This reverts commit 7a8b0b1595e7dc878b48cf9bbaa652087a6895db. It seems to break exception handling on 32-bit Windows, see https://crbug.com/1045650	2020-01-27 11:22:33 -05:00
Matt Arsenault	a76ad215d1	Revert "AMDGPU: Temporary drop s_mul_hi_i/u32 patterns" This reverts commit fe23ed2c681413e7baf517c79aee9be130579873. It was never really clear this was responsible for the performance regressions that caused this to be reverted. It's been a long time, and we need to have scalar patterns for this to get GlobalISel working.	2020-01-27 08:07:21 -08:00
Teresa Johnson	32209014dc	Restore "[LTO/WPD] Enable aggressive WPD under LTO option" This restores 59733525d37cf9ad88b5021b33ecdbaf2e18911c (D71913), along with bot fix 19c76989bb505c3117730c47df85fd3800ea2767. The bot failure should be fixed by D73418, committed as af954e441a5170a75687699d91d85e0692929d43. I also added a fix for non-x86 bot failures by requiring x86 in new test lld/test/ELF/lto/devirt_vcall_vis_public.ll.	2020-01-27 07:55:05 -08:00
Matt Arsenault	77c2d662f8	AMDGPU/GlobalISel: Fix not using global atomics on gfx9+ For some reason the flat/global atomics end up in the generated matcher table in a different order from SelectionDAG. Use AddedComplexity to prefer checking for global atomics first.	2020-01-27 07:42:42 -08:00
Whitney Tsang	533c97e3ea	[LoopUnroll] Remove remapInstruction(). Summary: LoopUnroll can reuse the RemapInstruction() in ValueMapper, or remapInstructionsInBlocks() in CloneFunction, depending on the needs. There is no need to have its own version in LoopUnroll. By calling RemapInstruction() without TypeMapper or Materializer and with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does the same as remapInstruction(). remapInstructionsInBlocks() calls RemapInstruction() exactly as described. Looking at the history, I cannot find any obvious reason to have its own version. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, foad, aprantl Reviewed By: jdoerfert Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen Tag: LLVM Differential Revision: https://reviews.llvm.org/D73277	2020-01-27 15:42:13 +00:00
James Henderson	564b3705b0	[test][llvm-dwarfdump] Add extra test case for invalid MD5 form A subsequent patch will change how an invalid file name table is handled to allow parsing to continue. This patch adds a test case that will demonstrate a difference in behaviour with that change between invalid file tables where the error is before the end of the stated prologue length and where the error occurs after the stated length. Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D72157	2020-01-27 15:33:34 +00:00
James Henderson	a050bdcc58	[DebugInfo] Make incorrect debug line extended opcode length non-fatal It is possible to try to keep parsing a debug line program even when the length of an extended opcode does not match what is expected for that opcode. This patch changes what was previously a fatal error to be non-fatal. The parser now continues by assuming the claimed length is correct, even if it means moving the offset backwards. Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D72155	2020-01-27 15:32:41 +00:00
Matt Arsenault	718bb37069	AMDGPU/GlobalISel: Select more MUBUF global addressing modes The handling of the high bits of the resource descriptor seems weird to me, where the 3rd dword changes based on the instruction.	2020-01-27 07:28:36 -08:00
Matt Arsenault	8cbdc76cb3	AMDGPU/GlobalISel: Initial selection of MUBUF addr64 load/store Fixes the main reason for compile failures on SI, but doesn't really try to use the addressing modes yet.	2020-01-27 07:13:56 -08:00
Simon Pilgrim	d7182de4af	[X86][AVX] Add test case from PR11210 Shows failure to remove sign bit comparison when the result has multiple uses	2020-01-27 15:08:21 +00:00
Dominik Montada	571ce4d769	Use pointer type size for offset constant when lowering load/stores	2020-01-27 06:55:32 -08:00
Matt Arsenault	c260c84c4b	AMDGPU: Allow i16 shader arguments Not allowing this just creates unnecessary complications when writing simple tests.	2020-01-27 06:55:32 -08:00
Jay Foad	c41f3f7738	[AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint Summary: This is in preparation for getMemOperandsWithOffset returning more base operands. Depends on D73455. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73456	2020-01-27 14:45:21 +00:00
Jay Foad	b5b898c823	[AMDGPU] Handle multiple base operands in shouldClusterMemOps Summary: This is in preparation for getMemOperandsWithOffset returning more base operands. Depends on D73454. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73455	2020-01-27 14:45:21 +00:00
Jay Foad	3d3c605a01	[AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr Summary: This is in preparation for getMemOperandsWithOffset returning more base operands. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73454	2020-01-27 14:45:21 +00:00
vpykhtin	5ddc78d032	[AMDGPU] Fix GCN regpressure trackers for INLINEASM instructions. Differential revision: https://reviews.llvm.org/D73338	2020-01-27 17:25:25 +03:00
Matt Arsenault	9b0ab53549	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
David Green	c55d376d9d	[MVE] Fixup order of gather writeback intrinsic outputs The MVE_VLDRWU32_qi_pre gather loads, like the other _pre/_post mve loads returns the writeback as result 0, the value as result 1. The llvm ir intrinsic seems to have this the other way around though, and so when lowering from one to the other we need to switch the first two outputs. I've also fixed up the types of _pre/_post on normal MVE loads. There we were already getting the values the right way around, just not for the types. I don't believe this was causing anything to go wrong, but it was very confusing to read in the debug output. Differential Revision: https://reviews.llvm.org/D73370	2020-01-27 14:08:06 +00:00
Matt Arsenault	c466090669	GlobalISel: Translate vector GEPs	2020-01-27 05:35:05 -08:00
Russell Gallop	6a01dd4b41	Re-land [Support] Extend TimeProfiler to support multiple threads This makes TimeTraceProfilerInstance thread local. Added timeTraceProfilerFinishThread() which moves the thread local instance to a global vector of instances. timeTraceProfilerWrite() then writes recorded data from all instances. Threads are identified based on their thread ids. Totals are reported with artificial thread ids higher than the real ones. This fixes the previous version to work with __thread as well as thread_local. Differential Revision: https://reviews.llvm.org/D71059	2020-01-27 13:01:49 +00:00
Igor Kudrin	70ab7c2444	[DWARF] Do not pass Version to DWARFExpression. NFCI. The Version was used only to determine the size of an operand of DW_OP_call_ref. The size was 4 for all versions apart from 2, but the DW_OP_call_ref operation was introduced only in DWARF3. Thus, the code may be simplified and using of Version may be eliminated. Differential Revision: https://reviews.llvm.org/D73264	2020-01-27 19:08:46 +07:00
Igor Kudrin	b22a094351	[DWARF] Simplify DWARFExpression. NFC. As DataExtractor already has a method to extract an unsigned value of a specified size, there is no need to duplicate that. Differential Revision: https://reviews.llvm.org/D73263	2020-01-27 19:08:46 +07:00
David Stenberg	e4d1228ccd	Improvements to call site register worklist Summary: This fixes PR44118. For cases where we have a chain like this: R8 = R1 (entry value) R0 = R8 call @foo R0 the code that emits call site entries using entry values would not follow that chain, instead emitting a call site entry with R8 as location rather than R0. Such a case was discovered when originally adding dbgcall-site-orr-moves.mir. This patch fixes that issue. This is done by changing the ForwardedRegWorklist set to a map in which the worklist registers always map to the parameter registers that they describe. Another thing this patch fixes is that worklist registers now can describe more than one parameter register at a time. Such a case occurred in dbgcall-site-interpretation.mir, resulting in a call site entry not being emitted for one of the parameters. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73168	2020-01-27 12:41:42 +01:00
Sjoerd Meijer	1ac7c4fcb9	[ARM][MVE] Tail-predication: support constant trip count We had support for runtime trip count values, but not constants, and this adds support for that. Also added a minor optimisation while I was at it: don't invoke Cleanup when there's nothing to clean up. Differential Revision: https://reviews.llvm.org/D73198	2020-01-27 11:05:26 +00:00
Sam Parker	15c79be89a	[ARM][LowOverheadLoops] Don't ignore VCTP When expanding the LoopStart, we try to remove the iteration count calculation. However, if part of the calculation was also used to calculate the number of elements, we could end up deleting instructions that were required to feed DLSTP/WLSTP. Differential Revision: https://reviews.llvm.org/D73275	2020-01-27 10:59:12 +00:00
David Stenberg	25a13f02ab	Don't separate imp/expl def handling for call site params Summary: Since D70431 the describeLoadedValue() hook takes a parameter register, meaning that it can now be asked to describe any register. This means that we can drop the difference between explicit and implicit defines that we previously had in collectCallSiteParameters(). I have not found any case for any upstream targets where a parameter register is only implicitly defined, and does not overlap with any explicit defines. I don't know if such a case would even make sense. So as far as I have tested, this patch should be a non-functional change. However, this reduces the complexity of the code a bit, and it will simplify the implementation of an upcoming patch which solves PR44118. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73167	2020-01-27 11:31:09 +01:00
Georgii Rymar	7bdcb225b9	[llvm-readobj] - Refine --needed-libs implementation and add a test. We have no good test for the --needed-libs option. The one we have as part of Object/readobj-shared-object.test is not complete. In this patch I made minor NFC changes to the implementation and added a test. This allowed removing that piece from Object/readobj-shared-object.test Differential revision: https://reviews.llvm.org/D73174	2020-01-27 13:29:28 +03:00
Guillaume Chatelet	8f250c81f4	[Alignment][NFC] Use Align with CreateAlignedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73449	2020-01-27 10:58:36 +01:00
Georgii Rymar	dbe828563d	[llvm-readobj] - Add a test for --dyn-symbols when there are no dynamic symbols. It removes the Object/readobj-absent.test test and adds one more case to the existing dyn-symbols.test. Differential revision: https://reviews.llvm.org/D73169	2020-01-27 12:34:58 +03:00
Georgii Rymar	0e3dc3bd7a	[llvm-readobj] - Add a test for --hash-table option. We had no test for --hash-table in tools/llvm-readobj. The one we had was in test/Object and checked that it is possible to dump the hash table even when an object doesn't have a section header table. In this patch I created a test, moved and merged the existent one. During moving I converted it to be YAML based to stop using the precompiled binary. Differential revision: https://reviews.llvm.org/D73105	2020-01-27 12:28:21 +03:00
Guillaume Chatelet	e18e01b543	[Alignment][NFC] Use Align with CreateMaskedScatter/Gather Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 This patch shows that CreateMaskedScatter/CreateMaskedGather can only take positive non-zero alignment values. Reviewers: courbet Subscribers: hiraditya, llvm-commits, delena Tags: #llvm Differential Revision: https://reviews.llvm.org/D73361	2020-01-27 10:17:14 +01:00
Petar Avramovic	75e76863f0	[MIPS GlobalISel] Select population count (popcount) G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates these intrinsics from __builtin_popcount and __builtin_popcountll. Add lower and narrow scalar for G_CTPOP. Lower G_CTPOP for MIPS32. Differential Revision: https://reviews.llvm.org/D73216	2020-01-27 09:59:50 +01:00
Petar Avramovic	4b729fba3d	[MIPS GlobalISel] Select count trailing zeros The llvm.cttz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_ctz and __builtin_ctzll. G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true). Clang generates such intrinsics as parts of expansion of builtin_ffs and builtin_ffsll. Many algorithms are also predicated on avoiding zero-value inputs. Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ. Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32. Differential Revision: https://reviews.llvm.org/D73215	2020-01-27 09:51:06 +01:00
Petar Avramovic	4fef9ac108	[MIPS GlobalISel] Select count leading zeros The llvm.ctlz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. The MIPS clz instruction returns 32 for zero input. G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_clz and __builtin_clzll. G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as second argument. Many algorithms are also predicated on avoiding zero-value inputs. Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF). Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32. Differential Revision: https://reviews.llvm.org/D73214	2020-01-27 09:43:38 +01:00
Fangrui Song	274a5fbacd	[MachineVerifier] Simplify and delete LLVM_VERIFY_MACHINEINSTRS from a comment. NFC The environment variable has been unused since r228079.	2020-01-27 00:31:23 -08:00
Wang, Pengfei	0507f61bd2	[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI. Some functions, like fmuladd, don't really have a node; we should separate their declarations from those that have a node to avoid introducing fake nodes. Differential Revision: https://reviews.llvm.org/D72871	2020-01-27 10:38:05 +08:00
Roman Lebedev	daa0abe865	[X86][BdVer2] Polish LEA instruction scheduling info Based on exhaustive llvm-exegesis measurements. There may still be some imperfections for LEA16r/LEA32r. Much like was observed in D68646, i'm also measuring some outliers with some specific registers.	2020-01-26 22:17:27 +03:00
Roman Lebedev	399f253bb6	[NFC][MCA] Re-autogenerate all check lines in all X86 MCA tests Some whitespace issues have crept in, and some znver2 check lines were missing..	2020-01-26 22:17:26 +03:00
Simon Pilgrim	a5fc7e6584	[InstCombine] Add extra shift(c1,add(c2,y)) tests for PR15141	2020-01-26 19:04:12 +00:00
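
The two MIPS GlobalISel entries above (4b729fba3d and 4fef9ac108) mention a "narrow scalar" step for G_CTTZ and G_CTLZ that uses the ZERO_UNDEF variants. The following is a minimal sketch of one way such a split of a 64-bit count into two 32-bit halves can work, not the code from those commits. All function names are hypothetical, and the 32-bit helpers assume the behaviour the messages describe: the plain form returns 32 for a zero input (as MIPS clz does), while the zero-undef form is never called with zero.

```python
NARROW_BITS = 32                      # width of the narrow (legal) type
NARROW_MASK = (1 << NARROW_BITS) - 1  # mask selecting one 32-bit half


def ctlz32(x):
    """Leading zeros of a 32-bit value; returns 32 for zero input."""
    assert 0 <= x <= NARROW_MASK
    return NARROW_BITS - x.bit_length()


def cttz32(x):
    """Trailing zeros of a 32-bit value; returns 32 for zero input."""
    assert 0 <= x <= NARROW_MASK
    if x == 0:
        return NARROW_BITS
    # x & -x isolates the lowest set bit; its bit_length is index + 1.
    return (x & -x).bit_length() - 1


def ctlz32_zero_undef(x):
    """Zero-undef form: callers must guarantee a non-zero input."""
    assert x != 0, "result is undefined for zero"
    return ctlz32(x)


def cttz32_zero_undef(x):
    """Zero-undef form: callers must guarantee a non-zero input."""
    assert x != 0, "result is undefined for zero"
    return cttz32(x)


def ctlz64(x):
    """64-bit leading-zero count built from two 32-bit halves."""
    hi, lo = (x >> NARROW_BITS) & NARROW_MASK, x & NARROW_MASK
    if hi == 0:
        # All 32 high bits are zero; continue counting in the low half.
        return NARROW_BITS + ctlz32(lo)
    # hi is known non-zero here, so the zero-undef form is safe.
    return ctlz32_zero_undef(hi)


def cttz64(x):
    """64-bit trailing-zero count built from two 32-bit halves."""
    hi, lo = (x >> NARROW_BITS) & NARROW_MASK, x & NARROW_MASK
    if lo == 0:
        # All 32 low bits are zero; continue counting in the high half.
        return NARROW_BITS + cttz32(hi)
    # lo is known non-zero here, so the zero-undef form is safe.
    return cttz32_zero_undef(lo)
```

The zero-undef helper is only reached on the branch where its operand has already been checked against zero, which is why the narrowed form can use it without changing the result for a zero 64-bit input.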

190743 Commits