llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00

Author	SHA1	Message	Date
Matt Arsenault	8e4d20c2f9	R600: Fix failing testcase	2020-01-22 16:01:35 -05:00
Keith Smiley	31fc1398fa	[llvm-cov] Add support for -skip-functions to lcov Summary: This flag was added for the json format to exclude functions from the output. This mirrors that behavior in lcov (where it was previously accepted but ignored). This makes the output file smaller which can be beneficial depending on how you consume it, especially if you don't use this data anyways. Patch by Keith Smiley (@keith). Reviewers: kastiglione, Dor1s, vsk, allevato Reviewed By: Dor1s, allevato Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73160	2020-01-22 12:49:00 -08:00
Sanjay Patel	2591fb635b	[x86] fold vperm2x128 to concat of 128-bit high half vectors vperm (ins ?, X, C), (ins ?, Y, C), 0x31 --> concat X, Y This is another shuffle problem seen with PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 We have this small crack in legalization/lowering/combining/demanded that allows forming a vperm2f128 of high halves with AVX1 when we could do better by peeking through the insert_subvector nodes. AFAICT, it requires IR as shown in the diffs - much larger than legal vectors - to avoid all of the usual folds. Another option would prevent forming the 256-bit vperm in lowering. Differential Revision: https://reviews.llvm.org/D73197	2020-01-22 15:35:50 -05:00
Chris Tetreault	d82a7e0d5a	[SVE] Pass Scalable argument to VectorType::get in Bitcode Reader Pass the Scalability test to VectorType::get in order to be able to deserialize bitcode that contains scalable vector operations Differential Revision: https://reviews.llvm.org/D73144	2020-01-22 12:29:25 -08:00
Alina Sbirlea	c178679b6e	[LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available. If MemorySSA analysis is available, LoopDeletion now preserves it.	2020-01-22 11:38:38 -08:00
Jan Vesely	6999f6e94d	AMDGPU/R600: Emit rodata in text segment R600 relies on this behaviour. Fixes: 6e18266aa4dd78953557b8614cb9ff260bad7c65 ('Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"') Fixes ~100 piglit regressions since 6e18266 Differential Revision: https://reviews.llvm.org/D72991	2020-01-22 14:31:51 -05:00
Aaron Ballman	047716a57a	Add LLVM_VALUE_FUNCTION to Optional::map(); NFC This is for future-proofing when compiling with MSVC once we drop support for 2017.	2020-01-22 14:21:08 -05:00
Nico Weber	068346b51e	[gn build] reformat all build files again Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format` after recent fixes to formatting of comments after single-element lists.	2020-01-22 14:04:20 -05:00
Aaron Ballman	08a193ec33	Add a comment about when we can remove this construct; NFC.	2020-01-22 13:17:38 -05:00
Simon Pilgrim	308dbd2e26	[X86][SSE] combineExtractWithShuffle - extract(bitcast(broadcast(x))) --> x Removes some unnecessary gpr<-->fpu traffic	2020-01-22 18:02:58 +00:00
David Green	2ac2dd62a3	[ARM] Mark MVE loads/stores as not having side effects The hasSideEffect parameter is usually automatically inferred from instruction patterns. For some of our MVE instructions, we do not have patterns though, such as for the pre/post inc loads and stores. This instead specifies the flag manually on the base MVE_VLDRSTR_base tablegen class, making sure we get this correct. This can help with scheduling multiple loads more optimally. Here I've added a unittest as a more direct form of testing. Differential Revision: https://reviews.llvm.org/D73117	2020-01-22 17:56:55 +00:00
Nico Weber	948bd41ef1	Revert "[DA][TTI][AMDGPU] Add option to select GPUDA with TTI" This reverts commit a90a6502ab35d3c15c7d56772e409c5632ce6cfb. Broke tests on Windows: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/13808	2020-01-22 12:56:19 -05:00
Nico Weber	9a5d53ffb2	Revert "[gn build] [win] produce symbolized stack frames in release builds too" This reverts commit fd98eccf984f203e39452da238a142f83f61d368. Seems to have no effect, need to try it locally for a bit first.	2020-01-22 12:54:19 -05:00
Aaron Ballman	3be16c9bda	Revert "Unconditionally enable lvalue function designators; NFC" This reverts commit 968561bcdc34c7d74482fe3bb69a045abf08d2c1	2020-01-22 12:40:39 -05:00
Nico Weber	aa258c1b77	[gn build] [win] produce symbolized stack frames in release builds too	2020-01-22 12:36:38 -05:00
Florian Hahn	7c3fe23f5d	[AArch64] Don't rename registers with pseudo defs in Ld/St opt. If the root def for renaming is a noop-pseudo instruction like kill, we would end up without a correct def for the renamed register, causing miscompiles. This patch conservatively bails out on any pseudo instruction. This fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1037912#c70	2020-01-22 09:26:25 -08:00
Matt Arsenault	2ff7e671d5	AMDGPU/GlobalISel: Handle 16-bank LDS llvm.amdgcn.interp.p1.f16 The pattern is also mishandled by the generated matcher, so work around this as in the DAG path. The existing DAG tests aren't particularly targeted to just this one intrinsic. These also end up differing in scheduling from SGPR->VGPR operand constraint copies.	2020-01-22 12:10:59 -05:00
David Tenty	71acb7b4cf	[NFC][XCOFF] Refactor Csect creation into TargetLoweringObjectFile Summary: We create a number of standard types of control sections in multiple places for things like the function descriptors, external references and the TOC anchor among others, so it is possible for their properties to be defined inconsistently in different places. This refactor moves their creation and properties into functions in the TargetLoweringObjectFile class hierarchy, where functions for retrieving various special types of sections typically seem to reside. Note: There is one case in PPCISelLowering which is specific to function entry points which we don't address since we don't have access to the TLOF there. Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast Reviewed By: jasonliu, hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72347	2020-01-22 12:09:11 -05:00
Stanislav Mekhanoshin	b9c92f5ddd	Precommit NFC part of DAGCombiner change. NFC. This is the NFC part of the DAGCombiner::visitEXTRACT_SUBVECTOR() change in D73132.	2020-01-22 09:01:22 -08:00
Stanislav Mekhanoshin	50d8a4f0bd	Regenerate test/CodeGen/ARM/vext.ll. NFC. This is to pre-commit whitespace only changes before D73132.	2020-01-22 08:56:08 -08:00
Matt Arsenault	a0fa14096f	AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp This is deprecated, but easy to support.	2020-01-22 11:43:53 -05:00
Matt Arsenault	517f60a00f	AMDGPU/GlobalISel: Select llvm.amdgcn.mov.dpp8	2020-01-22 11:43:40 -05:00
Hiroshi Yamauchi	023bdc26f9	[PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146	2020-01-22 08:36:54 -08:00
Pablo Barrio	aa9d997415	[AArch64] Add test for DWARF return address signing Summary: Patch by LukeCheeseman and pbarrio Reviewers: samparker, chill Subscribers: kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72835	2020-01-22 16:36:21 +00:00
Matt Arsenault	42c5f78e96	AMDGPU: Fix element size assertion The GlobalISel usage called this with bits, but the DAG usage was incorrectly using bytes.	2020-01-22 11:18:45 -05:00
Matt Arsenault	b889862fe2	AMDGPU/GlobalISel: Keep G_BITCAST out of waterfall loop The waterfall utility function blindly inserts a phi for every def in the loop. We don't need this one to be preserved for every iteration. Saves an extra phi and copy inside the loop body.	2020-01-22 11:16:19 -05:00
Zakk Chen	e95eb656e4	[RISCV] Support ABI checking with per function target-features 1. If users don't specify -mattr, the default target-features come from the IR attribute. 2. Fixed a bug and re-landed this patch. Reviewers: lenary, asb Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837	2020-01-22 08:12:28 -08:00
Simon Pilgrim	822a5c1afe	[X86][SSE] combineExtractWithShuffle - extract(bitcast(scalar_to_vector(x))) --> x Removes some unnecessary gpr<-->fpu traffic	2020-01-22 16:11:08 +00:00
Matt Arsenault	ef09775aa0	AMDGPU/GlobalISel: Fold add of constant into G_INSERT_VECTOR_ELT Move the subregister base like in the extract case.	2020-01-22 11:09:15 -05:00
Nico Weber	7de08a66c5	[gn build] (manually) port a174f0da62f	2020-01-22 11:08:34 -05:00
Matt Arsenault	49336cf1f0	AMDGPU/GlobalISel: Select G_INSERT_VECTOR_ELT	2020-01-22 11:00:49 -05:00
Matt Arsenault	5341a3bed9	AMDGPU/GlobalISel: Fix RegBankSelect for G_INSERT_VECTOR_ELT The result and source vector are going to be tied, so these need to be the same bank. The inserted value also needs to be broken down based on the result bank, not the inserted value itself.	2020-01-22 10:57:50 -05:00
Matt Arsenault	45831d88f1	AMDGPU/GlobalISel: Fold constant offset vector extract indexes Handle dynamic vector extracts that use an index that's an add of a constant offset into moving the base subregister of the indexing operation. Force the add into the loop in regbankselect, which will be recognized when selected.	2020-01-22 10:50:59 -05:00
Kazushi (Jam) Marukawa	7acc2658ba	[VE] select and selectcc patterns Summary: select and selectcc isel patterns and tests for i32/i64 and fp32/fp64. Includes optimized selectcc patterns for fmin/fmax/maxs/mins. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73195	2020-01-22 16:30:38 +01:00
Matt Arsenault	68415db929	AMDGPU: Fix typo	2020-01-22 10:17:46 -05:00
Matt Arsenault	9c41dd4ce6	AMDGPU: Look through casted selects to constant fold bin ops The promotion of the uniform select to i32 interfered with this fold.	2020-01-22 10:16:39 -05:00
Matt Arsenault	e0f61e459a	AMDGPU: Do binop of select of constant fold in AMDGPUCodeGenPrepare DAGCombiner does this, but divisions expanded here miss this optimization. Since 67aa18f165640374cf0e0a6226dc793bbda6e74f, divisions have been expanded here and missed out on this optimization. Avoids test regressions in a future patch.	2020-01-22 10:16:39 -05:00
Matt Arsenault	9dbb3b7338	AMDGPU/GlobalISel: Add pre-legalize combiner pass Just copy the AArch64 pass as-is for now, except for removing the memcpy handling.	2020-01-22 10:16:39 -05:00
Aaron Ballman	d53a98a327	Unconditionally enable lvalue function designators; NFC We previously had to guard against older MSVC and GCC versions which had rvalue references but not support for marking functions with ref qualifiers. However, having bumped our minimum required version to MSVC 2017 and GCC 5.1 means we can unconditionally enable this feature. Rather than keeping the macro around, this replaces use of the macro with the actual ref qualifier.	2020-01-22 09:54:34 -05:00
Sanjay Patel	20563eeb1d	[InstCombine] fneg(X + C) --> -C - X This is 1 of the potential folds uncovered by extending D72521. We don't seem to do this in the backend either (unless I'm not seeing some target-specific transform). icc and gcc (appears to be target-specific) do this transform. Differential Revision: https://reviews.llvm.org/D73057	2020-01-22 09:48:43 -05:00
Kazushi (Jam) Marukawa	f9bb5d260a	[VE] setcc isel patterns Summary: SETCC isel patterns and tests for i32/64 and fp32/64 comparison Reviewers: arsenm, rengolin, craig.topper, k-ishizaka Reviewed By: arsenm Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits Tags: #ve, #llvm Differential Revision: https://reviews.llvm.org/D73171	2020-01-22 15:45:57 +01:00
David Green	d8e98cfe8f	[ARM] Basic gather scatter cost model This is a very basic MVE gather/scatter cost model, based roughly on the code that we will currently produce. It does not handle truncating scatters or extending gathers correctly yet, as it is difficult to tell that they are going to be correctly extended/truncated from the limited information in the cost function. This can be improved as we extend support for these in the future. Based on code originally written by David Sherwood. Differential Revision: https://reviews.llvm.org/D73021	2020-01-22 14:41:38 +00:00
David Green	380c878c38	[ARM] MVE Gather Scatter cost model tests. NFC	2020-01-22 14:41:38 +00:00
Sander de Smalen	3c233e1b36	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Jay Foad	1c282f08c9	[MachineScheduler] Allow clustering mem ops with complex addresses The generic BaseMemOpClusterMutation calls into TargetInstrInfo to analyze the address of each load/store instruction, and again to decide whether two instructions should be clustered. Previously this had to represent each address as a single base operand plus a constant byte offset. This patch extends it to support any number of base operands. The old target hook getMemOperandWithOffset is now a convenience function for callers that are only prepared to handle a single base operand. It calls the new more general target hook getMemOperandsWithOffset. The only requirements for the base operands returned by getMemOperandsWithOffset are: - they can be sorted by MemOpInfo::Compare, such that clusterable ops get sorted next to each other, and - shouldClusterMemOps knows what they mean. One simple follow-on is to enable clustering of AMDGPU FLAT instructions with both vaddr and saddr (base register + offset register). I've left a FIXME in the code for this case. Differential Revision: https://reviews.llvm.org/D71655	2020-01-22 14:28:24 +00:00
Matt Arsenault	3da44f1ca3	AMDGPU/GlobalISel: Fix RegbankSelect for llvm.amdgcn.fmul.legacy	2020-01-22 09:26:17 -05:00
Matt Arsenault	30112e1c2b	AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec The intermediate instruction drops the extra volatile argument. We are missing an atomic ordering on these.	2020-01-22 09:26:17 -05:00
Matt Arsenault	3a75a6094d	AMDGPU: Fix interaction of tfe and d16 This was using the wrong result register, and dropping the result entirely for v2f16. This would fail to select on the scalar case. I believe it was also mishandling packed/unpacked subtargets.	2020-01-22 09:26:17 -05:00
Matt Arsenault	24875bbc4d	AMDGPU/GlobalISel: RegBankSelect interp intrinsics Note this assumes the future use of immediates for immarg, not the current G_CONSTANT which will be emitted.	2020-01-22 09:01:34 -05:00
Matt Arsenault	2bf86d2e2a	AMDGPU: Fix missing immarg on llvm.amdgcn.interp.mov The first operand maps to an immediate field, so this should be immarg.	2020-01-22 09:01:34 -05:00

190496 Commits