Summary:
This flag was added for the json format to exclude functions from the
output. This mirrors that behavior in lcov (where it was previously
accepted but ignored). This makes the output file smaller which can be
beneficial depending on how you consume it, especially if you don't use
this data anyway.
Patch by Keith Smiley (@keith).
Reviewers: kastiglione, Dor1s, vsk, allevato
Reviewed By: Dor1s, allevato
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73160
vperm (ins ?, X, C), (ins ?, Y, C), 0x31 --> concat X, Y
This is another shuffle problem seen with PR42024:
https://bugs.llvm.org/show_bug.cgi?id=42024
We have this small crack in legalization/lowering/combining/demanded
that allows forming a vperm2f128 of high halves with AVX1 when we
could do better by peeking through the insert_subvector nodes.
AFAICT, it requires IR as shown in the diffs - much larger than legal
vectors - to avoid all of the usual folds.
Another option would prevent forming the 256-bit vperm in lowering.
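To illustrate why the fold is valid, here is a minimal model of the
immediate decoding, assuming each 256-bit vector is just a pair of 128-bit
halves identified by integers. The names Vec256, insertHigh and concat are
hypothetical helpers for this sketch, not LLVM APIs, and the zeroing bits
(3 and 7) of the immediate are not modeled.

```cpp
#include <cstdint>

// A 256-bit vector modeled as two 128-bit halves; each half is an opaque ID.
struct Vec256 {
  int Lo;
  int Hi;
};

// Two selector bits pick one half: 0/1 = low/high of A, 2/3 = low/high of B.
int selectHalf(const Vec256 &A, const Vec256 &B, unsigned Sel) {
  switch (Sel & 3) {
  case 0:
    return A.Lo;
  case 1:
    return A.Hi;
  case 2:
    return B.Lo;
  default:
    return B.Hi;
  }
}

// Model of vperm2f128: imm[1:0] selects the low result half and imm[5:4]
// selects the high result half. Zeroing bits are ignored in this sketch.
Vec256 vperm2f128(const Vec256 &A, const Vec256 &B, uint8_t Imm) {
  return {selectHalf(A, B, Imm & 3u), selectHalf(A, B, (Imm >> 4) & 3u)};
}

// Hypothetical insert_subvector into the high half; the low half stays as-is.
Vec256 insertHigh(Vec256 V, int Sub) {
  V.Hi = Sub;
  return V;
}

// Hypothetical concat of two 128-bit values into one 256-bit vector.
Vec256 concat(int X, int Y) { return {X, Y}; }
```

With immediate 0x31 the low result half is the high half of the first
operand and the high result half is the high half of the second operand, so
the low halves of both inputs (the "?" above) never reach the result.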
Differential Revision: https://reviews.llvm.org/D73197
Pass the Scalability test to VectorType::get in order to be able to
deserialize bitcode that contains scalable vector operations
Differential Revision: https://reviews.llvm.org/D73144
R600 relies on this behaviour.
Fixes: 6e18266aa4dd78953557b8614cb9ff260bad7c65 ('Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"')
Fixes ~100 piglit regressions since 6e18266
Differential Revision: https://reviews.llvm.org/D72991
The hasSideEffect parameter is usually automatically inferred from
instruction patterns. For some of our MVE instructions, we do not have
patterns though, such as for the pre/post inc loads and stores. This
instead specifies the flag manually on the base MVE_VLDRSTR_base
tablegen class, making sure we get this correct.
This can help with scheduling multiple loads more optimally. Here I've
added a unittest as a more direct form of testing.
Differential Revision: https://reviews.llvm.org/D73117
If the root def for renaming is a noop-pseudo instruction like kill,
we would end up without a correct def for the renamed register, causing
miscompiles.
This patch conservatively bails out on any pseudo instruction.
This fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1037912#c70
The pattern is also mishandled by the generated matcher, so work around
this as in the DAG path.
The existing DAG tests aren't particularly targeted to just this one
intrinsic. These also end up differing in scheduling from SGPR->VGPR
operand constraint copies.
Summary:
We create a number of standard types of control sections in multiple places for
things like the function descriptors, external references and the TOC anchor
among others, so it is possible for their properties to be defined
inconsistently in different places. This refactor moves their creation and
properties into functions in the TargetLoweringObjectFile class hierarchy, where
functions for retrieving various special types of sections typically seem
to reside.
Note: There is one case in PPCISelLowering which is specific to function entry
points which we don't address since we don't have access to the TLOF there.
Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast
Reviewed By: jasonliu, hubert.reinterpretcast
Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72347
Summary:
Without the BFI update, some hot blocks are incorrectly treated as cold code.
This fixes a FDO perf regression in the TSVC benchmark from D71288.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73146
The waterfall utility function blindly inserts a phi for every def in
the loop. We don't need this one to be preserved for every
iteration. Saves an extra phi and copy inside the loop body.
1. If users don't specify -mattr, the default target features come
from the IR attributes.
2. Fixed the bug and re-landed this patch.
Reviewers: lenary, asb
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70837
The result and source vector are going to be tied, so these need to be
the same bank.
The inserted value also needs to be broken down based on the result
bank, not the inserted value itself.
Handle dynamic vector extracts that use an index that's an add of a
constant offset into moving the base subregister of the indexing
operation.
Force the add into the loop in regbankselect, which will be recognized
when selected.
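As a rough illustration of the equivalence being exploited, the sketch
below models the vector as an array and the base subregister as a pointer
into it. Vec8, extractNaive and extractFolded are hypothetical names for
this example only and do not correspond to anything in the backend.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: an 8-element vector held in 8 consecutive registers.
using Vec8 = std::array<int, 8>;

// Unfolded form: add the constant to the dynamic index, then index from
// element 0. The caller must ensure Idx + C is in range.
int extractNaive(const Vec8 &V, std::size_t Idx, std::size_t C) {
  return V[Idx + C];
}

// Folded form: the constant moves into the base that indexing starts from,
// so only the dynamic part of the index remains.
int extractFolded(const Vec8 &V, std::size_t Idx, std::size_t C) {
  const int *Base = V.data() + C;
  return Base[Idx];
}
```

Both functions read the same element for any in-range Idx and C; the
folded form just needs no separate add for the index.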
Summary: select and selectcc isel patterns and tests for i32/i64 and fp32/fp64.
Includes optimized selectcc patterns for fmin/fmax/maxs/mins.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D73195
DAGCombiner does this, but divisions expanded here have missed this
optimization since 67aa18f165640374cf0e0a6226dc793bbda6e74f.
Avoids test regressions in a future patch.
We previously had to guard against older MSVC and GCC versions which had rvalue
references but not support for marking functions with ref qualifiers. However,
having bumped our minimum required version to MSVC 2017 and GCC 5.1 means we can
unconditionally enable this feature. Rather than keeping the macro around, this
replaces use of the macro with the actual ref qualifier.
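For readers unfamiliar with the language feature, here is a small
standalone sketch of ref-qualified member functions. The Buffer class is
hypothetical and is not taken from the LLVM sources.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical class showing overloads selected by the value category of
// the object the member function is called on.
class Buffer {
  std::vector<int> Data;

public:
  explicit Buffer(std::vector<int> D) : Data(std::move(D)) {}

  // Chosen for lvalues: the object lives on, so hand out a copy.
  std::vector<int> get() const & { return Data; }

  // Chosen for rvalues: the object is expiring, so move the storage out.
  std::vector<int> get() && { return std::move(Data); }

  // Reports which overload overload resolution picks.
  std::string which() const & { return "lvalue"; }
  std::string which() && { return "rvalue"; }
};
```

Calling which() on a named Buffer selects the `const &` overload, while
calling it on a temporary or on std::move(B) selects the `&&` overload.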
This is one of the potential folds uncovered by extending D72521.
We don't seem to do this in the backend either (unless I'm not
seeing some target-specific transform).
icc and gcc (appears to be target-specific) do this transform.
Differential Revision: https://reviews.llvm.org/D73057
This is a very basic MVE gather/scatter cost model, based roughly on the
code that we will currently produce. It does not handle truncating
scatters or extending gathers correctly yet, as it is difficult to tell
that they are going to be correctly extended/truncated from the limited
information in the cost function.
This can be improved as we extend support for these in the future.
Based on code originally written by David Sherwood.
Differential Revision: https://reviews.llvm.org/D73021
This patch also fixes up a number of cases in DAGCombine and
SelectionDAGBuilder where the size of a scalable vector is used in a
fixed-width context (thus triggering an assertion failure).
Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71215
The generic BaseMemOpClusterMutation calls into TargetInstrInfo to
analyze the address of each load/store instruction, and again to decide
whether two instructions should be clustered. Previously this had to
represent each address as a single base operand plus a constant byte
offset. This patch extends it to support any number of base operands.
The old target hook getMemOperandWithOffset is now a convenience
function for callers that are only prepared to handle a single base
operand. It calls the new more general target hook
getMemOperandsWithOffset.
The only requirements for the base operands returned by
getMemOperandsWithOffset are:
- they can be sorted by MemOpInfo::Compare, such that clusterable ops
get sorted next to each other, and
- shouldClusterMemOps knows what they mean.
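The two requirements can be sketched outside of LLVM as follows. This is
only an illustration under simplifying assumptions: base operands are
modeled as integer IDs, and MemOpAddr, compareAddr, shouldCluster and
countClusterablePairs are hypothetical names, not the real MemOpInfo or
TargetInstrInfo interfaces. The clustering policy (identical bases, offsets
within a byte limit) is likewise an assumption made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one memory operation's address: any number of
// base operands (modeled as integer IDs) plus a constant byte offset.
struct MemOpAddr {
  std::vector<int> BaseOps;
  int64_t Offset;
};

// Requirement 1: a strict ordering that places ops with the same base
// operands next to each other, ordered by offset within each group.
bool compareAddr(const MemOpAddr &A, const MemOpAddr &B) {
  return std::tie(A.BaseOps, A.Offset) < std::tie(B.BaseOps, B.Offset);
}

// Requirement 2: the clustering decision must understand the bases.
// Assumed policy: identical bases and offsets at most MaxDist bytes apart.
bool shouldCluster(const MemOpAddr &A, const MemOpAddr &B, int64_t MaxDist) {
  return A.BaseOps == B.BaseOps && B.Offset >= A.Offset &&
         B.Offset - A.Offset <= MaxDist;
}

// Sorts a copy of the ops and counts adjacent pairs that would be clustered.
std::size_t countClusterablePairs(std::vector<MemOpAddr> Ops,
                                  int64_t MaxDist) {
  std::sort(Ops.begin(), Ops.end(), compareAddr);
  std::size_t Pairs = 0;
  for (std::size_t I = 1; I < Ops.size(); ++I)
    if (shouldCluster(Ops[I - 1], Ops[I], MaxDist))
      ++Pairs;
  return Pairs;
}
```

Because the comparison is lexicographic over the base operands, an op with
bases {1, 2} sorts after one with base {1} alone and is never clustered
with it under this policy.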
One simple follow-on is to enable clustering of AMDGPU FLAT instructions
with both vaddr and saddr (base register + offset register). I've left
a FIXME in the code for this case.
Differential Revision: https://reviews.llvm.org/D71655
This was using the wrong result register, and dropping the result entirely
for v2f16. This would fail to select in the scalar case. I believe it
was also mishandling packed/unpacked subtargets.