When `variable_ops` is specified in `InOperandList` of instruction,
it behaves as expected, i.e., does not count as operand.
So for `(ins variable_ops)` instruction description will have 0
operands. However when used in OutOperandList it is counted as
operand. So `(outs variable_ops)` results in instruction with
one def.
This patch makes behavior of `variable_ops` in `out` list to match
that of `in` list.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D81095
Summary:
TableGen interprets braces ('{}') in the asm string of instruction aliases as
variants but when defining aliases with literal braces they have to be escaped
to prevent them being removed.
Braces are escaped with '\\', for example:
def FooBraces : InstAlias<"foo \\{$imm\\}", (foo IntOperand:$imm)>;
Although when TableGen is emitting the assembly writer (-gen-asm-writer)
the AsmString that gets emitted is:
AsmString = "foo \{$\x01\}";
In c/c++ braces don't need to be escaped which causes compilation
warnings:
warning: use of non-standard escape character '\{' [-Wpedantic]
This patch fixes the issue by unescaping the flattened alias asm string
in the asm writer, by replacing '\{\}' with '{}'.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D79991
- This allow us to specify the (minimal) alignment on an intrinsic's
arguments and, more importantly, the return value.
Differential Revision: https://reviews.llvm.org/D80422
- Argument attribute needs specifiying through `ArgIndex<n>`
(corresponding to `FirstArgIndex`) to distinguish explicitly from the
index number from the overloaded type list.
- In addition, `RetIndex` (corresponding to `ReturnIndex`) and
`FuncIndex` (corresponding to `FunctionIndex`) are introduced for us
to associate attributes on the return value and potentially function
itself.
Differential Revision: https://reviews.llvm.org/D80422
This is quite expensive and it's already available.
Just ReadLegalValueTypes is taking 4 seconds for me in a debug build
for AMDGPU's -gen-instr-info, and this was introducing a second call.
We need to use it to handle <16 x double> indirect indexes
in the AMDGPU BE.
The only visible change from adding it is in ARM cost model.
To me it looks reasonable. With doubling a vector size it
quadruples the cost up to the size 8 and then it did only
double it. Now it also quadruples, which seems a logical
progression to me.
Actual AMDGPU code is to follow, this is a common part, plus
load/store legalization in the AMDGPU BE not to break what
works now.
Differential Revision: https://reviews.llvm.org/D79952
Summary:
In TableGen's instruction selection table generator, references to
register classes were handled by generating a matcher table entry in the
form of "EmitStringInteger, MVT::i32, 'RegisterClassID'". This ID is in
fact the enum integer value corresponding to the register class.
However, both the table generator and the table consumer
(SelectionDAGISel) assume that this ID is less than or equal to 127,
i.e. at most 7 bits. Values greater than this threshold cause completely
wrong behaviours in the instruction selection process.
This patch adds a check to determine if the enum integer value is
greater than the limit of 127. In finding so, the generator emits an
"EmitInteger" instead, which properly supports values with arbitrary
sizes.
Commit f8d044bbcfdc9e1ddc02247ffb86fe39e1f277f0 fixed the very same bug
for register subindices. The present patch now extends this cover to
register classes.
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79705
Set the right target name in clang/examples/Attribute.
Add a missing dependency in the TableGen GlobalISel sublibrary.
Skip building the Bye pass plugin example on windows; plugins
that should have undefined symbols that are found in the host
process aren't supported on windows - this matches what was done
for a unit test in bc8e44218810c0db6328b9809c959ceb7d43e3f5.
This means AttrBuilder will always create a sorted set of attributes and
we can skip the sorting step. Sorting attributes is surprisingly
expensive, and I recently made it worse by making it use array_pod_sort.
This was hitting the default instruction constraint code which uses
the register classes in the instruction def, which REG_SEQUENCE does
not have.
Fixes not constraining the register class for AMDGPU fneg/fabs
patterns, which would fail when the use was another generic,
unconstrained instruction.
Another oddity I noticed is that the temporary registers are created
with an unnecessary, but incorrect 16-bit LLT but this shouldn't
matter.
I'm also still unclear why root and sub-instructions have to be
handled differently.
Summary:
Some compressed instructions match against negative values; store
immediates as a signed value such that these patterns will now match
the intended instructions.
Reviewers: asb, lenary, PaoloS
Reviewed By: asb
Subscribers: rbar, johnrusso, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76767
Follow-up of D72172 and D72180
This patch passes `uint64_t Address` to print methods of PC-relative
operands so that subsequent target specific patches can change
`*InstPrinter::print{Operand,PCRelImm,...}` to customize the output.
Add MCInstPrinter::PrintBranchImmAsAddress which is set to true by
llvm-objdump.
```
// Current llvm-objdump -d output
aarch64: 20000: bl #0
ppc: 20000: bl .+4
x86: 20000: callq 0
// Ideal output
aarch64: 20000: bl 0x20000
ppc: 20000: bl 0x20004
x86: 20000: callq 0x20005
// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
aarch64: 20000: bl 20000
ppc: 20000: bl 0x20004
x86: 20000: callq 20005
```
In `lib/Target/X86/X86GenAsmWriter1.inc` (generated by `llvm-tblgen -gen-asm-writer`):
```
case 12:
// CALL64pcrel32, CALLpcrel16, CALLpcrel32, EH_SjLj_Setup, JCXZ, JECXZ, J...
- printPCRelImm(MI, 0, O);
+ printPCRelImm(MI, Address, 0, O);
return;
```
Some targets have 2 `printOperand` overloads, one without `Address` and
one with `Address`. They should annotate derived `Operand` properly with
`let OperandType = "OPERAND_PCREL"`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76574
This reverts commit e9f22fd4293a65bcdcf1b18b91c72f63e5e9e45b.
When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70
failing tests with this revision, and 29 without this revision.
This patch generates TableGen descriptions for the specified register
banks which contain a list of register sizes corresponding to the
available HwModes. The appropriate size is used during codegen according
to the current HwMode. As this HwMode was not available on generation,
it is set upon construction of the RegisterBankInfo class. Targets
simply need to provide the HwMode argument to the
<target>GenRegisterBankInfo constructor.
The RISC-V RegisterBankInfo constructor has been updated accordingly
(plus an unused argument removed).
Differential Revision: https://reviews.llvm.org/D76007
This patch rewrites the RegisterBankEmitter class to derive
RegisterClassHierarchy from CodeGenTarget::getRegBank() rather than
constructing our own copy. All are now accessed through a const
reference.
Differential Revision: https://reviews.llvm.org/D76006
For context, the proposed RISC-V bit manipulation extension has a subset
of instructions which require one of two SubtargetFeatures to be
enabled, 'zbb' or 'zbp', and there is no defined feature which both of
these can imply to use as a constraint either (see comments in D65649).
AssemblerPredicates allow multiple SubtargetFeatures to be declared in
the "AssemblerCondString" field, separated by commas, and this means
that the two features must both be enabled. There is no equivalent to
say that _either_ feature X or feature Y must be enabled, short of
creating a dummy SubtargetFeature for this purpose and having features X
and Y imply the new feature.
To solve the case where X or Y is needed without adding a new feature,
and to better match a typical TableGen style, this replaces the existing
"AssemblerCondString" with a dag "AssemblerCondDag" which represents the
same information. Two operators are defined for use with
AssemblerCondDag, "all_of", which matches the current behaviour, and
"any_of", which adds the new proposed ORing features functionality.
This was originally proposed in the RFC at
http://lists.llvm.org/pipermail/llvm-dev/2020-February/139138.html
Changes to all current backends are mechanical to support the replaced
functionality, and are NFCI.
At this stage, it is illegal to combine features with ands and ors in a
single AssemblerCondDag. I suspect this case is sufficiently rare that
adding more complex changes to support it are unnecessary.
Differential Revision: https://reviews.llvm.org/D74338
Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.
Saves ~250 includes of Chrono.h & FileSystem.h:
$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...
This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.
Summary:
The type used to represent functional units in MC is
'unsigned', which is 32 bits wide. This is currently
not a problem in any upstream target as no one seems
to have hit the limit on this yet, but in our
downstream one, we need to define more than 32
functional units.
Increasing the size does not seem to cause a huge
size increase in the binary (an llc debug build went
from 1366497672 to 1366523984, a difference of 26k),
so perhaps it would be acceptable to have this patch
applied upstream as well.
Subscribers: hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71210
isPrefix was added to support the patches to align branches.
it relies on a switch over instruction names.
This moves those opcodes to a new format so the information is
tablegen and we can just check for a specific value in some bits
in TSFlags instead.
I've left the other function in place for now so that the
existing patches in phabricator will still work. I'll work with
the owner to get them migrated.
This was checking for default operands in the current DAG instruction,
rather than the correct result operand list. I'm not entirly sure how
this managed to work before, but was failing for me when multiple
default operands were overridden.
Summary:
Previously TableGen would crash trying to print the undefined value as
an integer.
Change-Id: I3900071ceaa07c26acafb33bc49966d7d7a02828
Reviewers: nhaehnle
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74210
Summary:
In the DAG pattern backend, `SimplifyTree` simplifies a pattern by
removing bitconverts between two identical types. But that function is
also run on the fragments list in instances of `PatFrags`, in which
the types haven't been specified yet. So the input and output of the
bitconvert always evaluate to the empty set of types, which makes them
compare equal. So the test always passes, and bitconverts are
unconditionally removed from the PatFrag RHS.
Fixed by spotting the empty type set and using it to inhibit the
optimization.
Reviewers: nhaehnle, hfinkel
Reviewed By: nhaehnle
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74627
Tablegen's DAGISelMatcher emits integers in a VBR format,
so if an integer is below 128 it can fit into a single
byte, otherwise high bit is set, next byte is used etc.
MatcherTable is essentially an unsigned char table. When
SelectionDAGISel parses the table it does a reverse translation.
In a situation when numeric value of an integer to emit is
unknown it can be emitted not as OPC_EmitInteger but as
OPC_EmitStringInteger using a symbolic name of the value.
In this situation the value should not exceed 127.
One of the situations when OPC_EmitStringInteger is used is
if we need to emit a subreg into a matcher table. However,
number of subregs can exceed 127. Currently last defined subreg
for AMDGPU is 192. That results in a silent bug in the ISel
with matcher reading from an invalid offset.
Fixed this bug to emit actual VBR encoded value for a subregs
which value exceeds 127.
Differential Revision: https://reviews.llvm.org/D74368