Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.
APInt, on the other hand, supports up to 64 bits without a malloc. That's the maximum number of bits we need here, so we can avoid a malloc in all cases by using APInt.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30392
llvm-svn: 296355
Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.
APInt, on the other hand, supports up to 64 bits without a malloc. That's the maximum number of bits we need here, so we can avoid a malloc in all cases by using APInt.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30390
llvm-svn: 296354
Some of the vectors were too small to avoid heap allocation. In one case the vector was oversized.
Differential Revision: https://reviews.llvm.org/D30387
llvm-svn: 296353
Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.
APInt, on the other hand, supports up to 64 bits without a malloc. That's the maximum number of bits we need here, so we can avoid a malloc in all cases by using APInt. This incurs a minor increase in stack usage because APInt, unlike SmallBitVector, stores the bit count separately from the data bits, but that should be OK.
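The trade-off can be sketched with a small stand-in type. InlineMask64 below is a hypothetical illustration, not the APInt or SmallBitVector implementation: it keeps the width in its own field, so all 64 bits of the data word stay usable without any heap allocation, at the cost of a larger object.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a <= 64-bit mask: the width is stored separately
// from the data word, so the full 64-bit word is available for payload.
struct InlineMask64 {
  unsigned NumBits; // logical width, 1..64
  uint64_t Bits;    // payload; bits at or above NumBits are never set

  explicit InlineMask64(unsigned Width) : NumBits(Width), Bits(0) {
    assert(Width >= 1 && Width <= 64 && "sketch only covers inline widths");
  }

  void setBit(unsigned I) {
    assert(I < NumBits && "bit index out of range");
    Bits |= uint64_t(1) << I;
  }

  bool test(unsigned I) const {
    assert(I < NumBits && "bit index out of range");
    return (Bits >> I) & 1;
  }

  // Number of set bits, counted by clearing the lowest set bit each round.
  unsigned count() const {
    unsigned N = 0;
    for (uint64_t B = Bits; B != 0; B &= B - 1)
      ++N;
    return N;
  }
};
```

Because the width lives outside the data word, sizeof(InlineMask64) is larger than sizeof(uint64_t), which is the kind of extra stack usage mentioned above.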
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30386
llvm-svn: 296352
This fixes a loop predication bug which resulted in malformed IR being generated.
The loop-invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See the added unsigned_loop_0_to_n_hoist_length test for an example.
Reviewed By: sanjoy, mkazantsev
Differential Revision: https://reviews.llvm.org/D30099
llvm-svn: 296345
This is a cleanup/rewrite of the printSysAlias function. It was not using the
tablegen instruction descriptions, but was "manually" decoding the
instructions. This has been replaced with calls to the lookup_XYZ_ByEncoding
tablegen functions.
This revealed several problems. First, instruction IVAU had the wrong encoding.
This was cancelled out by the parser that incorrectly matched the wrong
encoding. Second, instruction CVAP was missing from the SystemOperands tablegen
descriptions, so this has been added. And third, the required target features
were not captured in the tablegen descriptions, so support for this has also
been added.
Differential Revision: https://reviews.llvm.org/D30329
llvm-svn: 296343
Currently we handle this correctly in arm, but in thumb we don't, which leads to
an unpredictable instruction being emitted for LSL #0 in an IT block and SP not
being permitted in some cases when it should be.
For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the
.td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to
get the IT handling right. We also need to adjust the handling of
MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We
should also adjust it to allow SP in the same way that it is allowed in
MOV rd, rn, but I haven't done that here because it looks like it would take
quite a lot of work to get right.
Additionally correct the selection of the 16-bit shift instructions in
processInstruction, where it was checking if the two registers were equal when
it should have been checking if they were low. It appears that previously this
code was never executed and the 16-bit encoding was selected by default, but
the other changes I've done here have somehow made it start being used.
Differential Revision: https://reviews.llvm.org/D30294
llvm-svn: 296342
This pattern is essentially an i16 load from the p+1 address:
  %p1.i16 = bitcast i8* %p to i16*
  %p2.i8 = getelementptr i8, i8* %p, i64 2
  %v1 = load i16, i16* %p1.i16
  %v2.i8 = load i8, i8* %p2.i8
  %v2 = zext i8 %v2.i8 to i16
  %v1.shl = shl i16 %v1, 8
  %res = or i16 %v1.shl, %v2
The current implementation would identify the %v1 load as the first byte load and would mistakenly emit an i16 load from the %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way, this address can be used as the base address for the combined value. Otherwise we just give up combining.
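The arithmetic behind the pattern can be checked with a small standalone sketch. It assumes a big-endian byte order, which is an assumption of this example since the message does not name the target's endianness, and it models the loads with explicit byte arithmetic so the result does not depend on the host. The helper names loadBE16 and patternResult are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Big-endian i16 load modeled with explicit byte arithmetic (assumption of
// this sketch; independent of the host's byte order).
uint16_t loadBE16(const uint8_t *P) {
  return static_cast<uint16_t>((P[0] << 8) | P[1]);
}

// Mirrors the IR above: %v1 is an i16 load from %p, %v2 is a zero-extended
// byte loaded from %p + 2, and the result is (%v1 << 8) | %v2 in 16 bits.
uint16_t patternResult(const uint8_t *P) {
  uint16_t V1 = loadBE16(P);
  uint16_t V2 = P[2];
  // The cast truncates to 16 bits, dropping the high byte of V1 just as
  // `shl i16 %v1, 8` does.
  return static_cast<uint16_t>((V1 << 8) | V2);
}
```

For the bytes 0x11, 0x22, 0x33 the pattern yields 0x2233, which matches a big-endian i16 load from p+1 and differs from the 0x1122 that a load from p would produce.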
llvm-svn: 296336
There are no instructions that have "[1]" as part of the assembly string;
FMOVXDhighr is out of date. This removes dead code.
Differential Revision: https://reviews.llvm.org/D30165
llvm-svn: 296327
- Verify that runtime metadata is actually valid runtime metadata when assembling. Otherwise we could accept the following when assembling, but the ocl runtime would reject it:
.amdgpu_runtime_metadata
{ amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ...
- Make IsaInfo optional, and always emit it.
Differential Revision: https://reviews.llvm.org/D30349
llvm-svn: 296324
This creates an llvm-mc-disassemble-fuzzer from the existing llvm-mc-fuzzer
and finishes the assemble support in llvm-mc-assemble-fuzzer.
llvm-svn: 296323
Summary:
A BranchInst or SwitchInst (with a non-default case) with undef as input is not
possible at this point, as we always default-fold the terminator to one target in
ResolvedUndefsIn and set the input accordingly.
So we should only have a ConstantInt or BlockAddress here.
If ConstantFoldTerminator fails, that could mean two things:
1. ConstantFoldTerminator is doing something unexpected, i.e. not folding on a ConstantInt
or BlockAddress and not making blocks that should be dead dead.
2. This is not a terminator on a ConstantInt or BlockAddress. It's on a constant or
overdefined, in which case this block should not be dead.
In both cases, we should assert.
Reviewers: davide, efriedma, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30381
llvm-svn: 296281
Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.
I've also added an assert checking that `ComputeNumSignBits` returns at
least 1. That assert found an already checked-in test case where we
were returning a bad result for `ashr %val, -1`.
Fixes PR32045.
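To illustrate why 0 is a bogus answer, here is a minimal standalone sketch of counting sign bits for a 32-bit value. The function numSignBits is hypothetical and is not LLVM's `ComputeNumSignBits`; it only shows that the count includes the sign bit itself, so a valid result is always at least 1.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many leading bits of V equal its sign bit, including the sign
// bit itself. The result is therefore always in the range [1, 32].
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1; // the sign bit always counts
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1) == Sign; --Bit)
    ++N;
  return N;
}
```

Even the values with the fewest redundant sign bits, such as INT32_MIN and INT32_MAX, still report 1.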
Reviewers: spatel, majnemer
Reviewed By: spatel, majnemer
Subscribers: efriedma, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D30311
llvm-svn: 296273
The current pattern for extracting a range of bits is typically:
Mask.lshr(BitOffset).trunc(SubSizeInBits);
This can be particularly slow for large APInts (MaskSizeInBits > 64), as it requires allocating memory for the temporary variable.
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.
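The idea of reading a bit range directly, without materializing a shifted copy of the whole value, can be sketched outside of LLVM. The helper extractBits64 below is hypothetical and is not the APInt API: it assumes the value is stored as an array of 64-bit words with the least significant word first, and it only handles ranges of up to 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: reads NumBits (1..64) starting at BitPosition from an
// array of 64-bit words (least significant word first). It touches at most
// two words and allocates nothing.
uint64_t extractBits64(const std::vector<uint64_t> &Words, unsigned NumBits,
                       unsigned BitPosition) {
  assert(NumBits >= 1 && NumBits <= 64 && "sketch returns a single word");
  assert(BitPosition + NumBits <= Words.size() * 64 && "range out of bounds");
  unsigned WordIdx = BitPosition / 64;
  unsigned Shift = BitPosition % 64;
  uint64_t Result = Words[WordIdx] >> Shift;
  // Pull in the low bits of the next word when the range straddles a
  // word boundary.
  if (Shift != 0 && Shift + NumBits > 64)
    Result |= Words[WordIdx + 1] << (64 - Shift);
  // Mask off everything above NumBits; shifting by 64 would be undefined.
  if (NumBits < 64)
    Result &= (uint64_t(1) << NumBits) - 1;
  return Result;
}
```

In this sketch the source words are only read, so no temporary of the full mask width is ever created.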
Differential Revision: https://reviews.llvm.org/D30336
llvm-svn: 296272