Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.
llvm-svn: 319808
This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks.
Differential Revision: https://reviews.llvm.org/D40039
llvm-svn: 319807
Previously ConstantRange::makeGuaranteedNoWrapRegion only handled addition. This adds support for subtraction.
Differential Revision: https://reviews.llvm.org/D40036
llvm-svn: 319806
There's no such thing as a setcc with vector operands and scalar result. And if we're trying to widen the result we would have to already be looking at a vector result type.
So this patch renames the VSETCC function as the SETCC function and delete the original SETCC function.
llvm-svn: 319799
Without this when lld failed to replace the output file it would leave
the temporary behind. The problem is that the existing logic is
- cancel the delete flag
- rename
We have to cancel first to avoid renaming and then crashing and
deleting the old version. What is missing then is deleting the
temporary file if the rename fails.
This can be an issue on both unix and windows, but I am not sure how
to cause the rename to fail reliably on unix. I think it can be done
on ZFS since it has an ACL system similar to what windows uses, but
adding support for checking that in llvm-lit is probably not worth it.
llvm-svn: 319786
This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR.
Differential Revision: https://reviews.llvm.org/D39720
Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1
llvm-svn: 319778
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.
Differential Revision: https://reviews.llvm.org/D39604
llvm-svn: 319773
Summary:
Found out, at code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.
A BUILD_PAIR is always having the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big endian targets, we should check
if the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.
Normally this bug only resulted in missed oppurtunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.
Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
t76: i64,ch = load<LD8[FixedStack-9]
instead of
t37: i32,ch = load<LD4[FixedStack-10]>
t35: i32,ch = load<LD4[FixedStack-9]>
t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transfomation is correct now.
Reviewers: niravd, hfinkel
Reviewed By: niravd
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D40444
llvm-svn: 319771
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.
The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.
Patch by JesperAntonsson.
Reviewers: spatel, eeckstein, hans
Reviewed By: hans
Subscribers: uabelho, llvm-commits
Differential Revision: https://reviews.llvm.org/D40639
llvm-svn: 319768
Pull the checks upon the load out from ReduceLoadWidth into their own
function.
Differential Revision: https://reviews.llvm.org/D40833
llvm-svn: 319766
This marks certain flags in XRay as deprecated (in particular,
`xray_naive_log=` and `xray_fdr_log=`), and recommends the use of the
`xray_mode=` flag.
llvm-svn: 319763
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.
llvm-svn: 319760
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.
llvm-svn: 319758
This has proven a healthy exercise, as many cases of incorrect instruction
flags were corrected in the process. As part of this, IntrWriteMem was added
to several SystemZ instrinsics.
Furthermore, a bug was exposed in TwoAddress with this change (as incorrect
hasSideEffects flags were removed and instructions could now be sunk), and
the test case for that bugfix (r319646) is included here as
test/CodeGen/SystemZ/twoaddr-sink.ll.
One temporary test regression (one extra copy) which will hopefully go away
in upcoming patches for similar cases:
test/CodeGen/SystemZ/vec-trunc-to-i1.ll
Review: Ulrich Weigand.
https://reviews.llvm.org/D40437
llvm-svn: 319756
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.
NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.
This should be an improvement generally for any target!
The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.
The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.
Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128
llvm-svn: 319754
This recommits r319533 which was broken llvm-config --system-libs
output. The reason was that I used find_libraries for searching for the
z library. This returns absolute paths, and when these paths made it
into llvm-config, it made it produce nonsensical flags. To fix this, I
hand-roll a search for the library in the same way that we search for
the terminfo library a couple of lines below.
This is a bit less flexible than the find_library option, as it does not
allow the user to specify the path to the library at configure time
(which is important on windows, as zlib is unlikely to be found in any
of the standard places cmake searches), but I was able to guide the
build to find it with appropriate values of LIB and INCLUDE environment
variables.
Reviewers: compnerd, rnk, beanz, rafael
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D40779
llvm-svn: 319751
This is for PR35460.
Currently when LLD adds files to TarWriter it may pass the same file
multiple times. For example it happens for clang reproduce file which specifies
archive (.a) files more than once in command line.
Patch makes TarWriter to ignore files with the same path, so it will
add only the first one to archive.
Differential revision: https://reviews.llvm.org/D40606
llvm-svn: 319750
When trying to determine the correct Mask register class corresponding
to a GPR register class, not all register classes were handled.
This caused an assertion to be raised on some scenarios.
Differential Revision:
https://reviews.llvm.org/D40290
llvm-svn: 319745
The CONCAT_VECTORS operand get its type from getSetCCResultType, but if the mask type and the setcc have different scalar sizes this creates an illegal CONCAT_VECTORS operation. The concat type should be 2x the mask type, and then an extend should be added if needed.
llvm-svn: 319744
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.
If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.
llvm-svn: 319740
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.
The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.
llvm-svn: 319732
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.
If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.
llvm-svn: 319728
The getConstant function can take care of creating the APInt internally.
getZeroVector will take care of using the correct type for the build vector to avoid re-lowering.
The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change.
llvm-svn: 319725
Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry.
Move all of the i1 input type code in Lower_AVX512ZeroExend together.
llvm-svn: 319724
The "x${...}" form was a workaround for CMake versions prior to 3.1,
where the if command would interpret arguments as variables even when
quoted [1]. We can drop the workaround now that our minimum CMake
version is 3.4.
[1] https://cmake.org/cmake/help/v3.1/policy/CMP0054.html
Differential Revision: https://reviews.llvm.org/D40744
llvm-svn: 319723
Consistently use the same parameter names as the names of the affected
fields. This avoids some unintuitive abbreviations like `isSS`.
llvm-svn: 319722
While we cannot skip the whole TwoAddressInstructionPass even for -O0
there are some parts of the pass that are currently skipped at -O0 but
not for optnone. Changing this as there is no reason to have those two
hit different code paths here.
llvm-svn: 319721