This patch adds support for SCEVUMinExpr to getRangeRef,
similar to the support for SCEVUMaxExpr.
Reviewers: sanjoy.google, efriedma, reames, nikic
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D67177
llvm-svn: 371768
Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load.
This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely.
Differential Revision: https://reviews.llvm.org/D67372
llvm-svn: 371745
In MVE, as of rL371218, we are attempting to sink chains of instructions such as:
%l1 = insertelement <8 x i8> undef, i8 %l0, i32 0
%broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer
In certain situations though, we can end up breaking the dominance relations of
instructions. This happens when we sink the instruction into a loop, but cannot
remove the originals. The Use is updated, which might in fact be a Use from the
second instruction to the first.
This attempts to fix that by reversing the order of instruction that are sunk,
and ensuring that we update the uses on new instructions if they have already
been sunk, not the old ones.
Differential Revision: https://reviews.llvm.org/D67366
llvm-svn: 371743
Folding for fma/fmuladd was added here:
rL202914
...and as seen in existing/unchanged tests, that works to propagate NaN
if it's already an input, but we should fold an fma() that creates NaN too.
From IEEE-754-2008 7.2 "Invalid Operation", there are 2 clauses that apply
to fma, so I added tests for those patterns:
c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c)
unless c is a quiet NaN; if c is a quiet NaN then it is implementation
defined whether the invalid operation exception is signaled
d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of
infinities, such as: addition(+∞, −∞)
Differential Revision: https://reviews.llvm.org/D67446
llvm-svn: 371735
IRTranslator creates G_DYN_STACKALLOC instruction during expansion of
alloca when argument that tells number of elements to allocate on stack
is a virtual register. Use default lowering for MIPS32.
Differential Revision: https://reviews.llvm.org/D67440
llvm-svn: 371728
G_IMPLICIT_DEF is used for both integer and floating point implicit-def.
Handle G_IMPLICIT_DEF as ambiguous opcode in MipsRegisterBankInfo.
Select G_IMPLICIT_DEF for MIPS32.
Differential Revision: https://reviews.llvm.org/D67439
llvm-svn: 371727
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.
llvm-svn: 371722
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.
This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.
The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..
Proofs: https://rise4fun.com/Alive/21b
Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67411
llvm-svn: 371718
Current implementation of estimating divisions loses precision since it
estimates reciprocal first and does multiplication. This patch is to re-order
arithmetic operations in the last iteration in DAGCombiner to improve the
accuracy.
Reviewed By: Sanjay Patel, Jinsong Ji
Differential Revision: https://reviews.llvm.org/D66050
llvm-svn: 371713
Fixing a couple of asan-identified bugs
* use of an invalid "Use" iterator after the element was removed
* use of StringRef to Function name after the Function was erased
This reapplies r371567, which was reverted in r371580.
llvm-svn: 371700
This modifies the tool somewhat to only create files when about to run
the "interestingness" test, and delete them immediately after - this
means some more files will be created sometimes (when "double checking"
work - which should probably be fixed/avoided anyway).
This now creates temporary files, rather than only unique ones, and also
uses ToolOutputFile (without ever calling "keep") to ensure the files
are deleted as soon as the interestingness test is run.
llvm-svn: 371696
AVX512 instructions can cause a frequency drop on these CPUs. This
can negate the performance gains from using wider vectors. Enabling
prefer-vector-width=256 will prevent generation of zmm registers
unless explicit 512 bit operations are used in the original source
code.
I believe gcc and icc both do something similar to this by default.
Differential Revision: https://reviews.llvm.org/D67259
llvm-svn: 371694
First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually
make a difference for us in CallLowering but fix that anyway to silence the assert.
The bigger issue was that after fixing the assert we were generating invalid MIR
because the merging/unmerging of values split across multiple registers wasn't
also implemented for memory locs. This happens when we run out of registers and
have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but
for now just fall back.
llvm-svn: 371693
Before, we only checked the callee for swifterror. However, we should also be
checking the caller to see if it has a swifterror parameter.
Since we don't currently handle outgoing arguments, this didn't show up in the
swifterror.ll testcase.
Also, remove the swifterror checks from call-translator-tail-call.ll, since
they are covered by the existing swifterror testing. Better to have it all in
one place.
Differential Revision: https://reviews.llvm.org/D67465
llvm-svn: 371692
I found three issues:
1. the loop over E[ABCD]X copies run over BB start
2. the direct address of cmpxchg8b could be a frame index
3. the displacement of cmpxchg8b could be a global instead of an
immediate
These were all introduced together in r287875, and should be fixed with
this change.
Issue reported by Zachary Turner.
llvm-svn: 371678
I think this case would crash before I added back the -x86-experimental-vector-widening command line option. Adding this test case to prevent breaking it again when we remove the option.
llvm-svn: 371673
fp128 is considered a legal type for a register, but has almost no legal operations so everything needs to be converted to a libcall. Previously this was implemented by tricking type legalization into softening the operations with various checks for "is legal in hardware register" to change the behavior to still use f128 as the resulting type instead of converting to i128.
This patch abandons this approach and instead moves the libcall conversions to LegalizeDAG. This is the approach taken by AArch64 where they also have a legal fp128 type, but no legal operations. I think this is more in spirit with how SelectionDAG's phases are supposed to work.
I had to make some hacks for STRICT_FP_ROUND because some of the strict FP handling checks if ISD::FP_ROUND is Legal for a given result type, but I had to make ISD::FP_ROUND Custom to allow making a libcall when the input is f128. For all other types the Custom handler just returns the original node. These hacks are incomplete and don't work for a strict truncate from f128, but I don't think it worked before either since LegalizeFloatTypes doesn't know about strict ops yet. I've also raised PR43209 against AArch64 which currently crashes on a strict ftrunc from f64->f32 because of FP_ROUND being marked Custom for the same reason there.
Differential Revision: https://reviews.llvm.org/D67128
llvm-svn: 371672
Summary:
After hoisting and merging m0 initializations schedule them as early as
possible in the MBB. This helps the scheduler avoid hazards in some
cases.
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67450
llvm-svn: 371671
Emit debug entry values using standard DWARF5 opcodes when the debugger
tuning is set to lldb.
Differential Revision: https://reviews.llvm.org/D67410
llvm-svn: 371666
Summary:
During IR Linking, if the types of two globals in destination and source
modules are the same, it can only be because the global in the
destination module is originally from the source module and got added to
the destination module from a shared metadata.
We shouldn't map this type to itself in case the type's components get
remapped to a new type from the destination (for instance, during the
loop over SrcM->getIdentifiedStructTypes() further below in
IRLinker::computeTypeMapping()).
Fixes PR40312.
Reviewers: tejohnson, pcc, srhines
Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66814
llvm-svn: 371643
If there are multiple dead defs of the same virtual register, these
are required to be split into multiple virtual registers with separate
live intervals to avoid a verifier error.
llvm-svn: 371640
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.
A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect
Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371635
This introduces additional rounding error in some cases. See D67434.
This reverts r371518 (git commit 18a1f0818b659cee13865b4fad2648d85984a4ed)
llvm-svn: 371634
This was actually the original intention in D67332,
but i messed up and forgot about it.
This patch was originally part of D67411, but precommitting this.
llvm-svn: 371630
adding new read attribute to an argument
Summary: Update optimization pass to prevent adding read-attribute to an
argument without removing its conflicting attribute.
A read attribute, based on the result of the attribute deduction
process, might be added to an argument. The attribute might be in
conflict with other read/write attribute currently associated with the
argument. To ensure the compatibility of attributes, conflicting
attribute, if any, must be removed before a new one is added.
The following snippet shows the current behavior of the compiler, where
the compilation process is aborted due to incompatible attributes.
$ cat x.ll
; ModuleID = 'x.bc'
%_type_of_d-ccc = type <{ i8*, i8, i8, i8, i8 }>
@d-ccc = internal global %_type_of_d-ccc <{ i8* null, i8 1, i8 13, i8 0,
i8 -127 }>, align 8
define void @foo(i32* writeonly %.aaa) {
foo_entry:
%_param_.aaa = alloca i32*, align 8
store i32* %.aaa, i32** %_param_.aaa, align 8
store i8 0, i8* getelementptr inbounds (%_type_of_d-ccc,
%_type_of_d-ccc* @d-ccc, i32 0, i32 3)
ret void
}
$ opt -O3 x.ll
Attributes 'readnone and writeonly' are incompatible!
void (i32*)* @foo
in function foo
LLVM ERROR: Broken function found, compilation aborted!
The purpose of this changeset is to fix the above error. This fix is
based on a suggestion from Johannes @jdoerfert (many thanks!!!)
Authored By: anhtuyen
Reviewer: nicholas, rnk, chandlerc, jdoerfert
Reviewed By: rnk
Subscribers: hiraditya, jdoerfert, llvm-commits, anhtuyen, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D58694
llvm-svn: 371622
(srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking
off the sign bit and the module (low) bits:
https://rise4fun.com/Alive/jSO
A '2' divisor allows slightly more folding:
https://rise4fun.com/Alive/tDBM
Any chance to remove an 'srem' use is probably worthwhile, but this is limited
to the one-use improvement case because doing more may expose other missing
folds. That means it does nothing for PR21929 yet:
https://bugs.llvm.org/show_bug.cgi?id=21929
Differential Revision: https://reviews.llvm.org/D67334
llvm-svn: 371610
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference,
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67433
llvm-svn: 371608
When value of immediate in `mips.nori.b` is 255 (which has all ones in
binary form as 8bit integer) DAGCombiner and Legalizer would fall in an
infinite loop. DAGCombiner would try to simplify `or %value, -1` by
turning `%value` into UNDEF. Legalizer will turn it back into `Constant<0>`
which would then be again turned into UNDEF by DAGCombiner. To avoid this
loop we make UNDEF legal for MSA int types on Mips.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D67280
llvm-svn: 371607