Summary:
Teach the FunctionAttrs to do the right thing for IR with operand
bundles.
Reviewers: reames, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14408
llvm-svn: 252387
Summary:
This change fixes an iterator wraparound bug in
`determinePointerReadAttrs`.
Ideally, ++'ing off the `end()` of an iplist should result in a failed
assert, but currently iplist seems to silently wrap to the head of the
list on `end()++`. This is why the bad behavior is difficult to
demonstrate.
Reviewers: chandlerc, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14350
llvm-svn: 252386
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.
Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.
Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.
Reviewers: pgavlin, majnemer, rnk
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14344
llvm-svn: 252383
This reverts commit r252373, reapplying r252372 now that I've updated
clang-tools-extra. Original commit message follows.
ADT: Require explicit ilist iterator/pointer conversions
Disallow implicit conversions between ilist iterators and element
points. Explicit conversions still work of course.
This is the first step toward removing the undefined behaviour in
`ilist` and `iplist`:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
The motivation for removing the implicit iterators is that I came across
real bugs (that were *really* getting lucky). More details and some
brief discussion later in that thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091617.html
Note: if you have out-of-tree code, it should be fairly easy to revert
this patch downstream while you update your out-of-tree call sites.
Note that these conversions are occasionally latent bugs (that may
happen to "work" now, but only because of getting lucky with UB;
follow-ups will change your luck). When they are valid, I suggest using
`->getIterator()` to go from pointer to iterator, and `&*` to go from
iterator to pointer.
llvm-svn: 252380
FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if
there is an EHPad in the BB. Doing so would result in an instruction
inserted after a terminator.
llvm-svn: 252377
Disallow implicit conversions between ilist iterators and element
points. Explicit conversions still work of course.
This is the first step toward removing the undefined behaviour in
`ilist` and `iplist`:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091115.html
The motivation for removing the implicit iterators is that I came across
real bugs (that were *really* getting lucky). More details and some
brief discussion later in that thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/091617.html
Note: if you have out-of-tree code, it should be fairly easy to revert
this patch downstream while you update your out-of-tree call sites.
Note that these conversions are occasionally latent bugs (that may
happen to "work" now, but only because of getting lucky with UB;
follow-ups will change your luck). When they are valid, I suggest using
`->getIterator()` to go from pointer to iterator, and `&*` to go from
iterator to pointer.
llvm-svn: 252372
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress. This removes them.
I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).
llvm-svn: 252371
We tried to insert a cast of a phi in a block whose terminator is an
EHPad. This is invalid. Do not attempt the transform in these
circumstances.
llvm-svn: 252370
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12923
llvm-svn: 252368
Summary:
In general GetTempDir follows the same logic as the replaced code: checks env variables TMP, TEMP, USERPROFILE in order. However, it also perform other checks like making separators native (\), making the path absolute, etc.
This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables).
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
llvm-svn: 252366
We used to try to constant-fold them to i32 immediates.
Given that fast-isel doesn't otherwise support vNi1, when selecting
the result users, we'd fallback to SDAG anyway.
However, if the users were in another block, we'd insert broken
cross-class copies (GPR32 to FPR64).
Give up, let SDAG agree with itself on a vNi1 legalization strategy.
llvm-svn: 252364
When matching non-LSB-extracting truncating broadcasts, we now insert
the necessary SRL. If the scalar resulted from a load, the SRL will be
folded into it, creating a narrower, offset, load.
However, i16 loads aren't Desirable, so we get i16->i32 zextloads.
We already catch i16 aextloads; catch these as well.
llvm-svn: 252363
Now that we recognize this, we can support it instead of bailing out.
That is, we can fold:
(v8i16 (shufflevector
(v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
<1,1,...,1>))
into:
(v8i16 (vbroadcast (i16 (trunc (srl Y, 16)))))
llvm-svn: 252362
We used to incorrectly assume that the offset we're extracting from
was a multiple of the element size. So, we'd fold:
(v8i16 (shufflevector
(v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
<1,1,...,1>))
into:
(v8i16 (vbroadcast (i16 (trunc Y))))
whereas we should have extracted the higher bits from X.
Instead, bail out if the assumption doesn't hold.
llvm-svn: 252361
Summary:
Pass the VOPProfile object all the through to *_m multiclasses. This will
allow us to do more simplifications in the future.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D13437
llvm-svn: 252339
The SLPVectorizer had a very crude way of trying to benefit
from associativity: it tried to optimize for splat/broadcast
or in order to have the same operator on the same side.
This is benefitial to the cost model and allows more vectorization
to occur.
This patch improve the logic and make the detection optimal (locally,
we don't look at the full tree but only at the immediate children).
Should fix https://llvm.org/bugs/show_bug.cgi?id=25247
Reviewers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D13996
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 252337
All 3 operands of FMA3 instructions are commutable now.
Patch by Slava Klochkov
Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).
Differential Revision: http://reviews.llvm.org/D13269
llvm-svn: 252335
Modelling of the expression stack is evolving. This patch takes another
step by making pushes explicit.
Differential Revision: http://reviews.llvm.org/D14338
llvm-svn: 252334
Summary:
This change makes the `isImpliedCondition` interface similar to the rest
of the functions in ValueTracking (in that it takes a DataLayout,
AssumptionCache etc.). This is an NFC, intended to make a later diff
less noisy.
Depends on D14369
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14391
llvm-svn: 252333
Summary:
Currently `isImpliedCondition` will optimize "I +_nuw C < L ==> I < L"
only if C is positive. This is an unnecessary restriction -- the
implication holds even if `C` is negative.
Reviewers: reames, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14369
llvm-svn: 252332
Summary:
This change adds a framework for adding more smarts to
`isImpliedCondition` around inequalities. Informally,
`isImpliedCondition` will now try to prove "A < B ==> C < D" by proving
"C <= A && B <= D", since then it follows "C <= A < B <= D".
While this change is in principle NFC, I could not think of a way to not
handle cases like "i +_nsw 1 < L ==> i < L +_nsw 1" (that ValueTracking
did not handle before) while keeping the change understandable. I've
added tests for these cases.
Reviewers: reames, majnemer, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14368
llvm-svn: 252331
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.
For now this only detects the workitem intrinsics.
llvm-svn: 252323
For some reason VS_32 ends up factoring into the pressure heuristics
even though we should never see a virtual register with this class.
When SGPRs are reserved for register spilling, this for some reason
triggers reg-crit scheduling.
Setting isAllocatable = 0 may help with this since that seems to remove
it from the default implementation's generated table.
llvm-svn: 252321
Summary:
This reverts commit r251965.
Restore "Move metadata linking after lazy global materialization/linking."
This restores commit r251926, with fixes for the LTO bootstrapping bot
failure.
The bot failure was caused by references from debug metadata to
otherwise unreferenced globals. Previously, this caused the lazy linking
to link in their defs, which is unnecessary. With this patch, because
lazy linking is complete when we encounter the metadata reference, the
materializer created a declaration. For definitions such as aliases and
comdats, it is illegal to have a declaration. Furthermore, metadata
linking should not change code generation. Therefore, when linking of
global value bodies is complete, the materializer will simply return
nullptr as the new reference for the linked metadata.
This change required fixing a different test to ensure there was a
real reference to a linkonce global that was only being reference from
metadata.
Note that the new changes to the only-needed-named-metadata.ll test
illustrate an issue with llvm-link -only-needed handling of comdat
groups, whereby it may result in an incomplete comdat group. I note this
in the test comments, but the issue is orthogonal to this patch (it can
be reproduced without any metadata at head).
Reviewers: dexonsmith, rafael, tra
Subscribers: tobiasvk, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D14447
llvm-svn: 252320
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.
Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer
Subscribers: MatzeB, llvm-commits
Differential Revision: http://reviews.llvm.org/D14407
llvm-svn: 252318
The benefit from converting narrow loads into a wider load (r251438) could be
micro-architecturally dependent, as it assumes that a single load with two bitfield
extracts is cheaper than two narrow loads. Currently, this conversion is
enabled only in cortex-a57 on which performance benefits were verified.
llvm-svn: 252316
We now create the .eh_frame section early, just like every other special
section.
This means that the special flags are visible in code that explicitly
asks for ".eh_frame".
llvm-svn: 252313