Since we use AddVectoredExceptionHandler, we get notified of
every exception that gets raised by a program. Sometimes these
are not necessarily errors though, and this can be especially
true when linking against a library that we have no control
over, and may raise an exception internally which it intends
to catch.
In particular, the Windows API OutputDebugString does exactly
this. It raises an exception inside of a __try / __except,
giving the debugger a chance to handle the exception to print
the message to the debug console.
But this doesn't interoperate nicely with our vectored exception
handler, which just sees another exception and decides that we
need to terminate the program.
Add a special case for this so that we ignore ODS exceptions
and continue normally.
Note that a better fix is to simply not use vectored exception
handlers and use SEH instead, but given that MinGW doesn't support
SEH, this is the only solution for MinGW.
Differential Revision: https://reviews.llvm.org/D33260
llvm-svn: 303219
The operator-> implementation comes from iterator_facade_base, so it should
just work given that the iterator has a tested operator*. But r302257 showed
that required careful handling of for the const qualifier. This patch ensures
the fix in r302257 doesn't regress.
Differential Revision: https://reviews.llvm.org/D33249
llvm-svn: 303215
We would eventually catch these via demanded bits and computing known bits in InstCombine,
but I think it's better to handle the simple cases as soon as possible as a matter of efficiency.
This fold allows further simplifications based on distributed ops transforms. eg:
%a = lshr i8 %x, 7
%b = or i8 %a, 2
%c = and i8 %b, 1
InstSimplify can directly fold this now:
%a = lshr i8 %x, 7
Differential Revision: https://reviews.llvm.org/D33221
llvm-svn: 303213
CTLZ idiom recognition (r303102).
Summary:
The following case:
i = 1;
if(n)
while (n >>= 1)
i++;
use(i);
Was converted to:
i = 1;
if(n)
i += builtin_ctlz(n >> 1, false);
use(i);
Which is not correct. The patch make it:
i = 1;
if(n)
i += builtin_ctlz(n >> 1, true);
use(i);
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 303212
Update threshold based on callee's hotness only when BFI is not available.
Otherwise use only callsite's hotness. This makes it easier to reason about
hotness related threshold updates.
Differential revision: https://reviews.llvm.org/D33157
llvm-svn: 303210
Summary:
This fixes pr32392.
The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.
The reason why expandPostRAPseudo is chosen is because previous passes
are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne-
7, .+4 (some branch pass(s)).
Differential Revision: https://reviews.llvm.org/D32763
llvm-svn: 303205
ProfileSummaryInfo already checks whether the module has sample profile
in determining profile counts. This will also be useful in inliner to
clean up threshold updates.
llvm-svn: 303204
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.
Test notes:
* Many testcases overwrite store addresses multiple times and needed
minor changes, mainly making stores volatile to prevent the
optimization from optimizing the test away.
* Many X86 test cases optimized out instructions associated with
associated with va_start.
* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
dependencies to check and can probably be removed and potentially
replaced with another test.
Reviewers: rnk, john.brawn
Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33206
llvm-svn: 303198
The referenced tests are derived from:
https://bugs.llvm.org/show_bug.cgi?id=32791
and:
https://reviews.llvm.org/D33172
The motivation for including negative tests may not be clear, so I'm adding an explanatory comment here.
In the post-commit thread for r303133:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170515/453793.html
...it was mentioned that we don't want to add redundant tests. This is a valid point. But in this case,
we have a patch under review (D33172) that demonstrates that no existing regression tests are affected by
a proposed code change, but these are. Therefore, I think these tests have value not visible in any
existing regression tests regardless of whether they show a transform.
Differential Revision: https://reviews.llvm.org/D33242
llvm-svn: 303185
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:
1. Caches the info for the second stage when we schedule with
decreased target occupancy.
2. Tracks the basic block from top to bottom thus eliminating the
need to scan whole register file liveness at every region split
in the middle of the block.
The scheduling is now done in 3 stages instead of two, with the first
one being really a no-op and only used to collect scheduling regions
as sent by the scheduler driver.
There is no functional change to the current behavior, only compilation
speed is affected. In general computeBlockPressure() could be simplified
if we switch to backward RP tracker, because scheduler sends regions
within a block starting from the last upward. We could use a natural
order of upward tracker to seamlessly change between regions of the same
block, since live reg set of a previous tracked region would become a
live-out of the next region. That however requires fixing upward tracker
to properly account defs and uses of the same instruction as both are
contributing to the current pressure. When we converge on the produced
pressure we should be able to switch between them back and forth. In
addition, backward tracker is less expensive as it uses LIS in recede
less often than forward uses it in advance.
At the moment the worst known case compilation time has improved from 26
minutes to 8.5.
Differential Revision: https://reviews.llvm.org/D33117
llvm-svn: 303184
According to Intel's Optimization Reference Manual for SNB+:
" For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
dispatch via port 1:
- LEA that has all three source operands: base, index, and offset
- LEA that uses base and index registers where the base is EBP, RBP,or R13
- LEA that uses RIP relative addressing mode
- LEA that uses 16-bit addressing mode "
This patch currently handles the first 2 cases only.
Differential Revision: https://reviews.llvm.org/D32277
llvm-svn: 303183
This factors register pressure estimation mechanism from the
GCNSchedStrategy into the forward tracker to unify interface
with other strategies and expose it to other interested phases.
Differential Revision: https://reviews.llvm.org/D33105
llvm-svn: 303179
Summary:
RewritePHIs algorithm used in building of CoroFrame inserts a placeholder
```
%placeholder = phi [%val]
```
on every edge leading to a block starting with PHI node with multiple incoming edges,
so that if one of the incoming values was spilled and need to be reloaded, we have a
place to insert a reload. We use SplitEdge helper function to split the incoming edge.
SplitEdge function does not deal with unwind edges comping into a block with an EHPad.
This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge.
For landing pads, we clone the landing pad into every edge block and replace the original
landing pad with a PHI collection the values from all incoming landing pads.
For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleapret in the
edge blocks.
Reviewers: majnemer, rnk
Reviewed By: majnemer
Subscribers: EricWF, llvm-commits
Differential Revision: https://reviews.llvm.org/D31845
llvm-svn: 303172
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.
Suggested during review of D33184.
llvm-svn: 303163
The information collected when requested by -time-passes is only printed when
llvm_shutdown is called at the moment. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and thin LTO to explicitly print and
reset the timing information.
I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.
Reviewed by: mehdi_amini
Differential Revision: https://reviews.llvm.org/D32803
llvm-svn: 303152
The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance of loops. This can
lead us to the following buggy situation:
for (...) { // loop1
op1 = {A,+,B}
}
for (...) { // loop2
op2 = {A,+,B}
S = add op1, op2
}
In this case there is no guarantee that in operand list of S the op2 comes
before op1 (loop depth is the same, so they will be sorted just
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.
This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrency
in the operands order, it will be the bottom-most in terms of domination tree.
The attached test set includes some tests that produce incorrect SCEV
estimations and crashes with oldlogic.
Reviewers: sanjoy, reames, apilipenko, anna
Reviewed By: sanjoy
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D33121
llvm-svn: 303148