1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00
Commit Graph

19078 Commits

Author SHA1 Message Date
Adrian Prantl
187f5790c8 SROA: Avoid creating a fragment expression that covers the entire variable.
Fixes PR35416.

https://bugs.llvm.org/show_bug.cgi?id=35416

llvm-svn: 319126
2017-11-28 00:57:53 +00:00
Davide Italiano
077affeb1f [Mem2Reg] Clang-format unformatted parts of this file. NFCI.
llvm-svn: 319097
2017-11-27 21:25:52 +00:00
Davide Italiano
21154a4d53 [SROA] Propagate !range metadata when moving loads.
This tries to propagate !range metadata to a pre-existing load
when a load is optimized out. This is done instead of adding an
assume because converting loads to and from assumes creates a
lot of IR.

Patch by Ariel Ben-Yehuda.

Differential Revision:  https://reviews.llvm.org/D37216

llvm-svn: 319096
2017-11-27 21:25:13 +00:00
Sanjay Patel
49d4f16628 [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result
This should fix PR31455:
https://bugs.llvm.org/show_bug.cgi?id=31455

Differential Revision: https://reviews.llvm.org/D28314

llvm-svn: 319094
2017-11-27 21:15:43 +00:00
Arnold Schwaighofer
8e8157c899 Inliner: Don't mark notail calls with the 'tail' attribute
enum TailCallKind { TCK_None = 0, TCK_Tail = 1, TCK_MustTail = 2,
                    TCK_NoTail = 3 };

TCK_NoTail is greater than TCK_Tail so taking the min does not do the
correct thing.

rdar://35639547

llvm-svn: 319075
2017-11-27 19:03:40 +00:00
Sanjay Patel
6cd3d6d6fe [InstCombine] use 'auto' with 'dyn_cast'; NFC
llvm-svn: 319067
2017-11-27 18:19:32 +00:00
Benjamin Kramer
0b9cc4f48d Make helpers static. NFC.
llvm-svn: 318953
2017-11-24 14:55:41 +00:00
Alexander Potapenko
500b98de47 MSan: remove an unnecessary cast. NFC for userspace instrumenetation.
llvm-svn: 318923
2017-11-23 15:06:51 +00:00
Alexander Potapenko
87d48e130a [MSan] Move the access address check before the shadow access for that address
MSan used to insert the shadow check of the store pointer operand
_after_ the shadow of the value operand has been written.
This happens to work in the userspace, as the whole shadow range is
always mapped. However in the kernel the shadow page may not exist, so
the bug may cause a crash.

This patch moves the address check in front of the shadow access.

llvm-svn: 318901
2017-11-23 08:34:32 +00:00
Max Kazantsev
21fb04396f [IRCE][NFC] Add no wrap flags to no-wrapping SCEV calculation
In a lambda where we expect to have result within bounds, add respective `nsw/nuw` flags to
help SCEV just in case if it fails to figure them out on its own.

Differential Revision: https://reviews.llvm.org/D40168

llvm-svn: 318898
2017-11-23 06:14:39 +00:00
Davide Italiano
b6d9bc9a26 [SCCP] Pick the right lattice value for constants.
After the dataflow algorithm proves that an argument is constant,
it replaces it value with the integer constant and drops the lattice
value associated to the DEF.

e.g. in the example we have @f() that's called twice:
call @f(undef, ...)
call @f(2, ...)

`undef` MEET 2 = 2 so we replace the argument and all its uses with
the constant 2.

Shortly after, tryToReplaceWithConstantRange() tries to get the lattice
value for the argument we just replaced, causing an assertion.
This function is a little peculiar as it runs when we're doing replacement
and not as part of the solver but still queries the solver.

The fix is that of checking whether we replaced the value already and
get a temporary lattice value for the constant.

Thanks to Zhendong Su for the report!

Fixes PR35357.

llvm-svn: 318817
2017-11-22 03:04:55 +00:00
Hans Wennborg
2ebede8b36 EntryExitInstrumenter: support __cyg_profile_func_enter_bare
It works just like __cyg_profile_func_enter but takes no arguments.

llvm-svn: 318783
2017-11-21 17:22:19 +00:00
Alina Sbirlea
77a10244b2 Add MemorySSA as loop dependency, disabled by default [NFC].
Summary:
First step in adding MemorySSA as dependency for loop pass manager.
Adding the dependency under a flag.

New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null.
Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default.

Reviewers: sanjoy, davide, gberry

Subscribers: mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D40274

llvm-svn: 318772
2017-11-21 15:45:46 +00:00
NAKAMURA Takumi
be536e28e1 SLPVectorizer.cpp: Avoid std::stable_sort(properlyDominates()).
properlyDominates() shouldn't be used as sort key. It causes different output between stdlibc++ and libc++.
Instead, I introduced RPOT. In most cases, it works for CSE.

llvm-svn: 318743
2017-11-21 09:41:01 +00:00
Davide Italiano
7f9b83e34d [SCCP] If we replace with a constant, we can't replace with a range.
This microoptimization is NFC.

llvm-svn: 318711
2017-11-21 00:21:52 +00:00
Vitaly Buka
d4eaab2abf [msan] Don't sanitize "nosanitize" instructions
Reviewers: eugenis

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40205

llvm-svn: 318708
2017-11-20 23:37:56 +00:00
Hiroshi Yamauchi
0596ae1e4a Add heuristics for irreducible loop metadata under PGO
Summary:
Add the following heuristics for irreducible loop metadata:

- When an irreducible loop header is missing the loop header weight metadata,
  give it the minimum weight seen among other headers.
- Annotate indirectbr targets with the loop header weight metadata (as they are
  likely to become irreducible loop headers after indirectbr tail duplication.)

These greatly improve the accuracy of the block frequency info of the Python
interpreter loop (eg. from ~3-16x off down to ~40-55% off) and the Python
performance (eg. unpack_sequence from ~50% slower to ~8% faster than GCC) due to
better register allocation under PGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39980

llvm-svn: 318693
2017-11-20 21:03:38 +00:00
Teresa Johnson
815197bb06 [SROA] Correctly invalidate analyses when dead instructions deleted
Summary:
SROA can fail in rewriting alloca but still rewrite a phi resulting
in dead instruction elimination. The Changed flag was not being set
correctly, resulting in downstream passes using stale analyses.
The included test case will assert during the second BDCE pass as a
result.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39921

llvm-svn: 318677
2017-11-20 18:33:38 +00:00
Evgeniy Stepanov
50c8d56daa [asan] Use dynamic shadow on 32-bit Android, try 2.
Summary:
This change reverts r318575 and changes FindDynamicShadowStart() to
keep the memory range it found mapped PROT_NONE to make sure it is
not reused. We also skip MemoryRangeIsAvailable() check, because it
is (a) unnecessary, and (b) would fail anyway.

Reviewers: pcc, vitalybuka, kcc

Subscribers: srhines, kubamracek, mgorny, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D40203

llvm-svn: 318666
2017-11-20 17:41:57 +00:00
Gil Rapaport
b76bf11f90 [LV] Model masking in VPlan, introducing VPInstructions
This patch adds a new abstraction layer to VPlan and leverages it to model the planned
instructions that manipulate masks (AND, OR, NOT), introduced during predication.

The new VPValue and VPUser classes model how data flows into, through and out
of a VPlan, forming the vertices of a planned Def-Use graph. The new
VPInstruction class is a generic single-instruction Recipe that models a
planned instruction along with its opcode, operands and users. See
VectorizationPlan.rst for more details.

Differential Revision: https://reviews.llvm.org/D38676

llvm-svn: 318645
2017-11-20 12:01:47 +00:00
Max Kazantsev
a11ed4d79f [IRCE] Smart range intersection
In rL316552, we ban intersection of unsigned latch range with signed range check and vice
versa, unless the entire range check iteration space is known positive. It was a correct
functional fix that saved us from dealing with ambiguous values, but it also appeared
to be a very restrictive limitation. In particular, in the following case:

  loop:
    %iv = phi i32 [ 0, %preheader ], [ %iv.next, %latch]
    %iv.offset = add i32 %iv, 10
    %rc = icmp slt i32 %iv.offset, %len
    br i1 %rc, label %latch, label %deopt

  latch:
    %iv.next = add i32 %iv, 11
    %cond = icmp i32 ult %iv.next, 100
    br it %cond, label %loop, label %exit

Here, the unsigned iteration range is `[0, 100)`, and the safe range for range
check is `[-10, %len - 10)`. For unsigned iteration spaces, we use unsigned
min/max functions for range intersection. Given this, we wanted to avoid dealing
with `-10` because it is interpreted as a very big unsigned value. Semantically, range
check's safe range goes through unsigned border, so in fact it is two disjoint
ranges in IV's iteration space. Intersection of such ranges is not trivial, so we prohibited
this case saying that we are not allowed to intersect such ranges.

What semantics of this safe range actually means is that we can start from `-10` and go
up increasing the `%iv` by one until we reach `%len - 10` (for simplicity let's assume that
`%len - 10`  is a reasonably big positive value).

In particular, this safe iteration space includes `0, 1, 2, ..., %len - 11`. So if we were able to return
safe iteration space `[0, %len - 10)`, we could safely intersect it with IV's iteration space. All
values in this range are non-negative, so using signed/unsigned min/max for them is unambiguous.

In this patch, we alter the algorithm of safe range calculation so that it returnes a subset of the
original safe space which is represented by one continuous range that does not go through wrap.
In order to reach this, we use modified SCEV substraction function. It can be imagined as a function
that substracts by `1` (or `-1`) as long as the further substraction does not cause a wrap in IV iteration
space. This allows us to perform IRCE in many situations when we deal with IV space and range check
of different types (in terms of signed/unsigned).

We apply this approach for both matching and not matching types of IV iteration space and the
range check. One implication of this is that now IRCE became smarter in detection of empty safe
ranges. For example, in this case:
  loop:
    %iv = phi i32 [ %begin, %preheader ], [ %iv.next, %latch]
    %iv.offset = sub i32 %iv, 10
    %rc = icmp ult i32 %iv.offset, %len
    br i1 %rc, label %latch, label %deopt

  latch:
    %iv.next = add i32 %iv, 11
    %cond = icmp i32 ult %iv.next, 100
    br it %cond, label %loop, label %exit

If `%len` was less than 10 but SCEV failed to trivially prove that `%begin - 10 >u %len- 10`,
we could end up executing entire loop in safe preloop while the main loop was still generated,
but never executed. Now, cutting the ranges so that if both `begin - 10` and `%len - 10` overflow,
we have a trivially empty range of `[0, 0)`. This in some cases prevents us from meaningless optimization.

Differential Revision: https://reviews.llvm.org/D39954

llvm-svn: 318639
2017-11-20 06:07:57 +00:00
Sanjay Patel
cbc52bc0ce [LibCallSimplifier] allow splat vectors for pow(x, 0.5) -> sqrt() transforms
llvm-svn: 318629
2017-11-19 16:42:27 +00:00
Sanjay Patel
18a3273e37 [LibCallSimplifier] partly fix pow(x, 0.5) -> sqrt() transforms
As the first test shows, we could transform an llvm intrinsic which never sets errno 
into a libcall which could set errno (even though it's marked readnone?), so that's 
not ideal.

It's possible that we can also transform a libcall which could set errno to an
intrinsic given the fast-math-flags constraint, but that's deferred to determine
exactly which set of FMF are needed.

Differential Revision: https://reviews.llvm.org/D40150

llvm-svn: 318628
2017-11-19 16:13:14 +00:00
Florian Hahn
2c20f84218 [CallSiteSplitting] Remove some indirection (NFC).
Summary:
With this patch I tried to reduce the complexity of the code sightly, by
removing some indirection. Please let me know what you think.

Reviewers: junbuml, mcrosier, davidxl

Reviewed By: junbuml

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40037

llvm-svn: 318593
2017-11-18 18:14:13 +00:00
Walter Lee
1ec8be36b7 [asan] Add a full redzone after every stack variable
We were not doing that for large shadow granularity.  Also add more
stack frame layout tests for large shadow granularity.

Differential Revision: https://reviews.llvm.org/D39475

llvm-svn: 318581
2017-11-18 01:13:18 +00:00
Evgeniy Stepanov
a3fd6382fa Revert "[asan] Use dynamic shadow on 32-bit Android" and 3 more.
Revert the following commits:
  r318369 [asan] Fallback to non-ifunc dynamic shadow on android<22.
  r318235 [asan] Prevent rematerialization of &__asan_shadow.
  r317948 [sanitizer] Remove unnecessary attribute hidden.
  r317943 [asan] Use dynamic shadow on 32-bit Android.

MemoryRangeIsAvailable() reads /proc/$PID/maps into an mmap-ed buffer
that may overlap with the address range that we plan to use for the
dynamic shadow mapping. This is causing random startup crashes.

llvm-svn: 318575
2017-11-18 00:22:34 +00:00
Jun Bum Lim
515421b8ee [LICM] Fix PR35342
Summary: This change fix PR35342 by replacing only the current use with undef in unreachable blocks.

Reviewers: efriedma, mcrosier, igor-laevsky

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40184

llvm-svn: 318551
2017-11-17 20:38:25 +00:00
Chandler Carruth
74564c24f8 [PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,
making it no longer even remotely simple.

The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.

The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop to
   outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
   unswitched branches as well as avoiding excessively cloning and
   reliance on complex parts of simplify-cfg to cleanup the cfg.
3) Update the analyses (where we can) rather than just blowing them away
   or relying on something else updating them.

Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.

This approach also adds an important principls specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentialy partition loops will work regardless of the total loop
size.

Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.

Some high-level, interesting limitationsV that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
  deleted. In fact, we should be able to delete *nothing* and do
  a minimal number of clones.
- There are many more interesting selection criteria for which branch to
  unswitch that we might want to look at. One that I'm interested in
  particularly are a set of conditions which all exit the loop and which
  can be merged into a single unswitched test of them.

Differential revision: https://reviews.llvm.org/D34200

llvm-svn: 318549
2017-11-17 19:58:36 +00:00
Max Kazantsev
a35b3bc759 [IRCE] Remove folding of two range checks into RANGE_CHECK_BOTH
The logic of replacing of a couple `RANGE_CHECK_LOWER + RANGE_CHECK_UPPER`
into `RANGE_CHECK_BOTH` in fact duplicates the logic of range intersection which
happens when we calculate safe iteration space. Effectively, the result of intersection of
these ranges doesn't differ from the range of merged range check.

We chose to remove duplicating logic in favor of code simplicity.

Differential Revision: https://reviews.llvm.org/D39589

llvm-svn: 318508
2017-11-17 06:49:26 +00:00
David Blaikie
e01dc73ad2 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Mandeep Singh Grang
feb35af5f1 [PredicateInfo] Add comment about why we require stable sort
llvm-svn: 318487
2017-11-17 00:43:24 +00:00
Walter Lee
7805cc6517 [asan] Fix small X86_64 ShadowOffset for non-default shadow scale
The requirement is that shadow memory must be aligned to page
boundaries (4k in this case).  Use a closed form equation that always
satisfies this requirement.

Differential Revision: https://reviews.llvm.org/D39471

llvm-svn: 318421
2017-11-16 17:03:00 +00:00
Sanjay Patel
805088333a [InstCombine] include 'sub' in the list of narrow-able binops
// trunc (binop X, C) --> binop (trunc X, C')
      // trunc (binop (ext X), Y) --> binop X, (trunc Y)

I'm grouping sub with the other binops  because that makes the code simpler
and the transforms are valid:
https://rise4fun.com/Alive/UeF
...so even though we don't expect a sub with constant Op1 or any of the
other opcodes with constant Op0 due to canonicalization rules, we might as
well handle those situations if non-canonical code somehow reaches this
point (it should just make instcombine more efficient in reaching its
end goal).

This should solve the problem that later manifests in the vectorizers in 
PR35295:
https://bugs.llvm.org/show_bug.cgi?id=35295

llvm-svn: 318404
2017-11-16 14:40:51 +00:00
Walter Lee
44ebe18b8a [asan] Fix size/alignment issues with non-default shadow scale
Fix a couple places where the minimum alignment/size should be a
function of the shadow granularity:
- alignment of AllGlobals
- the minimum left redzone size on the stack

Added a test to verify that the metadata_array is properly aligned
for shadow scale of 5, to be enabled when we add build support
for testing shadow scale of 5.

Differential Revision: https://reviews.llvm.org/D39470

llvm-svn: 318395
2017-11-16 12:57:19 +00:00
Max Kazantsev
4a3238b17b [IRCE] Fix SCEVExpander's usage in IRCE
When expanding exit conditions for pre- and postloops, we may end up expanding a
recurrency from the loop to in its loop's preheader. This produces incorrect IR.

This patch ensures that IRCE uses SCEVExpander correctly and only expands code which
is safe to expand in this particular location.

Differentian Revision: https://reviews.llvm.org/D39234

llvm-svn: 318381
2017-11-16 06:06:27 +00:00
Evgeniy Stepanov
5820f2fc39 [asan] Fallback to non-ifunc dynamic shadow on android<22.
Summary: Android < 22 does not support ifunc.

Reviewers: pcc

Subscribers: srhines, kubamracek, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40116

llvm-svn: 318369
2017-11-16 02:52:19 +00:00
Craig Topper
b7aaba0874 [GVNHoist] Fix a signed/unsigned comparison warning that occurs in 32-bit builds with gcc.
std::distance returns ptrdiff_t which is signed. 64-bit builds don't notice because type promotion widens the unsigned first.

llvm-svn: 318354
2017-11-16 00:19:59 +00:00
Sanjay Patel
d0c3452c1c [InstCombine] trunc (binop X, C) --> binop (trunc X, C')
Note that one-use and shouldChangeType() are checked ahead of the switch.

Without the narrowing folds, we can produce inferior vector code as shown in PR35299:
https://bugs.llvm.org/show_bug.cgi?id=35299

llvm-svn: 318323
2017-11-15 19:12:01 +00:00
Reid Kleckner
b527562cd4 [InstCombine] Salvage debug info during initial DCE
InstCombine salvages debug info for every instruction it erases from its
worklist, but it wasn't doing it during its initial DCE when populating
its worklist. This fixes that.

This should help improve availability of 'this' in optimized debug info
when casts are necessary.

llvm-svn: 318320
2017-11-15 18:51:12 +00:00
Adam Nemet
01d159b6a1 [SLP] Added more missed optimization remarks
Summary:
Added more remarks to SLP pass, in particular "missed" optimization remarks.
Also proposed several tests for new functionality.

Patch by Vladimir Miloserdov!

For reference you may look at: https://reviews.llvm.org/rL302811

Reviewers: anemet, fhahn

Reviewed By: anemet

Subscribers: javed.absar, lattner, petecoup, yakush, llvm-commits

Differential Revision: https://reviews.llvm.org/D38367

llvm-svn: 318307
2017-11-15 17:04:53 +00:00
Sanjay Patel
7b98bb7dd7 [Reassociate] simplify code; NFCI
llvm-svn: 318298
2017-11-15 16:19:17 +00:00
Craig Topper
dcd7058011 [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition.
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.

As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39999

llvm-svn: 318267
2017-11-15 05:23:02 +00:00
Hans Wennborg
4937b695da Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It crashes building sqlite; see reply on the llvm-commits thread.

> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
>         Patch tries to improve vectorization of the following code:
>
>         void add1(int * __restrict dst, const int * __restrict src) {
>           *dst++ = *src++;
>           *dst++ = *src++ + 1;
>           *dst++ = *src++ + 2;
>           *dst++ = *src++ + 3;
>         }
>         Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
>         Fixed issues related to previous commit.
>
>         Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
>         Reviewed By: ABataev, RKSimon
>
>         Subscribers: llvm-commits, RKSimon
>
>         Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318239
2017-11-15 00:38:13 +00:00
Craig Topper
4e0cd496e5 [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.

PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch.

Fixes PR35210.

Differential Revision: https://reviews.llvm.org/D40035

llvm-svn: 318237
2017-11-15 00:22:42 +00:00
Evgeniy Stepanov
0f4fe8b8dd [asan] Prevent rematerialization of &__asan_shadow.
Summary:
In the mode when ASan shadow base is computed as the address of an
external global (__asan_shadow, currently on android/arm32 only),
regalloc prefers to rematerialize this value to save register spills.
Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant
pool entry).

This changes adds an inline asm in the function prologue to suppress
this behavior. It reduces AsanTest binary size by 7%.

Reviewers: pcc, vitalybuka

Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40048

llvm-svn: 318235
2017-11-15 00:11:51 +00:00
Davide Italiano
7a04e78909 [EntryExitInstrumenter] Placate GCC, the semicolon is redundant. NFCI.
llvm-svn: 318217
2017-11-14 23:13:38 +00:00
Sanjay Patel
5cbaa253c7 [Reassociate] use dyn_cast instead of isa+cast; NFCI
llvm-svn: 318212
2017-11-14 23:03:56 +00:00
Reid Kleckner
2e4f96b6e5 Make salvageDebugInfo of casts work for dbg.declare and dbg.addr
Summary:
Instcombine (and probably other passes) sometimes want to change the
type of an alloca. To do this, they generally create a new alloca with
the desired type, create a bitcast to make the new pointer type match
the old pointer type, replace all uses with the cast, and then simplify
the casts. We already knew how to salvage dbg.value instructions when
removing casts, but we can extend it to cover dbg.addr and dbg.declare.

Fixes a debug info quality issue uncovered in Chromium in
http://crbug.com/784609

Reviewers: aprantl, vsk

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40042

llvm-svn: 318203
2017-11-14 21:49:06 +00:00
Hans Wennborg
7bd42058c3 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Dinar Temirbulatov
7ad2acdfcd [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
        void add1(int * __restrict dst, const int * __restrict src) {
          *dst++ = *src++;
          *dst++ = *src++ + 1;
          *dst++ = *src++ + 2;
          *dst++ = *src++ + 3;
        }
        Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
        Fixed issues related to previous commit.
    
        Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
        Reviewed By: ABataev, RKSimon
    
        Subscribers: llvm-commits, RKSimon
    
        Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318193
2017-11-14 20:55:08 +00:00