1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00
Commit Graph

1531 Commits

Author SHA1 Message Date
Anna Thomas
146a9e87c5 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Jonas Paulsson
16443bd463 [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM
After recent improvements which makes better use of LOC instead of IPM, the
TTI cost functions also needs to be updated to reflect this.

This involves sext, zext and xor of i1.

The tests were updated so that for z13 the new costs are expected, while the
old costs are still checked for on zEC12.

Review: Ulrich Weigand
https://reviews.llvm.org/D51339

llvm-svn: 342207
2018-09-14 06:46:55 +00:00
Peter Collingbourne
b6e7ff238c Prevent Constant Folding From Optimizing inrange GEP
This patch does the following things:

1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve inrange arribute;
2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it;
3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp;
4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D51698

llvm-svn: 341888
2018-09-11 01:53:36 +00:00
Philip Reames
0608255adc [AST] Add test coverage of memsets
Immediately after posting https://reviews.llvm.org/D51895, I noticed a small bug.  These tests would have caught that.

llvm-svn: 341880
2018-09-10 23:14:30 +00:00
Philip Reames
7578ed61eb [AST] Visit memtransfer arguments in order
The only point to this change is the test diffs.  When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs.

As an aside, the fact out tests are AST construction order dependent is not great.  I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways.

Philip

llvm-svn: 341841
2018-09-10 16:00:27 +00:00
Tim Northover
0f17fdc4c5 InstCombine: move hasOneUse check to the top of foldICmpAddConstant
There were two combines not covered by the check before now, neither of which
actually differed from normal in the benefit analysis.

The most recent seems to be because it was just added at the top of the
function (naturally). The older is from way back in 2008 (r46687) when we just
didn't put those checks in so routinely, and has been diligently maintained
since.

llvm-svn: 341831
2018-09-10 14:26:44 +00:00
Matt Arsenault
6815ea5d8c AMDGPU: Fix tests using old number for constant address space
llvm-svn: 341770
2018-09-10 02:54:25 +00:00
Philip Reames
6fce828d58 [AST] Generalize argument specific aliasing
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch.  It was originally part of this one, but separate for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713
2018-09-07 21:36:11 +00:00
Nicola Zaghen
79c63f731b [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Nicolai Haehnle
6b4a3db968 Move test/Analysis/DivergenceAnalysis/AMDGPU/loads.ll
Should fix failures of buildbots that don't build the AMDGPU backend.

Change-Id: I01cb84b4b47803b10c5b21ea0353546239860a51
llvm-svn: 341079
2018-08-30 15:24:00 +00:00
Nicolai Haehnle
5a64e70379 [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Sanjay Patel
b35900e555 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Kit Barton
5fe828ca5e [PPC] Remove Darwin support from POWER backend.
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

llvm-svn: 340795
2018-08-28 01:18:29 +00:00
Craig Topper
bc7a9bd514 [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.
Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51267

llvm-svn: 340708
2018-08-26 18:47:44 +00:00
John Brawn
ea0c076dc1 [PhiValues] Use callback value handles to invalidate deleted values
The way that PhiValues is integrated with BasicAA it is possible for a pass
which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without
intending to, and then delete values from a function in a way that causes
PhiValues to return dangling pointers to these deleted values. Fix this by
having a set of callback value handles to invalidate values when they're
deleted.

llvm-svn: 340613
2018-08-24 15:48:30 +00:00
Brian Homerding
7548e9b485 [FunctionAttrs] Infer WriteOnly Function Attribute
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.

Reviewers: hfinkel, jdoerfert

Differential Revision: https://reviews.llvm.org/D48387

llvm-svn: 340537
2018-08-23 15:05:22 +00:00
Philip Reames
05de5c7fdc [AST] Add a test for attribute intersection
Already works, but I initially convinced myself it doesn't, so add a test which shows it does.  :)

llvm-svn: 340453
2018-08-22 21:10:56 +00:00
Scott Linder
f19b5a445f [AMDGPU] Consider loads from flat addrspace to be potentially divergent
In general we can't assume flat loads are uniform, and cases where we can prove
they are should be handled through infer-address-spaces.

Differential Revision: https://reviews.llvm.org/D50991

llvm-svn: 340343
2018-08-21 21:24:31 +00:00
Philip Reames
8217861627 [AST] Remove notion of volatile from alias sets [NFCI]
Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for..

llvm-svn: 340312
2018-08-21 17:59:11 +00:00
Sanjay Patel
c684399bd3 [ConstantFolding] improve folding of binops with vector undef operand
A non-undef operand may still have undef constant elements, 
so we should always propagate the vector results per-lane.

llvm-svn: 340194
2018-08-20 18:19:02 +00:00
Sanjay Patel
eaa473e7b1 [ConstantFolding] add tests for binops on vectors with undef elements; NFC
llvm-svn: 340190
2018-08-20 17:31:34 +00:00
Philip Reames
60add894d8 [AST] Clarify printing of unknown size locations [NFC]
Printing "unknown" is much more clear than an arbitrary large integer

llvm-svn: 340108
2018-08-17 23:17:31 +00:00
Philip Reames
6bc29dcd94 [AST][Tests] Clarify what each test is doing
llvm-svn: 340100
2018-08-17 21:58:26 +00:00
Philip Reames
151ff803ae [AST[Tests] Shorten tests using noalias params
llvm-svn: 340099
2018-08-17 21:45:57 +00:00
Philip Reames
96303e96c2 [AST] Add tests for argmemonly calls [NFC]
First step towards building a test set to rebase D50730 on top of.  Starting with clone of memtransfer tests, more to come.

llvm-svn: 340095
2018-08-17 21:42:18 +00:00
Sanjay Patel
c53642cf79 [ConstantFolding] add simplifications for funnel shift intrinsics
This is another step towards being able to canonicalize to the funnel shift 
intrinsics in IR (see D49242 for the initial patch). 
We should not have any loss of simplification power in IR between these and 
the equivalent IR constructs.

Differential Revision: https://reviews.llvm.org/D50848

llvm-svn: 340022
2018-08-17 13:23:44 +00:00
Max Kazantsev
1be380d8da [MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514
The description of `isGuaranteedToExecute` does not correspond to its implementation.
According to description, it should return `true` if an instruction is executed under the
assumption that its loop is *entered*. However there is a sophisticated alrogithm inside
that tries to prove that the instruction is executed if the loop is *exited*, which is not the
same thing for infinite loops. There is an attempt to protect from dealing with infinite loops
by prohibiting loops without exit blocks, however an infinite loop can have exit blocks.

As result of that, MustExecute can falsely consider some blocks that are never entered as
mustexec, and LICM can hoist dangerous instructions out of them basing on this fact.
This may introduce UB to programs which did not contain it initially.

This patch removes the problematic algorithm and replaced it with a one which tries to
prove what is required in description.

Differential Revision: https://reviews.llvm.org/D50558
Reviewed By: reames

llvm-svn: 339984
2018-08-17 06:19:17 +00:00
Sanjay Patel
cf78f22909 [ConstantFolding] add tests for funnel shift intrinsics; NFC
No functionality for this yet.

llvm-svn: 339889
2018-08-16 16:10:42 +00:00
Easwaran Raman
89a18d93a9 [BFI] Use rounding while computing profile counts.
Summary:
Profile count of a block is computed by multiplying its block frequency
by entry count and dividing the result by entry block frequency. Do
rounded division in the last step and update test cases appropriately.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50822

llvm-svn: 339835
2018-08-16 00:26:59 +00:00
Max Kazantsev
ce271f3ebb [AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction
The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting
logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a
normal memory write. As result, a loop-invariant load cannot be hoisted out of loop because
the guard may possibly alias with it.

This patch makes `AliasSetTracker` so that it doesn't treat guards as memory writes.

Differential Revision: https://reviews.llvm.org/D50497
Reviewed By: reames

llvm-svn: 339753
2018-08-15 06:21:02 +00:00
Max Kazantsev
2a35355ab9 [NFC] Add comprehensive test of AliasSetTracker with guards
llvm-svn: 339643
2018-08-14 06:37:39 +00:00
Reid Kleckner
6e852d9c3d [BasicAA] Don't assume tail calls with byval don't alias allocas
Summary:
Calls marked 'tail' cannot read or write allocas from the current frame
because the current frame might be destroyed by the time they run.
However, a tail call may use an alloca with byval. Calling with byval
copies the contents of the alloca into argument registers or stack
slots, so there is no lifetime issue. Tail calls never modify allocas,
so we can return just ModRefInfo::Ref.

Fixes PR38466, a longstanding bug.

Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50679

llvm-svn: 339636
2018-08-14 01:24:35 +00:00
Max Kazantsev
194dc99083 [NFC] Add tests that demonstrate that MustExecute is fundamentally broken
llvm-svn: 339417
2018-08-10 09:20:46 +00:00
Krzysztof Parzyszek
e5e129cd68 [SCEV] Properly solve quadratic equations
Differential Revision: https://reviews.llvm.org/D48283

llvm-svn: 338758
2018-08-02 19:13:35 +00:00
John Brawn
f36d8dfcf7 [BasicAA] Use PhiValuesAnalysis if available when handling phi alias
By using PhiValuesAnalysis we can get all the values reachable from a phi, so
we can be more precise instead of giving up when a phi has phi operands. We
can't make BaseicAA directly use PhiValuesAnalysis though, as the user of
BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with.

For this optional usage to work correctly BasicAAWrapperPass now needs to be not
marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due
to how the legacy pass manager handles dependent passes being invalidated,
namely the depending pass still has a pointer to the now-dead dependent pass.

Differential Revision: https://reviews.llvm.org/D44564

llvm-svn: 338242
2018-07-30 11:52:08 +00:00
Keno Fischer
a21d54a8e0 [SCEV] Don't expand Wrap predicate using inttoptr in ni addrspaces
Summary:
In non-integral address spaces, we're not allowed to introduce inttoptr/ptrtoint
intrinsics. Instead, we need to expand any pointer arithmetic as geps on the
base pointer. Luckily this is a common task for SCEV, so all we have to do here
is hook up the corresponding helper function and add test case.

Fixes PR38290

Reviewers: sanjoy
Differential Revision: https://reviews.llvm.org/D49832

llvm-svn: 338073
2018-07-26 21:55:06 +00:00
Stanislav Mekhanoshin
1ac228fe82 Fix llvm::ComputeNumSignBits with some operations and llvm.assume
Currently ComputeNumSignBits does early exit while processing some
of the operations (add, sub, mul, and select). This prevents the
function from using AssumptionCacheTracker if passed.

Differential Revision: https://reviews.llvm.org/D49759

llvm-svn: 337936
2018-07-25 16:39:24 +00:00
Roman Tereshin
9cbf4aa413 [SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transform
if the top level addition in (D + (C-D + x + ...)) could be proven to
not wrap, where the choice of D also maximizes the number of trailing
zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the
transformation and better canonicalization of such expressions.

This enables better canonicalization of expressions like

  1 + zext(5 + 20 * %x + 24 * %y)  and
      zext(6 + 20 * %x + 24 * %y)

which get both transformed to

  2 + zext(4 + 20 * %x + 24 * %y)

This pattern is common in address arithmetics and the transformation
makes it easier for passes like LoadStoreVectorizer to prove that 2 or
more memory accesses are consecutive and optimize (vectorize) them.

Reviewed By: mzolotukhin

Differential Revision: https://reviews.llvm.org/D48853

llvm-svn: 337859
2018-07-24 21:48:56 +00:00
Max Kazantsev
78e21c99d5 [SCEV] Fix buggy behavior in getAddExpr with truncs
SCEV tries to constant-fold arguments of trunc operands in SCEVAddExpr, and when it does
that, it passes wrong flags into the recursion. It is only valid to pass flags that are proved for
narrow type into a computation in wider type if we can prove that trunc instruction doesn't
actually change the value. If it did lose some meaningful bits, we may end up proving wrong
no-wrap flags for sum of arguments of trunc.

In the provided test we end up with `nuw` where it shouldn't be because of this bug.

The solution is to conservatively pass `SCEV::FlagAnyWrap` which is always a valid thing to do.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D49471

llvm-svn: 337435
2018-07-19 01:46:21 +00:00
Max Kazantsev
29fa9d81d4 [NFC] Make a test more neat
llvm-svn: 337379
2018-07-18 11:03:40 +00:00
Tim Shen
fb4d2d1d80 Re-apply "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)."
llvm-svn: 337075
2018-07-13 23:58:46 +00:00
Tim Renouf
1bd8a9908e DivergenceAnalysis: added debug output
Summary:
This commit does two things:

1. modified the existing DivergenceAnalysis::dump() so it dumps the
   whole function with added DIVERGENT: annotations;

2. added code to do that dump if the appropriate -debug-only option is
   on.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47700

Change-Id: Id97b605aab1fc6f5a11a20c58a99bbe8c565bf83
llvm-svn: 336998
2018-07-13 13:13:30 +00:00
Piotr Padlewski
c64e878f84 Simplify recursive launder.invariant.group and strip
Summary:
This patch is crucial for proving equality laundered/stripped
pointers. eg:

  bool foo(A *a) {
    return a == std::launder(a);
  }

Clang with -fstrict-vtable-pointers will emit something like:

    define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
    entry:
      %c = bitcast %struct.A* %a to i8*
      %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
      %0 = bitcast %struct.A* %a to i8*
      %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
      %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
      %cmp = icmp eq i8* %1, %2
      ret i1 %cmp
    }

and because %2 can be replaced with @llvm.strip.invariant.group(%0)
and that %2 and %1 will produce the same value (because strip is readnone)
we can replace compare with true.

Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D47423

llvm-svn: 336963
2018-07-12 23:55:20 +00:00
Simon Pilgrim
69f8bfb6c2 [TargetTransformInfo] Add pow2 analysis for scalar constants
Add ConstantInt analysis to getOperandInfo so we get more realistic div/rem expansion costs comparable to the vector costs.

llvm-svn: 336827
2018-07-11 17:51:27 +00:00
Manoj Gupta
647946fa14 llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

llvm-svn: 336613
2018-07-09 22:27:23 +00:00
Simon Pilgrim
a4f67ce5f1 [CostModel][X86] Add SREM/UREM general and constant costs (PR38056)
We penalize general SDIV/UDIV costs but don't do the same for SREM/UREM.

This patch makes general vector SREM/UREM x20 as costly as scalar, the same approach as we do for SDIV/UDIV. The patch also extends the existing SDIV/UDIV constant costs for SREM/UREM - at the moment this means the additional cost of a MUL+SUB (see D48975).

Differential Revision: https://reviews.llvm.org/D48980

llvm-svn: 336486
2018-07-07 16:53:30 +00:00
Tim Shen
8dd0f7c995 Revert "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)."
This reverts commit r336140. Our tests shows that LSR assert fails with it.

llvm-svn: 336473
2018-07-06 23:20:35 +00:00
Max Kazantsev
04093c3826 Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done"
llvm-svn: 336410
2018-07-06 04:04:13 +00:00
Simon Pilgrim
27ee33f2d0 [CostModel][X86] Add UDIV/UREM by pow2 costs
Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc.

llvm-svn: 336371
2018-07-05 16:56:28 +00:00
Max Kazantsev
e6a4692668 [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri

llvm-svn: 336172
2018-07-03 06:23:57 +00:00