1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

1555 Commits

Author SHA1 Message Date
Max Kazantsev
7697aa1564 [IndVars] Smart hard uses detection
When rewriting loop exit values, IndVars considers this transform not profitable if
the loop instruction has a loop user which it believes cannot be optimized away.
In current implementation only calls that immediately use the instruction are considered
as such.

This patch extends the definition of "hard" users to any side-effecting instructions
(which usually cannot be optimized away from the loop) and also allows handling
of not just immediate users, but use chains.

Differentlai Revision: https://reviews.llvm.org/D51584
Reviewed By: etherzhhb

llvm-svn: 345814
2018-11-01 06:47:01 +00:00
Max Kazantsev
7d7e89e1b2 [SCEV] Avoid redundant computations when doing AddRec merge
When we calculate a product of 2 AddRecs, we end up making quite massive
computations to deduce the operands of resulting AddRec. This process can
be optimized by computing all args of intermediate sum and then calling
`getAddExpr` once rather than calling `getAddExpr` with intermediate
result every time a new argument is computed.

Differential Revision: https://reviews.llvm.org/D53189
Reviewed By: rtereshin

llvm-svn: 345813
2018-11-01 06:18:27 +00:00
Simon Pilgrim
656ffe8110 [TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)
Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

llvm-svn: 345617
2018-10-30 18:10:02 +00:00
Jonas Paulsson
73da77885f [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.
Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.

Review: Ulrich Weigand
https://reviews.llvm.org/D53791

llvm-svn: 345596
2018-10-30 13:41:03 +00:00
Craig Topper
520221ac38 [X86] Add -LABEL to some FileCheck checks. NFC
llvm-svn: 345407
2018-10-26 17:21:19 +00:00
Jonas Paulsson
7fed640ca5 [SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.
The SystemZ backend can do arithmetic of memory by loading and then extending
one of the operands. Similarly, a load + truncate can be folded into an
operand.

This patch improves the SystemZ TTI cost function to recognize this.

Review: Ulrich Weigand
https://reviews.llvm.org/D52692

llvm-svn: 345327
2018-10-25 22:28:25 +00:00
Jonas Paulsson
531880d371 [SystemZ] Improve handling and cost estimates of vector integer div/rem
Enable the DAG optimization that converts vector div/rem with constants into
multiply+shifts sequences by expanding them early. This is needed since
ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be
available to BuildSDIV after legalization.

Better cost values for these instructions based on how they will be
implemented (a constant divisor is cheaper).

Review: Ulrich Weigand
https://reviews.llvm.org/D53196

llvm-svn: 345321
2018-10-25 21:47:22 +00:00
Simon Pilgrim
c6ef0b8381 [CostModel][X86] Add realistic vXi64 uitofp vXf64 costs
Match codegen improvements from D53649/rL345256

llvm-svn: 345263
2018-10-25 13:06:20 +00:00
Simon Pilgrim
0fa453abed [CostModel][X86] Add realistic i64 uitofp f64 scalar costs
llvm-svn: 345261
2018-10-25 12:42:10 +00:00
Simon Pilgrim
a4026c510e [CostModel][X86] Add vXi8 vector division by constants costs.
ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV.

llvm-svn: 345175
2018-10-24 18:44:12 +00:00
Simon Pilgrim
5ebce6ffd3 [CostModel][X86] Enable non-uniform vector division by constants costs.
Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases.

llvm-svn: 345164
2018-10-24 17:30:29 +00:00
Simon Pilgrim
9dfb3985fe [TTI][X86] Treat SK_Transpose shuffles as SK_PermuteTwoSrc - there's no difference in lowering.
llvm-svn: 345048
2018-10-23 16:45:26 +00:00
Simon Pilgrim
7b3fe8fd54 [CostModel][X86] Add transpose shuffle cost tests
llvm-svn: 345045
2018-10-23 16:27:14 +00:00
Simon Pilgrim
6615580a1b Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345025
2018-10-23 13:14:54 +00:00
Simon Pilgrim
bf157daee9 Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345023
2018-10-23 13:00:22 +00:00
Simon Pilgrim
0ab6abaf21 [ARM] Regenerate reverse shuffle costs
Came about while cleaning up general shuffle costs for PR39368

llvm-svn: 344966
2018-10-22 22:26:00 +00:00
Simon Pilgrim
22d066d4c8 [CostModel][X86] Add some initial extract/insert subvector shuffle cost tests
Just f64/i64 tests initially to demonstrate PR39368

llvm-svn: 344857
2018-10-20 17:38:33 +00:00
Simon Pilgrim
5d8ef8c33b [CostModel][X86] Add integer vector reduction cost tests
llvm-svn: 344846
2018-10-20 14:29:59 +00:00
Thomas Lively
4f94a73c0f [ConstantFolding] Constant fold minimum and maximum intrinsics
Summary: Depends on D52764

Reviewers: aheejin, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52765

llvm-svn: 344796
2018-10-19 18:15:32 +00:00
Anna Thomas
d2aa341912 [LV] Teach vectorizer about variant value store into uniform address
Summary:
Teach vectorizer about vectorizing variant value stores to uniform
address. Similar to rL343028, we do not allow vectorization if we have
multiple stores to the same uniform address.

Cost model already has the change for considering the extract
instruction cost for a variant value store. See added test cases for how
vectorization is done.
The patch also contains changes to the ORE messages.

Reviewers: Ayal, mkuper, anemet, hsaito

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D52656

llvm-svn: 344613
2018-10-16 15:46:26 +00:00
Max Kazantsev
ef8db38ac0 [SCEV] Limit AddRec "simplifications" to avoid combinatorial explosions
SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into
a single AddRec of size `2n+1` with complex combinatorial coefficients can easily
trigger exponential growth of the SCEV (in case if nothing gets folded and simplified).
We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`,
but its default value seems to be insufficiently small: the test attached to this patch
with default value of this option `16` has a SCEV of >3M symbols (when printed out).

This patch reduces the simplification limit. It is not a cure to combinatorial
explosions, but at least it reduces this corner case to something more or less
reasonable.

Differential Revision: https://reviews.llvm.org/D53282
Reviewed By: sanjoy

llvm-svn: 344584
2018-10-16 05:26:21 +00:00
Jonas Paulsson
44725c7a8b [SystemZ] Temporarily disable high VFs with integer div/rem.
Until mischeduler is clever enough to avoid spilling in a vectorized loop
with many (scalar) DLRs it is better to avoid high vectorization factors (8
and above).

llvm-svn: 344129
2018-10-10 09:30:29 +00:00
Jonas Paulsson
1ed2d5f64f [SystemZ] Take better care when computing needed vector registers in TTI.
A new function getNumVectorRegs() is better to use for the number of needed
vector registers instead of getNumberOfParts(). This is to make sure that the
number of vector registers (and typically operations) required for a vector
type is accurate.

getNumberOfParts() which was previously used works by splitting the vector
type until it is legal gives incorrect results for types with a non
power of two number of elements (rare).

A new static function getScalarSizeInBits() that also checks for a pointer
type and returns 64U for it since otherwise it gets a value of 0). Used in a
few places where Ty may be pointer.

Review: Ulrich Weigand
llvm-svn: 344115
2018-10-10 07:36:27 +00:00
George Burgess IV
84de810c0d [Analysis] Make LocationSize pretty-printing more descriptive
This is the third patch in a series intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The second being r344013.

The intent is to make the output of printing a LocationSize more
precise. The main motivation for this is that we plan to add a bit to
distinguish whether a given LocationSize is an upper-bound or is
precise; making that information available in pretty-printing is nice.

llvm-svn: 344108
2018-10-10 01:35:22 +00:00
Anna Thomas
146a9e87c5 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Jonas Paulsson
16443bd463 [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM
After recent improvements which makes better use of LOC instead of IPM, the
TTI cost functions also needs to be updated to reflect this.

This involves sext, zext and xor of i1.

The tests were updated so that for z13 the new costs are expected, while the
old costs are still checked for on zEC12.

Review: Ulrich Weigand
https://reviews.llvm.org/D51339

llvm-svn: 342207
2018-09-14 06:46:55 +00:00
Peter Collingbourne
b6e7ff238c Prevent Constant Folding From Optimizing inrange GEP
This patch does the following things:

1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve inrange arribute;
2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it;
3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp;
4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D51698

llvm-svn: 341888
2018-09-11 01:53:36 +00:00
Philip Reames
0608255adc [AST] Add test coverage of memsets
Immediately after posting https://reviews.llvm.org/D51895, I noticed a small bug.  These tests would have caught that.

llvm-svn: 341880
2018-09-10 23:14:30 +00:00
Philip Reames
7578ed61eb [AST] Visit memtransfer arguments in order
The only point to this change is the test diffs.  When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs.

As an aside, the fact out tests are AST construction order dependent is not great.  I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways.

Philip

llvm-svn: 341841
2018-09-10 16:00:27 +00:00
Tim Northover
0f17fdc4c5 InstCombine: move hasOneUse check to the top of foldICmpAddConstant
There were two combines not covered by the check before now, neither of which
actually differed from normal in the benefit analysis.

The most recent seems to be because it was just added at the top of the
function (naturally). The older is from way back in 2008 (r46687) when we just
didn't put those checks in so routinely, and has been diligently maintained
since.

llvm-svn: 341831
2018-09-10 14:26:44 +00:00
Matt Arsenault
6815ea5d8c AMDGPU: Fix tests using old number for constant address space
llvm-svn: 341770
2018-09-10 02:54:25 +00:00
Philip Reames
6fce828d58 [AST] Generalize argument specific aliasing
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch.  It was originally part of this one, but separate for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713
2018-09-07 21:36:11 +00:00
Nicola Zaghen
79c63f731b [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Nicolai Haehnle
6b4a3db968 Move test/Analysis/DivergenceAnalysis/AMDGPU/loads.ll
Should fix failures of buildbots that don't build the AMDGPU backend.

Change-Id: I01cb84b4b47803b10c5b21ea0353546239860a51
llvm-svn: 341079
2018-08-30 15:24:00 +00:00
Nicolai Haehnle
5a64e70379 [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Sanjay Patel
b35900e555 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Kit Barton
5fe828ca5e [PPC] Remove Darwin support from POWER backend.
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

llvm-svn: 340795
2018-08-28 01:18:29 +00:00
Craig Topper
bc7a9bd514 [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.
Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51267

llvm-svn: 340708
2018-08-26 18:47:44 +00:00
John Brawn
ea0c076dc1 [PhiValues] Use callback value handles to invalidate deleted values
The way that PhiValues is integrated with BasicAA it is possible for a pass
which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without
intending to, and then delete values from a function in a way that causes
PhiValues to return dangling pointers to these deleted values. Fix this by
having a set of callback value handles to invalidate values when they're
deleted.

llvm-svn: 340613
2018-08-24 15:48:30 +00:00
Brian Homerding
7548e9b485 [FunctionAttrs] Infer WriteOnly Function Attribute
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.

Reviewers: hfinkel, jdoerfert

Differential Revision: https://reviews.llvm.org/D48387

llvm-svn: 340537
2018-08-23 15:05:22 +00:00
Philip Reames
05de5c7fdc [AST] Add a test for attribute intersection
Already works, but I initially convinced myself it doesn't, so add a test which shows it does.  :)

llvm-svn: 340453
2018-08-22 21:10:56 +00:00
Scott Linder
f19b5a445f [AMDGPU] Consider loads from flat addrspace to be potentially divergent
In general we can't assume flat loads are uniform, and cases where we can prove
they are should be handled through infer-address-spaces.

Differential Revision: https://reviews.llvm.org/D50991

llvm-svn: 340343
2018-08-21 21:24:31 +00:00
Philip Reames
8217861627 [AST] Remove notion of volatile from alias sets [NFCI]
Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for..

llvm-svn: 340312
2018-08-21 17:59:11 +00:00
Sanjay Patel
c684399bd3 [ConstantFolding] improve folding of binops with vector undef operand
A non-undef operand may still have undef constant elements, 
so we should always propagate the vector results per-lane.

llvm-svn: 340194
2018-08-20 18:19:02 +00:00
Sanjay Patel
eaa473e7b1 [ConstantFolding] add tests for binops on vectors with undef elements; NFC
llvm-svn: 340190
2018-08-20 17:31:34 +00:00
Philip Reames
60add894d8 [AST] Clarify printing of unknown size locations [NFC]
Printing "unknown" is much more clear than an arbitrary large integer

llvm-svn: 340108
2018-08-17 23:17:31 +00:00
Philip Reames
6bc29dcd94 [AST][Tests] Clarify what each test is doing
llvm-svn: 340100
2018-08-17 21:58:26 +00:00
Philip Reames
151ff803ae [AST[Tests] Shorten tests using noalias params
llvm-svn: 340099
2018-08-17 21:45:57 +00:00
Philip Reames
96303e96c2 [AST] Add tests for argmemonly calls [NFC]
First step towards building a test set to rebase D50730 on top of.  Starting with clone of memtransfer tests, more to come.

llvm-svn: 340095
2018-08-17 21:42:18 +00:00
Sanjay Patel
c53642cf79 [ConstantFolding] add simplifications for funnel shift intrinsics
This is another step towards being able to canonicalize to the funnel shift 
intrinsics in IR (see D49242 for the initial patch). 
We should not have any loss of simplification power in IR between these and 
the equivalent IR constructs.

Differential Revision: https://reviews.llvm.org/D50848

llvm-svn: 340022
2018-08-17 13:23:44 +00:00