1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

847 Commits

Author SHA1 Message Date
Sanjay Patel
9da3c40bfb [ValueTracking] fix bug computing isKnownToBeAPowerOfTwo() with arithmetic shift right (PR25900)
This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=25900

If we think that an arithmetic right shift of a power of two is always a power of two, 
an sdiv gets wrongly converted to udiv.

Differential Revision: http://reviews.llvm.org/D15827

llvm-svn: 256655
2015-12-30 22:40:52 +00:00
Elena Demikhovsky
3ed0b3c7f1 Implemented cost model for masked gather and scatter operations
The cost is calculated for all X86 targets. When gather/scatter instruction
is not supported we calculate the cost of scalar sequence.

Differential revision: http://reviews.llvm.org/D15677

llvm-svn: 256519
2015-12-28 20:10:59 +00:00
Chen Li
c60ad3e1fe [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint. 

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

llvm-svn: 256443
2015-12-26 07:54:32 +00:00
David Majnemer
42fc0975e8 [OperandBundles] Have GlobalsModRef play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

llvm-svn: 256329
2015-12-23 09:58:46 +00:00
Cong Hou
50c405416c [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Jun Bum Lim
8b973fac5e Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
This is recommit of r256028 with minor fixes in unittests:
  CodeGen/Mips/eh.ll
  CodeGen/Mips/insn-zero-size-bb.ll

Original commit message:

When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

llvm-svn: 256202
2015-12-21 22:00:51 +00:00
Cong Hou
ac19620238 [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
This patch transforms truncation between vectors of integers into
X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in
lowering phase because after type legalization, the original truncation
will be turned into a BUILD_VECTOR with each element that is extracted
from a vector and then truncated, and from them it is difficult to do
this optimization. This greatly improves the performance of truncations
on some specific types.

Cost table is updated accordingly.


Differential revision: http://reviews.llvm.org/D14588

llvm-svn: 256194
2015-12-21 20:42:43 +00:00
Michael Zolotukhin
fa6d622c80 [ValueTracking] Properly handle non-sized types in isAligned function.
Reviewers: apilipenko, reames, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15597

llvm-svn: 256192
2015-12-21 20:38:18 +00:00
Tom Stellard
1e9164c058 AMDGPU/SI: Fix implemenation of isSourceOfDivergence() for graphics shaders
Summary:
The analysis of shader inputs was completely wrong.  We were passing the
wrong index to AttributeSet::hasAttribute() and the logic for which
inputs where in SGPRs was wrong too.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15608

llvm-svn: 256082
2015-12-19 02:54:15 +00:00
Cong Hou
d24cdaa414 Use getEdgeProbability() instead of getEdgeWeight() in BFI and remove getEdgeWeight() interfaces from MBPI.
This patch removes all getEdgeWeight() interfaces from CodeGen directory. As
getEdgeProbability() is a little more expensive than getEdgeWeight(), I will
compose a patch soon in which BPI only stores probabilities instead of edge
weights so that getEdgeProbability() will have O(1) time.


Differential revision: http://reviews.llvm.org/D15489

llvm-svn: 256039
2015-12-18 21:53:24 +00:00
Jingyue Wu
76a12162c8 [DivergenceAnalysis] fix a bug in computing influence regions
Fixes PR25864

llvm-svn: 256036
2015-12-18 21:44:26 +00:00
Rafael Espindola
0cdfda8077 Revert "Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst"
This reverts commit r256028.

It broke:

    LLVM :: CodeGen/Mips/eh.ll
    LLVM :: CodeGen/Mips/insn-zero-size-bb.ll

llvm-svn: 256032
2015-12-18 21:23:32 +00:00
Jun Bum Lim
a88667c58b Enhance BranchProbabilityInfo::calcUnreachableHeuristics for InvokeInst
When identifying blocks post-dominated by an unreachable-terminated block
in BranchProbabilityInfo, consider only the edge to the normal destination
block if the terminator is InvokeInst and let calcInvokeHeuristics() decide
edge weights for the InvokeInst.

llvm-svn: 256028
2015-12-18 20:53:47 +00:00
Vaivaswatha Nagaraj
888cff000a GlobalsAA: Take advantage of ArgMemOnly, InaccessibleMemOnly and InaccessibleMemOrArgMemOnly attributes
Summary:
1. Modify AnalyzeCallGraph() to retain function info for external functions
if the function has [InaccessibleMemOr]ArgMemOnly flags.
2. When analyzing the use of a global is function parameter at a call site,
mark the callee also as modifying the global appropriately.
3. Add additional test cases.

Depends on D15499

Reviewers: hfinkel, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15605

llvm-svn: 255994
2015-12-18 11:02:52 +00:00
Matt Arsenault
c7d6d736ee AMDGPU: Override getCFInstrCost
The default cost was 0 with the assumption that it is predictable.

llvm-svn: 255796
2015-12-16 18:37:19 +00:00
Cong Hou
649a5e80cc [X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1.
Previously in the conversion cost table there are no entries for integer-integer
conversions on SSE2. This will result in imprecise costs for certain vectorized
operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers
are counted from the result of running llc on the new test case in this patch.


Differential revision: http://reviews.llvm.org/D15132

llvm-svn: 255315
2015-12-11 00:31:39 +00:00
Cong Hou
cc25d3b7d5 Don't punish vectorized arithmetic instruction whose type will be split to multiple registers
Currently in LLVM's cost model, a vectorized arithmetic instruction will have
high cost if its type is split into multiple registers. However, this
punishment is too heavy and unnecessary. The overhead of the split should not
be on arithmetic instructions but instructions that implement the split. Note
that during vectorization we have calculated the register pressure, and we
only choose proper interleaving factor (and also vectorization factor) so
that we don't use more registers than the maximum number.

Here is a very simple example: if a vadd has the cost 1, and if we double VF
so that we need two registers to perform it, then its cost will become 4 with
the current implementation, which will prevent us to use larger VF.


Differential revision: http://reviews.llvm.org/D15159

llvm-svn: 254671
2015-12-04 00:36:58 +00:00
Elena Demikhovsky
32e8d60483 AVX-512: Updated cost of FP/SINT/UINT conversion operations
I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode.
Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets).

Differential Revision: http://reviews.llvm.org/D15074

llvm-svn: 254494
2015-12-02 08:59:47 +00:00
Matt Arsenault
8a9536c837 AMDGPU: Report extractelement as free in cost model
The cost for scalarized operations is computed as N * (scalar operation
cost + 1 extractelement + 1 insertelement). This partially fixes
inflating the cost of scalarized operations since every operation is
scalarized and free. I don't think we want any cost asociated with
scalarization, but for now insertelement is still counted. I'm not sure
if we should pretend that insertelement is also free, or add a way
to compute a custom scalarization cost.

llvm-svn: 254438
2015-12-01 19:08:39 +00:00
Elena Demikhovsky
ce52715df6 Fixed a failure in cost calculation for vector GEP
Cost calculation for vector GEP failed with due to invalid cast to GEP index operand.
The bug is fixed, added a test.

http://reviews.llvm.org/D14976

llvm-svn: 254408
2015-12-01 12:08:36 +00:00
Pete Cooper
b753649d63 Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper
aca4c5cdc6 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Igor Laevsky
521a68ebf0 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
Charlie Turner
e780735f3d [ARM] Don't pessimize i32 vselect.
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.

I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.

From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%

I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change

Differential Revision: http://reviews.llvm.org/D14743

llvm-svn: 253349
2015-11-17 17:25:15 +00:00
James Molloy
64e756bd07 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
James Molloy
9cd723ec63 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy
4da975f03a [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
Akira Hatanaka
31357e0018 Sort the enums in Attributes.h in case insensitive alphabetical order.
Sort the enums in preparation for moving the attributes to a table-gen
file.

rdar://problem/19836465

llvm-svn: 252692
2015-11-11 02:11:46 +00:00
Renato Golin
2143e4c7a3 Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Igor Laevsky
747370b198 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Mehdi Amini
6086bc7062 Fix LoopAccessAnalysis when potentially nullptr check are involved
Summary:
GetUnderlyingObjects() can return "null" among its list of objects,
we don't want to deduce that two pointers can point to the same
memory in this case, so filter it out.

Reviewers: anemet

Subscribers: dexonsmith, llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 252149
2015-11-05 05:49:43 +00:00
Adam Nemet
8ce9fb467e [LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13256

llvm-svn: 251985
2015-11-03 21:39:52 +00:00
Adam Nemet
b9c59b29d9 [LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence
Summary:
When the dependence distance in zero then we have a loop-independent
dependence from the earlier to the later access.

No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.

This and the previous patch were tested together for compile-time
regression.  None found in LNT/SPEC.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13255

llvm-svn: 251973
2015-11-03 20:13:43 +00:00
Adam Nemet
c168f14a53 [LAA] LLE 1/6: Expose Forward dependences
Summary:
Before this change, we didn't use to collect forward dependences since
none of the current clients (LV, LDist) required them.

The motivation to also collect forward dependences is a new pass
LoopLoadElimination (LLE) which discovers store-to-load forwarding
opportunities across the loop's backedge.  The pass uses both lexically
forward or backward loop-carried dependences to detect these
opportunities.

The new pass also analyzes loop-independent (forward) dependences since
they can conflict with the loop-carried dependences in terms of how the
data flows through memory.

The newly added test only covers loop-carried forward dependences
because loop-independent ones are currently categorized as NoDep.  The
next patch will fix this.

The two patches were tested together for compile-time regression.  None
found in LNT/SPEC.

Note that with this change LAA provides all dependences rather than just
"interesting" ones.  A subsequent NFC patch will remove the now trivial
isInterestingDependence and rename the APIs.

Reviewers: hfinkel

Subscribers: jmolloy, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13254

llvm-svn: 251972
2015-11-03 20:13:23 +00:00
Sanjoy Das
f81bfefe0b [SCEV] Fix PR25369
Have `getConstantEvolutionLoopExitValue` work correctly with multiple
entry loops.

As far as I can tell, `getConstantEvolutionLoopExitValue` never did the
right thing for multiple entry loops; and before r249712 it would
silently return an incorrect answer.  r249712 changed SCEV to fail an
assert on a multiple entry loop, and this change fixes the underlying
issue.

llvm-svn: 251770
2015-11-02 02:06:01 +00:00
Sanjoy Das
367ed95f05 [SCEV] Don't create SCEV expressions that break LCSSA
Prevent `createNodeFromSelectLikePHI` from creating SCEV expressions
that break LCSSA.

A better fix for the same issue is to teach SCEVExpander to not break
LCSSA by inserting PHI nodes at appropriate places.  That's planned for
the future.

Fixes PR25360.

llvm-svn: 251756
2015-10-31 23:21:40 +00:00
Silviu Baranga
a85fe6b03a [SCEV] Generalize the SCEV algorithm for creating expressions for PHI nodes
Summary:
When forming expressions for phi nodes having an incoming value from
outside the loop A and a value coming from the previous iteration B
we were forming an AddRec if:
  - B was an AddRec
  - the value A was equal to the value for B at iteration -1 (or equal
    to the value of B shifted by one iteration, at iteration 0)

In this case, we were computing the expression to be the expression of
B, shifted by one iteration.

This changes generalizes the logic above by removing the restriction that
B needs to be an AddRec. For this we introduce two expression rewriters
that allow us to
  - shift an expression by one iteration
  - get the value of an expression at iteration 0

This allows us to get SCEV expressions for PHI nodes when these expressions
are not AddRecExprs.

Reviewers: sanjoy

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D14175

llvm-svn: 251700
2015-10-30 15:02:28 +00:00
Sanjoy Das
5aa51e5247 [SCEV] Compute max backedge count for loops with "shift ivs"
This teaches SCEV to compute //max// backedge taken counts for loops
like

    for (int i = k; i != 0; i >>>= 1)
      whatever();

SCEV yet cannot represent the exact backedge count for these loops, and
this patch does not change that.  This is really geared towards teaching
SCEV that loops like the above are *not* infinite.

llvm-svn: 251558
2015-10-28 21:27:14 +00:00
Igor Laevsky
0cafc32245 [AliasAnalysis] Take into account readnone attribute for the function arguments
Differential Revision: http://reviews.llvm.org/D13992

llvm-svn: 251535
2015-10-28 17:54:48 +00:00
Igor Laevsky
7085d5bdf2 [AliasAnalysis] Take into account readonly attribute for the function arguments
In getArgModRefInfo we consider all arguments as having MRI_ModRef.
However for arguments marked with readonly attribute we can return 
more precise answer - MRI_Ref.

Differential Revision: http://reviews.llvm.org/D13992

llvm-svn: 251525
2015-10-28 16:42:00 +00:00
Chad Rosier
b8f49a8510 Add newline to test. NFC.
llvm-svn: 251511
2015-10-28 12:30:08 +00:00
James Molloy
a250ff5130 [GlobalsAA] An indirect global that is initialized is not fair game
When checking if an indirect global (a global with pointer type) is only assigned by allocation functions, first check if the global is itself initialized. If it is, it's not only assigned by allocation functions.

This fixes PR25309. Thanks to David Majnemer for reducing the test case!

llvm-svn: 251508
2015-10-28 10:41:29 +00:00
Sanjoy Das
2819145e6a [ValueTracking] Use !range metadata more aggressively in KnownBits
Summary:
Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more
aggressively.

Reviewers: majnemer, nlewycky, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14100

llvm-svn: 251487
2015-10-28 03:20:15 +00:00
James Molloy
8d9daddc1b [ValueTracking] Extend r251146 to catch a fairly common case
Even though we may not know the value of the shifter operand, it's possible we know the shifter operand is non-zero. This can allow us to infer more known bits - for example:

  %1 = load %p !range {1, 5}
  %2 = shl %q, %1

We don't know %1, but we do know that it is nonzero so %2[0] is known zero, and importantly %2 is known non-zero.

Calling isKnownNonZero is nontrivially expensive so use an Optional to run it lazily and cache its result.

llvm-svn: 251294
2015-10-26 14:10:46 +00:00
James Molloy
6472ad3426 [BasicAA] Bugfix for r251016
If the loaded type sizes don't match the element type of the sequential type, all bets are off and the addresses may, indeed, overlap.

Surprisingly, this just got caught in one test, on one builder, out of the 30+ builders testing this change. Congratulations go to http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/5205.

llvm-svn: 251112
2015-10-23 14:17:03 +00:00
Sanjoy Das
20577e65c0 [SCEV] Commute zero extends through <nuw> additions
llvm-svn: 251052
2015-10-22 19:57:38 +00:00
Sanjoy Das
2c0387ed8b [SCEV] Commute sign extends through nsw additions
Summary: Depends on D13613.

Reviewers: atrick, hfinkel, reames, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13685

llvm-svn: 251049
2015-10-22 19:57:25 +00:00
Sanjoy Das
fd03ccdaff [SCEV] Mark AddExprs as nsw or nuw if legal
Summary:
This uses `ScalarEvolution::getRange` and not potentially control
dependent `nsw` and `nuw` bits on the arithmetic instruction.

Reviewers: atrick, hfinkel, nlewycky

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13613

llvm-svn: 251048
2015-10-22 19:57:19 +00:00
James Molloy
ee690e2eb4 [GlobalsAA] Loosen an overly conservative bailout
Instead of bailing out when we see loads, analyze them. If we can prove that the loaded-from address must escape, then we can conclude that a load from that address must escape too and therefore cannot alias a non-addr-taken global.

When checking if a Value can alias a non-addr-taken global, if the Value is a LoadInst of a non-global, recurse instead of bailing.

If we can follow a trail of loads up to some base that is captured, we know by inference that all the loads we followed are also captured.

llvm-svn: 251017
2015-10-22 13:44:26 +00:00
James Molloy
a87ba0206e [BasicAA] Non-equal indices in a GEP of a SequentialType don't overlap
If the final indices of two GEPs can be proven to not be equal, and
the GEP is of a SequentialType (not a StructType), then the two GEPs
do not alias.

llvm-svn: 251016
2015-10-22 13:28:18 +00:00