mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

6701 Commits

Author SHA1 Message Date
Sanjay Patel
51c21d223b [InstSimplify] allow or-of-icmps folds with vector splat constants
llvm-svn: 282592
2016-09-28 14:27:21 +00:00
Sanjay Patel
aac597925f [InstSimplify] allow and-of-icmps folds with vector splat constants
llvm-svn: 282590
2016-09-28 13:53:13 +00:00
Adam Nemet
395f991d15 [LAA] Rename emitAnalysis to recordAnalysis. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client, which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

llvm-svn: 282561
2016-09-28 00:58:36 +00:00
Adam Nemet
ec2292c80c [Inliner] Port all opt remarks to new streaming API
llvm-svn: 282559
2016-09-27 23:47:03 +00:00
Adam Nemet
c03a73efe2 Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot, and they're way too long.

llvm-svn: 282544
2016-09-27 22:19:23 +00:00
Adam Nemet
f602aa8cdd Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentations of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once, but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports writing
only and asserts that.

On the other hand, I did verify experimentally that the class hierarchy
starting at DiagnosticInfoOptimizationBase can be mapped back from the YAML
generated here (see D24479).  A sketch of the YAML I/O mapping idiom used
for writing appears after this list.

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before, hotness is computed in the analysis pass instead of in
DiagnosticInfo.  This avoids a layering problem, since BFI is in
Analysis while DiagnosticInfo is in IR.
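
As a rough illustration of the write-only YAML I/O idiom mentioned in the
YAML I/O bullet above, here is a minimal sketch.  It assumes the usual
llvm/Support/YAMLTraits.h pattern; the RemarkLocation struct and its field
names are hypothetical and only mirror the DebugLoc mapping shown in the
example output, not the commit's actual types.

  #include "llvm/Support/YAMLTraits.h"
  #include <string>

  // Hypothetical struct mirroring the DebugLoc mapping in the YAML above.
  struct RemarkLocation {
    std::string File;
    unsigned Line = 0;
    unsigned Column = 0;
  };

  namespace llvm {
  namespace yaml {
  // The specialization lives under the yaml namespace, as noted at the top
  // of this commit message (GCC requires this placement).
  template <> struct MappingTraits<RemarkLocation> {
    static void mapping(IO &io, RemarkLocation &RL) {
      io.mapRequired("File", RL.File);
      io.mapRequired("Line", RL.Line);
      io.mapRequired("Column", RL.Column);
    }
  };
  } // namespace yaml
  } // namespace llvm

  // Writing (the only direction this commit exercises):
  //   llvm::yaml::Output Out(llvm::errs());
  //   RemarkLocation RL{"/tmp/s.c", 5, 10};
  //   Out << RL;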

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282539
2016-09-27 20:55:07 +00:00
Sanjoy Das
07e9608e3a [SCEV] Replace a struct with a function; NFC
We can do this now thanks to C++11 lambdas.

llvm-svn: 282515
2016-09-27 18:01:48 +00:00
Sanjoy Das
cbfb142a0c [SCEV] Use find instead of find_as; NFC
We don't need the extra generality here.

llvm-svn: 282514
2016-09-27 18:01:46 +00:00
Sanjoy Das
539ce05e97 [SCEV] Reduce the scope of a struct; NFC
llvm-svn: 282513
2016-09-27 18:01:44 +00:00
Sanjoy Das
dafc679040 [SCEV] Remove custom RAII wrapper; NFC
Instead use the pre-existing `scope_exit` class.
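
For context, a minimal sketch of the `scope_exit` idiom this commit switches
to, assuming the llvm/ADT/ScopeExit.h utility; the guarded flag is
illustrative only, not SCEV's actual state.

  #include "llvm/ADT/ScopeExit.h"

  void runWithReentrancyGuard(bool &InFlight) {
    InFlight = true;
    // The lambda runs when Cleanup goes out of scope, on every exit path,
    // which is exactly what the removed custom RAII wrapper provided.
    auto Cleanup = llvm::make_scope_exit([&] { InFlight = false; });
    // ... work that may return early ...
  }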

llvm-svn: 282512
2016-09-27 18:01:42 +00:00
Sanjoy Das
2d300f40f9 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.
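
A hedged sketch of the container swap described above; the element type and
inline size are illustrative, not taken from the commit.

  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/Analysis/ScalarEvolution.h"

  // Small inline storage avoids heap allocation for the common case of only
  // a handful of pending predicates.
  llvm::SmallPtrSet<const llvm::SCEV *, 4> PendingLoopPredicates;
  // was (roughly): llvm::DenseSet<const llvm::SCEV *> PendingLoopPredicates;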

llvm-svn: 282511
2016-09-27 18:01:38 +00:00
Adam Nemet
5058aadaf2 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

llvm-svn: 282503
2016-09-27 16:39:24 +00:00
Adam Nemet
b1d6f940c4 Output optimization remarks in YAML
This allows various presentations of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once, but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports writing
only and asserts that.

On the other hand, I did verify experimentally that the class hierarchy
starting at DiagnosticInfoOptimizationBase can be mapped back from the YAML
generated here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before, hotness is computed in the analysis pass instead of in
DiagnosticInfo.  This avoids a layering problem, since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282499
2016-09-27 16:15:16 +00:00
Piotr Padlewski
3152e057e1 [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves the ThinLTO importer by allowing the import of up to
3x larger functions when they are called from a hot block.

I compared performance with trunk on SPEC, and saw improvements of about 2%
on povray and 3.33% on milc. These results seem to be consistent and match
the results Teresa got with her simple heuristic. Some benchmarks got slower,
but I think they are just noisy (mcf, xalancbmk, omnetpp); I am running the
benchmarks again with more iterations to confirm. The geomean of all
benchmarks, including the noisy ones, was about +0.02%.

I see a much bigger improvement on the Google branch with Easwaran's patch
for PGO callsite inlining (the inliner actually inlines those big functions):
overall I see a +0.5% improvement, and I get +8.65% on povray.
So I expect a much bigger change once Easwaran's patch lands
(it depends on the new pass manager), but this is still worth putting in
trunk before it.

Implementation detail changes:
- Removed CallsiteCount.
- ProfileCount was replaced by Hotness.
- hot-import-multiplier is set to 3.0 for now; I didn't have time to tune it,
but we get most of the interesting functions with 3, so there is not much
performance difference with higher values, and binary size doesn't grow as
much as with 10.0. (A sketch of the heuristic appears after this list.)
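
A sketch of the heuristic as described above; all names are illustrative
(only the hot-import-multiplier default of 3.0 comes from this commit), and
this is not the importer's actual code.

  // Allow importing larger functions when the callsite sits in a hot block.
  unsigned adjustedImportThreshold(unsigned BaseThreshold, bool CallsiteIsHot) {
    const double HotImportMultiplier = 3.0; // -hot-import-multiplier default
    return CallsiteIsHot
               ? static_cast<unsigned>(BaseThreshold * HotImportMultiplier)
               : BaseThreshold;
  }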

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00
Chandler Carruth
59981672f6 [SCEV] Fix the order of members in the initializer list.
Noticed due to the warning on this line. Sanjoy is on
a less-than-awesome internet connection, so committing on his behalf.

llvm-svn: 282380
2016-09-26 04:49:58 +00:00
Sanjoy Das
ef4a695251 [SCEV] Assign LoopPropertiesCache in the move constructor
In a previous change I collapsed two different caches into one.  When
doing that I noticed that ScalarEvolution's move constructor was not
moving those caches.

To keep the previous change simple, I've moved that bugfix into this
separate change.

llvm-svn: 282376
2016-09-26 02:44:10 +00:00
Sanjoy Das
b9f8e32d12 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

llvm-svn: 282375
2016-09-26 02:44:07 +00:00
Sanjoy Das
4dc0994173 [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

llvm-svn: 282374
2016-09-26 01:10:27 +00:00
Sanjoy Das
6926238887 [SCEV] Further isolate incidental data structure; NFC
llvm-svn: 282373
2016-09-26 01:10:25 +00:00
Sanjoy Das
2025c6e5ef [SCEV] Simplify BackedgeTakenInfo::getMax; NFC
llvm-svn: 282372
2016-09-26 01:10:22 +00:00
Sanjoy Das
761e5aabb2 [SCEV] Reserve space in SmallVector; NFC
llvm-svn: 282368
2016-09-25 23:12:08 +00:00
Sanjoy Das
23b6c4c2a8 [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC
SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to
store the (optional) data out of line.
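
A hedged sketch of the "store the optional data out of line" idea; the field
names are illustrative, not SCEV's actual members.

  #include "llvm/Analysis/ScalarEvolution.h"
  #include <memory>

  struct ExitNotTakenInfoSketch {
    const llvm::SCEV *ExactNotTaken = nullptr;
    // The heavyweight predicate is heap-allocated only when it is actually
    // needed, so the common case stays small.
    std::unique_ptr<llvm::SCEVUnionPredicate> Predicate;
  };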

llvm-svn: 282366
2016-09-25 23:12:04 +00:00
Sanjoy Das
6bbb5f521c [SCEV] Simplify tracking ExitNotTakenInfo instances; NFC
This change simplifies a data structure optimization in the
`BackedgeTakenInfo` class for loops with exactly one computable exit.

I've sanity checked that this does not regress compile time performance,
using sqlite3's amalgamated build.

llvm-svn: 282365
2016-09-25 23:12:00 +00:00
Sanjoy Das
3b4c0c8086 [SCEV] Rename a couple of fields; NFC
llvm-svn: 282364
2016-09-25 23:11:57 +00:00
Sanjoy Das
e39d2ae7eb [SCEV] Remove incidental data structure; NFC
llvm-svn: 282363
2016-09-25 23:11:55 +00:00
Duncan P. N. Exon Smith
2f789947a5 Analysis: Return early for UndefValue in computeKnownBits
There is no benefit in looking through assumptions on UndefValue to
guess known bits.  Return early to avoid walking their use-lists, and
assert that all instances of ConstantData are handled here for similar
reasons (UndefValue was the only integer/pointer holdout).

llvm-svn: 282337
2016-09-24 20:42:02 +00:00
Duncan P. N. Exon Smith
b94ad7ec1b Analysis: Return early in isKnownNonNullAt for ConstantData
Check and return early for ConstantPointerNull and UndefValue
specifically in isKnownNonNullAt, and assert that ConstantData never
make it to isKnownNonNullFromDominatingCondition.

This confirms that isKnownNonNullFromDominatingCondition never walks
through the use-list of an instance of ConstantData.  Given that such
use-lists cross module boundaries, it never really made sense to do so,
and was potentially very expensive.
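
A hedged sketch of the early-return shape described above, assuming the usual
llvm/IR headers; this is not the actual ValueTracking code.

  #include "llvm/IR/Constants.h"
  #include <cassert>

  static bool isKnownNonNullSketch(const llvm::Value *V) {
    using namespace llvm;
    // Answer immediately for these constants; never walk a ConstantData
    // use-list, since such use-lists cross module boundaries.
    if (isa<ConstantPointerNull>(V) || isa<UndefValue>(V))
      return false;
    assert(!isa<ConstantData>(V) && "remaining ConstantData handled earlier");
    // The dominating-condition analysis would go here (omitted).
    return false;
  }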

llvm-svn: 282333
2016-09-24 19:39:47 +00:00
Sanjay Patel
6d495bce9c [TLI] isdigit / isascii / toascii param type should match return type (PR30484)
We crash in LibCallSimplifier if we don't check the validity of the function signature properly.

llvm-svn: 282278
2016-09-23 18:44:09 +00:00
Jun Bum Lim
bd9da0f624 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
Adam Nemet
f1965e4c24 [LV] When reporting about a specific instruction without a debug location, use the loop's debug location
This can occur, for example, if some optimization drops the debug location.

llvm-svn: 282048
2016-09-21 03:14:20 +00:00
Michael Kuperstein
54c54e0c3e [InferAttributes] Don't access parameters that don't exist.
Check for the correct number of parameters before querying their type.
This fixes PR30455.

llvm-svn: 282038
2016-09-20 23:10:31 +00:00
Sanjay Patel
e9cfe9ee79 move variables closer to their uses; add FIXMEs; NFC
llvm-svn: 281972
2016-09-20 14:36:14 +00:00
Elena Demikhovsky
fdf2d14b30 [Loop Vectorizer] Consecutive memory access - fixed and simplified
Amended consecutive memory access detection in the Loop Vectorizer.
Loads/stores were not handled properly without a preceding GEP instruction.

Differential Revision: https://reviews.llvm.org/D20789

llvm-svn: 281853
2016-09-18 13:56:08 +00:00
Sanjay Patel
11e8147804 [InstCombine] allow vector types for constant folding / computeKnownBits (PR24942)
computeKnownBits() already works for integer vectors, so allow vector types when calling that from InstCombine.

I don't think the change to use m_APInt in computeKnownBits is strictly necessary because we do check for 
ConstantVector later, but it's more efficient to handle the splat case without needing to loop on vector elements.
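
A hedged sketch of the m_APInt splat-matching idiom mentioned above; the
variable names are illustrative and this is not the patch itself.

  #include "llvm/ADT/APInt.h"
  #include "llvm/IR/PatternMatch.h"

  void knownBitsFromSplatSketch(llvm::Value *V, llvm::APInt &KnownZero,
                                llvm::APInt &KnownOne) {
    using namespace llvm::PatternMatch;
    const llvm::APInt *C;
    // m_APInt binds the value for a scalar ConstantInt or for a constant
    // splat vector, so the splat case needs no per-element loop.
    if (match(V, m_APInt(C))) {
      KnownOne = *C;
      KnownZero = ~*C;
    }
  }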

This should work with InstSimplify, but doesn't yet, so I made that a FIXME comment on the test for PR24942:
https://llvm.org/bugs/show_bug.cgi?id=24942

Differential Revision: https://reviews.llvm.org/D24677

llvm-svn: 281777
2016-09-16 21:20:36 +00:00
David L Kreitzer
d923ca0cb8 Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Chandler Carruth
f1fcfdd6f0 [LCG] Redesign the lazy post-order iteration mechanism for the
LazyCallGraph to support repeated, stable iterations, even in the face
of graph updates.

This is particularly important to allow the CGSCC pass manager to walk
the RefSCCs (and thus everything else) in a module more than once. Lots
of unittests and other tests were hard or impossible to write because
repeated CGSCC pass managers which didn't invalidate the LazyCallGraph
would conclude the module was empty after the first one. =[ Really,
really bad.

The interesting thing is that in many ways this simplifies the code. We
can now re-use the same code for handling reference edge insertion
updates of the RefSCC graph as we use for handling call edge insertion
updates of the SCC graph. Outside of adapting to the shared logic for
this (which isn't trivial, but is *much* simpler than the DFS it
replaces!), the new code involves inserting newly created RefSCCs into the
cached list in the correct way when deleting a reference edge, and
re-formulating the iterator to be stable and effective even in the face
of these kinds of updates.

I've updated the unittests for the LazyCallGraph to re-iterate the
postorder sequence and verify that this all works. We even check for
using alternating iterators to trigger the lazy formation of RefSCCs
after mutation has occurred.

It's worth noting that there are a reasonable number of likely
simplifications we can make past this. It isn't clear that we need to
keep the "LeafRefSCCs" around any more. But I've not removed that mostly
because I want this to be a more isolated change.

Differential Revision: https://reviews.llvm.org/D24219

llvm-svn: 281716
2016-09-16 10:20:17 +00:00
Sriraman Tallam
0bea49c555 [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Wei Mi
0092b54e43 Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.
This patch partially fixes PR10584. Correlated Value Propagation queries LVI
to check non-nullness for the pointer params of each callsite. If we know the
def of a param is an alloca instruction, we know it is non-null and can return
early from LVI. Similarly, CVP queries LVI to check whether the pointer for
each memory access is constant. If the def of the pointer is an alloca
instruction, we know it is not a constant pointer. These shortcuts reduce the
cost of CVP significantly.
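
A hedged sketch of the alloca shortcut described above; not the actual
LazyValueInfo code.

  #include "llvm/IR/Instructions.h"

  // An alloca yields a valid, non-null stack address, so a non-null query
  // can be answered without running the full LVI machinery.
  static bool allocaShortcutSaysNonNull(const llvm::Value *Ptr) {
    return llvm::isa<llvm::AllocaInst>(Ptr->stripPointerCasts());
  }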

Differential Revision: https://reviews.llvm.org/D18066

llvm-svn: 281586
2016-09-15 06:28:34 +00:00
Wei Mi
c1cf1864f4 Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch fixes PR30213. When expanding an expression based on a
ValueOffsetPair, if the value is of pointer type, we can only create a
getelementptr instead of a sub expression.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Andrea Di Biagio
5f72d10163 [ConstantFold] Improve the bitcast folding logic for constant vectors.
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.

Example:
  %cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>

On a little endian target, %cast could have been folded to:
  <4 x i32><i32 undef, i32 undef, i32 2, i32 0>

This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.

Differential Revision: https://reviews.llvm.org/D24301

llvm-svn: 281343
2016-09-13 14:50:47 +00:00
Philip Reames
d1d1d7910e [LVI] Complete the abstraction of the cache layer [NFCI]
Convert the previously introduced is-a relationship between the LVICache and LVIImpl classes into a has-a relationship, and hide all the implementation details of the cache from the lazy query layer.

The only slightly concerning change here is removing the addition of a queried block into the SeenBlock set in LVIImpl::getBlockValue.  As far as I can tell, this was effectively dead code.  I think it *used* to be the case that getCachedValueInfo wasn't const and might end up inserting elements into the cache during lookup.  That's no longer true and hasn't been for a while.  I did fix up the const usage to make that more obvious.

llvm-svn: 281272
2016-09-12 22:38:44 +00:00
Philip Reames
14f5e6cf51 [LVI] Sink a couple more cache manipulation routines into the cache itself [NFCI]
The only interesting bit here is the refactoring of the handle callback, and even that's pretty straightforward.

llvm-svn: 281267
2016-09-12 22:03:36 +00:00
Philip Reames
c53c4f4788 [LVI] Abstract out the actual cache logic [NFCI]
Separate the caching logic from the implementation of the lazy analysis.  For the moment, the lazy analysis impl has an is-a relationship with the cache; this will change to a has-a relationship shortly.  This was done in two steps merely to keep the changes simple and the diff understandable.

llvm-svn: 281266
2016-09-12 21:46:58 +00:00
Justin Lebar
2c71fc8bb7 Add handling of !invariant.load to PropagateMetadata.
Summary:
This will let e.g. the load/store vectorizer propagate this metadata
appropriately.

Reviewers: arsenm

Subscribers: tra, jholewinski, hfinkel, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23479

llvm-svn: 281153
2016-09-11 01:39:08 +00:00
Dehao Chen
eb60ada52c Do not widen loads for different variables in GVN.
Summary:
Widening loads in GVN happens too early and blocks other optimizations like PRE and LICM.

https://llvm.org/bugs/show_bug.cgi?id=29110

The SPECCPU2006 benchmark impact of this patch:

Reference: o2_nopatch
(1): o2_patched

           Benchmark             Base:Reference   (1)  
-------------------------------------------------------
spec/2006/fp/C++/444.namd                  25.2  -0.08%
spec/2006/fp/C++/447.dealII               45.92  +1.05%
spec/2006/fp/C++/450.soplex                41.7  -0.26%
spec/2006/fp/C++/453.povray               35.65  +1.68%
spec/2006/fp/C/433.milc                   23.79  +0.42%
spec/2006/fp/C/470.lbm                    41.88  -1.12%
spec/2006/fp/C/482.sphinx3                47.94  +1.67%
spec/2006/int/C++/471.omnetpp             22.46  -0.36%
spec/2006/int/C++/473.astar               21.19  +0.24%
spec/2006/int/C++/483.xalancbmk           36.09  -0.11%
spec/2006/int/C/400.perlbench             33.28  +1.35%
spec/2006/int/C/401.bzip2                 22.76  -0.04%
spec/2006/int/C/403.gcc                   32.36  +0.12%
spec/2006/int/C/429.mcf                   41.04  -0.41%
spec/2006/int/C/445.gobmk                 26.94  +0.04%
spec/2006/int/C/456.hmmer                  24.5  -0.20%
spec/2006/int/C/458.sjeng                    28  -0.46%
spec/2006/int/C/462.libquantum            55.25  +0.27%
spec/2006/int/C/464.h264ref               45.87  +0.72%

geometric mean                                   +0.23%

For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.

Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames

Subscribers: gberry, junbuml

Differential Revision: https://reviews.llvm.org/D24096

llvm-svn: 281074
2016-09-09 18:42:35 +00:00
Chandler Carruth
6a72044ca6 [LCG] Clean up and make NDEBUG verify calls more rigorous with
make_scope_exit now that we have that utility.

This makes the code much more clear and readable by isolating the check.
It also makes it easy to go through and make sure all the interesting
update routines have a start and end verify so we don't slowly let the
graph drift into an invalid state.

llvm-svn: 280619
2016-09-04 08:34:31 +00:00
Chandler Carruth
2f848e0e0f [LCG] An NFC refactoring to extract the logic for doing
a postorder-sequence based update after edge insertion into a generic
helper function.

This separates the SCC-specific logic into two fairly simple lambdas and
extracts the rest into a generic helper template function. I think this
is a net win on its own merits because it disentangles different pieces
of the algorithm. Now there is one place that does the two-step
partition to identify a set of newly connected components and at the
same time update the postorder sequence.

However, I'm also hoping to re-use this in an upcoming patch to update
a cached post-order sequence of RefSCCs when doing the analogous update
to the RefSCC graph, and I don't want to have two copies.

The diff is quite messy but this really is just moving things around and
making types generic rather than specific.

llvm-svn: 280618
2016-09-04 08:34:24 +00:00
Andrea Di Biagio
c30204881e Simplify code a bit. No functional change intended.
We don't need to call `GetCompareTy(LHS)` every single time true or false is
returned from SimplifyFCmpInst, as suggested by Sanjay in review D24142.

llvm-svn: 280491
2016-09-02 15:55:25 +00:00
Andrea Di Biagio
12f309496c [instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN.
This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.

An ordered comparison between a vector and a constant vector of NaN should
always be folded into a constant vector where each element is i1 false.
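
A hedged sketch of the type-correct fold: GetCompareTy is the InstSimplify
helper named in a nearby commit, declared here only so the sketch is
self-contained; the folding function name is illustrative.

  #include "llvm/IR/Constants.h"

  static llvm::Type *GetCompareTy(llvm::Value *Op); // InstSimplify helper

  static llvm::Value *foldOrderedFCmpWithNaNSketch(llvm::Value *LHS) {
    // ConstantInt::get on a vector of i1 returns an all-false splat, so the
    // folded value has the same type as the original fcmp result.
    return llvm::ConstantInt::get(GetCompareTy(LHS), 0);
  }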

Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".

This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.

Differential Revision: https://reviews.llvm.org/D24143

llvm-svn: 280488
2016-09-02 14:47:43 +00:00
Michael Zolotukhin
d2ab1fcb94 [LoopInfo] Add verification by recomputation.
Summary:
The current implementation of the LI verifier isn't ideal and fails to detect
some cases where LI is incorrect. For instance, it checks that all
recorded loops are in a correct form, but it has no way to check whether
there are other loops in the function that are not recorded in LI. This
patch adds a way to detect such bugs.
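
A hedged sketch of the verification-by-recomputation idea, assuming the usual
LoopInfo/DominatorTree APIs; this is illustrative and not the verifier added
by the patch.

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/Dominators.h"
  #include <cassert>

  void verifyLoopInfoByRecomputation(llvm::Function &F,
                                     const llvm::LoopInfo &LI) {
    llvm::DominatorTree DT(F);
    llvm::LoopInfo FreshLI(DT); // recompute loop structure from scratch
    for (llvm::BasicBlock &BB : F) {
      const llvm::Loop *Old = LI.getLoopFor(&BB);
      const llvm::Loop *New = FreshLI.getLoopFor(&BB);
      // A loop missing from the recorded LI shows up as a mismatch here.
      assert((Old == nullptr) == (New == nullptr) &&
             (!Old || Old->getHeader() == New->getHeader()) &&
             "recorded LoopInfo disagrees with recomputed LoopInfo");
    }
  }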

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits, silvas, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23437

llvm-svn: 280280
2016-08-31 19:26:19 +00:00