1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00
Commit Graph

141261 Commits

Author SHA1 Message Date
Justin Lebar
2de151b36e [StructurizeCFG] Refactor NearestCommonDominator.
Summary:
As far as I can tell, doing our own computations in
NearestCommonDominator is a false optimization -- DomTree will build up
what appears to be exactly this data when it decides it's worthwhile.
Moreover, by building the cache ourselves, we cannot take advantage of
the cache that the domtree might have available.

In addition, I am not convinced of the correctness of the original code.
In particular, setting ResultIndex = 1 on the first addBlock instead of
setting it to 0 is quite fishy.  Similarly, it's not clear to me that
setting IndexMap[Node] = 0 for every node as we walk up the tree finding
a common parent is correct.  But rather than ponder over these
questions, I'd rather just make the code do the obviously-correct thing.

This patch also changes the NearestCommonDominator API a bit, improving
the names and getting rid of the boolean parameter in addBlock -- see
http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html

Reviewers: arsenm

Subscribers: aemerson, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26998

llvm-svn: 288050
2016-11-28 18:49:59 +00:00
Simon Pilgrim
e91c88b619 [X86][SSE] Add initial support for combining (V)PMOVZX with shuffles.
llvm-svn: 288049
2016-11-28 17:58:19 +00:00
Adam Nemet
caf926700f [GVN, OptDiag] Include the value that is forwarded in load elimination
This requires some changes to the opt-diag API.  Hal and I have
discussed this at the Dev Meeting and came up with a streaming delimiter
(setExtraArgs) to solve this.

Arguments after this delimiter are only included in the optimization
records and not in the remarks printed in the compiler output.  (Note,
how in the test the content of the YAML file changes but the remarks on
the compiler output don't.)

This implements the green GVN message with a bug fix at line
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446

The fix is that now we properly include the constant value in the
message: "load of type i32 eliminated in favor of 7"

Differential Revision: https://reviews.llvm.org/D26489

llvm-svn: 288047
2016-11-28 17:45:34 +00:00
Adam Nemet
a4f167bba1 [GVN] Basic optimization remark support
Follow-on patches will add more interesting cases.

The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk.  This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430

Differential Revision: https://reviews.llvm.org/D26488

llvm-svn: 288046
2016-11-28 17:45:28 +00:00
Sanjay Patel
07d0147189 [x86] fix formatting; NFC
llvm-svn: 288045
2016-11-28 17:39:21 +00:00
Daniil Fukalov
4f50e2021b [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio
At the moment optimized tablegen is generated by LLVM_USE_HOST_TOOLS variable that is not set for Visual Sudio since LLVM_ENABLE_ASSERTIONS depends on CMAKE_BUILD_TYPE value that is not equal to "DEBUG" in case of Visual Studio soltion generation.

Modified to do not depend on LLVM_ENABLE_ASSERTIONS value in VS and Xcode cases

Reviewers: beanz

Subscribers: RKSimon, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D27135

llvm-svn: 288042
2016-11-28 17:12:09 +00:00
Adam Nemet
a1f45f8ec4 [LTO] Move finishOptimizationRemarks after codegen
This addresses the comment D26832.

llvm-svn: 288041
2016-11-28 16:51:49 +00:00
Simon Pilgrim
33c460873f [X86][SSE] Added support for combining bit-shifts with shuffles.
Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining.

Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations.

llvm-svn: 288040
2016-11-28 16:25:01 +00:00
Simon Pilgrim
af9f96f6b6 [X86][SSE] Added tests showing missed combines of shifts with shuffles.
llvm-svn: 288037
2016-11-28 15:50:39 +00:00
Daniel Cederman
517e67b3a9 Test commit
llvm-svn: 288036
2016-11-28 15:33:03 +00:00
Nirav Dave
c8a4e33292 Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor"
This reverts commit r287773 which caused issues with ppc64le builds.

llvm-svn: 288035
2016-11-28 14:30:29 +00:00
Ulrich Weigand
636aaaae0c [SystemZ] Fix build bot fallout from r288030
Remove unused variable that came in due to a copy-and-paste bug
and caused build bot failures.

llvm-svn: 288033
2016-11-28 14:24:14 +00:00
Ulrich Weigand
0120bda5a7 [SystemZ] Support execution hint instructions
This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P).  This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.

llvm-svn: 288031
2016-11-28 14:01:51 +00:00
Ulrich Weigand
64c39ae7f5 [SystemZ] Support load-and-trap instructions
This adds support for the instructions provided with the
load-and-trap facility.

llvm-svn: 288030
2016-11-28 13:59:22 +00:00
Ulrich Weigand
0487d18102 [SystemZ] Add remaining branch instructions
This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.

The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count).  Do
use it, however, it is necessary to also introduce a hew CHIMux pseudo
to allow comparisons of a 32-bit value agains a short immediate to go
into a high register as well (implemented via CHI/CIH).

This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.

llvm-svn: 288029
2016-11-28 13:40:08 +00:00
Ulrich Weigand
b483140318 [SystemZ] Improve use of conditional instructions
This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.

To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks.  It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.

Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.

Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa).  These are converted back to a branch sequence
after register allocation.  Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.

Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.

llvm-svn: 288028
2016-11-28 13:34:08 +00:00
James Molloy
d523d136d2 [InlineCost] Reduce inline thresholds to compensate for cost changes
In r286814, the algorithm for calculating inline costs changed. This
caused more inlining to take place which is especially apparent
in optsize and minsize modes.

As the cost calculation removed a skewed behaviour (we were inconsistent
about the cost of calls) it isn't possible to update the thresholds to
get exactly the same behaviour as before. However, this threshold change
accounts for the very common case where an inline candidate has no
calls within it. In this case, r286814 would inline around 5-6 more (IR)
instructions.

The changes to -Oz have been heavily benchmarked. The "obvious" value
for the inline threshold at -Oz is zero, but due to inaccuracies in the
inline heuristics this can actually cause code size increases due to
not inlining key thunk functions (that then disappear). Experimentally,
5 was the sweet spot for code size over the test-suite.

For -Os, this change removes the outlier results shown up by green dragon
(http://104.154.54.203/db_default/v4/nts/13248).

Fixes D26848.

llvm-svn: 288024
2016-11-28 11:07:37 +00:00
Chandler Carruth
9d50667842 [PM] Remove weird marking of invalidated analyses as "preserved".
This never made a lot of sense. They've been invalidated for one IR unit
but they aren't really preserved in any normal sense. It seemed like it
would be an elegant way of communicating to outer IR units that pass
managers and adaptors had already handled invalidation, but we've since
ended up adding sets that model this more clearly: we're now using
the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of
"preserving" invalidated analyses didn't work.

This patch moves to rely on that technique exclusively and removes the
cumbersome API aspect of updating the preserved set when doing
invalidation. This in turn will simplify a *number* of upcoming patches.

This has a side benefit of exposing a number of places where we were
failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This
patch fixes those, and with those fixes shouldn't change any observable
behavior.

llvm-svn: 288023
2016-11-28 10:42:21 +00:00
Davide Italiano
24691c7655 [ThreadPool] Rollback recent changes until I figure out the breakage.
llvm-svn: 288018
2016-11-28 09:17:12 +00:00
Davide Italiano
e12e914855 [ThreadPool] Remove outdated comment after r288016.
llvm-svn: 288017
2016-11-28 08:57:05 +00:00
Davide Italiano
ada55c7225 [ThreadPool] Simplify the interface. NFCI.
The callers don't use the return value. Found by Michael
Spencer.

llvm-svn: 288016
2016-11-28 08:53:41 +00:00
Mehdi Amini
00d03454ba Revert "Improve error handling in YAML parsing"
This reverts commit r288014, the unittest isn't passing

llvm-svn: 288015
2016-11-28 04:57:04 +00:00
Mehdi Amini
9c02412ec8 Improve error handling in YAML parsing
Some scanner errors were not checked and reported by the parser.

Fix PR30934

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D26419

llvm-svn: 288014
2016-11-28 04:44:13 +00:00
Chandler Carruth
49ee0f2c30 [PM] Add an ASCII-art diagram for the call graph in the CGSCC unit test.
No functionality changed.

llvm-svn: 288013
2016-11-28 03:40:33 +00:00
Craig Topper
624fc12e44 [X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output.
llvm-svn: 288011
2016-11-27 21:37:04 +00:00
Craig Topper
bf372031c2 [X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns.
llvm-svn: 288010
2016-11-27 21:37:02 +00:00
Craig Topper
9440849602 [X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions.
llvm-svn: 288009
2016-11-27 21:37:00 +00:00
Craig Topper
485d41b7f7 [X86][FMA4] Add test cases to demonstrate missed folding opportunities for FMA4 scalar intrinsics.
llvm-svn: 288008
2016-11-27 21:36:58 +00:00
Craig Topper
ba29d9ef0b [X86] Add SHL by 1 to the load folding tables.
I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.

llvm-svn: 288007
2016-11-27 21:36:54 +00:00
Simon Pilgrim
38bff0f1fd [X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts
llvm-svn: 288006
2016-11-27 21:08:19 +00:00
Sanjay Patel
b2bc8129f9 [InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
2016-11-27 21:07:28 +00:00
Craig Topper
1afaa3a66e [AVX-512] Add integer and fp unpck instructions to load folding tables.
llvm-svn: 288004
2016-11-27 19:51:41 +00:00
Simon Pilgrim
ba6a7de0c4 [X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI.
Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit).

llvm-svn: 288003
2016-11-27 19:28:39 +00:00
Craig Topper
9da20958ad [X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size.
If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist.

I probably missed some instructions, but this should be a large portion of them.

llvm-svn: 288001
2016-11-27 18:51:13 +00:00
Simon Pilgrim
1225960a5c [X86][SSE] Added tests showing missed combines for shuffle to shifts.
llvm-svn: 288000
2016-11-27 18:25:02 +00:00
Sanjay Patel
7b435e81e5 add tests to show missing analysis; NFC
llvm-svn: 287998
2016-11-27 15:54:45 +00:00
Sanjay Patel
1e0354882f fix formatting; NFC
llvm-svn: 287997
2016-11-27 15:53:48 +00:00
Craig Topper
b4246740bf [AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287995
2016-11-27 08:55:31 +00:00
Mohammad Shahid
3137ffb894 [SLP] Add new and update existing lit testfor providing more context to incoming patch for vectorization of jumbled load
Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9
llvm-svn: 287992
2016-11-27 03:35:31 +00:00
Craig Topper
b2d170fa15 [X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction.
Most of these are the SSE4.1 PMOVZX/PMOVSX instructions which all read less than 128-bits. The only other was PMOVUPD which by definition is an unaligned load.

llvm-svn: 287991
2016-11-27 01:52:51 +00:00
Craig Topper
6e69c55c53 [X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold.
llvm-svn: 287987
2016-11-26 18:43:26 +00:00
Craig Topper
f0aaae1a9c [X86] Fix the zero extending load detection in X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold.
Previously we were passing the SCALAR_TO_VECTOR node.

llvm-svn: 287986
2016-11-26 18:43:24 +00:00
Craig Topper
1f1f3db883 [X86] Simplify control flow. NFCI
llvm-svn: 287985
2016-11-26 18:43:21 +00:00
Craig Topper
e254e42669 [X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times.
Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied.

Reviewers: RKSimon, zvi, delena, spatel

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D26790

llvm-svn: 287983
2016-11-26 17:29:25 +00:00
Sanjay Patel
3bbd46c140 [InstCombine] add test to show missing vector optimization; NFC
llvm-svn: 287982
2016-11-26 16:13:23 +00:00
Sanjay Patel
2f5991c226 [InstCombine] don't drop metadata in FoldOpIntoSelect()
llvm-svn: 287980
2016-11-26 15:23:20 +00:00
Sanjay Patel
29cd3b551c add optional param to copy metadata when creating selects; NFC
There are other spots where we can use this; we're currently dropping 
metadata in some places, and there are proposed changes where we will
want to propagate metadata.

IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.

llvm-svn: 287976
2016-11-26 15:01:59 +00:00
Craig Topper
80de3aea41 [AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287975
2016-11-26 08:21:52 +00:00
Craig Topper
34f7b485a5 [AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables.
llvm-svn: 287974
2016-11-26 08:21:48 +00:00
Craig Topper
3c44468f76 [AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables.
llvm-svn: 287972
2016-11-26 07:21:00 +00:00