1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 13:33:37 +02:00
Commit Graph

115772 Commits

Author SHA1 Message Date
Jingyue Wu
a3bb69f6c6 Divergence analysis for GPU programs
Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.

Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll

Reviewers: resistor, hfinkel, eliben, meheff, jholewinski

Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8576

llvm-svn: 234567
2015-04-10 05:03:50 +00:00
David Majnemer
7ed6f9b08e [WinEHPrepare] Don't rely on the order of IR
The IPToState table must be emitted after we have generated labels for
all functions in the table.  Don't rely on the order of the list of
globals.  Instead, utilize WinEHFuncInfo to tell us how many catch
handlers we expect to outline.  Once we know we've visited all the catch
handlers, emit the cppxdata.

llvm-svn: 234566
2015-04-10 04:56:17 +00:00
Hal Finkel
79f36597b9 [PowerPC] Don't crash on PPC32 i64 fp_to_uint on modern cores
When we have an instruction for this (and, thus, don't generate a runtime
call), we need to custom type legalize this (in a trivial way, just as we do
for fp_to_sint).

Fixes PR23173.

llvm-svn: 234561
2015-04-10 03:39:00 +00:00
Ahmed Bougacha
9e6b267c41 [AArch64] Promote f16 operations to f32.
For the most common ones (such as fadd), we already did the promotion.
Do the same thing for all the others.

Currently, we'll just crash/assert on all these operations, as
there's no hardware or libcall support whatsoever.

f16 (half) is specified as an interchange - not arithmetic - format,
and is expected to be promoted to single-precision for arithmetic
operations.

While there, teach the legalizer about promoting some of the (mostly
floating-point) operations that we never needed before.

Differential Revision: http://reviews.llvm.org/D8648
See related discussion on the thread for: http://reviews.llvm.org/D8755

llvm-svn: 234550
2015-04-10 00:08:48 +00:00
Nemanja Ivanovic
5c0e16778c Add LLVM support for remaining integer divide and permute instructions from ISA 2.06
This is the patch corresponding to review:
http://reviews.llvm.org/D8406

It adds some missing instructions from ISA 2.06 to the PPC back end.

llvm-svn: 234546
2015-04-09 23:54:37 +00:00
Rafael Espindola
a7ececf04c Simplify use of formatted_raw_ostream.
formatted_raw_ostream is a wrapper over another stream to add column and line
number tracking.

It is used only for asm printing.

This patch moves the its creation down to where we know we are printing
assembly. This has the following advantages:

* Simpler lifetime management: std::unique_ptr
* We don't compute column and line number of object files :-)

llvm-svn: 234535
2015-04-09 21:06:08 +00:00
Ahmed Bougacha
d2a7cedeca [CodeGen] Combine concat_vector of trunc'd scalar to scalar_to_vector.
We already do:
  concat_vectors(scalar, undef) -> scalar_to_vector(scalar)
When the scalar is legal.
When it's not, but is a truncated legal scalar, we can also do:
  concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar)
Which is equivalent, since the upper lanes are undef anyway.
While there, teach the combine to look at more than 2 operands.

Differential Revision: http://reviews.llvm.org/D8883

llvm-svn: 234530
2015-04-09 20:04:47 +00:00
Juergen Ributzka
6f558fd68f [AArch64][FastISel] Fix integer extend optimization.
The integer extend optimization tries to fold the extend into the load
instruction. This requires us to identify if the extend has already been
emitted or not and act accordingly on it.

The check that was originally performed for this was not sufficient. Besides
checking the ValueMap for a mapped register we also need to check if the
virtual register has already an associated machine instruction that defines it.

This fixes rdar://problem/20470788.

llvm-svn: 234529
2015-04-09 20:00:46 +00:00
Eric Christopher
d618af519a Remove duplicated code and consolidate initializers.
llvm-svn: 234525
2015-04-09 19:20:37 +00:00
Rafael Espindola
adc15d13f8 clang-format bits of code to make a followup patch easy to read.
llvm-svn: 234519
2015-04-09 18:32:58 +00:00
Rafael Espindola
2d125b4495 Revert "Refactoring and enhancement to FMA combine."
This reverts commit r234513. It was failing on the bots.

llvm-svn: 234518
2015-04-09 18:29:32 +00:00
Rafael Espindola
58833403e8 Define a function with "... llvm::func...".
Using this instead of
namespace llvm {
  func...
}

Has the advantage that the build fails with a compiler error if it gets out
of sync with the .h file.

llvm-svn: 234515
2015-04-09 18:08:15 +00:00
Olivier Sallenave
e19fcb1120 Refactoring and enhancement to FMA combine.
llvm-svn: 234513
2015-04-09 17:55:26 +00:00
Duncan P. N. Exon Smith
c16b798987 IR: Preserve use-list order by default in bitcode
Pull the `-preserve-*-use-list-order` flags out of "experimental" mode,
and preserve use-list order by default when serializing to bitcode.

llvm-svn: 234510
2015-04-09 17:41:20 +00:00
Rafael Espindola
a957410bfc Use a raw_svector_ostream instead of a raw_string_ostream.
It saves a bit of copying.

llvm-svn: 234507
2015-04-09 17:16:25 +00:00
Rafael Espindola
77bf4fdaa6 Don't repeat name in comment. NFC.
llvm-svn: 234506
2015-04-09 17:10:57 +00:00
Jingyue Wu
3534f1845e [NFC] add more comments for SLSR
llvm-svn: 234505
2015-04-09 17:04:28 +00:00
Rafael Espindola
00b7d1173b Misc cleanup. NFC.
These were lost when I reverted the raw_ostream changes.

llvm-svn: 234504
2015-04-09 16:59:07 +00:00
Rafael Espindola
eff0a4f38c clang-format. NFC.
llvm-svn: 234502
2015-04-09 16:43:22 +00:00
Rafael Espindola
81f709962b clang-format this constructor.
llvm-svn: 234501
2015-04-09 16:37:11 +00:00
Rafael Espindola
3818b474b0 Don't repeat names in comments.
llvm-svn: 234498
2015-04-09 16:06:26 +00:00
Rafael Espindola
95aca09bad Use implicit calls to parent constructor. NFC.
llvm-svn: 234497
2015-04-09 16:00:24 +00:00
Rafael Espindola
edd11eb538 This reverts commit r234460 and r234461.
Revert "Add classof implementations to the raw_ostream classes."
Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream."

The underlying issue can be fixed without classof.

llvm-svn: 234495
2015-04-09 15:54:59 +00:00
Javed Absar
b2e5d643a8 [ARM] support for Cortex-R4/R4F
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.

llvm-svn: 234486
2015-04-09 14:07:28 +00:00
Rafael Espindola
4fb23cdd75 Nothing inherits from the asm streamer.
Make that explicit and remove protected:

llvm-svn: 234484
2015-04-09 13:04:20 +00:00
Toma Tabacu
b254a682d9 [mips] Refactor saved-registers bitmask creation in MipsAsmPrinter::printSavedRegsBitmask. NFC.
Summary:
Make the code more readable by fusing the for-loops together and explicitly checking for each register class.

Also, this version is more straightforward because it doesn't assume that FPU registers always come before CPU registers in the CalleeSavedInfo vector.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8033

llvm-svn: 234475
2015-04-09 10:54:16 +00:00
Kristof Beyls
c065d4f7d5 [AArch64] Add support for dynamic stack alignment
Differential Revision: http://reviews.llvm.org/D8876

llvm-svn: 234471
2015-04-09 08:49:47 +00:00
Lang Hames
0020ebeb89 [AArch64] Remove redundant -march option. Also fix a think-o from r234462.
llvm-svn: 234467
2015-04-09 05:34:57 +00:00
Nick Lewycky
e6736038da Not all triples put _ before function names. Specify a triple to make this test pass on Linux.
llvm-svn: 234466
2015-04-09 05:31:32 +00:00
Craig Topper
0cf1950f24 Use SmallVector instead of std::vector for uniquing X86 disassembler operand sets. The number of operands is a small fixed size.
llvm-svn: 234465
2015-04-09 04:08:48 +00:00
Craig Topper
a5a636a212 Simplify some printing code by combining new lines onto previous strings. Don't work so hard not to print a comma on the last entry of an array.
llvm-svn: 234464
2015-04-09 04:08:46 +00:00
Craig Topper
ea22f9e938 Don't convert enum to strings just to put them in the uniquing map. Use the enum directly. Only convert to a string for printing.
llvm-svn: 234463
2015-04-09 04:08:42 +00:00
Lang Hames
360efe3451 [AArch64] Teach AArch64TargetLowering::getOptimalMemOpType to consider alignment
restrictions when choosing a type for small-memcpy inlining in
SelectionDAGBuilder.

This ensures that the loads and stores output for the memcpy won't be further
expanded during legalization, which would cause the total number of instructions
for the memcpy to exceed (often significantly) the inlining thresholds.

<rdar://problem/17829180>

llvm-svn: 234462
2015-04-09 03:40:33 +00:00
Rafael Espindola
4b3ef31279 Use the cast machinery to remove dummy uses of formatted_raw_ostream.
If we know we are producing an object, we don't need to wrap the stream
in a formatted_raw_ostream anymore.

llvm-svn: 234461
2015-04-09 02:28:12 +00:00
Rafael Espindola
0c8f021b8e Add classof implementations to the raw_ostream classes.
More uses to follow in a another patch.

llvm-svn: 234460
2015-04-09 02:10:28 +00:00
Rafael Espindola
f745768186 Delete unused constructor.
llvm-svn: 234459
2015-04-09 01:11:26 +00:00
Eric Christopher
30f86366b7 Update comment to refer to software floating point rather than
a local variable.

llvm-svn: 234457
2015-04-09 00:14:49 +00:00
Akira Hatanaka
8e8f03d803 Use option -march instead of -mtriple to avoid overconditionalizing the test.
This fixes r234439, which was committed to fix the test failures caused by
r234430.

llvm-svn: 234451
2015-04-08 23:02:45 +00:00
Manman Ren
7ff426c139 [LTO] do not run internalize pass from compileOptimized.
The input to compileOptimized is already optimized and internalized, so remove
internalize pass from compileOptimized.

rdar://20227235

llvm-svn: 234446
2015-04-08 22:02:11 +00:00
Akira Hatanaka
a1ad440430 Pass -mtriple to llc to appease buildbot.
This fixes the test case I committed in r234430.

llvm-svn: 234439
2015-04-08 21:30:48 +00:00
Andrew Kaylor
28081cec7b Formmatting correction
llvm-svn: 234438
2015-04-08 21:22:46 +00:00
Andrew Kaylor
c8c7b73c5c [WinEH] Minor bug fixes.
Fixed insert point for allocas created for demoted values.
Clear the nested landing pad list after it has been processed.

llvm-svn: 234433
2015-04-08 20:57:22 +00:00
Akira Hatanaka
5d29050f58 [DAGCombine] Fix a bug in MergeConsecutiveStores.
The bug manifests when there are two loads and two stores chained as follows in
a DAG,

(ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32)

and the stores' values are extracted from the preceding vector loads.

MergeConsecutiveStores would replace the first store in the chain with the
merged vector store, which would create a cycle between the merged store node
and the last load node that appears in the chain.

This commits fixes the bug by replacing the last store in the chain instead.

rdar://problem/20275084

Differential Revision: http://reviews.llvm.org/D8849

llvm-svn: 234430
2015-04-08 20:34:53 +00:00
Peter Collingbourne
84eea7b73b Go bindings: make various DIBuilder arguments optional.
r234262 changed some code in DIBuilderBindings.cpp to use the unwrap function
to unwrap debug metadata. The problem with this is that unwrap asserts that
its argument is non-null, which is not what we want in a number of places
in DIBuilder where the argument is optional. This change makes certain
arguments optional by adding null checks in places where it is required,
fixing the llgo build.

llvm-svn: 234428
2015-04-08 20:18:57 +00:00
Rafael Espindola
a9b13dc5bd Don't repeat names in comments.
llvm-svn: 234427
2015-04-08 20:16:23 +00:00
Rafael Espindola
43cb6b4775 Remove unused variable.
llvm-svn: 234426
2015-04-08 20:04:20 +00:00
Cameron Zwarich
f48f486c1d Eliminate O(n^2) worst-case behavior in SSA construction
The code uses a priority queue and a worklist, which share the same
visited set, but the visited set is only updated when inserting into
the priority queue. Instead, switch to using separate visited sets
for the priority queue and worklist.

llvm-svn: 234425
2015-04-08 18:26:20 +00:00
Adam Nemet
23b1ef3354 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
(Re-apply r234361 with a fix and a testcase for PR23157)

Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234424
2015-04-08 17:48:40 +00:00
Scott Douglass
5399459726 [ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math
Because -menable-no-nans causes fcmp conditions to be rewritten
without 'o' or 'u' the recognition code in needs to cope. Also
extended it to handle 'le' and 'ge.

Differential Revision: http://reviews.llvm.org/D8725

llvm-svn: 234421
2015-04-08 17:18:28 +00:00
Sanjay Patel
ff4da24886 fixed to test features, not CPU models
llvm-svn: 234413
2015-04-08 16:51:42 +00:00