1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

129156 Commits

Author SHA1 Message Date
Derek Schuff
9f833b6097 Introduce MachineFunctionProperties and the AllVRegsAllocated property
MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the delcarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

llvm-svn: 264593
2016-03-28 17:05:30 +00:00
Hemant Kulkarni
89e38a8027 [llvm-size] Implement --common option
Differential Revision: http://reviews.llvm.org/D16820

llvm-svn: 264591
2016-03-28 16:48:10 +00:00
Vedant Kumar
efb5313162 Revert "[PGO] Fix name encoding for ObjC-like functions"
This reverts commit r264587. Reverting to investigate 6 unexpected
failures on the ppc bot:

http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/2822

llvm-svn: 264590
2016-03-28 16:14:07 +00:00
Tom Stellard
fe96362048 AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions
Summary:
This helps prevent load clustering from drastically increasing register
pressure by trying to cluster 4 SMRDx8 loads together.  The limit of 16
bytes was chosen, because it seems like that was the original intent
of setting the limit to 4 instructions, but more analysis could show
that a different limit is better.

This fixes yields small decreases in register usage with shader-db, but
also helps avoid a large increase in register usage when lane mask
tracking is enabled in the machine scheduler, because lane mask tracking
enables more opportunities for load clustering.

shader-db stats:

2379 shaders in 477 tests
Totals:
SGPRS: 49744 -> 48600 (-2.30 %)
VGPRS: 34120 -> 34076 (-0.13 %)
Code Size: 1282888 -> 1283184 (0.02 %) bytes
LDS: 28 -> 28 (0.00 %) blocks
Scratch: 495616 -> 492544 (-0.62 %) bytes per wave
Max Waves: 6843 -> 6853 (0.15 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18451

llvm-svn: 264589
2016-03-28 16:10:13 +00:00
Davide Italiano
09b1f874a9 [SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a').
llvm-svn: 264588
2016-03-28 15:54:01 +00:00
Vedant Kumar
af89620a83 [PGO] Fix name encoding for ObjC-like functions
Function names in ObjC can have spaces in them. This interacts poorly
with name compression, which uses spaces to separate PGO names. Fix the
issue by using a different separator and update a test.

I chose "\01" as the separator because 1) it's non-printable, 2) we
strip it from PGO names, and 3) it's the next natural choice once "\00"
is discarded (that one's overloaded).

Differential Revision: http://reviews.llvm.org/D18516

llvm-svn: 264587
2016-03-28 15:52:08 +00:00
Vedant Kumar
8c50bbf8c8 [Coverage] Strip <unknown> from PGO names if no filenames are available
Patch suggested by David Li!

llvm-svn: 264586
2016-03-28 15:49:08 +00:00
Krzysztof Parzyszek
5e7b5fce65 [Hexagon] Improve handling of unaligned vector loads and stores
llvm-svn: 264584
2016-03-28 15:43:03 +00:00
James Y Knight
387d85aaa9 NFC: skip FenceInst up-front in AtomicExpandPass.
llvm-svn: 264583
2016-03-28 15:05:30 +00:00
Krzysztof Parzyszek
bd3e938f88 [Hexagon] Only use restore functions for single register at -Oz
llvm-svn: 264581
2016-03-28 14:52:21 +00:00
Krzysztof Parzyszek
fc357e83ee [Hexagon] Speed up frame lowering when no optimizations are enabled
- Do not optimize stack slots in optnone functions.
- Get aligned-base register from HexagonMachineFunctionInfo instead of
  looking for ALIGNA instruction in the function's body.

llvm-svn: 264580
2016-03-28 14:42:03 +00:00
Douglas Katzman
990a4765e9 Sparc: silently ignore .proc assembler directive
Differential Revision: http://reviews.llvm.org/D18463

llvm-svn: 264579
2016-03-28 14:00:11 +00:00
Jacques Pienaar
9af311f3de [lanai] Add Lanai backend.
Add the Lanai backend to lib/Target.

General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).

Differential Revision: http://reviews.llvm.org/D17011

llvm-svn: 264578
2016-03-28 13:09:54 +00:00
Hal Finkel
d84de52bea [SROA] Fix typo in comment
llvm-svn: 264573
2016-03-28 11:23:21 +00:00
Hal Finkel
1e50d94ddc C++11 is required, remove some preprocessor checks for it
We require C++11 to build, so remove a few remaining preprocessor checks for
'__cplusplus >= 201103L'. This should always be true.

llvm-svn: 264572
2016-03-28 11:13:03 +00:00
Chuang-Yu Cheng
946c8b7db0 [Power9] Implement new altivec instructions: bcd* series
This patch implements the following altivec instructions:

- Decimal Convert From/to National/Zoned/Signed-QWord:
    bcdcfn. bcdcfz. bcdctn. bcdctz. bcdcfsq. bcdctsq.

- Decimal Copy-Sign/Set-Sign:
    bcdcpsgn. bcdsetsgn.

- Decimal Shift/Unsigned-Shift/Shift-and-Round:
    bcds. bcdus. bcdsr.

- Decimal (Unsigned) Truncate:
    bcdtrunc. bcdutrunc.

Total 13 instructions

Thanks Amehsan's advice! Thanks Kit's great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D17838

llvm-svn: 264568
2016-03-28 09:04:23 +00:00
Chuang-Yu Cheng
86334cecad [Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat
This change implements the following vsx instructions:

- Scalar Insert/Extract
    xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp

- Vector Insert/Extract
    xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    xxextractuw xxinsertw

- Scalar/Vector Test Data Class
    xststdcdp xststdcsp xststdcqp
    xvtstdcdp xvtstdcsp

- Maximum/Minimum
    xsmaxcdp xsmaxjdp
    xsmincdp xsminjdp

- Vector Byte-Reverse/Permute/Splat
    xxbrd xxbrh xxbrq xxbrw
    xxperm xxpermr
    xxspltib

30 instructions

Thanks Nemanja for invaluable discussion! Thanks Kit's great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16842

llvm-svn: 264567
2016-03-28 08:34:28 +00:00
Elena Demikhovsky
9018ddedab AVX-512: Fixed ICMP instruction selection for i1 operands
ICMP instruction selection fails on SKX and KNL for i1 operand.
I use XOR to resolve:
(A == B) is equivalent to (A xor B) == 0

Differential Revision: http://reviews.llvm.org/D18511

llvm-svn: 264566
2016-03-28 07:47:58 +00:00
Chuang-Yu Cheng
f63e6b8f62 [Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic
This change implements the following vsx instructions:

- quad-precision move
    xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp

- quad-precision fp-arithmetic
    xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
    xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)

22 instructions

Thanks Nemanja and Kit for careful review and invaluable discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16110

llvm-svn: 264565
2016-03-28 07:38:01 +00:00
NAKAMURA Takumi
fc74c1202b llvm/test/Transforms/FunctionImport/funcimport.ll: -stats REQUIRES +Asserts.
llvm-svn: 264561
2016-03-28 02:14:49 +00:00
Vedant Kumar
2cee1f1ff7 [Coverage] Fix the way we load "<unknown>:func" records
When emitting coverage mappings for functions with local linkage and an
unknown filename, we use "<unknown>:func" for the PGO function name. The
problem is that we don't strip "<unknown>" from the name when loading
coverage data, like we do for other file names. Fix that and add a test.

llvm-svn: 264559
2016-03-28 01:16:12 +00:00
Duncan P. N. Exon Smith
1350530482 BitcodeWriter: Replace dead code with an assertion, NFC
The caller of ValueEnumerator::EnumerateOperandType never sends in
metadata.  Assert that, and remove the unnecessary logic.

llvm-svn: 264558
2016-03-28 00:03:12 +00:00
Duncan P. N. Exon Smith
253c249a0b BitcodeWriter: Reuse writeMetadataRecords, NFC
Change writeFunctionMetadata to call writeMetadataRecords.  For now
there's no functionality change, but makes it easy to serialize other
types of metadata in the function block in the future.

llvm-svn: 264557
2016-03-27 23:59:32 +00:00
Duncan P. N. Exon Smith
b67ed646a7 BitcodeWriter: Rename some functions for consistency, NFC
To match writeMetadataRecords, writeNamedMetadata and
writeMetadataStrings, change:

    WriteModuleMetadata        => writeModuleMetadata
    WriteFunctionLocalMetadata => writeFunctionMetadata
    Write##CLASS               => write##CLASS

The only major change is "FunctionLocal" => "Function".  The point is to
be less specific, in preparation for emitting normal metadata records
inside function metadata blocks (currently we only emit
`LocalAsMetadata` there).

llvm-svn: 264556
2016-03-27 23:56:04 +00:00
Duncan P. N. Exon Smith
b56d257cb6 BitcodeWriter: Split out writeMetadataRecords, NFC
Besides being a nice cleanup, this is preparation for reusing the code
in function metadata blocks.

llvm-svn: 264555
2016-03-27 23:53:30 +00:00
Duncan P. N. Exon Smith
768237a65a BitcodeWriter: Restructure WriteFunctionLocalMetadata, NFC
Use an early return to simplify logic.

llvm-svn: 264554
2016-03-27 23:38:36 +00:00
Duncan P. N. Exon Smith
2aefcc81e5 Bitcode: Fix MSVC bot failure from r264549
make_unique => llvm::make_unique

llvm-svn: 264553
2016-03-27 23:36:55 +00:00
Duncan P. N. Exon Smith
66701f0f24 BitcodeWriter: Simplify tracking of function-local metadata, NFC
We don't really need a separate vector here; instead, point at a range
inside the main MDs array.  This matches how r264551 references the
ranges of strings and non-strings.

llvm-svn: 264552
2016-03-27 23:22:31 +00:00
Duncan P. N. Exon Smith
1fbdccf334 Reapply ~"Bitcode: Collect all MDString records into a single blob"
Spiritually reapply commit r264409 (reverted in r264410), albeit with a
bit of a redesign.

Firstly, avoid splitting the big blob into multiple chunks of strings.

r264409 imposed an arbitrary limit to avoid a massive allocation on the
shared 'Record' SmallVector.  The bug with that commit only reproduced
when there were more than "chunk-size" strings.  A test for this would
have been useless long-term, since we're liable to adjust the chunk-size
in the future.

Thus, eliminate the motivation for chunk-ing by storing the string sizes
in the blob.  Here's the layout:

    vbr6: # of strings
    vbr6: offset-to-blob
    blob:
       [vbr6]: string lengths
       [char]: concatenated strings

Secondly, make the output of llvm-bcanalyzer readable.

I noticed when debugging r264409 that llvm-bcanalyzer was outputting a
massive blob all in one line.  Past a small number, the strings were
impossible to split in my head, and the lines were way too long.  This
version adds support in llvm-bcanalyzer for pretty-printing.

    <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 {
      'abc'
      'def'
      'ghi'
    }

From the original commit:

Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this
should (a) slightly reduce bitcode size, since there is less record
overhead, and (b) greatly improve reading speed, since blobs are super
cheap to deserialize.

llvm-svn: 264551
2016-03-27 23:17:54 +00:00
Duncan P. N. Exon Smith
4032b70bb8 BitcodeWriter: Simplify and test writing blobs, NFC
Split helper out of EmitRecordWithAbbrevImpl called emitBlob to reduce
code duplication, and add a few tests for it.

No functionality change intended.

llvm-svn: 264550
2016-03-27 23:04:04 +00:00
Duncan P. N. Exon Smith
a6ffc80357 Support: Implement StreamingMemoryObject::getPointer
The implementation is fairly obvious.  This is preparation for using
some blobs in bitcode.

For clarity (and perhaps future-proofing?), I moved the call to
JumpToBit in BitstreamCursor::readRecord ahead of calling
MemoryObject::getPointer, since JumpToBit can theoretically (a) read
bytes, which (b) invalidates the blob pointer.

This isn't strictly necessary the two memory objects we have:

  - The return of RawMemoryObject::getPointer is valid until the memory
    object is destroyed.

  - StreamingMemoryObject::getPointer is valid until the next chunk is
    read from the stream.  Since the JumpToBit call is only going ahead
    to a word boundary, we'll never load another chunk.

However, reordering makes it clear by inspection that the blob returned
by BitstreamCursor::readRecord will be valid.

I added some tests for StreamingMemoryObject::getPointer and
BitstreamCursor::readRecord.

llvm-svn: 264549
2016-03-27 23:00:59 +00:00
Duncan P. N. Exon Smith
7eb2cc2c0b Support: Move StreamingMemoryObject{,Test}.cpp, NFC
Change the filename to indicate this is a test, rename the tests, move
them into an anonymous namespace, and rename some variables.  All to
match our usual style before making further changes.

llvm-svn: 264548
2016-03-27 22:55:19 +00:00
Duncan P. N. Exon Smith
9f6583a3aa Bitcode: Add SimpleBitstreamCursor::setArtificialByteLimit
Allow users of SimpleBitstreamCursor to limit the number of bytes
available to the cursor.  This is preparation for instantiating a cursor
that isn't allowed to load more bytes from a StreamingMemoryObject (just
move around the ones already-loaded).

llvm-svn: 264547
2016-03-27 22:49:32 +00:00
Duncan P. N. Exon Smith
0f4740991b Bitcode: Add SimpleBitstreamCursor::getPointerToByte, etc.
Add API to SimpleBitstreamCursor to allow users to translate between
byte addresses and pointers.

  - jumpToPointer: move the bit position to a particular pointer.
  - getPointerToByte: get the pointer for a particular byte.
  - getPointerToBit: get the pointer for the byte of the current bit.
  - getCurrentByteNo: convenience function for assertions and tests.

Mainly adds unit tests (getPointerToBit/Byte already has a use), but
also preparation for eventually using jumpToPointer.

llvm-svn: 264546
2016-03-27 22:45:25 +00:00
Duncan P. N. Exon Smith
63b6fead6b Bitcode: Split out SimpleBitstreamCursor
Split out SimpleBitstreamCursor from BitstreamCursor, which is a
lower-level cursor with no knowledge of bitcode blocks, abbreviations,
or records.  It just knows how to read bits and navigate the stream.

This is mainly organizational, to separate the API for manipulating raw
bits from that for bitcode concepts like Record and Block.

llvm-svn: 264545
2016-03-27 22:40:55 +00:00
JF Bastien
c3352415f0 Revert "isPodLike: more precise"
This reverts commit c45f2afac5d6855a4804456a0f718563dc47ada0.

Looks like it may be causing a failure, I'll revert for now.

                 from
lib/CodeGen/AsmPrinter/DwarfDebug.cpp:14:
/usr/include/c++/4.9.2/bits/stl_pair.h: In instantiation of
                 'std::pair<_T1, _T2>& std::pair<_T1,
                 _T2>::operator=(const std::pair<_T1, _T2>&) [with _T1 =
                 std::unique_ptr<llvm::DwarfTypeUnit>; _T2 = const
                 llvm::DICompositeType*]':

/usr/include/c++/4.9.2/bits/stl_pair.h:160:8: error: use of deleted
function 'std::unique_ptr<_Tp, _Dp>& std::unique_ptr<_Tp,
_Dp>::operator=(const std::unique_ptr<_Tp, _Dp>&) [with _Tp =
llvm::DwarfTypeUnit; _Dp = std::default_delete<llvm::DwarfTypeUnit>]'
  first = __p.first;
        ^

llvm-svn: 264544
2016-03-27 20:50:05 +00:00
Sanjay Patel
4565d3e5ee workaround for an IR variable named %.
(which SimplifyCFG can produce...)

llvm-svn: 264543
2016-03-27 20:44:35 +00:00
Sanjay Patel
ae386b1b8a add scrubber for excessive leading whitespace
llvm-svn: 264542
2016-03-27 20:43:02 +00:00
JF Bastien
a477c84400 isPodLike: more precise
I tried to use isPodLike in:
  http://reviews.llvm.org/D18483

That failed because !is_class is too strict on platforms which don't yet
have is_trivially_copyable. This update tries to make isPodLike smarter
for platforms which don't have is_trivially_copyable, and AFAICT it
Should Just Work on all of them. I'll revert if the bots disagree with
me.

I'll also rename isPodLike to isTriviallyCopyable if this all works out,
since that's what the standard calls it now and one day we'll be rid of
isPodLike.

llvm-svn: 264541
2016-03-27 20:32:21 +00:00
Teresa Johnson
cdaff4788b Use DAG check to try to appease bot
Try to appease
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/34772. This was
the only check that didn't use DAG and it wasn't found.

llvm-svn: 264538
2016-03-27 15:36:43 +00:00
Teresa Johnson
e0ea5e3cb1 [ThinLTO] Add optional import message and statistics
Summary:
Add a statistic to count the number of imported functions. Also, add a
new -print-imports option to emit a trace of imported functions, that
works even for an NDEBUG build.

Note that emitOptimizationRemark does not work for the above printing as
it expects a Function object and DebugLoc, neither of which we have
with summary-based importing.

This is part 2 of D18487, the first part was committed separately as
r264536.

Reviewers: joker.eph

Subscribers: llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18487

llvm-svn: 264537
2016-03-27 15:27:30 +00:00
Teresa Johnson
0fe411fafb [ThinLTO] Don't try to import alias unless aliasee can be imported
With r264503, aliases are now being added to the GlobalsToImport set
even when their aliasees can't be imported due to their linkage type.
While the importing worked correctly (the aliases imported as
declarations) due to the logic in doImportAsDefinition, there is no
point to adding them to the GlobalsToImport set.

Additionally, with D18487 it was resulting in incorrectly printing a
message indicating that the alias was imported.

To avoid this, delay adding aliases to the GlobalsToImport set until
after the linkage type of the aliasee is checked.

This patch is part of D18487.

llvm-svn: 264536
2016-03-27 15:01:11 +00:00
Hal Finkel
10aaab9a43 [PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based loop legality
Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc
function calls (fmax[f], etc.) generally map to function calls after lowering.
For some vector types with QPX at least, however, we can legally lower these,
and we don't need to prohibit CTR-based loops on their account.

It turned out, however, that the logic that checked the opcodes associated with
intrinsics was broken (it would set the Opcode variable, but that variable was
later checked only if set for some otherwise-external function call.

This fixes the latter problem and adds the FMAX/MINNUM mappings.

llvm-svn: 264532
2016-03-27 05:40:56 +00:00
Michael Kruse
6587d6dd55 [Verifier] Reject PHIs using defs from own block.
Reject the following IR as malformed (assuming that %entry, %next are
not in a loop):

    next:
      %y = phi i32 [ 0, %entry ]
      %x = phi i32 [ %y, %entry ]

Such PHI nodes came up in PR26718. While there was no consensus on
whether or not this is valid IR, most opinions on that bug and in a
discussion on the llvm-dev mailing list tended towards a
"strict interpretation" (term by Joseph Tremoulet) of PHI node uses.
Also, the language reference explicitly states that "the use of each
incoming value is deemed to occur on the edge from the corresponding
predecessor block to the current block" and
`DominatorTree::dominates(Instruction*, Use&)` uses this definition as
well.

For the code mentioned in PR15384, clang does not compile to such PHIs
(anymore?). The test case still hangs when replacing `%tmp6` with `%tmp`
in revisions before r176366 (where PR15384 has been fixed). The
occurrence of %tmp6 therefore was probably unintentional. Its value is
not used except in other PHIs.

Reviewers: majnemer, reames, JosephTremoulet, bkramer, grosser, jdoerfert, kparzysz, sanjoy

Differential Revision: http://reviews.llvm.org/D18443

llvm-svn: 264528
2016-03-26 23:32:57 +00:00
Sanjay Patel
2736238d12 [SimplifyCFG] propagate branch metadata when creating select (PR26636)
llvm-svn: 264527
2016-03-26 23:30:50 +00:00
Sanjay Patel
912baf9273 minimize test cases
These are tests for store transforms. 
The loads, adds, and geps were irrelevant.

llvm-svn: 264526
2016-03-26 23:09:25 +00:00
David Blaikie
8f12838248 llvm-dwp: Include the dwo name (if available) when diagnosing duplicate CU IDs from dwp input files
If you're building dwps from other dwps, it can be hard to track down a
duplicate CU ID if it comes from two compilations of the same file in
different modes, etc. By including the .dwo path (which is hopefully
more unique than the file path) it can help track down where the
duplicates came from.

llvm-svn: 264520
2016-03-26 20:32:14 +00:00
Simon Pilgrim
af0ed3ac43 [X86][AVX] Enabled SMUL_LOHI/UMUL_LOHI v8i32 vectors on AVX1 targets
Correct splitting of v8i32 vectors into v4i32 vectors to prevent scalarization

llvm-svn: 264517
2016-03-26 18:32:13 +00:00
JF Bastien
cb9fcddfb4 Revert "NFC: static_assert instead of comment"
This reverts commit fa36fcff16c7d4f78204d6296bf96c3558a4a672.

Causes the following Windows failure:

  C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\lib\CodeGen\MachineInstr.cpp(762):
  error C2338: must be trivially copyable to memmove

llvm-svn: 264516
2016-03-26 18:20:02 +00:00
JF Bastien
1855abc17a NFC: static_assert instead of comment
Summary: isPodLike is as close as we have for is_trivially_copyable.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18483

llvm-svn: 264515
2016-03-26 18:14:27 +00:00