1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

4390 Commits

Author SHA1 Message Date
Reid Kleckner
26454793b1 Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

llvm-svn: 196876
2013-12-10 05:12:23 +00:00
Jakub Staszak
11e1c882f7 Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces
overall time of LLVM compilation by ~1%.

llvm-svn: 196667
2013-12-07 21:20:17 +00:00
Andrew Trick
2ac3f8f326 Factor out the SchedRemainder/SchedBoundary from GenericScheduler strategy.
These helper classes take care of the book-keeping the drives the
GenericScheduler heuristics. It is likely that developers writing
target-specific schedulers that work similarly to GenericScheduler
will want to use these helpers too. The immediate goal is to develop a
GenericPostScheduler that can run in place of the old PostRAScheduler,
but will use the new machine model.

No functionality change intended.

llvm-svn: 196643
2013-12-07 05:59:44 +00:00
Andrew Trick
7eb7f7648b MI-Sched: Model "reserved" processor resources.
This allows a target to use MI-Sched as an in-order scheduler that
will model strict resource conflicts without defining a processor
itinerary. Instead, the target can now use the new per-operand machine
model and define in-order resources with BufferSize=0. For example,
this would allow restricting the type of operations that can be formed
into a dispatch group. (Normally NumMicroOps is sufficient to enforce
dispatch groups).

If the intent is to model latency in in-order pipeline, as opposed to
resource conflicts, then a resource with BufferSize=1 should be
defined instead.

This feature is only casually tested as there are no in-tree targets
using it yet. However, Hal will be experimenting with POWER7.

llvm-svn: 196517
2013-12-05 17:56:02 +00:00
Andrew Trick
192311ab9a MI-Sched: handle latency of in-order operations with the new machine model.
The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and only not enabled because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

llvm-svn: 196516
2013-12-05 17:55:58 +00:00
Alp Toker
e845f8af67 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

llvm-svn: 196471
2013-12-05 05:44:44 +00:00
Timur Iskhodzhanov
2340a0ee1c Reland 196270 "Generalize debug info / EH emission in AsmPrinter"
Addressing the existense AMDGPUAsmPrinter and other subclasses of AsmPrinter

llvm-svn: 196288
2013-12-03 15:10:23 +00:00
NAKAMURA Takumi
c0b01ff922 Revert r196270, "Generalize debug info / EH emission in AsmPrinter"
It broke CodeGen/R600 tests with +Asserts.

llvm-svn: 196272
2013-12-03 13:15:54 +00:00
Timur Iskhodzhanov
8ce8c7a5d7 Generalize debug info / EH emission in AsmPrinter
llvm-svn: 196270
2013-12-03 12:05:18 +00:00
Michael Gottesman
576426afb7 Added MachineBlockFrequencyInfo::view for displaying the block frequency propagation graph via graphviz.
This is useful for debugging issues in the BlockFrequency implementation
since one can easily visualize where probability mass and other errors
occur in the propagation.

This is the MI version of r194654.

llvm-svn: 196183
2013-12-03 00:49:33 +00:00
Rafael Espindola
c36d63a948 Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile.
This allows it to be used in TargetLoweringObjectFileImpl.cpp.

llvm-svn: 196117
2013-12-02 16:25:47 +00:00
Lang Hames
067c025250 Refactor a lot of patchpoint/stackmap related code to simplify and make it
target independent.

Most of the x86 specific stackmap/patchpoint handling was necessitated by the
use of the native address-mode format for frame index operands. PEI has now
been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing
us to use a simple, platform independent register/offset pair for frame
indexes on stackmap/patchpoints.

Notes:
  - Folding is now platform independent and automatically supported.
  - Emiting patchpoints with direct memory references now just involves calling
    the TargetLoweringBase::emitPatchPoint utility method from the target's
    XXXTargetLowering::EmitInstrWithCustomInserter method. (See
    X86TargetLowering for an example).
  - No more ugly platform-specific operand parsers.

This patch shouldn't change the generated output for X86. 

llvm-svn: 195944
2013-11-29 03:07:54 +00:00
Rafael Espindola
7e7db10302 Remove an always true parameter.
llvm-svn: 195931
2013-11-28 19:35:07 +00:00
Lang Hames
433095d3fe Fix a typo where we were creating <def,kill> operands instead of
<def,dead> ones.

Add an assertion to make sure we catch this in the future.

Fixes <rdar://problem/15464559>.

llvm-svn: 195401
2013-11-22 00:46:32 +00:00
Tom Stellard
439debedd3 Split SETCC if VSELECT requires splitting too.
This patch is a rewrite of the original patch commited in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now peform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.

Patch by: Juergen Ributzka

NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195397
2013-11-22 00:39:23 +00:00
Lang Hames
682d6cc95b Dereference the node iterator when dumping the PBQP graph structure in DOT
format.

Thanks to Arnaud A. de Grandmaison for the patch!

llvm-svn: 195316
2013-11-21 06:30:14 +00:00
Eric Christopher
3d1796838e Remove capability for polymorphic destruction from LexicalScope
and LexicalScopes, we're not using it.

llvm-svn: 195182
2013-11-20 00:54:28 +00:00
Eric Christopher
2624bc8ba4 Formatting, 80-col, trailing whitespace.
llvm-svn: 195180
2013-11-20 00:54:19 +00:00
Juergen Ributzka
8e480fdae5 [DAG] Refactor vector splitting code in SelectionDAG. No functional change intended.
Reviewed by Tom

llvm-svn: 195156
2013-11-19 21:20:17 +00:00
Andrew Trick
5b8040c957 Fix patchpoint comments.
llvm-svn: 195103
2013-11-19 05:05:43 +00:00
Andrew Trick
15aac659a7 Add an abstraction to handle patchpoint operands.
Hard-coded operand indices were scattered throughout lowering stages
and layers. It was super bug prone.

llvm-svn: 195093
2013-11-19 03:29:56 +00:00
Juergen Ributzka
5357a6d64b [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file. The memory leaks in this version have been fixed. Thanks
Alexey for pointing them out.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

llvm-svn: 195064
2013-11-19 00:57:56 +00:00
Alexey Samsonov
3bfef6bdb6 Revert r194865 and r194874.
This change is incorrect. If you delete virtual destructor of both a base class
and a subclass, then the following code:
  Base *foo = new Child();
  delete foo;
will not cause the destructor for members of Child class. As a result, I observe
plently of memory leaks. Notable examples I investigated are:
ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl.

llvm-svn: 194997
2013-11-18 09:31:53 +00:00
Andrew Trick
bd486c29f4 Added a size field to the stack map record to handle subregister spills.
Implementing this on bigendian platforms could get strange. I added a
target hook, getStackSlotRange, per Jakob's recommendation to make
this as explicit as possible.

llvm-svn: 194942
2013-11-17 01:36:23 +00:00
Duncan P. N. Exon Smith
c331c75e8e Fix filename in header comment
llvm-svn: 194924
2013-11-16 15:40:54 +00:00
Juergen Ributzka
ee3af15269 [weak vtables] Remove a bunch of weak vtables
This patch removes most of the trivial cases of weak vtables by pinning them to
a single object file.

Differential Revision: http://llvm-reviews.chandlerc.com/D2068

Reviewed by Andy

llvm-svn: 194865
2013-11-15 22:34:48 +00:00
Bob Wilson
d433cf7463 Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

llvm-svn: 194840
2013-11-15 19:09:27 +00:00
Daniel Sanders
0ebbe1d56c Fix illegal DAG produced by SelectionDAG::getConstant() for v2i64 type
Summary:
When getConstant() is called for an expanded vector type, it is split into
multiple scalar constants which are then combined using appropriate build_vector
and bitcast operations.

In addition to the usual big/little endian differences, the case where the
element-order of the vector does not have the same endianness as the elements
themselves is also accounted for.  For example, for v4i32 on big-endian MIPS,
the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is
<0123,4567,89AB,CDEF>.
Handling this case turns out to be a nop since getConstant() returns a splatted
vector (so reversing the element order doesn't change the value)

This fixes a number of cases in MIPS MSA where calling getConstant() during
operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF
into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger
differences between illegal and legal types such as legalizing v2i64 into v8i16.

lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling
getConstant() so this function has been updated in the same patch.

For the sake of transparency, the steps I've taken since the review are:
* Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed
  that the MIPS tests were falsely passing because a polymorphic function was
  not actually polymorphic in the reviewed patch.
* Fixed the tests that were now failing. This involved deleting the code to
  handle the MIPS MSA element-order (which was previously doing an byte-order
  swap instead of an element-order swap). This left
  isVectorEltOrderLittleEndian() unused and it was deleted.
* Fixed build failures caused by rebasing beyond r194467-r194472. These build
  failures involved the bset, bneg, and bclr instructions added in these commits
  using lowerMSASplatImm() in a way that was no longer valid after this patch.
  Some of these were fixed by calling SelectionDAG::getConstant() instead,
  others were fixed by a new function getBuildVectorSplat() that provided the
  removed functionality of lowerMSASplatImm() in a more sensible way.

Reviewers: bkramer

Reviewed By: bkramer

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1973

llvm-svn: 194811
2013-11-15 12:56:49 +00:00
Matt Arsenault
9921608896 Add addrspacecast instruction.
Patch by Michele Scandale!

llvm-svn: 194760
2013-11-15 01:34:59 +00:00
Aaron Ballman
7c6e917033 Replacing HUGE_VALF with llvm::huge_valf in order to work around a warning triggered in MSVC 12.
Patch reviewed by Reid Kleckner and Jim Grosbach.

llvm-svn: 194533
2013-11-13 00:15:44 +00:00
Arnaud A. de Grandmaison
26d846f560 CalcSpillWeights: allow overidding the spill weight normalizing function
This will enable the PBQP register allocator to provide its own normalizing function.

No functionnal change.

llvm-svn: 194417
2013-11-11 19:56:14 +00:00
Arnaud A. de Grandmaison
8c40e45072 CalcSpillWeights: give a better describing name to calculateSpillWeights
Besides, this relates it more obviously to the VirtRegAuxInfo::calculateSpillWeightAndHint.

No functionnal change.

llvm-svn: 194404
2013-11-11 19:04:45 +00:00
Arnaud A. de Grandmaison
6b862708a7 CalculateSpillWeights does not need to be a pass
Based on discussions with Lang Hames and Jakob Stoklund Olesen at the hacker's lab, and in the light of upcoming work on the PBQP register allocator, it was though that CalcSpillWeights does not need to be a pass. This change will enable to customize / tune the spill weight computation depending on the allocator.

Update the documentation style while there.

No functionnal change.

llvm-svn: 194356
2013-11-10 17:46:31 +00:00
Chandler Carruth
f2e7a23acb Move the old pass manager infrastructure into a legacy namespace and
give the files a legacy prefix in the right directory. Use forwarding
headers in the old locations to paper over the name change for most
clients during the transitional period.

No functionality changed here! This is just clearing some space to
reduce renaming churn later on with a new system.

Even when the new stuff starts to go in, it is going to be hidden behind
a flag and off-by-default as it is still WIP and under development.

This patch is specifically designed so that very little out-of-tree code
has to change. I'm going to work as hard as I can to keep that the case.
Only direct forward declarations of the PassManager class are impacted
by this change.

llvm-svn: 194324
2013-11-09 12:26:54 +00:00
NAKAMURA Takumi
ba34a4d189 include/llvm/CodeGen/PBQP: Update @param(s) in comments. [-Wdocumentation]
llvm-svn: 194314
2013-11-09 03:54:05 +00:00
NAKAMURA Takumi
dc501161f9 Fix whitespace.
llvm-svn: 194313
2013-11-09 03:53:55 +00:00
Lang Hames
be91a1d947 Re-apply r194300 with fixes for warnings.
llvm-svn: 194311
2013-11-09 03:08:56 +00:00
Nick Lewycky
ebeaea0192 Revert r194300 which broke the build.
llvm-svn: 194308
2013-11-09 02:01:25 +00:00
Lang Hames
e3c935f4ab Rewrite the PBQP graph data structure.
The new graph structure replaces the node and edge linked lists with vectors.
Free lists (well, free vectors) are used for fast insertion/deletion.

The ultimate aim is to make PBQP graphs cheap to clone. The motivation is that
the PBQP solver destructively consumes input graphs while computing a solution,
forcing the graph to be fully reconstructed for each round of PBQP. This
imposes a high cost on large functions, which often require several rounds of
solving/spilling to find a final register allocation. If we can cheaply clone
the PBQP graph and incrementally update it between rounds then hopefully we can
reduce this cost. Further, once we begin pooling matrix/vector values (future
work), we can cache some PBQP solver metadata and share it between cloned
graphs, allowing the PBQP solver to re-use some of the computation done in
earlier rounds.

For now this is just a data structure update. The allocator and solver still
use the graph the same way as before, fully reconstructing it between each
round. I expect no material change from this update, although it may change
the iteration order of the nodes, causing ties in the solver to break in
different directions, and this could perturb the generated allocations
(hopefully in a completely benign way).

Thanks very much to Arnaud Allard de Grandmaison for encouraging me to get back
to work on this, and for a lot of discussion and many useful PBQP test cases.

llvm-svn: 194300
2013-11-09 00:14:07 +00:00
Juergen Ributzka
f27436b708 [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).

Differential Revision: http://llvm-reviews.chandlerc.com/D2009

Reviewed by Andy

llvm-svn: 194293
2013-11-08 23:28:16 +00:00
Arnaud A. de Grandmaison
12a0cc8ff5 Revert "CalculateSpillWeights does not need to be a pass"
Temporarily revert my previous commit until I understand why it breaks 3 target tests.

llvm-svn: 194272
2013-11-08 18:19:19 +00:00
Arnaud A. de Grandmaison
926576cff3 CalculateSpillWeights does not need to be a pass
Based on discussions with Lang Hames and Jakob Stoklund Olesen at the hacker's lab, and in the light of upcoming work on the PBQP register allocator, it was though that CalcSpillWeights does not need to be a pass. This change will enable to customize / tune the spill weight computation depending on the allocator.

Update the documentation style while there.

No functionnal change.

llvm-svn: 194269
2013-11-08 17:56:29 +00:00
Andrew Trick
75681a41c0 Add support for stack map generation in the X86 backend.
Originally implemented by Lang Hames.

llvm-svn: 193811
2013-10-31 22:11:56 +00:00
Rafael Espindola
bdb3c4f195 Produce .weak_def_can_be_hidden for some linkonce_odr values
With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr
if they are also unnamed_addr or don't have their address taken.

There is not a lot of documentation about .weak_def_can_be_hidden, but
from the old discussion about linkonce_odr_auto_hide and the name of
the directive this looks correct: these symbols can be hidden.

Testing this with the ld64 in Xcode 5 linking clang reduces the number of
exported symbols from 21053 to 19049.

llvm-svn: 193718
2013-10-30 22:08:11 +00:00
Josh Magee
4b74743099 Reformat code with clang-format.
Differential Revision: http://llvm-reviews.chandlerc.com/D2057

llvm-svn: 193672
2013-10-30 02:25:14 +00:00
NAKAMURA Takumi
579bb406bf StackProtector.h: Fix trailing comments for doxygen. [-Wdocumentation]
s!//<!///<!

llvm-svn: 193669
2013-10-30 00:49:39 +00:00
NAKAMURA Takumi
7388ca6af3 Trailing whitespace in a comment line.
llvm-svn: 193668
2013-10-30 00:49:33 +00:00
Josh Magee
5a6fad91a3 [stackprotector] Update the StackProtector pass to perform datalayout analysis.
This modifies the pass to classify every SSP-triggering AllocaInst according to
an SSPLayoutKind (LargeArray, SmallArray, AddrOf).  This analysis is collected
by the pass and made available for use, but no other pass uses it yet.

The next patch will make use of this analysis in PEI and StackSlot
passes.  The end goal is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D1789

llvm-svn: 193653
2013-10-29 21:16:16 +00:00
Rafael Espindola
68ddc56344 Add a helper getSymbol to AsmPrinter.
llvm-svn: 193627
2013-10-29 17:07:16 +00:00
Richard Sandiford
0ea0d286ba Keep TBAA info when rewriting SelectionDAG loads and stores
Most SelectionDAG code drops the TBAA info when creating a new form of a
load and store (e.g. during legalization, or when converting a plain
load to an extending one).  This patch tries to catch all cases where
the TBAA information can legitimately be carried over.

The patch adds alternative forms of getLoad() and getExtLoad() that take
a MachineMemOperand instead of individual fields.  (The corresponding
getTruncStore() already exists.)  The idea is to use the MachineMemOperand
forms when all fields are carried over (size, pointer info, isVolatile,
isNonTemporal, alignment and TBAA info).  If some adjustment is being
made, e.g. to narrow the load, then we still pass the individual fields
but also pass the TBAA info.

llvm-svn: 193517
2013-10-28 11:17:59 +00:00