1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00
Commit Graph

31684 Commits

Author SHA1 Message Date
Hal Finkel
287f0c7ac8 [PowerPC] Put PPCVSXCopy into its own source file
PPCInstrInfo.cpp has ended up containing several small MI-level passes, and
this is making the file harder to read than necessary. Split out
PPCVSXCopy into its own source file. NFC.

llvm-svn: 227771
2015-02-01 22:01:29 +00:00
Hal Finkel
a67f242e9a [PowerPC] Put PPCVSXFMAMutate into its own source file
PPCInstrInfo.cpp has ended up containing several small MI-level passes, and
this is making the file harder to read than necessary. Split out
PPCVSXFMAMutate into its own source file. NFC.

llvm-svn: 227770
2015-02-01 21:51:22 +00:00
Hal Finkel
7435b67236 [PowerPC] Remove the PPCVSXCopyCleanup pass
This MI-level pass was necessary when VSX support was first being developed,
specifically, before the ABI code had been updated to use VSX registers for
arguments (the register assignments did not change, in a physical sense, but
the VSX super-registers are now used). Unfortunately, I never went back and
removed this pass after that was done. I believe this code is now effectively
dead.

llvm-svn: 227767
2015-02-01 21:20:58 +00:00
Hal Finkel
63bdc9fc20 [PowerPC] Add implicit ops to conditional returns in PPCEarlyReturn
When PPCEarlyReturn, it should really copy implicit ops from the old return
instruction to the new one. This currently does not matter much, because we run
PPCEarlyReturn very late in the pipeline (there is nothing to do DCE on
definitions of those registers). However, for completeness, we should do it
anyway.

Noticed by inspection (and there should be no functional change); thus, no
test case.

llvm-svn: 227763
2015-02-01 20:16:10 +00:00
Hal Finkel
276d13a6e1 [PowerPC] VSX stores don't also read
The VSX store instructions were also picking up an implicit "may read" from the
default pattern, which was an intrinsic (and we don't currently have a way of
specifying write-only intrinsics).

This was causing MI verification to fail for VSX spill restores.

llvm-svn: 227759
2015-02-01 19:07:41 +00:00
Hal Finkel
4a7dffd074 [PowerPC] Better scheduling for isel on P7/P8
isel is actually a cracked instruction on the P7/P8, and must start a dispatch
group. The scheduling model should reflect this so that we don't bunch too many
of them together when possible.

Thanks to Bill Schmidt and Pat Haugen for helping to sort this out.

llvm-svn: 227758
2015-02-01 17:52:16 +00:00
Michael Kuperstein
41ae9af2e3 [X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a 
reserved call frame), and perform rudimentary call folding. It still doesn't 
have a heuristic, so it is enabled only for optsize/minsize, with stack 
alignment <= 8, where it ought to be a fairly clear win.

(Re-commit of r227728)

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227752
2015-02-01 16:56:04 +00:00
Michael Kuperstein
f73ce6a4c9 Revert r227728 due to bad line endings.
llvm-svn: 227746
2015-02-01 16:15:07 +00:00
Hal Finkel
d9936757cf [PowerPC] Make r2 allocatable on PPC64/ELF for some leaf functions
The TOC base pointer is passed in r2, and we normally reserve this register so
that we can depend on it being there. However, for leaf functions, and
specifically those leaf functions that don't do any TOC access of their own
(which is generally due to accessing the constant pool, using TLS, etc.),
we can treat r2 as an ordinary callee-saved register (it must be callee-saved
because, for local direct calls, the linker will not insert any save/restore
code).

The allocation order has been changed slightly for PPC64/ELF systems to put r2
at the end of the list (while leaving it near the beginning for Darwin systems
to prevent unnecessary output changes). While r2 is allocatable, using it still
requires spill/restore traffic, and thus comes at the end of the list.

llvm-svn: 227745
2015-02-01 15:03:28 +00:00
Chandler Carruth
a2cd22e25f [multiversion] Remove the function parameter from the unrolling
preferences interface on TTI now that all of TTI is per-function.

llvm-svn: 227741
2015-02-01 14:31:23 +00:00
Chandler Carruth
59453ca4a8 [multiversion] Switch the TTI queries from TargetMachine to Subtarget
now that we have a correct and cached subtarget specific to the
function.

Also, finish providing a cached per-function subtarget in the core
LLVMTargetMachine -- that layer hadn't switched over yet.

The only use of the TargetMachine was to re-lookup a subtarget for
a particular function to work around the fact that TTI was immutable.
Now that it is per-function and we haved a cached subtarget, use it.

This still leaves a few interfaces with real warts on them where we were
passing Function objects through the TTI interface. I'll remove these
and clean their usage up in subsequent commits now that this isn't
necessary.

llvm-svn: 227738
2015-02-01 14:22:17 +00:00
Chandler Carruth
6ea38a46d2 [multiversion] Remove the cached TargetMachine pointer from the
intermediate TTI implementation template and instead query up to the
derived class for both the TargetMachine and the TargetLowering.

Most of the derived types had a TLI cached already and there is no need
to store a less precisely typed target machine pointer.

This will in turn make it much cleaner to look up the TLI via
a per-function subtarget instead of the generic subtarget, and it will
pave the way toward pulling the subtarget used for unroll preferences
into the same form once we are *always* using the function to look up
the correct subtarget.

llvm-svn: 227737
2015-02-01 14:01:15 +00:00
Chandler Carruth
3ed152b528 [multiversion] Switch all of the targets over to use the
TargetIRAnalysis access path directly rather than implementing getTTI.

This even removes getTTI from the interface. It's more efficient for
each target to just register a precise callback that creates their
specific TTI.

As part of this, all of the targets which are building their subtargets
individually per-function now build their TTI instance with the function
and thus look up the correct subtarget and cache it. NVPTX, R600, and
XCore currently don't leverage this functionality, but its trivial for
them to add it now.

llvm-svn: 227735
2015-02-01 13:20:00 +00:00
Chandler Carruth
c67d7f29c0 [multiversion] Remove a false freedom to leave the TargetMachine pointer
null.

For some reason some of the original TTI code supported a null target
machine. This seems to have been legacy, and I made matters worse when
refactoring this code by spreading that pattern further through the
various targets.

The TargetMachine can't actually be null, and it doesn't make sense to
support that use case. I've now consistently removed it and removed all
of the code trying to cope with that situation. This is probably good,
as several targets *didn't* cope with it being null despite the null
default argument in their constructors. =]

llvm-svn: 227734
2015-02-01 12:38:24 +00:00
Chandler Carruth
46a63acccc [multiversion] Implement the old pass manager's TTI wrapper pass in
terms of the new pass manager's TargetIRAnalysis.

Yep, this is one of the nicer bits of the new pass manager's design.
Passes can in many cases operate in a vacuum and so we can just nest
things when convenient. This is particularly convenient here as I can
now consolidate all of the TargetMachine logic on this analysis.

The most important change here is that this pushes the function we need
TTI for all the way into the TargetMachine, and re-creates the TTI
object for each function rather than re-using it for each function.
We're now prepared to teach the targets to produce function-specific TTI
objects with specific subtargets cached, etc.

One piece of feedback I'd love here is whether its worth renaming any of
this stuff. None of the names really seem that awesome to me at this
point, but TargetTransformInfoWrapperPass is particularly ... odd.
TargetIRAnalysisWrapper might make more sense. I would want to do that
rename separately anyways, but let me know what you think.

llvm-svn: 227731
2015-02-01 12:26:09 +00:00
Michael Kuperstein
2f448f269c [X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a 
reserved call frame), and perform rudimentary call folding. It still doesn't 
have a heuristic, so it is enabled only for optsize/minsize, with stack 
alignment <= 8, where it ought to be a fairly clear win.

Differential Revision: http://reviews.llvm.org/D6789

llvm-svn: 227728
2015-02-01 11:44:44 +00:00
Chandler Carruth
4efb41707c [PM] Port TTI to the new pass manager, introducing a TargetIRAnalysis to
produce it.

This adds a function to the TargetMachine that produces this analysis
via a callback for each function. This in turn faves the way to produce
a *different* TTI per-function with the correct subtarget cached.

I've also done the necessary wiring in the opt tool to thread the target
machine down and make it available to the pass registry so that we can
construct this analysis from a target machine when available.

llvm-svn: 227721
2015-02-01 10:11:22 +00:00
Craig Topper
e5f6f4dd36 [X86] Add a few target specific nodes to 'getTargetNodeName'
llvm-svn: 227720
2015-02-01 10:00:37 +00:00
Jingyue Wu
da72eac553 [NVPTX] Emit .pragma "nounroll" for loops marked with nounroll
Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable is not
unrolled by CUDA driver, we need to emit .pragma "nounroll" at the
header of that loop.

This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.

Test Plan: test/CodeGen/NVPTX/nounroll.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7041

llvm-svn: 227703
2015-02-01 02:27:45 +00:00
NAKAMURA Takumi
e05fc9512b [CMake] LLVMTarget requires Intrinsics.gen since r227669 introduced llvm/Analysis/TargetTransformInfo.h.
llvm-svn: 227699
2015-02-01 00:55:32 +00:00
Chandler Carruth
0cdc876795 [PM] Remove a bunch of stale TTI creation method declarations. I nuked
their definitions, but forgot to clean up all the declarations which are
in different files.

llvm-svn: 227698
2015-02-01 00:22:15 +00:00
Matt Arsenault
7a8cacc8cb Fix typo
llvm-svn: 227697
2015-01-31 23:37:27 +00:00
Matt Arsenault
c7e64d4e6b R600/SI: Only select cvt_flr/cvt_rpi with no NaNs.
These have different behavior from cvt_i32_f32 on NaN.

llvm-svn: 227693
2015-01-31 21:28:13 +00:00
Saleem Abdulrasool
0788326ac3 X86: silence a GCC warning
GCC 4.9 gives the following warning:
  warning: enumeral and non-enumeral type in conditional expression
Cast the enumeral value to an integer within the ternary operation.  NFC.

llvm-svn: 227692
2015-01-31 17:56:11 +00:00
Diego Novillo
84a3d0d0a0 Remove unused variable.
Summary:
This variable is only used inside an assert. This breaks builds with
asserts disabled.

OK for trunk?

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7314

llvm-svn: 227691
2015-01-31 17:17:33 +00:00
Aaron Ballman
2e193d60cf Removed a spurious semicolon; NFC
llvm-svn: 227690
2015-01-31 15:18:47 +00:00
Simon Pilgrim
04b05f844f Removed SSE lane blend findCommutedOpIndices overrides. NFCI.
The default op indices frmo TargetInstrInfo::findCommutedOpIndices are being commuted so we don't need to do this.

llvm-svn: 227689
2015-01-31 15:16:30 +00:00
Simon Pilgrim
45b04beea7 [X86][SSE] Shuffle mask decode support for zero extend, scalar float/double moves and integer load instructions
This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd).

Also adds shuffle mask decodes for integer loads (movd/movq).

Differential Revision: http://reviews.llvm.org/D7228

llvm-svn: 227688
2015-01-31 14:09:36 +00:00
Chandler Carruth
ad2d6dd7d3 [PM] Switch the TargetMachine interface from accepting a pass manager
base which it adds a single analysis pass to, to instead return the type
erased TargetTransformInfo object constructed for that TargetMachine.

This removes all of the pass variants for TTI. There is now a single TTI
*pass* in the Analysis layer. All of the Analysis <-> Target
communication is through the TTI's type erased interface itself. While
the diff is large here, it is nothing more that code motion to make
types available in a header file for use in a different source file
within each target.

I've tried to keep all the doxygen comments and file boilerplate in line
with this move, but let me know if I missed anything.

With this in place, the next step to making TTI work with the new pass
manager is to introduce a really simple new-style analysis that produces
a TTI object via a callback into this routine on the target machine.
Once we have that, we'll have the building blocks necessary to accept
a function argument as well.

llvm-svn: 227685
2015-01-31 11:17:59 +00:00
Saleem Abdulrasool
edb0c1ece7 ARM: make a table more readable (NFC)
This adds some comments and splits the flag calculation on type boundaries to
make the table more readable.  Addresses some post-commit review comments to SVN
r227603.  NFC.

llvm-svn: 227670
2015-01-31 04:12:06 +00:00
Chandler Carruth
b2d6052871 [PM] Change the core design of the TTI analysis to use a polymorphic
type erased interface and a single analysis pass rather than an
extremely complex analysis group.

The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.

I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.

There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.

The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.

Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.

The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]

Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:

1) Improving the TargetMachine interface by having it directly return
   a TTI object. Because we have a non-pass object with value semantics
   and an internal type erasure mechanism, we can narrow the interface
   of the TargetMachine to *just* do what we need: build and return
   a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
   This will include splitting off a minimal form of it which is
   sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
   target machine for each function. This may actually be done as part
   of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
   easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
   easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
   just a bit messy and exacerbating the complexity of implementing
   the TTI in each target.

Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.

Differential Revision: http://reviews.llvm.org/D7293

llvm-svn: 227669
2015-01-31 03:43:40 +00:00
Saleem Abdulrasool
65ce7eb286 ARM: support stack probe size on Windows on ARM
Now that -mstack-probe-size is piped through to the backend via the function
attribute as on Windows x86, honour the value to permit handling of non-default
values for stack probes.  This is needed /Gs with the clang-cl driver or
-mstack-probe-size with the clang driver when targeting Windows on ARM.

llvm-svn: 227667
2015-01-31 02:26:37 +00:00
Eric Christopher
cc4cd0396b Remove the last vestiges of resetOperationActions.
llvm-svn: 227648
2015-01-31 00:21:17 +00:00
Eric Christopher
c020244686 Reuse a bunch of cached subtargets and remove getSubtarget calls
without a Function argument.

llvm-svn: 227647
2015-01-31 00:06:45 +00:00
Eric Christopher
d7f5849392 Reuse a bunch of cached subtargets and remove getSubtarget calls
without a Function argument.

llvm-svn: 227644
2015-01-30 23:46:43 +00:00
Eric Christopher
a36bf06411 Avoid using the cast and use the templated accessor function.
llvm-svn: 227643
2015-01-30 23:46:40 +00:00
Eric Christopher
ce1de59ee6 Reuse a bunch of cached subtargets and remove getSubtarget calls
without a Function argument.

llvm-svn: 227638
2015-01-30 23:24:40 +00:00
David Blaikie
4f7b68fc43 Add ARM test for r227489, but XFAIL because this is actually more work than it appeared to be.
Also revert r227489 since it didn't actually fix the thing I thought I
was fixing (since the test case was targeting the wrong architecture
initially). The change might be correct & demonstrated by other test
cases, but it's not a priority for me to find those test cases right
now.

Filed PR22417 for the failure.

llvm-svn: 227632
2015-01-30 23:04:39 +00:00
Eric Christopher
4af6c91959 Remove unused function.
llvm-svn: 227624
2015-01-30 22:02:36 +00:00
Eric Christopher
b94d514063 Remove extraneous forward declaration.
llvm-svn: 227623
2015-01-30 22:02:34 +00:00
Eric Christopher
8b69db6dc2 Use the cached subtargets and remove calls to getSubtarget/getSubtargetImpl
without a Function argument.

llvm-svn: 227622
2015-01-30 22:02:31 +00:00
Colin LeMahieu
e39bce1b3f [Hexagon] Adding vector shift instructions and tests.
llvm-svn: 227619
2015-01-30 21:58:46 +00:00
Tom Stellard
0eac7abd85 R600/SI: Handle SI_SPILL_V96_RESTORE in SIRegisterInfo::eliminateFrameIndex()
This fixes a crash in Unigine Heaven.

llvm-svn: 227618
2015-01-30 21:51:51 +00:00
Colin LeMahieu
4316877029 [Hexagon] Adding vector predicate instructions.
llvm-svn: 227613
2015-01-30 21:24:06 +00:00
Colin LeMahieu
ee3ca03932 [Hexagon] Adding vector permutation instructions and tests.
llvm-svn: 227612
2015-01-30 21:14:00 +00:00
Reid Kleckner
42ded7f3da Win64: Put a REX_W prefix on all TAILJMP* instructions
MSDN's x64 software conventions page says that this is one of the fixed
list of legal epilogues:
https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx

Presumably this is how the unwinder distinguishes epilogue jumps from
in-function control flow.

Also normalize the way we place "## TAILCALL" comments on such jumps.

llvm-svn: 227611
2015-01-30 21:03:31 +00:00
Colin LeMahieu
f8c65e69e9 [Hexagon] Adding vector multiplies. Cleaning up tests.
llvm-svn: 227609
2015-01-30 20:56:54 +00:00
Colin LeMahieu
113a9cf539 [Hexagon] Adding XTYPE/COMPLEX instructions and cleaning up tests.
llvm-svn: 227607
2015-01-30 20:08:37 +00:00
Chad Rosier
a802c82711 [AArch64] Make AArch64A57FPLoadBalancing output stable.
Add tie breaker to colorChainSet() sort so that processing order doesn't
depend on std::set order, which depends on pointer order, which is
unstable from run to run.

No test case as this is nearly impossible to reproduce.

Phabricator Review: http://reviews.llvm.org/D7265
Patch by Geoff Berry <gberry@codeaurora.org>!

llvm-svn: 227606
2015-01-30 19:55:40 +00:00
Saleem Abdulrasool
686f2daedf ARM: further correct .fpu directive handling
If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type.  In
particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has
32 double precision registers), we would fail to reset the D16 feature, and
treat it as being equivalent to vfpv3-d16.

llvm-svn: 227603
2015-01-30 19:35:18 +00:00