1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 21:13:02 +02:00
Commit Graph

606 Commits

Author SHA1 Message Date
Artem Belevich
daaa866808 [NVPTX] Improve lowering of byval args of device functions.
Avoid unnecessary spills of such vars to local space on SASS level and
pointer space conversion.

Instead, make a local copy with appropriate addrspacecasts and let
LLVM optimize them away when possible.

This allows loading value of the argument using [symbol+offset]
instead of converting argument to general space pointer and using it
for indexing (which also implicitly converts param space pointer to
local space one on SASS level and triggers copying of argument into
local space in the process).

This reduces call overhead, uses less registers and reduces overall
SASS size by 2-4%.

Differential Review: http://reviews.llvm.org/D21421

llvm-svn: 273313
2016-06-21 20:30:26 +00:00
Benjamin Kramer
e80783f62f Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Justin Lebar
05e13f8eda [NVPTX] Add intrinsics for shfl instructions.
Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers.  Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D21160

llvm-svn: 272298
2016-06-09 20:04:08 +00:00
Benjamin Kramer
d415569b3b Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
2016-06-08 19:09:22 +00:00
Benjamin Kramer
5d5a0e4f68 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Benjamin Kramer
a855b3205f Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

llvm-svn: 270997
2016-05-27 14:27:24 +00:00
Benjamin Kramer
63c3e24c6a Avoid some copies by using const references.
clang-tidy's performance-unnecessary-copy-initialization with some manual
fixes. No functional changes intended.

llvm-svn: 270988
2016-05-27 12:30:51 +00:00
Artem Belevich
b0b1ecdbc5 Init member structs in constructor.
Fixes build error on windows where MSVC does not
support list initialization inside member initializer list.

llvm-svn: 270877
2016-05-26 17:29:20 +00:00
Artem Belevich
f14e0f0a96 [NVPTX] Added NVVMIntrRange pass
NVVMIntrRange adds !range metadata to calls of NVVM intrinsics
that return values within known limited range.

This allows LLVM to generate optimal code for indexing arrays
based on tid/ctaid which is a frequently used pattern in CUDA code.

Differential Revision: http://reviews.llvm.org/D20644

llvm-svn: 270872
2016-05-26 17:02:56 +00:00
Justin Lebar
db58249ac7 [NVPTX] Don't (incorrectly) say that the NVVMReflect pass preserves all analyses.
Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D20585

llvm-svn: 270790
2016-05-25 23:12:38 +00:00
Rafael Espindola
22e87bbb08 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Justin Bogner
fe65f9093f SDAG: Implement Select instead of SelectImpl in NVPTXDAGToDAGISel
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
  the method to try* and return a bool for success.

Part of llvm.org/pr26808.

llvm-svn: 269483
2016-05-13 21:12:53 +00:00
Rafael Espindola
e34ff25d67 Return a StringRef from getSection.
This is similar to how getName is handled.

llvm-svn: 269218
2016-05-11 18:21:59 +00:00
Matthias Braun
556abb392a CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Justin Lebar
e9c3e87ae5 [NVPTX] Change begin/end inline asm comments to "begin/end inline asm".
Previously it was just "// inline asm", which made it tricky to read
code with lots of inline assembly.

llvm-svn: 268994
2016-05-10 00:31:22 +00:00
Justin Bogner
9448d374ee SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Justin Holewinski
9cc32d1b1b [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection
Summary:
We don't have sign-/zero-extending ldg/ldu instructions defined,
so we need to emulate them with explicit CVTs. We were originally
handling the i8 case, but not any other cases.

Fixes PR26185

Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D19615

llvm-svn: 268272
2016-05-02 18:12:02 +00:00
Craig Topper
945a4cd524 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Justin Lebar
c5f2ca9cfc [NVPTX] Run NVVMReflect at the beginning of IR passes.
Summary:
Currently the NVVMReflect pass is run at the beginning of our backend
passes.  But really, it should be run as early as possible, as it's
simply resolving an "if" statement in code.  So copy it into
TargetMachine::addEarlyAsPossiblePasses.

We still run it at the beginning of the backend passes, since it's
needed for correctness when lowering to nvptx.

(Specifically, NVVMReflect changes each call to the __nvvm_reflect
function or llvm.nvvm.reflect intrinsic into an integer constant, based
on the pass's configuration.  Clearly we miss many optimization
opportunities if we perform this transformation at the beginning of
codegen.)

Reviewers: rnk

Subscribers: tra, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18616

llvm-svn: 267765
2016-04-27 19:13:37 +00:00
Andrew Kaylor
9680f122b8 Add optimization bisect opt-in calls for NVPTX passes
Differential Revision: http://reviews.llvm.org/D19518

llvm-svn: 267635
2016-04-26 23:44:31 +00:00
Jingyue Wu
db7e23c040 [NVPTX] Fix some usages of CodeGenOpt::None.
NVPTXLowerKernelArgs is required for correctness, so it should not be guarded
by CodeGenOpt::None.

NVPTXPeephole is optimization only, so it should be skipped when
CodeGenOpt::None.

llvm-svn: 267619
2016-04-26 22:59:25 +00:00
Ahmed Bougacha
db9e64109d [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Craig Topper
0c0d9161b6 [NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267265
2016-04-23 02:49:29 +00:00
Sanjoy Das
7e1b5ea5d3 Disable the PatchableFunction pass for NVPTX & Wasm
PatchableFunction requires AllVRegsAllocated that these targets don't
provide.

llvm-svn: 266720
2016-04-19 06:24:58 +00:00
Mehdi Amini
9ff867f98c [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Justin Lebar
4b4ad39f81 [NVPTX] Set NVPTXTTI::getInliningThresholdMultiplier to 5.
Summary:
Calls on NVPTX are unusually expensive (for one thing, lots of state
needs to be saved to memory, which is slow), so make the inlininer much
more aggressive.

Reviewers: chandlerc

Subscribers: jholewinski, llvm-commits, tra

Differential Revision: http://reviews.llvm.org/D18561

llvm-svn: 266406
2016-04-15 01:38:50 +00:00
Duncan P. N. Exon Smith
da537cb806 IR: Introduce ConstantAggregate, NFC
Add a common parent class for ConstantArray, ConstantVector, and
ConstantStruct called ConstantAggregate.  These are the aggregate
subclasses of Constant that take operands.

This is mainly a cleanup, adding common `isa` target and removing
duplicated code.  However, it also simplifies caching which constants
point transitively at `GlobalValue` (a possible future direction).

llvm-svn: 265466
2016-04-05 21:10:45 +00:00
Justin Holewinski
234f7b3e41 [NVPTX] Handle ldg created from sign-/zero-extended load
Reviewers: jingyue

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D18053

llvm-svn: 265389
2016-04-05 12:38:01 +00:00
Justin Lebar
06adb3808e [NVPTX] Add a truncate DAG node to some calls.
Summary:
Previously, we were running afoul of the assertion

  EVT(CLI.Ins[i].VT) == InVals[i].getValueType() && "LowerCall emitted a value with the wrong type!"

in SelectionDAGBuilder.cpp when running the NVPTX/i8-param.ll test.
This is because our backend (for some reason) treats small return values
as i32, but it wasn't ever truncating the i32 back down to the expected
width in the DAG.

Unclear to me whether this fixes any actual bugs -- in this test, at
least, the generated code is unchanged.

Reviewers: jingyue

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: http://reviews.llvm.org/D17872

llvm-svn: 265091
2016-04-01 01:09:10 +00:00
Justin Lebar
dc761e7edf [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect.
Summary:
Previously the NVVMReflect pass would read its configuration from
command-line flags or a static configuration given to the pass at
instantiation time.

This doesn't quite work for clang's use-case.  It needs to pass a value
for __CUDA_FTZ down on a per-module basis.  We use a module flag for
this, so the NVVMReflect pass needs to be updated to read said flag.

Reviewers: tra, rnk

Subscribers: cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18672

llvm-svn: 265090
2016-04-01 01:09:07 +00:00
Justin Lebar
debef79a9f [NVPTX] Annotate some instructions as hasSideEffects = 0.
Summary:
Tablegen tries to infer this from the selection DAG patterns defined for
the instructions, but it can't always.

An instructive example is CLZr64.  CLZr32 is correctly inferred to have
no side-effects, but the selection DAG pattern for CLZr64 is slightly
more complicated, and in particular the ctlz DAG node is not at the root
of the pattern.  Thus tablegen can't infer that CLZr64 has no
side-effects.

Reviewers: jholewinski

Subscribers: jholewinski, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17472

llvm-svn: 265089
2016-04-01 01:09:05 +00:00
Hans Wennborg
07d10a6e30 Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
Justin Lebar
d23b7bb4ec [NVPTX] Make NVVMReflect a function pass.
Summary:
Currently it's a module pass.  Make it a function pass so that we can
move it to PassManagerBuilder's EP_EarlyAsPossible extension point,
which only accepts function passes.

Reviewers: rnk

Subscribers: tra, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18615

llvm-svn: 264919
2016-03-30 20:40:11 +00:00
Benjamin Kramer
ab82adaded [NVPTX] Avoid temporary std::string and make single-use function local to the cpp file.
No functionality change intended.

llvm-svn: 264861
2016-03-30 12:31:51 +00:00
Derek Schuff
9f833b6097 Introduce MachineFunctionProperties and the AllVRegsAllocated property
MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the delcarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

llvm-svn: 264593
2016-03-28 17:05:30 +00:00
Duncan P. N. Exon Smith
f60239c0f1 IR: Reserve an MDKind for !llvm.loop; NFC
This reserves an MDKind for !llvm.loop, which allows callers to avoid a
string-based lookup.  I'm not sure why it was missing.

There should be no functionality change here, just a small compile-time
speedup.

llvm-svn: 264371
2016-03-25 00:35:38 +00:00
Jingyue Wu
2526430a5c [NVPTX] Adds a new address space inference pass.
Summary:
The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable
to convert the address space of a pointer induction variable. This patch adds a
new pass called NVPTXInferAddressSpaces that overcomes that limitation using a
fixed-point data-flow analysis (see the file header comments for details).

The new pass is experimental and not enabled by default. Users can turn
it on by setting the -nvptx-use-infer-addrspace flag of llc.

Reviewers: jholewinski, tra, jlebar

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D17965

llvm-svn: 263916
2016-03-20 20:59:20 +00:00
Chandler Carruth
d0ae36e8de [PM] Port GVN to the new pass manager, wire it up, and teach a couple of
tests to run GVN in both modes.

This is mostly the boring refactoring just like SROA and other complex
transformation passes. There is some trickiness in that GVN's
ValueNumber class requires hand holding to get to compile cleanly. I'm
open to suggestions about a better pattern there, but I tried several
before settling on this. I was trying to balance my desire to sink as
much implementation detail into the source file as possible without
introducing overly many layers of abstraction.

Much like with SROA, the design of this system is made somewhat more
cumbersome by the need to support both pass managers without duplicating
the significant state and logic of the pass. The same compromise is
struck here.

I've also left a FIXME in a doxygen comment as the GVN pass seems to
have pretty woeful documentation within it. I'd like to submit this with
the FIXME and let those more deeply familiar backfill the information
here now that we have a nice place in an interface to put that kind of
documentaiton.

Differential Revision: http://reviews.llvm.org/D18019

llvm-svn: 263208
2016-03-11 08:50:55 +00:00
Justin Lebar
f7a86812e3 [NVPTX] Annotate param loads/stores as mayLoad/mayStore.
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory.  I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call.  But param loads don't have to at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.

Reviewers: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D17471

llvm-svn: 262381
2016-03-01 19:44:22 +00:00
Justin Lebar
18e0e2943c [NVPTX] Remove workaround for tablegen crash in NVPTXInstrInfo.td.
Summary: Looks like this was caused by a typo.

Reviewers: jholewinski

Subscribers: jholewinski, llvm-commits, tra

Differential Revision: http://reviews.llvm.org/D17357

llvm-svn: 262380
2016-03-01 19:44:20 +00:00
Justin Lebar
881bba8e1c [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Justin Lebar
49fc12b62a [NVPTX] Nix hack used to emit '{' and '}' for NVPTX calls.
Summary: Tablegen understands backslash as an escape char; that's sufficient.

Reviewers: jholewinski

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: http://reviews.llvm.org/D17432

llvm-svn: 262372
2016-03-01 19:24:00 +00:00
Justin Lebar
53267ab7b3 [NVPTX] Reformat NVPTXInstrInfo.td, and add additional comments.
Summary:
Also simplify some of the embedded C++ logic.

No functional changes.

Reviewers: jholewinski

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: http://reviews.llvm.org/D17354

llvm-svn: 262371
2016-03-01 19:23:30 +00:00
Duncan P. N. Exon Smith
53cb4596f6 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
David Majnemer
b56105ad7f Unbreak non-X86 targets from fallout caused by r261462
llvm-svn: 261463
2016-02-21 01:40:04 +00:00
Justin Lebar
fb0b358a7f [NVPTX] Annotate convergent intrinsics as convergent.
Summary:
Previously the machine instructions for bar.sync &co. were not marked as
convergent.  This resulted in some MI passes (such as TailDuplication,
fixed in an upcoming patch) doing unsafe things to these instructions.

Reviewers: jingyue

Subscribers: llvm-commits, tra, jholewinski, hfinkel

Differential Revision: http://reviews.llvm.org/D17318

llvm-svn: 261115
2016-02-17 17:46:54 +00:00
Justin Lebar
f89d876f55 [NVPTX] Annotate call machine instructions as calls.
Summary:
Otherwise we'll try to do unsafe optimizations on these MIs, such as
sinking loads below calls.

(I suspect that this is not the only bug in the NVPTX instruction
tablegen files; I need to comb through them.)

Reviewers: jholewinski, tra

Subscribers: jingyue, jhen, llvm-commits

Differential Revision: http://reviews.llvm.org/D17315

llvm-svn: 261113
2016-02-17 17:46:50 +00:00
Artem Belevich
672363ee16 [NVPTX] emit .file directives for files referenced by subprograms.
.. so .loc directives referring to those files work correctly.

Differential Revision: http://reviews.llvm.org/D17086

llvm-svn: 260557
2016-02-11 18:21:47 +00:00
Ahmed Bougacha
f23bc5a4de [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Jingyue Wu
bb54579422 [NVPTX] Disable performance optimizations when OptLevel==None
Reviewers: jholewinski, tra, eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D16874

llvm-svn: 259749
2016-02-04 04:15:36 +00:00