1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

51 Commits

Author SHA1 Message Date
Jingyue Wu
b0d6c8585a [NVPTX] make load on global readonly memory to use ldg
Summary:
[NVPTX] make load on global readonly memory to use ldg

Summary:
As describe in [1], ld.global.nc may be used to load memory by nvcc when
__restrict__ is used and compiler can detect whether read-only data cache
is safe to use.

This patch will try to check whether ldg is safe to use and use them to
replace ld.global when possible. This change can improve the performance
by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in 
S3D benchmark of shoc [2]. 

Patched by Xuetian Weng. 

[1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache
[2] https://github.com/vetter/shoc

Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

Reviewers: jholewinski, jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11314

llvm-svn: 242713
2015-07-20 21:28:54 +00:00
Jingyue Wu
efaa872a9c [NVPTX] roll forward r239082
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param

Added an llc test in bug21465.

llvm-svn: 239100
2015-06-04 21:28:26 +00:00
Sergey Dmitrouk
7bfbc12128 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235989
2015-04-28 14:05:47 +00:00
Daniel Jasper
39180626db Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

llvm-svn: 235987
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk
01a4dcd3bb [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235977
2015-04-28 11:56:37 +00:00
Eli Bendersky
feff27462f Simplify boolean expressions with true and false using clang-tidy
Patch by Richard (legalize@xmission.com)

Differential Revision: http://reviews.llvm.org/D8521

llvm-svn: 232961
2015-03-23 16:26:23 +00:00
Daniel Sanders
b2b69459a8 Recommit r232027 with PR22883 fixed: Add infrastructure for support of multiple memory constraints.
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break
anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate
Constraint_* values.

PR22883 was caused the matching operands copying the whole of the operand flags
for the matched operand. This included the constraint id which needed to be
replaced with the operand number. This has been fixed with a conversion
function. Following on from this, matching operands also used the operand
number as the constraint id. This has been fixed by looking up the matched
operand and taking it from there. 

llvm-svn: 232165
2015-03-13 12:45:09 +00:00
Hal Finkel
dc4180d54f Revert "r232027 - Add infrastructure for support of multiple memory constraints"
This (r232027) has caused PR22883; so it seems those bits might be used by
something else after all. Reverting until we can figure out what else to do.

Original commit message:

The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.

llvm-svn: 232093
2015-03-12 20:09:39 +00:00
Daniel Sanders
4eee6f840d Add infrastructure for support of multiple memory constraints.
Summary:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.

This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.

The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8171

llvm-svn: 232027
2015-03-12 11:00:48 +00:00
Eric Christopher
0d42545048 Remove all use of is64bit off of NVPTXSubtarget and clean up code
accordingly. This changes the constructors of a number of classes
that don't need to know the subtarget's 64-bitness.

llvm-svn: 229787
2015-02-19 00:08:27 +00:00
Duncan P. N. Exon Smith
317d1b73cd NVPTX: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

llvm-svn: 229260
2015-02-14 15:35:43 +00:00
Benjamin Kramer
4b76aa3d46 MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros
Update all callers.

llvm-svn: 228930
2015-02-12 15:35:40 +00:00
Eric Christopher
ce7c64b05c Remove unused argument.
llvm-svn: 227539
2015-01-30 01:41:01 +00:00
Eric Christopher
cf53def49f Migrate NVPTXISelDAGToDAG's getSubtarget to a runOnMachineFunction
version. Update NVPTXInstrInfo accordingly.

llvm-svn: 227538
2015-01-30 01:40:59 +00:00
Hal Finkel
2d0b95ee36 [NVPTX] Remove MemIntrinsicSDNode/MemSDNode duplicate checking
As of r214452, isa<MemSDNode> will return true for nodes for which
isa<MemIntrinsicSDNode> will return true (classof now respects the actual class
hierarchy). So we no longer need to check for both MemIntrinsicSDNode and
MemSDNode separately.

No functionality change intended.

llvm-svn: 215523
2014-08-13 04:59:51 +00:00
Sylvestre Ledru
d22d211a98 Fix typos:
* libaries => libraries
* avaiable => available

llvm-svn: 215366
2014-08-11 18:04:46 +00:00
Justin Holewinski
d906a7b7fa [NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.

llvm-svn: 213793
2014-07-23 20:23:47 +00:00
Justin Holewinski
3021ef095b [NVPTX] Improve handling of FP fusion
We now consider the FPOpFusion flag when determining whether
to fuse ops.  We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its
own.

llvm-svn: 213287
2014-07-17 18:10:09 +00:00
Justin Holewinski
9c3e284e16 [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch
This also uses TSFlags to mark machine instructions that are surface/texture
accesses, as well as the vector width for surface operations.  This is used
to simplify some of the switch statements that need to detect surface/texture
instructions

llvm-svn: 213256
2014-07-17 11:59:04 +00:00
Justin Holewinski
0ed0e4b150 [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

llvm-svn: 211939
2014-06-27 18:35:51 +00:00
Justin Holewinski
7663d1fd5a [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

llvm-svn: 211935
2014-06-27 18:35:37 +00:00
Justin Holewinski
2ffa2d24b0 [NVPTX] Add isel patterns for bit-field extract (bfe)
llvm-svn: 211932
2014-06-27 18:35:27 +00:00
Craig Topper
79b097d66a Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
llvm-svn: 207616
2014-04-30 07:17:30 +00:00
Craig Topper
6d411cb95a [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Chandler Carruth
ae889a5f85 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

llvm-svn: 206842
2014-04-22 02:41:26 +00:00
Chandler Carruth
74d9386b11 [Modules] Consolidate the DEBUG_TYPE defines in NVPTX to the top of the
cpp file rather than in the header and then again in the cpp file.

llvm-svn: 206778
2014-04-21 19:53:55 +00:00
Craig Topper
69e0e91431 Convert SelectionDAG::getVTList to use ArrayRef
llvm-svn: 206357
2014-04-16 06:10:51 +00:00
Nick Lewycky
82ad9fc7c8 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Justin Holewinski
b035f9f3e4 [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces
This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later).

llvm-svn: 205907
2014-04-09 15:39:15 +00:00
Justin Holewinski
dd04498e61 [NVPTX] Add isel patterns for addrspacecast
llvm-svn: 204600
2014-03-24 11:17:53 +00:00
Nuno Lopes
79d18a66ec remove a bunch of unused private methods
found with a smarter version of -Wunused-member-function that I'm playwing with.
Appologies in advance if I removed someone's WIP code.

 include/llvm/CodeGen/MachineSSAUpdater.h            |    1 
 include/llvm/IR/DebugInfo.h                         |    3 
 lib/CodeGen/MachineSSAUpdater.cpp                   |   10 --
 lib/CodeGen/PostRASchedulerList.cpp                 |    1 
 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp    |   10 --
 lib/IR/DebugInfo.cpp                                |   12 --
 lib/MC/MCAsmStreamer.cpp                            |    2 
 lib/Support/YAMLParser.cpp                          |   39 ---------
 lib/TableGen/TGParser.cpp                           |   16 ---
 lib/TableGen/TGParser.h                             |    1 
 lib/Target/AArch64/AArch64TargetTransformInfo.cpp   |    9 --
 lib/Target/ARM/ARMCodeEmitter.cpp                   |   12 --
 lib/Target/ARM/ARMFastISel.cpp                      |   84 --------------------
 lib/Target/Mips/MipsCodeEmitter.cpp                 |   11 --
 lib/Target/Mips/MipsConstantIslandPass.cpp          |   12 --
 lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp              |   21 -----
 lib/Target/NVPTX/NVPTXISelDAGToDAG.h                |    2 
 lib/Target/PowerPC/PPCFastISel.cpp                  |    1 
 lib/Transforms/Instrumentation/AddressSanitizer.cpp |    2 
 lib/Transforms/Instrumentation/BoundsChecking.cpp   |    2 
 lib/Transforms/Instrumentation/MemorySanitizer.cpp  |    1 
 lib/Transforms/Scalar/LoopIdiomRecognize.cpp        |    8 -
 lib/Transforms/Scalar/SCCP.cpp                      |    1 
 utils/TableGen/CodeEmitterGen.cpp                   |    2 
 24 files changed, 2 insertions(+), 261 deletions(-)

llvm-svn: 204560
2014-03-23 17:09:26 +00:00
Justin Holewinski
925169cb4e [NVPTX] Fix off-by-one error when creating the VT list for an SDNode
llvm-svn: 196503
2013-12-05 12:58:00 +00:00
Nadav Rotem
fd357159bc Mark some command line flags as hidden
llvm-svn: 193013
2013-10-18 23:38:13 +00:00
Tim Northover
c9a7e47164 ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.

Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.

This should fix PR15840.

llvm-svn: 191165
2013-09-22 08:21:56 +00:00
Craig Topper
8da1ee0773 Replace getValueType().getSimpleVT() with getSimpleValueType().
llvm-svn: 188442
2013-08-15 02:44:19 +00:00
Justin Holewinski
0a7b783785 [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append
.ftz to instructions if the nvptx-f32ftz attribute is set to "true"

llvm-svn: 186820
2013-07-22 12:18:04 +00:00
Justin Holewinski
6284a5cea6 [NVPTX] Fix vector loads from parameters that span multiple loads, and fix some typos
llvm-svn: 185332
2013-07-01 12:59:01 +00:00
Justin Holewinski
46d3cad4d8 [NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu.
llvm-svn: 185329
2013-07-01 12:58:52 +00:00
Justin Holewinski
d365a376eb [NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns
Test case is no breakage

llvm-svn: 185175
2013-06-28 17:58:04 +00:00
Justin Holewinski
0f70140107 [NVPTX] Remove i8 register class. PTX support for i8 (.b8, .u8, .s8) is rather poor and we're better off just ignoring it and letting LLVM expand all i8 ops out to i16.
llvm-svn: 185174
2013-06-28 17:57:59 +00:00
Justin Holewinski
fa2e2511f8 [NVPTX] Remove old CONST_NOT_GEN address space that is not being used anymore and causes constants to be emitted in the global address space
llvm-svn: 183652
2013-06-10 13:29:47 +00:00
Justin Holewinski
099d52887f [NVPTX] Fix case where a sext load of an i1 type may produce an
ld.u1 instead of an ld.u8.

llvm-svn: 182924
2013-05-30 12:22:39 +00:00
Andrew Trick
2790ee3a8e Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Justin Holewinski
2a53cbfbe1 [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
llvm-svn: 182394
2013-05-21 16:51:30 +00:00
Michael Liao
3b258b6b24 ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Justin Holewinski
707786b174 [NVPTX] Run clang-format on all NVPTX sources.
Hopefully this resolves any outstanding style issues and gives us
an automated way of ensuring we conform to the style guidelines.

llvm-svn: 178415
2013-03-30 14:29:21 +00:00
Justin Holewinski
9a248309f0 [NVPTX] Disable vector registers
Vectors were being manually scalarized by the backend.  Instead,
let the target-independent code do all of the work.  The manual
scalarization was from a time before good target-independent support
for scalarization in LLVM. However, this forces us to specially-handle
vector loads and stores, which we can turn into PTX instructions that
produce/consume multiple operands.

llvm-svn: 174968
2013-02-12 14:18:49 +00:00
Chandler Carruth
4c1f3c24db Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth
a490793037 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Benjamin Kramer
39afd32d88 NVPTX: Initialize the UseF32FTZ flag.
llvm-svn: 156232
2012-05-05 11:22:02 +00:00