1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

35500 Commits

Author SHA1 Message Date
Craig Topper
474cc56790 [X86] Use assert instead of if and llvm_unreachable. NFC
llvm-svn: 256420
2015-12-25 17:07:27 +00:00
Craig Topper
e1f71ad36d [X86] Minor identation fixes. NFC
llvm-svn: 256419
2015-12-25 17:07:24 +00:00
Dan Gohman
9c3961aabe [WebAssembly] Fix handling of COPY instructions in WebAssemblyRegStackify.
Move RegStackify after coalescing and teach it to use LiveIntervals instead
of depending on SSA form. This avoids a problem where a register in a COPY
instruction is stackified and then subsequently coalesced with a register
that is not stackified.

This also puts it after the scheduler, which allows us to simplify the
EXPR_STACK constraint, as we no longer have instructions being reordered
after stackification and before coloring.

llvm-svn: 256402
2015-12-25 00:31:02 +00:00
Marina Yatsina
38411a839c [X86][ms-inline asm] Add support for memory operands that include structs
Add ability to reference struct symbols in memory operands.
Test case will be added on the clang side (review http://reviews.llvm.org/D15749)

Differential Revision: http://reviews.llvm.org/D15748

llvm-svn: 256381
2015-12-24 12:09:51 +00:00
Asaf Badouh
593a27ca5c [X86][PKU] Add {RD,WR}PKRU encoding
Differential Revision: http://reviews.llvm.org/D15711

llvm-svn: 256366
2015-12-24 08:25:00 +00:00
Elena Demikhovsky
423b83228e AVX-512: Kreg set 0/1 optimization
The patterns that set a mask register to 0/1
KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn
are replaced with
KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization.

KNL does not recognize dependency-breaking idioms for mask registers,
so kxnor %k1, %k1, %k2 has a RAW dependence on %k1.
Using %k0 as the undef input register is a performance heuristic based
on the assumption that %k0 is used less frequently than the other mask
registers, since it is not usable as a write mask.

Differential Revision: http://reviews.llvm.org/D15739

llvm-svn: 256365
2015-12-24 08:12:22 +00:00
Igor Breger
855ac148cd AVX512: VPMOVM2B/W/D/Q intrinsic implementation.
Differential Revision: http://reviews.llvm.org//D15747

llvm-svn: 256364
2015-12-24 07:11:53 +00:00
Matt Arsenault
280a86967d AMDGPU: Fix getRegisterBitWidth for vectors
llvm-svn: 256362
2015-12-24 05:14:55 +00:00
Tom Stellard
9454729563 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

llvm-svn: 256360
2015-12-24 03:18:18 +00:00
Tom Stellard
d69dab0e1a AMDGPU/SI: Remove non-existent flat instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15734

llvm-svn: 256357
2015-12-24 02:41:55 +00:00
Simon Pilgrim
1750355ddf [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

Differential Revision: http://reviews.llvm.org/D15477

llvm-svn: 256332
2015-12-23 13:10:07 +00:00
Igor Breger
305115c35b AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immidiate shift v64i8 .Fix predicate for AVX1/2 shifts.
Differential Revision: http://reviews.llvm.org/D15713

llvm-svn: 256324
2015-12-23 08:06:50 +00:00
Dan Gohman
b29eb82387 [WebAssembly] Add a TODO comment for a possible future optimization.
llvm-svn: 256306
2015-12-23 00:22:04 +00:00
Dan Gohman
b04a60c3d7 [WebAssembly] Trim unneeded #includes. NFC.
llvm-svn: 256301
2015-12-22 23:45:21 +00:00
Dan Gohman
cb14b68972 [WebAssembly] Minor code simplification. NFC.
llvm-svn: 256300
2015-12-22 23:39:16 +00:00
Changpeng Fang
8727650a31 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256282
2015-12-22 20:55:23 +00:00
Rafael Espindola
6880ff7f5d Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

llvm-svn: 256275
2015-12-22 19:46:44 +00:00
Changpeng Fang
36fd3caea4 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

llvm-svn: 256273
2015-12-22 19:32:28 +00:00
Cong Hou
50c405416c [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Jun Bum Lim
3cdf780204 [AArch64] Promote loads from stored
This is a recommit of r256004 which was reverted in r256160. The issue was the
incorrect promotion for half and byte loads transformed into mov instructions.
This fix will replace half and byte type loads only with bit field extracts.

Original commit message:

This change promotes load instructions which directly read from stored by
replacing them with mov instructions. If the store is wider than the load,
the load will be replaced with a bitfield extract.
For example :
  STRWui %W1, %X0, 1
  %W0 = LDRHHui %X0, 3
becomes
  STRWui %W1, %X0, 1
  %W0 = UBFMWri %W1, 16, 31

llvm-svn: 256249
2015-12-22 16:36:16 +00:00
Asaf Badouh
d891bbfe44 [X86][AVX512] Add rcp14 and rsqrt14 intrinsics
Differential Revision: http://reviews.llvm.org/D15414

llvm-svn: 256237
2015-12-22 11:40:04 +00:00
Dylan McKay
9c39fe2bb2 [AVR] Added configuration file and machine function information class
This commit adds the 'AVRMachineFunctionInfo' class, which simply stores
basic properties about generated machine functions.

llvm-svn: 256213
2015-12-21 23:13:15 +00:00
Eric Christopher
7774315fd8 Fix line endings after r256155. NFC.
llvm-svn: 256211
2015-12-21 23:04:27 +00:00
David Majnemer
47d3d1e5ef [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Cong Hou
ac19620238 [X86][SSE] Transform truncations between vectors of integers into X86ISD::PACKUS/PACKSS operations during DAG combine.
This patch transforms truncation between vectors of integers into
X86ISD::PACKUS/PACKSS operations during DAG combine. We don't do it in
lowering phase because after type legalization, the original truncation
will be turned into a BUILD_VECTOR with each element that is extracted
from a vector and then truncated, and from them it is difficult to do
this optimization. This greatly improves the performance of truncations
on some specific types.

Cost table is updated accordingly.


Differential revision: http://reviews.llvm.org/D14588

llvm-svn: 256194
2015-12-21 20:42:43 +00:00
Adrian Prantl
1f94dd1efc Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging
instructions.

As noted in PR24563.
rdar://problem/23963293

llvm-svn: 256183
2015-12-21 19:25:03 +00:00
Tom Stellard
e81c016153 AMDGPU/SI: Fix encoding for FLAT_SCRATCH registers on VI
Summary:
These register has different encodings on CI and VI, so we add pseudo
FLAT_SCRACTH registers to be used before MC, and subtarget specific
registers to be used by the MC layer.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15661

llvm-svn: 256178
2015-12-21 18:44:27 +00:00
Tom Stellard
e5c23ab1eb AMDGPU/SI: Change assembly name for flat scratch registers to flat_scratch
This matches what the assembler accepts.

llvm-svn: 256177
2015-12-21 18:44:21 +00:00
Matthew Simpson
f8053302ae [AArch64] Add additional extract-extend patterns for smov
This patch adds to the target description two additional patterns for matching
extract-extend operations to SMOV. The patterns catch the v16i8-to-i64 and
v8i16-to-i64 cases. The existing patterns miss these cases because the
extracted elements must first be legalized to i32, resulting in any_extend
nodes.

This was originally implemented as a DAG combine (r255895), but was reverted
due to failing out-of-tree tests.

llvm-svn: 256176
2015-12-21 18:31:25 +00:00
Chad Rosier
609c3edd2d Remove extra whitespace. NFC.
llvm-svn: 256173
2015-12-21 18:08:05 +00:00
Dan Gohman
d2ebf04d1b [WebAssembly] Convert a regular for loop to a range-based for loop.
llvm-svn: 256169
2015-12-21 17:22:02 +00:00
Dan Gohman
56738226a7 [WebAssembly] Clean up comments and fix a missing #include dependency.
llvm-svn: 256168
2015-12-21 17:19:31 +00:00
Dan Gohman
0b85bc62b4 [WebAssembly] Remove an unneeded empty destructor.
llvm-svn: 256167
2015-12-21 17:12:40 +00:00
Dan Gohman
89be1cee02 [WebAssembly] Enclose the operand variables for load and store instructions in braces.
This allows the AsmMatcherEmitter to properly tokenize the AsmStrings for
load and store instructions. This is a step towards asm parsing.

llvm-svn: 256166
2015-12-21 16:58:49 +00:00
Dan Gohman
d317713f49 [WebAssembly] Mark the ARGUMENT pseudo-instructions as CodeGenOnly.
llvm-svn: 256165
2015-12-21 16:53:29 +00:00
Dan Gohman
42eaa890a3 [WebAssembly] Add some comments and make some minor source cleanups.
llvm-svn: 256164
2015-12-21 16:50:41 +00:00
Jun Bum Lim
9d2c1dfad1 Revert "[AArch64] Promote loads from stores"
This reverts commit r256004 due to a failure in cortex-a53.

llvm-svn: 256160
2015-12-21 15:36:49 +00:00
Chad Rosier
06bdbb1aa9 [AArch64] Enable PostRAScheduler for AArch64 generic build.
Disable post-ra scheduler for perturbed tests to appease the bots and to
preserve the history of the tests.

http://reviews.llvm.org/D15652

llvm-svn: 256158
2015-12-21 14:43:45 +00:00
Igor Breger
933cee316b AVX512BW: Enable AND/OR/XOR vector byte/word paked operation by promoting to qword that natively suppored.
llvm-svn: 256157
2015-12-21 14:40:36 +00:00
Amjad Aboud
8197f10787 Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

llvm-svn: 256155
2015-12-21 14:07:14 +00:00
Zlatko Buljan
87d29677c0 [mips][microMIPS] Implement DERET and DI instructions and check size operand for EXT and DEXT* instructions
Differential Revision: http://reviews.llvm.org/D15570

llvm-svn: 256152
2015-12-21 13:08:58 +00:00
NAKAMURA Takumi
c8ebe717f8 [Cygwin] Enable TLS as emutls.
It resolves clang selfhosting with std::once() for Cygwin.

FIXME: It may be EmulatedTLS-generic also for X86-Android.
FIXME: Pass EmulatedTLS to LLVM CodeGen from Clang with -femulated-tls.
llvm-svn: 256134
2015-12-21 02:37:23 +00:00
Dylan McKay
5f42f3e6f6 [AVR] Added AVRCallingConv.td
llvm-svn: 256130
2015-12-20 23:17:44 +00:00
Craig Topper
b8df4c67a1 [X86] Use range-based for loop. NFC
llvm-svn: 256127
2015-12-20 18:41:57 +00:00
Craig Topper
660ebfbd90 [X86] Prevent constant hoisting for a couple compare immediates that the selection DAG knows how to optimize into a shift.
This allows "icmp ugt %a, 4294967295" and "icmp uge %a, 4294967296" to be optimized into right shifts by 32 which can fold the immediate into the shift instruction. These patterns show up with some regularity in real code.

Unfortunately, since getImmCost can't see the icmp predicate we can't be tell if we're only catching these specific cases.

llvm-svn: 256126
2015-12-20 18:41:54 +00:00
Dylan McKay
f28a05c29f Add AVR.td and AVRRegisterInfo.td
Summary:
This adds the core AVR TableGen file, along with the register descriptions.

Lines in AVR.td which require other TableGen files which haven't been committed
yet are commented out.

This is a fairly trivial patch, and should only require a quick review.

I kept the line width smaller than 80 columns, but there are a few exceptions
because I'm not sure how to split a string over several lines.

Reviewers: stoklund

Subscribers: dylanmckay, agnat

Differential Revision: http://reviews.llvm.org/D14684

llvm-svn: 256120
2015-12-20 12:16:20 +00:00
Weiming Zhao
caf90336a9 Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for Thumb2
Summary:
r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb.

r250697: http://reviews.llvm.org/rL250697

Reviewers: rmaprath

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D15653

llvm-svn: 256115
2015-12-20 06:41:44 +00:00
Tom Stellard
1e9164c058 AMDGPU/SI: Fix implemenation of isSourceOfDivergence() for graphics shaders
Summary:
The analysis of shader inputs was completely wrong.  We were passing the
wrong index to AttributeSet::hasAttribute() and the logic for which
inputs where in SGPRs was wrong too.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15608

llvm-svn: 256082
2015-12-19 02:54:15 +00:00
Matt Arsenault
db9fbb2b79 AMDGPU: Switch barrier intrinsics to using convergent
noduplicate prevents unrolling of small loops that happen to have
barriers in them. If a loop has a barrier in it, it is OK to duplicate
it for the unroll.

llvm-svn: 256075
2015-12-19 01:46:41 +00:00
Nicolai Haehnle
50527c616a AMDGPU/SI: use S_MOV_B64 for larger copies in copyPhysReg
Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15629

llvm-svn: 256073
2015-12-19 01:36:26 +00:00