1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00
Commit Graph

33754 Commits

Author SHA1 Message Date
Pete Cooper
89fb65d96c Revert "Add const to some Type* parameters which didn't need to be mutable. NFC."
This reverts commit r243146.

Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state.

llvm-svn: 243282
2015-07-27 17:15:24 +00:00
Bruno Cardoso Lopes
826fe2e37d [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply r242295 with fixes in the implementation.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243271
2015-07-27 14:39:46 +00:00
Silviu Baranga
18a73d3e6e [ARM/AArch64] Fix cost model for interleaved accesses
Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.

This would pottentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.

No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11524

llvm-svn: 243270
2015-07-27 14:39:34 +00:00
Simon Pilgrim
c1ccf86144 [X86] Reordered lowerVectorShuffleAsBitMask before lowerVectorShuffleAsBlend. NFCI.
Allows us to show diffs for D11518 more clearly

llvm-svn: 243264
2015-07-27 12:37:19 +00:00
Marek Olsak
2ce56817b0 AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround
This is a candidate for 3.7.

llvm-svn: 243263
2015-07-27 11:37:42 +00:00
Sean Silva
7622b11f8c Avoid using uncommon acronym "MSROM".
llvm-svn: 243256
2015-07-27 00:46:59 +00:00
Igor Breger
255a11f9b8 Implemented encoding and intrinsics of the following instructions
vunpckhps/pd, vunpcklps/pd, 
  vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11509

llvm-svn: 243246
2015-07-26 14:41:44 +00:00
Juergen Ributzka
b1788053ee [AArch64][FastISel] Always use an AND instruction when truncating to non-legal types.
When truncating to non-legal types (such as i16, i8 and i1) always use an AND
instruction to mask out the upper bits. This was only done when the source type
was an i64, but not when the source type was an i32.

This commit fixes this and adds the missing i32 truncate tests.

This fixes rdar://problem/21990703.

llvm-svn: 243198
2015-07-25 02:16:53 +00:00
Eric Christopher
e77ef518e8 Fix PPCMaterializeInt to check the size of the integer based on the
extension property we're requesting - zero or sign extended.

This fixes cases where we want to return a zero extended 32-bit -1
and not be sign extended for the entire register. Also updated the
already out of date comment with the current behavior.

llvm-svn: 243192
2015-07-25 00:48:08 +00:00
Eric Christopher
2214db93b4 PPCMaterializeInt should only take a ConstantInt so represent this in the prototype
and fix up all uses.

llvm-svn: 243191
2015-07-25 00:48:06 +00:00
Akira Hatanaka
70eb2824e6 [AArch64] Define subtarget feature "reserve-x18", which is used to decide
whether register x18 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "aarch64-reserve-x18" when doing LTO.

Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18"
to reserve x18 should make changes to add subtarget feature "reserve-x18"
to the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11463

llvm-svn: 243186
2015-07-25 00:18:31 +00:00
Pete Cooper
a8e3702859 Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern

for (auto I = x.rbegin(), E = x.end(); I != E; ++I)

we can use make_range to construct the reverse range and iterate using
that instead.

llvm-svn: 243163
2015-07-24 21:13:43 +00:00
Pete Cooper
cf8940b509 Add const to some Type* parameters which didn't need to be mutable. NFC.
We were only getting the size of the type which doesn't need to modify
the type.

llvm-svn: 243146
2015-07-24 19:19:26 +00:00
Pete Cooper
31257c8c3c Use foreach loops for StructType::elements(). NFC.
We had a few places where we did

for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {

but those could instead do

for (auto *EltTy : STy->elements()) {

llvm-svn: 243136
2015-07-24 18:55:49 +00:00
Igor Breger
7ea6c4cce1 AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 243122
2015-07-24 17:24:15 +00:00
Mehdi Amini
9f0c09e5bf Remove access to the DataLayout in the TargetMachine
Summary:
Replace getDataLayout() with a createDataLayout() method to make
explicit that it is intended to create a DataLayout only and not
accessing it for other purpose.

This change is the last of a series of commits dedicated to have a
single DataLayout during compilation by using always the one owned
by the module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11103

(cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea)

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243114
2015-07-24 16:04:22 +00:00
Sanjay Patel
727976dcd1 fix wrong comment; NFC
llvm-svn: 243113
2015-07-24 16:02:14 +00:00
Luke Cheeseman
399cf77ee3 [ARM] - Fix lowering of shufflevectors in AArch32
Some shufflevectors are currently being incorrectly lowered in the AArch32
backend as the existing checks for detecting the NEON operations from the
shufflevector instruction expects the shuffle mask and the vector operands to be
of the same length.

This is not always the case as the mask may be twice as long as the operand;
here only the lower half of the shufflemask gets checked, so provided the lower
half of the shufflemask looks like a vector transpose (or even is just all -1
for undef) then the intrinsics may get incorrectly lowered into a vector
transpose (VTRN) instruction.

This patch fixes this by accommodating for both cases and adds regression tests.

Differential Revision: http://reviews.llvm.org/D11407

llvm-svn: 243103
2015-07-24 09:57:05 +00:00
Luke Cheeseman
2d94f99ac0 When lowering vector shifts a check is performed to see if the value to shift by
is an immediate, in this check the value is negated and stored in and int64_t.
The value can be -2^63 yet the result cannot be stored in an int64_t and this
gives some undefined behaviour causing failures. The negation is only necessary
when the values is within a certain range and so it should not need to negate
-2^63, this patch introduces this and also a regression test.

Differential Revision: http://reviews.llvm.org/D11408

llvm-svn: 243100
2015-07-24 09:31:48 +00:00
Mehdi Amini
3a4451076e Revert "Remove access to the DataLayout in the TargetMachine"
This reverts commit 0f720d984f419c747709462f7476dff962c0bc41.

It breaks clang too badly, I need to prepare a proper patch for clang
first.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243089
2015-07-24 03:36:55 +00:00
Alexei Starovoitov
cf5bedb865 [bpf] initial support for debug_info
llvm-svn: 243087
2015-07-24 03:17:08 +00:00
Mehdi Amini
30b6ee8541 Remove access to the DataLayout in the TargetMachine
Summary:
Replace getDataLayout() with a createDataLayout() method to make
explicit that it is intended to create a DataLayout only and not
accessing it for other purpose.

This change is the last of a series of commits dedicated to have a
single DataLayout during compilation by using always the one owned
by the module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11103

(cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea)

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243083
2015-07-24 01:44:39 +00:00
Lawrence Hu
a35c8e634c test commit, only added one space
llvm-svn: 243070
2015-07-23 23:55:28 +00:00
David Gross
918580d2ee [ARM] Register (existing) ARMLoadStoreOpt pass with LLVM pass manager.
Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass.

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11373

llvm-svn: 243052
2015-07-23 22:12:46 +00:00
David Gross
a73bcd305c Test commit.
llvm-svn: 243046
2015-07-23 21:46:09 +00:00
Duncan P. N. Exon Smith
2f20200c0a X86: Use dyn_cast instead of isa+cast, NFC
llvm-svn: 243034
2015-07-23 19:27:07 +00:00
Weiming Zhao
a648a9cbea This patch eanble register coalescing to coalesce the following:
%vreg2<def> = MOVi32imm 1; GPR32:%vreg2
  %W1<def> = COPY %vreg2; GPR32:%vreg2
into:
  %W1<def> = MOVi32imm 1
Patched by Lawrence Hu (lawrence@codeaurora.org)

llvm-svn: 243033
2015-07-23 19:24:53 +00:00
Michael Kuperstein
a5f842a1d9 [X86] Allow load folding into PUSH instructions
Adds pushes to the folding tables.
This also required a fix to the TD definition, since the memory forms of 
the push instructions did not have the right mayLoad/mayStore flags.

Differential Revision: http://reviews.llvm.org/D11340

llvm-svn: 243010
2015-07-23 12:23:45 +00:00
Michael Kuperstein
197cefcc3e [X86] Fix order of operands for ins and outs instructions when parsing intel syntax
Patch by: marina.yatsina@intel.com
Differential Revision: http://reviews.llvm.org/D11337

llvm-svn: 243001
2015-07-23 10:23:48 +00:00
Elena Demikhovsky
60c86dad61 X86: Fixed assertion failure in 32-bit mode
The DAG Node "SCALAR_TO_VECTOR" may be created if the type of the scalar element is legal.
Added a check for the scalar type before creating this node.
Added a test that fails with assertion on the current version.

Differential Revision: http://reviews.llvm.org/D11413

llvm-svn: 242994
2015-07-23 08:25:23 +00:00
Chandler Carruth
3b9bc26920 Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..."
This commit broke the build. Numerous build bots broken, and it was
blocking my progress so reverting.

It should be trivial to reproduce -- enable the BPF backend and it
should fail when running llvm-tblgen.

llvm-svn: 242992
2015-07-23 08:03:44 +00:00
Igor Breger
8ded9931fe AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 242990
2015-07-23 07:39:21 +00:00
Igor Breger
7f635f1aff AVX : Fix ISA disabling in case AVX512VL , some instructions should be disabled only if AVX512BW and AVX512VL present.
Tests added.

Differential Revision: http://reviews.llvm.org/D11414

llvm-svn: 242987
2015-07-23 07:11:14 +00:00
Jingyue Wu
a5bed5dde5 [NVPTX] run LSR before straight-line optimizations
Summary:
Straight-line optimizations can simplify the loop body and make LSR's
cost analysis more precise. This significantly improves several Eigen3
CUDA benchmarks.

With this change, EigenContractionKernel runs up to 40% faster
(753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorContractionCuda.h (cl-502)).
EigenConvolutionKernel2D runs up to 10% faster
(753ceee5f2/unsupported/Eigen/CXX11/src/Tensor/TensorConvolution.h (cl-605)).

I have some difficulties writing small tests that benefit from this
reordering due to a seemingly issue with LSR (being discussed at
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/088244.html).

See the review thread for the compilation time impact of GVN. 

Reviewers: eliben, jholewinski

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11304

llvm-svn: 242982
2015-07-23 04:59:07 +00:00
Sanjay Patel
0d52c4ec2f fix typo; NFC
llvm-svn: 242947
2015-07-22 21:56:41 +00:00
Sanjay Patel
c70c2e75c0 fix indent; NFC
llvm-svn: 242946
2015-07-22 21:47:13 +00:00
JF Bastien
f617ddf2b8 WebAssembly: basic bitcode → assembly CodeGen test
Summary:
Add a basic CodeGen bitcode test which (for now) only prints out the function name and nothing else. The current code merely implements the basic needed for the test run to not crash / assert. Getting to that point required:

 - Basic InstPrinter.
 - Basic AsmPrinter.
 - DiagnosticInfoUnsupported (not strictly required, but nice to have, duplicated from AMDGPU/BPF's ISelLowering).
 - Some SP and register setup in WebAssemblyTargetLowering.
 - Basic LowerFormalArguments.
 - GenInstrInfo.
 - Placeholder LowerFormalArguments.
 - Placeholder CanLowerReturn and LowerReturn.
 - Basic DAGToDAGISel::Select, which requiresGenDAGISel.inc as well as GET_INSTRINFO_ENUM with GenInstrInfo.inc.
 - Remove WebAssemblyFrameLowering::determineCalleeSaves and rely on default.
 - Implement WebAssemblyFrameLowering::hasFP, same as AArch64's implementation.

Follow-up patches will implement a real AsmPrinter, which will require adding MI opcodes specific to WebAssembly.

Reviewers: sunfish

Subscribers: aemerson, jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11369

llvm-svn: 242939
2015-07-22 21:28:15 +00:00
Hans Wennborg
34fee45808 Fix -Wextra-semi warnings.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D11400

llvm-svn: 242930
2015-07-22 20:46:11 +00:00
Chad Rosier
9d7b3c4c1e Simplify switch as all cases other than default return true. NFC.
llvm-svn: 242922
2015-07-22 18:41:57 +00:00
Quentin Colombet
30a4f74165 [ARM] Make the frame lowering code ready for shrink-wrapping.
Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap.

Related to <rdar://problem/20821730>

llvm-svn: 242908
2015-07-22 16:34:37 +00:00
Asaf Badouh
7feb9eaba0 [X86][AVX512] add reduce/range/scalef/rndScale
include encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D11222

llvm-svn: 242896
2015-07-22 12:00:43 +00:00
Michael Kuperstein
80699ec16e [X86] Add .intel_syntax noprefix directive to intel-syntax x86 asm output
Patch by: michael.zuckerman@intel.com
Differential Revision: http://reviews.llvm.org/D11223

llvm-svn: 242886
2015-07-22 10:49:44 +00:00
Chandler Carruth
ebae815d81 [PM/AA] Remove all of the dead AliasAnalysis pointers being threaded
through APIs that are no longer necessary now that the update API has
been removed.

This will make changes to the AA interfaces significantly less
disruptive (I hope). Either way, it seems like a really nice cleanup.

llvm-svn: 242882
2015-07-22 09:52:54 +00:00
Elena Demikhovsky
be2ecab469 AVX-512: Added intrinsics for VCVT* instructions.
All SKX forms. All VCVT instructions for float/double/int/long types.

Differential Revision: http://reviews.llvm.org/D11343

llvm-svn: 242877
2015-07-22 08:56:00 +00:00
Jingyue Wu
45a0757122 [BranchFolding] do not iterate the aliases of virtual registers
Summary:
MCRegAliasIterator only works for physical registers. So, do not run it
on virtual registers.

With this issue fixed, we can resurrect the BranchFolding pass in NVPTX
backend.

Reviewers: jholewinski, bkramer

Subscribers: henryhu, meheff, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11174

llvm-svn: 242871
2015-07-22 04:16:52 +00:00
Bill Schmidt
76e220a5ef [PPC64LE] More vector swap optimization TLC
This makes one substantive change and a few stylistic changes to the
VSX swap optimization pass.

The substantive change is to permit LXSDX and LXSSPX instructions to
participate in swap optimization computations.  The previous change to
insert a swap following a SUBREG_TO_REG widening operation makes this
almost trivial.

I experimented with also permitting STXSDX and STXSSPX instructions.
This can be done using similar techniques:  we could insert a swap
prior to a narrowing COPY operation, and then permit these stores to
participate.  I prototyped this, but discovered that the pattern of a
narrowing COPY followed by an STXSDX does not occur in any of our
test-suite code.  So instead, I added commentary indicating that this
could be done.

Other TLC:
 - I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate
 the direction of the copy.
 - I factored the insertion of swap instructions into a separate
 function.

Finally, I added a new test case to check that the scalar-to-vector
loads are working properly with swap optimization.

llvm-svn: 242838
2015-07-21 21:40:17 +00:00
Chad Rosier
ae9b48df5a Follow up to r242810. NFC.
llvm-svn: 242812
2015-07-21 17:47:56 +00:00
Chad Rosier
69ef021bf9 [AArch64] Simplify the passing of arguments. NFC.
This is setup for future work planned for the AArch64 Load/Store Opt pass.

llvm-svn: 242810
2015-07-21 17:42:04 +00:00
Igor Breger
5441f451cc AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction ,
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11351

llvm-svn: 242761
2015-07-21 07:11:28 +00:00
Akira Hatanaka
cc98d1ef43 [ARM] Define subtarget feature "reserve-r9", which is used to decide
whether register r9 should be reserved.

This recommits r242737, which broke bots because the number of subtarget
features went over the limit of 64.

This change is needed because we cannot use a backend option to set
cl::opt "arm-reserve-r9" when doing LTO.

Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to
reserve r9 should make changes to add subtarget feature "reserve-r9" to
the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11320

llvm-svn: 242756
2015-07-21 01:42:02 +00:00