1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00
Commit Graph

33788 Commits

Author SHA1 Message Date
Jingyue Wu
96bbc40a55 Temporarily revert r242871
PR24299

llvm-svn: 243522
2015-07-29 15:26:11 +00:00
Bill Schmidt
e342da2fca [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask
Given certain shuffle-vector masks, LLVM emits splat instructions
which splat the wrong bytes from the source register.  The issue is
that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
does not ensure that the splat pattern found is requesting bytes that
are aligned on an EltSize boundary.  This patch detects this situation
as not a valid splat mask, resulting in a permute being generated
instead of a splat.

Patch and test case by Tyler Kenney, cleaned up a bit by me.

This is a simple bug fix that would be good to incorporate into 3.7.

llvm-svn: 243519
2015-07-29 14:31:57 +00:00
Akira Hatanaka
c1ecb014d8 [AArch64] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -aarch64-strict-align to decide whether strict alignment should be
forced.

rdar://problem/21529937

llvm-svn: 243516
2015-07-29 14:17:26 +00:00
Alex Lorenz
cfa492a57d Fix broken ArrayRef conversion from r243497.
llvm-svn: 243501
2015-07-28 23:34:27 +00:00
Sanjay Patel
f59ca7d0e7 fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold
This fix was suggested as part of D11345 and is part of fixing PR24141.

With this change, we can avoid walking the uses of a divisor node if the target
doesn't want the combineRepeatedFPDivisors transform in the first place.

There is no NFC-intended other than that.

Differential Revision: http://reviews.llvm.org/D11531

llvm-svn: 243498
2015-07-28 23:05:48 +00:00
Alex Lorenz
13d7649f78 MIR Serialization: Serialize the target index machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243497
2015-07-28 23:02:45 +00:00
Akira Hatanaka
da45bae202 [ARM] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -arm-strict-align to decide whether strict alignment should be
forced. Also, remove the logic that was checking the OS and architecture
as clang is now responsible for setting strict-align based on the command
line options specified and the target architecute and OS.

rdar://problem/21529937

http://reviews.llvm.org/D11470

llvm-svn: 243493
2015-07-28 22:44:28 +00:00
Tim Northover
2a4f107a53 AArch64: be careful of large immediates when optimising cmps.
llvm-svn: 243492
2015-07-28 22:42:32 +00:00
Bruno Cardoso Lopes
e5225dc00d [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply 243271 with more fixes; although we are not handling multiple
sources with coalescable copies, we were not properly skipping this
case.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243486
2015-07-28 21:45:50 +00:00
Vasileios Kalintiris
cc192bfbf1 [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls.
Summary:
Currently, we support only the MIPS O32 ABI calling convention for call
lowering. With this change we avoid using the O32 calling convetion for
lowering calls marked as using the fast calling convention.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11515

llvm-svn: 243485
2015-07-28 21:43:31 +00:00
Vasileios Kalintiris
21dafebde9 [mips][FastISel] Fix generated code for IR's select instruction.
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11506

llvm-svn: 243469
2015-07-28 19:57:25 +00:00
Matt Arsenault
51dce12776 AMDGPU: Don't try to use LDS/vector for private if pointer value stored
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.

llvm-svn: 243462
2015-07-28 18:47:00 +00:00
Matt Arsenault
16b3f40599 AMDGPU: Fix crash if called function is a bitcast
getCalledFunction() is null, so this would crash. Replace
crash with an error on unsupported call.

llvm-svn: 243461
2015-07-28 18:29:14 +00:00
Matt Arsenault
27b16f9fd2 AMDGPU: Fix return type of getImplicitParameterOffset.
Patch by Zoltan Gilian <zoltan.gilian@gmail.com>

llvm-svn: 243459
2015-07-28 18:09:55 +00:00
JF Bastien
47b5b15818 WebAssembly: MCAsmInfo only has one syntax variant for now.
Summary: MCAsmInfo is set up with the default AssemblerDialect, which is zero.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11567

llvm-svn: 243452
2015-07-28 17:23:07 +00:00
Chih-Hung Hsieh
17e97fbf13 Implement target independent TLS compatible with glibc's emutls.c.
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243438
2015-07-28 16:24:05 +00:00
Geoff Berry
72eac10c87 [AArch64] Match float round and convert to int instructions.
Summary:
Add patterns for doing floating point round with various rounding modes
followed by conversion to int as a single FCVT* instruction.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11424

llvm-svn: 243422
2015-07-28 15:24:10 +00:00
Adhemerval Zanella
b3b937fabe Implement __builtin_thread_pointer
This path add the aarch64 lowering of __builtin_thread_pointer.  It uses
the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.

llvm-svn: 243412
2015-07-28 13:03:31 +00:00
Michael Kuperstein
07f74e1a99 [X86] Remove mergeSPUpdatesUp()
X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and an
mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly 
different interface. Removed the less general function.
NFC.

Differential Revision: http://reviews.llvm.org/D11510

llvm-svn: 243396
2015-07-28 08:56:13 +00:00
Simon Pilgrim
8ef138051c [X86][SSE] Use bitmasks instead of shuffles where possible.
VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions.

Split off from D11518.

Differential Revision: http://reviews.llvm.org/D11541

llvm-svn: 243395
2015-07-28 08:54:41 +00:00
Igor Breger
28223d1ba3 AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11528

llvm-svn: 243390
2015-07-28 06:53:28 +00:00
Sanjay Patel
fa4a9158e9 fix invalid load folding with SSE/AVX FP logical instructions (PR22371)
This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ).
I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy
when targeting 32-bit x86 and causes a miscompile/crash in Wine:
https://bugs.winehq.org/show_bug.cgi?id=38826
https://llvm.org/bugs/show_bug.cgi?id=22371#c25

The quick fix is to simply remove the scalar FP logical instructions from the load folding table
in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs,
fneg, fcopysign. So the majority of this patch is altering those lowerings to use *vector* FP
logical instructions (because that's all x86 gives us anyway). That lets us do the load folding 
legally.

Differential Revision: http://reviews.llvm.org/D11477

llvm-svn: 243361
2015-07-28 00:48:32 +00:00
JF Bastien
0149d0fb75 WebAssembly: add a generic CPU
Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated.

Subscribers: jfb, llvm-commits, sunfish

Differential Revision: http://reviews.llvm.org/D11546

llvm-svn: 243345
2015-07-27 23:25:54 +00:00
JF Bastien
ee1632c917 WebAssembly: more MCAsmInfo nits.
Summary: As suggested by sunfish.

Subscribers: jfb, llvm-commits, sunfish

Differential Revision: http://reviews.llvm.org/D11544

llvm-svn: 243339
2015-07-27 22:40:31 +00:00
Alexandros Lamprineas
873b41a931 - Added support for parsing HWDiv features using Target Parser.
- Architecture extensions are represented as a bitmap.

Phabricator: http://reviews.llvm.org/D11457
llvm-svn: 243335
2015-07-27 22:26:59 +00:00
Colin LeMahieu
d4a6dbfde8 [llvm-mc] Pushing plumbing through for --fatal-warnings flag.
llvm-svn: 243334
2015-07-27 21:56:53 +00:00
Sanjay Patel
4377acadb6 remove unnecessary forward declaration; NFC
llvm-svn: 243328
2015-07-27 21:11:55 +00:00
Sanjay Patel
826bbb3a0c don't repeat function names in comments; NFC
llvm-svn: 243327
2015-07-27 21:03:03 +00:00
JF Bastien
b32f45ff30 WebAssembly: minor MCAsmInfo fixes
Summary:
Fix pointer / callee-save stack sto size.
Update comment character to be LISP-ish.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11537

llvm-svn: 243326
2015-07-27 20:46:51 +00:00
Bruno Cardoso Lopes
59ac98220a Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources"
Still breaks some ARM buildbots. This reverts r243271.

llvm-svn: 243318
2015-07-27 20:26:04 +00:00
Akira Hatanaka
ae88dc4be0 [AArch64] Remove check for Darwin that was needed to decide if x18 should
be reserved.

The decision to reserve x18 is going to be made solely by the front-end,
so it isn't necessary to check if the OS is Darwin in the backend.

llvm-svn: 243308
2015-07-27 19:18:47 +00:00
Diego Novillo
1849ae0b08 Fix ODR violation. NFC.
There is an ODR conflict between lib/ExecutionEngine/ExecutionEngineBindings.cpp
and lib/Target/TargetMachineC.cpp. The inline definitions should simply
be marked static (thanks dblaikie for the hint).

llvm-svn: 243298
2015-07-27 18:27:23 +00:00
Marek Olsak
c7618b203e AMDGPU: don't match vgpr loads for constant loads
Author: Dave Airlie <airlied@redhat.com>

In order to implement indirect sampler loads, we don't
want to match on a VGPR load but an SGPR one for constants,
as we cannot feed VGPRs to the sampler only SGPRs.

this should be applicable for llvm 3.7 as well.

llvm-svn: 243294
2015-07-27 18:16:08 +00:00
Sanjay Patel
dfb3c96c10 fix typo and spacing; NFC
llvm-svn: 243287
2015-07-27 17:39:20 +00:00
Pete Cooper
89fb65d96c Revert "Add const to some Type* parameters which didn't need to be mutable. NFC."
This reverts commit r243146.

Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state.

llvm-svn: 243282
2015-07-27 17:15:24 +00:00
Bruno Cardoso Lopes
826fe2e37d [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply r242295 with fixes in the implementation.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243271
2015-07-27 14:39:46 +00:00
Silviu Baranga
18a73d3e6e [ARM/AArch64] Fix cost model for interleaved accesses
Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.

This would pottentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.

No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11524

llvm-svn: 243270
2015-07-27 14:39:34 +00:00
Simon Pilgrim
c1ccf86144 [X86] Reordered lowerVectorShuffleAsBitMask before lowerVectorShuffleAsBlend. NFCI.
Allows us to show diffs for D11518 more clearly

llvm-svn: 243264
2015-07-27 12:37:19 +00:00
Marek Olsak
2ce56817b0 AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround
This is a candidate for 3.7.

llvm-svn: 243263
2015-07-27 11:37:42 +00:00
Sean Silva
7622b11f8c Avoid using uncommon acronym "MSROM".
llvm-svn: 243256
2015-07-27 00:46:59 +00:00
Igor Breger
255a11f9b8 Implemented encoding and intrinsics of the following instructions
vunpckhps/pd, vunpcklps/pd, 
  vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11509

llvm-svn: 243246
2015-07-26 14:41:44 +00:00
Juergen Ributzka
b1788053ee [AArch64][FastISel] Always use an AND instruction when truncating to non-legal types.
When truncating to non-legal types (such as i16, i8 and i1) always use an AND
instruction to mask out the upper bits. This was only done when the source type
was an i64, but not when the source type was an i32.

This commit fixes this and adds the missing i32 truncate tests.

This fixes rdar://problem/21990703.

llvm-svn: 243198
2015-07-25 02:16:53 +00:00
Eric Christopher
e77ef518e8 Fix PPCMaterializeInt to check the size of the integer based on the
extension property we're requesting - zero or sign extended.

This fixes cases where we want to return a zero extended 32-bit -1
and not be sign extended for the entire register. Also updated the
already out of date comment with the current behavior.

llvm-svn: 243192
2015-07-25 00:48:08 +00:00
Eric Christopher
2214db93b4 PPCMaterializeInt should only take a ConstantInt so represent this in the prototype
and fix up all uses.

llvm-svn: 243191
2015-07-25 00:48:06 +00:00
Akira Hatanaka
70eb2824e6 [AArch64] Define subtarget feature "reserve-x18", which is used to decide
whether register x18 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "aarch64-reserve-x18" when doing LTO.

Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18"
to reserve x18 should make changes to add subtarget feature "reserve-x18"
to the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11463

llvm-svn: 243186
2015-07-25 00:18:31 +00:00
Pete Cooper
a8e3702859 Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern

for (auto I = x.rbegin(), E = x.end(); I != E; ++I)

we can use make_range to construct the reverse range and iterate using
that instead.

llvm-svn: 243163
2015-07-24 21:13:43 +00:00
Pete Cooper
cf8940b509 Add const to some Type* parameters which didn't need to be mutable. NFC.
We were only getting the size of the type which doesn't need to modify
the type.

llvm-svn: 243146
2015-07-24 19:19:26 +00:00
Pete Cooper
31257c8c3c Use foreach loops for StructType::elements(). NFC.
We had a few places where we did

for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {

but those could instead do

for (auto *EltTy : STy->elements()) {

llvm-svn: 243136
2015-07-24 18:55:49 +00:00
Igor Breger
7ea6c4cce1 AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 243122
2015-07-24 17:24:15 +00:00
Mehdi Amini
9f0c09e5bf Remove access to the DataLayout in the TargetMachine
Summary:
Replace getDataLayout() with a createDataLayout() method to make
explicit that it is intended to create a DataLayout only and not
accessing it for other purpose.

This change is the last of a series of commits dedicated to have a
single DataLayout during compilation by using always the one owned
by the module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11103

(cherry picked from commit 5609fc56bca971e5a7efeaa6ca4676638eaec5ea)

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243114
2015-07-24 16:04:22 +00:00