1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

33644 Commits

Author SHA1 Message Date
Diego Novillo
47b6978e39 Add missing dependency to Hexagon target.
A recent patch added calls to isInstructionTriviallyDead without the
corresponding dependency on TransformUtils.

llvm-svn: 241731
2015-07-08 21:13:37 +00:00
Reid Kleckner
f93836486a [Win64] Only treat some functions as having the Win64 convention
All the usual X86 target-specific conventions are collapsed to the
normal Win64 convention, but the custom conventions like GHC and webkit
should not be.

Previously we would assume that the caller allocated 32 bytes of shadow
space for us, which is not how webkit_jscc or other custom conventions
are supposed to work.

Based on a patch by peavo@outlook.com.

Fixes PR24051.

llvm-svn: 241725
2015-07-08 21:03:47 +00:00
Krzysztof Parzyszek
ac54e4bbae [Hexagon] Implement commoning of GetElementPtr instructions
llvm-svn: 241714
2015-07-08 19:22:28 +00:00
Reid Kleckner
cde3a2cf79 [SEH] Ensure that empty __except blocks have their own BB
The 32-bit lowering assumed that WinEHPrepare had this invariant.
WinEHPrepare did it for C++, but not SEH. The result was that we would
insert calls to llvm.x86.seh.restoreframe in normal basic blocks, which
corrupted the frame pointer.

llvm-svn: 241699
2015-07-08 18:08:52 +00:00
Duncan P. N. Exon Smith
d56771ab94 MC: Constify MCSubtargetInfo in getDeprecationInfo(), NFC
There's no reason to be able to mutate `MCSubtargetInfo` in
`getDeprecationInfo()`.  Constify the reference.

llvm-svn: 241693
2015-07-08 17:30:55 +00:00
Eli Bendersky
34f1645451 Cosmetic cleanups - NFC
Remove commented lines, trailing whitespace, etc.

llvm-svn: 241687
2015-07-08 16:33:21 +00:00
James Y Knight
4f71b891ec [SPARC] Cleanup handling of the Y/ASR registers.
- Implement copying ASR to/from GPR regs.
- Mark ASRs as non-allocatable, so it won't try to arbitrarily use
  them inappropriately.
- Instead of inserting explicit WRASR/RDASR nodes in the MUL/DIV
  routines, just do normal register copies.
- Also...mark div as using Y, not just writing it.

Added a test case with some code which previously died with an
assertion failure (with -O0), or produced wrong code (otherwise).

(Third time's the charm?)

Differential Revision: http://reviews.llvm.org/D10401

llvm-svn: 241686
2015-07-08 16:25:12 +00:00
Krzysztof Parzyszek
1ff6cefced [Hexagon] Generate "insert" instructions more aggressively
llvm-svn: 241683
2015-07-08 14:47:34 +00:00
Krzysztof Parzyszek
0942a0a955 Revert 241681: causes Windows builds to fail
llvm-svn: 241682
2015-07-08 14:34:13 +00:00
Krzysztof Parzyszek
bba8c32fe9 [Hexagon] Generate "insert" instructions more aggressively
llvm-svn: 241681
2015-07-08 14:22:27 +00:00
Simon Pilgrim
5c44e5f75e [X86][SSE] Added (V)ROUNDSD + (V)ROUNDSS stack folding support
llvm-svn: 241671
2015-07-08 08:07:57 +00:00
Mehdi Amini
ce81a1a1e3 Remove IsLittleEndian from TargetLowering and redirect to DataLayout
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11017

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241655
2015-07-08 01:00:38 +00:00
Reid Kleckner
0138358834 [WinEH] Make llvm.x86.seh.restoreframe work for stack realignment prologues
The incoming EBP value points to the end of a local stack allocation, so
we can use that to restore ESI, the base pointer. Once we do that, we
can use local stack allocations. If we know we need stack realignment,
spill the original frame pointer in the prologue and reload it after
restoring ESI.

llvm-svn: 241648
2015-07-07 23:45:58 +00:00
Reid Kleckner
6207d850e4 [WinEH] Add localaddress intrinsic instead of using frameaddress
Clang uses this for SEH finally. The new intrinsic will produce the
right value when stack realignment is required.

llvm-svn: 241643
2015-07-07 23:23:03 +00:00
Arnold Schwaighofer
6d6b413fa3 Add more nvcasts
Tim Northover has told me that they can occur when the compiler cleverly
constructs constants - as demonstrated in the test case.

rdar://21703486

llvm-svn: 241641
2015-07-07 23:13:18 +00:00
Dan Gohman
2f1f3dca07 [WebAssembly] Set the scheduling preference.
llvm-svn: 241637
2015-07-07 22:38:06 +00:00
Reid Kleckner
45072b933e Rename llvm.frameescape and llvm.framerecover to localescape and localrecover
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.

These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.

Naming suggestions at this point are welcome, I'm happy to re-run sed.

Reviewers: majnemer, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11011

llvm-svn: 241633
2015-07-07 22:25:32 +00:00
Sanjay Patel
99204c830c fix typo; NFC
llvm-svn: 241629
2015-07-07 21:31:54 +00:00
Arnold Schwaighofer
a112326960 Add a pattern for a nvcast from v2f64 -> v4f32
Since the NvCast is generated by the selection process the concerns about
endianess and bit reversal don't apply.

rdar://21703486

llvm-svn: 241611
2015-07-07 18:31:55 +00:00
Reid Kleckner
ecbf17944e Use default member initializers to deduplicate code in X86MachineFunctionInfo, NFC
llvm-svn: 241609
2015-07-07 18:12:06 +00:00
Krzysztof Parzyszek
ffdcea2db8 [Hexagon] Fix unused variable warnings in NDEBUG build caused by r241595
llvm-svn: 241600
2015-07-07 16:02:11 +00:00
Reid Kleckner
d19a4a185e [WinEH] Add a report_fatal_error for 32-bit stack realignment
This type of prologue isn't supported yet. Implementing it should be a
matter of copying the adjusted incoming EBP into ESI (the base pointer)
instead of EBP.  The original EBP can be saved and restored from other
memory afterwards.

llvm-svn: 241597
2015-07-07 15:47:29 +00:00
Krzysztof Parzyszek
b3976a4e86 [Hexagon] Implement bit-tracking facility with specifics for Hexagon
This includes code that is intended to be target-independent as well
as the Hexagon-specific details. This is just the framework without
any users.

llvm-svn: 241595
2015-07-07 15:16:42 +00:00
Sanjay Patel
41327794b7 use range-based for loops; NFCI
llvm-svn: 241592
2015-07-07 15:03:53 +00:00
Denis Protivensky
fc5fc4c36b Fix gcc warnings of different enum and non-enum types in ternaries
llvm-svn: 241567
2015-07-07 07:48:48 +00:00
Akira Hatanaka
548fcd7ec7 [ARM] Define a subtarget feature and use it to decide whether long calls should
be emitted.

This is needed to enable ARM long calls for LTO and enable and disable it on a
per-function basis.

Out-of-tree projects currently using EnableARMLongCalls to emit long calls
should start passing "+long-calls" to the feature string (see the changes made
to clang in r241565).

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D9364

llvm-svn: 241566
2015-07-07 06:54:42 +00:00
Simon Pilgrim
3c101973b3 [X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with zero'd lanes
The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.

As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.

Differential Revision: http://reviews.llvm.org/D10593

llvm-svn: 241516
2015-07-06 22:46:46 +00:00
Sanjay Patel
6781c4029c [x86] extend machine combiner reassociation optimization to SSE scalar adds
Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460)
to SSE scalar FP SP adds in addition to AVX scalar FP SP adds.

With the 'switch' in place, we can trivially add other opcodes and test cases in
future patches.

Differential Revision: http://reviews.llvm.org/D10975

llvm-svn: 241515
2015-07-06 22:35:29 +00:00
Simon Pilgrim
a825efbf95 [X86][SSE] Vectorized i64 uniform constant SRA shifts
This patch adds vectorization support for uniform constant i64 arithmetic shift right operators.

Differential Revision: http://reviews.llvm.org/D9645

llvm-svn: 241514
2015-07-06 22:35:19 +00:00
JF Bastien
55d9badb10 WebAssembly: add some TODO
Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D10971

llvm-svn: 241513
2015-07-06 21:41:59 +00:00
Simon Pilgrim
385bee8c59 [X86][SSE4A] Shuffle lowering using SSE4A EXTRQ/INSERTQ instructions
This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.

As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.

From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.

Differential Revision: http://reviews.llvm.org/D10146

llvm-svn: 241508
2015-07-06 20:46:41 +00:00
Simon Pilgrim
af8901c21c [X86][SSE] Use the general SMAX/SMIN/UMAX/UMIN opcodes and remove the X86 implementation
With the completion of D9746 there is now a common implementation of integer signed/unsigned min/max nodes, removing the need for the equivalent X86 specific implementations.

This patch removes the old X86ISD nodes, legalizes the relevant SSE2/SSE41/AVX2/AVX512 instructions for the ISD versions and converts the small amount of existing X86 code.

Differential Revision: http://reviews.llvm.org/D10947

llvm-svn: 241506
2015-07-06 20:30:47 +00:00
Alex Lorenz
6945ffc637 llc: Add a 'run-pass' option.
This commit adds a 'run-pass' option to llc, which instructs the compiler to run
one specific code generation pass only.

Llc already has the 'start-after' and the 'stop-after' options, and this new
option complements the other two by making it easier to write tests that want
to invoke a single pass only.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10776

llvm-svn: 241476
2015-07-06 17:44:26 +00:00
Matt Arsenault
591ddbb960 AMDGPU: Run SIInsertWaits as pre-emit pass
Running this after the scheduler enables scheduling
waits later so other ALU instructions can run while
this would be waiting.

When combined with enabling the post-RA scheduler, this
gives about a ~20% improvement on sgemm.

llvm-svn: 241473
2015-07-06 17:02:20 +00:00
Daniel Sanders
46f5420293 Change the last few internal StringRef triples into Triple objects.
Summary:
This concludes the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

At this point, the StringRef-form of GNU Triples should only be used in the
public API (including IR serialization) and a couple objects that directly
interact with the API (most notably the Module class). The next step is to
replace these Triple objects with the TargetTuple object that will represent
our authoratative/unambiguous internal equivalent to GNU Triples.

Reviewers: rengolin

Subscribers: llvm-commits, jholewinski, ted, rengolin

Differential Revision: http://reviews.llvm.org/D10962

llvm-svn: 241472
2015-07-06 16:56:07 +00:00
Daniel Sanders
cd6495240a Where Triple has a suitable predicate, use it rather than the enum values. NFC.
Reviewers: mcrosier

Subscribers: llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10960

llvm-svn: 241469
2015-07-06 16:33:18 +00:00
Matt Arsenault
36294af135 AMDGPU/SI: Add debugging subtarget feature for DS offsets
We don't have a good way to detect most situations where
DS offsets are usable on SI, so add an option to force using
them even if unsafe for debugging performance problems.

llvm-svn: 241462
2015-07-06 16:01:58 +00:00
James Y Knight
528aaa5079 [Sparc] Add more instruction aliases.
These are mostly from the chart in the SparcV8 spec, section "A.3
Synthetic Instructions".

Differential Revision: http://reviews.llvm.org/D9834

llvm-svn: 241461
2015-07-06 16:01:07 +00:00
James Y Knight
91a40832bd [Sparc] Add support for flush instruction.
Differential Revision: http://reviews.llvm.org/D9833

llvm-svn: 241460
2015-07-06 16:01:04 +00:00
Chad Rosier
46c71906bc Fix a bug in the A57FPLoadBalancing register tracking/scavenger.
The code in AArch64A57FPLoadBalancing::scavengeRegister() to handle dead defs
was not correctly handling aliased registers.  E.g. if the dead def was of D2,
then S2 was not being marked as unavailable, so it could potentially be used
across a live-range in which it would be clobbered.

Patch by Geoff Berry <gberry@codeaurora.org>!
Phabricator: http://reviews.llvm.org/D10900

llvm-svn: 241449
2015-07-06 14:46:34 +00:00
Asaf Badouh
a51b8d0d5b [X86][AVX512] Multiply Packed Unsigned Integers with Round and Scale
pmulhrsw

review:
http://reviews.llvm.org/D10948

llvm-svn: 241443
2015-07-06 14:03:40 +00:00
Peter Collingbourne
f49ef7d3ac IR: Do not consider available_externally linkage to be linker-weak.
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.

Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.

Differential Revision: http://reviews.llvm.org/D10941

llvm-svn: 241413
2015-07-05 20:52:35 +00:00
Benjamin Kramer
89b0e53c15 [TargetLowering] StringRefize asm constraint getters.
There is some functional change here because it changes target code from
atoi(3) to StringRef::getAsInteger which has error checking. For valid
constraints there should be no difference.

llvm-svn: 241411
2015-07-05 19:29:18 +00:00
Asaf Badouh
7e53a288e3 [x86][AVX512] add Multiply High Op
include encoding and intrinsics tests.

review
http://reviews.llvm.org/D10896

llvm-svn: 241406
2015-07-05 12:23:20 +00:00
Michael Kuperstein
41c7f42f7f [X86] Fix incorrect/inefficient pushw encodings for x86-64 targets
Correctly support assembling "pushw $imm8" on x86-64 targets. 
Also some cleanup of the PUSH instructions (PUSH64i16 and PUSHi16 actually
represent the same instruction)

This fixes PR23996

Patch by: david.l.kreitzer@intel.com
Differential Revision: http://reviews.llvm.org/D10878

llvm-svn: 241404
2015-07-05 10:25:41 +00:00
Nemanja Ivanovic
4dede06034 Add missing builtins to the PPC back end for ABI compliance (vol. 2)
This patch corresponds to review:
http://reviews.llvm.org/D10874

Back end portion of the second round of additions to altivec.h.

llvm-svn: 241398
2015-07-05 06:03:51 +00:00
Simon Pilgrim
ae34c3f1f4 [X86][SSE] Improved i8/i16 to f64 uint2fp vector conversions
Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp).

llvm-svn: 241394
2015-07-04 15:33:34 +00:00
Craig Topper
3faa86f8f9 [X86] Add proper 64-bit mode checks to jrcxz and jcxz.
llvm-svn: 241381
2015-07-04 00:01:07 +00:00
Matt Arsenault
30008082bc AMDGPU: Fix indentation of switch
llvm-svn: 241380
2015-07-03 23:33:38 +00:00
Rafael Espindola
06691d6e5a Return ErrorOr from getSymbolAddress.
It can fail trying to get the section on ELF and COFF. This makes sure the
error is handled.

llvm-svn: 241366
2015-07-03 18:19:00 +00:00
Rafael Espindola
7af299bc9a Replace a few more MachO only uses of getSymbolAddress.
llvm-svn: 241365
2015-07-03 18:02:36 +00:00
Simon Pilgrim
555783cecd [X86][SSE] Sign extension for target vector sizes less than 128 bits (pt2)
Add support for v2i8/v2i16 to v2f64 by using a sign extension to v2i32 before conversion to v2f64.

Differential Revision: http://reviews.llvm.org/D10589

llvm-svn: 241325
2015-07-03 08:01:36 +00:00
Simon Pilgrim
438ea7ca28 [X86][SSE] Sign extension for target vector sizes less than 128 bits (pt1)
This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector.

Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64.

llvm-svn: 241323
2015-07-03 07:51:01 +00:00
Dan Gohman
512ba1b916 [WebAssembly] Set the HasFloatingPointExceptions flag for WebAssembly.
llvm-svn: 241302
2015-07-02 21:36:25 +00:00
Rafael Espindola
165a342cde Return ErrorOr from SymbolRef::getName.
This function can really fail since the string table offset can be out of
bounds.

Using ErrorOr makes sure the error is checked.

Hopefully a lot of the boilerplate code in tools/* can go away once we have
a diagnostic manager in Object.

llvm-svn: 241297
2015-07-02 20:55:21 +00:00
Bill Schmidt
8da856ce76 [PPC64LE] Remove implicit-subreg restriction from VSX swap removal
In r241285, I removed the SUBREG_TO_REG restriction from VSX swap
removal, determining that this was overly conservative.  We have
another form of the same restriction in that we check for the presence
of implicit subregs in vector operations.  As with SUBREG_TO_REG for
partial register conversions, an implicit subreg is safe in and of
itself, provided no other operation makes a lane-sensitive assumption
about the result.  This patch removes that restriction, by removing
the HasImplicitSubreg flag and all code that relies on it.

I've added a test case that fails to optimize before this patch is
applied, and optimizes properly with the patch.  Test based on a
report from Anton Blanchard.

llvm-svn: 241290
2015-07-02 19:01:22 +00:00
Bill Schmidt
0c245b6f0f [PPC64LE] Teach swap optimization about the doubleword splat idiom
With a previous patch, the VSX swap optimization is able to recognize
the doubleword load-splat idiom that can be implemented using lxvdsx.
However, that does not cover a doubleword splat where the source is a
register.  We can implement this using xxspltd (a special form of
xxpermdi).  This patch teaches the swap optimization pass about this
idiom.

As a prerequisite, it also permits swap optimization to succeed for
all forms of SUBREG_TO_REG.  Previously we were conservative and only
allowed SUBREG_TO_REG when it copied a full register.  However, on
reflection any form of SUBREG_TO_REG is safe in and of itself, so long
as an unsafe operation is not performed on its result.  In particular,
a widening SUBREG_TO_REG often occurs as an input to a doubleword
splat idiom, particularly in auto-vectorized code.

The doubleword splat idiom is an XXPERMDI operation where both source
registers are identical, and the selection mask is either 0 (splat the
first element) or 3 (splat the second element).  To determine whether
the registers are identical, we use the existing mechanism for looking
through "copy-like" operations.  That mechanism has a side effect of
marking the XXPERMDI operation as using a physical register, which
would invalidate its presence in a swap-optimized region.  This is
correct for the form of XXPERMDI that performs a swap and hence would
be removed, but is not what we want for a doubleword-splat variety of
XXPERMDI.  Therefore we reset the physical-register flag on the
XXPERMDI when it represents a splat.

A simple test case is added to verify that we generate the splat and
that we also remove the xxswapd instructions that would otherwise be
associated with the load and store of another operand.

llvm-svn: 241285
2015-07-02 17:03:06 +00:00
Eric Christopher
5168f3454a Implement TargetTransformInfo::hasCompatibleFunctionAttributes for X86.
This checks subtarget feature compatibility for inlining by verifying
that the callee is a strict subset of the caller's features. This includes
the cpu as part of the subtarget we can get via the incoming functions as
the backend takes CPUs as feature sets.

This allows us to inline things like:

int foo() { return baz(); }

int __attribute__((target("sse4.2"))) bar() {
  return foo();
}

so that generic code can be inlined into specialized functions.

llvm-svn: 241221
2015-07-02 01:11:50 +00:00
JF Bastien
8ef54c36dd WebAssembly: start instructions
Summary:
* Add 64-bit address space feature.
* Rename SIMD feature to SIMD128.
* Handle single-thread model with an IR pass (same way ARM does).
* Rename generic processor to MVP, to follow design's lead.
* Add bleeding-edge processors, with all features included.
* Fix a few DEBUG_TYPE to match other backends.

Test Plan: ninja check

Reviewers: sunfish

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D10880

llvm-svn: 241211
2015-07-01 23:41:25 +00:00
Dan Gohman
3b5aac894a [WebAssembly] Define separate Target instances for 32-bit and 64-bit.
llvm-svn: 241193
2015-07-01 21:42:34 +00:00
Jingyue Wu
5f36b4cd05 [NVPTX] expand extload/truncstore for vectors of floats
Summary:
According to PTX ISA:

For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.

So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of.

As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like
ld.v2.f32 {%fd3, %fd4}, [%rd2]

This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated.

Patched by Gang Hu. 

Test Plan: Test case attached.

Reviewers: jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D10876

llvm-svn: 241191
2015-07-01 21:32:42 +00:00
Jingyue Wu
bf15de754a [NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass
Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before
that the correct offset of stack objects cannot be obtained, which
leads to wrong offset if there are more than 2 frame objects. This patch
move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index
is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check
VRFrame register instead, and try to remove the VRFrame if there
is no usage after NVPTXPeephole pass.

Patched by Xuetian Weng. 

Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the
offset calculation based on SP and SPL.

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10853

llvm-svn: 241185
2015-07-01 20:08:06 +00:00
Bill Schmidt
2cbe4a84c5 [PPC64LE] Enable missing lxvdsx optimization, and related swap optimization
When adding little-endian vector support for PowerPC last year, I
inadvertently disabled an optimization that recognizes a load-splat
idiom and generates the lxvdsx instruction.  This patch moves the
offending logic so lxvdsx is once again generated.

This pattern is frequently generated by the vectorizer for scalar
loads of an effective constant.  Previously the lxvdsx instruction was
wrongly listed as lane-sensitive for the VSX swap optimization (since
both doublewords are identical, swaps are safe).  This patch fixes
this as well, so that vectorized code using lxvdsx can now have swaps
removed from the computation.

There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll
that checks for the missing optimization.  However, vsx.ll was only
being tested for POWER7 with big-endian code generation.  I've added
a little-endian RUN statement and expected LE code generation for all
the tests in vsx.ll to give us a bit better VSX coverage, including
what's needed for this patch.

llvm-svn: 241183
2015-07-01 19:40:07 +00:00
Sanjay Patel
b294bc94ef fix formatting; NFC
llvm-svn: 241175
2015-07-01 17:58:53 +00:00
Sanjay Patel
30c5c88ab0 fix typos in comment; NFC
llvm-svn: 241174
2015-07-01 17:55:07 +00:00
Reid Kleckner
ff8213aeba [SEH] Don't assert if the parent function lacks a personality
The EH code might have been deleted as unreachable and the personality
pruned while the filter is still present.  Currently I'm hitting this at
-O0 due to the clang bug PR24009.

llvm-svn: 241170
2015-07-01 16:45:47 +00:00
Arnaud A. de Grandmaison
9f66584696 [AArch64] Implement add/adds/sub/subs/cmp/cmn with negative immediate aliases
This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn
with a negative immediate operand and convert them as shown:

  add  Rd, Rn, -imm -> sub  Rd, Rn, imm
  sub  Rd, Rn, -imm -> add  Rd, Rn, imm
  adds Rd, Rn, -imm -> subs Rd, Rn, imm
  subs Rd, Rn, -imm -> adds Rd, Rn, imm
  cmp  Rn, -imm     -> cmn  Rn, imm
  cmn  Rn, -imm     -> cmp  Rn, imm

Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers (gas). They are documented in the "ARMv8 Instruction Set
Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc
a programmer-friendly assembler !

This also fixes PR20978: "Assembly handling of adding negative numbers
not as smart as gas".

llvm-svn: 241166
2015-07-01 15:05:58 +00:00
James Y Knight
4771453f17 [Sparc] Rearrange SparcInstrInfo, no change.
Move some instructions into order of sections in the spec, as the rest
already were.

Differential Revision: http://reviews.llvm.org/D9102

llvm-svn: 241163
2015-07-01 14:38:07 +00:00
Igor Breger
cdff3524c0 AVX-512: Implemented missing encoding for FMA scalar instructions
Added tests for encoding

Differential Revision: http://reviews.llvm.org/D10865

llvm-svn: 241159
2015-07-01 13:24:28 +00:00
Michael Kuperstein
5ab1a3193f [X86] Avoid over-relaxation of 8-bit immediates in integer arithmetic instructions.
Only consider an instruction a candidate for relaxation if the last operand of the 
instruction is an expression. We previously checked whether any operand is an expression,
which is useless, since for all instructions concerned, the only operand that may be
affected by relaxation is the last one.
In addition, this removes the check for having RIP as an argument, since it was 
plain wrong - even when one of the arguments is RIP, relaxation may still be needed.

This fixes PR9807.

Patch by: david.l.kreitzer@intel.com
Differential Revision: http://reviews.llvm.org/D10766

llvm-svn: 241152
2015-07-01 10:54:42 +00:00
Zoran Jovanovic
0f96f10936 [mips][microMIPS] Implement SLL and NOP instructions
http://reviews.llvm.org/D10474

llvm-svn: 241150
2015-07-01 09:54:51 +00:00
Reid Kleckner
8011644f25 [SEH] Add new intrinsics for recovering and restoring parent frames
The incoming EBP value established by the runtime is actually a pointer
to the end of the EH registration object, and not the true parent
function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo
anymore because we know that the exception info pointer is at a fixed
offset from this incoming EBP.

The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the
EH runtime and returns a pointer that is usable with llvm.framerecover.

The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit
specific preparation pass in blocks targetted by the EH runtime. It
re-establishes any physical registers used by the parent function to
address the stack, such as the frame, base, and stack pointers.

Neither of these intrinsics correctly handle stack realignment prologues
yet, but it's possible to add that later.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D10848

llvm-svn: 241125
2015-06-30 22:46:59 +00:00
Jingyue Wu
4759e1d16f [NVPTX] cleanups and refacotring in NVPTXFrameLowering.cpp
Summary: NFC

Test Plan: no regression

Reviewers: wengxt

Reviewed By: wengxt

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10849

llvm-svn: 241118
2015-06-30 21:28:31 +00:00
Nemanja Ivanovic
6da9caa16d Modified a comment about the reason for the patch (removed commented code).
llvm-svn: 241110
2015-06-30 20:01:16 +00:00
Nemanja Ivanovic
7d2f845a9d Fixes a bug with __builtin_vsx_lxvdw4x on Little Endian systems
llvm-svn: 241108
2015-06-30 19:45:45 +00:00
Jingyue Wu
20fd92cbbc [NVPTX] Fix issue introduced in D10321
Summary:
Really check if %SP is not used in other places, instead of checking only exact
one non-dbg use.

Patched by Xuetian Weng. 

Test Plan:
@foo4 in test/CodeGen/NVPTX/local-stack-frame.ll, create a case that
SP will appear twice.

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits, sfantao, jholewinski

Differential Revision: http://reviews.llvm.org/D10844

llvm-svn: 241099
2015-06-30 18:59:19 +00:00
Samuel Antao
067b1d340e Force relocation mode to be default, regardless of what is passed to the backend.
llvm-svn: 241081
2015-06-30 17:18:00 +00:00
Michael Kuperstein
16ecf4031e [X86] Fix a bug in WIN_FTOL_32/64 handling.
Duplicating an FP register "as itself" is a bad idea, since it violates the
invariant that every FP register is mapped to at most one FPU stack slot.
Use the scratch FP register instead.

This fixes PR23957.

llvm-svn: 241069
2015-06-30 14:38:57 +00:00
Toma Tabacu
a6d617755d [mips] [IAS] Add support for the .module softfloat/hardfloat directives.
These directives are used to set the default value of the SoftFloat feature.
They have the same effect as setting -m{soft, hard}-float from the command line.

Differential Revision: http://reviews.llvm.org/D9073

llvm-svn: 241066
2015-06-30 13:46:03 +00:00
Toma Tabacu
8fcfd10905 [mips] [IAS] Make .module directives change AssemblerOptions->front().
Differential Revision: http://reviews.llvm.org/D10643

llvm-svn: 241062
2015-06-30 12:41:33 +00:00
Ranjeet Singh
2f642c039e Reverting r241058 because it's causing buildbot failures.
llvm-svn: 241061
2015-06-30 12:32:53 +00:00
Ranjeet Singh
9a787f3fa4 There are a few places where subtarget features are still
represented by uint64_t, this patch replaces these
usages with the FeatureBitset (std::bitset) type.

Differential Revision: http://reviews.llvm.org/D10542

llvm-svn: 241058
2015-06-30 11:30:42 +00:00
Toma Tabacu
8f128b1fb2 [mips] [IAS] Add support for the .set oddspreg/nooddspreg directives.
Differential Revision: http://reviews.llvm.org/D10657

llvm-svn: 241052
2015-06-30 09:36:50 +00:00
Michael Kuperstein
7b4b71c924 [X86] Add FXSR intrinsics
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)

llvm-svn: 241049
2015-06-30 08:49:35 +00:00
Rafael Espindola
6f9850f8a9 Don't return error_code from a function that doesn't fail.
llvm-svn: 241033
2015-06-30 01:53:01 +00:00
Rafael Espindola
75e27b270d Cleanup getRelocationAddend.
Realistically, this will be returning ErrorOr for some time as refactoring the
user code to check once per section will take some time.

Given that, use it for checking if a relocation has addend or not.

While at it, add ELFRelocationRef to simplify the users.

llvm-svn: 241028
2015-06-30 00:33:59 +00:00
Dan Gohman
e04339a4ce [WebAssembly] Initial WebAssembly backend
This WebAssembly backend is just a skeleton at this time and is not yet
functional.

llvm-svn: 241022
2015-06-29 23:51:55 +00:00
Peter Collingbourne
d3c303721f Teach LTOModule to emit linker flags for dllexported symbols, plus interface cleanup.
This change unifies how LTOModule and the backend obtain linker flags
for globals: via a new TargetLoweringObjectFile member function named
emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns
the list of linker flags as a single concatenated string.

This change affects the C libLTO API: the function lto_module_get_*deplibs now
exposes an empty list, and lto_module_get_*linkeropts exposes a single element
which combines the contents of all observed flags. libLTO should never have
tried to parse the linker flags; it is the linker's job to do so. Because
linkers will need to be able to parse flags in regular object files, it
makes little sense for libLTO to have a redundant mechanism for doing so.

The new API is compatible with the old one. It is valid for a user to specify
multiple linker flags in a single pragma directive like this:

 #pragma comment(linker, "/defaultlib:foo /defaultlib:bar")

The previous implementation would not have exposed
either flag via lto_module_get_*deplibs (as the test in
TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive)
and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via
lto_module_get_*linkeropts. This may have been a bug in the implementation,
but it does give us a chance to fix the interface.

Differential Revision: http://reviews.llvm.org/D10548

llvm-svn: 241010
2015-06-29 22:04:09 +00:00
Tim Northover
f460862bbe ARM: add correct kill flags when combining stm instructions
When the store sequence being combined actually stores the base register, we
should not mark it as killed until the end.

rdar://21504262

llvm-svn: 241003
2015-06-29 21:42:16 +00:00
Matthias Braun
c4e357521f X86: Rework inline asm integer register specification.
This is a new version of http://reviews.llvm.org/D10260.

It turned out that when you specify an integer register in inline asm on
x86 you get the register of the required type size back. That means that
X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of
the integer registers and adapt its size to the given target size which
may be any 8/16/32/64 bit sized type. Surprisingly that means given a
constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX.

This change makes this face explicit, the previous code seemed like
working by accident because there it never returned an error once a
register was found. On the other hand this rewrite allows to actually
return errors for invalid situations like requesting an integer register
for an i128 type.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10813

llvm-svn: 241002
2015-06-29 21:35:51 +00:00
Elena Demikhovsky
12bde41e5a AVX-512: all forms of SCATTER instruction on SKX,
encoding, intrinsics and tests.

llvm-svn: 240936
2015-06-29 12:14:24 +00:00
Javed Absar
013d6e555c [ARM]: Extend -mfpu options for half-precision and vfpv3xd
Some of the the permissible ARM -mfpu options, which are supported in GCC,
are currently not present in llvm/clang.This patch adds the options:
'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16.
These are related to half-precision floating-point and single precision.

Reviewers: rengolin, ranjeet.singh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10645

llvm-svn: 240930
2015-06-29 09:32:29 +00:00
Igor Breger
7ca2ee2eb1 AVX-512: Implemented missing encoding and intrinsics for FMA instructions
Added tests for DAG lowering ,encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D10796

llvm-svn: 240926
2015-06-29 09:10:00 +00:00
NAKAMURA Takumi
ebf43b6005 Whitespace.
llvm-svn: 240924
2015-06-29 04:50:09 +00:00
Matt Arsenault
0bf3df2409 AMDGPU/SI: Fix extra space when printing v_div_fmas_*
llvm-svn: 240911
2015-06-28 18:16:14 +00:00
Asaf Badouh
732e3b5425 [x86][AVX512]
Add vscalef support
include encoding and intrinsics


review:
http://reviews.llvm.org/D10730

llvm-svn: 240906
2015-06-28 14:30:39 +00:00
Elena Demikhovsky
02169f53d0 AVX-512: Added all SKX forms of GATHER instructions.
Added intrinsics.
Added encoding and tests.

llvm-svn: 240905
2015-06-28 10:53:29 +00:00
Daniel Sanders
7fcedd84f2 [mips] Add COP0 register class and use it in M[FT]C0/DM[FT]C0.
Summary:
Previously it (incorrectly) used GPR's.

Patch by Simon Dardis. A couple small corrections by myself.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10567

llvm-svn: 240883
2015-06-27 15:39:19 +00:00
Jingyue Wu
e89c324de3 [NVPTX] noop when kernel pointers are already global
Summary:
Some front ends make kernel pointers global already. In that case,
handlePointerParams does nothing.

Test Plan: more tests in lower-kernel-ptr-arg.ll

Reviewers: grosser

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10779

llvm-svn: 240849
2015-06-26 22:35:43 +00:00
Tom Stellard
a50ac5923b AMDPGU/SI: Use correct resource descriptors for VI on HSA
Summary: We need to set MTYPE = 2 for VI shaders when targeting the HSA runtime.

Reviewers: arsenm

Differential Revision: http://reviews.llvm.org/D10777

llvm-svn: 240841
2015-06-26 21:58:42 +00:00
Tom Stellard
ff6108f813 AMDGPU/SI: Update amd_kernel_code_t definition and add assembler support
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10772

llvm-svn: 240839
2015-06-26 21:58:31 +00:00
Tom Stellard
cab8e5d764 AMDGPU/SI: Remove unused variable
This should fix some bots that were broken by r240831.

llvm-svn: 240838
2015-06-26 21:58:26 +00:00
Tom Stellard
ac2f277b1d AMDGPU/SI: Set ELF OS/ABI to ELFOSABI_AMDGPU_HSA
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10708

llvm-svn: 240832
2015-06-26 21:15:11 +00:00
Tom Stellard
daced4c4cc AMDGPU/SI: Add hsa code object directives
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10757

llvm-svn: 240831
2015-06-26 21:15:07 +00:00
Tom Stellard
a87fcfe4a1 AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10706

llvm-svn: 240830
2015-06-26 21:15:03 +00:00
Tom Stellard
9f47cd8f77 AMDGPU/SI: Emit amd_kernel_code_t in EmitFunctionBodyStart()
Summary:
This way the function symbol points to the start of amd_kernel_code_t
rather than the start of the function.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10705

llvm-svn: 240829
2015-06-26 21:14:58 +00:00
Marek Olsak
6aff2cf35d AMDGPU: really don't commute REV opcodes if the target variant doesn't exist
If pseudoToMCOpcode failed, we would return the original opcode, so operands
would be swapped, but the instruction would remain the same.
It resulted in LSHLREV a, b ---> LSHLREV b, a.

This fixes Glamor text rendering and
piglit/arb_sample_shading-builtin-gl-sample-mask on VI.

This is a candidate for stable branches.

v2: the test was simplified by Tom Stellard
llvm-svn: 240824
2015-06-26 20:29:10 +00:00
Nemanja Ivanovic
634851fffb Add missing builtins to the PPC back end for ABI compliance (vol. 1)
This patch corresponds to review:
http://reviews.llvm.org/D10638

This is the back end portion of patch
http://reviews.llvm.org/D10637
It just adds the code gen and intrinsic functions necessary to support that patch to the back end.

llvm-svn: 240820
2015-06-26 19:26:53 +00:00
David Majnemer
4b1a02ac46 Revert "Revert r240762 "[X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant""
This reverts commit r240793 while fixing how we handle array constant
pool entries.

This fixes PR23966.

llvm-svn: 240811
2015-06-26 18:55:48 +00:00
Pete Cooper
f9eedfbd62 Add op_values() to iterate over the SDValue operands of an SDNode.
SDNode already had ops() which would iterate over the operands and return
SDUse*.  This version instead gets the SDValue's out of the SDUse's so that
we can use foreach in more places.

Reviewed by David Blaikie.

llvm-svn: 240805
2015-06-26 18:17:36 +00:00
Javed Absar
7e032f3626 [ARM] Cortex-R5 is not VFPOnlySP
This patch fixes the error in ARM.td which stated that Cortex-R5
floating point unit can do only single precision, when it can do double as well.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10769

llvm-svn: 240799
2015-06-26 17:42:37 +00:00
Douglas Katzman
149da381f0 [X86]: Correctly sign-extend 16-bit immediate in CALL instruction.
Patch by Matthew Barney. Thanks!

Differential Revision: http://reviews.llvm.org/D9514

llvm-svn: 240795
2015-06-26 16:58:59 +00:00
Hans Wennborg
583756c4cb Revert r240762 "[X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant"
It seems to have caused PR23966: "UNREACHABLE executed at ..\lib\Target\X86\X86TargetObjectFile.cpp:148"

llvm-svn: 240793
2015-06-26 16:48:02 +00:00
Rafael Espindola
db0cb3137a Rename getObjectFile to getObject for consistency.
llvm-svn: 240785
2015-06-26 14:51:16 +00:00
Toma Tabacu
c9c0571983 [mips] [IAS] Add partial support for the ULW pseudo-instruction.
Summary:
This only adds support for ULW of an immediate address with/without a source register.
It does not include support for ULW of the address of a symbol.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9663

llvm-svn: 240782
2015-06-26 13:20:17 +00:00
Javed Absar
d4144d8279 [ARM] Cortex-R4F is not VFPOnlySP
Cortex-R4F TRM states that fpu supports both single and double precision.
This patch corrects the information in ARM.td file and corresponding test.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10763

llvm-svn: 240776
2015-06-26 12:14:56 +00:00
Rafael Espindola
9b4c1d87c6 Optimize the creation of mapping symbols.
No need to create two symbols just to assign one to the other.

llvm-svn: 240773
2015-06-26 11:31:13 +00:00
David Majnemer
f499207c44 [X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant
No functionality changed, just keeping things clean.

llvm-svn: 240762
2015-06-26 07:03:12 +00:00
Hao Liu
fc6114fe0f [ARM] Lower interleaved memory accesses to vldN/vstN intrinsics.
This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240755
2015-06-26 02:45:36 +00:00
Hao Liu
1dc438bc4e [AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses.
E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240754
2015-06-26 02:32:07 +00:00
Matthias Braun
917f1e4a7f Revert "X86: Reject register operands with obvious type mismatches."
Revert until http://llvm.org/PR23955 is investigated.

This reverts commit r239309.

llvm-svn: 240746
2015-06-26 00:26:49 +00:00
NAKAMURA Takumi
a5b2580063 PPCISelLowering.cpp: Appease PR23956. [-Wdocumentation]
llvm-svn: 240727
2015-06-25 23:38:44 +00:00
Rafael Espindola
9195b6922c Add an ELFSymbolRef type.
This allows user code to say Sym.getSize() instead of having to manually fetch
the object.

llvm-svn: 240708
2015-06-25 22:10:04 +00:00
Pete Cooper
c2ffa0891f Use foreach loop over constant operands. NFC.
A number of places had explicit loops over Constant::operands().
Just use foreach loops where possible.

llvm-svn: 240694
2015-06-25 20:51:38 +00:00
Kit Barton
230223268c [PPC] Implement vmrgew and vmrgow instructions
This patch adds support for the vector merge even word and vector merge odd word
instructions introduced in POWER8.

Phabricator review: http://reviews.llvm.org/D10704

llvm-svn: 240650
2015-06-25 15:17:40 +00:00
Benjamin Kramer
6586644fa4 [PPC] Replace debug value skipping with getLastNonDebugInstr.
No functionality change intended.

llvm-svn: 240641
2015-06-25 13:39:03 +00:00
Benjamin Kramer
4ed07455af Replace copy-pasted debug value skipping with MBB::getLastNonDebugInstr
No functional change intended.

llvm-svn: 240639
2015-06-25 13:28:24 +00:00
Toma Tabacu
f1219360be [mips] [IAS] Refactor the emitDirectiveModuleFP() functions. NFC.
Summary:
Simplify emitDirectiveModuleFP() by having it just print the current information
from MipsABIFlagsSection and doing an updateABIInfo() before such calls.

This prevents us from forgetting to update the STI.FeatureBits,
because updateABIInfo() uses those to update the MipsABIFlagsSection object,
and also makes sure we use the update mechanism from MipsABIFlagsSection.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, mpf

Differential Revision: http://reviews.llvm.org/D10642

llvm-svn: 240637
2015-06-25 12:44:38 +00:00
Ulrich Weigand
580894b7f9 [SystemZ] Only attempt RxSBG optimization for integer types
As pointed out by Justin Bogner (see r240520), SystemZDAGToDAGISel::Select
currently attempts to convert boolean operations into RxSBG even on some
non-integer types (in particular, vector types).  This would not work in
any case, and it happened to trigger undefined behaviour in allOnes.

This patch verifies that we have a (<= 64-bit) integer type before
attempting to perform this optimization.

llvm-svn: 240634
2015-06-25 11:52:36 +00:00
Toma Tabacu
b5146d10c2 [mips] [IAS] Refactor the emitDirectiveModuleOddSPReg() functions. NFC.
Summary:
We can simplify emitDirectiveModuleOddSPReg() by having it print the current OddSPReg information
from MipsABIFlagsSection and doing an updateABIInfo() before such calls.

This prevents us from forgetting to update the STI.FeatureBits, because updateABIInfo() uses those to update the MipsABIFlagsSection object,
and also makes sure we use the update mechanism from MipsABIFlagsSection.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits, mpf

Differential Revision: http://reviews.llvm.org/D10641

llvm-svn: 240630
2015-06-25 10:56:57 +00:00
Toma Tabacu
5af17cfb8c [mips] [IAS] Fix parsing of memory offset expressions with parenthesis depth >1.
Summary:
In an expression such as "(((a+b)+c)+d)", parseParenExpression() would only parse the "a+b)+c", which would result in an error later on in the parser.
This means that we can only parse one level of inner parentheses.

In order to fix this, I added a new function called parseParenExprOfDepth(), which parses a specified number of trailing parenthesis expressions
(except for the outermost parenthesis), and changed MipsAsmParser to use it in parseMemOffset instead of parseParenExpression().

Reviewers: dsanders, rafael

Reviewed By: dsanders, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9742

llvm-svn: 240625
2015-06-25 09:52:02 +00:00
Ahmed Bougacha
9f949b3f29 [X86] Accept hasAVX512() as well as hasFMA() when generating FMA.
We don't always have FMA, for example when using 'clang -mavx512f'
without an explicit CPU.

Also check for an explicit +avx512f instead of CPUs in a couple
related tests.

llvm-svn: 240616
2015-06-25 00:44:46 +00:00
Swaroop Sridhar
c6d536f7a9 Enable StackMap Serialization for COFF
Summary

This change turns on the emission of 
__LLVM_Stackmaps section when generating COFF binaries.

Test Plan

Added a scenario to the test case: 
test\CodeGen\X86\statepoint-stackmap-format.ll.

Code Review:

http://reviews.llvm.org/D10680

llvm-svn: 240613
2015-06-25 00:28:42 +00:00
Douglas Katzman
73a319e633 [X86] Simplify some stuff in X86DisassemblerDecoder. NFC
- Deciding that insn->sibIndex is SIB_INDEX_NONE does not require another
check beyond the fully decoded bits being equal to 0x4.
The expression insn->sibIndex == SIB_INDEX_sib could not have been true unless
index were 0x4, because SIB_INDEX_sib is merely the range base (SIB_INDEX_EAX)
plus 4. Respectively SIB_INDEX_sib64.

- Don't use a switch statement to perform left-shift.

Differential Revision: http://reviews.llvm.org/D9762

llvm-svn: 240598
2015-06-24 22:04:55 +00:00
Jingyue Wu
7ff7eb51d2 Add NVPTXPeephole pass to reduce unnecessary address cast
Summary:
This patch first change the register that holds local address for stack
frame to %SPL. Then the new NVPTXPeephole pass will try to scan the
following pattern

   %vreg0<def> = LEA_ADDRi64 <fi#0>, 4
   %vreg1<def> = cvta_to_local %vreg0

and transform it into

   %vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4

Patched by Xuetian Weng

Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: eliben, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10549

llvm-svn: 240587
2015-06-24 20:20:16 +00:00
Matthias Braun
a43ec6b1f8 ARMLoadStoreOptimizer: Fix errata 602117 handling and make testcase actually test for it
This fixes PR23912

Differential Revision: http://reviews.llvm.org/D10620

llvm-svn: 240582
2015-06-24 20:03:27 +00:00
Zoran Jovanovic
729cee7955 [mips][microMIPS] Implement BREAK, EHB and EI instructions
http://reviews.llvm.org/D10090

llvm-svn: 240531
2015-06-24 10:32:16 +00:00
Rafael Espindola
a70d8a336d Change how symbol sizes are handled in lib/Object.
COFF and MachO only define symbol sizes for common symbols. Reflect that
in the class hierarchy by having a method for common symbols only in the base
and a general one in ELF.

This avoids the need of using a magic value for the size, which had a few
problems
* Most callers didn't check for it.
* The ones that did could not tell the magic value from a file actually having
  that value.

llvm-svn: 240529
2015-06-24 10:20:30 +00:00
Justin Bogner
dcde6ede19 Hexagon: Paper over the undefined behaviour introduced by r238692
This stops shifting a 32-bit value by such absurd amounts as 96 and
120. We do this by dropping a call to the function that was doing this
entirely, which rather surprisingly doesn't break *any* tests.

I've also added an assert in the misbehaving function to prove that
it's no longer being called with completely invalid arguments.

This change looks pretty bogus and we should probably be reverting
r238692 instead, but this is hard to do with the number of follow ups
that have happened since. It can't be any worse than the undefined
behaviour that was happening before though.

llvm-svn: 240526
2015-06-24 07:03:07 +00:00
Justin Bogner
a1d0e24d46 Hexagon: Avoid left shifting negative values (it's UB)
Found by ubsan.

llvm-svn: 240521
2015-06-24 06:00:53 +00:00
Justin Bogner
87afb11c9d SystemZ: Rephrase this allOnes calculation to avoid UB
This allOnes function hits undefined behaviour if Count is greater
than 64, but we can avoid that and simplify the calculation by just
saturating if such a value is passed in.

This comes up under ubsan becauseRxSBGOperands is sometimes created
with values that are 128 bits wide. Somebody more familiar with this
code should probably look into whether that's expected, as a 64 bit
mask may or may not be appropriate for such types.

llvm-svn: 240520
2015-06-24 05:59:19 +00:00
Ahmed Bougacha
2070941892 [X86] Don't generate vbroadcasti128 for v4i64 splats from memory.
We used to erroneously match:
    (v4i64 shuffle (v2i64 load), <0,0,0,0>)

Whereas vbroadcasti128 is more like:
    (v4i64 shuffle (v2i64 load), <0,1,0,1>)

This problem doesn't exist for vbroadcastf128, which kept matching
the intrinsic after r231182.  We should perhaps re-introduce the
intrinsic here as well, but that's a separate issue still being
discussed.

While there, add some proper vbroadcastf128 tests.  We don't currently
match those, like for loading vbroadcastsd/ss on AVX (the reg-reg
broadcasts where added in AVX2).

Fixes PR23886.

llvm-svn: 240488
2015-06-24 00:07:16 +00:00
John Brawn
b5072dd408 [ARM] ARMLoadStoreOpt::UpdateBaseRegUses should stop on def
When UpdateBaseRegUses sees an instruction that defines the base
register it must stop, as the base register value it is updating is no
longer live. Ideally we would already have seen the register be killed
(which is already checked for), but the kill flags may be inaccurate
and we have to account for this.

Differential Revision: http://reviews.llvm.org/D10566

llvm-svn: 240424
2015-06-23 16:02:11 +00:00
Justin Bogner
df8438db3e SystemZ: Avoid left shifting negative values (it's UB)
Found by ubsan.

llvm-svn: 240420
2015-06-23 15:38:24 +00:00
Benjamin Kramer
489ac71c59 Make helper functions static. NFC.
llvm-svn: 240416
2015-06-23 14:51:40 +00:00
Toma Tabacu
b7e4340827 [mips] [IAS] Add partial support for the ULHU pseudo-instruction.
Summary:
This only adds support for ULHU of an immediate address with/without a source register.
It does not include support for ULHU of the address of a symbol.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9671

llvm-svn: 240410
2015-06-23 14:39:42 +00:00
Toma Tabacu
9185107577 [mips] [IAS] Add support for generating DADDu to createAddu(). NFC.
Summary: This isn't used right now, but it will be in some upcoming changes.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10568

llvm-svn: 240407
2015-06-23 14:00:54 +00:00
Rafael Espindola
72dc307fa0 Simplify the Mangler interface now that DataLayout is mandatory.
We only need to pass in a DataLayout when mangling a raw string, not when
constructing the mangler.

llvm-svn: 240405
2015-06-23 13:59:29 +00:00
Petar Jovanovic
479995f6f0 [mips64] Emit correct addend for some PC-relative relocations
So far, LLVM has not emitted correct addend for N64 and N32 ABI. This patch
fixes that. It also removes fixup from MCJIT for R_MIPS_PC16 relocation.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D10565

llvm-svn: 240404
2015-06-23 13:54:42 +00:00
Krzysztof Parzyszek
66d1f57704 [Hexagon] Use MF reference from parent class in HexagonPacketizerList
llvm-svn: 240403
2015-06-23 13:50:23 +00:00