1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

105364 Commits

Author SHA1 Message Date
Daniel Sanders
aa92e911ca [mips] Add support for -modd-spreg/-mno-odd-spreg
Summary:
When -mno-odd-spreg is in effect, 32-bit floating point values are not
permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit
floating point comparison results from being written to odd registers.

This option has three purposes:
* It allows support for certain MIPS implementations such as loongson-3a that
  do not allow the use of odd registers for single precision arithmetic.
* When using -mfpxx, -mno-odd-spreg is the default and this allows us to
  statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1
  instructions to/from odd registers are guaranteed not to appear for any
  reason. Once this has been established, the user can then re-enable
  -modd-spreg to regain the use of all 32 single-precision registers.
* When using -mfp64 and -mno-odd-spreg together, an O32 extension named
  O32 FP64A is used as the ABI. This is intended to provide almost all
  functionality of an FR=1 processor but can also be executed on a FR=0 core
  with the assistance of a hardware compatibility mode which emulates FR=0
  behaviour on an FR=1 processor.

* Added '.module oddspreg' and '.module nooddspreg' each of which update
  the .MIPS.abiflags section appropriately
* Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller
  doesn't have to remember to do it.
* MipsABIFlags now calculates the flags1 and flags2 member on demand rather
  than trying to maintain them in the same format they will be emitted in.

There is one portion of the -mfp64 and -mno-odd-spreg combination that is not
implemented yet. Moves to/from odd-numbered double-precision registers must not
use mtc1. I will fix this in a follow-up.

Differential Revision: http://reviews.llvm.org/D4383

llvm-svn: 212717
2014-07-10 13:38:23 +00:00
Zinovy Nis
34ef30df00 [x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin).
This is minimal change for backend required to have "hello world" compiled and working on x32 target (x86_64-linux-gnux32). More patches for x32 will follow.

Differential Revision: http://reviews.llvm.org/D4181

llvm-svn: 212716
2014-07-10 13:03:26 +00:00
Chandler Carruth
b55565f54f [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous
to the zero-extend-vector-inreg node introduced previously for the same
purpose: manage the type legalization of widened extend operations,
especially to support the experimental widening mode for x86.

I'm adding both because sign-extend is expanded in terms of any-extend
with shifts to propagate the sign bit. This removes the last
fundamental scalarization from vec_cast2.ll (a test case that hit many
really bad edge cases for widening legalization), although the trunc
tests in that file still appear scalarized because the the shuffle
legalization is scalarizing. Funny thing, I've been working on that.

Some initial experiments with this and SSE2 scenarios is showing
moderately good behavior already for sign extension. Still some work to
do on the shuffle combining on X86 before we're generating optimal
sequences, but avoiding scalarization is a huge step forward.

llvm-svn: 212714
2014-07-10 12:32:32 +00:00
Richard Sandiford
2bd4df26d9 [SystemZ] Use SystemZCallingConv.td to define callee-saved registers
Just a clean-up.  No behavioral change intended.

llvm-svn: 212711
2014-07-10 11:44:37 +00:00
NAKAMURA Takumi
7b68d9e309 SpecialCaseList.h: Fix -Wdocumentation with \code.
llvm-svn: 212710
2014-07-10 11:39:59 +00:00
NAKAMURA Takumi
07e99292d7 llvm/test/CodeGen/X86/shift-parts.ll: FileCheck-ize. (from r212640)
llvm-svn: 212709
2014-07-10 11:37:39 +00:00
NAKAMURA Takumi
ea850a0edc Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."
This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations.

llvm-svn: 212708
2014-07-10 11:37:28 +00:00
Richard Sandiford
9bea71208f [SystemZ] Tweak instruction format classifications
There's no real need to have Shift as a separate format type from Binary.
The comments for other format types were too specific and in some cases
no longer accurate.

Just a clean-up, no behavioral change intended.

llvm-svn: 212707
2014-07-10 11:29:23 +00:00
Chandler Carruth
b0b7a4016c [x86] Add another combine that is particularly useful for the new vector
shuffle lowering: match shuffle patterns equivalent to an unpcklwd or
unpckhwd instruction.

This allows us to use generic lowering code for v8i16 shuffles and match
the unpack pattern late.

llvm-svn: 212705
2014-07-10 11:09:29 +00:00
Richard Sandiford
e97885896b [SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA
These instructions aren't used for codegen since the original L*DB instructions
are suitable for fround.

llvm-svn: 212703
2014-07-10 11:00:55 +00:00
Richard Sandiford
6bdbd7dbea [SystemZ] Avoid using i8 constants for immediate fields
Immediate fields that have no natural MVT type tended to use i8 if the
field was small enough.  This was a bit confusing since i8 isn't a legal
type for the target.  Fields for short immediates in a 32-bit or 64-bit
operation use i32 or i64 instead, so it would be better to do the same
for all fields.

No behavioral change intended.

llvm-svn: 212702
2014-07-10 10:52:51 +00:00
Richard Sandiford
c340b63db7 [SystemZ] Fix FPR dwarf numbering
The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6,
F1, F3, F5, F7, F8, etc., which matches the pairing of registers for
long doubles.  E.g. a long double stored in F0 is paired with F2.

llvm-svn: 212701
2014-07-10 10:45:11 +00:00
Daniel Sanders
a59d10a761 Make it possible for ints/floats to return different values from getBooleanContents()
Summary:
On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer
comparisons return 0 or 1.

Updated the various uses of getBooleanContents. Two simplifications had to be
disabled when float and int boolean contents differ:
- ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially
  discoverable (i.e. when the condition of the VSELECT is a SETCC node).
- visitVSELECT (select C, 0, 1) -> (xor C, 1).
  Come to think of it, this one could test for the common case of 'C'
  being a SETCC too.

Preserved existing behaviour for all other targets and updated the affected
MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low'
variable was counting in the wrong direction because it thought it could simply
add the result of the comparison.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D4389

llvm-svn: 212697
2014-07-10 10:18:12 +00:00
Chandler Carruth
ee588e17c9 [x86] Expand the target DAG combining for PSHUFD nodes to be able to
combine into half-shuffles through unpack instructions that expand the
half to a whole vector without messing with the dword lanes.

This fixes some redundant instructions in splat-like lowerings for
v16i8, which are now getting to be *really* nice.

llvm-svn: 212695
2014-07-10 09:57:36 +00:00
Chandler Carruth
7c6ee1b539 [x86] Tweak the v16i8 single input special case lowering for shuffles
that splat i8s into i16s.

Previously, we would try much too hard to arrange a sequence of i8s in
one half of the input such that we could unpack them into i16s and
shuffle those into place. This isn't always going to be a cheaper i8
shuffle than our other strategies. The case where it is always going to
be cheaper is when we can arrange all the necessary inputs into one half
using just i16 shuffles. It happens that viewing the problem this way
also makes it much easier to produce an efficient set of shuffles to
move the inputs into one half and then unpack them.

With this, our splat code gets one step closer to being not terrible
with the new experimental lowering strategy. It also exposes two
combines missing which I will add next.

llvm-svn: 212692
2014-07-10 09:16:40 +00:00
Hal Finkel
fd0d2f71d3 A test case for not asserting in isDereferenceablePointer upon unsized types
This is the test case for r212687.

llvm-svn: 212688
2014-07-10 07:04:37 +00:00
Hal Finkel
1e7ffc2c10 Fix isDereferenceablePointer not to try to take the size of an unsized type.
I'll add a test-case shortly.

llvm-svn: 212687
2014-07-10 06:06:11 +00:00
Hal Finkel
54ccd892b9 Allow isDereferenceablePointer to look through some bitcasts
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.

In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer though
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

llvm-svn: 212686
2014-07-10 05:27:53 +00:00
Saleem Abdulrasool
e53e7bb4fc MC: modernise for loop
Convert a for loop to range bsaed form.  NFC.

llvm-svn: 212684
2014-07-10 04:50:09 +00:00
Saleem Abdulrasool
506e2b8456 MC: add and use an accessor for WinCFI
This adds a utility method to access the WinCFI information in bulk and uses
that to iterate rather than requesting the count and individually iterating
them.  This is in preparation for restructuring WinCFI handling to enable more
clear sharing across architectures to enable unwind information emission for
Windows on ARM.

llvm-svn: 212683
2014-07-10 04:50:06 +00:00
Peter Collingbourne
2bc51dcbef Remove move assignment operator to appease older GCCs.
llvm-svn: 212682
2014-07-10 04:39:40 +00:00
Chandler Carruth
7132367354 [x86] Initial improvements to the new shuffle lowering for v16i8
shuffles specifically for cases where a small subset of the elements in
the input vector are actually used.

This is specifically targetted at improving the shuffles generated for
trunc operations, but also helps out splat-like operations.

There is still some really low-hanging fruit here that I want to address
but this is a huge step in the right direction.

llvm-svn: 212680
2014-07-10 04:34:06 +00:00
Peter Collingbourne
0ccbe0740d Explicitly define move constructor and move assignment operator to appease MSVC.
llvm-svn: 212679
2014-07-10 04:29:06 +00:00
Peter Collingbourne
e5f0c51843 SpecialCaseList: use std::unique_ptr.
llvm-svn: 212678
2014-07-10 03:55:02 +00:00
Hao Liu
cb23c5a35e [AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector.
llvm-svn: 212677
2014-07-10 03:41:50 +00:00
Matt Arsenault
834bcebaa6 R600/SI: Add support for llvm.convert.{to|from}.fp16
llvm-svn: 212676
2014-07-10 03:22:20 +00:00
Matt Arsenault
f97738a858 Fix types in documentation.
The examples were using f32, but the IR type is called float

llvm-svn: 212675
2014-07-10 03:22:16 +00:00
Chandler Carruth
17aad54271 [x86] Refactor some of the new code for lowering v16i8 shuffles to
remove duplication and make it easier to select different strategies.

No functionality changed.

llvm-svn: 212674
2014-07-10 02:24:26 +00:00
Peter Collingbourne
8a30b69a11 [dfsan] Handle bitcast aliases.
llvm-svn: 212668
2014-07-10 01:30:39 +00:00
Chandler Carruth
86e8e81be5 [SDAG] Make the new zext-vector-inreg node default to expand so targets
don't need to set it manually.

This is based on feedback from Tom who pointed out that if every target
needs to handle this we need to reach out to those maintainers. In fact,
it doesn't make sense to duplicate everything when anything other than
expand seems unlikely at this stage.

llvm-svn: 212661
2014-07-09 22:53:04 +00:00
David Blaikie
4464e33224 Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson
reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped
provide a reduced reproduction, though the failure wasn't too hard to
guess, and even easier with the example to confirm.

The assertion that the subprogram metadata associated with an
llvm::Function matches the scope data referenced by the DbgLocs on the
instructions in that function is not valid under LTO. In LTO, a C++
inline function might exist in multiple CUs and the subprogram metadata
nodes will refer to the same llvm::Function. In this case, depending on
the order of the CUs, the first intance of the subprogram metadata may
not be the one referenced by the instructions in that function and the
assertion will fail.

A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the
assertion removed and a comment added to explain this situation.

Original commit message:

If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.

While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.

Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.

Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).

llvm-svn: 212649
2014-07-09 21:02:41 +00:00
Alexey Samsonov
bea2f2e5d8 Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
  * DFSanABIList in DFSan instrumentation pass.
  * SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).

Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.

No functionality change.

llvm-svn: 212643
2014-07-09 19:40:08 +00:00
Tim Northover
ef600d3f40 Use simpler constructor for range adapter.
It is a good idea, it's slightly clearer and simpler. Unfortunately
the headline news is: we save one line!

llvm-svn: 212641
2014-07-09 19:14:34 +00:00
Matt Arsenault
479a1f90e1 Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.
Do this if the truncate is free and the select is legal.

llvm-svn: 212640
2014-07-09 19:12:07 +00:00
Jim Grosbach
10098b8e10 AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.

rdar://17594379

llvm-svn: 212638
2014-07-09 18:55:52 +00:00
Jim Grosbach
54e0e56d3f Change an assert() to a diagnostic.
llvm-svn: 212637
2014-07-09 18:55:49 +00:00
Benjamin Kramer
e971397307 TargetRegisterInfo: Remove function that fell out of use years ago.
llvm-svn: 212636
2014-07-09 18:53:57 +00:00
Cameron McInally
b2897c66f8 Update ReleaseNotes to mention Atomic NAND semantic changes.
llvm-svn: 212635
2014-07-09 18:29:55 +00:00
Adam Nemet
46d0e99fe3 [X86] AVX512: Enable it in the Loop Vectorizer
This lets us experiment with 512-bit vectorization without passing
force-vector-width manually.

The code generated for a simple integer memset loop is properly vectorized.
Disassembly is still broken for it though :(.

llvm-svn: 212634
2014-07-09 18:22:33 +00:00
Louis Gerbarg
53a9c08b97 Make AArch64FastISel::EmitIntExt explicitly check its source and destination types
This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be
an i8/i16/i32/i64.

rdar://17516686

llvm-svn: 212633
2014-07-09 17:54:32 +00:00
Sanjay Patel
e488c2a027 removed duplicate testcase
llvm-svn: 212632
2014-07-09 17:49:58 +00:00
Sanjay Patel
38aa8c3b99 Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap)
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).

This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.

Differential Revision: http://reviews.llvm.org/D4424

llvm-svn: 212629
2014-07-09 16:34:54 +00:00
Daniel Sanders
37f35e2a7f Add Imagination Technologies to the vendors in llvm::Triple
Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang.

Differential Revision: http://reviews.llvm.org/D4435

llvm-svn: 212626
2014-07-09 16:03:10 +00:00
Tim Northover
89066c9335 Generic: add range-adapter for option parsing.
I want to use it in lld, but while I'm here I'll update LLVM uses.

llvm-svn: 212615
2014-07-09 13:03:37 +00:00
Chandler Carruth
646382f7ef [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were
not widening the input type to the node sufficiently to let the ext take
place in a register.

This would in turn result in a mysterious bitcast assertion failure
downstream. First change here is to add back the helpful assert I had in
an earlier version of the code to catch this immediately.

Next change is to add support to the type legalization to detect when we
have widened the operand either too little or too much (for whatever
reason) and find a size-matched legal vector type to convert it to
first. This can also fail so we get a new fallback path, but that seems
OK.

With this, we no longer crash on vec_cast2.ll when using widening. I've
also added the CHECK lines for the zero-extend cases here. We still need
to support sign-extend and trunc (or something) to get plausible code
for the other two thirds of this test which is one of the regression
tests that showed the most scalarization when widening was
force-enabled. Slowly closing in on widening being a viable legalization
strategy without it resorting to scalarization at every turn. =]

llvm-svn: 212614
2014-07-09 12:36:54 +00:00
Chandler Carruth
c03f949cd5 Sink two variables only used in an assert into the assert itself. Should
fix the release builds with Werror.

llvm-svn: 212612
2014-07-09 11:13:16 +00:00
Benjamin Kramer
ba4b6af99b X86: When lowering v8i32 himuls use the correct shuffle masks for AVX2.
Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out
as we have to blend two vectors. While there remove unecessary cross-lane moves
from the shuffles so the backend can lower it to palignr instead of vperm.

Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2.

llvm-svn: 212611
2014-07-09 11:12:39 +00:00
Chandler Carruth
eb00ef26a3 [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
vector types to be legal and a ZERO_EXTEND node is encountered.

When we use widening to legalize vector types, extend nodes are a real
challenge. Either the input or output is likely to be legal, but in many
cases not both. As a consequence, we don't really have any way to
represent this situation and the prior code in the widening legalization
framework would just scalarize the extend operation completely.

This patch introduces a new DAG node to represent doing a zero extend of
a vector "in register". The core of the idea is to allow legal but
different vector types in the input and output. The output vector must
have fewer lanes but wider elements. The operation is defined to zero
extend the low elements of the input to the size of the output elements,
and drop all of the high elements which don't have a corresponding lane
in the output vector.

It also includes generic expansion of this node in terms of blending
a zero vector into the high elements of the vector and bitcasting
across. This in turn yields extremely nice code for x86 SSE2 when we use
the new widening legalization logic in conjunction with the new shuffle
lowering logic.

There is still more to do here. We need to support sign extension, any
extension, and potentially int-to-float conversions. My current plan is
to continue using similar synthetic nodes to model each of these
transitions with generic lowering code for each one.

However, with this patch LLVM already reaches performance parity with
GCC for the core C loops of the x264 code (assuming you disable the
hand-written assembly versions) when compiling for SSE2 and SSE3
architectures and enabling the new widening and lowering logic for
vectors.

Differential Revision: http://reviews.llvm.org/D4405

llvm-svn: 212610
2014-07-09 10:58:18 +00:00
Daniel Sanders
80d45ce1e9 [mips][mips64r6] Correct select patterns that have the condition or true/false values backwards
Summary: This bug caused SingleSource/Regression/C/uint64_to_float and SingleSource/UnitTests/2002-05-02-CastTest3 to fail (among others).

Differential Revision: http://reviews.llvm.org/D4388

llvm-svn: 212608
2014-07-09 10:47:26 +00:00
Daniel Sanders
f8ec2a1561 [mips][mips64r6] Correct cond names in the cmp.cond.[ds] instructions
Summary:
It seems we accidentally read the wrong column of the table MIPS64r6 spec
and used the names for c.cond.fmt instead of cmp.cond.fmt.

Differential Revision: http://reviews.llvm.org/D4387

llvm-svn: 212607
2014-07-09 10:40:20 +00:00