1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

28663 Commits

Author SHA1 Message Date
Matt Arsenault
6387e9a3dc R600/SI: Implement i64 ctpop
llvm-svn: 210568
2014-06-10 19:18:24 +00:00
Matt Arsenault
8407076508 R600/SI: Use bcnt instruction for ctpop
llvm-svn: 210567
2014-06-10 19:18:21 +00:00
Matt Arsenault
d30b483e1a R600: Handle fcopysign
llvm-svn: 210564
2014-06-10 19:00:20 +00:00
Matt Arsenault
5bfef73e00 R600/SI: Handle sign_extend and zero_extend to i64 with patterns.
llvm-svn: 210563
2014-06-10 18:54:59 +00:00
Eric Christopher
5d40e6494c Add a FIXME.
llvm-svn: 210559
2014-06-10 18:31:18 +00:00
Eric Christopher
c85f7b41b5 Move AArch64SelectionDAGInfo down to the subtarget.
llvm-svn: 210557
2014-06-10 18:21:53 +00:00
Eric Christopher
b49d64f413 Remove the cached little endian variable. We can get it easily off
of the DataLayout.

llvm-svn: 210555
2014-06-10 18:11:20 +00:00
Eric Christopher
653ef1ea20 Have AArch64SelectionDAGInfo take a DataLayout parameter rather
than a TargetMachine.

llvm-svn: 210554
2014-06-10 18:06:28 +00:00
Eric Christopher
f8abeb0328 Remove caching of the subtarget for AArch64SelectionDAGInfo.
llvm-svn: 210553
2014-06-10 18:06:25 +00:00
Eric Christopher
3447f35f1b Move DataLayout onto the AArch64 subtarget.
llvm-svn: 210552
2014-06-10 18:06:23 +00:00
Eric Christopher
dcaea5b602 Move AArch64FrameLowering into the subtarget.
llvm-svn: 210549
2014-06-10 17:44:12 +00:00
Eric Christopher
9130d84166 Remove the uses of AArch64TargetMachine and AArch64Subtarget from
AArch64FrameLowering.

llvm-svn: 210548
2014-06-10 17:33:39 +00:00
Reed Kotler
582410c0f5 Do Materialize Floating Point in Mips Fast-Isel
Summary:
Implement materialize of floating point literals in Mips Fast-Isel

Reopened version of D3659

Test Plan: simplestorefp1.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4071

llvm-svn: 210546
2014-06-10 16:45:44 +00:00
Andrea Di Biagio
183859cf37 [X86] Improved target combine rules for selecting horizontal add/sub.
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for 
the VEX.256 encoded versions of horizontal add/sub.

For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.

Added a new test case into haddsub-2.ll.

llvm-svn: 210545
2014-06-10 16:42:57 +00:00
Tom Stellard
e9ee33fafb Hexagon: Expand i1 SELECT_CC
il is legal for Hexagon, so I should have marked this as Expand for
SELECT_CC when I removed setOperationAction(ISD::SELECT_CC, MVT::Other,
Expand); in r210541.

llvm-svn: 210544
2014-06-10 16:42:41 +00:00
Adam Nemet
db983b8c6b [X86] AVX512: Add vmovntdqa
Along with the corresponding intrinsic and tests.

llvm-svn: 210543
2014-06-10 16:39:53 +00:00
Renato Golin
89197528be Fix a bug in the Thumb1 ARM Load/Store optimizer
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.

This patch fixes PR19972. Patch by Moritz Roth.

llvm-svn: 210542
2014-06-10 16:39:21 +00:00
Tom Stellard
ad2d29f10e SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:

setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);

to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.

llvm-svn: 210541
2014-06-10 16:01:29 +00:00
Tom Stellard
aab1db4cd9 SelectionDAG: Expand SELECT_CC to SELECT + SETCC
This consolidates code from the Hexagon, R600, and XCore targets.

No functionality change intended.

llvm-svn: 210539
2014-06-10 16:01:22 +00:00
Bill Schmidt
6e11183ad7 [PPC64LE] Recognize shufflevector patterns for little endian
Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code.  The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions.  This patch adds the recognition code
for little endian.

I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this.  The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.

llvm-svn: 210536
2014-06-10 14:35:01 +00:00
Chad Rosier
0f6d185fcf [AArch64] Emit .ident compiler version attribute.
Patch by Ana Pazos<apazos@codeaurora.org>!

llvm-svn: 210535
2014-06-10 14:32:08 +00:00
Artyom Skrobov
e445b07705 Condition codes AL and NV are invalid in the aliases that use
inverted condition codes (CINC, CINV, CNEG, CSET, and CSETM).

Matching aliases based on "immediate classes", when disassembling,
wasn't previously supported, hence adding MCOperandPredicate
into class Operand, and implementing the support for it
in AsmWriterEmitter.

The parsing for those aliases was already custom, so just adding
the missing condition into AArch64AsmParser::parseCondCode.

llvm-svn: 210528
2014-06-10 13:11:35 +00:00
Tim Northover
8d5e97704b AArch64: disallow x30 & x29 as the destination for indirect tail calls
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.

llvm-svn: 210525
2014-06-10 10:50:24 +00:00
Tim Northover
7a0bf66207 Revert "X86: elide comparisons after cmpxchg instructions."
This reverts commit r210523. It was committed prematurely without waiting for
review.

llvm-svn: 210524
2014-06-10 10:50:11 +00:00
Tim Northover
d8b770a0be X86: elide comparisons after cmpxchg instructions.
The C++ and C semantics of the compare_and_swap operations actually
require us to return a boolean "success" value. In LLVM terms this
means a second comparison of the output of "cmpxchg" against the input
desired value.

However, x86's "cmpxchg" instruction sets all flags for the comparison
formed, so we can skip any secondary comparison. (N.b. this isn't true
for cmpxchg8b/16b, which only set ZF).

rdar://problem/13201607

llvm-svn: 210523
2014-06-10 10:49:07 +00:00
Tim Northover
bfac8dd607 AArch64: teach FastISel how to handle offset FrameIndices
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).

rdar://problem/17187463

llvm-svn: 210520
2014-06-10 09:52:44 +00:00
Tim Northover
666d07f003 AArch64: make FastISel memcpy emission more robust.
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.

rdar://problem/17187463

llvm-svn: 210519
2014-06-10 09:52:40 +00:00
Eric Christopher
27ed136ced Delete X86JITInfo in the subtarget destructor.
llvm-svn: 210516
2014-06-10 08:03:42 +00:00
Juergen Ributzka
250efba0f3 [ConstantHoisting][X86] Improve the cost model for small constants with large types (i64 and above).
This improves the X86 cost model for small constants with large types. Before
this commit we would even hoist trivial constants such as i96 2.

This is related to <rdar://problem/17070936>

llvm-svn: 210504
2014-06-10 00:32:29 +00:00
Bill Schmidt
41cd7375c8 [PPC64LE] Generate correct code for unaligned little-endian vector loads
The code in PPCTargetLowering::PerformDAGCombine() that handles
unaligned Altivec vector loads generates a lvsl followed by a vperm.
As we've seen in numerous other places, the vperm instruction has a
big-endian bias, and this is fixed for little endian by complementing
the permute control vector and swapping the input operands.  In this
case the lvsl is providing the permute control vector.  Rather than
generating an lvsl and a complement operation, it is sufficient to
generate an lvsr instruction instead.  Thus for LE code generation we
will generate an lvsr rather than an lvsl, and swap the other input
arguments on the vperm.

The existing test/CodeGen/PowerPC/vec_misalign.ll is updated to test
the code generation for PPC64 and PPC64LE, in addition to the existing
PPC32/G5 testing.

llvm-svn: 210493
2014-06-09 22:00:52 +00:00
Saleem Abdulrasool
cf709958ac ARM: add VLA extension for WoA Itanium ABI
The armv7-windows-itanium environment is nearly identical to the MSVC ABI. It
has a few divergences, mostly revolving around the use of the Itanium ABI for
C++. VLA support is one of the extensions that are amongst the set of the
extensions.

This adds support for proper VLA emission for this environment. This is
somewhat similar to the handling for __chkstk emission on X86 and the large
stack frame emission for ARM. The invocation style for chkstk is still
controlled via the -mcmodel flag to clang.

Make an explicit note that this is an extension.

llvm-svn: 210489
2014-06-09 20:18:42 +00:00
Eric Christopher
de9b19fdc2 Move all of the x86 subtarget initialized variables down into the x86 subtarget
from the x86 target machine. Should be no functional change.

llvm-svn: 210479
2014-06-09 17:08:19 +00:00
Matt Arsenault
f38f9f9399 R600/SI: Rename VOP3 helper class to be more general
It has other uses besides shift instructions.

llvm-svn: 210478
2014-06-09 17:00:46 +00:00
Andrea Di Biagio
23548cb631 [X86] Add target combine rules for horizontal add/sub.
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.

This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:

  (insert_vector_elt (insert_vector_elt A, I0), I1) ->
    (insert_vecto_elt (insert_vector_elt A, I1), I0)

This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.

llvm-svn: 210477
2014-06-09 16:54:41 +00:00
Matt Arsenault
c9f3bd4d6c R600/SI: Keep 64-bit not on SALU
llvm-svn: 210476
2014-06-09 16:36:31 +00:00
Matt Arsenault
a34a3c834c R600: Fix selection failure for vector bswap
llvm-svn: 210475
2014-06-09 16:20:25 +00:00
Bill Schmidt
3ff0a8eb8b [PPC64LE] Generate correct little-endian code for v16i8 multiply
The existing code in PPCTargetLowering::LowerMUL() for multiplying two
v16i8 values assumes that vector elements are numbered in big-endian
order.  For little-endian targets, the vector element numbering is
reversed, but the vmuleub, vmuloub, and vperm instructions still
assume big-endian numbering.  To account for this, we must adjust the
permute control vector and reverse the order of the input registers on
the vperm instruction.

The existing test/CodeGen/PowerPC/vec_mul.ll is updated to be executed
on powerpc64 and powerpc64le targets as well as the original powerpc
(32-bit) target.

llvm-svn: 210474
2014-06-09 16:06:29 +00:00
Sasa Stankovic
c2477ae99d [mips] Fix a bug for NaCl target - Don't report the error when non-dangerous
load/store is in branch delay slot.

Differential Revision: http://llvm-reviews.chandlerc.com/D4048

llvm-svn: 210470
2014-06-09 14:09:28 +00:00
Andrea Di Biagio
26cba06d11 [X86] Avoid emitting unnecessary test instructions.
This patch teaches the backend how to check for the 'NoSignedWrap' flag on
binary operations to improve the emission of 'test' instructions.

If the result of a binary operation is known not to overflow we know that
resetting the Overflow flag is unnecessary and so we can avoid emitting
the test instruction.

Patch by Marcello Maggioni.

llvm-svn: 210468
2014-06-09 12:34:50 +00:00
Alexey Volkov
a3a5a1d7f1 [X86] Use ADD/SUB instead of INC/DEC for Silvermont
According to Intel Software Optimization Manual 
on Silvermont INC or DEC instructions require 
an additional uop to merge the flags.
As a result, a branch instruction depending 
on an INC or a DEC instruction incurs a 1 cycle penalty.

Differential Revision: http://reviews.llvm.org/D3990

llvm-svn: 210466
2014-06-09 11:40:41 +00:00
Artyom Skrobov
915d6e58c2 [AArch64] Missing aliases for CMP/CMN [W]SP with no shift
llvm-svn: 210464
2014-06-09 11:10:14 +00:00
Zoran Jovanovic
6af2af8ced [mips][mips64r6] Add LDPC instruction
Differential Revision: http://reviews.llvm.org/D3822

llvm-svn: 210460
2014-06-09 09:49:51 +00:00
Chad Rosier
22a15b47d4 [AArch64] Fix the ordering of the accumulate operand in SchedRW list.
Patch by Dave Estes <cestes@codeaurora.org>
http://reviews.llvm.org/D4037

llvm-svn: 210446
2014-06-09 01:54:00 +00:00
Chad Rosier
010594577d [AArch64] When combining constant mul of power of 2 plus/minus 1, prefer shift
plus add.  The shift can be folded into the add.  This only effects codegen
when the constant is 3.

llvm-svn: 210445
2014-06-09 01:25:51 +00:00
Craig Topper
b00824c629 [C++11] Use 'nullptr'.
llvm-svn: 210442
2014-06-08 22:29:17 +00:00
Saleem Abdulrasool
8c260cf505 X86: simplify data layout calculation
X86Subtarget::isTargetCygMing || X86Subtarget::isTargetKnownWindowsMSVC is
equivalent to all Windows environments.  Simplify the check to isOSWindows.
NFC.

llvm-svn: 210431
2014-06-08 19:08:36 +00:00
David Blaikie
f670b953e7 AsmMatchers: Use unique_ptr to manage ownership of MCParsedAsmOperand
I saw at least a memory leak or two from inspection (on probably
untested error paths) and r206991, which was the original inspiration
for this change.

I ran this idea by Jim Grosbach a few weeks ago & he was OK with it.
Since it's a basically mechanical patch that seemed sufficient - usual
post-commit review, revert, etc, as needed.

llvm-svn: 210427
2014-06-08 16:18:35 +00:00
Alp Toker
ad1219e3fd Revert "Do materialize for floating point"
1) The commit was made despite profound lack of understanding:

   "I did not understand the comment about using dyn_cast instead of isa. I will
   commit as is and make the update after. You can explain what you meant to me."

   Commit first, understand later isn't OK.

2) Review comments were simply ignored:

   "Can you edit the summary to describe what the patch is for? It appears to be
   a list of commits at the moment."

3) The patch got LGTM'd off-list without any indication of readiness.

4) The public mailing list was excluded from patch review so all of this was
   hidden from the community.

This reverts commit r210414.

llvm-svn: 210424
2014-06-08 09:13:42 +00:00
Alp Toker
c46322b804 Remove outdated CMake MSVC workaround
llvm-svn: 210421
2014-06-08 07:37:17 +00:00
Reed Kotler
75bd26e764 Do materialize for floating point
Summary:
start to do simple constants

finish simplestore

add test case

format

Merge branch 'master' into 1756_8

Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel

Merge branch 'master' into 1756_8

Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel

in progress

finish integer materialize

test cases

test cases

in progress

Finish up fast-isel materialize for ints.

Finish materialize for ints

test cases

simplestorei.ll

Merge branch 'master' into 1756_8

fix fp constants for fast-isel

Merge branch '1758_1' of dmz-portal.mips.com:llvm into 1758_1

in progress

lastest for fp materialization

clean up

Merge branch 'master' into 1758_1

formatting

add test case

finish test case

Merge branch 'master' into 1758_2

Test Plan:
simplestore.ll

simplestore.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3659

llvm-svn: 210414
2014-06-08 03:30:32 +00:00
Reed Kotler
2649e8f1d6 start to clean up buildMI calls in mips fast-isel
Summary: Merge branch 'master' into 1758_6

Test Plan:
No functionality change. Run "make check" and run test-suite.
Because our servers are not yet running again I have not yet run test-suite.
I will further review myself before submission.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3819

llvm-svn: 210413
2014-06-08 03:04:42 +00:00
Reed Kotler
1d0d382144 include MipsGenFastISel.inc
Summary:
Included this file which is needed to enable tablegen generated functionality
for fast mips-isel

Test Plan:
This has no visible functionality by itself but just adding the include
file creates some issues so I have it as a separate patch.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3812

llvm-svn: 210410
2014-06-08 02:08:43 +00:00
Alp Toker
62946e907c Fix typos
llvm-svn: 210401
2014-06-07 21:23:09 +00:00
Saleem Abdulrasool
6fbe07452a ARM: correct assertion for long-calls on WoA
COFF/PE, so the relocation model is never static.  Loosen the assertion
accordingly.  The relocation can still be emitted properly, as it will be
converted to an IMAGE_REL_ARM_ADDR32 which will be resolved by the loader
taking the base relocation into account.  This is necessary to permit the
emission of long calls which can be controlled via the -mlong-calls option in
the driver.

llvm-svn: 210399
2014-06-07 20:29:27 +00:00
Eric Christopher
0f6f12761f Replace the use of TargetMachine with a tiny bool variable.
llvm-svn: 210386
2014-06-06 23:26:48 +00:00
Eric Christopher
99e2e51dd4 Remove all local variables from X86SelectionDAGInfo, the DAG has
all of the ones we were stashing away on startup.

llvm-svn: 210385
2014-06-06 23:26:43 +00:00
Benjamin Kramer
ab2896f4aa X86: Don't turn shifts into ands if there's another use that may not check for equality.
Fixes PR19964.

llvm-svn: 210371
2014-06-06 21:08:55 +00:00
Eric Christopher
db8e2ecde5 Have TargetSelectionDAGInfo take a DataLayout initializer rather than
a TargetMachine since the only thing it wants is DataLayout.

llvm-svn: 210366
2014-06-06 19:04:48 +00:00
Filipe Cabecinhas
bcbf7c1220 Fixed a bug in lowering shuffle_vectors to insertps
Summary:
We were being too strict and not accounting for undefs.
Added a test case and fixed another one where we improved codegen.

Reviewers: grosbach, nadav, delena

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4039

llvm-svn: 210361
2014-06-06 18:07:06 +00:00
Bill Schmidt
647be1ef2c [PPC64LE] Fix lowering of BUILD_VECTOR and SHUFFLE_VECTOR for little endian
This patch fixes a couple of lowering issues for little endian
PowerPC.  The code for lowering BUILD_VECTOR contains a number of
optimizations that are only valid for big endian.  For now, we disable
those optimizations for correctness.  In the future, we will add
analogous optimizations that are correct for little endian.

When lowering a SHUFFLE_VECTOR to a VPERM operation, we again need to
make the now-familiar transformation of swapping the input operands
and complementing the permute control vector.  Correctness of this
transformation is tested by the accompanying test case.

llvm-svn: 210336
2014-06-06 14:06:26 +00:00
Eric Christopher
5320200bbc Remove X86Subtarget from the X86FrameLowering constructor since
we can just pass in the values we already know and we're not
caching the subtarget anymore.

llvm-svn: 210292
2014-06-05 22:10:58 +00:00
Eric Christopher
48b570a54b Remove caching of the subtarget for X86FrameLowering.
llvm-svn: 210290
2014-06-05 22:00:31 +00:00
Eric Christopher
f2fcd2c296 Remove duplicate copy of InstrItineraryData from the TargetMachine,
it's already on the subtarget.

llvm-svn: 210289
2014-06-05 21:42:54 +00:00
Tom Roeder
740d86dc79 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Bill Schmidt
9380b3b7bf [PPC64LE] Temporarily disable VSX support in little-endian mode
This is a preliminary patch for the PowerPC64LE support.  In stage 1
of the vector support, we will support the VMX (Altivec) instruction
set, but will not yet support the VSX instructions.  This is merely a
staging issue to provide functional vector support as soon as
possible.

llvm-svn: 210271
2014-06-05 16:21:13 +00:00
Ulrich Weigand
6d5691fc73 [SystemZ] Do not install IfConverter pass at -O0
When not optimizing, do not run the IfConverter pass, this makes
debugging more difficult (and causes a testsuite failure in
DebugInfo/unconditional-branch.ll).

llvm-svn: 210263
2014-06-05 14:20:10 +00:00
Sasa Stankovic
9817e8490e [mips] Modify long branch for NaCl:
* Move the instruction that changes sp outside of the branch delay slot.
  * Bundle-align the target of indirect branch.

Differential Revision: http://llvm-reviews.chandlerc.com/D3928

llvm-svn: 210262
2014-06-05 13:52:08 +00:00
Eric Christopher
9eab959903 We've got a getSlotSize call already that we use everywhere else,
use it here too.

llvm-svn: 210227
2014-06-05 00:22:13 +00:00
Matt Arsenault
9e400b2e26 R600/SI: Match rsq instructions
llvm-svn: 210226
2014-06-05 00:15:55 +00:00
Eric Christopher
dee30cf13f 80-columns.
llvm-svn: 210224
2014-06-05 00:09:08 +00:00
Eric Christopher
8daea5a337 Remove uses of the TargetMachine from X86FrameLowering.
llvm-svn: 210223
2014-06-05 00:09:05 +00:00
Matt Arsenault
d9ef70c461 Use nullptr
llvm-svn: 210222
2014-06-05 00:01:12 +00:00
Yaron Keren
9921f8bf3f Two small enhancements for the JIT.
When JITting a large project such as Boost it's quite hard to figure out the problematic inline asm without debug location. This patch provides debug location printout before the JIT aborts due to inline asm. printDebugLoc() was exposed from MachineInstr.cpp and reused here.

If the JIT run with debug info, don't bomb on DBG_VALUE but ignore them.

http://reviews.llvm.org/D3416

llvm-svn: 210201
2014-06-04 17:35:28 +00:00
Tilmann Scheller
acc3c4f243 [AArch64] clang-format the load/store optimizer.
No change in functionality.

llvm-svn: 210182
2014-06-04 12:40:35 +00:00
Tilmann Scheller
4ed82f8466 [AArch64] Fix some LLVM Coding Standards violations in the load/store optimizer.
Variable names should start with an upper case letter.

No change in functionality.

llvm-svn: 210181
2014-06-04 12:36:28 +00:00
Nick Lewycky
336449ee77 Fix a use of uninitialized value. OldCC is set when IsCmpZero || IsSwapped and read when ShouldUpdateCC || IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code.
llvm-svn: 210168
2014-06-04 07:45:54 +00:00
Andrew Trick
2cfbbba814 Add a subtarget hook: enablePostMachineScheduler.
As requested by AArch64 subtargets.

Note that this will have no effect until the
AArch64 target actually enables the pass like this:
substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);

As soon as armv7 switches over, PostMachineScheduler will become the
default postRA scheduler, so this won't be necessary any more.
Targets using the old postRA schedule would then do:
substitutePass(&PostMachineSchedulerID, &PostRASchedulerID);

llvm-svn: 210167
2014-06-04 07:06:27 +00:00
Matt Arsenault
ad098591b8 Fix typos
llvm-svn: 210135
2014-06-03 23:06:13 +00:00
Eric Christopher
f3e627ce2e Revert r209381 as it isn't a local variable. Add a testcase so that
we know next time this happens.

llvm-svn: 210127
2014-06-03 21:01:39 +00:00
Eric Christopher
2010fabe89 Fixup formatting in the pass.
llvm-svn: 210126
2014-06-03 21:01:35 +00:00
Tilmann Scheller
a373112959 [AArch64] Fix typo in load/store optimizer.
llvm-svn: 210114
2014-06-03 16:33:13 +00:00
Tim Northover
d56609ce6c AArch64: mark small types (i1, i8, i16) as promoted
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.

llvm-svn: 210102
2014-06-03 13:54:53 +00:00
Jiangning Liu
531302fb19 [AArch64] Correctly deal with VPR stack parameter passing.
llvm-svn: 210067
2014-06-03 03:25:09 +00:00
Rafael Espindola
87cd774844 Allow alias to point to an arbitrary ConstantExpr.
This  patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00
Eric Christopher
44a7e98ce5 Omit else branch after return.
llvm-svn: 210034
2014-06-02 17:29:07 +00:00
Andrea Di Biagio
3455d1a524 [X86] Fix checked arithmetic for i8 on X86.
When lowering a ISD::BRCOND into a test+branch, make sure that we
always use the correct condition code to emit the test operation.

This fixes PR19858: "i8 checked mul is wrong on x86".

Patch by Keno Fisher!

llvm-svn: 210032
2014-06-02 16:00:27 +00:00
Christian Pirker
6d4cce97f1 ARMEB: Fix function return type f64
Reviewed at http://reviews.llvm.org/D3968

llvm-svn: 209990
2014-06-01 09:30:52 +00:00
Matt Arsenault
a36a2916ac R600: Set all float vector expands in the same place
llvm-svn: 209988
2014-06-01 07:38:21 +00:00
Alp Toker
e8634eb077 Fix typos
llvm-svn: 209982
2014-05-31 21:26:28 +00:00
Alp Toker
f3f3560e44 Update a couple of header inclusion guards
llvm-svn: 209980
2014-05-31 21:26:09 +00:00
Matt Arsenault
1c23cf5566 R600/SI: Remove redundant patterns
These patterns are already handled in the instruction definition.

llvm-svn: 209979
2014-05-31 19:25:17 +00:00
Matt Arsenault
ff3cea9ab5 R600/SI: Fix [s|u]int_to_fp for i1
llvm-svn: 209971
2014-05-31 06:47:42 +00:00
Eric Christopher
1aad72164e Have the TLOF creation take a Triple rather than needing a subtarget.
llvm-svn: 209937
2014-05-31 00:07:32 +00:00
Andrea Di Biagio
3a03708285 [X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type.
This patch teaches the backend how to simplify/canonicalize dag node
sequences normally introduced by the backend when promoting certain dag nodes
with illegal vector type.

This patch adds two new combine rules:
1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) ->
        (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)

2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) ->
        (shuffle (BINOP A, B), Undef, <Mask>).

Both rules are only triggered on the type-legalized DAG.
In particular, rule 1. is a target specific combine rule that attempts
to sink a bitconvert into the operands of a binary operation.
Rule 2. is a target independet rule that attempts to move a shuffle
immediately after a binary operation.

llvm-svn: 209930
2014-05-30 23:17:53 +00:00
Eric Christopher
0cc6977494 isSVR4ABI() returned !isDarwin() so just move that to the else
block and remove the unreachable code.

llvm-svn: 209927
2014-05-30 22:47:53 +00:00
Eric Christopher
f0478ea2df Rename CreateTLOF->createTLOF to match the rest of the file and the
rest of the targets with a similar function name.

llvm-svn: 209926
2014-05-30 22:47:48 +00:00
Filipe Cabecinhas
89440ec19e Separate the check for blend shuffle_vector masks
Summary:
Separate the check for blend shuffle_vector masks into isBlendMask.
This function will also be used to check if a vector shuffle is legal. No
change in functionality was intended, but we ended up improving codegen on
two tests, which were being (more) optimized only if the resulting shuffle
was legal.

Reviewers: nadav, delena, andreadb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3964

llvm-svn: 209923
2014-05-30 21:31:21 +00:00
Tim Northover
6ee9050b92 ARM: use AAPCS-style prologues for embedded MachO.
Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr,
followed by a wide push of the remaining registers if there are any. AAPCS uses
a single push.w instruction.

It turns out that, on average, enough registers get pushed that code is smaller
in the AAPCS prologue, which is a nice property for M-class programmers. They
also have other options available for back-traces, so can hopefully deal with
the fact that FP & LR aren't adjacent in memory.

rdar://problem/15909583

llvm-svn: 209895
2014-05-30 13:23:06 +00:00
Tim Northover
3bb84c9bcc ARM & AArch64: make use of common cmpxchg idioms after expansion
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.

When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.

This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.

I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.

rdar://problem/16227836

llvm-svn: 209883
2014-05-30 10:09:59 +00:00
Adam Nemet
94b6f19596 [X86] Remove AVX1 vbroadcast intrinsics
The corresponding CFE patch replaces these intrinsics with vector initializers
in avxintrin.h.  This patch removes the LLVM intrinsics from the backend.

We now stop lowering at X86ISD::VBROADCAST custom node rather than lowering
that further to the intrinsics.

The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue
to use intrinsics.  As explained in the CFE patch, the reason is that we
currently don't generate as good code for them without the intrinsics.

CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change.  It
checks that for a series of insertelements we generate the appropriate
vbroadcast instruction.

Also verified that there was no assembly change in the test-suite before and
after this patch.

llvm-svn: 209864
2014-05-29 23:35:36 +00:00