1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

24718 Commits

Author SHA1 Message Date
Arnold Schwaighofer
69cfc9ae09 Add a triple so that right syntax is choosen on mac osx systems
llvm-svn: 211188
2014-06-18 17:20:49 +00:00
Matt Arsenault
a46ba4c9d1 R600/SI: Add intrinsics for brev instructions
llvm-svn: 211187
2014-06-18 17:13:57 +00:00
Matt Arsenault
48848ba546 R600/SI: Prettier operand printing for 64-bit ops.
Copy what is done for 32-bit already so the order is about the same.

llvm-svn: 211186
2014-06-18 17:13:51 +00:00
Matheus Almeida
d742950090 [mips] SYNC $stype instruction was added in Mips32
but SYNC with an implied operand ($stype = 0) is valid since Mips2.

llvm-svn: 211185
2014-06-18 17:10:30 +00:00
Matt Arsenault
068d030935 R600: Implement f64 ftrunc, ffloor and fceil.
CI has instructions for these, so this fixes them for older hardware.

llvm-svn: 211183
2014-06-18 17:05:30 +00:00
Matt Arsenault
77f7e6fc35 R600: Custom lower f64 frint for pre-CI
llvm-svn: 211182
2014-06-18 17:05:26 +00:00
Adam Nemet
d36a2c2dba [X86] AVX512: Add non-temporal stores
Note that I followed the AVX2 convention here and didn't add LLVM intrinsics
for stores.  These can be generated with the nontemporal hint on LLVM IR
stores (see new test). The GCC builtins are lowered directly into nontemporal
stores.

<rdar://problem/17082571>

llvm-svn: 211176
2014-06-18 16:51:10 +00:00
Adam Nemet
2673c0a107 [X86] AVX512: Specify compressed displacement for vmovntdqa
Use the max 64-bit element size with EVEX_CD8.  This should work since element
size is ignored for a full-vector access (FVM).

llvm-svn: 211175
2014-06-18 16:51:07 +00:00
Ulrich Weigand
1458f67d3e [PowerPC] Do not use BLA with the 64-bit SVR4 ABI
The PowerPC back-end uses BLA to implement calls to functions at
known-constant addresses, which is apparently used for certain
system routines on Darwin.

However, with the 64-bit SVR4 ABI, this is actually incorrect.
An immediate function pointer value on this platform is not
directly usable as a target address for BLA:
- in the ELFv1 ABI, the function pointer value refers to the
  *function descriptor*, not the code address
- in the ELFv2 ABI, the function pointer value refers to the
  global entry point, but BL(A) would only be correct when
  calling the *local* entry point

This bug didn't show up since using immediate function pointer
values is not usually done in the 64-bit SVR4 ABI in the first
place.  However, I ran into this issue with a certain use case
of LLVM as JIT, where immediate function pointer values were
uses to implement callbacks from JITted code to helpers in
statically compiled code.

Fixed by simply not using BLA with the 64-bit SVR4 ABI.

llvm-svn: 211174
2014-06-18 16:14:04 +00:00
Ulrich Weigand
91f4bb3dc9 Do not XFAIL test/tools/llvm-cov tests on powerpc64le
All tests in test/tools/llvm-cov fail on big-endian targets and are
supposed to be XFAILed there.  However, including "powerpc64" in the
XFAIL line is now incorrect, since that matches both powerpc64- and
powerpc64le- targets, and the tests pass on the latter.

Update the XFAIL lines to use powerpc64- instead (like mips64-).

llvm-svn: 211172
2014-06-18 15:52:18 +00:00
Matheus Almeida
20a0332ec0 [mips] Fix expansion of memory operation if destination register is not a GPR.
Summary:
The assembler tries to reuse the destination register for memory operations whenever
it can but it's not possible to do so if the destination register is not a GPR.

Example:
  ldc1 $f0, sym
should expand to:
  lui $at, %hi(sym)
  ldc1 $f0, %lo(sym)($at)

It's entirely wrong to expand to:
  lui $f0, %hi(sym)
  ldc1 $f0, %lo(sym)($f0)

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4173

llvm-svn: 211169
2014-06-18 14:49:56 +00:00
Matheus Almeida
c796a413df [mips] Report correct location when "erroring" about the use of $at when it's not available.
Summary: This removes the FIXMEs from test/MC/Mips/mips-noat.s.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4172

llvm-svn: 211168
2014-06-18 14:46:05 +00:00
Zoran Jovanovic
02da9f46d5 [mips][mips64r6] Add BLTC and BLTUC instructions
Differential Revision: http://reviews.llvm.org/D3923

llvm-svn: 211167
2014-06-18 14:36:00 +00:00
Matheus Almeida
97cad15f37 [mips] Access $at only if necessary.
Summary:
This patch doesn't really change the logic behind expandMemInst but it allows
us to assemble .S files that use .set noat with some macros. For example:

.set noat
lw $k0, offset($k1)

Can expand to:
lui	$k0, %hi(offset)
addu	$k0, $k0, $k1
lw	$k0, %lo(offset)($k0)

with no need to access $at.

Reviewers: dsanders, vmedic

Reviewed By: dsanders, vmedic

Differential Revision: http://reviews.llvm.org/D4159

llvm-svn: 211165
2014-06-18 14:15:42 +00:00
Cameron McInally
8c86d371c4 Add pattern for unsigned v4i32->v4f64 convert on AVX512.
llvm-svn: 211164
2014-06-18 14:04:37 +00:00
Matheus Almeida
2185c77bcc [mips] Update MipsAsmParser so that it's possible to handle immediates that start with the binary operator NOT (~).
Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4158

llvm-svn: 211163
2014-06-18 13:55:18 +00:00
Matheus Almeida
9d6564dc81 [mips] Implement alias for 'and' and 'or' instructions for all ISAs.
Summary:
Examples: 
and $2, 4 <=> andi $2, $2, 4
or $2, 4 <=> ori $2, $2, 4

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4155

llvm-svn: 211161
2014-06-18 13:30:57 +00:00
Matheus Almeida
6d8543a62d [mips] Remove the last usage of parseRegister from MipsAsmParser.
Summary:
Added negative test case so that we can be sure we handle erroneous situations
while parsing the .cpsetup directive.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3681

llvm-svn: 211160
2014-06-18 13:08:59 +00:00
Jan Vesely
79097fa008 R600: Implement 64bit SRA
v2: Use capitalized variable name

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 211159
2014-06-18 12:27:17 +00:00
Jan Vesely
e8b0cf21f9 R600: Implement 64bit SRL
v2: use C++ style comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 211158
2014-06-18 12:27:15 +00:00
Jan Vesely
945809b390 R600: Implement 64bit SHL
v2: Use c++ style comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 211157
2014-06-18 12:27:13 +00:00
Evgeniy Stepanov
5b86f69879 [msan] Handle X86 *.psad.* and *.pmadd.* intrinsics.
llvm-svn: 211156
2014-06-18 12:02:29 +00:00
Tim Northover
ee6cb87672 DAG: move sret demotion into most basic LowerCallTo implementation.
It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one in the case where everything is at DAG level.

Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86).  So this
commit moves the sret handling down a level.

rdar://problem/17242889

llvm-svn: 211155
2014-06-18 11:52:44 +00:00
Simon Atanasyan
93ed003276 [llvm-readobj][ELF] New -mips-plt-got command line option to output
MIPS GOT section.

Patch reviewed by Rafael Espindola.

llvm-svn: 211150
2014-06-18 08:47:09 +00:00
Kevin Qin
8c13822db4 [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.

llvm-svn: 211144
2014-06-18 05:54:42 +00:00
Louis Gerbarg
c6a8e61725 Allow X86FastIsel to cope with 64 bit absolute relocations
This patch is a follow up to r211040 & r211052. Rather than bailing out of fast
isel this patch will generate an alternate instruction (movabsq) instead of the
leaq. While this will always have enough room to handle the 64 bit displacment
it is generally over kill for internal symbols (most displacements will be
within 32 bits) but since we have no way of communicating the code model to the
the assmebler in order to avoid flagging an absolute leal/leaq as illegal when
using a symbolic displacement.

llvm-svn: 211130
2014-06-17 23:22:41 +00:00
Juergen Ributzka
4e43e935e5 [FastISel][X86] Optimize predicates and fold CMP instructions.
This optimizes predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter one is more efficient to generate.

The same optimization is applied to conditional branches.

llvm-svn: 211126
2014-06-17 21:55:43 +00:00
Kevin Enderby
ee2166f84e Add "-format darwin" to llvm-size to be like darwin's size(1) -m output, and
and the -l option for the long format.  Also when the object is a Mach-O
file and the format is berkeley produce output like darwin’s default size(1)
summary berkeley derived output.

Like System V format, there are also some small changes in how and where
the file names and archive member names are printed for darwin and
Mach-O.

Like the changes to llvm-nm these are the first steps in seeing if it is
possible to make llvm-size produce the same output as darwin's size(1).

llvm-svn: 211117
2014-06-17 17:54:13 +00:00
Matt Arsenault
508925b13a R600/SI: Match cttz_zero_undef
llvm-svn: 211116
2014-06-17 17:36:27 +00:00
Matt Arsenault
71fb43e88e R600/SI: Match ctlz_zero_undef
llvm-svn: 211115
2014-06-17 17:36:24 +00:00
Will Schmidt
5832e5b8ca mark the old jit tests as unsupported for powerpc64 (for cmake)
mark the old JIT tests as unsupported for powerpc64 - CMake style.
This follows the style used for hexagon/arm64/aarch64.
The equivalent tests still run under the supported MCJIT/*

llvm-svn: 211111
2014-06-17 17:04:42 +00:00
Tom Stellard
a529beed9c R600: Use LDS and vectors for private memory
llvm-svn: 211110
2014-06-17 16:53:14 +00:00
Tom Stellard
6987d06184 SelectionDAG: Expand i64 = FP_TO_SINT i32
llvm-svn: 211108
2014-06-17 16:53:07 +00:00
Juergen Ributzka
39c5c57c4a [FastISel][X86] Fix previous refactoring commit (r211077)
Overlooked that fcmp_une uses an "or" instead of an "and" for combining the
flags.

llvm-svn: 211104
2014-06-17 14:47:45 +00:00
Dinesh Dwivedi
3f5165cd6a Fixed jump threading going to infinite loop.
This patch add code to remove unreachable blocks from function
as they may cause jump threading to stuck in infinite loop.

Differential Revision: http://reviews.llvm.org/D3991

llvm-svn: 211103
2014-06-17 14:34:19 +00:00
Tim Northover
300d65b2ea AArch64: estimate inline asm length during branch relaxation
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".

Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.

rdar://problem/17277590

llvm-svn: 211095
2014-06-17 11:31:42 +00:00
Evgeniy Stepanov
98ddd4a0cc [msan] Fix handling of multiplication by a constant with a number of trailing zeroes.
Multiplication by an integer with a number of trailing zero bits leaves
the same number of lower bits of the result initialized to zero.
This change makes MSan take this into account in the case of multiplication by
a compile-time constant.

We don't handle the general, non-constant, case because
(a) it's not going to be cheap (computation-wise);
(b) multiplication by a partially uninitialized value in user code is
    a bad idea anyway.

Constant case must be handled because it appears from LLVM optimization of a
completely valid user code, as the test case in compiler-rt demonstrates.

llvm-svn: 211092
2014-06-17 09:23:12 +00:00
Jingyue Wu
089dd5d55e [InstCombine] mark ADD with nuw if no unsigned overflow
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove "add a, b" do not unsigned
overflow, and thus convert it to "add nuw a, b".

Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll

Test Plan: make check-all

Reviewers: eliben, rafael, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: http://reviews.llvm.org/D4144

llvm-svn: 211084
2014-06-17 00:42:07 +00:00
Duncan P. N. Exon Smith
dbf049d50f SROA: Only split loads on byte boundaries
r199771 accidently broke the logic that makes sure that SROA only splits
load on byte boundaries.  If such a split happens, some bits get lost
when reassembling loads of wider types, causing data corruption.

Move the width check up to reject such splits early, avoiding the
corruption.  Fixes PR19250.

Patch by: Björn Steinbrink <bsteinbr@gmail.com>

llvm-svn: 211082
2014-06-17 00:19:35 +00:00
Juergen Ributzka
199456e51e [FastISel][X86] Refactor the code to get the X86 condition from a helper function. NFC.
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditonal branch.

llvm-svn: 211077
2014-06-16 23:58:24 +00:00
Eli Bendersky
2adb3a762b Teach LoopUnrollPass to respect loop unrolling hints in metadata.
[This is resubmitting r210721, which was reverted due to suspected breakage
which turned out to be unrelated].

Some extra review comments were addressed. See D4090 and D4147 for more details.

The Clang change that produces this metadata was committed in r210667

Patch by Mark Heffernan.

llvm-svn: 211076
2014-06-16 23:53:02 +00:00
Reed Kotler
e3da1c94f1 Add load/store functionality
Summary:
This patches allows non conversions like i1=i2; where both are global ints.
In addition, arithmetic and other things start to work since fast-isel will use
existing patterns for non fast-isel from tablegen files where applicable.

In addition i8, i16 will work in this limited context for assignment without the need
for sign extension (zero or signed). It does not matter how i8 or i16 are loaded (zero or sign extended)
since only the 8 or 16 relevant bits are used and clang will ask for sign extension before using them in
arithmetic. This is all made more complete in forthcoming patches.

for example:
  int i, j=1, k=3;
 
  void foo() {
    i = j + k;
  }

Keep in mind that this pass is not enabled right now and is an experimental pass
It can only be enabled with a hidden option to llvm of -mips-fast-isel.

Test Plan: Run test-suite, loadstore2.ll and I will run some executable tests.

Reviewers: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D3856

llvm-svn: 211061
2014-06-16 22:05:47 +00:00
Bill Schmidt
b0bab996e0 [PPC64] Fix PR19893 - improve code generation for local function addresses
Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal
code generation for forming a function address that is local to the compile
unit.  The existing code was treating both local and non-local functions
identically.

This patch fixes the problem by properly identifying local functions and
generating the proper addis/addi code.  I also noticed that Rafael's earlier
changes to correct the surrounding code in PPCISelLowering.cpp were also
needed for fast instruction selection in PPCFastISel.cpp, so this patch
fixes that code as well.

The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new
code generation.  I've added a -O0 run line to test the fast-isel code as
well.

Tested on powerpc64[le]-unknown-linux-gnu with no regressions.

llvm-svn: 211056
2014-06-16 21:36:02 +00:00
Louis Gerbarg
7ea43963d7 Improve comments for r211040
Added comment to clarify why we r211040 choose to bail out of fast isel instead
of generating a more complicated relocation, and fix mislabelled register in the
comments of the asan test case.

llvm-svn: 211052
2014-06-16 20:31:50 +00:00
Tim Northover
7f8ca02cf4 ARM: implement correct atomic operations on v7M
ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit
operations should work as normal, but 64-bit ones are almost certainly
doomed.

Patch by Phoebe Buckheister.

llvm-svn: 211042
2014-06-16 18:49:36 +00:00
Louis Gerbarg
55f89e91ff Fix illegal relocations in X86FastISel
On x86_86  the lea instruction can only use a 32 bit immediate value. When
the code is compiled statically the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of fast
isel in those cases and reverts to DAG which does the right thing.

Test case included.

llvm-svn: 211040
2014-06-16 17:35:40 +00:00
Jim Grosbach
2272906641 LowerSwitch: track bounding range for the condition tree.
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one cases range of values.

So, if we have a program with something like this:

switch (a) {
case 0:
  do0();
  break;
case 1:
  do1();
  break;
case 2:
  do2();
  break;
default:
  break;
}

the code produced is something like this:

  if (a < 1) {
    if (a == 0) {
      do0();
    }
  } else {
    if (a < 2) {
      if (a == 1) {
        do1();
      }
    } else {
      if (a == 2) {
        do2();
      }
    }
  }

This code is inefficient because the check (a == 1) to execute do1() is
not needed.

The reason is that because we already checked that (a >= 1) initially by
checking that also  (a < 2) we basically already inferred that (a == 1)
without the need of an extra basic block spawned to check if actually (a
== 1).

The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks the equality with the case
value / range the algorithm can decide if that block is really needed
depending on the already checked bounds .

For example, the above with "a = 1" would work like this:

the bounds start as LB: NONE , UB: NONE
as (a < 1) is emitted the bounds for the else path become LB: 1 UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After the emitting the check (a < 2) the bounds in the if branch become
LB: 1 UB: 1. This is because by checking that "a" is smaller than 2 then
the upper bound becomes 2 - 1 = 1.

When it is time to emit the leaf block for "case 1:" we notice that 1
can be squeezed exactly in between the LB and UB, which means that if we
arrived to that block there is no need to emit a block that checks if (a
== 1).

Patch by: Marcello Maggioni <hayarms@gmail.com>

llvm-svn: 211038
2014-06-16 16:55:20 +00:00
Rafael Espindola
93c342bca4 Fix pr17056.
This makes llvm-nm ignore members that are not sufficiently aligned for
lib/Object to handle.

These archives are invalid. GNU AR is able to handle this, but in general
just warns about broken archive members.

We should probably start warning too, but for now just make sure llvm-nm
exits with an 0.

llvm-svn: 211036
2014-06-16 16:41:00 +00:00
Cameron McInally
1bfa586059 Hook up vector int_ctlz for AVX512.
llvm-svn: 211024
2014-06-16 14:12:28 +00:00
Daniel Sanders
4895243b92 [mips][mips64r6] ssnop is deprecated on MIPS32r6/MIPS64r6
Summary: Depends on D4120

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: zoran.jovanovic, vmedic

Differential Revision: http://reviews.llvm.org/D4121

llvm-svn: 211021
2014-06-16 13:25:35 +00:00