mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

24071 Commits

Author SHA1 Message Date
Christian Pirker
35d96c7f86 ARM big endian function argument passing
llvm-svn: 208316
2014-05-08 14:06:24 +00:00
Daniel Sanders
c6c9c916df [mips] Implement l[wd]c3, and s[wd]c3.
Summary:
These instructions were added in MIPS-I and MIPS-II but were removed in
MIPS-III. Interestingly, GAS continues to accept them when assembling for
MIPS-III.

For the moment, we will follow GAS and accept these instructions for
MIPS-III and newer, but this will be tightened up when the invalid-*.s
tests are added.

Depends on D3647

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3648

llvm-svn: 208311
2014-05-08 13:02:11 +00:00
Dario Domizioli
df090599ce Revert test commit. Removed blank line.
llvm-svn: 208308
2014-05-08 12:54:43 +00:00
James Molloy
294269a69e [ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments for big endian calls.
SelectionDAG already knows about this, but fast-isel was ignorant.

llvm-svn: 208307
2014-05-08 12:53:50 +00:00
Daniel Sanders
8071a219e6 [mips] Marked up instructions added in MIPS-II and tested that IAS for -mcpu=mips1 does not accept them
Summary:
A small number of instructions are rejected with the wrong error message.
These have been placed in a separate test for now. There seems to be some
parsing quirk that triggers when these instructions are disabled.

Depends on D3571

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3647

llvm-svn: 208305
2014-05-08 12:40:48 +00:00
Daniel Sanders
94fae7d980 [mips] Implement tlbp, tlbr, tlbwi, and tlbwr
Reviewers: vmedic, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3571

llvm-svn: 208301
2014-05-08 11:51:18 +00:00
Dario Domizioli
97c0aef837 Test commit. Added blank line.
llvm-svn: 208298
2014-05-08 11:28:14 +00:00
Tim Northover
72838ce201 ARM64: make sure FastISel emits SSA MachineInstrs
We need to use a temporary register for a 2-step operation like REM.
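
As a rough illustration (plain C++, not the actual FastISel code), the REM
expansion needs a scratch value for the quotient, and in SSA form that
scratch must be a fresh virtual register rather than a redefinition:

  #include <cassert>
  #include <cstdint>

  // Sketch of the two-step REM expansion: SDIV into a temporary, then
  // MSUB into the result. In SSA-form MachineInstrs every value is
  // defined exactly once, so the quotient needs its own vreg.
  int64_t expandSRem(int64_t Dividend, int64_t Divisor) {
    int64_t Quotient = Dividend / Divisor; // SDIV tmp, dividend, divisor
    return Dividend - Quotient * Divisor;  // MSUB dst, tmp, divisor, dividend
  }

  int main() {
    assert(expandSRem(7, 3) == 1);
    assert(expandSRem(-7, 3) == -1);
    return 0;
  }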

llvm-svn: 208297
2014-05-08 10:30:56 +00:00
Evgeniy Stepanov
196ad52640 [asan] Preserve flags in asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 208296
2014-05-08 09:55:24 +00:00
Hal Finkel
c52e65b830 Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
 - For generic x86-64 (which uses the SB model and, thus, will get some
   unrolling).
 - For AMD cores (because they still currently use the SB scheduling model).
 - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
   the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
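
A toy model of the idea (assumed numbers and helper name; the real logic
lives in BasicTTI and the scheduling models): partial unrolling is sized so
the unrolled body still fits the dispatch buffer.

  #include <cstdio>

  // Toy sketch: derive a partial-unroll factor from the processor's
  // uop dispatch-buffer size. Names and numbers are illustrative only.
  unsigned pickUnrollFactor(unsigned LoopMicroOpBufferSize,
                            unsigned LoopBodyUOps) {
    if (LoopBodyUOps == 0 || LoopBodyUOps > LoopMicroOpBufferSize)
      return 1; // body already overflows the buffer: don't unroll
    return LoopMicroOpBufferSize / LoopBodyUOps;
  }

  int main() {
    // e.g. a 28-uop buffer and a 7-uop loop body -> unroll by 4
    printf("unroll factor = %u\n", pickUnrollFactor(28, 7));
    return 0;
  }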

llvm-svn: 208289
2014-05-08 09:14:44 +00:00
Hao Liu
be513c440d AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend.
llvm-svn: 208284
2014-05-08 07:38:13 +00:00
Saleem Abdulrasool
fffc610ca8 test: fix silly typo
Oh silly Darwin and your case insensitive file system.

llvm-svn: 208274
2014-05-08 01:41:04 +00:00
Saleem Abdulrasool
84a61727f4 ARM: support FK_SecRel_2 relocations on WoA
This adds FK_SecRel_2 relocation support to ARM.  This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.

llvm-svn: 208273
2014-05-08 01:35:57 +00:00
Filipe Cabecinhas
275860c4fd Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.
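
For a feel of the target pattern, here is an intrinsics-level C++ example
(illustrative only, not the lowering code itself) of the single INSERTPS
the patch aims for: one instruction both moves a lane and zeroes others.

  #include <smmintrin.h> // SSE4.1; compile with -msse4.1
  #include <cstdio>

  int main() {
    // INSERTPS immediate: (src_lane << 6) | (dst_lane << 4) | zero_mask.
    __m128 a = _mm_set_ps(4.f, 3.f, 2.f, 1.f); // lanes {1,2,3,4}
    __m128 b = _mm_set_ps(8.f, 7.f, 6.f, 5.f); // lanes {5,6,7,8}
    // Copy lane 0 of b into lane 2 of a, zeroing lanes 0 and 3 (mask 0x9).
    __m128 r = _mm_insert_ps(a, b, (0 << 6) | (2 << 4) | 0x9);
    float out[4];
    _mm_storeu_ps(out, r);
    printf("{%g, %g, %g, %g}\n", out[0], out[1], out[2], out[3]); // {0,2,5,0}
    return 0;
  }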

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

llvm-svn: 208271
2014-05-08 00:25:16 +00:00
Duncan P. N. Exon Smith
c74b2b0974 IR: Don't allow non-default visibility on local linkage
Visibilities of `hidden` and `protected` are meaningless for symbols
with local linkage.

  - Change the assembler to reject non-default visibility on symbols
    with local linkage.

  - Change the bitcode reader to auto-upgrade `hidden` and `protected`
    to `default` when the linkage is local.

  - Update LangRef.
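
The auto-upgrade rule, modelled standalone (toy enums; not the actual
BitcodeReader code):

  #include <cassert>

  enum class Linkage { External, Internal, Private };
  enum class Visibility { Default, Hidden, Protected };

  // Model of the upgrade: any visibility recorded on a local-linkage
  // symbol is folded back to `default` when the bitcode is read.
  Visibility upgradeVisibility(Linkage L, Visibility V) {
    bool IsLocal = (L == Linkage::Internal || L == Linkage::Private);
    return IsLocal ? Visibility::Default : V;
  }

  int main() {
    assert(upgradeVisibility(Linkage::Internal, Visibility::Hidden) ==
           Visibility::Default);
    assert(upgradeVisibility(Linkage::External, Visibility::Hidden) ==
           Visibility::Hidden);
    return 0;
  }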

<rdar://problem/16141113>

llvm-svn: 208263
2014-05-07 22:57:20 +00:00
Quentin Colombet
548eb2e304 [X86] Add a test case for r208252.
Prior to r208252, the FMA 231 family was marked as isCommutable. However, the
memory variants of this family are not commutable. Therefore, we had not
implemented findCommutedOpIndices for those variants and missed that
the default implementation (more or less: commute indices 1 and 2) was
firing behind our back.
As a result, as demonstrated in the test case before the fix, we were
transforming a = b * c + a into a = a * c + b.

I.e., before r208252 we were generating for this test case:
vmovaps %xmm0, %xmm1
vmovss (%rsi), %xmm0
vfmadd231ss (%rdi), %xmm1, %xmm0

Instead of:
vmovss (%rsi), %xmm1
vfmadd231ss (%rdi), %xmm1, %xmm0

<rdar://problem/16800495> 

llvm-svn: 208260
2014-05-07 22:52:58 +00:00
Adam Nemet
9c5e483a57 [Test] Remove c-index-test from the list of substitutions
All the tests are under the clang tests and none should be under llvm moving
forward.

The topic was discussed in this thread:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html

llvm-svn: 208234
2014-05-07 18:16:02 +00:00
Sebastian Pop
d5cb815565 split delinearization pass into 3 steps
To compute the dimensions of the array in a unique way, we split the
delinearization analysis in three steps:

- find parametric terms in all memory access functions
- compute the array dimensions from the set of terms
- compute the delinearized access functions for each dimension

The first step is executed on all the memory access functions such that we
gather all the patterns in which an array is accessed. The second step reduces
all this information in a unique description of the sizes of the array. The
third step is delinearizing each memory access function following the common
description of the shape of the array computed in step 2.

This rewrite of the delinearization pass also solves a problem we had with the
previous implementation: because the previous algorithm was by induction on the
structure of the SCEV, it would not correctly recognize the shape of the array
when the memory access was not following the nesting of the loops: for example,
see polly/test/ScopInfo/multidim_only_ivs_3d_reverse.ll

; void foo(long n, long m, long o, double A[n][m][o]) {
;
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       for (long k = 0; k < o; k++)
;         A[i][k][j] = 1.0;

Starting with this patch we no longer delinearize access functions that do not
contain parameters, for example in test/Analysis/DependenceAnalysis/GCD.ll

;;  for (long int i = 0; i < 100; i++)
;;    for (long int j = 0; j < 100; j++) {
;;      A[2*i - 4*j] = i;
;;      *B++ = A[6*i + 8*j];

These accesses will not be delinearized, as the upper bounds of the loops
are constants and their access functions do not contain SCEVUnknown
parameters.
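
As a concrete illustration of step 3 (plain C++ arithmetic, not the SCEV
machinery): once the shape double A[n][m][o] is known, subscripts are
peeled off a linear offset innermost-first.

  #include <cassert>
  #include <cstddef>

  // Recover (i, j, k) from the linear offset of A[i][j][k] in
  // double A[n][m][o], where offset = (i*m + j)*o + k.
  void delinearize(size_t Offset, size_t m, size_t o,
                   size_t &i, size_t &j, size_t &k) {
    k = Offset % o; Offset /= o; // innermost dimension first
    j = Offset % m; Offset /= m;
    i = Offset;                  // remainder is the outermost subscript
  }

  int main() {
    const size_t m = 5, o = 6;
    size_t i, j, k;
    delinearize((2 * m + 3) * o + 4, m, o, i, j, k); // offset of A[2][3][4]
    assert(i == 2 && j == 3 && k == 4);
    return 0;
  }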

llvm-svn: 208232
2014-05-07 18:01:20 +00:00
Simon Atanasyan
a8319970e6 [yaml2obj] Support ELF x86 relocations.
llvm-svn: 208228
2014-05-07 17:06:38 +00:00
Chad Rosier
da933c5aba [ARM64][fast-isel] Disable target-specific optimizations at -O0. Functionally,
this patch disables the dead register elimination pass and the load/store pair
optimization pass at -O0.  The ILP optimizations don't require the optimization
level to be checked because the call to addILPOpts is predicated with the
necessary check.  The AdvSIMDScalar pass is disabled by default at all
optimization levels.  This patch leaves that pass disabled by default.

Also, move command-line options into ARM64TargetMachine.cpp and add a few
additional flags to aid in debugging.  This fixes an issue with the
-debug-pass=Structure flag where passes were printed, but not actually run
(i.e., AdvSIMDScalar pass).

llvm-svn: 208223
2014-05-07 16:41:55 +00:00
Daniel Sanders
44cc643eee [mips] Add highly experimental support for MIPS-I, MIPS-II, MIPS-III, and MIPS-V
Summary:
These processors will only be available for the integrated assembler at
first (CodeGen will emit a fatal error saying they are not implemented).

The intention is to work through the existing instructions and correctly
annotate the ISA they were added in so that we have a sufficiently good
base to start MIPS64r6 development. MIPS64r6 removes/re-encodes certain
instructions and I believe it is best to define ISAs using set-unions
as far as possible rather than using set-subtraction.
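
A sketch of the set-union idea in plain C++ (the real definitions are
TableGen predicates; the names and encoding here are assumptions):

  #include <cassert>

  // Each instruction records the ISA it was *added in*; each ISA level
  // is the union of the previous level plus its own additions.
  enum Feature : unsigned {
    AddedInMipsI   = 1u << 0,
    AddedInMipsII  = 1u << 1,
    AddedInMipsIII = 1u << 2,
  };
  constexpr unsigned MipsI   = AddedInMipsI;
  constexpr unsigned MipsII  = MipsI | AddedInMipsII;  // union, not subtraction
  constexpr unsigned MipsIII = MipsII | AddedInMipsIII;

  bool isLegal(unsigned CpuISA, Feature AddedIn) {
    return (CpuISA & AddedIn) != 0;
  }

  int main() {
    assert(!isLegal(MipsI, AddedInMipsII));  // mips1 rejects a MIPS-II insn
    assert(isLegal(MipsIII, AddedInMipsII)); // mips3 keeps it
    return 0;
  }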

Reviewers: vmedic

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D3569

llvm-svn: 208221
2014-05-07 16:25:22 +00:00
Michael Zolotukhin
5fe056ff21 [InstCombine] Add optimization of redundant insertvalue instructions.
rdar://problem/11861387

llvm-svn: 208214
2014-05-07 14:30:18 +00:00
Evgeniy Stepanov
c02f1a9f96 [msan] Fix -fsanitize=memory -fno-integrated-as.
llvm-svn: 208211
2014-05-07 14:10:51 +00:00
Tim Northover
aed9e378c3 AArch64/ARM64: optimise vector selects & enable test
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", while the vector route is
"CM -> DUP", since the vector comparisons are all mask based.

llvm-svn: 208210
2014-05-07 14:10:27 +00:00
James Molloy
9af6fee0f6 [ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests.
llvm-svn: 208200
2014-05-07 12:33:55 +00:00
James Molloy
7951361ffc [ARM64-BE] Fix variable-argument saving.
llvm-svn: 208199
2014-05-07 12:33:48 +00:00
James Molloy
f9bd42cfb5 [ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big endian.
The AAPCS states that values passed in registers must have a value as though
they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D" - that is,
loading scalars to vector registers and loading 1-element vectors are equivalent.

The logic implemented here is to ensure that at all call boundaries and during
formal argument lowering all vectors are treated as their bitwidth-based floating
point scalar counterpart, which is always one of f64 or f128 (v2i32 -> f64,
v4i32 -> f128 etc). A BITCAST is inserted so that the appropriate REV will be
generated during code generation.

llvm-svn: 208198
2014-05-07 12:33:41 +00:00
James Molloy
c74863e0d9 [ARM64-BE] Implement the crazy bitcast handling for big endian vectors.
Because we've canonicalised on using LD1/ST1, every time we do a bitcast
between vector types we must do an equivalent lane reversal.

Consider a simple memory load followed by a bitconvert then a store.
  v0 = load v2i32
  v1 = BITCAST v2i32 v0 to v4i16
       store v4i16 v1

In big endian mode every memory access has an implicit byte swap. LDR and
STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
is, they treat the vector as a sequence of elements to be byte-swapped.
The two pairs of instructions are fundamentally incompatible. We've decided
to use LD1/ST1 only to simplify compiler implementation.

LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
the original code sequence:

  v0 = load v2i32
  v1 = REV v2i32 v0               (implicit)
  v2 = BITCAST v2i32 v1 to v4i16
  v3 = REV v4i16 v2               (implicit)
       store v4i16 v3

But this is now broken - the value stored is different to the value loaded
due to lane reordering. To fix this, on every BITCAST we must perform two
other REVs:

  v0 = load v2i32
  v1 = REV v2i32 v0               (implicit)
  v2 = REV v2i32 v1
  v3 = BITCAST v2i32 v2 to v4i16
  v4 = REV v4i16 v3
  v5 = REV v4i16 v4               (implicit)
       store v4i16 v5

This means an extra two instructions, but actually in most cases the two REV
instructions can be combined into one. For example:
  (REV64_2s (REV64_4h X)) === (REV32_4h X)
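
That identity can be sanity-checked by composing the lane permutations (a
standalone C++ model, not compiler code):

  #include <array>
  #include <cassert>

  using Perm = std::array<int, 4>; // h-lane permutation of a 64-bit vector

  // Result lane i takes input lane P[i].
  Perm apply(const Perm &P, const Perm &V) {
    Perm R{};
    for (int i = 0; i < 4; ++i)
      R[i] = V[P[i]];
    return R;
  }

  int main() {
    const Perm Id       = {0, 1, 2, 3};
    const Perm REV64_4h = {3, 2, 1, 0}; // reverse h lanes within 64 bits
    const Perm REV64_2s = {2, 3, 0, 1}; // swap the two s lanes (h-lane pairs)
    const Perm REV32_4h = {1, 0, 3, 2}; // swap h lanes within each 32 bits
    assert(apply(REV64_2s, apply(REV64_4h, Id)) == apply(REV32_4h, Id));
    return 0;
  }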

There is also no 128-bit REV instruction. This must be synthesized with an
EXT instruction.

Most bitconverts require some sort of conversion. The only exceptions are:
  a) Identity conversions -  vNfX <-> vNiX
  b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX

Even though there are hundreds of changed lines, I have a fairly high confidence
that they are somewhat correct. The changes to add two REV instructions per
bitcast were pretty mechanical, and once I'd done that I threw the resulting
.td at a script I wrote which combined the two REVs together (and added
an EXT instruction, for f128) based on an instruction description I gave it.

This was much less prone to error than doing it all manually, plus my brain
would not just have melted but would have vapourised.

llvm-svn: 208194
2014-05-07 11:28:53 +00:00
James Molloy
c6eeb59eb7 [ARM64-BE] Make big endian (scalar) argument passing work correctly.
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.

llvm-svn: 208192
2014-05-07 11:28:36 +00:00
Tim Northover
158a8f8793 AArch64/ARM64: run test on ARM64 too.
llvm-svn: 208188
2014-05-07 10:47:04 +00:00
Tim Northover
2d1729cbfc AArch64/ARM64: put annotation in test
It makes finding already covered tests much easier with "grep -L
arm64".

llvm-svn: 208187
2014-05-07 10:47:00 +00:00
Tim Northover
3f03338f64 AArch64/ARM64: disable test directory if ARM64 not present
llvm-svn: 208186
2014-05-07 10:42:06 +00:00
Daniel Sanders
ed9c2dd966 [tablegen] Add !listconcat operator with similar semantics to !strconcat
Summary:
It concatenates two or more lists. In addition to the !strconcat
semantics, the lists must have the same element type.

My overall aim is to make it easy to append to Instruction.Predicates
rather than override it. This can be done by concatenating lists passed as
arguments, or by concatenating lists passed in additional fields.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D3506

llvm-svn: 208183
2014-05-07 10:13:19 +00:00
Evgeniy Stepanov
4a872de9c5 [asan] Add a flag to control asm instrumentation.
With this change, asm instrumentation is disabled by default.

llvm-svn: 208167
2014-05-07 07:54:11 +00:00
Joerg Sonnenberger
541f85d280 Allow using normal .eh_frame based unwinding on ARM. Use the same
encodings as x86. Use this exception model for NetBSD.

llvm-svn: 208166
2014-05-07 07:49:34 +00:00
Saleem Abdulrasool
fe0d3eaa0d ARM: fix WoA PEI instruction selection
The ARM::BLX instruction is an ARM mode instruction.  The Windows on ARM target
is limited to Thumb instructions.  Correctly use the thumb mode tBLXr
instruction.  This would manifest as an errant write into the object file as the
instruction is 4 bytes in length rather than 2.  The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.

llvm-svn: 208152
2014-05-07 03:03:27 +00:00
Justin Bogner
d56b21803f llvm-cov: Handle missing source files as GCOV does
If the source files referenced by a gcno file are missing, gcov
outputs a coverage file where every line is simply /*EOF*/.  This also
happens for coverage lines that lie past the end of a file that was
found.

This change mimics gcov.

llvm-svn: 208149
2014-05-07 02:11:23 +00:00
Justin Bogner
b8e01630df llvm-cov: Implement --no-output
In gcov, there's a -n/--no-output option, which disables the writing
of any .gcov files, so that it emits only the summary info on stdout.
This implements the same behaviour in llvm-cov.

llvm-svn: 208148
2014-05-07 02:11:18 +00:00
Joerg Sonnenberger
347cf3253f If a function needs a frame pointer, but r11 (aka fp) has not been used,
remove it from the list of unspilled registers. Otherwise, the subsequent
attempt to keep the stack aligned by picking an extra GPR to spill will
not work, as it picks up r11.

llvm-svn: 208129
2014-05-06 20:43:01 +00:00
Diego Novillo
68bb60ec91 Do not make -pass-remarks additive.
Summary:
When I initially introduced -pass-remarks, I thought it would be a
neat idea to make it additive. So, if one used it as:

$ llc -pass-remarks=inliner --pass-remarks=loop.*

the compiler would build the regular expression '(inliner)|(loop.*)'.

The more I think about it, the more I regret it. This is not how
other flags work. The standard semantics are right-to-left overrides.

This is how clang interprets -Rpass. And I think the two should be
compatible in this respect.
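
The difference in flag semantics, as a toy parser (illustrative names;
not LLVM's cl::opt machinery):

  #include <cstdio>
  #include <string>
  #include <vector>

  // Additive: every occurrence is OR'd into one regular expression.
  std::string additive(const std::vector<std::string> &Vals) {
    std::string R;
    for (const std::string &V : Vals)
      R += (R.empty() ? "(" : "|(") + V + ")";
    return R;
  }

  // Right-to-left override: the last occurrence simply wins.
  std::string lastWins(const std::vector<std::string> &Vals) {
    return Vals.empty() ? "" : Vals.back();
  }

  int main() {
    std::vector<std::string> Vals = {"inliner", "loop.*"};
    printf("additive: %s\n", additive(Vals).c_str()); // (inliner)|(loop.*)
    printf("override: %s\n", lastWins(Vals).c_str()); // loop.*
    return 0;
  }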

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3614

llvm-svn: 208122
2014-05-06 19:14:00 +00:00
Benjamin Kramer
593859517f TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMAs aren't legal on the target.
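
In effect (a sketch of the rule, not the BasicTTI code):

  #include <cassert>

  // If the target has no legal FMA, @llvm.fmuladd is priced as the sum
  // of a separate multiply and add; otherwise it costs one FMA.
  unsigned fmuladdCost(bool FMALegal, unsigned FMACost,
                       unsigned FMulCost, unsigned FAddCost) {
    return FMALegal ? FMACost : FMulCost + FAddCost;
  }

  int main() {
    assert(fmuladdCost(false, 1, 1, 1) == 2); // fmul + fadd
    assert(fmuladdCost(true, 1, 1, 1) == 1);  // single FMA
    return 0;
  }
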
llvm-svn: 208115
2014-05-06 18:36:23 +00:00
Andrea Di Biagio
540f8696d1 [X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa).
Before this patch, the backend always emitted a store+load sequence to
bitconvert the f64 input operand of an ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.

With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).

This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.

llvm-svn: 208107
2014-05-06 17:09:03 +00:00
Renato Golin
8a9a382ab2 Implementing named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
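
At the source level this is expected to surface through GNU global
register variables (an assumption about the front end; the register name
is target-specific, e.g. "sp" on AArch64, and the extension requires a
GNU-compatible compiler):

  #include <cstdio>

  // Clang lowers reads of a named global register variable to the
  // @llvm.read_register intrinsic. "sp" assumed here; use the target's
  // stack-pointer name elsewhere.
  register unsigned long StackPtr asm("sp");

  int main() {
    printf("sp = %#lx\n", StackPtr);
    return 0;
  }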

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
Rafael Espindola
5988eacc23 Special case aliases in GlobalValue::getAlignment.
An alias has the address of what it points to, so it also has the same
alignment.

This allows a few optimizations to see past aliases for free.

llvm-svn: 208103
2014-05-06 16:48:58 +00:00
Tim Northover
c3dfe08427 AArch64/ARM64: implement diagnosis of unpredictable loads & stores
llvm-svn: 208091
2014-05-06 14:15:14 +00:00
Tim Northover
78243bb739 AArch64/ARM64: add two more MC tests to ARM64 set.
llvm-svn: 208085
2014-05-06 12:50:58 +00:00
Tim Northover
317652e9e8 AArch64/ARM64: enable MC-level diagnostic tests for NEON insts.
Obviously we can't expect the two backends to produce identical diagnostics,
since what's possible depends quite a bit on how the .td files are structured.
I think the ARM64 diagnostics are basically of the same quality in all the
changed cases, so I've split the CHECK lines.

llvm-svn: 208084
2014-05-06 12:50:55 +00:00
Tim Northover
46970c8884 AArch64/ARM64: make NEON vector list parsing a bit more robust
It doesn't change the results, but it seems silly not to diagnose obvious
problems early on.

llvm-svn: 208083
2014-05-06 12:50:51 +00:00
Tim Northover
fab515c3bb AArch64/ARM64: produce more informative diagnostics when assembling some immediates
No tests here; they'll be added when the entire neon-diagnostics.s test from
AArch64 is enabled.

llvm-svn: 208079
2014-05-06 11:18:53 +00:00
Christian Pirker
20a4e2bc33 ARM: For Thumb fixups, store halfwords high first and low second
llvm-svn: 208076
2014-05-06 10:05:11 +00:00