1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

68854 Commits

Author SHA1 Message Date
Tim Northover
f3e0ceb127 ARM64: add extra NEG pattern.
llvm-svn: 206609
2014-04-18 14:54:35 +00:00
Tim Northover
56351e91d9 AArch64/ARM64: add non-scalar lowering for more FCVT operations.
llvm-svn: 206591
2014-04-18 13:16:42 +00:00
Tim Northover
de9624364e AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE.
We couldn't cope if the first mask element was UNDEF before, which
isn't ideal.

llvm-svn: 206588
2014-04-18 12:50:58 +00:00
Evgeniy Stepanov
de38078fd6 [msan] Add -msan-instrumentation-with-call-threshold.
This flag replaces inline instrumentation for checks and origin stores with
calls into MSan runtime library. This is a workaround for PR17409.

Disabled by default.

llvm-svn: 206585
2014-04-18 12:17:20 +00:00
Chandler Carruth
3104562756 [LCG] Remove all of the complexity stemming from supporting copying.
Reality is that we're never going to copy one of these. Supporting this
was becoming a nightmare because nothing even causes it to compile most
of the time. Lots of subtle errors built up that wouldn't have been
caught by any "normal" testing.

Also, make the move assignment actually work rather than the bogus swap
implementation that would just infloop if used. As part of that, factor
out the graph pointer updates into a helper to share between move
construction and move assignment.

llvm-svn: 206583
2014-04-18 11:02:33 +00:00
Chandler Carruth
d0707ed2ba [LCG] Add support for building persistent and connected SCCs to the
LazyCallGraph. This is the start of the whole point of this different
abstraction, but it is just the initial bits. Here is a run-down of
what's going on here. I'm planning to incorporate some (or all) of this
into comments going forward, hopefully with better editing and wording.
=]

The crux of the problem with the traditional way of building SCCs is
that they are ephemeral. The new pass manager however really needs the
ability to associate analysis passes and results of analysis passes with
SCCs in order to expose these analysis passes to the SCC passes. Making
this work is kind-of the whole point of the new pass manager. =]

So, when we're building SCCs for the call graph, we actually want to
build persistent nodes that stick around and can be reasoned about
later. We'd also like the ability to walk the SCC graph in more complex
ways than just the traditional postorder traversal of the current CGSCC
walk. That means that in addition to being persistent, the SCCs need to
be connected into a useful graph structure.

However, we still want the SCCs to be formed lazily where possible.

These constraints are quite hard to satisfy with the SCC iterator. Also,
using that would bypass our ability to actually add data to the nodes of
the call graph to facilite implementing the Tarjan walk. So I've
re-implemented things in a more direct and embedded way. This
immediately makes it easy to get the persistence and connectivity
correct, and it also allows leveraging the existing nodes to simplify
the algorithm. I've worked somewhat to make this implementation more
closely follow the traditional paper's nomenclature and strategy,
although it is still a bit obtuse because it isn't recursive, using
an explicit stack and a tail call instead, and it is interruptable,
resuming each time we need another SCC.

The other tricky bit here, and what actually took almost all the time
and trials and errors I spent building this, is exactly *what* graph
structure to build for the SCCs. The naive thing to build is the call
graph in its newly acyclic form. I wrote about 4 versions of this which
did precisely this. Inevitably, when I experimented with them across
various use cases, they became incredibly awkward. It was all
implementable, but it felt like a complete wrong fit. Square peg, round
hole. There were two overriding aspects that pushed me in a different
direction:

1) We want to discover the SCC graph in a postorder fashion. That means
   the root node will be the *last* node we find. Using the call-SCC DAG
   as the graph structure of the SCCs results in an orphaned graph until
   we discover a root.

2) We will eventually want to walk the SCC graph in parallel, exploring
   distinct sub-graphs independently, and synchronizing at merge points.
   This again is not helped by the call-SCC DAG structure.

The structure which, quite surprisingly, ended up being completely
natural to use is the *inverse* of the call-SCC DAG. We add the leaf
SCCs to the graph as "roots", and have edges to the caller SCCs. Once
I switched to building this structure, everything just fell into place
elegantly.

Aside from general cleanups (there are FIXMEs and too few comments
overall) that are still needed, the other missing piece of this is
support for iterating across levels of the SCC graph. These will become
useful for implementing #2, but they aren't an immediate priority.

Once SCCs are in good shape, I'll be working on adding mutation support
for incremental updates and adding the pass manager that this analysis
enables.

llvm-svn: 206581
2014-04-18 10:50:32 +00:00
Benjamin Kramer
0b863e9a98 X86: Pattern match scalar loads + vcvtph2ps into just vcvtph2ps.
vcvtph2ps only reads the lower 64 bits of the address passed to the
intrinsic.

llvm-svn: 206579
2014-04-18 10:45:33 +00:00
Chandler Carruth
583964878f Revert r206565 (and r206566 which updated tests).
This commit was attributed to a different person from the person who
posted the patch to the list, and the person who posted it the list
claimed when they did that they were not the author, but that the author
was yet a third person. I don't know what is going on here, but
reverting until the attribution is clear and the author has explicitly
contributed the patch.

Also, the review hasn't really involved any of the MC maintainers and
that seems questionable too.

llvm-svn: 206576
2014-04-18 09:35:51 +00:00
Tim Northover
23ca911ed1 AArch64/ARM64: spot a greater variety of concat_vector operations.
Code mostly copied from AArch64, just tidied up a trifle and plumbed
into the ARM64 way of doing things.

This also enables the AArch64 tests which inspired the previous
untested commits.

llvm-svn: 206574
2014-04-18 09:31:27 +00:00
Tim Northover
1584cfd1c5 ARM64: implement cunning optimisation from AArch64
A vector extract followed by a dup can become a single instruction even if the
types don't match. AArch64 handled this in ISelLowering, but a few reasonably
simple patterns can take care of it in TableGen, so that's where I've put it.

llvm-svn: 206573
2014-04-18 09:31:20 +00:00
Tim Northover
bb94a88804 ARM64: spot a vector_shuffle that maps to INS and expand.
Tests will be coming very shortly when all the optimisations needed to
support AArch64's neon-copy.ll file are committed.

llvm-svn: 206572
2014-04-18 09:31:15 +00:00
Tim Northover
e3c3a026a1 ARM64: nick some AArch64 patterns for extract/insert -> INS.
Tests will be committed shortly when all optimisations needed to
support AArch64's neon-copy.ll file are supported.

llvm-svn: 206571
2014-04-18 09:31:11 +00:00
Tim Northover
21403d6f09 AArch64/ARM64: emit all vector FP comparisons as such.
ARM64 was scalarizing some vector comparisons which don't quite map to
AArch64's compare and mask instructions. AArch64's approach of sacrificing a
little efficiency to emulate them with the limited set available was better, so
I ported it across.

More "inspired by" than copy/paste since the backend's internal expectations
were a bit different, but the tests were invaluable.

llvm-svn: 206570
2014-04-18 09:31:07 +00:00
Tim Northover
1828862541 AArch64/ARM64: port BSL logic from AArch64 & enable test.
I enhanced it a little in the process. The decision shouldn't really be beased
on whether a BUILD_VECTOR is a splat: any set of constants will do the job
provided they're related in the correct way.

Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's
best to check for all 4 possibilities rather than assuming it'll be the RHS.

llvm-svn: 206569
2014-04-18 09:31:01 +00:00
Tim Northover
2b48b866aa AArch64/ARM64: copy byval implementation from AArch64.
It's not actually used to handle C or C++ ABI rules on ARM64, but could well be
emitted by other language front-ends, so it's as well to have a sensible
implementation.

llvm-svn: 206568
2014-04-18 09:30:52 +00:00
Yaron Keren
e48a1d6c0b Patch by Ray Donnelly.
Emit WIN64 SEH registers by name instead of just number.

llvm-svn: 206565
2014-04-18 08:03:38 +00:00
Kostya Serebryany
2b02920109 [asan] one more workaround for PR17409: don't do BB-level coverage instrumentation if there are more than N (=1500) basic blocks. This makes ASanCoverage work on libjpeg_turbo/jchuff.c used by Chrome, which has 1824 BBs
llvm-svn: 206564
2014-04-18 08:02:42 +00:00
Jiangning Liu
57e94eee58 This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64.
A new test case is also added for ARM64.

Patched by Z.Zheng

llvm-svn: 206563
2014-04-18 07:57:54 +00:00
Matt Arsenault
de91105f57 R600: Minor cleanups.
Fix indentation, better line wrapping, unused includes.

llvm-svn: 206562
2014-04-18 07:40:20 +00:00
Lang Hames
b12d0547e6 [ExecutionEngine] Allow JIT clients to enable/disable module verification.
Previously module verification was always enabled, with no way to turn it off.
As of this commit, module verification is on by default in Debug builds, and off
by default in release builds. The default behaviour can be overridden by calling
setVerifyModules(bool) on the JIT instance (this works for both the old JIT, and
MCJIT).

<rdar://problem/16150008>

llvm-svn: 206561
2014-04-18 06:48:23 +00:00
Jiangning Liu
6aa9a901c7 This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64.
Patched by Z.Zheng

llvm-svn: 206559
2014-04-18 05:58:09 +00:00
Matt Arsenault
6b6f53eaec R600/SI: Try to use scalar BFE.
Use scalar BFE with constant shift and offset when possible.
This is complicated by the fact that the scalar version packs
the two operands of the vector version into one.

llvm-svn: 206558
2014-04-18 05:19:26 +00:00
Jiangning Liu
fcc0f2379a This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance.
Patched by Z. Zheng

llvm-svn: 206557
2014-04-18 03:58:38 +00:00
Duncan P. N. Exon Smith
79011f6e40 Revert "blockfreq: Rewrite BlockFrequencyInfoImpl"
This reverts commits r206548, r206549 and r206549.

There are some unit tests failing that aren't failing locally [1], so
reverting until I have time to investigate.

[1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816

llvm-svn: 206556
2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith
6179ba903c blockfreq: Really fix r206548 (and r206549)
Turns out this code is dead.

llvm-svn: 206554
2014-04-18 02:10:09 +00:00
Duncan P. N. Exon Smith
efced391b3 blockfreq: Fixing MSVC after r206548?
llvm-svn: 206549
2014-04-18 02:06:24 +00:00
Duncan P. N. Exon Smith
78f8766db3 blockfreq: Rewrite BlockFrequencyInfoImpl
Rewrite the shared implementation of BlockFrequencyInfo and
MachineBlockFrequencyInfo entirely.

The old implementation had a fundamental flaw:  precision losses from
nested loops (or very wide branches) compounded past loop exits (and
convergence points).

The @nested_loops testcase at the end of
test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating.  This
function has three nested loops, with branch weights in the loop headers
of 1:4000 (exit:continue).  The old analysis gives non-sensical results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    ---- Block Freqs ----
     entry = 1.0
     for.cond1.preheader = 1.00103
     for.cond4.preheader = 5.5222
     for.body6 = 18095.19995
     for.inc8 = 4.52264
     for.inc11 = 0.00109
     for.end13 = 0.0

The new analysis gives correct results:

    Printing analysis 'Block Frequency Analysis' for function 'nested_loops':
    block-frequency-info: nested_loops
     - entry: float = 1.0, int = 8
     - for.cond1.preheader: float = 4001.0, int = 32007
     - for.cond4.preheader: float = 16008001.0, int = 128064007
     - for.body6: float = 64048012001.0, int = 512384096007
     - for.inc8: float = 16008001.0, int = 128064007
     - for.inc11: float = 4001.0, int = 32007
     - for.end13: float = 1.0, int = 8

Most importantly, the frequency leaving each loop matches the frequency
entering it.

The new algorithm leverages BlockMass and PositiveFloat to maintain
precision, separates "probability mass distribution" from "loop
scaling", and uses dithering to eliminate probability mass loss.  I have
unit tests for these types out of tree, but it was decided in the review
to make the classes private to BlockFrequencyInfoImpl, and try to shrink
them (or remove them entirely) in follow-up commits.

The new algorithm should generally have a complexity advantage over the
old.  The previous algorithm was quadratic in the worst case.  The new
algorithm is still worst-case quadratic in the presence of irreducible
control flow, but it's linear without it.

The key difference between the old algorithm and the new is that control
flow within a loop is evaluated separately from control flow outside,
limiting propagation of precision problems and allowing loop scale to be
calculated independently of mass distribution.  Loops are visited
bottom-up, their loop scales are calculated, and they are replaced by
pseudo-nodes.  Mass is then distributed through the function, which is
now a DAG.  Finally, loops are revisited top-down to multiply through
the loop scales and the masses distributed to pseudo nodes.

There are some remaining flaws.

  - Irreducible control flow isn't modelled correctly.  LoopInfo and
    MachineLoopInfo ignore irreducible edges, so this algorithm will
    fail to scale accordingly.  There's a note in the class
    documentation about how to get closer.  See also the comments in
    test/Analysis/BlockFrequencyInfo/irreducible.ll.

  - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting
    the 64-bit integer precision used downstream.

  - The "bias" calculation proposed on llvmdev is *not* incorporated
    here.  This will be added in a follow-up commit, once comments from
    this review have been handled.

llvm-svn: 206548
2014-04-18 01:57:45 +00:00
Matt Arsenault
42cf57d738 R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16
llvm-svn: 206547
2014-04-18 01:53:18 +00:00
Duncan P. N. Exon Smith
7063f2846a PMBuilder: Expose an option to disable tail calls
Adds API to allow frontends to disable tail calls in PassManagerBuilder.

<rdar://problem/16050591>

llvm-svn: 206542
2014-04-18 01:05:15 +00:00
Tom Stellard
59f91bb185 R600/SI: Use SReg_64 instead of VSrc_64 when selecting BUILD_PAIR
llvm-svn: 206541
2014-04-18 00:36:21 +00:00
Jim Grosbach
68de04b1d5 [ARM64,C++11] Range'ify another loop.
llvm-svn: 206539
2014-04-17 23:41:57 +00:00
Diego Novillo
45811c5ea3 Fix bug 19437 - Only add discriminators for DWARF 4 and above.
Summary:
This prevents the discriminator generation pass from triggering if
the DWARF version being used in the module is prior to 4.

Reviewers: echristo, dblaikie

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3413

llvm-svn: 206507
2014-04-17 22:33:50 +00:00
Nuno Lopes
4a36b584a3 remove some dead code
lib/Analysis/IPA/InlineCost.cpp         |   18 ------------------
 lib/Analysis/RegionPass.cpp             |    1 -
 lib/Analysis/TypeBasedAliasAnalysis.cpp |    1 -
 lib/Transforms/Scalar/LoopUnswitch.cpp  |   21 ---------------------
 lib/Transforms/Utils/LCSSA.cpp          |    2 --
 lib/Transforms/Utils/LoopSimplify.cpp   |    6 ------
 utils/TableGen/AsmWriterEmitter.cpp     |   13 -------------
 utils/TableGen/DFAPacketizerEmitter.cpp |    7 -------
 utils/TableGen/IntrinsicEmitter.cpp     |    2 --
 9 files changed, 71 deletions(-)

llvm-svn: 206506
2014-04-17 22:26:44 +00:00
Reed Kotler
7b6663aed6 Start pushing changes for Mips Fast-Isel
llvm-svn: 206505
2014-04-17 22:15:34 +00:00
Tom Stellard
a405a50d5f R600: Add comment clariying use of sext for result of MUL_U24
llvm-svn: 206501
2014-04-17 21:00:13 +00:00
Tom Stellard
50135a875d R600/SI: Stop using i128 as the resource descriptor type
Having i128 as a legal type complicates the legalization phase.  v4i32
is already a legal type, so we will use that instead.

This fixes several piglit tests.

llvm-svn: 206500
2014-04-17 21:00:11 +00:00
Tom Stellard
095d18364b R600/SI: Change default register class for i32 to SReg_32
SIFixSGPRCopies is smart enough to handle this now.

llvm-svn: 206499
2014-04-17 21:00:09 +00:00
Tom Stellard
ca9afaf1ed R600/SI: Teach SIInstrInfo::moveToVALU() how to handle PHI instructions
llvm-svn: 206498
2014-04-17 21:00:07 +00:00
Tom Stellard
1e0e6b9839 R600/SI: Legalize operands after changing dst reg in FixSGPRCopies
Otherwise we may not legalize some illegal REG_SEQUENCE instructions.

llvm-svn: 206497
2014-04-17 21:00:01 +00:00
Louis Gerbarg
b9b4d34ddb Improve ARM64 vector creation
This patch improves the performance of vector creation in caseiswhere where
several of the lanes in the vector are a constant floating point value. It
also includes new patterns to fold together some of the instructions when the
value is 0.0f. Test cases included.

rdar://16349427

llvm-svn: 206496
2014-04-17 20:51:50 +00:00
Jim Grosbach
a428f4ce70 ARM64: [su]xtw use W regs as inputs, not X regs.
Update the SXT[BHW]/UXTW instruction aliases and the shifted reg addressing
mode handling.

PR19455 and rdar://16650642

llvm-svn: 206495
2014-04-17 20:47:31 +00:00
David Blaikie
d2e576d088 ManagedStatic is never built with a null constructor, remove support for it.
llvm-svn: 206492
2014-04-17 20:30:35 +00:00
Tim Northover
77edcc9a3a ARM64: switch to IR-based atomic operations.
Goodbye code!

(Game: spot the bug fixed by the change).

llvm-svn: 206490
2014-04-17 20:00:33 +00:00
Tim Northover
d47e9a6e0d ARM64: add acquire/release versions of the existing atomic intrinsics.
These will be needed to support IR-level lowering of atomic
operations.

llvm-svn: 206489
2014-04-17 20:00:24 +00:00
Gerolf Hoflehner
897ba868ad Reverse 206485.
After some discussions the preferred semantics of
the always_inline attribute is
inline always when the compiler can determine
that it it safe to do so.

llvm-svn: 206487
2014-04-17 19:14:06 +00:00
Josh Magee
5d6a41432d [stack protector] Make the StackProtector pass respect ssp-buffer-size.
Previously, SSPBufferSize was assigned the value of the "stack-protector-buffer-size"
attribute after all uses of SSPBufferSize.  The effect was that the default
SSPBufferSize was always used during analysis.  I moved the check for the
attribute before the analysis; now --param ssp-buffer-size= works correctly again.

Differential Revision: http://reviews.llvm.org/D3349

llvm-svn: 206486
2014-04-17 19:08:36 +00:00
Tim Northover
fa11ed01b6 Atomics: promote ARM's IR-based atomics pass to CodeGen.
Still only 32-bit ARM using it at this stage, but the promotion allows
direct testing via opt and is a reasonably self-contained patch on the
way to switching ARM64.

At this point, other targets should be able to make use of it without
too much difficulty if they want. (See ARM64 commit coming soon for an
example).

llvm-svn: 206485
2014-04-17 18:22:47 +00:00
Matt Arsenault
628cc59d6b R600/SI: f64 frint is legal on CI
llvm-svn: 206475
2014-04-17 17:06:37 +00:00
Chad Rosier
f414b6adf9 [AArch64] Implement the getCSRFirstUseCost API, mirroring that in ARM64.
llvm-svn: 206473
2014-04-17 16:19:54 +00:00
NAKAMURA Takumi
51c35adf06 Inliner::OptimizationRemark: Fix crash in clang/test/Frontend/optimization-remark.c on some hosts, including --vg.
DebugLoc in Callsite would not live after Inliner. It should be copied before Inliner.

llvm-svn: 206459
2014-04-17 12:22:14 +00:00
Chandler Carruth
b1022ecf9e [LCG] Just move the allocator (now that we can) when moving a call
graph. This simplifies the custom move constructor operation to one of
walking the graph and updating the 'up' pointers to point to the new
location of the graph. Switch the nodes from a reference to a pointer
for the 'up' edge to facilitate this.

llvm-svn: 206450
2014-04-17 07:25:59 +00:00
Chandler Carruth
3ea4c82c5f [LCG] Remove the Module reference member which we weren't using for
anything and doesn't make sense if assigning.

llvm-svn: 206449
2014-04-17 07:22:19 +00:00
Craig Topper
72a95aa7ac [X86] Add disassembler support for the 0x0f 0x7f form of movq %mm, %mm.
llvm-svn: 206447
2014-04-17 06:33:45 +00:00
Saleem Abdulrasool
59c1f0c309 MC: rework static_assert to be MSVC compatible
Visual Studio does not permit referencing a structure member as a static field
for sizeof calculations.  Resort to a pointer cast which is compatible across
Visual Studio and other compilers.

llvm-svn: 206445
2014-04-17 06:17:20 +00:00
Matt Arsenault
adccea7f1a R600/SI: Fix zext from i1 to i64
llvm-svn: 206437
2014-04-17 02:03:08 +00:00
Adam Nemet
3430c6a131 [ARM64] Fix "Cannot select" for vector ctpop
The commit of r205855:

Author: Arnold Schwaighofer <aschwaighofer@apple.com>
Date:   Wed Apr 9 14:20:47 2014 +0000

    SLPVectorizer: Only vectorize intrinsics whose operands are widened equally

    The vectorizer only knows how to vectorize intrinics by widening all operands by
    the same factor.

    Patch by Tyler Nowicki!

exposed a backend bug causing a regression (Cannot select ctpop).

The commit msg is a bit confusing because the patch actually changes the
behavior for the loop-vectorizer as well.  As things got refactored into a
helper ctpop got snuck in to the trivially-vectorizable helper which is now
used by both vectorizers.  In other words, we started seeing vector-ctpops in
the backend.

This change makes ctpop LegalizeAction::Expand for the types not supported by
the byte-only CNT instruction.  We may be able to custom-lower these later to
a single CNT but this is to fix the compiler crash first.

Fixes <rdar://problem/16578951>

llvm-svn: 206433
2014-04-17 01:01:37 +00:00
Gerolf Hoflehner
b8702223fc Inline a function when the always_inline attribute
is set even when it contains a indirect branch.
The attribute overrules correctness concerns
like the escape of a local block address.

This is for rdar://16501761

llvm-svn: 206429
2014-04-17 00:21:52 +00:00
Jim Grosbach
63557754ee [c++11] Tidy up AsmPrinter.cpp.
Range'ify loops and tidy up some by-reference handling. No functional
change.

llvm-svn: 206422
2014-04-16 22:38:02 +00:00
Tom Stellard
eaf98dc55c Added new functionality to LLVM C API to use DiagnosticInfo to handle errors
Patch by: Darren Powell

llvm-svn: 206407
2014-04-16 17:45:04 +00:00
Aaron Ballman
897205f2ef Replacing a non-ASCII character in a comment with an ASCII character. Fixes a C4819 warning in MSVC.
llvm-svn: 206403
2014-04-16 17:09:20 +00:00
Diego Novillo
cb4502bb78 Allow diagnostic handlers to check for optimization remarks.
Summary:
When optimization remarks are enabled via the driver flag -Rpass, we
should allow the FE diagnostic handler to check if the given pass name
needs a diagnostic.

We were unconditionally checking the pattern defined in opt's
-pass-remarks flag. This was causing the FE to not emit any diagnostics.

Reviewers: qcolombet

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3362

llvm-svn: 206400
2014-04-16 16:53:41 +00:00
Matheus Almeida
69855a11a3 [mips] Use TwoOperandAliasConstraint for shift instructions.
This enables TableGen to generate an additional two operand
matcher for our shift_rotate_imm and shift_rotate_reg class of instructions.

The tests were also updated so that they include now encoding information
for all affected instructions.

llvm-svn: 206398
2014-04-16 16:28:59 +00:00
Matheus Almeida
5607900620 [mips] Add initial support for NaN2008 in the back-end.
This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008
NaN encoding (-mnan=2008). This patch also adds support for parsing
'.nan legacy' and '.nan 2008' assembly directives. The handling of
these directives should match GAS' behaviour i.e., the last directive
in use sets the ELF header bit (EF_MIPS_NAN2008).

Differential Revision: http://reviews.llvm.org/D3346

llvm-svn: 206396
2014-04-16 15:48:55 +00:00
Tim Northover
549caa9e40 ARM64: silence sign-comparison warning.
llvm-svn: 206393
2014-04-16 15:28:06 +00:00
Tim Northover
3eb27f781b AArch64/ARM64: produce correct relocation for conditional branches.
llvm-svn: 206391
2014-04-16 15:27:52 +00:00
Daniel Sanders
af1975547a [mips] Indentation
llvm-svn: 206389
2014-04-16 14:38:27 +00:00
Daniel Sanders
20113a414a [mips] Fix emission of '.option pic0' for MIPS-IV.
Summary: This was a case of incorrect usage of hasMips64() vs isABI_N64()

Reviewers: matheusalmeida, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3398

llvm-svn: 206388
2014-04-16 13:58:57 +00:00
Daniel Sanders
555545a9e8 [mips] Correct r206370 to account for non-Linux targets using the small data section.
This should fix the ninja-x64-msvc-RA-centos6 builder.

I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to
check for a bare-metal target rather and anything other than linux. I'll
investigate this.

llvm-svn: 206385
2014-04-16 12:29:08 +00:00
Kostya Serebryany
9bec638044 [asan] add two new hidden compile-time flags for asan: asan-instrumentation-with-call-threshold and asan-memory-access-callback-prefix. This is part of the workaround for PR17409 (instrument huge functions with callbacks instead of inlined code). These flags will also help us experiment with kasan (kernel-asan) and clang
llvm-svn: 206383
2014-04-16 12:12:19 +00:00
Tim Northover
dde1d393b4 AArch64/ARM64: port across stub handling for ELF C++ exceptions.
The most important part here is that we should actuall emit the stubs we refer
to in the exception table, but as a side issue this uses more sensible & GCC
compatible representations for some of the bits of information.

llvm-svn: 206380
2014-04-16 11:52:55 +00:00
Tim Northover
9be7d39fd7 ARM64: use 32-bit moves for constants where possible.
If we know that a particular 64-bit constant has all high bits zero, then we
can rely on the fact that 32-bit ARM64 instructions automatically zero out the
high bits of an x-register. This gives the expansion logic less constraints to
satisfy and so sometimes allows it to pick better sequences.

Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a
32-bit MOVN to be used in @test8 soon.

llvm-svn: 206379
2014-04-16 11:52:51 +00:00
Tim Northover
fb109731e4 ARM64: use the integrated assembler on ELF.
llvm-svn: 206378
2014-04-16 11:52:40 +00:00
Matheus Almeida
018ddbad9c [mips] Emit '.set nomicromips' before a function's entry label
if not in micromips mode.

The test (elf_st_other.ll) was renamed as the name and description didn't
make sense as the test wasn't checking any symbol table entry.

Differential Revision: http://reviews.llvm.org/D3346

llvm-svn: 206377
2014-04-16 11:46:59 +00:00
Aaron Ballman
938e477955 Fixing a compile error in debug versions of MSVC. It seems that the range-based for loop is confused by the DEBUG macro expansion unless a compound statement is used.
llvm-svn: 206376
2014-04-16 11:15:57 +00:00
Daniel Sanders
51d1cc3b18 [mips] Correct callee saved list for the N32 ABI and enable test
Summary: Depends on D3339

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3340

llvm-svn: 206371
2014-04-16 10:23:37 +00:00
Tim Northover
0610b92b97 ARM64: mark x7 as used when an i128 gets shunted onto the stack.
The second half of a split i128 was ending up in x7, which is not a good thing.

This is another part of PR19432.

llvm-svn: 206366
2014-04-16 09:03:25 +00:00
Tim Northover
dcc9d1cb89 DAGCombiner: don't optimise non-existant litpool load
This particular DAG combine is designed to kick in when both ConstantFPs will
end up being loaded via a litpool, however those nodes have a semi-legal
status, dictated by isFPImmLegal so in some cases there wouldn't have been a
litpool in the first place. Don't try to be clever in those circumstances.

Picked up while merging some AArch64 tests.

llvm-svn: 206365
2014-04-16 09:03:09 +00:00
Timur Iskhodzhanov
924d8860d8 Simplify a static_assert so VS2013 can build it
llvm-svn: 206363
2014-04-16 08:30:32 +00:00
Saleem Abdulrasool
737ba538cf COFF: fix an off by one error
Adjust the tests to validate the number of auxiliary entries used to store the
filename.

Thanks to majnemer's sharp eye for catching the missing - 1 in the round up
calculation.

llvm-svn: 206359
2014-04-16 06:22:53 +00:00
Craig Topper
69e0e91431 Convert SelectionDAG::getVTList to use ArrayRef
llvm-svn: 206357
2014-04-16 06:10:51 +00:00
Craig Topper
f803e4fd66 [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206356
2014-04-16 04:21:27 +00:00
Saleem Abdulrasool
f1d6af74d5 COFF: add support for .file symbols
Add support for emitting .file records.  This is mostly a quality of
implementation change (more complete support for COFF file emission) that was
noticed while working on COFF file emission for Windows on ARM.

A .file record is emitted as a symbol with storage class FILE (103) and the name
".file".  A series of auxiliary format 4 records follow which contain the file
name.  The filename is stored as an ANSI string and is padded with NULL if the
length is not a multiple of COFF::SymbolSize (18).

llvm-svn: 206355
2014-04-16 04:15:32 +00:00
Saleem Abdulrasool
68fc5dab97 Target: whitespace
llvm-svn: 206353
2014-04-16 04:15:25 +00:00
Matt Arsenault
15d9205991 R600: Expand sign extension of vectors.
Setting vector types to expand will result in scalarization on pre SI hw,
as those gpus don't have vector shifts either.
Expand also i32 vectors, this helps llvm make the correct decision
about scalarizing the vector ops.

v2: move setOperation() calls to R600ISelLowering.cpp.
    cleanup the SI code to make it obvious that this patch does is nop for SI

Patch by: Jan Vesely <jan.vesely@rutgers.edu>

llvm-svn: 206348
2014-04-16 01:41:30 +00:00
Jim Grosbach
0ff92ff2c3 [ARM64,C++11] Tidy up branch relaxation a bit w/ c++11.
No functional change.

llvm-svn: 206344
2014-04-16 00:42:46 +00:00
Jim Grosbach
cb81230c88 ARM64: Nuke some dead code.
Missed in previous commit.

llvm-svn: 206343
2014-04-16 00:42:43 +00:00
Jim Grosbach
6d3668120b [ARM64,C++11] Clean up the ARM64 LOH collection pass.
Range'ify a bunch of loops, mainly. As a result, we have a variety
of objects via reference rather than by pointer, so propogate that
through the various helper functions where it makes sense.

llvm-svn: 206337
2014-04-15 22:57:02 +00:00
Matt Arsenault
15a1a4469f R600/SI: Print code size along with used registers
llvm-svn: 206336
2014-04-15 22:40:47 +00:00
Matt Arsenault
f045301bf1 R600/SI: Print more immediates in hex format
Print in decimal for inline immediates, and hex otherwise. Use hex
always for offsets in addressing offsets.

This approximately matches what the shader compiler does.

llvm-svn: 206335
2014-04-15 22:32:49 +00:00
Matt Arsenault
2afac6d82b R600/SI: Cleanup parsing of register names.
Try to figure out the class and number of subregisters.

llvm-svn: 206334
2014-04-15 22:32:42 +00:00
Matt Arsenault
a43cbe5951 R600/SI: Fix loads of i1
llvm-svn: 206330
2014-04-15 22:28:39 +00:00
Tobias Grosser
fdada48f35 RegionInfo: Do not access a value that was just moved away
This fixes a regression introduced in r206310.

llvm-svn: 206328
2014-04-15 22:09:36 +00:00
Akira Hatanaka
d5cce12997 Make FastISel::SelectInstruction return before target specific fast-isel code
handles Intrinsic::trap if TargetOptions::TrapFuncName is set.

This fixes a bug in which the trap function was not taken into consideration
when a program was compiled without optimization (at -O0).

<rdar://problem/16291933>

llvm-svn: 206323
2014-04-15 21:30:06 +00:00
Andrea Di Biagio
216ae0fd5e [X86] Improve the lowering of packed shifts by constant build_vector.
This patch teaches the backend how to efficiently lower logical and
arithmetic packed shifts on both SSE and AVX/AVX2 machines.

When possible, instead of scalarizing a vector shift, the backend should try
to expand the shift into a sequence of two packed shifts by immedate count
followed by a MOVSS/MOVSD.

Example
  (v4i32 (srl A, (build_vector < X, Y, Y, Y>)))

Can be rewritten as:
  (v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>)))

[with X and Y ConstantInt]

The advantage is that the two new shifts from the example would be lowered into
X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into
four scalar shifts plus four pairs of vector insert/extract.

llvm-svn: 206316
2014-04-15 19:30:48 +00:00
Quentin Colombet
d23a7863fe [ARM64] Set default CPU to generic instead of cyclone.
llvm-svn: 206313
2014-04-15 19:08:46 +00:00
Robert Lougher
eaacc6b6ba Revert r191049/r191059 as it can produce wrong code (see PR17975).
It has already been reverted on the 3.4 branch in r196521.

llvm-svn: 206311
2014-04-15 18:34:24 +00:00
David Blaikie
f5cc2195dc Use unique_ptr to manage ownership of child Regions within llvm::Region
llvm-svn: 206310
2014-04-15 18:32:43 +00:00
Julien Lerouge
dd5842a2e5 Add lifetime markers for allocas created to hold byval arguments, make them
appear in the InlineFunctionInfo.

llvm-svn: 206308
2014-04-15 18:06:46 +00:00
Julien Lerouge
9ecdd0ce5b Split byval argument initialization so the memcpy(s) are injected at the
beginning of the first new block after inlining.

llvm-svn: 206307
2014-04-15 18:01:54 +00:00
Duncan P. N. Exon Smith
571c11d959 LTO: Add more loop simplification passes to LTO
Similar to r202051, add missing loop simplification passes to the LTO
optimization pipeline.

Patch by Rafael Espindola.

llvm-svn: 206306
2014-04-15 17:48:15 +00:00
Duncan P. N. Exon Smith
e5981e5c13 verify-di: Add back braces for MSVC compatability
Fixup after r206300.

<rdar://problem/15500563>

llvm-svn: 206305
2014-04-15 17:28:26 +00:00
Duncan P. N. Exon Smith
58154f2238 verify-di: Implement DebugInfoVerifier
Implement DebugInfoVerifier, which steals verification relying on
DebugInfoFinder from Verifier.

  - Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps
    DebugInfoVerifier.  Uses -verify-di command-line flag.

  - Change verifyModule() to invoke DebugInfoVerifier as well as
    Verifier.

  - Add a call to createDebugInfoVerifierPass() wherever there was a
    call to createVerifierPass().

This implementation as a module pass should sidestep efficiency issues,
allowing us to turn debug info verification back on.

<rdar://problem/15500563>

llvm-svn: 206300
2014-04-15 16:27:38 +00:00
Duncan P. N. Exon Smith
03bcaf78d5 verify-di: split out VerifierSupport
Split out assertion and output helpers from Verifier in preparation for
writing the DebugInfoVerifier.

<rdar://problem/15500563>

llvm-svn: 206299
2014-04-15 16:27:32 +00:00
David Blaikie
e8c0fb659a Use unique_ptr to manage PassInfo instances in the PassRegistry
llvm-svn: 206297
2014-04-15 15:17:14 +00:00
NAKAMURA Takumi
f862fdfd01 MipsAsmParser.cpp: Fix vg_leak in MipsOperand::CreateMem(). Mem.Base is managed by k_Memory itself.
llvm-svn: 206293
2014-04-15 14:13:21 +00:00
NAKAMURA Takumi
4d658e107b MipsAsmParser::ParseRegister(): Be responsible to delete an Operand on a temporary Operands.
llvm-svn: 206292
2014-04-15 14:06:27 +00:00
Tim Northover
1a72d5035d AArch64/ARM64: add missing pattern for extending load.
llvm-svn: 206290
2014-04-15 14:00:19 +00:00
Tim Northover
2fa51b50b7 AArch64/ARM64: only mangle MOVZ/MOVN during encoding when needed
Sometimes we need emit the bits that would actually be a MOVN when producing a
relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got
wrong until now.

llvm-svn: 206289
2014-04-15 14:00:15 +00:00
Tim Northover
b80745a8af AArch64/ARM64: add support for large code-model jump tables.
I've left the MachO CodeGen as it is, there's a reasonable chance it should use
the GOT like ConstPools, but I'm not certain.

llvm-svn: 206288
2014-04-15 14:00:11 +00:00
Tim Northover
25bd67bc13 AArch64/ARM64: add patterns for various commutations of FNMADD.
llvm-svn: 206287
2014-04-15 14:00:06 +00:00
Tim Northover
807e4d1956 AArch64/ARM64: add half as a storage type on ARM64.
This brings it into line with the AArch64 behaviour and should open the way for
certain OpenCL features.

llvm-svn: 206286
2014-04-15 14:00:03 +00:00
Tim Northover
d9b311a7db AArch64/ARM64: copy patterns for fixed-point conversions
Code is mostly copied directly across, with a slight extension of the
ISelDAGToDAG function so that it can cope with the floating-point constants
being behind a litpool.

llvm-svn: 206285
2014-04-15 13:59:57 +00:00
Tim Northover
19b6db94a6 ARM64: add constraints to various FastISel operations
llvm-svn: 206284
2014-04-15 13:59:53 +00:00
Tim Northover
537e0eb4e2 FastISel: constrain the RegClass of operands when emitting instructions.
ARM64 suffered multiple -verify-machineinstr failures (principally over the
xsp/xzr issue) because FastISel was completely ignoring which subset of the
general-purpose registers each instruction required.

More fixes are coming in ARM64 specific FastISel, but this should cover the
generic problems.

llvm-svn: 206283
2014-04-15 13:59:49 +00:00
Tim Northover
5317490f6a AArch64/ARM64: add dp tests from AArch64
llvm-svn: 206281
2014-04-15 13:59:40 +00:00
NAKAMURA Takumi
bd52b3fbf2 ARM64AsmParser.cpp: Fix vg_leak in MC/ARM64/fp-encoding.s.
llvm-svn: 206279
2014-04-15 13:22:11 +00:00
Stepan Dyatkovskiy
733dd42b91 Optional hash symbol feature support for ARM64
http://reviews.llvm.org/D3328

llvm-svn: 206276
2014-04-15 11:43:09 +00:00
Vladimir Medic
f2d4c9e7ad Current definition of subtract with immediate instruction aliases uses CodeGenOnly defined instructions and post matcher expansion methods to emit real instructions add with immediate. However, they can directly alias add with immediate instruction and remove unnecessary definitions and code in MipsAsmParser.cpp. This patch makes no change in functionality, just removes unnecessary definitions and code.
llvm-svn: 206272
2014-04-15 10:14:49 +00:00
Chandler Carruth
e4cc6b2f96 [Allocator] Finally, finish nuking the redundant code that led me here
by removing the MallocSlabAllocator entirely and just using
MallocAllocator directly. This makes all off these allocators expose and
utilize the same core interface.

The only ugly part of this is that it exposes the fact that the JIT
allocator has no real handling of alignment, any more than the malloc
allocator does. =/ It would be nice to fix both of these to support
alignments, and then to leverage that in the BumpPtrAllocator to do less
over allocation in order to manually align pointers. But, that's another
patch for another day. This patch has no functional impact, it just
removes the somewhat meaningless wrapper around MallocAllocator.

llvm-svn: 206267
2014-04-15 09:44:09 +00:00
Alexey Bataev
135cfee77c D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata
llvm-svn: 206266
2014-04-15 09:37:30 +00:00
NAKAMURA Takumi
acebc1e6ca X86JITInfo: [x86] Rework r206240, X86CompilationCallback_SSE() should be called for SSE-enabled code generator, even if LLVM is not built with -msse.
llvm-svn: 206261
2014-04-15 08:28:23 +00:00
Nick Lewycky
82ad9fc7c8 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Craig Topper
c2260fc0ab [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206252
2014-04-15 06:32:26 +00:00
David Blaikie
ad3e8f4101 Use unique_ptr to manage TypePromotionActions owned by TypePromotionTransaction.
llvm-svn: 206250
2014-04-15 06:17:44 +00:00
David Blaikie
621e20bc78 Use unique_ptr to manage ownership of GCFunctionInfos in GCStrategy
llvm-svn: 206249
2014-04-15 06:07:26 +00:00
David Blaikie
ff6e0d4bb1 Use unique_ptr for the result of Registry entries.
llvm-svn: 206248
2014-04-15 05:53:26 +00:00
David Blaikie
3d383785b6 Use unique_ptr to manage ownership of GCStrategy objects in GCMetadata
llvm-svn: 206246
2014-04-15 05:34:49 +00:00
David Blaikie
128c7c6443 Use unique_ptr for section/segment ownership in WinCOFFObjectWriter
llvm-svn: 206245
2014-04-15 05:25:03 +00:00
David Blaikie
e2f6c1d28f Use unique_ptr to own MCFunctions within MCModule.
MCModule's ctor had to be moved out of line so the definition of
MCFunction was available. (ctor requires the dtor of members (in case
the ctor throws) which required access to the dtor of MCFunction)

llvm-svn: 206244
2014-04-15 05:15:19 +00:00
Craig Topper
8cd194d4c1 [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206243
2014-04-15 04:59:12 +00:00
David Blaikie
a3599b2242 Use std::unique_ptr to manage MCBasicBlocks in MCFunction.
llvm-svn: 206242
2014-04-15 04:56:29 +00:00
Lang Hames
91cdab6916 [MC] Require an MCContext when constructing an MCDisassembler.
This patch re-introduces the MCContext member that was removed from
MCDisassembler in r206063, and requires that an MCContext be passed in at
MCDisassembler construction time. (Previously the MCContext member had been
initialized in an ad-hoc fashion after construction). The MCCContext member
can be used by MCDisassembler sub-classes to construct constant or
target-specific MCExprs.

This patch updates disassemblers for in-tree targets, and provides the
MCRegisterInfo instance that some disassemblers were using through the
MCContext (previously those backends were constructing their own
MCRegisterInfo instances).

llvm-svn: 206241
2014-04-15 04:40:56 +00:00
NAKAMURA Takumi
11d1c480fb X86JITInfo: [x86] Use X86CompilationCallback_SSE() along;
*not* Subtarget->hasSSE1()
  *but* __SSE__, the flag that LLVM libraries are compiled

The callback calls internal LLVM JIT libraries. It may be built with -msse (or above).

FIXME: JIT may use "host" instead of "generic" by default.
llvm-svn: 206240
2014-04-15 04:12:21 +00:00
Jim Grosbach
161b74b41e [ARM64,C++11]: Range'ify the dead-register-definition pass.
Range-based for loops. No functional change intended.

llvm-svn: 206239
2014-04-15 02:14:09 +00:00
Quentin Colombet
e6f12ef160 [MC] Emit an error if cfi_startproc is used before a symbol is defined.
Currently, we bind those directives with the last symbol, so if none
has been defined, this would lead to a crash of the compiler.

<rdar://problem/15939159>

llvm-svn: 206236
2014-04-15 01:17:45 +00:00
Quentin Colombet
d141feffeb [ARM64][MC] Set the default CPU string to generic.
llvm-svn: 206228
2014-04-15 00:28:39 +00:00
David Blaikie
7495172545 Use std::unique_ptr for DIE children
Got bored, removed some manual memory management.

Pushed references (rather than pointers) through a few APIs rather than
replacing *x with x.get().

llvm-svn: 206222
2014-04-14 22:45:02 +00:00
Jim Grosbach
20e09f25d7 X86: Nuke one more CPU autodetect blurb.
Missed one in r206094. This brings MC and TargetMachine back into sync.

llvm-svn: 206220
2014-04-14 22:23:30 +00:00
David Blaikie
5191eb8e9f Change argument order and add explanatory comment to r206130
Changes requested in code review by Eric Christopher of r206130.

llvm-svn: 206219
2014-04-14 22:23:06 +00:00
Eric Christopher
de23a0240e Use FrameSetup on frame instructions for the Mips port.
I can't seem to get a testcase to show a difference here, but it's
part of the unconditional-br.ll line table weirdness.

llvm-svn: 206218
2014-04-14 22:21:22 +00:00
Matt Arsenault
2a6aada789 Revert "Revert r206045, "Fix shift by constants for vector.""
Fix cases where the Value itself is used, and not the constant value.

llvm-svn: 206214
2014-04-14 21:50:37 +00:00
Quentin Colombet
e2f7da8884 [ARM64][MC] Set the default CPU to cyclone when initilizating the MC layer.
This matches that ARM64Subtarget does for now.

This is related to <rdar://problem/16573920>

llvm-svn: 206211
2014-04-14 21:25:53 +00:00
Adrian Prantl
ee1b12d3e5 Re-apply r206096 after investigating the gdb buildbot failure.
Thanks to dblaikie for updating the testcase!

Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.

This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.

rdar://problem/16362674.

llvm-svn: 206210
2014-04-14 21:16:04 +00:00
Louis Gerbarg
3eecd92d3d Fix for codegen bug that could cause illegal cmn instruction generation
In rare cases the dead definition elimination pass code can cause illegal cmn
instructions when it replaces dead registers on instructions that use
unmaterialized frame indexes. This patch disables the dead definition
optimization for instructions which include frame index operands.

rdar://16438284

llvm-svn: 206208
2014-04-14 21:05:05 +00:00
Louis Gerbarg
c06e27404e Add a flag to disable the ARM64DeadRegisterDefinitionsPass
This patch adds a -arm64-dead-def-elimination flag so that it is possible to
disable dead definition elimination. Includes test case.

llvm-svn: 206207
2014-04-14 21:05:02 +00:00
James Molloy
4482987575 [ARM64] Port over missing subtarget features, and CPU definitions from AArch64.
llvm-svn: 206198
2014-04-14 17:38:00 +00:00
James Molloy
db53ec7e4f [ARM64] Add big endian target arm64_be.
llvm-svn: 206197
2014-04-14 17:37:53 +00:00
Kaelyn Takata
5db7807928 Replace two calls to object::symbol_iterator::increment(), which had
been removed in r200442.

llvm-svn: 206196
2014-04-14 17:26:50 +00:00
Kaelyn Takata
1bd7494436 Remove a variable from r206192 that is only used in an assert.
llvm-svn: 206195
2014-04-14 17:21:50 +00:00
Akira Hatanaka
bf62cde86e Fix a bug in which BranchProbabilityInfo wasn't setting branch weights of basic blocks inside loops correctly.
Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights.

This fixes PR18705 and <rdar://problem/15991090>.

Differential Revision: http://reviews.llvm.org/D3363

llvm-svn: 206194
2014-04-14 16:56:19 +00:00
Kaelyn Takata
7f0dccdbd8 Fix up MCFixup::getAccessVariant to handle unary expressions.
This allows correct relocations to be generated for a symbolic
address that is being adjusted by a negative constant. Since r204294,
such expressions have triggered undefined behavior when LLVM was built
without assertions.

Credit goes to Rafael for this patch; I'm submitting it on his behalf
as he is on vacation this week.

llvm-svn: 206192
2014-04-14 16:50:22 +00:00
Daniel Sanders
5698dfd860 [mips] Fix fcopysign for MIPS-IV and add the test.
Summary:
This was another incorrect use of hasMips64() vs isGP64bit().

Depends on D3344

Reviewers: matheusalmeida, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3347

llvm-svn: 206187
2014-04-14 16:24:12 +00:00
Daniel Sanders
b3f0be8f2e [mips] Fix more incorrect uses of HasMips64 and isMips64()
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's

Patch by David Chisnall
His work was sponsored by: DARPA, AFRL

I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
    LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
    LLVM :: CodeGen/Mips/cmov.ll
    LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
    LLVM :: CodeGen/Mips/largeimmprinting.ll
    LLVM :: CodeGen/Mips/longbranch.ll
    LLVM :: CodeGen/Mips/mips64-f128.ll
    LLVM :: CodeGen/Mips/mips64directive.ll
    LLVM :: CodeGen/Mips/mips64ext.ll
    LLVM :: CodeGen/Mips/mips64fpldst.ll
    LLVM :: CodeGen/Mips/mips64intldst.ll
    LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
    LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll

Reviewers: dsanders

Reviewed By: dsanders

CC: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3343

llvm-svn: 206183
2014-04-14 15:44:42 +00:00
James Molloy
b0f3737522 Teach llvm-lto to respect the given RelocModel.
Patch by Nick Tomlinson!

llvm-svn: 206177
2014-04-14 13:54:16 +00:00
Tim Northover
1ddf6e933e ARM64: remove buggy REV16 pattern.
The 32-bit pattern is still valid: 0123 -> 3210 -> 1032.

llvm-svn: 206172
2014-04-14 12:59:52 +00:00
Tim Northover
837be1f97d AArch64/ARM64: enable directcond.ll test on ARM64.
Code change is because optimizeCompareInstr didn't know how to pull the
condition code out of FCSEL instructions.

llvm-svn: 206171
2014-04-14 12:51:06 +00:00
Tim Northover
3a59e7c449 ARM64: add patterns for csXYZ with reversed operands.
AArch64 tests for this, and it's obviously a good idea. Have to invert the
condition code, of course.

llvm-svn: 206170
2014-04-14 12:51:02 +00:00
Tim Northover
0f5179b30d ARM64: add support for AArch64's addsub_ext.ll
There was one definite issue in ARM64 (the off-by-1 check for whether
a shift could be folded in) and one difference that is probably
correct: ARM64 didn't fold nodes with multiple uses into the
arithmetic operations unless optimising for code size.

llvm-svn: 206168
2014-04-14 12:50:50 +00:00
Tim Northover
614708ff8e ARM64: optimise (cmp x, (sub 0, y)) to (cmn x, y).
This transformation is only valid when being used for an EQ or NE
comparison since the flags change otherwise.

llvm-svn: 206167
2014-04-14 12:50:47 +00:00
Richard Osborne
6d5512a94e [XCore] Don't create invalid MKMSK instructions inside loadImmediate().
Summary:
Previously loadImmediate() would produce MKMSK instructions with invalid
immediate values such as mkmsk r0, 9. Fix this by checking the mask size
is valid.

Reviewers: robertlytton

Reviewed By: robertlytton

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3289

llvm-svn: 206163
2014-04-14 12:30:35 +00:00
NAKAMURA Takumi
1a21e608ca Whitespace.
llvm-svn: 206154
2014-04-14 07:03:13 +00:00
NAKAMURA Takumi
c6fb0494ea Revert r206045, "Fix shift by constants for vector."
It broke some builders, at least, i686.

llvm-svn: 206153
2014-04-14 07:02:57 +00:00
Chandler Carruth
feba5396be [Allocator] Hoist the external helper function into a namespace scope
declaration. GCC 4.7 appears to get hopelessly confused by declaring
this function within a member function of a class template. Go figure.

llvm-svn: 206152
2014-04-14 06:42:56 +00:00
Hal Finkel
93c495a063 Don't assert in BasicTTI::getMemoryOpCost for non-simple types
BasicTTI::getMemoryOpCost must explicitly check for non-simple types; setting
AllowUnknown=true with TLI->getSimpleValueType is not sufficient because, for
example, non-power-of-two vector types return non-simple EVTs (not MVT::Other).

llvm-svn: 206150
2014-04-14 05:59:09 +00:00
Chandler Carruth
0ecbcadf5d [Allocator] Make the underlying allocator a template instead of an
abstract interface. The only user of this functionality is the JIT
memory manager and it is quite happy to have a custom type here. This
removes a virtual function call and a lot of unnecessary abstraction
from the common case where this is just a *very* thin vaneer around
a call to malloc.

Hopefully still no functionality changed here. =]

llvm-svn: 206149
2014-04-14 05:11:27 +00:00
Chandler Carruth
27e852b6ee [Allocator] Switch the BumpPtrAllocator to use a vector of pointers to
slabs rather than embedding a singly linked list in the slabs
themselves. This has a few advantages:

- Better utilization of the slab's memory by not wasting 16-bytes at the
  front.
- Simpler allocation strategy by not having a struct packed at the
  front.
- Avoids paging every allocated slab in just to traverse them for
  deallocating or dumping stats.

The latter is the really nice part. Folks have complained from time to
time bitterly that tearing down a BumpPtrAllocator, even if it doesn't
run any destructors, pages in all of the memory allocated. Now it won't.
=]

Also resolves a FIXME with the scaling of the slab sizes. The scaling
now disregards specially sized slabs for allocations larger than the
threshold.

llvm-svn: 206147
2014-04-14 03:55:11 +00:00
Serge Pavlov
c145d314a1 Use APInt arithmetic, fixed typo. Thanks to Benjamin Kramer for noticing that.
llvm-svn: 206144
2014-04-14 02:20:19 +00:00
Craig Topper
30281a67fb [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206142
2014-04-14 00:51:57 +00:00
Hal Finkel
c4a623f8d4 [PowerPC] [Constant Hoisting] Enable constant hoisting on PPC
Implements the various TTI functions to enable constant hoisting on PPC. The
only significant test-suite change is this:

MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup
(which essentially reverses the slowdown from r206120).

llvm-svn: 206141
2014-04-13 23:02:40 +00:00
Saleem Abdulrasool
c27cb76435 MC: check machine magic when applying offset adjustments
The values for the relocation type can (and do) overlap across various
architectures.  When performing an adjustment of the emitted relocation in the
final object file, check that the file magic matches the target for which the
relocation type is valid (e.g. a I386 relocation is only applied to an X86
object file, and an AMD64 relocation is only applied to an X86_64 object file).
This was noticed while adding support for ARM WinCOFF object file emission.

A test case for this is not really possible as the values for REL32 do not
overlap on I386 and AMD64, which is why this was never noticed in practice.  The
ARM WinCOFF emission is not yet ready to merge into the tree.

llvm-svn: 206138
2014-04-13 20:47:55 +00:00
Serge Pavlov
816d014c52 Recognize test for overflow in integer multiplication.
If multiplication involves zero-extended arguments and the result is
compared as in the patterns:

    %mul32 = trunc i64 %mul64 to i32
    %zext = zext i32 %mul32 to i64
    %overflow = icmp ne i64 %mul64, %zext
or
    %overflow = icmp ugt i64 %mul64 , 0xffffffff

then the multiplication may be replaced by call to umul.with.overflow.
This change fixes PR4917 and PR4918.

Differential Revision: http://llvm-reviews.chandlerc.com/D2814

llvm-svn: 206137
2014-04-13 18:23:41 +00:00
Hal Finkel
4da0e32e2a [PowerPC] Fix rlwimi isel when mask is not constant
We had been using the known-zero values of the operand of the or to construct
the mask for an rlwimi; this is not quite correct, but fine when the mask is
constant. When the mask is constant, then the known zeros of the operand must
be a superset of the zeros in the mask. However, when the mask is not a
constant, then there might be bits in the operand that are not known to be zero
that, at runtime, might be zero in the mask. Therefore, we check that any bits
not known to be zero *are* known to be one in the mask. Otherwise, we can't
fold the mask with the or and shift.

This was revealed as a miscompile of
MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with
constant hoisting.

llvm-svn: 206136
2014-04-13 17:10:58 +00:00
David Blaikie
cedce0ba4f Fix instruction debug info location during legalization
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).

I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.

llvm-svn: 206130
2014-04-13 06:39:55 +00:00
Craig Topper
bd0a634bba [C++11] More 'nullptr' conversion or in some cases just using a boolean check instead of comparing to nullptr.
llvm-svn: 206129
2014-04-13 04:57:38 +00:00
Lang Hames
f72ca1f7d5 [X86] unique_ptr'ify one of X86GenericDisassembler's members.
llvm-svn: 206127
2014-04-13 04:09:16 +00:00
Hal Finkel
a1849e7ac8 [PowerPC] Implement some additional TLI callbacks
Add implementations of:
  bool isLegalICmpImmediate(int64_t Imm) const
  bool isLegalAddImmediate(int64_t Imm) const
  bool isTruncateFree(Type *Ty1, Type *Ty2) const
  bool isTruncateFree(EVT VT1, EVT VT2) const
  bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const

Unfortunately, this regresses counter-register-based loop formation because
some of the loops now end up in forms were SE cannot compute loop counts.
However, nevertheless, the test-suite results favor committing:

SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup
MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup

MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown

llvm-svn: 206120
2014-04-12 21:52:38 +00:00
Benjamin Kramer
e5a211292c Spell the specialization namespace correctly.
Not sure why clang didn't diagnose this (GCC does).

llvm-svn: 206117
2014-04-12 18:45:24 +00:00
Benjamin Kramer
224e38407d Make helper static and place random global into the llvm namespace.
llvm-svn: 206116
2014-04-12 18:39:57 +00:00
Benjamin Kramer
f6c0615b06 Retire llvm::array_endof in favor of non-member std::end.
While there make array_lengthof constexpr if we have support for it.

llvm-svn: 206112
2014-04-12 16:15:53 +00:00
Benjamin Kramer
336194addf Move MDBuilder's methods out of line.
Making them inline was a historical accident, they're neither hot nor
templated.

llvm-svn: 206109
2014-04-12 14:26:59 +00:00
David Blaikie
7c76f25336 PR13337: Omit DW_TAG_restrict_type when compiling for DWARF2
DWARF3 introduced DW_TAG_restrict_type, so avoid using it in prior
versions.

llvm-svn: 206105
2014-04-12 05:35:59 +00:00
Adrian Prantl
2d31ce322b Revert "Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,"
This reverts commit 206096 while I investigate why this broke the gdb
buildbot.

llvm-svn: 206103
2014-04-12 04:25:02 +00:00
Juergen Ributzka
23e31a5f36 [ARM64] Never hoist the shift value of a shift instruction.
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.

llvm-svn: 206101
2014-04-12 02:53:51 +00:00
Juergen Ributzka
2cdca434db [ARM64] Fix the cost model for cheap large constants.
Originally the cost model would give up for large constants and just return the
maximum cost. This is not what we want for constant hoisting, because some of
these constants are large in bitwidth, but are still cheap to materialize.

This commit fixes the cost model to either return TCC_Free if the cost cannot be
determined, or accurately calculate the cost even for large constants
(bitwidth > 128).

This fixes <rdar://problem/16591573>.

llvm-svn: 206100
2014-04-12 02:36:28 +00:00
David Blaikie
1031309286 Use dwarf::Tag rather than unsigned for DIE::Tag to make debugging easier.
Nice to be able to just print out the Tag and have the debugger print
dwarf::DW_TAG_subprogram or whatever, rather than an int.

It's a bit finicky (for example DIDescriptor::getTag still returns
unsigned) because some places still handle real dwarf tags + our fake
tags (one day we'll remove the fake tags, hopefully).

llvm-svn: 206098
2014-04-12 02:24:04 +00:00
Adrian Prantl
86edc5e96b Debug info: (bugfix) C++ C/Dtors can be compiled to multiple functions,
therefore, their declaration cannot have one DW_AT_linkage_name.
The specific instances however can and should have that attribute.

This patch reorders the code in DwarfUnit::getOrCreateSubprogramDIE()
to emit linkage names for C/Dtors.

rdar://problem/16362674.

llvm-svn: 206096
2014-04-12 01:44:42 +00:00
Jim Grosbach
51592aeaa3 X86: Remove TargetMachine CPU auto-detection.
This logic is properly in the realm of whatever is creating the
TargetMachine. This makes plain 'llc foo.ll' consistent across
heterogenous machines.

llvm-svn: 206094
2014-04-12 01:34:29 +00:00
Hal Finkel
70fab0cd6d Reenable use of TBAA during CodeGen
We had disabled use of TBAA during CodeGen (even when otherwise using AA)
because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to
miss basic type punning that it should catch (and, thus, we'd fail to override
TBAA when we should).

However, when AA is in use during CodeGen, CGP now uses normal GEPs and
bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a
result, BasicAA should be able to make us do the right thing in the face of
type-punning, and it seems safe to enable use of TBAA again. self-hosting seems
fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle.

Note: We still don't update TBAA when merging stack slots, although because
BasicAA should now catch all such cases, this is no longer a blocking issue.
Nevertheless, I plan to commit code to deal with this properly in the near
future.

llvm-svn: 206093
2014-04-12 01:26:00 +00:00
Hal Finkel
f4336e3866 Add the ability to use GEPs for address sinking in CGP
The current memory-instruction optimization logic in CGP, which sinks parts of
the address computation that can be adsorbed by the addressing mode, does this
by explicitly converting the relevant part of the address computation into
IR-level integer operations (making use of ptrtoint and inttoptr). For most
targets this is currently not a problem, but for targets wishing to make use of
IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a
problem for two reasons:
  1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr
  2. In cases where type-punning was used, and BasicAA was used
     to override TBAA, BasicAA may no longer do so. (this had forced us to disable
     all use of TBAA in CodeGen; something which we can now enable again)

This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by
default (except for those targets that use AA during CodeGen), and so aside
from some PowerPC subtargets and SystemZ, there should be no change in
behavior. We may be able to switch completely away from the ptrtoint/inttoptr
sinking on all targets, but further testing is required.

I've doubled-up on a number of existing tests that are sensitive to the
address sinking behavior (including some store-merging tests that are
sensitive to the order of the resulting ADD operations at the SDAG level).

llvm-svn: 206092
2014-04-12 00:59:48 +00:00
Chad Rosier
f94fdcad79 [AArch64] Implement the isLegalAddressingMode and getScalingFactorCost APIs.
llvm-svn: 206089
2014-04-12 00:14:23 +00:00
Duncan P. N. Exon Smith
532f710ed4 blockfreq: Rename BlockFrequencyImpl to BlockFrequencyInfoImpl
This is a shared implementation class for BlockFrequencyInfo and
MachineBlockFrequencyInfo, not for BlockFrequency, a related (but
distinct) class.

No functionality change.

<rdar://problem/14292693>

llvm-svn: 206083
2014-04-11 23:20:58 +00:00
Duncan P. N. Exon Smith
3af74641e5 blockfreq: Use getSuccessorIndex()
No functionality change.

<rdar://problem/14292693>

llvm-svn: 206082
2014-04-11 23:20:52 +00:00
David Blaikie
7e2966c6c0 Pull out a named variable for the cached section names to aid readability.
Based on a code review suggestion from Eric Christopher in r205990

llvm-svn: 206080
2014-04-11 22:49:14 +00:00
Louis Gerbarg
e84c6abeec Add ARM64 CLS patterns
This patch adds patterns to generate the cls instruction ARM64. Includes tests
for 64 bit and 32 bit operands.

rdar://15611957

llvm-svn: 206079
2014-04-11 22:27:58 +00:00
David Blaikie
67ed112864 Format fixes for r205990
llvm-svn: 206078
2014-04-11 22:11:50 +00:00
Quentin Colombet
e1542121a5 [RegAllocGreedy][Last Chance Recoloring] Change the name of the exhaustive search option.
fexhaustive-register-search => exhaustive-register-search
'f' is a Clang thing!

This is related to PR18747.

llvm-svn: 206075
2014-04-11 21:51:09 +00:00
Quentin Colombet
4f1ce6e4a2 [RegAllocGreedy][Last Chance Recoloring] Addition of
-fexhaustive-register-search option to allow an exhaustive search during last
chance recoloring.

This is related to PR18747

Patch by MAYUR PANDEY <mayur.p@samsung.com>. 

llvm-svn: 206072
2014-04-11 21:39:44 +00:00
Matt Arsenault
63e365489a R600: Check if a sextload should be used for parameter loads.
Through some oddity where truncate (sextload x) isn't folded into
an anyextload for vectors, the sextload remains if the
vector isn't immediately scalarized. This keeps the expected
zextload instructions in the kernel-args test when small type
vectors aren't scalarized.

llvm-svn: 206070
2014-04-11 20:59:54 +00:00
Lang Hames
960e8b3cb0 Remove redundant symbolization support from MCDisassembler interface.
MCDisassembler has an MCSymbolizer member that is meant to take care of
symbolizing during disassembly, but it also has several methods that enable the
disassembler to do symbolization internally (i.e. without an attached symbolizer
object). There is no need for this duplication, but ARM64 had been making use of
it. This patch moves the ARM64 symbolization logic out of ARM64Disassembler and
into an ARM64ExternalSymbolizer class, and removes the duplicated MCSymbolizer
functionality from the MCDisassembler interface. Symbolization will now be
done exclusively through MCSymbolizers.

There should be no impact on disassembly for any platform, but this allows us to
tidy up the MCDisassembler interface and simplify the process of (and invariants
related to) disassembler setup.

llvm-svn: 206063
2014-04-11 20:07:58 +00:00
Quentin Colombet
149298454e [Register Coalescer] Fix wrong live-range information with rematerialization.
When rematerializing an instruction that defines a super register that would be
used by a physical subregisters we use the related physical super register for
the definition.
To keep the live-range information accurate, all the defined subregisters must be
marked as dead def, otherwise the register allocation may miss some
interferences.

Working on a reduced test-case!

<rdar://problem/16582185>

llvm-svn: 206060
2014-04-11 19:45:07 +00:00
Matt Arsenault
e7a5c69ed5 R600/SI: Refactor SOPC classes slightly.
Better match what is done for VOPC to eventually
prefer selecting these.

llvm-svn: 206048
2014-04-11 19:25:18 +00:00
Rafael Espindola
4115bca87c Don't lose the thumb bit by using relocations with sections.
This fixes a regression from r205076.

llvm-svn: 206047
2014-04-11 19:18:01 +00:00
Matt Arsenault
c399a3f659 Fix shift by constants for vector.
ashr <N x iM>, <N x iM> M -> undef

llvm-svn: 206045
2014-04-11 17:57:53 +00:00
Adrian Prantl
208fc516be Debug info: Store the DIVariable in DebugLocEntry also for constants,
so DwarfDebug::emitDebugLocEntry can emit them with the correct signedness.

rdar://problem/15928306

llvm-svn: 206042
2014-04-11 17:49:47 +00:00
Matt Arsenault
65fde80ac6 Move ExtractVectorElements to SelectionDAG.
This seems generally useful, and makes sense to
go along with SplitVector.

llvm-svn: 206041
2014-04-11 17:47:30 +00:00
Tom Stellard
9ea60803c4 SelectionDAG: Use helper function to improve legalization of ISD::MUL
The TargetLowering::expandMUL() helper contains lowering code extracted
from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more
ISD::MUL patterns without having to use a library call.

llvm-svn: 206037
2014-04-11 16:12:01 +00:00
Tom Stellard
be787d15d0 SelectionDAG: Factor ISD::MUL lowering code out of DAGTypeLegalizer
This code has been moved to a new function in the TargetLowering
class called expandMUL().  The purpose of this is to be able
to share lowering code between the SelectionDAGLegalize and
DAGTypeLegalizer classes.

No functionality changed intended.

llvm-svn: 206036
2014-04-11 16:11:58 +00:00
Diego Novillo
594092a9c5 Fix use-after-free bug caught by address sanitizer:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2959

The location string is returned as a std::string, not a StringRef.

llvm-svn: 206032
2014-04-11 13:55:56 +00:00
Simon Atanasyan
129c45d10c [yaml2obj][ELF] ELF Relocations Support.
The patch implements support for both relocation record formats: Elf_Rel
and Elf_Rela. It is possible to define relocation against symbol only.
Relocations against sections will be implemented later. Now yaml2obj
recognizes X86_64, MIPS and Hexagon relocation types.

Example of relocation section specification:
Sections:
- Name: .text
  Type: SHT_PROGBITS
  Content: "0000000000000000"
  AddressAlign: 16
  Flags: [SHF_ALLOC]

- Name: .rel.text
  Type: SHT_REL
  Info: .text
  AddressAlign: 4
  Relocations:
    - Offset: 0x1
      Symbol: glob1
      Type: R_MIPS_32
    - Offset: 0x2
      Symbol: glob2
      Type: R_MIPS_CALL16

The patch reviewed by Michael Spencer, Sean Silva, Shankar Easwaran.

llvm-svn: 206017
2014-04-11 04:13:39 +00:00
David Blaikie
1573e6e09f Implement depth_first and inverse_depth_first range factory functions.
Also updated as many loops as I could find using df_begin/idf_begin -
strangely I found no uses of idf_begin. Is that just used out of tree?

Also a few places couldn't use df_begin because either they used the
member functions of the depth first iterators or had specific ordering
constraints (I added a comment in the latter case).

Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T>
where you needed iterator_range<idf_iterator<T>>)

llvm-svn: 206016
2014-04-11 01:50:01 +00:00
Jim Grosbach
0d0ea8cdb5 [c++11] Range'ify use list loops in InstrEmitter.
llvm-svn: 206015
2014-04-11 01:13:16 +00:00
Jim Grosbach
6f9873ee9f [c++11] Range'ify use list loops in DAGCombiner.
llvm-svn: 206014
2014-04-11 01:13:13 +00:00
Jim Grosbach
1560d0c9a4 [ARM64,C++11] Range'ify use-lists iterators in address type promotion.
llvm-svn: 206013
2014-04-11 01:13:10 +00:00
David Blaikie
19cb705994 Use value types instead of 'new'd objects to store dwarf labels for asm files
llvm-svn: 206009
2014-04-11 00:43:52 +00:00
Jim Grosbach
351004c3a4 [ARM64,C++11]: Range'ify use-list iterators in DAGToDAG.
llvm-svn: 206007
2014-04-11 00:27:22 +00:00
Jim Grosbach
cbc54c6ebd [ARM64,C++11]: More range-based loop simplification.
llvm-svn: 206006
2014-04-11 00:27:19 +00:00
David Blaikie
6ce533487f Remove lazy-initialization of section caches in MCContext
This seems to have been a cargo-culted habit from the very first such
cache which didn't have any specific justification (but might've been a
layering constraint at the time).

llvm-svn: 206003
2014-04-10 23:55:11 +00:00
Reid Kleckner
f99741400f Move the segmented stack switch to a function attribute
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.

Patch by Luqman Aden and Alex Crichton!

llvm-svn: 205997
2014-04-10 22:58:43 +00:00
Jim Grosbach
0cd1c34275 [ARM64,C++11]: Range'ify loops in InstrInfo.
llvm-svn: 205992
2014-04-10 22:00:18 +00:00
David Blaikie
5f6c3e8299 Reimplement debug info compression by compressing the whole section, rather than a fragment.
To support compressing the debug_line section that contains multiple
fragments (due, I believe, to variation in choices of line table
encoding depending on the size of instruction ranges in the actual
program code) we needed to support compressing multiple MCFragments in a
single pass.

This patch implements that behavior by mutating the post-relaxed and
relocated section to be the compressed form of its former self,
including renaming the section.

This is a more flexible (and less invasive, to a degree) implementation
that will allow for other features such as "use compression only if it's
smaller than the uncompressed data".

Compressing debug_frame would be a possible further extension to this
work, but I've left it for now. The hurdle there is alignment sections -
which might require going as far as to refactor
MCAssembler.cpp:writeFragment to handle writing to a byte buffer or an
MCObjectWriter (there's already a virtual call there, so it shouldn't
add substantial compile-time cost) which could in turn involve
refactoring MCAsmBackend::writeNopData to use that same abstraction...
which involves touching all the backends. This would remove the limited
handling of fragment writing seen in
ELFObjectWriter.cpp:getUncompressedData which would be nice - but it's
more invasive.

I did discover that I (perhaps obviously) don't need to handle
relocations when I rewrite the fragments - since the relocations have
already been applied and computed (and stored into
ELFObjectWriter::Relocations) by this stage (necessarily, because we
need to have written any immediate values or assembly-time relocations
into the data already before we compress it, which we have). The test
case doesn't necessarily cover that in detail - I can add more test
coverage if that's preferred.

llvm-svn: 205990
2014-04-10 21:53:53 +00:00
David Blaikie
99744346e4 Revert debug info compression support.
To support compression for debug_line and debug_frame a different
approach is required. To simplify review, revert the old implementation
and XFAIL the test case. New implementation to follow shortly.

Reverts r205059 and r204958.

llvm-svn: 205989
2014-04-10 21:53:47 +00:00
Jim Grosbach
24a37167ab [ARM64,C++11]: Range'ify loops in the conditional-compare pass.
llvm-svn: 205988
2014-04-10 21:49:24 +00:00
Kevin Enderby
7eca8cdb16 For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

FYI, re-committing this with a tweak so MemoryOp's default
constructor is trivial and will work with MSVC 2012. Thanks
to Reid Kleckner and Jim Grosbach for help with the tweak.

rdar://11312406

llvm-svn: 205986
2014-04-10 20:18:58 +00:00
Adrian Prantl
52b43b7eb6 Debug info: Factor the retrieving of the DIVariable from a MachineInstr
into a function.

llvm-svn: 205973
2014-04-10 17:39:48 +00:00
Daniel Sanders
1f079bb11c [mips] NotMips64 predicate is really a test for 32-bit GPR's.
Summary:
Similarly, the HasMips64 on the 64-bit move InstAlias is a test for 64-bit
GPR's.

No functional change.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3263

llvm-svn: 205968
2014-04-10 15:00:28 +00:00
Arnold Schwaighofer
c65ae6074a Reapply "SLPVectorizer: Ignore users that are insertelements we can reschedule them"
This commit reapplies 205018. After 205855 we should correctly vectorize
intrinsics.

llvm-svn: 205965
2014-04-10 13:41:35 +00:00
Daniel Sanders
d6fb64ca58 [mips] Switch the MIPS-III and MIPS-IV assembler tests to use -mcpu=mips4.
Summary:
It is now the smallest superset for these ISA's.

FeatureMips4 now contains FeatureFPIdx since [ls][dw]xc1 were added in MIPS-IV.
Made the FPIdx feature bit lowercase so that it can be used in the -mattr option.

Depends on D3274

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3275

llvm-svn: 205964
2014-04-10 13:16:49 +00:00
NAKAMURA Takumi
29db5b1f13 ARM64/*/LLVMBuild.txt: Prune redundant deps.
llvm-svn: 205963
2014-04-10 12:46:13 +00:00
NAKAMURA Takumi
d13a996dc0 LLVMBuild.txt: Add missing dependencies.
llvm-svn: 205962
2014-04-10 11:16:47 +00:00
NAKAMURA Takumi
b6baf3b8a1 LLVMBuild.txt: Reformat.
llvm-svn: 205961
2014-04-10 11:16:17 +00:00
David Majnemer
60de53a3e4 YAMLIO: Allow scalars to dictate quotation rules
Introduce ScalarTraits::mustQuote which determines whether or not a
StringRef needs quoting before it is acceptable to output.

llvm-svn: 205955
2014-04-10 07:37:33 +00:00
Simon Atanasyan
fdc9dc7bfb Use range-based for loops. No functionality change.
llvm-svn: 205953
2014-04-10 06:02:49 +00:00
NAKAMURA Takumi
5041f185fc Fix abuse of StringRef on ARM64SysReg::MRSMapper::toString(Val, Valid).
FIXME: Could we use SmallString here?
llvm-svn: 205950
2014-04-10 03:05:59 +00:00
Saleem Abdulrasool
ee8a89e416 ARM64: add an explicit cast to silence a silly warning
GCC 4.8 complains with:
  warning: enumeral and non-enumeral type in conditional expression

Although this is silly and harmless in this case, add an explicit cast to
silence the warning.

llvm-svn: 205949
2014-04-10 02:48:10 +00:00
Juergen Ributzka
c19a8638e5 [ARM64] Fix immediate cost calculation for types larger than i64.
The immediate cost calculation code was hitting an assertion in the included
test case, because APInt was still internally 128-bits. Truncating it to 64-bits
fixed the issue.

Fixes <rdar://problem/16572521>.

llvm-svn: 205947
2014-04-10 01:36:59 +00:00
Reid Kleckner
52c8f727f3 Revert "For the ARM integrated assembler add checking of the alignments on vld/vst instructions. And report errors for alignments that are not supported."
It doesn't build with MSVC 2012, because MSVC doesn't allow union
members that have non-trivial default constructors.  This change added
'SMLoc AlignmentLoc' to MemoryOp, which made MemoryOp's default ctor
non-trivial.

This reverts commit r205930.

llvm-svn: 205944
2014-04-10 00:52:14 +00:00
Jim Grosbach
8314a2474c Fix to support properly cleaning up failed address sinking against constants
As it turns out the source of the sunkaddr can be a constant, in which case
there is not an instruction to delete, causing the cleanup code introduced in
r204833 to crash. This patch adds a dynamic check to ensure the deleted value is
in fact an instruction and not a constant.

Patch by Louis Gerbarg <lgg@apple.com>

llvm-svn: 205941
2014-04-10 00:27:45 +00:00
Jim Grosbach
366b84c436 Add support for load folding of avx1 logical instructions
AVX supports logical operations using an operand from memory. Unfortunately
because integer operations were not added until AVX2 the AVX1 logical
operation's types were preventing the isel from folding the loads. In a limited
number of cases the peephole optimizer would fold the loads, but most were
missed. This patch adds explicit patterns with appropriate casts in order for
these loads to be folded.

The included test cases run on reduced examples and disable the peephole
optimizer to ensure the folds are being pattern matched.

Patch by Louis Gerbarg <lgg@apple.com>

rdar://16355124

llvm-svn: 205938
2014-04-09 23:39:25 +00:00
Jim Grosbach
118d4cc5e8 SelectionDAG: Don't constant fold target-specific nodes.
FoldConstantArithmetic() only knows how to deal with a few target independent
ISD opcodes. Bail early if it sees a target-specific ISD node. These node do
funny things with operand types which may break the assumptions of the code
that follows, and there's no actual folding that can be done anyway. For example,
non-constant 256 bit vector shifts on X86 have a shift-amount operand that's a
128-bit v4i32 vector regardless of what the first operand type is and that breaks
the assumption that the operand types must match.

rdar://16530923

llvm-svn: 205937
2014-04-09 23:28:11 +00:00
Kevin Enderby
a2db407730 For the ARM integrated assembler add checking of the
alignments on vld/vst instructions.  And report errors for
alignments that are not supported.

While this is a large diff and an big test case, the changes
are very straight forward.  But pretty much had to touch
all vld/vst instructions changing the addrmode to one of the
new ones that where added will do the proper checking for
the specific instruction.

rdar://11312406

llvm-svn: 205930
2014-04-09 21:32:59 +00:00
Chad Rosier
0a639ba5a6 [AArch64] Implement the isZExtFree APIs.
llvm-svn: 205926
2014-04-09 20:51:21 +00:00
Chad Rosier
b6bf098390 [AArch64] Implement the isTruncateFree API.
In AArch64 i64 to i32 truncate operation is a subregister access.

This allows more opportunities for LSR optmization to eliminate
variables of different types (i32 and i64).

llvm-svn: 205925
2014-04-09 20:43:40 +00:00
Quentin Colombet
95c5120626 [DAGCombiner] DAG combine does not know how to combine indexed loads with
sign/zero/any extensions. However a few places were not checking properly the
property of the load and were turning an indexed load into a regular extended
load. Therefore the indexed value was lost during the process and this was
triggering an assertion.

<rdar://problem/16389332>

llvm-svn: 205923
2014-04-09 20:03:05 +00:00
Bob Wilson
9cddd45364 Simple fix for build failures resulting from r205867.
llvm-svn: 205918
2014-04-09 18:34:45 +00:00
David Majnemer
38564aab17 Revert "Revert "YAMLIO: Encode ambiguous hex strings explicitly""
Don't quote octal compatible strings if they are only two wide, they
aren't ambiguous.

This reverts commit r205857 which reverted r205857.

llvm-svn: 205914
2014-04-09 17:04:27 +00:00
David Majnemer
22c9e46ce7 obj2yaml: Don't crash if the characteristics field is zero
obj2yaml would fail when seeing a Weak External auxiliary record with a
characteristics field holding zero instead of one of
IMAGE_WEAK_EXTERN_SEARCH_NOLIBRARY, IMAGE_WEAK_EXTERN_SEARCH_NOLIBRARY,
or IMAGE_WEAK_EXTERN_SEARCH_NOLIBRARY.

llvm-svn: 205911
2014-04-09 16:38:15 +00:00
Justin Holewinski
b035f9f3e4 [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces
This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later).

llvm-svn: 205907
2014-04-09 15:39:15 +00:00
Justin Holewinski
80f133a62c [NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0.
llvm-svn: 205906
2014-04-09 15:39:11 +00:00
Justin Holewinski
256f672482 [NVPTX] Add query support for read-write images and managed variables
This also fixes a bug in the annotation cache where the cache will not be cleared between modules if multiple modules are compiled in the same process.

llvm-svn: 205905
2014-04-09 15:38:52 +00:00
Alp Toker
111bd28e59 Fix some doc and comment typos
llvm-svn: 205899
2014-04-09 14:47:27 +00:00