1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

8207 Commits

Author SHA1 Message Date
Daniel Sanders
cd6495240a Where Triple has a suitable predicate, use it rather than the enum values. NFC.
Reviewers: mcrosier

Subscribers: llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10960

llvm-svn: 241469
2015-07-06 16:33:18 +00:00
Peter Collingbourne
f49ef7d3ac IR: Do not consider available_externally linkage to be linker-weak.
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.

Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.

Differential Revision: http://reviews.llvm.org/D10941

llvm-svn: 241413
2015-07-05 20:52:35 +00:00
Benjamin Kramer
89b0e53c15 [TargetLowering] StringRefize asm constraint getters.
There is some functional change here because it changes target code from
atoi(3) to StringRef::getAsInteger which has error checking. For valid
constraints there should be no difference.

llvm-svn: 241411
2015-07-05 19:29:18 +00:00
Ranjeet Singh
2f642c039e Reverting r241058 because it's causing buildbot failures.
llvm-svn: 241061
2015-06-30 12:32:53 +00:00
Ranjeet Singh
9a787f3fa4 There are a few places where subtarget features are still
represented by uint64_t, this patch replaces these
usages with the FeatureBitset (std::bitset) type.

Differential Revision: http://reviews.llvm.org/D10542

llvm-svn: 241058
2015-06-30 11:30:42 +00:00
Tim Northover
f460862bbe ARM: add correct kill flags when combining stm instructions
When the store sequence being combined actually stores the base register, we
should not mark it as killed until the end.

rdar://21504262

llvm-svn: 241003
2015-06-29 21:42:16 +00:00
Javed Absar
013d6e555c [ARM]: Extend -mfpu options for half-precision and vfpv3xd
Some of the the permissible ARM -mfpu options, which are supported in GCC,
are currently not present in llvm/clang.This patch adds the options:
'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16.
These are related to half-precision floating-point and single precision.

Reviewers: rengolin, ranjeet.singh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10645

llvm-svn: 240930
2015-06-29 09:32:29 +00:00
Javed Absar
7e032f3626 [ARM] Cortex-R5 is not VFPOnlySP
This patch fixes the error in ARM.td which stated that Cortex-R5
floating point unit can do only single precision, when it can do double as well.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10769

llvm-svn: 240799
2015-06-26 17:42:37 +00:00
Javed Absar
d4144d8279 [ARM] Cortex-R4F is not VFPOnlySP
Cortex-R4F TRM states that fpu supports both single and double precision.
This patch corrects the information in ARM.td file and corresponding test.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10763

llvm-svn: 240776
2015-06-26 12:14:56 +00:00
Rafael Espindola
9b4c1d87c6 Optimize the creation of mapping symbols.
No need to create two symbols just to assign one to the other.

llvm-svn: 240773
2015-06-26 11:31:13 +00:00
Hao Liu
fc6114fe0f [ARM] Lower interleaved memory accesses to vldN/vstN intrinsics.
This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240755
2015-06-26 02:45:36 +00:00
Benjamin Kramer
4ed07455af Replace copy-pasted debug value skipping with MBB::getLastNonDebugInstr
No functional change intended.

llvm-svn: 240639
2015-06-25 13:28:24 +00:00
Matthias Braun
a43ec6b1f8 ARMLoadStoreOptimizer: Fix errata 602117 handling and make testcase actually test for it
This fixes PR23912

Differential Revision: http://reviews.llvm.org/D10620

llvm-svn: 240582
2015-06-24 20:03:27 +00:00
John Brawn
b5072dd408 [ARM] ARMLoadStoreOpt::UpdateBaseRegUses should stop on def
When UpdateBaseRegUses sees an instruction that defines the base
register it must stop, as the base register value it is updating is no
longer live. Ideally we would already have seen the register be killed
(which is already checked for), but the kill flags may be inaccurate
and we have to account for this.

Differential Revision: http://reviews.llvm.org/D10566

llvm-svn: 240424
2015-06-23 16:02:11 +00:00
Alexander Kornienko
f993659b8f Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
Pete Cooper
e49cb88496 Change .thumb_set to have the same error checks as .set.
According to the documentation, .thumb_set is 'the equivalent of a .set directive'.

We didn't have equivalent behaviour in terms of all the errors we could throw, for
example, when a symbol is redefined.

This change refactors parseAssignment so that it can be used by .set and .thumb_set
and implements tests for .thumb_set for all the errors thrown by that method.

Reviewed by Rafael Espíndola.

llvm-svn: 240318
2015-06-22 19:35:57 +00:00
Alexander Kornienko
40cb19d802 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
Ahmed Bougacha
57ebe7191c [ARM] Look through concat when lowering in-place shuffles (VZIP, ..)
Currently, we canonicalize shuffles that produce a result larger than
their operands with:
  shuffle(concat(v1, undef), concat(v2, undef))
->
  shuffle(concat(v1, v2), undef)

because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).

This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.

We can look through the concat when lowering them:
  shuffle(concat(v1, v2), undef)
->
  concat(VZIP(v1, v2):0, :1)

This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.

Differential Revision: http://reviews.llvm.org/D10424

llvm-svn: 240118
2015-06-19 02:32:35 +00:00
Ahmed Bougacha
d492dfd93e [ARM] Factor out two-result shuffle matching. NFCI.
In preparation for a future patch: makes it easier to do the same
matching to generate different nodes, without duplication.

llvm-svn: 240116
2015-06-19 02:25:01 +00:00
Eric Christopher
0b2dfae3ba Fix "the the" in comments.
llvm-svn: 240112
2015-06-19 01:53:21 +00:00
Daniel Sanders
134c99480b Clean up redundant copies of Triple objects. NFC
Summary:

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10382

llvm-svn: 239823
2015-06-16 15:44:21 +00:00
Matthias Braun
729d1d707d Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler()
r213101 changed the behaviour of this method to not only affect the
PostMachineScheduler scheduler but also the PostRAScheduler scheduler,
renaming should make this fact clear. Also document that the preferred
way is to specify this in the scheduling model instead of overriding
this method.

Differential Revision: http://reviews.llvm.org/D10427

llvm-svn: 239659
2015-06-13 03:42:16 +00:00
Matthias Braun
e311841a60 MachineLICM: Use TargetSchedModel instead of just itineraries
This will use Itinieraries if available, but will also work if just a
MCSchedModel is available.

Differential Revision: http://reviews.llvm.org/D10428

llvm-svn: 239658
2015-06-13 03:42:11 +00:00
Daniel Sanders
ac381fec59 Replace string GNU Triples with llvm::Triple in TargetMachine. NFC.
Summary:
For the moment, TargetMachine::getTargetTriple() still returns a StringRef.

This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10362

llvm-svn: 239554
2015-06-11 19:41:26 +00:00
Ahmed Bougacha
ee490f0abc [CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC.
llvm-svn: 239553
2015-06-11 19:30:37 +00:00
Daniel Sanders
b5daf5341a Replace string GNU Triples with llvm::Triple in computeDataLayout(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, jfb, rengolin

Differential Revision: http://reviews.llvm.org/D10361

llvm-svn: 239538
2015-06-11 15:34:59 +00:00
Reid Kleckner
8d217e6b48 Revert "Move dllimport name mangling to IR mangler."
This reverts commit r239437.

This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol
directly instead of using it to do an indirect function call.

llvm-svn: 239502
2015-06-11 01:31:48 +00:00
Daniel Sanders
e37ebd59c5 Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10311

llvm-svn: 239467
2015-06-10 12:11:26 +00:00
Daniel Sanders
326a8d5bed Replace string GNU Triples with llvm::Triple in create*MCRelocationInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10307

llvm-svn: 239465
2015-06-10 10:54:40 +00:00
Daniel Sanders
2da676c315 Replace string GNU Triples with llvm::Triple in MCAsmBackend subclasses and create*AsmBackend(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: echristo, rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10243

llvm-svn: 239464
2015-06-10 10:35:34 +00:00
Peter Collingbourne
6f8524df44 Move dllimport name mangling to IR mangler.
This ensures that LTO clients see the correct external symbol name.

Differential Revision: http://reviews.llvm.org/D10318

llvm-svn: 239437
2015-06-09 22:09:53 +00:00
Akira Hatanaka
c42437a4f8 Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Aaron Ballman
0856283b6f Removing spurious semi colons; NFC.
llvm-svn: 239399
2015-06-09 12:03:46 +00:00
Matt Arsenault
8c9e05929c MC: Add target hook to control symbol quoting
llvm-svn: 239370
2015-06-09 00:31:39 +00:00
Akira Hatanaka
76bd57e472 [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717

llvm-svn: 239325
2015-06-08 18:50:43 +00:00
Pete Cooper
226ceaabfc Remove includes of MCMachOSymbolFlags.h after it was deleted
llvm-svn: 239318
2015-06-08 17:25:57 +00:00
Peter Collingbourne
3003aae9ae Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."
as it caused miscompilations and assertion failures (PR23768,
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).

llvm-svn: 239169
2015-06-05 18:01:28 +00:00
Benjamin Kramer
eaeddf4864 [ARM] Make helper function static.
This one had a declaration but it differed from the definition so the
declaration was actually dead.

llvm-svn: 239157
2015-06-05 14:32:54 +00:00
John Brawn
4ef1f45b4f [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238

llvm-svn: 239151
2015-06-05 13:31:19 +00:00
John Brawn
38460915a2 [ARM] Add knowledge of FPU subtarget features to TargetParser
Add getFPUFeatures to TargetParser, which gets the list of subtarget features
that are enabled/disabled for each FPU, and use it when handling the .fpu
directive.

No functional change in this commit, though clang will start behaving
differently once it starts using this.

Differential Revision: http://reviews.llvm.org/D10237

llvm-svn: 239150
2015-06-05 13:29:24 +00:00
Jim Grosbach
e76e79548b MC: Clean up naming in MCObjectWriter. NFC.
s/WriteObject/writeObject/
s/RecordRelocation/recordRelocation/
s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/
s/Write8/write8/
s/WriteLE16/writeLE16/
s/WriteLE32/writeLE32/
s/WriteLE64/writeLE64/
s/WriteBE16/writeBE16/
s/WriteBE32/writeBE32/
s/WriteBE64/writeBE64/
s/Write16/write16/
s/Write32/write32/
s/Write64/write64/
s/WriteZeroes/writeZeroes/
s/WriteBytes/writeBytes/

llvm-svn: 239108
2015-06-04 22:24:41 +00:00
Ahmed Bougacha
59fec45c7d [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054

llvm-svn: 239087
2015-06-04 20:39:23 +00:00
Jim Grosbach
3a8310cc67 MC: Remove obsolete MachO UseAggressiveSymbolFolding.
Fix the FIXME and remove this old as(1) compat option. It was useful for
bringup of the integrated assembler to diff object files, but now it's
just causing more relocations than strictly necessary to be generated.

rdar://21201804

llvm-svn: 239084
2015-06-04 20:27:42 +00:00
Daniel Sanders
06c811431c Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC.
Summary:
This is the first of several patches to eliminate StringRef forms of GNU
triples from the internals of LLVM. After this is complete, GNU triples
will be replaced by a more authoratitive representation in the form of
an LLVM TargetTuple.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10236

llvm-svn: 239036
2015-06-04 13:12:25 +00:00
Rafael Espindola
f7db6d4a8a Remove MCELFSymbolFlags.h. It is now internal to MCSymbolELF.
llvm-svn: 238996
2015-06-04 00:47:43 +00:00
Rafael Espindola
5d3bece91c Remove getOrCreateSymbolData. There is no MCSymbolData anymore.
llvm-svn: 238952
2015-06-03 19:03:11 +00:00
Matthias Braun
80e08ad23e ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.

llvm-svn: 238935
2015-06-03 16:30:24 +00:00
Daniel Sanders
88d2b6b8ee [arm] Fix r238921. We must handle Constraint_i too.
llvm-svn: 238925
2015-06-03 14:17:18 +00:00
Daniel Sanders
d73d802d93 [arm] Distinguish the /U[qytnms]/, 'Uv', 'Q', and 'm' inline assembly memory constraints.
Summary:
But still handle them the same way since I don't know how they differ on
this target.

Of these, /U[qytnms]/ do not have backend tests but are accepted by clang.

No functional change intended.

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D8203

llvm-svn: 238921
2015-06-03 12:33:56 +00:00
Rafael Espindola
2e71400006 Pass a MCSymbolELF to a few ELF only functions. NFC.
llvm-svn: 238868
2015-06-02 21:30:13 +00:00
Rafael Espindola
1fc8198d33 Merge MCELF.h into MCSymbolELF.h.
Now that we have a dedicated type for ELF symbol, these helper functions can
become member function of MCSymbolELF.

llvm-svn: 238864
2015-06-02 20:38:46 +00:00
Renato Golin
ff58af5431 Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs"
This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot.

Since self-hosting issues with Clang are hard to investigate, I'm taking the
liberty to revert now, so we can investigate it offline.

llvm-svn: 238821
2015-06-02 11:47:30 +00:00
Matthias Braun
7f4928846c ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

llvm-svn: 238795
2015-06-01 23:27:08 +00:00
Matthias Braun
4db11611d9 ARMLoadStoreOptimizer: Fix doxygen comments; NFC
llvm-svn: 238784
2015-06-01 21:26:23 +00:00
Luke Cheeseman
4fff08f837 Re-commit of r238201 with fix for building with shared libraries.
llvm-svn: 238739
2015-06-01 12:02:47 +00:00
Matt Arsenault
b0334192af Add address space argument to isLegalAddressingMode
This is important because of different addressing modes
depending on the address space for GPU targets.

This only adds the argument, and does not update
any of the uses to provide the correct address space.

llvm-svn: 238723
2015-06-01 05:31:59 +00:00
NAKAMURA Takumi
1533760873 ARMConstantIslandPass.cpp: Prune an empty \brief. [-Wdocumentation]
llvm-svn: 238697
2015-05-31 23:05:35 +00:00
Tim Northover
0eb976c493 ARM: recommit r237590: allow jump tables to be placed as constant islands.
The original version didn't properly account for the base register
being modified before the final jump, so caused miscompilations in
Chromium and LLVM. I've fixed this and tested with an LLVM self-host
(I don't have the means to build & test Chromium).

The general idea remains the same: in pathological cases jump tables
can be too far away from the instructions referencing them (like other
constants) so they need to be movable.

Should fix PR23627.

llvm-svn: 238680
2015-05-31 19:22:07 +00:00
Renato Golin
068c4bcd5c Comment change. NFC
That comment misleads the current discussions in mentioned bug. Leave
the discussions to the bug. Also, adding a future change FIXME.

llvm-svn: 238653
2015-05-30 10:44:07 +00:00
Renato Golin
c0a1933df0 [ARMTargetParser] Move IAS arch ext parser. NFC
The plan was to move the whole table into the already existing ArchExtNames
but some fields depend on a table-generated file, and we don't yet have this
feature in the generic lib/Support side.

Once the minimum target-specific table-generated files are available in a
generic fashion to these libraries, we'll have to keep it in the ASM parser.

llvm-svn: 238651
2015-05-30 10:30:02 +00:00
Jim Grosbach
30efd68a58 MC: Clean up MCExpr naming. NFC.
llvm-svn: 238634
2015-05-30 01:25:56 +00:00
Rafael Espindola
27d0b57917 Remove getData.
This completes the mechanical part of merging MCSymbol and MCSymbolData.

llvm-svn: 238617
2015-05-29 21:45:01 +00:00
Rafael Espindola
b365b7fead Remove the MCSymbolData typedef.
The getData member function is next.

llvm-svn: 238611
2015-05-29 20:41:47 +00:00
Rafael Espindola
caf32aa8f6 Rename getOrCreateSymbolData to registerSymbol and return void.
Another step in merging MCSymbol and MCSymbolData.

llvm-svn: 238607
2015-05-29 20:21:02 +00:00
Rafael Espindola
d72e3e8f02 Pass MCSymbols to the helper functions in MCELF.h.
llvm-svn: 238596
2015-05-29 18:47:23 +00:00
Rafael Espindola
72bab9e25e Pass a MCSymbol to needsRelocateWithSymbol.
llvm-svn: 238589
2015-05-29 18:26:09 +00:00
Matthias Braun
4eae512569 CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands
MIOperands/ConstMIOperands are classes iterating over the MachineOperand
of a MachineInstr, however MachineInstr::mop_iterator does the same
thing.

I assume these two iterators exist to have a uniform interface to
iterate over the operands of a machine instruction bundle and a single
machine instruction. However in practice I find it more confusing to have 2
different iterator classes, so this patch transforms (nearly all) the
code to use mop_iterators.

The only exception being MIOperands::anlayzePhysReg() and
MIOperands::analyzeVirtReg() still needing an equivalent, I leave that
as an exercise for the next patch.

Differential Revision: http://reviews.llvm.org/D9932

This version is slightly modified from the proposed revision in that it
introduces MachineInstr::getOperandNo to avoid the extra counting
variable in the few loops that previously used MIOperands::getOperandNo.

llvm-svn: 238539
2015-05-29 02:56:46 +00:00
Rafael Espindola
be6f69a9cc Remove a trivial forwarding function. NFC.
llvm-svn: 238506
2015-05-28 21:36:02 +00:00
Peter Collingbourne
688547e6fc Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM.
We were previously codegen'ing these as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach
the register allocator to allocate the registers in ascending order. This
never got implemented, and up to now we've been stuck with very poor codegen.

A much simpler approach for achiveing better codegen is to create LDM/STM
instructions with identical sets of virtual registers, let the register
allocator pick arbitrary registers and order register lists when printing an
MCInst. This approach also avoids the need to repeatedly calculate offsets
which ultimately ought to be eliminated pre-RA in order to decrease register
pressure.

This is implemented by lowering the memcpy intrinsic to a series of SD-only
MCOPY pseudo-instructions which performs a memory copy using a given number
of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a
little unusual, but it avoids the need to encode register lists in the SD,
and we can take advantage of SD use lists to decide whether to use the _UPD
variant of the instructions.

Fixes PR9199.

Differential Revision: http://reviews.llvm.org/D9508

llvm-svn: 238473
2015-05-28 20:02:45 +00:00
Renato Golin
b26209eb93 ARMTargetParser: Normalising build attributes
Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu
strings are using ARMTargetParser, it's time to make it a bit more conforming
with what the ABI says.

This commit adds some clarification on what build attributes are accepted and
which are "non-standard". It also makes clear that the "defaultCPU" and
"defaultArch" methods were really just build attribute getters.

It also diverges from GCC's behaviour to say that armv2/armv3 are really an
ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4.

llvm-svn: 238344
2015-05-27 18:15:37 +00:00
Rafael Espindola
2ada7efad8 Use operator<< instead of print in a few more places.
llvm-svn: 238315
2015-05-27 13:05:42 +00:00
Matthias Braun
ab18829bd2 ARMLoadStoreOptimizer: Code cleanup; NFC
llvm-svn: 238289
2015-05-27 05:12:40 +00:00
Diego Novillo
4ef8ed56e4 Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html

llvm-svn: 238223
2015-05-26 17:45:38 +00:00
Luke Cheeseman
e0014ad127 Re-commit changes in r237579 with fix for bug breaking windows builds.
llvm-svn: 238201
2015-05-26 13:40:31 +00:00
Luke Cheeseman
27ebd5e2f5 Test Commit
llvm-svn: 238199
2015-05-26 13:10:35 +00:00
Michael Kuperstein
9b9d97f26a Use std::bitset for SubtargetFeatures.
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures.
Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. 
This should now be fixed.

llvm-svn: 238192
2015-05-26 10:47:10 +00:00
Rafael Espindola
4f9f9a1d0f Stop using MCSectionData in MCMachObjectWriter.h.
llvm-svn: 238165
2015-05-26 01:15:30 +00:00
Rafael Espindola
0173a2073d Stop using MCSectionData in MCExpr.h.
llvm-svn: 238163
2015-05-26 00:52:18 +00:00
Rafael Espindola
aea8a3a003 Return a MCSection from MCFragment::getParent().
Another step in merging MCSectionData and MCSection.

llvm-svn: 238162
2015-05-26 00:36:57 +00:00
Rafael Espindola
668a2fc955 Stop forwarding getOrdinal and setOrdinal.
llvm-svn: 238139
2015-05-25 14:12:48 +00:00
Benjamin Kramer
a4cf7de209 [AArch64] Clean up the ELF streamer a bit.
llvm-svn: 238102
2015-05-23 16:39:10 +00:00
Akira Hatanaka
626316f0c3 Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions.
This is part of the work to remove TargetMachine::resetTargetOptions.

In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.

There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim". 

llvm-svn: 238080
2015-05-23 01:14:08 +00:00
Chad Rosier
d38d07cca6 Use new MachineInstr mayLoadOrStore() API. NFC.
llvm-svn: 238044
2015-05-22 20:07:34 +00:00
John Brawn
7ffe468a60 [ARM] Fix typo in subtarget feature list for 7em triple
The list of subtarget features for the 7em triple contains 't2xtpk',
which actually disables that subtarget feature. Correct that to
'+t2xtpk' and test that the instructions enabled by that feature do
actually work.

Differential Revision: http://reviews.llvm.org/D9936

llvm-svn: 238022
2015-05-22 14:16:22 +00:00
Peter Collingbourne
0964ae6e51 Revert r237590, "ARM: allow jump tables to be placed as constant islands."
Caused a miscompile of the Android port of Chromium, details
forthcoming.

llvm-svn: 237972
2015-05-21 23:20:55 +00:00
Rafael Espindola
dda3f1317e Move alignment from MCSectionData to MCSection.
This starts merging MCSection and MCSectionData.

There are a few issues with the current split between MCSection and
MCSectionData.

* It optimizes the the not as important case. We want the production
of .o files to be really fast, but the split puts the information used
for .o emission in a separate data structure.

* The ELF/COFF/MachO hierarchy is not represented in MCSectionData,
leading to some ad-hoc ways to represent the various flags.

* It makes it harder to remember where each item is.

The attached patch starts merging the two by moving the alignment from
MCSectionData to MCSection.

Most of the patch is actually just dropping 'const', since
MCSectionData is mutable, but MCSection was not.

llvm-svn: 237936
2015-05-21 19:20:38 +00:00
Davide Italiano
4ae19d2fd4 [Target/ARM] Only enable OptimizeBarrierPass at -O1 and above.
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.

Discussed with Tim Northover on IRC.

llvm-svn: 237837
2015-05-20 21:40:38 +00:00
Matthias Braun
e7695127e5 ARM: Fix comment and make it slightly more readable
llvm-svn: 237820
2015-05-20 18:40:06 +00:00
Duncan P. N. Exon Smith
1f4ff82e51 MC: Use MCSymbol in MachObjectWriter, NFC
Replace uses of `MCSymbolData` with `MCSymbol` where both are needed, so
we can remove the backpointer.

llvm-svn: 237799
2015-05-20 15:16:14 +00:00
Duncan P. N. Exon Smith
1b0c97740a MC: Take MCSymbol in MachObjectWriter::getSymbolAddress(), NFC
Pass through an `MCSymbol` instead of an `MCSymbolData` so we can get
rid of the back pointer.

llvm-svn: 237750
2015-05-20 00:02:39 +00:00
Duncan P. N. Exon Smith
4c0dd00c17 MC: Use MCSymbol in MCAsmLayout::getSymbolOffset(), NFC
Continue to canonicalize on MCSymbol instead of MCSymbolData when both
are needed.

llvm-svn: 237749
2015-05-19 23:53:20 +00:00
Matthias Braun
0e49c41528 MachineInstr: Remove unused parameter.
llvm-svn: 237726
2015-05-19 21:22:20 +00:00
David Blaikie
0be3b52a8f Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only
llvm-svn: 237624
2015-05-18 22:13:54 +00:00
Matthias Braun
4662acc8ae MachineInstr: Change return value of getOpcode() to unsigned.
This was previously returning int. However there are no negative opcode
numbers and more importantly this was needlessly different from
MCInstrDesc::getOpcode() (which even is the value returned here) and
SDValue::getOpcode()/SDNode::getOpcode().

llvm-svn: 237611
2015-05-18 20:27:55 +00:00
Jim Grosbach
95c79d189f MC: Clean up method names in MCContext.
The naming was a mish-mash of old and new style. Update to be consistent
with the new. NFC.

llvm-svn: 237594
2015-05-18 18:43:14 +00:00
Tim Northover
42f554c12a ARM: allow jump tables to be placed as constant islands.
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.

This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.

rdar://20813304

llvm-svn: 237590
2015-05-18 17:10:40 +00:00
Oliver Stannard
d562deac4b Revert r237579, as it broke windows buildbots
llvm-svn: 237583
2015-05-18 16:39:16 +00:00
Oliver Stannard
8fe6b2aa15 [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699

llvm-svn: 237579
2015-05-18 16:23:33 +00:00
Duncan P. N. Exon Smith
d7b04389e5 MC: Use MCSymbol in RelAndSymbol, NFC
Switch from `MCSymbolData` to `MCSymbol`.

llvm-svn: 237502
2015-05-16 01:14:19 +00:00
Jim Grosbach
0c6b91deee MC: MCCodeGenInfo naming update. NFC.
s/InitMCCodeGenInfo/initMCCodeGenInfo/

llvm-svn: 237471
2015-05-15 19:13:31 +00:00
Jim Grosbach
eb68de6ea2 MC: Update MCCodeEmitter naming. NFC.
s/EncodeInstruction/encodeInstruction/

llvm-svn: 237469
2015-05-15 19:13:16 +00:00
Jim Grosbach
6eeec2791d MC: Update MCFixup naming. NFC.
s/MCFixup::Create/MCFixup::create/

llvm-svn: 237468
2015-05-15 19:13:05 +00:00
Artyom Skrobov
b1d8d6d276 Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases
No longer breaks SPEC2000/2006

llvm-svn: 237361
2015-05-14 12:59:46 +00:00
Tim Northover
c446a370d7 ARM: remove possible vestiges of the legacy JIT???
There's no need to manually pass modifier strings around to tell an operand how
to print now, that information is encoded in the operand itself since the MC
layer came along.

llvm-svn: 237295
2015-05-13 20:28:41 +00:00
Tim Northover
47431217a2 ARM: remove custom jump table UID
We were creating and propagating two separate indices for each jump table (from
back in the mists of time). However, the generic index used by other backends
is sufficient to emit a unique symbol so this was unneeded.

llvm-svn: 237294
2015-05-13 20:28:38 +00:00
Tim Northover
bd66fdc24b ARM: refactor optimizeThumb2JumpTables.
The previous logic mixed 2 separate questions:
  + Can we form a TBB/TBH instruction?
  + Can we remove the jump-table calculation before it?

It then performed a bunch of random tests on the instructions earlier in the
basic block, which were probably sufficient to answer 2 but only because of the
very limited ways in which a t2BR_JT can actually be created.

For example there's no reason to expect the LeaInst to define the same base
register as the following indexing calulation. In practice this means we might
have missed opportunities to form TBB/TBH, in theory you could end up
misidentifying a sequence and removing the wrong LEA:

     %R1 = t2LEApcrelJT ...
     %R2 = t2LEApcrelJT ...
     <... using and killing %R2 ...>
     %R2 = t2ADDr %R1, $Ridx

Before we would have looked for an LEA defining %R2 and found the wrong one. We
just got lucky that jump table setup was (almost?) always confined to a single
basic block and there was only one jump table per block.

llvm-svn: 237293
2015-05-13 20:28:32 +00:00
Jim Grosbach
b635db1046 MC: Modernize MCOperand API naming. NFC.
MCOperand::Create*() methods renamed to MCOperand::create*().

llvm-svn: 237275
2015-05-13 18:37:00 +00:00
Silviu Baranga
452d76e681 Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006
llvm-svn: 237256
2015-05-13 14:03:18 +00:00
Artyom Skrobov
b27364c475 [AArch64] Codegen VMAX/VMIN for safe math cases
llvm-svn: 237247
2015-05-13 12:01:09 +00:00
Michael Kuperstein
5efc4deda0 Reverting r237234, "Use std::bitset for SubtargetFeatures"
The buildbots are still not satisfied.
MIPS and ARM are failing (even though at least MIPS was expected to pass).

llvm-svn: 237245
2015-05-13 10:28:46 +00:00
Michael Kuperstein
56a8e05a6b Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first two times this was committed (r229831, r233055), it caused several buildbot failures. 
At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed.

llvm-svn: 237234
2015-05-13 08:27:08 +00:00
Matthias Braun
552ff1fcaf Revert "ARM: Remove Itineraries for swift CPU"
Reverting until I figure out the new lit failures.

This reverts commit r237179.

llvm-svn: 237189
2015-05-12 21:28:39 +00:00
Matthias Braun
9727b417a6 ARM: Remove Itineraries for swift CPU
They do more harm than good when used in the MachineScheduler as they
tend to take preference to register pressure minimsation which is more
important for swift.

Differential Revision: http://reviews.llvm.org/D9718

llvm-svn: 237179
2015-05-12 21:07:54 +00:00
Douglas Katzman
0a133ecd6e Strip trailing whitespace. NFC
llvm-svn: 237165
2015-05-12 19:42:31 +00:00
John Brawn
c74bcc2521 [ARM] Use AEABI aligned function variants
AEABI defines aligned variants of memcpy etc. that can be faster than
the default version due to not having to do alignment checks. When
emitting target code for these functions make use of these aligned
variants if possible. Also convert memset to memclr if possible.

Differential Revision: http://reviews.llvm.org/D8060

llvm-svn: 237127
2015-05-12 13:13:38 +00:00
Renato Golin
0b5f530e15 Change TargetParser enum names to avoid macro conflicts (llvm)
sys/time.h on Solaris (and possibly other systems) defines "SEC" as "1"
using a cpp macro.  The result is that this fails to compile.

Fixes https://llvm.org/PR23482

llvm-svn: 237112
2015-05-12 10:33:58 +00:00
Eric Christopher
2ba04d1116 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

llvm-svn: 237079
2015-05-12 01:26:05 +00:00
Davide Italiano
2db346656e [Target/ARM] Remove unused 'private' from class.
Differential Revision:	http://reviews.llvm.org/D9611
Reviewed by:	rengolin

llvm-svn: 236918
2015-05-08 23:58:28 +00:00
Arnold Schwaighofer
d6f4926afa ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

llvm-svn: 236916
2015-05-08 23:52:00 +00:00
Renato Golin
d125d9551c TargetParser: FPU/ARCH/EXT parsing refactory - NFC
This new class in a global context contain arch-specific knowledge in order
to provide LLVM libraries, tools and projects with the ability to understand
the architectures. For now, only FPU, ARCH and ARCH extensions on ARM are
supported.

Current behaviour it to parse from free-text to enum values and back, so that
all users can share the same parser and codes. This simplifies a lot both the
ASM/Obj streamers in the back-end (where this came from), and the front-end
parsers for command line arguments (where this is going to be used next).

The previous implementation, using .def/.h includes is deprecated due to its
inflexibility to be built without the backend support and for being too
cumbersome. As more architectures join this scheme, and as more features of
such architectures are added (such as hardware features, type sizes, etc) into
a full blown TargetDescription class, having a set of classes is the most
sane implementation.

The ultimate goal of this refactor both LLVM's and Clang's target description
classes into one unique interface, so that we can de-duplicate and standardise
the descriptions, as well as make it available for other front-ends, tools,
etc.

The FPU parsing for command line options in Clang has been converted to use
this new library and a number of aliases were added for compatibility:
 * A bogus neon-vfpv3 alias (neon defaults to vfp3)
 * armv5/v6
 * {fp4/fp5}-{sp/dp}-d16

Next steps:
 * Port Clang's ARCH/EXT parsing to use this library.
 * Create a TableGen back-end to generate this information.
 * Run this TableGen process regardless of which back-ends are built.
 * Expose more information and rename it to TargetDescription.
 * Continue re-factoring Clang to use as much of it as possible.

llvm-svn: 236900
2015-05-08 21:04:27 +00:00
Matthias Braun
7d098beb63 Fix typo.
llvm-svn: 236785
2015-05-07 22:16:10 +00:00
Matthias Braun
3b3ecc12b2 Change getTargetNodeName() to produce compiler warnings for missing cases, fix them
llvm-svn: 236775
2015-05-07 21:33:59 +00:00
Wei Mi
338b822ab9 [X86] Disable loop unrolling in loop vectorization pass when VF is 1.
The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture,
by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce
the cost of overflow check, memory boundary check and extra prologue/epilogue code when
regular unroller will unroll the loop another time. Disable it when VF==1 remove the
unnecessary cost on x86. The same can be done for other platforms after verifying
interleaving/memory bound checking to be not perf critical on those platforms.

Differential Revision: http://reviews.llvm.org/D9515

llvm-svn: 236613
2015-05-06 17:12:25 +00:00
Pete Cooper
a188cd1607 [ARM] Fast-Isel was incorrectly selecting <2 x double> adds.
With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add.

However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it."

This commit disables SelectBinaryFPOp for any vector types.  There's already a FIXME to try handle neon.  Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64'

llvm-svn: 236609
2015-05-06 16:39:17 +00:00
Artyom Skrobov
fbdc6e53f2 [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too
llvm-svn: 236590
2015-05-06 11:44:10 +00:00
Ahmed Bougacha
49254b5c2e [ARM][FastISel] Use TST #1 instead of CMP #0 for select.
Since r234249, i1 are sext instead of zext; because of that, doing
"CMP rN, #0; IT EQ/NE" isn't correct anymore.

"TST #1" is the conservatively correct alternative - the tradeoff being
that it doesn't have a 16-bit encoding -, so use that instead.

llvm-svn: 236569
2015-05-06 04:14:02 +00:00
Peter Collingbourne
549025e337 Thumb2SizeReduction: Check the correct set of registers for LDMIA.
The register set for LDMIA begins at offset 3, not 4. We were previously
missing the short encoding of this instruction in the case where the base
register was the first register in the register set.

Also clean up some dead code:

- The isARMLowRegister check is redundant with what VerifyLowRegs does;
  replace with an assert.
- Remove handling of LDMDB instruction, which has no short encoding (and
  does not appear in ReduceTable).

Differential Revision: http://reviews.llvm.org/D9485

llvm-svn: 236535
2015-05-05 20:07:10 +00:00
Quentin Colombet
c82cc9dc57 [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>

llvm-svn: 236507
2015-05-05 17:38:16 +00:00
Pete Cooper
7f7b570248 [ARM] IT block insertion needs to update kill flags
When forming an IT block from the first MOV here:

	%R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg
	%R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg

the move in to R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction.

However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, ie, R0 here.

This appeases the machine verifier which thought that R0 wasn't defined when used.

I have a test case, but its extremely register allocator specific.  It would be too fragile to commit a test which depends on the register allocator here.

llvm-svn: 236468
2015-05-04 22:44:47 +00:00
Pete Cooper
2c1b4fde0b [ARM] Transfer the internal flag in thumb2 size reduction.
Converting from t2LDRs to tLDRr caused the shift argument to drop the internal flag.  This would then throw machine verifier errors.

Unfortunately i'm having trouble reducing a test case.  I'm going to keep trying, but so far its a scary combination of machine sinking, an 'and i1', loads feeding loads, and a bunch of code which shouldn't change IT block formation, but does.  Its not useful to commit a test in that state as we have no way of knowing if it even hits this code reliably in future.

rdar://problem/20752113

llvm-svn: 236333
2015-05-01 18:57:32 +00:00
Peter Collingbourne
daa528e28b ARM: Align functions containing Thumb-2 jump tables to 4 bytes.
Functions with jump tables need an alignment of 4 because they use the ADR
instruction, which aligns the PC to 4 bytes before adding an offset.

Differential Revision: http://reviews.llvm.org/D9424

llvm-svn: 236327
2015-05-01 18:05:59 +00:00
Pete Cooper
9851cd43b2 [ARM] optimizeSelect should clear kill flags.
If we move an instruction from one block down to a MOVC and predicate it,
then the original instruction could be moved in to a loop.  In this case,
its invalid for any kill flags to remain on there.

Fails with -verfy-machineinstrs.

rdar://problem/20752113

llvm-svn: 236290
2015-04-30 23:57:47 +00:00
Pete Cooper
90ff562042 Don't always apply kill flag in thumb2 ABS pseudo expansion.
The expansion for t2ABS was always setting the kill flag on the rsb instruction.
It should instead only be set on rsb if it was set on the original ABS instruction.

rdar://problem/20752113

llvm-svn: 236272
2015-04-30 22:15:59 +00:00
Quentin Colombet
bc9c5a5af1 [ARM] Do not generate invalid encoding for stack adjust, even if this is just
temporary.

Because of that:
1. The machine verifier was complaining on such code.
2. The generate code worked just because the thumb reduction size pass fixed the
opcode.

rdar://problem/20749824

llvm-svn: 236247
2015-04-30 18:52:49 +00:00
Tim Northover
73f7ef2b21 ARM: mark branch-like instructions with correct flags.
There's probably no way to test BXJ, but if the compiler ever did emit it
during CodeGen it would have to be a block terminator so "isBranch" is
appropriate.

BLX is more tricky. Clearly a call, but it affects surprisingly little.

rdar://18719544

llvm-svn: 236140
2015-04-29 19:16:38 +00:00
Tim Northover
5bf87deee6 ARM: fix peephole optimisation of TST
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.

rdar://20721342

llvm-svn: 236050
2015-04-28 22:03:55 +00:00
Sergey Dmitrouk
7bfbc12128 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235989
2015-04-28 14:05:47 +00:00
Daniel Jasper
39180626db Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

llvm-svn: 235987
2015-04-28 13:38:35 +00:00
Sergey Dmitrouk
01a4dcd3bb [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

llvm-svn: 235977
2015-04-28 11:56:37 +00:00
Matthias Braun
381ec865bc Cleanup, remove unused return value
llvm-svn: 235952
2015-04-28 00:37:05 +00:00
Benjamin Kramer
fe42e95d9c [ARM] Simplify code. NFC.
llvm-svn: 235803
2015-04-25 17:25:13 +00:00
Lang Hames
5d79e39b45 [AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr.
AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a
reference for this is crufty.

llvm-svn: 235752
2015-04-24 19:11:51 +00:00
Peter Collingbourne
249a230d23 Thumb2: When applying branch optimizations, visit branches in reverse order.
The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.

Differential Revision: http://reviews.llvm.org/D9185

llvm-svn: 235640
2015-04-23 20:31:35 +00:00
Peter Collingbourne
685edd3002 ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.

This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.

Differential Revision: http://reviews.llvm.org/D9184

llvm-svn: 235639
2015-04-23 20:31:32 +00:00
Peter Collingbourne
e778bcdbc1 Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero.
This allows the constant island pass to lower these branches to cbn?z
instructions, resulting in a shorter instruction sequence.

Differential Revision: http://reviews.llvm.org/D9183

llvm-svn: 235638
2015-04-23 20:31:30 +00:00
Peter Collingbourne
2930fae4b8 ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets.
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.

Differential Revision: http://reviews.llvm.org/D9165

llvm-svn: 235637
2015-04-23 20:31:26 +00:00
Peter Collingbourne
337509326a ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").

Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.

Differential Revision: http://reviews.llvm.org/D9138

llvm-svn: 235636
2015-04-23 20:31:22 +00:00
Vladimir Sukharev
ca216126ff [ARM] Add v8.1a "Privileged Access Never" extension
Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8504

llvm-svn: 235087
2015-04-16 11:34:25 +00:00
Charlie Turner
9657b6aa81 Fix BXJ is undefined in AArch32.
BXJ was incorrectly said to be unsupported in ARMv8-A. It is not
supported in the A64 instruction set, but it is supported in the T32
and A32 instruction sets, because it's listed as an instruction in the
ARM ARM section F7.1.28.

Using SP as an operand to BXJ changed from UNPREDICTABLE to
PREDICTABLE in v8-A. This patch reflects that update as well.

This was found by MCHammer.

llvm-svn: 235024
2015-04-15 17:28:23 +00:00
Rafael Espindola
aeb03deb16 Use raw_pwrite_stream in the object writer/streamer.
The ELF object writer will take advantage of that in the next commit.

llvm-svn: 234950
2015-04-14 22:14:34 +00:00