1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

21999 Commits

Author SHA1 Message Date
Tim Northover
83bd24524f AArch64: fall back to DAG ISel for inline assembly.
We can't currently handle "calls" to inlineasm strings so it's better to let
the DAG handle it than generate rubbish.

llvm-svn: 292540
2017-01-19 23:59:35 +00:00
Simon Pilgrim
fb321451a1 [SelectionDAG] Improve knownbits handling of UMIN/UMAX (PR31293)
This patch improves the knownbits logic for unsigned integer min/max opcodes.

For UMIN we know that the result will have the maximum of the inputs' known leading zero bits in the result, similarly for UMAX the maximum of the inputs' leading one bits.

This is particularly useful for simplifying clamping patterns,. e.g. as SSE doesn't have a uitofp instruction we want to use sitofp instead where possible and for that we need to confirm that the top bit is not set.

Differential Revision: https://reviews.llvm.org/D28853

llvm-svn: 292528
2017-01-19 22:41:22 +00:00
Mikael Holmen
ba88ebaa68 [DAG] Don't increase SDNodeOrder for dbg.value/declare.
Summary:
The SDNodeOrder is saved in the IROrder field in the SDNode, and this
field may affects scheduling. Thus, letting dbg.value/declare increase
the order numbers may in turn affect scheduling.

Because of this change we also need to update the code deciding when
dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.

Dbg values now have the same order as the SDNode they are connected to,
not the following orders.

Test cases provided by Florian Hahn.

Reviewers: bogner, aprantl, sunfish, atrick

Reviewed By: atrick

Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D25318

llvm-svn: 292485
2017-01-19 13:55:55 +00:00
Daniel Sanders
39b5a7c2bb Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292478
2017-01-19 11:15:55 +00:00
Justin Bogner
dcc513e780 GlobalISel: Implement widening for shifts
llvm-svn: 292476
2017-01-19 07:51:17 +00:00
Justin Bogner
fc1966de57 GlobalISel: Implement narrowing for G_LOAD
llvm-svn: 292461
2017-01-19 01:05:48 +00:00
Justin Bogner
d1e248f3b6 GlobalISel: Fix text wrapping in a comment. NFC
llvm-svn: 292460
2017-01-19 01:04:46 +00:00
Dehao Chen
967f2c7d51 Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, echristo, dblaikie

Reviewed By: echristo, dblaikie

Subscribers: aprantl, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25434

llvm-svn: 292457
2017-01-19 00:44:11 +00:00
Matthias Braun
a2f54981f9 LiveIntervalAnalysis: Cleanup; NFC
- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
  comment (on declaration + implementation), etc.
- Use more range based for

llvm-svn: 292455
2017-01-19 00:32:13 +00:00
Krzysztof Parzyszek
c5bd88df38 Treat segment [B, E) as not overlapping block with boundaries [A, B)
llvm-svn: 292446
2017-01-18 23:12:19 +00:00
Haicheng Wu
a5b100a7d1 [CodeGenPrepare] Fix a typo in the comment. NFC.
encode => endcode.

Differential Revision: https://reviews.llvm.org/D28866

llvm-svn: 292438
2017-01-18 21:12:10 +00:00
Justin Bogner
19a6ba7a53 GlobalISel: Implement narrowing for G_STORE
Legalize stores of types that are too wide by breaking them up into
sequences of smaller stores.

llvm-svn: 292412
2017-01-18 17:29:54 +00:00
Teresa Johnson
b4143d5d72 Don't create a comdat group for a dropped def with initializer
Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to
available_externally when possible. If they had an initializer in the
global_ctors list, a comdat group was being created. This code
already had logic to skip available_externally defs, but now the
EliminateAvailableExternally pass will drop these symbols to
declarations earlier. Change the check to skip all declarations for
linker (which includes available_externally along with declarations).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28737

llvm-svn: 292408
2017-01-18 16:58:43 +00:00
Florian Hahn
0d055ba470 [thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.

This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484


Reviewers: rnk, bkramer, olista01

Reviewed By: olista01

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D28829

llvm-svn: 292372
2017-01-18 15:01:22 +00:00
Daniel Sanders
22bc51fc1f Re-revert: [globalisel] Tablegen-erate current Register Bank Information
More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.

llvm-svn: 292369
2017-01-18 14:26:12 +00:00
Daniel Sanders
bb2615a6eb Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292367
2017-01-18 14:17:50 +00:00
Matt Arsenault
15def6a25a DAG: Consider nnan in isKnownNeverNaN
llvm-svn: 292328
2017-01-18 02:10:08 +00:00
Wei Mi
3f55206292 Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer build bot.
llvm-svn: 292327
2017-01-18 01:53:53 +00:00
Matthias Braun
bb065eba97 MIRParser: Allow regclass specification on operand
You can now define the register class of a virtual register on the
operand itself avoiding the need to use a "registers:" block.

Example: "%0:gr64 = COPY %rax"

Differential Revision: https://reviews.llvm.org/D22398

llvm-svn: 292321
2017-01-18 00:59:19 +00:00
Wei Mi
4e944b6da4 [RegisterCoalescing] Remove partial redundent copy.
The patch is to solve the performance problem described in PR27827.
Register coalescing sometimes cannot remove a copy because of interference.
But if we can find a reverse copy in one of the predecessor block of the copy,
the copy is partially redundent and we may remove the copy partially by moving
it to the predecessor block without the reverse copy.

Differential Revision: https://reviews.llvm.org/D28585

llvm-svn: 292292
2017-01-17 23:39:07 +00:00
Tim Northover
478a7d4c52 GlobalISel: correctly handle varargs
Some platforms (notably iOS) use a different calling convention for unnamed vs
named parameters in varargs functions, so we need to keep track of this
information when translating calls.

Since not many platforms are involved, the guts of the special handling is in
the ValueHandler class (with a generic implementation that should work for most
targets).

llvm-svn: 292283
2017-01-17 22:30:10 +00:00
Tim Northover
337b711f4b [GlobalISel] track predecessor mapping during switch lowering.
Correctly populating Machine PHIs relies on knowing exactly how the IR level
CFG was lowered to MachineIR. This needs to be tracked by any translation
phases that meddle (currently only SwitchInst handling).

This reapplies r291973 which was reverted because of testing failures. Fixes:

 + Don't return an ArrayRef to a local temporary.
 + Incorporate Kristof's suggested comment improvements.

llvm-svn: 292278
2017-01-17 22:13:50 +00:00
Ahmed Bougacha
98077584bd Revert "[TLI] Robustize SDAG proto checking by merging it into TLI."
This reverts commit r292189, as it causes issues on SystemZ bots.

llvm-svn: 292191
2017-01-17 03:31:00 +00:00
Ahmed Bougacha
2004f9b06a [TLI] Robustize SDAG proto checking by merging it into TLI.
SelectionDAGBuilder recognizes libfuncs using some homegrown
parameter type-checking.

Use TLI instead, removing another heap of redundant code.

This isn't strictly NFC, as the SDAG code was too lax.
Concretely, this means changes are required to two tests:
- calling a non-variadic function via a variadic prototype isn't OK;
  it just happens to work on x86_64 (but not on, e.g., aarch64).
- mempcpy has a size_t parameter;  the SDAG code accepts any integer
  type, which meant using i32 on x86_64 worked.

I don't think it's worth supporting either of these (IMO) broken
testcases.  Instead, fix them to be more correct.

llvm-svn: 292189
2017-01-17 03:10:06 +00:00
Daniel Sanders
9d4cd68ffa Revert r292132: [globalisel] Tablegen-erate current Register Bank Information'...
Several buildbots encountered a crash in tablegen when building this commit.
Reverting while I investigate the cause.

llvm-svn: 292136
2017-01-16 15:34:43 +00:00
Daniel Sanders
9102aa35bd [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292132
2017-01-16 15:20:43 +00:00
Simon Pilgrim
7522ab562f [SelectionDAG] Add knownbits support for BITREVERSE
llvm-svn: 292130
2017-01-16 14:49:26 +00:00
Simon Pilgrim
a8edcf036b [SelectionDAG] Add support for BITREVERSE constant folding
We were relying on constant folding of the legalized instructions to do what constant folding we had previously

llvm-svn: 292114
2017-01-16 13:39:00 +00:00
Serge Pavlov
121b7d2a81 Reverted: Track validity of pass results
Commits r291882 and related r291887.

llvm-svn: 292062
2017-01-15 10:23:18 +00:00
Daniel Jasper
efabb3051c Revert "[GlobalISel] track predecessor mapping during switch lowering."
This reverts commit r291973.

The test fails in a Release build with LLVM_BUILD_GLOBAL_ISEL enabled.
AFAICT, llc segfaults. I'll add a few more details to the original
commit.

llvm-svn: 292061
2017-01-15 09:41:49 +00:00
Justin Bogner
1c7fe7f803 GlobalISel: Abort in ResetMachineFunctionPass if fallback isn't enabled
When GlobalISel is configured to abort rather than fallback the only
thing that resetting the machine function does is make things harder
to debug. If we ever get to this point in the abort configuration it
indicates that we've already hit a bug, so this changes the behaviour
to abort instead.

llvm-svn: 291977
2017-01-13 23:46:11 +00:00
Tim Northover
ead291ba07 [GlobalISel] track predecessor mapping during switch lowering.
Correctly populating Machine PHIs relies on knowing exactly how the IR level
CFG was lowered to MachineIR. This needs to be tracked by any translation
phases that meddle (currently only SwitchInst handling).

llvm-svn: 291973
2017-01-13 23:11:37 +00:00
David Majnemer
d9836b30c3 [CodeGen] Simplify getRecipEstimateForFunc
It used two attribute lookups when only one was needed.

llvm-svn: 291965
2017-01-13 22:24:25 +00:00
James Y Knight
2b851af232 Check for register clobbers when merging a vreg live range with a
reserved physreg in RegisterCoalescer.

Previously, we only checked for clobbers when merging into a READ of
the physreg, but not when merging from a WRITE to the physreg.

Differential Revision: https://reviews.llvm.org/D28527

llvm-svn: 291942
2017-01-13 19:08:36 +00:00
Malcolm Parsons
26eb6a0783 Remove unused lambda captures. NFC
llvm-svn: 291916
2017-01-13 17:12:16 +00:00
Benjamin Kramer
5fd769f791 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Diana Picus
971b3bbda9 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Serge Pavlov
c81d1f1786 Track validity of pass results
Running tests with expensive checks enabled exhibits some problems with
verification of pass results.

First, the pass verification may require results of analysis that are not
available. For instance, verification of loop info requires results of dominator
tree analysis. A pass may be marked as conserving loop info but does not need to
be dependent on DominatorTreePass. When a pass manager tries to verify that loop
info is valid, it needs dominator tree, but corresponding analysis may be
already destroyed as no user of it remained.

Another case is a pass that is skipped. For instance, entities with linkage
available_externally do not need code generation and such passes are skipped for
them. In this case result verification must also be skipped.

To solve these problems this change introduces a special flag to the Pass
structure to mark passes that have valid results. If this flag is reset,
verifications dependent on the pass result are skipped.

Differential Revision: https://reviews.llvm.org/D27190

llvm-svn: 291882
2017-01-13 06:09:54 +00:00
Daniel Sanders
4f6b2e2aaf [globalisel] Move as much RegisterBank initialization to the constructor as possible
Summary:
The register bank is now entirely initialized in the constructor. However,
we still have the hardcoded number of register classes which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information to resolve this at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27809

llvm-svn: 291770
2017-01-12 16:11:23 +00:00
Daniel Sanders
edef185b46 [globalisel] Initialize RegisterBanks with static data.
Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().

Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.

Tablegen-eration of this static data will follow after some refactoring.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808

llvm-svn: 291768
2017-01-12 15:32:10 +00:00
Zachary Turner
fa65a3c140 [CodeView] Finish decoupling TypeDatabase from TypeDumper.
Previously the type dumper itself was passed around to a lot of different
places and manipulated in ways that were more appropriate on the type
database. For example, the entire TypeDumper was passed into the symbol
dumper, when all the symbol dumper wanted to do was lookup the name of a
TypeIndex so it could print it. That's what the TypeDatabase is for --
mapping type indices to names.

Another example is how if the user runs llvm-pdbdump with the option to
dump symbols but not types, we still have to visit all types so that we
can print minimal information about the type of a symbol, but just without
dumping full symbol records. The way we did this before is by hacking it
up so that we run everything through the type dumper with a null printer,
so that the output goes to /dev/null. But really, we don't need to dump
anything, all we want to do is build the type database. Since
TypeDatabaseVisitor now exists independently of TypeDumper, we can do
this. We just build a custom visitor callback pipeline that includes a
database visitor but not a dumper.

All the hackery around printers etc goes away. After this patch, we could
probably even delete the entire CVTypeDumper class since really all it is
at this point is a thin wrapper that hides the details of how to build a
useful visitation pipeline. It's not a priority though, so CVTypeDumper
remains for now.

After this patch we will be able to easily plug in a different style of
type dumper by only implementing the proper visitation methods to dump
one-line output and then sticking it on the pipeline.

Differential Revision: https://reviews.llvm.org/D28524

llvm-svn: 291724
2017-01-11 23:24:22 +00:00
Kyle Butt
6035a9f12a Revert "CodeGen: Allow small copyable blocks to "break" the CFG."
This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.

This needs a simple probability check because there are some cases where it is
not profitable.

llvm-svn: 291695
2017-01-11 19:55:19 +00:00
Tim Northover
617ebb4a7a GlobalISel: only print debug info with -debug. NFC.
Turns out DEBUG(...) has uses even inside NDEBUG checks.

llvm-svn: 291685
2017-01-11 17:33:37 +00:00
Craig Topper
27ac09e3e4 Revert r291645 "[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent."
Some test appears to be hanging on the build bots.

llvm-svn: 291650
2017-01-11 04:59:25 +00:00
Craig Topper
63cd0c2c97 [DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent.
llvm-svn: 291645
2017-01-11 04:02:23 +00:00
Matt Arsenault
2c00794e15 DAGCombiner: Add hasOneUse checks to fadd/fma combine
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.

llvm-svn: 291642
2017-01-11 02:02:12 +00:00
Eugene Zelenko
dd0e4dae1c [Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291641
2017-01-11 01:45:03 +00:00
Quentin Colombet
c2e1981f6c [RegBankSelect] Improve the output of the debug messages.
Add more information about mapping cost and chosen solution.

llvm-svn: 291629
2017-01-11 00:48:41 +00:00
Zachary Turner
c1f7412cbe [CodeView/PDB] Rename a bunch of files.
We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage.  This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.

llvm-svn: 291627
2017-01-11 00:35:43 +00:00
Zachary Turner
60f2748d40 [CodeView] Add TypeDatabase class.
This creates a centralized class in which to store type records.
It stores types as an array of entries, which matches the
notion of a type stream being a topologically sorted DAG.
Logic to build up such a database was already being used in
CVTypeDumper, so CVTypeDumper is now updated to to read from
a TypeDatabase which is filled out by an earlier visitor in
the pipeline.

Differential Revision: https://reviews.llvm.org/D28486

llvm-svn: 291626
2017-01-11 00:35:08 +00:00
Kyle Butt
af32417840 CodeGen: Allow small copyable blocks to "break" the CFG.
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.

Differential revision: https://reviews.llvm.org/D27742

llvm-svn: 291609
2017-01-10 23:04:30 +00:00
Matt Arsenault
bc8f9b01ec Remove unused CONVERT_RNDSAT intrinsics
llvm-svn: 291607
2017-01-10 22:38:02 +00:00
Matt Arsenault
f633d550ca DAG: Avoid OOB when legalizing vector indexing
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.

llvm-svn: 291604
2017-01-10 22:02:30 +00:00
Victor Leschuk
b9ac00f6e5 DebugInfo: support for DW_FORM_implicit_const
Support for DW_FORM_implicit_const DWARFv5 feature.
When this form is used attribute value goes to .debug_abbrev section (as SLEB).
As this form would break any debug tool which doesn't support DWARFv5
it is guarded by dwarf version check. Attempt to use this form with
dwarf version <= 4 is considered a fatal error.

Differential Revision: https://reviews.llvm.org/D28456

llvm-svn: 291599
2017-01-10 21:18:26 +00:00
Simon Dardis
4fc397dfdf [mips] Fix Mips MSA instrinsics
The usage of some MIPS MSA instrinsics that took immediates could crash LLVM
during lowering. This patch addresses that behaviour. Crucially this patch
also makes the use of intrinsics with out of range immediates as producing an
internal error.

The ld,st instrinsics would trigger an assertion failure for MIPS64 as their
lowering would attempt to add an i32 offset to a i64 pointer.

Reviewers: vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D25438

llvm-svn: 291571
2017-01-10 16:40:57 +00:00
Craig Topper
a03446e0cd [DAGCombiner] Merge together duplicate checks for folding fold (select C, 1, X) -> (or C, X) and folding (select C, X, 0) -> (and C, X). Also be consistent about checking that both the condition and the result type are i1. NFC
I guess previously we just assumed if the result type was i1, then the condition type must also be i1?

llvm-svn: 291548
2017-01-10 07:42:57 +00:00
Craig Topper
6f8393bb6b [DAGCombiner] Remove code for optimizing select (xor Cond, 0), X, Y -> select Cond, X, Y. Just let combine on the xor itself take care of it.
llvm-svn: 291534
2017-01-10 04:12:19 +00:00
Evandro Menezes
b116545236 [CodeGen] Implement the SUnit::print() method
This method seems to have had a troubled life.  This patch proposes that it
replaces the recently added helper function dumpSUIdentifier.  This way, the
method can be used in other files using the SUnit class.

Differential revision: https://reviews.llvm.org/D28488

llvm-svn: 291520
2017-01-10 01:08:01 +00:00
Matthias Braun
537f364e9a PeepholeOptimizer: Do not replace SubregToReg(bitcast like)
While we can usually replace bitcast like instructions
(MachineInstr::isBitcast()) with a COPY this is not legal if any of the
users uses SUBREG_TO_REG to assert the upper bits of the result are
zero.

Differential Revision: https://reviews.llvm.org/D28474

llvm-svn: 291483
2017-01-09 21:38:17 +00:00
Matthias Braun
ea1c543874 MachineInstr: Print name for subreg index in SUBREG_TO_REG
SUBREG_TO_REG takes a subregister index as 3rd operand, print the name
instead of a number. We already do the same for INSERT_SUBREG and
REG_SEQUENCE.

llvm-svn: 291481
2017-01-09 21:38:10 +00:00
Sumanth Gundapaneni
c5b243c780 In the below scenario, we must be able to skip the a DBG_VALUE instruction and
remove the dead store.

%vreg0<def> = L2_loadri_io <fi#15>, 0; mem:LD4[%dataF](align=4)
DBG_VALUE %vreg0, %noreg, !"dataF", <!184>; IntRegs:%vreg0 
S2_storeri_io <fi#15>, 0, %vreg0; mem:ST4[%dataF]

In reality, this kind of stores are eliminated before Stack Slot Coloring pass,
possibly in instruction lowering

Differential Revision: https://reviews.llvm.org/D26616

llvm-svn: 291455
2017-01-09 17:45:02 +00:00
Bjorn Pettersson
b5889ab1a8 [SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486.
Summary:
Originally

 i64 = umax t8, Constant:i64<4>

was expanded into

 i32,i32 = umax Constant:i32<0>, Constant:i32<0>
 i32,i32 = umax t7, Constant:i32<4>

Now instead the two produced umax:es return i32 instead of i32, i32.

Thanks to Jan Vesely for help with the test case.

Patch by mikael.holmen at ericsson.com

Reviewers: bogner, jvesely, tstellarAMD, arsenm

Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D28135

llvm-svn: 291441
2017-01-09 12:03:50 +00:00
Daniel Sanders
a949cf108b [globalisel] Stop requiring -debug/-debug-only=registerbankinfo for assertions.
Summary:
I've noticed that these assertions don't trigger when the condition is false.
The problem is that the DEBUG(x) macro only executes x when the pass is
emitting debug output via the -debug and -debug-only=registerbankinfo command
line arguments.

Debug builds should always execute the assertions so use '#ifndef NDEBUG' instead.

Also removed an assertion that is only true the first time it's tested. <Target>RegisterBankInfo's constructor will re-use register banks causing them to be valid on subsequent tests. That
assertion will fail on the first test too in the near future.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D28358

llvm-svn: 291235
2017-01-06 14:29:34 +00:00
David Majnemer
3bb7f77e90 [SelectionDAG] Rework lowerRangeToAssertZExt
Utilize ConstantRange to make it easier to interpret range metadata.

llvm-svn: 291211
2017-01-06 02:43:28 +00:00
David Majnemer
f0ed7ae0f8 [SelectionDAG] Correctly transform range metadata to AssertZExt
We used the logBase2 of the high instead of the ceilLogBase2 resulting
in the wrong result for certain values.  For example, it resulted in an
i1 AssertZExt when the exclusive portion of the range was 3.

llvm-svn: 291196
2017-01-06 00:11:46 +00:00
Joerg Sonnenberger
d01a38a2e3 PR 31534: When emitting both DWARF unwind tables and debug information,
do not use .cfi_sections. This requires checking if any non-declaration
function in the module needs an unwind table.

llvm-svn: 291172
2017-01-05 20:55:28 +00:00
Matthias Braun
61599008e2 CodeGen: Assert that liveness is up to date when reading block live-ins.
Add an assert that checks whether liveins are up to date before they are
used.

- Do not print liveins into .mir files anymore in situations where they
  are out of date anyway.
- The assert in the RegisterScavenger is superseded by the new one in
  livein_begin().
- Skip parts of the liveness updating logic in IfConversion.cpp when
  liveness isn't tracked anymore (just enough to avoid hitting the new
  assert()).

Differential Revision: https://reviews.llvm.org/D27562

llvm-svn: 291169
2017-01-05 20:01:19 +00:00
Kristof Beyls
a124ec224e [GlobalISel] Add support for address-taken basic blocks
To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level
basic blocks need to be initialized, as the AsmPrinter uses this link to be
able to print out labels for the basic blocks that are address-taken.

Most of the changes in this commit are about adapting existing tests to include
the basic block name that is now printed out in the MIR format, now that the
name becomes available as the link to the LLVM-IR basic block is initialized.
The relevant test change for the functionality added in this patch are the
added "(address-taken)" strings in
test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D28123

llvm-svn: 291105
2017-01-05 13:27:52 +00:00
Kristof Beyls
1ab23a98c0 [GlobalISel] Add support for switch statements
This commit does this using a trivial chain of conditional branches.  In the
future, we probably want to reuse the optimized switch lowering used in
SelectionDAG.

Differential Revision: https://reviews.llvm.org/D28176

llvm-svn: 291099
2017-01-05 11:28:51 +00:00
Saleem Abdulrasool
939d0790a2 MC: support passing search paths to the IAS
This is needed to support inclusion in inline assembly via the
`.include` directive.

llvm-svn: 291085
2017-01-05 05:56:39 +00:00
Tim Shen
2485ce4de5 [Legalizer] Fix fp-to-uint to fp-tosint promotion assertion.
Summary:
When promoting fp-to-uint16 to fp-to-sint32, the result is actually zero
extended. For example, given double 65534.0, without legalization:

  fp-to-uint16: 65534.0 -> 0xfffe

With the legalization:

  fp-to-sint32: 65534.0 -> 0x0000fffe

Without this patch, legalization wrongly emits a signed extend assertion,
which is consumed by later icmp instruction, and cause miscompile.

Note that the floating point value must be in [0, 65535), otherwise the
behavior is undefined.

This patch reverts r279223 behavior and adds more tests and
documentations.

In PR29041's context, James Molloy mentioned that:

  We don't need to mask because conversion from float->uint8_t is
  undefined if the integer part of the float value is not representable in
  uint8_t. Therefore we can assume this doesn't happen!

which is totally true and good, because fptoui is documented clearly to
have undefined behavior when overflow/underflow happens. We should take
the advantage of this behavior so that we can save unnecessary mask
instructions.

Reviewers: jmolloy, nadav, echristo, kbarton

Subscribers: mehdi_amini, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28284

llvm-svn: 291015
2017-01-04 22:11:42 +00:00
Evgeny Stupachenko
25c2e240d4 The patch fixes (base, index, offset) match.
Summary:
Instead of matching:
  (a + i) + 1 -> (a + i, undef, 1)
Now it matches:
  (a + i) + 1 -> (a, i, 1)

Reviewers: rengolin

Differential Revision: http://reviews.llvm.org/D26367

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 291012
2017-01-04 21:43:39 +00:00
Bjorn Pettersson
53828bbf87 Fix for InlineSpiller accessing not updated dom tree base information.
Summary:
The InlineSpiller was accessing the DominatorTreeBase directly
through the public data member DT in the MachineDominatorTree.
This is not a good idea as the "cached" information in
SplitCriticalEdges is not applied before the access.
The DominatorTreeBase must be accessed through the member
function getBase() in MachineDominatorTree.

The fault was introduced in r266162.

I think the public data member DT in the MachineDominatorTree
should have been made private in the original code (r215576)
that introduced the concept of lazily updating the
MachineDominatorTree information from
MachineBasicBlock::SplitCriticalEdge().

Patch by Karl-Johan Karlsson <karl-johan.karlsson@ericsson.com>

Reviewers: wmi, qcolombet

Subscribers: llvm-commits, bjope, uabelho

Differential Revision: https://reviews.llvm.org/D27983

llvm-svn: 290950
2017-01-04 09:41:56 +00:00
Ahmed Bougacha
af2642b0e5 [CodeGen] Further simplify returned call operand logic. NFC.
As Pete points out in r290905, CallSite lets us avoid duplicating this!

llvm-svn: 290909
2017-01-03 21:42:43 +00:00
Ahmed Bougacha
9258c02aa0 [CodeGen] Simplify logic that looks for returned call operands. NFC-ish.
Use getReturnedArgOperand() instead of rolling our own.  Note that it's
equivalent because there can only be one 'returned' operand.

The existing code was also incorrect: there already was awkward logic to
ignore callee/EH blocks, but operands can now also be operand bundles,
in which case we'll look for non-existent parameter attributes.

Unfortunately, this isn't observable in-tree, as it only crashes when
exercising the regular call lowering logic with operand bundles.
Still, this is a nice small cleanup anyway.

llvm-svn: 290905
2017-01-03 20:33:22 +00:00
Dean Michael Berris
14872cc56c [XRay] Merge instrumentation point table emission code into AsmPrinter.
Summary:
No need to have this per-architecture.  While there, unify 32-bit ARM's
behaviour with what changed elsewhere and start function names lowercase
as per the coding standards.  Individual entry emission code goes to the
entry's own class.

Fully tested on amd64, cross-builds on both ARMs and PowerPC.

Reviewers: dberris

Subscribers: aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D28209

llvm-svn: 290858
2017-01-03 04:30:21 +00:00
Joerg Sonnenberger
5f19da4f7e Emit .cfi_sections before the first .cfi_startproc
GNU as rejects input where .cfi_sections is used after .cfi_startproc,
if the new section differs from the old. Adjust our output to always
emit .cfi_sections before the first .cfi_startproc to minimize necessary
code.

Differential Revision: https://reviews.llvm.org/D28011

llvm-svn: 290817
2017-01-02 18:05:27 +00:00
Keno Fischer
97bf60327b Reapply "[CodeGen] Fix invalid DWARF info on Win64"
This reapplies rL289013 (reverted in rL289014) with the fixes identified
in D21731. Should hopefully pass the buildbots this time.

llvm-svn: 290809
2017-01-02 03:00:19 +00:00
Florian Hahn
f3a4ee42b3 [selectiondag] Check PromotedFloats map during expansive checks.
Summary:
`PromotedFloats` needs to be checked in 
`DAGTypeLegalizer::PerformExpensiveChecks`. This patch fixes a few type
legalization failures with expansive checks for ARM fp16 tests.

Reviewers: baldrick, bogner, arsenm

Subscribers: arsenm, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D28187

llvm-svn: 290796
2017-01-01 13:58:27 +00:00
Reid Kleckner
48c3376e2c Simplify FunctionLoweringInfo.cpp with range for loops
I'm preparing to add some pattern matching code here, so simplify the
code before I do. NFC

llvm-svn: 290731
2016-12-30 00:21:38 +00:00
Reid Kleckner
dd916c40a8 Revert "[COFF] Use 32-bit jump table entries in .rdata for Win64"
This reverts commit r290694. It broke sanitizer tests on Win64. I'll
probably bring this back, but the jump tables will just live in .text
like they do for MSVC.

llvm-svn: 290714
2016-12-29 17:07:10 +00:00
Igor Laevsky
cde17c8a3e Introduce element-wise atomic memcpy intrinsic
This change adds a new intrinsic which is intended to provide memcpy functionality
with additional atomicity guarantees. Please refer to the review thread
or language reference for further details.

Differential Revision: https://reviews.llvm.org/D27133

llvm-svn: 290708
2016-12-29 14:31:07 +00:00
Reid Kleckner
551887db33 [COFF] Use 32-bit jump table entries in .rdata for Win64
Summary:
We were already using 32-bit jump table entries, but this was a
consequence of the default PIC model on Win64, and not an intentional
design decision. This patch ensures that we always use 32-bit label
difference jump table entries on Win64 regardless of the PIC model. This
is a good idea because it saves executable size and object file size.

Moving the jump tables to .rdata cleans up the disassembled object code
and reduces the available ROP targets, but it requires adding one more
RIP-relative lea to the code.  COFF doesn't have relocations to express
the difference between two arbitrary symbols, so we can't use the jump
table label in the label difference like we do elsewhere.

Fixes PR31488

Reviewers: majnemer, compnerd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28141

llvm-svn: 290694
2016-12-29 00:12:39 +00:00
Reid Kleckner
911a079ad9 [WinEH] Don't assume endFunction is called while in .text
Jump table emission can switch to .rdata before
WinException::endFunction gets called. Just remember the appropriate
text section we started in and reset back to it when we end the
function. We were already switching sections back from .xdata anyway.

Fixes the first problem in PR31488, so that now COFF switch tables can
live in .rdata if we want them to.

llvm-svn: 290678
2016-12-28 19:05:12 +00:00
Simon Pilgrim
3d00c1e5ad [SelectionDAG] Early out from computeKnownBits when we know we will have no common bits.
Avoid extra (recursive) calls to computeKnownBits if we already know that there are no common known bits.

llvm-svn: 290490
2016-12-24 12:59:35 +00:00
Zijiao Ma
eef813bb47 Make the canonicalisation on shifts benifit to more case.
1.Fix pessimized case in FIXME.
2.Add tests for it.
3.The canonicalisation on shifts results in different sequence for
  tests of machine-licm.Correct some check lines.

Differential Revision: https://reviews.llvm.org/D27916

llvm-svn: 290410
2016-12-23 02:56:07 +00:00
Sanjoy Das
b0176f5ff2 NFC code motion in ImplicitNullChecks
Extract out two large lambdas into top level member functions.

llvm-svn: 290395
2016-12-23 00:41:24 +00:00
Sanjoy Das
9b70f16033 Reimplement depedency tracking in the ImplicitNullChecks pass
Summary:
This change rewrites a core component in the ImplicitNullChecks pass for
greater simplicity since the original design was over-complicated for no
good reason.  Please review this as essentially a new pass.  The change
is almost NFC and I've added a test case for a scenario that this new
code handles that wasn't handled earlier.

The implicit null check pass, at its core, is a code hoisting transform.
It differs from "normal" code transforms in that it speculates
potentially faulting instructions (by design), but a lot of the usual
hazard detection logic (register read-after-write etc.) still applies.
We previously detected hazards by keeping track of registers defined and
used by machine instructions over an instruction range, but that was
unwieldy and did not actually confer any performance benefits.  The
intent was to have linear time complexity over the number of machine
instructions considered, but it ended up being N^2 is practice.

This new version is more obviously O(N^2) (with N capped to 8 by
default) in hazard detection.  It does not attempt to be clever in
tracking register uses or defs (the previous cleverness here was a
source of bugs).

Once this is checked in, I'll extract out the `IsSuitableMemoryOp` and
`CanHoistLoadInst` lambda into member functions (they're too complicated
to be inline lambdas) and do some other related NFC cleanups.

Reviewers: reames, anna, atrick

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27592

llvm-svn: 290394
2016-12-23 00:41:21 +00:00
Quentin Colombet
eec78653c8 [GlobalISel] More fix for the size vs. type typo. NFC.
I missed those in my previous commit (r290378).

llvm-svn: 290387
2016-12-22 22:50:34 +00:00
Quentin Colombet
6877ce89ac [MachineVerifier] Check that even generic vregs comply to regclass constraints.
We used to not check generic vregs, but that is actually a mistake given
nothing in the GlobalISel pipeline is going to fix the constraints on
target specific instructions. Therefore, the target has to have them
right from the start.

llvm-svn: 290380
2016-12-22 21:56:39 +00:00
Quentin Colombet
d113df1785 [MIRParser] Fix a typo in comment and error message.
We have long switched from size to type.

llvm-svn: 290378
2016-12-22 21:56:35 +00:00
Quentin Colombet
e24c513b21 [MIRParser] Non-generic virtual register may have a type.
When generic virtual registers get constrained, because of a use on a
target specific operation for instance, we end up with regular virtual
registers with a type and that's perfectly fine.

llvm-svn: 290376
2016-12-22 21:56:29 +00:00
Quentin Colombet
b3e77855cb [RegisterBankInfo] Allow to set a register class when nothing else is set
This is going to be needed to be able to constraint register class on
target specific instruction while the RegBankSelect pass did not run
yet.

llvm-svn: 290375
2016-12-22 21:56:26 +00:00
Quentin Colombet
766bd68456 [GlobalISel] Refactor the logic to constraint registers.
Move the logic to constraint register from InstructionSelector to a
utility function. It will be required by other passes in the GlobalISel
pipeline.

llvm-svn: 290374
2016-12-22 21:56:19 +00:00
Wei Mi
f82ada61cb Redo store splitting in CodeGenPrepare.
This is a succeeding patch of https://reviews.llvm.org/D22840 to address the
issue when a value to be merged into an int64 pair is in a different BB. Redoing
the store splitting in CodeGenPrepare so we can match the pattern across multiple
BBs and move some instructions into the same BB. We still keep the code in dag
combine so that we can catch cases that show up after DAG combining runs.

Differential Revision: https://reviews.llvm.org/D25914

llvm-svn: 290365
2016-12-22 19:44:45 +00:00
Wei Mi
f71e4cfcb2 Change the interface of TLI.isMultiStoresCheaperThanBitsMerge.
This is for splitMergedValStore in DAG Combine to share the target query interface
with similar logic in CodeGenPrepare.

Differential Revision: https://reviews.llvm.org/D24707

llvm-svn: 290363
2016-12-22 19:38:22 +00:00
Krzysztof Parzyszek
64adc16cf0 Add the DAG mutation interface to the software pipeliner
llvm-svn: 290360
2016-12-22 19:21:20 +00:00
Krzysztof Parzyszek
8d7cd1c8ba Fix two bugs in the pipeliner in renaming phis in the prolog and epilog
When the pipeliner is renaming phi values, it may need to iterate through
the phi operands to check for other phis. However, the pipeliner should
stop once it reaches a phi that is outside the pipelined loop.

Also, when the generateExistingPhis code is unable to reuse an existing
phi, the default code that computes the PhiOp2 is only to be used when
the pipeliner is generating the kernel. Otherwise, the phi may be a value
computed earlier in the same epilog.

Patch by Brendon Cahoon.

llvm-svn: 290355
2016-12-22 18:49:55 +00:00
Adrian Prantl
9e92e0bc4f Fix an assertion in DwarfExpression when emitting fragments in vector registers
When DwarfExpression is emitting a fragment that is located in a
register and that fragment is smaller than the register, and the
register must be composed from sub-registers (are you still with me?)
the last DW_OP_piece operation must not be larger than the size of the
fragment itself, since the last piece of the fragment could be smaller
than the last subregister that is being emitted.

rdar://problem/29779065

llvm-svn: 290324
2016-12-22 06:10:41 +00:00
Adrian Prantl
97f5f281ea Refactor the DIExpression fragment query interface (NFC)
... so it becomes available to DIExpressionCursor.

llvm-svn: 290322
2016-12-22 05:27:12 +00:00
Matt Arsenault
7363b816c7 DAG: Add helper for testing constant values
There are helpers for testing for constant or constant build_vector,
and for splat ConstantFP vectors, but not for a constantfp or
non-splat ConstantFP vector.

llvm-svn: 290317
2016-12-22 04:39:45 +00:00
Oren Ben Simhon
13bb84d557 [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support
The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. 
The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.

The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it.
This aubmit also includes additional lit tests to cover better HVAs corner cases.

Differential Revision: https://reviews.llvm.org/D27392

llvm-svn: 290240
2016-12-21 08:31:45 +00:00
Sebastian Pop
3ea829a444 machine combiner: fix pretty printer
we used to print UNKNOWN instructions when the instruction to be printer was not
yet inserted in any BB: in that case the pretty printer would not be able to
compute a TII as the instruction does not belong to any BB or function yet.
This patch explicitly passes the TII to the pretty-printer.

Differential Revision: https://reviews.llvm.org/D27645

llvm-svn: 290228
2016-12-21 01:41:12 +00:00
George Burgess IV
8761f680f4 [Analysis] Centralize objectsize lowering logic.
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.

llvm-svn: 290214
2016-12-20 23:46:36 +00:00
Adrian Prantl
cf7e846d11 [IR] Remove the DIExpression field from DIGlobalVariable.
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGLobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

This reapplies r289902 with additional testcase upgrades and a change
to the Bitcode record for DIGlobalVariable, that makes upgrading the
old format unambiguous also for variables without DIExpressions.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 290153
2016-12-20 02:09:43 +00:00
Bjorn Pettersson
a59b141704 [CodeGen] Make MachineInstr::isIdenticalTo() symmetric.
Summary:
MachineInstr::isIdenticalTo() is for some reason not
symmetric when comparing bundles, which gives us the
property:

  I1->isIdenticalTo(*I2) != I2->isIdenticalTo(*I1)

when comparing bundles where one bundle is longer than
the other.

This patch makes sure that bundles of different length
always are considered as not being identical. Thus, the
result of the comparison will be the same regardless of
which side that happens to be to the left.

Reviewers: dexonsmith, jonpa, andrew.w.kaylor

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27508

llvm-svn: 290095
2016-12-19 11:20:57 +00:00
Dean Michael Berris
c8dc3c621c [XRay] Fix assertion failure on empty machine basic blocks (PR 31424)
The original version of the code in XRayInstrumentation.cpp assumed that
functions may not have empty machine basic blocks (or that the first one
couldn't be). This change addresses that by special-casing that specific
situation.

We provide two .mir test-cases to make sure we're handling this
appropriately.

Fixes llvm.org/PR31424.

Reviewers: chandlerc

Subscribers: varno, llvm-commits

Differential Revision: https://reviews.llvm.org/D27913

llvm-svn: 290091
2016-12-19 09:20:38 +00:00
Tom Stellard
199ef57082 Add custom type for PseudoSourceValue
Summary:
PseudoSourceValue can be used to attach a target specific value for "well behaved" side-effects lowered from target specific intrinsics.
This is useful whenever there is not an LLVM IR Value around when representing such "well behaved" side-effected operations in backends by attaching a MachineMemOperand with a custom PseudoSourceValue as this makes the scheduler not treating them as "GlobalMemoryObjects" which triggers a logic that makes the operation act like a barrier in the Schedule DAG.

This patch adds another Kind to the PseudoSourceValue object which is "TargetCustom". It indicates a type of PseudoSourceValue that has a target specific meaning (aka. LLVM shouldn't assume any specific usage for such a PSV).

It supports the possibility of having many different kinds of "TargetCustom" PseudoSourceValues.

We had a discussion about if this was valuable or not (in particular because there was a believe that PSV were going away sooner or later) but seems like they are not going anywhere and I think they are useful backend side.

It is not clear the interaction of this with MIRParser (do we need a target hook to parse these?) and I would like a comment from Alex about that :)

Reviewers: arphaman, hfinkel, arsenm

Subscribers: Eugene.Zelenko, llvm-commits

Patch By: Marcello Maggioni

Differential Revision: https://reviews.llvm.org/D13575

llvm-svn: 290037
2016-12-17 04:41:53 +00:00
Matthias Braun
c1899a2b46 BranchRelaxation: Recompute live-ins when splitting a block
Factors out and reuses live-in computation code from BranchFolding.

Differential Revision: https://reviews.llvm.org/D27558

llvm-svn: 290013
2016-12-16 23:55:37 +00:00
Paul Robinson
c9e7578fcb Allow "line 0" to be the first explicit debug location in a function.
Feedback on r289468 from Adrian Prantl.

llvm-svn: 290012
2016-12-16 23:54:33 +00:00
Zachary Turner
4078aeb252 Resubmit "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."
The original patch was broken due to some undefined behavior
as well as warnings that were triggering -Werror.

llvm-svn: 290000
2016-12-16 22:48:14 +00:00
Jun Bum Lim
4f195c2484 [CodeGenPrep] Skip merging empty case blocks
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly :

Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

llvm-svn: 289988
2016-12-16 20:38:39 +00:00
Sanjoy Das
60cab50190 Inline stripInvariantGroupMetadata out of existence
As a one liner function, I don't think it is pulling its weight in terms
of helping readability.

llvm-svn: 289987
2016-12-16 20:29:39 +00:00
Adrian Prantl
0ab6669d6d Revert "[IR] Remove the DIExpression field from DIGlobalVariable."
This reverts commit 289920 (again).
I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
has not DIExpression. Unfortunately it is not possible to safely upgrade
these variables without adding a flag to the bitcode record indicating which
version they are.
My plan of record is to roll the planned follow-up patch that adds a
unit: field to DIGlobalVariable into this patch before recomitting.
This way we only need one Bitcode upgrade for both changes (with a
version flag in the bitcode record to safely distinguish the record
formats).

Sorry for the churn!

llvm-svn: 289982
2016-12-16 19:39:01 +00:00
Zachary Turner
ab64e55c57 Revert "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."
This reverts commit r289978, which is failing due to some rebase/merge
issues.

llvm-svn: 289981
2016-12-16 19:25:23 +00:00
Zachary Turner
526ce01d27 [CodeView] Hook CodeViewRecordIO for reading/writing symbols.
This is the 3rd of 3 patches to get reading and writing of
CodeView symbol and type records to use a single codepath.

Differential Revision: https://reviews.llvm.org/D26427

llvm-svn: 289978
2016-12-16 19:20:35 +00:00
Krzysztof Parzyszek
3467c555e1 Implement LaneBitmask::any(), use it to replace !none(), NFCI
llvm-svn: 289974
2016-12-16 19:11:56 +00:00
Sanjoy Das
f115142e03 Fix CodeGenPrepare::stripInvariantGroupMetadata
`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs.  The
only reason it worked at all is that `getMetadataID` returns something
unrelated -- it returns the subclass ID of the receiver (which is used
in `dyn_cast` etc.).  That does not numerically match
`LLVMContext::MD_invariant_group` and ends up dropping `invariant_group`
along with every other metadata that does not numerically match
`LLVMContext::MD_invariant_group`.

llvm-svn: 289973
2016-12-16 18:52:33 +00:00
Joel Jones
5b76390d23 Fix name typo in SelectonDAG
llvm-svn: 289969
2016-12-16 18:22:54 +00:00
Jun Bum Lim
6a3faa2a6d Revert "[CodeGenPrep] Skip merging empty case blocks"
This reverts commit r289951.

llvm-svn: 289960
2016-12-16 17:06:14 +00:00
Jun Bum Lim
ab29427ce1 [CodeGenPrep] Skip merging empty case blocks
This is recommit of r287553 after fixing the invalid loop info after eliminating an empty block:

Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl

Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

llvm-svn: 289951
2016-12-16 16:03:31 +00:00
Krzysztof Parzyszek
79e30f3bba [MIRParser] Add parsing hex literals of arbitrary size as unsigned integers
The current code does not parse hex literals larger than 32-bit.

llvm-svn: 289943
2016-12-16 13:58:01 +00:00
Diana Picus
a8212f0f94 [ARM] GlobalISel: Select add i32, i32
Add the minimal support necessary to select a function that returns the sum of
two i32 values.

This includes some support for argument/return lowering of i32 values through
registers, as well as the handling of copy and add instructions throughout the
GlobalISel pipeline.

Differential Revision: https://reviews.llvm.org/D26677

llvm-svn: 289940
2016-12-16 12:54:46 +00:00
Florian Hahn
0a82699397 [codegen] Add generic functions to skip debug values.
Summary:
This commits moves skipDebugInstructionsForward and
skipDebugInstructionsBackward from lib/CodeGen/IfConversion.cpp
to include/llvm/CodeGen/MachineBasicBlock.h and updates
some codgen files to use them.

This refactoring was suggested in https://reviews.llvm.org/D27688
and I thought it's best to do the refactoring in a separate
review, but I could also put both changes in a single review
if that's preferred.

Also, the names for the functions aren't the snappiest and
I would be happy to rename them if anybody has suggestions. 

Reviewers: eli.friedman, iteratee, aprantl, MatzeB

Subscribers: MatzeB, llvm-commits

Differential Revision: https://reviews.llvm.org/D27782

llvm-svn: 289933
2016-12-16 11:10:26 +00:00
Adrian Prantl
2345112c5b [IR] Remove the DIExpression field from DIGlobalVariable.
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGLobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

This reapplies r289902 with additional testcase upgrades.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 289920
2016-12-16 04:25:54 +00:00
Chandler Carruth
a06dc57c65 Add extra headers that got deleted by my revert in r289916 but for which
new usage had already grown in the file.

llvm-svn: 289917
2016-12-16 04:08:31 +00:00
Chandler Carruth
dc4dc4b073 Revert patch series introducing the DAG combine to match a load-by-bytes
idiom.

r289538: Match load by bytes idiom and fold it into a single load
r289540: Fix a buildbot failure introduced by r289538
r289545: Use more detailed assertion messages in the code ...
r289646: Add a couple of assertions to the load combine code ...

This DAG combine has a bad crash in it that is quite hard to trigger
sadly -- it relies on sneaking code with UB through the SDAG build and
into this particular combine. I've responded to the original commit with
a test case that reproduces it.

However, the code also has other problems that will require substantial
changes to address and so I'm going ahead and reverting it for now. This
should unblock us and perhaps others that are hitting the crash in the
wild and will let a fresh patch with updated approach come in cleanly
afterward.

Sorry for any trouble or disruption!

llvm-svn: 289916
2016-12-16 04:05:22 +00:00
Adrian Prantl
daf4fef1f9 Revert "[IR] Remove the DIExpression field from DIGlobalVariable."
This reverts commit 289902 while investigating bot berakage.

llvm-svn: 289906
2016-12-16 01:00:30 +00:00
Adrian Prantl
0eee52640f [IR] Remove the DIExpression field from DIGlobalVariable.
This patch implements PR31013 by introducing a
DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
DIExpression.

Currently, DIGlobalVariables holds a DIExpression. This is not the
best way to model this:

(1) The DIGlobalVariable should describe the source level variable,
    not how to get to its location.

(2) It makes it unsafe/hard to update the expressions when we call
    replaceExpression on the DIGLobalVariable.

(3) It makes it impossible to represent a global variable that is in
    more than one location (e.g., a variable with multiple
    DW_OP_LLVM_fragment-s).  We also moved away from attaching the
    DIExpression to DILocalVariable for the same reasons.

<rdar://problem/29250149>
https://llvm.org/bugs/show_bug.cgi?id=31013
Differential Revision: https://reviews.llvm.org/D26769

llvm-svn: 289902
2016-12-16 00:36:43 +00:00
David Blaikie
ec3b7a29bd DebugInfo: Address non-deterministic output (iterating a SmallPtrSet) in 289697
Post-commit review feedback from Adrian Prantl.

Hopefully this fixes that up :)

llvm-svn: 289892
2016-12-15 23:37:38 +00:00
Quentin Colombet
7f79b5f58f [IRTranslator] Merge the entry and ABI lowering blocks.
The IRTranslator uses an additional block before the LLVM-IR entry block
to perform all the ABI lowering and the constant hoisting. Thus, this
block is the actual entry block and it falls through the LLVM-IR entry
block. However, with such representation, we end up with two basic
blocks that are not maximal.

Therefore, this patch adds a bit of canonicalization by merging both the
LLVM-IR entry block and the ABI lowering/constants hoisting into one
block, making the resulting block more likely to be maximal (indeed the
LLVM-IR entry block might not have been maximal).

llvm-svn: 289891
2016-12-15 23:32:25 +00:00
David Blaikie
956504fb9e DebugInfo: Emit ranges for functions with DISubprograms but lacking locations on any instructions
This seems more consistent, and helps tidy up/simplify some other code
in this change.

llvm-svn: 289889
2016-12-15 23:17:52 +00:00
Eli Friedman
a6de56264a Don't combine splats with other shuffles.
We sometimes end up creating shuffles which are worse than the obvious
translation of the IR.

Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 .

Differential Revision: https://reviews.llvm.org/D27793

llvm-svn: 289882
2016-12-15 22:41:40 +00:00
Eli Friedman
221fbc02dc Don't combine a shuffle of two BUILD_VECTORs with duplicate elements.
Targets can't handle this case well in general; we often transform
a shuffle of two cheap BUILD_VECTORs to element-by-element insertion,
which is very inefficient.

Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially
fixes https://llvm.org/bugs/show_bug.cgi?id=31301.

Differential Revision: https://reviews.llvm.org/D27787

llvm-svn: 289874
2016-12-15 21:36:59 +00:00
Geoff Berry
ece34f2d82 [LiveRangeEdit] Change eliminateDeadDef assert to if condition.
The assert could potentially fire (though no cases have been
encountered), so just check that the instruction we're handling
specially for rematerialization only has one def to begin with.

Reviewed by Wei Mi over email.

llvm-svn: 289861
2016-12-15 19:55:19 +00:00
Krzysztof Parzyszek
b6cc44c368 Extract LaneBitmask into a separate type
Specifically avoid implicit conversions from/to integral types to
avoid potential errors when changing the underlying type. For example,
a typical initialization of a "full" mask was "LaneMask = ~0u", which
would result in a value of 0x00000000FFFFFFFF if the type was extended
to uint64_t.

Differential Revision: https://reviews.llvm.org/D27454

llvm-svn: 289820
2016-12-15 14:36:06 +00:00
Prakhar Bahuguna
dc9a43dec1 [ARM] Implement execute-only support in CodeGen
This implements execute-only support for ARM code generation, which
prevents the compiler from generating data accesses to code sections.
The following changes are involved:

* Add the CodeGen option "-arm-execute-only" to the ARM code generator.
* Add the clang flag "-mexecute-only" as well as the GCC-compatible
  alias "-mpure-code" to enable this option.
* When enabled, literal pools are replaced with MOVW/MOVT instructions,
  with VMOV used in addition for floating-point literals. As the MOVT
  instruction is required, execute-only support is only available in
  Thumb mode for targets supporting ARMv8-M baseline or Thumb2.
* Jump tables are placed in data sections when in execute-only mode.
* The execute-only text section is assigned section ID 0, and is
  marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'.
  This also overrides selection of ELF sections for globals.

llvm-svn: 289784
2016-12-15 07:59:08 +00:00
Hal Finkel
777d1a3ae9 Trying to fix NDEBUG build after r289764
llvm-svn: 289766
2016-12-15 05:33:19 +00:00
Sanjoy Das
970ff026f0 [MachineBlockPlacement] Don't make blocks "uneditable"
Summary:
This fixes an issue with MachineBlockPlacement due to a badly timed call
to `analyzeBranch` with `AllowModify` set to true.  The timeline is as
follows:

 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls
    `TailDup.shouldTailDuplicate` on its argument, which in turn calls
    `analyzeBranch` with `AllowModify` set to true.

 2. This `analyzeBranch` call edits the terminator sequence of the block
    based on the physical layout of the machine function, turning an
    unanalyzable non-fallthrough block to a unanalyzable fallthrough
    block.  Normally MBP bails out of rearranging such blocks, but this
    block was unanalyzable non-fallthrough (and thus rearrangeable) the
    first time MBP looked at it, and so it goes ahead and decides where
    it should be placed in the function.

 3. When placing this block MBP fails to analyze and thus update the
    block in keeping with the new physical layout.

Concretely, before (1) we have something like:

```
LBL0:
  < unknown terminator op that may branch to LBL1 >
  jmp LBL1

LBL1:
  ... A

LBL2:
  ... B
```

In (2), analyze branch simplifies this to

```
LBL0:
  < unknown terminator op that may branch to LBL2 >
  ;; jmp LBL1 <- redundant jump removed

LBL1:
  ... A

LBL2:
  ... B
```

In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2
after the first block since that is profitable.

```
LBL0:
  < unknown terminator op that may branch to LBL2 >
  ;; jmp LBL1 <- redundant jump

LBL2:
  ... B

LBL1:
  ... A
```

and the program now has incorrect behavior (we no longer fall-through
from `LBL0` to `LBL1`) because MBP can no longer edit LBL0.

There are several possible solutions, but I went with removing the teeth
off of the `analyzeBranch` calls in TailDuplicator.  That makes thinking
about the result of these calls easier, and breaks nothing in the lit
test suite.

I've also added some bookkeeping to the MachineBlockPlacement pass and
used that to write an assert that would have caught this.

Reviewers: chandlerc, gberry, MatzeB, iteratee

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27783

llvm-svn: 289764
2016-12-15 05:08:57 +00:00
Sanjay Patel
45d2c1d6aa [DAG] allow more select folding for targets that have 'and not' (PR31175)
The original motivation for this patch comes from wanting to canonicalize 
more IR to selects and also canonicalizing min/max.

If we're going to do that, we need more backend fixups to undo select codegen 
when simpler ops will do. I chose AArch64 for the tests because that shows the
difference in the simplest way. This should fix:
https://llvm.org/bugs/show_bug.cgi?id=31175

Differential Revision: https://reviews.llvm.org/D27489

llvm-svn: 289738
2016-12-14 22:59:14 +00:00
David Blaikie
594ea7cb43 DebugInfo: Improve type safety and simplify some subprogram finalization code
This probably ended up this way aften the subprogram<>function link
inversion and debug info metadata schema changes.

llvm-svn: 289697
2016-12-14 19:38:39 +00:00
Andrew Kaylor
9f2d83ffdb [WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements
Differential Revision: https://reviews.llvm.org/D27693

llvm-svn: 289694
2016-12-14 19:30:18 +00:00
Nirav Dave
9fd3ae9cf9 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
Reverting due to ARM MCJIT and MIPS LLD error.

This reverts commit r289659.

llvm-svn: 289667
2016-12-14 16:43:44 +00:00
Nirav Dave
afe2eccae3 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after fixing after removing load-store factoring through
token factors in favor of improved token factor operand pruning

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.

Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).

Additional Minor Changes:

   1. Finishes removing unused AliasLoad code
   2. Unifies the the chain aggregation in the merged stores across
      code paths
   3. Re-add the Store node to the worklist after calling
      SimplifyDemandedBits.
   4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
      arbitrary, but seemed sufficient to not cause regressions in
      tests.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations

Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289659
2016-12-14 15:44:26 +00:00
Simon Pilgrim
f05854a8a9 [DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2
Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.

Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.

Differential Revision: https://reviews.llvm.org/D27714

llvm-svn: 289654
2016-12-14 15:08:13 +00:00
Stephan Bergmann
aba15d97df Replace APFloatBase static fltSemantics data members with getter functions
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.

Differential Revision: https://reviews.llvm.org/D26671

llvm-svn: 289647
2016-12-14 11:57:17 +00:00
Artur Pilipenko
6633f605fc Add a couple of assertions to the load combine code introduced by r289538
llvm-svn: 289646
2016-12-14 11:55:47 +00:00
Paul Robinson
fee2a350dc [DWARF] Preserve column number when emitting 'line 0' record
Follow-up to r289256, address a FIXME to avoid resetting the column
number. This reduced .debug_line by 2.6% in a RelWithDebInfo
self-build of clang.

llvm-svn: 289620
2016-12-14 00:27:35 +00:00
Alina Sbirlea
81ca226117 Generalize strided store pattern in interleave access pass
Summary:
This patch aims to generalize matching of the strided store accesses to more general masks.
The more general rule is to have consecutive accesses based on the stride:
[x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...]
All elements in the masks need not form a contiguous space, there may be gaps.
As before, undefs are allowed and filled in with adjacent element loads.

Reviewers: HaoLiu, mssimpso

Subscribers: mkuper, delena, llvm-commits

Differential Revision: https://reviews.llvm.org/D23646

llvm-svn: 289573
2016-12-13 19:32:36 +00:00
Artur Pilipenko
f4a6479ee9 Use more detailed assertion messages in the code introduced by r289538
llvm-svn: 289545
2016-12-13 16:26:15 +00:00
Artur Pilipenko
08a092ed23 Fix a buildbot failure introduced by r289538
Build failed because of unused variable in product mode.

llvm-svn: 289540
2016-12-13 14:55:31 +00:00
Artur Pilipenko
5f36ddf06d [DAGCombiner] Match load by bytes idiom and fold it into a single load
Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.

Assuming little endian target:
  i8 *a = ...
  i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
  i32 val = *((i32)a)

  i8 *a = ...
  i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
  i32 val = BSWAP(*((i32)a))

This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.

Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
  i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)

Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.

The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.

Reviewed By: hfinkel, RKSimon, filcab

Differential Revision: https://reviews.llvm.org/D26149

llvm-svn: 289538
2016-12-13 14:21:14 +00:00
Artur Pilipenko
c5f2a9227a Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the upcoming user
llvm-svn: 289537
2016-12-13 14:16:02 +00:00
Simon Pilgrim
d5a20f83a2 [SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI.
We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this.

llvm-svn: 289534
2016-12-13 13:36:27 +00:00
Diana Picus
76a7924279 [GlobalISel] Move extendRegister where it belongs. NFCI
Apparently I missed this one when I moved ValueHandler back in r288658. Sorry!

llvm-svn: 289528
2016-12-13 10:46:12 +00:00
Philip Reames
863d1f370d [peephole] Enhance folding logic to work for STATEPOINTs
The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are:

    Support for folding multiple operands at once which reference the same load
    Support for folding multiple loads into a single instruction
    Walk all the operands of the instruction for varidic instructions (this is a bug fix!)

Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here.

Differential Revision: https://reviews.llvm.org/D24103

llvm-svn: 289510
2016-12-13 01:38:41 +00:00
Philip Reames
781131135a [Statepoints] Reuse stack slots more than once within a basic block
The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exact once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block does. Net result: as we optimize invokes into calls, lowering gets worse.

The root error here is that the bitmap uses by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again.

Differential Revision: https://reviews.llvm.org/D25243

llvm-svn: 289509
2016-12-13 01:21:15 +00:00
Andrew Kaylor
bfc4ce4b74 Avoid infinite loops in branch folding
Differential Revision: https://reviews.llvm.org/D27582

llvm-svn: 289486
2016-12-12 23:05:38 +00:00
Paul Robinson
7e7e871a0c Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table.  By default, use this for branch targets
and some other cases that have no specified source location, to
prevent inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).

Updated patch allows enabling or suppressing this behavior for all
unspecified source locations.

Differential Revision: http://reviews.llvm.org/D24180

llvm-svn: 289468
2016-12-12 20:49:11 +00:00
Geoff Berry
ae2f925550 [LiveRangeEdit] Add assert string and descriptive comment.
llvm-svn: 289456
2016-12-12 19:12:41 +00:00
Simon Pilgrim
cb8d466ab0 [SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBits
Pre-commit as discussed on D27657

llvm-svn: 289425
2016-12-12 10:29:43 +00:00
Sebastian Pop
828720bb36 instr-combiner: sum up all latencies of the transformed instructions
We have found that -- when the selected subarchitecture has a scheduling model
and we are not optimizing for size -- the machine-instruction combiner uses a
too-simple algorithm to compute the cost of one of the two alternatives [before
and after running a combining pass on a section of code], and therefor it throws
away the combination results too often.

This fix has the potential to help any ISA with the potential to combine
instructions and for which at least one subarchitecture has a scheduling model.
As of now, this is only known to definitely affect AArch64 subarchitectures with
a scheduling model.

Regression tested on AMD64/GNU-Linux, new test case tested to fail on an
unpatched compiler and pass on a patched compiler.

Patch by Abe Skolnik and Sebastian Pop.

llvm-svn: 289399
2016-12-11 19:39:32 +00:00
Simon Pilgrim
401af4781d [SelectionDAG] Add ability for computeKnownBits to peek through bitcasts from 'large element' scalar/vector to 'small element' vector.
Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types.

llvm-svn: 289329
2016-12-10 17:00:00 +00:00
Adrian Prantl
63b8f792d5 Fix LLVM's use of DW_OP_bit_piece in DWARF expressions.
LLVM's use of DW_OP_bit_piece is incorrect and a based on a
misunderstanding of the wording in the DWARF specification. The offset
argument of DW_OP_bit_piece refers to the offset into the location
that is on the top of the DWARF expression stack, and not an offset
into the source variable. This has since also been clarified in the
DWARF specification.

This patch fixes all uses of DW_OP_bit_piece to emit the correct
offset and simplifies the DwarfExpression class to semi-automaticaly
emit empty DW_OP_pieces to adjust the offset of the source variable,
thus simplifying the code using DwarfExpression.

While this is an incompatible bugfix, in practice I don't expect this
to be much of a problem since LLVM's old interpretation and the
correct interpretation of DW_OP_bit_piece differ only when there are
gaps in the fragmented locations of the described variables or if
individual fragments are smaller than a byte. LLDB at least won't
interpret locations with gaps in them because is has no way to present
undefined bits in a variable, and there is a high probability that an
old-form expression will be malformed when interpreted correctly,
because the DW_OP_bit_piece offset will be outside of the location at
the top of the stack.

As a nice side-effect, this patch enables us to use a more efficient
encoding for subregisters: In order to express a sub-register at a
non-zero offset we now use a DW_OP_bit_piece instead of shifting the
value into place manually.

This patch also adds missing test coverage for code paths that weren't
exercised before.

<rdar://problem/29335809>
Differential Revision: https://reviews.llvm.org/D27550

llvm-svn: 289266
2016-12-09 20:43:40 +00:00
Paul Robinson
5f7230c1c5 [DWARF] Suppress .loc directives from CFI instructions
Like DBG_VALUE, these emit nothing to the .text section, and sometimes
have no source location specified.  Just ignore them.

Differential Revision: http://reviews.llvm.org/D27492

llvm-svn: 289256
2016-12-09 19:15:32 +00:00
Simon Pilgrim
62680ef29e [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED)
Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type.

llvm-svn: 289232
2016-12-09 17:53:11 +00:00
Matt Arsenault
eb15d80d21 AMDGPU: Fix i128 mul
llvm-svn: 289231
2016-12-09 17:49:14 +00:00
Nirav Dave
0b475dec77 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r289221 which appears to be triggering an assertion

llvm-svn: 289226
2016-12-09 17:18:24 +00:00
Nirav Dave
075ae0197d In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after fixing overly aggressive load-store forwarding optimization.

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.

Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).

Additional Minor Changes:

   1. Finishes removing unused AliasLoad code
   2. Unifies the the chain aggregation in the merged stores across
      code paths
   3. Re-add the Store node to the worklist after calling
      SimplifyDemandedBits.
   4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
      arbitrary, but seemed sufficient to not cause regressions in
      tests.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations

Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289221
2016-12-09 16:15:12 +00:00
Simon Pilgrim
b1e8d4a717 Use SelectionDAG.getSplatBuildVector helper. NFCI.
llvm-svn: 289220
2016-12-09 16:01:50 +00:00
Simon Pilgrim
76dbac96c6 [SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.
Makes interception of BUILD_VECTOR creation easier for debugging.

llvm-svn: 289218
2016-12-09 15:23:41 +00:00
Simon Pilgrim
3d9639ff3f [SelectionDAG] Add additional checks to CONCAT_VECTORS creation
Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements.

llvm-svn: 289214
2016-12-09 14:27:52 +00:00
Benjamin Kramer
11e0ca31fc Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed.
llvm-svn: 289208
2016-12-09 13:33:41 +00:00
Simon Pilgrim
668cfba535 [SelectionDAG] Add partial BITCAST support to computeKnownBits
Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together.

We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them.

Differential Revision: https://reviews.llvm.org/D27129

llvm-svn: 289200
2016-12-09 10:13:45 +00:00
Daniel Jasper
d4a88ded31 Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes"
This reverts commit r288916 as it is currently causing a crasher in
Halide. Reproducer on llvm.org/PR31323. While it might be that halide is
generating invalid IR, llc shouldn't crash.

llvm-svn: 289194
2016-12-09 09:04:51 +00:00
Tim Northover
7be37de3d0 GlobalISel: fall back gracefully for debug intrinsics.
Supporting them properly is a reasonably complex chunk of work, so to allow bot
testing before then we should at least be able to fall back to DAG ISel.

llvm-svn: 289150
2016-12-08 22:44:13 +00:00
Tim Northover
9192e8f992 GlobalISel: factor overflow handling into separate function. NFC.
llvm-svn: 289149
2016-12-08 22:44:00 +00:00
Reid Kleckner
53a410195b Don't emit .seh_handler directives for any cleanup funclets
We were falsely claiming that we had an LSDA for the relevant EH
personality before this change, which could lead to the EH machinery
interpreting random adjacent data as an LSDA.

Fixes PR31317

This change is safe because cleanups can't contain exception handlers
today. We do these things to maintain that invariant:
- C++ destructors are naturally out-of-line
- __finally blocks are outlined in clang
- LLVM's inliner will not inline EH constructs into cleanups

llvm-svn: 289101
2016-12-08 20:38:46 +00:00
NAKAMURA Takumi
f8337108d2 Prune unused libdeps.
llvm-svn: 289060
2016-12-08 15:28:02 +00:00
Nicolai Haehnle
3557fd1bae [SelectionDAG] Add expansion and promotion of [US]MUL_LOHI
Summary:
Most targets set the action for these nodes to Expand even though there
isn't actually any code for them in ExpandNode. Instead, targets simply
relied on the fact that no code generates these nodes as long as the
nodes aren't legal or custom.

However, generating these nodes can be useful e.g. for divide-by-constant
in wider integer types.

Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and
a sequence of half-width multiplications otherwise. Promote uses a wider
multiply.

This patch intends to not change the generated code, but indirect effects
are possible since expansions/promotions that were previously done in
DAGCombine may now be done in LegalizeDAG.

See D24822 for a change that actually uses the new expansion.

Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD

Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D24956

llvm-svn: 289050
2016-12-08 14:08:14 +00:00
Daniel Jasper
e4ffea4b0a Move DwarfGenerator.cpp to unittests
So far it creates a test helper and so it should be moved there. It also
create a layering cycle between CodeGen and CodeGen/AsmPrinter, which
should be avoided.

Review: https://reviews.llvm.org/D27570
llvm-svn: 289044
2016-12-08 12:45:29 +00:00
Simon Pilgrim
6eaec41012 Wdocumentation fix
llvm-svn: 289038
2016-12-08 10:41:41 +00:00
Keno Fischer
1d88090ce6 Revert "[CodeGen] Fix invalid DWARF info on Win64"
Appears to break on build bots. Reverting pending investigation.

llvm-svn: 289014
2016-12-08 01:56:23 +00:00
Keno Fischer
7c7d74df6a [CodeGen] Fix invalid DWARF info on Win64
The relocations for `DIEEntry::EmitValue` were wrong for Win64
(emitting FK_Data_4 instead of FK_SecRel_4). This corrects that
oversight so that the DWARF data is correct in Win64 COFF files.

Fixes PR15393.

Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch
by David Majnemer.

Differential Revision: https://reviews.llvm.org/D21731

llvm-svn: 289013
2016-12-08 01:40:21 +00:00
Greg Clayton
34f25a2606 Make a DWARF generator so we can unit test DWARF APIs with gtest.
The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps.

More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings.

DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests:

dwarfgen::Generator DG;
Triple Triple("x86_64--");
bool success = DG.init(Triple, Version);
if (!success)
  return;
dwarfgen::CompileUnit &CU = DG.addCompileUnit();
dwarfgen::DIE CUDie = CU.getUnitDIE();

CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);

dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);

dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type);
IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);

dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
// ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie);
ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie);

StringRef FileBytes = DG.generate();
MemoryBufferRef FileBuffer(FileBytes, "dwarf");
auto Obj = object::ObjectFile::createObjectFile(FileBuffer);
EXPECT_TRUE((bool)Obj);
DWARFContextInMemory DwarfContext(*Obj.get());
This code is backed by the AsmPrinter code that emits DWARF for the actual compiler.

While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings.

Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class.

Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset.

DIEInteger now supports more DW_FORM values.

There are also unit tests that cover:

Encoding and decoding all form types and values
Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit.

Differential Revision: https://reviews.llvm.org/D27326

llvm-svn: 289010
2016-12-08 01:03:48 +00:00
Matthias Braun
0550c9ad11 TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFC
llvm-svn: 289003
2016-12-08 00:16:08 +00:00
Matthias Braun
ed9d5255b8 LivePhysReg: Use reference instead of pointer in init(); NFC
llvm-svn: 289002
2016-12-08 00:15:51 +00:00
Quentin Colombet
bf33e15f26 [InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty list.
Since r287792 if we try to do that we will hit an assert.

llvm-svn: 289001
2016-12-08 00:06:51 +00:00
Tim Northover
cdda15523c GlobalISel: use correct builder for ConstantExprs.
ConstantExpr instances were emitting code into the current block rather than
the entry block. This meant they didn't necessarily dominate all uses, which is
clearly wrong.

llvm-svn: 288985
2016-12-07 21:29:15 +00:00
Tim Northover
a947270b42 GlobalISel: store the current MachineFunction as direct state. NFC.
Having to ask the MIRBuilder for the current function is a little awkward, and
I'm intending to improve how that's threaded through anyway.

llvm-svn: 288983
2016-12-07 21:17:47 +00:00
Tim Northover
9cf8f9c151 GlobalISel: simplify MachineIRBuilder interface.
MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.

Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.

llvm-svn: 288980
2016-12-07 21:05:38 +00:00
Simon Pilgrim
e88a47ae36 [SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes
llvm-svn: 288926
2016-12-07 17:54:00 +00:00
Simon Pilgrim
1411a026a3 [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes
llvm-svn: 288916
2016-12-07 16:28:21 +00:00
Simon Pilgrim
219f636fa6 [SelectionDAG] Removed old knownbits TODO comment. NFCI.
EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range.

llvm-svn: 288913
2016-12-07 15:31:12 +00:00
Eli Friedman
3cc7de085d [CodeGen] Fix result type for SMULO/UMULO legalization
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.

This fixes issue https://github.com/rust-lang/rust/issues/37829

Patch by Vadzim Dambrouski!

Differential Revision: https://reviews.llvm.org/D27154

llvm-svn: 288857
2016-12-06 22:49:36 +00:00
Tim Northover
a43d6fcffd GlobalISel: correctly handle small args via memory.
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.

llvm-svn: 288848
2016-12-06 21:02:19 +00:00
Simon Pilgrim
09a65b6f28 [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.

Differential Revision: https://reviews.llvm.org/D27461

llvm-svn: 288842
2016-12-06 19:09:37 +00:00
Tim Northover
c414033584 GlobalISel: fall back gracefully when we hit unhandled legalizer default.
llvm-svn: 288840
2016-12-06 19:02:15 +00:00
Simon Pilgrim
58b1271ee7 [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element
llvm-svn: 288839
2016-12-06 18:58:25 +00:00
Tim Northover
27693beb8e GlobalISel: handle G_SEQUENCE fallbacks gracefully.
There were two problems:
  + AArch64 was reusing random data from its binary op tables, which is
    complete nonsense for G_SEQUENCE.
  + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
    the generic code asserted.

llvm-svn: 288836
2016-12-06 18:38:38 +00:00