1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

143904 Commits

Author SHA1 Message Date
Quentin Colombet
b546b773a7 [RegisterBankInfo] Emit proper type for remapped registers.
When the OperandsMapper creates virtual registers, it used to just create
plain scalar register with the right size. This may confuse the
instruction selector because we lose the information of the instruction
using those registers what supposed to do. The MachineVerifier complains
about that already.

With this patch, the OperandsMapper still creates plain scalar register,
but the expectation is for the mapping function to remap the type
properly. The default mapping function has been updated to do that.

rdar://problem/30231850

llvm-svn: 293362
2017-01-28 02:23:48 +00:00
Daniel Berlin
26d6d363fc MemorySSA: Fix block numbering invalidation and replacement bugs discovered by updater
llvm-svn: 293361
2017-01-28 02:22:52 +00:00
Matthias Braun
5809e12d46 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Daniel Berlin
ceeee7379a MemorySSA: Move updater to its own file
llvm-svn: 293357
2017-01-28 01:35:02 +00:00
Daniel Berlin
15f7119c5e Introduce a basic MemorySSA updater, that supports insertDef,
insertUse, moveBefore and moveAfter operations.

Summary:
This creates a basic MemorySSA updater that handles arbitrary
insertion of uses and defs into MemorySSA, as well as arbitrary
movement around the CFG. It replaces the current splice API.

It can be made to handle arbitrary control flow changes.
Currently, it uses the same updater algorithm from D28934.

The main difference is because MemorySSA is single variable, we have
the complete def and use list, and don't need anyone to give it to us
as part of the API.  We also have to rename stores below us in some
cases.

If we go that direction in that patch, i will merge all the updater
implementations (using an updater_traits or something to provide the
get* functions we use, called read*/write* in that patch).

Sadly, the current SSAUpdater algorithm is way too slow to use for
what we are doing here.

I have updated the tests we have to basically build memoryssa
incrementally using the updater api, and make sure it still comes out
the same.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29047

llvm-svn: 293356
2017-01-28 01:23:13 +00:00
Quentin Colombet
4b2ab7e75e [RegisterCoalescing] Recommit the patch "Remove partial redundent copy".
In r292621, the recommit fixes a bug related with live interval update
after the partial redundent copy is moved.

This recommit solves an additional bug related to the lack of update of
subranges.

The original patch is to solve the performance problem described in
PR27827. Register coalescing sometimes cannot remove a copy because of
interference. But if we can find a reverse copy in one of the predecessor
block of the copy, the copy is partially redundent and we may remove the
copy partially by moving it to the predecessor block without the
reverse copy.

Differential Revision: https://reviews.llvm.org/D28585

Re-apply r292621

Revert "Revert rL292621. Caused some internal build bot failures in apple."

This reverts commit r292984.

Original patch: Wei Mi <wmi@google.com>
Subrange fix: Mostly Matthias Braun <matze@braunis.de>

llvm-svn: 293353
2017-01-28 01:05:27 +00:00
Evgeniy Stepanov
126bfcda16 Fix memory leak in globalisel.
#0 0x89cdeb in operator new[](unsigned long) /code/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:84:37
    #1 0x4ec87c4 in llvm::RegisterBankInfo::ValueMapping const* llvm::RegisterBankInfo::getOperandsMapping<llvm::RegisterBankInfo::ValueMapping const* const*>(llvm::RegisterBankInfo::ValueMapping const* const*, llvm::RegisterBankInfo::ValueMapping const* const*) const /code/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:297:9
    #2 0x9327ee in llvm::AArch64RegisterBankInfo::getInstrMapping(llvm::MachineInstr const&) const /code/llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp:540:30
    #3 0x4eb8d07 in llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:546:24
    #4 0x4eb9dd2 in llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:624:12
    #5 0x3141875 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /code/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:13
    #6 0x396128d in llvm::FPPassManager::runOnFunction(llvm::Function&) /code/llvm/lib/IR/LegacyPassManager.cpp:1513:27
    #7 0x3961832 in llvm::FPPassManager::runOnModule(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1534:16
    #8 0x3962540 in runOnModule /code/llvm/lib/IR/LegacyPassManager.cpp:1590:27
    #9 0x3962540 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1693
    #10 0x8ae368 in compileModule(char**, llvm::LLVMContext&) /code/llvm/tools/llc/llc.cpp:562:8
    #11 0x8a7a1b in main /code/llvm/tools/llc/llc.cpp:316:22

llvm-svn: 293351
2017-01-28 00:46:30 +00:00
Vadim Chugunov
8095538732 Test commit.
llvm-svn: 293349
2017-01-27 23:59:26 +00:00
Eugene Zelenko
a1464d4fd2 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293348
2017-01-27 23:58:02 +00:00
Tim Northover
409645fc6c GlobalISel: don't leak super-entry BB when merging with IR-level one.
We have to delete the block manually or it leaks. That triggers failures in
-fsanitize=leak bots (unsurprisingly), which should be fixed by this patch.

llvm-svn: 293347
2017-01-27 23:54:31 +00:00
Sanjay Patel
3ff9c85b95 [InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751)
This is a minimal patch to avoid the infinite loop in:
https://llvm.org/bugs/show_bug.cgi?id=31751

But the general problem is bigger: we're not canonicalizing all of the min/max forms reported
by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code
uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own
inline definitions which may be subtly different from any of the above.

The reason that the test cases in this patch need a cast op to trigger is because we don't
(yet) canonicalize all min/max forms based on matchSelectPattern() in 
canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on 
matchSelectPattern() in visitSelectInst().

The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so
I'm moving those behind the min/max fence in visitICmpInst() as the quick fix.

llvm-svn: 293345
2017-01-27 23:26:27 +00:00
Peter Collingbourne
3703d8c3b2 Analysis: Add appropriate const qualification to functions in TypeMetadataUtils.cpp. NFC.
llvm-svn: 293341
2017-01-27 22:55:30 +00:00
Kostya Serebryany
41936ee2a4 [libFuzzer] make shmem more robust in the presence of signals
llvm-svn: 293339
2017-01-27 22:41:30 +00:00
Artem Tamazov
ead6d1c1e8 [AMDGPU][mc] Fix memory corruption uncovered by AddressSanitizer during coverage/smoke Gfx7/8 testing.
Coverage/smoke Gfx7/8 tests were committed r292922 but then reverted
by r292974 due to AddressSanitizer failure, which is fixed by this patch.
Tests to be re-committed soon.

llvm-svn: 293338
2017-01-27 22:19:42 +00:00
Tim Northover
cfa8ee39df GlobalISel: set correct regclass for LOAD_STACK_GUARD.
Since it's not actually a generic MI, its register operands need a RegClass,
which is conveniently the target's pointer RegClass.

llvm-svn: 293335
2017-01-27 21:31:24 +00:00
Tim Northover
5a90ccd1f0 GlobalISel: mark incoming landing-pad registers as live.
Should fix machine verifier failures.

llvm-svn: 293334
2017-01-27 21:31:17 +00:00
Krzysztof Parzyszek
f03cd5e678 [Hexagon] Remove unused variable (and silence a warning)
llvm-svn: 293331
2017-01-27 20:40:14 +00:00
Mehdi Amini
4122f2ee13 Fix ASAN failure in cxa_demangle
Found with ASAN + libFuzzer by Kostya Serebryany <kcc@google.com>

llvm-svn: 293330
2017-01-27 20:32:16 +00:00
Mehdi Amini
80bbcf9c96 Global DCE performance improvement
Change the original algorithm so that it scales better when meeting
very large bitcode where every instruction does not implies a global.

The target query is "how to you get all the globals referenced by
another global"?

Before this patch, it was doing this by walking the body (or the
initializer) and collecting the references. What this patch is doing,
it precomputing the answer to this query for the whole module by
walking the use-list of every global instead.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D28549

llvm-svn: 293328
2017-01-27 19:48:57 +00:00
Justin Lebar
0a8fdd1e4a Update NVVMReflect usage doc to new idiom for adding target-specific early passes.
llvm-svn: 293327
2017-01-27 19:44:24 +00:00
Xinliang David Li
94dd5f41b3 [PGO] add debug option to view raw count after prof use annotation
Differential Revision: https://reviews.llvm.org/D29045

llvm-svn: 293325
2017-01-27 19:06:25 +00:00
Matthias Braun
bdf76df01d ScheduleDAGInstrs: Do not try to toggle kill flags on debug uses
Preparation for upcoming changes. No testcase as none of the public
targets bundles early enough and has a post machine scheduler enabled at
the same time. The error is also easily catched by asserts.

llvm-svn: 293324
2017-01-27 18:53:07 +00:00
Matthias Braun
6bb4bd7689 ScheduleDAGInstrs: Cleanup toggleKillFlag(); NFC
llvm-svn: 293323
2017-01-27 18:53:05 +00:00
Matthias Braun
f455049da3 ScheduleDAGInstrs: Cleanup; NFC
Comment, doxygen and a bit of whitespace cleanup.

llvm-svn: 293322
2017-01-27 18:53:00 +00:00
Tom Stellard
4f3e674183 AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D29068

llvm-svn: 293321
2017-01-27 18:41:14 +00:00
Konstantin Zhuravlyov
51a141429e [AMDGPU] Grab MCSubtargetInfo from TargetMachine instead of constructing it
Differential Revision: https://reviews.llvm.org/D29224

llvm-svn: 293318
2017-01-27 18:32:40 +00:00
Chris Ray
fa485acb5d [X86] Adding FFREEP instruction.
Summary: Small change to get the FREEP instruction to decode properly.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29193

llvm-svn: 293314
2017-01-27 18:02:53 +00:00
Anna Thomas
ee73d81422 NFC: Add debug tracing for more cases where loop unrolling fails.
llvm-svn: 293313
2017-01-27 17:57:05 +00:00
Matt Arsenault
9317a1de75 AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 293310
2017-01-27 17:42:26 +00:00
Matthew Simpson
401964cda6 [ARM/AArch64] Relocate and update InterleavedAccessPass tests (NFC)
The interleaved access pass is an IR-to-IR transformation that runs before code
generation. It matches interleaved memory operations to target-specific
intrinsics (that are later lowered to load and store multiple instructions on
ARM/AArch64). We place tests for similar passes (e.g., GlobalMergePass) under
test/Transforms. This patch moves the InterleavedAccessPass tests out of
test/CodeGen and into target-specific directories under
test/Transforms/InterleavedAccess.

Although the pass is an IR pass, many of the existing tests were llc tests
rather opt tests. For example, the tests would check for ldN/stN instructions
generated by llc rather than the intrinsic calls the pass actually inserts.
Thus, this patch updates all tests to be opt tests that check for the inserted
intrinsics. We already have separate CodeGen tests that ensure we lower the
interleaved access intrinsics to their corresponding ldN/stN instructions. In
addition to migrating the tests to opt, this patch also performs some minor
clean-up (to ensure consistent naming, etc.).

Differential Revision: https://reviews.llvm.org/D29184

llvm-svn: 293309
2017-01-27 17:33:16 +00:00
Matt Arsenault
94c9a8236c NVPTX: Make NVPTXInferAddressSpaces preserve CFG
llvm-svn: 293308
2017-01-27 17:30:39 +00:00
Jun Bum Lim
ce761ff368 [CodeGenPrep]No negative cost in the ExtLd promotion
Summary: This change prevent the signed value of cost from being negative as the value is passed as an unsigned argument.

Reviewers: mcrosier, jmolloy, qcolombet, javed.absar

Reviewed By: mcrosier, qcolombet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28871

llvm-svn: 293307
2017-01-27 17:16:37 +00:00
Stanislav Mekhanoshin
32e634e989 [AMDGPU] Turn AMDGPUUnifyMetadata back into module pass
With the adjustPassManager interface that is now possible to use
custom early module passes.

Differential Revision: https://reviews.llvm.org/D29189

llvm-svn: 293300
2017-01-27 16:38:10 +00:00
Mehdi Amini
6aa2c55163 Fix BasicAA incorrect assumption on GEP
This is fixing pr31761: BasicAA is deducing NoAlias
on the result of the GEP if the base pointer is itself NoAlias.

This is possible only if the NoAlias on the base pointer is
deduced with a non-sized query: this should guarantee that
the pointers are belonging to different memory allocation
and that the GEP can't legally jump from one to another.

Differential Revision: https://reviews.llvm.org/D29216

llvm-svn: 293293
2017-01-27 16:12:22 +00:00
Ivan Krasin
3f19a46ff1 Avoid using unspecified ordering in MetadataLoader::MetadataLoaderImpl::parseOneMetadata.
Summary:
MetadataLoader::MetadataLoaderImpl::parseOneMetadata uses
the following construct in a number of places:

```
MetadataList.assignValue(<...>, NextMetadataNo++);
```

There, NextMetadataNo gets incremented, and since the order
of arguments evaluation is not specified, that can happen
before or after other arguments are evaluated.

In a few cases the other arguments indirectly use NextMetadataNo.
For instance, it's

```
MetadataList.assignValue(
    GET_OR_DISTINCT(DIModule,
                    (Context, getMDOrNull(Record[1]),
                     getMDString(Record[2]), getMDString(Record[3]),
                     getMDString(Record[4]), getMDString(Record[5]))),
    NextMetadataNo++);
```

getMDOrNull calls getMD that uses NextMetadataNo:

```
MetadataList.getMetadataFwdRef(NextMetadataNo);
```

Therefore, the order of evaluation becomes important. That caused
a very subtle LLD crash that only happens if compiled with GCC or
if LLD is built with LTO. In the case if LLD is compiled with Clang
and regular linking mode, everything worked as intended.

This change extracts incrementing of NextMetadataNo outside of
the arguments list to guarantee the correct order of evaluation.

For the record, this has taken 3 days to track to the origin. It all
started with a ThinLTO bot in Chrome not being able to link a target
if debug info is enabled.

Reviewers: pcc, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D29204

llvm-svn: 293291
2017-01-27 15:54:49 +00:00
Simon Dardis
224192bde8 [mips] Recommit: "N64 static relocation model support"
This patch makes one change to GOT handling and two changes to N64's
relocation model handling. Furthermore, the jumptable encodings have
been corrected for static N64.

Big GOT handling is now done via a new SDNode MipsGotHi - this node is
unconditionally lowered to an lui instruction.

The first change to N64's relocation handling is the lifting of the
restriction that N64 always uses PIC. Now it is possible to target static
environments.

The second change adds support for 64 bit symbols and enables them by
default. Previously N64 had patterns for sym32 mode only. In this mode all
symbols are assumed to have 32 bit addresses. sym32 mode support
is selectable with attribute 'sym32'. A follow on patch for clang will
add the necessary frontend parameter.

This partially resolves PR/23485.

Thanks to Brooks Davis for reporting the issue!

This version corrects a "Conditional jump or move depends on uninitialised
value(s)" error detected by valgrind present in the original commit.

Reviewers: dsanders, seanbruno, zoran.jovanovic, vkalintiris

Differential Revision: https://reviews.llvm.org/D23652

llvm-svn: 293279
2017-01-27 11:36:52 +00:00
Alexey Bataev
76d6416ffb [SLP] Refactoring of horizontal reduction analysis, NFC.
Some checks in SLP horizontal reduction analysis function are performed
several times, though it is enough to perform these checks only once
during an initial attempt at adding candidate for the reduction
instruction/reduced value.

Differential Revision: https://reviews.llvm.org/D29175

llvm-svn: 293274
2017-01-27 10:54:04 +00:00
Chandler Carruth
7ddff7f375 [LICM] When we are recomputing the alias sets for a subloop, we cannot
skip sub-subloops.

The logic to skip subloops dated from when this code was shared with the
cached case. Once it was factored out to only run in the case of
recomputed subloops it became a dangerous bug. If a subsubloop contained
an interfering instruction it would be silently skipped from the alias
sets for LICM.

With the old pass manager this was extremely hard to trigger as it would
require failing to visit these subloops with the LICM pass but then
visiting the outer loop somehow. I've not yet contrived any test case
that actually manages to trigger this.

But with the new pass manager we don't do the cross-loop caching hack
that the old PM does and so we recompute alias set information from
first principles. While this seems much cleaner and simpler it exposed
this bug and would subtly miscompile code due to failing to correctly
model the aliasing constraints of deeply nested loops.

llvm-svn: 293273
2017-01-27 10:27:32 +00:00
Jonas Paulsson
e87466a278 [DAGTypeLegalizer] Handle SIGN/ZERO_EXTEND in WidenVecRes_Convert().
In case of a SIGN/ZERO_EXTEND of an incomplete vector type (using only a
partial number of available vector elements), WidenVecRes_Convert() used to
resort to scalarization.

This patch adds a handling of the (common) case where an input vector can be
found of same width as the widened result vector, by converting the node to
SIGN/ZERO_EXTEND_VECTOR_INREG.

Review: Eli Friedman
llvm-svn: 293268
2017-01-27 07:46:26 +00:00
Adam Nemet
048d1de98a [opt-viewer] Introduce global context
This is necessary since globals (max_hotness, caller_loc) need to be
explicitly passed to the subprocesses.

llvm-svn: 293266
2017-01-27 06:39:09 +00:00
Adam Nemet
b6cdaf306c [opt-viewer] Remove message from the key
This is causing problems because the rendering of the text will depend on
varying global state to show relative hotness or a link in the inlining
context.

llvm-svn: 293265
2017-01-27 06:39:08 +00:00
Adam Nemet
7d7d8fbb84 [opt-viewer] Unique across the different jobs as well
llvm-svn: 293264
2017-01-27 06:39:06 +00:00
Adam Nemet
ffe6b78a23 [opt-viewer] Make sorting for the index page deterministic
Break the tie between entries with identical hotness deterministically.

llvm-svn: 293263
2017-01-27 06:39:02 +00:00
Adam Nemet
b7b4a7b0c5 [opt-viewer] Include the function in the remark key
Avoid uniquing remarks with different the inlining context (Function).

llvm-svn: 293262
2017-01-27 06:39:01 +00:00
Adam Nemet
c38acefa67 [opt-viewer] Put critical items in parallel
Summary:
Put opt-viewer critical items in parallel

Patch by Brian Cain!

Requires features from Python 2.7

**Performance**
Below are performance results across various configurations. These were taken on an i5-5200U (dual core + HT). They were taken with a small subset of the YAML output of building Python 3.6.0b3 with LTO+PGO. 60 YAML files.

"multiprocessing" is the current submission contents. "baseline" is as of 544f14c6b2a07a94168df31833dba9dc35fd8289 (I think this is aka r287505).

"ImportError" vs "class<...CLoader>" below are just confirming the expected configuration (with/without CLoader).

The below was measured on AMD A8-5500B (4 cores) with 224 input YAML files, showing a ~1.75x speed increase over the baseline with libYAML.  I suspect it would scale well on high-end servers.

```
**************************************** MULTIPROCESSING ****************************************
PyYAML:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
        ImportError: cannot import name CLoader
        Python 2.7.10
489.42user 5.53system 2:38.03elapsed 313%CPU (0avgtext+0avgdata 400308maxresident)k
0inputs+31392outputs (0major+473540minor)pagefaults 0swaps

PyYAML+libYAML:
        <class 'yaml.cyaml.CLoader'>
        Python 2.7.10
78.69user 5.45system 0:32.63elapsed 257%CPU (0avgtext+0avgdata 398560maxresident)k
0inputs+31392outputs (0major+542022minor)pagefaults 0swaps

PyPy/PyYAML:
        Traceback (most recent call last):
          File "<builtin>/app_main.py", line 75, in run_toplevel
          File "<builtin>/app_main.py", line 601, in run_it
          File "<string>", line 1, in <module>
        ImportError: cannot import name 'CLoader'
        Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
        [PyPy 2.6.0 with GCC 4.9.3]
154.27user 8.12system 0:53.83elapsed 301%CPU (0avgtext+0avgdata 627960maxresident)k
808inputs+30376outputs (0major+727994minor)pagefaults 0swaps
**************************************** BASELINE        ****************************************
PyYAML:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
        ImportError: cannot import name CLoader
        Python 2.7.10
        358.08user 4.05system 6:08.37elapsed 98%CPU (0avgtext+0avgdata 315004maxresident)k
0inputs+31392outputs (0major+85252minor)pagefaults 0swaps

PyYAML+libYAML:
        <class 'yaml.cyaml.CLoader'>
        Python 2.7.10
50.32user 3.30system 0:56.59elapsed 94%CPU (0avgtext+0avgdata 307296maxresident)k
0inputs+31392outputs (0major+79335minor)pagefaults 0swaps

PyPy/PyYAML:
        Traceback (most recent call last):
          File "<builtin>/app_main.py", line 75, in run_toplevel
          File "<builtin>/app_main.py", line 601, in run_it
          File "<string>", line 1, in <module>
        ImportError: cannot import name 'CLoader'
        Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
        [PyPy 2.6.0 with GCC 4.9.3]
72.94user 5.18system 1:23.41elapsed 93%CPU (0avgtext+0avgdata 455312maxresident)k
0inputs+30392outputs (0major+110280minor)pagefaults 0swaps

```

Reviewers: fhahn, anemet

Reviewed By: anemet

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D26967

llvm-svn: 293261
2017-01-27 06:38:31 +00:00
Richard Trieu
ccbf67507b Fix unused variable warning.
llvm-svn: 293260
2017-01-27 06:06:05 +00:00
Saleem Abdulrasool
da5a339f73 ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks.  However, it is
designed to only handle non-vectorized division.  ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoke a division routine for each one, preferring
to lower using NEON instructions.  Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.

Resolves PR31778!

llvm-svn: 293259
2017-01-27 03:41:53 +00:00
Daniel Berlin
39c1af708a NewGVN: Add basic dead and redundant store elimination
Summary:
This adds basic dead and redundant store elimination to
NewGVN.  Unlike our current DSE, it will happily do cross-block DSE if
it meets our requirements.

We get a bunch of DSE's simple.ll cases, and some stuff it doesn't.
Unlike DSE, however, we only try to eliminate stores of the same value
to the same memory location, not just general stores to the same
memory location.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29149

llvm-svn: 293258
2017-01-27 02:37:11 +00:00
NAKAMURA Takumi
8dfc1642c6 NVPTXCodeGen: Add IPO to libdeps, since r293189.
llvm-svn: 293256
2017-01-27 02:11:10 +00:00
Tim Shen
75bc287023 [APFloat] Reduce some dispatch boilerplates. NFC.
Summary: This is an attempt to reduce the verbose manual dispatching code in APFloat. This doesn't handle multiple dispatch on single discriminator (e.g. APFloat::add(const APFloat&)), nor handles multiple dispatch on multiple discriminators (e.g. APFloat::convert()).

Reviewers: hfinkel, echristo, jlebar

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D29161

llvm-svn: 293255
2017-01-27 02:11:07 +00:00