1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

128635 Commits

Author SHA1 Message Date
Chad Rosier
41407e5c3d [AArch64] Refactor AArch64FrameLowering::emitPrologue. NFC.
http://reviews.llvm.org/D18125
Patch by Aditya Kumar.

llvm-svn: 263461
2016-03-14 18:24:34 +00:00
Quentin Colombet
cca4e6c42d [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Chad Rosier
1c035c4cf5 [AArch64] Break the dependency between FP and SP when possible.
When the SP in not changed because of realignment/VLAs etc., we restore the SP
by using the previous value of SP and not the FP. Breaking the dependency will
help in cases when the epilog of a callee is close to the epilog of the caller;
for then "sub sp, fp, #" depends on the load restoring the FP in the epilog of
the callee.

http://reviews.llvm.org/D18060
Patch by Aditya Kumar and Evandro Menezes.

llvm-svn: 263458
2016-03-14 18:17:41 +00:00
Quentin Colombet
573b41c165 [ADT] Add a pop_back_val method to the SparseSet container.
The next commit will use it.

llvm-svn: 263455
2016-03-14 18:10:41 +00:00
Chad Rosier
3b7d2db4ef [Mips] Fix -Wunused-private-field warning after r263444.
llvm-svn: 263454
2016-03-14 18:10:20 +00:00
Sanjay Patel
f7ad46820f [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel
f22bc14a47 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Tom Stellard
1cea59b42c AMDGPU/SI: Handle wait states required for DPP instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17543

llvm-svn: 263447
2016-03-14 17:05:56 +00:00
Sanjay Patel
f3adc07abf [x86, AVX] replace masked load with full vector load when possible
Converting masked vector loads to regular vector loads for x86 AVX should always be a win.
I raised the legality issue of reading the extra memory bytes on llvm-dev. I did not see any
objections.

1. x86 already does this kind of optimization for multiple scalar loads -> vector load.
2. If other targets have the same flexibility, we could move this transform up to CGP or DAGCombiner.

Differential Revision: http://reviews.llvm.org/D18094

llvm-svn: 263446
2016-03-14 16:54:43 +00:00
Daniel Sanders
0983c108c8 [mips] MIPS32R6 compact branch support
Summary:
MIPSR6 introduces a class of branches called compact branches. Unlike the
traditional MIPS branches which have a delay slot, compact branches do not
have a delay slot. The instruction following the compact branch is only
executed if the branch is not taken and must not be a branch.

It works by generating compact branches for MIPS32R6 when the delay slot
filler cannot fill a delay slot. Then, inspecting the generated code for
forbidden slot hazards (a compact branch with an adjacent branch or other
CTI) and inserting nops to clear this hazard.

Patch by Simon Dardis.

Reviewers: vkalintiris, dsanders

Subscribers: MatzeB, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16353

llvm-svn: 263444
2016-03-14 16:24:05 +00:00
Marek Olsak
64405cd52f AMDGPU/SI: Incomplete shader binaries need to finish execution at the end
Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D18058

llvm-svn: 263441
2016-03-14 15:57:14 +00:00
Nicolai Haehnle
176749e5b2 AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence
Summary:
When multiple threads perform an atomic op with the same arguments, they
will usually see different return values.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18101

llvm-svn: 263440
2016-03-14 15:37:18 +00:00
Vasileios Kalintiris
b93f6a836b [mips] Use range-based for loops. NFC.
llvm-svn: 263438
2016-03-14 15:05:30 +00:00
Benjamin Kramer
8ea7e842e3 Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379."
This reverts commit r263424. Breaks self-host.

llvm-svn: 263437
2016-03-14 14:58:36 +00:00
Ulrich Weigand
5020f81c76 [SystemZ] Avoid LER on z13 due to partial register dependencies
On the z13, it turns out to be more efficient to access a full
floating-point register than just the upper half (as done e.g.
by the LE and LER instructions).

Current code already takes this into account when loading from
memory by using the LDE instruction in place of LE.  However,
we still generate LER, which shows the same performance issues
as LE in certain circumstances.

This patch changes the back-end to emit LDR instead of LER to
implement FP32 register-to-register copies on z13.

llvm-svn: 263431
2016-03-14 13:50:03 +00:00
Chad Rosier
e4cdbb48f0 [CVP] Replace nonnegative with positive, per Philip's request. NFC.
llvm-svn: 263430
2016-03-14 13:48:00 +00:00
Zlatko Buljan
cc5a6a9d1a [mips] Fix an issue with long double when function roundl is defined
Differential Revision: http://reviews.llvm.org/D17760

llvm-svn: 263428
2016-03-14 12:50:23 +00:00
Daniel Sanders
85d0b438a1 [mips] Range check uimm16_64
Summary:

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D17725

llvm-svn: 263427
2016-03-14 12:44:44 +00:00
Amjad Aboud
97fd9dd46d Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26715 at r263379.

llvm-svn: 263424
2016-03-14 12:03:20 +00:00
Daniel Sanders
eda4dc3658 [mips] Simplify ordering of range checked immediate classes.
Summary:
With the addition of checks to ensure that operands have a strict ordering
it has become tricky to manage the order in the way I originally intended.

This patch linearizes the ordering which simplifies the implementation but
requires an order that is arbitrary in places. Here are some examples:
* uimm4 < uimm5 < uimm6
* simm4 < uimm4 < simm5 < uimm5
* uimm5 < uimm5_plus1 (1..32) < uimm5_plus32 (32..63) < uimm6
  The term 'superset' starts to break down here since the *_plus* classes
  are not true supersets of uimm5 (but they are still subsets of uimm6).
* uimm5 < uimm5_64, and uimm5 < vsplat_uimm5
  This is entirely arbitrary. We need an ordering and what we pick is
  unimportant since only one is possible for a given mnemonic.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D17723

llvm-svn: 263423
2016-03-14 11:46:30 +00:00
Nikolay Haustov
bade203fd0 [AMDGPU] Assembler: SOP* instruction fixes
s_bitset0_b64, s_bitset1_b64 has 32-bit src0, not 64-bit.
s_rfe_b64 has just one destination operand and no source.
Uncomment S_BITCMP* and S_SETVSKIP, adjust SOPC_* classes for that.
Add s_memrealtime test and change comments in smem.s to follow common style.
Change test for s_memtime to use non-zero register to make it really test encoding.
Add tests for s_buffer_load*.
Add tests for SOPC instructions (same for SI and VI)

Differential Revision: http://reviews.llvm.org/D18040

llvm-svn: 263420
2016-03-14 11:17:19 +00:00
Daniel Sanders
dec4e4aa9c [mips] Range check uimm6_lsl2.
Summary:

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17291

llvm-svn: 263419
2016-03-14 11:16:56 +00:00
Hans Wennborg
3b13eb7175 Try to fix build of WebAssemblyRegStackify.cpp on Windows
It's failing to build on VS2015 with:

C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\lib\Target\WebAssembly\WebAssemblyRegStackify.cpp(520):
error C2668: 'llvm::make_reverse_iterator': ambiguous call to overloaded function
C:\b\build\slave\ClangToTWin\build\src\third_party\llvm\include\llvm/ADT/STLExtras.h(217):
note: could be 'std::reverse_iterator<llvm::MachineBasicBlock::iterator>
llvm::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(IteratorTy)'
        with
        [
            IteratorTy=llvm::MachineInstrBundleIterator<llvm::MachineInstr>
        ]
C:\b\depot_tools\win_toolchain\vs_files\391bbf1220d3edcd3cc3fccdb56224181e3b13a7\win_sdk\bin\..\..\VC\include\xutility(1217):
note: or 'std::reverse_iterator<llvm::MachineBasicBlock::iterator>
std::make_reverse_iterator<llvm::MachineInstrBundleIterator<llvm::MachineInstr>>(_RanIt)' [found using argument-dependent lookup]
        with
        [
            _RanIt=llvm::MachineInstrBundleIterator<llvm::MachineInstr>
        ]

I don't have VS2015 locally at the moment, but hopefully this will help.

llvm-svn: 263418
2016-03-14 11:04:15 +00:00
Igor Breger
e61fb42d0b AVX512: icmp operation should be always lowered to CMPM (AVX-512) instruction on SKX.
implemented by delena

Differential Revision: http://reviews.llvm.org/D18054

llvm-svn: 263417
2016-03-14 10:26:39 +00:00
Valery Pykhtin
73eb61d4ef [AMDGPU] AsmParser: Factor out parseRegister. NFC.
llvm-svn: 263411
2016-03-14 07:43:42 +00:00
Valery Pykhtin
fa61c51d9e [AMDGPU] AsmParser: refactor post push_back vector access. NFC.
llvm-svn: 263409
2016-03-14 05:25:44 +00:00
David Majnemer
00a7fd7b32 [CodeView] Consistently handle overly large symbol names
Overly large symbol names weren't correctly handled for leaf function
records.

llvm-svn: 263408
2016-03-14 05:15:09 +00:00
Valery Pykhtin
95a8f2b685 [AMDGPU] AsmParser: remove redundant isReg checks. NFC.
llvm-svn: 263407
2016-03-14 05:01:45 +00:00
Haicheng Wu
9e87f080cd [CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative
The motivating example is this

for (j = n; j > 1; j = i) {
   i = j / 2;
}

The signed division is safely to be changed to an unsigned division (j is known
to be larger than 1 from the loop guard) and later turned into a single shift
without considering the sign bit.

llvm-svn: 263406
2016-03-14 03:24:28 +00:00
Amaury Sechet
608702a15b Add facility to add/remove/check attribute on function and arguments.
Summary: This comes from work to make attribute manipulable via the C API.

Reviewers: gottesmm, hfinkel, baldrick, echristo, tejohnson

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18128

llvm-svn: 263404
2016-03-14 01:37:29 +00:00
Junmo Park
1679e4dc19 [MCSchedule] Remove comments about MinLatency. NFC
Summary:
There is no definition about MinLatency any more.

Reviewers: mcrosier, spatel, hfinkel

Differential Revision: http://reviews.llvm.org/D18079

llvm-svn: 263403
2016-03-14 00:36:19 +00:00
Simon Pilgrim
4aa8c2f74f [X86][XOP] Added target shuffle combine tests for XOP's VPPERM 2-op shuffle
Actual combing support will be added in a future patch

llvm-svn: 263402
2016-03-14 00:18:26 +00:00
David Blaikie
71951ed5d5 Remove some unused variables
llvm-svn: 263396
2016-03-13 22:00:18 +00:00
Mehdi Amini
12d624a26a Remove PreserveNames template parameter from IRBuilder
This reapplies r263258, which was reverted in r263321 because
of issues on Clang side.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
2016-03-13 21:05:13 +00:00
Simon Pilgrim
e914699fcf [X86][SSE] Added truncated vector arithmetic tests.
For cases where we are truncating an integer vector arithmetic result, it may be better to pre-truncate the input operands - no code to support this yet (scalar is done with SimplifyDemandedBits but adding vector support could be a lot of work) but these tests represent the current codegen status.

Example bugs: PR14666, PR22703

llvm-svn: 263384
2016-03-13 19:08:01 +00:00
Simon Pilgrim
176a4e0ec3 [X86][SSE41] Avoid variable blend for constant v8i16 shifts
The SSE41 v8i16 shift lowering using (v)pblendvb is great for non-constant shift amounts, but if it is constant then we can efficiently reduce the VSELECT to shuffles with the pre-SSE41 lowering.

llvm-svn: 263383
2016-03-13 18:35:59 +00:00
Amjad Aboud
9bfdf16f5b Fixed DIBuilder to verify that same imported entity will not be added twice to the "imports" list of the DICompileUnit.
Differential Revision: http://reviews.llvm.org/D17884

llvm-svn: 263379
2016-03-13 11:11:39 +00:00
David Majnemer
4253ad553a [CodeView] Truncate display names
Fundamentally, the length of a variable or function name is bound by the
maximum size of a record: 0xffff.  However, the name doesn't live in a
vacuum; other data is associated with the name, lowering the bound
further.

We would naively attempt to emit the name, causing us to assert because
the record would no-longer fit in 16-bits.  Instead, truncate the name
but preserve as much as we can.

While I have tested this locally, I've decided to not commit it due to
the test's size.

N.B.  While this behavior is undesirable, it is better than MSVC's
behavior.  They seem to truncate to ~4000 characters.

llvm-svn: 263378
2016-03-13 10:53:30 +00:00
David Majnemer
01978bba2d [Bitcode] Make writeComdats less strange
It had a weird artificial limitation on the write side: the comdat name
couldn't be bigger than 2**16.  However, the reader had no such
limitation.  Make the reader and the writer agree.

llvm-svn: 263377
2016-03-13 08:01:03 +00:00
Fiona Glaser
4634b04513 ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression
Check to see if all operands are constant before calling simplify on them
so that we don't perform wasted simplifications.

llvm-svn: 263374
2016-03-13 05:36:15 +00:00
Matt Arsenault
974dc48135 Fix build
llvm-svn: 263372
2016-03-13 05:22:08 +00:00
Matt Arsenault
26aa8f6035 APFloat: Fix ilogb for denormals
llvm-svn: 263370
2016-03-13 05:12:32 +00:00
Matt Arsenault
b9effef6fc APFloat: Fix scalbn handling of denormals
This was incorrect for denormals, and also failed
on longer exponent ranges.

llvm-svn: 263369
2016-03-13 05:11:51 +00:00
Rui Ueyama
b8549ecbdb Define IsRela static const member to Elf_Rel type.
So that we can write RelTy::IsRela to query its type.

llvm-svn: 263367
2016-03-13 04:55:44 +00:00
Craig Topper
f34a1d74e9 [X86] Remove many operands that represent memory stores from outs to ins. These operands are the registers and immediates that specify the memory address not the memory itself thus they are inputs.
llvm-svn: 263354
2016-03-13 02:56:31 +00:00
Amaury Sechet
bd13303f63 Add echo test for constant data arrays in the LLVM C API
llvm-svn: 263350
2016-03-13 00:58:25 +00:00
Amaury Sechet
483902dcce Use templated version of unwrap instead of cats in the Core.cpp. NFC
llvm-svn: 263349
2016-03-13 00:54:40 +00:00
Amaury Sechet
577518b8ab Move LLVMConstStructInContext so that declarationa nd definition order match. NFC
llvm-svn: 263348
2016-03-13 00:40:12 +00:00
Sanjay Patel
d21ed66293 update test to use FileCheck
llvm-svn: 263347
2016-03-12 21:09:26 +00:00
Sanjay Patel
b0eaf59441 fix documentation comments; NFC
llvm-svn: 263346
2016-03-12 20:44:58 +00:00