1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

15678 Commits

Author SHA1 Message Date
Quentin Colombet
96d6f82ab0 [X86] Fix the lowering of TLS calls.
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.

llvm-svn: 267797
2016-04-27 21:37:37 +00:00
Matt Arsenault
982e737c85 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

llvm-svn: 267791
2016-04-27 21:05:08 +00:00
Ahmed Bougacha
e8bff14c32 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Ahmed Bougacha
492c1a346a [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Kevin B. Smith
1783031f2d [X86]: Quit promoting 16 bit loads to 32 bit.
Differential Revision: http://reviews.llvm.org/D19592

llvm-svn: 267773
2016-04-27 19:58:03 +00:00
Marcin Koscielnicki
1e17bfd3e5 [Mips] Add support for llvm.thread.pointer intrinsic.
This will be used to implement __builtin_thread_pointer in clang.

Differential Revision: http://reviews.llvm.org/D19569

llvm-svn: 267743
2016-04-27 17:21:49 +00:00
Nicolai Haehnle
494b4aee1e AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic
Summary:
So it appears that to guarantee some of the ordering requirements of a GLSL
memoryBarrier() executed in the shader, we need to emit an s_waitcnt.

(We can't use an s_barrier, because memoryBarrier() may appear anywhere in
the shader, in particular it may appear in non-uniform control flow.)

Reviewers: arsenm, mareko, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19203

llvm-svn: 267729
2016-04-27 15:46:01 +00:00
Artem Tamazov
0b6855273a [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers.
Possibility to specify code of hardware register kept.
Disassemble to symbolic name, if name is known.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19335

llvm-svn: 267724
2016-04-27 15:17:03 +00:00
Nico Weber
b519b357d0 Revert r267649, it caused PR27539.
llvm-svn: 267723
2016-04-27 15:16:54 +00:00
Zlatko Buljan
92f1550331 [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions
Differential Revision: http://reviews.llvm.org/D16676

llvm-svn: 267694
2016-04-27 11:31:44 +00:00
Zlatko Buljan
a2323fb2af [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions
Differential Revision: http://reviews.llvm.org/D17989

llvm-svn: 267693
2016-04-27 11:02:23 +00:00
Chuang-Yu Cheng
d389efdf8e [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state instead of implicit
This fixes PR27414

Reviewers: kbarton mgrang tjablin

http://reviews.llvm.org/D19255

llvm-svn: 267660
2016-04-27 02:59:28 +00:00
Ahmed Bougacha
991d42e979 [X86] Don't assume that MMX extractelts are from index 0.
It's probably the case for all 3 MMX users out there, but with
hand-crafted IR, you can trigger selection failures. Fix that.

llvm-svn: 267652
2016-04-27 01:35:29 +00:00
Ahmed Bougacha
208a5db302 [X86] Re-enable MMX i32 extractelt combine.
This effectively adds back the extractelt combine removed by r262358:
the direct case can still occur (because x86_mmx is special, see
r262446), but it's the indirect case that's now superseded by the
generic combine.

llvm-svn: 267651
2016-04-27 01:35:25 +00:00
Cong Hou
3dea148bfe Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched.
Differential revision: http://reviews.llvm.org/D14840

llvm-svn: 267649
2016-04-27 01:29:18 +00:00
Quentin Colombet
c01de3fc6a [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing
the prologue.

Do not use basic blocks that have EFLAGS live-in as prologue if we need
to realign the stack. Realigning the stack uses AND instruction and this
clobbers EFLAGS.

An other alternative would have been to save and restore EFLAGS around
the stack realignment code, but this is likely inefficient.

Fixes PR27531.

llvm-svn: 267634
2016-04-26 23:44:14 +00:00
Mitch Bodart
d1778e20f3 [X86] Replace -mcpu with -mattr in several tests
Differential Revision: http://reviews.llvm.org/D19568

llvm-svn: 267629
2016-04-26 23:36:38 +00:00
Quentin Colombet
67573257d1 [MachineBasicBlock] Take advantage of the partially dead information.
Thanks to that information we wouldn't lie on a register being live whereas it
is not.

llvm-svn: 267622
2016-04-26 23:14:29 +00:00
Quentin Colombet
c2937566b8 [MachineInstrBundle] Improvement the recognition of dead definitions.
Now, it is possible to know that partial definitions are dead definitions and
recognize that clobbered registers are also dead.

llvm-svn: 267621
2016-04-26 23:14:24 +00:00
Krzysztof Parzyszek
8d29c2a6a5 [Tail duplication] Handle source registers with subregisters
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.

Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.

Differential Revision: http://reviews.llvm.org/D19337

llvm-svn: 267583
2016-04-26 18:36:34 +00:00
Manman Ren
e3c0ba8445 Swift Calling Convention: use %RAX for sret.
We don't need to copy the sret argument into %rax upon return.
rdar://25671494

llvm-svn: 267579
2016-04-26 18:08:06 +00:00
Saleem Abdulrasool
63a10bee58 tests: tweak MIR for ARM tests to correct MI issues
The Machine Instruction Verifier flagged some issues in the serialized MIR.
Adjust the input to correct them.

Fixes the remaining portion of PR27480.

llvm-svn: 267578
2016-04-26 17:54:21 +00:00
Saleem Abdulrasool
4348c4ad02 test: remove some bleeding whitespace
Kill bleeding whitespace.  NFC

llvm-svn: 267577
2016-04-26 17:54:16 +00:00
Sanjay Patel
7177ce0576 [CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344

CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.

For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.

As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for 
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 
50/50 taken/not-taken might still be 100% predictable.

Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.

Differential Revision: http://reviews.llvm.org/D19488

llvm-svn: 267572
2016-04-26 17:11:17 +00:00
Konstantin Zhuravlyov
a8b24aaab2 [AMDGPU] Reserve VGPRs for trap handler usage if instructed
Differential Revision: http://reviews.llvm.org/D19235

llvm-svn: 267563
2016-04-26 15:43:14 +00:00
Andrey Turetskiy
53086bdd4e [X86] PR27502: Fix the LEA optimization pass.
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.

Differential Revision: http://reviews.llvm.org/D19409

llvm-svn: 267551
2016-04-26 12:18:12 +00:00
Marcin Koscielnicki
704c818d77 [PowerPC] Add support for llvm.thread.pointer
Differential Revision: http://reviews.llvm.org/D19304

llvm-svn: 267546
2016-04-26 10:37:22 +00:00
Marcin Koscielnicki
599068857b [SPARC] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on sparc.

Differential Revision: http://reviews.llvm.org/D19386

llvm-svn: 267545
2016-04-26 10:37:14 +00:00
Marcin Koscielnicki
470e89bab4 [SPARC] Add support for llvm.thread.pointer.
Differential Revision: http://reviews.llvm.org/D19387

llvm-svn: 267544
2016-04-26 10:37:01 +00:00
Craig Topper
b7db006017 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Craig Topper
4513730c09 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper
783834f3cf [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Richard Trieu
0f158489d6 Pass the test file in through stdin instead of by filename.
When passed in via filename, this test will fail if the path to the test
has the strings "f1" and "f2" in somewhere.  Pass the file through stdin
to prevent test failures due to coincidences in path names.

llvm-svn: 267517
2016-04-26 03:43:49 +00:00
Dan Gohman
2ad6a4c0e6 [WebAssembly] Account for implicit operands when computing operand indices.
llvm-svn: 267511
2016-04-26 01:40:56 +00:00
Ahmed Bougacha
b6c12fe106 [X86] Use LivePhysRegs in X86FixupBWInsts.
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.

Differential Revision: http://reviews.llvm.org/D19472

llvm-svn: 267495
2016-04-26 00:00:48 +00:00
James Y Knight
439f0092c7 [Sparc] Fix double-float fabs and fneg on little endian CPUs.
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.

However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.

Thus, this expansion must check the endianness to use the correct
subregister.

llvm-svn: 267489
2016-04-25 22:54:09 +00:00
Tim Northover
66f8d5ae59 ARM: put extern __thread stubs in a special section.
The linker needs to know that the symbols are thread-local to do its job
properly.

llvm-svn: 267473
2016-04-25 21:12:04 +00:00
Quentin Colombet
7f8c56085e Re-apply r267206 with a fix for the encoding problem: when the immediate of
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.

[AArch64] Fix optimizeCondBranch logic.

The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267465
2016-04-25 20:54:08 +00:00
Matt Arsenault
b60850cb10 AMDGPU: Implement addrspacecast
llvm-svn: 267452
2016-04-25 19:27:24 +00:00
Matt Arsenault
524b24258c AMDGPU: Add queue ptr intrinsic
llvm-svn: 267451
2016-04-25 19:27:18 +00:00
Krzysztof Parzyszek
7bc31501a1 [Hexagon] Register save/restore functions do not follow regular conventions
Do not mark them as modifying any of the volatile registers by default.

llvm-svn: 267433
2016-04-25 17:49:44 +00:00
Sanjay Patel
2bdc7d43ef add tests for potential CGP transform (PR27344)
llvm-svn: 267426
2016-04-25 16:56:52 +00:00
Marcin Koscielnicki
de3ced2d10 [PR27390] [CodeGen] Reject indexed loads in CombinerDAG.
visitAND, when folding and (load) forgets to check which output of
an indexed load is involved, happily folding the updated address
output on the following testcase:

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%typ = type { i32, i32 }

define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
  %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
  %1 = load i32, i32* %b, align 4
  %2 = ptrtoint i32* %b to i64
  %3 = and i64 %2, -35184372088833
  %4 = inttoptr i64 %3 to i32*
  %_msld = load i32, i32* %4, align 4
  %zzz = add i32 %1,  %_msld
  ret i32 %zzz
}

Fix this by checking ResNo.

I've found a few more places that currently neglect to check for
indexed load, and tightened them up as well, but I don't have test
cases for them.  In fact, they might not be triggerable at all,
at least with current targets.  Still, better safe than sorry.

Differential Revision: http://reviews.llvm.org/D19202

llvm-svn: 267420
2016-04-25 15:43:44 +00:00
Hrvoje Varga
e25aaa57ef [mips][microMIPS] Revert commit r267137
Commit r267137 was the reason for failing tests in LLVM test suite.

llvm-svn: 267419
2016-04-25 15:40:08 +00:00
Sanjay Patel
74d1b00e49 [x86] auto-generate checks for cmov tests
llvm-svn: 267417
2016-04-25 15:26:57 +00:00
David Majnemer
c78d15d0c3 [WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH successors
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.

This resulted in split points which were insufficiently early if an
invoke was present.

This fixes PR27501.

N.B.  This removes getLandingPadSuccessor.

llvm-svn: 267412
2016-04-25 14:31:32 +00:00
Silviu Baranga
6c665bec7d [ARM] Add support for the X asm constraint
Summary:
This patch adds support for the X asm constraint.

To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).

Fixes PR26493

Reviewers: t.p.northover, echristo, rengolin

Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19061

llvm-svn: 267411
2016-04-25 14:29:18 +00:00
Artem Tamazov
7fa01faba1 [AMDGPU][llvm-mc] s_getreg/setreg* - Add hwreg(...) syntax.
Added hwreg(reg[,offset,width]) syntax.
Default offset = 0, default width = 32.
Possibility to specify 16-bit immediate kept.
Added out-of-range checks.
Disassembling is always to hwreg(...) format.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19329

llvm-svn: 267410
2016-04-25 14:13:51 +00:00
Marcin Koscielnicki
5a59940bb3 [PowerPC] [PR27387] Disallow r0 for ADD8TLS.
ADD8TLS, a variant of add instruction used for initial-exec TLS,
currently accepts r0 as a source register.  While add itself supports
r0 just fine, linker can relax it to a local-exec sequence, converting
it to addi - which doesn't support r0.

Differential Revision: http://reviews.llvm.org/D19193

llvm-svn: 267388
2016-04-25 09:24:34 +00:00
Michael Zuckerman
4cc752e542 Fixing wrong mask size error. From __mmask8 to __mmask16.
Was reviewed over the shoulder by AsafBadouh.
Connected to review http://reviews.llvm.org/D19195.

llvm-svn: 267379
2016-04-25 05:27:51 +00:00