mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
37426 Commits

Author SHA1 Message Date
Daniel Sanders
cd4dc5cad8 [mips] Correct the ordering of HI/LO pairs in the relocation table.
Summary:
There seems to have been a misunderstanding as to the meaning of 'offset' in
the rules laid down by our ABI. The previous code believed that 'offset' meant
the offset within the section that the relocation is applied to. However, it
should have meant the offset from the symbol used in the relocation expression.

This patch adds two fields to ELFRelocationEntry and uses them to correct the
order of relocations for MIPS. These fields contain:
* The original symbol before shouldRelocateWithSymbol() is considered. This
  ensures that R_MIPS_GOT16 is able to correctly distinguish between local and
  external symbols, allowing us to tell whether %got() requires a matching
  %lo() or not (local symbols require one, external symbols don't). It also
  prevents confusing cases where the fuzzy matching rules cause things like
  %hi(foo)/%lo(foo+3) and %hi(bar)/%lo(bar+1) to swap their %lo()'s.
* The original offset before shouldRelocateWithSymbol() is considered. The
  existing Addend field is always zero when the object uses in place addends
  (because it's already moved it to the encoding) but MIPS needs to use the
  original offset to ensure that the linker correctly calculates the carry-in
  bit for %hi() and %got().

IAS ensures that unmatchable %hi()/%got() relocations are placed at the end of
the table to ensure that the linker rejects the table (we're unable to report
such errors directly). The alternatives to this risk accidental matching
against inappropriate relocations which may silently compute incorrect values
due to an incorrect carry bit between the %lo() and %hi()/%got().
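
The carry bit matters because %lo() is a signed 16-bit value, so the matching %hi() has to be rounded up whenever bit 15 of the full offset is set. A minimal sketch of that arithmetic, assuming the usual MIPS definitions of %hi and %lo; the helper names `hi16`, `lo16` and `rebuild` are illustrative and are not LLVM APIs:

```python
def lo16(value):
    """Low 16 bits of value, interpreted as a signed 16-bit immediate."""
    low = value & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low


def hi16(value):
    """High 16 bits, adjusted by the carry from the signed low half."""
    return ((value + 0x8000) >> 16) & 0xFFFF


def rebuild(value):
    """Recombine the halves the way a lui/addiu pair would (32-bit wrap)."""
    return ((hi16(value) << 16) + lo16(value)) & 0xFFFFFFFF
```

With a hypothetical symbol address of 0x12340000, `hi16(0x12340000 + 0x7FFF)` and `hi16(0x12340000 + 0x8000)` differ by one, which is why pairing a %hi() with the %lo() of a different offset can silently produce the wrong address.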

Reviewers: sdardis

Subscribers: dsanders, sdardis, rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D19718

llvm-svn: 268733
2016-05-06 13:49:25 +00:00
Daniel Sanders
ae8df9e819 [mips][mips16] Use isUnconditionalBranch() in AnalyzeBranch() and constant island pass.
Summary:
This stops it misidentifying unconditional branches as conditional branches
which fixes a -verify-machineinstrs error about exiting a function via fall through.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19864

llvm-svn: 268731
2016-05-06 13:23:51 +00:00
Daniel Sanders
198717c89c [mips][fastisel] Conditional moves do not have implicit operands.
Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19862

llvm-svn: 268730
2016-05-06 12:57:26 +00:00
Sam Kolton
6ae1dde4c7 [TableGen] AsmMatcher: support for default values for optional operands
Summary:
This change allows specifying a "DefaultMethod" for an optional operand (IsOptional = 1) in AsmOperandClass; the method returns the default value for the operand. This is used in convertToMCInst to set default values in MCInst.
Previously, if you wanted to set a default value for an operand, you had to create a custom converter method. With this change, it is possible to use the standard converters even when optional operands are present.

Reviewers: tstellarAMD, ab, craig.topper

Subscribers: jyknight, dsanders, arsenm, nhaustov, llvm-commits

Differential Revision: http://reviews.llvm.org/D18242

llvm-svn: 268726
2016-05-06 11:31:17 +00:00
Dylan McKay
3be6c5786f [AVR] Add a majority of the backend code
Summary: This adds the majority of the AVR backend.

Reviewers: hfinkel, dsanders, vkalintiris, arsenm

Subscribers: dylanmckay

Differential Revision: http://reviews.llvm.org/D17906

llvm-svn: 268722
2016-05-06 10:12:31 +00:00
Nikolay Haustov
eeb6f36732 AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2.
Summary:
Check calling convention in AMDGPUMachineFunction::isKernel

This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.

Also, in the future unused non-kernels may be optimized.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19917

llvm-svn: 268719
2016-05-06 09:23:13 +00:00
Zlatko Buljan
a375151a77 [mips][microMIPS] Add CodeGen support for MUL* and DMUL* instructions
Differential Revision: http://reviews.llvm.org/D15744

llvm-svn: 268714
2016-05-06 08:24:14 +00:00
Justin Bogner
9448d374ee SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Krzysztof Parzyszek
c935f59419 [scan-build] fix warnings emitted on LLVM Hexagon code base
Patch by Apelete Seketeli.

Differential Revision: http://reviews.llvm.org/D19968

llvm-svn: 268691
2016-05-05 22:00:44 +00:00
Krzysztof Parzyszek
03ea811655 [Hexagon] Fix the offset ranges for vector memory instructions
llvm-svn: 268690
2016-05-05 21:58:02 +00:00
Chad Rosier
2778b687b1 [AArch64] Remove unused MBP headers/dependency. NFC.
llvm-svn: 268682
2016-05-05 20:58:38 +00:00
Dan Gohman
04b0e0d2b7 [WebAssembly] Don't emit epilogue code in the middle of stackified code.
llvm-svn: 268679
2016-05-05 20:41:15 +00:00
Matt Arsenault
cebbd339ae AMDGPU: Simplify control flow / conditions
llvm-svn: 268676
2016-05-05 20:27:02 +00:00
NAKAMURA Takumi
25dc6f95a1 Touch Hexagon/CMakeLists.txt to regenerate build files, since r268641 complains of missing HexagonAlias.td on ninja.
FIXME: TableGen.cmake globs *.td(s) with wildcards for deps. It is not good.
llvm-svn: 268666
2016-05-05 19:28:01 +00:00
Tim Northover
5f7a3665f1 ARM: don't attempt to merge litpools referencing different PC-anchors.
Given something like:

    ldr r0, .LCPI0_0 (== pc-rel var)
    add r0, pc

    ldr r1, .LCPI0_1 (== pc-rel var)
    add r1, pc

we cannot combine the 2 ldr instructions and litpools because they get added to
a different pc to form the correct address. I think the original logic came
from a time when we fused the LDRpci/PICADD instructions into one
pseudo-instruction so the PC was always immediately at-hand. That's no longer
the case.
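
A minimal sketch of why the two literal-pool entries cannot be shared, assuming each entry stores the target address minus the PC value seen by its own `add rN, pc`. The function names, the addresses in the usage note, and the `pc_bias` parameter are illustrative; the default of 8 models the ARM-state PC read-ahead (Thumb state would use 4):

```python
MASK32 = 0xFFFFFFFF


def litpool_entry(target, anchor, pc_bias=8):
    """Constant stored in the literal pool for a pc-relative reference.

    anchor is the address of the `add rN, pc` instruction that consumes it.
    """
    return (target - (anchor + pc_bias)) & MASK32


def resolve(entry, anchor, pc_bias=8):
    """Address produced by `ldr rN, <entry>` followed by `add rN, pc`."""
    return (entry + anchor + pc_bias) & MASK32
```

For a variable at 0x20000 referenced from anchors at 0x1000 and 0x1010, the two entries differ by 0x10. Reusing the first entry at the second anchor would resolve to 0x20010 instead of 0x20000.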

Should fix general-dynamic TLS access on Linux, and quite possibly other -fPIC
code that relies on litpools (e.g. v6m and -Oz compilations) though trivial
tweaks of the .ll test didn't provoke anything.

llvm-svn: 268662
2016-05-05 18:38:53 +00:00
Krzysztof Parzyszek
4622dac468 [Hexagon] Add aliases for vector loads/stores with no explicit offset
The mem(r0) instructions are treated as mem(r0+#0).

llvm-svn: 268661
2016-05-05 18:38:35 +00:00
Nicolai Haehnle
854cd758f6 AMDGPU: Uniform branch conditions can originate with intrinsics
Summary:
Discovered by Dave Airlie, fixes an assertion in Khronos OpenGL CTS
GL43-CTS.shader_storage_buffer_object.advanced-matrix.

In this particular case, the buffer load intrinsic fed into a uniform
conditional branch, and led the brcond lowering down the wrong path.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19931

llvm-svn: 268650
2016-05-05 17:36:36 +00:00
Tom Stellard
0d89e9527b AMDGPU/SI: Add support for AMD code object version 2.
Summary:
Version 2 is now the default.  If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19283

llvm-svn: 268647
2016-05-05 17:03:33 +00:00
Hans Wennborg
4b2eddacf6 X86CallFrameOptimization: make adjustCallSequence's return type void
It always returned the same value (true). No functionality change.

llvm-svn: 268645
2016-05-05 16:39:31 +00:00
Krzysztof Parzyszek
7943b9ccdb [Hexagon] Merge HexagonAlias.td into HexagonInstrAlias.td, NFC
llvm-svn: 268641
2016-05-05 16:19:36 +00:00
Krzysztof Parzyszek
275ca43ee3 [Hexagon] Handle operand type differences for A2_tfrpi
The instruction A2_tfrpi has a 64-bit operand, while the corresponding
intrinsic takes a 32-bit value. The actual value has only 8 significant
bits, so the difference is only in the type used to represent it.
In order to map the intrinsic to the instruction, the operand needs to
be extended to the correct type.

llvm-svn: 268635
2016-05-05 15:29:47 +00:00
James Y Knight
49bc30f7bb Remove bit-rotten CppBackend.
This backend was supposed to generate C++ code which will re-construct
the LLVM IR passed as input. This seems to me to have very marginal
usefulness in the first place.

However, the code has never been updated to use IRBuilder, which makes
its current value negative -- people who look at the output may be
steered to use the *wrong* C++ APIs to construct IR.

Furthermore, it's generated code that doesn't compile since at least
2013.

Differential Revision: http://reviews.llvm.org/D19942

llvm-svn: 268631
2016-05-05 14:35:40 +00:00
Nirav Dave
ed0981bcb7 Fix Mips Parser error reporting
[mips] On error, ParseDirective should always return false to signify that the
directive was understood.

Reviewers: dsanders, vkalintiris, sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19929

llvm-svn: 268630
2016-05-05 14:15:46 +00:00
Marcin Koscielnicki
fe2c3bbeba [X86] Extend some Linux special cases to cover kFreeBSD.
Both Linux and kFreeBSD use glibc, so follow similar code paths.
Add isTargetGlibc to check for this, and use it instead of isTargetLinux
in a few places.

Fixes PR22248 for kFreeBSD.

Differential Revision: http://reviews.llvm.org/D19104

llvm-svn: 268624
2016-05-05 11:35:51 +00:00
David Majnemer
d7465d2b1f [X86] Use the right type when folding xor (truncate (shift)) -> setcc
The result type of setcc is dependent on whether or not AVX512 is
present.
We had an X86-specific DAG-combine which assumed that the result type
should be i8 when it could be i1.
This meant that we would generate illegal setccs which LowerSETCC did
not like.

Instead, use an appropriate type and zero extend to i8.

Also, there were some scenarios where the fold should have fired but
didn't because we were overly cautious about the types.  This meant that
we generated:

        shrl    $31, %edi
        andl    $1, %edi
        kmovw   %edi, %k0
        kxnorw  %k0, %k0, %k1
        kshiftrw        $15, %k1, %k1
        kxorw   %k1, %k0, %k0
        kmovw   %k0, %eax

instead of:

        testl   %edi, %edi
        setns   %al

This fixes PR27638.
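
The two listings compute the same value. A small sketch of the equivalence, assuming the folded pattern is "extract the sign bit of a 32-bit value, then xor it with 1", as the listings suggest; both function names are illustrative:

```python
def xor_trunc_shift(x):
    """xor (truncate (x >> 31)), 1 on a 32-bit value."""
    x &= 0xFFFFFFFF
    return ((x >> 31) & 1) ^ 1


def setns(x):
    """1 if the sign bit of the 32-bit value is clear, else 0."""
    x &= 0xFFFFFFFF
    return 1 if x < 0x80000000 else 0
```

Both functions return 1 for non-negative inputs and 0 for negative ones, which is what the `testl`/`setns` pair produces in two instructions.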

llvm-svn: 268609
2016-05-05 06:00:56 +00:00
Justin Bogner
89b0f9301d ARM: Use a Handle to track SDNodes in case they're CSE'd. NFC
The code here is recursively Select-ing a new Node to avoid issues
where N is CSE'd during replaceDAGValue and stops being valid. We can
accomplish the same goal in a more principled way by using a
HandleSDNode.

This is essentially a less dodgy fix for PR25733 than the original
attempt back in r255120.

llvm-svn: 268590
2016-05-05 01:43:49 +00:00
Marcin Koscielnicki
8b6548a4bd [SystemZ] Implement backchain attribute (recommit with fix).
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI.  This will
be used to implement -mbackchain option in clang.

Differential Revision: http://reviews.llvm.org/D19889

Fixed in this version: added RegState::Define and RegState::Kill on R1D
in prologue.

llvm-svn: 268581
2016-05-05 00:37:30 +00:00
Marcin Koscielnicki
824f8f1251 Revert "[SystemZ] Implement backchain attribute."
This reverts commit rL268571.

It caused failures in register scavenger.

llvm-svn: 268576
2016-05-04 23:54:53 +00:00
Marcin Koscielnicki
b26ad64ef1 [SystemZ] Implement llvm.get.dynamic.area.offset
To be used for AddressSanitizer.

Differential Revision: http://reviews.llvm.org/D19817

llvm-svn: 268572
2016-05-04 23:31:26 +00:00
Marcin Koscielnicki
237ad4edbd [SystemZ] Implement backchain attribute.
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI.  This will
be used to implement -mbackchain option in clang.

Differential Revision: http://reviews.llvm.org/D19889

llvm-svn: 268571
2016-05-04 23:31:20 +00:00
Quentin Colombet
4ddfd26ab6 [X86] Add a few register classes for x32 address accesses.
The new register classes allow us to tell the machine verifier that it is
fine to use RIP for address accesses in x32 mode. Prior to this patch,
we would complain that we are using a GR64 in place of GR32, whereas it
is actually fine to use GR64 for x32 as long as the 32 high bits are 0s.
RIP has this property and is used for RIP-relative addressing.

This partially fixes http://llvm.org/PR27481.

llvm-svn: 268567
2016-05-04 22:45:31 +00:00
Evandro Menezes
31ae8a94dc [AArch64] Add cheap as move instructions for Exynos M1
llvm-svn: 268549
2016-05-04 20:47:25 +00:00
Evandro Menezes
bcc5f26f19 [AArch64] Use the reciprocal estimation machinery
This patch adds support for estimating the square root, its reciprocal, and
division or reciprocal using the combiner's generic reciprocal machinery.

llvm-svn: 268539
2016-05-04 20:18:27 +00:00
Vitaly Buka
2cb6d37066 Revert r268529 because it caused use-of-uninitialized-value
Summary: This reverts commit d88cc0862bf7da64850b89e9bb5ea9f95e7f1184.

#0 0xfed467 in llvm::ARMFrameLowering::determineCalleeSaves(llvm::MachineFunction&, llvm::BitVector&, llvm::RegScavenger*) const /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1625:52
#1 0x330d4cc in (anonymous namespace)::PEI::runOnMachineFunction(llvm::MachineFunction&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/PrologEpilogInserter.cpp:186:3
#2 0x3193e12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:60:13
#3 0x396237d in llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1526:23
#4 0x3962a23 in llvm::FPPassManager::runOnModule(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1547:16
#5 0x3963d52 in runOnModule /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1603:23
#6 0x3963d52 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1706
#7 0x6bb910 in compileModule(char**, llvm::LLVMContext&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:412:5
#8 0x6b3c25 in main /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:218:22
#9 0x7fd4a7d37ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
#10 0x625c93 in _start (/mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm_build_msan/bin/llc+0x625c93)

Reviewers:

Subscribers:

llvm-svn: 268536
2016-05-04 19:44:11 +00:00
Weiming Zhao
1c9cb11794 [ARM] Fix Scavenger assert due to underestimated stack size
Summary:
Currently, the check for whether a stack is a "BigStack" doesn't take spills and arguments into account. As a result, LLVM won't reserve a spill slot for a stack that actually is a "BigStack", which may cause a scavenger failure.

Reviewers: rengolin

Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D19896

llvm-svn: 268529
2016-05-04 18:19:33 +00:00
Nemanja Ivanovic
d6a1215edc [PowerPC] Generate VSX version of splat word
This patch corresponds to review:
http://reviews.llvm.org/D18592

It allows the PPC back end to generate the xxspltw instruction where we
previously only emitted vspltw.

llvm-svn: 268516
2016-05-04 16:04:02 +00:00
Jan Vesely
f70ba98667 AMDGPU/R600: Minor cleanup in InstrInfo
Use std::make_pair instead of constructor
Use C++11 loop
Reuse helper var

Reviewers: tstellardAMD

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D19787

llvm-svn: 268503
2016-05-04 14:55:45 +00:00
Daniel Sanders
fd238b0ebd [mips][ias] Only round section sizes when explicitly requested.
As requested by Rafael Espindola in his post-commit comments on r268036. This
makes the previous behaviour the default while still allowing verification of
IAS.

llvm-svn: 268496
2016-05-04 13:21:06 +00:00
Chris Dewhurst
a2cf1867ba [Sparc] Allow taking of function address into a register.
Modification of previously existing code (variable rename only), with unit test added.

Differential Revision: http://reviews.llvm.org/D19368

llvm-svn: 268493
2016-05-04 12:11:05 +00:00
Zlatko Buljan
3153f38fd5 [mips][microMIPS] Add CodeGen support for microMIPSr6 ROTR and ROTRV and add tests for LL, SC, SYSCALL, ROTR, ROTRV, LWM32, SWM32 and MOVEP instructions
Differential Revision: http://reviews.llvm.org/D19857

llvm-svn: 268491
2016-05-04 12:02:12 +00:00
Chris Dewhurst
7a66203c71 [Sparc] Implement __builtin_setjmp, __builtin_longjmp back-end.
This code implements builtin_setjmp and builtin_longjmp exception handling intrinsics for 32-bit Sparc back-ends.

The code started as a mash-up of the PowerPC and X86 versions, although enough changes to both had to be made for Sparc handling.

Note: I have manual tests running. I'll work on a unit test and add that to the rest of this diff in the next day.

Also, this implementation is only for 32-bit Sparc. I haven't focussed on a 64-bit version, although I have left the code in a prepared state for implementing this, including detecting pointer size and comments indicating where I suspect there may be differences.

Differential Revision: http://reviews.llvm.org/D19798

llvm-svn: 268483
2016-05-04 09:33:30 +00:00
David Majnemer
a2740cd234 [X86] Lower zext i1 arguments
i1 is now a legal type for X86 with AVX512.
There were some paths in X86FastISel which were not quite ready to see
an i1 value: they were not quite sure how to deal with sign/zero extends
for call arguments.
DTRT by extending to i8 for zeroext and bailing out of FastISel for
signext.

This fixes PR27591.

llvm-svn: 268470
2016-05-04 00:22:23 +00:00
Simon Pilgrim
2ce898f3ae [X86] Tidied up SDValue's SDNode referencing. NFCI.
llvm-svn: 268445
2016-05-03 21:44:45 +00:00
Tim Northover
06b0388bac X86-Darwin: start emitting data-region directives for jump-tables.
The surrounding tools can cope these days, and they were invented for a reason.

llvm-svn: 268437
2016-05-03 21:03:41 +00:00
David L Kreitzer
ed75c93233 Add an address space for the X86 SS segment.
Patch by Michael LeMay (michael.lemay@intel.com)

Differential Revision: http://reviews.llvm.org/D17093

llvm-svn: 268431
2016-05-03 20:16:08 +00:00
Tom Stellard
79e676a87a AMDGPU/SI: Use range loops to simplify some code in the SI Scheduler
Reviewers: arsenm, axeldavy

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19822

llvm-svn: 268396
2016-05-03 16:30:56 +00:00
Aaron Ballman
53075f9a50 Silence unused variable warning; NFC.
llvm-svn: 268392
2016-05-03 15:17:25 +00:00
Simon Pilgrim
386b3c8531 [X86][SSE] Added target shuffle combine to MOVQ
llvm-svn: 268391
2016-05-03 15:05:13 +00:00
James Y Knight
c0965c3577 [Sparc] Constification of TargetMachine arguments
This patch changes the TargetMachine arguments to be const. This is
required for {D19265}, and was requested to be done in a separate patch.

Patch by Jacob Hansen!

Differential Revision: http://reviews.llvm.org/D19797

llvm-svn: 268389
2016-05-03 14:57:18 +00:00
Daniel Sanders
ec5e1f9b79 [mips][fastisel] ADJCALLSTACKUP has a second immediate operand.
Summary:
It's always zero for SelectionDAG and is never read by the MIPS backend so
do the same for FastISel.

Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19863

llvm-svn: 268386
2016-05-03 14:19:26 +00:00
Daniel Sanders
848ffeb5f0 [mips] Fix unused variable warning for release builds introduced by r268379.
llvm-svn: 268383
2016-05-03 14:00:37 +00:00
Daniel Sanders
be0ced1166 [mips] Use MipsMCExpr instead of MCSymbolRefExpr for all relocations.
Summary:
This is much closer to the way MIPS relocation expressions work
(%hi(foo + 2) rather than %hi(foo) + 2) and removes the need for the
various bodges in MipsAsmParser::evaluateRelocExpr().

Removing those bodges ensures that the constant stored in MCValue is the
full 32 or 64-bit (depending on ABI) offset from the symbol. This will be used
to correct the %hi/%lo matching needed to sort the relocation table correctly.

As part of this:
* Gave MCExpr::print() the ability to omit parentheses when emitting a
  symbol reference inside a MipsMCExpr operator like %hi(X). Without this
  we print things like %lo(($L1)).
* %hi(%neg(%gprel(X))) is now three MipsMCExpr's instead of one. Most of
  the related special cases have been removed or moved to MipsMCExpr. We
  can remove the rest as we gain support for the less common relocations
  when they are not part of this specific combination.
* Renamed MipsMCExpr::VariantKind and the enum prefix ('VK_') to avoid confusion
  with MCSymbolRefExpr::VariantKind and its prefix (also 'VK_').
* fixup_Mips_GOT_Local and fixup_Mips_GOT_Global were found to be identical
  and merged into fixup_Mips_GOT.
* MO_GOT16 and MO_GOT turned out to be identical and have been merged into
  MO_GOT.
* VK_Mips_GOT and VK_Mips_GOT16 turned out to be the same thing so they
  have been merged into MEK_GOT

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19716

llvm-svn: 268379
2016-05-03 13:35:44 +00:00
Igor Breger
2e54ab6509 [AVX512] Add support for commutative MAX/MIN. In general, VMAX{PS,PD} and VMIN{PS,PD} instructions are not commutative. In the combine pass, VMAX/VMIN are converted to the commutative nodes VMAXC/VMINC only if UnsafeFPMath is used.
Differential Revision: http://reviews.llvm.org/D19860

llvm-svn: 268375
2016-05-03 11:51:45 +00:00
Igor Breger
ab90b9e166 [AVX512] Fix lowerV4X128VectorShuffle to correctly select input operands.
Differential Revision: http://reviews.llvm.org/D19803

llvm-svn: 268368
2016-05-03 08:08:44 +00:00
Matthias Braun
4869e8d140 AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZ
This fixes -verify-machineinstrs complaints when compiling
test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp

llvm-svn: 268360
2016-05-03 04:54:16 +00:00
Matthias Braun
bac4271200 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must not be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00
Matthias Braun
d86d0ddbc5 LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion.
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

llvm-svn: 268336
2016-05-03 00:08:46 +00:00
Quentin Colombet
6b53c89899 [X86] Model FAULTING_LOAD_OP as a terminator and branch.
This operation may branch to the handler block and we do not want it
to happen anywhere within the basic block.
Moreover, by marking it "terminator and branch" the machine verifier
does not wrongly assume (because of AnalyzeBranch not knowing better)
the branch is analyzable. Indeed, the target was seeing only the
unconditional branch and not the faulting load op and thought it was
a simple unconditional block.
The machine verifier was complaining because of that and, moreover,
other optimizations could have performed wrong transformations!

In the process, simplify the representation of the handler block in
the faulting load op. Now, we directly reference the handler block
instead of using a label. This has the benefits of:
1. MC knows how to issue a label for a BB, so leave that to it.
2. Accessing the target BB from its label is painful, whereas it is
   direct from a MBB operand.

Note: The 2-byte offset in implicit-null-check.ll comes from the
fact that the unconditional jumps are not removed anymore, as the whole
terminator sequence is not analyzable anymore.

Will fix it in a subsequent commit.

llvm-svn: 268327
2016-05-02 22:58:54 +00:00
Simon Pilgrim
116d17f711 [X86][SSE] Added placeholder for 128/256-bit wide shuffle combines
Began adding a placeholder for future support for vperm2f128/vshuff64x2 style 128/256-bit wide shuffles

llvm-svn: 268306
2016-05-02 21:12:48 +00:00
Matt Arsenault
dfb613a88d AMDGPU: Custom lower v2i32 loads and stores
This will allow us to split up 64-bit private accesses when
necessary.

llvm-svn: 268296
2016-05-02 20:13:51 +00:00
Tom Stellard
d541008932 AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch
We were using v_readlane_b32 with the lane set to zero, but this won't
work if thread 0 is not active.
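
A toy model of the difference, assuming the spilled value is only written to lanes that are active in the execution mask. The lane count, the `stale` placeholder value, and all function names are illustrative and simplify the real hardware; in particular, this model treats an empty mask as an error:

```python
def v_readlane(vgpr, lane):
    """Read one specific lane, regardless of whether it is active."""
    return vgpr[lane]


def v_readfirstlane(vgpr, exec_mask):
    """Read the lowest-numbered lane whose bit is set in exec_mask."""
    for lane in range(len(vgpr)):
        if exec_mask & (1 << lane):
            return vgpr[lane]
    raise ValueError("no active lane in this toy model")


def spill_and_reload(value, exec_mask, wave_size=4, stale=0xDEAD):
    """Write value to active lanes only, then reload it both ways.

    Returns (result via v_readlane lane 0, result via v_readfirstlane).
    """
    vgpr = [value if exec_mask & (1 << lane) else stale
            for lane in range(wave_size)]
    return v_readlane(vgpr, 0), v_readfirstlane(vgpr, exec_mask)
```

When lane 0 is inactive, for example with `exec_mask=0b0110`, reading lane 0 returns the stale placeholder while the first-active-lane read still returns the restored value.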

Differential Revision: http://reviews.llvm.org/D19745

llvm-svn: 268295
2016-05-02 20:11:44 +00:00
Matt Arsenault
7932e530a0 AMDGPU: Make i64 loads/stores promote to v2i32
Now that unaligned access expansion should not attempt
to produce i64 accesses, we can remove the hack in
PreprocessISelDAG where this is done.

This allows splitting i64 private accesses while
allowing the new add nodes indexing the vector components
to be folded with the base pointer arithmetic.

llvm-svn: 268293
2016-05-02 20:07:26 +00:00
Reid Kleckner
34524af63a Fix instance of -Winconsistent-missing-override in AMDGPU code
llvm-svn: 268289
2016-05-02 19:45:10 +00:00
Tom Stellard
72fc788f2b AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratch
Summary:
When we restore an SGPR value from scratch, we first load it into a
temporary VGPR and then use v_readlane_b32 to copy the value from the
VGPR back into an SGPR.

We weren't setting the kill flag on the VGPR in the v_readlane_b32
instruction, so the register scavenger wasn't able to re-use this
temp value later.

I wasn't able to create a lit test for this.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19744

llvm-svn: 268287
2016-05-02 19:37:56 +00:00
Tim Northover
d33c3d2654 ARM: fix handling of SUB immediates in peephole opt.
We were negating an immediate that was going to be used in a SUBri form
unnecessarily. Since ADD/SUB are very similar we *can* do that, but we have to
change the SUB to an ADD at the same time. This also applies to ADD, and allows
us to handle a slightly larger range of immediates for those two operations.
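
A minimal sketch of the rule described above, using 32-bit wrap-around arithmetic. The opcode strings and helper names are illustrative and do not reflect the actual peephole code:

```python
MASK32 = 0xFFFFFFFF


def apply(op, reg, imm):
    """Evaluate a 32-bit ADD or SUB with an immediate operand."""
    if op == "add":
        return (reg + imm) & MASK32
    return (reg - imm) & MASK32


def negate_immediate(op, imm):
    """Negate the immediate and flip the opcode so the result is unchanged."""
    flipped = "sub" if op == "add" else "add"
    return flipped, (-imm) & MASK32
```

For instance, a SUB of 0xFFFFFFFB (that is, -5) becomes an ADD of 5 and both forms give the same result. Negating the immediate while leaving the opcode as SUB would instead subtract 5, which is the kind of mismatch the commit message describes.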

rdar://25992245

llvm-svn: 268276
2016-05-02 18:30:08 +00:00
Justin Holewinski
9cc32d1b1b [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection
Summary:
We don't have sign-/zero-extending ldg/ldu instructions defined,
so we need to emulate them with explicit CVTs. We were originally
handling the i8 case, but not any other cases.

Fixes PR26185

Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D19615

llvm-svn: 268272
2016-05-02 18:12:02 +00:00
Tom Stellard
667d4234b1 AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cpp
Reviewers: arsenm

Subscribers: jvesely, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19736

llvm-svn: 268267
2016-05-02 18:05:17 +00:00
Tom Stellard
097fcbcf91 AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260
We can't use MI->getDebugLoc() when MI is an iterator that could be
MBB.end().

llvm-svn: 268265
2016-05-02 18:02:24 +00:00
Tom Stellard
179b86b996 AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses
Summary:
Add support for detecting hazards in SMEM soft clauses, so that we only
break the clauses when necessary, either by adding s_nop or re-ordering
other alu instructions.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18870

llvm-svn: 268260
2016-05-02 17:39:06 +00:00
Nicolai Haehnle
634941ba37 AMDGPU: llvm.SI.fs.constant is a source of divergence
Summary:
This intrinsic is used to get flat-shaded fragment shader inputs. Those are
uniform across a primitive, but a fragment shader wave may process pixels from
multiple primitives (as indicated by the prim_mask), and so that's where
divergence can arise.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19747

llvm-svn: 268259
2016-05-02 17:37:01 +00:00
Derek Schuff
f0cc027b2e [WebAssembly] Rename memory_size intrinsic to current_memory
This follows the recent renaming in the wasm spec.

llvm-svn: 268255
2016-05-02 17:25:22 +00:00
Tom Stellard
7f58d124e5 AMDGPU/SI: Use hazard recognizer to detect DPP hazards
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18603

llvm-svn: 268247
2016-05-02 16:23:09 +00:00
Simon Pilgrim
d1da382cb8 [X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN lowering
movmsk.ll tests are unchanged.

llvm-svn: 268237
2016-05-02 14:58:22 +00:00
Chad Rosier
4e4b009b82 Cleanup comments. NFC.
llvm-svn: 268236
2016-05-02 14:56:21 +00:00
Chad Rosier
5cf7beb283 Cleanup comments. NFC.
llvm-svn: 268235
2016-05-02 14:50:30 +00:00
Aaron Ballman
3b44c6c3a1 Silence unused variable warnings; NFC.
llvm-svn: 268234
2016-05-02 14:48:03 +00:00
David L Kreitzer
5e9178eeb3 Enable the X86 call frame optimization for the 64-bit targets that allow it.
Fixes PR27241.

Differential Revision: http://reviews.llvm.org/D19688

llvm-svn: 268227
2016-05-02 13:45:25 +00:00
Jonas Paulsson
af5993c7a1 [SystemZ] Fix in restoreCalleeSavedRegisters()
Only add operands for GRs to the LMG.

Reviewed by Ulrich Weigand.

llvm-svn: 268216
2016-05-02 09:37:44 +00:00
Jonas Paulsson
c2fd1dcc97 [SystemZ] Mark CC defs as dead whenever possible.
Marking implicit CC defs as dead everywhere, except when CC is actually
defined and used explicitly, is important since the post-ra scheduler
will otherwise insert edges between instructions unnecessarily.

Also temporarily disable LA(Y)-> AGSI optimization in
foldMemoryOperandImpl(), since this introduces a def of the CC reg,
which is illegal unless it is known to be dead.

Reviewed by Ulrich Weigand.

llvm-svn: 268215
2016-05-02 09:37:40 +00:00
Craig Topper
bd972a5333 [X86] Fix a bug in LOCK arithmetic operation pattern matching where the wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates.
This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first.

llvm-svn: 268212
2016-05-02 05:44:21 +00:00
Craig Topper
721f6428df [AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW.
llvm-svn: 268200
2016-05-01 17:38:32 +00:00
Igor Breger
a0208b4462 Change AVX512 broadcastsd/ss patterns' interaction with spilling. The new implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (which turns it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
Differential Revision: http://reviews.llvm.org/D19579

llvm-svn: 268190
2016-05-01 08:40:00 +00:00
Craig Topper
5387c11293 [AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported.
llvm-svn: 268189
2016-05-01 06:52:19 +00:00
Craig Topper
d3f6441aba [AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB and VPMADDUBSW/VPMADDWD.
llvm-svn: 268188
2016-05-01 06:24:57 +00:00
Craig Topper
0d12ba42f8 [AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also marked as requiring VLX.
llvm-svn: 268186
2016-05-01 05:57:06 +00:00
Craig Topper
ac42bfd1c7 [X86] Add an AddedComplexity to another pattern to put it near similar in the output file.
llvm-svn: 268184
2016-05-01 05:22:15 +00:00
Craig Topper
30013f145d [X86] Remove a seemingly unused pattern. The same pattern appears elsewhere with an AddedComplexity that made this unreachable.
llvm-svn: 268183
2016-05-01 05:22:13 +00:00
Craig Topper
6aa28ed840 [X86] Add AddedComplexity to keep some similar patterns near each other in the output file.
llvm-svn: 268181
2016-05-01 04:59:49 +00:00
Craig Topper
fcae9c2218 [X86] Remove some redundant selection patterns.
llvm-svn: 268180
2016-05-01 04:59:46 +00:00
Craig Topper
6b37764392 [AVX512] Replace vector_extract with extractelt in some patterns. They mean the same thing but vector_extract is deprecated. NFC
llvm-svn: 268179
2016-05-01 04:59:44 +00:00
Craig Topper
f3a74291c9 [AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.
llvm-svn: 268174
2016-05-01 01:03:56 +00:00
Craig Topper
01735085d2 [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.
llvm-svn: 268164
2016-04-30 17:59:49 +00:00
Rafael Espindola
634193bb09 Add missing override.
llvm-svn: 268163
2016-04-30 15:18:21 +00:00
Tom Stellard
6245c9db08 AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits
This was supposed to be part of r268143.

llvm-svn: 268154
2016-04-30 04:04:48 +00:00
Hal Finkel
724aac3fce [PowerPC/QPX] Fix the load/splat peephole with overlapping reads
If, in between the splat and the load (which does an implicit splat), there is
a read of the splat register, then that register must have another earlier
definition. In that case, we can't replace the load's destination register with
the splat's destination register.

Unfortunately, I don't have a small or non-fragile test case.

llvm-svn: 268152
2016-04-30 01:59:28 +00:00
Tom Stellard
51b37329c1 AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

llvm-svn: 268143
2016-04-30 00:23:06 +00:00
Matt Arsenault
42ea6294ae AMDGPU: Fix crash with unreachable terminators.
If a block has no successors because it ends in unreachable,
this was accessing an invalid iterator.

Also stop counting instructions that don't emit any
real instructions.

llvm-svn: 268119
2016-04-29 21:52:13 +00:00
Sriraman Tallam
899df7646a Differential Revision: http://reviews.llvm.org/D19733
llvm-svn: 268106
2016-04-29 21:19:16 +00:00
Matt Arsenault
87a15c33eb AMDGPU: Add kernarg.segment.ptr intrinsic
llvm-svn: 268105
2016-04-29 21:16:52 +00:00
Matt Arsenault
4a6478abc5 AMDGPU/SI: Move post regalloc run of SIShrinkInstructions
Move to addPreEmitPass. This is so it runs after post-RA
scheduling so we can merge s_nops emitted by the scheduler
and hazard recognizer.

llvm-svn: 268095
2016-04-29 20:23:42 +00:00
Artem Tamazov
7cdb9e0ea9 Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
Previously reverted by r267752.

r267733 review:
Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 268066
2016-04-29 17:04:50 +00:00
Guozhi Wei
194cccd63b [PPC] Enable shuffling of VSX vectors
This patch fixes PR27078 by enabling shuffling of vectors if VSX is available.

llvm-svn: 268064
2016-04-29 17:00:54 +00:00
Daniel Sanders
035784dd96 [mips][ias] Move createCpRestoreMemOp to MipsTargetStreamer. NFC.
Summary:
This removes the temporary call to isIntegratedAssemblerRequired() which was
added recently. Its effect is now achieved directly in the MipsTargetStreamer
hierarchy.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19715

llvm-svn: 268058
2016-04-29 16:16:49 +00:00
Krzysztof Parzyszek
dc88db3465 Fix NDEBUG build: variables used only in debug code causing compile error
llvm-svn: 268057
2016-04-29 16:14:00 +00:00
Simon Dardis
ccaff6aa52 [mips][FastISel] A store is not a load.
Correct trivial error. One of the failing tests from PR/27458.

Reviewers: dsanders, vkalintiris, mcrosier

Differential Review: http://reviews.llvm.org/D19726

llvm-svn: 268053
2016-04-29 16:07:47 +00:00
Simon Dardis
cbfb26b5e3 [PATCH] [mips] Fix forbidden slot hazard handling
MipsHazardSchedule has to determine what the next physical machine instruction
is to decide whether to insert a nop. In the case where a branch with a forbidden
slot appears at the end of a basic block, the first *real* instruction of the next
physical basic block was determined using getFirstNonDebugInstr().

Unfortunately this only considers DBG_VALUEs and not other transient opcodes
such as EHLABEL. As EHLABEL passes the SafeInForbiddenSlot predicate and the
instruction after the EHLABEL can be a CTI, we observed test failures in the
LNT testsuite.

Reviewers: dsanders

Differential Review: http://reviews.llvm.org/D19051

llvm-svn: 268052
2016-04-29 16:04:18 +00:00
Krzysztof Parzyszek
d4659dc8ea [Hexagon] Optimize addressing modes for load/store
Patch by Jyotsna Verma.

llvm-svn: 268051
2016-04-29 15:49:13 +00:00
Filipe Cabecinhas
c8ae081a57 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Tom Stellard
33134ca52e AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.

Reviewers: arsenm

Subscribers: arsenm, scchan

Differential Revision: http://reviews.llvm.org/D19233

llvm-svn: 268043
2016-04-29 14:34:26 +00:00
Daniel Sanders
931329918a [mips][ias] Split expandMemInst between MipsAsmParser and MipsTargetStreamer. Almost NFC.
Summary:
The portion in MipsAsmParser is responsible for figuring out which expansion to
use, while the portion in MipsTargetStreamer is responsible for emitting it.

This allows us to remove the call to isIntegratedAssemblerRequired() which is
currently ensuring the effect of .cprestore only occurs when writing objects.

The small functional change is that the memory offsets are now correctly
printed as signed values.

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19714

llvm-svn: 268042
2016-04-29 13:43:45 +00:00
Daniel Sanders
6c5253a9db [mips][ias] Moved most instruction emission helpers to MipsTargetStreamer. NFC.
Summary:
* Moved all the emit*() helpers to MipsTargetStreamer.
* Moved createNop() to MipsTargetStreamer as emitNop() and emitEmptyDelaySlot().
  This instruction has been split to distinguish between the 'nop' instruction
  and the nop used in delay slots which is sometimes a different nop to the
  'nop' instruction (e.g. for short delay slots on microMIPS).
* Moved createAddu() to MipsTargetStreamer as emitAddu().
* Moved createAppropriateDSLL() to MipsTargetStreamer as emitDSLL().

Reviewers: sdardis

Subscribers: dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D19712

llvm-svn: 268041
2016-04-29 13:33:12 +00:00
Daniel Sanders
e40f7b3df6 [mips][ias] Make section sizes a multiple of the alignment.
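
As a rough illustration of what "a multiple of the alignment" means here, the usual round-up computation can be sketched as below. The helper name `pad_to_alignment` is hypothetical and is not taken from the patch; it also assumes the alignment is a power of two.

```python
def pad_to_alignment(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment.

    Hypothetical helper for illustration only; assumes alignment is a
    positive power of two, as section alignments normally are.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and clearing the low bits rounds up.
    return (size + alignment - 1) & ~(alignment - 1)
```

A section that is already a multiple of its alignment is left unchanged; any other size grows to the next multiple.
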
Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D19008

llvm-svn: 268036
2016-04-29 12:44:07 +00:00
Nikolay Haustov
048a920e0e AMDGPU/SI: Assembler: Unify parsing/printing of operands.
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).

Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
having a class hierarchy for operand types instead of a table. This
can be done in a separate change.

Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.

Reviewers: tstellarAMD, SamWot, artem.tamazov

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19584

llvm-svn: 268015
2016-04-29 09:02:30 +00:00
Zlatko Buljan
8adb396ea9 [mips][microMIPS] Fix offsets for LLE, LWE, SBE, SCE and SHE instructions
Differential Revision: http://reviews.llvm.org/D18645

llvm-svn: 268012
2016-04-29 08:36:54 +00:00
Matt Arsenault
1b74f4bf74 AMDGPU: Stop reporting an addressing mode for unknown addrspace
This was being treated the same as private, which has an immediate
offset. For unknown, it probably means it's for a computation not
actually being used for accessing memory, so it should not have a
nontrivial addressing mode.

llvm-svn: 268002
2016-04-29 06:25:10 +00:00
Craig Topper
0e99bdde34 [X86] Remove unnecessary header file containing a small class. It was only included in one place. Just define the class directly in the cpp file. NFC
llvm-svn: 267985
2016-04-29 04:22:28 +00:00
Craig Topper
d56dbc1e89 [X86] Include X86MCTargetDesc.h directly in X86Disassembler.cpp instead of duplicating parts of it. NFC
llvm-svn: 267984
2016-04-29 04:22:26 +00:00
Craig Topper
8974fbaa0d [X86] Use nested switches to vary the operand to helper functions that were previously called in multiple cases. This seems to help the inliner reduce code. NFC
llvm-svn: 267964
2016-04-29 00:51:30 +00:00
Matthias Braun
f2b211c8f0 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

This re-applies r260806 with LiveVariables manually added to PowerPC to
hopefully not break the stage 2 bots this time.

llvm-svn: 267954
2016-04-28 23:42:51 +00:00
Marcin Koscielnicki
832d560a7e [PowerPC] Fix the EH_SjLj_Setup pseudo.
This instruction is just a control flow marker - it should not
actually exist in the object file.  Unfortunately, nothing catches
it before it gets to AsmPrinter.  If integrated assembler is used,
it's considered to be a normal 4-byte instruction, and emitted as
an all-0 word, crashing the program.  With external assembler,
a comment is emitted.

Fixed by setting Size to 0 and handling it in MCCodeEmitter - this
means the comment will still be emitted if integrated assembler
is not used.

This broke an ASan test, which has been disabled for a long time
as a result (see the discussion on D19657).  We can reenable it
once this lands.

llvm-svn: 267943
2016-04-28 21:24:37 +00:00
Krzysztof Parzyszek
1f77b82ca5 [RDF] Recognize tail calls in graph creation
llvm-svn: 267939
2016-04-28 20:40:08 +00:00
Krzysztof Parzyszek
89f6a784c5 [RDF] Improve handling of inline-asm
- Keep implicit defs from inline-asm instructions.
- Treat register references from inline-asm as fixed.

llvm-svn: 267936
2016-04-28 20:33:33 +00:00
Krzysztof Parzyszek
3050ff6665 [RDF] Add option to keep dead phi nodes in DFG
Dead phi nodes are needed for code motion (such as copy propagation),
where a new use would be placed in a location that would be dominated
by a dead phi. Such a transformation is not legal for copy propagation,
and the existence of the phi would prevent it, but if the phi is not
there, it may appear to be valid.

llvm-svn: 267932
2016-04-28 20:17:06 +00:00
Kit Barton
d03785d120 This reverts commit r265505.
Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance".
This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe.

llvm-svn: 267927
2016-04-28 20:00:42 +00:00
Krzysztof Parzyszek
4d8da47e67 [Hexagon] Add instruction aliases for vector unsigned compare-equal
Unsigned compare-equal instructions are mapped to signed compare-equal.

llvm-svn: 267925
2016-04-28 19:49:18 +00:00
Matt Arsenault
28f0a3fe58 AMDGPU: Emit error if too much LDS is used
llvm-svn: 267922
2016-04-28 19:37:35 +00:00
Matt Arsenault
f94836045a AMDGPU: Fix mishandling array allocations when promoting alloca
The canonical form for allocas is a single allocation of the array type.
In case we see a non-canonical array alloca, make sure we aren't
replacing this with an array N times smaller.

llvm-svn: 267916
2016-04-28 18:38:48 +00:00
Krzysztof Parzyszek
ff5fb695cc [Hexagon] Define certain aliases for vector instructions
Specifically:
  Vd = #0   -> Vd = vxor(Vd, Vd)
  Vdd = #0  -> Vdd.w = vsub(Vdd.w, Vdd.w)
  Vdd = Vss -> Vdd = vcombine(Vss.H, Vss.L)

llvm-svn: 267901
2016-04-28 16:43:16 +00:00
Simon Dardis
156870a1a4 [mips][atomics] Fix partword atomic binary operation implementation
Currently Mips::emitAtomicBinaryPartword() does not properly respect the
width of pointers. For MIPS64 this causes the memory address that the ll/sc
sequence uses to be truncated. At runtime this causes a segmentation fault.

This can be fixed by applying similar changes as r266204, so that a full 64bit
pointer is loaded.

Reviewers: dsanders

Differential Review: http://reviews.llvm.org/D19651

llvm-svn: 267900
2016-04-28 16:26:43 +00:00
Krzysztof Parzyszek
14103a2bbf [Hexagon] Handle double-vector registers as new-value producers
Patch by Colin LeMahieu.

llvm-svn: 267897
2016-04-28 15:54:48 +00:00
Krzysztof Parzyszek
49d1f997e6 [RDF] Handle undefined registers in RDF copy propagation
When updating the graph, make sure that new uses without reaching defs
are handled correctly.

llvm-svn: 267891
2016-04-28 15:09:19 +00:00
Craig Topper
82aee41426 [X86] Remove unused operand from a function and all its callers. NFC
llvm-svn: 267854
2016-04-28 05:58:46 +00:00
Craig Topper
945a4cd524 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Craig Topper
42195e68d9 [AArch64] Expand CTTZ for all vector types.
llvm-svn: 267837
2016-04-28 01:58:21 +00:00
Bryan Chan
2567ab558c [SystemZ] Support Swift Calling Convention
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:

RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0

Reviewers: kbarton, manmanren, rjmccall, uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D19414

llvm-svn: 267823
2016-04-28 00:17:23 +00:00
Mitch Bodart
fde6615c97 [X86] Enable the post-RA-scheduler for clang's default 32-bit cpu.
For compilations with no explicit cpu specified, this exhibits
nice gains on Silvermont, with neutral performance on big cores.

Differential Revision: http://reviews.llvm.org/D19138

llvm-svn: 267809
2016-04-27 22:52:35 +00:00
Quentin Colombet
d6bb035737 [X86][FastISel] Make sure we use the right register class when we select stores.
llvm-svn: 267806
2016-04-27 22:33:42 +00:00
Colin LeMahieu
7045424b87 [Hexagon] Merging nops in to previous packet rather than always creating a new one.
llvm-svn: 267798
2016-04-27 21:37:44 +00:00
Quentin Colombet
96d6f82ab0 [X86] Fix the lowering of TLS calls.
The callseq_end node must be glued with the TLS calls, otherwise,
the generic code will miss the uses of the returned value and will
mark it dead.
Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI,
the pseudo uses the symbol address at this point not RDI and the
lowering will do the right thing.

llvm-svn: 267797
2016-04-27 21:37:37 +00:00
Matt Arsenault
982e737c85 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

llvm-svn: 267791
2016-04-27 21:05:08 +00:00
Ahmed Bougacha
5a7cdb03e6 [ARM] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266679.

llvm-svn: 267781
2016-04-27 20:33:07 +00:00
Ahmed Bougacha
b9e69e57c8 [AArch64] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266339.

llvm-svn: 267780
2016-04-27 20:33:05 +00:00
Ahmed Bougacha
e8bff14c32 [AArch64] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would make LoadCmpBB a successor of DoneBB,
whereas it should be a successor of the original MBB.

Follow-up to r266339.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267779
2016-04-27 20:33:02 +00:00
Ahmed Bougacha
492c1a346a [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would make LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Kevin B. Smith
1783031f2d [X86]: Quit promoting 16 bit loads to 32 bit.
Differential Revision: http://reviews.llvm.org/D19592

llvm-svn: 267773
2016-04-27 19:58:03 +00:00
Andrew Kaylor
98d037199a Add optimization bisect opt-in calls for PowerPC passes
Differential Revision: http://reviews.llvm.org/D19554

llvm-svn: 267769
2016-04-27 19:39:32 +00:00
Justin Lebar
c5f2ca9cfc [NVPTX] Run NVVMReflect at the beginning of IR passes.
Summary:
Currently the NVVMReflect pass is run at the beginning of our backend
passes.  But really, it should be run as early as possible, as it's
simply resolving an "if" statement in code.  So copy it into
TargetMachine::addEarlyAsPossiblePasses.

We still run it at the beginning of the backend passes, since it's
needed for correctness when lowering to nvptx.

(Specifically, NVVMReflect changes each call to the __nvvm_reflect
function or llvm.nvvm.reflect intrinsic into an integer constant, based
on the pass's configuration.  Clearly we miss many optimization
opportunities if we perform this transformation at the beginning of
codegen.)

Reviewers: rnk

Subscribers: tra, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D18616

llvm-svn: 267765
2016-04-27 19:13:37 +00:00
Chad Rosier
c6f107fe0e Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD."
This reverts commit r267733 due to a -Werror,-Wunused-function error.

llvm-svn: 267752
2016-04-27 18:29:11 +00:00
Gerolf Hoflehner
19cd041163 [DAGCombiner] Follow coding convention for function name (NFC)
llvm-svn: 267745
2016-04-27 17:27:16 +00:00
Marcin Koscielnicki
1e17bfd3e5 [Mips] Add support for llvm.thread.pointer intrinsic.
This will be used to implement __builtin_thread_pointer in clang.

Differential Revision: http://reviews.llvm.org/D19569

llvm-svn: 267743
2016-04-27 17:21:49 +00:00
Reid Kleckner
6a058f1e29 Silence a -Wdangling-else
llvm-svn: 267737
2016-04-27 16:46:33 +00:00
Matthew Simpson
8b365897a4 Add parentheses to silence buildbot warning
llvm-svn: 267734
2016-04-27 16:25:04 +00:00
Artem Tamazov
ed6f89bcdc [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.
Added support of TTMP quads.
Reworked M0 exclusion machinery for SMRD and similar instructions
to enable usage of TTMP registers in those instructions as destinations.
Tests added.

Differential Revision: http://reviews.llvm.org/D19342

llvm-svn: 267733
2016-04-27 16:20:23 +00:00
Nicolai Haehnle
494b4aee1e AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic
Summary:
So it appears that to guarantee some of the ordering requirements of a GLSL
memoryBarrier() executed in the shader, we need to emit an s_waitcnt.

(We can't use an s_barrier, because memoryBarrier() may appear anywhere in
the shader, in particular it may appear in non-uniform control flow.)

Reviewers: arsenm, mareko, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19203

llvm-svn: 267729
2016-04-27 15:46:01 +00:00
Matthew Simpson
995eceaf0c [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

llvm-svn: 267725
2016-04-27 15:20:21 +00:00
Artem Tamazov
0b6855273a [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers.
The ability to specify the numeric code of a hardware register is kept.
Disassemble to symbolic name, if name is known.
Tests updated/added.

Differential Revision: http://reviews.llvm.org/D19335

llvm-svn: 267724
2016-04-27 15:17:03 +00:00
Nico Weber
b519b357d0 Revert r267649, it caused PR27539.
llvm-svn: 267723
2016-04-27 15:16:54 +00:00
Zlatko Buljan
92f1550331 [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions
Differential Revision: http://reviews.llvm.org/D16676

llvm-svn: 267694
2016-04-27 11:31:44 +00:00
Zlatko Buljan
a2323fb2af [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions
Differential Revision: http://reviews.llvm.org/D17989

llvm-svn: 267693
2016-04-27 11:02:23 +00:00
Chuang-Yu Cheng
d389efdf8e [ppc64] fix bug in prologue that mfocrf's cr operand should be explicit state instead of implicit
This fixes PR27414

Reviewers: kbarton mgrang tjablin

http://reviews.llvm.org/D19255

llvm-svn: 267660
2016-04-27 02:59:28 +00:00
Ahmed Bougacha
1b19a8307b [X86] Set AddPristinesAndCSRs to FixupBW LivePhysRegs. NFC.
We run after PEI, so we need to AddPristinesAndCSRs.
In practice, that makes no difference here, because we only ask about
liveness of super-registers of defined GR8/GR16 registers, so they
can't be pristine. Still, it's the correct thing to do.

Thanks to Quentin for noticing!

Follow-up to r267495.

llvm-svn: 267658
2016-04-27 01:51:38 +00:00
Ahmed Bougacha
991d42e979 [X86] Don't assume that MMX extractelts are from index 0.
It's probably the case for all 3 MMX users out there, but with
hand-crafted IR, you can trigger selection failures. Fix that.

llvm-svn: 267652
2016-04-27 01:35:29 +00:00
Ahmed Bougacha
208a5db302 [X86] Re-enable MMX i32 extractelt combine.
This effectively adds back the extractelt combine removed by r262358:
the direct case can still occur (because x86_mmx is special, see
r262446), but it's the indirect case that's now superseded by the
generic combine.

llvm-svn: 267651
2016-04-27 01:35:25 +00:00
Cong Hou
3dea148bfe Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched.
Differential revision: http://reviews.llvm.org/D14840

llvm-svn: 267649
2016-04-27 01:29:18 +00:00
Andrew Kaylor
322b6b2a32 Add optimization bisect opt-in calls for SystemZ passes
Differential Revision: http://reviews.llvm.org/D19562

llvm-svn: 267636
2016-04-26 23:49:41 +00:00
Andrew Kaylor
9680f122b8 Add optimization bisect opt-in calls for NVPTX passes
Differential Revision: http://reviews.llvm.org/D19518

llvm-svn: 267635
2016-04-26 23:44:31 +00:00
Quentin Colombet
c01de3fc6a [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing
the prologue.

Do not use basic blocks that have EFLAGS live-in as prologue if we need
to realign the stack. Realigning the stack uses AND instruction and this
clobbers EFLAGS.

Another alternative would have been to save and restore EFLAGS around
the stack realignment code, but this is likely inefficient.

Fixes PR27531.

llvm-svn: 267634
2016-04-26 23:44:14 +00:00
Quentin Colombet
ce78de67e5 [X86] Teach the expansion of copy instructions how to do proper liveness.
When the simple analysis provided by MachineBasicBlock::computeRegisterLiveness
fails, fall back on the LivePhysReg utility.

llvm-svn: 267623
2016-04-26 23:14:32 +00:00
Jingyue Wu
db7e23c040 [NVPTX] Fix some usages of CodeGenOpt::None.
NVPTXLowerKernelArgs is required for correctness, so it should not be guarded
by CodeGenOpt::None.

NVPTXPeephole is optimization only, so it should be skipped when
CodeGenOpt::None.

llvm-svn: 267619
2016-04-26 22:59:25 +00:00
Andrew Kaylor
b34d1cfddb Optimization bisect support in X86-specific passes
Differential Revision: http://reviews.llvm.org/D19439

llvm-svn: 267608
2016-04-26 21:44:24 +00:00
Ahmed Bougacha
db9e64109d [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Andrew Kaylor
927565d9fa Add optimization bisect opt-in calls for Hexagon passes
Differential Revision: http://reviews.llvm.org/D19509

llvm-svn: 267593
2016-04-26 19:46:28 +00:00
Manman Ren
e3c0ba8445 Swift Calling Convention: use %RAX for sret.
We don't need to copy the sret argument into %rax upon return.
rdar://25671494

llvm-svn: 267579
2016-04-26 18:08:06 +00:00
Konstantin Zhuravlyov
c01e46c011 [AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes
Differential Revision: http://reviews.llvm.org/D19537

llvm-svn: 267573
2016-04-26 17:24:40 +00:00
Konstantin Zhuravlyov
a8b24aaab2 [AMDGPU] Reserve VGPRs for trap handler usage if instructed
Differential Revision: http://reviews.llvm.org/D19235

llvm-svn: 267563
2016-04-26 15:43:14 +00:00
Sam Kolton
f3d76df79f [AMDGPU] Assembler: basic support for SDWA instructions
Support for SDWA instructions for VOP1 and VOP2 encoding.
Not done yet:
  - converters to support optional operands and modifiers
  - VOPC
  - sext() modifier
  - intrinsics
  - VOP2b (see vop_dpp.s)
  - V_MAC_F32 (see vop_dpp.s)

Differential Revision: http://reviews.llvm.org/D19360

llvm-svn: 267553
2016-04-26 13:33:56 +00:00
Andrey Turetskiy
53086bdd4e [X86] PR27502: Fix the LEA optimization pass.
Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass.

Differential Revision: http://reviews.llvm.org/D19409

llvm-svn: 267551
2016-04-26 12:18:12 +00:00
Marcin Koscielnicki
bbb6abb048 [Sparc] Fix build error introduced by rL267545.
llvm-svn: 267549
2016-04-26 10:43:47 +00:00
Marcin Koscielnicki
704c818d77 [PowerPC] Add support for llvm.thread.pointer
Differential Revision: http://reviews.llvm.org/D19304

llvm-svn: 267546
2016-04-26 10:37:22 +00:00
Marcin Koscielnicki
599068857b [SPARC] [SSP] Add support for LOAD_STACK_GUARD.
This fixes PR22248 on sparc.

Differential Revision: http://reviews.llvm.org/D19386

llvm-svn: 267545
2016-04-26 10:37:14 +00:00
Marcin Koscielnicki
470e89bab4 [SPARC] Add support for llvm.thread.pointer.
Differential Revision: http://reviews.llvm.org/D19387

llvm-svn: 267544
2016-04-26 10:37:01 +00:00
Chuang-Yu Cheng
da7d0bf651 [ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library tail-call issue
print-stack-trace.cc test failure of compiler-rt has been fixed by
r266869 (http://reviews.llvm.org/D19148), so reenable sibling call
optimization on ppc64

Reviewers: nemanjai kbarton
llvm-svn: 267527
2016-04-26 07:38:24 +00:00
Craig Topper
b7db006017 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Craig Topper
4513730c09 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper
783834f3cf [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Dan Gohman
2ad6a4c0e6 [WebAssembly] Account for implicit operands when computing operand indices.
llvm-svn: 267511
2016-04-26 01:40:56 +00:00
Andrew Kaylor
cbfcb4c888 Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots.
llvm-svn: 267506
2016-04-26 00:56:36 +00:00
Junmo Park
69d7e17058 Remove MinLatency in SchedMachineModel. NFC.
Summary:
We don't use MinLatency any more since r184032.

Reviewers: atrick, hfinkel, mcrosier

Differential Revision: http://reviews.llvm.org/D19474

llvm-svn: 267502
2016-04-26 00:37:46 +00:00
Ahmed Bougacha
b6c12fe106 [X86] Use LivePhysRegs in X86FixupBWInsts.
Kill-flags, which computeRegisterLiveness uses, are not reliable.
LivePhysRegs is.

Differential Revision: http://reviews.llvm.org/D19472

llvm-svn: 267495
2016-04-26 00:00:48 +00:00
James Y Knight
439f0092c7 [Sparc] Fix double-float fabs and fneg on little endian CPUs.
The SparcV8 fneg and fabs instructions interestingly come only in a
single-float variant. Since the sign bit is always the topmost bit no
matter what size float it is, you simply operate on the high
subregister, as if it were a single float.

However, the layout of double-floats in the float registers is reversed
on little-endian CPUs, so that the high bits are in the second
subregister, rather than the first.

Thus, this expansion must check the endianness to use the correct
subregister.
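
A minimal sketch of the idea, not the backend code itself: it models the two 32-bit subregisters of a double as the two words of its in-memory layout, which is an assumption made for illustration. All function names are hypothetical.

```python
import struct

SIGN_BIT = 0x80000000  # top bit of a 32-bit word


def sign_subreg_index(little_endian: bool) -> int:
    """Index of the 32-bit half holding the sign bit (0 = first, 1 = second)."""
    return 1 if little_endian else 0


def _split(value: float, little_endian: bool):
    # Model the double as two 32-bit words in the given byte order.
    order = "<" if little_endian else ">"
    return order, list(struct.unpack(order + "2I", struct.pack(order + "d", value)))


def fneg_double(value: float, little_endian: bool) -> float:
    """Negate by flipping the sign bit in the half that holds it."""
    order, words = _split(value, little_endian)
    words[sign_subreg_index(little_endian)] ^= SIGN_BIT
    return struct.unpack(order + "d", struct.pack(order + "2I", *words))[0]


def fabs_double(value: float, little_endian: bool) -> float:
    """Absolute value by clearing the sign bit in the half that holds it."""
    order, words = _split(value, little_endian)
    words[sign_subreg_index(little_endian)] &= ~SIGN_BIT & 0xFFFFFFFF
    return struct.unpack(order + "d", struct.pack(order + "2I", *words))[0]
```

Operating on the other half would change mantissa bits instead of the sign, which is the failure mode the endianness check avoids.
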

llvm-svn: 267489
2016-04-25 22:54:09 +00:00
Andrew Kaylor
b9ffd9d2d0 Fix build warning
llvm-svn: 267487
2016-04-25 22:27:30 +00:00
Andrew Kaylor
ce278dfa7a Add optimization bisect opt-in calls for AMDGPU passes
Differential Revision: http://reviews.llvm.org/D19450

llvm-svn: 267485
2016-04-25 22:23:44 +00:00
Andrew Kaylor
e8dd108465 Add optimization bisect opt-in calls for ARM passes
Differential Revision: http://reviews.llvm.org/D19449

llvm-svn: 267480
2016-04-25 22:01:04 +00:00
Andrew Kaylor
3825400f62 Add optimization bisect opt-in calls for AArch64 passes
Differential Revision: http://reviews.llvm.org/D19394

llvm-svn: 267479
2016-04-25 21:58:52 +00:00
Tim Northover
66f8d5ae59 ARM: put extern __thread stubs in a special section.
The linker needs to know that the symbols are thread-local to do its job
properly.

llvm-svn: 267473
2016-04-25 21:12:04 +00:00
Krzysztof Parzyszek
08a24a9751 [Hexagon] Few fixes for exception handling
llvm-svn: 267469
2016-04-25 21:05:19 +00:00
Quentin Colombet
7f8c56085e Re-apply r267206 with a fix for the encoding problem: when the immediate of
log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit
variant cannot encode it. Therefore, set the subreg part accordingly.

[AArch64] Fix optimizeCondBranch logic.

The opcode for the optimized branch does not depend on the size
of the active bits in the AND masks, but on the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant,
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!

This fixes the last make check verifier issues for AArch64: PR27479.

llvm-svn: 267465
2016-04-25 20:54:08 +00:00
Matt Arsenault
edc94ff860 AMDGPU/SI: Optimize adjacent s_nop instructions
Use the operand for how long to wait. This is somewhat
distasteful, since it would be better to just emit s_nop
with the right argument in the first place. This would require
changing TII::insertNoop to emit N operands, which would be easy.
Slightly more problematic is that the post-RA scheduler and hazard recognizer
represent nops as a single null node, and would require inventing
another way of representing N nops.
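
The merging step can be sketched roughly as follows. This is an illustration under stated assumptions, not the pass itself: it assumes an s_nop with immediate N waits N + 1 states and that a single s_nop can cover at most 8 wait states. The function name and the list-of-immediates representation are hypothetical.

```python
MAX_WAIT_STATES = 8  # assumed upper bound for a single s_nop


def merge_adjacent_nops(immediates):
    """Collapse a run of adjacent s_nop immediates into as few as possible.

    Each immediate N is assumed to mean N + 1 wait states. Two neighbours
    are merged only when their combined wait stays within the limit.
    """
    merged = []
    for imm in immediates:
        states = imm + 1
        if merged and (merged[-1] + 1) + states <= MAX_WAIT_STATES:
            # Combined wait is (prev + 1) + states, so the new
            # immediate is that total minus one.
            merged[-1] += states
        else:
            merged.append(imm)
    return merged
```

For example, three back-to-back `s_nop 0` instructions collapse into a single `s_nop 2`, while a pair whose total would exceed the assumed limit is left as two instructions.
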

llvm-svn: 267456
2016-04-25 19:53:22 +00:00
Matt Arsenault
b60850cb10 AMDGPU: Implement addrspacecast
llvm-svn: 267452
2016-04-25 19:27:24 +00:00
Matt Arsenault
524b24258c AMDGPU: Add queue ptr intrinsic
llvm-svn: 267451
2016-04-25 19:27:18 +00:00