1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

28508 Commits

Author SHA1 Message Date
Zoran Jovanovic
5a4694735d [mips][mips64r6] Add CLASS.fmt instructions
Differential Revision: http://reviews.llvm.org/D3712

llvm-svn: 208894
2014-05-15 15:16:36 +00:00
Zoran Jovanovic
f64b55bdcb [mips][mips64r6] Add RINT.fmt instructions
Differential Revision: http://reviews.llvm.org/D3711

llvm-svn: 208892
2014-05-15 15:04:37 +00:00
Zoran Jovanovic
6de41d285d [mips][mips64r6] Add SELEQZ/SELNEZ.fmt instructions
Differential Revision: http://reviews.llvm.org/D3710

llvm-svn: 208891
2014-05-15 14:58:42 +00:00
Zoran Jovanovic
bc63e943e1 [mips][mips64r6] Add MAX/MIN/MAXA/MINA.fmt instructions
Differential Revision: http://reviews.llvm.org/D3709

llvm-svn: 208890
2014-05-15 14:54:06 +00:00
Tom Stellard
dbf9b9b7af R600/SI: Stop using VSrc_* as the default register class for types.
We now use SReg_* for integer types and VReg_* for floating-point types.
This should help simplify the SIFixSGPRCopies pass and no longer causes
ISel to insert a COPY after termiator instuctions that output a value.

This change is covered by exisitng tests.

llvm-svn: 208888
2014-05-15 14:41:57 +00:00
Tom Stellard
d01bb8adfa R600/SI: Fix a bug with handling of INSERT_SUBREG in SIFixSGPRCopies
This prevents a future commit from regressing the load-i1.ll test.

llvm-svn: 208887
2014-05-15 14:41:55 +00:00
Tom Stellard
77051e93a5 R600/SI: Only use SALU instructions for 64-bit add in a block of CF depth 0
llvm-svn: 208886
2014-05-15 14:41:54 +00:00
Tom Stellard
efb8470c62 R600/SI: Use VALU instructions for i1 ops
llvm-svn: 208885
2014-05-15 14:41:50 +00:00
Tim Northover
ac5dac4c75 TableGen: use correct MIOperand when printing aliases
Previously, TableGen assumed that every aliased operand consumed precisely 1
MachineInstr slot (this was reasonable because until a couple of days ago,
nothing more complicated was eligible for printing).

This allows a couple more ARM64 aliases to print so we can remove the special
code.

On the X86 side, I've gone for explicit AT&T size specifiers as the default, so
turned off a few of the aliases that would have just started printing.

llvm-svn: 208880
2014-05-15 13:36:01 +00:00
Daniel Sanders
f89f1dcf37 [mips][mips64r6] Add bitswap, and dbitswap
Summary: Depends on D3728

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3729

llvm-svn: 208877
2014-05-15 12:18:23 +00:00
Jay Foad
2827803889 Instead of littering asserts throughout the code after every call to
computeKnownBits, consolidate them into one assert at the end of
computeKnownBits itself.

llvm-svn: 208876
2014-05-15 12:12:55 +00:00
Tim Northover
ed117bb644 ARM64: print correct aliases for NEON mov & mvn instructions
In all cases, if a "mov" alias exists, it is the canonical form of the
instruction. Now that TableGen can support aliases containing syntax variants,
we can enable them and improve the quality of the asm output.

llvm-svn: 208874
2014-05-15 12:11:02 +00:00
Daniel Sanders
883f9833f0 [mips][mips64r6] Add align and dalign
Summary: Depends on D3689

Reviewers: vmedic, zoran.jovanovic, jkolek

Reviewed By: jkolek

Differential Revision: http://reviews.llvm.org/D3728

llvm-svn: 208872
2014-05-15 12:06:36 +00:00
Tim Northover
4ba95d4483 TableGen/ARM64: print aliases even if they have syntax variants.
To get at least one use of the change (and some actual tests) in with its
commit, I've enabled the AArch64 & ARM64 NEON mov aliases.

llvm-svn: 208867
2014-05-15 11:16:32 +00:00
Tim Northover
83bd592b77 ARM64: add correct vector registers during asm parsing
Previously, we ignored the difference between V64 and V128 when parsing
assembly: they both got mapped to registers in the FPR128 class. This is
basically harmless at the moment because they both print and encode the same
way. However, it will affect the printing of aliases.

llvm-svn: 208866
2014-05-15 11:16:19 +00:00
Bradley Smith
8301057544 [ARM64] Improve load/store diagnostics and forbid 32-bit register addresses
llvm-svn: 208864
2014-05-15 11:08:30 +00:00
Bradley Smith
c77dfa4453 [ARM64] Parse fixed vector lanes properly so that diagnostics can be emitted
llvm-svn: 208863
2014-05-15 11:07:57 +00:00
Bradley Smith
ffae33a2db [ARM64] Add/Fixup diagnostics for floating point immediates
llvm-svn: 208862
2014-05-15 11:07:28 +00:00
Bradley Smith
5033c221c9 [ARM64] Add condition code operand type such that proper diagnostics can be emitted
llvm-svn: 208861
2014-05-15 11:06:51 +00:00
Bradley Smith
b8ba322e07 [ARM64] Add more simple diagnostics for immediate/shift ranges
llvm-svn: 208860
2014-05-15 11:06:16 +00:00
Daniel Sanders
17f37b6f3f [mips][mips64r6] Add addiupc, aluipc, and auipc
Summary:
No support for symbols in place of the immediate yet since it requires new
relocations.

Depends on D3671

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3689

llvm-svn: 208858
2014-05-15 10:45:58 +00:00
Daniel Sanders
f29be03643 [mips][mips64r6] Add aui, daui, dahi, and dati
Summary: Depends on D3671

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3759

llvm-svn: 208857
2014-05-15 10:27:19 +00:00
Daniel Sanders
ca9cbc1b8b [mips][mips64r6] Test that branch likelies are not accepted on MIPS64r6.
Summary:
They aren't implemented for any ISA at the moment.

Depends on D3670

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3671

llvm-svn: 208855
2014-05-15 09:47:43 +00:00
Jonathan Roelofs
9a09decd36 Fix some dyslexia in an assert message
llvm-svn: 208842
2014-05-15 02:24:50 +00:00
Alp Toker
18115693f7 Fix typos
llvm-svn: 208839
2014-05-15 01:52:21 +00:00
Jiangning Liu
ecd097d587 [ARM64] Support aggressive fastcc/tailcallopt breaking ABI by popping out argument stack from callee.
llvm-svn: 208837
2014-05-15 01:33:17 +00:00
Eric Christopher
f26f61b12b Move the TargetMachine MC options to MCTargetOptions. No functional
change.

llvm-svn: 208832
2014-05-15 01:08:00 +00:00
Jay Foad
e0eac700cb Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
Christian Pirker
7dd3a40e09 ARM-BE: test files for vector argument passing
Reviewed at http://reviews.llvm.org/D3766

llvm-svn: 208793
2014-05-14 16:59:44 +00:00
Christian Pirker
f835f2f7be [ARM64-BE] Fix byte order of CIE and FDE frames for exception handling
Reviewed at http://reviews.llvm.org/D3741

llvm-svn: 208792
2014-05-14 16:51:58 +00:00
Benjamin Kramer
56a86f3d17 X86: If we have an instruction that sets a flag and a zero test on the input of that instruction try to eliminate the test.
For example
	tzcntl	%edi, %ebx
	testl %edi, %edi
	je	.label

can be rewritten into
	tzcntl	%edi, %ebx
	jb 	.label

A minor complication is that tzcnt sets CF instead of ZF when the input
is zero, we have to rewrite users of the flags from ZF to CF. Currently
we recognize patterns using lzcnt, tzcnt and popcnt.

Differential Revision: http://reviews.llvm.org/D3454

llvm-svn: 208788
2014-05-14 16:14:45 +00:00
Daniel Sanders
2b18784a55 [mips][mips64r6] Add sel.s and sel.d
Summary:
Also use named constants for common opcode fields.

Depends on D3669

Reviewers: vmedic, zoran.jovanovic, jkolek

Reviewed By: jkolek

Differential Revision: http://reviews.llvm.org/D3670

llvm-svn: 208784
2014-05-14 15:29:44 +00:00
Tim Northover
0cd4ebc382 ARM64: remove unneeded InstPrinter hacks
Now that TableGen handles aliases, these are unneeded. Hopefully more will be
able to go soon.

llvm-svn: 208781
2014-05-14 14:44:18 +00:00
Saleem Abdulrasool
a39ea3408e ARM: implement support for the UDF mnemonic
The UDF instruction is a reserved undefined instruction space.  The assembler
mnemonic was introduced with ARM ARM rev C.a.  The instruction is not predicated
and the immediate constant is ignored by the CPU.  Add support for the three
encodings for this instruction.

The changes to the invalid instruction test is due to the fact that the invalid
instructions actually overlap with the undefined instruction.  Introduction of
the new instruction results in a partial decode as an undefined sequence.  Drop
the tests as they are invalid instruction patterns anyways.

llvm-svn: 208751
2014-05-14 03:47:39 +00:00
Eric Christopher
935299458d Fix typo in function name.
llvm-svn: 208743
2014-05-14 00:31:15 +00:00
Matt Arsenault
102b7be363 R600/SI: Try to fix BFE operands when moving to VALU
This was broken by r208479

llvm-svn: 208740
2014-05-13 23:45:50 +00:00
Eric Christopher
1091ab4275 Save the optimization level the subtarget was created with in a
member variable and sink the initialization of crbits into the
subtarget feature reset code.

No functional change, but this refactor will be used in a future
commit.

llvm-svn: 208726
2014-05-13 20:49:08 +00:00
Christian Pirker
f4b3e60979 ARMEB: Fix byte order of EH frame unwinding instructions, with modified test file
This commit was already commited as revision rL208689 and discussd in
phabricator revision D3704.
But the test file was crashing on OS X and windows.

I fixed the test file in the same way as in rL208340.

llvm-svn: 208711
2014-05-13 16:44:30 +00:00
Rafael Espindola
a2d476ba2a Revert "ARMEB: Fix byte order of EH frame unwinding instructions"
This reverts commit r208689.

The test was crashing on OS X and windows.

llvm-svn: 208704
2014-05-13 15:19:56 +00:00
Daniel Sanders
3d39ebcc85 [mips] Marked up instructions added in MIPS32r2 and tested that IAS for -mcpu=mips(2|32) does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-3 that was available in MIPS32R2.

To limit the number of tests required, only one 32-bit and one 64-bit ISA
prior to MIPS32/MIPS64 are tested.

rdhwr has been deliberately left without an ISA annotation for now. This is
because the assembler and CodeGen disagree on when the instruction is
available. Strictly speaking, it is only available in MIPS32r2 and
MIPS64r2. However, it is emulated by a kernel trap on earlier ISA's and is
necessary for TLS so CodeGen should emit it on older ISA's too.

Depends on D3696

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3697

llvm-svn: 208690
2014-05-13 11:45:36 +00:00
Christian Pirker
56cf70310a ARMEB: Fix byte order of EH frame unwinding instructions
llvm-svn: 208689
2014-05-13 11:41:49 +00:00
Daniel Sanders
829e475bed [mips] Free up two values in SubtargetFeatureFlag by folding the redundant IsGP32/IsGP64 into IsGP32bit/IsGP64bit
Summary:
We are currently very close to the 32-bit limit of the current assembler
implementation. This is because there is no way to represent an instruction
that is available in, for example, Mips3 or Mips32. We have to define a
feature bit that represents this.

This patch cleans up a pair of redundant feature bits and slightly postpones the
point we will reach the limit.

Reviewers: zoran.jovanovic, jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3703

llvm-svn: 208685
2014-05-13 11:17:46 +00:00
Artyom Skrobov
5fd9c8419e [un]wrap extracted from lib/Target/Target[MachineC].cpp, lib/ExecutionEngine/ExecutionEngineBindings.cpp into include/llvm/IR/DataLayout.h
llvm-svn: 208680
2014-05-13 09:45:26 +00:00
Kevin Qin
0a385c6e45 [ARM64] Fix the misleading diagnostic on bad extend amount of reg+reg addressing mode.
A vague diagnostic replaced the misleading one.
This can fix bug 19502.

llvm-svn: 208669
2014-05-13 07:35:12 +00:00
Weiming Zhao
142b806751 Folding into CSEL when there is ZEXT between SETCC and ADD
Normally, patterns like (add x, (setcc cc ...)) will be folded into
(csel x, x+1, not cc). However, if there is a ZEXT after SETCC, they
won't be folded. This patch recognizes the ZEXT and allows the
generation of CSINC.

This patch fixes bug 19680.

llvm-svn: 208660
2014-05-13 00:40:58 +00:00
Reid Kleckner
d7efe8386c Try to fix an SDAG dependence issue with sret
r208453 added support for having sret on the second parameter.  In that
change, the code for copying sret into a virtual register was hoisted
into the loop that lowers formal parameters.  This caused a "Wrong
topological sorting" assertion failure during scheduling when a
parameter is passed in memory.  This change undoes that by creating a
second loop that deals with sret.

I'm worried that this fix is incomplete.  I don't fully understand the
dependence issues.  However, with this change we produce the same DAGs
we used to produce, so if they are broken, they are just as broken as
they have always been.

llvm-svn: 208637
2014-05-12 22:01:27 +00:00
Matt Arsenault
7739d11cce Use cast<> for unchecked use
llvm-svn: 208627
2014-05-12 20:42:57 +00:00
Louis Gerbarg
acd97f5881 Add support bswap16 to/from memory compiling to rev16 on ARM/Thumb
The current patterns for REV16 misses mostn __builtin_bswap16() due to
legalization promoting the operands to from load/stores toi32s and then
truncing/extending them. This patch adds new patterns that catch the resultant
DAGs and codegens them to rev16 instructions. Tests included.

rdar://15353652

llvm-svn: 208620
2014-05-12 19:53:52 +00:00
Matt Arsenault
aa6b8c3524 Use cast<> for unchecked use
llvm-svn: 208618
2014-05-12 19:26:38 +00:00
Matt Arsenault
133f79f1a6 Use range for
llvm-svn: 208617
2014-05-12 19:23:21 +00:00
Tim Northover
3c2cc7a397 TableGen: use PrintMethods to print more aliases
llvm-svn: 208607
2014-05-12 18:04:06 +00:00
Tim Northover
b9ab6037ee AArch64/ARM64: use InstAliases for NEON logical (imm) instructions.
llvm-svn: 208606
2014-05-12 18:03:42 +00:00
Tim Northover
5d583d1db4 AArch64/ARM64: implement "mov $Rd, $Imm" aliases in TableGen.
This is a slightly different approach to AArch64 (the base instruction
definitions aren't quite right for that to work), but achieves the
same thing and reduces C++ hackery in AsmParser.

llvm-svn: 208605
2014-05-12 18:03:36 +00:00
Matt Arsenault
c2251d492b R600: Add mul24 intrinsics
llvm-svn: 208604
2014-05-12 17:49:57 +00:00
Daniel Sanders
21e8add4de Revert: r208582 - [mips][mips64r6] Add sel.s and sel.d
Accidentally committed an unreviewed patch. Reverted it.

llvm-svn: 208583
2014-05-12 15:43:41 +00:00
Daniel Sanders
21e49ad22a [mips][mips64r6] Add sel.s and sel.d
Summary:
Also use named constants for common opcode fields.

Depends on D3669

Reviewers: jkolek, vmedic, zoran.jovanovic

Differential Revision: http://reviews.llvm.org/D3670

llvm-svn: 208582
2014-05-12 15:39:10 +00:00
Daniel Sanders
6e0f23768c [mips][mips64r6] Add d?div, d?mod, d?divu, d?modu
Summary: Depends on D3668

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3669

llvm-svn: 208579
2014-05-12 15:24:16 +00:00
Daniel Sanders
62837f7412 [mips][mips64r6] Added mul/mulu/muh/muhu
Summary: The 'mul' line of the test is temporarily commented out because it currently matches the MIPS32 mul instead of the MIPS32r6 mul. This line will be uncommented when we disable the MIPS32 mul on MIPS32r6.

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3668

llvm-svn: 208576
2014-05-12 15:12:45 +00:00
Aaron Ballman
005c04633a Silencing an MSVC warning about not all control paths returning a value (even though the switch is fully covered). No functional change.
llvm-svn: 208565
2014-05-12 14:22:58 +00:00
Tim Northover
8f159af538 ARM64: remove dead validation code from the AsmParser.
If this code triggers, any immediate has already been validated so it can't
possibly trigger a diagnostic.

llvm-svn: 208564
2014-05-12 14:13:21 +00:00
Tim Northover
5613cd99e8 ARM64: merge "extend" and "shift" addressing-mode enums.
In terms of assembly, these have too much overlap to be neatly modelled as
disjoint classes: in many cases "lsl" is an acceptable alternative to either
"uxtw" or "uxtx".

llvm-svn: 208563
2014-05-12 14:13:17 +00:00
Rafael Espindola
13ccdbe63e Remove an always true argument.
llvm-svn: 208557
2014-05-12 13:30:10 +00:00
Benjamin Kramer
32f96f80e4 X86: Make sure that we have SSE4.1 before we generate insertps nodes.
PR19721.

llvm-svn: 208552
2014-05-12 13:12:08 +00:00
Daniel Sanders
ed7f67b1d1 [mips] Marked up instructions added in MIPS32 and tested that IAS for -mcpu=mips2 does not accept them
Summary:
To limit the number of tests required, only one 32-bit and one 64-bit ISA
prior to MIPS32/MIPS64 are explicitly tested.

Depends on D3695

Reviewers: vmedic

Differential Revision: http://reviews.llvm.org/D3696

llvm-svn: 208549
2014-05-12 13:04:32 +00:00
Rafael Espindola
20867572f4 Remove MCUseCFI from TargetMachine.
It was always true.

llvm-svn: 208547
2014-05-12 13:01:42 +00:00
Daniel Sanders
25c4476b13 [mips] Marked up instructions added in MIPS-V and tested that IAS for -mcpu=mips[1234] does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-V that was available in MIPS32R2

Most of these instructions are correctly rejected but with the wrong error
message. These have been placed in a separate test for now. It happens
because many of the MIPS V instructions have not been implemented.

Depends on D3694

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3695

llvm-svn: 208546
2014-05-12 12:52:44 +00:00
Daniel Sanders
78fef0e36d [mips] Fold FeatureBitCount into FeatureMips32 and FeatureMips64
Summary:
DCL[ZO] are now correctly marked as being MIPS64 instructions. This has no
effect on the CodeGen tests since expansion of i64 prevented their use
anyway.

The check for MIPS16 to prevent the use of CLZ no longer prevents DCLZ as
well. This is not a functional change since DCLZ is still prohibited by
being a MIPS64 instruction (MIPS16 is only compatible with MIPS32).

No functional change

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3694

llvm-svn: 208544
2014-05-12 12:41:59 +00:00
Daniel Sanders
919a8bc274 [mips] Fold FeatureSEInReg into FeatureMips32r2
Summary: No functional change

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3693

llvm-svn: 208543
2014-05-12 12:28:15 +00:00
Daniel Sanders
a5dd1a7062 [mips] Fold FeatureSwap into FeatureMips32r2 and FeatureMips64r2
Summary:
dsbh and dshd are not available on Mips32r2. No codegen test changes
required since expansion of i64 prevented the use of these instructions
anyway.

Depends on D3690

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3692

llvm-svn: 208542
2014-05-12 12:15:41 +00:00
Daniel Sanders
da7e05d501 [mips] Replace FeatureFPIdx with FeatureMips4_32r2
Summary:
No functional change.

The minor change to the MIPS16 code is in preparation for a patch that will handle 32-bit FPIdx instructions separately to 64-bit (because they were added in different revisions)

Depends on D3677

Reviewers: rkotler, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3690

llvm-svn: 208541
2014-05-12 11:56:16 +00:00
Bradley Smith
51a757e4cd [ARM64] Add proper bounds checking/diagnostics to logical shifts
llvm-svn: 208540
2014-05-12 11:49:16 +00:00
Christian Pirker
03fd39ca48 ARM: Implement big endian bit-conversion for NEON type
llvm-svn: 208538
2014-05-12 11:19:20 +00:00
NAKAMURA Takumi
64ea062a28 X86ISelLowering.cpp:LowerINTRINSIC_W_CHAIN(): Prune impossible "default:" [-Wcovered-switch-default]
llvm-svn: 208533
2014-05-12 10:16:46 +00:00
Bradley Smith
853591a396 [ARM64] Add diagnostics for bitfield extract/insert instructions
Unfortunately, since ARM64 models all these instructions as aliases,
the checks need to be done at the time the alias is seen rather than
during instruction validation as AArch64 does it.

llvm-svn: 208529
2014-05-12 09:44:57 +00:00
Bradley Smith
afab1cb33a [ARM64] Correct more bounds checks/diagnostics for arithmetic shift operands
llvm-svn: 208528
2014-05-12 09:41:43 +00:00
Bradley Smith
469accbbee [ARM64] Move register/register MOV handling into tablegen and improve diagnostics
llvm-svn: 208527
2014-05-12 09:38:16 +00:00
Elena Demikhovsky
e132f428fd Fixed compilation issue
llvm-svn: 208524
2014-05-12 07:45:41 +00:00
Elena Demikhovsky
784490ba2d AVX-512: changes in intrinsics
1) Changed gather and scatter intrinsics. Now they are aligned with GCC built-ins. There is no more non-masked form. Masked intrinsic receives -1 if all lanes are executed.
2) I changed the function that works with intrinsics inside X86ISelLowering.cpp. I put all intrinsics in one table. I did it for INTRINSICS_W_CHAIN and plan to put all intrinsics from WO_CHAIN set to the same table in order to avoid the long-long "switch". (I wanted to use static map initialization that allowed by C++11 but I wasn't able to compile it on VS2012).
3) I added gather/scatter prefetch intrinsics.
4) I fixed MRMm encoding for masked instructions.

llvm-svn: 208522
2014-05-12 07:18:51 +00:00
Matt Arsenault
8358bf5227 Fix return before else
llvm-svn: 208510
2014-05-11 21:24:41 +00:00
Hal Finkel
f6f53bcd51 [PowerPC] Add global named register support
Support for the intrinsics that read from and write to global named registers
is added for r1, r2 and r13 (depending on the subtarget).

llvm-svn: 208509
2014-05-11 19:29:11 +00:00
Hal Finkel
5b038e4cbc Pass the value type to TLI::getRegisterByName
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.

No functionality change.

llvm-svn: 208508
2014-05-11 19:29:07 +00:00
Hal Finkel
34d719e885 Add 'override' to getRegisterByName in *ISelLowering.h
No functionality change intended.

llvm-svn: 208507
2014-05-11 19:28:55 +00:00
Hal Finkel
ef707e1c3e [PowerPC] On PPC32, 128-bit shifts might be runtime calls
The counter-loops formation pass needs to know what operations might be
function calls (because they can't appear in counter-based loops). On PPC32,
128-bit shifts might be runtime calls (even though you can't use __int128 on
PPC32, it seems that SROA might form them).

Fixes PR19709.

llvm-svn: 208501
2014-05-11 16:23:29 +00:00
Filipe Cabecinhas
5c7f162cea Fixed a bug when lowering build_vector (PR19694)
When lowering build_vector to an insertps, we would still lower it, even
if the source vectors weren't v4x32. This would break on avx if the source
was a v8x32. We now check the type of the source vectors.

llvm-svn: 208487
2014-05-11 08:12:56 +00:00
Vincent Lejeune
03352d8b38 R600/SI: Fold fabs/fneg into src input modifier
llvm-svn: 208480
2014-05-10 19:18:39 +00:00
Vincent Lejeune
840594f1e6 R600/SI: Prettier display of input modifiers
llvm-svn: 208479
2014-05-10 19:18:33 +00:00
Vincent Lejeune
8467918b43 R600/SI: Use pseudo instruction for fabs/clamp/fneg
llvm-svn: 208478
2014-05-10 19:18:25 +00:00
Tim Northover
a6ded6a3c2 ARM64: fix SELECT_CC lowering in absence of NaNs.
We were swapping the true & false results while testing for FMAX/FMIN,
but not putting them back to the original state if the later checks
failed.

Should fix PR19700.

llvm-svn: 208469
2014-05-10 07:37:50 +00:00
Reid Kleckner
86ad66783f Revert "[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret'"
This reverts commit r200561.

This calling convention was an attempt to match the MSVC C++ ABI for
methods that return structures by value.  This solution didn't scale,
because it would have required splitting every CC available on Windows
into two: one for methods and one for free functions.

Now that we can put sret on the second arg (r208453), and Clang does
that (r208458), revert this hack.

llvm-svn: 208459
2014-05-09 22:56:42 +00:00
Reid Kleckner
0c2e3574f4 Allow sret on the second parameter as well as the first
MSVC always places the implicit sret parameter after the implicit this
parameter of instance methods.  We used to handle this for
x86_thiscallcc by allocating the sret parameter on the stack and leaving
the this pointer in ecx, but that doesn't handle alternative calling
conventions like cdecl, stdcall, fastcall, or the win64 convention.

Instead, change the verifier to allow sret on the second parameter.

This also requires changing the Mips and X86 backends to return the
argument with the sret parameter, instead of assuming that the sret
parameter comes first.

The Sparc backend also returns sret parameters in a register, but I
wasn't able to update it to handle secondary sret parameters.  It
currently calls report_fatal_error if you feed it an sret in the second
parameter.

Reviewers: rafael.espindola, majnemer

Differential Revision: http://reviews.llvm.org/D3617

llvm-svn: 208453
2014-05-09 22:32:13 +00:00
Jonathan Roelofs
96b4e50b21 Fix broken build
ARM64 backend was missing a required_library entry.

llvm-svn: 208437
2014-05-09 18:06:22 +00:00
Louis Gerbarg
3c6803e843 Add custom lowering for add/sub with overflow intrinsics to ARM
This patch adds support to ARM for custom lowering of the
llvm.{u|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful
for handling idiomatic saturating math functions as generated by
InstCombineCompare.

Test cases included.

rdar://14853450

llvm-svn: 208435
2014-05-09 17:02:49 +00:00
Tom Stellard
1a986ede50 R600/SI: Teach SIInstrInfo::moveToVALU() how to move S_LOAD_*_IMM instructions
llvm-svn: 208432
2014-05-09 16:42:22 +00:00
Tom Stellard
550d79da42 R600/SI: Fix SMRD pattern for offsets > 32 bits
We were dropping the high bits of 64-bit immediate offsets.

llvm-svn: 208431
2014-05-09 16:42:21 +00:00
Tom Stellard
9562e8a6ba R600: Expand i64 SELECT_CC
llvm-svn: 208430
2014-05-09 16:42:19 +00:00
Tom Stellard
83d3208148 R600: Move MIN/MAX matching from LowerOperation() to PerformDAGCombine()
llvm-svn: 208429
2014-05-09 16:42:16 +00:00
Daniel Sanders
5cba0c9900 [mips] Marked up instructions added in MIPS-IV and tested that IAS for -mcpu=mips[123] does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-IV that was available in MIPS32

A small number of instructions are correctly rejected but with the wrong error
message. These have been placed in a separate test for now.

Depends on D3676

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3677

llvm-svn: 208414
2014-05-09 14:06:17 +00:00
Oliver Stannard
2b1166b162 ARM: HFAs must be passed in consecutive registers
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.

llvm-svn: 208413
2014-05-09 14:01:47 +00:00
Daniel Sanders
cb50ea8f81 [mips] Remove unused CondMov feature bit
Summary:
No functional change

Depends on D3675

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3676

llvm-svn: 208410
2014-05-09 13:15:07 +00:00
Daniel Sanders
ab71566b13 [mips] Marked up instructions added in MIPS-III and tested that IAS for -mcpu=mips[12] does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-III that was available in MIPS32

A small number of instructions are correctly rejected but with the wrong error
message. These have been placed in a separate test for now.

There's some obvious InstAlias's that ought to be marked MIPS-III but arent.
This is because they are not currently tested. I intend to catch these with
a final pass through the tablegen records to find tablegen records without
ISA annotations.

Depends on D3674

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3675

llvm-svn: 208408
2014-05-09 13:02:27 +00:00
Andrea Di Biagio
db3980299c Fix 80 col violation.
No functional change intended.

llvm-svn: 208405
2014-05-09 11:08:23 +00:00
Benjamin Kramer
330ea89877 [asan] Stop leaking X86Operands.
llvm-svn: 208400
2014-05-09 09:48:03 +00:00
Daniel Sanders
42bf9dcf31 [mips][mips64r6] Add experimental support for MIPS32r6 and MIPS64r6
Summary:
Adds MIPS32r6/MIPS64r6 and checks the compatibility requirements for these
processors.

I've also included comments to describe removed and re-encoded instructions,
along with placeholder def's for the new instructions but there are no
functional changes to codegen at this point.

Reviewers: jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3622

llvm-svn: 208399
2014-05-09 09:46:21 +00:00
Daniel Sanders
46b1fb6d95 [mips] Added missing dsra -> dsrav and sra -> srav aliases.
Summary: dsll, dsrl, sll, and srl already exist.

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3673

llvm-svn: 208397
2014-05-09 09:24:49 +00:00
Saleem Abdulrasool
17218ea0a8 ARM: support PIC on Windows on ARM
Handle lowering of global addresses for PIC mode compilation on Windows.  Always
use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and
is a pure Thumb environment.

llvm-svn: 208385
2014-05-09 00:58:32 +00:00
Filipe Cabecinhas
f3415cd85c Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper, delena

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

llvm-svn: 208372
2014-05-08 23:16:08 +00:00
Jyotsna Verma
0b0b5038bf [Hexagon] Add new InstrItinClass to support timing classes.
This patch doesn't introduce any functionality change. Test cases will be
added later when v5 support is added.

llvm-svn: 208349
2014-05-08 18:47:08 +00:00
Rafael Espindola
ccc1932aa6 Use for range loops.
llvm-svn: 208348
2014-05-08 18:40:06 +00:00
Matt Arsenault
119209fcfe R600: Promote f64 vector load/stores to i64 for consistency
llvm-svn: 208344
2014-05-08 18:01:56 +00:00
Andrea Di Biagio
7e0e036a46 [X86] Add target specific combine rules to fold SSE2/AVX2 packed arithmetic shift intrinsics.
This patch teaches the backend how to combine packed SSE2/AVX2 arithmetic shift
intrinsics.

The rules are:
 - Always fold a packed arithmetic shift by zero to its first operand;
 - Convert a packed arithmetic shift intrinsic dag node into a ISD::SRA only if
   the shift count is known to be smaller than the vector element size.

This patch also teaches to function 'getTargetVShiftByConstNode' how fold
target specific vector shifts by zero.

Added two new tests to verify that the DAGCombiner is able to fold
sequences of SSE2/AVX2 packed arithmetic shift calls.

llvm-svn: 208342
2014-05-08 17:44:04 +00:00
Daniel Sanders
ac1f965519 [mips] Add PredicateControl to InstAlias's
Summary:
No functional change

Depends on D3649

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3672

llvm-svn: 208334
2014-05-08 16:12:31 +00:00
Bradley Smith
e7c79c7b8a [ARM64] Add diagnostics for expected arithmetic shifts
llvm-svn: 208330
2014-05-08 15:40:39 +00:00
Bradley Smith
08ab47ca2d [ARM64] Re-work parsing of ADD/SUB shifted immediate operands
The parsing of ADD/SUB shifted immediates needs to be done explicitly so
that better diagnostics can be emitted, as a side effect this also
removes some of the hacks in the current method of handling this operand
type.

Additionally remove manual CMP aliasing to ADD/SUB and use InstAlias
instead.

llvm-svn: 208329
2014-05-08 15:39:58 +00:00
Bradley Smith
40ea4329b1 [ARM64] Ensure immediates in extend operands are in a valid range
Also emit a more useful diagnostic when they are not.

llvm-svn: 208318
2014-05-08 14:12:12 +00:00
Bradley Smith
0bcfb4a0bc [ARM64] Check for proper immediate in shift/extend operands
llvm-svn: 208317
2014-05-08 14:11:16 +00:00
Christian Pirker
35d96c7f86 ARM big endian function argument passing
llvm-svn: 208316
2014-05-08 14:06:24 +00:00
Daniel Sanders
c6c9c916df [mips] Implement l[wd]c3, and s[wd]c3.
Summary:
These instructions were added in MIPS-I, and MIPS-II but were removed in
MIPS-III. Interestingly, GAS continues to accept them when assembling for
MIPS-III.

For the moment, these instructions will follow GAS and accept them for
MIPS-III and newer but this will be tightened up when the invalid-*.s
tests are added.

Depends on D3647

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3648

llvm-svn: 208311
2014-05-08 13:02:11 +00:00
James Molloy
294269a69e [ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments for big endian calls.
SelectionDAG already knows about this, but fast-isel was ignorant.

llvm-svn: 208307
2014-05-08 12:53:50 +00:00
Daniel Sanders
8071a219e6 [mips] Marked up instructions added in MIPS-II and tested that IAS for -mcpu=mips1 does not accept them
Summary:
A small number of instructions are rejected with the wrong error message.
These have been placed in a separate test for now. There seems to be some
parsing quirk that triggers when these instructions are disabled.

Depends on D3571

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3647

llvm-svn: 208305
2014-05-08 12:40:48 +00:00
Daniel Sanders
94fae7d980 [mips] Implement tlbp, tlbr, tlbwi, and tlbwr
Reviewers: vmedic, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3571

llvm-svn: 208301
2014-05-08 11:51:18 +00:00
Tim Northover
72838ce201 ARM64: make sure FastISel emits SSA MachineInstrs
We need to use a temporary register for a 2-step operation like REM.

llvm-svn: 208297
2014-05-08 10:30:56 +00:00
Evgeniy Stepanov
196ad52640 [asan] Preserve flags in asm instrumentation.
Patch by Yuri Gorshenin.

llvm-svn: 208296
2014-05-08 09:55:24 +00:00
Hal Finkel
c52e65b830 Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
 - For generic x86-64 (which uses the SB model and, thus, will get some
   unrolling).
 - For AMD cores (because they still currently use the SB scheduling model)
 - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
   the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.

llvm-svn: 208289
2014-05-08 09:14:44 +00:00
Hao Liu
be513c440d AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend.
llvm-svn: 208284
2014-05-08 07:38:13 +00:00
Saleem Abdulrasool
84a61727f4 ARM: support FK_SecRel_2 relocations on WoA
This adds FK_SecRel_2 relocation support to ARM.  This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.

llvm-svn: 208273
2014-05-08 01:35:57 +00:00
Filipe Cabecinhas
275860c4fd Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

llvm-svn: 208271
2014-05-08 00:25:16 +00:00
Hal Finkel
a22ec95e68 [X86TTI] Remove the unrolling branch limits
The loop stream detector (LSD) on modern Intel cores, which optimizes the
execution of small loops, has limits on the number of taken branches in
addition to uop-count limits (modern AMD cores have similar limits).
Unfortunately, at the IR level, estimating the number of branches that will be
taken is difficult. For one thing, it strongly depends on later passes (block
placement, etc.). The original implementation took a conservative approach and
limited the maximal BB DFS depth of the loop.  However, fairly-extensive
benchmarking by several of us has revealed that this is the wrong approach. In
fact, there are zero known cases where the branch limit prevents a detrimental
unrolling (but plenty of cases where it does prevent beneficial unrolling).

While we could improve the current branch counting logic by incorporating
branch probabilities, this further complication seems unjustified without a
motivating regression. Instead, unless and until a regression appears, the
branch counting will be removed.

llvm-svn: 208255
2014-05-07 22:25:18 +00:00
Quentin Colombet
9b13d839be [X86] Selectively mark the FMA variants inside a family as isCommutable.
Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or
memory) are commutable.
E.g., for the 213 family (with the syntax src1, src2, src3):
fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3

Now consider the 231 family:
fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B
But
fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B
Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231.

Working on a reduced test case!

<rdar://problem/16800495>

llvm-svn: 208252
2014-05-07 21:43:35 +00:00
Eric Christopher
91701a28d0 Reformat a couple of functions for clarity.
llvm-svn: 208248
2014-05-07 21:05:47 +00:00
Jyotsna Verma
c9cece4644 [Hexagon] Add New TSFlags to be used in the upcoming patches.
llvm-svn: 208239
2014-05-07 19:07:34 +00:00
Chandler Carruth
17e2aa3c0a [x86] Make the 'x86-64' cpu, what I see as and many use as the generic
default architecture for reasonable modern x86 processors, actually be
modern. This processor model should essentially be "tuned" for modern
x86 chips as much as possible without undue penalties on any specific
architecture. Previously we weren't even using the nice scheduling
models. There are a few other tweaks needed here, but this change at
least I have benchmarked across a decent swatch of chips (intel's
clovertown, westmere, and sandybridge; amd's istanbul) and seen no
significant regressions.

If anyone has suggested ways to test this, just let me know. Somewhat
alarmingly, no existing tests failed.

llvm-svn: 208230
2014-05-07 17:37:03 +00:00
Chad Rosier
da933c5aba [ARM64][fast-isel] Disable target specific optimizations at -O0. Functionally,
this patch disables the dead register elimination pass and the load/store pair
optimization pass at -O0.  The ILP optimizations don't require the optimization
level to be checked because the call to addILPOpts is predicated with the
necessary check.  The AdvSIMDScalar pass is disabled by default at all
optimization levels.  This patch leaves that pass disabled by default.

Also, move command-line options into ARM64TargetMachine.cpp and add a few
additional flags to aid in debugging.  This fixes an issue with the
-debug-pass=Structure flag where passes were printed, but not actually run
(i.e., AdvSIMDScalar pass).

llvm-svn: 208223
2014-05-07 16:41:55 +00:00
Daniel Sanders
44cc643eee [mips] Add highly experimental support for MIPS-I, MIPS-II, MIPS-III, and MIPS-V
Summary:
These processors will only be available for the integrated assembler at
first (CodeGen will emit a fatal error saying they are not implemented).

The intention is to work through the existing instructions and correctly
annotate the ISA they were added in so that we have a sufficiently good
base to start MIPS64r6 development. MIPS64r6 removes/re-encodes certain
instructions and I believe it is best to define ISA's using set-union's
as far as possible rather than using set-subtraction.

Reviewers: vmedic

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D3569

llvm-svn: 208221
2014-05-07 16:25:22 +00:00
Rafael Espindola
3c06343011 Use range loop.
llvm-svn: 208218
2014-05-07 14:53:32 +00:00
Daniel Sanders
1cc95dd8b6 [mips] Add FGR_32/FGR_64/GPR_64 adjectives and use then instead of FGRPredicates/GPRPredicates
Summary:
No functional change (confirmed by diffing tablegen-erated files).

Depends on D3642

Reviewers: vmedic, dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3645

llvm-svn: 208213
2014-05-07 14:25:43 +00:00
Daniel Sanders
e8ed4c1087 [mips] Add INSN_<name> adverbs and start using them instead of AdditionalPredicates overrides
Summary:
No functional change

Depends on D3641

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3642

llvm-svn: 208212
2014-05-07 14:11:46 +00:00
Tim Northover
aed9e378c3 AArch64/ARM64: optimise vector selects & enable test
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP"
since the vector comparisons are all mask based.

llvm-svn: 208210
2014-05-07 14:10:27 +00:00
Daniel Sanders
ff0b1220dc [mips] Add ISA_<name> adverbs and start using them instead of AdditionalPredicates overrides
Summary:
One small functional change. The recently added PAUSE instruction now has
the HasStdEnc predicate which was accidentally removed by a Requires<>.

Depends on D3640

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3641

llvm-svn: 208209
2014-05-07 13:57:22 +00:00
Rafael Espindola
765e5e78cf Remove the UseCFI option from createAsmStreamer.
We were already always passing true, this just removes the option.

llvm-svn: 208205
2014-05-07 13:00:43 +00:00
Daniel Sanders
9bda911eb9 [mips] Continue splitting Instruction.Predicates into smaller lists and re-join them with !listconcat
Summary:
Move IsGP64bit into GPRPredicates, and IsFP64bit/NotFP64bit into FGRPredicates

No functional change (confirmed by diffing tablegen-erated files).

Depends on D3639

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3640

llvm-svn: 208201
2014-05-07 12:48:37 +00:00
James Molloy
9af6fee0f6 [ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests.
llvm-svn: 208200
2014-05-07 12:33:55 +00:00
James Molloy
7951361ffc [ARM64-BE] Fix variable-argument saving.
llvm-svn: 208199
2014-05-07 12:33:48 +00:00
James Molloy
f9bd42cfb5 [ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big endian.
The AAPCS states that values passed in registers must have a value as though
they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D" - that is,
loading scalars to vector registers and loading 1-element vectors is equivalent.

The logic implemented here is to ensure that at all call boundaries and during
formal argument lowering all vectors are treated as their bitwidth-based floating
point scalar counterpart, which is always one of f64 or f128 (v2i32 -> f64,
v4i32 -> f128 etc). A BITCAST is inserted so that the appropriate REV will be
generated during code generation.

llvm-svn: 208198
2014-05-07 12:33:41 +00:00
Daniel Sanders
00bd4febb4 [mips] Move IsFP64bit/NotFP64bit to the front of the AdditionalPredicates list
Summary:
This makes it easier to prove a more complicated change in the next commit
is non-functional.

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3639

llvm-svn: 208197
2014-05-07 12:27:46 +00:00
James Molloy
c74863e0d9 [ARM64-BE] Implement the crazy bitcast handling for big endian vectors.
Because we've canonicalised on using LD1/ST1, every time we do a bitcast
between vector types we must do an equivalent lane reversal.

Consider a simple memory load followed by a bitconvert then a store.
  v0 = load v2i32
  v1 = BITCAST v2i32 v0 to v4i16
       store v4i16 v2

In big endian mode every memory access has an implicit byte swap. LDR and
STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
is, they treat the vector as a sequence of elements to be byte-swapped.
The two pairs of instructions are fundamentally incompatible. We've decided
to use LD1/ST1 only to simplify compiler implementation.

LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
the original code sequence:  v0 = load v2i32

  v1 = REV v2i32                  (implicit)
  v2 = BITCAST v2i32 v1 to v4i16
  v3 = REV v4i16 v2               (implicit)
       store v4i16 v3

But this is now broken - the value stored is different to the value loaded
due to lane reordering. To fix this, on every BITCAST we must perform two
other REVs:

  v0 = load v2i32
  v1 = REV v2i32                  (implicit)
  v2 = REV v2i32
  v3 = BITCAST v2i32 v2 to v4i16
  v4 = REV v4i16
  v5 = REV v4i16 v4               (implicit)
       store v4i16 v5

This means an extra two instructions, but actually in most cases the two REV
instructions can be combined into one. For example:
  (REV64_2s (REV64_4h X)) === (REV32_4h X)

There is also no 128-bit REV instruction. This must be synthesized with an
EXT instruction.

Most bitconverts require some sort of conversion. The only exceptions are:
  a) Identity conversions -  vNfX <-> vNiX
  b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX

Even though there are hundreds of changed lines, I have a fairly high confidence
that they are somewhat correct. The changes to add two REV instructions per
bitcast were pretty mechanical, and once I'd done that I threw the resulting
.td at a script I wrote which combined the two REVs together (and added
an EXT instruction, for f128) based on an instruction description I gave it.

This was much less prone to error than doing it all manually, plus my brain
would not just have melted but would have vapourised.

llvm-svn: 208194
2014-05-07 11:28:53 +00:00
James Molloy
a179abfc18 [ARM64-BE] Predicate VLDR/VSTR for vectors as little-endian only. We must use LD1/ST1 on big-endian.
llvm-svn: 208193
2014-05-07 11:28:45 +00:00
James Molloy
c6eeb59eb7 [ARM64-BE] Make big endian (scalar) argument passing work correctly.
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.

llvm-svn: 208192
2014-05-07 11:28:36 +00:00
Daniel Sanders
078f32bcd8 [mips] Split Instruction.Predicates into smaller lists and re-join them with !listconcat
Summary:
The overall idea is to chop the Predicates list into subsets that are
usually overridden independently. This allows subclasses to partially
override the predicates of their superclasses without having to re-add all
the existing predicates.

This patch starts the process by moving HasStdEnc into a new
EncodingPredicates list and almost everything else into
AdditionalPredicates.

It has revealed a couple likely bugs where 'let Predicates' has removed
the HasStdEnc predicate.

No functional change (confirmed by diffing tablegen-erated files).

Depends on D3549, D3506

Reviewers: vmedic

Differential Revision: http://reviews.llvm.org/D3550

llvm-svn: 208184
2014-05-07 10:27:09 +00:00
Daniel Sanders
dc36787564 [mips] Move HasStdEnc to the front of the predicates lists.
Summary:
This will make it easier to prove that a more complicated change in the
following commit is non-functional.

No functional change.

Depends on D3506

Reviewers: vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3549

llvm-svn: 208179
2014-05-07 09:58:05 +00:00
Evgeniy Stepanov
4a872de9c5 [asan] Add a flag to control asm instrumentation.
With this change, asm instrumentation is disabled by default.

llvm-svn: 208167
2014-05-07 07:54:11 +00:00
Joerg Sonnenberger
541f85d280 Allow using normal .eh_frame based unwinding on ARM. Use the same
encodings as x86. Use this exception model for NetBSD.

llvm-svn: 208166
2014-05-07 07:49:34 +00:00
Saleem Abdulrasool
4a6126d8b7 ARM: mark additional instructions as MachineFrameSetup
Mark up additional instructions which are part of the function prologue as
MachineFrameSetup.  These instructions are part of the function prologue,
emitted by the PEI pass to setup the stack for use in the activating frame.

llvm-svn: 208153
2014-05-07 03:03:31 +00:00
Saleem Abdulrasool
fe0d3eaa0d ARM: fix WoA PEI instruction selection
The ARM::BLX instruction is an ARM mode instruction.  The Windows on ARM target
is limited to Thumb instructions.  Correctly use the thumb mode tBLXr
instruction.  This would manifest as an errant write into the object file as the
instruction is 4-bytes in length rather than 2.  The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.

llvm-svn: 208152
2014-05-07 03:03:27 +00:00
Andrew Trick
bf2fd15d49 Update an embarassing out-of-date comment.
llvm-svn: 208137
2014-05-06 22:18:43 +00:00
Joerg Sonnenberger
347cf3253f If a function needs a frame pointer, but r11 (aka fp) has not been used,
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.

llvm-svn: 208129
2014-05-06 20:43:01 +00:00
Andrea Di Biagio
540f8696d1 [X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa).
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.

With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).

This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.

llvm-svn: 208107
2014-05-06 17:09:03 +00:00
Renato Golin
8a9a382ab2 Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
Tim Northover
c3dfe08427 AArch64/ARM64: implement diagnosis of unpredictable loads & stores
llvm-svn: 208091
2014-05-06 14:15:14 +00:00
Tim Northover
46970c8884 AArch64/ARM64: make NEON vector list parsing a bit more robust
It doesn't change the results, but it seems silly not to diagnose obvious
problems early on.

llvm-svn: 208083
2014-05-06 12:50:51 +00:00
Tim Northover
5494085a1f AArch64/ARM64: add more specific diagnostic for floating imm 0.0.
llvm-svn: 208082
2014-05-06 12:50:47 +00:00
Tim Northover
3d47129ce2 AArch64/ARM64: add more specific diagnostic for invalid vector lanes
llvm-svn: 208081
2014-05-06 12:50:44 +00:00
Tim Northover
fab515c3bb AArch64/ARM64: produce more informative diagnostic assembling some immediates
No tests here, they'll be added when the entire neon-diagnostics.s test from
AArch64 is enabled.

llvm-svn: 208079
2014-05-06 11:18:53 +00:00
Christian Pirker
20a4e2bc33 ARM: For thumb fixups store halfwords high first and low second
llvm-svn: 208076
2014-05-06 10:05:11 +00:00
Kevin Qin
7464b14156 [ARM64] Enable alignment control option in front-end for ARM64.
This is the modification in llvm part.

llvm-svn: 208074
2014-05-06 09:48:52 +00:00
Craig Topper
659167eb76 Use X86 memory operand enums instead of hardcoding.
llvm-svn: 208064
2014-05-06 07:04:32 +00:00
Reid Kleckner
3d55680273 Fix i128 div/mod on mingw64
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.

Patch by Jameson Nash!

llvm-svn: 208029
2014-05-06 01:20:42 +00:00
Eric Christopher
e5222b0f5a Fix typo.
llvm-svn: 208006
2014-05-05 21:50:57 +00:00
Tom Stellard
1291d8f8c2 R600: Expand i64 ISD:SUB
llvm-svn: 208005
2014-05-05 21:47:15 +00:00
Filipe Cabecinhas
cccfa360ce Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest."
This reverts commit 207992. I misread the phab number on the LGTM.

llvm-svn: 207993
2014-05-05 19:40:36 +00:00
Filipe Cabecinhas
912c9dd1fe Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

llvm-svn: 207992
2014-05-05 19:36:28 +00:00
Marek Olsak
6fa0778eff R600/SI: allow 5 more input SGPRs to a shader
Our OpenGL driver needs 22 SGPRs (16 user SGPRs + 6 streamout non-user SGPRs).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
llvm-svn: 207990
2014-05-05 19:30:54 +00:00
Saleem Abdulrasool
45ef2e0451 CodeGen: correct memset emittance for WoA
Windows on ARM does not conform to AEABI.  However, memset would be emitted
using the AEABI signature, resulting in inverted parameters.  Handle this
special case appropriately.

llvm-svn: 207943
2014-05-04 23:13:21 +00:00
Saleem Abdulrasool
bca668b81f MC: support FK_SecRel_4 for Windows on ARM
Add handling for FK_SecRel_4 (4-byte section relative relocations).  These are
used by the generation of DWARF debug information (the abbrevations use section
relative relocations).  This will also be used in generation of CodeView line
tables.

llvm-svn: 207941
2014-05-04 23:13:15 +00:00
Elena Demikhovsky
52b1f22e9c AVX-512: minor change in rndscale intrinsic
llvm-svn: 207937
2014-05-04 13:35:37 +00:00
Saleem Abdulrasool
d9cd5486e9 X86: further range-loopify AsmPrinter
Use more range loops in the X86AsmPrinter.  NFC.

llvm-svn: 207928
2014-05-04 01:54:17 +00:00
Saleem Abdulrasool
53f4c01738 X86: remove X86COFFMachineModuleInfo
Remove dead code.  This is vestigial after r98384.

llvm-svn: 207927
2014-05-04 01:54:12 +00:00
Saleem Abdulrasool
edfd1818df X86: repair export compatibility with MinGW/cygwin
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix.  This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive.  This apparently has been been broken
for a while.  However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.

Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix.  Add an explicit test for cygwin's behaviour of export
directives.

llvm-svn: 207926
2014-05-04 00:03:48 +00:00
Saleem Abdulrasool
f061e56242 X86: refactor export directive generation
Create a helper function to generate the export directive.  This was previously
duplicated inline to handle export directives for variables and functions.  This
also enables the use of range-based iterators for the generation of the
directive rather than the traditional loops.  NFC.

llvm-svn: 207925
2014-05-04 00:03:41 +00:00
Rafael Espindola
bb77d317cd Fix pr19645.
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.

This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.

Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.

llvm-svn: 207920
2014-05-03 19:57:04 +00:00
Joey Gouly
48d50f02a1 [ARM64] Correctly select ANDWri in FastISel.
http://reviews.llvm.org/D3598

llvm-svn: 207917
2014-05-03 17:27:06 +00:00
Benjamin Kramer
b0e4e7faba Add a description for AMD's bdver4 (aka Excavator).
This is just bdver3 + AVX2 + BMI2.

llvm-svn: 207847
2014-05-02 15:47:07 +00:00
Tom Stellard
7831477f20 R600/SI: Add processor type for Mullins.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
llvm-svn: 207846
2014-05-02 15:41:49 +00:00
Tom Stellard
18ca382db4 R600: Expand vector sin and cos.
v2: move code to AMDGPUISelLowering.cpp
    squash with tests (both EG and SI)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 207845
2014-05-02 15:41:47 +00:00
Tom Stellard
05e86018ff R600: Expand TruncStore i64 -> {i16,i8}
llvm-svn: 207844
2014-05-02 15:41:46 +00:00
Tom Stellard
94494bca4d R600/SI: Only create one instruction when spilling/restoring register v3
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.

v2:
  - Fix calculation of lane index
  - Extend VGPR liveness to end of program.

v3:
  - Use SIMM16 field of S_NOP to specify multiple NOPs.

https://bugs.freedesktop.org/show_bug.cgi?id=75005

llvm-svn: 207843
2014-05-02 15:41:42 +00:00
Tim Northover
ab17e77d87 AArch64/ARM64: add patterns for post-indexed ST1 ops.
llvm-svn: 207840
2014-05-02 14:54:27 +00:00
Tim Northover
1fc15a656e ARM64: refactor NEON post-indexed loads & stores (MC).
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.

The change is mostly straightforward, I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.

I also discovered that the custom Decode logic is no longer needed, so
I removed it.

No tests, because there should be no functionality change.

llvm-svn: 207839
2014-05-02 14:54:21 +00:00
Tim Northover
fa66a37eb2 AArch64/ARM64: support indexed loads/stores on vector types.
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.

llvm-svn: 207838
2014-05-02 14:54:15 +00:00
Pranav Bhandarkar
8e6ddf2de1 Remove HexagonTargetMachine::addPassesForOptimizations; it is not needed any more.
llvm-svn: 207800
2014-05-01 22:10:59 +00:00
Reed Kotler
d858d50a3a Add basic functionality for assignment of ints.
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel

Test Plan: simplestore.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3527

llvm-svn: 207790
2014-05-01 20:39:21 +00:00
Eli Bendersky
0602e236ae Add an optimization that does CSE in a group of similar GEPs.
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.

The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.

Review: http://reviews.llvm.org/D3462

Patch by Jingyue Wu.

llvm-svn: 207783
2014-05-01 18:38:36 +00:00
Matt Arsenault
293918660a R600/SI: Fix verifier error with pseudo store instructions.
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.

llvm-svn: 207770
2014-05-01 16:37:52 +00:00
Bradley Smith
6bf9a37dfb [ARM64] Prefer generation of bzero on Darwin only
llvm-svn: 207760
2014-05-01 13:11:59 +00:00
Rafael Espindola
2551be1514 Don't force symbols to be globals in .thumb_set.
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given

.thumb_set foo, bar

we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.

Producing an undefined reference to bar is a general difference from MC and gas.
For example, given

a = b

gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but it it does, we should fix the general
difference, not special case .thumb_set.

llvm-svn: 207757
2014-05-01 12:45:43 +00:00
Tim Northover
56141b7826 AArch64/ARM64: print BFM instructions as BFI or BFXIL
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.

llvm-svn: 207752
2014-05-01 12:29:38 +00:00
Richard Barton
9c0649fc59 Correction to assert statemtent to allow 32-bit unsigned numbers with the top bit set.
This fixes an ARM assembler crash - regression test added.

llvm-svn: 207747
2014-05-01 11:37:44 +00:00
Bradley Smith
7a5aebf3ad [ARM64] Conditionalize CPU specific system registers on subtarget features
llvm-svn: 207742
2014-05-01 10:25:36 +00:00
Matheus Almeida
c66aaa03f3 [mips] Move expansion of .cpsetup to target streamer.
Summary:
There are two functional changes:
1) The directive is not expanded for the ASM->ASM code path.
2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS).

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D3482

llvm-svn: 207741
2014-05-01 10:24:46 +00:00
Daniel Sanders
bdb011a812 [mips] Removed two-operand alias for sllv, sr[al]v, rotrv, dsllv, dsr[al]v, and drotrv
GAS doesn't actually accept these particular cases.

The mnemonic without the trailing 'v' still supports two-operand aliases.

llvm-svn: 207740
2014-05-01 10:08:36 +00:00
Saleem Abdulrasool
78ec2560f7 ARM: fix memory leak, simplify WoA stack probing
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing.  Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load which simplifies the logic as
well as avoids the memory leak.

llvm-svn: 207737
2014-05-01 04:19:59 +00:00