mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

24142 Commits

Author SHA1 Message Date
Tim Northover
ec272f9cde ARM: Permit "sp" in ARM variant of STREXD instructions
Patch from Mihail Popa

llvm-svn: 179854
2013-04-19 15:44:32 +00:00
Tim Northover
5ae8bb1aa6 ARM: permit "sp" in ARM variants of MOVW/MOVT instructions
llvm-svn: 179847
2013-04-19 09:58:09 +00:00
Michael Liao
10bba2164e Use 'array_lengthof' where possible to avoid magic numbers
llvm-svn: 179833
2013-04-19 04:03:37 +00:00
Tom Stellard
017c53ebbd R600: Add pattern for the BFI_INT instruction
llvm-svn: 179830
2013-04-19 02:11:06 +00:00
Tom Stellard
db47653487 R600/SI: Use InstFlag for VOP3 modifier operands
InstFlag has a default value of 0 and will simplify the VOP3 patterns.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179829
2013-04-19 02:11:00 +00:00
Bill Wendling
accb8d458d Use an enum instead of magic constants to improve readability.
llvm-svn: 179820
2013-04-19 00:05:59 +00:00
Chad Rosier
9eb6febf54 [ms-inline asm] Apply the condition code mnemonic aliases to both the Intel and
AT&T dialect.  Test case for r179804 as well.
rdar://13674398 and PR13340.

llvm-svn: 179813
2013-04-18 23:16:12 +00:00
Bill Wendling
4e6332d4c3 Set the compact unwind encoding to 'requires EH DWARF' if we cannot generate a CU encoding.
llvm-svn: 179808
2013-04-18 22:55:29 +00:00
Hal Finkel
b96e61374f Disable PPC comparison optimization by default
This seems to cause a stage-2 LLVM compile failure (by crashing TableGen), so
I'm disabling this for now.

llvm-svn: 179807
2013-04-18 22:54:25 +00:00
Chad Rosier
645b701422 [asm parser] Add support for predicating MnemonicAlias based on the assembler
variant/dialect.  Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.

llvm-svn: 179804
2013-04-18 22:35:36 +00:00
Hal Finkel
44190578df Implement optimizeCompareInstr for PPC
Many PPC instructions have a so-called 'record form' which stores to a specific
condition register the result of comparing the result of the instruction with
zero (always as a signed comparison). For integer operations on PPC64, this is
always a 64-bit comparison.

This implementation is derived from the implementation in the ARM backend;
there are some differences because PPC condition registers are allocatable
virtual registers (although the record forms always use a specific one), and we
look for a matching subtraction instruction after the compare (but before the
first use) in addition to before it.

llvm-svn: 179802
2013-04-18 22:15:08 +00:00
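
Note on r179802: the record-form behaviour described above (comparing an instruction's result with zero as a signed 64-bit value) can be sketched roughly as follows. This is only an illustration under stated assumptions; the function names and the LT/GT/EQ dictionary keys are hypothetical and are not taken from the LLVM sources.

```python
def to_signed64(value: int) -> int:
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value & (1 << 63) else value


def record_form_flags(result: int) -> dict:
    """Flags a record form would store: signed 64-bit compare of result with 0.

    The flag names are illustrative labels chosen for this sketch.
    """
    signed = to_signed64(result)
    return {"LT": signed < 0, "GT": signed > 0, "EQ": signed == 0}
```

A separate compare against zero that follows such an instruction yields the same flags, which is why an optimization of this kind can drop it.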
Benjamin Kramer
aeff9e581b X86: Add an SSE2 lowering for 64 bit compares when pcmpgtq (SSE4.2) isn't available.
This pattern started popping up in vectorized min/max reductions.

llvm-svn: 179797
2013-04-18 21:37:45 +00:00
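
Note on r179797: the commit message does not spell out the lowering, so the following is only a sketch of one standard way to build a signed 64-bit greater-than out of 32-bit comparisons, which is what is available without pcmpgtq. All names here are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def signed_gt64_from_halves(a: int, b: int) -> bool:
    """Signed a > b on 64-bit lanes, using only 32-bit comparisons."""
    a &= MASK64
    b &= MASK64
    a_hi, a_lo = a >> 32, a & MASK32
    b_hi, b_lo = b >> 32, b & MASK32
    hi_gt = to_signed32(a_hi) > to_signed32(b_hi)  # signed compare, high halves
    hi_eq = a_hi == b_hi                           # equality, high halves
    lo_gt = a_lo > b_lo                            # unsigned compare, low halves
    return hi_gt or (hi_eq and lo_gt)
```

The high halves decide the result unless they are equal, in which case the low halves are compared as unsigned values.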
Derek Schuff
c55a3d43a9 Allow misaligned stores in x86 fast-isel.
In X86FastISel::X86SelectStore(), improperly aligned stores are rejected and
handled by the DAG-based ISel.  However, X86FastISel::X86SelectLoad() makes
no such requirement.  There doesn't appear to be an x86 architectural
correctness issue with allowing potentially unaligned store instructions.
This patch removes this restriction.

Patch by Jim Stichnoth.

llvm-svn: 179774
2013-04-18 17:41:08 +00:00
Chad Rosier
da17fa9b38 [ms-inline asm] Simplify some logic and add a FIXME for unhandled unary minus.
llvm-svn: 179765
2013-04-18 16:28:19 +00:00
Chad Rosier
5dffee4c99 Make this private method.
llvm-svn: 179764
2013-04-18 16:13:18 +00:00
Hao Liu
ca09ec237c Fix for PR14824, An ARM Load/Store Optimization bug
llvm-svn: 179751
2013-04-18 09:11:08 +00:00
Akira Hatanaka
111892c653 [mips] Rename function.
llvm-svn: 179741
2013-04-18 01:00:46 +00:00
Akira Hatanaka
ae4353c654 [mips] DSP-ASE move from HI/LO register instructions.
llvm-svn: 179739
2013-04-18 00:52:44 +00:00
Jack Carter
b9f4cdf48c Mips assembler: formatting and comment changes.
This patch should not have any functional changes. 

llvm-svn: 179737
2013-04-18 00:41:53 +00:00
Peter Collingbourne
a0d11d0e11 Add support for subsections to the ELF assembler. Fixes PR8717.
Differential Revision: http://llvm-reviews.chandlerc.com/D598

llvm-svn: 179725
2013-04-17 21:18:16 +00:00
Chad Rosier
1cb3175415 [ms-inline asm] These should be int64_t, not uint64_t.
llvm-svn: 179724
2013-04-17 21:14:38 +00:00
Chad Rosier
1efbeb717f [ms-inline asm] Add support for the minus unary operator. Previously, we were
unable to handle cases such as __asm mov eax, 8*-8.

This patch also attempts to simplify the state machine.  Further, the error
reporting has been improved.  Test cases included, but more will be added to
the clang side shortly.
rdar://13668445

llvm-svn: 179719
2013-04-17 21:01:45 +00:00
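
Note on r179719: to illustrate why an expression such as 8*-8 needs explicit unary-minus handling, here is a small evaluator sketch. It uses recursive descent rather than the state machine the commit refers to, supports only +, -, * and parentheses, and every name in it is hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split an immediate expression into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an integer expression with +, -, *, parentheses, unary minus."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def unary():
        token = take()
        if token == "-":          # unary minus binds tighter than '*'
            return -unary()
        if token == "(":
            value = additive()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise ValueError(f"unexpected token {token!r}")
        return int(token)

    def multiplicative():
        value = unary()
        while peek() == "*":
            take()
            value *= unary()
        return value

    def additive():
        value = multiplicative()
        while peek() in ("+", "-"):
            if take() == "+":
                value += multiplicative()
            else:
                value -= multiplicative()
        return value

    result = additive()
    if peek() is not None:
        raise ValueError(f"trailing token {peek()!r}")
    return result
```

A parser that only treats '-' as a binary operator has nothing on the left-hand side to subtract from after the '*', which is the case the commit describes as previously unhandled.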
Eli Bendersky
802610971f This patch teaches x86 fast-isel to generate the native div/idiv instructions
for the sdiv/srem/udiv/urem bitcode instructions.  This is done for the i8,
i16, and i32 types, as well as i64 for the x86_64 target.

Patch by Jim Stichnoth

llvm-svn: 179715
2013-04-17 20:10:13 +00:00
Arnold Schwaighofer
e1dc8ae8c8 X86 cost model: Exit before calling getSimpleVT on non-simple VTs
getSimpleVT can only handle simple value types.

radar://13676022

llvm-svn: 179714
2013-04-17 20:04:53 +00:00
Quentin Colombet
c65f67e600 Fix treatment of ARM unallocated hint instructions.
The reference manual defines only 5 permitted values for the immediate field of the "hint" instruction:
1. nop (imm == 0)
2. yield (imm == 1)
3. wfe (imm == 2)
4. wfi (imm == 3)
5. sev (imm == 4)

Therefore, restrict the permitted values for the "hint" instruction to 0 through 4.

Patch by Mihail Popa <Mihail.Popa@arm.com>

llvm-svn: 179707
2013-04-17 18:46:12 +00:00
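
Note on r179707: the restriction listed above amounts to a small lookup table. The sketch below is illustrative only; the table and function names are hypothetical, and returning None for an unallocated value is an assumption of this example rather than the behaviour of the LLVM decoder.

```python
# Permitted immediates for the ARM "hint" instruction, as listed in the
# commit message above.
HINT_MNEMONICS = {0: "nop", 1: "yield", 2: "wfe", 3: "wfi", 4: "sev"}


def decode_hint(imm):
    """Return the mnemonic for a permitted hint immediate, else None."""
    return HINT_MNEMONICS.get(imm)
```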
Ulrich Weigand
046b0abdfb PowerPC: Mark some more patterns as isCodeGenOnly.
A couple of recently introduced conditional branch patterns
also need to be marked as isCodeGenOnly since they cannot
be handled by the asm parser.

No change in generated code.

llvm-svn: 179690
2013-04-17 17:19:05 +00:00
Vincent Lejeune
cd0483fb18 R600: Make Export Instruction not duplicable
llvm-svn: 179686
2013-04-17 15:17:39 +00:00
Vincent Lejeune
a1a9b1752d R600: Export is emitted as a CF_NATIVE inst
llvm-svn: 179685
2013-04-17 15:17:32 +00:00
Vincent Lejeune
966453087f R600: Emit used GPRs count
llvm-svn: 179684
2013-04-17 15:17:25 +00:00
Evgeniy Stepanov
eaa78f8bb9 Fix -Werror build.
Broken in r179657.

llvm-svn: 179669
2013-04-17 06:45:11 +00:00
Jack Carter
e773ca9ec6 Mips assembler: Enable handling of nested expressions
This patch allows the Mips assembler to parse and emit nested 
expressions as instruction operands. It also extends the 
expansion of memory instructions when an offset is given as 
an expression. 

Contributor: Vladimir Medic
llvm-svn: 179657
2013-04-17 00:18:04 +00:00
Chad Rosier
441bf36faa [ms-inline asm] Add support for parsing complex immediate expressions. Test
cases to be submitted on clang side shortly.
rdar://13663768 and PR15760

llvm-svn: 179655
2013-04-17 00:11:46 +00:00
Tom Stellard
cbb7544fa4 C API: Add LLVMTargetMachineEmitToMemoryBuffer()
llvm-svn: 179648
2013-04-16 23:12:56 +00:00
Chad Rosier
9a757bb4ea Remove unused variable from previous refactor.
llvm-svn: 179611
2013-04-16 18:20:10 +00:00
Chad Rosier
128e5ae5af [ms-inline asm] Refactor. No functional change intended.
llvm-svn: 179610
2013-04-16 18:15:40 +00:00
Chad Rosier
0aff0eaab6 [ms-inline asm] Remove some dead code.
llvm-svn: 179607
2013-04-16 17:27:40 +00:00
Logan Chien
dd22a5184e Fix build failure introduced in 179591 when assertions are disabled.
llvm-svn: 179593
2013-04-16 14:02:30 +00:00
Logan Chien
6f13ff357d Implement ARM unwind opcode assembler.
llvm-svn: 179591
2013-04-16 12:02:21 +00:00
Jakob Stoklund Olesen
b4edc00933 Add 64-bit multiply and divide instructions for SPARC v9.
llvm-svn: 179582
2013-04-16 02:57:02 +00:00
Jim Grosbach
10785fcd52 ARM: Add VACLT and VACLE assembly aliases.
These are aliases for VACGT and VACGE, respectively, with the source
operands reversed.

rdar://13638090

llvm-svn: 179575
2013-04-15 22:42:50 +00:00
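
Note on r179575: an alias with reversed source operands can be modelled as a simple rewrite. This sketch ignores data-type suffixes and uses hypothetical names; it is not how the ARM TableGen aliases are written.

```python
# Alias mnemonic -> real mnemonic; the source operands are swapped on expansion.
REVERSED_ALIASES = {"vaclt": "vacgt", "vacle": "vacge"}


def expand_alias(mnemonic, dest, src1, src2):
    """Rewrite VACLT/VACLE into VACGT/VACGE with the sources reversed."""
    base = mnemonic.lower()
    if base in REVERSED_ALIASES:
        return (REVERSED_ALIASES[base], dest, src2, src1)
    return (base, dest, src1, src2)
```

The swap works because |a| < |b| is the same test as |b| > |a|.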
Jack Carter
6a3d1c59be Mips assembler: Explicit floating point condition register recognition.
This patch allows the assembler to recognize $fcc0 
as a valid register for conditional move instructions. 

Corresponding test cases have been added.

Contributor: Vladimir Medic
llvm-svn: 179567
2013-04-15 22:21:55 +00:00
Tom Stellard
bd67f8cd81 R600/SI: Emit config values in register value pairs.
Instead of emitting config values in a predefined order, the code
emitter will now emit a 32-bit register index followed by the 32-bit
config value.

llvm-svn: 179546
2013-04-15 17:51:35 +00:00
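
Note on r179546: a stream of 32-bit register index and 32-bit value pairs could be packed and read back as sketched below. The little-endian byte order and the register index used in the usage line are assumptions made for illustration; the commit message specifies neither.

```python
import struct


def encode_config(pairs):
    """Pack (register_index, value) pairs as consecutive 32-bit words.

    Little-endian encoding is an assumption of this sketch.
    """
    return b"".join(struct.pack("<II", reg, val) for reg, val in pairs)


def decode_config(blob):
    """Read back (register_index, value) pairs from an encoded blob."""
    return [struct.unpack_from("<II", blob, off) for off in range(0, len(blob), 8)]
```

For example, `decode_config(encode_config([(0xB028, 3)]))` gives back the single pair, where 0xB028 is a made-up register index. Because each value carries its own register index, a reader does not need to know a predefined order.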
Tom Stellard
a44e2e18a1 R600/SI: Emit configuration value in the .AMDGPU.config ELF section
llvm-svn: 179545
2013-04-15 17:51:30 +00:00
Tom Stellard
cb4468b00a R600: Emit ELF formatted code rather than raw ISA.
llvm-svn: 179544
2013-04-15 17:51:21 +00:00
Hal Finkel
75ee7f1dca Mark all PPC comparison instructions as not having side effects
Now that the CR spilling issues have been resolved, we can remove the
unmodeled-side-effect attributes from the comparison instructions (and also
mark them as isCompare). By allowing these, by default, to have unmodeled side
effects, we were hiding problems with CR spilling; but everything seems much
happier now.

llvm-svn: 179502
2013-04-15 02:37:46 +00:00
Hal Finkel
371be65604 Fix PPC64 CR spill location for callee-saved registers
This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition
registers, the spill location is specified relative to the stack pointer (SP +
8). However, this is not relative to the SP after the new stack frame is
established, but instead relative to the caller's stack pointer (it is stored
into the linkage area of the parent's stack frame).

So, like with the link register, we don't directly spill the CRs with other
callee-saved registers, but just mark them to be spilled during prologue
generation.

In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32).

llvm-svn: 179500
2013-04-15 02:07:05 +00:00
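
Note on r179500: the difference between "SP + 8" relative to the caller's stack pointer and relative to the new one is plain arithmetic. The sketch assumes a fixed frame size and a stack that grows downward; the names are hypothetical.

```python
CR_SAVE_OFFSET = 8  # "SP + 8" from the commit message above


def cr_slot_address(caller_sp):
    """Address of the CR save slot in the caller's linkage area."""
    return caller_sp + CR_SAVE_OFFSET


def cr_slot_offset_from_new_sp(frame_size):
    """Offset of the same slot once the callee has set up its own frame."""
    return frame_size + CR_SAVE_OFFSET
```

With a caller SP of 0x1000 and a 0x70-byte frame, the new SP is 0xF90 and the slot sits at 0xF90 + 0x78 = 0x1008, not at 0xF90 + 8.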
Jakob Stoklund Olesen
3b790b7f2e Use i32 for all SPARC shift amounts, even in 64-bit mode.
Test case by llvm-stress.

llvm-svn: 179477
2013-04-14 05:48:50 +00:00
Jakob Stoklund Olesen
8fafe8cd31 Add support for the abs64 SPARC v9 code model.
For when 16 TB just isn't enough.

llvm-svn: 179474
2013-04-14 05:10:36 +00:00
Jakob Stoklund Olesen
d29f125f5b Add support for the SPARC v9 abs44 code model.
This is the default model for non-PIC 64-bit code. It supports
text+data+bss linked anywhere in the low 16 TB of the address space.

llvm-svn: 179473
2013-04-14 04:57:51 +00:00
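
Note on r179473 and r179474: the "16 TB" figure follows from 44 address bits. A quick check, with a hypothetical helper name:

```python
TIB = 1 << 40  # bytes in one tebibyte


def fits_code_model(address, bits):
    """True if address lies in the low 2**bits bytes of the address space."""
    return 0 <= address < (1 << bits)


# 2**44 bytes is 16 TiB, the abs44 limit mentioned above.
ABS44_LIMIT_TIB = (1 << 44) // TIB
```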
Jakob Stoklund Olesen
c97ab7209a Use target flags for printing SPARC asm operands.
64-bit code models need multiple relocations that can't be inferred from
the opcode like they can in 32-bit code.

llvm-svn: 179472
2013-04-14 04:35:19 +00:00
Jakob Stoklund Olesen
b5173ad8fb Also put target flags on SPARC constant pool references.
Constant pool entries are accessed exactly the same way as global
variables.

llvm-svn: 179471
2013-04-14 04:35:16 +00:00
Jakob Stoklund Olesen
c23fada5f9 Fix patterns for 64-bit pointers.
This fixes the pic32 code model for SPARC v9.

llvm-svn: 179469
2013-04-14 01:53:23 +00:00
Jakob Stoklund Olesen
edb2dd00c5 Add target flags to SPARC address operands.
SDNodes and MachineOperands get target flags representing the %hi() and
%lo() assembly annotations that eventually become relocations.

Also define flags to be used by the 64-bit code models.

llvm-svn: 179468
2013-04-14 01:33:32 +00:00
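
Note on r179468: for a 32-bit address, %hi() selects the upper 22 bits (the sethi immediate) and %lo() the lower 10 bits. The helper names below are hypothetical and the sketch covers only the 32-bit case, not the additional flags defined for the 64-bit code models.

```python
def hi22(address):
    """Upper 22 bits of a 32-bit address, as consumed by sethi %hi(x)."""
    return (address >> 10) & 0x3FFFFF


def lo10(address):
    """Lower 10 bits of an address, as used with %lo(x)."""
    return address & 0x3FF


def rebuild(hi, lo):
    """Combine the two halves the way sethi followed by or would."""
    return (hi << 10) | lo
```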
Hal Finkel
12ac18c635 Mark all PPC CR registers to be spilled as live-in and tag MFCR appropriately
Leaving MFCR as having unmodeled side effects is not enough to prevent
unwanted instruction reordering post-RA. We could probably apply a stronger
barrier attribute, but there is a better way: Add all (not just the first) CR
to be spilled as live-in to the entry block, and add all CRs to the MFCR
instruction as implicitly killed.

Unfortunately, I don't have a small test case.

llvm-svn: 179465
2013-04-13 23:06:15 +00:00
Jakob Stoklund Olesen
6182cc630a Define SPARC code models.
Currently, only abs32 and pic32 are implemented. Add a test case for
abs32 with 64-bit code. 64-bit PIC code is currently broken.

llvm-svn: 179463
2013-04-13 19:02:23 +00:00
Jakob Stoklund Olesen
d7b7964c47 Use the correct types when matching ADDRri patterns from frame indexes.
It doesn't seem like anybody is checking types this late in isel, so no
test case.

llvm-svn: 179462
2013-04-13 19:02:16 +00:00
Hal Finkel
978a847acb Spill and restore PPC CR registers using the FP when we have one
For functions that need to spill CRs, and have dynamic stack allocations, the
value of the SP during the restore is not what it was during the save, and so
we need to use the FP in these cases (as for all of the other spills and
restores, but the CR restore has a special code path because its reserved slot,
like the link register, is specified directly relative to the adjusted SP).

llvm-svn: 179457
2013-04-13 08:09:20 +00:00
Andrew Trick
835ac00f78 X86 machine model: reduce SandyBridge and Haswell ILPWindow.
The initial values were arbitrary. I want them to be more
conservative. This represents the number of latency cycles hidden by
OOO execution. In practice, I think it should be within a small factor
of the complex floating point operation latency so the scheduler can
make some attempt to hide latency even for smallish blocks.

These are by no means the best values, just a starting point for
tuning heuristics. Some benchmarks such as TSVC run faster with this
lower value for SandyBridge. I haven't run anything on Haswell, but
it shouldn't be 2x SB.

llvm-svn: 179450
2013-04-13 06:07:43 +00:00
Andrew Trick
d9efdff16f Catch another case where SD fails to propagate node order.
I need to handle this for the test case in my following scheduler
commit.

Work is already under way to redesign the mechanism for node order
propagation because this case by case approach is unmaintainable.

llvm-svn: 179448
2013-04-13 06:07:36 +00:00
Akira Hatanaka
3d45da9cc4 [mips] Move MipsTargetLowering::lowerINTRINSIC_W_CHAIN and
lowerINTRINSIC_WO_CHAIN into MipsSETargetLowering.

No functionality changes.

llvm-svn: 179444
2013-04-13 02:13:30 +00:00
Akira Hatanaka
e0468ce3e1 [mips] Reapply r179420 and r179421.
llvm-svn: 179434
2013-04-13 00:55:41 +00:00
Akira Hatanaka
85437e61dc [mips] Override TargetLoweringBase::isShuffleMaskLegal.
llvm-svn: 179433
2013-04-13 00:45:02 +00:00
Chad Rosier
3d83c7a3e0 [ms-inline asm] Simplify the logic by using parsePrimaryExpr. No functional
change intended.  Test case previously added in r178568.
Part of rdar://13611297

llvm-svn: 179425
2013-04-12 23:03:20 +00:00
Akira Hatanaka
b0b85e00d8 Revert r179420 and r179421.
llvm-svn: 179422
2013-04-12 22:40:07 +00:00
Akira Hatanaka
737648f84c [mips] Instruction selection patterns for carry-setting and using add
instructions.

llvm-svn: 179421
2013-04-12 22:24:52 +00:00
Akira Hatanaka
d809bc8eeb [mips] v4i8 and v2i16 add, sub and mul instruction selection patterns.
llvm-svn: 179420
2013-04-12 22:14:24 +00:00
Chad Rosier
81c2f41261 [ms-inline asm] Move this logic into a static function as it's only applicable
when parsing MS-style inline assembly.  No functional change intended.

llvm-svn: 179407
2013-04-12 20:20:54 +00:00
Chad Rosier
4e3d67d4c5 [ms-inline asm] Address the FIXME for ImmDisp before brackets. This
is a follow on to r179393 and r179399.  Test case to be added on
the clang side.
Part of rdar://13453209

llvm-svn: 179403
2013-04-12 19:51:49 +00:00
Chad Rosier
443d79152d [ms-inline asm] Have the [ Symbol ] case fall into the more general logic. This
is a follow on to r179393.  Test case to be added on the clang side.
Part of rdar://13453209

llvm-svn: 179399
2013-04-12 18:54:20 +00:00
Quentin Colombet
226206b401 ARM: Correct printing of pre-indexed operands.
According to the ARM reference manual, constant offsets are mandatory for pre-indexed addressing modes.
The MC disassembler was not obeying this when the offset is 0.
It was producing instructions like: str r0, [r1]!.
Correct syntax is: str r0, [r1, #0]!.

This change modifies the dumping of operands so that the offset is always printed, regardless of its value, when pre-indexed addressing mode is used.

Patch by Mihail Popa <Mihail.Popa@arm.com>

llvm-svn: 179398
2013-04-12 18:47:25 +00:00
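
Note on r179398: the printing rule described above can be sketched as a small formatter. The function name and its parameters are hypothetical; only the two example strings come from the commit message.

```python
def format_mem_operand(base, offset, pre_indexed):
    """Format an ARM memory operand with an immediate offset.

    For pre-indexed addressing the offset is always printed, even when zero.
    """
    if pre_indexed:
        return f"[{base}, #{offset}]!"
    if offset == 0:
        return f"[{base}]"
    return f"[{base}, #{offset}]"
```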
Chad Rosier
b3960a88c7 [ms-inline asm] Add support for operands that include both a symbol and an
immediate displacement.  Specifically, add support for generating the proper IR.
We've been able to parse this for some time now.  Test case to be added on the
clang side.
Part of rdar://13453209

llvm-svn: 179393
2013-04-12 18:21:18 +00:00
Hal Finkel
f21215af29 PPC: Remove (broken) nested implicit definition lists
TableGen will not combine nested list 'let' bindings into a single list, and
instead uses only the inner scope. As a result, several instruction definitions
were missing implicit register defs that were in outer scopes. This de-nests
these scopes and makes all instructions have only one let binding which sets
implicit register definitions.

llvm-svn: 179392
2013-04-12 18:17:57 +00:00
Hal Finkel
c6d837e387 Add a comment about the PPC Interpretation64Bit bit
llvm-svn: 179391
2013-04-12 18:17:38 +00:00
Jyotsna Verma
f365070e16 Hexagon: Set isPredicatedNew flag on predicate new instructions.
llvm-svn: 179388
2013-04-12 18:01:06 +00:00
Jyotsna Verma
a7ba594fa9 Hexagon: Set isPredicatedFalse flag for all the instructions with negated predication.
llvm-svn: 179387
2013-04-12 17:46:52 +00:00
Hal Finkel
a4429c79f5 Add PPC instruction record forms and associated query functions
This is prep. work for the implementation of optimizeCompare. Many PPC
instructions have 'record' forms (in almost all cases, this means that the RC
bit is set) that cause the result of the instruction to be compared with zero,
and the result of that comparison saved in a predefined condition register. In
order to add the record forms of the instructions without too much
copy-and-paste, the relevant functions have been refactored into multiclasses
which define both the record and normal forms.

Also, two TableGen-generated mapping functions have been added which allow
querying the instruction code for the record form given the normal form (and
vice versa).

No functionality change intended.

llvm-svn: 179356
2013-04-12 02:18:09 +00:00
Chad Rosier
765ec71e8d [ms-inline asm] Add support for using the LENGTH, TYPE, and SIZE operators with
variables that use namespace alias qualifiers.  Test case coming on clang side
shortly.
Part of rdar://13499009

llvm-svn: 179343
2013-04-11 23:57:04 +00:00
Chad Rosier
2aa88ce036 [ms-inline asm] Add support for using offsetof operator with variables that use
namespace alias qualifiers.  Test case coming on clang side shortly.
Part of rdar://13499009

llvm-svn: 179339
2013-04-11 23:37:34 +00:00
Chad Rosier
9323db9524 [ms-inline asm] Pass a StringRef reference to ParseIntelVarWithQualifier so we
can build up the identifier string.  No test case as support for looking up
these type of identifiers hasn't been implemented on the clang side.
Part of rdar://13499009

llvm-svn: 179336
2013-04-11 23:24:15 +00:00
Chad Rosier
b3012065df [ms-inline asm] Remove brackets from around a symbol reference in the target
specific logic.  This makes the code much less fragile.  Test case coming on the
clang side in a moment.
rdar://13634327

llvm-svn: 179323
2013-04-11 21:49:30 +00:00
David Majnemer
2a9a6a16c4 Fix undefined behavior in AArch64
A64Imms::isLogicalImmBits and A64Imms::isLogicalImm will attempt to
execute shifts that perform undefined behavior. Instead of attempting
to perform the 64-bit rotation, treat it as a no-op.

llvm-svn: 179317
2013-04-11 20:13:52 +00:00
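
Note on r179317: in C and C++, shifting a 64-bit value by 64 is undefined, which is what a naive rotate does when the rotation amount is a multiple of the width. Python integers do not have that problem, so the sketch below only illustrates the guard; the function name is hypothetical and this is not the A64Imms code.

```python
def rotate_right(value, amount, width=64):
    """Rotate value right by amount bits within a width-bit word."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    if amount == 0:
        # A full-width rotation is a no-op; returning early avoids the
        # shift by `width` that would be undefined in C or C++.
        return value
    return ((value >> amount) | (value << (width - amount))) & mask
```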
Akira Hatanaka
2b0ffc5124 [mips] Custom-lower i64 MULHS and MULHU nodes. Remove the code which selects
multiply instructions in MipsSEDAGToDAGISel.

This patch was supposed to be part of r178403.

llvm-svn: 179314
2013-04-11 19:29:26 +00:00
Akira Hatanaka
10518e983d [mips] Clean up MipsISelDAGToDAG.cpp and MipsISelLowering.cpp.
- Rename function.
- Pass iterator by value.
- Remove header include.

No functionality changes.

llvm-svn: 179312
2013-04-11 19:07:14 +00:00
Michael Liao
877d1576e6 Optimize vector select from all 0s or all 1s
As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
vector select could be simplified to AND/OR or removed if one or both values
being selected is all 0s or all 1s.

llvm-svn: 179267
2013-04-11 05:15:54 +00:00
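
Note on r179267: when the mask in each lane is all 0s or all 1s, a select is a bitwise blend, and the blend collapses further when an operand is itself all 0s or all 1s. The sketch models one 32-bit lane with hypothetical names and is not the DAG combine itself.

```python
LANE_MASK = 0xFFFFFFFF  # one 32-bit lane, all 1s


def select(mask, a, b):
    """General blend: take a where mask bits are set, b elsewhere."""
    return (mask & a) | (~mask & LANE_MASK & b)


def select_simplified(mask, a, b):
    """Same result as select() when mask is all 0s or all 1s."""
    if a == LANE_MASK and b == 0:
        return mask                        # the select disappears
    if a == LANE_MASK:
        return mask | b                    # OR
    if b == 0:
        return mask & a                    # AND
    if a == 0:
        return ~mask & LANE_MASK & b       # AND with the inverted mask
    if b == LANE_MASK:
        return (~mask & LANE_MASK) | a     # OR with the inverted mask
    return select(mask, a, b)
```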
Michael Liao
75c886a312 Add CLAC/STAC instruction encoding/decoding support
As these two instructions in AVX extension are privileged instructions for
special purpose, it's only expected to be used in inlined assembly.

llvm-svn: 179266
2013-04-11 04:52:28 +00:00
Michael Liao
87125582e9 Enhance bool simplification in X86 to handle more cases
This patch is revised based on patch from Victor Umansky
<victor.umansky@intel.com>. More cases are handled in X86's bool
simplification, i.e.
- SETCC_CARRY
- value is truncated to i1 with AND

As a by-product, PR5443 is also fixed.

llvm-svn: 179265
2013-04-11 04:43:09 +00:00
NAKAMURA Takumi
c9309ae42b R600ControlFlowFinalizer.cpp: Fix a warning. [-Wunused-variable]
llvm-svn: 179263
2013-04-11 04:16:27 +00:00
NAKAMURA Takumi
1837d9ec3e Whitespace.
llvm-svn: 179262
2013-04-11 04:16:22 +00:00
Hal Finkel
f28d7e2863 Make PPCInstrInfo::isPredicated always return false
Because of how predication is implemented on PPC (only for branches), I think
that this is the right thing to do.  No functionality change intended.

llvm-svn: 179252
2013-04-11 01:23:34 +00:00
Nico Rieck
8e22855ea6 MC: Support COFF image-relative MCSymbolRefs
Add support for the COFF relocation types IMAGE_REL_I386_DIR32NB and
IMAGE_REL_AMD64_ADDR32NB for 32- and 64-bit respectively. These are
similar to normal 4-byte relocations except that they do not include
the base address of the image.

Image-relative relocations are used for debug information (32-bit) and
SEH unwind tables (64-bit).

A new MCSymbolRef variant called 'VK_COFF_IMGREL32' is introduced to
specify such relocations. For AT&T assembly, this variant can be accessed
using the symbol suffix '@imgrel'.

llvm-svn: 179240
2013-04-10 23:28:17 +00:00
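
Note on r179240: "does not include the base address of the image" means the stored value is the target address minus the image base. A minimal sketch, with a hypothetical function name and made-up addresses in the usage line:

```python
def image_relative(virtual_address, image_base):
    """Return the 32-bit image-relative value for an address."""
    rva = virtual_address - image_base
    if not 0 <= rva < (1 << 32):
        raise ValueError("target is not representable as a 32-bit image-relative value")
    return rva
```

For example, `image_relative(0x140001000, 0x140000000)` is 0x1000, which stays valid wherever the image is loaded.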
Kay Tiong Khoo
5f12d15d44 fixed xsave, xsaveopt, xrstor mnemonics with intel syntax; added test cases
llvm-svn: 179223
2013-04-10 21:52:25 +00:00
Kay Tiong Khoo
ba75929324 fixed to disassemble with tab after mnemonic rather than space
llvm-svn: 179215
2013-04-10 21:17:58 +00:00
Preston Gurd
de5cf7a23b In the X86 back end, getMemoryOperandNo() returns the offset
into the operand array of the start of the memory reference descriptor.

Additional code in EncodeInstruction provides an additional adjustment.

This patch places that additional code in a separate function,
called getOperandBias, so that any caller of getMemoryOperandNo
can also call getOperandBias.

llvm-svn: 179211
2013-04-10 20:11:59 +00:00
Chad Rosier
b0156236cb Tidy up, fix and simplify a few of the SMLocs. Prior to r179109 the Start SMLoc
wasn't always the start of the operand.  If there was a symbol reference, then
Start pointed to that token.  It's very likely there are other places that need
to be updated.

llvm-svn: 179210
2013-04-10 20:07:47 +00:00
Chad Rosier
cc61ca2355 Remove unused variable.
llvm-svn: 179205
2013-04-10 18:46:58 +00:00
Hal Finkel
63d2aee393 PPC: Don't predicate a diamond with two counter decrements
I've not seen this happen in practice, and probably can't until we start
allowing decrement-counter-based conditional branches to be double predicated,
but just in case, don't allow predication of a diamond in which both sides have
ctr-defining branches. Even though the branching behavior of these can be
predicated, the counter-decrementing behavior cannot be.

llvm-svn: 179199
2013-04-10 18:30:16 +00:00
Chad Rosier
411fdc3a74 Reapply r179115, but use parsePrimaryExpression a little more judiciously.
Test cases that regressed due to r179115, plus a few more, were added in
r179182.  Original commit message below:

[ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier.  Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:

 __asm mov eax, [Symbol + ImmDisp]

Part of rdar://13611297

llvm-svn: 179187
2013-04-10 17:35:30 +00:00
Michel Danzer
c1562afdde R600/SI: Add pattern for AMDGPUurecip
21 more little piglits with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186
2013-04-10 17:17:56 +00:00
Reed Kotler
68e5128508 This is for an experimental option -mips-os16. The idea is to compile all
Mips32 code as Mips16 unless it can't be compiled as Mips 16. For now this
would happen as long as floating point instructions are not needed.
Probably it would also make sense to compile as mips32 if atomic operations
are needed too. There may be other cases too.

A module pass prescans the IR and adds the mips16 or nomips16 attribute
to functions depending on the functions needs.

Mips 16 mode can result in a 40% code compression by utilizing 16 bit
encoding of many instructions.

The hope is for this to replace the traditional gcc way of dealing with
Mips16 code using floating point which involves essentially using soft float
but with a library implemented using mips32 floating point. This gcc 
method also requires creating stubs so that Mips32 code can interact with
these Mips 16 functions that have floating point needs. My conjecture is
that in reality this traditional gcc method would never win over this
new method.

I will be implementing the traditional gcc method also. Some of it is already
done but I needed to do the stubs to finish the work and those required
this mips16/32 mixed mode capability.

I have more ideas to make this new method much better and I think the old
method will just live in llvm for anyone that needs the backward compatibility
but I don't know for what reason that would be needed.

llvm-svn: 179185
2013-04-10 16:58:04 +00:00
Vincent Lejeune
daa1e69206 R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr
llvm-svn: 179174
2013-04-10 13:29:20 +00:00
Tim Northover
b82f729eb5 ARM: Make "SMC" instructions conditional on new TrustZone architecture feature.
These instructions aren't universally available, but depend on a specific
extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new
feature is appropriate.

This also enables the feature by default on A-class cores which usually have
these extensions, to avoid breaking existing code and act as a sensible
default.

llvm-svn: 179171
2013-04-10 12:08:35 +00:00
Christian Konig
f40f671bab R600/SI: dynamically figure out the reg class of MIMG
Depending on the number of bits set in the writemask.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166
2013-04-10 08:39:16 +00:00
Christian Konig
76cd1a76c2 R600/SI: adjust writemask to only the used components
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
2013-04-10 08:39:08 +00:00
Christian Konig
ffddac18a4 R600/SI: remove image sample writemask
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164
2013-04-10 08:39:01 +00:00
Hal Finkel
e0c835f71c Cleanup PPCInstrInfo::DefinesPredicate
Implement suggestions made by Bill Schmidt in post-commit review. Thanks!

llvm-svn: 179162
2013-04-10 07:17:47 +00:00
Hal Finkel
f6c2064a4a PPC: Prep for if conversion of bctr[l]
This adds in-principle support for if-converting the bctr[l] instructions.
These instructions are used for indirect branching. It seems, however, that the
current if converter will never actually predicate these. To do so, it would
need the ability to hoist a few setup insts. out of the conditionally-executed
block. For example, code like this:
  void foo(int a, int (*bar)()) { if (a != 0) bar(); }
becomes:
        ...
        beq 0, .LBB0_2
        std 2, 40(1)
        mr 12, 4
        ld 3, 0(4)
        ld 11, 16(4)
        ld 2, 8(4)
        mtctr 3
        bctrl
        ld 2, 40(1)
.LBB0_2:
        ...
and it would be safe to do all of this unconditionally with a predicated
beqctrl instruction.

llvm-svn: 179156
2013-04-10 06:42:34 +00:00
Evan Cheng
9f82233851 __sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in
xmm0 / xmm1.

rdar://13599493

llvm-svn: 179141
2013-04-10 01:26:07 +00:00
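
Note on r179141: the layout described above (sinf in bits 0:31 and cosf in bits 32:63 of one register) can be modelled on a 64-bit integer. The function names are hypothetical and the sketch only illustrates the bit positions.

```python
import struct


def pack_sincos(sin_val, cos_val):
    """Place sin_val in bits 0:31 and cos_val in bits 32:63, as float32."""
    lo, = struct.unpack("<I", struct.pack("<f", sin_val))
    hi, = struct.unpack("<I", struct.pack("<f", cos_val))
    return (hi << 32) | lo


def unpack_sincos(bits):
    """Recover the (sin, cos) pair from the packed 64-bit value."""
    sin_val, = struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))
    cos_val, = struct.unpack("<f", struct.pack("<I", bits >> 32))
    return sin_val, cos_val
```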
Jack Carter
03f8f98410 Mips specific inline asm operand modifier 'D'
Modifier 'D' is to use the second word of a double integer.

We had previously implemented the pure register variant of 
the modifier and this patch implements the memory reference.



#include <stdio.h>

int b[8] = {0,1,2,3,4,5,6,7};
int main(void)
{
    int i;
    
    // The first word. Notice, no 'D'
    {asm (
    "lw    %0,%1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}
    
    printf("%d\n",i);

    // The second word
    {asm (
    "lw    %0,%D1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}
    
    printf("%d\n",i);
}

llvm-svn: 179135
2013-04-09 23:19:50 +00:00
Hal Finkel
8b05494b58 Allow PPC B and BLR to be if-converted into some predicated forms
This enables us to form predicated branches (which are the same conditional
branches we had before) and also a larger set of predicated returns (including
instructions like bdnzlr which is a conditional return and loop-counter
decrement all in one).

At the moment, if conversion does not capture all possible opportunities. A
simple example is provided in early-ret2.ll, where if conversion forms one
predicated return, and then the PPCEarlyReturn pass picks up the other one. So,
at least for now, we'll keep both mechanisms.

llvm-svn: 179134
2013-04-09 22:58:37 +00:00
Chad Rosier
aa67701688 Cleanup. No functional change intended.
llvm-svn: 179129
2013-04-09 20:58:48 +00:00
Chad Rosier
85e2894bd6 Cleanup. No functional change intended.
llvm-svn: 179125
2013-04-09 20:44:09 +00:00
Chad Rosier
e040ffba05 Revert r179115 as it looks to have killed the ASan tests.
llvm-svn: 179120
2013-04-09 19:59:12 +00:00
Reed Kotler
9b753510a5 This patch enables llvm to switch between compiling for mips32/mips64
and mips16 on a per function basis.

Because this patch is somewhat involved I have provided an overview of the
key pieces of it.

The patch is written so as to not change the behavior of the non mixed
mode. We have tested this a lot but it is something new to switch subtargets
so we don't want any chance of regression in the mainline compiler until
we have more confidence in this.

Mips32/64 are very different from Mips16 as is the case of ARM vs Thumb1.
For that reason there are derived versions of the register info, frame info, 
instruction info and instruction selection classes.

Now we register three separate passes for instruction selection.
One which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp) and then
one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and
MipsSEISelDAGToDAG.cpp).

When the ModuleISel pass runs, it determines if there is a need to switch
subtargets and if so, the owning pointers in MipsTargetMachine are
appropriately changed.

When 16Isel or SEIsel is run, they will return immediately without doing
any work if the current subtarget mode does not apply to them.

In addition, MipsAsmPrinter needs to be reset on a function basis.

The pass BasicTargetTransformInfo is substituted with a null pass since the
pass is immutable and really needs to be a function pass for it to be
used with changing subtargets. This will be fixed in a follow on patch.

llvm-svn: 179118
2013-04-09 19:46:01 +00:00
Chad Rosier
4ef6c35911 [ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier.  Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:

 __asm mov eax, [Symbol + ImmDisp]

The existing test cases exercise this patch.
rdar://13611297

llvm-svn: 179115
2013-04-09 19:34:59 +00:00
Hal Finkel
2be8935801 Cleanup PPCEarlyReturn
Some general cleanup and only scan the end of a BB for branches (once we're
done with the terminators and debug values, then there should not be any other
branches). These address post-commit review suggestions by Bill Schmidt.

No functionality change intended.

llvm-svn: 179112
2013-04-09 18:25:18 +00:00
Chad Rosier
5ec822982c [ms-inline asm] Maintain a StringRef to reference a symbol in a parsed operand,
rather than deriving the StringRef from the Start and End SMLocs.

Using the Start and End SMLocs works fine for operands such as [Symbol], but
not for operands such as [Symbol + ImmDisp].  All existing test cases that
reference a variable exercise this patch.
rdar://13602265

llvm-svn: 179109
2013-04-09 17:53:49 +00:00
Hal Finkel
c72f2476a9 Use virtual base registers on PPC
On PowerPC, non-vector loads and stores have r+i forms; however, in functions
with large stack frames these were not being used to access slots far from the
stack pointer because such slots were out of range for the signed 16-bit
immediate offset field. This increases register pressure because we need a
separate register for each offset (when the r+r form is used). By enabling
virtual base registers, we can deal with large stack frames without unduly
increasing register pressure.
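
As a rough illustration of the range problem described above (a sketch only,
not the backend's code): a signed 16-bit displacement covers -32768..32767, so
a slot farther than that from the stack pointer cannot use the r+i form
directly, while a base register placed near a cluster of slots brings them back
into range. The helper names and the frame offsets below are hypothetical.

```python
# Sketch only: models the signed 16-bit displacement limit of PPC r+i
# loads/stores. Names and offsets are illustrative, not LLVM code.

SIMM16_MIN, SIMM16_MAX = -(1 << 15), (1 << 15) - 1  # -32768 .. 32767


def fits_simm16(offset):
    """True if offset is encodable as a signed 16-bit immediate."""
    return SIMM16_MIN <= offset <= SIMM16_MAX


def rebase(offsets, base):
    """Offsets of the same slots relative to a register holding sp+base."""
    return [off - base for off in offsets]


# Hypothetical slots in a large frame, as byte offsets from the stack pointer.
slots = [100000, 100008, 100016]

# Directly off the stack pointer, none of these fit the r+i form.
direct = [fits_simm16(off) for off in slots]

# With a single base register set to sp+100000, all three fit.
rebased = [fits_simm16(off) for off in rebase(slots, 100000)]
```

One shared base register serves all three slots here, which is the register
pressure benefit the message refers to.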

llvm-svn: 179105
2013-04-09 17:27:09 +00:00
Jakob Stoklund Olesen
b5640bf98c Extract a function.
llvm-svn: 179086
2013-04-09 05:11:52 +00:00
Jakob Stoklund Olesen
6a275455ac Compute correct frame sizes for SPARC v9 64-bit frames.
The save area is twice as big and there is no struct return slot. The
stack pointer is always 16-byte aligned (after adding the bias).

Also eliminate the stack adjustment instructions around calls when the
function has a reserved stack frame.

llvm-svn: 179083
2013-04-09 04:37:47 +00:00
Arnold Schwaighofer
3218da2403 X86 cost model: Model cost for uitofp and sitofp on SSE2
The costs are overfitted so that I can still use the legalization factor.

For example, when compiled with SSE2 the following kernel has about half the
throughput vectorized as it does unvectorized. Before this patch we would
vectorize it.

unsigned short A[1024];
double B[1024];
void f() {
  int i;
  for (i = 0; i < 1024; ++i) {
    B[i] = (double) A[i];
  }
}

radar://13599001

llvm-svn: 179033
2013-04-08 18:05:48 +00:00
Chad Rosier
7583b0a3c3 [ms-inline asm] Add support for ImmDisp [ Symbol ] memory operands.
rdar://13521249

llvm-svn: 179030
2013-04-08 17:43:47 +00:00
Hal Finkel
0daaa8e2de Generate PPC early conditional returns
PowerPC has a conditional branch to the link register (return) instruction: BCLR.
This should be used any time when we'd otherwise have a conditional branch to a
return. This adds a small pass, PPCEarlyReturn, which runs just prior to the
branch selection pass (and, importantly, after block placement) to generate
these conditional returns when possible. It will also eliminate unconditional
branches to returns (these happen rarely; most of the time these have already
been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice
optimization for small functions that do not maintain a stack frame.

llvm-svn: 179026
2013-04-08 16:24:03 +00:00
Vincent Lejeune
cbdacdc057 R600: Control Flow support for pre EG gen
llvm-svn: 179020
2013-04-08 13:05:49 +00:00
Tim Northover
8eb5637d73 AArch64: remove barriers from AArch64 atomic operations.
I've managed to convince myself that AArch64's acquire/release
instructions are sufficient to guarantee C++11's required semantics,
even in the sequentially-consistent case.

llvm-svn: 179005
2013-04-08 08:40:41 +00:00
Benjamin Kramer
e546a3587f ARM: Remove unused variable.
llvm-svn: 179001
2013-04-08 08:07:35 +00:00
Hal Finkel
9af1fa0d50 Cleanup and improve PPC fsel generation
First, we should not cheat: fsel-based lowering of select_cc is a
finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes
this clear, as does a note in our own README).

This also adds fsel-based lowering of EQ and NE condition codes. As it turned
out, fsel generation was covered by a grand total of zero regression test
cases. I've added some test cases to cover the existing behavior (which is now
finite-math only), as well as the new EQ cases.
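
A small model of why this is finite-math only, assuming the usual fsel
semantics (the second operand is selected when the first is >= 0.0, otherwise
the third, and a NaN first operand selects the third). The function names are
hypothetical, and the EQ form shown is one possible nesting, not necessarily
the one this patch emits.

```python
# Sketch only: a Python model of fsel-based select lowering.


def fsel(a, b, c):
    """Model of PPC fsel: b if a >= 0.0, else c (a NaN selects c)."""
    return b if a >= 0.0 else c


def select_ge(lhs, rhs, x, y):
    """Reference semantics of select_cc with a GE condition."""
    return x if lhs >= rhs else y


def select_ge_via_fsel(lhs, rhs, x, y):
    """fsel-based form: test the sign of the difference instead."""
    return fsel(lhs - rhs, x, y)


def select_eq_zero_via_fsel(a, x, y):
    """a == 0.0 ? x : y, built from two nested fsel operations."""
    return fsel(a, fsel(-a, x, y), y)


inf = float("inf")

# Finite inputs agree with the reference.
finite_ok = select_ge_via_fsel(3.0, 2.0, "x", "y") == select_ge(3.0, 2.0, "x", "y")

# inf >= inf is true, but inf - inf is NaN, so the fsel form picks "y".
reference_inf = select_ge(inf, inf, "x", "y")
lowered_inf = select_ge_via_fsel(inf, inf, "x", "y")
```

The two results differ only for the infinite input, which is the case the
finite-math-only restriction rules out.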

llvm-svn: 179000
2013-04-07 22:11:09 +00:00
Jakob Stoklund Olesen
658f7754a3 Implement LowerCall_64 for the SPARC v9 64-bit ABI.
There is still no support for byval arguments (which I don't think are
needed) and varargs.

llvm-svn: 178993
2013-04-07 19:10:57 +00:00
Hal Finkel
cb2d4ef2d5 PPC rotate instructions don't have unmodeled side effects
llvm-svn: 178982
2013-04-07 15:06:53 +00:00
Hal Finkel
7fa1879cf1 Most PPC M[TF]CR instructions do not have side effects
llvm-svn: 178978
2013-04-07 14:33:13 +00:00
Hal Finkel
5cb9f96c21 PPC pre-increment load instructions do not have side effects
A few were missed in r178972.

llvm-svn: 178973
2013-04-07 06:30:47 +00:00
Hal Finkel
d0234b53ce PPC pre-increment load instructions do not have side effects
llvm-svn: 178972
2013-04-07 05:46:58 +00:00
Hal Finkel
90572b9c18 PPC MCRF instruction does not have side effects
llvm-svn: 178971
2013-04-07 05:16:57 +00:00
Hal Finkel
e21295d038 PPC FMR instruction does not have side effects
llvm-svn: 178970
2013-04-07 04:56:16 +00:00
Jakob Stoklund Olesen
594e073ba0 Implement LowerReturn_64 for SPARC v9.
Integer return values are sign or zero extended by the callee, and
structs up to 32 bytes in size can be returned in registers.

The CC_Sparc64 CallingConv definition is shared between
LowerFormalArguments_64 and LowerReturn_64. Function arguments and
return values are passed in the same registers.

The inreg flag is also used for return values. This is required to handle
C functions returning structs containing floats and ints:

  struct ifp {
    int i;
    float f;
  };

  struct ifp f(void);

LLVM IR:

  define inreg { i32, float } @f() {
     ...
     ret { i32, float } %retval
  }

The ABI requires that %retval.i is returned in the high bits of %i0
while %retval.f goes in %f1.

Without the inreg return value attribute, %retval.i would go in %i0 and
%retval.f would go in %f3 which is a more efficient way of returning
multiple values, but it is not ABI compliant for returning C structs.

llvm-svn: 178966
2013-04-06 23:57:33 +00:00
Jakob Stoklund Olesen
2e3d792bf5 SPARC v9 stack pointer bias.
64-bit SPARC v9 processes use biased stack and frame pointers, so the
current function's stack frame is located at %sp+BIAS .. %fp+BIAS where
BIAS = 2047.

This makes more local variables directly accessible via [%fp+simm13]
addressing.
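
To make the arithmetic concrete (a sketch, assuming the 13-bit signed
displacement range of -4096..4095): the true top of the frame is %fp+BIAS, so a
local k bytes below it is addressed as [%fp + (BIAS - k)]. The function names
are hypothetical.

```python
# Sketch only: how far below the frame top [%fp+simm13] can reach,
# with and without the SPARC v9 stack bias.

BIAS = 2047
SIMM13_MIN, SIMM13_MAX = -(1 << 12), (1 << 12) - 1  # -4096 .. 4095


def fp_displacement(bytes_below_top, bias=BIAS):
    """Displacement d such that [%fp + d] names the byte lying
    bytes_below_top bytes below the true frame top (%fp + bias)."""
    return bias - bytes_below_top


def reachable(bytes_below_top, bias=BIAS):
    """True if that byte is addressable with a simm13 displacement."""
    d = fp_displacement(bytes_below_top, bias)
    return SIMM13_MIN <= d <= SIMM13_MAX


# Deepest directly addressable byte below the frame top, in each scheme.
deepest_unbiased = max(k for k in range(8192) if reachable(k, bias=0))
deepest_biased = max(k for k in range(8192) if reachable(k))
```

Under these assumptions the biased frame pointer reaches 6143 bytes below the
frame top instead of 4096.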

llvm-svn: 178965
2013-04-06 21:38:57 +00:00
Hal Finkel
6c65d3a736 Implement PPCInstrInfo::FoldImmediate
There are certain PPC instructions into which we can fold a zero immediate
operand. We can detect such cases by looking at the register class required
by the using operand (so long as it is not otherwise constrained).

llvm-svn: 178961
2013-04-06 19:30:30 +00:00
Hal Finkel
65566059cd PPC ISEL is a select and never has side effects
llvm-svn: 178960
2013-04-06 19:30:28 +00:00
Jakob Stoklund Olesen
2076ffbc15 Complete formal arguments for the SPARC v9 64-bit ABI.
All arguments are formally assigned to stack positions and then promoted
to floating point and integer registers. Since there are more floating
point registers than integer registers, this can cause situations where
floating point arguments are assigned to registers after integer
arguments that were assigned to the stack.

Use the inreg flag to indicate 32-bit fragments of structs containing
both float and int members.

The three-way shadowing between stack, integer, and floating point
registers requires custom argument lowering. The good news is that
return values are passed in the exact same way, and we can share the
code.

Still missing:

 - Update LowerReturn to handle structs returned in registers.
 - LowerCall.
 - Variadic functions.

llvm-svn: 178958
2013-04-06 18:32:12 +00:00
Tom Stellard
8ad4f7c25b R600/SI: Add support for buffer stores v2
v2:
  - Use the ADDR64 bit

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178931
2013-04-05 23:31:51 +00:00
Tom Stellard
379d612a66 R600/SI: Use same names for corresponding MUBUF operands and encoding fields
The code emitter knows how to encode operands whose name matches one of
the encoding fields.  If there is no match, the code emitter relies on
the order of the operand and field definitions to determine how operands
should be encoded.  Matching by order makes it easy to accidentally break
the instruction encodings, so we prefer to match by name.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178930
2013-04-05 23:31:44 +00:00
Tom Stellard
7dd3fda85d R600: Add RV670 processor
This is an R600 GPU with double support.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178929
2013-04-05 23:31:40 +00:00
Tom Stellard
917d7412f1 R600/SI: Add processor types for each SI variant
Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178928
2013-04-05 23:31:35 +00:00
Tom Stellard
17fba38a4b R600/SI: Avoid generating S_MOVs with 64-bit immediates v2
SITargetLowering::analyzeImmediate() was converting the 64-bit values
to 32-bit and then checking if they were an inline immediate.  Some
of these conversions caused this check to succeed and produced
S_MOV instructions with 64-bit immediates, which are illegal.
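
The failure mode can be sketched as follows. The inline range used here
(-16..64) is an assumption for illustration and covers integers only, and
buggy_check/fixed_check are hypothetical names, not the functions in
SITargetLowering.

```python
# Sketch only: truncating a 64-bit value to 32 bits before an
# inline-immediate range check lets some wide values slip through.

INLINE_MIN, INLINE_MAX = -16, 64  # assumed small-integer inline range


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def is_inline_int(value):
    return INLINE_MIN <= value <= INLINE_MAX


def buggy_check(imm64):
    """Checks only the low 32 bits, discarding the high half."""
    return is_inline_int(to_signed(imm64, 32))


def fixed_check(imm64):
    """Checks the full 64-bit value."""
    return is_inline_int(to_signed(imm64, 64))


# High half is non-zero, low half is 1: not an inline constant,
# but the truncated check accepts it.
wide = 0x0000000100000001
```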

v2:
  - Clean up logic

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 178927
2013-04-05 23:31:20 +00:00
Hal Finkel
b556d850c9 Enable early if conversion on PPC
On cores for which we know the misprediction penalty, and we have
the isel instruction, we can profitably perform early if conversion.
This enables us to replace some small branch sequences with selects
and avoid the potential stalls from mispredicting the branches.

Enabling this feature required implementing canInsertSelect and
insertSelect in PPCInstrInfo; isel code in PPCISelLowering was
refactored to use these functions as well.

llvm-svn: 178926
2013-04-05 23:29:01 +00:00
Hal Finkel
a1dd7b3f45 Correct the PPC A2 misprediction penalty
The manual states that there is a minimum of 13 cycles from when the
mispredicted branch is issued to when the correct branch target is
issued.

llvm-svn: 178925
2013-04-05 23:28:58 +00:00
Bill Wendling
f2bb7aa5f8 Use the target options specified on a function to reset the back-end.
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917
2013-04-05 21:52:40 +00:00
Renato Golin
9d05117f2b Reverting 178851 as it broke buildbots
llvm-svn: 178883
2013-04-05 16:39:53 +00:00
Chad Rosier
59dbb08c9b [ms-inline asm] Add support for numeric displacement expressions in bracketed
memory operands.

Essentially, this layers an infix calculator on top of the parsing state
machine.  The scale on the index register is still expected to be an immediate

 __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions.  For example,

 __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is
zero so as to not change the displacement.  Register operands should never
be an operand for a multiply or divide operation; the scale*indexreg
expression is always replaced with a zero on the operand stack to prevent
such a case.
rdar://13521380
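
The idea can be sketched outside the parser as a tiny infix evaluator (an
illustration only, not the X86AsmParser state machine): a scale*indexreg pair
is replaced by a zero operand, registers count as zero under plus and minus,
and whatever is left is the numeric displacement. All names below are
hypothetical, and symbols and unary minus are deliberately not handled.

```python
# Sketch only: compute the numeric displacement of a bracketed operand.
import re

REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Split into identifiers, integers, operators and parentheses."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


def drop_scaled_index(tokens):
    """Replace `reg*imm` or `imm*reg` with a single 0 operand."""
    out, i = [], 0
    while i < len(tokens):
        w = tokens[i:i + 3]
        if len(w) == 3 and w[1] == "*" and (
            (w[0] in REGISTERS and w[2].isdigit())
            or (w[0].isdigit() and w[2] in REGISTERS)
        ):
            out.append("0")
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return out


def to_postfix(tokens):
    """Shunting-yard conversion for left-associative binary operators."""
    output, ops = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            while ops and ops[-1] != "(" and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                output.append(ops.pop())
            ops.pop()  # discard the matching "("
        else:
            output.append(tok)
    while ops:
        output.append(ops.pop())
    return output


def displacement(expr):
    """Evaluate expr on an operand stack; registers contribute zero."""
    stack = []
    for tok in to_postfix(drop_scaled_index(tokenize(expr))):
        if tok in PRECEDENCE:
            rhs, lhs = stack.pop(), stack.pop()
            if tok == "+":
                stack.append(lhs + rhs)
            elif tok == "-":
                stack.append(lhs - rhs)
            elif tok == "*":
                stack.append(lhs * rhs)
            else:
                stack.append(lhs // rhs)
        else:
            stack.append(0 if tok in REGISTERS else int(tok))
    return stack.pop()
```

For instance, displacement("eax + ebx*4 + 8 + 2*3") evaluates the numeric part
only, treating eax and the ebx*4 term as zero.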

llvm-svn: 178881
2013-04-05 16:28:55 +00:00
Stepan Dyatkovskiy
257a194cf1 Buildbot fix for r178851: the mistake was incorrect use of TargetRegisterInfo::getRegClass.
llvm-svn: 178854
2013-04-05 07:34:08 +00:00
Stepan Dyatkovskiy
98f7dac944 Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position".
Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass.
It is a deeper version of the fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4
But it also tracks conflicts between different register classes (e.g. D2 and S5).
For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion: 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html

llvm-svn: 178851
2013-04-05 05:52:14 +00:00
Hal Finkel
9e772fa482 Add a SchedMachineModel for the PPC G5
llvm-svn: 178850
2013-04-05 05:49:18 +00:00
Hal Finkel
ac71a202bd Add a SchedMachineModel for the PPC A2
llvm-svn: 178848
2013-04-05 05:34:08 +00:00
Arnold Schwaighofer
15f0999c37 ARM scheduler model: Add scheduler info to more instructions and resource
descriptions for compares

llvm-svn: 178844
2013-04-05 05:01:06 +00:00
Arnold Schwaighofer
e0459c3175 ARM scheduler model: Swift has varying latencies, uops for simple ALU ops
llvm-svn: 178842
2013-04-05 04:42:00 +00:00
Arnold Schwaighofer
52871434dd X86 cost model: Differentiate cost for vector shifts of constants
SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account marking all shifts as expensive.
This would prevent vectorization from happening where it is actually beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808
2013-04-04 23:26:24 +00:00
Arnold Schwaighofer
861251004b CostModel: Add parameter to instruction cost to further classify operand values
On certain architectures we can support efficient vectorized versions of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.

We can efficiently support

for (i = 0 ; i < ; i += 4)
  w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

for (i = 0; i < ; i += 4)
  w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.

Targets can then choose to return a different cost for instructions with such
operand values.

A follow-up commit will test this feature on x86.

radar://13576547

llvm-svn: 178807
2013-04-04 23:26:21 +00:00
Hal Finkel
02fd9b0859 Rename the current PPC BCL definition to BCLalways
BCL is normally a conditional branch-and-link instruction, but has
an unconditional form (which is used in the SjLj code, for example).
To make clear that this BCL instruction definition is specifically
the special unconditional form (which does not meaningfully take
a condition-register input), rename it to BCLalways.

No functionality change intended.

llvm-svn: 178803
2013-04-04 22:55:54 +00:00
Hal Finkel
1dc78e666b PPC: Improve code generation for mixed-precision reciprocal sqrt
The DAGCombine logic that recognized a/sqrt(b) and transformed it into
a multiplication by the reciprocal sqrt did not handle cases where the
sqrt and the division were separated by an fpext or fptrunc.

llvm-svn: 178801
2013-04-04 22:44:12 +00:00
Jyotsna Verma
c3ebace56c Hexagon: Expand br_cc.
It fixes the following tests for Hexagon:

CodeGen/Generic/2003-07-29-BadConstSbyte.ll
CodeGen/Generic/2005-10-21-longlonggtu.ll
CodeGen/Generic/2009-04-28-i128-cmp-crash.ll
CodeGen/Generic/MachineBranchProb.ll
CodeGen/Generic/builtin-expect.ll
CodeGen/Generic/pr12507.ll

llvm-svn: 178794
2013-04-04 21:18:26 +00:00
Richard Osborne
25a2bd3084 [XCore] Add bru instruction.
llvm-svn: 178783
2013-04-04 20:05:35 +00:00
Richard Osborne
2eabe25672 [XCore] The RRegs register class is a superset of GRRegs.
At the time when the XCore backend was added there were some issues with
overlapping register classes but these all seem to be fixed now.
Describing the register classes correctly allows us to get rid of a
codegen only instruction (LDAWSP_lru6_RRegs) and it means we can
disassemble ru6 instructions that use registers above r11.

llvm-svn: 178782
2013-04-04 19:57:46 +00:00
Jakob Stoklund Olesen
a53fa8d450 Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

llvm-svn: 178773
2013-04-04 18:25:36 +00:00
Vincent Lejeune
3a22d07044 R600: Use a mask for offsets when encoding instructions
llvm-svn: 178763
2013-04-04 14:00:09 +00:00
Vincent Lejeune
d5f0b3821e R600: Fix wrong address when substituting ENDIF
llvm-svn: 178762
2013-04-04 14:00:03 +00:00
Vincent Lejeune
a680946842 R600: Take export into account when computing cf address
llvm-svn: 178761
2013-04-04 13:59:59 +00:00
Jakob Stoklund Olesen
1969a96fcd Add SPARC v9 support for select on 64-bit compares.
This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

llvm-svn: 178737
2013-04-04 03:08:00 +00:00
Arnold Schwaighofer
329430aeac X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Vincent Lejeune
6a4ef74f44 R600: Fix last ALU of a clause being emitted in a separate clause
llvm-svn: 178675
2013-04-03 18:24:47 +00:00
Hal Finkel
3e38cb94ec Cleanup PPC reciprocal-estimate functionality
Incorporating review feedback from Bill Schmidt on r178617. No functionality
change intended.

llvm-svn: 178672
2013-04-03 17:44:56 +00:00
Vincent Lejeune
9bc67cfa08 R600: Factorize maximum alu per clause in a single location
llvm-svn: 178667
2013-04-03 16:49:34 +00:00
Vincent Lejeune
bab4692335 R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer
llvm-svn: 178665
2013-04-03 16:24:09 +00:00
Vincent Lejeune
6b257b347d R600: Consider KILLGT as an ALU instruction
Mesa does not override llvm behavior wrt KILLGT anymore so llvm
has to handle KILLGT on its own.

llvm-svn: 178664
2013-04-03 16:24:04 +00:00
Hal Finkel
994d3213dc PPC: Enable FRES and FRSQRTE on the default PPC64 description
I discussed this with Bill Schmidt on IRC, and it was decided that this is a
safe and reasonable default.

llvm-svn: 178659
2013-04-03 14:40:18 +00:00
Hal Finkel
7b7e07e3ed PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern
llvm-svn: 178658
2013-04-03 14:40:16 +00:00
Hal Finkel
bf904721de Remove some obsolete PowerPC/README entries
llvm-svn: 178657
2013-04-03 14:25:55 +00:00
Ulrich Weigand
00a652878d More direct types in PowerPC AltiVec intrinsics.
This patch follows up on work done by Bill Schmidt in r178277,
and replaces most of the remaining uses of VRRC in ISEL DAG patterns.

The resulting .inc files are identical except for comments, so
no change in code generation is expected.

llvm-svn: 178656
2013-04-03 14:08:13 +00:00
Bill Schmidt
990515e4c4 Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.

llvm-svn: 178639
2013-04-03 13:05:44 +00:00
Tim Northover
2550df2b22 AArch64: implement ETMv4 trace system registers.
llvm-svn: 178637
2013-04-03 12:31:29 +00:00
Timur Iskhodzhanov
ecd533f0ec Fix SRet for thiscall in i686-pc-win32
llvm-svn: 178634
2013-04-03 11:27:54 +00:00
Tim Northover
acffe8e7ca AArch64: switch patterns to be type-based rather than RegClass-based
It's a bit of churn in the blame log, but I think there are real benefits to
the newer system so I'm making the change in one go.

llvm-svn: 178633
2013-04-03 11:19:16 +00:00
Jakob Stoklund Olesen
3b7eaf9bb6 Add 64-bit compare + branch for SPARC v9.
The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

llvm-svn: 178621
2013-04-03 04:41:44 +00:00
Hal Finkel
f9aac2db2e Remove some unsupported-feature comments from PPC.td
These refer to the reciprocal estimate support recently committed.

llvm-svn: 178618
2013-04-03 04:03:58 +00:00
Hal Finkel
0208f7c744 Use PPC reciprocal estimates with Newton iteration in fast-math mode
When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.
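
For reference, the refinement steps are the standard Newton iterations:
x' = x(2 - bx) for 1/b and x' = x(1.5 - 0.5bx^2) for 1/sqrt(b). The sketch
below uses hand-picked crude starting values as stand-ins for the hardware
estimates; the function names and the number of steps are assumptions, not
what the backend emits.

```python
# Sketch only: Newton refinement of reciprocal and reciprocal-sqrt estimates.


def refine_reciprocal(b, estimate, steps):
    """Refine an estimate of 1/b; the error roughly squares each step."""
    x = estimate
    for _ in range(steps):
        x = x * (2.0 - b * x)
    return x


def refine_rsqrt(b, estimate, steps):
    """Refine an estimate of 1/sqrt(b)."""
    x = estimate
    for _ in range(steps):
        x = x * (1.5 - 0.5 * b * x * x)
    return x


b = 3.0
crude_recip = 0.3   # stand-in for a hardware estimate of 1/3
crude_rsqrt = 0.55  # stand-in for a hardware estimate of 1/sqrt(3)

recip = refine_reciprocal(b, crude_recip, 3)
rsqrt = refine_rsqrt(b, crude_rsqrt, 3)

# Division and sqrt then become multiplications by the refined values.
quotient = 2.0 * recip  # approximates 2.0 / 3.0
root = b * rsqrt        # approximates sqrt(3.0)
```

Each step uses only multiplies and subtracts, which is why the sequence can be
pipelined with other work.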

I've also added a couple of fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

llvm-svn: 178617
2013-04-03 04:01:11 +00:00
Eric Christopher
81bacb7670 Formatting.
llvm-svn: 178589
2013-04-02 23:06:40 +00:00
Akira Hatanaka
f08d3a5a83 [mips] Small update to the implementation of eh.return for Mips.
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where the handler to which eh.return jumps
points to the start of the function.

Patch by Sasa Stankovic.

llvm-svn: 178588
2013-04-02 23:02:07 +00:00
Akira Hatanaka
7d5dae9eab [mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp.
This patch fixes the following two tests which have been failing on
llvm-mips-linux builder since r178403:

LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll
LLVM :: Analysis/Profiling/load-branch-weights-loops.ll

llvm-svn: 178584
2013-04-02 22:53:58 +00:00
Chad Rosier
1fe97eda48 [ms-inline asm] Add support for parsing variables with namespace alias
qualifiers.

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566
2013-04-02 20:02:33 +00:00
Bill Schmidt
c98ed219d3 Fix PR15630: Replace faulty stdcx. with stwcx.
When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.

llvm-svn: 178559
2013-04-02 18:37:08 +00:00
Chad Rosier
908153170e [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
llvm-svn: 178549
2013-04-02 16:31:41 +00:00
Justin Holewinski
1a754ca503 [NVPTX] Fix a few style issues in NVVMReflect
llvm-svn: 178536
2013-04-02 12:37:11 +00:00
Jakob Stoklund Olesen
8a184a7fe4 Add 64-bit load and store instructions.
There are only a few new instructions; the rest is handled with patterns.

llvm-svn: 178528
2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen
22fe26207f Basic 64-bit ALU operations.
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

llvm-svn: 178527
2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen
d57f9ab92f Materialize 64-bit immediates.
The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

llvm-svn: 178526
2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen
5ef2195726 Add 64-bit shift instructions.
SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

llvm-svn: 178525
2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen
7fcde17aa6 Add predicates for distinguishing 32-bit and 64-bit modes.
The 'sparc' architecture produces 32-bit code while 'sparcv9' produces
64-bit code.

It is also possible to run 32-bit code using SPARC v9 instructions with:

  llc -march=sparc -mattr=+v9

llvm-svn: 178524
2013-04-02 04:09:06 +00:00
Jakob Stoklund Olesen
9fbfce2d11 Add support for 64-bit calling convention.
This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

llvm-svn: 178523
2013-04-02 04:09:02 +00:00
Jakob Stoklund Olesen
068f573f65 Add an I64Regs register class for 64-bit registers.
We are going to use the same registers for 32-bit and 64-bit values, but
in two different register classes. The I64Regs register class has a
larger spill size and alignment.

The addition of an i64 register class confuses TableGen's type
inference, so it is necessary to clarify the type of some immediates and
the G0 register.

In 64-bit mode, pointers are i64 and should use the I64Regs register
class. Implement getPointerRegClass() to dynamically provide the pointer
register class depending on the subtarget. Use ptr_rc and iPTR for
memory operands.

Finally, add the i64 type to the IntRegs register class. This register
class is not used to hold i64 values, I64Regs is for that. The type is
required to appease TableGen's type checking in output patterns like this:

  def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>;

SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and
TableGen doesn't know to check the type of register sub-classes.

llvm-svn: 178522
2013-04-02 04:08:54 +00:00
Hal Finkel
1adfd9c87a Fix typo in PPCISelLowering
Thanks to Bill Schmidt for finding this in review of r178480.

llvm-svn: 178521
2013-04-02 03:29:51 +00:00
Andrew Trick
b6ac50177f The divide unit is not pipelined, but it is still buffered.
Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
  let Latency = 25;
  let ResourceCycles = [1, 10];
}

We don't currently distinguish between an unpipelined operation and one
that is split into multiple micro-ops requiring the same unit. Except
that the latter may have NumMicroOps > 1 if they also consume
issue/dispatch resources.

llvm-svn: 178519
2013-04-02 01:58:47 +00:00
NAKAMURA Takumi
9ce5fbdaab Target/R600: Fix CMake build to add missing files.
llvm-svn: 178508
2013-04-01 22:05:58 +00:00