1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

26680 Commits

Author SHA1 Message Date
Zoran Jovanovic
b55909330e Support for microMIPS FPU instructions 1.
llvm-svn: 197815
2013-12-20 15:44:08 +00:00
Richard Sandiford
8daaabe4c3 [SystemZ] Optimize comparisons with truncated extended loads
If the extension of a loaded value is compared against zero and used in
other arithmetic, InstCombine will change the comparison to use the
unextended load.  It's also possible that the comparison could be against
the unextended load from the outset.

In DAG form this becomes a truncation of an extending load.  We want to
strip the truncation if possible so that we can use load-and-test instructions.

llvm-svn: 197804
2013-12-20 11:56:02 +00:00
Richard Sandiford
48a0b2f8e3 [SystemZ] Extend RISBG optimization
The handling of ANY_EXTEND and ZERO_EXTEND was too strict.  In this context
we can treat ZERO_EXTEND in much the same way as an AND and then also handle
outermost ZERO_EXTENDs.

I couldn't find a test that benefited from the ANY_EXTEND change, but it's
more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND are
handled differently.

llvm-svn: 197802
2013-12-20 11:49:48 +00:00
Saleem Abdulrasool
e75f7af412 ARM IAS: add support for the .pool directive
The .pool directive is an alias for the .ltorg directive used to create a
literal pool.  Simply treat .pool as if .ltorg was passed.

llvm-svn: 197787
2013-12-20 07:21:16 +00:00
Tom Stellard
b39ac07c09 R600: Allow ftrunc
v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc
v3: move ftrunc pattern next to TRUNC definition, it's available since R600

Patch By: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197783
2013-12-20 05:11:55 +00:00
Eric Christopher
24d8bb6edd [x86] Rename In32BitMode predicate to Not64BitMode
That's what it actually means, and with 16-bit support it's going to be
a little more relevant since in a few corner cases we may actually want
to distinguish between 16-bit and 32-bit mode (for example the bare 'push'
aliases to pushw/pushl etc.)

Patch by David Woodhouse

llvm-svn: 197768
2013-12-20 02:04:49 +00:00
Alp Toker
20f2bae8eb Fix documentation typos
llvm-svn: 197757
2013-12-20 00:33:39 +00:00
Kevin Enderby
b0799fc34d Un-revert: the buildbot failure in LLVM on lld-x86_64-win7 had me with
this commit as the only one on the Blamelist so I quickly reverted this.
However it was actually Nick's change who has since fixed that issue.

Original commit message:

Changed the X86 assembler for intel syntax to work with directional labels.

The X86 assembler as a separate code to parser the intel assembly syntax
in X86AsmParser::ParseIntelOperand().  This did not parse directional labels.
And if something like 1f was used as a branch target it would get an
"Unexpected token" error.

The fix starts in X86AsmParser::ParseIntelExpression() in the case for
AsmToken::Integer, it needs to grab the IntVal from the current token
then look for a 'b' or 'f' following an Integer.  Then it basically needs to
do what is done in AsmParser::parsePrimaryExpr() for directional
labels.  It saves the MCExpr it creates in the IntelExprStateMachine
in the Sym field.

When it returns to X86AsmParser::ParseIntelOperand() it looks
for a non-zero Sym field in the IntelExprStateMachine and if
set it creates a memory operand not an immediate operand
it would normally do for the Integer.

rdar://14961158

llvm-svn: 197744
2013-12-19 23:16:14 +00:00
David Peixotto
ae78d16e68 Ensure deterministic when printing ARM assembler constant pools
We dump any non-empty assembler constant pools after a successful
parse of an assembly file that uses the ldr pseudo opcode. These
per-section constant pools should be output in a deterministic order
to ensure that we always generate the same output when printing the
output with an AsmStreamer.

This patch changes the map data struture used to associate a section
with its constant pool to a MapVector to ensure deterministic
output. Because this map type does not support deletion, we now
check that the constant pool is not empty before dumping its entries
and clear the entries after emitting them with the streamer.

llvm-svn: 197735
2013-12-19 22:41:56 +00:00
Kevin Enderby
63d4a91601 Revert my change to the X86 assembler for intel syntax to work with
directional labels.  Because it doesn't work for windows :)

llvm-svn: 197731
2013-12-19 22:24:09 +00:00
Kevin Enderby
aaa32c63ce Changed the X86 assembler for intel syntax to work with directional labels.
The X86 assembler has a separate code to parser the intel assembly syntax
in X86AsmParser::ParseIntelOperand().  This did not parse directional labels.
And if something like 1f was used as a branch target it would get an
"Unexpected token" error.

The fix starts in X86AsmParser::ParseIntelExpression() in the case for
AsmToken::Integer, it needs to grab the IntVal from the current token
then look for a 'b' or 'f' following the Integer.  Then it basically needs to
do what is done in AsmParser::parsePrimaryExpr() for directional
labels.  It saves the MCExpr it creates in the IntelExprStateMachine
in the Sym field.

When it returns to X86AsmParser::ParseIntelOperand() it looks
for a non-zero Sym field in the IntelExprStateMachine and if
set it creates a memory operand not an immediate operand
it would normally do for the Integer.

rdar://14961158

llvm-svn: 197728
2013-12-19 22:02:03 +00:00
Quentin Colombet
884367d931 [X86][fast-isel] Fix select lowering.
The condition in selects is supposed to be i1.
Make sure we are just reading the less significant bit
of the 8 bits width value to match this constraint.

<rdar://problem/15651765>

llvm-svn: 197712
2013-12-19 18:32:04 +00:00
David Peixotto
16536db0ae Implement the .ltorg directive for ARM assembly
This directive will write out the assembler-maintained constant
pool for the current section. These constant pools are created to
support the ldr-pseudo instruction (e.g. ldr r0, =val).

The directive can be used by the programmer to place the constant
pool in a location that can be reached by a pc-relative offset in
the ldr instruction.

llvm-svn: 197711
2013-12-19 18:26:07 +00:00
David Peixotto
a66e68bb52 Implement the ldr-pseudo opcode for ARM assembly
The ldr-pseudo opcode is a convenience for loading 32-bit constants.
It is converted into a pc-relative load from a constant pool. For
example,

  ldr r0, =0x10001
  ldr r1, =bar

will generate this output in the final assembly

  ldr r0, .Ltmp0
  ldr r1, .Ltmp1
  ...
  .Ltmp0: .long 0x10001
  .Ltmp1: .long bar

Sketch of the LDR pseudo implementation:
  Keep a map from Section => ConstantPool

  When parsing ldr r0, =val
    parse val as an MCExpr
    get ConstantPool for current Section
    Label = CreateTempSymbol()
    remember val in ConstantPool at next free slot
    add operand to ldr that is MCSymbolRef of Label

  On finishParse() callback
    Write out all non-empty constant pools
    for each Entry in ConstantPool
      Emit Entry.Label
      Emit Entry.Value

Possible improvements to be added in a later patch:
  1. Does not convert load of small constants to mov
     (e.g. ldr r0, =0x1 => mov r0, 0x1)
  2. Does reuse constant pool entries for same constant

The implementation was tested for ARM, Thumb1, and Thumb2 targets on
linux and darwin.

llvm-svn: 197708
2013-12-19 18:12:36 +00:00
Rafael Espindola
db0b3614ff Small simplification, p0 is the same as p.
llvm-svn: 197699
2013-12-19 16:51:03 +00:00
Zoran Jovanovic
e6baed485f Support for microMIPS control instructions.
llvm-svn: 197696
2013-12-19 16:25:00 +00:00
Rafael Espindola
6cf8a98d5f Long doubles are required to be aligned to 128 bits and svr4 32 bits.
Clang was already getting this right.

llvm-svn: 197694
2013-12-19 16:23:59 +00:00
Hal Finkel
ce61543897 Add a disassembler to the PowerPC backend
The tests for the disassembler were adapted from the encoder tests, and for the
most part, the output from the disassembler matches that encoder-test inputs.
There are some places where more-informative mnemonics could be produced
(notably for the branch instructions), and those cases are noted in the tests
with FIXMEs.

Future work includes:

 - Generating more-informative mnemonics when possible (this may also be done
   in the printer).

 - Remove the dependence on positional "numbered" operand-to-variable mapping
   (for both encoding and decoding).

 - Internally using 64-bit instruction variants in 64-bit mode (if this turns
   out to matter).

llvm-svn: 197693
2013-12-19 16:13:01 +00:00
Zoran Jovanovic
a4d6da998d Support for microMIPS LL and SC instructions.
llvm-svn: 197692
2013-12-19 16:12:56 +00:00
Zoran Jovanovic
6b16ca6dfa Support for microMIPS TLS relocations.
llvm-svn: 197685
2013-12-19 16:02:32 +00:00
Matt Arsenault
e64331a159 R600/SI: Make private pointers be 32-bit.
Different sized address spaces should theoretically work
most of the time now, and since 64-bit add is currently
disabled, using more 32-bit pointers fixes some cases.

llvm-svn: 197659
2013-12-19 05:32:55 +00:00
Saleem Abdulrasool
a48f1a5b54 ARM IAS: support .inst directive
This adds support for the .inst directive.  This is an ARM specific directive to
indicate an instruction encoded as a constant expression.  The major difference
between .word, .short, or .byte and .inst is that the latter will be
disassembled as an instruction since it does not get flagged as data.

llvm-svn: 197657
2013-12-19 05:17:58 +00:00
Josh Magee
86d29cffa7 [stackprotector] Use analysis from the StackProtector pass for stack layout in PEI a nd LocalStackSlot passes.
This changes the MachineFrameInfo API to use the new SSPLayoutKind information
produced by the StackProtector pass (instead of a boolean flag) and updates a
few pass dependencies (to preserve the SSP analysis).

The stack layout follows the same approach used prior to this change - i.e.,
only LargeArray stack objects will be placed near the canary and everything
else will be laid out normally.  After this change, structures containing large
arrays will also be placed near the canary - a case previously missed by the
old implementation.

Out of tree targets will need to update their usage of
MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. 

The next patch will implement the rules for sspstrong and sspreq.  The end goal
is to support ssp-strong stack layout rules.

WIP.

Differential Revision: http://llvm-reviews.chandlerc.com/D2158

llvm-svn: 197653
2013-12-19 03:17:11 +00:00
Rafael Espindola
5589bebfa5 Add stack alignment information for Sparc.
This matches the data in clang which was added by Jakob Stoklund Olesen in
r179596.

Thanks for erikjv on irc for pointing me to the relevant documents:
http://sparc.com/standards/64.psabi.1.35.ps.Z
page 25: Every stack frame must be 16-byte aligned.

http://sparc.com/standards/psABI3rd.pdf
page 3-10: Although the architecture requires only word alignment, software convention and the operating system require every stack frame to be doubleword aligned.

I tried to add a test, but it looks like sparc doesn't implement dynamic stack
realignment. This will be tested in clang shortly.

llvm-svn: 197646
2013-12-19 02:21:16 +00:00
Reid Kleckner
f795c3e4a9 Begin adding docs and IR-level support for the inalloca attribute
The inalloca attribute is designed to support passing C++ objects by
value in the Microsoft C++ ABI.  It behaves the same as byval, except
that it always implies that the argument is in memory and that the bytes
are never copied.  This attribute allows the caller to take the address
of an outgoing argument's memory and execute arbitrary code to store
into it.

This patch adds basic IR support, docs, and verification.  It does not
attempt to implement any lowering or fix any possibly broken transforms.

When this patch lands, a complete description of this feature should
appear at http://llvm.org/docs/InAlloca.html .

Differential Revision: http://llvm-reviews.chandlerc.com/D2173

llvm-svn: 197645
2013-12-19 02:14:12 +00:00
Rafael Espindola
859cb122ba Synchronize the NaCl DataLayout strings with the ones in clang.
Patch by Derek Schuff.

llvm-svn: 197640
2013-12-19 00:44:37 +00:00
Reed Kotler
660bf702a1 Make cosmetic changes as part of Mips internal post commit review of
patch r196331.

llvm-svn: 197638
2013-12-19 00:43:08 +00:00
Reed Kotler
012c0a0f79 Fix a problem with mips16 stubs when calls are transformed during
tail call optimization. Some more work may be needed for indirect
calls but this patch fixes the current regression in Prolangc++/trees.
S2 optimization as part of the general cleanup and optimization
of prolog and epilog was not saving S2 in this case and needed to.

llvm-svn: 197630
2013-12-18 23:57:48 +00:00
Weiming Zhao
628bf03d65 [aarch32] fix bug 18268: Incorrect condition of vsel
Given vsel_cc, op1, op2, since vsel has no LE/LT, to generate vsel for
such selection, it needs to inverse cc and swap op1 and op2. To inverse
cc, both L/G and E bits should be flipped.

llvm-svn: 197615
2013-12-18 22:25:17 +00:00
Rafael Espindola
6dc5fe883a Correctly handle the degenerated triple "thumb".
Fixes a crash in llc where some parts think the target is thumb and others think
it is ARM.

llvm-svn: 197607
2013-12-18 21:29:44 +00:00
Logan Chien
89e5c10f8c [arm] Rename Tag_VFP_arch to Tag_FP_arch.
According to "Addenda to ABI for ARM architecture", Tag_FP_arch is the
new name for the equivalent Tag_VFP_arch.  This commit renames
Tag_VFP_arch to Tag_FP_arch.

llvm-svn: 197587
2013-12-18 17:23:15 +00:00
Rafael Espindola
9b3064d2fb Fix f64 and f128 for ppc-darwin.
This patch adds -f64:32:64 to 32 bit ppc darwin since a f64 inside a
structure are only 32 bit aligned.

The patch also drop -f128:64:128 from all ppc darwin, since f128 is
128 bit aligned.

llvm-svn: 197574
2013-12-18 15:06:25 +00:00
Rafael Espindola
e1792e72e1 One ppc32-darwin, a i64 inside a structure can have 32 bit alignment.
Thanks for Iain Sandoe for testing this with the original gcc.

Clang was already getting this right.

llvm-svn: 197572
2013-12-18 14:35:37 +00:00
Tim Northover
c9f20609f9 ARM: update comment to match reality
llvm-svn: 197570
2013-12-18 14:18:36 +00:00
Tim Northover
fe4e45a5b0 ARM: set default float ABI based on triple.
Clang sets the float-abi target option manually, but no longer
annotates each function with its ABI. This can lead to confusing
mistmatch between "clang -emit-llvm | llc" and normal clang
invocations.

Besides which, gnueabihf actually *is* hard-float. Defaulting to soft
was just perverse.

llvm-svn: 197554
2013-12-18 09:27:33 +00:00
Kevin Qin
99ae282f19 [AArch64 NEON]Implment loading vector constant form constant pool.
llvm-svn: 197551
2013-12-18 06:26:04 +00:00
Rafael Espindola
6b9c798c67 Fix N32 registers and stack alignment.
This patch fixes the "n" and "S" components of the data layout for mips. Clang
already gets this right.

This will be tested in clang.

llvm-svn: 197536
2013-12-17 23:15:58 +00:00
Hal Finkel
0cc169d05b Eliminate PPC instruction decoding ambiguities
The instruction definitions in the PPC backend have a number of variants
defined for the same instruction to represent differences between 64-bit and
32-bit semantics. In order to generate a disassembler for the PPC backend, we
need to mark all but one of these as CodeGen only.

No functionality change intended; this is prep work for PPC disassembly
support.

llvm-svn: 197535
2013-12-17 23:05:18 +00:00
Rafael Espindola
2b8c8dce53 On APCS, only try to align aggregates to 32 bits instead of 64.
This matches clang's behavior and since it is only a preference, it is not
an ABI issue.

llvm-svn: 197526
2013-12-17 21:36:54 +00:00
Rafael Espindola
0e6563fe28 Handle i64 first for clarity. No functionality change.
llvm-svn: 197524
2013-12-17 21:28:36 +00:00
Duncan P. N. Exon Smith
e1df9cf614 Assert that the last operand is actually EFLAGS
This is another follow-up to r197503, after a post-commit review by
Andy.

<rdar://problem/15627766>

llvm-svn: 197520
2013-12-17 20:28:21 +00:00
Matheus Almeida
b7efcc666c [mips] Fix off by one issue when applying a fixup.
The branch offset for a R_MIPS_PC16 relocation is indeed a 16-bit signed
immediate.

llvm-svn: 197506
2013-12-17 17:10:00 +00:00
Duncan P. N. Exon Smith
85e7983ab3 Revert "Revert "Mark vastart_save_xmm_regs as changing EFLAGS""
This reverts commit r197481, recommiting r197469 with an extra fix.

The vastart_save_xmm_regs pseudo-instruction expands to a test and a
branch, so it modifies EFLAGS.  Mark it so, or else the scheduler might
place it in the middle of another test+branch.

This fixes a bug exposed by r192750, which changed the initial scheduler
to source-order as part of enabling the MI Scheduler for X86.

This re-commit changes the VASTART_SAVE_XMM_REGS custom inserter not to
try to save %flags, and adds a test that catches the bad behavior of
r197469.

<rdar://problem/15627766>

llvm-svn: 197503
2013-12-17 15:54:45 +00:00
Rafael Espindola
4a6f55789a Fix the pointer size for the PS3 datalayout.
This will be tested from clang.

llvm-svn: 197501
2013-12-17 15:29:48 +00:00
Stepan Dyatkovskiy
ea2c7b6742 Fix for PR18045:
http://llvm.org/bugs/show_bug.cgi?id=18045

Short issue description:
For X86 machines with sse < sse4.1 we got failures for some
particular load/store vector sequences:

$ clang-trunk -m32 -O2 test-case.c
fatal error: error in backend: Cannot select: 0x4200920: v4i32,ch = load 0x41d6ab0, 0x4205850,
      0x41dcb10<LD16[getelementptr inbounds ([4 x i32]* @e, i32 0, i32 0)](align=4)> [ORD=82]
      [ID=58]
  0x4205850: i32 = X86ISD::Wrapper 0x41d5490 [ORD=26] [ID=43]
    0x41d5490: i32 = TargetGlobalAddress<[4 x i32]* @e> 0 [ORD=26] [ID=23]
  0x41dcb10: i32 = undef [ID=2]

The reason is that EltsFromConsecutiveLoads could emit such load instruction
both before and after legalize stage. Though this instruction is not legal for
machines with SSSE3 and lower.

The fix: In EltsFromConsecutiveLoads, if we have passed legalize stage, we
check whether nodes it emits are legal. 

P.S.: If you get failure in time from 12:00 and till 22:00 (UTC-8),
perhaps I'll slow with response, so you better reject this commit. Thanks!

llvm-svn: 197492
2013-12-17 12:07:33 +00:00
Elena Demikhovsky
241694a7bc AVX-512: Added implementation of CONCAT_VECTORS for v8i1 vectors (by Alexey Bader).
Added implementation of "truncate" from integer type (i64/i32/i16/i8) to i1.

llvm-svn: 197482
2013-12-17 08:33:15 +00:00
Duncan P. N. Exon Smith
3f31d678ca Revert "Mark vastart_save_xmm_regs as changing EFLAGS"
This reverts commit r197469.

The sanitizer and dragonegg buildbots are failing, I think because of
this change.  Reverting until I figure out why.

llvm-svn: 197481
2013-12-17 07:13:58 +00:00
Duncan P. N. Exon Smith
2cc99f0e39 Mark vastart_save_xmm_regs as changing EFLAGS
The vastart_save_xmm_regs pseudo-instruction expands to a test and a
branch, so it modifies EFLAGS.  Mark it so, or else the scheduler might
place it in the middle of another test+branch.

This fixes a bug exposed by r192750, which turned on the MI Scheduler
for X86.

<rdar://problem/15627766>

llvm-svn: 197469
2013-12-17 06:12:05 +00:00
Andrew Trick
a3aa2ba174 Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies.
Without this, MachineCSE is powerless to handle redundant operations with truncated source operands.

This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled:

     %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1
     %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2
     %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def>

Test case: cse-add-with-overflow.ll.

This exposed an existing bug in
PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case:
PowerPC/crash.ll.

llvm-svn: 197465
2013-12-17 04:50:45 +00:00
Andrew Trick
935a379f21 whitespace
llvm-svn: 197464
2013-12-17 04:50:40 +00:00
Yi Jiang
67f2e8e3f8 Enable double to float shrinking optimizations for binary functions like 'fmin/fmax'. Fix radar:15283121
llvm-svn: 197434
2013-12-16 22:42:40 +00:00
Juergen Ributzka
d14ed59350 [Stackmap] Allow WebKit_JS calling convention to store 4 byte sized and aligned arguments.
This allows the WebKit_JS calling convention to perform partial writes on a 4
byte granularity to stack slots.

llvm-svn: 197431
2013-12-16 22:05:32 +00:00
Matt Arsenault
da53b62afc Fix typo in instruction name.
SI_KIL -> SI_KILL

llvm-svn: 197425
2013-12-16 20:58:33 +00:00
Juergen Ributzka
3569d25c1a [Stackmap] The first integer argument is passed in register for the WebKit_JS calling convention.
Pass the first integer argument (callee) in register to optimize inline caches.

llvm-svn: 197416
2013-12-16 19:53:31 +00:00
Rafael Espindola
525165cec2 One last cleanup of LLVM's DataLayout strings.
Produce them in the same order on every target. The order is that of
getStringRepresentation: e|E-i*-f*-v*-a*-s*-n*-S*.

llvm-svn: 197411
2013-12-16 19:31:14 +00:00
Rafael Espindola
9920ad5b81 Structure R600's computeDataLayout more like every other target.
While there, simplify "p3:32:32:32" to "p3:32:32".

llvm-svn: 197407
2013-12-16 19:18:57 +00:00
Joerg Sonnenberger
dea3c5aab3 Recognize EABIHF as environment and use it for RTAPI + VFP.
llvm-svn: 197405
2013-12-16 18:51:28 +00:00
Chad Rosier
6bd296d8f5 [AArch64] Fix v1fx patterns for Floating-point Multiply Extend and Floating-point Compare to Zero.
llvm-svn: 197402
2013-12-16 18:29:35 +00:00
Rafael Espindola
559bceac20 The preferred alignment defaults to the abi alignment. Omit if it is the same.
llvm-svn: 197400
2013-12-16 18:01:51 +00:00
Rafael Espindola
8be80792dc Don't duplicate the DataLayout defaults for integer, floats and vectors.
llvm-svn: 197398
2013-12-16 17:41:15 +00:00
Rafael Espindola
65c80ee4a2 On DataLayout, omit the default of p:64:64:64.
llvm-svn: 197397
2013-12-16 17:15:29 +00:00
Hal Finkel
951040af31 Set has_asmparser in PowerPC/LLVMBuild.txt
PowerPC now has an asm parser (and has for many months now); indicate this in
PowerPC/LLVMBuild.txt.

llvm-svn: 197393
2013-12-16 15:48:09 +00:00
Elena Demikhovsky
b43ccbc3f7 AVX-512: Added legal type MVT::i1 and VK1 register for it.
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.

llvm-svn: 197384
2013-12-16 13:52:35 +00:00
Evgeniy Stepanov
5a1332c25e Fix Android regression in r197332.
llvm-svn: 197366
2013-12-16 07:02:51 +00:00
Hao Liu
81b69b5ce1 [AArch64]Fix the pattern match failure for v1i8/v1i16/v1i32 types.
Currently we have such types as legal vector types. The DAG combiner may generate some DAG nodes having such types but we don't have patterns to match them.
E.g. a load i32 and a bitcast i32 to v1i32 will be combined into a load v1i32:
     bitcast (load i32) to v1i32 -> load v1i32.
So this patch fixes such problems for load/dup instructions.
If v1i8/v1i16/v1i32 are not legal any more, the code in this patch can be deleted. So I also add some FIXME.

llvm-svn: 197361
2013-12-16 02:51:28 +00:00
Reed Kotler
b1b6964b8e remove an uneeded statement (condition is covered by the statement
that follows).

llvm-svn: 197358
2013-12-15 23:33:59 +00:00
Reed Kotler
e3e6631285 Fix some indentation.
llvm-svn: 197357
2013-12-15 23:03:35 +00:00
Reed Kotler
58dbd59162 Get rid of an superfluous tab in the .s file. This was originally
part of a multi-line pseudo which worked around a linker bug for mips16.

llvm-svn: 197356
2013-12-15 22:02:31 +00:00
Reed Kotler
5bb816aed3 Last change for mips16 prolog/epilog cleanup and optimization.
Some tiny cosmetic code changes to follow. Because of the wide
ranging nature of the patch a full 24 test cycle was needed to
check against regression. This was the smallest patch I could
make to progress from the earlier ones in the series. 

llvm-svn: 197350
2013-12-15 20:49:30 +00:00
Joerg Sonnenberger
72ca1c7c94 There is no exp10 on NetBSD.
llvm-svn: 197348
2013-12-15 20:36:17 +00:00
Joerg Sonnenberger
64c33d9c40 Replace string matching with a switch on Triple::getEnvironment.
llvm-svn: 197332
2013-12-15 00:12:52 +00:00
Matt Arsenault
6520333e09 Don't manually calculate size in bytes
llvm-svn: 197327
2013-12-14 18:21:59 +00:00
Iain Sandoe
d78fe2e004 [Powerpc darwin] AsmParser Base implementation.
This is a base implementation of the powerpc-apple-darwin asm parser dialect.

* Enables infrastructure (essentially isDarwin()) and fixes up the parsing of asm directives to separate out ELF and MachO/Darwin additions.
* Enables parsing of {r,f,v}XX as register identifiers.
* Enables parsing of lo16() hi16() and ha16() as modifiers.

The changes to the test case are from David Fang (fangism).

llvm-svn: 197324
2013-12-14 13:34:02 +00:00
Juergen Ributzka
ae94270422 [Stackmap] Only the AnyReg calling convention should preserve all registers.
llvm-svn: 197316
2013-12-14 06:52:59 +00:00
Rafael Espindola
d0ba591b49 Refactor NVPTX's computeDataLayout.
No functionality change.

llvm-svn: 197312
2013-12-14 06:42:48 +00:00
Rafael Espindola
e8b32c0d4d Turn NVPTXSubtarget::getDataLayout into a static function.
No functionality change.

llvm-svn: 197311
2013-12-14 06:36:30 +00:00
Rafael Espindola
9ed27196df Turn AMDGPUSubtarget::getDataLayout into a static function.
No functionality change.

llvm-svn: 197310
2013-12-14 06:13:44 +00:00
Kevin Enderby
e6ab982310 Fixed a bug in getARMFixupKindMachOInfo() where three ARM fixup kinds
were falling into the cases for 24-bit branch kinds which are not 24-bit
branches.  The routine is to return false for fixups are expected to always
be resolvable at assembly time. Which these three fixups are as they have
limited displacement and are for local references within a function.

rdar://15586725

llvm-svn: 197282
2013-12-13 22:46:54 +00:00
Chad Rosier
75d1acea0b [AArch64] Simplify the Neon Scalar3Same patterns for floating-point reciprocal
step, floating-point reciprocal square root step, floating-point absolute
difference, and integer/floating-point compare instructions.  Also, move the
scalar general arithmetic operation patterns closer to similar code.  No
functional change intended.

llvm-svn: 197250
2013-12-13 17:56:44 +00:00
Rafael Espindola
2bd13393e0 Assume defaults to produce smaller datalayout strings.
llvm-svn: 197249
2013-12-13 17:56:11 +00:00
Rafael Espindola
77962b5146 Fix pr18235.
The cpp backend is not a reasonable fallback for a missing target. It is a
very special backend, so it is reasonable to use it only if explicitly
requested.

While at it, simplify the interface a bit.

llvm-svn: 197241
2013-12-13 16:05:32 +00:00
Richard Sandiford
5a37a68afd [SystemZ] Optimize X [!=]= Y in cases where X - Y or Y - X is also computed
In those cases it's better to compare the result of the subtraction
against zero.

llvm-svn: 197239
2013-12-13 15:50:30 +00:00
Richard Sandiford
7ae30a86de [SystemZ] Make more use of TMHH
This originally came about after noticing that InstCombine turns
some of the TMHH (icmp (and...), ...) tests into plain comparisons.
Since there is no instruction to compare with a 64-bit immediate,
TMHH is generally better than an ordered comparison for the cases
that it can handle.

llvm-svn: 197238
2013-12-13 15:46:55 +00:00
Iain Sandoe
fc27a861a1 test commit.
Amend a comment.

llvm-svn: 197237
2013-12-13 15:46:48 +00:00
Richard Sandiford
50a6f85c7a [SystemZ] Extend integer absolute selection
This patch makes more use of LPGFR and LNGFR.  It builds on top of
the LTGFR selection from r197234.  Most of the tests are motivated
by what InstCombine would produce.

llvm-svn: 197236
2013-12-13 15:35:00 +00:00
Richard Sandiford
21b1981cca [SystemZ] Add a structure to represent a selected comparison
...in an attempt to rein back the increasingly complex selection code.
A knock-on effect is that ICmpType is exposed from the outset, which
slightly simplifies adjustSubwordCmp.

The code is no piece of art even after this change, but at least it should
be slightly better.  No behavioral change intended.

llvm-svn: 197235
2013-12-13 15:28:45 +00:00
Richard Sandiford
3cb2e57bb5 [SystemZ] Make more use of LTGFR
InstCombine turns (sext (trunc)) into (ashr (shl)), then converts any
comparison of the ashr against zero into a comparison of the shl against zero.
This makes sense in itself, but we want to undo it for z, since the sign-
extension instruction has a CC-setting form.

I've included tests for both the original and InstCombined variants,
but the former already worked.  The patch fixes the latter.

llvm-svn: 197234
2013-12-13 15:07:39 +00:00
Benjamin Kramer
123ebfd785 X86: When lowering shl_parts, don't emit shift amounts larger than the bit width.
While it's safe for the X86-specific shift nodes, dag combining will
kill generic nodes. Insert an AND to make it safe, isel will nuke it
as x86's shift instructions have an implicit AND.

Fixes PR16108, which contains a contraption to hit this case in between
constant folders.

llvm-svn: 197228
2013-12-13 13:40:24 +00:00
Joerg Sonnenberger
2826f5b278 Enabling thumb2 mode used to force support for armv6t2. Replace this
with a temporary assertion and adjust the various test cases.

llvm-svn: 197224
2013-12-13 11:16:00 +00:00
Matheus Almeida
b273b4d798 [mips] Add checks for alignment and maximum displacements for most of the
branch instructions for mips and micromips instruction sets thus avoiding
the situation of generating branches to undesired locations if offsets
cannot be encoded.

This patch also checks if a fixup cannot be applied and returns a fatal error
if that's the case.

llvm-svn: 197223
2013-12-13 11:11:02 +00:00
Kai Nacke
384eb1e0f0 Change stack probing code for MingW.
Since gcc 4.6 the compiler uses ___chkstk_ms which has the same semantics as the
MS CRT function __chkstk. This simplifies the prologue generation a bit.

Reviewed by Rafael Espíndola. 

llvm-svn: 197205
2013-12-13 05:37:05 +00:00
Rafael Espindola
b5a53246ce Simplify the datalayout string of ARM and AArch64.
No functionality change.

Reviewed by Tim Northover.

llvm-svn: 197172
2013-12-12 17:43:37 +00:00
Rafael Espindola
70b2c8383c Simplify the SystemZ datalayout string.
Reviewed by Richard Sandiford.

llvm-svn: 197170
2013-12-12 17:30:07 +00:00
Rafael Espindola
404941ee51 Use "a" instead of "a0" in DataLayout.
It means exactly the same and is just a bit shorter.

llvm-svn: 197169
2013-12-12 17:21:51 +00:00
Rafael Espindola
52eb6b7c6e Switch to the new MingW ABI.
GCC 4.7 changed the MingW ABI. On the LLVM side it means that sret functions
don't pop the stack.

llvm-svn: 197163
2013-12-12 16:06:58 +00:00
Chad Rosier
7ff770d07b [AArch64] Removed unnecessary copy patterns with v1fx types.
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- There is no ACLE intrinsic that uses v1f32 type.  And there is no conflict of
  neon and non-neon ovelapped operations with this type, so there is no need to
  support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
  operations.

Patch by Ana Pazos!

llvm-svn: 197159
2013-12-12 15:46:29 +00:00
Andrea Di Biagio
b414478458 Added new X86 patterns to select SSE scalar fp arithmetic instructions from
a vector packed single/double fp operation followed by a vector insert.

The effect is that the backend coverts the packed fp instruction
followed by a vectro insert into a SSE or AVX scalar fp instruction.

For example, given the following code:
   __m128 foo(__m128 A, __m128 B) {
     __m128 C = A + B;
     return (__m128) {c[0], a[1], a[2], a[3]};
   }

 previously we generated:
   addps %xmm0, %xmm1
   movss %xmm1, %xmm0
 
 we now generate:
   addss %xmm1, %xmm0

llvm-svn: 197145
2013-12-12 11:50:47 +00:00
Gabor Greif
c84043be50 typo in comment
llvm-svn: 197136
2013-12-12 08:00:34 +00:00
Hao Liu
f67a904939 [AArch64]Fix the problem that AArch64 backend fails to select scalar_to_vector of vector types having more than one element.
llvm-svn: 197135
2013-12-12 07:36:26 +00:00
Reed Kotler
12023f45b4 Check for null pointer before dereferencing. A careless typo on my part.
I don't know why this did not show up earlier. This code has been
around for ages. 

llvm-svn: 197119
2013-12-12 02:41:11 +00:00
Yi Jiang
f648b7ec82 Resubmit r196544: Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x)
llvm-svn: 197109
2013-12-12 01:55:04 +00:00
Hal Finkel
bb547872c5 Remove unused multiclass from PPCInstrInfo.td
llvm-svn: 197100
2013-12-12 00:23:29 +00:00
Hal Finkel
e8c7fef754 Improve instruction scheduling for the PPC POWER7
Aside from a few minor latency corrections, the major change here is a new
hazard recognizer which focuses on better dispatch-group formation on the
POWER7. As with the PPC970's hazard recognizer, the most important thing it
does is avoid load-after-store hazards within the same dispatch group. It uses
the POWER7's special dispatch-group-terminating nop instruction (instead of
inserting multiple regular nop instructions). This new hazard recognizer makes
use of the scheduling dependency graph itself, built using AA information, to
robustly detect the possibility of load-after-store hazards.

significant test-suite performance changes (the error bars are 99.5% confidence
intervals based on 5 test-suite runs both with and without the change --
speedups are negative):

speedups:

MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.55171% +/- 0.333168%

MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl
	-17.5576% +/- 14.598%

MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl
	-29.5708% +/- 7.09058%

MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt
	-34.9471% +/- 11.4391%

SingleSource/Benchmarks/BenchmarkGame/puzzle
	-25.1347% +/- 11.0104%

SingleSource/Benchmarks/Misc/flops-8
	-17.7297% +/- 9.79061%

SingleSource/Benchmarks/Shootout-C++/ary3
	-35.5018% +/- 23.9458%

SingleSource/Regression/C/uint64_to_float
	-56.3165% +/- 25.4234%

SingleSource/UnitTests/Vectorizer/gcc-loops
	-18.5309% +/- 6.8496%

regressions:

MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
	18.351% +/- 12.156%

SingleSource/Benchmarks/Shootout-C++/methcall
	27.3086% +/- 14.4733%

llvm-svn: 197099
2013-12-12 00:19:11 +00:00
Chad Rosier
551789d294 [AArch64] Refactor NEON floating-point Max/Min/Maxnm/Minnm across vector AArch64
intrinsics to use f32 types, rather than their vector equivalents.

llvm-svn: 197090
2013-12-11 23:21:25 +00:00
Hal Finkel
4a76fae817 Fix the PPC subsumes-predicate check
For one predicate to subsume another, they must both check the same condition
register. Failure to check this prerequisite was causing miscompiles.

Fixes PR18003.

llvm-svn: 197089
2013-12-11 23:12:25 +00:00
Chad Rosier
c251a82254 [AArch64] Add NEON scalar floating-point compare LLVM AArch64 intrinsics that
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197068
2013-12-11 21:03:46 +00:00
Chad Rosier
0b1fef12e8 [AArch64] Refactor the NEON scalar floating-point reciprocal step and
floating-point reciprocal square root step LLVM AArch64 intrinsics to
use f32/f64 types, rather than their vector equivalents.

llvm-svn: 197067
2013-12-11 21:03:43 +00:00
Chad Rosier
43daaa765b [AArch64] Refactor the NEON scalar floating-point reciprocal estimate, floating-
point reciprocal exponent, and floating-point reciprocal square root estimate
LLVM AArch64 intrinsics to use f32/f64 types, rather than their vector
equivalents.

llvm-svn: 197066
2013-12-11 21:03:40 +00:00
Rafael Espindola
676fc699c0 Don't set unused variable.
llvm-svn: 197064
2013-12-11 20:40:57 +00:00
Tom Stellard
846883f395 R600: Re-format Processors.td
This makes it a little easier to read.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197058
2013-12-11 17:51:51 +00:00
Tom Stellard
11ea886d72 R600: Register AMDGPUCFGStructurizer pass
This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197057
2013-12-11 17:51:47 +00:00
Tom Stellard
22354cfe07 R600: Register R600EmitClauseMarkers pass
This enables -print-before-all to dump MachineInstrs after it is run.

Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 197056
2013-12-11 17:51:41 +00:00
Logan Chien
e6b3d37d36 [arm] Implement ARM .arch directive.
llvm-svn: 197052
2013-12-11 17:16:25 +00:00
Tim Northover
d66403a4e1 ARM: constrain register-class in fast-isel
The tests were no longer using fast-isel at all (MachO needs an "ios" rather
than "darwin" triple at the moment and Linux needs ARM mode). Once that was
corrected, the verifier complained about a t2ADDri created for the alloca.

llvm-svn: 197046
2013-12-11 16:04:57 +00:00
Elena Demikhovsky
154413adc2 AVX-512: Removed "z" suffix from AVX-512 instructions, since it is incompatible with GCC.
I moved a test from avx512-vbroadcast-crash.ll to avx512-vbroadcast.ll
I defined HasAVX512 predicate as AssemblerPredicate. It means that you should invoke llvm-mc with "-mcpu=knl" to get encoding for AVX-512 instructions. I need this to let AsmMatcher to set different encoding for AVX and AVX-512 instructions that have the same mnemonic and operands (all scalar instructions).

llvm-svn: 197041
2013-12-11 14:31:04 +00:00
Richard Sandiford
533c977c68 [SystemZ] Optimize fcmp X, 0 in cases where X is also negated
In such cases it's often better to test the result of the negation instead,
since the negation also sets CC.

llvm-svn: 197032
2013-12-11 11:45:08 +00:00
Reed Kotler
ebfdeaa94e Distinguish and choose 16 or 32 bit forms of save/restore for Mips16.
llvm-svn: 196999
2013-12-11 03:32:44 +00:00
Kevin Qin
1453b4e3d7 [AArch64 NEON] Get instruction BSL matched to VSELECT.
llvm-svn: 196998
2013-12-11 02:33:50 +00:00
Rafael Espindola
cb6cafb631 Move mips' datalayout computation out of line and add comments.
llvm-svn: 196996
2013-12-11 01:41:10 +00:00
Rafael Espindola
082ec7aafe Move Sparc's getDataLayout out of line and add comments.
llvm-svn: 196990
2013-12-11 01:07:43 +00:00
NAKAMURA Takumi
54fa39136d Prune redundant dependencies in LLVMBuild.txt.
llvm-svn: 196988
2013-12-11 00:30:57 +00:00
Rafael Espindola
db206ff5b6 Move PPC's getDataLayoutString out of line and document it better.
llvm-svn: 196987
2013-12-11 00:09:06 +00:00
Reid Kleckner
9e1ba62c63 Revert the backend fatal error from r196939
The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.

ASan inserts empy inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.

XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.

llvm-svn: 196986
2013-12-10 23:23:52 +00:00
Rafael Espindola
6b7d172e52 Refactor the computation of the x86 datalayout.
llvm-svn: 196976
2013-12-10 22:05:32 +00:00
Matt Arsenault
9db19365b4 Use llvm_unreachable instead of assert(0)
llvm-svn: 196971
2013-12-10 21:37:42 +00:00
David Fang
7d43532c99 on darwin<10, fallback to .weak_definition (PPC,X86)
.weak_def_can_be_hidden was not yet supported by the system assembler

llvm-svn: 196970
2013-12-10 21:37:41 +00:00
Chad Rosier
29ed5c4552 [AArch64] Refactor the NEON floating-point absolute difference LLVM AArch64
intrinsic to use f32/f64 types, rather than their vector equivalents.

llvm-svn: 196965
2013-12-10 21:33:59 +00:00
Chad Rosier
5394f9c916 [AArch64] Refactor the NEON signed/unsigned floating-point convert to fixed-point
LLVM AArch64 intrinsics to use f32/f64, rather than their vector equivalents.

llvm-svn: 196964
2013-12-10 21:33:56 +00:00
Chad Rosier
b2112dc6c3 [AArch64] Overload NEON signed/unsigned floating-point convert to fixed-point
and fixed-point convert to floating-point LLVM AArch64 intrinsics.

llvm-svn: 196963
2013-12-10 21:33:53 +00:00
Chad Rosier
3d7979609e [AArch64] Overload NEON signed/unsigned integer convert to floating-point
LLVM AArch64 intrinsics.

llvm-svn: 196962
2013-12-10 21:33:50 +00:00
Reid Kleckner
b6a72325f3 Reland "Fix miscompile of MS inline assembly with stack realignment"
This re-lands commit r196876, which was reverted in r196879.

The tests have been fixed to pass on platforms with a stack alignment
larger than 4.

Update to clang side tests will land shortly.

llvm-svn: 196939
2013-12-10 18:27:32 +00:00
Tim Northover
54d769f4eb Make Triple's isOSBinFormatXXX functions partition triple-space.
Most users would be surprised if "isCOFF" and "isMachO" were simultaneously
true, unless they'd put the compiler in a box with a gun attached to a photon
detector.

This makes sure precisely one of the three formats is true for any triple and
simplifies some target logic based on that.

llvm-svn: 196934
2013-12-10 16:57:43 +00:00
Chad Rosier
7e9f19f92d [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196930
2013-12-10 16:11:39 +00:00
Chad Rosier
0b6c7be6f7 [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196926
2013-12-10 15:35:33 +00:00
Andrea Di Biagio
3107b3cbaa Ensure that the backend no longer emits unnecessary vector insert instructions
immediately after SSE scalar fp instructions like addss or mulss.

Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.

For example, given the following code:
  __m128 foo(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
  }

previously we generated:
  addss %xmm0, %xmm1
  movss %xmm1, %xmm0

now we generate:
  addss %xmm1, %xmm0

llvm-svn: 196925
2013-12-10 15:22:48 +00:00
Vincent Lejeune
73c6e97c15 R600: Fix an infinite loop when trying to reorganize export/tex vector input
llvm-svn: 196923
2013-12-10 14:43:31 +00:00
Vincent Lejeune
cd5d9a9849 R600: Fix input modifiers lost for Cayman
llvm-svn: 196922
2013-12-10 14:43:27 +00:00
Reed Kotler
1f7ad447b7 Next step in Mips16 prologue/epilogue cleanup.
Save S2(reg 18) only when we are calling floating point stubs that
have a return value of float or complex. Some more work to make this
better but this is the first step.

llvm-svn: 196921
2013-12-10 14:29:38 +00:00
Elena Demikhovsky
57057960b0 AVX-512: changed intrinsics for mask operations
llvm-svn: 196918
2013-12-10 13:53:10 +00:00
Elena Demikhovsky
b3a0e7bbed AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form
llvm-svn: 196914
2013-12-10 11:58:35 +00:00
Daniel Sanders
89ddadb4c5 [mips][msa] Correct sld and sldi builtins.
Summary: The result register of these instructions is also the first operand.

Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D2362
Differential Revision: http://llvm-reviews.chandlerc.com/D2363

llvm-svn: 196910
2013-12-10 11:37:00 +00:00
Richard Sandiford
1b5b817c73 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.

llvm-svn: 196906
2013-12-10 10:49:34 +00:00
Richard Sandiford
bc88711db8 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.

llvm-svn: 196905
2013-12-10 10:36:34 +00:00
Kevin Qin
e00af17fe4 [AArch64 NEON] Replace fpimm with fpz32 for floating compare with zero.
This is a small change to be strict. Just want get pattern safer.

llvm-svn: 196889
2013-12-10 06:51:07 +00:00
Kevin Qin
746aa8a55e [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
llvm-svn: 196887
2013-12-10 06:48:35 +00:00
NAKAMURA Takumi
3fda77b6c7 Add proper dependencies to LLVMBuild.txt in llvm/lib.
I'll prune redundant deps in LLVMBuild.txt, later.

llvm-svn: 196881
2013-12-10 05:39:34 +00:00
NAKAMURA Takumi
b2c60b7ca7 Whitespaces.
llvm-svn: 196880
2013-12-10 05:39:12 +00:00
Reid Kleckner
cb3c239850 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

llvm-svn: 196879
2013-12-10 05:31:27 +00:00
Reid Kleckner
26454793b1 Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

llvm-svn: 196876
2013-12-10 05:12:23 +00:00
Rafael Espindola
3a54357741 Add comments documenting the ARM datalayout string.
llvm-svn: 196850
2013-12-10 00:37:37 +00:00