1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

9282 Commits

Author SHA1 Message Date
Chad Rosier
81c2f41261 [ms-inline asm] Move this logic into a static function as it's only applicable
when parsing MS-style inline assembly.  No functional change intended.

llvm-svn: 179407
2013-04-12 20:20:54 +00:00
Chad Rosier
4e3d67d4c5 [ms-inline asm] Address the FIXME for ImmDisp before brackets. This
is a follow on to r179393 and r179399.  Test case to be added on
the clang side.
Part of rdar://13453209

llvm-svn: 179403
2013-04-12 19:51:49 +00:00
Chad Rosier
443d79152d [ms-inline asm] Have the [ Symbol ] case fall into the more general logic. This
is a follow on to r179393.  Test case to be added on the clang side.
Part of rdar://13453209

llvm-svn: 179399
2013-04-12 18:54:20 +00:00
Chad Rosier
b3960a88c7 [ms-inline asm] Add support for operands that include both a symbol and an
immediate displacement.  Specifically, add support for generating the proper IR.
We've been able to parse this for some time now.  Test case to be added on the
clang side.
Part of rdar://13453209

llvm-svn: 179393
2013-04-12 18:21:18 +00:00
Chad Rosier
765ec71e8d [ms-inline asm] Add support for using the LENGTH, TYPE, and SIZE operators with
variables that use namespace alias qualifiers.  Test case coming on clang side
shortly.
Part of rdar://13499009

llvm-svn: 179343
2013-04-11 23:57:04 +00:00
Chad Rosier
2aa88ce036 [ms-inline asm] Add support for using offsetof operator with variables that use
namespace alias qualifiers.  Test case coming on clang side shortly.
Part of rdar://13499009

llvm-svn: 179339
2013-04-11 23:37:34 +00:00
Chad Rosier
9323db9524 [ms-inline asm] Pass a StringRef reference to ParseIntelVarWithQualifier so we
can build up the identifier string.  No test case as support for looking up
these type of identifiers hasn't been implemented on the clang side.
Part of rdar://13499009

llvm-svn: 179336
2013-04-11 23:24:15 +00:00
Chad Rosier
b3012065df [ms-inline asm] Remove brackets from around a symbol reference in the target
specific logic.  This makes the code much less fragile.  Test case coming on the
clang side in a moment.
rdar://13634327

llvm-svn: 179323
2013-04-11 21:49:30 +00:00
Michael Liao
877d1576e6 Optimize vector select from all 0s or all 1s
As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
vector select could be simplified to AND/OR or removed if one or both values
being selected is all 0s or all 1s.

llvm-svn: 179267
2013-04-11 05:15:54 +00:00
Michael Liao
75c886a312 Add CLAC/STAC instruction encoding/decoding support
As these two instructions in AVX extension are privileged instructions for
special purpose, it's only expected to be used in inlined assembly.

llvm-svn: 179266
2013-04-11 04:52:28 +00:00
Michael Liao
87125582e9 Enhance bool simplifcation in X86 to handle more cases
This patch is revised based on patch from Victor Umansky
<victor.umansky@intel.com>. More cases are handled in X86's bool
simplification, i.e.
- SETCC_CARRY
- value is truncated to i1 with AND

As a by-product, PR5443 is also fixed.

llvm-svn: 179265
2013-04-11 04:43:09 +00:00
Nico Rieck
8e22855ea6 MC: Support COFF image-relative MCSymbolRefs
Add support for the COFF relocation types IMAGE_REL_I386_DIR32NB and
IMAGE_REL_AMD64_ADDR32NB for 32- and 64-bit respectively. These are
similar to normal 4-byte relocations except that they do not include
the base address of the image.

Image-relative relocations are used for debug information (32-bit) and
SEH unwind tables (64-bit).

A new MCSymbolRef variant called 'VK_COFF_IMGREL32' is introduced to
specify such relocations. For AT&T assembly, this variant can be accessed
using the symbol suffix '@imgrel'.

llvm-svn: 179240
2013-04-10 23:28:17 +00:00
Kay Tiong Khoo
5f12d15d44 fixed xsave, xsaveopt, xrstor mnemonics with intel syntax; added test cases
llvm-svn: 179223
2013-04-10 21:52:25 +00:00
Kay Tiong Khoo
ba75929324 fixed to disassemble with tab after mnemonic rather than space
llvm-svn: 179215
2013-04-10 21:17:58 +00:00
Preston Gurd
de5cf7a23b In the X86 back end, getMemoryOperandNo() returns the offset
into the operand array of the start of the memory reference descriptor.

Additional code in EncodeInstruction provides an additional adjustment.

This patch places that additional code in a separate function,
called getOperandBias, so that any caller of getMemoryOperandNo
can also call getOperandBias.

llvm-svn: 179211
2013-04-10 20:11:59 +00:00
Chad Rosier
b0156236cb Tidy up, fix and simplify a few of the SMLocs. Prior to r179109 the Start SMLoc
wasn't always the start of the operand.  If there was a symbol reference, then
Start pointed to that token.  It's very likely there are other places that need
to be updated.

llvm-svn: 179210
2013-04-10 20:07:47 +00:00
Chad Rosier
cc61ca2355 Remove unused variable.
llvm-svn: 179205
2013-04-10 18:46:58 +00:00
Chad Rosier
411fdc3a74 Reapply r179115, but use parsePrimaryExpression a little more judiciously.
Test cases that regressed due to r179115, plus a few more, were added in
r179182.  Original commit message below:

[ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier.  Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:

 __asm mov eax, [Symbol + ImmDisp]

Part of rdar://13611297

llvm-svn: 179187
2013-04-10 17:35:30 +00:00
Evan Cheng
9f82233851 __sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in
xmm0 / xmm1.

rdar://13599493

llvm-svn: 179141
2013-04-10 01:26:07 +00:00
Chad Rosier
aa67701688 Cleanup. No functional change intended.
llvm-svn: 179129
2013-04-09 20:58:48 +00:00
Chad Rosier
85e2894bd6 Cleanup. No functional change intended.
llvm-svn: 179125
2013-04-09 20:44:09 +00:00
Chad Rosier
e040ffba05 Revert r179115 as it looks to have killed the ASan tests.
llvm-svn: 179120
2013-04-09 19:59:12 +00:00
Chad Rosier
4ef6c35911 [ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier.  Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:

 __asm mov eax, [Symbol + ImmDisp]

The existing test cases exercise this patch.
rdar://13611297

llvm-svn: 179115
2013-04-09 19:34:59 +00:00
Chad Rosier
5ec822982c [ms-inline asm] Maintain a StringRef to reference a symbol in a parsed operand,
rather than deriving the StringRef from the Start and End SMLocs.

Using the Start and End SMLocs works fine for operands such as [Symbol], but
not for operands such as [Symbol + ImmDisp].  All existing test cases that
reference a variable exercise this patch.
rdar://13602265

llvm-svn: 179109
2013-04-09 17:53:49 +00:00
Arnold Schwaighofer
3218da2403 X86 cost model: Model cost for uitofp and sitofp on SSE2
The costs are overfitted so that I can still use the legalization factor.

For example the following kernel has about half the throughput vectorized than
unvectorized when compiled with SSE2. Before this patch we would vectorize it.

unsigned short A[1024];
double B[1024];
void f() {
  int i;
  for (i = 0; i < 1024; ++i) {
    B[i] = (double) A[i];
  }
}

radar://13599001

llvm-svn: 179033
2013-04-08 18:05:48 +00:00
Chad Rosier
7583b0a3c3 [ms-inline asm] Add support for ImmDisp [ Symbol ] memory operands.
rdar://13521249

llvm-svn: 179030
2013-04-08 17:43:47 +00:00
Bill Wendling
f2bb7aa5f8 Use the target options specified on a function to reset the back-end.
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917
2013-04-05 21:52:40 +00:00
Chad Rosier
59dbb08c9b [ms-inline asm] Add support for numeric displacement expressions in bracketed
memory operands.

Essentially, this layers an infix calculator on top of the parsing state
machine.  The scale on the index register is still expected to be an immediate

 __asm mov eax, [eax + ebx*4]

and will not work with more complex expressions.  For example,

 __asm mov eax, [eax + ebx*(2*2)]

The plus and minus binary operators assume the numeric value of a register is
zero so as to not change the displacement.  Register operands should never
be an operand for a multiply or divide operation; the scale*indexreg
expression is always replaced with a zero on the operand stack to prevent
such a case.
rdar://13521380

llvm-svn: 178881
2013-04-05 16:28:55 +00:00
Arnold Schwaighofer
52871434dd X86 cost model: Differentiate cost for vector shifts of constants
SSE2 has efficient support for shifts by a scalar. My previous change of making
shifts expensive did not take this into account marking all shifts as expensive.
This would prevent vectorization from happening where it is actually beneficial.

With this change we differentiate between shifts of constants and other shifts.

radar://13576547

llvm-svn: 178808
2013-04-04 23:26:24 +00:00
Arnold Schwaighofer
861251004b CostModel: Add parameter to instruction cost to further classify operand values
On certain architectures we can support efficient vectorized version of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.

We can efficiently support

for (i = 0 ; i < ; i += 4)
  w[0:3] = v[0:3] << <2, 2, 2, 2>

but not

for (i = 0; i < ; i += 4)
  w[0:3] = v[0:3] << x[0:3]

This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.

Targets can then choose to return a different cost for instructions with such
operand values.

A follow-up commit will test this feature on x86.

radar://13576547

llvm-svn: 178807
2013-04-04 23:26:21 +00:00
Arnold Schwaighofer
329430aeac X86 cost model: Vector shifts are expensive in most cases
The default logic does not correctly identify costs of casts because they are
marked as custom on x86.

For some cases, where the shift amount is a scalar we would be able to generate
better code. Unfortunately, when this is the case the value (the splat) will get
hoisted out of the loop, thereby making it invisible to ISel.

radar://13130673
radar://13537826

llvm-svn: 178703
2013-04-03 21:46:05 +00:00
Timur Iskhodzhanov
ecd533f0ec Fix SRet for thiscall in i686-pc-win32
llvm-svn: 178634
2013-04-03 11:27:54 +00:00
Eric Christopher
81bacb7670 Formatting.
llvm-svn: 178589
2013-04-02 23:06:40 +00:00
Chad Rosier
1fe97eda48 [ms-inline asm] Add support for parsing variables with namespace alias
qualifiers.

This patch only adds support for parsing these identifiers in the
X86AsmParser.  The front-end interface isn't capable of looking up
these identifiers at this point in time.  The end result is the
compiler now errors during object file emission, rather than at
parse time.  Test case coming shortly.
Part of rdar://13499009 and PR13340

llvm-svn: 178566
2013-04-02 20:02:33 +00:00
Chad Rosier
908153170e [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
llvm-svn: 178549
2013-04-02 16:31:41 +00:00
Andrew Trick
b6ac50177f The divide unit is not pipeline, but it is still buffered.
Buffered means a later divide may be executed out-of-order while a
prior divide is sitting (buffered) in a reservation station.

You can tell it's not pipelined, because operations that use it
reserve it for more than one cycle:

def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> {
  let Latency = 25;
  let ResourceCycles = [1, 10];
}

We don't currently distinguish between an unpipeline operation and one
that is split into multiple micro-ops requiring the same unit. Except
that the later may have NumMicroOps > 1 if they also consume
issue/dispatch resources.

llvm-svn: 178519
2013-04-02 01:58:47 +00:00
Benjamin Kramer
7634eefc37 X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts.
llvm-svn: 178459
2013-04-01 10:23:49 +00:00
Benjamin Kramer
790bd5fb50 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

llvm-svn: 178448
2013-03-31 12:49:15 +00:00
Benjamin Kramer
50725426cb Change '@SECREL' suffix to GAS-compatible '@SECREL32'.
'@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'.
With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here).

Patch by David Nadlinger!
Differential Revision: http://llvm-reviews.chandlerc.com/D429

llvm-svn: 178427
2013-03-30 16:21:50 +00:00
Benjamin Kramer
279e5cfa9a Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Michael Liao
427149cbcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao
aec693ab31 Enhance boolean simplification to handle 16-/64-bit RDRAND
- RDRAND always clears the destination value when a random value is not
  available (i.e. CF == 0). This value is truncated or zero-extended as
  the false boolean value to be returned. Boolean simplification needs
  to skip this 'zext' or 'trunc' node.

llvm-svn: 178312
2013-03-28 23:38:52 +00:00
Michael Liao
d961d7a7b3 Skip moving call address loading into callseq when targets prefer register indirect call.
To enable a load of a call address to be folded with that call, this
load is moved from outside of callseq into callseq. Such a moving
adds a non-glued node (that load) into a glued sequence. This non-glue
load is only removed when DAG selection folds them into a memory form
call instruction. When such instruction selection is disabled, it breaks
DAG schedule.

To prevent that, such moving is disabled when target favors register
indirect call.

Previous workaround disabling CALL32m/CALL64m insn selection is removed.

llvm-svn: 178308
2013-03-28 23:13:21 +00:00
Nadav Rotem
705ec0e7e3 Add the X86 FMAs to the scheduling model.
llvm-svn: 178303
2013-03-28 22:54:45 +00:00
Nadav Rotem
401bba05fe Add the Haswell machine model.
llvm-svn: 178301
2013-03-28 22:34:46 +00:00
Nadav Rotem
e5f49b65f6 Remove the unused port from the SandyBridge machine model
llvm-svn: 178300
2013-03-28 22:32:41 +00:00
Michael Liao
30577169a4 Add ADX CPUID detection
llvm-svn: 178299
2013-03-28 22:29:53 +00:00
Timur Iskhodzhanov
d7de83d51c Make Win32 put the SRet address into EAX, fixes PR15556
llvm-svn: 178291
2013-03-28 21:30:04 +00:00
Preston Gurd
787c145b5f This patch follows is a follow up to r178171, which uses the register
form of call in preference to memory indirect on Atom.

In this case, the patch applies the optimization to the code for reloading
spilled registers.

The patch also includes changes to sibcall.ll and movgs.ll, which were
failing on the Atom buildbot after the first patch was applied.

This patch by Sriram Murali.

llvm-svn: 178193
2013-03-27 23:16:18 +00:00
Chad Rosier
09bc7a9c8d [ms-inline asm] Add support of imm displacement before bracketed memory
expression.  Specifically, this syntax:

 ImmDisp [ BaseReg + Scale*IndexReg + Disp ] 

We don't currently support:

 ImmDisp [ Symbol ]

rdar://13518671

llvm-svn: 178186
2013-03-27 21:49:56 +00:00
Preston Gurd
b6ed645cb6 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Hal Finkel
ba08d9519e Fix typo (common to both X86 and PPC)
Thanks to Bill Schmidt for pointing this out during code review!

llvm-svn: 178170
2013-03-27 19:10:42 +00:00
Michael Liao
bd3f6b0eea Add XTEST codegen support
llvm-svn: 178083
2013-03-26 22:47:01 +00:00
Michael Liao
3515920fbd Add HLE target feature
llvm-svn: 178082
2013-03-26 22:46:02 +00:00
Jakob Stoklund Olesen
43b68b7eb9 Enable SandyBridgeModel for all modern Intel P6 descendants.
All Intel CPUs since Yonah look a lot alike, at least at the granularity
of the scheduling models. We can add more accurate models for
processors that aren't Sandy Bridge if required. Haswell will probably
need its own.

The Atom processor and anything based on NetBurst is completely
different. So are the non-Intel chips.

llvm-svn: 178080
2013-03-26 22:19:12 +00:00
Jakob Stoklund Olesen
b728975493 Annotate the remaining x86 instructions with SchedRW lists.
Now all x86 instructions that have itinerary classes also have SchedRW
lists. This is required before the new scheduling models can be used.

There are still unannotated instructions remaining, but they don't have
itinerary classes either.

llvm-svn: 178051
2013-03-26 18:24:22 +00:00
Jakob Stoklund Olesen
b7896f4f21 Annotate x87 and mmx instructions with SchedRW lists.
This only covers the instructions that were given itinerary classes for
the Atom model.

llvm-svn: 178050
2013-03-26 18:24:20 +00:00
Jakob Stoklund Olesen
68af421d67 Annotate control instructions with SchedRW lists.
This could definitely be more granular. I am not sure if it makes a
difference.

llvm-svn: 178049
2013-03-26 18:24:17 +00:00
Jakob Stoklund Olesen
9e6a9659f1 Annotate the rest of X86InstrInfo.td with SchedRW lists.
llvm-svn: 178048
2013-03-26 18:24:15 +00:00
Michael Liao
969ef73c31 Add PREFETCHW codegen support
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
2013-03-26 17:47:11 +00:00
Michael Liao
a0a4d0c6f7 Revise alignment checking/calculation on 256-bit unaligned memory access
- It's still considered aligned when the specified alignment is larger
  than the natural alignment;
- The new alignment for the high 128-bit vector should be min(16,
  alignment) as the pointer is advanced by 16, a power-of-2 offset.

llvm-svn: 177947
2013-03-25 23:50:10 +00:00
Jakob Stoklund Olesen
d323c87f9a Add a scheduling model for Intel Sandy Bridge microarchitecture.
The model isn't hooked up by this patch because the instruction set
isn't fully annotated yet.

llvm-svn: 177942
2013-03-25 23:37:17 +00:00
Jakob Stoklund Olesen
0ac9ff5688 Remove IIC_DEFAULT from X86Schedule.td
All the instructions tagged with IIC_DEFAULT had nothing in common, and
we already have a NoItineraries class to represent untagged
instructions.

llvm-svn: 177937
2013-03-25 23:12:41 +00:00
Jakob Stoklund Olesen
19c4788c8a Annotate X86InstrCompiler.td with SchedRW lists.
llvm-svn: 177936
2013-03-25 23:07:35 +00:00
Jakob Stoklund Olesen
b81c63e1a2 Annotate shifts and rotates with SchedRW lists.
llvm-svn: 177935
2013-03-25 23:07:32 +00:00
NAKAMURA Takumi
eeb13ad532 X86DisassemblerDecoder.c: Make this C89-compliant.
llvm-svn: 177910
2013-03-25 20:55:49 +00:00
NAKAMURA Takumi
ce709c8119 Whitespace.
llvm-svn: 177909
2013-03-25 20:55:43 +00:00
Dave Zarzycki
2949721b7d x86 -- add the XTEST instruction
llvm-svn: 177888
2013-03-25 18:59:43 +00:00
Dave Zarzycki
b443618b88 x86 -- disassemble the REP/REPNE prefix when needed
This fixes Apple bug: 13493622

llvm-svn: 177887
2013-03-25 18:59:38 +00:00
Jakob Stoklund Olesen
6ffb4136aa Add a WriteMicrocoded for ancient microcoded instructions.
llvm-svn: 177611
2013-03-21 00:07:17 +00:00
Jakob Stoklund Olesen
b3af273625 Model prefetches and barriers as loads.
It's not yet clear if these instructions need a more careful model.

llvm-svn: 177599
2013-03-20 23:09:53 +00:00
Jakob Stoklund Olesen
042f102514 Add a catch-all WriteSystem SchedWrite type.
This is used for all the expensive system instructions.

llvm-svn: 177598
2013-03-20 23:09:50 +00:00
Jakob Stoklund Olesen
305e22bdab Annotate the remaining SSE MOV instructions.
llvm-svn: 177592
2013-03-20 22:37:16 +00:00
Jakob Stoklund Olesen
96a403dd67 Annotate SSE horizontal and integer instructions.
llvm-svn: 177591
2013-03-20 22:37:13 +00:00
Michael Liao
fe785c9579 Correct cost model for vector shift on AVX2
- After moving logic recognizing vector shift with scalar amount from
  DAG combining into DAG lowering, we declare to customize all vector
  shifts even vector shift on AVX is legal. As a result, the cost model
  needs special tuning to identify these legal cases.

llvm-svn: 177586
2013-03-20 22:01:10 +00:00
Jakob Stoklund Olesen
d202f35d06 Add some missing SSE annotations.
llvm-svn: 177540
2013-03-20 16:56:39 +00:00
Jakob Stoklund Olesen
327db7c83a Annotate remaining IIC_BIN_* instructions.
llvm-svn: 177539
2013-03-20 16:56:36 +00:00
Michael Liao
d0e167edfb Fix PR15296
- Move SRA/SRL/SHL lowering support from DAG combination to DAG lowering
  to support extended 256-bit integer in AVX but not AVX2.

llvm-svn: 177478
2013-03-20 02:33:21 +00:00
Michael Liao
8be4fbefe3 Mark all variable shifts needing customizing
- Prepare moving logic from DAG combining into DAG lowering. There's no
  functionality change.

llvm-svn: 177477
2013-03-20 02:28:20 +00:00
Michael Liao
3b72fc2823 Move scalar immediate shift lowering into a dedicated func
- no functionality change

llvm-svn: 177476
2013-03-20 02:20:36 +00:00
Jakob Stoklund Olesen
3b039fa614 Annotate various null idioms with SchedRW lists.
llvm-svn: 177461
2013-03-19 23:23:31 +00:00
Jakob Stoklund Olesen
a8c3f3d12c Annotate SSE float conversions with SchedRW lists.
llvm-svn: 177460
2013-03-19 23:23:29 +00:00
Jakob Stoklund Olesen
c4c0f667dc Annotate X86InstrCMovSetCC.td with SchedRW lists.
llvm-svn: 177459
2013-03-19 23:23:26 +00:00
Chad Rosier
3019e30cac [ms-inline asm] Move the immediate asm rewrite into the target specific
logic as a QOI cleanup.  No functional change.  Tests already in place.
rdar://13456414

llvm-svn: 177446
2013-03-19 21:58:18 +00:00
Jakob Stoklund Olesen
71393fdd98 Annotate X86InstrCompiler.td with SchedRW lists.
Add a new WriteZero SchedWrite type for the common dependency-breaking
instructions that clear a register.

llvm-svn: 177442
2013-03-19 21:16:56 +00:00
Chad Rosier
d2b8daa7f4 [ms-inline asm] Create a helper function, CreateMemForInlineAsm, that creates
an X86Operand, but also performs a Sema lookup and adds the sizing directive
when appropriate.  Use this when parsing a bracketed statement.  This is
necessary to get the instruction matching correct as well.  Test case coming
on clang side.
rdar://13455408

llvm-svn: 177439
2013-03-19 21:11:56 +00:00
Ulrich Weigand
4570892d73 Remove an invalid and unnecessary Pat pattern from the X86 backend:
def : Pat<(load (i64 (X86Wrapper tglobaltlsaddr :$dst))),
            (MOV64rm tglobaltlsaddr :$dst)>;

This pattern is invalid because the MOV64rm instruction expects a
source operand of type "i64mem", which is a subclass of X86MemOperand
and thus actually consists of five MI operands, but the Pat provides
only a single MI operand ("tglobaltlsaddr" matches an SDnode of
type ISD::TargetGlobalTLSAddress and provides a single output).

Thus, if the pattern were ever matched, subsequent uses of the MOV64rm
instruction pattern would access uninitialized memory.  In addition,
with the TableGen patch I'm about to check in, this would actually be
reported as a build-time error.

Fortunately, the pattern does in fact never match, for at least two
independent reasons.

First, the code generator actually never generates a pattern of the
form (load (X86Wrapper (tglobaltlsaddr))).  For most combinations of
TLS and code models, (tglobaltlsaddr) represents just an offset that
needs to be added to some base register, so it is never directly
dereferenced.  The only exception is the initial-exec model, where
(tglobaltlsaddr) refers to the (pc-relative) address of a GOT slot,
which *is* in fact directly dereferenced: but in that case, the
X86WrapperRIP node is used, not X86Wrapper, so the Pat doesn't match.

Second, even if some patterns along those lines *were* ever generated,
we should not need an extra Pat pattern to match it.  Instead, the
original MOV64rm instruction pattern ought to match directly, since
it uses an "addr" operand, which is implemented via the SelectAddr
C++ routine; this routine is supposed to accept the full range of
input DAGs that may be implemented by a single mov instruction,
including those cases involving ISD::TargetGlobalTLSAddress (and
actually does so e.g. in the initial-exec case as above).

To avoid build breaks (due to the above-mentioned error) after the
TableGen patch is checked in, I'm removing this Pat here.

llvm-svn: 177426
2013-03-19 19:49:52 +00:00
Nadav Rotem
317ff20b46 Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>

llvm-svn: 177421
2013-03-19 18:38:27 +00:00
Jakob Stoklund Olesen
0a45a14a8b Annotate X86InstrExtension.td with SchedRW lists.
llvm-svn: 177418
2013-03-19 18:03:58 +00:00
Jakob Stoklund Olesen
a56394519c Annotate a lot of X86InstrInfo.td with SchedRW lists.
llvm-svn: 177417
2013-03-19 18:03:55 +00:00
Chad Rosier
758abf9ba0 [ms-inline asm] Move the size directive asm rewrite into the target specific
logic as a QOI cleanup.
rdar://13445327

llvm-svn: 177413
2013-03-19 17:32:17 +00:00
Chad Rosier
a43a646d71 [ms-inline asm] Avoid emitting a redundant sizing directive, if we've already
parsed one.  Test case coming shortly.
rdar://13446980

llvm-svn: 177347
2013-03-18 23:31:24 +00:00
Jakob Stoklund Olesen
2d375df1d8 Add SchedRW annotations to most of X86InstrSSE.td.
We hitch a ride with the existing OpndItins class that was used to add
instruction itinerary classes in the many multiclasses in this file.

Use the link provided by the X86FoldableSchedWrite.Folded to find the
right SchedWrite for folded loads.

llvm-svn: 177326
2013-03-18 22:01:35 +00:00
Jakob Stoklund Olesen
59db94a2dc Annotate X86 arithmetic instructions with SchedRW lists.
This new-style scheduling information is going to replace the
instruction iteneraries.

This also serves as a test case for Andy's fix in r177317.

llvm-svn: 177323
2013-03-18 21:32:39 +00:00
Anton Korobeynikov
97423324fa TLS support for MinGW targets.
MinGW is almost completely compatible to MSVC, with the exception of the _tls_array global not being available.

Patch by David Nadlinger!

llvm-svn: 177257
2013-03-18 08:12:28 +00:00
Craig Topper
bbd7402f42 Post process ADC/SBB and use a shorter encoding if they use a sign extended immediate.
llvm-svn: 177243
2013-03-18 03:34:55 +00:00
Craig Topper
5b425a9de6 Refactor some duplicated code into helper functions.
llvm-svn: 177242
2013-03-18 02:53:34 +00:00
Craig Topper
bcf2bc336a Add X86 code emitter support AVX encoded MRMDestReg instructions.
Previously we weren't skipping the VVVV encoded register. Based on patch by Michael Liao.

llvm-svn: 177221
2013-03-16 03:44:31 +00:00
Jakob Stoklund Olesen
da755c012d Define more SchedWrites for annotating X86 instructions.
Since almost all X86 instructions can fold loads, use a multiclass to
define register/memory pairs of SchedWrites.

An X86FoldableSchedWrite represents the register version of an
instruction. It holds a reference to the SchedWrite to use when the
instruction folds a load.

This will be used inside multiclasses that define rr and rm instruction
versions together.

llvm-svn: 177210
2013-03-16 00:02:17 +00:00
Eric Christopher
867ab49c5e Silence anonymous type in anonymous union warnings.
llvm-svn: 177135
2013-03-15 00:42:55 +00:00
Nadav Rotem
03b60b8657 Unaligned loads should use the VMOVUPS opcode.
llvm-svn: 177130
2013-03-14 23:49:44 +00:00
Jakob Stoklund Olesen
4d174fdcd0 Prepare for adding InstrSchedModel annotations to X86 instructions.
The new InstrSchedModel is easier to use than the instruction
itineraries. It will be used to model instruction latency and throughput
in modern Intel microarchitectures like Sandy Bridge.

InstrSchedModel should be able to coexist with instruction itinerary
classes, but for cleanliness we should switch the Atom processor model
to the new InstrSchedModel as well.

llvm-svn: 177122
2013-03-14 22:42:17 +00:00
Chad Rosier
aca0d2a5a0 [fast-isel] The X86FastISel::FastLowerArguments function doesn't properly handle
the win64 calling convention.
rdar://13423768

llvm-svn: 177113
2013-03-14 21:25:04 +00:00
Craig Topper
ca6b32ebe1 Fix the name of a variable to match its declaration. Fixes build failure from r177014.
llvm-svn: 177015
2013-03-14 07:47:43 +00:00
Craig Topper
29d0a365f1 Fix a bug in the calculation of the VEX.B bit for FMA4 rr with the VEX.W bit set. The VEX.B was being calculated from the wrong operand. Fixes at least some portion of PR14185.
llvm-svn: 177014
2013-03-14 07:40:52 +00:00
Craig Topper
a7c401d655 Teach X86 MC instruction lowering that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not. This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior.
llvm-svn: 177011
2013-03-14 07:09:57 +00:00
Michael Liao
89d165e673 Fix PR15309
- Fix the typo on type checking

llvm-svn: 177010
2013-03-14 06:57:42 +00:00
Kevin Enderby
319d315d55 Fixes disassembler crashes on 2013 Haswell RTM instructions.
rdar://13318048

llvm-svn: 176828
2013-03-11 21:17:13 +00:00
Tom Stellard
fa72758e1d DAGCombiner: Use correct value type for checking legality of BR_CC v3
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.

v2:
  - Expand more BR_CC value types for NVPTX

v3:
  - Expand correct BR_CC value types for Hexagon, Mips, and XCore.

llvm-svn: 176694
2013-03-08 15:36:57 +00:00
Benjamin Kramer
d2f85ae895 X86: Fold EXTRACT_SUBVECTORs of a BUILD_VECTOR into a smaller BUILD_VECTOR.
That can usually be lowered efficiently and is common in sandybridge code.
It would be nice to do this in DAGCombiner but we can't insert arbitrary
BUILD_VECTORs this late.

Fixes PR15462.

llvm-svn: 176634
2013-03-07 18:48:40 +00:00
Michael Liao
32f3aca77c Fix two remaining issue after fixing PR15355 when CMOV is not available
- Phi nodes should be replaced/updated after lowering CMOV into branch
  because 'mainMBB' updating operand in Phi node is changed.
- Add EFLAGS in livein before lowering the 2nd CMOV. It's necessary as
  we will reuse the EFLAGS generated before the 1st lowered CMOV, which
  won't clobber EFLAGS. However, we need explicitly specify that.
- '-attr=-cmov' test case are added.

llvm-svn: 176598
2013-03-07 01:01:29 +00:00
Michael Liao
5859ab0234 Fix PR15355
- Clear 'mayStore' flag when loading from the atomic variable before the
  spin loop
- Clear kill flag from one use to multiple use in registers forming the
  address to that atomic variable
- don't use a physical register as live-in register in BB (neither entry
  nor landing pad.) by copying it into virtual register

(patch by Cameron Zwarich)

llvm-svn: 176538
2013-03-06 00:17:04 +00:00
David Sehr
e37f7ab590 The current X86 NOP padding uses one long NOP followed by the remainder in
one-byte NOPs.  If the processor actually executes those NOPs, as it sometimes
does with aligned bundling, this can have a performance impact.  From my
micro-benchmarks run on my one machine, a 15-byte NOP followed by twelve
one-byte NOPs is about 20% worse than a 15 followed by a 12.  This patch
changes NOP emission to emit as many 15-byte (the maximum) as possible followed
by at most one shorter NOP.

llvm-svn: 176464
2013-03-05 00:02:23 +00:00
Preston Gurd
66b9c4fcf9 Bypass Slow Divides
* Only apply divide bypass optimization when not optimizing for size. 
* Fixed bug caused by constant for 0 value of type Int32,
  used dividend type to generate the constant instead.
* For atom x86-64 apply the divide bypass to use 16-bit divides instead of
  64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.

Patch by Tyler Nowicki!

llvm-svn: 176442
2013-03-04 18:13:57 +00:00
Arnold Schwaighofer
e60e6fc70f X86 cost model: Adjust cost for custom lowered vector multiplies
This matters for example in following matrix multiply:

int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
  int i, j, k, val;
  for (i=0; i<rows; i++) {
    for (j=0; j<cols; j++) {
      val = 0;
      for (k=0; k<cols; k++) {
        val += m1[i][k] * m2[k][j];
      }
      m3[i][j] = val;
    }
  }
  return(m3);
}

Taken from the test-suite benchmark Shootout.

We estimate the cost of the multiply to be 2 while we generate 9 instructions
for it and end up being quite a bit slower than the scalar version (48% on my
machine).

Also, properly differentiate between avx1 and avx2. On avx-1 we still split the
vector into 2 128bits and handle the subvector muls like above with 9
instructions.
Only on avx-2 will we have a cost of 9 for v4i64.

I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an
add instead of a mul because with a mul we now no longer vectorize. I did
verify that the mul would be indeed more expensive when vectorized with 3
kernels:

for (i ...)
   r += a[i] * 3;
for (i ...)
  m1[i] = m1[i] * 3; // This matches the test case in avx1.ll
and a matrix multiply.

In each case the vectorized version was considerably slower.

radar://13304919

llvm-svn: 176403
2013-03-02 04:02:52 +00:00
Michael Liao
1e621fbd2f Fix PR10475
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
  but TLI.getShiftAmountTy() so far only return scalar type. As a
  result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
  TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
  return target-specificed scalar type or the same vector type as the
  1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
  type.

llvm-svn: 176364
2013-03-01 18:40:30 +00:00
Duncan Sands
268f52a52f GCC thinks that this variable might be used uninitialized (it isn't).
llvm-svn: 176341
2013-03-01 09:46:03 +00:00
Yiannis Tsiouris
b2a123a008 Re-format comments (and check commit access)
llvm-svn: 176270
2013-02-28 16:59:10 +00:00
Nadav Rotem
6489b3c2b7 Revert r176166 because it broke one of the lit tests.
llvm-svn: 176171
2013-02-27 05:56:20 +00:00
Nadav Rotem
c64a5a0435 std::string to StringRef.
llvm-svn: 176166
2013-02-27 05:23:56 +00:00
Chad Rosier
5d41dc7229 [fast-isel] Make sure the FastLowerArguments function checks to make sure the
arguments type is a simple type.
rdar://13290455

llvm-svn: 176066
2013-02-26 01:05:31 +00:00
Michael Liao
ad0b9ecc47 Refine fix to PR10499, no functionality change
- Put expensive checking after simple one

llvm-svn: 176060
2013-02-25 23:16:36 +00:00
Michael Liao
ff7d7ec88b Fix PR10499
- Check whether SSE is available before lowering all 1s vector building with
  PCMPEQD, which is only available from SSE2

llvm-svn: 176058
2013-02-25 23:01:03 +00:00
Chad Rosier
37142b6930 [fast-isel] Add X86FastIsel::FastLowerArguments to handle functions with 6 or
fewer scalar integer (i32 or i64) arguments. It completely eliminates the need
for SDISel for trivial functions.

Also, add the new llc -fast-isel-abort-args option, which is similar to
-fast-isel-abort option, but for formal argument lowering.

llvm-svn: 176052
2013-02-25 21:59:35 +00:00
Chad Rosier
77e46d6eb6 [ms-inline asm] Add support for the pushad/popad mnemonics.
rdar://13254235

llvm-svn: 176036
2013-02-25 19:06:27 +00:00
Nadav Rotem
0740239f87 Revert r169638 because it broke Mesa llvmpipe tests.
Fix PR15239.

llvm-svn: 175985
2013-02-24 07:09:35 +00:00
Benjamin Kramer
bdb1d9aad3 X86: Disable cmov-memory patterns on subtargets without cmov.
Fixes PR15115.

llvm-svn: 175962
2013-02-23 10:40:58 +00:00
Peter Collingbourne
7dc1ee08f5 x86_64: designate most general purpose and SSE registers as callee save under coldcc
llvm-svn: 175911
2013-02-22 19:19:44 +00:00
Eli Bendersky
37f247b8d8 Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.

There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.

The refactoring was OK'd by Anton Korobeynikov on llvmdev.

Note: this touches the target interfaces, so out-of-tree targets may
be affected.

llvm-svn: 175788
2013-02-21 20:05:00 +00:00
Eli Bendersky
79457c8f90 getX86SubSuperRegister has a special mode with High=true for i64 which
exists solely to enable it to call itself for i8 with some registers.
The proposed patch simplifies the function somewhat to make the High
bit only meaningful for the i8 mode, which makes sense. No functional
difference (getX86SubSuperRegister is not getting called from anywhere
outside with i64 and High=true).

llvm-svn: 175762
2013-02-21 16:40:18 +00:00
Jim Grosbach
89c0252c2a MCParser: Update method names per coding guidelines.
s/AddDirectiveHandler/addDirectiveHandler/
s/ParseMSInlineAsm/parseMSInlineAsm/
s/ParseIdentifier/parseIdentifier/
s/ParseStringToEndOfStatement/parseStringToEndOfStatement/
s/ParseEscapedString/parseEscapedString/
s/EatToEndOfStatement/eatToEndOfStatement/
s/ParseExpression/parseExpression/
s/ParseParenExpression/parseParenExpression/
s/ParseAbsoluteExpression/parseAbsoluteExpression/
s/CheckForValidSection/checkForValidSection/

http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

No functional change intended.

llvm-svn: 175675
2013-02-20 22:21:35 +00:00
Jim Grosbach
233487d8a2 Update TargetLowering ivars for name policy.
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

ivars should be camel-case and start with an upper-case letter. A few in
TargetLowering were starting with a lower-case letter.

No functional change intended.

llvm-svn: 175667
2013-02-20 21:13:59 +00:00
Chad Rosier
f46d46cb82 [ms-inline asm] Make the comment a bit more verbose.
llvm-svn: 175641
2013-02-20 18:03:44 +00:00
Elena Demikhovsky
0886fb4d55 I optimized the following patterns:
sext <4 x i1> to <4 x i64>
 sext <4 x i8> to <4 x i64>
 sext <4 x i16> to <4 x i64>
 
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
 (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
 
 The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
 The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution.

I also added a cost of this operations to the AVX costs table.

llvm-svn: 175619
2013-02-20 12:42:54 +00:00
Chad Rosier
2be41be7b9 [ms-inline asm] Force the use of a base pointer if the MachineFunction includes
MS-style inline assembly.

This is a follow-on to r175334.  Forcing a FP to be emitted doesn't ensure it
will be used.  Therefore, force the base pointer as well.  We now treat MS
inline assembly in the same way we treat functions with dynamic stack
realignment and VLAs.  This guarantees the BP will be used to reference 
parameters and locals.
rdar://13218191

llvm-svn: 175576
2013-02-19 23:50:45 +00:00
Jakub Staszak
8aa8fcbbf4 Add obvious constantness.
llvm-svn: 175560
2013-02-19 21:54:59 +00:00
Benjamin Kramer
e94a9e6d14 Clean up HiPE prologue emission a bit and avoid signed arithmetic tricks.
No intended functionality change.

llvm-svn: 175536
2013-02-19 17:32:57 +00:00
Rafael Espindola
2e18dbb394 Move LLVM_LIBRARY_VISIBILITY for consistency with what was done to
PPCJITInfo.cpp in r175394.

llvm-svn: 175531
2013-02-19 17:14:33 +00:00
Eli Bendersky
cf980445ac Make pass name more precise and fix comment.
llvm-svn: 175525
2013-02-19 16:38:32 +00:00
Craig Topper
6b083f50d1 Fix capitalization in comment to match function name.
llvm-svn: 175497
2013-02-19 07:43:59 +00:00
Jakub Staszak
201dd59b34 Use array_pod_sort instead of std::sort.
llvm-svn: 175472
2013-02-18 23:18:22 +00:00
NAKAMURA Takumi
fd77684daa X86FrameLowering.cpp: Fixup. Sorry for the breakage.
llvm-svn: 175467
2013-02-18 23:15:21 +00:00
NAKAMURA Takumi
ea78a0dbe1 X86FrameLowering.cpp: Fix a warning in -Asserts. [-Wunused-variable]
llvm-svn: 175464
2013-02-18 23:08:49 +00:00
Chad Rosier
0914ae8db3 Remove a useless assert.
llvm-svn: 175463
2013-02-18 22:20:16 +00:00
Benjamin Kramer
f776243f47 Fix a 32/64 bit incompatibility in the HiPE prologue generation.
llvm-svn: 175458
2013-02-18 21:45:01 +00:00
Benjamin Kramer
462b555ebe Support for HiPE-compatible code emission, patch by Yiannis Tsiouris.
llvm-svn: 175457
2013-02-18 20:55:12 +00:00
Benjamin Kramer
4ee689feed X86: Add a note.
llvm-svn: 175408
2013-02-17 23:34:14 +00:00
Jakub Staszak
933b3fda5f Return false instead of 0.
llvm-svn: 175402
2013-02-17 18:35:25 +00:00
NAKAMURA Takumi
355a81a32e [msvc x64] Update X86CompilationCallback_Win64.asm corresponding to r175267.
llvm-svn: 175363
2013-02-16 16:04:29 +00:00
Jakub Staszak
0b70c7ab00 Minor cleanups. No functionality change.
llvm-svn: 175359
2013-02-16 13:34:26 +00:00