1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

6795 Commits

Author SHA1 Message Date
Rafael Espindola
011e168728 Revert my previous patch to make the valgrind bots happy.
llvm-svn: 121461
2010-12-10 04:01:09 +00:00
Nate Begeman
8c00ecd290 Add some missing predicates.
llvm-svn: 121445
2010-12-10 00:54:26 +00:00
Nate Begeman
cb6d1c8193 Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Rafael Espindola
03ad1e8f1f Initial support for the cfi directives. This is just enough to get
f:
        .cfi_startproc
        nop
        .cfi_endproc

assembled (on ELF).

llvm-svn: 121434
2010-12-09 23:48:29 +00:00
Nate Begeman
4a62a3e229 Add support for AVX to materialize +0.0 when doing scalar FP.
llvm-svn: 121415
2010-12-09 21:43:51 +00:00
Eric Christopher
0e40452eb0 Rewrite the darwin tlv support to use a chain and return to copying
the output to the correct register. Fixes a hidden problem uncovered
by the last patch where we'd try to DAG combine our MVT::Other node
oddly.

llvm-svn: 121358
2010-12-09 06:25:53 +00:00
Eric Christopher
64e662fce9 Stop confusing people, it's not really a chain, or a tumor.
llvm-svn: 121340
2010-12-09 00:57:19 +00:00
Eric Christopher
0100a8fda4 Remove extraneous copy from DAG conversion for darwin tls. This was
popping up at O0 when it wasn't folded and the fast allocator would
complain.

llvm-svn: 121330
2010-12-09 00:27:58 +00:00
Eric Christopher
cc8a622ca4 Add rsp to the uses for the same reason as 32-bit.
llvm-svn: 121328
2010-12-09 00:26:41 +00:00
Kevin Enderby
988dab6b5c Allow a slash, '/', as a prefix separator for X86. rdar://8741045
llvm-svn: 121320
2010-12-08 23:57:59 +00:00
NAKAMURA Takumi
7947c9770a lib/Target/X86/X86MCAsmInfo.cpp: [PR8741] On Win64, specify explicit PrivateGlobalPrefix as ".L".
Or, global symbols @Lxxxx might be treated as temporal symbol by MCSymbol.

llvm-svn: 121103
2010-12-07 02:43:45 +00:00
Rafael Espindola
65c25aef87 Remove the instruction fragment to data fragment lowering since it was causing
freed data to be read. I will open a bug to track it being reenabled.

llvm-svn: 121028
2010-12-06 19:08:48 +00:00
Rafael Espindola
3e954d16f4 Second try at making direct object emission produce the same results
as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
bootstrap on darwin10 using darwin9's assembler and linker.

llvm-svn: 121006
2010-12-06 17:27:56 +00:00
Chris Lattner
e30adfb732 Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags
result.  This allows us to compile:

void *test12(long count) {
      return new int[count];
}

into:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	movq	$-1, %rdi
	cmovnoq	%rax, %rdi
	jmp	__Znam                  ## TAILCALL

instead of:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	seto	%cl
	testb	%cl, %cl
	movq	$-1, %rdi
	cmoveq	%rax, %rdi
	jmp	__Znam

Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
which would eliminate the need for the 'movq %rdi, %rax'.

llvm-svn: 120936
2010-12-05 07:49:54 +00:00
Chris Lattner
76601e7a99 it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.

llvm-svn: 120935
2010-12-05 07:30:36 +00:00
Chris Lattner
16bafb2414 generalize the previous check to handle -1 on either side of the
select, inserting a not to compensate.  Add a missing isZero check
that I lost somehow.

This improves codegen of:

void *func(long count) {
      return new int[count];
}

from:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]

to:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]

llvm-svn: 120932
2010-12-05 02:00:51 +00:00
Chris Lattner
474ed0aa9b Improve an integer select optimization in two ways:
1. generalize 
    (select (x == 0), -1, 0) -> (sign_bit (x - 1))
to:
    (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y

2. Handle the identical pattern that happens with !=:
   (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y

cmov is often high latency and can't fold immediates or
memory operands.  For example for (x == 0) ? -1 : 1, before 
we got:

< 	testb	%sil, %sil
< 	movl	$-1, %ecx
< 	movl	$1, %eax
< 	cmovel	%ecx, %eax

now we get:

> 	cmpb	$1, %sil
> 	sbbl	%eax, %eax
> 	orl	$1, %eax

llvm-svn: 120929
2010-12-05 01:23:24 +00:00
Bill Wendling
2b53c0830d Initialize HasPOPCNT.
llvm-svn: 120923
2010-12-04 23:57:24 +00:00
Benjamin Kramer
851691ddb2 Add patterns for the x86 popcnt instruction.
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
  supports SSE4.2 (nehalem) or SSE4A (barcelona).

llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Benjamin Kramer
77faee6ba1 Simplify code. No functionality change.
llvm-svn: 120907
2010-12-04 14:22:24 +00:00
Rafael Espindola
9215947c83 There are two reasons why we might want to use
foo = a - b
.long foo
instead of just
.long a - b

First, on darwin9 64 bits the assembler produces the wrong result. Second,
if "a" is the end of the section all darwin assemblers (9, 10 and mc) will not
consider a - b to be a constant but will if the dummy foo is created.

Split how we handle these cases. The first one is something MC should take care
of. The second one has to be handled by the caller.

llvm-svn: 120889
2010-12-04 03:21:47 +00:00
Nate Begeman
d4310b6d7c Revert this change since it breaks a couple of the AVX tests.
I'm unclear if the tests are actually correct or not, but reverting for now.

llvm-svn: 120847
2010-12-03 22:29:15 +00:00
Nate Begeman
deb26223bd Scalar f32/f64 are also subregs of ymm regs
llvm-svn: 120844
2010-12-03 21:54:39 +00:00
Nate Begeman
3911dcfd71 Remove SSE1-4 disable when AVX is enabled. While this may be useful for development,
it completely breaks scalar fp in xmm regs when AVX is enabled.

llvm-svn: 120843
2010-12-03 21:54:14 +00:00
Devang Patel
409a5ff824 Revert r120580.
llvm-svn: 120630
2010-12-02 00:22:29 +00:00
Evan Cheng
4118b24aca Fix and re-enable tail call optimization of expanded libcalls.
llvm-svn: 120622
2010-12-01 22:59:46 +00:00
Devang Patel
e68cb5a5cf Disable debug info for x86-darwin9 and earlier until PR 8715 and radar 8709290 are fixed.
llvm-svn: 120580
2010-12-01 16:59:34 +00:00
Duncan Sands
cd4f56b8e2 I don't think it makes any sense to assert that the target supports SSE3 here.
The user (i.e. whoever generated a call to the intrinsic in the first place) is
essentially asking for a particular instruction to be placed in the assembler.
If that instruction won't execute on the target machine, that's their problem
not ours.  Two buildbots with processors that don't support SSE3 were barfing
on the apm.ll test in CodeGen/X86 because of this assertion.

llvm-svn: 120574
2010-12-01 12:58:13 +00:00
Evan Cheng
84162760b7 Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot.
llvm-svn: 120549
2010-12-01 03:27:20 +00:00
Evan Cheng
f7e586d749 Enable sibling call optimization of libcalls which are expanded during
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777

llvm-svn: 120501
2010-11-30 23:55:39 +00:00
Eric Christopher
3a1c712e47 Move X86InstrFPStack.td over to PseudoI as well.
llvm-svn: 120470
2010-11-30 21:57:32 +00:00
Eric Christopher
b15c993a73 Migrate X86InstrControl.td to use PseudoI and fix a couple of 80-col violations
while I'm in there.

llvm-svn: 120466
2010-11-30 21:37:36 +00:00
Eric Christopher
1a99e7ebdb Fix some grammar in comments I noticed.
llvm-svn: 120416
2010-11-30 09:11:54 +00:00
Eric Christopher
d8e045d29e This defaults to GenericDomain.
llvm-svn: 120415
2010-11-30 09:11:07 +00:00
Eric Christopher
6a21ceab5c Implement a PseudoI class and transfer the sse instructions over to use
it.

llvm-svn: 120412
2010-11-30 08:57:23 +00:00
Eric Christopher
73365ae8b6 Fix insertion point in pcmp expander.
While I'm there, clean up too many \n even for me.

llvm-svn: 120411
2010-11-30 08:20:21 +00:00
Eric Christopher
2170738538 Fix some cleanups from my last patch.
llvm-svn: 120410
2010-11-30 08:10:28 +00:00
Eric Christopher
f27f0b5234 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.

llvm-svn: 120404
2010-11-30 07:20:12 +00:00
Michael J. Spencer
d5ec932c3a Merge System into Support.
llvm-svn: 120298
2010-11-29 18:16:10 +00:00
Rafael Espindola
9287c4b38f Move lowering of TLS_addr32 and TLS_addr64 to X86MCInstLower.
llvm-svn: 120263
2010-11-28 21:16:39 +00:00
Chris Lattner
a11c5c5c32 fix PR8686, accepting a 'b' suffix at the end of all the setcc
instructions.  I choose to handle this with an asmparser hack,
though it could be handled by changing all the instruction definitions
to allow be "setneb" instead of "setne".  The asm parser hack is
better in this case, because we want the disassembler to produce
setne, not setneb.

llvm-svn: 120260
2010-11-28 20:23:50 +00:00
Rafael Espindola
f77005db2b Define generic 1, 2 and 4 byte pc relative relocations. They are common
and at least the 4 byte one will be needed to implement the .cfi_* directives.

llvm-svn: 120240
2010-11-28 14:17:56 +00:00
Anton Korobeynikov
598465c605 Move more PEI-related hooks to TFI
llvm-svn: 120229
2010-11-27 23:05:25 +00:00
Anton Korobeynikov
c87f68e32e Move callee-saved regs spills / reloads to TFI
llvm-svn: 120228
2010-11-27 23:05:03 +00:00
Rafael Espindola
45cd9713f2 Lower TLS_addr32 and TLS_addr64.
llvm-svn: 120225
2010-11-27 20:43:02 +00:00
Rafael Espindola
6f5680f4a0 Implement the data16 prefix.
llvm-svn: 120224
2010-11-27 20:29:45 +00:00
Daniel Dunbar
bcc15fb05d MC/Mach-O: Switch to using MachOFormat.h.
- I'm leaving MachO.h, because I believe it has external consumers, but I would really like to eliminate it (we have stylistic disagreements with one another).

llvm-svn: 120187
2010-11-27 04:38:36 +00:00
Rafael Espindola
0641196380 Remove the unused TheTarget member.
llvm-svn: 120168
2010-11-26 04:24:21 +00:00
Rafael Espindola
e3dc1c951c Use multiple 0x66 prefixes so that all nops up to 15 bytes are a single instruction.
llvm-svn: 120147
2010-11-25 17:14:16 +00:00
Rafael Espindola
72c8de703d Implement the rex64 prefix.
llvm-svn: 120017
2010-11-23 11:23:24 +00:00