1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:42:57 +01:00
Commit Graph

12057 Commits

Author SHA1 Message Date
Rafael Espindola
c97d642bf7 Relax address updates in the eh_frame section.
llvm-svn: 122591
2010-12-28 05:39:27 +00:00
Rafael Espindola
0552cb0638 Start adding basic support for emitting the call frame instructions.
llvm-svn: 122590
2010-12-28 04:15:37 +00:00
Rafael Espindola
12c30aed07 Add support for .cfi_lsda.
llvm-svn: 122584
2010-12-27 15:56:22 +00:00
Daniel Dunbar
2d0cf8e149 MC/Mach-O/Thumb: Select appropriate relocation types for Thumb.
llvm-svn: 122583
2010-12-27 14:49:49 +00:00
Rafael Espindola
7f947794d7 Handle reloc_riprel_4byte_movq_load. Should make the bots happy.
llvm-svn: 122579
2010-12-27 02:03:24 +00:00
Rafael Espindola
e7e67fce10 Add support for the same encodings of the personality function that gnu as
supports.

llvm-svn: 122577
2010-12-27 00:36:05 +00:00
Chris Lattner
d4daf9f002 implement enough of the memset inference algorithm to recognize and insert
memsets.  This is still missing one important validity check, but this is enough
to compile stuff like this:

void test0(std::vector<char> &X) {
  for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
    *I = 0;
}

void test1(std::vector<int> &X) {
  for (long i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

With:
 $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc 

to:

__Z5test0RSt6vectorIcSaIcEE:            ## @_Z5test0RSt6vectorIcSaIcEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rsi
	cmpq	%rsi, %rax
	je	LBB0_2
## BB#1:                                ## %bb.nph
	subq	%rax, %rsi
	movq	%rax, %rdi
	callq	___bzero
LBB0_2:                                 ## %for.end
	addq	$8, %rsp
	ret
...
__Z5test1RSt6vectorIiSaIiEE:            ## @_Z5test1RSt6vectorIiSaIiEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	subq	%rax, %rdx
	cmpq	$4, %rdx
	jb	LBB1_2
## BB#1:                                ## %for.body.preheader
	andq	$-4, %rdx
	movl	$1, %esi
	movq	%rax, %rdi
	callq	_memset
LBB1_2:                                 ## %for.end
	addq	$8, %rsp
	ret

llvm-svn: 122573
2010-12-26 23:42:51 +00:00
Chris Lattner
9007b56712 start using irbuilder to make mem intrinsics in a few passes.
llvm-svn: 122572
2010-12-26 22:57:41 +00:00
Rafael Espindola
2ebe553431 Add support for @note. Patch by Jörg Sonnenberger.
llvm-svn: 122568
2010-12-26 21:30:59 +00:00
Rafael Espindola
99f1527316 Add basic support for .cfi_personality.
llvm-svn: 122566
2010-12-26 20:20:31 +00:00
Chris Lattner
2129ce0891 Generalize a previous change, fixing PR8855 - an valid large immediate
rejected by the mc assembler.

llvm-svn: 122557
2010-12-25 21:36:35 +00:00
Benjamin Kramer
49e40d4c4b MemCpyOpt: Turn memcpys from a constant into a memset if possible.
This allows us to compile "int cst[] = {-1, -1, -1};" into
  movl  $-1, 16(%rsp)
  movq  $-1, 8(%rsp)
instead of
  movl  _cst+8(%rip), %eax
  movl  %eax, 16(%rsp)
  movq  _cst(%rip), %rax
  movq  %rax, 8(%rsp)

llvm-svn: 122548
2010-12-24 21:17:12 +00:00
Daniel Dunbar
592854a10a MC/Mach-O/ARM: Start handling some Thumb branches.
llvm-svn: 122547
2010-12-24 16:41:46 +00:00
Kevin Enderby
ff7e68c5e7 In llvm-mc parse a Hash token as a full line comment. Allows handling of
preprocessed .s files and matches darwin gas.  rdar://8798690
Also fix a comment on the next line of AsmParser.cpp after this new code.

llvm-svn: 122531
2010-12-24 00:12:02 +00:00
Owen Anderson
6afd90810e When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero
are not the low bits of x, but the bits that WILL be the low bits after the operation completes.

llvm-svn: 122529
2010-12-23 23:56:24 +00:00
Bob Wilson
85dbc89f44 Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions.
If the basic block containing the BCCi64 (or BCCZi64) instruction ends with
an unconditional branch, that branch needs to be deleted before appending
the expansion of the BCCi64 to the end of the block.

llvm-svn: 122521
2010-12-23 22:45:49 +00:00
Torok Edwin
cfd30fdf42 XFAIL vg_leak the new test as the rest.
llvm-svn: 122517
2010-12-23 21:22:09 +00:00
Torok Edwin
2acdd67db2 Fix OCaml bindings crash, PR8847.
See http://caml.inria.fr/mantis/view.php?id=4166
If we call only external functions from a module, then its 'let _' bindings
don't get executed, which means that the exceptions don't get registered for use
in the C code.
This in turn causes llvm_raise to call raise_with_arg() with a NULL pointer and
cause a segmentation fault.

The workaround is to declare all 'external' functions as 'val' in these .mli
files.

Also added a separate testcase (the testcase must call only external functions
for the bug to occur).

llvm-svn: 122497
2010-12-23 15:49:26 +00:00
Andrew Trick
cc701bcfdc Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Benjamin Kramer
49942a90b7 DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455
2010-12-22 23:17:45 +00:00
Benjamin Kramer
27d13684f5 InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1.
llvm-svn: 122453
2010-12-22 23:12:15 +00:00
Benjamin Kramer
d8387aa9bd X86: Lower a select directly to a setcc_carry if possible.
int test(unsigned long a, unsigned long b) { return -(a < b); }
compiles to
  _test:                              ## @test
    cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
    sbbl  %eax, %eax                  ## encoding: [0x19,0xc0]
    ret                               ## encoding: [0xc3]
instead of
  _test:                              ## @test
    xorl  %ecx, %ecx                  ## encoding: [0x31,0xc9]
    cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
    movl  $-1, %eax                   ## encoding: [0xb8,0xff,0xff,0xff,0xff]
    cmovael %ecx, %eax                ## encoding: [0x0f,0x43,0xc1]
    ret                               ## encoding: [0xc3]

llvm-svn: 122451
2010-12-22 23:09:28 +00:00
Daniel Dunbar
e6ec0e7149 MC/Mach-O/ARM: Don't try to use scattered relocs for BR24 fixups.
llvm-svn: 122441
2010-12-22 21:26:43 +00:00
Rafael Espindola
5004de4d8b Add reduced test from 8845.
llvm-svn: 122438
2010-12-22 21:15:13 +00:00
Duncan Sands
68d969c2f5 When determining whether the new instruction was already present in
the original instruction, half the cases were missed (making it not
wrong but suboptimal).  Also correct a typo (A <-> B) in the second
chunk. 

llvm-svn: 122414
2010-12-22 17:15:25 +00:00
Duncan Sands
e1522867e6 Make this test not depend on how the variable is named.
llvm-svn: 122413
2010-12-22 17:08:04 +00:00
Daniel Dunbar
cb8ac619a2 MC/Mach-O/ARM: We always use the SECTDIFF reloc type on ARM, which is
esp. important given that the LOCAL_SECTDIFF enumeration got redefined.

llvm-svn: 122412
2010-12-22 16:52:19 +00:00
Daniel Dunbar
e44a2c1166 MC/Mach-O/ARM: Add enough relocation logic to get BR24 relocations.
llvm-svn: 122407
2010-12-22 16:19:24 +00:00
Rafael Espindola
7c995a90fc Simplify the handling of .size expressions.
llvm-svn: 122404
2010-12-22 16:03:00 +00:00
Duncan Sands
922251757b Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C)
if both A op B and A op C simplify.  This fires fairly often but doesn't
make that much difference.  On gcc-as-one-file it removes two "and"s and
turns one branch into a select.

llvm-svn: 122399
2010-12-22 13:36:08 +00:00
Che-Liang Chiou
e73ad4387e ptx: add ld instruction and test
llvm-svn: 122398
2010-12-22 10:38:51 +00:00
Chris Lattner
04ef853e23 Fix a bug in ReduceLoadWidth that wasn't handling extending
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
2010-12-22 08:02:57 +00:00
Owen Anderson
b4f1511864 Give GVN back the ability to perform simple conditional propagation on conditional branch values.
I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.

llvm-svn: 122378
2010-12-21 23:54:34 +00:00
Dale Johannesen
e0fb87c3d7 Reapply 122353-122355 with fixes. 122354 was wrong;
the shift type was needed one place, the shift count
type another.  The transform in 123555 had the same
problem.

llvm-svn: 122366
2010-12-21 21:55:50 +00:00
Benjamin Kramer
369872edfc Add some x86 specific dagcombines for conditional increments.
(add Y, (sete  X, 0)) -> cmp X, 1; adc  0, Y
(add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y
(sub (sete  X, 0), Y) -> cmp X, 1; sbb  0, Y
(sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y

for
  unsigned foo(unsigned a, unsigned b) {
    if (a == 0) b++;
    return b;
  }
we now get:
  foo:
    cmpl  $1, %edi
    movl  %esi, %eax
    adcl  $0, %eax
    ret
instead of:
  foo:
    testl %edi, %edi
    sete  %al
    movzbl  %al, %eax
    addl  %esi, %eax
    ret

llvm-svn: 122364
2010-12-21 21:41:44 +00:00
Dale Johannesen
972aba543a Revert 122353-122355 for the moment, they broke stuff.
llvm-svn: 122360
2010-12-21 21:22:27 +00:00
Dale Johannesen
39186cfb0b Add a new transform to DAGCombiner.
llvm-svn: 122355
2010-12-21 20:10:51 +00:00
Dale Johannesen
5f3e7b08f6 Get the type of a shift from the shift, not from its shift
count operand.  These should be the same but apparently are
not always, and this is cleaner anyway.  This improves the
code in an existing test.

llvm-svn: 122354
2010-12-21 20:06:19 +00:00
David Greene
33a91c0c9a Revert 122341. It breaks some darwin tests.
llvm-svn: 122346
2010-12-21 17:25:43 +00:00
David Greene
28140b5288 Fix PR 8199. This patch prepends the build tool dir to LLVM programs
being tested.  This ensures that we test the tools just built and not
some random tools that might happen to be in the user's PATH.  This
makes LLVM testing much more stable and predictable.

llvm-svn: 122341
2010-12-21 16:55:53 +00:00
Duncan Sands
658dd68e10 Add an additional InstructionSimplify factorization test.
llvm-svn: 122333
2010-12-21 15:12:22 +00:00
Duncan Sands
b4497c7e0f While I don't think any later transforms can fire, it seems cleaner to
not assume this (for example in case more transforms get added below
it).  Suggested by Frits van Bommel.

llvm-svn: 122332
2010-12-21 15:03:43 +00:00
Duncan Sands
3ceeaf218e Fix typo in comment, spotted by Deewiant.
llvm-svn: 122329
2010-12-21 13:39:20 +00:00
Duncan Sands
0bd25425b6 Teach InstructionSimplify about distributive laws. These transforms fire
quite often, but don't make much difference in practice presumably because
instcombine also knows them and more.

llvm-svn: 122328
2010-12-21 13:32:22 +00:00
Duncan Sands
5880f299da Add generic simplification of associative operations, generalizing
a couple of existing transforms.  This fires surprisingly often, for
example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well
as various "and" simplifications (usually with a phi node operand).
Most of the time this doesn't make a real difference since the same
thing would have been done elsewhere anyway, eg: by instcombine, but
there are a few places where this results in simplifications that we
were not doing before.

llvm-svn: 122326
2010-12-21 08:49:00 +00:00
Bob Wilson
01593c55a2 Add ARM-specific DAG combining to cast i64 vector element load/stores to f64.
Type legalization splits up i64 values into pairs of i32 values, which leads
to poor quality code when inserting or extracting i64 vector elements.
If the vector element is loaded or stored, it can be treated as an f64 value
and loaded or stored directly from a VPR register.  Use the pre-legalization
DAG combiner to cast those vector elements to f64 types so that the type
legalizer won't mess them up.  Radar 8755338.

llvm-svn: 122319
2010-12-21 06:43:19 +00:00
Wesley Peck
e8ec7a4d1f Teach the MBlaze disassembler to disassemble special purpose registers.
llvm-svn: 122269
2010-12-20 21:18:04 +00:00
Roman Divacky
42b3eee794 Set the value of absolute symbols.
llvm-svn: 122268
2010-12-20 21:14:39 +00:00
Roman Divacky
13b5260f62 Print all 64bits for st_value and st_size. Adjust tests accordingly.
llvm-svn: 122263
2010-12-20 20:49:43 +00:00
Wesley Peck
af2890a051 Teach the MBlaze asm parser how to parse special purpose register names.
llvm-svn: 122261
2010-12-20 20:43:24 +00:00
Dale Johannesen
036c3da142 Cosmetic changes.
llvm-svn: 122259
2010-12-20 20:10:50 +00:00
Benjamin Kramer
bec7a6be15 Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1.
InstCombine creates these so now we compile x == 23 || x == 24 || x == 25 to
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 3
instead of
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 2
  %cmp3 = icmp eq i32 %x, 25
  %ret2 = or i1 %1, %cmp3

llvm-svn: 122248
2010-12-20 16:18:51 +00:00
Duncan Sands
f72cfa961d Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods
(they had just been forgotten before).  Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.

llvm-svn: 122245
2010-12-20 14:47:04 +00:00
Chris Lattner
b27b5d0a3a fix PR8807 by making transformConstExprCastCall aware of byval arguments.
llvm-svn: 122238
2010-12-20 08:36:38 +00:00
Chris Lattner
ba962825a4 when eliding a byval copy due to inlining a readonly function, we have
to make sure that the reused alloca has sufficient alignment.

llvm-svn: 122236
2010-12-20 08:10:40 +00:00
Chris Lattner
c0a48df9f9 pull byval processing out to its own helper function.
llvm-svn: 122235
2010-12-20 07:57:41 +00:00
Chris Lattner
029952c844 fix PR8769, a miscompilation by inliner when inlining a function with a byval
argument.  The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.

llvm-svn: 122234
2010-12-20 07:45:28 +00:00
Chris Lattner
52149d6e21 merge two tests.
llvm-svn: 122233
2010-12-20 07:39:57 +00:00
Chris Lattner
2fa128c4c5 filecheckize
llvm-svn: 122232
2010-12-20 07:38:24 +00:00
Chris Lattner
4c3662e299 temporarily disable this: PR8823.
llvm-svn: 122222
2010-12-20 02:11:23 +00:00
Chris Lattner
bee7320c3c now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results,
the same as setcc.  Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS).  This is
a step towards finishing off PR5443.  In the testcase in that bug we now  get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	sbbq	%rcx, %rcx
	testb	$1, %cl
	setne	%dl
	ret

instead of:

	movq	%rdi, %rax
	addq	%rsi, %rax
	movl	$0, %ecx
	adcq	$0, %rcx
	testq	%rcx, %rcx
	setne	%dl
	ret

llvm-svn: 122219
2010-12-20 01:37:09 +00:00
Chris Lattner
2d4e17d195 We lower setb to sbb with the hope that the and will go away, when it
doesn't, match it back to setb.

On a 64-bit version of the testcase before we'd get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	sbbb	%dl, %dl
	andb	$1, %dl
	ret

now we get:

	movq	%rdi, %rax
	addq	%rsi, %rax
	setb	%dl
	ret

llvm-svn: 122217
2010-12-20 01:16:03 +00:00
Mon P Wang
236fa96503 Test case for r122215 when InstCombine optimizes memset
llvm-svn: 122216
2010-12-20 01:06:23 +00:00
Mon P Wang
666259546c Add comment for testcase for 122206
llvm-svn: 122210
2010-12-20 00:54:26 +00:00
Mon P Wang
d3adab7a64 Prevents PerformShuffleCombine from creating a node with an illegal type after legalize types
has run, e.g., prevent creating an i64 node from a v2i64 when i64 is not a legal type.

llvm-svn: 122206
2010-12-19 23:55:53 +00:00
Chris Lattner
297259f6f1 improve the setcc -> setcc_carry optimization to happen more
consistently by moving it out of lowering into dag combine.

Add some missing patterns for matching away extended versions of setcc_c.

llvm-svn: 122201
2010-12-19 22:08:31 +00:00
Chris Lattner
ad85635a93 now that generic vector types aren't selected onto MMX registers, these
tests don't need -disable-mmx.

llvm-svn: 122188
2010-12-19 20:12:58 +00:00
Chris Lattner
29475c23d0 X86 supports i8/i16 overflow ops (except i8 multiplies), we should
generate them.  

Now we compile:

define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
  %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
  %cmp = extractvalue %0 %0, 1
  br i1 %cmp, label %if.then, label %if.end

into:

_X:                                     ## @X
## BB#0:                                ## %entry
	subl	$12, %esp
	movb	16(%esp), %al
	addb	20(%esp), %al
	jo	LBB0_2

Before we were generating:

_X:                                     ## @X
## BB#0:                                ## %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movb	12(%ebp), %al
	testb	%al, %al
	setge	%cl
	movb	8(%ebp), %dl
	testb	%dl, %dl
	setge	%ah
	cmpb	%cl, %ah
	sete	%cl
	addb	%al, %dl
	testb	%dl, %dl
	setge	%al
	cmpb	%al, %ah
	setne	%al
	andb	%cl, %al
	testb	%al, %al
	jne	LBB0_2

llvm-svn: 122186
2010-12-19 20:03:11 +00:00
Chris Lattner
2d6a408c4c add a general coverage test for overflow intrinsics.
llvm-svn: 122185
2010-12-19 20:01:13 +00:00
Chris Lattner
3bc741a0d2 recognize an unsigned add with overflow idiom into uadd.
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret

llvm-svn: 122182
2010-12-19 19:37:52 +00:00
Chris Lattner
faef9b6bfb optimize uadd(x, cst) into a comparison when the normal
result is dead.  This is required for my next patch to not
regress the testsuite.

llvm-svn: 122181
2010-12-19 19:35:32 +00:00
Chris Lattner
d1f114d8f2 generalize the sadd creation code to not require that the
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:

unsigned char X(char a, char b) {
  int res = a+b;
  if ((unsigned )(res+128) > 255U)
    abort();
  return res;
}

llvm-svn: 122178
2010-12-19 18:35:09 +00:00
Chris Lattner
bb0d067691 fix another miscompile in the llvm.sadd formation logic: it wasn't
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.

This is likely to be the fix for 8816.

llvm-svn: 122177
2010-12-19 18:22:06 +00:00
Chris Lattner
c7876edb16 fix a bug (possibly 8816) in the sadd forming xform: it isn't
profitable (or safe) to promote code when the add-with-constant
has other uses.

llvm-svn: 122175
2010-12-19 17:59:02 +00:00
Chris Lattner
bb93cd80d6 Enhance LICM to promote alias sets whose pointers themselves are stored,
which doesn't affect the memory address being promoted.

llvm-svn: 122172
2010-12-19 05:57:25 +00:00
Chris Lattner
71fcecf597 fix PR8602, a bug in an assertion: a volatile store *of* a pointer
does not make the alias set for that pointer volatile, just stores
*to* the pointer.

llvm-svn: 122171
2010-12-19 05:51:54 +00:00
Chris Lattner
ac82ea26da fix PR8642: if a critical edge has a PHI value that can trap,
isel is *required* to split the edge.  PHI values get evaluated
on the edge, not in their predecessor block.

llvm-svn: 122170
2010-12-19 04:58:57 +00:00
Chris Lattner
0965f3f76d revert r122164, I'm going to go with a different approach.
llvm-svn: 122168
2010-12-19 04:23:03 +00:00
Chris Lattner
14a3e26146 first step to fixing PR8642: don't fold away empty basic blocks
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.

llvm-svn: 122164
2010-12-19 03:02:34 +00:00
Chris Lattner
1cc35d2472 move this test into the ARM test so that it is only run when the arm backend
is enabled.

llvm-svn: 122163
2010-12-19 02:58:14 +00:00
Anton Korobeynikov
f49c9c02d6 Restore the behavior of frame lowering before my refactoring.
It turns out that ppc backend has really weird interdependencies
over different hooks and all stuff is fragile wrt small changes.
This should fix PR8749

llvm-svn: 122155
2010-12-18 19:53:14 +00:00
Benjamin Kramer
84d3e6cfd0 Just rename the functions, relying on matching a instruction that has the same name as a symbol is way too fragile.
llvm-svn: 122154
2010-12-18 14:23:57 +00:00
Benjamin Kramer
4d591385d1 Test more than just label names and make test work on non-x86 hosts.
llvm-svn: 122153
2010-12-18 14:07:28 +00:00
Roman Divacky
ed5bb14415 Add support for lexing single quotes like 'c'.
This fixed 8615.

llvm-svn: 122150
2010-12-18 08:56:37 +00:00
Rafael Espindola
4c1b83e53f Add a test that shows that we produce no fixups when computing the difference
of two symbols in the same fragment.

llvm-svn: 122145
2010-12-18 05:07:45 +00:00
Rafael Espindola
dff6c18a5e Test for push being relaxed.
llvm-svn: 122124
2010-12-18 01:16:59 +00:00
Bob Wilson
776d3f73eb Fix result type of Neon floating-point comparisons against zero.
The result vector elements are always integers.  Radar 8782191.

llvm-svn: 122112
2010-12-18 00:04:33 +00:00
Nate Begeman
063d88d6fb Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
llvm-svn: 122105
2010-12-17 23:12:19 +00:00
Bill Wendling
c16f9b1ccc During local stack slot allocation, the materializeFrameBaseRegister function
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.

Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does all ready. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)

<rdar://problem/8782198>

llvm-svn: 122104
2010-12-17 23:09:14 +00:00
Bob Wilson
c57c2d755b Fix a DAGCombiner crash when folding binary vector operations with constant
BUILD_VECTOR operands where the element type is not legal.  I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.

llvm-svn: 122102
2010-12-17 23:06:49 +00:00
Bob Wilson
347efe0b29 Combine several vector-related DAGCombiner tests.
llvm-svn: 122101
2010-12-17 23:06:46 +00:00
Nate Begeman
ef5f3c0fa7 Add support for matching psign & plendvb to the x86 target
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts

llvm-svn: 122098
2010-12-17 22:55:37 +00:00
Dale Johannesen
c2c6ebd82a Add a transform to DAG Combiner. This improves the
code for the case where 32-bit divide by constant is
turned into 64-bit multiply by constant.  8771012.

llvm-svn: 122090
2010-12-17 21:45:49 +00:00
Owen Anderson
6acf8c9125 Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself
on the DragonEgg self-host bot.  Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.

llvm-svn: 122072
2010-12-17 18:08:00 +00:00
Benjamin Kramer
39b30b18fa SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build.
llvm-svn: 122054
2010-12-17 10:48:14 +00:00
Kalle Raiskila
68f221707a Don't feed 19 bit immediates to ILA.
Patch (slightly modified) by Visa Putkinen.

llvm-svn: 122052
2010-12-17 09:36:09 +00:00
Chris Lattner
e92f8121d4 improve switch formation to handle small range
comparisons formed by comparisons.  For example,
this:

void foo(unsigned x) {
  if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) 
    bar();
}

compiles into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$6, %edi
	ja	LBB0_2
## BB#1:                                ## %entry
	movl	%edi, %eax
	movl	$91, %ecx
	btq	%rax, %rcx
	jb	LBB0_3

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$2, %edi
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	cmpl	$6, %edi
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movl	%edi, %eax
	movl	$88, %ecx
	btq	%rax, %rcx
	jb	LBB0_4

This catches a bunch of cases in GCC, which look like this:

 %804 = load i32* @which_alternative, align 4, !tbaa !0
 %805 = icmp ult i32 %804, 2
 %806 = icmp eq i32 %804, 3
 %or.cond121 = or i1 %805, %806
 %807 = icmp eq i32 %804, 4
 %or.cond124 = or i1 %or.cond121, %807
 br i1 %or.cond124, label %.thread, label %808

turning this into a range comparison.

llvm-svn: 122045
2010-12-17 06:20:15 +00:00
Daniel Dunbar
1f9fd0b79b MC/Expr: Implemnt more aggressive folding during symbol evaluation using
IsSymbolRefDifferenceFullyResolved(). For example, we will now fold away
something like:
--
_a:
...
L0:
...
L1:
...
.long (L1 - L0) / 2
--

llvm-svn: 122043
2010-12-17 05:50:33 +00:00
Bob Wilson
e06f6eabe7 Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.
Radar 8776599

llvm-svn: 122018
2010-12-17 01:21:12 +00:00
Dan Gohman
f8949c3d1a Revert r64460. strtol and friends cannot be marked readonly, even with
a null endptr argument, because they may write to errno.

This fixes a seflhost miscompile observed on Linux targets when TBAA
was enabled.

llvm-svn: 122014
2010-12-17 01:09:43 +00:00
Rafael Espindola
bd13ceed72 "Fix" FDE alignment to match what gas does.
llvm-svn: 122006
2010-12-17 00:28:02 +00:00
Rafael Espindola
3ee4530406 Make pushq produce signed relocations.
llvm-svn: 122005
2010-12-16 22:50:01 +00:00
Duncan Sands
22de496ae3 Speculatively revert commit 121905 since it looks like it might have broken the
dragonegg self-host buildbot.  Original commit message:

Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

llvm-svn: 121965
2010-12-16 09:40:54 +00:00
Jason W Kim
2f4a8d7553 1. ARM/MC/ELF: A few more ELF relocs for .o
2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :)
   Test added.

llvm-svn: 121951
2010-12-16 03:12:17 +00:00
Dan Gohman
29a260015a -enable-tbaa is on by default now.
llvm-svn: 121945
2010-12-16 02:53:48 +00:00
Dan Gohman
e106936414 Make memcpyopt TBAA-aware.
llvm-svn: 121944
2010-12-16 02:51:19 +00:00
Jason W Kim
529d762fff Fix elf-dump --dump-section-data for .bss section
llvm-svn: 121927
2010-12-16 00:15:10 +00:00
Dan Gohman
a2fd4f2e22 Preserve TBAA tags when doing load PRE.
llvm-svn: 121921
2010-12-15 23:53:55 +00:00
Jim Grosbach
84c2b29b58 Thumb1 had two patterns for the same load-from-constant-pool instruction.
Canonicalize on tLDRpci and remove tLDRcp.

llvm-svn: 121920
2010-12-15 23:52:36 +00:00
Eric Christopher
339499f8f3 Don't handle -arm-long-calls in fast isel for now.
llvm-svn: 121919
2010-12-15 23:47:29 +00:00
Owen Anderson
aefeb448a9 Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

Fixes <rdar://problem/8558713>.

llvm-svn: 121905
2010-12-15 22:32:38 +00:00
Evan Cheng
68e1ed8752 Teach machine cse to commute instructions.
llvm-svn: 121903
2010-12-15 22:16:21 +00:00
Bob Wilson
438a9a1367 Add Neon VCVT instructions for f32 <-> f16 conversions.
Clang is now providing intrinsics for these and so we need to support them
in the backend.  Radar 8068427.

llvm-svn: 121902
2010-12-15 22:14:12 +00:00
Bob Wilson
1082705e72 Fix misspelled target triples in MC/ARM test commands.
llvm-svn: 121901
2010-12-15 22:14:01 +00:00
Wesley Peck
2a376c535e Lower the MBlaze target specific calling conventions for "interrupt_handler"
and "save_volatiles" correctly. This completes the custom calling convention
functionality changes for the MBlaze backend that were started in 121888.

llvm-svn: 121891
2010-12-15 20:27:28 +00:00
Duncan Sands
2699fb1072 Move Sub simplifications and additional Add simplifications out of
instcombine and into InstructionSimplify.

llvm-svn: 121861
2010-12-15 14:07:39 +00:00
Frits van Bommel
83b7c3773f Teach jump threading to "look through" a select when the branch direction of a terminator depends on it.
When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately.

llvm-svn: 121859
2010-12-15 09:51:20 +00:00
Rafael Espindola
94d026d157 Relax alignment fragments.
With this we don't need the EffectiveSize field anymore. Without that field
LayoutFragment only updates offsets and we don't need to invalidate the
current fragment when it is relaxed (only the ones following it).

This is also a very small improvement in the accuracy of the layout info as
we now use the after relaxation size immediately.

llvm-svn: 121857
2010-12-15 08:45:53 +00:00
Rafael Espindola
f55f520a7a Patch by David Meyer to avoid a O(N^2) behaviour when relaxing fragments.
Since we now don't update addresses so early, we might relax a bit more than
we need to. This is simillar to the issue in PR8467.

llvm-svn: 121856
2010-12-15 07:39:29 +00:00
Chris Lattner
81815cd4db take care of some todos, transforming [us]mul_lohi into
a wider mul if the wider mul is legal.

llvm-svn: 121848
2010-12-15 06:04:19 +00:00
Chris Lattner
3bec2e7d0d merge two tests
llvm-svn: 121847
2010-12-15 05:58:59 +00:00
Kevin Enderby
6b3ae489f8 Add some more MC tests for ARM arithmetic instructions that update or don't
update the condition codes.  These come from my test generator and are just
the ones that MC currently assembles correctly.

llvm-svn: 121830
2010-12-15 01:24:36 +00:00
Owen Anderson
de42e1136e Fix PR8790, another instance where unreachable code can cause instruction simplification to fail,
this case involve a select that simplifies to itself.

llvm-svn: 121817
2010-12-15 00:55:35 +00:00
Evan Cheng
7e96e67d98 Fix a minor bug in two-address pass. It was missing a commute opportunity.
regB = move RCX
regA = op regB, regC
RAX  = move regA
where both regB and regC are killed. If regB is constrainted to non-compatible
physical registers but regC is not constrainted at all, then it's better to
commute the instruction.
       movl    %edi, %eax
       shlq    $32, %rcx
       leaq    (%rcx,%rax), %rax
=>
       movl    %edi, %eax
       shlq    $32, %rcx
       orq     %rcx, %rax
rdar://8762995

llvm-svn: 121793
2010-12-14 21:34:53 +00:00
Daniel Dunbar
3f9b9dc852 MC/ARM: Fix-up fixup offset for fixup_arm_branch target specific fixup.
llvm-svn: 121772
2010-12-14 17:37:16 +00:00
Chris Lattner
c1aaf52608 - Insert new instructions before DomBlock's terminator,
which is simpler than finding a place to insert in BB.
 - Don't perform the 'if condition hoisting' xform on certain
   i1 PHIs, as it interferes with switch formation.

This re-fixes "example 7", without breaking the world hopefully.

llvm-svn: 121764
2010-12-14 08:46:09 +00:00
Chris Lattner
22d4dc5a4d fix two significant issues with FoldTwoEntryPHINode:
first, it can kick in on blocks whose conditions have been
folded to a constant, even though one of the edges will be
trivially folded.

second, it doesn't clean up the "if diamond" that it just 
eliminated away.  This is a problem because other simplifycfg
xforms kick in depending on the order of block visitation,
causing pointless work.

llvm-svn: 121762
2010-12-14 08:01:53 +00:00
Chris Lattner
5d4aea9791 fix yet anohter broken line
llvm-svn: 121750
2010-12-14 06:09:07 +00:00
Chris Lattner
093b5b256d reapply my recent change that disables a piece of the switch formation
work, but fixes 400.perlbmk.

llvm-svn: 121749
2010-12-14 05:57:30 +00:00
Evan Cheng
6a2bed92f5 bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663
llvm-svn: 121746
2010-12-14 03:22:07 +00:00
Jason W Kim
0cf0f7a078 fix fixme case typo :-)
llvm-svn: 121743
2010-12-14 01:42:38 +00:00
Owen Anderson
5536134dc4 Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state
where I'm confident there were no crashes or miscompilations.  XFAIL the test added since then for now.

llvm-svn: 121733
2010-12-13 23:49:28 +00:00
Jason W Kim
b5cc5dad79 First cut of ARM/MC/ELF PIC relocations.
Test has fixme, to move to .s -> .o test when AsmParser works better.

llvm-svn: 121732
2010-12-13 23:16:07 +00:00
Bob Wilson
33e5e902b0 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

llvm-svn: 121730
2010-12-13 23:02:37 +00:00
Chris Lattner
dcba81d96f temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk.
llvm-svn: 121728
2010-12-13 23:02:19 +00:00
Dan Gohman
b187cce266 Reapply r121520, PartialAlias implementation for BasicAA, now that
memdep is updated to handle it.

llvm-svn: 121725
2010-12-13 22:50:24 +00:00
Benjamin Kramer
7f1cdac1e4 Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780.
llvm-svn: 121705
2010-12-13 18:20:38 +00:00
Chris Lattner
d17cbf803b rename test
llvm-svn: 121697
2010-12-13 08:39:40 +00:00
Chris Lattner
14810c808b Add a couple dag combines to transform mulhi/mullo into a wider multiply
when the wider type is legal.  This allows us to compile:

define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
	%div = udiv i16 %x, 33
	ret i16 %div
}

into:

test1:                                  # @test1
	movzwl	4(%esp), %eax
	imull	$63551, %eax, %eax      # imm = 0xF83F
	shrl	$21, %eax
	ret

instead of:

test1:                                  # @test1
        movw    $-1985, %ax             # imm = 0xFFFFFFFFFFFFF83F
        mulw    4(%esp)
        andl    $65504, %edx            # imm = 0xFFE0
        movl    %edx, %eax
        shrl    $5, %eax
        ret

Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320

We should implement the same thing for [su]mul_hilo, but I don't
have immediate plans to do this.

llvm-svn: 121696
2010-12-13 08:39:01 +00:00
Chris Lattner
0368bf7457 reinstate my patch: the miscompile was caused by an inverted branch in the
'and' case.

llvm-svn: 121695
2010-12-13 08:12:19 +00:00
Chris Lattner
caad324345 Completely disable the optimization I added in r121680 until
I can track down a miscompile.  This should bring the buildbots
back to life

llvm-svn: 121693
2010-12-13 07:41:29 +00:00
Chris Lattner
5ce3e42d80 Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions
when simplifying, allowing them to be eagerly turned into switches.  This
is the last step required to get "Example 7" from this blog post:
http://blog.regehr.org/archives/320

On X86, we now generate this machine code, which (to my eye) seems better
than the ICC generated code:

_crud:                                  ## @crud
## BB#0:                                ## %entry
	cmpb	$33, %dil
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	addb	$-34, %dil
	cmpb	$58, %dil
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movzbl	%dil, %eax
	movabsq	$288230376537592865, %rcx ## imm = 0x400000017001421
	btq	%rax, %rcx
	jb	LBB0_4
LBB0_3:                                 ## %lor.rhs
	xorl	%eax, %eax
	ret
LBB0_4:                                 ## %lor.end
	movl	$1, %eax
	ret

llvm-svn: 121690
2010-12-13 07:00:06 +00:00
Chris Lattner
ea15ce73be fix a bug in r121680 that upset the various buildbots.
llvm-svn: 121687
2010-12-13 05:34:18 +00:00
Chris Lattner
c331eb8e1e make these tests a bit less fragile
llvm-svn: 121682
2010-12-13 05:10:30 +00:00
Chris Lattner
5cbbcc56ad enhance the "change or icmp's into switch" xform to handle one value in an
'or sequence' that it doesn't understand.  This allows us to optimize
something insane like this:

int crud (unsigned char c, unsigned x)
 {
   if(((((((((( (int) c <= 32 ||
                    (int) c == 46) || (int) c == 44)
                  || (int) c == 58) || (int) c == 59) || (int) c == 60)
               || (int) c == 62) || (int) c == 34) || (int) c == 92)
            || (int) c == 39) != 0)
     foo();
 }

into:

define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone {
entry:
  %cmp = icmp ult i8 %c, 33
  br i1 %cmp, label %if.then, label %switch.early.test

switch.early.test:                                ; preds = %entry
  switch i8 %c, label %if.end [
    i8 39, label %if.then
    i8 44, label %if.then
    i8 58, label %if.then
    i8 59, label %if.then
    i8 60, label %if.then
    i8 62, label %if.then
    i8 46, label %if.then
    i8 92, label %if.then
    i8 34, label %if.then
  ]

by pulling the < comparison out ahead of the newly formed switch.

llvm-svn: 121680
2010-12-13 04:50:38 +00:00
Chris Lattner
e35f4b31f4 merge two tests
llvm-svn: 121679
2010-12-13 04:45:56 +00:00
Chris Lattner
25b642edfd Fix my previous patch to handle a degenerate case that the llvm-gcc
bootstrap buildbot tripped over.

llvm-svn: 121674
2010-12-13 03:43:57 +00:00
Chris Lattner
a21c02e807 fix a fairly serious oversight with switch formation from
or'd conditions.  Previously we'd compile something like this:

int crud (unsigned char c) {
   return c == 62 || c == 34 || c == 92;
}

into:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  %cmp8 = icmp eq i8 %c, 92
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which failed to merge the compare-with-92 into the switch.  With this patch
we simplify this all the way to:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
    i8 92, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which is much better for codegen's switch lowering stuff.  This kicks in 33 times
on 176.gcc (for example) cutting 103 instructions off the generated code.

llvm-svn: 121671
2010-12-13 03:18:54 +00:00
Bill Wendling
16e7d7bf2f Add support for using the `!if' operator when initializing variables:
class A<bit a, bits<3> x, bits<3> y> {
    bits<3> z;
    let z = !if(a, x, y);
  }

The variable z will get the value of x when 'a' is 1 and 'y' when a is '0'.

llvm-svn: 121666
2010-12-13 01:46:19 +00:00
Wesley Peck
f842b79b4b Missed some ADDI <-> ADDIK conversions in 121649.
llvm-svn: 121652
2010-12-12 22:53:14 +00:00
Benjamin Kramer
a638216447 Generalize the and-icmp-select instcombine further by allowing selects of the form
(x & 2^n) ? 2^m+C : C

we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift and apply the offset afterwards.

llvm-svn: 121609
2010-12-11 10:49:22 +00:00
Benjamin Kramer
5a1721f4ac Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it
to catch cases where n != m with a shift.

llvm-svn: 121608
2010-12-11 09:42:59 +00:00
Evan Cheng
b6773d7e1f (or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056
llvm-svn: 121606
2010-12-11 04:11:38 +00:00
Bob Wilson
d30768fe3e Add float patterns for Neon vld1-lane/dup and vst1-lane operations.
llvm-svn: 121583
2010-12-10 22:13:32 +00:00
Dan Gohman
18e2a55c07 Revert r121520, which may have introduced miscompilations.
llvm-svn: 121573
2010-12-10 21:48:28 +00:00
Dan Gohman
d1bf1d8013 Implement PartialAlias checking in BasicAA.
llvm-svn: 121520
2010-12-10 20:47:03 +00:00
Bob Wilson
5ff13f9d5c Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions.
Alignments smaller than the total size of the memory being loaded or stored,
unless the alignment is 8 bytes, are not allowed.  Add tests for this, too.

llvm-svn: 121506
2010-12-10 19:37:42 +00:00
NAKAMURA Takumi
e6b65793ea macho-dump: Fix CMake build, following up to r121466.
llvm-svn: 121476
2010-12-10 09:18:26 +00:00
Rafael Espindola
0e665e502d Fixed version of 121434 with no new memory leaks.
llvm-svn: 121471
2010-12-10 07:39:47 +00:00
Daniel Dunbar
599da0cadf macho-dump: Switch to C++ macho-dump tool.
llvm-svn: 121466
2010-12-10 06:19:45 +00:00
Rafael Espindola
011e168728 Revert my previous patch to make the valgrind bots happy.
llvm-svn: 121461
2010-12-10 04:01:09 +00:00
NAKAMURA Takumi
5ad673e4d9 Add dependency to "make check".
cmake/modules/AddLLVM.cmake: Add empty "phony" target in add_llvm_loadable_module() even if loadable module were not supported.

llvm-svn: 121455
2010-12-10 02:15:36 +00:00
Nate Begeman
cb6d1c8193 Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Rafael Espindola
03ad1e8f1f Initial support for the cfi directives. This is just enough to get
f:
        .cfi_startproc
        nop
        .cfi_endproc

assembled (on ELF).

llvm-svn: 121434
2010-12-09 23:48:29 +00:00
Kevin Enderby
55cb19813e Add support for parsing ARM arithmetic instructions that update or don't update
the condition codes.  Where the ones that do have an 's' suffix and the ones
that don't don't have the suffix.  The trick is if MatchInstructionImpl() fails
we try again after adding a CCOut operand with the correct value and removing
the 's' if present.  Four simple test cases added for now, lots more to come.

llvm-svn: 121401
2010-12-09 19:19:43 +00:00
Jim Grosbach
8bc33cc6e5 ARM stm/ldm instructions require more than one register in the register list.
Otherwise, a plain str/ldr should be used instead. Make sure we account for
that in prologue/epilogue code generation.
rdar://8745460

llvm-svn: 121391
2010-12-09 18:31:13 +00:00
Bruno Cardoso Lopes
93e5c2fb64 Add ROTR and ROTRV mips32 instructions. Patch by Akira Hatanaka
llvm-svn: 121377
2010-12-09 17:32:30 +00:00
Chris Lattner
996691e79c enhance memcpyopt to zap memcpy's that have the same src/dst.
llvm-svn: 121362
2010-12-09 07:45:45 +00:00
Chris Lattner
4fef82afa0 fix PR8753, eliminating a case where we'd infinitely make a
substitution because it doesn't actually change the IR.  Patch by
Jakub Staszak!

llvm-svn: 121361
2010-12-09 07:39:50 +00:00
Eric Christopher
0e40452eb0 Rewrite the darwin tlv support to use a chain and return to copying
the output to the correct register. Fixes a hidden problem uncovered
by the last patch where we'd try to DAG combine our MVT::Other node
oddly.

llvm-svn: 121358
2010-12-09 06:25:53 +00:00
Dan Gohman
3d9fc7db03 Really check that the bits that will become zero are actually already zero
before eliminating the operation that zeros them. This fixes rdar://8739316.

llvm-svn: 121353
2010-12-09 02:52:17 +00:00
Eric Christopher
0100a8fda4 Remove extraneous copy from DAG conversion for darwin tls. This was
popping up at O0 when it wasn't folded and the fast allocator would
complain.

llvm-svn: 121330
2010-12-09 00:27:58 +00:00
Kevin Enderby
988dab6b5c Allow a slash, '/', as a prefix separator for X86. rdar://8741045
llvm-svn: 121320
2010-12-08 23:57:59 +00:00
Eric Christopher
d601b8288f Move this test to tlv* to make it easier to notice versus linux tls
support.

llvm-svn: 121316
2010-12-08 23:33:23 +00:00
Jason W Kim
2e6e50c1b0 ARM/MC/ELF TPsoft is now a proper pseudo inst.
Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM
as well as ELF/OBJ (including fixup)

Also added support for ELF::R_ARM_TLS_IE32

llvm-svn: 121312
2010-12-08 23:14:44 +00:00
Evan Cheng
3bd9b95b4d Fix a bad prologue / epilogue codegen bug where the compiler would emit illegal
vpush instructions to save / restore VFP / NEON registers like this:
vpush {d8,d10,d11}
vpop {d8,d10,d11}

vpush and vpop do not allow gaps in the register list.
rdar://8728956

llvm-svn: 121197
2010-12-07 23:08:38 +00:00
Bruno Cardoso Lopes
0e14644599 Match a pattern generated by a dag combiner opt where:
(select (load (load tga0)) (load tga1)) => (load (select (load tga0) tga1))

Thanks to Akira for pointing that.

llvm-svn: 121163
2010-12-07 19:00:20 +00:00
Rafael Espindola
866531d633 Fix absolute recording of differences of symbols in two sections. Reduced from ctor_dtor_count-2.cpp.
llvm-svn: 121152
2010-12-07 17:12:32 +00:00
Rafael Espindola
da64b6aa50 Fix relocations with weak definitions.
llvm-svn: 121114
2010-12-07 05:57:28 +00:00
NAKAMURA Takumi
9d1f909489 Revert test/Archive/check_binary_output.ll". It fails on a buildbot.
llvm-svn: 121113
2010-12-07 05:57:02 +00:00
Chris Lattner
12c2c17ac7 reapply r121100 with a tweak to constant fold ConstExprs with TargetData
(if available) as we go so that we get simple constantexprs not insane ones.
This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp
that the previous iteration of this patch had.

llvm-svn: 121111
2010-12-07 04:33:29 +00:00
Rafael Espindola
9ede5ef045 Fix pcrel relocations that cross sections.
llvm-svn: 121107
2010-12-07 03:50:14 +00:00
NAKAMURA Takumi
159722407b test/Archive/check_binary_output.ll: Add a new test to check output of 'llvm-ar -p' is sane. Thanks to Danil Malyshev!
llvm-svn: 121106
2010-12-07 03:35:20 +00:00
NAKAMURA Takumi
f260cf36f0 test/Other/close-stderr.ll: Require the feature 'shell'. It is not executable on Win32 but it is executable on MSYS-bash.
llvm-svn: 121105
2010-12-07 02:43:58 +00:00
NAKAMURA Takumi
63bff1a5d3 test: Add the feature 'shell' on LLVM_ON_UNIX.
llvm-svn: 121104
2010-12-07 02:43:51 +00:00
Eric Christopher
cab6997dc8 Temporarily revert r121100 as it's causing clang to fail
CodeGenCXX/virtual-base-ctor.cpp.

llvm-svn: 121102
2010-12-07 02:41:11 +00:00
Chris Lattner
5996a47663 fix PR8710 - teach global opt that some constantexprs are too complex to
put in a global variable's initializer.

llvm-svn: 121100
2010-12-07 01:59:32 +00:00
Michael J. Spencer
4c0dfbd472 Test: Fix Support.Path and _all_ of the unittest death tests. GetTempPath defaults to \Windows\.
If I typed anything else it would just decline into cursing.

llvm-svn: 121095
2010-12-07 01:23:49 +00:00
Rafael Espindola
c98cc0b286 Fix a crash reduced from gcc produced assembly.
llvm-svn: 121085
2010-12-07 01:09:54 +00:00
Owen Anderson
81f8b084e6 Second attempt at converting Thumb2's LDRpci, including updating the gazillion places that need to know about it.
llvm-svn: 121082
2010-12-07 00:45:21 +00:00
Frits van Bommel
1494a2f6fe Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInt*s or BlockAddress*s.
llvm-svn: 121066
2010-12-06 23:36:56 +00:00
Devang Patel
6fe7fe8dd4 If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0.
llvm-svn: 121059
2010-12-06 22:39:26 +00:00
Wesley Peck
996c76c27e Fixed reversed operands for IDIV and CMP instructions in MBlaze backend.
Use BRAD instead of BRD for indirect branches in MBlaze backend.

patch contributed by Jack Whitham!

llvm-svn: 121044
2010-12-06 22:06:49 +00:00
Wesley Peck
b168ddedaa Fix a 16-bit immediate value detection bug in the MBlaze delay slot filler.
Address more hazards in the MBlaze delay slot filler.

patch contributed by Jack Whitham!

llvm-svn: 121037
2010-12-06 21:11:01 +00:00
Rafael Espindola
3e954d16f4 Second try at making direct object emission produce the same results
as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
bootstrap on darwin10 using darwin9's assembler and linker.

llvm-svn: 121006
2010-12-06 17:27:56 +00:00
Rafael Espindola
4ec917db9b Revert previous two patches while I try to find out how to make both
linux and darwin assemblers happy :-(

llvm-svn: 121004
2010-12-06 15:35:15 +00:00
Rafael Espindola
ad6219b193 Update test for the extra =.
llvm-svn: 121001
2010-12-06 15:05:36 +00:00
Che-Liang Chiou
cd2878d421 ptx: add shift instructions
llvm-svn: 120982
2010-12-06 04:00:03 +00:00
Rafael Espindola
f56c11276e Don't use PadSectionToAlignment on windows.
llvm-svn: 120978
2010-12-06 03:03:44 +00:00
Chris Lattner
db6c348f31 Fix PR8728, a miscompilation I recently introduced. When optimizing
memcpy's like:
  memcpy(A, B)
  memcpy(A, C)

we cannot delete the first memcpy as dead if A and C might be aliases.
If so, we actually get:

  memcpy(A, B)
  memcpy(A, A)

which is not correct to transform into:

  memcpy(A, A)

This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks
Jakub!

llvm-svn: 120974
2010-12-06 01:48:06 +00:00
Evan Cheng
fc78767730 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.

llvm-svn: 120960
2010-12-05 22:04:16 +00:00
Frits van Bommel
e390b379ae Fix PR 4170 by having ExtractValueInst::getIndexedType() reject out-of-bounds indexing.
Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices.

llvm-svn: 120956
2010-12-05 20:50:26 +00:00
Frits van Bommel
31cf7b99f9 Teach SimplifyCFG to turn
(indirectbr (select cond, blockaddress(@fn, BlockA),
                            blockaddress(@fn, BlockB)))
into
  (br cond, BlockA, BlockB).

llvm-svn: 120943
2010-12-05 18:29:03 +00:00
Chris Lattner
e30adfb732 Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags
result.  This allows us to compile:

void *test12(long count) {
      return new int[count];
}

into:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	movq	$-1, %rdi
	cmovnoq	%rax, %rdi
	jmp	__Znam                  ## TAILCALL

instead of:

test12:
	movl	$4, %ecx
	movq	%rdi, %rax
	mulq	%rcx
	seto	%cl
	testb	%cl, %cl
	movq	$-1, %rdi
	cmoveq	%rax, %rdi
	jmp	__Znam

Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
which would eliminate the need for the 'movq %rdi, %rax'.

llvm-svn: 120936
2010-12-05 07:49:54 +00:00
Chris Lattner
76601e7a99 it turns out that when ".with.overflow" intrinsics were added to the X86
backend that they were all implemented except umul.  This one fell back
to the default implementation that did a hi/lo multiply and compared the
top.  Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test.  Now we compile:

void *func(long count) {
      return new int[count];
}

into:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
	testb	%cl, %cl                ## encoding: [0x84,0xc9]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

instead of:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL

Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.

llvm-svn: 120935
2010-12-05 07:30:36 +00:00
Chris Lattner
9b4b9e751a fix the rest of the linux miscompares :)
llvm-svn: 120933
2010-12-05 02:08:07 +00:00
Chris Lattner
16bafb2414 generalize the previous check to handle -1 on either side of the
select, inserting a not to compensate.  Add a missing isZero check
that I lost somehow.

This improves codegen of:

void *func(long count) {
      return new int[count];
}

from:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]

to:

__Z4funcl:                              ## @_Z4funcl
	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
	jmp	__Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]

llvm-svn: 120932
2010-12-05 02:00:51 +00:00
Chris Lattner
e1c32a116b relax this to handle linux defaulting to -static.
llvm-svn: 120930
2010-12-05 01:31:13 +00:00
Chris Lattner
474ed0aa9b Improve an integer select optimization in two ways:
1. generalize 
    (select (x == 0), -1, 0) -> (sign_bit (x - 1))
to:
    (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y

2. Handle the identical pattern that happens with !=:
   (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y

cmov is often high latency and can't fold immediates or
memory operands.  For example for (x == 0) ? -1 : 1, before 
we got:

< 	testb	%sil, %sil
< 	movl	$-1, %ecx
< 	movl	$1, %eax
< 	cmovel	%ecx, %eax

now we get:

> 	cmpb	$1, %sil
> 	sbbl	%eax, %eax
> 	orl	$1, %eax

llvm-svn: 120929
2010-12-05 01:23:24 +00:00
Chris Lattner
c383be807e merge some tests into select.ll and make them more specific.
llvm-svn: 120928
2010-12-05 01:13:58 +00:00
Chris Lattner
7e0a594633 rename test
llvm-svn: 120927
2010-12-05 01:02:23 +00:00
Chris Lattner
1a760f6247 remove two tests that aren't really testing anything.
llvm-svn: 120926
2010-12-05 01:02:13 +00:00
Benjamin Kramer
851691ddb2 Add patterns for the x86 popcnt instruction.
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
  supports SSE4.2 (nehalem) or SSE4A (barcelona).

llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Bob Wilson
20c65a9d33 The Thumb tADDrSPi instruction is not valid when the destination is SP.
Check for that and try narrowing it to tADDspi instead.  Radar 8724703.

llvm-svn: 120892
2010-12-04 04:40:19 +00:00
Rafael Espindola
9215947c83 There are two reasons why we might want to use
foo = a - b
.long foo
instead of just
.long a - b

First, on darwin9 64 bits the assembler produces the wrong result. Second,
if "a" is the end of the section all darwin assemblers (9, 10 and mc) will not
consider a - b to be a constant but will if the dummy foo is created.

Split how we handle these cases. The first one is something MC should take care
of. The second one has to be handled by the caller.

llvm-svn: 120889
2010-12-04 03:21:47 +00:00
Rafael Espindola
50b6170457 Next step: Only pad debug_line when the target is darwin. Add a FIXME to avoid
doing that if the target is darwin10 or newer.

This fixes
*) Direct object emission was producing objects without the workaround on
   darwin9.
*) Assembly printing was producing objects with the workaround on linux.

llvm-svn: 120866
2010-12-04 00:31:13 +00:00
Jim Grosbach
8cef570ed9 Encode the 32-bit wide Thumb (and Thumb2) instructions with the high order
halfword being emitted to the stream first. rdar://8728174

llvm-svn: 120848
2010-12-03 22:31:40 +00:00
Jim Grosbach
c69ad2176a When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the
32-bit wide version by adding the .w suffix.

llvm-svn: 120838
2010-12-03 20:33:01 +00:00
Devang Patel
dad3193123 Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot.
llvm-svn: 120750
2010-12-02 23:29:58 +00:00
Devang Patel
822facd787 Use set directive for StartMinusEndExpr.
This is a fix for llvm-gcc-i386-darwin9 buildbot failure.

llvm-svn: 120742
2010-12-02 21:32:30 +00:00
Stuart Hastings
55928bc576 Test case for r120740. Radar 8712503.
llvm-svn: 120741
2010-12-02 21:25:55 +00:00
Duncan Sands
b38ddd79ed Adjust this test for the fact that the stores are no longer
being combined (which is being tracked as PR8699).

llvm-svn: 120734
2010-12-02 20:56:51 +00:00
Jim Grosbach
5a94ed2d67 XFAIL for now. If someone with access to an ARM/Linux host wants to have a look
that would be great. They're ARM JIT failures, so without that, it's tough.

llvm-svn: 120731
2010-12-02 20:20:32 +00:00
Evan Cheng
402157b66e Fix test.
llvm-svn: 120730
2010-12-02 20:17:34 +00:00
Duncan Sands
84bd43844e This test dates from the time when llvm-gcc had problems if two types were
named the same, so it had to qualify type names according to the enclosing
scope to ensure uniqueness.  This is no longer needed for correctness (though
it may be helpful when reading the IR), so this test has lost its importance.
Zap it because dragonegg will never be able to produce the qualified type name
since modern gcc zaps language specific info (such as whether a type is nested
inside another - needed to get X::Y here) before dragonegg is reached.

llvm-svn: 120721
2010-12-02 18:19:23 +00:00
NAKAMURA Takumi
f8cc69a131 test/Archive/extract.ll: Use cmp instead of diff. Thanks to Danil Malyshev!
llvm-svn: 120698
2010-12-02 09:16:14 +00:00
Evan Cheng
4118b24aca Fix and re-enable tail call optimization of expanded libcalls.
llvm-svn: 120622
2010-12-01 22:59:46 +00:00
Rafael Espindola
16b64c646a Rename temporary symbols if they conflict with artificial symbols created
by the assembler. This was blocking parsing any large .s produced by clang for
example.

Fixes PR8596.

llvm-svn: 120603
2010-12-01 20:46:11 +00:00
Owen Anderson
8802c68592 Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax.
llvm-svn: 120589
2010-12-01 19:18:46 +00:00
Evan Cheng
84162760b7 Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot.
llvm-svn: 120549
2010-12-01 03:27:20 +00:00
Jason W Kim
4d960e071c ARM/MC/ELF relocation "hello world" for movw/movt.
Lifted adjustFixupValue() from Darwin for sharing w ELF.
Test added
TODO:
  refactor ELFObjectWriter::RecordRelocation more.
  Possibly share more code with Darwin?
  Lots more relocations...

llvm-svn: 120534
2010-12-01 02:40:06 +00:00
Chris Lattner
c3112f1e94 fix a bozo bug I introduced in r119930, causing a miscompile of
20040709-1.c from the gcc testsuite.  I was using the size of a
pointer instead of the pointee.  This fixes rdar://8713376

llvm-svn: 120519
2010-12-01 01:24:55 +00:00
NAKAMURA Takumi
ffb10289fb test/Archive: FileCheck-ize, and remove *.toc. These may be CRLF-tolerant.
llvm-svn: 120506
2010-12-01 00:09:25 +00:00
Evan Cheng
f7e586d749 Enable sibling call optimization of libcalls which are expanded during
legalization time. Since at legalization time there is no mapping from
SDNode back to the corresponding LLVM instruction and the return
SDNode is target specific, this requires a target hook to check for
eligibility. Only x86 and ARM support this form of sibcall optimization
right now.
rdar://8707777

llvm-svn: 120501
2010-11-30 23:55:39 +00:00
Chris Lattner
c888f3ec58 Enhance DSE to handle the variable index case in PR8657.
llvm-svn: 120498
2010-11-30 23:43:23 +00:00
Chris Lattner
191aa08db1 remove fixme comment too.
llvm-svn: 120493
2010-11-30 23:25:01 +00:00
Chris Lattner
eee2bb2ff0 check in *all* files. This is now handled by my previous DSE commit.
llvm-svn: 120492
2010-11-30 23:23:59 +00:00
Chris Lattner
b9c5a6fa04 teach DSE to use GetPointerBaseWithConstantOffset to analyze
may-aliasing stores that partially overlap with different base
pointers.  This implements PR6043 and the non-variable part of
PR8657

llvm-svn: 120485
2010-11-30 23:05:20 +00:00
Chris Lattner
41b6b286a3 enhance isRemovable to refuse to delete volatile mem transfers
now that DSE hacks on them.  This fixes a regression I introduced,
by generalizing DSE to hack on transfers.

llvm-svn: 120445
2010-11-30 19:12:10 +00:00
Owen Anderson
e2a8781847 Add tests for more forms of Thumb2 loads and stores.
llvm-svn: 120436
2010-11-30 18:15:21 +00:00
Che-Liang Chiou
f594fe5fc5 ptx: add command-line options for gpu target and ptx version
llvm-svn: 120423
2010-11-30 10:14:14 +00:00
Eric Christopher
990bcd83b8 Not all platforms use _<func>. Duh.
llvm-svn: 120418
2010-11-30 09:23:54 +00:00
Bill Wendling
ae920bcc50 Add parsing for the Thumb t_addrmode_s4 addressing mode. This can almost
certainly be made more generic. But it does allow us to parse something like:

          ldr     r3, [r2, r4]

correctly in Thumb mode.

llvm-svn: 120408
2010-11-30 07:44:32 +00:00
Chris Lattner
7d444d0682 Rewrite the main DSE loop to be written in terms of reasoning
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate.  This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.

This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet.  Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.

llvm-svn: 120406
2010-11-30 07:23:21 +00:00
Eric Christopher
f27f0b5234 Rewrite mwait and monitor support and custom lower arguments.
Fixes PR8573.

llvm-svn: 120404
2010-11-30 07:20:12 +00:00
Anders Carlsson
67e9e6234c Add a puts optimization that converts puts() to putchar('\n').
llvm-svn: 120398
2010-11-30 06:19:18 +00:00
Anders Carlsson
2a46a03898 Fix a typo.
llvm-svn: 120394
2010-11-30 06:03:55 +00:00
Anders Carlsson
a2ad88fb73 Rename this test to FPuts.ll since it actually tests fputs.
llvm-svn: 120393
2010-11-30 05:59:26 +00:00
Chris Lattner
bea813875e remove a use of llvm-dis
llvm-svn: 120383
2010-11-30 02:04:15 +00:00
Chris Lattner
56b0cc6974 merge one more away
llvm-svn: 120375
2010-11-30 01:06:43 +00:00