1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

8890 Commits

Author SHA1 Message Date
Devang Patel
3b08c33f33 Remove dead debug info intrinsics.
Intrinsic::dbg_stoppoint
 Intrinsic::dbg_region_start 
 Intrinsic::dbg_region_end 
 Intrinsic::dbg_func_start
AutoUpgrade simply ignores these intrinsics now.

llvm-svn: 92557
2010-01-05 01:10:40 +00:00
Devang Patel
6915c52d56 Fix debug_inlined section entries for routines whose names are changed through __asm() extension.
llvm-svn: 92533
2010-01-04 23:04:36 +00:00
Dan Gohman
73b0882c6e Make this test more portable.
llvm-svn: 92514
2010-01-04 21:23:34 +00:00
Devang Patel
e04ac70892 Remove oversimplified test case.
llvm-svn: 92510
2010-01-04 20:54:06 +00:00
Dan Gohman
b71bc40eed Add some tests and update an existing test to reflect recent
x86 isel peeps.

llvm-svn: 92509
2010-01-04 20:53:54 +00:00
Devang Patel
46d7029a91 The test, derived from optimzed IR, does not mention "bar" in debug info anywhere so the dwarf writer is not expected to emit any debug info for function "bar".
llvm-svn: 92499
2010-01-04 19:41:13 +00:00
Chris Lattner
3b060b2d41 Truncate GEP indexes larger than the pointer size down to pointer size
when doing this transform if the GEP is not inbounds.  No testcase because
it is very difficult to trigger this: instcombine already canonicalizes
GEP indices to pointer size, so it relies specific permutations of the
instcombine worklist.

Thanks to Duncan for pointing this possible problem out.

llvm-svn: 92495
2010-01-04 18:57:15 +00:00
Anton Korobeynikov
3915cf5ef4 Fix invalid chain folding for memory variant of sdiv / udiv
llvm-svn: 92472
2010-01-04 10:31:54 +00:00
Chris Lattner
ce3f5f3448 implement an instcombine xform needed by clang's codegen
on the example in PR4216.  This doesn't trigger in the testsuite,
so I'd really appreciate someone scrutinizing the logic for
correctness.

llvm-svn: 92458
2010-01-04 06:03:59 +00:00
Chris Lattner
8e83066d12 fix PR5930, allowing the asmprinter to emit difference between
two labels as a truncate.

llvm-svn: 92455
2010-01-03 18:33:18 +00:00
Chris Lattner
49cda26f7e add PR#
llvm-svn: 92451
2010-01-03 18:10:58 +00:00
Chris Lattner
7246a69d2b differences between two blockaddress's don't cause a
global variable initializer to require relocations.

llvm-svn: 92450
2010-01-03 18:09:40 +00:00
Chris Lattner
647c629ee4 generalize the previous transformation to handle indexing into
arrays of structs and other arrays, so long as all the subsequent
indexes are constants.  This triggers frequently for stuff like:

@divisions = internal constant [29 x [2 x i32]] [[2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 1], [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 2], [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] zeroinitializer, [2 x i32] [i32 0, i32 2], [2 x i32] [i32 0, i32 1], [2 x i32] zeroinitializer, [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 0], [2 x i32] [i32 1, i32 1], [2 x i32] [i32 1, i32 2], [2 x i32] [i32 1, i32 2]], align 32 ; <[29 x [2 x i32]]*> [#uses=50]

	  %623 = getelementptr inbounds [29 x [2 x i32]]* @divisions, i64 0, i64 %619, i64 0 ; <i32*> [#uses=1]
	   %684 = icmp eq i32 %683, 999 

also for the "my_defs" table in 'gs', etc.

llvm-svn: 92444
2010-01-03 03:03:27 +00:00
Chris Lattner
acb0c133ec teach instcombine to optimize idioms like A[i]&42 == 0. This
occurs in 403.gcc in mode_mask_array, in safe-ctype.c (which
is copied in multiple apps) in _sch_istable, etc.

llvm-svn: 92427
2010-01-02 22:08:28 +00:00
Chris Lattner
4af67af013 Teach the table lookup optimization to generate range compares
when a consequtive sequence of elements all satisfies the 
predicate.  Like the double compare case, this generates better
code than the magic constant case and generalizes to more than
32/64 element array lookups.

Here are some examples where it triggers.  From 403.gcc, most
accesses to the rtx_class array are handled, e.g.:

@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=547]
   %142 = icmp eq i8 %141, 105
@rtx_class = constant [153 x i8] c"xxxxxmmmmmmmmxxxxxxxxxxxxmxxxxxxiiixxxxxxxxxxxxxxxxxxxooxooooooxxoooooox3x2c21c2222ccc122222ccccaaaaaa<<<<<<<<<<<<<<<<<<111111111111bbooxxxxxxxxxxcc2211x", align 32 ; <[153 x i8]*> [#uses=543]
	   %165 = icmp eq i8 %164, 60      

Also, most of the 59-element arrays (mode_class/rid_to_yy, etc) 
optimized before are actually range compares.  This lets 32-bit
machines optimize them.

400.perlbmk has stuff like this:

400.perlbmk: PL_regkind, even for 32-bit:
@PL_regkind = constant [62 x i8] c"\00\00\02\02\02\06\06\06\06\09\09\0B\0B\0D\0E\0E\0E\11\12\12\14\14\16\16\18\18\1A\1A\1C\1C\1E\1F !!!$$&'((((,-.///88886789:;8$", align 32 ; <[62 x i8]*> [#uses=4]
	   %811 = icmp ne i8 %810, 33 

@PL_utf8skip = constant [256 x i8] c"\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\01\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\02\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\03\04\04\04\04\04\04\04\04\05\05\05\05\06\06\07\0D", align 32 ; <[256 x i8]*> [#uses=94]
	   %12 = icmp ult i8 %10, 2
           
etc.

llvm-svn: 92426
2010-01-02 21:50:18 +00:00
Nick Lewycky
cda0109ec5 Fix logic error in previous commit. The != case needs to become an or, not an
and.

llvm-svn: 92419
2010-01-02 16:14:56 +00:00
Nick Lewycky
3cc8fe073a Optimize pointer comparison into the typesafe form, now that the backends will
handle them efficiently. This is the opposite direction of the transformation
we used to have here.

llvm-svn: 92418
2010-01-02 15:25:44 +00:00
Chris Lattner
e1a2489017 Generalize the previous xform to handle cases where exactly
two elements match or don't match with two comparisons.  For
example, the testcase compiles into:

define i1 @test5(i32 %X) {
  %1 = icmp eq i32 %X, 2                          ; <i1> [#uses=1]
  %2 = icmp eq i32 %X, 7                          ; <i1> [#uses=1]
  %R = or i1 %1, %2                               ; <i1> [#uses=1]
  ret i1 %R
}

This generalizes the previous xforms when the array is larger than
64 elements (and this case matches) and generates better code for
cases where it overlaps with the magic bitshift case.

This generalizes more cases than you might expect.  For example,
400.perlbmk has:

@PL_utf8skip = constant [256 x i8] c"\01\01\01\...
%15 = icmp ult i8 %7, 7

403.gcc has:
@rid_to_yy = internal constant [114 x i16] [i16 259, i16 260, ...
%18 = icmp eq i16 %16, 295 

and xalancbmk has a bunch of examples, such as 
_ZN11xercesc_2_5L15gCombiningCharsE and _ZN11xercesc_2_5L10gBaseCharsE.

llvm-svn: 92417
2010-01-02 09:35:17 +00:00
Chris Lattner
1cdc77b8da enhance the compare/load/index optimization to work on *any* load
from a global with 32/64 elements or less (depending on whether
i64 is native on the target), generating a bitshift idiom to 
determine the result.  For example, on test4 we produce:

define i1 @test4(i32 %X) {
  %1 = lshr i32 933, %X                           ; <i32> [#uses=1]
  %2 = and i32 %1, 1                              ; <i32> [#uses=1]
  %R = icmp ne i32 %2, 0                          ; <i1> [#uses=1]
  ret i1 %R
}

This triggers in a number of interesting cases, for example, here's an
fp case:
@A.3255 = internal constant [4 x double] [double 4.100000e+00, double -3.900000e+00, double -1.000000e+00, double 1.000000e+00], align 32 ; <[4 x double]*> [#uses=7]
...
	   %7 = fcmp olt double %3, 0.000000e+00

In this case we make the slen2_tab global dead, which is nice:
@slen2_tab = internal constant [16 x i32] [i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 1, i32 2, i32 3, i32 2, i32 3], align 32 ; <[16 x i32]*> [#uses=1]
...
	   %204 = icmp eq i32 %46, 0     

Perl has a bunch of these, also on the 'Perl_regkind' array:
@Perl_yygindex = internal constant [51 x i16] [i16 0, i16 0, i16 0, i16 0, i16 374, i16 351, i16 0, i16 -12, i16 0, i16 946, i16 413, i16 -83, i16 0, i16 0, i16 0, i16 -311, i16 -13, i16 4007, i16 2893, i16 0, i16 0, i16 0, i16 0, i16 0, i16 372, i16 -8, i16 0, i16 0, i16 246, i16 -131, i16 43, i16 86, i16 208, i16 -45, i16 -169, i16 987, i16 0, i16 0, i16 0, i16 0, i16 308, i16 0, i16 -271, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0], align 32 ; <[51 x i16]*> [#uses=1]
...
  %1364 = icmp eq i16 %1361, 0

186.crafty really likes this on 64-bit machines, because it triggers on a bunch of globals like this:
@white_outpost = internal constant [64 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\02\02\00\00\00\00\00\04\05\05\04\00\00\00\00\03\06\06\03\00\00\00\00\00\01\01\00\00\00\00\00\00\00\00\00\00\00", align 32 ; <[64 x i8]*> [#uses=2]

However the big winner is 403.gcc, which triggers hundreds of times, eliminating all the accesses to the 57-element arrays 'mode_class', mode_unit_size, mode_bitsize, regclass_map, etc.

go 64-bit machines :)

llvm-svn: 92415
2010-01-02 08:56:52 +00:00
Chris Lattner
59136ba5ad enhance the previous optimization to work with fcmp in addition
to icmp.

llvm-svn: 92412
2010-01-02 08:20:51 +00:00
Chris Lattner
f3f6c10218 Teach instcombine to fold compares of loads from constant
arrays with variable indices into a comparison of the index
with a constant.  The most common occurrence of this that
I see by far is stuff like:

if ("foobar"[i] == '\0') ...

which we compile into: if (i == 6), saving a load and 
materialization of the global address.  This also exposes 
loop trip count information to later passes in many cases.

This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps.  Here are a few 
interesting ones from various apps:

@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
  %scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
  %17 = load ...
  %18 = icmp eq i8* %17, null                     ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7 


@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
  %57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
   %mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
   %64 = icmp eq i8 %58, 4                         ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35             ; <i1> [#uses=0]


@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768                 ; <i1> [#uses=0]
-> false

llvm-svn: 92411
2010-01-02 08:12:04 +00:00
Chris Lattner
cf784992da remove the instcombine transformations that are inserting nasty
pointer to int casts that confuse later optimizations.  See PR3351
for details.

This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well.  Clang++ will do
better I guess.

llvm-svn: 92408
2010-01-02 00:31:05 +00:00
Chris Lattner
9e64bad0da allow this to work on linux hosts.
llvm-svn: 92407
2010-01-02 00:22:15 +00:00
Chris Lattner
fe8af82cd4 Teach codegen to handle:
(X != null) | (Y != null) --> (X|Y) != 0
 (X == null) & (Y == null) --> (X|Y) == 0

so that instcombine can stop doing this for pointers.  This is part of PR3351,
which is a case where instcombine doing this for pointers (inserting ptrtoint)
is pessimizing code.

llvm-svn: 92406
2010-01-02 00:00:03 +00:00
Chris Lattner
4e49a69ec5 rename file.
llvm-svn: 92405
2010-01-01 23:55:04 +00:00
Chris Lattner
ef4fba933d add a simple instcombine xform, simplify another one to use hasAllZeroIndices()
instead of hand rolling a loop.

llvm-svn: 92403
2010-01-01 23:09:08 +00:00
Chris Lattner
feb7b1af69 generalize the pointer difference optimization to handle
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.

llvm-svn: 92402
2010-01-01 22:42:29 +00:00
Chris Lattner
89b1b63bdf teach instcombine to optimize pointer difference idioms involving constant
expressions.  This is a step towards comment #4 in PR3351.

llvm-svn: 92401
2010-01-01 22:29:12 +00:00
Chris Lattner
ce7717e168 implement the transform requested in PR5284
llvm-svn: 92398
2010-01-01 18:34:40 +00:00
Chris Lattner
44298d184a Teach codegen to lower llvm.powi to an efficient (but not optimal)
multiply sequence when the power is a constant integer.  Before, our
codegen for std::pow(.., int) always turned into a libcall, which was
really inefficient.

This should also make many gfortran programs happier I'd imagine.

llvm-svn: 92388
2010-01-01 03:32:16 +00:00
Chris Lattner
3d38dbff2a Make this more likely to generate a libcall.
llvm-svn: 92387
2010-01-01 03:26:51 +00:00
Chris Lattner
e5f5e4b151 add a few trivial instcombines for llvm.powi.
llvm-svn: 92383
2010-01-01 01:52:15 +00:00
Chris Lattner
662a872e15 When factoring multiply expressions across adds, factor both
positive and negative forms of constants together.  This 
allows us to compile:

int foo(int x, int y) {
    return (x-y) + (x-y) + (x-y);
}

into:

_foo:                                                       ## @foo
	subl	%esi, %edi
	leal	(%rdi,%rdi,2), %eax
	ret

instead of (where the 3 and -3 were not factored):

_foo:
        imull   $-3, 8(%esp), %ecx
        imull   $3, 4(%esp), %eax
        addl    %ecx, %eax
        ret

this started out as:
    movl    12(%ebp), %ecx
    imull   $3, 8(%ebp), %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    subl    %ecx, %eax
    ret

This comes from PR5359.

llvm-svn: 92381
2010-01-01 01:13:15 +00:00
Chris Lattner
dd837a069b test case we alredy get right.
llvm-svn: 92380
2010-01-01 00:50:00 +00:00
Chris Lattner
ebe5932016 reuse negates where possible instead of always creating them from scratch.
This allows us to optimize test12 into:

define i32 @test12(i32 %X) {
  %factor = mul i32 %X, -3                        ; <i32> [#uses=1]
  %Z = add i32 %factor, 6                         ; <i32> [#uses=1]
  ret i32 %Z
}

instead of:

define i32 @test12(i32 %X) {
  %Y = sub i32 6, %X                              ; <i32> [#uses=1]
  %C = sub i32 %Y, %X                             ; <i32> [#uses=1]
  %Z = sub i32 %C, %X                             ; <i32> [#uses=1]
  ret i32 %Z
}

llvm-svn: 92373
2009-12-31 20:34:32 +00:00
Chris Lattner
59379f8bb0 teach reassociate to factor x+x+x -> x*3. While I'm at it,
fix RemoveDeadBinaryOp to actually do something.

llvm-svn: 92368
2009-12-31 19:24:52 +00:00
Chris Lattner
4870f6a384 simple fix for an incorrect factoring which causes a
miscompilation, PR5458.

llvm-svn: 92354
2009-12-31 08:33:49 +00:00
Chris Lattner
d9ed31f6d0 merge some more tests in.
llvm-svn: 92353
2009-12-31 08:32:22 +00:00
Chris Lattner
67d96ee04c filecheckize
llvm-svn: 92352
2009-12-31 08:29:56 +00:00
Chris Lattner
70788c5e57 add some basic named MD tests.
llvm-svn: 92336
2009-12-31 03:00:49 +00:00
Chris Lattner
83d5230121 fix two bogus tests that the asmparser now rejects.
llvm-svn: 92303
2009-12-30 05:54:51 +00:00
Chris Lattner
03a91d987f reimplement insertvalue/extractvalue metadata handling to not blindly
accept invalid input.  Actually add a testcase.

llvm-svn: 92297
2009-12-30 05:14:00 +00:00
Chris Lattner
c42b8ff24e fix parsing of mdstring values.
llvm-svn: 92290
2009-12-30 04:13:37 +00:00
Chris Lattner
c6b925c592 Each instruction is allowed to have *multiple* different
metadata objects on them.  Though the entire compiler supports this,
the asmparser didn't.

llvm-svn: 92270
2009-12-29 21:25:40 +00:00
Chris Lattner
86c74f4783 Do not crash when .ll printing metadata that smells like debug info, but isn't.
llvm-svn: 92268
2009-12-29 21:17:33 +00:00
Sanjiv Gupta
543a6716fb Extern declaration for unordered.f32 libcall was not being emitted. Fixed that.
llvm-svn: 92242
2009-12-29 03:24:34 +00:00
Sanjiv Gupta
efad5b2a93 Fixed llc crash for zext (i1 -> i8) loads.
llvm-svn: 92201
2009-12-28 04:53:24 +00:00
Dale Johannesen
a62752a06e Testcase for llvm-gcc checkin 92108.
llvm-svn: 92110
2009-12-24 01:10:43 +00:00
Chris Lattner
4e96d36f72 handle equality memcmp of 8 bytes on x86-64 with two unaligned loads and a
compare.  On other targets we end up with a call to memcmp because we don't
want 16 individual byte loads.  We should be able to use movups as well, but
we're failing to select the generated icmp.

llvm-svn: 92107
2009-12-24 01:07:17 +00:00
Chris Lattner
5d3919d5f9 move an optimization for memcmp out of simplifylibcalls and into
SDISel.  This optimization was causing simplifylibcalls to 
introduce type-unsafe nastiness.  This is the first step, I'll be 
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.

llvm-svn: 92098
2009-12-24 00:37:38 +00:00