mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

69210 Commits

Author SHA1 Message Date
Chris Lattner
29f339f87c if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase is now optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.
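
For reference, the input that triggers this looks roughly like the following
(a hedged sketch in the typed-pointer IR of the era; function name and details
are hypothetical, the real testcase may differ): an alloca that is stored and
loaded only as a whole i64, never element by element.

define i64 @test2.input(i64 %X) {
  %A = alloca { [8 x i8] }                ; only ever accessed as a unit
  %B = bitcast { [8 x i8] }* %A to i64*
  store i64 %X, i64* %B
  br label %L2

L2:
  %Z = load i64* %B                       ; whole-alloca load, no element access
  ret i64 %Z
}

The old behavior shattered %A into eight separate i8 elements; with this change
the alloca is left whole and promotes cleanly.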

llvm-svn: 123571
2011-01-16 06:18:28 +00:00
Chris Lattner
b43fce09a9 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.

llvm-svn: 123570
2011-01-16 05:58:24 +00:00
Chris Lattner
1a125a870f remove a dead check, this was needed before we had an explicit veto on uses of phis.
llvm-svn: 123569
2011-01-16 05:37:55 +00:00
Chris Lattner
2067fb2a93 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.
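
A hedged sketch of the shape this handles (hypothetical values): a phi with
multiple uses can now be promoted when every use is the same operation, by
folding that operation into the incoming values.

; before: %p has two uses, but both are identical adds
%p = phi i32 [ 1, %bb1 ], [ 2, %bb2 ]
%a = add i32 %p, 10
%b = add i32 %p, 10

; after: the add is folded into the phi; %a and %b are replaced by %p.add
%p.add = phi i32 [ 11, %bb1 ], [ 12, %bb2 ]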

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Evan Cheng
144b435a15 Spill R4 if it's going to be used to restore SP from FP.
llvm-svn: 123567
2011-01-16 05:14:33 +00:00
Chris Lattner
84d8f40fbb remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
2011-01-16 05:14:26 +00:00
Chris Lattner
c639cb2c82 more cleanups: use the IR builder.
llvm-svn: 123565
2011-01-16 05:08:00 +00:00
Chris Lattner
9af2484c39 tidy up code.
llvm-svn: 123564
2011-01-16 04:37:29 +00:00
Owen Anderson
6e0fa67f91 Improve the safety of my globalopt enhancement by ensuring that the bitcast
of the stored value to the new store type is always.  Also, add a testcase.

llvm-svn: 123563
2011-01-16 04:33:33 +00:00
Chris Lattner
aba06ce448 fix PR8983, a broken assertion.
llvm-svn: 123562
2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju
fe346f6cba Implement AnalyzeBranch in Sparc Backend.
llvm-svn: 123561
2011-01-16 03:15:11 +00:00
Chris Lattner
24ea7f696e fix PR8981, a crash trying to form a conditional inc with a floating point compare.
llvm-svn: 123560
2011-01-16 02:56:53 +00:00
Chris Lattner
c4d1d86d3e reapply my fix for PR8961 with a tweak to properly handle
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
2011-01-16 02:27:38 +00:00
Chris Lattner
e3d0c7819e simplify this code; it is still broken, but I will follow up on llvm-commits.
llvm-svn: 123558
2011-01-16 02:05:10 +00:00
Michael J. Spencer
76f1706025 Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1."
llvm-svn: 123557
2011-01-16 01:43:22 +00:00
Chandler Carruth
a3261fcca5 Simplify a README.txt entry significantly to expose the core issue.
llvm-svn: 123556
2011-01-16 01:40:23 +00:00
Chris Lattner
44bcf63348 one of Michael's recent patches broke this; temporarily disable
it so the bots go green

llvm-svn: 123555
2011-01-16 01:04:49 +00:00
Chris Lattner
75599bb566 remove the partial specialization pass. It is unmaintained and has bugs.
llvm-svn: 123554
2011-01-16 00:27:10 +00:00
Michael J. Spencer
927075c958 Archive: Fix spelling.
llvm-svn: 123552
2011-01-15 21:43:45 +00:00
Michael J. Spencer
303c304f0d Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123551
2011-01-15 21:43:37 +00:00
Michael J. Spencer
971bf61475 Support/GraphWriter: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123550
2011-01-15 21:43:25 +00:00
Benjamin Kramer
2e7ead5bb5 Add an assert so we don't silently miscompile ctpop for bit widths > 128.
llvm-svn: 123549
2011-01-15 21:19:37 +00:00
Michael J. Spencer
e1defa51ae Support/PathV2: Add identify_magic.
llvm-svn: 123548
2011-01-15 20:39:36 +00:00
Benjamin Kramer
b48a048de6 Reimplement CTPOP legalization with the "best" algorithm from
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.
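
Expressed as a hedged LLVM IR sketch for the i32 case (the legalizer actually
builds the equivalent SelectionDAG nodes, not IR; names are hypothetical):

define i32 @popcount32(i32 %v) {
  %s1 = lshr i32 %v, 1
  %a1 = and i32 %s1, 1431655765      ; 0x55555555
  %t1 = sub i32 %v, %a1              ; 2-bit partial sums
  %s2 = lshr i32 %t1, 2
  %a2 = and i32 %t1, 858993459       ; 0x33333333
  %a3 = and i32 %s2, 858993459
  %t2 = add i32 %a2, %a3             ; 4-bit partial sums
  %s3 = lshr i32 %t2, 4
  %t3 = add i32 %t2, %s3
  %a4 = and i32 %t3, 252645135       ; 0x0F0F0F0F, 8-bit partial sums
  %m  = mul i32 %a4, 16843009        ; 0x01010101 accumulates the bytes
  %c  = lshr i32 %m, 24              ; popcount ends up in the top byte
  ret i32 %c
}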

llvm-svn: 123547
2011-01-15 20:30:30 +00:00
Michael J. Spencer
ef831d650f Unittests/Support/Path: Tweak test.
llvm-svn: 123546
2011-01-15 18:52:49 +00:00
Michael J. Spencer
86a5515979 Support/PathV2: Implement has_magic in terms of get_magic.
llvm-svn: 123545
2011-01-15 18:52:41 +00:00
Michael J. Spencer
78fc0cacd0 Support/PathV2: Implement get_magic.
llvm-svn: 123544
2011-01-15 18:52:33 +00:00
Nick Lewycky
7e71443cf2 Add missing whitespace.
llvm-svn: 123543
2011-01-15 18:42:52 +00:00
Nick Lewycky
1d57e867a4 Make constmerge a two-pass algorithm so that it won't miss merging
opportunities. Fixes PR8978.

llvm-svn: 123541
2011-01-15 18:14:21 +00:00
Oscar Fuentes
c9265ce51e Make config.h.cmake similar to config.h.in
Patch by arrowdodger!

llvm-svn: 123539
2011-01-15 13:35:37 +00:00
Benjamin Kramer
91f0608676 Try to unbreak selfhost.
llvm-svn: 123537
2011-01-15 11:25:34 +00:00
Nick Lewycky
9293c403d8 Add a cache that protects mergefunc's internals from more surprises in DenseSet.
Also, replace tabs with spaces. Yes, it's 2011.

llvm-svn: 123535
2011-01-15 10:16:23 +00:00
Nick Lewycky
708df45c84 Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this saves
half a million non-local queries, each of which would otherwise have triggered a
linear scan over a basic block.

Also fix a FIXME for memory intrinsics that dereference pointers. With this,
we prove 112 times in llvm-test that a pointer is non-null because it was
dereferenced by an intrinsic.
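
A hedged sketch of the kind of query this makes cheap (typed-pointer syntax
of the era, names hypothetical):

define i1 @alloca_not_null() {
  %p = alloca i32
  %c = icmp eq i32* %p, null    ; LVI now knows %p cannot be null
  ret i1 %c                     ; so this is foldable to false
}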

llvm-svn: 123533
2011-01-15 09:16:12 +00:00
Rafael Espindola
fde74c53b6 Add a clarification about merging constants with and without unnamed_addr.
llvm-svn: 123530
2011-01-15 08:20:57 +00:00
Rafael Espindola
3b43f22391 Allow unnamed_addr on declarations.
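
A hedged sketch of the syntax, which this change accepts on declarations as
well as definitions (names hypothetical):

@g = unnamed_addr constant i32 42    ; address not significant; mergeable
declare void @f() unnamed_addr       ; now also valid on a declaration
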
llvm-svn: 123529
2011-01-15 08:15:00 +00:00
Chris Lattner
55c2150f36 temporarily revert r123526. While working on a follow-on patch I
realized that ConstantFoldTerminator doesn't preserve dominfo.

llvm-svn: 123527
2011-01-15 07:51:19 +00:00
Chris Lattner
68a47147ba fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch of dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.
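
A hedged sketch of the fold (hypothetical labels):

; before: the condition is known, but the branch is still conditional
br i1 false, label %trap, label %cont

; after: folded to an unconditional branch; %trap and the computation
; feeding the condition become dead and can be cleaned up before isel
br label %cont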

llvm-svn: 123526
2011-01-15 07:36:13 +00:00
Chris Lattner
2a7c042c37 simplify code, no functionality change.
llvm-svn: 123525
2011-01-15 07:29:01 +00:00
Chris Lattner
d4eaf6eba8 Now that instruction optzns can update the iterator as they go, we can
have objectsize folding recursively simplify away their result when it
folds.  It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.

llvm-svn: 123524
2011-01-15 07:25:29 +00:00
Chris Lattner
939e77a0df make the current instruction iterator an ivar, allowing xforms that
potentially invalidate it (like inline asm lowering) to be sunk into
their proper place, cleaning up a ton of code.

llvm-svn: 123523
2011-01-15 07:14:54 +00:00
Chris Lattner
74ed5d30ca implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)
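
At the IR level the canonicalization looks roughly like this (a hedged sketch
with hypothetical values):

; before: the mask is applied in the narrow type
%t = trunc i64 %x to i32
%r = and i32 %t, 255

; after: the cast is moved outside the and-with-constant
%a  = and i64 %x, 255
%r2 = trunc i64 %a to i32

Keeping the masks in the wide type lets neighboring byte masks combine, which
is what collapses the byte-by-byte reassembly above.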

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Chris Lattner
089d215cb3 fix typo
llvm-svn: 123519
2011-01-15 06:27:35 +00:00
Chris Lattner
934c574ef9 Fix m_Not and m_Neg to not match random ConstantInts. Before,
these would try hard to match constants by inverting the bits
and recursively matching.  There are two problems with this:
1) some patterns would match when we didn't want them to (theoretical)
2) this is insanely expensive to do, and most often pointless.

This was apparently useful in just 2 instcombine cases, which I
added code to handle explicitly.  This change speeds up 'opt'
time on 176.gcc by 1% and produces bitwise identical code.

llvm-svn: 123518
2011-01-15 05:52:27 +00:00
Chris Lattner
0868c29c36 one more instcombine variant that is needed to work with future changes,
no functionality change currently.

llvm-svn: 123517
2011-01-15 05:50:18 +00:00
Chris Lattner
360fedf20a fix typo
llvm-svn: 123516
2011-01-15 05:42:47 +00:00
Chris Lattner
ca796e7838 Catch ~x < cst just like ~x < ~y; we currently handle this through
means that are about to disappear.
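
A hedged sketch of the pattern (hypothetical constant): since ~x = -1 - x, a
compare of ~x against a constant becomes a compare of x against the inverted
constant.

; before
%not = xor i32 %x, -1
%cmp = icmp slt i32 %not, 42

; after: ~x < 42  <=>  x > ~42, i.e. x > -43
%cmp2 = icmp sgt i32 %x, -43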

llvm-svn: 123515
2011-01-15 05:41:33 +00:00
Chris Lattner
06849c1228 reduce indentation
llvm-svn: 123514
2011-01-15 05:40:29 +00:00
Eric Christopher
d675e0b362 80-col.
llvm-svn: 123505
2011-01-15 00:25:09 +00:00
Chris Lattner
e6d5b3c4ce Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
Bob Wilson
e6b8ba1ae4 Fix a comment.
llvm-svn: 123497
2011-01-15 00:09:18 +00:00