1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

270 Commits

Author SHA1 Message Date
Owen Anderson
47f0efad86 When folding away a (shl (shr)) pair, we need to check that the bits that will BECOME the low
bits are zero, not that the current low bits are zero.  Fixes <rdar://problem/8606771>.

llvm-svn: 117953
2010-11-01 21:08:20 +00:00
Bob Wilson
d84c4629c1 Clean up indentation and other whitespace.
llvm-svn: 117728
2010-10-29 22:20:45 +00:00
Bob Wilson
9297637759 Remove trailing whitespace.
llvm-svn: 117727
2010-10-29 22:20:43 +00:00
Bob Wilson
59dedc2629 Fix 80-column violation.
llvm-svn: 117722
2010-10-29 22:03:07 +00:00
Bob Wilson
d7f24e831f Change instcombine's getShuffleMask to represent undef with negative values.
This code had previously used 2*N, where N is the mask length, to represent
undef.  That is not safe because the shufflevector operands may have more
than N elements -- they don't have to match the result type.

llvm-svn: 117721
2010-10-29 22:03:05 +00:00
Bob Wilson
996353fb5d Make instcombine a little more aggressive in combining vector shuffles.
Allow splats even if they don't match either of the original shuffles,
possibly due to undef entries in the shuffles masks.  Radar 8597790.
Also fix some 80-column violations.

llvm-svn: 117719
2010-10-29 22:02:50 +00:00
Dale Johannesen
454b9243bd Teach InstCombine not to use Add and Neg on FP. PR 8490.
llvm-svn: 117510
2010-10-27 23:45:18 +00:00
Dan Gohman
96e34e87ca Fix a case where instcombine was stripping metadata (and alignment)
from stores when folding in bitcasts.

llvm-svn: 117265
2010-10-25 16:16:27 +00:00
Benjamin Kramer
86b0370b66 SmallVectorize.
llvm-svn: 117213
2010-10-23 17:10:24 +00:00
Bob Wilson
0290dbe7d4 Teach instcombine to set the alignment arguments for NEON load/store intrinsics.
llvm-svn: 117154
2010-10-22 21:41:48 +00:00
Owen Anderson
46990c17f7 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.

llvm-svn: 116820
2010-10-19 17:21:58 +00:00
Owen Anderson
69cbf2e8b7 Now with fewer extraneous semicolons!
llvm-svn: 115996
2010-10-07 22:25:06 +00:00
Owen Anderson
8adf6f2759 Add initialization routines to InstCombine.
llvm-svn: 115965
2010-10-07 20:04:55 +00:00
Chris Lattner
bf0f375aba fix PR8267 - Instcombine shouldn't optimizer away volatile memcpy's.
llvm-svn: 115296
2010-10-01 05:51:02 +00:00
Oscar Fuentes
eb27a44982 Removed a bunch of unnecessary target_link_libraries.
llvm-svn: 114999
2010-09-28 22:39:14 +00:00
Michael J. Spencer
90f807fda5 Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."
This reverts commit r113632

Conflicts:

	cmake/modules/AddLLVM.cmake

llvm-svn: 113819
2010-09-13 23:59:48 +00:00
Owen Anderson
9c34a7831d Re-apply r113679, which was reverted in r113720, which added a paid of new instcombine transforms
to expose greater opportunities for store narrowing in codegen.  This patch fixes a potential
infinite loop in instcombine caused by one of the introduced transforms being overly aggressive.

llvm-svn: 113763
2010-09-13 17:59:27 +00:00
Eric Christopher
d4aaabfa74 Revert 113679, it was causing an infinite loop in a testcase that I've sent
on to Owen.

llvm-svn: 113720
2010-09-12 06:09:23 +00:00
Owen Anderson
d4ebde12ce Invert and-of-or into or-of-and when doing so would allow us to clear bits of the and's mask.
This can result in increased opportunities for store narrowing in code generation.  Update a number of
tests for this change.  This fixes <rdar://problem/8285027>.

Additionally, because this inverts the order of ors and ands, some patterns for optimizing or-of-and-of-or
no longer fire in instances where they did originally.  Add a simple transform which recaptures most of these
opportunities: if we have an or-of-constant-or and have failed to fold away the inner or, commute the order 
of the two ors, to give the non-constant or a chance for simplification instead.

llvm-svn: 113679
2010-09-11 05:48:06 +00:00
Michael J. Spencer
98ad3f2ea7 CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally.
llvm-svn: 113632
2010-09-10 21:14:25 +00:00
Benjamin Kramer
cd84f14639 This transform is also performed by InstructionSimplify, remove the duplicate.
llvm-svn: 113608
2010-09-10 19:52:35 +00:00
Owen Anderson
c51d7d1a8d Generalize instcombine's support for combining multiple bit checks into a single test. Patch by Dirk Steinke!
llvm-svn: 113423
2010-09-08 22:16:17 +00:00
Chris Lattner
a58a97dafc Fix a serious performance regression introduced by r108687 on linux:
turning (fptrunc (sqrt (fpext x))) -> (sqrtf x)  is great, but we have
to delete the original sqrt as well.  Not doing so causes us to do 
two sqrt's when building with -fmath-errno (the default on linux).

llvm-svn: 113260
2010-09-07 20:01:38 +00:00
Owen Anderson
bd9edea8a3 Remove r111665, which implemented store-narrowing in InstCombine. Chris discovered a miscompilation in it, and it's not easily
fixable at the optimizer level. I'll investigate reimplementing it in DAGCombine.

llvm-svn: 112575
2010-08-31 04:41:06 +00:00
Chris Lattner
fc1da78d16 for completeness, allow undef also.
llvm-svn: 112351
2010-08-28 03:36:51 +00:00
Chris Lattner
b61cf1e296 handle the constant case of vector insertion. For something
like this:

struct S { float A, B, C, D; };

struct S g;
struct S bar() { 
  struct S A = g;
  ++A.B;
  A.A = 42;
  return A;
}

we now generate:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	pshufd	$16, %xmm2, %xmm2
	movss	LCPI1_1(%rip), %xmm0
	pshufd	$16, %xmm0, %xmm0
	unpcklps	%xmm2, %xmm0
	ret

instead of:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	12(%rax), %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	unpcklps	%xmm0, %xmm1
	addss	LCPI1_0(%rip), %xmm2
	movd	%xmm2, %eax
	shlq	$32, %rax
	addq	$1109917696, %rax       ## imm = 0x42280000
	movd	%rax, %xmm0
	ret

llvm-svn: 112345
2010-08-28 01:50:57 +00:00
Chris Lattner
c70b0c0ee7 optimize bitcasts from large integers to vector into vector
element insertion from the pieces that feed into the vector.
This handles a pattern that occurs frequently due to code
generated for the x86-64 abi.  We now compile something like
this:

struct S { float A, B, C, D; };
struct S g;
struct S bar() { 
  struct S A = g;
  ++A.A;
  ++A.C;
  return A;
}

into all nice vector operations:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm0
	movss	4(%rax), %xmm2
	movss	12(%rax), %xmm3
	pshufd	$16, %xmm2, %xmm2
	unpcklps	%xmm2, %xmm0
	addss	8(%rax), %xmm1
	pshufd	$16, %xmm1, %xmm1
	pshufd	$16, %xmm3, %xmm2
	unpcklps	%xmm2, %xmm1
	ret

instead of icky integer operations:

_bar:                                   ## @bar
	movq	_g@GOTPCREL(%rip), %rax
	movss	LCPI1_0(%rip), %xmm1
	movss	(%rax), %xmm0
	addss	%xmm1, %xmm0
	movd	%xmm0, %ecx
	movl	4(%rax), %edx
	movl	12(%rax), %esi
	shlq	$32, %rdx
	addq	%rcx, %rdx
	movd	%rdx, %xmm0
	addss	8(%rax), %xmm1
	movd	%xmm1, %eax
	shlq	$32, %rsi
	addq	%rax, %rsi
	movd	%rsi, %xmm1
	ret

This resolves rdar://8360454

llvm-svn: 112343
2010-08-28 01:20:38 +00:00
Chris Lattner
3f880c2097 Enhance the shift propagator to handle the case when you have:
A = shl x, 42
...
B = lshr ..., 38

which can be transformed into:
A = shl x, 4
...

iff we can prove that the would-be-shifted-in bits
are already zero.  This eliminates two shifts in the testcase
and allows eliminate of the whole i128 chain in the real example.

llvm-svn: 112314
2010-08-27 22:53:44 +00:00
Chris Lattner
80632e5fd9 Implement a pretty general logical shift propagation
framework, which is good at ripping through bitfield
operations.  This generalize a bunch of the existing
xforms that instcombine does, such as 
  (x << c) >> c -> and
to handle intermediate logical nodes.  This is useful for
ripping up the "promote to large integer" code produced by
SRoA.

llvm-svn: 112304
2010-08-27 22:24:38 +00:00
Chris Lattner
5ed3d56ced remove some special shift cases that have been subsumed into the
more general simplify demanded bits logic.

llvm-svn: 112291
2010-08-27 21:04:34 +00:00
Chris Lattner
866b888095 teach the truncation optimization that an entire chain of
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.

llvm-svn: 112285
2010-08-27 20:32:06 +00:00
Chris Lattner
69a9143584 Add an instcombine to clean up a common pattern produced
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:

   %94 = zext i16 %93 to i32                       ; <i32> [#uses=2]
   %96 = lshr i32 %94, 8                           ; <i32> [#uses=1]
   %101 = trunc i32 %96 to i8                      ; <i8> [#uses=1]

This also unblocks other xforms from happening, now clang is able to compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	pshufd	$1, %xmm0, %xmm2
	addss	%xmm0, %xmm2
	movdqa	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	pshufd	$1, %xmm1, %xmm0
	addss	%xmm3, %xmm0
	ret

on x86-64, instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

This seems pretty close to optimal to me, at least without
using horizontal adds.  This also triggers in lots of other
code, including SPEC.

llvm-svn: 112278
2010-08-27 18:31:05 +00:00
Chris Lattner
d5d68438c1 optimize "integer extraction out of the middle of a vector" as produced
by SRoA.  This is part of rdar://7892780, but needs another xform to
expose this.

llvm-svn: 112232
2010-08-26 22:14:59 +00:00
Chris Lattner
19a5dc488b optimize bitcast(trunc(bitcast(x))) where the result is a float and 'x'
is a vector to be a vector element extraction.  This allows clang to
compile:

struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }

into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movapd	%xmm1, %xmm3
	addss	%xmm2, %xmm3
	movd	%xmm1, %rax
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm3, %xmm0
	ret

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movd	%xmm0, %rax
	movd	%eax, %xmm0
	shrq	$32, %rax
	movd	%eax, %xmm2
	addss	%xmm0, %xmm2
	movd	%xmm1, %rax
	movd	%eax, %xmm1
	addss	%xmm2, %xmm1
	shrq	$32, %rax
	movd	%eax, %xmm0
	addss	%xmm1, %xmm0
	ret

... eliminating half of the horribleness.

llvm-svn: 112227
2010-08-26 21:55:42 +00:00
Owen Anderson
678fd04aa5 Re-apply r111568 with a fix for the clang self-host.
llvm-svn: 111665
2010-08-20 18:24:43 +00:00
Owen Anderson
0e57acb623 Revert r111568 to unbreak clang self-host.
llvm-svn: 111571
2010-08-19 23:25:16 +00:00
Owen Anderson
7f2852ba2d When a set of bitmask operations, typically from a bitfield initialization, only modifies the low bytes of a value,
we can narrow the store to only over-write the affected bytes.

llvm-svn: 111568
2010-08-19 22:15:40 +00:00
Eric Christopher
08e9f0250a Temporarily revert r110987 as it's causing some miscompares in
vector heavy code.  I'll re-enable when we've tracked down the problem.

llvm-svn: 111318
2010-08-17 22:55:27 +00:00
Nate Begeman
e57074fc48 Reapply this transformation now that it is passing the external test which it previously failed.
llvm-svn: 110987
2010-08-13 00:17:53 +00:00
Eric Christopher
34acdf57df Temporarily revert 110737 and 110734, they were causing failures
in an external testsuite.

llvm-svn: 110905
2010-08-12 07:01:22 +00:00
Nate Begeman
713062e756 Add the minimal amount of smarts necessary to instcombine of shufflevectors to recognize
patterns generated by clang for transpose of a matrix in generic vectors.  This is made
of two parts:

1) Propagating vector extracts of hi/lo half into their users
2) Recognizing an insertion of even elements followed by the odd elements as an unpack.

Testcase to come, but this shrinks the # of shuffle instructions generated on x86 from ~40 to the minimal 8.

llvm-svn: 110734
2010-08-10 21:38:12 +00:00
Eli Friedman
7197d66ff1 PR7853: fix a silly mistake introduced in r101899, and add a test to make sure
it doesn't regress again.

llvm-svn: 110597
2010-08-09 20:49:43 +00:00
Owen Anderson
f2fea95f2f Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Owen Anderson
aadd8a89ca Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Owen Anderson
b9762c07cb Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Dan Gohman
a80f89dbc7 Make instcombine set explicit alignments on load or store
instructions with alignment 0, so that subsequent passes don't
need to bother checking the TargetData ABI size manually.

llvm-svn: 110128
2010-08-03 18:20:32 +00:00
Dan Gohman
8391bdbe95 Use unary + instead of a separate local variable for working
around std::min vs static const friction.

llvm-svn: 110112
2010-08-03 16:15:50 +00:00
Owen Anderson
e957c57ebb Re-apply the infamous r108614, with a fix pointed out by Dirk Steinke.
llvm-svn: 110036
2010-08-02 09:32:13 +00:00
Daniel Dunbar
f2be238c99 Speculatively revert r108614, "Another attempt at getting the clang self-host to
like my instcombine patch.", in an attempt to fix Clang i386 bootstrap.
 - Also PR7719.

llvm-svn: 109953
2010-07-31 19:51:11 +00:00
Dan Gohman
230d92b375 Move MaximumAlignment to be a member of the Value class.
llvm-svn: 109891
2010-07-30 21:07:05 +00:00