mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

10422 Commits

Author SHA1 Message Date
Dale Johannesen
eb251be031 PPC doesn't support VLAs with large alignment. This was
formerly rejected by the FE, so asserted in the BE; now the FE only
warns, so we treat it as a legitimate fatal error in PPC BE.
This means the test for the feature won't pass, so it's xfail'd.

llvm-svn: 109892
2010-07-30 21:09:48 +00:00
Bruno Cardoso Lopes
f6ed26ef55 A *bunch* of tests for AVX intrinsics
llvm-svn: 109881
2010-07-30 19:57:56 +00:00
Bob Wilson
6394c4bdd1 Attempt to fix the llvm-gcc-powerpc-darwin9 buildbot.
llvm-svn: 109876
2010-07-30 18:52:47 +00:00
Eli Friedman
bea7c851cf Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly
check the range of the constant when optimizing a comparison between a
constant and a sign_extend_inreg node.

llvm-svn: 109854
2010-07-30 06:44:31 +00:00
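
A minimal sketch of the kind of range check r109854 describes, not the actual DAG combiner code. It assumes the comparison is between a constant and a value sign-extended from `bits` bits; the helper names `fits_in_signed` and `sign_extend_inreg` are hypothetical, chosen here for illustration only.

```python
def fits_in_signed(value, bits):
    """True if value is representable as a bits-wide two's-complement integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def sign_extend_inreg(x, from_bits):
    """Sign-extend the low from_bits bits of x and return a signed Python int."""
    x &= (1 << from_bits) - 1
    if x & (1 << (from_bits - 1)):
        x -= 1 << from_bits
    return x


def can_narrow_compare(constant, from_bits):
    """Narrowing the compare to from_bits is only sound if the constant fits.

    If the constant lies outside the range, blindly truncating it would make
    the narrowed compare match values the original compare never matched.
    """
    return fits_in_signed(constant, from_bits)
```

For example, an 8-bit sign-extended value can never equal 200, yet truncating 200 to 8 bits gives the bit pattern of -56, which the input 0xC8 would match. Checking the range first avoids that miscompile.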
Jim Grosbach
1718345a30 Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0, r9, r0, sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499

llvm-svn: 109842
2010-07-30 02:41:01 +00:00
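
The coalescing rule described in r109842 can be sketched as a set-membership test. This is an illustrative model only: register classes are plain Python sets, and the name `GPR_NO_SP_PC` is a hypothetical stand-in for the new Thumb2 class, not the name used in the patch.

```python
# Hypothetical register classes, modeled as sets of register names.
GPR = {"r%d" % i for i in range(13)} | {"sp", "lr", "pc"}
GPR_NO_SP_PC = GPR - {"sp", "pc"}  # stand-in for the new Thumb2 class


def can_coalesce(copy_src, dst_class):
    """A COPY is only considered for coalescing when the destination's
    register class contains the source register."""
    return copy_src in dst_class
```

With the operand constrained to the narrower class, `can_coalesce("sp", GPR_NO_SP_PC)` is false, so the copy from SP stays as a separate `mov` and the `mls` never sees SP directly.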
Eric Christopher
82701d8dfe Fix this up per llvm-gcc r109819.
llvm-svn: 109820
2010-07-29 23:20:29 +00:00
Benjamin Kramer
23151050ce Remove XFAIL, test doesn't leak anymore.
llvm-svn: 109801
2010-07-29 20:36:36 +00:00
Dale Johannesen
717fbb2b32 Implement vector constants which are splat of
integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).

llvm-svn: 109799
2010-07-29 20:10:08 +00:00
Dan Gohman
343e4fb4ea Make GlobalValue alignment consistent with load, store, and alloca
alignment, fixing silent truncation of alignment values.

llvm-svn: 109653
2010-07-28 20:56:48 +00:00
Dan Gohman
939744be5f Define a maximum supported alignment value for load, store, and
alloca instructions (constrained by their internal encoding),
and add error checking for it. Fix an instcombine bug which
generated huge alignment values (null is infinitely aligned).
This fixes undefined behavior noticed by John Regehr.

llvm-svn: 109643
2010-07-28 20:12:04 +00:00
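
To illustrate why a maximum alignment and an explicit check matter (r109643, r109653), here is a sketch that assumes alignment is stored as log2(align) + 1 in a fixed-width bit field, with 0 meaning "unspecified". The 5-bit width and all names below are assumptions made for this example, not values taken from the commits.

```python
FIELD_BITS = 5                        # assumed field width, for illustration
FIELD_MASK = (1 << FIELD_BITS) - 1
MAX_ALIGNMENT = 1 << (FIELD_MASK - 1)  # largest alignment the field can hold


def encode_alignment(align):
    """Encode a power-of-two alignment as log2(align) + 1; 0 means unspecified."""
    if align == 0:
        return 0
    if align & (align - 1):
        raise ValueError("alignment must be a power of two")
    if align > MAX_ALIGNMENT:
        raise ValueError("alignment exceeds the maximum supported value")
    return align.bit_length()         # equals log2(align) + 1 for powers of two


def decode_alignment(field):
    """Inverse of encode_alignment."""
    return 0 if field == 0 else 1 << (field - 1)


def encode_unchecked(align):
    """What happens without the check: the value is silently truncated."""
    return align.bit_length() & FIELD_MASK
```

Without the check, an alignment of 2^32 would encode as 33 & 31 = 1 and decode back as 1, which is the kind of silent truncation the error checking is meant to rule out.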
Nate Begeman
133820e806 Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Stuart Hastings
d2b44f0739 Testcase for r109556. Radar 8198362.
llvm-svn: 109557
2010-07-27 23:15:25 +00:00
Nate Begeman
068e932975 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
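
The short sequence in r109549 appears to build 2^a per lane through the float exponent field and then multiply. The scalar sketch below models one 32-bit lane under two assumptions: that the constant `LCPI0_0` is the bit pattern of 1.0f (0x3F800000) in every lane, and that shift amounts are in the range 0..31. The function names are hypothetical.

```python
import struct

MASK32 = 0xFFFFFFFF
ONE_F32_BITS = 0x3F800000  # assumed value of LCPI0_0: bit pattern of 1.0f


def pow2_via_float(a):
    """Model pslld $23 / paddd / cvttps2dq for one lane, with 0 <= a <= 31."""
    bits = ((a << 23) + ONE_F32_BITS) & MASK32     # exponent field becomes 127 + a
    value = struct.unpack("<f", struct.pack("<I", bits))[0]  # the float 2.0**a
    # For a == 31 the hardware conversion yields 0x80000000, which is the
    # same 32-bit pattern as 2**31, so masking models that case too.
    return int(value) & MASK32


def shl_lane(r, a):
    """Model pmulld: the low 32 bits of r * 2**a equal r << a."""
    return (r * pow2_via_float(a)) & MASK32
```

Multiplying by a power of two built in the exponent field replaces four scalar `shll` round trips through general-purpose registers with four vector instructions.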
Devang Patel
186cbcd229 Update tests to not rely on input file's absolute path.
llvm-svn: 109521
2010-07-27 18:13:53 +00:00
Nate Begeman
15fe179ecb Fix a crash in the dag combiner caused by ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself
recursively and returning a SCALAR_TO_VECTOR node, but assuming the input was always a BUILD_VECTOR.

llvm-svn: 109519
2010-07-27 18:02:18 +00:00
Tobias Grosser
cac9d6f302 Make coff-dump.py executable and add python as executable for this script.
This fixes the MC/COFF/basic-coff.ll test case.

llvm-svn: 109497
2010-07-27 09:01:26 +00:00
Michael J. Spencer
33ac353ce4 Make MC use Windows COFF on Windows and add tests.
llvm-svn: 109494
2010-07-27 06:46:15 +00:00
Anton Korobeynikov
5e3d50ec58 Currently EH lowering code expects typeinfo to be global only.
This assumption is not satisfied due to global merging.
Work around the issue by temporarily disabling merging of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716.

llvm-svn: 109423
2010-07-26 18:45:39 +00:00
Owen Anderson
647ac93b7d Fix a test with malformed IR. Not sure why this didn't fail before.
llvm-svn: 109422
2010-07-26 18:44:56 +00:00
Dan Gohman
9e0ae022d2 Fix SCEVExpander::visitAddRecExpr so that it remembers the induction variable
it inserted rather than using LoopInfo::getCanonicalInductionVariable to
rediscover it, since that doesn't work on non-canonical loops. This fixes
infinite recursion on such loops; PR7562.

llvm-svn: 109419
2010-07-26 18:28:14 +00:00
Dan Gohman
48bddf693c Avoid depending on LCSSA implicitly pulling in LoopSimplify.
llvm-svn: 109410
2010-07-26 18:00:43 +00:00
Bruno Cardoso Lopes
632295a03c Support x86 "eiz" and "riz" pseudo index registers in the assembler.
llvm-svn: 109295
2010-07-24 00:06:39 +00:00
Matt Fleming
6c3c8fe3ae Consolidate the ELF section directive tests into a single file as
suggested by Chris Lattner.

llvm-svn: 109290
2010-07-23 23:40:41 +00:00
Evan Cheng
f215e55d5f - Allow the target to specify when register pressure is "too high". In most cases,
  it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use, so the threshold should be a bit tighter.
- Correctly handle live-outs, extract_subreg, etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Bruno Cardoso Lopes
8907fcd2b5 Move AVX encoding tests to different files
llvm-svn: 109269
2010-07-23 21:25:26 +00:00
Dan Gohman
1694c4352a Use the proper type for shift counts. This fixes a bootstrap error.
llvm-svn: 109265
2010-07-23 21:08:12 +00:00
Stuart Hastings
3987a0d4ea Test case to ensure a template function declaration refers to the correct filename. Radar 8063111.
llvm-svn: 109258
2010-07-23 20:15:49 +00:00
Bruno Cardoso Lopes
a80b57e5fc Add AVX version of CLMUL instructions
llvm-svn: 109248
2010-07-23 18:41:12 +00:00
Dan Gohman
8859ab786b DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.

llvm-svn: 109242
2010-07-23 18:03:30 +00:00
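
The fold in r109242 relies on the fact that the low bits of a left shift depend only on the low bits of the input. The sketch below checks this for an 8-bit value extended to 32 bits, assuming only the low 8 bits of the result are demanded. The `junk` parameter models the undefined high bits an anyext may produce; all names are hypothetical.

```python
MASK8 = 0xFF
MASK32 = 0xFFFFFFFF


def any_extend(x, junk):
    """Widen an 8-bit value to 32 bits; the high 24 bits are arbitrary (junk)."""
    return ((junk << 8) | (x & MASK8)) & MASK32


def extend_then_shift(x, c, junk):
    """The original form: shift performed in the wide type."""
    return (any_extend(x, junk) << c) & MASK32


def shift_then_extend(x, c, junk):
    """The combined form: shift performed in the narrow type, then widened."""
    return any_extend((x << c) & MASK8, junk)
```

Both forms agree on the low 8 bits for every shift amount below 8, whatever the junk bits are, which is why the combine is valid when the high bits are not demanded.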
Bruno Cardoso Lopes
b9182e3051 Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual
llvm-svn: 109204
2010-07-23 00:54:35 +00:00
Bruno Cardoso Lopes
7722724eee Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. We are still missing instructions from the FMA3 and CLMUL specific feature flags, which are the next step.
llvm-svn: 109168
2010-07-22 21:18:49 +00:00
Tobias Grosser
604a50cd71 Add new RegionInfo pass.
The RegionInfo pass detects single entry single exit regions in a function,
where a region is defined as any subgraph that is connected to the remaining
graph at only two spots.
Furthermore, a hierarchical region tree is built.
Use it by calling "opt -regions analyze" or "opt -view-regions".

llvm-svn: 109089
2010-07-22 07:46:31 +00:00
Eric Christopher
4924d5fb93 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Evan Cheng
5aa6a25102 More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Bruno Cardoso Lopes
5920e38cd2 Add more 256-bit forms for a bunch of regular AVX instructions
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)

llvm-svn: 109063
2010-07-21 23:53:50 +00:00
Eric Christopher
3d118d5e8a Baby steps towards ARM fast-isel.
llvm-svn: 109047
2010-07-21 22:26:11 +00:00
Bruno Cardoso Lopes
eea3b7ed83 Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it
llvm-svn: 109039
2010-07-21 21:37:59 +00:00
Dan Gohman
fc3ee085a0 Disallow null as a named metadata operand.
Make MDNode::destroy private.
Fix the one thing that used MDNode::destroy, outside of MDNode itself.

One should never delete or destroy an MDNode explicitly. MDNodes
implicitly go away when there are no references to them (implementation
details aside).

llvm-svn: 109028
2010-07-21 18:54:18 +00:00
Rafael Espindola
9aab8413b8 Fix calling convention on ARM if vfp2+ is enabled.
llvm-svn: 109009
2010-07-21 11:38:30 +00:00
Bruno Cardoso Lopes
d13d8c2562 Add AVX only vzeroall and vzeroupper instructions
llvm-svn: 109002
2010-07-21 08:56:24 +00:00
Eric Christopher
b9bda35e06 Turn this test on again after the llvm-gcc change in r108986.
llvm-svn: 108987
2010-07-21 04:54:06 +00:00
Eric Christopher
0388466352 Update this to use a "valid" alignment.
llvm-svn: 108985
2010-07-21 04:51:24 +00:00
Bruno Cardoso Lopes
c4d93a5a34 Add new AVX vpermilps, vpermilpd and vperm2f128 instructions
llvm-svn: 108984
2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes
a7efb29695 Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it
llvm-svn: 108983
2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes
e0dce1c741 Add new AVX vextractf128 instructions
llvm-svn: 108964
2010-07-20 23:19:02 +00:00
Matt Fleming
1d96df2bf5 Include some tests for the recently committed ELF section directive
handlers.

llvm-svn: 108938
2010-07-20 21:37:30 +00:00
Eric Christopher
59463ea0c3 Testcase for llvm-gcc commit r108910.
llvm-svn: 108918
2010-07-20 20:32:47 +00:00
Bruno Cardoso Lopes
b677cbc9b2 Add new AVX instruction vinsertf128
llvm-svn: 108892
2010-07-20 19:44:51 +00:00
Dan Gohman
45ba7b5c5c Fix SCEV denormalization of expressions where the exit value from
one loop is involved in the increment of an addrec for another
loop. This fixes rdar://8168938.

llvm-svn: 108863
2010-07-20 17:06:20 +00:00
Jim Grosbach
28e5f92387 update tests for smarter BIC usage
llvm-svn: 108846
2010-07-20 16:16:48 +00:00