1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

1548 Commits

Author SHA1 Message Date
Torok Edwin
110a24e43e add 2 more testcases for -mattr=-sse (r63495).
--This line, and those below, will be ignaored--

A    test/CodeGen/X86/nosse-error1.ll
A    test/CodeGen/X86/nosse-error2.ll

llvm-svn: 63496
2009-02-01 18:24:20 +00:00
Torok Edwin
b4c9a6097f Implement -mno-sse: if SSE is disabled on x86-64, don't store XMM on stack for
var-args, and don't allow FP return values

llvm-svn: 63495
2009-02-01 18:15:56 +00:00
Duncan Sands
cac6cf74f9 Fix PR3453 and probably a bunch of other potential
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.

llvm-svn: 63494
2009-02-01 18:06:53 +00:00
Duncan Sands
74179a9dde Fix PR3401: when using large integers, the type
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization.  Thanks to Dan for writing the
original patch (which I shamelessly pillaged).

llvm-svn: 63482
2009-01-31 15:50:11 +00:00
Mon P Wang
1cc0fd54b7 Used "-enable-unsafe-fp-math" to allow this transformation - (a * b -c) = c - a *b.
llvm-svn: 63475
2009-01-31 06:50:54 +00:00
Mon P Wang
2a0cce6b1c If unsafe FP optimization is not set, don't allow -(A-B) => B-A because
when A==B, -0.0 != +0.0.

llvm-svn: 63474
2009-01-31 06:07:45 +00:00
Owen Anderson
fe46e52119 XFAIL this test. It only worked before because of a bug in the spill point selection code. Not deleting because
it should be possible to enhance the selection code to handle this in the future.

llvm-svn: 63340
2009-01-29 22:27:56 +00:00
Evan Cheng
1ffb8d20e8 Local register allocator shouldn't assume only the entry and landing pad basic blocks have live-ins.
llvm-svn: 63323
2009-01-29 18:37:30 +00:00
Dan Gohman
14959cba72 In the case of an extractelement on an insertelement value,
the element indices may be equal if either one is not a
constant.

llvm-svn: 63311
2009-01-29 16:10:46 +00:00
Evan Cheng
1346d22223 Exit with nice warnings when register allocator run out of registers.
llvm-svn: 63267
2009-01-29 02:20:59 +00:00
Dan Gohman
9d120d6d8f Make x86's BT instruction matching more thorough, and add some
dagcombines that help it match in several more cases. Add
several more cases to test/CodeGen/X86/bt.ll. This doesn't
yet include matching for BT with an immediate operand, it
just covers more register+register cases.

llvm-svn: 63266
2009-01-29 01:59:02 +00:00
Mon P Wang
8abb07a527 Fixed lowering of v816 shuffles.
llvm-svn: 63252
2009-01-28 23:11:14 +00:00
Bill Wendling
ab2b1d7629 Make test platform agnostic.
llvm-svn: 63247
2009-01-28 22:20:56 +00:00
Evan Cheng
2a965124b7 The memory alignment requirement on some of the mov{h|l}p{d|s} patterns are 16-byte. That is overly strict. These instructions read / write f64 memory locations without alignment requirement.
llvm-svn: 63195
2009-01-28 08:35:02 +00:00
Mon P Wang
f089df40b5 Added sse test patterns for r62979 and r63193.
llvm-svn: 63194
2009-01-28 08:13:56 +00:00
Bill Wendling
6c7c632a21 Add testcase for r63142.
llvm-svn: 63149
2009-01-27 23:00:53 +00:00
Evan Cheng
a05436f739 Implement multiple with overflow by 2 with an add instruction.
llvm-svn: 63090
2009-01-27 03:30:42 +00:00
Dan Gohman
2c06ee586b Add a regression test for x86-64 red zone usage.
llvm-svn: 63075
2009-01-27 00:40:27 +00:00
Duncan Sands
276b736496 Fix PR3393, which amounts to a bug in the expensive
checking logic.  Rather than make the checking more
complicated, I've tweaked some logic to make things
conform to how the checking thought things ought to
be, since this results in a simpler "mental model".

llvm-svn: 63048
2009-01-26 21:54:18 +00:00
Dan Gohman
4613b5e807 At Nick Lewycky's request, rename this test with a more informative name.
llvm-svn: 63042
2009-01-26 21:36:31 +00:00
Evan Cheng
ec03e0cd3b Enhance logic in X86DAGToDAGISel::PreprocessForRMW which move load inside callseq_start to allow it to be folded into a call. It was not considering the cases where a token factor is between the load and the callseq_start.
llvm-svn: 63022
2009-01-26 18:43:34 +00:00
Scott Michel
da9360e77e CellSPU:
- Rename fcmp.ll test to fcmp32.ll, start adding new double tests to fcmp64.ll
- Fix select_bits.ll test
- Capitulate to the DAGCombiner and move i64 constant loads to instruction
  selection (SPUISelDAGtoDAG.cpp).

  <rant>DAGCombiner will insert all kinds of 64-bit optimizations after
  operation legalization occurs and now we have to do most of the work that
  instruction selection should be doing twice (once to determine if v2i64
  build_vector can be handled by SelectCode(), which then runs all of the
  predicates a second time to select the necessary instructions.) But,
  CellSPU is a good citizen.</rant>

llvm-svn: 62990
2009-01-26 03:31:40 +00:00
Nate Begeman
92efc4f0ce Map address space 256 to gs; similar mappings could be supported for the
other x86 segments.  address space 0 is stack/default, 1-255 are reserved for
client use.

llvm-svn: 62980
2009-01-26 01:24:32 +00:00
Torok Edwin
3f54410405 revert this patch for now, because Codegen does still want to generate SSE code,
for example in the case of va-args. XFAIL associated tests.

llvm-svn: 62972
2009-01-25 20:21:24 +00:00
Torok Edwin
49b1d3e3cc If user explicitly asks not to use SSE, don't force it. This fixes LLVM part of PR3402.
llvm-svn: 62967
2009-01-25 17:58:56 +00:00
Evan Cheng
71ca3e2bdb Private linkage support for PPC / Darwin.
llvm-svn: 62955
2009-01-25 06:32:01 +00:00
Evan Cheng
4ebe9b79fa Teach 2addr pass to be do more commuting. If both uses of a two-address instruction are killed, but the first operand has a use before and after the def, commute if the second operand does not suffer from the same issue.
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1                                                                                                                                     
%reg1029<def> = MOV8rr %reg1028                                                                                                                                                      
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>                                                                                                                            
insert => %reg1030<def> = MOV8rr %reg1028                                                                                                                                            
%reg1030<def> = ADD8rr %reg1028<kill>, %reg1029<kill>, %EFLAGS<imp-def,dead>                                                                                                         

In this case, it might not be possible to coalesce the second MOV8rr                                                                                                                 
instruction if the first one is coalesced. So it would be profitable to                                                                                                              
commute it:                                                                                                                                                                          
%reg1028<def> = EXTRACT_SUBREG %reg1027<kill>, 1                                                                                                                                     
%reg1029<def> = MOV8rr %reg1028                                                                                                                                                      
%reg1029<def> = SHR8ri %reg1029, 7, %EFLAGS<imp-def,dead>                                                                                                                            
insert => %reg1030<def> = MOV8rr %reg1029                                                                                                                                            
%reg1030<def> = ADD8rr %reg1029<kill>, %reg1028<kill>, %EFLAGS<imp-def,dead>

llvm-svn: 62954
2009-01-25 03:53:59 +00:00
Dan Gohman
14a8caaef9 Add a PR comment to this test.
llvm-svn: 62921
2009-01-24 17:32:54 +00:00
Evan Cheng
4a5296ec5f Update test to reflect command line option name change.
llvm-svn: 62836
2009-01-23 05:45:31 +00:00
Dan Gohman
a6e5948fce Don't create ISD::FNEG nodes after legalize if they aren't legal.
Simplify x+0 to x in unsafe-fp-math mode. This avoids a bunch of
redundant work in many cases, because in unsafe-fp-math mode,
ISD::FADD with a constant is considered free to negate, so the
DAGCombiner often negates x+0 to -0-x thinking it's free, when
in reality the end result is -x, which is more expensive than x.

Also, combine x*0 to 0.

This fixes PR3374.

llvm-svn: 62789
2009-01-22 21:58:43 +00:00
Devang Patel
386023e3f0 Do not use buggy llvm-gcc to generate testcases.
llvm-svn: 62770
2009-01-22 18:28:11 +00:00
Bill Wendling
58a51ca0f9 Now with RUN line.
llvm-svn: 62716
2009-01-21 21:28:03 +00:00
Bill Wendling
1eb2ec148b Run this through -simplifycfg and -mem2reg to test only what we need to test.
llvm-svn: 62714
2009-01-21 21:02:27 +00:00
Dan Gohman
d021a20409 Simplify ReduceLoadWidth's logic: it doesn't need several different
special cases after producing the new reduced-width load, because the
new load already has the needed adjustments built into it. This fixes
several bugs due to the special cases, including PR3317.

llvm-svn: 62692
2009-01-21 15:17:51 +00:00
Dan Gohman
704f0d5879 Fix a recent regression. ClrOpcode is not set for i8; for i8, if
we want to clear %ah to zero before a division, just use a
zero-extending mov to %al. This fixes PR3366.

llvm-svn: 62691
2009-01-21 14:50:16 +00:00
Duncan Sands
07b1beeba8 Let's try to have our cake and eat it to: move
this test into FrontendC to ensure that llvm-gcc
is available; assemble using "llvm-gcc -xassembler"
rather than "as".

llvm-svn: 62683
2009-01-21 11:37:31 +00:00
Duncan Sands
e60dd41a39 Don't rely on grep -w working.
llvm-svn: 62682
2009-01-21 09:41:42 +00:00
Scott Michel
c80e71ac35 CellSPU:
- Ensure that (operation) legalization emits proper FDIV libcall when needed.
- Fix various bugs encountered during llvm-spu-gcc build, along with various
  cleanups.
- Start supporting double precision comparisons for remaining libgcc2 build.
  Discovered interesting DAGCombiner feature, which is currently solved via
  custom lowering (64-bit constants are not legal on CellSPU, but DAGCombiner
  insists on inserting one anyway.)
- Update README.

llvm-svn: 62664
2009-01-21 04:58:48 +00:00
Evan Cheng
0ed6a9d7e0 Favors generating "not" over "xor -1". For example.
unsigned test(unsigned a) {
  return ~a;
}
llvm used to generate:
movl    $4294967295, %eax
xorl    4(%esp), %eax

Now it generates:
movl      4(%esp), %eax
notl      %eax

It's 3 bytes shorter.

llvm-svn: 62661
2009-01-21 02:09:05 +00:00
Owen Anderson
10ad717dc8 Be more aggressive about renumbering vregs after splitting them.
llvm-svn: 62639
2009-01-21 00:13:28 +00:00
Chris Lattner
4f4bbecafb Don't bother running the assembler, we don't know that it will be configured
for whatever llc defaults to.  This fixes PR3363

llvm-svn: 62619
2009-01-20 21:41:53 +00:00
Evan Cheng
5bea79c062 Fix PR3243: a LiveVariables bug. When HandlePhysRegKill is checking whether the last reference is also the last def (i.e. dead def), it should also check if last reference is the current machine instruction being processed. This can happen when it is processing a physical register use and setting the current machine instruction as sub-register's last ref.
llvm-svn: 62617
2009-01-20 21:25:12 +00:00
Evan Cheng
0151af3a61 Add test case for PR3154.
llvm-svn: 62604
2009-01-20 19:29:54 +00:00
Bill Wendling
68171bde8e Testcase for limited precision stuff.
llvm-svn: 62572
2009-01-20 06:23:59 +00:00
Dan Gohman
ff4c4ab39f Fix a dagcombine to not generate loads of non-round integer types,
as its comment says, even in the case where it will be generating
extending loads. This fixes PR3216.

llvm-svn: 62557
2009-01-20 01:06:45 +00:00
Evan Cheng
5ee5ba12be Make linear scan's trivial coalescer slightly more aggressive.
llvm-svn: 62547
2009-01-20 00:16:18 +00:00
Dale Johannesen
5508ead868 Move & restructure test per review.
llvm-svn: 62538
2009-01-19 22:33:12 +00:00
Dan Gohman
af4e583c93 Fix SelectionDAG::ReplaceAllUsesWith to behave correctly when
uses are added to the From node while it is processing From's
use list, because of automatic local CSE. The fix is to avoid
visiting any new uses.

Fix a few places in the DAGCombiner that assumed that after
a RAUW call, the From node has no users and may be deleted.

This fixes PR3018.

llvm-svn: 62533
2009-01-19 21:44:21 +00:00
Dale Johannesen
31f3cac06b compile-time fmod was done incorrectly. PR 3316.
llvm-svn: 62528
2009-01-19 21:17:05 +00:00
Devang Patel
50ac518b6c Verify Intrinsic::dbg_declare.
llvm-svn: 62526
2009-01-19 21:00:48 +00:00
Evan Cheng
06cfade044 DIVREM isel deficiency: If sign bit is known zero, zero out DX/EDX/RDX instead of sign extending the low part (in AX/EAX/RAX) into it.
llvm-svn: 62519
2009-01-19 19:06:11 +00:00
Evan Cheng
53e83a2eb9 Now not UINT_TO_FP is legal (it's marked custom), dag combiner won't
optimize it to a SINT_TO_FP when the sign bit is known zero. X86 isel should perform the optimization itself.

llvm-svn: 62504
2009-01-19 08:08:22 +00:00
Chris Lattner
6f03811071 Fix rdar://6505632, an llc crash on 483.xalancbmk
llvm-svn: 62470
2009-01-18 20:35:00 +00:00
Bill Wendling
bca1d8ae5a Testcase for last commit.
llvm-svn: 62418
2009-01-17 07:42:44 +00:00
Evan Cheng
182d9c4c9f Fix MatchAddress bug that's preventing negative displacement from being folded in 64-bit mode.
llvm-svn: 62413
2009-01-17 07:09:27 +00:00
Mon P Wang
563134282c Simplify extract element of a scalar to vector.
llvm-svn: 62383
2009-01-17 00:07:25 +00:00
Evan Cheng
d7cc550900 Fix PPC ISD::Declare isel and eliminate the need for PPCTargetLowering::LowerGlobalAddress to check if isVerifiedDebugInfoDesc() is true. Given the recent changes, it would falsely return true for a lot of GlobalAddressSDNode's.
llvm-svn: 62373
2009-01-16 22:57:32 +00:00
Dan Gohman
cd46de9bdc Disable the post-RA scheduler on this test, since it uses a
simple %prcontext which doesn't find what it's looking for
if the scheduler has rearranged the instructions.

llvm-svn: 62363
2009-01-16 21:40:12 +00:00
Evan Cheng
c4d19d6e8c CreateVirtualRegisters does trivial copy coalescing. If a node def is used by a single CopyToReg, it reuses the virtual register assigned to the CopyToReg. This won't work for SDNode that is a clone or is itself cloned. Disable this optimization for those nodes or it can end up with non-SSA machine instructions.
llvm-svn: 62356
2009-01-16 20:57:18 +00:00
Bill Wendling
c9e856fbfd Add support for non-zero __builtin_return_address values on X86.
llvm-svn: 62338
2009-01-16 19:25:27 +00:00
Mon P Wang
e235ba591f Added missing support to widen an operand from a bit convert.
llvm-svn: 62285
2009-01-15 22:43:38 +00:00
Rafael Espindola
46b374f55b Fix Alpha test and support for private linkage.
llvm-svn: 62282
2009-01-15 21:51:46 +00:00
Mon P Wang
4cfe965df2 Expand insert/extract of a <4 x i32> with a variable index.
llvm-svn: 62281
2009-01-15 21:10:20 +00:00
Rafael Espindola
0aba6c9435 Add the private linkage.
llvm-svn: 62279
2009-01-15 20:18:42 +00:00
Richard Osborne
ce265d8cf9 Don't fold address calculations which use negative offsets into
the ADDRspii addressing mode.

llvm-svn: 62258
2009-01-15 11:32:30 +00:00
Scott Michel
b4699590f0 - Convert remaining i64 custom lowering into custom instruction emission
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
  DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
  that stretches tblgen and the imagination, as well as violating laws of
  several small countries and most southern US states (just kidding, but
  looking at a function with 80+ parameters is really weird and just plain
  wrong.)
- Update tests as needed.

llvm-svn: 62254
2009-01-15 04:41:47 +00:00
Richard Osborne
12b88f2fae Add pseudo instructions to the XCore for (load|store|load address) of a
frame index. eliminateFrameIndex will replace these instructions with
(LDWSP|STWSP|LDAWSP) or (LDW|STW|LDAWF) if a frame pointer is in use.

This fixes PR 3324. Previously we used LDWSP, STWSP, LDAWSP before frame
pointer elimination. However since they were marked as implicitly using
SP they could not be rematerialised.

llvm-svn: 62238
2009-01-14 18:26:46 +00:00
Dan Gohman
8c835f6285 Disable the register+memory forms of the bt instructions for now. Thanks
to Eli for pointing out that these forms don't ignore the high bits of
their index operands, and as such are not immediately suitable for use
by isel.

llvm-svn: 62194
2009-01-13 23:23:30 +00:00
Dan Gohman
9e9858781c The list-td and list-tdrr schedulers don't yet support physreg
scheduling dependencies. Add assertion checks to help catch
this.

It appears the Mips target defaults to list-td, and it has a
regression test that uses a physreg dependence. Such code was
liable to be miscompiled, and now evokes an assertion failure.

llvm-svn: 62177
2009-01-13 20:24:13 +00:00
Duncan Sands
975f2428ba When replacing uses and the same node is reached
via two paths, process it once not twice, d'oh!
Analysis, testcase and original patch thanks to
Mon Ping Wang.

llvm-svn: 62169
2009-01-13 15:17:14 +00:00
Evan Cheng
a706a020bc FIX llvm-gcc bootstrap on x86_64 linux. If a virtual register is copied to a physical register, it's not necessarily defined by a copy. We have to watch out it doesn't clobber any sub-register that might be live during its live interval. If the live interval crosses a basic block, then it's not safe to check with the less conservative check (by scanning uses and defs) because it's possible a sub-register might be live out of the block.
llvm-svn: 62144
2009-01-13 03:57:45 +00:00
Devang Patel
6d7fd4b913 Use DebugInfo interface to lower dbg_* intrinsics.
llvm-svn: 62126
2009-01-13 00:32:17 +00:00
Evan Cheng
5e17ea36e1 Fix PR3241: Currently EmitCopyFromReg emits a copy from the physical register to a virtual register unless it requires an expensive cross class copy. That means we are only treating "expensive to copy" register dependency as physical register dependency.
Also future proof the scheduler to handle "normal" physical register dependencies. The code is not exercised yet.

llvm-svn: 62074
2009-01-12 03:19:55 +00:00
Evan Cheng
47dfb8c719 This is a dup of pr2659.ll.
llvm-svn: 62029
2009-01-10 19:06:32 +00:00
Evan Cheng
411c48b7d2 Duplicated node may produce a non-physical register def.
llvm-svn: 62015
2009-01-09 22:44:02 +00:00
Evan Cheng
84945aba0b Add test case from PR2659.
llvm-svn: 62006
2009-01-09 21:01:31 +00:00
Dan Gohman
a707c0dafe PR2659 was fixed by r61847. Add the testcase as a regression test.
llvm-svn: 61986
2009-01-09 08:16:12 +00:00
Chris Lattner
4166afffa7 this test should not run opt -std-compile-opts, it should run
just llc.

llvm-svn: 61979
2009-01-09 05:32:00 +00:00
Misha Brukman
6338af14f6 Fix off-by-one error in traversing an array; this fixes a test.
The error was reported by gcc-4.3.0 during compilation.

llvm-svn: 61896
2009-01-07 23:07:29 +00:00
Evan Cheng
a70ecc2f51 The coalescer does not coalesce a virtual register to a physical register if any of the physical register's sub-register live intervals overlaps with the virtual register. This is overly conservative. It prevents a extract_subreg from being coalesced away:
v1024 = EDI  // not killed
      =
      = EDI

One possible solution is for the coalescer to examine the sub-register live intervals in the same manner as the physical register. Another possibility is to examine defs and uses (when needed) of sub-registers. Both solutions are too expensive. For now, look for "short virtual intervals" and scan instructions to look for conflict instead.

This is a small win on x86-64. e.g. It shaves 403.gcc by ~80 instructions.

llvm-svn: 61847
2009-01-07 02:08:57 +00:00
Chris Lattner
f6de7aa2c9 add a testcase.
llvm-svn: 61845
2009-01-07 01:48:08 +00:00
Dan Gohman
ca4475dd7b Add patterns to match conditional moves with loads folded
into their left operand, rather than their right. Do this
by commuting the operands and inverting the condition.

llvm-svn: 61842
2009-01-07 01:00:24 +00:00
Dan Gohman
2682e8745c X86_COND_C and X86_COND_NC are alternate mnemonics for
X86_COND_B and X86_COND_AE, respectively.

llvm-svn: 61835
2009-01-07 00:15:08 +00:00
Dan Gohman
4edc9d725b Now that fold-pcmpeqd-0.ll is effectively testing that scheduling helps
avoid the need for spilling, add a new testcase that tests that the
pcmpeqd used for V_SETALLONES is changed to a constant-pool load as
needed.

llvm-svn: 61831
2009-01-06 23:48:10 +00:00
Dan Gohman
e033f7c41e Revert r42653 and forward-port the code that lets INC64_32r be
converted to LEA64_32r in x86's convertToThreeAddress. This
replaces code like this:
   movl  %esi, %edi
   inc   %edi
with this:
   lea   1(%rsi), %edi
which appears to be beneficial.

llvm-svn: 61830
2009-01-06 23:34:46 +00:00
Dan Gohman
b19f5073f9 Fix a bug in ComputeLinearIndex computation handling multi-level
aggregate types. Don't increment the current index after reaching
the end of a struct, as it will already be pointing at
one-past-the end. This fixes PR3288.

llvm-svn: 61828
2009-01-06 22:53:52 +00:00
Scott Michel
c30557841b CellSPU:
- Fix bugs 3194, 3195: i128 load/stores produce correct code (although, we
  need to ensure that i128 is 16-byte aligned in real life), and 128 zero-
  extends are supported.
- New td file: SPU128InstrInfo.td: this is where all new i128 support should
  be put in the future.
- Continue to hammer on i64 operations and test cases; ensure that the only
  remaining problem will be i64 mul.

llvm-svn: 61784
2009-01-06 03:36:14 +00:00
Dan Gohman
cf1ac86514 Delete this test; it's a duplicate of 2006-07-03-schedulers.ll.
llvm-svn: 61781
2009-01-06 01:36:23 +00:00
Dan Gohman
1cdb677fc8 Use a latency value of 0 for the artificial edges inserted by
AddPseudoTwoAddrDeps. This lets the scheduling infrastructure
avoid recalculating node heights. In very large testcases this
was a major bottleneck. Thanks to Roman Levenstein for finding
this!

As a side effect, fold-pcmpeqd-0.ll is now scheduled better
and it no longer requires spilling on x86-32.

llvm-svn: 61778
2009-01-06 01:19:04 +00:00
Evan Cheng
36e238a4d3 Find loop back edges only after empty blocks are eliminated.
llvm-svn: 61752
2009-01-05 21:17:27 +00:00
Scott Michel
733d5f71a0 CellSPU:
- Teach SPU64InstrInfo.td about the remaining signed comparisons, update tests
  accordingly.

llvm-svn: 61672
2009-01-05 04:05:53 +00:00
Scott Michel
06c324c6c7 CellSPU:
- Add an 8-bit operation test, which doesn't do much at this point.

llvm-svn: 61665
2009-01-05 01:35:22 +00:00
Scott Michel
0d9d939406 CellSPU:
- Fix (brcond (setq ...)) bug, where BRNZ should have been used vice BRZ.
- Kill unused/unnecessary nodes in SPUNodes.td
- Beef out the i64operations.c test harness to use a lot of unaligned
  loads, test loops and LLVM loop/basic block optimizations; run the
  test harness successfully on real Cell hardware.

llvm-svn: 61664
2009-01-05 01:34:35 +00:00
Dan Gohman
2a079de3f5 Fix a DAGCombiner abort on an invalid shift count constant. This fixes PR3250.
llvm-svn: 61613
2009-01-03 19:22:06 +00:00
Scott Michel
0309418000 CellSPU:
- Remove custom lowering for BRCOND
- Add remaining functionality for branches in SPUInstrInfo, such as branch
  condition reversal and load/store folding. Updated BrCond test to reflect
  branch reversal.

llvm-svn: 61597
2009-01-03 00:27:53 +00:00
Evan Cheng
c52f942d67 Do not isel load folding bt instructions for pentium m, core, core2, and AMD processors. These are significantly slower than a load followed by a bt of a register.
llvm-svn: 61557
2009-01-02 05:35:45 +00:00
Evan Cheng
57115c1887 Use movaps / movd to extract vector element 0 even with sse4.1. It's still cheaper than pextrw especially if the value is in memory.
llvm-svn: 61555
2009-01-02 05:29:08 +00:00
Chris Lattner
2d3e57c337 rename a file to follow naming conventions.
llvm-svn: 61550
2009-01-02 01:52:35 +00:00
Duncan Sands
190d6bc636 Fix PR3274: when promoting the condition of a BRCOND node,
promote from i1 all the way up to the canonical SetCC type.
In order to discover an appropriate type to use, pass
MVT::Other to getSetCCResultType.  In order to be able to
do this, change getSetCCResultType to take a type as an
argument, not a value (this is also more logical).

llvm-svn: 61542
2009-01-01 15:52:00 +00:00
Bill Wendling
e288a29970 This is not failing on Darwin for some reason. XFAIL for other platforms.
llvm-svn: 61533
2008-12-31 19:26:09 +00:00
Scott Michel
c163bf5042 XFAIL this for now until I can figure out what's going on.
llvm-svn: 61512
2008-12-31 00:08:25 +00:00
Scott Michel
12a5f7cfb9 Fix test erratum (which is wierd: works locally for me?)
llvm-svn: 61511
2008-12-30 23:52:05 +00:00
Scott Michel
cdcae67887 - Start moving target-dependent nodes that could be represented by an
instruction sequence and cannot ordinarily be simplified by DAGcombine
  into the various target description files or SPUDAGToDAGISel.cpp.

  This makes some 64-bit operations legal.

- Eliminate target-dependent ISD enums.

- Update tests.

llvm-svn: 61508
2008-12-30 23:28:25 +00:00
Scott Michel
bf224860c8 - Remove Tilmann's custom truncate lowering: it completely hosed over
DAGcombine's ability to find reasons to remove truncates when they were not
  needed. Consequently, the CellSPU backend would produce correct, but _really
  slow and horrible_, code.

  Replaced with instruction sequences that do the equivalent truncation in
  SPUInstrInfo.td.

- Re-examine how unaligned loads and stores work. Generated unaligned
  load code has been tested on the CellSPU hardware; see the i32operations.c
  and i64operations.c in CodeGen/CellSPU/useful-harnesses.  (While they may be
  toy test code, it does prove that some real world code does compile
  correctly.)

- Fix truncating stores in bug 3193 (note: unpack_df.ll will still make llc
  fault because i64 ult is not yet implemented.)

- Added i64 eq and neq for setcc and select/setcc; started new instruction
  information file for them in SPU64InstrInfo.td. Additional i64 operations
  should be added to this file and not to SPUInstrInfo.td.

llvm-svn: 61447
2008-12-27 04:51:36 +00:00
Chris Lattner
cd245cc5c3 add PR #
llvm-svn: 61427
2008-12-25 05:40:38 +00:00
Chris Lattner
fde038935b Add a simple pattern for matching 'bt'.
llvm-svn: 61426
2008-12-25 05:34:37 +00:00
Bill Wendling
f348b28d39 Revert the changes in this testcase until Anton can fix them.
llvm-svn: 61414
2008-12-24 05:23:34 +00:00
Dan Gohman
7ff343fe6c Fix a compiler-abort on a testcase where the stack-pointer is added to
a symbolic constant. This is unlikely to be intentional, but it
shouldn't crash the compiler.

llvm-svn: 61408
2008-12-24 00:27:51 +00:00
Dale Johannesen
e1a3d2da49 Add another permutation where we should get rid of a-a.
llvm-svn: 61401
2008-12-23 23:01:27 +00:00
Anton Korobeynikov
4aaf7c6b6a Update test
llvm-svn: 61399
2008-12-23 22:26:37 +00:00
Mon P Wang
993de36832 Added shuffle and splat test cases for r61365.
llvm-svn: 61366
2008-12-23 04:05:08 +00:00
Dale Johannesen
425b44516f One more permutation of subtracting off a base value.
llvm-svn: 61361
2008-12-23 01:59:54 +00:00
Dan Gohman
c9f244842c Fix fast-isel to not emit invalid assembly when presented with a
constant shift count that doesn't fit in the shift instruction's
immediate field. This fixes PR3242.

llvm-svn: 61281
2008-12-20 17:19:40 +00:00
Dan Gohman
8c5bea15ca Use the correct Preds and Succs lists in setHeightDirty()
and setDepthDirty(), respectively. This fixes PR3241.

llvm-svn: 61276
2008-12-20 16:34:57 +00:00
Evan Cheng
da55c4ffb7 Fix PR3149. If an early clobber def is a physical register and it is tied to an input operand, it effectively extends the live range of the physical register. Currently we do not have a good way to represent this.
172     %ECX<def> = MOV32rr %reg1039<kill>
180     INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188     %EAX<def> = MOV32rr %EAX<kill>
196     %ECX<def> = MOV32rr %ECX<kill>
204     %ECX<def> = MOV32rr %ECX<kill>
212     %EAX<def> = MOV32rr %EAX<kill>
220     %EAX<def> = MOV32rr %EAX
228     %reg1039<def> = MOV32rr %ECX<kill>

The early clobber operand ties ECX input to the ECX def.

The live interval of ECX is represented as this:
%reg20,inf = [46,47:1)[174,230:0)  0@174-(230) 1@46-(47)

The right way to represent this is something like
%reg20,inf = [46,47:2)[174,182:1)[181:230:0)  0@174-(182) 1@181-230 @2@46-(47)

Of course that won't work since that means overlapping live ranges defined by two val#.

The workaround for now is to add a bit to val# which says the val# is redefined by a early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.

llvm-svn: 61259
2008-12-19 20:58:01 +00:00
Evan Cheng
17b53ef5b0 - CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges.
- Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions.

llvm-svn: 61248
2008-12-19 18:03:11 +00:00
Rafael Espindola
7593f0004f Fix bug 3202.
The EH_frame and .eh symbols are now private, except for darwin9 and earlier.
The patch also fixes the definition of PrivateGlobalPrefix on pcc linux.

llvm-svn: 61242
2008-12-19 10:55:56 +00:00
Mon P Wang
4e9022582d Fix test to account for generating some vector code for mul v2i64 instead
of incorrectly generating pmuldq

llvm-svn: 61228
2008-12-18 23:42:37 +00:00
Mon P Wang
5b74ab1f1e Added some basic test cases for r61209
llvm-svn: 61210
2008-12-18 20:05:58 +00:00
Eli Friedman
4aae828bf8 Fix for PR3225: disable a broken optimization in
DAGTypeLegalizer::ExpandShiftWithKnownAmountBit.

In terms of restoring the optimization, the best fix here isn't 
obvious... any ideas?

llvm-svn: 61119
2008-12-17 03:35:17 +00:00
Dale Johannesen
e348900657 A new dag combine; several permutations of this
are there under ADD, this one was missing.

llvm-svn: 61107
2008-12-16 22:13:49 +00:00
Evan Cheng
96d87db03b We have decided not to support inline asm where an output operand with a matching input operand with incompatible type (i.e. either one is a floating point and the other is an integer or the sizes of the types differ). SelectionDAGBuild will catch these and exit with an error.
llvm-svn: 61092
2008-12-16 18:21:39 +00:00
Dan Gohman
10eb3ccaeb Enable anti-dependence breaking by default when post-RA scheduling is enabled.
llvm-svn: 61078
2008-12-16 06:21:45 +00:00
Dan Gohman
40a40dd7c1 Fix some register-alias-related bugs in the post-RA scheduler liveness
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.

Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.

Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.

llvm-svn: 61073
2008-12-16 03:25:46 +00:00
Mon P Wang
bb3c2994f0 Added support for splitting and scalarizing vector shifts.
llvm-svn: 61050
2008-12-15 21:44:00 +00:00
Mon P Wang
2f96113348 Added support to LegalizeType for expanding the operands of scalar to vector
and insert vector element.  Modified extract vector element to extend the
result to match the expected promoted type.

llvm-svn: 61029
2008-12-15 06:57:02 +00:00
Bill Wendling
13e4a3d0b0 - Use patterns instead of creating completely new instruction matching patterns,
which are identical to the original patterns.

- Change the multiply with overflow so that we distinguish between signed and
  unsigned multiplication. Currently, unsigned multiplication with overflow
  isn't working!

llvm-svn: 60963
2008-12-12 21:15:41 +00:00
Bill Wendling
292263313b If ADD, SUB, or MUL have an overflow bit that's used, don't do transformation on
them. The DAG combiner expects that nodes that are transformed have one value
result.

llvm-svn: 60857
2008-12-10 22:36:00 +00:00
Duncan Sands
81499a8e1c For amusement, implement SADDO, SSUBO, UADDO, USUBO
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform.  Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.

llvm-svn: 60834
2008-12-10 12:30:42 +00:00
Mon P Wang
308879dcfc Fixed a bug when trying to optimize a extract vector element of a
bit convert that changes the number of elements of a shuffle.

llvm-svn: 60829
2008-12-10 03:59:02 +00:00
Bill Wendling
1c1dacdd42 Implement fast-isel conversion of a branch instruction that's branching on an
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.

llvm-svn: 60807
2008-12-09 23:19:12 +00:00
Bill Wendling
4c8fb3a0cc Add sub/mul overflow intrinsics. This currently doesn't have a
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!

llvm-svn: 60800
2008-12-09 22:08:41 +00:00
Duncan Sands
88a2901801 Fix PR3117: not all nodes being legalized. The
essential problem was that the DAG can contain
random unused nodes which were never analyzed.
When remapping a value of a node being processed,
such a node may become used and need to be analyzed;
however due to operands being transformed during
analysis the node may morph into a different one.
Users of the morphing node need to be updated, and
this wasn't happening.  While there I added a bunch
of documentation and sanity checks, so I (or some
other poor soul) won't have to scratch their head
over this stuff so long trying to remember how it
was all supposed to work next time some obscure
problem pops up!  The extra sanity checking exposed
a few places where invariants weren't being preserved,
so those are fixed too.  Since some of the sanity
checking is expensive, I added a flag to turn it
on.  It is also turned on when building with
ENABLE_EXPENSIVE_CHECKS=1.

llvm-svn: 60797
2008-12-09 21:33:20 +00:00
Scott Michel
5c944059b4 CellSPU:
- Fix call.ll and call_indirect.ll expected results, now that it's using a
  different pre-register allocation scheduler.

llvm-svn: 60741
2008-12-09 06:12:03 +00:00
Mon P Wang
0c011f8ba9 Fix getNode to allow a vector for the shift amount for shifts of vectors.
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.

llvm-svn: 60740
2008-12-09 05:46:39 +00:00
Dan Gohman
14d4094968 Factor out the code for sign-extending/truncating gep indices
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.

llvm-svn: 60696
2008-12-08 07:57:47 +00:00
Evan Cheng
5c92d425a9 Clean up some ARM GV asm printing out; minor fixes to match what gcc does.
llvm-svn: 60621
2008-12-06 02:00:55 +00:00
Dale Johannesen
f4758579eb Fix test to pass on Linux.
llvm-svn: 60614
2008-12-05 22:38:21 +00:00
Dale Johannesen
f5a072c388 Make LoopStrengthReduce smarter about hoisting things out of
loops when they can be subsumed into addressing modes.

Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)

llvm-svn: 60608
2008-12-05 21:47:27 +00:00
Evan Cheng
460beb0063 This test also requires -mattr=+sse41.
llvm-svn: 60601
2008-12-05 19:26:37 +00:00
Evan Cheng
144447bfa0 Effectively undo 60461 in PIC mode which simply transform V_SET0 / V_SETALLONES into a load from constpool in order to fold into restores. This is not safe to do when PIC base is being used for a number of reasons:
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).

llvm-svn: 60595
2008-12-05 17:23:48 +00:00
Evan Cheng
1b795803dd Re-did 60519. It turns out Darwin's handling of hidden visibility symbols are a bit more complicate than I expected. Both declarations and weak definitions still need a stub indirection. However, the stubs are in data section and they contain the addresses of the actual symbols.
llvm-svn: 60571
2008-12-05 01:06:39 +00:00
Scott Michel
550ec4540c CellSPU: Add new directory under tests/CodeGen/CellSPU to retain tests that
aren't part of the test suite but are generally useful nonetheless, and can
be expanded later to test the backend against the actual Cell SPU system.

There's basically no other good place to put this code, so put it here for
the time being.

- vecoperations.c: Vector shuffles for all supported vector types, tests
  for v16i8 add and multiply.

llvm-svn: 60566
2008-12-05 00:01:00 +00:00
Bill Wendling
a0466523bd Temporarily revert r60519. It was causing a bootstrap failure:
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c ../../../llvm-gcc.src/libgomp/barrier.c  -fno-common -DPIC -o .libs/barrier.o
checking for sys/file.h... /var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:non-relocatable subtraction expression, "_gomp_tls_key" minus "L1$pb"
/var/folders/zG/zGE-ZJOGFiGjv0B5cs5oYE+++TM/-Tmp-//cc34Jg5P.s:13:symbol: "_gomp_tls_key" can't be undefined in a subtraction expression
make[4]: *** [barrier.lo] Error 1
make[4]: *** Waiting for unfinished jobs....
/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/xgcc -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.obj/./gcc/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/bin/ -B/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/lib/ -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/include -isystem /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm-gcc.install/i386-apple-darwin9.5.0/sys-include -DHAVE_CONFIG_H -I. -I../../../llvm-gcc.src/libgomp -I. -I../../../llvm-gcc.src/libgomp/config/posix -I../../../llvm-gcc.src/libgomp -Wall -pthread -Werror -O2 -g -O2 -MT alloc.lo -MD -MP -MF .deps/alloc.Tpo -c ../../../llvm-gcc.src/libgomp/alloc.c -o alloc.o >/dev/null 2>&1
yes
checking for sys/param.h... make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-target-libgomp] Error 2
make[1]: *** Waiting for unfinished jobs....

llvm-svn: 60527
2008-12-04 04:07:00 +00:00
Evan Cheng
d4b7459179 Visibility hidden GVs do not require extra load of symbol address from the GOT or non-lazy-ptr.
llvm-svn: 60519
2008-12-04 01:56:50 +00:00
Evan Cheng
05ded29738 Use mmx (punpckldq VR64, (mmx_v_set0)) to clear high 32-bits of a VR64 register.
llvm-svn: 60499
2008-12-03 19:38:05 +00:00
Rafael Espindola
0b01e188e5 Fix some tests. The grep for "il" was matching "file".
llvm-svn: 60485
2008-12-03 17:14:56 +00:00
Richard Osborne
e74ae9dbb7 Add support for ISD::TRAP to the XCore backend
llvm-svn: 60479
2008-12-03 10:59:16 +00:00
Evan Cheng
803ac3b438 Fix test.
llvm-svn: 60476
2008-12-03 08:20:45 +00:00
Dan Gohman
ac6561793c Mark x86's V_SET0 and V_SETALLONES with isSimpleLoad, and teach X86's
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.

Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).

llvm-svn: 60461
2008-12-03 05:21:24 +00:00