1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

16750 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
3eb650542a Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Richard Osborne
f82086baa5 Fix assertion in jump threading (PR13405).
GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.

llvm-svn: 160546
2012-07-20 10:36:17 +00:00
Kostya Serebryany
a57ebfbe10 [asan] make sure that the crash callbacks do not get merged (Chandler's idea: insert an empty InlineAsm). Change the order in which the new BBs are inserted: the slow path BB is insert between old BBs, the crash BB is inserted at the end. Don't create an empty BB (introduced by recent commits). Update the test. The experimental code that does manual crash callback merge will most likely be deleted later.
llvm-svn: 160544
2012-07-20 09:54:50 +00:00
Nick Lewycky
62064c6cc3 Revert r160529 due to crashes.
llvm-svn: 160532
2012-07-19 23:59:21 +00:00
Nick Lewycky
8a31eaccbd Don't wipe out global variables that are probably storing pointers to heap
memory. This makes clang play nice with leak checkers.

llvm-svn: 160529
2012-07-19 22:35:28 +00:00
Preston Gurd
68df79bf06 Fix remaining lit tests which were failing when run on an Atom
processor.

Patches by Tyler Nowicki, Andy Zhang, and Preston Gurd!

llvm-svn: 160520
2012-07-19 18:53:21 +00:00
NAKAMURA Takumi
f7dd74d4cc test/DebugInfo/dwarfdump-test.test: Tweak expressions for Win32 to match backslashes. They are still odd, though.
For example, Paths are printed on Win32 as below;

/tmp/dbginfo\def2.cc:4:0
/tmp/dbginfo\include\decl2.h:1:0
/tmp/include\decl.h:5:0

llvm-svn: 160505
2012-07-19 13:40:09 +00:00
Jush Lu
54c7329b88 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 160500
2012-07-19 09:49:00 +00:00
Alexey Samsonov
8832a9374f DebugInfo library: add support for fetching absolute paths to source files
(instead of basenames) from DWARF. Use this behavior in llvm-dwarfdump tool.

Reviewed by Benjamin Kramer.

llvm-svn: 160496
2012-07-19 07:03:58 +00:00
Manman Ren
dd8d9c10a3 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Preston Gurd
d2b344c685 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Chandler Carruth
5d1c4f0605 Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Andrew Trick
db674bed44 Added unit test for PR13361: LSR + SCEV "hangs" on reasonably sized test.
llvm-svn: 160439
2012-07-18 18:07:52 +00:00
Victor Oliveira
f8290a3f24 test commit
llvm-svn: 160438
2012-07-18 17:53:05 +00:00
Jack Carter
7f725ae6fe Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.

llvm-svn: 160429
2012-07-18 06:41:36 +00:00
Andrew Trick
612785f908 indvars: Linear function test replace should avoid reusing undef.
Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.

I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results benchmark regressions, which I'll track
internally. On x86_64 no LTO I see:

-3% huffbench
-3% 400.perlbench
-8% fhourstones

My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce a some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.

llvm-svn: 160421
2012-07-18 04:35:10 +00:00
Craig Topper
b144f3b6db Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Joel Jones
4ce75efda5 More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160410
2012-07-18 00:02:16 +00:00
Evan Cheng
a93c834b1a Add test case for r160387
llvm-svn: 160389
2012-07-17 19:40:05 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
NAKAMURA Takumi
dff0927cea llvm/test/Transforms/LoopRotate/PhiRename-1.ll: FileCheck-ize. It fixes PR13301.
It began choking since Chandler's r159547, possibly due to improper expression on grep from TclParser to ShParser.

llvm-svn: 160367
2012-07-17 15:43:17 +00:00
Alexey Samsonov
5204605486 Improve behavior of DebugInfoEntryMinimal::getSubprogramName() introduced in r159512.
To fetch a subprogram name we should not only inspect the DIE for this subprogram, but optionally inspect
its specification, or its abstract origin (even if there is no inlining), or even specification of an abstract origin.

Reviewed by Benjamin Kramer.

llvm-svn: 160365
2012-07-17 15:28:35 +00:00
Nadav Rotem
9df24d20a6 Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng
f84dd0cf40 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng
0b6bcb6e06 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Akira Hatanaka
3006cd86b5 Fix function select_cc_f32 in test/CodeGen/Mips/selectcc.ll.
llvm-svn: 160329
2012-07-16 23:56:51 +00:00
Nuno Lopes
97c381ea93 fix PR13339 (remove the predecessor from the unwind BB when removing an invoke)
llvm-svn: 160325
2012-07-16 22:49:40 +00:00
Evan Cheng
b409a61574 For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926

llvm-svn: 160312
2012-07-16 19:35:43 +00:00
Nadav Rotem
ae88f0486b Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Tom Stellard
7c25278423 Revert "test/CodeGen/R600: Add some basic tests v6"
This reverts commit 11d3457afcda7848448dd7f11b2ede6552ffb9ea.

llvm-svn: 160300
2012-07-16 18:19:43 +00:00
Kostya Serebryany
c80a9f4bea [asan] refactor instrumentation to allow merging the crash callbacks (not fully implemented yet, no functionality change except the BB order)
llvm-svn: 160284
2012-07-16 16:15:40 +00:00
Jack Carter
f2bf098c4f Doubleword Shift Left Logical Plus 32
Mips shift instructions DSLL, DSRL and DSRA are transformed into
DSLL32, DSRL32 and DSRA32 respectively if the shift amount is between
32 and 63

Here is a description of DSLL:

Purpose: Doubleword Shift Left Logical Plus 32
To execute a left-shift of a doubleword by a fixed amount--32 to 63 bits

Description: GPR[rd] <- GPR[rt] << (sa+32)

The 64-bit doubleword contents of GPR rt are shifted left, inserting
 zeros into the emptied bits; the result is placed in
GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

This patch implements the direct object output of these instructions.

llvm-svn: 160277
2012-07-16 15:14:51 +00:00
Alexey Samsonov
24c8b51694 Fix tests that failed on i686-win32 after r160248:
1. FileCheck-ize epilogue.ll and allow another asm instruction to restore %rsp.
2. Remove check in widen_arith-3.ll that was hitting instruction in epilogue instead of
vector add.

llvm-svn: 160274
2012-07-16 14:33:36 +00:00
Tom Stellard
ba3a0edb7d test/CodeGen/R600: Add some basic tests v6
llvm-svn: 160273
2012-07-16 14:17:19 +00:00
Nadav Rotem
67ff66bd0c Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160260
2012-07-16 10:52:25 +00:00
Chandler Carruth
800f86f31b Revert r160254 temporarily.
It turns out that ASan relied on the at-the-end block insertion order to
(purely by happenstance) disable some LLVM optimizations, which in turn
start firing when the ordering is made more "normal". These
optimizations in turn merge many of the instrumentation reporting calls
which breaks the return address based error reporting in ASan.

We're looking at several different options for fixing this.

llvm-svn: 160256
2012-07-16 10:01:02 +00:00
Chandler Carruth
2f34858fa0 Teach AddressSanitizer to create basic blocks in a more natural order.
This is particularly useful to the backend code generators which try to
process things in the incoming function order.

Also, cleanup some uses of IRBuilder to be a bit simpler and more clear.

llvm-svn: 160254
2012-07-16 08:58:53 +00:00
Chandler Carruth
ff8e933c02 Add a basic test for AddressSanitizer. This is just a bare-bones
functionality test.

In general, unless the functionality is substantially separated, we
should lump more basic testing into this file. The test running
infrastructure likes having a few test files with more comprehensive
testing within them.

llvm-svn: 160253
2012-07-16 08:56:46 +00:00
Alexey Samsonov
c68bb48704 This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment.
It is intended to fix PR11468.

Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp

The problem was to reference the locations of callee-saved registers in exception handling:
locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would
take some effort to implement this in LLVM, as currently MachineLocation can only have the form
"Register + Offset". Funciton prologue and epilogue are now changed to:

push %rbp
mov %rsp, %rbp
push %14
push %15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp

Reviewed by Chad Rosier.

llvm-svn: 160248
2012-07-16 06:54:09 +00:00
Nadav Rotem
338506d265 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem
a09775b875 Teach getTargetVShiftNode about TargetConstant nodes.
llvm-svn: 160234
2012-07-15 20:27:43 +00:00
NAKAMURA Takumi
a759fed621 llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets.
- Make sure existence of "barrier".
  - Confirm reload corresponding to spill.

llvm-svn: 160232
2012-07-15 14:38:35 +00:00
Nadav Rotem
0377e0d234 Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Nadav Rotem
c40d85dda5 AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.

llvm-svn: 160222
2012-07-14 22:26:05 +00:00
Nadav Rotem
a6dfd89cd0 Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Andrew Trick
8030f89de4 LSR Fix: check SCEV expression safety before expansion.
All SCEV expressions used by LSR formulae must be safe to
expand. i.e. they may not contain UDiv unless we can prove nonzero
denominator.

Fixes PR11356: LSR hoists UDiv.

llvm-svn: 160205
2012-07-13 23:33:10 +00:00
Joel Jones
12ea066486 This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160200
2012-07-13 23:25:25 +00:00
Jack Carter
b8e3cf5fbc The Mips specific relocation R_MIPS_GOT_DISP
is used in cases where global symbols are 
directly represented in the GOT and we use an 
offset into the global offset table.

This patch adds direct object support for R_MIPS_GOT_DISP.

llvm-svn: 160183
2012-07-13 19:15:47 +00:00
Jack Carter
d4d064d1da test case for revision 160084: Alignment filling between Mips function units
llvm-svn: 160177
2012-07-13 18:14:01 +00:00
Duncan Sands
fce8aff6ce Restrict this to x86, hopefully fixing ARM buildbots.
llvm-svn: 160163
2012-07-13 07:02:00 +00:00
Eric Christopher
2db5457cb8 The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Akira Hatanaka
d7e6cec8c7 Fix check strings in test/MC/Disassembler/Mips/* and run FileCheck.
Patch by Vladimir Medic.

llvm-svn: 160143
2012-07-12 21:19:32 +00:00
Benjamin Kramer
558c56f216 Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it.
I already had the necessary things in place for IR-level passes but missed the machine passes.

llvm-svn: 160137
2012-07-12 18:14:57 +00:00
Nadav Rotem
5c2cfd2db1 The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc.
Patch by Michael Liao.

llvm-svn: 160129
2012-07-12 13:45:15 +00:00
NAKAMURA Takumi
aff92ba6af llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC.
llvm-svn: 160124
2012-07-12 10:22:57 +00:00
NAKAMURA Takumi
ca8207d81e llvm/test/CMakeLists.txt: Add llvm-diff to deps.
llvm-svn: 160123
2012-07-12 10:15:48 +00:00
Benjamin Kramer
65d12b1345 Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds.
llvm-svn: 160120
2012-07-12 09:36:29 +00:00
Benjamin Kramer
f8e67a04f4 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

llvm-svn: 160117
2012-07-12 09:31:43 +00:00
Duncan Sands
30441cb37c The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Craig Topper
6eef81b65b Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Evan Cheng
e6c5349fcd Instcombine was transforming:
%shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, -1
  %and = and i64 %sub, %shr
  ret i64 %and

to:
  %shr = lshr i64 %key, 3
  %0 = load i64* %val, align 8
  %sub = add i64 %0, 2305843009213693951
  %and = and i64 %sub, %shr
  ret i64 %and

The demanded bit optimization is actually a pessimization because add -1 would
be codegen'ed as a sub 1. Teach the demanded constant shrinking optimization
to check for negated constant to make sure it is actually reducing the width
of the constant.

rdar://11793464

llvm-svn: 160101
2012-07-12 01:45:35 +00:00
Manman Ren
0479bff92d ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.

llvm-svn: 160090
2012-07-11 22:51:44 +00:00
Stepan Dyatkovskiy
abb724e28d Fixed diff comparison.
llvm-svn: 160076
2012-07-11 21:02:57 +00:00
Akira Hatanaka
34599c86b6 Test case for r160036.
llvm-svn: 160067
2012-07-11 19:50:46 +00:00
Manman Ren
93ef864f3c X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.

llvm-svn: 160066
2012-07-11 19:35:12 +00:00
Akira Hatanaka
2e26e543b9 Implement MipsTargetLowering::LowerSELECT_CC to custom lower SELECT_CC.
llvm-svn: 160064
2012-07-11 19:32:27 +00:00
Benjamin Kramer
e4bd0cdfa8 PR13326: Fix a subtle edge case in the udiv -> magic multiply generator.
This caused 6 of 65k possible 8 bit udivs to be wrong.

llvm-svn: 160058
2012-07-11 18:31:59 +00:00
Nadav Rotem
22652c85bc When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers.
llvm-svn: 160044
2012-07-11 13:27:05 +00:00
Akira Hatanaka
aad21ac7f2 Lower RETURNADDR node in Mips backend.
Patch by Sasa Stankovic.

llvm-svn: 160031
2012-07-11 00:53:32 +00:00
Jack Carter
639a740a15 Mips specific inline asm operand modifier 'L'.
Low order register of a double word register operand. Operands 
   are defined by the name of the variable they are marked with in
   the inline assembler code. This is a way to specify that the 
   operand just refers to the low order register for that variable.
   
   It is the opposite of modifier 'D' which specifies the high order
   register.
   
   Example:
   
 main()
{

    long long ll_input = 0x1111222233334444LL;
    long long ll_val = 3;
    int i_result = 0;

    __asm__ __volatile__( 
		   "or	%0, %L1, %2"
	     : "=r" (i_result) 
	     : "r" (ll_input), "r" (ll_val)); 
}

   Which results in:
   
   	lui	$2, %hi(_gp_disp)
	addiu	$2, $2, %lo(_gp_disp)
	addiu	$sp, $sp, -8
	addu	$2, $2, $25
	sw	$2, 0($sp)
	lui	$2, 13107
	ori	$3, $2, 17476     <-- Low 32 bits of ll_input
	lui	$2, 4369
	ori	$4, $2, 8738      <-- High 32 bits of ll_input
	addiu	$5, $zero, 3  <-- Low 32 bits of ll_val
	addiu	$2, $zero, 0  <-- High 32 bits of ll_val
	#APP
	or	$3, $4, $5        <-- or i_result, high 32 ll_input, low 32 of ll_val
	#NO_APP
	addiu	$sp, $sp, 8
	jr	$ra

If not direction is done for the long long for 32 bit variables results
in using the low 32 bits as ll_val shows.

There is an existing bug if 'L' or 'D' is used for the destination register
for 32 bit long longs in that the target value will be updated incorrectly
for the non-specified part unless explicitly set within the inline asm code.

llvm-svn: 160028
2012-07-10 22:41:20 +00:00
Chad Rosier
a721fc4419 Add newline.
llvm-svn: 160006
2012-07-10 17:57:00 +00:00
Chad Rosier
b8b2320c3c Add test case accidentally omitted from r160002.
llvm-svn: 160004
2012-07-10 17:49:39 +00:00
Chad Rosier
5395ec6ee4 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.  Basically, this is a reapplication of r158087 with a few fixes.

Specifically, (1) the stack pointer is restored from the base pointer before
popping callee-saved registers and (2) in obscure cases (see comments in patch)
we must cache the value of the original stack adjustment in the prologue and
apply it in the epilogue.

rdar://11496434

llvm-svn: 160002
2012-07-10 17:45:53 +00:00
Nadav Rotem
5f6e9d5ffe Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Richard Barton
2bacde8589 Fix instruction description of VMOV (between two ARM core registers and two single-precision resiters) (and do it properly this time!
llvm-svn: 159989
2012-07-10 12:51:09 +00:00
Craig Topper
b346ce8240 Reverse assembler/disassembler operand order for gather instructions.
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Akira Hatanaka
96b3eb563a Make register Mips::RA allocatable if not in mips16 mode.
llvm-svn: 159971
2012-07-10 00:19:06 +00:00
Chad Rosier
b986265e3b Revert r159938 (and r159945) to appease the buildbots.
llvm-svn: 159960
2012-07-09 20:43:34 +00:00
Owen Anderson
e0c4f94d45 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Manman Ren
dc41586be4 X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.

llvm-svn: 159955
2012-07-09 18:57:12 +00:00
Akira Hatanaka
3d2bcefaf1 Reapply r158846.
Access mips register classes via MCRegisterInfo's functions instead of via the
TargetRegisterClasses defined in MipsGenRegisterInfo.inc.

llvm-svn: 159953
2012-07-09 18:46:47 +00:00
Nuno Lopes
c676931bb9 instcombine: merge the functions that remove dead allocas and dead mallocs/callocs/...
This patch removes ~70 lines in InstCombineLoadStoreAlloca.cpp and makes both functions a bit more aggressive than before :)
In theory, we can be more aggressive when removing an alloca than a malloc, because an alloca pointer should never escape, but we are not taking advantage of this anyway

llvm-svn: 159952
2012-07-09 18:38:20 +00:00
Richard Barton
cb28956a79 Fix instruction description of VMOV (between two ARM core registers and two single-precision resiters)
llvm-svn: 159938
2012-07-09 16:41:33 +00:00
Richard Barton
58c6ccbb1c Prevent ARM assembler from losing a right shift by #32 applied to a register
llvm-svn: 159937
2012-07-09 16:31:14 +00:00
Richard Barton
2ca50f6513 Teach the assembler to use the narrow thumb encodings of various three-register dp instructions where permissable.
llvm-svn: 159935
2012-07-09 16:12:24 +00:00
Manman Ren
eca5886e50 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.

llvm-svn: 159888
2012-07-07 03:34:46 +00:00
Nuno Lopes
f3ba9a4d21 teach instcombine to remove allocated buffers even if there are stores, memcpy/memmove/memset, and objectsize users.
This means we can do cheap DSE for heap memory.
Nothing is done if the pointer excapes or has a load.

The churn in the tests is mostly due to objectsize, since we want to make sure we
don't delete the malloc call before evaluating the objectsize (otherwise it becomes -1/0)

llvm-svn: 159876
2012-07-06 23:09:25 +00:00
Akira Hatanaka
37565e70b6 revert r159851.
llvm-svn: 159854
2012-07-06 20:16:48 +00:00
Akira Hatanaka
4320724cc5 Reapply r158846.
Include file MipsGenRegisterInfo.inc.

llvm-svn: 159851
2012-07-06 19:29:11 +00:00
Manman Ren
8cbff1360f X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.

llvm-svn: 159838
2012-07-06 17:36:20 +00:00
Chad Rosier
2bffb657ca [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Duncan Sands
ffd9b270f2 Attempt to fix windows buildbots. Patch by James Benton.
llvm-svn: 159826
2012-07-06 14:43:16 +00:00
NAKAMURA Takumi
aac3b7a46a test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating.
llvm-svn: 159820
2012-07-06 12:12:39 +00:00
NAKAMURA Takumi
bd5a31d598 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

llvm-svn: 159817
2012-07-06 11:12:44 +00:00
Alexey Samsonov
195a9e9c55 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jush Lu
4fdc23801c [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 159804
2012-07-06 03:02:37 +00:00
Jack Carter
d22e7550f9 Mips specific inline asm operand modifier D.
Print the second half of a double word operand.
   
   The include list was cleaned up a bit as well.
   
   Also the test case was modified to test for both
   big and little patterns.
   

llvm-svn: 159787
2012-07-05 23:58:21 +00:00
Akira Hatanaka
5b41861c38 test case for r159770.
llvm-svn: 159771
2012-07-05 19:29:31 +00:00
Duncan Sands
f9eec0d373 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen
795083115c Ensure CopyToReg nodes are always glued to the call instruction.
The CopyToReg nodes that set up the argument registers before a call
must be glued to the call instruction. Otherwise, the scheduler may emit
the physreg copies long before the call, causing long live ranges for
the fixed registers.

Besides disabling good register allocation, that can also expose
problems when EmitInstrWithCustomInserter() splits a basic block during
the live range of a physreg.

llvm-svn: 159721
2012-07-04 19:28:31 +00:00
Rafael Espindola
d06801429f Add a testcase for pr13209. It is not a great test, but it still fails if
159509 and 159479 are reverted. It would be really nice to be able to run
just the coalescer :-(

llvm-svn: 159715
2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen
79846e5c9b Add early if-conversion support to X86.
Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

llvm-svn: 159695
2012-07-04 00:09:58 +00:00
Nuno Lopes
eac3a6d03c BoundsChecking: optimize out the check for offset < 0 if size is known to be >= 0 (signed).
(LLVM optimizers cannot do this optimization by themselves)

llvm-svn: 159668
2012-07-03 17:30:18 +00:00
Craig Topper
6fcb4454a0 Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252.
llvm-svn: 159644
2012-07-03 05:49:45 +00:00
NAKAMURA Takumi
33e15f03ee test/CodeGen/SPARC/private.ll: Fixup. Forgot to prune old RUN lines.
llvm-svn: 159643
2012-07-03 04:29:20 +00:00
NAKAMURA Takumi
8e3879b6ce test/CodeGen/SPARC/private.ll: FileCheck-ize.
llvm-svn: 159642
2012-07-03 04:21:57 +00:00
NAKAMURA Takumi
99db282ba1 llvm/test/lit.cfg: Retweak for Win32 to fix testing.
- execute_external should be;
    - Not on Win32.
    - Using bash.
    In reverse, "execute_internal" shoud be (Win32 && !bash).

  - lit.getBashPath() behaves differently before and after tweaking $PATH.

I will add a few explanations there later.

llvm-svn: 159641
2012-07-03 03:59:34 +00:00
NAKAMURA Takumi
ebfaec6c36 test/CodeGen/X86/sincos.ll: FileCheck-ize.
llvm-svn: 159639
2012-07-03 03:59:22 +00:00
NAKAMURA Takumi
2c39bfe0fe test/CodeGen/X86/fabs.ll: FileCheck-ize.
llvm-svn: 159638
2012-07-03 03:59:15 +00:00
NAKAMURA Takumi
d1831ee438 test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize.
llvm-svn: 159637
2012-07-03 03:59:08 +00:00
NAKAMURA Takumi
6c6ec244da test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize.
llvm-svn: 159636
2012-07-03 03:58:59 +00:00
Jack Carter
0e58c3f697 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    

llvm-svn: 159625
2012-07-02 23:35:23 +00:00
Eric Christopher
07e1aa6bfe Revert " mips32 long long register inline asm constraint support." as
it appears to be breaking the bots.

This reverts commit 1b055ce320fa13f6f1ac81670d11b45e01f79876.

llvm-svn: 159619
2012-07-02 23:22:25 +00:00
Eric Christopher
185293d560 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Jack Carter
f66ba1e2a3 deleted test/CodeGen/Mips/inlineasm-cnstrnt-bad-r-1.ll
llvm-svn: 159617
2012-07-02 23:21:22 +00:00
Jack Carter
64aeffc069 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    

llvm-svn: 159610
2012-07-02 22:39:45 +00:00
Chandler Carruth
4081d7f1e7 Extend the workaround from r159593 to cover a few explicit alias
targets.

llvm-svn: 159597
2012-07-02 21:45:22 +00:00
Chandler Carruth
5be5cafa65 Revert r159588, and apply a more principled fix. Place the fix for this
in the abstraction for lit test suites so that the various other layers
of abstraction pick up the same behavioral fix, and so that we still get
a complete list of dependencies for the 'check-all' target.

This should fix the follow-on issues of the same nature with various
other build targets, including Clang targets. Sorry for the churn, and
again thanks to Matt for testing and breaking this more thoroughly.

llvm-svn: 159593
2012-07-02 21:31:03 +00:00
Chandler Carruth
61e71944a9 Work around a really frustrating apparant CMake bug.
No functionality changed here, except that the CMake installed by
default on Ubuntu Lucid should actually work with the makefile
generators now.

Thanks to Matt for the report and head-desking required to figure out
why it was failing.

llvm-svn: 159588
2012-07-02 21:14:06 +00:00
Jack Carter
4355dfdc86 Pass the correct ELFOSABI enumeration to the MipsELFObjectWriter constructor
Contributer: Sasa Stankovic 
llvm-svn: 159574
2012-07-02 20:04:43 +00:00
Bob Wilson
a848f156de Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Chandler Carruth
5d3a0ce4e5 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

llvm-svn: 159547
2012-07-02 19:09:46 +00:00
Duncan Sands
16235b1885 GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection.
llvm-svn: 159546
2012-07-02 18:55:39 +00:00
Chandler Carruth
d200829a4f Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

llvm-svn: 159544
2012-07-02 18:37:59 +00:00
Bob Wilson
8564204a8c Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

llvm-svn: 159538
2012-07-02 17:22:47 +00:00
Nuno Lopes
e967ebe7bb fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB
llvm-svn: 159534
2012-07-02 16:14:47 +00:00
Chandler Carruth
eafc1e5909 The built-in shell test runner for some reason doesn't like the quoting
and multi-line nature of this test. I don't really feel like bugging
this kind of edge-case, so just put it on one line and use single
quotes. With this, every test *really* passes with the built-in shell
test runner.

llvm-svn: 159530
2012-07-02 13:35:01 +00:00
Chandler Carruth
36dd0d679c Fix the TCL-style quoting in one random test that somehow slipped
through my perl nets.

With this, the test suite passes even if I force it to run with the
built-in shell test logic, except for a test which REQUIREs shell.

llvm-svn: 159529
2012-07-02 13:29:47 +00:00
Stepan Dyatkovskiy
fff4579249 IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ragnes compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Chandler Carruth
8a358b3669 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

llvm-svn: 159525
2012-07-02 12:47:22 +00:00
Chandler Carruth
79f47d4fc5 Make tests which first provide a negative assertion via 'not', then
a pipeline, and then a positive assertion via grep, use two RUN lines
instead.

Supporting these complex ideas of 'success' and 'failure' across
multiple stages of a pipeline is brittle in the shell world, and would
block switching to ShTest format; it only worked due to contrivances
introduced by the TclTest format.

Writing this as two separate RUN lines seems clearer in any event.

This is another step toward completely removing TclTests from lit.

llvm-svn: 159524
2012-07-02 12:23:19 +00:00
Chandler Carruth
b27545af79 Rewrite three tests that had truly egregious abuses of 'grep' in them to
use FileCheck.

Aside from removing a dependence on TCL-style quoting, this also makes
the tests ... significantly more robust. =] It would be really, *really*
great of the maintainer(s) of the CellSPU backend went through and
systematically rewrite these tests to use FileCheck. There are a lot
more that have nearly this bad of abuses.

Another step along the path to a TclTest-free testsuite.

llvm-svn: 159523
2012-07-02 12:20:14 +00:00
Chandler Carruth
11714e13d3 Switch a bunch of Linker tests from using elaborate echo productions to
just provide and reference separate input files from an Inputs
subdirectory. This pattern works very well in the Clang tree and is
easier to understand in my opinion. It also has fewer limitations and
will remove one particularly annoying use of TCL-style {} quoting from
the testsuite.

Also teach the LLVM lit configuration to avoid recursing into 'Inputs'
subdirectories. This wasn't required for the previous 'Inputs'
subdirectories used due to fortuitous suffix patterns.

This is the first step to completely removing support for TCL-style tests.

llvm-svn: 159520
2012-07-02 10:18:06 +00:00
Alexey Samsonov
2e5d156c11 This patch extends the libLLVMDebugInfo which contains a minimalistic DWARF parser:
1) DIContext is now able to return function name for a given instruction address (besides file/line info).
2) llvm-dwarfdump accepts flag --functions that prints the function name (if address is specified by --address flag).
3) test case that checks the basic functionality of llvm-dwarfdump added

llvm-svn: 159512
2012-07-02 05:54:45 +00:00
Rafael Espindola
dd05a97f8e Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Elena Demikhovsky
0617b5a56c Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Chandler Carruth
7f56993d8e Hoist LLVM's lit testsuite infrastructure into module so that it can be
re-used. Also, build in direct support for accumulating a set of lit
parameters, arguments, and testsuites to run as part of a 'check-all'
rule. This sinks 'check-all' from a Clang-specific construct to
a generic construct of the project.

llvm-svn: 159482
2012-06-30 10:14:14 +00:00
Jakob Stoklund Olesen
b57dc22af0 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Duncan Sands
64b10a65e1 Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Rafael Espindola
53e0eee9de In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

llvm-svn: 159409
2012-06-29 04:22:35 +00:00
Manman Ren
63bf58865a X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Chandler Carruth
4889e7fe29 Remove a completely unnecessary mkdir from the CMake build.
Clang has been getting along fine without this for quite some time.

llvm-svn: 159400
2012-06-29 00:45:57 +00:00
Nick Lewycky
6b7498ebd5 If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
Nuno Lopes
49279c51b6 make the verifier accept @llvm.donothing as the only intrinsic that can be invoked
While at it, merge 2 tests and FileCheckize them

llvm-svn: 159388
2012-06-28 22:57:00 +00:00
Nuno Lopes
66896bbd47 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes
b0d4abe297 make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes
031ca196d0 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Nuno Lopes
52920835c9 make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
Chandler Carruth
ce08859a5a Move the setup for variables that are expanded in the lit.site.cfg into
a dedicated helper function. This will enable re-using the same logic
for Clang's lit setup, etc.

llvm-svn: 159333
2012-06-28 06:36:24 +00:00
Hal Finkel
89ff4e2b47 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00