1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

16561 Commits

Author SHA1 Message Date
Rafael Espindola
53e0eee9de In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

llvm-svn: 159409
2012-06-29 04:22:35 +00:00
Manman Ren
63bf58865a X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Chandler Carruth
4889e7fe29 Remove a completely unnecessary mkdir from the CMake build.
Clang has been getting along fine without this for quite some time.

llvm-svn: 159400
2012-06-29 00:45:57 +00:00
Nick Lewycky
6b7498ebd5 If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
Nuno Lopes
49279c51b6 make the verifier accept @llvm.donothing as the only intrinsic that can be invoked
While at it, merge 2 tests and FileCheckize them

llvm-svn: 159388
2012-06-28 22:57:00 +00:00
Nuno Lopes
66896bbd47 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes
b0d4abe297 make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes
031ca196d0 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Nuno Lopes
52920835c9 make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
Chandler Carruth
ce08859a5a Move the setup for variables that are expanded in the lit.site.cfg into
a dedicated helper function. This will enable re-using the same logic
for Clang's lit setup, etc.

llvm-svn: 159333
2012-06-28 06:36:24 +00:00
Hal Finkel
89ff4e2b47 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Jack Carter
50778bd9cc The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.

llvm-svn: 159324
2012-06-28 01:33:40 +00:00
Nuno Lopes
873f05c3ff make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision
llvm-svn: 159320
2012-06-28 01:16:18 +00:00
Chandler Carruth
f40c0fc048 Remove 'site.exp' building from both CMake and configure+make.
This is another vestige of the DejaGNU roots. There were FIXMEs in the
lit setup to add a 'lit.site.cfg', which has been around for quite some
time now, so I've properly switched the handling of the 4 things
actually used in site.exp to go through lit.site.cfg now. No more
parsing of the .exp file, one fewer configure-style generated file,
etc., etc.

llvm-svn: 159313
2012-06-28 00:16:51 +00:00
Chandler Carruth
35573d17de Remove the last vestiges of the '-lit' and '-dg' test runner split by
removing '-lit' qualifiers from make rules. I've left a legacy
'check-local-lit' rule in case build scripts have this encoded
somewhere.

llvm-svn: 159311
2012-06-28 00:03:15 +00:00
Chandler Carruth
da1d0dfcf9 Rip out legacy DejaGNU support from our Makefiles. This hasn't been the
default in forever, and hasn't even worked since most of the .exp files
were removed.

llvm-svn: 159307
2012-06-27 23:48:39 +00:00
Chandler Carruth
8825833ea4 LLVM-GCC is dead. Really. I promise. ;]
More importantly, these files don't even have the variable that these
lines purport to substite.

llvm-svn: 159304
2012-06-27 23:34:25 +00:00
Jack Carter
156781dada This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159302
2012-06-27 23:13:42 +00:00
Jack Carter
dc890e3c25 This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159301
2012-06-27 22:48:25 +00:00
Matt Beaumont-Gay
93c66a3db1 Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands
1c87a20df1 Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them needs to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Richard Barton
7d5dedd329 Teach assembler to handle capitalised operation values for DSB instructions
llvm-svn: 159259
2012-06-27 09:48:23 +00:00
Chandler Carruth
360685166d Clean up the 'check' CMake build rule a bit, notable renaming it to
'check-llvm'.

Don't worry! 'check' still works! =] To rationalize the names of targets
used to run tests, the vague plan is the following:

make check-llvm  # run LLVM reg/unit tests  (currently 'check')
make check-clang # run Clang reg/unit tests (currently 'clang-test')
make check-rt    # run CompilerRT reg/unit tests
make check-asan  # run ASan reg/unit tests (subset of -rt)
make check-tsan  # run TSan reg/unit tests (subset of -rt)
make check-all   # run as much of the above as is available

The last one respects what projects are checked out and built for
a given tree. Personally, I would like to eventually make 'check' be an
alias for 'check-all'. For now however, it is an alias for 'check-llvm',
and thus no behavior has changed.

While this patch and my plan only really apply to CMake, I think it
might be good to similarly rationalize the naming scheme for the Make
builds.

llvm-svn: 159258
2012-06-27 09:44:16 +00:00
Akira Hatanaka
ea099eba3e Test case for r159240.
llvm-svn: 159242
2012-06-27 00:40:34 +00:00
Evan Cheng
9132bcf0e3 Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Rafael Espindola
51d0d15c23 Fix llc's -print-before=pass and -print-after=pass.
llvm-svn: 159227
2012-06-26 21:33:36 +00:00
Manman Ren
6be46b7b4c X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.

llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Jack Carter
0d53f88926 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.vi


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributer: Jack Carter
llvm-svn: 159203
2012-06-26 13:49:27 +00:00
Duncan Sands
1770ae1ae4 Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Elena Demikhovsky
832f074a32 Shuffle optimization for AVX/AVX2.
The current patch optimizes frequently used shuffle patterns and gives these instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
       vextractf128    $1, %ymm1, %xmm1
       vextractf128    $1, %ymm0, %xmm0
       vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
       vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]

llvm-svn: 159188
2012-06-26 08:04:10 +00:00
Craig Topper
5c8bdeb3f3 Remove some duplicate instructions that exist only to given different mnemonics for the assembler. Use InstAlias instead.
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Andrew Trick
c5e08120a4 Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Eli Friedman
a3ccee4b33 Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
llvm-svn: 159176
2012-06-25 23:42:33 +00:00
Nuno Lopes
bf0bd73d19 revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias
llvm-svn: 159175
2012-06-25 23:26:10 +00:00
Nuno Lopes
d9d8ad5188 do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway..
llvm-svn: 159173
2012-06-25 22:55:50 +00:00
Manman Ren
bd339c27e1 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965
llvm-svn: 159166
2012-06-25 21:49:38 +00:00
Dan Gohman
2287ddbef7 Fix the objc_autoreleasedReturnValue optimization code to locate
the call correctly even in the case where it is an invoke. This
fixes rdar://11714057.

llvm-svn: 159157
2012-06-25 19:47:37 +00:00
Jakob Stoklund Olesen
cc79c28e91 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen
9333a7fb3b Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Nuno Lopes
165c99b53d improve optimization of invoke instructions:
- simplifycfg:  invoke undef/null -> unreachable
 - instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
 - verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146
2012-06-25 17:11:47 +00:00
Meador Inge
b8cf9886db PR13013: ELF Type identification fails for MSB type ELF files.
Fix 'sys::IdentifyFileType' to work with big and little endian byte orderings
when reading the ELF object file type.

Initial patch by Stefan Hepp.

llvm-svn: 159138
2012-06-25 14:48:43 +00:00
Rafael Espindola
45a2b18594 If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136
2012-06-25 14:30:31 +00:00
Jakob Stoklund Olesen
76fcb51532 %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

llvm-svn: 159116
2012-06-24 15:53:01 +00:00
Hal Finkel
409cab2a0a Allow controlling vectorization of boolean values separately from other integer types.
These are used as the result of comparisons, and often handled differently from larger integer types.

llvm-svn: 159111
2012-06-24 13:28:01 +00:00
Nick Lewycky
f408016f37 Remove dyn_cast + dereference pattern by replacing it with a cast and changing
the safety check to look for the same type we're going to actually cast to.
Fixes PR13180!

llvm-svn: 159110
2012-06-24 10:15:42 +00:00
Nick Lewycky
e4f20af5c4 Remove a dangling reference to a deleted instruction. Fixes PR13185!
llvm-svn: 159096
2012-06-24 01:44:08 +00:00
Pete Cooper
9f89f00988 DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Hal Finkel
d0a65988d8 Allow BBVectorize to fuse compare instructions.
llvm-svn: 159088
2012-06-23 21:52:50 +00:00
Marshall Clow
b34fa3efd3 Add relocation types for Hexagon processor; patch by Sidney Manning <sidneym@codeaurora.org>
llvm-svn: 159081
2012-06-23 14:46:18 +00:00
Hans Wennborg
8c011bd43a Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

llvm-svn: 159077
2012-06-23 11:37:03 +00:00
Rafael Espindola
048a927ab5 Handle aliases to tls variables in all architectures, not just x86.
llvm-svn: 159058
2012-06-23 00:30:03 +00:00
Evan Cheng
2d498dc096 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136

llvm-svn: 159057
2012-06-23 00:29:06 +00:00
Jim Grosbach
92de1a3f58 ARM: Add a better diagnostic for some out of range immediates.
As an example of how the custom DiagnosticType can be used to provide
better operand-mismatch diagnostics, add a custom diagnostic for
the imm0_15 operand class used for several system instructions.
Update the tests to expect the improved diagnostic.

rdar://8987109

llvm-svn: 159051
2012-06-22 23:56:48 +00:00
Hal Finkel
ebe9ea8bd7 Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

llvm-svn: 159045
2012-06-22 23:10:08 +00:00
Chad Rosier
2701ece4bf FileCheckize tests.
llvm-svn: 159044
2012-06-22 23:04:02 +00:00
Lang Hames
7d298105e5 Rename fp-op fusion option (yet again) for compatibility with GCC option.
llvm-svn: 159042
2012-06-22 22:31:00 +00:00
Evan Cheng
d957460992 EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134
llvm-svn: 159023
2012-06-22 20:14:46 +00:00
Jakob Stoklund Olesen
c970d61f6d Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x"
This fixes PR5997.

These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.

This causes no regressions on x86-64.

llvm-svn: 159003
2012-06-22 16:36:43 +00:00
NAKAMURA Takumi
85707de537 test/CodeGen/Generic/asm-large-immediate.ll: Mark it as XFAIL: powerpc, possibly due to r158939.
llvm-svn: 158994
2012-06-22 13:41:00 +00:00
Jakob Stoklund Olesen
3efab18404 Functions calling __builtin_eh_return must have a frame pointer.
The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame
pointer exists, but the frame pointer was forced by the presence of
llvm.eh.unwind.init which isn't guaranteed.

If llvm.eh.unwind.init is actually required in functions calling
eh.return (is it?), we should diagnose that instead of emitting bad
machine code.

This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot.

llvm-svn: 158961
2012-06-22 03:04:27 +00:00
Andrew Trick
764fe3bfef ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

llvm-svn: 158959
2012-06-22 02:50:31 +00:00
Nick Lewycky
da52706728 Emit relocations for DW_AT_location entries on systems which need it. This is
a recommit of r127757. Fixes PR9493. Patch by Paul Robinson!

llvm-svn: 158957
2012-06-22 01:25:12 +00:00
Lang Hames
68cf87e3ef Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Nuno Lopes
009e7f08aa instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here).
Fixes the crashes with the attached test case

llvm-svn: 158951
2012-06-21 23:52:14 +00:00
Evan Cheng
404624ee4d Look pass zext to strength reduce an udiv. Patch by David Majnemer. rdar://11721329
llvm-svn: 158946
2012-06-21 22:52:49 +00:00
Jack Carter
ecfcd0f81b The inline asm operand modifier 'n' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several Architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter
llvm-svn: 158939
2012-06-21 21:37:54 +00:00
Nuno Lopes
8baf9fdf84 Add support for invoke to the MemoryBuiltin analysid.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Akira Hatanaka
021bc1c8a2 1. fix null program output after some other changes
2. re-enable null.ll test
3. fix some minor style violations

Patch by Reed Kotler.

llvm-svn: 158935
2012-06-21 20:39:10 +00:00
Akira Hatanaka
0ce3a3b090 Add Mips to the list of target architectures for the MCJIT tests.
Patch by Reed Kotler.

llvm-svn: 158933
2012-06-21 20:23:32 +00:00
Hal Finkel
bc9be7c0e5 Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

llvm-svn: 158932
2012-06-21 20:10:48 +00:00
Jack Carter
533bef32ae The inline asm operand modifier 'c' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed when test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter
llvm-svn: 158925
2012-06-21 17:14:46 +00:00
Nuno Lopes
46de159c09 hopefully fix the buildbots: some tests have wrong definitions of malloc and were crashing this code on 64 bits machines
llvm-svn: 158923
2012-06-21 16:47:58 +00:00
Nuno Lopes
0861020fd8 port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here).
Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy..

llvm-svn: 158920
2012-06-21 15:59:53 +00:00
Nuno Lopes
c9edab11db refactor the MemoryBuiltin analysis:
- provide more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc)
 - provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
NAKAMURA Takumi
ff5fc0d895 Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911."
It passes according to ppc changes.

llvm-svn: 158917
2012-06-21 13:43:06 +00:00
Lang Hames
662801dbc8 Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
llvm-svn: 158902
2012-06-21 06:10:00 +00:00
Evan Cheng
5e3a175c65 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607

llvm-svn: 158900
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen
914857b29a Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

llvm-svn: 158873
2012-06-20 23:31:34 +00:00
Akira Hatanaka
9a6df0f613 Revert r158846.
llvm-svn: 158855
2012-06-20 21:19:39 +00:00
Akira Hatanaka
f8ce377e38 In MipsDisassembler.cpp, instead of defining register class tables, use the ones
that are generated by TableGen and are already available in
MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen.

Also, fix bug in function DecodeAFGR64RegisterClass.

Patch by Vladimir Medic. 

llvm-svn: 158846
2012-06-20 20:39:23 +00:00
Jakob Stoklund Olesen
6d2db5c3d9 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

llvm-svn: 158831
2012-06-20 18:00:57 +00:00
Hal Finkel
a94da28a6d Add support for generating reg+reg (indexed) pre-inc loads on PPC.
llvm-svn: 158823
2012-06-20 15:43:03 +00:00
Craig Topper
d63e429d68 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases.
llvm-svn: 158792
2012-06-20 05:39:26 +00:00
Lang Hames
f0b9601a6d Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen
0a9edb38d3 Add a triple.
The test was failing on Linux because of asm syntax differences.

llvm-svn: 158748
2012-06-19 21:46:25 +00:00
Jakob Stoklund Olesen
66e7517610 Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

llvm-svn: 158743
2012-06-19 21:14:34 +00:00
Jan Wen Voung
fa15c02364 Have ARM ELF use correct reloc for "b" instr.
The condition code didn't actually matter for arm "b" instructions,
unlike "bl".  It should just use the R_ARM_JUMP24 reloc.

llvm-svn: 158722
2012-06-19 16:03:02 +00:00
Hal Finkel
42b797225a Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

llvm-svn: 158698
2012-06-19 02:34:32 +00:00
Rafael Espindola
0f267bbe04 really add a triple :-(
llvm-svn: 158696
2012-06-19 02:17:35 +00:00
Rafael Espindola
1cc3be37a0 Add a triple to the test.
llvm-svn: 158695
2012-06-19 01:42:34 +00:00
Rafael Espindola
38c45a939d Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Nuno Lopes
de7b3a54f2 revert r158660, since Chris has some issues with this patch (namely using code to reprent information only used by the compiler)
Original commit msg:
add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158688
2012-06-18 23:34:26 +00:00
Manman Ren
6d2895c506 ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 158684
2012-06-18 22:23:48 +00:00
Jim Grosbach
6ea9efb4e5 ARM: Define generic HINT instruction.
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.

rdar://11600518

llvm-svn: 158674
2012-06-18 19:45:50 +00:00
Nuno Lopes
aa5ffcb407 add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158660
2012-06-18 16:04:04 +00:00
Joel Jones
3d5ae56be4 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>

llvm-svn: 158659
2012-06-18 14:51:32 +00:00
Chandler Carruth
1c3df655ea Add a regression test for the bug exposed by r158087, which has been
temporarily reverted.

This test is annoyingly overspecified, but I don't know of another way
to thoroughly test the saving and restoring of the registers. While this
will have to be adjusted even with the issue fixed in order to re-apply
r158087, those adjustments should very clearly indicate that it is still
correct (%esp getting restored prior to pops), whereas without it, this
case can easily slip under the radar.

Still, any suggestions for improvements are very welcome.

All credit to Matt Beaumont-Gay for reducing this out of an insane
Address Sanitizer crash to a reasonably small seg-faulting C program
when built with -mstackrealign. I just reduced it to IR, which was much
simpler. =]

llvm-svn: 158656
2012-06-18 09:15:04 +00:00
Chandler Carruth
d2716ae111 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

llvm-svn: 158654
2012-06-18 07:03:12 +00:00
Pete Cooper
5e72f7e4f9 Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts
llvm-svn: 158623
2012-06-17 03:58:26 +00:00
Hal Finkel
40483bafbf Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

llvm-svn: 158607
2012-06-16 20:34:07 +00:00