1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

18088 Commits

Author SHA1 Message Date
Richard Osborne
02e7d3f377 Add instruction encodings / disassembly support for u6 / lu6 instructions.
llvm-svn: 173086
2013-01-21 20:44:17 +00:00
Richard Osborne
0b3c4f3112 Add instruction encoding / disassembly support for ru6 / lru6 instructions.
llvm-svn: 173085
2013-01-21 20:42:16 +00:00
Richard Osborne
c048ec5160 Add instruction encodings / disassembly support for l2rus instructions.
llvm-svn: 172987
2013-01-20 18:51:15 +00:00
Richard Osborne
5688672b0b Add instruction encodings / disassembly support for l3r instructions.
llvm-svn: 172986
2013-01-20 18:37:49 +00:00
Richard Osborne
400fb5329e Add instruction encodings / disassembler support for 2rus instructions.
llvm-svn: 172985
2013-01-20 17:22:43 +00:00
Richard Osborne
3a5e1fcceb Add instruction encodings / disassembly support 3r instructions.
It is not possible to distinguish 3r instructions from 2r / rus instructions
using only the fixed bits. Therefore if an instruction doesn't match the
2r / rus format try to decode it as a 3r instruction before returning Fail.

llvm-svn: 172984
2013-01-20 17:18:47 +00:00
NAKAMURA Takumi
61727f1583 llvm/test/CodeGen/X86/win_ftol2.ll: Add -cpu=generic to appease valgrind.
On valgrind the processor is reported;
  Host CPU: athlon-fx

llvm-svn: 172983
2013-01-20 15:40:02 +00:00
Nadav Rotem
94213533f7 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Nadav Rotem
9ec02f071a LoopVectorizer: Implement a new heuristics for selecting the unroll factor.
We ignore the cpu frontend and focus on pipeline utilization. We do this because we
don't have a good way to estimate the loop body size at the IR level.

llvm-svn: 172964
2013-01-20 05:24:29 +00:00
Nadav Rotem
adb4fb5903 Change the cpu type in the test.
llvm-svn: 172963
2013-01-20 05:20:56 +00:00
NAKAMURA Takumi
7da1a315d8 llvm/test/Other/close-stderr.ll: Mark this as XFAIL:valgrind. We got 127 instead of 1 here.
llvm-svn: 172956
2013-01-20 03:35:39 +00:00
David Blaikie
7af654510b The last of PR14471 - emission of constant floats
llvm-svn: 172941
2013-01-20 01:18:01 +00:00
David Blaikie
3498a9c833 Fix a latent bug exposed by recent static member debug info changes.
We weren't encoding boolean constants correctly due to modeling boolean as a
signed type & then sign extending an i1 up to a byte & getting 255.

llvm-svn: 172926
2013-01-19 23:00:25 +00:00
Benjamin Kramer
28c812f680 LoopVectorizer: Emit memory checks into their own basic block.
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.

Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.

llvm-svn: 172902
2013-01-19 13:57:58 +00:00
Nadav Rotem
6d7bb8551d On Sandybridge split unaligned 256bit stores into two xmm-sized stores.
llvm-svn: 172894
2013-01-19 08:38:41 +00:00
Jakob Stoklund Olesen
b38d4fe021 Remove some register allocation order dependencies.
llvm-svn: 172874
2013-01-19 00:03:32 +00:00
Nadav Rotem
de06852537 On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction.
llvm-svn: 172868
2013-01-18 23:10:30 +00:00
Eric Christopher
0a652f09e1 Split out DW_OP_addr for the split debug info DWARF5 proposal.
llvm-svn: 172857
2013-01-18 22:11:33 +00:00
Jack Carter
3f50a8d5c0 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Support for Mips register information sections.

Mips ELF object files have a section that is dedicated
to register use info. Some of this information such as
the assumed Global Pointer value is used by the linker
in relocation resolution.

The register info file is .reginfo in o32 and .MIPS.options
in 64 and n32 abi files.

This patch contains the changes needed to create the sections,
but leaves the actual register accounting for a future patch.


Contributer: Jack Carter
 
llvm-svn: 172847
2013-01-18 21:20:38 +00:00
Jack Carter
2737a54489 This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

Removal of redundant code and formatting fixes.

Contributers: Jack Carter/Vladimir Medic
 
llvm-svn: 172842
2013-01-18 20:15:06 +00:00
Daniel Dunbar
bda778db3b [MC/Mach-O] Implement integrated assembler support for linker options.
- Also, fixup syntax errors in LangRef and missing newline in the MCAsmStreamer.

llvm-svn: 172837
2013-01-18 19:37:00 +00:00
NAKAMURA Takumi
6fb32a52cb llvm/test/CodeGen/X86/Atomics-64.ll: Tweak for 2nd RUN not to overwrite %t. It sometimes causes spurious failure on lit win32.
Feel free to prune or suppress each output.

llvm-svn: 172823
2013-01-18 14:52:02 +00:00
Daniel Dunbar
1cb39fe210 [MC/Mach-O] Add support for linker options in Mach-O files.
llvm-svn: 172779
2013-01-18 01:26:07 +00:00
Daniel Dunbar
4f5f37798a [MC/Mach-O] Add AsmParser support for .linker_option directive.
llvm-svn: 172778
2013-01-18 01:25:48 +00:00
Bill Wendling
9449705327 Reverting r171325 & r172363. This was causing a mis-compile on the self-hosted LTO build bots.
Okay, here's how to reproduce the problem:

1) Build a Release (or Release+Asserts) version of clang in the normal way.

2) Using the clang & clang++ binaries from (1), build a Release (or
   Release+Asserts) version of the same sources, but this time enable LTO ---
   specify the `-flto' flag on the command line.

3) Run the ARC migrator tests:

    $ arcmt-test --args -triple x86_64-apple-darwin10 -fsyntax-only -x objective-c++ ./src/tools/clang/test/ARCMT/cxx-rewrite.mm

You'll see that the output isn't correct (the whitespace is off).

The mis-compile is in the function `RewriteBuffer::RemoveText' in the
clang/lib/Rewrite/Core/Rewriter.cpp file. When that function and RewriteRope.cpp
are compiled with LTO and the `arcmt-test' executable is regenerated, you'll see
the error. When those files are not LTO'ed, then the output of the `arcmt-test'
is fine.

It is *really* hard to get a testcase out of this. I'll file a PR with what I
have currently.

--- Reverse-merging r172363 into '.':
U    include/llvm/Analysis/MemoryBuiltins.h
U    lib/Analysis/MemoryBuiltins.cpp

--- Reverse-merging r171325 into '.':
U    test/Transforms/InstCombine/objsize.ll
G    include/llvm/Analysis/MemoryBuiltins.h
G    lib/Analysis/MemoryBuiltins.cpp

llvm-svn: 172756
2013-01-17 21:28:46 +00:00
Bill Schmidt
5aab290f08 Restore reverted test case, this time with REQUIRES: asserts
llvm-svn: 172747
2013-01-17 19:46:51 +00:00
Bill Schmidt
2bbdbadcdc Remove bad test case
llvm-svn: 172746
2013-01-17 19:39:36 +00:00
Bill Schmidt
27aac7a2f3 This patch fixes PR13626 by providing i128 support in the return
calling convention.  128-bit integers are now properly returned
in GPR3 and GPR4 on PowerPC.

llvm-svn: 172745
2013-01-17 19:34:57 +00:00
Jyotsna Verma
01c6fabba6 Add indexed load/store instructions for offset validation check.
This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902

llvm-svn: 172737
2013-01-17 18:42:37 +00:00
Bill Schmidt
a2682ebf0e This patch fixes the PPC calling convention to handle returns of
_Complex float and _Complex long double, by simply increasing the
number of floating point registers available for return values.

The test case verifies that the correct registers are loaded.

llvm-svn: 172733
2013-01-17 17:45:19 +00:00
Elena Demikhovsky
461c2bd18c Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Eric Christopher
bd3c66b31a Fix the assembly and dissassembly of DW_FORM_sec_offset. Found this by
changing both the string of the dwo_name to be correct and the type of
the statement list.

Testcases all around.

llvm-svn: 172699
2013-01-17 03:00:04 +00:00
Eric Christopher
7c06c5fb2b Add the DW_AT_GNU_addr_base for the skeleton cu. Add support for
emitting the dwarf32 version of DW_FORM_sec_offset and correct
disassembler support.

llvm-svn: 172698
2013-01-17 02:59:59 +00:00
Jack Carter
fc7b77744d This is a resubmittal. For some reason it broke the bots yesterday
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172685
2013-01-17 00:28:20 +00:00
Daniel Dunbar
677520900d [IR] Add 'Append' and 'AppendUnique' module flag behaviors.
llvm-svn: 172659
2013-01-16 21:38:56 +00:00
Michael Gottesman
fd06af4ad6 Added test for r172599 which fixes bugzilla://14584,rdar://11744105.
llvm-svn: 172656
2013-01-16 21:07:18 +00:00
Eric Christopher
961c3d7f24 Make this test X86 only.
llvm-svn: 172652
2013-01-16 20:31:35 +00:00
Eric Christopher
07e3b98cb9 Move this to X86.
llvm-svn: 172651
2013-01-16 20:31:32 +00:00
Eric Christopher
1a2014e917 Add testcase missed yesterday from Paul Robinson.
llvm-svn: 172646
2013-01-16 19:53:47 +00:00
Daniel Dunbar
ac8f9382bc [Linker] Change module flag linking to be more extensible.
- Instead of computing a bunch of buckets of different flag types, just do an
   incremental link resolving conflicts as they arise.

 - This also has the advantage of making the link result deterministic and not
   dependent on map iteration order.

llvm-svn: 172634
2013-01-16 18:39:23 +00:00
Kevin Enderby
2e875dfb6d We want the dwarf AT_producer for assembly source files to match clang's
AT_producer.  Which includes clang's version information so we can tell
which version of the compiler was used.

This is the first of two steps to allow us to do that.  This is the llvm-mc
change to provide a method to set the AT_producer string.  The second step,
coming soon to a clang near you, will have the clang driver pass the value
of getClangFullVersion() via an flag when invoking the integrated assembler
on assembly source files.

rdar://12955296

llvm-svn: 172630
2013-01-16 17:46:23 +00:00
Peter Collingbourne
5f190b5e4e Introduce llvm::sys::getProcessTriple() function.
In r143502, we renamed getHostTriple() to getDefaultTargetTriple()
as part of work to allow the user to supply a different default
target triple at configure time.  This change also affected the JIT.
However, it is inappropriate to use the default target triple in the
JIT in most circumstances because this will not necessarily match
the current architecture used by the process, leading to illegal
instruction and other such errors at run time.

Introduce the getProcessTriple() function for use in the JIT and
its clients, and cause the JIT to use it.  On architectures with a
single bitness, the host and process triples are identical.  On other
architectures, the host triple represents the architecture of the
host CPU, while the process triple represents the architecture used
by the host CPU to interpret machine code within the current process.
For example, when executing 32-bit code on a 64-bit Linux machine,
the host triple may be 'x86_64-unknown-linux-gnu', while the process
triple may be 'i386-unknown-linux-gnu'.

This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple
platforms.

Differential Revision: http://llvm-reviews.chandlerc.com/D254

llvm-svn: 172627
2013-01-16 17:27:22 +00:00
Benjamin Kramer
efb9c9f20b Move test that depends on the x86 target into a target-specific directory.
Should fix the arm buildbot (which only builds the arm target).

llvm-svn: 172611
2013-01-16 13:25:56 +00:00
Alexey Samsonov
555fc1db3c ASan: wrap mapping scale and offset in a struct and make it a member of ASan passes. Add test for non-default mapping scale and offset. No functionality change
llvm-svn: 172610
2013-01-16 13:23:28 +00:00
Benjamin Kramer
5c109df93a Remove triple from this test, it makes it fail when X86 TTI is missing.
Without a triple opt falls back to NoTTI which comes closer to LSR's pre-TTI behavior.

llvm-svn: 172609
2013-01-16 13:19:59 +00:00
Jack Carter
6e4e11b877 reverting 172579
llvm-svn: 172594
2013-01-16 01:29:10 +00:00
Jack Carter
d951346d6e Akira,
Hope you are feeling better.

The Mips RDHWR (Read Hardware Register) instruction was not 
tested for assembler or dissassembler consumption. This patch
adds that functionality.

Contributer: Vladimir Medic
 
llvm-svn: 172579
2013-01-16 00:07:45 +00:00
Eric Christopher
a159b6c731 Split address information for DWARF5 split dwarf proposal. This involves
using the DW_FORM_GNU_addr_index and a separate .debug_addr section which
stays in the executable and is fully linked.

Sneak in two other small changes:

a) Print out the debug_str_offsets.dwo section.
b) Change form we're expecting the entries in the debug_str_offsets.dwo
   section to take from ULEB128 to U32.

Add tests for all of this in the fission-cu.ll test.

llvm-svn: 172578
2013-01-15 23:56:56 +00:00
Nadav Rotem
24c96fa80c Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero.
llvm-svn: 172576
2013-01-15 23:43:14 +00:00
Shuxin Yang
36c3cb1b3b 1. Hoist minus sign as high as possible in an attempt to reveal
some optimization opportunities (in the enclosing supper-expressions).

   rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "-0.0 - X" has only one reference.

   rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
     if expression "0.0 - X" has only one reference, and
        the instruction is marked "noSignedZero".

2. Eliminate negation (The compiler was already able to handle these
    opt if the 0.0s are replaced with -0.0.)

   rule 3: (0.0 - X) * (0.0 - Y) => X * Y
   rule 4: (0.0 - X) * C => X * -C
   if the expr is flagged "noSignedZero".

3. 
  Rule 5: (X*Y) * X => (X*X) * Y
   if X!=Y and the expression is flagged with "UnsafeAlgebra".

   The purpose of this transformation is two-fold:
    a) to form a power expression (of X).
    b) potentially shorten the critical path: After transformation, the
       latency of the instruction Y is amortized by the expression of X*X,
       and therefore Y is in a "less critical" position compared to what it
      was before the transformation. 

4. Remove the InstCombine code about simplifiying "X * select".
   
   The reasons are following:
    a) The "select" is somewhat architecture-dependent, therefore the
       higher level optimizers are not able to precisely predict if
       the simplification really yields any performance improvement
       or not.

    b) The "select" operator is bit complicate, and tends to obscure
       optimization opportunities. It is btter to keep it as low as
       possible in expr tree, and let CodeGen to tackle the optimization.

llvm-svn: 172551
2013-01-15 21:09:32 +00:00