1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 00:12:50 +01:00
Commit Graph

90387 Commits

Author SHA1 Message Date
Michel Danzer
2f63b04c7c R600: Use legacy (0 * anything = 0) MUL instructions for pow intrinsics
Fixes wrong lighting in some corner cases with r600g and radeonsi, e.g.
manifested by failure of two piglit/glean tests and intermittent black
patches in many apps.

Tested on SI and RS880.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62012 [radeonsi]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58150 [r600g]

NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 177730
2013-03-22 14:09:10 +00:00
Kostya Serebryany
3d0691a059 [asan] Change the way we report the alloca frame on stack-buff-overflow.
Before: the function name was stored by the compiler as a constant string
and the run-time was printing it.
Now: the PC is stored instead and the run-time prints the full symbolized frame.
This adds a couple of instructions into every function with non-empty stack frame,
but also reduces the binary size because we store less strings (I saw 2% size reduction).
This change bumps the asan ABI version to v3.

llvm part.

Example of report (now):
==31711==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffa77cf1c5 at pc 0x41feb0 bp 0x7fffa77cefb0 sp 0x7fffa77cefa8
READ of size 1 at 0x7fffa77cf1c5 thread T0
    #0 0x41feaf in Frame0(int, char*, char*, char*) stack-oob-frames.cc:20
    #1 0x41f7ff in Frame1(int, char*, char*) stack-oob-frames.cc:24
    #2 0x41f477 in Frame2(int, char*) stack-oob-frames.cc:28
    #3 0x41f194 in Frame3(int) stack-oob-frames.cc:32
    #4 0x41eee0 in main stack-oob-frames.cc:38
    #5 0x7f0c5566f76c (/lib/x86_64-linux-gnu/libc.so.6+0x2176c)
    #6 0x41eb1c (/usr/local/google/kcc/llvm_cmake/a.out+0x41eb1c)
Address 0x7fffa77cf1c5 is located in stack of thread T0 at offset 293 in frame
    #0 0x41f87f in Frame0(int, char*, char*, char*) stack-oob-frames.cc:12  <<<<<<<<<<<<<< this is new
  This frame has 6 object(s):
    [32, 36) 'frame.addr'
    [96, 104) 'a.addr'
    [160, 168) 'b.addr'
    [224, 232) 'c.addr'
    [288, 292) 's'
    [352, 360) 'd'

llvm-svn: 177724
2013-03-22 10:37:20 +00:00
Dmitry Vyukov
b12c1b7911 tsan: fix the test
Add missed file from r177717 commit that adds __tsan_vptr_read.

llvm-svn: 177719
2013-03-22 09:04:01 +00:00
Dmitry Vyukov
eae8006130 tsan: handle vptr loads specially
This is required to determine ctor/dtor vs virtual call races.
http://llvm-reviews.chandlerc.com/D566

llvm-svn: 177717
2013-03-22 08:51:22 +00:00
Evgeniy Stepanov
f128dbc036 Fix llvm::removeUnreachableBlocks to handle unreachable loops.
llvm-svn: 177713
2013-03-22 08:43:04 +00:00
Arnaud A. de Grandmaison
7a4226244b InstCombine: Improve the result bitvect type when folding (cmp pred (load (gep GV, i)) C) to a bit test.
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.

Patch by : Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by : Duncan

llvm-svn: 177712
2013-03-22 08:25:01 +00:00
Hal Finkel
fb115a33df Remove ScavengedRC from RegisterScavenging
ScavengedRC was a dead private variable (set, but not otherwise used). No
functionality change intended.

llvm-svn: 177708
2013-03-22 07:27:44 +00:00
David Blaikie
620d0ae359 Reorder the DIFile field in DILexicalBlock to become a prefix common with other DIScopes
llvm-svn: 177703
2013-03-22 05:47:44 +00:00
Chandler Carruth
cdea7a21e5 Remove the ARM-specific variant of this test. It's already covered by
the ARM build bots, and it adds a weird case to the test suite where
a test uses as inputs files in the parent directory.

Talked about this with Dave on IRC and he's fine with this approach even
though it isn't optimal.

llvm-svn: 177700
2013-03-22 05:16:46 +00:00
Argyrios Kyrtzidis
1158d6e6fc Introduce LLVM_STATIC_ASSERT macro, which expands to C/C++'s static_assert on compilers which support it.
llvm-svn: 177699
2013-03-22 03:10:51 +00:00
Chandler Carruth
e849175ffc Revert r177543: Add timing of the IR parsing code with a new
-time-ir-parsing flag

This breaks the layering of the Support library. We can't add an
implementation side to IRReader because it refers directly to entities
only accessible as part of the IR, AsmParser, and BitcodeReader
libraries. It can only be used in a context where all of those libraries
will be available.

We'll need to find some other way to get this functionality, and
hopefully solve the long-standing layering problem of IRReader.h...

llvm-svn: 177695
2013-03-22 02:20:34 +00:00
Jack Carter
f42f53d767 Fix the invalid opcode for Mips branch instructions in the assembler
For mips a branch an 18-bit signed offset (the 16-bit 
offset field shifted left 2 bits) is added to the 
address of the instruction following the branch 
(not the branch itself), in the branch delay slot, 
to form a PC-relative effective target address. 

Previously, the code generator did not perform the 
shift of the immediate branch offset which resulted 
in wrong instruction opcode. This patch fixes the issue.

Contributor: Vladimir Medic
llvm-svn: 177687
2013-03-22 00:29:10 +00:00
Jack Carter
748712c200 This patch that enables the Mips assembler to use symbols for offset for instructions
This patch uses the generated instruction info tables to 
identify memory/load store instructions.
After successful matching and based on the operand type 
and size, it generates additional instructions to the output.

Contributor: Vladimir Medic
llvm-svn: 177685
2013-03-22 00:05:30 +00:00
Hal Finkel
598c91ae81 Remove the G8RC_NOX0_and_GPRC_NOR0 PPC register class
As Jakob pointed out in his review of r177423, having a shared ZERO
register between the 32- and 64-bit register classes causes this
odd G8RC_NOX0_and_GPRC_NOR0 class to be created. As recommended,
this adds a ZERO8 register which differentiates the 32- and 64-bit
zeros.

No functionality change intended.

llvm-svn: 177683
2013-03-21 23:45:03 +00:00
Sean Silva
b2ca5c4ca4 Add TableGen ctags(1) emitter and helper script.
To use this in conjunction with exuberant ctags to generate a single
combined tags file, run tblgen first and then
  $ ctags --append [...]

Since some identifiers have corresponding definitions in C++ code,
it can be useful (if using vim) to also use cscope, and
  :set cscopetagorder=1
so that
  :tag X
will preferentially select the tablegen symbol, while
  :cscope find g X
will always find the C++ symbol.

Patch by Kevin Schoedel!

(a couple small formatting changes courtesy of clang-format)

llvm-svn: 177682
2013-03-21 23:40:38 +00:00
Bill Wendling
cf28d49703 Always forward 'resume' instructions to the outter landing pad.
How did this ever work?

Basically, if you have a function that's inlined into the caller, it may not
have any 'call' instructions, but any 'resume' instructions it may have should
still be forwarded to the outer (caller's) landing pad. This requires that all
of the 'landingpad' instructions in the callee have their clauses merged with
the caller's outer 'landingpad' instruction (hence the bit of ugly code in the
`forwardResume' method).

Testcase in a follow commit to the test-suite repository.

<rdar://problem/13360379> & PR15555

llvm-svn: 177680
2013-03-21 23:30:12 +00:00
Hal Finkel
164c449fcc Fix a register-class comparison bug in PPCCTRLoops
Thanks to Jakob for isolating the underlying problem from the
test case in r177423. The original commit had introduced
asymmetric copy operations, but these turned out to be a work-around
to the real problem (the use of == instead of hasSubClassEq in PPCCTRLoops).

llvm-svn: 177679
2013-03-21 23:23:34 +00:00
David Blaikie
c9d598113b Refactor the filename/directory information in DISubprogram to refer directly to the pair rather than the DIFile.
llvm-svn: 177677
2013-03-21 23:08:34 +00:00
Bill Wendling
05a454cd6d Add a query to tell if a landing pad has a catch-all.
llvm-svn: 177675
2013-03-21 23:01:03 +00:00
David Blaikie
648a81f32c Move the DIFile in DISubprogram to the beginning to be a common prefix along with other DIScopes
llvm-svn: 177674
2013-03-21 22:29:36 +00:00
Douglas Gregor
874757a6ee <rdar://problem/13477190> On Darwin, use DARWIN_USER_TEMP_DIR or DARWIN_USER_CACHE_DIR for the system temporary directory.
The DARWIN_USER_TEMP_DIR and DARWIN_USER_CACHE_DIR configuration
settings are more idiomatic for Darwin than the TMPDIR environment
variable.

llvm-svn: 177669
2013-03-21 21:46:10 +00:00
Jack Carter
9e089b8c4f This patch enables the Mips .set directive to define aliases
The .set directive in the Mips the assembler can be 
used to set the value of a symbol to an expression. 
This changes the symbol's value and type to conform 
to the expression's.

Syntax: .set symbol, expression

This patch implements the parsing of the above syntax 
and enables the parser to use defined symbols when 
parsing operands.

Contributor: Vladimir Medic
llvm-svn: 177667
2013-03-21 21:44:16 +00:00
Hal Finkel
7e324aee83 Implement builtin_{setjmp/longjmp} on PPC
This implements SJLJ lowering on PPC, making the Clang functions
__builtin_{setjmp/longjmp} functional on PPC platforms. The implementation
strategy is similar to that on X86, with the exception that a branch-and-link
variant is used to get the right jump address. Credit goes to Bill Schmidt for
suggesting the use of the unconditional bcl form (instead of the regular bl
instruction) to limit return-address-cache pollution.

Benchmarking the speed at -O3 of:

static jmp_buf env_sigill;

void foo() {
                __builtin_longjmp(env_sigill,1);
}

main() {
	...

        for (int i = 0; i < c; ++i) {
                if (__builtin_setjmp(env_sigill)) {
                        goto done;
                } else {
                        foo();
                }

done:;
        }

	...
}

vs. the same code using the libc setjmp/longjmp functions on a P7 shows that
this builtin implementation is ~4x faster with Altivec enabled and ~7.25x
faster with Altivec disabled. This comparison is somewhat unfair because the
libc version must also save/restore the VSX registers which we don't yet
support.

llvm-svn: 177666
2013-03-21 21:37:52 +00:00
Renato Golin
1fca3efc0b Fix Darwin NEON FP and increase coverage
llvm-svn: 177664
2013-03-21 21:30:49 +00:00
David Blaikie
67c9dc82dc Remove unused field in DISubprogram
llvm-svn: 177661
2013-03-21 20:28:52 +00:00
Hal Finkel
7e6dc78317 Add support for spilling VRSAVE on PPC
Although there is only one Altivec VRSAVE register, it is a member of
a register class, and we need the ability to spill it. Because this
register is normally callee-preserved and handled by special code this
has never before been necessary. However, this capability will be required by
a forthcoming commit adding SjLj support.

llvm-svn: 177654
2013-03-21 19:03:21 +00:00
Hal Finkel
2043b2adae Correct PPC FRAMEADDR lowering using a pseudo-register
The old code used to lower FRAMEADDR tried to replicate the logic in the real
frame-lowering code that determines whether or not the frame pointer (r31) will
be used. When it seemed as through the frame pointer would not be used, the
stack pointer (r1) was used instead. Unfortunately, because the stack size is
not yet known, this does not work. Instead, this change introduces new
always-reserved pseudo-registers (FP and FP8) that are replaced during prologue
insertion with the real frame-pointer register (either r1 or r31).

It is important that this intrinsic always return a valid frame address because
it is used by Clang to store the frame address as part of code generation for
__builtin_setjmp.

llvm-svn: 177653
2013-03-21 19:03:19 +00:00
Renato Golin
0854fd9bef Avoid NEON SP-FP unless unsafe-math or Darwin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.

llvm-svn: 177651
2013-03-21 18:47:47 +00:00
Bill Wendling
e20714f292 Update some EH tests that were violating the new EH model.
The landingpad instruction needs to be the first non-PHI instruction in the
unwind destination block.

llvm-svn: 177650
2013-03-21 18:30:10 +00:00
Chandler Carruth
8613f86d1c Hoist the definition of getTypeSizeInBits to be inlinable and in the
header.

This method is called in the hot path for *many* passes, SROA is what
caught my interest. A common pattern is that which branch of the switch
should be taken is known in the callsite and so it is a very good
candidate for inlining and simplification. Moving it into the header
allows the optimizer to fold a lot of boring, repeatitive code in
callers of this routine.

I'm seeing pretty significant speedups in parts of SROA and I suspect
other passes will see similar speedups if they end up working with type
sizes frequently. I've not seen any significant growth of the binaries
as a consequence, but let me know if you see anything suspicious here.

llvm-svn: 177632
2013-03-21 09:52:22 +00:00
Chandler Carruth
5dfc3ade1f [SROA] Prefix names using a custom IRBuilder inserter.
The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.

This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.

llvm-svn: 177631
2013-03-21 09:52:18 +00:00
Evgeniy Stepanov
91fdbb2384 [msan] Add an option to disable poisoning of shadow for undef values.
llvm-svn: 177630
2013-03-21 09:38:26 +00:00
Meador Inge
8c4638bcc3 simplify-libcalls: Removed unused variable
The 'Modified' variable should have been removed from SimplifyLibCalls
in r177619, but was missed.  This commit removes it.

llvm-svn: 177622
2013-03-21 02:44:07 +00:00
Matt Arsenault
a80a8422d5 Fix missing std::. Not sure how this compiles for anyone else.
llvm-svn: 177620
2013-03-21 00:57:21 +00:00
Meador Inge
30024047b3 Move library call prototype attribute inference to functionattrs
The simplify-libcalls pass implemented a doInitialization hook to infer
function prototype attributes for well-known functions.  Given that the
simplify-libcalls pass is going away *and* that the functionattrs pass
is already in place to deduce function attributes, I am moving this logic
to the functionattrs pass.  This approach was discussed during patch
review:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121126/157465.html.

llvm-svn: 177619
2013-03-21 00:55:59 +00:00
David Blaikie
162c31d46b Removing unused DISubprogram::getFile
llvm-svn: 177614
2013-03-21 00:10:31 +00:00
Jakob Stoklund Olesen
6ffb4136aa Add a WriteMicrocoded for ancient microcoded instructions.
llvm-svn: 177611
2013-03-21 00:07:17 +00:00
David Blaikie
b2d35852ea Debug info: refactor the first field of DICompileUnit to be a raw file/directory pair
This removes the DICompileUnit special case from DIScope.

llvm-svn: 177610
2013-03-20 23:58:12 +00:00
Jakub Staszak
9d539599f4 Use pre-inc, pre-dec when possible.
They are generally faster (at least not slower) than post-inc, post-dec.

llvm-svn: 177608
2013-03-20 23:56:19 +00:00
Jakub Staszak
de54132000 Remove 'else' after 'return'.
llvm-svn: 177607
2013-03-20 23:53:45 +00:00
Reid Kleckner
804b51ca54 [lit] Avoid CRLFs in bash scripts on Windows
Native Windows Python will do line ending translation by default, which
we don't want in bash scripts.  If we're not native Windows Python, then
'b' is ignored.

llvm-svn: 177602
2013-03-20 23:32:14 +00:00
Justin Holewinski
19c2f0143b Make variable name more explicit and eliminate redundant lookup in SDNodeOrdering
llvm-svn: 177600
2013-03-20 23:10:59 +00:00
Jakob Stoklund Olesen
b3af273625 Model prefetches and barriers as loads.
It's not yet clear if these instructions need a more careful model.

llvm-svn: 177599
2013-03-20 23:09:53 +00:00
Jakob Stoklund Olesen
042f102514 Add a catch-all WriteSystem SchedWrite type.
This is used for all the expensive system instructions.

llvm-svn: 177598
2013-03-20 23:09:50 +00:00
Nadav Rotem
3a9f2d7de8 When computing the demanded bits of Load SDNodes, make sure that we are looking at the loaded-value operand and not the ptr result (in case of pre-inc loads).
rdar://13348420

llvm-svn: 177596
2013-03-20 22:53:44 +00:00
David Blaikie
78d3bdea74 Debug Info: Swap the 2nd and 3rd parameters to DICompileUnit to match the common DIScope prefix
llvm-svn: 177595
2013-03-20 22:52:54 +00:00
Jakob Stoklund Olesen
305e22bdab Annotate the remaining SSE MOV instructions.
llvm-svn: 177592
2013-03-20 22:37:16 +00:00
Jakob Stoklund Olesen
96a403dd67 Annotate SSE horizontal and integer instructions.
llvm-svn: 177591
2013-03-20 22:37:13 +00:00
David Blaikie
30abbc718f Remove unused field in DICompileUnit
llvm-svn: 177590
2013-03-20 22:34:33 +00:00
Michael Liao
fe785c9579 Correct cost model for vector shift on AVX2
- After moving logic recognizing vector shift with scalar amount from
  DAG combining into DAG lowering, we declare to customize all vector
  shifts even vector shift on AVX is legal. As a result, the cost model
  needs special tuning to identify these legal cases.

llvm-svn: 177586
2013-03-20 22:01:10 +00:00