llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Arnold Schwaighofer	8a0e82c2bc	LoopVectorizer: Enable unrolling of conditional stores and the load/store unrolling heuristic per default Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed up some benchmarks while not causing any interesting regressions. llvm-svn: 200621	2014-02-02 03:12:34 +00:00
Matt Arsenault	4198d962c0	R600/SI: Fix insertelement with dynamic indices. This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619	2014-02-02 00:05:35 +00:00
Venkatraman Govindaraju	9c2c79ab67	[Sparc] Set %o7 as the return address register instead of %i7 in MCRegisterInfo. Also, add CFI instructions to initialize the frame correctly. llvm-svn: 200617	2014-02-01 18:54:16 +00:00
Arnold Schwaighofer	984f27d265	ARMTTI: We don't have 16 allocatable scalar registers This caused a regression on libquantum after enabling the new loop vectorizer unroll heuristics. llvm-svn: 200616	2014-02-01 18:00:25 +00:00
David Woodhouse	29e7c4248c	MC: Fix .octa output for APInts with BitWidth > 128 llvm-svn: 200615	2014-02-01 16:52:33 +00:00
David Woodhouse	03ba7e20e4	MC: Add support for .octa This is a minimal implementation which accepts only constants rather than full expressions, but that should be perfectly sufficient for all known users for now. Patch from PaX Team <pageexec@freemail.hu> llvm-svn: 200614	2014-02-01 16:20:59 +00:00
David Woodhouse	56de88752a	MC: Add AsmLexer::BigNum token for integers greater than 64 bits This will be needed for .octa support, but we don't want to just use the existing AsmLexer::Integer for it and then have to litter all its users with explicit checks for the size, and make them use the new getAPIntVal() method. So let the lexer produce an AsmLexer::Integer as before for numbers which are small enough, which appears to cover what was previously a nasty special case handling of numbers which don't fit in int64_t but do fit in uint64_t. Where the number is too large even for that, produce an AsmLexer::BigNum instead. We do nothing with these except complain about them for now, but that will be changed shortly... Based on a patch from PaX Team <pageexec@freemail.hu> llvm-svn: 200613	2014-02-01 16:20:54 +00:00
Chandler Carruth	a93c365f31	[LPM] Apply a really big hammer to fix PR18688 by recursively reforming LCSSA when we promote to SSA registers inside of LICM. Currently, this is actually necessary. The promotion logic in LICM uses SSAUpdater which doesn't understand how to place LCSSA PHI nodes. Teaching it to do so would be a very significant undertaking. It may be worthwhile and I've left a FIXME about this in the code as well as starting a thread on llvmdev to try to figure out the right long-term solution. For now, the PR needs to be fixed. Short of using the promotion SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes, I don't see a cleaner or cheaper way of achieving this. Fortunately, LCSSA is relatively lazy and sparse -- it should only update instructions which need it. We can also skip the recursive variant when we don't promote to SSA values. llvm-svn: 200612	2014-02-01 13:35:14 +00:00
Eli Bendersky	62efb50a57	Remove some unused #includes llvm-svn: 200611	2014-02-01 13:12:54 +00:00
Benjamin Kramer	dd0d6831e5	Silence GCC warnings. llvm-svn: 200610	2014-02-01 11:26:18 +00:00
Chandler Carruth	e29471d99d	[inliner] Skip debug intrinsics even earlier in computing the inline cost so that they don't impact the vector bonus. Fundamentally, counting unsimplified instructions is just wrong; it will continue to introduce instability as things which do not generate code bizarrely impact inlining. For example, sufficiently nested inlined functions could turn off the vector bonus with lifetime markers just like the debug intrinsics do. =/ This is a short-term tactical fix. Long term, I think we need to remove the vector bonus entirely. That's a separate patch and discussion though. The patch to fix this provided by Dario Domizioli. I've added some comments about the planned direction and used a heavily pruned form of debug info intrinsics for the test case. While this debug info doesn't work or "do" anything useful, it lets us test all manner of interference easily, and I suspect this will not be the last time we want to craft a pattern where debug info interferes with the inliner in a problematic way. llvm-svn: 200609	2014-02-01 10:38:17 +00:00
Craig Topper	f97f309449	Simplify some x86 format classes and remove some ambiguities in their application. llvm-svn: 200608	2014-02-01 08:17:56 +00:00
David Majnemer	9b4ef71ec8	MC: Improve the .fill directive's compatibility with GAS Per the GAS documentation, .fill should permit pattern widths that aren't a power of two. While I was in the neighborhood, I added some sanity checking. This change was motivated by a use of this construct in the Linux Kernel. llvm-svn: 200606	2014-02-01 07:19:38 +00:00
Peter Collingbourne	0c61826b4b	Hopefully fix mingw32 bots. For some reason this symbolic constant isn't defined in some versions of mingw32. llvm-svn: 200605	2014-02-01 02:42:20 +00:00
Reid Kleckner	0421c6aef8	Revert "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..." This reverts commit r200576. It broke 32-bit self-host builds by vectorizing two calls to @llvm.bswap.i64, which we then fail to expand. llvm-svn: 200602	2014-02-01 01:37:30 +00:00
Josh Magee	d0e03ee88f	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 llvm-svn: 200601	2014-02-01 01:36:16 +00:00
Reid Kleckner	239e9806ff	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 llvm-svn: 200596	2014-01-31 23:50:57 +00:00
Peter Collingbourne	80068b8c2c	Introduce line editor library. This library will be used by clang-query. I can imagine LLDB becoming another client of this library, so I think LLVM is a sensible place for it to live. It wraps libedit, and adds tab completion support. The code is loosely based on the line editor bits in LLDB, with a few improvements: - Polymorphism for retrieving the list of tab completions, based on the concept pattern from the new pass manager. - Tab completion doesn't corrupt terminal output if the input covers multiple lines. Unfortunately this can only be done in a truly horrible way, as far as I can tell. But since the alternative is to implement our own line editor (which I don't think LLVM should be in the business of doing, at least for now) I think it may be acceptable. - Includes a fallback for the case where the user doesn't have libedit installed. Note that this uses C stdio, mainly because libedit also uses C stdio. Differential Revision: http://llvm-reviews.chandlerc.com/D2200 llvm-svn: 200595	2014-01-31 23:46:14 +00:00
Peter Collingbourne	6cd66bd2db	Introduce llvm::sys::path::home_directory. This will be used by the line editor library to derive a default path to the history file. Differential Revision: http://llvm-reviews.chandlerc.com/D2199 llvm-svn: 200594	2014-01-31 23:46:06 +00:00
Reid Kleckner	80a8045bb4	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. llvm-svn: 200593	2014-01-31 23:45:12 +00:00
Rafael Espindola	f34497adab	Remove a redundant call to hasRawTextSupport. The code path it was guarding was already using emitRawComment. llvm-svn: 200591	2014-01-31 23:14:01 +00:00
Rafael Espindola	7ed26bece7	Remove another hasRawTextSupport. To remove this one simply move the end of file logic from the asm printer to the target mc streamer. This removes the last call to hasRawTextSupport from lib/Target. llvm-svn: 200590	2014-01-31 23:10:26 +00:00
Chandler Carruth	8bdf469e88	[inliner] Print out extra stats about the cost, threshold, and vector bonus in the inline cost analysis. Split out of a patch by Dario Domizioli to commit separately. llvm-svn: 200586	2014-01-31 22:32:32 +00:00
Rafael Espindola	9e0d89fd92	Remove the last hasRawTextSupport call from R600. There is nothing wrong with printing the disassembly section when printing text. A hypothetical assembler would then produce a .o just like our direct object emission produces. llvm-svn: 200583	2014-01-31 22:14:06 +00:00
Rafael Espindola	181d98005b	Replace another use of hasRawTextSupport+EmitRawText with emitRawComment. llvm-svn: 200582	2014-01-31 22:08:19 +00:00
Rafael Espindola	7a4c0f827a	Use emitRawComment to avoid a call to hasRawTextSupport. llvm-svn: 200581	2014-01-31 21:54:49 +00:00
Lang Hames	884a7dc676	Replace X86 FMA intrinsic pseudo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577	2014-01-31 21:29:19 +00:00
Chandler Carruth	74c658030d	[SLPV] Recognize vectorizable intrinsics during SLP vectorization and transform accordingly. Based on similar code from Loop vectorization. Subsequent commits will include vectorization of function calls to vector intrinsics and form function calls to vector library calls. Patch by Raul Silvera! (Much delayed due to my not running dcommit) llvm-svn: 200576	2014-01-31 21:14:40 +00:00
Rafael Espindola	4007ec608c	Simplify getSymbolFlags. None of the object formats require extra parsing to compute these flags, so the method cannot fail. llvm-svn: 200574	2014-01-31 20:57:12 +00:00
Paul Robinson	7b5cad010e	If we're not producing DWARF accel tables, don't waste memory keeping track of those entries. llvm-svn: 200572	2014-01-31 20:39:19 +00:00
Eric Christopher	861178d373	Add support for DW_FORM_flag and DW_FORM_flag_present to the DIE hashing algorithm. Sink the 'A' + Attribute hash into each form so we don't have to check valid forms before deciding whether or not we're going to hash which will let the default be to return without doing anything. llvm-svn: 200571	2014-01-31 20:02:58 +00:00
David Blaikie	d3fdfda01f	DebugInfo: Flag type unit references as declarations This ensures DWARF consumers don't confuse these references for definitions. I'd argue it might be nice to improve debuggers so we don't need this, but it's just one field in an abbreviation anyway - so it doesn't seem worth the fight. llvm-svn: 200569	2014-01-31 19:52:26 +00:00
Reid Kleckner	edec4d571c	x86: Rename NumBytesForCalleeToPush to ...Pop for accuracy If we have a callee cleanup convention, the callee is going to pop the arguments off the stack, not push them on. llvm-svn: 200566	2014-01-31 19:07:18 +00:00
Reid Kleckner	8ff8b30e4d	[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret' MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 llvm-svn: 200561	2014-01-31 17:41:22 +00:00
Matheus Almeida	489791e923	[mips][msa] Add insert.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543	2014-01-31 13:31:20 +00:00
Chandler Carruth	fbc2b60e8a	[vectorizer] Tweak the way we do small loop runtime unrolling in the loop vectorizer to not do so when runtime pointer checks are needed and share code with the new (not yet enabled) load/store saturation runtime unrolling. Also ensure that we only consider the runtime checks when the loop hasn't already been vectorized. If it has, the runtime check cost has already been paid. I've fleshed out a test case to cover the scalar unrolling as well as the vector unrolling and comment clearly why we are or aren't following the pattern. llvm-svn: 200530	2014-01-31 10:51:08 +00:00
Craig Topper	e33ac72bdf	Separate x86 opcode maps and 0x66/0xf2/0xf3 prefixes from each other in the TSFlags. This greatly simplifies the switch statements in the disassembler tables and the code emitters. llvm-svn: 200522	2014-01-31 08:47:06 +00:00
Craig Topper	0754fb95c1	Move REP out of the Prefix field of the X86 format. Give it its own bit. It had special handling anyway and this enables a future patch. llvm-svn: 200520	2014-01-31 07:00:55 +00:00
Craig Topper	fbc60780e1	Move address override handling in X86CodeEmitter to a place where it works for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode. llvm-svn: 200517	2014-01-31 05:42:35 +00:00
Craig Topper	c56f5e167f	Move address override handling in X86MCCodeEmitter to a place where it works for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode. llvm-svn: 200516	2014-01-31 05:33:45 +00:00
Bob Wilson	1478ea0cc7	Fix a bug in gcov instrumentation introduced by r195513. <rdar://15930350> The entry block of a function starts with all the static allocas. The change in r195513 splits the block before those allocas, which has the effect of turning them into dynamic allocas. That breaks all sorts of things. Change to split after the initial allocas, and also add a comment explaining why the block is split. llvm-svn: 200515	2014-01-31 05:24:01 +00:00
Venkatraman Govindaraju	b0c5799fbd	[Sparc] Save and restore float registers that may be used for parameter passing. llvm-svn: 200509	2014-01-31 01:53:08 +00:00
Manman Ren	0552af6547	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506	2014-01-31 01:10:35 +00:00
Owen Anderson	2809d5d134	DAGCombine should not produce ISD::OR nodes after operation legalization if they're not legal. llvm-svn: 200503	2014-01-31 00:51:43 +00:00
Manman Ren	7760d41e27	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two test case failures. I modified my patch a little, but forgot to re-run "make check-all". Test case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502	2014-01-31 00:42:44 +00:00
Matt Arsenault	5055466f83	Allow speculating llvm.sqrt, fma and fmuladd This doesn't set errno, so this should be OK. Also update the documentation to explicitly state that errno is not set. llvm-svn: 200501	2014-01-31 00:09:00 +00:00
David Woodhouse	10eb2a8985	[x86] Fix signed relocations for i64i32imm operands These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32. Kill the horrid and incomplete special case and FIXME in EncodeInstruction() and set things up so it can infer the signedness from the ImmType just like it can the size and whether it's PC-relative. llvm-svn: 200495	2014-01-30 22:20:41 +00:00
Chad Rosier	156f3a2a96	[AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, v8i16, v16i8 types. llvm-svn: 200491	2014-01-30 21:46:54 +00:00
Timur Iskhodzhanov	04f94cf108	Fix PR18381 - print a minimal diagnostic rather than assert on unresolved .secidx target llvm-svn: 200490	2014-01-30 21:13:05 +00:00
Rafael Espindola	fae4ff3453	Only ELF has a dynamic symbol table. Remove it from ObjectFile. COFF has only one symbol table. MachO has a LC_DYSYMTAB, but that is not a symbol table, just extra info about the one symbol table (LC_SYMTAB). IR (coming soon) also has only one table. llvm-svn: 200488	2014-01-30 20:45:33 +00:00

66958 Commits