llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

Author	SHA1	Message	Date
Hal Finkel	457fbe481c	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Chandler Carruth	fc5ab5d388	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to reach its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished. No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exiting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. llvm-svn: 154783	2012-04-16 01:12:56 +00:00
Craig Topper	1b15347812	Merge vpermps/vpermd and vpermpd/vpermq SD nodes. llvm-svn: 154782	2012-04-16 00:41:45 +00:00
Craig Topper	c217784dc3	Fix SDTypeProfile for vpermps. The mask operand should be v8i32. llvm-svn: 154781	2012-04-16 00:12:20 +00:00
Craig Topper	e274a2cc61	Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3-bits. llvm-svn: 154780	2012-04-15 23:48:57 +00:00
Craig Topper	788250eec1	Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. llvm-svn: 154778	2012-04-15 22:43:31 +00:00
Craig Topper	a6f7e1a202	Make member variables of AsmToken private. Remove unnecessary forward declarations. Remove an unnecessary include. llvm-svn: 154775	2012-04-15 22:00:22 +00:00
Jakub Staszak	64c3ee0cea	Fix class name. llvm-svn: 154773	2012-04-15 20:22:36 +00:00
Nadav Rotem	7a2a2ae678	Do not convert between fp128 <-> ppc_fp128 since there is no legal cast conversion between the two. Patch by nobled <nobled@dreamwidth.org> llvm-svn: 154772	2012-04-15 20:17:14 +00:00
Jakub Staszak	bde8ec16d4	Fix filename and register numbers. llvm-svn: 154771	2012-04-15 20:13:47 +00:00
Nadav Rotem	2a4e2ef10c	Fix PR12529. The Vxx family of instructions are only supported by AVX. Use non-vex instructions for SSE4. llvm-svn: 154770	2012-04-15 19:36:44 +00:00
Duncan Sands	f6cbb0b2cb	Add the MDBuilder helper class for conveniently creating metadata. llvm-svn: 154766	2012-04-15 18:03:49 +00:00
Benjamin Kramer	d4a8bf07d5	Wire up support for diagnostic ranges in the ARMAsmParser. As an example, attach range info to the "invalid instruction" message: $ clang -arch arm -c asm.c asm.c:2:11: error: invalid instruction __asm__("foo r0"); ^ <inline asm>:1:2: note: instantiated into assembly here foo r0 ^~~ llvm-svn: 154765	2012-04-15 17:04:27 +00:00
Nadav Rotem	b8710ee43f	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. llvm-svn: 154764	2012-04-15 15:08:09 +00:00
Elena Demikhovsky	92fb3e613e	Added VPERM optimization for AVX2 shuffles llvm-svn: 154761	2012-04-15 11:18:59 +00:00
NAKAMURA Takumi	0133680b3d	HexagonCopyToCombine.cpp: Silence two warnings, -Wunused-variable, with -Asserts. llvm-svn: 154759	2012-04-15 05:33:43 +00:00
NAKAMURA Takumi	ddf2dc407e	Target/Hexagon: Tweak to fix msvc build. llvm-svn: 154758	2012-04-15 05:09:09 +00:00
Anshuman Dasgupta	68108fede2	Remove trailing whitespace. llvm-svn: 154755	2012-04-14 20:59:13 +00:00
Anshuman Dasgupta	9c02269a1c	Add VLIW packetizer to ReleaseNotes.html and CREDITS.TXT. Committing patch by Sundeep Kushwaha. llvm-svn: 154754	2012-04-14 20:57:13 +00:00
Brendon Cahoon	2284d72f57	Add the loop unrolling info to ReleaseNotes.html and CREDITS.TXT. llvm-svn: 154752	2012-04-14 16:54:12 +00:00
Duncan Sands	7e4fa0a115	There is no need for setIsExact to be public. Make it private. llvm-svn: 154750	2012-04-14 15:43:22 +00:00
Duncan Sands	40d080e3b7	Rename "fpaccuracy" metadata to the more generic "fpmath". That's because I'm thinking of generalizing it to be able to specify other freedoms beyond accuracy (such as that NaN's don't have to be respected). I'd like the 3.1 release (the first one with this metadata) to have the more generic name already rather than having to auto-upgrade it in 3.2. llvm-svn: 154744	2012-04-14 12:36:06 +00:00
Benjamin Kramer	b9eb9d651b	Make StringMap's copy ctor non-explicit. Without this gcc doesn't allow us to put a StringMap into a std::map. Works with clang though. llvm-svn: 154737	2012-04-14 09:04:57 +00:00
Hal Finkel	028d6e153e	Fix an error in BBVectorize important for vectorizing pointer types. When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735	2012-04-14 07:32:50 +00:00
Hal Finkel	c55edb7b35	Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs. llvm-svn: 154734	2012-04-14 07:32:43 +00:00
Andrew Trick	550cf63beb	misched: Added CanHandleTerminators. This is a special flag for targets that really want their block terminators in the DAG. The default scheduler cannot handle this correctly, so it becomes the specialized scheduler's responsibility to schedule terminators. llvm-svn: 154712	2012-04-13 23:29:54 +00:00
Bob Wilson	da87e36649	Remove old code to strip out unwanted PPC slices for Apple llvmCore. llvm-svn: 154706	2012-04-13 22:58:53 +00:00
Richard Smith	d5004a79d9	Fix X86 codegen for 'atomicrmw nand' to generate x = ~(x & y), not x = ~x & y. llvm-svn: 154705	2012-04-13 22:47:00 +00:00
Sirish Pande	fc7e619733	Remove iostream from New Value Jump. llvm-svn: 154703	2012-04-13 21:01:35 +00:00
Hal Finkel	12b4c41203	Add support to BBVectorize for vectorizing selects. llvm-svn: 154700	2012-04-13 20:45:45 +00:00
Sirish Pande	6c3fc0ca53	Add support for Hexagon Architectural feature, New Value Jump. llvm-svn: 154696	2012-04-13 20:22:31 +00:00
Sirish Pande	01b53a9593	Pass to replace transfer/copy instructions with combine instructions where possible. llvm-svn: 154695	2012-04-13 20:22:19 +00:00
Benjamin Kramer	191fe619aa	Reduce malloc traffic in DwarfAccelTable - Don't copy offsets into HashData, the underlying vector won't change once the table is finalized. - Allocate HashData and HashDataContents in a BumpPtrAllocator. - Allocate string map entries in the same allocator. - Random cleanups. llvm-svn: 154694	2012-04-13 20:06:17 +00:00
Tony Linthicum	053413c3e8	Support for Hexagon backend. llvm-svn: 154692	2012-04-13 19:09:44 +00:00
Tony Linthicum	36d03a30fa	Support for Hexagon backend. llvm-svn: 154691	2012-04-13 19:09:18 +00:00
Evan Cheng	3499593c7e	On Darwin targets, only use vfma etc. if the source uses the fma() intrinsic explicitly. llvm-svn: 154689	2012-04-13 18:59:28 +00:00
Dan Gohman	0387e6b701	Add some comments, and fix a few places that missed setting Changed. llvm-svn: 154687	2012-04-13 18:57:48 +00:00
Kevin Enderby	84e97c7df2	For ARM disassembly only print 32 unsigned bits for the address of branch targets so if the branch target has the high bit set it does not get printed as: beq 0xffffffff8008c404 llvm-svn: 154685	2012-04-13 18:46:37 +00:00
Dan Gohman	d5743c7fd0	Consider ObjC runtime calls objc_storeWeak and others which make a copy of their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682	2012-04-13 18:28:58 +00:00
Hal Finkel	f8611de2a6	By default, use Early-CSE instead of GVN for vectorization cleanup. As has been suggested by Duncan and others, Early-CSE and GVN should do similar redundancy elimination, but Early-CSE is much less expensive. Most of my autovectorization benchmarks show a performance regression, but all of these are < 0.1%, and so I think that it is still worth using the less expensive pass. llvm-svn: 154673	2012-04-13 17:15:33 +00:00
Sylvestre Ledru	53db93eead	Catch the Python exception when subprocess.Popen fails. For example, if llc cannot be found, the full Python stack trace is displayed and no useful information is provided. + fail the process when an exception occurs llvm-svn: 154665	2012-04-13 11:22:18 +00:00
Benjamin Kramer	9087b1f54a	Remove unused variable. llvm-svn: 154661	2012-04-13 08:09:12 +00:00
Craig Topper	7c0af9b204	Silence various build warnings from Hexagon backend that show up in release builds. Mostly converting 'assert(0)' to 'llvm_unreachable' to silence warnings about missing returns. Also fold some variable declarations into asserts to prevent the variables from being unused in release builds. llvm-svn: 154660	2012-04-13 06:38:11 +00:00
Craig Topper	da52eeedcb	Fix target specific intrinsic handling to adjust intrinsic number before doing attribute table lookup. Also fix attribute table lookup to handle 'invalid' intrinsic correctly. Fixes PR12542 llvm-svn: 154658	2012-04-13 06:14:57 +00:00
Craig Topper	79c030996a	Remove getElfArchType from ELF.h. It's only used in ELFObjectFile.cpp and there's already a copy there. ELF.h was hiding the one there and causing an unused function warning. llvm-svn: 154657	2012-04-13 05:58:19 +00:00
Dan Gohman	81ac0c921f	Use the new Use-aware dominates method to apply the objc runtime library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647	2012-04-13 01:08:28 +00:00
Bill Wendling	8659a23f4a	Code-gen may inject code into the IR before it emits the ASM. The linker obviously cannot know that this code is present, let alone used. So prevent the internalize pass from internalizing those global values which code-gen may insert. llvm-svn: 154645	2012-04-13 01:06:27 +00:00
Dan Gohman	6a5b02f8ee	Don't move objc_autorelease calls past autorelease pool boundaries when optimizing autorelease calls on phi nodes with null operands. This fixes rdar://11207070. llvm-svn: 154642	2012-04-13 00:59:57 +00:00
Dan Gohman	cde3a46455	Def here is an Instruction, so !isa<Instruction>(Def) is always false, as Eli noticed. llvm-svn: 154641	2012-04-13 00:50:57 +00:00
Dan Gohman	c0a906405e	Add forms of dominates and isReachableFromEntry that accept a Use directly instead of a user Instruction. This allows them to test whether a def dominates a particular operand if the user instruction is a PHI. llvm-svn: 154631	2012-04-12 23:31:46 +00:00
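
The `atomicrmw nand` fix listed above (d5004a79d9) turns on the difference between `~(x & y)` and `~x & y`. The following is a minimal sketch of that difference only, not the X86 codegen itself; the function names and the 32-bit mask are assumptions made for illustration.

```python
# Illustration only: models the two expressions named in the commit message
# on 32-bit unsigned values. Function names are hypothetical.
MASK32 = 0xFFFFFFFF


def nand_correct(x: int, y: int) -> int:
    """NAND as described by the fix: x = ~(x & y)."""
    return ~(x & y) & MASK32


def nand_buggy(x: int, y: int) -> int:
    """What the commit says was generated before the fix: x = ~x & y."""
    return (~x & y) & MASK32


if __name__ == "__main__":
    x, y = 0b1100, 0b1010
    print(hex(nand_correct(x, y)))
    print(hex(nand_buggy(x, y)))
```

The two expressions agree only for particular inputs, which is why the earlier form could go unnoticed in simple tests.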


81670 Commits