llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-31 07:52:55 +01:00

Author	SHA1	Message	Date
Chandler Carruth	33b200ad13	Tweak the loop rotation logic to check whether the loop is naturally laid out in a form with a fallthrough into the header and a fallthrough out of the bottom. In that case, leave the loop alone because any rotation will introduce unnecessary branches. If either side looks like it will require an explicit branch, then the rotation won't add any, so do it to ensure the branch occurs outside of the loop (if possible) and maximize the benefit of the fallthrough in the bottom. llvm-svn: 154806	2012-04-16 09:31:23 +00:00
Benjamin Kramer	a72a6005f8	Reapply 'Add reverseColor to raw_ostream'. To be used in printing unprintable source in clang diagnostics. Patch by Seth Cantrell, with a minor fix for mingw by me. llvm-svn: 154805	2012-04-16 08:56:50 +00:00
Eli Bendersky	b9e9796e6f	Documentation fixes to LLVMBuild.html [PR 11563] llvm-svn: 154804	2012-04-16 08:42:55 +00:00
Argyrios Kyrtzidis	4950ffbb4f	Revert r154800 which breaks windows builders. llvm-svn: 154802	2012-04-16 07:59:39 +00:00
Craig Topper	db4fcf7088	Replace vpermd/vpermps intrinsic patterns with custom lowering to target specific nodes. llvm-svn: 154801	2012-04-16 07:13:00 +00:00
Argyrios Kyrtzidis	3d576f296a	Add reverseColor to raw_ostream. To be used in printing unprintable source in clang diagnostics. Patch by Seth Cantrell! llvm-svn: 154800	2012-04-16 07:07:38 +00:00
Craig Topper	a986fc78e2	Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps. llvm-svn: 154798	2012-04-16 06:43:40 +00:00
Craig Topper	129dccdc84	Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second. llvm-svn: 154797	2012-04-16 06:26:15 +00:00
Bill Wendling	3514c015bd	Add credit and release notes for r150307. By Kai Nacke. llvm-svn: 154796	2012-04-16 05:24:52 +00:00
Bill Wendling	66282c6d7f	Add a Fixme. llvm-svn: 154793	2012-04-16 04:23:52 +00:00
Sebastian Pop	d60bf3baf0	add configure flag --with-default-sysroot llvm-svn: 154791	2012-04-16 04:11:45 +00:00
Hal Finkel	4ac041693e	Say something about -vectorize in the release notes. llvm-svn: 154788	2012-04-16 03:49:43 +00:00
Hal Finkel	4f7adc1f50	Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan). llvm-svn: 154787	2012-04-16 03:49:42 +00:00
Hal Finkel	457fbe481c	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Chandler Carruth	fc5ab5d388	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to reach its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished. No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exiting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. llvm-svn: 154783	2012-04-16 01:12:56 +00:00
Craig Topper	1b15347812	Merge vpermps/vpermd and vpermpd/vpermq SD nodes. llvm-svn: 154782	2012-04-16 00:41:45 +00:00
Craig Topper	c217784dc3	Fix SDTypeProfile for vpermps. The mask operand should be v8i32. llvm-svn: 154781	2012-04-16 00:12:20 +00:00
Craig Topper	e274a2cc61	Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3 bits. llvm-svn: 154780	2012-04-15 23:48:57 +00:00
Craig Topper	788250eec1	Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. llvm-svn: 154778	2012-04-15 22:43:31 +00:00
Craig Topper	a6f7e1a202	Make member variables of AsmToken private. Remove unnecessary forward declarations. Remove an unnecessary include. llvm-svn: 154775	2012-04-15 22:00:22 +00:00
Jakub Staszak	64c3ee0cea	Fix class name. llvm-svn: 154773	2012-04-15 20:22:36 +00:00
Nadav Rotem	7a2a2ae678	Do not convert between fp128 <-> ppc_fp128 since there is no legal cast conversion between the two. Patch by nobled <nobled@dreamwidth.org> llvm-svn: 154772	2012-04-15 20:17:14 +00:00
Jakub Staszak	bde8ec16d4	Fix filename and register numbers. llvm-svn: 154771	2012-04-15 20:13:47 +00:00
Nadav Rotem	2a4e2ef10c	Fix PR12529. The Vxx family of instructions are only supported by AVX. Use non-vex instructions for SSE4. llvm-svn: 154770	2012-04-15 19:36:44 +00:00
Duncan Sands	f6cbb0b2cb	Add the MDBuilder helper class for conveniently creating metadata. llvm-svn: 154766	2012-04-15 18:03:49 +00:00
Benjamin Kramer	d4a8bf07d5	Wire up support for diagnostic ranges in the ARMAsmParser. As an example, attach range info to the "invalid instruction" message: $ clang -arch arm -c asm.c asm.c:2:11: error: invalid instruction __asm__("foo r0"); ^ <inline asm>:1:2: note: instantiated into assembly here foo r0 ^~~ llvm-svn: 154765	2012-04-15 17:04:27 +00:00
Nadav Rotem	b8710ee43f	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. llvm-svn: 154764	2012-04-15 15:08:09 +00:00
Elena Demikhovsky	92fb3e613e	Added VPERM optimization for AVX2 shuffles llvm-svn: 154761	2012-04-15 11:18:59 +00:00
NAKAMURA Takumi	0133680b3d	HexagonCopyToCombine.cpp: Silence two warnings, -Wunused-variable, with -Asserts. llvm-svn: 154759	2012-04-15 05:33:43 +00:00
NAKAMURA Takumi	ddf2dc407e	Target/Hexagon: Tweak to fix msvc build. llvm-svn: 154758	2012-04-15 05:09:09 +00:00
Anshuman Dasgupta	68108fede2	Remove trailing whitespace. llvm-svn: 154755	2012-04-14 20:59:13 +00:00
Anshuman Dasgupta	9c02269a1c	Add VLIW packetizer to ReleaseNotes.html and CREDITS.TXT. Committing patch by Sundeep Kushwaha. llvm-svn: 154754	2012-04-14 20:57:13 +00:00
Brendon Cahoon	2284d72f57	Add the loop unrolling info to ReleaseNotes.html and CREDITS.TXT. llvm-svn: 154752	2012-04-14 16:54:12 +00:00
Duncan Sands	7e4fa0a115	There is no need for setIsExact to be public. Make it private. llvm-svn: 154750	2012-04-14 15:43:22 +00:00
Duncan Sands	40d080e3b7	Rename "fpaccuracy" metadata to the more generic "fpmath". That's because I'm thinking of generalizing it to be able to specify other freedoms beyond accuracy (such as that NaN's don't have to be respected). I'd like the 3.1 release (the first one with this metadata) to have the more generic name already rather than having to auto-upgrade it in 3.2. llvm-svn: 154744	2012-04-14 12:36:06 +00:00
Benjamin Kramer	b9eb9d651b	Make StringMap's copy ctor non-explicit. Without this gcc doesn't allow us to put a StringMap into a std::map. Works with clang though. llvm-svn: 154737	2012-04-14 09:04:57 +00:00
Hal Finkel	028d6e153e	Fix an error in BBVectorize important for vectorizing pointer types. When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735	2012-04-14 07:32:50 +00:00
Hal Finkel	c55edb7b35	Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs. llvm-svn: 154734	2012-04-14 07:32:43 +00:00
Andrew Trick	550cf63beb	misched: Added CanHandleTerminators. This is a special flag for targets that really want their block terminators in the DAG. The default scheduler cannot handle this correctly, so it becomes the specialized scheduler's responsibility to schedule terminators. llvm-svn: 154712	2012-04-13 23:29:54 +00:00
Bob Wilson	da87e36649	Remove old code to strip out unwanted PPC slices for Apple llvmCore. llvm-svn: 154706	2012-04-13 22:58:53 +00:00
Richard Smith	d5004a79d9	Fix X86 codegen for 'atomicrmw nand' to generate x = ~(x & y), not x = ~x & y. llvm-svn: 154705	2012-04-13 22:47:00 +00:00
Sirish Pande	fc7e619733	Remove iostream from New Value Jump. llvm-svn: 154703	2012-04-13 21:01:35 +00:00
Hal Finkel	12b4c41203	Add support to BBVectorize for vectorizing selects. llvm-svn: 154700	2012-04-13 20:45:45 +00:00
Sirish Pande	6c3fc0ca53	Add support for Hexagon Architectural feature, New Value Jump. llvm-svn: 154696	2012-04-13 20:22:31 +00:00
Sirish Pande	01b53a9593	Add a pass that replaces transfer/copy instructions with combine instructions where possible. llvm-svn: 154695	2012-04-13 20:22:19 +00:00
Benjamin Kramer	191fe619aa	Reduce malloc traffic in DwarfAccelTable - Don't copy offsets into HashData, the underlying vector won't change once the table is finalized. - Allocate HashData and HashDataContents in a BumpPtrAllocator. - Allocate string map entries in the same allocator. - Random cleanups. llvm-svn: 154694	2012-04-13 20:06:17 +00:00
Tony Linthicum	053413c3e8	Support for Hexagon backend. llvm-svn: 154692	2012-04-13 19:09:44 +00:00
Tony Linthicum	36d03a30fa	Support for Hexagon backend. llvm-svn: 154691	2012-04-13 19:09:18 +00:00
Evan Cheng	3499593c7e	On Darwin targets, only use vfma etc. if the source uses the fma() intrinsic explicitly. llvm-svn: 154689	2012-04-13 18:59:28 +00:00
Dan Gohman	0387e6b701	Add some comments, and fix a few places that missed setting Changed. llvm-svn: 154687	2012-04-13 18:57:48 +00:00
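
The 'atomicrmw nand' fix listed above (d5004a79d9) turns on the difference between two bit expressions. The sketch below models that difference with plain, non-atomic integer functions. The function names are hypothetical and are not part of LLVM; the point is only to show that the two forms disagree.

```cpp
#include <cassert>
#include <cstdint>

// Intended result per the commit message: x = ~(x & y).
// Hypothetical helper name, for illustration only.
constexpr std::uint32_t nandIntended(std::uint32_t x, std::uint32_t y) {
  return ~(x & y);
}

// Result the commit message says was generated before the fix: x = ~x & y.
// Hypothetical helper name, for illustration only.
constexpr std::uint32_t nandBeforeFix(std::uint32_t x, std::uint32_t y) {
  return ~x & y;
}

// The two forms differ whenever some bit is clear in y,
// because ~(x & y) sets that bit while ~x & y clears it.
static_assert(nandIntended(0u, 0u) != nandBeforeFix(0u, 0u),
              "the two expressions are not equivalent");
```

For x = 0xF0F0F0F0 and y = 0xFF00FF00, nandIntended yields 0x0FFF0FFF while nandBeforeFix yields 0x0F000F00.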


81683 Commits
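
The loop rotation commits in the log above (33b200ad13, fc5ab5d388) describe laying a loop out from its header and then rotating it so that the chosen exiting block ends up at the bottom, where the exit can be a fallthrough. A toy sketch of that rotation step follows, assuming blocks are represented as strings in a vector. This is not LLVM's MachineBlockPlacement code, and rotateExitToBottom is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: rotate a loop's block layout so that `exiting`
// becomes the last block. The cyclic order of the blocks is preserved.
std::vector<std::string> rotateExitToBottom(std::vector<std::string> layout,
                                            const std::string &exiting) {
  auto it = std::find(layout.begin(), layout.end(), exiting);
  if (it == layout.end())
    return layout; // Block not in this loop: leave the layout alone.
  // The block after `exiting` becomes the new top of the layout, which
  // places `exiting` at the bottom. If `exiting` is already last, this
  // is a no-op.
  std::rotate(layout.begin(), std::next(it), layout.end());
  return layout;
}
```

With the layout header, body, exiting, latch and "exiting" as the chosen block, the result is latch, header, body, exiting. The real pass also has to weigh which exit to prefer and whether rotation would split an inner loop, as the commit message explains; none of that is modeled here.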