mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
Commit Graph

4647 Commits

Author SHA1 Message Date
Matt Arsenault
6df23adcfc Fix another constant folding address space place I missed.
This fixes an assertion failure with a different sized address space.

llvm-svn: 194014
2013-11-04 20:46:52 +00:00
Hal Finkel
38113823b9 Consider (x == -1) unlikely in BranchProbabilityInfo
This adds another heuristic to BPI, similar to the existing heuristic that
considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by
Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and
equality comparisons with -1 are also unlikely to succeed. Local
experimentation supports this hypothesis: This yields a 1-2% speedup in the
test-suite sqlite benchmark on the PPC A2 core, with no significant
regressions.
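
For illustration, this is the kind of code the new heuristic targets (a
hypothetical example, not from the commit):

  // find() stands in for any lookup that returns -1 on failure.
  int find(const int *a, int n, int key);

  int sum_found(const int *a, int n, const int *keys, int nkeys) {
    int sum = 0;
    for (int i = 0; i < nkeys; ++i) {
      int idx = find(a, n, keys[i]);
      if (idx == -1)  // BPI now predicts this comparison is unlikely
        continue;     // to succeed, keeping the hot path straight-line
      sum += a[idx];
    }
    return sum;
  }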

llvm-svn: 193855
2013-11-01 10:58:22 +00:00
Rafael Espindola
afc61d382c Merge CallGraph and BasicCallGraph.
llvm-svn: 193734
2013-10-31 03:03:55 +00:00
Benjamin Kramer
5938ec4cbe SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive.
We can't do this in the general case, as saying that a GEP has no unsigned
wrap isn't valid for negative indices.
  %gep = getelementptr inbounds i32* %p, i64 -1

But an inbounds GEP cannot run past the end of the address space. So we check for
the very common case of a positive index and make GEPs derived from that NUW.
Together with Andy's recent non-unit stride work this lets us analyze loops
like

  void foo3(int *a, int *b) {
    for (; a < b; a++) {}
  }

PR12375, PR12376.

Differential Revision: http://llvm-reviews.chandlerc.com/D2033

llvm-svn: 193514
2013-10-28 07:30:06 +00:00
Shuxin Yang
eb29e658a0 Revert r193251: Use address-taken to disambiguate global variable and indirect memops.
llvm-svn: 193489
2013-10-27 03:08:44 +00:00
Wan Xiaofei
f3100f24fa Quick look-up for block in loop.
This patch implements a quick look-up for blocks in a loop by maintaining a hash set of blocks.
It improves the efficiency of loop analysis considerably; the biggest improvement is 5-6% (458.sjeng).
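
A minimal sketch of the technique in C++ (illustrative, not the actual
LoopBase code): keep a hash set alongside the ordered block vector so that
membership queries are O(1) instead of a linear scan.

  #include <unordered_set>
  #include <vector>

  struct BasicBlock;  // opaque stand-in for llvm::BasicBlock

  class LoopBlocks {
    std::vector<BasicBlock *> Blocks;           // ordered, as before
    std::unordered_set<BasicBlock *> BlockSet;  // new: fast membership

  public:
    void addBlock(BasicBlock *BB) {
      Blocks.push_back(BB);
      BlockSet.insert(BB);
    }
    bool contains(BasicBlock *BB) const {
      return BlockSet.count(BB) != 0;  // was: a linear walk over Blocks
    }
  };
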
Below are the compilation times for our benchmarks with llc before and after the patch.

Benchmark       llc - trunk             llc - patched
401.bzip2       0.339081    100.00%     0.329657    102.86%
403.gcc        19.853966    100.00%    19.605466    101.27%
429.mcf         0.049823    100.00%     0.048451    102.83%
433.milc        0.514898    100.00%     0.510217    100.92%
444.namd        1.109328    100.00%     1.103481    100.53%
445.gobmk       4.988028    100.00%     4.929114    101.20%
456.hmmer       0.843871    100.00%     0.825865    102.18%
458.sjeng       0.754238    100.00%     0.714095    105.62%
464.h264ref     2.9668      100.00%     2.90612     102.09%
471.omnetpp     4.556533    100.00%     4.511886    100.99%
bitmnp01        0.038168    100.00%     0.0357      106.91%
idctrn01        0.037745    100.00%     0.037332    101.11%
libquake2       3.78689     100.00%     3.76209     100.66%
libquake_       2.251525    100.00%     2.234104    100.78%
linpack         0.033159    100.00%     0.032788    101.13%
matrix01        0.045319    100.00%     0.043497    104.19%
nbench          0.333161    100.00%     0.329799    101.02%
tblook01        0.017863    100.00%     0.017666    101.12%
ttsprk01        0.054337    100.00%     0.053057    102.41%

Reviewer	: Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
Approver	: Andrew Trick <atrick@apple.com>
Test		: Pass make check-all & llvm test-suite

llvm-svn: 193460
2013-10-26 03:08:02 +00:00
Andrew Trick
741b7a0cfc Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)

When SCEV expands a recurrence outside of a loop it attempts to scale
by the stride of the recurrence. Chained recurrences don't work that
way. We could compute binomial coefficients, but would have to
guarantee that the chained AddRecs are in a perfectly reduced form.
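
For illustration, s below is a chained (quadratic) recurrence - {0,+,0,+,1}
in SCEV notation - whose exit value cannot be obtained by simply scaling the
stride (a hypothetical example, not the PR17459 test case):

  // After k iterations, s == k*(k-1)/2: the stride of s grows on every
  // iteration, so "stride * trip count" is the wrong expansion.
  int sum_prefix(int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
      s += i;
    return s;
  }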

llvm-svn: 193438
2013-10-25 21:35:56 +00:00
Andrew Trick
16b7e5b553 Fix LSR: don't normalize quadratic recurrences.
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)

ScalarEvolutionNormalization was attempting to normalize by adding and
subtracting strides. Chained recurrences don't work that way.

llvm-svn: 193437
2013-10-25 21:35:52 +00:00
Rafael Espindola
cd9791ce72 Call destroy from ~BasicCallGraph.
This fixes a memory leak found by valgrind.

Calling it from the base class destructor would not destroy the BasicCallGraph
bits.

FIXME: BasicCallGraph is the only thing that inherits from CallGraph. Can
we merge the two?
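
A minimal illustration of the underlying C++ pitfall (not the actual LLVM
code): virtual dispatch is turned off during base-class destruction, so a
destroy() call from ~CallGraph never reaches the derived override.

  #include <cstdio>

  struct CallGraph {
    virtual ~CallGraph() { destroy(); }  // resolves to CallGraph::destroy;
                                         // the derived part is already gone
    virtual void destroy() { std::puts("base cleanup"); }
  };

  struct BasicCallGraph : CallGraph {
    ~BasicCallGraph() { destroy(); }     // the fix: call from the derived
                                         // destructor, where the override
                                         // still runs
    void destroy() override { std::puts("derived cleanup"); }
  };

  int main() { BasicCallGraph CG; }      // prints "derived cleanup", then
                                         // "base cleanup"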

llvm-svn: 193412
2013-10-25 15:01:34 +00:00
Nuno Lopes
09c3fc8dac fix PR17635: false positive with packed structures
LLVM optimizers may widen accesses to packed structures so that they overflow
the structure itself, but they remain in bounds up to the alignment of the
object.
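
A hypothetical example of the pattern (not the PR17635 test case):

  struct __attribute__((packed)) S {
    char a, b, c;  // sizeof(struct S) == 3
  };

  // Globals are often given a larger alignment than the type requires.
  struct S g __attribute__((aligned(4)));

  char sum(void) {
    // An optimizer may widen these three 1-byte loads into a single
    // 4-byte load of g. That reads past sizeof(struct S) but stays
    // within g's 4-byte alignment, so it should not be reported.
    return g.a + g.b + g.c;
  }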

llvm-svn: 193317
2013-10-24 09:17:24 +00:00
Shuxin Yang
45a453cafe Use address-taken to disambiguate global variable and indirect memops.
Major steps include:
 1) Introduce a not-addr-taken bit-field in GlobalVariable.
 2) GlobalOpt sets "not-address-taken" if it proves a global variable
    doesn't have its address taken.
 3) AA uses this info for disambiguation (see the illustration below).
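
A hypothetical illustration of the disambiguation this enables:

  int g;  // suppose GlobalOpt proves the address of g is never taken

  int f(int *p) {
    g = 1;
    *p = 2;    // p cannot point to g, since g's address never escapes,
    return g;  // so this load can be folded to 1
  }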

llvm-svn: 193251
2013-10-23 17:28:19 +00:00
Andrew Trick
a454a863ad Clarify SCEV comments.
We handle for(i=n; i>0; i -= s) by canonicalizing within SCEV to for(i=-n; i<0; i += s).

llvm-svn: 193147
2013-10-22 05:09:40 +00:00
Manman Ren
87bfd7a670 TBAA: fix PR17620.
We can have a struct type with a single field where the field does not start
at offset 0. In that case, we should correctly update the offset.

llvm-svn: 193137
2013-10-22 01:40:25 +00:00
Matt Arsenault
fcca6dd732 Use more type helper functions
llvm-svn: 193109
2013-10-21 19:43:56 +00:00
Matt Arsenault
3089ca20bd Fix creating bitcasts between address spaces in SCEV.
The previous test wasn't actually exercising this,
since it was missing the datalayout piece that changes
the size of the second address space.

llvm-svn: 193102
2013-10-21 18:41:10 +00:00
Matt Arsenault
5b1a8d26fc Remove unused SCEV functions
llvm-svn: 193097
2013-10-21 18:08:09 +00:00
Andrew Trick
027f71d443 SCEV should use NSW to get trip count for positive nonunit stride loops.
SCEV currently fails to compute loop counts for nonunit stride
loops. This comes up frequently. It prevents loop optimization and
forces vectorization to insert extra loop checks.

For example:
void foo(int n, int *x) {
 for (int i = 0; i < n; i += 3) {
   x[i] = i;
   x[i+1] = i+1;
   x[i+2] = i+2;
 }
}

We need to properly handle the case in which limit > INT_MAX-stride. In
the above case: n > INT_MAX-3. In this case the loop counter will step
beyond the limit and overflow at the same time. However, knowing that
signed integer overflow is undefined, we can assume the loop test
behavior is arbitrary after overflow. This obeys both C undefined
behavior rules, and the more strict LLVM poison value rules.

I'm finally fixing this in response to Hal Finkel's persistence.
The most probable reason that we never optimized this before is that
we were being careful to handle the case where the developer expected a
side-effect free infinite loop relying on overflow:

for (int i = 0; i < n; i += s) {
  ++j;
}
return j;

If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
expect an infinite loop. However there are plenty of ways to achieve
this effect without relying on undefined behavior of signed overflow.

llvm-svn: 193015
2013-10-18 23:43:53 +00:00
Craig Topper
037594e792 Remove x86_sse42_crc32_64_8 intrinsic.

It has no functional difference from x86_sse42_crc32_32_8 and was not mapped
to a clang builtin. I'm not even sure why this form of the instruction is
called out explicitly in the docs. Also add AutoUpgrade support to convert it
into the other intrinsic with appropriate trunc and zext.
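
A sketch of the upgrade rule as a hypothetical C++ helper using the SSE4.2
intrinsics: the 64-bit-accumulator byte form is equivalent to truncating the
accumulator to 32 bits, applying the 32-bit form, and zero-extending the
result.

  #include <nmmintrin.h>  // SSE4.2 intrinsics
  #include <stdint.h>

  uint64_t crc32_64_8(uint64_t crc, uint8_t v) {
    // trunc to 32 bits, crc32 on the byte, zext back to 64 bits
    return (uint64_t)_mm_crc32_u8((uint32_t)crc, v);
  }
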
llvm-svn: 192672
2013-10-15 05:20:47 +00:00
Matt Arsenault
c43ade894d Rename DataLayout variables TD -> DL
llvm-svn: 191927
2013-10-03 19:50:01 +00:00
Benjamin Kramer
969358c375 CaptureTracking: Plug a loophole in the "too many uses" heuristic.
The heuristic was added to avoid spending too much compile time. A specially
crafted test case (PR17461, PR16474) with many uses of a select or bitcast
instruction can still trigger the slow case. Add a check for that case.

This only affects compile time; I don't have a good way to test it.

llvm-svn: 191896
2013-10-03 13:24:02 +00:00
Chandler Carruth
ee12d58370 Remove the very substantial, largely unmaintained legacy PGO
infrastructure.

This was essentially work toward PGO based on a design that had several
flaws, partially dating from a time when LLVM had a different
architecture, and with an effort to modernize it abandoned without being
completed. Since then, it has bitrotted for several years further. The
result is nearly unusable, and isn't helping any of the modern PGO
efforts. Instead, it is getting in the way, adding confusion about PGO
in LLVM and distracting everyone with maintenance on essentially dead
code. Removing it paves the way for modern efforts around PGO.

Among other effects, this removes the last of the runtime libraries from
LLVM. Those are being developed in the separate 'compiler-rt' project
now, with somewhat different licensing that is more appropriate for
runtimes.

llvm-svn: 191835
2013-10-02 15:42:23 +00:00
Rafael Espindola
a279462828 Remove several unused variables.
Patch by Alp Toker.

llvm-svn: 191757
2013-10-01 13:32:03 +00:00
Benjamin Kramer
e04c4c3faf SCEVExpander: Fix a regression I introduced by too eagerly adding RAII objects.
PR17425.

llvm-svn: 191741
2013-10-01 12:17:11 +00:00
Benjamin Kramer
7b5eaaacfd Convert manual insert point restores to the new RAII object.
llvm-svn: 191675
2013-09-30 15:40:17 +00:00
Benjamin Kramer
1dab382232 ObjectSizeOffsetEvaluator: Don't run into infinite recursion if we have a cyclic GEP.
Those can occur in dead code. PR17402.
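
A minimal sketch of the guard (hypothetical code, not the actual
ObjectSizeOffsetEvaluator): remember which values are being visited and bail
out when the operand chain loops back on itself.

  #include <unordered_set>

  struct Value { const Value *Operand; };  // stand-in for llvm::Value

  bool compute(const Value *V, std::unordered_set<const Value *> &Seen) {
    if (!Seen.insert(V).second)
      return false;  // cyclic chain (possible in dead code): give up
    if (V->Operand)
      return compute(V->Operand, Seen);
    return true;
  }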

llvm-svn: 191644
2013-09-29 19:39:13 +00:00
Manman Ren
a61e576332 TBAA: try to fix the dragonegg bots.
llvm-svn: 191585
2013-09-27 22:59:21 +00:00
Matt Arsenault
7e864bac3e Minor code simplification
llvm-svn: 191579
2013-09-27 22:38:23 +00:00
Matt Arsenault
c1629ee7a8 Use type helper functions
llvm-svn: 191574
2013-09-27 22:18:51 +00:00
Manman Ren
2ef9ca7627 TBAA: handle scalar TBAA format and struct-path aware TBAA format.
Remove the command line argument "struct-path-tbaa" since we should not depend
on a command line argument to decide which format the IR file is using. Instead,
we check the first operand of the tbaa tag node: if it is an MDNode, we treat
it as the struct-path aware TBAA format; otherwise, we treat it as the scalar
TBAA format.
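
A sketch of the check (assuming the 2013-era metadata API; simplified from
what TypeBasedAliasAnalysis.cpp does):

  #include "llvm/IR/Metadata.h"
  #include "llvm/Support/Casting.h"

  // Struct-path tags look like !{!base_type, !access_type, i64 offset},
  // so the first operand is itself an MDNode; scalar-format tags begin
  // with an MDString naming the type.
  static bool isStructPathTBAA(const llvm::MDNode *Tag) {
    return llvm::isa<llvm::MDNode>(Tag->getOperand(0)) &&
           Tag->getNumOperands() >= 3;
  }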

Once clang emits the struct-path aware TBAA format unconditionally and we
can auto-upgrade existing bc files, the support for the scalar TBAA format
can be dropped.

Existing testing cases are updated to use the struct-path aware TBAA format.

llvm-svn: 191538
2013-09-27 18:34:27 +00:00
Benjamin Kramer
5903a6ce39 MemoryBuiltins: Remove posix_memalign from the list and replace it with a TODO.
This code isn't ready to deal with allocation functions where the return value
is not the allocated pointer. The checks below will reject posix_memalign anyway.
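
For reference, posix_memalign hands the allocation back through an
out-parameter and returns an error code, which is exactly the shape this
logic cannot model:

  #include <stdlib.h>

  void *alloc_aligned(size_t align, size_t size) {
    void *p = NULL;
    if (posix_memalign(&p, align, size) != 0)
      return NULL;  // a nonzero return means failure
    return p;       // the allocated pointer is not the return value
  }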

llvm-svn: 191319
2013-09-24 17:49:08 +00:00
Benjamin Kramer
3ad5ca9c1c MemoryBuiltins: Reinstate optimizing (uninitialized) loads from operator new.
llvm-svn: 191315
2013-09-24 17:34:29 +00:00
Benjamin Kramer
e77aa22768 MemoryBuiltins: Fix operator new bits.
We really don't want to optimize malloc return value checks away.

llvm-svn: 191313
2013-09-24 17:15:14 +00:00
Benjamin Kramer
bc13e7ad78 Teach MemoryBuiltins and InstructionSimplify that operator new never returns NULL.
This is safe per C++11 18.6.1.1p3: [operator new returns] a non-null pointer to
suitably aligned storage (3.7.4), or else throw a bad_alloc exception. This
requirement is binding on a replacement version of this function.

Brings us a tiny bit closer to eliminating more vector push_backs.
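
For illustration, this guarantee lets null checks like the one below be
folded away (a hypothetical example, not from the commit):

  #include <new>

  int *make() {
    int *p = new int(42);  // throwing operator new: cannot return null
    if (!p)                // ...so this branch is dead and can be
      return nullptr;      // simplified away
    return p;
  }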

llvm-svn: 191310
2013-09-24 16:37:51 +00:00
Benjamin Kramer
5318f92353 InstSimplify: Fold equality comparisons between non-inbounds GEPs.
Overflow doesn't affect the correctness of equalities. Computing this is cheap:
we just reuse the computation for the inbounds case and try to peel off more
non-inbounds GEPs. This pattern is unlikely to ever appear in code generated by
Clang, but SCEV occasionally produces it.

llvm-svn: 191200
2013-09-23 14:16:38 +00:00
Matt Arsenault
e36237bda3 Fix a constant folding address space place I missed.
If address space 0 was smaller than the address space
in a constant inttoptr/ptrtoint pair, the wrong mask size
would be used.

llvm-svn: 190899
2013-09-17 23:23:16 +00:00
Eric Christopher
cf815a772b Move variable into assert to avoid unused variable warning.
llvm-svn: 190886
2013-09-17 21:13:57 +00:00
Arnold Schwaighofer
eabde1ffce Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

 (v0, v1, v2, v3)
   \   \  /  /
     \  \  /
       +  +

 (v0+v2, v1+v3, undef, undef)
    \      /
 ((v0+v2) + (v1+v3), undef, undef)

 %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
 %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
 %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
 %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the summed cost of the individual
instructions). This interface is exercised by the CostModel analysis pass, which
looks for reduction patterns like the one above - starting at extractelements -
and calls the cost model interface if it sees a matching sequence.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

 (v0, v1, v2, v3)
  \   /    \  /
 (v0+v1, v2+v3, undef, undef)
    \     /
 ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0

llvm-svn: 190876
2013-09-17 18:06:50 +00:00
Krzysztof Parzyszek
d3b8515190 In AliasSetTracker, do not change the alias set to "mod/ref" when adding
a volatile load or a volatile store.

llvm-svn: 190631
2013-09-12 20:15:50 +00:00
Matt Arsenault
cee048defc Move variable under condition where it is used
llvm-svn: 190567
2013-09-12 01:07:58 +00:00
Hal Finkel
fe9daed60a Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.
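
A hypothetical sketch of a target override (the stand-in types below are
illustrative; the real Loop and UnrollingPreferences live in LLVM's headers):

  // Stand-ins for illustration only.
  struct Loop;
  struct UnrollingPreferences { bool Partial; unsigned Threshold; };

  // A deep, in-order pipeline such as the A2 wants more aggressive
  // unrolling to expose independent work.
  void getUnrollingPreferences(Loop *, UnrollingPreferences &UP) {
    UP.Partial = true;    // assumed field names, for illustration
    UP.Threshold = 200;
  }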

llvm-svn: 190542
2013-09-11 19:25:43 +00:00
Matt Arsenault
a15663a1b4 Teach ScalarEvolution about pointer address spaces
llvm-svn: 190425
2013-09-10 19:55:24 +00:00
Manman Ren
de03bcdbec TBAA: add isTBAAVtableAccess to MDNode so clients can call the function
instead of having their own implementations.

The implementation of isTBAAVtableAccess is in TypeBasedAliasAnalysis.cpp
since it is related to the format of TBAA metadata.

The path for struct-path tbaa will be exercised by
test/Instrumentation/ThreadSanitizer/read_from_global.ll, vptr_read.ll, and
vptr_update.ll when struct-path tbaa is on by default.

llvm-svn: 190216
2013-09-06 22:47:05 +00:00
Hal Finkel
198ffea54f Revert: r189565 - Add getUnrollingPreferences to TTI
Revert unintentional commit (of an unreviewed change).

Original commit message:

Add getUnrollingPreferences to TTI

Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189566
2013-08-29 03:33:15 +00:00
Hal Finkel
04a990355c Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 189565
2013-08-29 03:29:57 +00:00
Matt Arsenault
c6d3bffb73 Handle address spaces in TargetTransformInfo
llvm-svn: 189527
2013-08-28 22:41:57 +00:00
Matt Arsenault
0c955e9568 Fix lint assert on integer vector division
llvm-svn: 189290
2013-08-26 23:29:33 +00:00
Jakub Staszak
9f3e19fc54 Remove trailing spaces.
llvm-svn: 189173
2013-08-24 14:16:00 +00:00
Richard Sandiford
b195d89bde Turn MipsOptimizeMathLibCalls into a target-independent scalar transform
...so that it can be used for z too.  Most of the code is the same.
The only real change is to use TargetTransformInfo to test when a sqrt
instruction is available.

The pass is opt-in because at the moment it only handles sqrt.

llvm-svn: 189097
2013-08-23 10:27:02 +00:00
Bill Wendling
0b02009429 Reorder headers according to lint.
llvm-svn: 188932
2013-08-21 21:14:19 +00:00
Jakub Staszak
b5370ddc88 Add some constantness.
llvm-svn: 188844
2013-08-20 23:04:15 +00:00