llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Alp Toker	b20c031b7a	Add missed cleanup from r198456 All other uses of this macro in LLVM/clang have been moved to the function definition so follow suite (and the usage advice) here too for consistency. llvm-svn: 198516	2014-01-04 22:47:48 +00:00
Alp Toker	2d17611e90	Revert "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things." This commit was the source of crasher PR18384: While deleting: label %for.cond127 An asserting value handle still pointed to this value! UNREACHABLE executed at llvm/lib/IR/Value.cpp:671! Reverting to get the builders green, feel free to re-land after fixing up. (Renato has a handy isolated repro if you need it.) This reverts commit r198478. llvm-svn: 198503	2014-01-04 17:00:45 +00:00
Andrew Trick	45ef495b91	Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things. getSCEV for an ashr instruction creates an intermediate zext expression when it truncates its operand. The operand is initially inside the loop, so the narrow zext expression has a non-loop-invariant loop disposition. LoopSimplify then runs on an outer loop, hoists the ashr operand, and properly invalidate the SCEVs that are mapped to value. The SCEV expression for the ashr is now an AddRec with the hoisted value as the now loop-invariant start value. The LoopDisposition of this wide value was properly invalidated during LoopSimplify. However, if we later get the ashr SCEV again, we again try to create the intermediate zext expression. We get the same SCEV that we did earlier, and it is still cached because it was never mapped to a Value. When we try to create a new AddRec we abort because we're using the old non-loop-invariant LoopDisposition. I don't have a solution for this other than to clear LoopDisposition when LoopSimplify hoists things. I think the long-term strategy should be to perform LoopSimplify on all loops before computing SCEV and before running any loop opts on individual loops. It's possible we may want to rerun LoopSimplify on individual loops, but it should rarely do anything, so rarely require invalidating SCEV. llvm-svn: 198478	2014-01-04 05:52:49 +00:00
Nico Weber	7e53ec0698	Add a LLVM_DUMP_METHOD macro. The motivation is to mark dump methods as used in debug builds so that they can be called from lldb, but to not do so in release builds so that they can be dead-stripped. There's lots of potential follow-up work suggested in the thread "Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev, but everyone seems to agreen on this subset. Macro name chosen by fair coin toss. llvm-svn: 198456	2014-01-03 22:53:37 +00:00
David Peixotto	2028917754	Fix loop rerolling pass failure with non-consant loop lower bound The loop rerolling pass was failing with an assertion failure from a failed cast on loops like this: void foo(int A, int B, int m, int n) { for (int i = m; i < n; i+=4) { A[i+0] = B[i+0] * 4; A[i+1] = B[i+1] * 4; A[i+2] = B[i+2] * 4; A[i+3] = B[i+3] * 4; } } The code was casting the SCEV-expanded code for the new induction variable to a phi-node. When the loop had a non-constant lower bound, the SCEV expander would end the code expansion with an add insted of a phi node and the cast would fail. It looks like the cast to a phi node was only needed to get the induction variable value coming from the backedge to compute the end of loop condition. This patch changes the loop reroller to compare the induction variable to the number of times the backedge is taken instead of the iteration count of the loop. In other words, we stop the loop when the current value of the induction variable == IterationCount-1. Previously, the comparison was comparing the induction variable value from the next iteration == IterationCount. This problem only seems to occur on 32-bit targets. For some reason, the loop is not rerolled on 64-bit targets. PR18290 llvm-svn: 198425	2014-01-03 17:20:01 +00:00
Hal Finkel	df8016f76f	Disable compare sinking in CodeGenPrepare when multiple condition registers are available As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively sinks compares to reduce pressure on the condition register(s), for targets such as PowerPC with multiple condition registers, this may not be the right thing to do. This adds an HasMultipleConditionRegisters boolean to TLI, and CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is true. This functionality will be used by the PowerPC backend in an upcoming commit. Especially when the PowerPC backend starts tracking individual condition register bits as separate allocatable entities (which will happen in this upcoming commit), this sinking from CodeGenPrepare::OptimizeInst is significantly suboptimial. llvm-svn: 198354	2014-01-02 21:13:43 +00:00
Andrew Trick	9bdab3f1b3	indvars: cleanup the IV visitor. It does more than gather sext/zext info. llvm-svn: 198353	2014-01-02 21:12:11 +00:00
Matt Arsenault	e28f607079	Delete unread globals through addrspacecast llvm-svn: 198346	2014-01-02 20:01:43 +00:00
Matt Arsenault	090fe5a92a	Fix addrspacecast with metadata globals llvm-svn: 198345	2014-01-02 19:53:49 +00:00
Andrew Trick	5f76ab650f	indvars: insert truncate at loop boundary to avoid redundant IVs. When widening an IV to remove s/zext, we generally try to eliminate the original narrow IV. However, LCSSA phi nodes outside the loop were still using the original IV. Clean this up more aggressively to avoid redundancy in generated code. llvm-svn: 198338	2014-01-02 19:29:38 +00:00
Nico Weber	10bf32e628	Set LLVM_EXPORTED_SYMBOL_FILE in CMakeLists whose corresponding Makefiles do so. (unittests/ExecutionEngine/JIT/CMakeLists.txt is still missing for now, since it handles export files in a strange way: It generates a .exports file from a .def file instead of the other way round.) llvm-svn: 198183	2013-12-29 23:06:49 +00:00
Alexander Potapenko	7da398bcae	[ASan] Fix the test for __asan_gen_ globals and actually fix http://llvm.org/bugs/show_bug.cgi?id=17976 by setting the correct linkage (as stated in the bug). llvm-svn: 198018	2013-12-25 16:46:27 +00:00
Alexander Potapenko	53694d2efb	[ASan] Make sure none of the __asan_gen_ global strings end up in the symbol table, add a test. This should fix http://llvm.org/bugs/show_bug.cgi?id=17976 Another test checking for the global variables' locations and prefixes on Darwin will be committed separately. llvm-svn: 198017	2013-12-25 14:22:15 +00:00
Andrew Trick	e7f9f5556d	Add support to indvars for optimizing sadd.with.overflow. Split sadd.with.overflow into add + sadd.with.overflow to allow analysis and optimization. This should ideally be done after InstCombine, which can perform code motion (eventually indvars should run after all canonical instcombines). We want ISEL to recombine the add and the check, at least on x86. This is currently under an option for reducing live induction variables: -liv-reduce. The next step is reducing liveness of IVs that are live out of the overflow check paths. Once the related optimizations are fully developed, reviewed and tested, I do expect this to become default. llvm-svn: 197926	2013-12-23 23:31:49 +00:00
Richard Sandiford	f367c783a7	Fix Scalarizer insertion point when replacing PHIs with insertelements If the Scalarizer scalarized a vector PHI but could not scalarize all uses of it, it would insert a series of insertelements to reconstruct the vector PHI value from the scalar ones. The problem was that it would emit these insertelements immediately after the PHI, even if there were other PHIs after it. llvm-svn: 197909	2013-12-23 14:51:56 +00:00
Richard Sandiford	27fc4a21a8	Fix Scalarizer handling of vector GEPs with multiple index operands The old code only worked for one index operand. Also handle "inbounds". llvm-svn: 197908	2013-12-23 14:45:00 +00:00
Kostya Serebryany	a148c8c9ed	[asan] don't unpoison redzones on function exit in use-after-return mode. Summary: Before this change the instrumented code before Ret instructions looked like: <Unpoison Frame Redzones> if (Frame != OriginalFrame) // I.e. Frame is fake <Poison Complete Frame> Now the instrumented code looks like: if (Frame != OriginalFrame) // I.e. Frame is fake <Poison Complete Frame> else <Unpoison Frame Redzones> Reviewers: eugenis Reviewed By: eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2458 llvm-svn: 197907	2013-12-23 14:15:08 +00:00
Kostya Serebryany	911683de1d	[asan] produce fewer stores when poisoning stack shadow llvm-svn: 197904	2013-12-23 09:24:36 +00:00
Justin Bogner	3b4e34606e	Transforms: Don't create bad weights when eliminating dead cases If we happen to eliminate every case in a switch that has branch weights, we currently try to create metadata for the one remaining branch, triggering an assert. Instead, we need to check that the metadata we're trying to create is sensible. llvm-svn: 197791	2013-12-20 08:21:30 +00:00
Kay Tiong Khoo	86f36f1147	Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose links may not be permanent. llvm-svn: 197713	2013-12-19 18:35:54 +00:00
Kay Tiong Khoo	304e305b5c	Improved fix for PR17827 (instcombine of shift/and/compare). This change fixes the case of arithmetic shift right - do not attempt to fold that case. This change also relaxes the conditions when attempting to fold the logical shift right and shift left cases. No additional IR-level test cases included at this time. See http://llvm.org/bugs/show_bug.cgi?id=17827 for proofs that these are correct transformations. llvm-svn: 197705	2013-12-19 18:07:17 +00:00
Evgeniy Stepanov	301154310a	[dfsan] Simplify code after r197677. llvm-svn: 197679	2013-12-19 14:37:03 +00:00
Evgeniy Stepanov	0cd4eea1b6	Add an explicit insert point argument to SplitBlockAndInsertIfThen. Currently SplitBlockAndInsertIfThen requires that branch condition is an Instruction itself, which is very inconvenient, because it is sometimes an Operator, or even a Constant. llvm-svn: 197677	2013-12-19 13:29:56 +00:00
Arnold Schwaighofer	e4d65aae7d	LoopVectorizer: Don't if-convert constant expressions that can trap A phi node operand or an instruction operand could be a constant expression that can trap (division). Check that we don't vectorize such cases. PR16729 radar://15653590 llvm-svn: 197449	2013-12-17 01:11:01 +00:00
Yi Jiang	67f2e8e3f8	Enable double to float shrinking optimizations for binary functions like 'fmin/fmax'. Fix radar:15283121 llvm-svn: 197434	2013-12-16 22:42:40 +00:00
Hal Finkel	89ba3023da	Fix a use-after-free error in GlobalOpt CleanupConstantGlobalUsers GlobalOpt's CleanupConstantGlobalUsers function uses a worklist array to manage constant users to be visited. The pointers in this array need to be weak handles because when we delete a constant array, we may also be holding a pointer to one of its elements (or an element of one of its elements if we're dealing with an array of arrays) in the worklist. Fixes PR17347. llvm-svn: 197178	2013-12-12 20:45:24 +00:00
Hal Finkel	4492bf3e5d	Initialize the barrier pass llvm::initializeIPO The barrier pass is a temporary hack, and should go away soon. Nevertheless, if we don't initialize it, then opt will not understand -barrier, and this will break bugpoint (because when it dumps the passes from the default pass manager -barrier will be there). llvm-svn: 197177	2013-12-12 20:45:08 +00:00
Yi Jiang	f648b7ec82	Resubmit r196544: Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x) llvm-svn: 197109	2013-12-12 01:55:04 +00:00
NAKAMURA Takumi	54fa39136d	Prune redundant dependencies in LLVMBuild.txt. llvm-svn: 196988	2013-12-11 00:30:57 +00:00
Reid Kleckner	29793e54b1	[asan] Fix the coverage.cc test broken by r196939 It was failing because ASan was adding all of the following to one function: - dynamic alloca - stack realignment - inline asm This patch avoids making the static alloca dynamic when coverage is used. ASan should probably not be inserting empty inline asm blobs to inhibit duplicate tail elimination. llvm-svn: 196973	2013-12-10 21:49:28 +00:00
NAKAMURA Takumi	3fda77b6c7	Add proper dependencies to LLVMBuild.txt in llvm/lib. I'll prune redundant deps in LLVMBuild.txt, later. llvm-svn: 196881	2013-12-10 05:39:34 +00:00
NAKAMURA Takumi	b2c60b7ca7	Whitespaces. llvm-svn: 196880	2013-12-10 05:39:12 +00:00
Justin Bogner	42f9d0cf93	Transforms: Don't create bad branch weights when folding a switch This avoids creating branch weight metadata of length one when we fold cases into the default of a switch instruction, which was triggering an assert. llvm-svn: 196845	2013-12-10 00:13:41 +00:00
Manman Ren	4fcc808139	Revert 196544 due to internal bot failures. llvm-svn: 196732	2013-12-08 20:28:33 +00:00
Mark Seaborn	72517bc221	Fix inlining to not lose the "cleanup" clause from landingpads This fixes PR17872. This bug can lead to C++ destructors not being called when they should be, when an exception is thrown. llvm-svn: 196711	2013-12-08 00:51:21 +00:00
Mark Seaborn	2d856cb007	Fix inlining to not produce duplicate landingpad clauses Before this change, inlining one "invoke" into an outer "invoke" call site can lead to the outer landingpad's catch/filter clauses being copied multiple times into the resulting landingpad. This happens: * when the inlined function contains multiple "resume" instructions, because forwardResume() copies the clauses but is called multiple times; * when the inlined function contains a "resume" and a "call", because HandleCallsInBlockInlinedThroughInvoke() copies the clauses but is redundant with forwardResume(). Fix this by deduplicating the code. This problem doesn't lead to any incorrect execution; it's only untidy. This change will make fixing PR17872 a little easier. llvm-svn: 196710	2013-12-08 00:50:58 +00:00
Jakub Staszak	11e1c882f7	Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces overall time of LLVM compilation by ~1%. llvm-svn: 196667	2013-12-07 21:20:17 +00:00
Matt Arsenault	db406f2a95	Fix assert with copy from global through addrspacecast llvm-svn: 196638	2013-12-07 02:58:45 +00:00
Duncan P. N. Exon Smith	de77610c42	Don't use isNullValue to evaluate ConstantExpr ConstantExpr can evaluate to false even when isNullValue gives false. Fixes PR18143. llvm-svn: 196611	2013-12-06 21:48:36 +00:00
Kostya Serebryany	f0897919b4	[asan] fix ndebug build with strict warnings (-Wunused-variable) llvm-svn: 196574	2013-12-06 09:26:09 +00:00
Kostya Serebryany	fc32a3e5d2	[asan] rewrite asan's stack frame layout Summary: Rewrite asan's stack frame layout. First, most of the stack layout logic is moved into a separte file to make it more testable and (potentially) useful for other projects. Second, make the frames more compact by using adaptive redzones (smaller for small objects, larger for large objects). Third, try to minimized gaps due to large alignments (this is hypothetical since today we don't see many stack vars aligned by more than 32). The frames indeed become more compact, but I'll still need to run more benchmarks before committing, but I am sking for review now to get early feedback. This change will be accompanied by a trivial change in compiler-rt tests to match the new frame sizes. Reviewers: samsonov, dvyukov Reviewed By: samsonov CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2324 llvm-svn: 196568	2013-12-06 09:00:17 +00:00
Yi Jiang	0bd0569be5	Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) ―> __exp10(x) llvm-svn: 196544	2013-12-05 22:42:50 +00:00
Renato Golin	a4d4a4c44f	Add #pragma vectorize enable/disable to LLVM The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure the all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. llvm-svn: 196537	2013-12-05 21:20:02 +00:00
Michael Gottesman	f2e7969964	Change std::deque => std::vector. No functionality change. There is no reason to use std::deque here over std::vector. Thus given the performance differences inbetween the two it makes sense to change deque to vector. llvm-svn: 196524	2013-12-05 18:42:12 +00:00
Rafael Espindola	2ad993fb14	Fix non-deterministic behavior. We use CSEBlocks to initialize a worklist: SmallVector<BasicBlock *, 8> CSEWorkList(CSEBlocks.begin(), CSEBlocks.end()); so it must have a deterministic order. llvm-svn: 196520	2013-12-05 18:28:01 +00:00
Arnold Schwaighofer	120880c780	SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' as a scalar and vectorized causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external uses entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 llvm-svn: 196508	2013-12-05 15:14:40 +00:00
Kostya Serebryany	eb57b3e248	[tsan] fix PR18146: sometimes a variable written into vptr could have an integer type (after other optimizations) llvm-svn: 196507	2013-12-05 15:03:02 +00:00
Alp Toker	e845f8af67	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Yuchen Wu	9559c6af5d	llvm-cov: Replace size() with empty() in bool check. llvm-svn: 196400	2013-12-04 19:18:23 +00:00
Daniel Jasper	ca41e63412	Un-revert r196358: "llvm-cov: Added support for function checksums." And add the proper fix. llvm-svn: 196367	2013-12-04 08:57:17 +00:00
Daniel Jasper	a7dd8af910	Revert r196358: "llvm-cov: Added support for function checksums." This currently breaks clang/test/CodeGen/code-coverage.c. The root cause is that the newly introduced access to Funcs[j] is out of bounds. llvm-svn: 196365	2013-12-04 08:23:33 +00:00
Yuchen Wu	b1a23c9951	llvm-cov: Added support for function checksums. The function checksums are hashed from the concatenation of the function name and line number. llvm-svn: 196358	2013-12-04 06:00:17 +00:00
Yunzhong Gao	05c1966c8c	Teach the internalize pass to skip dllexported symbols because they could be referenced in a way that even the linker does not see. Differential Revision: http://llvm-reviews.chandlerc.com/D2280 llvm-svn: 196300	2013-12-03 18:05:14 +00:00
Kay Tiong Khoo	dd9c210766	Use local variable for repeated use rather than 'get' method. No functional change intended. llvm-svn: 196164	2013-12-02 22:23:32 +00:00
Kay Tiong Khoo	40a987d7ca	Move variables to where they are used and give them better names. No functional change intended. llvm-svn: 196163	2013-12-02 22:20:40 +00:00
Kay Tiong Khoo	14489a85ab	Rename variables to be consistent (CST -> Cst). No functional change intended. llvm-svn: 196161	2013-12-02 22:11:56 +00:00
Mark Seaborn	b08ae3c322	InlineFunction.cpp: Remove a return value that is always false Remove some associated dead code. This cleanup is associated with PR17872. llvm-svn: 196147	2013-12-02 20:50:59 +00:00
Kay Tiong Khoo	5257afa264	Conservative fix for PR17827 - don't optimize a shift + and + compare sequence where the shift is logical unless the comparison is unsigned llvm-svn: 196129	2013-12-02 18:43:59 +00:00
Kostya Serebryany	e71e08d007	[tsan] fix instrumentation of vector vptr updates (https://code.google.com/p/thread-sanitizer/issues/detail?id=43 ) llvm-svn: 196079	2013-12-02 08:07:15 +00:00
Bill Wendling	0f97d98496	Use accessor methods instead. llvm-svn: 196006	2013-12-01 03:40:42 +00:00
Bill Wendling	178e2b5358	Use 'unsigned char' to get this past gcc error message: error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence' llvm-svn: 196004	2013-12-01 03:36:07 +00:00
Stephen Canon	d8aaca93a6	Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)). llvm-svn: 195934	2013-11-28 21:38:05 +00:00
Nadav Rotem	dc01e91cf5	PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE. llvm-svn: 195791	2013-11-26 22:24:25 +00:00
Arnold Schwaighofer	d0c05d2c84	LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. llvm-svn: 195787	2013-11-26 22:11:23 +00:00
Diego Novillo	0929297da9	Refactor some code in SampleProfile.cpp I'm adding new functionality in the sample profiler. This will require more data to be kept around for each function, so I moved the structure SampleProfile that we keep for each function into a separate class. There are no functional changes in this patch. It simply provides a new home where to place all the new data that I need to propagate weights through edges. There are some other name and minor edits throughout. llvm-svn: 195780	2013-11-26 20:37:33 +00:00
Nadav Rotem	643eb4c26e	PR18060 - When we RAUW values with ExtractElement instructions in some cases we generate PHI nodes with multiple entries from the same basic block but with different values. Enabling CSE on ExtractElement instructions make sure that all of the RAUWed instructions are the same. llvm-svn: 195773	2013-11-26 17:29:19 +00:00
Stepan Dyatkovskiy	83455f2b60	PR17925 bugfix. Short description. This issue is about case of treating pointers as integers. We treat pointers as different if they references different address space. At the same time, we treat pointers equal to integers (with machine address width). It was a point of false-positive. Consider next case on 32bit machine: void foo0(i32 addrespace(1)* %p) void foo1(i32 addrespace(2)* %p) void foo2(i32 %p) foo0 != foo1, while foo1 == foo2 and foo0 == foo2. As you can see it breaks transitivity. That means that result depends on order of how functions are presented in module. Next order causes merging of foo0 and foo1: foo2, foo0, foo1 First foo0 will be merged with foo2, foo0 will be erased. Second foo1 will be merged with foo2. Depending on order, things could be merged we don't expect to. The fix: Forbid to treat any pointer as integer, except for those, who belong to address space 0. llvm-svn: 195769	2013-11-26 16:11:03 +00:00
Chandler Carruth	5be5f8d16c	[PM] Split the CallGraph out from the ModulePass which creates the CallGraph. This makes the CallGraph a totally generic analysis object that is the container for the graph data structure and the primary interface for querying and manipulating it. The pass logic is separated into its own class. For compatibility reasons, the pass provides wrapper methods for most of the methods on CallGraph -- they all just forward. This will allow the new pass manager infrastructure to provide its own analysis pass that constructs the same CallGraph object and makes it available. The idea is that in the new pass manager, the analysis pass's 'run' method returns a concrete analysis 'result'. Here, that result is a 'CallGraph'. The 'run' method will typically do only minimal work, deferring much of the work into the implementation of the result object in order to be lazy about computing things, but when (like DomTree) there is some up-front computation, the analysis does it prior to handing the result back to the querying pass. I know some of this is fairly ugly. I'm happy to change it around if folks can suggest a cleaner interim state, but there is going to be some amount of unavoidable ugliness during the transition period. The good thing is that this is very limited and will naturally go away when the old pass infrastructure goes away. It won't hang around to bother us later. Next up is the initial new-PM-style call graph analysis. =] llvm-svn: 195722	2013-11-26 04:19:30 +00:00
Chandler Carruth	a1094eb135	Migrate metadata information from scalar to vector instructions during SLP vectorization. Based on the code in BBVectorizer. Fixes PR17741. Patch by Raul Silvera, reviewed by Hal and Nadav. Reformatted by my driving of clang-format. =] llvm-svn: 195528	2013-11-23 00:48:34 +00:00
Yuchen Wu	43df88959d	llvm-cov: Split entry blocks in GCNOProfiling.cpp. gcov expects every function to contain an entry block that unconditionally branches into the next block. clang does not implement basic blocks in this manner, so gcov did not output correct branch info if the entry block branched to multiple blocks. This change splits every function's entry block into an empty block and a block with the rest of the instructions. The instrumentation code will take care of the rest. llvm-svn: 195513	2013-11-22 23:07:45 +00:00
Manman Ren	7590fee969	Debug Info: move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp. We can share the implementation between StripSymbols and dropping debug info for metadata versions that do not match. Also update the comments to match the implementation. A follow-on patch will drop the "Debug Info Version" module flag in StripDebugInfo. llvm-svn: 195505	2013-11-22 22:06:31 +00:00
Matt Arsenault	35acaad8c7	StructurizeCFG: Fix verification failure with some loops. If the beginning of the loop was also the entry block of the function, branches were inserted to the entry block which isn't allowed. If this occurs, create a new dummy function entry block that branches to the start of the loop. llvm-svn: 195493	2013-11-22 19:24:39 +00:00
Matt Arsenault	9afcbf3562	StructurizeCFG: Fix inverting a branch on an argument llvm-svn: 195492	2013-11-22 19:24:37 +00:00
Rafael Espindola	524e7f35de	Add a fixed version of r195470 back. The fix is simply to use CurI instead of I when handling aliases to avoid accessing a invalid iterator. original message: Convert linkonce* to weak* instead of strong. Also refactor the logic into a helper function. This is an important improve on mingw where the linker complains about mixed weak and strong symbols. Converting to weak ensures that the symbol is not dropped, but keeps in a comdat, making the linker happy. llvm-svn: 195477	2013-11-22 17:58:12 +00:00
Rafael Espindola	749aa1e00d	Revert "Convert linkonce* to weak* instead of strong." This reverts commit r195470. Debugging failure in some bots. llvm-svn: 195472	2013-11-22 17:09:34 +00:00
Richard Sandiford	82ac8f6b68	Add a Scalarizer pass. llvm-svn: 195471	2013-11-22 16:58:05 +00:00
Rafael Espindola	2ac1404bee	Convert linkonce* to weak* instead of strong. Also refactor the logic into a helper function. This is an important improvement on mingw where the linker complains about mixed weak and strong symbols. Converting to weak ensures that the symbol is not dropped, but keeps in a comdat, making the linker happy. llvm-svn: 195470	2013-11-22 16:14:30 +00:00
Arnold Schwaighofer	3fa9376236	SLPVectorizer: Fix whitespace errors. llvm-svn: 195468	2013-11-22 15:47:17 +00:00
Yi Jiang	74286d427a	SLP Vectorizer: Extract cost will only be added once even if the scalar has multiple external uses. llvm-svn: 195406	2013-11-22 01:57:02 +00:00
Peter Collingbourne	89b5505b6b	Introduce two command-line flags for the instrumentation pass to control whether the labels of pointers should be ignored in load and store instructions The new command line flags are -dfsan-ignore-pointer-label-on-store and -dfsan-ignore-pointer-label-on-load. Their default value matches the current labelling scheme. Additionally, the function __dfsan_union_load is marked as readonly. Patch by Lorenzo Martignoni! Differential Revision: http://llvm-reviews.chandlerc.com/D2187 llvm-svn: 195382	2013-11-21 23:20:54 +00:00
Evgeniy Stepanov	405470ce24	[msan] Propagate condition origin in select instruction. llvm-svn: 195349	2013-11-21 12:00:24 +00:00
Yuchen Wu	5480edb76f	llvm-cov: Don't assume FileChecksum was generated. For cases where emitProfileArcs() was called but emitProfileNotes() was not, set the CfgChecksum to 0. llvm-svn: 195311	2013-11-21 04:53:39 +00:00
Yuchen Wu	d218a85f8c	llvm-cov: Fixed some bugs related to file checksum. Added call to update CfgChecksum. Made FileChecksum a vector, separate for each source file. llvm-svn: 195309	2013-11-21 04:01:05 +00:00
Yuchen Wu	734fa40b2a	llvm-cov: Added file checksum to gcno and gcda files. Instead of permanently outputting "MVLL" as the file checksum, clang will create gcno and gcda checksums by hashing the destination block numbers of every arc. This allows for llvm-cov to check if the two gcov files are synchronized. Regenerated the test files so they contain the checksum. Also added negative test to ensure error when the checksums don't match. llvm-svn: 195191	2013-11-20 04:15:05 +00:00
Arnold Schwaighofer	242935ec8c	SLPVectorizer: Fix stale for Value pointer array We are slicing an array of Value pointers and process those slices in a loop. The problem is that we might invalidate a later slice by vectorizing a former slice. Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can tell. The test case will only fail when running with libgmalloc. radar://15498655 llvm-svn: 195162	2013-11-19 22:20:20 +00:00
Arnold Schwaighofer	3149313505	SLPVectorizer: Fix whitespace errors llvm-svn: 195161	2013-11-19 22:20:18 +00:00
Chandler Carruth	4d8e469cd3	Fix an issue where SROA computed different results based on the relative order of slices of the alloca which have exactly the same size and other properties. This was found by a perniciously unstable sort implementation used to flush out buggy uses of the algorithm. The fundamental idea is that findCommonType should return the best common type it can find across all of the slices in the range. There were two bugs here previously: 1) We would accept an integer type smaller than a byte-width multiple, and if there were different bit-width integer types, we would accept the first one. This caused an actual failure in the testcase updated here when the sort order changed. 2) If we found a bad combination of types or a non-load, non-store use before an integer typed load or store we would bail, but if we found the integere typed load or store, we would use it. The correct behavior is to always use an integer typed operation which covers the partition if one exists. While a clever debugging sort algorithm found problem #1 in our existing test cases, I have no useful test case ideas for #2. I spotted in by inspection when looking at this code. llvm-svn: 195118	2013-11-19 09:03:18 +00:00
Michael Ilseman	0f69f39907	Add support for software expansion of 64-bit integer division instructions. Patch by Dmitri Shtilman! llvm-svn: 195116	2013-11-19 06:54:19 +00:00
Adrian Prantl	afeac86924	Debug info: Let LowerDbgDeclare perfom the dbg.declare -> dbg.value lowering only for load/stores to scalar allocas. The resulting values confuse the backend and don't add anything because we can describe array-allocas with a dbg.declare intrinsic just fine. rdar://problem/15464571 llvm-svn: 195052	2013-11-18 23:04:38 +00:00
Alexey Samsonov	cbf7462c74	[ASan] Fix PR17867 - make sure ASan doesn't crash if use-after-scope and use-after-return are combined. llvm-svn: 195014	2013-11-18 14:53:55 +00:00
Arnold Schwaighofer	e4280ec4dd	LoopVectorizer: Extend the induction variable to a larger type In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 llvm-svn: 195008	2013-11-18 13:14:32 +00:00
NAKAMURA Takumi	413002f21c	Utils/LoopUnroll.cpp: Tweak (StringRef)OldName to be valid until it is used, since r194601. eraseFromParent() invalidates OldName. llvm-svn: 194970	2013-11-17 18:05:34 +00:00
Hal Finkel	2557302085	Add a loop rerolling flag to the PassManagerBuilder This adds a boolean member variable to the PassManagerBuilder to control loop rerolling (just like we have for unrolling and the various vectorization options). This is necessary for control by the frontend. Loop rerolling remains disabled by default at all optimization levels. llvm-svn: 194966	2013-11-17 16:02:50 +00:00
Hal Finkel	f09058ab9e	Add the cold attribute to error-reporting call sites Generally speaking, control flow paths with error reporting calls are cold. So far, error reporting calls are calls to perror and calls to fprintf, fwrite, etc. with stderr as the stream. This can be extended in the future. The primary motivation is to improve block placement (the cold attribute affects the static branch prediction heuristics). llvm-svn: 194943	2013-11-17 02:06:35 +00:00
Hal Finkel	5d880eed19	Fix ndebug-build unused variable in loop rerolling llvm-svn: 194941	2013-11-17 01:21:54 +00:00
Hal Finkel	cc70e01f05	Add a loop rerolling pass This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this: for (int i = 0; i < 3200; i += 5) { a[i] += alpha * b[i]; a[i + 1] += alpha * b[i + 1]; a[i + 2] += alpha * b[i + 2]; a[i + 3] += alpha * b[i + 3]; a[i + 4] += alpha * b[i + 4]; } and turn them into this: for (int i = 0; i < 3200; ++i) { a[i] += alpha * b[i]; } and loops like this: for (int i = 0; i < 500; ++i) { x[3i] = foo(0); x[3i+1] = foo(0); x[3*i+2] = foo(0); } and turn them into this: for (int i = 0; i < 1500; ++i) { x[i] = foo(0); } There are two motivations for this transformation: 1. Code-size reduction (especially relevant, obviously, when compiling for code size). 2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target. The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases. This pass is not currently enabled by default at any optimization level. llvm-svn: 194939	2013-11-16 23:59:05 +00:00
Hal Finkel	79b1387151	Apply the InstCombine fptrunc sqrt optimization to llvm.sqrt InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls: (fptrunc (sqrt (fpext x))) -> (sqrtf x) but does not apply the same optimization to llvm.sqrt. This is a problem because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in fast-math mode, and because this optimization is being applied to sqrt and not applied to llvm.sqrt, sometimes the fast-math code is slower. This change makes InstCombine apply this optimization to llvm.sqrt as well. This fixes the specific problem in PR17758, although the same underlying issue (optimizations applied to libcalls are not applied to intrinsics) exists for other optimizations in SimplifyLibCalls. llvm-svn: 194935	2013-11-16 21:29:08 +00:00
Benjamin Kramer	0519e29d1b	InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs. This is common in bitfield code. llvm-svn: 194925	2013-11-16 16:00:48 +00:00
Arnold Schwaighofer	01b6f1cc9a	LoopVectorizer: Use abi alignment for accesses with no alignment When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. llvm-svn: 194876	2013-11-15 23:09:33 +00:00
Manman Ren	2189fb16b9	ArgumentPromotion: correctly transfer TBAA tags and alignments. We used to use std::map<IndicesVector, LoadInst> for OriginalLoads, and when we try to promote two arguments, they will both write to OriginalLoads causing created loads for the two arguments to have the same original load. And the same tbaa tag and alignment will be put to the created loads for the two arguments. The fix is to use std::map<std::pair<Argument, IndicesVector>, LoadInst*> for OriginalLoads, so each Argument will write to different parts of the map. PR17906 llvm-svn: 194846	2013-11-15 20:41:15 +00:00
Kostya Serebryany	3b2f8ea060	[asan] use GlobalValue::PrivateLinkage for coverage guard to save quite a bit of code size llvm-svn: 194800	2013-11-15 09:52:05 +00:00
Bob Wilson	5f54393e59	Reapply "[asan] Poor man's coverage that works with ASan" I was able to successfully run a bootstrapped LTO build of clang with r194701, so this change does not seem to be the cause of our failing buildbots. llvm-svn: 194789	2013-11-15 07:16:09 +00:00
Matt Arsenault	2c9ec5c652	Add instcombine visitor for addrspacecast llvm-svn: 194786	2013-11-15 05:45:08 +00:00
Bob Wilson	1eb0da26bc	Revert "[asan] Poor man's coverage that works with ASan" This reverts commit 194701. Apple's bootstrapped LTO builds have been failing, and this change (along with compiler-rt 194702-194704) is the only thing on the blamelist. I will either reappy these changes or help debug the problem, depending on whether this fixes the buildbots. llvm-svn: 194780	2013-11-15 03:28:22 +00:00
Kostya Serebryany	8bd4b917a6	[asan] Poor man's coverage that works with ASan llvm-svn: 194701	2013-11-14 13:27:41 +00:00
Evgeniy Stepanov	01ee97e26c	[msan] Fast path optimization for wrap-indirect-calls feature of MemorySanitizer. Indirect call wrapping helps MSanDR (dynamic instrumentation companion tool for MSan) to catch all cases where execution leaves a compiler-instrumented module by allowing the tool to rewrite targets of indirect calls. This change is an optimization that skips wrapping for calls when target is inside the current module. This relies on the linker providing symbols at the begin and end of the module code (or code + data, does not really matter). Gold linker provides such symbols by default. GNU (BFD) linker needs a link flag: -Wl,--defsym=__executable_start=0. More info: https://code.google.com/p/memory-sanitizer/wiki/MSanDR#Native_exec llvm-svn: 194697	2013-11-14 12:29:04 +00:00
Jakub Staszak	5e8ab0ddcf	Use StringRef instead of std::string llvm-svn: 194601	2013-11-13 20:09:11 +00:00
Alexey Samsonov	89a2c4d5df	Fix -Wdelete-non-virtual-dtor warnings by making SampleProfile methods non-virtual llvm-svn: 194568	2013-11-13 13:09:39 +00:00
Diego Novillo	7b4e2dda6b	SampleProfileLoader pass. Initial setup. This adds a new scalar pass that reads a file with samples generated by 'perf' during runtime. The samples read from the profile are incorporated and emmited as IR metadata reflecting that profile. The profile file is assumed to have been generated by an external profile source. The profile information is converted into IR metadata, which is later used by the analysis routines to estimate block frequencies, edge weights and other related data. External profile information files have no fixed format, each profiler is free to define its own. This includes both the on-disk representation of the profile and the kind of profile information stored in the file. A common kind of profile is based on sampling (e.g., perf), which essentially counts how many times each line of the program has been executed during the run. The SampleProfileLoader pass is organized as a scalar transformation. On startup, it reads the file given in -sample-profile-file to determine what kind of profile it contains. This file is assumed to contain profile information for the whole application. The profile data in the file is read and incorporated into the internal state of the corresponding profiler. To facilitate testing, I've organized the profilers to support two file formats: text and native. The native format is whatever on-disk representation the profiler wants to support, I think this will mostly be bitcode files, but it could be anything the profiler wants to support. To do this, every profiler must implement the SampleProfile::loadNative() function. The text format is mostly meant for debugging. Records are separated by newlines, but each profiler is free to interpret records as it sees fit. Profilers must implement the SampleProfile::loadText() function. Finally, the pass will call SampleProfile::emitAnnotations() for each function in the current translation unit. This function needs to translate the loaded profile into IR metadata, which the analyzer will later be able to use. This patch implements the first steps towards the above design. I've implemented a sample-based flat profiler. The format of the profile is fairly simplistic. Each sampled function contains a list of relative line locations (from the start of the function) together with a count representing how many samples were collected at that line during execution. I generate this profile using perf and a separate converter tool. Currently, I have only implemented a text format for these profiles. I am interested in initial feedback to the whole approach before I send the other parts of the implementation for review. This patch implements: - The SampleProfileLoader pass. - The base ExternalProfile class with the core interface. - A SampleProfile sub-class using the above interface. The profiler generates branch weight metadata on every branch instructions that matches the profiles. - A text loader class to assist the implementation of SampleProfile::loadText(). - Basic unit tests for the pass. Additionally, the patch uses profile information to compute branch weights based on instruction samples. This patch converts instruction samples into branch weights. It does a fairly simplistic conversion: Given a multi-way branch instruction, it calculates the weight of each branch based on the maximum sample count gathered from each target basic block. Note that this assignment of branch weights is somewhat lossy and can be misleading. If a basic block has more than one incoming branch, all the incoming branches will get the same weight. In reality, it may be that only one of them is the most heavily taken branch. I will adjust this assignment in subsequent patches. llvm-svn: 194566	2013-11-13 12:22:21 +00:00
Nadav Rotem	a2f08b8300	Update the docs to match the function name. llvm-svn: 194537	2013-11-13 01:12:01 +00:00
Nadav Rotem	e15585e8ff	Fold (iszero(A&K1) \| iszero(A&K2)) -> (A&(K1\|K2)) != (K1\|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on). llvm-svn: 194525	2013-11-12 22:38:59 +00:00
Nadav Rotem	8fbd606127	FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable. llvm-svn: 194524	2013-11-12 22:37:16 +00:00
Rafael Espindola	318b459092	Corruptly merge constants with explicit and implicit alignments. Constant merge can merge a constant with implicit alignment with one that has explicit alignment. Before this change it was assuming that the explicit alignment was higher than the implicit one, causing the result to be under aligned in some cases. Fixes pr17815. Patch by Chris Smowton! llvm-svn: 194506	2013-11-12 20:21:43 +00:00
Benjamin Kramer	d93d22eb86	SimplifyCFG: Use existing constant folding logic when forming switch tables. Both simpler and more powerful than the hand-rolled folding logic. llvm-svn: 194475	2013-11-12 12:24:36 +00:00
Shuxin Yang	ccc49e9a1b	Correct a glitch in r194424 which may invalidate iterator. llvm-svn: 194457	2013-11-12 08:33:03 +00:00
Yuchen Wu	12b9b5086e	llvm-cov: Added call to update run/program counts. Also updated test files that were generated from this change. llvm-svn: 194453	2013-11-12 04:59:08 +00:00
Shuxin Yang	fc370d42f7	Fix PR17952. The symptom is that an assertion is triggered. The assertion was added by me to detect the situation when value is propagated from dead blocks. (We can certainly get rid of assertion; it is safe to do so, because propagating value from dead block to alive join node is certainly ok.) The root cause of this bug is : edge-splitting is conducted on the fly, the edge being split could be a dead edge, therefore the block that split the critial edge needs to be flagged "dead" as well. There are 3 ways to fix this bug: 1) Get rid of the assertion as I mentioned eariler 2) When an dead edge is split, flag the inserted block "dead". 3) proactively split the critical edges connecting dead and live blocks when new dead blocks are revealed. This fix go for 3) with additional 2 LOC. Testing case was added by Rafael the other day. llvm-svn: 194424	2013-11-11 22:00:23 +00:00
Renato Golin	ed3b88828a	Move debug message in vectorizer No functional change, just better reporting. llvm-svn: 194388	2013-11-11 16:27:35 +00:00
Evgeniy Stepanov	32b834b198	[msan] Propagate origin for insertvalue, extractvalue. llvm-svn: 194374	2013-11-11 13:37:10 +00:00
Bill Wendling	74b6463d35	Revert "Resurrect r191017 " GVN proceeds in the presence of dead code" plus a fix to PR17307 & 17308." This causes PR17852. This reverts commit d93e8a06b2ca09ab18f390cd514b7443e2e571f7. Conflicts: test/Transforms/GVN/cond_br2.ll llvm-svn: 194348	2013-11-10 07:34:34 +00:00
Matt Arsenault	e3debcae62	Use type form of getIntPtrType. This should be inconsequential and is work towards removing the default address space arguments. llvm-svn: 194347	2013-11-10 04:46:57 +00:00
Nadav Rotem	537b5180e9	SimplifyCFG has a heuristics for out-of-order processors that decides when it is worthwhile to merge branches. It tries to estimate if the operands of the instruction that we want to hoist are ready. This commit marks function arguments as 'ready' because they require no calculation. This boosts libquantum and a few other workloads from the testsuite. llvm-svn: 194346	2013-11-10 04:13:31 +00:00
Matt Arsenault	4fcb58f7fb	Teach MergeFunctions about address spaces llvm-svn: 194342	2013-11-10 01:44:37 +00:00
Hal Finkel	b485e0c406	Remove dead code from LoopUnswitch LoopUnswitch's code simplification routine has logic to convert conditional branches into unconditional branches, after unswitching makes the condition constant, and then remove any blocks that renders dead. Unfortunately, this code is dead, currently broken, and furthermore, has never been alive (at least as far back at 2006). No functionality change intended. llvm-svn: 194277	2013-11-08 19:58:21 +00:00
Michael Gottesman	04ff2468f0	[objc-arc] Convert the one directional retain/release relation assert to a conditional check + fail. Due to the previously added overflow checks, we can have a retain/release relation that is one directional. This occurs specifically when we run into an additive overflow causing us to drop state in only one direction. If that occurs, we should bail and not optimize that retain/release instead of asserting. Apologies for the size of the testcase. It is necessary to cause the additive cfg overflow to trigger. rdar://15377890 llvm-svn: 194083	2013-11-05 16:02:40 +00:00
Hal Finkel	26ff1b7453	Add a runtime unrolling parameter to the LoopUnroll pass constructor As with the other loop unrolling parameters (the unrolling threshold, partial unrolling, etc.) runtime unrolling can now also be controlled via the constructor. This will be necessary for moving non-trivial unrolling late in the pass manager (after loop vectorization). No functionality change intended. llvm-svn: 194027	2013-11-05 00:08:03 +00:00
Shuxin Yang	08d2dc8405	Remove dead code llvm-svn: 194017	2013-11-04 21:44:01 +00:00
Benjamin Kramer	9eaaead296	SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a strict weak ordering. STL debug mode checks this. llvm-svn: 194015	2013-11-04 21:34:55 +00:00
Matt Arsenault	1f521e921d	Scalarize select vector arguments when extracted. When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. llvm-svn: 194013	2013-11-04 20:36:06 +00:00
Benjamin Kramer	15ebc47438	SLPVectorizer: Add a missing pair of parens. No functionality change. llvm-svn: 193958	2013-11-03 12:54:32 +00:00
Benjamin Kramer	f45bcf5480	SLPVectorizer: When CSEing generated gathers only scan blocks containing them. Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. llvm-svn: 193956	2013-11-03 12:27:52 +00:00
David Majnemer	2bbbbaaeee	Revert "Inliner: Handle readonly attribute per argument when adding memcpy" This reverts commit r193356, it caused PR17781. A reduced test case covering this regression has been added to the test suite. llvm-svn: 193955	2013-11-03 12:22:13 +00:00
David Majnemer	fbb2338856	Spell "Actual" correctly llvm-svn: 193954	2013-11-03 11:09:39 +00:00
Bob Wilson	eb7943100b	Convert calls to __sinpi and __cospi into __sincospi_stret This adds an SimplifyLibCalls case which converts the special __sinpi and __cospi (float & double variants) into a __sincospi_stret where appropriate to remove duplicated work. Patch by Tim Northover llvm-svn: 193943	2013-11-03 06:48:38 +00:00
Benjamin Kramer	ae919396b6	SLPVectorizer: Remove duplicated function. llvm-svn: 193927	2013-11-02 14:46:27 +00:00
Benjamin Kramer	abc7baa1dc	LoopVectorize: Remove quadratic behavior the local CSE. Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. llvm-svn: 193926	2013-11-02 13:39:00 +00:00
Arnold Schwaighofer	fed0c4f8e8	LoopVectorizer: Move cse code into its own function llvm-svn: 193895	2013-11-01 23:28:54 +00:00
Arnold Schwaighofer	fba1c74b67	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 llvm-svn: 193891	2013-11-01 22:18:19 +00:00
Benjamin Kramer	3045156cee	LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). llvm-svn: 193860	2013-11-01 14:09:50 +00:00
Arnold Schwaighofer	5d7be45165	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 llvm-svn: 193853	2013-11-01 03:05:07 +00:00
Arnold Schwaighofer	fe8e481ef6	LoopVectorizer: Clear all member data structures in RuntimeCheck.reset() Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcomming change. llvm-svn: 193852	2013-11-01 03:05:04 +00:00
Manman Ren	bba5655c39	Do not convert "call asm" to "invoke asm" in Inliner. Given that backend does not handle "invoke asm" correctly ("invoke asm" will be handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right setup for LPadToCallSiteMap) and we already made the assumption that inline asm does not throw in InstCombiner::visitCallSite, we are going to make the same assumption in Inliner to make sure we don't convert "call asm" to "invoke asm". If it becomes necessary to add support for "invoke asm" later on, we will need to modify the backend as well as remove the assumptions that inline asm does not throw. Fix rdar://15317907 llvm-svn: 193808	2013-10-31 21:56:03 +00:00
Rafael Espindola	24353f2de2	Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when not doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800	2013-10-31 20:51:58 +00:00
Rafael Espindola	afc61d382c	Merge CallGraph and BasicCallGraph. llvm-svn: 193734	2013-10-31 03:03:55 +00:00
Matt Arsenault	7af7ffc005	Teach scalarrepl about address spaces llvm-svn: 193720	2013-10-30 22:54:58 +00:00
Matt Arsenault	f608b07c6f	Fix GVN creating bitcast between address spaces llvm-svn: 193710	2013-10-30 19:05:41 +00:00
Arnold Schwaighofer	fe80e563da	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Arnold Schwaighofer	6f22639253	SLPVectorizer: Use vector type for vectorized memory operations No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572	2013-10-29 01:33:50 +00:00
Shuxin Yang	eb29e658a0	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops. llvm-svn: 193489	2013-10-27 03:08:44 +00:00
Wan Xiaofei	f3100f24fa	Quick look-up for block in loop. This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460	2013-10-26 03:08:02 +00:00

1 2 3 4 5 ...

11109 Commits