llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Max Kazantsev	f0c38d90c7	[SCEV][NFC] Introduces expression sizes estimation This patch introduces the field `ExpressionSize` in SCEV. This field is calculated only once on SCEV creation, and it represents the complexity of this SCEV from arithmetical point of view (not from the point of the number of actual different SCEV nodes that are used in the expression). Roughly saying, it is the number of operands and operations symbols when we print this SCEV. A formal definition is following: if SCEV `X` has operands `Op1`, `Op2`, ..., `OpN`, then Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN). Size of SCEVConstant and SCEVUnknown is one. Expression size may be used as a universal way to limit SCEV transformations for huge SCEVs. Currently, we have a bunch of options that represents various limits (such as recursion depth limit) that may not make any sense from the point of view of a LLVM users who is not familiar with SCEV internals, and all these different options pursue one goal. A more general rule that may potentially allow us to get rid of this redundancy in options is "do not make transformations with SCEVs of huge size". It can apply to all SCEV traversals and transformations that may need to visit a SCEV node more than once, hence they are prone to combinatorial explosions. This patch only introduces SCEV sizes calculation as NFC, its utilization will be introduced in follow-up patches. Differential Revision: https://reviews.llvm.org/D35989 Reviewed By: reames llvm-svn: 351725	2019-01-21 06:19:50 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Teresa Johnson	e8cb930dbb	Revert "[ThinLTO] Add summary entries for index-based WPD" Mistaken commit of something still under review! This reverts commit r351453. llvm-svn: 351455	2019-01-17 16:05:04 +00:00
Teresa Johnson	5b4ceacd72	[ThinLTO] Add summary entries for index-based WPD Summary: If LTOUnit splitting is disabled, the module summary analysis computes the summary information necessary to perform single implementation devirtualization during the thin link with the index and no IR. The information collected from the regular LTO IR in the current hybrid WPD algorithm is summarized, including: 1) For vtable definitions, record the function pointers and their offset within the vtable initializer (subsumes the information collected from IR by tryFindVirtualCallTargets). 2) A record for each type metadata summarizing the vtable definitions decorated with that metadata (subsumes the TypeIdentiferMap collected from IR). Also added are the necessary bitcode records, and the corresponding assembly support. The index-based WPD will be sent as a follow-on. Depends on D53890. Reviewers: pcc Subscribers: mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54815 llvm-svn: 351453	2019-01-17 15:49:03 +00:00
Tom Stellard	62cb21da04	Only promote args when function attributes are compatible Summary: Check to make sure that the caller and the callee have compatible function arguments before promoting arguments. This uses the same TargetTransformInfo queries that are used to determine if attributes are compatible for inlining. The goal here is to avoid breaking ABI when a called function's ABI depends on a target feature that is not enabled in the caller. This is a very conservative fix for PR37358. Ideally we would have a more sophisticated check for ABI compatiblity rather than checking if the attributes are compatible for inlining. Reviewers: echristo, chandlerc, eli.friedman, craig.topper Reviewed By: echristo, chandlerc Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits Differential Revision: https://reviews.llvm.org/D53554 llvm-svn: 351296	2019-01-16 05:15:31 +00:00
Nikita Popov	3247e0488e	Reapply "[DemandedBits] Use SetVector for Worklist" DemandedBits currently uses a simple vector for the worklist, which means that instructions may be inserted multiple times into it. Especially in combination with the deep lattice, this may cause instructions too be recomputed very often. To avoid this, switch to a SetVector. Reapplying with a smaller number of inline elements in the SmallSetVector, to avoid running into the SmallDenseMap issue described in D56455. Differential Revision: https://reviews.llvm.org/D56362 llvm-svn: 350997	2019-01-12 09:09:15 +00:00
Nikita Popov	8654837372	[ConstantFolding] Fold undef for integer intrinsics This fixes https://bugs.llvm.org/show_bug.cgi?id=40110. This implements handling of undef operands for integer intrinsics in ConstantFolding, in particular for the bitcounting intrinsics (ctpop, cttz, ctlz), the with.overflow intrinsics, the saturating math intrinsics and the funnel shift intrinsics. The undef behavior follows what InstSimplify does for the general cas e of non-constant operands. For the bitcount intrinsics (where InstSimplify doesn't do undef handling -- there cannot be a combination of an undef + non-constant operand) I'm using a 0 result if the intrinsic is defined for zero and undef otherwise. Differential Revision: https://reviews.llvm.org/D55950 llvm-svn: 350971	2019-01-11 21:18:00 +00:00
Teresa Johnson	52c19ef944	[LTO] Record whether LTOUnit splitting is enabled in index Summary: Records in the module summary index whether the bitcode was compiled with the option necessary to enable splitting the LTO unit (e.g. -fsanitize=cfi, -fwhole-program-vtables, or -fsplit-lto-unit). The information is passed down to the ModuleSummaryIndex builder via a new module flag "EnableSplitLTOUnit", which is propagated onto a flag on the summary index. This is then used during the LTO link to check whether all linked summaries were built with the same value of this flag. If not, an error is issued when we detect a situation requiring whole program visibility of the class hierarchy. This is the case when both of the following conditions are met: 1) We are performing LowerTypeTests or Whole Program Devirtualization. 2) There are type tests or type checked loads in the code. Note I have also changed the ThinLTOBitcodeWriter to also gate the module splitting on the value of this flag. Reviewers: pcc Subscribers: ormris, mehdi_amini, Prazek, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D53890 llvm-svn: 350948	2019-01-11 18:31:57 +00:00
Alina Sbirlea	fe33c8f598	[MemorySSA] Disable checkClobberSanity for SkipSelfWalker. Sanity will fail for this, since we're exploring getting a clobber further than the sanity check expects. Ideally we need to teach the sanity check to differentiate between the two walkers based on the SkipSelf bool in the query. llvm-svn: 350895	2019-01-10 21:47:15 +00:00
Easwaran Raman	a5daaaae35	Refactor synthetic profile count computation. NFC. Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755	2019-01-09 20:10:27 +00:00
Easwaran Raman	372b240afa	[Inliner] Assert that the computed inline threshold is non-negative. Reviewers: chandlerc Subscribers: haicheng, llvm-commits Differential Revision: https://reviews.llvm.org/D56409 llvm-svn: 350751	2019-01-09 19:26:17 +00:00
David Callahan	cd641e1e98	refactor BlockFrequencyInfo::view to take a title parameter Summary: All a non-default title for the debugging this debugging aide Reviewers: twoh, Kader, modocache Reviewed By: twoh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56499 llvm-svn: 350749	2019-01-09 19:12:38 +00:00
Max Kazantsev	523ec9ed09	[IPT] Drop cache less eagerly in GVN and LoopSafetyInfo Current strategy of dropping `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal strictures: `OrderedInstructions` that is needed to be invalidated whenever the contents changes, and the map with first special instructions in block. This second map does not need an update if we add/remove a non-special instuction because it cannot affect the contents of this map. This patch changes API of `InstructionPrecedenceTracking` so that it now accounts for reasons under which we invalidate blocks. This should lead to much less recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions. Differential Revision: https://reviews.llvm.org/D54462 Reviewed By: efriedma llvm-svn: 350694	2019-01-09 07:28:13 +00:00
Philip Pfaffe	c5802c17b7	[DA][NewPM] Add a printerpass and port the testsuite The new-pm version of DA is untested. Testing requires a printer, so add that and use it in the existing DA tests. Differential Revision: https://reviews.llvm.org/D56386 llvm-svn: 350624	2019-01-08 14:06:58 +00:00
Alina Sbirlea	608c9998e4	[MemorySSA] Add SkipSelfWalker. Summary: Add implementation of SkipSelfWalker. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D56285 llvm-svn: 350561	2019-01-07 19:38:47 +00:00
Alina Sbirlea	6f9fb7e80b	[MemorySSA] Refactor CachingWalker. Summary: Refactor caching walker to make creating a walker that skips the starting access strightforward. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D55957 llvm-svn: 350558	2019-01-07 19:22:37 +00:00
Alina Sbirlea	6c3810e67a	[MemorySSA] Extend the clobber walker with the option to skip the starting access. Summary: The option enables loop transformations to hoist accesses that do not have clobbers in the loop. If the clobber queries skips the starting access, the result may be outside the loop instead of the header Phi. Adding the walker that uses this option in a separate patch. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D55944 llvm-svn: 350551	2019-01-07 18:40:27 +00:00
Nikita Popov	a26e9aae55	Revert "[DemandedBits] Use SetVector for Worklist" This reverts commit r350547. Seeing assertion failures on clang tests. llvm-svn: 350549	2019-01-07 18:15:11 +00:00
Nikita Popov	cef1cdf524	[DemandedBits] Use SetVector for Worklist DemandedBits currently uses a simple vector for the worklist, which means that instructions may be inserted multiple times into it. Especially in combination with the deep lattice, this may cause instructions too be recomputed very often. To avoid this, switch to a SetVector. Differential Revision: https://reviews.llvm.org/D56362 llvm-svn: 350547	2019-01-07 18:03:36 +00:00
Chandler Carruth	5954d00ef2	[CallSite removal] Port `IndirectCallSiteVisitor` to use `CallBase` and update client code. Also rename it to use the more generic term `call` instead of something that could be confused with a praticular type. Differential Revision: https://reviews.llvm.org/D56183 llvm-svn: 350508	2019-01-07 07:15:51 +00:00
Chandler Carruth	cb1f5addb7	[CallSite removal] Migrate all Alias Analysis APIs to use the newly minted `CallBase` class instead of the `CallSite` wrapper. This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation. Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion. Thanks for the review from Saleem! Differential Revision: https://reviews.llvm.org/D55641 llvm-svn: 350503	2019-01-07 05:42:51 +00:00
Nikita Popov	61d02a0896	[BDCE] Remove dead uses of arguments In addition to finding dead uses of instructions, also find dead uses of function arguments, and replace them with zero as well. I'm changing the way the known bits are computed here to remove the coupling between the transfer function and the algorithm. It previously relied on the first op being visited first and computing known bits -- unless the first op is not an instruction, in which case they're computed on the second op. I could have adjusted this to check for "instruction or argument", but I think it's better to avoid the repeated calculation with an explicit flag. Differential Revision: https://reviews.llvm.org/D56247 llvm-svn: 350435	2019-01-04 21:21:43 +00:00
Florian Hahn	1b9728f62a	[ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset GetPointerBaseWithConstantOffset include this code, where ByteOffset and GEPOffset are both of type llvm::APInt : ByteOffset += GEPOffset.getSExtValue(); The problem with this line is that getSExtValue() returns an int64_t, but the += matches an overload for uint64_t. The problem is that the resulting APInt is no longer considered to be signed. That in turn causes assertion failures later on if the relevant pointer type is > 64 bits in width and the GEPOffset was negative. Changing it to ByteOffset += GEPOffset.sextOrTrunc(ByteOffset.getBitWidth()); resolves the issue and explicitly performs the sign-extending or truncation. Additionally, instead of asserting later if the result is > 64 bits, it breaks out of the loop in that case. See also https://reviews.llvm.org/D24729 https://reviews.llvm.org/D24772 This commit must be merged after D38662 in order for the test to pass. Patch by Michael Ferguson <mpfergu@gmail.com>. Reviewers: reames, sanjoy, hfinkel Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D38501 llvm-svn: 350395	2019-01-04 14:53:22 +00:00
Simon Pilgrim	6db96180a2	[SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable (PR40123) Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics llvm-svn: 350300	2019-01-03 12:18:23 +00:00
Hal Finkel	88d49a9216	[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug) Motivated by the discussion in D38499, this patch updates BasicAA to support arbitrary pointer sizes by switching most remaining non-APInt calculations to use APInt. The size of these APInts is set to the maximum pointer size (maximum over all address spaces described by the data layout string). Most of this translation is straightforward, but this patch contains a fix for a bug that revealed itself during this translation process. In order for test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit pointers, the intermediate calculations must be performed using 64-bit integers. This is because, as noted in the patch, when GetLinearExpression decomposes an expression into C1V+C2, and we then multiply this by Scale, and distribute, to get (C1Scale)V + C2Scale, it can be the case that, even through C1V+C2 does not overflow for relevant values of V, (C2Scale) can overflow. If this happens, later logic will draw invalid conclusions from the (base) offset value. Thus, when initially applying the APInt conversion, because the maximum pointer size in this test is 32 bits, it started failing. Suspicious, I created a 64-bit version of this test (included here), and that failed (miscompiled) on trunk for a similar reason (the multiplication can overflow). After fixing this overflow bug, the first test case (at least) in Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was relying on having 64-bit intermediate values to have BasicAA return an accurate result. In order to fix this problem, and because I believe that it is not uncommon to use i64 indexing expressions in 32-bit code (especially portable code using int64_t), it seems reasonable to always use at least 64-bit integers. In this way, we won't regress our analysis capabilities (and there's a command-line option added, so experimenting with this should be easy). As pointed out by Eli during the review, there are other potential overflow conditions that this patch does not address. Fixing those is left to follow-up work. Patch by me with contributions from Michael Ferguson (mferguson@cray.com). Differential Revision: https://reviews.llvm.org/D38662 llvm-svn: 350220	2019-01-02 16:28:09 +00:00
Nikita Popov	a3f725f6cd	Reapply "[BDCE][DemandedBits] Detect dead uses of undead instructions" This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The previous attempt to land this lead to miscompiles, because cases where uses were initially dead but were later found to be live during further analysis were not always correctly removed from the DeadUses set. This is fixed now and the added test case demanstrates such an instance. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 350188	2019-01-01 10:05:26 +00:00
George Burgess IV	11021e590b	[MemoryLocation] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. This one sadly isn't super small, but all of the changes here are either to: - libfuncs that are passed a constant size (memcpy, memset, ...) - instructions that store/load a constant size So they have to be precise llvm-svn: 350017	2018-12-23 03:36:44 +00:00
George Burgess IV	5c64c0f1d1	[Loads] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. This tries to find literal loads/stores of the given type, so this has to be precise. llvm-svn: 350016	2018-12-23 03:10:56 +00:00
George Burgess IV	304a7e4519	[Lint] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350015	2018-12-23 02:50:08 +00:00
George Burgess IV	584e0434bb	[AAEval] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350014	2018-12-23 02:39:58 +00:00
George Burgess IV	283d33e193	[Analysis] More LocationSize cleanup; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350008	2018-12-22 18:23:21 +00:00
Vedant Kumar	0705274c67	[IR] Add Instruction::isLifetimeStartOrEnd, NFC Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964	2018-12-21 21:49:40 +00:00
Reid Kleckner	d851eb745f	[BasicAA] Fix AA bug on dynamic allocas and stackrestore Summary: BasicAA has special logic for unescaped allocas, which normally applies equally well to dynamic and static allocas. However, llvm.stackrestore has the power to end the lifetime of dynamic allocas, without referring to them directly. stackrestore is already marked with the most conservative memory modification attributes, but because the alloca is not escaped, the normal logic produces incorrect results. I think BasicAA needs a special case here to teach it about the relationship between dynamic allocas and stackrestore. Fixes PR40118 Reviewers: gbiv, efriedma, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55969 llvm-svn: 349945	2018-12-21 19:59:03 +00:00
Florian Hahn	800a38a203	[LAA] Avoid generating RT checks for known deps preventing vectorization. If we found unsafe dependences other than 'unknown', we already know at compile time that they are unsafe and the runtime checks should always fail. So we can avoid generating them in those cases. This should have no negative impact on performance as the runtime checks that would be created previously should always fail. As a sanity check, I measured the test-suite, spec2k and spec2k6 and there were no regressions. Reviewers: Ayal, anemet, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D55798 llvm-svn: 349794	2018-12-20 18:49:09 +00:00
Clement Courbet	af5c2af0bc	[NFC] Fix trailing comma after function. lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732	2018-12-20 09:20:07 +00:00
Michael Kruse	a9ec994652	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725	2018-12-20 04:58:07 +00:00
Nikita Popov	045fafaf88	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions" This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684	2018-12-19 22:09:02 +00:00
Nikita Popov	1da83b4649	[BDCE][DemandedBits] Detect dead uses of undead instructions This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits or bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674	2018-12-19 19:56:21 +00:00
Sanjay Patel	eb5c885318	[ValueTracking] remove unused parameters from helper functions; NFC llvm-svn: 349641	2018-12-19 16:49:18 +00:00
Florian Hahn	64ec6d7f40	[LAA] Introduce enum for vectorization safety status (NFC). This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556	2018-12-18 22:25:11 +00:00
Pete Cooper	f6c9646a94	Change the objc ARC optimizer to use the new objc.* intrinsics We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534	2018-12-18 20:32:49 +00:00
Artur Pilipenko	5b58b6cebf	[CaptureTracking] Pass MaxUsesToExplore from wrappers to the actual implementation This is a follow up for rL347910. In the original patch I somehow forgot to pass the limit from wrappers to the function which actually does the job. llvm-svn: 349438	2018-12-18 03:32:33 +00:00
Nikita Popov	6d62833353	[InstSimplify] Simplify saturating add/sub + icmp If a saturating add/sub has one constant operand, then we can determine the possible range of outputs it can produce, and simplify an icmp comparison based on that. The implementation is based on a similar existing mechanism for simplifying binary operator + icmps. Differential Revision: https://reviews.llvm.org/D55735 llvm-svn: 349369	2018-12-17 17:45:18 +00:00
Wei Mi	a323ae1555	[SampleFDO] handle ProfileSampleAccurate when initializing function entry count ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088	2018-12-13 21:51:42 +00:00
Easwaran Raman	8482e48d19	[ThinLTO] Compute synthetic function entry count Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076	2018-12-13 19:54:27 +00:00
David L. Jones	f8dd1a7265	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail" This revision caused trucated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002	2018-12-13 03:15:11 +00:00
Michael Kruse	f48207fec4	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944	2018-12-12 17:32:52 +00:00
Wei Mi	25453a0385	[SampleFDO] Extend profile-sample-accurate option to cover isFunctionColdInCallGraph For SampleFDO, when a callsite doesn't appear in the profile, it will not be marked as cold callsite unless the option -profile-sample-accurate is specified. But profile-sample-accurate doesn't cover function isFunctionColdInCallGraph which is used to decide whether a function should be put into text.unlikely section, so even if the user knows the profile is accurate and specifies profile-sample-accurate, those functions not appearing in the sample profile are still not be put into text.unlikely section right now. The patch fixes that. Differential Revision: https://reviews.llvm.org/D55567 llvm-svn: 348940	2018-12-12 17:09:27 +00:00
Nikita Popov	acc9737e1d	[ConstantFolding] Handle leading zero-size elements in load folding Struct types may have leading zero-size elements like [0 x i32], in which case the "real" element at offset 0 will not necessarily coincide with the 0th element of the aggregate. ConstantFoldLoadThroughBitcast() wants to drill down the element at offset 0, but currently always picks the 0th aggregate element to do so. This patch changes the code to find the first non-zero-size element instead, for the struct case. The motivation behind this change is https://github.com/rust-lang/rust/issues/48627. Rust is fond of emitting [0 x iN] separators between struct elements to enforce alignment, which prevents constant folding in this particular case. The additional tests with [4294967295 x [0 x i32]] check that we don't end up unnecessarily looping over a large number of zero-size elements of a zero-size array. Differential Revision: https://reviews.llvm.org/D55169 llvm-svn: 348895	2018-12-11 20:29:16 +00:00
Fedor Sergeev	5a4198704c	[NewPM] fixing asserts on deleted loop in -print-after-all IR-printing AfterPass instrumentation might be called on a loop that has just been invalidated. We should skip printing it to avoid spurious asserts. Reviewed By: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D54740 llvm-svn: 348887	2018-12-11 19:05:35 +00:00

1 2 3 4 5 ...

8326 Commits