llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Craig Topper	eeedbd9928	[X86][CostModel] Update costs for vector truncate with avx512f/avx512bw. All avx512 truncate instructions except vXi64->vXi32 are 2 uops on port 5. So raise their costs to 2. Except when we have an earlier faster sequence like pshufb for 128 bit input vectors. Add a lower cost of 3 for v16i16->v16i8 with avx512f where we can extend to v16i32 then truncate. And a cost of 2 for avx512bw with and without avx512vl. There we can use vpmovwb with either a ymm or zmm input. Both of these beat masking, splitting, and using packuswb which is our avx/avx2 codegen.	2020-04-27 12:00:24 -07:00
Davide Italiano	10a0f42b8f	[GlobalISel] Remove debug locations when emitting constants. The tl;dr story is that this causes jumps in the emitted line tables, even at `-O0`. We could at some point consider more fancy solutions to preserve locations, but it doesn't seem to be worth the effort for now. <rdar://problem/62460788> Differential Revision: https://reviews.llvm.org/D78947	2020-04-27 11:27:08 -07:00
Stefan Pintilie	8344cff456	[PowerPC][Future] Implement PC Relative Tail Calls Tail Calls were initially disabled for PC Relative code because it was not safe to make certain assumptions about the tail calls (namely that all compiled functions no longer used the TOC pointer in R2). However, once all of the TOC pointer references have been removed it is safe to tail call everything that was tail called prior to the PC relative additions as well as a number of new cases. For example, it is now possible to tail call indirect functions as there is no need to save and restore the TOC pointer for indirect functions if the caller is marked as may clobber R2 (st_other=1). For the same reason it is now also possible to tail call functions that are external. Differential Revision: https://reviews.llvm.org/D77788	2020-04-27 12:55:08 -05:00
Lang Hames	59e4d993a0	[JITLink] Fix endianness bug fedd32e2fa36. The ByteSwap_NN functions return their result rather than modifying their argument in-place, so we need to write the result back to CPUType here.	2020-04-27 10:40:11 -07:00
Fangrui Song	4078ba50cb	Reland D78945 TarWriter: Only use 137 of the 155 prefix bytes. With a fix to unittests/Support/TarWriterTest.cpp. This makes lld's --reproduce output more compatible with tar 1.13 and before. This is a very old version of tar, but it's the version in both gnuwin and unxutils, and the cost of supporting them is very low, so we might as well just do that. https://bugs.chromium.org/p/chromium/issues/detail?id=1073524#c21 and onward has more details. Differential Revision: https://reviews.llvm.org/D78945	2020-04-27 10:37:23 -07:00
Craig Topper	bbac985873	[X86][CostModel] Improve costs for fp_to_uint/fp_to_sint for vXi8/vXi16/v2i32 results. Differential Revision: https://reviews.llvm.org/D78893	2020-04-27 10:35:15 -07:00
Nico Weber	d5ca816ff8	Revert "TarWriter: Only use 137 of the 155 prefix bytes." This reverts commit 90d6ed144c1352e393556a799e79da6ec3a5fab9. Breaks check-llvm. Revert while I investigate.	2020-04-27 13:34:04 -04:00
Nico Weber	4cd45015fc	TarWriter: Only use 137 of the 155 prefix bytes. This makes lld's --reproduce output more compatible with tar 1.13 and before. This is a very old version of tar, but it's the version in both gnuwin and unxutils, and the cost of supporting them is very low, so we might as well just do that. https://bugs.chromium.org/p/chromium/issues/detail?id=1073524#c21 and onward has more details. Differential Revision: https://reviews.llvm.org/D78945	2020-04-27 13:15:22 -04:00
Fangrui Song	d163afe84b	[llvm-objdump] Print target address with evaluateMemoryOperandAddress() D63847 added `MCInstrAnalysis::evaluateMemoryOperandAddress()`. This patch leverages the feature to print the target addresses for evaluable instructions. ``` -400a: movl 4080(%rip), %eax +400a: movl 4080(%rip), %eax # 5000 <data1> ``` This patch also deletes `MIA->isCall(Inst) \|\| MIA->isUnconditionalBranch(Inst) \|\| MIA->isConditionalBranch(Inst)` which is used to guard `MCInstrAnalysis::evaluateBranch()` Reviewed By: jhenderson, skan Differential Revision: https://reviews.llvm.org/D78776	2020-04-27 09:43:51 -07:00
Mircea Trofin	7eeb2e373d	[llvm][NFC] Add an explicit 'ComputeFullInlineCost' API Summary: Added getInliningCostEstimate, which is essentially what getInlineCost computes if passed default inlining params, and non-null ORE or InlineParams::ComputeFullInlineCost. Reviewers: davidxl, eraman, jdoerfert Subscribers: hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78730	2020-04-27 09:11:45 -07:00
Jay Foad	e7d2959d40	[AMDGPU] Remove odd blank line in debug output.	2020-04-27 17:10:36 +01:00
Wei Mi	aa6bebc786	[ProfileSummary] Add partial profile annotation on IR. Profile and profile summary are usually read only once and then annotated on IR. The profile summary metadata on IR should include the value of the newly added partial profile flag, so that compilation phases like thinlto postlink can get the full set of profile information. Differential Revision: https://reviews.llvm.org/D78310	2020-04-27 08:34:15 -07:00
David Sherwood	f81af32255	[CodeGen] Use SPLAT_VECTOR for zeroinitialiser with scalable types Summary: When generating code for the LLVM IR zeroinitialiser operation, if the vector type is scalable we should be using SPLAT_VECTOR instead of BUILD_VECTOR. Differential Revision: https://reviews.llvm.org/D78636	2020-04-27 15:57:59 +01:00
David Green	219eaf384f	[ARM] Allow fma in tail predicated loops There are some intrinsics like this that currently block tail predication, but should be fine. This allows fma through, as the one that I ran into. There may be others that need the same treatment but I've only done this one here. Differential Revision: https://reviews.llvm.org/D78385	2020-04-27 15:32:47 +01:00
Simon Pilgrim	e2ba225180	[X86][SSE] getFauxShuffle - account for PEXTW/PEXTB implicit zero-extension The insert(truncate/extend(extract(vec0,c0)),vec1,c1) case in rGacbc5ede99 wasn't combining the 'mineltsize' with the src vector elt size which may be smaller due to implicit extension during extraction. Reduced from test case provided by @mstorsjo	2020-04-27 12:46:50 +01:00
Sameer Sahasrabuddhe	7d23e63fa4	[NFC] UnifyLoopExits: correctly skip expensive checks	2020-04-27 15:10:35 +05:30
David Green	9780cd1185	[ARM] Replace hasNoSchedulingInfo with UnsupportedFeatures in the A57 schedule hasNoSchedulingInfo should be used for Pseudo's and other instructions that are never expected to be scheduled. This removes the flag from new ARM instructions, instead fixing the A57 schedule by marking the related architecture features as unsupported.	2020-04-27 10:13:29 +01:00
David Green	4e339508d3	[ARM] Only produce qadd8b under hasV6Ops When compiling for an arm5te cpu from clang, the +dsp attribute is set. This meant we could try to generate qadd8 instructions where we would end up having no pattern. I've changed the condition here to be hasV6Ops && hasDSP, which is what other parts of ARMISelLowering seem to use for similar instructions. Fixed PR45677. Differential Revision: https://reviews.llvm.org/D78877	2020-04-27 10:13:29 +01:00
QingShan Zhang	31c511276c	[NFC][DAGCombine] Adding three helper functions and change the getNegatedExpression to negateExpression This is an NFC patch for D77319. The idea is to hide getNegatibleCost inside getNegatedExpression() so that it returns null if the cost is too high, and to add some helper functions for ease of use. The old getNegatedExpression is renamed to negateExpression to avoid the semantic conflict. Reviewed By: RKSimon Differential revision: https://reviews.llvm.org/D78291	2020-04-27 04:11:42 +00:00
Craig Topper	dda78f5b46	[X86] Add cost table entry for v2i32->v2f64 fp_to_uint with avx512. We're currently getting this from the default implementation. But I don't like how the cost model came to this answer and I might be making some changes there.	2020-04-26 19:59:01 -07:00
Fangrui Song	1cb8a0e346	[TableGen] Simplify with TGParser::consume()	2020-04-26 15:26:49 -07:00
Hongtao Yu	cfaa908f1f	[ViewCFG] Allow printing edge weights in debuggers Summary: Extending the Function::viewCFG prototypes to allow for printing block probability info in the form of .dot files during debugging. Also avoiding an AV when no BFI/BPI is available. Reviewers: wenlei, davidxl, knaumov Reviewed By: wenlei, davidxl Subscribers: MaskRay, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77978	2020-04-26 13:18:29 -07:00
Benjamin Kramer	53fc784e78	[IR] Since AttributeSets are sorted, binary search them. Not likely to make a big difference, but there's a fair bit of pointer chasing in large sets.	2020-04-26 20:15:41 +02:00
Ayal Zaks	020a9fcf72	[LV] Fix recording of BranchTakenCount for FoldTail When folding tail, branch taken count is computed during initial VPlan execution and recorded to be used by the compare computing the loop's mask. This recording should directly set the State, instead of reusing Value2VPValue mapping which serves original Values present prior to vectorization. The branch taken count may be a constant Value, which may be used elsewhere in the loop; trying to employ Value2VPValue for both leads to the issue reported in https://reviews.llvm.org/D76992#inline-721028 Differential Revision: https://reviews.llvm.org/D78847	2020-04-26 20:13:10 +03:00
Florian Hahn	0af1ac6753	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Benjamin Kramer	6c1cc25022	Sort EnumAttr so it matches Attribute::operator< This means AttrBuilder will always create a sorted set of attributes and we can skip the sorting step. Sorting attributes is surprisingly expensive, and I recently made it worse by making it use array_pod_sort.	2020-04-26 17:00:25 +02:00
Alexandre Ganea	0437c703af	Re-land [MC] Fix quadratic behavior in addPendingLabel This was discovered when compiling large unity/blob/jumbo files. Differential Revision: https://reviews.llvm.org/D78775	2020-04-26 10:39:42 -04:00
Simon Pilgrim	eb1b2706f9	[X86][SSE] getFauxShuffle - support insert(truncate/extend(extract(vec0,c0)),vec1,c1) shuffle patterns at the byte level Followup to the PR45604 fix at rGe71dd7c011a3 where we disabled most of these cases. By creating the shuffle at the byte level we can handle any extension/truncation as long as we track how small the scalar got and assume that the upper bytes will need to be zero.	2020-04-26 15:31:01 +01:00
Simon Pilgrim	b3eca34010	X86ISelDAGToDAG.cpp - remove unnecessary includes. NFC. The X86 specific headers have to include these so we don't need to duplicate.	2020-04-26 14:50:53 +01:00
Simon Pilgrim	0dcc1addd8	X86MCTargetDesc.h - remove unused DataType.h include. NFC.	2020-04-26 14:50:52 +01:00
Simon Pilgrim	74f61af1ac	X86MCTargetDesc.cpp - remove MSVC intrin.h include. NFC. This was needed when the file called cpuid but that was removed at rL233170.	2020-04-26 14:50:52 +01:00
Simon Pilgrim	502933b543	X86MacroFusion.h - reduce MachineScheduler.h include. NFC. We only need a ScheduleDAGMutation forward declaration.	2020-04-26 14:50:52 +01:00
Florian Hahn	2f937465a7	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Benjamin Kramer	c34f658055	raw_ostream: Simplify code a bit. NFCI.	2020-04-26 14:07:05 +02:00
Simon Pilgrim	e9e77991e4	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Simon Pilgrim	fbd71f44a9	X86Operand.h - remove unnecessary includes. NFC.	2020-04-26 12:12:22 +01:00
Simon Pilgrim	206a77113d	AMDGPU/Utils - cleanup include and forward declarations. NFC. Remove unused includes + forward declarations. Reduce unnecessary StringRef.h includes to StringRef forward declaration.	2020-04-26 12:12:21 +01:00
Benjamin Kramer	8d0e7ecb99	[IR] Simplify code to print string attributes a bit. NFC.	2020-04-26 13:06:50 +02:00
Fangrui Song	b8f75260a2	[TableGen] Delete unused Record::resolveReferencesTo() after D44478. NFC	2020-04-26 01:21:41 -07:00
Nikita Popov	1e2d4bd6f8	[CaptureTracking] Make MaxUsesToExplore cheaper (NFC) The change in D78624 had a noticeable negative compile-time impact. It seems that going through a function call for the MaxUsesToExplore default is fairly expensive, at least if LLVM is not built with LTO. This patch makes MaxUsesToExplore default to 0 and assigns the actual default in the implementation instead. This recovers most of the regression. Differential Revision: https://reviews.llvm.org/D78734	2020-04-26 09:54:15 +02:00
Nikita Popov	ed46b6656d	[GVN] Reduce expression size (NFC) Reduce size of GVN::Expression by reordering fields to reduce padding.	2020-04-26 09:43:35 +02:00
Nikita Popov	4c34305b53	[IR] Use map for string attributes (NFC) Attributes are currently stored as a simple list. Enum attributes additionally use a bitset to allow quickly determining whether an attribute is set. String attributes on the other hand require a full scan of the list. As functions tend to have a lot of string attributes (at least when clang is used), this is a noticeable performance issue. This patch adds an additional name => attribute map to the AttributeSetNode, which allows querying string attributes quickly. This results in a 3% reduction in instructions retired on CTMark. Changes to memory usage seem to be in the noise (attribute sets are uniqued, and we don't tend to have more than a few dozen or hundred unique attribute sets, so adding an extra map does not have a noticeable cost.) Differential Revision: https://reviews.llvm.org/D78859	2020-04-26 09:38:05 +02:00
Craig Topper	0c2165f695	[X86] Fix the cost of v16i1->v16i16 sext/zext on avx targets. Previously we were hitting the scalarization case in the default implementation.	2020-04-25 23:16:20 -07:00
Craig Topper	6e8ebb5707	[X86][CostModel] Improve costs for vXi1 sign_extend/zero_extend with avx512. With avx512 vXi1 is legal and uses k-registers with many custom cases for extending.	2020-04-25 23:16:20 -07:00
Fangrui Song	3abf162d09	[TableGen] Add TGParser::consume()	2020-04-25 21:58:54 -07:00
Chris Lattner	8c75627d4f	[SourceMgr] Tidy up the SourceMgr header file to include less stuff. Summary: Specifically make some simple refactorings to get PointerUnion.h and Twine.h out of the public includes. While here, trim out a lot of transitive includes as well. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78870	2020-04-25 21:18:59 -07:00
Fangrui Song	33d6800058	[TableGen] Drop deprecated leading # when parsing a SimpleValue	2020-04-25 16:27:40 -07:00
Fangrui Song	d192bcd830	[TableGen] Drop deprecated leading # operation (NOP) and replace ## with #	2020-04-25 16:26:45 -07:00
Craig Topper	64665f8080	[X86] Improve lowering of v16i8->v16i1 truncate under prefer-vector-width=256.	2020-04-25 15:20:33 -07:00
Chris Lattner	535019df4d	[SourceMgr/MLIR diagnostics] Introduce a new method to speed things up Summary: This introduces a new SourceMgr::FindLocForLineAndColumn method that uses the OffsetCache in SourceMgr::SrcBuffer to do a constant time lookup for the line number (once the cache is populated). Use this method in MLIR's SourceMgrDiagnosticHandler::convertLocToSMLoc, replacing the O(n) scanning logic. This resolves a long standing TODO in MLIR, and makes one of my use cases go dramatically faster (which is currently producing many diagnostics in a 40MB SourceBuffer). NFC, this is just a performance speedup and cleanup. Reviewers: rriddle!, ftynse! Subscribers: hiraditya, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78868	2020-04-25 14:06:44 -07:00
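
The last entry in the log (535019df4d) describes replacing an O(n) scan with a cached table of line offsets, so that a (line, column) pair maps to a buffer position in constant time once the cache is built. The Python sketch below illustrates that general idea only; it is not the LLVM implementation, and the names `LineOffsetCache` and `offset_for_line_and_column` are hypothetical, as is the choice of 1-based lines and columns.

```python
class LineOffsetCache:
    """Hypothetical helper: caches the newline offsets of a text buffer."""

    def __init__(self, text):
        self.text = text
        # One O(n) pass records the offset of every '\n' in the buffer.
        self.newline_offsets = [i for i, ch in enumerate(text) if ch == "\n"]

    def offset_for_line_and_column(self, line, column):
        """Map a 1-based (line, column) to a 0-based offset, or None if out of range."""
        if line < 1 or column < 1:
            return None
        if line > len(self.newline_offsets) + 1:
            return None
        # Line 1 starts at offset 0; line k starts right after newline k-1.
        start = 0 if line == 1 else self.newline_offsets[line - 2] + 1
        # The line ends at its own newline, or at the end of the buffer.
        if line - 1 < len(self.newline_offsets):
            end = self.newline_offsets[line - 1]
        else:
            end = len(self.text)
        offset = start + column - 1
        # Allow pointing at the terminating newline, but not past it.
        return offset if offset <= end else None


cache = LineOffsetCache("ab\ncde\n\nf")
print(cache.offset_for_line_and_column(2, 3))
```

After construction, each lookup is two list indexings and a bounds check, which is why repeated diagnostics against a large buffer no longer pay for a rescan.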

133843 Commits