Add llvm.call.preallocated.{setup,arg} intrinsics.
Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup.
Add "preallocated" parameter attribute, which is like byval but without the copy.
Verifier changes for these IR constructs.
See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md
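As a rough illustration of how the pieces fit together, here is a hedged IRBuilder sketch. The callee use_foo, its argument type FooTy, and the surrounding setup are invented for the example, and the preallocated parameter attribute that also belongs on the call-site argument (plus the attribute the arg intrinsic call needs) is omitted; see the linked doc for the exact rules.
```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Roughly emits:
//   %t = call token @llvm.call.preallocated.setup(i32 1)
//   %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0)
//   ... initialize the argument through %a ...
//   call void @use_foo(%Foo* %arg) [ "preallocated"(token %t) ]
// (attributes required by the verifier are left out of this sketch)
static void emitPreallocatedCall(IRBuilder<> &B, Module &M,
                                 FunctionCallee UseFoo, Type *FooTy) {
  Function *Setup =
      Intrinsic::getDeclaration(&M, Intrinsic::call_preallocated_setup);
  Function *ArgFn =
      Intrinsic::getDeclaration(&M, Intrinsic::call_preallocated_arg);

  // One preallocated argument for the eventual call.
  Value *Token = B.CreateCall(Setup, {B.getInt32(1)});
  // Pointer to argument slot 0, cast to the argument's type.
  Value *Slot = B.CreateCall(ArgFn, {Token, B.getInt32(0)});
  Value *TypedSlot = B.CreateBitCast(Slot, FooTy->getPointerTo());

  // The call consuming the preallocated space carries the token in a
  // "preallocated" operand bundle.
  B.CreateCall(UseFoo, {TypedSlot},
               {OperandBundleDef("preallocated", std::vector<Value *>{Token})});
}
```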
Subscribers: hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74651
The code assumed that zero-extending the integer constant to the
designated alloc size would be fine even for BE targets, but that's not
the case as that pulls in zeros from the MSB side while we actually
expect the padding zeros to go after the LSB.
I've changed the codepath handling the constant integers to use the
store size for both small (smaller than u64) and big constants, and then
add zero padding right after that.
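For illustration, a standalone sketch of the intended byte layout on a big-endian target (not the actual LLVM code path, just the invariant it now maintains):
```
#include <cassert>
#include <cstdint>
#include <vector>

// Serialize a constant integer into an allocation of AllocSize bytes for a
// big-endian target. Only StoreSize bytes of the value are meaningful; the
// remaining AllocSize - StoreSize bytes are zero padding placed *after* the
// value. A plain zero-extension to AllocSize would instead put the zeros in
// front of the value on a big-endian target, which is the bug described above.
std::vector<uint8_t> serializeBEConstant(uint64_t Value, unsigned StoreSize,
                                         unsigned AllocSize) {
  assert(StoreSize <= sizeof(Value) && StoreSize <= AllocSize);
  std::vector<uint8_t> Buf(AllocSize, 0);
  for (unsigned I = 0; I < StoreSize; ++I)
    // Big-endian: most significant byte of the store-sized value first.
    Buf[I] = uint8_t(Value >> (8 * (StoreSize - 1 - I)));
  // Bytes [StoreSize, AllocSize) stay zero: padding after the LSB.
  return Buf;
}
```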
Differential Revision: https://reviews.llvm.org/D78011
SmallVector currently uses 32bit integers for size and capacity to reduce
sizeof(SmallVector). This limits the number of elements to UINT32_MAX.
For a SmallVector<char>, this limits the SmallVector size to only 4GB.
Buffering bitcode output uses SmallVector<char>, but needs >4GB output.
This changes SmallVector size and capacity to conditionally use word-size
integers if the element type is small (<4 bytes). For larger element types,
the vector size can reach ~16GB with 32bit size.
Making this conditional on the element type provides both the smaller
sizeof(SmallVector) for larger types which are unlikely to grow so large,
and supports larger capacities for smaller element types.
This recommit fixes the same template being instantiated twice on platforms
where uintptr_t is the same as uint32_t.
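The selection can be expressed as a conditional type alias; a sketch of the idea (close in spirit to the change, but not guaranteed to match the in-tree code exactly):
```
#include <cstdint>
#include <type_traits>

// Pick the size/capacity type based on the element size: small elements
// (< 4 bytes) on 64-bit hosts get a word-sized size type so they can exceed
// UINT32_MAX elements; everything else keeps 32-bit size/capacity fields to
// keep sizeof(SmallVector) small.
template <class T>
using SmallVectorSizeType =
    typename std::conditional<sizeof(T) < 4 && sizeof(void *) >= 8, uint64_t,
                              uint32_t>::type;

static_assert(sizeof(void *) < 8 ||
                  std::is_same<SmallVectorSizeType<char>, uint64_t>::value,
              "char vectors get a 64-bit size type on 64-bit hosts");
static_assert(std::is_same<SmallVectorSizeType<uint64_t>, uint32_t>::value,
              "larger element types keep the 32-bit size type");
```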
We may want to identify sequences that are not
reductions, but still qualify as load-combines
in the back-end, so make most of the body a
helper function.
"unskipableSimplifyCode()" was added to handle unsafe BL8_NOTOC instruction
when TOC was not completely removed. The function is not needed after confirming
TOC pointer is not used in a function that uses PC-Relative addressing.
Differential Revision: https://reviews.llvm.org/D78517
like .cfi_restore"
Insert .cfi_offset/.cfi_register when IncomingCSRSaved of the current block
is larger than OutgoingCSRSaved of its previous block.
Original commit message:
https://reviews.llvm.org/D42848 only handled CFA related cfi directives but
didn't handle CSR related cfi. The patch adds the CSR part. Basically it reuses
the framework created in D42848. For each basic block, the patch tracks which
CSRs have been saved at its CFG predecessors' exits, and compares that CSR
set with the set at its previous basic block's exit (the previous block is the
block laid out before the current block). If the saved CSR set at the previous
basic block's exit is larger, .cfi_restore will be inserted.
The patch also generates proper .cfi_restore directives in the epilogue to make
sure the saved CSR set is consistent across the incoming edges of each block.
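As a standalone sketch of the set arithmetic described above (llvm::BitVector based, with invented helper names rather than the patch's actual code):
```
#include "llvm/ADT/BitVector.h"

using namespace llvm;

struct CSRFixup {
  BitVector NeedRestore; // CSRs to emit .cfi_restore for
  BitVector NeedSave;    // CSRs to emit .cfi_offset/.cfi_register for
};

// IncomingCSRSaved: CSRs saved on entry to the current block (intersection of
// its CFG predecessors' outgoing sets). PrevOutgoingCSRSaved: CSRs saved at
// the exit of the block laid out immediately before the current block. Both
// BitVectors index the same set of callee saved registers.
CSRFixup computeCSRFixups(const BitVector &IncomingCSRSaved,
                          const BitVector &PrevOutgoingCSRSaved) {
  CSRFixup Fix;
  // Saved at the previous block's exit but not expected here: restore them.
  Fix.NeedRestore = PrevOutgoingCSRSaved;
  Fix.NeedRestore.reset(IncomingCSRSaved);
  // Expected here but not saved at the previous block's exit: re-emit the
  // corresponding .cfi_offset/.cfi_register directives.
  Fix.NeedSave = IncomingCSRSaved;
  Fix.NeedSave.reset(PrevOutgoingCSRSaved);
  return Fix;
}
```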
Differential Revision: https://reviews.llvm.org/D74303
Summary:
Reviewing failures identified in D78586, I was finding the identifiers
for these iterators hard to read.
Reviewers: efriedma, MaskRay, jyknight
Reviewed By: MaskRay
Subscribers: hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78849
All avx512 truncate instructions except vXi64->vXi32 are 2 uops
on port 5, so raise their costs to 2, except when we have an
earlier, faster sequence like pshufb for 128-bit input vectors.
Add a lower cost of 3 for v16i16->v16i8 with avx512f, where we can
extend to v16i32 then truncate. And a cost of 2 for avx512bw with
and without avx512vl. There we can use vpmovwb with either a ymm
or zmm input. Both of these beat masking, splitting, and using
packuswb which is our avx/avx2 codegen.
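In the X86 TTI these decisions end up as type-conversion cost-table entries, roughly like the following (illustrative entries mirroring the costs described above, not a verbatim copy of the tables):
```
#include "llvm/CodeGen/CostTable.h"
#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/Support/MachineValueType.h"

using namespace llvm;

// {Opcode, Dst, Src, Cost}: the 2-uop port-5 truncates cost 2, except
// vXi64->vXi32; v16i16->v16i8 costs 3 with plain AVX512F (extend to v16i32,
// then truncate) and 2 with AVX512BW (vpmovwb on a ymm or zmm input).
static const TypeConversionCostTblEntry ExampleAVX512FTbl[] = {
    {ISD::TRUNCATE, MVT::v16i8, MVT::v16i16, 3},
    {ISD::TRUNCATE, MVT::v16i8, MVT::v16i32, 2},
    {ISD::TRUNCATE, MVT::v8i32, MVT::v8i64, 1},
};

static const TypeConversionCostTblEntry ExampleAVX512BWTbl[] = {
    {ISD::TRUNCATE, MVT::v16i8, MVT::v16i16, 2},
};
```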
The tl;dr story is that this causes jumps in the emitted line
tables, even at `-O0`. We could at some point consider more fancy
solutions to preserve locations, but it doesn't seem to be worth
the effort for now.
<rdar://problem/62460788>
Differential Revision: https://reviews.llvm.org/D78947
Tail Calls were initially disabled for PC Relative code because it was not safe
to make certain assumptions about the tail calls (namely that all compiled
functions no longer used the TOC pointer in R2). However, once all of the
TOC pointer references have been removed it is safe to tail call everything
that was tail called prior to the PC relative additions as well as a number of
new cases.
For example, it is now possible to tail call indirect functions as there is no
need to save and restore the TOC pointer for indirect functions if the caller
is marked as may clobber R2 (st_other=1). For the same reason it is now also
possible to tail call functions that are external.
Differential Revision: https://reviews.llvm.org/D77788
With a fix to unittests/Support/TarWriterTest.cpp
This makes lld's --reproduce output more compatible with tar 1.13 and
before. This is a very old version of tar, but it's the version in
both gnuwin and unxutils, and the cost of supporting them is very
low, so we might as well just do that.
https://bugs.chromium.org/p/chromium/issues/detail?id=1073524#c21
and onward has more details.
Differential Revision: https://reviews.llvm.org/D78945
D63847 added `MCInstrAnalysis::evaluateMemoryOperandAddress()`. This patch
leverages the feature to print the target addresses for evaluable instructions.
```
-400a: movl 4080(%rip), %eax
+400a: movl 4080(%rip), %eax # 5000 <data1>
```
This patch also deletes the `MIA->isCall(Inst) || MIA->isUnconditionalBranch(Inst) || MIA->isConditionalBranch(Inst)`
check that was used to guard `MCInstrAnalysis::evaluateBranch()`.
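A hedged sketch of how a disassembler loop might use the hook (the printing helper and its parameters are invented; symbolizing the address into the `<data1>`-style name is left out):
```
#include <cstdint>
#include "llvm/ADT/Optional.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Append "# <target address>" to the disassembly line when the instruction's
// memory operand can be folded to a constant address (e.g. rip-relative
// loads); Addr is the instruction's address and Size its byte length.
static void printMemOperandTarget(const MCInstrAnalysis &MIA,
                                  const MCInst &Inst, uint64_t Addr,
                                  uint64_t Size, raw_ostream &OS) {
  if (Optional<uint64_t> Target =
          MIA.evaluateMemoryOperandAddress(Inst, Addr, Size))
    OS << " # " << format_hex_no_prefix(*Target, 1);
}
```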
Reviewed By: jhenderson, skan
Differential Revision: https://reviews.llvm.org/D78776
Profile and profile summary are usually read only once and then annotated
on IR. The profile summary metadata on IR should include the value of the
newly added partial profile flag, so that compilation phases like the ThinLTO
postlink can get the full set of profile information.
Differential Revision: https://reviews.llvm.org/D78310
Summary:
When generating code for the LLVM IR zeroinitialiser operation, if
the vector type is scalable we should be using SPLAT_VECTOR instead
of BUILD_VECTOR.
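A minimal sketch of the lowering decision (node construction only; the surrounding legalization context and integer element types are assumed):
```
#include "llvm/CodeGen/SelectionDAG.h"

using namespace llvm;

// Build an all-zeros vector of type VT. A scalable vector has no fixed
// element count, so a BUILD_VECTOR of zeros cannot be formed; splat a single
// zero element with SPLAT_VECTOR instead.
static SDValue getZeroVector(SelectionDAG &DAG, const SDLoc &DL, EVT VT) {
  SDValue Zero = DAG.getConstant(0, DL, VT.getVectorElementType());
  if (VT.isScalableVector())
    return DAG.getNode(ISD::SPLAT_VECTOR, DL, VT, Zero);
  return DAG.getSplatBuildVector(VT, DL, Zero);
}
```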
Differential Revision: https://reviews.llvm.org/D78636
There are some intrinsics like this that currently block tail
predication, but should be fine. This allows fma through, as it is the one
that I ran into. There may be others that need the same treatment, but
I've only done this one here.
Differential Revision: https://reviews.llvm.org/D78385
The insert(truncate/extend(extract(vec0,c0)),vec1,c1) case in rGacbc5ede99 wasn't combining the 'mineltsize' with the src vector elt size, which may be smaller due to implicit extension during extraction.
Reduced from test case provided by @mstorsjo
hasNoSchedulingInfo should be used for Pseudos and other instructions
that are never expected to be scheduled. This removes the flag from new
ARM instructions, instead fixing the A57 schedule by marking the related
architecture features as unsupported.
When compiling for an armv5te CPU from clang, the +dsp attribute is set.
This meant we could try to generate qadd8 instructions where we would
end up having no pattern. I've changed the condition here to be hasV6Ops
&& hasDSP, which is what other parts of ARMISelLowering seem to use for
similar instructions.
Fixed PR45677.
Differential Revision: https://reviews.llvm.org/D78877
This is an NFC patch for D77319. The idea is to hide getNegatibleCost inside getNegatedExpression(),
have it return null if the cost is expensive, and add some helper functions for ease of use. It also
renames the old getNegatedExpression to negateExpression to avoid a semantic conflict.
Reviewed By: RKSimon
Differential revision: https://reviews.llvm.org/D78291
We're currently getting this from the default implementation. But
I don't like how the cost model came to this answer and I might
be making some changes there.
Summary:
Extending the Function::viewCFG prototypes to allow printing block probability info in the form of .dot files during debugging.
Also avoids an access violation (AV) when no BFI/BPI is available.
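For reference, a hedged usage sketch of the extended prototype (the overload shape and parameter order are assumptions based on the description above):
```
#include "llvm/Analysis/BlockFrequencyInfo.h"
#include "llvm/Analysis/BranchProbabilityInfo.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Dump F's CFG as a .dot file, annotated with frequency/probability info when
// the analyses are available. Passing null BFI/BPI should fall back to the
// plain CFG instead of crashing.
static void debugViewCFG(const Function &F, const BlockFrequencyInfo *BFI,
                         const BranchProbabilityInfo *BPI) {
  F.viewCFG(/*ViewCFGOnly=*/false, BFI, BPI);
}
```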
Reviewers: wenlei, davidxl, knaumov
Reviewed By: wenlei, davidxl
Subscribers: MaskRay, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77978
When folding tail, branch taken count is computed during initial VPlan execution
and recorded to be used by the compare computing the loop's mask. This recording
should directly set the State, instead of reusing the Value2VPValue mapping, which
serves original Values present prior to vectorization.
The branch taken count may be a constant Value, which may be used elsewhere in
the loop; trying to employ Value2VPValue for both leads to the issue reported in
https://reviews.llvm.org/D76992#inline-721028
Differential Revision: https://reviews.llvm.org/D78847
This means AttrBuilder will always create a sorted set of attributes and
we can skip the sorting step. Sorting attributes is surprisingly
expensive, and I recently made it worse by making it use array_pod_sort.
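The general pattern, as a standalone illustration (not the actual AttrBuilder code): keep the container sorted on insertion so consumers that need a sorted list never have to sort it.
```
#include <algorithm>
#include <vector>

// Minimal sorted-set sketch: insert in order with a binary search instead of
// appending and sorting later. The per-insert cost is O(log n) comparisons
// plus a shift, and the "produce a sorted list" step becomes free.
template <typename T> class SortedSet {
  std::vector<T> Storage;

public:
  void insert(const T &V) {
    auto It = std::lower_bound(Storage.begin(), Storage.end(), V);
    if (It == Storage.end() || *It != V) // it is a set: skip duplicates
      Storage.insert(It, V);
  }
  const std::vector<T> &items() const { return Storage; } // already sorted
};
```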
Followup to the PR45604 fix at rGe71dd7c011a3 where we disabled most of these cases.
By creating the shuffle at the byte level we can handle any extension/truncation as long as we track how small the scalar got and assume that the upper bytes will need to be zero.
Integer ranges can be used for loaded/stored values. Note that widening
can be disabled for loads/stores, as we only rely on instructions that
cause continued increases to ranges to be widened (like binary
operators).
Reviewers: efriedma, mssimpso, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D78433
The change in D78624 had a noticeable negative compile-time impact.
It seems that going through a function call for the MaxUsesToExplore
default is fairly expensive, at least if LLVM is not built with LTO.
This patch makes MaxUsesToExplore default to 0 and assigns the actual
default in the implementation instead. This recovers most of the
regression.
Differential Revision: https://reviews.llvm.org/D78734
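The pattern is to make a cheap sentinel the declared default and resolve it once inside the implementation; a minimal sketch with an abridged signature (the real PointerMayBeCaptured takes more parameters, and the concrete default value here is illustrative):
```
// Header: 0 means "use the implementation's default", so evaluating the
// default argument no longer requires a function call at every call site.
bool PointerMayBeCaptured(const void *V, unsigned MaxUsesToExplore = 0);

// Implementation file: resolve the sentinel once.
static const unsigned DefaultMaxUsesToExplore = 20; // illustrative value

bool PointerMayBeCaptured(const void *V, unsigned MaxUsesToExplore) {
  if (MaxUsesToExplore == 0)
    MaxUsesToExplore = DefaultMaxUsesToExplore;
  // ... walk up to MaxUsesToExplore uses of V ...
  (void)V;
  return false;
}
```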