llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	674d498a96	[X86][SSE] combineX86ShuffleChain - remove unused shuffle(vzext_load(),undef) combine. This should always be caught by the various VZEXT_MOVL handling in combineTargetShuffle and SimplifyDemandedVectorEltsForTargetNode.	2020-05-06 15:20:29 +01:00
Matt Arsenault	207294810c	AMDGPU: Insert kernarg code after allocas This produces more normal looking IR by keeping all the allocas clustered at the start of the block.	2020-05-06 10:19:56 -04:00
David Green	c9f36a7bf2	[ARM] VMOVrh of VMOVhr A VMOVhr of a VMOVrh can be simply folded to the original HPR value. Differential Revision: https://reviews.llvm.org/D78710	2020-05-06 15:10:01 +01:00
Sanjay Patel	3ff9a6bb20	[VectorCombine] add tests for possible scalarization; NFC	2020-05-06 09:58:27 -04:00
David Green	cca08857e3	[ARM] Extract from a VDUP If we get into the situation where we are extracting from a VDUP, the extracted value is just the origin, so long as the types match or we can bitcast between the two. Differential Revision: https://reviews.llvm.org/D78708	2020-05-06 14:51:25 +01:00
David Green	8aafbb48f2	[ARM] Convert a bitcast VDUP to a VDUP The idea, under MVE, is to introduce more bitcasts around VDUP's in an attempt to get the type correct across basic block boundaries. In order to do that without other regressions we need a few fixups, of which this is the first. If the code is a bitcast of a VDUP, we can convert that straight into a VDUP of the new type, so long as they have the same size. Differential Revision: https://reviews.llvm.org/D78706	2020-05-06 14:14:21 +01:00
Alexandre Ganea	13e60df049	[Debug][CodeView] Emit fully qualified names for globals Emit S_[L\|G][THREAD32\|DATA32] records with a fully qualified name (namespace + class scope). Differential Revision: https://reviews.llvm.org/D79447	2020-05-06 09:12:00 -04:00
Alexandre Ganea	fe787499fa	[Support] Silence warning: comparison of integers of different signs: 'const int' and 'const unsigned long'	2020-05-06 09:12:00 -04:00
Alexandre Ganea	b53c3d0951	[InstrProf] Silence warnings when targeting x86 with VS2019 16.5.4 Differential Revision: https://reviews.llvm.org/D79337	2020-05-06 09:12:00 -04:00
Simon Pilgrim	cf3ae8d32b	[X86][SSE] Move VZEXT_MOVL removal into SimplifyDemandedVectorEltsForTargetNode This patch replaces the VZEXT_MOVL removal from combineShuffle with a more general version based in SimplifyDemandedVectorEltsForTargetNode. By using computeKnownBits we can always remove the VZEXT_MOVL if the upper elements of the source operand are known to be zero. This requires us to add the conversion ops to computeKnownBitsForTargetNode as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D79335	2020-05-06 14:05:07 +01:00
Simon Pilgrim	70644cf927	[X86][SSE] getShuffleScalarElt - minor NFC cleanup. Use SelectionDAG::MaxRecursionDepth instead of (equal) hard coded constant. clang-format	2020-05-06 14:05:07 +01:00
David Spickett	3ec1de748c	Reland "[CodeGen] Make logic of CCState::resultsCompatible clearer" This relands commit d782d1f898eaafee49548d5332e84c3ae11ebac4. With a typo fixed, which was causing the x86 test failure.	2020-05-06 13:40:49 +01:00
Dmitry Preobrazhensky	5962ff9344	[AMDGPU][MC][GFX9+] Enabled 21-bit signed offsets for SMEM instructions Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D79288	2020-05-06 14:13:10 +03:00
Stefan Pintilie	3a142ee8ca	[PowerPC] Fix missing GOT indirect variant kind The function MCSymbolRefExpr::getVariantKindForName was missing the entry for VK_PPC_GOT_PCREL. This patch adds the missing entry. Differential Revision: https://reviews.llvm.org/D79015	2020-05-06 05:50:56 -05:00
Jay Foad	592448f7b1	Fix misleading comments.	2020-05-06 11:08:25 +01:00
Benjamin Kramer	79dea5077e	Quiet some -Wdocumentation warnings.	2020-05-06 11:23:13 +02:00
David Spickett	d2e8f08414	Revert "[CodeGen] Make logic of CCState::resultsCompatible clearer" This reverts commit d782d1f898eaafee49548d5332e84c3ae11ebac4 which caused test CodeGen/X86/sibcall.ll to fail.	2020-05-06 10:14:17 +01:00
Xing GUO	7653f25b98	[llvm-nm/objdump/size] Add tests for dumping symbol tables with invalid sh_size. This change adds tests for llvm-nm, llvm-objdump and llvm-size when dumping symbol tables with invalid sh_size (sh_size % sizeof(Elf_Sym) != 0). Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D77864	2020-05-06 17:01:20 +08:00
David Spickett	fe3b80339d	[CodeGen] Make logic of CCState::resultsCompatible clearer	2020-05-06 09:48:58 +01:00
Konstantin Schwarz	0f92ec39f7	[GlobalISel][InlineAsm] Add support for basic output operand constraints Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette Reviewed By: arsenm Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78318	2020-05-06 10:06:13 +02:00
Vitaly Buka	b282583726	[local-bounds] Ignore volatile operations Summary: -fsanitize=local-bounds is very similar to ``object-size`` and should also ignore volatile pointers. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#volatile Reviewers: chandlerc, rsmith Reviewed By: rsmith Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D78607	2020-05-05 23:08:08 -07:00
Craig Topper	40e4f4a681	[X86] Add v32i16/v64i8 into the handling for 512-bit inline assembly constraints.	2020-05-05 21:41:31 -07:00
Johannes Doerfert	2e2f48ec13	[Attributor][NFC] Cleanup some AAMemoryLocation code This is the first step to resolve a TODO in AAMemoryLocation and to fix a bug we have when handling `byval` arguments of `readnone` call sites. No functional change intended.	2020-05-05 23:15:33 -05:00
Johannes Doerfert	95e37abb7b	[Attributor][NFC] Minor code cleanups to minimize follow up diffs	2020-05-05 23:14:23 -05:00
Johannes Doerfert	be67505b0c	[Attributor][NFC] Avoid dependences on known information	2020-05-05 23:14:23 -05:00
Craig Topper	e48ee635ba	[X86] Allow Yz inline assembly constraint to choose ymm0 or zmm0 when avx/avx512 are enabled and type is 256 or 512 bits gcc supports selecting ymm0/zmm0 for the Yz constraint when used with 256 or 512 bit vector types. Fixes PR45806 Differential Revision: https://reviews.llvm.org/D79448	2020-05-05 21:12:30 -07:00
Puyan Lotfi	8c3e5ac746	[NFC] Outliner label name clean up. Just simplifying how the label name is generated while using std::to_string instead of Twine. Differential Revision: https://reviews.llvm.org/D79464	2020-05-05 23:27:46 -04:00
Jessica Paquette	095011b168	[AArch64][GlobalISel] Fold shifts into G_ICMP Since G_ICMP can be selected to a SUBS, we can fold shifts into such compares. E.g. ``` cmp w1, w0, lsl #3 cmp w1, w0, lsr #3 cmp w1, w0, asr #3 ``` This is done the same way as for adds and subtracts, using `selectShiftedRegister`. This gives some minor code size savings on CTMark. https://reviews.llvm.org/D79365	2020-05-05 18:35:39 -07:00
Wenlei He	e88b44e49c	[llvm-profdata] Support -detailed-summary for Sample Profile Summary: Add -detailed-summary support for sample profile dump to match that of instrumentation profile. Reviewers: wmi, davidxl, hoyFB Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79291	2020-05-05 18:28:22 -07:00
Justice Adams	5db772cd7b	Fix SelectionDAG Graph Printing on Windows Currently, when compiling to IR (presumably at the clang level) LLVM mangles symbols and sometimes they have illegal file characters including `?`'s in them. This causes a problem when building a graph via llc on Windows because the code currently passes the machine function name all the way down to the Windows API which frequently returns error 123 ERROR_INVALID_NAME https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499- Thus, we need to remove those illegal characters from the machine function name before generating a graph, which is the purpose of this patch. https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file I've created a static helper function replace_illegal_filename_chars within GraphWriter.cpp to help with replacing illegal filename characters before generating a dot graph filename. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76863	2020-05-05 17:01:05 -07:00
Craig Topper	42835f2bf6	[X86] Fix usage of Align constructing MachineMemOperands. Similar to D77687, but for the X86 specific code. Differential Revision: https://reviews.llvm.org/D79381	2020-05-05 15:25:02 -07:00
Reid Kleckner	6064535152	[Support] Move LLD's parallel algorithm wrappers to support Essentially takes the lld/Common/Threads.h wrappers and moves them to the llvm/Support/Parallel.h algorithm header. The changes are: - Remove policy parameter, since all clients use `par`. - Rename the methods to `parallelSort` etc to match LLVM style, since they are no longer C++17 pstl compatible. - Move algorithms from llvm::parallel:: to llvm::, since they have "parallel" in the name and are no longer overloads of the regular algorithms. - Add range overloads - Use the sequential algorithm directly when 1 thread is requested (skips task grouping) - Fix the index type of parallelForEachN to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t. Remove Threads.h and update LLD for that. This is a prerequisite for parallel public symbol processing in the PDB library, which is in LLVM. Reviewed By: MaskRay, aganea Differential Revision: https://reviews.llvm.org/D79390	2020-05-05 15:21:05 -07:00
Christopher Tetreault	9628857335	[SVE] Fix invalid usage of getNumElements() in InstCombineMulDivRem Summary: getLogBase2 tries to iterate over the number of vector elements. Since the number of elements of a scalable vector is unknown at compile time, we must return null if the input type is scalable. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, fpetrogalli, kmclaughlin, spatel Reviewed By: efriedma, fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79197	2020-05-05 15:19:01 -07:00
Jan Korous	05620404c0	[VFS][NFC] Fix typo in comment	2020-05-05 13:55:27 -07:00
Stanislav Mekhanoshin	965e88e480	[AMDGPU] Added 'a' constraint documentation. NFC. AGPR inline asm constraint was missing from the LangRef.rst.	2020-05-05 13:52:04 -07:00
Alina Sbirlea	02d6ea4855	[MemorySSA] Make MemoryLocation unknown when phi translation cannot be performed. Summary: When phi translation cannot be performed, be conservative and make the MemoryLocation unknown. Reviewers: george.burgess.iv Subscribers: Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79386	2020-05-05 13:32:32 -07:00
Sanjay Patel	639da4fda8	[ValueTracking] fix CannotBeNegativeZero() to disregard 'nsz' FMF The 'nsz' flag is different than 'nnan' or 'ninf' in that it does not create poison. Make that explicit in the LangRef and fix ValueTracking analysis that misinterpreted the definition. This manifests as bugs in InstSimplify shown in the test diffs and as discussed in PR45778: https://bugs.llvm.org/show_bug.cgi?id=45778 Differential Revision: https://reviews.llvm.org/D79422	2020-05-05 16:04:59 -04:00
Christudasan Devadasan	496b696b52	[AMDGPU] Fixed the test by adding the triple.	2020-05-06 00:14:01 +05:30
Momchil Velikov	d01e846995	Revert "[ARM] CMSE code generation" This reverts commit 7cbbf89d230d46c3de9a7affc29b23f08c4377a1. The regression tests fail with the expensive checks.	2020-05-05 19:05:40 +01:00
David Blaikie	169fdcbf38	Collapse variable into assert to remove non-assert unused variable	2020-05-05 11:04:43 -07:00
Kazu Hirata	cefc8486f0	[Inlining] Teach shouldBeDeferred to take the total cost into account Summary: This patch teaches shouldBeDeferred to take into account the total cost of inlining. Suppose we have a call hierarchy {A1,A2,A3,...}->B->C. (Each of A1, A2, A3, ... calls B, which in turn calls C.) Without this patch, shouldBeDeferred essentially returns true if TotalSecondaryCost < IC.getCost() where TotalSecondaryCost is the total cost of inlining B into As. This means that if B is a small wrapper function, for example, it would get inlined into all of As. In turn, C gets inlined into all of As. In other words, shouldBeDeferred ignores the cost of inlining C into each of As. This patch adds an option, inline-deferral-scale, to replace the expression above with: TotalCost < Allowance where - TotalCost is TotalSecondaryCost + IC.getCost() * # of As, and - Allowance is IC.getCost() * Scale For now, the new option defaults to -1, disabling the new scheme. Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79138	2020-05-05 11:02:06 -07:00
Nico Weber	78af3e9035	Let normalize() for posix style convert backslash to slash unconditionally. Currently, normalize() for posix replaces backslashes with slashes, except that two backslashes in sequence are kept as-is. clang calls normalize() to convert \ to / in Microsoft compat mode. This generally works well, but a path like "c:\\foo\\bar.h" with two backslashes doesn't work due to the exception in normalize(). These paths happen naturally on Windows hosts with e.g. `#include __FILE__`, and them not working on other hosts makes it more difficult to write tests for this case. The special case has been around without justification since this code was added in r203611 (since then moved around in r215241 r215243). No integration tests fail if I remove it. Try removing the special case. Differential Revision: https://reviews.llvm.org/D79265	2020-05-05 13:54:55 -04:00
Nico Weber	bc8001bcb3	Add a test to Support.NormalizePath.	2020-05-05 13:41:24 -04:00
Christudasan Devadasan	8553d88165	[AMDGPU] Introduce more scratch registers in the ABI. The AMDGPU target has a convention that defined all VGPRs (except the initial 32 argument registers) as callee-saved. This convention is not always efficient, especially when a callee that requires more registers ends up emitting a large number of spills even though its caller requires only a few. This patch revises the ABI by introducing more scratch registers that a callee can freely use. The 256 VGPRs now become: 32 argument registers, 112 scratch registers, and 112 callee-saved registers. The scratch registers and the CSRs are intermixed at regular intervals (a split boundary of 8) to obtain a better occupancy. Reviewers: arsenm, t-tye, rampitec, b-sumner, mjbedy, tpr Reviewed By: arsenm, t-tye Differential Revision: https://reviews.llvm.org/D76356	2020-05-05 23:02:58 +05:30
Momchil Velikov	c814314f5f	[ARM] CMSE code generation This patch implements the final bits of CMSE code generation: * emit special linker symbols * restrict parameter passing to not use memory * emit BXNS and BLXNS instructions for returns from non-secure entry functions, and non-secure function calls, respectively * emit code to save/restore secure floating-point state around calls to non-secure functions * emit code to save/restore non-secure floating-point state upon entry to non-secure entry function, and return to non-secure state * emit code to clobber registers not used for arguments and returns when switching to non-secure state Patch by Momchil Velikov, Bradley Smith, Javed Absar, David Green, possibly others. Differential Revision: https://reviews.llvm.org/D76518	2020-05-05 18:23:28 +01:00
Stanislav Mekhanoshin	7ed90a6ea4	[AMDGPU] Fix FoldImmediate for 16 bit operand Differential Revision: https://reviews.llvm.org/D79362	2020-05-05 10:19:14 -07:00
Sanjay Patel	95f9bd9556	[SLP] add another bailout for load-combine patterns This builds on the or-reduction bailout that was added with D67841. We still do not have IR-level load combining, although that could be a target-specific enhancement for -vector-combiner. The heuristic is narrowly defined to catch the motivating case from PR39538: https://bugs.llvm.org/show_bug.cgi?id=39538 ...while preserving existing functionality. That is, there's an unmodified test of pure load/zext/store that is not seen in this patch at llvm/test/Transforms/SLPVectorizer/X86/cast.ll. That's the reason for the logic difference to require the 'or' instructions. The chances that vectorization would actually help a memory-bound sequence like that seem small, but it looks nicer with: vpmovzxwd (%rsi), %xmm0 vmovdqu %xmm0, (%rdi) rather than: movzwl (%rsi), %eax movl %eax, (%rdi) ... In the motivating test, we avoid creating a vector mess that is unrecoverable in the backend, and SDAG forms the expected bswap instructions after load combining: movzbl (%rdi), %eax vmovd %eax, %xmm0 movzbl 1(%rdi), %eax vmovd %eax, %xmm1 movzbl 2(%rdi), %eax vpinsrb $4, 4(%rdi), %xmm0, %xmm0 vpinsrb $8, 8(%rdi), %xmm0, %xmm0 vpinsrb $12, 12(%rdi), %xmm0, %xmm0 vmovd %eax, %xmm2 movzbl 3(%rdi), %eax vpinsrb $1, 5(%rdi), %xmm1, %xmm1 vpinsrb $2, 9(%rdi), %xmm1, %xmm1 vpinsrb $3, 13(%rdi), %xmm1, %xmm1 vpslld $24, %xmm0, %xmm0 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpslld $16, %xmm1, %xmm1 vpor %xmm0, %xmm1, %xmm0 vpinsrb $1, 6(%rdi), %xmm2, %xmm1 vmovd %eax, %xmm2 vpinsrb $2, 10(%rdi), %xmm1, %xmm1 vpinsrb $3, 14(%rdi), %xmm1, %xmm1 vpinsrb $1, 7(%rdi), %xmm2, %xmm2 vpinsrb $2, 11(%rdi), %xmm2, %xmm2 vpmovzxbd %xmm1, %xmm1 # xmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero vpinsrb $3, 15(%rdi), %xmm2, %xmm2 vpslld $8, %xmm1, %xmm1 vpmovzxbd %xmm2, %xmm2 # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[2],zero,zero,zero,xmm2[3],zero,zero,zero vpor %xmm2, %xmm1, %xmm1 vpor %xmm1, %xmm0, %xmm0 vmovdqu %xmm0, (%rsi) movl (%rdi), %eax movl 4(%rdi), %ecx movl 8(%rdi), %edx movbel %eax, (%rsi) movbel %ecx, 4(%rsi) movl 12(%rdi), %ecx movbel %edx, 8(%rsi) movbel %ecx, 12(%rsi) Differential Revision: https://reviews.llvm.org/D78997	2020-05-05 12:44:38 -04:00
Jinsong Ji	b6ef1872d5	[MachinePipeliner] Add ORE for MachinePipeliner This patch adds ORE for MachinePipeliner, so that people can analyze their code using opt-viewer or other tools, then optimize the code to catch more pipelining opportunities. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D79368	2020-05-05 16:04:53 +00:00
Simon Pilgrim	7b6765b11e	[TTI] getScalarizationOverhead - use explicit VectorType operand getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions). Followup to D78357 Reviewed By: @samparker Differential Revision: https://reviews.llvm.org/D79341	2020-05-05 16:59:23 +01:00
Pengxuan Zheng	0c52c79a03	[RISCV] Update debug scratch register names Summary: The RISC-V debug register was named dscratch in a previous draft of the RISC-V debug mode spec. The number of registers has been increased to 2 in the latest ratified version of the debug mode spec and the registers were named dscratch0 and dscratch1. We still support using the old register name "dscratch", but it would be disassembled as "dscratch0" with this change. Reviewers: apazos, asb, lenary, luismarques Reviewed By: asb Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78764	2020-05-05 08:46:07 -07:00
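
The deferral rule described in commit cefc8486f0 can be written out as a small calculation. The following is only a sketch in Python of the two inequalities quoted in that message, not LLVM's actual `shouldBeDeferred` code. The function name `should_defer` and its parameters are hypothetical, and treating every negative scale as "new scheme disabled" is an assumption (the message only says the default of -1 disables it).

```python
def should_defer(total_secondary_cost, inline_cost, num_callers, scale=-1):
    """Sketch of the two deferral tests quoted in commit cefc8486f0.

    total_secondary_cost: cost of inlining B into all of its callers (the As).
    inline_cost:          stands in for IC.getCost() from the commit message.
    num_callers:          the number of As.
    scale:                stands in for the inline-deferral-scale option.
    """
    if scale < 0:
        # ASSUMPTION: any negative scale selects the old scheme; the commit
        # message only states that the default of -1 disables the new one.
        return total_secondary_cost < inline_cost

    # New scheme: TotalCost < Allowance.
    total_cost = total_secondary_cost + inline_cost * num_callers
    allowance = inline_cost * scale
    return total_cost < allowance
```

With `total_secondary_cost=50`, `inline_cost=100` and three callers, the old test defers because 50 < 100. Under the new test with `scale=2`, the total cost 350 exceeds the allowance 200, so the call is not deferred.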

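The register counts given in commit 8553d88165 (32 argument, 112 scratch, 112 callee-saved) follow from alternating groups of 8 over the 224 VGPRs above the argument registers. The Python sketch below checks that arithmetic. The names are hypothetical, and the choice that the first group after the argument registers is scratch is an assumption; the commit message does not say which kind comes first.

```python
from collections import Counter

NUM_VGPRS = 256     # total VGPRs, per the commit message
NUM_ARG_REGS = 32   # initial argument registers
SPLIT = 8           # "split boundary of 8"


def classify_vgpr(index):
    """Classify one VGPR index as argument, scratch, or callee-saved."""
    if not 0 <= index < NUM_VGPRS:
        raise ValueError(f"VGPR index out of range: {index}")
    if index < NUM_ARG_REGS:
        return "argument"
    group = (index - NUM_ARG_REGS) // SPLIT
    # ASSUMPTION: even-numbered groups are scratch and odd-numbered groups
    # are callee-saved; the commit message only says they are intermixed.
    return "scratch" if group % 2 == 0 else "callee-saved"


# Tally all 256 registers to confirm the split stated in the commit message.
counts = Counter(classify_vgpr(i) for i in range(NUM_VGPRS))
```

There are 28 groups of 8 above the argument registers, so each kind gets 14 groups, or 112 registers, whichever kind comes first.
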

196378 Commits