It should be enabled only when the load alignment is at least 8 bytes.
Fixes: SWDEV-256824
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D90404
These expansions were rather inefficient and used more code than
necessary. This change optimizes them to use expansions closer to what
GCC emits. When optimizing for code size the output is the same size,
although LLVM reorders blocks in a non-optimal way. Still, this should
be an improvement, with a code-size reduction of around 0.12% when
building compiler-rt.
Differential Revision: https://reviews.llvm.org/D86418
This patch updates DSE + MemorySSA to use the same check as the legacy
implementation to determine if a location is killed by a free call.
This changes the existing behavior so that a free does not kill
locations before the start of the freed pointer.
This should fix PR48036.
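For illustration, a rough sketch of such a check (simplified, with an
illustrative name and signature; not the exact DSE code):

  #include "llvm/Analysis/MemoryLocation.h"
  #include "llvm/Analysis/ValueTracking.h"
  #include "llvm/IR/DataLayout.h"

  using namespace llvm;

  // A location may be treated as killed by free(FreedPtr) only if it has
  // the same base and does not start before the freed pointer.
  static bool maybeKilledByFree(const MemoryLocation &DefLoc,
                                const Value *FreedPtr, const DataLayout &DL) {
    int64_t DefOffset = 0, FreeOffset = 0;
    const Value *DefBase =
        GetPointerBaseWithConstantOffset(DefLoc.Ptr, DefOffset, DL);
    const Value *FreeBase =
        GetPointerBaseWithConstantOffset(FreedPtr, FreeOffset, DL);
    return DefBase == FreeBase && DefOffset >= FreeOffset;
  }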
This reverts the revert commit a1b53db32418cb6ed6f5b2054d15a22b5aa3aeb9.
This patch includes a fix for a reported issue caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases where it looked through connected casts.
For now, ensure integer intrinsics are only returned for selects of
ints or int vectors.
If the element size is unknown because the type is a pointer, a
comparison against 0 will cause an assert. Make sure the element size
is large enough before comparing, and for the moment just return the
scalar cost.
This is the last of the rotate->funnel shift InstCombine generalizations for PR46896.
We still have foldGuardedRotateToFunnelShift to deal with in AggressiveInstCombine.
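For reference, the kind of rotate idiom this targets (illustrative only;
the masked shift amounts keep it well defined for a shift of 0):

  #include <cstdint>

  // InstCombine can fold this pattern to a funnel shift, i.e.
  // llvm.fshl(X, X, N).
  uint32_t rotl32(uint32_t X, uint32_t N) {
    return (X << (N & 31)) | (X >> (-N & 31));
  }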
Differential Revision: https://reviews.llvm.org/D90382
This is likely to be a regression introduced by my last refactoring of the
LSUnit (commit 5578ec32f9c4f). Before this patch, the
"CurrentStoreBarrierGroupID" index was not correctly reset on store barrier
executions. This was leading to unexpected crashes like the one reported as
PR48024.
Previously, !noalias and !alias.scope metadata on the call site was
applied as part of CloneAliasScopeMetadata(), which short-circuits
if the callee does not use any noalias metadata itself. However,
these two things have no relation to each other.
Consistently apply !noalias and !alias.scope metadata by integrating
this into an existing function that handled !llvm.access.group and
!llvm.mem.parallel_loop_access metadata. The handling for all of
these metadata kinds is essentially the same.
This patch mainly makes the following changes:
1. Support AVX-VNNI instructions;
2. Introduce an ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use VEX encoding when the user explicitly adds the {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
Also added a general wasm64 DWARF test.
Also added asserts for unsupported reloc combinations that triggered this bug.
Differential Revision: https://reviews.llvm.org/D90503
This reverts commit 19225704890632cd2552f41ada41600a20db1371.
This appears to cause a crash in the following example:
a, b, c;
l() {
  int e = a, f = l, g, h, i, j;
  float *d = c, *k = b;
  for (;;)
    for (; g < f; g++) {
      k[h] = d[i];
      k[h - 1] = d[j];
      h += e << 1;
      i += e;
    }
}
clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c
llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.
Add support for match-all tags and GOT-free runtime calls, which
are both required for the kernel to be able to support outlined
checks. This requires extending the access info to let the backend
know when to enable these features. To make the code easier to maintain,
introduce an enum with the bit field positions for the access info.
Allow outlined checks to be enabled with -mllvm
-hwasan-inline-all-checks=0. Kernels that contain runtime support for
outlined checks may pass this flag. Kernels lacking runtime support
will continue to link because they do not pass the flag. Old versions
of LLVM will ignore the flag and continue to use inline checks.
With a separate kernel patch [1] I measured the code size of defconfig
+ tag-based KASAN, as well as boot time (i.e. time to init launch)
on a DragonBoard 845c with an Android arm64 GKI kernel. The results
are below:
          code size    boot time
before    92824064     6.18s
after     38822400     6.65s
[1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76
Depends on D90425
Differential Revision: https://reviews.llvm.org/D90426
Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.
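For context, a VECREDUCE_SEQ_FADD corresponds to an in-order
floating-point reduction like the following (illustrative source-level
view; the function name is made up):

  float seq_fadd(const float *V, int N, float Start) {
    float Acc = Start;
    for (int I = 0; I < N; ++I)
      Acc += V[I]; // additions must stay strictly in order
    return Acc;
  }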
Differential Revision: https://reviews.llvm.org/D90247
This is a workaround for poor heuristics in the backend where we can
end up materializing the constant multiple times. This is particularly
bad when using outlined checks because we materialize it for every call
(because the backend considers it trivial to materialize).
As a result, the field containing the shadow base value will always
be set, so simplify the code to take that into account.
Differential Revision: https://reviews.llvm.org/D90425
In a kernel (or in general in environments where bit 55 of the address
is set) the shadow base needs to point to the end of the shadow region,
not the beginning. Bit 55 needs to be sign extended into bits 52-63
of the shadow base offset, otherwise we end up loading from an invalid
address. We can do this by using SBFX instead of UBFX.
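A scalar model of the difference (a sketch assuming the usual 52-bit
field starting at bit 4 of the address; not the actual codegen):

  #include <cstdint>

  uint64_t shadowOffsetUBFX(uint64_t Addr) {
    return (Addr << 8) >> 12;                      // zero-extends bit 55
  }
  uint64_t shadowOffsetSBFX(uint64_t Addr) {
    return (uint64_t)((int64_t)(Addr << 8) >> 12); // sign-extends bit 55
                                                   // into bits 52-63
  }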
Using SBFX should have no effect in the userspace case where bit 55
of the address is clear so we do so unconditionally. I don't think
we need an ABI version bump for this (but one will come anyway when
we switch to x20 for the shadow base register).
Differential Revision: https://reviews.llvm.org/D90424
From a code size perspective it turns out to be better to use a
callee-saved register to pass the shadow base. For non-leaf functions
it avoids the need to reload the shadow base into x9 after each
function call, at the cost of an additional stack slot to save the
caller's x20. But with x9 there is also a stack size cost, either
as a result of copying x9 to a callee-saved register across calls or
by spilling it to stack, so for the non-leaf functions the change to
stack usage is largely neutral.
It is also code size (and stack size) neutral for many leaf functions.
Although they now need to save/restore x20, this can typically be
combined via LDP/STP into the x30 save/restore. In the case where
the function needs callee-saved registers or stack spills, we end up
needing, on average, 8 more bytes of stack and 1 more instruction,
but given the improvements to other functions this seems like the
right tradeoff.
Unfortunately we cannot change the register for the v1 (non short
granules) check because the runtime assumes that the shadow base
register is stored in x9, so the v1 check still uses x9.
Aside from that there is no change to the ABI because the choice
of shadow base register is a contract between the caller and the
outlined check function, both of which are compiler generated. We do
need to rename the v2 check functions though because the functions
are deduplicated based on their names, not on their contents, and we
need to make sure that when object files from old and new compilers
are linked together we don't end up with a function that uses x9
calling an outlined check that uses x20 or vice versa.
With this change code size of /system/lib64/*.so in an Android build
with HWASan goes from 200066976 bytes to 194085912 bytes, or a 3%
decrease.
Differential Revision: https://reviews.llvm.org/D90422
If more than one prefix is provided - e.g. --check-prefixes=CHECK,FOO - we
don't report if (say) FOO is never used. This may lead to a gap in our
test coverage.
This patch introduces a new option, --allow-unused-prefixes. It
currently is set to true, keeping today's behavior. After we explicitly
set it in tests where this behavior was actually intentional, we will
switch it to false by default.
Differential Revision: https://reviews.llvm.org/D90281
This option was hardcoded to 32. Expose it as a command-line option,
since we have seen some cases downstream where increasing this limit
allows us to disprove reachability.
Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90487
Just return the new node, which is the standard practice.
I also noticed what appeared to be an unnecessary attempt at
creating an ANY_EXTEND where the type should already be correct.
I replaced it with an assert to verify the type.
Differential Revision: https://reviews.llvm.org/D90444
On Windows, after commit 881ba104656c40098d4bc90c52613c08136f0fe1, tools
using TempFile would error with "bad file descriptor" when writing the
file on a network drive. It appears that setting the delete-on-close bit via
SetFileInformationByHandle/FileDispositionInfo prevented it from
accessing the file on network drives, and although using
FILE_DISPOSITION_INFO_EX seems to work, it causes other troubles.
Differential Revision: https://reviews.llvm.org/D81803
Make DebugLogging a member variable so that users of PassBuilder don't
need to pass it around so much.
Move the call to TargetMachine::registerPassBuilderCallbacks() into
PassBuilder so users don't need to remember to call it.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D90437
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.
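For illustration, a sketch of building such metadata with 64-bit
operands (not the exact updateProfWeight code; the helper name is made
up):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Metadata.h"
  #include "llvm/IR/Type.h"

  using namespace llvm;

  // Builds !{!"branch_weights", i64 TrueW, i64 FalseW}.
  static MDNode *makeBranchWeights(LLVMContext &Ctx, uint64_t TrueW,
                                   uint64_t FalseW) {
    Type *I64 = Type::getInt64Ty(Ctx);
    Metadata *Ops[] = {
        MDString::get(Ctx, "branch_weights"),
        ConstantAsMetadata::get(ConstantInt::get(I64, TrueW)),
        ConstantAsMetadata::get(ConstantInt::get(I64, FalseW))};
    return MDNode::get(Ctx, Ops);
  }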
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
This patch modifies two for loops to use the range-based syntax.
Since they are equivalent, this patch is tagged NFC.
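For illustration only (not the loops touched by this patch; the names
are made up), the same iteration written both ways:

  #include <cstddef>
  #include <vector>

  int sumIndexed(const std::vector<int> &Xs) {
    int S = 0;
    for (std::size_t I = 0, E = Xs.size(); I != E; ++I)
      S += Xs[I]; // index-based form
    return S;
  }

  int sumRanged(const std::vector<int> &Xs) {
    int S = 0;
    for (int X : Xs)
      S += X; // equivalent range-based form
    return S;
  }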
Differential Revision: https://reviews.llvm.org/D90069
I'm assuming the standard-size integer instructions for this end up as something like:
mulq %rsi
seto %al
And the 'mul' generally has reciprocal throughput of 1 on typical implementations
(higher latency, but that's not handled here).
The default costs may end up much higher than that, and that's what we see in the test diffs.
Vector types are left as a 'TODO'.
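For reference, the kind of scalar code being priced (illustrative; the
function name is made up): a umul-with-overflow such as the one produced
by __builtin_mul_overflow, which typically lowers to the mul/seto pair
above.

  bool mulOverflows(unsigned long long A, unsigned long long B,
                    unsigned long long *Out) {
    return __builtin_mul_overflow(A, B, Out); // mulq + seto on x86-64
  }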
Differential Revision: https://reviews.llvm.org/D90431
- As convergent intrinsics/calls can only be moved to
  control-equivalent blocks, or, more precisely, blocks under the same
  divergent branch, PRE needs to skip them.
Differential Revision: https://reviews.llvm.org/D90391
Currently isOverwrite returns OW_MaybePartial even for accesses known
not to overlap. This is not a big problem for the legacy implementation
(since isPartialOverwrite follows isOverwrite and clarifies the result).
In contrast, the MemorySSA-based version does a lot of work only to later
find out that the accesses don't overlap. Besides the negative impact on
compile time, we quickly reach MemorySSAPartialStoreLimit and miss
optimization opportunities.
Note: In fact, I think it would be a cleaner implementation if
isOverwrite returned a fully clarified result in the first place,
without the need to call isPartialOverwrite. This can be done as a
follow-up. What do you think?
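An example of the kind of accesses this affects (illustrative; assumes
4-byte ints):

  void storePair(int *P) {
    P[0] = 1; // earlier store, bytes [0, 4)
    P[1] = 2; // later store, bytes [4, 8): clearly disjoint, yet
              // currently classified as OW_MaybePartial
  }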
Reviewed By: fhahn, asbirlea
Differential Revision: https://reviews.llvm.org/D90371
Split up the monolithic VETargetLowering ctor into three initialization phases:
1. initRegisterClasses()
2. initSPUActions()
3. // TODO initVPUActions()
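Rough shape of the split constructor (a sketch only; the member
initializer list is an assumption, not the exact VE code):

  VETargetLowering::VETargetLowering(const TargetMachine &TM,
                                     const VESubtarget &STI)
      : TargetLowering(TM), Subtarget(&STI) {
    initRegisterClasses(); // 1. register class setup
    initSPUActions();      // 2. scalar (SPU) legalization actions
    // initVPUActions();   // 3. TODO: vector (VPU) legalization actions
  }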
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D90463
As per the comment in VPRecipeBase, clients should not rely on
getVPRecipeID, as it may change in the future. It should only be used in
classof implementations. Use isa instead in getFirstNonPhi.