llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	e2bdc7fda9	[CostModel][X86] Match SSE41 legalized conversion costs as well as SSE2	2021-05-21 11:42:22 +01:00
Simon Pilgrim	61e6792674	[CostModel][X86] Add uitpfp v4f32->v4i32 + v8f32->v8i32 SSE/AVX costs These were using (default) scalarized values.	2021-05-21 11:30:15 +01:00
Luke Benes	38eda91550	Fix warning: comparison of integer expressions of different signedness. NFC This patch resolves the Wsign-compare warning that I observed on armv7l and x86 with both gcc and clang. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D102792	2021-05-21 18:23:27 +08:00
Tony Tye	8dfab88553	[NFC][AMDGPU] Mark C code in AMDGPUUsage.rst Reviewed By: foad Differential Revision: https://reviews.llvm.org/D102910	2021-05-21 10:08:05 +00:00
Stephen Tozer	15caaeb8e5	3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reapplies c0f3dfb9, which was reverted following the discovery of crashes on linux kernel and chromium builds - these issues have since been fixed, allowing this patch to re-land. This reverts commit 4397b7095d640f9b9426c4d0135e999c5a1de1c5.	2021-05-21 11:06:20 +01:00
Joe Ellis	bbfb092906	[InstSimplify] Properly constrain {insert,extract}_subvector intrinsic fold The previous rule: (insert_vector _, (extract_vector X, 0), 0) -> X is not quite correct. The correct fold should be: (insert_vector Y, (extract_vector X, 0), 0) -> X where: Y is X, or Y is undef This commit updates the pattern. Reviewed By: peterwaller-arm, paulwalker-arm Differential Revision: https://reviews.llvm.org/D102699	2021-05-21 10:05:03 +00:00
Djordje Todorovic	034cc0a244	[NFC][Debugify][Original DI] Use MapVector instead of DenseMap for DI tracking By using MapVector instead of DenseMap, issues will be reported in a deterministic order. Differential Revision: https://reviews.llvm.org/D102841	2021-05-21 02:58:16 -07:00
Andy Wingo	107b591be0	[IR][Verifier] Relax restriction on alloca address spaces In the WebAssembly target, we would like to allow alloca in two address spaces. The alloca instruction already has an address space argument, but the verifier asserts that the address space of an alloca is the default alloca address space from the datalayout. This patch removes this restriction. Targets that would like to impose additional restrictions should do so via target-specific verification passes. Differential Revision: https://reviews.llvm.org/D101045	2021-05-21 11:52:45 +02:00
Djordje Todorovic	88aa158bd7	Recommit: "[Debugify][Original DI] Test dbg var loc preservation" [Debugify][Original DI] Test dbg var loc preservation This is an improvement of [0]. This adds checking of original llvm.dbg.values()/declares() instructions in optimizations. We have picked a real issue that has been found with this (actually, picked one variable location missing from [1] and resolved the issue), and the result is the fix for that -- D100844. Before applying D100844, using the options from [0] (but with this patch applied) on the compilation of GDB 7.11, the final HTML report for the debug-info issues can be found at [1] (please scroll down, and look for "Summary of Variable Location Bugs"). After applying D100844, the numbers have improved a bit -- please take a look at [2]. [0] https://llvm.org/docs/HowToUpdateDebugInfo.html#test-original-debug-info-preservation-in-optimizations [1] https://djolertrk.github.io/di-check-before-adce-fix/ [2] https://djolertrk.github.io/di-check-after-adce-fix/ Differential Revision: https://reviews.llvm.org/D100845 The unit test was failing because the pass from the test that modifies the IR didn't return 'true' in its runOnFunction(), so the expensive-check configuration triggered an assertion.	2021-05-21 02:04:29 -07:00
David Green	c47364d339	[ARM] Fix the operand used for WLS in ARMLowOverheadLoops The Loop start instruction handled by the ARMLowOverheadLoops are: $lr = t2DoLoopStart $r0 $lr = t2DoLoopStartTP $r1, $r0 $lr = t2WhileLoopStartLR $r0, %bb, implicit-def dead $cpsr All three of these will have LR as the 0 argument, the trip count as the 1 argument. This patch updated a few places in ARMLowOverheadLoops where the 0th arg was being used for t2WhileLoopStartLR instructions as the trip count. One place was entirely removed as it does not seem valid any more, the case the code is trying to protect against should not be able to occur with our correct-by-construction low overhead loops. Differential Revision: https://reviews.llvm.org/D102620	2021-05-21 09:29:30 +01:00
Yevgeny Rouban	ebb8c67ccd	Allow incomplete template types in unique_function arguments We can't declare a unique_function that has in its arguments a reference to a template type with an incomplete argument. For instance, we can't declare unique_function<void(SmallVectorImpl<A>&)> when A is forward declared. This is because SFINAE will trigger a hard error in this case, when instantiating IsSizeLessThanThresholdT with the incomplete type. This patch specializes AdjustedParamT for references to remove this error. Committed on behalf of: @math-fehr (Fehr Mathieu) Reviewed By: DaniilSuchkov, yrouban	2021-05-21 14:09:33 +07:00
Igor Kudrin	f79eaab45a	[unittests][CodeGen] Mark tests that cannot be executed with GTEST_SKIP() This helps to distinguish such tests from successfully passed ones. Differential Revision: https://reviews.llvm.org/D102754	2021-05-21 13:39:52 +07:00
Igor Kudrin	ca34183385	[lit][gtest] Support SKIPPED tests This updates the googletest format to support tests that use GTEST_SKIP(), which is now available with the updated googletest framework. Differential Revision: https://reviews.llvm.org/D102694	2021-05-21 13:39:52 +07:00
Xiang1 Zhang	b32d7b7d22	[HWASAN] No code changed, Only clang-format for HWAddressSanitizer.cpp	2021-05-21 14:00:34 +08:00
Christudasan Devadasan	44fa7cd53e	GlobalISel: Help reduce operation width for instruction with two results. The function `reduceOperationWidth` helps to legalize a vector operation either by narrowing its type or by scalarizing the operation itself. It currently supports instructions with one result. This patch additionally allows the same for instructions with two results (for instance, G_SDIVREM). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D100725	2021-05-21 10:34:18 +05:30
Serge Pavlov	72fd6b9af9	[APFloat] convertToDouble/Float can work on shorter types Previously APFloat::convertToDouble could be called only for APFloats that were built using double semantics. Other semantics like single precision were not allowed, although the corresponding numbers could be converted to double without loss of precision. A similar restriction applied to APFloat::convertToFloat. With this change any APFloat that can be precisely represented by double can be handled with convertToDouble. The behavior of convertToFloat was updated similarly. This makes the conversion operations more convenient and adds support for formats like half and bfloat. Differential Revision: https://reviews.llvm.org/D102671	2021-05-21 11:02:51 +07:00
Stanislav Mekhanoshin	ebc40fee5f	[AMDGPU] Request module used variables from LDS lowering as internal I do not see any practical difference but technically used.* variables are internal and a call to getGlobalVariable misses true as a second argument. NFC as far as I can tell. Differential Revision: https://reviews.llvm.org/D102884	2021-05-20 20:55:47 -07:00
Jinsong Ji	068fb4deaa	[AIX] Print printable byte list as quoted string .byte supports strings, so if the whole byte list is printable, we can actually print the string for readability and LIT test maintenance. .byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d -> .byte "Hello, world" Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D102814	2021-05-21 02:37:55 +00:00
Nicolai Hähnle	5d6415281f	[IR] Memory intrinsics are not unconditionally `nosync` Remove the `nosync` attribute from the memory intrinsic definitions (i.e. memset, memcpy, memmove). Like native memory accesses, memory intrinsics can be volatile. This is indicated by an immarg in the intrinsic call. All else equal, a volatile memory intrinsic is `sync`, so we cannot annotate the intrinsic functions themselves as `nosync`. The attributor and function-attr passes know to take the volatile bit into account. Since `nosync` is a default attribute, this means we have to stop using the DefaultAttrIntrinsic tablegen class for memory intrinsics, and specify all default attributes other than `nosync` explicitly. Most of the test changes are trivial churn, but one test case (in nosync.ll) was in fact incorrect before this change. Differential Revision: https://reviews.llvm.org/D102295	2021-05-21 03:40:59 +02:00
Nicolai Hähnle	43ee457d7b	[tests] Update Transforms/DeadStoreElim/multiblock-malloc-free.ll This change is generated by running update_test_checks.py. It serves to make subsequent diffs easier to understand.	2021-05-21 02:36:57 +02:00
Stanislav Mekhanoshin	cc39b4a525	[AMDGPU] Fix module LDS selection Accesses to global module LDS variable start from null, but kernel also thinks its variables start address is null. Fixed by not using a null as an address. Differential Revision: https://reviews.llvm.org/D102882	2021-05-20 15:59:01 -07:00
Vitaly Buka	d949d3f98f	[asan] Add autogenerated test for fake stack This will help to see result of D102462. Test was generated with ./llvm/utils/update_test_checks.py llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll --opt-binary <build_dir>/bin/opt Differential Revision: https://reviews.llvm.org/D102867	2021-05-20 15:48:16 -07:00
Min-Yih Hsu	906bc160f9	[M68k] Support for inline asm operands w/ simple constraints This patch adds support for inline assembly operands and some simple operand constraints, including register and constant operands. Differential Revision: https://reviews.llvm.org/D102585	2021-05-20 14:00:09 -07:00
Min-Yih Hsu	7cd0b2d7ab	[M68k] Allow user to preserve certain registers Add `-ffixed-a[0-6]` and `-ffixed-d[0-7]` and the corresponding subtarget features to prevent certain registers from being allocated. Differential Revision: https://reviews.llvm.org/D102805	2021-05-20 13:57:22 -07:00
Jan Kratochvil	528de0554b	[lldb] Improve invalid DWARF DW_AT_ranges error reporting In D98289#inline-939112 @dblaikie said: Perhaps this could be more informative about what makes the range list index of 0 invalid? "index 0 out of range of range list table (with range list base 0xXXX) with offset entry count of XX (valid indexes 0-(XX-1))" Maybe that's too verbose/not worth worrying about since this'll only be relevant to DWARF producers trying to debug their DWARFv5, maybe no one will ever see this message in practice. Just a thought. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102851	2021-05-20 21:37:01 +02:00
Jessica Clarke	8c42ad8897	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Re-landed again after D102819 fixed PowerPC to correctly zero-extend all of its atomics as it claimed to do, since the combination of that bug and this optimisation caused buildbot regressions. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-20 20:34:23 +01:00
LLVM GN Syncbot	6ad4e7dfe1	[gn build] Port 0af3105b641a	2021-05-20 19:20:25 +00:00
Jon Roelofs	8ecb13b84d	Revert "[Remarks] Add analysis remarks for memset/memcpy/memmove lengths" This reverts commit 4bf69fb52b3c445ddcef5043c6b292efd14330e0. This broke spec2k6/403.gcc under -global-isel. Details to follow once I've reduced the problem.	2021-05-20 12:19:16 -07:00
Nico Weber	b71c3b1c87	[gn build] try reverting code part of f05fbb7795 Maybe aa8fe8fe6c7b was all that was needed to fix the build and we can keep the code with fewer conditionals after all.	2021-05-20 15:08:39 -04:00
Nico Weber	77abe655fc	[gn build] attempt again to unbreak linux after fc9696130c8	2021-05-20 15:01:35 -04:00
Nico Weber	8f7f747cf3	[gn build] use PEP-8 indents in symbol_exports.py	2021-05-20 15:00:24 -04:00
Nico Weber	56a414f103	[gn build] attempt to unbreak linux after fc9696130c8 Only emit `global:` if there are any exported symbols. While here, `chmod +x` the symbol_exports.py script.	2021-05-20 14:55:40 -04:00
Nico Weber	c41aa09bfb	[gn build] Use .export files Just fixing an old TODO, no dramatic behavior change. Differential Revision: https://reviews.llvm.org/D102843	2021-05-20 14:48:12 -04:00
Kevin P. Neal	d53eff6d30	[FPEnv] EarlyCSE support for constrained intrinsics, default FP environment edition EarlyCSE cannot distinguish between floating point instructions and constrained floating point intrinsics that are marked as running in the default FP environment. Said intrinsics are supposed to behave exactly the same as the regular FP instructions. Teach EarlyCSE to handle them in that case. Differential Revision: https://reviews.llvm.org/D99962	2021-05-20 14:40:51 -04:00
Fraser Cormack	21ba453e3b	[RISCV] Ensure small mask BUILD_VECTORs aren't expanded The default expansion for BUILD_VECTORs -- save for going through shuffles -- is to go through the stack. This method only works when the type is at least byte-sized, so for v2i1 and v4i1 we would crash. This patch ensures that small mask-type BUILD_VECTORs are always handled without crashing. We lower to a SETCC of the equivalent i8 type. This also exposes some pre-existing issues where the lowering when optimizing for size results in larger code than without. Those will be tackled in future patches. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102767	2021-05-20 19:12:29 +01:00
Reid Kleckner	c18c30409c	[PGO] Don't reference functions unless value profiling is enabled This reduces the size of chrome.dll.pdb built with optimizations, coverage, and line table info from 4,690,210,816 to 2,181,128,192, which makes it possible to fit under the 4GB limit. This change can greatly reduce binary size in coverage builds, which do not need value profiling. IR PGO builds are unaffected. There is a minor behavior change for frontend PGO. PGO and coverage both use InstrProfiling to create profile data with counters. PGO records the address of each function in the __profd_ global. It is used later to map runtime function pointer values back to source-level function names. Coverage does not appear to use this information. Recording the address of every function with code coverage drastically increases code size. Consider this program: void foo(); void bar(); inline void inlineMe(int x) { if (x > 0) foo(); else bar(); } int getVal(); int main() { inlineMe(getVal()); } With code coverage, the InstrProfiling pass runs before inlining, and it captures the address of inlineMe in the __profd_ global. This greatly increases code size, because now the compiler can no longer delete trivial code. One downside to this approach is that users of frontend PGO must apply the -mllvm -enable-value-profiling flag globally in TUs that enable PGO. Otherwise, some inline virtual method addresses may not be recorded and will not be able to be promoted. My assumption is that this mllvm flag is not popular, and most frontend PGO users don't enable it. Differential Revision: https://reviews.llvm.org/D102818	2021-05-20 11:09:24 -07:00
Simon Pilgrim	e8c62e4bcc	[X86][Atom] Fix vector fadd/fcmp/fmul resource/throughputs Match what's documented in the Intel AOM: fadd/fcmp use Port1 and fmul uses Port1, but in many cases BOTH ports are required, and this was being incorrectly modelled as EITHER port. Discovered while investigating the correct fptoui costs to fix the regressions in D101555. Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.	2021-05-20 18:56:58 +01:00
Alex Orlov	5487f701b4	Add support for DWARF embedded source to llvm-symbolizer. This patch adds DWARF embedded source printout to llvm-symbolizer. Reviewed By: jhenderson, dblaikie Differential Revision: https://reviews.llvm.org/D102355	2021-05-20 21:40:28 +04:00
Stefan Pintilie	7cef6d55d4	[PowerPC] Add fix to partword atomic operations Partword atomic binaries are not zero extended as they should be. This patch fixes them to ensure that they are zero extended. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D102819	2021-05-20 12:36:37 -05:00
Fraser Cormack	a3089bc5eb	[RISCV] Ensure shuffle splat operands are type-legal The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a type-legal splat value as it may implicitly extract a vector element from another shuffle. It is not permitted to introduce an illegal type when lowering shuffles. This patch addresses the crash by adding a boolean flag to `getSplatValue`, defaulting to false, which when set will ensure a type-legal return value. If it is unable to do that it will fail to return a splat value. I've been through the existing uses of `getSplatValue` in other targets and was unable to find a need or test cases showing a need to update their uses. In some cases, the call is made during `LegalizeVectorOps` which may still produce illegal scalar types. In other situations, the illegally-typed splat value may be quickly patched up to a legal type (such as any-extending the returned `extract_vector_elt` up to a legal type) before `LegalizeDAG` notices. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102687	2021-05-20 18:00:03 +01:00
Wouter van Oortmerssen	1184fb03b6	[WebAssembly] Fix PIC/GOT codegen for wasm64 __table_base is now 64-bit, since in LLVM it represents a function pointer offset. __table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there. New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added. Differential Revision: https://reviews.llvm.org/D101784	2021-05-20 09:59:31 -07:00
Steven Wu	0da7fd24df	[IR][AutoUpgrade] Drop alignment from non-pointer parameters and returns This is a follow-up of D102201. After some discussion, it is a better idea to upgrade all invalid uses of alignment attributes on function return values and parameters, not just limited to void function return types. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102726	2021-05-20 09:54:38 -07:00
Stephen Tozer	90c1db3c3c	[DebugInfo] Handle DIArgList in FastISel or GlobalISel Currently, variadic dbg.values (i.e. those using a DIArgList as part of their location) are not handled properly by FastISel or GlobalISel, and will produce invalid DBG_VALUE instructions if they encounter them. This patch fixes this issue by emitting undef DBG_VALUE instructions for variadic dbg.values, so that no incorrect instruction is produced and any prior variable location is terminated. This is simply a quick-fix to prevent errors; a correct implementation should come later for these ISel pipelines to ensure that we do not drop debug information unnecessarily. Differential Revision: https://reviews.llvm.org/D102500	2021-05-20 17:37:28 +01:00
Peter Waller	be488ff93c	[CodeGen][AArch64][SVE] Canonicalize intrinsic rdffr{ => _z} Follow up to D101357 / 3fa6510f6. Supersedes D102330. Goal: Use flags-setting rdffrs instead of rdffr + ptest. Problem: RDFFR_P doesn't have a flags-setting equivalent. Solution: in instcombine, canonicalize to RDFFR_PP at the IR level, and rely on RDFFR_PP+PTEST => RDFFRS_PP optimization in AArch64InstrInfo::optimizePTestInstr. While here: * Test that rdffr.z+ptest generates a rdffrs. * Use update_{test,llc}_checks.py on the tests. * Use sve attribute on functions. Differential Revision: https://reviews.llvm.org/D102623	2021-05-20 16:22:50 +00:00
Sanjay Patel	94e851fedf	[GlobalOpt] recompute alignments for loads and stores of updated globals GlobalOpt can slice structs/arrays and change GEPs in the process, but it was not updating alignments for load/store users. This eventually causes the crashing seen in: https://llvm.org/PR49661 https://llvm.org/PR50253 On x86, this required SLP+codegen to create an aligned vector store on an invalid address. The bugs would be easier to demonstrate on a target with stricter alignment requirements. I'm not sure if this is a complete solution. The alignment updating code is adapted from InstCombine, so I assume that part is tested and good. Differential Revision: https://reviews.llvm.org/D102552	2021-05-20 12:12:21 -04:00
Sanjay Patel	eeebdacf1b	[GlobalOpt] adjust test to show load problems; NFC Goes with D102552	2021-05-20 12:12:21 -04:00
Alexey Bataev	2a555e9e81	[SLP]Try to vectorize tiny trees with shuffled gathers of extractelements. If we gather extract elements and they actually are just shuffles, it might be profitable to vectorize them even if the tree is tiny. Differential Revision: https://reviews.llvm.org/D101460	2021-05-20 08:36:16 -07:00
Daniel Kiss	00169a8661	[ARM][AArch64] SLSHardening: make non-comdat thunks possible Linker scripts might not handle COMDAT sections. SLSHardening adds a new section for each __llvm_slsblr_thunk_xN. This new option allows the generation of the thunks into the normal text section to handle these exceptional cases. ,comdat or ,nocomdat can be added to harden-sls to control the codegen. -mharden-sls=[all|retbr|blr],nocomdat. Reviewed By: kristof.beyls Differential Revision: https://reviews.llvm.org/D100546	2021-05-20 17:07:05 +02:00
Djordje Todorovic	b69d892627	Revert "[Debugify][Original DI] Test dbg var loc preservation" This reverts commit 76f375f3d9d6902820ffc21200e454926748c678. This will be pushed again, after investigating a test failure: https://lab.llvm.org/buildbot/#/builders/16/builds/11254	2021-05-20 07:11:35 -07:00
Djordje Todorovic	8ece18da90	[Debugify][Original DI] Test dbg var loc preservation This is an improvement of [0]. This adds checking of original llvm.dbg.values()/declares() instructions in optimizations. We have picked a real issue that has been found with this (actually, picked one variable location missing from [1] and resolved the issue), and the result is the fix for that -- D100844. Before applying D100844, using the options from [0] (but with this patch applied) on the compilation of GDB 7.11, the final HTML report for the debug-info issues can be found at [1] (please scroll down, and look for "Summary of Variable Location Bugs"). After applying D100844, the numbers have improved a bit -- please take a look at [2]. [0] https://llvm.org/docs/HowToUpdateDebugInfo.html\ [1] https://djolertrk.github.io/di-check-before-adce-fix/ [2] https://djolertrk.github.io/di-check-after-adce-fix/ Differential Revision: https://reviews.llvm.org/D100845	2021-05-20 06:42:02 -07:00
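
As a rough illustration of the AIX change listed above (068fb4deaa), the following Python sketch shows the idea of printing a byte list as a single quoted string when every byte is printable. It is not LLVM's implementation: the function name `format_byte_directive`, the definition of "printable" used here, and the decimal fallback form are all assumptions made for this example.

```python
def format_byte_directive(data: bytes) -> str:
    """Render a byte list as one .byte directive (illustrative sketch only)."""
    # Assumption: "printable" means ASCII 0x20..0x7E. A double quote (0x22)
    # is treated as non-printable so this sketch needs no escaping rule.
    printable = all(0x20 <= b <= 0x7E and b != 0x22 for b in data)
    if data and printable:
        # Whole list is printable: emit it as a quoted string.
        return '.byte "' + data.decode("ascii") + '"'
    # Assumed fallback form: comma-separated decimal byte values.
    return ".byte " + ",".join(str(b) for b in data)


if __name__ == "__main__":
    print(format_byte_directive(b"Hello, world"))
    print(format_byte_directive(b"\x00A"))
```

The all-or-nothing check mirrors the commit's wording ("if the whole byte list is printable"): a single non-printable byte sends the whole list down the fallback path rather than mixing the two forms.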


216128 Commits