llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
David Green	67f2592469	[ARM][RegAlloc] Add t2LoopEndDec We currently have problems with the way that low overhead loops are specified, with LR being spilled between the t2LoopDec and the t2LoopEnd forcing the entire loop to be reverted late in the backend. As they will eventually become a single instruction, this patch introduces a t2LoopEndDec which is the combination of the two, combined before registry allocation to make sure this does not fail. Unfortunately this instruction is a terminator that produces a value (and also branches - it only produces the value around the branching edge). So this needs some adjustment to phi elimination and the register allocator to make sure that we do not spill this LR def around the loop (needing to put a spill after the terminator). We treat the loop very carefully, making sure that there is nothing else like calls that would break it's ability to use LR. For that, this adds a isUnspillableTerminator to opt in the new behaviour. There is a chance that this could cause problems, and so I have added an escape option incase. But I have not seen any problems in the testing that I've tried, and not reverting Low overhead loops is important for our performance. If this does work then we can hopefully do the same for t2WhileLoopStart and t2DoLoopStart instructions. This patch also contains the code needed to convert or revert the t2LoopEndDec in the backend (which just needs a subs; bne) and the code pre-ra to create them. Differential Revision: https://reviews.llvm.org/D91358	2020-12-10 12:14:23 +00:00
Martin Storsjö	102c71a022	[llvm-rc] Handle driveless absolute windows paths when loading external files When llvm-rc loads an external file, it looks for it relative to a number of include directories and the current working directory. If the path is considered absolute, llvm-rc tries to open the filename as such, and doesn't try to open it relative to other paths. On Windows, a path name like "\dir\file" isn't considered absolute as it lacks the drive name, but by appending it on top of the search dirs, it's not found. LLVM's sys::path::append just appends such a path (same with a properly absolute posix path) after the paths it's supposed to be relative to. This fix doesn't handle the case if the resource script and the external file are on a different drive than the current working directory; to fix that, we'd have to make LLVM's sys::path::append handle appending fully absolute and partially absolute paths (ones lacking a drive prefix but containing a root directory), or switch to C++17's std::filesystem. Differential Revision: https://reviews.llvm.org/D92558	2020-12-10 14:11:06 +02:00
Alexey Lapshin	8c9a1f9e87	[dsymutil][DWARFLinker][NFC] Make interface of AddressMap more general. Current interface of AddressMap assumes that relocations exist. That is correct for not-linked object file but is not correct for linked executable. This patch changes interface in such way that AddressMap could be used not only with not-linked object files: hasValidRelocationAt() replaced with: hasLiveMemoryLocation() hasLiveAddressRange() Differential Revision: https://reviews.llvm.org/D87723	2020-12-10 14:57:08 +03:00
Mirko Brkusanin	e195dd75ce	[AMDGPU] Resolve issues when picking between ds_read/write and ds_read2/write2 Both ds_read_b128 and ds_read2_b64 are valid for 128bit 16-byte aligned loads but the one that will be selected is determined either by the order in tablegen or by the AddedComplexity attribute. Currently ds_read_b128 has priority. While ds_read2_b64 has lower alignment requirements, we cannot always restrict ds_read_b128 to 16-byte alignment because of unaligned-access-mode option. This was causing ds_read_b128 to be selected for 8-byte aligned loads regardles of chosen access mode. To resolve this we use two patterns for selecting ds_read_b128. One requires alignment of 16-byte and the other requires unaligned-access-mode option. Same goes for ds_write2_b64 and ds_write_b128. Differential Revision: https://reviews.llvm.org/D92767	2020-12-10 12:40:49 +01:00
David Green	b4f282c77f	[ARM] Additional test for Min loop. NFC	2020-12-10 10:49:00 +00:00
David Green	04038723cc	[ARM] Remove copies from low overhead phi inductions. The phi created in a low overhead loop gets created with a default register class it seems. There are then copied inserted between the low overhead loop pseudo instructions (which produce/consume GPRlr instructions) and the phi holding the induction. This patch removes those as a step towards attempting to make t2LoopDec and t2LoopEnd a single instruction, and appears useful in it's own right as shown in the tests. Differential Revision: https://reviews.llvm.org/D91267	2020-12-10 10:30:31 +00:00
Jun Ma	0d59bcd1c0	[TruncInstCombine] Remove scalable vector restriction Differential Revision: https://reviews.llvm.org/D92819	2020-12-10 18:00:19 +08:00
Benjamin Kramer	5449a6bbe1	Remove Shapet assignment operator that's identical to the default. NFC.	2020-12-10 10:58:41 +01:00
Benjamin Kramer	94cf8f1d41	[Hexagon] Fold single-use variables into assert. NFCI. Silences unused variable warnings in Release builds.	2020-12-10 10:53:56 +01:00
David Green	683e29b9a4	[ARM] MVE vcreate tests, for dual lane moves. NFC	2020-12-10 09:17:34 +00:00
LLVM GN Syncbot	dce1606292	[gn build] Port f80b29878b0	2020-12-10 09:13:09 +00:00
Luo, Yuanke	4a2765406d	[X86] AMX programming model. This patch implements amx programming model that discussed in llvm-dev (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html). Thank Hal for the good suggestion in the RA. The fast RA is not in the patch yet. This patch implemeted 7 components. 1. The c interface to end user. 2. The AMX intrinsics in LLVM IR. 3. Transform load/store <256 x i32> to AMX intrinsics or split the type into two <128 x i32>. 4. The Lowering from AMX intrinsics to AMX pseudo instruction. 5. Insert psuedo ldtilecfg and build the def-use between ldtilecfg to amx intruction. 6. The register allocation for tile register. 7. Morph AMX pseudo instruction to AMX real instruction. Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0 Differential Revision: https://reviews.llvm.org/D87981	2020-12-10 17:01:54 +08:00
Lang Hames	bc304940bf	[JITLink][ELF] Reformat/add debug logging in ELF_x86_64.cpp. Moves symbol name to the end of the output and makes other columns fixed width so that they line up.	2020-12-10 18:46:44 +11:00
Kazu Hirata	9793163151	[Tablegen] Use llvm::is_contained (NFC)	2020-12-09 23:34:07 -08:00
Sergey Dmitriev	9302af918f	[llvm-link][NFC] Minor cleanup llvm::Linker::linkModules() is a static member, so there is no need to pass reference to llvm::Linker instance to loadArFile() function. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D92918	2020-12-09 23:16:13 -08:00
Kazushi (Jam) Marukawa	4bf07b5b90	[VE][NFC] Disable VP tests VP tests recently added don't work on Release mode. They work on Debug mode, so I disable them on Release mode to make tests work.	2020-12-10 15:13:05 +09:00
Arthur Eubanks	07235ffbaf	[test] Fix coro-retcon.ll under NPM The full aa-pipeline is required to remove the extra store.	2020-12-09 22:04:59 -08:00
Alina Sbirlea	1574bc6938	[MemorySSA/docs] Extend MemorySSA documentation.	2020-12-09 18:00:16 -08:00
Arthur Eubanks	3c001d0408	[LTO][NPM] Default to using NPM under ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER This affects users of LTO that don't explicitly set UseNewPM. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D92894	2020-12-09 17:48:47 -08:00
Fangrui Song	ead0e3209d	Rename -plugin-opt=no-new-pass-manager to -plugin-opt=legacy-pass-manager	2020-12-09 16:43:30 -08:00
Stanislav Mekhanoshin	7b785e39c8	[AMDGPU] Fix expansion of 192 bit spills in PEI Differential Revision: https://reviews.llvm.org/D92979	2020-12-09 16:36:29 -08:00
Krzysztof Parzyszek	071e5ba62e	[Hexagon] Silence warnings about unused objects	2020-12-09 17:54:10 -06:00
Krzysztof Parzyszek	1a325ba866	[Hexagon] Fix build: move template specialization into namespace scope	2020-12-09 17:40:15 -06:00
Scott Linder	6218f09b03	[MC] Fix ICE with non-newline terminated input There is an explicit option for the lexer to support this, but we crash when `-preserve-comments` is enabled because it checks for `getTok().getString().empty()` to detect the case. This doesn't work currently because the lexer reports this case as a string of length 1, containing a null byte. Change the lexer to instead report this case via an empty string, as the null terminator isn't logically a part of the textual input, and the check for `.empty()` seems natural and obvious in the calling code. Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D92681	2020-12-09 23:39:32 +00:00
LLVM GN Syncbot	89ea615411	[gn build] Port f5d07a05bbd	2020-12-09 23:12:27 +00:00
Krzysztof Parzyszek	deb082d99d	[Hexagon] Realign HVX vectors wherever possible Introduce HexagonVectorCombine as a helper class for vector-related optimizations.	2020-12-09 17:11:25 -06:00
Saleem Abdulrasool	4ae2c1c200	X86: use a data driven configuration of Windows x86 libcalls (NFC) Rather than creating a series of associated calls and ensuring that everything is lined up, use a table driven approach that ensures that they two always stay in sync.	2020-12-09 22:49:11 +00:00
Scott Linder	19b5d1fffc	[MC][AMDGPU] Consume EndOfStatement in asm parser Avoids spurious newlines showing up in the output when emitting assembly via MC. Reviewed By: MaskRay, arsenm Differential Revision: https://reviews.llvm.org/D92690	2020-12-09 21:45:55 +00:00
Craig Topper	067e0b2781	[X86] Use APInt::isSignedIntN instead of isIntN for 64-bit ANDs in X86DAGToDAGISel::IsProfitableToFold Pretty sure we meant to be checking signed 32 immediates here rather than unsigned 32 bit. I suspect I messed this up because in MathExtras.h we have isIntN and isUIntN so isIntN differs in signedness depending on whether you're using APInt or plain integers. This fixes a case where we didn't fold a constant created by shrinkAndImmediate. Since shrinkAndImmediate doesn't topologically sort constants it creates, we can fail to convert the Constant to a TargetConstant. This leads to very strange behavior later. Fixes PR48458.	2020-12-09 13:39:07 -08:00
Fangrui Song	7f2a5362d1	[LLD][gold] Add -plugin-opt=no-new-pass-manager -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on configured LLD and LLVMgold.so will use the new pass manager by default. Add an option to use the legacy pass manager. This will also be used by the Clang driver when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set. Reviewed By: aeubanks, tejohnson Differential Revision: https://reviews.llvm.org/D92916	2020-12-09 13:31:03 -08:00
Yuanfang Chen	bbd8d2f9e4	[NFCI] Add missing triple to several LTO tests Also remove the module triple of clang/test/CodeGenObjC/arc.ll, the commandline tripe is all it needs.	2020-12-09 13:13:58 -08:00
Scott Linder	ad95bab280	[AMDGPU][MC] Restore old error position for "too few operands" Revert part of https://reviews.llvm.org/D92084 to make it simpler to start consuming the EndOfStatement token within AMDGPU's ParseInstruction in a future patch. This also brings us back to what every other target currently does. A future change to move the position back to the end of the statement would likely need to audit all of the AMDGPUOperand SMLoc ranges, and determine the SMLoc for the last character of the last operand. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D92960	2020-12-09 21:09:47 +00:00
Sam Clegg	cee0963ebc	[WebAssembly] Add support for named data sections in wasm binaries Followup to https://reviews.llvm.org/D91769 which added support for names globals. Differential Revision: https://reviews.llvm.org/D92909	2020-12-09 12:57:07 -08:00
Mircea Trofin	99ada595aa	[NFC] Removed unused prefixes in llvm/test/CodeGen/AArch64 Differential Revision: https://reviews.llvm.org/D92943	2020-12-09 12:47:51 -08:00
Florian Hahn	2e708d6d56	[AArch64] Add aarch64_neon_vcmla{_rot{90,180,270}} intrinsics. Add builtins required to implement vcmla and rotated variants from the ACLE Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92929	2020-12-09 19:46:49 +00:00
Michael Munday	0e3bafc4e2	[RISCV][NFC] Regenerate RISCV CodeGen tests Regenerated using: ./llvm/utils/update_llc_test_checks.py -u llvm/test/CodeGen/RISCV/*.ll This has added comments to spill-related instructions and added @plt to some symbols. Differential Revision: https://reviews.llvm.org/D92841	2020-12-09 19:42:49 +00:00
Jianzhou Zhao	38add4d4b7	[dfsan] Track field/index-level shadow values in variables ************* * The problem ************* See motivation examples in compiler-rt/test/dfsan/pair.cpp. The current DFSan always uses a 16bit shadow value for a variable with any type by combining all shadow values of all bytes of the variable. So it cannot distinguish two fields of a struct: each field's shadow value equals the combined shadow value of all fields. This introduces an overtaint issue. Consider a parsing function std::pair<char, int> get_token(char p); where p points to a buffer to parse, the returned pair includes the next token and the pointer to the position in the buffer after the token. If the token is tainted, then both the returned pointer and int ar tainted. If the parser keeps on using get_token for the rest parsing, all the following outputs are tainted because of the tainted pointer. The CL is the first change to address the issue. ************************** * The proposed improvement ************************ Eventually all fields and indices have their own shadow values in variables and memory. For example, variables with type {i1, i3}, [2 x i1], {[2 x i4], i8}, [2 x {i1, i1}] have shadow values with type {i16, i16}, [2 x i16], {[2 x i16], i16}, [2 x {i16, i16}] correspondingly; variables with primary type still have shadow values i16. ************************* * An potential implementation plan ************************* The idea is to adopt the change incrementially. 1) This CL Support field-level accuracy at variables/args/ret in TLS mode, load/store/alloca still use combined shadow values. After the alloca promotion and SSA construction phases (>=-O1), we assume alloca and memory operations are reduced. So if struct variables do not relate to memory, their tracking is accurate at field level. 2) Support field-level accuracy at alloca 3) Support field-level accuracy at load/store These two should make O0 and real memory access work. 4) Support vector if necessary. 5) Support Args mode if necessary. 6) Support passing more accurate shadow values via custom functions if necessary. ************* * About this CL. *************** The CL did the following 1) extended TLS arg/ret to work with aggregate types. This is similar to what MSan does. 2) implemented how to map between an original type/value/zero-const to its shadow type/value/zero-const. 3) extended (insert\|extract)value to use field/index-level progagation. 4) for other instructions, propagation rules are combining inputs by or. The CL converts between aggragate and primary shadow values at the cases. 5) Custom function interfaces also need such a conversion because all existing custom functions use i16. It is unclear whether custome functions need more accurate shadow propagation yet. 6) Added test cases for aggregate type related cases. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D92261	2020-12-09 19:38:35 +00:00
Justin Bogner	833d76977e	Limit the recursion depth of SelectionDAG::isSplatValue() This method previously always recursively checked both the left-hand side and right-hand side of binary operations for splatted (broadcast) vector values to determine if the parent DAG node is a splat. Like several other SelectionDAG methods, limit the recursion depth to MaxRecursionDepth (6). This prevents stack overflow. See also https://issuetracker.google.com/173785481 Patch by Nicolas Capens. Thanks! Differential Revision: https://reviews.llvm.org/D92421	2020-12-09 10:35:07 -08:00
Alexey Bader	792476a7c3	[MCJIT] Add cmake variables to customize ittapi git location and revision. To support llorg builds this patch provides the following changes: 1) Added cmake variable ITTAPI_GIT_REPOSITORY to control the location of ITTAPI repository. Default value of ITTAPI_GIT_REPOSITORY is github location: https://github.com/intel/ittapi.git Also, the separate cmake variable ITTAPI_GIT_TAG was added for repo tag. 2) Added cmake variable ITTAPI_SOURCE_DIR to control the place where the repo will be cloned. Default value of ITTAPI_SOURCE_DIR is build area: PROJECT_BINARY_DIR Reviewed By: etyurin, bader Patch by ekovanov. Differential Revision: https://reviews.llvm.org/D91935	2020-12-09 21:04:24 +03:00
Arthur Eubanks	286730195f	Reland Pin -loop-reduce to legacy PM This was accidentally reverted by a later change. LSR currently only runs in the codegen pass manager. There are a couple issues with LSR and the NPM. 1) Lots of tests assume that LCSSA isn't run before LSR. This breaks a bunch of tests' expected output. This is fixable with some time put in. 2) LSR doesn't preserve LCSSA. See llvm/test/Analysis/MemorySSA/update-remove-deadblocks.ll. LSR's use of SCEVExpander is the only use of SCEVExpander where the PreserveLCSSA option is off. Turning it on causes some code sinking out of loops to fail due to SCEVExpander's inability to handle the newly created trivial PHI nodes in the broken critical edge (I was looking at llvm/test/Transforms/LoopStrengthReduce/X86/2011-11-29-postincphi.ll). I also tried simply just calling formLCSSA() at the end of LSR, but the extra PHI nodes cause regressions in codegen tests. We'll delay figuring these issues out until later. This causes the number of check-llvm failures with -enable-new-pm true by default to go from 60 to 29. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D92796	2020-12-09 09:57:57 -08:00
Fangrui Song	7056fcfd30	[CMake] Add llvm-profgen to LLVM_TEST_DEPENDS Otherwise `check-llvm-*` may not rebuild llvm-profgen, causing llvm-profgen tests to fail if llvm-profgen happens to be stale.	2020-12-09 09:34:51 -08:00
Mircea Trofin	5dea8b8d27	[FileCheck] Enforce --allow-unused-prefixes=false for llvm/test/Transforms Explicitly opt-out llvm/test/Transforms/Attributor. Verified by flipping the default value of allow-unused-prefixes and observing that none of the failures were under llvm/test/Transforms. Differential Revision: https://reviews.llvm.org/D92404	2020-12-09 08:51:38 -08:00
LLVM GN Syncbot	5fe8872286	[gn build] Port b804eef0905	2020-12-09 16:19:07 +00:00
LLVM GN Syncbot	f4bc2a0129	[gn build] Port ac7864ec019	2020-12-09 16:19:07 +00:00
LLVM GN Syncbot	d32f21ec22	[gn build] Port 5934a79196b	2020-12-09 16:19:06 +00:00
Kazushi (Jam) Marukawa	12d012b50c	[VE] Add vsum and vfsum intrinsic instructions Add vsum and vfsum intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92938	2020-12-10 01:11:53 +09:00
Paul C. Anagnostopoulos	02220af447	[TableGen] Cache the vectors of records returned by getAllDerivedDefinitions(). Differential Revision: https://reviews.llvm.org/D92674	2020-12-09 10:54:04 -05:00
Sanjay Patel	d6e6ff92ec	[VectorCombine] allow peeking through an extractelt when creating a vector load This is an enhancement to load vectorization that is motivated by a pattern in https://llvm.org/PR16739. Unfortunately, it's still not enough to make a difference there. We will have to handle multi-use cases in some better way to avoid creating multiple overlapping loads. Differential Revision: https://reviews.llvm.org/D92858	2020-12-09 10:36:14 -05:00
Roman Lebedev	ae9bbdd2bb	[InstCombine] canonicalizeSaturatedAdd(): last fold is only valid for strict comparison (PR48390) We could create uadd.sat under incorrect circumstances if a select with -1 as the false value was canonicalized by swapping the T/F values. Unlike the other transforms in the same function, it is not invariant to equality. Some alive proofs: https://alive2.llvm.org/ce/z/emmKKL Based on original patch by David Green! Fixes https://bugs.llvm.org/show_bug.cgi?id=48390 Differential Revision: https://reviews.llvm.org/D92717	2020-12-09 18:19:09 +03:00
Roman Lebedev	be06a88cc2	[NFC][InstCombine] Add test coverage for @llvm.uadd.sat canonicalization The non-strict variants are already handled because they are canonicalized to strict variants by swapping hands in both the select and icmp, and the fold simply considers that strictness is irrelevant here. But that isn't actually true for the last pattern, as PR48390 reports.	2020-12-09 18:19:08 +03:00

1 2 3 4 5 ...

208037 Commits