between the two instructions.
If there's a pattern like:
$xA = ADRP foo @PAGE
[some killing use of reg $xB]
$xB = ADDXri $xA, 0, @PAGEOFF
CollectLOH would create an AdrpAdd LOH that resulted in the linker optimizing
this sequence into:
$xB = ADR foo
[some killing use of reg $xB]
... and therefore clobbers the live $xB register that was used by the
instruction in between.
This was discovered by GlobalISel patch D78465, which broke global variable
accesses up into two pseudo-instructions that, in some cases, could be moved apart.
Differential Revision: https://reviews.llvm.org/D80834
Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changes the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime.
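For reference, entries in OMPKinds.def take roughly this form (a sketch of
the __OMP_RTL macro; the parameter types shown are illustrative, so check
OMPKinds.def for the authoritative entry):
```
// __OMP_RTL(Name, IsVarArg, ReturnType, ParameterTypes...)
// Illustrative entry; the real types live in OMPKinds.def.
__OMP_RTL(__kmpc_push_num_teams, false, Void, IdentPtr, Int32, Int32, Int32)
```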
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits
Tags: #openmp, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80222
Update for upstream comments. Improve test by writing all the debug
info by hand.
Reviewers: dblaikie, jhenderson
Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80168
Summary:
This simplifies the interface by storing the function analysis manager
with the InlineAdvisor, thus not requiring it to be passed each time we
ask for advice.
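A minimal sketch of the resulting shape (hypothetical class and member
names, not the exact tree code):
```
#include <memory>
#include "llvm/IR/PassManager.h" // FunctionAnalysisManager
namespace llvm { class CallBase; }
using namespace llvm;

struct InlineAdvice; // stand-in for the real advice class

// Sketch: the advisor captures the FAM once at construction, so getAdvice()
// no longer needs a FunctionAnalysisManager parameter on every query.
class InlineAdvisorSketch {
  FunctionAnalysisManager &FAM;
public:
  explicit InlineAdvisorSketch(FunctionAnalysisManager &FAM) : FAM(FAM) {}
  std::unique_ptr<InlineAdvice> getAdvice(CallBase &CB);
};
```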
Reviewers: davidxl, asbirlea
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80405
This patch implements matrix index expressions
(matrix[RowIdx][ColumnIdx]).
It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx).
MatrixSubscriptExprs are built in two steps in ActOnMatrixSubscriptExpr. First,
if the base of a subscript is of matrix type, we create an incomplete
MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete
MatrixSubscriptExpr, we create a complete
MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx).
Similar to vector elements, it is not possible to take the address of
a MatrixSubscriptExpr.
For CodeGen, a new MatrixElt type is added to LValue, which is very
similar to VectorElt. The only difference is that we may need to cast
the type of the base from an array to a vector type when accessing it.
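At the source level this looks like the following (a sketch using Clang's
matrix extension, assuming -fenable-matrix):
```
typedef float m4x4_t __attribute__((matrix_type(4, 4)));

float get(m4x4_t m, unsigned r, unsigned c) {
  // Parsed as an incomplete m[r] first, then completed with the
  // column index c into MatrixSubscriptExpr(m, r, c).
  return m[r][c];
}

void set(m4x4_t *m, float v) {
  (*m)[1][2] = v;            // elements are assignable...
  // float *p = &(*m)[1][2]; // ...but, like vector elements, not addressable
}
```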
Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D76791
SimplifyDemandedVectorElts() bails out on ScalableVectorType
anyway, but we can exit faster with the external check.
Move this to a helper function because there are likely other
vector folds that we can try here.
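Roughly, the helper has this shape (hypothetical name; a simplified sketch
of the idea, not the actual code):
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/InstrTypes.h"
using namespace llvm;

// Sketch: check for scalable vectors up front so we skip the whole
// demanded-elements walk, which would reject them anyway.
static Value *foldVectorOpSketch(BinaryOperator &BO) {
  if (isa<ScalableVectorType>(BO.getType()))
    return nullptr; // fast exit: demanded elements need a fixed length
  // ... demanded-elements fold and, later, other vector folds go here ...
  return nullptr;
}
```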
In this awkward case, we have to emit custom pseudo-constrained FP
wrappers. InstrEmitter concludes that since a mayRaiseFPException
instruction had a chain, it can't add nofpexcept.
Test deferred until mayRaiseFPException is really set on everything.
Summary:
Instead of iterating over all VarLoc IDs in removeEntryValue(), just
iterate over the interval reserved for entry value VarLocs. This changes
the iteration order, hence the test update -- otherwise this is NFC.
This appears to give an ~8.5x wall time speed-up for LiveDebugValues when
compiling sqlite3.c 3.30.1 with a Release clang (on my machine):
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
Before: 2.5402 ( 18.8%) 0.0050 ( 0.4%) 2.5452 ( 17.3%) 2.5452 ( 17.3%) Live DEBUG_VALUE analysis
After: 0.2364 ( 2.1%) 0.0034 ( 0.3%) 0.2399 ( 2.0%) 0.2398 ( 2.0%) Live DEBUG_VALUE analysis
```
The change in removeEntryValue() is the only one that appears to affect
wall time, but for consistency (and to resolve a pending TODO), I made
the analogous changes for iterating over SpillLocKind VarLocs.
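The shape of the iteration change, as a simplified self-contained sketch
(the real pass has its own ID encoding; the names and reserved range here
are hypothetical):
```
#include <cstdint>
#include <set>

// Entry-value VarLocs get IDs from a reserved, contiguous range, so instead
// of scanning every open VarLoc ID we iterate just that slice of the set.
constexpr uint64_t EntryValBegin = 1ULL << 62; // hypothetical reserved range
constexpr uint64_t EntryValEnd   = 1ULL << 63;

void forEachEntryValueID(const std::set<uint64_t> &OpenRanges) {
  auto It  = OpenRanges.lower_bound(EntryValBegin);
  auto End = OpenRanges.lower_bound(EntryValEnd);
  for (; It != End; ++It) {
    // ... inspect or remove the entry-value VarLoc with ID *It ...
  }
}
```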
Reviewers: nikic, aprantl, jmorse, djtodoro
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80684
The AMDGPU non-strict fdiv lowering needs to introduce an FP mode
switch in some cases, and has custom nodes to provide chain/glue for
the intermediate FP operations. We need to propagate nofpexcept here,
but getNode was dropping the flags.
Adding nofpexcept in the AMDGPU custom lowering is left to a future
patch.
Also fix a second case where flags were dropped; in that case, it seems
getNode just didn't handle this number of operands.
Test will be included in future AMDGPU patch.
Summary:
The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize)
under the partial sample PGO may not be accurate because the profile is partial
and the number of hot profile counters in the ProfileSummary may not reflect the
actual working set size of the program being compiled.
To improve this, we compute the (approximate) ratio of the number of profile
counters of the program being compiled to the number of profile counters in the
partial sample profile (called the partial profile ratio) and scale the working
set size of the profile by this ratio, so that it reflects the working set size
of the program being compiled for the working set size heuristics.
The partial profile ratio is approximated based on the number of basic blocks
in the program and the NumCounts field in the ProfileSummary, and is computed
during ThinLTO indexing. This means there is a limitation: the scaled working
set size is available to the ThinLTO post-link passes only.
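For illustration, the scaling amounts to roughly the following (a sketch
with hypothetical names, not the actual ProfileSummaryInfo code):
```
#include <cstdint>

// Hypothetical sketch: approximate the partial profile ratio as the number
// of counter sites in the program being compiled (estimated from its basic
// block count at ThinLTO indexing time) over the counters recorded in the
// partial profile's summary, then scale the profile's working set size.
double partialProfileRatio(uint64_t ProgramBasicBlocks,
                           uint64_t ProfileNumCounts) {
  return double(ProgramBasicBlocks) / double(ProfileNumCounts);
}

uint64_t scaledWorkingSetSize(uint64_t ProfileWorkingSetSize, double Ratio) {
  return uint64_t(ProfileWorkingSetSize * Ratio);
}
```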
Reviewers: davidxl
Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79831
Summary:
While clustering mem ops, the AMDGPU target needs to consider the number of
clustered bytes to decide on the maximum number of mem ops that can be
clustered. This patch adds support for passing the number of clustered bytes
to the target's mem-op clustering logic.
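The hook change is roughly as follows (a paraphrased sketch; see
TargetInstrInfo for the authoritative signature):
```
#include "llvm/ADT/ArrayRef.h"
namespace llvm { class MachineOperand; }
using namespace llvm;

class TargetInstrInfoSketch {
public:
  // NumBytes is the new piece of information: the total byte size of the
  // candidate cluster, letting a target cap clusters by bytes rather than
  // only by the number of mem ops.
  virtual bool shouldClusterMemOps(ArrayRef<const MachineOperand *> BaseOps1,
                                   ArrayRef<const MachineOperand *> BaseOps2,
                                   unsigned NumLoads, unsigned NumBytes) const;
};
```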
Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar
Reviewed By: foad
Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80545
This flag (and the whole field DT_FLAGS_1) originated from Solaris. I intend to use it in an LLD patch D80872.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D80871
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and
D80584, the name 'tmp' is almost always a bad choice, but we have
a legacy of regression tests with that name because it was baked
into utils/update_test_checks.py.
This change makes -instnamer more consistent (it already uses "arg" and
"bb", the common LLVM shorthand). It also avoids a conflict where we tell
users of the FileCheck script to run -instnamer to create a better
regression test, and doing so then causes a warn/fail in
update_test_checks.py.
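For reference, the effect of -instnamer is essentially the following (a
simplified sketch of the pass logic):
```
#include "llvm/IR/Function.h"
using namespace llvm;

// Sketch: anonymous values get short, consistent names ("arg", "bb", "i")
// so tests no longer bake in the problematic "tmp".
static void nameAnonymousValues(Function &F) {
  for (Argument &Arg : F.args())
    if (!Arg.hasName())
      Arg.setName("arg");
  for (BasicBlock &BB : F) {
    if (!BB.hasName())
      BB.setName("bb");
    for (Instruction &I : BB)
      if (!I.hasName() && !I.getType()->isVoidTy())
        I.setName("i");
  }
}
```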
This will ensure that nothing can ever start parsing data from a future
sequence, and that partially read data will be returned as 0 instead.
Reviewed by: aprantl, labath
Differential Revision: https://reviews.llvm.org/D80796
This is effectively reverting rGbfdc2552664d to avoid test churn
while we figure out a better way forward.
We at least salvage the warning on name conflict from that patch
though.
If we change the default string again, we may want to mass update
tests at the same time. Alternatively, we could live with the poor
naming if we change -instnamer.
This also adds a test to LLVM as suggested in the post-commit
review. There's a clang test that is also affected. That seems
like a layering violation, but I have not looked at fixing that yet.
Differential Revision: https://reviews.llvm.org/D80584
The debug_line_invalid.test test case was previously using the
interpreted line table dumping to identify which opcodes have been
parsed. This change moves to looking for the expected opcodes
explicitly. This is probably a little clearer and also allows for
testing some cases that wouldn't be easily identifiable from the
interpreted table.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D80795
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.
Differential Revision: https://reviews.llvm.org/D80806
Partially reverts feee98645dde4be31a70cc6660d2fc4d4b9d32d8.
Add explicit braces to a different place to fix
"error: add explicit braces to avoid dangling else [-Werror,-Wdangling-else]"
This is a reimplementation of the `orderNodes` function, as the old
implementation didn't take into account all cases.
The new implementation uses SCCs instead of Loops to take account of
irreducible loops.
Fix PR41509
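A simplified sketch of the SCC-based ordering idea (hypothetical shape, not
the exact new code):
```
#include "llvm/ADT/SCCIterator.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/RegionIterator.h"
using namespace llvm;

// Sketch: walk the region graph one SCC at a time. Unlike a LoopInfo-based
// walk, scc_iterator also sees irreducible cycles, so the nodes of any
// cycle stay together in the resulting order.
static void orderNodesSketch(Region *R, SmallVectorImpl<RegionNode *> &Order) {
  for (auto I = scc_begin(R); !I.isAtEnd(); ++I)
    for (RegionNode *N : *I) // one SCC (single node or whole cycle)
      Order.push_back(N);
}
```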
Differential Revision: https://reviews.llvm.org/D79037
This improves the following points for broken hash tables:
1) Use reportUniqueWarning to prevent duplication when
--hash-table and --elf-hash-histogram are used together.
2) Dump the nbuckets and nchain fields. It is often possible
to dump them even when the table itself goes past the EOF, etc.
Differential Revision: https://reviews.llvm.org/D80373
When a stack offset was too big to materialize in a single instruction, we were
trying to do it in stages:
adds xD, sp, #imm
adds xD, xD, #imm
Unfortunately, if xD is xzr then the second instruction doesn't exist and
wouldn't do what was needed if it did. Instead we can use a temporary register
for all but the last addition.
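For illustration, the fixed expansion looks something like this (register
names are placeholders):
add xTmp, sp, #imm1
adds xD, xTmp, #imm2
Only the final addition writes xD, which works even when xD is xzr.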
Relying on the find method implies a round trip to the iterator world, which
is not costless, because iterator creation involves a few checks to ensure the
iterator is in a valid position (through the
SmallPtrSetIteratorImpl::AdvanceIfNotValid method). It turns out that the
result of SmallPtrSetImpl::find_imp is either valid or the EndPointer, so
there's no need to go through that abstraction, and the compiler cannot
figure this out by itself.
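In code, the idea is roughly the following (a sketch over the internals
named above, framed as a SmallPtrSet member, not the exact patch):
```
// Sketch: a membership test through find() pays for iterator construction
// (and AdvanceIfNotValid); comparing the raw find_imp result against
// EndPointer is a plain pointer comparison.
bool containsSketch(const void *Ptr) const {
  return find_imp(Ptr) != EndPointer();
}
```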
Differential Revision: https://reviews.llvm.org/D80708
Summary: Exploit vabsd* for absolute difference of vectors on P9,
for example:
void foo (char *restrict p, char *restrict q, char *restrict t)
{
  for (int i = 0; i < 16; i++)
    t[i] = abs (p[i] - q[i]);
}
This case should be matched to the HW instruction vabsdub.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D80271
Previously we walked the users of any vector binop looking for
more binops with the same opcode or phis that eventually ended up
in a reduction. While this is simple it also means visiting the
same nodes many times since we'll do a forward walk for each
BinaryOperator in the chain. It was also far more general than what
we have tests for or expect to see.
This patch replaces the algorithm with a new method that starts at
extract elements looking for a horizontal reduction. Once we find a
reduction, we walk backwards through phis and adds to collect leaves
that we can consider for rewriting. We only consider single-use adds
and phis, except for a special case where the add is used by a phi
that forms a loop back to the add; other single-use adds are included
there to support unrolled loops.
Ultimately, I want to narrow the Adds, Phis, and final reduction
based on the partial reduction we're doing. I still haven't
figured out exactly what that looks like yet. But restricting
the types of graphs we expect to handle seemed like a good first
step. As does having all the leaves and the reduction at once.
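A sketch of the backward leaf collection described above (simplified;
hypothetical helper, and the loop-phi special case is elided):
```
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Starting from the value feeding the extract-element reduction, peel
// backwards through single-use adds and phis; anything else is a leaf
// we may later narrow and rewrite.
static void collectLeaves(Value *Root, SmallVectorImpl<Value *> &Leaves) {
  SmallPtrSet<Value *, 8> Visited;
  SmallVector<Value *, 8> Worklist;
  Worklist.push_back(Root);
  while (!Worklist.empty()) {
    Value *V = Worklist.pop_back_val();
    if (!Visited.insert(V).second)
      continue;
    auto *I = dyn_cast<Instruction>(V);
    if (I && I->hasOneUse() &&
        (I->getOpcode() == Instruction::Add || isa<PHINode>(I))) {
      for (Value *Op : I->operands())
        Worklist.push_back(Op);
      continue;
    }
    Leaves.push_back(V); // not a single-use add/phi: stop here
  }
}
```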
Differential Revision: https://reviews.llvm.org/D79971
This matches what we do for the full sized vector ops at the start of combineX86ShufflesRecursively, and helps getFauxShuffleMask extract more INSERT_SUBVECTOR patterns.
I inverted the mask when I ported to the new form of G_PTRMASK in
8bc03d2168241f7b12265e9cd7e4eb7655709f34.
I don't think this really broke anything, since G_VASTART isn't
handled for types with an alignment higher than the stack alignment.
As discussed in PR45951:
https://bugs.llvm.org/show_bug.cgi?id=45951
There's a potential name collision between update_test_checks.py and -instnamer
and/or manually-generated IR test files because all of them try to use the
variable name that should never be used: "tmp".
This patch proposes to reduce the odds of collision and adds a warning if we
detect the problem. This will cause regression test churn when regenerating
CHECK lines on existing files.
Differential Revision: https://reviews.llvm.org/D80584