llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
James Molloy	b8dff35c25	[TableGen] Fix crash when using HwModes in CodeEmitterGen When an instruction has an encoding definition for only a subset of the available HwModes, ensure we just avoid generating an encoding rather than crash. llvm-svn: 374150	2019-10-09 09:15:34 +00:00
Clement Courbet	9a982602b3	[llvm-exegesis] Add missing std::move in rL374146. This was breaking some bots:
/home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/include/llvm/Support/Error.h:483:5: required from ‘llvm::Expected<T>::Expected(OtherT&&, typename std::enable_if<std::is_convertible<_Rep2, _Rep>::value>::type) [with OtherT = std::vector<llvm::exegesis::CodeTemplate>&; T = std::vector<llvm::exegesis::CodeTemplate>; typename std::enable_if<std::is_convertible<_Rep2, _Rep>::value>::type = void]’
/home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:238:20: required from here
/usr/include/c++/6/bits/stl_construct.h:75:7: error: use of deleted function ‘llvm::exegesis::CodeTemplate::CodeTemplate(const llvm::exegesis::CodeTemplate&)’
{ ::new(static_cast<void*>(__p)) _T1(std::forward<_Args>(__args)...); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
llvm-svn: 374149	2019-10-09 09:07:21 +00:00
Hans Wennborg	70269b4845	Unify the two CRC implementations David added the JamCRC implementation in r246590. More recently, Eugene added a CRC-32 implementation in r357901, which falls back to zlib's crc32 function if present. These checksums are essentially the same, so having multiple implementations seems unnecessary. This replaces the CRC-32 implementation with the simpler one from JamCRC, and implements the JamCRC interface in terms of CRC-32 since this means it can use zlib's implementation when available, saving a few bytes and potentially making it faster. JamCRC took an ArrayRef<char> argument, and CRC-32 took a StringRef. This patch changes it to ArrayRef<uint8_t> which I think is the best choice, and simplifies a few of the callers nicely. Differential revision: https://reviews.llvm.org/D68570 llvm-svn: 374148	2019-10-09 09:06:30 +00:00
Clement Courbet	2a037be6e6	[llvm-exegesis][NFC] Fix rL374146. Remove extra semicolon: Target.cpp:187:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 374147	2019-10-09 09:03:42 +00:00
Clement Courbet	b3877eff38	[llvm-exegesis] Explore LEA addressing modes. Summary: This will help for PR32326. This shows the well-known issue with `RBP` and `R13` as base registers. Reviewers: gchatelet Subscribers: tschuett, llvm-commits, RKSimon, andreadb Tags: #llvm Differential Revision: https://reviews.llvm.org/D68646 llvm-svn: 374146	2019-10-09 08:49:13 +00:00
Jeremy Morse	c79513d483	Revert r374139, "[dsymutil] Fix handling of common symbols in multiple object files." The added test files ("com", "com1.o", "com2.o") are reserved names on Windows, and make 'git checkout' fail with a filesystem error. llvm-svn: 374144	2019-10-09 08:27:48 +00:00
Clement Courbet	932bbb565b	[llvm-exegesis][NFC] Remove unnecessary `using llvm::` directives. We've been in namespace llvm for at least a year. llvm-svn: 374143	2019-10-09 07:52:07 +00:00
Jonas Devlieghere	5ba2afa18c	[dsymutil] Fix handling of common symbols in multiple object files. For common symbols the linker emits only a single symbol entry in the debug map. This caused dsymutil to not relocate common symbols when linking DWARF coming from object files that did not have this entry. This patch fixes that by keeping track of common symbols in the object files and synthesizing a debug map entry for them using the address from the main binary. Differential revision: https://reviews.llvm.org/D68680 llvm-svn: 374139	2019-10-09 04:16:18 +00:00
Kristina Brooks	68776fe8ce	[TypeSize] Fix module builds (cassert) TypeSize.h uses `assert` statements without including the <cassert> header first which leads to failures in modular builds. llvm-svn: 374138	2019-10-09 04:00:03 +00:00
Nico Weber	6aa0bc0789	gn build: unbreak libcxx build after r374116 by restoring gen_link_script.py for gn llvm-svn: 374129	2019-10-08 23:08:18 +00:00
DeForest Richards	fef32a3f70	[Docs] Fixes broken sphinx build - undefined label Removes label ref pointing to non-existent subsystem docs page. llvm-svn: 374128	2019-10-08 22:45:20 +00:00
Bill Wendling	2a1b7733a8	[IA] Add tests for a few other edge cases Test with the last eight bits within the range [7F, FF] and with lower-case hex letters. llvm-svn: 374124	2019-10-08 22:06:09 +00:00
Jonas Devlieghere	4f0a9f4024	[dsymutil] Improve verbose output (NFC) The verbose output for finding relocations assumed that we'd always dump the DIE after (which starts with a newline) and therefore didn't include one itself. However, this isn't always true, leading to garbled output. This patch adds a newline to the verbose output and adds a line that says that the DIE is being kept (which isn't obvious otherwise). It also adds a 0x prefix to the relocations. llvm-svn: 374123	2019-10-08 22:03:13 +00:00
David Blaikie	952b16d00e	DebugInfo: Move LLE enum handling to .def to match RLE handling llvm-svn: 374122	2019-10-08 21:48:46 +00:00
Roman Lebedev	1b3874e519	[CVP] Replace SExt with ZExt if the input is known-non-negative
Summary: zero-extension is far more friendly for further analysis. While this doesn't directly help with the shift-by-signext problem, this is not unrelated. This has the following effect on test-suite (numbers collected after the finish of middle-end module pass manager):
| Statistic | old | new | delta | percent change |
| correlated-value-propagation.NumSExt | 0 | 6026 | 6026 | +100.00% |
| instcount.NumAddInst | 272860 | 271283 | -1577 | -0.58% |
| instcount.NumAllocaInst | 27227 | 27226 | -1 | 0.00% |
| instcount.NumAndInst | 63502 | 63320 | -182 | -0.29% |
| instcount.NumAShrInst | 13498 | 13407 | -91 | -0.67% |
| instcount.NumAtomicCmpXchgInst | 1159 | 1159 | 0 | 0.00% |
| instcount.NumAtomicRMWInst | 5036 | 5036 | 0 | 0.00% |
| instcount.NumBitCastInst | 672482 | 672353 | -129 | -0.02% |
| instcount.NumBrInst | 702768 | 702195 | -573 | -0.08% |
| instcount.NumCallInst | 518285 | 518205 | -80 | -0.02% |
| instcount.NumExtractElementInst | 18481 | 18482 | 1 | 0.01% |
| instcount.NumExtractValueInst | 18290 | 18288 | -2 | -0.01% |
| instcount.NumFAddInst | 139035 | 138963 | -72 | -0.05% |
| instcount.NumFCmpInst | 10358 | 10348 | -10 | -0.10% |
| instcount.NumFDivInst | 30310 | 30302 | -8 | -0.03% |
| instcount.NumFenceInst | 387 | 387 | 0 | 0.00% |
| instcount.NumFMulInst | 93873 | 93806 | -67 | -0.07% |
| instcount.NumFPExtInst | 7148 | 7144 | -4 | -0.06% |
| instcount.NumFPToSIInst | 2823 | 2838 | 15 | 0.53% |
| instcount.NumFPToUIInst | 1251 | 1251 | 0 | 0.00% |
| instcount.NumFPTruncInst | 2195 | 2191 | -4 | -0.18% |
| instcount.NumFSubInst | 92109 | 92103 | -6 | -0.01% |
| instcount.NumGetElementPtrInst | 1221423 | 1219157 | -2266 | -0.19% |
| instcount.NumICmpInst | 479140 | 478929 | -211 | -0.04% |
| instcount.NumIndirectBrInst | 2 | 2 | 0 | 0.00% |
| instcount.NumInsertElementInst | 66089 | 66094 | 5 | 0.01% |
| instcount.NumInsertValueInst | 2032 | 2030 | -2 | -0.10% |
| instcount.NumIntToPtrInst | 19641 | 19641 | 0 | 0.00% |
| instcount.NumInvokeInst | 21789 | 21788 | -1 | 0.00% |
| instcount.NumLandingPadInst | 12051 | 12051 | 0 | 0.00% |
| instcount.NumLoadInst | 880079 | 878673 | -1406 | -0.16% |
| instcount.NumLShrInst | 25919 | 25921 | 2 | 0.01% |
| instcount.NumMulInst | 42416 | 42417 | 1 | 0.00% |
| instcount.NumOrInst | 100826 | 100576 | -250 | -0.25% |
| instcount.NumPHIInst | 315118 | 314092 | -1026 | -0.33% |
| instcount.NumPtrToIntInst | 15933 | 15939 | 6 | 0.04% |
| instcount.NumResumeInst | 2156 | 2156 | 0 | 0.00% |
| instcount.NumRetInst | 84485 | 84484 | -1 | 0.00% |
| instcount.NumSDivInst | 8599 | 8597 | -2 | -0.02% |
| instcount.NumSelectInst | 45577 | 45913 | 336 | 0.74% |
| instcount.NumSExtInst | 84026 | 78278 | -5748 | -6.84% |
| instcount.NumShlInst | 39796 | 39726 | -70 | -0.18% |
| instcount.NumShuffleVectorInst | 100272 | 100292 | 20 | 0.02% |
| instcount.NumSIToFPInst | 29131 | 29113 | -18 | -0.06% |
| instcount.NumSRemInst | 1543 | 1543 | 0 | 0.00% |
| instcount.NumStoreInst | 805394 | 804351 | -1043 | -0.13% |
| instcount.NumSubInst | 61337 | 61414 | 77 | 0.13% |
| instcount.NumSwitchInst | 8527 | 8524 | -3 | -0.04% |
| instcount.NumTruncInst | 60523 | 60484 | -39 | -0.06% |
| instcount.NumUDivInst | 2381 | 2381 | 0 | 0.00% |
| instcount.NumUIToFPInst | 5549 | 5549 | 0 | 0.00% |
| instcount.NumUnreachableInst | 9855 | 9855 | 0 | 0.00% |
| instcount.NumURemInst | 1305 | 1305 | 0 | 0.00% |
| instcount.NumXorInst | 10230 | 10081 | -149 | -1.46% |
| instcount.NumZExtInst | 60353 | 66840 | 6487 | 10.75% |
| instcount.TotalBlocks | 829582 | 829004 | -578 | -0.07% |
| instcount.TotalFuncs | 83818 | 83817 | -1 | 0.00% |
| instcount.TotalInsts | 7316574 | 7308483 | -8091 | -0.11% |
TLDR: we produce 0.11% fewer instructions, 6.84% fewer `sext`, and 10.75% more `zext`. To be noted, clearly, not all new `zext`'s are produced by this fold. (And now I guess it might have been interesting to measure this for D68103 :S)
Reviewers: nikic, spatel, reames, dberlin Reviewed By: nikic Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68654 llvm-svn: 374112	2019-10-08 20:29:48 +00:00
Roman Lebedev	b491f89f30	[CVP][NFC] Revisit sext vs. zext test llvm-svn: 374111	2019-10-08 20:29:36 +00:00
Jordan Rose	6422c4ff57	Mark several PointerIntPair methods as lvalue-only No point in mutating 'this' if it's just going to be thrown away. https://reviews.llvm.org/D63945 llvm-svn: 374102	2019-10-08 19:01:48 +00:00
Daniel Sanders	a562501cd0	[tblgen] Add getOperatorAsDef() to Record Summary: While working with DagInit's, it's often the case that you expect the operator to be a reference to a def. This patch adds a wrapper for this common case to reduce the amount of boilerplate callers need to duplicate repeatedly. getOperatorAsDef() returns the record if the DagInit has an operator that is a DefInit. Otherwise, it prints a fatal error. There's only a few pre-existing examples in LLVM at the moment and I've left a few instances of the code this simplifies as they had more specific error messages than the generic one this produces. I'm going to be using this a fair bit in my subsequent patches. Reviewers: bogner, volkan, nhaehnle Reviewed By: nhaehnle Subscribers: nhaehnle, hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68424 llvm-svn: 374101	2019-10-08 18:41:32 +00:00
Yonghong Song	7769861822	[BPF] do compile-once run-everywhere relocation for bitfields
A bpf specific clang intrinsic is introduced:
    u32 __builtin_preserve_field_info(member_access, info_kind)
Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are
    enum {
      FIELD_BYTE_OFFSET = 0,
      FIELD_BYTE_SIZE,
      FIELD_EXISTENCE,
      FIELD_SIGNEDNESS,
      FIELD_LSHIFT_U64,
      FIELD_RSHIFT_U64,
    };
for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0.
An example:
    struct s {
      int a;
      int b1:9;
      int b2:4;
    };
    enum {
      FIELD_BYTE_OFFSET = 0,
      FIELD_BYTE_SIZE,
      FIELD_EXISTENCE,
      FIELD_SIGNEDNESS,
      FIELD_LSHIFT_U64,
      FIELD_RSHIFT_U64,
    };
    void bpf_probe_read(void *, unsigned, const void *);
    int field_read(struct s *arg) {
      unsigned long long ull = 0;
      unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET);
      unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE);
    #ifdef USE_PROBE_READ
      bpf_probe_read(&ull, size, (const void *)arg + offset);
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
      lshift = lshift + (size << 3) - 64;
    #endif
    #else
      switch(size) {
      case 1:
        ull = *(unsigned char *)((void *)arg + offset); break;
      case 2:
        ull = *(unsigned short *)((void *)arg + offset); break;
      case 4:
        ull = *(unsigned int *)((void *)arg + offset); break;
      case 8:
        ull = *(unsigned long long *)((void *)arg + offset); break;
      }
      unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
    #endif
      ull <<= lshift;
      if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS))
        return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
      return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
    }
There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode:
    r3 = r1
    r1 = 0
    r1 = 4 <=== relocation (FIELD_BYTE_OFFSET)
    r3 += r1
    r1 = r10
    r1 += -8
    r2 = 4 <=== relocation (FIELD_BYTE_SIZE)
    call bpf_probe_read
    r2 = 51 <=== relocation (FIELD_LSHIFT_U64)
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r2
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
    r0 = r1
    r0 >>= r2
    r3 = 1 <=== relocation (FIELD_SIGNEDNESS)
    if r3 == 0 goto LBB0_2
    r1 s>>= r2
    r0 = r1
    LBB0_2:
    exit
Compared to the above code between relocations FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64, the code with big endian mode has four more instructions.
    r1 = 41 <=== relocation (FIELD_LSHIFT_U64)
    r6 += r1
    r6 += -64
    r6 <<= 32
    r6 >>= 32
    r1 = *(u64 *)(r10 - 8)
    r1 <<= r6
    r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
The code and relocation generated when using direct load:
    r2 = 0
    r3 = 4
    r4 = 4
    if r4 s> 3 goto LBB0_3
    if r4 == 1 goto LBB0_5
    if r4 == 2 goto LBB0_6
    goto LBB0_9
    LBB0_6: # %sw.bb1
    r1 += r3
    r2 = *(u16 *)(r1 + 0)
    goto LBB0_9
    LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
    if r4 == 8 goto LBB0_8
    goto LBB0_9
    LBB0_8: # %sw.bb9
    r1 += r3
    r2 = *(u64 *)(r1 + 0)
    goto LBB0_9
    LBB0_5: # %sw.bb
    r1 += r3
    r2 = *(u8 *)(r1 + 0)
    goto LBB0_9
    LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
    LBB0_9: # %sw.epilog
    r1 = 51
    r2 <<= r1
    r1 = 60
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
    LBB0_11: # %sw.epilog
    exit
Considering that the verifier is able to do limited constant propagation following branches, the following is the code actually traversed:
    r2 = 0
    r3 = 4 <=== relocation
    r4 = 4 <=== relocation
    if r4 s> 3 goto LBB0_3
    LBB0_3: # %entry
    if r4 == 4 goto LBB0_7
    LBB0_7: # %sw.bb5
    r1 += r3
    r2 = *(u32 *)(r1 + 0)
    LBB0_9: # %sw.epilog
    r1 = 51 <=== relocation
    r2 <<= r1
    r1 = 60 <=== relocation
    r0 = r2
    r0 >>= r1
    r3 = 1
    if r3 == 0 goto LBB0_11
    r2 s>>= r1
    r0 = r2
    LBB0_11: # %sw.epilog
    exit
For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value.
Differential Revision: https://reviews.llvm.org/D67980 llvm-svn: 374099	2019-10-08 18:23:17 +00:00
Matt Arsenault	fc005b5d71	AMDGPU: Fix i16 arithmetic pattern redundancy There were 2 problems here. First, these patterns were duplicated to handle the inverted shift operands instead of using the commuted PatFrags. Second, the point of the zext folding patterns doesn't apply to the subtargets that do not zero the high bits. They should be skipped instead of inserting the extension. The zeroing high code would be emitted when necessary anyway. This was also emitting unnecessary zexts in cases where the high bits were undefined. llvm-svn: 374092	2019-10-08 17:36:38 +00:00
Jinsong Ji	6616c06663	Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091	2019-10-08 17:32:56 +00:00
Sanjay Patel	edf9c4bf21	[SLP] add test with prefer-vector-width function attribute; NFC (PR43578) llvm-svn: 374090	2019-10-08 17:18:32 +00:00
Vedant Kumar	6436fd4f1a	[CodeExtractor] Factor out and reuse shrinkwrap analysis
Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused. This resolves a quadratic compile-time bug observed when compiling AMDGPUDisassembler.cpp.o.
Pre-patch (Release + LTO clang):
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting
```
Post-patch (ReleaseAsserts clang):
```
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting
```
Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary pre- vs. post-patch. An alternate approach is to hide CodeExtractorAnalysisCache from clients of CodeExtractor, and to recompute the analysis from scratch inside of CodeExtractor::extractCodeRegion(). This eliminates some redundant work in the shrinkwrapping legality check. However, some clients continue to exhibit O(n^2) compile time behavior as computing the analysis is O(n).
rdar://55912966 Differential Revision: https://reviews.llvm.org/D68616 llvm-svn: 374089	2019-10-08 17:17:51 +00:00
Tom Stellard	99acd7b50d	AMDGPU: Add offsets to MMO when lowering buffer intrinsics Summary: Without offsets on the MachineMemOperands (MMOs), MachineInstr::mayAlias() will return true for all reads and writes to the same resource descriptor. This leads to O(N^2) complexity in the MachineScheduler when analyzing dependencies of buffer loads and stores. It also limits the SILoadStoreOptimizer from merging more instructions. This patch reduces the compile time of one pathological compute shader from 12 seconds to 1 second. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65097 llvm-svn: 374087	2019-10-08 17:04:51 +00:00
Hideto Ueno	3073348b63	[Attributor][Fix] Temporary fix for windows build bot failure D65402 causes test failure related to attributor-max-iterations. This commit removes attributor-max-iterations-verify for now. I'll examine the factor and the flag should be reverted. llvm-svn: 374086	2019-10-08 17:01:56 +00:00
Simon Pilgrim	a23fa47859	CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 374085	2019-10-08 17:00:01 +00:00
Stanislav Mekhanoshin	fb4b305275	[AMDGPU] Disable unused gfx10 dpp instructions Inhibit generation of unused real dpp instructions on gfx10 just like it is done on other subtargets. This does not change anything because these are illegal anyway and not accepted, but it does reduce the number of instruction definitions generated. Differential Revision: https://reviews.llvm.org/D68607 llvm-svn: 374083	2019-10-08 16:56:01 +00:00
David Greene	bea2f8fa2d	[UpdateCCTestChecks] Detect function mangled name on separate line Sometimes functions with large comment blocks in front of them have their declarations output on several lines by c-index-test. Hence the one-line function name/line/mangled pattern will not work to detect them. Break the pattern up into two patterns and keep state after seeing the name/line information until we finally see the mangled name. Differential Revision: https://reviews.llvm.org/D68272 llvm-svn: 374078	2019-10-08 16:25:42 +00:00
Roman Lebedev	9a1dbacc31	[NFC][CVP] Add tests where we can replace sext with zext If the sign bit of the value that is being sign-extended is not set, i.e. the value is non-negative (s>= 0), then zero-extension will suffice, and is better for analysis: https://rise4fun.com/Alive/a8PD llvm-svn: 374075	2019-10-08 16:21:13 +00:00
Amaury Sechet	c46babc9a8	(Re)generate various tests. NFC llvm-svn: 374074	2019-10-08 16:16:26 +00:00
Heejin Ahn	f829d5d467	[WebAssembly] Fix a bug in 'try' placement
Summary: When searching for local expression tree created by stackified registers, for 'block' placement, we start the search from the previous instruction of a BB's terminator. But in 'try''s case, we should start from the previous instruction of a call that can throw, or a EH_LABEL that precedes the call, because the return values of the call's previous instructions can be stackified and consumed by the throwing call. For example,
```
i32.call @foo
call @bar ; may throw
br $label0
```
In this case, if we start the search from the previous instruction of the terminator (`br` here), we end up stopping at `call @bar` and place a 'try' between `i32.call @foo` and `call @bar`, because `call @bar` does not have a return value so it is not a local expression tree of `br`. But in this case, unlike when placing 'block's, we should start the search from `call @bar`, because the return value of `i32.call @foo` is stackified and used by `call @bar`.
Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68619 llvm-svn: 374073	2019-10-08 16:15:39 +00:00
Nikola Prica	b5e4cd0d29	[DebugInfo][If-Converter] Update call site info during the optimization During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068	2019-10-08 15:43:12 +00:00
GN Sync Bot	c5067996f2	gn build: Merge r374062 llvm-svn: 374065	2019-10-08 15:34:52 +00:00
GN Sync Bot	8d17c89f11	gn build: Merge r374061 llvm-svn: 374064	2019-10-08 15:28:36 +00:00
Hideto Ueno	5316b0125f	[Attributor][MustExec] Deduce dereferenceable and nonnull attribute using MustBeExecutedContextExplorer
Summary: In D65186 and related patches, MustBeExecutedContextExplorer is introduced. This enables us to traverse instructions guaranteed to execute from function entry. If we can know the argument is used as `dereferenceable` or `nonnull` in these instructions, we can mark `dereferenceable` or `nonnull` in the argument definition:
1. Memory instruction (similar to D64258)
Trace memory instruction pointer operand. Currently, only inbounds GEPs are traced.
```
define i64* @f(i64* %a) {
entry:
  %add.ptr = getelementptr inbounds i64, i64* %a, i64 1
  ; (because of inbounds GEP we can know that %a is at least dereferenceable(16))
  store i64 1, i64* %add.ptr, align 8
  ret i64* %add.ptr ; dereferenceable 8 (because above instruction stores into it)
}
```
2. Propagation from callsite (similar to D27855)
If `deref` or `nonnull` is known in the call site parameter attributes, we can say that the argument also has that attribute.
```
declare void @use3(i8* %x, i8* %y, i8* %z);
declare void @use3nonnull(i8* nonnull %x, i8* nonnull %y, i8* nonnull %z);
define void @parent1(i8* %a, i8* %b, i8* %c) {
  call void @use3nonnull(i8* %b, i8* %c, i8* %a)
  ; Above instruction is always executed so we can say that @parent1(i8* nonnull %a, i8* nonnull %b, i8* nonnull %c)
  call void @use3(i8* %c, i8* %a, i8* %b)
  ret void
}
```
Reviewers: jdoerfert, sstefan1, spatel, reames Reviewed By: jdoerfert Subscribers: xbolva00, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65402 llvm-svn: 374063	2019-10-08 15:25:56 +00:00
Cyndy Ishida	86ba6cb6fb	Revert [TextAPI] Introduce TBDv4 This reverts r374058 (git commit 5d566c5a46aeaa1fa0e5c0b823c9d5f84036dc9a) llvm-svn: 374062	2019-10-08 15:24:37 +00:00
Hideto Ueno	2ec45fac4a	[Attributor] Add helper class to compose two structured deduction. Summary: This patch introduces a generic way to compose two structured deductions. This will be used for composing generic deduction with `MustBeExecutedExplorer` and other existing generic deduction. Reviewers: jdoerfert, sstefan1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66645 llvm-svn: 374060	2019-10-08 15:20:19 +00:00
GN Sync Bot	8c2ccafc61	gn build: Merge r374058 llvm-svn: 374059	2019-10-08 15:12:38 +00:00
Cyndy Ishida	1ffad17301	[TextAPI] Introduce TBDv4 Summary: This format introduces new features and platforms. The motivation for this format is to support more than 1 platform since previous versions only supported additional architectures and 1 platform, for example ios + ios-simulator and macCatalyst. Reviewers: ributzka, steven_wu Reviewed By: ributzka Subscribers: mgorny, hiraditya, mgrang, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67529 llvm-svn: 374058	2019-10-08 15:07:36 +00:00
Mirko Brkusanin	bd8c1921ef	[Mips] Emit proper ABI for _mcount calls When -pg option is present then a call to _mcount is inserted into every function. However, since the proper ABI was not followed, the generated gmon.out did not give proper results. By inserting needed instructions before every _mcount we can fix this. Differential Revision: https://reviews.llvm.org/D68390 llvm-svn: 374055	2019-10-08 14:32:03 +00:00
Clement Courbet	e0fc2857f3	[llvm-exegesis] Add options to SnippetGenerator. Summary: This adds a `-max-configs-per-opcode` option to limit the number of configs per opcode. Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68642 llvm-svn: 374054	2019-10-08 14:30:24 +00:00
Pavel Labath	24a2861b8f	Object/minidump: Add support for the MemoryInfoList stream Summary: This patch adds the definitions of the constants and structures necessary to interpret the MemoryInfoList minidump stream, as well as the object::MinidumpFile interface to access the stream. While the code is fairly simple, there is one important deviation from the other minidump streams, which is worth calling out explicitly. Unlike other "List" streams, the size of the records inside MemoryInfoList stream is not known statically. Instead it is described in the stream header. This makes it impossible to return ArrayRef<MemoryInfo> from the accessor method, as it is done with other streams. Instead, I create an iterator class, which can be parameterized by the runtime size of the structure, and return iterator_range<iterator> instead. Reviewers: amccarth, jhenderson, clayborg Subscribers: JosephTremoulet, zturner, markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68210 llvm-svn: 374051	2019-10-08 14:15:32 +00:00
Kevin P. Neal	75d58de3bb	Nope, I'm wrong. It looks like someone else removed these on purpose and it just happened to break the bot right when I did my push. So I'm undoing this morning's incorrect push. I've also kicked off an email to hopefully get the bot fixed the correct way. llvm-svn: 374049	2019-10-08 14:10:26 +00:00
Kevin P. Neal	fcf18f9190	Restore documentation that 'svn update' unexpectedly yanked out from under me. llvm-svn: 374045	2019-10-08 13:38:42 +00:00
Sebastian Pop	2a094bb4d1	fix fmls fp16
Tim Northover remarked that the added patterns for fmls fp16 produce wrong code in case the fsub instruction has a multiplication as its first operand, i.e., all the patterns FMLSv*_OP1:
> define <8 x half> @test_FMLSv8f16_OP1(<8 x half> %a, <8 x half> %b, <8 x half> %c) {
> ; CHECK-LABEL: test_FMLSv8f16_OP1:
> ; CHECK: fmls {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h
> entry:
>
> %mul = fmul fast <8 x half> %c, %b
> %sub = fsub fast <8 x half> %mul, %a
> ret <8 x half> %sub
> }
>
> This doesn't look right to me. The exact instruction produced is "fmls
> v0.8h, v2.8h, v1.8h", which I think calculates "v0 - v2*v1", but the
> IR is calculating "v2*v1-v0". The equivalent <4 x float> code also
> doesn't emit an fmls.
This patch generates an fmla and negates the value of the operand2 of the fsub. Inspecting the pattern match, I found that there was another mistake in the opcode to be selected: matching FMULv4*16 should generate FMLSv4*16 and not FMLSv2*32. Tested on aarch64-linux with make check-all.
Differential Revision: https://reviews.llvm.org/D67990 llvm-svn: 374044	2019-10-08 13:23:57 +00:00
Amaury Sechet	f337be076a	Add test for rotating truncated vectors. NFC llvm-svn: 374043	2019-10-08 13:08:51 +00:00
Graham Hunter	8b64c971e5	[SVE][IR] Scalable Vector size queries and IR instruction support
* Adds a TypeSize struct to represent the known minimum size of a type along with a flag to indicate that the runtime size is an integer multiple of that size
* Converts existing size query functions from Type.h and DataLayout.h to return a TypeSize result
* Adds convenience methods (including a transparent conversion operator to uint64_t) so that most existing code 'just works' as if the return values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all supported instructions used with scalable vectors can be constructed in IR.
Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 llvm-svn: 374042	2019-10-08 12:53:54 +00:00
Nicolai Haehnle	dad73a06de	AMDGPU: Propagate undef flag during pre-RA exec mask optimizations Summary: Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68184 llvm-svn: 374041	2019-10-08 12:46:32 +00:00
Nicolai Haehnle	23709dcaaf	MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block Summary: When getValueInMiddleOfBlock happens to be called for a basic block that has no incoming value at all, an IMPLICIT_DEF is inserted in that block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at the top of its basic block or it will likely not reach the use that the caller intends to insert. Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204 Reviewers: arsenm, rampitec Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68183 llvm-svn: 374040	2019-10-08 12:46:20 +00:00
Sanjay Patel	f6c01f969d	[SLP] add test with prefer-vector-width function attribute; NFC llvm-svn: 374039	2019-10-08 12:43:46 +00:00
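
Note on 70269b4845 (Unify the two CRC implementations): the message calls the two checksums "essentially the same". A minimal sketch of why, assuming the standard reflected CRC-32 polynomial 0xEDB88320. The names `crc32_update` and `jamcrc_update` are hypothetical and chosen for illustration; they are not LLVM's interface. JamCRC keeps the raw CRC register (initial value 0xFFFFFFFF, no final inversion), while CRC-32 inverts the register on the way in and out, so either one can be written in terms of the other:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not LLVM's API): bitwise CRC-32 over `len` bytes,
// using the reflected polynomial 0xEDB88320. Pass crc = 0 to start a new
// checksum, or a previous result to continue one.
constexpr std::uint32_t crc32_update(std::uint32_t crc, const char *data,
                                     std::size_t len) {
  crc ^= 0xFFFFFFFFu; // undo the previous final inversion (or apply the init)
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= static_cast<std::uint8_t>(data[i]);
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return crc ^ 0xFFFFFFFFu; // final inversion
}

// Hypothetical helper: JamCRC expressed through CRC-32. `state` is the raw
// register value; a fresh JamCRC starts at 0xFFFFFFFF.
constexpr std::uint32_t jamcrc_update(std::uint32_t state, const char *data,
                                      std::size_t len) {
  return crc32_update(state ^ 0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

For the conventional check string "123456789", CRC-32 is 0xCBF43926 and JamCRC is its bitwise complement, 0x340BC6D9.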


186047 Commits
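
Note on 7769861822 ([BPF] bitfield relocations): the shift sequence in that message can be sketched in plain C++. This is an illustration only. `extract_field` is a hypothetical helper, and the constants 51 and 60 are the little-endian FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64 values quoted in the message for `b2` in `struct s { int a; int b1:9; int b2:4; }`. The sketch assumes the 4-byte word holding the bitfield has already been loaded into a 64-bit value, and that right-shifting a negative signed value is an arithmetic shift (guaranteed from C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Hypothetical helper: shift the field up so its top bit lands in bit 63,
// then shift back down so only the field's bits remain. A signed field uses
// an arithmetic right shift to sign-extend; an unsigned field uses a logical
// right shift.
constexpr std::int64_t extract_field(std::uint64_t loaded, unsigned lshift,
                                     unsigned rshift, bool is_signed) {
  const std::uint64_t shifted = loaded << lshift;
  if (is_signed)
    return static_cast<std::int64_t>(shifted) >> rshift;
  return static_cast<std::int64_t>(shifted >> rshift);
}
```

With lshift = 51 and rshift = 60 the field occupies bits 9 to 12 of the loaded word, so a word of 0x1A00 (bit pattern 1101 in the field) reads back as -3 when signed and 13 when unsigned; the nine `b1` bits below it are shifted out and do not affect the result.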