llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00

Author	SHA1	Message	Date
Paul Robinson	dd6f1ac638	[DWARFv5] Allow ".loc 0" to refer to the root file. DWARF v5 explicitly represents file #0 in the line table. Prior versions did not, so ".loc 0" is still an error in those cases. Differential Revision: https://reviews.llvm.org/D48452 llvm-svn: 335350	2018-06-22 14:16:11 +00:00
Simon Pilgrim	27057b9ca4	[SLPVectorizer] Relax alternate opcodes to accept any BinaryOperator pair SLP currently only accepts (F)Add/(F)Sub alternate counterpart ops to be merged into an alternate shuffle. This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle. Differential Revision: https://reviews.llvm.org/D48477 llvm-svn: 335349	2018-06-22 14:04:06 +00:00
Simon Pilgrim	edee77041e	[SLPVectorizer][X86] Add alternate opcode tests for simple build vector cases llvm-svn: 335348	2018-06-22 13:53:58 +00:00
Sanjay Patel	79aa328055	[InstCombine] add shuffle+binops test from PR37806; NFC This one shows another pattern that we'll need to match in some cases, but the current ordering of folds allows us to match this as 2 binops before simplification takes place. llvm-svn: 335347	2018-06-22 13:44:42 +00:00
Sanjay Patel	85935bc2dc	[InstCombine] add tests for shuffle-with-different-binops; NFC llvm-svn: 335345	2018-06-22 13:19:25 +00:00
Sanjay Patel	cae5dab87a	[InstCombine] rearrange shuffle-of-binops logic; NFC The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable. llvm-svn: 335343	2018-06-22 12:46:16 +00:00
Simon Pilgrim	f257c42473	[X86] Regenerate tests to include fma comments Noticed in the review of D48467 llvm-svn: 335342	2018-06-22 12:41:48 +00:00
Gabor Buella	5f131faa4b	[X86] Add notes to a few intrinsics This is a change corresponding to the clang change in https://reviews.llvm.org/D45616 Reviewers: craig.topper, uriel.k, RKSimon, andrew.w.kaylor, spatel, scanon, efriedma Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D48280 llvm-svn: 335340	2018-06-22 12:01:43 +00:00
George Rimar	d23004c21c	Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text" With compilation fix. Original commit message: D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does following two things on top: 1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs. 2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335336	2018-06-22 10:53:47 +00:00
Simon Pilgrim	9ae81f44b6	[IR] Use Instruction::isBinaryOp helper instead of raw enum range tests. NFCI. llvm-svn: 335335	2018-06-22 10:48:02 +00:00
George Rimar	40f369f9f9	Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text" It broke bots. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443 http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551 llvm-svn: 335333	2018-06-22 10:27:33 +00:00
George Rimar	e53c16777f	[MC] - Add .stack_size sections into groups and link them with .text D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. This change does following two things on top: 1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to eliminate them fast during resolving the COMDATs. 2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that linker will be able to do -gc-sections on dead stack sizes sections. Differential revision: https://reviews.llvm.org/D46874 llvm-svn: 335332	2018-06-22 10:10:53 +00:00
Sjoerd Meijer	ce9d270373	Recommit of r335326, with the test fixed that I missed. llvm-svn: 335331	2018-06-22 10:03:03 +00:00
Simon Pilgrim	d34cfc19bf	[CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329	2018-06-22 09:45:31 +00:00
Sjoerd Meijer	26feb92a8c	Reverting r335326 while I look at the test failure llvm-svn: 335328	2018-06-22 09:17:08 +00:00
Eugene Leviant	17f34a12e6	Revert r335324 due to a buildbot failure llvm-svn: 335327	2018-06-22 08:57:01 +00:00
Sjoerd Meijer	129e526a8b	[ARM] ARMv6m and v8m.baseline strict align This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because it has no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326	2018-06-22 08:48:13 +00:00
Matt Arsenault	aa80fb9253	AMDGPU: Add patterns for i32/i64 local atomic load/store Not sure why the 32/64 split is needed in the atomic_load store hierarchies. The regular PatFrags do this, but we don't do it for the existing handling for global. llvm-svn: 335325	2018-06-22 08:39:52 +00:00
Eugene Leviant	53b81ab9f1	[Evaluator] Improve evaluation of call instruction Differential revision: https://reviews.llvm.org/D46584 llvm-svn: 335324	2018-06-22 08:29:36 +00:00
Mikhail Dvoretckii	0fb1eee73a	[X86] Changing the check for valid inputs in combineScalarToVector Changing the logic of scalar mask folding to check for valid input types rather than against invalid ones, making it more robust and fixing PR37879. Differential Revision: https://reviews.llvm.org/D48366 llvm-svn: 335323	2018-06-22 08:28:05 +00:00
Chandler Carruth	26c36dc78d	Revert r335306 (and r335314) - the Call Graph Profile pass. This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks. llvm-svn: 335320	2018-06-22 05:33:57 +00:00
Tom Stellard	b6447f67a8	AMDGPU/GlobalISel: Default to using TableGen'd instruction selector Summary: We can select all instructions that are marked as legal in a full piglit run, so now is a good time to make the TableGen'd instruction selector default for all opcodes. This is NFC for a full piglit run, which is why there are no tests. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48198 llvm-svn: 335319	2018-06-22 03:04:35 +00:00
Tom Stellard	4de5b46280	AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48196 llvm-svn: 335318	2018-06-22 02:54:57 +00:00
Chandler Carruth	8a9d5a8c29	[LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how to clear out deleted loops from the current queue beyond just the current loop. This is important because SimpleLoopUnswitch will now enqueue the same loop to be re-processed. When it does this with the legacy PM, we don't have a way of canceling the rest of the pipeline and so we can end up deleting the loop before we reprocess it. =/ This change also makes it easy to support deleting other loops in the queue to process, although I don't have any use cases for that. Differential Revision: https://reviews.llvm.org/D48470 llvm-svn: 335317	2018-06-22 02:43:41 +00:00
Tom Stellard	d9853a4c72	AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48195 llvm-svn: 335316	2018-06-22 02:34:29 +00:00
Tom Stellard	e41569733d	AMDGPU/GlobalISel: Implement select() for COPY Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46151 llvm-svn: 335315	2018-06-22 00:44:29 +00:00
Chandler Carruth	dd1f4c2f3a	Fix test failures after r335306 due to the pipeline changing. This wasn't obvious for the author to fix because this is the first pipeline use of the magic utility to get function analyses within a module pass in the legacy pass manager. Turns out that has a bug which prevents dumping the structure of the pipeline and shows up as an unnamed pass. I've just left a FIXME for that as it doesn't seem likely worth fixing and certainly shouldn't hold up getting the bots green. llvm-svn: 335314	2018-06-22 00:32:26 +00:00
Sanjay Patel	2d6c2216c7	[InstCombine] fix shuffle-of-binops bug With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. llvm-svn: 335312	2018-06-21 23:56:59 +00:00
Sanjay Patel	80ed390b9b	[InstCombine] add test for shuffle-of-binops; NFC This shows a miscompile that was missed in rL335283. llvm-svn: 335311	2018-06-21 23:53:01 +00:00
Tom Stellard	5c60568c11	AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46150 llvm-svn: 335307	2018-06-21 23:38:20 +00:00
Michael J. Spencer	9f6a23f8c6	[Instrumentation] Add Call Graph Profile pass This patch adds support for generating a call graph profile from Branch Frequency Info. The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight. After scanning all the functions, it generates an appending module flag containing the data. The format looks like: !llvm.module.flags = !{!0} !0 = !{i32 5, !"CG Profile", !1} !1 = !{!2, !3, !4} ; List of edges !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32 !3 = !{void (i1)* @freq, void ()* @a, i64 11} !4 = !{void (i1)* @freq, void ()* @b, i64 20} Differential Revision: https://reviews.llvm.org/D48105 llvm-svn: 335306	2018-06-21 23:31:10 +00:00
Reid Kleckner	e279cb412a	[X86] Fix 32-bit mingw comdat names, only add one underscore llvm-svn: 335304	2018-06-21 23:06:33 +00:00
Fangrui Song	aefccfdf5d	[gdb] Update llvm::Optional Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48461 llvm-svn: 335303	2018-06-21 22:34:29 +00:00
Scott Linder	b3e8dacd9e	[AMDGPU] Fix lit failures introduced in r335281 The tests do not support big-endian hosts. llvm-svn: 335302	2018-06-21 22:30:09 +00:00
Sanjay Patel	7e29e3b945	[IR] fix typo in comment; NFC llvm-svn: 335301	2018-06-21 22:25:42 +00:00
Reid Kleckner	c1eeade8d2	Revert r335297 "[X86] Implement more of x86-64 large and medium PIC code models" MCJIT can't handle R_X86_64_GOT64 yet. llvm-svn: 335300	2018-06-21 22:19:05 +00:00
Reid Kleckner	3f66d67494	[X86] Commit some comments that weren't in the medium code model patch llvm-svn: 335298	2018-06-21 21:57:44 +00:00
Reid Kleckner	6435d99a79	[X86] Implement more of x86-64 large and medium PIC code models Summary: The large code model allows code and data segments to exceed 2GB, which means that some symbol references may require a displacement that cannot be encoded as a displacement from RIP. The large PIC model even relaxes the assumption that the GOT itself is within 2GB of all code. Therefore, we need a special code sequence to materialize it: .LtmpN: leaq .LtmpN(%rip), %rbx movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch addq %rax, %rbx # GOT base reg From that, non-local references go through the GOT base register instead of being PC-relative loads. Local references typically use GOTOFF symbols, like this: movq extern_gv@GOT(%rbx), %rax movq local_gv@GOTOFF(%rbx), %rax All calls end up being indirect: movabsq $local_fn@GOTOFF, %rax addq %rbx, %rax callq *%rax The medium code model retains the assumption that the code segment is less than 2GB, so calls are once again direct, and the RIP-relative loads can be used to access the GOT. Materializing the GOT is easy: leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg DSO local data accesses will use it: movq local_gv@GOTOFF(%rbx), %rax Non-local data accesses will use RIP-relative addressing, which means we may not always need to materialize the GOT base: movq extern_gv@GOTPCREL(%rip), %rax Direct calls are basically the same as they are in the small code model: They use direct, PC-relative addressing, and the PLT is used for calls to non-local functions. This patch adds reasonably comprehensive testing of LEA, but there are lots of interesting folding opportunities that are unimplemented. Reviewers: chandlerc, echristo Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47211 llvm-svn: 335297	2018-06-21 21:55:08 +00:00
Matthew Voss	223cb7d841	[GVN] Avoid casting a vector of size less than 8 bits to i8 Summary: A reprise of D25849. This crash was found through fuzzing some time ago and was documented in PR28879. No check for load size has been added due to the following tests: - Transforms/GVN/invariant.group.ll - Transforms/GVN/pr10820.ll These tests expect load sizes that are not a multiple of eight. Thanks to @davide for the original patch. Reviewers: nlopes, davide, RKSimon, reames, efriedma Reviewed By: efriedma Subscribers: davide, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D48330 llvm-svn: 335294	2018-06-21 21:43:20 +00:00
Jonas Devlieghere	b92270b149	[dsymutil] Force mmap'ing of binaries After the recent refactoring that introduced parallel handling of different objects, the binary holder became unique per object file. This defeats its optimization of caching archives, leading to an archive being opened for every binary it contains. This is obviously unfortunate and will need to be refactored soon. Luckily in practice, the impact of this is limited as most files are mmap'ed instead of memcopy'd. There's a caveat however: when the memory buffer requires a null terminator and it's a multiple of the page size, we allocate instead of mmap'ing. If this happens for a static archive, we end up with N copies of it in memory, where N is the number of objects in the archive, leading to exorbitant memory usage. This provided a stopgap solution to ensure that all the files it loads are mmap'ed in memory by removing the requirement for a terminating null byte. Differential revision: https://reviews.llvm.org/D48397 llvm-svn: 335293	2018-06-21 21:37:53 +00:00
Tim Shen	ebe290e6fc	[SCEV] Re-apply r335197 (with Polly fixes). Summary: This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338). I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output. All LLVM files are already reviewed in D48338. Reviewers: jdoerfert, bollu, efriedma Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia Differential Revision: https://reviews.llvm.org/D48453 llvm-svn: 335292	2018-06-21 21:29:54 +00:00
Konstantin Zhuravlyov	d421112771	AMDGPU: Remove ability to reserve VGPRs for debugger Differential Revision: https://reviews.llvm.org/D48234 llvm-svn: 335288	2018-06-21 20:28:19 +00:00
Reid Kleckner	6281e108e5	[mingw] Fix GCC ABI compatibility for comdat things Summary: GCC and the binutils COFF linker do comdats differently from MSVC. If we want to be ABI compatible, we have to do what they do, which is to emit unique section names like ".text$_Z3foov" instead of short section names like ".text". Otherwise, the binutils linker gets confused and reports multiple definition errors when two object files from GCC and Clang containing the same inline function are linked together. The best description of the issue is probably at https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to have a good one in our tracker. I fixed up the .pdata and .xdata sections needed everywhere other than 32-bit x86. GCC doesn't use associative comdats for those, it appears to rely on the section name. Reviewers: smeenai, compnerd, mstorsjo, martell, mati865 Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D48402 llvm-svn: 335286	2018-06-21 20:27:38 +00:00
Sanjay Patel	cc3c5dca53	[InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806) This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction. This has some tricky parts that are hopefully addressed in the tests and their respective comments: 1. If the shuffle mask contains an undef element, then that lane of the result is undef: http://llvm.org/docs/LangRef.html#shufflevector-instruction Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'. 2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code. 3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first. Differential Revision: https://reviews.llvm.org/D48401 llvm-svn: 335283	2018-06-21 20:15:09 +00:00
Scott Linder	4a78711447	[AMDGPU] Update assembler for HSA Code Object v3 Update AMDGPU assembler syntax behind the code-object-v3 feature: * Replace/rename most AMDGPU assembler directives/symbols and document them. * Provide more diagnostics (e.g. values out of range, missing values, repeated values). * Provide path for backwards compatibility, even with underlying descriptor changes. Differential Revision: https://reviews.llvm.org/D47736 llvm-svn: 335281	2018-06-21 19:38:56 +00:00
Francis Visoiu Mistrih	029e72fa45	Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions." This reverts commit r335206. As discussed here: https://reviews.llvm.org/rL333740, a fix will come tomorrow. In the meanwhile, revert this to fix some bots. llvm-svn: 335272	2018-06-21 19:18:36 +00:00
Simon Dardis	e025b14586	[mips] Modify comment to test new email address (NFC). llvm-svn: 335269	2018-06-21 18:52:32 +00:00
Scott Linder	a83da62375	[AMDGPU] Fix bug with tracking processed blocks in SIInsertWaitcnts BlockWaitcntProcessedSet was not being cleared between calls, so it was producing incorrect counts in cases where MBB addresses happened to coincide across multiple calls. Differential Revision: https://reviews.llvm.org/D48391 llvm-svn: 335268	2018-06-21 18:48:48 +00:00
Konstantin Zhuravlyov	1ba54fc164	AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z and everything that comes with it from implementation and v3 header files. Leave definition in v2 header files for backwards compatibility. Differential Revision: https://reviews.llvm.org/D48191 llvm-svn: 335267	2018-06-21 18:36:04 +00:00
Sanjay Patel	5c0e1473d6	[InstCombine] add tests for shuffled cmps; NFC llvm-svn: 335266	2018-06-21 18:07:38 +00:00
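
The Call Graph Profile pass described in commit 9f6a23f8c6 (r335306) walks every basic block, finds the call instructions in it, and records an edge from the calling function to the called function, weighted by that block's profile count. The following is a minimal Python sketch of that idea, not the LLVM implementation. The dictionary-based module representation and the function name `build_call_graph_profile` are hypothetical, and summing the weights of repeated caller/callee pairs is an assumption the commit message does not spell out.

```python
from collections import defaultdict


def build_call_graph_profile(module):
    """Collect weighted call-graph edges from a toy module.

    `module` is a hypothetical stand-in for an LLVM module: it maps a
    function name to a list of basic blocks, each given as
    (block_profile_count, [names of functions called in that block]).
    """
    edges = defaultdict(int)
    for caller, blocks in module.items():
        for block_count, callees in blocks:
            for callee in callees:
                # Assumption: repeated caller/callee pairs are summed.
                edges[(caller, callee)] += block_count
    return dict(edges)


# Toy module chosen to reproduce the three edges shown in the commit message.
toy_module = {
    "a": [(32, ["b"])],
    "b": [(5, [])],
    "freq": [(11, ["a"]), (20, ["b"])],
}

profile = build_call_graph_profile(toy_module)
for (caller, callee), weight in sorted(profile.items()):
    print(f"{caller} -> {callee}: {weight}")
```

In the real pass the resulting edge list is emitted as the "CG Profile" module flag shown in the commit message, rather than returned as a dictionary.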

165680 Commits