llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Seiya Nuta	466f12f007	[llvm-objcopy][MachO] Implement --remove-section Reviewers: alexshap, rupprecht, jhenderson Reviewed By: rupprecht, jhenderson Subscribers: jakehehrlich, abrachet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66282	2019-11-15 14:20:11 +09:00
Kevin P. Neal	2dbca15279	Document more specifically the rounding for "llvm.round". Differential Revision: https://reviews.llvm.org/D68810	2019-11-14 13:15:15 -05:00
Kevin P. Neal	592d69a8a6	Make the language more consistent since I'm about to commit a content change next.	2019-11-14 13:10:59 -05:00
Reid Kleckner	3f676674e7	Move CodeGenFileType enum to Support/CodeGen.h Avoids the need to include TargetMachine.h from various places just for an enum. Various other enums live here, such as the optimization level, TLS model, etc. Data suggests that this change probably doesn't matter, but it seems nice to have anyway.	2019-11-13 16:39:34 -08:00
Fangrui Song	61e55d5d11	[llvm-objcopy][COFF] Implement --redefine-sym and --redefine-syms The parsing error tests in ELF/redefine-symbols.test are not specific to ELF. Move them to redefine-symbols.test. Add COFF/redefine-symbols.test for COFF specific tests. Also fix the documentation regarding --redefine-syms: the old and new names are separated by whitespace, not an equals sign. Reviewed By: mstorsjo Differential Revision: https://reviews.llvm.org/D70036	2019-11-12 11:28:00 -08:00
Nuno Lopes	cfbe9609f3	docs: fix warning in LangRef parsing	2019-11-11 10:45:42 +00:00
drichards-87	d670c0cc49	Docs: Updates Sphinx Quickstart template for new contributors	2019-11-10 09:27:32 -07:00
Simon Pilgrim	7d5839670f	Try to fix sphinx "Could not lex literal_block as "llvm"" warning. Code block isn't IR - so treat it as "none" instead.	2019-11-09 22:15:26 +00:00
Stephan T. Lavavej	5a753f8dae	[www] More HTTPS and outdated link fixes. Resolves D69981.	2019-11-08 14:41:27 -08:00
Tom Stellard	fbd15449ff	[cmake] Remove LLVM_{BUILD,LINK}_LLVM_DYLIB options on Windows Summary: The options aren't supported so they can be removed. Reviewers: beanz, smeenai, compnerd Reviewed By: compnerd Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69877	2019-11-08 10:37:16 -08:00
Lang Hames	bc7e42d8b9	[docs] Fix references to a renamed flag. The -use-mcjit option was replaced with -jit-kind=mcjit a while back. This patch updates the docs to reflect that. Patch by Yu Jian. Thanks Jian!	2019-11-06 14:42:57 -08:00
Daniel Sanders	9413ba8b9d	[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference It looks like I pushed an older version of this commit without the review fixups earlier. This applies the review changes Differential Revision: https://reviews.llvm.org/D69545	2019-11-05 15:44:26 -08:00
Daniel Sanders	6cdb5ec3bc	[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference Summary: Rework the GMIR documentation to focus more on the end user than the implementation and tie it in to the MIR document. There was also some out-of-date information which has been removed. The quality of the GenericOpcode reference is highly variable and drops sharply as I worked through them all but we've got to start somewhere :-). It would be great if others could expand on this too as there is an awful lot to get through. Also fix a typo in the definition of G_FLOG. Previously, the comments said we had two base-2's (G_FLOG and G_FLOG2). Reviewers: aemerson, volkan, rovka, arsenm Reviewed By: rovka Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69545	2019-11-05 15:16:43 -08:00
Daniel Sanders	c868861fe8	[globalisel][docs] Add a section about debugging with the block extractor Summary: Depends on D69644 Reviewers: rovka, volkan, arsenm Subscribers: wdng, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69645	2019-11-05 14:48:27 -08:00
Daniel Sanders	76c9170daf	[globalisel][docs] Add KnownBits Analysis documentation Summary: This is largely based off of the slides from the keynote Depends on D69545 Reviewers: volkan, rovka, arsenm Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69644	2019-11-05 09:55:33 -08:00
Fangrui Song	dd2cb2d821	[llvm-objcopy][ELF] Implement --only-keep-debug --only-keep-debug produces a debug file as the output that only preserves contents of sections useful for debugging purposes (the binutils implementation preserves SHT_NOTE and non-SHF_ALLOC sections), by changing their section types to SHT_NOBITS and rewritting file offsets. See https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html The intended use case is: ``` llvm-objcopy --only-keep-debug a a.dbg llvm-objcopy --strip-debug a b llvm-objcopy --add-gnu-debuglink=a.dbg b ``` The current layout algorithm is incapable of deleting contents and shrinking segments, so it is not suitable for implementing the functionality. This patch adds a new algorithm which assigns sh_offset to sections first, then modifies p_offset/p_filesz of program headers. It bears a resemblance to lld/ELF/Writer.cpp. Reviewed By: jhenderson, jakehehrlich Differential Revision: https://reviews.llvm.org/D67137	2019-11-05 08:56:15 -08:00
Nuno Lopes	105e48e377	[Docs] Add LangRef documentation for freeze instruction Summary: - Describe the new freeze instruction - Make it explicit that branch on undef/poison is UB Reviewers: chandlerc, majnemer, efriedma, nikic, reames, jdoerfert, lebedev.ri, regehr Subscribers: fhahn, bollu, lebedev.ri, delcypher, spatel, filcab, llvm-commits, aqjune Differential Revision: https://reviews.llvm.org/D29121	2019-11-05 11:35:55 +00:00
Craig Topper	7555ade6d7	[X86] Add support for -mvzeroupper and -mno-vzeroupper to match gcc -mvzeroupper will force the vzeroupper insertion pass to run on CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs where it normally runs. To support this with the default feature handling in clang, we need a vzeroupper feature flag in X86.td. Since this flag has the opposite polarity of the fast-partial-ymm-or-zmm-write we used to use to disable the pass, we now need to add this new flag to every CPU except KNL/KNM and BTVER2 to keep identical behavior. Remove -fast-partial-ymm-or-zmm-write which is no longer used. Differential Revision: https://reviews.llvm.org/D69786	2019-11-04 11:03:54 -08:00
Amy Huang	68fcd6e209	Recommit "[CodeView] Add option to disable inline line tables." This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804. Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89 Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. https://reviews.llvm.org/D67723	2019-11-04 09:15:26 -08:00
Stefan Stipanovic	4e04be8943	NoFree argument attribute. Summary: Deducing nofree atrribute for function arguments. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67886	2019-11-02 19:40:48 +01:00
Stefan Stipanovic	8973a4b594	Revert "NoFree argument attribute." This reverts commit c12efa2ed0547f7f9f8fba0ad7a76a4cb08bf53a.	2019-11-02 17:31:02 +01:00
Stefan Stipanovic	e0f934d786	NoFree argument attribute. Summary: Deducing nofree atrribute for function arguments. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67886	2019-11-02 16:35:38 +01:00
Roman Lebedev	f757355f93	Revert BCmp Loop Idiom recognition transform (PR43870) As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870, this transform is missing a crucial legality check: the old (non-countable) loop would early-return upon first mismatch, but there is no such guarantee for bcmp/memcmp. We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes) are fully dereferenceable memory regions. But that would limit the transform to constant loop trip counts and would further cripple it because dereferenceability analysis is very partial. Furthermore, even if all that is done, every single test would need to be rewritten from scratch. So let's just give up.	2019-11-02 12:48:03 +03:00
Evgenii Stepanov	be50cd2577	Add MemTagSanitizer documentation. Summary: A lot of this is work in progress... Reviewers: kcc, pcc Subscribers: cryptoad, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69289	2019-11-01 10:46:04 -07:00
Adrian Prantl	f258d85c4e	Fix a few typos in SourceLevelDebugging.rst	2019-10-31 16:03:44 -07:00
James Henderson	ad6a78daf8	[llvm-objcopy] Preserve .ARM.attributes section when stripping files This works around a bug in Debian's patchset for glibc. The bug is described in detail in the upstream debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943798, but the short version of it is that glibc on any Debian based distro don't load libraries unless it has a .ARM.attribute section. Reviewed by: jhenderson, rupprecht, MaskRay, jakehehrlich Differential Revision: https://reviews.llvm.org/D69188 Patch by Tobias Hieta.	2019-10-31 11:57:19 +00:00
Seiya Nuta	464b92632c	[llvm-objcopy][MachO] Implement --strip-all Reviewers: alexshap, rupprecht, jdoerfert, jhenderson Reviewed By: alexshap Subscribers: jakehehrlich, abrachet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66281	2019-10-31 14:26:46 +09:00
Amy Huang	3c3d476bf5	Revert "[CodeView] Add option to disable inline line tables." because it breaks compiler-rt tests. This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.	2019-10-30 17:31:12 -07:00
Amy Huang	4594549314	[CodeView] Add option to disable inline line tables. Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723	2019-10-30 16:52:39 -07:00
Daniel Sanders	ace9d22f88	[globalisel][docs] Add the tutorial to the Porting document In lieu of converting that tutorial to text, add a link to the porting tutorial from the 2017 Dev Meeting to the porting page	2019-10-30 14:53:39 -07:00
Alina Sbirlea	1da1835885	[ReleaseNotes] Add item on deleting the BasicBlockPass(Manager).	2019-10-30 14:26:46 -07:00
Daniel Sanders	b28aa13df0	[globalisel][docs] Rework the Legalizer page slightly The legalizer page was in a fairly good state. I've mostly just inlined some information as a note and removed a reference to potential future work that I think is very unlikely to be done (it's very hard to tell if a pattern or set of patterns fully covers a node due to C++ predicates). Also added a note that 'selectable' doesn't mean that InstructionSelect must do it.	2019-10-30 13:42:19 -07:00
Evandro Menezes	7348b241c1	[clang][llvm] Obsolete Exynos M1 and M2	2019-10-30 15:02:59 -05:00
Daniel Sanders	001e8e94bf	[globalisel][docs] Add a pass index	2019-10-30 12:06:22 -07:00
Daniel Sanders	aea8b50910	[globalisel][docs] Fix a label that was renamed	2019-10-30 11:47:29 -07:00
Alina Sbirlea	dfa5798f2b	[LegacyPassManager] Delete BasicBlockPass/Manager. Summary: Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation. The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254. In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager. Reviewers: chandlerc, echristo Subscribers: mehdi_amini, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69121	2019-10-30 11:40:16 -07:00
Jay Foad	9a5af2ea92	[IR] Allow fast math flags on calls with floating point array type. Summary: This extends the rules for when a call instruction is deemed to be an FPMathOperator, which is based on the type of the call (i.e. the return type of the function being called). Previously we only allowed floating-point and vector-of-floating-point types. Now we also allow arrays (nested to any depth) of floating-point and vector-of-floating-point types. This was motivated by llpc, the pipeline compiler for AMD GPUs (https://github.com/GPUOpen-Drivers/llpc). llpc has many math library functions that operate on vectors, typically represented as <4 x float>, and some that operate on matrices, typically represented as [4 x <4 x float>], and it's useful to be able to decorate calls to all of them with fast math flags. Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy Subscribers: wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69161	2019-10-30 14:00:33 +00:00
Vlad Tsyrklevich	05d7b20b14	Revert "[llvm-cov] Add option to whitelist filenames" This reverts commit bfed824b57d14e2ba98ddbaf1a1410cf04a3e279, the included test fails on many bots including the sanitier bots, e.g. in http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/36140	2019-10-29 22:38:38 -07:00
Vedant Kumar	0f46172e59	[llvm-cov] Add option to whitelist filenames Add the `-whitelist-filename-regex` option to restrict coverage reporting to file paths that match a whitelist regex. Patch by Michael Daniels! rdar://56720320	2019-10-29 18:26:33 -07:00
Adrian Prantl	26a102eb58	[DWARF5] Added support for deleted C++ special member functions. This patch adds support for deleted C++ special member functions in clang and llvm. Also added Defaulted member encodings for future support for defaulted member functions. Patch by Sourabh Singh Tomar! Differential Revision: https://reviews.llvm.org/D69215	2019-10-29 13:44:06 -07:00
Daniel Sanders	8e371e2336	[globalisel][docs] Fix warning treated as error I had hoped that I could have some ``` .. code-block:: MIR ``` sections for MIR examples which causes a warning about pygments not supporting it but we have warnings treated as errors	2019-10-29 13:27:48 -07:00
Daniel Sanders	705b4ca866	[globalisel][docs] Rewrite the IRTranslator documentation Summary: I haven't refreshed the Function Calls section as I don't feel I have sufficient knowledge of that area. It would be appreciated if someone could review that section. Note: I'm aware that pygments doesn't support 'mir' as used in one of the code-block directives. This currently emits a warning and I decided to keep it to enable finding them later. Maybe we can teach pygments to support it. Depends on D69456 Reviewers: volkan, aditya_nandakumar Subscribers: rovka, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69457	2019-10-29 13:14:58 -07:00
Philip Reames	f4b7fd46a9	[Docs] Reflect the slow migration from guard to widenable condition which is currently in progress.	2019-10-29 12:46:24 -07:00
Daniel Sanders	285fcbbd38	[globalisel][docs] Rewrite the pipeline overview Summary: Rewrite the pipeline overview to be more focused on the structure and flexibility as well as highlight the increased usefulness of MachineVerifier and increased testability resulting from the smaller incremental passes approach. The diagrams are lifted from the slides for the LLVMDev 2019 talk 'Generating Optimized Code with GlobalISel' and adapted to be readable on the white background used in the docs. Reviewers: volkan Subscribers: rovka, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69456	2019-10-29 11:21:24 -07:00
Francis Visoiu Mistrih	9723c07d38	[Remarks] Remove references to ELF support There is no ELF support at the moment. Remove all the references to the `.remarks` section.	2019-10-28 12:50:46 -07:00
Francis Visoiu Mistrih	7aefbf742c	[Remarks] Emit the remarks section by default for certain formats Emit a remarks section by default for the following formats: * bitstream * yaml-strtab while still providing -remarks-section=<bool> to override the defaults.	2019-10-28 12:50:46 -07:00
Andrew Paverd	d090368b0c	Add Windows Control Flow Guard checks (/guard:cf). Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761	2019-10-28 15:19:39 +00:00
Seiya Nuta	e73b1ca85f	[llvm-objcopy][MachO] Implement --only-section Reviewers: alexshap, rupprecht, jdoerfert, jhenderson Reviewed By: alexshap, rupprecht, jhenderson Subscribers: mgorny, jakehehrlich, abrachet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65541	2019-10-28 16:00:20 +09:00
Daniel Sanders	c808a49b50	[globalisel] Restructure the GlobalISel documentation There's a couple minor deletions amongst this but 99% of it is just moving the documentation around to prepare the way for more meaningful changes.	2019-10-25 15:51:09 -07:00
Daniel Sanders	4932a4c57d	[globalisel] Fix typo in 'Add LLVMDev 2019 talks and links for the 2017 talks'	2019-10-25 15:01:14 -07:00
Daniel Sanders	74de9b7cb8	[globalisel] Add LLVMDev 2019 talks and links for the 2017 talks	2019-10-25 14:53:58 -07:00
Saleem Abdulrasool	f93757f9ac	build: remove `LLVM_CXX_STD` extension point This extension point is not needed. Provide the equivalent option through `CMAKE_CXX_STANDARD` which mirrors the previous extension point. Rely on CMake to provide the check for the compiler instead.	2019-10-25 11:51:47 -07:00
Simon Atanasyan	4e7839f5cd	[docs] Update Mips feature table in CodeGenerator.rst Patch by Miloš Stojanović Differential Revision: https://reviews.llvm.org/D69381	2019-10-25 12:17:34 +03:00
Tom Stellard	2e60739b40	docs: Update instructions for requesting commit access	2019-10-24 20:42:02 -07:00
Simon Atanasyan	ab098e60e0	[docs] Add Mips as a supported architecture in GettingStarted.rst Patch by Miloš Stojanović Differential Revision: https://reviews.llvm.org/D69380	2019-10-24 15:56:30 +03:00
Simon Atanasyan	a21e854499	[docs] Update link to the MIPS 64-bit ELF object file specification Patch by Miloš Stojanović Differential Revision: https://reviews.llvm.org/D69377	2019-10-24 15:56:30 +03:00
Marek Kurdej	4b723fae58	[libFuzzer] docs: update note to include REDUCE event.	2019-10-24 12:04:12 +02:00
Meike Baumgärtner	e95503bd4f	Improve language in GettingStarted.rst This patch was reviewed and approved by chandlerc. "Getting Started with the LLVM System" is the first point of contact for many newcomers in the LLVM community. * Make the first two paragraphs more welcoming * Use more inclusive language	2019-10-23 12:32:57 -07:00
Chandler Carruth	e778f38f69	Remove a no longer accurate sentence from the coding standards. (And test my commit access. We're working on larger changes here.)	2019-10-23 11:40:45 -07:00
Kit Barton	b4d367c63a	Fix broken sphinx link in CMake.rst. Reviewers: delcypher, beanz Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69325	2019-10-22 14:49:58 -07:00
Owen Reynolds	8604299757	[docs][llvm-ar] Update llvm-ar command guide The llvm-ar command guide had not been updated in some time, it was missing current functionality and contained information that was out of date. This change: - Updates the use of reStructuredText directives, as seen in other tools command guides. - Updates the command synopsis. - Updates the descriptions of the tool behaviour. - Updates the options section. - Adds details of MRI script functionality. - Removes the sections "Standards" and "File Format" Differential Revision: https://reviews.llvm.org/D68998 llvm-svn: 375412	2019-10-21 13:13:31 +00:00
Sylvestre Ledru	0e050e2a4c	Explicit in the doc the current list of projects (with easy copy and paste) llvm-svn: 375339	2019-10-19 09:55:24 +00:00
Sylvestre Ledru	0b69f98f0c	Make it clear in the doc that 'all' in LLVM_ENABLE_PROJECTS does install ALL projects llvm-svn: 375337	2019-10-19 09:27:14 +00:00
Jay Foad	ee4f590a23	Update docs for fast-math flags. This adds fneg, phi and select to the list of operations that may use fast-math flags. llvm-svn: 375250	2019-10-18 16:07:09 +00:00
Jordan Rupprecht	63099538ab	[llvm-objcopy] Add support for shell wildcards Summary: GNU objcopy accepts the --wildcard flag to allow wildcard matching on symbol-related flags. (Note: it's implicitly true for section flags). The basic syntax is to allow , ?, \, and [] which work similarly to how they work in a shell. Additionally, starting a wildcard with ! causes that wildcard to prevent it from matching a flag. Use an updated GlobPattern in libSupport to handle these patterns. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `` is what's used anyway). Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap Reviewed By: MaskRay Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66613 llvm-svn: 375169	2019-10-17 20:51:00 +00:00
Fangrui Song	7898dd39d1	[docs][llvm-ar] Fix option:: O after r375106 docs-llvm-html fails => unknown option: O There are lots of formatting issues in the file but they will be fixed by D68998. llvm-svn: 375107	2019-10-17 11:56:26 +00:00
Fangrui Song	c1f3443f19	[llvm-ar] Implement the O modifier: display member offsets inside the archive Since GNU ar 2.31, the 't' operation prints member offsets beside file names if the 'O' modifier is specified. 'O' is ignored for thin archives. Reviewed By: gbreynoo, ruiu Differential Revision: https://reviews.llvm.org/D69087 llvm-svn: 375106	2019-10-17 11:34:29 +00:00
Oliver Stannard	6aaf81e821	Reland: Dead Virtual Function Elimination Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094	2019-10-17 09:58:57 +00:00
Alina Sbirlea	2ef49693b1	Update ReleaseNotes: expand the section on enabling MemorySSA llvm-svn: 375045	2019-10-16 21:52:09 +00:00
Owen Reynolds	8ba2eb978f	[llvm-ar] Make paths case insensitive when on windows When on windows gnu-ar treats member names as case insensitive. This commit implements the same behaviour. Differential Revision: https://reviews.llvm.org/D68033 llvm-svn: 375002	2019-10-16 14:07:57 +00:00
DeForest Richards	ad49568b44	[Docs] Updates sidebar links and sets max-width property for div.body Updates the sidebar links for Getting Started. Also sets max-width on div.body to 1000px. llvm-svn: 374949	2019-10-15 21:27:20 +00:00
David Stenberg	49e133c5a7	[DebugInfo] Add a DW_OP_LLVM_entry_value operation Summary: Internally in LLVM's metadata we use DW_OP_entry_value operations with the same semantics as DWARF; that is, its operand specifies the number of bytes that the entry value covers. At the time of emitting entry values we don't know the emitted size of the DWARF expression that the entry value will cover. Currently the size is hardcoded to 1 in DIExpression, and other values causes the verifier to fail. As the size is 1, that effectively means that we can only have valid entry values for registers that can be encoded in one byte, which are the registers with DWARF numbers 0 to 31 (as they can be encoded as single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump will print an operation "correctly", even if the byte size is less than that, which may make it seem that we emit correct DWARF for registers with DWARF numbers > 31. If you instead use readelf for such cases, it will interpret the number of specified bytes as a DWARF expression. This seems like a limitation in llvm-dwarfdump. As suggested in D66746, a way forward would be to add an internal variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand instead specifies the number of operations that the entry value covers, and we then translate that into the byte size at the time of emission. In this patch that internal operation is added. This patch keeps the limitation that a entry value can only be applied to simple register locations, but it will fix the issue with the size operand being incorrect for DWARF numbers > 31. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: aprantl Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67492 llvm-svn: 374881	2019-10-15 11:31:21 +00:00
Jorge Gorbe Moya	6a19d78c0a	Revert "Dead Virtual Function Elimination" This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f. llvm-svn: 374844	2019-10-14 23:25:25 +00:00
DeForest Richards	a22e4dada0	[Docs] Moves Control Flow Document to User Guides Moves Control Flow document from Reference docs page to User guides page. llvm-svn: 374733	2019-10-13 20:05:22 +00:00
Roman Lebedev	0dff68630e	[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition Summary: This is a recommit, this originally landed in rL370454 but was subsequently reverted in rL370788 due to https://bugs.llvm.org/show_bug.cgi?id=43206 The reduced testcase was added to bcmp-negative-tests.ll as @pr43206_different_loops - we must ensure that the SCEV's we got are both for the same loop we are currently investigating. Original commit message: @mclow.lists brought up this issue up in IRC. It is a reasonably common problem to compare some two values for equality. Those may be just some integers, strings or arrays of integers. In C, there is `memcmp()`, `bcmp()` functions. In C++, there exists `std::equal()` algorithm. One can also write that function manually. libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ libc++ does not do anything like that, it simply relies on simple C++'s `operator==()`. https://godbolt.org/z/er0Zwf (GOOD!) So likely, there exists a certain performance opportunities. Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that is using `memcmp()` (in this case, compiled with modified compiler). {F8768213} ``` #include <algorithm> #include <cmath> #include <cstdint> #include <iterator> #include <limits> #include <random> #include <type_traits> #include <utility> #include <vector> #include "benchmark/benchmark.h" template <class T> bool equal(T* a, T* a_end, T* b) noexcept { for (; a != a_end; ++a, ++b) { if (a != b) return false; } return true; } template <typename T> std::vector<T> getVectorOfRandomNumbers(size_t count) { std::random_device rd; std::mt19937 gen(rd()); std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(), std::numeric_limits<T>::max()); std::vector<T> v; v.reserve(count); std::generate_n(std::back_inserter(v), count, [&dis, &gen]() { return dis(gen); }); assert(v.size() == count); return v; } struct Identical { template <typename T> static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) { auto Tmp = getVectorOfRandomNumbers<T>(count); return std::make_pair(Tmp, std::move(Tmp)); } }; struct InequalHalfway { template <typename T> static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) { auto V0 = getVectorOfRandomNumbers<T>(count); auto V1 = V0; V1[V1.size() / size_t(2)]++; // just change the value. return std::make_pair(std::move(V0), std::move(V1)); } }; template <class T, class Gen> void BM_bcmp(benchmark::State& state) { const size_t Length = state.range(0); const std::pair<std::vector<T>, std::vector<T>> Data = Gen::template Gen<T>(Length); const std::vector<T>& a = Data.first; const std::vector<T>& b = Data.second; assert(a.size() == Length && b.size() == a.size()); benchmark::ClobberMemory(); benchmark::DoNotOptimize(a); benchmark::DoNotOptimize(a.data()); benchmark::DoNotOptimize(b); benchmark::DoNotOptimize(b.data()); for (auto _ : state) { const bool is_equal = equal(a.data(), a.data() + a.size(), b.data()); benchmark::DoNotOptimize(is_equal); } state.SetComplexityN(Length); state.counters["eltcnt"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant); state.counters["eltcnt/sec"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate); const size_t BytesRead = 2 * sizeof(T) * Length; state.counters["bytes_read/iteration"] = benchmark::Counter(BytesRead, benchmark::Counter::kDefaults, benchmark::Counter::OneK::kIs1024); state.counters["bytes_read/sec"] = benchmark::Counter( BytesRead, benchmark::Counter::kIsIterationInvariantRate, benchmark::Counter::OneK::kIs1024); } template <typename T> static void CustomArguments(benchmark::internal::Benchmark* b) { const size_t L2SizeBytes = []() { for (const benchmark::CPUInfo::CacheInfo& I : benchmark::CPUInfo::Get().caches) { if (I.level == 2) return I.size; } return 0; }(); // What is the largest range we can check to always fit within given L2 cache? const size_t MaxLen = L2SizeBytes / /total bufs/ 2 / /maximal elt size/ sizeof(T) / /safety margin/ 2; b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN); } BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical) ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical) ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical) ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical) ->Apply(CustomArguments<uint64_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway) ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway) ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway) ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway) ->Apply(CustomArguments<uint64_t>); ``` {F8768210} ``` $ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx 2019-04-25 21:17:11 Running build-old/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 0.65, 3.90, 4.14 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... --------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 432131 ns 432101 ns 1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s BM_bcmp<uint8_t, Identical>_BigO 0.86 N 0.86 N BM_bcmp<uint8_t, Identical>_RMS 8 % 8 % <...> BM_bcmp<uint16_t, Identical>/256000 161408 ns 161409 ns 4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s BM_bcmp<uint16_t, Identical>_BigO 0.67 N 0.67 N BM_bcmp<uint16_t, Identical>_RMS 25 % 25 % <...> BM_bcmp<uint32_t, Identical>/128000 81497 ns 81488 ns 8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s BM_bcmp<uint32_t, Identical>_BigO 0.71 N 0.71 N BM_bcmp<uint32_t, Identical>_RMS 42 % 42 % <...> BM_bcmp<uint64_t, Identical>/64000 50138 ns 50138 ns 10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s BM_bcmp<uint64_t, Identical>_BigO 0.84 N 0.84 N BM_bcmp<uint64_t, Identical>_RMS 27 % 27 % <...> BM_bcmp<uint8_t, InequalHalfway>/512000 192405 ns 192392 ns 3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s BM_bcmp<uint8_t, InequalHalfway>_BigO 0.38 N 0.38 N BM_bcmp<uint8_t, InequalHalfway>_RMS 3 % 3 % <...> BM_bcmp<uint16_t, InequalHalfway>/256000 127858 ns 127860 ns 5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s BM_bcmp<uint16_t, InequalHalfway>_BigO 0.50 N 0.50 N BM_bcmp<uint16_t, InequalHalfway>_RMS 0 % 0 % <...> BM_bcmp<uint32_t, InequalHalfway>/128000 49140 ns 49140 ns 14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s BM_bcmp<uint32_t, InequalHalfway>_BigO 0.40 N 0.40 N BM_bcmp<uint32_t, InequalHalfway>_RMS 18 % 18 % <...> BM_bcmp<uint64_t, InequalHalfway>/64000 32101 ns 32099 ns 21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s BM_bcmp<uint64_t, InequalHalfway>_BigO 0.50 N 0.50 N BM_bcmp<uint64_t, InequalHalfway>_RMS 1 % 1 % RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0 2019-04-25 21:19:29 Running build-new/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 1.01, 2.85, 3.71 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... --------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 18593 ns 18590 ns 37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s BM_bcmp<uint8_t, Identical>_BigO 0.04 N 0.04 N BM_bcmp<uint8_t, Identical>_RMS 37 % 37 % <...> BM_bcmp<uint16_t, Identical>/256000 18950 ns 18948 ns 37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s BM_bcmp<uint16_t, Identical>_BigO 0.08 N 0.08 N BM_bcmp<uint16_t, Identical>_RMS 34 % 34 % <...> BM_bcmp<uint32_t, Identical>/128000 18627 ns 18627 ns 37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s BM_bcmp<uint32_t, Identical>_BigO 0.16 N 0.16 N BM_bcmp<uint32_t, Identical>_RMS 35 % 35 % <...> BM_bcmp<uint64_t, Identical>/64000 18855 ns 18855 ns 37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s BM_bcmp<uint64_t, Identical>_BigO 0.32 N 0.32 N BM_bcmp<uint64_t, Identical>_RMS 33 % 33 % <...> BM_bcmp<uint8_t, InequalHalfway>/512000 9570 ns 9569 ns 73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s BM_bcmp<uint8_t, InequalHalfway>_BigO 0.02 N 0.02 N BM_bcmp<uint8_t, InequalHalfway>_RMS 29 % 29 % <...> BM_bcmp<uint16_t, InequalHalfway>/256000 9547 ns 9547 ns 74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s BM_bcmp<uint16_t, InequalHalfway>_BigO 0.04 N 0.04 N BM_bcmp<uint16_t, InequalHalfway>_RMS 29 % 29 % <...> BM_bcmp<uint32_t, InequalHalfway>/128000 9396 ns 9394 ns 73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s BM_bcmp<uint32_t, InequalHalfway>_BigO 0.08 N 0.08 N BM_bcmp<uint32_t, InequalHalfway>_RMS 30 % 30 % <...> BM_bcmp<uint64_t, InequalHalfway>/64000 9499 ns 9498 ns 73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s BM_bcmp<uint64_t, InequalHalfway>_BigO 0.16 N 0.16 N BM_bcmp<uint64_t, InequalHalfway>_RMS 28 % 28 % Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench Benchmark Time CPU Time Old Time New CPU Old CPU New --------------------------------------------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 -0.9570 -0.9570 432131 18593 432101 18590 <...> BM_bcmp<uint16_t, Identical>/256000 -0.8826 -0.8826 161408 18950 161409 18948 <...> BM_bcmp<uint32_t, Identical>/128000 -0.7714 -0.7714 81497 18627 81488 18627 <...> BM_bcmp<uint64_t, Identical>/64000 -0.6239 -0.6239 50138 18855 50138 18855 <...> BM_bcmp<uint8_t, InequalHalfway>/512000 -0.9503 -0.9503 192405 9570 192392 9569 <...> BM_bcmp<uint16_t, InequalHalfway>/256000 -0.9253 -0.9253 127858 9547 127860 9547 <...> BM_bcmp<uint32_t, InequalHalfway>/128000 -0.8088 -0.8088 49140 9396 49140 9394 <...> BM_bcmp<uint64_t, InequalHalfway>/64000 -0.7041 -0.7041 32101 9499 32099 9498 ``` What can we tell from the benchmark? * Performance of naive equality check somewhat improves with element size, maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s for uint64_t. I think, that instability implies performance problems. * Performance of `memcmp()`-aware benchmark always maxes out at around bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the naive variant! * eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and linearly decreases with element size. For uint64_t, it's ~4x+ the elements/second. * The call obvious is more pricey than the loop, with small element count. As it can be seen from the full output {F8768210}, the `memcmp()` is almost universally worse, independent of the element size (and thus buffer size) when element count is less than 8. So all in all, bcmp idiom does indeed pose untapped performance headroom. This diff does implement said idiom recognition. I think a reasonable test coverage is present, but do tell if there is anything obvious missing. Now, quality. This does succeed to build and pass the test-suite, at least without any non-bundled elements. {F8768216} {F8768217} This transform fires 91 times: ``` $ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json Tests: 1149 Metric: loop-idiom.NumBCmp Program result-new MultiSourc...Benchmarks/7zip/7zip-benchmark 79.00 MultiSource/Applications/d/make_dparser 3.00 SingleSource/UnitTests/vla 2.00 MultiSource/Applications/Burg/burg 1.00 MultiSourc.../Applications/JM/lencod/lencod 1.00 MultiSource/Applications/lemon/lemon 1.00 MultiSource/Benchmarks/Bullet/bullet 1.00 MultiSourc...e/Benchmarks/MallocBench/gs/gs 1.00 MultiSourc...gs-C/TimberWolfMC/timberwolfmc 1.00 MultiSourc...Prolangs-C/simulator/simulator 1.00 ``` The size changes are: I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look. ``` $ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash Tests: 1149 Same hash: 907 (filtered out) Remaining: 242 Metric: size..text Program result-old result-new diff test-suite...ingleSource/UnitTests/vla.test 753.00 833.00 10.6% test-suite...marks/7zip/7zip-benchmark.test 1001697.00 966657.00 -3.5% test-suite...ngs-C/simulator/simulator.test 32369.00 32321.00 -0.1% test-suite...plications/d/make_dparser.test 89585.00 89505.00 -0.1% test-suite...ce/Applications/Burg/burg.test 40817.00 40785.00 -0.1% test-suite.../Applications/lemon/lemon.test 47281.00 47249.00 -0.1% test-suite...TimberWolfMC/timberwolfmc.test 250065.00 250113.00 0.0% test-suite...chmarks/MallocBench/gs/gs.test 149889.00 149873.00 -0.0% test-suite...ications/JM/lencod/lencod.test 769585.00 769569.00 -0.0% test-suite.../Benchmarks/Bullet/bullet.test 770049.00 770049.00 0.0% test-suite...HMARK_ANISTROPIC_DIFFUSION/128 NaN NaN nan% test-suite...HMARK_ANISTROPIC_DIFFUSION/256 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/64 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/32 NaN NaN nan% test-suite...ENCHMARK_BILATERAL_FILTER/64/4 NaN NaN nan% Geomean difference nan% result-old result-new diff count 1.000000e+01 10.00000 10.000000 mean 3.152090e+05 311695.40000 0.006749 std 3.790398e+05 372091.42232 0.036605 min 7.530000e+02 833.00000 -0.034981 25% 4.243300e+04 42401.00000 -0.000866 50% 1.197370e+05 119689.00000 -0.000392 75% 6.397050e+05 639705.00000 -0.000005 max 1.001697e+06 966657.00000 0.106242 ``` I don't have timings though. And now to the code. The basic idea is to completely replace the whole loop. If we can't fully kill it, don't transform. I have left one or two comments in the code, so hopefully it can be understood. Also, there is a few TODO's that i have left for follow-ups: * widening of `memcmp()`/`bcmp()` * step smaller than the comparison size * Metadata propagation * more than two blocks as long as there is still a single backedge? * ??? Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet Reviewed By: courbet Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists Tags: #llvm Differential Revision: https://reviews.llvm.org/D61144 llvm-svn: 374662	2019-10-12 15:35:32 +00:00
Oliver Stannard	901c588c1f	Dead Virtual Function Elimination Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to awlays be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006. It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 374539	2019-10-11 11:59:55 +00:00
Kai Nacke	f65b9d0379	[FileCheck] Implement --ignore-case option. The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by Posix). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 llvm-svn: 374538	2019-10-11 11:59:14 +00:00
Tom Stellard	9ce88f6e69	docs/DeveloperPolicy: Add instructions for requesting GitHub commit access Subscribers: mehdi_amini, jtony, xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66840 llvm-svn: 374474	2019-10-10 23:36:06 +00:00
Julian Lettner	f3c91b5401	[lit] Bring back `--threads` option alias Bring back `--threads` option which was lost in the move of the command line argument parsing code to cl_arguments.py. Update docs since `--workers` is preferred. llvm-svn: 374432	2019-10-10 19:43:57 +00:00
Jinsong Ji	afbbfe72dd	[PowerPC][docs] Update IBM official docs in Compiler Writers Info page Summary: Just realized that most of the links in this page are deprecated. So update some important reference here: * adding PowerISA 3.0B/2.7B * adding P8/P9 User Manual * ELFv2 ABI and errata Move deprecated ones into "Other documents..". Reviewers: #powerpc, hfinkel, nemanjai Reviewed By: hfinkel Subscribers: shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68817 llvm-svn: 374428	2019-10-10 19:25:30 +00:00
Roman Lebedev	f72d6fc559	[MCA] Show aggregate over Average Wait times for the whole snippet (PR43219) Summary: As disscused in https://bugs.llvm.org/show_bug.cgi?id=43219, i believe it may be somewhat useful to show //some// aggregates over all the sea of statistics provided. Example: ``` Average Wait times (based on the timeline view): [0]: Executions [1]: Average time spent waiting in a scheduler's queue [2]: Average time spent waiting in a scheduler's queue while ready [3]: Average time elapsed from WB until retire stage [0] [1] [2] [3] 0. 3 1.0 1.0 4.7 vmulps %xmm0, %xmm1, %xmm2 1. 3 2.7 0.0 2.3 vhaddps %xmm2, %xmm2, %xmm3 2. 3 6.0 0.0 0.0 vhaddps %xmm3, %xmm3, %xmm4 3 3.2 0.3 2.3 <total> ``` I.e. we average the averages. Reviewers: andreadb, mattd, RKSimon Reviewed By: andreadb Subscribers: gbedwell, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68714 llvm-svn: 374361	2019-10-10 14:46:21 +00:00
Dmitri Gribenko	0c3e83e627	Revert "[FileCheck] Implement --ignore-case option." This reverts commit r374339. It broke tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19066 llvm-svn: 374359	2019-10-10 14:27:14 +00:00
Kai Nacke	28535e7629	[FileCheck] Implement --ignore-case option. The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by Posix). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 llvm-svn: 374339	2019-10-10 13:15:41 +00:00
Roman Lebedev	b32eb9dfb8	[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour Summary: Quote from http://eel.is/c++draft/expr.add#4: ``` 4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. (4.1) If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. (4.2) Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n. (4.3) Otherwise, the behavior is undefined. ``` Therefore, as per the standard, applying non-zero offset to `nullptr` (or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value from the pointer itself) is undefined behavior. (if `nullptr` is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was not specified.) To make things more fun, in C (6.5.6p8), applying any offset to null pointer is undefined, although Clang front-end pessimizes the code by not lowering that info, so this UB is "harmless". Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`) LLVM middle-end uses those guarantees for transformations. If the source contains such UB's, said code may now be miscompiled. Such miscompilations were already observed: * https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html * https://github.com/google/filament/pull/1566 Surprisingly, UBSan does not catch those issues ... until now. This diff teaches UBSan about these UB's. `getelementpointer inbounds` is a pretty frequent instruction, so this does have a measurable impact on performance; I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%), and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed \| RawSpeed ]] benchmark: (all measurements done with LLVM ToT, the sanitizer never fired.) * no sanitization vs. existing check: average `+21.62%` slowdown * existing check vs. check after this patch: average `22.04%` slowdown * no sanitization vs. this patch: average `48.42%` slowdown Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers Reviewed By: rsmith Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D67122 llvm-svn: 374293	2019-10-10 09:25:02 +00:00
DeForest Richards	0bfa80a30c	[Docs] Adds section for Additional Topics on Reference page Adds a new section for Additional Topics on the Reference documentation page. Also moves Support Library topic to User Guides page. llvm-svn: 374230	2019-10-09 21:09:09 +00:00
DeForest Richards	f358a34083	[Docs] Adds Documentation links to sidebar Adds links to Getting Started/Tutorials, User Guides, and Reference documentation pages to sidebar. Also adds a new section for LLVM IR on the Reference documentation page. llvm-svn: 374214	2019-10-09 20:26:13 +00:00
DeForest Richards	fef32a3f70	[Docs] Fixes broken sphinx build - undefined label Removes label ref pointing to non-existent subsystem docs page. llvm-svn: 374128	2019-10-08 22:45:20 +00:00
Clement Courbet	e0fc2857f3	[llvm-exegesis] Add options to SnippetGenerator. Summary: This adds a `-max-configs-per-opcode` option to limit the number of configs per opcode. Reviewers: gchatelet Subscribers: tschuett, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68642 llvm-svn: 374054	2019-10-08 14:30:24 +00:00
Kevin P. Neal	75d58de3bb	Nope, I'm wrong. It looks like someone else removed these on purpose and it just happened to break the bot right when I did my push. So I'm undoing this mornings incorrect push. I've also kicked off an email to hopefully get the bot fixed the correct way. llvm-svn: 374049	2019-10-08 14:10:26 +00:00
Kevin P. Neal	fcf18f9190	Restore documentation that 'svn update' unexpectedly yanked out from under me. llvm-svn: 374045	2019-10-08 13:38:42 +00:00
Joerg Sonnenberger	c8a30b2636	Fix the spelling of my name. llvm-svn: 373980	2019-10-07 22:55:42 +00:00
Reid Kleckner	a973c0bd85	[X86] Add new calling convention that guarantees tail call optimization When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to enabled in order to enable tail call optimization. Patch by Dwight Guth <dwight.guth@runtimeverification.com>! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 llvm-svn: 373976	2019-10-07 22:28:58 +00:00
Kevin P. Neal	64943d3bc3	Fix another sphinx warning. Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373909	2019-10-07 14:14:46 +00:00
Kevin P. Neal	edcebb4ab4	Fix sphinx warnings. Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373902	2019-10-07 13:39:56 +00:00
Kevin P. Neal	02f26957db	[FPEnv] Add constrained intrinsics for lrint and lround Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900	2019-10-07 13:20:00 +00:00
Djordje Todorovic	1b3358939a	[llvm-locstats] Fix a typo in the documentation; NFC llvm-svn: 373880	2019-10-07 07:31:49 +00:00
DeForest Richards	46281ba574	[Docs] Removes Subsystem Documentation page Removes Subsystem Documentation page. Also moves existing topics on Subsystem Documentation page to User Guides and Reference pages. llvm-svn: 373872	2019-10-06 22:49:22 +00:00
DeForest Richards	8f532fcce0	[Docs] Removes Programming Documentation page Removes Programming Documentation page. Also moves existing topics on Programming Documentation page to User Guides and Reference pages. llvm-svn: 373856	2019-10-06 16:10:11 +00:00
DeForest Richards	56ebd9c381	[Docs] Adds new Getting Started/Tutorials page Adds a new page for Getting Started/Tutorials topics. Also updates existing topic categories on the User Guides and Reference pages. llvm-svn: 373854	2019-10-06 15:36:37 +00:00
Sylvestre Ledru	1b710e7f71	Update the FAQ: remove stuff related to the previous license + update info about the portability of LLVM. llvm-svn: 373576	2019-10-03 09:43:54 +00:00

1 2 3 4 5 ...

7867 Commits