1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

8000 Commits

Author SHA1 Message Date
Craig Topper
7555ade6d7 [X86] Add support for -mvzeroupper and -mno-vzeroupper to match gcc
-mvzeroupper will force the vzeroupper insertion pass to run on
CPUs that normally wouldn't. -mno-vzeroupper disables it on CPUs
where it normally runs.

To support this with the default feature handling in clang, we
need a vzeroupper feature flag in X86.td. Since this flag has
the opposite polarity of the fast-partial-ymm-or-zmm-write we
used to use to disable the pass, we now need to add this new
flag to every CPU except KNL/KNM and BTVER2 to keep identical
behavior.

Remove -fast-partial-ymm-or-zmm-write which is no longer used.

Differential Revision: https://reviews.llvm.org/D69786
2019-11-04 11:03:54 -08:00
Amy Huang
68fcd6e209 Recommit "[CodeView] Add option to disable inline line tables."
This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804.
Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89

Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

https://reviews.llvm.org/D67723
2019-11-04 09:15:26 -08:00
Stefan Stipanovic
4e04be8943 NoFree argument attribute.
Summary: Deducing nofree atrribute for function arguments.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67886
2019-11-02 19:40:48 +01:00
Stefan Stipanovic
8973a4b594 Revert "NoFree argument attribute."
This reverts commit c12efa2ed0547f7f9f8fba0ad7a76a4cb08bf53a.
2019-11-02 17:31:02 +01:00
Stefan Stipanovic
e0f934d786 NoFree argument attribute.
Summary: Deducing nofree atrribute for function arguments.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67886
2019-11-02 16:35:38 +01:00
Roman Lebedev
f757355f93 Revert BCmp Loop Idiom recognition transform (PR43870)
As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870,
this transform is missing a crucial legality check:
the old (non-countable) loop would early-return upon first mismatch,
but there is no such guarantee for bcmp/memcmp.

We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes)
are fully dereferenceable memory regions. But that would limit
the transform to constant loop trip counts and would further
cripple it because dereferenceability analysis is *very* partial.

Furthermore, even if all that is done, every single test
would need to be rewritten from scratch.

So let's just give up.
2019-11-02 12:48:03 +03:00
Evgenii Stepanov
be50cd2577 Add MemTagSanitizer documentation.
Summary: A lot of this is work in progress...

Reviewers: kcc, pcc

Subscribers: cryptoad, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69289
2019-11-01 10:46:04 -07:00
Adrian Prantl
f258d85c4e Fix a few typos in SourceLevelDebugging.rst 2019-10-31 16:03:44 -07:00
James Henderson
ad6a78daf8 [llvm-objcopy] Preserve .ARM.attributes section when stripping files
This works around a bug in Debian's patchset for glibc. The bug is
described in detail in the upstream debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943798, but the short
version of it is that glibc on any Debian based distro don't load
libraries unless it has a .ARM.attribute section.

Reviewed by: jhenderson, rupprecht, MaskRay, jakehehrlich

Differential Revision: https://reviews.llvm.org/D69188

Patch by Tobias Hieta.
2019-10-31 11:57:19 +00:00
Seiya Nuta
464b92632c [llvm-objcopy][MachO] Implement --strip-all
Reviewers: alexshap, rupprecht, jdoerfert, jhenderson

Reviewed By: alexshap

Subscribers: jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66281
2019-10-31 14:26:46 +09:00
Amy Huang
3c3d476bf5 Revert "[CodeView] Add option to disable inline line tables."
because it breaks compiler-rt tests.

This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
2019-10-30 17:31:12 -07:00
Amy Huang
4594549314 [CodeView] Add option to disable inline line tables.
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.

See https://bugs.llvm.org/show_bug.cgi?id=42344

Reviewers: rnk

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D67723
2019-10-30 16:52:39 -07:00
Daniel Sanders
ace9d22f88 [globalisel][docs] Add the tutorial to the Porting document
In lieu of converting that tutorial to text, add a link to the porting
tutorial from the 2017 Dev Meeting to the porting page
2019-10-30 14:53:39 -07:00
Alina Sbirlea
1da1835885 [ReleaseNotes] Add item on deleting the BasicBlockPass(Manager). 2019-10-30 14:26:46 -07:00
Daniel Sanders
b28aa13df0 [globalisel][docs] Rework the Legalizer page slightly
The legalizer page was in a fairly good state. I've mostly just inlined
some information as a note and removed a reference to potential future
work that I think is very unlikely to be done (it's very hard to tell if
a pattern or set of patterns fully covers a node due to C++ predicates).

Also added a note that 'selectable' doesn't mean that InstructionSelect
must do it.
2019-10-30 13:42:19 -07:00
Evandro Menezes
7348b241c1 [clang][llvm] Obsolete Exynos M1 and M2 2019-10-30 15:02:59 -05:00
Daniel Sanders
001e8e94bf [globalisel][docs] Add a pass index 2019-10-30 12:06:22 -07:00
Daniel Sanders
aea8b50910 [globalisel][docs] Fix a label that was renamed 2019-10-30 11:47:29 -07:00
Alina Sbirlea
dfa5798f2b [LegacyPassManager] Delete BasicBlockPass/Manager.
Summary:
Delete the BasicBlockPass and BasicBlockManager, all its dependencies and update documentation.
The BasicBlockManager was improperly tested and found to be potentially broken, and was deprecated as of rL373254.

In light of the switch to the new pass manager coming before the next release, this patch is a first cleanup of the LegacyPassManager.

Reviewers: chandlerc, echristo

Subscribers: mehdi_amini, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69121
2019-10-30 11:40:16 -07:00
Jay Foad
9a5af2ea92 [IR] Allow fast math flags on calls with floating point array type.
Summary:
This extends the rules for when a call instruction is deemed to be an
FPMathOperator, which is based on the type of the call (i.e. the return
type of the function being called). Previously we only allowed
floating-point and vector-of-floating-point types. Now we also allow
arrays (nested to any depth) of floating-point and
vector-of-floating-point types.

This was motivated by llpc, the pipeline compiler for AMD GPUs
(https://github.com/GPUOpen-Drivers/llpc). llpc has many math library
functions that operate on vectors, typically represented as <4 x float>,
and some that operate on matrices, typically represented as
[4 x <4 x float>], and it's useful to be able to decorate calls to all
of them with fast math flags.

Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69161
2019-10-30 14:00:33 +00:00
Vlad Tsyrklevich
05d7b20b14 Revert "[llvm-cov] Add option to whitelist filenames"
This reverts commit bfed824b57d14e2ba98ddbaf1a1410cf04a3e279, the
included test fails on many bots including the sanitier bots, e.g. in
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/36140
2019-10-29 22:38:38 -07:00
Vedant Kumar
0f46172e59 [llvm-cov] Add option to whitelist filenames
Add the `-whitelist-filename-regex` option to restrict coverage
reporting to file paths that match a whitelist regex.

Patch by Michael Daniels!

rdar://56720320
2019-10-29 18:26:33 -07:00
Adrian Prantl
26a102eb58 [DWARF5] Added support for deleted C++ special member functions.
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.

Patch by Sourabh Singh Tomar!

Differential Revision: https://reviews.llvm.org/D69215
2019-10-29 13:44:06 -07:00
Daniel Sanders
8e371e2336 [globalisel][docs] Fix warning treated as error
I had hoped that I could have some
```
.. code-block:: MIR
```
sections for MIR examples which causes a warning about pygments not
supporting it but we have warnings treated as errors
2019-10-29 13:27:48 -07:00
Daniel Sanders
705b4ca866 [globalisel][docs] Rewrite the IRTranslator documentation
Summary:
I haven't refreshed the Function Calls section as I don't feel I have
sufficient knowledge of that area. It would be appreciated if someone could
review that section.

Note: I'm aware that pygments doesn't support 'mir' as used in one of the
code-block directives. This currently emits a warning and I decided to
keep it to enable finding them later. Maybe we can teach pygments to
support it.

Depends on D69456

Reviewers: volkan, aditya_nandakumar

Subscribers: rovka, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69457
2019-10-29 13:14:58 -07:00
Philip Reames
f4b7fd46a9 [Docs] Reflect the slow migration from guard to widenable condition which is currently in progress. 2019-10-29 12:46:24 -07:00
Daniel Sanders
285fcbbd38 [globalisel][docs] Rewrite the pipeline overview
Summary:
Rewrite the pipeline overview to be more focused on the structure and
flexibility as well as highlight the increased usefulness of
MachineVerifier and increased testability resulting from the smaller
incremental passes approach.

The diagrams are lifted from the slides for the LLVMDev 2019 talk
'Generating Optimized Code with GlobalISel' and adapted to be readable on
the white background used in the docs.

Reviewers: volkan

Subscribers: rovka, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69456
2019-10-29 11:21:24 -07:00
Francis Visoiu Mistrih
9723c07d38 [Remarks] Remove references to ELF support
There is no ELF support at the moment.

Remove all the references to the `.remarks` section.
2019-10-28 12:50:46 -07:00
Francis Visoiu Mistrih
7aefbf742c [Remarks] Emit the remarks section by default for certain formats
Emit a remarks section by default for the following formats:

* bitstream
* yaml-strtab

while still providing -remarks-section=<bool> to override the defaults.
2019-10-28 12:50:46 -07:00
Andrew Paverd
d090368b0c Add Windows Control Flow Guard checks (/guard:cf).
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64) or
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.

Reviewers: thakis, rnk, theraven, pcc

Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65761
2019-10-28 15:19:39 +00:00
Seiya Nuta
e73b1ca85f [llvm-objcopy][MachO] Implement --only-section
Reviewers: alexshap, rupprecht, jdoerfert, jhenderson

Reviewed By: alexshap, rupprecht, jhenderson

Subscribers: mgorny, jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65541
2019-10-28 16:00:20 +09:00
Daniel Sanders
c808a49b50 [globalisel] Restructure the GlobalISel documentation
There's a couple minor deletions amongst this but 99% of it is just moving
the documentation around to prepare the way for more meaningful changes.
2019-10-25 15:51:09 -07:00
Daniel Sanders
4932a4c57d [globalisel] Fix typo in 'Add LLVMDev 2019 talks and links for the 2017 talks' 2019-10-25 15:01:14 -07:00
Daniel Sanders
74de9b7cb8 [globalisel] Add LLVMDev 2019 talks and links for the 2017 talks 2019-10-25 14:53:58 -07:00
Saleem Abdulrasool
f93757f9ac build: remove LLVM_CXX_STD extension point
This extension point is not needed. Provide the equivalent option
through `CMAKE_CXX_STANDARD` which mirrors the previous extension point. Rely on
CMake to provide the check for the compiler instead.
2019-10-25 11:51:47 -07:00
Simon Atanasyan
4e7839f5cd [docs] Update Mips feature table in CodeGenerator.rst
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69381
2019-10-25 12:17:34 +03:00
Tom Stellard
2e60739b40 docs: Update instructions for requesting commit access 2019-10-24 20:42:02 -07:00
Simon Atanasyan
ab098e60e0 [docs] Add Mips as a supported architecture in GettingStarted.rst
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69380
2019-10-24 15:56:30 +03:00
Simon Atanasyan
a21e854499 [docs] Update link to the MIPS 64-bit ELF object file specification
Patch by Miloš Stojanović

Differential Revision: https://reviews.llvm.org/D69377
2019-10-24 15:56:30 +03:00
Marek Kurdej
4b723fae58 [libFuzzer] docs: update note to include REDUCE event. 2019-10-24 12:04:12 +02:00
Meike Baumgärtner
e95503bd4f Improve language in GettingStarted.rst
This patch was reviewed and approved by chandlerc.

"Getting Started with the LLVM System" is the first point of contact for many newcomers in the LLVM community.
 * Make the first two paragraphs more welcoming
 * Use more inclusive language
2019-10-23 12:32:57 -07:00
Chandler Carruth
e778f38f69 Remove a no longer accurate sentence from the coding standards.
(And test my commit access. We're working on larger changes here.)
2019-10-23 11:40:45 -07:00
Kit Barton
b4d367c63a Fix broken sphinx link in CMake.rst.
Reviewers: delcypher, beanz

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69325
2019-10-22 14:49:58 -07:00
Owen Reynolds
8604299757 [docs][llvm-ar] Update llvm-ar command guide
The llvm-ar command guide had not been updated in some time, it was
missing current functionality and contained information that was out
of date. This change:
- Updates the use of reStructuredText directives, as seen in other tools
  command guides.
- Updates the command synopsis.
- Updates the descriptions of the tool behaviour.
- Updates the options section.
- Adds details of MRI script functionality.
- Removes the sections "Standards" and "File Format"

Differential Revision: https://reviews.llvm.org/D68998

llvm-svn: 375412
2019-10-21 13:13:31 +00:00
Sylvestre Ledru
0e050e2a4c Explicit in the doc the current list of projects (with easy copy and paste)
llvm-svn: 375339
2019-10-19 09:55:24 +00:00
Sylvestre Ledru
0b69f98f0c Make it clear in the doc that 'all' in LLVM_ENABLE_PROJECTS does install ALL projects
llvm-svn: 375337
2019-10-19 09:27:14 +00:00
Jay Foad
ee4f590a23 Update docs for fast-math flags.
This adds fneg, phi and select to the list of operations that may use
fast-math flags.

llvm-svn: 375250
2019-10-18 16:07:09 +00:00
Jordan Rupprecht
63099538ab [llvm-objcopy] Add support for shell wildcards
Summary: GNU objcopy accepts the --wildcard flag to allow wildcard matching on symbol-related flags. (Note: it's implicitly true for section flags).

The basic syntax is to allow *, ?, \, and [] which work similarly to how they work in a shell. Additionally, starting a wildcard with ! causes that wildcard to prevent it from matching a flag.

Use an updated GlobPattern in libSupport to handle these patterns. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway).

Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap

Reviewed By: MaskRay

Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66613

llvm-svn: 375169
2019-10-17 20:51:00 +00:00
Fangrui Song
7898dd39d1 [docs][llvm-ar] Fix option:: O after r375106
docs-llvm-html fails => unknown option: O

There are lots of formatting issues in the file but they will be fixed by D68998.

llvm-svn: 375107
2019-10-17 11:56:26 +00:00
Fangrui Song
c1f3443f19 [llvm-ar] Implement the O modifier: display member offsets inside the archive
Since GNU ar 2.31, the 't' operation prints member offsets beside file
names if the 'O' modifier is specified. 'O' is ignored for thin
archives.

Reviewed By: gbreynoo, ruiu

Differential Revision: https://reviews.llvm.org/D69087

llvm-svn: 375106
2019-10-17 11:34:29 +00:00
Oliver Stannard
6aaf81e821 Reland: Dead Virtual Function Elimination
Remove dead virtual functions from vtables with
replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up
correctly.

Original commit message:

Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 375094
2019-10-17 09:58:57 +00:00
Alina Sbirlea
2ef49693b1 Update ReleaseNotes: expand the section on enabling MemorySSA
llvm-svn: 375045
2019-10-16 21:52:09 +00:00
Owen Reynolds
8ba2eb978f [llvm-ar] Make paths case insensitive when on windows
When on windows gnu-ar treats member names as case insensitive. This
commit implements the same behaviour.

Differential Revision: https://reviews.llvm.org/D68033

llvm-svn: 375002
2019-10-16 14:07:57 +00:00
DeForest Richards
ad49568b44 [Docs] Updates sidebar links and sets max-width property for div.body
Updates the sidebar links for Getting Started. Also sets max-width on div.body to 1000px.

llvm-svn: 374949
2019-10-15 21:27:20 +00:00
David Stenberg
49e133c5a7 [DebugInfo] Add a DW_OP_LLVM_entry_value operation
Summary:
Internally in LLVM's metadata we use DW_OP_entry_value operations with
the same semantics as DWARF; that is, its operand specifies the number
of bytes that the entry value covers.

At the time of emitting entry values we don't know the emitted size of
the DWARF expression that the entry value will cover. Currently the size
is hardcoded to 1 in DIExpression, and other values causes the verifier
to fail. As the size is 1, that effectively means that we can only have
valid entry values for registers that can be encoded in one byte, which
are the registers with DWARF numbers 0 to 31 (as they can be encoded as
single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte
DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump
will print an operation "correctly", even if the byte size is less than
that, which may make it seem that we emit correct DWARF for registers
with DWARF numbers > 31. If you instead use readelf for such cases, it
will interpret the number of specified bytes as a DWARF expression. This
seems like a limitation in llvm-dwarfdump.

As suggested in D66746, a way forward would be to add an internal
variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand
instead specifies the number of operations that the entry value covers,
and we then translate that into the byte size at the time of emission.

In this patch that internal operation is added. This patch keeps the
limitation that a entry value can only be applied to simple register
locations, but it will fix the issue with the size operand being
incorrect for DWARF numbers > 31.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: aprantl

Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67492

llvm-svn: 374881
2019-10-15 11:31:21 +00:00
Jorge Gorbe Moya
6a19d78c0a Revert "Dead Virtual Function Elimination"
This reverts commit 9f6a873268e1ad9855873d9d8007086c0d01cf4f.

llvm-svn: 374844
2019-10-14 23:25:25 +00:00
DeForest Richards
a22e4dada0 [Docs] Moves Control Flow Document to User Guides
Moves Control Flow document from Reference docs page to User guides page.

llvm-svn: 374733
2019-10-13 20:05:22 +00:00
Roman Lebedev
0dff68630e [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition
Summary:
This is a recommit, this originally landed in rL370454 but was
subsequently reverted in  rL370788 due to
https://bugs.llvm.org/show_bug.cgi?id=43206
The reduced testcase was added to bcmp-negative-tests.ll
as @pr43206_different_loops - we must ensure that the SCEV's
we got are both for the same loop we are currently investigating.

Original commit message:

@mclow.lists brought up this issue up in IRC.
It is a reasonably common problem to compare some two values for equality.
Those may be just some integers, strings or arrays of integers.

In C, there is `memcmp()`, `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.

libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ

libc++ does not do anything like that, it simply relies on simple C++'s
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)

So likely, there exists a certain performance opportunities.
Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that
is using `memcmp()` (in this case, compiled with modified compiler). {F8768213}

```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

#include "benchmark/benchmark.h"

template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                       std::numeric_limits<T>::max());
  std::vector<T> v;
  v.reserve(count);
  std::generate_n(std::back_inserter(v), count,
                  [&dis, &gen]() { return dis(gen); });
  assert(v.size() == count);
  return v;
}

struct Identical {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto Tmp = getVectorOfRandomNumbers<T>(count);
    return std::make_pair(Tmp, std::move(Tmp));
  }
};

struct InequalHalfway {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto V0 = getVectorOfRandomNumbers<T>(count);
    auto V1 = V0;
    V1[V1.size() / size_t(2)]++;  // just change the value.
    return std::make_pair(std::move(V0), std::move(V1));
  }
};

template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
  const size_t Length = state.range(0);

  const std::pair<std::vector<T>, std::vector<T>> Data =
      Gen::template Gen<T>(Length);
  const std::vector<T>& a = Data.first;
  const std::vector<T>& b = Data.second;
  assert(a.size() == Length && b.size() == a.size());

  benchmark::ClobberMemory();
  benchmark::DoNotOptimize(a);
  benchmark::DoNotOptimize(a.data());
  benchmark::DoNotOptimize(b);
  benchmark::DoNotOptimize(b.data());

  for (auto _ : state) {
    const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
    benchmark::DoNotOptimize(is_equal);
  }
  state.SetComplexityN(Length);
  state.counters["eltcnt"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
  state.counters["eltcnt/sec"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
  const size_t BytesRead = 2 * sizeof(T) * Length;
  state.counters["bytes_read/iteration"] =
      benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                         benchmark::Counter::OneK::kIs1024);
  state.counters["bytes_read/sec"] = benchmark::Counter(
      BytesRead, benchmark::Counter::kIsIterationInvariantRate,
      benchmark::Counter::OneK::kIs1024);
}

template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
  const size_t L2SizeBytes = []() {
    for (const benchmark::CPUInfo::CacheInfo& I :
         benchmark::CPUInfo::Get().caches) {
      if (I.level == 2) return I.size;
    }
    return 0;
  }();
  // What is the largest range we can check to always fit within given L2 cache?
  const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                        /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
  b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
    ->Apply(CustomArguments<uint64_t>);

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
    ->Apply(CustomArguments<uint64_t>);
```
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000           432131 ns       432101 ns         1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...>
BM_bcmp<uint16_t, Identical>/256000          161408 ns       161409 ns         4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...>
BM_bcmp<uint32_t, Identical>/128000           81497 ns        81488 ns         8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...>
BM_bcmp<uint64_t, Identical>/64000            50138 ns        50138 ns        10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns       192392 ns         3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns       127860 ns         5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns        49140 ns        14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns        32099 ns        21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000            18593 ns        18590 ns        37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...>
BM_bcmp<uint16_t, Identical>/256000           18950 ns        18948 ns        37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...>
BM_bcmp<uint32_t, Identical>/128000           18627 ns        18627 ns        37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...>
BM_bcmp<uint64_t, Identical>/64000            18855 ns        18855 ns        37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns         9569 ns        73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns         9547 ns        74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns         9394 ns        73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns         9498 ns        73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000                      -0.9570         -0.9570        432131         18593        432101         18590
<...>
BM_bcmp<uint16_t, Identical>/256000                     -0.8826         -0.8826        161408         18950        161409         18948
<...>
BM_bcmp<uint32_t, Identical>/128000                     -0.7714         -0.7714         81497         18627         81488         18627
<...>
BM_bcmp<uint64_t, Identical>/64000                      -0.6239         -0.6239         50138         18855         50138         18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503         -0.9503        192405          9570        192392          9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253         -0.9253        127858          9547        127860          9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088         -0.8088         49140          9396         49140          9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041         -0.7041         32101          9499         32099          9498
```

What can we tell from the benchmark?
* Performance of naive equality check somewhat improves with element size,
  maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
  for uint64_t. I think, that instability implies performance problems.
* Performance of `memcmp()`-aware benchmark always maxes out at around
  bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the
  naive variant!
* eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
  eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and
  linearly decreases with element size.
  For uint64_t, it's ~4x+ the elements/second.
* The call obvious is more pricey than the loop, with small element count.
  As it can be seen from the full output {F8768210}, the `memcmp()` is almost
  universally worse, independent of the element size (and thus buffer size) when
  element count is less than 8.

So all in all, bcmp idiom does indeed pose untapped performance headroom.
This diff does implement said idiom recognition. I think a reasonable test
coverage is present, but do tell if there is anything obvious missing.

Now, quality. This does succeed to build and pass the test-suite, at least
without any non-bundled elements. {F8768216} {F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp

Program                                         result-new

MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
```
The size changes are:
I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look.
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text

Program                                        result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test   753.00     833.00     10.6%
test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00  -3.5%
test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00   -0.1%
test-suite...plications/d/make_dparser.test   89585.00   89505.00   -0.1%
test-suite...ce/Applications/Burg/burg.test   40817.00   40785.00   -0.1%
test-suite.../Applications/lemon/lemon.test   47281.00   47249.00   -0.1%
test-suite...TimberWolfMC/timberwolfmc.test   250065.00  250113.00   0.0%
test-suite...chmarks/MallocBench/gs/gs.test   149889.00  149873.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00   0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN        nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN        nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN        nan%
Geomean difference                                                   nan%
         result-old    result-new       diff
count  1.000000e+01  10.00000      10.000000
mean   3.152090e+05  311695.40000  0.006749
std    3.790398e+05  372091.42232  0.036605
min    7.530000e+02  833.00000    -0.034981
25%    4.243300e+04  42401.00000  -0.000866
50%    1.197370e+05  119689.00000 -0.000392
75%    6.397050e+05  639705.00000 -0.000005
max    1.001697e+06  966657.00000  0.106242
```

I don't have timings though.

And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.

Also, there is a few TODO's that i have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???

Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet

Reviewed By: courbet

Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61144

llvm-svn: 374662
2019-10-12 15:35:32 +00:00
Oliver Stannard
901c588c1f Dead Virtual Function Elimination
Currently, it is hard for the compiler to remove unused C++ virtual
functions, because they are all referenced from vtables, which are referenced
by constructors. This means that if the constructor is called from any live
code, then we keep every virtual function in the final link, even if there
are no call sites which can use it.

This patch allows unused virtual functions to be removed during LTO (and
regular compilation in limited circumstances) by using type metadata to match
virtual function call sites to the vtable slots they might load from. This
information can then be used in the global dead code elimination pass instead
of the references from vtables to virtual functions, to more accurately
determine which functions are reachable.

To make this transformation safe, I have changed clang's code-generation to
always load virtual function pointers using the llvm.type.checked.load
intrinsic, instead of regular load instructions. I originally tried writing
this using clang's existing code-generation, which uses the llvm.type.test
and llvm.assume intrinsics after doing a normal load. However, it is possible
for optimisations to obscure the relationship between the GEP, load and
llvm.type.test, causing GlobalDCE to fail to find virtual function call
sites.

The existing linkage and visibility types don't accurately describe the scope
in which a virtual call could be made which uses a given vtable. This is
wider than the visibility of the type itself, because a virtual function call
could be made using a more-visible base class. I've added a new
!vcall_visibility metadata type to represent this, described in
TypeMetadata.rst. The internalization pass and libLTO have been updated to
change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call
to llvm.type.checked.load in the linkage unit. It might be possible to
extend this optimisation to be able to use the ThinLTO summary, as was done
for devirtualization, but until then that combination is rejected in the
clang driver.

To test this, I've written a fuzzer which generates random C++ programs with
complex class inheritance graphs, and virtual functions called through object
and function pointers of different types. The programs are spread across
multiple translation units and DSOs to test the different visibility
restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't
ideal, because only classes in anonymous namespaces can be optimised with
-fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not
work correctly with -fvisibility=hidden. However, there are only 12 test
failures when building with -fvisibility=hidden (and an unmodified compiler),
and this change does not cause any new failures for either value of
-fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size
reduction of ~6%, over a baseline compiled with "-O2 -flto
-fvisibility=hidden -fwhole-program-vtables". The best cases are reductions
of ~14% in 450.soplex and 483.xalancbmk, and there are no code size
increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which
show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow
it to awlays be enabled (when using -fwhole-program-vtables). However, the
changes in clang to use the llvm.type.checked.load intrinsic are causing ~1%
performance regression in the C++ parts of SPEC2006. It should be possible to
recover some of this perf loss by teaching optimisations about the
llvm.type.checked.load intrinsic, which would make it worth turning this on
by default (though it's still dependent on -fwhole-program-vtables).

Differential revision: https://reviews.llvm.org/D63932

llvm-svn: 374539
2019-10-11 11:59:55 +00:00
Kai Nacke
f65b9d0379 [FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by Posix).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374538
2019-10-11 11:59:14 +00:00
Tom Stellard
9ce88f6e69 docs/DeveloperPolicy: Add instructions for requesting GitHub commit access
Subscribers: mehdi_amini, jtony, xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66840

llvm-svn: 374474
2019-10-10 23:36:06 +00:00
Julian Lettner
f3c91b5401 [lit] Bring back --threads option alias
Bring back `--threads` option which was lost in the move of the
command line argument parsing code to cl_arguments.py.  Update docs
since `--workers` is preferred.

llvm-svn: 374432
2019-10-10 19:43:57 +00:00
Jinsong Ji
afbbfe72dd [PowerPC][docs] Update IBM official docs in Compiler Writers Info page
Summary:
Just realized that most of the links in this page are deprecated.
So update some important reference here:
* adding PowerISA 3.0B/2.7B
* adding P8/P9 User Manual
* ELFv2 ABI and errata

Move deprecated ones into "Other documents..".

Reviewers: #powerpc, hfinkel, nemanjai

Reviewed By: hfinkel

Subscribers: shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68817

llvm-svn: 374428
2019-10-10 19:25:30 +00:00
Roman Lebedev
f72d6fc559 [MCA] Show aggregate over Average Wait times for the whole snippet (PR43219)
Summary:
As disscused in https://bugs.llvm.org/show_bug.cgi?id=43219,
i believe it may be somewhat useful to show //some// aggregates
over all the sea of statistics provided.

Example:
```
Average Wait times (based on the timeline view):
[0]: Executions
[1]: Average time spent waiting in a scheduler's queue
[2]: Average time spent waiting in a scheduler's queue while ready
[3]: Average time elapsed from WB until retire stage

      [0]    [1]    [2]    [3]
0.     3     1.0    1.0    4.7       vmulps     %xmm0, %xmm1, %xmm2
1.     3     2.7    0.0    2.3       vhaddps    %xmm2, %xmm2, %xmm3
2.     3     6.0    0.0    0.0       vhaddps    %xmm3, %xmm3, %xmm4
       3     3.2    0.3    2.3       <total>
```
I.e. we average the averages.

Reviewers: andreadb, mattd, RKSimon

Reviewed By: andreadb

Subscribers: gbedwell, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68714

llvm-svn: 374361
2019-10-10 14:46:21 +00:00
Dmitri Gribenko
0c3e83e627 Revert "[FileCheck] Implement --ignore-case option."
This reverts commit r374339. It broke tests:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19066

llvm-svn: 374359
2019-10-10 14:27:14 +00:00
Kai Nacke
28535e7629 [FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by Posix).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374339
2019-10-10 13:15:41 +00:00
Roman Lebedev
b32eb9dfb8 [UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4     When an expression J that has integral type is added to or subtracted
      from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
      the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
      elements ([dcl.array]), the expressions P + J and J + P
      (where J has the value j) point to the (possibly-hypothetical) array
      element i+j of x if 0≤i+j≤n and the expression P - J points to the
      (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```

Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)

To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".

Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566

Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.

`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown

Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers

Reviewed By: rsmith

Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D67122

llvm-svn: 374293
2019-10-10 09:25:02 +00:00
DeForest Richards
0bfa80a30c [Docs] Adds section for Additional Topics on Reference page
Adds a new section for Additional Topics on the Reference documentation page. Also moves Support Library topic to User Guides page.

llvm-svn: 374230
2019-10-09 21:09:09 +00:00
DeForest Richards
f358a34083 [Docs] Adds Documentation links to sidebar
Adds links to Getting Started/Tutorials, User Guides, and Reference documentation pages to sidebar. Also adds a new section for LLVM IR on the Reference documentation page.

llvm-svn: 374214
2019-10-09 20:26:13 +00:00
DeForest Richards
fef32a3f70 [Docs] Fixes broken sphinx build - undefined label
Removes label ref pointing to non-existent subsystem docs page.

llvm-svn: 374128
2019-10-08 22:45:20 +00:00
Clement Courbet
e0fc2857f3 [llvm-exegesis] Add options to SnippetGenerator.
Summary:
This adds a `-max-configs-per-opcode` option to limit the number of
configs per opcode.

Reviewers: gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68642

llvm-svn: 374054
2019-10-08 14:30:24 +00:00
Kevin P. Neal
75d58de3bb Nope, I'm wrong. It looks like someone else removed these on purpose and
it just happened to break the bot right when I did my push. So I'm undoing
this mornings incorrect push. I've also kicked off an email to hopefully
get the bot fixed the correct way.

llvm-svn: 374049
2019-10-08 14:10:26 +00:00
Kevin P. Neal
fcf18f9190 Restore documentation that 'svn update' unexpectedly yanked out from under me.
llvm-svn: 374045
2019-10-08 13:38:42 +00:00
Joerg Sonnenberger
c8a30b2636 Fix the spelling of my name.
llvm-svn: 373980
2019-10-07 22:55:42 +00:00
Reid Kleckner
a973c0bd85 [X86] Add new calling convention that guarantees tail call optimization
When the target option GuaranteedTailCallOpt is specified, calls with
the fastcc calling convention will be transformed into tail calls if
they are in tail position. This diff adds a new calling convention,
tailcc, currently supported only on X86, which behaves the same way as
fastcc, except that the GuaranteedTailCallOpt flag does not need to
enabled in order to enable tail call optimization.

Patch by Dwight Guth <dwight.guth@runtimeverification.com>!

Reviewed By: lebedev.ri, paquette, rnk

Differential Revision: https://reviews.llvm.org/D67855

llvm-svn: 373976
2019-10-07 22:28:58 +00:00
Kevin P. Neal
64943d3bc3 Fix another sphinx warning.
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373909
2019-10-07 14:14:46 +00:00
Kevin P. Neal
edcebb4ab4 Fix sphinx warnings.
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373902
2019-10-07 13:39:56 +00:00
Kevin P. Neal
02f26957db [FPEnv] Add constrained intrinsics for lrint and lround
Earlier in the year intrinsics for lrint, llrint, lround and llround were
added to llvm. The constrained versions are now implemented here.

Reviewed by:	andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by:	craig.topper
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373900
2019-10-07 13:20:00 +00:00
Djordje Todorovic
1b3358939a [llvm-locstats] Fix a typo in the documentation; NFC
llvm-svn: 373880
2019-10-07 07:31:49 +00:00
DeForest Richards
46281ba574 [Docs] Removes Subsystem Documentation page
Removes Subsystem Documentation page. Also moves existing topics on Subsystem Documentation page to User Guides and Reference pages.

llvm-svn: 373872
2019-10-06 22:49:22 +00:00
DeForest Richards
8f532fcce0 [Docs] Removes Programming Documentation page
Removes Programming Documentation page. Also moves existing topics on Programming Documentation page to User Guides and Reference pages. 

llvm-svn: 373856
2019-10-06 16:10:11 +00:00
DeForest Richards
56ebd9c381 [Docs] Adds new Getting Started/Tutorials page
Adds a new page for Getting Started/Tutorials topics. Also updates existing topic categories on the User Guides and Reference pages.

llvm-svn: 373854
2019-10-06 15:36:37 +00:00
Sylvestre Ledru
1b710e7f71 Update the FAQ: remove stuff related to the previous license +
update info about the portability of LLVM.

llvm-svn: 373576
2019-10-03 09:43:54 +00:00
Fangrui Song
85bbe2974b [llvm-objcopy] Add --set-section-alignment
Fixes PR43181. This option was recently added to GNU objcopy (binutils
PR24942).

`llvm-objcopy -I binary -O elf64-x86-64 --set-section-alignment .data=8` can set the alignment of .data.

Reviewed By: grimar, jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D67656

llvm-svn: 373461
2019-10-02 12:41:25 +00:00
Djordje Todorovic
20b221afa5 Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

The cause of the test failure was resolved.

llvm-svn: 373427
2019-10-02 07:00:01 +00:00
Vedant Kumar
60bb1d584e [ReleaseProcess] Document requirement to set MACOSX_DEPLOYMENT_TARGET
llvm-svn: 373356
2019-10-01 17:10:45 +00:00
Djordje Todorovic
3ef644a00f Revert "Reland "[utils] Implement the llvm-locstats tool""
This reverts commit rL373317 due to test failure on the
clang-s390x-linux build bot.

llvm-svn: 373336
2019-10-01 13:21:15 +00:00
Djordje Todorovic
cecb4ad7fc Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 373317
2019-10-01 09:59:15 +00:00
Fangrui Song
9112a48c7b [llvm-readobj/llvm-readelf] Delete --arm-attributes (alias for --arch-specific)
D68110 added --arch-specific (supported by GNU readelf) and made
--arm-attributes an alias for it. The tests were later migrated to use
--arch-specific.

Note, llvm-readelf --arch-specific currently just uses llvm-readobj
style output for ARM attributes. The readelf-style output is not
implemented.

Reviewed By: compnerd, kongyi, rupprecht

Differential Revision: https://reviews.llvm.org/D68196

llvm-svn: 373291
2019-10-01 01:31:15 +00:00
Pablo Barrio
a86f661bc5 Fix doc for t inline asm constraints for ARM/Thumb
Summary: The constraint goes up to regs d15 and q7, not d16 and q8.

Subscribers: kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68090

llvm-svn: 373228
2019-09-30 16:55:10 +00:00
Kevin P. Neal
d4f7c8e89a Fix breakage of sphinx builders. Sorry for leaving this broken over the
weekend!

llvm-svn: 373215
2019-09-30 14:51:59 +00:00
Djordje Todorovic
daa23ce8ba Revert "Reland "[utils] Implement the llvm-locstats tool""
This reverts commit rL373183.

llvm-svn: 373200
2019-09-30 11:19:11 +00:00
Djordje Todorovic
faafb41cf2 Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 373183
2019-09-30 07:35:17 +00:00
DeForest Richards
1d0a9f010f [Docs] Moves article links to new pages
Moves existing article links on the Programming, Subsystem, and Reference documentation pages to new locations. Also moves Github Repository and Publications links to the sidebar.

llvm-svn: 373169
2019-09-29 15:31:52 +00:00
DeForest Richards
6e8bb199f9 [Docs] Adds sections for Command Line and LibFuzzer articles
Adds sections for Command Line and Libfuzzer articles on Programming Documentation page.

llvm-svn: 373158
2019-09-29 02:16:38 +00:00
DeForest Richards
e4f8869b72 [Docs] Adds new section to User Guides page
Adds a section to the User Guides page for articles related to building, packaging, and distributing LLVM. Includes sub-sections for CMake, Clang, and Docker.

llvm-svn: 373113
2019-09-27 19:12:00 +00:00
Kevin P. Neal
a4dc431d14 Document requirement of function attributes with constrained floating
point.

Reviewed by:    andrew.w.kaylor, uweigand, efriedma
Approved by:    andrew.w.kaylor
Differential Revision:  https://reviews.llvm.org/D67839

llvm-svn: 373002
2019-09-26 17:50:25 +00:00
Nick Desaulniers
4e5aa08380 [Verifier] add invariant check for callbr
Summary:
The list of indirect labels should ALWAYS have their blockaddresses as
argument operands to the callbr (but not necessarily the other way
around).  Add an invariant that checks this.

The verifier catches a bad test case that was added recently in r368478.
I think that was a simple mistake, and the test was made less strict in
regards to the precise addresses (as those weren't specifically the
point of the test).

This invariant will be used to find a reported bug.

Link: https://www.spinics.net/lists/arm-kernel/msg753473.html
Link: https://github.com/ClangBuiltLinux/linux/issues/649

Reviewers: craig.topper, void, chandlerc

Reviewed By: void

Subscribers: ychen, lebedev.ri, javed.absar, kristof.beyls, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67196

llvm-svn: 372923
2019-09-25 22:28:27 +00:00
Florian Hahn
8ea72d111c [LangRef] Clarify absence of rounding guarantees for fmuladd.
During the review of D67434, it was recommended to make fmuladd's
behavior more explicit. D67434 depends on this interpretation.

Reviewers: efriedma, jfb, reames, scanon, lebedev.ri, spatel

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D67552

llvm-svn: 372892
2019-09-25 16:09:24 +00:00
Sanjay Patel
a3e1aa37c2 [IR] allow fast-math-flags on phi of FP values (2nd try)
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917 <https://reviews.llvm.org/D61917>

As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086

The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535

But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.

The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.

Differential Revision: https://reviews.llvm.org/D67564

llvm-svn: 372878
2019-09-25 14:35:02 +00:00
Sanjay Patel
f51a1549df Revert [IR] allow fast-math-flags on phi of FP values
This reverts r372866 (git commit dec03223a97af0e4dfcb23da55c0f7f8c9b62d00)

llvm-svn: 372868
2019-09-25 13:29:09 +00:00
Sanjay Patel
06a326f643 [IR] allow fast-math-flags on phi of FP values
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917

As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086

The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535

But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.

The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.

Differential Revision: https://reviews.llvm.org/D67564

llvm-svn: 372866
2019-09-25 13:14:12 +00:00
James Henderson
96774e78a8 [docs][llvm-strings] Clarify "printable character" wording
The --bytes option uses the phrase "printable ASCII characters", but the
description section used simply "printable characters". To avoid any
confusion about locale impacts etc, this change adopts the former's
phrasing in both places. It also fixes a minor grammar issue in the
description.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D68016

llvm-svn: 372865
2019-09-25 13:09:17 +00:00
James Henderson
d36d8656d9 [docs][llvm-strip] Update llvm-strip doc to better match llvm-objcopy's
Main changes are mostly wording of some options, but this change also
fixes a switch reference so that a link is created and moves
--strip-sections into the ELF-specific area since it is only supported
for ELF currently.

llvm-svn: 372864
2019-09-25 13:09:12 +00:00
Dmitry Preobrazhensky
ebbe05934d [AMDGPU][MC][DOC] Updated AMD GPU assembler description.
Summary of changes:
- Updated to reflect recent changes in assembler;
- Minor bugfixing and improvements.

llvm-svn: 372857
2019-09-25 12:38:35 +00:00
DeForest Richards
e13db8befb [Docs] Moves Reference docs to new page
Moves Reference docs to new page. Also adds a table of contents to Getting Involved page.

llvm-svn: 372796
2019-09-25 00:49:02 +00:00
James Henderson
2e737a2046 [docs][llvm-strip][llvm-objcopy] Improve wording and fix highlighting
llvm-svn: 372754
2019-09-24 13:41:39 +00:00
James Henderson
a4102e6e8f [docs][llvm-size] Fix typo
llvm-svn: 372750
2019-09-24 13:14:22 +00:00
Djordje Todorovic
c4d5e78650 Revert "Reland "[utils] Implement the llvm-locstats tool""
This reverts commit rL372554.

llvm-svn: 372580
2019-09-23 11:04:11 +00:00
Djordje Todorovic
1249d7b040 Reland "[utils] Implement the llvm-locstats tool"
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 372554
2019-09-23 07:57:53 +00:00
DeForest Richards
2bffc761b0 [Docs] Updates sidebar links
Adds sidebar links to mailing lists, IRC, and meetups and social events.

llvm-svn: 372488
2019-09-21 21:05:20 +00:00
DeForest Richards
636d7ad372 [Docs] Adds new page for Getting Involved articles
Adds a new page for existing Getting Involved, Development Process, and Community Proposals articles. Also moves Mailing Lists, Meetups and social events, and IRC sections.

llvm-svn: 372487
2019-09-21 20:56:40 +00:00
DeForest Richards
407a0cb703 [Docs] Bug fix for document not included in toctree
Fixes 'document not included in toctree' bug for FAQ and Lexicon topics.

llvm-svn: 372470
2019-09-21 14:29:19 +00:00
DeForest Richards
9d5f019089 [Docs] Updates sidebar links
Adds additional links to sidebar. Also removes Glossary and FAQ from LLVM Design & Overview section. (These links now reside on the sidebar.)

llvm-svn: 372469
2019-09-21 14:17:09 +00:00
DeForest Richards
1f41a90eb5 [Docs] Add a custom sidebar to doc pages
Adds a custom sidebar to LLVM docs. Sidebar includes links to How to submit a bug and FAQ topics, as well as a Show Source link and search box.

llvm-svn: 372432
2019-09-20 22:16:39 +00:00
DeForest Richards
4d6ef7156b [Docs] Move topics to new categories
This commit moves several topics to new categories. 

llvm-svn: 372428
2019-09-20 20:51:33 +00:00
Matt Morehouse
e4d9769ce4 [docs] Update structure-aware-fuzzing link.
The document has been moved to the google/fuzzing GitHub repo.

llvm-svn: 372423
2019-09-20 19:39:50 +00:00
Francesco Petrogalli
94323a35ed [docs] Remove training whitespaces. NFC
Subscribers: jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67835

llvm-svn: 372399
2019-09-20 15:02:32 +00:00
David Tellenbach
29b3e7a7d0 [NFC] Test commit, deleting some whitespace
llvm-svn: 372379
2019-09-20 09:43:31 +00:00
Francesco Petrogalli
e0aee60bcd [docs] Break long (>80) line. NFC
llvm-svn: 372326
2019-09-19 14:19:32 +00:00
DeForest Richards
53a0506f24 [Docs] Moves topics to new categories
This commit moves several topics to new categories. It also removes a few duplicate links in Subsystem Documentation.

llvm-svn: 372274
2019-09-18 23:04:31 +00:00
Bardia Mahjour
12d9e4e269 Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372238
2019-09-18 17:43:45 +00:00
Jinsong Ji
a033158ef6 Reland "[docs][Bugpoint]Add notes about multiple crashes"
Fix the warning.
Bugpoint.rst:124:Mismatch: both interpreted text role prefix and
reference suffix.

Note that the line no here is wrong and misleading,
the problem is in line 128, not 124.

llvm-svn: 372181
2019-09-17 21:09:41 +00:00
Bardia Mahjour
2d2a856f02 Revert "Data Dependence Graph Basics"
This reverts commit c98ec60993a7aa65073692b62f6d728b36e68ccd, which broke the sphinx-docs build.

llvm-svn: 372168
2019-09-17 19:22:01 +00:00
Bardia Mahjour
3f478f329a Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372162
2019-09-17 18:55:44 +00:00
Jinsong Ji
13a8a7aaf6 [docs][Bugpoint] Revert 5584ead50 a5aa3353
No sure why there are still warnings, revert while I investigate.

llvm-svn: 372161
2019-09-17 18:39:04 +00:00
Jinsong Ji
96ff72efbe [docs][Bugpoint] Fix build break.
Bugpoint.rst:124: WARNING: Mismatch: both interpreted text role prefix
and reference suffix.

llvm-svn: 372160
2019-09-17 18:23:06 +00:00
Jinsong Ji
f3b3429a72 [docs][Bugpoint]Add notes about multiple crashes
Summary:
    When reducing case for a CodeGenCrash, bugpoint may generate a new
    reduced
    testcase that exposes/causes another crash or break something due to
    limitation.

    Bugpoint does not distiguish different crashes currently,
    so when this happens, bugpoint will go on reducing for the new crash,
    or just abort, we can't get the case reduced for the origial crash.

    An advice is added into usage doc to connect to recommend checking error
    message with scripts and `-compile-command`.

Reviewers: modocache, bogner, sebpop, reames, vsk, MatzeB

Reviewed By: vsk

Subscribers: mehdi_amini, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66832

llvm-svn: 372157
2019-09-17 18:10:09 +00:00
James Henderson
cb51100751 [docs] Make --version text more correct
Follow-up to r371983. Referring to "this program" in the description of
the --version option in the documentation isn't exactly correct, because
the docs are not part of the program, and so "this program" doesn't
really refer to anything. This patch brings the other users of this
terminology into line with the new updates to llvm-size and
llvm-strings.

Reviewed by: alexshap, MaskRay

Differential Revision: https://reviews.llvm.org/D67618

llvm-svn: 372107
2019-09-17 11:43:42 +00:00
DeForest Richards
c7e199a8c6 [Docs] Bug fix for docs homepage
Removes reference to non-existent Reference Documentation page.

llvm-svn: 372032
2019-09-16 20:29:56 +00:00
DeForest Richards
6aac682bcb [Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage
Adds a section for Getting Started/Tutorials and Reference topics to the LLVM docs homepage.

llvm-svn: 372031
2019-09-16 20:19:32 +00:00
James Henderson
cc60d60782 [docs][llvm-strings] Write llvm-strings documentation
Previously we only had a stub document.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67554

llvm-svn: 371984
2019-09-16 13:56:12 +00:00
James Henderson
26293dc986 [docs][llvm-size] Write llvm-size documentation
Previously we only had a stub document.

Reviewed by: serge-sans-paille, MaskRay

Differential Revision: https://reviews.llvm.org/D67555

llvm-svn: 371983
2019-09-16 13:20:37 +00:00
Kerry McLaughlin
ddebfac518 [SVE][Inline-Asm] Add constraints for SVE predicate registers
Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

llvm-svn: 371967
2019-09-16 09:45:27 +00:00
Fangrui Song
83bc3d5d9f [llvm-objcopy] Ignore -B --binary-architecture=
GNU objcopy documents that -B is only useful with architecture-less
input (i.e. "binary" or "ihex"). After D67144, -O defaults to -I, and
-B is essentially a NOP.

* If -O is binary/ihex, GNU objcopy ignores -B.
* If -O is elf*, -B provides the e_machine field in GNU objcopy.

So to convert a blob to an ELF, `-I binary -B i386:x86-64 -O elf64-x86-64` has to be specified.

`-I binary -B i386:x86-64 -O elf64-x86-64` creates an ELF with its
e_machine field set to EM_NONE in GNU objcopy, but a regular x86_64 ELF
in elftoolchain elfcopy. Follow the elftoolchain approach (ignoring -B)
to simplify code. Users that expect their command line portable should
specify -B.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D67215

llvm-svn: 371914
2019-09-14 01:36:31 +00:00
Michael Pozulp
23932e2e9b [llvm-objcopy] Add support for response files in llvm-strip and llvm-objcopy
Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=42671

Reviewers: jhenderson, espindola, alexshap, rupprecht

Reviewed By: jhenderson

Subscribers: seiya, emaste, arichardson, jakehehrlich, MaskRay, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65372

llvm-svn: 371911
2019-09-14 01:14:43 +00:00
DeForest Richards
bdeb54a918 [Docs] Bug fix for reference to nonexistent document
This commit fixes a bug in which the toctree contained a reference to a non-existent document.

llvm-svn: 371889
2019-09-13 20:05:57 +00:00
Kevin P. Neal
c5f29e2f14 [FPEnv] Document that constrained FP intrinsics cannot be mixed with non-constrained
Reviewed by:	andrew.w.kaylor, cameron.mcinally, uweigand
Approved by:	andrew.w.kaylor
Differential Revision:	https://reviews.llvm.org/D67360

llvm-svn: 371888
2019-09-13 19:36:19 +00:00
James Henderson
1a909630d5 [docs][llvm-readelf][llvm-readobj] Improve --stack-sizes documentation
llvm-readobj's document was missing --stack-sizes entirely from its
document, so this patch adds it. It also adds a note to the llvm-readelf
description that the switch is only implemented for GNU style output
currently. For reference, --stack-sizes was added in r367942.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67548

llvm-svn: 371862
2019-09-13 15:01:39 +00:00
Nico Weber
fe26bc2586 Fix a few spellos in docs.
(Trying to debug an incremental build thing on a bot...)

llvm-svn: 371860
2019-09-13 14:58:24 +00:00
James Henderson
11f18d5eba [docs][llvm-objcopy][llvm-strip] Improve --strip-unneeded description
Behaviour was recently added to this switch to strip debug sections too.
See r369761.

This change also makes the description for the --strip-unneeded switch
consistent between the two docs.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67546

llvm-svn: 371855
2019-09-13 13:26:52 +00:00
DeForest Richards
ebd06b083a [Docs] Adds page for reference docs
Adds a Reference Documentation page for LLVM and API reference documentation.

llvm-svn: 371782
2019-09-12 22:17:04 +00:00
James Henderson
65ef8f90c0 [docs][llvm-strip] Remove unnecessary whitespace for consistency
llvm-svn: 371739
2019-09-12 14:24:04 +00:00
Craig Topper
f9ccef34dc [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.
AVX512 instructions can cause a frequency drop on these CPUs. This
can negate the performance gains from using wider vectors. Enabling
prefer-vector-width=256 will prevent generation of zmm registers
unless explicit 512 bit operations are used in the original source
code.

I believe gcc and icc both do something similar to this by default.

Differential Revision: https://reviews.llvm.org/D67259

llvm-svn: 371694
2019-09-11 23:54:36 +00:00
Adrian Prantl
44c0d31c5b Update link to the DWARF spec.
llvm-svn: 371650
2019-09-11 19:57:29 +00:00
Adrian Prantl
d31cf64019 Update documentation.
llvm-svn: 371648
2019-09-11 19:49:38 +00:00
Sanjay Patel
bd7acc7bbc [LangRef] add link for fma intrinsic
llvm-svn: 371615
2019-09-11 13:25:32 +00:00
Sanjay Patel
ef07ddeebd [LangRef] fix punctuation; NFC
llvm-svn: 371612
2019-09-11 12:22:24 +00:00
Alina Sbirlea
50d7aa39ab Update ReleaseNotes: add enabling of MemorySSA.
llvm-svn: 371569
2019-09-10 23:22:37 +00:00
Djordje Todorovic
0dec37607c Revert "[utils] Implement the llvm-locstats tool"
This reverts commit rL371520.

llvm-svn: 371527
2019-09-10 14:48:52 +00:00
Djordje Todorovic
64810143df [utils] Implement the llvm-locstats tool
The tool reports verbose output for the DWARF debug location coverage.
The llvm-locstats for each variable or formal parameter DIE computes what
percentage from the code section bytes, where it is in scope, it has
location description. The line 0 shows the number (and the percentage) of
DIEs with no location information, but the line 100 shows the number (and
the percentage) of DIEs where there is location information in all code
section bytes (where the variable or parameter is in the scope). The line
50..59 shows the number (and the percentage) of DIEs where the location
information is in between 50 and 59 percentage of its scope covered.

The tool will be very useful for tracking improvements regarding the
"debugging optimized code" support with LLVM ecosystem.

Differential Revision: https://reviews.llvm.org/D66526

llvm-svn: 371520
2019-09-10 13:47:03 +00:00
Evgeniy Stepanov
cb123715ba LangRef: mention MSan's problem with speculative conditional branches.
Summary:
This short blurb aims to disallow optimizations like we had to revert
(under MSan) in
  https://reviews.llvm.org/D21165
  https://bugs.llvm.org/show_bug.cgi?id=28054
  https://reviews.llvm.org/D67205

Reviewers: vitalybuka, efriedma

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67244

llvm-svn: 371461
2019-09-09 22:24:57 +00:00
Craig Topper
d97bf0b804 [SelectionDAG] Remove ISD::FP_ROUND_INREG
I don't think anything in tree creates this node. So all of this
code appears to be dead.

Code coverage agrees
http://lab.llvm.org:8080/coverage/coverage-reports/llvm/coverage/Users/buildslave/jenkins/workspace/clang-stage2-coverage-R/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html

Differential Revision: https://reviews.llvm.org/D67312

llvm-svn: 371431
2019-09-09 17:54:44 +00:00
Bjorn Pettersson
f8a47e6d1e [Intrinsic] Add the llvm.umul.fix.sat intrinsic
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.

Patch by: leonardchan, bjope

Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel

Reviewed By: leonardchan

Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57836

llvm-svn: 371308
2019-09-07 12:16:14 +00:00
DeForest Richards
70724cacf6 Docs: Update Community section on homepage
This commit includes the following changes: Adds a Getting Involved section under Community. Moves the Development Process section under Community. Moves Sphinx Quickstart Template and How to submit an LLVM bug report from User Guides section to Getting Involved.

llvm-svn: 371127
2019-09-05 21:24:47 +00:00
Sylvestre Ledru
b0c1aff472 doc update: explain that Z3 is only for clang SA - thanks to LebedevRI for the suggestion
llvm-svn: 371110
2019-09-05 19:50:56 +00:00
Sylvestre Ledru
e6837b1644 document the LLVM_ENABLE_Z3_SOLVER option
llvm-svn: 371109
2019-09-05 19:38:15 +00:00
DeForest Richards
11709c89b7 Docs: Move Documentation sections to separate pages.
Updates the links on the homepage by moving the User Guides, Programming Documentation, and Subsystem Documentation sections to separate pages. Also changes "Overview" to "About" at the top of the LLVM Docs homepage. This work is part of the Google Season of Docs project.

llvm-svn: 371096
2019-09-05 17:30:52 +00:00
Guillaume Chatelet
5ec84d66c8 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
 - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Lang Hames
28ccbabcfd [docs] Add some comments to the inline LLJIT example.
llvm-svn: 370950
2019-09-04 18:38:26 +00:00
Vedant Kumar
6f6a5a74b9 [llvm-profdata] Add mode to recover from profile read failures
Add a mode in which profile read errors are not immediately treated as
fatal. In this mode, merging makes forward progress and reports failure
only if no inputs can be read.

Differential Revision: https://reviews.llvm.org/D66985

llvm-svn: 370827
2019-09-03 22:23:16 +00:00
Roman Lebedev
6ffec5c24c Revert r370454 "[LoopIdiomRecognize] BCmp loop idiom recognition"
https://bugs.llvm.org/show_bug.cgi?id=43206 was filed,
claiming that there is a miscompilation.
Reverting until i investigate.

This reverts commit r370454

llvm-svn: 370788
2019-09-03 17:14:56 +00:00
Kerry McLaughlin
acca14efdf [SVE][Inline-Asm] Support for SVE asm operands
Summary:
Adds the following inline asm constraints for SVE:
  - w: SVE vector register with full range, Z0 to Z31
  - x: Restricted to registers Z0 to Z15 inclusive.
  - y: Restricted to registers Z0 to Z7 inclusive.

This change also adds the "z" modifier to interpret a register as an SVE register.

Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness.

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, rengolin, cameron.mcinally, greened

Reviewed By: sdesmalen

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66302

llvm-svn: 370673
2019-09-02 16:12:31 +00:00
Thomas Preud'homme
393fb2cc2e [FileCheck] Forbid using var defined on same line
Summary:
Commit r366897 introduced the possibility to set a variable from an
expression, such as [[#VAR2:VAR1+3]]. While introducing this feature, it
introduced extra logic to allow using such a variable on the same line
later on. Unfortunately that extra logic is flawed as it relies on a
mapping from variable to expression defining it when the mapping is from
variable definition to expression. This flaw causes among other issues
PR42896.

This commit avoids the problem by forbidding all use of a variable
defined on the same line, and removes the now useless logic. Redesign
will be done in a later commit because it will require some amount of
refactoring first for the solution to be clean. One example is the need
for some sort of transaction mechanism to set a variable temporarily and
from an expression and rollback if the CHECK pattern does not match so
that diagnostics show the right variable values.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66141

llvm-svn: 370663
2019-09-02 14:04:00 +00:00
Bjorn Pettersson
99b90b9e5c [LangRef] Update saturating examples for llvm.smul.fix.sat. NFC
Some saturation examples for llvm.smul.fix.sat were not showing
the correct result. I've adjusted the operands to make sure that
we actually trigger overflow in those examples.

llvm-svn: 370566
2019-08-31 09:01:16 +00:00
Craig Topper
44eac5c3b3 [X86] Pass v32i16/v64i8 in zmm registers on KNL target.
gcc and icc pass these types in zmm registers in zmm registers.

This patch implements a quick hack to override the register
type before calling convention handling to one that is legal.
Longer term we might want to do something similar to 256-bit
integer registers on AVX1 where we just split all the operations.

Fixes PR42957

Differential Revision: https://reviews.llvm.org/D66708

llvm-svn: 370495
2019-08-30 17:35:08 +00:00
Chris Jackson
4d8ae1bdd7 [llvm-objcopy] Allow the visibility of symbols created by --binary and
--add-symbol to be specified with --new-symbol-visibility

llvm-svn: 370458
2019-08-30 10:17:16 +00:00
Roman Lebedev
a54fd9866f [LoopIdiomRecognize] BCmp loop idiom recognition
Summary:
@mclow.lists brought up this issue up in IRC.
It is a reasonably common problem to compare some two values for equality.
Those may be just some integers, strings or arrays of integers.

In C, there is `memcmp()`, `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.

libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ

libc++ does not do anything like that, it simply relies on simple C++'s
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)

So likely, there exists a certain performance opportunities.
Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that
is using `memcmp()` (in this case, compiled with modified compiler). {F8768213}

```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

#include "benchmark/benchmark.h"

template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                       std::numeric_limits<T>::max());
  std::vector<T> v;
  v.reserve(count);
  std::generate_n(std::back_inserter(v), count,
                  [&dis, &gen]() { return dis(gen); });
  assert(v.size() == count);
  return v;
}

struct Identical {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto Tmp = getVectorOfRandomNumbers<T>(count);
    return std::make_pair(Tmp, std::move(Tmp));
  }
};

struct InequalHalfway {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto V0 = getVectorOfRandomNumbers<T>(count);
    auto V1 = V0;
    V1[V1.size() / size_t(2)]++;  // just change the value.
    return std::make_pair(std::move(V0), std::move(V1));
  }
};

template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
  const size_t Length = state.range(0);

  const std::pair<std::vector<T>, std::vector<T>> Data =
      Gen::template Gen<T>(Length);
  const std::vector<T>& a = Data.first;
  const std::vector<T>& b = Data.second;
  assert(a.size() == Length && b.size() == a.size());

  benchmark::ClobberMemory();
  benchmark::DoNotOptimize(a);
  benchmark::DoNotOptimize(a.data());
  benchmark::DoNotOptimize(b);
  benchmark::DoNotOptimize(b.data());

  for (auto _ : state) {
    const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
    benchmark::DoNotOptimize(is_equal);
  }
  state.SetComplexityN(Length);
  state.counters["eltcnt"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
  state.counters["eltcnt/sec"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
  const size_t BytesRead = 2 * sizeof(T) * Length;
  state.counters["bytes_read/iteration"] =
      benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                         benchmark::Counter::OneK::kIs1024);
  state.counters["bytes_read/sec"] = benchmark::Counter(
      BytesRead, benchmark::Counter::kIsIterationInvariantRate,
      benchmark::Counter::OneK::kIs1024);
}

template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
  const size_t L2SizeBytes = []() {
    for (const benchmark::CPUInfo::CacheInfo& I :
         benchmark::CPUInfo::Get().caches) {
      if (I.level == 2) return I.size;
    }
    return 0;
  }();
  // What is the largest range we can check to always fit within given L2 cache?
  const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                        /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
  b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
    ->Apply(CustomArguments<uint64_t>);

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
    ->Apply(CustomArguments<uint64_t>);
```
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000           432131 ns       432101 ns         1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...>
BM_bcmp<uint16_t, Identical>/256000          161408 ns       161409 ns         4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...>
BM_bcmp<uint32_t, Identical>/128000           81497 ns        81488 ns         8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...>
BM_bcmp<uint64_t, Identical>/64000            50138 ns        50138 ns        10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns       192392 ns         3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns       127860 ns         5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns        49140 ns        14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns        32099 ns        21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000            18593 ns        18590 ns        37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...>
BM_bcmp<uint16_t, Identical>/256000           18950 ns        18948 ns        37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...>
BM_bcmp<uint32_t, Identical>/128000           18627 ns        18627 ns        37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...>
BM_bcmp<uint64_t, Identical>/64000            18855 ns        18855 ns        37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns         9569 ns        73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns         9547 ns        74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns         9394 ns        73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns         9498 ns        73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000                      -0.9570         -0.9570        432131         18593        432101         18590
<...>
BM_bcmp<uint16_t, Identical>/256000                     -0.8826         -0.8826        161408         18950        161409         18948
<...>
BM_bcmp<uint32_t, Identical>/128000                     -0.7714         -0.7714         81497         18627         81488         18627
<...>
BM_bcmp<uint64_t, Identical>/64000                      -0.6239         -0.6239         50138         18855         50138         18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503         -0.9503        192405          9570        192392          9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253         -0.9253        127858          9547        127860          9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088         -0.8088         49140          9396         49140          9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041         -0.7041         32101          9499         32099          9498
```

What can we tell from the benchmark?
* Performance of naive equality check somewhat improves with element size,
  maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
  for uint64_t. I think, that instability implies performance problems.
* Performance of `memcmp()`-aware benchmark always maxes out at around
  bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the
  naive variant!
* eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
  eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and
  linearly decreases with element size.
  For uint64_t, it's ~4x+ the elements/second.
* The call obvious is more pricey than the loop, with small element count.
  As it can be seen from the full output {F8768210}, the `memcmp()` is almost
  universally worse, independent of the element size (and thus buffer size) when
  element count is less than 8.

So all in all, bcmp idiom does indeed pose untapped performance headroom.
This diff does implement said idiom recognition. I think a reasonable test
coverage is present, but do tell if there is anything obvious missing.

Now, quality. This does succeed to build and pass the test-suite, at least
without any non-bundled elements. {F8768216} {F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp

Program                                         result-new

MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
```
The size changes are:
I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look.
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text

Program                                        result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test   753.00     833.00     10.6%
test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00  -3.5%
test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00   -0.1%
test-suite...plications/d/make_dparser.test   89585.00   89505.00   -0.1%
test-suite...ce/Applications/Burg/burg.test   40817.00   40785.00   -0.1%
test-suite.../Applications/lemon/lemon.test   47281.00   47249.00   -0.1%
test-suite...TimberWolfMC/timberwolfmc.test   250065.00  250113.00   0.0%
test-suite...chmarks/MallocBench/gs/gs.test   149889.00  149873.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00   0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN        nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN        nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN        nan%
Geomean difference                                                   nan%
         result-old    result-new       diff
count  1.000000e+01  10.00000      10.000000
mean   3.152090e+05  311695.40000  0.006749
std    3.790398e+05  372091.42232  0.036605
min    7.530000e+02  833.00000    -0.034981
25%    4.243300e+04  42401.00000  -0.000866
50%    1.197370e+05  119689.00000 -0.000392
75%    6.397050e+05  639705.00000 -0.000005
max    1.001697e+06  966657.00000  0.106242
```

I don't have timings though.

And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.

Also, there is a few TODO's that i have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???

Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet

Reviewed By: courbet

Subscribers: hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61144

llvm-svn: 370454
2019-08-30 09:51:23 +00:00
Craig Topper
b7262b7cbf [X86] Remove what little support we had for MPX
-Deprecate -mmpx and -mno-mpx command line options
-Remove CPUID detection of mpx for -march=native
-Remove MPX from all CPUs
-Remove MPX preprocessor define

I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything.

gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX

Differential Revision: https://reviews.llvm.org/D66669

llvm-svn: 370393
2019-08-29 18:09:02 +00:00
Craig Topper
3cd58bd208 [X86][ReleaseNotes] Add a note about the switch to widening legalization for narrow vectors.
llvm-svn: 370233
2019-08-28 17:18:56 +00:00
Kevin P. Neal
d20252fbdf [FPEnv] Add fptosi and fptoui constrained intrinsics.
This implements constrained floating point intrinsics for FP to signed and
unsigned integers.

Quoting from D32319:
The purpose of the constrained intrinsics is to force the optimizer to
respect the restrictions that will be necessary to support things like the
STDC FENV_ACCESS ON pragma without interfering with optimizations when
these restrictions are not needed.

Reviewed by:	Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
Approved by:	Craig Topper
Differential Revision:	http://reviews.llvm.org/D63782

llvm-svn: 370228
2019-08-28 16:33:36 +00:00
Shafik Yaghmour
9812acd0fd Debug Info: Support for DW_AT_export_symbols for anonymous structs
This implements the DWARF 5 feature described in:

http://dwarfstd.org/ShowIssue.php?issue=141212.1

To support recognizing anonymous structs:

  struct A {
    struct { // Anonymous struct
        int y;
    };
  }   a;

This patch adds a new (DI)flag to LLVM metadata:

ExportSymbols

Differential Revision: https://reviews.llvm.org/D66352

llvm-svn: 369781
2019-08-23 17:19:21 +00:00
Sylvestre Ledru
f93b49c8b4 Fix some regressions caused by r369553 on old versions of Debian and Ubuntu
It was causing some errors like:

Encoding error:
'ascii' codec can't decode byte 0xe2 in position 341: ordinal not in range(128)
The full traceback has been saved in /tmp/sphinx-err-y2fq4dtb.log, if you want to report the issue to the developers.

llvm-svn: 369644
2019-08-22 12:16:08 +00:00
Mitch Phillips
dd3d3666c1 [docs] Add GwpAsan to toctree.
Reverts rL369556 in the process, as it's no longer needed.

llvm-svn: 369560
2019-08-21 18:31:03 +00:00
Jordan Rupprecht
b26ade37bc [docs] Fix GwpAsan.rst
llvm-svn: 369556
2019-08-21 18:09:31 +00:00
Mitch Phillips
8659675fa1 Add newline to GWP-ASan sphinx document.
Should fix the document builder.

llvm-svn: 369554
2019-08-21 18:03:11 +00:00
Jordan Rupprecht
1a675694f0 [docs] Convert remaining command guide entries from md to rst.
Summary:
Linking between markdown and rst files is currently not supported very well, e.g. the current llvm-addr2line docs [1] link to "llvm-symbolizer" instead of "llvm-symbolizer.html". This is weirdly broken in different ways depending on which versions of sphinx and recommonmark are being used, so workaround the bug by using rst everywhere.

[1] http://llvm.org/docs/CommandGuide/llvm-addr2line.html

Reviewers: jhenderson

Reviewed By: jhenderson

Subscribers: lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66305

llvm-svn: 369553
2019-08-21 18:00:17 +00:00
Mitch Phillips
2e4634042e [GWP-ASan] Add public-facing documentation [6].
Summary:
Note: Do not submit this documentation until Scudo support is reviewed and submitted (should be #[5]).

See D60593 for further information.

This patch introduces the public-facing documentation for GWP-ASan, as well as updating the definition of one of the options, which wasn't properly merged. The document describes the design and features of GWP-ASan, as well as how to use GWP-ASan from both a user's standpoint, and development documentation for supporting allocators.

Reviewers: jfb, morehouse, vlad.tsyrklevich

Reviewed By: morehouse, vlad.tsyrklevich

Subscribers: kcc, dexonsmith, kubamracek, cryptoad, jfb, #sanitizers, llvm-commits, vlad.tsyrklevich, morehouse

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D62875

llvm-svn: 369552
2019-08-21 17:53:51 +00:00
DeForest Richards
958b9f6311 [Docs] Test commit
Fixes typo - Removes extra space between last word of sentence and period.

llvm-svn: 369216
2019-08-18 19:07:10 +00:00
Siva Chandra
533c4033bd Add LLVMLibC proposal to docs/index.rst.
Reviewers: rupprecht

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66307

llvm-svn: 369030
2019-08-15 18:08:11 +00:00
Jonas Devlieghere
2c693415b7 [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Siva Chandra
2cf884bb0b Add a proposal for a libc project under the LLVM umbrella.
Reviewers: chandlerc, dlj, echristo, hfinkel, jfb, zturner

Subscribers: dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64939

llvm-svn: 369012
2019-08-15 15:50:42 +00:00
Florian Hahn
b1f95e7f28 Add ptrmask intrinsic
This patch adds a ptrmask intrinsic which allows masking out bits of a
pointer that must be zero when accessing it, because of ABI alignment
requirements or a restriction of the meaningful bits of a pointer
through the data layout.

This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged
pointers) and allows us to not lose information about the underlying
object.

Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune

Reviewed by: sanjoy, jdoerfert

Differential Revision: https://reviews.llvm.org/D59065

llvm-svn: 368986
2019-08-15 10:12:26 +00:00
Chris Jackson
6712b2c77c [llvm-objcopy] Allow 'protected' visibility to be set when using
add-symbol

Reviewers: Maskray, rupprecht

Differential Revision: https://reviews.llvm.org/D65891

llvm-svn: 368982
2019-08-15 09:45:09 +00:00
Jordan Rupprecht
4045956175 [docs] Fix sphinx doc generation errors
Summary:
Errors fixed:
 - GettingStarted: Duplicate explicit target name: "cmake"
 - GlobalISel: Unexpected indentation
 - LoopTerminology: Explicit markup ends without a blank line; unexpected unindent
 - ORCv2: Definition list ends without a blank line; unexpected unindent
 - Misc: document isn't included in any toctree

Verified that a clean docs build (`rm -rf docs/ && ninja docs-llvm-html`) passes with no errors. Spot checked the individual pages to make sure they look OK.

Reviewers: thakis, dsanders

Reviewed By: dsanders

Subscribers: arphaman, llvm-commits, lhames, rovka, dsanders, reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66183

llvm-svn: 368932
2019-08-14 22:18:01 +00:00
Erich Keane
8d0fafd361 Add support in CMake to statically link the C++ standard library.
It is sometimes useful to have the C++ standard library linked into the
assembly when compiling clang, particularly when distributing a compiler
onto systems that don't have a copy of stdlibc++ or libc++ installed.

This functionality should work with either GCC or Clang as the host
compiler, though statically linking libc++ (as may be required for
licensing purposes) is only possible if the host compiler is Clang with
a copy of libc++ available.

Differential Revision: https://reviews.llvm.org/D65603

llvm-svn: 368907
2019-08-14 19:55:59 +00:00
JF Bastien
a8dac4e648 Move to C++14
Summary:
I just bumped the minimum compiler versions to support C++14 in D66188.

Following [our process](http://llvm.org/docs/DeveloperPolicy.html#toolchain) and [our previous agreement](http://lists.llvm.org/pipermail/llvm-dev/2019-January/129452.html), I'm now officially bumping the C++ version to 14 and updating the documentation.

Subscribers: mgorny, jkorous, dexonsmith, llvm-commits, chandlerc, thakis, EricWF, jyknight, lhames, JDevlieghere

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66195

llvm-svn: 368887
2019-08-14 17:39:07 +00:00
Craig Topper
9c7cd7edad [LangRef] Remove opening [ that was missing a closing ] from call/callbr/invoke syntax.
It looks like this bracket was added when the addrspace was added.
before it. So I think it can jut be removed.

llvm-svn: 368861
2019-08-14 15:10:37 +00:00
JF Bastien
0b85cce260 Remove minimum toolchain soft-error
Summary:
Back in January I changed the minimum toolchain version required to build clang
and LLVM: D57264. Since then we've release LLVM 8, following
[our process](http://llvm.org/docs/DeveloperPolicy.html#toolchain)
it's therefore now a good time to remove the soft-error and officially deprecate
older toolchains. I tried this out last Tursday night to see if any bots
complained, and I saw no complaints. I also manually audited bots and didn't see
any bot that should break, but their toolchain information is unreliable and
some bots are offline.

Once this patch stick we'll move to C++14 as we've
[already agreed](http://lists.llvm.org/pipermail/llvm-dev/2019-January/129452.html).

Subscribers: mgorny, jkorous, dexonsmith, llvm-commits, EricWF, thakis, chandlerc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66188

llvm-svn: 368799
2019-08-14 04:30:51 +00:00
John McCall
3f8b48091d Extend coroutines to support a "returned continuation" lowering.
A quick contrast of this ABI with the currently-implemented ABI:

- Allocation is implicitly managed by the lowering passes, which is fine
  for frontends that are fine with assuming that allocation cannot fail.
  This assumption is necessary to implement dynamic allocas anyway.

- The lowering attempts to fit the coroutine frame into an opaque,
  statically-sized buffer before falling back on allocation; the same
  buffer must be provided to every resume point.  A buffer must be at
  least pointer-sized.

- The resume and destroy functions have been combined; the continuation
  function takes a parameter indicating whether it has succeeded.

- Conversely, every suspend point begins its own continuation function.

- The continuation function pointer is directly returned to the caller
  instead of being stored in the frame.  The continuation can therefore
  directly destroy the frame when exiting the coroutine instead of having
  to leave it in a defunct state.

- Other values can be returned directly to the caller instead of going
  through a promise allocation.  The frontend provides a "prototype"
  function declaration from which the type, calling convention, and
  attributes of the continuation functions are taken.

- On the caller side, the frontend can generate natural IR that directly
  uses the continuation functions as long as it prevents IPO with the
  coroutine until lowering has happened.  In combination with the point
  above, the frontend is almost totally in charge of the ABI of the
  coroutine.

- Unique-yield coroutines are given some special treatment.

llvm-svn: 368788
2019-08-14 03:53:17 +00:00
Diego Trevino Ferrer
bd488127a7 [Bugpoint redesign] Fix nonlocal URI link in doc
Summary: Fixes documentation bot build  http://lab.llvm.org:8011/builders/llvm-sphinx-docs

Reviewers: JDevlieghere

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66022

llvm-svn: 368493
2019-08-09 21:48:47 +00:00
Michael Pozulp
d6ddff770e [Docs][llvm-strip] Fix an indentation issue.
llvm-svn: 368473
2019-08-09 19:41:13 +00:00
Michael Pozulp
2e9f600f2e [Docs][llvm-strip] Add help text to llvm-strip rst doc
Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=42383

Reviewers: jhenderson, alexshap, rupprecht

Reviewed By: jhenderson

Subscribers: wolfgangp, jakehehrlich, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65384

llvm-svn: 368464
2019-08-09 19:10:55 +00:00
Andrea Di Biagio
9bbf3a5aeb [MCA] Add flag -show-encoding to llvm-mca.
Flag -show-encoding enables the printing of instruction encodings as part of the
the instruction info view.

Example (with flags -mtriple=x86_64--  -mcpu=btver2):

Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[7]: Encoding Size

[1]    [2]    [3]    [4]    [5]    [6]    [7]    Encodings:     Instructions:
 1      2     1.00                         4     c5 f0 59 d0    vmulps   %xmm0, %xmm1, %xmm2
 1      4     1.00                         4     c5 eb 7c da    vhaddps  %xmm2, %xmm2, %xmm3
 1      4     1.00                         4     c5 e3 7c e3    vhaddps  %xmm3, %xmm3, %xmm4

In this example, column Encoding Size is the size in bytes of the instruction
encoding. Column Encodings reports the actual instruction encodings as byte
sequences in hex (objdump style).

The computation of encodings is done by a utility class named mca::CodeEmitter.

In future, I plan to expose the CodeEmitter to the instruction builder, so that
information about instruction encoding sizes can be used by the simulator. That
would be a first step towards simulating the throughput from the decoders in the
hardware frontend.

Differential Revision: https://reviews.llvm.org/D65948

llvm-svn: 368432
2019-08-09 11:26:27 +00:00
Diego Trevino Ferrer
c4a583a0ae Added Delta IR Reduction Tool
Summary: Tool parses input IR file, and runs the delta debugging algorithm to reduce the functions inside the input file.

Reviewers: alexshap, chandlerc

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63672

> llvm-svn: 368071

llvm-svn: 368358
2019-08-08 22:16:33 +00:00
Daniel Sanders
2dc0ac461e [globalisel][legalizer] Attempt to write down the minimal legalization rules
Summary:
There aren't very many requirements on the legalization rules but we should
document them.

Reviewers: aditya_nandakumar, volkan, bogner, paquette, aemerson, rovka, arsenm, Petar.Avramovic

Subscribers: wdng, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62423

# Conflicts:
#	llvm/docs/GlobalISel.rst

llvm-svn: 368321
2019-08-08 17:54:23 +00:00
Tim Corringham
b6d4024f9b Add llvm.licm.disable metadata
For some targets the LICM pass can result in sub-optimal code in some
cases where it would be better not to run the pass, but it isn't
always possible to suppress the transformations heuristically.

Where the front-end has insight into such cases it is beneficial
to attach loop metadata to disable the pass - this change adds the
llvm.licm.disable metadata to enable that.

Differential Revision: https://reviews.llvm.org/D64557

llvm-svn: 368296
2019-08-08 13:46:17 +00:00
Anusha Basana
c82d75f082 [llvm-lipo] Update llvm-lipo docs for -info -thin -create -replace -segalign flags
Summary:
The information for -info -thin -create -replace and -segalign flags are added to llvm-lipo.rst

Test Plan:

Reviewers: smeenai, alexshap, compnerd, mtrent

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65676

llvm-svn: 368235
2019-08-07 23:25:12 +00:00
Diego Trevino Ferrer
44a8f767f6 Revert Added Delta IR Reduction Tool
This reverts r368071 (git commit a2584978f5bb41973d65a145b0d9459b81e3ac6d)

llvm-svn: 368217
2019-08-07 21:51:54 +00:00
Diego Trevino Ferrer
2143dc29d5 Added Delta IR Reduction Tool
Summary: Tool parses input IR file, and runs the delta debugging algorithm to reduce the functions inside the input file.

Reviewers: alexshap, chandlerc

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63672

> llvm-svn: 368071

llvm-svn: 368214
2019-08-07 21:37:11 +00:00
Sam Elliott
01018e64da [RISCV][NFC] Document RISC-V-specific assembly constraints
llvm-svn: 368167
2019-08-07 13:08:07 +00:00
Petr Hosek
fac9592e7e Reverts commit r368117, r368115 and r368112
This reverts commits:

  "Added Delta IR Reduction Tool"
  "[Bugpoint redesign] Added Pass to Remove Global Variables"
  "Added Tool as Dependency to tests & fixed warnings"

Reduce/remove-funcs.ll is failing on bots.

llvm-svn: 368122
2019-08-07 05:15:34 +00:00
Diego Trevino Ferrer
72f0bd49ad Added Delta IR Reduction Tool
Summary: Tool parses input IR file, and runs the delta debugging algorithm to reduce the functions inside the input file.

Reviewers: alexshap, chandlerc

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63672

> llvm-svn: 368071

llvm-svn: 368112
2019-08-07 00:00:52 +00:00
Dmitri Gribenko
66e203026e Revert "Added Delta IR Reduction Tool"
This reverts commit r368071, it broke buildbots.

llvm-svn: 368073
2019-08-06 19:40:37 +00:00
Diego Trevino Ferrer
3fc8dbf0ee Added Delta IR Reduction Tool
Summary: Tool parses input IR file, and runs the delta debugging algorithm to reduce the functions inside the input file.

Reviewers: alexshap, chandlerc

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63672

llvm-svn: 368071
2019-08-06 18:59:11 +00:00
Hans Wennborg
e9787cf7d5 Revert r367941 "Add a note to the release not about a potentially breaking optimization"
The note was moved to the release_90 branch in r367997.

llvm-svn: 367998
2019-08-06 08:32:33 +00:00
Johannes Doerfert
4ab84c53b7 [Attributor] Deduce the "no-return" attribute for functions
A function is "no-return" if we never reach a return instruction, either
because there are none or the ones that exist are dead.

Test have been adjusted:
  - either noreturn was added, or
  - noreturn was avoided by modifying the code.

The new noreturn_{sync,async} test make sure we do handle invoke
instructions with a noreturn (and potentially nowunwind) callee
correctly, even in the presence of potential asynchronous exceptions.

llvm-svn: 367948
2019-08-05 23:22:05 +00:00
Wolfgang Pieb
205f1a60c6 [llvm-readelf] Support dumping of stack sizes sections with readelf --stack-sizes
Reviewers: jhenderson, grimar, rupprecht

Differential Revision: https://reviews.llvm.org/D65313

llvm-svn: 367942
2019-08-05 22:47:07 +00:00
Philip Reames
c0f1116865 Add a note to the release not about a potentially breaking optimization
This has come up twice already (once in pr42763 and once in the commit thread), so give warning of a new way in which UB can result in unexpected program behavior.

llvm-svn: 367941
2019-08-05 22:34:59 +00:00
Andrea Di Biagio
f5f269d720 [MCA][doc] Add a section for the 'Bottleneck Analysis'.
Also clarify the meaning of 'Block RThroughput' and 'RThroughput'.

llvm-svn: 367853
2019-08-05 13:18:37 +00:00
Fangrui Song
6b986b0b9e Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC
F_{None,Text,Append} are kept for compatibility since r334221.

llvm-svn: 367800
2019-08-05 05:43:48 +00:00
Tim Northover
e5745c32fd IR: print value numbers for unnamed function arguments
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.

Also modifies the parser to accept IR in that form for obvious reasons.

llvm-svn: 367755
2019-08-03 14:28:34 +00:00
Yonghong Song
1f886e2f2a [BPF] annotate DIType metadata for builtin preseve_array_access_index()
Previously, debuginfo types are annotated to
IR builtin preserve_struct_access_index() and
preserve_union_access_index(), but not
preserve_array_access_index(). The debug info
is useful to identify the root type name which
later will be used for type comparison.

For user access without explicit type conversions,
the previous scheme works as we can ignore intermediate
compiler generated type conversions (e.g., from union types to
union members) and still generate correct access index string.

The issue comes with user explicit type conversions, e.g.,
converting an array to a structure like below:
  struct t { int a; char b[40]; };
  struct p { int c; int d; };
  struct t *var = ...;
  ... __builtin_preserve_access_index(&(((struct p *)&(var->b[0]))->d)) ...
Although BPF backend can derive the type of &(var->b[0]),
explicit type annotation make checking more consistent
and less error prone.

Another benefit is for multiple dimension array handling.
For example,
  struct p { int c; int d; } g[8][9][10];
  ... __builtin_preserve_access_index(&g[2][3][4].d) ...
It would be possible to calculate the number of "struct p"'s
before accessing its member "d" if array debug info is
available as it contains each dimension range.

This patch enables to annotate IR builtin preserve_array_access_index()
with proper debuginfo type. The unit test case and language reference
is updated as well.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D65664

llvm-svn: 367724
2019-08-02 21:28:28 +00:00
Paul Robinson
65503fe2fe [doc] Give a workaround for a FileCheck regex that ends in a brace.
Addresses PR42864.

llvm-svn: 367689
2019-08-02 16:07:48 +00:00
Lang Hames
a6587cc70d [ORC] Change the locking scheme for ThreadSafeModule.
ThreadSafeModule/ThreadSafeContext are used to manage lifetimes and locking
for LLVMContexts in ORCv2. Prior to this patch contexts were locked as soon
as an associated Module was emitted (to be compiled and linked), and were not
unlocked until the emit call returned. This could lead to deadlocks if
interdependent modules that shared contexts were compiled on different threads:
when, during emission of the first module, the dependence was discovered the
second module (which would provide the required symbol) could not be emitted as
the thread emitting the first module still held the lock.

This patch eliminates this possibility by moving to a finer-grained locking
scheme. Each client holds the module lock only while they are actively operating
on it. To make this finer grained locking simpler/safer to implement this patch
removes the explicit lock method, 'getContextLock', from ThreadSafeModule and
replaces it with a new method, 'withModuleDo', that implicitly locks the context,
calls a user-supplied function object to operate on the Module, then implicitly
unlocks the context before returning the result.

ThreadSafeModule TSM = getModule(...);
size_t NumFunctions = TSM.withModuleDo(
    [](Module &M) { // <- context locked before entry to lambda.
      return M.size();
    });

Existing ORCv2 layers that operate on ThreadSafeModules are updated to use the
new method.

This method is used to introduce Module locking into each of the existing
layers.

llvm-svn: 367686
2019-08-02 15:21:37 +00:00
Andrea Di Biagio
66df8adc20 [MCA] Add support for printing immedate values as hex. Also enable lexing of masm binary and hex literals.
This patch adds a new llvm-mca flag named -print-imm-hex.

By default, the instruction printer prints immediate operands as decimals. Flag
-print-imm-hex enables the instruction printer to print those operands in hex.

This patch also adds support for MASM binary and hex literal numbers (example
0FFh, 101b).
Added tests to verify the behavior of the new flag. Tests also verify that masm
numeric literal operands are now recognized.

Differential Revision: https://reviews.llvm.org/D65588

llvm-svn: 367671
2019-08-02 10:38:25 +00:00
Matt Arsenault
5e1524bdd5 CodeGen: Allow virtual registers in bundles
The note in the documentation suggests this restriction is a compile
time optimization for architectures that make heavy use of
bundling. Allowing virtual registers in a bundle is useful for some
(non-R600) AMDGPU use cases and are infrequent enough to matter.

A more common AMDGPU use case has already been using virtual registers
in bundles since r333691, although never calling finalizeBundle on
them and manually creating the use/def list on the BUNDLE
instruction. This is also relatively infrequent, and only happens for
consecutive sequences of some load/store types.

llvm-svn: 367597
2019-08-01 18:41:28 +00:00
Erich Keane
2629d550df Fix spacing of LLVM_USE_PERF in CMake.rst that caused it to be tabbed in funny
llvm-svn: 367585
2019-08-01 17:30:25 +00:00
Erich Keane
ae6568a137 Document LLVM_ENABLE_LIBCXX in CMake.rst
llvm-svn: 367584
2019-08-01 17:30:21 +00:00
Philip Reames
a13aa37873 Attempt to unbreak sphinx build bot by inserting a link.
llvm-svn: 367487
2019-07-31 22:14:26 +00:00
Lang Hames
b024a57c29 [docs] Add references to unreferenced footnotes.
Thanks to Stefan Granitz for catching the issue.

llvm-svn: 367458
2019-07-31 18:07:37 +00:00
Djordje Todorovic
30604a8ec2 Reland "[DwarfDebug] Dump call site debug info"
The build failure found after the rL365467 has been
resolved.

Differential Revision: https://reviews.llvm.org/D60716

llvm-svn: 367446
2019-07-31 16:51:28 +00:00
Johannes Doerfert
552f8fc59c [docs][FIX] Add missing word to documentation in terms of SCCs
In the approval of D65299, commited as rL367440, I mentioned that my
proposed wording was lacking the word "maximal". It is added now for
correctness.

llvm-svn: 367445
2019-07-31 16:48:42 +00:00
Anusha Basana
cb472f88b8 [build] Add the ability to create a symlink for lipo
Add user enabled option to create lipo with symlink to llvm-lipo
Used rL326381 for reference.

Differential Revision: https://reviews.llvm.org/D65477

llvm-svn: 367444
2019-07-31 16:46:57 +00:00
Philip Reames
19f3d78573 [docs] Reword documentation in terms of SCCs not cycles
Given the example:
header:
  br i1 %c, label %next, label %header
next:
  br i1 %c2, label %exit, label %header

We end up with a loop containing both header and next.  Given that, the describing the loop in terms of cycles is confusing since we have multiple distinct cycles within a single Loop.  Standardize on the SCC to clarify.

Differential Revision: https://reviews.llvm.org/D65299

llvm-svn: 367440
2019-07-31 16:24:20 +00:00
Diana Picus
e4a8ea4508 [docs] Add cmake to Software requirements
Add cmake to the list of packages required for compiling LLVM.
Also move make to the bottom of the list and mark it as optional.

Differential Revision: https://reviews.llvm.org/D65438

llvm-svn: 367395
2019-07-31 08:48:36 +00:00
JF Bastien
c63c7fb213 [NFC] Remove LLVM_ALIGNAS
Summary: The minimum compilers support all have alignas, and we don't use LLVM_ALIGNAS anywhere anymore. This also removes an MSVC diagnostic which, according to the comment above, isn't relevant anymore.

Reviewers: rnk

Subscribers: mgorny, jkorous, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65458

llvm-svn: 367383
2019-07-31 03:22:08 +00:00
Francis Visoiu Mistrih
75722a7d1f Reland: [Remarks] Add an LLVM-bitstream-based remark serializer
Add a new serializer, using a binary format based on the LLVM bitstream
format.

This format provides a way to serialize the remarks in two modes:

1) Separate mode: the metadata is separate from the remark entries.
2) Standalone mode: the metadata and the remark entries are in the same
file.

The format contains:

* a meta block: container version, container type, string table,
external file path, remark version
* a remark block: type, remark name, pass name, function name, debug
file, debug line, debug column, hotness, arguments (key, value, debug
file, debug line, debug column)

A string table is required for this format, which will be dumped in the
meta block to be consumed before parsing the remark blocks.

On clang itself, we noticed a size reduction of 13.4x compared to YAML,
and a compile-time reduction of between 1.7% and 3.5% on CTMark.

Differential Revision: https://reviews.llvm.org/D63466

Original llvm-svn: 367364
Revert llvm-svn: 367370

llvm-svn: 367372
2019-07-31 00:13:51 +00:00
Francis Visoiu Mistrih
c4b9e92de5 Revert "[Remarks] Add an LLVM-bitstream-based remark serializer"
This reverts commit r367364.

Breaks some bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-gn/builds/3161/steps/annotate/logs/stdio

llvm-svn: 367370
2019-07-31 00:01:34 +00:00
Francis Visoiu Mistrih
9c85c62254 [Remarks] Add an LLVM-bitstream-based remark serializer
Add a new serializer, using a binary format based on the LLVM bitstream
format.

This format provides a way to serialize the remarks in two modes:

1) Separate mode: the metadata is separate from the remark entries.
2) Standalone mode: the metadata and the remark entries are in the same
file.

The format contains:

* a meta block: container version, container type, string table,
external file path, remark version
* a remark block: type, remark name, pass name, function name, debug
file, debug line, debug column, hotness, arguments (key, value, debug
file, debug line, debug column)

A string table is required for this format, which will be dumped in the
meta block to be consumed before parsing the remark blocks.

On clang itself, we noticed a size reduction of 13.4x compared to YAML,
and a compile-time reduction of between 1.7% and 3.5% on CTMark.

Differential Revision: https://reviews.llvm.org/D63466

llvm-svn: 367364
2019-07-30 23:11:57 +00:00
Thomas Lively
658456e9bf [WebAssembly] Do not emit tail calls with return type mismatch
Summary:
return_call and return_call_indirect are only valid if the return
types of the callee and caller match. We were previously not enforcing
that, which was producing invalid modules.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65246

llvm-svn: 367339
2019-07-30 18:08:39 +00:00
Francis Visoiu Mistrih
88c68e4dd6 [Docs] Fix sphinx warning in OCamlLangImpl5.rst
The path to the image was outdated.

http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/33865/steps/docs-llvm-html/logs/stdio

llvm/docs/tutorial/OCamlLangImpl5.rst:173: WARNING: image file not readable: tutorial/LangImpl05-cfg.png

llvm-svn: 367334
2019-07-30 16:56:45 +00:00
Francis Visoiu Mistrih
4842a27e5b [Remarks] Add two serialization modes for remarks: separate and standalone
The default mode is separate, where the metadata is serialized
separately from the remarks.

Another mode is the standalone mode, where the metadata is serialized
before the remarks, on the same stream.

llvm-svn: 367328
2019-07-30 16:01:40 +00:00
Francis Visoiu Mistrih
a2ae23ab50 Reland: [Remarks] Support parsing remark metadata in the YAML remark parser
This adds support to the yaml remark parser to be able to parse remarks
directly from the metadata.

This supports parsing separate metadata and following the external file
with the associated metadata, and also a standalone file containing
metadata + remarks all together.

Original llvm-svn: 367148
Revert llvm-svn: 367151

This has a fix for gcc builds.

llvm-svn: 367155
2019-07-26 21:02:02 +00:00
Francis Visoiu Mistrih
fb8a688982 Revert "[Remarks] Support parsing remark metadata in the YAML remark parser"
This reverts r367148.

Seems to fail on
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/27768.

llvm-svn: 367151
2019-07-26 20:54:44 +00:00
Francis Visoiu Mistrih
b07bf19865 [Remarks] Support parsing remark metadata in the YAML remark parser
This adds support to the yaml remark parser to be able to parse remarks
directly from the metadata.

This supports parsing separate metadata and following the external file
with the associated metadata, and also a standalone file containing
metadata + remarks all together.

llvm-svn: 367148
2019-07-26 20:11:53 +00:00
Sergey Dmitriev
01192d9596 [llvm-objcopy] Add support for --add-section for COFF
This patch enables support for --add-section=... option for COFF objects.

Differential Revision: https://reviews.llvm.org/D65040

llvm-svn: 367130
2019-07-26 17:06:41 +00:00
Sjoerd Meijer
445740fac4 [Clang] New loop pragma vectorize_predicate
This adds a new vectorize predication loop hint:

  #pragma clang loop vectorize_predicate(enable)

that can be used to indicate to the vectoriser that all (load/store)
instructions should be predicated (masked). This allows, for example, folding
of the remainder loop into the main loop.

This patch will be followed up with D64916 and D65197. The former is a
refactoring in the loopvectorizer and the groundwork to make tail loop folding
a more general concept, and in the latter the actual tail loop folding
transformation will be implemented.

Differential Revision: https://reviews.llvm.org/D64744

llvm-svn: 366989
2019-07-25 07:33:13 +00:00
Philip Reames
9d0871d80e [docs] Split out a section on LoopInfo in the new loop documentation
llvm-svn: 366964
2019-07-24 23:46:13 +00:00
Philip Reames
8639011900 Apply a few more reviewer suggestions from D65164
llvm-svn: 366961
2019-07-24 23:30:56 +00:00
Philip Reames
1603cb1137 Define some basic terminology around loops in our documentation
I've noticed a lot of confusion around this area recently with key terms being misused in a number of threads. To help reign that in, let's go ahead and document the current terminology and meaning thereof.

My hope is to grow this over time into a broader discussion of canonical loop forms - yes, there are more than one ... many more than one - but for the moment, simply having the key terminology is a good stopping place.

Note: I am landing this *without* an LGTM.  All feedback so far has been positive, and trying to apply all of the suggested changes/extensions would cause the review to never end.  Instead, I decided to land it with the obvious fixes made based on reviewer comments, then iterate from there.

Differential Revision: https://reviews.llvm.org/D65164

llvm-svn: 366960
2019-07-24 23:24:13 +00:00
Thomas Preud'homme
fddc968349 FileCheck [8/12]: Define numeric var from expr
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch lift the restriction for a
numeric expression to either be a variable definition or a numeric
expression to try to match.

This commit allows a numeric variable to be set to the result of the
evaluation of a numeric expression after it has been matched
successfully. When it happens, the variable is allowed to be used on
the same line since its value is known at match time.

It also makes use of this possibility to reuse the parsing code to
parse a command-line definition by crafting a mirror string of the
-D option with the equal sign replaced by a colon sign, e.g. for option
'-D#NUMVAL=10' it creates the string
'-D#NUMVAL=10 (parsed as [[#NUMVAL:10]])' where the numeric expression
is parsed to define NUMVAL. This result in a few tests needing updating
for the location diagnostics on top of the tests for the new feature.

It also enables empty numeric expression which match any number without
defining a variable. This is done here rather than in commit #5 of the
patch series because it requires to dissociate automatic regex insertion
in RegExStr from variable definition which would make commit #5 even
bigger than it already is.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60388

> llvm-svn: 366860

llvm-svn: 366897
2019-07-24 12:38:22 +00:00
Thomas Preud'homme
443542ddca Revert "FileCheck [8/12]: Define numeric var from expr"
This reverts commit 1b05977538d9487aa845ee2f3bec8b89c63c4f29.

llvm-svn: 366872
2019-07-24 07:32:34 +00:00
Thomas Preud'homme
f562841f29 FileCheck [8/12]: Define numeric var from expr
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch lift the restriction for a
numeric expression to either be a variable definition or a numeric
expression to try to match.

This commit allows a numeric variable to be set to the result of the
evaluation of a numeric expression after it has been matched
successfully. When it happens, the variable is allowed to be used on
the same line since its value is known at match time.

It also makes use of this possibility to reuse the parsing code to
parse a command-line definition by crafting a mirror string of the
-D option with the equal sign replaced by a colon sign, e.g. for option
'-D#NUMVAL=10' it creates the string
'-D#NUMVAL=10 (parsed as [[#NUMVAL:10]])' where the numeric expression
is parsed to define NUMVAL. This result in a few tests needing updating
for the location diagnostics on top of the tests for the new feature.

It also enables empty numeric expression which match any number without
defining a variable. This is done here rather than in commit #5 of the
patch series because it requires to dissociate automatic regex insertion
in RegExStr from variable definition which would make commit #5 even
bigger than it already is.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60388

llvm-svn: 366860
2019-07-23 22:41:38 +00:00
Eli Friedman
fd778f3433 [docs] Clarify where the indirect UB due to write-write races comes from
This is based on https://bugs.llvm.org/show_bug.cgi?id=42435#c3.

Patch by Ralf Jung.

llvm-svn: 366855
2019-07-23 21:51:26 +00:00
Francis Visoiu Mistrih
14b03d1c85 [Remarks] Introduce a new format: yaml-strtab
This exposes better support to use a string table with a format through
an actual new remark::Format, called yaml-strtab.

This can now be used with -fsave-optimization-record=yaml-strtab.

llvm-svn: 366849
2019-07-23 20:42:46 +00:00
Ryan Taylor
d817c75cad [IR][Verifier] Allow IntToPtrInst to be !dereferenceable
Summary:
Allow IntToPtrInst to carry !dereferenceable metadata tag.
This is valid since !dereferenceable can be only be applied to
pointer type values.

Change-Id: If8a6e3c616f073d51eaff52ab74535c29ed497b4

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64954

llvm-svn: 366826
2019-07-23 17:19:56 +00:00
Chris Lattner
80b9807d04 unbreak links
llvm-svn: 366530
2019-07-19 05:49:11 +00:00
Chris Lattner
d7454a1883 replace the old kaleidoscope tutorial files with orphaned pages that forward to the new copy.
llvm-svn: 366529
2019-07-19 05:23:17 +00:00
Chris Lattner
fd07a98324 Point to the dusted off version of the kaleidoscope tutorial.
llvm-svn: 366528
2019-07-19 05:15:57 +00:00