Add the `-whitelist-filename-regex` option to restrict coverage
reporting to file paths that match a whitelist regex.
Patch by Michael Daniels!
rdar://56720320
Summary:
When createing an ORC remote JIT target the current library split forces the target process to link large portions of LLVM (Core, Execution Engine, JITLink, Object, MC, Passes, RuntimeDyld, Support, Target, and TransformUtils). This occurs because the ORC RPC interfaces rely on the static globals the ORC Error types require, which starts a cycle of pulling in more and more.
This patch breaks the ORC RPC Error implementations out into an "OrcError" library which only depends on LLVM Support. It also pulls the ORC RPC headers into their own subdirectory.
With this patch code can include the Orc/RPC/*.h headers and will only incur link dependencies on LLVMOrcError and LLVMSupport.
Reviewers: lhames
Reviewed By: lhames
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68732
Summary:
Add a flag `F_no_mmap` to `FileOutputBuffer` to support
`--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores
this flag for compatibility with GNU ld and gold.
We need this flag to speed up link time for large binaries in certain
scenarios. When we link some of our larger binaries we find that LLD
takes 50+ GB of memory, which causes memory pressure. The memory
pressure causes the VM to flush dirty pages of the output file to disk.
This is normally okay, since we should be flushing cold pages. However,
when using BtrFS with compression we need to write 128KB at a time when
we flush a page. If any page in that 128KB block is written again, then
it must be flushed a second time, and so on. Since LLD doesn't write
sequentially this causes write amplification. The same 128KB block will
end up being flushed multiple times, causing the linker to many times
more IO than necessary. We've observed 3-5x faster builds with
-no-mmap-output-file when we hit this scenario.
The bad scenario only applies to compressed filesystems, which group
together multiple pages into a single compressed block. I've tested
BtrFS, but the problem will be present for any compressed filesystem
on Linux, since it is caused by the VM.
Silently ignoring --no-mmap-output-file caused a silent regression when
we switched from gold to lld. We pass --no-mmap-output-file to fix this
edge case, but since lld silently ignored the flag we didn't realize it
wasn't being respected.
Benchmark building a 9 GB binary that exposes this edge case. I linked 3
times with --mmap-output-file and 3 times with --no-mmap-output-file and
took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM,
BtrFS mounted with -compress-force=zstd, and an 80% full disk.
| Mode | Time |
|---------|-------|
| mmap | 894 s |
| no mmap | 126 s |
When compression is disabled, BtrFS performs just as well with and
without mmap on this benchmark.
I was unable to reproduce the regression with any binaries in
lld-speed-test.
Reviewed By: ruiu, MaskRay
Differential Revision: https://reviews.llvm.org/D69294
This patch adds support for deleted C++ special member functions in
clang and llvm. Also added Defaulted member encodings for future
support for defaulted member functions.
Patch by Sourabh Singh Tomar!
Differential Revision: https://reviews.llvm.org/D69215
I had hoped that I could have some
```
.. code-block:: MIR
```
sections for MIR examples which causes a warning about pygments not
supporting it but we have warnings treated as errors
Summary:
I haven't refreshed the Function Calls section as I don't feel I have
sufficient knowledge of that area. It would be appreciated if someone could
review that section.
Note: I'm aware that pygments doesn't support 'mir' as used in one of the
code-block directives. This currently emits a warning and I decided to
keep it to enable finding them later. Maybe we can teach pygments to
support it.
Depends on D69456
Reviewers: volkan, aditya_nandakumar
Subscribers: rovka, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69457
Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/a atomic marker on the MMO. The later parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other.
This has been fairly heavily fuzzed, and I examined diffs in a reasonable large corpus of assembly by hand, so I'm reasonable sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization.
Differential Revision: https://reviews.llvm.org/D69219
When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:
```
# RUN: program | diff file -
```
Such cases exist now, in `clang/test/Analysis` for example. We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` doesn't
recognize `-` as a command-line option. This patch adds support for
`-` to mean stdin.
Reviewed By: probinson, rnk
Differential Revision: https://reviews.llvm.org/D67643
When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:
```
# RUN: program | diff file -
# RUN: not diff file1 file2 | FileCheck %s
```
Such cases exist now, in `clang/test/Analysis` for example. We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` cannot
currently be used in pipelines and doesn't recognize `-` as a
command-line option.
To enable pipelines, this patch moves lit's `diff` implementation into
an out-of-process script, similar to lit's `cat` implementation. A
follow-up patch will implement `-` to mean stdin.
Also, when lit's `diff` prints differences to stdout in Windows, this
patch ensures it always terminate lines with `\n` not `\r\n`. That
way, strict FileCheck directives checking the `diff` output succeed in
both Linux and Windows. This wasn't an issue when `diff` was internal
to lit because `diff` didn't then write to the true stdout, which is
where the `\n` -> `\r\n` conversion happened in Python.
Reviewed By: probinson, stella.stamenova
Differential Revision: https://reviews.llvm.org/D66574
This catches some cases. There are probably ways to improve this.
I tried doing it as a combine on the setcc, but that broke
some cases involving flag reuse in place of test.
I renamed the isX86CCUnsigned to isX86CCSigned and flipped its
polarity to make it consistent with the similar functions for
ISD::SETCC. This avoids calling EQ/NE as being signed or unsigned.
Fixes PR43823.
Differential Revision: https://reviews.llvm.org/D69499
Summary:
Rewrite the pipeline overview to be more focused on the structure and
flexibility as well as highlight the increased usefulness of
MachineVerifier and increased testability resulting from the smaller
incremental passes approach.
The diagrams are lifted from the slides for the LLVMDev 2019 talk
'Generating Optimized Code with GlobalISel' and adapted to be readable on
the white background used in the docs.
Reviewers: volkan
Subscribers: rovka, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69456
Demonstrates missed opportunity to combine to the VBMI2 SHLDV/SHRDV ops - combineOrShiftToFunnelShift should handle vector ops (and we should eventually move this to DAGCombine).
Adding patten matching for two SVE intrinsics: frecps and frsqrts.
Also added patterns for fsub and fmul - these SDNodes directly correspond
to machine instructions.
Review: https://reviews.llvm.org/D68476
Patch authored by mgudim (Mikhail Gudim).
llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with
EXPENSIVE_CHECKS enabled, causing the patch to be reverted in
rG2c496bb5309c972d59b11f05aee4782ddc087e71.
This patch relands the patch with a proper fix to the
live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the
callee-saves correctly so that the CalleeSaved info is taken from MIR
directly, rather than letting it be recalculated by the PEI pass. I've
done this by running `llc -stop-before=prologepilog` on the LLVM
IR as captured in the test files, adding the extra MOV instructions
that were manually added in the original test file, then running `llc
-run-pass=prologepilog` and finally re-added the comments for the MOV
instructions.
Stores are vectorized with maximum vectorization factor of 16. Patch
tries to improve the situation and use maximal vectorization factor.
Reviewers: spatel, RKSimon, mkuper, hfinkel
Differential Revision: https://reviews.llvm.org/D43582
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=43730
...and as shown there, we have existing test cases that show potential miscompiles.
We could just bail out for vector constants that contain any undef elements, or we can do as shown here:
allow the transform, but replace the undefs with a safe value.
For most of the tests shown, this results in a full splat constant (no undefs) which is probably a win
for further IR analysis because we conservatively don't match undefs in most cases. Codegen can probably
recover these kinds of undef lanes via demanded elements analysis if that's profitable.
Differential Revision: https://reviews.llvm.org/D69519
In some cases, we fail to reduce the pass list earlier because of
complex pass dependencies, but we can reduce it after we simplified the
reproducer.
An example of that is PR43474, which can limit the crash to
-loop-interchange. Adding a test case would require at least 2
interacting Loop passes I think.
Reviewers: davide, reames, modocache
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D69236
Define BitVector::BitWord as uintptr_t instead of unsigned long, as long does not necessarily translates to a pointer size (especially on 64-bit Visual Studio).
Committed on behalf of @ekatz (Ehud Katz)
Differential Revision: https://reviews.llvm.org/D69336
This should trigger a dereference before null-check warning,
but I don't see it when building with clang. In any case, the
current and known future users of this helper require non-null
args, so I'm converting the 'if' to an assert.
This reverts commit 8af5ada09319e5a021d57a1a03715b2fd022e415.
As Bjorn pointed out in D68816, the iteration over `UserVals` may not be safe.
Reverting on behalf of Orlando.
Summary:
Currently we only forget the loop we added LCSSA phis for. But SCEV
expressions in other loops could also depend on the instruction we added
a PHI for and currently we do not invalidate those expressions. This can
happen when we use ScalarEvolution before converting a function to LCSSA
form. The SCEV expressions will refer to the non-LCSSA value. If this
SCEV expression is then used with the expander, we do not preserve LCSSA
form.
This patch properly forgets the values we created PHIs for. Those need
to be recomputed again. This patch fixes PR43458.
Currently SCEV::verify does not catch this mismatch and any test would
need to run multiple passes to trigger the error (e.g. -loop-reduce
-loop-unroll). I will also look into catching this kind of mismatch in
the verifier. Also, we currently forget the whole loop in LCSSA and I'll
check if we can be more surgical.
Reviewers: efriedma, sanjoy.google, reames
Reviewed By: efriedma
Subscribers: zzheng, hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68194
Use the existing helper function in BranchFolding, "countsAsInstruction",
to skip over non-instructions. Otherwise debug instructions can be
identified as the last real instruction in a block, leading to different
codegen decisions when debug is enabled as demonstrated by the test case.
Patch by: yechunliang (Chris Ye)!
Differential Revision: https://reviews.llvm.org/D66467
This fixes a minor oversight mentioned in the review of D69379:
we should push extractelement into the operands of getelementptr
regardless of whether that enables further folding.
Before this change .symtab section was required for SHT_REL[A] section
declarations. yaml2obj automatically defined it in case when YAML document
did not have it.
With this change it is now possible to produce an object that
has a relocation section, but has no symbol table.
It simplifies the code and also it is inline with how we handle Link fields
for another special sections.
Differential revision: https://reviews.llvm.org/D69260
Currently, when we do not specify "Info" field in a YAML description
for SHT_GROUP section, yaml2obj reports an error:
"error: unknown symbol referenced: '' by YAML section '.group1'"
Also, we do not link it with a symbol table by default,
though it is what we do for AddrsigSection, HashSection, RelocationSection.
(http://www.sco.com/developers/gabi/latest/ch4.sheader.html#sh_link)
The patch fixes missings mentioned.
Differential revision: https://reviews.llvm.org/D69299
Sections may have zero size and zero-sized sections may share a start address
with other zero-sized sections. For the section overlap test to function
correctly zero-sized sections must be ordered before any non-zero sized ones.
This should fix the intermittent failures in the
test/ExecutionEngine/JITLink/X86/MachO_zero_fill_alignment.s test case that
have been observed on some builders.
When TableGen is inferring register classes from contexts, it uses a
sorting function based on the number of registers in the class. Since
this was being treated as an alias of VGPR_32, they had exactly the
same size. The sort used wasn't a stable sort, and even if it were, I
believe the tie breaker would effectively end up being the
alphabetical ordering of the class name. There appear to be issues
trying to use an empty set of registers, so add only one so this will
always sort to the end.
Also add a comment explaining how VReg_1 is a dirty hack for
SelectionDAG.
This does end up changing the behavior of i1 with inline asm and VGPR
constraints, but the existing behavior was was already nonsensical and
inconsistent. It should probably be disallowed anyway.
Fixes bug 43699
To make IntegerState more flexible but also less error prone we split it
up into (1) incrementing, (2) decrementing, and (3) bit-tracking states.
This adds functionality compared to before and disallows misuse, e.g.,
"incrementing" updates on a bit-tracking state.
Part of the change is a single operator in the base class which
simplifies helper functions that deal with states.
There are certain functional changes but all of which should actually be
corrections.