Add symlinks for `llvm-libtool-darwin` and
`llvm-install-name-tool`.
Reviewed by jhenderson, smeenai
Differential Revision: https://reviews.llvm.org/D85054
This diff adds documentation for the `allow-empty` flag to the FileCheck
docs.
Reviewed by jhenderson, smeenai, thopre
Differential Revision: https://reviews.llvm.org/D83682
Problems with instrumenting atomic_load when the call has no successor,
blocking compiler roll
This reverts commit 33d239513c881d8c11c60d5710c55cf56cc309a5.
The history of dropTriviallyDeadConstantArrays is as follows. Because the appending linkage uses too much memory (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150105/251381.html), dropTriviallyDeadConstantArrays was introduced (https://reviews.llvm.org/rG81f385b0c6ea37dd7195a65be162c75bbdef29d2) to release unused constant arrays. Recently, dropTriviallyDeadConstantArrays was improved (https://reviews.llvm.org/rG81f385b0c6ea37dd7195a65be162c75bbdef29d2) to reduce its quadratic cost.
Our recent LTO profiling shows that when a target is large, 15-20% of the time is spent in SetVector::insert, called by dropTriviallyDeadConstantArrays.
A large application has hundreds or thousands of modules; each module calls dropTriviallyDeadConstantArrays once to clean up the tens of thousands of ConstantArrays it has. Of those ConstantArrays, usually only around 5 can be deleted, and very few deleted ConstantArrays reference other ConstantArrays: fewer than 10 out of millions.
Given this, the cost of SetVector::insert is mainly from the construction of WorkList from ArrayConstants. This motivated the fix that iterates ArrayConstants directly, and uses WorkList only when necessary.
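The idea can be illustrated with a minimal Python sketch. It is not the C++ patch itself: `ConstArray`, `use_count`, and `drop_trivially_dead` are hypothetical stand-ins, and the sketch assumes that "uses WorkList only when necessary" means seeding the worklist only with arrays that already have no uses.

```python
class ConstArray:
    """Hypothetical stand-in for a ConstantArray with a use count."""

    def __init__(self, name, operands=()):
        self.name = name
        self.operands = list(operands)
        self.use_count = 0
        for op in self.operands:
            op.use_count += 1  # each operand gains one user


def drop_trivially_dead(array_constants):
    """Delete unused arrays from the set; return deleted names in order."""
    # Scan the pool directly and seed the worklist only with dead arrays,
    # instead of copying every array into the worklist up front.
    worklist = [c for c in array_constants if c.use_count == 0]
    deleted = []
    while worklist:
        c = worklist.pop()
        if c.use_count != 0 or c not in array_constants:
            continue  # still live, or already deleted
        for op in c.operands:
            op.use_count -= 1
            if op in array_constants:
                worklist.append(op)  # may have become dead
        array_constants.remove(c)
        deleted.append(c.name)
    return deleted
```

With only a handful of dead arrays per module, the worklist stays tiny, so the per-module cost is dominated by a single linear scan rather than by set insertions.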
Our evaluation shows that
1) The cumulative time percentage of dropTriviallyDeadConstantArrays is reduced from 15-17% to 4-6%.
2) For targets with LTO time > 20min, the time reduction is about 20%.
3) No observable performance impact for builds that do not use LTO.
Reviewed By: mehdi_amini, tejohnson, jdoerfert
Differential Revision: https://reviews.llvm.org/D85379
Regions that should be rescheduled without memory op clustering are
sometimes skipped. RegionIdx is not incremented when iterating over regions
that are flagged to be skipped, causing the index to be incorrect.
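A minimal Python sketch of this class of bug, not the scheduler code itself: `reschedule_regions` and `needs_reschedule` are hypothetical names, and the per-region flag list is an assumption about how the skip decision is stored.

```python
def reschedule_regions(regions, needs_reschedule):
    """Return (index, region) pairs for the regions that get rescheduled."""
    rescheduled = []
    region_idx = 0
    for region in regions:
        if not needs_reschedule[region_idx]:
            # The index must advance on the skip path too; without this
            # increment every later region is checked against a stale flag.
            region_idx += 1
            continue
        rescheduled.append((region_idx, region))
        region_idx += 1
    return rescheduled
```

If the increment on the skip path is dropped, the first skipped region pins `region_idx` in place and all following regions are skipped as well.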
Thanks to Vang Thao for discovering this bug!
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D85498
No verification for pass managers since it is not needed.
No verification for skipped loop passes since the asserted condition is not used.
Add a BeforeNonSkippedPass callback for this. The callback needs more
inputs than its parameters to work so the callback is added on-the-fly.
Reviewed By: aeubanks, asbirlea
Differential Revision: https://reviews.llvm.org/D84977
This reverts commit b497665d98ad5026b1d3d67d5793a28fefe27bea.
Spent some time trying to reproduce this locally, reverting in a
desperate attempt to fix the sanitizer buildbot:
- http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/28828
I don't know exactly why or how this patch breaks the bots, but it seems
pretty concrete that it's the culprit.
Add a small script that sums the *.stats files given as input and outputs the totals.
Usage example:
merge-stats.py $(find ./builddir/ -name "*.stats") > total.stats
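The summing step could look roughly like the following Python sketch. It is not the contents of the script: it assumes each *.stats file holds a flat JSON object mapping counter names to numbers, and `merge_stat_dicts` and `merge_stat_files` are hypothetical names.

```python
import json
from collections import Counter


def merge_stat_dicts(dicts):
    """Sum counters with the same name across several stats dictionaries."""
    totals = Counter()
    for stats in dicts:
        for name, value in stats.items():
            totals[name] += value
    return dict(sorted(totals.items()))


def merge_stat_files(paths):
    """Load each file as JSON (assumed format) and sum the counters."""
    loaded = []
    for path in paths:
        with open(path) as f:
            loaded.append(json.load(f))
    return merge_stat_dicts(loaded)
```

Printing `json.dumps(merge_stat_files(paths), indent=2)` would then give output that can be redirected to a file such as total.stats.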
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D83505
This patch adds the instruction definitions and assembly/disassembly tests for
the following set of instructions:
Vector Extract [byte | half | word | doubleword | quad] with mask
Vector Expand [byte | half | word | doubleword | quad] with mask
Move to VSR [byte | byte immediate | half | word | doubleword | quad] with mask
Vector Count Mask Bits [byte | half | word | doubleword]
Differential Revision: https://reviews.llvm.org/D83724
Introduce a fatal error if any thread local storage code is compiled
using pc relative memory operations as well as a hidden override
option `-enable-ppc-pcrel-tls` so that this support can be incrementally
added if possible.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D85448
dumpDebugStrings() and dumpDebugAbbrev() are no longer used in
macho2yaml. This patch helps remove them.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D85496
As mentioned on D85463, we should be using SimplifyMultipleUseDemandedBits (which is the default fallback).
The minor regression in illegal-bitfield-loadstore.ll will be addressed properly by D77804.
Expand MULHU/MULHS/UMUL_LOHI/SMUL_LOHI for i32 and i64 since
those instructions are not available on Aurora SX VE. Some of them
are used in the expansion of i128 multiply, so they need to be modified
to support i128. Also update the basic arithmetic regression tests for
i128 and signed/unsigned i32 typed integer values.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85490
This removes members of the DIEUnit class which were used only in unit
tests. Note also that child classes shadowed some of these methods,
namely, getDwarfVersion() was overridden in DwarfUnit and getLength()
was overridden in DwarfCompileUnit.
Differential Revision: https://reviews.llvm.org/D85436
This is a split patch of D80991.
This patch introduces AAPotentialValues and its interface only.
For more details on the AAPotentialValues abstract attribute, see the original patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83283
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td
for storing out <vscale x 2 x f32> unpacked scalable vectors. Added
a couple of tests to
test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
Differential Revision: https://reviews.llvm.org/D85441
This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D84622
If it is a load cluster, we don't need to create the dependency edges (SUb->reg) from SUb to SUa,
as they both depend on the base register "reg":
         +-------+
    +----> reg   |
    |    +---+---+
    |        ^
    |        |
    |        |
    |        |
    |    +---+---+
    |    |  SUa  |  Load 0(reg)
    |    +---+---+
    |        ^
    |        |
    |        |
    |    +---+---+
    +----+  SUb  |  Load 4(reg)
         +-------+
But if it is a store cluster, we need to create the edge as shown below, so that the instruction
the store depends on is not scheduled in between SUb and SUa.
         +-------+
    +----> reg   |
    |    +---+---+
    |        ^
    |        |         Missing       +-------+
    |        | +-------------------->+   y   |
    |        | |                     +---+---+
    |    +---+-+-+                       ^
    |    |  SUa  |  Store x 0(reg)       |
    |    +---+---+                       |
    |        ^                           |
    |        |  +------------------------+
    |        |  |
    |    +---+--++
    +----+  SUb  |  Store y 4(reg)
         +-------+
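The two cases can be modeled with a small Python sketch. It is only an illustration of the dependency reasoning, not the scheduler implementation: the `preds` map (node name to the set of nodes it depends on) and `add_cluster_edges` are hypothetical.

```python
def add_cluster_edges(preds, sua, sub, is_load):
    """Cluster SUb after SUa in a toy DAG; preds[n] = nodes n depends on."""
    if not is_load:
        # Store cluster: SUa inherits SUb's dependencies (the "Missing"
        # edge to y), so nothing SUb waits on can land between the pair.
        preds[sua] |= preds[sub] - {sua}
    # Load cluster: no copy needed, both already depend on the base "reg".
    preds[sub].add(sua)  # the cluster edge SUb -> SUa
    return preds
```

For the store diagram, starting from `{"SUa": {"reg"}, "SUb": {"reg", "y"}}`, SUa ends up depending on both `reg` and `y`, which forces y's definition to be scheduled before the clustered pair.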
Reviewed By: evandro, arsenm, rampitec, foad, fhahn
Differential Revision: https://reviews.llvm.org/D72031
This patch is a follow-up to D84733.
If a function has the noundef attribute in the returned position, instructions that return an undef or poison value cause UB.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85178
If we can't identify the alloca used in a lifetime marker, we
need to assume the worst-case scenario.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D84630
addGlobalValueSummary can check the newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85182