This moves the `reportUniqueWarning` method to the base class.
My motivation is the following:
I've experimented with replacing `reportWarning` calls with `reportUniqueWarning`
in ELF dumper. I've found that for example for removing them from `DynRegionInfo` helper
class, it is worth to pass a dumper instance to it (to be able to call dumper()->reportUniqueWarning()).
The problem was that `ELFDumper<ELFT>` is a template class. I had to make `DynRegionInfo` to be templated
and do lots of minor changes everywhere what did not look reasonable/nice.
At the same time I guess one day other dumpers like COFF/MachO/Wasm etc might want to
start using `reportUniqueWarning` API too. Then it looks reasonable to move the logic to the
base class.
With that the problem of passing the dumper instance will be gone.
Differential revision: https://reviews.llvm.org/D92218
This is the #2 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.
This change expands remarks hotness threshold option
-fdiagnostics-hotness-threshold in clang and *-remarks-hotness-threshold in
other tools to utilize hotness threshold from profile summary.
Remarks hotness filtering relies on several driver options. Table below lists
how different options are correlated and affect final remarks outputs:
| profile | hotness | threshold | remarks printed |
|---------|---------|-----------|-----------------|
| No | No | No | All |
| No | No | Yes | None |
| No | Yes | No | All |
| No | Yes | Yes | None |
| Yes | No | No | All |
| Yes | No | Yes | None |
| Yes | Yes | No | All |
| Yes | Yes | Yes | >=threshold |
In the presence of profile summary, it is often more desirable to directly use
the hotness threshold from profile summary. The new argument value 'auto'
indicates threshold will be synced with hotness threshold from profile summary
during compilation. The "auto" threshold relies on the availability of profile
summary. In case of missing such information, no remarks will be generated.
Differential Revision: https://reviews.llvm.org/D85808
This is the #1 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.
This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:
* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='
Differential Revision: https://reviews.llvm.org/D85809
Adds a constructor to MachineModuleInfo and MachineModuleInfoWapperPass that
takes an external MCContext. If provided, the external context will be used
throughout codegen instead of MMI's default one.
This enables external drivers to take ownership of data put on the MMI's context
during codegen. The internal context is used otherwise and destroyed upon
finish.
Differential Revision: https://reviews.llvm.org/D91313
By explicitly requesting the system linker with `-fuse-ld=`, the
tests are able to CHECK for the system linker even with
CLANG_DEFAULT_LINKER=lld.
Alternative to D74704.
Differential Revision: https://reviews.llvm.org/D92291
The lowering of vector selects needs to first splat the scalar mask into a vector
first.
This was causing a crash when building oggenc in the test suite.
Differential Revision: https://reviews.llvm.org/D91655
The existing code handles this correctly and I checked that the code
in NativeInlineSiteSymbol also handles this correctly, but it was
wrong in the NativeFunctionSymbol code.
Differential Revision: https://reviews.llvm.org/D92134
Now that we flush the local value map for every instruction, we don't
need any extra flushes for specific cases. Also, LastFlushPoint is
not used for anything. Follow-ups to #dc35368 (D91734).
Differential Revision: https://reviews.llvm.org/D92338
Enable performing mandatory inlinings upfront, by reusing the same logic
as the full inliner, instead of the AlwaysInliner. This has the
following benefits:
- reduce code duplication - one inliner codebase
- open the opportunity to help the full inliner by performing additional
function passes after the mandatory inlinings, but before th full
inliner. Performing the mandatory inlinings first simplifies the problem
the full inliner needs to solve: less call sites, more contextualization, and,
depending on the additional function optimization passes run between the
2 inliners, higher accuracy of cost models / decision policies.
Note that this patch does not yet enable much in terms of post-always
inline function optimization.
Differential Revision: https://reviews.llvm.org/D91567
Apart from getting the entry in the table (which is already a
separate function), the remaining logic is different for all
alignment types and is better combined with getAlignment().
This is a minor efficiency improvement, and should make further
improvements like using separate storage for different alignment
types simpler.
There's a small number of users of this function, they are all updated.
This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name.
Differential Revision: https://reviews.llvm.org/D78793
If prefaced with a %, expand text macros and macro functions in any statement.
Also, prevent expanding text macros in the message of an ECHO directive unless expanded explicitly by the statement expansion operator.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D89740
The model was committed in 4b8ade837e36b7f0181ce86fc23f33851d0fdd35
but not yet enabled to allow for a few fix ups. This adds a few
of these fixes, and also a LLVM MCA test to check most instructions.
While I do have plans to look into some more tuning, it's time to
enable this as it better than using the A53 schedule.
Differential Revision: https://reviews.llvm.org/D88017
For LP64 mode, this has no effect as pointers are already 64 bits.
For ILP32 mode (x32), this extension is specified by the ABI.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D91338
Pass through the demanded elts mask to the source operands.
The next step will be to add support for folding to add/sub if we only demand odd/even elements.
This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient.
The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function.
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86502
Continue the work started at D50989.
The code has been long dead since the triple has been removed (D75494).
Reviewed By: nickdesaulniers, void
Differential Revision: https://reviews.llvm.org/D91836
The mapping between registers and relative size has been updated to
use TypeSize to account for the size of scalable EVTs.
The patch is a NFCI, if not for the fact that with this change the
function `getUnderlyingArgRegs` does not raise a warning for implicit
conversion of `TypeSize` to `unsigned` when generating machine code
from the test added to the patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92096
Optimize prologue/epilogue instructions if a given function use GOT but
do not call other functions by eliminating FP. Previously, we had wrong
implementations taken from other architectures. Update regression tests
also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92313
Previously, these check routines accepted non-generatble instructions.
This time, I clean them and add assert for those non-generatable
instructions.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92254
This was suggested in D92247 - I initially committed an alternate
fix ( bfd2c216ea ) to avoid the crash/assert shown in
https://llvm.org/PR48296 ,
but that was reverted because it caused msan failures on other
tests. We can try to revive that patch using the test included
here, but I do not have an immediate plan to isolate that problem.
This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines.
I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW.
Differential Revision: https://reviews.llvm.org/D92253
clang may produce `movl x@GOTPCREL+4(%rip), %eax` when loading the high
32 bits of the address of a global variable in -fpic/-fpie mode.
If assembled by GNU as, the fixup emits R_X86_64_GOTPCRELX with an addend != -4.
The instruction loads from the GOT entry with an offset and thus it is incorrect
to relax the instruction.
This patch does not emit a relaxable relocation for a GOT load with an offset
because R_X86_64_[REX_]GOTPCRELX do not make sense for instructions which cannot
be relaxed. The result is good enough for LLD to work. GNU ld relaxes
mov+GOTPCREL as well, but it suppresses the relaxation if addend != -4.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D92114
GORCI performs an OR between each stage. So we need to ensure only
one stage is active before doing this combine.
Initial attempts at finding a test case for this failed due to
the order things get combined. It's most likely that we'll form
one stage of GREVI then combine to GORCI before the two stages of
GREVI are able to be formed and combined with each other to form
a multi stage GREVI.
Differential Revision: https://reviews.llvm.org/D92289
https://llvm.org/PR48296 shows an example where we delete all of the operands
of a phi without actually deleting the phi, and that is currently considered
invalid IR. The reduced test included here would crash for that reason.
A suggested follow-up is to loosen the assert to allow 0-operand phis
in unreachable blocks.
Differential Revision: https://reviews.llvm.org/D92247