Summary:
Machine Block Frequency Info (MBFI) is being computed but unused in AsmPrinter.
MBFI computation was introduced with PGO change D71149 and then its use was
removed in D71106. No need to keep computing it.
Reviewers: MaskRay, jyknight, skan, yamauchi, davidxl, efriedma, huihuiz
Reviewed By: MaskRay, skan, yamauchi
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78526
xray_instr_map contains absolute addresses of sleds, which are relocated
by `R_*_RELATIVE` when linked in -pie or -shared mode.
By making these addresses relative to PC, we can avoid the dynamic
relocations and remove the SHF_WRITE flag from xray_instr_map. We can
thus save VM pages containg xray_instr_map (because they are not
modified).
This patch changes x86-64 and bumps the sled version to 2. Subsequent
changes will change powerpc64le and AArch64.
Reviewed By: dberris, ianlevesque
Differential Revision: https://reviews.llvm.org/D78082
These are mostly replicated from D78430 (instsimplify).
If we implement more general transforms for instcombine,
then we probably don't need to add that complexity to instsimplify.
AbstractAttribute::initialize is used to initialize the deduction and
the object we do not always call it. To make sure we have the option to
initialize the object even if initialize is not called we pass the
Attributor to AbstractAttribute constructors now.
This is an optimization that applies to global addresses and
allows for the following transformation:
Convert this:
paddi r3, 0, symbol@PCREL, 1
ld r4, 8(r3)
To this:
pld r4, symbol@PCREL+8(0), 1
An instruction is saved and the linker can do the addition when
the symbol is resolved.
Differential Revision: https://reviews.llvm.org/D76160
Summary:
The repeated use of std::next() on a MachineBasicBlock::iterator was
clever, but we only need to reconstruct the iterator post creation of
the spill instruction.
This helps simplifying where we plan to place the spill, as discussed in
D77849.
From here, we can simplify the code a little by flipping the return code
of a helper.
Reviewers: efriedma
Reviewed By: efriedma
Subscribers: qcolombet, hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78520
This was yet another function that had to be updated whenever you added
a new register class. Remove it by refactoring its only caller to use
standard helper functions from SIRegisterInfo.
Differential Revision: https://reviews.llvm.org/D78557
Summary:
Without this we could silently accept an invalid prologue because the
default DataExtractor behavior is to return an empty string when
reaching the end of file. And empty string is also used to terminate
these lists.
This makes the parsing code slightly more complicated, but this
complexity will go away once the parser starts working with truncating
data extractors. The reason I am doing it this way is because without
this, the truncation would regress the quality of error messages (right
now, we produce bad error messages only near EOF, but truncation would
make everything behave as if it was near EOF).
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77555
Summary:
This constructor allows us to create a new DWARFDataExtractor which will
only present a subrange of an entire debug section. Since debug sections
typically consist of multiple contributions, it is expected that one
will create a new data extractor for each contribution in order to
avoid unexpectedly running off into the next one.
This is very useful for unifying the flows for detecting parse errors.
Without it, the code needs to consider two very different scenarios:
1. If there is another contribution after the current one, the
DataExtractor functions will just start reading from there. This is
detectable by comparing the current offset against the known
end-of-contribution offset.
2. If this is the last contribution, the data extractor will just start
returning zeroes (or other default values). This situation can *not*
be detected by checking the parsing offset, as this will not be
advanced in case of errors.
Using a truncated data extractor simplifies the code (and reduces
cognitive load) by making these two cases behave identically -- a
running off the end of a contribution will _always_ produce an EOF error
(if one uses error-aware parsing methods) or return default values.
Reviewers: dblaikie, probinson, jhenderson, ikudrin
Subscribers: aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77556
The individual tryTo* helpers do not need to be public. Also, the
builder contained two consecutive public: sections, which is not
necessary. Moved the remaining public methods after the constructor.
Also make some of the tryTo* helpers const.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed by: gilr
Differential Revision: https://reviews.llvm.org/D78288
This validates that sections listed for a segment in the YAML
declaration are ordered by their file offsets.
It might help to simplify the file size computation, but also
is useful by itself as helps to avoid issues in test cases and
to maintain their readability.
Differential revision: https://reviews.llvm.org/D78361
Replace Twine.h/SourceMgr.h includes with forward declarations and include in TGParser.cpp
Remove forward declarations we already have to include in Record.h
Summary:
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header in
order to follow other architectures.
Differential Revision: https://reviews.llvm.org/D78543
This API call has been used recently with, a very valid, expectation
that it would do something useful but it doesn't actually query any
backend information. So, remove this method and merge its
functionality into getUserCost. As well as that, also use
getCastInstrCost to get a proper cost from the backend for the
concerned instructions though we only currently return the answer if
it's considered free. The default implementation now also checks
int/ptr conversions too, as well as truncs and bitcasts.
Differential Revision: https://reviews.llvm.org/D76124
The logic in ARMParallelDSP is setup to merge two 16-bits loads into
a 32-bit load and feed them into the smlads. This requires that four
loads are combined for the four inputs, but there wasn't actually a
check for this.
Differential Revision: https://reviews.llvm.org/D78492
This code was added in 887efa51c1e0e43ca684ed78b92dbc3a0720881b to
fix reverse iteration.
The call to InsertIntoBucket/InsertIntoBucketWithLookup can change
the number of buckets which will invalidate the BucketEnd. So
don't cache it and calculate it when creating the iterator.
Before we kept the first applicable `ident_t*` during deduplication of
runtime calls. The problem is that "first" is dependent on the iteration
order of a DenseMap. Since the proper solution, which is to combine the
information from all `ident_t*`, should be deterministic on its own, we
will not try to make the iteration order deterministic. Instead, we will
create a fresh `ident_t*` if there is not a unique existing `ident_t*`
to pick.
Don't error on Config.KeepFileSymbols for COFF and Mach-O.
Original description:
GNU objcopy removes STT_FILE symbols for strip-debug operations, and
keeps them for --discard-all operation. Match their behaviour for
llvm-objcopy.
Bug: https://github.com/android/ndk/issues/1212
Differential Revision: https://reviews.llvm.org/D76675
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
For the test case in this patch like below
struct t { int a; } __attribute__((preserve_access_index));
int foo(void *);
int test(struct t *arg) {
long param[1];
param[0] = (long)&arg->a;
return foo(param);
}
The IR right before BPF SimplifyPatchable phase:
%1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
%2:gpr = LDD killed %1:gpr, 0
%3:gpr = ADD_rr %0:gpr(tied-def 0), killed %2:gpr
STD killed %3:gpr, %stack.0.param, 0
After SimplifyPatchable phase, the incorrect IR is generated:
%1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
%3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
CORE_MEM killed %3:gpr, 306, %0:gpr, @"llvm.t:0:0$0:0"
Note that CORE_MEM pseudo op is introduced to encode
memory operations related to CORE. In the above, we intend
to check whether we have a store like
*(%3:gpr + 0) = ...
and if this is the case, we could replace it with
*(%0:gpr + @"llvm.t:0:0$0:0"_ = ...
Unfortunately, in the above, IR for the store is
*(%stack.0.param + 0) = %3:gpr
and transformation should not happen.
Note that we won't have problem if the actual CORE
dereference (arg->a) happens.
This patch fixed the problem by skip CORE optimization if
the use of ADD_rr result is not the base address of the store
operation.
Differential Revision: https://reviews.llvm.org/D78466
We now also use the BumpPtrAllocator from the Attributor in the
InformationCache. The lifetime of objects in either is pretty much the
same and it should result in consistently good performance regardless of
the allocator.
Doing so requires to call more constructors manually but so far that
does not seem to be problematic or messy.
---
Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):
Before:
```
calls to allocation functions: 615359 (368257/s)
temporary memory allocations: 83315 (49859/s)
peak heap memory consumption: 75.64MB
peak RSS (including heaptrack overhead): 163.43MB
total memory leaked: 269.04KB
```
After:
```
calls to allocation functions: 613042 (359555/s)
temporary memory allocations: 83322 (48869/s)
peak heap memory consumption: 75.64MB
peak RSS (including heaptrack overhead): 162.92MB
total memory leaked: 269.04KB
```
Difference:
```
calls to allocation functions: -2317 (-68147/s)
temporary memory allocations: 7 (205/s)
peak heap memory consumption: 2.23KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
---