See [GRP_COMDAT group with STB_LOCAL signature](https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc)
objcopy PR: https://sourceware.org/bugzilla/show_bug.cgi?id=27931
GRP_COMDAT deduplication is purely based on the signature symbol name in
ld.lld/GNU ld/gold. The local/global status is not part of the equation.
If the signature symbol is localized by --localize-hidden or
--keep-global-symbol, the intention is likely to make the group fully
localized. Drop GRP_COMDAT to suppress deduplication.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D106782
Summary:
Add support for the basic section stripping (and keeping) flags for wasm:
strip with no flags, --strip-all, --strip-debug,
--only-section, --keep-section, and --only-keep-debug.
Factor section removal into a function and use a predicate chain like
the ELF implementation.
Reviewers: jhenderson, sbc100
Differential Revision: https://reviews.llvm.org/D73820
Some users use a long list of fixed patterns (PR50404) and
O(|patterns|*|symbols|) can be too slow. Such usage typically does not use
--regex or --wildcard. We can use a DenseSet<CachedHashStringRef> to optimize
name lookups.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D105218
GNU and Apple `strip` implementations seems to support grouped options.
Enable the support for grouped options introduced in
https://reviews.llvm.org/D83639 for `llvm-strip` invocations.
Includes test that checks that both the grouped and non grouped
invocations produces the same result.
Reviewed By: alexander-shaposhnikov, MaskRay
Differential Revision: https://reviews.llvm.org/D105249
The load command is currently specific to arm64 and holds information
for instruction rewriting, e.g. converting a GOT load to an ADR to
compute a local address.
(On ELF the information is usually conveyed by relocations, e.g.
R_X86_64_REX_GOTPCRELX, R_PPC64_TOC16_HA)
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D104968
An ARM64_RELOC_ADDEND relocation reuses the symbol field for the addend value.
We should pass through such relocations.
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D104967
This reverts commit c94cf97b53566a26245c54ea0c41b0dc83daf8a0
since it appears to have broken linaro-clang-armv7-quick build bot
and needs further investigation.
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
There is no need to differentiate whether `UseSegments` is true or
false. Unifying the cases makes the behavior closer to BinaryWriter.
This improves compatibility with objcopy because SHF_ALLOC sections not in
a PT_LOAD will not be skipped. Such cases are usually erroneous input, though.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104186
Since this only comes up with inputs containing sections at least 4GB
large (I guess I could use a bzero section or something, so the input
file doesn't have to be 4GB, but even then the output file would have to
be 4GB, right?) I've skipped testing this. If there's a nice way to test
this without needing 4GB inputs or output files.
The subtlety here is demonstrated by this code:
struct t { operator uint64_t(); };
static_assert(std::is_same_v<int, decltype(std::declval<bool>() ? 0 : std::declval<t>())>);
static_assert(std::is_same_v<uint64_t, decltype(std::declval<bool>() ? 0 : std::declval<uint64_t>())>);
Because of this difference, the original source code was getting an int
type (truncating the actual size) and then extending it again, resulting
in bogus values (I haven't thought through this hard enough to explain
why the resulting value was 0xffff... - sign extension, possible UB, but
in any case it's the wrong answer - in this particular case I was
looking at that resulted in a size so large that we couldn't open a file
large enough to write to and ended up with a rather vague:
error: 'file_name.o': Invalid argument
IHexWriter was evaluating a section's physical address when deciding if
that section should be written to an output. This approach does not
account for a zero-sized section that has the same physical address as a
sized section. The behavior varies from GNU objcopy, and may result in a
HEX file that does not include all program sections.
The IHexWriter now excludes zero-sized sections when deciding what
should be written to the output. This affects the contents of the
writer's `Sections` collection; we will not try to insert multiple
sections that could have the same physical address. The behavior seems
consistent with GNU objcopy, which always excludes empty sections,
no matter the address.
The new test case evaluates the IHexWriter behavior when provided a
variety of empty sections that overlap or append a filled section. See
the input file's comments for more information. Given that test input,
and the change to the IHexWriter, GNU objcopy and llvm-objcopy produce
the same output.
Reviewed By: jhenderson, MaskRay, evgeny777
Differential Revision: https://reviews.llvm.org/D101332
During reviewing D102277 it was decided to remove lazy options processing
from llvm-objcopy CopyConfig structure. This patch transforms processing of ELF
lazy options into the in-place processing.
Differential Revision: https://reviews.llvm.org/D103260
This will allow to use llvm-strip with file names that begin with dashes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102825
This patch prepares llvm-objcopy to move its implementation
into a separate library. To make it possible it is necessary
to minimize internal dependencies.
Differential Revision: https://reviews.llvm.org/D99055
This will allow to use llvm-objcopy with file names that begin with dashes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102665
PR50160: we currently ignore non-PT_PHDR segments with no sections, not
accounting for its p_offset and p_filesz: this can cause an out-of-bounds write
in `writeSegmentData` if the p_offset+p_filesz is larger than the total file
size.
This can be fixed by setting p_offset=p_filesz=0. The logic nicely unifies with
the logic added in D90897.
Reviewed By: jhenderson, rupprecht
Differential Revision: https://reviews.llvm.org/D101560
Fix PR45416: the diagnostic when '=' is missing is misleading.
`FileOutputBuffer::create` returns successfully when the filename is empty
(the temporary file is `.tmp%%%%%%%`), but `FileOutputBuffer::commit` will error when
renaming `.tmp%%%%%%%` to the empty name).
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D101697
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D101384
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D101384
writeToOutput function is useful when it is necessary to create different kinds
of streams(based on stream name) and when we need to use a temporary file
while writing(which would be renamed into the resulting file in a success case).
This patch moves the writeToStream helper into the Support library.
Differential Revision: https://reviews.llvm.org/D98426
This patch is follow-up for D91028. It implements direct writing into the
output stream for wasm.
Depends on D91028
Differential Revision: https://reviews.llvm.org/D95478
This patch removes creation of the resulting file from the
executeObjcopyOnBinary() function. For the most use cases, the
executeObjcopyOnBinary receives output file as a parameter
- raw_ostream &Out. The splitting .dwo file is implemented differently:
file containg .dwo tables is created inside executeObjcopyOnBinary().
When objcopy functionality would be moved into separate library,
current implementation will become inconvenient. The goal of that
refactoring is to separate concerns: It might be convenient to
to do dwo tables splitting but to create resulting file differently.
Differential Revision: https://reviews.llvm.org/D98582
The D93881 added functionality which preserve ownership for output file
if llvm-objcopy is called under root. That code was added into the place
where output file is created. The llvm-objcopy already has a function which
sets/restores rights/permissions for the output file.
That is the restoreStatOnFile() function. This patch moves code
(preserving ownershipping) into the restoreStatOnFile() function.
Differential Revision: https://reviews.llvm.org/D98511
During D88827 it was requested to remove the local implementation
of Memory/File Buffers:
// TODO: refactor the buffer classes in LLVM to enable us to use them here
// directly.
This patch uses raw_ostream instead of Buffers. Generally, using streams
could allow us to reduce memory usages. No need to load all data into the
memory - the data could be streamed through a smaller buffer.
Thus, this patch uses raw_ostream as an interface for output data:
Error executeObjcopyOnBinary(CopyConfig &Config,
object::Binary &In,
raw_ostream &Out);
Note 1. This patch does not change the implementation of Writers
so that data would be directly stored into raw_ostream.
This is assumed to be done later.
Note 2. It would be better if Writers would be implemented in a such way
that data could be streamed without seeking/updating. If that would be
inconvenient then raw_ostream could be replaced with raw_pwrite_stream
to have a possibility to seek back and update file headers.
This is assumed to be done later if necessary.
Note 3. Current FileOutputBuffer allows using a memory-mapped file.
The raw_fd_ostream (which could be used if data should be stored in the file)
does not allow us to use a memory-mapped file. Memory map functionality
could be implemented for raw_fd_ostream:
It is possible to add resize() method into raw_ostream.
class raw_ostream {
void resize(uint64_t size);
}
That method, implemented for raw_fd_ostream, could create a memory-mapped file.
The streamed data would be written into that memory file then.
Thus we would be able to use memory-mapped files with raw_fd_ostream.
This is assumed to be done later if necessary.
Differential Revision: https://reviews.llvm.org/D91028
This diff introduces --keep-undefined in llvm-objcopy/llvm-strip for Mach-O
which makes the tools preserve undefined symbols.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D97040
The code was using the standard isalnum function which doesn't handle
values outside the non-ascii range. Switching to using llvm::isAlnum
instead ensures we don't provoke undefined behaviour, which can in some
cases result in crashes.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97663
The help text and documentation for the --discard-all option failed to
mention that the option also causes the removal of debug sections. This
change fixes both for both llvm-objcopy and llvm-strip.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D97662
The check for whether an extended symbol index table was required
dropped the first SHN_LORESERVE sections from the sections array before
checking whether the remaining sections had symbols. Unfortunately, the
null section header is not present in this list, so the check was
skipping the first section that might be important. If that section
contained a symbol, and no subsequent ones did, the .symtab_shndx
section would not be emitted, leading to a corrupt object.
Also consolidate and expand test coverage in the area to cover this bug
and other aspects of the SYMTAB_SHNDX section.
Reviewed by: alexshap, MaskRay
Differential Revision: https://reviews.llvm.org/D97661
This makes the behavior similar to cp
```
chmod u+s,g+s,o+x a
sudo llvm-strip a -o b
// With this patch, b drops set-user-ID and set-group-ID bits.
// sudo cp a b => b does not have set-user-ID or set-group-ID bits.
```
This also changes the behavior for the following case:
```
chmod u+s,g+s,o+x a
llvm-strip a
// a preserves set-user-ID and set-group-ID bits.
// This matches binutils<2.36 and probably >=2.37. 2.36 and 2.36.1 have some compatibility issues.
```
Differential Revision: https://reviews.llvm.org/D97253
The few options are niche. They solved a problem which was traditionally solved
with more shell commands (`llvm-readelf -n` fetches the Build ID. Then
`ln` is used to hard link the file to a directory derived from the Build ID.)
Due to limitation, they are no longer used by Fuchsia and they don't appear to
be used elsewhere (checked with Google Search and Debian Code Search). So delete
them without a transition period.
Announcement: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148446.html
Differential Revision: https://reviews.llvm.org/D96310
As of binutils 2.36, GNU strip calls chown(2) for "sudo strip foo" and
"sudo strip foo -o foo", but no "sudo strip foo -o bar" or "sudo strip
foo -o ./foo". In other words, while "sudo strip foo -o bar" creates a
new file bar with root access, "sudo strip foo" will keep the owner and
group of foo unchanged. Currently llvm-objcopy and llvm-strip behave
differently, always changing the owner and gropu to root. The
discrepancy prevents Chrome OS from migrating to llvm-objcopy and
llvm-strip as they change file ownership and cause intended users/groups
to lose access when invoked by sudo with the following sequence
(recommended in man page of GNU strip).
1.<Link the executable as normal.>
1.<Copy "foo" to "foo.full">
1.<Run "strip --strip-debug foo">
1.<Run "objcopy --add-gnu-debuglink=foo.full foo">
This patch makes llvm-objcopy and llvm-strip follow GNU's behavior.
Link: crbug.com/1108880
This is consistent with BFD objcopy.
Previously llvm objcopy would allocate space for SHT_NOBITS sections
often resulting in enormous binary files.
New test case (binary-paddr.test %t6).
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D95569