1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

43888 Commits

Author SHA1 Message Date
Florian Hahn
f1e4600eb5 [LTO] Update splitCodeGen to take a reference to the module. (NFC)
splitCodeGen does not need to take ownership of the module, as it
currently clones the original module for each split operation.

There is an ~4 year old fixme to change that, but until this is
addressed, the function can just take a reference to the module.

This makes the transition of LTOCodeGenerator to use LTOBackend a bit
easier, because under some circumstances, LTOCodeGenerator needs to
write the original module back after codegen.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95222
2021-01-29 11:53:11 +00:00
Kazu Hirata
606fed8845 [llvm] Forward-declare formatted_raw_ostream (NFC)
Various *TargetStreamer.h need formatted_raw_ostream but rely on a
forward declaration of formatted_raw_ostream in MCStreamer.h.  This
patch adds forward declarations right in *TargetStreamer.h.

While we are at it, this patch removes the one in MCStreamer.h, where
it is unnecessary.
2021-01-28 22:21:13 -08:00
Christudasan Devadasan
e86b7c773c Support a list of CostPerUse values
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.

For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D86836
2021-01-29 10:14:52 +05:30
Duncan P. N. Exon Smith
a126d2972b ADT: Add SFINAE to the generic IntrusiveRefCntPtr constructors
Add an `enable_if` to the generic `IntrusiveRefCntPtr` constructors so
that std::is_convertible gives an honest answer when the underlying
pointers cannot be converted. Added `static_assert`s to the test suite
to verify.

Also combine generic constructors from `IntrusiveRefCntPtr<X>&&` and
`const IntrusiveRefCntPtr<X>&`. At first glance this appears to be an
infinite loop, but the real copy/move constructors are spelled out
separately above. Added a unit test to verify.

Differential Revision: https://reviews.llvm.org/D95498
2021-01-28 15:07:27 -08:00
Jessica Paquette
c1671731c6 Recommit "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
Recommit of 4580acf6752ea3cc884657b5aa3e174bed86fc8c

`Opc = DefMI->getOpcode()` was in the wrong place.
2021-01-28 14:43:00 -08:00
Jessica Paquette
2033979607 Revert "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
This reverts commit 4580acf6752ea3cc884657b5aa3e174bed86fc8c.

Reverting while looking into some test failures.
2021-01-28 14:37:57 -08:00
Jessica Paquette
fe6b2e148e [GlobalISel] Walk through hints in getDefIgnoringCopies et al
Treat hint instructions like G_ASSERT_ZEXT like COPY instructions in helpers
which walk through copies.

This ensures that instructions like G_ASSERT_ZEXT won't impact any optimizations
that rely on these helpers.

Differential Revision: https://reviews.llvm.org/D95577
2021-01-28 14:27:00 -08:00
Cassie Jones
e11c57fcf5 [GlobalISel] Implement widenScalar for carry-in add/sub
These are widened to a wider UADDE/USUBE, with the overflow value
unused, and with the same synthesis of a new overflow value as for the
O operations.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95326
2021-01-28 17:06:24 -05:00
Jessica Paquette
3fcc23c823 [GlobalISel] Add G_ASSERT_ZEXT
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.

This is intended to be similar to AssertZext in SelectionDAG.

For example,

```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```

Signifies that the top 48 bits of %x are known to be 0.

This is useful in cases like this:

```
define i1 @zeroext_param(i8 zeroext %x) {
  %cmp = icmp ult i8 %x, -20
  ret i1 %cmp
}
```

In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit
value.

If we know that `%x` is already zero-ed out in the relevant high bits, we can
avoid the truncate.

Currently, in GISel, this looks like this:

```
_zeroext_param:
  and w8, w0, #0xff ; We don't actually need this!
  cmp w8, #236
  cset w0, lo
  ret
```

While SDAG does not produce the truncation, since it knows that it's
unnecessary:

```
_zeroext_param:
  cmp w0, #236
  cset w0, lo
  ret
```

This patch

- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it

It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)

This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.

Differential Revision: https://reviews.llvm.org/D95564
2021-01-28 13:58:37 -08:00
Greg Clayton
4dc036b075 Add the ability to extract the unwind rows from DWARF Call Frame Information.
This patch adds the ability to evaluate the state machine for CIE and FDE unwind objects and produce a UnwindTable with all UnwindRow objects needed to unwind registers. It will also dump the UnwindTable for each CIE and FDE when dumping DWARF .debug_frame or .eh_frame sections in llvm-dwarfdump or llvm-objdump. This allows users to see what the unwind rows actually look like for a given CIE or FDE instead of just seeing a list of opcodes.

This patch adds new classes: UnwindLocation, RegisterLocations, UnwindRow, and UnwindTable.

UnwindLocation is a class that describes how to unwind a register or Call Frame Address (CFA).

RegisterLocations is a class that tracks registers and their UnwindLocations. It gets populated when parsing the DWARF call frame instruction opcodes for a unwind row. The registers are mapped from their register numbers to the UnwindLocation in a map.

UnwindRow contains the result of evaluating a row of DWARF call frame instructions for the CIE, or a row from a FDE. The CIE can produce a set of initial instructions that each FDE that points to that CIE will use as the seed for the state machine when parsing FDE opcodes. A UnwindRow for a CIE will not have a valid address, whille a UnwindRow for a FDE will have a valid address.

The UnwindTable is a class that contains a sorted (by address) vector of UnwindRow objects and is the result of parsing all opcodes in a CIE, or FDE. Parsing a CIE should produce a UnwindTable with a single row. Parsing a FDE will produce a UnwindTable with one or more UnwindRow objects where all UnwindRow objects have valid addresses. The rows in the UnwindTable will be sorted from lowest Address to highest after parsing the state machine, or an error will be returned if the table isn't sorted. To parse a UnwindTable clients can use the following methods:

    static Expected<UnwindTable> UnwindTable::create(const CIE *Cie);
    static Expected<UnwindTable> UnwindTable::create(const FDE *Fde);

A valid table will be returned if the DWARF call frame instruction opcodes have no encoding errors. There are a few things that can go wrong during the evaluation of the state machine and these create functions will catch and return them.

Differential Revision: https://reviews.llvm.org/D89845
2021-01-28 13:39:17 -08:00
Reid Kleckner
365f97bc0d Revert "[PDB] Defer relocating .debug$S until commit time and parallelize it"
This reverts commit 1a9bd5b81328adf0dd5a8b4f3ad5949463e66da3.

I suspect that this patch may have caused https://crbug.com/1171438.
2021-01-28 13:17:27 -08:00
Thomas Lively
30175dd5d9 [WebAssembly] Prototype i8x16 to i32x4 widening instructions
As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the
opcodes used in V8:
https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h

Differential Revision: https://reviews.llvm.org/D95557
2021-01-28 10:59:32 -08:00
David Blaikie
266d73d7f0 DebugInfo: Add a DWARF FORM extension for addrx+offset references to reduce relocations
This is an alternative to the use of complex DWARF expressions for
addresses - shaving off a few extra bytes of expression overhead.
2021-01-28 10:20:02 -08:00
Simon Pilgrim
0cbd016ac6 [APFloat] Remove orphan ilogb(DoubleAPFloat) declaration. NFCI. 2021-01-28 15:18:25 +00:00
Simon Pilgrim
984ce3745a [APFloat] scalbn - pass DoubleAPFloat arg as const-ref. NFCI.
Avoid unnecessary copy and fix clang-tidy warning.
2021-01-28 15:18:24 +00:00
Stefan Gränitz
2f98bedc10 [Orc] Remove unused header from TPC server
The header would include OrcJIT headers in OrcTargetProcess, which is not desired. All common declarations should be in OrcShared.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D95606
2021-01-28 14:16:49 +01:00
Simon Pilgrim
5bf02200da [DebugInfo] Remove some unused includes. NFCI.
Mainly removing a lot of <vector> includes from files that don't explicitly use std::vector
2021-01-28 11:21:35 +00:00
Georgii Rymar
441aa2949d [yaml2obj] - Allow empty SectionHeaderTable definitions.
Currently we don't allow the following definition:

```
Sections:
  - Type: SectionHeaderTable
  - Name: .foo
    Type: SHT_PROGBITS
```

We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table".

It was implemented in this way earlier, when `SectionHeaderTable`
was a dedicated key outside of the `Sections` list. And we did not
allow to select where the table is written.

Currently it makes sense to allow it, because a user might
want to place the default section header table at an arbitrary position,
e.g. before other sections. In this case it is not convenient and error prone
to require specifying all sections:

```
Sections:
  - Type: SectionHeaderTable
    Sections:
      - Name: .foo
      - Name: .strtab
      - Name: .shstrtab
  - Name: .foo
    Type: SHT_PROGBITS
```

This patch allows empty SectionHeaderTable definitions.

Differential revision: https://reviews.llvm.org/D95341
2021-01-28 10:51:52 +03:00
Kazu Hirata
3902d5e888 [DebugInfo] Forward-declare PDBFile (NFC)
NativeEnumInjectedSources.h needs PDBFile but relies on a
forward declaration of PDBFile in InjectedSourceStream.h.
This patch adds a forward declaration right in
NativeEnumInjectedSources.h.

While we are at it, this patch removes the one in
InjectedSourceStream.h, where it is unnecessary.
2021-01-27 23:25:38 -08:00
Hongtao Yu
9b80fe63e4 [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Fangrui Song
612a586751 [llvm-c] Move LLVMX86_AMXTypeKind & LLVMPoisonValueValueKind to the bottom to avoid value changes compared with LLVM<=11
Fixes PR48905
2021-01-27 16:28:04 -08:00
Teresa Johnson
75870000ad [LTO] Prevent devirtualization for symbols dynamically exported
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Differential Revision: https://reviews.llvm.org/D91583
2021-01-27 15:54:13 -08:00
James Y Knight
37f683eeb0 Itanium Mangling: Mangle __alignof__ differently than alignof.
The two operations have acted differently since Clang 8, but were
unfortunately mangled the same. The new mangling uses new "vendor
extended expression" syntax proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112

GCC had the same mangling problem, https://gcc.gnu.org/PR88115, and
will hopefully be switching to the same mangling as implemented here.

Additionally, fix the mangling of `__uuidof` to use the new extension
syntax, instead of its previous nonstandard special-case.

Adjusts the demangler accordingly.

Differential Revision: https://reviews.llvm.org/D93922
2021-01-27 16:46:51 -05:00
Varun Gandhi
52a271678f [Demangle] Support demangling Swift calling convention in MS demangler.
Previously, Clang was able to mangle the Swift calling
convention but 'MicrosoftDemangle.cpp' was not able to demangle it.

Reviewed By: compnerd, rnk

Differential Revision: https://reviews.llvm.org/D95053
2021-01-27 13:24:54 -08:00
Sanjay Patel
2ae45edb62 [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)
I am trying to untangle the fast-math-flags propagation logic
in the vectorizers (see a6f022127 for SLP).

The loop vectorizer has a mix of checking FP function attributes,
IR-level FMF, and just wrong assumptions.

I am trying to avoid regressions while fixing this, and I think
the IR-level logic is good enough for that, but it's hard to say
for sure. This would be the 1st step in the clean-up.

The existing test that I changed to include 'fast' actually shows
a miscompile: the function only had the equivalent of nnan, but we
created new instructions that had fast (all FMF set). This is
similar to the example in https://llvm.org/PR35538

Differential Revision: https://reviews.llvm.org/D95452
2021-01-27 14:17:11 -05:00
Fangrui Song
b3b970744d [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variable get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Craig Topper
96c406a887 [FaultsMaps][llvm-objdump] Move FaultMapParser to Object/. Remove CodeGen dependency from llvm-objdump
FaultsMapParser lived in CodeGen and was forcing llvm-objdump to
link CodeGen and everything CodeGen depends on.

This was previously attempted in r240364 to fix a link failure.
The CodeGen dependency was independently added to fix the same
link failure, and that ended up being kept.

Removing the dependency seems like the correct layering for
llvm-objdump.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D95414
2021-01-27 10:39:59 -08:00
Valentin Clement
f7b64e36c0 [flang][openacc] Allow multiple wait clauses
kernels loop and enter data had a too restrictive constraint for the wait clause.
The wait clause is allowed multiple times and not only once. This patch fix this problem.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95469
2021-01-27 13:18:46 -05:00
Florian Hahn
5d270f12d2 [LoopUtils] Pass SCEVExpander instead SE to addRuntimeChecks.
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.

Used in D75980.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94295
2021-01-27 17:36:19 +00:00
Valentin Clement
1973b57fb0 [flang][openacc] Fix clause restriction for exit data directive
Restriction on clauses for the EXIT DATA directive were not fully correct.
This patch fixes the situation. The async, if and finalize clauses are allowed
only once.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95470
2021-01-27 10:07:19 -05:00
Valentin Clement
202f767031 [flang][openacc] Fix clause restriction for host_data directive
Restriction on clauses for the HOST_DATA directive were not fully correct.
This patch fixes the situation. The if and if_present clauses are allowed
only once.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95473
2021-01-27 10:06:33 -05:00
Kazu Hirata
d493c03566 [AMDGPU] Forward-declare TargetRegisterClass (NFC)
AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a
forward declaration of TargetRegisterClass in InstructionSelector.h.
This patch adds a forward declaration right in
AMDGPUInstructionSelector.h.

While we are at it, this patch removes the one in
InstructionSelector.h, where it is unnecessary.
2021-01-26 20:00:16 -08:00
Petr Hosek
3cfd18b793 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Duncan P. N. Exon Smith
795af9c7de Frontend: Simplify handling of non-seeking streams in CompilerInstance, NFC
Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which
is like `buffer_ostream` but with unique ownership of the stream it's
wrapping. Use this in CompilerInstance to simplify the ownership of
non-seeking output streams, avoiding logic sprawled around to deal with
them specially.

This also simplifies future work to encapsulate output files in a
different class.

Differential Revision: https://reviews.llvm.org/D93260
2021-01-26 15:20:43 -08:00
Haowei Wu
4cba78c0ca [llvm-elfabi] Support ELF file that lacks .gnu.hash section
Before this change, when reading ELF file, elfabi determines number of
entries in .dynsym by reading the .gnu.hash section. This change makes
elfabi read section headers directly first. This change allows elfabi
works on ELF files which do not have .gnu.hash sections.

Differential Revision: https://reviews.llvm.org/D93362
2021-01-26 12:31:52 -08:00
Fangrui Song
71834ae8ab Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by latest GNU as, but we have to use
MCAsmInfo::useIntegratedAs() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Petr Hosek
ef4906bd9c Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a9e20bd5df3cb47283715f0ff38b751 because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Petr Hosek
a8839c989d Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Simon Pilgrim
95a58e9335 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
Sander de Smalen
466208abea [CostModel] Handle CTLZ and CCTZ in getTypeBasedIntrinsicInstrCost
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D95355
2021-01-26 14:37:51 +00:00
Sebastian Neubauer
3bc57e4094 [AMDGPU] Add IntrWillReturn to three intrinsics
None of these can terminate a wave or lane.
With these, all intrinsic are IntrWillReturn except those that change
exec or can terminate the wave.

Not marking intrinsics as WillReturn may prevent optimizations in the
future: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148047.html

Differential Revision: https://reviews.llvm.org/D95436
2021-01-26 15:33:15 +01:00
Florian Hahn
752c4c588e [Passes] Run peeling as part of simple/full loop unrolling.
Loop peeling removes conditions from loop bodies that become invariant
after a small number of iterations. When triggered, this leads to fewer
compares and possibly PHIs in loop bodies, enabling further
optimizations. The current cost-model of loop peeling should be quite
conservative/safe, i.e. only peel if a condition in the loop becomes
known after peeling.

For example, see PR47671, where loop peeling enables vectorization by
removing a PHI the vectorizer does not understand. Granted, the
loop-vectorizer could also be taught about constant PHIs, but loop
peeling is likely to enable other optimizations as well.

This has an impact on quite a few benchmarks from
MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto, for example

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsVectorized

    Program                                        base   patch  diff
     test-suite...ve-susan/automotive-susan.test     8.00   9.00 12.5%
     test-suite...nal/skidmarks10/skidmarks.test    35.00  31.00 -11.4%
     test-suite...lications/sqlite3/sqlite3.test    41.00  43.00  4.9%
     test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test    25.00  26.00  4.0%
     test-suite...006/450.soplex/450.soplex.test    88.00  89.00  1.1%
     test-suite...TimberWolfMC/timberwolfmc.test   120.00 119.00 -0.8%
     test-suite.../CINT2006/403.gcc/403.gcc.test   215.00 216.00  0.5%
     test-suite...006/447.dealII/447.dealII.test   957.00 958.00  0.1%
     test-suite...ternal/HMMER/hmmcalibrate.test    75.00  75.00  0.0%

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsAnalyzed

    Program                                        base    patch   diff
     test-suite...ks/Prolangs-C/agrep/agrep.test   440.00  434.00  -1.4%
     test-suite...nal/skidmarks10/skidmarks.test   312.00  308.00  -1.3%
     test-suite...marks/7zip/7zip-benchmark.test   6399.00 6323.00 -1.2%
     test-suite...lications/minisat/minisat.test   134.00  135.00   0.7%
     test-suite...rks/FreeBench/pifft/pifft.test   295.00  297.00   0.7%
     test-suite...TimberWolfMC/timberwolfmc.test   1879.00 1869.00 -0.5%
     test-suite...pplications/treecc/treecc.test   689.00  691.00   0.3%
     test-suite...T2000/300.twolf/300.twolf.test   1593.00 1597.00  0.3%
     test-suite.../Benchmarks/Bullet/bullet.test   1394.00 1392.00 -0.1%
     test-suite...ications/JM/ldecod/ldecod.test   1431.00 1429.00 -0.1%
     test-suite...6/464.h264ref/464.h264ref.test   2229.00 2230.00  0.0%
     test-suite...lications/sqlite3/sqlite3.test   2590.00 2589.00 -0.0%
     test-suite...ications/JM/lencod/lencod.test   2732.00 2733.00  0.0%
     test-suite...006/453.povray/453.povray.test   3395.00 3394.00 -0.0%

Note the -11% regression in number of loops vectorized for skidmarks. I
suspect this corresponds to the fact that those loops are gone now (see
the reduction in number of loops analyzed by LV).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88471
2021-01-26 13:52:30 +00:00
Georgii Rymar
b39ef35f86 [yaml2obj][obj2yaml] - Improve how we set/dump the sh_entsize field.
We already set the `sh_entsize` field in a single place
for all non-implicit sections.

This patch reorders the logic slightly and with it
we finally have the only one place where the `sh_entsize` is set.

obj2yaml will not dump the `EntSize` key for `SHT_DYNSYM/SHT_SYMTAB` sections anymore,
when the value of `sh_entsize` is equal to `sizeof(Elf_Sym)`

Note that this also seems revealed an issue in llvm-objcopy:
Previously yaml2obj set the `sh_entsize` for the `.symtab` section to 0x18,
now we it sets it for `SHT_SYMTAB` sections, i.e. by type.
But the `llvm-objcopy/ELF/only-keep-debug.test` has a `.symtab` section of type `SHT_STRTAB`,
and now yaml2obj sets the `sh_entsize` to 0 for it.
I had to update the corresponding check lines for `ES`, but the behavior of
`llvm-objcopy` should be fixed instead I think.
I've added a TODO and a comment.

Differential revision: https://reviews.llvm.org/D95364
2021-01-26 13:33:02 +03:00
Georgii Rymar
2900e3238f [libObject,llvm-readelf/obj] - Don't use @@ when printing versions of undefined symbols.
A default version (@@) is only available for defined symbols.

Currently we use "@@" for undefined symbols too.
This patch fixes the issue and improves our test case.

Differential revision: https://reviews.llvm.org/D95219
2021-01-26 12:05:59 +03:00
Jan Svoboda
8d411fdc2d [clang][cli] Accept strings instead of options in ImpliedByAnyOf
To be able to refer to constant keypaths (e.g. `defvar cplusplus = LangOpts<"CPlusPlus">`) inside `ImpliedByAnyOf`, let's accept strings instead of `Option` instances.

This somewhat weakens the guarantees that we're referring to an existing (option) record, but we can still use the option.KeyPath syntax to simulate this.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D95344
2021-01-26 09:30:36 +01:00
Hsiangkai Wang
816e1ded77 [RISCV] Implement vlsegff intrinsics.
Differential Revision: https://reviews.llvm.org/D95303
2021-01-26 12:02:43 +08:00
Kazu Hirata
803b1cc469 [AMDGPU] Forward-declare MachineIRBuilder (NFC)
AMDGPULegalizerInfo.h needs MachineIRBuilder but relies on a forward
declaration of MachineIRBuilder in LegalizerInfo.h.  This patch adds a
forward declaration right in AMDGPULegalizerInfo.h.

While we are at it, this patch removes the one in LegalizerInfo.h,
where it is unnecessary.
2021-01-25 19:24:01 -08:00
Kazu Hirata
23ef7e2902 [StackSafety] Use ListSeparator (NFC) 2021-01-25 19:23:59 -08:00
Amara Emerson
20267c085f [GlobalISel][Localizer] Don't localize phi operands which are used more than once in the phi.
The current algorithm just tries to localize defs as far as they can go, and in
the case of G_PHI operands, it clones the def into the predecessor block for
each incoming edge. When multiple edges have the same register value, this can
cause unnecessary code bloat, and inhibit later optimizations.

This change checks if a given phi operand is unique in the phi, if not the
def of that register is not localized to the predecessor.

Differential Revision: https://reviews.llvm.org/D95406
2021-01-25 17:48:04 -08:00
Mitch Phillips
587eafbc21 Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method""
This reverts commit 554b3211fefd09b56b64357b9edd66c78ae200b5.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-25 16:22:22 -08:00