1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
Commit Graph

191997 Commits

Author SHA1 Message Date
Fangrui Song
3222c2a7d8 [AsmPrinter] Omit unique ID for .stack_sizes
Follow-up for D74006.
2020-02-14 21:25:06 -08:00
Fangrui Song
a791526017 [AsmPrinter][XRay] Omit unique ID for xray_instr_map and xray_fn_idx
Follow-up for D74006.
2020-02-14 21:10:46 -08:00
Diogo Sampaio
3ae173562a [AArch64][FPenv] Update chain of int to fp conversion
Summary:
When using strict fp, it is required to update the
chain when performing integer type promotion of a
operand to a integer to floating point conversion.

Reviewers: craig.topper, john.brawn

Reviewed By: craig.topper

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74597
2020-02-15 05:07:34 +00:00
Fangrui Song
917b37117e [AsmPrinter] Omit unique ID for __patchable_function_entries sections
Follow-up for D74006.

When the integrated assembler is used, we use SHF_LINK_ORDER.  The
linked-to symbol is part of ELFSectionKey, thus we can omit the unique
ID.
2020-02-14 20:54:54 -08:00
Fangrui Song
48a7715451 [MC] Add MCSection::NonUniqueID and delete one MCContext::getELFSection overload 2020-02-14 20:25:52 -08:00
Fangrui Song
fdfda04ad5 [MC][ELF] Make linked-to symbol name part of ELFSectionKey
https://bugs.llvm.org/show_bug.cgi?id=44775

This rule has been implemented by GNU as https://sourceware.org/ml/binutils/2020-02/msg00028.html (binutils >= 2.35)

It allows us to simplify

```
.section .foo,"o",foo,unique,0
.section .foo,"o",bar,unique,1  # different section
```

to

```
.section .foo,"o",foo
.section .foo,"o",bar  # different section
```

We consider the two `.foo` different even if the linked-to symbols foo and bar
are defined in the same section.  This is a deliberate choice so that we don't
need to know the section where foo and bar are defined beforehand.

Differential Revision: https://reviews.llvm.org/D74006
2020-02-14 20:03:04 -08:00
Johannes Doerfert
256fbd76a9 [Attributor][FIX] Ensure abstract attributes are existing before manifest
While the function return updateImpl did only look at call sites the
manifest method looked at return values. If we don't do this during the
updateImpl we might create new abstract attributes during manifest. This
is a problem when it comes to liveness information.
2020-02-14 21:44:46 -06:00
Johannes Doerfert
5ca603fd18 [Attributor] Manifest simplified (return) values properly
If we simplify a function return value we have to modify the return
instructions.
2020-02-14 21:44:46 -06:00
Johannes Doerfert
e769d72982 [Attributor][FIX] Collapse undef to a proper value
If we see an undef we cannot assume it's the same as "no value". For now
we just collapse it to 0.
2020-02-14 21:44:46 -06:00
Johannes Doerfert
cd7e2d0838 [Attributor][FIX] Restrict cross-SCC call deletion
If we know a call was not needed we might have ended up deleting it even
if it was in a different SCC. This prevents us from doing so.
2020-02-14 21:44:46 -06:00
Johannes Doerfert
e6241accb0 [Attributor][NFC] Add check lines for tests 2020-02-14 21:44:46 -06:00
Johannes Doerfert
23335d87a7 [Attributor][FIX] Carefully strip casts in AANoAlias
We can strip casts in AANoAlias but that might cause us to end up with a
non-pointer type. We do properly handle that case now.
2020-02-14 21:44:46 -06:00
Johannes Doerfert
e6077023d4 [Attributor][FIX] Do not RAUW void values
This caused an error when passes iterated over cached assumptions in the
tracker and assumed them to be `null` or an instruction. I failed to
create a test case so far.
2020-02-14 21:44:46 -06:00
Fangrui Song
52e7c94059 [llvm-ranlib][test] Fix rwx- after a4f3847f3d5742cfab7acdc614e7ca54643e0c85 2020-02-14 19:41:55 -08:00
Matt Arsenault
3d99313085 AMDGPU/GlobalISel: Fix missing impdef of scc on boolean bit ops 2020-02-14 22:35:30 -05:00
Fangrui Song
e6d3f76ab0 [MC] De-capitalize another set of MCStreamer::Emit* functions
Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc
2020-02-14 19:26:52 -08:00
Fangrui Song
343c2a2b44 [MC] De-capitalize some MCStreamer::Emit* functions 2020-02-14 19:11:53 -08:00
Nico Weber
2c0f20cf6c [gn build] Make build locally deterministic
This follows http://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html
to make the GN build locally deterministic.

With this, I've built lld at two different build paths on my Windows box and got
identical binaries. (I'd expect the same to happen on Linux, and with other
binaries.)

This doesn't have the bits to get universal determinism yet.

Differential Revision: https://reviews.llvm.org/D74519
2020-02-14 21:55:33 -05:00
Shiva Chen
0188dcd0c7 [RISCV] Correct the CallPreservedMask for the function call in an interrupt handler
CallPreservedMask is used to describe the register liveness after a
function call. The function call in an interrupt handler should use the same
CallPreservedMask as normal functions. So that only callee save registers
can live through the function call.
2020-02-15 09:14:04 +08:00
Johannes Doerfert
07870c8135 [Attributor] Derive memory location attributes (argmemonly, ...)
In addition to memory behavior attributes (readonly/writeonly) we now
derive memory location attributes (argmemonly/inaccessiblememonly/...).
The former is part of AAMemoryBehavior and the latter part of
AAMemoryLocation. While they are similar in nature it got messy when
they were put in a single AA. Location attributes for arguments and
floating values will follow later.

Note that both memory attributes kinds can derive readnone. If there are
no accesses AAMemoryBehavior will derive readnone. If there are accesses
but only to stack (=local) locations AAMemoryLocation will derive
readnone.

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D73426
2020-02-14 19:05:51 -06:00
Matt Arsenault
36eb29ea7a AMDGPU: Don't preserve analyses with div64 IR expansion
The dominator tree needs to be updated, but that isn't handled now.
2020-02-14 20:06:02 -05:00
Amy Huang
4ca15f6a1c Fix 01b02a73de78 to use correct macro spelling and fix unit tests. 2020-02-14 15:58:36 -08:00
Matt Arsenault
ef81b8a752 AMDGPU/GlobalISel: Fix G_EXTRACT of 96-bit results
This would assert on an unhandled size in getRegSplitParts.
2020-02-14 15:57:40 -08:00
Matt Arsenault
4b67ae249f AMDGPU: Use generated checks for memcpy expansion 2020-02-14 15:57:40 -08:00
Matt Arsenault
2679771e12 AMDGPU/GlobalISel: Improve 16-bit bswap
Match the new DAG behavior and use v_perm_b32 when available. Also
does better on SI/CI by expanding 16-bit swaps. Also fix
non-power-of-2 cases.
2020-02-14 15:57:39 -08:00
Matt Arsenault
c9e2bd579f GlobalISel: Remove unused function argument 2020-02-14 15:57:39 -08:00
Stanislav Mekhanoshin
20cb1e9351 [TBLGEN] Allow to override RC weight
Differential Revision: https://reviews.llvm.org/D74509
2020-02-14 15:49:52 -08:00
Derek Schuff
71ec991b79 [WebAssembly] Add section names for some DWARF5 sections
Summary:
Addresses PR44728 but no tests because I've not yet made any attempt to verify
correctness of the debug info.

Reviewers: sbc100, aardappel

Differential Revision: https://reviews.llvm.org/D74656
2020-02-14 15:45:06 -08:00
Reid Kleckner
e151258050 Fix -Wstring-compare warnings in new OpenMP code 2020-02-14 15:23:49 -08:00
Johannes Doerfert
afa3c7f6dd [Attributor][FIX] Validate the type for AAValueConstantRange as needed
Due to the genericValueTraversal we might visit values for which we did
not create an AAValueConstantRange object, e.g., as they are behind a
PHI or select or call with `returned` argument. As a consequence we need
to validate the types as we are about to query AAValueConstantRange for
operands.
2020-02-14 17:22:40 -06:00
Amy Huang
f331fd6869 Don't call computeHostNumPhysicalCores when LLVM_ENABLE_THREADS is off
Summary:
Fix change from 8404aeb56a73 to avoid calling
computeHostNumPhysicalCores if LLVM_ENABLE_THREADS is off.

Reviewers: rnk, aganea

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74654
2020-02-14 15:09:27 -08:00
Lang Hames
99413ab624 [lli] Add a '-dlopen <library-path>' option to lli.
Passing '-dlopen <library-path>' to lli will cause the specified library to be
loaded (via llvm::sys::DynamicLibrary::LoadLibraryPermanently) before JIT'd code
is executed, making the library's symbols accessible to JIT'd code.
2020-02-14 15:01:40 -08:00
Johannes Doerfert
ba1400c967 [OpenMP][Part 2] Use reusable OpenMP context/traits handling
This patch implements an almost complete handling of OpenMP
contexts/traits such that we can reuse most of the logic in Flang
through the OMPContext.{h,cpp} in llvm/Frontend/OpenMP.

All but construct SIMD specifiers, e.g., inbranch, and the device ISA
selector are define in `llvm/lib/Frontend/OpenMP/OMPKinds.def`. From
these definitions we generate the enum classes `TraitSet`,
`TraitSelector`, and `TraitProperty` as well as conversion and helper
functions in `llvm/lib/Frontend/OpenMP/OMPContext.{h,cpp}`.

The above enum classes are used in the parser, sema, and the AST
attribute. The latter is not a collection of multiple primitive variant
arguments that contain encodings via numbers and strings but instead a
tree that mirrors the `match` clause (see `struct OpenMPTraitInfo`).

The changes to the parser make it more forgiving when wrong syntax is
read and they also resulted in more specialized diagnostics. The tests
are updated and the core issues are detected as before. Here and
elsewhere this patch tries to be generic, thus we do not distinguish
what selector set, selector, or property is parsed except if they do
behave exceptionally, as for example `user={condition(EXPR)}` does.

The sema logic changed in two ways: First, the OMPDeclareVariantAttr
representation changed, as mentioned above, and the sema was adjusted to
work with the new `OpenMPTraitInfo`. Second, the matching and scoring
logic moved into `OMPContext.{h,cpp}`. It is implemented on a flat
representation of the `match` clause that is not tied to clang.
`OpenMPTraitInfo` provides a method to generate this flat structure (see
`struct VariantMatchInfo`) by computing integer score values and boolean
user conditions from the `clang::Expr` we keep for them.

The OpenMP context is now an explicit object (see `struct OMPContext`).
This is in anticipation of construct traits that need to be tracked. The
OpenMP context, as well as the `VariantMatchInfo`, are basically made up
of a set of active or respectively required traits, e.g., 'host', and an
ordered container of constructs which allows duplication. Matching and
scoring is kept as generic as possible to allow easy extension in the
future.

---

Test changes:

The messages checked in `OpenMP/declare_variant_messages.{c,cpp}` have
been auto generated to match the new warnings and notes of the parser.
The "subset" checks were reversed causing the wrong version to be
picked. The tests have been adjusted to correct this.
We do not print scores if the user did not provide one.
We print spaces to make lists in the `match` clause more legible.

Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim

Subscribers: merge_guards_bot, rampitec, mgorny, hiraditya, aheejin, fedor.sergeev, simoncook, bollu, guansong, dexonsmith, jfb, s.egerton, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71830
2020-02-14 16:37:42 -06:00
Roman Lebedev
b7ba6dbbca [NFC][llvm-exegesis] CombinationGenerator::performGeneration(): pull put state increment into lambda
This avoids questionable code such as taking address of current
range-based for variable and comparing it with vector begin iterator.
While this may not be a problem in itself, it can be written more consice.
This was initially suggested by @aaronpuchert.
2020-02-15 00:56:42 +03:00
Craig Topper
4a6a0df804 [llvm-exegesis] Rename range based for loop variable in a unit test so its different than the container being iterated over. NFC
It seems like gcc 5.5 wants to iterate over the new variable instead
of the container that lives outside the loop. But of course this
new container is empty.

Plus using a different variable names makes the code more readable.
2020-02-14 13:51:39 -08:00
Craig Topper
c8ab6ebc05 [Hexagon] Add an explicit makeArrayRef to pacify gcc 5.5
The array seemed to have decayed to a pointer before the ArrayRef
constructor got called so there was no size information available.
2020-02-14 13:51:39 -08:00
Sean Fertile
8c3df61f94 [AsmPrinter] Use the McASMInfo to determine if we need descriptors.
In https://reviews.llvm.org/rG8b737688c21a9755cae14cb9343930e0882164ab I
switched the condition gating the creation of the descriptor symbol from
checking the MCAsmInfo if we need to support descriptors, to if the OS
was AIX. Technically the 2 should be interchangeable: if we are
targeting AIX then we need to emit XCOFF object files, and the MCAsmInfo
must return true for needing function descriptors.

This doesn't account for lit test with runsteps that only set the arch.
Eg: test/CodeGen/XCore/section-name.ll
which when run natively on AIX we end up with a target xcore-ibm-aix and
needFunctionDescriptors is false.

This patch reverts to using the MCAsmInfo and adds an assert that the
target OS must be AIX since that is the only target using the descriptor
hook.

Differential Revision: https://reviews.llvm.org/D74622
2020-02-14 15:20:39 -05:00
Nico Weber
eeff3458f0 [windows] Add /Gw to compiler flags
This is like -fdata-sections, and it's not part of /O2 by default for some reason.

In the cmake build, reduces the size of clang.exe from 70,358,016 bytes to 69,982,720 bytes.

clang-format.exe goes from 3,703,296 bytes to 3,331,072 bytes.

Differential Revision: https://reviews.llvm.org/D74573
2020-02-14 15:15:00 -05:00
Austin Kerbow
65d3ccce3d [AMDGPU] Always enable XNACK feature when support is explicitly requested
Differential Revision: https://reviews.llvm.org/D74630
2020-02-14 11:58:58 -08:00
Evandro Menezes
81503a4556 [docs] Add note on using cmake to perform the build
Repeat the build instructions from the top level README in the Getting
Started guide.
2020-02-14 13:44:56 -06:00
Matt Arsenault
1165113be8 AMDGPU: Add option to disable CGP division expansion
The division expansions in AMDGPUCodeGenPrepare can't be relied on for
correctness, since they punt to later optimization and possibly
legalization in some cases. We still need a way to be able to write
tests for the legalizer versions of the expansion. This is mostly for
GlobalISel, since the expected optimzations is expecting aren't
implemented.

The interaction with the flag to expand 64-bit division in the IR is
pretty confusing, but these flags have different purposes.
2020-02-14 11:37:07 -08:00
Sanjay Patel
f2c7d40b2f [x86] remove stray test assertions; NFC
I updated the prefix and forgot to manually remove the old names
as part of rG6071fc57a45.f
2020-02-14 14:28:50 -05:00
Sanjay Patel
ca8fbdbc04 [x86] regenerate complete test checks for sqrt{est}; NFC
The existing checks were trying to test both CPU-specific
codegen and generic codegen with explicit attributes for
the various sqrt estimate possibilities, but that was hard
to decipher and update (D69989).

Instead generate the complete results for various CPUs,
and that makes it clear which models have slow/fast sqrt
attributes along with all of the other potential diffs
(FMA, AVX2, scheduling).

Also, explicitly add the function attributes corresponding
to whether DAZ/FTZ denorm settings are expected.
2020-02-14 14:21:28 -05:00
Matt Arsenault
1b559ce622 AMDGPU: Add option to expand 64-bit integer division in IR
I didn't realize we were already expanding 24/32-bit division here
already. Use the available IntegerDivision utilities. This uses loops,
so produces significantly smaller code than the inline DAG expansion.

This now requires width reductions of 64-bit divisions before
introducing the expanded loops.

This helps work around missing legalization in GlobalISel for
division, which are the only remaining core instructions that didn't
work at all.

I think this is plausibly a better implementation than exists in the
DAG, although turning it on by default misses out on the constant
value optimizations and also needs benchmarking.
2020-02-14 11:16:08 -08:00
Craig Topper
500b8b242f [X86] Use ZERO_EXTEND instead of SIGN_EXTEND in the fast isel handling of convert_from_fp16. 2020-02-14 10:57:12 -08:00
Craig Topper
452f11b33c [X86] Add AVX512 support to the fast isel code for Intrinsic::convert_from_fp16/convert_to_fp16. 2020-02-14 10:57:11 -08:00
Alina Sbirlea
274cdd6d1d [LoopRotate] Get and update MSSA only if available in legacy pass manager.
Summary:
Potential fix for: https://bugs.llvm.org/show_bug.cgi?id=44889 and https://bugs.llvm.org/show_bug.cgi?id=44408

In the legacy pass manager, loop rotate need not compute MemorySSA when not being in the same loop pass manager with other loop passes.
There isn't currently a way to differentiate between the two cases, so this attempts to limit the usage in LoopRotate to only update MemorySSA when the analysis is already available.
The side-effect of this is that it will split the Loop pipeline.

This issue does not apply to the new pass manager, where we have a flag specifying if all loop passes in that loop pass manager preserve MemorySSA.

Reviewers: dmgreen, fedor.sergeev, nikic

Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74574
2020-02-14 10:47:26 -08:00
Matt Arsenault
a191fc7b7d GlobalISel: Lower s64->s16 G_FPTRUNC
This is more or less directly ported from the AMDGPU custom lowering
for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES
instead of creating shift/trunc to extract the two halves, and zexting
an inverted compare instead of select_cc).

This also does not include the fast math expansion the DAG which
converts to f32 and then to f16. I think that belongs in a
pre-legalize combine instead.
2020-02-14 10:46:58 -08:00
Volkan Keles
7d4c004c85 [GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges
Like COPY instructions explained in D70616, we don't check the constraints
when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check
if registers can be replaced, or a COPY instruction needs to be built.

https://reviews.llvm.org/D70564
2020-02-14 10:45:58 -08:00
Brian Cain
8a9cdbd9d8 [Hexagon] v67+ HVX register pairs should support either direction
Assembler now permits pairs like 'v0:1', which are encoded
differently from the odd-first pairs like 'v1:0'.

The compiler will require more work to leverage these new register
pairs.
2020-02-14 12:43:43 -06:00