1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

202491 Commits

Author SHA1 Message Date
Xing GUO
54320a5f65 Recommit: [DWARFYAML] Add support for referencing different abbrev tables.
The original commit (7ff0ace96db9164dcde232c36cab6519ea4fce8) was causing
build failure and was reverted in 6d242a73264ef1e3e128547f00e0fe2d20d3ada0

==================== Original Commit Message ====================
This patch adds support for referencing different abbrev tables. We use
'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly
assign an abbrev table to compilation units.

The syntax is:
```
debug_abbrev:
  - ID: 0
    Table:
      ...
  - ID: 1
    Table:
      ...
debug_info:
  - ...
    AbbrevTableID: 1 ## Reference the second abbrev table.
  - ...
    AbbrevTableID: 0 ## Reference the first abbrev table.
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D83116
2020-08-21 19:02:10 +08:00
Sam Parker
25faf23408 [NFC] Add SimplifyCFG for ARM
Add some phi elimination threshold testing.
2020-08-21 11:52:31 +01:00
lewis-revill
fd94b72c2e [RISCV] Fix inaccurate annotations on PseudoBRIND
PseudoBRIND had seemingly inherited incorrect annotations denoting it as
a call instruction and that it defines X1/ra. This caused excess
save/restore code to be emitted for ra.

Differential Revision: https://reviews.llvm.org/D86286
2020-08-21 11:38:42 +01:00
Mirko Brkusanin
afddfeaf4a [AMDGPU] Use ds_read/write_b96/b128 when possible for SDag
Do not break down local loads and stores so ds_read/write_b96/b128 in
ISelLowering can be selected on subtargets that support them and if align
requirements allow them.

Differential Revision: https://reviews.llvm.org/D84403
2020-08-21 12:26:31 +02:00
Mirko Brkusanin
09694f5b10 [AMDGPU][GlobalISel] Fix 96 and 128 local loads and stores
Fix local ds_read/write_b96/b128 so they can be selected if the alignment
allows. Otherwise, either pick appropriate ds_read2/write2 instructions or break
them down.

Differential Revision: https://reviews.llvm.org/D81638
2020-08-21 12:26:31 +02:00
Mirko Brkusanin
08706e7bce [AMDGPU] Reorganize GCN subtarget features for unaligned access
Features UnalignedBufferAccess and UnalignedDSAccess are now used to determine
whether hardware supports such access.
UnalignedAccessMode should be used to enable them.
hasUnalignedBufferAccessEnabled() and hasUnalignedDSAccessEnabled() can be
now used to quickly check both.

Differential Revision: https://reviews.llvm.org/D84522
2020-08-21 12:26:31 +02:00
Mirko Brkusanin
49f2d14543 [AMDGPU] Fix alignment requirements for 96bit and 128bit local loads and stores
Adjust alignment requirements for ds_read/write_b96/b128.
GFX9 and onwards allow misaligned access for reads and writes but only if
SH_MEM_CONFIG.alignment_mode allows it.
UnalignedDSAccess is set on GCN subtargets from GFX9 onward to let us know if we
can relax alignment requirements.
UnalignedAccessMode acts similary to UnalignedBufferAccess for DS instructions
but only from GFX9 onward and is supposed to match alignment_mode. By default
alignment of 4 is required.

Differential Revision: https://reviews.llvm.org/D82788
2020-08-21 12:26:31 +02:00
Georgii Rymar
53c4cabd9b [llvm-readelf] - Start recognizing 'PT_OPENBSD_*' segment types.
Its a follow-up for D85830, it stops ignoring 'PT_OPENBSD_*' segment types.
Now them are recognized properly.

Note: GNU readelf does not recognize them, though perhaps it shouldn't.
Anyways, it was reported to binutils: https://sourceware.org/bugzilla/show_bug.cgi?id=26405#c0

Differential revision: https://reviews.llvm.org/D86208
2020-08-21 13:13:05 +03:00
Florian Hahn
48a439a77a [DSE,MemorySSA] Handle atomicrmw/cmpxchg conservatively.
This adds conservative handling of AtomicRMW/AtomicCmpXChg to
isDSEBarrier, similar to atomic loads and stores.
2020-08-21 10:42:42 +01:00
Roman Lebedev
2893862e92 [NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
Jay Foad
6d725be5b3 [SelectionDAG] Better legalization for FSHL and FSHR
In SelectionDAGBuilder always translate the fshl and fshr intrinsics to
FSHL and FSHR (or ROTL and ROTR) instead of lowering them to shifts and
ORs. Improve the legalization of FSHL and FSHR to avoid code quality
regressions.

Differential Revision: https://reviews.llvm.org/D77152
2020-08-21 10:32:49 +01:00
Florian Hahn
20d85a73f8 [DSE,MemorySSA] Regenerate check lines for atomic.ll tests. 2020-08-21 10:18:06 +01:00
Jay Foad
451e760e46 [AMDGPU] Apply llvm-prefer-register-over-unsigned from clang-tidy 2020-08-21 10:14:35 +01:00
sstefan1
8f1b61f465 [Attributor][NFC] run update_test_checks with --check-attributes. 2020-08-21 11:12:41 +02:00
Yevgeny Rouban
1bdf10a116 [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks
Both AfterPass and AfterPassInvalidated pass instrumentation
callbacks get additional parameter of type PreservedAnalyses.
This patch was created by @fedor.sergeev. I have just slightly
changed it.

Reviewers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81555
2020-08-21 16:10:42 +07:00
Sam Parker
76932d3b0f [SimplifyCFG] Cost required selects
Before we speculatively execute a basic block, query the cost of
inserting the necessary select instructions against the phi folding
threshold. For non-trivial insertions, a more accurate decision can
probably be made during machine if-conversion. With minsize we query
the CodeSize cost, otherwise we use SizeAndLatency.

Differential Revision: https://reviews.llvm.org/D82438
2020-08-21 09:52:52 +01:00
Georgii Rymar
ac79b2437e [llvm-readobj] - Change how we create DynRegionInfo objects. NFCI.
Currently we have `checkDRI` and two `createDRIFrom` methods which
are used to create `DynRegionInfo` objects.

And we have an issue: constructions like:
`ObjF->getELFFile()->base() + P->p_offset`
that are used in `createDRIFrom` functions might overflow.

I had to revert `D85519` which triggered such UBSan failure.

This NFC, simplifies and generalizes how we create `DynRegionInfo` objects.
It will allow us to introduce more/better validation checks in a single place.
It also will allow to change `createDRI` to return `Expected<>` so
that we will be able to stop using the `reportError`, which
is used inside currently, and have a warning instead.

Differential revision: https://reviews.llvm.org/D86297
2020-08-21 11:35:28 +03:00
Florian Hahn
4bd8dfc4a6 [DSE,MemorySSA] Split off partial tracking from isOverwite.
When traversing memory uses to look for aliasing reads/writes, we only
care about complete overwrites. This patch splits off the partial
overwrite tracking from isOverwrite This avoids some unnecessary work
when checking for read/write clobbers with MemorySSA-DSE.
isOverwrite, which skips the partial overwrite tracking.

This gives a relatively small improvement
http://llvm-compile-time-tracker.com/compare.php?from=ef2a2f77f87553a0a4a39f518eb9ac86b756bda6&to=658f3905dd96d3415f3782adc712c79fa59a4665&stat=instructions

This is part of the patches to bring down compile-time to the level
referenced in
http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D86280
2020-08-21 09:13:59 +01:00
Sam Parker
0c424524e7 [ARM][CostModel] Select instruction costs.
Modify the ARM getCmpSelInstrCost implementation for the code size
costs of selects. Now consider the legalization cost and increase
the cost of i1 because those values wouldn't live in a general purpose
register. We also make selects +1 more expensive to account for the IT
instruction.

Differential Revision: https://reviews.llvm.org/D82091
2020-08-21 08:49:56 +01:00
David Green
53fac1f9ad [ARM][LV] Add a preferPredicatedReductionSelect target hook
As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980
2020-08-21 08:48:12 +01:00
Qiu Chaofan
f862a86f7a [PowerPC] Add readflm/setflm intrinsics to Clang
Commit dbcfbffc adds ppc.readflm and ppc.setflm intrinsics to read or
write FPSCR register. This patch adds them to Clang.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D85874
2020-08-21 15:12:19 +08:00
Mehdi Amini
88fcacb64c Allow multiple calls to InitLLVM() (NFC)
In e99dee82b0, the "out_of_memory_new_handler" was changed to be
explicitly initialized instead of relying on a global static
constructor.
However before this change, install_out_of_memory_new_handler could be
called multiple times while it asserts right now.
We can be more tolerant to calling multiple time InitLLVM without
reintroducing a global constructor for this handler.

Differential Revision: https://reviews.llvm.org/D86330
2020-08-21 06:13:00 +00:00
Xing GUO
7720abc372 Revert "[DWARFYAML] Add support for referencing different abbrev tables."
This reverts commit f7ff0ace96db9164dcde232c36cab6519ea4fce8.

This change is causing build failure.

http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/10400
2020-08-21 12:15:54 +08:00
Yevgeny Rouban
ae25ecf12b [ADT] Allow IsSizeLessThanThresholdT for incomplete types. NFC
If the type T is incomplete then sizeof(T) results in C++ compilation error at line:
  static constexpr bool value = sizeof(T) <= (2 * sizeof(void *));

This patch allows incomplete types in parameters of function. Example:
  using SomeFunc = void(SomeIncompleteType &);
  llvm::unique_function<SomeFuncType> SomeFunc;

Reviewers: DaniilSuchkov, vvereschaka

Differential Revision: https://reviews.llvm.org/D81554
2020-08-21 11:01:57 +07:00
Xing GUO
de84ba17c9 [DWARFYAML] Add support for referencing different abbrev tables.
This patch adds support for referencing different abbrev tables. We use
'ID' to distinguish abbrev tables and use 'AbbrevTableID' to explicitly
assign an abbrev table to compilation units.

The syntax is:
```
debug_abbrev:
  - ID: 0
    Table:
      ...
  - ID: 1
    Table:
      ...
debug_info:
  - ...
    AbbrevTableID: 1 ## Reference the second abbrev table.
  - ...
    AbbrevTableID: 0 ## Reference the first abbrev table.
```

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D83116
2020-08-21 11:44:25 +08:00
Xing GUO
14e7218bec [DWARFYAML] Add support for emitting multiple abbrev tables.
This patch adds support for emitting multiple abbrev tables. Currently,
compilation units will always reference the first abbrev table.

Reviewed By: jhenderson, labath

Differential Revision: https://reviews.llvm.org/D86194
2020-08-21 11:44:25 +08:00
Shoaib Meenai
5e804f9e89 [cmake] Don't use ld.lld when targeting Darwin
ld.lld is an ELF linker. We can switch to the new LLD for Mach-O port
when it's more complete, but for now, assume the user will have set
CMAKE_LINKER correctly themselves when targeting Darwin.
2020-08-20 19:51:29 -07:00
Xing GUO
6752521113 [DWARFYAML] Add support for emitting multiple abbrev tables.
This patch adds support for emitting multiple abbrev tables. Currently,
compilation units will always reference the first abbrev table.

Reviewed By: jhenderson, labath

Differential Revision: https://reviews.llvm.org/D86194
2020-08-21 10:12:08 +08:00
Michael Liao
1cf2d56956 [amdgpu] Add codegen support for HIP dynamic shared memory.
Summary:
- HIP uses an unsized extern array `extern __shared__ T s[]` to declare
  the dynamic shared memory, which size is not known at the
  compile time.

Reviewers: arsenm, yaxunl, kpyzhov, b-sumner

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82496
2020-08-20 21:29:18 -04:00
Shoaib Meenai
c6c2e0aed2 [runtimes] Allow LLVM_BUILTIN_TARGETS to include Darwin
We have two ways of using the runtimes build setup to build the
builtins. You can either have an empty LLVM_BUILTIN_TARGETS (or have it
include the "default" target), in which case builtin_default_target is
called to set up the default target, or you can have actual triples in
LLVM_BUILTIN_TARGETS, in which case builtin_register_target is called
for each triple. builtin_default_target lets you build the builtins for
Darwin (assuming your default triple is Darwin); builtin_register_target
does not.

I don't understand the reason for this distinction. The Darwin builtins
build is special in that a single CMake configure handles building the
builtins for multiple platforms (e.g. macOS, iPhoneSimulator, and iOS)
and architectures (e.g. arm64, armv7, and x86_64). Consequently, if you
specify multiple Darwin triples in LLVM_BUILTIN_TARGETS, expecting each
configure to only build for that particular triple, it won't work.
However, if you specify a *single* x86_64-apple-darwin triple in
LLVM_BUILTIN_TARGETS, that single configure will build the builtins for
all Darwin targets, exactly the same way that the default target would.
The only difference between the configuration for the default target and
the x86_64-apple-darwin triple is that the latter runs the configuration
with `-DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON`, but that makes no
difference for Apple targets (none of the CMake codepaths which have
different behavior based on that variable are run for Apple targets).

I tested this by running two builtins builds on my Mac, one with the
default target and one with the x86_64-apple-darwin19.5.0 target (which
is the default target triple for my clang). The only relevant
CMakeCache.txt difference was the following, and as discussed above, it
has no effect on the actual build for Apple targets:

```
-//Default triple for which compiler-rt runtimes will be built.
-COMPILER_RT_DEFAULT_TARGET_TRIPLE:STRING=x86_64-apple-darwin19.5.0
+//No help, variable specified on the command line.
+COMPILER_RT_DEFAULT_TARGET_ONLY:UNINITIALIZED=ON
```

Furthermore, when I add the `-D` flag to compiler-rt's libtool
invocations, the libraries produced by the two builds are *identical*.

If anything, I would expect builtin_register_target to complain if you
tried specifying a triple for a particular Apple platform triple (e.g.
macosx), since that's the scenario in which it won't work as you want.
The generic darwin triple should be fine though, as best as I can tell.
I'm happy to add the error for specific Apple platform triples, either
in this diff or in a follow-up.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D86313
2020-08-20 18:28:30 -07:00
Kang Zhang
310e96f3fb [PowerPC] Fix a typo for InstAlias of mfsprg
D77531 has a type for mfsprg, it should be mtsprg. This patch is to fix
this typo.
2020-08-21 01:10:52 +00:00
Vitaly Buka
5c17690d3b Fix msan build
After D85820 TERMINFO_LIB is undefined.
2020-08-20 17:28:09 -07:00
Justin Bogner
5b1cb91987 [GISel] Correct the known bits of G_ANYEXT
Known bits for G_ANYEXT was incorrectly using KnownBits::zext, causing
us to treat the high bits as zero even though they're (by definition)
unknown.

Differential Revision: https://reviews.llvm.org/D86323
2020-08-20 17:17:04 -07:00
Fangrui Song
4d9361b2f3 [llvm-dwarfdump] Fix a typo: witin -> within 2020-08-20 14:12:37 -07:00
Jon Roelofs
3f612e225b Fix a couple of typos. NFC 2020-08-20 14:56:57 -06:00
Matt Arsenault
587fbc0a85 CodeGen: Don't drop AA metadata when splitting MachineMemOperands
Assuming this is used to split a memory access into smaller pieces,
the new access should still have the same aliasing properties as the
original memory access. As far as I can tell, this wasn't
intentionally dropped. It may be necessary to drop this if you are
moving the operand outside of the bounds of the original object in
such a way that it may alias another IR object, but I don't think any
of the existing users are doing this. Some of the uses widen into
unused alignment padding, which I think is OK.
2020-08-20 16:17:30 -04:00
Matt Arsenault
bd23f78f2f AMDGPU/GlobalISel: Legalize odd sized loads with widening
Custom lower and widen odd sized loads up to the alignment. The
default set of legalization actions doesn't have a way to represent
this. This fixes naturally aligned <3 x s8> and <3 x s16> loads.

This also starts moving towards eliminating the buggy and
overcomplicated legalization rules for narrowing. All the memory size
changes should be done in the lower or custom action, not NarrowScalar
/ FewerElements. These currently have redundant and ambiguous code
with the lower action.
2020-08-20 16:15:53 -04:00
vnalamot
ff4d481bab allSGPRSpillsAreDead() should use actual FP/BP frame indices
The SGPR spills happen in SILowerSGPRSpills() and allSGPRSpillsAreDead()
make sure there are no SGPR spills pending during PEI. But the FP/BP
spills happen during PEI and are exceptions.

Use actual frame indices of FP/BP in allSGPRSpillsAreDead() to
accommodate the exceptions.

Differential Revision: https://reviews.llvm.org/D86291
2020-08-20 16:15:53 -04:00
Kamau Bridgeman
7be92ab238 [PowerPC][PCRelative] Thread Local Storage Support for General Dynamic
This patch is the initial support for the General Dynamic Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.

Patch by: NeHuang

Reviewed By: stefanp

Differential Revision: https://reviews.llvm.org/D82315
2020-08-20 15:08:13 -05:00
Cameron McInally
640a9a840f [NFCI][SVE] Move fixed length i32/i64 SDIV tests
Move fixed length SDIV tests from sve-fixed-length-int-arith.ll to sve-fixed-length-int-div.ll. The former uses CHECK lines that verify legalization decisions. That's overkill for the i8/i16 SDIV tests, since they have a tricky legalization.
2020-08-20 14:46:26 -05:00
Fangrui Song
2eec803753 [llvm-dwarfdump] --statistics: switch to json::OStream. NFC
Then it is trivial to make the output indented (the second parameter of
json::OStream::OStream specifies the indentation).

Reviewed By: jhenderson, echristo

Differential Revision: https://reviews.llvm.org/D86045
2020-08-20 12:24:06 -07:00
Cameron McInally
06340b3cd4 [SVE] Lower fixed length vXi8/vXi16 SDIV to scalable
There are no nxv16i8/nxv8i16 SDIV instructions, so these fixed width operations must be promoted to nxv4i32.

Differential Revision: https://reviews.llvm.org/D86114
2020-08-20 13:47:01 -05:00
LLVM GN Syncbot
91886ab6b7 [gn build] Port 1a995a0af3c 2020-08-20 18:24:44 +00:00
Jessica Clarke
4356c41c9d [RISCV] Enable MCCodeEmitter instruction predicate verifier
This ensures that we never encode an instruction which is unavailable,
such as if we explicitly insert a forbidden instruction when lowering.
This is particularly important on RISC-V given its high degree of
modularity, and will become increasingly important as new standard
extensions appear.

Reviewed By: asb, lenary

Differential Revision: https://reviews.llvm.org/D85015
2020-08-20 18:36:54 +01:00
Roman Lebedev
c0a69dfec4 [NFC][InstCombine] Tests for PHI-of-insertvalue's
Currently we don't do anything about these,
neither in InstCombine, nor in SimplifyCFG's sinking.
These happen exceedingly rarely, but i've seen them in the cases where
PHI-aware aggregate reconstruction would have fired if not for them.
2020-08-20 20:16:31 +03:00
Jay Foad
efb79ce4ea [AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
2020-08-20 17:59:11 +01:00
Mircea Trofin
c4f0613bd4 [NFC] Expose the -Oz module optimization pipeline to opt
This exposes the module optimization pipeline as a pass that can be
applied stand-alone when using 'opt'. This helps ml inliner training
scenarios, where we start with IR captured right before inlining,
perform the inlining (-scc-oz-module-inliner) and then want to continue
and observe the final IR (where this patch comes into play). We can then
apply llc on the resulting IR to continue compilation down to native.

Differential Revision: https://reviews.llvm.org/D86224
2020-08-20 09:28:58 -07:00
Jay Foad
fe2d2102d1 [PeepholeOptimizer] Remove dead code
At this point we have already ruled out all def operands, so we can't
possibly see a dead implicit def operand.
2020-08-20 16:48:57 +01:00
David Green
2017b8f59b [LV] Allow tail folded reduction selects to remain in the loop
The normal scheme for tail folding reductions is to use:

loop:
  p = phi(0, a)
  mask = ...
  x = masked_load(..., mask)
  a = add(x, p)
s = select(mask, a, p)

This means we need to keep the register p and a alive out of the loop, plus
the mask. On a target with predicated operations we can instead generate
the phi as p = phi(0, s). This ensures the select in the loop and we can
fold select(m, add(a, b), c) to something like a vaddt c, a, b using the
m predicate. This in turn allows us to tail predicate the entire loop.

Differential Revision: https://reviews.llvm.org/D84741
2020-08-20 14:31:14 +01:00
Bjorn Pettersson
99c2e9eaf0 [AArch64] Update a code comment incorrectly referring to zero_reg. NFC
The getSrcFromCopy helper nowadays return a MachineOperand pointer,
so talking about zero_reg was incorrect as it nowadays return
a nullptr when not finding a copy like instruction.
2020-08-20 14:36:59 +02:00