mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

190773 Commits

Author SHA1 Message Date
Reid Kleckner
85e933eefe [IR] Keep a double break between functions when printing a module
This behavior appears to have changed unintentionally in
b0e979724f2679e4e6f5b824144ea89289bd6d56.

Instead of printing the leading newline in printFunction, print it when
printing a module. This ensures that `OS << *Func` starts printing
immediately on the current line, but whole modules are printed nicely.
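
The split of responsibilities described above can be sketched in a few lines of standalone C++. This is only an illustration under assumptions: `ToyFunction`, `ToyModule`, and the helper names are hypothetical stand-ins, not the real AsmWriter code.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm::Function and llvm::Module.
struct ToyFunction {
  std::string Name;
};

struct ToyModule {
  std::vector<ToyFunction> Functions;
};

// Starts printing on the current line: no leading newline is emitted here.
void printFunction(std::ostream &OS, const ToyFunction &F) {
  OS << "define void @" << F.Name << "() {\n  ret void\n}\n";
}

// The module printer owns the blank line that separates functions.
void printModule(std::ostream &OS, const ToyModule &M) {
  OS << "; ModuleID = 'toy'\n";
  for (const ToyFunction &F : M.Functions) {
    OS << '\n';
    printFunction(OS, F);
  }
}

// Convenience wrappers that capture the printed text.
std::string functionToString(const ToyFunction &F) {
  std::ostringstream OS;
  printFunction(OS, F);
  return OS.str();
}

std::string moduleToString(const ToyModule &M) {
  std::ostringstream OS;
  printModule(OS, M);
  return OS.str();
}
```

With this arrangement a single function printed on its own begins with `define`, while a module still shows one empty line between consecutive functions.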

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73505
2020-01-27 15:31:09 -08:00
Reid Kleckner
7bdbe4665a [WinEH] Re-run stack coloring test for i686
This would've caught https://crbug.com/1045650, which resulted in the
revert of 7a8b0b1595e7dc878b48cf9bbaa652087a6895db.
2020-01-27 15:26:03 -08:00
Evgenii Stepanov
d2f0ede221 Support zero size types in StackSafetyAnalysis.
Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73395
2020-01-27 15:22:59 -08:00
Evgenii Stepanov
a55605a524 Fix StackSafetyAnalysis crash with scalable vector types.
Summary:
Treat scalable allocas as if they have storage size of 0, and
scalable-typed memory accesses as if their range is unlimited.

This is not proper support for scalable vector types in the analysis;
we can do better, but not today.
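
A minimal sketch of the conservative rule from the summary, assuming a hypothetical `ToyType` and `ToyRange` rather than the real StackSafetyAnalysis data structures: a scalable alloca is given size 0, and a scalable-typed access is given an unbounded range, so no such access can be proven safe.

```cpp
#include <cstdint>

// Hypothetical type description; not LLVM's Type or TypeSize.
struct ToyType {
  uint64_t MinSizeBytes; // known minimum size in bytes
  bool Scalable;         // true if the real size is a runtime multiple
};

// Half-open byte range [Lower, Upper) touched by an access.
// Full means the access may touch any offset.
struct ToyRange {
  uint64_t Lower;
  uint64_t Upper;
  bool Full;
};

// Scalable allocas are treated as if they had no storage at all.
uint64_t allocaSizeForAnalysis(const ToyType &T) {
  return T.Scalable ? 0 : T.MinSizeBytes;
}

// Scalable-typed accesses are treated as touching an unlimited range.
ToyRange accessRange(uint64_t Offset, const ToyType &T) {
  if (T.Scalable)
    return ToyRange{0, 0, true};
  return ToyRange{Offset, Offset + T.MinSizeBytes, false};
}

// An access is provably safe only if its range is bounded and fits.
bool isAccessSafe(const ToyRange &R, uint64_t AllocaSize) {
  return !R.Full && R.Upper <= AllocaSize;
}
```

Both choices err on the side of reporting "unsafe", which is why the message calls this a stopgap rather than real support.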

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73394
2020-01-27 15:22:59 -08:00
Florian Hahn
aadd018f63 [AArch64] Add option to enable/disable load-store renaming.
This patch adds a new option to enable/disable register renaming in the
load-store optimizer. Defaults to disabled, as there is a potential
mis-compile caused by this.
2020-01-27 15:15:50 -08:00
Eric Schweitz
377ffc3bc4 remove a trailing space character (test commit) 2020-01-27 15:01:55 -08:00
Matt Arsenault
c134e9710f AMDGPU/GlobalISel: Eliminate SelectVOP3Mods_f32
Trivial type predicates should be moved into the tablegen pattern
itself, and not checked inside complex patterns. This eliminates a
redundant complex pattern, and fixes select source modifiers for
GlobalISel.

I have further patches which fully handle select in tablegen and
remove all of the C++ selection, although it requires the ugliness to
support the entire range of legal register types.
2020-01-27 17:53:54 -05:00
Jay Foad
1eddee5ba9 [GlobalISel] Make use of KnownBits::computeForAddSub
Summary:
This is mostly NFC. computeForAddSub may give more precise results in
some cases, but that doesn't seem to affect any existing GlobalISel
tests.
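
To show why a carry-aware computation can be more precise than only tracking trailing known bits, here is a standalone 32-bit sketch of known-bits addition. It is written in the spirit of `KnownBits::computeForAddSub` but is not LLVM's implementation; `ToyKnownBits`, `knownConstant`, and `knownBitsOfAdd` are hypothetical names.

```cpp
#include <cstdint>

// Toy 32-bit analogue of a known-bits lattice value.
// A bit set in Zero is known to be 0; a bit set in One is known to be 1.
struct ToyKnownBits {
  uint32_t Zero;
  uint32_t One;
};

// Every bit of a constant is known.
ToyKnownBits knownConstant(uint32_t V) { return ToyKnownBits{~V, V}; }

ToyKnownBits knownBitsOfAdd(ToyKnownBits L, ToyKnownBits R) {
  // Largest and smallest possible sums (modulo 2^32): set or clear
  // every unknown bit in both operands.
  uint32_t MaxSum = ~L.Zero + ~R.Zero;
  uint32_t MinSum = L.One + R.One;

  // The carry into bit i equals sum_i ^ lhs_i ^ rhs_i. Carries grow with
  // the operands, so a carry that is 0 in the maximal sum is always 0,
  // and a carry that is 1 in the minimal sum is always 1.
  uint32_t CarryKnownZero = ~(MaxSum ^ L.Zero ^ R.Zero);
  uint32_t CarryKnownOne = MinSum ^ L.One ^ R.One;

  // A result bit is known only if both operand bits and the carry are.
  uint32_t Known = (CarryKnownZero | CarryKnownOne) &
                   (L.Zero | L.One) & (R.Zero | R.One);

  return ToyKnownBits{~MaxSum & Known, MinSum & Known};
}
```

For example, adding two values whose low bit is known to be 0 yields a result whose low bit is known to be 0, and adding two constants yields a fully known constant.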

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73431
2020-01-27 22:22:56 +00:00
Stephen Neuendorffer
fd541e396d [examples] Fix CMakefiles for JITLink and OrcError library refactoring
The examples need explicit library dependencies when building with
BUILD_SHARED_LIBS=on
2020-01-27 13:58:50 -08:00
Sanjay Patel
665298049e [InstCombine] allow more narrowing of casted select
D47163 created a rule that we should not change the casted
type of a select when we have matching types in its compare condition.
That was intended to help vector codegen, but it also could create
situations where we miss subsequent folds as shown in PR44545:
https://bugs.llvm.org/show_bug.cgi?id=44545

By using shouldChangeType(), we can continue to get the vector folds
(because we always return false for vector types). But we also solve
the motivating bug because it's ok to narrow the scalar select in that
example.

Our canonicalization rules around select are a mess, but AFAICT, this
will not induce any infinite looping from the reverse transform (but
we'll need to watch for that possibility if committed).

Side note: there's a similar use of shouldChangeType() for phi ops
just below this diff, and the source and destination types appear to
be reversed.

Differential Revision: https://reviews.llvm.org/D72733
2020-01-27 16:35:50 -05:00
Simon Pilgrim
1cf1e98f38 [DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling
This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.

2020-01-27 21:17:47 +00:00
Adrian Prantl
7a0267fdba Fix an assertion failure in DwarfExpression's subregister composition
This patch fixes an assertion failure in DwarfExpression that is
triggered when a complex fragment has exactly the size of a
subregister of the register the DBG_VALUE points to *and* there is no
DWARF encoding for the super-register.

I took the opportunity to replace/document some magic values with
static constructor functions to make this code less confusing to read.

rdar://problem/58489125

Differential Revision: https://reviews.llvm.org/D72938
2020-01-27 12:44:37 -08:00
Roman Lebedev
fdfa5834a9 [NFC][LoopVectorize] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668) 2020-01-27 23:34:30 +03:00
Roman Lebedev
9bf8fcc900 [NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668) 2020-01-27 23:34:29 +03:00
Matt Arsenault
75fe3f5ccf AMDGPU/GlobalISel: Select buffer atomics
The cmpswap handling is incomplete and fails to select.
2020-01-27 15:16:44 -05:00
Matt Arsenault
6e236c4885 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.store 2020-01-27 15:16:21 -05:00
Matt Arsenault
63f1b52260 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.store[.format] 2020-01-27 15:00:21 -05:00
Matt Arsenault
6f974dc6d3 AMDGPU/GlobalISel: Move llvm.amdgcn.raw.buffer.store handling
Treat this the same way as loads. There's less value to the
intermediate nodes, but it's good to be consistent.
2020-01-27 14:59:30 -05:00
Sanjay Patel
29404e978c [InstCombine] convert fsub nsz with fneg operand to -(X + Y)
This was noted in D72521 - we need to match fneg specifically to
consistently handle that pattern along with (-0.0 - X).
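
The `nsz` requirement can be illustrated with plain IEEE doubles. The sketch below assumes default round-to-nearest arithmetic and no fast-math flags; the function names are hypothetical. The two forms agree for ordinary inputs but can differ in the sign of a zero result.

```cpp
#include <cmath>

// Original pattern: fsub (fneg X), Y.
double negThenSub(double X, double Y) { return (-X) - Y; }

// Folded pattern: fneg (fadd X, Y).
double addThenNeg(double X, double Y) { return -(X + Y); }

// Compares both the numeric value and the sign bit, so +0.0 != -0.0 here.
bool sameValueAndSign(double A, double B) {
  return A == B && std::signbit(A) == std::signbit(B);
}
```

With `X = 0.0` and `Y = -0.0`, the original form produces `+0.0` while the folded form produces `-0.0`, so the rewrite is only valid when the sign of zero may be ignored.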
2020-01-27 14:49:15 -05:00
Nikita Popov
31a82e35d7 [InstCombine] Move negation handling into freelyNegateValue()
Followup to D72978. This moves existing negation handling in
InstCombine into freelyNegateValue(), which makes it composable.
In particular, root negations of div/zext/sext/ashr/lshr/sub can
now always be performed through a shl/trunc as well.

Differential Revision: https://reviews.llvm.org/D73288
2020-01-27 20:46:23 +01:00
Nikita Popov
77c205a8f8 [InstCombine] Add more negation tests; NFC
Additional test cases for pushing negations through various
instructions.
2020-01-27 20:46:23 +01:00
Matt Arsenault
e2876f478f TableGen: Try to fix expensive checks failures 2020-01-27 14:42:04 -05:00
Matt Arsenault
05d11238d6 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.tbuffer.load 2020-01-27 14:42:04 -05:00
Petr Hosek
d81def4b19 [Symbolize] Handle error after the notes loop
We always have to check the error, even if we're going to ignore it.
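
The "must be checked" contract can be modeled with a toy class. This is a sketch under assumptions, not `llvm::Error`: `ToyError`, `consumeToyError`, and `walkNotes` are hypothetical, and an unchecked error bumps a counter instead of aborting the program.

```cpp
#include <string>
#include <utility>

// Counts errors that were destroyed without ever being checked.
int UncheckedErrors = 0;

// Move-only toy error; an empty message stands for success.
class ToyError {
public:
  explicit ToyError(std::string Msg = "") : Message(std::move(Msg)) {}
  ToyError(ToyError &&Other)
      : Message(std::move(Other.Message)), Checked(Other.Checked) {
    Other.Checked = true; // the moved-from object no longer owns the duty
  }
  ToyError(const ToyError &) = delete;
  ToyError &operator=(const ToyError &) = delete;
  ~ToyError() {
    if (!Checked)
      ++UncheckedErrors;
  }

  const std::string &message() const { return Message; }
  void markChecked() { Checked = true; }

private:
  std::string Message;
  bool Checked = false;
};

// Explicitly discards an error while still satisfying the contract.
void consumeToyError(ToyError E) { E.markChecked(); }

// Stand-in for a loop that may leave an error behind when it finishes.
ToyError walkNotes(bool Fail) { return ToyError(Fail ? "bad note" : ""); }

void ignoringButChecking() { consumeToyError(walkNotes(true)); }

void forgettingToCheck() {
  ToyError E = walkNotes(true);
  (void)E; // never checked: the destructor records the violation
}
```

Calling `ignoringButChecking()` leaves the counter untouched, whereas `forgettingToCheck()` increments it, which mirrors the failure mode the commit fixes.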
2020-01-27 11:00:27 -08:00
Vedant Kumar
3ce517d29d Reland (again): [DWARF] Allow cross-CU references of subprogram definitions
This is a revert-of-revert (i.e. this reverts commit 802bec89, which
itself reverted fa4701e1 and 79daafc9) with a fix folded in. The problem
was that call site tags weren't emitted properly when LTO was enabled
along with split-dwarf. This required a minor fix. I've added a reduced
test case in test/DebugInfo/X86/fission-call-site.ll.

Original commit message:

This allows a call site tag in CU A to reference a callee DIE in CU B
without resorting to creating an incomplete duplicate DIE for the callee
inside of CU A.

We already allow cross-CU references of subprogram declarations, so it
doesn't seem like definitions ought to be special.

This improves entry value evaluation and tail call frame synthesis in
the LTO setting. During LTO, it's common for cross-module inlining to
produce a call in some CU A where the callee resides in a different CU,
and there is no declaration subprogram for the callee anywhere. In this
case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram
in order to fill in the call site tag. That empty 'definition' defeats
entry value evaluation etc., because the debugger can't figure out what
it means.

As a follow-up, maybe we could add a DWARF verifier check that a
DW_TAG_subprogram at least has a DW_AT_name attribute.

Update #1:

Reland with a fix to create a declaration DIE when the declaration is
missing from the CU's retainedTypes list. The declaration is left out
of the retainedTypes list in two cases:

1) Re-compiling pre-r266445 bitcode (in which declarations weren't added
   to the retainedTypes list), and
2) Doing LTO function importing (which doesn't update the retainedTypes
   list).

It's possible to handle (1) and (2) by modifying the retainedTypes list
(in AutoUpgrade, or in the LTO importing logic resp.), but I don't see
an advantage to doing it this way, as it would cause more DWARF to be
emitted compared to creating the declaration DIEs lazily.

Update #2:

Fold in a fix for call site tag emission in the split-dwarf + LTO case.

Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a
ReleaseLTO-g build of the test suite.

rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440

Differential Revision: https://reviews.llvm.org/D70350
2020-01-27 10:52:34 -08:00
Matt Arsenault
6236286341 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.load 2020-01-27 13:40:37 -05:00
Stanislav Mekhanoshin
ea36e0a92b [AMDGPU] Attempt to reschedule without clustering
We want to have more load/store clustering, but we also want
to maintain low register pressure; these are opposing goals.
Allow scheduler to reschedule regions without mutations
applied if we hit a register limit.

Differential Revision: https://reviews.llvm.org/D73386
2020-01-27 10:27:16 -08:00
Matt Arsenault
e205dff5e1 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load.format 2020-01-27 13:23:35 -05:00
Luke Drummond
a097bd08a1 [tablegen] Emit string literals instead of char arrays
This changes the generated (Instr|Asm|Reg|Regclass)Name tables from this
form:
    extern const char HexagonInstrNameData[] = {
      /* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
      /* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
      /* 18 */ 'V', '6', '_', 'v', 'd', 'd', '0', 0,
      /* 26 */ 'P', 'S', '_', 'v', 'd', 'd', '0', 0,
      [...]
    };

...to this:

    extern const char HexagonInstrNameData[] = {
      /* 0 */ "G_FLOG10\0"
      /* 9 */ "ENDLOOP0\0"
      /* 18 */ "V6_vdd0\0"
      /* 26 */ "PS_vdd0\0"
      [...]
    };

This should make debugging and exploration a lot easier for mortals,
while providing a significant compile-time reduction for common compilers.
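
A small standalone check of the two table layouts, using only the first two entries quoted above. The table names and `nameAt` are hypothetical. Adjacent string literals are concatenated by the compiler, so the offsets in the comments stay valid; the only difference is one extra trailing NUL in the string-literal form.

```cpp
#include <cstring>

// Old form: one character per initializer.
const char NamesAsChars[] = {
    /* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
    /* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
};

// New form: adjacent string literals, each with an explicit "\0" separator.
const char NamesAsStrings[] = {
    /* 0 */ "G_FLOG10\0"
    /* 9 */ "ENDLOOP0\0"
};

// Looks up a name by its byte offset into a table.
const char *nameAt(const char *Table, unsigned Offset) {
  return Table + Offset;
}
```

Here `nameAt(NamesAsStrings, 9)` and `nameAt(NamesAsChars, 9)` both point at `ENDLOOP0`, and the first 18 bytes of the two arrays are identical.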

To avoid issues with low implementation limits, this is disabled by
default for Visual Studio.

To force output one way or the other, pass
`--long-string-literals=<bool>` to `tablegen`

Reviewers: mstorsjo, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D73044

A variation of this patch was originally committed in ce23515f5ab011 and
then reverted in e464b31c due to build failures.
2020-01-27 18:22:25 +00:00
Jonas Devlieghere
6df1071106 [llvm][TextAPI/MachO] Support writing single macCatalyst platform
TAPI currently lacks a way to emit the macCatalyst platform. For TBD_V3
it does support zippered frameworks given that both macOS and
macCatalyst are part of the PlatformSet.

Differential revision: https://reviews.llvm.org/D73325
2020-01-27 10:21:06 -08:00
Matt Arsenault
5a867ce09b AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load 2020-01-27 13:05:55 -05:00
Matt Arsenault
b11071aa6a AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load.format 2020-01-27 13:02:19 -05:00
Matt Arsenault
940fc835e4 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load
Use intermediate instructions, unlike with buffer stores. This is
necessary because of the need to have an internal way to distinguish
between signed and unsigned extloads. This introduces some duplication
and near duplication with the buffer store selection path. The store
handling should maybe be moved into legalization to match and
eliminate the duplication.
2020-01-27 12:49:23 -05:00
Matt Arsenault
42b291cc3e AMDGPU/GlobalISel: Handle VOP3NoMods 2020-01-27 09:03:44 -08:00
Matt Arsenault
f22b9884b2 AMDGPU/GlobalISel: Add baseline tests for fma/fmad selection 2020-01-27 09:02:13 -08:00
Matt Arsenault
995d910de2 AMDGPU/GlobalISel: Minor refactor of MUBUF complex patterns
This will make it easier to support the small variants in the complex
patterns for atomics.
2020-01-27 09:00:00 -08:00
Matt Arsenault
14b1e59c2d AMDGPU: Fix not using f16 fsin/fcos
I noticed this because this accidentally started working for
GlobalISel.
2020-01-27 08:59:59 -08:00
Jay Foad
8a902e6ea3 [AMDGPU] Simplify test and extend to gfx9 and gfx10
Summary:
This is in preparation for adding more test cases for D69661 and other
bug fixes in the same area.

Reviewers: tpr, dstuttard, critson, nhaehnle, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70708
2020-01-27 16:56:40 +00:00
Simon Pilgrim
e72c285895 [X86][AVX] Add a more aggressive SimplifyMultipleUseDemandedBits to simplify masked store masks.
Fixes a poor codegen issue noticed in PR11210.
2020-01-27 16:44:25 +00:00
Matt Arsenault
6723ae0205 AMDGPU/GlobalISel: Custom legalize v2s16 G_SHUFFLE_VECTOR
Try to keep simple v2s16 cases as-is. This will more naturally map to
how the VOP3P op_sel modifiers work compared to the expansion
involving bitcasts and bitshifts.

This could maybe try harder with wider source vector types, although
that could be handled with a pre-legalize combine.
2020-01-27 08:28:05 -08:00
Christian Sigg
4789f36750 Add pretty printers for llvm::PointerIntPair and llvm::PointerUnion.
Reviewers: aprantl, dblaikie, jdoerfert, nicolasvasilache

Reviewed By: dblaikie

Subscribers: jpienaar, dexonsmith, merge_guards_bot, llvm-commits

Tags: #llvm, #clang, #lldb, #openmp

Differential Revision: https://reviews.llvm.org/D72557
2020-01-27 17:23:59 +01:00
Nico Weber
86bbb1153b Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()"
This reverts commit 7a8b0b1595e7dc878b48cf9bbaa652087a6895db.
It seems to break exception handling on 32-bit Windows, see
https://crbug.com/1045650
2020-01-27 11:22:33 -05:00
Matt Arsenault
a76ad215d1 Revert "AMDGPU: Temporary drop s_mul_hi_i/u32 patterns"
This reverts commit fe23ed2c681413e7baf517c79aee9be130579873.

It was never really clear this was responsible for the performance
regressions that caused this to be reverted. It's been a long time,
and we need to have scalar patterns for this to get GlobalISel
working.
2020-01-27 08:07:21 -08:00
Teresa Johnson
32209014dc Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d37cf9ad88b5021b33ecdbaf2e18911c (D71913), along
with bot fix 19c76989bb505c3117730c47df85fd3800ea2767.

The bot failure should be fixed by D73418, committed as
af954e441a5170a75687699d91d85e0692929d43.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Matt Arsenault
77c2d662f8 AMDGPU/GlobalISel: Fix not using global atomics on gfx9+
For some reason the flat/global atomics end up in the generated
matcher table in a different order from SelectionDAG. Use
AddedComplexity to prefer checking for global atomics first.
2020-01-27 07:42:42 -08:00
Whitney Tsang
533c97e3ea [LoopUnroll] Remove remapInstruction().
Summary:
LoopUnroll can reuse the RemapInstruction() in ValueMapper, or
remapInstructionsInBlocks() in CloneFunction, depending on the needs.
There is no need to have its own version in LoopUnroll.

By calling RemapInstruction() without TypeMapper or Materializer and
with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does
the same as remapInstruction(). remapInstructionsInBlocks() calls
RemapInstruction() exactly as described.
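
The remapping behaviour described above can be sketched without LLVM. The types and functions below are hypothetical toys, not the ValueMapper API: operands found in the map are rewritten, and operands with no entry are left alone, which is the effect the message attributes to RF_IgnoreMissingLocals.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy instruction that refers to its operands by name.
struct ToyInst {
  std::string Opcode;
  std::vector<std::string> Operands;
};

// Maps original value names to the names of their clones.
using ToyValueMap = std::map<std::string, std::string>;

// Rewrites every operand that has a mapping; unmapped operands are kept.
void remapToyInstruction(ToyInst &I, const ToyValueMap &VMap) {
  for (std::string &Op : I.Operands) {
    auto It = VMap.find(Op);
    if (It != VMap.end())
      Op = It->second;
  }
}

// Block-level helper that applies the same remapping to each instruction.
void remapToyInstructionsInBlocks(std::vector<std::vector<ToyInst>> &Blocks,
                                  const ToyValueMap &VMap) {
  for (std::vector<ToyInst> &Block : Blocks)
    for (ToyInst &I : Block)
      remapToyInstruction(I, VMap);
}
```

Because the block-level helper simply forwards to the per-instruction one, a pass-specific copy of the same loop adds nothing, which is the argument the commit makes for removing it.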

Looking at the history, I cannot find any obvious reason to have its own
version.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
foad, aprantl
Reviewed By: jdoerfert
Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73277
2020-01-27 15:42:13 +00:00
James Henderson
564b3705b0 [test][llvm-dwarfdump] Add extra test case for invalid MD5 form
A subsequent patch will change how an invalid file name table is handled
to allow parsing to continue. This patch adds a test case that will
demonstrate a difference in behaviour with that change between invalid
file tables where the error is before the end of the stated prologue
length and where the error occurs after the stated length.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72157
2020-01-27 15:33:34 +00:00
James Henderson
a050bdcc58 [DebugInfo] Make incorrect debug line extended opcode length non-fatal
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the claimed length
is correct, even if it means moving the offset backwards.
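
The recovery strategy can be sketched with a toy byte format. Everything here is an assumption made for illustration: the encoding (a one-byte length covering the sub-opcode and its operands), the operand sizes in `expectedOperandBytes`, and the names are hypothetical and much simpler than the real DWARF line program parser.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ParseResult {
  std::vector<uint8_t> Opcodes;      // sub-opcodes seen, in order
  std::vector<std::string> Warnings; // non-fatal problems
};

// Hypothetical operand sizes: sub-opcode 2 takes two bytes, others none.
std::size_t expectedOperandBytes(uint8_t SubOpcode) {
  return SubOpcode == 2 ? 2 : 0;
}

ParseResult parseExtendedOps(const std::vector<uint8_t> &Data) {
  ParseResult R;
  std::size_t Offset = 0;
  while (Offset + 2 <= Data.size()) {
    std::size_t Len = Data[Offset++]; // claimed length of this opcode
    std::size_t OpStart = Offset;
    uint8_t SubOpcode = Data[Offset++];
    R.Opcodes.push_back(SubOpcode);
    Offset += expectedOperandBytes(SubOpcode);

    // Trust the claimed length on a mismatch, even if that means moving
    // the offset backwards, and keep parsing instead of giving up.
    std::size_t End = OpStart + Len;
    if (Offset != End) {
      R.Warnings.push_back("unexpected length for extended opcode " +
                           std::to_string(static_cast<unsigned>(SubOpcode)));
      Offset = End;
    }
  }
  return R;
}
```

For the input `{2, 2, 0xAA, 1, 1}` the first opcode claims a length of 2 but would consume 3 bytes, so the sketch records one warning, rewinds to the claimed end, and still parses the second opcode.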

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72155
2020-01-27 15:32:41 +00:00
Matt Arsenault
718bb37069 AMDGPU/GlobalISel: Select more MUBUF global addressing modes
The handling of the high bits of the resource descriptor seems weird to
me, where the 3rd dword changes based on the instruction.
2020-01-27 07:28:36 -08:00
Matt Arsenault
8cbdc76cb3 AMDGPU/GlobalISel: Initial selection of MUBUF addr64 load/store
Fixes the main reason for compile failures on SI, but doesn't really
try to use the addressing modes yet.
2020-01-27 07:13:56 -08:00