mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
191515 Commits

Author SHA1 Message Date
David Blaikie
754b9ad300 DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF
Originally committed in: 1ced28cbe75ff81f35ac2c71e941041eb3afcd00
            Reverted in: f75301d16d444d8cb6810d679290df744bc79ec7

(Reverted because the tests failed on non-Linux/x86 targets. The tests have
since been generalized and specialized: Split DWARF isn't supported on non-ELF
targets anyway, and we have no way to run on "whatever ELF target is
available", so they fail on macOS without an explicit target triple.)

This code was incorrectly emitting extra bytes into arbitrary parts of
the object file when it was meant to be hashing them to compute the DWO
ID.

Follow-up patch(es) will refactor this API somewhat to make such bugs
harder to introduce, hopefully.
2020-02-04 19:25:47 -08:00
David Blaikie
e8f41bc32c DebugInfo: Add a couple of missing COFF sections to make convert-loclist.ll pass on Windows 2020-02-04 19:23:57 -08:00
David Blaikie
ff32271704 DebugInfo: convert-debugloc.ll generalize to run on ppc64le
This target produces a location list for the location, so split the
match between lines to allow for a location list match.
2020-02-04 19:14:22 -08:00
David Blaikie
ef2854822b DebugInfo: Fix convert-loclist.ll Split DWARF variant to use a hardcoded triple
Since we don't support Split DWARF emission on non-ELF formats, hardcode
an elfine triple (we don't have a way to ask for "any ELF triple" it
seems, so hardcoded will have to do)
2020-02-04 19:02:11 -08:00
Thomas Lively
fb5e7af6ce Revert "[WebAssembly] Split and recombine multivalue calls for ISel"
Summary:
This reverts commit 28857d14a86b1e99a9d2795636a5faf17674f5a2. This
commit worked toward a solution that did not turn out to be feasible
because MachineInstrs cannot contain an arbitrary number of defs.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73927
2020-02-04 18:46:43 -08:00
Yonghong Song
a635edbb9b [BPF] disable ReduceLoadWidth during SelectionDag phase
The compiler may transform the following code
  ctx = ctx + reloc_offset
  ... (*(u32 *)ctx) & 0x8000 ...
to
  ctx = ctx + reloc_offset
  ... (*(u8 *)(ctx + 1)) & 0x80 ...
where reloc_offset will be replaced with a constant during
AsmPrinter phase.

The above transformed code will be rejected by the kernel verifier
as it does not allow
  *(type *)((ctx + non_zero_offset1) + non_zero_offset2)
style access pattern.

It is hard at the SelectionDag phase to identify whether a load
is related to the context or not. Sometimes, interprocedural analysis
may be needed. So let us simply prevent such optimization
from happening.

Differential Revision: https://reviews.llvm.org/D73997
2020-02-04 18:37:43 -08:00
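A minimal sketch of the load narrowing described in a635edbb9b above, written as ordinary C++ rather than BPF. The helper names are hypothetical, and the two forms agree only on a little-endian host (assumed here, because byte 1 then holds bits 8..15 of the 32-bit word).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reports whether the host stores the least significant byte first.
bool isLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Wide form: load all 32 bits, then test bit 15.
bool testWide(const unsigned char *ctx) {
  std::uint32_t v;
  std::memcpy(&v, ctx, sizeof v);
  return (v & 0x8000u) != 0;
}

// Narrowed form: load only the byte at offset 1, then test bit 7.
// On a little-endian host this reads the same bit as testWide.
bool testNarrow(const unsigned char *ctx) {
  return (ctx[1] & 0x80u) != 0;
}
```

The narrowed form is the one that adds a second non-zero offset to the pointer, which is the access pattern the commit message says the verifier rejects.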
David Blaikie
839fcb753d Recommit: DebugInfo: Check DW_OP_convert in loclists with Split DWARF
Originally committed in: 552a8fe12bd1822f48dda2e9e8728a179f82d356
            Reverted in: f75301d16d444d8cb6810d679290df744bc79ec7

Reverted because it was running llc directly (rather than %llc_dwarf),
which produces COFF files on Windows, a format in which LLVM doesn't
support all DWARF features.

This functionality isn't fully working, but sets up the testing for a
follow-on patch that demonstrates and fixes the brokenness related to
DWO ID hashing this construct.
2020-02-04 18:37:06 -08:00
Thomas Lively
50202c53b2 [WebAssembly] Enable recently implemented SIMD operations
Summary:
Moves a batch of instructions from unimplemented-simd128 to simd128
because they have recently become available in V8.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73926
2020-02-04 18:36:32 -08:00
David Blaikie
3bacbcf81c DebugInfo: use a symbolic DIE reference in convert-loclist.ll 2020-02-04 18:23:22 -08:00
David Blaikie
39a5cbe424 Reapply: DebugInfo: Add missing test coverage for DW_OP_convert in loclists
Originally committed in: 5327b917e3bd0b3db352cb5a61eea7409f2d1972
      and follow on fix: 4f281f047457ce3f1870a93253476222314f420b

            Reverted in: 191a9a78b3f4bdf35a30d3480bd630d787a2fdf6
                    and: f75301d16d444d8cb6810d679290df744bc79ec7

Reverted because it wasn't portable between the targets it was running
on. Using %llc_dwarf ensures the target triple is always elfine and thus
DWARF compatible.
2020-02-04 18:18:45 -08:00
David Blaikie
ea04f28aeb DebugInfo: Generalize test/DebugInfo/X86/convert-linked.ll to run with different architectures 2020-02-04 18:02:03 -08:00
David Blaikie
253c75ac9b DebugInfo: Generalize test/DebugInfo/X86/convert-inlined.ll
This test was hardcoded to only run on x86-64-linux-gnu and was overly
constrained by CHECK-NEXTing every line for an exact match.
2020-02-04 17:51:35 -08:00
Michael Liao
4c2f3ec765 Fix warning on trailing ;. NFC. 2020-02-04 20:47:55 -05:00
Francis Visoiu Mistrih
96cf9b122a [Remarks] Fix gcc build 2020-02-04 17:43:59 -08:00
Michael Liao
e9ceb1cd35 Fix warning on trailing ;. NFC. 2020-02-04 20:42:05 -05:00
David Blaikie
754db50549 DebugInfo: convert-debugloc.ll remove erroneous CHECK 2020-02-04 17:39:38 -08:00
David Blaikie
a9e63414e5 DebugInfo: Generalize convert-debugloc.ll to run for multiple target architectures
This test was overly constrained & hardcoded only to x86-linux-gnu -
generalize the test & remove the hardcoded target triple.
2020-02-04 17:34:13 -08:00
LLVM GN Syncbot
225e5a5e85 [gn build] Port b8a847c0a3e 2020-02-05 01:27:20 +00:00
LLVM GN Syncbot
a0537a10f2 [gn build] Port 7531a5039fd 2020-02-05 01:27:19 +00:00
Francis Visoiu Mistrih
57aee84cb1 [Remarks] Extend the RemarkStreamer to support other emitters
This extends the RemarkStreamer to allow for other emitters (e.g.
frontends, SIL, etc.) to emit remarks through a common interface.

See changes in llvm/docs/Remarks.rst for motivation and design choices.

Differential Revision: https://reviews.llvm.org/D73676
2020-02-04 17:16:02 -08:00
Alina Sbirlea
5f84616692 [NFCI] Update according to style.
clang-tidy + clang-format
2020-02-04 17:11:36 -08:00
Reid Kleckner
909a964504 Fix some more -Wrange-loop-analysis warnings in AArch64TargetParser 2020-02-04 16:57:49 -08:00
Craig Topper
ccc67b1eb8 [X86] Add custom lowering for lrint/llrint to either cvtss2si/cvtsd2si or fist.
lrint/llrint are defined as rounding using the current rounding
mode. Numbers that can't be converted raise FE_INVALID and an
implementation defined value is returned. They may also write to
errno.

I believe this means we can use cvtss2si/cvtsd2si or fist to
convert as long as -fno-math-errno is passed on the command line.
Clang will leave them as libcalls if errno is enabled so they
won't become ISD::LRINT/LLRINT in SelectionDAG.

For 64-bit results on a 32-bit target we can't use cvtss2si/cvtsd2si
but we can use fist since it can write to a 64-bit memory location.
Though maybe we could consider using vcvtps2qq/vcvtpd2qq on avx512dq
targets?

gcc also does this optimization.

I think we might be able to do this with STRICT_LRINT/LLRINT as
well, but I've left that for future work.

Differential Revision: https://reviews.llvm.org/D73859
2020-02-04 16:15:40 -08:00
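A small sketch of the "current rounding mode" behaviour that ccc67b1eb8 above relies on, using only the standard <cfenv> and <cmath> interfaces. The wrapper name lrintWithMode is hypothetical, and the volatile qualifiers are an assumption made here to keep a compiler from folding or moving the conversion across the mode changes.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Converts x to long under the given rounding mode, then restores the
// mode that was active on entry.
long lrintWithMode(double x, int mode) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double input = x;                  // read after the mode switch
  volatile long result = std::lrint(input);   // stored before the restore
  std::fesetround(saved);
  return result;
}
```

With the default FE_TONEAREST mode, lrintWithMode(2.5, FE_TONEAREST) rounds the tie to the even neighbour 2, while FE_UPWARD gives 3. A lowering to a hardware conversion has to preserve exactly this mode dependence.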
Reid Kleckner
65d778ed6e [Support] Fix warnings in ARMTargetParser.cpp 2020-02-04 15:48:22 -08:00
Craig Topper
44689041f6 [X86] Give KSET0* and KSET1* pseudos the same scheduler resource usage as KXOR/KXNOR.
These aren't recognized as idioms by the CPU so they still use
execution resources. We just use the pseudo to force the input
register to k0.
2020-02-04 15:22:54 -08:00
Reid Kleckner
4849f968cb [SEH] Remove CATCHPAD SDNode and X86::EH_RESTORE MachineInstr
The CATCHPAD node mostly existed to be selected into the EH_RESTORE
instruction, which sets the frame back up when 32-bit Windows exceptions
return to the parent function. However, creating this MachineInstr early
increases the risk that other passes will come along and insert
instructions that use the stack before ESP and EBP are restored. That
happened in PR44697.

Instead of representing these in the instruction stream early, delay it
until PEI. Mark the blocks where this needs to happen as EHPads, but not
funclet entry blocks. Passes after PEI have to be careful not to hoist
instructions that can use stack across frame setup instructions, so this
should be relatively reliable.

Fixes PR44697

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D73752
2020-02-04 15:13:12 -08:00
Kiran Chandramohan
dccd6cdc3a [OpenMP] Add Flush directive to OpenMPIRBuilder
Add support for Flush in the OMPIRBuilder. This patch also adds changes
to clang to use the OMPIRBuilder when the '-fopenmp-enable-irbuilder'
command-line option is used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70712
2020-02-04 22:48:02 +00:00
Matt Arsenault
9cc24c4b43 AMDGPU: Fix isAlwaysUniform for simple asm SGPR results
We were handling the case where the result was a struct with an
extracted SGPR component, but not for the simple case.
2020-02-04 13:34:14 -08:00
Simon Pilgrim
8bb306d76b Fix "expression is redundant [misc-redundant-expression]" warning (PR44768)
Be more specific that getOperandConstraint should return -1 or a uint8_t value
2020-02-04 21:24:21 +00:00
Matt Arsenault
94d2d8f376 AMDGPU/GlobalISel: Select G_SEXT_INREG 2020-02-04 13:23:53 -08:00
Matt Arsenault
d15b15865e AMDGPU/GlobalISel: Do a better job splitting 64-bit G_SEXT_INREG
We don't need to expand to full shifts for the > 32-bit case. This
just switches to a sext_inreg of the high half.
2020-02-04 13:23:53 -08:00
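The splitting idea in d15b15865e above can be sketched in plain C++: for a sign-extension width above 32 bits, the low half is unchanged and only the high half needs a narrower sign-extension. The function names are hypothetical and this is an illustration of the arithmetic, not the GlobalISel code. Everything stays in unsigned arithmetic to avoid implementation-defined shifts.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `bits` bits of v (1 <= bits <= 32), returning the
// result as a 32-bit pattern.
std::uint32_t sextInReg32(std::uint32_t v, unsigned bits) {
  const std::uint32_t sign = std::uint32_t{1} << (bits - 1);
  const std::uint32_t mask = (sign << 1) - 1;  // wraps to all-ones at bits == 32
  return ((v & mask) ^ sign) - sign;
}

// Reference: sign-extends the low `bits` bits of v (1 <= bits <= 64).
std::uint64_t sextInReg64(std::uint64_t v, unsigned bits) {
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t mask = (sign << 1) - 1;  // wraps to all-ones at bits == 64
  return ((v & mask) ^ sign) - sign;
}

// Split form for 32 < bits <= 64: keep the low half as-is and
// sign-extend only the high half from (bits - 32) bits.
std::uint64_t sextInReg64Split(std::uint64_t v, unsigned bits) {
  const std::uint32_t lo = static_cast<std::uint32_t>(v);
  const std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
  const std::uint32_t newHi = sextInReg32(hi, bits - 32);
  return (static_cast<std::uint64_t>(newHi) << 32) | lo;
}
```

For example, extending 0x0000008012345678 from 40 bits touches only the high word: 0x80 is extended from 8 bits to 0xFFFFFF80, and the low word 0x12345678 is carried over unchanged.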
Matt Arsenault
acf5f4aa9b AMDGPU/GlobalISel: Legalize G_SEXT_INREG
Split the VALU 64-bit case in RegBankSelect.
2020-02-04 13:23:53 -08:00
Austin Kerbow
4d4cde0c25 [AMDGPU] Fix infinite loop with fma combines
https://reviews.llvm.org/D72312 introduced an infinite loop which involves
DAGCombiner::visitFMA and AMDGPUTargetLowering::performFNegCombine.

fma( a, fneg(b), fneg(c) ) => fneg( fma (a, b, c) ) => fma( a, fneg(b), fneg(c) ) ...

This only breaks with types where 'isFNegFree' returns false, e.g. v4f32.
Reproducing the issue also needs the attribute 'no-signed-zeros-fp-math',
and no source mods allowed on one of the users of the Op.

This fix makes changes to indicate that it is not free to negate a fma if it
has users with source mods.

Differential Revision: https://reviews.llvm.org/D73939
2020-02-04 13:11:09 -08:00
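The two forms that 4d4cde0c25 above describes the combines cycling between can be compared directly with std::fma. This is a sketch with hypothetical function names. It assumes the default round-to-nearest mode and no fast-math flags, and the signed-zero remark is a reading of why the commit mentions 'no-signed-zeros-fp-math', not something the commit states.

```cpp
#include <cassert>
#include <cmath>

// fma(a, fneg(b), fneg(c)): the negations are pushed into the operands.
double negatedOperands(double a, double b, double c) {
  return std::fma(a, -b, -c);
}

// fneg(fma(a, b, c)): a single negation is applied to the result.
double negatedResult(double a, double b, double c) {
  return -std::fma(a, b, c);
}
```

For non-zero results the two agree, e.g. both give -10 for (2, 3, 4). When a*b + c is exactly zero they differ only in the sign of zero: for (1, 1, -1) the first form yields +0 and the second yields -0, so rewriting one into the other is only valid when signed zeros may be ignored.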
Matt Arsenault
6a53042562 AMDGPU/GlobalISel: Remove extension legality hacks
The legalization has improved since this was added, and the tests
relying on this no longer need it.
2020-02-04 12:50:47 -08:00
Craig Topper
212c73c460 Recommit "[X86] Use X86ISD::SUB instead of X86ISD::CMP in some places."
This time with correct types for the data result from the SUB.

Original commit message:

Our normal lowering for ISD::SETCC uses X86ISD::SUB to enable
CSE unless the RHS is 0. optimizeCompareInstr called by the peephole
pass can turn subs with unused results into cmps to clean this up.

This commit makes other places that create X86ISD::CMP have the
same behavior.
2020-02-04 12:19:34 -08:00
Teresa Johnson
a38d4dc3f7 [InlineCost] Add flag to allow changing the default inline cost
Summary:
It can be useful to tune the default inline threshold without overriding other inlining thresholds (e.g. in code compiled for size).

The existing `-inline-threshold` flag overrides other thresholds, so it is insufficient in codebases where there is a mix of code compiled for size and speed.

Patch by Michael Holman <michael.holman@microsoft.com>

Reviewers: eraman, tejohnson

Reviewed By: tejohnson

Subscribers: tejohnson, mtrofin, davidxl, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73217
2020-02-04 12:06:20 -08:00
Matt Arsenault
81f69fa788 AMDGPU/GlobalISel: Custom lower G_FEXP 2020-02-04 11:50:55 -08:00
Matt Arsenault
5c5f09fb85 AMDGPU/GlobalISel: Legalize s16 G_FEXP2 2020-02-04 11:50:55 -08:00
Matt Arsenault
726bd885da CodeGenPrepare: Reorder check for cold and shouldOptimizeForSize
shouldOptimizeForSize is showing up in a profile, spending around 10%
of the pass time in one function. This should probably not be so slow,
but the much cheaper attribute check should be done first anyway.
2020-02-04 11:23:13 -08:00
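The reordering in 726bd885da above amounts to putting the cheap test on the left of a short-circuiting operator. A minimal sketch follows. FunctionInfo, expensiveSizeQuery and treatAsSizeOptimized are hypothetical stand-ins, not LLVM names, and whether the real condition uses || or && is not stated in the commit message.

```cpp
#include <cassert>

// Hypothetical stand-in for the function being queried; not an LLVM type.
struct FunctionInfo {
  bool isCold = false;
  int expensiveQueries = 0;  // counts how often the costly check ran
};

// Stand-in for a costly profile-based query.
bool expensiveSizeQuery(FunctionInfo &f) {
  ++f.expensiveQueries;
  return false;
}

// Cheap attribute check first: || short-circuits, so the costly query
// is never evaluated when the cheap flag already decides the answer.
bool treatAsSizeOptimized(FunctionInfo &f) {
  return f.isCold || expensiveSizeQuery(f);
}
```

With this ordering, a cold function never pays for the expensive query, and every other function pays for it exactly once per call.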
Matt Arsenault
0f987f6333 AMDGPU: Split denormal mode tracking bits
Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.

This is just a mechanical change, and mostly still assumes the input
and output mode match. This should be refined for some cases. For
example, fcanonicalize lowering should use the flushing variant if
either input or output flushing is enabled.
2020-02-04 10:44:21 -08:00
Fangrui Song
6b2ff4d5ae [test] yaml2obj -docnum => --docnum=
Make usage more consistent, and make it possible to enable LongOptionsUseDoubleDash.
2020-02-04 10:33:21 -08:00
Matt Arsenault
d5d8b833fb AMDGPU: Cleanup SMRD buffer selection
The usage of the Imm out argument from SelectSMRDOffset is pretty
confusing. Stop trying to reject CI immediates in the case where the
offset field can be used. It's not an illegal way to encode the
immediate, so just prefer the better encoding pattern with
AddedComplexity.

We probably don't even really need the different opcodes for the
different offset types anymore, but that will be more work to clean up.

The SMRD non-buffer load patterns could also use a cleanup to be done
separately.
2020-02-04 10:28:08 -08:00
Matt Arsenault
edc134394e GlobalISel: Fold SmallVector resizes into constructors 2020-02-04 10:28:08 -08:00
Simon Pilgrim
dea8bc5779 [X86] Fix missing load latencies (PR36894)
We weren't accounting for load latencies in the SSE42/AES/CLMUL schedule classes
2020-02-04 18:18:29 +00:00
Matt Arsenault
5025270c03 Try to fix buildbot failure 2020-02-04 13:12:46 -05:00
Hiroshi Yamauchi
fcb2bf76aa [BFI] Add a debug check for unknown block queries.
Summary:
Add a debug check for frequency queries for unknown blocks (typically blocks
that are created after BFI is computed and whose frequencies are not
communicated to BFI).

This is useful for detecting and debugging missed BFI updates.

This is debug build only and disabled behind a flag.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73920
2020-02-04 10:05:28 -08:00
Sanjay Patel
06f90b77ee [InstCombine] add FIXME comment to shuffle transform; NFC
Existing tests:
rG5d04e008f708
rG2a191cf8500f
...should verify that the underlying analysis doesn't improve
too much without updating this user code.
2020-02-04 13:02:06 -05:00
Matt Arsenault
0e9ab3b9f6 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Fangrui Song
25f7d09947 [X86] -fpatchable-function-entry=N,0: place patch label after ENDBR{32,64}
Similar to D73680 (AArch64 BTI).

A local linkage function whose address is not taken does not need ENDBR32/ENDBR64. Placing the patch label after ENDBR32/ENDBR64 has the advantage that code does not need to differentiate whether the function has an initial ENDBR.

Also, add 32-bit tests and test that .cfi_startproc is at the function
entry. The line information has a general implementation and is tested
by AArch64/patchable-function-entry-empty.mir

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73760
2020-02-04 09:42:36 -08:00
David Spickett
a7418937f9 [ARM] Correct missing newline after outputting .tlsdescseq directive.
Differential Revision: https://reviews.llvm.org/D73972
2020-02-04 17:38:09 +00:00