1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

194053 Commits

Author SHA1 Message Date
David Blaikie
a7f9ea705a Make llvm::function_ref's operator bool explicit
This can avoid all sorts of mistakes with implicit conversion
(indirectly) to int, etc. I'm quite surprise there aren't any things to
fixup with this - but I guess most uses of function_ref aren't
optional/nullable.
2020-03-26 20:09:57 -07:00
Zakk Chen
028706e67a Fix typo, targetFeature should be lowercase.
this fixing also enable llc -mattr=+cpuhelp

Reviewers: ziangwan, kongyi

Reviewed By: kongyi

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76757
2020-03-26 19:40:04 -07:00
Kai Wang
78d8129469 [NFC] Clang format for the ELF header and ARM build attributes.
Differential Revision: https://reviews.llvm.org/D76819
2020-03-27 09:53:12 +08:00
Leonard Chan
583d100083 Move setBugReportMsg() out from under a conditional
Fixes a build break with LLVM_ENABLE_BACKTRACES=OFF.

Differential Revision: https://reviews.llvm.org/D76893
2020-03-26 16:39:03 -07:00
Cyndy Ishida
dbf20123ca [llvm][TextAPI/MachO] silence clang-tidy warnings, NFC
* applies only to tests
2020-03-26 16:32:04 -07:00
Dan Gohman
b72287bdfc [WebAssembly] Support wasm exports with zero-length names.
Zero-length strings are valid export names in WebAssembly, so allow
users to specify them.

Differential Revision: https://reviews.llvm.org/D71793
2020-03-26 16:20:43 -07:00
Dan Gohman
d964845ffe [WebAssembly] Fix the order of destructors in the LowerGlobalDtors pass.
Fix the LowerGlobalDtors pass to run destructors in the same order as the
regular LLVM destructor lowering -- in reverse order. Adjacent
destructors with the same associated object are grouped, but destructors
are not reordered based on associated objects.

Differential Revision: https://reviews.llvm.org/D70685
2020-03-26 16:19:02 -07:00
Stanislav Mekhanoshin
f9ca869bd0 [AMDGPU] Propagate amdgpu-waves-per-eu to callees
Differential Revision: https://reviews.llvm.org/D76868
2020-03-26 14:43:44 -07:00
LLVM GN Syncbot
c9d270bf8d [gn build] Port 9f7d4150b9e 2020-03-26 21:10:45 +00:00
Craig Topper
26b3a1d698 [X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.
These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This pass adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649
2020-03-26 14:10:20 -07:00
Simon Pilgrim
a8c9ffb590 [X86] Prefer PACKUS(AND(),AND()) to SHUFFLE(PSHUFB(),PSHUFB()) on all targets
Extends rG9d1721ce3926 to support AVX2+ targets.
2020-03-26 20:46:24 +00:00
Jay Foad
fa0f9a79a6 [AMDGPU] Rename overloaded getMaxWavesPerEU to getWavesPerEUForWorkGroup
Summary: I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76860
2020-03-26 20:21:04 +00:00
Jay Foad
6f92f87437 [AMDGPU] Remove getMaxWavesPerCU in favour of getWavesPerWorkGroup.
Summary:
These methods were identical. I chose to remove getMaxWavesPerCU because
I think Max in the name was misleading. NFC.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76859
2020-03-26 20:21:04 +00:00
Derek Schuff
9362eda6c4 [WEbAssembly] Clear frame base vreg in explicit-locals when stack pointer is dead
Having an alloca in a function causes the stack pointer to be generated in the
prolog, but if it's unused other than for debug info, explicit-locals will drop
it and not allocate a local. In this case we need to reset the FrameBaseVreg.

Differential Revision: https://reviews.llvm.org/D76784
2020-03-26 13:07:32 -07:00
Simon Pilgrim
f0bba60cca [X86] lowerV16I8Shuffle - create v8i16 mask for PACKUS(AND(),AND()) patterns.
We can improve computeKnownBits results by avoiding excess bitcasts.

For this pattern we were doing:

  (v16i8 PACKUS(v8i16 BITCAST(v16i8 AND(V1, MASK)), v8i16 BITCAST(v16i8 AND(V2, MASK))))

By performing the MASK/AND with a v8i16 type and bitcasting V1/V2 directly we can help computeKnownBits see that the mask is clearing the upper bits and allows shuffle combining to peek through later on.

This will be necessary to extend rG9d1721ce3926 to AVX2+ targets in a future patch.
2020-03-26 19:59:57 +00:00
diggerlin
fd7184191d [AIX] discard the label in the csect of function description and use qualname for linkage
SUMMARY:

SUMMARY
for a source file  "test.c"

void foo() {};

llc will generate assembly code as (assembly patch)
     .globl  foo
     .globl  .foo
     .csect foo[DS]
foo:

        .long   .foo
        .long   TOC[TC0]
        .long   0

   and symbol table as (xcoff object file)
   [4]     m   0x00000004     .data     1  unamex                    foo
   [5]     a4  0x0000000c       0    0     SD       DS    0    0
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     LD       DS    0    0

   After first patch, the assembly will be as

        .globl  foo[DS]                 # -- Begin function foo
        .globl  .foo
        .align  2
        .csect foo[DS]
        .long   .foo
        .long   TOC[TC0]
        .long   0

    and symbol table will as
   [6]     m   0x00000004     .data     1  extern                    foo
   [7]     a4  0x00000004       0    0     DS      DS    0    0
Change the code for the assembly path and xcoff objectfile patch for llc.

Reviewers: Jason Liu
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D76162
2020-03-26 15:46:52 -04:00
Michael Liao
58a5965510 [cuda][hip] Add CUDA builtin surface/texture reference support.
Summary:
- Even though the bindless surface/texture interfaces are promoted,
  there are still code using surface/texture references. For example,
  [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the
  compilation issue for code using `tex2D` with texture references. For
  better compatibility, this patch proposes the support of
  surface/texture references.
- Due to the absent documentation and magic headers, it's believed that
  `nvcc` does use builtins for texture support. From the limited NVVM
  documentation[^nvvm] and NVPTX backend texture/surface related
  tests[^test], it's believed that surface/texture references are
  supported by replacing their reference types, which are annotated with
  `device_builtin_surface_type`/`device_builtin_texture_type`, with the
  corresponding handle-like object types, `cudaSurfaceObject_t` or
  `cudaTextureObject_t`, in the device-side compilation. On the host
  side, that global handle variables are registered and will be
  established and updated later when corresponding binding/unbinding
  APIs are called[^bind]. Surface/texture references are most like
  device global variables but represented in different types on the host
  and device sides.
- In this patch, the following changes are proposed to support that
  behavior:
  + Refine `device_builtin_surface_type` and
    `device_builtin_texture_type` attributes to be applied on `Type`
    decl only to check whether a variable is of the surface/texture
    reference type.
  + Add hooks in code generation to replace that reference types with
    the correponding object types as well as all accesses to them. In
    particular, `nvvm.texsurf.handle.internal` should be used to load
    object handles from global reference variables[^texsurf] as well as
    metadata annotations.
  + Generate host-side registration with proper template argument
    parsing.

---
[^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf
[^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll
[^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf).
[^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used.  But, the current backend doesn't have that supported. We may revise that later.

Reviewers: tra, rjmccall, yaxunl, a.sidorin

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76365
2020-03-26 14:44:52 -04:00
Scott Linder
3a0c436c93 [AMDGPU] Fix PC register mapping in wave32 mode
Summary:
The PC_32 DWARF register is for a 32-bit process address space which we
don't implement in AMDGCN; another way of putting this is that the size
of the PC register is not a function of the wavefront size. If we ever
implement a 32-bit process address space we will need to add two more
DwarfFlavours i.e. we will need to represent the product of (wave32,
wave64) x (64-bit address space, 32-bit address space).

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76732
2020-03-26 14:43:25 -04:00
David Blaikie
8238e95412 Roll otherwise unused subexpressions into an assertion 2020-03-26 11:32:33 -07:00
Sanjay Patel
b80ff02f88 [InstCombine] add shuffle-with-bitcast-operand tests; NFC 2020-03-26 14:28:47 -04:00
Guillaume Chatelet
bb1b18ffd2 [Alignment][NFC] Use llvmTargetFrameLowering::getStackAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: wuzish, arsenm, jyknight, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, fedor.sergeev, jrtc27, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76613
2020-03-26 18:15:53 +00:00
Jonathan Roelofs
f30eee7386 [InstCombine] Fix Incorrect fold of ashr+xor -> lshr w/ vectors
Fixes https://bugs.llvm.org/show_bug.cgi?id=43665
2020-03-26 12:09:36 -06:00
Jinsong Ji
012f102459 [docs][Phabricator] git migration related update
1.Add instructions to update author when committing other's patch

We have updated DeveloperPolicy to show how to change author in
https://reviews.llvm.org/D72468

We should also update Phabricator page to include such infomation,
in case people follow the steps here and forget to update author info.

2. Replace `git llvm push` with `git push`

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D76718
2020-03-26 18:08:06 +00:00
Jay Foad
50b935813e [AMDGPU] Make use of divideCeil. NFC. 2020-03-26 16:11:35 +00:00
Jay Foad
a62c51da79 [AMDGPU] Remove unused methods. NFC. 2020-03-26 16:11:35 +00:00
Fangrui Song
d7fb6561f1 [llvm-objdump] Fix typo. NFC 2020-03-26 09:10:14 -07:00
Justin Hibbits
6bf4f263ac [PowerPC]: Don't allow r0 as a target for LD_GOT_TPREL_L/32
Summary:
The linker is free to relax this (relocation R_PPC_GOT_TPREL16) against
R_PPC_TLS, if it sees fit (initial exec to local exec).  If r0 is used,
this can generate execution-invalid code (converts to 'addi %rX, %r0,
FOO, which translates in PPC-lingo to li %rX, FOO).  Forbid this
instead.

This fixes static binaries using locales on FreeBSD/powerpc
(tested on FreeBSD/powerpcspe).

Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D76662
2020-03-26 10:59:28 -05:00
Simon Pilgrim
27633c0863 [X86][SSE] Prefer PACKUS(AND(),AND()) to SHUFFLE(PSHUFB(),PSHUFB()) on pre-AVX2 targets
As discussed on PR31443, we should be trying to use PACKUS for binary truncation patterns to reduce the number of shuffles.

The plan is to support AVX2+ targets once we've worked around PR45315 - we fail to peek through a VBROADCAST_LOAD mask to recognise zero upper bits in a PACKUS pattern.

We should also be able to add support for v8i16 and possibly 256/512-bit vectors as well.
2020-03-26 15:47:43 +00:00
Fangrui Song
82533e3f53 [PPCInstPrinter] Change printBranchOperand(calltarget) to print the target address in hexadecimal form
```
// llvm-objdump -d output (before)
0: bl .-4
4: bl .+0
8: bl .+4

// llvm-objdump -d output (after) ; GNU objdump -d
0: bl 0xfffffffc / bl 0xfffffffffffffffc
4: bl 0x4
8: bl 0xc
```

Many Operand's are not annotated as OPERAND_PCREL.
They are not affected (e.g. `b .+67108860`). I plan to fix them in future patches.

Modified test/tools/llvm-objdump/ELF/PowerPC/branch-offset.s to test
address space wraparound for powerpc32 and powerpc64.

Reviewed By: sfertile, jhenderson

Differential Revision: https://reviews.llvm.org/D76591
2020-03-26 08:32:29 -07:00
Fangrui Song
3d2508df1f [X86InstPrinter] Change printPCRelImm to print the target address in hexadecimal form
```
// llvm-objdump -d output (before)
400000: e8 0b 00 00 00   callq 11
400005: e8 0b 00 00 00   callq 11

// llvm-objdump -d output (after)
400000: e8 0b 00 00 00  callq 0x400010
400005: e8 0b 00 00 00  callq 0x400015

// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
400000: e8 0b 00 00 00  callq 400010
400005: e8 0b 00 00 00  callq 400015
```

In llvm-objdump, we pass the address of the next MCInst. Ideally we
should just thread the address of the current address, unfortunately we
cannot call X86MCCodeEmitter::encodeInstruction (X86MCCodeEmitter
requires MCInstrInfo and MCContext) to get the length of the MCInst.

MCInstPrinter::printInst has other callers (e.g llvm-mc -filetype=asm, llvm-mca) which set Address to 0.
They leave MCInstPrinter::PrintBranchImmAsAddress as false and this change is a no-op for them.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D76580
2020-03-26 08:28:59 -07:00
Simon Cook
baf8a6dc1d [RISCV] Support negative constants in CompressInstEmitter
Summary:
Some compressed instructions match against negative values; store
immediates as a signed value such that these patterns will now match
the intended instructions.

Reviewers: asb, lenary, PaoloS

Reviewed By: asb

Subscribers: rbar, johnrusso, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, evandro, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76767
2020-03-26 15:23:38 +00:00
Fangrui Song
7f7bfe12ea [MCInstPrinter] Pass Address parameter to MCOI::OPERAND_PCREL typed operands. NFC
Follow-up of D72172 and D72180

This patch passes `uint64_t Address` to print methods of PC-relative
operands so that subsequent target specific patches can change
`*InstPrinter::print{Operand,PCRelImm,...}` to customize the output.

Add MCInstPrinter::PrintBranchImmAsAddress which is set to true by
llvm-objdump.

```
// Current llvm-objdump -d output
aarch64: 20000: bl #0
ppc:     20000: bl .+4
x86:     20000: callq 0

// Ideal output
aarch64: 20000: bl 0x20000
ppc:     20000: bl 0x20004
x86:     20000: callq 0x20005

// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
aarch64: 20000: bl 20000
ppc:     20000: bl 0x20004
x86:     20000: callq 20005
```

In `lib/Target/X86/X86GenAsmWriter1.inc` (generated by `llvm-tblgen -gen-asm-writer`):

```
   case 12:
     // CALL64pcrel32, CALLpcrel16, CALLpcrel32, EH_SjLj_Setup, JCXZ, JECXZ, J...
-    printPCRelImm(MI, 0, O);
+    printPCRelImm(MI, Address, 0, O);
     return;
```

Some targets have 2 `printOperand` overloads, one without `Address` and
one with `Address`. They should annotate derived `Operand` properly with
`let OperandType = "OPERAND_PCREL"`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D76574
2020-03-26 08:21:15 -07:00
LLVM GN Syncbot
bee7db5b02 [gn build] Port 2aac0c47aed 2020-03-26 15:16:51 +00:00
Dominik Montada
5df600714d [GlobalISel] add helper function to create arbitrary libcalls
Summary:
The existing helper function can only create a libcall to functions available in
RTLIB. Add a helper function that can create a libcall to a given function name
using the provided calling convention.

Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders

Reviewed By: arsenm

Subscribers: wdng, hiraditya, volkan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76845
2020-03-26 16:11:13 +01:00
Louis Dionne
8126913310 [lit] NFC: Remove trailing whitespace
I keep having to remove them from my diffs!
2020-03-26 11:05:18 -04:00
Qiu Chaofan
4e091a22bd [Legalizer] Fix some flags miss in vector results
In some scalarize/split result methods (unary, binary, ...), flags in
SDNode were not passed down, which may lead to unexpected results in
unsafe float-point optimization. This patch fixes them. (maybe not
complete)

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D76832
2020-03-26 22:01:19 +08:00
Sam Parker
33ab418b5a [NFC] Create X86 subdirectory for indvar tests
Many IndVarSiimplify tests target an x86 triple, so move them into
a target specific folder.
2020-03-26 12:24:45 +00:00
Aaron Ballman
e021b93f38 Clarify use of llvm_unreachable in the coding standard.
There has been some ongoing confusion regarding when to use `llvm_unreachable`
which this patch attempts to address. Specifically, the confusion has been
around whether `llvm_unreachable` is intended to mark only unreachable code
paths that the compiler cannot determine itself or to mark a code path which is
unconditionally a bug to reach. Based on email and IRC discussions, it sounds
like "unconditional bug to reach" is the consensus.
2020-03-26 08:08:23 -04:00
Simon Pilgrim
2950a4b04b [X86][SSE] getFauxShuffleMask - peek through TRUNCATE/AEXT/ZEXT for INSERT_VECTOR_ELT(EXTRACT_VECTOR_ELT())
As long we extract from a source vector with smaller elements and we zero-extend the element in the final shuffle mask then we can safely peek through truncations and any/zero-extensions to find the source extraction.
2020-03-26 11:57:45 +00:00
Jonas Paulsson
e127891baf [SystemZ] Bugfix in tieOpsIfNeeded()
This function did a check which was broken to see if an opcode requires op0
and op1 to be tied. By chance this is NFC.

Review: Ulrich Weigand
2020-03-26 12:22:14 +01:00
Georgii Rymar
edd888ac89 [obj2yaml] - Refactor how we dump sections. NFCI.
This is a NFC splitted from D75342.

Previously obj2yaml never dumped a normal SHT_NULL section (i.e. when it is just zeroed)
or non-allocatable SHT_STRTAB/SHT_SYMTAB/SHT_DYNSYM sections.

This patch does not change the output, but it changes the logic so that we now dump these
sections, and them remove them later. It allows us to create and work with our internal representation
of sections, i.e. to work with the vector of Chunks, what looks cleaner.

It is used by D75342 and also should help us to support dumping a content that does not
belong to a section (i.e. to dump some data as `Fill` chunks).

Differential revision: https://reviews.llvm.org/D76684
2020-03-26 14:04:07 +03:00
gbreynoo
b1f1108ed2 Tools emit the bug report URL on crash
When Clang crashes a useful message is output:

"PLEASE submit a bug report to https://bugs.llvm.org/ and include the
crash backtrace, preprocessed source, and associated run script."

A similar message is now output for all tools.

Differential Revision: https://reviews.llvm.org/D74324
2020-03-26 10:26:59 +00:00
Kang Zhang
4130033848 [PowerPC] Remove the repeated definition for some InstAlias for mtspr/mfspr
Summary:
Below InstAlias have been redefined, this patch is to remove the repeated
definition.
mtdec/mfdec mtsdr1/mfsdr1 mtsrr0/mfsrr0 mtsrr1/mfsrr1 mtasr

Reviewed By: nemanjai, steven.zhang

Differential Revision: https://reviews.llvm.org/D75821
2020-03-26 09:58:30 +00:00
James Henderson
0518ef06da [NFC][llvm-readobj] Refactor unique warning handler
The unique warning handler was previously a property of the dump style,
but it is commonly used in the dumper too. Since the two ELF output
styles have no impact on the way warnings are printed, this patch moves
the handler and related functions into the dumper class, instead of the
dump style class.

Reviewed by: MaskRay, grimar

Differential Revision: https://reviews.llvm.org/D76777
2020-03-26 09:54:55 +00:00
Cullen Rhodes
4020b155d6 [AArch64][SVE] Implement structured store intrinsics
Summary:
This patch adds initial support for the following intrinsics:

    * llvm.aarch64.sve.st2
    * llvm.aarch64.sve.st3
    * llvm.aarch64.sve.st4

For storing two, three and four vectors worth of data. Basic codegen for
reg+immediate forms are implemented. Reg+reg addressing modes will be
addressed in a later patch.

These intrinsics are intended for use in the Arm C Language Extension
(ACLE).

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75947
2020-03-26 09:34:51 +00:00
Ties Stuij
74a8dfdced [PATCH] [ARM] ARMv8.6-a command-line + BFloat16 Asm Support
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

in addition to the GCC patch for the 8..6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html

In detail this patch

- march options for armv8.6-a
- BFloat16 assembly

This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.

Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson

Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson

Reviewed By: SjoerdMeijer

Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D76062
2020-03-26 09:17:20 +00:00
Simon Tatham
b77c309b4c Do export symbols when LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is on.
Summary:
In D76527, we stopped exporting symbols from clang, opt and llc unless
the `LLVM_ENABLE_PLUGINS` cmake variable is true (which causes clang's
own plugin collection to be built).

But another reasonable build configuration is to ask clang to export
its symbols for out-of-tree plugins to use, without building the
in-tree ones. That is, you might set `LLVM_EXPORT_SYMBOLS_FOR_PLUGINS`
without also setting `LLVM_ENABLE_PLUGINS` (at least if you're using
MSVC, where you need to ask explicitly for the symbols to be
exported).

In that situation, the symbols should still be exported, but after
D76527, they weren't being.

Reviewers: efriedma, john.brawn

Reviewed By: efriedma, john.brawn

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76760
2020-03-26 09:07:02 +00:00
David Green
0b35e38c5b [ARM] Sink splats to vector float instructions
Some MVE floating point instructions have gpr register variants that take
the scalar gpr value and splat them to all lanes. In order to accept
them in loops, the shuffle_vector and insert need to be sunk down into
the loop, next to the instruction so that ISel can see the whole
pattern.

This does that sinking for FAdd, FSub, FMul and FCmp. The patterns for
mul are slightly more constrained as there are no fms variants taking
register arguments.

Differential Revision: https://reviews.llvm.org/D76023
2020-03-26 09:02:18 +00:00
Craig Topper
e608f0b2f9 [X86] Update more intrinsic tests to prepare to extend D60940 to scalar fp.
I want to extend D60940 to scalar FP which will prevent forming
masked instructions if the arithmetic op has another use. To
prepare for that, this patch updates tests to avoid repeating
the operation multiple times with different masking.
2020-03-25 23:03:20 -07:00
Fangrui Song
5a9a269ca9 [InstCombine] Fix a code-sinking bug after D73832/f1a9efabcb9b
- UserParent = PN->getIncomingBlock(*I->use_begin());
+ UserParent = PN->getIncomingBlock(*SingleUse);

The first use of I may be droppable (llvm.assume).

When compiling llvm/lib/IR/AutoUpgrade.cpp with a bootstrapped clang
with ThinLTO with minimized bitcode files, I see such a case in
the function _ZN4llvm20UpgradeIntrinsicCallEPNS_8CallInstEPNS_8FunctionE

  clang -c -fthinlto-index=AutoUpgrade.o.thinlto.bc AutoUpgrade.bc -O3

Unfortunately it is really difficult to get a minimized reproduce.
2020-03-25 22:50:53 -07:00