1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

208285 Commits

Author SHA1 Message Date
Hsiangkai Wang
027811bee8 [RISCV] Separate masked and unmasked definitions for pseudo instructions.
Differential Revision: https://reviews.llvm.org/D93012
2020-12-11 14:02:56 +08:00
Kazu Hirata
39748752bb [MemorySSA] Remove unused declaration optimizeUses (NFC)
The declaration was introduced on Aug 2, 2016 in commit
c43aa5a5b62b21c1d38cd3d2ece7d0d5124d5180 without a corresponding
definition.

Note that we do have a definition for
MmeorySSA::OptimizeUses::optimizeUses but not for
MmeorySSA::optimizeUses.
2020-12-10 20:54:37 -08:00
Kazu Hirata
fbbf5e09c7 [Support] Use is_contained (NFC) 2020-12-10 20:40:37 -08:00
Craig Topper
81034448a4 [RISCV] Use tail agnostic policy for vsetvli instruction emitted in the custom inserter
The compiler is making no effort to preserve upper elements. To do so would require another source operand tied with the destination and a different intrinsic interface to give control of this source to the programmer.

This patch changes the tail policy to agnostic so that the CPU doesn't need to make an effort to preserve them.

This is consistent with the RVV intrinsic spec here https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#configuration-setting

Differential Revision: https://reviews.llvm.org/D93080
2020-12-10 19:48:03 -08:00
Alexandre Ganea
6c43b14f86 Re-land: [lit] Support running tests on Windows without GnuWin32
Historically, we have told contributors that GnuWin32 is a pre-requisite
because our tests depend on utilities such as sed, grep, diff, and more.
However, Git on Windows includes versions of these utilities in its
installation. Furthermore, GnuWin32 has not been updated in many years.
For these reasons, it makes sense to have the ability to run llvm tests
in a way that is both:

a) Easier on the user (less stuff to install)
b) More up-to-date (The verions that ship with git are at least as
   new, if not newer, than the versions in GnuWin32.
We add support for this here by attempting to detect where Git is
installed using the Windows registry, confirming the existence of
several common Unix tools, and then adding this location to lit's PATH
environment.

Differential Revision: https://reviews.llvm.org/D84380
2020-12-10 21:41:54 -05:00
LLVM GN Syncbot
b446710430 [gn build] Port 705a4c149d8 2020-12-11 01:40:59 +00:00
Hongtao Yu
85e4f6f241 [CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections.  The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication.  A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2:                                # %bb2
.pseudoprobe    837061429793323041 3 0
.pseudoprobe    6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe    837061429793323041 4 0
popq    %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878
2020-12-10 17:29:28 -08:00
Derek Schuff
d4257740dd [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
This is a reland of rG4564553b8d8a with a fix to the lit pipeline in
llvm/test/MC/WebAssembly/comdat.ll
2020-12-10 16:43:59 -08:00
Jonas Paulsson
c55f1c4d7e Revert "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing."
Temporarily reverted.

This reverts commit ea475c77ff9eab1de7d44684c8fb453b39f70081.
2020-12-10 18:05:51 -06:00
LLVM GN Syncbot
73613a3596 [gn build] Port 7ead5f5aa38 2020-12-10 23:59:49 +00:00
Derek Schuff
7296ac9f66 Revert "[WebAssembly] Support COMDAT sections in assembly syntax"
This reverts commit 4564553b8d8ab81dc21431a35275581cb42329c8.
It broke several buildbots.
2020-12-10 15:55:33 -08:00
Mitch Phillips
7c847657fe Revert "[CSSPGO] Pseudo probe encoding and emission."
This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.

Reason: Broke the ASan buildbots:
  http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10 15:53:39 -08:00
Mitch Phillips
437d145e05 Revert "[NFC] Fix a gcc build break by using an explict constructor."
This reverts commit 248b279cf04d9e439a1e426ffd24f2dfa93d02f8.

Reason: Dependency of patch that broke the ASan buildbots:
  http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10 15:53:38 -08:00
Mitch Phillips
2553270a20 Revert "[NFC] Fix a gcc build break by not using an initializer."
This reverts commit 1dc0a8521f616af5897327e4c03098f9312e9c59.

Reason: Dependency of patch that broke the ASan buildbots:
  http://lab.llvm.org:8011/#/builders/5/builds/2269
2020-12-10 15:53:38 -08:00
Xinhao Yuan
38310b229d [llvm-cov][gcov] Optimize the cycle counting algorithm by skipping zero count cycles
This change is similar to http://gcc.gnu.org/PR90380

This reduces the complexity from exponential to polynomial of the arcs.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93036
2020-12-10 15:22:29 -08:00
Derek Schuff
2c92440f99 [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
2020-12-10 14:46:24 -08:00
Scott Linder
789653fed0 [SmallVector] Copy new docs into Doxygen comment
Copy the `ProgrammersManual.rst` changes from D92522 to the Doxygen
comment for `SmallVector`, to hopefully encourage new uses migrating to
the no-explicit-`N` form.

Differential Revision: https://reviews.llvm.org/D93069
2020-12-10 22:20:37 +00:00
Craig Topper
df0fb7b768 [RISCV] Simplify vector instruction handling in RISCVMCInstLower.cpp.
Use RegisterClass::contains instead of going through getMinimalPhysRegClass
and hasSuperClassEq.

Remove the special case for NoRegister. It's identical to the
handling for any other regsiter that isn't VRM2/M4/M8.
2020-12-10 13:40:00 -08:00
Nico Weber
23fb61a411 [gn build] fix up arm64 builtin sources a bit
The fp_mode.c removal is done by filter_builtin_sources in the cmake build.
2020-12-10 16:22:48 -05:00
Nico Weber
a0359d324b [gn build] only build iOS builtins with full Xcode
Commandline tools doesn't include the iOS SDK.
2020-12-10 16:22:31 -05:00
Nico Weber
c23af21e45 [gn build] add a missing dependency 2020-12-10 16:22:26 -05:00
Jonas Paulsson
3646ae19a8 [SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing.
The loop-based probing done for stack clash protection altered R1D which
corrupted the backchain value to be stored after the probing was done.

By using R0D instead for the loop exit value, R1D is not modified.

Review: Ulrich Weigand.

Differential Revision: https://reviews.llvm.org/D92803
2020-12-10 15:06:18 -06:00
Zequan Wu
94ff08c70f [PGO] Enable preinline and cleanup when optimize for size
Differential Revision: https://reviews.llvm.org/D91673
2020-12-10 12:29:17 -08:00
Amara Emerson
dbf44feca3 [AArch64] Don't try to compress jump tables if there are any inline asm instructions.
Inline asm can contain constructs like .bytes which may have arbitrary size.
In some cases, this causes us to miscalculate the size of blocks and therefore
offsets, causing us to incorrectly compress a JT.

To be safe, just bail out of the whole thing if we find any inline asm.

Fixes PR48255

Differential Revision: https://reviews.llvm.org/D92865
2020-12-10 12:20:02 -08:00
Sam Elliott
5beb5b74f3 [RISCV][NFC] Fix Sext/Zext Tests
These were missed in a rebase of https://reviews.llvm.org/D92793
2020-12-10 20:10:29 +00:00
Hongtao Yu
d5104055ba [NFC] Fix a gcc build break by not using an initializer.
Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
2020-12-10 11:54:41 -08:00
Arthur Eubanks
093409b8fe [NPM] Support -fmerge-functions
I tried to put it in the same place in the pipeline as the legacy PM.

Fixes PR48399.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D93002
2020-12-10 11:45:08 -08:00
Sam Elliott
83dc68d3af [RISCV] Add (Proposed) Assembler Extend Pseudo-Instructions
There is an in-progress proposal for the following pseudo-instructions
in the assembler, to complement the existing `sext.w` rv64i instruction:
- sext.b
- sext.h
- zext.b
- zext.h
- zext.w

The `.b` and `.h` variants are available with rv32i and rv64i, and `zext.w` is
only available with `rv64i`.

These are implemented primarily as pseudo-instructions, as these instructions
expand to multiple real instructions. In the case of `zext.b`, this expands to a
single rv32/64i instruction, so it is implemented with an InstAlias (like
`sext.w` is on rv64i).

The proposal is available here: https://github.com/riscv/riscv-asm-manual/pull/61

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D92793
2020-12-10 19:25:51 +00:00
Alexey Bader
bd81c828b4 [Doc] Update branch name in Phabricator documentation
master -> main

Differential Revision: https://reviews.llvm.org/D93020
2020-12-10 22:25:04 +03:00
Hongtao Yu
9f85e62b52 [NFC] Fix a gcc build break by using an explict constructor. 2020-12-10 11:21:40 -08:00
LLVM GN Syncbot
8a1d1a744b [gn build] Port ea6641085d0 2020-12-10 19:09:35 +00:00
Sanjay Patel
6cbb454575 [InstCombine] avoid crash sinking to unreachable block
The test is reduced from the example in D82005.

Similar to 94f6d365e, the test here would assert in
the DomTree when we tried to convert a select to a
phi with an unreachable block operand.

We may want to add some kind of guard code in DomTree
itself to avoid this sort of problem.
2020-12-10 13:10:26 -05:00
Sanjay Patel
65d2c9f776 [VectorCombine] improve readability; NFC
If we are going to allow adjusting the pointer for GEPs,
rearranging the code a bit will make it easier to follow.
2020-12-10 13:10:26 -05:00
LLVM GN Syncbot
a0276f09f2 [gn build] Port b035513c06d 2020-12-10 17:56:12 +00:00
Hongtao Yu
91873af129 [CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s

Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections.  The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. 

**ELF object emission**

The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.

Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication.  A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.

The format of `.pseudo_probe_desc` section looks like:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```

For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```

To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.

**Assembling**

Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.

A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.

A example assembly looks like:

```
foo2: # @foo2
# %bb.0: # %bb0
pushq %rax
testl %edi, %edi
.pseudoprobe 837061429793323041 1 0 0
je .LBB1_1
# %bb.2: # %bb2
.pseudoprobe 837061429793323041 6 2 0
callq foo
.pseudoprobe 837061429793323041 3 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
.LBB1_1: # %bb1
.pseudoprobe 837061429793323041 5 1 0
callq *%rsi
.pseudoprobe 837061429793323041 2 0 0
.pseudoprobe 837061429793323041 4 0 0
popq %rax
retq
# -- End function
.section .pseudo_probe_desc,"",@progbits
.quad 6699318081062747564
.quad 72617220756
.byte 3
.ascii "foo"
.quad 837061429793323041
.quad 281547593931412
.byte 4
.ascii "foo2"
```

With inlining turned on, the assembly may look different around %bb2 with an inlined probe:

```
# %bb.2:                                # %bb2
.pseudoprobe    837061429793323041 3 0
.pseudoprobe    6699318081062747564 1 0 @ 837061429793323041:6
.pseudoprobe    837061429793323041 4 0
popq    %rax
retq
```

**Disassembling**

We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.

An example disassembly looks like:

```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91878
2020-12-10 09:50:08 -08:00
Arthur Eubanks
efbcfa65ec [test] Fix scev-expander-preserve-lcssa.ll under NPM
The NPM runs loop passes over loops in forward program order, rather
than the legacy loop PM's reverse program order. This seems to produce
better results as shown here.

I verified that changing the loop order to reverse program order results
in the same IR with the NPM.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D92817
2020-12-10 09:46:08 -08:00
Craig Topper
55c03d9d7b [RISCV][LegalizeDAG] Expand SETO and SETUO comparisons. Teach LegalizeDAG to expand SETUO expansion when UNE isn't legal.
If SETUNE isn't legal, UO can use the NOT of the SETO expansion.

Removes some complex isel patterns. Most of the test changes are
from using XORI instead of SEQZ.

Differential Revision: https://reviews.llvm.org/D92008
2020-12-10 09:15:52 -08:00
Florian Hahn
11dfe26f5c [CallBase] Add hasRetAttr version that takes StringRef.
This makes it slightly easier to deal with custom attributes and
CallBase already provides hasFnAttr versions that support both AttrKind
and StringRef arguments in a similar fashion.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92567
2020-12-10 17:00:16 +00:00
Irina Dobrescu
1f92697f11 [flang]Add Parser Support for Allocate Directive
Differential Revision: https://reviews.llvm.org/D89562
2020-12-10 16:21:19 +00:00
clementval
a38956acf7 Revert "[openmp] Remove clause from OMPKinds.def and use OMP.td info"
This reverts commit a7b2847216b4f7a84ef75461fd47a5adfbb63e27.

failing buildbot on warnings
2020-12-10 10:34:59 -05:00
Nuno Lopes
3db4186d3d AA: make AliasAnalysis.h compatible with C++20 (NFC)
can't mix arithmetic with different enums
2020-12-10 15:32:11 +00:00
Nico Weber
eeec860f46 [gn build] fix build after a7b2847216b4f7
Ports 6e42a417bacb since it's now needed, and undo an accidental
deletion from d69762c404ded while here (this part is not needed to fix
the build, it's just in the vicinity).
2020-12-10 10:28:48 -05:00
Valentin Clement
643cb4d428 [openmp] Remove clause from OMPKinds.def and use OMP.td info
Remove the OpenMP clause information from the OMPKinds.def file and use the
information from the new OMP.td file. There is now a single source of truth for the
directives and clauses.

To avoid generate lots of specific small code from tablegen, the macros previously
used in OMPKinds.def are generated almost as identical. This can be polished and
possibly removed in a further patch.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92955
2020-12-10 10:19:09 -05:00
Krzysztof Parzyszek
eb2eae25ad [Hexagon] Fix gcc6 compilation issue 2020-12-10 08:17:07 -06:00
Kerry McLaughlin
1ca5a57655 [SVE][CodeGen] Extend index of masked gathers
This patch changes performMSCATTERCombine to also promote the indices of
masked gathers where the element type is i8 or i16, and adds various tests
for gathers with illegal types.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91433
2020-12-10 13:54:45 +00:00
Haojian Wu
9fb48763b2 Fix a -Wunused-variable warning in release build. 2020-12-10 14:52:45 +01:00
Kazushi (Jam) Marukawa
e3d737a117 [VE] Add vector reduce intrinsic instructions
Add vrmax, vrmin, vfrmax, vfrmin, vrand, vror, and vrxor intrinsic
instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92941
2020-12-10 22:21:17 +09:00
Sjoerd Meijer
1a124afc04 [AArch64] Cortex-R82: remove crypto
Remove target features crypto for Cortex-R82, because it doesn't have any, and
add LSE which was missing while we are at it.
This also removes crypto from the v8-R architecture description because that
aligns better with GCC and so far none of the R-cores have implemented crypto,
so is probably a more sensible default.

Differential Revision: https://reviews.llvm.org/D91994
2020-12-10 12:54:51 +00:00
David Green
67f2592469 [ARM][RegAlloc] Add t2LoopEndDec
We currently have problems with the way that low overhead loops are
specified, with LR being spilled between the t2LoopDec and the t2LoopEnd
forcing the entire loop to be reverted late in the backend. As they will
eventually become a single instruction, this patch introduces a
t2LoopEndDec which is the combination of the two, combined before
registry allocation to make sure this does not fail.

Unfortunately this instruction is a terminator that produces a value
(and also branches - it only produces the value around the branching
edge). So this needs some adjustment to phi elimination and the register
allocator to make sure that we do not spill this LR def around the loop
(needing to put a spill after the terminator). We treat the loop very
carefully, making sure that there is nothing else like calls that would
break it's ability to use LR. For that, this adds a
isUnspillableTerminator to opt in the new behaviour.

There is a chance that this could cause problems, and so I have added an
escape option incase. But I have not seen any problems in the testing
that I've tried, and not reverting Low overhead loops is important for
our performance. If this does work then we can hopefully do the same for
t2WhileLoopStart and t2DoLoopStart instructions.

This patch also contains the code needed to convert or revert the
t2LoopEndDec in the backend (which just needs a subs; bne) and the code
pre-ra to create them.

Differential Revision: https://reviews.llvm.org/D91358
2020-12-10 12:14:23 +00:00
Martin Storsjö
102c71a022 [llvm-rc] Handle driveless absolute windows paths when loading external files
When llvm-rc loads an external file, it looks for it relative to
a number of include directories and the current working directory.
If the path is considered absolute, llvm-rc tries to open the
filename as such, and doesn't try to open it relative to other
paths.

On Windows, a path name like "\dir\file" isn't considered absolute
as it lacks the drive name, but by appending it on top of the search
dirs, it's not found.

LLVM's sys::path::append just appends such a path (same with a properly
absolute posix path) after the paths it's supposed to be relative to.

This fix doesn't handle the case if the resource script and the
external file are on a different drive than the current working
directory; to fix that, we'd have to make LLVM's sys::path::append
handle appending fully absolute and partially absolute paths (ones
lacking a drive prefix but containing a root directory), or switch
to C++17's std::filesystem.

Differential Revision: https://reviews.llvm.org/D92558
2020-12-10 14:11:06 +02:00