1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

188088 Commits

Author SHA1 Message Date
Sander de Smalen
5074e446e7 [AArch64][SVE] Allocate locals that are scalable vectors.
This patch adds a target interface to set the StackID for a given type,
which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a
'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.

Reviewers: ostannard, efriedma, rengolin, cameron.mcinally

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70080
2019-11-13 09:45:24 +00:00
Simon Tatham
6aa69a8001 [ARM,MVE] Use VMOV.{S8,S16} for sign-extended extractelement.
MVE includes instructions that extract an 8- or 16-bit lane from a
vector and sign-extend it into the output 32-bit GPR. `ARMInstrMVE.td`
already included isel patterns to select those instructions in
response to the `ARMISD::VGETLANEs` selection-DAG node type. But
`ARMISD::VGETLANEs` was never actually generated, because the code
that creates it was conditioned on NEON only.

It's an easy fix to enable the same code for integer MVE, and now IR
that sign-extends the result of an extractelement (whether explicitly
or as part of the function call ABI) will use `vmov.s8` instead of
`vmov.u8` followed by `sxtb`.

Reviewers: SjoerdMeijer, dmgreen, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70132
2019-11-13 09:08:41 +00:00
joanlluch
a1334aac0e [TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4)
Summary:
Replaces
```
unsigned getShiftAmountThreshold(EVT VT)
```
by

```
bool shouldAvoidTransformToShift(EVT VT, unsigned amount)
```
thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not.

Updates the MSP430 target with a custom implementation.

This continues  D69116, D69120, D69326 and updates them, so all of them must be committed before this.

Existing tests apply, a few more have been added.

Reviewers: asl, spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70042
2019-11-13 09:23:08 +01:00
Craig Topper
e9ebc959cb [X86] Remove setOperationAction for FP_TO_SINT v8i16.
This is no longer needed after widening legalization as we
custom legalize v8i8 ourselves.

Added entries to the cost model, but bumped the cost slightly
to account for the truncate shuffle that wasn't costed before.
2019-11-12 22:45:52 -08:00
Francesco Petrogalli
d054b2dd05 [VFABI] Add LLVM internal mangling for vector functions.
Summary:
This patch adds a custom ISA for vector functions for internal use
in LLVM. The <isa> token is set to "_LLVM_", and it is not attached
to any specific instruction Vector ISA, or Vector Function ABI.

The ISA is used as a token for handling Vector Function ABI-style
vectorization for those vector functions that are not directly
associated to any existing Vector Function ABI (for example, some of
the vector functions exposed by TargetLibraryInfo). The demangling
function for this ISA in a Vector Function ABI context is set to be
the same as the common one shared between X86 and AArch64.

Reviewers: jdoerfert, sdesmalen, simoll

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70089
2019-11-13 03:26:39 +00:00
Matt Arsenault
ee4c001c6a AMDGPU: Extend add x, (ext setcc) combine to sub
This is the same as the add case, but inverts the operation type.

This avoids regressions in a future patch.
2019-11-13 07:13:58 +05:30
Matt Arsenault
bf9d9a1180 AMDGPU: Switch backend default max workgroup size to 1024
Previously this would default to 256, not the maximum supported size
of 1024. Using a maximum lower than the hardware maximum requires
language runtimes to enforce this limit for correctness, which no
language has correctly done. Switch the default to the conservatively
correct maximum, and force frontends to opt-in to the more optimal 256
default maximum.

I don't really understand why the changes in occupancy-levels.ll
increased the computed occupancy, which I expected to decrease. I'm
not sure if these tests should be forcing the old maximum.
2019-11-13 07:11:02 +05:30
Matt Arsenault
5cfd953988 AMDGPU Reduce reported maximum group size to 1024
While some targets allow encoding 2048, this was never tested or
supported.
2019-11-13 06:34:28 +05:30
Alina Sbirlea
a76fef4322 [GlobalsAA] Reenable test. 2019-11-12 16:53:28 -08:00
Alina Sbirlea
6f9d9bcfe8 Temporarily disable test. 2019-11-12 15:57:51 -08:00
Eric Christopher
ec6661f8aa Temporarily Revert "Reapply [LVI] Normalize pointer behavior" as it's broken python 3.6.
Reverting to figure out if it's a problem in python or the compiler for now.

This reverts commit 885a05f48a5d320946c89590b73a764e5884fe4f.
2019-11-12 15:51:51 -08:00
Craig Topper
335135422b [X86] Don't consider v64i1 as a legal type unless v64i8 is also a legal type.
This avoids some nasty issues with argument passing and lowering of
arbitrary v64i8 shuffles.
2019-11-12 14:56:02 -08:00
Craig Topper
90fb1e427b [X86] Only pass v64i8/v32i16 as v16i32 on non-avx512bw targets if the v16i32 type won't be split by prefer-vector-width=256
Otherwise just let the v64i8/v32i16 types be split to v32i8/v16i16.

In reality this shouldn't happen because it means we have a 512-bit
vector argument, but min-legal-vector-width says a value less than
512. But a 512-bit argument should have been factored into the
preferred vector width.
2019-11-12 14:56:01 -08:00
Yonghong Song
9cf279eaff [BPF] generate BTF_KIND_VARs for all non-static globals
Enable to generate BTF_KIND_VARs for non-static
default-section globals which is not allowed previously.
Modified the existing test case to accommodate the new change.

Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and
VAR_GLOBAL_EXTERNAL.

Differential Revision: https://reviews.llvm.org/D70145
2019-11-12 14:34:08 -08:00
Alina Sbirlea
28990514f5 [GlobalsAA] Restrict ModRef result if any internal method has its address taken.
Summary:
If there are any internal methods whose address was taken, conclude there is nothing known in relation of any other internal method and a global.

Reviewers: nlopes, sanjoy.google

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69690
2019-11-12 14:24:56 -08:00
Alina Sbirlea
bbaa6235bb [GVNHoist] Preserve AAResults.
Resolves PR38906, PR40898.
2019-11-12 14:10:04 -08:00
Evandro Menezes
4d92249e81 [AArch64] Update for Exynos
Fix the modeling for loads and stores using the register offset addresing mode.
2019-11-12 14:37:41 -06:00
Evandro Menezes
2acbd313a1 [AArch64] Fix addressing mode predicates
Fix predicates related to the register offset addressing mode.
2019-11-12 14:37:28 -06:00
Fangrui Song
61e55d5d11 [llvm-objcopy][COFF] Implement --redefine-sym and --redefine-syms
The parsing error tests in ELF/redefine-symbols.test are not specific to ELF.
Move them to redefine-symbols.test.
Add COFF/redefine-symbols.test for COFF specific tests.

Also fix the documentation regarding --redefine-syms: the old and new
names are separated by whitespace, not an equals sign.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D70036
2019-11-12 11:28:00 -08:00
Peter Collingbourne
9930bdcc8f ARM: Don't emit R_ARM_NONE relocations to compact unwinding decoders in .ARM.exidx on Android.
These relocations are specified by the ARM EHABI (section 6.3). As I understand
it, their purpose is to accommodate unwinder implementations that wish to
reduce code size by placing the implementations of the compact unwinding
decoders in a separate translation unit, and using extern weak symbols to
refer to them from the main unwinder implementation, so that they are only
linked when something in the binary needs them in order to unwind.

However, neither of the unwinders used on Android (libgcc, LLVM libunwind)
use this technique, and in fact emitting these relocations ends up being
counterproductive to code size because they cause a copy of the unwinder
to be statically linked into most binaries, regardless of whether it is
actually needed. Furthermore, these relocations create circular dependencies
(between libc and the unwinder) in cases where the unwinder is dynamically
linked and libc contains compact unwind info.

Therefore, deviate from the EHABI here and stop emitting these relocations
on Android.

Differential Revision: https://reviews.llvm.org/D70027
2019-11-12 10:52:59 -08:00
Michael Liao
db1604061b Fix build with shared libraries. NFC.
- Dependent components need linking directly.
2019-11-12 13:40:35 -05:00
Krzysztof Parzyszek
a7dcb8c305 [Hexagon] Update PS_aligna with max stack alignment once isel completes 2019-11-12 11:47:29 -06:00
Julian Lettner
95339f1d20 [lit] Better/earlier errors for empty runs
Fail early, when we discover no tests at all, or filter out all of them.

There is also `--allow-empty-runs` to disable test to allow workflows
like `LIT_FILTER=abc ninja check-all`.  Apparently `check-all` invokes
lit multiple times if certain projects are enabled, which would produce
unwanted "empty runs". Specify via `LIT_OPTS=--allow-empty-runs`.

There are 3 causes for empty runs:
1) No tests discovered.  This is always an error.  Fix test suite config
   or command line.
2) All tests filtered out.  This is an error by default, but can be
   suppressed via `--alow-empty-runs`.  Should prevent accidentally
   passing empty runs, but allow the workflow above.
3) The number of shards is greater than the number of tests.  Currently,
   this is never an error.  Personally, I think we should consider
   making this an error by default; if this happens, you are doing
   something wrong. I added a warning but did not change the behavior,
   since this warrants more discussion.

Reviewed By: atrick, jdenny

Differential Revision: https://reviews.llvm.org/D70105
2019-11-12 09:11:36 -08:00
Sanjay Patel
0d90bb492f [SLP] add test for miscompile with reduction (PR43948); NFC 2019-11-12 11:36:41 -05:00
Krzysztof Parzyszek
fe9991c322 [Hexagon] Fix vector spill expansion to use proper alignment
1. Add pseudos PS_vloadrv_ai and PS_vstorerv_ai: those are now used
   for single vector registers in loadRegFromStackSlot (and store...).
2. Remove pseudos PS_vloadrwu_ai and PS_vstorerwu_ai. The alignment is
   now checked when expanding spill pseudos (both in frame lowering
   and in expand-post-ra-pseudos), and a proper instruction is generated.
3. Update MachineMemOperands when dealigning vector spill slots.
4. Return vector predicate registers in getCallerSavedRegs.
2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek
573d28992d [Hexagon] Convert stack object offsets to int64, NFC
This will print [SP-56] instead of [SP+4294967240].
2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek
34d2e6a2af [Hexagon] Handle stack realignment in hexagon-vextract 2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek
0ce37ce8eb [Hexagon] Require PS_aligna whenever variable-sized objects are present 2019-11-12 09:43:21 -06:00
Jinsong Ji
477ae4e17b [PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests
Summary:
This is found during review of https://reviews.llvm.org/D67088.

CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106.
-allow-deprecated-dag-overlap was introduced to temporary accept old
behavior.

But it actually hide some broken tests, eg: `test/CodeGen/PowerPC/swaps-le-1.ll`
The codegen has changed, but the CHECK-DAG still PASS due to allowing `overlap`.

This patch remove the deprecated options, and fix the broken tests.

Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz

Reviewed By: shchenz

Subscribers: shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69733
2019-11-12 15:18:54 +00:00
Tom Weaver
30144fca09 [DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass.
Reviewed By: aprantl, vsk

Differential revision: https://reviews.llvm.org/D69943
2019-11-12 15:17:04 +00:00
Jinsong Ji
ca8d41d144 [PowerPC][NFC]Fix typo in desc for enable-ppc-prefetching 2019-11-12 14:46:57 +00:00
Florian Hahn
2d3ee56e42 [Examples] Add IRTransformations directory to examples.
This patch adds a new IRTransformations directory to llvm/examples/. This is
intended to serve as a new home for example transformations/analysis
code used by various tutorials.

If LLVM_BUILD_EXAMPLES is enabled, the ExamplesIRTransforms library is
linked into the opt binary and the example passes become available.

To start off with, it contains the CFG simplifications used in the IR
part of the 'Getting Started With LLVM: Basics' tutorial at the US LLVM
Developers Meeting 2019.

Reviewers: paquette, jfb, meikeb, lhames, kbarton

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D69416
2019-11-12 14:14:48 +00:00
Alex Denisov
ed10a34931 Mark llvm::ConstantExpr::getAsInstruction as const
Summary:
getAsInstruction is the only non-const member method.
It is impossible to enforce const-correctness because of it.

Reviewers: jmolloy, majnemer

Reviewed By: jmolloy

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70113
2019-11-12 14:24:12 +01:00
Florian Hahn
380e3e814f [AArch64ExpandPseudos] Preserve renamable state when expanding MOVi64 & co.
If the MOVi operand was renamable, the operands of the expanded
instructions are also renamable.

Reviewers: thegameg, samparker, zatrazz

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D70061
2019-11-12 11:29:04 +00:00
Diana Picus
003518d973 [InstCombine] Skip scalable vectors in combineLoadToOperationType
Don't try to canonicalize loads to scalable vector types to loads
of integers.

This removes one assertion when trying to use a TypeSize as a parameter
to DataLayout::isLegalInteger. It does not handle the second part of the
function (which looks at bitcasts).

This patch also contains a NFC fix for Load Analysis, where a variable
initialization that would cause the same assertion is moved closer to
its use. This allows us to run the new test for InstCombine without
having to teach LocationSize to play nicely with scalable vectors.

Differential Revision: https://reviews.llvm.org/D70075
2019-11-12 12:27:09 +01:00
Simon Pilgrim
f9dd23df58 [X86] Cleanup prefixes + regenerate for fp-intrinsics-fma.ll 2019-11-12 11:24:00 +00:00
Simon Pilgrim
0afb0b590c FileCheckPattern::FindRegexVarEnd - make helper function static. NFC
Fixes cppcheck warning.
2019-11-12 11:14:19 +00:00
Simon Pilgrim
4f01b8cba4 [X86] Add PR39464 addcarry/subborrow test cases
Additional coverage for D70079
2019-11-12 11:14:18 +00:00
Florian Hahn
5f133744e9 [LoopInterchange] Only skip PHIs with incoming values from the inner loop.
Currently we have limited support for outer loops with multiple basic
blocks after the inner loop exit. But the current checks for creating
PHIs for loop exit values only assumes the header and latches of the
outer loop. It is better to just skip incoming values defined in the
original inner loops. Those are handled earlier.

Reviewers: efriedma, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D70059
2019-11-12 10:30:51 +00:00
Pavel Labath
6350896386 DWARFDebugLoclists: add location list "interpretation" logic
Summary:
This patch extracts the logic for computing the "absolute" locations,
which was partially present in the debug_loclists dumper, completes it,
and moves it into a separate function. This makes it possible to later
reuse the same logic for uses other than dumping.

The dumper is changed to reuse the location list interpreter, and its
format is changed somewhat. In "verbose" mode it prints the "raw" value
of a location list, the interpreted location (if available) and the
expression itself. In non-verbose mode it prints only one of the
location forms: it prefers the interpreted form, but falls back to the
"raw" format if interpretation is not possible (for instance, because we
were not given a base address, or the resolution of indirect addresses
failed).

This patch also undos some of the changes made in D69672, namely the
part about making all functions static. The main reason for this is that
I learned that the original approach (dumping only fully resolved
locations) meant that it was impossible to rewrite one of the existing
tests. To make that possible (and make the "inline location" dump work
in more cases), I now reuse the same dumping mechanism as is used for
section-based dumping. As this required having more objects know about
the various location lists classes, it seemed like a good idea to create
an interface abstracting the difference between them.

Therefore, I now create a DWARFLocationTable class, which will serve as
a base class for the location list classes. DWARFDebugLoclists is made
to inherit from that. DWARFDebugLoc will follow.

Another positive effect of this change is that section-based dumping
code will not need to use templates (as originally) envisioned, and that
the argument lists of the dumping functions become shorter.

Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70081
2019-11-12 10:40:13 +01:00
David Zarzycki
a3f2e48ddd [X86] Add more add/sub carry tests
Preparation for: https://reviews.llvm.org/D70079

https://reviews.llvm.org/D70077
2019-11-12 11:36:59 +02:00
Daniil Suchkov
f01bc2a4a4 [NFC][InstCombine] Add tests that show a number of canonicalization opportunities
Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed-By: apilipenko

Tags: #llvm

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D68263
2019-11-12 15:43:29 +07:00
Tim Renouf
11ee47cb13 MCP: Fixed bug with dest overlapping copy source
In MachineCopyPropagation, when propagating the source of a copy into
the operand of a later instruction, bail if a destination overlaps
(partly defines) the copy source. If the instruction where the
substitution is happening is also a copy, allowing the propagation
confuses the tracking mechanism.

Differential Revision: https://reviews.llvm.org/D69953

Change-Id: Ic570754f878f2d91a4a50a9bdcf96fbaa240726d
2019-11-12 08:18:11 +00:00
Craig Topper
d453b74bfe [X86] Add fptosi test to fp-intrinsics.ll 2019-11-11 23:55:12 -08:00
Craig Topper
60037a5cd9 [X86] Update stale comment. NFC 2019-11-11 23:55:12 -08:00
Mikael Holmen
9ef711ef65 [VFABI] Remove unused variables in testcase, fix buildbot
E.g. the buildbot at

 http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/7259/steps/build-stage2-unified-tree/logs/stdio

failed with

/home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/Transforms/Utils/VFABIUtils.cpp:50:22: error: unused variable 'FnAttrs' [-Werror,-Wunused-variable]
  const AttributeSet FnAttrs = Attrs.getFnAttributes();
                     ^
1 error generated.
2019-11-12 08:28:12 +01:00
Georgii Rymar
6350ceb0f2 [llvm-readelf/llvm-readobj][test] - Convert elf-linker-options.ll to use YAML.
This converts elf-linker-options.ll to use yaml2obj instead of llc,
improves and cleanups it a bit.

This opens a road to add an additional tests for checking the broken cases.

Differential revision: https://reviews.llvm.org/D70004
2019-11-12 10:08:06 +03:00
Georgii Rymar
edf395d3e2 [yaml2obj/obj2yaml] - Add support for SHT_LLVM_LINKER_OPTIONS sections.
SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings.
This patch adds support for them.

Differential revision: https://reviews.llvm.org/D69895
2019-11-12 09:55:20 +03:00
Hideto Ueno
3e7b5160af [Attributor] Use must-be-executed-context in align deduction
Summary:
This patch introduces align attribute deduction for callsite argument, function argument, function returned and floating value based on must-be-executed-context.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69797
2019-11-12 06:41:19 +00:00
Nick Terrell
639a0c16d7 [Support] Optimize SHA1 implementation
* Add inline to the helper functions because gcc-9 won't inline all of
  them without the hint. I've avoided `__attribute__((always_inline))`
  because gcc and clang will inline without it, and improves
  compatibility.
* Replace the byte-by-byte copy in update() with endian::readbe32()
  since perf reports that 1/2 of the time is spent copying into the
  buffer before this patch.

When lld uses --build-id=sha1 it spends 30-45% of CPU in SHA1 depending on the binary (not wall-time since it is parallel). This patch speeds up SHA1 by a factor of 2 on clang-8 and 3 on gcc-6. This leads to a >10% improvement in overall linking time.

lld-speed-test benchmarks run on an Intel i9-9900k with Turbo disabled on CPU 0 compiled with clang-9. Stats recorded with `perf stat -r 5`. All inputs are using `--build-id=sha1`.

| Input | Before (seconds) | After (seconds) |
| --- | --- | --- |
| chrome | 2.14 | 1.82 (-15%) |
| chrome-icf | 2.56 | 2.29 (-10%) |
| clang | 0.65 | 0.53 (-18%) |
| clang-fsds | 0.69 | 0.58 (-16%) |
| clang-gdb-index | 21.71 | 19.3 (-11%) |
| gold | 0.42 | 0.34 (-19%) |
| gold-fsds | 0.431 | 0.355 (-17%) |
| linux-kernel | 0.625 | 0.575 (-8%) |
| llvm-as | 0.045 | 0.039 (-14%) |
| llvm-as-fsds | 0.035 | 0.039 (-11%) |
| mozilla | 11.3 | 9.8  (-13%) |
| mozilla-gc | 11.84 | 10.36 (-12%) |
| mozilla-O0 | 8.2 | 5.84 (-28%) |
| scylla | 5.59 | 4.52 (-19%) |

Reviewed By: ruiu, MaskRay

Differential Revision: https://reviews.llvm.org/D69295
2019-11-11 22:14:28 -08:00