mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

179982 Commits

Author SHA1 Message Date
Nico Weber
5ce0272690 gn build: Merge r362913
llvm-svn: 362932
2019-06-10 12:49:02 +00:00
George Rimar
e2b65f9692 [yaml2obj/obj2yaml] - Make RawContentSection::Content and RawContentSection::Size optional
This is a follow-up for D62809.

Content and Size fields should be optional, as was discussed in the comments
of D62809. With that, we can describe specific string table and
symbol table sections in a more correct way and also show appropriate errors.

The patch adds lots of test cases where the behavior is described in details.

Differential revision: https://reviews.llvm.org/D62957

llvm-svn: 362931
2019-06-10 12:43:18 +00:00
George Rimar
a46ae2d5c4 [yaml2obj] - Do not assert when .dynsym is specified explicitly, but .dynstr is not present.
We have code in buildSectionIndex() that adds implicit sections:

// Add special sections after input sections, if necessary.
for (StringRef Name : implicitSectionNames())
  if (SN2I.addName(Name, SecNo)) {
    // Account for this section, since it wasn't in the Doc
    ++SecNo;
    DotShStrtab.add(Name);
  }

The problem arises when .dynsym is specified explicitly and no
DynamicSymbols are used. In that case, we do not add .dynstr implicitly
and will assert later when we try to set the Link for .dynsym.

In this case, the reasonable behavior seems to be to allow the Link field
to be zero, and this is what this patch does.

Differential revision: https://reviews.llvm.org/D63001

llvm-svn: 362929
2019-06-10 11:38:06 +00:00
David Green
8d09c211fb [ARM] Enable Unroll UpperBound
This option allows loops with small max trip counts to be fully unrolled. This
can help with code like the remainder loops from manually unrolled loops, such
as those that appear in the CMSIS-DSP library. Previously we would apparently
runtime-unroll them with the default unroll count (4).
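
As a rough illustration (my own sketch, not code from the patch), such a
remainder loop might look like:

/* Tail left over from a manually 4x-unrolled loop; its trip count is at most 3,
   so with the upper-bound option it can be fully unrolled instead of being
   runtime-unrolled with the default count of 4. */
void scale_tail(float *dst, const float *src, float k, unsigned n) {
    for (unsigned i = n & ~3u; i < n; ++i)
        dst[i] = src[i] * k;
}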

Differential Revision: https://reviews.llvm.org/D63064

llvm-svn: 362928
2019-06-10 10:22:14 +00:00
Simon Pilgrim
a6a4630e62 Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 362927
2019-06-10 10:13:32 +00:00
George Rimar
6ff51a50e2 [yaml2obj] - Remove helper methods that are probably excessive. NFC.
These methods are each used only once; one of them is not used at all.

Differential revision: https://reviews.llvm.org/D63002

llvm-svn: 362925
2019-06-10 09:57:29 +00:00
Nikola Prica
cbbb59739c [DebugInfo] More strict debug range for stack variables
A variable's stack location can stretch longer than it should. If a
variable is placed on the stack in some nested basic block, its range
can be calculated to extend up to the next occurrence of the variable's
DBG_VALUE, or up to the end of the function, thus covering basic
blocks that should not be included in the variable’s location range.
This happens because the DbgEntityHistoryCalculator ends register
locations at the end of a basic block only if the variable’s location
register has been changed throughout the function, which is not the
case for the register used to reference stack objects.

This patch also tries to produce a single value location if the location
list builder managed to merge all the locations into one.

Reviewers: aprantl, dstenb, jmorse

Reviewed By: aprantl, dstenb, jmorse

Subscribers: djtodoro, ivanbaev, asowda

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D61600

llvm-svn: 362923
2019-06-10 08:41:06 +00:00
QingShan Zhang
9d1f0bd735 [DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores
This opportunity was found in SPEC 2017 557.xz_r, where it is used by the SHA encrypt/decrypt routines. See sha-2/sha512.c:

typedef unsigned long long u64; /* 64-bit type; sha512.c has its own definition */

static void store64(u64 x, unsigned char* y)
{
    for(int i = 0; i != 8; ++i)
        y[i] = (x >> ((7-i) * 8)) & 255;
}

static u64 load64(const unsigned char* y)
{
    u64 res = 0;
    for(int i = 0; i != 8; ++i)
        res |= (u64)(y[i]) << ((7-i) * 8);
    return res;
}
The load64 has been implemented by https://reviews.llvm.org/D26149
This patch is trying to implement the store pattern.

Match a pattern where a wide type scalar value is stored by several narrow
stores. Fold it into a single store or a BSWAP and a store if the target
supports it.

Assuming little endian target:
i8 *p = ...
i32 val = ...
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
p[3] = (val >> 24) & 0xFF;

=>
*((i32)p) = val;

i8 *p = ...
i32 val = ...
p[0] = (val >> 24) & 0xFF;
p[1] = (val >> 16) & 0xFF;
p[2] = (val >> 8) & 0xFF;
p[3] = (val >> 0) & 0xFF;

=>
*((i32)p) = BSWAP(val);

Differential Revision: https://reviews.llvm.org/D62897

llvm-svn: 362921
2019-06-10 05:40:21 +00:00
Craig Topper
e08d80b306 [X86] When promoting i16 compare with immediate to i32, try to use sign_extend for eq/ne if the input is truncated from a type with enough sign bits.
Summary:
Our default behavior is to use sign_extend for signed comparisons and zero_extend for everything else. But for equality we have the freedom to use either extension. If we can prove the input has been truncated from something with enough sign bits, we can use sign_extend instead and let DAG combine optimize it out. A similar rule is used by type legalization in LegalizeIntegerTypes.

This gets rid of the movzx in PR42189. The immediate will still take 4 bytes instead of the 2 bytes plus 0x66 prefix that a cmp di, 32767 would get, but it avoids a length-changing prefix.
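
A hedged sketch (not taken from PR42189) of the kind of source pattern affected:

/* The i16 equality compare below is promoted to an i32 compare on x86. If the
   input is known to be truncated from a value with enough sign bits, widening
   it with sign_extend is equally correct for eq/ne, and DAG combine can then
   remove the extra movzx. */
int is_max(short x) {
    return x == 32767;
}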

Reviewers: RKSimon, spatel, xbolva00

Reviewed By: xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63032

llvm-svn: 362920
2019-06-10 04:50:12 +00:00
Craig Topper
4d52e38c48 [X86] Disable f32->f64 extload when sse2 is enabled
Summary:
We can only use the memory form of cvtss2sd under optsize due to a partial register update. So previously we were emitting 2 instructions for extload when optimizing for speed. Also, due to a late optimization in PreprocessISelDAG, we had to handle (fpextend (loadf32)) under optsize.

This patch forces extload to expand so that it will always be in the (fpextend (loadf32)) form during isel. And when optimizing for speed we can just let each of those pieces select an instruction independently.
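
For illustration only (my own sketch, not a test from the patch), the source
pattern in question is a float-to-double extending load:

/* Compiles to (fpextend (loadf32)). Keeping that form through isel lets the
   load and the conversion be selected independently when optimizing for speed. */
double widen(const float *p) {
    return (double)*p;
}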

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62710

llvm-svn: 362919
2019-06-10 04:37:16 +00:00
Vivek Pandya
9465b38a99 Do not derive the norecurse attribute if a function does not have an exact definition.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=41336

Reviewers: jdoerfert
Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D63045

llvm-svn: 362918
2019-06-10 04:16:04 +00:00
Kai Luo
5dee963b26 [NFC] Test if commit access granted.
llvm-svn: 362917
2019-06-10 03:20:33 +00:00
Nico Weber
4ab91c35ba Make test not write to source directory
llvm-svn: 362916
2019-06-10 01:47:04 +00:00
Craig Topper
4936805386 [X86] Use EVEX instructions for f128 FAND/FOR/FXOR when avx512vl is enabled.
llvm-svn: 362915
2019-06-10 01:18:55 +00:00
Craig Topper
66c40dd3b6 [X86] Convert f32/f64 FANDN/FAND/FOR/FXOR to vector logic ops and scalar_to_vector/extract_vector_elts to reduce isel patterns.
Previously we did the equivalent operation in isel patterns with
COPY_TO_REGCLASS operations to transition. By inserting
scalar_to_vectors and extract_vector_elts before isel we can
allow each piece to be selected individually and accomplish the
same final result.

Ideally we'd use vector operations earlier in lowering/combine,
but that looks to be more difficult.
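
As a hedged sketch of where these scalar FP logic nodes typically come from
(assumed examples, not code from the patch):

#include <math.h>

double neg(double x) { return -x; }       /* typically an FXOR with the sign-bit mask */
double mag(double x) { return fabs(x); }  /* typically an FAND clearing the sign bit  */
double cpy(double x, double y) { return copysign(x, y); } /* FAND/FANDN/FOR combination */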

The scalar-fp-to-i64.ll changes are because we have a pattern for
using movlpd for store+extract_vector_elt, while an f64 store
uses movsd. The encoding sizes are the same.

llvm-svn: 362914
2019-06-10 00:41:07 +00:00
Nico Weber
911a1ae0e3 Revert r361953 "[SVE][IR] Scalable Vector IR Type"
This reverts commit f4fc01f8dd3a5dfd2060d1ad0df6b90e8351ddf7.
It caused a 3-4x slowdown when doing thinlto links, PR42210.

llvm-svn: 362913
2019-06-09 19:27:50 +00:00
David Bolvansky
4974d01151 [TargetLowering] Simplify (ctpop x) == 1
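
As a rough illustration (my own sketch, not from the patch), the simplified form
is the usual power-of-two check:

#include <stdbool.h>

/* Before: relies on a population count. */
bool is_pow2_popcount(unsigned x) {
    return __builtin_popcount(x) == 1;
}

/* Conceptually after the simplification: exactly one bit is set iff x is
   non-zero and clearing its lowest set bit gives zero. */
bool is_pow2_bittrick(unsigned x) {
    return x != 0 && (x & (x - 1)) == 0;
}
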
Reviewers: craig.topper, spatel, RKSimon, bkramer

Reviewed By: spatel

Subscribers: javed.absar, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63004

llvm-svn: 362912
2019-06-09 18:18:57 +00:00
Roman Lebedev
eacd04f650 [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles
A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.
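
For reference, a hedged sketch (my own, not from the patch) of the masked-compare
shape these folds target:

/* 'x & (~0u >> y)' keeps only the low bits of x; comparing the result back
   against x asks whether x fits in those low bits. The unsigned predicates are
   fine as-is; the signed sgt/sle variants needed the extra non-zero precondition
   restored by this fix. */
int fits_in_low_bits(unsigned x, unsigned y) {
    unsigned mask = ~0u >> y;
    return (x & mask) == x;
}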

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362911
2019-06-09 16:30:42 +00:00
Roman Lebedev
08f625dc49 [NFC][InstCombine] Revisit canonicalize-constant-low-bit-mask-and-icmp-s* tests in preparation for PR42198.
The `icmp sgt`/`icmp sle` variants are miscompiles, too:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL
A precondition 'x != 0' was forgotten by me.

While ensuring test coverage for a `-1` mask, also add test coverage
for a `0` mask. Mask `0` is allowed for all the folds, while
mask `-1` is allowed for all the folds with an unsigned `icmp` pred.
Constant mask `0` is missed, though.

https://bugs.llvm.org/show_bug.cgi?id=42198

llvm-svn: 362910
2019-06-09 16:30:14 +00:00
Sanjay Patel
836775cb4b [InstCombine] change canonicalization to fabs() to use FMF on fneg
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.
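
A hedged sketch (assumed example, not from the patch) of the source pattern being
canonicalized:

#include <math.h>

/* With nsz/nnan fast-math, the select between x and its negation can be
   canonicalized to fabs; the FMF is now taken from the fneg (fsub) rather than
   from the fcmp. Conceptually, this becomes 'return fabs(x);'. */
double my_abs(double x) {
    return x < 0.0 ? -x : x;
}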

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

llvm-svn: 362909
2019-06-09 16:22:01 +00:00
David Bolvansky
99fec64adb [NFC] Adjust test for D63004
llvm-svn: 362908
2019-06-09 16:15:08 +00:00
David Bolvansky
e506fe7b8b [NFC] Added test from PR19758
llvm-svn: 362907
2019-06-09 15:12:46 +00:00
David Bolvansky
72adb13a04 [NFC] Added test from PR42084 for D63058
llvm-svn: 362906
2019-06-09 14:56:46 +00:00
Nikita Popov
6c17587199 [InstCombine] Add tests for usub.sat(x,y)+y etc; NFC
For PR42178.

llvm-svn: 362905
2019-06-09 14:39:47 +00:00
Sanjay Patel
b22b368718 [InstSimplify] reduce code duplication for fcmp folds; NFC
llvm-svn: 362904
2019-06-09 13:58:46 +00:00
Sanjay Patel
1361930fcb [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.
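
As an assumed illustration of the never-NaN reasoning (not one of the actual
tests):

/* uitofp cannot produce a NaN, so an ordered comparison involving d can be
   simplified without needing any fast-math flags on the fcmp itself. */
int ordered_self_cmp(unsigned u) {
    double d = (double)u;  /* uitofp */
    return d == d;         /* fcmp oeq d, d -> always true here, since d is never NaN */
}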

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel
132a7dcd28 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
Opposite predicate for rL362742 / rL362879 / D62979

llvm-svn: 362902
2019-06-09 13:30:14 +00:00
Anton Afanasyev
f14d26089e [MIR] Add simple PRE pass to MachineCSE
This is the second part of the commit fixing PR38917 (hoisting
partially redundant machine instructions). Most of the PRE (partial
redundancy elimination) and CSE work is done on LLVM IR, but some
redundancy arises during DAG legalization, and Machine CSE is not enough
to deal with it. This simple PRE implementation works a little
intricately: it runs before CSE, looking for partial redundancy
and transforming it into full redundancy, anticipating that the next
CSE step will eliminate this created redundancy. If CSE doesn't
eliminate it, the created instruction will remain dead and be eliminated
later by the Remove Dead Machine Instructions pass.
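
In source terms (a hedged sketch, not MIR from the patch), partial redundancy
looks like this:

/* 'a * b' is computed on only one path before the join, so the computation at
   the final use is partially redundant. PRE places a copy on the other path (or
   in a common dominator), turning the later computation into a full redundancy
   that the following CSE step can remove. */
int pre_example(int a, int b, int cond) {
    int t = 0;
    if (cond)
        t = a * b;
    return t + a * b;
}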

The third part of the commit is supposed to refactor MachineCSE,
to make it clearer and to merge MachinePRE with MachineCSE,
so one need not rely on the later Remove Dead pass to clear instructions
not eliminated by CSE.

First step: https://reviews.llvm.org/D54839

Fixes llvm.org/PR38917

This is fixed recommit of r361356 after PowerPC64 multistage build failure.

llvm-svn: 362901
2019-06-09 12:15:47 +00:00
Ayke van Laethem
172706e7f2 [CaptureTracking] Don't let comparisons against null escape inbounds pointers
Pointers that are in-bounds (either through dereferenceable_or_null or
through a getelementptr inbounds) cannot be captured with a comparison
against null. There is no way to construct a pointer that is still in
bounds but also NULL.

This helps safe languages that insert null checks before load/store
instructions. Without this patch, almost all pointers would be
considered captured even for simple loads. With this patch, an icmp with
null will not be seen as escaping as long as certain conditions are met.
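
A minimal sketch, assuming 'p' carries the relevant in-bounds information
(dereferenceable_or_null or a getelementptr inbounds) in the IR:

/* The inserted safety check compares p against null before the store. With this
   patch, that icmp no longer causes p to be treated as captured, since an
   in-bounds pointer cannot also be null. */
void checked_store(int *p, int v) {
    if (p != 0)
        *p = v;
}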

There was a lot of discussion about this patch. See the Phabricator
thread for details.

Differential Revision: https://reviews.llvm.org/D60047

llvm-svn: 362900
2019-06-09 10:20:33 +00:00
Ayke van Laethem
7d6b8741a2 [bindings/go] Add wrappers for atomic operations.
This patch adds Go bindings for atomic operations in LLVM.

Differential Revision: https://reviews.llvm.org/D61034

llvm-svn: 362899
2019-06-09 10:06:35 +00:00
Jatin Bhateja
5a376274f5 [X86] NFCI: Update comments for EVEX to VEX translation.
Reviewers: llvm-commits, jbhateja

Reviewed By: jbhateja

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63055

llvm-svn: 362898
2019-06-09 09:59:26 +00:00
Simon Pilgrim
7d2803248c Use for-range loop. NFCI.
llvm-svn: 362897
2019-06-09 09:07:30 +00:00
Amara Emerson
d1dff45e07 [AArch64][GlobalISel] Select immediate forms of cmp instructions.
A simple re-use of the immediate operand matcher and renderer functions.

rdar://43795178

llvm-svn: 362896
2019-06-09 07:31:25 +00:00
Craig Topper
d284452f0a [X86] Remove (store (f32 (extractelt (v4f32))) isel patterns, which are redundant.
We emit a MOVSSmr and a COPY_TO_REGCLASS, but that's what we would get from
selecting the store and extractelt independently.

llvm-svn: 362895
2019-06-09 03:21:33 +00:00
Craig Topper
a4b4a052d7 [X86] Mutate scalar fceil/ffloor/ftrunc/fnearbyint/frint into X86ISD::RNDSCALE during PreProcessIselDAG to cut down on number of isel patterns.
Similar was done for vectors in r362535. Removes about 1200 bytes from the isel table.

llvm-svn: 362894
2019-06-08 23:53:31 +00:00
Ayke van Laethem
4390ba5b81 [bindings/go] Add bindings to LLVMGet?CmpPredicate
Add bindings so that predicates on comparisons (icmp/fcmp) can be
inspected from IR.

Note: I considered adding Value.ICmpPredicate() etc. instead but
Value.IntPredicate() seemed easier to read and matches the name of the
returned type.

(This change was also pushed two commits ago but accidentally had the
wrong title and description.)

Revision: https://reviews.llvm.org/D53884
llvm-svn: 362893
2019-06-08 22:21:37 +00:00
Ayke van Laethem
31cbc86342 Revert "[bindings/go] Add Go bindings for CalledValue"
This reverts commit f675a60ca7a93f22e22dd4209504a9846dd04630.
The commit had the wrong title/description. Sorry about the mess!

llvm-svn: 362892
2019-06-08 22:17:51 +00:00
Ayke van Laethem
bdf7e2e7bd [bindings/go] Add Go bindings for CalledValue
This is very useful for inspecting generated IR; there appears to be no
other way to get the called function from a CallInst.

Differential Revision: https://reviews.llvm.org/D52972

llvm-svn: 362891
2019-06-08 22:15:38 +00:00
Ayke van Laethem
06a69245a1 [bindings/go] Add Go bindings for CalledValue
This is very useful for inspecting generated IR; there appears to be no
other way to get the called function from a CallInst.

Revision: https://reviews.llvm.org/D52972
llvm-svn: 362890
2019-06-08 22:08:52 +00:00
Ayke van Laethem
a1af64264e [bindings/go] Add EraseFromParent
After using ReplaceAllUsesWith on an instruction, it may be necessary to
erase it explicitly, even though it is now dead.

llvm-svn: 362889
2019-06-08 22:00:19 +00:00
Ayke van Laethem
fa896f12c9 [NFC] Test commit
Add a newline, which is missing according to go fmt.

llvm-svn: 362888
2019-06-08 21:42:00 +00:00
Roman Lebedev
801d3ed1c5 [X86][Codegen] Add missed pattern that may be a lea+neg
llvm-svn: 362886
2019-06-08 19:38:14 +00:00
Simon Pilgrim
bc9b63be1d [DAGCombine] visitAND - merge (zext_inreg ((s)extload x)) -> (zextload x) combines. NFCI.
Same codegen; they only differ by the oneuse limit for the sextload case.

llvm-svn: 362880
2019-06-08 17:02:00 +00:00
Sanjay Patel
e61275cc46 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case as a follow-up, assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
Sylvestre Ledru
168855d61c fix a typo unavaliable=>unavailable
llvm-svn: 362878
2019-06-08 15:07:55 +00:00
David Bolvansky
43d7a48a61 [NFC] Added tests for D63038
llvm-svn: 362875
2019-06-08 12:07:59 +00:00
David Green
09c3fe91e2 [ARM] Adjust isLegalT1AddressImmediate for non-legal types
Types such as float and i64 do not have legal loads in Thumb1, but will still
be loaded with an LDR (or potentially multiple LDRs). As such we can treat the
cost of addressing mode calculations the same as an i32 and get some optimisation
benefits.

Differential Revision: https://reviews.llvm.org/D62968

llvm-svn: 362874
2019-06-08 10:32:53 +00:00
David Green
9b21f6c6fa [ARM] Add MVE addressing to isLegalT2AddressImmediate
Now with MVE being added, we can add the vector addressing mode costs for it.
These are generally imm7 multiplied by the size of the type being loaded /
stored.

Differential Revision: https://reviews.llvm.org/D62967

llvm-svn: 362873
2019-06-08 10:18:23 +00:00
David Green
98901f37bd [ARM] Add fp16 addressing to isLegalT2AddressImmediate
The fp16 version of VLDR takes an imm8 multiplied by 2. This updates the costs
to account for that, and adds extra testing. It is dependent upon hasFPRegs16
as this is what the load/store instructions require.

Differential Revision: https://reviews.llvm.org/D62966

llvm-svn: 362872
2019-06-08 10:09:02 +00:00
David Green
c7795989ed [ARM] Add extra gep costmodel tests for MVE and half float. NFC
llvm-svn: 362871
2019-06-08 09:58:05 +00:00