1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

165060 Commits

Author SHA1 Message Date
Zachary Turner
2f6a8ddfe8 [FileSystem] Split up the OpenFlags enumeration.
This breaks the OpenFlags enumeration into two separate
enumerations: OpenFlags and CreationDisposition.  The first
controls the behavior of the API depending on whether or not
the target file already exists, and is not a flags-based
enum.  The second controls more flags-like values.

This yields a more easy to understand API, while also allowing
flags to be passed to the openForRead api, where most of the
values didn't make sense before.  This also makes the apis more
testable as it becomes easy to enumerate all the configurations
which make sense, so I've added many new tests to exercise all
the different values.

llvm-svn: 334221
2018-06-07 19:58:58 +00:00
Matt Arsenault
f65764ffa1 DAG: Avoid bitcast/ext/build_vector combine
This avoids regressions in a future AMDGPU change
to make v4i16/v4f16 legal. For these types, build_vector
is implemented as bitcasted operations on v2i32. This
combine was creating v4i16s out of what would have been
already been a v2i32 build_vector, creating a mess
of nodes that never get cleaned up.

I'm not sure this is the right condition to check.
I initially tried just checking for the legality of the
new build_vector. This works for my case, but breaks dozens
of x86 tests. A Mips test seems to show some improvement
or at least a neutral change. I don't want to think
about how long it would take to analyze the set of
different x86 vector operations impacted.

Test included in future commit.

llvm-svn: 334218
2018-06-07 19:42:27 +00:00
Alexander Shaposhnikov
15f4a16453 [llvm-objcopy] Remove unused field from Object
The class Object contains std::shared_ptr<MemoryBuffer> OwnedData
which is not used anywhere. Besides avoiding two stage initialization 
the motivation to remove it comes from the plan to add (currently missing) support 
for static libraries.
NFC.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47855

llvm-svn: 334217
2018-06-07 19:41:42 +00:00
Sanjay Patel
bc783fddef [TargetLibraryInfo] add mappings from LLVM sin/cos intrinsics to SVML calls
These weren't included in D19544 - probably just an oversight.
D40044 made it more likely that we'll have LLVM math intrinsics rather 
than libcalls, so this bug was more easily exposed.
As the tests/code show, we already have the complete mappings for pow/exp/log.

I don't have any experience with SVML, so I don't know if anything else is 
missing. It's also not clear to me that we should be doing this transform in 
IR rather than DAG/isel, but that's a separate issue.

Differential Revision: https://reviews.llvm.org/D47610

llvm-svn: 334211
2018-06-07 18:21:24 +00:00
Daniil Fukalov
debbd7365f [LSR] Check yet more intrinsic pointer operands
the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

llvm-svn: 334209
2018-06-07 17:30:58 +00:00
David Carlier
d2c88a98fb [docs] add various sanitisers support for FreeBSD/OpenBSD
since couple of months, supports had been enabled for FreeBSD and OpenBSD.

Reviewers: thakis, spatel, dim

Reviewed By: dim

Differential Revision: https://reviews.llvm.org/D47322

llvm-svn: 334207
2018-06-07 16:33:48 +00:00
Roman Lebedev
0b5f034322 [NFC][InstSimplify] Add more tests for shl nuw C, %x -> C fold.
Follow-up for rL334200.
For these, KnownBits will be needed.

llvm-svn: 334206
2018-06-07 16:18:26 +00:00
Simon Pilgrim
60cc93fa0e [X86][SSE] Updated comment - combineVectorSignBitsTruncation handles PACKSS and PACKUS. NFCI.
llvm-svn: 334204
2018-06-07 16:08:40 +00:00
Alex Bradbury
d56655b31e [RISCV] AsmParser support for the li pseudo instruction
The implementation follows the MIPS backend and expands the pseudo instruction 
directly during asm parsing. As the result, only real MC instructions are 
emitted to the MCStreamer. The actual expansion to real instructions is 
similar to the expansion performed by the GNU Assembler.

This patch supersedes D41949.

Differential Revision: https://reviews.llvm.org/D46118
Patch by Mario Werner.

llvm-svn: 334203
2018-06-07 15:35:47 +00:00
Alex Bradbury
c08915b4a2 [AVR] Fix build after r334078
r334078 added MCSubtargetInfo to fixupNeedsRelaxation and applyFixup. This 
patch makes the necessary adjustment for the AVR target.

llvm-svn: 334202
2018-06-07 15:29:09 +00:00
Simon Pilgrim
1ec8abafb9 [X86][SSE] Simplify combineVectorTruncationWithPACKUS. NFCI.
Move code only used by combineVectorTruncationWithPACKUS out of combineVectorTruncation.

llvm-svn: 334201
2018-06-07 14:53:32 +00:00
Roman Lebedev
baf51c46e0 [NFC][InstSimplify] Add tests for shl nuw C, %x -> C fold.
%r = shl nuw i8 C, %x

As per langref: If the nuw keyword is present, then the shift produces
                a poison value if it shifts out any non-zero bits.
Thus, if the sign bit is set on C, then %x can only be 0,
which means that %r can only be C.

https://rise4fun.com/Alive/WMk
Was mentioned in D47428 review.

llvm-svn: 334200
2018-06-07 14:18:38 +00:00
Sanjay Patel
fea6e9223d [x86] add tests for backwards propagate mask bug (PR37060, PR37667); NFC
llvm-svn: 334199
2018-06-07 14:11:18 +00:00
Guillaume Chatelet
6542e7eeea [llvm-exegesis] Make BenchmarkRunner handle multiple configurations.
Summary: BenchmarkRunner subclasses can now create many configurations - although this patch still generates one.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47877

llvm-svn: 334197
2018-06-07 14:00:29 +00:00
Paul Semel
125097cf42 [llvm-objdump] Add -R option
This option prints dynamic relocation entries of the given file

Differential Revision: https://reviews.llvm.org/D47493

llvm-svn: 334196
2018-06-07 13:30:55 +00:00
Hiroshi Inoue
0c9ae6e9df [PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector
BitPermutationSelector sets Repl32 flag for bit groups which can be (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies lower 32 bits into upper 32 bits on 64-bit PowerPC before rotation.
However, enforcing 32-bit instruction sometimes results in redundant generated code.
For example, the following simple code is compiled into rotldi + rlwimi while it can be compiled into only rldimi instruction if Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF).

uint64_t func(uint64_t a, uint64_t b) {
	return (a & 0xFFFFFFFF) | (b << 32) ;
}

To avoid such problem, this patch checks the potential benefit of Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from Repl32 flag on this group.

Differential Revision: https://reviews.llvm.org/D47867

llvm-svn: 334195
2018-06-07 13:21:14 +00:00
Petar Jovanovic
dca3ba34fb [Mips] Silencing warnings in instruction info (NFC)
isORCopyInst and isReadOrWriteToDSPReg functions were producing warning
that some statements my fall through.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D47876

llvm-svn: 334194
2018-06-07 13:06:06 +00:00
Simon Pilgrim
2daa6b3f4d [X86][SSE] Simplify combineVectorTruncationWithPACKSS to reduce code duplication
Simplify combineVectorTruncationWithPACKSS to just a SIGN_EXTEND_INREG followed by using the existing truncateVectorWithPACK instead of duplicating code.

llvm-svn: 334193
2018-06-07 13:01:42 +00:00
Hiroshi Inoue
0926bea084 [PowerPC] fix trivial typos in comment, NFC
llvm-svn: 334191
2018-06-07 12:49:12 +00:00
Matt Arsenault
6290f5cc50 AMDGPU: Fix not including v2f64 in SReg_128
Fixes assertion with calls returning v2f64.

llvm-svn: 334189
2018-06-07 12:16:31 +00:00
Simon Pilgrim
d656899c27 [X86][SSE] Add extra trunc(shl) test cases
The existing trunc_shl_17_v8i16_v8i32 test case should (but doesn't) fold to zero, I've added 2 new test cases:
 - trunc_shl_16_v8i16_v8i32 which folds to zero (this is actually testing the target faux shuffle combine)
 - trunc_shl_15_v8i16_v8i32 which should perform the full shl + truncate

llvm-svn: 334188
2018-06-07 11:22:52 +00:00
Florian Hahn
fff954c47e [Mem2Reg] Avoid replacing load with itself in promoteSingleBlockAlloca.
We do the same thing in rewriteSingleStoreAlloca.

Fixes PR37632.

Reviewers: chandlerc, davide, efriedma

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47825

llvm-svn: 334187
2018-06-07 11:09:05 +00:00
Matt Arsenault
ce21e1b000 AMDGPU: Use scalar operations for f16 fabs/fneg patterns
Fixes unnecessary differences between subtargets.

llvm-svn: 334184
2018-06-07 10:15:20 +00:00
Simon Pilgrim
336e0b1623 [X86] Regenerate rotate tests
Add 32-bit tests to show missed SHLD/SHRD cases

llvm-svn: 334183
2018-06-07 10:13:09 +00:00
Paul Semel
e9cbc4c61c [llvm-strip] Expose --strip-unneeded option
Differential Revision: https://reviews.llvm.org/D47818

llvm-svn: 334182
2018-06-07 10:05:25 +00:00
Matt Arsenault
1944ea0dcb AMDGPU: Try a lot harder to emit scalar loads
This has two main components. First, widen
widen short constant loads in DAG when they have
the correct alignment. This is already done a bit in
AMDGPUCodeGenPrepare, since that has access to
DivergenceAnalysis. This can't help kernarg loads
created in the DAG. Start to use DAG divergence analysis
to help this case.

The second part is to avoid kernel argument lowering
breaking the alignment of short vector elements because
calling convention lowering wants to split everything
into legal register types.

When loading a split type, load the nearest 4-byte aligned
segment and shift to get the desired bits. This extra
load of the earlier argument piece ends up merging,
and the bit extract hopefully folds out.

There are a number of improvements and regressions with
this, but I think as-is this is a better compromise between
several of the worst parts of SelectionDAG.

Particularly when i16 is legal, this produces worse code
for i8 and i16 element vector kernel arguments. This is
partially due to the very weak load merging the DAG does.
It only looks for fairly specific combines between pairs
of loads which no longer appear. In particular this
causes v4i16 loads to be split into 2 components when
previously the two halves were merged.

Worse, because of the newly introduced shifts, there
is a lot more unnecessary vector packing and unpacking code
emitted. At least some of this is due to reporting
false for isTypeDesirableForOp for i16 as a workaround for
the lack of divergence information in the DAG. The cases
where this happens it doesn't actually matter, but the
relevant code in SimplifyDemandedBits doens't have the context
to know to ignore this.

The use of the  scalar cache is probably more important
than the mess of mostly scalar instructions doing this packing
and unpacking. Future work can fix this, possibly by making better
use of the new DAG divergence information for controlling promotion
decisions, or adding another version of shift + trunc + shift
combines that doesn't only know about the used types.

llvm-svn: 334180
2018-06-07 09:54:49 +00:00
Clement Courbet
20b90b2f9b [X86][NFC] Fix harmless typo in BtVer2 model.
See D46356 for context.

llvm-svn: 334178
2018-06-07 09:26:33 +00:00
Tomasz Krupa
2bc9b01da4 [X86] Block UndefRegUpdate
Summary: Prevent folding of operations with memory loads when one of the sources has undefined register update.

Reviewers: craig.topper

Subscribers: llvm-commits, mike.dvoretsky, ashlykov

Differential Revision: https://reviews.llvm.org/D47621

llvm-svn: 334175
2018-06-07 08:48:45 +00:00
Max Kazantsev
1315521a86 [NFC] Use variable instead of accessing pair many times
llvm-svn: 334173
2018-06-07 08:47:19 +00:00
Tomasz Krupa
311204a49f Test commit access.
Added a bunch of periods after comments.

llvm-svn: 334171
2018-06-07 08:20:28 +00:00
Guillaume Chatelet
4306e4f160 [llvm-exegesis] Add a Configuration object for Benchmark.
Summary: This is the first step to have the BenchmarkRunner create and measure many different configurations (different initial values for instance).

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47826

llvm-svn: 334169
2018-06-07 08:11:54 +00:00
Guillaume Chatelet
faceafb547 [llvm-exegesis] Improve error reporting.
Summary: BenchmarkResult IO functions now return an Error or Expected so caller can deal take proper action.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47868

llvm-svn: 334167
2018-06-07 07:51:16 +00:00
Guillaume Chatelet
7ada5fd5d2 [llvm-exegesis] Serializes instruction's operand in BenchmarkResult's key.
Summary: Follow up patch to https://reviews.llvm.org/D47764.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D47785

llvm-svn: 334165
2018-06-07 07:40:40 +00:00
Clement Courbet
9fb142b4db [X86][NFC] Fix harmless typos in BDW/ZnVer1 sched models.
See D46356 for context.

llvm-svn: 334164
2018-06-07 07:37:49 +00:00
Karl-Johan Karlsson
bee8ac1f16 [BranchFolding] Fix live-in's when hoisting code
Summary:
When the branch folder hoist code into a predecessor it adjust live-in's
in the blocks it hoist code from. However it fail to handle hoisted code
that contain a defed register that originally is live-in in the block
through a super register.

This is fixed by replacing the live-in handling code with calls to
utility functions in LivePhysRegs.

Reviewers: kparzysz, gberry, MatzeB, uweigand, aprantl

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47529

llvm-svn: 334163
2018-06-07 07:20:33 +00:00
Jonas Paulsson
6e843177d7 [SystemZ] Build Load And Test from scratch in convertToLoadAndTest.
This is needed to get CC operand in right place, as expected by the
SchedModel.

Review: Ulrich Weigand
https://reviews.llvm.org/D47820

llvm-svn: 334161
2018-06-07 05:59:07 +00:00
Michael Zolotukhin
2d2aa76e2d SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation.
The pass doesn't touch CFG in any way, only moves instructions between
blocks.

llvm-svn: 334150
2018-06-07 00:19:29 +00:00
Peter Collingbourne
a721654ba3 Add definition for ELF dynamic tag DT_SYMTAB_SHNDX.
DT_SYMTAB_SHNDX is defined in generic-abi:

http://www.sco.com/developers/gabi/latest/ch5.dynamic.html

Patch by Rahul Chaudhry!

Differential Revision: https://reviews.llvm.org/D47803

llvm-svn: 334149
2018-06-07 00:06:41 +00:00
Peter Collingbourne
47277d18ab llvm-readobj: fix printing number of relocations in Android packed format.
With '-elf-output-style=GNU -relocations', a header containing the number
of entries is printed before all the relocation entries in the section.
For Android packed format, we need to perform the unpacking first before
we can get the actual number of relocations in the section.

Patch by Rahul Chaudhry!

Differential Revision: https://reviews.llvm.org/D47800

llvm-svn: 334147
2018-06-07 00:02:07 +00:00
Stanislav Mekhanoshin
75cbfdbaf3 [AMDGPU] Improve reciprocal handling
When denormals are supported we are producing a full division for
1.0f / x. That still can be replaced by the faster version:

    bool c = fabs(x) > 0x1.0p+96f;
    float s = c ? 0x1.0p-32f : 1.0f;
    x *= s;
    return s * v_rcp_f32(x)

in case if requested accuracy is 2.5ulp or less. The same version
is used if denormals are not supported for non 1.0 numerators, where
just v_rcp_f32 is then used for 1.0 numerator.

The optimization of 1/x is extended to the case -1/x, which is the
same except for the resulting sign bit.

OpenCL conformance passed with both enabled and disabled denorms.

Differential Revision: https://reviews.llvm.org/D47805

llvm-svn: 334142
2018-06-06 22:22:32 +00:00
Teresa Johnson
5831cde8b1 [ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC)
With the upcoming patch to add summary parsing support, IsAnalysis would
be true in contexts where we are not performing module summary analysis.
Rename to the more specific and approprate HaveGVs, which is essentially
what this flag is indicating.

llvm-svn: 334140
2018-06-06 22:22:01 +00:00
Sanjay Patel
623814d2db [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)
The bug report:
https://bugs.llvm.org/show_bug.cgi?id=36036

...requests a DAG change for this, but an IR canonicalization
probably handles most cases. If we still want to match this
pattern in the backend, there's a proposal for that too:
D47831

Alive proofs including nsw/nuw cases that were first noted in:
D46988

https://rise4fun.com/Alive/Kmp

This patch is largely copied from the existing code that was
initially added with:
D40984
...but I didn't see much gain from trying to share code.

llvm-svn: 334137
2018-06-06 21:58:12 +00:00
Petr Hosek
2480179bdd [CMake] Pass additional CMake tools to external projects
This is needed when the external projects try to use other tools
besides just the compiler and the linker.

Differential Revision: https://reviews.llvm.org/D47833

llvm-svn: 334136
2018-06-06 21:43:37 +00:00
Sanjay Patel
635ef457d1 [InstCombine] add tests for another abs() pattern (PR36036); NFC
llvm-svn: 334133
2018-06-06 21:32:42 +00:00
Matt Arsenault
afc0bd916e AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16
Fixes terrible code on targets without f16 support. The
legalization creates a mess that is difficult to recover
from. Also should avoid randomly breaking these tests
multiple times in sequence in future commits.

Some regressions in cases where it happens to be better
to pull the source modifier after the conversion.

llvm-svn: 334132
2018-06-06 21:28:11 +00:00
Alexander Shaposhnikov
f951a2d41d [llvm-strip] Expose --discard-all option
Expose objcopy's --discard-all option in llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D47750

llvm-svn: 334131
2018-06-06 21:23:19 +00:00
Roman Lebedev
17ed890610 [InstCombine] PR37603: low bit mask canonicalization
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]].

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via `and`.

The typical C code looks like:
```
int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}
```

But there is a second, less readable variant:
```
int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}
```

Since we created such a mask, it is quite likely that we will use it in `and` next.
And then we may get rid of `not` op by folding into `andn`.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose `bzhi` recognition.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47428

llvm-svn: 334127
2018-06-06 19:38:27 +00:00
Roman Lebedev
2128ec3ad1 [InstCombine][NFC] PR37603: low bit mask canonicalization tests
Differential Revision: https://reviews.llvm.org/D47427

llvm-svn: 334126
2018-06-06 19:38:21 +00:00
Roman Lebedev
b3af34d70b [X86] Emit BZHI when mask is ~(-1 << nbits))
Summary:
In D47428, i propose to choose the `~(-(1 << nbits))` as the canonical form of low-bit-mask formation.
As it is seen from these tests, there is a reason for that.

AArch64 currently better handles `~(-(1 << nbits))`, but not the more traditional `(1 << nbits) - 1` (sic!).
The other way around for X86.
It would be much better to canonicalize.

This patch is completely monkey-typing.
I don't really understand how this works :)
I have based it on `// x & (-1 >> (32 - y))` pattern.

Also, when we only have `BMI`, i wonder if we could use `BEXTR` with `start=0` ?

Related links:
https://bugs.llvm.org/show_bug.cgi?id=36419
https://bugs.llvm.org/show_bug.cgi?id=37603
https://bugs.llvm.org/show_bug.cgi?id=37610
https://rise4fun.com/Alive/idM

Reviewers: craig.topper, spatel, RKSimon, javed.absar

Reviewed By: craig.topper

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D47453

llvm-svn: 334125
2018-06-06 19:38:16 +00:00
Roman Lebedev
485bbc3cba [NFC][X86][AArch64] Reorganize/cleanup BZHI test patterns
Summary:
In D47428, i propose to choose the `~(-(1 << nbits))` as the canonical form of low-bit-mask formation.
As it is seen from these tests, there is a reason for that.

AArch64 currently better handles `~(-(1 << nbits))`, but not the more traditional `(1 << nbits) - 1` (sic!).
The other way around for X86.
It would be much better to canonicalize.

It would seem that there is too much tests, but this is most of all the auto-generated possible variants
of C code that one would expect for BZHI to be formed, and then manually cleaned up a bit.
So this should be pretty representable, which somewhat good coverage...

Related links:
https://bugs.llvm.org/show_bug.cgi?id=36419
https://bugs.llvm.org/show_bug.cgi?id=37603
https://bugs.llvm.org/show_bug.cgi?id=37610
https://rise4fun.com/Alive/idM

Reviewers: javed.absar, craig.topper, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: kristof.beyls, llvm-commits, RKSimon, craig.topper, spatel

Differential Revision: https://reviews.llvm.org/D47452

llvm-svn: 334124
2018-06-06 19:38:10 +00:00