1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00
Commit Graph

170241 Commits

Author SHA1 Message Date
Lang Hames
0301b4b6ef [ORC] Promote and rename private symbols inside the CompileOnDemand layer,
rather than require them to have been promoted before being passed in.

Dropping this precondition is better for layer composition (CompileOnDemandLayer
was the only one that placed pre-conditions on the modules that could be added).
It also means that the promoted private symbols do not show up in the target
JITDylib's symbol table. Instead, they are confined to the hidden implementation
dylib that contains the actual definitions.

For the 403.gcc testcase this cut down the public symbol table size from ~15,000
symbols to ~4000, substantially reducing symbol dependence tracking costs.

llvm-svn: 344078
2018-10-09 20:44:32 +00:00
Nemanja Ivanovic
73b04fc1f4 [PowerPC] Implement hasBitPreservingFPLogic for types that can be supported
This is the PPC-specific non-controversial part of
https://reviews.llvm.org/D44548 that simply enables this combine for PPC
since PPC has these instructions.
This commit will allow the target-independent portion to be truly target
independent.

llvm-svn: 344077
2018-10-09 20:35:15 +00:00
Craig Topper
28ed2f6496 [X86] When lowering unsigned v2i64 setcc without SSE42, flip the sign bits in the v2i64 type then bitcast to v4i32.
This may give slightly better opportunities for DAG combine to simplify with the operations before the setcc. It also matches the type the xors will eventually be promoted to anyway so it saves a legalization step.

Almost all of the test changes are because our constant pool entry is now v2i64 instead of v4i32 on 64-bit targets. On 32-bit targets getConstant should be emitting a v4i32 build_vector and a v4i32->v2i64 bitcast.

There are a couple test cases where it appears we now combine a bitwise not with one of these xors which caused a new constant vector to be generated. This prevented a constant pool entry from being shared. But if that's an issue we're concerned about, it seems we need to address it another way that just relying a bitcast to hide it.

This came about from experiments I've been trying with pushing the promotion of and/or/xor to vXi64 later than LegalizeVectorOps where it is today. We run LegalizeVectorOps in a bottom up order. So the and/or/xor are promoted before their users are legalized. The bitcasts added for the promotion act as a barrier to computeKnownBits if we try to use it during vector legalization of a later operation. So by moving the promotion out we can hopefully get better results from computeKnownBits/computeNumSignBits like in LowerTruncate on AVX512. I've also looked at running LegalizeVectorOps in a top down order like LegalizeDAG, but thats showing some other issues.

llvm-svn: 344071
2018-10-09 19:05:50 +00:00
Sam Clegg
b517fd6012 [SLPVectorizer] Check that lowered type is floating point before calling isFabsFree
In the case of soft-fp (e.g. fp128 under wasm) the result of
getTypeLegalizationCost() can be an integer type even if the input is
floating point (See LegalizeTypeAction::TypeSoftenFloat).

Before calling isFabsFree() (which asserts if given a non-fp
type) we need to check that that result is fp.  This is safe since in
fabs is certainly not free in the soft-fp case.

Fixes PR39168

Differential Revision: https://reviews.llvm.org/D52899

llvm-svn: 344069
2018-10-09 18:41:17 +00:00
Wolfgang Pieb
9d2a915748 [DWARF] Make llvm-dwarfdump display the .debug_loc.dwo section. Fixes PR38991.
Reviewer: dblaikie

Differential Revision: https://reviews.llvm.org/D52444

llvm-svn: 344068
2018-10-09 18:38:55 +00:00
Sanjay Patel
f1f598dab2 [InstCombine] add tests for extract subvector shuffles; NFC
llvm-svn: 344067
2018-10-09 18:37:20 +00:00
Adrian Prantl
7d344e8dfe Add missing space
llvm-svn: 344064
2018-10-09 18:12:04 +00:00
Zachary Turner
7fd80a3c2e [PDB] Fix failure on big endian machines.
We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but
it needs to be an ArrayRef<support::ulittle32_t>.

We also change ArrayRef<> to FixedStreamArray<>.  Technically
an ArrayRef<> will work, but it can cause a copy in the underlying
implementation if the memory is not contiguous, and there's no
reason not to use a FixedStreamArray<>.

Thanks to nemanjai@ and thakis@ for helping me track this down
and confirm the fix.

llvm-svn: 344063
2018-10-09 17:58:51 +00:00
Craig Topper
6209430105 [X86] Autogenerate complete checks. NFC
llvm-svn: 344060
2018-10-09 17:52:07 +00:00
Sanjay Patel
117efab3e3 [AArch64][x86] add tests for bitcasted fnabs; NFC
Alternate target coverage for  D44548.

llvm-svn: 344059
2018-10-09 17:20:26 +00:00
Sanjay Patel
41e8411908 [InstCombine] make helper function 'static'; NFC
llvm-svn: 344056
2018-10-09 15:29:26 +00:00
Guillaume Chatelet
6fba352f41 Fix function case.
llvm-svn: 344051
2018-10-09 14:51:33 +00:00
Guillaume Chatelet
a0bccc42f3 [llvm-exegesis] Fix invalid return type and add a Dump function.
Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53020

llvm-svn: 344050
2018-10-09 14:51:29 +00:00
Sanjay Patel
0b30701f8e [x86] use demanded bits to simplify masked store codegen
As noted in D52747, if we prefer IR to use trunc for bool vectors rather 
than and+icmp, we can expose codegen shortcomings as seen here with masked store.

Replace a hard-coded PCMPGT simplification with the more general demanded bits call
to improve things.

Differential Revision: https://reviews.llvm.org/D52964

llvm-svn: 344048
2018-10-09 14:04:14 +00:00
Simon Pilgrim
bc24c23da9 [SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to SimplifyDemandedBits
Fix for AVX1 masked load/store regression on D52964

llvm-svn: 344043
2018-10-09 13:13:35 +00:00
Simon Atanasyan
588f246b95 [mips] Fix FDE/CFI encoding in case of N32 ABI
For O32 and N32 ABI FDE/CFI encoding should be `DW_EH_PE_sdata4` and only
N64 ABI uses `DW_EH_PE_sdata8`. To cover all cases this patch check code
pointer size and setup a correct FDE/CFI encoding type.

Differential revision: https://reviews.llvm.org/D52876

llvm-svn: 344040
2018-10-09 11:29:51 +00:00
Simon Atanasyan
8ca3b3f070 [mips] Set pointer size to 4 bytes for N32 ABI
CodePointerSize and CalleeSaveStackSlotSize values are used in DWARF
generation. In case of MIPS it's incorrect to check for Triple::isMIPS64()
only this function returns true for N32 ABI too.

Now we do not have a method to recognize N32 if it's specified by a command
line option and is not a part of a target triple. So we check for
Triple::GNUABIN32 only. It's better than nothing.

Differential revision: https://reviews.llvm.org/D52874

llvm-svn: 344039
2018-10-09 11:29:45 +00:00
Nemanja Ivanovic
7b5b2c4cb9 Fix buildbot failures with the newly added test case (triple was missing).
llvm-svn: 344037
2018-10-09 11:17:47 +00:00
Nemanja Ivanovic
2d39ad0c20 [PowerPC] Remove self-copies in pre-emit peephole
There are occasionally instances where AADB rewrites registers in such a way
that a reg-reg copy becomes a self-copy. Such an instruction is obviously
redundant and can be removed. This patch does precisely that.

Note that this will not remove various nop's that we insert (which are
themselves just self-copies). The reason those are left alone is that all of
them have their own opcodes (that just encode to a self-copy).

What prompted this patch is the fact that these self-copies sometimes end up
using registers that make the instruction a priority-setting nop, thereby
having a significant effect on performance.

Differential revision: https://reviews.llvm.org/D52432

llvm-svn: 344036
2018-10-09 10:54:04 +00:00
Guillaume Chatelet
d422c6902e [llvm-exegesis] Fix wrong index type.
llvm-svn: 344032
2018-10-09 10:06:19 +00:00
Guillaume Chatelet
aada53dc8b [llvm-exegesis] Fix unused lambda capture.
llvm-svn: 344029
2018-10-09 09:33:29 +00:00
Guillaume Chatelet
c27f3c0720 [llvm-exegesis][NFC] Use accessors for Operand.
Summary:
This moves checking logic into the accessors and makes the structure smaller.
It will also help when/if Operand are generated from the TD files.

Subscribers: tschuett, courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D52982

llvm-svn: 344028
2018-10-09 08:59:10 +00:00
Aleksandr Urakov
0e9b1923fd [ADT] Force the alignment of the data field of IntervalMap
Summary:
This patch forces the alignment of the `data` field of `IntervalMap`.
It is because x86 MSVC doesn't apply automatically
(without `__declspec(align(...))`) alignments more than 4 bytes,
even if `alignof` has returned so. Consider the example:

https://godbolt.org/z/zIPa_G

Here `alignof` for both `S0` and `S1` returns `8`, but only `S1` is really
aligned on x86. The explanation of this behavior is here:

https://docs.microsoft.com/en-us/cpp/build/conflicts-with-the-x86-compiler

Reviewers: bkramer, stoklund, hans, rnk

Reviewed By: rnk

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52613

llvm-svn: 344027
2018-10-09 08:50:50 +00:00
Aleksandr Urakov
45d068d804 Revert "[ADT] Change the IntervalMap alignment assert for x86 MSVC"
This reverts commit 7f9eb168a9a8f5ff4fc931a00aec43e8706afecb.

llvm-svn: 344020
2018-10-09 07:44:17 +00:00
Simon Pilgrim
f9dbe90eb7 [X86][AVX1] Enable *_EXTEND_VECTOR_INREG lowering of 256-bit vectors
As discussed on D52964, this adds 256-bit *_EXTEND_VECTOR_INREG lowering support for AVX1 targets to help improve SimplifyDemandedBits handling.

Differential Revision: https://reviews.llvm.org/D52980

llvm-svn: 344019
2018-10-09 07:42:01 +00:00
Aleksandr Urakov
94bc0d9932 [ADT] Change the IntervalMap alignment assert for x86 MSVC
Summary:
This patch forces the alignment of the `data` field of `IntervalMap`.
It is because x86 MSVC doesn't apply automatically
(without `__declspec(align(...))`) alignments more than 4 bytes,
even if `alignof` has returned so. Consider the example:

https://godbolt.org/z/zIPa_G

Here `alignof` for both `S0` and `S1` returns `8`, but only `S1` is really
aligned on x86. The explanation of this behavior is here:

https://docs.microsoft.com/en-us/cpp/build/conflicts-with-the-x86-compiler

Reviewers: bkramer, stoklund, hans, rnk

Reviewed By: rnk

Subscribers: dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52613

llvm-svn: 344018
2018-10-09 07:33:09 +00:00
Chandler Carruth
0073f67e9e [CFG Printer] Add support for writing the dot files with a custom
prefix.

Use this to direct these files to a specific location in the test suite
so that we don't write files out to random directories (or fail if the
working directory isn't writable).

llvm-svn: 344014
2018-10-09 04:30:23 +00:00
George Burgess IV
09f6f1ef21 Make LocationSize a proper Optional type; NFC
This is the second in a series of changes intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The first change being r344012.

Since I was requested to do all of this with post-commit review, this is
about as small as I can make this patch.

This patch makes LocationSize into an actual type that wraps a uint64_t;
users are required to call getValue() in order to get the size now. If
the LocationSize has an Unknown size (e.g. if LocSize ==
MemoryLocation::UnknownSize), getValue() will assert.

This also adds DenseMap specializations for LocationInfo, which required
taking two more values from the set of values LocationInfo can
represent. Hence, heavy users of multi-exabyte arrays or structs may
observe slightly lower-quality code as a result of this change.

The intent is for getValue()s to be very close to a corresponding
hasValue() (which is often spelled `!= MemoryLocation::UnknownSize`).
Sadly, small diff context appears to crop that out sometimes, and the
last change in DSE does require a bit of nonlocal reasoning about
control-flow. :/

This also removes an assert, since it's now redundant with the assert in
getValue().

llvm-svn: 344013
2018-10-09 03:18:56 +00:00
George Burgess IV
4a912f9043 Use locals instead of struct fields; NFC
This is one of a series of changes intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context.

Since I was requested to do all of this with post-commit review, this is
about as small as I can make it (beyond committing changes to these few
files separately, but they're incredibly similar in spirit, so...)

On its own, this change doesn't make a great deal of sense. I plan on
having a follow-up Real Soon Now(TM) to make the bits here make more
sense. :)

In particular, the next change in this series is meant to make
LocationSize an actual type, which you have to call .getValue() on in
order to get at the uint64_t inside. Hence, this change refactors code
so that:
- we only need to call the soon-to-come getValue() once in most cases,
  and
- said call to getValue() happens very closely to a piece of code that
  checks if the LocationSize has a value (e.g. if it's != UnknownSize).

llvm-svn: 344012
2018-10-09 02:14:33 +00:00
David Blaikie
4b95f4ad7c llvm-link: Improve diagnostic for module-level metadata mismatch
This might produce hard to read/illegible diagnostics for especially
weird/non-trivial module metadata but integers are about all we are
using these days, so seems more useful than not.

Patch based on work by Kristina Brooks - thanks!

Differential Revision: https://reviews.llvm.org/D52952

llvm-svn: 344011
2018-10-09 01:17:27 +00:00
Matthias Braun
07dbfddbd4 ExpandPostRAPseudos: Fix alldefsAreDead() not removing operands
One case left around nonsensical operands for the KILL instruction
which the machine verifier checks for nowadays. While this should not
hurt in release builds we should fix the machine verifier errors anyway.

llvm-svn: 344008
2018-10-09 00:07:34 +00:00
Petar Jovanovic
e75e60eab8 [MIPS GlobalISel] Legalize i64 add
Custom legalize s64 G_ADD for MIPS32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52652

llvm-svn: 344007
2018-10-08 23:59:37 +00:00
Matthias Braun
f13d37d886 TwoAddressInstructionPass: Modernize/fix some comments; NFC
llvm-svn: 344006
2018-10-08 23:47:35 +00:00
Matthias Braun
ae541b0e31 PHIElimination: Remove wrong comment; NFC
The comment was contradicting the code. Looking at history the feature
was implemented a day after the comment was written without dropping the
comment.

llvm-svn: 344005
2018-10-08 23:47:35 +00:00
Matthias Braun
ebcc3ae71c MachineFunctionPrinterPass: Declare SlotIndexes as used if available; NFC
This makes print-machineinstrs print the slot indexes in more
situations. NFC for normal compilation.

llvm-svn: 344004
2018-10-08 23:47:34 +00:00
Zachary Turner
c6a8dfc2a2 Remove unused variable.
llvm-svn: 344002
2018-10-08 22:56:57 +00:00
Zachary Turner
a6d52887b1 [PDB] fix a bug in global stream name lookup.
When we're looking up a record in the last hash bucket chain, we
need to be careful with the end-offset calculation.

llvm-svn: 344001
2018-10-08 22:38:27 +00:00
Petar Jovanovic
28aefdcb34 [DebugInfo] Fix debug information label tests
Remove the space in the asm check so that the expression is more general
and can also capture MIPS labels which can be surrounded by braces, e.g.:

.4byte        ($tmp1)                 # DW_AT_low_pc

Also change optimization level to O0 because the DW_TAG_label does not
appear on MIPS when -O2 is used.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D52901

llvm-svn: 343999
2018-10-08 22:10:34 +00:00
Rong Xu
9632247152 [X86] Revert r343993 condition branches folding for three-way conditional codes
Some buildbots failed.

llvm-svn: 343998
2018-10-08 22:08:43 +00:00
Sanjay Patel
a1c4d72e29 [DAGCombiner] simplify code for fmul with constant fold; NFCI
llvm-svn: 343997
2018-10-08 21:17:20 +00:00
Craig Topper
807baf0131 [X86] Prefer isTypeLegal over checking isSimple in a DAG combine.
Simple types are a superset of what all in tree targets in LLVM could possibly have a legal type. This means the behavior of using isSimple to check for a supported type for X86 could change over time. For example, this could would change if a v256i1 type was added to MVT in the future.

llvm-svn: 343995
2018-10-08 20:02:59 +00:00
Sanjay Patel
fbf17791f8 [x86] add tests for phaddd/phaddw; NFC
More tests related to PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195

If we limit the horizontal codegen, it may require different
constraints for FP and integer.

llvm-svn: 343994
2018-10-08 19:48:18 +00:00
Rong Xu
cffa0071cd [X86] condition branches folding for three-way conditional codes
This patch implements a pass that optimizes condition branches on x86 by
taking advantage of the three-way conditional code generated by compare
instructions.

Currently, it tries to hoisting EQ and NE conditional branch to a dominant
conditional branch condition where the same EQ/NE conditional code is
computed. An example:
bb_0:
  cmp %0, 19
  jg bb_1
  jmp bb_2
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_4
bb_4:
  cmp %0, 20
  je bb_5
  jmp bb_6
Here we could combine the two compares in bb_0 and bb_4 and have the
following code:

bb_0:
  cmp %0, 20
  jg bb_1
  jl bb_2
  jmp bb_5
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_6

For the case of %0 == 20 (bb_5), we eliminate two jumps, and the control height
for bb_6 is also reduced. bb_4 is gone after the optimization.

This optimization is motivated by the branch pattern generated by the switch
lowering: we always have pivot-1 compare for the inner nodes and we do a pivot
compare again the leaf (like above pattern).

This pass currently is enabled on Intel's Sandybridge and later arches. Some
reviewers pointed out that on some arches (like AMD Jaguar), this pass may
increase branch density to the point where it hurts the performance of the
branch predictor.

Differential Revision: https://reviews.llvm.org/D46662

llvm-svn: 343993
2018-10-08 18:52:39 +00:00
Scott Linder
a1e537617c [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions
Emit a waterfall loop in the general case for a potentially-divergent Rsrc
operand. When practical, avoid this by using Addr64 instructions.

Recommits r341413 with changes to update the MachineDominatorTree when present.

Differential Revision: https://reviews.llvm.org/D51742

llvm-svn: 343992
2018-10-08 18:47:01 +00:00
Simon Pilgrim
4b773ffb52 [X86][AVX2] Enable ZERO_EXTEND_VECTOR_INREG lowering of 256-bit vectors
Some necessary yak shaving before lowering *_EXTEND_VECTOR_INREG 256-bit vectors on AVX1 targets as suggested by D52964.

Differential Revision: https://reviews.llvm.org/D52970

llvm-svn: 343991
2018-10-08 18:40:50 +00:00
Sanjay Patel
306bb76156 [x86] make horizontal binop matching clearer; NFCI
The instructions are complicated, so this code will
probably never be very obvious, but hopefully this
makes it better. 

As shown in PR39195:
https://bugs.llvm.org/show_bug.cgi?id=39195
...we need to improve the matching to not miss cases
where we're h-opping on 1 source vector, and that
should be a small patch after this rearranging.

llvm-svn: 343989
2018-10-08 18:08:02 +00:00
Robert Lougher
a83e090ac2 [TailCallElim] Enable marking of calls with byval as tails
In r339636 the alias analysis rules were changed with regards to tail calls
and byval arguments. Previously, tail calls were assumed not to alias
allocas from the current frame. This has been updated, to not assume this
for arguments with the byval attribute.

This patch aligns TailCallElim with the new rule. Tail marking can now be
more aggressive and mark more calls as tails, e.g.:

define void @test() {
  %f = alloca %struct.foo
  call void @bar(%struct.foo* byval %f)
  ret void
}

define void @test2(%struct.foo* byval %f) {
  call void @bar(%struct.foo* byval %f)
  ret void
}

define void @test3(%struct.foo* byval %f) {
  %agg.tmp = alloca %struct.foo
  %0 = bitcast %struct.foo* %agg.tmp to i8*
  %1 = bitcast %struct.foo* %f to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 40, i1 false)
  call void @bar(%struct.foo* byval %agg.tmp)
  ret void
}

The problematic case where a byval parameter is captured by a call is still
handled correctly, and will not be marked as a tail (see PR7272).

llvm-svn: 343986
2018-10-08 18:03:40 +00:00
Tom Stellard
638f15ba8d AMDGPU/GlobalISel: Select amdgcn.cvt.pkrtz to 64-bit instructions
Summary: The 32-bit variants do not exist on VI+.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52958

llvm-svn: 343985
2018-10-08 17:49:29 +00:00
Kristina Brooks
dd017bbf60 Fix incorrect Twine usage in CFGPrinter
CFGPrinter (-view-cfg, -dot-cfg) invokes an undefined behaviour (dangling
pointer to rvalue) on IR files with branch weights. This patch fixes the
problem caused by Twine initialization and string conversion split into
two statements.

This change fixes the bug 37019. A similar patch to this problem was
provided in the llvmlite project

Patch by mcopik (Marcin Copik).

Differential Revision: https://reviews.llvm.org/D52933

llvm-svn: 343984
2018-10-08 17:29:39 +00:00
Nicolai Haehnle
22f80d62d3 AMDGPU: Future-proof {raw,struct}.buffer.atomic intrinsics
Summary:
The ISA is really supposed to support 64-bit atomics as well,
so the data type should be an overload.

Mesa doesn't use these atomics yet, in fact I noticed this
issue while trying to use the atomics from Mesa.

Change-Id: I77f58317a085a0d3eb933cc7e99308c48a19f83e

Reviewers: tpr

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D52291

llvm-svn: 343978
2018-10-08 16:53:48 +00:00