1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
Commit Graph

184263 Commits

Author SHA1 Message Date
Nico Weber
bc0e7d950e gn build: Merge r370746
llvm-svn: 370749
2019-09-03 13:01:17 +00:00
David Green
b5fa754985 [ARM] Ignore Implicit CPSR regs when lowering from Machine to MC operands
The code here seems to date back to r134705, when tablegen lowering was first
being added. I don't believe that we need to include CPSR implicit operands on
the MCInst. This now works more like other backends (like AArch64), where all
implicit registers are skipped.

This allows the AliasInst for CSEL's to match correctly, as can be seen in the
test changes.

Differential revision: https://reviews.llvm.org/D66703

llvm-svn: 370745
2019-09-03 11:30:54 +00:00
Jonas Paulsson
e89d1ea35f [SystemZ] Add support for fentry.
SystemZAsmPrinter now properly emits function calls to __fentry__.

Review: Ulrich Weigand
llvm-svn: 370743
2019-09-03 11:21:12 +00:00
David Green
dbd4d84623 [ARM] Invert CSEL predicates if the opposite is a simpler constant to materialise
This moves ConstantMaterializationCost into ARMBaseInstrInfo so that it can
also be used in ISel Lowering, adding codesize values to the computed costs, to
be able to compare either approximate instruction counts or codesize costs.

It also adds a HasLowerConstantMaterializationCost, which compares the
ConstantMaterializationCost of two values, returning true if the first is
smaller either in instruction count/codesize, or falling back to the other in
the case that they are equal.

This is used in constant CSEL lowering to invert the predicate if the opposite
is easier to materialise.

Differential revision: https://reviews.llvm.org/D66701

llvm-svn: 370741
2019-09-03 11:06:24 +00:00
David Green
2cffd5437a [ARM] Generate 8.1-m CSINC, CSNEG and CSINV instructions.
Arm 8.1-M adds a number of related CSEL instructions, including CSINC, CSNEG and CSINV. These choose between two values given the content in CPSR and a condition, performing an increment, negation or inverse of the false value.

This adds some selection for them, either from constant values or patterns. It does not include CSEL directly, which is currently not always making code better. It is still useful, but we will have to check more carefully where it should and shouldn't be used.

Code by Ranjeet Singh and Simon Tatham, with some modifications from me.

Differential revision: https://reviews.llvm.org/D66483

llvm-svn: 370739
2019-09-03 10:53:07 +00:00
David Green
c8cdfa7976 [ARM] Add csel tests. NFC
llvm-svn: 370738
2019-09-03 10:32:46 +00:00
Simon Atanasyan
71d4f2d741 [mips] Switch to the .text section after emitting asm file preamble
Now the last `.section` directive in the MIPS asm file preamble
is the `.section .mdebug.abi`. If assembler code injected for example
by the LLVM `module asm` or the C ` __asm` directives do not contain
explicit switching to the `.text` section it goes to the `.mdebug.abi`
section. It might be unexpected to the user and in fact for example
breaks building some existing code like FreeBSD libc [1].

The patch forces switching to the `.text` section after emitting MIPS
assembler file preamble.

[1] https://bugs.llvm.org/show_bug.cgi?id=43119

Fix PR43119.

Differential Revision: https://reviews.llvm.org/D67014

llvm-svn: 370735
2019-09-03 10:24:07 +00:00
David Green
c978d9ace7 [ARM] Fix MVE ldst offset ranges
We were using isShiftedInt<7, Shift>(RHSC) to detect the ranges of offsets to
fold into MVE loads/stores. The instructions actually take a 7 bit unsigned
integer which is either added or subtracted. So something more like
isShiftedUInt<7, Shift>(abs(RHSC)).

Instead I've changes this to use the isScaledConstantInRange method, same as in
SelectT2AddrModeImm7Offset used by pre/post inc, which seemed to already be
getting this correct.

Differential revision: https://reviews.llvm.org/D66997

llvm-svn: 370731
2019-09-03 09:57:02 +00:00
Oliver Stannard
70b03012c1 [ARM][MVE] Decoding of VMSR doesn't diagnose some unpredictable encodings
Decoding of VMSR doesn't diagnose some unpredictable encodings, as the unpredictable bits are not correctly set.

Diff-reduce this instruction's internals WRT VMRS so I can see the differences better. Mostly this is s/src/Rt/g.

Fill in the "should-be-(0)" bits.

Designate the Unpredictable{} bits for both VMRS and VMSR.

Patch by Mark Murray!

Differential revision: https://reviews.llvm.org/D66938

llvm-svn: 370729
2019-09-03 09:55:30 +00:00
Oliver Stannard
486da6004f Bug fix on function epilog optimization (ARM backend)
To save a 'add sp,#val' instruction by adding registers to the final pop instruction,
the first register transferred by this pop instruction need to be found.
If the function to be optimized has a non-void return value, the operand list contains
r0 (implicit) which prevents the optimization to take place.
Therefore implicit register references should be skipped in the search loop,
because this registers are never popped from the stack.

Patch by Rainer Herbertz (rOptimizer)!

Differential revision: https://reviews.llvm.org/D66730

llvm-svn: 370728
2019-09-03 09:51:19 +00:00
David Green
13ea5d8f72 [ARM] More MVE load/store tests for offsets around the negative limit. NFC
llvm-svn: 370726
2019-09-03 09:42:16 +00:00
Bjorn Pettersson
e93d54f960 [LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit
Summary:
Fold-tail currently supports reduction last-vector-value live-out's,
but has yet to support last-scalar-value live-outs, including
non-header phi's. As it relies on AllowedExit in order to detect
them and bail out we need to add the non-header PHI nodes to
AllowedExit, otherwise we end up with miscompiles.

Solves https://bugs.llvm.org/show_bug.cgi?id=43166

Reviewers: fhahn, Ayal

Reviewed By: fhahn, Ayal

Subscribers: anna, hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67074

llvm-svn: 370721
2019-09-03 09:33:55 +00:00
Bjorn Pettersson
f4d7c30670 [LV] Precommit test case showing miscompile from PR43166. NFC
Summary:  Precommit test case showing miscompile from PR43166.

Reviewers: fhahn, Ayal

Reviewed By: fhahn

Subscribers: rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67072

llvm-svn: 370720
2019-09-03 09:33:40 +00:00
Sjoerd Meijer
654ab28c6e [LV] Tail-folding, runtime scev checks
Now that we allow tail-folding, not only when we optimise for size, make
sure we do not run in this assert.

Differential revision: https://reviews.llvm.org/D66932

llvm-svn: 370711
2019-09-03 08:53:02 +00:00
Sjoerd Meijer
7b5470772a [LV] Tail-folding with runtime memory checks
The loop vectorizer was running in an assert when it tried to fold the tail and
had to emit runtime memory disambiguation checks.

Differential revision: https://reviews.llvm.org/D66803

llvm-svn: 370707
2019-09-03 08:38:24 +00:00
James Molloy
33edb0a6f1 [MachinePipeliner] Add a way to unit-test the schedule emitter
Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself.

One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway.

This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at.

We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with:

  llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir

And run the emission in isolation with:

  llc < test.mir -run-pass=modulo-schedule-test

llvm-svn: 370705
2019-09-03 08:20:31 +00:00
Sam Tebbs
f88ca36af5 [ARM] Select vmla
This patch adds vmla selection.

Differential revision: https://reviews.llvm.org/D66297

llvm-svn: 370704
2019-09-03 08:17:46 +00:00
Craig Topper
26d299992f [X86] Simplify the setOperationAction handling for fp_to_uint by improving the Custom handler a bit.
This merges the 32-bit and 64-bit mode code to just use Custom
for both i32 and i64. We already had most of the handling in
the custom handling due to the AVX512 having legal fp_to_uint.
Just needed to add the i32->i64 promotion handling. Refactor
the fp_to_uint code in the custom handler to simplify the
number of times we check things.

Tweak cost model tables to match the default handling we were
getting due to Expand before.

llvm-svn: 370700
2019-09-03 05:57:22 +00:00
Craig Topper
65cc8b4a7f [X86] Don't use Expand for i32 fp_to_uint on SSE1/2 targets on 32-bit target.
Use Custom lowering instead. Fall back to default expansion only
when the scalar FP type belongs in an XMM register. This improves
lowering for i32 to fp80, and also i32 to double on SSE1 only.

llvm-svn: 370699
2019-09-03 05:57:18 +00:00
Craig Topper
01e84c517f [X86] Add an exhaustive test for i32 fptosi/fptoui across different triples and features.
llvm-svn: 370698
2019-09-03 05:57:14 +00:00
Craig Topper
434a360032 [LegalizeDAG] Pass DAG to two calls to SDNode::dump in debug prints so that they will print target specific nodes correctly.
The dump methods can only print target node names correctly if
they can get access to the TLI object.

llvm-svn: 370694
2019-09-03 02:51:14 +00:00
Craig Topper
af345a0f8e [X86] Custom promote i32->f80 uint_to_fp on AVX512 64-bit targets.
Reuse the same code to promote all i32 uint_to_fp on 64-bit targets
to simplify the X86ISelLowering constructor.

llvm-svn: 370693
2019-09-03 02:51:10 +00:00
Simon Pilgrim
76d3d5fb33 [CostModel][X86] Add scalar sext/zext cost tests
llvm-svn: 370684
2019-09-02 21:02:51 +00:00
Craig Topper
4194f27f3f [X86] Enable fp128 as a legal type with SSE1 rather than with MMX.
FP128 values are passed in xmm registers so should be asssociated
with an SSE feature rather than MMX which uses a different set
of registers.

llc enables sse1 and sse2 by default with x86_64. But does not
enable mmx. Clang enables all 3 features by default.

I've tried to add command lines to test with -sse
where possible, but any test that returns a value in an xmm
register fails with a fatal error with -sse since we have no
defined ABI for that scenario.

llvm-svn: 370682
2019-09-02 20:16:30 +00:00
David Green
b3389c16e8 [ARM] MVE predicate bitcast test and VPSEL adjustment. NFC
llvm-svn: 370678
2019-09-02 19:03:35 +00:00
David Green
f5fd08fc0d [ARM] Use MQPR not QPR for MVE registers
We should be using MQPR, and if we don't we can get COPYs and PHIs created for
QPR. These get folded into instructions, failing verification checks.

Differential revision: https://reviews.llvm.org/D66214

llvm-svn: 370676
2019-09-02 17:18:23 +00:00
Robert Lougher
a7d1d8c4ff [TargetLowering][PS4] Add sincos(f) lib functions when target is PS4
PS4 supports sincosf and sincos. Adding the library functions enables
the sin(f)+cos(f) -> sincos(f) optimization.

Differential Revision: https://reviews.llvm.org/D67009

llvm-svn: 370675
2019-09-02 16:53:32 +00:00
Ulrich Weigand
68d6f27a24 [SystemZ] Support constrained fpto[su]i intrinsics
Now that constrained fpto[su]i intrinsic are available,
add codegen support to the SystemZ backend.

In addition to pure back-end changes, I've also needed
to add the strict_fp_to_[su]int and any_fp_to_[su]int
pattern fragments in the obvious way.

llvm-svn: 370674
2019-09-02 16:49:29 +00:00
Kerry McLaughlin
acca14efdf [SVE][Inline-Asm] Support for SVE asm operands
Summary:
Adds the following inline asm constraints for SVE:
  - w: SVE vector register with full range, Z0 to Z31
  - x: Restricted to registers Z0 to Z15 inclusive.
  - y: Restricted to registers Z0 to Z7 inclusive.

This change also adds the "z" modifier to interpret a register as an SVE register.

Not all of the bitconvert patterns added by this patch are used, but they have been included here for completeness.

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, rengolin, cameron.mcinally, greened

Reviewed By: sdesmalen

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66302

llvm-svn: 370673
2019-09-02 16:12:31 +00:00
Simon Pilgrim
3e7cad2294 [X86] getPMOVMSKB - add MVT::v64i8 handling and remove from combineBitcastvxi1. NFCI.
llvm-svn: 370670
2019-09-02 15:10:35 +00:00
George Rimar
2e52b72fcb Recommit r370661 "[llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name."
Fix: add a 'consumeError()' call to ObjectFile.cpp.
This error was never checked.

Original commit message:

It adds a test case for a problem fixed by D66976 <https://reviews.llvm.org/D66976>.

It was introduced by me in D66089 <https://reviews.llvm.org/D66089>.
The error reported was never consumed because of a wrong variable name used,
so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used.

Differential revision: https://reviews.llvm.org/D67002

llvm-svn: 370669
2019-09-02 14:57:35 +00:00
Sanjay Patel
6a053c6ace [DAGCombiner] try to form test+set out of shift+mask patterns
The motivating bugs are:
https://bugs.llvm.org/show_bug.cgi?id=41340
https://bugs.llvm.org/show_bug.cgi?id=42697

As discussed there, we could view this as a failure of IR canonicalization,
but then we would need to implement a backend fixup with target overrides
to get this right in all cases. Instead, we can just view this as a codegen
opportunity. It's not even clear for x86 exactly when we should favor
test+set; some CPUs have better theoretical throughput for the ALU ops than
bt/test.

This patch is made more complicated than I expected because there's an early
DAGCombine for 'and' that can change types of the intermediate ops via
trunc+anyext.

Differential Revision: https://reviews.llvm.org/D66687

llvm-svn: 370668
2019-09-02 14:52:09 +00:00
Jay Foad
1f578036a2 Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0"
Summary:
D61491 caused us to use relocs when they're not strictly necessary, to
refer to symbols in the text section. This is a pessimization and it's a
problem for some loaders that don't support relocs yet.

Reviewers: nhaehnle, arsenm, tpr

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65813

llvm-svn: 370667
2019-09-02 14:40:57 +00:00
Dmitry Preobrazhensky
011ce9a859 [AMDGPU][MC][GFX10] Corrected constant bus checks to exclude null
See AMD SWDEV-157286

Reviewers: atamazov, arsenm

Differential Revision: https://reviews.llvm.org/D65229

llvm-svn: 370665
2019-09-02 14:19:52 +00:00
Thomas Preud'homme
f7825f2b74 [FileCheck] Make NumericVariable ctor explicit
Summary:
Make FileCheckNumericVariable constructor explicit to avoid implicit
conversions from StringRef.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66640

llvm-svn: 370664
2019-09-02 14:04:05 +00:00
Thomas Preud'homme
393fb2cc2e [FileCheck] Forbid using var defined on same line
Summary:
Commit r366897 introduced the possibility to set a variable from an
expression, such as [[#VAR2:VAR1+3]]. While introducing this feature, it
introduced extra logic to allow using such a variable on the same line
later on. Unfortunately that extra logic is flawed as it relies on a
mapping from variable to expression defining it when the mapping is from
variable definition to expression. This flaw causes among other issues
PR42896.

This commit avoids the problem by forbidding all use of a variable
defined on the same line, and removes the now useless logic. Redesign
will be done in a later commit because it will require some amount of
refactoring first for the solution to be clean. One example is the need
for some sort of transaction mechanism to set a variable temporarily and
from an expression and rollback if the CHECK pattern does not match so
that diagnostics show the right variable values.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: JonChesterfield, rogfer01, hfinkel, kristina, rnk, tra, arichardson, grimar, dblaikie, probinson, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66141

llvm-svn: 370663
2019-09-02 14:04:00 +00:00
George Rimar
59a79d45c8 Revert r370661 "[llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name"
It broke BB:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16955/steps/test/logs/stdio

Expected<T> must be checked before access or destruction.
Unchecked Expected<T> contained error:
a section [index 1] has an invalid sh_name (0xffff) offset which goes past the end of the section name string tableStack dump:
0.	Program arguments: /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/test/tools/llvm-nm/Output/format-sysv-section.test.tmp2.o --format=sysv 
 #0 0x00000000008af7c4 PrintStackTraceSignalHandler(void*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8af7c4)
 #1 0x00000000008ad8be llvm::sys::RunSignalHandlers() (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8ad8be)
 #2 0x00000000008afbd8 SignalHandler(int) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8afbd8)
 #3 0x00007f0a6b989730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730)
 #4 0x00007f0a6b48d7bb raise (/lib/x86_64-linux-gnu/libc.so.6+0x377bb)
 #5 0x00007f0a6b478535 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22535)
 #6 0x000000000042004b llvm::Expected<llvm::StringRef>::fatalUncheckedExpected() const (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x42004b)
 #7 0x00000000008367f5 (/sv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8367f5)
 #8 0x0000000000817b80 llvm::object::IRObjectFile::findBitcodeInObject(llvm::object::ObjectFile const&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x817b80)
 #9 0x0000000000838416 llvm::object::SymbolicFile::createSymbolicFile(llvm::MemoryBufferRef, llvm::file_magic, llvm::LLVMContext*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x838416)
#10 0x00000000007f36cb llvm::object::createBinary(llvm::MemoryBufferRef, llvm::LLVMContext*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x7f36cb)
#11 0x0000000000413123 dumpSymbolNamesFromFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x413123)
#12 0x0000000000412e38 main (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x412e38)
#13 0x00007f0a6b47a09b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409b)
#14 0x00000000004120da _start (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x4120da)
FileCheck error: '-' is empty.
FileCheck command line:  /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.src/test/tools/llvm-nm/format-sysv-section.test --check-prefix=ERR

--

llvm-svn: 370662
2019-09-02 14:03:50 +00:00
George Rimar
8942653f23 [llvm-nm] - Add a test case for case when we dump a symbol that belongs to a section with a broken sh_name.
It adds a test case for a problem fixed by D66976.

It was introduced by me in D66089.
The error reported was never consumed because of a wrong variable name used,
so it could fail when LLVM_ENABLE_ABI_BREAKING_CHECKS is used.

Differential revision: https://reviews.llvm.org/D67002

llvm-svn: 370661
2019-09-02 13:54:45 +00:00
Dmitry Preobrazhensky
8fbb623003 [AMDGPU][MC][GFX10] Enabled null with 64-bit operands
See Bug 42745: https://bugs.llvm.org/show_bug.cgi?id=42745

Reviewers: atamazov, arsenm

https://reviews.llvm.org/D65231

llvm-svn: 370660
2019-09-02 13:42:25 +00:00
Sanjay Patel
ed830e5e7b [InstCombine] recognize bswap disguised as shufflevector
bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N*8} --> bswap (bitcast X to i{N*8})

In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
...we have a more complicated case where SLP is making a mess of bswap. This patch won't
do anything for that currently, but we need to improve bswap recognition in instcombine,
SLP, and/or a standalone pass to avoid that problem.

This is limited using the data-layout so we don't try to do this transform with actual
vector types. The backend does not appear to have folds to convert in either direction,
so we don't want to mess up something that is actually better lowered as a shuffle.

On x86, we're trading something like this:

  vmovd	%edi, %xmm0
  vpshufb	LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u]
  vmovd	%xmm0, %eax

For:

  movl	%edi, %eax
  bswapl	%eax

Differential Revision: https://reviews.llvm.org/D66965

llvm-svn: 370659
2019-09-02 13:33:20 +00:00
Martin Storsjo
eec6c132ce [test] [llvm-dlltool] Improve test strictness a little. NFC.
llvm-svn: 370657
2019-09-02 13:28:21 +00:00
Martin Storsjo
0b651a90a6 [llvm-dlltool] Handle external and internal names with differing decoration
Also add a missed part of the test from SVN r369747.

Differential Revision: https://reviews.llvm.org/D66996

llvm-svn: 370656
2019-09-02 13:28:16 +00:00
Martin Storsjo
d718a53c2e [llvm-dlltool] Remove support for implying output name
I don't see GNU dlltool supporting doing this; with only a -d option
and no -l option, GNU dlltool runs successfully but doesn't write any
output file.

Differential Revision: https://reviews.llvm.org/D65645

llvm-svn: 370655
2019-09-02 13:28:07 +00:00
Dmitry Preobrazhensky
93effe4818 [AMDGPU][MC][GFX10] Corrected constant bus limit for 64-bit shift instructions
See bug 42744: https://bugs.llvm.org/show_bug.cgi?id=42744

Reviewers: atamazov, arsenm

Differential Revision: https://reviews.llvm.org/D65228

llvm-svn: 370652
2019-09-02 12:50:05 +00:00
Andrea Di Biagio
08b9b0a5ac [X86][BtVer2] Fix latency and throughput of conditional SIMD store instructions.
On BtVer2 conditional SIMD stores are heavily microcoded.
The latency is directly proportional to the number of packed elements extracted
from the input vector. Also, according to micro-benchmarks, most of the
computation seems to be done in the integer unit.

Only a minority of the uOPs is executed by the FPU. The observed behaviour on
the FPU looks similar to this:
 - The input MASK value is moved to the Integer Unit
   -- [ a VMOVMSK-like uOP-executed on JFPU0].
 - In parallel, each element of the input XMM/YMM is extracted and then sent to
   the IntegerUnit through JFPU1.

As expected, a (conditional) store is executed for every extracted element.
Interestingly, a (speculative) load is executed for every extracted element too.
It is as-if a "LOAD - BIT_EXTRACT- CMOV" sequence of uOPs is repeated by the
integer unit for every contionally stored element.
VMASKMOVDQU is a special case: the number of speculative loads is always 2
(presumably, one load per quadword). That means, extra shifts and masking is
performed on (one of) the loaded quadwords before each conditional store (that
also explains the big number of non-FP uOPs retired).

This patch replaces the existing writes for conditional SIMD stores (i.e.
WriteFMaskedStore, and WriteFMaskedStoreY) with the following new writes:

  WriteFMaskedStore32  [ XMM Packed Single ]
  WriteFMaskedStore32Y [ YMM Packed Single ]
  WriteFMaskedStore64  [ XMM Packed Double ]
  WriteFMaskedStore64Y [ YMM Packed Double ]

Added a wrapper class named X86SchedWriteMaskMove in X86Schedule.td to describe
both RM and MR variants for conditional SIMD moves in a single tablegen
definition.
Instances of that class are then passed in input to multiclass avx_movmask_rm
when constructing MASKMOVPS/PD definitions.

Since this patch introduces new writes, I had to update all the X86 scheduling
models.

Differential Revision: https://reviews.llvm.org/D66801

llvm-svn: 370649
2019-09-02 12:32:28 +00:00
Jeremy Morse
7698ed73e1 [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations
The missing line added by this patch ensures that only spilt variable
locations are candidates for being restored from the stack. Otherwise,
register or constant-value information can be interpreted as a spill
location, through a union.

The added regression test replicates a scenario where this occurs: the
stack load from [rsp] causes the register-location DBG_VALUE to be
"restored" to rsi, when it should be left alone. See PR43058 for details.

Un x-fail a test that was suffering from this from a previous patch.

Differential Revision: https://reviews.llvm.org/D66895

llvm-svn: 370648
2019-09-02 12:28:36 +00:00
James Henderson
277b2b47c0 [llvm-strings][test] Merge two closely related tests
This is a follow-up to feedback on D66015.

Reviewed by: grimar

Differential Revision: https://reviews.llvm.org/D67069

llvm-svn: 370643
2019-09-02 11:42:30 +00:00
Nandor Licker
bd1a6d1125 Revert [Clang Interpreter] Initial patch for the constexpr interpreter
This reverts r370636 (git commit 8327fed9475a14c3376b4860c75370c730e08f33)

llvm-svn: 370642
2019-09-02 11:34:47 +00:00
Simon Pilgrim
a05c487c5f [X86] combineHorizontalPredicateResult - pull out repeated getTargetLoweringInfo() calls. NFCI.
llvm-svn: 370637
2019-09-02 10:42:48 +00:00
Nandor Licker
aec22e274e [Clang Interpreter] Initial patch for the constexpr interpreter
Summary:
This patch introduces the skeleton of the constexpr interpreter,
capable of evaluating a simple constexpr functions consisting of
if statements. The interpreter is described in more detail in the
RFC. Further patches will add more features.

Reviewers: Bigcheese, jfb, rsmith

Subscribers: bruno, uenoku, ldionne, Tyker, thegameg, tschuett, dexonsmith, mgorny, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64146

llvm-svn: 370636
2019-09-02 10:38:08 +00:00