1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

172856 Commits

Author SHA1 Message Date
Fangrui Song
1f32d67d40 [Object] Rename getRelrRelocationType to getRelativeRelocationType
Summary:
The two utility functions were added in D47919 to support SHT_RELR.
However, these are just relative relocations types and are't
necessarily be named Relr.

Reviewers: phosek, dberris

Reviewed By: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55691

llvm-svn: 349133
2018-12-14 07:46:58 +00:00
Petr Hosek
6a4e651274 [llvm-xray] Use correct variable name
This fixes the compiler error introduced in r349129.

llvm-svn: 349130
2018-12-14 06:06:19 +00:00
Petr Hosek
98196da44e [llvm-xray] Store offset pointers in temporaries
DataExtractor::getU64 modifies the OffsetPtr which also pass to
RelocateOrElse which breaks on Windows. This addresses the issue
introduced in r349120.

Differential Revision: https://reviews.llvm.org/D55689

llvm-svn: 349129
2018-12-14 05:56:20 +00:00
Nico Weber
349a17eff9 [gn build] Merge r348963 and r349076
llvm-svn: 349124
2018-12-14 03:20:46 +00:00
Petr Hosek
aa9b3d7c8c [llvm-xray] Support for PIE
When the instrumented binary is linked as PIE, we need to apply the
relative relocations to sleds. This is handled by the dynamic linker
at runtime, but when processing the file we have to do it ourselves.

Differential Revision: https://reviews.llvm.org/D55542

llvm-svn: 349120
2018-12-14 01:37:56 +00:00
Alex Lorenz
6f0d069409 [macho] save the SDK version stored in module metadata into the version min and
build version load commands in the object file

This commit introduces a new metadata node called "SDK Version". It will be set
by the frontend to mark the platform SDK (macOS/iOS/etc) version which was used
during that particular compilation.
This node is used when machine code is emitted, by either saving the SDK version
into the appropriate macho load command (version min/build version), or by
emitting the assembly for these load commands with the SDK version specified as
well.
The assembly for both load commands is extended by allowing it to contain the
sdk_version X, Y [, Z] trailing directive to represent the SDK version
respectively.

rdar://45774000

Differential Revision: https://reviews.llvm.org/D55612

llvm-svn: 349119
2018-12-14 01:14:10 +00:00
Reid Kleckner
ef5fb71f38 Silence CMP0048 warning in the benchmark utility library
I'm testing this in LLVM before sending it upstream.

Part of PR38874

llvm-svn: 349097
2018-12-14 00:17:12 +00:00
Nico Weber
4635e976e8 [gn build] Add infrastructure to create symlinks and use it to create lld's symlinks
This is slightly involved, see the comments in the code.

The GN build now builds a functional lld!

Differential Revision: https://reviews.llvm.org/D55606

llvm-svn: 349096
2018-12-14 00:16:33 +00:00
Sanjay Patel
f44785a46f [DAGCombiner] clean up visitEXTRACT_VECTOR_ELT
This isn't quite NFC, but I don't know how to expose
any outward diffs from these changes. Mostly, this
was confusing because it used 'VT' to refer to the
operand type rather the usual type of the input node.

There's also a large block at the end that is dedicated 
solely to matching loads, but that wasn't obvious. This
could probably be split up into separate functions to
make it easier to see. 

It's still not clear to me when we make certain transforms 
because the legality and constant conditions are 
intertwined in a way that might be improved.

llvm-svn: 349095
2018-12-14 00:09:08 +00:00
Craig Topper
5cd99cd6e6 [X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp.
This requires the two callers to manifest a 0 to make EmitCmp call EmitTest.

I'm looking into changing how we combine TEST and flag setting instructions to not be part of lowering. And instead be part of DAG combine or isel. Which will mean EmitTest will probably become gutted and maybe disappear entirely.

llvm-svn: 349094
2018-12-13 23:55:30 +00:00
Evgeniy Stepanov
4765d46c23 Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)"
Breaks sanitizer-android buildbot.

This reverts commit af8443a984c3b491c9ca2996b8d126ea31e5ecbe.

llvm-svn: 349092
2018-12-13 23:47:50 +00:00
Evandro Menezes
6b61a418d6 [AArch64] Fix Exynos predicates (NFC)
Fix the logic in the definition of the `ExynosShiftExPred` as a more
specific version of `ExynosShiftPred`.  But, since `ExynosShiftExPred` is
not used yet, this change has NFC.

llvm-svn: 349091
2018-12-13 23:19:46 +00:00
Wei Mi
a323ae1555 [SampleFDO] handle ProfileSampleAccurate when initializing function entry count
ProfileSampleAccurate is used to indicate the profile has exact match to the
code to be optimized.

Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite
and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function
entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle
ProfileSampleAccurate in multiple places.

Differential Revision: https://reviews.llvm.org/D55660

llvm-svn: 349088
2018-12-13 21:51:42 +00:00
Aakanksha Patil
ee2b4d8ed1 Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute
This patch breaks RADV (and probably RadeonSI as well)

llvm-svn: 349084
2018-12-13 21:23:12 +00:00
Matt Arsenault
c25740dde7 AMDGPU/GlobalISel: Legalize/regbankselect block_addr
llvm-svn: 349081
2018-12-13 20:34:15 +00:00
Nikita Popov
c645cd0965 Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail"
Currently memcpyopt optimizes cases like

    memset(a, byte, N);
    memcpy(b, a, M);

to

    memset(a, byte, N);
    memset(b, byte, M);

if M <= N. Often this allows further simplifications down the line,
which drop the first memset entirely.

This patch extends this optimization for the case where M > N, but we
know that the bytes a[N..M] are undef due to alloca/lifetime.start.

This situation arises relatively often for Rust code, because Rust does
not initialize trailing structure padding and loves to insert redundant
memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.

The previous version of this patch did not perform dependency checking
properly: While the dependency is checked at the position of the memset,
the used size must be that of the memcpy. Previously the size of the
memset was used, which missed modification in the region
MemSetSize..CopySize, resulting in miscompiles. The added tests cover
variations of this issue.

Differential Revision: https://reviews.llvm.org/D55120

llvm-svn: 349078
2018-12-13 20:04:27 +00:00
Easwaran Raman
8482e48d19 [ThinLTO] Compute synthetic function entry count
Summary:
This patch computes the synthetic function entry count on the whole
program callgraph (based on module summary) and writes the entry counts
to the summary. After function importing, this count gets attached to
the IR as metadata. Since it adds a new field to the summary, this bumps
up the version.

Reviewers: tejohnson

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D43521

llvm-svn: 349076
2018-12-13 19:54:27 +00:00
Mircea Trofin
8fb4b5dbb4 [llvm] Address base discriminator overflow in X86DiscriminateMemOps
Summary:
Macros are expanded on a single line. In case of large expansions,
with sufficiently many instructions with memory operands (and when
-fdebug-info-for-profiling is requested), we may be unable to generate
new base discriminator values - new values overflow (base
discriminators may not be larger than 2^12).

This CL warns instead of asserting in such a case. A subsequent CL
will add APIs to check for overflow before creating new debug info.

See https://bugs.llvm.org/show_bug.cgi?id=39890

Reviewers: davidxl, wmi, gbedwell

Reviewed By: davidxl

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D55643

llvm-svn: 349075
2018-12-13 19:40:59 +00:00
Jordan Rupprecht
b88c3ef139 [llvm-size][libobject] Add explicit "inTextSegment" methods similar to "isText" section methods to calculate size correctly.
Summary:
llvm-size uses "isText()" etc. which seem to indicate whether the section contains code-like things, not whether or not it will actually go in the text segment when in a fully linked executable.

The unit test added (elf-sizes.test) shows some types of sections that cause discrepencies versus the GNU size tool. llvm-size is not correctly reporting sizes of things mapping to text/data segments, at least for ELF files.

This fixes pr38723.

Reviewers: echristo, Bigcheese, MaskRay

Reviewed By: MaskRay

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54369

llvm-svn: 349074
2018-12-13 19:40:12 +00:00
Craig Topper
fc983d8fa5 [CostModel][X86] Don't count 2 shuffles on the last level of a pairwise arithmetic or min/max reduction
This is split from D55452 with the correct patch this time.

Pairwise reductions require two shuffles on every level but the last. On the last level the two shuffles are <1, u, u, u...> and <0, u, u, u...>, but <0, u, u, u...> will be dropped by InstCombine/DAGCombine as being an identity shuffle.

Differential Revision: https://reviews.llvm.org/D55615

llvm-svn: 349072
2018-12-13 19:08:10 +00:00
Stefan Granitz
b659e7b4c4 [CMake] llvm_codesign workaround for Xcode double-signing errors
Summary:
When using Xcode to build LLVM with code signing, the post-build rule is executed even if the actual build-step was skipped. This causes double-signing errors. We can currently only avoid it by passing the `--force` flag.

Plus some polishing for my previous patch D54443.

Reviewers: beanz, kubamracek

Reviewed By: kubamracek

Subscribers: #lldb, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D55116

llvm-svn: 349070
2018-12-13 18:51:19 +00:00
Davide Italiano
d8651da4d3 [LoopUtils] Use i32 instead of void.
The actual type of the first argument of the @dbg intrinsic
doesn't really matter as we're setting it to `undef`, but the
bitcode reader is picky about `void` types.

llvm-svn: 349069
2018-12-13 18:37:23 +00:00
Tom Stellard
6ece9abfcd Don't add unnecessary compiler flags to llvm-config output
Summary:
llvm-config --cxxflags --cflags, should only output the minimal flags
required to link against the llvm libraries.  They currently contain
all flags used to compile llvm including flags like -g, -pedantic,
-Wall, etc, which users may not always want.

This changes the llvm-config output to only include flags that have been
explictly added to the COMPILE_FLAGS property of the llvm-config target
by the llvm build system.

llvm.org/PR8220

Output from llvm-config when running cmake with:
cmake -G Ninja .. -DCMAKE_CXX_FLAGS=-funroll-loops

Before:

--cppflags: -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include
            -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
--cflags:   -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include
            -fPIC -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings \
            -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough \
            -Wno-comment -fdiagnostics-color -g -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS \
            -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
--cxxflags: -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include\
            -funroll-loops -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall \
            -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers \
            -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized \
            -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment \
            -fdiagnostics-color -g  -fno-exceptions -fno-rtti -D_GNU_SOURCE -D_DEBUG \
            -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS"

After:

--cppflags: -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include \
            -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
--cflags:   -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include \
            -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
--cxxflags: -I$HEADERS_DIR/llvm/include -I$HEADERS_DIR/llvm/build/include \
             -std=c++11   -fno-exceptions -fno-rtti \
             -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS

Reviewers: sylvestre.ledru, infinity0, mgorny

Reviewed By: sylvestre.ledru, mgorny

Subscribers: mgorny, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D55391

llvm-svn: 349068
2018-12-13 18:21:23 +00:00
Zachary Turner
5ec91b4438 Correctly handle skewed streams in drop_front() method.
When calling BinaryStreamArray::drop_front(), if the stream
is skewed it means we must never drop the first bytes of the
stream since offsets which occur in records assume the existence
of those bytes.  So if we want to skip the first record in a
stream, then what we really want to do is just set the begin
pointer to the next record.  But we shouldn't actually remove
those bytes from the underlying view of the data.

llvm-svn: 349066
2018-12-13 18:11:33 +00:00
Francis Visoiu Mistrih
a0043f81b2 [MachO][TLOF] Add support for local symbols in the indirect symbol table
On 32-bit archs, before, we would assume that an indirect symbol will
never have local linkage. This can lead to miscompiles where the
symbol's value would be 0 and the linker would use that value, because
the indirect symbol table would contain the value
`INDIRECT_SYMBOL_LOCAL` for that specific symbol.

Differential Revision: https://reviews.llvm.org/D55573

llvm-svn: 349060
2018-12-13 17:23:30 +00:00
Sanjay Patel
ab6a3a3093 [DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract; 2nd try
This is a retry of rL349051 (reverted at rL349056). I changed the check for dead-ness from
number of uses to an opcode test for DELETED_NODE based on existing similar code.

Differential Revision: https://reviews.llvm.org/D55655

llvm-svn: 349058
2018-12-13 17:05:01 +00:00
Simon Pilgrim
240f871115 [X86][SSE] Add SSE vector imm/var shift support to SimplifyDemandedVectorEltsForTargetNode
llvm-svn: 349057
2018-12-13 16:39:29 +00:00
Sanjay Patel
ca019ed7bd revert rL349051: [DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract
This causes an address sanitizer bot failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/27187/steps/check-llvm%20asan/logs/stdio

llvm-svn: 349056
2018-12-13 16:32:44 +00:00
Daniel Sanders
28c46e8037 Recommit r349041: [tblgen][disasm] Separate encodings from instructions
Removed const from the ArrayRef<const EncodingAndInst> to avoid the
std::vector<const EncodingAndInst> that G++ saw

llvm-svn: 349055
2018-12-13 16:17:54 +00:00
Simon Pilgrim
e579f1bff3 [X86][SSE] Fix all remaining modulo vector rotation amounts (PR38243)
There's still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats that will be fixed in future patches.

llvm-svn: 349052
2018-12-13 15:50:31 +00:00
Sanjay Patel
ea2b479c28 [DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract
Differential Revision: https://reviews.llvm.org/D55655

llvm-svn: 349051
2018-12-13 15:44:26 +00:00
Daniel Cederman
e3cb6ccb12 [Sparc] Add membar assembler tags
Summary: The Sparc V9 membar instruction can enforce different types of
memory orderings depending on the value in its immediate field.  In the
architectural manual the type is selected by combining different assembler
tags into a mask. This patch adds support for these tags.

Reviewers: jyknight, venkatra, brad

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D53491

llvm-svn: 349048
2018-12-13 15:29:12 +00:00
Simon Pilgrim
8f5885f53e [X86][SSE] Fix modulo rotation amounts for v8i16/v16i16/v4i32 (PR38243)
llvm-svn: 349047
2018-12-13 15:23:09 +00:00
Daniel Sanders
bf52210850 Revert r349041: [tblgen][disasm] Separate encodings from instructions
One of the GCC based bots is objecting to a vector of const EncodingAndInst's:
In file included from /usr/include/c++/8/vector:64,
                 from /export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/CodeGenInstruction.h:22,
                 from /export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:15:
/usr/include/c++/8/bits/stl_vector.h: In instantiation of 'class std::vector<const {anonymous}::EncodingAndInst, std::allocator<const {anonymous}::EncodingAndInst> >':
/export/users/atombot/llvm/clang-atom-d525-fedora-rel/llvm/utils/TableGen/FixedLenDecoderEmitter.cpp:375:32:   required from here
/usr/include/c++/8/bits/stl_vector.h:351:21: error: static assertion failed: std::vector must have a non-const, non-volatile value_type
       static_assert(is_same<typename remove_cv<_Tp>::type, _Tp>::value,
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/8/bits/stl_vector.h:354:21: error: static assertion failed: std::vector must have the same value_type as its allocator
       static_assert(is_same<typename _Alloc::value_type, _Tp>::value,
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

llvm-svn: 349046
2018-12-13 15:14:21 +00:00
Daniel Cederman
a4358d0136 [Sparc] Use float register for integer constrained with "f" in inline asm
Summary:
Constraining an integer value to a floating point register using "f"
causes an llvm_unreachable to trigger. This patch allows i32 integers
to be placed in a single precision float register and i64 integers to
be placed in a double precision float register. This matches the behavior
of GCC.

For other types the llvm_unreachable is removed to instead trigger an
error message that points out the offending line.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51614

llvm-svn: 349045
2018-12-13 15:13:29 +00:00
Jinsong Ji
ba49832db6 [PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion
There are several Pseudo in PowerPC backend. 
eg:

* ISel Pseudo-instructions , which has let usesCustomInserter=1 in td 
ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them.
* Post-RA pseudo instruction, which has let isPseudo = 1 in td, or Standard pseudo (SUBREG_TO_REG,COPY etc.) 
ExpandPostRAPseudos -> expandPostRAPseudo will expand them
* Multi-instruction pseudo operations will expand them PPCAsmPrinter::EmitInstruction
* Pseudo instruction in CodeEmitter, which has encoding of 0.

Currently, in td files, especially PPCInstrVSX.td, 
we did not distinguish Post-RA pseudo instruction and Pseudo instruction in CodeEmitter very clearly.

This patch is to

* Rename Pseudo<> class to PPCEmitTimePseudo, which means encoding of 0 in CodeEmitter
* Introduce new class PPCPostRAExpPseudo <> for previous PostRA Pseudo
* Introduce new class PPCCustomInserterPseudo <> for previous Isel Pseudo

Differential Revision: https://reviews.llvm.org/D55143

llvm-svn: 349044
2018-12-13 15:12:57 +00:00
Daniel Sanders
d7ebd4dbc1 [mir] Fix uninitialized variable in r349035 noticed by clang-atom-d525-fedora-rel and 3 other bots
llvm-svn: 349043
2018-12-13 15:05:27 +00:00
Daniel Sanders
05a85ce79b [tblgen][disasm] Separate encodings from instructions
Summary:
Separate the concept of an encoding from an instruction. This will enable
the definition of additional encodings for the same instruction which can
be used to support variable length instruction sets in the disassembler
(and potentially assembler but I'm not working towards that right now)
without causing an explosion in the number of Instruction records that
CodeGen then has to pick between.

Reviewers: bogner, charukcs

Reviewed By: bogner

Subscribers: kparzysz, llvm-commits

Differential Revision: https://reviews.llvm.org/D52366

llvm-svn: 349041
2018-12-13 14:55:57 +00:00
Simon Pilgrim
bab6760889 [X86][SSE] Merge the vXi16/vXi32 vector rotation expansion cases. NFCI.
Merged the repeated code into a single if().

llvm-svn: 349040
2018-12-13 14:51:28 +00:00
Jonas Paulsson
6239df0eb2 [SystemZ] Pass copy-hinted regs first from getRegAllocationHints().
When computing register allocation hints for a GRX32Bit register, make sure
that any of the hinted registers that are also copy hints are returned first
in the list.

Review: Ulrich Weigand.
llvm-svn: 349037
2018-12-13 14:37:05 +00:00
Daniel Sanders
cceb8ca160 [mir] Serialize DILocation inline when not possible to use a metadata reference
Summary:
Sometimes MIR-level passes create DILocations that were not present in the
LLVM-IR. For example, it may merge two DILocations together to produce a
DILocation that points to line 0.

Previously, the address of these DILocations were printed which prevented the
MIR from being read back into LLVM. With this patch, DILocations will use
metadata references where possible and fall back on serializing them inline like so:
    MOV32mr %stack.0.x.addr, 1, _, 0, _, %0, debug-location !DILocation(line: 1, scope: !15)

Reviewers: aprantl, vsk, arphaman

Reviewed By: aprantl

Subscribers: probinson, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D55243

llvm-svn: 349035
2018-12-13 14:25:27 +00:00
Simon Pilgrim
b4fa86589e [X86][BWI] Don't custom lower vXi8 rotations.
We always expand to shifts anyhow - test changes are just different scheduling only.

llvm-svn: 349034
2018-12-13 13:44:33 +00:00
Chen Zheng
50e80400c5 [NFC][PowerPC] add verify-machineinstrs check
After rL349029 and rL348566, sj-ctr-loop.ll is ok for verify-machineinstrs check.

llvm-svn: 349030
2018-12-13 12:55:42 +00:00
Chen Zheng
c2a6277f77 [PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier.
Differential Revision: https://reviews.llvm.org/D55499

llvm-svn: 349029
2018-12-13 12:25:20 +00:00
Simon Pilgrim
1e9c4bce6b [DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombiner
Remove common code from custom lowering (code is still safe if somehow a zero value gets used).

llvm-svn: 349028
2018-12-13 12:23:32 +00:00
Diana Picus
5026fe4d53 [ARM GlobalISel] Support exts and truncs for Thumb2
Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for
them in the instruction selector. This uses handwritten code again
because the patterns that are generated with TableGen are tuned for what
the DAG combiner would produce and not for simple sext/zext nodes.
Luckily, we only need to update the opcodes to use the Thumb2 variants,
everything else can be reused from ARM.

llvm-svn: 349026
2018-12-13 12:06:54 +00:00
Simon Pilgrim
9527e2e426 [TargetLowering] Add ISD::ROTL/ROTR vector expansion
Move existing rotation expansion code into TargetLowering and set it up for vectors as well.

Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment.

Begun removing x86 vector rotate custom lowering to use the expansion.

llvm-svn: 349025
2018-12-13 11:20:48 +00:00
Alex Bradbury
12e6230097 [RISCV] Add support for the various RISC-V FMA instruction variants
Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).

The criteria for choosing whether a fused add or subtract is used, as well as
whether the product is negated or not, is whether some of the arguments to the
llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd
instructions were added to avoid the negation being performed using a xor
trick, which prevented the proper FMA forms from being selected and thus
tested.

The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 -
rs3), but they should be correct. The misleading names were inherited from
MIPS, where the negation happens after computing the sum.

The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions,
as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd.

Some comments in the test files about what type of instructions are there
tested were updated, to better reflect the current content of those test
files.

Differential Revision: https://reviews.llvm.org/D54205
Patch by Luís Marques.

llvm-svn: 349023
2018-12-13 10:49:05 +00:00
Arnaud A. de Grandmaison
9ed5c53397 [AArch64] Catch some more CMN opportunities.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33486

llvm-svn: 349022
2018-12-13 10:31:32 +00:00
Clement Courbet
ee0b0ebdcd [CodeGen] Allow mempcy/memset to generate small overlapping stores.
Summary:
All targets either just return false here or properly model `Fast`, so I
don't think there is any reason to prevent CodeGen from doing the right
thing here.

Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D55365

llvm-svn: 349016
2018-12-13 09:56:19 +00:00