Fixes PR40332 in the limited case where we're selecting between a target shuffle and a zero vector.
We can extend this in the future to handle more opcodes and non-zero selections.
llvm-svn: 359378
Summary: If we have SSE2 we can use a MOVQ to store 64-bits and avoid falling back to a cmpxchg8b loop. If its a seq_cst store we need to insert an mfence after the store.
Reviewers: spatel, RKSimon, reames, jfb, efriedma
Reviewed By: RKSimon
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60546
llvm-svn: 359368
This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4.
We discovered some internal test failures, so reverting for now.
Differential Revision: https://reviews.llvm.org/D61213
llvm-svn: 359363
ObjectLinkingLayer::Plugin provides event notifications when objects are loaded,
emitted, and removed. It also provides a modifyPassConfig callback that allows
plugins to modify the JITLink pass configuration.
This patch moves eh-frame registration into its own plugin, and teaches
llvm-jitlink to only add that plugin when performing execution runs on
non-Windows platforms. This should allow us to re-enable the test case that was
removed in r359198.
llvm-svn: 359357
getConstantVRegValWithLookThrough does the same thing as the
getConstantValueForReg function, and has more visibility across GISel. Plus, it
supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code
reuse and more functionality for free by using it.
Add some test cases to select-extract-vector-elt.mir to show that we can now
look through those instructions.
llvm-svn: 359351
Summary:
Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when
printing the address of a MachineOperand::MO_GlobalAddress. Move that
handling into a new overriden method in each base class. A virtual
method was added to the base class for handling the generic case.
Refactors a few subclasses to support the target independent %a, %c, and
%n.
The patch also contains small cleanups for AVRAsmPrinter and
SystemZAsmPrinter.
It seems that NVPTXTargetLowering is possibly missing some logic to
transform GlobalAddressSDNodes for
TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended
inline assembly asm constraints.
Fixes:
- https://bugs.llvm.org/show_bug.cgi?id=41402
- https://github.com/ClangBuiltLinux/linux/issues/449
Reviewers: echristo, void
Reviewed By: void
Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60887
llvm-svn: 359337
There are instructions for these, so mark them as legal. Select the correct
instruction in AArch64InstructionSelector.cpp.
Update select-bswap.mir and arm64-rev.ll to reflect the changes.
llvm-svn: 359331
The PPC vector cost model values for insert/extract element reflect older
processors that lacked vector insert/extract and move-to/move-from VSR
instructions. Update getVectorInstrCost to give appropriate values for when
the newer instructions are present.
Differential Revision: https://reviews.llvm.org/D60160
llvm-svn: 359313
In addition, fix and convert the two tests to yaml2obj based. This
allows us to delete two executables.
X86/weak.test: 'v' was not tested
X86/init-fini.test: symbol types of __bss_start _edata _end were wrong
GNU nm reports __init_array_start as 't', and __preinit_array_start as 'd'.
__init_array_start is 't' just because its section ".init_array" starts with ".init"
'd' makes more sense and allows us to drop the weird SHT_INIT_ARRAY rule.
So, change __init_array_start to 'd' instead.
llvm-svn: 359311
As detailed on PR40758, Bobcat/Jaguar can perform vector immediate shifts on the same pipes as vector ANDs with the same latency - so it doesn't make sense to replace a shl+lshr with a shift+and pair as it requires an additional mask (with the extra constant pool, loading and register pressure costs).
Differential Revision: https://reviews.llvm.org/D61068
llvm-svn: 359293
A small step towards combining shuffles across vector sizes - this recognizes when a shuffle's operands are all extracted from the same larger source and tries to combine to an unary shuffle of that source instead. Fixes one of the test cases from PR34380.
Differential Revision: https://reviews.llvm.org/D60512
llvm-svn: 359292
The code was using the alignment of a pointer to the value, not the
alignment of the constant itself.
Maybe we got away with it so far because the pointer alignment is
fairly high, but we did end up under-aligning <16 x i8> vectors,
which was caught in the Chromium build after lld stopped over-aligning
the .rodata.cst16 section in r356428. (See crbug.com/953815)
Differential revision: https://reviews.llvm.org/D61124
llvm-svn: 359287
Summary:
llvm-{objcopy,strip} (and many other LLVM binary utilities) accept
cl::opt style -long-option as well as many short options (e.g. -p -S
-x). People who use them as replacement of GNU binutils often use the
grouped option syntax (POSIX Utility Conventions), e.g. -Sx => -S -x,
-Wd => -W -d, -sj.text => -s -j.text
There is ambiguity if a long option starts with the character used by a
short option. Drop the support for -long-option to resolve the ambiguity.
This divergence from other utilities is accepted (other utilities
continue supporting -long-option).
https://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html
Reviewers: alexshap, jakehehrlich, jhenderson, rupprecht, espindola
Reviewed By: jakehehrlich, jhenderson, rupprecht
Subscribers: grimar, emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60439
llvm-svn: 359265
All of the new instructions are still handled mostly by tablegen. I've slightly
refactored the code to drive intrinsic/instruction generation from a master
list of supported variants, so all irregularities have to be implemented in one place only.
The test generation script wmma.py has been refactored in a similar way.
Differential Revision: https://reviews.llvm.org/D60015
llvm-svn: 359247
PTX 6.3 requires using ".aligned" in the MMA instruction names.
In order to generate correct name, now we pass current
PTX version to each instruction as an extra constant operand
and InstPrinter adjusts its output accordingly.
Differential Revision: https://reviews.llvm.org/D59393
llvm-svn: 359246
Adds a representation of the section header table to XCOFFObjectFile,
and implements enough to dump the section headers with llvm-obdump.
Differential Revision: https://reviews.llvm.org/D60784
llvm-svn: 359244
This case was missing before, so we couldn't legalize it.
Add it to AArch64LegalizerInfo.cpp and update select-extract-vector-elt.mir.
llvm-svn: 359231
it keeps track of becomes too large
ARC optimizer does a top-down and a bottom-up traversal of the whole
function to pair up retain and release instructions and remove them.
This can be expensive if the number of instructions in the function and
pointer states it tracks are large since it has to look at each pointer
state and determine whether the instruction being visited can
potentially use the pointer.
This patch adds a command line option that sets a limit to the number of
pointers it tracks.
rdar://problem/49477063
Differential Revision: https://reviews.llvm.org/D61100
llvm-svn: 359226
This adds a legalization rule for G_ZEXT, G_ANYEXT, and G_SEXT which allows
extends whenever the types will fit in registers (or the source is an s1).
Update tests. Add GISel checks throughout all of arm64-vabs.ll,
where we now select a good portion of the code. Add GISel checks to
arm64-subvector-extend.ll, which has a good number of vector extends in it.
Differential Revision: https://reviews.llvm.org/D60889
llvm-svn: 359222
We had special case handling here, but it uses a scalar any_extend for the
promotion then bitcasts to the final type. This won't split up the input data
into multiple promoted elements like we need.
This patch falls back to doing the conversion through memory.
Fixes PR41594 which I believe was reflected in the bitcast-vector-bool.ll
changes. The changes to vector-half-conversions.ll are fixing a previously
unknown miscompile from this issue.
Differential Revision: https://reviews.llvm.org/D61114
llvm-svn: 359219
When evaluating a store through a bitcast, the evaluator tries to move the
bitcast from the pointer onto the stored value. If the cast is invalid, it
tries to "introspect" the type to get a valid cast by obtaining a pointer to
the initial element (if the type is nested, this may require walking several
initial elements).
In some situations it is possible to get a bitcast on a load (e.g. with
unions, where the bitcast may not be the same type as the store). However,
equivalent logic to the store to introspect the type is missing. This patch
add this logic.
Note, when developing the patch I was unhappy with adding similar logic
directly to the load case as it could get out of step. Instead, I have
abstracted the "introspection" into a helper function, with the specifics
being handled by a passed-in lambda function.
Differential Revision: https://reviews.llvm.org/D60793
llvm-svn: 359205
Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc.
Since the importer allows us to automatically select this after legalization,
also add tests for selection etc. Also update arm64-vfloatintrinsics.ll.
llvm-svn: 359204
Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update
arm64-irtranslator.ll.
Differential Revision: https://reviews.llvm.org/D60922
llvm-svn: 359203
For eventually selecting llvm.nearbyint. Equivalent to the SelectionDAG
nearbyint node.
Update legalizer-info-validation.mir.
Differential Revision: https://reviews.llvm.org/D60921
llvm-svn: 359201
yaml2obj might crash on invalid input when unable to parse the YAML.
Recently a crash with a very similar nature was fixed for an empty files.
This patch revisits the fix and does it in yaml::Input instead.
It seems to be more correct way to handle such situation.
With that crash for invalid inputs is also fixed now.
Differential revision: https://reviews.llvm.org/D61059
llvm-svn: 359178
Truncate the movmsk scalar integer result to the equivalent scalar integer width as before but then bitcast to the requested type.
We still have the issue identified in PR41594 but D61114 should handle this.
llvm-svn: 359176
On Mips32r2 bitcast can be expanded to two sw instructions and an ldc1
when using bitcast i64 to double or an sdc1 and two lw instructions when
using bitcast double to i64. By introducing custom lowering that uses
mtc1/mthc1 we can avoid excessive instructions.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D61069
llvm-svn: 359171
This should fix the MachO/x86-64 eh-frame regression test by ensuring that
the symbols __ZTIi and ___gxx_personality_v0 are defined on all platforms.
llvm-svn: 359169
Summary:
When refactoring vectorization flags, vectorization was disabled by default in the new pass manager.
This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults.
Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.
Reviewers: chandlerc, jgorbe
Subscribers: jlebar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61091
llvm-svn: 359167
For well-known type IDs, include the name of the type.
To not duplicate the ID->name map, make llvm-readobj call this new
function as well. It has slightly different output, so this also
requires updating a few tests.
Differential Revision: https://reviews.llvm.org/D61086
llvm-svn: 359153
Summary:
This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug
info in codeview. Currently only changes FastISel, so emitting labels still
needs to be implemented in SelectionDAG.
Reviewers: rnk
Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D61083
llvm-svn: 359149