llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Fraser Cormack	be68e4ef95	[RISCV] Expand unaligned fixed-length vector memory accesses RVV vectors must be aligned to their element types, so anything less is unaligned. For regular loads and stores, our custom-lowering of fixed-length vectors meant that we opted out of LegalizeDAG's built-in unaligned expansion. This patch adds that logic in to our custom lower function. For masked intrinsics, we declare that anything unaligned is not legal, leaving the ScalarizeMaskedMemIntrin pass to do the expansion for us. Note that neither of these methods can handle the expansion of scalable-vector memory ops, so those cases are left alone by this patch. Scalable loads and stores already go through expansion by default but hit an assertion, and scalable masked intrinsics will silently generate incorrect code. It may be prudent to return an error in both of these cases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102493	2021-06-02 09:27:44 +01:00
Daniil Fukalov	ce7972786d	[NFC] Fix 'Load' name masking. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D103456	2021-06-02 11:09:53 +03:00
Sriraman Tallam	f1e4168f92	Resubmit D85085 after fixing the tests that were failing. D85085 was pushed earlier but broke tests on mac and win: http://lab.llvm.org:8080/green/job/clang-stage1-RA/21182/consoleFull#-706149783d489585b-5106-414a-ac11-3ff90657619c Recommitting it after adding mtriple to the llc commands. Emit correct location lists with basic block sections. This patch addresses multiple things: 1) It ensures that const_value is emitted when possible with basic block sections. 2) It emits location lists such that the labels are always within the section boundary. 3) It fixes a bug when the parameter is first used in a non-entry block which is in a different section from the entry block. Differential Revision: https://reviews.llvm.org/D85085	2021-06-01 21:59:47 -07:00
Amy Huang	c06376acba	Revert "Fix tmp files being left on Windows builds." for now; causing some asan test failures. This reverts commit 7daa18215905c831e130c7542f17619e9d936dfc.	2021-06-01 19:51:47 -07:00
Craig Topper	32b14b7a99	[RISCV] Improve register allocation for masked vwadd(u).wv, vwsub(u).wv, vfwadd.wv, and vfwsub.wv. The first source has the same EEW as the destination, but we're using earlyclobber which prevents them from ever being the same register. To work around this, add a special TIED pseudo to use whenever the first source and merge operand are the same value. This allows us to use a single operand for the merge operand and first source which we can then tie to the destination. A tied source disables earlyclobber for that operand. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D103211	2021-06-01 18:59:00 -07:00
LLVM GN Syncbot	5680b05e63	[gn build] Port 924ea3bb53ca	2021-06-02 01:47:33 +00:00
Rahman Lavaee	2769abb764	[llvm-readobj] Print function names with `--bb-addr-map`. This patch uses the `getSymbolIndexForFunctionAddress` helper function to print function names for BB address map entries. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102900	2021-06-01 18:40:42 -07:00
Ben Shi	8ef9e4e535	[RISCV][test] Add new tests of bitwise and with constant for the Zbs extension These tests will show how (and r i) will be optimized to (BCLRI (BCLRI r, i0), i1) or (BCLRI (ANDI r, i0), i1) by future commits. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103359	2021-06-02 09:10:21 +08:00
Xiang1 Zhang	2a81a230ca	Remove x86 test amx-fast-tile-config.mir (by its author) This test contains a lot of manual changes which is not convenient to update, and the checks are duplicated with test amx-configO2toO0.ll	2021-06-02 08:29:36 +08:00
Amy Huang	c579427605	Fix tmp files being left on Windows builds. Clang writes object files by first writing to a .tmp file and then renaming to the final .obj name. On Windows, if a compile is killed partway through the .tmp files don't get deleted. Currently it seems like RemoveFileOnSignal takes care of deleting the tmp files on Linux, but on Windows we need to call setDeleteDisposition on tmp files so that they are deleted when closed. This patch switches to using TempFile to create the .tmp files we write when creating object files, since it uses setDeleteDisposition on Windows. This change applies to both Linux and Windows for consistency. Differential Revision: https://reviews.llvm.org/D102876	2021-06-01 17:09:08 -07:00
Stanislav Mekhanoshin	82e56dec58	[AMDGPU] All GWS instructions need aligned VGPR on gfx90a Fixes: SWDEV-288006 Differential Revision: https://reviews.llvm.org/D103197	2021-06-01 17:08:03 -07:00
Arthur Eubanks	2a26a5c713	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429	2021-06-01 16:52:32 -07:00
Arthur Eubanks	3b6a5ff6b4	[InstSimplify] Treat invariant group insts as bitcasts for load operands We can look through invariant group intrinsics for the purposes of simplifying the result of a load. Since intrinsics can't be constants, but we also don't want to completely rewrite load constant folding, we convert the load operand to a constant. For GEPs and bitcasts we just treat them as constants. For invariant group intrinsics, we treat them as a bitcast. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D101103	2021-06-01 16:33:06 -07:00
Arthur Eubanks	15110c5b3d	[test] Precommit test for D101103	2021-06-01 16:31:02 -07:00
Michael Benfield	f8d5955717	[various] Remove or use variables which are unused but set. This is in preparation for the -Wunused-but-set-variable warning. Differential Revision: https://reviews.llvm.org/D102942	2021-06-01 15:38:48 -07:00
LLVM GN Syncbot	f55aa7710b	[gn build] Port 065cf3f9d703	2021-06-01 21:08:31 +00:00
Daniel Sanders	8c8a0a3167	fixup: Missing operator in [globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one My local compiler was fine with it but the bots complain about ambiguous types.	2021-06-01 13:58:03 -07:00
LLVM GN Syncbot	ae74a4e8b3	[gn build] Port aaac268285ff	2021-06-01 20:28:25 +00:00
Daniel Sanders	71a22fb7f8	[globalisel][legalizer] Separate the deprecated LegalizerInfo from the current one It's still in use in a few places so we can't delete it yet but there's not many at this point. Differential Revision: https://reviews.llvm.org/D103352	2021-06-01 13:23:48 -07:00
Stephen Neuendorffer	b0a805b25d	Convert TableGen assert to error This gives a nice message about the location of errors in a large tablegen file, which is much more useful for users Differential Revision: https://reviews.llvm.org/D102740	2021-06-01 13:17:58 -07:00
Arthur Eubanks	c5cd1cf901	[NFC][OpaquePtr] Explicitly pass GEP source type to IRBuilder in more places	2021-06-01 13:13:37 -07:00
Andrew Kelley	08aca8b420	WindowsSupport.h: do not depend on private config header WindowsSupport.h is a public header, however if it gets included, will cause a compile error indicating that llvm/Config/config.h cannot be found, because config.h is a private header. However there is no actual dependency on the private things in this header, so it can be changed to the public config header. Reviewed By: amccarth Differential Revision: https://reviews.llvm.org/D103370	2021-06-01 23:05:03 +03:00
Sanjay Patel	6012aaaacf	[InstCombine] add tests for cast folding; NFC https://llvm.org/PR49543	2021-06-01 16:03:24 -04:00
Anirudh Prasad	2991e3d38e	[SystemZ][z/OS] Stricter condition for HLASM class instantiation - A lot of lit tests simply specify the arch minus the triple. On z/OS, this could result in a scenario of some-other-triple-unknown-ibm-zos. This points to an incorrect triple + arch combo. - To prevent this, isOSzOS change is switched in favour of isOSBinFormatGOFF. - This is because, the GOFF format is set only if the triple is systemz and if the operating system is GOFF. And currently, there are no other architectures/os's using the GOFF file format. - An argument could be made that the problematic tests be fixed to explicitly specify the arch-vendor-triple string, but there's a large number of these tests, and adding this stricter scope ensures that we aren't instantiating the incorrect instance of the AsmParser for other platforms when run on z/OS. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D103343	2021-06-01 15:56:50 -04:00
LLVM GN Syncbot	92d9ca26bc	[gn build] Port 5671ff20d92b	2021-06-01 19:37:29 +00:00
madhur13490	8a89499a33	[AMDGPU][NFC] Remove author's name from codebase This must have made it into the code by accident. Differential Revision: https://reviews.llvm.org/D103484	2021-06-02 00:51:48 +05:30
Harald van Dijk	fb6500dfb1	[SLPVectorizer] Ignore unreachable blocks As the existing test unreachable.ll shows, we should be doing more work to avoid entering unreachable blocks: we should not stop vectorization just because a PHI incoming value from an unreachable block cannot be vectorized. We know that particular value will never be used so we can just replace it with poison.	2021-06-01 20:21:04 +01:00
Jessica Paquette	852a8449e7	[GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width Also add a target hook which allows us to get around custom legalization on AArch64. Differential Revision: https://reviews.llvm.org/D99283	2021-06-01 10:56:17 -07:00
Guozhi Wei	b2dfe60e88	[X86FixupLEAs] Transform the sequence LEA/SUB to SUB/SUB This patch transforms the sequence lea (reg1, reg2), reg3 sub reg3, reg4 to two sub instructions sub reg1, reg4 sub reg2, reg4 Similar optimization can also be applied to LEA/ADD sequence. The modifications to TwoAddressInstructionPass is to ensure the operands of ADD instruction has expected order (the dest register of LEA should be src register of ADD). Differential Revision: https://reviews.llvm.org/D101970	2021-06-01 10:31:30 -07:00
Jonas Paulsson	5e3b55a7dc	[SystemZ] Return true from hasBitPreservingFPLogic(). This is currently NFC on benchmarks and tests. Review: Ulrich Weigand	2021-06-01 11:52:50 -05:00
Eli Friedman	a4632c066a	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
Nikita Popov	0d55b59b6a	[ADT] Move DenseMapInfo for APInt into APInt.h (PR50527) As suggested in https://bugs.llvm.org/show_bug.cgi?id=50527, this moves the DenseMapInfo for APInt and APSInt into the respective headers, removing the need to include APInt.h and APSInt.h from DenseMapInfo.h. We could probably do the same from StringRef and ArrayRef as well. Differential Revision: https://reviews.llvm.org/D103422	2021-06-01 18:31:41 +02:00
Craig Topper	ec4f4175eb	[RISCV] Remove earlyclobber from vnsrl/vnsra/vnclip(u) when the source and dest are a single vector register. This guarantees they meet this overlap exception: "The destination EEW is smaller than the source EEW and the overlap is in the lowest-numbered part of the source register group" Being a single register guarantees the overlap is always in the lowest-numbered part of the group. Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D103351	2021-06-01 09:17:52 -07:00
Craig Topper	7b6a74df3c	[RISCV] Remove earlyclobber from compares with LMUL<=1. Compares are considered a narrowing operation for register overlap. I believe for LMUL<=1 they meet this exception to allow overlap "The destination EEW is smaller than the source EEW and the overlap is in the lowest-numbered part of the source register group" Both the result and the sources will occupy a single register for LMUL<=1 so the overlap would always be in the "lowest-numbered part". Reviewed By: frasercrmck, HsiangKai Differential Revision: https://reviews.llvm.org/D103336	2021-06-01 09:08:11 -07:00
Sanjay Patel	28aa6ad4db	[x86] add test for sext-of-setcc; NFC	2021-06-01 11:12:52 -04:00
Xun Li	1825d50d24	Simplify coro-zero-alloca.ll D101841 added this test. It appears to generate different outcomes on different platforms. Make it call only -coro-split instead of the entire O2 pipeline to simplify the test flow. Hopefully this will make the test more robust. Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D103418	2021-06-01 08:12:35 -07:00
Alexey Bataev	8c3d0ae3df	[SLP]Better detection of perfect/shuffles matches for gather nodes. Implemented better scheme for perfect/shuffled matches of the gather nodes which allows to fix the performance regressions introduced by earlier patches. Starting detecting matches for broadcast nodes and extractelement gathering. Differential Revision: https://reviews.llvm.org/D102920	2021-06-01 07:08:07 -07:00
gbreynoo	081a4bd35e	[llvm-dwarfdump][test] Add missing dedicated tests for some options This change adds tests specifically for --parent-recurse-depth, --quiet and -o. The test for -o found a typo in an error message which is also fixed in this change. Differential Revision: https://reviews.llvm.org/D103250	2021-06-01 14:57:00 +01:00
Daniil Seredkin	764783428b	[InstCombine] Relax constraints of uses for exp(X) * exp(Y) -> exp(X + Y) InstCombine didn't perform the transformations when fmul's operands were the same instruction because it required to have one use for each of them which is false in the case. This patch fixes this + adds tests for them and introduces a new function isOnlyUserOfAnyOperand to check these cases in a single place. This patch is a result of discussion in D102574. Differential Revision: https://reviews.llvm.org/D102698	2021-06-01 08:33:23 -04:00
Florian Hahn	f40584cef6	[LoopDeletion] Consider infinite loops alive, unless mustprogress. The current loop or any of its sub-loops may be infinite. Unless the function or the loops are marked as mustprogress, this in itself makes the loop not dead. This patch moves the logic to check whether the current loop is finite or mustprogress to `isLoopDead` and also extends it to check the sub-loops. This should fix PR50511. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D103382	2021-06-01 13:07:36 +01:00
Sanjay Patel	5888c31732	[SDAG] add helper function for sext-of-setcc folds; NFC Try to make this easier to read as noted in D103280	2021-06-01 08:07:17 -04:00
Florian Hahn	00bbac35a4	[VectorCombine] Freeze index unless it is known to be non-poison. If the index itself is already poison, the poison propagates through instructions clamping the index to a valid range. This still causes introducing a load of poison, as flagged by Alive2 and pointed out at 575e2aff5574. This patch updates the code to freeze the index, unless it is proven to not be poison. Reviewed By: nlopes Differential Revision: https://reviews.llvm.org/D103378	2021-06-01 10:40:57 +01:00
Fraser Cormack	9fec15e2c5	[RISCV] Support vector types in combination with fastcc This patch extends the RISC-V lowering of the 'fastcc' calling convention to vector types, both fixed-length and scalable. Without this patch, any function passing or returning vector types by value would throw a compiler error. Vectors are handled in 'fastcc' much as they are in the default calling convention, the noticeable difference being the extended set of scalar GPR registers that can be used to pass vectors indirectly. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102505	2021-06-01 10:31:18 +01:00
Andy Wingo	a2b88794ad	[WebAssembly][CodeGen] IR support for WebAssembly local variables This patch adds TargetStackID::WasmLocal. This stack holds locations of values that are only addressable by name -- not via a pointer to memory. For the WebAssembly target, these objects are lowered to WebAssembly local variables, which are managed by the WebAssembly run-time and are not addressable by linear memory. For the WebAssembly target IR indicates that an AllocaInst should be put on TargetStackID::WasmLocal by putting it in the non-integral address space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift these allocations to SSA locals, but any alloca that reaches instruction selection (usually in non-optimized builds) will be assigned the new TargetStackID there. Loads and stores to those values are transformed to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes, which then lower to the type-specific LOCAL_GET_I32 etc instructions via tablegen patterns. Differential Revision: https://reviews.llvm.org/D101140	2021-06-01 11:31:39 +02:00
Florian Hahn	49cafe1d7b	[VectorCombine] Add tests with multiple noundef indices for scalarization.	2021-06-01 10:17:50 +01:00
Douglas Yung	1ab0dec8b7	Mark test as requiring asserts.	2021-06-01 02:01:01 -07:00
Roman Lebedev	0573b00888	[X86] AMD Zen 3 has fast variable per-lane shuffles ... but lane-crossing shuffles are slow.	2021-06-01 10:46:05 +03:00
Roman Lebedev	19a9e819da	[X86] Split FeatureFastVariableShuffle tuning into Lane-Crossing and Per-Lane variants Currently, X86 backend only has a global one-size-fits-all `FeatureFastVariableShuffle` feature, which controls profitability of both the cross-lane and per-lane variable shuffles. I guess, this has been fine so far. But at least on AMD Zen 3, while per-lane variable shuffles (e.g. `VPSHUFB`) are as fast as shuffles with a fixed/immediate mask, lane-crossing shuffles (e.g. `VPERMPS`) perform worse. So to get the benefits of variable-mask shuffles, but not the drawbacks of lane-crossing shuffles, as suggested by @RKSimon, split the feature flag into two. Differential Revision: https://reviews.llvm.org/D103274	2021-06-01 10:39:36 +03:00
Martin Storsjö	89b11af641	[libcxx] [test] Fix the _supportsVerify check on Windows by fixing quoting The pipes.quote function quotes using single quotes, the same goes for the newer shlex.quote (which is the preferred form in Python 3). This isn't suitable for quoting in command lines on Windows (and the documentation for shlex.quote even says it's only usable for Unix shells). In general, the python subprocess.list2cmdline function should do proper quoting for the platform's current shell. However, it doesn't quote the ';' char, which we pass within some arguments to run.py. Therefore use the custom reimplementation from lit.TestRunner which is amended to quote ';' too. The fact that arguments were quoted with single quotes didn't matter for command lines that were executed by either bash or the lit internal shell, but if executing things directly using subprocess.call, as in _supportsVerify, the quoted path to %{cxx} fails to be resolved by the Windows shell. This unlocks 114 tests that previously were skipped on Windows. Differential Revision: https://reviews.llvm.org/D103310	2021-06-01 09:51:41 +03:00
Serge Pavlov	fbabf77db8	[PowerPC] Split tests for constrained intrinsics The test CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll checks code generation for constrained floating point intrinsics. Many test cases in it were implemented using operations on constants. Constant folding of constrained intrinsics would make these test cases almost useless, because they would check only constant loading. To keep the tests useful, operations on constants were replaced with operations on function parameters. Differential Revision: https://reviews.llvm.org/D103259	2021-06-01 12:30:17 +07:00
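
The quoting behaviour described in commit 89b11af641 above can be reproduced with the Python standard library alone. The sketch below is illustrative only: `quote_windows_arg` is a hypothetical helper written for this example and is not the reimplementation in lit.TestRunner that the commit refers to. It shows that `shlex.quote` wraps a path in single quotes, that `subprocess.list2cmdline` uses double quotes but leaves `;` unquoted, and one possible way to force quoting when `;` is present.

```python
import shlex
import subprocess

# A path with a space, as a compiler path on Windows might have (example value).
path = r"C:\Program Files\clang++.exe"

# shlex.quote targets Unix shells and wraps the argument in single quotes.
unix_quoted = shlex.quote(path)

# list2cmdline follows the MS C runtime rules and uses double quotes instead.
win_quoted = subprocess.list2cmdline([path])

# list2cmdline only adds quotes for spaces, tabs, or empty arguments,
# so an argument containing ';' is passed through unquoted.
semicolon_arg = subprocess.list2cmdline(["a;b"])


def quote_windows_arg(arg):
    """Hypothetical helper: quote like list2cmdline, but also quote ';'."""
    quoted = subprocess.list2cmdline([arg])
    if ";" in arg and not quoted.startswith('"'):
        # Double any trailing backslashes so they cannot escape the
        # closing double quote that is added below.
        stripped = quoted.rstrip("\\")
        trailing = len(quoted) - len(stripped)
        quoted = '"' + stripped + "\\" * (2 * trailing) + '"'
    return quoted


print(unix_quoted)
print(win_quoted)
print(semicolon_arg)
print(quote_windows_arg("a;b"))
```

The single-quoted form is the one that, per the commit message, fails to resolve when passed directly to the Windows shell through subprocess.call.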


216632 Commits