llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Sanjay Patel	068a0e3768	[CostModel] remove hack for intrinsic cost based on cost type This hack seems to only have been necessary because of the constructor bug noted in 33125cffd. Once again, it's hard to prove NFC, but that's the hope...	2020-09-28 15:58:42 -04:00
Baptiste Saleil	3dca9af6d2	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulator in their unprimed form and allow the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Sanjay Patel	1fffe5761d	[CostModel] fill in arguments as part of intrinsic attribute constructor This appears to be an error of code duplication - instead of one constructor variant calling another, we have N similar but not identical versions. I think this is 'NFC' based on the current callers, but it's hard to tell or guess the intent in all cases.	2020-09-28 15:27:45 -04:00
Jon Roelofs	9b0c93e6aa	[AArch64] reuse another map iterator. NFC	2020-09-28 11:30:21 -07:00
Amara Emerson	188cec631b	Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar." This reverts commit b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 as it seems to have broken a bot.	2020-09-28 11:25:19 -07:00
Dominic Chen	12ebbafac0	[AddressSanitizer] Copy type metadata to prevent miscompilation When ASan and e.g. Dead Virtual Function Elimination are enabled, the latter will rely on type metadata to determine if certain virtual calls can be removed. However, ASan currently does not copy type metadata, which can cause virtual function calls to be incorrectly removed. Differential Revision: https://reviews.llvm.org/D88368	2020-09-28 13:56:05 -04:00
Simon Pilgrim	60566b9591	[InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests	2020-09-28 18:53:38 +01:00
Heejin Ahn	88d1852eae	[WebAssembly] Use wasm::Signature for in ObjectWriter (NFC) There are two `WasmSignature` structs, one in include/llvm/BinaryFormat/Wasm.h and the other in lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this way in the first place, but it seems we can unify them to use the one in Wasm.h for all cases. Reviewed By: dschuff, sbc100 Differential Revision: https://reviews.llvm.org/D88428	2020-09-28 10:46:55 -07:00
Jessica Paquette	2dcd508332	[AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`). The register bank of a use isn't necessarily known when an instruction asks for this. Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when its destination bank is unknown. If any of them are FPR, assume the entire G_PHI will also be assigned a FPR. Since a phi can have many inputs, and those inputs can in turn be phis, restrict the search depth to a very low number. Also improve the docs for `hasFPConstraints` and friends a little. This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code size improvement at CTMark/pairlocalalign at -O3. Differential Revision: https://reviews.llvm.org/D88177	2020-09-28 10:37:09 -07:00
Sanjay Patel	1d7afc6eeb	[CostModel] move early exit for free intrinsics This should be NFC unless some target was expecting that some form of cttz/ctlz/memcpy is free in terms of size/latency but not free in throughput cost.	2020-09-28 13:30:55 -04:00
Sanjay Patel	2915eafc46	[CostModel] split handling of intrinsics from other calls This should be close to NFC (no-functional-change), but I can't completely rule out that some call on some target travels down a different path. There's an especially large amount of code spaghetti in this part of the cost model. The goal is to clean up the intrinsic cost handling so we can canonicalize to the new min/max intrinsics without causing regressions.	2020-09-28 13:30:55 -04:00
Jessica Paquette	7ee5835082	[AArch64][GlobalISel] Support shifted register form in emitTST Support emitting ANDSXrs and ANDSWrs in `emitTST`. Update opt-fold-compare.mir to show that it works. Differential Revision: https://reviews.llvm.org/D87530	2020-09-28 10:13:47 -07:00
Jessica Paquette	7a97485533	[GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y) When we see this: ``` %and = G_AND %x, %y %xor = G_XOR %and, %y ``` Produce this: ``` %not = G_XOR %x, -1 %new_and = G_AND %not, %y ``` as long as we are guaranteed to eliminate the original G_AND. Also matches all commuted forms. E.g. ``` %and = G_AND %y, %x %xor = G_XOR %y, %and ``` will be matched as well. Differential Revision: https://reviews.llvm.org/D88104	2020-09-28 10:08:14 -07:00
Simon Pilgrim	135593d1be	[InstCombine] Add basic trunc(shr(trunc(x),c)) tests Helps improve the minor regressions noticed on D88316	2020-09-28 18:00:28 +01:00
Jon Roelofs	95dada364f	[AArch64] Reuse map iterator instead of double lookup. NFC	2020-09-28 09:47:00 -07:00
Mikhail Maltsev	48709cc3db	[unittests] Preserve LD_LIBRARY_PATH in crash recovery test We need to preserve the LD_LIBRARY_PATH environment variable when spawning a child process (certain setups rely on non-standard paths for e.g. libstdc++). In order to achieve this, set LLVM_CRC_UNIXCRCRETURNCODE in the parent process instead of creating the child's environment from scratch. Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D88308	2020-09-28 17:46:03 +01:00
Jay Foad	f4f70547af	[AMDGPU] Reformat AMDGPUTargetLowering::isSDNodeAlwaysUniform. NFC.	2020-09-28 16:24:16 +01:00
Sam Parker	5a6cdf9c3b	[ARM][LowOverheadLoops] Cleanup and re-arrange Rename and reorganise how we decide where to put the LoopStart instruction.	2020-09-28 16:06:30 +01:00
Tres Popp	352980fbe9	[llvm] Fix unused variable in non-debug configurations	2020-09-28 17:04:08 +02:00
Meera Nakrani	54f9731add	[ARM] Added more patterns to generate SSAT/USAT with shift Added patterns to generate an SSAT or USAT with shift for SSAT/USAT instructions that are matched from IR patterns. Differential Revision: https://reviews.llvm.org/D88145	2020-09-28 14:50:19 +00:00
Cameron McInally	a43296f995	[SVE] Lower fixed length VECREDUCE_[UMAX|UMIN] to Scalable Essentially the same as the signed variants from D88259. Also includes a clean up of the lowering function. Differential Revision: https://reviews.llvm.org/D88317	2020-09-28 09:29:00 -05:00
Juneyoung Lee	31a4179ce0	[ValueTracking] Fix analyses to update CxtI to be phi's incoming edges' terminators It was mentioned in D88276 that when a phi node is visited, terminators at their incoming edges should be used for CtxI. This is a patch that makes two functions (ComputeNumSignBitsImpl, isGuaranteedNotToBeUndefOrPoison) do so. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D88360	2020-09-28 23:24:20 +09:00
Paul C. Anagnostopoulos	90d4bf8784	[TableGen] Improved messages in PseudoLoweringEmitter.	2020-09-28 10:18:22 -04:00
Simon Pilgrim	a9b5f69419	[InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895) Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.	2020-09-28 15:12:41 +01:00
Sam Parker	db9206f64f	[NFC][ARM] Factor out some logic for LoLoops. Create a DCE function that accepts an instruction.	2020-09-28 14:51:52 +01:00
Jay Foad	f7c8caa309	[AMDGPU] Reformat SITargetLowering::isSDNodeSourceOfDivergence. NFC.	2020-09-28 14:42:05 +01:00
Georgii Rymar	c050c97e5e	[llvm-readobj/elf] - Fix the PREL31 relocation computation used for dumping arm32 unwind info (-u). This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581. We have the following computation: ``` (1) uint64_t Location = Address & 0x7fffffff; (2) if (Location & 0x04000000) (3) Location |= (uint64_t) ~0x7fffffff; (4) return Location + Place; ``` At line 2 there is a mistype. The constant should be `0x40000000`, not `0x04000000`, because the intention here is to sign extend the `Location`, which is the 31 bit signed value. Differential revision: https://reviews.llvm.org/D88407	2020-09-28 16:22:56 +03:00
Sjoerd Meijer	f26e5a332c	[ARM][MVE] Enable tail-predication by default We have been running tests/benchmarks downstream with tail-predication enabled for some time now and this behaves as expected: we are not aware of any correctness issues, and this performs better across the board than with tail-predication disabled. Time to flip the switch! Differential Revision: https://reviews.llvm.org/D88093	2020-09-28 14:01:23 +01:00
Simon Pilgrim	a75755b2d9	[InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895) An extension to D87452, we can safely permit undefs in the uniform/splat detection https://alive2.llvm.org/ce/z/nT-ptN Differential Revision: https://reviews.llvm.org/D88402	2020-09-28 13:36:13 +01:00
Florian Hahn	08b000a04c	[SCEV] Also use info from assumes in applyLoopGuards. Similar to collecting information from branches guarding a loop, we can also collect information from assumes dominating the loop header. Fixes PR47247. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87854	2020-09-28 13:14:24 +01:00
Daniel Kiss	85d6c62b2c	[AArch64] Generate .note.gnu.property based on module flags. Flags of the module derived exclusively from the compiler flag `-mbranch-protection`. The note is generated based on the module flags accordingly. After this change in case of compile unit without function won't have the .note.gnu.property if the compiler flag is not present [1]. [1] https://bugs.llvm.org/show_bug.cgi?id=46480 Reviewed By: chill Differential Revision: https://reviews.llvm.org/D80791	2020-09-28 14:14:04 +02:00
Simon Pilgrim	1b71f5f90f	[X86] Flip isShuffleEquivalent argument order to match isTargetShuffleEquivalent A while ago, we converted isShuffleEquivalent/isTargetShuffleEquivalent to both use IsElementEquivalent internally. This allows us to make the shuffle args optional like isTargetShuffleEquivalent and update foldShuffleOfHorizOp to use isShuffleEquivalent (which it should as its using a ISD::VECTOR_SHUFFLE mask).	2020-09-28 12:53:56 +01:00
Simon Pilgrim	b93147ffac	[X86] Simplify broadcast mask detection with isUndefOrEqual helper. Add an additional isUndefOrEqual variant that matches an entire mask, not just a single value.	2020-09-28 12:53:56 +01:00
LLVM GN Syncbot	9643d50ff5	[gn build] Port 018066d9475	2020-09-28 11:38:04 +00:00
Qiu Chaofan	a3362da3ea	[PowerPC] Clean-up mayRaiseFPException bits According to POWER ISA, floating point instructions altering exception bits in FPSCR should be 'may raise FP exception'. (excluding those read or write the whole FPSCR directly, like mffs/mtfsf) We need to model FPSCR well in future patches to handle the special case properly. Instructions added mayRaiseFPException: - fre(s)/frsqrte(s) - fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s) - xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp - xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp - xsmaxcdp/xsincdp/xsmaxjdp/xsminjdp Instructions removed mayRaiseFPException: - xstdivdp/xvtdiv(d|s)p/xstsqrtdp/xvtsqrt(d|s)p - xsabsdp/xsnabsdp/xvabs(d|s)p/xvnabs(d|s)p - xsnegdp/xscpsgndp/xvneg(d|s)p/xvcpsgn(d|s)p - xvcvsxwdp/xvcvuxwdp - xscvdpspn/xscvspdpn Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87738	2020-09-28 18:22:12 +08:00
Jay Foad	3f3da716fa	[AMDGPU] Add bfi immediate pattern Differential Revision: https://reviews.llvm.org/D88246	2020-09-28 10:16:51 +01:00
Jay Foad	905b53ab6b	[AMDGPU] Make bfi patterns divergence-aware This tends to increase code size but more importantly it reduces vgpr usage, and could avoid costly readfirstlanes if the result needs to be in an sgpr. Differential Revision: https://reviews.llvm.org/D88245	2020-09-28 10:16:51 +01:00
Jay Foad	343d947d8d	[AMDGPU] Split R600 and GCN bfi patterns This is in preparation for making the GCN patterns divergence-aware. NFC. Differential Revision: https://reviews.llvm.org/D88244	2020-09-28 10:16:51 +01:00
Simon Pilgrim	4d69aec2be	[InstCombine] Add tests for vector rotate by constants with undefs.	2020-09-28 09:55:43 +01:00
Georgii Rymar	b76a0e7b80	[yaml2obj][obj2yaml] - Add a support for SHT_ARM_EXIDX section. This adds the support for SHT_ARM_EXIDX sections to obj2yaml/yaml2obj tools. SHT_ARM_EXIDX is a ARM specific index table filled with entries. Each entry consists of two 4-bytes values (words). (https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries) Differential revision: https://reviews.llvm.org/D88228	2020-09-28 11:45:49 +03:00
Georgii Rymar	e88d56b496	[obj2yaml][yaml2obj] - Stop recognizing SHT_MIPS_ABIFLAGS on non-MIPS targets. Currently we are always recognizing the `SHT_MIPS_ABIFLAGS` section, even on non-MIPS targets. The problem of doing this is briefly discussed in D88228 which does the same for `SHT_ARM_EXIDX`: "The problem is that `SHT_ARM_EXIDX` shares the value with `SHT_X86_64_UNWIND (0x70000001U)`. We might have other machine specific conflicts, e.g. `SHT_ARM_ATTRIBUTES` vs `SHT_MSP430_ATTRIBUTES` vs `SHT_RISCV_ATTRIBUTES (0x70000003U)`." I think we should only recognize target specific sections when the machine type matches. I.e. `SHT_MIPS_` should be recognized only on `MIPS`, `SHT_ARM_` only on `ARM` etc. This patch stops recognizing `SHT_MIPS_ABIFLAGS` on `non-MIPS` targets. Note: I had to update `ScalarEnumerationTraits<ELFYAML::MIPS_ISA>::enumeration`, because otherwise test crashes, calling `llvm_unreachable`. Differential revision: https://reviews.llvm.org/D88294	2020-09-28 11:28:53 +03:00
Benjamin Kramer	d83ca05fea	[Coroutines] Remove unused includes. NFC.	2020-09-28 10:27:23 +02:00
Sjoerd Meijer	72cf3e7d38	[ARM][MVE] tail-predication: overflow checks for elementcount, cont'd This is a reimplementation of the overflow checks for the elementcount, i.e. the 2nd argument of intrinsic get.active.lane.mask. The element count is lowered in each iteration of the tail-predicated loop, and we must prove that this expression doesn't overflow. Many thanks to Eli Friedman and Sam Parker for all their help with this work. Differential Revision: https://reviews.llvm.org/D88086	2020-09-28 09:20:51 +01:00
David Green	82ad955cee	[ARM] Expand cannotInsertWDLSTPBetween to the last instruction 9d9a11c7be037 added this check for predicatable instructions between the D/WLSTP and the loop's start, but it was missing the last instruction in the block. Change it to use some iterators instead. Differential Revision: https://reviews.llvm.org/D88354	2020-09-28 09:14:40 +01:00
Chuanqi Xu	5802e5931e	[Coroutines] Reuse storage for local variables with non-overlapping lifetimes bug 45566 shows the process of building coroutine frame won't consider that the lifetimes of different local variables are not overlapped, which means the compiler could generates smaller frame. This patch calculate the lifetime range of each alloca by StackLifetime class. Then the patch build non-overlapped sets for allocas whose lifetime ranges are not overlapped. We use the largest type in a non-overlapped set as the field type in the frame. In insertSpills process, if we find the type of field is not the same with the alloca, we cast the pointer to the field type to the pointer to the alloca type. Since the lifetime range of alloca in one non-overlapped set is not overlapped with each other, it should be ok to reuse the storage space in the frame. Test plan: check-llvm, check-clang, cppcoro, folly Reviewers: junparser, lxfind, modocache Differential Revision: https://reviews.llvm.org/D87596	2020-09-28 15:48:00 +08:00
David Sherwood	0927cfa9f6	[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy After some recent upstream discussion we decided that it was best to avoid having the / operator for both ElementCount and TypeSize, since this could give the impression that these classes can be used in the same way as basic integer types. However, division for scalable types is a bit odd because we are only dividing the minimum quantity by a value, as opposed to something like: (MinSize * Vscale) / SomeValue This is why when performing division it's important the caller first establishes whether the operation makes sense, perhaps by calling isKnownMultipleOf() prior to division. The caller must now explicitly call divideCoefficientBy() on the class to perform the operation. Differential Revision: https://reviews.llvm.org/D87700	2020-09-28 08:03:00 +01:00
Kai Luo	e87c3b7d7a	[PowerPC] Add tests for `select` patterns. NFC.	2020-09-28 06:11:40 +00:00
Arthur Eubanks	546a7d793e	Revert "Reland [CodeGen] emit CG profile for COFF object file" This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac. This still causes link errors, see https://crbug.com/1130780.	2020-09-27 22:43:14 -07:00
Max Kazantsev	6281760e76	[Test] Add tests where we can replace condition with invariants	2020-09-28 12:04:20 +07:00
Dávid Bolvanský	1c26e35afd	[BuildLibCalls] Add noalias for strcat and stpcpy strcat: destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/) stpcpy: The strings may not overlap, and the destination string dest must be large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D88335	2020-09-27 21:37:09 +02:00
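
The PREL31 fix in c050c97e5e can be illustrated with a small standalone sketch. This is not the llvm-readobj source: `decodePREL31` is a hypothetical helper name, and only the four-line computation quoted in the commit message is taken from the log. A PREL31 value is a 31-bit signed offset, so its sign bit is bit 30 (`0x40000000`). Testing bit 26 (`0x04000000`) instead would sign-extend some positive offsets and leave some negative ones unextended.

```cpp
#include <cstdint>

// Hypothetical helper (name assumed for illustration) that mirrors the
// computation quoted in the commit message, with the corrected constant.
constexpr uint64_t decodePREL31(uint32_t Address, uint64_t Place) {
  uint64_t Location = Address & 0x7fffffff; // keep the low 31 bits
  if (Location & 0x40000000)                // bit 30 is the sign bit
    Location |= (uint64_t) ~0x7fffffff;     // sign-extend to 64 bits
  return Location + Place;                  // offset is relative to Place
}

// A negative offset (-4) resolves to an address just below Place.
static_assert(decodePREL31(0x7ffffffcu, 0x1000) == 0xffc, "negative offset");
// A positive offset with bit 26 set must not be sign-extended.
static_assert(decodePREL31(0x04000000u, 0) == 0x04000000, "positive offset");
```

The second `static_assert` is the case the old constant would have got wrong: bit 26 is set but bit 30 is not, so the value is a plain positive offset.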

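The GlobalISel combine in 7a97485533 rewrites `(xor (and x, y), y)` to `(and (not x), y)`. The bitwise identity behind it can be checked exhaustively for a small integer width. The sketch below does that on plain 8-bit integers; it is only an illustration of the identity, not the combiner code, and the function names are made up for this example.

```cpp
#include <cstdint>

// Original form from the commit message: %xor = G_XOR (G_AND %x, %y), %y
uint8_t xorOfAnd(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((X & Y) ^ Y);
}

// Rewritten form: %new_and = G_AND (G_XOR %x, -1), %y
uint8_t andOfNot(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>(~X & Y);
}

// Compares both forms for every pair of 8-bit inputs.
bool formsAgreeForAllInputs() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      if (xorOfAnd(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)) !=
          andOfNot(static_cast<uint8_t>(X), static_cast<uint8_t>(Y)))
        return false;
  return true;
}
```

Per bit, where `y` is 0 both forms yield 0, and where `y` is 1 both yield the inverse of `x`, which is why the rewrite holds for any width.
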
204297 Commits