llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00

Author	SHA1	Message	Date
madhur13490	2779613cc9	[GlobalISel] Don't skip adding predicate matcher This patch fixes a bug which skipped adding predicate matcher for a pattern in many cases. For example, if predicate is Load and its memoryVT is non-null then the loop continues and never reaches to the end which adds the predicate matcher. This patch moves the matcher addition to the top of the loop so that it gets added regardless of contextual checks later in the loop. Other way to fix this issue is to remove all "continue" statements in checks and let the loop continue till end. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83034	2020-08-19 07:54:14 +00:00
Florian Hahn	1da65b5ff0	[DSE,MemorySSA] Use NumRedundantStores instead of NumNoopStores. Legacy DSE uses NumRedundantStores, while MemorySSA DSE uses NumNoopStores. We should just use the same counter.	2020-08-19 08:50:33 +01:00
Ronak Chauhan	4697f34ed6	Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors" This reverts commit cacfb02d28a3cabd4e45d2535cb0686cef48a2c9. Reverting due to buildbot failures.	2020-08-19 13:12:29 +05:30
David Sherwood	f7a1832d69	[SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorLoads In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only ever deals with fixed width types, hence the offsets for each individual store never take 'vscale' into account. I've changed the code in that function to use TypeSize instead of unsigned for tracking the remaining load amount. In addition, I've changed the load loop to use the new IncrementPointer helper function for updating the addresses in each iteration, since this handles scalable vector types. Also, I've added report_fatal_errors in GenWidenVectorExtLoads, TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores, since these functions currently use a sequence of element-by-element scalar loads/stores. In a similar vein, I've also added a fatal error report in FindMemType for the case when we decide to return the element type for a scalable vector type. I've added new tests in CodeGen/AArch64/sve-split-load.ll CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll for the changes in GenWidenVectorLoads. Differential Revision: https://reviews.llvm.org/D85909	2020-08-19 07:54:32 +01:00
Shinji Okumura	540752542a	[Attributor][NFC] Add tests to range.ll Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86128	2020-08-19 15:01:14 +09:00
LLVM GN Syncbot	4e0cb4ff28	[gn build] Port 7546b29e761	2020-08-19 03:44:19 +00:00
Yaxun (Sam) Liu	6660be7005	[HIP] Support target id by --offload-arch This patch introduces support of target id by -offload-arch. Differential Revision: https://reviews.llvm.org/D60620	2020-08-18 23:43:53 -04:00
Ronak Chauhan	142f4dd209	[AMDGPU] Support disassembly for AMDGPU kernel descriptors Decode AMDGPU Kernel descriptors as assembler directives. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D80713	2020-08-19 08:49:07 +05:30
Changpeng Fang	c3904f6ffc	AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc Summary: When the resource descriptor is of vgpr, we need a waterfall loop to read into a sgpr. In this patch, we generalized the implementation to work for any register class sizes, and extended the work to MIMG instructions. Fixes: SWDEV-223405 Reviewers: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D82603	2020-08-18 16:27:36 -07:00
Chuanqi Xu	3188b05ed0	[NFC][StackSafety] Test that StackLifetime looks through stripPointerCasts StackLifetime class collects lifetime marker of an `alloca` by collect the user of `BitCast` who is the user of the `alloca`. However, either the `alloca` itself could be used with the lifetime marker or the `BitCast` of the `alloca` could be transformed to other instructions. (e.g., it may be transformed to all zero reps in `InstCombine` pass). This patch tries to fix this process in `collectMarkers` functions. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85399	2020-08-18 16:21:00 -07:00
Elliott Hughes	8e3a33cacc	ld128 demangle: allow space for 'L' suffix. Summary: Caught by HWASAN on arm64 Android (which uses ld128 for long double). This was running the existing fuzzer. The specific minimized fuzz input to reproduce this is: __cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0); Reviewers: eugenis, srhines, #libc_abi! Subscribers: kristof.beyls, danielkiss, libcxx-commits Tags: #libc_abi Differential Revision: https://reviews.llvm.org/D77924	2020-08-18 16:14:05 -07:00
Roman Lebedev	3f4579ac3a	[NFC][InstCombine] Aggregate reconstruction: use plain map Now that we no longer require for this map to have stable iteration order, we no longer need to pay for keeping the iteration order stable, so switch from `SmallMapVector` to `SmallDenseMap`.	2020-08-19 01:09:25 +03:00
Roman Lebedev	2083389218	[InstCombine] PHI-aware aggregate reconstruction: properly handle duplicate predecessors While it may seem like we can just "deduplicate" the case where some basic block happens to be a predecessor more than once, which happens for e.g. switches, that is not correct thing to do. We must actually add a PHI operand for each predecessor. This was initially reported to me by David Major as a clang crash during gecko build for android.	2020-08-19 01:00:42 +03:00
Amara Emerson	2018ade545	Use std::make_tuple instead of initializer lists to make a bot happy: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux	2020-08-18 14:55:52 -07:00
Craig Topper	5408cc13ce	[X86] Fix the Predicates on MMX_PSHUFWri/PSHUFWmi to include SSE1 in addition to MMX. These instructions weren't in the initial version of MMX, but were added when SSE1 was introduced. We already have the intrinsic named correctly to include sse and the frontend header enforces sse. We have one place in the backend where we DAG combine to this intrinsic, but that's also qualified. So don't know of anything currently broken unless someone writes their own IR and doesn't set the sse feature.	2020-08-18 14:28:26 -07:00
David Blaikie	c9c4b5ec13	Recommit "PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units" Originally committed as be3ef93bf58aa5546c7baadfb21d43b75fbb4e24. Reverted by b4bffdbadfcceb3959aaf231c1542301944e5812 due to bot failures: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/17380/testReport/junit/LLVM/DebugInfo_X86/addr_tu_to_non_tu_ll/ http://45.33.8.238/win/22216/step_11.txt MacOS failure due to testing Split DWARF which isn't compatible with MachO. Windows failure due to testing type units which aren't enabled on Windows. Fix both of these by applying an explicit x86 linux triple to the test.	2020-08-18 13:43:28 -07:00
Sanjay Patel	30bd733531	[VectorCombine] add tests for vector loads; NFC	2020-08-18 16:23:33 -04:00
Eli Friedman	610e3a8c86	[AArch64][SVE] Add patterns for integer mla/mls. We probably want to introduce pseudo-instructions at some point, like we have for binary operations, but this seems okay for now. One thing I'm not sure about is whether we should be doing this as a DAGCombine instead of directly pattern-matching it. I don't see any big downside to doing it this way, though. Differential Revision: https://reviews.llvm.org/D85681	2020-08-18 12:51:16 -07:00
Eli Friedman	2be753c211	[AArch64][SVE] Allow llvm.aarch64.sve.st2/3/4 with vectors of pointers. This isn't necessary for ACLE, but could be useful in other situations. And the change is simple. Differential Revision: https://reviews.llvm.org/D85251	2020-08-18 12:51:16 -07:00
Jessica Paquette	67ae683e5b	[GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList It's annoying to have to maintain multiple, nearly identical chains of if statements which all set the same attributes. Add a helper function, `addFlagsUsingAttrFn` which performs the attribute setting. Then, use wrappers for that function in `lowerCall` and `setArgFlags`. (Note that the flag-setting code in `setArgFlags` was missing the returned attribute. There's no selection for this yet, so no test. It's an example of the kind of thing this lets us avoid, though.) Differential Revision: https://reviews.llvm.org/D86159	2020-08-18 11:07:33 -07:00
Jessica Paquette	761fea8dc0	[GlobalISel][CallLowering] Don't tail call with non-forwarded explicit sret Similar to this commit: faf8065a99817bcb10e6f09b558fe3e0972c35ce Testcase is pretty much the same as test/CodeGen/AArch64/tailcall-explicit-sret.ll Except it uses i64 (since we don't handle the i1024 return values yet), and doesn't have indirect tail call testcases (because we can't translate those yet). Differential Revision: https://reviews.llvm.org/D86148	2020-08-18 11:06:57 -07:00
Matt Arsenault	418515b7d0	GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT Add unit tests since AMDGPU will only trigger this for gigantic vectors, and won't use the annoying odd sized breakdown case.	2020-08-18 13:51:19 -04:00
David Blaikie	01ab206194	[WIP][DebugInfo] Lazily parse debug_loclist offsets Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing down memory usage of symbolizers that don't need to parse this data at all. There's not much benefit to caching these anyway - since they are O(1) lookup and reading once you know where the offset list starts (and can do bounds checking with the offset list size too). In general, I think it might be time to start paying down some of the technical debt of loc/loclist/range/rnglist parsing to try to unify it a bit more. eg: * Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection, LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if these were all wrapped up in two variables - one for loclists, one for rnglists) * rnglists and loclists are handled differently (see: LoclistTableHeader, but no RnglistTableHeader) * maybe all these types could be less stateful - lazily parse what they need to, even reparsing rather than caching because it doesn't seem too expensive, for instance. (though admittedly so long as it's constant cost/overhead per compilation that's probably adequate) * Maybe implementing and using a DWARFDataExtractor that can be sub-ranged (so we could slice it up to just the single contribution) - though maybe that's not so useful because loc/ranges need to refer to it by absolute, not contribution-relative mechanisms Differential Revision: https://reviews.llvm.org/D86110	2020-08-18 10:49:39 -07:00
Amara Emerson	f6bce1ffcd	[GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x This is restricted to single use loads, which if we fold to sextloads we can find more optimal addressing modes on AArch64. This also fixes an overload the MachineFunction::getMachineMemOperand() method which was incorrectly using the MF alignment instead of the MMO alignment. Differential Revision: https://reviews.llvm.org/D85966	2020-08-18 10:42:15 -07:00
Amara Emerson	d1d273ff1c	[GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c' By detecting this sign extend pattern early, we can uncover opportunities for more optimizations. Differential Revision: https://reviews.llvm.org/D85965	2020-08-18 10:42:15 -07:00
Arthur Eubanks	77e79ccb03	[gn build] Add support for expensive checks Reviewed By: hans, MaskRay Differential Revision: https://reviews.llvm.org/D86007	2020-08-18 09:53:39 -07:00
Simon Pilgrim	54169f4f54	[X86][AVX] lowerShuffleWithVPMOV - add non-VLX support. We can efficiently handle non-VLX cases now that we have the getAVX512TruncNode helper.	2020-08-18 17:51:14 +01:00
Fangrui Song	7e9b7402f0	[ARM] Fix build after D86087	2020-08-18 09:20:32 -07:00
David Green	f2589e2340	[ARM] Allow tail predication of VLDn VLD2/4 instructions cannot be predicated, so we cannot tail predicate them from autovec. From intrinsics though, they should be valid as they will just end up loading extra values into off vector lanes, not affecting the on lanes. The same is true for loads in general where so long as we are not using the other vector lanes, an unpredicated load can be converted to a predicated one. This marks VLD2 and VLD4 instructions as validForTailPredication and allows any unpredicated load in tail predication loop, which seems to be valid given the other checks we have. Differential Revision: https://reviews.llvm.org/D86022	2020-08-18 17:15:45 +01:00
Sam Tebbs	928822abd7	[ARM] Use mov operand if the mov cannot be moved while tail predicating There are some cases where the instruction that sets up the iteration count for a tail predicated loop cannot be moved before the dlstp, stopping tail predication entirely. This patch checks if the mov operand can be used and if so, uses that instead. Differential Revision: https://reviews.llvm.org/D86087	2020-08-18 17:10:29 +01:00
Fangrui Song	af8011f23e	[llvm-dwarfdump][test] Add a --statistics test for a DW_AT_artificial variable There is an untested but useful case: `this` (even if not written) is counted as a source variable. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D86044	2020-08-18 09:08:38 -07:00
Jamie Schmeiser	831953d6d2	[NFC] Add raw_ostream parameter to printIR routines This is a non-functional-change to generalize the printIR routines so that the output can be saved and manipulated rather than being directly output to dbgs(). This is a prerequisite change for many upcoming changes that allow new ways of examining changes made to the IR in the new pass manager. Reviewed By: aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D85999	2020-08-18 16:05:27 +00:00
Jessica Paquette	7e08e6c7a3	[GlobalISel][CallLowering] Look through call parameters for flags We weren't looking through the parameters on calls at all. E.g., say you had ``` declare i32 @zext(i32 zeroext %x) ... %y = call i32 @zext(i32 %something) ... ``` At the point of the call, we wouldn't know that the %something should have the zeroext attribute. This sets flags in about the same way as TargetLoweringBase::ArgListEntry::setAttributes. Differential Revision: https://reviews.llvm.org/D86125	2020-08-18 08:48:56 -07:00
jasonliu	26926d02d9	[XCOFF] emit .rename for .lcomm when necessary Summary: This is a follow up for D82481. For .lcomm directive, although it's not necessary to have .rename emitted, it's still desirable to do it so that we do not see internal 'Rename..' gets print out in symbol table. And we could have consistent naming between TC entry and .lcomm. And also have consistent naming between IR and final object file. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D86075	2020-08-18 15:32:45 +00:00
Simon Pilgrim	f9607d1829	[X86] Regenerate load-slice test labels. NFCI. Pulled out a superfluous diff from D66004	2020-08-18 16:08:35 +01:00
David Green	805498d035	[LV] Predicated reduction tests. NFC	2020-08-18 16:02:21 +01:00
Simon Pilgrim	aaddcf1562	[X86][AVX] lowerShuffleWithPERMV - pad 128/256-bit shuffles on non-VLX targets Allow non-VLX targets to use 512-bit VPERMV/VPERMV3 for 128/256-bit shuffles. TBH I'm not sure these targets actually exist in the wild, but we're testing for them and it's good test coverage for shuffle lowering/combines across different subvector widths.	2020-08-18 15:46:02 +01:00
Simon Pilgrim	5b3aace9c9	[X86][AVX] lowerShuffleWithVTRUNC - extend to support v16i16/v32i8 binary shuffles. This requires a few additional SrcVT vs DstVT padding cases in getAVX512TruncNode.	2020-08-18 15:30:02 +01:00
Sanjay Patel	b8e421a05f	[SLP] remove instcombine dependency from regression test; NFC InstCombine doesn't do that much here - sinks some instructions and improves alignments - but that should not be part of the SLP pass unit testing.	2020-08-18 10:18:22 -04:00
Simon Pilgrim	bc23e1d476	[X86][AVX] lowerShuffleWithVTRUNC - pull out TRUNCATE/VTRUNC creation into helper code. NFCI. Prep work toward adding v16i16/v32i8 support for lowerShuffleWithVTRUNC and improving lowerShuffleWithVPMOV.	2020-08-18 14:52:42 +01:00
Matt Arsenault	e8f21df36b	AMDGPU/GlobalISel: Select llvm.amdgcn.groupstaticsize Previously, it would successfully select and assert if not HSA or PAL when expanding the pseudoinstruction. We don't need the pseudoinstruction anymore since we know the total size after legalization.	2020-08-18 09:28:01 -04:00
Matt Arsenault	441f0f056c	AMDGPU/GlobalISel: Fix selection of s1/s16 G_[F]CONSTANT The code to determine the value size was overcomplicated and only correct in the case where the result register already had a register class assigned. We can always take the size directly from the register's type.	2020-08-18 09:28:01 -04:00
Georgii Rymar	dfdc7b8a6d	[llvm-readobj/elf] - Refine testing of broken Android's packed relocation sections. This uses modern `split-file` tool to merge 5 `packed-relocs-error*.s` tests to a new `packed-relocs-errors.s` and adds testing for GNU style. Differential revision: https://reviews.llvm.org/D85835	2020-08-18 16:23:41 +03:00
Sanjay Patel	e60caae143	[InstCombine] fold fabs of select with negated operand This is the FP example shown in: https://bugs.llvm.org/PR39474	2020-08-18 09:23:07 -04:00
Sanjay Patel	5005da6598	[InstCombine] add tests for fneg+fabs; NFC	2020-08-18 09:23:07 -04:00
Georgii Rymar	7dc8b6931c	[yaml2obj] - Don't crash when `FileHeader` declares an empty `Flags` key in specific situations. We currently call the `llvm_unreachable` for the following YAML: ``` --- !ELF FileHeader: Class: ELFCLASS32 Data: ELFDATA2LSB Type: ET_REL Machine: EM_NONE Flags: [ ] ``` it happens because the `Flags` key is present, though `EM_NONE` is a machine type that has no known `EF_*` values and we call `llvm_unreachable` by mistake. Differential revision: https://reviews.llvm.org/D86138	2020-08-18 16:09:28 +03:00
Ronak Chauhan	6e3663ae70	[ELF] Hide target specific methods as private Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D86136	2020-08-18 18:26:08 +05:30
Simon Pilgrim	b682d6311b	[X86][AVX] lowerShuffleWithVTRUNC - avoid unnecessary division in element counts. NFCI. (256 / SrcEltBits) == ((2 * EltSizeInBits * NumElts) / (EltSizeInBits * Scale)) == (2 * (NumElts / Scale)) == NumSrcElts	2020-08-18 13:48:22 +01:00
Nico Weber	7f41844cac	Revert "PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units" This reverts commit be3ef93bf58aa5546c7baadfb21d43b75fbb4e24. Test fails on macOS and Windows, e.g. http://45.33.8.238/win/22216/step_11.txt	2020-08-18 08:40:36 -04:00
Ronak Chauhan	0f95014e38	[llvm-objdump][AMDGPU] Detect CPU string AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly. However, the AMDGPU target CPU is stored in e_flags in the ELF object. This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D84519	2020-08-18 17:43:16 +05:30


202158 Commits