llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	9d05ebb2cb	[ValueTracking] Add computeKnownBits DemandedElts support to shift instructions (PR36319)	2020-03-20 11:08:08 +00:00
Nikita Popov	330cabedf4	[Tests] Regenerate some test checks; NFC	2020-03-20 12:06:53 +01:00
James Henderson	c2469985d5	[llvm-readobj] Allow syms from all sections to match stack size entries Prior to this change, for non-relocatable objects llvm-readobj would assume that all symbols that corresponded to a stack size section's entries were in the section specified by the section's sh_link field. In the presence of an output section description combining SHF_LINK_ORDER sections linking different output sections, this cannot be respected, since linker script section patterns are "by name" by nature. Consequently, the sh_link value would not be correct for all section entries. This patch changes llvm-readobj to ignore the section of symbols in a non-relocatable object. Fixes https://bugs.llvm.org/show_bug.cgi?id=45228. Reviewed by: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D76425	2020-03-20 10:54:18 +00:00
Georgii Rymar	5defb79081	[llvm-readobj][llvm-readelf][test] - Add a test to check how we dump relocation addends. Seems we do not test how we print relocation addends well. And the behavior of dumpers does not seem to be ideal here (and llvm-readelf does not match GNU as the test case shows). This patch adds a test case to document the current behavior. Differential revision: https://reviews.llvm.org/D75671	2020-03-20 13:41:32 +03:00
Adrian Kuegel	00bbf66f21	Revert "[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes" This reverts commit e9f22fd4293a65bcdcf1b18b91c72f63e5e9e45b. When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70 failing tests with this revision, and 29 without this revision.	2020-03-20 11:02:50 +01:00
David Green	b0d54a04ab	[ARM] Change VDUP type to i32 for MVE The MVE VDUP instruction take a GPR and splats into every lane of a vector register. Unlike NEON we do not have a VDUPLANE equivalent instruction, doing the same splat from a fp register. Previously a VDUP to a v4f32/v8f16 would be represented as a (v4f32 VDUP f32), which would mean the instruction pattern needs to add a COPY_TO_REGCLASS to the GPR. Instead this now converts that earlier during an ISel DAG combine, converting (VDUP x) to (VDUP (bitcast x)). This can allow instruction selection to tell that the input needs to be an i32, which in one of the testcases allows it to use ldr (or specifically ldm) over (vldr;vmov). Whilst being simple enough for floats, as the types sizes are the same, these is no BITCAST equivalent for getting a half into a i32. This uses a VMOVrh ARMISD node, which doesn't know the same tricks yet. Differential Revision: https://reviews.llvm.org/D76292	2020-03-20 09:48:45 +00:00
Roger Ferrer Ibanez	ef495f25d0	[RISCV] Select +0.0 immediate using fmv.{w,d}.x / fcvt.d.w Floating point positive zero can be selected using fmv.w.x / fmv.d.x / fcvt.d.w and the zero source register. Differential Revision: https://reviews.llvm.org/D75729	2020-03-20 09:42:24 +00:00
Roger Ferrer Ibanez	1f76f9f3fb	[NFC][RISCV] Test for 0.0 fp immediate To show a later change that impacts 0.0 fp constant generation. Differential Revision: https://reviews.llvm.org/D75728	2020-03-20 09:42:24 +00:00
Nikita Popov	f73282e3d5	[InstCombine] Simplify calls with "returned" attribute If a call argument has the "returned" attribute, we can simplify the call to the value of that argument. This was already partially handled by InstSimplify/InstCombine for the case where the argument is an integer constant, and the result is thus known via known bits. The non-constant (or non-int) argument cases weren't handled though. This previously landed as an InstSimplify transform, but was reverted due to assertion failures when compiling the Linux kernel. The reason is that simplifying a call to another call breaks assumptions in call graph updating during inlining. As the code is not easy to fix, and there is no particularly strong motivation for having this in InstSimplify, the transform is only performed in InstCombine instead. Differential Revision: https://reviews.llvm.org/D75815	2020-03-20 10:23:39 +01:00
David Green	469aaafeb8	[ARM] Extra MVE float loop tests. NFC	2020-03-20 09:21:45 +00:00
Nikita Popov	117eda6206	[InstCombine] Don't replace musttail result based on known bits This is the same change as D75824, but for two cases where InstCombine performs the same optimization: Replacing an instruction whose bits are fully known with a constant. This is not (generally) legal for musttail calls. Differential Revision: https://reviews.llvm.org/D76457	2020-03-20 10:17:09 +01:00
Florian Hahn	ad3bbfd795	[Matrix] Generalize ColumnMatrixTy to MatrixTy (NFC). This patch sets the stage for supporting both row and column major layouts for matrixes. It renames ColumnMatrixTy to MatrixTy, adds booleans indicating the underlying layout to both MatrixTy and ShapeInfo and generalizes the methods of MatrixTy to support both row and column major layouts. Reviewers: Gerolf, anemet, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76324	2020-03-20 08:32:13 +00:00
Florian Hahn	30717a4559	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid that the MemoryPhi may be executed before before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Austin Kerbow	ba7de4f645	[AMDGPU] Reuse register during frame index elimination If there were no free VGPRs we would need two emergency spill slots for register scavenging during PEI/frame index elimination. Reuse 'ResultReg' for scale calculation so that only one spill is needed. Differential Revision: https://reviews.llvm.org/D76387	2020-03-20 00:19:15 -07:00
cdevadas	555c7ce155	[AMDGPU] Set the CostPerUse value for vgpr registers. Apart from the argument registers, set the CostPerUse value as per the ratio reg_index/allocation_granularity. It is a pre-commit for introducing the scratch registers in the ABI. This change should help in a balanced register allocation. Differential Revision: https://reviews.llvm.org/D76417	2020-03-20 11:49:35 +05:30
Wei Mi	2f9d0b467d	Revert "Generate Callee Saved Register (CSR) related cfi directives like .cfi_restore." This reverts commit 3c96d01d2e3de63304ca3429d349ec62ae2adef3. Got report that it caused test failures in libc++.	2020-03-19 22:45:27 -07:00
Jun Ma	030942f981	[Coroutines] Fix PR45130 For now, when final suspend can be simplified by simplifySuspendPoint, handleFinalSuspend is executed as well to remove last case in switch instruction. This patch fixes it. Differential Revision: https://reviews.llvm.org/D76345	2020-03-20 11:27:08 +08:00
David Blaikie	dda08b0f66	Recommit: CFGDiff: Simplify/common the begin/end implementations to use a common range helper"" (would be nice to revisit the CFG traits and change them to use ranges rather than begin/end - if anyone wants to do that refactor) Also use more auto because writing the names of range utilty iterators isn't helping readability here - they're sort of implementation details for the most part, especially once you nest a few different filtering and adapting iterators. The fix (shooting from the hip since I couldn't reproduce this locally) was to capture by value in a lambda used in a filtering iterator - because the iterator would persist beyond the lifetime of the function (as the iterators are returned to callers). Originally committed in 79a7ed92a9b135212a6a271dd8dbc625038c8f06. This was reverted in 4a7f2032a350bc7eefd26709563f65216df3e2ce.	2020-03-19 18:21:14 -07:00
Yuta Saito	e102360e48	[WebAssembly] Support swiftself and swifterror for WebAssembly target Summary: Swift ABI is based on basic C ABI described here https://github.com/WebAssembly/tool-conventions/blob/master/BasicCABI.md Swift Calling Convention on WebAssembly is a little deffer from swiftcc on another architectures. On non WebAssembly arch, swiftcc accepts extra parameters that are attributed with swifterror or swiftself by caller. Even if callee doesn't have these parameters, the invocation succeed ignoring extra parameters. But WebAssembly strictly checks that callee and caller signatures are same. https://github.com/WebAssembly/design/blob/master/Semantics.md#calls So at WebAssembly level, all swiftcc functions end up extra arguments and all function definitions and invocations explicitly have additional parameters to fill swifterror and swiftself. This patch support signature difference for swiftself and swifterror cc is swiftcc. e.g. ``` declare swiftcc void @foo(i32, i32) @data = global i8* bitcast (void (i32, i32)* @foo to i8) define swiftcc void @bar() { %1 = load i8, i8** @data %2 = bitcast i8* %1 to void (i32, i32, i32)* call swiftcc void %2(i32 1, i32 2, i32 swiftself 3) ret void } ``` For swiftcc, emit additional swiftself and swifterror parameters if there aren't while lowering. These additional parameters are added for both callee and caller. They are necessary to match callee and caller signature for direct and indirect function call. Differential Revision: https://reviews.llvm.org/D76049	2020-03-19 17:39:52 -07:00
Thomas Lively	2fea740d56	[WebAssembly] SIMD integer abs instructions Summary: These were merged to the SIMD proposal in https://github.com/WebAssembly/simd/pull/128. Depends on D76397 to avoid merge conflicts. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76399	2020-03-19 17:25:58 -07:00
Sterling Augustine	3f51a0ef88	Revert "Cleanup the plumbing for DILineInfoSpecifier. [NFC]" This broke lldb. Will fix and resubmit. This reverts commit 98ff6eb679cd5a2556d990d3d629e6c03c1da6a0.	2020-03-19 17:25:05 -07:00
Thomas Lively	d200caf70e	[WebAssembly] SIMD bitmask intrinsics and builtin functions Summary: These experimental new instructions are proposed in https://github.com/WebAssembly/simd/pull/201. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76397	2020-03-19 17:15:37 -07:00
Matt Arsenault	a4f71b0079	AMDGPU/GlobalISel: Remove leftover #if 0 The subtarget feature used to be missing from subtargets, but that was fixed.	2020-03-19 20:07:05 -04:00
Sterling Augustine	7d98860ec6	Cleanup the plumbing for DILineInfoSpecifier. [NFC] Summary: 1. FileLineInfoSpecifier::Default isn't the default for anything. Rename to RawValue, which accurately reflects its role. 2. Most functions that take a part of a FileLineInfoSpecifier end up constructing a full one later or plumb two values through. Make them all just take a complete FileLineInfoSpecifier. 3. Printing basenames only was handled differently from all other variants, make it parallel to all the other variants. Reviewers: jhenderson Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76394	2020-03-19 16:56:43 -07:00
Jessica Paquette	933279999d	[GlobalISel] Port some basic shufflevector undef combines from the DAGCombiner Port over the following: - shuffle undef, undef, any_mask -> undef - shuffle anything, anything, undef_mask -> undef This sort of thing shows up a lot when you try to bugpoint code containing shufflevector. Differential Revision: https://reviews.llvm.org/D76382	2020-03-19 16:46:06 -07:00
Lang Hames	f9245f3da0	[ORC] Re-apply 98f2bb44610, enable JITEventListeners in OrcV2, with fixes. Updates the object buffer ownership scheme in jitLinkForOrc and related functions: Ownership of both the object::ObjectFile and underlying MemoryBuffer is passed into jitLinkForOrc and passed back to the onEmit callback once linking is complete. This avoids the use-after-free errors that were seen in 98f2bb44610.	2020-03-19 16:30:08 -07:00
Petr Hosek	a988522a04	[CMake] Enable the use of -ffile-prefix-map This handles not paths embedded in debug info, but also in sources. Since the use of this flag is controlled by an option, rather than replacing the new option, we add a new option. Differential Revision: https://reviews.llvm.org/D76018	2020-03-19 15:14:15 -07:00
Simon Pilgrim	3445f5ec2f	[InstSimplify] Add some vector shift tests to show lack of DemandedElts support	2020-03-19 22:09:51 +00:00
Stefan Agner	5aeb5453b2	[MC][ARM] add implicit immediate form for ldrsbt/ldrht/ldrsht Add pseudo instructions for ldrsbt/ldrht/ldrsht with implicit immediate and add fall back C++ code to transform the instruction to the equivalent LDRSBTi/LDRHTi/LDRSHTi form. This is similar to how it has been done in commit fb3950ec6312dfa4317d8cbf83a1db4aae7428ce This fixes: https://bugs.llvm.org/show_bug.cgi?id=45070	2020-03-19 22:36:42 +01:00
Benjamin Kramer	3a398e615d	[Matrix] Fold single-use variable into assert Avoids -Wunused-variable warnings in Release builds.	2020-03-19 21:42:22 +01:00
Florian Hahn	b9c3b2fd13	[Matrix] Move multiply-add code generation into separate function (NFC). This logic can be shared with the tiled code generation. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75565	2020-03-19 20:26:19 +00:00
Kazu Hirata	0f5d7d1ae5	[JumpThreading] Fix infinite loop (PR44611) Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44611 by preventing an infinite loop in the jump threading pass when -jump-threading-across-loop-headers is on. Specifically, without this patch, jump threading through two basic blocks would trigger on the same area of the CFG over and over, resulting in an infinite loop. Consider testcase PR44611-across-header-hang.ll in this patch. The first opportunity to thread through two basic blocks is: from bb_body2 through bb_header and bb_body1 to bb_body2. The pass duplicates bb_header and bb_body1 as, say, bb_header.thread1 and bb_body1.thread1. Since bb_header contains a successor edge back to itself, bb_header.thread1 also contains a successor edge to bb_header, immediately giving rise to the next jump threading opportunity: from bb_header.thread1 through bb_header and bb_body1 to bb_body2. After that, we repeatedly thread an incoming edge into bb_header through bb_header and bb_body1 to bb_body2. In other words, we keep peeling one iteration from bb_header's self loop. The patch fixes the problem by preventing the pass from duplicating a basic block containing a self loop. Reviewers: wmi, junparser, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76390	2020-03-19 12:49:36 -07:00
Scott Linder	85bcde4aea	[AMDGPU] Move frame pointer from s34 to s33 Remove the gap left between the stack pointer (s32) and frame pointer (s34) now that the scratch wave offset is no longer a part of the calling convention ABI. Update llvm/docs/AMDGPUUsage.rst to reflect the change. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75657	2020-03-19 15:35:16 -04:00
Scott Linder	a70016c8d5	[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in the entry function prologue. This allows us to removes the scratch wave offset register from the calling convention ABI. As part of this change, allow the use of an inline constant zero for the SOffset of MUBUF instructions accessing the stack in entry functions when a frame pointer is not requested/required. Entry functions with calls still need to set up the calling convention ABI stack pointer register, and reference it in order to address arguments of called functions. The ABI stack pointer register remains unswizzled, but is now wave-relative instead of queue-relative. Non-entry functions also use an inline constant zero SOffset for wave-relative scratch access, but continue to use the stack and frame pointers as before. When the stack or frame pointer is converted to a swizzled offset it is now scaled directly, as the scratch wave offset no longer needs to be subtracted first. Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling convention. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75138	2020-03-19 15:35:16 -04:00
Scott Linder	00b0d3e619	[AMDGPU][NFC] Refactor some uses of unsigned to Register Tags: #llvm Differential Revision: https://reviews.llvm.org/D76035	2020-03-19 15:35:16 -04:00
Scott Linder	9e034896d6	[AMDGPU][NFC] Refactor emitEntryFunctionPrologue Remove dead code and factor repeated conditions out into a single check. Rename and move code to make it more obvious what is running only for entry functions. Simplify function arguments to make it clearer what the relevant inputs are. Make flat scratch init accept an MBB iterator and move it to where it was logically being emitted within the prologue. These changes will make a future update to the calling convention simpler. Tags: #llvm Differential Revision: https://reviews.llvm.org/D75092	2020-03-19 15:35:16 -04:00
Florian Hahn	3de3635359	[Matrix] Hoist load/store generation logic, add helpers for tiled access. This patch slightly generalizes the code to emit loads and stores of a matrix and adds helpers to load/store a tile of a larger matrix. This will be used in a follow-up patch introducing initial tiling. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75564	2020-03-19 19:28:21 +00:00
Simon Pilgrim	4ca6edb997	[InstCombine][X86] Tests for variable but in-range vector-by-scalar shift amounts (PR40391) These shifts are masked to be inrange so we should be able to replace them with generic shifts.	2020-03-19 19:24:55 +00:00
Simon Pilgrim	3e6dfd9a5a	[InstCombine][X86] simplifyX86immShift - handle variable out-of-range vector shift by immediate amounts (PR40391) If we know the SSE shift amount is out of range then we can simplify to zero value (logical) or a 'signsplat' bitwidth-1 shift (arithmetic). This allows us to remove the equivalent ConstantInt constant folding path from simplifyX86immShift.	2020-03-19 18:27:31 +00:00
Cameron McInally	b07cda4b10	[AArch64][SVE] Add support for DestructiveBinaryImm DestructiveInstType Support prefixing destructive operations, with the MOVPRFX instruction, to build constructive operations. Differential Revision: https://reviews.llvm.org/D75064	2020-03-19 13:11:46 -05:00
Lang Hames	dac63bbcfc	[ORC] Don't use a platform mutex for LLJIT's GenericLLVMIRPlatformSupport class. Along the same lines as eb918d8daf1: This code also had to acquire the session mutex, and this could cause a deadlock under the wrong circumstances. This patch updates GenericLLVMIRPlatformSupport to just use the session lock for everything.	2020-03-19 11:03:34 -07:00
Lang Hames	2945504341	[ORC] Fix indentation in debugging output.	2020-03-19 11:02:56 -07:00
Lang Hames	f62b9d12ef	[ORC] Use finer-grained and session locking in MachOPlatform to avoid deadlock. In MachOPlatform, obtaining the link-order for a JITDylib requires locking the session, but also needs to be part of a larger atomic operation that collates initializer symbols tracked by the platform. Trying to do this under a separate platform mutex leads to potential locking order issues, e.g. T1 locks session then tries to lock platform to register a new init symbol meanwhile T2 locks platform then tries to lock session to obtain link order. Removing the platform lock and performing all these operations under the session lock eliminates this possibility. At the same time we also need to collate init pointers from the MachOPlatform::InitScraperPlugin, and we don't need or want to lock the session for that. The new InitSeqMutex has been added to guard these init pointers, and the session mutex is never obtained while the InitSeqMutex is held.	2020-03-19 11:02:56 -07:00
Lang Hames	ea57541031	[ORC] Don't waste time building empty replacement MaterializationUnits.	2020-03-19 11:02:56 -07:00
Lang Hames	91189ff985	[ORC] Bail out early if a replacement MaterializationUnit is empty. The MU may define no symbols, but still contain a non-trivial destructor (e.g. an LLVM IR module that has been stripped of all externally visible definitions, but which still needs to lock its context to be destroyed). Bailing out early ensures that we destroy the unit outside the session lock, rather than under it which may cause deadlocks. Also adds some extra sanity-checking assertions.	2020-03-19 11:02:56 -07:00
Sanjay Patel	5c8fcd9cfc	[SDAG] reduce code duplication in getNegatedExpression(); NFCI	2020-03-19 13:55:15 -04:00
Craig Topper	a63f955255	[X86] Attempt to more accurately model the cost of a bool reduction of wide vector type. Previously we multiplied the cost for the table entries by the number of splits needed. But that implies that each split goes through a reduction to scalar independently. I think what really happens is that the we AND/OR the split pieces until we're down to a single value with a legal type and then do special reduction sequence on that. So to model that this patch takes the number of splits minus one multiplied by the cost of a AND/OR at the legal element count and adds that on top of the table lookup. Differential Revision: https://reviews.llvm.org/D76400	2020-03-19 09:31:05 -07:00
Vedant Kumar	af2c299da6	[test] Re-enable accidentally disabled X86 tests A number of X86 tests were accidentally disabled in https://reviews.llvm.org/D73568. This commit re-enables those tests. ``` $ for x86_test in $(gg 'REQUIRES: x86$' llvm/test \| fst); do sed -i "" '/REQUIRES: x86/d' $x86_test; done ``` (Note that 'x86' is not an available feature, that's what caused the tests to be disabled.)	2020-03-19 09:29:23 -07:00
Sam Parker	782943a635	[NFC][ARM] Fix for buildbots Update broken test.	2020-03-19 15:50:13 +00:00
Simon Pilgrim	b65b8d7149	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by immediate amounts to generic shifts (PR40391) The slli/srli/srai 'immediate' vector shifts (although its not immediate anymore to match gcc) can be replaced with generic shifts if the shift amount is known to be in range.	2020-03-19 15:44:24 +00:00

1 2 3 4 5 ...

193607 Commits