llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Roman Lebedev	823fdcbc30	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPS tests	2021-05-14 14:06:23 +03:00
Roman Lebedev	dd7ffd482a	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPS tests	2021-05-14 14:06:23 +03:00
Roman Lebedev	e17513d36b	[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPS tests	2021-05-14 14:06:23 +03:00
Sander de Smalen	7a186bd3d7	[LoopVectorizationLegality] NFC: Mark some interfaces as 'const' This patch marks blockNeedsPredication, isConsecutivePtr, isMaskRequired and getSymbolicStrides as 'const'.	2021-05-14 11:53:54 +01:00
Heejin Ahn	5ec2c45efb	[WebAssembly] Omit DBG_VALUE after terminator When a stackified variable has an associated `DBG_VALUE` instruction, DebugFixup pass adds a `DBG_VALUE` instruction after the stackified value's last use to clear the variable's debug range info. But when the last use instruction is a terminator, it can cause a verification failure (when run with `-verify-machineinstrs`) because there are no instructions allowed after a terminator. For example: ``` %myvar = ... DBG_VALUE target-index(wasm-operand-stack), $noreg, !"myvar", ... BR_IF 0, %myvar, ... DBG_VALUE $noreg, $noreg, !"myvar", ... ``` In this test, `%myvar` is stackified, so the first `DBG_VALUE` instruction's first operand has changed to `wasm-operand-stack` to denote it. And an additional `DBG_VALUE` instruction is added after its last use, `BR_IF`, to signal variable `myvar` is not in the operand stack anymore. But because the `DBG_VALUE` instruction is added after the `BR_IF`, a terminator, it fails MachineVerifier. `DBG_VALUE` instructions are used in `DbgEntityHistoryCalculator` to compute value ranges to emit DWARF info, and it turns out the `DbgEntityHistoryCalculator` terminates ranges at the end of a BB, so we don't need to emit `DBG_VALUE` after a terminator. Fixes https://bugs.llvm.org/show_bug.cgi?id=50175. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D102309	2021-05-14 03:48:19 -07:00
Heejin Ahn	871c0f68dd	[WebAssembly] Support Emscripten EH/SjLj in Wasm64 In wasm64, the signatures of some library functions and global variables defined in Emscripten change: - `emscripten_longjmp`: `(i32, i32) -> ()` -> `(i64, i32) -> ()` This changes because the first argument is the address of a memory buffer. This in turn causes more changes below. - `setThrew`: `(i32, i32) -> ()` -> `(i64, i32) -> ()` `emscripten_longjmp` calls `setThrew` with the i64 buffer argument as the first parameter. - `__THREW__` (global var): `i32` to `i64` `setThrew`'s first argument is set to this `__THREW__` variable, so it should change to i64 as well. - `testSetjmp`: `(i32, i32, i32) -> (i32)` -> `(i64, i32, i32) -> (i32)` In the code transformation done in this pass, the value of `__THREW__` is passed as the first parameter of `testSetjmp`. This patch creates some helper functions to easily get types that become different depending on the wasm32/wasm64, and uses them to change various function signatures and code transformations. Also updates the tests with WASM32/WASM64 check lines. (Untested) Emscripten side patch: https://github.com/emscripten-core/emscripten/pull/14108 Reviewed By: aardappel Differential Revision: https://reviews.llvm.org/D101985	2021-05-14 03:45:09 -07:00
Tim Northover	5661b7eb80	IR+AArch64: add a "swiftasync" argument attribute. This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).	2021-05-14 11:43:58 +01:00
Simon Pilgrim	e0d18bc91b	[Local] collectBitParts - for bswap-only matches, limit shift amounts to whole bytes to reduce compile time.	2021-05-14 11:42:52 +01:00
Simon Pilgrim	25112d22fc	[Local] collectBitParts - reduce maximum recursion depth. As noticed on D90170, the recursion depth for matching a maximum of a i128 bitwidth was too high. @lebedev.ri mentioned that we can probably do better by limiting the number of collected Values instead of just depth, but I'll look at that later.	2021-05-14 11:42:51 +01:00
Florian Hahn	f8663c485a	[VectorCombine] Add tests with assumes involving variable index. Add test cases with variable indices together with assumes guaranteeing that the indices are valid.	2021-05-14 11:20:08 +01:00
Anton Afanasyev	3dad2ee3ae	[SLP] Fix spill cost computation for insertelement tree node This is follow up for D98714, bugfixing.	2021-05-14 13:14:41 +03:00
Simon Pilgrim	f2ce6e36d3	[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.	2021-05-14 11:14:18 +01:00
Max Kazantsev	28797ff852	[Test] Add test on missing opportunity in Loop Deletion We can break the backedge in some cases when we can evaluate some of the values and conditions on the 1st iteration.	2021-05-14 16:57:50 +07:00
Tim Northover	fc7c25abd8	AArch64: support i128 cmpxchg in GlobalISel. There are three essentially different cases to handle: * -O1, no LSE. The IR is expanded to ldxp/stxp and we need patterns to select them. * -O0, no LSE. We get G_ATOMIC_CMPXCHG, and need to produce CMP_SWAP_N pseudos. The registers are all 64-bit so this is easy. * LSE. We get G_ATOMIC_CMPXCHG and need to produce a CASP instruction with XSeqPair registers. The last case is by far the hardest, and adds 128-bit GPR support as a byproduct.	2021-05-14 10:41:38 +01:00
Sander de Smalen	8f17561cd5	NFCI: Remove VF argument from isScalarWithPredication As discussed in D102437, the VF argument to isScalarWithPredication seems redundant, so this is intended to be a non-functional change. It seems wrong to query the widening decision at this point. Removing the operand and code to get the widening decision causes no unit/regression tests to fail. I've also found no issues running the LLVM test-suite. This subsequently removes the VF argument from isPredicatedInst as well, since it is no longer required.	2021-05-14 10:34:40 +01:00
Jay Foad	85f4b8dffe	[AMDGPU] getMemOperandsWithOffset: add vaddr operand for stack access BUF instructions A consequence is that checkInstOffsetsDoNotOverlap can now distinguish sp+offset from fp+offset, so it knows that it shouldn't try to work out whether the accesses overlap just by comparing the offsets. For example in these two instructions: MIR: BUFFER_STORE_DWORD_OFFSET %0:vgpr_32(s32), $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store 4 into stack + 4, addrspace 5) %4:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %stack.0.alloca, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, 0, 0, 0, implicit $exec :: (load 4 from `i8 addrspace(5)* undef`, addrspace 5) ISA: buffer_store_dword v0, off, s[0:3], s32 offset:4 buffer_load_dword v0, off, s[0:3], s34 Differential Revision: https://reviews.llvm.org/D73957	2021-05-14 10:10:43 +01:00
Alexandros Lamprineas	4ec21b78e4	[llvm-mc][AArch64] HINT instruction disassembled as BTI The Arm Architecture Reference Manual says that the SystemHintOp_BTI opcode is preferred when CRm:op2 matches 0100:xx0, but llvm-mc currently accepts 0100:xxx, which isn't right. Differential Revision: https://reviews.llvm.org/D102415	2021-05-14 10:05:37 +01:00
Roman Lebedev	d4b7ba4b9e	[X86] AMD Zen 3: same-reg AVX YMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom As confirmed by exegesis measurements, and ref docs.	2021-05-14 11:56:07 +03:00
Roman Lebedev	1e0a6ede0d	[X86] AMD Zen 3: same-reg AVX XMM VXORPD is a zero-cycle(!) dep-breaking zero-idiom As confirmed by exegesis measurements, and ref docs.	2021-05-14 11:56:07 +03:00
Roman Lebedev	456286cdbe	[X86] AMD Zen 3: same-reg SSE XMM XORPD is a 1-cycle(!) dep-breaking zero-idiom Same as with its float friend, unlike their AVX versions. As confirmed by exegesis, and ref docs.	2021-05-14 11:56:07 +03:00
Roman Lebedev	ada20786f6	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPD tests	2021-05-14 11:56:07 +03:00
Roman Lebedev	5f134255ba	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPD tests	2021-05-14 11:56:06 +03:00
Roman Lebedev	fe385c88c4	[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM XORPD tests	2021-05-14 11:56:06 +03:00
Roman Lebedev	8b2d3e8b88	[X86] AMD Zen 3: same-reg AVX YMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom As confirmed by exegesis, and ref docs.	2021-05-14 11:56:06 +03:00
Roman Lebedev	942ccdfd9c	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VXORPS tests	2021-05-14 11:56:06 +03:00
Roman Lebedev	af80d479c4	[X86] AMD Zen 3: same-reg AVX XMM VXORPS is a zero-cycle(!) dep-breaking zero-idiom Unlike its legacy SSE XMM XORPS version, which measures as being 1-cycle, this one is certainly a zero-cycle instruction, in addition to both of them being dependency breaking. As confirmed by exegesis measurements, and ref docs.	2021-05-14 11:56:06 +03:00
Roman Lebedev	6d05f6c67e	[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VXORPS tests	2021-05-14 11:56:06 +03:00
Pooja Yadav	5296db8df0	[docs] Added llvm/cmake section Added information about the cmake inside llvm. Reviewed By: xgupta, jroelofs Differential Revision: https://reviews.llvm.org/D101925	2021-05-14 14:10:56 +05:30
David Stuttard	5ca0f8e582	[AMDGPU] Fix codegen of image intrinsics for g16 and a16 For gfx10 gradient (g16) and address (a16) can be independent. Previous implementation assumed that a16 implied g16. There are some other changes that fix the verification (as well as asm/disasm) that are required for the included test to pass - the XFAIL will be removed in those changes. This also includes required fixes for GlobalISel Differential Revision: https://reviews.llvm.org/D102066 Change-Id: I7d171cc90994de05f41669b66a6d0ffa2ed05d09	2021-05-14 09:28:15 +01:00
David Stuttard	0a28768900	[AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling A16 support for image instructions assembly/disassembly (gfx10) was missing Also refactor MIMG op addr size calcs to common function We'd got 3 places where the same operation was being done. One test is now marked XFAIL until a related codegen patch is in place Differential Revision: https://reviews.llvm.org/D102231 Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb	2021-05-14 09:25:44 +01:00
David Spickett	3d7d76ad6f	[llvm][AsmPrinter] Restore source location to register clobber warning Since 5de2d189e6ad466a1f0616195e8c524a4eb3cbc0 this particular warning hasn't had the location of the source file containing the inline assembly. Fix this by reporting via LLVMContext. Which means that we no longer have the "instantiated into assembly here" lines but they were going to point to the start of the inline asm string anyway. This message is already tested via IR in llvm. However we won't have the required location info there so I've added a C file test in clang to cover it. (though strictly, this is testing llvm code) Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102244	2021-05-14 08:22:57 +00:00
Alexey Bader	ae75421161	New tag for ittapi - fix an error related to cross-compiling ITTAPI in LLVM with mingw Fix was implemented in the ittapi repo to solve an error about cross-compiling ITTAPI in LLVM with mingw. The problem occurred in the cross-compilation environment for Julia's dependencies. The corresponding issue item in ittapi repo: https://github.com/intel/ittapi/issues/19 A new tag was created in ittapi repo for that fix. This patch contains changes to update the ittapi tag in LLVM. Reviewed By: bader Differential Revision: https://reviews.llvm.org/D102471	2021-05-14 08:18:49 +03:00
dfukalov	3f7e516e28	[GVN] Clobber partially aliased loads. Use offsets stored in `AliasResult` implemented in D98718. Updated with fix of issue reported in https://reviews.llvm.org/D95543#2745161 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D95543	2021-05-14 11:17:14 +03:00
David Green	d997870c3e	[DSE] Move isOverwrite into DSEState. NFC This moves the isOverwrite function into the DSEState so that it can share the analyses and members from the state. A few extra loop tests were also added to test stores in and around multi block loops for D100464.	2021-05-14 09:16:51 +01:00
Lang Hames	5fae3540d3	[ORC] Add JITLink dependence for ObjectLinkingLayerTest. This aims to fix the failure at https://lab.llvm.org/buildbot/#/builders/61/builds/9590.	2021-05-13 22:48:30 -07:00
LLVM GN Syncbot	12903dbf9e	[gn build] Port 0fda4c4745b8	2021-05-14 04:56:03 +00:00
Lang Hames	d5111c292d	[ORC] Add support for adding LinkGraphs directly to ObjectLinkingLayer. This is separate from (but builds on) the support added in ec6b71df70a for emitting LinkGraphs in the context of an active materialization. This commit makes LinkGraphs a first-class data structure with features equivalent to object files within ObjectLinkingLayer.	2021-05-13 21:44:13 -07:00
Lang Hames	736891b041	[JITLink] Fix missing 'static' keyword in unit test.	2021-05-13 21:44:13 -07:00
Carl Ritson	6f78a47930	[AMDGPU] Do not clause NSA instructions To ensure correct behaviour NSA instructions should not be claused. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D102211	2021-05-14 12:54:56 +09:00
Lang Hames	ec3bd41065	[ORC] Remove the OrcExecutionTest class. It is no longer used.	2021-05-13 18:32:36 -07:00
Lang Hames	30e7a462fb	[ORC] Remove unused RTDyldObjectLinkingLayerExecutionTest class from unit test.	2021-05-13 18:32:35 -07:00
Lang Hames	7eb0435795	[ORC] Remove some stale unit test utils. This code was used to test ORCv1, which has been removed. It is not useful for testing ORCv2.	2021-05-13 18:32:35 -07:00
Chen Zheng	4eed210a79	[Debug-Info] change Tag type to dwarf::Tag for createAndAddDIE; NFC Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102207	2021-05-13 21:15:06 -04:00
Arthur Eubanks	6e23cf88e5	[test] Fix new-pm-lto-defaults.ll to work on all platforms https://lab.llvm.org/buildbot/#/builders/119/builds/3775/steps/8/logs/FAIL__LLVM__new-pm-lto-defaults_ll Followup to D102345.	2021-05-13 18:12:55 -07:00
Chen Zheng	b9ce1812f9	[Debug-Info] make DIE attributes generation under strict DWARF control Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101024	2021-05-13 20:34:07 -04:00
Amara Emerson	0ab348ad5b	[AArch64][GlobalISel] Fix a crash during unsuccessful G_CTPOP <2 x s64> legalization. The legalization rule for scalar-same-as doesn't handle vectors. Until we implement custom legalization for this, at least fall back properly.	2021-05-13 17:28:11 -07:00
Reid Kleckner	b409d78956	[gn] Don't pass -fprofile-instr-generate to linker on Windows Avoids a warning from the linker. The user still has to put the resource directory on the linker search path, and I can't find a clean way to do that automatically in gn.	2021-05-13 16:04:11 -07:00
Matt Arsenault	b8b464cdd8	AMDGPU/GlobalISel: Don't hardcode stack alignment in assert message	2021-05-13 19:00:13 -04:00
Matt Arsenault	2f4beff49d	AMDGPU/GlobalISel: Implement tail calls Or at least the sibling call cases which the DAG already handles.	2021-05-13 18:57:42 -04:00
Arthur Eubanks	1b32fba3b3	[IR] Introduce the opaque pointer type The opaque pointer type is essentially just a normal pointer type with a null pointee type. This also adds support for the opaque pointer type to the bitcode reader/writer, as well as to textual IR. To avoid confusion with existing pointer types, we disallow creating a pointer to an opaque pointer. Opaque pointer types should not be widely used at this point since many parts of LLVM still do not support them. The next steps are to add some very simple use cases of opaque pointers to make sure they work, then start pretending that all pointers are opaque pointers and see what breaks. https://lists.llvm.org/pipermail/llvm-dev/2021-May/150359.html Reviewed By: dblaikie, dexonsmith, pcc Differential Revision: https://reviews.llvm.org/D101704	2021-05-13 15:22:27 -07:00

215762 Commits