llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Craig Topper	14af611f66	Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions." I thought this might help with another optimization I was thinking about, but I don't think it will. So it just wastes compile time calling computeKnownBits for no benefit. This reverts commit 81b2f95971edd47a0057ac4a77b674d7ea620c01.	2021-06-27 10:33:43 -07:00
Nikita Popov	b0bb011472	[DSE] Support opaque pointers For the start shortening optimization, always use a i8 type for the GEP, as it is a raw offset calculation. Handling of non-i8* memset/memcpy arguments requires insertion of casts. These cases were previously miscompiled, as the offset calculation was performed on the wrong type.	2021-06-27 17:41:40 +02:00
Nikita Popov	802d7a3dde	[MemCpyOpt] Handle unusual memcpy element type Apparently, it is legal to use memcpy/memset with pointer types other than i8. Prior to 81fcdae68c5ff656c30032fd26c6a21af4c51dbb this case was silently miscompiled, as the i8 offset calculation was performed on some other type. Now it would crash due to a type mismatch. Fix this by inserting an explicit bitcast to i8.	2021-06-27 16:21:44 +02:00
Sanjay Patel	fa04d281c5	[InstCombine] hoist min/max intrinsics above select with constant op This is an extension of the handling for unary intrinsics and follows the logic that we use for binary ops. We don't canonicalize to min/max intrinsics yet, but this might help unlock other folds seen in D98152.	2021-06-27 10:02:23 -04:00
Nikita Popov	d5928b4916	[MemCpyOpt] Support opaque pointers	2021-06-27 15:52:38 +02:00
Nikita Popov	e3cd3226a4	[LoadStoreVectorizer] Support opaque pointers There are remaining redundant bitcasts.	2021-06-27 15:42:16 +02:00
Florian Hahn	6f498ede31	[VPlan] Track both incoming values for first-order recurrence phis. This patch updates VPWidenPHI recipes for first-order recurrences to also track the incoming value from the back-edge. Similar to D99294, which did the same for reductions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104197	2021-06-27 14:29:35 +01:00
Sanjay Patel	27099e9b95	[InstCombine][test] add tests for min/max intrinsics with select operand; NFC	2021-06-27 08:19:00 -04:00
Sanjay Patel	b406a3d74c	[Analysis] improve function signature checking for calloc This would crash later if we thought the parameters were valid for the standard library call as shown in: https://llvm.org/PR50846	2021-06-27 08:19:00 -04:00
Mara Sophie Grosch	27c4ea000e	[Orc][examples] LLJITWithRemoteDebugger: fix CMake when utils are not built	2021-06-27 13:52:04 +02:00
Jan Kratochvil	d1dd2f5d77	llvm-dwarfdump: Print warnings on invalid DWARF llvm-dwarfdump was silent even when the format of DWARF was invalid and/or llvm-dwarfdump did not understand/support some of the constructs. This can be pretty confusing as llvm-dwarfdump is a tool for DWARF producers+consumers development. Review comments also by @dblaikie. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D104271	2021-06-27 11:38:35 +02:00
Craig Topper	72e0f3ae04	[X86] Tighten up some inline assembly constraint handling. Don't allow vectors to split into GPRs for 'r' and other scalar constraints. Prevents assertion in getCopyToPartsVector. Makes PR50907 give a better error instead of crashing.	2021-06-26 22:57:22 -07:00
Alexander Shaposhnikov	559d4f5102	[docs][llvm-strip] Fix documentation for -s/-S Fix the command line guide for -g/-s/-S. In particular, previously it was incorrectly stating that -S is an alias for --strip-all. Differential revision: https://reviews.llvm.org/D104888	2021-06-26 21:26:53 -07:00
David Green	c9378320d3	[ARM] Lower MVETRUNC to stack operations The MVETRUNC node truncates two wide vectors to a single vector with narrower elements. This is usually lowered to a series of extract/insert elements, going via GPR registers. This patch changes that to instead use a pair of truncating stores and a stack reload. This cuts down the number of instructions at the expense of some stack space. Differential Revision: https://reviews.llvm.org/D104515	2021-06-26 22:12:57 +01:00
David Green	d1eb4f5a05	[ARM] Introduce MVETRUNC ISel lowering Currently, when encountering store(trunc(..)) where the trunc is double a legal vector length in MVE, we split the node into two different stores each performing half of the trunc from the wider type. This works well for efficiently lowering wider than legal types, else the trunc becomes a series of individual lane moves. Unfortunately this splitting is currently one of the first combines attempted, so can happen before any other combines which might be preferable. This patch instead introduces the concept of a MVETRUNC ISel node that the trunc is initially lowered to, to keep it intact as a single item as opposed to splitting it up. This allows us to push the store(trunc(..)) combine later, allowing other optimisations to potentially happen on the trunc first. The store(trunc(..)) splitting can then be done later in the legalisation period if needed, or else fall back to a buildvector as before. This can also be used in the future to lower to loads/stores, as opposed to the more expensive lane extracts/inserts. Some extra combines are added to keep all the existing tests happy. Differential Revision: https://reviews.llvm.org/D91921	2021-06-26 22:00:26 +01:00
Craig Topper	2bab6a4c53	[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions.	2021-06-26 11:57:26 -07:00
David Green	d9e635a956	[ARM] MVE vabd This adds MVE lowering for VABDS/VABDU, using the code ported from AArch64 in D91937. Differential Revision: https://reviews.llvm.org/D91938	2021-06-26 19:41:32 +01:00
David Green	6a316cf978	[ISel] Port AArch64 SABD and UABD to DAGCombine This ports the AArch64 SABD and UABD over to DAG Combine, where they can be used by more backends (notably MVE in a follow-up patch). The matching code has changed very little, just to handle legal operations and types differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing an abds/abdu which is zexted to the original type. Differential Revision: https://reviews.llvm.org/D91937	2021-06-26 19:34:16 +01:00
Nikita Popov	2d1d507858	[Verifier] Support masked load/store with opaque pointers	2021-06-26 18:11:59 +02:00
LLVM GN Syncbot	4d0d5cf124	[gn build] Port 8b7881a084d0	2021-06-26 14:20:52 +00:00
David Green	2a8be6538a	[ARM] Regenerate big-endian-vector-caller.ll test checks. NFC	2021-06-26 13:21:54 +01:00
Florian Hahn	417caa2b9f	[LV] Adjust trip count based on IsOrdered in widenPHIInstruction (NFC). Suggested in D104197, avoids the early exit.	2021-06-26 13:13:25 +01:00
LLVM GN Syncbot	769467b21f	[gn build] Port aff57ff24aca	2021-06-26 11:38:00 +00:00
Lang Hames	59ded59cbe	[JITLink][ELF] Add generic ELFLinkGraphBuilder template. ELFLinkGraphBuilder<ELFT> will hold generic parsing and LinkGraph-building code that can be shared between JITLink ELF backends for different architectures. For now it's just a stub. The plan is to incrementally move functionality down from ELFLinkGraphBuilder_x86_64 into the new template.	2021-06-26 21:37:33 +10:00
Jim Lin	f8d2dcd1dd	[RISCV][NFC] Combine the control flow for different RetOp of interrupt function Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104838	2021-06-26 17:28:03 +08:00
Craig Topper	c2e96e7cf2	[RISCV] Add DAG combine to detect opportunities to replace (i64 (any_extend (i32 X)) with sign_extend. If type legalization is going to insert a sign_extend for other users of X and we can fold the sign_extend into ADDW/MULW/SUBW, it is better to replace the ANY_EXTEND so we don't end up with a separate ADD/MUL/SUB instruction for the users of the ANY_EXTEND. I'm only handling setcc uses right now, but there are other instructions that force sign_extends like ashr. There are probably other *W instructions we could use in addition to ADDW/SUBW/MULW. My motivating case was a loop terminating compare and a phi use as seen in the new test file. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D104581	2021-06-25 23:16:37 -07:00
Eric Astor	7492c6b2fb	[ms] [llvm-ml] Disable C-style comments	2021-06-25 23:09:13 -04:00
Luo, Yuanke	b033502dd5	[X86] Selecting fld0 for undefined value in fast ISEL. When opt-bisect-limit is set on the command line to a value lower than that of the ISel pass and CurBisectNum expires, the "DAG to DAG" pass lowers its opt level to O0. However, "processimpdefs" and "X86 FP Stackifier" are not stopped when CurBisectNum expires, so an undefined fp0 is generated. This causes a crash in the "X86 FP Stackifier" pass, because the Stackifier doesn't expect any undefined fp value. Here is the scenario that causes the compiler crash. successors: %bb.26 liveins: $r14 ST_FPrr $st0, implicit-def $fpsw, implicit $fpcw renamable $rdi = MOV64ri @.str.3.16422 renamable $rdx = LEA64r %stack.6, 1, $noreg, 0, $noreg ADJCALLSTACKDOWN64 0, 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp dead $esi = MOV32r0 implicit-def dead $eflags, implicit-def $rsi CALL64pcrel32 @foo, implicit $rsp, implicit $ssp, implicit $rdi, implicit $rsi, implicit $rdx, implicit-def dead $fp0 renamable $xmm0 = MOVSDrm_alt %stack.10, 1, $noreg, 0, $noreg :: (load 8 from %stack.10) ADJCALLSTACKUP64 0, 0, implicit-def $rsp, implicit-def dead $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp renamable $fp2 = CHS_Fp80 killed undef renamable $fp0, implicit-def $fpsw JMP_1 %bb.26 The CALL64pcrel32 marks fp0 dead, so LLVM frees the stack slot for fp0 and the stack becomes empty. The later instruction CHS_Fp80 uses the undefined register fp0; the original code assumes there must be a stack slot for the source register (fp0) without respecting that it is undefined, so LLVM reports an error. We had some discussion in https://reviews.llvm.org/D104440 and decided to fix it in fast ISel. The fix is to lower an undefined fp value to a zero value, which relieves the "X86 FP Stackifier" pass of that burden. Thanks to Craig for the suggestion and the initial patch to fix it. Differential Revision: https://reviews.llvm.org/D104678	2021-06-26 08:43:09 +08:00
Jon Chesterfield	7f9c53b162	Disable ReplaceLDS pass, patch up tests to match Most tests passed with an extra argument to explicitly enable the pass. One does not, deleted it as part of this change. I can't see why the codegen would be different between default on and default off but switched on. It can be retrieved from the project history. This would be a revert, but git revert was not clean. Disabling the pass and leaving it in tree is less likely to cause breakage elsewhere than patching up the git revert conflicts on unfamiliar code. It'll be landed without review, as @hsmhsm is believed unavailable at present. Differential Revision: https://reviews.llvm.org/D104962	2021-06-26 01:36:42 +01:00
Andrew Browne	55bbe5301b	[DFSan] Change shadow and origin memory layouts to match MSan. Previously on x86_64: +--------------------+ 0x800000000000 (top of memory) \| application memory \| +--------------------+ 0x700000008000 (kAppAddr) \| \| \| unused \| \| \| +--------------------+ 0x300000000000 (kUnusedAddr) \| origin \| +--------------------+ 0x200000008000 (kOriginAddr) \| unused \| +--------------------+ 0x200000000000 \| shadow memory \| +--------------------+ 0x100000008000 (kShadowAddr) \| unused \| +--------------------+ 0x000000010000 \| reserved by kernel \| +--------------------+ 0x000000000000 MEM_TO_SHADOW(mem) = mem & ~0x600000000000 SHADOW_TO_ORIGIN(shadow) = kOriginAddr - kShadowAddr + shadow Now for x86_64: +--------------------+ 0x800000000000 (top of memory) \| application 3 \| +--------------------+ 0x700000000000 \| invalid \| +--------------------+ 0x610000000000 \| origin 1 \| +--------------------+ 0x600000000000 \| application 2 \| +--------------------+ 0x510000000000 \| shadow 1 \| +--------------------+ 0x500000000000 \| invalid \| +--------------------+ 0x400000000000 \| origin 3 \| +--------------------+ 0x300000000000 \| shadow 3 \| +--------------------+ 0x200000000000 \| origin 2 \| +--------------------+ 0x110000000000 \| invalid \| +--------------------+ 0x100000000000 \| shadow 2 \| +--------------------+ 0x010000000000 \| application 1 \| +--------------------+ 0x000000000000 MEM_TO_SHADOW(mem) = mem ^ 0x500000000000 SHADOW_TO_ORIGIN(shadow) = shadow + 0x100000000000 Reviewed By: stephan.yichao.zhao, gbalats Differential Revision: https://reviews.llvm.org/D104896	2021-06-25 17:00:38 -07:00
Nikita Popov	5789265b3d	Revert "[InstCombine] Make indexed compare fold opaque ptr compatible" This reverts commit 5cb20ef8a235c2027489a196bba27630ca21a00b. Assertion failures with this patch were reported on https://reviews.llvm.org/rG5cb20ef8a235, revert for now.	2021-06-26 00:32:59 +02:00
Duncan P. N. Exon Smith	a6ff73a905	OpaquePtr: Reject 'ptr' again when parsing textual IR Bring back the testcase dropped in 1e6303e60ca5af4fbe7ca728572fd65666a98271 and get it passing by checking explicitly for `ptr` in LLParser. Uses `Type::isOpaquePointerTy()` from ad4bb8280952c2cacf497e30560ee94c119b36e0. Differential Revision: https://reviews.llvm.org/D104938	2021-06-25 15:18:44 -07:00
Eli Friedman	905b300022	[NFC] Prefer ConstantRange::makeExactICmpRegion over makeAllowedICmpRegion The implementation is identical, but it makes the semantics a bit more obvious.	2021-06-25 14:43:13 -07:00
Eric Astor	b88bba840f	[ms] [llvm-ml] Add support for ALIGN, EVEN, and ORG directives Match ML.EXE's behavior for ALIGN, EVEN, and ORG directives both at file level and in STRUCTs. We currently reject negative offsets passed to ORG inside STRUCTs (in ML.EXE and ML64.EXE, they wrap around as for an unsigned 32-bit integer). Also, if a STRUCT is declared using an ORG directive, no value of that type can be defined. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92507	2021-06-25 17:19:45 -04:00
Juneyoung Lee	1291b464ba	[SimplifyLibCalls] Fix memchr opt to use CreateLogicalAnd This fixes a bug at LibCallSimplifier::optimizeMemChr which does the following transformation: ``` // memchr("\r\n", C, 2) != nullptr -> (1 << C & ((1 << '\r') \| (1 << '\n'))) // != 0 // after bounds check. ``` As written above, a bounds check on C (whether it is less than integer bitwidth) is done before doing `1 << C` otherwise 1 << C will overflow. If the bounds check is false, the result of (1 << C & ...) must not be used at all, otherwise the result of shift (which is poison) will contaminate the whole results. A correct way to encode this is `select i1 (bounds check), (1 << C & ...), false` because select does not allow the unused operand to contaminate the result. However, this optimization was introducing `and (bounds check), (1 << C & ...)` which cannot do that. The bug was found from compilation of this C++ code: https://reviews.llvm.org/rG2fd3037ac615#1007197 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D104901	2021-06-26 05:59:35 +09:00
Joseph Huber	b8d800fd9c	[OpenMP] Change OpenMPOpt to check openmp metadata The metadata added in D102361 introduces a module flag that we can check to determine if the module was compiled with `-fopenmp` enabled. We can now check for the presence of this instead of scanning the call graph for OpenMP runtime functions. Depends on D102361 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102423	2021-06-25 16:34:22 -04:00
Nemanja Ivanovic	87c4e2706e	[PowerPC] Disable combine 64-bit bswap(load) without LDBRX This causes failures on the big endian bootstrap bot. Disabling this combine temporarily until I can get a proper fix.	2021-06-25 15:11:22 -05:00
Martin Storsjö	94252d9e23	[llvm-rc] Don't rewrite the arch in the default triple unless necessary When the default target arch isn't one that is supported as a windows target, we want to set a suitable architecture (so that Clang tests that run plain 'llvm-rc' succeed checks for e.g. "#ifdef _WIN32" even for llvm builds that default to e.g. ppc64). But if the default target architecture is usable, don't rewrite it. (Rewriting it, by e.g. "T.setArch(T.getArch())", normalizes the spelling of the architecture, e.g. changing i686 to i386. Such a change can make clang unable to find the right sysroot.) This can't, unfortunately, practically be tested very well because it is entirely dependent on the default triple of the llvm build. Differential Revision: https://reviews.llvm.org/D104589	2021-06-25 22:59:09 +03:00
Ulrich Weigand	fc8f374f5d	[SystemZ] Add support for .reloc assembler directive Add support for the .reloc directive along the lines of other back-ends. This fixes a regression after https://reviews.llvm.org/D104080 was merged, since that patch presupposed support for .reloc.	2021-06-25 21:51:10 +02:00
Hongtao Yu	8e43edc28f	[Coroutines] Define __coro_frame_ty in function scope Types should be defined in function scope instead of a local lexical scope. Field types should be defined inside their parent type scope. We were seeing a type defined in a local scope causing trouble to the dwarf emitter where a context is required to be a function scope, a namespace or a global scope. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D104937	2021-06-25 12:33:20 -07:00
Nikita Popov	67a9e2940b	[OpaquePtr] Enumerate GlobalAlias value type The type is no longer implicitly enumerated through the pointer type.	2021-06-25 21:21:10 +02:00
Nikita Popov	33e01a9045	[IR] Add Type::isOpaquePointerTy() helper (NFC) Shortcut to check for opaque pointers without a cast to PointerType.	2021-06-25 20:56:59 +02:00
David Green	20d5a26896	[DAG] Fold neg(splat(neg(x))) -> splat(x) This adds a fold of sub(0, splat(sub(0, x))) -> splat(x). This can come up in the lowering of right shifts under AArch64, where we generate a shift left of a negated number. Differential Revision: https://reviews.llvm.org/D103755	2021-06-25 19:53:29 +01:00
Craig Topper	40a736ff72	[X86] Simplify part of the isel for X86ISD::FCMP/STRICT_FCMP/STRICT_FCMPS. We don't need to have the compare output a value and then copy it to FPSW for use by FNSTSW. Instead we can just have the compare output Glue and glue the FNSTSW to it. InstrEmitter effectively performed this optimization when emitting the Machine IR. Doing it directly simplifies the code and reduces the work in InstrEmitter. There's no change in the machine IR at the end of isel before and after this change.	2021-06-25 11:39:01 -07:00
David Green	b250b8a12a	[AArch64] Extra negated shift tests. NFC	2021-06-25 19:17:31 +01:00
Philip Reames	65d08a3a8f	[test] Add coverage for existing overflow rule with uadd.with.overflow	2021-06-25 10:45:00 -07:00
Florian Hahn	cd014dcdee	[LV] Doxygenize VectorizationFactor member comments (NFC). Minor cleanup for follow-up patch.	2021-06-25 18:35:00 +01:00
Philip Reames	bfe000bb38	[instcombine] Fold overflow check using umulo to comparison If we have a umul.with.overflow where the multiply result is not used and one of the operands is a constant, we can perform the overflow check more cheaply with a comparison than by performing the multiply and extracting the overflow flag. (Noticed when looking at the conditions SCEV emits for overflow checks.) Differential Revision: https://reviews.llvm.org/D104665	2021-06-25 10:25:45 -07:00
Joel E. Denny	6ae82bea96	[UpdateCCTestChecks] Support --check-globals This option is already supported by update_test_checks.py, but it can also be useful in update_cc_test_checks.py. For example, I'd like to use it in OpenMP offload codegen tests to check global variables like `.offload_maptypes*`. Reviewed By: jdoerfert, arichardson, ggeorgakoudis Differential Revision: https://reviews.llvm.org/D104714	2021-06-25 13:17:56 -04:00
Philip Reames	b215117634	[test][instcombine] Add test cases for all x.with.overflow overflow checks For each of the x.with.overflow variants, if only the overflow bit is consumed, we can generate a direct overflow comparison. This precommits tests for each of the variants and tries to cover interesting corner cases.	2021-06-25 10:09:58 -07:00
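
The umul.with.overflow entries above (bfe000bb38 and b215117634) describe replacing a multiply-and-check with a plain comparison when one operand is a constant and only the overflow bit is used. The sketch below illustrates the arithmetic identity behind that idea in Python; it is not the InstCombine code. The function names and the `bits` parameter are hypothetical, and the zero-constant case is handled the way this sketch assumes is natural (multiplying by zero never overflows).

```python
def umul_overflows(x: int, c: int, bits: int = 32) -> bool:
    """Reference check: do the full multiply and see whether it fits in `bits` bits."""
    return x * c >= (1 << bits)


def umul_overflows_cmp(x: int, c: int, bits: int = 32) -> bool:
    """Comparison-only check for an unsigned multiply by a known constant `c`.

    For c > 0, x * c exceeds the maximum value exactly when x > floor(max / c),
    so the threshold can be computed once from the constant.
    """
    if c == 0:
        return False  # assumption of this sketch: x * 0 never overflows
    max_value = (1 << bits) - 1
    return x > max_value // c


if __name__ == "__main__":
    # Compare both checks on a few 32-bit inputs around the threshold for c = 3.
    for x in (0x55555554, 0x55555555, 0x55555556):
        print(hex(x), umul_overflows(x, 3), umul_overflows_cmp(x, 3))
```

Because the threshold `max_value // c` depends only on the constant, a compiler can fold it at compile time and leave a single unsigned comparison at run time.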


217776 Commits