llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Max Kazantsev	a15cdde064	[SCEV] Limited support for unsigned preds in isImpliedViaOperations The logic there only considers `SLT/SGT` predicates. We can use the same logic for proving `ULT/UGT` predicates if all involved values are non-negative. Adding full-scale support for unsigned might be challenging because of code amount, so we can consider this in the future. Differential Revision: https://reviews.llvm.org/D88087 Reviewed By: reames	2020-10-02 10:20:57 +07:00
Philip Reames	4bb7568ead	[gvn] Handle a corner case w/vectors of non-integral pointers If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers. In theory, we can handle many of these cases, we just need to use a different code idiom to convert without going through inttoptr and back. This shows up as wrong code bugs, and in some cases, crashes due to failed asserts. Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.	2020-10-01 19:20:21 -07:00
Carl Ritson	75e5d1576a	[AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt Convert to use new MachineBasicBlock splitAt function. Place code in splitBlock function for reuse in future changes. Should yield no functional change. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88537	2020-10-02 11:10:55 +09:00
Carl Ritson	01c4226b07	CodeGen: Fix livein calculation in MachineBasicBlock splitAt Fix and simplify computation of liveins for new block. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88535	2020-10-02 10:45:04 +09:00
Esme-Yi	a538a3723b	[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC. Summary: We currently copy CRRC to GRC with a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of the destination register to equal the source, so we have to put the contents of the CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that. The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in the CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. Then we confirmed this should be a bug in the POST-RA PSEUDO PASS. Reviewed By: nemanjai, shchenz Differential Revision: https://reviews.llvm.org/D88274	2020-10-02 01:26:18 +00:00
Joseph Huber	dbad0ba351	[OpenMP] Add Missing Runtime Call for Globalization Remarks Summary: Add a missing runtime call to perform data globalization checks. Reviewers: jdoerfert Subscribers: guansong hiraditya llvm-commits sstefan1 yaxunl Tags: #LLVM #OpenMP Differential Revision: https://reviews.llvm.org/D88621	2020-10-01 21:19:53 -04:00
jasonliu	89f3a464f7	[XCOFF] Enable -fdata-sections on AIX Summary: One design decision worth noting: I've noticed a recent mailing list discussion about why string literals are not affected by -fdata-sections for ELF targets: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html But on AIX, our linker cannot split mergeable strings like other targets. So I think it would make more sense for us to emit a separate csect for every mergeable string in -fdata-sections mode, as there might not be other ways for the linker to do garbage collection on unused mergeable strings. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88339	2020-10-02 00:16:24 +00:00
Philip Reames	24c50af17e	[memcpyopt] Conservatively handle non-integral pointers If we allow the non-integral pointers to become memset and memcpy, we lose the ability to reason about pointer propagation. This patch is modeled on changes we've carried downstream for a long time, figured it was worth being equally conservative for other users. There is room to refine the semantics and handling here if anyone is motivated.	2020-10-01 16:46:56 -07:00
Muhammad Asif Manzoor	94e48de239	[AArch64][SVE] Add lowering for llvm fabs Add the functionality to lower fabs for passthru variant Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D88679	2020-10-01 19:41:25 -04:00
Philip Reames	874fb0cf7a	Fix a bug in memset formation with vectors of non-integral pointers We were converting the non-integral store into an integer store which is not legal.	2020-10-01 16:11:11 -07:00
Stanislav Mekhanoshin	b53c88017f	[AMDGPU] Allow SOP asm mnemonic to differ Allows the creation of real SOP1 instructions with assembler mnemonics that differ from their pseudo-instruction mnemonics. The default behavior keeps the mnemonics matching. Corrects a subtarget label typo in a comment. Authored By: Joe_Nash Differential Revision: https://reviews.llvm.org/D88708	2020-10-01 16:00:04 -07:00
Jessica Paquette	c5bda355e6	[GlobalISel][AArch64] Don't emit cset for G_FCMPs feeding into G_BRCONDs Similar to the FP case in `AArch64TargetLowering::LowerBR_CC`. Instead of emitting the csets + a tbnz, just emit a compare + bcc (or two bccs, depending on the condition code) This improves cases like this: https://godbolt.org/z/v8hebx This is a 0.1% geomean code size improvement for CTMark at -O3. Differential Revision: https://reviews.llvm.org/D88624	2020-10-01 15:34:16 -07:00
Jessica Paquette	7197bc439e	[AArch64][GlobalISel] Use emitTestBit in selection for G_BRCOND Partially refactoring, partially fixing a bug. - We shouldn't use TB(N)ZX unless the bit number is >= 32 - We can fold more than xor using emitTestBit Also remove a check which isn't relevant anymore + update tests. Rename select-brcond-of-not.mir to select-brcond-of-binop.mir, since it now tests more than just G_XOR. Differential Revision: https://reviews.llvm.org/D88702	2020-10-01 15:33:34 -07:00
Amara Emerson	b4d0184534	[AArch64][GlobalISel] Alias rules for G_FCMP to G_ICMP. No need to be different here for the vast majority of rules.	2020-10-01 15:20:09 -07:00
Amara Emerson	b838382ceb	[AArch64][GlobalISel] Make <8 x s8> integer arithmetic ops legal.	2020-10-01 14:35:21 -07:00
Amara Emerson	2e35d9215f	[AArch64][GlobalISel] Make <8 x s8> shifts legal and add selection support.	2020-10-01 14:21:18 -07:00
Amara Emerson	22d12ebe1b	Revert "[AArch64][GlobalISel] Make <8 x s8> shifts legal." Accidentally pushed this.	2020-10-01 14:15:57 -07:00
Amara Emerson	54e3d67d23	[AArch64][GlobalISel] Make <8 x s8> shifts legal.	2020-10-01 14:10:10 -07:00
Amara Emerson	d997cfda8a	[AArch64][GlobalISel] Merge G_SHL, G_ASHR and G_LSHR legalizer rules together. There's no need for any difference between these.	2020-10-01 14:02:45 -07:00
Arthur Eubanks	f70b28627d	[gn build] Support building with ThinLTO Differential Revision: https://reviews.llvm.org/D88584	2020-10-01 13:48:31 -07:00
Amara Emerson	c0c56e8bad	[AArch64][GlobalISel] Use custom legalization for G_TRUNC for v8i8 vectors. Truncating to v8i8 is a case where we want to split the source but also generate intermediate truncates to reduce the size of the source vector before truncating down to v8i8. This implements the same strategy as what SelectionDAG does, but I'm not certain where if anywhere in generic code it should live. Use it for legalization of v8s8 = G_ICMP v8s32. Differential Revision: https://reviews.llvm.org/D88191	2020-10-01 13:22:00 -07:00
Amara Emerson	4d12624339	[AArch64][GlobalISel] Clamp oversize v4s64 G_FPEXT operations.	2020-10-01 13:08:31 -07:00
Reid Kleckner	f0bf98943a	[lit] Fix Python 2/3 compat in new winreg search code This should fix the test failures on the clang win64 bot: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/18830 It has been red since Sept 23-ish. This was subtle to debug. Windows has 'find' and 'sort' utilities in C:\Windows\system32, but they don't support all the same flags as the coreutils programs. I configured the buildbot above with Python 2.7 64-bit (hey, it was set up in 2016). When I installed git for Windows, I opted to add all the Unix utilities that come with git to the system PATH. This is almost enough to make the LLVM tests pass, but not quite, because if you use the system PATH, the Windows version of find and sort come first, but the tests that use diff, cmp, etc, will all pass. So only a handful of tests will fail, and with cryptic error messages. The code changed in this CL doesn't work with Python 2. Before Python 3.2, the winreg.OpenKey function did not accept the `access=` keyword argument, the caller was required to pass an unused `reserved` positional argument of 0. The try/except/pass around the OpenKey operation masked this usage error in Python 2. Further, the result of the registry operation has to be converted from unicode to add it to the environment, but that was incidental.	2020-10-01 12:22:28 -07:00
Nikita Popov	6d2cf6fc6a	[InstCombine] Fix select operand simplification with undef (PR47696) When replacing X == Y ? f(X) : Z with X == Y ? f(Y) : Z, make sure that Y cannot be undef. If it may be undef, we might end up picking a different value for undef in the comparison and the select operand.	2020-10-01 21:15:48 +02:00
Petr Hosek	98bed1a704	[CMake] Use -isystem flag to access libc++ headers This is a partial revert of D62155. Rather than copying libc++ headers into the build directory to be later overwritten by the final headers, use -isystem flag to access libc++ headers during CMake checks. This should address the occasional flake we've seen, especially on Windows builders where CMake fails to overwrite __config with the final version. Differential Revision: https://reviews.llvm.org/D88454	2020-10-01 12:09:27 -07:00
Sanjay Patel	4a25740376	[APFloat] convert SNaN to QNaN in convert() and raise Invalid signal This is an alternate fix (see D87835) for a bug where a NaN constant gets wrongly transformed into Infinity via truncation. In this patch, we uniformly convert any SNaN to QNaN while raising 'invalid op'. But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR, so those are always encoded/decoded by calling convert from/to 64-bit hex. See D88664 for a clang fix needed to allow this change. Differential Revision: https://reviews.llvm.org/D88238	2020-10-01 14:37:38 -04:00
Arthur Eubanks	920d09ef67	Revert "[CFGuard] Add address-taken IAT tables and delay-load support" This reverts commit ef4e971e5e18ae796466623df8f26265ba6bdfb5.	2020-10-01 11:29:54 -07:00
Sanjay Patel	65ea3c84f4	[InstCombine] auto-generate complete test checks; NFC	2020-10-01 13:44:31 -04:00
zoecarver	b38d4083e5	[DSE] Look through memory PHI arguments when removing noop stores in MSSA. Summary: Adds support for "following" memory through MSSA PHI arguments. This will help catch more noop stores that exist between blocks. Originally part of D79391. Reviewers: fhahn, jfb, asbirlea Differential Revision: https://reviews.llvm.org/D82588	2020-10-01 10:42:02 -07:00
Jamie Schmeiser	37e405d0af	Reland No.3: Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces an abstract template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. Derived classes provide overrides that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen), MaskRay (Fangrui Song) Differential Revision: https://reviews.llvm.org/D86360	2020-10-01 17:39:13 +00:00
Mircea Trofin	14a8d83207	[NFC] Let (MC)Register APIs check isStackSlot The user is expected to make the isStackSlot check before calling isPhysicalRegister or isVirtualRegister. The APIs assert otherwise. We can improve the usability of these APIs by carrying out the check in the 2 APIs: they become a complete "source of truth" and remove an extra responsibility from the user. Differential Revision: https://reviews.llvm.org/D88598	2020-10-01 09:55:20 -07:00
Shoaib Meenai	8896c8f66c	[runtimes] Remove TOOLCHAIN_TOOLS specialization https://reviews.llvm.org/D88310 fixed the AIX issue in LLVMExternalProjectUtils, so we shouldn't need the workaround in the runtimes build anymore. I'm reverting it because it prevents the target-specific tool selection in LLVMExternalProjectUtils from taking effect, which we rely on for our runtimes builds. Reviewed By: daltenty Differential Revision: https://reviews.llvm.org/D88627	2020-10-01 09:53:10 -07:00
Vy Nguyen	a8dbe39f1b	Reland rG4fcd1a8e6528:[llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking. This is mostly for the benefit of the LBR latency mode. Right now, it performs no checking. If this is run on non-supported hardware, it will produce all zeroes for latency. Differential Revision: https://reviews.llvm.org/D85254 New change: Updated lit.local.cfg to pass the right argument to llvm-exegesis to actually request the LBR mode. Differential Revision: https://reviews.llvm.org/D88670	2020-10-01 12:21:16 -04:00
Martin Storsjö	0bc7a3ead4	[AArch64] Don't merge sp decrement into later stores when using WinCFI This matches the corresponding existing case in AArch64LoadStoreOpt::findMatchingUpdateInsnForward. Both cases could also be modified to check MBBI->getFlag(FrameSetup/FrameDestroy) instead of forbidding any optimization involving SP, but the effect is probably pretty much the same. Differential Revision: https://reviews.llvm.org/D88541	2020-10-01 19:03:27 +03:00
Martin Storsjö	bbdb6c4cf7	[AArch64] Remove a duplicate call to setHasWinCFI. NFCI. The function already has a cleanup scope that calls the same whenever the function is exited. When reading the code, seeing that this return codepath has an explicit call while other return paths lack it is confusing. In the hypothetical case of a function having a prologue that set the HasWinCFI flag in the MF, but the epilogue containing no WinCFI instructions, the HasWinCFI flag in the MF would end up reset back to false. Differential Revision: https://reviews.llvm.org/D88636	2020-10-01 19:03:27 +03:00
Simon Pilgrim	744dd3076c	[InstCombine] collectBitParts - convert to use PatternMatch matchers and avoid IntegerType casts. Make sure we're using getScalarSizeInBits instead of cast<IntegerType> to get Type bit widths. This is preliminary cleanup before we can start adding vector support to the bswap/bitreverse (element level) matching.	2020-10-01 16:44:14 +01:00
Meera Nakrani	1ebfeef2a5	[ARM] Removed hasSideEffects from signed/unsigned saturates Removed hasSideEffects from SSAT and USAT so that they are no longer marked as unpredictable. Differential Revision: https://reviews.llvm.org/D88545	2020-10-01 14:55:01 +00:00
Jay Foad	b964542465	[AMDGPU] Simplify getNumFlatOffsetBits. NFC. Remove some checks that have already been done in the only caller.	2020-10-01 15:24:09 +01:00
LLVM GN Syncbot	fb69abb068	[gn build] Port f6b1323bc68	2020-10-01 14:18:52 +00:00
Simon Pilgrim	a4a9ce83e5	[InstCombine] Use m_FAbs matcher helper. NFCI.	2020-10-01 14:42:34 +01:00
Simon Pilgrim	038cfb6d37	[IR] PatternMatch - add m_FShl/m_FShr funnel shift intrinsic matchers. NFCI.	2020-10-01 14:42:34 +01:00
Jay Foad	0359573ae0	[AMDGPU] Tiny cleanup in isLegalFLATOffset. NFC.	2020-10-01 14:26:03 +01:00
David Sherwood	42f980da4f	[SVE][CodeGen] Replace use of TypeSize operator< in GlobalMerge::doMerge We don't support global variables with scalable vector types so I've changed the code to compare the fixed sizes instead. Differential Revision: https://reviews.llvm.org/D88564	2020-10-01 14:06:59 +01:00
James Henderson	6428f0596b	[Archive] Don't throw away errors for malformed archive members When adding an archive member with a problem, e.g. a new bitcode with an old archiver, containing an unsupported attribute, or an ELF file with a malformed symbol table, the archiver would throw away the error and simply add the member to the archive without any symbol entries. This meant that the resultant archive could be silently unusable when not using --whole-archive, and result in unexpected undefined symbols. This change fixes this issue by addressing two FIXMEs and only throwing away not-an-object errors. However, this meant that some LLD tests which didn't need symbol tables and were using invalid members deliberately to test the linker's malformed input handling no longer worked, so this patch also stops the archiver from looking for symbols in an object if it doesn't require a symbol table, and updates the tests accordingly. Differential Revision: https://reviews.llvm.org/D88288 Reviewed by: grimar, rupprecht, MaskRay	2020-10-01 14:03:34 +01:00
LLVM GN Syncbot	77de5cb36a	[gn build] Port d53b4bee0cc	2020-10-01 12:55:59 +00:00
Sjoerd Meijer	c208ece583	[LoopFlatten] Add a loop-flattening pass This is a simple pass that flattens nested loops. The intention is to optimise loop nests like this, which together access an array linearly: for (int i = 0; i < N; ++i) for (int j = 0; j < M; ++j) f(A[i*M+j]); into one loop: for (int i = 0; i < (N*M); ++i) f(A[i]); It can also flatten loops where the induction variables are not used in the loop. This can help with codesize and runtime, especially on simple cpus without advanced branch prediction. This is only worth flattening if the induction variables are only used in an expression like i*M+j. If they had any other uses, we would have to insert a div/mod to reconstruct the original values, so this wouldn't be profitable. This partially fixes PR40581 as this pass triggers on one of the two cases. I will follow up on this to teach LoopFlatten a few more (small) tricks. Please note that LoopFlatten is not yet enabled by default. Patch by Oliver Stannard, with minor tweaks from Dave Green and myself. Differential Revision: https://reviews.llvm.org/D42365	2020-10-01 13:54:45 +01:00
Sam Parker	55afff27c3	[NFC][ARM] LowOverheadLoop DEBUG statements	2020-10-01 13:38:16 +01:00
Simon Pilgrim	7c21647448	[InstCombine] collectBitParts - use APInt directly to check for out of range bit shifts. NFCI.	2020-10-01 12:50:36 +01:00
Andrew Paverd	ee355f6f7f	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-01 12:45:07 +01:00
Kerry McLaughlin	6865916c61	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
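
The loop-nest rewrite described in the [LoopFlatten] entry (c208ece583) can be illustrated with a small sketch. This is not the pass itself, only a Python model of the source-level equivalence it relies on: when `i` and `j` are used solely in the expression `i*M+j`, the nest visits indices 0 .. N*M-1 in order, so a single loop over N*M reaches the same elements. The function names `nested_sum`, `flattened_sum` and `visited_indices` are hypothetical and chosen for illustration only.

```python
def nested_sum(a, n, m):
    """Original form: a two-level loop nest that indexes a[i*m + j]."""
    total = 0
    for i in range(n):
        for j in range(m):
            total += a[i * m + j]
    return total


def flattened_sum(a, n, m):
    """Flattened form: one loop over n*m iterations indexing a[k] directly."""
    total = 0
    for k in range(n * m):
        total += a[k]
    return total


def visited_indices(n, m):
    """Indices touched by the nested form, in iteration order."""
    return [i * m + j for i in range(n) for j in range(m)]


if __name__ == "__main__":
    data = list(range(12))
    # Both forms read the same elements in the same order.
    print(nested_sum(data, 3, 4), flattened_sum(data, 3, 4))
    print(visited_indices(3, 4) == list(range(12)))
```

If `i` or `j` were used anywhere else in the loop body, the flattened form would need a division and a modulo to recover them, which is the reason the commit message gives for restricting the transformation.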


204509 Commits