llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

Author	SHA1	Message	Date
Victor Huang	bc814804ef	[PowerPC][NFC] Clang-format on commit 4b414d	2020-02-05 13:47:54 -06:00
Adrian McCarthy	3ec1d2b411	[VFS] More consistent support for Windows Removed some #ifdefs specific to Windows handling of VFS paths. This eliminates most of the differences between the Windows and non-Windows code paths. Making this work required some changes to account for the fact that VFS file paths can be Posix style or Windows style, so you cannot just assume that they use the host's native path style. In one case, this means implementing our own version of make_absolute, since the filesystem code in Support doesn't have styles in the sense that the path code does. Differential Review: https://reviews.llvm.org/D71092	2020-02-05 11:38:20 -08:00
Matt Arsenault	45fe1cf69f	AMDGPU/GlobalISel: Legalize f64 G_FFLOOR for SI Use cmp ord instead of cmp_class compared to the DAG version for the nan check, but mostly try to match the existing pattern. I think the sign doesn't matter for fract, so we could do a little better with the source modifier matching. I think this is also still broken as in D22898, but I'm leaving it as-is for now while I don't have an SI system to test on.	2020-02-05 14:32:01 -05:00
Nate Voorhies	35f8a76c2c	[NFC][RISCV] Fixing typo in comment. Reviewers: luismarques, lenary Reviewed By: lenary Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73984	2020-02-05 11:30:11 -08:00
LLVM GN Syncbot	9d6725a8d3	[gn build] Port b12176d2aaf	2020-02-05 19:16:15 +00:00
Nico Weber	878db5393d	Revert "[llvm-reduce] add ReduceAttribute delta pass" This reverts commit fc62b36a000681c01e993242b583c5ec4ab48a3c. Breaks tests on mac: http://45.33.8.238/mac/7301/step_11.txt	2020-02-05 14:15:11 -05:00
Jan Korous	4a2256fd57	Revert "Activate extension loading test on Darwin now that the underlying fix has landed" This reverts commit 058070893428a480b76a137f647ae6b9c89ac2d4.	2020-02-05 11:04:38 -08:00
Shu-Chun Weng	7269103002	[GlobalISel][AArch64] Fix contract cross-bank copies with SIMD instructions contractCrossBankCopyIntoStore() finds the instruction that defines the source register and uses its output to replace the register. There are, however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES. The current implementation hardcodes operand 0 and has no way of knowing which output should be used. This change adds another function to directly return the register that is the source of the register and use that for folding. This fixes https://bugs.llvm.org/show_bug.cgi?id=44783 Differential Revision: https://reviews.llvm.org/D74005	2020-02-05 10:38:35 -08:00
David Green	96b3c41b3d	[ARM] Add extra use test for MVE VPT blocks. NFC	2020-02-05 18:32:18 +00:00
Matt Arsenault	dffd3d8c49	GlobalISel: Assume G_INTRINSIC* are convergent This is safer in case anyone tries to run MI optimization passes on pre-selected MIR. If there turns out to be a real reason to do this, we might need to add separate convergent intrinsic opcodes.	2020-02-05 10:17:22 -08:00
LLVM GN Syncbot	8c3a57058a	[gn build] Port fc62b36a000	2020-02-05 18:06:25 +00:00
Nick Desaulniers	686a91a786	[llvm-reduce] add ReduceAttribute delta pass Summary: The output from llvm-reduce still has significantly more attributes than bugpoint does. Teach llvm-reduce to remove attributes. Reviewers: diegotf, dblaikie, george.burgess.iv Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73853	2020-02-05 10:05:25 -08:00
Jessica Paquette	fe54d96576	[AArch64][GlobalISel] Fold G_ASHR into TB(N)Z bit calculation This implements walking over G_ASHR in the same way as `getTestBitOperand` in AArch64ISelLowering. ``` (tbz (ashr x, c), b) -> (tbz x, b+c) or (tbz x, msb) if b+c is > # bits in x ``` Differential Revision: https://reviews.llvm.org/D73933	2020-02-05 10:04:48 -08:00
Christopher Tetreault	c5dc9d3b15	Reapply: [SVE] Fix bug in simplification of scalable vector instructions This reverts commit a05441038a3a4a011b9421751367c5c797d57137, reapplying commit 31574d38ac5fa4646cf01dd252a23e682402134f	2020-02-05 10:00:09 -08:00
Petr Hosek	0e71bf0b60	[CMake] Filter libc++abi and libunwind from runtimes build in MSVC These don't build on MSVC at the moment, so filter these out altogether from the list of runtimes and print a warning. Differential Revision: https://reviews.llvm.org/D73812	2020-02-05 09:59:06 -08:00
Matt Arsenault	df0d66718e	AMDGPU/GlobalISel: Prefer merge/unmerge ops to legalize TFE These have a better chance of combining with other operations and are currently much better supported than G_EXTRACT.	2020-02-05 12:56:10 -05:00
Jessica Paquette	bdc86b5318	[AArch64][GlobalISel] Fix one use check in getTestBitReg (1) The check needs to be on the 0th operand of whatever we're folding (2) Checks for validity should happen before we change the bit Fixes a bug which caused MultiSource/Applications/JM/lencod to fail at -O3. Differential Revision: https://reviews.llvm.org/D74002	2020-02-05 09:54:52 -08:00
Matt Arsenault	657b413bfa	AMDGPU/GlobalISel: Legalize TFE image result loads Rewrite the result register pair into the expected single register format in the legalizer. I'm also operating under the assumption that TFE doesn't apply to stores or atomics, but don't know if this is true or not.	2020-02-05 12:40:20 -05:00
Hiroshi Yamauchi	fcd3bf40ff	[PGO][PGSO] Tune flags for profile guided size optimization. Summary: Tune the profile threshold flag value for instrumentation PGO based on internal benchmarks. Also, add flags to allow profile guided size optimizations for non-cold code to be enabled separately for instrumentation and sample PGSO. Neither changes the default behavior (yet) as it's disabled for non-cold code. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72937	2020-02-05 09:37:32 -08:00
Matt Arsenault	0861faad9c	AMDGPU: Fix divergence analysis of control flow intrinsics The mask results of these should be uniform. The trickier part is the dummy booleans used as IR glue need to be treated as divergent. This should make the divergence analysis results correct for the IR the DAG is constructed from. This should allow us to eliminate requiresUniformRegister, which has an expensive, recursive scan over all users looking for control flow intrinsics. This should avoid recent compile time regressions.	2020-02-05 09:30:54 -08:00
Jordan Rupprecht	88aab300b4	NFC: fix unused var warnings in no-assert builds	2020-02-05 09:26:59 -08:00
Kazu Hirata	d766c964c9	Resubmit^2: [JumpThreading] Thread jumps through two basic blocks This reverts commit 41784bed01543315a1d03141e6ddc023fd914c0b. Since the original revision ead815924e6ebeaf02c31c37ebf7a560b5fdf67b, this revision fixes three issues: - This revision fixes the Windows build. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. - This revision fixes jump threading to a wrong destination. Specifically, my original patch treated any Constant other than 0 as 1 while evaluating the branch condition. This bug led to treating constant expressions like: icmp ugt i8* null, inttoptr (i64 4 to i8) to "true". This patch fixes the bug by calling isOneValue. - This revision fixes the cost calculation of two basic blocks being threaded through. Note that getJumpThreadDuplicationCost returns "(unsigned)~0" for those basic blocks that cannot be duplicated. If we sum two return values from getJumpThreadDuplicationCost, we could have an unsigned overflow like: (unsigned)~0 + 5 = 4 and mistakenly determine that it's safe and profitable to proceed with the jump threading opportunity. The patch fixes the bug by checking each return value before summing them up. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32 [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... 
bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-02-05 09:23:37 -08:00
Alina Sbirlea	88ae466cc3	[IRCE] Make IRCE a Function pass. Summary: Make InductiveRangeCheckElimination a FunctionPass. Reviewers: reames, mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73592	2020-02-05 09:22:41 -08:00
LLVM GN Syncbot	9d09a8e119	[gn build] Port b198f16e1e1	2020-02-05 17:03:12 +00:00
Matt Arsenault	2e7059e936	AMDGPU/GlobalISel: Legalize llvm.amdgcn.s.buffer.load The 96-bit results need to be widened. I find the interaction between LegalizerHelper and MIRBuilder somewhat awkward. The custom legalization is called by the LegalizerHelper, but then does not have access to the helper. You have to construct a new helper, which then does not own the MachineIRBuilder, but does modify it. Maybe custom legalization should be passed the helper?	2020-02-05 12:01:34 -05:00
Teresa Johnson	addd1606be	[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP Summary: Currently type test assume sequences inserted for devirtualization are removed during WPD. This patch delays their removal until later in the optimization pipeline. This is an enabler for upcoming enhancements to indirect call promotion, for example streamlined promotion guard sequences that compare against vtable address instead of the target function, when there are small number of possible vtables (either determined via WPD or by in-progress type profiling). We need the type tests to correlate the callsites with the address point offset needed in the compare sequence, and optionally to associated type summary info computed during WPD. This depends on work in D71913 to enable invocation of LowerTypeTests to drop type test assume sequences, which will now be invoked following ICP in the ThinLTO post-LTO link pipelines, and also after the existing export phase LowerTypeTests invocation in regular LTO (which is already after ICP). We cannot simply move the existing import phase LowerTypeTests pass later in the ThinLTO post link pipelines, as the comment in PassBuilder.cpp notes (it must run early because when performing CFI other passes may disturb the sequences it looks for). This necessitated adding a new type test resolution "Unknown" that we can use on the type test assume sequences previously removed by WPD, that we now want LTT to ignore. Depends on D71913. Reviewers: pcc, evgeny777 Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73242	2020-02-05 08:59:48 -08:00
Matt Arsenault	b46a1225fc	AMDGPU/GlobalISel: Fix processing new phi in waterfall loop The adjusted iterator range included the last phi we just inserted, which we don't want to process. Figure out the new iterator range before inserting phis. This was a harmless problem, but added an unnecessary complication for a future patch.	2020-02-05 11:52:42 -05:00
Matt Arsenault	1da609be66	GlobalISel: Make LegalizerHelper primitives public I want to re-use widenScalarDst/moreElementsVectorDst directly.	2020-02-05 11:52:18 -05:00
Matt Arsenault	4ba6a5e05e	AMDGPU/GlobalISel: Don't use legal v2s16 G_BUILD_VECTOR If we have s_pack_* instructions, legalize this to G_BUILD_VECTOR_TRUNC from s32 elements. This is closer to how the s_pack_* instructions really behave. If we don't have s_pack_ instructions, expand this by creating a merge to s32 and bitcasting. This expands to the expected bit operations. I think this eventually should go in a new bitcast legalize action type in LegalizerHelper. We already directly emit the shift operations in RegBankSelect for the vector case. This could possibly be cleaned up, but I also may want to defer doing this expansion to selection anyway. I'll see about that when I try to actually match VOP3P instructions. This breaks the selection of the build_vector since tablegen doesn't know how to match G_BUILD_VECTOR_TRUNC yet, so just xfail it for now.	2020-02-05 11:52:18 -05:00
Momchil Velikov	ad184987b8	[ARM][TargetParser] Improve handling of dependencies between target features The patch at https://reviews.llvm.org/D64048 added "negative" dependency handling in `ARM::appendArchExtFeatures`: feature "noX" removes all features, which imply "X". This patch adds the "positive" handling: feature "X" adds all the feature strings implied by "X". (This patch also comes from the suggestion here https://reviews.llvm.org/D72633#inline-658582) Differential Revision: https://reviews.llvm.org/D72762	2020-02-05 16:07:51 +00:00
Alex Richardson	16edf004ac	Re-enable an update_cc_test_checks.py test This test was not running because it still had a REQUIRES: python3 line. As this is no longer necessary, remove the REQUIRES to run the test again.	2020-02-05 15:37:30 +00:00
Sjoerd Meijer	9711abe6d3	[ARM][MVE] LowOverheadLoops: DCE on the iteration count setup expression Once we have created a tail-predicated hardware-loop, and thus know the number of elements that are processed, we want to clean-up the iteration count expression of that loop. In D73682, we bailed the analysis on conditionally executed instructions. This adds support for IT-blocks, so that we can handle these cases again. The restriction is that we only support IT blocks containing 1 statement, but that seems to cover most cases and forms of the iteration count expression. Differential Revision: https://reviews.llvm.org/D73947	2020-02-05 15:15:46 +00:00
Momchil Velikov	38241fc5d7	[ARM] Correct syntax of the CLRM insn The predicate should be adjacent to the opcode. Differential Revision: https://reviews.llvm.org/D74040	2020-02-05 13:54:34 +00:00
Andrea Di Biagio	6d0e7e12a0	[MCA] Remove verification check on MayLoad and MayStore. NFCI Field NumMicroOpcodes is currently used by mca to model the number of uOPs dispatched from the uOp-Queue to the out of order backend. From a 'dispatch' point of view, an instruction with zero opcodes is still valid; it simply doesn't consume any dispatch group slots. However, mca doesn't expect an instruction with zero uOPs to consume pipeline resources because it is seen as a contradiction. In practice, it only makes sense if such an instruction is eliminated and never really executed. It may be that mca is being too conservative here. However I believe that mca is right, and we should probably check that inconsistency in CodeGenSchedule.cpp (when we also verify scheduling classes in general). This patch removes the check for MayLoad and MayStore in mca. That check is probably too conservative: we are already checking if a zero-uops instruction consumes any processor resources. Note also that instructions with unmodelled side-effects also tend to set the MayLoad/MayStore flags even if - theoretically speaking - they might not even consume any hw resources in practice. In future we may want to implement different checks (possibly outside of mca) and potentially revisit the logic in mca that verifies instructions. For that reason I have raised PR44797.	2020-02-05 13:50:01 +00:00
Simon Pilgrim	29c63c6731	visitINSERT_VECTOR_ELT - pull out repeated dyn_cast. NFCI. This always gets called at least once.	2020-02-05 13:30:54 +00:00
Sam Parker	3b7b1e369e	[ARM][LowOverheadLoops] Fix loop count chain Checking that the use-def chain that performs the loop count isSafeToRemove is not sufficient because it means that we can remove register copies that we need to restore lr to its correct value. This change now prevents the transform from kicking in for the 'remove-elem-moves' test which needs to be addressed later on. Differential Revision: https://reviews.llvm.org/D74037	2020-02-05 13:21:51 +00:00
Sam Parker	4a2128748f	[ARM][LowOverheadLoops] Ensure memory predication While validating each MVE instruction, check that all instructions that touch memory are somehow predicated upon the VCTP. Differential Revision: https://reviews.llvm.org/D73616	2020-02-05 13:19:08 +00:00
Simon Pilgrim	b6af7e47c1	[X86] Fix missing load latencies (PR36894) We weren't accounting for load latencies in the SSE42/AES/CLMUL schedule classes	2020-02-05 11:53:16 +00:00
Ayke van Laethem	0b73574e68	[AVR] Add disassembly tests for supported instructions The disassembler of the AVR backend is incomplete: most instructions do not correctly disassemble yet. This patch is the first in a series to add disassembly support to the AVR backend. It starts with adding disassembler tests for instructions that already disassemble correctly. Differential Revision: https://reviews.llvm.org/D73911	2020-02-05 12:38:51 +01:00
Martin Storsjö	73cf6c4a89	Partially revert c1c9819ef91aab51b5a23fb3027adac5a2f551cc Revert the part of that change that broke the test Passes/./PluginsTests/PluginsTests.LoadPlugin.	2020-02-05 13:29:48 +02:00
Martin Storsjö	d6b7a01674	[CMake] Add missing component dependencies, to fix building for mingw with BUILD_SHARED_LIBS Differential Revision: https://reviews.llvm.org/D73840	2020-02-05 13:10:27 +02:00
Sebastian Neubauer	097cac0b5f	[AMDGPU] Fix lowering a16 image intrinsics scalar_to_vector takes only one argument, not two. The a16 tests now also check the packing of coordinates into registers Differential Revision: https://reviews.llvm.org/D73482	2020-02-05 10:54:34 +01:00
Sebastian Neubauer	c9e54cbc12	[AMDGPU] Use v3f32 type in image instructions This should lower the amount of used registers for gfx9. I updated some of the changed tests with the update script because changing them by hand is tedious. Differential Revision: https://reviews.llvm.org/D73884	2020-02-05 10:35:41 +01:00
Georgii Rymar	38ad9710a6	[yaml2obj][obj2yaml] - Simplify format of the SHT_LLVM_ADDRSIG section. Previously the description allowed describing symbols using the `Name` and `Index` keys. This patch removes them. It is still possible to use either names or symbol indexes, but the code is simpler and the format is slightly different. Such a change will be useful for other patches, e.g: https://reviews.llvm.org/D73788#inline-671077 Differential revision: https://reviews.llvm.org/D73888	2020-02-05 12:33:14 +03:00
Djordje Todorovic	73925edae1	[DebugInfo] Avoid the call site param for mem instrs with multiple defs We currently only handle mem instructions with a single define. Avoid the call site parameter debug info when we find the case with multiple defs, rather than throwing an assert. Differential Revision: https://reviews.llvm.org/D73954	2020-02-05 10:03:14 +01:00
Craig Topper	145a539e76	[X86] Add a DAG combine for (i32 (sext (i8 (x86isd::setcc_carry)))) -> (i32 (x86isd::setcc_carry)) and remove isel patterns. Same for any_extend though we don't have coverage for that. The test changes are because isel didn't check one use of the setcc_carry. So in isel we would end up with two different sized setcc_carry instructions. And since it clobbers the flags we would need to recreate the flags for the second instruction. This code handles additional uses by truncating the new wide setcc_carry back to the original size for those uses.	2020-02-04 22:40:36 -08:00
Petr Hosek	ff60cb9276	[CMake] Passthrough CMAKE_SYSTEM_NAME to default builtin and runtimes target When building the default builtin and runtimes target, set the CMAKE_SYSTEM_NAME to the current one. This is not necessary on Linux and Darwin, but it appears to be necessary on Windows, otherwise CMake fails. Differential Revision: https://reviews.llvm.org/D73811	2020-02-04 22:38:20 -08:00
Jan Vesely	d881d8ad71	AMDGPU/EG,CM: Implement fsqrt using recip(rsqrt(x)) instead of x * rsqrt(x) The old version might be faster on EG (RECIP_IEEE is Trans only), but it'd need extra corner case checks. This gives correct corner case behaviour and saves a register. Fixes OCL CTS sqrt test (1-thread, scalar) on Turks. Reviewer: arsenm Differential Revision: https://reviews.llvm.org/D74017	2020-02-05 00:24:07 -05:00
Thomas Lively	352e779087	Revert "[WebAssembly][InstrEmitter] Foundation for multivalue call lowering" Summary: This reverts commit 3ef169e586f4d14efe690c23c878d5aa92a80eb5. The purpose of this commit was to allow stack machines to perform instruction selection for instructions with variadic defs. However, MachineInstrs fundamentally cannot support variadic defs right now, so this change does not turn out to be useful. Depends on D73927. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73928	2020-02-04 20:04:59 -08:00
Matt Arsenault	bce6c219c5	AMDGPU: Correct memory size for image intrinsics This was incorrectly rounding up to the next power of 2. v4f32 was rounding up to v8f32, which was just wrong. There are also v3i16/v3f16 available in MVT, so we don't even need to round the f16 cases anymore. Additionally, this field is really an EVT so we don't even need to consider this. Also switch some asserts to return invalid. We should have an IR verifier for these intrinsic return types, but for now it's better to not assert on IR that passes the verifier. This should also probably be fixed to consider that dmask is really eliminating some of the loaded components.	2020-02-04 22:29:23 -05:00
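
Note on commit d766c964c9 above: its message describes an unsigned overflow of the form (unsigned)~0 + 5 = 4 when two duplication costs are summed, and says the fix checks each return value before summing. The following is a minimal Python sketch of that arithmetic only, not the actual JumpThreading.cpp code. It assumes a 32-bit unsigned type; the names UNSIGNED_MAX, naive_total_cost and checked_total_cost, and the convention of returning None for "cannot duplicate", are hypothetical and chosen for illustration.

```python
# Assumption: "unsigned" is 32 bits wide, so (unsigned)~0 == 0xFFFFFFFF.
UNSIGNED_MAX = 0xFFFFFFFF


def naive_total_cost(cost_a, cost_b):
    """Sum two costs with 32-bit wraparound, as unsigned arithmetic would."""
    return (cost_a + cost_b) & UNSIGNED_MAX


def checked_total_cost(cost_a, cost_b):
    """Check each cost for the "cannot duplicate" sentinel before summing.

    Returns None (a hypothetical convention) when either block cannot be
    duplicated, otherwise the plain sum of the two costs.
    """
    if cost_a == UNSIGNED_MAX or cost_b == UNSIGNED_MAX:
        return None
    return cost_a + cost_b


if __name__ == "__main__":
    # The sentinel plus a small cost wraps around to a small number.
    print(naive_total_cost(UNSIGNED_MAX, 5))
    # Checking first reports that the pair cannot be threaded.
    print(checked_total_cost(UNSIGNED_MAX, 5))
    # Two ordinary costs are summed as usual.
    print(checked_total_cost(3, 5))
```

With the naive sum, a block that cannot be duplicated looks cheap after wraparound, which is how the commit message says the transform was mistakenly judged safe and profitable. Testing each value against the sentinel first avoids that.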


191565 Commits