llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Craig Topper	a9839b29b1	[X86] Add separate intrinsics for scalar FMA4 instructions. Summary: These instructions zero the non-scalar part of the lower 128-bits which makes them different than the FMA3 instructions which pass through the non-scalar part of the lower 128-bits. I've only added fmadd because we should be able to derive all other variants using operand negation in the intrinsic header like we do for AVX512. I think there are still some missed negate folding opportunities with the FMA4 instructions in light of this behavior difference that I hadn't noticed before. I've split the tests so that we can use different intrinsics for scalar testing between the two. I just copied the tests, split the RUN lines, and changed out the scalar intrinsics. fma4-fneg-combine.ll is a new test to make sure we negate the fma4 intrinsics correctly, though there are a couple of TODOs in it. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39851 llvm-svn: 318984	2017-11-25 18:32:43 +00:00
Craig Topper	4e1ffd068c	[X86] Don't report gather is legal on Skylake CPUs when AVX2/AVX512 is disabled. Allow gather on SKX/CNL/ICL when AVX512 is disabled by using AVX2 instructions. Summary: This adds a new fast gather feature bit to cover all CPUs that support fast gather that we can use independent of whether the AVX512 feature is enabled. I'm only using this new bit to qualify AVX2 codegen. AVX512 is still implicitly assuming fast gather to keep tests working and to match the scatter behavior. Test command lines have been added for these two cases. Reviewers: magabari, delena, RKSimon, zvi Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40282 llvm-svn: 318983	2017-11-25 18:09:37 +00:00
Craig Topper	536fdc003e	[SelectionDAG] Remove some dead code from vector scalarizing Summary: Currently ScalarizeVecRes_SETCC checks for the result type being a vector and jumps to ScalarizeVecRes_VSETCC. But if we're scalarizing a vector result, aren't we guaranteed to be looking at a vector type? This patch deletes the current ScalarizeVecRes_SETCC and renames ScalarizeVecRes_VSETCC to ScalarizeVecRes_SETCC. Reviewers: RKSimon, arsenm, eladcohen, zvi Reviewed By: RKSimon Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D40452 llvm-svn: 318982	2017-11-25 17:59:00 +00:00
Andrew V. Tischenko	4884d8ff8f	Add BTVER2 sched support for SHLD/SHRD. Differential Revision: https://reviews.llvm.org/D40124 llvm-svn: 318977	2017-11-25 10:46:53 +00:00
Craig Topper	39099b078a	[X86] Simplify some code in combineSetCC. NFCI Make the condition for doing a std::swap simpler so we don't have to repeat the full checks. llvm-svn: 318970	2017-11-25 07:20:24 +00:00
Craig Topper	3afb989c68	[X86] Qualify some vector specific code with VT.isVector(). NFCI Other checks inside require a build_vector, but this lets us stop earlier and makes the code clearer. llvm-svn: 318969	2017-11-25 07:20:23 +00:00
Craig Topper	4da0b2efdd	[X86] Support folding to andnps with SSE1 only. With SSE1 only, we emit FAND and FXOR nodes for v4f32. llvm-svn: 318968	2017-11-25 07:20:22 +00:00
Craig Topper	9d3dddce37	[X86] Add some early DAG combines to turn v4i32 AND/OR/XOR into FAND/FOR/FXOR when only SSE1 is available. v4i32 isn't a legal type with sse1 only and would end up getting scalarized otherwise. This isn't completely ideal as it doesn't handle cases like v8i32 that would get split to v4i32. But it at least helps with code written using the clang intrinsic header. llvm-svn: 318967	2017-11-25 07:20:21 +00:00
Craig Topper	8d5c529d1f	Recommit r318963 "[APInt] Don't print debug messages from the APInt knuth division algorithm by default" The previous commit had the condition in the do/while backwards. Debug builds currently print out low level details of the Knuth division algorithm when -debug is used. This information isn't useful in most cases and just adds noise to the log. This adds a new preprocessor flag to enable the prints in the knuth division code in APInt. Differential Revision: https://reviews.llvm.org/D40404 llvm-svn: 318966	2017-11-24 20:29:04 +00:00
Craig Topper	44ad6e2d33	[X86] Prevent using X * rsqrt(X) to approximate sqrt when only sse1 is enabled. This optimization can occur after type legalization and emit a vselect with v4i32 type. But that type is not legal with sse1. This ultimately gets scalarized by the second type legalization that runs after vector op legalization, but that's really intended to handle the scalar types that might be introduced by legalizing vector ops. For now just stop this from happening by disabling the optimization with sse1. llvm-svn: 318965	2017-11-24 19:57:48 +00:00
Craig Topper	cf84134b0f	Revert 318963 "[APInt] Don't print debug messages from the APInt knuth division algorithm by default" I seem to have botched the logic when switching to push_macro llvm-svn: 318964	2017-11-24 19:32:34 +00:00
Craig Topper	5aed624232	[APInt] Don't print debug messages from the APInt knuth division algorithm by default Debug builds currently print out low level details of the Knuth division algorithm when -debug is used. This information isn't useful in most cases and just adds noise to the log. This adds a new preprocessor flag to enable the prints in the knuth division code in APInt. Differential Revision: https://reviews.llvm.org/D40404 llvm-svn: 318963	2017-11-24 19:13:24 +00:00
Simon Dardis	13522bc598	[CodeGenPrepare] Check that erased sunken addresses are not reused CodeGenPrepare sinks address computations from one basic block to another and attempts to reuse address computations that have already been sunk. If the same address computation appears twice with the first instance as an operand of a load whose result is an operand to a simplifiable select, CodeGenPrepare simplifies the select and recursively erases the now dead instructions. CodeGenPrepare then attempts to use the erased address computation for the second load. Fix this by erasing the cached address value if it has zero uses before looking for the address value in the sunken address map. This partially resolves PR35209. Thanks to Alexander Richardson for reporting the issue! This fixed version relands r318032 which was reverted in r318049 due to sanitizer buildbot failures. Reviewers: john.brawn Differential Revision: https://reviews.llvm.org/D39841 llvm-svn: 318956	2017-11-24 16:45:28 +00:00
Dmitry Preobrazhensky	77393a5f25	[AMDGPU][MC][GFX9] Added v_interp_p2_f16 and v_interp_p2_legacy_f16 See bug 33629: https://bugs.llvm.org//show_bug.cgi?id=33629 Reviewers: artem.tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D39488 llvm-svn: 318955	2017-11-24 15:37:14 +00:00
Dylan McKay	eebdc67e4c	[AVR] Use the short form of 'clr <reg>' r318895 made it so that the simpler instruction aliases are printed rather than their expanded form. llvm-svn: 318954	2017-11-24 15:36:43 +00:00
Benjamin Kramer	0b9cc4f48d	Make helpers static. NFC. llvm-svn: 318953	2017-11-24 14:55:41 +00:00
Javed Absar	da4536e29a	[SCEV] Simplify loop to range-loop. NFC. llvm-svn: 318952	2017-11-24 14:35:38 +00:00
John Brawn	74277962fb	[CGP] Make optimizeMemoryInst able to combine more kinds of ExtAddrMode fields This patch extends the recent work in optimizeMemoryInst to make it able to combine more ExtAddrMode fields than just the BaseReg. This fixes some benchmark regressions introduced by r309397, where GVN PRE is hoisting a getelementptr such that it can no longer be combined into the addressing mode of the load or store that uses it. Differential Revision: https://reviews.llvm.org/D38133 llvm-svn: 318949	2017-11-24 14:10:45 +00:00
Aleksandar Beserminji	0dd888baec	[mips] Set microMIPS ASE flag This patch fixes an issue where microMIPS ASE flag is not set when a function has micromips attribute or when .set micromips directive is used. Differential Revision: https://reviews.llvm.org/D40316 llvm-svn: 318948	2017-11-24 14:00:47 +00:00
Dmitry Preobrazhensky	7cd493f906	[AMDGPU][MC][GFX9] Added support of 'inst_offset' modifier for compatibility with SP3 See bug 35329: https://bugs.llvm.org//show_bug.cgi?id=35329 Reviewers: arsenm, vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D40350 llvm-svn: 318947	2017-11-24 13:22:38 +00:00
Benjamin Kramer	a5e422cf52	[YAMLParser] Fix unused variable warning. llvm-svn: 318936	2017-11-23 21:07:11 +00:00
Benjamin Kramer	dda77d3dd4	[YAMLParser] Don't crash on null keys in KeyValueNodes. Found by clangd-fuzzer! llvm-svn: 318935	2017-11-23 20:57:20 +00:00
Craig Topper	ccc25a5474	[X86] Don't invert NewCC variable while processing the jcc/setcc/cmovcc instructions in optimizeCompareInstr. The NewCC variable is calculated outside of the loop that processes jcc/setcc/cmovcc instructions. If we invert it during the loop it can cause an incorrect value to be used by a later iteration. Instead only read it during the loop and use a new variable to store the possibly inverted value. Fixes PR35399. llvm-svn: 318934	2017-11-23 19:25:45 +00:00
Craig Topper	65940a5482	[X86] Teach isel that X86ISD::CMPM_RND zeros the upper bits of the mask register. llvm-svn: 318933	2017-11-23 18:41:21 +00:00
Craig Topper	f3313bc309	[X86] Remove some unneeded opcodes from getVectorMaskingNode. NFC We never reach here with these opcodes. llvm-svn: 318932	2017-11-23 18:41:20 +00:00
Craig Topper	f8fdc05a80	[X86] Add X86ISD::CMPM_RND to getVectorMaskingNode to select ISD::AND instead of ISD::VSELECT A later DAG combine will turn the VSELECT into an AND, but we have the other mask compare opcodes here so add this one too. llvm-svn: 318931	2017-11-23 18:41:19 +00:00
Craig Topper	d44dce2a0b	[X86] Remove some dead code leftover from when i1 was a legal type. NFCI llvm-svn: 318930	2017-11-23 18:41:18 +00:00
Craig Topper	8f42585355	[X86] Remove some dead code. NFC AVX512 code never reaches here so we don't need to handle X86ISD::CMPM as an opcode. llvm-svn: 318929	2017-11-23 18:41:17 +00:00
Alexander Potapenko	500b98de47	MSan: remove an unnecessary cast. NFC for userspace instrumentation. llvm-svn: 318923	2017-11-23 15:06:51 +00:00
Simon Pilgrim	59a5ef9f69	[X86][SSE] Use (V)PHMINPOSUW for vXi16 SMAX/SMIN/UMAX/UMIN horizontal reductions (PR32841) (V)PHMINPOSUW determines the UMIN element in an v8i16 input, with suitable bit flipping it can also be used for SMAX/SMIN/UMAX cases as well. This patch matches vXi16 SMAX/SMIN/UMAX/UMIN horizontal reductions and reduces the input down to a v8i16 vector before calling (V)PHMINPOSUW. A later patch will use this for v16i8 reductions as well (PR32841). Differential Revision: https://reviews.llvm.org/D39729 llvm-svn: 318917	2017-11-23 13:50:27 +00:00
Diana Picus	39d919739c	[ARM GlobalISel] Support G_FDIV for s32 and s64 TableGen already generates code for selecting a G_FDIV, so we only need to add a test. For the legalizer and reg bank select, we do the same thing as for the other floating point binary operations: either mark as legal if we have a FP unit or lower to a libcall, and map to the floating point registers. llvm-svn: 318915	2017-11-23 13:26:07 +00:00
Ying Yi	83c2048bab	Reverted rL318911 since it broke the sanitizer-windows. llvm-svn: 318914	2017-11-23 13:23:21 +00:00
Ying Yi	c5a292bdd3	[lit] Implement non-pipelined ‘mkdir’, ‘diff’ and ‘rm’ commands internally Summary: The internal shell already supports 'cd', ‘export’ and ‘echo’ commands. This patch adds implementation of non-pipelined ‘mkdir’, ‘diff’ and ‘rm’ commands as the internal shell builtins. Reviewers: Zachary Turner, Reid Kleckner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39567 llvm-svn: 318911	2017-11-23 12:48:41 +00:00
Diana Picus	7fce27389c	[ARM GlobalISel] Support G_FMUL for s32 and s64 TableGen already generates code for selecting a G_FMUL, so we only need to add a test for that part. For the legalizer and reg bank select, we do the same thing as the other floating point binary operators: either mark as legal if we have a FP unit or lower to a libcall, and map to the floating point registers. llvm-svn: 318910	2017-11-23 12:44:20 +00:00
Simon Dardis	8327d5577b	[mips] Use the delay slot filler to convert branches for microMIPSR6. The MIPS delay slot filler converts delay slot branches into compact forms for the MIPS ISAs which support them. For branches that compare (in)equality with zero, it converts them into branches with implicit zero register operands. These branches have a slightly greater range than normal two register operand branches. Changing the branches at this point in the pipeline offers the long branch pass the ability to make better judgements if a long branch sequence is required. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D40314 llvm-svn: 318908	2017-11-23 12:38:04 +00:00
Coby Tayree	be83415853	[x86][icelake]BITALG 2/3 vpshufbitqmb encoding 3/3 vpshufbitqmb intrinsics Differential Revision: https://reviews.llvm.org/D40222 llvm-svn: 318904	2017-11-23 11:15:50 +00:00
Alexander Potapenko	87d48e130a	[MSan] Move the access address check before the shadow access for that address MSan used to insert the shadow check of the store pointer operand _after_ the shadow of the value operand has been written. This happens to work in the userspace, as the whole shadow range is always mapped. However in the kernel the shadow page may not exist, so the bug may cause a crash. This patch moves the address check in front of the shadow access. llvm-svn: 318901	2017-11-23 08:34:32 +00:00
George Rimar	c855b4aa3d	Revert r318822 "[llvm-tblgen] - Stop using std::string in RecordKeeper." It was reported to have problems with memory sanitizers and DBUILD_SHARED_LIBS=ON. llvm-svn: 318899	2017-11-23 06:52:44 +00:00
Max Kazantsev	21fb04396f	[IRCE][NFC] Add no wrap flags to no-wrapping SCEV calculation In a lambda where we expect to have the result within bounds, add the respective `nsw/nuw` flags to help SCEV in case it fails to figure them out on its own. Differential Revision: https://reviews.llvm.org/D40168 llvm-svn: 318898	2017-11-23 06:14:39 +00:00
Leslie Zhai	3a5189ac95	Add backend name to AVR Target to enable runtime info to be fed back into TableGen llvm-svn: 318895	2017-11-23 04:11:11 +00:00
Craig Topper	a7f1d93652	[X86] Turn an if condition that should always be true into an assert. NFCI If Values.size() == 0, we should have returned 0 or undef earlier. If it was 1, it's a splat and we already handled that too. llvm-svn: 318894	2017-11-23 03:24:01 +00:00
Craig Topper	31624db927	[X86] Remove unnecessary check for is128BitVector. NFC 256 and 512 bit vectors were picked off earlier in the function. Lots of code between there and here already assumed 128-bit vectors. llvm-svn: 318893	2017-11-23 03:24:00 +00:00
Craig Topper	32229a6174	[X86] Simplify some bitmasking and use llvm_unreachable to mark an impossible case. NFC llvm-svn: 318892	2017-11-23 03:23:59 +00:00
Craig Topper	b172abe0c6	[X86] Remove a ternary operator that can only ever be false. NFC We are checking for AVX512 in an SSE1 only block. llvm-svn: 318891	2017-11-23 03:23:58 +00:00
Yaxun Liu	56252e9b95	[NFC] CodeGen: Handle shift amount type in DAGTypeLegalizer::SplitInteger This patch reverts the change to X86TargetLowering::getScalarShiftAmountTy in rL318727 and moves the logic to DAGTypeLegalizer::SplitInteger. The reason is that getScalarShiftAmountTy returns a shift amount type that is suitable for common use cases in CodeGen. DAGTypeLegalizer::SplitInteger is a rare situation which requires a shift amount type larger than what getScalarShiftAmountTy returns. In this case, it is more reasonable to do special handling of the shift amount type in DAGTypeLegalizer::SplitInteger only. If similar situations arise, the logic may be moved to a separate function. Differential Revision: https://reviews.llvm.org/D40320 llvm-svn: 318890	2017-11-23 03:08:51 +00:00
David Blaikie	19377ebd74	Instrumentation.h: Remove dead/untested code for DFSan JIT support llvm-svn: 318887	2017-11-23 00:08:40 +00:00
Craig Topper	5c7b8f41e1	[X86] Regenerate the vector-popcnt and vector-tzcnt tests to get BITALG CHECK lines on all functions not just the vXi16/vXi8. llvm-svn: 318885	2017-11-22 23:35:12 +00:00
Evandro Menezes	b0258a4fdc	[AArch64] Adjust the cost model for Exynos M1 and M2 Fix the modeling of some loads and stores. llvm-svn: 318884	2017-11-22 22:48:50 +00:00
Fedor Sergeev	862e8fda82	IR printing improvement for loop passes Summary: Loop-pass printing is somewhat deficient since it does not provide the context around the loop (e.g. preheader). This context information becomes pretty essential when analyzing transformations that move stuff out of the loop. Extending printLoop to cover preheader and exit blocks (if any). Reviewers: sanjoy, silvas, weimingz Reviewed By: sanjoy Subscribers: apilipenko, skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D40246 llvm-svn: 318878	2017-11-22 20:59:53 +00:00
Krzysztof Parzyszek	e8d084d3f2	[Hexagon] Implement buildVector32 and buildVector64 as utility functions Change LowerBUILD_VECTOR to use those functions. This commit will temporarily affect constant vector generation (it will generate constant-extended values instead of non-extended combines), but the code for the general case should be better. The constant selection part will be fixed later. llvm-svn: 318877	2017-11-22 20:56:23 +00:00
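
Note on 59a5ef9f69 (r318917): the message says (V)PHMINPOSUW finds the UMIN element of a v8i16 input and that "suitable bit flipping" extends it to SMAX/SMIN/UMAX. The sketch below models that idea in Python and is not taken from the LLVM patch. The XOR masks, the function names, and the `phminposuw` stand-in (which returns only the minimum value, not its position) are assumptions made for illustration.

```python
# Hypothetical model: every name here is illustrative, not LLVM code.

def phminposuw(lanes):
    """Stand-in for the UMIN result over eight unsigned 16-bit lanes."""
    assert len(lanes) == 8 and all(0 <= x <= 0xFFFF for x in lanes)
    return min(lanes)

# XOR masks that map each ordering onto unsigned-minimum ordering (assumed).
MASKS = {
    "umin": 0x0000,  # already an unsigned minimum
    "umax": 0xFFFF,  # inverting all bits reverses unsigned order
    "smin": 0x8000,  # flipping the sign bit maps signed order to unsigned
    "smax": 0x7FFF,  # flipping the other bits reverses signed order
}

def reduce_v8i16(kind, lanes):
    """Flip the bits, take the unsigned minimum, then flip back."""
    mask = MASKS[kind]
    flipped = [x ^ mask for x in lanes]
    return phminposuw(flipped) ^ mask

def to_signed(x):
    """Reinterpret a 16-bit pattern as a two's-complement integer."""
    return x - 0x10000 if x & 0x8000 else x

lanes = [0x0001, 0xFFFF, 0x8000, 0x7FFF, 0x0000, 0x1234, 0xFFFE, 0x8001]
results = {kind: reduce_v8i16(kind, lanes) for kind in MASKS}
```

Each result can be checked against Python's own `min`/`max` applied to the lanes, using `to_signed` for the signed cases.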

157005 Commits