llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Clement Courbet	53fea2956b	[X86] Fix missing HasAVX512 predicates in avx512_sqrt_scalar. Summary: For example, VSQRTSDZr and VSQRTSSZr were missing the predicate. Also fix braces indentation and braces for consistency. Reviewers: craig.topper, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41983 llvm-svn: 322478	2018-01-15 12:05:33 +00:00
Pavel Labath	fe0ecce9d4	[Support] Remove MemoryBuffer::getNewMemBuffer all callers have been switched to the Writable version (which does not require const_casting to be useful). llvm-svn: 322475	2018-01-15 11:03:30 +00:00
Benjamin Kramer	c98b7c0b21	Revert "[DAG] Elide overlapping stores" This reverts commit r322085. Internal PPC testing is still showing the same symptoms as when this patch landed the last time. llvm-svn: 322474	2018-01-15 10:57:24 +00:00
Andrei Elovikov	eebe9ed57e	[LV] Don't call recordVectorLoopValueForInductionCast for newly-created IV from a trunc. Summary: This method is supposed to be called for IVs that have casts in their use-def chains that are completely ignored after vectorization under PSE. However, for truncates of such IVs the same InductionDescriptor is used during creation/widening of both original IV based on PHINode and new IV based on TruncInst. This leads to unintended second call to recordVectorLoopValueForInductionCast with a VectorLoopVal set to the newly created IV for a trunc and causes an assert due to attempt to store new information for already existing entry in the map. This is wrong and should not be done. Fixes PR35773. Reviewers: dorit, Ayal, mssimpso Reviewed By: dorit Subscribers: RKSimon, dim, dcaballe, hsaito, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41913 llvm-svn: 322473	2018-01-15 10:56:07 +00:00
Mikael Holmen	455a18e971	[GlobalsAA] Don't let dbg intrinsics affect analysis result Summary: This fixes PR35899. Debug info intrinsics shouldn't affect code generation so ignore them in GlobalsAA. Reviewers: hfinkel, aprantl Reviewed By: aprantl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D41984 llvm-svn: 322470	2018-01-15 07:05:51 +00:00
Max Kazantsev	55ddbc8a8a	[NFC] Fix comment to adjust to reality llvm-svn: 322468	2018-01-15 05:44:43 +00:00
Davide Italiano	1e4933df39	[BasicAA] Stop crashing when dealing with pointers > 64 bits. An alternative (and probably better) fix would be that of making `Scale` an APInt, and there's a patch floating around to do this. As we're still discussing it, at least stop crashing in the meanwhile (added bonus, we now have a regression test for this situation). Fixes PR35843. Thanks to Eli for suggesting the fix and Simon for reporting and reducing the bug. llvm-svn: 322467	2018-01-15 01:40:18 +00:00
Craig Topper	90b0c61a22	[X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations Summary: This patch changes the kunpck intrinsic autoupgrade to use vXi1 shufflevector operations to perform vector extracts and concats. This more closely matches the definition of the kunpck instructions. Currently we rely on a DAG combine to turn the scalar shift/and/or code into a concat vectors operation. By doing it in the IR we get this for free. Reviewers: spatel, RKSimon, zvi, jina.nahias Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42018 llvm-svn: 322462	2018-01-14 19:24:10 +00:00
Simon Pilgrim	6556cecd6b	[X86][SSE] Support combining MOVLHPS undef inputs llvm-svn: 322459	2018-01-14 18:50:34 +00:00
Sanjay Patel	4430ada396	[InstSimplify] fix code comments; NFC llvm-svn: 322456	2018-01-14 15:58:18 +00:00
Craig Topper	7095915282	[X86] Use ISD::TRUNCATE instead of X86ISD::VTRUNC when input and output types have the same number of elements. llvm-svn: 322455	2018-01-14 08:11:36 +00:00
Craig Topper	f1f9a8c7f7	[X86] Add X86ISD::VTRUNC to computeKnownBitsForTargetNode. We have to take special care to avoid the cases where the result of the truncate would be padded with zero elements. Ideally we'd just use ISD::TRUNCATE for these cases instead. llvm-svn: 322454	2018-01-14 08:11:33 +00:00
Craig Topper	767d0f3bfe	[X86] Improve legalization of vXi16/vXi8 selects. Extend vXi1 conditions of vXi8/vXi16 selects even before type legalization gets a chance to split wide vectors. Previously we would only extend 128 and 256 bit vectors. But if we start with a 512 bit vector or wider that needs to be split we wouldn't extend until after the split had taken place. By extending early we improve the results of type legalization. Don't widen condition of 128/256 bit vXi16/vXi8 selects when we have BWI but not VLX. We can still use a mask register by widening the select to 512-bits instead. This is similar to what we do for compares already. llvm-svn: 322450	2018-01-14 02:05:51 +00:00
Zvi Rackover	441282dd8c	X86: Add pattern matching for PMADDWD In addition to the existing match as part of a loop-reduction, add a straightforward pattern match for DAG-contained patterns. Reviewers: RKSimon, craig.topper Subscribers: llvm-commits Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D41811 llvm-svn: 322446	2018-01-13 17:42:19 +00:00
Sanjay Patel	ae40514007	[InstSimplify] fold implied null ptr check (PR35790) This extends rL322327 to handle the pointer cast and should solve: https://bugs.llvm.org/show_bug.cgi?id=35790 Name: or_eq_zero %isnull = icmp eq i64* %p, null %x = ptrtoint i64* %p to i64 %somebits = and i64 %x, %y %somebits_are_zero = icmp eq i64 %somebits, 0 %or = or i1 %somebits_are_zero, %isnull => %or = %somebits_are_zero Name: and_ne_zero %isnotnull = icmp ne i64* %p, null %x = ptrtoint i64* %p to i64 %somebits = and i64 %x, %y %somebits_are_not_zero = icmp ne i64 %somebits, 0 %and = and i1 %somebits_are_not_zero, %isnotnull => %and = %somebits_are_not_zero https://rise4fun.com/Alive/CQ3 llvm-svn: 322439	2018-01-13 15:44:44 +00:00
Craig Topper	297e87e001	[X86] Add DAG combine to promote vXi1 result of a vXi8/vXi16 setcc when we have AVX512 but not BWI. This avoids having the result type stick around until lowering where we have to extend the setcc and insert a truncate. If we get the types converted early we can do more to optimize it. llvm-svn: 322432	2018-01-13 06:24:46 +00:00
Evgeniy Stepanov	185ee8f832	[hwasan] An LLVM flag to disable stack tag randomization. Summary: Necessary to achieve consistent test results. Reviewers: kcc, alekseyshl Subscribers: kubamracek, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42023 llvm-svn: 322429	2018-01-13 01:32:15 +00:00
Jessica Paquette	e467b14843	[MachineOutliner] Move hasAddressTaken check to MachineOutliner.cpp Mostly NFC. Still updating the test though just for completeness. This moves the hasAddressTaken check to MachineOutliner.cpp and replaces it with a per-basic block test rather than a per-function test. The old test was too conservative and was preventing functions in C programs from being outlined even though they were safe to outline. This was mostly a problem in C sources. llvm-svn: 322425	2018-01-13 00:42:28 +00:00
Tim Renouf	b1963b408e	[AMDGPU] stop image_store being moved illegally Summary: A recent change 321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores can allow the machine instruction scheduler to move an image store past an image load using the same descriptor. V2: Fixed by marking image ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. V3: Reverted test change done on 321556. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D41969 llvm-svn: 322419	2018-01-12 22:57:24 +00:00
Daniel Neilson	b58a2de2b4	[NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant Summary: In preparation for https://reviews.llvm.org/D41675 this NFC changes this prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead of a Constant. llvm-svn: 322403	2018-01-12 21:33:37 +00:00
Changpeng Fang	14b06e6060	AMDGPU/SI: Add d16 support for buffer intrinsics. Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402	2018-01-12 21:12:19 +00:00
Brian M. Rzycki	504eb62dfb	[JumpThreading] Preservation of DT and LVI across the pass Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perform the preservation was minimally altered and simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements such as threading across loop headers. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: mgorny, dmgreen, kuba, rnk, rsmith, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 322401	2018-01-12 21:06:48 +00:00
Florian Hahn	86e5f6fc56	Silence GCC 7 warning by using an enum class. This silences the following GCC7 warning: lib/Target/Hexagon/HexagonISelDAGToDAGHVX.cpp:142:30: warning: enumeral and non-enumeral type in conditional expression [-Wextra] return F != Colors.end() ? F->second : None; ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~ Reviewers: amharc, RKSimon, davide Reviewed By: RKSimon, davide Differential Revision: https://reviews.llvm.org/D41003 llvm-svn: 322398	2018-01-12 20:35:45 +00:00
Rui Ueyama	18f10e166c	Remove ELFDataTypeTypedefHelper class. Differential Revision: https://reviews.llvm.org/D41973 llvm-svn: 322395	2018-01-12 19:59:43 +00:00
Evandro Menezes	c8be5be9f8	[AArch64] Fix scheduling resources for post indexed loads and stores Fix typos in the default scheduling resources when using the post indexed addressing modes. Differential revision: https://reviews.llvm.org/D40511 llvm-svn: 322392	2018-01-12 19:20:11 +00:00
Paul Robinson	f9fda8e2e4	[DWARFv5] CodeGen support for MD5 file checksums Pass MD5 checksums through from IR to assembly/object files. After this, getting Clang to compute the MD5 should be the last step to supporting MD5 in the DWARF v5 line table header. Differential Revision: https://reviews.llvm.org/D41926 llvm-svn: 322391	2018-01-12 19:17:50 +00:00
Sam Clegg	3c7181ca32	MC: Remove redundant `SetUsed` arguments in MCSymbol methods We can probably take this a step further since the only user of the isUsed flag is AsmParser it should probably be doing this explicitly. For now this is a step in the right direction though. Differential Revision: https://reviews.llvm.org/D41971 llvm-svn: 322386	2018-01-12 18:05:40 +00:00
Craig Topper	aa0ccaf5cd	[X86] Remove unused isel pattern for zero extend from v16i1/v8i1 to v16i32/v8i64. We have custom lowering on vzext that produces a vselect and a build vector. So zext never gets to isel. llvm-svn: 322381	2018-01-12 17:34:09 +00:00
Rafael Espindola	2ce48efc54	Allow dso_local on ifunc. It was never fully disallowed. We were rejecting it in the asm parser, but not in the verifier. Currently TargetMachine::shouldAssumeDSOLocal returns true for hidden ifuncs. I considered changing it and moving the check from the asm parser to the verifier. The reason for deciding to allow it instead is that all linkers handle a direct reference just fine. They use the plt address as the address of the function. In fact doing that means that clang doesn't have the same bug as gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83782. This patch then removes the check from the asm parser and updates the bitcode reader and writer. llvm-svn: 322378	2018-01-12 17:03:43 +00:00
Benjamin Kramer	ebec464532	[PowerPC] Don't miscompile rotate+mask into an ANDIo if it can't recreate the immediate I'm not even sure if this transform is ever worth it, but this at least stops the bleeding. llvm-svn: 322373	2018-01-12 15:03:24 +00:00
Nemanja Ivanovic	b728e61465	[PowerPC] Zero-extend the compare operand for ATOMIC_CMP_SWAP Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812. This patch ensures that the compare operand for the atomic compare and swap is properly zero-extended to 32 bits if applicable. A follow-up commit will fix the extension for the SETCC node generated when expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix. Differential Revision: https://reviews.llvm.org/D41856 llvm-svn: 322372	2018-01-12 14:58:41 +00:00
Stefan Pintilie	76ff3e6f26	Revert "[PowerPC] Manually schedule the prologue and epilogue" This reverts commit r322124 since some tests were broken by that patch. Will recommit once the patch is fixed. llvm-svn: 322369	2018-01-12 13:12:49 +00:00
Diana Picus	cdc9563ac8	[ARM GlobalISel] Map G_FMA to FPR llvm-svn: 322367	2018-01-12 12:06:01 +00:00
Diana Picus	1ca302b5d1	[ARM GlobalISel] Legalize G_FMA For hard float with VFP4, it is legal. Otherwise, we use libcalls. This needs a bit of support in the LegalizerHelper for soft float because we didn't handle G_FMA libcalls yet. The support is trivial, as the only difference between G_FMA and other libcalls that we already handle is that it has 3 input operands rather than just 2. llvm-svn: 322366	2018-01-12 11:30:45 +00:00
Max Kazantsev	eca4559f29	[IRCE][NFC] Make range check's End a non-null SCEV Currently, IRC contains `Begin` and `Step` as SCEVs and `End` as value. Aside from that, `End` can also be `nullptr` which can be later conditionally converted into a non-null SCEV. To make this logic more transparent, this patch makes `End` a SCEV and calculates it early, so that it is never a null. Differential Revision: https://reviews.llvm.org/D39590 llvm-svn: 322364	2018-01-12 10:00:26 +00:00
Andre Vieira	cf5af96d9c	[ARM] Add codegen for SMMULR, SMMLAR and SMMLSR This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR instructions from equivalent IR patterns. Differential Revision: https://reviews.llvm.org/D41775 llvm-svn: 322361	2018-01-12 09:24:41 +00:00
Andre Vieira	a1f92109e8	[ARM] Fix erroneous availability of SMMLS for Armv7-M Differential Revision: https://reviews.llvm.org/D41855 llvm-svn: 322360	2018-01-12 09:21:09 +00:00
Serguei Katkov	9a70c54302	[CGP] Re-enable Select in complex addressing mode Re-enable Select after a couple of fixes. Differential Revision: https://reviews.llvm.org/D40634 llvm-svn: 322358	2018-01-12 08:33:34 +00:00
Serguei Katkov	444df39ddc	[LoopDeletion] Handle users in unreachable block This is a fix for PR35884. When we want to delete dead loop we must clean uses in unreachable blocks otherwise we'll get an assert during deletion of instructions from the loop. Reviewers: anna, davide Reviewed By: anna Subscribers: llvm-commits, lebedev.ri Differential Revision: https://reviews.llvm.org/D41943 llvm-svn: 322357	2018-01-12 07:24:43 +00:00
Craig Topper	46f6d0692f	[X86] Don't allow lods/stos/scas/cmps/movs to be parsed without a suffix and only memory operand in at&t syntax. Without a register with a size being mentioned the instruction is ambiguous in at&t syntax. With Intel syntax the memory operation carries a size that can be used to disambiguate. llvm-svn: 322356	2018-01-12 06:48:26 +00:00
Craig Topper	67c7573d58	[X86] Don't require suffix on 'clr' mnemonic in intel syntax llvm-svn: 322355	2018-01-12 06:48:24 +00:00
Craig Topper	2e17caf317	[X86] Add 'l' and 'q' suffixes to the tbm instruction mnemonics. While the suffix isn't required to disambiguate the instructions, it is required in order to parse the instructions when the suffix is specified in order to match the GNU assembler. llvm-svn: 322354	2018-01-12 06:21:36 +00:00
Craig Topper	6509d82de1	[X86] Disable sldtq parsing in 64-bit mode. llvm-svn: 322353	2018-01-12 05:38:15 +00:00
Craig Topper	2a983cee68	[X86] Disable movsq/stosq/scasq/cmpsq/lodsq parsing in 64-bit mode. llvm-svn: 322352	2018-01-12 05:38:14 +00:00
Rui Ueyama	45248cfb84	Instead of ELFFile<ELFT>::Type, use ELFT::Type. NFC. llvm-svn: 322346	2018-01-12 02:28:31 +00:00
Ana Pazos	0b39f93ebd	[RISCV] Pass MCSubtargetInfo to print methods. Summary: This change allows checking for ISA extensions in print methods. Reviewers: asb, niosHD Reviewed By: asb, niosHD Subscribers: llvm-commits, niosHD, asb, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal Differential Revision: https://reviews.llvm.org/D41503 llvm-svn: 322345	2018-01-12 02:27:00 +00:00
Sam Clegg	f220eca523	[WebAssembly] Don't allow functions to be named twice The spec doesn't allow this. Differential Revision: https://reviews.llvm.org/D41974 llvm-svn: 322343	2018-01-12 02:11:31 +00:00
Lang Hames	0857f0fe49	[ORC] Add a stub ExecutionSession and VModuleKey type. ExecutionSession will represent a running JIT program. VModuleKey is a unique key assigned to each module added as part of an ExecutionSession. The Layer concept will be updated in future to require a VModuleKey when a module is added. llvm-svn: 322336	2018-01-12 00:22:05 +00:00
David L. Jones	3c7677b5c6	Revert r322279 due to Skylake miscompile. Summary: This revision causes Skylake (and apparently, only Skylake) codegen to fail in certain cases. Details: https://bugs.llvm.org/show_bug.cgi?id=35918 Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D41972 llvm-svn: 322335	2018-01-12 00:17:38 +00:00
Sam Clegg	c35b211f46	[WebAssembly] MC: Remove SetUsed argument when calling MCSymbol::isDefined et al Summary: This argument (the isUsed flag) seems to only be relevant when parsing. Other calls sites such as these don't seem to ever use it. Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish Differential Revision: https://reviews.llvm.org/D41970 llvm-svn: 322332	2018-01-11 23:59:16 +00:00
Sanjay Patel	cbad481c69	[InstSimplify] fold implied cmp with zero (PR35790) This doesn't handle the more complicated case in the bug report yet: https://bugs.llvm.org/show_bug.cgi?id=35790 For that, we have to match / look through a cast. llvm-svn: 322327	2018-01-11 23:27:37 +00:00
Matthias Braun	e57153960c	PeepholeOpt cleanup/refactor; NFC - Less unnecessary use of `auto` - Add early `using RegSubRegPair(AndIdx) =` to avoid countless `TargetInstrInfo::` qualifications. - Use references instead of pointers where possible. - Remove unused parameters. - Rewrite the CopyRewriter class hierarchy: - Pull out uncoalescable copy rewriting functionality into PeepholeOptimizer class. - Use an abstract base class to make it clear that rewriters are independent. - Remove unnecessary \brief in doxygen comments. - Remove unused constructor and method from ValueTracker. - Replace UseAdvancedTracking of ValueTracker with DisableAdvCopyOpt use. llvm-svn: 322325	2018-01-11 22:59:33 +00:00
Evgeniy Stepanov	4e2f26d080	[hwasan] Stack instrumentation. Summary: Very basic stack instrumentation using tagged pointers. Tag for N'th alloca in a function is built as XOR of: * base tag for the function, which is just some bits of SP (poor man's random) * small constant which is a function of N. Allocas are aligned to 16 bytes. On every ReturnInst allocas are re-tagged to catch use-after-return. This implementation has a bunch of issues that will be taken care of later: 1. lifetime intrinsics referring to tagged pointers are not recognized in SDAG. This effectively disables stack coloring. 2. Generated code is quite inefficient. There is one extra instruction at each memory access that adds the base tag to the untagged alloca address. It would be better to keep tagged SP in a callee-saved register and address allocas as an offset of that XOR retag, but that needs better coordination between hwasan instrumentation pass and prologue/epilogue insertion. 3. Lifetime intrinsics are ignored and use-after-scope is not implemented. This would be harder to do than in ASan, because we need to use a differently tagged pointer depending on which lifetime.start / lifetime.end the current instruction is dominated / post-dominated. Reviewers: kcc, alekseyshl Subscribers: srhines, kubamracek, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41602 llvm-svn: 322324	2018-01-11 22:53:30 +00:00
Matthias Braun	63dd47f323	PeepholeOptimizer: Fix for vregs without defs The PeepholeOptimizer would fail for vregs without a definition. If this was caused by an undef operand abort to keep the code simple (so we don't need to add logic everywhere to replicate the undef flag). Differential Revision: https://reviews.llvm.org/D40763 llvm-svn: 322319	2018-01-11 22:30:43 +00:00
Rafael Espindola	3457994310	Make internal/private GVs implicitly dso_local. While updating clang tests for having clang set dso_local I noticed that: - There are a lot of tests to update. - Many of the updates are redundant. They are redundant because a GV is "obviously dso_local". This patch starts formalizing that a bit by requiring that internal and private GVs be dso_local too. Since they all are, we don't have to print dso_local to the textual representation, making it a bit more compact and easier to read. llvm-svn: 322317	2018-01-11 22:15:05 +00:00
Paul Robinson	c53ec0941d	Tighten up DIFile verifier for checksums Differential Revision: https://reviews.llvm.org/D41965 llvm-svn: 322314	2018-01-11 22:03:43 +00:00
Matthias Braun	93bdc4fdbc	PeepholeOptimizer: Do not form PHI with subreg arguments When replacing a PHI the PeepholeOptimizer currently takes the register class of the register at the first operand. This however is not correct if this argument has a subregister index. As there is currently no API to query the register class resulting from applying a subregister index to all registers in a class, we can only abort in these cases and not perform the transformation. This changes findNextSource() to require the end of all copy chains to not use a subregister if there is any PHI in the chain. I had to rewrite the overly complicated inner loop there to have a good place to insert the new check. This fixes https://llvm.org/PR33071 (aka rdar://32262041) Differential Revision: https://reviews.llvm.org/D40758 llvm-svn: 322313	2018-01-11 21:57:03 +00:00
Evgeniy Stepanov	963f1bbe50	[arm] Implement Target Operand Flag MIR serialization. Reviewers: efriedma, pcc Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D39975 llvm-svn: 322312	2018-01-11 21:37:58 +00:00
Fiona Glaser	c5f9051b0c	[Sink] Really really fix predicate in legality check LoadInst isn't enough; we need to include intrinsics that perform loads too. All side-effecting intrinsics and such are already covered by the isSafe check, so we just need to care about things that read from memory. D41960, originally from D33179. llvm-svn: 322311	2018-01-11 21:28:57 +00:00
Sam Clegg	f07600c58f	[WebAssembly] MC: Don't write COMDAT symbols as global imports This was causing undefined references at link time in lld. Differential Revision: https://reviews.llvm.org/D41959 llvm-svn: 322309	2018-01-11 20:35:17 +00:00
Craig Topper	06f7bf5c4f	[X86] Legalize 128/256 gathers/scatters on KNL by using widening rather than sign extending the index. We can just widen the vectors with undef and zero extend the mask. llvm-svn: 322308	2018-01-11 19:38:30 +00:00
Adrian Prantl	0d633f4b9f	dag-combine: Transfer debug information when folding (zext (truncate x)) -> (zext (truncate x)) This patch adds debug info support to the dagcombine rule (zext (truncate x)) -> (zext (truncate x)). Differential Revision: https://reviews.llvm.org/D41924 llvm-svn: 322304	2018-01-11 18:35:12 +00:00
Krzysztof Parzyszek	c5793e99d9	[Hexagon] Fix building 64-bit vector from constant values The constants were aggregated in a reverse order. llvm-svn: 322303	2018-01-11 18:30:41 +00:00
Krzysztof Parzyszek	97851e9d20	[Hexagon] Cast elements to correct type when creating constant vector llvm-svn: 322301	2018-01-11 18:03:23 +00:00
Zvi Rackover	0f570e2658	DAGCombine: Let truncates negate extension through extract-subvector Summary: Fold cases such as: (v8i8 truncate (v8i32 extract_subvector (v16i32 sext (v16i8 V), Idx))) -> (v8i8 extract_subvector (v16i8 V), Idx) This can be generalized to cases where the truncate and extend do not fully cancel each other out, but it may require querying the target about profitability. Reviewers: RKSimon, craig.topper, spatel, efriedma Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41927 llvm-svn: 322300	2018-01-11 18:02:33 +00:00
Krzysztof Parzyszek	c82c443401	[Hexagon] Impose limits on container sizes in HexagonGenInsert With over 300k virtual registers, the size of the data exceeded 12GB. Impose limits on how much information is collected. llvm-svn: 322299	2018-01-11 18:02:13 +00:00
Krzysztof Parzyszek	bd2b662745	[Hexagon] Use SetVector when queuing nodes to scan in selectVectorConstants llvm-svn: 322298	2018-01-11 17:59:34 +00:00
Zvi Rackover	94be74ee9c	X86: Refactor type-splitting to target-legal size vector to a helper function Summary: This is a preparatory step for D41811: refactoring code for breaking vector operands of binary operation to legal-types. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41925 llvm-svn: 322296	2018-01-11 17:29:47 +00:00
Joel Jones	2e4646a322	[AArch64] Remove Unsupported = 1 flag for the WriteAtomic WriteRes. In practice, this patch has no effect on scheduling. There is no test case as there already exists a comprehensive test case for LSE Atomics. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D40694 llvm-svn: 322291	2018-01-11 16:50:56 +00:00
Benjamin Kramer	e92669457e	[InstCombine] Apply the fix from r322284 for sin / cos -> tan too llvm-svn: 322285	2018-01-11 15:33:21 +00:00
Benjamin Kramer	758ad7ba4d	[InstCombine] For cos/sin -> tan copy attributes from cos instead of the parent function Ideally we should merge the attributes from the functions somehow, but this is obviously an improvement over taking random attributes from the caller which will trip up the verifier if they're nonsensical for a unary intrinsic call. llvm-svn: 322284	2018-01-11 15:19:02 +00:00
Sanjay Patel	384913118b	[ValueTracking] recognize min/max-of-min/max with notted ops (PR35875) This was originally planned as the fix for: https://bugs.llvm.org/show_bug.cgi?id=35834 ...but simpler transforms handled that case, so I implemented a lesser solution. It turns out we need to handle the case with 'not' ops too because the real code example that we are trying to solve: https://bugs.llvm.org/show_bug.cgi?id=35875 ...has extra uses of the intermediate values, so we can't rely on smaller canonicalizations to get us to the goal. As with rL321672, I've tried to show every possibility in the codegen tests because that's the simplest way to prove we're doing the right thing in the wide variety of permutations of this pattern. We can also show an InstCombine win because we added a fold for this case in: rL321998 / D41603 An Alive proof for one variant of the pattern to show that the InstCombine and codegen results are correct: https://rise4fun.com/Alive/vd1 Name: min3_nots %nx = xor i8 %x, -1 %ny = xor i8 %y, -1 %nz = xor i8 %z, -1 %cmpxz = icmp slt i8 %nx, %nz %minxz = select i1 %cmpxz, i8 %nx, i8 %nz %cmpyz = icmp slt i8 %ny, %nz %minyz = select i1 %cmpyz, i8 %ny, i8 %nz %cmpyx = icmp slt i8 %y, %x %r = select i1 %cmpyx, i8 %minxz, i8 %minyz => %cmpxyz = icmp slt i8 %minxz, %ny %r = select i1 %cmpxyz, i8 %minxz, i8 %ny Name: min3_nots_alt %nx = xor i8 %x, -1 %ny = xor i8 %y, -1 %nz = xor i8 %z, -1 %cmpxz = icmp slt i8 %nx, %nz %minxz = select i1 %cmpxz, i8 %nx, i8 %nz %cmpyz = icmp slt i8 %ny, %nz %minyz = select i1 %cmpyz, i8 %ny, i8 %nz %cmpyx = icmp slt i8 %y, %x %r = select i1 %cmpyx, i8 %minxz, i8 %minyz => %xz = icmp sgt i8 %x, %z %maxxz = select i1 %xz, i8 %x, i8 %z %xyz = icmp sgt i8 %maxxz, %y %maxxyz = select i1 %xyz, i8 %maxxz, i8 %y %r = xor i8 %maxxyz, -1 llvm-svn: 322283	2018-01-11 15:13:47 +00:00
Simon Pilgrim	f195f0dd08	[X86][SSE] Add ISD::VECTOR_SHUFFLE to faux shuffle decoding Primarily, this allows us to use the aggressive extraction mechanisms in combineExtractWithShuffle earlier and make use of UNDEF elements that may be lost during lowering. llvm-svn: 322279	2018-01-11 14:25:18 +00:00
Jonas Paulsson	66b86c128d	[VectorLegalizer] Remove broken code in ExpandStore. The code that is supposed to "Round odd types to the next pow of two" seems broken and as well completely unused (untested). It also seems that ExpandStore really shouldn't ever change the memory VT, which this in fact does. As a first step in fixing the broken handling of vector stores (of irregular types, e.g. an i1 vector), this code is removed. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=35520. Review: Eli Friedman llvm-svn: 322275	2018-01-11 13:03:21 +00:00
Zvi Rackover	2024c53178	X86: Fix LowerBUILD_VECTORAsVariablePermute for case Src is smaller than Indices Summary: As RKSimon suggested in pr35820, in the case that Src is smaller in bit-size than Indices, need to widen Src to avoid type mismatch. Fixes pr35820 Reviewers: RKSimon, craig.topper Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41865 llvm-svn: 322272	2018-01-11 12:26:52 +00:00
Alex Bradbury	132ad3d36c	[RISCV] Reserve an emergency spill slot for the register scavenger when necessary Although the register scavenger can often find a spare register, an emergency spill slot is needed to guarantee success. Reserve this slot in cases where the function is known to have a large stack (meaning the scavenger may be needed when forming stack addresses). llvm-svn: 322269	2018-01-11 11:17:19 +00:00
Andrew V. Tischenko	020c50b67a	Implementation of X86Operand::print. Differential Revision: https://reviews.llvm.org/D41610 llvm-svn: 322267	2018-01-11 10:31:01 +00:00
Stefan Maksimovic	e61ac366bf	[Mips] Handle one byte unsupported relocations Fail gracefully instead of crashing upon encountering this type of relocation. Differential revision: https://reviews.llvm.org/D41857 llvm-svn: 322266	2018-01-11 10:07:47 +00:00
Craig Topper	0b2f3f6e7a	[X86] Fix unused variable in release builds. llvm-svn: 322262	2018-01-11 07:19:29 +00:00
Aaron Smith	34143bb819	[CodeView] Fix the type for a variadic argument Summary: - MSVC uses the none type for a variadic argument in CodeView - Add a unit test Reviewers: zturner, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D41931 llvm-svn: 322257	2018-01-11 06:42:11 +00:00
Dmitry Venikov	ad27a6a17a	[InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x) Summary: This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag Reviewers: hfinkel, spatel Reviewed By: spatel Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits Differential Revision: https://reviews.llvm.org/D41286 llvm-svn: 322255	2018-01-11 06:33:00 +00:00
Craig Topper	119ce7b99a	[X86] Optimize v2i32/v2f32 scatters. If the index is v2i64 we can use the scatter instruction that has v4i32/v4f32 data register, v2i64 index, and v2i1 mask. Similar was already done for gather. Implement custom widening for v2i32 data to remove the code that reverses type legalization during lowering. llvm-svn: 322254	2018-01-11 06:31:28 +00:00
Wolfgang Pieb	4698ad4a47	[DWARF][NFC] Overload AsmPrinter::emitDwarfStringOffsets() to take a DwarfStringPoolEntry record. Differential Revision: https://reviews.llvm.org/D41920 llvm-svn: 322250	2018-01-11 02:35:00 +00:00
Marcello Maggioni	f51ea3e080	[NFC] Commit to mention that r322248 is actually made by AndrewScheidecker llvm-svn: 322249	2018-01-11 02:06:28 +00:00
Marcello Maggioni	a739a91c59	[SimplifyCFG] Add cut-off for InitializeUniqueCases. The function can take a significant amount of time on some complicated test cases, but for the only current use of the function we can stop the initialization much earlier when we find out we are going to discard the result anyway in the caller of the function. Adding configurable cut-off points so that we avoid wasting time. NFCI. llvm-svn: 322248	2018-01-11 02:01:16 +00:00
Matthias Braun	0e3f9cfb8b	Revert "AArch64: Fix emergency spillslot being out of reach for large callframes" Revert for now as the testcase is hitting a pre-existing verifier error that manifests as a failure when expensive checks are enabled (or -verify-machineinstrs is used). This reverts commit r322200. llvm-svn: 322231	2018-01-10 22:36:28 +00:00
Matthias Braun	f103029d0d	LiveRangeEdit: Inline markDeadRemat() into only user; NFC This function was only called from a single place in which we didn't even need the `if (DeadRemats)` check. llvm-svn: 322230	2018-01-10 22:36:26 +00:00
Craig Topper	ee1a3c5643	[X86] Move HasNOPL to a subtarget feature bit. Plumb MCSubtargetInfo through the MCAsmBackend constructor After D41349, we can now get a MCSubtargetInfo into the MCAsmBackend constructor. This allows us to get NOPL from a subtarget feature rather than a CPU name blacklist. Differential Revision: https://reviews.llvm.org/D41721 llvm-svn: 322227	2018-01-10 22:07:16 +00:00
Matthias Braun	798a2b9e65	LiveRangeEdit: Simplify code; NFC Simplify the code slightly: Instead of creating empty subranges in one case and immediately removing them, do not create them in the first place. llvm-svn: 322226	2018-01-10 21:41:02 +00:00
Alex Bradbury	7ca90a551a	[RISCV] Implement support for the BranchRelaxation pass Branch relaxation is needed to support branch displacements that overflow the instruction's immediate field. Differential Revision: https://reviews.llvm.org/D40830 llvm-svn: 322224	2018-01-10 21:05:07 +00:00
Matthias Braun	8d679a50b7	TargetLoweringBase: The ios simulator has no bzero function. Make sure I really get back to the behavior before my rewrite in r321035 which turned out not to be completely NFC as I changed the behavior for the ios simulator environment. llvm-svn: 322223	2018-01-10 20:49:57 +00:00
Alex Bradbury	1534210353	[RISCV] Implement branch analysis This is a prerequisite for the branch relaxation pass, and allows a number of optimisation passes (e.g. BranchFolding and MachineBlockPlacement) to work. Differential Revision: https://reviews.llvm.org/D40808 llvm-svn: 322222	2018-01-10 20:47:00 +00:00
Alex Bradbury	272abbd2f8	[RISCV] Add support for llvm.{frameaddress,returnaddress} intrinsics llvm-svn: 322218	2018-01-10 20:12:00 +00:00
Alex Bradbury	22e7f0dc96	[RISCV] Add basic support for inline asm constraints llvm-svn: 322217	2018-01-10 20:05:09 +00:00
Alex Bradbury	7d25eb9809	[RISCV] Support stack frames and offsets up to 32-bits Differential Revision: https://reviews.llvm.org/D40807 llvm-svn: 322216	2018-01-10 19:53:46 +00:00
Alex Bradbury	25d739d334	[RISCV] Support for varargs Includes support for expanding va_copy. Also adds support for using 'aligned' registers when necessary for vararg calls, and ensure the frame pointer always points to the bottom of the vararg spill region. This is necessary to ensure that the saved return address and stack pointer are always available at fixed known offsets of the frame pointer. Differential Revision: https://reviews.llvm.org/D40805 llvm-svn: 322215	2018-01-10 19:41:03 +00:00
Scott Linder	f8034277dd	Test commit access llvm-svn: 322213	2018-01-10 19:27:20 +00:00
Craig Topper	8d3d87a0cc	[SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend. Most of this patch is just making sure we copy the scale around everywhere. Differential Revision: https://reviews.llvm.org/D40055 llvm-svn: 322210	2018-01-10 19:16:05 +00:00
Jessica Paquette	1ba1b7d255	[MachineOutliner] Outline ADRPs ADRP instructions weren't being outlined because they're PC-relative and thus fail the LR checks. This patch adds a special case for ADRPs to getOutliningType to make sure that ADRPs can be outlined and updates the MIR test. llvm-svn: 322207	2018-01-10 18:49:57 +00:00
Matthias Braun	63ad6b7f05	AArch64: Fix emergency spillslot being out of reach for large callframes Large callframes (calls with several hundreds or thousands of parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322200	2018-01-10 18:16:24 +00:00
Simon Pilgrim	0166d0b510	[X86][MMX] Pull out common MMX VT test. NFCI. llvm-svn: 322195	2018-01-10 15:32:19 +00:00
Dmitry Preobrazhensky	01bdb0ae28	[AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK support See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764 Differential Revision: https://reviews.llvm.org/D41614 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322189	2018-01-10 14:22:19 +00:00
Bjorn Pettersson	7022e7d38f	Avoid inlining if there are byval arguments with non-alloca address space Summary: After teaching InlineCost more about address spaces () another fault was detected in the inliner. If an argument has the byval attribute the parameter might be copied to an alloca. That part seems to work fine even if the argument has a different address space than the alloca address space. However, if the address spaces differ, then the inlined function still might refer to the parameter using the original address space (the inliner does not handle that situation very well). This patch avoids the problem by simply disallowing inlining when there are byval arguments with address space that differs from the alloca address space. I'm not really sure how to transform the code if we want to get inlining for this situation. I assume that it never has been working, and that the fixes in r321809 just exposed an old problem. Fault found by skatkov (Serguei Katkov). It is mentioned in follow up comments to https://reviews.llvm.org/D40455. Reviewers: skatkov Reviewed By: skatkov Subscribers: uabelho, eraman, llvm-commits, haicheng Differential Revision: https://reviews.llvm.org/D41898 llvm-svn: 322181	2018-01-10 13:01:18 +00:00
Sander de Smalen	4efc80a40a	[AArch64][SVE] Asm: Add support for (mov|dup) of scalar Summary: This patch adds support for 'dup' (Scalar -> SVE) and its corresponding 'mov' alias. Reviewers: fhahn, rengolin, evandro, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41822 llvm-svn: 322172	2018-01-10 11:32:47 +00:00
Diana Picus	c7b104d268	[ARM GlobalISel] Map G_FNEG to the FPR bank llvm-svn: 322169	2018-01-10 11:13:31 +00:00
Diana Picus	8e69a8831f	[ARM GlobalISel] Legalize G_FNEG for s32 and s64 For hard float, it is legal. For soft float, we need to lower to 0 - x first, and then we can use the libcall for G_FSUB. This is undoing some of the canonicalization performed by the IRTranslator (which introduces G_FNEG when it sees a 0 - x). Ideally, that canonicalization would be performed by a pre-legalizer pass that would allow targets to opt out of this behaviour rather than dance around it in the legalizer. llvm-svn: 322168	2018-01-10 10:45:34 +00:00
Sander de Smalen	dee11fd6e7	[TableGen][AsmMatcherEmitter] Generate assembler checks for tied operands Summary: This extends TableGen's AsmMatcherEmitter with code that generates a table with tied-operand constraints. The constraints are checked when parsing the instruction. If an operand is not equal to its tied operand, the assembler will give an error. Patch [2/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB. Reviewers: olista01, rengolin, mcrosier, fhahn, craig.topper, evandro, echristo Reviewed By: fhahn Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D41446 llvm-svn: 322166	2018-01-10 10:10:56 +00:00
Jonas Paulsson	82460f15a6	Temporarily revert "[SystemZ] Check for legality before doing LOAD AND TEST transformations." , due to test failures. llvm-svn: 322165	2018-01-10 10:05:55 +00:00
Diana Picus	1fee069590	[ARM GlobalISel] Legalize s32/s64 G_FCONSTANT Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164	2018-01-10 10:01:49 +00:00
Jonas Paulsson	63c8cd21e5	[SelectionDAGBuilder] Chain prefetches less aggressively. Prefetches used to always be chained between any previous and following memory accesses. The problem with this was that later optimizations, such as folding of a load into the user instruction, got disrupted. This patch relaxes the chaining of prefetches in order to remedy this. Review: Hal Finkel https://reviews.llvm.org/D38886 llvm-svn: 322163	2018-01-10 09:33:00 +00:00
Diana Picus	e39b0284e3	[ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162	2018-01-10 09:32:01 +00:00
Jonas Paulsson	95f5630397	[SystemZ] Check for legality before doing LOAD AND TEST transformations. Since a load and test instruction treat its operands as signed, it can only replace a logical compare for EQ/NE uses. Review: Ulrich Weigand https://bugs.llvm.org/show_bug.cgi?id=35662 llvm-svn: 322161	2018-01-10 09:18:17 +00:00
Lang Hames	f74233b3c5	[ExecutionEngine] Remove an unused variable. Patch by Evgeniy Tyurin. Thanks Evgeniy! Review: https://reviews.llvm.org/D41431 llvm-svn: 322158	2018-01-10 03:43:14 +00:00
Justin Lebar	d3751782b8	Add explanatory comment to LoadStoreVectorizer. Reviewers: arsenm Subscribers: rengolin, sanjoy, wdng, hiraditya, asbirlea Differential Revision: https://reviews.llvm.org/D41890 llvm-svn: 322157	2018-01-10 03:02:12 +00:00
Puyan Lotfi	bb3ea20b55	[MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'. Planning to add support for named vregs. This puts us in a conundrum since physregs are named as well. To rectify this we need to use a sigil other than '%' for physregs in MIR. We've settled on using '$' for physregs but first we must repurpose it from external symbols using it, which is what this commit is all about. We think '&' will have familiar semantics for C/C++ users. llvm-svn: 322146	2018-01-10 00:56:48 +00:00
Lang Hames	b0f9fb8cbd	[ORC] Re-apply r321838 again with a workaround for a bug present in the libcxx version being used on some of the green dragon builders (plus a clang-format). Workaround: AsynchronousSymbolQuery and VSO want to work with JITEvaluatedSymbols anyway, so just use them (instead of JITSymbol, which happens to tickle the bug). The libcxx bug being worked around was fixed in r276003, and there are plans to update the offending builders. llvm-svn: 322140	2018-01-10 00:09:38 +00:00
Vlad Tsyrklevich	7f72d85310	LowerTypeTests: Add limited support for aliases Summary: LowerTypeTests moves some function definitions from individual object files to the merged module, leaving a stub to be called in the merged module's jump table. If an alias was pointing to such a function definition LowerTypeTests would fail because the alias would be left without a definition to point to. This change 1) emits information about aliases to the ThinLTO summary, 2) replaces aliases pointing to function definitions that are moved to the merged module with function declarations, and 3) re-emits those aliases in the merged module pointing to the correct function definitions. The patch does not correctly fix all possible mis-uses of aliases in LowerTypeTests. For example, it does not handle aliases with a different type from the pointed to function. The addition of alias data increases the size of Chrome build artifacts by less than 1%. Reviewers: pcc Reviewed By: pcc Subscribers: mehdi_amini, eraman, mgrang, llvm-commits, eugenis, kcc Differential Revision: https://reviews.llvm.org/D41741 llvm-svn: 322139	2018-01-10 00:00:51 +00:00
Michael Zolotukhin	2c87fa96bb	[LoopRotate] Detect loops with indirect branches better (we're giving up on them). llvm-svn: 322137	2018-01-09 23:54:35 +00:00
Adrian McCarthy	de2d0196a2	Reland "Emit Function IDs table for Control Flow Guard" Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs of functions that have their address taken into a section named .gfids$y for compatibility with Microsoft's Control Flow Guard feature. The original patch didn't have the lit.local.cfg file that restricts the new test to x86, thus the new test was failing on the non-x86 bots. Differential Revision: https://reviews.llvm.org/D40531 This reverts r322008, which was a revert of r322005. This reverts commit a05b89f9aca70597dc79fe97bc49b50b51f525ba. llvm-svn: 322136	2018-01-09 23:49:30 +00:00
Sam Clegg	9c22504bad	[WebAssembly] Add COMDAT support This adds COMDAT support to the Wasm object-file format. Spec: https://github.com/WebAssembly/tool-conventions/pull/31 Corresponding LLD change: https://bugs.llvm.org/show_bug.cgi?id=35533, and D40845 Patch by Nicholas Wilson Differential Revision: https://reviews.llvm.org/D40844 llvm-svn: 322135	2018-01-09 23:43:14 +00:00
Paul Robinson	bfc554dace	[DWARFv5] MC support for MD5 file checksums Extend .file directive syntax to allow specifying an MD5 checksum for the source file. Emit the checksums in DWARF v5 line tables. llvm-svn: 322134	2018-01-09 23:31:48 +00:00
Eric Christopher	1daca4da0e	Tidy some grammar in some comments llvm-svn: 322133	2018-01-09 23:25:38 +00:00
Rafael Espindola	a45d438e5a	Use a MCExpr for the size of MCFillFragment. This allows the size to be found during relaxation. This fixes pr35858. llvm-svn: 322131	2018-01-09 22:48:37 +00:00
Sam Clegg	cf4f07211b	[WebAssembly] MC: Use zero for provisional value of undefined symbols This is more in line with what happens in the final executable when symbols are undefined (i.e. weak references). Differential Revision: https://reviews.llvm.org/D41840 llvm-svn: 322130	2018-01-09 22:44:02 +00:00
Chris Bieneman	152dec707a	[IPSCCP] Remove calls without side effects Summary: When performing constant propagation for call instructions we have historically replaced all uses of the return from a call, but not removed the call itself. This is required for correctness if the calls have side effects, however the compiler should be able to safely remove calls that don't have side effects. This allows the compiler to completely fold away calls to functions that have no side effects if the inputs are constant and the output can be determined at compile time. Reviewers: davide, sanjoy, bruno, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38856 llvm-svn: 322125	2018-01-09 21:58:46 +00:00
Stefan Pintilie	f7f2e9cc72	[PowerPC] Manually schedule the prologue and epilogue This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away from the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. This commit is almost identical to this one: r322036 except that two warnings that broke build bots have been fixed. The revision number is D41737 as before. llvm-svn: 322124	2018-01-09 21:57:49 +00:00
Rafael Espindola	1bbec740b6	Don't create MCFillFragment directly. Instead use higher level APIs that take care of most bookkeeping. llvm-svn: 322123	2018-01-09 21:55:10 +00:00
Sam Clegg	9ebc8a13dc	[WebAssembly] Explicitly specify function/global index space in YAML These indexes are useful because they are not always zero based and functions and globals are referenced elsewhere by their index. This matches what we already do for the type index space. Differential Revision: https://reviews.llvm.org/D41877 llvm-svn: 322121	2018-01-09 21:38:53 +00:00
Tim Renouf	49faf0a1ac	[SelectionDAG] Fixed f16-from-vector promotion problem Summary: In the case of an fp_extend of v1f16 to v1f32 where the v1f16 is the result of a bitcast from i16, avoid creating an illegal fp16_to_fp where the input is not a vector and the result is a v1f32. V2: The fix is now to avoid vector scalarization creating a v1->scalar bitcast. Reviewers: srhines, t.p.northover Subscribers: nhaehnle, llvm-commits, dstuttard, t-tye, yaxunl, wdng, kzhuravl, arsenm Differential Revision: https://reviews.llvm.org/D41126 llvm-svn: 322120	2018-01-09 21:36:25 +00:00
Tim Renouf	fe07be2ca9	[AMDGPU] Fixed incorrect uniform branch condition Summary: I had a case where multiple nested uniform ifs resulted in code that did v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first ensuring that bits for inactive lanes were clear. There was already code for inserting an "s_and_b64 vcc, exec, vcc" to clear bits for inactive lanes in the case that the branch is instruction selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in SIFixSGPRCopies. I have added the same code into SILowerControlFlow for the case that the branch is instruction selected as s_cbranch_vccnz. This de-optimizes the code in some cases where the s_and is not needed, because vcc is the result of a v_cmp, or multiple v_cmp instructions combined by s_and/s_or. We should add a pass to re-optimize those cases. Reviewers: arsenm, kzhuravl Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle Differential Revision: https://reviews.llvm.org/D41292 llvm-svn: 322119	2018-01-09 21:34:43 +00:00
Daniel Berlin	33a5f98012	NewGVN: Fix PR/33367, which was causing us to delete non-copy intrinsics accidentally in some rare cases llvm-svn: 322115	2018-01-09 20:12:42 +00:00
Rafael Espindola	8c62496ec5	Inline a emitFill variant that is only used once. NFC. llvm-svn: 322111	2018-01-09 19:50:29 +00:00
Easwaran Raman	f04207e3b2	Add a pass to generate synthetic function entry counts. Summary: This pass synthesizes function entry counts by traversing the callgraph and using the relative block frequencies of the callsites. The intended use of these counts is in inlining to determine hot/cold callsites in the absence of profile information. The pass is split into two files with the code that propagates the counts in a callgraph in a Utils file. I plan to add support for propagation in the thinlto link phase and the propagation code will be shared and hence this split. I did not add support to the old PM since hot callsite determination in inlining is not possible in old PM (although we could use hot callee heuristic with synthetic counts in the old PM it is not worth the effort tuning it) Reviewers: davidxl, silvas Subscribers: mgorny, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D41604 llvm-svn: 322110	2018-01-09 19:39:35 +00:00
Brian Gesiak	6abdbd7d39	[Option] For typo '-foo', suggest '--foo' Summary: https://reviews.llvm.org/rL321877 introduced the `OptTable::findNearest` method, to find the closest edit distance option for a given string. However, the implementation contained a bug: for a typo `-foo` with an edit distance of 1 away from a valid option `--foo`, `findNearest` would suggest a nearby option of `foo`. That is, the result would not include the `--` prefix, and so was not a valid option. Fix the bug by ensuring that the prefix string is initialized to one of the valid prefixes for the option. Test Plan: `check-llvm-unit` Reviewers: v.g.vassilev, teemperor, ruiu, jroelofs, yamaguchi Reviewed By: jroelofs Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41873 llvm-svn: 322109	2018-01-09 19:38:04 +00:00
Rafael Espindola	0f0fe4383d	Make one of the emitFill methods non virtual. NFC. This is just preparatory work to fix PR35858. llvm-svn: 322108	2018-01-09 19:29:33 +00:00
Alexey Bataev	c3619b2825	[COST]Fix PR35865: Fix cost model evaluation for shuffle on X86. Summary: If the vector type is transformed to non-vector single type, the compiler may crash trying to get vector information about non-vector type. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41862 llvm-svn: 322106	2018-01-09 19:08:22 +00:00
Derek Schuff	a1da553c33	[WebAssembly] Update libcall signature lists New signatures added in r322087. A fix for this tight coupling is forthcoming. llvm-svn: 322105	2018-01-09 19:05:34 +00:00
Sanjay Patel	883f134b81	[InstCombine] weaken assertions for icmp folds (PR35846) Because of potential UB (known bits conflicts with an llvm.assume), we have to check rather than assert here because InstSimplify doesn't kill the compare: https://bugs.llvm.org/show_bug.cgi?id=35846 llvm-svn: 322104	2018-01-09 18:56:03 +00:00
Teresa Johnson	09464eb2ae	Fix crash when linking metadata with ODR type uniquing Summary: With DebugTypeODRUniquing enabled, during IR linking debug metadata in the destination module may be reached from the source module. This means that ConstantAsMetadata nodes (e.g. on DITemplateValueParameter) may contain a value in the destination module. When trying to map such metadata nodes, we will attempt to map a GV already in the dest module. linkGlobalValueProto will end up with a source GV that is the same as the dest GV as well as the new GV. Trying to access the TypeMap for the source GV type, which is actually a dest GV type, hits an assertion since it appears that we have mapped into the source module (because the type is the value not a key into the map). Detect that we don't need to access the TypeMap in this case, since there is no need to create a bitcast from the new GV to the source GV type as the GVs are the same. Fixes PR35722. Reviewers: mehdi_amini, pcc Subscribers: probinson, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D41624 llvm-svn: 322103	2018-01-09 18:32:53 +00:00
Craig Topper	3d19c1e4f2	[X86] Add a DAG combine to combine (sext (setcc)) with VLX Normally target independent DAG combine would do this combine based on getSetCCResultType, but with VLX getSetCCResultType returns a vXi1 type preventing the DAG combining from kicking in. But doing this combine can allow us to remove the explicit sign extend that would otherwise be emitted. This patch adds a target specific DAG combine to combine the sext+setcc when the result type is the same size as the input to the setcc. I've restricted this to FP compares and things that can be represented with PCMPEQ and PCMPGT since we don't have full integer compare support on the older ISAs. Differential Revision: https://reviews.llvm.org/D41850 llvm-svn: 322101	2018-01-09 18:14:22 +00:00
Matthew Voss	e568747614	Test commit This is a commit to test commit access. llvm-svn: 322099	2018-01-09 17:52:00 +00:00
Florian Hahn	687480985c	[TargetParser] Add missing armv8l ARMv8 variant. This change adds the missing armv8l variant as an alias of armv8 architecture. The issue was observed with several regressions in validation on armv8l hardware (for instance ExecutionEngine/frem.ll failed due to lack of neon fpu). Tested with regression testsuite passed without regression on ARM and x86_64. Patch by Yvan Roux. Reviewers: rengolin, rogfer01, olista01, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D41859 llvm-svn: 322098	2018-01-09 17:49:25 +00:00
Francis Visoiu Mistrih	637037fa50	[CodeGen] Don't print "pred:" and "opt:" in -debug output In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096	2018-01-09 17:31:07 +00:00
Davide Italiano	a7955b99df	[Support] Use realpath(3) instead of trying to open a file. If we don't have read permissions on the directory the call would fail. <rdar://problem/35871293> llvm-svn: 322095	2018-01-09 17:27:45 +00:00
Pavel Labath	198e4bdc0a	[Support] Add WritableMemoryBuffer::getNewMemBuffer Summary: The idea is that it would replace (non-Writable)MemoryBuffer::getNewMemBuffer, which is quite useless unless you const_cast its contents to write to it (which all (both) callers of this function were doing). This patch also fixes one of the usages in COFFWriter. After fixing the other usage in clang, I plan to delete the old function. Reviewers: dblaikie, Bigcheese Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41540 llvm-svn: 322094	2018-01-09 17:26:06 +00:00
Sander de Smalen	d839b92957	Recommit r322073: [AArch64][SVE] Asm: Add predicated ADD/SUB instructions Fixed issue that was found on sanitizer-x86_64-linux-fast. I changed the result type of 'Parser.getTok().getString().lower()' in AArch64AsmParser::tryParseSVEPredicateVector() from 'StringRef' to 'auto', since StringRef::lower() returns a std::string. llvm-svn: 322092	2018-01-09 17:01:27 +00:00
Francis Visoiu Mistrih	5b1cb469f1	[CodeGen] Print frame-setup/destroy flags in -debug output like we do in MIR Currently the MachineInstr::print function prints the frame-setup/frame-destroy differently than it does in MIR. Instead of: %x21 = LDR %sp, -16; flags: FrameDestroy print: %x21 = frame-destroy LDR %sp, -16 llvm-svn: 322088	2018-01-09 16:11:51 +00:00
Sanjay Patel	e515369ba1	[SelectionDAG] lower math intrinsics to finite version of libcalls when possible (PR35672) Ingredients in this patch: 1. Add HANDLE_LIBCALL defs for finite mathlib functions that correspond to LLVM intrinsics. 2. Plumbing to send TargetLibraryInfo down to SelectionDAGLegalize. 3. Relaxed math and library checking in SelectionDAGLegalize::ConvertNodeToLibcall() to choose finite libcalls. There was a bug about determining the availability of the finite calls that should be fixed with: rL322010 Not in this patch: This doesn't resolve the question/bug of clang creating the intrinsic IR in the first place. There's likely follow-up work needed to support the long double variants better. There's room for improvement to reduce the code duplication. Create finite calls that don't originate from a corresponding intrinsic or DAG node? Differential Revision: https://reviews.llvm.org/D41338 llvm-svn: 322087	2018-01-09 15:41:00 +00:00
Francis Visoiu Mistrih	70776ef6f7	[CodeGen] Don't print register classes in -debug output Since register classes and banks are already printed with the register definition, don't print it at the end of every instruction anymore. This follows MIR in this regard and is another step to the unification of the two formats. llvm-svn: 322086	2018-01-09 15:39:44 +00:00
Nirav Dave	24870f524d	[DAG] Elide overlapping stores Relanding after fixing handling of pre-indexed memory operations in BaseIndexOffset analysis (r322003). Extend overlapping store elision to handle overwrites of stores by larger stores. Reviewers: craig.topper, rnk, t.p.northover Subscribers: javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40969 llvm-svn: 322085	2018-01-09 15:23:12 +00:00
Petar Jovanovic	ed79ac7199	[EarlyCSE] Salvage debug info during DCE EarlyCSE did not try to salvage debug info during erasing of instructions. This change fixes it. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D41496 llvm-svn: 322083	2018-01-09 15:08:37 +00:00
Simon Pilgrim	c39efb317d	[InstCombine] Check for out of range ashr values using APInt before calling getZExtValue Reduced from oss-fuzz #5032 test case llvm-svn: 322078	2018-01-09 14:23:46 +00:00
Sander de Smalen	15d7cd9c7a	Reverted r322073 because of AddressSanitizer failure on sanitizer-x86_64-linux-fast builder. llvm-svn: 322077	2018-01-09 13:51:09 +00:00
Sander de Smalen	02c875ed8d	[AArch64][SVE] Asm: Add predicated ADD/SUB instructions Summary: Add the predicated ADD/SUB instructions and corresponding tests. Patch [3/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo Reviewed By: fhahn Subscribers: aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41443 llvm-svn: 322073	2018-01-09 12:43:46 +00:00
Francis Visoiu Mistrih	af104b58a6	[MIR] Add support for the frame-destroy MachineInstr flag We are printing / parsing the `frame-setup` MachineInstr flag but not the `frame-destroy` one. Differential Revision: https://reviews.llvm.org/D41509 llvm-svn: 322071	2018-01-09 11:33:22 +00:00
Sander de Smalen	62eaf09505	[AArch64][SVE] Asm: Add parsing of merging/zeroing suffix for SVE predicate vector operands Summary: Parsing of the '/m' (merging) or '/z' (zeroing) suffix of a predicate operand. Patch [2/3] in a series to add predicated ADD/SUB instructions for SVE. Reviewers: rengolin, mcrosier, evandro, fhahn, echristo, MatzeB, t.p.northover Reviewed By: fhahn Subscribers: t.p.northover, MatzeB, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41442 llvm-svn: 322070	2018-01-09 11:17:06 +00:00
Nikolai Bozhenov	2f935bc924	[Nios2] Arithmetic instructions for R1 and R2 ISA. Summary: This commit enables some of the arithmetic instructions for Nios2 ISA (for both R1 and R2 revisions), implements facilities required to emit those instructions and provides LIT tests for added instructions. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D41236 Author: belickim <mateusz.belicki@intel.com> llvm-svn: 322069	2018-01-09 11:15:08 +00:00
Oren Ben Simhon	b468feac4d	Instrument Control Flow For Indirect Branch Tracking CET (Control-Flow Enforcement Technology) introduces a new mechanism called IBT (Indirect Branch Tracking). According to IBT, each indirect branch should land on a dedicated ENDBR instruction (End Branch). The new pass adds ENDBR instructions for every indirect jmp/call (including jumps using jump tables / switches). For more information, please see the following: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Differential Revision: https://reviews.llvm.org/D40482 Change-Id: Icb754489faf483a95248f96982a4e8b1009eb709 llvm-svn: 322062	2018-01-09 08:51:18 +00:00
Craig Topper	922dfb8775	[X86] Allow more cmpps/pd immediate encodings to be commuted during isel. The code that checks the immediate wasn't masking to the lower 3-bits like the code in X86InstrInfo.cpp that's used by the peephole pass does. llvm-svn: 322060	2018-01-09 07:09:34 +00:00
Serguei Katkov	d58a989008	[SCEV] Do not cache S -> V if S is not equivalent of V SCEV tracks the correspondence of created SCEV to original instruction. However during creation of SCEV it is possible that nuw/nsw/exact flags are lost. As a result during expansion of the SCEV the instruction with nuw/nsw/exact will be used where it was expected and we produce poison incorrectly. Reviewers: sanjoy, mkazantsev, sebpop, jbhateja Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41578 llvm-svn: 322058	2018-01-09 06:47:14 +00:00
Serguei Katkov	3e7f820f40	[CGP] Fix Complex addressing mode for offset If the offset differs between two addressing modes, we can continue only if ScaleReg is not set, because we will use it to merge the different offsets. It should fix PR35799 and PR35805. Reviewers: john.brawn, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41227 llvm-svn: 322056	2018-01-09 04:37:06 +00:00
Sean Fertile	4e57a395a3	[PowerPC] Can not assume an intrinsic argument is a simple type. The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics, and assumes the arguments must be of a simple type. This isn't always the case though. For example if we unroll and vectorize a loop we may end up with vectors larger than the largest legal type, along with intrinsics that operate on those wider types. This happened in the ffmpeg build, where we unrolled a loop and ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion. Differential Revision: https://reviews.llvm.org/D41758 llvm-svn: 322055	2018-01-09 03:03:41 +00:00
Eric Christopher	ec511cd956	Remove unused function HvxSelector::zerous. llvm-svn: 322053	2018-01-09 02:38:17 +00:00
Stefan Pintilie	4daa2c99c9	Revert "[PowerPC] Manually schedule the prologue and epilogue" [PowerPC] This reverts commit r322036. Failing build bots. Revert the commit now. llvm-svn: 322051	2018-01-09 01:06:21 +00:00
Craig Topper	63aae39c34	[X86] Remove llvm.x86.avx512.cvt2mask. intrinsics and autoupgrade to (icmp slt X, 0) I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way. llvm-svn: 322050	2018-01-09 00:50:47 +00:00
Craig Topper	c5dbf5a582	[X86] Remove unnecessary isel pattern that is a combination of two other patterns. The pattern was this def : Pat<(i32 (zext (i8 (bitconvert (v8i1 VK8:$src))))), (MOVZX32rr8 (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit))>, Requires<[NoDQI]>; but if you just let (i32 (zext X)) match by itself you'll get MOVZX32rr8. And if you let (i8 (bitconvert (v8i1 VK8:$src))) match by itself you'll get (EXTRACT_SUBREG (i32 (COPY_TO_REGCLASS VK8:$src, GR32)), sub_8bit). So we can just let isel do the two patterns naturally. llvm-svn: 322049	2018-01-09 00:50:42 +00:00
Jessica Paquette	3075bb4222	[MachineOutliner] AArch64: Handle instrs that use SP and will never need fixups This commit does two things. Firstly, it adds a collection of flags which can be passed along to the target to encode information about the MBB that an instruction lives in to the outliner. Second, it adds some of those flags to the AArch64 outliner in order to add more stack instructions to the list of legal instructions that are handled by the outliner. The two flags added check if - There are calls in the MachineBasicBlock containing the instruction - The link register is available in the entire block If the link register is available and there are no calls, then a stack instruction can always be outlined without fixups, regardless of what it is, since in this case, the outliner will never modify the stack to create a call or outlined frame. The motivation for doing this was checking which instructions are most often missed by the outliner. Instructions like, say %sp<def> = ADDXri %sp, 32, 0; flags: FrameDestroy are very common, but cannot be outlined in the case that the outliner might modify the stack. This commit allows us to outline instructions like this. llvm-svn: 322048	2018-01-09 00:26:18 +00:00
Stefan Pintilie	75855c772b	[PowerPC] Manually schedule the prologue and epilogue This patch makes the following changes to the schedule of instructions in the prologue and epilogue. The stack pointer update is moved down in the prologue so that the callee saves do not have to wait for the update to happen. Saving the lr is moved down in the prologue to hide the latency of the mflr. The stack pointer is moved up in the epilogue so that restoring of the lr can happen sooner. The mtlr is moved up in the epilogue so that it is away from the blr at the end of the epilogue. The latency of the mtlr can now be hidden by the loads of the callee saved registers. Differential Revision: https://reviews.llvm.org/D41737 llvm-svn: 322036	2018-01-08 22:23:10 +00:00
Justin Bogner	c95467b366	AlwaysInliner: Allow setting InsertLifetime in the new-style pass llvm-svn: 322033	2018-01-08 22:07:42 +00:00
Zachary Turner	4e0d96a656	Fix uninitialized read error reported by MSAN. The problem was that our Obj -> Yaml dumper had not been taught to handle certain types of records. This meant that when I generated the test input files, the records were still there but none of its fields were filled out. So when it did the Yaml -> Obj conversion as part of the test, it generated records with garbage in them. The patch here fixes the Obj <-> Yaml converter, and additionally updates the test file with fresh Yaml generated by the fixed converter. llvm-svn: 322029	2018-01-08 21:38:50 +00:00
Justin Bogner	e9bc465e2c	ArgPromotion: Allow setting MaxElements in the new-style pass llvm-svn: 322025	2018-01-08 21:13:35 +00:00
Sanjay Patel	60d5bad02a	[ValueTracking] remove overzealous assert The test is derived from a failing fuzz test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5008 Credit to @rksimon for pointing out the problem. llvm-svn: 322016	2018-01-08 18:31:13 +00:00
Petar Jovanovic	1bbc932540	[LiveDebugValues] Change condition for block termination recognition The last iterator of MBB should be recognized as MBB.end() not as MBB.instr_end() which could return bundled instruction that is not iterable with basic iterator. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D41626 llvm-svn: 322015	2018-01-08 18:21:15 +00:00
Sanjay Patel	a000fc9f42	[TargetLibraryInfo] fix finite mathlib function availability This patch was part of: https://reviews.llvm.org/D41338 ...but we can expose the bug in IR via constant propagation as shown in the test. Unless the triple includes 'linux', we should not fold these because the functions don't exist on other platforms (yet?). llvm-svn: 322010	2018-01-08 17:38:09 +00:00
Adrian McCarthy	77150ad708	Revert "Emit Function IDs table for Control Flow Guard" The new test fails on the Hexagon bot. Reverting while I investigate. This reverts https://reviews.llvm.org/rL322005 This reverts commit b7e0026b4385180c378edc658ec91a39566f2942. llvm-svn: 322008	2018-01-08 17:12:01 +00:00
Aleksandar Beserminji	e98367697a	[mips] Remove duplicated R6 EVA instructions This patch removes duplicated EVA instructions in R6. Differential Revision: https://reviews.llvm.org/D41769 llvm-svn: 322007	2018-01-08 16:50:33 +00:00
Davide Italiano	d4d99b33ee	[CVP] Replace incoming values from unreachable blocks with undef. This is an attempt of fixing PR35807. Due to the non-standard definition of dominance in LLVM, where uses in unreachable blocks are dominated by anything, you can have, in an unreachable block: %patatino = OP1 %patatino, CONSTANT When `SimplifyInstruction` receives a PHI where an incoming value is of the aforementioned form, in some cases, loops indefinitely. What I propose here instead is keeping track of the incoming values from unreachable blocks, and replacing them with undef. It fixes this case, and it seems to be good regardless (even if we can't prove that the value is constant, as it's coming from an unreachable block, we can ignore it). Differential Revision: https://reviews.llvm.org/D41812 llvm-svn: 322006	2018-01-08 16:34:06 +00:00
Adrian McCarthy	1914213a11	Emit Function IDs table for Control Flow Guard Adds option /guard:cf to clang-cl and -cfguard to cc1 to emit function IDs of functions that have their address taken into a section named .gfids$y for compatibility with Microsoft's Control Flow Guard feature. Differential Revision: https://reviews.llvm.org/D40531 llvm-svn: 322005	2018-01-08 16:33:42 +00:00
Nirav Dave	d9a55e3d2b	[DAG] Teach BaseIndexOffset to correctly handle with indexed operations BaseIndexOffset address analysis incorrectly ignores offsets folded into indexed memory operations causing potential errors in alias analysis of pre-indexed operations. Reviewers: efriedma, RKSimon, hfinkel, jyknight Subscribers: hiraditya, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D41701 llvm-svn: 322003	2018-01-08 16:21:35 +00:00
Sanjay Patel	782c5d6f79	[InstCombine] fold min/max tree with common operand (PR35717) There is precedence for factorization transforms in instcombine for FP ops with fast-math. We also have similar logic in foldSPFofSPF(). It would take more work to add this to reassociate because that's specialized for binops, and min/max are not binops (or even single instructions). Also, I don't have evidence that larger min/max trees than this exist in real code, but if we find that's true, we might want to reorganize where/how we do this optimization. In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have: int test(int xc, int xm, int xy) { int xk; if (xc < xm) xk = xc < xy ? xc : xy; else xk = xm < xy ? xm : xy; return xk; } This patch solves that problem because we recognize more min/max patterns after rL321672 https://rise4fun.com/Alive/Qjne https://rise4fun.com/Alive/3yg Differential Revision: https://reviews.llvm.org/D41603 llvm-svn: 321998	2018-01-08 15:05:34 +00:00
Momchil Velikov	b56f278e08	[ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz The patch makes the unwind information not mention registers, which were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996	2018-01-08 14:47:19 +00:00
Alexey Bataev	d73719cbbe	[SLP] Fix PR35777: Incorrect handling of aggregate values. Summary: Fixes the bug with incorrect handling of InsertValue|InsertElement instructions in SLP vectorizer. Currently, we may use incorrect ExtractElement instructions as the operands of the original InsertValue|InsertElement instructions. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41767 llvm-svn: 321994	2018-01-08 14:43:06 +00:00
Alexey Bataev	85a077392b	[SLP] Fix PR35628: Count external uses on extra reduction arguments. Summary: If the vectorized value is marked as extra reduction argument, its users are not considered as external users. Patch fixes this. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41786 llvm-svn: 321993	2018-01-08 14:33:11 +00:00
Sam Parker	1a528fc8fd	[DAGCombine] Fix for PR35761 I had falsely assumed that constant operands would be operand(1) of the bin ops that may need their constant operand to be masked. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=35761 Differential Revision: https://reviews.llvm.org/D41667 llvm-svn: 321991	2018-01-08 13:21:24 +00:00
Jonas Paulsson	f24de3924d	[SystemZ] Comment fix in SystemZElimCompare.cpp NFC Review: Ulrich Weigand llvm-svn: 321990	2018-01-08 12:52:40 +00:00
Momchil Velikov	c0e2b430f9	[ARM] Fix PR35481 This patch allows `r7` to be used, regardless of its use as a frame pointer, as a temporary register when popping `lr`, and also falls back to using a high temporary register if, for some reason, we weren't able to find a suitable low one. Differential revision: https://reviews.llvm.org/D40961 Fixes https://bugs.llvm.org/show_bug.cgi?id=35481 llvm-svn: 321989	2018-01-08 11:32:37 +00:00
Francis Visoiu Mistrih	8198e11651	[X86] Remove side-effects from determineCalleeSaves (Target)FrameLowering::determineCalleeSaves can be called multiple times. I don't think it should have side-effects as creating stack objects and setting global MachineFunctionInfo state as it is doing today (in other back-ends as well). This moves the creation of stack objects from determineCalleeSaves to assignCalleeSavedSpillSlots. Differential Revision: https://reviews.llvm.org/D41703 llvm-svn: 321987	2018-01-08 10:46:05 +00:00
Craig Topper	573f223715	[X86] Replace CVT2MASK ISD opcode with PCMPGTM compared to zero. CVT2MASK is just checking the sign bit which can be represented with a comparison with zero. llvm-svn: 321985	2018-01-08 06:53:54 +00:00
Craig Topper	2d72cefc7c	[X86] Add patterns to allow 512-bit BWI compare instructions to be used for 128/256-bit compares when VLX is not available. llvm-svn: 321984	2018-01-08 06:53:52 +00:00
Craig Topper	5116b57234	[X86] Simplify some code in lower1BitVectorShuffle by relying on getNode's ability to constant fold vector SIGN_EXTEND. llvm-svn: 321979	2018-01-07 23:56:37 +00:00
Craig Topper	df5dbc87a9	[X86] Add VSHUFF32X4 and similar instructions to load folding tables. llvm-svn: 321978	2018-01-07 23:30:20 +00:00
Davide Italiano	5490b3aaf6	Revert "[SCCP] Manually fold branches on undef." I thought this was responsible for PR35723, but I was wrong, the issue lies elsewhere. Revert while I debug. llvm-svn: 321975	2018-01-07 22:09:44 +00:00
Davide Italiano	c9e2c5d0f5	[SLPVectorizer] Reintroduce std::stable_sort(properlyDominates()). The approach was never discussed, I wasn't able to reproduce this non-determinism, and the original author went AWOL. After a discussion on the ML, Philip suggested to revert this. llvm-svn: 321974	2018-01-07 22:06:24 +00:00
Craig Topper	4d10a572ab	[X86] Revert accidental change to CMakeLists.txt in r321952 I had removed the qualifiers around the autogenerated folding table so I could compare with the manual table, but didn't intend to commit the change. llvm-svn: 321971	2018-01-07 21:03:43 +00:00
Simon Pilgrim	63f4bc337a	[DAG] Fix for Bug PR34620 - Allow SimplifyDemandedBits to look through bitcasts Allow SimplifyDemandedBits to use TargetLoweringOpt::computeKnownBits to look through bitcasts. This can help simplifying in some cases where bitcasts of constants generated during or after legalization can't be folded away, and thus didn't get picked up by SimplifyDemandedBits. This fixes PR34620, where a redundant pand created during legalization from lowering and lshr <16xi8> wasn't being simplified due to the presence of a bitcasted build_vector as an operand. Committed on the behalf of @sameconrad (Sam Conrad) Differential Revision: https://reviews.llvm.org/D41643 llvm-svn: 321969	2018-01-07 19:09:40 +00:00
Craig Topper	4ebaa3ea4d	[X86] Remove unneeded code from combineGatherScatter that used to delete SIGN_EXTEND_INREG nodes created during legalization of v2i1/v4i1 masks on KNL. v2i1/v4i1 are now legal on KNL so no sign_extend_inreg is generated. llvm-svn: 321968	2018-01-07 18:34:08 +00:00
Craig Topper	971ece4f26	[X86] Make v2i1 and v4i1 legal types without VLX Summary: There are few oddities that occur due to v1i1, v8i1, v16i1 being legal without v2i1 and v4i1 being legal when we don't have VLX. Particularly during legalization of v2i32/v4i32/v2i64/v4i64 masked gather/scatter/load/store. We end up promoting the mask argument to these during type legalization and then have to widen the promoted type to v8iX/v16iX and truncate it to get the element size back down to v8i1/v16i1 to use a 512-bit operation. Since we need to fill the upper bits of the mask we have to fill with 0s at the promoted type. It would be better if we could just have the v2i1/v4i1 types as legal so they don't undergo any promotion. Then we can just widen with 0s directly in a k register. There are no real v4i1/v2i1 instructions anyway. Everything is done on a larger register anyway. This also fixes an issue that we couldn't implement a masked vextractf32x4 from zmm to xmm properly. We now have to support widening more compares to 512-bit to get a mask result out so new tablegen patterns got added. I had to hack the legalizer for widening the operand of a setcc a bit so it didn't try to create a setcc returning v4i32, extract from it, then try to promote it using a sign extend to v2i1. Now we create the setcc with v4i1 if the original setcc's result type is v2i1. Then extract that and don't sign extend it at all. There's definitely room for improvement with some follow up patches. Reviewers: RKSimon, zvi, guyblank Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41560 llvm-svn: 321967	2018-01-07 18:20:37 +00:00
Hal Finkel	f34aaa2108	[LV][VPlan] NFC patch to move LoopVectorizationPlanner class out of LoopVectorize.cpp Another small step forward to move VPlan stuff outside of LoopVectorize.cpp. VPlanBuilder.h is renamed to LoopVectorizationPlanner.h LoopVectorizationPlanner class is moved from LoopVectorize.cpp to LoopVectorizationPlanner.h LoopVectorizationCostModel::VectorizationFactor class is moved to LoopVectorizationPlanner.h (used by the planner class) --- this needs further streamlining work in later patches and thus all I did was take it out of the CostModel class and moved to the header file. The callback function had to stay inside LoopVectorize.cpp since it calls an InnerLoopVectorizer member function declared in it. Next Steps: Make InnerLoopVectorizer, LoopVectorizationCostModel, and other classes more modular and more aligned with VPlan direction, in small increments. Previous step was: r320900 (https://reviews.llvm.org/D41045) Patch by Hideki Saito, thanks! Differential Revision: https://reviews.llvm.org/D41420 llvm-svn: 321962	2018-01-07 16:02:58 +00:00
Florian Hahn	16f8e91244	[CodeExtractor] Use subset of function attributes for extracted function. In addition to target-dependent attributes, we can also preserve a white-listed subset of target independent function attributes. The white-list excludes problematic attributes, most prominently: * attributes related to memory accesses, as alloca instructions could be moved in/out of the extracted block * control-flow dependent attributes, like no_return or thunk, as the relevant instructions might or might not get extracted. Thanks @efriedma and @aemerson for providing a set of attributes that cannot be propagated. Reviewers: efriedma, davidxl, davide, silvas Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D41334 llvm-svn: 321961	2018-01-07 11:22:25 +00:00
Craig Topper	bafeab90b0	[PowerPC] Add an ISD::TRUNCATE to the legalization for ppc_is_decremented_ctr_nonzero Summary: I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work. There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed. With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: nemanjai, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D41654 llvm-svn: 321959	2018-01-07 07:51:36 +00:00

109809 Commits