llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	cf80897776	[X86] Add isAnyZero shuffle mask helper	2020-03-29 19:51:37 +01:00
Nikita Popov	6967b5c883	[InstCombine] Erase old mul when creating umulo As we don't return the result of replaceInstUsesWith(), we are responsible for erasing the instruction. There is a small subtlety here in that we need to do this after the other uses of Builder, which uses the original multiply as the insertion point. NFC apart from worklist order changes.	2020-03-29 20:46:08 +02:00
Nikita Popov	9434424736	[InstCombine] Use replaceOperand() in demanded elements simplification To make sure that dead operands get DCEd. This fixes the largest source of leftover dead operands we see in tests. NFC apart from worklist changes.	2020-03-29 20:43:19 +02:00
Nikita Popov	98c2b02c04	[InstCombine] Use replaceOperand() in assoc cast simplification To make sure the old operands are DCEd. NFC apart from worklist order.	2020-03-29 20:28:37 +02:00
Nikita Popov	b667603246	[InstCombine] Erase old add when optimizing add overflow We don't return the replaceInstUsesWith() result, so we're responsible for cleaning up. NFC apart from worklist order changes.	2020-03-29 20:20:14 +02:00
Uday Bondhugula	336d326470	Introduce support for lib function aligned_alloc in TLI / memory builtins Aligned_alloc is a standard lib function and has been in glibc since 2.16 and in the C11 standard. It has semantics similar to malloc/calloc for several analyses/transforms. This patch introduces aligned_alloc in target library info and memory builtins. Subsequent ones will make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062 This change will also be useful to LLVM generators that need to allocate buffers of vector elements larger than 16 bytes (e.g. 256-bit ones), element boundary alignment for which is not typically provided by glibc malloc. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76970	2020-03-29 23:36:24 +05:30
Matt Arsenault	d0c820ef49	GlobalISel: Add matcher for G_SHL	2020-03-29 14:03:07 -04:00
Matt Arsenault	88f2f59236	AMDGPU/GlobalISel: Remove redundant virtual	2020-03-29 14:03:07 -04:00
Matt Arsenault	f37e7222d1	AMDGPU: Fix using wrong instruction for FP conversion This was never actually hit, but FTRUNC was clearly not the intent here.	2020-03-29 14:03:07 -04:00
Matt Arsenault	de61f4db43	AMDGPU: Add some additional tests for v_cvt_ubyte* formation Use functions now that we have them for less boilerplate in the output.	2020-03-29 14:03:07 -04:00
Matt Arsenault	8d9df1b07c	AMDGPU: Fix typo	2020-03-29 14:03:06 -04:00
Sanjay Patel	c46bba2092	[VectorCombine] skip debug intrinsics first for efficiency	2020-03-29 13:58:04 -04:00
Sanjay Patel	d424ebe347	[InstCombine] make test independent of branch undef/UB; NFC	2020-03-29 13:32:47 -04:00
Simon Pilgrim	6bc7b2e92c	[X86][AVX] Add tests for 512-bit shuffle patterns that could reduce to subvector extractions	2020-03-29 18:27:18 +01:00
Simon Pilgrim	a7ae4789c3	Remove unnecessary empty comments from test check lines. NFC.	2020-03-29 18:27:18 +01:00
Nikita Popov	365e061515	[InstCombine] Simplify select of cmpxchg transform Rather than converting to a dummy select with equal true and false ops, just directly return the resulting value. As a side-effect, this fixes missing DCE of the previously replaced operand.	2020-03-29 18:57:32 +02:00
Florian Hahn	d7782de60a	[OpenMP] set_bits iterator yields unsigned elements, no reference (NFC). BitVector::set_bits() returns an iterator range yielding unsigned elements, which always will be copied while const & gives the impression that there will be no copy. Newer versions of clang complain: warning: loop variable 'SetBitsIt' is always a copy because the range of type 'iterator_range<llvm::BitVector::const_set_bits_iterator>' (aka 'iterator_range<const_set_bits_iterator_impl<llvm::BitVector> >') does not return a reference [-Wrange-loop-analysis] Reviewers: jdoerfert, rnk Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77010	2020-03-29 17:08:13 +01:00
Nikita Popov	1a23c1c6f9	[InstCombine] Fix worklist management in varargs transform Add a replaceUse() helper to mirror replaceOperand() for the rare cases where we're working directly on uses. NFC apart from worklist order changes.	2020-03-29 18:04:12 +02:00
Nikita Popov	284f903c54	[InstCombine] Erase original add when creating saddo Usually when we replaceInstUsesWith() we also return the original instruction, and InstCombine will take care of erasing it. Here we don't do that, so we need to manually erase it. NFC apart from worklist order changes.	2020-03-29 18:01:32 +02:00
Nikita Popov	4fa5d7e83e	[InstCombine] Use replaceOperand() in a few more places To make sure the old operands get DCEd. NFC apart from worklist order changes.	2020-03-29 18:01:00 +02:00
Simon Pilgrim	232445ea2c	[X86][AVX] Combine 128-bit lane shuffles with a zeroable upper half to EXTRACT_SUBVECTOR (PR40720) As explained on PR40720, EXTRACTF128 is always as good/better than VPERM2F128, and we can use the implicit zeroing of the upper half. I've added some extra tests to vector-shuffle-combining-avx2.ll to make sure we don't lose coverage.	2020-03-29 16:41:59 +01:00
Simon Pilgrim	896d173cfb	[X86] Rename matchShuffleAsByteRotate to matchShuffleAsElementRotate. NFC. This was an inner helper function for the real matchShuffleAsByteRotate function, but it is more generic and is used directly for VALIGN lowering which doesn't work at the byte level.	2020-03-29 16:41:58 +01:00
Simon Pilgrim	30fee933fa	[X86][AVX] Add X86ISD::VALIGN target shuffle decode support Allows us to combine VALIGN instructions with other shuffles - the combiner doesn't create VALIGN yet though.	2020-03-29 16:41:58 +01:00
Florian Hahn	e03568e786	[VPlan] Use one VPWidenRecipe per original IR instruction. (NFC). This patch changes VPWidenRecipe to only store a single original IR instruction. This is the first required step towards modeling its operands as VPValues and also towards breaking it up into a VPInstruction. Discussed as part of D74695. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D76988	2020-03-29 13:47:28 +01:00
Nikita Popov	51f95f57fa	[PostOrderIterator] Use SmallVector to store stack; NFC We use a SmallPtrSet to track visited nodes, use a SmallVector of the same size for the stack.	2020-03-29 14:29:02 +02:00
Simon Pilgrim	ae5e10d787	[X86] X86CallFrameOptimization - generalize slow push code path Replace the explicit isAtom() || isSLM() test with the more general (and more specific) slowTwoMemOps() check to avoid the use of the PUSHrmm push from memory case. This is actually very tricky to test in anything but quite complex code, but the atomic-idempotent.ll tests seem to be the most straightforward to use. Differential Revision: https://reviews.llvm.org/D76239	2020-03-29 11:01:59 +01:00
Richard Diamond	527a4d99f8	[AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences. Summary: On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue. In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV of i64. This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type. Reviewers: hfinkel, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75471	2020-03-29 01:26:31 -05:00
Craig Topper	115066186a	[X86] Add cost model test cases for fmin/fmax reduction.	2020-03-28 17:12:49 -07:00
Fangrui Song	b930500a76	[MC][PowerPC] Make .reloc support arbitrary relocation types Generalizes ad7199f3e60a49db023099dcb879fcc9cdf94a2e (R_PPC_NONE/R_PPC64_NONE).	2020-03-28 17:04:31 -07:00
Matt Arsenault	32fff55466	AMDGPU: Make use of default operands	2020-03-28 17:33:29 -04:00
Benjamin Kramer	1597d6396c	Put back initializers that were dropped in 0ab5b5b8581d9f2951575f7245824e6e4fc57dec Found by msan.	2020-03-28 22:06:12 +01:00
Benjamin Kramer	e937c2bb41	[MDBuilder] Don't use stable sort for sorting integers.	2020-03-28 21:19:46 +01:00
Nikita Popov	74ebf8606d	[InstCombine] Remove unreachable blocks before DCE Dropping unreachable code may reduce use counts on other instructions, so it's better to do this earlier rather than later. NFC-ish, may only impact worklist order.	2020-03-28 21:19:16 +01:00
Nikita Popov	798e241738	[InstCombine] Merge two functions; NFC Merge AddReachableCodeToWorklist() into prepareICWorklistFromFunction(). It's one logical step, and this makes it easier to move code.	2020-03-28 21:19:16 +01:00
Benjamin Kramer	1c3423f774	[ADT] Automatically forward llvm::sort to array_pod_sort if safe This is safe if the iterator type is a pointer and the comparator is stateless. The enable_if pattern I'm adding here only uses array_pod_sort for the default comparator (std::less). Using array_pod_sort has a potential performance impact, but I didn't notice anything when testing clang. Sorting doesn't seem to be on the hot path anywhere in LLVM. Shrinks Release+Asserts clang by 73k.	2020-03-28 20:20:14 +01:00
Benjamin Kramer	888aa45564	[AMDGPU] Stabilize sort order Found by the expensive checks in llvm::sort.	2020-03-28 20:20:14 +01:00
Yonghong Song	1b7f236ced	[BPF] support 128bit int explicitly in layout spec Currently, bpf does not specify 128bit alignment in its layout spec. So for a structure like struct ipv6_key_t { unsigned pid; unsigned __int128 saddr; unsigned short lport; }; clang will generate IR type %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] } Additional padding is to ensure later IR->MIR can generate correct stack layout with target layout spec. But it is common practice for a tracing program to be first compiled with target flag (e.g., x86_64 or aarch64) through clang to generate IR and then go through llc to generate bpf byte code. Tracing programs often refer to kernel internal data structures which need to be compiled with non-bpf target. But such a compilation model may cause a problem on aarch64. The bcc issue https://github.com/iovisor/bcc/issues/2827 reported such a problem. For the above structure, since aarch64 has "i128:128" in its layout string, the generated IR will have %struct.ipv6_key_t = type { i32, i128, i16 } Since bpf does not have "i128:128" in its spec string, the selectionDAG assumes alignment 8 for i128 and computes the stack storage size for the above is 32 bytes, which leads to incorrect code later. The x86_64 does not have this issue as it does not have "i128:128" in its layout spec as it permits i128 to be aligned at 8 bytes on the stack. Its IR type looks like %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] } The fix here is to add i128 support in layout spec, the same as aarch64. The only downside is we may have less optimal stack allocation in certain cases since we require 16byte alignment for i128 instead of 8. But this is probably fine as i128 is not used widely and in most cases users should already have proper alignment. Differential Revision: https://reviews.llvm.org/D76587	2020-03-28 11:46:29 -07:00
Benjamin Kramer	9f995c3d3d	Upgrade some instances of std::sort to llvm::sort. NFC.	2020-03-28 19:23:29 +01:00
Reid Kleckner	6523508714	[CodeGen] Fix sinking local values in lpads with phis There was already a test case for landingpads to handle this case, but I had forgotten to consider PHI instructions preceding the EH_LABEL in the landingpad. PR45261	2020-03-28 11:10:33 -07:00
Nikita Popov	f72f98d666	[InstCombine] Use replaceOperand() API in GEP transforms To make sure that replaced operands get DCEd. This drops one iteration from gepphigep.ll, which is still not optimal. This was the last test case performing more than 3 iterations. NFC-ish, only worklist order should change.	2020-03-28 19:07:25 +01:00
Nikita Popov	b7e2a09400	[InstCombine] Reduce code duplication in GEP of PHI transform; NFC The `NewGEP->setOperand(DI, NewPN)` call was duplicated, and the insertion of NewGEP is the same in both if/else, so we can extract it.	2020-03-28 19:07:25 +01:00
Alexandre Ganea	83280593de	After 09158252f777c2e2f06a86b154c44abcbcf9bb74, fix build when -DLLVM_ENABLE_THREADS=OFF Tested on Linux with Clang 9, and on Windows with Visual Studio 2019 16.5.1 with -DLLVM_ENABLE_THREADS=ON and OFF.	2020-03-28 13:54:58 -04:00
Nikita Popov	71de06ab72	[InstCombine] Fix worklist management in foldXorOfICmps() Because this code does not use the IC-aware replaceInstUsesWith() helper, we need to manually push users to the worklist. This is NFC-ish, in that it may only change worklist order.	2020-03-28 18:25:21 +01:00
Nikita Popov	820f1d3f35	[InstCombine] Change limit-max-iterations test case; NFC This particular case will stop needing multiple iterations in a followup change.	2020-03-28 18:25:20 +01:00
Enna1	471b8e4441	[CorrelatedValuePropagation] Remove redundant if statement in processSelect() This statement if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType()); is introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83 to fix a case where unreachable code can cause select instruction simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa, we begin to perform a depth-first walk of basic blocks. This means we will not visit unreachable blocks. So we do not need this special check any more. Differential Revision: https://reviews.llvm.org/D76753	2020-03-28 18:01:17 +01:00
Martin Storsjö	839d814b88	[AsmPrinter] Emit .weak directive for weak linkage on COFF for symbols without a comdat MC already knows how to emulate the .weak directive (with its ELF semantics; i.e., an undefined weak symbol resolves to 0, and a defined weak symbol has lower link precedence than a strong symbol of the same name) using COFF weak externals. Plumb this through the ASM printer too, so that definitions marked with __attribute__((weak)) at the language level (which gets translated to weak linkage at the IR level) have the corresponding .weak directive emitted. Note that declarations marked with __attribute__((weak)) at the language level (which translates to extern_weak at the IR level) already have .weak directives emitted. Weak/linkonce symbols without an associated comdat (in particular, ones generated with __attribute__((weak)) in C/C++) were earlier emitted as normal unique globals, as the comdat is required to provide the linkonce semantics. This change makes sure they are emitted as .weak instead, allowing other symbols to override them. Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not quite sure what that test covers, since the behavior being tested in it (the emission of a one_only section) is just a result of passing -function-sections to llc; the linkonce_odr makes no difference. Add a new coff-weak.ll which tests the new directive emission. Based on a previous patch by Shoaib Meenai. Differential Revision: https://reviews.llvm.org/D44543	2020-03-28 18:48:58 +02:00
Florian Hahn	0dc99c559e	[SCCP] Remove LatticeVal alias now that transition is done (NFC). The LatticeVal alias was introduced to reduce the diff size for the transition to ValueLatticeElement, which is done now. This patch removes the unnecessary alias and updates some very verbose type uses with auto.	2020-03-28 15:40:24 +00:00
Florian Hahn	1ce52f195f	[SCCP] Remove unused toLatticeValue helper (NFC). LatticeVal is an alias for ValueLatticeElement and the function is not used any longer.	2020-03-28 15:40:24 +00:00
Michael Liao	3bacfd4e26	Fix `-Wsign-compare` warning. NFC.	2020-03-28 10:20:27 -04:00
Martin Storsjö	e24fbfb6df	[llvm-rc] Allow -1 for menu item IDs This seems to be used in some resource files, e.g. `f3217573d7/include/wx/msw/wx.rc (L28)`. MSVC rc.exe and GNU windres both allow any value here, and silently just truncate to uint16_t range. This just explicitly allows the -1 value and errors out on others - the same was done for control IDs in dialogs in c1a67857ba0a6ba558818b589fe7c0fcc8f238ae. Differential Revision: https://reviews.llvm.org/D76951	2020-03-28 14:32:08 +02:00
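
The size mismatch described in the [BPF] layout-spec commit (1b7f236ced) can be reproduced with a few lines of arithmetic. The sketch below is not LLVM's DataLayout code. It applies the usual C-style rule (each field starts at the next multiple of its alignment, and the total size is rounded up to the largest field alignment) to the three fields of `ipv6_key_t`. The helper names `align_up`, `struct_layout`, and `ipv6_key_fields` are hypothetical, and the i32 and i16 alignments of 4 and 2 bytes are assumptions.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def struct_layout(fields):
    """Lay out (size, align) fields in order.

    Returns (list of field offsets, total struct size in bytes).
    """
    offset = 0
    max_align = 1
    offsets = []
    for size, align in fields:
        offset = align_up(offset, align)  # insert padding before the field
        offsets.append(offset)
        offset += size
        max_align = max(max_align, align)
    # Tail padding: the struct size is a multiple of its strictest alignment.
    return offsets, align_up(offset, max_align)


def ipv6_key_fields(i128_align):
    """Fields of ipv6_key_t: i32 pid, i128 saddr, i16 lport."""
    return [(4, 4), (16, i128_align), (2, 2)]


if __name__ == "__main__":
    for i128_align in (8, 16):
        offsets, size = struct_layout(ipv6_key_fields(i128_align))
        print(f"i128 align {i128_align}: offsets={offsets} size={size}")
```

With 16-byte alignment the i128 field lands at offset 16 and the struct occupies 48 bytes, which matches the padded IR type `{ i32, [12 x i8], i128, i16, [14 x i8] }` (4 + 12 + 16 + 2 + 14). With 8-byte alignment the same fields occupy 32 bytes, the stack size the commit message says was computed when the layout string lacked "i128:128".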


194067 Commits