llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

Author	SHA1	Message	Date
Baptiste Saleil	5f9d8eb8f8	[PowerPC] Add clang options to control MMA support This patch adds frontend and backend options to enable and disable the PowerPC MMA operations added in ISA 3.1. Instructions using these options will be added in subsequent patches. Differential Revision: https://reviews.llvm.org/D81442	2020-08-24 09:35:55 -05:00
dongAxis	7a35eee5d4	[coroutine] should disable inline before calling coro split summary: When a callee coroutine function is inlined into a caller coroutine function before the coro-split pass, llvm emits "coroutine should have exactly one defining @llvm.coro.begin". It seems that the coro-early pass cannot handle this quite well. So we believe that an unsplit coroutine function should not be inlined. This patch fixes the issue by not inlining a function if it has the attribute "coroutine.presplit" (which means the function has not been split yet). TestPlan: check-llvm Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D85812	2020-08-24 22:22:08 +08:00
Matt Arsenault	a801d01966	GlobalISel: Improve dead instruction debug printing This was printing the "Is dead" on a separate line from the instruction, which was harder to follow.	2020-08-24 10:12:00 -04:00
Matt Arsenault	7af9eb8150	AMDGPU/GlobalISel: Use different technique for sample v3s16 values Avoid relying on implicit_def values, and odd sized G_INSERT/G_EXTRACT	2020-08-24 10:07:30 -04:00
Matt Arsenault	1221909666	AMDGPU/GlobalISel: Add baseline, failing unmerge tests	2020-08-24 10:07:30 -04:00
Francesco Petrogalli	4b43384841	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`. * Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an ordering function for the `ElementCount` class. * Bits and pieces around printing the `ElementCount` to string streams. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types In these cases, an assert(!VF.Scalable && "<msg>") has been added to make sure we don't enter codepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operations _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` would require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 7. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that might require changes at command line. Note that in some places the idiom ``` unsigned VF = ... auto VTy = FixedVectorType::get(ScalarTy, VF) ``` has been replaced with ``` ElementCount VF = ... assert(!VF.Scalable && ...); auto VTy = VectorType::get(ScalarTy, VF) ``` The assertion guarantees that the new code is (at least in debug mode) functionally equivalent to the old version. Notice that this change was possible because none of the methods that are specific to `FixedVectorType` were used after the instantiation of `VTy`. Reviewed By: rengolin, ctetreau Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:54:03 +00:00
Matt Arsenault	9583788fb1	AMDGPU/GlobalISel: Start implementing computeKnownBitsForTargetInstr Handle workitem intrinsics. There isn't really a way to adequately test this right now, since none of the known bits users are fine-grained enough to test the edge conditions. This triggers a number of instances of the new 64-bit to 32-bit shift combine in the existing tests.	2020-08-24 09:53:27 -04:00
Francesco Petrogalli	0a6aded52a	Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]" Reverting because the commit message doesn't reflect the one agreed on phabricator at https://reviews.llvm.org/D85794. This reverts commit c8d2b065b98fa91139cc7bb1fd1407f032ef252e.	2020-08-24 13:50:55 +00:00
Matt Arsenault	21aca8a3e2	GlobalISel: Reduce G_SHL width if source is extension shl ([sza]ext x, y) => zext (shl x, y). Turns expensive 64 bit shifts into 32 bit if it does not overflow the source type: This is a port of an AMDGPU DAG combine added in 5fa289f0d8ff85b9e14d2f814a90761378ab54ae. InstCombine does this already, but we need to do it again here to apply it to shifts introduced for lowered getelementptrs. This will help matching addressing modes that use 32-bit offsets in a future patch. TableGen annoyingly assumes only a single match data operand, so introduce a reusable struct. However, this still requires defining a separate GIMatchData for every combine which is still annoying. Adds a morally equivalent function to the existing getShiftAmountTy. Without this, we would have to do try to repeatedly query the legalizer info and guess at what type to use for the shift.	2020-08-24 09:42:40 -04:00
Francesco Petrogalli	04c1fe05b7	[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] Changes: * Change `ToVectorTy` to deal directly with `ElementCount` instances. * `VF == 1` replaced with `VF.isScalar()`. * `VF > 1` and `VF >=2` replaced with `VF.isVector()`. * `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`. * Add `<` operator to `ElementCount` to be able to use `llvm::SmallSetVector<ElementCount, ...>`. * Bits and pieces around printing the ElementCount to string streams. * Added a static method to `ElementCount` to represent a scalar. To guarantee that this change is a NFC, `VF.Min` and asserts are used in the following places: 1. When it doesn't make sense to deal with the scalable property, for example: a. When computing unrolling factors. b. When shuffle masks are built for fixed width vector types In these cases, an assert(!VF.Scalable && "<msg>") has been added to make sure we don't enter codepaths that don't make sense for scalable vectors. 2. When there is a conscious decision to use `FixedVectorType`. These uses of `FixedVectorType` will likely be removed in favour of `VectorType` once the vectorizer is generic enough to deal with both fixed vector types and scalable vector types. 3. When dealing with building constants out of the value of VF, for example when computing the vectorization `step`, or building vectors of indices. These operations _make sense_ for scalable vectors too, but changing the code in these places to be generic and make it work for scalable vectors is to be submitted in a separate patch, as it is a functional change. 4. When building the potential VFs in VPlan. Making the VPlan generic enough to handle scalable vectorization factors is a functional change that needs a separate patch. See for example `void LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned MaxVF)`. 5. The class `IntrinsicCostAttribute`: this class still uses `unsigned VF` as updating the field to use `ElementCount` would require changes that could result in changing the behavior of the compiler. Will be done in a separate patch. 7. When dealing with user input for forcing the vectorization factor. In this case, adding support for scalable vectorization is a functional change that might require changes at command line. Differential Revision: https://reviews.llvm.org/D85794	2020-08-24 13:39:42 +00:00
Florian Hahn	726de666d8	[DSE,MemorySSA] Delay PointerMayBeCaptured calls until actually needed. Avoid computing InvisibleToCallerBefore/AfterRet up front. In most cases, this information is not really needed. Instead, introduce helper functions to compute and cache the result on demand. Notably, this also does not use PointerMayBeCapturedBefore for isInvisibleToCallerBeforeRet, as it requires the killing MemoryDef as starting instruction, making the caching ineffective. But it appears the use of PointerMayBeCapturedBefore has very limited benefits in practice (e.g. on SPEC2000/SPEC2006/MultiSource there are no binary changes with -O3 -flto). Refrain from using it for now, to limit-compile-time. This gives some nice compile-time improvements: http://llvm-compile-time-tracker.com/compare.php?from=db9345f6810f379a36752dc52caf5230585d0ebd&to=b4d091047e1b8a3d377d200137b79d03aca65663&stat=instructions	2020-08-24 14:05:44 +01:00
Anna Welker	1f8e3db230	[ARM][MVE] Allow tail predication for strides !=1 with gather/scatters If gather/scatters are enabled, ARMTargetTransformInfo now allows tail predication for loops with a much wider range of strides, up to anything that is loop invariant. Differential Revision: https://reviews.llvm.org/D85410	2020-08-24 13:54:47 +01:00
Florian Hahn	8ba75d1814	[DSE,MemorySSA] Regenerate some check lines. The check lines were generated before align was added for all instructions. Regenerate them to reduce diff noise for actual functional changes.	2020-08-24 13:24:44 +01:00
Jonas Paulsson	9c268838b4	[SystemZ] Preserve the MachineMemOperand in emitCondStore() in all cases. Review: Ulrich Weigand	2020-08-24 14:07:30 +02:00
Florian Hahn	fd197bfffa	[DSE,MemorySSA] Limit elimination at end of function to single UO. Limit elimination of stores at the end of a function to MemoryDefs with a single underlying object, to save compile time. In practice, the case with multiple underlying objects does not seem very important. For -O3 -flto on MultiSource/SPEC2000/SPEC2006 this results in a total of 2 more stores being eliminated. We can always re-visit that in the future.	2020-08-24 13:00:17 +01:00
Sanjay Patel	8e77949af5	[InstCombine] fold abs of select with negated op (PR39474) Similar to the existing transform - peek through a select to match a value and its negation. https://alive2.llvm.org/ce/z/MXi5KG define i8 @src(i1 %b, i8 %x) { %0: %neg = sub i8 0, %x %sel = select i1 %b, i8 %x, i8 %neg %abs = abs i8 %sel, 1 ret i8 %abs } => define i8 @tgt(i1 %b, i8 %x) { %0: %abs = abs i8 %x, 1 ret i8 %abs } Transformation seems to be correct!	2020-08-24 07:37:55 -04:00
Sanjay Patel	ee2a844238	[InstCombine] add tests for abs of select with negated op; NFC (PR39474)	2020-08-24 07:37:54 -04:00
Sam Parker	9295052eb8	[SCEV] Still (again) trying to fix buildbots	2020-08-24 11:24:30 +01:00
Sam Parker	0828b21ed0	[SCEV] Still trying to fix windows buildbots	2020-08-24 10:26:48 +01:00
Julien Etienne	16f5b362c9	Add support for AVR attiny441 and attiny841 Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D85589 Patch by Julien Etienne	2020-08-24 20:28:32 +12:00
Sam Parker	e3a639bcac	[NFCI][SimplifyCFG] Combine select costs and checks Combine the cost modelling and validity checks for the phi to select conversion in SpeculativelyExecuteBB, extracting the logic out into a function.	2020-08-24 09:16:11 +01:00
Bjorn Pettersson	8f041837f3	[SelectionDAG] Fix miscompile bug in expandFunnelShift This is a fixup of commit 0819a6416fd217 (D77152) which could result in miscompiles. The miscompile could only happen for targets where isOperationLegalOrCustom could return different values for FSHL and FSHR. The commit mentioned above added logic in expandFunnelShift to convert between FSHL and FSHR by swapping direction of the funnel shift. However, that transform is only legal if we know that the shift count (modulo bitwidth) isn't zero. Basically, since fshr(-1,0,0)==0 and fshl(-1,0,0)==-1 then doing a rewrite such as fshr(X,Y,Z) => fshl(X,Y,0-Z) would be incorrect if Z modulo bitwidth, could be zero. ``` $ ./alive-tv /tmp/test.ll ---------------------------------------- define i32 @src(i32 %x, i32 %y, i32 %z) { %0: %t0 = fshl i32 %x, i32 %y, i32 %z ret i32 %t0 } => define i32 @tgt(i32 %x, i32 %y, i32 %z) { %0: %t0 = sub i32 32, %z %t1 = fshr i32 %x, i32 %y, i32 %t0 ret i32 %t1 } Transformation doesn't verify! ERROR: Value mismatch Example: i32 %x = #x00000000 (0) i32 %y = #x00000400 (1024) i32 %z = #x00000000 (0) Source: i32 %t0 = #x00000000 (0) Target: i32 %t0 = #x00000020 (32) i32 %t1 = #x00000400 (1024) Source value: #x00000000 (0) Target value: #x00000400 (1024) ``` It could be possible to add back the transform, given that logic is added to check that (Z % BW) can't be zero. Since there were no test cases proving that such a transform actually would be useful I decided to simply remove the faulty code in this patch. Reviewed By: foad, lebedev.ri Differential Revision: https://reviews.llvm.org/D86430	2020-08-24 09:52:11 +02:00
Sam Parker	83a53e27fa	[SCEV] Attempt to fix windows buildbots	2020-08-24 08:29:22 +01:00
Sam Parker	edc1713733	[SCEV] Add operand methods to Cast and UDiv Add methods to access operands in a similar manner to NAryExpr. Differential Revision: https://reviews.llvm.org/D86083	2020-08-24 06:57:07 +01:00
Fangrui Song	e52bb53986	[LiveDebugVariables] Internalize class DbgVariableValue. NFC	2020-08-23 22:53:46 -07:00
Qiu Chaofan	9835dee1b9	[PowerPC] Support lowering int-to-fp on ppc_fp128 D70867 introduced support for expanding most ppc_fp128 operations. But sitofp/uitofp is missing. This patch adds that after D81669. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D81918	2020-08-24 11:18:16 +08:00
Qiu Chaofan	04286d2214	[PowerPC] Allow constrained FP intrinsics in mightUseCTR We may hit an "Invalid CTR loop" crash when there are constrained ops inside the loop. This patch adds constrained FP intrinsics to the list so that CTR loop verification doesn't complain about them. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D81924	2020-08-24 11:09:58 +08:00
QingShan Zhang	4ba8c0db80	[DAGCombine] Remove dead node when it is created by getNegatedExpression We hit the compile-time issue reported in https://bugs.llvm.org/show_bug.cgi?id=46877, and the reason is the same as in D77319. So we need to remove the dead node we created to avoid increasing the problem size of DAGCombiner. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D86183	2020-08-24 02:50:58 +00:00
Qiu Chaofan	6cd03c3d8a	[PowerPC] Support constrained vector fp/int conversion This patch makes these operations legal and adds the necessary codegen patterns. There are still some issues similar to D77033 for conversion from the v1i128 type. But the normal type tests synced in vector-constrained-fp-intrinsic pass successfully. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83654	2020-08-24 10:10:27 +08:00
Roman Lebedev	6072801a3e	[InstCombine] Negator: freeze is freely negatible if its operand is negatible	2020-08-23 23:28:19 +03:00
Roman Lebedev	324948b801	[NFC][InstCombine] Add tests for negation of freeze	2020-08-23 23:28:19 +03:00
Florian Hahn	27f75460b8	[llvm-reduce] Skip terminators when reducing instructions. Removing terminators will result in invalid IR, making further reductions pointless. I do not think there is any valid use case where we actually want to create invalid IR as part of a reduction. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D86210	2020-08-23 17:20:34 +01:00
Fangrui Song	1b73f40c27	[X86][FastISel] Support materializing floating-point constants for large code model & PIC The following program miscompiles because rL216012 added static relocation model support but not for PIC. ``` // clang -fpic -mcmodel=large -O0 a.cc double foo() { return 42.0; } ``` This patch adds PIC support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86024	2020-08-23 08:36:18 -07:00
Florian Hahn	cba391237b	[DSE,MemorySSA] Keep single DL instance in DSEState (NFC). Small cleanup, also removes one instance of getting DataLayout without using it later.	2020-08-23 15:56:38 +01:00
Sanjay Patel	de6544e4be	[DAGCombiner] restrict store merge of truncs to early combining The pattern matching does not account for truncating stores, so it is unlikely to work at later stages. So we are likely wasting compile-time with no hope of improvement by running this later.	2020-08-23 10:44:23 -04:00
Stefan Gränitz	45e94e8a0b	[ORC] Add a LLJITWithThinLTOSummaries example in OrcV2Examples The example demonstrates how to use a module summary index file produced for ThinLTO to: * find the module that defines the main entry point * find all extra modules that are required for the build A LIT test runs the example as part of the LLVM test suite [1] and shows how to create a module summary index file. The code also provides two Error types that can be useful when working with ThinLTO summaries. [1] if LLVM_BUILD_EXAMPLES=ON and platform is not Windows Differential Revision: https://reviews.llvm.org/D85974	2020-08-23 14:02:10 +02:00
Craig Topper	31a27aaaa3	[X86] Allow 32-bit mode only CPUs with -mtune on 64-bit targets gcc errors on this, but I'm nervous that since -mtune has been ignored by clang for so long that there may be code bases out there that pass 32-bit cpus to clang.	2020-08-22 16:38:05 -07:00
Fangrui Song	887f6810a9	[DebugInfo][test] Fix dwarf-callsite-related-attrs.ll after llvm-dwarfdump --statistics change	2020-08-22 14:09:19 -07:00
Fangrui Song	ed1bc644de	[llvm-dwarfdump] --statistics: break lines and indent by 2 so that the user does not have to pipe the output to `jq` or `python -m json.tool`. This change makes testing more convenient because `-NEXT` patterns can be used. The "prettify by default" is a good tradeoff to make. The output size increases a bit. Differential Revision: https://reviews.llvm.org/D86318	2020-08-22 13:58:18 -07:00
Sanjay Patel	3e888e3981	[DAGCombiner] add early exit for store merging of truncs This should be NFC in terms of output because the endian check further down would bail out too, but we are wasting time by waiting to that point to give up. If we generalize that function to deal with more than i8 types, we should not have to deal with the degenerate case.	2020-08-22 16:25:16 -04:00
Sanjay Patel	e023c2fe80	[AArch64] add tests for store merge of truncs; NFC	2020-08-22 14:54:40 -04:00
Jeremy Morse	614cb4374c	Follow-up build fix for rGae6f78824031 One of the bots objects to brace-initializing a tuple: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/43595/steps/build%20stage%201/logs/stdio As the tuple constructor is apparently explicit. Fall back to the (not as pretty) explicit construction of a tuple. I'd thought this was permitted behaviour; will investigate why this fails later.	2020-08-22 19:09:30 +01:00
Fangrui Song	aa4ef6c65a	[LiveDebugValues] Delete unneeded copy constructor after D83047 It will suppress the implicitly-declared copy assignment operator in C++20.	2020-08-22 10:55:28 -07:00
LLVM GN Syncbot	97e8e05e8c	[gn build] Port ae6f7882403	2020-08-22 17:32:25 +00:00
Jeremy Morse	82eef64a62	[LiveDebugValues] Add instruction-referencing LDV implementation This patch imports the instruction-referencing implementation of LiveDebugValues proposed here: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142368.html The new implementation is unreachable in this patch, it's the next patch that enables it behind a command line switch. Briefly, rather than tracking variable locations by just their location as the 'VarLoc' implementation does, this implementation does it by value: * Each value defined in a function is numbered, and propagated through dataflow, * Each DBG_VALUE reads a machine value number from a machine location, * Variable _values_ are propagated through dataflow, * Variable values are translated back into locations, DBG_VALUEs inserted to specify where those locations are. The ultimate aim of this is to enable referring to variable values throughout post-isel code, rather than locations. Those patches will build on top of this new LiveDebugValues implementation in later patches -- it can't be done with the VarLoc implementation as we don't have value information, only locations. Differential Revision: https://reviews.llvm.org/D83047	2020-08-22 18:31:08 +01:00
Tyker	d75baa4613	[llvm-reduce] make llvm-reduce save the best reduction it has when it crashes This helps with both debugging llvm-reduce and sometimes getting a useful result even if llvm-reduce crashes Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85996	2020-08-22 19:16:43 +02:00
Matt Arsenault	4175888640	GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS This switches from using G_EXTRACT in odd cases to widen with undef and unmerge.	2020-08-22 10:25:53 -04:00
Jeremy Morse	0aa1ee80e0	Fix some builds after 20bb9fe565a -Wsuggest-override indicates this VarLocBasedLDV method needs the override keyword.	2020-08-22 15:20:42 +01:00
LLVM GN Syncbot	50d4259933	[gn build] Port 20bb9fe565a	2020-08-22 13:52:08 +00:00
Jeremy Morse	e6ff960652	[LiveDebugValues] Install an implementation-picking LiveDebugValues pass This patch renames the current LiveDebugValues class to "VarLocBasedLDV" and removes the pass-registration code from it. It creates a separate LiveDebugValues class that deals with pass registration and management, that calls through to VarLocBasedLDV::ExtendRanges when runOnMachineFunction is called. This is done through the "LDVImpl" abstract class, so that a future patch can install the new instruction-referencing LiveDebugValues implementation and have it picked at runtime. No functional change is intended, just shuffling responsibilities. Differential Revision: https://reviews.llvm.org/D83046	2020-08-22 14:50:22 +01:00
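
The miscompile described in commit 8f041837f3 ([SelectionDAG] Fix miscompile bug in expandFunnelShift) can be reproduced with plain integer arithmetic. The following is a minimal sketch, not LLVM code: `fshl32`, `fshr32`, and `fshl_via_fshr` are hypothetical helper names that model the 32-bit funnel-shift semantics and the rewrite shown in the alive-tv listing, assuming the shift amount is taken modulo the bit width.

```cpp
#include <cstdint>

// Hypothetical helpers modelling 32-bit funnel shifts; these are not LLVM APIs.

// fshl: concatenate hi:lo, shift left by z % 32, keep the high 32 bits.
constexpr uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t z) {
  return (z % 32 == 0) ? hi : ((hi << (z % 32)) | (lo >> (32 - z % 32)));
}

// fshr: concatenate hi:lo, shift right by z % 32, keep the low 32 bits.
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t z) {
  return (z % 32 == 0) ? lo : ((hi << (32 - z % 32)) | (lo >> (z % 32)));
}

// The rewrite from the alive-tv listing: fshl(X, Y, Z) => fshr(X, Y, 32 - Z).
constexpr uint32_t fshl_via_fshr(uint32_t hi, uint32_t lo, uint32_t z) {
  return fshr32(hi, lo, 32u - z);
}

// With a nonzero shift amount (modulo 32), both forms agree.
static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) ==
                  fshl_via_fshr(0x12345678u, 0x9ABCDEF0u, 8),
              "rewrite is valid for z = 8");

// Counterexample from the commit message: x = 0, y = 1024, z = 0.
static_assert(fshl32(0, 1024, 0) == 0, "source value");
static_assert(fshl_via_fshr(0, 1024, 0) == 1024, "target value");
```

When `z % 32` is zero, `fshl32` returns its first operand while `fshr32` returns its second, so negating the shift amount cannot convert one into the other. This is why the commit notes that the transform could only be restored with a check that (Z % BW) cannot be zero.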

202382 Commits
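
The shift narrowing from commit 21aca8a3e2 (GlobalISel: Reduce G_SHL width if source is extension) rests on a simple arithmetic fact: shifting a zero-extended 32-bit value in 64 bits gives the same result as shifting in 32 bits and then extending, provided no set bit is shifted out of the 32-bit source. The sketch below illustrates only that arithmetic for the zext case. The helper names are hypothetical, and the check on concrete values is an assumption that stands in for whatever analysis the real combine uses.

```cpp
#include <cstdint>

// Hypothetical helpers; they model the arithmetic only, not the GlobalISel combine.
// All of them assume a shift amount y < 32.

// Original form: shl (zext x), y, with the shift done in 64 bits.
constexpr uint64_t shl_of_zext(uint32_t x, unsigned y) {
  return static_cast<uint64_t>(x) << y;
}

// Narrowed form: zext (shl x, y), with the shift done in 32 bits.
constexpr uint64_t zext_of_shl(uint32_t x, unsigned y) {
  return static_cast<uint64_t>(static_cast<uint32_t>(x << y));
}

// The narrowing loses nothing when the top y bits of x are zero.
constexpr bool narrowing_is_safe(uint32_t x, unsigned y) {
  return y == 0 || (y < 32 && (x >> (32 - y)) == 0);
}

// Safe case: the top 16 bits are clear, so both forms agree.
static_assert(narrowing_is_safe(0x0000FFFFu, 16), "no overflow");
static_assert(shl_of_zext(0x0000FFFFu, 16) == zext_of_shl(0x0000FFFFu, 16),
              "forms agree when the shift does not overflow");

// Unsafe case: the top bit is set, so the 32-bit shift drops it.
static_assert(!narrowing_is_safe(0x80000001u, 1), "overflow");
static_assert(shl_of_zext(0x80000001u, 1) != zext_of_shl(0x80000001u, 1),
              "forms differ when a set bit is shifted out");
```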