llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	2593dc40a4	[X86][SSE] canonicalizeShuffleWithBinOps - handle target shuffles. Fold SHUFFLE(BINOP(SHUFFLE(X),SHUFFLE(Y))) -> BINOP(SHUFFLE'(X),SHUFFLE'(Y)) style patterns as well as the existing shuffles of constants.	2021-03-15 15:01:29 +00:00
David Green	19dfc4f3fc	[AArch64] Zero extended extract_vector_elt pattern This adds a pattern for i64 zext_inreg(i32 extract_vector_elt X), producing a single UMOVvi16 instruction that is already expected to clear the top bits. The exact pattern that this matches is and(anyext(vector_extract X, lane), 0xff), similar to the sext patterns higher up in the same file. Differential Revision: https://reviews.llvm.org/D98599	2021-03-15 14:56:20 +00:00
Amy Kwan	d53e3aade2	[NFC][PowerPC] Add additional load/store test cases This patch adds additional load/store test cases involving scalars, vectors, and PC-Rel in preparation for the refactored load and store implementation introduced in D93370. Differential Revision: https://reviews.llvm.org/D97391	2021-03-15 08:54:38 -05:00
Wael Yehia	187913c5cf	[PATCH] fix location of test case from D97507.	2021-03-15 09:34:24 -04:00
Anton Afanasyev	0e939ab792	[SLP][Test] Precommit test for PR40522	2021-03-15 15:53:54 +03:00
Carl Ritson	aa27a4d8bb	[AMDGPU] Fix shortfalls in WQM marking When tracking defined lanes through phi nodes in the live range graph each branch of the phi must be handled independently. Also rewrite the marking algorithm to reduce unnecessary operations. Previously a shared set of defined lanes was used which caused marking to stop prematurely. This was observable in existing lit tests, but test patterns did not cover this detail. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D98614	2021-03-15 21:44:15 +09:00
Simon Pilgrim	f127ae93b8	[X86][SSE] canonicalizeShuffleWithBinOps - add X86ISD::PSHUFB handling. Recommit rGcd938ab162b0ac560dd0e9fee290980c7e0e47e5 with an early-out if the pshufb would introduce zeros across the binop.	2021-03-15 12:43:30 +00:00
Bradley Smith	19a711a469	[AArch64][SVE] Add unpredicated ld1/st1 patterns for reg+reg addressing modes Differential Revision: https://reviews.llvm.org/D95677	2021-03-15 12:36:28 +00:00
Simon Pilgrim	eaf9528054	Revert rG9ba577eca2e339726bfaad4e615c6324a705b292 "[X86][SSE] canonicalizeShuffleWithBinOps - handle target shuffles. NFCI." Sorry this wasn't supposed to be committed yet (and certainly not tagged as NFCI....)	2021-03-15 12:23:44 +00:00
Nikita Popov	7180decf64	Revert "[NFCI][ValueTracking] getUnderlyingObject(): gracefully handle cycles" This reverts commit aa440ba24dc25e4c95f6dcf8ff647024f3b12661. This has a non-trivial compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=0c5b789c7342ee8384507c3242fc256e23248c4d&to=aa440ba24dc25e4c95f6dcf8ff647024f3b12661&stat=instructions I don't believe this is the correct way to address the issue in this case.	2021-03-15 13:12:39 +01:00
Simon Pilgrim	7525992d5b	[X86][SSE] canonicalizeShuffleWithBinOps - handle target shuffles. NFCI. Fold SHUFFLE(BINOP(SHUFFLE(X),SHUFFLE(Y))) -> BINOP(SHUFFLE'(X),SHUFFLE'(Y)) style patterns as well as the existing shuffles of constants.	2021-03-15 11:59:25 +00:00
Stephen Kelly	9bf084b8f1	[AST] Add generator for source location introspection Generate a json file containing descriptions of AST classes and their public accessors which return SourceLocation or SourceRange. Use the JSON file to generate a C++ API and implementation for accessing the source locations and method names for accessing them for a given AST node. This new API can be used to implement 'srcloc' output in clang-query: http://ce.steveire.com/z/m_kTIo The JSON file can also be used to generate bindings for other languages, such as Python and Javascript: https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching In this first version of this feature, only the accessors for Stmt classes are generated, not Decls, TypeLocs etc. Those can be added after this change is reviewed, as this change is mostly about infrastructure of these code generators. Also in this version, the platforms/cmake configurations are excluded as much as possible so that support can be added iteratively. Currently a break on any platform causes a revert of the entire feature. This way, the `OR WIN32` can be removed in a future commit and if it breaks the buildbots, only that commit gets reverted, making the entire process easier to manage. Differential Revision: https://reviews.llvm.org/D93164	2021-03-15 10:52:44 +00:00
Roman Lebedev	39d655ee4f	[NFCI][ValueTracking] getUnderlyingObject(): gracefully handle cycles Normally, this function just doesn't bother about cycles, and hopes that the caller supplied a small-enough depth so that at worst it will take a potentially large, but limited amount of time. But that obviously doesn't work if there is no depth limit. This reapplies 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4, but without asserting, just bailing out once a cycle is detected.	2021-03-15 13:51:02 +03:00
Fraser Cormack	a81aeba1b5	[RISCV] Support fixed-length vectors in the calling convention This patch adds fixed-length vector support to the calling convention when RVV is used to lower fixed-length vectors. The scheme follows the regular vector calling convention for the argument/return registers, but uses scalable vector container types as the LocVTs, and converts to/from the fixed-length vector value types as required. Fixed-length vector types may be split when the combination of minimum VLEN and the maximum allowable LMUL is not large enough to fully contain the vector. In this case the behaviour differs between fixed-length vectors passed as parameters and as return values: 1. For return values, vectors must be passed entirely via registers or via the stack. 2. For parameters, unlike scalar values, split vectors continue to be passed by value, and are split across multiple registers until there are no remaining registers. Thus vector parameters may be found partly in registers and partly on the stack. As with scalable vectors, the first fixed-length mask vector is passed via v0. Split mask fixed-length vectors are passed first via v0 and then via the next available vector register: v8,v9,etc. The handling of vector return values uses all available argument registers v8-v23 which does not adhere to the calling convention we're supposedly implementing, but since this issue affects both fixed-length and scalable-vector values, it was left as-is. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97954	2021-03-15 10:43:51 +00:00
Jay Foad	8f1047f96b	[AMDGPU] Use depth first iterator instead of recursive DFS. NFCI. The reason for this is to avoid deep recursion in DFS() which can cause stack overflow on large CFGs, especially on Windows. Differential Revision: https://reviews.llvm.org/D98528	2021-03-15 10:32:55 +00:00
Simon Pilgrim	53a291a3fe	Fix MSVC "switch statement contains 'default' but no 'case' labels" warning. NFCI.	2021-03-15 09:45:45 +00:00
Simon Pilgrim	e9b7429757	[X86][SSE] Attempt to merge single-op hops for slow targets. For slow-hop targets, see if any single-op hops are duplicating work already done on another (dual-op) hop, which can sometimes occur as isHorizontalBinOp tries to find potential duplicates (but can't merge them itself). If so, reuse the other hop and shuffle the result.	2021-03-15 09:30:20 +00:00
Roman Lebedev	a289925031	Revert "[NFCI][ValueTracking] getUnderlyingObject(): assert that no cycles are encountered" This reverts commit 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4. Seems to make bots unhappy.	2021-03-15 12:00:59 +03:00
Roman Lebedev	b4b4a1122d	[NFCI][ValueTracking] getUnderlyingObject(): assert that no cycles are encountered Jeroen Dobbelaere in https://lists.llvm.org/pipermail/llvm-dev/2021-March/149206.html is reporting that this function can end up in an endless loop when called from SROA w/ full restrict patches. For now, simply ensure that such problems are caught earlier/easier.	2021-03-15 11:52:31 +03:00
Max Kazantsev	dee7bb8299	[Test] Replace checks with auto-generated checks	2021-03-15 14:32:00 +07:00
Hongtao Yu	9e418be560	[NFC][Inliner] Debugging support to print function size after each inlining. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D98439	2021-03-14 22:11:53 -07:00
Hsiangkai Wang	f8bb6b4096	[RISCV] Support inline asm for vector instructions. Types of fractional LMUL and LMUL=1 are all using VR register class. When using inline asm, it will use the first type in the register class as the type for the register. It is not necessarily the same as the value type. We need to use INSERT_SUBVECTOR/EXTRACT_SUBVECTOR/BITCAST to make it legal to put the value in the corresponding register class. Differential Revision: https://reviews.llvm.org/D97480	2021-03-15 11:02:18 +08:00
Stephen Kelly	ec57b30294	Revert "[AST] Add generator for source location introspection" This reverts commit 91abaa1f8d97e8efa249c31686fd643ff5f1e2c2.	2021-03-15 01:16:10 +00:00
Craig Topper	729d7b754d	[RISCV] Give an explicit error if 'generic' CPU is passed instead of 'generic-rv32' or 'generic-rv64'. Validate 64Bit feature against the triple. I encountered a project that uses llvm that passes "generic" by default. While I could fix that project, I wouldn't be surprised if other projects did something similar. So it seems like a good idea to provide a better error here. I've also added validation of the 64Bit feature against the triple so that we can catch a mismatched CPU before failing in a mysterious way. We can make it pretty far in isel because we calculate XLenVT from the triple and use that to set up the legal integer type. Reviewed By: luismarques, khchen Differential Revision: https://reviews.llvm.org/D98307	2021-03-14 17:21:31 -07:00
Stephen Kelly	84c98ffec3	[AST] Add generator for source location introspection Generate a json file containing descriptions of AST classes and their public accessors which return SourceLocation or SourceRange. Use the JSON file to generate a C++ API and implementation for accessing the source locations and method names for accessing them for a given AST node. This new API can be used to implement 'srcloc' output in clang-query: http://ce.steveire.com/z/m_kTIo The JSON file can also be used to generate bindings for other languages, such as Python and Javascript: https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching In this first version of this feature, only the accessors for Stmt classes are generated, not Decls, TypeLocs etc. Those can be added after this change is reviewed, as this change is mostly about infrastructure of these code generators. Also in this version, the platforms/cmake configurations are excluded as much as possible so that support can be added iteratively. Currently a break on any platform causes a revert of the entire feature. This way, the `OR WIN32` can be removed in a future commit and if it breaks the buildbots, only that commit gets reverted, making the entire process easier to manage. Differential Revision: https://reviews.llvm.org/D93164	2021-03-15 00:00:29 +00:00
Stephen Kelly	9a84a85fd4	Revert "[AST] Add generator for source location introspection" This reverts commit 477e4b974653f92960c0bf569d88da7baacef68a.	2021-03-14 22:51:45 +00:00
Craig Topper	fb361c2b5e	[X86] Add -prefer-vector-width=256 tests for v16i8 smulo/umulo.	2021-03-14 15:37:23 -07:00
Stephen Kelly	99ea2ede68	[AST] Add generator for source location introspection Generate a json file containing descriptions of AST classes and their public accessors which return SourceLocation or SourceRange. Use the JSON file to generate a C++ API and implementation for accessing the source locations and method names for accessing them for a given AST node. This new API can be used to implement 'srcloc' output in clang-query: http://ce.steveire.com/z/m_kTIo The JSON file can also be used to generate bindings for other languages, such as Python and Javascript: https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching In this first version of this feature, only the accessors for Stmt classes are generated, not Decls, TypeLocs etc. Those can be added after this change is reviewed, as this change is mostly about infrastructure of these code generators. Also in this version, the platforms/cmake configurations are excluded as much as possible so that support can be added iteratively. Currently a break on any platform causes a revert of the entire feature. This way, the `OR WIN32` can be removed in a future commit and if it breaks the buildbots, only that commit gets reverted, making the entire process easier to manage. Differential Revision: https://reviews.llvm.org/D93164	2021-03-14 22:32:42 +00:00
Chenguang Wang	c7cbedcb2c	[ArgPromotion] Copy additional metadata for loads. Current ArgPromotion implementation does not copy it: https://godbolt.org/z/zzTKof Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D93927	2021-03-14 21:28:14 +00:00
Simonas Kazlauskas	fc4f1bf4fc	[InstSimplify] Add additional GEP transform tests & regenerate	2021-03-14 23:18:26 +02:00
Jan Kratochvil	1b4932460d	[llvm] [dwarf] Fix DWARFListTableHeader::getOffsetEntry off-by-one D98289 was erroneously reporting `invalid range list offset 0x20110` instead of `invalid range list table index 0`. Differential Revision: https://reviews.llvm.org/D98589	2021-03-14 21:42:44 +01:00
Ricky Taylor	f9724c94e4	[M68k] Tidy up some bit shifting during code emission This fixes some issues with bit masking when emitting instructions (including one TODO). Differential Revision: https://reviews.llvm.org/D98527	2021-03-14 11:53:14 -07:00
Ricky Taylor	784e884345	[M68k] Make M68k TargetMachine use getter function This makes M68k match other platforms in this regard. This was done as part of the AsmParser/Disassembler work since the entry functions of those modules usually reference `getTheXXXTarget()`. Differential Revision: https://reviews.llvm.org/D98517	2021-03-14 11:51:58 -07:00
Ricky Taylor	b94c972042	[M68k] Fix extract-section.py under Python 3 read_raw_stdin() was opening a file in binary mode, but Popen was being told to use text mode (universal_newlines). This is benign on Python 2 but an error on Python 3. Differential Revision: https://reviews.llvm.org/D98428	2021-03-14 11:36:57 -07:00
Nico Weber	4d1294714b	Revert "[gn build] (manually) kind of merge d627a27d26" This reverts commit 5123327edab15bacb44a63a874d9d379d4873407. d627a27d26 was reverted in e0f70a8a979f.	2021-03-14 12:18:22 -04:00
Nikita Popov	331d475703	[X86] Add test for PR49587 (NFC) Shows a miscompile with FastISel.	2021-03-14 16:39:49 +01:00
David Green	e19cd9f8e5	[AArch64] Expand build-vector-extract.ll tests to i8's. NFC	2021-03-14 15:29:14 +00:00
Simonas Kazlauskas	1101d7f026	[InstCombine] Restrict a GEP transform to avoid changing provenance This is an alternative to D98120. Herein, instead of deleting the transformation entirely, we check that the underlying objects are both the same and therefore this transformation wouldn't incur a provenance change, if applied. https://alive2.llvm.org/ce/z/SYF_yv Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D98588	2021-03-14 16:32:04 +02:00
Matt Arsenault	5bf1dc2e73	CodeGen: Reorder MachinePointerInfo fields This saves a little bit of padding.	2021-03-14 10:06:39 -04:00
Nico Weber	a70aec2a37	[gn build] (manually) kind of merge d627a27d26 This only merges the no-op generator part for now.	2021-03-14 09:19:44 -04:00
Luo, Yuanke	6f29193e28	[X86][AMX] Prevent transforming load pointer from <256 x i32>* to x86_amx*. The load/store instruction will be transformed to amx intrinsics in the pass of AMX type lowering. Prohibiting the pointer cast makes that pass happy. Differential Revision: https://reviews.llvm.org/D98247	2021-03-14 09:24:56 +08:00
Saleem Abdulrasool	cbcd94df7b	X86: adjust the windows 64 calling convention for Swift Adjust the Win64 calling convention for Swift to pass self in R13, which is traditionally a CSR. This makes the behaviour similar to the SysV CC for Swift as well. This should improve the argument passing on Windows, although it comes at a high cost of ABI incompatibility. Fortunately in this case, there is no guarantee of ABI stability, and so we can make this incompatible change.	2021-03-13 16:53:20 -08:00
Philip Reames	3ea5a04801	Restore fixed version of "[CodeGenPrepare] Fix isIVIncrement (PR49466)" Change was reverted in commit 8d20f2c2c66eb486ff23cc3d55a53bd840b36971 because it was causing an infinite loop. 9228f2f32 fixed the root issue in the code structure, this change just reapplies the original change w/adaptation to the new code structure.	2021-03-13 15:25:02 -08:00
Philip Reames	44498cb737	[CGP] Consolidate logic for getIVIncrement and isIVIncrement This fixes the bug demonstrated by the test case in the commit message of 8d20f2c2 (which was a revert of cf82700). The root issue was that we have two transforms which are inverses of each other. We use one for simple induction variables (where we can use the post-inc form), and the other for everything else. The problem was that the two transforms could disagree about whether something was an induction variable. The reverted commit made a change to one of the matcher routines which was used for one of the two transforms without updating the other matcher. However, it's worth noting the existing code w/o the reverted change also has cases where the decision could differ between the two paths. The fix is simply to consolidate the code such that two paths must agree by construction, and to add an assert to catch any potential future re-divergence. Triggering the infinite loop requires side stepping the SunkAddrs cache. The SunkAddrs cache has the effect of suppressing the iteration in the common case, but there are codepaths through CGP which restart iteration and clear this cache. Unfortunately, I have not been able to construct a standalone IR test case for this. The original test case is a c++ program which when compiled by clang demonstrates the infinite loop, but all of my attempts at extracting an IR test case runnable through opt/llc have failed to reproduce. (Including capturing the IR at point of the transform itself!) I have no idea what weird state clang is creating here. I also tried creating a test case by hand, but gave up after about an hour of trying to find the right combination to dance through multiple transforms to create the end result needed to trip the bug.	2021-03-13 14:55:25 -08:00
Simonas Kazlauskas	56ab20bb41	[InstCombine] Update GEP tests Adds a test for D98588 and updates the test checks.	2021-03-13 23:38:53 +02:00
Nikita Popov	1e6539234a	[SROA] Regenerate test checks (NFC)	2021-03-13 22:00:00 +01:00
Nikita Popov	063c1a1b70	[MemCpyOpt] Handle read from lifetime.start with offset This fixes a regression from the MemDep-based implementation: MemDep completely ignores lifetime.start intrinsics that aren't MustAlias -- this is probably unsound, but it does mean that the MemDep based implementation successfully eliminated memcpy's from lifetime.start if the memcpy happens at an offset, rather than the base address of the alloca. Add a special case for the case where the lifetime.start spans the whole alloca (which is pretty much the only kind of lifetime.start that frontends ever emit), as we don't need to figure out our exact aliasing relationship in that case, the whole alloca is dead prior to the call. If this doesn't cover all practically relevant cases, then it would be possible to make use of the recently added PartialAlias clobber offsets to make this more precise.	2021-03-13 20:38:09 +01:00
Nikita Popov	55c1bfa677	[MemCpyOpt] Add additional tests for memcpy of undef (NFC)	2021-03-13 20:38:09 +01:00
Craig Topper	a98a11913f	[DAGCombiner] Optimize 1-bit smulo to AND+SETNE. A 1-bit smulo overflows if both inputs are -1 since the result should be +1 which can't be represented in a signed 1 bit value. We can detect this with an AND and a setcc. The multiply result can also use the same AND. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D97634	2021-03-13 09:39:36 -08:00
Stefan Gränitz	c9783d2f73	[Orc] Deallocate debug objects properly when removing resources from DebugObjectManagerPlugin	2021-03-13 16:34:38 +01:00


212804 Commits