llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00

Author	SHA1	Message	Date
Florian Hahn	91b8e2e75a	[InferAttrs] Do not mark first argument of str(n)cat as writeonly. str(n)cat appends a copy of the second argument to the end of the first argument. To find the end of the first argument, str(n)cat has to read from it until it finds the terminating 0, so the argument should not be marked as writeonly. (This is causing a mis-compile with legacy DSE, before it got removed) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D100601	2021-04-15 23:00:21 +01:00
Momchil Velikov	d98e321d12	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass an AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e.g. byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types that naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role previously performed by align, falling back to align if `stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
Craig Topper	6b675e5ab5	[TableGen] Reduce the number of map lookups in TypeSetByHwMode::getOrCreate. NFCI hasMode was looking up the map once. Then we'd either call get which would look up again, or we'd insert into the map which requires walking the map to find the insertion point. I believe the hasMode was needed because get has a special case to look for DefaultMode if the mode being asked for doesn't exist. We don't want that here so we were using hasMode to make sure we wouldn't hit that case. Simplify to a regular operator[] access which will default construct a SetType if the lookup fails.	2021-04-15 12:32:21 -07:00
Stanislav Mekhanoshin	e718ff5dc3	[AMDGPU] Factor out predicate FmaakFmamkF32Insts Differential Revision: https://reviews.llvm.org/D100409	2021-04-15 12:29:16 -07:00
Florian Hahn	1003f18483	[VPlan] Replace a few unnecessary includes with forward decls.	2021-04-15 20:08:31 +01:00
Stanislav Mekhanoshin	5b15ced47a	[AMDGPU] Add new EmitDstSel field to VOPProfile. NFC. Differential Revision: https://reviews.llvm.org/D100589	2021-04-15 12:07:08 -07:00
LLVM GN Syncbot	3e03953b19	[gn build] Port 82787eb2285d	2021-04-15 18:54:08 +00:00
hsmahesha	eb7757a102	[AMDGPU] Move LDS lowering related utility functions to a separate utils file. Move some utility functions which are used within LDS lowering pass to a separate utils file so that other LDS related passes can make use of them when required. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100526	2021-04-16 00:15:48 +05:30
Krzysztof Parzyszek	685c4cfa64	[Hexagon] Avoid infinite loops in type legalization when lowering SETCC Only widen SETCC if the operands can be widened. Not checking that caused infinite widen-split loops in legalization.	2021-04-15 13:34:37 -05:00
Craig Topper	3583af245c	[RISCV] Share RVInstIShift and RVInstIShiftW instruction format classes with the B extension. This generalizes RVInstIShift/RVInstIShiftW to take the upper 5 or 7 bits of the immediate as an input instead of only bit 30. Then we can share them. For RVInstIShift I left a hardcoded 0 at bit 26 where RV128 gets a 7th bit for the shift amount. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100424	2021-04-15 11:08:28 -07:00
cchen	7304924c0f	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
Danilo C. Grael	779b4e72af	[LoopUnrollAndJam] Avoid repeated instructions for UAJ analysis Avoid visiting repeated instructions for processHeaderPhiOperands as it can cause an endless loop. Test case is attached and can be run with `opt -basic-aa -tbaa -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=4`. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D97407	2021-04-15 12:59:42 -04:00
Arthur Eubanks	2a2896669d	[NewPM] Cleanup IR printing instrumentation Being lazy with printing the banner seems hard to reason about, we should print it unconditionally first (it could also lead to duplicate banners if we have multiple functions in -filter-print-funcs). The printIR() functions were doing too many things. I separated out the call from PrintPassInstrumentation since we were essentially doing two completely separate things in printIR() from different callers. There were multiple ways to generate the name of some IR. That's all been moved to getIRName(). The printing of the IR name was also inconsistent, now it's always "IR Dump on $foo" where "$foo" is the name. For a function, it's the function name. For a loop, it's what's printed by Loop::print(), which is more detailed. For an SCC, it's the list of functions in parentheses. For a module it's "[module]", to differentiate between a possible SCC with a function called "module". To preserve D74814, we have to check if we're going to print anything at all first. This is unfortunate, but I would consider this a special case that shouldn't be handled in the core logic. Reviewed By: jamieschmeiser Differential Revision: https://reviews.llvm.org/D100231	2021-04-15 09:50:55 -07:00
Mark Johnston	f964e041e4	[asan] Add an offset for the kernel address sanitizer on FreeBSD This is based on a port of the sanitizer runtime to the FreeBSD kernel that has been committed as https://cgit.freebsd.org/src/commit/?id=38da497a4dfcf1979c8c2b0e9f3fa0564035c147 and the following commits. Reviewed By: emaste, dim Differential Revision: https://reviews.llvm.org/D98285	2021-04-15 17:49:00 +01:00
Stefan Pintilie	3c5cd9faed	[PowerPC] Add ROP Protection Instructions for PowerPC There are four new PowerPC instructions that are introduced in Power 10. They are hashst, hashchk, hashstp, hashchkp. These instructions will be used for ROP Protection. This patch adds the four instructions. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D99375	2021-04-15 11:38:38 -05:00
Stelios Ioannou	4da00eddbe	[LSR] Fix for pre-indexed generated constant offset This patch changed the isLegalUse check to ensure that LSRInstance::GenerateConstantOffsetsImpl generates an offset that results in a legal addressing mode and formula. The check is changed to look similar to the assert check used for illegal formulas. Differential Revision: https://reviews.llvm.org/D100383 Change-Id: Iffb9e32d59df96b8f072c00f6c339108159a009a	2021-04-15 16:44:42 +01:00
OCHyams	9fd912e6a2	Revert "[DebugInfo] Replace debug uses in replaceUsesOutsideBlock" This reverts commit 96a1e6b7cf72d9bd625903ea4b441404200383cf. Failing build bots e.g. https://lab.llvm.org/buildbot/#/builders/161/builds/163	2021-04-15 16:35:45 +01:00
OCHyams	94edb782c9	[DebugInfo] Replace debug uses in replaceUsesOutsideBlock Value::replaceUsesOutsideBlock doesn't replace debug uses which leads to an unnecessary reduction in variable location coverage. Fix this, add a unittest for it, and add a regression test demonstrating the change through instcombine's replacedSelectWithOperand. Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D99169	2021-04-15 16:19:36 +01:00
Sanjay Patel	94869b0900	[InstCombine] update RUN lines in assume test; NFC This was in a draft of D82703, but it got left out of the committed version, so we were not actually testing the new code.	2021-04-15 10:48:00 -04:00
Kerry McLaughlin	7624945380	[NFC] Remove the -instcombine flag from strict-fadd.ll This also fixes a CHECK line in @fadd_strict_unroll which ensures the changes made to fixReduction() to support in-order reductions with unrolling are being tested correctly.	2021-04-15 15:10:48 +01:00
LemonBoy	0ebe53ad79	[yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100250	2021-04-15 15:54:28 +02:00
Paul C. Anagnostopoulos	b1e2c33e83	[TableGen] [docs] Correct a reference in the TableGen Overview document Differential Revision: https://reviews.llvm.org/D100382	2021-04-15 09:25:09 -04:00
Sebastian Neubauer	033c97f321	[AMDGPU] Fix large return values with amdgpu_gfx Returning in memory is not supported, so fall back to sret. Also, extend i1 and i16 to i32. Otherwise, they would be passed through memory. Differential Revision: https://reviews.llvm.org/D100543	2021-04-15 14:57:56 +02:00
Simon Pilgrim	fd3b975a9f	[X86] combineCMP - fold cmpEQ/NE(TRUNC(X),0) -> cmpEQ/NE(X,0) If we are truncating from an i32 source before comparing the result against zero, then see if we can directly compare the source value against zero. If the upper (truncated) bits are known to be zero then we can compare against that, hopefully increasing the chances of us folding the compare into an EFLAG result of the source's operation. Fixes PR49028. Differential Revision: https://reviews.llvm.org/D100491	2021-04-15 13:55:51 +01:00
Bradley Smith	843db3db21	[AArch64][NEON] Match (or (and -a b) (and (a+1) b)) => bit select With this patch vbslq_f32(vnegq_s32(a), b, c) lowers to a BIT instruction. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D100304	2021-04-15 13:52:47 +01:00
Alex Orlov	e7b532348e	Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel This fixes the following bugs: https://bugs.llvm.org/show_bug.cgi?id=27249 https://bugs.llvm.org/show_bug.cgi?id=46414 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100328	2021-04-15 15:06:20 +04:00
Florian Hahn	09bcc572cb	[VPlan] Add VPRecipeBase::mayHaveSideEffects. Add an initial version of a helper to determine whether a recipe may have side-effects. Reviewed By: a.elovikov Differential Revision: https://reviews.llvm.org/D100259	2021-04-15 11:49:40 +01:00
Jun Ma	68f371389f	[DAGCombiner] Fold step_vector with add/mul/shl This patch implements some DAG combines for STEP_VECTOR: add step_vector(C1), step_vector(C2) -> step_vector(C1+C2) add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2) mul step_vector(C1), C2 -> step_vector(C1*C2) shl step_vector(C1), C2 -> step_vector(C1<<C2) TestPlan: check-llvm Differential Revision: https://reviews.llvm.org/D100088	2021-04-15 18:06:35 +08:00
David Sherwood	8ace3ea4dd	[SVE][LoopVectorize] Fix crash in InnerLoopVectorizer::widenPHIInstruction There were a few places in widenPHIInstruction where calculations of offsets were failing to take the runtime calculation of VF into account for scalable vectors. I've fixed those cases in this patch as well as adding an assert that we should not be scalarising for scalable vectors. Tests are added here: Transforms/LoopVectorize/AArch64/sve-widen-phi.ll Differential Revision: https://reviews.llvm.org/D99254	2021-04-15 10:51:49 +01:00
Fraser Cormack	7ac7c52b81	[RISCV] Pre-commit vector shuffle test cases This codegen will be improved by future patches.	2021-04-15 10:31:13 +01:00
dfukalov	7b0a514671	[AA] Updates for D95543. Addressing later comments in D95543: - `AliasResult::Result` renamed to `AliasResult::Kind` - Offset printing added for `PartialAlias` case in `-aa-eval` - Removed VisitedPhiBBs check from BasicAA Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D100454	2021-04-15 12:22:03 +03:00
Florian Hahn	53489ff399	[AArch64] Use type-legalization cost for code size memop cost. At the moment, getMemoryOpCost returns 1 for all inputs if CostKind is CodeSize or SizeAndLatency. This fools LoopUnroll into thinking memory operations on large vectors have a cost of one, even if they will get expanded to a large number of memory operations in the backend. This patch updates getMemoryOpCost to return the cost for the type legalization for both CodeSize and SizeAndLatency. This should more accurately reflect the number of memory operations required. I am not sure how latency should properly be included in SizeAndLatency from the description, but returning the size cost should be clearly more accurate. This does not cause any binary changes when building MultiSource/SPEC2000/SPEC2006 with -O3 -flto for AArch64, likely because large vector memops are not really formed by code emitted from Clang. But using the C/C++ matrix extension can easily result in code with very large vector operations directly from Clang, e.g. https://clang.godbolt.org/z/6xzxcTGvb Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D100291	2021-04-15 10:11:05 +01:00
David Sherwood	cce05cf2bc	[NFC][LoopVectorize] Remove unnecessary VF.isScalable asserts There are a few places in LoopVectorize.cpp where we have been too cautious in adding VF.isScalable() asserts and it can be confusing. It also makes it more difficult to see the genuine places where work needs doing to improve scalable vectorization support. This patch changes getMemInstScalarizationCost to return an invalid cost instead of firing an assert for scalable vectors. Also, vectorizeInterleaveGroup had multiple asserts all for the same thing. I have removed all but one assert near the start of the function, and added a new assert that we aren't dealing with masks for scalable vectors. Differential Revision: https://reviews.llvm.org/D99727	2021-04-15 09:41:03 +01:00
Martin Storsjö	63e2bdfb17	[AArch64] Fix windows vararg functions with floats in the fixed args On Windows, float arguments are normally passed in float registers in the calling convention for regular functions. For variable argument functions, floats are passed in integer registers. This has been done correctly for many years. However, the surprising bit was that floats among the fixed arguments also are supposed to be passed in integer registers, contrary to regular functions. (This also seems to be the behaviour on ARM though, both on Windows, but also on e.g. hardfloat linux.) In the calling convention, don't promote shorter floats to f64, but convert them to integers of the same length. (Floats passed as part of the actual variable arguments are promoted to double already on the C/Clang level; the LLVM vararg calling convention doesn't do any extra promotion of f32 to f64 - this matches how it works on X86 too.) Technically, this is an ABI break compared to older LLVM versions, but it fixes compatibility with the official platform ABI. (In practice, floats among the fixed arguments in variable argument functions are a pretty rare construct.) Differential Revision: https://reviews.llvm.org/D100365	2021-04-15 11:02:14 +03:00
Martin Storsjö	a74fefb10d	Reland "[lit] Handle plain negations directly in the internal shell" Keep running "not --crash" via the external "not" executable, but for plain negations, and for cases that use the shell "!" operator, just skip that argument and invert the return code. The libcxx tests only use the shell operator "!" for negations, never the "not" executable, because libcxx tests can be run without having a fully built llvm tree available providing the "not" executable. This allows using the internal shell for libcxx tests. It should be possible to reland this now that D99938 fixed the one test failure in clang-tidy that broke when "not" was handled internally, letting lit/python execute grep.exe directly instead of via not.exe. (See D99330 and D99406 for more commentary on the exact issue that broke and other potential ways of fixing it.) Differential Revision: https://reviews.llvm.org/D98859	2021-04-15 11:02:14 +03:00
Nikita Popov	79b9e7549e	Revert "[SCEV] Don't walk uses of phis without SCEV expression when forgetting" This reverts commit faf9f11589ce892b31d271917cf840f8ca903221. Issues with this patch have been reported in https://reviews.llvm.org/D100264#2689917 and https://bugs.llvm.org/show_bug.cgi?id=49967.	2021-04-15 09:43:52 +02:00
Florian Hahn	69b0c6315b	[NewGVN] Add phi-of-ops operands if no real PHI is created. If the PHI-of-ops simplifies to an existing value, no real PHI is created, which means the dependencies between the PHI-of-ops and its operands are not materialized in IR. At the moment, we fail to create a real PHI node for the PHI-of-ops, because the PHI-of-ops root instruction is not re-visited if one of the PHI-of-ops operands changes. We need to add the operands as additional users in this case. Even with this patch, there are still some dependencies missing. I will continue tackling the outstanding reported crashes in this area. Fixes PR36501, PR42422, PR42557. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D66924	2021-04-15 08:25:10 +01:00
Craig Topper	1e3f0cc364	[RISCV] Add a PatFrag to shorten repeated (XLenVT (VLOp GPR:$vl)) in V extension patterns. Reduces the amount of changes needed in D100288.	2021-04-14 22:36:35 -07:00
Max Kazantsev	78c3c3055c	[Test] Propagate nofree attribute from function to calls	2021-04-15 11:50:37 +07:00
hsmahesha	67b47974a5	[AMDGPU] Disable forceful inline of non-kernel functions which use LDS. Now that LDS uses within non-kernel functions are handled in the LowerModuleLDS pass, we no longer need to forcefully inline non-kernel functions just because they use LDS. Do forceful inlining only when the LowerModuleLDS pass is not enabled. It is enabled by default. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100481	2021-04-15 09:12:56 +05:30
Nico Weber	f0c201e2e7	fix comment typos to cycle bots	2021-04-14 22:12:56 -04:00
LLVM GN Syncbot	275a701620	[gn build] Port b7459a10dad1	2021-04-15 01:52:03 +00:00
Alexander Yermolovich	597f83a23d	[DWARF] Fix crash for DWARFDie::dump. When a DIE is extracted manually, the DieArray is empty. When dump is invoked on such a DIE, it tries to extract a child even if the dump options say otherwise, resulting in a crash. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99698	2021-04-14 18:46:34 -07:00
Sterling Augustine	8d25d3bfef	Revert "Simplify BitVector code" This reverts commit 82f0e3d3ea6bf927e3397b2fb423abbc5821a30f. The change breaks the asan buildbots. https://lab.llvm.org/buildbot/#/builders/99/builds/2835	2021-04-14 18:06:51 -07:00
Nico Weber	3d7c83c432	[llvm-objdump] try to fix section-filter.test in full builds after 51aa61e74bdb	2021-04-14 20:58:51 -04:00
Nico Weber	3612bb926d	[llvm-objdump] try to fix hexagon tests more after 51aa61e74bdb	2021-04-14 20:50:03 -04:00
Nico Weber	11e166fe6a	[llvm-objdump] try to fix hexagon and riscv tests after 1035123ac50db	2021-04-14 20:40:38 -04:00
Nico Weber	691c156cc8	[llvm-objdump] Switch command-line parsing from llvm::cl to OptTable This is similar to D83530, but for llvm-objdump. The motivation is the desire to add an `llvm-otool` symlink to llvm-objdump that behaves like macOS's `otool`, using the same technique that llvm-objcopy uses to behave like `strip` (etc). This change for the most part preserves behavior. In some cases, it increases compatibility with GNU objdump a bit. For example, the long options now require two dashes, and the long options taking arguments for the most part now require a `=` in front of the value. Exceptions are flags where tests passed the value separately, for these the separate form is kept as an alias to the = form. The one-letter short form args are now joined or separate and no longer accept a =, which also matches GNU objdump. cl::opt<>s in libraries now have to be explicitly plumbed through. This patch does that for --x86-asm-syntax=, but there's hope that we can remove that again. Differential Revision: https://reviews.llvm.org/D100433	2021-04-14 20:12:24 -04:00
Philip Reames	37583a96a8	Reapply "[InferAttributes] Materialize all infered attributes for declaration" and follow on patches. This reverts commit ab98f2c7129a52e216fd7e088b964cf4af27b0f2 and 98eea392cdbcdb7360e58b46e9329573f092cd96. It includes a fix for the clang test which triggered the revert. I failed to notice this one because there was another AMDGPU llvm test with a similar name and the exact same text in the error message. Odd. Since only one build bot reported the clang test, I didn't notice that one.	2021-04-14 16:38:07 -07:00
Nico Weber	53eb9a74f5	Revert "Fix buildbots after 61a85da" This reverts commit c609d533634416fc701939d39bf1e43f293e84dc. 61a85da was reverted in ab98f2c7	2021-04-14 18:47:46 -04:00
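
The STEP_VECTOR folds listed in the [DAGCombiner] entry above (68f371389f) can be sanity-checked with a small model. The sketch below is not LLVM code: it assumes step_vector(C) denotes the vector <0, C, 2*C, ...> with lanes that wrap modulo 2^BITS, and the helper names (step_vector, add, mul_splat, shl_splat, folds_hold), the 8-bit lane width, and the 4-lane length are all hypothetical choices made for illustration.

```python
BITS = 8                 # assumed lane width, chosen only for this sketch
MASK = (1 << BITS) - 1   # lanes wrap modulo 2**BITS


def step_vector(step, lanes=4):
    """Model of step_vector(step): <0, step, 2*step, ...>, wrapped per lane."""
    return [(i * step) & MASK for i in range(lanes)]


def add(a, b):
    """Lane-wise wrapping add of two equal-length vectors."""
    return [(x + y) & MASK for x, y in zip(a, b)]


def mul_splat(a, c):
    """Multiply every lane by the scalar constant c, with wrapping."""
    return [(x * c) & MASK for x in a]


def shl_splat(a, c):
    """Shift every lane left by the scalar constant c, with wrapping."""
    return [(x << c) & MASK for x in a]


def folds_hold(c1, c2, x):
    """Check the four folds from the commit message for one choice of inputs."""
    s1, s2 = step_vector(c1), step_vector(c2)
    return all([
        # add step_vector(C1), step_vector(C2) -> step_vector(C1+C2)
        add(s1, s2) == step_vector(c1 + c2),
        # add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2)
        add(add(x, s1), s2) == add(x, step_vector(c1 + c2)),
        # mul step_vector(C1), C2 -> step_vector(C1*C2)
        mul_splat(s1, c2) == step_vector(c1 * c2),
        # shl step_vector(C1), C2 -> step_vector(C1<<C2)
        shl_splat(s1, c2) == step_vector(c1 << c2),
    ])
```

Calling folds_hold(3, 5, [7, 250, 0, 19]) exercises all four rewrites, including lanes that overflow the assumed 8-bit width. Each fold holds because lane i of step_vector(C) is i*C, and multiplication by i distributes over the add, mul, and shl in modular arithmetic.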


214218 Commits