llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Nico Weber	3b240e5b42	[gn build] (manually) port 4c58f333f141	2021-04-02 18:21:37 -04:00
Jinsong Ji	3eb81563c9	[CSSPGO][Test] XFAIL profile-context-tracker-debug.ll on AIX The case started to fail after https://reviews.llvm.org/D99351. It looks to me like the node order within the Context Profile Tree depends on the implementation of std::hash<std::string>. Unfortunately, the current clang implementation generates different values on AIX (or for all big-endian systems?) On Linux: main: 2408804140(0x8f936f2c) external: 896680882(0x357243b2) externalA: 620231129(0x24f7f9d9) On AIX: main: 994322777(0x3b442959) external: 3548191215(0xd37d19ef) externalA: 1390365101(0x52df49ad) XFAIL it for now while we discuss and seek a fix. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D99815	2021-04-02 22:16:21 +00:00
Simon Pilgrim	cad94dd97b	[KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI. Include exhaustive test coverage.	2021-04-02 21:44:33 +01:00
Jennifer Yu	96cc3dc1e4	[OPENMP5.1]Initial support for novariants clause. Added basic parsing/sema/serialization support for the 'novariants' clause.	2021-04-02 13:19:01 -07:00
Jan Kratochvil	7e722974e1	[nfc] [llvm] Make DWARFListTableBase::findList const Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D99731	2021-04-02 21:41:47 +02:00
Eric Astor	e5b828d67b	[ms] [llvm-ml] Accept /WX to signal that warnings should be fatal. Define -fatal-warnings to make warnings fatal, and accept /WX as an ML.EXE compatible alias for it. Also make sure that if Warning() returns true, we always treat it as an error. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92504	2021-04-02 15:13:20 -04:00
Levy Hsu	9bba5888d6	[RISCV] Add IR intrinsics for Zbc extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	b39ecd380a	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	3ba975602b	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit 1808194590dd2b308bc146406425d5d52e46b7e6. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Cyndy Ishida	ed9ca516af	[TextAPI] run clang-format on violating sections, NFC	2021-04-02 11:44:33 -07:00
Craig Topper	45e35f884b	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Fangrui Song	062916f3dc	[SLPVectorizer] Fix divide-by-zero after D99719 Will add a test case later.	2021-04-02 11:13:51 -07:00
Eric Astor	9ffa0c7b9d	[ms] [llvm-ml] Fix case-sensitivity for variables and textmacros Make variables and text-macro references case-insensitive, to match ml.exe. Also improve error handling for text-macro expansion. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D92503	2021-04-02 14:08:02 -04:00
Levy Hsu	deb679e8ad	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Thomas Preud'homme	c215eaceb3	[Hexagon, test] Fix use of undef FileCheck var LLVM test CodeGen/Hexagon/hwloop3.ll tries to check for the absence of a sequence of consecutive instructions with several CHECK-NOT with one of those directives using a variable defined in another. However, CHECK-NOT directives are checked independently, so the test is using a variable defined in a pattern that should not occur in the input. This commit merges the two CHECK-NOT into a single CHECK-NOT that matches the content of two successive non-blank lines, thereby preserving the intent of the test. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D99778	2021-04-02 18:47:49 +01:00
Craig Topper	6e55b005fb	[RISCV] Improve 64-bit integer constant materialization for more cases. For positive constants we try shifting left to remove leading zeros and fill the bottom bits with 1s. We then materialize that constant and shift it right. This patch adds a new strategy to try filling the bottom bits with zeros instead. This catches some additional cases.	2021-04-02 10:18:08 -07:00
Craig Topper	8dc5ea9ce9	[RISCV] Add missing CHECK-EXPAND line to one case in rv64i-aliases-valid.s. Use -NEXT to protect against other missing lines.	2021-04-02 10:18:08 -07:00
Sanjay Patel	b76ba99fa7	[InstCombine] fold not+or+neg ~((-X) \| Y) --> (X - 1) & (~Y) We generally prefer 'add' over 'sub', this reduces the dependency chain, and this looks better for codegen on x86, ARM, and AArch64 targets. https://llvm.org/PR45755 https://alive2.llvm.org/ce/z/cxZDSp	2021-04-02 13:16:36 -04:00
Sanjay Patel	893bc148de	[InstCombine] add tests for not+or+neg; NFC https://llvm.org/PR45755	2021-04-02 13:16:35 -04:00
Sanjay Patel	bc2dd74e85	[GVNSink] auto-generate test checks; NFC	2021-04-02 13:16:35 -04:00
Dimitry Andric	85f44c8bd7	[SCCP] Avoid modifying AdditionalUsers while iterating over it When run under valgrind, or with a malloc that poisons freed memory, this can lead to segfaults or other problems. To avoid modifying the AdditionalUsers DenseMap while still iterating, save the instructions to be notified in a separate SmallPtrSet, and use this to later call OperandChangedState on each instruction. Fixes PR49582. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D98602	2021-04-02 19:05:59 +02:00
Nico Weber	6c0a714ca5	[gn build] add build file for tsan runtime Linux-only for now. Some mac bits stubbed out, but not tested. Good enough for the tiny_race.c example at https://clang.llvm.org/docs/ThreadSanitizer.html : $ out/gn/bin/clang -fsanitize=address -g -O1 tiny_race.c $ while true; do ./a.out \|\| echo $? ; done While here, also make `-fsanitize=address` work for .c files. Differential Revision: https://reviews.llvm.org/D99795	2021-04-02 12:59:14 -04:00
Florian Hahn	eb79acb47c	[LV] Hoist mapping of IR operands to VPValues (NFC). This patch moves mapping of IR operands to VPValues out of tryToCreateWidenRecipe. This allows using existing VPValue operands when widening recipes directly, which will be introduced in future patches.	2021-04-02 17:57:20 +01:00
Philip Reames	ba06487d24	[rs4gc] Use loops instead of straightline code for attribute stripping [nfc] Mostly because I'm about to add more attributes and the straightline copies get much uglier. What's currently there isn't too bad.	2021-04-02 09:25:15 -07:00
Philip Reames	4cd522ec4c	[rs4gc] Strip nofree and nosync attributes when lowering from abstract model The safepoints being inserted exist to free memory, or coordinate with another thread to do so. Thus, we must strip any inferred attributes and reinfer them after the lowering. I'm not aware of any active miscompiles caused by this, but since I'm working on strengthening inference of both and leveraging them in the optimization decisions, I figured a bit of future proofing was warranted.	2021-04-02 09:12:24 -07:00
Philip Reames	ea8b83d8a8	[rs4gc] add tests for existing code stripping attributes from function signatures	2021-04-02 08:59:55 -07:00
Brendon Cahoon	83a227f986	[GlobalISel] Allow different types for G_SBFX and G_UBFX operands Change the definition of G_SBFX and G_UBFX so that the lsb and width can have different types than the src and dst operands. Differential Revision: https://reviews.llvm.org/D99739	2021-04-02 11:11:06 -04:00
Nikita Popov	08587aa237	[LVI] Use range metadata on intrinsics If we don't know how to handle an intrinsic, we should still make use of normal call range metadata.	2021-04-02 16:45:31 +02:00
Nikita Popov	34db0e45ed	[CVP] Add test for !range on intrinsic (NFC)	2021-04-02 16:45:30 +02:00
Alexey Bataev	0b255d6d16	[SLP]Added a test for min/max reductions with the key store inside, NFC.	2021-04-02 07:37:40 -07:00
Alexey Bataev	541df75d6c	[SLP]Fix a bug in min/max reduction, number of condition uses. The ultimate reduction node may have multiple uses, but if the ultimate reduction is min/max reduction and based on SelectInstruction, the condition of this select instruction must have only single use. Differential Revision: https://reviews.llvm.org/D99753	2021-04-02 07:09:44 -07:00
Nico Weber	bbcf60b471	Revert "[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper" This reverts commit 500969f1d0b1d92d7c4ccfb6bf8807de96b7e4a0. Makes clang assert compiling avx2 code, see https://bugs.chromium.org/p/chromium/issues/detail?id=1195353#c4 for a standalone repro.	2021-04-02 09:55:55 -04:00
Paul C. Anagnostopoulos	f346298b33	[TableGen] [Docs] Add lldb-tblgen to command guide; add 4 guide stubs Differential Revision: https://reviews.llvm.org/D99605	2021-04-02 09:52:16 -04:00
Jun Ma	43aad326e7	[NFC][SVE] update sve-intrinsics-int-arith.ll under update_llc_test_checks.py	2021-04-02 20:17:11 +08:00
Jun Ma	00ea0c58c4	[AArch64][SVE] Lowering sve.dot to DOT node Differential Revision: https://reviews.llvm.org/D99699	2021-04-02 20:05:17 +08:00
Jun Ma	08c114509d	[NFC][SVE] Use SVE_4_Op_Imm_Pat for sve_intx_dot_by_indexed_elem	2021-04-02 20:05:17 +08:00
Jeroen Dobbelaere	b41d9938e1	[InstCombine] Fix out-of-bounds ashr(shl) optimization This fixes a crash found by the oss fuzzer and reported by @fhahn. The suggestion of @RKSimon seems to be the correct fix here. (See D91343). The oss fuzz report can be found here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32759 Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D99792	2021-04-02 13:45:11 +02:00
Fraser Cormack	3a37dfb076	[RISCV] Test llvm.experimental.vector.insert intrinsics on RV32 RV32 is able to use the llvm.experimental.vector.insert intrinsics too. This patch ensures they're tested. Reviewed By: khchen, asb Differential Revision: https://reviews.llvm.org/D99655	2021-04-02 11:49:54 +01:00
Simon Pilgrim	8ba32b146e	[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper Use the getTargetShuffleInputs helper for all shuffle decoding	2021-04-02 11:50:18 +01:00
LLVM GN Syncbot	c1f569fa7a	[gn build] Port 0f7bbbc481e2	2021-04-02 10:22:54 +00:00
Sander de Smalen	1ba98aa252	Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed. In order to bring up scalable vector support in LLVM incrementally, we introduced behaviour to emit a warning, instead of an error, when asking the wrong question of a scalable vector, like asking for the fixed number of elements. This patch puts that behaviour under a flag. The default behaviour is that the compiler will always error, which means that all LLVM unit tests and regression tests will now fail when a code-path is taken that still uses the wrong interface. The behaviour to demote an error to a warning can be individually enabled for tools that want to support experimental use of scalable vectors. This patch enables that behaviour when driving compilation from Clang. This means that for users who want to try out scalable-vector support, fixed-width codegen support, or build user-code with scalable vector intrinsics, Clang will not crash and burn when the compiler encounters such a case. This allows us to do away with the following pattern in many of the SVE tests: RUN: .... 2>%t RUN: cat %t \| FileCheck --check-prefix=WARN WARN-NOT: warning: ... The behaviour to emit warnings is only temporary and we expect this flag to be removed in the future when scalable vector support is more stable. This patch also fixes the following tests: unittests: ScalableVectorMVTsTest.SizeQueries SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG regression tests: Transforms/InstCombine/vscale_gep.ll Reviewed By: paulwalker-arm, ctetreau Differential Revision: https://reviews.llvm.org/D98856	2021-04-02 10:55:22 +01:00
Florian Hahn	c8792774bf	[SLP] Better estimate cost of no-op extracts on target vectors. The motivation for this patch is to better estimate the cost of extractelement instructions in cases where they are going to be free, because the source vector can be used directly. A simple example is %v1.lane.0 = extractelement <2 x double> %v.1, i32 0 %v1.lane.1 = extractelement <2 x double> %v.1, i32 1 %a.lane.0 = fmul double %v1.lane.0, %x %a.lane.1 = fmul double %v1.lane.1, %y Currently we only consider the extracts free, if there are no other users. In this particular case, on AArch64 which can fit <2 x double> in a vector register, the extracts should be free, independently of other users, because the source vector of the extracts will be in a vector register directly, so it should be free to use the vector directly. The SLP vectorized version of noop_extracts_9_lanes is 30%-50% faster on certain AArch64 CPUs. It looks like this does not impact any code in SPEC2000/SPEC2006/MultiSource both on X86 and AArch64 with -O3 -flto. This originally regressed after D80773, so if there's a better alternative to explore, I'd be more than happy to do that. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99719	2021-04-02 10:40:12 +01:00
Fraser Cormack	05a612bde2	[RISCV] Optimize more redundant VSETVLIs D99717 introduced some test cases which showed that the output of one vsetvli into another would not be picked up by the RISCVCleanupVSETVLI pass. This patch teaches the optimization about such a pattern. The pattern is quite common when using the RVV vsetvli intrinsic to pass the VL onto other intrinsics. The second test case introduced by D99717 is left unoptimized by this patch. It is a rarer case and will require us to rewire any uses of the redundant vset[i]vli's output to the previous one's. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99730	2021-04-02 10:04:07 +01:00
Fraser Cormack	6b16d74f7d	[RISCV] Add some tests showing vsetvli cleanup opportunities Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99717	2021-04-02 09:43:04 +01:00
Evgeniy Brevnov	84c021a3a6	[NARY-REASSOCIATE] Support reassociation of min/max Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available. Reviewed By: mkazantsev, lebedev.ri Differential Revision: https://reviews.llvm.org/D88287	2021-04-02 15:30:13 +07:00
Roman Lebedev	360691947c	[PassManager] Run additional LICM before LoopRotate Loop rotation often has to perform code duplication from header into preheader, which introduces PHI nodes. >>! In D99204, @thopre wrote: > > With loop peeling, it is important that unnecessary PHIs be avoided or > it will leads to spurious peeling. One source of such PHIs is loop > rotation which creates PHIs for invariant loads. Those PHIs are > particularly problematic since loop peeling is now run as part of simple > loop unrolling before GVN is run, and are thus a source of spurious > peeling. > > Note that while some of the load can be hoisted and eventually > eliminated by instruction combine, this is not always possible due to > alignment issue. In particular, the motivating example [1] was a load > inside a class instance which cannot be hoisted because the `this' > pointer has an alignment of 1. > > [1] http://lists.llvm.org/pipermail/llvm-dev/attachments/20210312/4ce73c47/attachment.cpp Now, we could enhance LoopRotate to avoid duplicating code when not needed, but instead hoist loop-invariant code, but isn't that a code duplication? (sic) We have LICM, and in fact we already run it right after LoopRotation. 
We could try to move it to before LoopRotation, that is basically free from compile-time perspective: https://llvm-compile-time-tracker.com/compare.php?from=6c93eb4477d88af046b915bc955c03693b2cbb58&to=a4bee6d07732b1184c436da489040b912f0dc271&stat=instructions But, looking at stats, i think it isn't great that we would no longer do LICM after LoopRotation, in particular: \| statistic name \| LoopRotate-LICM \| LICM-LoopRotate \| Δ \| % \| abs(%) \| \| asm-printer.EmittedInsts \| 9015930 \| 9015799 \| -131 \| 0.00% \| 0.00% \| \| indvars.NumElimCmp \| 3536 \| 3544 \| 8 \| 0.23% \| 0.23% \| \| indvars.NumElimExt \| 36725 \| 36580 \| -145 \| -0.39% \| 0.39% \| \| indvars.NumElimIV \| 1197 \| 1187 \| -10 \| -0.84% \| 0.84% \| \| indvars.NumElimIdentity \| 143 \| 136 \| -7 \| -4.90% \| 4.90% \| \| indvars.NumElimRem \| 4 \| 5 \| 1 \| 25.00% \| 25.00% \| \| indvars.NumLFTR \| 29842 \| 29890 \| 48 \| 0.16% \| 0.16% \| \| indvars.NumReplaced \| 2293 \| 2227 \| -66 \| -2.88% \| 2.88% \| \| indvars.NumSimplifiedSDiv \| 6 \| 8 \| 2 \| 33.33% \| 33.33% \| \| indvars.NumWidened \| 26438 \| 26329 \| -109 \| -0.41% \| 0.41% \| \| instcount.TotalBlocks \| 1178338 \| 1173840 \| -4498 \| -0.38% \| 0.38% \| \| instcount.TotalFuncs \| 111825 \| 111829 \| 4 \| 0.00% \| 0.00% \| \| instcount.TotalInsts \| 9905442 \| 9896139 \| -9303 \| -0.09% \| 0.09% \| \| lcssa.NumLCSSA \| 425871 \| 423961 \| -1910 \| -0.45% \| 0.45% \| \| licm.NumHoisted \| 378357 \| 378753 \| 396 \| 0.10% \| 0.10% \| \| licm.NumMovedCalls \| 2193 \| 2208 \| 15 \| 0.68% \| 0.68% \| \| licm.NumMovedLoads \| 35899 \| 31821 \| -4078 \| -11.36% \| 11.36% \| \| licm.NumPromoted \| 11178 \| 11154 \| -24 \| -0.21% \| 0.21% \| \| licm.NumSunk \| 13359 \| 13587 \| 228 \| 1.71% \| 1.71% \| \| loop-delete.NumDeleted \| 8547 \| 8402 \| -145 \| -1.70% \| 1.70% \| \| loop-instsimplify.NumSimplified \| 12876 \| 11890 \| -986 \| -7.66% \| 7.66% \| \| loop-peel.NumPeeled \| 1008 \| 925 \| -83 \| -8.23% \| 8.23% \| \| 
loop-rotate.NumNotRotatedDueToHeaderSize \| 368 \| 365 \| -3 \| -0.82% \| 0.82% \| \| loop-rotate.NumRotated \| 42015 \| 42003 \| -12 \| -0.03% \| 0.03% \| \| loop-simplifycfg.NumLoopBlocksDeleted \| 240 \| 242 \| 2 \| 0.83% \| 0.83% \| \| loop-simplifycfg.NumLoopExitsDeleted \| 497 \| 20 \| -477 \| -95.98% \| 95.98% \| \| loop-simplifycfg.NumTerminatorsFolded \| 618 \| 336 \| -282 \| -45.63% \| 45.63% \| \| loop-unroll.NumCompletelyUnrolled \| 11028 \| 11032 \| 4 \| 0.04% \| 0.04% \| \| loop-unroll.NumUnrolled \| 12608 \| 12529 \| -79 \| -0.63% \| 0.63% \| \| mem2reg.NumDeadAlloca \| 10222 \| 10221 \| -1 \| -0.01% \| 0.01% \| \| mem2reg.NumPHIInsert \| 192110 \| 192106 \| -4 \| 0.00% \| 0.00% \| \| mem2reg.NumSingleStore \| 637650 \| 637643 \| -7 \| 0.00% \| 0.00% \| \| scalar-evolution.NumBruteForceTripCountsComputed \| 814 \| 812 \| -2 \| -0.25% \| 0.25% \| \| scalar-evolution.NumTripCountsComputed \| 283108 \| 282934 \| -174 \| -0.06% \| 0.06% \| \| scalar-evolution.NumTripCountsNotComputed \| 106712 \| 106718 \| 6 \| 0.01% \| 0.01% \| \| simple-loop-unswitch.NumBranches \| 5178 \| 4752 \| -426 \| -8.23% \| 8.23% \| \| simple-loop-unswitch.NumCostMultiplierSkipped \| 914 \| 503 \| -411 \| -44.97% \| 44.97% \| \| simple-loop-unswitch.NumSwitches \| 20 \| 18 \| -2 \| -10.00% \| 10.00% \| \| simple-loop-unswitch.NumTrivial \| 183 \| 95 \| -88 \| -48.09% \| 48.09% \| ... but that actually regresses LICM (-12% `licm.NumMovedLoads`), loop-simplifycfg (`NumLoopExitsDeleted`, `NumTerminatorsFolded`), simple-loop-unswitch (`NumTrivial`). What if we instead have LICM both before and after LoopRotate? 
\| statistic name \| LoopRotate-LICM \| LICM-LoopRotate-LICM \| Δ \| % \| abs(%) \| \| asm-printer.EmittedInsts \| 9015930 \| 9014474 \| -1456 \| -0.02% \| 0.02% \| \| indvars.NumElimCmp \| 3536 \| 3546 \| 10 \| 0.28% \| 0.28% \| \| indvars.NumElimExt \| 36725 \| 36681 \| -44 \| -0.12% \| 0.12% \| \| indvars.NumElimIV \| 1197 \| 1185 \| -12 \| -1.00% \| 1.00% \| \| indvars.NumElimIdentity \| 143 \| 146 \| 3 \| 2.10% \| 2.10% \| \| indvars.NumElimRem \| 4 \| 5 \| 1 \| 25.00% \| 25.00% \| \| indvars.NumLFTR \| 29842 \| 29899 \| 57 \| 0.19% \| 0.19% \| \| indvars.NumReplaced \| 2293 \| 2299 \| 6 \| 0.26% \| 0.26% \| \| indvars.NumSimplifiedSDiv \| 6 \| 8 \| 2 \| 33.33% \| 33.33% \| \| indvars.NumWidened \| 26438 \| 26404 \| -34 \| -0.13% \| 0.13% \| \| instcount.TotalBlocks \| 1178338 \| 1173652 \| -4686 \| -0.40% \| 0.40% \| \| instcount.TotalFuncs \| 111825 \| 111829 \| 4 \| 0.00% \| 0.00% \| \| instcount.TotalInsts \| 9905442 \| 9895452 \| -9990 \| -0.10% \| 0.10% \| \| lcssa.NumLCSSA \| 425871 \| 425373 \| -498 \| -0.12% \| 0.12% \| \| licm.NumHoisted \| 378357 \| 383352 \| 4995 \| 1.32% \| 1.32% \| \| licm.NumMovedCalls \| 2193 \| 2204 \| 11 \| 0.50% \| 0.50% \| \| licm.NumMovedLoads \| 35899 \| 35755 \| -144 \| -0.40% \| 0.40% \| \| licm.NumPromoted \| 11178 \| 11163 \| -15 \| -0.13% \| 0.13% \| \| licm.NumSunk \| 13359 \| 14321 \| 962 \| 7.20% \| 7.20% \| \| loop-delete.NumDeleted \| 8547 \| 8538 \| -9 \| -0.11% \| 0.11% \| \| loop-instsimplify.NumSimplified \| 12876 \| 12041 \| -835 \| -6.48% \| 6.48% \| \| loop-peel.NumPeeled \| 1008 \| 924 \| -84 \| -8.33% \| 8.33% \| \| loop-rotate.NumNotRotatedDueToHeaderSize \| 368 \| 365 \| -3 \| -0.82% \| 0.82% \| \| loop-rotate.NumRotated \| 42015 \| 42005 \| -10 \| -0.02% \| 0.02% \| \| loop-simplifycfg.NumLoopBlocksDeleted \| 240 \| 241 \| 1 \| 0.42% \| 0.42% \| \| loop-simplifycfg.NumTerminatorsFolded \| 618 \| 619 \| 1 \| 0.16% \| 0.16% \| \| loop-unroll.NumCompletelyUnrolled \| 11028 \| 11029 \| 1 \| 0.01% \| 
0.01% \| \| loop-unroll.NumUnrolled \| 12608 \| 12525 \| -83 \| -0.66% \| 0.66% \| \| mem2reg.NumPHIInsert \| 192110 \| 192073 \| -37 \| -0.02% \| 0.02% \| \| mem2reg.NumSingleStore \| 637650 \| 637652 \| 2 \| 0.00% \| 0.00% \| \| scalar-evolution.NumTripCountsComputed \| 283108 \| 282998 \| -110 \| -0.04% \| 0.04% \| \| scalar-evolution.NumTripCountsNotComputed \| 106712 \| 106691 \| -21 \| -0.02% \| 0.02% \| \| simple-loop-unswitch.NumBranches \| 5178 \| 5185 \| 7 \| 0.14% \| 0.14% \| \| simple-loop-unswitch.NumCostMultiplierSkipped \| 914 \| 925 \| 11 \| 1.20% \| 1.20% \| \| simple-loop-unswitch.NumTrivial \| 183 \| 179 \| -4 \| -2.19% \| 2.19% \| \| simple-loop-unswitch.NumBranches \| 5178 \| 4752 \| -426 \| -8.23% \| 8.23% \| \| simple-loop-unswitch.NumCostMultiplierSkipped \| 914 \| 503 \| -411 \| -44.97% \| 44.97% \| \| simple-loop-unswitch.NumSwitches \| 20 \| 18 \| -2 \| -10.00% \| 10.00% \| \| simple-loop-unswitch.NumTrivial \| 183 \| 95 \| -88 \| -48.09% \| 48.09% \| I.e. we end up with less instructions, less peeling, more LICM activity, also note how none of those 4 regressions are here. 
Namely: \| statistic name \| LICM-LoopRotate \| LICM-LoopRotate-LICM \| Δ \| % \| abs(%) \| \| asm-printer.EmittedInsts \| 9015799 \| 9014474 \| -1325 \| -0.01% \| 0.01% \| \| indvars.NumElimCmp \| 3544 \| 3546 \| 2 \| 0.06% \| 0.06% \| \| indvars.NumElimExt \| 36580 \| 36681 \| 101 \| 0.28% \| 0.28% \| \| indvars.NumElimIV \| 1187 \| 1185 \| -2 \| -0.17% \| 0.17% \| \| indvars.NumElimIdentity \| 136 \| 146 \| 10 \| 7.35% \| 7.35% \| \| indvars.NumLFTR \| 29890 \| 29899 \| 9 \| 0.03% \| 0.03% \| \| indvars.NumReplaced \| 2227 \| 2299 \| 72 \| 3.23% \| 3.23% \| \| indvars.NumWidened \| 26329 \| 26404 \| 75 \| 0.28% \| 0.28% \| \| instcount.TotalBlocks \| 1173840 \| 1173652 \| -188 \| -0.02% \| 0.02% \| \| instcount.TotalInsts \| 9896139 \| 9895452 \| -687 \| -0.01% \| 0.01% \| \| lcssa.NumLCSSA \| 423961 \| 425373 \| 1412 \| 0.33% \| 0.33% \| \| licm.NumHoisted \| 378753 \| 383352 \| 4599 \| 1.21% \| 1.21% \| \| licm.NumMovedCalls \| 2208 \| 2204 \| -4 \| -0.18% \| 0.18% \| \| licm.NumMovedLoads \| 31821 \| 35755 \| 3934 \| 12.36% \| 12.36% \| \| licm.NumPromoted \| 11154 \| 11163 \| 9 \| 0.08% \| 0.08% \| \| licm.NumSunk \| 13587 \| 14321 \| 734 \| 5.40% \| 5.40% \| \| loop-delete.NumDeleted \| 8402 \| 8538 \| 136 \| 1.62% \| 1.62% \| \| loop-instsimplify.NumSimplified \| 11890 \| 12041 \| 151 \| 1.27% \| 1.27% \| \| loop-peel.NumPeeled \| 925 \| 924 \| -1 \| -0.11% \| 0.11% \| \| loop-rotate.NumRotated \| 42003 \| 42005 \| 2 \| 0.00% \| 0.00% \| \| loop-simplifycfg.NumLoopBlocksDeleted \| 242 \| 241 \| -1 \| -0.41% \| 0.41% \| \| loop-simplifycfg.NumLoopExitsDeleted \| 20 \| 497 \| 477 \| 2385.00% \| 2385.00% \| \| loop-simplifycfg.NumTerminatorsFolded \| 336 \| 619 \| 283 \| 84.23% \| 84.23% \| \| loop-unroll.NumCompletelyUnrolled \| 11032 \| 11029 \| -3 \| -0.03% \| 0.03% \| \| loop-unroll.NumUnrolled \| 12529 \| 12525 \| -4 \| -0.03% \| 0.03% \| \| mem2reg.NumDeadAlloca \| 10221 \| 10222 \| 1 \| 0.01% \| 0.01% \| \| mem2reg.NumPHIInsert \| 192106 \| 192073 \| 
-33 \| -0.02% \| 0.02% \| \| mem2reg.NumSingleStore \| 637643 \| 637652 \| 9 \| 0.00% \| 0.00% \| \| scalar-evolution.NumBruteForceTripCountsComputed \| 812 \| 814 \| 2 \| 0.25% \| 0.25% \| \| scalar-evolution.NumTripCountsComputed \| 282934 \| 282998 \| 64 \| 0.02% \| 0.02% \| \| scalar-evolution.NumTripCountsNotComputed \| 106718 \| 106691 \| -27 \| -0.03% \| 0.03% \| \| simple-loop-unswitch.NumBranches \| 4752 \| 5185 \| 433 \| 9.11% \| 9.11% \| \| simple-loop-unswitch.NumCostMultiplierSkipped \| 503 \| 925 \| 422 \| 83.90% \| 83.90% \| \| simple-loop-unswitch.NumSwitches \| 18 \| 20 \| 2 \| 11.11% \| 11.11% \| \| simple-loop-unswitch.NumTrivial \| 95 \| 179 \| 84 \| 88.42% \| 88.42% \| {F15983613} {F15983615} {F15983616} (this is vanilla llvm testsuite + rawspeed + darktable) As an example of the code where early LICM only is bad, see: https://godbolt.org/z/GzEbacs4K This does have an observable compile-time regression of +~0.5% geomean https://llvm-compile-time-tracker.com/compare.php?from=7c5222e4d1a3a14f029e5f614c9aefd0fa505f1e&to=5d81826c3411982ca26e46b9d0aff34c80577664&stat=instructions but i think that's basically nothing, and there's potential that it might be avoidable in the future by fixing clang to produce alignment information on function arguments, thus making the second run unneeded. Differential Revision: https://reviews.llvm.org/D99249	2021-04-02 11:11:42 +03:00
Wenlei He	17e4ecbb16	[CSSPGO] Skip dangling probe value when computing profile summary Recently we switched to use InvalidProbeCount = UINT64_MAX (instead of 0) to represent dangling probe, but UINT64_MAX is not excluded when computing profile summary. This caused profile summary to produce incorrect hot/cold threshold. The change fixed it by excluding UINT64_MAX from summary builder. Differential Revision: https://reviews.llvm.org/D99788	2021-04-01 22:49:11 -07:00
Craig Topper	aab811b9ef	[RISCV] Add missing nxvXf64 intrinsics tests cases for floating-point compare for RV32.	2021-04-01 20:57:13 -07:00
Samuel	caa230762f	[llvm-reduce] Add header guards and fix clang-tidy warnings Add header guards and fix other clang-tidy warnings in .h files. Also align misaligned header docs Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D99634	2021-04-01 20:38:49 -07:00
Craig Topper	53efcb18a0	[RISCV] Add more nxvi64 vector intrinsic tests for RV32. NFC This confirms we handle most instructions gracefully. We do currently fail for vslide1up and vslide1down though.	2021-04-01 20:34:28 -07:00
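
The InstCombine fold in commit b76ba99fa7 above, `~((-X) | Y) --> (X - 1) & (~Y)`, can be sanity-checked by brute force on a small bit width. The sketch below is only an illustration and is not part of LLVM: the helper names `fold_holds` and `check_all` and the choice of 8-bit values are assumptions made for this example. It models fixed-width two's-complement arithmetic by masking every intermediate result.

```python
def fold_holds(x: int, y: int, bits: int = 8) -> bool:
    """Check ~((-x) | y) == (x - 1) & (~y) for one pair of `bits`-wide values.

    Hypothetical helper for illustration; it is not LLVM code.
    """
    mask = (1 << bits) - 1
    # Left-hand side: negate x, OR with y, then invert, all modulo 2**bits.
    before = ~(((-x) & mask) | y) & mask
    # Right-hand side: decrement x, AND with the inverted y, modulo 2**bits.
    after = ((x - 1) & mask) & (~y & mask)
    return before == after


def check_all(bits: int = 8) -> bool:
    """Exhaustively test every (x, y) pair of the given width."""
    return all(
        fold_holds(x, y, bits)
        for x in range(1 << bits)
        for y in range(1 << bits)
    )


if __name__ == "__main__":
    print(check_all(8))
```

The identity follows from De Morgan's law plus the two's-complement fact that `~(-X)` equals `X - 1`; the exhaustive loop is just a cheap way to confirm it for narrow widths, whereas the commit itself links an Alive2 proof.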

213619 Commits