llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Eli Friedman	d624c63575	[NFC][ScalarEvolution] Clean up ExitLimit constructors. Make all the constructors forward to one constructor. Remove redundant assertions.	2021-06-20 17:40:30 -07:00
Jim Lin	191d405aea	[IVDescriptors] Fix comment that getUnsafeAlgebraInst has been renamed to getExactFPMathInst https://reviews.llvm.org/rG36a489d194750dc888f214240e9dec9122ca1f0e renamed the function call in the test from getUnsafeAlgebraInst to getExactFPMathInst. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D104441	2021-06-21 07:56:22 +08:00
Dmitri Gribenko	71570a2380	[GCOVProfiling][test] Ensure that 'opt' drops any files in a temp directory	2021-06-20 22:48:35 +02:00
Craig Topper	0dab191ced	[TypePromotion] Prune Intrinsic includes. NFC TypePromotion is meant to be a generic pass and doesn't reference any ARM intrinsics so it shouldn't include IntrinsicsARM.h. The other Intrinsic related headers appear to be unneeded as well.	2021-06-20 13:04:02 -07:00
Nikita Popov	45f41f7ab1	[LoopUnroll] Use smallest exact trip count from any exit This is a more general alternative/extension to D102635. Rather than handling the special case of "header exit with non-exiting latch", this unrolls against the smallest exact trip count from any exit. The latch exit is no longer treated as privileged when it comes to full unrolling. The motivating case is in full-unroll-one-unpredictable-exit.ll. Here the header exit is an IV-based exit, while the latch exit is a data comparison. This kind of loop does not get rotated, because the latch is already exiting, and loop rotation doesn't try to distinguish IV-based/analyzable latches. Differential Revision: https://reviews.llvm.org/D102982	2021-06-20 20:58:26 +02:00
Fangrui Song	74f848b886	Fix -Wunused-variable and -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC	2021-06-20 11:09:07 -07:00
David Green	51d7c2c19b	[DSE] Remove stores in the same loop iteration DSE will currently only remove stores in the same block unless they can be guaranteed to be loop invariant. This expands that to any stores that are in the same Loop, at the same loop level. This should still account for where AA/MSSA will not handle aliasing between loops, but allow the dead stores to be removed where they overlap in the same loop iteration. It requires adding loop info to DSE, but that looks fairly harmless. The test case this helps is from code like this, which can come up in certain matrix operations: for(i=..) dst[i] = 0; for(j=..) dst[i] += src[in+j]; After LICM, this becomes: for(i=..) dst[i] = 0; sum = 0; for(j=..) sum += src[in+j]; dst[i] = sum; The first store is dead, and with this patch is now removed. Differential Revision: https://reviews.llvm.org/D100464	2021-06-20 17:03:30 +01:00
Sanjay Patel	245d1ca508	[InstCombine] fold ctpop-of-select with 1 or more constant arms The general pattern is mentioned in: https://llvm.org/PR50140 ...but we need to do a bit more to handle intrinsics with extra operands like ctlz/cttz.	2021-06-20 11:28:45 -04:00
Sanjay Patel	f90ab103b5	[InstCombine] avoid infinite loops with select folds of constant expressions This pair of transforms was added recently with: 8591640379ac9175a And could lead to conflicting folds: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35399	2021-06-20 09:46:25 -04:00
Roman Lebedev	48ca532a9e	[NFC][AArch64][ARM][Thumb][Hexagon] Autogenerate some tests These all (and some others) are being affected by D104597, but they are manually-written, which rather complicates checking the effect that change has on them.	2021-06-20 14:12:45 +03:00
Roman Lebedev	9acde4873f	[UpdateTestUtils] Print test filename when complaining about conflicting prefix Now that FileCheck eagerly complains when prefixes are unused, the update script does the same, and it is becoming very common to need to drop some prefixes, yet figuring out the file it complains about isn't obvious unless it actually tells us.	2021-06-20 14:12:39 +03:00
Roman Lebedev	21a466ca4d	[SimplifyCFG] FoldTwoEntryPHINode(): don't fold if either block has its address taken Same as with HoistThenElseCodeToIf() (ad87761925c2790aab272138b5bbbde4a93e0383).	2021-06-20 12:37:14 +03:00
Roman Lebedev	9120098319	[SimplifyCFG] HoistThenElseCodeToIf(): don't hoist if either block has its address taken This problem is exposed by D104598, after it tail-merges `ret` in `@test_inline_constraint_S_label`, the verifier would start complaining `invalid operand for inline asm constraint 'S'`. Essentially, taking address of a block is mismodelled in IR. It should probably be an explicit instruction, a first one in block, that isn't identical to any other instruction of the same type, so that it can't be hoisted.	2021-06-20 12:18:15 +03:00
Juneyoung Lee	b719c3f4a5	[InstSimplify] icmp poison, X -> poison This adds a simple transformation from icmp with poison constant to poison. Comparing poison with something else is poison, so this is okay. https://alive2.llvm.org/ce/z/e8iReb https://alive2.llvm.org/ce/z/q4MurY	2021-06-20 15:39:07 +09:00
Fangrui Song	9e8233e08c	[llvm-cov gcov] Support GCC 12 format GCC 12 will change the length field to represent the number of bytes instead of 32-bit words. This avoids padding for strings.	2021-06-19 22:51:20 -07:00
Fangrui Song	f02bea7812	[llvm-cov gcov] Change case to match the prevailing style && replace getString with readString	2021-06-19 22:50:52 -07:00
Fangrui Song	8a7045847f	[test] Fix nocompress.test	2021-06-19 16:27:53 -07:00
Fangrui Song	2fe891f533	[llvm-profdata] Make diagnostics consistent with the (no capitalization, no period) style The format is currently inconsistent. Use the https://llvm.org/docs/CodingStandards.html#error-and-warning-messages style. And add `error:` or `warning:` to CHECK lines wherever appropriate.	2021-06-19 14:54:25 -07:00
Fangrui Song	e2a2115bff	[llvm-profdata] Delete unneeded empty output filename check	2021-06-19 12:20:45 -07:00
Craig Topper	61a08a19b8	[RISCV] Prevent formation of shXadd(.uw) and add.uw if it prevents the use of addi. If the outer add has an simm12 immediate operand we should prefer it instead of materializing it in a register. This would guarantee an extra instruction and temporary register. Since we don't check one use on the shl or zext we might generate more instructions if there is an additional user.	2021-06-19 12:10:42 -07:00
Roman Lebedev	a8e6eca719	[NFC] AMD Zen 3: fix typo in a comment	2021-06-19 22:05:17 +03:00
Fangrui Song	2beef4520d	Simplify some typedef struct	2021-06-19 11:36:44 -07:00
Nico Weber	a20aff32da	[gn build] (manually) port b9c05aff205b (MIRTests)	2021-06-19 13:04:09 -04:00
Michael Liao	d21f701c76	[MIRPrinter] Add machine metadata support. - Distinct metadata needs generating in the codegen to attach correct AAInfo on the loads/stores after lowering, merging, and other relevant transformations. - This patch adds 'MachineModuleSlotTracker' to help assign slot numbers to these newly generated unnamed metadata nodes. - To help 'MachineModuleSlotTracker' track machine metadata, the original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage', which provides basic interfaces to create/retrieve metadata slots. In addition, once LLVM IR is processed, additional hooks are also introduced to help collect machine metadata and assign them slot numbers. - Finally, if there is any such machine metadata, 'MIRPrinter' outputs an additional 'machineMetadataNodes' field containing all the definitions of those nodes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103205	2021-06-19 12:48:08 -04:00
Michael Liao	be5f17eb4b	[amdgpu] Improve the conversion from f32 to i64. - Take the same principle as the conversion from f64 to i64 with extra necessary pre- and post-processing. It helps to reduce that conversion sequence by half compared to the legacy one. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D104427	2021-06-19 12:46:48 -04:00
Sanjay Patel	649e66f4eb	[InstCombine][test] add tests for select-of-bit-manip; NFC	2021-06-19 12:34:32 -04:00
Tomas Matheson	e797f6f6ee	Allow building for release with EXPENSIVE_CHECKS D97225 moved LazyCallGraph verify() calls behind EXPENSIVE_CHECKS, but verify() is defined for debug builds only so this had the unintended effect of breaking release builds with EXPENSIVE_CHECKS. Fix by enabling verify() for both debug and EXPENSIVE_CHECKS. Differential Revision: https://reviews.llvm.org/D104514	2021-06-19 17:02:11 +01:00
Tomas Matheson	48473628b0	[ARM][NFC] Tidy up subtarget frame pointer routines getFramePointerReg only depends on information in ARMSubtarget, so move it in there so it can be accessed from more places. Make use of ARMSubtarget::getFramePointerReg to remove duplicated code. The main use of useR7AsFramePointer is getFramePointerReg, so inline it. Differential Revision: https://reviews.llvm.org/D104476	2021-06-19 17:00:45 +01:00
LLVM GN Syncbot	db683fa2a1	[gn build] Port 134723edd5bf	2021-06-19 11:49:56 +00:00
Nikita Popov	272334cfc0	[LoopUnroll] Push runtime unrolling decision up into tryToUnrollLoop() Currently, UnrollLoop() is passed an AllowRuntime flag and decides itself whether runtime unrolling should be used or not. This patch pushes the decision into the caller and allows us to eliminate the ULO.TripCount and ULO.TripMultiple parameters. Differential Revision: https://reviews.llvm.org/D104487	2021-06-19 09:25:57 +02:00
Ben Shi	45e207d211	[RISCV] Optimize add-mul in the zba extension with SHADD This patch does the following optimization. Rx + Ry * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry) Rx + Ry * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104588	2021-06-19 14:33:27 +08:00
Ben Shi	e7b41c5208	[RISCV][test] Add new tests for add-mul optimization in the zba extension with SHADD These tests will show the following optimization by future patches. Rx + Ry * 18 => (SH1ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 20 => (SH2ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 24 => (SH3ADD (SH1ADD Rx, Rx), Ry) Rx + Ry * 36 => (SH2ADD (SH3ADD Rx, Rx), Ry) Rx + Ry * 40 => (SH3ADD (SH2ADD Rx, Rx), Ry) Rx + Ry * 72 => (SH3ADD (SH3ADD Rx, Rx), Ry) Rx * (3 << C) => (SLLI (SH1ADD Rx, Rx), C) Rx * (5 << C) => (SLLI (SH2ADD Rx, Rx), C) Rx * (9 << C) => (SLLI (SH3ADD Rx, Rx), C) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104507	2021-06-19 14:31:01 +08:00
Lang Hames	7afd72c69b	[ORC][examples] Add missing library dependence	2021-06-19 14:48:34 +10:00
Liqiang Tao	d5e843fe26	[llvm][Inliner] Add an optional PriorityInlineOrder This patch adds an optional PriorityInlineOrder, which uses a heap to order inlining. A callsite whose size is smaller has a higher priority. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D104028	2021-06-19 10:17:32 +08:00
Lang Hames	b6ca0c60bb	[ORC][C-bindings] Add access to LLJIT IRTransformLayer, ThreadSafeModule utils. This patch was derived from Valentin Churavy's work in https://reviews.llvm.org/D104480. It adds support for setting the transform on an IRTransformLayer, and for accessing the IRTransformLayer in LLJIT. It also adds access to the ThreadSafeModule::withModuleDo method for thread-safe access to modules. A new example has been added to show how to use these APIs to optimize a module during materialization. Thanks Valentin! Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D103855	2021-06-19 11:50:27 +10:00
Lang Hames	320507db5e	[ORC][examples] Fix file name in comment.	2021-06-19 11:50:26 +10:00
Guozhi Wei	d8a557bfc0	[InstCombine] Don't transform code if DoTransform is false In patch https://reviews.llvm.org/D72396, it doesn't check DoTransform before transforming the code, and generates wrong result for the attached test case. Differential Revision: https://reviews.llvm.org/D104567	2021-06-18 18:01:34 -07:00
Fangrui Song	e71d1a723a	[InstrProfiling][ELF] Make __profd_ private if the function does not use value profiling On ELF, the D1003372 optimization can apply to more cases. There are two prerequisites for making `__profd_` private: * `__profc_` keeps `__profd_` live under compiler/linker GC * `__profd_` is not referenced by code The first is satisfied because all counters/data are in a section group (either `comdat any` or `comdat noduplicates`). The second requires that the function does not use value profiling. Regarding the second point: `__profd_` may be referenced by other text sections due to inlining. There will be a linker error if a prevailing text section references the non-prevailing local symbol. With this change, a stage 2 (`-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_BUILD_INSTRUMENTED=IR`) clang is 4.2% smaller (1-169620032/177066968). `stat -c %s */.o | awk '{s+=$1}END{print s}'` is 2.5% smaller. Reviewed By: davidxl, rnk Differential Revision: https://reviews.llvm.org/D103717	2021-06-18 17:01:17 -07:00
Matt Arsenault	6c04830c65	AMDGPU: Fix infinite loop in DAG combine with fneg + fma We were not reporting isFNegFree for v2f32, although it is effectively free after legalization. The generic combine was pulling fneg out of the fma source operands, and the AMDGPU combine was doing the opposite.	2021-06-18 19:09:03 -04:00
Matt Arsenault	bf6259aa0b	AMDGPU: Fix assert on m0_lo16/m0_hi16 These get added (redundantly) to the bundle expanded for indirect register accesses. We hit this path only when there is a call in the function.	2021-06-18 18:48:53 -04:00
Hongtao Yu	7fbb587058	[CSSPGO] Undoing the concept of dangling pseudo probe As a follow-up to https://reviews.llvm.org/D104129, I'm cleaning up the dangling probe related code in both the compiler and llvm-profgen. I'm seeing a 5% size win for the pseudo_probe section for SPEC2017 and 10% for Cinder. Certain benchmarks such as 602.gcc have a 20% size win. No obvious difference seen on build time for SPEC2017 and Cinder. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104477	2021-06-18 15:14:11 -07:00
Nikita Popov	7afbecd70d	[LoopUnroll] Simplify optimization remarks Remove dependence on ULO.TripCount/ULO.TripMultiple from ORE and debug code. For debug code, print information about all exits. For optimization remarks, only include the unroll count and the type of unroll (complete, partial or runtime), but omit detailed information about exit folding, now that more than one exit may be folded. Differential Revision: https://reviews.llvm.org/D104482	2021-06-18 23:47:03 +02:00
Hongtao Yu	d9e9fe620c	[CSSPGO][llvm-profgen] Fix an issue in findDisjointRanges We were using 0 as an indicator of invalid offset when computing disjoint ranges. In reality, 0 can be a valid code offset which stands for the first function in .text section. I'm using UINT64_MAX as an invalid code offset instead. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104497	2021-06-18 14:38:48 -07:00
Nick Desaulniers	de2155141d	[GCOVProfiling] don't profile Fn's w/ noprofile attribute Similar to D104475, the Linux kernel would like to avoid compiler generated code in certain functions. The no_profile function attribute can be used in C to generate the noprofile fn attr in IR. Respect that from GCOVProfiling. Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/ Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D104257	2021-06-18 13:58:34 -07:00
Craig Topper	a2969894c5	[RISCV] Teach vsetvli insertion to remember when predecessors have same AVL and SEW/LMUL ratio if their VTYPEs otherwise mismatch. Previously we went directly to unknown state on VTYPE mismatch. If we instead remember the partial match, we can use this to still use X0, X0 vsetvli in successors if AVL and needed SEW/LMUL ratio match. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D104069	2021-06-18 12:16:07 -07:00
Hongtao Yu	f2d99918a7	[CSSPGO][llvm-profgen] Ignore LBR records after interrupt transition If we have seen an inwards transition from external code to internal code, but not a following outwards transition, the inwards transition is likely due to an interrupt, which is usually unpaired. Ignore current and subsequent entries since they are likely from an unrelated pre-interrupt context. LBR records from different interrupt contexts are unrelated and they should not be mixed together. Currently the OS does this for task-scheduling interrupts but not for all interrupts. Reviewed By: wenlei, wlei Differential Revision: https://reviews.llvm.org/D104276	2021-06-18 12:13:53 -07:00
Anshil Gandhi	7f0aeb0ac8	[AMDGPU] [CodeGen] Fold negate llvm.amdgcn.class into test mask Implemented the transformation of xor (llvm.amdgcn.class x, mask), -1 into llvm.amdgcn.class(x, ~mask). Added LIT tests as well. Differential Revision: https://reviews.llvm.org/D104049	2021-06-18 13:04:12 -06:00
Hongtao Yu	89f841e69b	[CSSPGO] Fix an invalid hash table reference issue in the CS preinliner. We were using a `StringMap` object to store all profiles to be emitted. The object is basically an unordered hash table, therefore updating it in the process of traversing it may cause issues since the underlying bucket array could change. I'm also moving the `csspgo-preinliner` switch around so that no context trie will be constructed (by the constructor of `CSPreInliner`) when the switch is off. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D104267	2021-06-18 11:54:23 -07:00
Andrew Browne	d75f39d6d7	[DFSan] Cleanup code for platforms other than Linux x86_64. These other platforms are unsupported and untested. They could be re-added later based on MSan code. Reviewed By: gbalats, stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D104481	2021-06-18 11:21:46 -07:00
Krzysztof Parzyszek	f7732c65f0	Revert "Delay initialization of OptBisect" This reverts commit ec91df8d8195b8b759a89734dba227da1eaa729f. It was committed by accident.	2021-06-18 13:16:45 -05:00
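
The RISC-V Zba entries above (45e207d211 and e7b41c5208) list nested shNadd forms for multiplications by 18, 20, 24, 36, 40 and 72, plus SLLI forms for 3, 5 and 9 shifted left by C. The arithmetic can be checked with a small sketch. It assumes that `(SHnADD a, b)` computes `b + (a << n)` on 64-bit registers, which is how the Zba specification defines sh1add/sh2add/sh3add; the helper names (`shadd`, `slli`, `add_mul`, `shifted_mul`) are hypothetical and are not LLVM APIs. Under that operand order the nested form multiplies its inner operand (Rx) by the constant and then adds Ry, so that is the identity the sketch verifies.

```python
MASK64 = (1 << 64) - 1  # model 64-bit register wraparound (RV64 assumed)


def shadd(n, rs1, rs2):
    """Hypothetical model of Zba sh{n}add: rs2 + (rs1 << n), truncated to 64 bits."""
    return (rs2 + (rs1 << n)) & MASK64


def slli(rs1, shamt):
    """Model of a logical left shift by an immediate, truncated to 64 bits."""
    return (rs1 << shamt) & MASK64


# constant -> (outer n, inner n) for (SH<outer>ADD (SH<inner>ADD Rx, Rx), Ry),
# copied from the patterns listed in the commit messages.
ADD_MUL_PATTERNS = {
    18: (1, 3),
    20: (2, 2),
    24: (3, 1),
    36: (2, 3),
    40: (3, 2),
    72: (3, 3),
}

# multiplier -> inner n for (SLLI (SH<inner>ADD Rx, Rx), C)
SHIFTED_MUL_PATTERNS = {3: 1, 5: 2, 9: 3}


def add_mul(c, rx, ry):
    """Evaluate the nested shNadd form listed for constant c."""
    outer, inner = ADD_MUL_PATTERNS[c]
    return shadd(outer, shadd(inner, rx, rx), ry)


def shifted_mul(k, c, rx):
    """Evaluate (SLLI (SHnADD Rx, Rx), C), listed for Rx * (k << C)."""
    return slli(shadd(SHIFTED_MUL_PATTERNS[k], rx, rx), c)
```

For example, `add_mul(18, rx, ry)` first forms `9 * rx` with the inner sh3add and then `ry + (9 * rx << 1)` with the outer sh1add, all modulo 2^64.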

217651 Commits