llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Shoaib Meenai	011d3c9d37	[cmake] Add support for multiple distributions LLVM's build system contains support for configuring a distribution, but it can often be useful to be able to configure multiple distributions (e.g. if you want separate distributions for the tools and the libraries). Add this support to the build system, along with documentation and usage examples. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D89177	2021-05-12 11:13:18 -07:00
Fangrui Song	1050db3338	[X86] Fix -Wunused-lambda-capture	2021-05-12 10:34:32 -07:00
Fangrui Song	686fd925f4	[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions llvm-dev message: https://lists.llvm.org/pipermail/llvm-dev/2021-May/150465.html In an ELF shared object, a default visibility defined symbol is preemptible by default. This creates some missed optimization opportunities. -fno-semantic-interposition can optimize -fPIC: * in Clang: avoid GOT/PLT cost for variable access/function calls to external linkage definition in the same TU * in GCC: enable interprocedural optimizations (including inlining) and avoid PLT See https://gist.github.com/MaskRay/2d4dfcfc897341163f734afb59f689c6 for more information. -Bsymbolic-functions is more aggressive than -fvisibility-inlines-hidden (present since 2012) as it applies to all function definitions. It can * avoid PLT for cross-TU function calls && reduce dynamic symbol lookup * reduce dynamic symbol lookup for taking function addresses and optimize out GOT/TOC on x86-64/ppc64 With both options, the libLLVM.so and libclang-cpp.so performance should be closer to PIE binary linking against `libLLVM.a` and `libclang.a` (In a -DLLVM_TARGETS_TO_BUILD=X86 build, the number of JUMP_SLOT decreases from 12716 to 1628, and the number of GLOB_DAT decreases from 1918 to 1313 The built clang with `-DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` is significantly faster. See the Linux kernel build result https://bugs.archlinux.org/task/70697 ) Some implication: Interposing a subset of functions is no longer supported. (This is fragile anyway and cannot really be supported. For Mach-O we don't use `ld -interpose`, so interposition is not supported on Mach-O at all.) Compiling a program which takes the address of any LLVM function with `{gcc,clang} -fno-pic` and expects the address to equal to the address taken from libLLVM.so or libclang-cpp.so is unsupported. I am fairly confident that llvm-project shouldn't have different behaviors depending on such pointer equality (as we've been using -fvisibility-inlines-hidden which applies to inline functions for a long time), but if we accidentally do, users should be aware that they should not make assumption on pointer equality in `-fno-pic` mode. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D102090	2021-05-12 10:34:31 -07:00
Simon Pilgrim	b8bfb9ec6b	[X86][AVX] Fold concat(pslq(x,32),pslq(y,32)) -> shuffle(concat(x,y),zero) (PR46621) On AVX1 targets we can handle v4i64 logical shifts by 32 bits as a pair of v8f32 shuffles with zero. I was hoping to put this in LowerScalarImmediateShift, but performing that early causes regressions where other instructions were respliting the subvectors.	2021-05-12 18:04:40 +01:00
Amara Emerson	9018891110	[AArch64][GlobalISel] Add MMOs to constant pool loads to allow LICM hoisting. This caused performance regressions vs SDAG on SingleSource/Benchmarks/Adobe-C++	2021-05-12 09:47:09 -07:00
Abhina Sreeskantharajan	d19ac5a2e6	[SystemZ][z/OS] Fix warning caused by umask returning a signed integer type On z/OS, umask() returns an int because mode_t is type int, however it is being compared to an unsigned int. This patch fixes the following warning we see when compiling Path.cpp. ``` comparison of integers of different signs: 'const int' and 'const unsigned int' ``` Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D102326	2021-05-12 12:26:22 -04:00
Victor Huang	2b9c980a98	[PowerPC] Fix definitions of CMPRB8, CMPEQB, CMPRB, SETB in PPCInstr64Bit.td and PPCInstrInfo.td	2021-05-12 10:59:33 -05:00
Baptiste Saleil	c1b3aa761b	[AMDGPU] Disable the SIFormMemoryClauses pass at -O1 This patch disables the SIFormMemoryClauses pass at -O1. This pass has a significant impact on compilation time, so we only want it to be enabled starting from -O2. Differential Revision: https://reviews.llvm.org/D101939	2021-05-12 11:51:59 -04:00
Simon Pilgrim	df83d38bf7	[X86][AVX] combineConcatVectorOps - add ConcatSubOperand helper. NFCI. Pull out repeated code to create a concat_vectors of the same operand from all subvecs.	2021-05-12 16:42:18 +01:00
Simon Pilgrim	071634994d	[X86][AVX] Add v4i64 shift-by-32 tests AVX1 could perform this as a v8f32 shuffle instead of splitting - based off PR46621	2021-05-12 16:42:18 +01:00
Fraser Cormack	4db676be05	[TargetLowering] Improve legalization of scalable vector types This patch extends the vector type-conversion and legalization capabilities of scalable vector types. Firstly, `vscale x 1` types now behave more like the corresponding `vscale x 2+` types. This enables the integer promotion legalization of extended scalable types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`. These `vscale x 1` types are also now better handled by `getVectorTypeBreakdown`, where what looks like older handling for 1-element fixed-length vector types was spuriously updated to include scalable types. Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR` to insert the smaller scalable vector "value" type into the wider scalable vector "part" type. This allows AArch64 to pass and return `vscale x 1` types by value by widening. There are still cases where we are unable to legalize `vscale x 1` types, such as where expansion would require splitting the vector in two. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102073	2021-05-12 16:33:07 +01:00
Jordan Rupprecht	817ad917ee	[llvm-cov][test] Add test coverage for "gcov" implying "llvm-cov gcov" compatibility. Much like other LLVM binary utilities, `llvm-cov` has a symlink compatibility feature where it runs in `gcov` compatibility mode if the binary name ends in `gcov`. This is identical to invoking `llvm-cov gcov ...`. Differential Revision: https://reviews.llvm.org/D102299	2021-05-12 08:21:42 -07:00
Craig Topper	f3f2380881	[ValueTypes] Rename MVT::getVectorNumElements() to MVT::getVectorMinNumElements(). Fix some misuses of getVectorNumElements() getVectorNumElements() returns a value for scalable vectors without any warning so it is effectively getVectorMinNumElements(). By renaming it and making getVectorNumElements() forward to it, we can insert a check for scalable vectors into getVectorNumElements() similar to EVT. I didn't do that in this patch because there are still more fixes needed, but I was able to temporarily do it and passed the RISCV lit tests with these changes. The changes to isPow2VectorType and getPow2VectorType are copied from EVT. The change to TypeInfer::EnforceSameNumElts reduces the size of AArch64's isel table. We're now considering SameNumElts to require the scalable property to match which removes some unneeded type checks. This was motivated by the bug I fixed yesterday in 80b9510806cf11c57f2dd87191d3989fc45defa8 Reviewed By: frasercrmck, sdesmalen Differential Revision: https://reviews.llvm.org/D102262	2021-05-12 07:46:45 -07:00
Stefan Pintilie	3947a17f41	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This reverts commit 6c80361b8474535852afb2f7201370fb5f410091. Breaks PowerPC Big Endian buildbots.	2021-05-12 09:46:18 -05:00
Hendrik Greving	82a56e9c79	[DAGCombiner] Fix DAG combine store elimination, different address space. Fixes a bug in the DAG combiner that eliminates the stores because it missed to inspect the address space of the pointers. %v = load %ptr_as1 // no chain side effect store %v, %ptr_as2 As well as store %v, %ptr_as1 store %v, %ptr_as2 Fixes a test for above in X86. Differential Revision: https://reviews.llvm.org/D102096	2021-05-12 07:14:22 -07:00
Hendrik Greving	abd9b81f2f	[DAGCombiner] Add test exposing bug in DAG combine. Adds a test in X86, exposing a bug in DAG combine eliminating stores that are the same value but no the same address space. Differential Revision: https://reviews.llvm.org/D102243	2021-05-12 07:14:21 -07:00
Peter Waller	765479e0fb	[CodeGen][AArch64][SVE] Fold [rdffr, ptest] => rdffrs; bugfix for optimizePTestInstr When a ptest is used to set flags from the output of rdffr, the ptest can be eliminated, using a flags-setting rdffrs instead. Additionally, check that nothing consumes flags between rdffr and ptest; this case appears to have been missed previously. * There is no unpredicated RDFFRS instruction. * If substituting RDFFR_PP, require that the mask argument of the PTEST matches that of the RDFFR_PP. * Move some precondition code up inside optimizePTestInstr, so that it covers the new code paths for RDFFR which return earlier. * Only consider RDFFR, PTEST in same basic block. * Check for other flag setting instructions between the two, abort if found. * Drop an old TODO comment about removing dead PTEST instructions. RDFFR_P to follow in later patch. Differential Revision: https://reviews.llvm.org/D101357	2021-05-12 15:06:22 +01:00
David Sherwood	007b167d4f	[NFC] Use variable GEP index in vec_demanded_elts tests I've changed a test in each of these files: Transforms/InstCombine/vec_demanded_elts.ll Transforms/InstCombine/vec_demanded_elts-inseltpoison.ll to use a variable GEP index instead of a constant value so that we're testing the more general case.	2021-05-12 14:56:04 +01:00
Martin Storsjö	6f0905c089	[Passes] Reenable the relative lookup table converter pass for ELF and COFF on aarch64 The bug (PR50227, affecting COFF) that caused the revert in 6f5670a4c3d8c079d4b676140ee69e5cc235d5a8 has been fixed in 382c505d9cfca8adaec47aea2da7bbcbc00fc05c now, so it should be safe to reenable the pass for that target (and ELF). In PR50227 it's also mentioned that the same pass seems to cause problems on aarch64 on darwin, so leaving it disabled there for now.	2021-05-12 16:42:11 +03:00
Greg McGary	0e7e9c4f85	[llvm-objdump] Exclude __mh__header symbols during MachO disassembly `__mh_(execute\|dylib\|dylinker\|bundle\|preload\|object)_header` are special symbols whose values hold the VMA of the Mach header to support introspection. They are attached to the first section in `__TEXT`, even though their addresses are outside `__TEXT`, and they do not refer to code. It is normally harmless, but when the first section of `__TEXT` has no other symbols, `__mh__header` is considered by the disassembler when determing function boundaries. Since `__mh_*_header` refers to an address outside `__TEXT`, the boundary determination fails and disassembly quits. Since `__TEXT,__text` normally has symbols, this bug is obscured. Experiments placing `__stubs` and `__stub_helper` first exposed the bug, since neither has symbols. Differential Revision: https://reviews.llvm.org/D101786	2021-05-12 06:39:14 -07:00
Julien Pagès	94c9dd7bff	[AMDGPU] Improve Codegen for build_vector Improve the code generation of build_vector. Use the v_pack_b32_f16 instruction instead of v_and_b32 + v_lshl_or_b32 Differential Revision: https://reviews.llvm.org/D98081 Patch by Julien Pagès!	2021-05-12 14:17:44 +01:00
Roman Lebedev	3841b63321	[InstCombine] ~(C + X) --> ~C - X (PR50308) We can not rely on (C+X)-->(X+C) already happening, because we might not have visited that `add` yet. The added testcase would get stuck in an endless combine loop.	2021-05-12 16:10:55 +03:00
Jay Foad	3a723e1522	[TargetRegisterInfo] Speed up getAllocatableSet. NFCI. MachineRegisterInfo caches the reserved register set that is computed by by TargetRegisterInfo::getReservedRegs, so call into MRI to get the reserved regs to avoid recomputing them. In particular this speeds up AMDGPU's SIFormMemoryClauses pass because AMDGPU has a particularly complicated reserved set that is expensive to compute. Differential Revision: https://reviews.llvm.org/D102318	2021-05-12 14:09:05 +01:00
Piotr Sobczak	876cd27749	[AMDGPU] Remove assert Remove assert introduced in D101177, following post-commit feedback.	2021-05-12 14:52:37 +02:00
Sanjay Patel	4479d97145	[x86] try harder to lower to PCMPGT instead of not-of-PCMPEQ This is motivated by the example in https://llvm.org/PR50055 , but it doesn't do anything for that bug currently because we don't actually have a zero-extended setcc there. Proof for the generic transform (inverse of what we would try to do in combining): https://alive2.llvm.org/ce/z/aBL-Mg Differential Revision: https://reviews.llvm.org/D102275	2021-05-12 08:25:29 -04:00
Sanjay Patel	36d619684f	[x86] add test for pcmpeq with 0; NFC	2021-05-12 08:25:29 -04:00
Simon Pilgrim	42fa48b514	[X86][AVX] canonicalizeShuffleMaskWithHorizOp - improve support for 256/512-bit vectors Extend the HOP(HOP(X,Y),HOP(Z,W)) and SHUFFLE(HOP(X,Y),HOP(Z,W)) folds to handle repeating 256/512-bit vector cases. This allows us to drop the UNPACK(HOP(),HOP()) custom fold in combineTargetShuffle. This required isRepeatedTargetShuffleMask to be tweaked to support target shuffle masks taking more than 2 inputs.	2021-05-12 12:13:24 +01:00
gbreynoo	ba11c7d4fb	[llvm-readelf] Unhide short options to match the command guide The readelf command guide shows the short options used as aliases but these are not found in the help text unless --show-hidden is used, other tools show aliases with --help. This change fixes the help output to be consistent with the command guide. Differential Revision: https://reviews.llvm.org/D102173	2021-05-12 12:09:08 +01:00
gbreynoo	90e3438b60	[llvm-symbolizer] Place Mach-O options into the Mach-O option group. In the help output of other tools and in the symbolizer command guide, Mach-O specific options are in their own section. This change fixes the symbolizer help output to be consistent. Differential Revision: https://reviews.llvm.org/D102178	2021-05-12 12:04:54 +01:00
David Sherwood	0d70233bde	[LoopVectorize] Fix scalarisation crash in widenPHIInstruction for scalable vectors In InnerLoopVectorizer::widenPHIInstruction there are cases where we have to scalarise a pointer induction variable after vectorisation. For scalable vectors we already deal with the case where the pointer induction variable is uniform, but we currently crash if not uniform. For fixed width vectors we calculate every lane of the scalarised pointer induction variable for a given VF, however this cannot work for scalable vectors. In this case I have added support for caching the whole vector value for each unrolled part so that we can always extract an arbitrary element. Additionally, we still continue to cache the known minimum number of lanes too in order to improve code quality by avoiding an extractelement operation. I have adapted an existing test `pointer_iv_mixed` from the file: Transforms/LoopVectorize/consecutive-ptr-uniforms.ll and added it here for scalable vectors instead: Transforms/LoopVectorize/AArch64/sve-widen-phi.ll Differential Revision: https://reviews.llvm.org/D101294	2021-05-12 11:02:11 +01:00
Peter Waller	7081b98fbb	[AArch64][SVE] Improve sve.convert.to.svbool lowering The sve.convert.to.svbool lowering has the effect of widening a logical <M x i1> vector representing lanes into a physical <16 x i1> vector representing bits in a predicate register. In general, if converting to svbool, the contents of lanes in the physical register might not be known. For sve.convert.to.svbool the new lanes are specified to be zeroed, requiring 'and' instructions to mask off the new lanes. For lanes coming from a ptrue or a comparison, however, they are known to be zero. CodeGen Before: ptrue p0.s, vl16 ptrue p1.s ptrue p2.b and p0.b, p2/z, p0.b, p1.b ret After: ptrue p0.s, vl16 ret Differential Revision: https://reviews.llvm.org/D101544	2021-05-12 10:57:25 +01:00
Stephen Tozer	e0ba9acd2a	Reapply "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST" Previous crashes caused by this patch were the result of machine subregisters being incorrectly handled in updateDbgUsersToReg; this has been fixed by using RegUnits to determine overlapping registers, instead of using the register values directly. Differential Revision: https://reviews.llvm.org/D101523 This reverts commit 7ca26c5fa2df253878cab22e1e2f0d6f1b481218.	2021-05-12 10:19:57 +01:00
Piotr Sobczak	5d49c0c0e7	[AMDGPU] Skip invariant loads when avoiding WAR conflicts No need to handle invariant loads when avoiding WAR conflicts, as there cannot be a vector store to the same memory location. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D101177	2021-05-12 10:57:05 +02:00
Tomas Matheson	738c47313c	[ARM] Prevent spilling between ldrex/strex pairs Based on the same for AArch64: 4751cadcca45984d7671e594ce95aed8fe030bf1 At -O0, the fast register allocator may insert spills between the ldrex and strex instructions inserted by AtomicExpandPass when expanding atomicrmw instructions in LL/SC loops. To avoid this, expand to cmpxchg loops and therefore expand the cmpxchg pseudos after register allocation. Required a tweak to ARMExpandPseudo::ExpandCMP_SWAP to use the 4-byte encoding of UXT, since the pseudo instruction can be allocated a high register (R8-R15) which the 2-byte encoding doesn't support. However, the 4-byte encodings are not present for ARM v8-M Baseline. To enable this, two new pseudos are added for Thumb which are only valid for v8mbase, tCMP_SWAP_8 and tCMP_SWAP_16. The previously committed attempt in D101164 had to be reverted due to runtime failures in the test suites. Rather than spending time fixing that implementation (adding another implementation of atomic operations and more divergence between backends) I have chosen to follow the approach taken in D101163. Differential Revision: https://reviews.llvm.org/D101898 Depends on D101912	2021-05-12 09:43:21 +01:00
Tomas Matheson	3ac69db0a3	[ARM] Precommit test for D101898 Differential Revision: https://reviews.llvm.org/D101912	2021-05-12 09:43:21 +01:00
Alex Orlov	c23a0ac8d6	Fixed llvm-objcopy to add correct symbol table for ELF with program headers. This fixes the following bugs: https://bugs.llvm.org/show_bug.cgi?id=43935 Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D102258	2021-05-12 12:39:30 +04:00
Djordje Todorovic	6a6d1c647a	[NFC][llvm-dwarfdump] Avoid passing std::string by value in collectStatsForDie()	2021-05-12 01:29:37 -07:00
Martin Storsjö	b44324fc5c	[COFF] Fix ARM and ARM64 REL32 relocations to be relative to the end of the relocation This matches how they are defined on X86. This should fix the relative lookup tables pass for COFF, allowing it to be reenabled. Differential Revision: https://reviews.llvm.org/D102217	2021-05-12 09:53:43 +03:00
Vitaly Buka	79d2416a46	[symbolizer] Fix leak after D96883	2021-05-11 22:51:36 -07:00
Qiu Chaofan	3be56445eb	[VectorComine] Restrict single-element-store index to inbounds constant Vector single element update optimization is landed in 2db4979. But the scope needs restriction. This patch restricts the index to inbounds and vector must be fixed sized. In future, we may use value tracking to relax constant restrictions. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D102146	2021-05-12 13:18:20 +08:00
Congzhe Cao	6e3a5acf29	[LoopInterchange] Handle lcssa PHIs with multiple predecessors This is a bugfix in the transformation phase. If the original outer loop header branches to both the inner loop (header) and the outer loop latch, and if there is an lcssa PHI node outside the loop nest, then after interchange the new outer latch will have an lcssa PHI node inserted which has two predecessors, i.e., the original outer header and the original outer latch. Currently the transformation assumes it has only one predecessor (the original outer latch) and crashes, since the inserted lcssa PHI node does not take both predecessors as incoming BBs. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D100792	2021-05-11 21:30:54 -04:00
Matt Arsenault	836042a211	AMDGPU: Fix SILoadStoreOptimizer for gfx90a This was hardcoding the register class to use for the newly created pointer registers, violating the aligned VGPR requirement.	2021-05-11 21:26:43 -04:00
Matt Arsenault	71baa5ae09	GlobalISel: Don't hardcode varargs=false in resultsCompatible	2021-05-11 20:22:06 -04:00
Matt Arsenault	a97e83f7fc	AMDGPU: Fix assert on constant load from addrspacecasted pointer This was trying to create a bitcast between different address spaces.	2021-05-11 20:12:20 -04:00
Matt Arsenault	b97c2a8a8f	GlobalISel: Make constant fields const	2021-05-11 20:10:55 -04:00
Matt Arsenault	d8bc7ef86e	GlobalISel: Split ValueHandler into assignment and emission classes Currently the ValueHandler handles both selecting the type and location for arguments, as well as inserting instructions needed to handle them. Split this so that the determination of the argument handling is independent of the function state. Currently the checks for tail call compatibility do not follow the full assignment logic, so it misses cases where arguments require nontrivial legalization. This should help avoid targets ending up in a buggy state where the argument evaluation may change in different contexts.	2021-05-11 19:50:12 -04:00
Matt Arsenault	e77e0d934c	GlobalISel: Move AArch64 AssignFnVarArg to base class We can handle the distinction easily enough in the generic code, and this makes it easier to abstract the selection of type/location from the code to insert code.	2021-05-11 19:50:12 -04:00
Jordan Rupprecht	f19959f1c4	Revert "[GVN] Clobber partially aliased loads." This reverts commit 6c570442318e2d3b8b13e95c2f2f588d71491acb. It causes assertion errors due to widening atomic loads, and potentially causes miscompile elsewhere too. Repro, also posted to D95543: ``` $ cat repro.ll ; ModuleID = 'repro.ll' source_filename = "repro.ll" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.widget = type { i32 } %struct.baz = type { i32, %struct.snork } %struct.snork = type { %struct.spam } %struct.spam = type { i32, i32 } @global = external local_unnamed_addr global %struct.widget, align 4 @global.1 = external local_unnamed_addr global i8, align 1 @global.2 = external local_unnamed_addr global i32, align 4 define void @zot(%struct.baz* %arg) local_unnamed_addr align 2 { bb: %tmp = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1 %tmp1 = bitcast %struct.snork* %tmp to i64* %tmp2 = load i64, i64* %tmp1, align 4 %tmp3 = getelementptr inbounds %struct.baz, %struct.baz* %arg, i64 0, i32 1, i32 0, i32 1 %tmp4 = icmp ugt i64 %tmp2, 4294967295 br label %bb5 bb5: ; preds = %bb14, %bb %tmp6 = load i32, i32* %tmp3, align 4 %tmp7 = icmp ne i32 %tmp6, 0 %tmp8 = select i1 %tmp7, i1 %tmp4, i1 false %tmp9 = zext i1 %tmp8 to i8 store i8 %tmp9, i8* @global.1, align 1 %tmp10 = load i32, i32* @global.2, align 4 switch i32 %tmp10, label %bb11 [ i32 1, label %bb12 i32 2, label %bb12 ] bb11: ; preds = %bb5 br label %bb14 bb12: ; preds = %bb5, %bb5 %tmp13 = load atomic i32, i32* getelementptr inbounds (%struct.widget, %struct.widget* @global, i64 0, i32 0) acquire, align 4 br label %bb14 bb14: ; preds = %bb12, %bb11 br label %bb5 } $ opt -O2 repro.ll -disable-output opt: /home/rupprecht/src/llvm-project/llvm/lib/Transforms/Utils/VNCoercion.cpp:496: llvm::Value llvm::VNCoercion::getLoadValueForLoad(llvm::LoadInst , unsigned int, llvm::Type , llvm::Instruction , const llvm::DataLayout &): Assertion `SrcVal->isSimple() && "Cannot widen volatile/atomic load!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/rupprecht/dev/opt -O2 repro.ll -disable-output ... ```	2021-05-11 16:08:53 -07:00
Lang Hames	705819c25d	[JITLink] Fix bogus format string.	2021-05-11 16:04:00 -07:00
Congzhe Cao	819056fa35	[LoopInterchange] Fix legality for triangular loops This is a bug fix in legality check. When we encounter triangular loops such as the following form: for (int i = 0; i < m; i++) for (int j = 0; j < i; j++), or for (int i = 0; i < m; i++) for (int j = 0; j*i < n; j++), we should not perform interchange since the number of executions of the loop body will be different before and after interchange, resulting in incorrect results. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D101305	2021-05-11 18:36:53 -04:00

1 2 3 4 5 ...

215601 Commits