llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00

Author	SHA1	Message	Date
Jay Foad	f6bcc30446	[AMDGPU] Mark the scheduling model as complete	2020-02-28 13:35:55 +00:00
Jay Foad	8de9a35fef	[AMDGPU] Update a comment missed in 74e2974ac6a	2020-02-28 13:35:55 +00:00
Alexey Lapshin	ccd90397cc	Fix buildbots after c074f5234d29439116f0e0be6033ea9331e85394. Removed unused function getSectionByName() from dsymutil/DwarfStreamer.cpp.	2020-02-28 16:20:29 +03:00
Simon Cook	5c4172c74b	[RISCV] Compress instructions based on function features When running under LTO, it is common to not specify the architecture spec, which is used for setting up the target machine, and instead rely on features specified in each function to generate the correct instructions. This works for the code generator, but the RISC-V backend uses the AsmPrinter to do instruction compression, which does not see these features but instead uses a MCSubtargetInfo object to see whether compression is enabled. Since this is configured based on the TargetMachine at startup, it will result in compressed instructions not being emitted when it has not been given the 'c' TargetFeature, but the function has it. This changes the RISCVAsmPrinter to re-initialize the STI feature set based on the current MachineFunction, such that compressed instructions are now correctly emitted regardless of the method used to enable them. Differential revision: https://reviews.llvm.org/D73339	2020-02-28 11:52:55 +00:00
LLVM GN Syncbot	cff53cb2cb	[gn build] Port 6af859dcca2	2020-02-28 11:49:23 +00:00
Jeremy Morse	80365b065d	[DebugInfo] Re-implement LexicalScopes dominance method, add unit tests Way back in D24994, the combination of LexicalScopes::dominates and LiveDebugValues was identified as having worst-case quadratic complexity, but it wasn't triggered by any code path at the time. I've since run into a scenario where this occurs, in a very large basic block where large numbers of inlined DBG_VALUEs are present. The quadratic-ness comes from LiveDebugValues::join calling "dominates" on every variable location, and LexicalScopes::dominates potentially touching every instruction in a block to test for the presence of a scope. We have, however, already computed the presence of scopes in blocks, in the "InsnRanges" of each scope. This patch switches the dominates method to examine whether a block is present in a scope's InsnRanges, avoiding walking through the whole block. At the same time, fix getMachineBasicBlocks to account for the fact that InsnRanges can cover multiple blocks, and add some unit tests, as Lexical Scopes didn't have any. Differential revision: https://reviews.llvm.org/D73725	2020-02-28 11:41:28 +00:00
Juneyoung Lee	274ef2efad	Let EarlyCSE fold equivalent freeze instructions Summary: This patch makes EarlyCSE fold equivalent freeze instructions. Another optimization that I think will be useful is to remove freeze if its operand is used as a branch condition or at llvm.assume: ``` %c = ... br i1 %c, label %A, .. A: %d = freeze %c ; %d can be optimized to %c because %c cannot be poison or undef (or 'br %c' would be UB otherwise) ``` If it makes sense for EarlyCSE to support this as well, I will make a patch for this. Reviewers: spatel, reames, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75334	2020-02-28 20:35:20 +09:00
Peter Smith	9a1fd96887	[MC][ELF][ARM] Add relocations for some pc-relative fixups Add ELF relocations for the following fixups: fixup_thumb_adr_pcrel_10 -> R_ARM_THM_PC8 fixup_thumb_cp -> R_ARM_THM_PC8 fixup_t2_adr_pcrel_12 -> R_ARM_THM_PREL_11_0 fixup_t2_ldst_pcrel_12 -> R_ARM_THM_PC12 While these relocations are short-ranged there is support in the open source ELF linkers in binutils and soon to be in LLD. MC will no longer resolve pc-relative fixups to global symbols due to interpositioning concerns. We can handle these at link time by implementing the relocations. The R_ARM_THM_PC8 has some extra encoding rules for addends that llvm-mc sidesteps by not supporting addends for these instructions, using the wide Thumb 2 instruction if it is available. I think that this is a reasonable compromise given that these are rare. This partially reverts D72892, the Thumb fixups no longer need to be evaluated at assembly time. Differential Revision: https://reviews.llvm.org/D75039	2020-02-28 11:29:29 +00:00
Sam Parker	48cb6ea724	[NFC][ARM] Add tests	2020-02-28 11:24:02 +00:00
Jay Foad	68730495b8	[AMDGPU] Precommit some scheduler related test updates Summary: The point of this is to make some tests with manual checks robust against scheduler tweaks, so that only autogenerated test updates will be required when pushing D68338 "[AMDGPU] Remove dubious logic in bidirectional list scheduler". Reviewers: arsenm, rampitec, vpykhtin Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75302	2020-02-28 11:20:58 +00:00
Sam Parker	61a9bbd1eb	[RDA] Track implicit-defs Ensure that we're recording implicit defs, as well as visiting implicit uses and implicit defs when we're walking through operands. Differential Revision: https://reviews.llvm.org/D75185	2020-02-28 11:14:42 +00:00
Stefan Agner	a40bf13d13	[ARM][Thumb2] support .w assembler qualifier for dmb/dsb/isb Support the explicit wide assembler qualifier for the dmb/dsb/isb synchronization barrier instructions. Differential revision: https://reviews.llvm.org/D75143	2020-02-28 11:08:24 +00:00
Stefan Agner	d1deb4199c	[ARM][Thumb2] Support .w assembler qualifier for pld/pldw/pli Accept explicit wide assembler qualifier for the pld/pldw/pli. Differential revision: https://reviews.llvm.org/D75144	2020-02-28 11:08:24 +00:00
Djordje Todorovic	79c36c8212	[NFC] [Test commit] Testing commit access with new email	2020-02-28 12:01:52 +01:00
Alexey Lapshin	24b18e4dfe	[DWARFLinker][NFC] Remove usages of "const object::ObjectFile" from DWARFLinker. Summary: DWARFContext has all the required information to access source debug info. It is not necessary to use "const object::ObjectFile" to create DWARFContext. Thus this patch removes all usages of "const object::ObjectFile" from DWARFLinker. Instead, already created DWARFContext is passed to DWARFLinker. The purpose is to not depend on "const object::ObjectFile". The patch looks big, but most of changes are renamings and movements. Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: hiraditya, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D75029	2020-02-28 13:26:22 +03:00
Hans Wennborg	3d298169f4	SROA: Don't drop atomic load/store alignments (PR45010) SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions is set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266	2020-02-28 10:38:40 +01:00
serge-sans-paille	c97c187c13	No longer generate calls to *_finite According to Joseph Myers, a libm maintainer > They were only ever an ABI (selected by use of -ffinite-math-only or > options implying it, which resulted in the headers using "asm" to redirect > calls to some libm functions), not an API. The change means that ABI has > turned into compat symbols (only available for existing binaries, not for > anything newly linked, not included in static libm at all, not included in > shared libm for future glibc ports such as RV32), so, yes, in any case > where tools generate direct calls to those functions (rather than just > following the "asm" annotations on function declarations in the headers), > they need to stop doing so. As a consequence, we should no longer assume these symbols are available on the target system. Still keep the TargetLibraryInfo for constant folding. Differential Revision: https://reviews.llvm.org/D74712	2020-02-28 10:07:37 +01:00
Hans Wennborg	34b8d5cba5	llvm-ar: Fix MinGW compilation llvm-ar is using CompareStringOrdinal which is available only starting with Windows Vista (WINVER 0x600). Fix this by hoisting WindowsSupport.h, which sets _WIN32_WINNT to 0x0601, up to llvm/include/llvm/Support and use it in llvm-ar. Patch by Cristian Adam! Differential revision: https://reviews.llvm.org/D74599	2020-02-28 09:59:24 +01:00
Igor Kudrin	cd7f9e3ce0	[DebugInfo] Fix parsing DWARF64 units in DWP. The integrity check code allowed only DWARF32 units. Differential Revision: https://reviews.llvm.org/D75178	2020-02-28 15:35:51 +07:00
Igor Kudrin	f473ff580e	[DebugInfo] Avoid crashing when parsing an invalid unit header in DWP. The integrity checks for index entries in DWARFUnitHeader::extract() might cause the function to return before checking the state of an Error object, which leads to a crash at runtime. The patch fixes the issue by moving the checks to a safe place. Differential Revision: https://reviews.llvm.org/D75177	2020-02-28 15:35:51 +07:00
Pavel Labath	7d8c00606f	[DataExtractor] Improve error message when we run off the end of the buffer Summary: Include the offset at which this happened. Reviewers: dblaikie, jhenderson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75265	2020-02-28 09:02:33 +01:00
Saleem Abdulrasool	3ae1657c3e	build: process the libxml2 library path for embedding Process the path for libxml2 before embedding that into the command line that is generated in `llvm-config`. Each element in the path is being given a `-l` unconditionally which should not be the case for absolute paths. Since the library path may be absolute or not, just apply some CMake pre-processing when generating the path. Before: ``` /usr/lib/x86_64-linux-gnu/libz.so -lrt -ldl -ltinfo -lpthread -lm /usr/lib/x86_64-linux-gnu/libxml2.so ``` After: ``` /usr/lib/x86_64-linux-gnu/libz.so -lrt -ldl -ltinfo -lpthread -lm -lxml2 ``` Resolves PR44179!	2020-02-27 22:00:30 -08:00
Craig Topper	af4ecda792	[X86] Add FMA commuting test case for D75016 This test case shows extra moves due to not fully considering all commuting opportunities.	2020-02-27 20:41:26 -08:00
Jun Ma	b622c11886	[Coroutines] CoroElide enhancement Fix regression of CoroElide pass when current function is coroutine. Differential Revision: https://reviews.llvm.org/D71663	2020-02-28 10:41:59 +08:00
Juneyoung Lee	a92b27e638	Revert "[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison" .. due to performance regression. This patch is reverted until infrastructure for CSE/LICM support for freeze is added. This reverts commit 181628b	2020-02-28 11:10:46 +09:00
Eli Friedman	16b6718f9a	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Reid Kleckner	312b1860b3	Add missing cstdint include not found on Windows	2020-02-27 17:24:50 -08:00
Reid Kleckner	d1e548f99a	[Support] Remove byte swapping from MathExtras.h MathExtras.h was just wrapping SwapByteOrder.h functionality, so have the callers use it directly. Use the MathExtras.h name (ByteSwap_NN) as the standard naming, since it appears to be the most popular.	2020-02-27 17:23:48 -08:00
Matt Morehouse	e284262f0e	[DFSan] Add flag to insert event callbacks. Summary: For now just insert the callback for stores, similar to how MSan tracks origins. In the future we may want to add callbacks for loads, memcpy, function calls, CMPs, etc. Reviewers: pcc, vitalybuka, kcc, eugenis Reviewed By: vitalybuka, kcc, eugenis Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits, kcc Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75312	2020-02-27 17:14:19 -08:00
Matt Morehouse	3bb1a71268	[DFSan] Remove unused IRBuilder. NFC Reviewers: pcc, vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits, kcc Tags: #llvm Differential Revision: https://reviews.llvm.org/D75190	2020-02-27 16:27:20 -08:00
Artur Pilipenko	0a4fed949e	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI translation while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
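
Why store (2) is not dead can be checked with a small simulation. This is a hypothetical sketch, not LLVM code: `run_loop` and its `keep_store_2` flag are names invented here, and a Python list stands in for the calloc'd buffer.

```python
def run_loop(n, keep_store_2=True):
    """Simulate the loop from the commit message on a zero-initialised buffer."""
    a = [0] * (n + 1)          # a = calloc(n+1)
    for i in range(n):
        a[i + 1] = 1           # store (1)
        if keep_store_2:
            a[i] = 0           # store (2): overwrites the 1 written by the previous iteration
    return a


print(run_loop(3, keep_store_2=True))   # [0, 0, 0, 1]
print(run_loop(3, keep_store_2=False))  # [0, 1, 1, 1]
```

The two results differ because store (1) in iteration i-1 writes the same slot that store (2) zeroes in iteration i, so the calloc'd zero is no longer what the slot holds.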
Craig Topper	4ee352a8a6	[llvm-exegesis] Remove unnecessary deletion of an assignment operator of WrappingIterator that angers some versions of MSVC The deletion of the const WrappingIterator & should already cover this.	2020-02-27 14:33:32 -08:00
Vedant Kumar	6f2dc299d1	unittest: Convert EXPECT_EQ iterator checks to use EXPECT_TRUE instead Hopefully fixes compile errors on some bots, like: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13383/steps/ninja%20check%201/logs/stdio /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/unittests/ADT/CoalescingBitVectorTest.cpp:452:3: required from here /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/utils/unittest/googletest/include/gtest/gtest-printers.h:377:56: error: ‘const class llvm::CoalescingBitVector<long unsigned int>::const_iterator’ has no member named ‘begin’ for (typename C::const_iterator it = container.begin(); ^ /home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/llvm/utils/unittest/googletest/include/gtest/gtest-printers.h:378:11: error: ‘const class llvm::CoalescingBitVector<long unsigned int>::const_iterator’ has no member named ‘end’ it != container.end(); ++it, ++count) { ^	2020-02-27 14:19:45 -08:00
Vedant Kumar	7887a1f316	unittest: Disable checks to work around compiler errors On some bots, using gtest asserts to compare iterators does not compile, and I'm not sure why (this certainly compiles with clang). Disable the checks for now :/. ``` C:\buildbot\as-builder-3\llvm-clang-x86_64-win-fast\llvm-project\llvm\utils\unittest\googletest\include\gtest/gtest-printers.h(377): error C2039: 'begin': is not a member of 'llvm::CoalescingBitVector<unsigned int,16>::const_iterator' C:\buildbot\as-builder-3\llvm-clang-x86_64-win-fast\llvm-project\llvm\include\llvm/ADT/CoalescingBitVector.h(243): note: see declaration of 'llvm::CoalescingBitVector<unsigned int,16>::const_iterator' C:\buildbot\as-builder-3\llvm-clang-x86_64-win-fast\llvm-project\llvm\utils\unittest\googletest\include\gtest/gtest-printers.h(478): note: see reference to function template instantiation 'void testing::internal::DefaultPrintTo<T>(testing::internal::IsContainer,testing::internal::false_type,const C &,std::ostream *)' being compiled with [ T=T1, C=T1 ] ``` http://lab.llvm.org:8011/builders/llvm-clang-x86_64-win-fast/builds/12006/steps/test-check-llvm-unit/logs/stdio http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34521/steps/ninja%20check%201/logs/stdio	2020-02-27 13:06:46 -08:00
Stanislav Mekhanoshin	036e39e260	[AMDGPU] Enable runtime unroll for LDS We want to do unroll for LDS even for runtime trip count to combine LDS operations. Differential Revision: https://reviews.llvm.org/D75293	2020-02-27 12:59:35 -08:00
LLVM GN Syncbot	f74056c4dc	[gn build] Port b0142cd9867	2020-02-27 20:40:16 +00:00
Vedant Kumar	55c3504da4	[LiveDebugValues] Encode register location within VarLoc IDs [3/3] This is part 3 of a 3-part series to address a compile-time explosion issue in LiveDebugValues. --- Start encoding register locations within VarLoc IDs, and take advantage of this encoding to speed up transferRegisterDef. There is no fundamental algorithmic change: this patch simply swaps out SparseBitVector in favor of CoalescingBitVector. That changes iteration order (hence the test updates), but otherwise this patch is NFCI. The only interesting change is in transferRegisterDef. Instead of doing: ``` KillSet = {} for (ID : OpenRanges.getVarLocs()) if (DeadRegs.count(ID)) KillSet.add(ID) ``` We now do: ``` KillSet = {} for (Reg : DeadRegs) for (ID : intervalsReservedForReg(Reg, OpenRanges.getVarLocs())) KillSet.add(ID) ``` By not visiting each open location every time we visit an instruction, this eliminates some potentially quadratic behavior. The new implementation basically does a constant amount of work per instruction because the interval map lookups are very fast. For a file in WebKit, this brings the time spent in LiveDebugValues down from ~2.5 minutes to 4 seconds, reducing compile time spent in that pass from 28% of the total to just over 1%. 
Before: ``` 2.49 min 27.8% 0 s LiveDebugValues::process 2.41 min 27.0% 5.40 s LiveDebugValues::transferRegisterDef 1.51 min 16.9% 1.51 min LiveDebugValues::VarLoc::isDescribedByReg() const 32.73 s 6.1% 8.70 s llvm::SparseBitVector<128u>::SparseBitVectorIterator::operator++() ``` After: ``` 4.53 s 1.1% 0 s LiveDebugValues::process 3.00 s 0.7% 107.00 ms LiveDebugValues::transferRegisterCopy 892.00 ms 0.2% 406.00 ms LiveDebugValues::transferSpillOrRestoreInst 404.00 ms 0.1% 32.00 ms LiveDebugValues::transferRegisterDef 110.00 ms 0.0% 2.00 ms LiveDebugValues::getUsedRegs 57.00 ms 0.0% 1.00 ms std::__1::vector<>::push_back 40.00 ms 0.0% 1.00 ms llvm::CoalescingBitVector<>::find(unsigned long long) ``` FWIW, I tried the same approach using SparseBitVector, but got bad results. To do that, I had to extend SparseBitVector to support 64-bit indices and expose its lower bound operation. The problem with this is that the performance is very hard to predict: SparseBitVector's lower bound operation falls back to O(n) linear scans in a std::list if you're not /very/ careful about managing iteration order. When I profiled this the performance looked worse than the baseline. You can see the full CoalescingBitVector-based implementation here: https://github.com/vedantk/llvm-project/commits/try-coalescing You can see the full SparseBitVector-based implementation here: https://github.com/vedantk/llvm-project/commits/try-sparsebitvec-find Depends on D74984 and D74985. Differential Revision: https://reviews.llvm.org/D74986	2020-02-27 12:39:47 -08:00
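
The two kill-set loops in the message above can be compared with a rough Python sketch. Everything here is an assumption made for illustration: the bit layout (`REG_SHIFT`), the helper names, and the use of a sorted list with `bisect` in place of CoalescingBitVector; the actual patch uses LocIndex and interval-map lookups.

```python
from bisect import bisect_left

REG_SHIFT = 32  # assumed layout: register in the high bits, index in the low bits


def make_id(reg, index):
    """Pack a register number and a location index into one ID."""
    return (reg << REG_SHIFT) | index


def reg_of(loc_id):
    """Recover the register number from a packed ID."""
    return loc_id >> REG_SHIFT


def kill_set_scan(open_ids, dead_regs):
    """Old shape: visit every open location and test its register."""
    return {i for i in open_ids if reg_of(i) in dead_regs}


def kill_set_by_interval(sorted_open_ids, dead_regs):
    """New shape: for each dead register, jump straight to its reserved ID interval."""
    killed = set()
    for reg in dead_regs:
        lo = bisect_left(sorted_open_ids, make_id(reg, 0))
        hi = bisect_left(sorted_open_ids, make_id(reg + 1, 0))
        killed.update(sorted_open_ids[lo:hi])
    return killed


open_ids = sorted(make_id(r, i) for r, i in [(1, 0), (1, 1), (2, 0), (5, 3)])
dead = {1, 5}
print(kill_set_scan(open_ids, dead) == kill_set_by_interval(open_ids, dead))
```

Both functions return the same set; the second one does work proportional to the number of dead registers and killed IDs rather than to the number of open locations.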
Vedant Kumar	04b38d9f30	[LiveDebugValues] Encode a location in VarLoc IDs, NFC [2/3] This is part 2 of a 3-part series to address a compile-time explosion issue in LiveDebugValues. --- Each VarLoc has a unique ID: this ID is used to look up a VarLoc in the VarLocMap, and to virtually insert a VarLoc into a VarLocSet. Instead of inserting the VarLoc /itself/ into the VarLocSet, we insert just the ID, because this can be represented efficiently with a SparseBitVector. This change introduces LocIndex, a layer of abstraction on top of VarLoc IDs. Prior to this change, an ID was just an index into a vector. With this change, an ID encodes both an index /and/ a register location. The type-checker ensures that conversions to and from LocIndex are correct. For the moment the register location is always 0 (undef). We have plenty of bits left over to encode physregs, stack slots, and other locations in the future. Differential Revision: https://reviews.llvm.org/D74985	2020-02-27 12:39:47 -08:00
Vedant Kumar	dfa1bc247b	[ADT] Add CoalescingBitVector, implemented using IntervalMap [1/3] Add CoalescingBitVector to ADT. This is part 1 of a 3-part series to address a compile-time explosion issue in LiveDebugValues. --- CoalescingBitVector is a bitvector that, under the hood, relies on an IntervalMap to coalesce elements into intervals. CoalescingBitVector efficiently represents sets which predominantly contain contiguous ranges (e.g. the VarLocSets in LiveDebugValues, which are very long sequences that look like {1, 2, 3, ...}). OTOH, CoalescingBitVector isn't good at representing sets with lots of gaps between elements. The first N coalesced intervals of set bits are stored in-place (in the initial heap allocation). Compared to SparseBitVector, CoalescingBitVector offers more predictable performance for non-sequential find() operations. This provides a crucial speedup in LiveDebugValues. Differential Revision: https://reviews.llvm.org/D74984	2020-02-27 12:39:46 -08:00
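
The coalescing idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the ADT class: `IntervalSet` and its methods are hypothetical names, and a plain sorted list with linear scans stands in for IntervalMap.

```python
class IntervalSet:
    """Set of integers stored as sorted, disjoint, non-adjacent inclusive intervals."""

    def __init__(self):
        self.intervals = []  # list of (start, stop) tuples

    def test(self, x):
        """Return True if x is in the set."""
        return any(lo <= x <= hi for lo, hi in self.intervals)

    def set(self, x):
        """Insert x, merging it with any interval that touches it."""
        if self.test(x):
            return
        start = stop = x
        kept = []
        for lo, hi in self.intervals:
            if hi + 1 == x:        # interval ends just before x: extend left
                start = lo
            elif lo == x + 1:      # interval begins just after x: extend right
                stop = hi
            else:
                kept.append((lo, hi))
        kept.append((start, stop))
        kept.sort()
        self.intervals = kept


s = IntervalSet()
for bit in (1, 3, 10, 2):
    s.set(bit)
print(s.intervals)  # [(1, 3), (10, 10)]
```

A long run such as {1, 2, 3, ...} collapses to a single interval, while a set with many gaps needs one interval per element, which matches the trade-off the commit message describes.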
Sanjay Patel	7b6e711e86	[x86] use instruction-level fast-math-flags to drive MachineCombiner The code changes here are hopefully straightforward: 1. Use MachineInstruction flags to decide if FP ops can be reassociated (use both "reassoc" and "nsz" to be consistent with IR transforms; we probably don't need "nsz", but that's a safer interpretation of the FMF). 2. Check that both nodes allow reassociation to change instructions. This is a stronger requirement than we've usually implemented in IR/DAG, but this is needed to solve the motivating bug (see below), and it seems unlikely to impede optimization at this late stage. 3. Intersect/propagate MachineIR flags to enable further reassociation in MachineCombiner. We managed to make MachineCombiner flexible enough that no changes are needed to that pass itself. So this patch should only affect x86 (assuming no other targets have implemented the hooks using MachineIR flags yet). The motivating example in PR43609 is another case of fast-math transforms interacting badly with special FP ops created during lowering: https://bugs.llvm.org/show_bug.cgi?id=43609 The special fadd ops used for converting int to FP assume that they will not be altered, so those are created without FMF. However, the MachineCombiner pass was being enabled for FP ops using the global/function-level TargetOption for "UnsafeFPMath". We managed to run instruction/node-level FMF all the way down to MachineIR sometime in the last 1-2 years though, so we can do better now. The test diffs require some explanation: 1. llvm/test/CodeGen/X86/fmf-flags.ll - no target option for unsafe math was specified here, so MachineCombiner kicks in where it did not previously; to make it behave consistently, we need to specify a CPU schedule model, so use the default model, and there are no code diffs. 2. 
llvm/test/CodeGen/X86/machine-combiner.ll - replace the target option for unsafe math with the equivalent IR-level flags, and there are no code diffs; we can't remove the NaN/nsz options because those are still used to drive x86 fmin/fmax codegen (special SDAG opcodes). 3. llvm/test/CodeGen/X86/pow.ll - similar to #1 4. llvm/test/CodeGen/X86/sqrt-fastmath.ll - similar to #1, but MachineCombiner does some reassociation of the estimate sequence ops; presumably these are perf wins based on latency/throughput (and we get some reduction of move instructions too); I'm not sure how it affects numerical accuracy, but the test reflects reality better now because we would expect MachineCombiner to be enabled if the IR was generated via something like "-ffast-math" with clang. 5. llvm/test/CodeGen/X86/vec_int_to_fp.ll - this is the test added to model PR43609; the fadds are not reassociated now, so we should get the expected results. 6. llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll - similar to #1 7. llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll - similar to #1 Differential Revision: https://reviews.llvm.org/D74851	2020-02-27 15:19:37 -05:00
Sanjay Patel	2be2e916ed	[AArch64] add splat shuffle combine test; NFC	2020-02-27 14:38:56 -05:00
Sanjay Patel	ddb43badef	[AArch64] regenerate complete test checks; NFC	2020-02-27 14:38:55 -05:00
David Tenty	ab8ea38971	[XCOFF] Don't emit non-external labels in the symbol table and handle MCSA_LGlobal Summary: We need to handle the MCSA_LGlobal case in emitSymbolAttribute for functions marked internal in the IR so that the appropriate storage class is emitted on the function descriptor csect. As part of this we need to make sure that external labels are not emitted into the symbol table, so we don't emit the descriptor label in the object writing path. Reviewers: jasonliu, DiggerLin, hubert.reinterpretcast Reviewed By: jasonliu Subscribers: Xiangling_L, wuzish, nemanjai, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74968	2020-02-27 13:37:13 -05:00
Sameer Sahasrabuddhe	d1f1b50843	[AMDGPU] improve fragile test for divergent branches Summary: The affected LIT test intends to test the correct use of divergence analysis to detect a divergent branch with a uniform predicate. The passes involved are LLVM IR passes, but the test runs llc and tries to match against generated ISA, which makes it hard to demonstrate that the intended behavior was really tested. Replaced this with a test that invokes opt on the required passes and then checks for the appropriate changes in the LLVM IR. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D75267	2020-02-27 23:31:03 +05:30
Nikita Popov	730509657a	[InstCombine] DCE instructions earlier When InstCombine initially populates the worklist, it already performs constant folding and DCE. However, as the instructions are initially visited in program order, this DCE can pick up only the last instruction of a dead chain, the rest would only get picked up in the main InstCombine run. To avoid this, we instead perform the DCE in a separate pass over the collected instructions in reverse order, which will allow us to pick up full dead instruction chains. We already need to do this reverse iteration anyway to populate the worklist, so this shouldn't add extra cost. This by itself only fixes a small part of the problem though: The same basic issue also applies during the main InstCombine loop. We generally always want DCE to occur as early as possible, because it will allow one-use folds to happen. Address this by also performing DCE while adding deferred instructions to the main worklist. This drops the number of tests that perform more than 2 InstCombine iterations from ~80 to ~40. There are some spurious test changes due to operand order / icmp toggling. Differential Revision: https://reviews.llvm.org/D75008	2020-02-27 18:45:59 +01:00
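
The effect of iteration order on a single DCE sweep can be illustrated with a toy model. This is a sketch under assumptions, not InstCombine's implementation: `single_pass_dce` is a hypothetical helper, instructions are plain tuples, and every operand is assumed to be defined in the same list.

```python
def single_pass_dce(program, reverse):
    """One sweep over `program`, a list of (name, operands, has_side_effects)
    in program order. Returns the names removed, in removal order."""
    uses = {name: 0 for name, _, _ in program}
    for _, operands, _ in program:
        for op in operands:
            uses[op] += 1
    removed = []
    order = reversed(program) if reverse else program
    for name, operands, has_side_effects in order:
        if uses[name] == 0 and not has_side_effects:
            removed.append(name)
            for op in operands:
                uses[op] -= 1  # operands may become dead in turn
    return removed


# A dead chain: c uses b, b uses a, and nothing uses c.
chain = [("a", [], False), ("b", ["a"], False), ("c", ["b"], False)]
print(single_pass_dce(chain, reverse=False))  # ['c']
print(single_pass_dce(chain, reverse=True))   # ['c', 'b', 'a']
```

In program order only the tail of the chain is dead when it is visited; in reverse order each removal exposes the next dead instruction before the sweep reaches it.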
Simon Moll	9f544f2f1b	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Pierre-vh	ed0cc3a061	[Transforms][Debugify] Ignore PHI nodes when checking for DebugLocs Fix for: https://bugs.llvm.org/show_bug.cgi?id=37964 Differential Revision: https://reviews.llvm.org/D75242	2020-02-27 16:14:11 +00:00
Dan Gohman	2fd062d2cd	[WebAssembly] Add an `isWasm` target triple predicate. This simplifies code which needs to apply the same logic to both wasm32 and wasm64. This patch is part of https://reviews.llvm.org/D70700.	2020-02-27 07:55:01 -08:00
Simon Pilgrim	9b56ed0634	[InstCombine] Add PR14365 test cases + vector equivalents.	2020-02-27 15:54:14 +00:00
Simon Pilgrim	1899e92869	[CostModel][X86] Improve extract/insert element costs (PR43605) This tries to improve the accuracy of extract/insert element costs by accounting for subvector extraction/insertion for >128-bit vectors and the shuffling of elements to/from the 0'th index. It also adds INSERTPS for f32 types and PINSR/PEXTR costs for integer types (at the moment we assume the same cost as MOVD/MOVQ - which isn't always true). Differential Revision: https://reviews.llvm.org/D74976	2020-02-27 15:54:13 +00:00

192662 Commits