mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

140809 Commits

Author SHA1 Message Date
Esme-Yi
f754a1db1b Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 119ab2181e6ed823849c93d55af8e989c28c9f3c.
2020-11-03 16:34:02 +00:00
Tim Renouf
83e3834a8d [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Tim Renouf
2a63696860 [AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.

Differential Revision: https://reviews.llvm.org/D90419

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-11-03 16:27:43 +00:00
Jay Foad
d6e4e20e6b [AMDGPU] Fix ds_read2/write2 with unaligned offsets
These instructions use a scaled offset. We were wrongly selecting them
even when the required offset was not a multiple of the scale factor.
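
As a rough illustration (hypothetical helper and field-width names, not the actual AMDGPU selection code), the legality check boils down to requiring the byte offset to be an exact multiple of the access size and the scaled value to fit the instruction's offset field:

```cpp
#include <cstdint>
#include <optional>

// Sketch only: returns the scaled (encoded) offset for a ds_read2/ds_write2
// style access, or std::nullopt if the byte offset cannot be represented.
// EltSize is the access size in bytes (the scale factor); MaxImm stands in
// for the instruction's offset-field limit.
std::optional<uint32_t> encodeDS2Offset(uint32_t ByteOffset, uint32_t EltSize,
                                        uint32_t MaxImm = 255) {
  if (ByteOffset % EltSize != 0) // the bug: this case used to be selected too
    return std::nullopt;
  uint32_t Scaled = ByteOffset / EltSize;
  if (Scaled > MaxImm)
    return std::nullopt;
  return Scaled;
}
```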

Differential Revision: https://reviews.llvm.org/D90607
2020-11-03 15:16:10 +00:00
Jameson Nash
11a667f122 make the AsmPrinterHandler array public
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed; this just cleans up the code
so that it can be exposed publicly.

Replaces https://reviews.llvm.org/D74158

Differential Revision: https://reviews.llvm.org/D89613
2020-11-03 10:02:09 -05:00
Simon Pilgrim
fc543f8a95 [DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.

The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.

We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
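
For background, a minimal sketch of known-bits propagation through a constant left shift (using a hypothetical Bits struct, not the real KnownBits API): both masks shift left and the vacated low bits become known zero.

```cpp
#include <cstdint>

// Simplified known-bits state for a 32-bit value: a bit set in Zero is known
// to be 0, a bit set in One is known to be 1.
struct Bits {
  uint32_t Zero = 0, One = 0;
};

// Known bits of (X << ShAmt) for a constant shift amount (assumes ShAmt < 32).
Bits shlKnownBits(Bits X, unsigned ShAmt) {
  Bits R;
  R.Zero = (X.Zero << ShAmt) | ((1u << ShAmt) - 1); // low ShAmt bits are zero
  R.One = X.One << ShAmt;
  return R;
}
```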
2020-11-03 14:22:28 +00:00
Sanjay Patel
d761b7c23c [x86] update cost table comments for maxnum; NFC
Follow-up suggested in D90613.
2020-11-03 08:09:59 -05:00
Roman Lebedev
67e1befafc [InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator
In particular, it makes the fold fire for C=0, because the Negator doesn't want
to perform that fold since in general it's not beneficial.
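
The identity itself is plain wrapping integer arithmetic; as a standalone illustration (not the InstCombine code):

```cpp
#include <cstdint>

// C - (X + C2) == (C - C2) - X, so the two constants can be folded up front.
// With C = 0 this becomes (-C2) - X, the case the Negator would not produce.
uint32_t before(uint32_t X) { return 10u - (X + 3u); }
uint32_t after(uint32_t X) { return (10u - 3u) - X; } // i.e. 7u - X
```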
2020-11-03 16:06:52 +03:00
Roman Lebedev
ca79221be4 [InstCombine] Negator: - (C - %x) --> %x - C (PR47997)
This relaxes the one-use restriction on that `sub` fold,
since apparently the addition of the Negator broke the
preexisting `C-(C2-X) --> X+(C-C2)` (with C=0) fold.
2020-11-03 16:06:51 +03:00
Florian Hahn
8985e25f47 [SCCP] Handle bitcast of vector constants.
Vectors where all elements have the same known constant range are treated as a
single constant range in the lattice. When bitcasting such vectors, there is a
mismatch between the width of the lattice value (single constant range) and
the original operands (vector). Go to overdefined in that case.

Fixes PR47991.
2020-11-03 12:58:39 +00:00
David Green
be5f4f896c [ARM] Remove unused variable. NFC 2020-11-03 12:58:10 +00:00
Hans Wennborg
0be5551915 Revert "[CodeGen] [WinException] Only produce handler data at the end of the function if needed"
This caused an explosion in ICF times during linking on Windows when libfuzzer
instrumentation is enabled. For a small binary we see ICF time go from ~0 to
~10 s. For a large binary it goes from ~1 s to forever (I gave up after 30
minutes).

See comment on the code review.

> If we are going to write handler data (that is written as variable
> length data following after the unwind info in .xdata), we need to
> emit the handler data immediately, but for cases where no such
> info is going to be written, skip emitting it right away. (Unwind
> info for all remaining functions that hasn't gotten it emitted
> directly is emitted at the end.)
>
> This does slightly change the ordering of sections (triggering a
> bunch of updates to DebugInfo/COFF tests), but the change should be
> benign.
>
> This also matches GCC's assembly output, which doesn't output
> .seh_handlerdata unless it actually is needed.
>
> For ARM64, the unwind info can be packed into the runtime function
> entry itself (leaving no data in the .xdata section at all), but
> that can only be done if there's no follow-on data in the .xdata
> section. If emission of the unwind info is triggered via
> EmitWinEHHandlerData (or the .seh_handlerdata directive), which
> implicitly switches to the .xdata section, there's a chance of the
> caller wanting to pass further data there, so the packed format
> can't be used in that case.
>
> Differential Revision: https://reviews.llvm.org/D87448

This reverts commit 36c64af9d7f97414d48681b74352c9684077259b.
2020-11-03 13:12:10 +01:00
Stefan Gränitz
07d8c057bd [JITLink][ELF] Implement R_X86_64_PLT32 relocations
Basic implementation for call and jmp branches with a 32-bit offset. Branches to local targets produce
Branch32 edges that are resolved like regular PCRel32 relocations. Branches to external (undefined)
targets produce Branch32ToStub edges and go through a PLT entry by default. If the target happens to
get resolved within the 32 bit range from the callsite, the edge is relaxed during post-allocation
optimization. There is a test for each of these cases.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D90331
2020-11-03 12:05:54 +00:00
David Green
1b81a85e7e [ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops
If an instruction will be lowered to a call, there is no advantage to
using a low overhead loop, as the LR register will need to be spilled and
reloaded around the call and the low overhead loop will end up being
reverted. This teaches our hardware loop lowering that these memory
intrinsics will be calls in certain situations.
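
As a hedged sketch of the kind of loop affected (illustrative code, not a test from the patch): a variable-sized memcpy in the loop body will typically be lowered to a library call, so forming a low overhead loop around it gains nothing.

```cpp
#include <cstring>

// If the memcpy below is emitted as a call, LR has to be spilled and reloaded
// around it, so a low overhead loop formed for this loop would be reverted.
void copyRows(char *dst, const char *src, unsigned rows, unsigned rowBytes) {
  for (unsigned i = 0; i < rows; ++i)
    std::memcpy(dst + i * rowBytes, src + i * rowBytes, rowBytes);
}
```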

Differential Revision: https://reviews.llvm.org/D90439
2020-11-03 11:53:09 +00:00
Simon Pilgrim
63512b001b [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.
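
For reference, illustrative source-level patterns (shift amounts assumed in 1..31; the guarded zero-amount case is handled separately): a rotate is the special case of a funnel shift whose two data operands are the same value.

```cpp
#include <cstdint>

// Rotate: both halves come from the same value x (what matchRotate handled).
uint32_t rotatePattern(uint32_t x, uint32_t s) {
  return (x << s) | (x >> (32 - s));
}

// General funnel shift: the high part comes from x, the low part from y; this
// is the wider class of patterns the generalized matcher can now recognize.
uint32_t funnelPattern(uint32_t x, uint32_t y, uint32_t s) {
  return (x << s) | (y >> (32 - s));
}
```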

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Florian Hahn
3a9ee235ce [SLP] Pass VecPred argument to getCmpSelInstrCost.
Check if all compares in VL have the same predicate and pass it to
getCmpSelInstrCost, to improve cost-modeling on targets that only
support compare/select combinations for certain uniform predicates.

This leads to additional vectorization in some cases:

```
Same hash: 217 (filtered out)
Remaining: 19
Metric: SLP.NumVectorInstructions

Program                                        base    slp2    diff
 test-suite...marks/SciMark2-C/scimark2.test    11.00   26.00  136.4%
 test-suite...T2006/445.gobmk/445.gobmk.test    79.00  135.00  70.9%
 test-suite...ediabench/gsm/toast/toast.test    54.00   71.00  31.5%
 test-suite...telecomm-gsm/telecomm-gsm.test    54.00   71.00  31.5%
 test-suite...CI_Purple/SMG2000/smg2000.test   426.00  542.00  27.2%
 test-suite...ch/g721/g721encode/encode.test    30.00   24.00  -20.0%
 test-suite...000/186.crafty/186.crafty.test   116.00  138.00  19.0%
 test-suite...ications/JM/ldecod/ldecod.test   697.00  765.00   9.8%
 test-suite...6/464.h264ref/464.h264ref.test   822.00  886.00   7.8%
 test-suite...chmarks/MallocBench/gs/gs.test   154.00  162.00   5.2%
 test-suite...nsumer-lame/consumer-lame.test   621.00  651.00   4.8%
 test-suite...lications/ClamAV/clamscan.test   223.00  231.00   3.6%
 test-suite...marks/7zip/7zip-benchmark.test   680.00  695.00   2.2%
 test-suite...CFP2000/177.mesa/177.mesa.test   2121.00 2129.00  0.4%
 test-suite...:: External/Povray/povray.test   2406.00 2412.00  0.2%
 test-suite...TimberWolfMC/timberwolfmc.test   634.00  634.00   0.0%
 test-suite...CFP2006/433.milc/433.milc.test   1036.00 1036.00  0.0%
 test-suite.../Benchmarks/nbench/nbench.test   321.00  321.00   0.0%
 test-suite...ctions-flt/Reductions-flt.test    NaN      5.00   nan%
```

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D90124
2020-11-03 10:16:43 +00:00
Nicholas Guy
8b3116b36c [AArch64] Redundant masks in downcast long multiply
Adds patterns to catch masks preceding a long multiply, and to generate
a single umull/smull instruction instead.
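
An illustrative source-level pattern (not a test from the patch): when both operands of a 64-bit multiply are first masked down to 32 bits, the masks are redundant once the multiply is selected as a single widening multiply.

```cpp
#include <cstdint>

// Both ANDs can be dropped when this is selected as one umull (32x32 -> 64
// unsigned widening multiply); the signed variant maps to smull.
uint64_t mulLow32(uint64_t a, uint64_t b) {
  return (a & 0xffffffffu) * (b & 0xffffffffu);
}
```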

Differential revision: https://reviews.llvm.org/D89956
2020-11-03 10:12:28 +00:00
David Green
41688b499e [CostModel] Make target intrinsics cheap by default
This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.

Differential Revision: https://reviews.llvm.org/D90597
2020-11-03 09:58:28 +00:00
Petar Avramovic
cf35e6aad4 AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner
Change the match/apply functions into methods of a new target-specific combiner
helper class. Use a reference to the MachineIRBuilder from the helper instead of
constructing a new MachineIRBuilder each time a new instruction needs to be made.
This allows correct tracking of newly created instructions.

Differential Revision: https://reviews.llvm.org/D90623
2020-11-03 09:24:50 +01:00
Max Kazantsev
9c353872bf [NFC] Refactor code in IndVars, preparing for further improvement 2020-11-03 15:08:12 +07:00
Esme-Yi
cc2de8bfa9 [PowerPC] Extend folding RLWINM + RLWINM to post-RA.
Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.

Reviewed By: shchenz, steven.zhang

Differential Revision: https://reviews.llvm.org/D89855
2020-11-03 07:44:11 +00:00
Max Kazantsev
6a18756c16 [NFC] Split lambda into 2 parts for further reuse 2020-11-03 14:13:55 +07:00
Craig Topper
3eb7636233 [RISCV] Remove isel patterns for fshl/fshr with same inputs. NFC
These were being selected to ROL/ROR, but DAG combine should
canonicalize fshl/fshr with the same inputs to rotl/rotr, which we
also have patterns for.
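
The underlying identity, shown with plain shifts (shift amount assumed in 1..63): a funnel shift whose two data operands are identical is just a rotate, which is why the separate patterns are unnecessary.

```cpp
#include <cstdint>

// fshr(x, x, s) == rotr(x, s); likewise fshl(x, x, s) == rotl(x, s).
uint64_t rotrEquiv(uint64_t x, unsigned s) {
  return (x >> s) | (x << (64 - s));
}
```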
2020-11-02 23:12:18 -08:00
Max Kazantsev
b0c85eefb0 [IndVars] Use knowledge about execution on last iteration when removing checks
If we know that some check will not be executed on the last iteration, we can use this
fact to eliminate the check.

Differential Revision: https://reviews.llvm.org/D88210
Reviewed By: ebrevnov
2020-11-03 13:38:58 +07:00
Esme-Yi
04514155d5 [NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo.
Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example D88274. If the RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too.
This is an NFC patch that moves the folding patterns to PPCInstrInfo; the follow-up work will call it in pre-emit-peephole and expand the patterns to handle more cases.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D89846
2020-11-03 06:28:56 +00:00
Jessica Clarke
d022c319e8 [CodeGen] Fix regression from D83655
Arm EHABI has a null LSDASection as it does its own thing, so we should
continue to return null in that case rather than try and cast it.
2020-11-03 03:57:46 +00:00
Jessica Clarke
7b1a5513ca [RISCV] Only return DestSourcePair from isCopyInstrImpl for registers
ADDI often has a frameindex in operand 1, but consumers of this
interface, such as MachineSink, tend to call getReg() on the Destination
and Source operands, leading to the following crash when building
FreeBSD after this implementation was added in 8cf6778d30:

```
clang: llvm/include/llvm/CodeGen/MachineOperand.h:359: llvm::Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
 #0 0x00007f4286f9b4d0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) llvm/lib/Support/Unix/Signals.inc:563:0
 #1 0x00007f4286f9b587 PrintStackTraceSignalHandler(void*) llvm/lib/Support/Unix/Signals.inc:630:0
 #2 0x00007f4286f9926b llvm::sys::RunSignalHandlers() llvm/lib/Support/Signals.cpp:71:0
 #3 0x00007f4286f9ae52 SignalHandler(int) llvm/lib/Support/Unix/Signals.inc:405:0
 #4 0x00007f428646ffd0 (/lib/x86_64-linux-gnu/libc.so.6+0x3efd0)
 #5 0x00007f428646ff47 raise /build/glibc-2ORdQG/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 #6 0x00007f42864718b1 abort /build/glibc-2ORdQG/glibc-2.27/stdlib/abort.c:81:0
 #7 0x00007f428646142a __assert_fail_base /build/glibc-2ORdQG/glibc-2.27/assert/assert.c:89:0
 #8 0x00007f42864614a2 (/lib/x86_64-linux-gnu/libc.so.6+0x304a2)
 #9 0x00007f428d4078e2 llvm::MachineOperand::getReg() const llvm/include/llvm/CodeGen/MachineOperand.h:359:0
#10 0x00007f428d8260e7 attemptDebugCopyProp(llvm::MachineInstr&, llvm::MachineInstr&) llvm/lib/CodeGen/MachineSink.cpp:862:0
#11 0x00007f428d826442 performSink(llvm::MachineInstr&, llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::SmallVectorImpl<llvm::MachineInstr*>&) llvm/lib/CodeGen/MachineSink.cpp:918:0
#12 0x00007f428d826e27 (anonymous namespace)::MachineSinking::SinkInstruction(llvm::MachineInstr&, bool&, std::map<llvm::MachineBasicBlock*, llvm::SmallVector<llvm::MachineBasicBlock*, 4u>, std::less<llvm::MachineBasicBlock*>, std::allocator<std::pair<llvm::MachineBasicBlock* const, llvm::SmallVector<llvm::MachineBasicBlock*, 4u> > > >&) llvm/lib/CodeGen/MachineSink.cpp:1073:0
#13 0x00007f428d824a2c (anonymous namespace)::MachineSinking::ProcessBlock(llvm::MachineBasicBlock&) llvm/lib/CodeGen/MachineSink.cpp:410:0
#14 0x00007f428d824513 (anonymous namespace)::MachineSinking::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/CodeGen/MachineSink.cpp:340:0
```

Thus, check that operand 1 is also a register in the condition.

Reviewed By: arichardson, luismarques

Differential Revision: https://reviews.llvm.org/D89090
2020-11-03 03:55:47 +00:00
Qiu Chaofan
fd9800defc [PowerPC] Skip IEEE 128-bit FP type in FastISel
Vector types, quadword integers and f128 currently cannot be handled in
FastISel. We did not skip the f128 type when lowering arguments, which caused
a crash. This patch fixes it.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D90206
2020-11-03 11:17:11 +08:00
Qiu Chaofan
07283aa0c6 [PowerPC] [NFC] Rename VCMPo to VCMP_rec
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D90581
2020-11-03 11:10:59 +08:00
Alina Sbirlea
c3d75ade94 [LICM] Add assert of AST/MSSA exclusiveness.
The API `canSinkOrHoistInst` may be called by LoopSink. Add an assert to
avoid having both analyses passed in.
2020-11-02 18:04:43 -08:00
Akira Hatanaka
a14d53181f Remove unused parameter 2020-11-02 17:40:06 -08:00
Fangrui Song
c6463c4b94 [PowerPC] Parse and ignore .machine ppc64
In the wild, kexec-tools purgatory/arch/ppc64/v2wrap.S and hvcall.S
use this directive.
2020-11-02 16:49:57 -08:00
Gaurav Jain
4f8e5f73dc [NFC] Use [MC]Register in Live-ness tracking
Differential Revision: https://reviews.llvm.org/D90611
2020-11-02 15:46:13 -08:00
Jonas Devlieghere
6989bc45bd [MachO] Also recognize __swift_ast as a debug info section
Address post-commit review from Adrian.
2020-11-02 14:49:57 -08:00
Fangrui Song
4be7087bf0 [AsmPrinter] Split up .gcc_except_table
MC currently produces a monolithic .gcc_except_table section. GCC can split up .gcc_except_table:

* if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat`
* otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits`

This ensures that (a) non-prevailing copies are discarded and (b)
.gcc_except_table sections associated with discarded text sections can be discarded by a
.gcc_except_table-aware linker (GNU ld, but not gold or LLD).

This patch matches the GCC behavior. If -fno-unique-section-names is
specified, we don't append the suffix. If -ffunction-sections is additionally specified,
use `.section ...,unique`.

Note: if the clang driver communicates that the linker is LLD and we know it
is new (11.0.0 or later), we can use SHF_LINK_ORDER to avoid string table
costs, at least in the -fno-unique-section-names case. We cannot use it on GNU
ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER &
non-SHF_LINK_ORDER components in an output section
https://sourceware.org/bugzilla/show_bug.cgi?id=26256

For RISC-V -mrelax, this patch additionally fixes an assembler-linker
interaction problem: because a section is shrinkable, the length of a call-site
code range is not a constant. Relocations referencing the associated text
section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a
discarded section group member from outside the group is disallowed by the ELF
specification (PR46675):

```
// a.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int main() { return comdat(); }

// b.cc
inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; }
int foo() { return comdat(); }

clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax
ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section:
```

-fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations.

Reviewed By: jrtc27, rahmanl

Differential Revision: https://reviews.llvm.org/D83655
2020-11-02 14:36:25 -08:00
Fangrui Song
c9829bfb08 [LazyCallGraph] Build SCCs of the reference graph in order
```
// The legacy PM CGPassManager discovers SCCs this way:
for function in the source order
  tarjanSCC(function)

// While the new PM CGSCCPassManager does:
for function in the reversed source order [1]
  discover a reference graph SCC
  build call graph SCCs inside the reference graph SCC
```

In the common case (reference graph ~= call graph), the new PM order is
undesired because for `a | b | c` (3 independent functions), the new PM will
process them in the reversed order: c, b, a. If `a <-> b <-> c`, we can see
that `-print-after-all` will report the sole SCC as `scc: (c, b, a)`.

This patch corrects the iteration order. The discovered SCC order will match
the legacy PM in the common cases.

For some tests (`Transforms/Inline/cgscc-*.ll` and
`unittests/Analysis/CGSCCPassManagerTest.cpp`), the behaviors are dependent on
the SCC discovery order and there are too many check lines for the particular
order.  This patch simply reverses the function order to avoid changing too many
check lines.

Differential Revision: https://reviews.llvm.org/D90566
2020-11-02 13:22:42 -08:00
Fangrui Song
f727ae92f5 [MC] Make MCStreamer aware of AsmParser's StartTokLoc
An SMLoc allows MCStreamer to report location-aware diagnostics, which
were previously done by adding an SMLoc to various methods (e.g. emit*) in an ad-hoc way.

Since the file:line is most important, the column is less important, and
the start token location suffices in many cases, this patch reverts
b7e7131af2dd7bdb03fa42a3bc1b4bc72ab95ce1.

```
// old
symbol-binding-changed.s:6:8: error: local changed binding to STB_GLOBAL
.globl local
       ^
// new
symbol-binding-changed.s:6:1: error: local changed binding to STB_GLOBAL
.globl local
^
```

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D90511
2020-11-02 12:32:07 -08:00
Krzysztof Parzyszek
13b75ea80b [Hexagon] Move isTypeForHVX from Hexagon TTI to HexagonSubtarget, NFC
It's useful outside of Hexagon TTI, and with how TTI is implemented,
it is not accessible outside of TTI.
2020-11-02 14:00:45 -06:00
Mircea Trofin
158d222d30 [NFC][regalloc] Use MCRegister appropriately
Differential Revision: https://reviews.llvm.org/D90506
2020-11-02 11:48:49 -08:00
Stanislav Mekhanoshin
d05e59d827 [AMDGPU] Improve FLAT scratch detection
We were using too broad a check for isFLATScratch(), which also
includes FLAT global.

Differential Revision: https://reviews.llvm.org/D90505
2020-11-02 11:37:33 -08:00
Ettore Tiotto
157bbdf8a4 [PartialInliner]: Handle code regions in switch stmt cases
This patch enhances computeOutliningColdRegionsInfo() to allow it to
consider regions containing a single basic block and a single
predecessor as candidates for partial inlining.

Reviewed By: fhann

Differential Revision: https://reviews.llvm.org/D89911
2020-11-02 14:32:45 -05:00
Alex Richardson
f7fe395a92 [AtomicExpand] Avoid creating an unnamed libcall
I recently modified this pass to better support CHERI-RISC-V and while
doing so I noticed that this pass was calling M->getOrInsertFunction()
with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills
the libcalls array with nullptr, so this creates an anonymous function
instead. This patch changes expandAtomicOpToLibcall to return false in
case the libcall does not exist and changes the assert() in the callees to
a report_fatal_error() instead.
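
A self-contained illustration of the failure mode and the guard (hypothetical names, not the actual AtomicExpand code): if a nullptr libcall name is handed straight to a get-or-insert-by-name API, an unnamed function is created; checking for nullptr and bailing out lets the caller fail loudly instead.

```cpp
#include <cstdio>

// Hypothetical libcall table where one entry is unsupported (nullptr).
const char *getLibcallName(int Kind) {
  static const char *Names[] = {"__atomic_load_4", nullptr};
  return Names[Kind];
}

// Returns false when no libcall exists, mirroring the described fix; the
// caller can then report a fatal error rather than asserting later.
bool expandToLibcall(int Kind) {
  const char *Name = getLibcallName(Kind);
  if (!Name)
    return false;
  std::printf("emitting call to %s\n", Name);
  return true;
}
```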

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88800
2020-11-02 17:52:37 +00:00
Craig Topper
dfd4863ad7 [RISCV] Make SelectRORIW handle the commutability of OR.
The SHL and SRL could be in the opposite order, so account for that.
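
Illustrative only (shift amount assumed in 1..31): because OR is commutative, the same rotate can appear with the SHL or the SRL as either operand, and both forms need to match.

```cpp
#include <cstdint>

// Both functions compute a left rotate of the low 32 bits; a matcher that
// only accepts (or (shl ...), (srl ...)) in that order misses the second form.
uint32_t rotFormA(uint32_t x, unsigned c) { return (x << c) | (x >> (32 - c)); }
uint32_t rotFormB(uint32_t x, unsigned c) { return (x >> (32 - c)) | (x << c); }
```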

Differential Revision: https://reviews.llvm.org/D90586
2020-11-02 09:32:54 -08:00
Paul C. Anagnostopoulos
e028635778 [TableGen] Fix a couple of minor issues regarding the paste operator.
Update the documentation to fully describe it.

Differential Revision: https://reviews.llvm.org/D90617
2020-11-02 12:21:54 -05:00
Sanjay Patel
389858bbc4 [x86] add AVX2 cost model entries for maxnum of 256-bit vectors
As noticed in D90554,
the AVX2 costs for 256-bit vectors did not include FMAXNUM entries,
so we fell back to AVX1 which assumes those ops will be split into
128-bit halves or something close to that.

Differential Revision: https://reviews.llvm.org/D90613
2020-11-02 12:20:17 -05:00
Craig Topper
d4f846d5af [RISCV] When matching RORIW, make sure the same input is given to both shifts.
The code is looking for (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1))).
We need to ensure X and Y are the same.

Differential Revision: https://reviews.llvm.org/D90580
2020-11-02 09:12:40 -08:00
Simon Pilgrim
aeaa523af1 [AggressiveInstCombine] foldGuardedRotateToFunnelShift - generalize rotation to funnel shift matcher.
Replace matchRotate with a more general matchFunnelShift - at the moment this is still just used for rotation patterns.
2020-11-02 17:09:17 +00:00
Momchil Velikov
5fd1acbb48 [ARM][MachineOutliner] Do not overestimate LR liveness in return block
The `LiveRegUnits` utility (as well as `LivePhysRegs`) considers
callee-saved registers to be alive at the point after the return
instruction in a block. In the ARM backend, the `LR` register is
classified as callee-saved, which is not really correct (from an ARM
eABI or just common sense point of view).  These two conditions cause
the `MachineOutliner` to overestimate the liveness of `LR`, which
results in unnecessary saves/restores of `LR` around calls to outlined
sequences. It also causes the `MachineVerifier` to crash in some
cases, because the save instruction reads a dead `LR`, for example
when the following program:

int h(int, int);

int f(int a, int b, int c, int d) {
  a = h(a + 1, b - 1);
  b = b + c;
  return 1 + (2 * a + b) * (c - d) / (a - b) * (c + d);
}

int g(int a, int b, int c, int d) {
  a = h(a - 1, b + 1);
  b = b + c;
  return 2 + (2 * a + b) * (c - d) / (a - b) * (c + d);
}

is compiled with `-target arm-eabi -march=armv7-m -Oz`.

This patch computes the liveness of `LR` in return blocks only, while
taking into account the few ARM instructions which read `LR` without the
register being mentioned (explicitly or implicitly) in the instruction
operands.

Differential Revision: https://reviews.llvm.org/D89189
2020-11-02 16:47:22 +00:00
Fangrui Song
6fba8f1f14 [Debugify] Port -debugify-each to NewPM
Preemptively switch 2 tests to the new PM

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90365
2020-11-02 08:16:43 -08:00
Florian Hahn
1db8566f5e Reland "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts the revert commit 408c4408facc3a79ee4ff7e9983cc972f797e176.

This version of the patch includes a fix for a crash caused by
treating ICmp/FCmp constant expressions as instructions.

Original message:

On some targets, like AArch64, vector selects can be efficiently lowered
if the vector condition is a compare with a supported predicate.

This patch adds a new argument to getCmpSelInstrCost, to indicate the
predicate of the feeding select condition. Note that it is not
sufficient to use the context instruction when querying the cost of a
vector select starting from a scalar one, because the condition of the
vector select could be composed of compares with different predicates.

This change greatly improves modeling the costs of certain
compare/select patterns on AArch64.
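
As a simple illustration of such a pattern (not from the patch): each iteration below is a compare feeding a select with one uniform predicate, so after vectorization the whole thing can be lowered cheaply (e.g. to a vector min), which the extra predicate argument lets the cost model see.

```cpp
// Per-lane: cmp-lt then select; with a uniform "less than" predicate this
// maps to an efficient vector sequence on targets like AArch64.
void selectMin(const int *a, const int *b, int *out, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = a[i] < b[i] ? a[i] : b[i];
}
```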

I am also planning on putting up patches to make use of the new argument in
SLPVectorizer & LV.
2020-11-02 15:39:29 +00:00