llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Valery Pykhtin	0d933088de	LiveInterval.h: add LiveRange::findIndexesLiveAt function - return a list of SlotIndexes the LiveRange is live at. Differential revision: https://reviews.llvm.org/D62411 llvm-svn: 363593	2019-06-17 18:23:39 +00:00
Simon Pilgrim	2900cf3d1d	[X86][SSE] Scalarize under-aligned XMM vector nt-stores (PR42026) If a XMM non-temporal store has less than natural alignment, scalarize the vector - with SSE4A we can stay on the vector and use MOVNTSD(f64), else we must move to GPRs and use MOVNTI(i32/i64). llvm-svn: 363592	2019-06-17 18:20:04 +00:00
Matt Arsenault	6999aef238	AMDGPU: Make getreg intrinsic inaccessiblememonly llvm-svn: 363591	2019-06-17 18:17:25 +00:00
Alina Sbirlea	fb04ae0293	[MemorySSA] Add all MemoryPhis before filling their values. Summary: Add all MemoryPhis in IDF before filling in their incoming values. Otherwise, a new Phi can be added that needs to become the incoming value of another Phi. Test fails the verification in verifyPrevDefInPhis. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63353 llvm-svn: 363590	2019-06-17 18:16:53 +00:00
Stanislav Mekhanoshin	34b508a4fa	[AMDGPU] gfx1010 wavefrontsize intrinsic folding Differential Revision: https://reviews.llvm.org/D63206 llvm-svn: 363588	2019-06-17 17:57:50 +00:00
Matt Arsenault	6372ca2f72	AMDGPU: Fold readlane/readfirstlane calls llvm-svn: 363587	2019-06-17 17:52:35 +00:00
Stanislav Mekhanoshin	ff3f5f72e4	[AMDGPU] Pass to propagate ABI attributes from kernels to the functions The pass works in two modes: Mode 1: Just set attributes starting from kernels. This can work at the very beginning of opt and llc pipeline, but cannot clone functions because it must be a function pass. Mode 2: Actually clone functions for new attributes. This can only work after all function passes in the opt pipeline because it has to be a module pass. Differential Revision: https://reviews.llvm.org/D63208 llvm-svn: 363586	2019-06-17 17:47:28 +00:00
Nico Weber	9848cb5275	gn build: Merge r363541 llvm-svn: 363583	2019-06-17 17:45:12 +00:00
Simon Pilgrim	7d046410c0	[X86][AVX] Split under-aligned vector nt-stores. If a YMM/ZMM non-temporal store has less than natural alignment, split the vector - either they will be satisfactorily aligned or will continue to be split until they are XMMs - at which point the legalizer will scalarize it. llvm-svn: 363582	2019-06-17 17:22:38 +00:00
Warren Ristow	8897816bcd	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the alignment of the original scalar memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type *DataType, unsigned Alignment) const; bool isLegalNTLoad(Type *DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581	2019-06-17 17:20:08 +00:00
Matt Arsenault	cb36737365	GlobalISel: Ignore callsite attributes when picking intrinsic type A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. I fixed the same bug in SelectionDAG in r287593. llvm-svn: 363580	2019-06-17 17:01:35 +00:00
Matt Arsenault	f7a171647b	GlobalISel: Verify intrinsics I keep using the wrong instruction when manually writing tests. This really needs to check the number of operands, but I don't see an easy way to do that right now. llvm-svn: 363579	2019-06-17 17:01:32 +00:00
Matt Arsenault	db38726b51	AMDGPU/GlobalISel: Account for multiple defs when finding intrinsic ID llvm-svn: 363578	2019-06-17 17:01:27 +00:00
Stanislav Mekhanoshin	3240d7297b	[AMDGPU] gfx1010 wave32 metadata Differential Revision: https://reviews.llvm.org/D63207 llvm-svn: 363577	2019-06-17 16:48:56 +00:00
Tom Stellard	fe9a142dd4	AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60640 llvm-svn: 363576	2019-06-17 16:27:43 +00:00
Francis Visoiu Mistrih	1e67de55c2	[Remarks] Extend -fsave-optimization-record to specify the format Use -fsave-optimization-record=<format> to specify a different format than the default, which is YAML. For now, only YAML is supported. llvm-svn: 363573	2019-06-17 16:06:00 +00:00
Simon Pilgrim	eb5d68a02c	[X86] combineLoad - begun making the load split code more generic. NFCI. This is currently only used for ymm->xmm splitting but we shouldn't hardcode the offsets/alignment. This is necessary for an upcoming patch to split under-aligned non-temporal vector loads. llvm-svn: 363570	2019-06-17 15:54:36 +00:00
Whitney Tsang	6f7c010bdb	PHINode: introduce setIncomingValueForBlock() function, and use it. Summary: There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue() but no function to replace incoming value for a specified BasicBlock* predecessor. Clearly, there are a lot of places that could use that functionality. Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn Reviewed By: Meinersbur, fhahn Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63338 llvm-svn: 363566	2019-06-17 14:38:56 +00:00
Simon Pilgrim	b2d83418d8	[X86][SSE] Add tests for underaligned nt loads Test both 'unaligned' (which we should just use regular unaligned loads) and 'subvector aligned' (which we should split) llvm-svn: 363565	2019-06-17 14:38:17 +00:00
Simon Pilgrim	453c703aa1	[X86][SSE] Prevent misaligned non-temporal vector load/store combines For loads, pre-SSE41 we can't perform NT loads at all, and after that we can only perform vector aligned loads, so if the alignment is less than for a xmm we'll just end up using the regular unaligned vector loads anyway. First step towards fixing PR42026 - the next step for stores will be to use SSE4A movntsd where possible and to avoid the stack spill on SSE2 targets. Differential Revision: https://reviews.llvm.org/D63246 llvm-svn: 363564	2019-06-17 14:26:10 +00:00
Matt Arsenault	1127d3fd3b	InferAddressSpaces: Fix cloning original addrspacecast If an addrspacecast needed to be inserted again, this was creating a clone of the original cast for each user. Just use the original, which also saves losing the value name. llvm-svn: 363562	2019-06-17 14:13:29 +00:00
Matt Arsenault	552bf58679	AMDGPU: Ignore subtarget for InferAddressSpaces Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. llvm-svn: 363561	2019-06-17 14:13:24 +00:00
Matt Arsenault	b7b6243629	AMDGPU: Mark exp/exp.compr as inaccessiblememonly Should also be marked writeonly, but I think that would require splitting the version with done set to a separate intrinsic Test change is only from renumbering the attribute group numbers, which for some reason the generated check lines consider. llvm-svn: 363560	2019-06-17 13:52:24 +00:00
Matt Arsenault	a6c3b008b9	AMDGPU/GlobalISel: Fix default mapping for non-register operands Tests will be in future commits when new intrinsics are handled here. llvm-svn: 363559	2019-06-17 13:52:19 +00:00
Matt Arsenault	9727aa70c2	AMDGPU: Cleanup custom PseudoSourceValue definitions Use separate enums for each kind, avoid repeating overloads, and add missing classof implementation. llvm-svn: 363558	2019-06-17 13:52:15 +00:00
Sam Parker	c2a04b68af	[CodeGen] Check for HardwareLoop Latch ExitBlock The HardwareLoops pass finds exit blocks with a scevable exit count. If the target specifies to update the loop counter in a register, through a phi, we need to ensure that the exit block is a latch so that we can insert the phi with the correct value for the incoming edge. Differential Revision: https://reviews.llvm.org/D63336 llvm-svn: 363556	2019-06-17 13:39:28 +00:00
Simon Pilgrim	525f418f90	[X86][SSE] Avoid unnecessary stack codegen in NT store codegen tests. llvm-svn: 363552	2019-06-17 12:35:26 +00:00
Nicolai Haehnle	a9fe4094dc	AsmPrinter: add doc-string for EmitLinkage Change-Id: I376fcbd58f84a2aac6aaf744bc1665c92d312b25 llvm-svn: 363550	2019-06-17 12:24:04 +00:00
Nico Weber	c377505761	gn build: Merge r363530 llvm-svn: 363549	2019-06-17 12:18:27 +00:00
Bjorn Pettersson	7f22195fc5	[LV] Deny irregular types in interleavedAccessCanBeWidened Summary: Prevent the loop vectorizer from creating loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80 that is 80 bits, but that may have an allocation size that is 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547	2019-06-17 12:02:24 +00:00
Sander de Smalen	67b35c2d65	Test forward references in IntrinsicEmitter on Neon LD(2|3|4) This patch tests the forward-referencing added in D62995 by changing some existing intrinsics to use forward referencing of overloadable parameters, rather than backward referencing. This patch changes the TableGen definition/implementation of llvm.aarch64.neon.ld2lane and llvm.aarch64.neon.ld2lane intrinsics (and similar for ld3 and ld4). This change is intended to be non-functional, since the behaviour of the intrinsics is expected to be the same. Reviewers: arsenm, dmgreen, RKSimon, greened, rnk Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D63189 llvm-svn: 363546	2019-06-17 12:01:53 +00:00
Luis Marques	9baf159f0c	[DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can prevent important optimizations, such as preventing the element indices / member offsets from being (partially) folded into load/store instruction immediates. This patch: - Makes the splits also occur in the cases where the base address and the GEP are in the same BB. - Ensures that the DAGCombiner doesn't reassociate them back again. Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544	2019-06-17 10:54:12 +00:00
Fangrui Song	31d4a25ff8	Fix clang -Wcovered-switch-default after stack-id change by D60137 llvm-svn: 363543	2019-06-17 10:20:20 +00:00
Simon Pilgrim	1751d6e555	[SelectionDAG] Fold insert_subvector(undef, extract_subvector(v, c), c) -> v in getNode This is already done in DAGCombiner::visitINSERT_SUBVECTOR, but this helps a number of shuffles across different vector widths recognise when they come from the same source. llvm-svn: 363542	2019-06-17 10:14:52 +00:00
Sam Parker	d6e9b943f9	[SCEV] Use NoWrapFlags when expanding a simple mul Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540	2019-06-17 10:05:18 +00:00
Fangrui Song	f07a96ee21	[llvm-objdump] Use %08 instead of %016 to print leading addresses for 32-bit binaries Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D63398 llvm-svn: 363539	2019-06-17 09:59:55 +00:00
Fangrui Song	921cf42d37	[lit] Delete empty lines at the end of lit.local.cfg NFC llvm-svn: 363538	2019-06-17 09:51:07 +00:00
Roman Lebedev	8c37b784cb	[NFC][Codegen] Standalone tests for icmp eq/ne (urem %x, C), 0 -> icmp eq/ne %x, 0 fold (D63390) llvm-svn: 363537	2019-06-17 09:50:50 +00:00
Fangrui Song	d851ea46cf	[ARM] Fix another -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265 llvm-svn: 363535	2019-06-17 09:29:50 +00:00
Fangrui Song	8bb3e58b0d	[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265 llvm-svn: 363534	2019-06-17 09:26:50 +00:00
Sander de Smalen	520ee96f7f	Describe stack-id as an enum This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533	2019-06-17 09:13:29 +00:00
Sam Parker	a371e87c80	[ARM] Remove ARMComputeBlockSize Forgot to remove file! llvm-svn: 363532	2019-06-17 09:13:10 +00:00
Sam Parker	0db1b676fa	[ARM] Add ARMBasicBlockInfo.cpp Forgot to add file! llvm-svn: 363531	2019-06-17 09:05:43 +00:00
Sam Parker	7806cead69	[ARM] Extract some code from ARMConstantIslandPass Create the ARMBasicBlockUtils class for tracking and querying basic blocks sizes so we can use them when generating low-overhead loops. Differential Revision: https://reviews.llvm.org/D63265 llvm-svn: 363530	2019-06-17 08:49:09 +00:00
Hans Wennborg	6b843448d0	Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" Third time's the charm. This was reverted in r363220 due to being suspected of an internal benchmark regression and a test failure, none of which turned out to be caused by this. llvm-svn: 363529	2019-06-17 07:47:28 +00:00
Yevgeny Rouban	0b9f7752c8	[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata if unreachable switch cases are removed. This patch fixes this bug by making use of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122). A new test is created. Differential Revision: https://reviews.llvm.org/D62186 llvm-svn: 363527	2019-06-17 05:55:12 +00:00
Justin Hibbits	2bad53a983	PowerPC: Optimize SPE double parameter calling setup Summary: SPE passes doubles the same as soft-float, in register pairs as i32 types. This is all handled by the target-independent layer. However, this is not optimal when splitting or reforming the doubles, as it pushes to the stack and loads from, on either side. For instance, to pass a double argument to a function, assuming the double value is in r5, the sequence currently looks like this: evstdd 5, X(1) lwz 3, X(1) lwz 4, X+4(1) Likewise, to form a double into r5 from args in r3 and r4: stw 3, X(1) stw 4, X+4(1) evldd 5, X(1) This optimizes the fence to use SPE instructions. Now, to pass a double to a function: mr 4, 5 evmergehi 3, 5, 5 And to form a double into r5 from args in r3 and r4: evmergelo 5, 3, 4 This is comparable to the way that gcc generates the double splits. This also fixes a bug with expanding builtins to libcalls, where the LowerCallTo() code path was generating intermediate illegal type nodes. Reviewers: nemanjai, hfinkel, joerg Subscribers: kbarton, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54583 llvm-svn: 363526	2019-06-17 03:15:23 +00:00
Seiya Nuta	942fd33165	[yaml2obj][MachO] Don't fill dummy data for virtual sections Summary: Currently, MachOWriter::writeSectionData writes dummy data (0xdeadbeef) to fill section data areas in the file even if the section is a virtual one. Since virtual sections don't occupy any space in the file, writing dummy data could result in the "OS.tell() - fileStart <= Sec.offset" assertion failure. This patch fixes the bug by simply not writing any dummy data for virtual sections. Reviewers: beanz, jhenderson, rupprecht, alexshap Reviewed By: alexshap Subscribers: compnerd, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62991 llvm-svn: 363525	2019-06-17 02:07:20 +00:00
Seiya Nuta	d44769e77f	[llvm-objcopy] Add elf32-sparc and elf32-sparcel target Summary: The "sparc"/"sparcel" architectures appear in ArchMap (used by -B option) but not in OutputFormatMap (used by -I/-O option). Add their targets into OutputFormatMap for consistency. Note that AFAIK there're no targets for 32-bit little-endian SPARC ("elf32-sparcel") in GNU binutils. Reviewers: espindola, alexshap, rupprecht, jhenderson, compnerd, jakehehrlich Reviewed By: jhenderson, compnerd, jakehehrlich Subscribers: jyknight, emaste, arichardson, fedor.sergeev, jakehehrlich, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63238 llvm-svn: 363524	2019-06-17 02:03:45 +00:00
Craig Topper	8d6f3601df	[X86] Add TB_NO_REVERSE to some folding table entries where the register form uses the REX prefix, but the memory form does not. It would not be safe to unfold the memory form into the register form without checking that we are compiling for 64-bit mode. This probably isn't a real functional issue since we are unlikely to unfold any of these instructions since they don't have any tied registers, aren't commutable, and don't have any inputs other than the address. llvm-svn: 363523	2019-06-16 22:33:09 +00:00
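
The llvm-objdump change in r363539 (print leading addresses with %08 instead of %016 for 32-bit binaries) can be illustrated with a minimal sketch. The helper name `formatLeadingAddress` and its `Is64Bit` parameter are hypothetical and are not taken from the llvm-objdump sources; only the two format widths come from the commit message.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not an llvm-objdump function): format a leading
// address with 8 hex digits for 32-bit binaries and 16 for 64-bit ones.
std::string formatLeadingAddress(uint64_t Address, bool Is64Bit) {
  char Buffer[17]; // up to 16 hex digits plus the terminating NUL
  std::snprintf(Buffer, sizeof(Buffer),
                Is64Bit ? "%016" PRIx64 : "%08" PRIx64, Address);
  return Buffer;
}
```

Calling `formatLeadingAddress(0x1000, false)` pads to eight digits, while passing `true` pads to sixteen; the width is a minimum, so a larger address is never truncated.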

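The under-aligned non-temporal store commits above (r363582 and r363592) describe a two-stage strategy: YMM/ZMM stores are split in half until the pieces are either naturally aligned or XMM-sized, and an under-aligned XMM store is then scalarized. The following is a toy model of that decision only, not LLVM code. The function name `splitNTStore` is hypothetical, and the 64-bit scalar piece size is an assumption (the commit message mentions MOVNTSD for f64 and MOVNTI for i32/i64).

```cpp
#include <vector>

// Toy model (not LLVM code): returns the widths, in bits, of the stores
// that would be emitted for a non-temporal vector store of WidthBits
// whose address is only known to be aligned to AlignBytes.
std::vector<unsigned> splitNTStore(unsigned WidthBits, unsigned AlignBytes) {
  const unsigned XMMBits = 128;
  // Naturally aligned: keep the store as a single vector store.
  if (AlignBytes * 8 >= WidthBits)
    return {WidthBits};
  // Wider than XMM and under-aligned: split into two halves and retry.
  // With a power-of-two alignment below the full width, both halves keep
  // the same known alignment.
  if (WidthBits > XMMBits) {
    std::vector<unsigned> Result = splitNTStore(WidthBits / 2, AlignBytes);
    std::vector<unsigned> Upper = splitNTStore(WidthBits / 2, AlignBytes);
    Result.insert(Result.end(), Upper.begin(), Upper.end());
    return Result;
  }
  // Under-aligned XMM: scalarize. Assumption: 64-bit scalar pieces.
  return {64, 64};
}
```

For example, a 256-bit store with 16-byte alignment becomes two 128-bit stores, while the same store with 8-byte alignment ends up as four 64-bit scalar stores.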

180382 Commits