llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Scott Linder	ec09d11aac	[AMDGPU] Consider XOR in waterfall loop as a terminator Ensure the XOR in the waterfall loop for indirect addressing is considered a terminator. Differential Revision: https://reviews.llvm.org/D57703 llvm-svn: 353207	2019-02-05 19:50:32 +00:00
Matt Arsenault	240cdafd6c	AMDGPU: Fix assert on trunc from bitcast of build_vector The v2i64 argument is lowered to a bitcast of v4i32 build_vector. This would then attempt to use the i32-element as the source of the vector truncate. This really would need to collect 2 elements from the build_vector to produce the intended truncate. llvm-svn: 353202	2019-02-05 19:23:57 +00:00
Simon Pilgrim	2fce0141f6	[X86][SSE] Disable ZERO_EXTEND shuffle combining rL352997 enabled ZERO_EXTEND from non-shuffle-able value types. I've disabled it for now to fix a regression identified by @asbirlea until I can fix this properly. llvm-svn: 353198	2019-02-05 19:15:48 +00:00
Anton Korobeynikov	287ccc2dff	Enable integrated assembler on MSP430 by default. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56787 llvm-svn: 353192	2019-02-05 18:01:45 +00:00
Oliver Stannard	a390364af8	[AArch64][Outliner] Don't outline BTI instructions We can't outline BTI instructions, because they need to be the very first instruction executed after an indirect call or branch. If we outline them, then an indirect call might go to the branch to the outlined function, which will fault. Differential revision: https://reviews.llvm.org/D57753 llvm-svn: 353190	2019-02-05 17:21:57 +00:00
Simon Pilgrim	1f1c4b7b3e	[X86][AVX] Attempt to combine shuffles to subvector broadcast load llvm-svn: 353189	2019-02-05 17:02:49 +00:00
Matt Arsenault	eaac595e10	AArch64/GlobalISel: Don't clamp from 2 to 2 This is equivalent to clampMaxNumElements, but saves a check. llvm-svn: 353188	2019-02-05 16:57:18 +00:00
Simon Pilgrim	e83a6e5ac6	[X86][SSE] Add SimplifyDemandedVectorElts support for X86ISD::BLENDV llvm-svn: 353165	2019-02-05 12:27:29 +00:00
Simon Pilgrim	9819ff8761	[X86][AVX] Attempt to share broadcasts of different widths (PR39454) If we have broadcasts of different vector widths, keep the longest vector width and extract subvectors for the shorter vectors (which should be free). Differential Revision: https://reviews.llvm.org/D57663 llvm-svn: 353154	2019-02-05 10:58:43 +00:00
Florian Hahn	ad99b95580	[CGP] Add support for sinking operands to their users, if they are free. This patch improves code generation for some AArch64 ACLE intrinsics. It adds support to CGP to duplicate and sink operands to their user, if they can be folded into a target instruction, like zexts and sub into usubl. It adds a TargetLowering hook shouldSinkOperands, which looks at the operands of instructions to see if sinking is profitable. I decided to add a new target hook, as for the sinking to be profitable, at least on AArch64, we have to look at multiple operands of an instruction, instead of looking at the users of a zext for example. The sinking is done in CGP, because it works around an instruction selection limitation. If instruction selection is not limited to a single basic block, this patch should not be needed any longer. Alternatively this could be done in the LoopSink pass, which tries to undo LICM for instructions in blocks that are not executed frequently. Note that we do not force the operands to sink to have a single user, because we duplicate them before sinking. Therefore this is only desirable if they really can be done for free. Additionally we could consider the impact on live ranges later on. This should fix https://bugs.llvm.org/show_bug.cgi?id=40025. As for performance, we have internal code that uses intrinsics and can be sped up by 10% by this change. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D57377 llvm-svn: 353152	2019-02-05 10:27:40 +00:00
Diana Picus	d29b1fa6a3	[ARM GlobalISel] Support G_GEP for Thumb2 Same as ARM, but use a different opcode in the instruction selection. llvm-svn: 353151	2019-02-05 10:21:37 +00:00
Craig Topper	f0b48521da	[X86] Connect the default fpsr and dirflag clobbers in inline assembly to the registers we have defined for them. Summary: We don't currently map these constraints to physical register numbers so they don't make it to the MachineIR representation of inline assembly. This could have problems for proper dependency tracking in the machine schedulers though I don't have a test case that shows that. Reviewers: rnk Reviewed By: rnk Subscribers: eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57641 llvm-svn: 353141	2019-02-05 06:13:06 +00:00
Heejin Ahn	0144d76ea6	[WebAssembly] Fix indentation after adding IsCanonical property (NFC) llvm-svn: 353132	2019-02-05 01:59:49 +00:00
Wouter van Oortmerssen	2e1ee891b8	[WebAssembly] Make disassembler always emit most canonical name. Summary: There are a few instructions that all map to the same opcode, so when disassembling, we have to pick one. That was just the first one before (the except_ref variant in the case of "call"), now it is the one marked as IsCanonical in tablegen, or failing that, the shortest name (which is typically the "canonical" one). Also introduced a canonical "end" instruction for this purpose. Reviewers: dschuff, tlively Subscribers: sbc100, jgravelle-google, aheejin, llvm-commits, sunfish Tags: #llvm Differential Revision: https://reviews.llvm.org/D57713 llvm-svn: 353131	2019-02-05 01:19:45 +00:00
Thomas Lively	cfcee8e8e8	[WebAssembly] memory.copy Summary: Depends on D57495. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D57498 llvm-svn: 353127	2019-02-05 00:49:55 +00:00
Matt Arsenault	8931a222e0	AMDGPU: Don't rematerialize mov with implicit operands This was pulling the mov used for register indexing on gfx9 out of the loop. llvm-svn: 353101	2019-02-04 22:26:21 +00:00
Craig Topper	efe45f4fe0	[CodeGen][ARC][SystemZ][WebAssembly] Use MachineInstr::isInlineAsm in more places instead of just comparing opcode. NFCI I'm looking at adding a second INLINEASM opcode for better modeling asm-goto as a terminator. Using the existing predicate will reduce the number of places that will need to use the new opcode. llvm-svn: 353095	2019-02-04 21:24:13 +00:00
Scott Linder	3100fdcbd0	[AMDGPU] Support emitting GOT relocations for function calls Differential Revision: https://reviews.llvm.org/D57416 llvm-svn: 353083	2019-02-04 20:00:07 +00:00
Heejin Ahn	f857e44c40	[WebAssembly] clang-tidy (NFC) Summary: This patch fixes clang-tidy warnings on wasm-only files. The list of checks used is: `-,clang-diagnostic-,llvm-,misc-,-misc-unused-parameters,readability-identifier-naming,modernize-` (LLVM's default .clang-tidy list is the same except it does not have `modernize-`. But I've seen in multiple CLs in LLVM the modernize style was recommended and code was fixed based on the style, so I added it as well.) The common fixes are: - Variable names start with an uppercase letter - Function names start with a lowercase letter - Use `auto` when you use casts so the type is evident - Use inline initialization for class member variables - Use `= default` for empty constructors / destructors - Use `using` in place of `typedef` Reviewers: sbc100, tlively, aardappel Subscribers: dschuff, sunfish, jgravelle-google, yurydelendik, kripken, MatzeB, mgorny, rupprecht, llvm-commits Differential Revision: https://reviews.llvm.org/D57500 llvm-svn: 353075	2019-02-04 19:13:39 +00:00
Roman Lebedev	9f046e58b4	[X86] X86DAGToDAGISel::matchBitExtract(): prepare 'control' in 32 bits Summary: Noticed while looking at D56052. ``` // The 'control' of BEXTR has the pattern of: // [15...8 bit][ 7...0 bit] location // [ bit count][ shift] name // I.e. 0b000000011'00000001 means (x >> 0b1) & 0b11 ``` I.e. we do not care about any of the bits aside from the low 16 bits. So there is no point in doing the `shl`,`or` in 64 bits, let's just do everything in 32 bits, and anyext if needed. We could do that in 16 even, but we intentionally don't zext to i16 (longer encoding IIRC), so I'm guessing the same applies here. Reviewers: craig.topper, andreadb, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D56715 llvm-svn: 353073	2019-02-04 19:04:26 +00:00
Craig Topper	2613e553a6	[X86] Add ST0 as an implicit def/use of x87 load/store instructions during FP stackifying. These instructions implicitly operate on ST0, but we don't currently add that information to the MachineInstr. We don't add it to the tablegen definitions either. For the most part this doesn't cause any problems because the stackifying occurs after register allocation. All the instructions are marked as having side effects so the postRA scheduler won't reorder them amongst themselves. But nothing stops inline assembly using X87 instructions from being reordered around other x87 instructions if that inline assembly wasn't marked volatile. The two test cases I've identified so far in PR40539 involve loads and stores used to set up the inline assembly or capture the results of the inline assembly ending up in the wrong order. This patch adds implicit ST0 uses/defs to the load/store instructions to prevent this from happening. I plan to fix all of the FP instructions, but the binops are a bit trickier to get right. So I've chosen fixing the known test cases as a good first step. I think we also need to update the tablegen descriptions so MS inline assembly infers the right clobbers, but I haven't checked that yet. Differential Revision: https://reviews.llvm.org/D57644 llvm-svn: 353070	2019-02-04 18:43:55 +00:00
Wouter van Oortmerssen	d52bc029cd	[WebAssembly] Make segment/size/type directives optional in asm Summary: These were "boilerplate" that repeated information already present in .functype and end_function, that needed to be repeated to please the particular way our object writing works, and missing them would generate errors. Instead, we generate the information for these automatically so users can concern themselves with writing more canonical wasm functions that always work as expected. Reviewers: dschuff, sbc100 Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57546 llvm-svn: 353067	2019-02-04 18:03:11 +00:00
Sam Clegg	d8d06a1305	[WebAssembly] Rename relocations from R_WEBASSEMBLY_ to R_WASM_ See https://github.com/WebAssembly/tool-conventions/pull/95. This is less typing and IMHO more readable, and it also fits with our naming around the binary format which tends to use the short name. e.g. include/llvm/BinaryFormat/Wasm.h tools/llvm-objdump/WasmDump.cpp etc.. Differential Revision: https://reviews.llvm.org/D57611 llvm-svn: 353062	2019-02-04 17:28:46 +00:00
Craig Topper	e1dccb4d49	[X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two arguments where one is %st. All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read. This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior. llvm-svn: 353061	2019-02-04 17:28:18 +00:00
Simon Pilgrim	08d6482cba	[X86][SSE] SimplifyDemandedBitsForTargetNode - PCMPGT(0,X) sign mask For PCMPGT(0, X) patterns where we only demand the sign bit (e.g. BLENDV or MOVMSK) then we can use X directly. Differential Revision: https://reviews.llvm.org/D57667 llvm-svn: 353051	2019-02-04 15:43:36 +00:00
Matt Arsenault	f2659e02d9	AMDGPU/GlobalISel: Legalize select for v4s16 Also add some more select tests to help show future legalization changes. llvm-svn: 353045	2019-02-04 14:04:52 +00:00
Andrea Di Biagio	2820258583	[AsmPrinter] Remove hidden flag -print-schedule. This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043	2019-02-04 12:51:26 +00:00
Simon Pilgrim	793452a500	Use auto for dyn_cast case to save a line. NFCI. llvm-svn: 353041	2019-02-04 12:32:39 +00:00
David Green	43a3f10458	[ARM] Mark 255 and 65535 as cheap for Thumb1 "And" This prevents Constant Hoisting from pulling the constant out of the block, allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap by the default getIntImmCost, but I've added it for clarity. Differential Revision: https://reviews.llvm.org/D57671 llvm-svn: 353040	2019-02-04 11:58:48 +00:00
Craig Topper	32f3ee2f37	Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7." We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects. Original commit message: This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler. Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler. Differential Revision: https://reviews.llvm.org/D57298 llvm-svn: 353016	2019-02-04 04:44:20 +00:00
Craig Topper	c4b858bb71	[X86] Print %st(0) as %st when it's implicit to the instruction. Continue printing it as %st(0) when it's encoded in the instruction. This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior. llvm-svn: 353015	2019-02-04 04:15:10 +00:00
Craig Topper	08dc72ab87	Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly" Looking into gcc and objdump behavior more, this was overly aggressive. If the register is encoded in the instruction we should print %st(0); if it's implicit we should print %st. I'll be making a more directed change in a future patch. llvm-svn: 353013	2019-02-04 04:15:02 +00:00
Simon Pilgrim	47d4cdc4bc	[X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sources getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type. llvm-svn: 352999	2019-02-03 16:51:33 +00:00
Simon Pilgrim	84ba9041e1	[X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sources llvm-svn: 352997	2019-02-03 16:10:18 +00:00
Simon Pilgrim	f3f0f8cb13	[X86][AVX] More aggressively simplify BROADCAST source operand Aim to use scalar source or lowest 128-bit vector directly. We're still missing some VZMOVL_LOAD combines. llvm-svn: 352994	2019-02-03 14:39:41 +00:00
Craig Topper	91f5c96339	[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found when I attempted to make emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name. This also matches what objdump disassembly prints. It's also what is printed by gcc -S. Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri Reviewed By: rnk Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D57621 llvm-svn: 352985	2019-02-03 07:53:39 +00:00
Craig Topper	2a8b04f2e2	[X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a constant 1 to encourage INC formation. Summary: Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions. I believe this catches the cases that D57547 got. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57637 llvm-svn: 352984	2019-02-03 07:25:06 +00:00
Fangrui Song	9b76ec32a6	[AMDGPU] Fix -Wunused-variable after rL352978 llvm-svn: 352982	2019-02-03 03:51:52 +00:00
Matt Arsenault	eb50f27657	GlobalISel: Implement widenScalar for G_UNMERGE_VALUES For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979	2019-02-03 00:07:33 +00:00
Matt Arsenault	8ce8c26480	GlobalISel: Implement widenScalar for G_EXTRACT vector sources Handle the basic element extract case. llvm-svn: 352978	2019-02-02 23:56:00 +00:00
Matt Arsenault	827dc2a1cf	AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal This avoids breaking a test in a future commit. llvm-svn: 352977	2019-02-02 23:39:13 +00:00
Matt Arsenault	728a67200b	AMDGPU/GlobalISel: Legalize icmp for pointer types llvm-svn: 352976	2019-02-02 23:35:15 +00:00
Matt Arsenault	e0f071d018	AMDGPU/GlobalISel: Legalize constant for pointer types llvm-svn: 352975	2019-02-02 23:33:49 +00:00
Matt Arsenault	ee43afbe0d	AMDGPU/GlobalISel: Legalize select for pointer types llvm-svn: 352974	2019-02-02 23:31:50 +00:00
Matt Arsenault	8cef0be33e	GlobalISel: Legalization for inttoptr/ptrtoint llvm-svn: 352973	2019-02-02 23:29:55 +00:00
Simon Pilgrim	7fee27762c	[X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles. This exposes a couple of minor issues that will be fixed shortly: (1) Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up; (2) combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts). llvm-svn: 352963	2019-02-02 18:08:04 +00:00
Simon Pilgrim	6f967961aa	[SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI. We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961	2019-02-02 17:35:06 +00:00
Yonghong Song	fb5cbe921f	[BPF] [BTF] Process FileName with absolute path correctly In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352952	2019-02-02 05:54:59 +00:00
Yonghong Song	0ba3435a2c	Revert "[BPF] [BTF] Process FileName with absolute path correctly" This reverts commit r352939. Some tests failed. Revert to unblock others. llvm-svn: 352941	2019-02-01 23:49:52 +00:00
Mandeep Singh Grang	553513010b	[AArch64] Fix unused variable [NFC] llvm-svn: 352940	2019-02-01 23:42:34 +00:00

50766 Commits