llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Matt Arsenault	9c6f0e54a5	AMDGPU: Legalize more bitcasts llvm-svn: 351700	2019-01-20 19:45:18 +00:00
Matt Arsenault	481e85cce0	AMDGPU/GlobalISel: Really legalize exts from i1 There is a combine that was hiding these tests not actually testing what they should be, although they were producing the expected end result. llvm-svn: 351698	2019-01-20 19:28:20 +00:00
Simon Pilgrim	0b80866b7e	[X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp. Noticed while cleaning up vector integer comparison costs for PR40376. llvm-svn: 351697	2019-01-20 19:27:40 +00:00
Matt Arsenault	12837ca780	GlobalISel: Implement widenScalar for basic FP ops llvm-svn: 351696	2019-01-20 19:10:31 +00:00
Matt Arsenault	4a10662415	AMDGPU/GlobalISel: Legalize f32->f16 fptrunc llvm-svn: 351695	2019-01-20 19:10:26 +00:00
Matt Arsenault	3d37d26708	AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values This was crashing in the predicate function assuming the value is a vector. Copy more of what AArch64 uses. This probably needs more refinement later, but I don't exactly understand what it means in some cases, particularly since any legalization for these seems to be missing. llvm-svn: 351693	2019-01-20 18:40:36 +00:00
Matt Arsenault	af0f1330ee	AMDGPU/GlobalISel: Regbank select for fpext llvm-svn: 351692	2019-01-20 18:35:41 +00:00
Matt Arsenault	dc75546ec1	AMDGPU/GlobalISel: Cleanup legality for extensions llvm-svn: 351691	2019-01-20 18:34:24 +00:00
Simon Pilgrim	04482873c0	[CostModel][X86] Add explicit vector select costs Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select. Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason). The increased pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests as well to the existing SSE2 tests. llvm-svn: 351685	2019-01-20 13:55:01 +00:00
Simon Pilgrim	b7bbc260af	[CostModel][X86] Add explicit fcmp costs for pre-SSE42 targets Typical throughputs: cmpss/cmpps = 1cy and cmpsd/cmppd = 2cy before the Core2 era llvm-svn: 351684	2019-01-20 13:21:43 +00:00
Simon Pilgrim	41ffe33de4	[TTI][X86] Reordered getCmpSelInstrCost cost tables in descending ISA order. NFCI. Minor tidy-up to make it clearer what's going on before adding additional costs. llvm-svn: 351683	2019-01-20 12:28:13 +00:00
Dylan McKay	4c1ab02ba2	[AVR] Replace two references to ARM's 't2_so_imm' type comments These were originally introduced in a copy-paste committed in r351526. The references to 't2_so_imm' have been updated to 'imm_com8' so the comments are now accurate. Thanks to Eli Friedman for noticing this. llvm-svn: 351674	2019-01-20 03:45:29 +00:00
Dylan McKay	f2b2682fd1	[AVR] Fix codegen bug in 16-bit loads Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to instructions of this pattern: ld $GPR8, [PTR:XYZ]+ ld $GPR8, [PTR]+1 This has a problem; the [PTR] is incremented in-place once, but never decremented. Future uses of the same pointer will use the now clobbered value, leading to the pointer being incorrect by an offset of one. This patch modifies the expansion code of the LDWRdPtr pseudo instruction so that the pointer variable is not silently clobbered in future uses in the same live range. Bug first reported by Keshav Kini. Patch by Kaushik Phatak. llvm-svn: 351673	2019-01-20 03:41:08 +00:00
Dylan McKay	3d8e01aa8e	Revert "[AVR] Fix codegen bug in 16-bit loads" This reverts commit r351544. In that commit, I had mistakenly misattributed the issue submitter as the patch author, Kaushik Phatak. The patch will be recommitted immediately with the correct attribution. llvm-svn: 351672	2019-01-20 03:41:00 +00:00
Craig Topper	59062585b0	[X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps cvtuqq2ps nodes that produce less than 128-bits of results. These nodes zero the upper half of the result and can't be represented with vselect. llvm-svn: 351666	2019-01-19 21:26:20 +00:00
Chandler Carruth	43ee626c3c	Update more file headers across all of the LLVM projects in the monorepo to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648	2019-01-19 10:56:40 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Reid Kleckner	fa6cb76678	[X86] Deduplicate static calling convention helpers for code size, NFC Summary: Right now we include ${TGT}GenCallingConv.inc once per each instruction selection method implemented by ${TGT}: - ${TGT}ISelLowering.cpp - ${TGT}CallLowering.cpp - ${TGT}FastISel.cpp Instead, add a mechanism to tablegen for marking a particular convention as "External", which causes tablegen to emit into the ::llvm namespace, instead of as a static helper. This allows us to provide a header to forward declare it, so we can simply call the function from all the places it is referenced. Typically the calling convention analyzer is called indirectly, so it doesn't benefit from inlining. This saves a bit of final binary size, but mostly just saves object file size: before after diff artifact 12852K 12492K -360K X86ISelLowering.cpp.obj 4640K 4280K -360K X86FastISel.cpp.obj 1704K 2092K +388K X86CallingConv.cpp.obj 52448K 52336K -112K llc.exe I didn't collect before numbers for X86CallLowering.cpp.obj, which is for GlobalISel, but we should save 360K there as well. This patch applies the strategy to the X86 backend, but there is no reason it couldn't be applied to the other backends that implement multiple ISel strategies, like AArch64. Reviewers: craig.topper, hfinkel, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56883 llvm-svn: 351616	2019-01-19 00:33:02 +00:00
Matt Arsenault	89584ac207	AMDGPU/GlobalISel: Legalize more types for select llvm-svn: 351599	2019-01-18 21:42:55 +00:00
Matt Arsenault	3e711f6542	AMDGPU/GlobalISel: Legalize illegal g_constant llvm-svn: 351596	2019-01-18 21:33:50 +00:00
Matt Arsenault	3af6b4cc9d	AMDGPU: Remove llvm.SI.load.const It's taken 3 years, but now all of the old AMDGPU and SI intrinsics are finally gone llvm-svn: 351586	2019-01-18 20:27:02 +00:00
Craig Topper	749ca38372	[X86] Lower avx512f scatter intrinsics to X86MaskedScatterSDNode instead of going directly to MachineSDNode. This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field. llvm-svn: 351583	2019-01-18 20:14:46 +00:00
Craig Topper	5c2f5a0877	[X86] Lower avx2/avx512f gather intrinsics to X86MaskedGatherSDNode instead of going directly to MachineSDNode.: This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field. Differential Revision: https://reviews.llvm.org/D56827 llvm-svn: 351570	2019-01-18 18:22:26 +00:00
Neil Henning	2716e86e57	[AMDGPU] Add some missing always-uniform values. This commit adds some missing intrinsics into the isAlwaysUniform list for the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D56845 llvm-svn: 351562	2019-01-18 16:39:27 +00:00
Sanjay Patel	fb170f42be	[x86] simplify code for SDValue.getOperand(); NFC llvm-svn: 351557	2019-01-18 15:55:21 +00:00
Dmitry Preobrazhensky	97d150850a	[AMDGPU][MC][GFX8+][DISASSEMBLER] Corrected 1/2pi value for 64-bit operands See bug 39332: https://bugs.llvm.org/show_bug.cgi?id=39332 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D56794 llvm-svn: 351555	2019-01-18 15:17:17 +00:00
Dmitry Preobrazhensky	13885f924c	[AMDGPU][MC] Disabled use of 2 different literals with SOP2/SOPC instructions See bug 39319: https://bugs.llvm.org/show_bug.cgi?id=39319 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D56847 llvm-svn: 351549	2019-01-18 13:57:43 +00:00
Dylan McKay	1eb0d37da4	[AVR] Fix codegen bug in 16-bit loads Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to instructions of this pattern: ld $GPR8, [PTR:XYZ]+ ld $GPR8, [PTR]+1 This has a problem; the [PTR] is incremented in-place once, but never decremented. Future uses of the same pointer will use the now clobbered value, leading to the pointer being incorrect by an offset of one. This patch modifies the expansion code of the LDWRdPtr pseudo instruction so that the pointer variable is not silently clobbered in future uses in the same live range. Patch by Keshav Kini. llvm-svn: 351544	2019-01-18 11:27:38 +00:00
Dylan McKay	8f21a8ed4d	[AVR] Rewrite the CBRRdK instruction as an alias of ANDIRdK The CBR instruction is just an ANDI instruction with the immediate complemented. Because of this, prior to this change TableGen would warn due to a decoding conflict. This commit fixes the existing compilation warning: =============== [423/492] Building AVRGenDisassemblerTables.inc... Decoding Conflict: 0111............ 01.............. ................ ANDIRdK 0111____________ CBRRdK 0111____________ ================ After this commit, there are no more decoding conflicts in the AVR backend's instruction definitions. Thanks to Eli F for pointing me toward `t2_so_imm_not` as an example of how to perform a complement in an instruction alias. Fixes BugZilla PR38802. llvm-svn: 351526	2019-01-18 07:31:34 +00:00
Dylan McKay	e48d05bbd3	[AVR] Expand 8/16-bit multiplication to libcalls on MCUs that don't have hardware MUL This change modifies the LLVM ISel lowering settings so that 8-bit/16-bit multiplication is expanded to calls into the compiler runtime library if the MCU being targeted does not support multiplication in hardware. Before this, MUL instructions would be generated on CPUs like the ATtiny85, triggering a CPU reset due to an illegal instruction at runtime. First raised in https://github.com/avr-rust/rust/issues/124. llvm-svn: 351523	2019-01-18 06:10:41 +00:00
Thomas Lively	fa5fce0016	[WebAssembly] Add languages from debug info to producers section Reviewers: aheejin, dschuff, sbc100 Subscribers: aprantl, jgravelle-google, hiraditya, sunfish Differential Revision: https://reviews.llvm.org/D56889 llvm-svn: 351507	2019-01-18 02:47:48 +00:00
Vladimir Stefanovic	e467c341aa	[mips] Emit .reloc R_{MICRO}MIPS_JALR along with j(al)r(c) $25 The callee address is added as an optional operand (MCSymbol) in AdjustInstrPostInstrSelection() and then used by asm printer to insert: '.reloc tmplabel, R_MIPS_JALR, symbol tmplabel:'. Controlled with '-mips-jalr-reloc', default is true. Differential revision: https://reviews.llvm.org/D56694 llvm-svn: 351485	2019-01-17 21:50:37 +00:00
Sanjin Sijaric	19c7db09aa	Fix the buildbot failure introduced by r351404 EXPENSIVE_CHECKS buildbots are failing due to r351404. Add x1 as live in to the funclet basic block for SEH funclets, as well as -verify-machineinstrs to the test case that triggered the failure. llvm-svn: 351472	2019-01-17 20:24:14 +00:00
Wouter van Oortmerssen	9741215258	[WebAssembly] Fixed objdump not parsing function headers. Summary: objdump was interpreting the function header containing the locals declaration as instructions. To parse these without injecting target specific code in objdump, MCDisassembler::onSymbolStart was added to be implemented by the WebAssembly implementation. WasmObjectFile now returns a code offset for the "address" of a symbol, rather than the index. This is also more in line with what other targets do. Also ensured that the AsmParser correctly puts each function in its own segment to enable this test case. Reviewers: sbc100, dschuff Subscribers: jgravelle-google, aheejin, sunfish, rupprecht, llvm-commits Differential Revision: https://reviews.llvm.org/D56684 llvm-svn: 351460	2019-01-17 18:14:09 +00:00
Matt Arsenault	afd1f8cb4f	Allow FP types for atomicrmw xchg llvm-svn: 351427	2019-01-17 10:49:01 +00:00
Diana Picus	aa9082cd09	Fix capitalization. NFC llvm-svn: 351425	2019-01-17 10:11:59 +00:00
Diana Picus	b99810417e	[ARM GlobalISel] Allow calls to varargs functions Allow varargs functions to be called, both in arm and thumb mode. This boils down to choosing the correct calling convention, which we can easily test by making sure arm_aapcscc is used instead of arm_aapcs_vfpcc when the callee is variadic. llvm-svn: 351424	2019-01-17 10:11:55 +00:00
Alex Bradbury	dcf62df83b	[RISCV] Add codegen support for RV64A In order to support codegen RV64A, this patch: * Introduces masked atomics intrinsics for atomicrmw operations and cmpxchg that use the i64 type. These are ultimately lowered to masked operations using lr.w/sc.w, but we need to use these alternate intrinsics for RV64 because i32 is not legal * Modifies RISCVExpandPseudoInsts.cpp to handle PseudoAtomicLoadNand64 and PseudoCmpXchg64 * Modifies the AtomicExpandPass hooks in RISCVTargetLowering to sext/trunc as needed for RV64 and to select the i64 intrinsic IDs when necessary * Adds appropriate patterns to RISCVInstrInfoA.td * Updates test/CodeGen/RISCV/atomic-*.ll to show RV64A support This ends up being a fairly mechanical change, as the logic for RV32A is effectively reused. Differential Revision: https://reviews.llvm.org/D53233 llvm-svn: 351422	2019-01-17 10:04:39 +00:00
Thomas Lively	c66efefd0f	[WebAssembly] Parse llvm.ident into producers section llvm-svn: 351413	2019-01-17 02:29:55 +00:00
Thomas Lively	27982aa1e1	Revert "[WebAssembly] Parse llvm.ident into producers section" This reverts commit eccdbba3a02a33e13b5262e92200a33e2ead873d. llvm-svn: 351410	2019-01-17 00:39:49 +00:00
Sanjin Sijaric	10c2d07dfd	[SEH] [ARM64] Retrieve the frame pointer from SEH funclets The Windows ARM64 runtime passes the establisher frame to funclets as the first argument. llvm-svn: 351404	2019-01-17 00:24:38 +00:00
Thomas Lively	30a5c9786a	[WebAssembly] Parse llvm.ident into producers section Summary: Everything before the word "version" is the tool, and everything after the word "version" is the version. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56742 llvm-svn: 351399	2019-01-16 23:46:14 +00:00
Sam Clegg	1ef2c042ad	[WebAssembly] Remove expected failure from known_gcc_test_failures.txt. NFC. Differential Revision: https://reviews.llvm.org/D56809 llvm-svn: 351388	2019-01-16 22:26:59 +00:00
Reid Kleckner	ee3d7ef6d6	[X86] Sink complex MCU CC helper to .cpp file from .h file, NFC llvm-svn: 351384	2019-01-16 22:05:36 +00:00
Craig Topper	bae7415b04	[X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlv Previously we used ISD::SHL and ISD::SRL to represent these in SelectionDAG. ISD::SHL/SRL interpret an out-of-range shift amount as undefined behavior and will constant fold to undef, while the intrinsics are defined to return 0 for out-of-range shift amounts. A previous patch added a special node for VPSRAV to produce all sign bits. This was previously believed safe because undefs frequently get turned into 0 either from the constant pool or a desire to not have a false register dependency. But undef is treated specially in some optimizations. For example, it's ignored in detection of vector splats. So if the ISD::SHL/SRL can be constant folded and all of the elements with in-bounds shift amounts are the same, we might fold it to a single-element broadcast from the constant pool. This would not put 0s in the elements with out-of-bounds shift amounts. We do have an existing InstCombine optimization to use shl/lshr when the shift amounts are all constant and in bounds. That should prevent some loss of constant folding from this change. Patch by zhutianyang and Craig Topper Differential Revision: https://reviews.llvm.org/D56695 llvm-svn: 351381	2019-01-16 21:46:32 +00:00
Craig Topper	ce0da9c2af	[X86] Use X86ISD::BLENDV for blendv intrinsics. Replace vselect with blendv just before isel table lookup. Remove vselect isel patterns. This cleans up the duplication we have with both intrinsic isel patterns and vselect isel patterns. This should also allow the intrinsics to get SimplifyDemandedBits support for the condition. I've switched the canonical pattern in isel to use the X86ISD::BLENDV node instead of VSELECT, since it always seemed weird to move from BLENDV with its relaxed rules on condition bits to VSELECT, which has strict rules about all bits of the condition element being the same. It's more correct to go from VSELECT to BLENDV. Differential Revision: https://reviews.llvm.org/D56771 llvm-svn: 351380	2019-01-16 21:46:28 +00:00
Changpeng Fang	25c9efdee4	AMDGPU: Adjust the chain for loads writing to the HI part of a register. Summary: For these loads that write to the HI part of a register, we should chain them to the op that writes to the LO part of the register to maintain the appropriate order. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D56454 llvm-svn: 351379	2019-01-16 21:32:53 +00:00
Craig Topper	8e2e7a1d93	[X86] Add a one use check to the setcc inversion code in combineVSelectWithAllOnesOrZeros If we're going to generate a new inverted setcc, we should make sure we will be able to remove the old setcc. Differential Revision: https://reviews.llvm.org/D56765 llvm-svn: 351378	2019-01-16 21:29:29 +00:00
Mandeep Singh Grang	9f9869fa2f	[COFF, ARM64] Implement support for SEH extensions __try/__except/__finally Summary: This patch supports MS SEH extensions __try/__except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler. We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape. Reviewers: rnk, compnerd, mstorsjo, TomTan, efriedma, ssijaric Reviewed By: rnk, efriedma Subscribers: smeenai, jrmuizel, alex, majnemer, ssijaric, ehsan, dmajor, kristina, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53540 llvm-svn: 351370	2019-01-16 19:52:59 +00:00
Krzysztof Parzyszek	6e2dddb656	[Hexagon] Do not promote terminator instructions in Hexagon loop idioms llvm-svn: 351369	2019-01-16 19:40:27 +00:00

50500 Commits