mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

182134 Commits

Author SHA1 Message Date
Daniel Sanders
60c2b0f7af Re-commit: r366610 and r366612: Expand pseudo-components before embedding in llvm-config
There were two main problems:
* The 'nativecodegen' pseudo-component was unconditionally adding
  ${native_tgt}CodeGen even though it conditionally added ${native_tgt}Info and
  ${native_tgt}Desc. This has been fixed by making ${native_tgt}CodeGen
  conditional too.
* The 'all' pseudo-component was causing library names like LLVMLLVMDemangle,
  as the expansion produced a library name rather than a component. There
  doesn't seem to be a list of available components anywhere, so this has been
  fixed by moving the expansion of 'all' back to where it was before. This
  manifested in different ways on different builders, but the root cause was
  the same.

llvm-svn: 366622
2019-07-19 22:46:47 +00:00
Matt Arsenault
0e9abc2fbd AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces
llvm-svn: 366621
2019-07-19 22:28:44 +00:00
Stanislav Mekhanoshin
1ace6d5780 [AMDGPU] Autogenerate register sequences in tuples
Differential Revision: https://reviews.llvm.org/D65007

llvm-svn: 366619
2019-07-19 21:43:42 +00:00
Stanislav Mekhanoshin
4f62c7dc7b [AMDGPU] Fixed occupancy calculation for gfx10
Differential Revision: https://reviews.llvm.org/D65010

llvm-svn: 366616
2019-07-19 21:29:51 +00:00
Daniel Sanders
47b2ce4000 Revert r366610 and r366612: Expand pseudo-components before embedding in llvm-config
Some targets are missing LLVMDemangle, one is adding the LLVM prefix twice, and two
are hitting the very error this patch fixes for my target. Reverting while I work
through the reports.

llvm-svn: 366615
2019-07-19 21:11:05 +00:00
Craig Topper
a0fef0a13e [InstCombine] Fix copy/paste mistake in the test cases I added for PR42691. NFC
llvm-svn: 366614
2019-07-19 21:09:21 +00:00
Matt Arsenault
e337bb98bb AMDGPU: Avoid custom predicates for stores with glue
llvm-svn: 366613
2019-07-19 21:01:30 +00:00
Daniel Sanders
eb9d60e13b Fix a latent bug discovered by r366610: nativecodegen includes X86CodeGen when X86 is not compiled
I believe this to have been a latent bug, as the same expansion checks for the
existence of ${native_tgt}Info and ${native_tgt}Desc and only adds them if
they were compiled, but unconditionally adds ${native_tgt}CodeGen.

This should fix llvm-clang-x86_64-win-fast (which builds only ARM on an X86 host) and similar builders.

llvm-svn: 366612
2019-07-19 20:58:11 +00:00
Craig Topper
192e68f782 [InstCombine] Add test cases for PR42691. NFC
llvm-svn: 366611
2019-07-19 20:48:52 +00:00
Daniel Sanders
70b0573626 Expand pseudo-components before embedding in llvm-config
Summary:
If you use pseudo-targets like AllTargetsCodeGens in LLVM_DYLIB_COMPONENTS,
then a test will fail because `./bin/llvm-config --shared-mode` can't
handle these targets. We can fix this by expanding them before embedding
the string into llvm-config.

Reviewers: bogner

Reviewed By: bogner

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65011

llvm-svn: 366610
2019-07-19 20:38:05 +00:00
Matt Arsenault
6e1ebba17d AMDGPU: Redefine setcc condition PatLeafs
Avoid using custom code predicates.

llvm-svn: 366609
2019-07-19 20:24:40 +00:00
Matt Arsenault
2724c8cef5 AMDGPU: Don't rely on m0 being -1 for GWS offsets
This only works if the high bits of m0 are also 0, so m0 would have to
be set to 0xffff.

llvm-svn: 366608
2019-07-19 20:01:24 +00:00
Matt Arsenault
b81a355be5 AMDGPU: Force s_waitcnt after GWS instructions
This is apparently required to be the immediately following
instruction, so force it into a bundle with a waitcnt.

llvm-svn: 366607
2019-07-19 19:47:30 +00:00
Matt Arsenault
f3cd8c40d1 LiveIntervals: Fix handleMove asserting on BUNDLE
The top-level BUNDLE instruction should behave as an ordinary
instruction. It is supposed to have all relevant registers as implicit
operands, so moving it should work like moving any other instruction. I
believe the assert was intended to avoid moving instructions inside bundles.

llvm-svn: 366605
2019-07-19 19:32:00 +00:00
Louis Dionne
747dd4b220 Revert "[libc++] Integrate the PSTL into libc++"
This reverts r366593, which caused unforeseen breakage on the build bots.
I'm reverting until the problems have been figured out and fixed.

llvm-svn: 366603
2019-07-19 18:52:46 +00:00
Michael Liao
5f9a32dcb3 [AMDGPU] Add test case for crash in si-lower-sgpr-spills pass
Reviewers: arsenm

Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64273

llvm-svn: 366602
2019-07-19 18:50:53 +00:00
Nick Desaulniers
c56d9ef695 Revert "Use the MachineBasicBlock symbol for a callbr target"
This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea.

Two regressions were immediately reported:
- https://github.com/ClangBuiltLinux/linux/issues/614
- https://github.com/ClangBuiltLinux/linux/issues/615

Reported-by: nathanchance
llvm-svn: 366600
2019-07-19 18:18:02 +00:00
Matt Morehouse
2bd8f0a2eb [RISCV] Disable tests failing on buildbots.
r366399 enabled a couple of tests that are failing on a few buildbots.

llvm-svn: 366599
2019-07-19 18:05:12 +00:00
Stanislav Mekhanoshin
0c9d743a5c [AMDGPU] Allow register tuples to set asm names
This change reverts most of the previous register name generation.
The real problem is that RegisterTuple does not generate asm names.
Added an optional operand to RegisterTuple. This way we can simplify
register name access and dramatically reduce the size of the static
tables for the backend.

Differential Revision: https://reviews.llvm.org/D64967

llvm-svn: 366598
2019-07-19 18:05:01 +00:00
Matt Arsenault
c815dbdcb1 AMDGPU/GlobalISel: Fix MMO flags for kernel argument loads
The DAG lowering sets dereferenceable and invariant, not nontemporal.

llvm-svn: 366597
2019-07-19 17:52:56 +00:00
Matt Arsenault
fbb71b0b9e GlobalISel: Add GINodeEquiv for fcopysign
I don't need this at the moment, but it should be here.

llvm-svn: 366596
2019-07-19 17:32:19 +00:00
Shoaib Meenai
5746665a65 [llvm-lipo] Remove trailing whitespace. NFC
llvm-svn: 366595
2019-07-19 17:19:57 +00:00
Louis Dionne
5a48459ac2 [libc++] Integrate the PSTL into libc++
Summary:
This commit allows specifying LIBCXX_ENABLE_PARALLEL_ALGORITHMS when
configuring libc++ in CMake. When that option is enabled, libc++ will
assume that the PSTL can be found somewhere on the CMake module path,
and it will provide the C++17 parallel algorithms based on the PSTL
(which is assumed to be available).

The commit also adds support for running the PSTL tests as part of
the libc++ test suite.

Reviewers: rodgert, EricWF

Subscribers: mgorny, christof, jkorous, dexonsmith, libcxx-commits, mclow.lists, EricWF

Tags: #libc

Differential Revision: https://reviews.llvm.org/D60480

llvm-svn: 366593
2019-07-19 17:02:42 +00:00
Matt Arsenault
59f385cd0c AMDGPU: Add some function return test cases
llvm-svn: 366591
2019-07-19 16:45:48 +00:00
Simon Pilgrim
caf2aa9106 [AMDGPU] Regenerate test file for upcoming patch. NFCI.
llvm-svn: 366589
2019-07-19 15:43:56 +00:00
Matt Arsenault
2217349a57 AMDGPU: Attempt to fix bot error
Manually remove the file name from the check line, since it somehow ends
up being different on an MSVC bot.

llvm-svn: 366586
2019-07-19 14:56:24 +00:00
Matt Arsenault
43bcdb17a1 AMDGPU/GlobalISel: Selection for fminnum/fmaxnum
The v2f16 case doesn't work yet because the VOP3P complex patterns haven't
been ported.

llvm-svn: 366585
2019-07-19 14:42:40 +00:00
Matt Arsenault
e76c843d30 AMDGPU/GlobalISel: Support arguments with multiple registers
Handles structs used directly in argument lists.

llvm-svn: 366584
2019-07-19 14:29:30 +00:00
Matt Arsenault
3b8fc1b930 AMDGPU/GlobalISel: Rewrite lowerFormalArguments
This should now handle everything except structs passed as multiple
registers.

I think most of the packing logic should be handled by
handleAssignments, but I'm unclear on what the contract is for
multiple registers. This is copying how x86 handles this.

This does change the behavior of the test_sgpr_alignment0 amdgpu_vs
test. I don't think shader arguments should try to follow the
alignment, and registers need to be repacked. I also don't think it
matters, since I think the pointers are packed to the beginning of the
argument list anyway.

llvm-svn: 366582
2019-07-19 14:15:18 +00:00
Matt Arsenault
a09c49dee2 AMDGPU: Decompose all values to 32-bit pieces for calling conventions
This is the more natural lowering, and presents more opportunities to
reduce 64-bit ops to 32-bit.

This should also help avoid issues graphics shaders have had with
64-bit values, and simplify argument lowering in globalisel.

llvm-svn: 366578
2019-07-19 13:57:44 +00:00
Nico Weber
ab9357c357 gn build: Set +x on symlink_or_copy.py
llvm-svn: 366576
2019-07-19 13:40:54 +00:00
Matt Arsenault
1ad1497bb5 DAG: Handle dbg_value for arguments split into multiple subregs
This was handled previously for arguments split due to not fitting in
an MVT, but the register was being dropped for argument registers split
due to TLI::getRegisterTypeForCallingConv.

llvm-svn: 366574
2019-07-19 13:36:46 +00:00
Than McIntosh
5b376040fd [NFC] include cstdint/string prior to using uint8_t/string
Summary: include the proper headers prior to the use of the uint8_t
typedef and std::string.
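
A minimal sketch of the fix described above; the two declarations below are
hypothetical uses of the affected types:

  #include <cstdint>  // declares uint8_t / std::uint8_t
  #include <string>   // declares std::string

  std::uint8_t tag = 0;            // hypothetical use of uint8_t
  std::string name = "example";    // hypothetical use of std::string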

Subscribers: llvm-commits

Reviewers: cherry

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64937

llvm-svn: 366572
2019-07-19 13:13:54 +00:00
Dmitry Preobrazhensky
29afa16795 [AMDGPU][MC] Corrected parsing of branch offsets
See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D64629

llvm-svn: 366571
2019-07-19 13:12:47 +00:00
Kai Luo
aa4508e98a [MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs.
Summary:
Currently, PRE hoists common computations into
CMBB = DT->findNearestCommonDominator(MBB, MBB1).
However, if CMBB is in a hot loop body, we might see performance
degradation.

Differential Revision: https://reviews.llvm.org/D64394

llvm-svn: 366570
2019-07-19 12:58:16 +00:00
Than McIntosh
fb3f89d50d [X86] For split stack, don't save/restore the nested arg if unused
Summary:
For split-stack, if the nested argument (i.e. R10) is not used, there is no need to save/restore it in the prologue.

Reviewers: thanm

Reviewed By: thanm

Subscribers: mstorsjo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64673

llvm-svn: 366569
2019-07-19 12:54:44 +00:00
Roman Lebedev
5d89c66ff4 [NFC][InstCombine] Tests for 'rem' formation from sub-of-mul-by-'div' (PR42673)
https://rise4fun.com/Alive/8Rp
https://bugs.llvm.org/show_bug.cgi?id=42673
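
A minimal sketch of the pattern these tests cover (function name hypothetical;
assumes y != 0):

  // x - ((x / y) * y) computes the remainder of the division, so the
  // hoped-for fold is to a single 'srem' (i.e. x % y).
  int subOfMulByDiv(int x, int y) {
    return x - ((x / y) * y);  // candidate for folding to 'x % y'
  }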

llvm-svn: 366565
2019-07-19 11:29:18 +00:00
Roman Lebedev
173550eaba [NFC][InstCombine] Redundant masking before left-shift: tests with assume
If the legality check is `(shiftNbits-maskNbits) s>= 0`,
then we can simplify it to `shiftNbits u>= maskNbits`,
which is easier to check for.

However, currently switching `dropRedundantMaskingOfLeftShiftInput()`
to `SimplifyICmpInst()` does not catch these cases and regresses
currently-handled cases, so I'll leave it as is for now.

https://rise4fun.com/Alive/25P
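
Assuming both values are valid shift amounts (i.e. less than the bit width),
the two checks agree; a minimal C++ sketch with hypothetical names:

  #include <cstdint>

  // For 0 <= maskNbits, shiftNbits < 32 these are equivalent
  // (two's-complement wraparound assumed for the cast).
  bool legalSignedForm(uint32_t shiftNbits, uint32_t maskNbits) {
    return (int32_t)(shiftNbits - maskNbits) >= 0;  // (shiftNbits-maskNbits) s>= 0
  }
  bool legalUnsignedForm(uint32_t shiftNbits, uint32_t maskNbits) {
    return shiftNbits >= maskNbits;                 // shiftNbits u>= maskNbits
  }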

llvm-svn: 366564
2019-07-19 11:29:04 +00:00
Simon Pilgrim
d94781f714 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 366563
2019-07-19 11:18:46 +00:00
Oliver Stannard
5b6f9348e5 Don't update NoTrappingFPMath and FPDenormalMode in resetTargetOptions
We'd like to remove this whole function, because these are properties of
functions, not the target as a whole. These two are easy to remove
because they are only used for emitting ARM build attributes, which
expects them to represent the defaults for the whole module, not just
the last function generated.

This is needed to get correct build attributes when using IPRA on ARM,
because IPRA causes resetTargetOptions to get called before
ARMAsmPrinter::emitAttributes.

Differential revision: https://reviews.llvm.org/D64929

llvm-svn: 366562
2019-07-19 10:37:37 +00:00
George Rimar
e8c8550933 [llvm-readelf] - A fix for: "--hash-symbols asserts for 64-bit ELFs"
Fixes https://bugs.llvm.org/show_bug.cgi?id=42622.
(The --hash-symbols switch is currently broken for 64-bit ELF files due to r352630.)

Differential revision: https://reviews.llvm.org/D64788

llvm-svn: 366558
2019-07-19 10:15:03 +00:00
Oliver Stannard
1900a7e6ef [IPRA] Don't rely on non-exact function definitions
If a function definition is not exact, then the linker could select a
differently-compiled version of it, which could use different registers.

https://reviews.llvm.org/D64909

llvm-svn: 366557
2019-07-19 09:59:26 +00:00
Mikhail Maltsev
3c59f45453 [ARM] Add <saturate> operand to SQRSHRL and UQRSHLL
Summary:
According to the new Armv8-M specification
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the
instructions SQRSHRL and UQRSHLL now have an additional immediate
operand <saturate>. The new assembly syntax is:

SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm
UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm

where <saturate> can be either 64 (the existing behavior) or 48, in
which case the result is saturated to 48 bits.

The new operand is encoded as follows:
  #64 Encoded as sat = 0
  #48 Encoded as sat = 1
sat is bit 7 of the instruction bit pattern.

This patch adds a new assembler operand class MveSaturateOperand which
implements parsing and encoding. Decoding is implemented in
DecodeMVEOverlappingLongShift.

Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64810

llvm-svn: 366555
2019-07-19 09:46:28 +00:00
Hubert Tong
8c8d2305a0 [sanitizers] Use covering ObjectFormatType switches
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
(`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added.

For `UnknownObjectFormat`, the effect of the action for the `default`
case is maintained; otherwise, where `llvm_unreachable` is called,
`report_fatal_error` is used instead.

Where the `default` case returns a default value, `report_fatal_error`
is used for XCOFF as a placeholder. For `Wasm`, the effect of the action
for the `default` case is maintained.

The code is structured to avoid strongly implying that the `Wasm` case
is present for any reason other than to make the switch cover all
`ObjectFormatType` enumerator values.
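
A minimal sketch of the covering-switch shape described above; the surrounding
function and its return values are hypothetical:

  #include "llvm/ADT/Triple.h"
  #include "llvm/Support/ErrorHandling.h"

  static bool isSupported(const llvm::Triple &T) {
    switch (T.getObjectFormat()) {
    case llvm::Triple::UnknownObjectFormat:
      return false;  // preserves the old `default` behaviour
    case llvm::Triple::XCOFF:
      llvm::report_fatal_error("XCOFF is not yet supported");  // placeholder
    case llvm::Triple::Wasm:
      return false;  // present only so the switch covers every enumerator
    case llvm::Triple::COFF:
    case llvm::Triple::ELF:
    case llvm::Triple::MachO:
      return true;
    }
    llvm_unreachable("fully covered switch over ObjectFormatType");
  }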

Reviewers: sfertile, jasonliu, daltenty

Reviewed By: sfertile

Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64222

llvm-svn: 366544
2019-07-19 08:46:18 +00:00
Jay Foad
70edc47b65 [AMDGPU] Simplify the exclusive scan used for optimized atomics
Summary:
Change the scan algorithm to use only power-of-two shifts (1, 2, 4, 8,
16, 32) instead of starting off shifting by 1, 2 and 3 and then doing
a 3-way ADD, because:

1. It simplifies the compiler a little.
2. It minimizes vgpr pressure because each instruction is now of the
   form vn = vn + vn << c.
3. It is more friendly to the DPP combiner, which currently can't
   combine into an ADD3 instruction.

Because of #2 and #3 the end result is improved from this:

  v_add_u32_dpp v4, v3, v3  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
  v_mov_b32_dpp v5, v3  row_shr:2 row_mask:0xf bank_mask:0xf
  v_mov_b32_dpp v1, v3  row_shr:3 row_mask:0xf bank_mask:0xf
  v_add3_u32 v1, v4, v5, v1
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:4 row_mask:0xf bank_mask:0xe
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:8 row_mask:0xf bank_mask:0xc
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:15 row_mask:0xa bank_mask:0xf
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:31 row_mask:0xc bank_mask:0xf

To this:

  v_add_u32_dpp v1, v1, v1  row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:4 row_mask:0xf bank_mask:0xe
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_shr:8 row_mask:0xf bank_mask:0xc
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:15 row_mask:0xa bank_mask:0xf
  s_nop 1
  v_add_u32_dpp v1, v1, v1  row_bcast:31 row_mask:0xc bank_mask:0xf

I.e. two fewer computational instructions, and one extra nop slot where we
could schedule something else.
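
A compact sketch of the new shape; laneShiftedDown is a hypothetical stand-in
for the DPP row_shr/row_bcast cross-lane moves shown above:

  #include <cstdint>

  uint32_t laneShiftedDown(uint32_t V, unsigned Lanes);  // hypothetical helper

  uint32_t scanOneLane(uint32_t V) {
    // Every step has the form v = v + (v << c) in lane space, with the
    // offset doubling each time: 1, 2, 4, 8, 16, 32.
    for (unsigned Offset = 1; Offset <= 32; Offset <<= 1)
      V += laneShiftedDown(V, Offset);
    return V;
  }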

Reviewers: arsenm, sheredom, critson, rampitec, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64411

llvm-svn: 366543
2019-07-19 08:40:37 +00:00
Serguei Katkov
8a582d259c [Loop Peeling] Enable peeling of multiple exits by default.
By default, enable loop peeling with multiple exits where all non-latch
exits end up with a deopt.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64619

llvm-svn: 366542
2019-07-19 08:35:45 +00:00
Roman Lebedev
6165112266 [InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

Normally, the inner pattern is a sign-extend, but for our purposes it's no
different from the other patterns:

alive proofs:
f: https://rise4fun.com/Alive/7U3

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563
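
A hedged C++ rendering of variant (f); the cast relies on the usual
arithmetic behaviour of signed right shift, and both shift amounts are
assumed to be less than 32:

  #include <cstdint>

  // ((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt  -->  x << ShiftShAmt
  // whenever ShiftShAmt u>= MaskShAmt: every bit the mask changed is
  // shifted out of the final result.
  uint32_t variantF(uint32_t x, unsigned MaskShAmt, unsigned ShiftShAmt) {
    uint32_t masked = (uint32_t)((int32_t)(x << MaskShAmt) >> MaskShAmt);
    return masked << ShiftShAmt;  // foldable to: x << ShiftShAmt
  }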

Differential Revision: https://reviews.llvm.org/D64524

llvm-svn: 366540
2019-07-19 08:26:58 +00:00
Roman Lebedev
58a3a62a9e [InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
e: https://rise4fun.com/Alive/0FT

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64521

llvm-svn: 366539
2019-07-19 08:26:47 +00:00
Roman Lebedev
07ecac9c32 [InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
d: https://rise4fun.com/Alive/I5Y

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64519

llvm-svn: 366538
2019-07-19 08:26:37 +00:00
Roman Lebedev
b21b3169de [InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set and then performs
a left-shift of those bits, and if none of the bits that remain after the
final shift are modified by the mask, we can omit the mask.

There are many variants to this pattern:
c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)

alive proofs:
c: https://rise4fun.com/Alive/RgJh

For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common case.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.

https://bugs.llvm.org/show_bug.cgi?id=42563

Differential Revision: https://reviews.llvm.org/D64517

llvm-svn: 366537
2019-07-19 08:26:25 +00:00