llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Serge Pavlov	7d91a1a046	[Windows] Fix limit on command line size Documentation on CreateProcessW states that the maximal size of the command line is 32767 characters including the terminating null character. In the function llvm::sys::commandLineFitsWithinSystemLimits this limit was set to 32768. As a result, if the command line was exactly 32768 characters long, a response file was not created and CreateProcessW was called with a too long command line. Differential Revision: https://reviews.llvm.org/D83772	2020-07-21 17:33:22 +07:00
Djordje Todorovic	bd9b7ebd66	[NFC][Debugify] Rename OptCustomPassManager into DebugifyCustomPassManager In addition, move the definition of the class into the Debugify.h, so we can use it from different levels. The motivation for this is D82547. Differential Revision: https://reviews.llvm.org/D83391	2020-07-21 12:16:07 +02:00
Florian Hahn	de943b561f	[SCCP] Add range metadata to call sites with known return ranges. If we inferred a range for the function return value, we can add !range at all call-sites of the function, if the range does not include undef. Reviewers: efriedma, davide, nikic Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D83952	2020-07-21 10:06:54 +01:00
Nathan James	6fc76c37f5	[ADT] use is_base_of in place of is_same for random_access_iterator_tag checks Replace `std::is_same<X, std::random_access_iterator_tag>` with `std::is_base_of<std::random_access_iterator_tag, X>` in STLExtra algos. This doesn't have too much impact on LLVM internally as no structs derive from it. However, external projects embedding LLVM may use `std::contiguous_iterator_tag`, which should be considered by these algorithms, as well as any other potential tags people want to define derived from `std::random_access_iterator_tag`. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D84141	2020-07-21 09:55:16 +01:00
Alex Richardson	4b94bc362e	[NFC] Use FileCheck for llvm-reduce interestingness test This makes the test added in 6187eeb683d8c639282d437e6af585e9b7f9c93e easier to understand since you no longer have to look at another script to see if it's doing the right thing.	2020-07-21 09:03:45 +01:00
Jared Wyles	5603ee7630	[jitlink] Updating test file for GOT relocations for elf x86	2020-07-21 17:19:48 +10:00
David Green	5a76ea81c6	[ARM] More unpredictable VCVT instructions. These extra vcvt instructions were missed from 74ca67c109 because they live in a different Domain, but should be treated in the same way. Differential Revision: https://reviews.llvm.org/D83204	2020-07-21 07:24:37 +01:00
David Green	b082241dfa	[ARM] Predicated MVE reduction tests. NFC	2020-07-21 06:47:48 +01:00
Logan Smith	49c75c362c	[NFC] Add another missing 'override' This should be the last one needed to appease the -Werror bots (knock on wood).	2020-07-20 22:04:27 -07:00
Logan Smith	24c415cf6d	[NFC] Add missing 'override's	2020-07-20 19:52:49 -07:00
David Blaikie	dc9888b6b6	DebugInfo: Move getMD5AsBytes from DwarfUnit to DwarfDebug It wasn't using any state from DwarfUnit anyway.	2020-07-20 19:21:39 -07:00
Matt Arsenault	f4e8befd9e	GlobalISel: Rewrite getLCMType Try to make the behavior more consistent with getGCDType, and bias towards returning something closer to the source type whenever there's an ambiguity.	2020-07-20 21:06:30 -04:00
Matt Arsenault	6ba9358e88	GlobalISel: Handle more cases in getGCDType Try harder to find a canonical unmerge type when trying to cover the desired target type. Handle finding a compatible unmerge type for two vectors with different element types. This will return the largest multiple of the source vector element that will evenly divide the target vector type. Also make the handling mixing scalars and vectors, and prefer the source element type as the unmerge target type.	2020-07-20 20:53:35 -04:00
Matt Arsenault	d057444115	AMDGPU/GlobalISel: Remove unnecessary parameter	2020-07-20 20:53:01 -04:00
Artem Belevich	9d0375c353	[MC,NVPTX] Add MCAsmPrinter support for unsigned-only data directives. PTX does not support negative values in .bNN data directives and we must typecast such values to unsigned before printing them. MCAsmInfo can now specify whether such casting is necessary for particular target. Differential Revision: https://reviews.llvm.org/D83423	2020-07-20 16:24:41 -07:00
Lang Hames	abf1154870	[ExecutionEngine] Initialize near block hint in SectionMemoryManager. When allocating a new memory block in SectionMemoryManager, initialize the Near hint for the other memory groups if they have not been set already. Patch by Dana Koch. Thanks Dana!	2020-07-20 14:40:54 -07:00
Roman Lebedev	0885e066d4	[Reduce] Argument reduction: don't try to drop terminator instructions Newly-added test previously crashed. While it is up for debate whether or not instruction reduction should be indiscriminate in instruction dropping (there you can just ensure that the test case is still -verify'ies), here if we drop terminator, CloneFunctionInto() will immediately crash. So let's not do that :)	2020-07-21 00:06:03 +03:00
Logan Smith	038f81782a	[llvm][unittest] Add -Wno-suggest-override to more infrastructure that includes googletest/googlemock headers	2020-07-20 13:59:39 -07:00
Louis Dionne	93b238862a	[NFC] Use std::free instead of ::free Since we include <cstdlib> instead of <stdlib.h>, it makes sense to use std::free.	2020-07-20 16:19:08 -04:00
Sanjay Patel	d607929186	[InstCombine] allow peeking through zext of shift amount to match rotate idioms (PR45701) We might want to also allow trunc of the shift amount, but that seems less likely? define i32 @src(i32 %x, i1 %y) { %0: %rem = and i1 %y, 1 %cmp = icmp eq i1 %rem, 0 %sh_prom = zext i1 %rem to i32 %sub = sub nsw nuw i1 0, %rem %sh_prom1 = zext i1 %sub to i32 %shr = lshr i32 %x, %sh_prom1 %shl = shl i32 %x, %sh_prom %or = or i32 %shl, %shr %r = select i1 %cmp, i32 %x, i32 %or ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %t = zext i1 %y to i32 %r = fshl i32 %x, i32 %x, i32 %t ret i32 %r } Transformation seems to be correct! https://alive2.llvm.org/ce/z/xgMvE3 http://bugs.llvm.org/PR45701	2020-07-20 16:18:11 -04:00
Sanjay Patel	2feb6522ca	[InstCombine] add tests for funnel shift/rotate with narrow shift amount; NFC	2020-07-20 16:18:11 -04:00
Florian Hahn	ded006d0d9	[Matrix] Use TileInfo to create tiled loop nest for matrix multiply. This patch uses the TileInfo introduced in D77550 to generate a loop nest for tiled matrix multiplication, instead of generating the unrolled code for the whole multiplication. This makes code-generation more scalable for larger matrixes. Initially loops are only used if both the number of rows and columns are divisible by the tile size. Other cases will be added as follow-up. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, nicolasvasilache Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D81308	2020-07-20 21:11:53 +01:00
Eli Friedman	941026e51d	[AArch64][SVE] Add support for trunc to <vscale x N x i1>. This isn't a natively supported operation, so convert it to a mask+compare. In addition to the operation itself, fix up some surrounding stuff to make the testcase work: we need concat_vectors on i1 vectors, we need legalization of i1 vector truncates, and we need to fix up all the relevant uses of getVectorNumElements(). Differential Revision: https://reviews.llvm.org/D83811	2020-07-20 13:11:02 -07:00
Logan Smith	16a34f1645	Enable -Wsuggest-override in the LLVM build This patch adds Clang's new (and GCC's old) -Wsuggest-override to the warning flags for the LLVM build. The warning is a stronger form of -Winconsistent-missing-override which warns _everywhere_ that override is missing, not just in places where it's inconsistent within a class. Some directories in the monorepo need the warning disabled for compatibility's, or sanity's, sake; in particular, libcxx/libcxxabi, and any code implementing or interoperating with googletest, googlemock, or google benchmark (which do not themselves use override). This patch adds -Wno-suggest-override to the relevant CMakeLists.txt's to accomplish this. Differential Revision: https://reviews.llvm.org/D84126	2020-07-20 12:32:47 -07:00
Hiroshi Yamauchi	d2f68caf31	[PGO] Enable the extended value profile buckets for mem op sizes. Following up D81682 and enable the new, extended value profile buckets for mem op sizes. Differential Revision: https://reviews.llvm.org/D83903	2020-07-20 12:05:09 -07:00
Hiroshi Yamauchi	821825f6a3	[PGO][PGSO] Remove a temporary flag used for gradual rollout. Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used for the staged rollout. This is a cleanup (NFC) as it's now false by default. Differential Revision: https://reviews.llvm.org/D84057	2020-07-20 11:12:11 -07:00
Mircea Trofin	3e712a8637	[llvm] Development-mode InlineAdvisor Summary: This is the InlineAdvisor used in 'development' mode. It enables two scenarios: - loading models via a command-line parameter, thus allowing for rapid training iteration, where models can be used for the next exploration phase without requiring recompiling the compiler. This trades off some compilation speed for the added flexibility. - collecting training logs, in the form of tensorflow.SequenceExample protobufs. We generate these as textual protobufs, which simplifies generation and testing. The protobufs may then be readily consumed by a tensorflow-based training algorithm. To speed up training, training logs may also be collected from the 'default' training policy. In that case, this InlineAdvisor does not use a model. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html Reviewers: jdoerfert, davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83733	2020-07-20 11:01:56 -07:00
LLVM GN Syncbot	d8d3b91ab3	[gn build] Port e1270b16c94	2020-07-20 17:51:57 +00:00
Florian Hahn	992d1824c2	[Matrix] Add TileInfo abstraction for tiled matrix code-gen. This patch adds a TileInfo abstraction and utilities to create a 3-level loop nest for tiling. Reviewers: anemet Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D77550	2020-07-20 18:49:08 +01:00
Victor Huang	5b0daad836	[PowerPC] Implement R_PPC64_REL24_NOTOC local calls, callee requires a TOC The PC Relative code now allows for calls that are marked with the relocation R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC pointer in R2 and does not require R2 to be restored after the call. This patch is added to support local calls to callees that require a TOC Reviewed By: sfertile, MaskRay, nemanjai, stefanp Differential Revision: https://reviews.llvm.org/D83504	2020-07-20 17:46:49 +00:00
Yuanfang Chen	8dacd8c600	[NFC] remove unused llvm::deleter	2020-07-20 10:43:29 -07:00
Yuanfang Chen	52f25d528b	[NFC] remove unused includes of SelectionDAGISel.h	2020-07-20 10:43:29 -07:00
Yuanfang Chen	5ffd4d5373	[NFC] remove unneeded TargetLoweringObjectFile init after 85c30f3374d9	2020-07-20 10:43:28 -07:00
Yuanfang Chen	bf8086d1c1	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel which could not be done with `-print-after`/`-stop-after` since isel pass does not have commandline name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to new pass manager since isel pass would have a commandline text name to use `print-after` or equivalent switches. The motivation of this patch is to reduce tests dependency on would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Matt Arsenault	f08273f200	AMDGPU: Use MCRegister for preloaded arguments Attempt to fix build error with ancient GCC	2020-07-20 13:34:28 -04:00
Fangrui Song	5a1c9a0071	[llvm-readobj] clang-format DwarfCFIEHPrinter.h, NFC Pre-commit header ordering changes (and other minor clean-ups) before landing D84106.	2020-07-20 10:25:16 -07:00
Fangrui Song	6f16bde0e0	[LLVMgold.so] -plugin-opt=save-temps: save combined module to .lto.o instead of .o This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1 .o is a bad choice for save-temps output because it is easy to override the bitcode file (*.o) ``` # Use bfd for the example, -fuse-ld=gold is similar. clang -flto -c a.c # generate bitcode file a.o clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps # override a.o # The user repeats the command but get surprised, because a.o is now a combined module. clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D84132	2020-07-20 10:02:56 -07:00
Nick Desaulniers	72d9dcdd94	[ThinLTO] parse flags and blockcount summaries Forked from pr/46523, we were having a hard time running llvm-extract on IR from a thinLTO build of the Linux kernel. $ llvm-extract --func jeq_imm jit-42f488b63a04fdaa931315bdadecb6d23e20529a.ll llvm-extract: jit-42f488b63a04fdaa931315bdadecb6d23e20529a.ll:47463:8: error: Expected 'gv', 'module', or 'typeid' at the start of summary entry ^209 = flags: 8 ^ Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D82917	2020-07-20 09:50:22 -07:00
Matt Arsenault	6a998dc1d5	AMDGPU: Remove outdated fixme	2020-07-20 11:41:41 -04:00
Matt Arsenault	48dcb3085a	AMDGPU: Fix not accounting for constantexpr uses of LDS globals This was failing to add the size of LDS globals that weren't directly used by an instruction. They could be used by constant expressions which are transitively used by the function. This requires a better search, but just abort on this for now for correctness.	2020-07-20 11:41:41 -04:00
Matt Arsenault	b0a70713f7	AMDGPU/GlobalISel: Initial Implementation of calls Return values, and tail calls are not yet handled.	2020-07-20 11:13:22 -04:00
Matt Arsenault	b7a779e13a	Verifier: Check byref address space for AMDGPU calling conventions	2020-07-20 11:13:11 -04:00
Matt Arsenault	413b267e1e	Verifier: Disallow byval and similar for AMDGPU calling conventions These imply stack-like semantics, which doesn't make any sense for entry points.	2020-07-20 10:58:57 -04:00
Alok Kumar Sharma	0a592fd282	[DebugInfo] Support for DW_AT_associated and DW_AT_allocated. Summary: This support is needed for the Fortran array variables with pointer/allocatable attribute. This support enables debugger to identify the status of variable whether that is currently allocated/associated. for pointer array (before allocation/association) without DW_AT_associated (gdb) pt ptr type = integer (140737345375288:140737354129776) (gdb) p ptr value requires 35017956 bytes, which is more than max-value-size with DW_AT_associated (gdb) pt ptr type = integer (:) (gdb) p ptr $1 = <not associated> for allocatable array (before allocation) without DW_AT_allocated (gdb) pt arr type = integer (140737345375288:140737354129776) (gdb) p arr value requires 35017956 bytes, which is more than max-value-size with DW_AT_allocated (gdb) pt arr type = integer, allocatable (:) (gdb) p arr $1 = <not allocated> Testing - unit test cases added - check-llvm - check-debuginfo Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83544	2020-07-20 19:54:35 +05:30
Matt Arsenault	ea505ad2f6	IR: Define byref parameter attribute This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition. This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch. My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway. This is intended to avoid some optimization issues with the current handling of aggregate arguments, as well as to fix inflexibility in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads. Background: There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind. It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These arguments are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument memory to a mutable variable. The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage. This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's really no advantage to an alignment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments. Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits between arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all aggregate arguments for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.	2020-07-20 10:23:09 -04:00
Simon Pilgrim	a3033adc1a	MCFixup.h - remove unnecessary MCExpr.h include. NFCI. Move the include down to files that actually depend on MCExpr definitions. Also exposes an implicit dependency on MCContext in AVRAsmBackend.h	2020-07-20 15:17:19 +01:00
Simon Pilgrim	741fef4349	CodeGenDAGPatterns.h - remove unnecessary ComplexPattern forward declaration. NFCI. This is defined in CodeGenTarget.h which we have to explicitly include already.	2020-07-20 15:17:19 +01:00
Simon Pilgrim	f042856ca0	CodeGenDAGPatterns.h - remove unused CodeGenHwModes.h include. NFCI.	2020-07-20 15:17:18 +01:00
Petar Avramovic	bd8687800b	AMDGPU/GlobalISel: Legalize s16->s64 G_FPEXT Legalize using narrowScalar as s16->s32 G_FPEXT followed by s32->s64 G_FPEXT. Differential Revision: https://reviews.llvm.org/D84030	2020-07-20 16:12:19 +02:00
Matt Arsenault	d494042b78	AMDGPU/GlobalISel: Remove outdated comment	2020-07-20 10:06:18 -04:00
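
The [ADT] entry above (6fc76c37f5) swaps an exact `std::is_same` tag comparison for `std::is_base_of`. The following is a minimal sketch of why that matters, not the code from STLExtras.h: the tag `custom_contiguous_tag` and the two trait names are hypothetical, introduced only for illustration, with the custom tag standing in for `std::contiguous_iterator_tag` or any user-defined tag derived from `std::random_access_iterator_tag`.

```cpp
#include <iterator>
#include <type_traits>

// Hypothetical tag derived from std::random_access_iterator_tag. It stands in
// for std::contiguous_iterator_tag (C++20) or a tag defined by an external project.
struct custom_contiguous_tag : std::random_access_iterator_tag {};

// Exact-match check: true only for std::random_access_iterator_tag itself.
template <typename Tag>
constexpr bool is_random_access_exact =
    std::is_same<Tag, std::random_access_iterator_tag>::value;

// Base-of check: also true for any tag derived from random_access_iterator_tag.
template <typename Tag>
constexpr bool is_random_access_or_derived =
    std::is_base_of<std::random_access_iterator_tag, Tag>::value;

// Both checks agree on the standard random access tag.
static_assert(is_random_access_exact<std::random_access_iterator_tag>, "");
static_assert(is_random_access_or_derived<std::random_access_iterator_tag>, "");

// Only the base-of check recognizes the derived tag.
static_assert(!is_random_access_exact<custom_contiguous_tag>, "");
static_assert(is_random_access_or_derived<custom_contiguous_tag>, "");

// Weaker categories are rejected by both checks.
static_assert(!is_random_access_or_derived<std::bidirectional_iterator_tag>, "");
```

With the exact-match form, an algorithm that dispatches on the tag would presumably fall back to its slower generic path for iterators carrying a derived tag; the base-of form lets them take the random access path.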


200457 Commits
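
The [Windows] entry in the log above (7d91a1a046) describes an off-by-one in a length limit. The sketch below illustrates that boundary only; it is not the body of `llvm::sys::commandLineFitsWithinSystemLimits`. The helper `fitsWithinLimit` and both constant names are hypothetical, and the sketch assumes the counted length already includes the terminating null character.

```cpp
#include <cstddef>

// Limit quoted in the commit message: 32767 characters, terminating null included.
constexpr std::size_t DocumentedLimit = 32767;
// Value the commit message says was in use before the fix.
constexpr std::size_t OldLimit = 32768;

// Hypothetical helper, not LLVM's implementation: returns true when a command
// line of the given length (assumed to include the terminating null) fits.
constexpr bool fitsWithinLimit(std::size_t CharsIncludingNull, std::size_t Limit) {
  return CharsIncludingNull <= Limit;
}

// With the old value, a 32768-character command line is accepted, so no
// response file would be created for it.
static_assert(fitsWithinLimit(32768, OldLimit), "");

// With the documented value, the same command line is rejected, while one of
// exactly 32767 characters still fits.
static_assert(!fitsWithinLimit(32768, DocumentedLimit), "");
static_assert(fitsWithinLimit(32767, DocumentedLimit), "");
```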