llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Stanislav Mekhanoshin	26c2b984ef	[AMDGPU] gfx1030 RT support Differential Revision: https://reviews.llvm.org/D87782	2020-09-16 11:40:58 -07:00
Johannes Doerfert	facd70cf60	[OpenMP] Context selector extensions for template functions With this extension the effects of `omp begin declare variant` will be applied to template function declarations. The behavior is opt-in and controlled by the `extension(allow_templates)` trait. While generally useful, this will enable us to implement complex math function calls by overloading the templates of the standard library with the ones in libc++. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D85735	2020-09-16 13:37:10 -05:00
Johannes Doerfert	bb8acd5a57	[OpenMP] Context selector extensions for return value overloading This extension allows declaring variants in between `omp begin/end declare variant` that do not match the type of the existing function with that name. Without this extension we would not find a base function (with a compatible type), therefore create a new one, which would cause conflicting declarations. With this extension we will not create "missing" base functions, which basically renders these specializations harmless. They will be generated but never called. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D85878	2020-09-16 13:37:09 -05:00
Johannes Doerfert	c187f80294	[UpdateTestChecks][NFC] Fix spelling	2020-09-16 13:37:08 -05:00
Rahman Lavaee	848731407e	[obj2yaml] - Match ".stack_size" with the original section name, and not the uniquified name. Without this patch, obj2yaml decodes the content of only one ".stack_size" section. Other sections are dumped with their full contents. Reviewed By: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D87727	2020-09-16 11:33:20 -07:00
Matt Arsenault	07bfd23edb	GlobalISel: Lift store value widening restriction This doesn't change the memory size and doesn't need to worry about non-power-of-2 sizes.	2020-09-16 14:25:07 -04:00
Nico Weber	1db80af6a3	[gn build] make "all" target build If you want to build everything, building the default target via just `ninja` is better, but `ninja all` shouldn't give you compile errors -- this fixes that.	2020-09-16 14:21:48 -04:00
Amara Emerson	684834aea1	[AArch64][GlobalISel] Make G_BUILD_VECTOR of <16 x s8> legal.	2020-09-16 11:19:47 -07:00
Michael Kitzan	fec094fca1	[GISel] Add new combines for unary FP instrs with constant operand https://reviews.llvm.org/D86393 Patch adds five new `GICombinerRules`, one for each of the following unary FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The combine rules perform the FP operation on the constant operand and replace the original instr with the result. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules.	2020-09-16 10:34:15 -07:00
Simon Pilgrim	b6c38b504d	DwarfUnit.h - remove unnecessary includes. NFCI.	2020-09-16 18:32:29 +01:00
Simon Pilgrim	c0364fac98	raw_ostream.cpp - remove duplicate includes. NFCI. Remove headers already included in raw_ostream.h	2020-09-16 18:32:28 +01:00
Simon Pilgrim	0f41e280da	InterferenceCache.cpp - remove duplicate includes. NFCI. Remove headers already included in InterferenceCache.h	2020-09-16 18:32:28 +01:00
Simon Pilgrim	2041d8607d	ValueEnumerator.cpp - remove duplicate includes. NFCI. Remove headers already included in ValueEnumerator.h	2020-09-16 18:32:28 +01:00
Sanjay Patel	173d6a2882	[SLP] add tests for reduction ordering; NFC	2020-09-16 13:28:19 -04:00
Fangrui Song	2e45ee1e4c	[llvm-nm] Use aggregate initialization instead of memset zero	2020-09-16 10:27:12 -07:00
Jamie Schmeiser	8d6d1d8a73	Re-land: Add new hidden option -print-changed which only reports changes to IR A new hidden option -print-changed is added along with code to support printing the IR as it passes through the opt pipeline in the new pass manager. Only those passes that change the IR are reported, with others only having the banner reported, indicating that they did not change the IR, were filtered out or ignored. Filtering of output via the -filter-print-funcs is supported and a new supporting hidden option -filter-passes is added. The latter takes a comma separated list of pass names and filters the output to only show those passes in the list that change the IR. The output can also be modified via the -print-module-scope function. The code introduces a template base class that generalizes the comparison of IRs that takes an IR representation as template parameter. The constructor takes a series of lambdas that provide an event based API for generalized reporting of IRs as they are changed in the opt pipeline through the new pass manager. The first of several instantiations is provided that prints the IR in a form similar to that produced by -print-after-all with the above mentioned filtering capabilities. This version, and the others to follow will be introduced at the upcoming developer's conference. Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen) Differential Revision: https://reviews.llvm.org/D86360	2020-09-16 17:25:18 +00:00
Matt Arsenault	bd0c7f4ec9	RegAllocFast: Make self loop live-out heuristic more aggressive This currently has no impact on code, but prevents sizeable code size regressions after D52010. This prevents spilling and reloading all values inside blocks that loop back. Add a baseline test which would regress without this patch.	2020-09-16 13:12:38 -04:00
Reid Kleckner	eccb0fb0b3	Include (Type|Symbol)Record.h less Most clients only need CVType and CVSymbol, not structs for every type and symbol. Move CVSymbol and CVType to CVRecord.h to accomplish this. Update some of the common headers that need CVSymbol and CVType to use the new location.	2020-09-16 09:59:03 -07:00
Matt Arsenault	7a37a84909	AMDGPU: Clear offset register when using local stack area eliminateFrameIndex won't fix up the offset register when the direct frame index reference is moved to a separate move instruction. Switch the offset to a base 0 (which it probably should be to begin with).	2020-09-16 12:56:40 -04:00
Matt Arsenault	c25699b258	AMDGPU: Add baseline test for incorrect SP access	2020-09-16 12:56:40 -04:00
Matt Arsenault	cc3396a954	LocalStackSlotAllocation: Swap order of check	2020-09-16 12:56:40 -04:00
Arthur Eubanks	7a72eeae37	[Coro][NewPM] Handle llvm.coro.prepare.retcon in NPM coro-split pass Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D87731	2020-09-16 09:09:10 -07:00
Sjoerd Meijer	047df8d3e0	[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled Additional sanity checks were added to get.active.lane.mask's second argument, the loop tripcount/elementcount, in rG635b87511ec3. Like the other (overflow) checks, skip this if tail-predication is forced. Differential Revision: https://reviews.llvm.org/D87769	2020-09-16 17:05:14 +01:00
Jay Foad	21197a6926	[AMDGPU] Remove obsolete comment Obsoleted by e4464bf3d45848461630e3771d66546d389f1ed5 "AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTOR"	2020-09-16 17:03:55 +01:00
Francesco Petrogalli	f1d24f0e57	[llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable vectors. This patch prevents the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics from being scalarized when invoked on scalable vectors. The change in `Function.cpp` is needed to prevent the warning that is raised when `getNumElements` is used in place of `getElementCount` on `VectorType` instances. The tests guard against regressions on this change. The tests make sure that calls to `llvm.masked.[gather|scatter]` are still scalarized when: (1) the intrinsics are operating on fixed size vectors, and (2) the compiler is not targeting fixed length SVE code generation. Reviewed By: efriedma, sdesmalen Differential Revision: https://reviews.llvm.org/D86249	2020-09-16 16:00:28 +00:00
Arthur Eubanks	825221f2e5	[NPM] Translate alias analysis into require<> as well 'require<globals-aa>' is needed to make globals-aa work in NPM, since globals-aa is a module analysis but function passes cannot run module analyses on demand. So don't skip translating alias analyses to 'require<>'. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87743	2020-09-16 08:54:09 -07:00
Dmitry Preobrazhensky	2ead9fa20c	[AMDGPU] Corrected directive to use for ELF weak refs WeakRefDirective should specify a directive to declare "a global as being a weak undefined symbol". The directive used by AMDGPU was incorrect - ".weakref" was intended for other purposes. The correct directive is ".weak" and it is already defined as default for ELF. So the redefinition was removed. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D87762	2020-09-16 18:51:26 +03:00
Simon Pilgrim	c2569d39e2	[X86] EmitInstrWithCustomInserter - remove redundant getDebugLoc() calls. NFCI. Use the same DebugLoc that is called at the top of the method. Fixes some Wshadow static analyzer warnings.	2020-09-16 16:29:56 +01:00
Mircea Trofin	2e97c41718	[NFC][Regalloc] accessors for 'reg' and 'weight' Also renamed the fields to follow style guidelines. Accessors help with readability - weight mutation, in particular, is easier to follow this way. Differential Revision: https://reviews.llvm.org/D87725	2020-09-16 08:28:57 -07:00
Matt Arsenault	f2aa3ef913	AMDGPU: Improve <2 x i24> arguments and return value handling This was asserting for GlobalISel. For SelectionDAG, this was passing this on the stack. Instead, scalarize this as if it were a 32-bit vector.	2020-09-16 11:21:56 -04:00
Sebastian Neubauer	6aba538d9d	[AMDGPU] Add v3f16/v3i16 support to SDag Fix lowering and instruction selection for v3x16 types and enable InstCombine to emit them. This patch only implements it for the selection dag. GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and GlobalISel/llvm.amdgcn.image.store.2d.d16.ll still don't work. Differential Revision: https://reviews.llvm.org/D84420	2020-09-16 17:20:27 +02:00
Simon Pilgrim	eae470c498	[X86] Assert that we've found a terminator instruction. NFCI. Fixes clang static analyzer null dereference warning.	2020-09-16 16:17:49 +01:00
Jay Foad	d11aa00c67	[AMDGPU] Enable scheduling around FP MODE-setting instructions Pre-gfx10 all MODE-setting instructions were S_SETREG_B32 which is marked as having unmodeled side effects, which makes the machine scheduler treat it as a barrier. Now that we have proper implicit $mode operands we can use a no-side-effects S_SETREG_B32_mode pseudo instead for setregs that only touch the FP MODE bits, to give the scheduler more freedom. Differential Revision: https://reviews.llvm.org/D87446	2020-09-16 16:10:47 +01:00
Jay Foad	eea65ac487	[AMDGPU] Add -show-mc-encoding to setreg tests This is a pre-commit for D87446 "[AMDGPU] Enable scheduling around FP MODE-setting instructions"	2020-09-16 16:09:47 +01:00
Simon Pilgrim	c3ae82fe83	[X86][SSE] Move VZEXT_MOVL(INSERT_SUBVECTOR(UNDEF,X,0)) handling into combineTargetShuffle. Now that we're getting better at combining shuffles of different vector widths, this can now be performed as part of the standard target shuffle combines and isn't required for cleanup. Exposed a minor issue in combineX86ShufflesRecursively where we failed to check if a shuffle's src ops were simple types.	2020-09-16 16:08:31 +01:00
Dangeti Tharun kumar	b31191fb60	[Partial Inliner] Compute intrinsic cost through TTI https://bugs.llvm.org/show_bug.cgi?id=45932 assert(OutlinedFunctionCost >= Cloner.OutlinedRegionCost && "Outlined function cost should be no less than the outlined region") getting triggered in computeBBInlineCost. Intrinsics like "assume" are considered regular function calls while computing costs. This patch enables computeBBInlineCost to query TTI for intrinsic call cost. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87132	2020-09-16 15:12:31 +01:00
Florian Hahn	bd672b8800	[DSE] Add another test case with loop carried dependence.	2020-09-16 14:50:35 +01:00
Paul C. Anagnostopoulos	3fb53046bc	Add section with details about DAGs.	2020-09-16 09:27:28 -04:00
Sanjay Patel	4abaadfc37	[SLP] fix formatting; NFC Also move variable declarations closer to usage and add code comments.	2020-09-16 08:50:27 -04:00
Sam Parker	3322b6c5f2	[ARM] Reorder some logic Re-order some checks in ValidateMVEInst.	2020-09-16 13:39:22 +01:00
Sanjay Patel	fe442597a7	[SLP] remove uses of 'auto' that obscure functionality; NFC	2020-09-16 08:26:21 -04:00
Sanjay Patel	838e7cb42b	[SLP] remove redundant size check; NFC We bail out on small array size anyway.	2020-09-16 08:11:19 -04:00
Sanjay Patel	2afe46fd57	[SLP] move loop index variable declaration to its use; NFC	2020-09-16 07:59:31 -04:00
Sanjay Patel	fec534536c	[SLP] change poorly named variable; NFC 'V' shadows a function argument.	2020-09-16 07:59:31 -04:00
Sam Parker	f4c2727f1f	[RDA] Fix getUniqueReachingDef for self loops We've fixed the case where this could return an instruction after the given instruction, but this also means that we can falsely return a 'unique' def when there could be one coming from the backedge of a loop. Differential Revision: https://reviews.llvm.org/D87751	2020-09-16 12:44:23 +01:00
Sam Parker	51a8d5f1cd	[ARM] Fix tail predication predicate tracking Clear the CurrentPredicate when we find an instruction which would completely overwrite the VPR. This fix essentially means we're back to not really being able to handle VPT instructions when tail predicating. Differential Revision: https://reviews.llvm.org/D87610	2020-09-16 11:59:29 +01:00
Sam Parker	5953b93f6b	[ARM] Add more validForTailPredication Modify the unit test to inspect all MVE instructions and mark the load/store/move of vpr/p0 as valid, as well as the remaining scalar shifts. Differential Revision: https://reviews.llvm.org/D87753	2020-09-16 11:51:50 +01:00
Simon Pilgrim	6d100e4f64	[DAG] Remove getOperand() call. NFCI.	2020-09-16 11:18:58 +01:00
Sam Tebbs	44c6b51e9c	[ARM][LowOverheadLoops] Fix tests after ef0b9f3 ef0b9f3 didn't update the tests that it affected.	2020-09-16 11:01:21 +01:00
Georgii Rymar	32405053ae	[llvm-readobj][test] - Improve section-symbols.test `section-symbols.test` tests how we print section symbols in different situations. We might have 2 different cases: 1) A named STT_SECTION symbol. 2) An unnamed STT_SECTION symbol. Usually section symbols have no name, and then `--symbols` uses their section names when printing them. If a symbol has a name, then it is used. For `--relocations` we probably also want this logic, but currently we always ignore symbol names and always use section names. This is not consistent with GNU readelf or with our logic for `--symbols`. This patch refines testing to document the existing behavior and improve coverage. Differential revision: https://reviews.llvm.org/D87612	2020-09-16 12:36:09 +03:00

203630 Commits