llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Craig Topper	3205bd7427	[X86] Use some preprocessor macros to reduce the very similar repeated code in getVPTESTMOpc. NFCI This function picks the X86 opcode name based on type, masking, and whether or not a load or broadcast has been folded, using multiple switch statements. The contents of the switches mostly just vary in a few characters in the instruction name. So use some macros to build the instruction names to reduce the repetitiveness.	2020-06-30 14:38:22 -07:00
Amy Kwan	d10d816f3a	[PowerPC][Power10] Add Vector Splat Imm/Permute/Blend/Shift Double Bit Imm Definitions and MC Tests This patch adds the td definitions and asm/disasm tests for the following instructions: XXSPLTIW XXSPLTIDP XXSPLTI32DX XXPERMX XXBLENDVB XXBLENDVH XXBLENDVW XXBLENDVD VSLDBI VSRDBI Differential Revision: https://reviews.llvm.org/D82896	2020-06-30 16:07:21 -05:00
Eli Friedman	24a885401b	[SVE] Reject vector struct indexes for scalable vectors. It's messy to pattern-match, and completely unnecessary: scalar indexes work equally well. See also discussion on D81620 and D82061. Differential Revision: https://reviews.llvm.org/D82430	2020-06-30 13:52:38 -07:00
Eli Friedman	ccb803369c	[BitcodeReader] Fix DelayedShuffle handling for ConstantExpr shuffles. The indexing was messed up, so the result was completely broken. Shuffle constant exprs are rare in practice; without vscale types, constant folding generally eliminates them. So sort of hard to trip over. Fixes regression from D72467. (Recommitting after fix for memory leak.) Differential Revision: https://reviews.llvm.org/D80330	2020-06-30 13:23:07 -07:00
Matt Arsenault	d907e0ced3	Sparc: Use Register	2020-06-30 16:14:23 -04:00
Matt Arsenault	6494b099e2	RISCV: Don't store function in RISCVMachineFunctionInfo Targets should not depend on the MachineFunction state during the MachineFunctionInfo construction.	2020-06-30 16:08:51 -04:00
Matt Arsenault	8215de3621	PPC: Don't store function in PPCFunctionInfo Continue migrating targets from depending on the MachineFunction during the initial construction.	2020-06-30 16:08:51 -04:00
Matt Arsenault	1d8040150a	Mips: Don't store MachineFunction in MipsFunctionInfo It will soon be disallowed to depend on MachineFunction state on construction.	2020-06-30 16:08:51 -04:00
Eli Friedman	7b42c225db	[IR] Delete llvm::Constants using the correct type. In most cases, this doesn't have much impact: the destructors just call the base class destructor anyway. A few subclasses of ConstantExpr actually store non-trivial data, though. Make sure we clean up appropriately. This is sort of ugly, but I don't see a good alternative given the constraints. Issue found by asan buildbots running the testcase for D80330. Differential Revision: https://reviews.llvm.org/D82509	2020-06-30 12:37:53 -07:00
Florian Hahn	ef10a35a6b	[AArch64] Add getCFInstrCost, treat branches as free for throughput. D79164/2596da31740f changed getCFInstrCost to return 1 per default. AArch64 did not have its own implementation, hence the throughput cost of CFI instructions is overestimated. On most cores, most branches should be predicted and essentially free throughput-wise. This recovers a 9% performance regression on a SPEC2006 benchmark on AArch64 with -O3 LTO & PGO. This patch effectively restores pre-2596da31740f behavior for AArch64 and undoes the AArch64 test changes of the patch. Reviewers: samparker, dmgreen, anemet Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D82755	2020-06-30 20:34:04 +01:00
Craig Topper	9600883c24	[X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI This replaces the switch statement implementation in clang's X86.cpp with a lookup table in X86TargetParser.cpp. I've used constexpr and a copy of the FeatureBitset from SubtargetFeature.h to store the features in a lookup table. After the lookup the bitset is translated into strings for use by the rest of the frontend code. I had to modify the implementation of the FeatureBitset to avoid bugs in gcc 5.5 constexpr handling. It seems to not like the same array entry to be used on the left side and right hand side of an assignment or &= or |=. I've also used uint32_t instead of uint64_t and sized based on the X86::CPU_FEATURE_MAX. I've initialized the features for different CPUs outside of the table so that we can express inheritance in an ad hoc way. This was one of the big limitations of the switch and we had resorted to labels and gotos. Differential Revision: https://reviews.llvm.org/D82731	2020-06-30 12:04:58 -07:00
David Green	b85514f884	[InstCombine] fma x, y, 0 -> fmul x, y If the addend of the fma is zero, common sense would suggest that we can convert fma x, y, 0.0 to fmul x, y. This comes up with some user code that was expecting the first fma in an unrolled loop to simplify to a fmul. Floating point often does not follow naive common sense though. Alive suggests that this should be guarded by nsz (as fadd -0.0, 0.0 = 0.0). fma x, y, -0.0 is always valid. Differential Revision: https://reviews.llvm.org/D82778	2020-06-30 19:56:37 +01:00
Valentin Clement	031f9ae6ad	[openmp] Move Directive and Clause helper function to tablegen Summary: Follow up to D81736. Move getOpenMPDirectiveKind, getOpenMPClauseKind, getOpenMPDirectiveName and getOpenMPClauseName to the new tablegen code generation. The code is generated in a new file named OMP.cpp.inc Reviewers: jdoerfert, jdenny, thakis Reviewed By: jdoerfert, jdenny Subscribers: mgorny, yaxunl, hiraditya, guansong, sstefan1, llvm-commits, thakis Tags: #llvm Differential Revision: https://reviews.llvm.org/D82405	2020-06-30 14:51:59 -04:00
Alex Lorenz	c5914576d2	[macho] emit LC_BUILD_VERSION load command for supported OSes and platforms This change lets LLVM use the LC_BUILD_VERSION command when building for macOS 10.14, iOS 12, tvOS 12, and watchOS 5. Additionally, this change ensures that new platforms like Apple Silicon macOS / Mac Catalyst, and simulators running on Apple Silicon always use LC_BUILD_VERSION with the OS version set to the minimum supported OS version if the deployment target version is older. Differential Revision: https://reviews.llvm.org/D82836	2020-06-30 11:48:17 -07:00
Sameer Arora	2f551c7741	[llvm-install-name-tool] Add -change option Implement `-change` option for install-name-tool. The behavior exactly matches that of cctools. Depends on D82410. Reviewed By: jhenderson, smeenai Differential Revision: https://reviews.llvm.org/D82613	2020-06-30 11:28:53 -07:00
Sameer Arora	c3265ed7a7	[llvm-install-name-tool] Add -id option Implement `-id` option for install-name-tool. Differences from cctools' behavior: - Does NOT throw an error if multiple -id options are specified. Instead, picks the last one. - Throws an error if an empty id is specified. Reviewed By: jhenderson, smeenai Differential Revision: https://reviews.llvm.org/D82410	2020-06-30 11:28:53 -07:00
Reid Kleckner	f3337db614	[PDB] Defer public serialization until PDB writing This reduces peak memory on my test case from 1960.14MB to 1700.63MB (-260MB, -13.2%) with no measurable impact on CPU time. I'm currently working with a publics stream that is about 277MB. Before this change, we would allocate 277MB of heap memory, serialize publics into them, hold onto that heap memory, open the PDB, and commit into it. After this change, we defer the serialization until commit time. In the last change I made to public writing, I re-sorted the list of publics multiple times in place to avoid allocating new temporary data structures. Deferring serialization until later requires that we don't reorder the publics. Instead of sorting the publics, I partially construct the hash table data structures, store a publics index in them, and then sort the hash table data structures. Later, I replace the index with the symbol record offset. This change also addresses a FIXME and moves the list of global and public records from GSIHashStreamBuilder to GSIStreamBuilder. Now that publics aren't being serialized, it makes even less sense to store them as a list of CVSymbol records. The hash table used to deduplicate globals is moved as well, since that is specific to globals, and not publics. Reviewed By: aganea, hans Differential Revision: https://reviews.llvm.org/D81296	2020-06-30 11:28:04 -07:00
Sanjay Patel	174204e1a1	[PhaseOrdering][NewPM] update test that silently showed bug with SpeculativeExecutionPass; NFC See D82735 / rG1a6cebb4d12c744699e23624f8afda5cbe216fe6	2020-06-30 14:22:20 -04:00
Christopher Tetreault	4a8eb5f3d3	[SVE] Remove calls to VectorType::getNumElements from AArch64 Reviewers: efriedma, paquette, david-arm, kmclaughlin Reviewed By: david-arm Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82214	2020-06-30 11:17:50 -07:00
Christopher Tetreault	b0b0a7801d	[SVE] Remove calls to VectorType::getNumElements from ExecutionEngine Reviewers: efriedma, lhames, sdesmalen, fpetrogalli Reviewed By: lhames, sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82211	2020-06-30 11:05:38 -07:00
Eric Christopher	4e74268d23	Update the phabricator docs to reflect the monorepo change. Patch by Nathan Froyd! Differential Revision: https://reviews.llvm.org/D82389	2020-06-30 10:53:38 -07:00
Hsiangkai Wang	c131338b9d	[MVT] Add new MVT types for RISC-V vector. In RISC-V vector extension, users could group multiple vector registers as one pseudo register. In mixed width operations, users could use partial vector registers to reduce the register pressure. The parameter to control register grouping and partial use is called LMUL. LMUL is a part of the type. So, we have a bunch of vector types. In order to support all these types, we need new MVT types in LLVM. In this patch, I added several MVT types that are used in RISC-V vector implementation. This is a standalone patch for MVT types without RISC-V related implementation. Differential revision: https://reviews.llvm.org/D81724	2020-07-01 01:07:50 +08:00
David Green	152a88cd95	[InstCombine] New FMA tests and regenerate tests. NFC	2020-06-30 18:05:13 +01:00
Lei Huang	a12db9449e	[PowerPC][NFC] Rename/organize encoding test files for ISA3.1 Rename `future` encoding test files to include ISA3.1 in the file name and combine with existing ISA3.1 instruction encoding tests that were added into `p10` test files. Keeping the `p10*` files for now to ensure we don't add more to them. Will remove once all ISA3.1 instructions are implemented.	2020-06-30 11:42:36 -05:00
Samuel Tebbs	dcd6d8787a	[ARM] Allow the fabs intrinsic to be tail predicated This patch stops the fabs intrinsic from blocking tail predication. Differential Revision: https://reviews.llvm.org/D82570	2020-06-30 17:27:28 +01:00
Simon Pilgrim	d9c34b0e0b	Pass MDFieldPrinter::printAPInt APInt arg by reference not value. Noticed by clang-tidy performance-unnecessary-value-param warning.	2020-06-30 17:18:20 +01:00
Samuel Tebbs	d908315744	[ARM] Allow the usub_sat and ssub_sat intrinsics to be tail predicated This patch stops the usub_sat and ssub_sat intrinsics from blocking tail predication. Differential Revision: https://reviews.llvm.org/D82571	2020-06-30 17:16:58 +01:00
Matt Arsenault	cfb54aad1c	RegAlloc: Start using Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	9854836d08	BranchFolding: Use Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	692c52cacb	TailDuplicator: Use Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	2ae74c00b9	AMDGPU: Use Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	4900c827a3	X86: Use Register	2020-06-30 12:13:08 -04:00
Sjoerd Meijer	4fb902bfc8	[ARM][MVE] Tail-predication: clean-up of unused code After the rewrite of this pass (D79175) I missed one thing: the inserted VCTP intrinsic can be cloned to exit blocks if there are instructions present in it that perform the same operation, but this wasn't triggering anymore. However, it turns out that for handling reductions, see D75533, it's actually easier not to have the VCTP in exit blocks, so this removes that code. This was possible because it turned out that some other code that depended on this, rematerialization of the trip count enabling more dead code removal later, wasn't doing much anymore due to more aggressive dead code removal that was added to the low-overhead loops pass. Differential Revision: https://reviews.llvm.org/D82773	2020-06-30 17:09:36 +01:00
Samuel Tebbs	4fc642e8a5	[ARM] Allow rounding intrinsics to be tail predicated This patch stops the trunc, rint, round, floor and ceil intrinsics from blocking tail predication. Differential Revision: https://reviews.llvm.org/D82553	2020-06-30 16:52:25 +01:00
Guillaume Chatelet	5415efc6ea	[Alignment][NFC] TargetLowering::allowsMemoryAccessForAlignment First patch of a series to adapt TargetLowering::allowsXXX functions This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D81372	2020-06-30 15:31:24 +00:00
Louis Dionne	22095a116e	[libc++abi] Remove empty source file cxa_unexpected.cpp	2020-06-30 11:18:26 -04:00
Guillaume Chatelet	d5f8f57b04	[NFC] Remove dead code Differential Revision: https://reviews.llvm.org/D81195	2020-06-30 14:46:56 +00:00
Ulrich Weigand	7cf5768d98	[SystemZ] Simplify knownbits.ll test The knownbits.ll test case is somewhat fragile since: - it relies on undef inputs; and - it operates just at the limits of the MaxRecursionDepth This means that optimization changes may easily cause the test to spuriously fail. Rewrite the test so it still validates the same thing, but in a less fragile manner.	2020-06-30 16:31:59 +02:00
Xing GUO	093e41853b	[DWARFYAML][MachO] Remove endianness related tests. fe08ab542bd6328a7906e38ae473cf655eb6a228 makes build bots unhappy (http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33624/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3ADWARF-debug_info.yaml). This patch removes failed tests.	2020-06-30 21:48:50 +08:00
Simon Pilgrim	dd6630a7e7	[X86][SSE] LowerVectorAllZero - add support for masked OR-reductions If we're masking the result of an OR-reduction before comparing against zero, we can fold this into the PTEST() / MOVMSK(CMPEQ()) codegen by pre-masking the source value. This works particularly well on PTEST which performs the AND as part of its operation, but the MOVMSK variant also benefits for non-V2I64 cases. Fixes PR44781	2020-06-30 14:38:52 +01:00
Guillaume Chatelet	ced6ab5db1	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82849	2020-06-30 13:12:31 +00:00
Guillaume Chatelet	b2f3277e56	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemmove to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82850	2020-06-30 12:46:59 +00:00
Guillaume Chatelet	3950904de0	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemset to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82851	2020-06-30 12:46:26 +00:00
Sam Parker	5b21582cc5	[NFC][ARM] Tail predication reduction tests	2020-06-30 13:27:22 +01:00
dfukalov	6a3ce76190	[PM] Fix new PM to perform SpeculativeExecution as in old PM Summary: Old PM runs SpeculativeExecutionPass for targets that have divergent branches. It uses `createSpeculativeExecutionIfHasBranchDivergencePass` that creates the pass with `OnlyIfDivergentTarget=true`, whereas new PM just created the pass with default `OnlyIfDivergentTarget=false` so it unexpectedly runs and causes buildbot test failures. Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82735	2020-06-30 15:21:04 +03:00
Simon Pilgrim	8b9ddf8506	[X86] Add tests for cmp-zero + and/trunc + or-reduction patterns Expanding off the original PR44781 test case, show the failure to fold cmp-all-zero patterns when a demanded bits limiting and/trunc is in the way.	2020-06-30 12:51:38 +01:00
Ilya Leoshkevich	75336ea828	[SystemZ] Add NoMerge MIFlag Summary: This fixes ASan and MSan tests on SystemZ after commit 6a822e20ce70 ("[ASan][MSan] Remove EmptyAsm and set the CallInst to nomerge to avoid from merging."). Based on commit 80e107ccd088 ("Add NoMerge MIFlag to avoid MIR branch folding"). Reviewers: uweigand, jonpa Reviewed By: uweigand Subscribers: hiraditya, llvm-commits, Andreas-Krebbel Tags: #llvm Differential Revision: https://reviews.llvm.org/D82794	2020-06-30 12:44:45 +02:00
Balazs Benics	68e318ed0b	[llvm][Z3][NFC] Improve mkBitvector performance We convert `APSInt`s to Z3 Bitvectors in an inefficient way for most cases. We should not serialize to std::string just to pass an int64 integer. For the vast majority of cases, we use at most 64-bit width integers (at least in the Clang Static Analyzer). We should simply call the `Z3_mk_unsigned_int64` and `Z3_mk_int64` instead of the `Z3_mk_numeral` as stated in the Z3 docs. Which says: > It (`Z3_mk_unsigned_int64`, etc.) is slightly faster than `Z3_mk_numeral` since > it is not necessary to parse a string. If the `APSInt` is wider than 64 bits, we will use the `Z3_mk_numeral` with a `SmallString` instead of a heap-allocated `std::string`. Differential Revision: https://reviews.llvm.org/D78453	2020-06-30 12:26:50 +02:00
Guillaume Chatelet	dde4971043	[Alignment][NFC] Migrate AtomicExpandPass to Align This is a followup on D78403. I'm unsure about `getAtomicOpAlign` overloads that take `AtomicRMWInst` and `AtomicCmpXchgInst`, shouldn't `getAlign` provide the correct answer already? Differential Revision: https://reviews.llvm.org/D81369	2020-06-30 09:54:45 +00:00
sstefan1	b5d0352264	[IR] NoFree IntrinsicProperty. Summary: Separate introduction of IntrNoFree property as suggested in D70365 Reviewers: arsenm, nhaehnle Tags: #llvm Differential Revision: https://reviews.llvm.org/D82587	2020-06-30 11:26:00 +02:00
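
The [InstCombine] entry b85514f884 above says that folding fma x, y, 0.0 to fmul x, y has to be guarded by nsz, because fadd -0.0, 0.0 = 0.0, while fma x, y, -0.0 can always be folded. A minimal Python sketch of that signed-zero behaviour follows. It is not LLVM code: the helper names are hypothetical, and a plain multiply-then-add stands in for the fused operation, which is an assumption that is harmless here only because the products involved are exact.

```python
import math

def mul_then_add(x, y, addend):
    # Hypothetical stand-in for "fma x, y, addend". It is unfused, but the
    # products used below are exact, so rounding does not come into play.
    return x * y + addend

def is_negative_zero(v):
    # True only for -0.0; copysign exposes the sign bit that == ignores.
    return v == 0.0 and math.copysign(1.0, v) < 0.0

x, y = -1.0, 0.0
product = x * y                            # what "fmul x, y" yields: -0.0
with_pos_zero = mul_then_add(x, y, 0.0)    # -0.0 + 0.0 gives +0.0
with_neg_zero = mul_then_add(x, y, -0.0)   # -0.0 + -0.0 gives -0.0

print(is_negative_zero(product),
      is_negative_zero(with_pos_zero),
      is_negative_zero(with_neg_zero))
```

With a +0.0 addend the result loses the negative sign that the bare product keeps, so the two forms differ unless the sign of zero is declared irrelevant. With a -0.0 addend the sign survives, which matches the commit's remark that this case is always valid.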

199315 Commits