llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Peter Collingbourne	3f38e31261	ARM: Use BKPT instead of TRAP to implement llvm.debugtrap. The BKPT instruction is specified to cause a software breakpoint, and at least on Linux results in a SIGTRAP. This makes it more suitable for implementing debugtrap than TRAP (aka UDF #254), which is specified to cause an undefined instruction exception and results in a SIGILL on Linux. Moreover, BKPT is not marked as a terminator, which is not only consistent with the IR instruction but allows the analyzeBlock function to correctly analyze a basic block containing the instruction, which fixes an assertion failure in the machine block placement pass previously triggered by the included test case. Because BKPT is only supported starting with ARMv5T, we continue to use UDF #254 when targeting v4T. Differential Revision: https://reviews.llvm.org/D53614 llvm-svn: 345171	2018-10-24 18:10:38 +00:00
Saleem Abdulrasool	eca209183e	ARM: handle checking aliases with out-of-bounds GEPs A global alias may use indices which are not considered in bounds. In such a case, accessing the base object will fail as it only peers through inbounds accesses. This pattern is used by the swift compiler to create references to preceding members in the type metadata. This would cause the code generation to fail when targeting a platform that used ELF as the object file format. Be conservative and fail the read-only check if we run into an alias that we cannot peer through. llvm-svn: 345107	2018-10-24 00:00:52 +00:00
Simon Pilgrim	df74e03322	[TTI] Add generic cost handling of SK_Reverse shuffles These can be treated as a general permute. This required a fix for missing reverse patterns on ARM llvm-svn: 345015	2018-10-23 09:42:10 +00:00
Eli Friedman	ec411cfda2	Revert r344693 ("[ARM] bottom-top mul support in ARMParallelDSP") Still causing failures on the polly-aosp buildbot; I'll follow up with a reduced testcase. llvm-svn: 344752	2018-10-18 19:34:30 +00:00
Sam Parker	967e905c0d	[ARM] bottom-top mul support in ARMParallelDSP Previously reverted in rL343082. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 344693	2018-10-17 13:02:48 +00:00
Sjoerd Meijer	3e73839b39	[ARM] Do not fuse VADD and VMUL, continued (2/2) This is patch 2/2, following up on D53314, and is the functional change to prevent fusing mul + add sequences into VFMAs. Differential revision: https://reviews.llvm.org/D53315 llvm-svn: 344683	2018-10-17 10:05:44 +00:00
Sjoerd Meijer	e06ebcf09e	[ARM] Follow up of rL344671, attempt to pacify a buildbot It was rightfully complaining about an unpretty logical expression. llvm-svn: 344677	2018-10-17 07:51:24 +00:00
Sjoerd Meijer	dbb2ea77e4	[ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2) This is a follow up of rL342874, which stopped fusing muls and adds into VMLAs for performance reasons on the Cortex-M4 and Cortex-M33. This is a series of 2 patches that tries to achieve the same for VFMA. The summary below shows, per optimization level, what we were generating before rL342874, what changed with rL342874, and what we want to achieve with these 2 patches: -O3: vmla (< rL342874); vmul + vadd (>= rL342874); vmul + vadd (goal). -Ofast: vfma (< rL342874); vfma (>= rL342874); vmul + vadd (goal). -Oz: vmla (< rL342874); vmla (>= rL342874); vmla (goal). This patch, 1/2, is a cleanup of the spaghetti predicate logic on the different VMLA and VFMA codegen rules, so that we can make the final functional change in patch 2/2. This also fixes a typo in the regression test added in rL342874. Differential revision: https://reviews.llvm.org/D53314 llvm-svn: 344671	2018-10-17 07:26:35 +00:00
Evandro Menezes	8c438fbd8d	[NFC][ARM] Refactor macro fusion Simplify code for wildcards. llvm-svn: 344625	2018-10-16 17:19:51 +00:00
Simon Pilgrim	e44f188a68	[ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281) As I suggested on PR39281, this patch uses PADDL pairwise addition to widen from the vXi8 CTPOP result to the target vector type. This is a blocker for moving more x86 code to generic vector CTPOP expansion (P32655 + D53258) - ARM's vXi64 CTPOP currently expands, which would generate a vXi64 MUL but ARM's custom lowering expands the general MUL case and vectors aren't well handled in LegalizeDAG - improving the CTPOP lowering was a lot easier than fixing the MUL lowering for this one case...... Differential Revision: https://reviews.llvm.org/D53257 llvm-svn: 344512	2018-10-15 13:20:41 +00:00
Dorit Nuzman	a3df726c55	recommit 344472 after fixing build failure on ARM and PPC. llvm-svn: 344475	2018-10-14 08:50:06 +00:00
Dorit Nuzman	70052d5053	revert 344472 due to failures. llvm-svn: 344473	2018-10-14 07:21:20 +00:00
Dorit Nuzman	c4c9199631	[IAI,LV] Add support for vectorizing predicated strided accesses using masked interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472	2018-10-14 07:06:16 +00:00
George Burgess IV	37a10a0055	Replace most users of UnknownSize with LocationSize::unknown(); NFC Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186	2018-10-10 21:28:44 +00:00
Peter Smith	8682e9a796	[ARM] Account for implicit IT when calculating inline asm size When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated. The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT. Fixes pr31805 Differential Revision: https://reviews.llvm.org/D52834 llvm-svn: 343960	2018-10-08 09:38:28 +00:00
Matthias Braun	4acd274986	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895	2018-10-05 22:00:13 +00:00
Jonas Paulsson	d362ca6156	[TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints() Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851	2018-10-05 14:23:11 +00:00
Matt Morehouse	de1f22a41d	Revert "X86, AArch64, ARM: Do not attach debug location to spill/reload instructions" This reverts r343520 due to breakage of HWASan tests on Android. llvm-svn: 343616	2018-10-02 18:35:44 +00:00
Diogo N. Sampaio	8af1166e5f	[ARM] Emit data symbol for constant pool data The ARM ELF emitter would omit printing the data symbol for constant data. This patch overrides the emitFill method to ensure that the symbol is correctly printed. Differential revision: https://reviews.llvm.org/D52737 llvm-svn: 343594	2018-10-02 14:55:48 +00:00
Matthias Braun	9c73b0eb23	X86, AArch64, ARM: Do not attach debug location to spill/reload instructions Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343520	2018-10-01 18:56:39 +00:00
Eli Friedman	a0f892568b	[ARM] Fix correctness checks in promoteToConstantPool. Correctly check for relocations in the constant to promote. And don't allow promoting a constant multiple times. This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ; it's not a complete fix because we also need to prevent ARMConstantIslands from cloning the constant. (-arm-promote-constant is currently off by default, and it stays off with this patch. I'll look into turning it on again when all the known issues are fixed.) Differential Revision: https://reviews.llvm.org/D51472 llvm-svn: 343361	2018-09-28 20:27:31 +00:00
Eli Friedman	da1384c2fe	[ARM] Use preferred alignment for constants in promoteToConstantPool. This mostly affects IR generated by non-clang frontends because clang generally sets the alignment of globals explicitly. Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 . (-arm-promote-constant is currently off by default, and it stays off with this patch. I'll look into turning it on again when all the known issues are fixed.) Differential Revision: https://reviews.llvm.org/D51469 llvm-svn: 343359	2018-09-28 20:21:51 +00:00
David Spickett	bfb82efb28	[ARM] Allow execute only code on Cortex-m23 The NoMovt feature prevents the use of MOVW/MOVT instructions on Cortex-M23 for performance reasons. These instructions are required for execute only code so NoMovt should be disabled when that option is enabled. Differential Revision: https://reviews.llvm.org/D52551 llvm-svn: 343302	2018-09-28 08:55:19 +00:00
Oliver Stannard	62a89909f4	[ARM][v8.5A] Add speculation barriers SSBB and PSSBB This adds two new barrier instructions which can be used to restrict speculative execution of load instructions. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52484 llvm-svn: 343300	2018-09-28 08:27:56 +00:00
Oliver Stannard	9fd24e88bc	[ARM][v8.5A] Add speculation barrier to ARM & Thumb instruction sets This is a new barrier which limits speculative execution of the instructions following it. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52477 llvm-svn: 343213	2018-09-27 13:41:14 +00:00
Fangrui Song	c2791239be	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
Oliver Stannard	347bc189fb	[ARM/AArch64][v8.5A] Add Armv8.5-A target This patch allows targeting Armv8.5-A, adding the architecture to tablegen and setting the options to be identical to Armv8.4-A for the time being. Subsequent patches will add support for the different features included in the Armv8.5-A Reference Manual. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52470 llvm-svn: 343102	2018-09-26 12:48:21 +00:00
Sam Parker	7284df400a	[ARM] Fix for PR39060 When calculating whether a value can safely overflow for use by an icmp, we weren't checking that the value couldn't wrap around. To do this we need the icmp to be using a constant, as well as the incoming add or sub. bugzilla report: https://bugs.llvm.org/show_bug.cgi?id=39060 Differential Revision: https://reviews.llvm.org/D52463 llvm-svn: 343092	2018-09-26 10:56:00 +00:00
Hans Wennborg	a5c38063f6	Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP" This broke Chromium's Android build (https://crbug.com/889390) and the polly-aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable). > Originally committed in rL342210 but was reverted in rL342260 because > it was causing issues in vectorized code, because I had forgotten to > ensure that we're operating on scalar values. > > Original commit message: > > On failing to find sequences that can be converted into dual macs, > try to find sequential 16-bit loads that are used by muls which we > can then use smultb, smulbt, smultt with a wide load. > > Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 343082	2018-09-26 08:41:50 +00:00
Nirav Dave	36a936ebf7	[ARM] Share predecessor bookkeeping in CombineBaseUpdate. NFCI. llvm-svn: 342987	2018-09-25 15:30:47 +00:00
Evandro Menezes	a69b45a477	[ARM] Adjust the cost model for Exynos Tune `MaxInterleaveFactor` and `LdStMultipleTiming` and remove `PartialUpdateClearance` for the Exynos processors. llvm-svn: 342900	2018-09-24 16:35:14 +00:00
Evandro Menezes	933d3c8ea9	[ARM] Adjust the feature set for Exynos Enable crypto and literals fusion for the Exynos processors. llvm-svn: 342899	2018-09-24 16:35:09 +00:00
Zhaoshi Zheng	14ebc5a569	[Thumb1] Any imm8 should have cost of 1 A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or [0, 255] in unsigned i8 type on Thumb1. Differential Revision: https://reviews.llvm.org/D52257 llvm-svn: 342898	2018-09-24 16:15:23 +00:00
Luke Cheeseman	645b604217	[Arm][AsmParser] Restrict register list size for VSTM/VLDM - The assembler accepts VSTM/VLDM with register lists (specifically double registers lists) with more than 16 registers specified - The Arm architecture reference manual says this instruction must not contain more than 16 registers when the registers are doubleword registers - This addresses one of the concerns in https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52082 llvm-svn: 342891	2018-09-24 15:13:48 +00:00
Sjoerd Meijer	d5015b6840	[ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33 A sequence of VMUL and VADD instructions always give the same or better performance than a fused VMLA instruction on the Cortex-M4 and Cortex-M33. Executing the VMUL and VADD back-to-back requires the same cycles, but having separate instructions allows scheduling to avoid the hazard between these 2 instructions. Differential Revision: https://reviews.llvm.org/D52289 llvm-svn: 342874	2018-09-24 12:02:50 +00:00
Hans Wennborg	058e12737a	Revert r341932 "[ARM] Enable ARMCodeGenPrepare by default" This caused miscompilation of WebRTC for Android: PR39060. > We've had the pass enabled downstream for a couple of weeks and it > seems to be okay, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D51920 llvm-svn: 342873	2018-09-24 11:40:07 +00:00
Luke Cheeseman	77de0a1dfd	[ARM][ARMLoadStoreOptimizer] - The load store optimizer is currently merging multiple loads/stores into VLDM/VSTM with more than 16 doubleword registers - This is an UNPREDICTABLE instruction and shouldn't be done - It looks like the Limit for how many registers included in a merge got dropped at some point so I am reintroducing it in this patch - This fixes https://bugs.llvm.org/show_bug.cgi?id=38389 Differential Revision: https://reviews.llvm.org/D52085 llvm-svn: 342872	2018-09-24 10:42:22 +00:00
Sam Parker	b4d8a969b6	[ARM] bottom-top mul support ARMParallelDSP Originally committed in rL342210 but was reverted in rL342260 because it was causing issues in vectorized code, because I had forgotten to ensure that we're operating on scalar values. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342870	2018-09-24 09:34:06 +00:00
Maya Madhavan	ac015cd2c1	Fix for bug 34002 - label generated before IT block is finalized. Differential Revision: https://reviews.llvm.org/D52258 llvm-svn: 342615	2018-09-20 05:11:42 +00:00
Evandro Menezes	cdfed5405b	[ARM] Adjust the feature set for Exynos Fine tune the cost model for all Exynos processors. llvm-svn: 342585	2018-09-19 19:51:29 +00:00
Evandro Menezes	0849eadfca	[ARM] Refactor Exynos feature set (NFC) Since all Exynos processors share the same feature set, fold them in the implied features list for the subtarget. llvm-svn: 342583	2018-09-19 19:43:23 +00:00
Alex Bradbury	b88141f5a6	[AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IR This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550	2018-09-19 14:51:42 +00:00
Oliver Stannard	959a756a99	[ARM] Fix unwind information for floating point registers Fixes the unwind information generated for floating-point registers. Previously, all padding registers were assumed to be four bytes wide. Now, the width of the register is used to specify the amount of padding. Patch by Jackson Woodruff! Differential revision: https://reviews.llvm.org/D51494 llvm-svn: 342545	2018-09-19 13:25:31 +00:00
Volodymyr Sapsai	d2bca167a0	Revert "[ARM] Cleanup ARM CGP isSupportedValue" This reverts r342395 as it caused error > Argument value type does not match pointer operand type! > %0 = atomicrmw volatile xchg i8* %_Value1, i32 1 monotonic, !dbg !25 > i8in function atomic_flag_test_and_set > fatal error: error in backend: Broken function found, compilation aborted! on bot http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/ More details are available at https://reviews.llvm.org/D52080 llvm-svn: 342431	2018-09-18 00:11:55 +00:00
Sam Parker	dfcd3f1ff0	[ARM] Cleanup ARM CGP isSupportedValue isSupportedValue explicitly checked and accepted many types of value, primarily for debugging reasons. Remove most of these checks and do a bit of refactoring now that the pass is more stable. This also enables ZExts to be sources, but this has very little practical benefit at the moment, as extend instructions will still be introduced. Differential Revision: https://reviews.llvm.org/D52080 llvm-svn: 342395	2018-09-17 13:57:39 +00:00
Sam Parker	77177923c8	[ARM] Disallow icmp with negative imm and overflow We allow overflowing instructions if they're decreasing and only used by an unsigned compare. Add the extra condition that the icmp cannot be using a negative immediate. Differential Revision: https://reviews.llvm.org/D52102 llvm-svn: 342392	2018-09-17 13:48:25 +00:00
Reid Kleckner	1cb33bd6c5	Revert r342210 "[ARM] bottom-top mul support in ARMParallelDSP" It causes assertion failures while building Skia for Android in Chromium: https://ci.chromium.org/buildbot/chromium.clang/ToTAndroid/4550 Reduction forthcoming. llvm-svn: 342260	2018-09-14 18:44:37 +00:00
Sam Parker	3764cb8a6f	[ARM] bottom-top mul support in ARMParallelDSP On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342210	2018-09-14 08:09:09 +00:00
Sam Parker	45c545ce0a	[ARM] Allow truncs as sources in ARM CGP We previously only allowed truncs as sinks, but now allow them as sources too. We do this by checking that the result type is the narrow type that we're trying to optimise for. Differential Revision: https://reviews.llvm.org/D51978 llvm-svn: 342141	2018-09-13 15:14:12 +00:00
Sam Parker	89b3b396fd	[ARM] Fix FixConst for ARMCodeGenPrepare Part of FixConsts wrongly assumes either a 8- or 16-bit constant which can result in the wrong constants being generated during promotion. Differential Revision: https://reviews.llvm.org/D52032 llvm-svn: 342140	2018-09-13 14:48:10 +00:00

9811 Commits