llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
Evandro Menezes	b365e531e0	[AArch64] Fix formatting (NFC) llvm-svn: 372357	2019-09-19 21:48:22 +00:00
Akira Hatanaka	dc9ac88c60	[ObjC][ARC] Skip debug instructions when computing the insert point of objc_release calls This fixes a bug where the presence of debug instructions would cause ARC optimizer to change the order of retain and release calls. rdar://problem/55319419 llvm-svn: 372352	2019-09-19 20:58:51 +00:00
Stanislav Mekhanoshin	837f0cb5f7	[AMDGPU] fixed underflow in getOccupancyWithNumVGPRs The function could return zero if an extreme number or registers were used. Minimal possible occupancy is 1. Differential Revision: https://reviews.llvm.org/D67771 llvm-svn: 372350	2019-09-19 20:09:04 +00:00
David Blaikie	0051a61051	llvm-reduce: Follow-up to 372280, now with more-better msan fixing llvm-svn: 372349	2019-09-19 20:04:04 +00:00
Jakub Kuderski	d1d897b032	Don't use invalidated iterators in FlattenCFGPass Summary: FlattenCFG may erase unnecessary blocks, which also invalidates iterators to those erased blocks. Before this patch, `iterativelyFlattenCFG` could try to increment a BB iterator after that BB has been removed and crash. This patch makes FlattenCFGPass use `WeakVH` to skip over erased blocks. Reviewers: dblaikie, tstellar, davide, sanjoy, asbirlea, grosser Reviewed By: asbirlea Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67672 llvm-svn: 372347	2019-09-19 19:39:42 +00:00
Shoaib Meenai	08db8a450f	[Analysis] Allow -scalar-evolution-max-iterations more than once At present, `-scalar-evolution-max-iterations` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -scalar-evolution-max-iterations option: may only occur zero or one times! ``` It seems reasonable to allow -scalar-evolution-max-iterations to be passed more than once. Quoting the [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed \| documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Original patch by: Enrico Bern Hardy Tanuwidjaja <etanuwid@fb.com> Differential Revision: https://reviews.llvm.org/D67512 llvm-svn: 372346	2019-09-19 18:21:32 +00:00
Jinsong Ji	8695739f8c	[NFC][PowerPC] Fast-isel VSX support test We have fixed most of the VSX limitation in Fast-isel, so we can remove the -mattr=-vsx for most testcases now. llvm-svn: 372345	2019-09-19 18:18:18 +00:00
GN Sync Bot	0d2758f267	gn build: Merge r372343 llvm-svn: 372344	2019-09-19 17:53:03 +00:00
Francesco Petrogalli	bb6e7b26a0	[SVFS] Vector Function ABI demangling. This patch implements the demangling functionality as described in the Vector Function ABI. This patch will be used to implement the SearchVectorFunctionSystem (SVFS) as described in the RFC: http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html A fuzzer is added to test the demangling utility. Patch by Sumedh Arani <sumedh.arani@arm.com> Differential revision: https://reviews.llvm.org/D66024 llvm-svn: 372343	2019-09-19 17:47:32 +00:00
Roman Lebedev	472f610901	[InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251) Summary: This is again motivated by D67122 sanitizer check enhancement. That patch seemingly worsens `-fsanitize=pointer-overflow` overhead from 25% to 50%, which strongly implies missing folds. In this particular case, given ``` char* test(char& base, unsigned long offset) { return &base - offset; } ``` it will end up producing something like https://godbolt.org/z/luGEju which after optimizations reduces down to roughly ``` declare void @use64(i64) define i1 @test(i8* dereferenceable(1) %base, i64 %offset) { %base_int = ptrtoint i8* %base to i64 %adjusted = sub i64 %base_int, %offset call void @use64(i64 %adjusted) %not_null = icmp ne i64 %adjusted, 0 %no_underflow = icmp ule i64 %adjusted, %base_int %no_underflow_and_not_null = and i1 %not_null, %no_underflow ret i1 %no_underflow_and_not_null } ``` Without D67122 there was no `%not_null`, and in this particular case we can "get rid of it", by merging two checks: Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset` Alive proofs: https://rise4fun.com/Alive/QOs The `@llvm.usub.with.overflow` pattern itself is not handled here because this is the main pattern, that we currently consider canonical. https://bugs.llvm.org/show_bug.cgi?id=43251 Reviewers: spatel, nikic, xbolva00, majnemer Reviewed By: xbolva00, majnemer Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67356 llvm-svn: 372341	2019-09-19 17:25:19 +00:00
Alexander Timofeev	e15136d802	[AMDGPU] Unnecessary -amdgpu-scalarize-global-loads=false flag removed from min/max lit tests. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D67712 llvm-svn: 372340	2019-09-19 16:44:38 +00:00
Sanjay Patel	c86655e22d	[Float2Int] avoid crashing on unreachable code (PR38502) In the example from: https://bugs.llvm.org/show_bug.cgi?id=38502 ...we hit infinite looping/crashing because we have non-standard IR - an instruction operand is used before defined. This and other unusual constructs are allowed in unreachable blocks, so avoid the problem by using DominatorTree to step around landmines. Differential Revision: https://reviews.llvm.org/D67766 llvm-svn: 372339	2019-09-19 16:31:17 +00:00
Matt Arsenault	c204981f6f	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Andrea Di Biagio	935e89a564	[MCA] Improved cost computation for loop carried dependencies in the bottleneck analysis. This patch introduces a cut-off threshold for dependency edge frequences with the goal of simplifying the critical sequence computation. This patch also removes the cost normalization for loop carried dependencies. We didn't really need to artificially amplify the cost of loop-carried dependencies since it is already computed as the integral over time of the delay (in cycle). In the absence of backend stalls there is no need for computing a critical sequence. With this patch we early exit from the critical sequence computation if no bottleneck was reported during the simulation. llvm-svn: 372337	2019-09-19 16:05:11 +00:00
Chris Bieneman	6195acfd6f	Make appendCallNB lambda mutable Lambdas are by deafult const so that they produce the same output every time they are run. This lambda needs to set the value on a captured promise which is a mutating operation, so it must be mutable. llvm-svn: 372336	2019-09-19 15:45:12 +00:00
Matt Arsenault	5c09b89d62	X86: Add missing test for vshli SimplifyDemandedBitsForTargetNode This would have caught this regression which triggered the revert of r372285: https://bugs.chromium.org/p/chromium/issues/detail?id=1005750 llvm-svn: 372335	2019-09-19 15:44:00 +00:00
Simon Pilgrim	1a2577191a	[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.). I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 372333	2019-09-19 15:02:47 +00:00
Amaury Sechet	ec179c1a46	[DAGCombiner] Add node to the worklist in topological order in scalarizeExtractedVectorLoad Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66661 llvm-svn: 372327	2019-09-19 14:22:11 +00:00
Francesco Petrogalli	e0aee60bcd	[docs] Break long (>80) line. NFC llvm-svn: 372326	2019-09-19 14:19:32 +00:00
Sanjay Patel	b5b0126d47	[Float2Int] auto-generate complete test checks; NFC llvm-svn: 372324	2019-09-19 13:58:15 +00:00
James Molloy	7fff32705f	[TableGen] Support encoding per-HwMode Much like ValueTypeByHwMode/RegInfoByHwMode, this patch allows targets to modify an instruction's encoding based on HwMode. When the EncodingInfos field is non-empty the Inst and Size fields of the Instruction are ignored and taken from EncodingInfos instead. As part of this promote getHwMode() from TargetSubtargetInfo to MCSubtargetInfo. This is NFC for all existing targets - new code is generated only if targets use EncodingByHwMode. llvm-svn: 372320	2019-09-19 13:39:54 +00:00
Simon Pilgrim	91e656c4fc	[DAG] Add SelectionDAG::MaxRecursionDepth constant As commented on D67557 we have a lot of uses of depth checks all using magic numbers. This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly. Differential Revision: https://reviews.llvm.org/D67711 llvm-svn: 372315	2019-09-19 12:58:43 +00:00
Hans Wennborg	230a0cd001	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00
David Green	0d8d81d28e	[ARM] MVE i1 splat We needn't BFI each lane individually into a predicate register when each lane in the same. A simple sign extend and a vmsr will do. Differential Revision: https://reviews.llvm.org/D67653 llvm-svn: 372313	2019-09-19 12:17:41 +00:00
Owen Reynolds	85e14fae63	Revert [llvm-ar] Include a line number when failing to parse an MRI script Revert r372309 due to buildbot failures Differential Revision: https://reviews.llvm.org/D67449 llvm-svn: 372311	2019-09-19 11:22:59 +00:00
Simon Pilgrim	e0162e73d8	Fix -Wdocumentation "@returns in a void function" warning. NFCI. llvm-svn: 372310	2019-09-19 11:12:04 +00:00
Owen Reynolds	b4ee5eb054	[llvm-ar] Include a line number when failing to parse an MRI script Errors that occur when reading an MRI script now include a corresponding line number. Differential Revision: https://reviews.llvm.org/D67449 llvm-svn: 372309	2019-09-19 10:51:43 +00:00
Simon Pilgrim	dd27f48e31	Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 372308	2019-09-19 10:47:12 +00:00
Serguei Katkov	5ca2dc481a	[Unroll] Add an option to control complete unrolling Add an ability to specify the max full unroll count for LoopUnrollPass pass in pass options. Reviewers: fhahn, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D67701 llvm-svn: 372305	2019-09-19 06:57:29 +00:00
Craig Topper	445a371fb6	[X86] Prevent crash in LowerBUILD_VECTORvXi1 for v64i1 vectors on 32-bit targets when the vector is a mix of constants and non-constant. We need to materialize the constants as two 32-bit values that are casted to v32i1 and then concatenated. llvm-svn: 372304	2019-09-19 06:50:39 +00:00
Sam Parker	4ee8d37373	[ARM] Fix for buildbots I had missed that massive.mir also needed updating. llvm-svn: 372303	2019-09-19 06:50:19 +00:00
Craig Topper	f88cfb2220	[X86] Change a SmallVector& argument to SmallVectorImpl&. NFC Avoids repeating the size. llvm-svn: 372302	2019-09-19 06:27:12 +00:00
Craig Topper	a8c91178c9	[X86] Remove unused argument from a helper function. NFC llvm-svn: 372301	2019-09-19 06:27:07 +00:00
Tom Stellard	a2fa29480b	AMDGPU/SILoadStoreOptimizer: Add const to more functions Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65901 llvm-svn: 372298	2019-09-19 04:39:45 +00:00
Matt Arsenault	b474b21e9d	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle llvm-svn: 372297	2019-09-19 04:11:17 +00:00
Matt Arsenault	981975e5f4	AMDGPU/GlobalISel: RegBankSelect tbuffer load/store These have the same operand structure as the non-t buffer operations. llvm-svn: 372296	2019-09-19 04:11:12 +00:00
Matt Arsenault	b62b57ad18	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format This needs special handling due to some subtargets that have a nonstandard register layout for f16 vectors Also reject some illegal types on other targets. llvm-svn: 372293	2019-09-19 02:35:08 +00:00
Matt Arsenault	5e5b3033c1	AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store llvm-svn: 372292	2019-09-19 02:30:27 +00:00
Matt Arsenault	82762d4636	AMDGPU/GlobalISel: RegBankSelect struct buffer load/store llvm-svn: 372291	2019-09-19 02:26:53 +00:00
Matt Arsenault	7839034e76	AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load\|store} llvm-svn: 372290	2019-09-19 02:25:09 +00:00
Matt Arsenault	afdcbd0c55	AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics Images should always have 2 consecutive, mandatory SGPR arguments. llvm-svn: 372289	2019-09-19 02:23:06 +00:00
Matt Arsenault	65892112fe	Fix typo llvm-svn: 372288	2019-09-19 02:15:29 +00:00
Matt Arsenault	da628cba04	MachineScheduler: Fix assert from not checking subregs The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister. llvm-svn: 372287	2019-09-19 02:14:12 +00:00
Matt Arsenault	e17fe1dae5	AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9 The scalar versions were only introduced in gfx9. llvm-svn: 372286	2019-09-19 01:42:34 +00:00
Matt Arsenault	6df65c514b	GlobalISel: Don't materialize immarg arguments to intrinsics Encode them directly as an imm argument to G_INTRINSIC. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285	2019-09-19 01:33:14 +00:00
GN Sync Bot	4007f026b5	gn build: Merge r372282 llvm-svn: 372283	2019-09-19 01:03:39 +00:00
David Blaikie	d5d1d93e4d	llvm-reduce: Add pass to reduce instructions Patch by Diego Treviño! Differential Revision: https://reviews.llvm.org/D66263 llvm-svn: 372282	2019-09-19 00:59:27 +00:00
David Blaikie	7867f03df9	llvm-reduce: Avoid use-after-free when removing a branch instruction Found my msan buildbot & pointed out by Nico Weber - thanks Nico! llvm-svn: 372280	2019-09-19 00:35:32 +00:00
Alexander Shaposhnikov	7afab9f42f	[Object] Extend MachOUniversalBinary::getObjectForArch Make the method MachOUniversalBinary::getObjectForArch return MachOUniversalBinary::ObjectForArch and add helper methods MachOUniversalBinary::getMachOObjectForArch, MachOUniversalBinary::getArchiveForArch for those who explicitly expect to get a MachOObjectFile or an Archive. Differential revision: https://reviews.llvm.org/D67700 Test plan: make check-all llvm-svn: 372278	2019-09-19 00:02:12 +00:00
Roman Tereshin	af808a61d7	[utils] Add minimal support for MIR inputs to update_llc_test_checks.py update_{llc,mir}_test_checks.py applicability is determined by the output (assembly or MIR), not the input, which makes update_llc_test_checks.py the right tool to generate tests that start at MIR and stop at the final assembly. This commit adds the minimal support for this path. Main limitation that remains: - MIR has to have LLVM IR section, and the CHECK lines will be inserted into the LLVM IR functions that correspond to the MIR functions. Running ../utils/update_llc_test_checks.py --llc-binary ./bin/llc on a slightly modified ../test/CodeGen/X86/bad-tls-fold.mir produces the following diff: +# NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +# RUN: llc %s -o - \| FileCheck %s --- \| target triple = "x86_64-unknown-linux-gnu" @@ -6,17 +7,31 @@ @i = external thread_local global i32 define i32 @or() { + ; CHECK-LABEL: or: + ; CHECK: # %bb.0: # %entry + ; CHECK-NEXT: movq {{.}}(%rip), %rax + ; CHECK-NEXT: orq $7, %rax + ; CHECK-NEXT: movq i@{{.}}(%rip), %rcx + ; CHECK-NEXT: orq %rax, %rcx + ; CHECK-NEXT: movl %fs:(%rcx), %eax + ; CHECK-NEXT: retq entry: ret i32 undef } - define i32 @and() { + ; CHECK-LABEL: and: + ; CHECK: # %bb.0: # %entry + ; CHECK-NEXT: movq {{.}}(%rip), %rax + ; CHECK-NEXT: orq $7, %rax + ; CHECK-NEXT: movq i@{{.}}(%rip), %rcx + ; CHECK-NEXT: andq %rax, %rcx + ; CHECK-NEXT: movl %fs:(%rcx), %eax + ; CHECK-NEXT: retq entry: ret i32 undef } ... (not applied) llvm-svn: 372277	2019-09-18 23:44:17 +00:00

1 2 3 4 5 ...

185145 Commits