llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Luis Marques	2ba3c6d779	[RISCV] Add CostModel GEP tests Differential Revision: https://reviews.llvm.org/D61185 llvm-svn: 362691	2019-06-06 09:47:53 +00:00
Petar Avramovic	660e5789cc	[MIPS GlobalISel] Select fabs Select G_FABS for MIPS32. Differential Revision: https://reviews.llvm.org/D62903 llvm-svn: 362690	2019-06-06 09:22:37 +00:00
Petar Avramovic	6b57826324	[MIPS GlobalISel] Select fpext and fptrunc Select G_FPEXT and G_FPTRUNC for MIPS32. Differential Revision: https://reviews.llvm.org/D62902 llvm-svn: 362689	2019-06-06 09:16:58 +00:00
Petar Avramovic	d52d0b845e	[MIPS GlobalISel] Select floor and ceil Select G_FFLOOR and G_FCEIL for MIPS32. Differential Revision: https://reviews.llvm.org/D62901 llvm-svn: 362688	2019-06-06 09:02:24 +00:00
Sam Parker	ca8a6c73b9	[SCEV] Use wrap flags in InsertBinop If the given SCEVExpr has no (un)signed flags attached to it, transfer these to the resulting instruction or use them to find an existing instruction. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 362687	2019-06-06 08:56:26 +00:00
Dylan McKay	0a57fcb94a	[AVR] Fix the 'load.ll' test after r362351 In that commit, the 'load.ll' test was modified, but still failed. This commit updates the test so that it now passes. llvm-svn: 362684	2019-06-06 08:06:50 +00:00
Amara Emerson	e6558e5f9b	[AArch64][GlobalISel] Add manual selection support for G_ZEXTLOADs to s64. We already get support for G_ZEXTLOAD to s32 from the importer, but it can't deal with the SUBREG_TO_REG in the pattern. Tweaking the existing manual selection code for G_LOAD to handle an additional SUBREG_TO_REG when dealing with G_ZEXTLOAD isn't much work. Also add tests to check the imported pattern selections to s32 work. llvm-svn: 362681	2019-06-06 07:58:37 +00:00
Amara Emerson	a0fe92de9c	[AArch64][GlobalISel] Add the new changes to fix PR42129 that were supposed to go into r362666. The changes weren't staged so ended up just re-commiting the unmodified reverted change. llvm-svn: 362677	2019-06-06 07:33:47 +00:00
Craig Topper	8d331c2481	[X86] Don't turn avx masked.load with constant mask into masked.load+vselect when passthru value is all zeroes. This is intended to enable the use of an immediate blend or more optimal instruction. But if the passthru is zero we don't need any additional instructions. llvm-svn: 362675	2019-06-06 05:41:27 +00:00
Craig Topper	728f4bbb5d	[X86] Add test case for masked load with constant mask and all zeros passthru. avx/avx2 masked loads only support all zeros for passthru in hardware. So we have to emit a blend for all other values. We have an optimization that tries to optimize this blend if the mask is constant. But we don't need to perform this optimization if the passthru value is zero which doesn't need the blend at all. llvm-svn: 362674	2019-06-06 05:41:22 +00:00
Amara Emerson	170eb09636	Revert "Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp"" When looking through copies, make sure to not try to find the vreg def of a physreg. Normally getVRegDef will return nullptr in this case, but if there happens to be multiple defs then it will assert. This fixes PR42129. llvm-svn: 362666	2019-06-05 23:46:16 +00:00
Matt Arsenault	641a689b3a	AMDGPU: Don't fix emergency stack slot at offset 0 This forced the caller to be aware of this, which is an ugly ABI feature. Partially reverts r295877. The original reasons for doing this are mostly fixed. Alloca is now in a non-0 address space, so it should be OK to have 0 as a valid pointer. Since we treat the absolute address as the pointer value, this part only really needed to apply to kernels. Since r357093, we avoid the need to increment/decrement the offset register in more cases, and since r354816 the scavenger can fail without spilling, so it's less critical that we try to avoid an offset that fits in the MUBUF offset. Restrict to callable functions for now to split this into 2 steps to limit thte number of test updates and in case anything breaks. llvm-svn: 362665	2019-06-05 22:37:50 +00:00
Cameron McInally	5ca7fd5c21	[MSAN] Add unary FNeg visitor to the MemorySanitizer Differential Revision: https://reviews.llvm.org/D62909 llvm-svn: 362664	2019-06-05 22:37:05 +00:00
Ulrich Weigand	fba10ebb96	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663	2019-06-05 22:33:10 +00:00
Petr Hosek	76f0779d4e	Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp" This reverts commit r362435 as this triggers ICE, see PR42129 for details. llvm-svn: 362662	2019-06-05 22:27:31 +00:00
Matt Arsenault	6ac0148c30	AMDGPU: Invert frame index offset interpretation Since the beginning, the offset of a frame index has been consistently interpreted backwards. It was treating it as an offset from the scratch wave offset register as a frame register. The correct interpretation is the offset from the SP on entry to the function, before the prolog. Frame index elimination then should select either SP or another register as an FP. Treat the scratch wave offset on kernel entry as the pre-incremented SP. Rely more heavily on the standard hasFP and frame pointer elimination logic, and clean up the private reservation code. This saves a copy in most callee functions. The kernel prolog emission code is still kind of a mess relying on checking the uses of physical registers, which I would prefer to eliminate. Currently selection directly emits MUBUF instructions, which require using a reference to some register. Use the register chosen for SP, and then ignore this later. This should probably be cleaned up to use pseudos that don't refer to any specific base register until frame index elimination. Add a workaround for shaders using large numbers of SGPRs. I'm not sure these cases were ever working correctly, since as far as I can tell the logic for figuring out which SGPR is the scratch wave offset doesn't match up with the shader input initialization in the shader programming guide. llvm-svn: 362661	2019-06-05 22:20:47 +00:00
Joseph Tremoulet	fe7c3cb2a6	[EarlyCSE] Add tests for negated min/max/abs [NFC] Summary: I'm planning to update the hashing logic to recognize their equivalence in a subsequent change (D62644). Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62918 llvm-svn: 362657	2019-06-05 21:30:10 +00:00
Mircea Trofin	3a93ac6600	[CallSite removal] Refactoring llvm::InlineFunction APIs Summary: This change only unifies the API previous API pair accepting CallInst and InvokeInst, thus making it easier to refactor inliner pass ode to CallBase. The implementation of the unified API still relies on the CallSite implementation. Reviewers: eraman, chandlerc, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62283 llvm-svn: 362656	2019-06-05 21:28:13 +00:00
Sanjay Patel	2db45ec884	[InstCombine] simplify code for bitcast of insertelement; NFC llvm-svn: 362655	2019-06-05 21:26:52 +00:00
Matt Arsenault	bea813a1c9	NewGVN: Handle addrspacecast The AllConstant check needs to be moved out of the if/else if chain to avoid a test regression. The "there is no SimplifyZExt" comment puzzles me, since there is SimplifyCastInst. Additionally, the Simplify* calls seem to not see the operand as constant, so this needs to be tried if the simplify failed. llvm-svn: 362653	2019-06-05 21:15:52 +00:00
Craig Topper	29dfffe3a2	[X86] Fix mistake that marked VADDSSrrb_Int/VADDSDrrb_Int/VMULSSrrb_Int/VMULSDrrb_Int as commutable. One of the sources controls the pass through value for the upper bits of the result so we can't really commute it. In practice this problem isn't a functional issue because we would only try to commute this instruction in order to fold a load. But we can't do embedded rounding and fold a load at the same time. So the load fold would never succeed so I don't think we would ever commute or at least keep the version after commuting. llvm-svn: 362647	2019-06-05 21:00:31 +00:00
Whitney Tsang	6bfb9aa78b	[LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, and loop induction variable. Summary: This PR extends the loop object with more utilities to get loop bounds, step, and loop induction variable. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 362644	2019-06-05 20:42:47 +00:00
Tim Northover	5e8439beff	InstCombine: correctly change byval type attribute alongside call args. When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away. llvm-svn: 362643	2019-06-05 20:38:17 +00:00
Tim Northover	3ab3deccc7	IR: make getParamByValType Just Work. NFC. Most parts of LLVM don't care whether the byval type is derived from an explicit Attribute or from the parameter's pointee type, so it makes sense for the main access function to just return the right value. The very few users who do care (only BitcodeReader so far) can find out how it's specified by accessing the Attribute directly. llvm-svn: 362642	2019-06-05 20:37:47 +00:00
Matt Arsenault	65e62e13d9	AMDGPU: Remove amdgpu-max-work-group-size attribute This has been deprecated for a long time, and mesa recently switched to amdgpu-flat-work-group-size. llvm-svn: 362641	2019-06-05 20:32:32 +00:00
Matt Arsenault	2d4d01bd21	AMDGPU: Fix using 2 different enums for same operand flags These enums are really for the same namespace of flags set on arbitrary MachineOperands, so merge them to avoid value collisions. llvm-svn: 362640	2019-06-05 20:32:25 +00:00
Dan Gohman	848edd55f3	[WebAssembly] Limit PIC support to the Emscripten target The current PIC support currently only works with Emscripten, so disable it for other targets. This is the PIC portion of https://reviews.llvm.org/D62542. Reviewed By: dschuff, sbc100 llvm-svn: 362638	2019-06-05 20:01:01 +00:00
Simon Pilgrim	ed4d3e5184	[X86][SSE] Add vector tests to cover more isNegatibleForFree/GetNegatedExpression cases (PR42105) Some already combine correctly, but vector constant analysis is weak. llvm-svn: 362633	2019-06-05 18:55:54 +00:00
Cameron McInally	cd1f261230	[NFC][Reassociate] Fix mistake in 468b2ad Missed 2 'fast fsub(0.0,X) -> fneg(X)' changes. llvm-svn: 362631	2019-06-05 18:50:07 +00:00
Cameron McInally	dc2e0e5492	[NFC][Reassociate] Add unary fneg tests to fast-basictest.ll llvm-svn: 362630	2019-06-05 18:35:54 +00:00
Craig Topper	4d7b02e3a5	[X86] Add the vector integer min/max instructions to isAssociativeAndCommutative. As far as I know these should be freely reassociatable just like the floating point MAXC/MINC instructions. The reduce test changes are largely regressions and caused by the "generic" CPU we default to not having a scheduler model. The machine-combiner-int-vec.ll test shows the positive benefits of this change. Differential Revision: https://reviews.llvm.org/D62787 llvm-svn: 362629	2019-06-05 18:25:09 +00:00
Philip Reames	5c7a7a56cc	[Tests] Add poison inference tests for indvars showing both existing transforms, and some room for improvement llvm-svn: 362628	2019-06-05 18:00:59 +00:00
Cameron McInally	2fec5c8a30	[NFC][Reassociate] Regenerate CHECKs for fast-basictest.ll llvm-svn: 362627	2019-06-05 18:00:27 +00:00
Simon Pilgrim	6ff0ebfacb	Fix shadow local variable warning. NFCI. llvm-svn: 362622	2019-06-05 17:26:29 +00:00
Jonas Devlieghere	fe0b699dbf	[dsymutil] Support more than 4 architectures When running dsymutil on a fat binary, we use temporary files in a small vector of size four. When processing more than 4 architectures, this resulted in a user-after-move, because the temporary files got moved to the heap. Instead of storing an optional temp file, we now use a unique pointer, so the location of the actual temp file doesn't change. We could test this by checking in 5 binaries for 5 different architectures, but this seems wasteful, especially since the number of elements in the small vector is arbitrary. llvm-svn: 362621	2019-06-05 17:14:32 +00:00
Sanjay Patel	afebdfcb0e	[x86] split more 256-bit stores of concatenated vectors As suggested in D62498 - collectConcatOps() matches both concat_vectors and insert_subvector patterns, and we see more test improvements by using the more general match. llvm-svn: 362620	2019-06-05 16:40:57 +00:00
Simon Pilgrim	d77b2c8b61	[X86][AVX] Generalize split256BitStore to splitVectorStore. NFCI. Enables us to use this to split 512-bit vectors in future patches. llvm-svn: 362617	2019-06-05 16:14:14 +00:00
Simon Pilgrim	d0943ed2ea	[X86][SSE] Add additional nt-load test cases as discussed on D62910 llvm-svn: 362616	2019-06-05 16:11:57 +00:00
Whitney Tsang	3eaa8d09fc	Revert "Title: [LOOPINFO] Extend Loop object to add utilities to get the loop" This reverts commit d34797dfc26c61cea19f45669a13ea572172ba34. llvm-svn: 362615	2019-06-05 15:32:56 +00:00
George Rimar	0425d0f266	[llvm-readobj] - Remove TODOs from gnu-hash-symbols.test and demangle.test test cases. We can remove this TODOs now. Differential revision: https://reviews.llvm.org/D62846 llvm-svn: 362614	2019-06-05 15:29:50 +00:00
Dinar Temirbulatov	ed700abffd	[SLP] Fix regression in broadcasts caused by operand reordering patch D59973. This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 . The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found. Please see the lit test for some examples. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D62427 llvm-svn: 362613	2019-06-05 15:26:28 +00:00
Sanjay Patel	18e4404610	[LoopUtils][SLPVectorizer] clean up management of fast-math-flags Instead of passing around fast-math-flags as a parameter, we can set those using an IRBuilder guard object. This is no-functional-change-intended. The motivation is to eventually fix the vectorizers to use and set the correct fast-math-flags for reductions. Examples of that not behaving as expected are: https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast') https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0) D61802 (should be able to reduce with IR-level FMF) Differential Revision: https://reviews.llvm.org/D62272 llvm-svn: 362612	2019-06-05 14:58:04 +00:00
Benjamin Kramer	725f424d90	[LoopInfo] Fix unused variable warning. NFC. llvm-svn: 362610	2019-06-05 14:43:58 +00:00
Whitney Tsang	e5741d7263	Title: [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, and loop induction variable. Summary: This PR extends the loop object with more utilities to get loop bounds, step, and loop induction variable. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 362609	2019-06-05 14:34:12 +00:00
Roman Lebedev	7692594815	[NFC][Codegen][X86] Add AVX2 runline for '(X & (C l>> Y)) ==/!= 0' tests llvm-svn: 362606	2019-06-05 14:08:11 +00:00
Roman Lebedev	169313ec4b	UpdateTestChecks: hexagon support Summary: These tests are being affected by an upcoming patch, so having an understandable (autogenerated) diff is helpful. This target, again, prefers `-march`: ``` llvm/test/CodeGen/Hexagon$ grep -r triple \| wc -l 467 llvm/test/CodeGen/Hexagon$ grep -r march \| wc -l 1167 ``` Reviewers: RKSimon, kparzysz Reviewed By: kparzysz Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62867 llvm-svn: 362605	2019-06-05 14:08:01 +00:00
Petar Avramovic	00c94c775d	[MIPS GlobalISel] Select fcmp Select floating point compare for MIPS32. Differential Revision: https://reviews.llvm.org/D62721 llvm-svn: 362603	2019-06-05 14:03:13 +00:00
George Rimar	fadb9fb7ce	[yaml2obj] - Change how we handle implicit sections. We have a few sections that can be added implicitly to the output: ".dynsym", ".dynstr", ".symtab", ".strtab" and ".shstrtab". Problem appears when such section is listed explicitly in YAML. In that case it's content is written twice: first time during writing of regular sections listed in the document and second time during special handling. Because of that their file offsets can become unexpectedly broken: (yaml file for sample below lists .dynsym explicitly before .text.foo) Before patch: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .dynsym DYNSYM 0000000000000100 00000250 0000000000000030 0000000000000018 A 6 0 8 [ 2] .text.foo PROGBITS 0000000000000200 00000200 0000000000000000 0000000000000000 AX 0 0 0 After patch: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .dynsym DYNSYM 0000000000000100 00000200 0000000000000030 0000000000000018 A 6 0 8 [ 2] .text.foo PROGBITS 0000000000000200 00000230 0000000000000000 0000000000000000 AX 0 0 0 This patch reorganizes our code and fixes the issue described. Differential revision: https://reviews.llvm.org/D62809 llvm-svn: 362602	2019-06-05 13:16:53 +00:00
Sjoerd Meijer	14d9e3e82e	[ARM] Allow "-march=foo+fp" to vary with foo This is the LLVM part of this change, the Clang part contains the full description in its commit message. Differential Revision: https://reviews.llvm.org/D60697 llvm-svn: 362600	2019-06-05 13:11:51 +00:00
Simon Pilgrim	a9ea2c1dd3	[X86][AVX] combineX86ShuffleChain - combine shuffle(extractsubvector(x),extractsubvector(y)) We already handle the case where we combine shuffle(extractsubvector(x),extractsubvector(x)), this relaxes the requirement to permit different sources as long as they have the same value type. This causes a couple of cases where the VPERMV3 binary shuffles occur at a wider width than before, which I intend to improve in future commits - but as only the subvector's mask indices are defined, these will broadcast so we don't see any increase in constant size. llvm-svn: 362599	2019-06-05 12:56:53 +00:00

1 2 3 4 5 ...

179823 Commits