llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

Author	SHA1	Message	Date
Serguei Katkov	4f8d50be22	[BPI] Ignore remainder while distributing the remaining probability from unreachanble This is a follow up patch for https://reviews.llvm.org/rL300440 to address a comment. To make implementation to be consistent with other cases we just ignore the remainder after distribution of remaining probability between reachable edges. If we reduced the probability of some edges coming to unreachable blocks we should distribute the remaining part across other edges coming to reachable blocks to satisfy the condition that sum of all probabilities should be equal to one. If this remaining part is not divided by number of "reachable" edges then we get this remainder. This remainder probability should be pretty small. Other cases just ignore if the sum of probabilities is not equal to one so we do the same. Reviewers: chandlerc, sanjoy, vsk, junbuml, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D32124 llvm-svn: 302883	2017-05-12 07:50:06 +00:00
Craig Topper	5d64cddba6	[APInt] Fix a case where udivrem might delete and create a new allocation instead of reusing the original. llvm-svn: 302882	2017-05-12 07:21:09 +00:00
George Rimar	419a29016a	[Support/Compiler.h] - Use gnu::fallthrough for LLVM_FALLTHROUGH when available. I tried to compile LLD using GCC 7.1.0 and got warnings like "warning: this statement may fall through [-Wimplicit-fallthrough=]" (some more details are here: D32907) GCC's __cplusplus value is 201402L by default, so macro expands to nothing, though GCC 7 has support for [[fallthrough]]. Patch uses gnu::fallthrough when it is available and fixes warning I am observing. Initial idea of way to fix belongs to Davide Italiano. Differential revision: https://reviews.llvm.org/D33036 llvm-svn: 302878	2017-05-12 06:53:48 +00:00
Jonas Paulsson	9727d7a8ef	Handle a COPY with undef source operand in LowerCopy() Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy() with an undef source operand. It is not possible for the target to handle this, as this flag is not passed to TII->copyPhysReg(). This patch solves this by treating such a COPY as an identity COPY. Review: Matthias Braun https://reviews.llvm.org/D32892 llvm-svn: 302877	2017-05-12 06:32:03 +00:00
Mikael Holmen	3048bfca17	[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle Summary: Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot always be trusted) at the end to fixup the CFG we keep the CFG updated as we go along and remove or add branches and merge blocks. This way we won't have any problems if the involved MBBs contain unanalyzable instructions. This fixes PR32721. In that case we had a triangle EBB \| \ \| \| \| TBB \| / FBB where FBB didn't have any successors at all since it ended with an unconditional return. Then TBB and FBB were be merged into EBB, but EBB would still keep its successors, and the use of analyzeBranch and CorrectExtraCFGEdges wouldn't help to remove them since the return instruction is not analyzable (at least not on ARM). Reviewers: kparzysz, iteratee, MatzeB Reviewed By: iteratee Subscribers: aemerson, rengolin, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33037 llvm-svn: 302876	2017-05-12 06:28:58 +00:00
Chandler Carruth	a360b737fa	[PM/Unswitch] Teach the new simple loop unswitch to handle loop invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867	2017-05-12 02:19:59 +00:00
Craig Topper	bf05ecc53e	[APInt] Add a utility method to change the bit width and storage size of an APInt. Summary: This adds a resize method to APInt that manages deleting/allocating storage for an APInt and changes its bit width. Use this to simplify code in copy assignment and divide. The assignment code in particular was overly complicated. Treating every possible case as a separate implementation. I'm also pretty sure the clearUnusedBits code at the end was unnecessary. Since we always copying whole words from the source APInt. All unused bits should be clear in the source. Reviewers: hans, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33073 llvm-svn: 302863	2017-05-12 01:46:01 +00:00
David Blaikie	ea27535943	DWARF: Avoid cross-CU references under Fission Turns out that the Fission/Split DWARF package format (DWP) is currently insufficient to handle cross-CU (ref_addr) references. So for now, duplicate any debug info needed in these situations: * inlined_subroutine's abstract_origin * inlined variable's abstract_origin * types Keep the ref_addr behavior in general, including in the split DWARF inline debug info that can be emitted into the object files for online symbolication. Keep a flag to use the old (ref_addr) behavior for testing ways of addressing this limitation in the DWP tool (& for those not using DWP packaging). llvm-svn: 302858	2017-05-12 01:13:45 +00:00
Dean Michael Berris	619efbbc33	[XRay][lib] Support and temporarily skip over CustomEvent records Summary: In D30630 we will start writing custom event records. To avoid breaking the tools that read the FDR mode records, we skip over these records. To support these custom event records more effectively, we will have to expose them in the trace loading API. Those changes will be forthcoming. Reviewers: kpw, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33032 llvm-svn: 302856	2017-05-12 01:06:41 +00:00
Reid Kleckner	871641ed3e	[git-llvm] Fix svn:eol-style issue for one-file patches llvm-svn: 302853	2017-05-12 00:10:19 +00:00
Peter Collingbourne	5510aada63	CallGraph: Remove almost-unused field 'Root'. llvm-svn: 302852	2017-05-11 23:59:05 +00:00
Dehao Chen	228587901e	Change sample profile writer to make it deterministic. Summary: This patch changes the function profile output order to be deterministic. In order to make it easier to understand, hottest functions (with most total samples) is ordered first. Reviewers: dnovillo, davidxl Reviewed By: dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33111 llvm-svn: 302851	2017-05-11 23:43:44 +00:00
Teresa Johnson	81aa3660f7	Restrict call metadata based hotness detection to Sample PGO mode Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844	2017-05-11 23:18:05 +00:00
Reid Kleckner	9976b72566	Issue diagnostics when returning FP values on x86_64 without SSE1/2 Avoid using report_fatal_error, because it will ask the user to file a bug. If the user attempts to disable SSE on x86_64 and them use floating point, that's a bug in their code, not a bug in the compiler. This is just a start. There are other ways to crash the backend in this configuration, but they should be updated to follow this pattern. Differential Revision: https://reviews.llvm.org/D27522 llvm-svn: 302835	2017-05-11 22:43:02 +00:00
Guozhi Wei	37cf363f24	[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834	2017-05-11 22:17:35 +00:00
Aditya Nandakumar	7cd2083da2	[GISel]: Remove unused lambda captures. NFC https://reviews.llvm.org/D33085 llvm-svn: 302831	2017-05-11 21:56:51 +00:00
Easwaran Raman	7fb8c288b9	Decrease inlinecold-threshold to 45 I ran the test-suite (including SPEC 2006) in PGO mode comparing cold thresholds of 225 and 45. Here are some stats on the text size: Out of 904 tests that ran, 197 see a change in text size. The average text size reduction (of all the 904 binaries) is 1.07%. Of the 197 binaries, 19 see a text size increase, as high as 18%, but most of them are small single source benchmarks. There are 3 multisource benchmarks with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On the other side of the spectrum, 31 benchmarks see >10% size reduction and 6 of them are MultiSource. I haven't run the test-suite with other values of inlinecold-threshold. Since we have a cold callsite threshold of 45, I picked this value. Differential revision: https://reviews.llvm.org/D33106 llvm-svn: 302829	2017-05-11 21:36:28 +00:00
Reid Kleckner	b700ffc4cc	De-virtualize TerminatorInst successor accessors Use the same switch technique to eliminate virtual successor accessors from TerminatorInst. Extracted from D31261. NFC llvm-svn: 302827	2017-05-11 21:26:55 +00:00
Reid Kleckner	c2f82ff8e0	De-virtualize GlobalValue The erase/remove from parent methods now use a switch table to remove themselves from their appropriate parent ilist. The copyAttributesFrom method is now completely non-virtual, since we only ever copy attributes from a global of the appropriate type. Pre-requisite to de-virtualizing Value to save a vptr (https://reviews.llvm.org/D31261). NFC llvm-svn: 302823	2017-05-11 21:14:29 +00:00
Chad Rosier	4962115cec	[AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD. Differential Revision: http://reviews.llvm.org/D33101. llvm-svn: 302822	2017-05-11 20:07:24 +00:00
Davide Italiano	2059f769c1	[AMDGPU] Placate unused variable warning in release builds. llvm-svn: 302821	2017-05-11 19:58:52 +00:00
Vadzim Dambrouski	80fb90f30d	[MSP430] Generate EABI-compliant libcalls Updates the MSP430 target to generate EABI-compatible libcall names. As a byproduct, adjusts the hardware multiplier options available in the MSP430 target, adds support for promotion of the ISD::MUL operation for 8-bit integers, and correctly marks R11 as used by call instructions. Patch by Andrew Wygle. Differential Revision: https://reviews.llvm.org/D32676 llvm-svn: 302820	2017-05-11 19:56:14 +00:00
Davide Italiano	efdc1acac0	[LiveVariables] Switch Kill/Defs sets to be DenseSet(s). The testcase in PR32984 shows a non linear compile time increase after a change that made the LoopUnroll pass more aggressive (increasing the threshold). My profiling shows all the time of PHI elimination goes to llvm::LiveVariables::addNewBlock. This is because we keep Defs/Kills registers in a SmallSet and vfind(const T &V); is O(N). Switching to a DenseSet reduces the time spent in the pass from 297 seconds to 97 seconds. Profiling still shows a lot of time is spent iterating the data structure, so I guess there's room for improvement. Dan tells me GCC uses real set operations for live registers and it takes no-time on this testcase. Matthias points out we might want to switch all this to LiveIntervalAnalysis so it's not entirely sure if a rewrite is worth it. Differential Revision: https://reviews.llvm.org/D33088 llvm-svn: 302819	2017-05-11 19:37:43 +00:00
Craig Topper	0e993cde91	[APInt] Remove an APInt copy from the return of APInt::multiplicativeInverse. llvm-svn: 302816	2017-05-11 18:40:53 +00:00
Craig Topper	b77e278950	[APInt] Fix typo in comment. NFC llvm-svn: 302815	2017-05-11 17:57:43 +00:00
Matt Arsenault	4f44d0e3e0	AMDGPU: Remove tfe bit from flat instruction definitions We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. llvm-svn: 302814	2017-05-11 17:38:33 +00:00
Matt Arsenault	fd9981d4ab	AMDGPU: Pull fneg out of extract_vector_elt This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers. llvm-svn: 302813	2017-05-11 17:26:25 +00:00
Stanislav Mekhanoshin	7c5b44f204	[AMDGPU] Fix incorrect register pressure calculation Earlier fix D32572 introduced a bug where live-ins were calculated for basic block instead of scheduling region. This change fixes it. Differential Revision: https://reviews.llvm.org/D33086 llvm-svn: 302812	2017-05-11 17:16:55 +00:00
Adam Nemet	73703d12a4	[SLP] Emit optimization remarks The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense how far the tree is spanning I've include the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811	2017-05-11 17:06:17 +00:00
Nemanja Ivanovic	58096023de	[PowerPC] Eliminate integer compare instructions - vol. 1 This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810	2017-05-11 16:54:23 +00:00
Simon Pilgrim	d3fba3034e	[DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI. llvm-svn: 302808	2017-05-11 16:40:44 +00:00
Hans Wennborg	50ef7ca1e8	Fix -DLLVM_ENABLE_THREADS=OFF build after r302748 llvm-svn: 302806	2017-05-11 15:32:47 +00:00
Simon Pilgrim	8ee014b2e7	[X86][AVX] Added zeroall/zeroupper scheduler tests Missing on SandyBridge and Btver2 models llvm-svn: 302804	2017-05-11 15:02:49 +00:00
Tim Northover	5684bf9378	Modules: fix modules build. A recent commit made GlobalVariable.h depend on intrinsics generation, so (I think) it needs to be in the lower-level module. I'll confirm with others, but this should fix the bots. llvm-svn: 302803	2017-05-11 14:51:43 +00:00
Javed Absar	a6a50d93e8	[IR] Allow attributes with global variables This patch extends llvm-ir to allow attributes to be set on global variables. An RFC was sent out earlier by my colleague James Molloy: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053100.html A key part of that proposal was to extend LLVM-IR to carry attributes on global variables. This generic feature could be useful for multiple purposes. In our present context, it would be useful to carry user specified sections for bss/rodata/data. Reviewed by: Jonathan Roelofs, Reid Kleckner Differential Revision: https://reviews.llvm.org/D32009 llvm-svn: 302794	2017-05-11 12:28:08 +00:00
Igor Breger	36c9e8bbd2	[GlobalISel][X86] Remove hand-written G_FADD/F_SUB selection. Now it handle by TableGen. llvm-svn: 302793	2017-05-11 12:15:03 +00:00
Ayman Musa	136e8dd266	[X86] Moving X86Local namespace from .cpp to .h file to use it in memory folding TableGen backend. Differential Revision: https://reviews.llvm.org/D32797 llvm-svn: 302791	2017-05-11 11:51:12 +00:00
Ayal Zaks	4c2e1fba70	[LV] Refactor ILV.vectorize{Loop}() by introducing LVP.executePlan(); NFC Introduce LoopVectorizationPlanner.executePlan(), replacing ILV.vectorize() and refactoring ILV.vectorizeLoop(). Method collectDeadInstructions() is moved from ILV to LVP. These changes facilitate building VPlans and using them to generate code, following https://reviews.llvm.org/D28975 and its tentative breakdown. Method ILV.createEmptyLoop() is renamed ILV.createVectorizedLoopSkeleton() to improve clarity; it's contents remain intact. Differential Revision: https://reviews.llvm.org/D32200 llvm-svn: 302790	2017-05-11 11:36:33 +00:00
Alexander Potapenko	d099c88f0a	[msan] Fix PR32842 It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless. This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero). Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0. This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact. For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code: orq %rcx, %rax jne .LBB0_6 , instead of: orl %ecx, %eax testb $1, %al jne .LBB0_6 llvm-svn: 302787	2017-05-11 11:07:48 +00:00
Chandler Carruth	7ba7a0f05a	[x86] Fix a failure to select with AVX-512 when the type legalizer manages to form a VSELECT with a non-i1 element type condition. Those are technically allowed in SDAG (at least, the generic type legalization logic will form them and I wouldn't want to try to audit everything te preclude forming them) so we need to be able to lower them. This isn't too hard to implement. We mark VSELECT as custom so we get a chance in C++, add a fast path for i1 conditions to get directly handled by the patterns, and a fallback when we need to manually force the condition to be an i1 that uses the vptestm instruction to turn a non-mask into a mask. This, unsurprisingly, generates awful code. But it at least doesn't crash. This was actually impacting open source packages built with LLVM for AVX-512 in the wild, so quickly landing a patch that at least stops the immediate bleeding. I think I've found where to fix the codegen quality issue, but less confident of that change so separating it out from the thing that doesn't change the result of any existing test case but causes mine to not crash. llvm-svn: 302785	2017-05-11 10:52:16 +00:00
Simon Pilgrim	41124038ea	Strip trailing whitespace. NFCI. llvm-svn: 302784	2017-05-11 10:03:05 +00:00
Diana Picus	aa41f21260	[ARM][GlobalISel] Legalize narrow scalar ops by widening This is the same as r292827 for AArch64: we widen 8- and 16-bit ADD, SUB and MUL to 32 bits since we only have TableGen patterns for 32 bits. See the commit message for r292827 for more details. At this point we could just remove some of the tests for regbankselect and instruction-select, since we're not going to see any narrow operations at those levels anymore. Instead I decided to update them with G_ANYEXT/G_TRUNC operations, so we can validate the full sequences generated by the legalizer. llvm-svn: 302782	2017-05-11 09:45:57 +00:00
Serge Guelton	02363cd9ec	Remove spurious cast of nullptr. NFC. Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780	2017-05-11 08:53:00 +00:00
Serge Guelton	b5ee247bb9	Remove now useless trailing nullptr in StructType::get llvm-svn: 302779	2017-05-11 08:46:02 +00:00
Diana Picus	97796d8b55	[ARM][GlobalISel] Support for G_ANYEXT G_ANYEXT can be introduced by the legalizer when widening scalars. Add support for it in the register bank info (same mapping as everything else) and in the instruction selector. When selecting it, we treat it as a COPY, just like G_TRUNC. On this occasion we get rid of some assertions in selectCopy so we can reuse it. This shouldn't be a problem at the moment since we're not supporting any complicated cases (e.g. FPR, different register banks). We might want to separate the paths when we do. llvm-svn: 302778	2017-05-11 08:28:31 +00:00
Igor Breger	d4690d9a2f	[GlobalISel][X86] G_ICMP support. Summary: support G_ICMP for scalar types i8/i16/i64. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits, krytarowski Differential Revision: https://reviews.llvm.org/D32995 llvm-svn: 302774	2017-05-11 07:17:40 +00:00
Craig Topper	c1ef9481dc	[APInt] Remove an unneeded extra temporary APInt from toString. Turns out udivrem can write its output to the same location as one of its inputs so the extra temporary isn't needed. llvm-svn: 302772	2017-05-11 07:10:43 +00:00
Craig Topper	584e78cfa3	[APInt] Use negate() instead of copying an APInt to negate it and then writing back over the original value. llvm-svn: 302770	2017-05-11 07:02:04 +00:00
Craig Topper	0a5b5b8c48	[SCEV] Reduce possible APInt allocations a bit. llvm-svn: 302769	2017-05-11 06:48:54 +00:00
Craig Topper	e21ecdd7c2	[SCEV] Remove unneeded 'using namespace APIntOps'. llvm-svn: 302768	2017-05-11 06:48:51 +00:00

1 2 3 4 5 ...

148812 Commits