llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00

Author	SHA1	Message	Date
David Bolvansky	0cc8fb27ad	[InstCombine] Annotate strdup with deref_or_null llvm-svn: 372098	2019-09-17 10:12:48 +00:00
David Bolvansky	e0406acfdf	[NFCI] Fixed buildbots llvm-svn: 372097	2019-09-17 10:03:45 +00:00
Fangrui Song	540b06dd3a	[SimplifyLibCalls] Fix -Wunused-result after D53342/r372091 llvm-svn: 372096	2019-09-17 09:56:55 +00:00
David Bolvansky	23ec32776c	[NFC] Updated test llvm-svn: 372093	2019-09-17 09:45:52 +00:00
Luis Marques	22a56903f8	Patch from Phabricator llvm-svn: 372092	2019-09-17 09:43:08 +00:00
David Bolvansky	4d7d2beaf1	[SimplifyLibCalls] Mark known arguments with nonnull Reviewers: efriedma, jdoerfert Reviewed By: jdoerfert Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D53342 llvm-svn: 372091	2019-09-17 09:32:52 +00:00
George Rimar	019e98e986	[llvm-readobj] - Fix BB after r372087. Seems I forgot to update the number of bytes checked. llvm-svn: 372089	2019-09-17 09:26:49 +00:00
Fangrui Song	590f84d84c	[llvm-ar] Parse 'h' and '-h': display help and exit Support `llvm-ar h` and `llvm-ar -h` because they may be what users try at first. Note, operation 'h' is undocumented in GNU ar. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D67560 llvm-svn: 372088	2019-09-17 09:25:52 +00:00
George Rimar	56a1935729	[llvm-readobj] - Fix a TODO in elf-reloc-zero-name-or-value.test. The "TODO" mentioned was: "Add test for symbol with no name but with a value once yaml2obj allows referencing symbols with no name from relocations." We can do it now. Differential revision: https://reviews.llvm.org/D67609 llvm-svn: 372087	2019-09-17 09:12:10 +00:00
Alexander Timofeev	6b488065a6	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed Differential Revision: https://reviews.llvm.org/D67101 Reviewers: rampitec, vpykhtin llvm-svn: 372086	2019-09-17 09:08:58 +00:00
Sam Parker	7b81fc68ca	[ARM] LE support in ConstantIslands The low-overhead branch extension provides a loop-end 'LE' instruction that performs no decrement nor compare, it just jumps backwards. This patch modifies the constant islands pass to try to insert LE instructions in place of a Thumb2 conditional branch, instead of shrinking it. This only happens if a cmp can be converted to a cbn/z and used to exit the loop. Differential Revision: https://reviews.llvm.org/D67404 llvm-svn: 372085	2019-09-17 09:08:05 +00:00
Florian Hahn	9e1c9bd69d	[LoopUnroll] Use LoopSize+1 as threshold, to allow unrolling loops matching LoopSize. We use `< UP.Threshold` later on, so we should use LoopSize + 1, to allow unrolling if the result won't exceed the loop size. Fixes PR43305. Reviewers: efriedma, dmgreen, paquette Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D67594 llvm-svn: 372084	2019-09-17 09:02:48 +00:00
George Rimar	008671759e	[llvm-readobj] - Refactor the code. It's a straightforward refactoring that allows to simplify and encapsulate the code. Differential revision: https://reviews.llvm.org/D67624 llvm-svn: 372083	2019-09-17 08:53:18 +00:00
George Rimar	88eb61f790	[llvm-objcopy] - Remove python invocations from 2 test cases. It is possible to use yaml2obj to create sections with overlapping sh_offset now. This patch does that. Differential revision: https://reviews.llvm.org/D67610 llvm-svn: 372081	2019-09-17 08:38:53 +00:00
Florian Hahn	a04631bfe3	[bugpoint] Add support for -Oz and properly enable -Os. This patch adds -Oz as an option and also properly enables support for -Os. Currently, the existing check for -Os is dead, because the enclosing if only checks for O1, O2 and O3. There is still a difference between the -Oz pipeline compared to opt, but I have not been able to track that down yet. Reviewers: bogner, sebpop, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D67593 llvm-svn: 372079	2019-09-17 08:14:09 +00:00
Sam Parker	e91af3cf2d	[ARM][MVE] Add invalidForTailPredication to TSFlags Set this bit for the MVE reduction instructions to prevent a loop from becoming tail predicated in their presence. Differential Revision: https://reviews.llvm.org/D67444 llvm-svn: 372076	2019-09-17 07:43:04 +00:00
Hideto Ueno	e74a978cb1	[Attributor] Use Alias Analysis in noalias callsite argument deduction Summary: This patch adds a check of alias analysis in `noalias` callsite argument deduction. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67604 llvm-svn: 372075	2019-09-17 06:53:27 +00:00
Hideto Ueno	e89a3c1190	[Attributor] Create helper struct for handling analysis getters Summary: This patch introduces a helper struct `AnalysisGetter` to put together analysis getters. In this patch, a getter for `AAResult` is also added for `noalias`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67603 llvm-svn: 372072	2019-09-17 05:45:18 +00:00
David Zarzycki	a2114177f9	[git-llvm] Do not reinvent `@{upstream}` (take 2) This makes git-llvm more of a thin wrapper around git while temporarily maintaining backwards compatibility with past git-llvm behavior. Using @{upstream} makes git-llvm more robust when used with a nontrivial local repository. https://reviews.llvm.org/D67389 llvm-svn: 372070	2019-09-17 04:44:13 +00:00
Craig Topper	82fb049153	[X86] Split oversized vXi1 vector arguments and return values into scalars on avx512 targets. Previously we tried to split them into narrower v64i1 or v16i1 pieces that each got promoted to vXi8 and then passed in a zmm or xmm register. But this crashes when you need to pass more pieces than available registers reserved for argument passing. The scalarizing done here generates much longer and slower code, but is consistent with the behavior of avx2 and earlier targets for these types. Fixes PR43323. llvm-svn: 372069	2019-09-17 04:41:14 +00:00
Craig Topper	ef8177e003	[X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast load to avoid a copy. The BLENDM instructions allow 2 sources and an independent destination while masked VBROADCAST has the destination tied to the source. llvm-svn: 372068	2019-09-17 04:41:10 +00:00
Craig Topper	6a4f10bb7e	[X86] Add support for commuting EVEX VCMP instructions with any immediate value. Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates. llvm-svn: 372067	2019-09-17 04:41:05 +00:00
Craig Topper	9a1ee6dff6	[X86] Add test case for missed opportunity to commute a VCMP instruction after unfolding one load in order to fold another load. llvm-svn: 372066	2019-09-17 04:41:01 +00:00
Craig Topper	fdaaecbd28	[X86] Enable commuting of EVEX VCMP for all immediate values during isel. llvm-svn: 372065	2019-09-17 04:40:58 +00:00
David Blaikie	91ba4f99df	llvm-reduce: Clean out previous test temp/output dir, since it was a dir and now it's used as just a single file llvm-svn: 372054	2019-09-16 23:56:26 +00:00
David Blaikie	1018faa8e7	llvm-reduce: Remove some string copies llvm-svn: 372053	2019-09-16 23:54:57 +00:00
Joel E. Denny	f04bbad688	Revert r372035: "[lit] Make internal diff work in pipelines" This breaks a Windows bot. llvm-svn: 372051	2019-09-16 23:47:46 +00:00
Amara Emerson	b69135c67b	[GlobalISel] Partially revert r371901. r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050	2019-09-16 23:46:03 +00:00
David Blaikie	e6945e5ce7	llvm-reduce: Make tests shell-independent by passing the interpreter on the command line rather than using #! in the test file llvm-svn: 372049	2019-09-16 23:41:19 +00:00
David L. Jones	51503f4f31	Add libc to path mappings in git-llvm. llvm-svn: 372048	2019-09-16 23:36:35 +00:00
Nemanja Ivanovic	7cde51ec49	[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32 Add the missing piece of r372029. Somehow when the patch for review D61961 was committed, only the test case went in and the code didn't. This of course caused all kinds of build bot breaks. This patch just adds the code for that patch. Author: Lei Huang Differential revision: https://reviews.llvm.org/D61961 llvm-svn: 372043	2019-09-16 22:54:52 +00:00
Francis Visoiu Mistrih	fb3304cda8	[Remarks] Allow remarks::Format::YAML to take a string table It should be allowed to take a string table in case all the strings in the remarks point there, but it shouldn't use it during serialization. llvm-svn: 372042	2019-09-16 22:45:17 +00:00
Joel E. Denny	4878f8e5d6	[lit] Make internal diff work in pipelines When using lit's internal shell, RUN lines like the following accidentally execute an external `diff` instead of lit's internal `diff`:
```
# RUN: program | diff file -
# RUN: not diff file1 file2 | FileCheck %s
```
Such cases exist now, in `clang/test/Analysis` for example. We are preparing patches to ensure lit's internal `diff` is called in such cases, which will then fail because lit's internal `diff` cannot currently be used in pipelines and doesn't recognize `-` as a command-line option. To enable pipelines, this patch moves lit's `diff` implementation into an out-of-process script, similar to lit's `cat` implementation. A follow-up patch will implement `-` to mean stdin. Reviewed By: probinson, stella.stamenova Differential Revision: https://reviews.llvm.org/D66574 llvm-svn: 372035	2019-09-16 21:22:29 +00:00
Bardia Mahjour	7e0a26b645	[NFC] Test commit access llvm-svn: 372033	2019-09-16 20:44:15 +00:00
DeForest Richards	c7e199a8c6	[Docs] Bug fix for docs homepage Removes reference to non-existent Reference Documentation page. llvm-svn: 372032	2019-09-16 20:29:56 +00:00
DeForest Richards	6aac682bcb	[Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage Adds a section for Getting Started/Tutorials and Reference topics to the LLVM docs homepage. llvm-svn: 372031	2019-09-16 20:19:32 +00:00
Lei Huang	05ac4244eb	[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32 This is a follow up patch from https://reviews.llvm.org/D57857 to handle extract_subvector v4f32. For cases where we fpext of v2f32 to v2f64 from extract_subvector we currently generate on P9 the following:
  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)
This patch custom lower it to the following sequence:
  lxv 0, 0(3)      # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0  # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0  # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2    # FP-extend to <d0, d1>
  xvcvspdp 3, 3    # FP-extend to <d2, d3>
  stxv 2, 0(5)     # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)     # Store <d2, d3> (%vecinit4)
Differential Revision: https://reviews.llvm.org/D61961 llvm-svn: 372029	2019-09-16 20:04:15 +00:00
Vedant Kumar	1ab21606e7	[Coverage] Speed up file-based queries for coverage info, NFC Speed up queries for coverage info in a file by reducing the amount of time spent determining whether a function record corresponds to a file. This gives a 36% speedup when generating a coverage report for `llc`. The reduction is entirely in user time. rdar://54758110 Differential Revision: https://reviews.llvm.org/D67575 llvm-svn: 372025	2019-09-16 19:08:44 +00:00
Vedant Kumar	53a68e5af8	[Coverage] Assert that filenames in a TU are unique, NFC llvm-svn: 372024	2019-09-16 19:08:41 +00:00
Steven Wu	7fb3882581	[LTO][Legacy] Add new C interface to query libcall functions Summary: This is needed to implement the same approach as lld (implemented in r338434) for handling symbols that can be generated by the LTO code generator but are not present in the symbol table, for linkers that use the legacy C APIs. libLTO is in charge of providing the list of symbols. Linker is in charge of implementing the eager loading from static libraries using the list of symbols. rdar://problem/52853974 Reviewers: tejohnson, bd1976llvm, deadalnix, espindola Reviewed By: tejohnson Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67568 llvm-svn: 372021	2019-09-16 18:49:54 +00:00
Reid Kleckner	f30ad55559	[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups This fixes relocations against __profd_ symbols in discarded sections, which is PR41380. In general, instrumentation happens very early, and optimization and inlining happens afterwards. The counters for a function are calculated early, and after inlining, counters for an inlined function may be widely referenced by other functions. For C++ inline functions of all kinds (linkonce_odr & available_externally mainly), instr profiling wants to deduplicate these __profc_ and __profd_ globals. Otherwise the binary would be quite large. I made __profd_ and __profc_ comdat in r355044, but I chose to make __profd_ internal. At the time, I was only dealing with coverage, and in that case, none of the instrumentation needs to reference __profd_. However, if you use PGO, then instrumentation passes add calls to __llvm_profile_instrument_range which reference __profd_ globals. The solution is to make these globals externally visible by using linkonce_odr linkage for data as was done for counters. This is safe because PGO adds a CFG hash to the names of the data and counter globals, so if different TUs have different globals, they will get different data and counter arrays. Reviewers: xur, hans Differential Revision: https://reviews.llvm.org/D67579 llvm-svn: 372020	2019-09-16 18:49:09 +00:00
Roman Lebedev	315cdcecbb	[ARM][Codegen] Autogenerate arm-cgp-casts.ll test. Apparently it got broken by r372009 while I thought it was r372012. llvm-svn: 372019	2019-09-16 18:28:22 +00:00
Simon Pilgrim	64045ff52d	[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector. llvm-svn: 372013	2019-09-16 17:30:33 +00:00
David Green	4c69ced591	[ARM] A predicate cast of a predicate cast is a predicate cast This adds some very basic folding of PREDICATE_CASTS, removing cases when they are chained together. These would already be removed eventually, as these are lowered to copies. This just allows it to happen earlier, which can help other simplifications. Differential Revision: https://reviews.llvm.org/D67591 llvm-svn: 372012	2019-09-16 17:29:07 +00:00
Roman Lebedev	03e9b9d4a0	[SimplifyCFG] FoldTwoEntryPHINode(): consider total speculation cost, not per-BB cost Summary: Previously, if the threshold was 2, we were willing to speculatively execute 2 cheap instructions in both basic blocks (thus we were willing to speculatively execute cost = 4), but weren't willing to speculate when one BB had 3 instructions and the other one had no instructions, even though that would have a total cost of 3. This looks inconsistent to me. I don't think `cmov`-like instructions will start executing until both of its inputs are available: https://godbolt.org/z/zgHePf So I don't see why the existing behavior is the correct one. Also, let's add its own `cl::opt` for this threshold, with default=4, so it is not stricter than the previous threshold: it will allow folding when there are 2 BBs each with cost=2. And since the logic has changed, it will also allow folding when one BB has cost=3 and the other cost=1, or there is only one BB with cost=4. This is an alternative solution to D65148: This fix is mainly motivated by `signbit-like-value-extension.ll` test. That pattern comes up in JPEG decoding, see e.g. `Figure F.12 – Extending the sign bit of a decoded value in V` of `ITU T.81` (JPEG specification). That branch is not predictable, and it is within the innermost loop, so the fact that that pattern ends up being stuck with a branch instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial. 
This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric | old | new | delta | % |
| x86-mi-counting.NumMachineFunctions | 37720 | 37721 | 1 | 0.00% |
| x86-mi-counting.NumMachineBasicBlocks | 773545 | 771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR | 135770 | 135543 | -227 | -0.17% |
| x86-mi-counting.NumCondBR | 423753 | 422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV | 24815 | 25731 | 916 | 3.69% |
| x86-mi-counting.NumVecBlend | 17 | 17 | 0 | 0.00% |
We significantly decrease basic block count, notably decrease instruction count, significantly decrease branch count and very significantly increase `cmov` count. Performance-wise, unsurprisingly, this has great effect on target RawSpeed benchmark. I'm seeing 5 major improvements:
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean -0.3064 -0.3064 226.9913 157.4452 226.9800 157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median -0.3057 -0.3057 226.8407 157.4926 226.8282 157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev -0.4985 -0.4954 0.3051 0.1530 0.3040 0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean -0.1747 -0.1747 80.4787 66.4227 80.4771 66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median -0.1742 -0.1743 80.4686 66.4542 80.4690 66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev +0.6089 +0.5797 0.0670 0.1078 0.0673 0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean -0.1598 -0.1598 171.6996 144.2575 171.6915 144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median -0.1598 -0.1597 171.7109 144.2755 171.7018 144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev +0.4024 +0.3850 0.0847 0.1187 0.0848 0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean -0.0550 -0.0551 280.3046 264.8800 280.3017 264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median -0.0554 -0.0554 280.2628 264.7360 280.2574 264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev +0.7005 +0.7041 0.2779 0.4725 0.2775 0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean -0.0354 -0.0355 316.7396 305.5208 316.7342 305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median -0.0354 -0.0356 316.6969 305.4798 316.6917 305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev +0.0493 +0.0330 0.3562 0.3737 0.3563 0.3681
```
That being said, it's always best-effort, so there will likely be cases where this worsens things. Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc Reviewed By: jmolloy Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67318 llvm-svn: 372009	2019-09-16 16:18:24 +00:00
Sanjay Patel	d0f836d87a	[InstCombine] remove unneeded one-use checks for icmp fold Related folds were added in: rL125734 ...the code comment about register pressure is discussed in more detail in: https://bugs.llvm.org/show_bug.cgi?id=2698 But 10 years later, perf testing bzip2 with this change now shows a slight (0.2% average) improvement on Haswell although that's probably within test noise. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 and rL371981 are related patches in this series. llvm-svn: 372007	2019-09-16 16:15:25 +00:00
Sanjay Patel	e419f46690	[InstCombine] move tests for icmp+add; NFC llvm-svn: 372004	2019-09-16 15:33:40 +00:00
Oliver Cruickshank	b9bb673ebd	[ARM] Add patterns for BSWAP intrinsic on MVE BSWAP can use the VREV instruction on MVE to produce better results than expanding. llvm-svn: 372002	2019-09-16 15:20:10 +00:00
Oliver Cruickshank	9a94a62358	[ARM] Add patterns for bitreverse intrinsic on MVE BITREVERSE can use the VBRSR which will reverse and right shift. Shifting right by 0 will just reverse the bits. llvm-svn: 372001	2019-09-16 15:20:03 +00:00
Oliver Cruickshank	486452e5ee	[ARM] Lower CTTZ on MVE Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and count the leading zeros, equivalent to a count trailing zeros (CTTZ). llvm-svn: 372000	2019-09-16 15:19:56 +00:00
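
The CTTZ lowering described in the last commit above (r372000) rests on the identity that counting trailing zeros is the same as reversing the bits and then counting leading zeros. A minimal scalar sketch of that identity on 32-bit values follows. The helper names `bitreverse32`, `ctlz32` and `cttz32` are hypothetical and chosen for illustration only; this is not the LLVM lowering code and says nothing about which MVE instructions are actually selected.

```python
MASK32 = 0xFFFFFFFF  # assumption: illustrate with 32-bit lanes


def bitreverse32(x: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    x &= MASK32
    result = 0
    for _ in range(32):
        # Shift the lowest bit of x into the low end of result.
        result = (result << 1) | (x & 1)
        x >>= 1
    return result


def ctlz32(x: int) -> int:
    """Count leading zeros of a 32-bit value (32 when x is 0)."""
    return 32 - (x & MASK32).bit_length()


def cttz32(x: int) -> int:
    """Count trailing zeros as ctlz(bitreverse(x))."""
    return ctlz32(bitreverse32(x))


if __name__ == "__main__":
    for value in (1, 8, 0x80000000, 0):
        print(hex(value), cttz32(value))
```

Reversing first turns the lowest set bit into the highest set bit, so the leading-zero count of the reversed value equals the trailing-zero count of the original. A zero input yields 32 under this sketch's convention.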


185129 Commits