llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-18 18:42:46 +02:00

Author	SHA1	Message	Date
Craig Topper	2d70f5fabd	Revert "[X86] Increase the number of instructions searched for isSafeToClobberEFLAGS in a couple places" This reverts commit 44b260cb0aab387d85e4d59c16fc7b8866264f5e. I messed up the bug number in the commit message so I'm reverting to fix it.	2020-08-08 11:53:14 -07:00
Dávid Bolvanský	b26d8d1ccf	[FileCheckTest] Suppress new warning	2020-08-08 20:45:24 +02:00
Simon Pilgrim	9b68c08ed2	[X86][SSE] combineTargetShuffle - use scaleShuffleMask helper to widen shuffle mask. NFCI. Use scaleShuffleMask helper for the shuffle(hadd,hadd) canonicalization.	2020-08-08 19:36:18 +01:00
Craig Topper	b998790f91	[X86] Increase the number of instructions searched for isSafeToClobberEFLAGS in a couple places Previously this function searched 4 instructions forwards or backwards to determine if it was ok to clobber eflags. This is called in 3 places: rematerialization, turning 2 operand leas into adds or splitting 3 ops leas into an lea and add on some CPU targets. This patch increases the search limit to 10 instructions for rematerialization and 2 operand lea to add. I've left the old threshold for 3 ops lea splitting as that increases code size. Fixes PR47024 and PR43014	2020-08-08 11:29:41 -07:00
Simon Pilgrim	733733f727	[InstCombine] Use CreateVectorSplat(ElementCount) variant directly This was introduced at rGe20223672100, and the CreateVectorSplat(unsigned NumElements) variant calls it internally	2020-08-08 19:26:02 +01:00
Roman Lebedev	903dd081e7	[SimplifyCFG] Fix invoke->call fold w/ multiple invokes in presence of lifetime intrinsics SimplifyCFG has two main folds for resumes - one when resume is directly using the landingpad, and the other one where resume is using a PHI node. While for the first case, we were already correctly ignoring all the PHI nodes, and both the debug info intrinsics and lifetime intrinsics, in the PHI-based-one, we weren't ignoring PHI's in the resume block, and weren't ignoring lifetime intrinsics. That is clearly a bug. On RawSpeed library, this results in +9.34% (+81) more invoke->call folds, -0.19% (-39) landing pads, -0.24% (-81) invoke instructions but +51 call instructions and -132 basic blocks. Though, the run-time performance impact appears to be within the noise.	2020-08-08 20:00:28 +03:00
Roman Lebedev	5027b49a96	[NFC][SimplifyCFG] Rewrite isCleanupBlockEmpty() to be iterator_range-based	2020-08-08 20:00:28 +03:00
Roman Lebedev	b96154f169	[NFC][SimplifyCFG] Add a test showing invoke->call simplification failure	2020-08-08 20:00:28 +03:00
Roman Lebedev	152f29684a	[NFC][SimplifyCFG] Count the number of invokes turned into calls due to empty cleanup blocks	2020-08-08 20:00:27 +03:00
Sanjay Patel	33d6e8d6a8	[DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division, part 2 Follow-up to D82716 / rGea71ba11ab11 We do not have the fabs removal fold in IR yet for the case where the sqrt operand is repeated, so that's another potential improvement.	2020-08-08 10:38:06 -04:00
Sanjay Patel	d58ed6d476	[x86] add tests for another reciprocal sqrt pattern; NFC	2020-08-08 10:38:06 -04:00
Benjamin Kramer	ba661f5c49	lib/CodeGen doesn't depend on lib/Passes.	2020-08-08 13:40:24 +02:00
Rainer Orth	2697fb8b1e	[test][DebugInfo] Adapt two tests for Sun assembler syntax on Sparc Two DebugInfo tests currently `FAIL` on Sparc: LLVM :: DebugInfo/Generic/2010-06-29-InlinedFnLocalVar.ll LLVM :: DebugInfo/Generic/array.ll both in a similar way. E.g. : 'RUN: at line 1'; /var/llvm/local-sparcv9-A/bin/llc -O2 /vol/llvm/src/llvm-project/local/llvm/test/DebugInfo/Generic/2010-06-29-InlinedFnLocalVar.ll -o - | /var/llvm/local-sparcv9-A/bin/FileCheck /vol/llvm/src/llvm-project/local/llvm/test/DebugInfo/Generic/2010-06-29-InlinedFnLocalVar.ll /vol/llvm/src/llvm-project/local/llvm/test/DebugInfo/Generic/2010-06-29-InlinedFnLocalVar.ll:4:10: error: CHECK: expected string not found in input ; CHECK: debug_info, ^ On `amd64-pc-solaris2.11`, the corresponding line is .section .debug_info,"",@progbits while on `sparcv9-sun-solaris2.11` we have only .section .debug_info This happens because Sparc currently emits `.section` directives using the style of the Solaris/SPARC assembler (controlled by `SunStyleELFSectionSwitchSyntax`). This patch takes the easy way out and allows both forms while tightening the check to only match the `.section` directive. Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, `x86_64-pc-linux-gnu`, and `x86_64-apple-darwin20.0.0`. Differential Revision: https://reviews.llvm.org/D85414	2020-08-08 09:13:47 +02:00
Juneyoung Lee	8d60604a96	[InstCombine] Optimize select(freeze(icmp eq/ne x, y), x, y) This patch adds an optimization that folds select(freeze(icmp eq/ne x, y), x, y) to x or y. This was needed to resolve slowdown after D84940 is applied. I tried to bake this logic into foldSelectInstWithICmp, but it wasn't clear. This patch conservatively writes the pattern in a separate function, foldSelectWithFrozenICmp. The output does not need freeze; https://alive2.llvm.org/ce/z/X49hNE (from @nikic) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85533	2020-08-08 15:22:29 +09:00
Juneyoung Lee	2677ef03a3	[InstCombine] Add tests for select(freeze(icmp x, y), x, y); NFC	2020-08-08 15:09:08 +09:00
Craig Topper	b2ec81eb93	[X86] Limit the scope of the min/max canonicalization in combineSelect Previously the transform was doing these two canonicalizations (x > y) ? x : y -> (x >= y) ? x : y (x < y) ? x : y -> (x <= y) ? x : y But those don't seem to be useful generally. And they actively pessimize the cases in PR47049. This patch limits it to (x > 0) ? x : 0 -> (x >= 0) ? x : 0 (x < -1) ? x : -1 -> (x <= -1) ? x : -1 These are the cases mentioned in the comments as the motivation for the canonicalization. These allow the CMOV to use the S flag from the compare thus improving opportunities to use a TEST or the flags from an arithmetic instruction.	2020-08-07 22:51:49 -07:00
Keno Fischer	d9083126e6	[X86] Don't produce bad x86andp nodes for i1 vectors In D85499, I attempted to fix this same issue by canonicalizing andnp for i1 vectors, but since there was some opposition to such a change, this commit just fixes the bug by using two different forms depending on which kind of vector type is in use. We can then always decide to switch the canonical forms later. Description of the original bug: We have a DAG combine that tries to fold (vselect cond, 0000..., X) -> (andnp cond, x). However, it does so by attempting to create an i64 vector with the number of elements obtained by truncating division by 64 from the bitwidth. This is bad for mask vectors like v8i1, since that division is just zero. Besides, we don't want i64 vectors anyway. For i1 vectors, switch the pattern to (andnp (not cond), x), which is the canonical form for `kandn` on mask registers. Fixes https://github.com/JuliaLang/julia/issues/36955. Differential Revision: https://reviews.llvm.org/D85553	2020-08-07 20:05:47 -04:00
LLVM GN Syncbot	f4add55acc	[gn build] Port f5b5ccf2a68	2020-08-07 23:43:14 +00:00
Yuanfang Chen	526e29b5b9	Reland "Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager"" This relands commit 320eab2d558fde0b61437e9b9075bfd301c2c474. The test failed because it was looking for x86-linux target unconditionally. Now it gets the default target.	2020-08-07 16:40:49 -07:00
Matt Arsenault	dee6e5cc20	AMDGPU: Avoid explicitly listing all the memory nodes	2020-08-07 19:22:46 -04:00
Vitaly Buka	3e043ca45f	[NFC][StackSafety] Fix statistics	2020-08-07 16:18:52 -07:00
Arthur Eubanks	834a6fc438	[NewPM] Print 'Skipping pass' as pass instrumentation If OptNoneInstrumentation prints it instead, 'Skipping pass' will print for even required passes. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D85493	2020-08-07 15:02:02 -07:00
Mircea Trofin	be6dadeffd	[NFC][MLInliner] Refactor logging implementation This prepares it for logging externally-specified outputs. Differential Revision: https://reviews.llvm.org/D85451	2020-08-07 14:56:56 -07:00
Sameer Arora	629a98ce69	[llvm-libtool-darwin] Add support for -D and -U options Add support for `-D` and `-U` options for llvm-libtool-darwin. `-D` allows for using zero for timestamps and UIDs/GIDs. `-U` allows for using actual timestamps and UIDs/GIDs. Reviewed by jhenderson, smeenai Differential Revision: https://reviews.llvm.org/D84209	2020-08-07 14:44:32 -07:00
Sameer Arora	a753e1d6dd	[llvm-libtool-darwin] Add support for -filelist option Add support for `-filelist` option for llvm-libtool-darwin. `-filelist` option allows for passing in a file containing a list of filenames. Reviewed by jhenderson, smeenai Differential Revision: https://reviews.llvm.org/D84206	2020-08-07 14:29:24 -07:00
Sameer Arora	df248b9fac	[llvm-libtool-darwin] Add constant CPU_SUBTYPE_ARM64_V8 Add support for constant MachO::CPU_SUBTYPE_ARM64_V8. This constant is needed so as to match `llvm-libtool-darwin`'s behavior to that of cctools' libtool when `-arch_only` flag is passed in on command line. Reviewed by jhenderson, alexshap, smeenai Differential Revision: https://reviews.llvm.org/D85041	2020-08-07 14:09:27 -07:00
Vitaly Buka	6669d78639	Revert "[StackSafety] Skip ambiguous lifetime analysis" This reverts commit 0b2616a8045cb776ea1514c3401d0a8577de1060. Crashes with safe-stack.	2020-08-07 14:02:50 -07:00
Vitaly Buka	ae3ad63414	[StackSafety,NFC] Add Stats counters	2020-08-07 14:02:50 -07:00
Sameer Arora	a27cdf02e5	Add symlinks for `libtool` and `install_name_tool` Add symlinks for `llvm-libtool-darwin` and `llvm-install-name-tool`. Reviewed by jhenderson, smeenai Differential Revision: https://reviews.llvm.org/D85054	2020-08-07 13:46:36 -07:00
Matt Arsenault	fc0dd4b853	GlobalISel: Handle zext(sext x) in artifact combiner This eliminates the illegal intermediate s8 value in the added test.	2020-08-07 16:37:46 -04:00
Sameer Arora	218d68982d	[FileCheck] Add docs for --allow-empty This diff adds documentation for `allow-empty` flag under FileCheck docs. Reviewed by jhenderson, smeenai, thopre Differential Revision: https://reviews.llvm.org/D83682	2020-08-07 13:27:57 -07:00
Sameer Arora	8d6c074d8c	[llvm-install-name-tool] Adds docs for llvm-install-name-tool Adding documentation for llvm-install-name-tool. Reviewed by smeenai, Ktwu Differential Revision: https://reviews.llvm.org/D81944	2020-08-07 12:51:58 -07:00
Gui Andrade	d255b7f7cb	Revert "[MSAN] Instrument libatomic load/store calls" Problems with instrumenting atomic_load when the call has no successor, blocking compiler roll This reverts commit 33d239513c881d8c11c60d5710c55cf56cc309a5.	2020-08-07 19:45:51 +00:00
LLVM GN Syncbot	79c68f2fb8	[gn build] Port 320eab2d558	2020-08-07 19:01:40 +00:00
Yuanfang Chen	cee8d8ef70	Revert "[NewPM][CodeGen] Introduce machine pass and machine pass manager" This reverts commit 911565d1085d9447363fe8ad041817436c4998fe. Broke some non-Linux bots.	2020-08-07 11:59:58 -07:00
Jianzhou Zhao	b6143710cb	Reduce dropTriviallyDeadConstantArrays cumulative time percentage from 17% to 4% The history of dropTriviallyDeadConstantArrays is like this. Because the appending linkage uses too much memory (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150105/251381.html), dropTriviallyDeadConstantArrays was introduced (https://reviews.llvm.org/rG81f385b0c6ea37dd7195a65be162c75bbdef29d2) to release unused constant arrays. Recently, dropTriviallyDeadConstantArrays was improved (https://reviews.llvm.org/rG81f385b0c6ea37dd7195a65be162c75bbdef29d2) to reduce its quadratic cost. Our recent LTO profiling shows that when a target is large, 15-20% of time cost is from the SetVector::insert called by dropTriviallyDeadConstantArrays. A large application has hundreds or thousands of modules; each module calls dropTriviallyDeadConstantArrays once to clean up the tens of thousands of ConstantArrays the module has. Of those ConstantArrays, usually around 5 can be deleted, and very few of the deleted ConstantArrays reference other ConstantArrays: less than 10 out of millions. Given this, the cost of SetVector::insert is mainly from the construction of WorkList from ArrayConstants. This motivated the fix that iterates ArrayConstants directly, and uses WorkList only when necessary. Our evaluation shows that 1) The cumulative time percentage of dropTriviallyDeadConstantArrays is reduced from 15-17% to 4-6%. 2) For targets with LTO time > 20min, the time reduction is about 20%. 3) No observable performance impact for builds without LTO. Reviewed By: mehdi_amini, tejohnson, jdoerfert Differential Revision: https://reviews.llvm.org/D85379	2020-08-07 11:36:30 -07:00
Arthur Eubanks	8c7c923e8d	[PPC] Rename bool-ret-to-int -> ppc-bool-ret-to-int Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85391	2020-08-07 11:27:05 -07:00
LLVM GN Syncbot	1bc795778b	[gn build] Port 911565d1085	2020-08-07 18:22:24 +00:00
Arthur Eubanks	cec3bdab26	[NFC] Use value initializer for OVERLAPPED To fix ../llvm/lib/Support/Windows/Path.inc(1265,21): warning: missing field 'InternalHigh' initializer [-Wmissing-field-initializers] OVERLAPPED OV = {0}; Differential Revision: https://reviews.llvm.org/D85480	2020-08-07 11:18:33 -07:00
Vang Thao	0c2a21406a	[AMDGPU] Fix not rescheduling without clustering Regions are sometimes skipped which should be rescheduled without memory op clustering. RegionIdx is not incremented when iterating over regions that are flagged to be skipped, causing the index to be incorrect. Thanks to Vang Thao for discovering this bug! Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D85498	2020-08-07 11:15:58 -07:00
Yuanfang Chen	4240996330	[NewPM][CodeGen] Introduce machine pass and machine pass manager A machine pass can define four methods: - `PreservedAnalyses run(MachineFunction &, MachineFunctionAnalysisManager &)` - `Error doInitialization(Module &, MachineFunctionAnalysisManager &)` - `Error doFinalization(Module &, MachineFunctionAnalysisManager &)` - `Error run(Module &, MachineFunctionAnalysisManager &)` Machine pass manager: - MachineFunctionAnalysisManager: Basically an AnalysisManager<MachineFunction> augmented with the ability to register and query IR analyses - MachineFunctionPassManager: supports only two methods, `addPass` and `run` Reviewed By: arsenm, asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D67687	2020-08-07 11:00:31 -07:00
Yuanfang Chen	377ad5f083	[NewPM] Only verify loop for nonskipped user loop pass No verification for pass managers since it is not needed. No verification for skipped loop pass since the asserted condition is not used. Add a BeforeNonSkippedPass callback for this. The callback needs more inputs than its parameters to work so the callback is added on-the-fly. Reviewed By: aeubanks, asbirlea Differential Revision: https://reviews.llvm.org/D84977	2020-08-07 11:00:31 -07:00
Mitch Phillips	4cdf8a5d36	Revert "Reland D64327 [MC][ELF] Allow STT_SECTION referencing SHF_MERGE on REL targets" This reverts commit b497665d98ad5026b1d3d67d5793a28fefe27bea. Spent some time trying to reproduce this locally, reverting in a desperate attempt to fix the sanitizer buildbot: - http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/28828 I don't know exactly why or how this patch breaks the bots, but it seems pretty concrete that it's the culprit.	2020-08-07 10:56:33 -07:00
Tyker	c2b7184126	[NFC] Add utility to sum/merge stats files Add a small script to sum .stats file given as input and output the totals usage example: merge-stats.py $(find ./builddir/ -name ".stats") > total.stats Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D83505	2020-08-07 19:02:42 +02:00
David Green	1b078d7ecd	[ARM] Extra reduction plus tailpredication tests. NFC	2020-08-07 17:16:56 +01:00
Amy Kwan	3ee1bf0ca0	[PowerPC] Add Vector Extract/Expand/Count with Mask, Move to VSR Mask Instruction Definitions and MC Tests This patch adds the instruction definitions and assembly/disassembly tests for the following set of instructions: Vector Extract [byte | half | word | doubleword | quad] with mask Vector Expand [byte | half | word | doubleword | quad] with mask Move to VSR [byte | byte immediate | half | word | doubleword | quad] with mask Vector Count Mask Bits [byte | half | word | doubleword] Differential Revision: https://reviews.llvm.org/D83724	2020-08-07 11:02:08 -05:00
Kamau Bridgeman	48a9603fbc	[PowerPC][PCRelative] Set TLS unsupported with PC relative memops Introduce a fatal error if any thread local storage code is compiled using pc relative memory operations as well as a hidden override option `-enable-ppc-pcrel-tls` so that this support can be incrementally added if possible. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D85448	2020-08-07 10:56:24 -05:00
Jay Foad	e9385fa488	[NFC][GVN] Fix "avaliable" typos Differential Revision: https://reviews.llvm.org/D85520	2020-08-07 14:22:24 +01:00
Bevin Hansson	7c243aea4b	[Intrinsic] Add sshl.sat/ushl.sat, saturated shift intrinsics. Summary: This patch adds two intrinsics, llvm.sshl.sat and llvm.ushl.sat, which perform signed and unsigned saturating left shift, respectively. These are useful for implementing the Embedded-C fixed point support in Clang, originally discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-August/125433.html and http://lists.llvm.org/pipermail/cfe-dev/2018-May/058019.html Reviewers: leonardchan, craig.topper, bjope, jdoerfert Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83216	2020-08-07 15:09:24 +02:00
Bevin Hansson	71fc113f30	[LangRef] Minor fixes to intrinsic headers and descriptions. NFC.	2020-08-07 15:09:24 +02:00
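
Commit 8d60604a96 folds select(freeze(icmp eq/ne x, y), x, y) to x or y. The following is a minimal sketch, not the InstCombine code, that checks the underlying identity for well-defined integer inputs only. The helper names (`select`, `check_fold`) are hypothetical, and poison/undef semantics (the reason `freeze` is involved at all) are not modelled.

```python
from itertools import product


def select(cond, a, b):
    """Model of an IR select on plain Python integers."""
    return a if cond else b


def check_fold(values):
    """Check both folds on every pair drawn from `values`.

    eq case: select(x == y, x, y) always equals y
             (when x == y, picking x is the same as picking y).
    ne case: select(x != y, x, y) always equals x.
    """
    for x, y in product(values, repeat=2):
        assert select(x == y, x, y) == y
        assert select(x != y, x, y) == x
    return True


if __name__ == "__main__":
    print(check_fold(range(-4, 5)))
```

The check is exhaustive only over the small range passed in; it illustrates why the select is redundant, not why the result needs no freeze (that argument is the Alive2 proof linked in the commit message).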
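
Commit b2ec81eb93 restricts the min/max canonicalization in combineSelect to two patterns with constant operands. As a rough sanity check, assuming plain signed integers and a hypothetical helper `check_canonicalization`, the sketch below confirms that each rewritten select yields the same value as the original.

```python
def sel(cond, a, b):
    """Model of the ternary `cond ? a : b`."""
    return a if cond else b


def check_canonicalization(lo=-128, hi=127):
    """Verify value equivalence of the two patterns kept by the patch.

    (x > 0)  ? x : 0   ->  (x >= 0)  ? x : 0
    (x < -1) ? x : -1  ->  (x <= -1) ? x : -1
    """
    for x in range(lo, hi + 1):
        assert sel(x > 0, x, 0) == sel(x >= 0, x, 0)
        assert sel(x < -1, x, -1) == sel(x <= -1, x, -1)
    return True


if __name__ == "__main__":
    print(check_canonicalization())
```

The rewritten comparisons depend only on the sign of x, which is what lets the CMOV use the S flag as the commit message describes.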
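
The bug described in commit d9083126e6 comes down to a truncating division. The sketch below uses a hypothetical helper, `i64_lane_count`, to show why deriving an i64 element count from the total bit width yields zero for a mask vector such as v8i1.

```python
def i64_lane_count(num_elements, element_bits):
    """Number of i64 lanes obtained by truncating division of the
    vector's total bit width by 64 (the computation the commit
    message describes as problematic for i1 vectors)."""
    total_bits = num_elements * element_bits
    return total_bits // 64


if __name__ == "__main__":
    print(i64_lane_count(4, 32))  # v4i32: 128 bits
    print(i64_lane_count(8, 1))   # v8i1: 8 bits
```

For v4i32 the division gives 2 lanes, while for v8i1 it gives 0, so no valid i64 vector type exists for the mask.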
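
Commit 7c243aea4b adds the llvm.sshl.sat and llvm.ushl.sat intrinsics. A minimal Python model of saturating left shift on `bits`-wide integers might look like the following. The function names and the explicit `bits` parameter are assumptions made for illustration, and shift amounts of `bits` or more are rejected rather than modelled.

```python
def ushl_sat(a, b, bits):
    """Unsigned saturating left shift: clamp to the largest unsigned value."""
    if not (0 <= a < (1 << bits)) or not (0 <= b < bits):
        raise ValueError("operand out of range for this model")
    return min(a << b, (1 << bits) - 1)


def sshl_sat(a, b, bits):
    """Signed saturating left shift: clamp to the signed min or max."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= a <= hi) or not (0 <= b < bits):
        raise ValueError("operand out of range for this model")
    result = a << b  # exact in Python: integers are unbounded
    return max(lo, min(result, hi))


if __name__ == "__main__":
    print(ushl_sat(8, 3, 4))   # 64 does not fit in 4 bits, clamps to 15
    print(sshl_sat(2, 2, 4))   # 8 does not fit in signed 4 bits, clamps to 7
    print(sshl_sat(-5, 1, 4))  # -10 does not fit, clamps to -8
```

Because Python integers do not overflow, the exact shifted value can be computed first and clamped afterwards; a fixed-width implementation would need to detect the overflow instead.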


201601 Commits