llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00

Author	SHA1	Message	Date
Craig Topper	c72365125a	[X86] Autogenerate complete checks. NFC llvm-svn: 338921	2018-08-03 20:58:14 +00:00
Anastasis Grammenos	23f909efa9	[TRE][DebugInfo] Preserve Debug Location in new branch instruction There are two branch instructions created so the new test covers them both. Differential Revision: https://reviews.llvm.org/D50263 llvm-svn: 338917	2018-08-03 20:27:13 +00:00
Craig Topper	633fcc8728	[SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. llvm-svn: 338915	2018-08-03 20:14:18 +00:00
Fangrui Song	26910fb248	[Support] Don't initialize compressed buffer allocated by zlib::compress resize() (zeroing) makes every allocated page resident. The actual size of the compressed buffer is usually much smaller. Making every page resident is wasteful. When linking a test binary with ~1.9GiB uncompressed debug info with LLD, this optimization decreases max RSS by ~1.5GiB. Differential Revision: https://reviews.llvm.org/50223 llvm-svn: 338913	2018-08-03 19:37:49 +00:00
Matt Arsenault	5eaf2b4fbf	DAG: Enhance isKnownNeverNaN Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910	2018-08-03 18:27:52 +00:00
Artem Belevich	621737a0ae	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH"). Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908	2018-08-03 18:05:24 +00:00
Craig Topper	744780ede5	[X86] Add a DAG combine for the __builtin_parity idiom used by clang to enable better codegen Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag. Even when popcnt is supported, it's still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and. I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful. Differential Revision: https://reviews.llvm.org/D50165 llvm-svn: 338907	2018-08-03 18:00:29 +00:00
Craig Topper	95cbce3702	[X86] Add test cases for the current codegen of __builtin_parity. Will be improved in a follow-up commit llvm-svn: 338906	2018-08-03 18:00:23 +00:00
Evandro Menezes	9d091b8150	[SLC] Refactor shrinking of functions (NFC) Merge the helper functions for shrinking unary and binary functions into a single one, while keeping all their functionality. Otherwise, NFC. llvm-svn: 338905	2018-08-03 17:50:16 +00:00
Joel Galenson	954a936289	Fix crash in bounds checking. In r337830 I added SCEV checks to enable us to insert fewer bounds checks. Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues. This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions. Differential Revision: https://reviews.llvm.org/D49946 llvm-svn: 338902	2018-08-03 17:12:23 +00:00
Matt Davis	b0535f09cc	[llvm-mca][docs] Move the code marker text into its own subsection. NFC. Also fixed a few undecorated 'llvm-mca' references to be highlighted with the 'program' emphasis. llvm-svn: 338900	2018-08-03 15:56:07 +00:00
Graham Yiu	ccd98a2a35	[Partial Inlining] Fix small bug in detecting if we did something - It's possible for 'Changed' to return as false even if we did partial inline something. Fixed to accumulate return values llvm-svn: 338896	2018-08-03 14:42:53 +00:00
Nicholas Wilson	2d51b0f7e5	[WebAssembly] Cleanup of the way globals and global flags are handled Differential Revision: https://reviews.llvm.org/D44030 llvm-svn: 338894	2018-08-03 14:33:37 +00:00
Andrea Di Biagio	3d390604c7	[llvm-mca] Speed up the computation of the wait/ready/issued sets in the Scheduler. This patch is a follow-up to r338702. We don't need to use a map to model the wait/ready/issued sets. It is much more efficient to use a vector instead. This patch gives us an average 7.5% speedup (on top of the ~12% speedup obtained after r338702). llvm-svn: 338883	2018-08-03 12:55:28 +00:00
Chijun Sima	a13157e779	[Dominators] Make RemoveUnreachableBlocks return false if the BasicBlock is already awaiting deletion Summary: Previously, `removeUnreachableBlocks` still returned true (which indicates the CFG is changed) even when all the unreachable blocks found are awaiting deletion in the DDT class. This makes code patterns like ``` // Code modified from lib/Transforms/Scalar/SimplifyCFGPass.cpp bool EverChanged = removeUnreachableBlocks(F, nullptr, DDT); ... do { EverChanged = someMightHappenModifications(); EverChanged |= removeUnreachableBlocks(F, nullptr, DDT); } while (EverChanged); ``` become an infinite loop. Fix this by detecting whether a BasicBlock is already awaiting deletion. Reviewers: kuhar, brzycki, dmgreen, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49738 llvm-svn: 338882	2018-08-03 12:45:29 +00:00
Andrea Di Biagio	1aca2c2e82	[llvm-mca][docs] Improve the CommandLine documentation. This patch replaces all the remaining occurrences of string "MCA" with ":program:`llvm-mca`". Somehow I missed those strings when I committed r338394. This patch also improves section "Instruction Dispatch". llvm-svn: 338881	2018-08-03 12:44:56 +00:00
Nico Weber	e27c5b613c	convert some tabs to spaces llvm-svn: 338880	2018-08-03 12:21:54 +00:00
Jonas Devlieghere	ad1fff0c4d	[DebugInfo/Verifier] Don't emit error for missing module in index We don't expect module names to be present in the index. This patch adds DW_TAG_module to the blacklist. Differential revision: https://reviews.llvm.org/D50237 llvm-svn: 338878	2018-08-03 12:01:43 +00:00
Jonas Paulsson	e8819358a0	[SystemZ] Improve handling of instructions which expand to several groups Some instructions expand to more than one decoder group. This has been hitherto ignored, but is handled with this patch. Review: Ulrich Weigand https://reviews.llvm.org/D50187 llvm-svn: 338849	2018-08-03 10:43:05 +00:00
Max Kazantsev	6f32be6ee8	[NFC] Add missing comment llvm-svn: 338848	2018-08-03 10:41:51 +00:00
Max Kazantsev	76fa18c0f1	[NFC] Move some methods into static functions llvm-svn: 338843	2018-08-03 10:16:40 +00:00
Jeremy Morse	17ff598941	[Windows FS] Allow moving files in TempFile::keep In r338216 / D49860 TempFile::keep was extended to allow keeping across filesystems. The aim on Windows was to have this happen in rename_internal using the existing system API. However, to fix an issue and preserve the idea of "renaming" not being a move, put Windows keep-across-filesystem in TempFile::keep. Differential Revision: https://reviews.llvm.org/D50048 llvm-svn: 338841	2018-08-03 10:13:35 +00:00
Simon Pilgrim	222b9aadec	[TargetLowering] Generalise BuildSDIV function First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where it's still used. Differential Revision: https://reviews.llvm.org/D50185 llvm-svn: 338838	2018-08-03 10:00:54 +00:00
Guillaume Chatelet	882cd9ac82	[llvm-exegesis] Renaming classes and functions. Summary: Functional No Op. Reviewers: gchatelet Subscribers: tschuett, courbet, llvm-commits Differential Revision: https://reviews.llvm.org/D50231 llvm-svn: 338836	2018-08-03 09:29:38 +00:00
Sjoerd Meijer	96990242de	[ARM] FP16: support vector zip and unzip This is addressing PR38404. Differential Revision: https://reviews.llvm.org/D50186 llvm-svn: 338835	2018-08-03 09:24:29 +00:00
Dean Michael Berris	731da7b6b3	[XRay][tools] Use Support/JSON.h in llvm-xray convert Summary: This change removes the ad-hoc implementation used by llvm-xray's `convert` subcommand to generate JSON encoded catapult (AKA Chrome Trace Viewer) trace output, to instead use the JSON encoder now in the Support library. Reviewers: kpw, zturner, eizan Reviewed By: kpw Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50129 llvm-svn: 338834	2018-08-03 09:21:31 +00:00
Simon Pilgrim	edafefc419	[X86] Add example of 'zero shift' guards on rotation patterns (PR34924) Basic pattern that leaves an unnecessary select on a rotation by zero result. This variant is trivial - the more general case with a compare+branch to prevent execution of undefined shifts is more tricky. llvm-svn: 338833	2018-08-03 09:20:02 +00:00
Sjoerd Meijer	de46c3b80d	[ARM] FP16: support VFMA This is addressing PR38404. llvm-svn: 338830	2018-08-03 09:12:56 +00:00
Dean Michael Berris	b171503f4e	[XRay] fixup: add one more missing std::move(...) Follow up to D48370. llvm-svn: 338829	2018-08-03 09:06:11 +00:00
Dean Michael Berris	c0aba6e2c4	[XRay] fixup: Add missing std::move(...) Follow up to D48370. llvm-svn: 338827	2018-08-03 07:54:37 +00:00
Dean Michael Berris	e289480427	[XRay] Fixup: remove 'noexcept' in defaulted move members This is to appease stage1 builds using gcc. Follow-up to D48370. llvm-svn: 338826	2018-08-03 07:41:34 +00:00
Dean Michael Berris	0c7cbedfef	[XRay][llvm] Load XRay Profiles Summary: This change implements the profile loading functionality in LLVM to support XRay's profiling mode in compiler-rt. We introduce a type named `llvm::xray::Profile` which allows building a profile representation. We can load an XRay profile from a file to build Profile instances, or do it manually through the Profile type's API. The intent is to get the `llvm-xray` tool to generate `Profile` instances and use that as the common abstraction through which all conversion and analysis can be done. In the future we can generate `Profile` instances from `Trace` instances as well, through conversion functions. Some of the key operations supported by the `Profile` API are: - Path interning (`Profile::internPath(...)`) which returns a unique path identifier. - Block appending (`Profile::addBlock(...)`) to add thread-associated profile information. - Path ID to Path lookup (`Profile::expandPath(...)`) to look up a PathID and return the original interned path. - Block iteration. A 'Path' in this context represents the function call stack in leaf-to-root order. This is represented as a path in an internally managed prefix tree in the `Profile` instance. Having a handle (PathID) to identify the unique Paths we encounter for a particular Profile allows us to reduce the amount of memory required to associate profile data to a particular Path. This is the first of a series of patches to migrate the `llvm-stacks` tool towards using a single profile representation. Depends on D48653. Reviewers: kpw, eizan Reviewed By: kpw Subscribers: mgorny, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D48370 llvm-svn: 338825	2018-08-03 07:18:39 +00:00
Craig Topper	f314d592c4	[X86] Remove all the vector NOP bitcast patterns. Use a few lines of code in the Select method in X86ISelDAGToDAG.cpp instead. There are a lot of permutations of types here generating a lot of patterns in the isel table. It's more efficient to just ReplaceUses and RemoveDeadNode from the Select function. The test changes are because we have some shuffle patterns that have a bitcast as their root node. But the behavior is identical to another instruction whose pattern doesn't start with a bitcast. So this isn't a functional change. llvm-svn: 338824	2018-08-03 07:01:10 +00:00
Hans Wennborg	a28e486aa2	build_llvm_package.bat: Add OpenMP back After r338721, it builds again. llvm-svn: 338823	2018-08-03 07:00:08 +00:00
Chijun Sima	fba2ebc72b	[Dominators] Refine the logic of recalculate() in the DomTreeUpdater Summary: This patch refines the logic of `recalculate()` in the `DomTreeUpdater` in the following two aspects: 1. Previously, `recalculate()` tests whether there are pending updates/BBs awaiting deletion and then do recalculation under Lazy UpdateStrategy; and do recalculation immediately under Eager UpdateStrategy. (The former behavior is inherited from the `DeferredDominance` class). This is an inconsistency between two strategies and there is no obvious reason to do this. So the behavior is changed to always recalculate available trees when calling `recalculate()`. 2. Fix the issue of when DTU under Lazy UpdateStrategy holds nothing but with BBs awaiting deletion, after calling `recalculate()`, BBs awaiting deletion aren't flushed. An additional unittest is added to cover this case. Reviewers: kuhar, dmgreen, brzycki, grosser, davide Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50173 llvm-svn: 338822	2018-08-03 06:51:35 +00:00
Craig Topper	3309615c71	[X86] Support fp128 and/or/xor/load/store with VEX and EVEX encoded instructions. Move all the patterns to X86InstrVecCompiler.td so we can keep SSE/AVX/AVX512 all in one place. To save some patterns we'll use an existing DAG combine to convert f128 fand/for/fxor to integer when sse2 is enabled. This allows us to reuse all the existing patterns for v2i64. I believe this now makes SHA instructions the only case where VEX/EVEX and legacy encoded instructions could be generated simultaneously. llvm-svn: 338821	2018-08-03 06:12:56 +00:00
Hiroshi Inoue	bd266a779f	[InstSimplify] fold extracting from std::pair (2/2) This is the second patch of the series which intends to enable jump threading for an inlined method whose return type is std::pair<int, bool> or std::pair<bool, int>. The first patch is https://reviews.llvm.org/rL338485. This patch handles code sequences that merges two values using `shl` and `or`, then extracts one value using `and`. Differential Revision: https://reviews.llvm.org/D49981 llvm-svn: 338817	2018-08-03 05:39:48 +00:00
Chijun Sima	f6f16ab9ad	[Dominators] Convert existing passes and utils to use the DomTreeUpdater class Summary: This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]]. It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class. These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater. Reviewers: brzycki, kuhar, dmgreen, grosser, davide Reviewed By: brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48967 llvm-svn: 338814	2018-08-03 05:08:17 +00:00
Craig Topper	489536319e	[X86] When post-processing the DAG to remove zero extending moves for YMM/ZMM, make sure the producing instruction is VEX/XOP/EVEX encoded. If the producing instruction is legacy encoded it doesn't implicitly zero the upper bits. This is important for the SHA instructions which don't have a VEX encoded version. We might also be able to hit this with the incomplete f128 support that hasn't been ported to VEX. llvm-svn: 338812	2018-08-03 04:49:42 +00:00
Craig Topper	0b390c1968	[X86] Autogenerate complete checks. NFC llvm-svn: 338811	2018-08-03 04:49:41 +00:00
Craig Topper	bfee814832	[X86] Add R13D to the isInefficientLEAReg in FixupLEAs. I'm assuming the R13 restriction extends to R13D. Guessing this restriction is related to the funny encoding of this register as base always requiring a displacement to be encoded. llvm-svn: 338806	2018-08-03 03:45:19 +00:00
Craig Topper	7c3ad0ab27	[X86] Autogenerate complete checks. NFC llvm-svn: 338802	2018-08-03 01:28:12 +00:00
Craig Topper	9f1ad5842a	[X86] Autogenerate complete checks. NFC llvm-svn: 338799	2018-08-03 01:20:32 +00:00
Craig Topper	7b3d767b51	[X86] Prevent promotion of i16 add/sub/and/or/xor to i32 if we can fold an atomic load and atomic store. This makes them consistent with i8/i32/i64. Which still seems to be more aggressive on folding than icc, gcc, or MSVC. llvm-svn: 338795	2018-08-03 00:37:34 +00:00
Philip Reames	865e1ee3fc	[LICM] Remove unnecessary safety check to increase sinking effectiveness This one requires a bit of explanation. It's not every day you simply delete code to implement an optimization. :) The transform in question is sinking an instruction from a loop to the uses in loop exiting blocks. We know (from LCSSA) that all of the uses outside the loop must be phi nodes, and after predecessor splitting, we know all phi users must have a single operand. Since the use must be strictly dominated by the def, we know from the definition of dominance/ssa that the exit block must execute along a (non-strict) subset of paths which reach the def. As a result, duplicating a potentially faulting instruction cannot introduce a fault that didn't previously exist in the program. The full story is that this patch builds on "rL338671: [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]" which pulled this logic out of a common helper routine. As best I can tell, this check was originally added to the helper function for hoisting legality, later an incorrect fastpath for loads/calls was added, and then the bug was fixed by duplicating the fault safety check in the hoist path. This left the redundant check in the common code to pessimize sinking for no reason. I split it out in an NFC, and am now removing the unnecessary check. I wanted there to be something easy to revert in case I missed something. Reviewed by: Anna Thomas (in person) llvm-svn: 338794	2018-08-03 00:21:56 +00:00
Dave Lee	956ce46fd0	objdump: Better handling of Mach-O universal binaries Summary: With Mach-O, there is a flag requirement discrepancy between working with universal binaries and thin binaries. Many flags that don't require the `-macho` flag (for example `-private-headers` and `-disassemble`) fail to work on universal binaries unless `-macho` is given. When this happens, the error message is unhelpful, stating: The file was not recognized as a valid object file. Which can lead to confusion. This change allows generic flags to be used on universal binaries with and without the `-macho` flag. This means flags that can be used for thin files can be used consistently with fat files too. To do this, the universal binary support within `ParseInputMachO()` is extracted into a new function. This new function is called directly from `DumpInput()` when the input binary is universal. Additionally the `-arch` flag validation in `ParseInputMachO()` was extracted to be reused. Reviewers: compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48702 llvm-svn: 338792	2018-08-03 00:06:38 +00:00
Eli Friedman	5e3a3e7469	[GlobalMerge] Allow merging globals with explicit section markings. At least on ELF, it's impossible to tell from the object file whether two globals with the same section marking were merged: the merged global uses "private" linkage to hide its symbol, and the aliases look like regular symbols. I can't think of any other reason to disallow it. (Of course, we can only merge globals in the same section.) The weird alignment handling matches AsmPrinter; our alignment handling for global variables should probably be refactored. Differential Revision: https://reviews.llvm.org/D49822 llvm-svn: 338791	2018-08-02 23:54:16 +00:00
Tim Renouf	c1efbeb4fd	[AMDGPU] Minor change to d16 buffer load implementation Summary: By not reconstructing the operand list of the SDNode, this change makes it easier to add the forthcoming new tbuffer and buffer intrinsics. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49995 Change-Id: I0cb79ef0801532645d7dd954a6d7355139db7b38 llvm-svn: 338784	2018-08-02 23:33:01 +00:00
Tim Renouf	991d99708a	[AMDGPU] Reworked SIFixWWMLiveness Summary: I encountered some problems with SIFixWWMLiveness when WWM is in a loop: 1. It sometimes gave invalid MIR where there is some control flow path to the new implicit use of a register on EXIT_WWM that does not pass through any def. 2. There were lots of false positives of registers that needed to have an implicit use added to EXIT_WWM. 3. Adding an implicit use to EXIT_WWM (and adding an implicit def just before the WWM code, which I tried in order to fix (1)) caused lots of the values to be spilled and reloaded unnecessarily. This commit is a rework of SIFixWWMLiveness, with the following changes: 1. Instead of considering any register with a def that can reach the WWM code and a def that can be reached from the WWM code, it now considers three specific cases that need to be handled. 2. A register that needs liveness over WWM to be synthesized now has it done by adding itself as an implicit use to defs other than the dominant one. Also added the following fixmes: FIXME: We should detect whether a register in one of the above categories is already live at the WWM code before deciding to add the implicit uses to synthesize its liveness. FIXME: I believe this whole scheme may be flawed due to the possibility of the register allocator doing live interval splitting. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46756 Change-Id: Ie7fba0ede0378849181df3f1a9a7a39ed1a94a94 llvm-svn: 338783	2018-08-02 23:31:32 +00:00
Craig Topper	27e7f59c1b	[X86] Allow 'atomic_store (neg/not atomic_load)' to isel to a RMW instruction. There was a FIXME in the td file about a type inference issue that was easy to fix. llvm-svn: 338782	2018-08-02 23:30:38 +00:00
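
Note on commit 744780ede5 ([X86] __builtin_parity DAG combine): the xor-folding idea in that message can be illustrated in plain C++. This is only a sketch of the arithmetic, not the code from the patch. All function names below are hypothetical, and the final fold steps are done in software here, whereas the commit message describes stopping at a single byte and reading the x86 parity flag (which, as far as I know, is set when the low byte has an even number of set bits, so the result would be its inverse).

```cpp
#include <cassert>
#include <cstdint>

// Reference form, as the commit message describes clang's lowering:
// parity is (population count & 1). Hypothetical helper name.
inline unsigned parityViaPopcount(std::uint32_t X) {
  unsigned Count = 0;
  for (; X != 0; X >>= 1)
    Count += X & 1u;
  return Count & 1u;
}

// XOR folding: xor-ing the upper half into the lower half preserves the
// parity of the whole value, so each step halves the width to inspect.
inline unsigned parityViaXorFold(std::uint32_t X) {
  X ^= X >> 16; // 32 bits -> 16 bits
  X ^= X >> 8;  // 16 bits -> 8 bits (hardware could test the flag here)
  X ^= X >> 4;  // remaining steps emulate the flag check in software
  X ^= X >> 2;
  X ^= X >> 1;
  return X & 1u;
}

// 64-bit data on a 32-bit target: one xor of the two halves first, so
// only a single 32-bit parity (or popcount) is needed afterwards.
inline unsigned parity64ViaXorFold(std::uint64_t X) {
  std::uint32_t Lo = static_cast<std::uint32_t>(X);
  std::uint32_t Hi = static_cast<std::uint32_t>(X >> 32);
  return parityViaXorFold(Lo ^ Hi);
}

// Exhaustively compares both forms over all 16-bit inputs.
inline bool foldsAgreeOn16BitInputs() {
  for (std::uint32_t V = 0; V <= 0xFFFFu; ++V)
    if (parityViaPopcount(V) != parityViaXorFold(V))
      return false;
  return true;
}
```

For example, parityViaXorFold(7) is 1 (three set bits) and parityViaXorFold(3) is 0 (two set bits), matching the popcount-based form.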


167483 Commits