llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

Author	SHA1	Message	Date
Kristof Beyls	8957a377cc	[ARM] Generate vcmp instead of vcmpe Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. not be producing an Invalid Operation exception when either arguments in a floating point compare are quiet NaNs. In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have. This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced again once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp, instead of vcmpe, see simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not. Fixes PR43374. Differential Revision: https://reviews.llvm.org/D68463 llvm-svn: 374025	2019-10-08 08:25:42 +00:00
Kai Nacke	6221b33368	[Tools] Mark output of tools as text if it is text Several LLVM tools write text files/streams without using OF_Text. This can cause problems on platforms which distinguish between text and binary output. This PR adds the OF_Text flag for the following tools: - llvm-dis - llvm-dwarfdump - llvm-mca - llvm-mc (assembler files only) - opt (assembler files only) - RemarkStreamer (used e.g. by opt) Reviewers: rnk, vivekvpandya, Bigcheese, andreadb Differential Revision: https://reviews.llvm.org/D67696 llvm-svn: 374024	2019-10-08 08:21:20 +00:00
Kadir Cetinkaya	40c14830e6	[LoopVectorize] Fix non-debug builds after rL374017 llvm-svn: 374021	2019-10-08 07:39:50 +00:00
Bill Wendling	aeb129da12	[IA] Recognize hexadecimal escape sequences Summary: Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. Reviewers: nickdesaulniers, jcai19 Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D68598 llvm-svn: 374018	2019-10-08 04:39:52 +00:00
Zi Xuan Wu	d4140de63c	[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not estimate different register pressure for different register class separately(especially for scalar type, float type should not be on the same position with int type), so it's not accurate. Specifically, it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance. So we need classify the register classes in IR level, and importantly these are abstract register classes, and are not the target register class of backend provided in td file. It's used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR), float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled, and 3 kinds of register class when VSX is NOT enabled. It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374017	2019-10-08 03:28:33 +00:00
Chen Zheng	222ca02a5b	[ConstantRange] [NFC] replace addWithNoSignedWrap with addWithNoWrap. llvm-svn: 374016	2019-10-08 03:00:31 +00:00
Matt Arsenault	6da1b321d9	AMDGPU/GlobalISel: Clamp G_SITOFP/G_UITOFP sources llvm-svn: 373989	2019-10-07 23:33:08 +00:00
Johannes Doerfert	c8ccadccc4	[Attributor][NFC] Add debug output llvm-svn: 373988	2019-10-07 23:30:04 +00:00
Johannes Doerfert	968c61865f	[Attributor][FIX] Remove initialize calls and add undefs The initialization logic has become part of the Attributor but the patches that introduced these calls here were in development when the transition happened. We also now clean up (undefine) the macros used to create attributes. llvm-svn: 373987	2019-10-07 23:28:54 +00:00
Johannes Doerfert	06d2b87499	[Attributor] Use local linkage instead of internal Local linkage is internal or private, and private is a specialization of internal, so either is fine for all our "local linkage" queries. llvm-svn: 373986	2019-10-07 23:21:52 +00:00
Johannes Doerfert	60d7c833d2	[Attributor] Use abstract call sites for call site callback Summary: When we iterate over uses of functions and expect them to be call sites, we now use abstract call sites to allow callback calls. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, hfinkel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67871 llvm-svn: 373985	2019-10-07 23:14:58 +00:00
Craig Topper	bd251115d2	[X86] Shrink zero extends of gather indices from type less than i32 to types larger than i32. Gather instructions can use i32 or i64 elements for indices. If the index is zero extended from a type smaller than i32 to i64, we can shrink the extend to just extend to i32. llvm-svn: 373982	2019-10-07 23:03:12 +00:00
Reid Kleckner	a973c0bd85	[X86] Add new calling convention that guarantees tail call optimization When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to enabled in order to enable tail call optimization. Patch by Dwight Guth <dwight.guth@runtimeverification.com>! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 llvm-svn: 373976	2019-10-07 22:28:58 +00:00
Heejin Ahn	383930d445	[WebAssembly] Fix unwind mismatch stat computation Summary: There was a bug when computing the number of unwind destination mismatches in CFGStackify. When there are many mismatched calls that share the same (original) destination BB, they have to be counted separately. This also fixes a typo and runs `fixUnwindMismatches` only when the wasm exception handling is enabled. This is to prevent unnecessary computations and does not change behavior. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68552 llvm-svn: 373975	2019-10-07 22:19:40 +00:00
Heejin Ahn	737d120caf	[WebAssembly] Add memory intrinsics handling to mayThrow() Summary: Previously, `WebAssembly::mayThrow()` assumed all inputs are global addresses. But when intrinsics, such as `memcpy`, `memmove`, or `memset` are lowered to external symbols in instruction selection and later emitted as library calls. And these functions don't throw. This patch adds handling to those memory intrinsics to `mayThrow` function. But while most of libcalls don't throw, we can't guarantee all of them don't throw, so currently we conservatively return true for all other external symbols. I think a better way to solve this problem is to embed 'nounwind' info in `TargetLowering::CallLoweringInfo`, so that we can access the info from the backend. This will also enable transferring 'nounwind' properties of LLVM IR instructions. Currently we don't transfer that info and we can only access properties of callee functions, if the callees are within the module. Other targets don't need this info in the backend because they do all the processing before isel, but it will help us because that info will reduce code size increase in fixing unwind destination mismatches in CFGStackify. But for now we return false for these memory intrinsics and true for all other libcalls conservatively. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68553 llvm-svn: 373967	2019-10-07 21:14:45 +00:00
Johannes Doerfert	9b814421f3	[Attributor] Deduce memory behavior of functions and arguments Deduce the memory behavior, aka "read-none", "read-only", or "write-only", for functions and arguments. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67384 llvm-svn: 373965	2019-10-07 21:07:57 +00:00
Roman Lebedev	5612040298	[InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389) This can come up in Bit Stream abstractions. The pattern looks big/scary, but it can't be simplified any further. It only is so simple because a number of my preparatory folds had happened already (shift amount reassociation / shift amount reassociation in bit test, sign bit test detection). Highlights: * There are two main flavors: https://rise4fun.com/Alive/zWi The difference is add vs. sub, and left-shift of -1 vs. 1 * Since we only change the shift opcode, we can preserve the exact-ness: https://rise4fun.com/Alive/4u4 * There can be truncation after high-bit-extraction: https://rise4fun.com/Alive/slHc1 (the main pattern i'm after!) Which means that we need to ignore zext of shift amounts and of NBits. * The sign-extending magic can be extended itself (in add pattern via sext, in sub pattern via zext. not the other way around!) https://rise4fun.com/Alive/NhG (or those sext/zext can be sinked into `select`!) Which again means we should pay attention when matching NBits. * We can have both truncation of extraction and widening of magic: https://rise4fun.com/Alive/XTw In other words, i don't believe we need to have any checks on bitwidths of any of these constructs. This is worsened in general by the fact that we may have `sext` instead of `zext` for shift amounts, and we don't yet canonicalize to `zext`, although we should. I have not done anything about that here. Also, we really should have something to weed out `sub` like these, by folding them into `add` variant. https://bugs.llvm.org/show_bug.cgi?id=42389 llvm-svn: 373964	2019-10-07 20:53:27 +00:00
Roman Lebedev	6ab1f89334	[InstCombine] Move isSignBitCheck(), handle rest of the predicates True, no test coverage is being added here. But those non-canonical predicates that are already handled here already have no test coverage as far as i can tell. I tried to add tests for them, but all the patterns already get handled elsewhere. llvm-svn: 373962	2019-10-07 20:53:08 +00:00
Roman Lebedev	8ee484bf3b	[InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask Summary: Currently, we pre-check whether we need to produce a mask or not. This involves some rather magical constants. I'd like to extend this fold to also handle the situation when there's also a `trunc` before outer shift. That will require another set of magical constants. It's ugly. Instead, we can just compute the mask, and check whether mask is a pass-through (all-ones) or not. This way we don't need to have any magical numbers. This change is NFC other than the fact that we now compute the mask and then check if we need (and can!) apply it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68470 llvm-svn: 373961	2019-10-07 20:53:00 +00:00
Roman Lebedev	c206e02bd1	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts Summary: When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`, which means that for patterns a/b we'd assume that we must not produce any bits for that channel, while in reality we simply didn't care about that channel - i.e. we don't need to mask it. Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68239 llvm-svn: 373960	2019-10-07 20:52:52 +00:00
Cameron McInally	bbfedfe1f0	[Bitcode] Update naming of UNOP_NEG to UNOP_FNEG Differential Revision: https://reviews.llvm.org/D68588 llvm-svn: 373958	2019-10-07 20:41:25 +00:00
Matt Arsenault	2adeb0bfad	AMDGPU/GlobalISel: Handle more G_INSERT cases Start manually writing a table to get the subreg index. TableGen should probably generate this, but I'm not sure what it looks like in the arbitrary case where subregisters are allowed to not fully cover the super-registers. llvm-svn: 373947	2019-10-07 19:16:26 +00:00
Matt Arsenault	1627348cef	GlobalISel: Partially implement lower for G_INSERT llvm-svn: 373946	2019-10-07 19:13:27 +00:00
Matt Arsenault	a360345744	AMDGPU/GlobalISel: Fix selection of 16-bit shifts llvm-svn: 373945	2019-10-07 19:10:44 +00:00
Matt Arsenault	e97221f3b0	AMDGPU/GlobalISel: Select VALU G_AMDGPU_FFBH_U32 llvm-svn: 373944	2019-10-07 19:10:43 +00:00
Matt Arsenault	edca1e1bc9	AMDGPU/GlobalISel: Use S_MOV_B64 for inline constants This hides some defects in SIFoldOperands when the immediates are split. llvm-svn: 373943	2019-10-07 19:07:19 +00:00
Matt Arsenault	359940dff8	AMDGPU/GlobalISel: Widen 16-bit G_MERGE_VALUEs sources Continue making a mess of merge/unmerge legality. llvm-svn: 373942	2019-10-07 19:05:58 +00:00
Matt Arsenault	b5b0333eb6	AMDGPU/GlobalISel: Select more G_INSERT cases At minimum handle the s64 insert type, which are emitted in real cases during legalization. We really need TableGen to emit something to emit something like the inverse of composeSubRegIndices do determine the subreg index to use. llvm-svn: 373938	2019-10-07 18:43:31 +00:00
Matt Arsenault	ae11fea1a6	GlobalISel: Add target pre-isel instructions Allows targets to introduce regbankselectable pseudo-instructions. Currently the closet feature to this is an intrinsic. However this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937	2019-10-07 18:43:29 +00:00
Jordan Rose	f093084da4	Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935	2019-10-07 18:14:24 +00:00
Erich Keane	91fee56bca	Fix Calling Convention through aliases r369697 changed the behavior of stripPointerCasts to no longer include aliases. However, the code in CGDeclCXX.cpp's createAtExitStub counted on the looking through aliases to properly set the calling convention of a call. The result of the change was that the calling convention mismatch of the call would be replaced with a llvm.trap, causing a runtime crash. Differential Revision: https://reviews.llvm.org/D68584 llvm-svn: 373929	2019-10-07 17:28:03 +00:00
Florian Hahn	abd9d73d87	[Remarks] Pass StringBlockValue as StringRef. After changing the remark serialization, we now pass StringRefs to the serializer. We should use StringRef for StringBlockVal, to avoid creating temporary objects, which then cause StringBlockVal.Value to point to invalid memory. Reviewers: thegameg, anemet Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D68571 llvm-svn: 373923	2019-10-07 17:05:09 +00:00
Wei Mi	b8f1a4e11b	Fix build errors caused by rL373914. llvm-svn: 373919	2019-10-07 16:45:47 +00:00
Wenlei He	f016ffed52	[llvm-profdata] Minor format fix Summary: Minor format fix for output of "llvm-profdata -show" Reviewers: wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68440 llvm-svn: 373917	2019-10-07 16:30:31 +00:00
Simon Pilgrim	acee6a4e31	[X86][SSE] getTargetShuffleInputs - move VT.isSimple/isVector checks inside. NFCI. Stop all the callers from having to check the value type before calling getTargetShuffleInputs. llvm-svn: 373915	2019-10-07 16:15:20 +00:00
Wei Mi	7850ded25a	[SampleFDO] Add compression support for any section in ExtBinary profile format Previously ExtBinary profile format only supports compression using zlib for profile symbol list. In this patch, we extend the compression support to any section. User can select some or all of the sections to compress. In an experiment, for a 45M profile in ExtBinary format, compressing name table reduced its size to 24M, and compressing all the sections reduced its size to 11M. Differential Revision: https://reviews.llvm.org/D68253 llvm-svn: 373914	2019-10-07 16:12:37 +00:00
Simon Atanasyan	c846f55ed0	[Mips] Always save RA when disabling frame pointer elimination This ensures that frame-based unwinding will continue to work when calling a noreturn function; there is not much use having the caller's frame pointer saved if you don't also have the caller's program counter. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68542 llvm-svn: 373907	2019-10-07 14:01:37 +00:00
Simon Atanasyan	01265b11da	[Mips] Fix evaluating J-format branch targets J/JAL/JALX/JALS are absolute branches, but stay within the current 256 MB-aligned region, so we must include the high bits of the instruction address when calculating the branch target. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68548 llvm-svn: 373906	2019-10-07 14:01:22 +00:00
whitequark	18728bc378	[LLVM-C] Add bindings to create macro debug info Summary: The C API doesn't have the bindings to create macro debug information. Reviewers: whitequark, CodaFi, deadalnix Reviewed By: whitequark Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58334 llvm-svn: 373903	2019-10-07 13:57:13 +00:00
Mirko Brkusanin	9a83f6c433	Test commit Fix comment. llvm-svn: 373901	2019-10-07 13:23:12 +00:00
Kevin P. Neal	02f26957db	[FPEnv] Add constrained intrinsics for lrint and lround Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900	2019-10-07 13:20:00 +00:00
Nico Weber	376cfd32fd	Revert r373888 "[IA] Recognize hexadecimal escape sequences" It broke MC/AsmParser/directive_ascii.s on all bots: Assertion failed: (Index < Length && "Invalid index!"), function operator[], file ../../llvm/include/llvm/ADT/StringRef.h, line 243. llvm-svn: 373898	2019-10-07 11:46:26 +00:00
Bill Wendling	97854ad2fb	[IA] Recognize hexadecimal escape sequences Summary: Implement support for hexadecimal escape sequences to match how GNU 'as' handles them. I.e., read all hexadecimal characters and truncate to the lower 16 bits. Reviewers: nickdesaulniers Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68483 llvm-svn: 373888	2019-10-07 09:54:53 +00:00
Martin Storsjo	6530ff4317	Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine" This reverts SVN r373833, as it caused a failed assert "Non-zero loop cost expected" on building numerous projects, see PR43582 for details and reproduction samples. llvm-svn: 373882	2019-10-07 08:21:37 +00:00
Craig Topper	6ee1e349e5	[X86] Support LEA64_32r in processInstrForSlow3OpLEA and use INC/DEC when possible. Move the erasing and iterator updating inside to match the other slow LEA function. I've adapted code from optTwoAddrLEA and basically rebuilt the implementation here. We do lose the kill flags now just like optTwoAddrLEA. This runs late enough in the pipeline that shouldn't really be a problem. llvm-svn: 373877	2019-10-07 06:27:55 +00:00
Simon Pilgrim	1ab70808c2	[X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217) If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871	2019-10-06 21:11:45 +00:00
Simon Pilgrim	c0dceb79da	Fix signed/unsigned warning. NFCI llvm-svn: 373870	2019-10-06 19:54:20 +00:00
Amy Kwan	fc85904dce	[NFC][PowerPC] Reorganize CRNotPat multiclass patterns in PPCInstrInfo.td This is patch aims to group together the `CRNotPat` multi class instantiations within the `PPCInstrInfo.td` file. Integer instantiations of the multi class are grouped together into a section, and the floating point patterns are separated into its own section. Differential Revision: https://reviews.llvm.org/D67975 llvm-svn: 373869	2019-10-06 19:45:53 +00:00
Simon Pilgrim	edbf0a5ae5	[X86][SSE] Remove resolveTargetShuffleInputs and use getTargetShuffleInputs directly. Move the resolveTargetShuffleInputsAndMask call to after the shuffle mask combine before the undef/zero constant fold instead. llvm-svn: 373868	2019-10-06 19:07:00 +00:00
Simon Pilgrim	20fe3d2062	[X86][SSE] Don't merge known undef/zero elements into target shuffle masks. Replaces setTargetShuffleZeroElements with getTargetShuffleAndZeroables which reports the Zeroable elements but doesn't merge them into the decoded target shuffle mask (the merging has been moved up into getTargetShuffleInputs until we can get rid of it entirely). This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs but instead we cache them in a parallel Zeroable mask. llvm-svn: 373867	2019-10-06 19:06:45 +00:00

1 2 3 4 5 ...

127380 Commits