llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Vitaly Buka	1bae08d2a5	[NFC] Remove unused GetUnderlyingObject parameter Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621	2020-07-31 02:10:03 -07:00
Balazs Benics	42bc0ed1ab	[analyzer] Fix out-of-tree only clang build by not relying on private header It turned out that the D78704 included a private LLVM header, which is excluded from the LLVM install target. I'm substituting that `#include` with the public one by moving the necessary `#define` into that. There was a discussion about this at D78704 and on the cfe-dev mailing list. I'm also placing a note to remind others of this pitfall. Reviewed By: mgorny Differential Revision: https://reviews.llvm.org/D84929	2020-07-31 10:28:14 +02:00
QingShan Zhang	120b0ac26d	[PowerPC] Retrieve the offset from load/store if it stores to stack slots Scheduler will try to retrieve the offset and base addr to determine if two loads/stores are disjoint memory access. PowerPC failed to handle this for frame index which will bring extra memory dependency for loads/stores. Reviewed By: jji Differential Revision: https://reviews.llvm.org/D84308	2020-07-31 07:08:20 +00:00
Juneyoung Lee	f7e2f3c8f3	[JumpThreading] Let SimplifyPartiallyRedundantLoad look into freeze This patch allows SimplifyPartiallyRedundantLoad to work when the branch condition was frozen. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84944	2020-07-31 15:28:24 +09:00
Fangrui Song	2085074770	[MC] Support infix operator ! Disabled for Darwin mode. Also disabled for ARM which has compatible aliases (implied 'sp' operand in 'srs*' instructions like 'srsda #31!').	2020-07-30 23:25:53 -07:00
Juneyoung Lee	3b5a9e3f2b	[JumpThreading] Add a test for D84944 ; NFC	2020-07-31 15:20:59 +09:00
Lang Hames	8bc77257a4	[JITLink] Use correct Addressable constructor. Calling createAddressable(false) generates an absolute symbol. We want createAddressable(0, false), which generates an external symbol.	2020-07-30 22:48:57 -07:00
Craig Topper	758017c2d0	[X86] Remove x86_sse42_crc32_64_64 from X86TTIImpl::simplifyDemandedUseBitsIntrinsic It doesn't do any simplifying. It just computes known bits. We can just let InstCombine call computeKnownBits which will handle this just as well.	2020-07-30 21:51:23 -07:00
Max Kazantsev	c10031a2ab	[SimpleLoopUnswitch] Preserve make.implicit in non-trivial unswitch if legal We can preserve make.implicit metadata in the split block if it is guaranteed that after following the branch we always reach the block where processing of null case happens, which is equivalent to "initial condition must execute if the loop is entered". Differential Revision: https://reviews.llvm.org/D84925 Reviewed By: asbirlea	2020-07-31 11:38:43 +07:00
Max Kazantsev	9a8468080c	[SimpleLoopUnswitch] Drop make.implicit metadata in case of non-trivial unswitching Non-trivial unswitching simply moves terminator being unswitched from the loop up to the switch block. It also preserves all metadata that was there. It might not be a correct thing to do for `make.implicit` metadata. Consider case: ``` for (...) { cond = // computed in loop if (cond) return X; if (p == null) throw_npe(); !make implicit } ``` Before the unswitching, if `p` is null and we reach this check, we are guaranteed to go to `throw_npe()` block. Now we unswitch on `p == null` condition: ``` if (p == null) !make implicit { for (...) { if (cond) return X; throw_npe() } } else { for (...) { if (cond) return X; } } ``` Now, following `true` branch of `p == null` does not always lead us to `throw_npe()` because the loop has side exit. Now, if we run ImplicitNullCheck pass on this code, it may end up making the unswitch condition implicit. This may lead us to turning normal path to `return X` into signal-throwing path, which is not efficient. Note that this does not happen during trivial unswitch: it guarantees that we do not have side exits before condition being unswitched. This patch fixes this situation by unconditional dropping of `make.implicit` metadata when we perform non-trivial unswitch. We could preserve it if we could prove that the condition always executes. This can be done as a follow-up. Differential Revision: https://reviews.llvm.org/D84916 Reviewed By: asbirlea	2020-07-31 11:33:02 +07:00
Wei Mi	b565123367	Fix a crash when the sample profile uses md5 and -sample-profile-merge-inlinee is enabled. When -sample-profile-merge-inlinee is enabled, new FunctionSamples may be created during profile merge without GUIDToFuncNameMap being initialized. That will occasionally cause compiler crash. The patch fixes it. Differential Revision: https://reviews.llvm.org/D84994	2020-07-30 21:21:06 -07:00
Vitaly Buka	4ee4573a60	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway	2020-07-30 21:08:24 -07:00
Craig Topper	7735985257	[X86] Pass the OperandVector by reference to ParseIntelOperand and ParseRoundingMode. NFCI Similar to what was recently done to ParseATTOperand. Make ParseIntelOperand directly responsible for adding to the operand vector instead of returning the operand. Return a bool for error. Remove ErrorOperand since it is no longer used.	2020-07-30 19:52:38 -07:00
Arthur Eubanks	d0801d868e	[tbaa] Rename type-based-aa -> tbaa For consistency with legacy pass name. Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D84967	2020-07-30 19:51:35 -07:00
Arthur Eubanks	050789b0c9	[NewPM] Don't print 'Invalidating all non-preserved analyses' If an analysis is actually invalidated, there's already a log statement for that: 'Invalidating analysis: FooAnalysis'. Otherwise the statement is not very useful. Reviewed By: asbirlea, ychen Differential Revision: https://reviews.llvm.org/D84981	2020-07-30 19:40:29 -07:00
Vitaly Buka	0093612032	[ValueTracking] Remove AllocaForValue parameter findAllocaForValue uses AllocaForValue to cache resolved values. The function is used only to resolve arguments of lifetime intrinsic which usually are not far from allocas. So result reuse is likely unnoticeable. In followup patches I'd like to replace the function with GetUnderlyingObjects. Depends on D84616. Differential Revision: https://reviews.llvm.org/D84617	2020-07-30 18:48:34 -07:00
Vitaly Buka	fe28af466f	[NFC] Move findAllocaForValue into ValueTracking.h Differential Revision: https://reviews.llvm.org/D84616	2020-07-30 18:22:59 -07:00
dfukalov	b712ce0b7d	[NFC][AMDGPU] Improve fused fmul+fadd tests. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D84903	2020-07-31 04:00:09 +03:00
Scott Constable	fdefae7e6f	[X86] Fix for ballooning compile times due to Load Value Injection (LVI) mitigations Fix for the issue raised in https://github.com/rust-lang/rust/issues/74632. The current heuristic for inserting LFENCEs uses a quadratic-time algorithm. This can apparently cause substantial compilation slowdowns for building Rust projects, where functions > 5000 LoC are apparently common. The updated heuristic in this patch implements a linear-time algorithm. On a set of benchmarks, the slowdown factor for the generated code was comparable (2.55x geo mean for the quadratic-time heuristic, vs. 2.58x for the linear-time heuristic). Both heuristics offer the same security properties, namely, mitigating LVI. This patch also includes some formatting fixes. Differential Revision: https://reviews.llvm.org/D84471	2020-07-30 17:22:33 -07:00
Craig Topper	729680740b	[X86] Separate CPU Feature lists in X86.td between architecture features and tuning features After the recent change to the tuning settings for pentium4 to improve our default 32-bit behavior, I've decided to see about implementing -mtune support. This way we could have a default architecture CPU of "pentium4" or "x86-64" and a default tuning cpu of "generic". And we could change our "pentium4" tuning settings back to what they were before. As a step to supporting this, this patch separates all of the features lists for the CPUs into 2 lists. I'm using the Proc class and a new ProcModel class to concat the 2 lists before passing to the target independent ProcessorModel. Future work to truly support mtune would change ProcessorModel to take 2 lists separately. I've diffed the X86GenSubtargetInfo.inc file before and after this patch to ensure that the final feature list for the CPUs isn't changed. Differential Revision: https://reviews.llvm.org/D84879	2020-07-30 17:19:19 -07:00
kuterd	7bdcceaabf	[Attributor] Add time trace support. This patch adds time trace functionality to have a better understanding of the analysis times. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84980	2020-07-31 03:08:50 +03:00
Craig Topper	5b76494391	[ValueTracking] Add basic computeKnownBits support for llvm.abs intrinsic This includes basic support for computeKnownBits on abs. I've left FIXMEs for more complicated things we could do. Differential Revision: https://reviews.llvm.org/D84963	2020-07-30 16:26:54 -07:00
Eli Friedman	244f68004d	[LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR Just the obvious implementation that rewrites the result type. Also fix warning from EXTRACT_SUBVECTOR legalization that triggers on the test. Differential Revision: https://reviews.llvm.org/D84706	2020-07-30 16:17:45 -07:00
Amara Emerson	dd34b50099	[AArch64][GlobalISel] Add legalization & selection support for G_INTRINSIC_LRINT. Differential Revision: https://reviews.llvm.org/D84552	2020-07-30 16:14:56 -07:00
Mircea Trofin	629123ee54	[doc] Describe the header guard style clang-tidy's llvm-header-guard rule references the LLVM style - where it's missing. Differential Revision: https://reviews.llvm.org/D84989	2020-07-30 16:08:07 -07:00
LLVM GN Syncbot	ed417d552d	[gn build] Port 763671f387f	2020-07-30 22:29:22 +00:00
Lang Hames	6ca1bd0130	[llvm-jitlink] Add -harness option to llvm-jitlink. The -harness option enables new testing use-cases for llvm-jitlink. It takes a list of objects to treat as a test harness for any regular objects passed to llvm-jitlink. If any files are passed using the -harness option then the following transformations are applied to all other files: (1) Symbols definitions that are referenced by the harness files are promoted to default scope. (This enables access to statics from test harness). (2) Symbols definitions that clash with definitions in the harness files are deleted. (This enables interposition by test harness). (3) All other definitions in regular files are demoted to local scope. (This causes untested code to be dead stripped, reducing memory cost and eliminating spurious unresolved symbol errors from untested code). These transformations allow the harness files to reference and interpose symbols in the regular object files, which can be used to support execution tests (including fuzz tests) of functions in relocatable objects produced by a build.	2020-07-30 15:26:19 -07:00
Lang Hames	eeb8251d74	[JITLink] Allow JITLinkContext::notifyResolved to return an Error. This allows clients to detect invalid transformations applied by JITLink passes (e.g. inserting or removing symbols in unexpected ways) and terminate linking with an error. This change is used to simplify the error propagation logic in ObjectLinkingLayer.	2020-07-30 15:26:18 -07:00
Matt Arsenault	39a288b179	AMDGPU: Fix liveness errors when copying AGPR tuples Avoid recursively calling copyPhysReg for AGPR handling. This was dropping the necessary super register implicit defs to avoid liveness verifier errors.	2020-07-30 18:13:04 -04:00
Changpeng Fang	dc0f54c2fe	AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst Summary: This is in response to the review of https://reviews.llvm.org/D84873: The expensive check should be reordered last Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D84890	2020-07-30 14:37:06 -07:00
Nikita Popov	3f67e02a40	[ConstantRange][CVP] Make use of abs poison flag Pass the abs poison flag to the underlying ConstantRange implementation, allowing CVP to simplify based on it. Importantly, this recognizes that abs with poison flag is actually non-negative...	2020-07-30 23:06:10 +02:00
Jon Roelofs	50a1ea2ba8	[SelectionDAG] Fix lowering of vector geps This fixes an assertion failure that was being triggered in SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32> to i64 (which should have been <2xi64>). Fixes: rdar://66016901 Differential Revision: https://reviews.llvm.org/D84884	2020-07-30 14:56:53 -06:00
Nikita Popov	9bc2ea2f9d	[ConstantRange] Support abs with poison flag This just adds the ConstantRange support, including exhaustive testing. It's not wired up to the IR intrinsic flag yet.	2020-07-30 22:49:28 +02:00
Nikita Popov	c1baeb79cd	[ConstantRange][CVP] Compute min/max/abs intrinsic ranges Wire up ConstantRange::intrinsic() to the existing primitives for min, max and abs. The poison flag on abs is not yet taken into account.	2020-07-30 22:21:34 +02:00
Nikita Popov	87ef56982f	[CVP] Add tests for min/max/abs intrinsic comparisons (NFC)	2020-07-30 22:17:03 +02:00
Nikita Popov	6a6a8d731b	[SCCP] Remove dead switch cases based on range information Determine whether switch edges are feasible based on range information, and remove non-feasible edges later on. This does not try to determine whether the default edge is dead, as we'd have to determine that the range is fully covered by the cases for that. Another limitation here is that we don't remove dead cases that have the same successor as a live case. I'm not handling this because I wanted to keep the edge removal based on feasible edges only, rather than inspecting ranges again there -- this does not seem like a particularly useful case to handle. Differential Revision: https://reviews.llvm.org/D84270	2020-07-30 21:21:08 +02:00
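The feasibility test described in commit 6a6a8d731b can be illustrated outside of LLVM. This is a minimal sketch under stated assumptions, not the SCCP implementation: `Range` and `feasibleCases` are hypothetical names, and the range is a plain non-wrapping half-open interval rather than LLVM's `ConstantRange`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, non-wrapping stand-in for the range known for the
// switch condition: the half-open interval [lo, hi).
struct Range {
  int64_t lo;
  int64_t hi;
  bool contains(int64_t v) const { return lo <= v && v < hi; }
};

// Returns the case values whose edges stay feasible, i.e. the values the
// condition can actually take. The default edge is not modelled and is
// treated as always feasible, matching the limitation noted in the commit.
std::vector<int64_t> feasibleCases(const Range &cond,
                                   const std::vector<int64_t> &caseValues) {
  assert(cond.lo <= cond.hi && "sketch assumes a non-wrapping range");
  std::vector<int64_t> live;
  for (int64_t v : caseValues)
    if (cond.contains(v))
      live.push_back(v);
  return live;
}
```

For a condition known to lie in [0, 3) and cases 0, 1, 5 and 7, only the edges for 0 and 1 remain; the edges for 5 and 7 are the ones a later cleanup step could remove.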
Florian Hahn	db60ce547b	[LAA] Avoid adding pointers to the checks if they are not needed. Currently we skip alias sets with only reads or a single write and no reads, but still add the pointers to the list of pointers in RtCheck. This can lead to cases where we try to access a pointer that does not exist when grouping checks. In most cases, the way we access PositionMap masked that, as the value would default to index 0. But in the example in PR46854 it causes a crash. This patch updates the logic to avoid adding pointers for alias sets that do not need any checks. It makes things slightly more verbose, by first checking the numbers of reads/writes and bailing out early if we don't need checks for the alias set. I think this makes the logic a bit simpler to follow. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D84608	2020-07-30 19:21:14 +01:00
Sanjay Patel	6d2aa4676f	[InstCombine] update test checks; NFC	2020-07-30 14:16:48 -04:00
Ettore Tiotto	bd3629535d	Fix computeHostNumPhysicalCores() for Linux on POWER and Linux on Z ThinLTO is run using a single thread on Linux on Power. The compute_thread_count() routine calls getHostNumPhysicalCores which returns -1 by default, and so `MaxThreadCount` is set to 1. unsigned llvm::ThreadPoolStrategy::compute_thread_count() const { int MaxThreadCount = UseHyperThreads ? computeHostNumHardwareThreads() : sys::getHostNumPhysicalCores(); if (MaxThreadCount <= 0) MaxThreadCount = 1; … } Fix: provide custom implementation of getHostNumPhysicalCores for Linux on Power and Linux on Z. Reviewed By: Kai, uweigand Differential Revision: https://reviews.llvm.org/D84764	2020-07-30 18:05:36 +00:00
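The fallback quoted in commit bd3629535d can be exercised in isolation. The following is a minimal sketch, not LLVM's code: `computeThreadCount` is a hypothetical stand-in for `ThreadPoolStrategy::compute_thread_count()`, and the two counts are passed in as plain parameters (an assumption made here) so that the example is self-contained.

```cpp
#include <cassert>

// Hypothetical stand-in for ThreadPoolStrategy::compute_thread_count().
// physicalCores plays the role of sys::getHostNumPhysicalCores(), which the
// commit message says returns -1 by default when no platform-specific
// implementation is available.
unsigned computeThreadCount(bool useHyperThreads, int hardwareThreads,
                            int physicalCores) {
  assert(hardwareThreads >= 1 && "sketch assumes at least one hardware thread");
  int maxThreadCount = useHyperThreads ? hardwareThreads : physicalCores;
  // An unknown core count (<= 0) silently degrades to a single thread.
  if (maxThreadCount <= 0)
    maxThreadCount = 1;
  return static_cast<unsigned>(maxThreadCount);
}
```

With `physicalCores == -1` and hyper-threads disabled, the result is 1 no matter how many cores the host has, which is the single-threaded ThinLTO behaviour the commit describes. Supplying a real core count, as the fix does for Linux on Power and Linux on Z, removes that degradation.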
Wouter van Oortmerssen	a492dddf2e	[WebAssembly] Fixed 64-bit indices in br_table LLVM selection dag assumes "switch" indices are pointer sized, which causes problems for our 32-bit br_table. The new function ensures 32-bit operands don't get unnecessarily extended, and 64-bit operands get truncated. Note that the changes to the existing test test exactly that: the addition of -NEXT in 2 places ensures no extension is inserted (which the test previously ignored) and that the wrap is present (previously omitted in wasm64 mode). Differential Revision: https://reviews.llvm.org/D84705	2020-07-30 10:52:16 -07:00
Stanislav Mekhanoshin	d577e781e7	[AMDGPU] Do not use undef on indirect source We are using undef on the indirect move source subreg and then using implicit super-reg. This creates a problem in RA when Greedy decides to split the register. It reassigns the implicit super-reg but does not bother to change undef source because it really does not matter. The fix is to stop lying to RA and drop undef flag. This has also hit a problem in SIFoldOperands as it can fold immediate into an indirect move since there is no undef flag anymore. That results in multiple test failures, so added the check for this case. Differential Revision: https://reviews.llvm.org/D84899	2020-07-30 10:41:59 -07:00
Simon Pilgrim	d93ae4be23	LoopUnroll.cpp - pass std::vector by const reference to needToInsertPhisForLCSSA helper. NFCI. Avoid an unnecessary pass by value.	2020-07-30 18:17:04 +01:00
Yuanfang Chen	e1803bebb8	[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations Problem: Right now, our "Running pass" is not accurate when passes are wrapped in adaptor because adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for an adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order. Solution: Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder) This patch moves debug logging for pass as a PassInstrument callback. We could be sure that all running passes are logged and in the correct order. This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want. The test fixes look messy. It includes changes: - Remove PassInstrumentationAnalysis - Remove PassAdaptor - If a PassAdaptor is for a real pass, the pass is added - Pass reorder (to the correct order), related to PassAdaptor - Add missing passes (due to Debuglogging not passed down) Reviewed By: asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D84774	2020-07-30 10:07:57 -07:00
Craig Topper	a1d16be90c	[WebAssembly] Fix GCC 5 build. Hans' speculative fix in b7292f2db02d37c9291afc0613a3fbce0a4ad4e8 didn't work for me. This seems to.	2020-07-30 10:00:28 -07:00
Hiroshi Yamauchi	88463a5c71	[PGO] Include the mem ops into the function hash. To avoid hash collisions when the only difference is in mem ops.	2020-07-30 09:26:20 -07:00
hsmahesha	cca61dc4bb	[AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic Get rid of all fixmes and base heuristic on `num-clustered-dwords`. The main intuition behind this is as follows. The existing heuristic roughly summarizes as below: * Assume, all the mem ops instructions participating in the clustering process, loads/stores same num bytes * If num bytes loaded by each mem op is 4 bytes, then cluster at max 5 mem ops, that is at max 20 bytes * If num bytes loaded by each mem op is 8 bytes, then cluster at max 3 mem ops, that is at max 24 bytes * If num bytes loaded by each mem op is 16 bytes, then cluster at max 2 mem ops, that is at max 32 bytes So, we need to make sure that the new heuristic does not completely deviate away from the above one, and it properly handles both the sub-word loads and the wide loads. Reviewed By: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D84354	2020-07-30 21:41:13 +05:30
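The pre-existing limits listed in commit cca61dc4bb can be written down as a small table, together with the dword totals they imply. This is a sketch of the summary in the message only, not the AMDGPU scheduler code: `maxClusteredOps` and `clusteredDwords` are hypothetical names, and rounding each op up to whole dwords is an assumption made here. The threshold the new `num-clustered-dwords` heuristic actually uses is not stated in the message and is not modelled.

```cpp
#include <cassert>

// Maximum number of mem ops clustered together under the pre-existing
// heuristic, as summarized in the commit message. Sizes other than 4, 8
// and 16 bytes are not covered by that summary, so this sketch returns 0
// for them (an assumption).
unsigned maxClusteredOps(unsigned bytesPerOp) {
  switch (bytesPerOp) {
  case 4:
    return 5; // at most 20 bytes
  case 8:
    return 3; // at most 24 bytes
  case 16:
    return 2; // at most 32 bytes
  default:
    return 0;
  }
}

// Total dwords (4-byte units) touched by a cluster of numOps mem ops.
// Each op is rounded up to whole dwords, so a sub-word load counts as one.
unsigned clusteredDwords(unsigned bytesPerOp, unsigned numOps) {
  assert(bytesPerOp > 0 && "each mem op must access at least one byte");
  unsigned dwordsPerOp = (bytesPerOp + 3) / 4;
  return dwordsPerOp * numOps;
}
```

Expressed in dwords, the three old limits come out as 5, 6 and 8, which is the spread a single dword-based threshold has to stay close to while also covering sub-word and wide loads.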
Brendon Cahoon	4b5fd94277	Align store conditional address In cases where the alignment of the datatype is smaller than expected by the instruction, the address is aligned. The aligned address is used for the load, but wasn't used for the store conditional, which resulted in a run-time alignment exception.	2020-07-30 10:42:00 -05:00
Fangrui Song	4649a23e60	[X86] Parse and ignore .arch directives We parse .arch so that some `.arch i386; .code32` code can assemble. It seems that X86AsmParser does not do a good job tracking what features are needed to assemble instructions. GNU as's x86 port supports a very wide range of .arch operands. Ignore the operand for now. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D84900	2020-07-30 08:30:06 -07:00
Johannes Doerfert	49c4fd21d9	[OpenMP][IRBuilder] Support allocas in nested parallel regions We need to keep track of the alloca insertion point (which we already communicate via the callback to the user) as we place allocas as well. Reviewed By: fghanim, SouraVX Differential Revision: https://reviews.llvm.org/D82470	2020-07-30 10:19:39 -05:00
Momchil Velikov	84e2b86794	[AArch64] Fix operand definitions of XPACI/XPACD The operand to these instructions is both input and output. These are not yet emitted by the compiler and the assembler already works fine, so can't test in this patch. But D75044 will use XPACI and provide test coverage for this patch as well. Differential Revision: https://reviews.llvm.org/D84298	2020-07-30 15:31:44 +01:00


201106 Commits