llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Roman Lebedev	55213cd08b	[X86] AMD Zen 3: double the LoopMicroOpBufferSize While the IndVars issue (PR50384) has been resolved, and the compile performance improved, a new blocker emerged: the codegen machine instruction scheduling is also quadratic. So we still can't really specify the right value here. Filed PR50584.	2021-06-05 01:23:58 +03:00
Nico Weber	8a6b99494c	[gn build] (semi-manually) port 07c92b2e9581	2021-06-04 16:41:42 -04:00
Eli Friedman	1e64a33bcb	Regenerate a few tests related to SCEV. In preparation for https://reviews.llvm.org/D103656	2021-06-04 13:35:00 -07:00
Fangrui Song	76489f90a5	[InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat `__profd_` variables are referenced by code only when value profiling is enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just waste space on ELF/Mach-O. We change the comdat symbol from `__profd_` to `__profc_` because an internal symbol does not provide deduplication features on COFF. The choice doesn't matter on ELF. (In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there are now no `__profd_` symbols.) On Windows this enables further optimization. We are no longer affected by the link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can cause duplicate definition error. https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585). This avoids many `/INCLUDE:` directives in `.drectve`. Here is rnk's measurement for Chrome: ``` This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%: # BEFORE $ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}' 1047758867 $ du -cksh base_unittests.exe 82M base_unittests.exe 82M total # AFTER $ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}' 937886499 $ du -cksh base_unittests.exe 78M base_unittests.exe 78M total ``` The change is NFC for Mach-O. Reviewed By: davidxl, rnk Differential Revision: https://reviews.llvm.org/D103372	2021-06-04 13:27:56 -07:00
Nikita Popov	29a4cb7698	[IndVars] Don't forget value when inferring nowrap flags When SimplifyIndVars infers IR nowrap flags from SCEV, this may happen in two ways: Either nowrap flags were already present in SCEV and just get transferred to IR. Or zero/sign extension of addrecs infers additional nowrap flags, and those get transferred to IR. In the latter case, calling forgetValue() ensures that the newly inferred nowrap flags get propagated to any other SCEV expressions based on the addrec. However, the invalidation can also have a major compile-time effect in some cases. For https://bugs.llvm.org/show_bug.cgi?id=50384 with n=512 compile- time drops from 7.1s to 0.8s without this invalidation. At the same time, removing the invalidation doesn't affect any codegen in test-suite. Differential Revision: https://reviews.llvm.org/D103424	2021-06-04 20:57:22 +02:00
Rong Xu	559805b594	[SampleFDO] New hierarchical discriminator for FS SampleFDO (llvm-profdata part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is for llvm-profdata part of change. It sets the bit masks for the profile reader in llvm-profdata. Also add an internal option "-fs-discriminator-pass" for show and merge command to process the profile offline. This patch also moved setDiscriminatorMaskedBitFrom() to SampleProfileReader::create() to simplify the interface. Differential Revision: https://reviews.llvm.org/D103550	2021-06-04 11:22:06 -07:00
Adam Nemet	713ac872c6	[Matrix] Fix transpose-multiply folding if transpose has multiple uses Don't add it to FusedInsts in this case. Differential Revision: https://reviews.llvm.org/D103627	2021-06-04 10:55:03 -07:00
Jessica Paquette	1c60765227	[AArch64][GlobalISel] Handle multiple phis in fixupPHIOpBanks If we ended up with two phi instructions in a block, and we needed to fix up the banks for the first one, we'd end up inserting our COPY before the second phi. E.g. ``` %x = G_PHI ... %fixup = COPY ... %y = G_PHI ... ``` This is invalid MIR, and breaks assumptions made by the register allocator later down the line. With the verifier enabled, it also emits a verification error. This teaches fixupPHIOpBanks to walk past any phi instructions in the block when emitting the fixup copies. Here's an example of the crashing code (same as added testcase): https://godbolt.org/z/h5j1x3o6e Differential Revision: https://reviews.llvm.org/D103582	2021-06-04 09:59:36 -07:00
Mark Schimmel	9c179c2114	Add commutable attribute to opcodes for ARC This patch sets the isCommutable attribute for several opcodes that have the "reg = OPCODE reg, reg" format. Differential Revision: https://reviews.llvm.org/D103653	2021-06-04 19:49:19 +03:00
LLVM GN Syncbot	601a8b2bc9	[gn build] Port d31a2e7554ea	2021-06-04 16:41:04 +00:00
LLVM GN Syncbot	3a3c157dfb	[gn build] Port 7ed7d4ccb899	2021-06-04 16:41:03 +00:00
Joseph Huber	6028348946	[Attributor] Check HeapToStack's state for isKnownHeapToStack This patch changes the `isKnownHeapToStack` and `isAssumedHeapToStack` member functions to return if a function call is going to be altered by HeapToStack. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103574	2021-06-04 12:38:33 -04:00
Joseph Huber	d685caa27a	[Attributor] Allow lookupAAFor to return null on invalid state This patch adds an option to `lookupAAFor` that allows it to return a nullptr if the state of the looked up attribute is invalid. This is so future passes can use this to query other attributes with the guarantee that they are valid. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D103556	2021-06-04 12:29:15 -04:00
Craig Topper	1f378948f5	[RISCV] Teach vsetvli insertion pass that operations on masks don't care about SEW/LMUL. All that really matters is that the VLMAX of the preceding instructions is the same as the VLMAX required by the mask operation. Also update the vmsge(u) handling to use the SEW/LMUL we use for other mask register operations. We were matching it to the compare before. Some cases will be improved if we fix masked compares to use tail agnostic policy. I think they ignore the tail policy anyway. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103299	2021-06-04 09:17:46 -07:00
Sanjay Patel	a621794425	[ConstantFolding] add copysign tests for more FP types; NFC D102673 proposes to ease the current type check, but there doesn't appear to be any test coverage for that.	2021-06-04 11:42:53 -04:00
Alexey Bataev	fe6e3a2893	[OPENMP]Fix PR50129: omp cancel parallel not working as expected. Need to emit a call for __kmpc_cancel_barrier in the exit block for __kmpc_cancel function call if cancellation of the parallel block is requested. Differential Revision: https://reviews.llvm.org/D103646	2021-06-04 08:24:55 -07:00
Bradley Smith	f1cec9e1a5	[AArch64] Remove SETCC of CSEL when the latter's condition can be inverted setcc (csel 0, 1, cond, X), 1, ne ==> csel 0, 1, !cond, X Where X is a condition code setting instruction. Co-authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D103256	2021-06-04 15:53:21 +01:00
Nico Weber	a9f5bf83d0	[gn build] (manually) port de07b1e84d8de9	2021-06-04 10:37:53 -04:00
Sanjay Patel	130926f0a4	[InstCombine] add tests for pow() reassociation; NFC Baseline tests for D102574	2021-06-04 10:16:07 -04:00
Nico Weber	335147c74d	Revert "[InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat" This reverts commit a14fc749aab2c8e1a45d19d512255ebfc69357c3. Breaks check-profile on macOS. See https://reviews.llvm.org/D103372 for details.	2021-06-04 10:00:12 -04:00
Nicholas Guy	bacd696ee9	[AArch64] Further enable UnrollAndJam Due to the dependency on runtime unrolling, UnJ is only enabled by default on in-order scheduling models, and if a cpu is specified through -mcpu. Differential Revision: https://reviews.llvm.org/D103604	2021-06-04 14:18:49 +01:00
Sanjay Patel	9fa61addaf	[InstCombine] add/adjust test comments; NFC Follow-up to post-commit comment: https://reviews.llvm.org/rG23a116c8c446	2021-06-04 09:04:53 -04:00
Mirko Brkusanin	2afa995d4a	[AMDGPU][GlobalISel] Legalize G_ABS Legalize and select G_ABS so that we can use llvm.abs intrinsic Differential Revision: https://reviews.llvm.org/D102391	2021-06-04 14:46:43 +02:00
Dmitry Preobrazhensky	3e26b4f39a	[AMDGPU][MC][NFC] Fixed typos in parser Differential Revision: https://reviews.llvm.org/D103680	2021-06-04 15:40:42 +03:00
Bradley Smith	c336efa1c6	[AArch64][SVE] Add support for using reverse forms of SVE2 shifts When using an ACLE intrinsic for an SVE2 shift, if the predicate passed has all relevant lanes active, then use a reversed version of the instruction if beneficial.	2021-06-04 12:56:53 +01:00
Sanjay Patel	dbb641cb82	[InstCombine] convert lshr to ashr to eliminate cast op This is similar to b865eead7657 ( D103617 ) and fixes: https://llvm.org/PR50575 41b71f718b94c6f12b did this and more (noted with TODO comments in the tests), but it didn't handle the case where the destination is narrower than the source, so it got reverted. This is a simple match-and-replace. If there's evidence that the TODO cases are useful, we can revisit/extend.	2021-06-04 07:04:37 -04:00
Sanjay Patel	45c2069ad4	[InstCombine] add tests for sext-of-trunc-of-lshr; NFC	2021-06-04 07:04:37 -04:00
Nico Weber	5442d5f5fd	Revert "[gn build] port d1d36f7ad (llvm-tapi-diff)" This reverts commit 13155138c1ce1e91032d467e20e557f9cdbf08f5. d1d36f7ad was reverted in 5337c7550d.	2021-06-04 06:46:19 -04:00
Jeremy Morse	f08c337615	Re-land ae4303b42c, "Track PHI values through register coalescing" Was reverted in 0507fc2ffc9, in phi-coalesce-subreg.mir I'd explicitly named some passes to run instead of specifying a range. As a result some two-address-instrs weren't correctly rewritten and the verifier got upset. Original commit message: [DebugInstrRef][2/3] Track PHI values through register coalescing In the instruction referencing variable location model, we store variable locations that point at PHIs in MachineFunction during register allocation. Unfortunately, register coalescing can substantially change the locations of registers, and so that PHI-variable-location side table needs maintenance during the pass. This patch builds an index from the side table, and whenever a vreg gets coalesced into another vreg, updates the index to record the new vreg that the PHI happens in. It also accepts a limited range of subregister coalescing, for example merging a subregister into a larger class. Differential Revision: https://reviews.llvm.org/D86813	2021-06-04 11:32:02 +01:00
Thomas Preud'homme	9245f47396	[test] Fix accidental match in parent_recurse_depth.s The CHECK-NOT directives in tools/llvm-dwarfdump/X86/parent_recurse_depth.s can accidentally match something in the path of the object file created by yaml2obj, for example: llvm-project/llvm/test/tools/llvm-dwarfdump/X86/parent_recurse_depth.s:13:12: error: ONE-NOT: excluded string found in input ^ <stdin>:1:22: note: found here builds/llvm-projects/mainline/release/test/tools/llvm-dwarfdump/X86/Output/parent_recurse_depth.s.tmp.o: file format elf64-x86-64 ^~~~ This commit alleviates this issue by consuming the file name from the output, forcing all the CHECK-NOT to match what comes after. Reviewed By: Higuoxing Differential Revision: https://reviews.llvm.org/D103676	2021-06-04 11:23:27 +01:00
Fraser Cormack	5ee63b3d02	[SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR This patch extends the SelectionDAG's ability to constant-fold vector arithmetic to include support for SPLAT_VECTOR. This is not only for scalable-vector types but also for fixed-length vector types, which helps Hexagon in a couple of cases. The original RISC-V test case was in fact an infinite DAGCombine loop. The pattern `and (truncate v1), (truncate v2)` can be combined to `truncate (and v1, v2)` but the truncate can similarly be combined back to `and (truncate v1), (truncate v2)` (but, crucially, only when one of `v1` or `v2` is a constant vector). It wasn't exposed on fixed-length types because a TRUNCATE of a constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D103246	2021-06-04 09:53:15 +01:00
Tim Northover	672a2c4fb0	AArch64: support atomic zext/sextloads	2021-06-04 09:45:51 +01:00
Esme-Yi	914b5eeb20	[Debug-Info] handle DW_CC_pass_by_value/DW_CC_pass_by_reference under strict DWARF. Summary: When -strict-dwarf=true is specified, the calling convention info DW_CC_pass_by_value or DW_CC_pass_by_reference can only be generated at DWARF5. Reviewed By: shchenz, dblaikie Differential Revision: https://reviews.llvm.org/D103300	2021-06-04 08:14:47 +00:00
Muhammad Omair Javaid	0768752613	Add LLDB in release binaries by default LLDB is currently not selected in LLVM release testing and thus it doesn't make its way into prebuilt binaries which build with default configuration. This patch enables LLDB by default in test-release script. Assuming the LLDB build was disabled by default back in 2016, LLDB support for various architectures has come a long way since then. It has buildbots for most architectures, which supports the case for including it by default. Also lldb build can easily be disabled in case some release managers choose to do so. Reviewed By: tstellar Differential Revision: https://reviews.llvm.org/D101864	2021-06-04 11:57:00 +05:00
madhur13490	aa50d7ed0d	[AMDGPU] [IndirectCalls] Don't propagate attributes to address taken functions and their callees Don't propagate launch bound related attributes to address taken functions and their callees. The idea is to do a traversal over the call graph starting at address taken functions and erase the attributes set by previous logic i.e. process(). This two phase approach makes sure that we don't miss out on deeply nested callees from address taken functions as a function might be called directly as well as indirectly. This patch is also a reattempt of D94585 as latent issues are fixed in hasAddressTaken function in the recent past. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D103138	2021-06-04 11:36:56 +05:30
hsmahesha	f0ab92c1c6	Revert "[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering." This reverts commit d71ff907ef23eaef86ad66ba2d711e4986cd6cb2.	2021-06-04 11:16:46 +05:30
Cyndy Ishida	35b9c30842	Revert "[llvm] llvm-tapi-diff" This reverts commit d1d36f7ad2ae82bea8a6fcc40d6c42a72e21f096. Reverting this patch to investigate linux bot failures + fix with author offline	2021-06-03 21:10:51 -07:00
hsmahesha	fdb852810c	[AMDGPU] Increase alignment of LDS globals if necessary before LDS lowering. Before packing LDS globals into a sorted structure, make sure that their alignment is properly updated based on their size. This will make sure that the members of sorted structure are properly aligned, and hence it will further reduce the probability of unaligned LDS access. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D103261	2021-06-04 09:34:37 +05:30
Arthur Eubanks	ad23e89deb	[NFC] Remove checking pointee type for byval/preallocated type These currently always require a type parameter. The bitcode reader already upgrades old bitcode without the type parameter to use the pointee type.	2021-06-03 19:09:09 -07:00
Nico Weber	c129182606	Revert "Update and improve compiler-rt tests for -mllvm -asan_use_after_return=(never|[runtime]|always)." This reverts commit 41b3088c3f33d712e3d2f64b66ae4eb701fa4bfb. Doesn't build on macOS, see comments on https://reviews.llvm.org/D103304	2021-06-03 21:01:11 -04:00
Wenlei He	34de16e5aa	[CSSPGO][llvm-profgen] Make extended binary the default output format Make extended binary the default output format for CSSPGO. This avoids having to pass a flag every time when generating a profile. It also matches llvm-profdata where binary profile is the default (should we switch to extbinary as default for llvm-profdata?). We plan to compress name table for context profile, which depends on the built-in compression of extbinary. Differential Revision: https://reviews.llvm.org/D103650	2021-06-03 17:58:16 -07:00
Craig Topper	43e89e38de	[RISCV] Simplify some code in RISCVInsertVSETVLI by calling an existing function that does the same thing. NFCI	2021-06-03 17:31:54 -07:00
Nico Weber	218519bf96	[gn build] port d1d36f7ad (llvm-tapi-diff)	2021-06-03 19:22:39 -04:00
Arthur Eubanks	b01ec7e228	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. Issues can be diagnosed with D103412. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-06-03 15:52:01 -07:00
Arthur Eubanks	cf3b931893	[BuildLibCalls] Properly set ABI attributes on arguments Some floating point lib calls have ABI attributes that need to be set on the caller. Found via D103412. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D103415	2021-06-03 15:45:07 -07:00
Fangrui Song	fcb0b24958	[CMake][ELF] Add -fno-semantic-interposition for GCC and Clang>=13 In a `-DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on` build, libLLVM-13git.so is 2% smaller and libclang-cpp.so is 1% smaller (on top of -Wl,-Bsymbolic-functions). There may be some small performance improvement as well because GCC -fPIC suppresses interprocedural optimizations for non-inline definitions by default. Note: we cannot add -fno-semantic-interposition for Clang<13. Clang<13's implementation additionally optimizes global variables, which is incompatible with unfortunate ELF -fno-pic default: direct access relocations for external data. If the executable has a -fno-pic object file referencing a global variable declared in a public header, the direct access relocation will cause a copy relocation. The executable and libLLVM.so/libclang-cpp.so will disagree on the address. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D102453	2021-06-03 15:26:34 -07:00
Philip Reames	c69504e130	Kill a variable which is unused after cddcc4cf [nfc]	2021-06-03 14:38:57 -07:00
zero9178	d052166c91	[NFC] Add missing includes for LLVM_ENABLE_MODULES builds Building LLVM with the LLVM_ENABLE_MODULES cmake option fails when the modules are being compiled due to missing includes. This is a side effect of some transitive includes that changed recently. Differential Revision: https://reviews.llvm.org/D103645	2021-06-03 23:29:03 +02:00
Philip Reames	821fed9ea6	A couple style tweaks on top of 5c0d1b2f9 [nfc]	2021-06-03 14:14:59 -07:00
Philip Reames	8c571feefd	[LoopUnroll] Eliminate PreserveCondBr parameter and fix a bug in the process This builds on D103584. The change eliminates the coupling between unroll heuristic and implementation w.r.t. knowing when the passed in trip count is an exact trip count or a max trip count. In theory the new code is slightly less powerful (since it relies on exact computable trip counts), but in practice, it appears to cover all the same cases. It can also be extended if needed. The test change shows what appears to be a bug in the existing code around the interaction of peeling and unrolling. The original loop only ran 8 iterations. The previous output had the loop peeled by 2, and then an exact unroll of 8. This meant the loop ran a total of 10 iterations which appears to have been a miscompile. Differential Revision: https://reviews.llvm.org/D103620	2021-06-03 14:09:16 -07:00

216759 Commits