llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Fangrui Song	8e87244228	[llvm-objdump] --source: drop the warning when there is no debug info Warnings have been added for three cases (PR41905): (1) missing debug info, (2) the source file cannot be found, (3) the debug info points at a line beyond the end of the file. (1) is probably less useful. This was brought up once on http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two internal users mentioned to me that it was annoying. (I personally find the warning confusing, too.) Users specify --source to get additional information if sources happen to be available. If sources are not available, it should be obvious as the output will have no interleaved source lines. The warning can be especially annoying when using llvm-objdump -S on a bunch of files. This patch drops the warning when there is no debug info. (If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be an error. There is currently no test for an `Error` return value. The only code path is probably a broken symbol table, but we probably already emit a warning in that case.) `source-interleave-prefix.test` has an inappropriate "malformed" test: the test simply has no .debug_* because new llc does not produce debug info when the filename is empty (invalid). I have tried tampering with the header of .debug_info/.debug_line but llvm-symbolizer does not warn. This patch does not intend to add the missing test coverage. Differential Revision: https://reviews.llvm.org/D88715	2021-02-04 09:07:44 -08:00
Jay Foad	8ec3c48d90	[AMDGPU][GlobalISel] Fix v2s16 right shifts When widening, each half of the v2s16 operands needs to be sign extended for G_ASHR or zero extended for G_LSHR. Differential Revision: https://reviews.llvm.org/D96048	2021-02-04 17:04:32 +00:00
Jay Foad	414015c7e8	[AMDGPU][GlobalISel] Use scalar min/max instructions SALU min/max s32 instructions exist so use them. This means that regbankselect can handle min/max much like add/sub/mul/shifts. Differential Revision: https://reviews.llvm.org/D96047	2021-02-04 17:04:32 +00:00
wlei	425d60e18c	[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation For CS profile generation, the process of call stack unwinding is time-consuming since for each LBR entry we need linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring the context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it (popping or pushing a trie node) dynamically during virtual unwinding so that the raw sample can just be recorded on the leaf node; the path (root to leaf) represents its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about 5X speed-up on some large workloads in SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110	2021-02-04 08:43:21 -08:00
Sanjay Patel	b659297be9	[InstCombine] add tests for demanded/known bits of shifted constant; NFC These are variations of a missed analysis noted in: https://llvm.org/PR48984	2021-02-04 10:31:22 -05:00
Dylan McKay	93bf157993	[AVR] Fix up a few accidentally-regressed Generic CodeGen tests recently broken In 85e8e6246e0fcc62ba727e8fb5990f1a632125d0, these tests were modified to work with AVR, but the regex matchers were finicky and required a fix-forward patch, which is this one.	2021-02-05 04:21:54 +13:00
Louis Dionne	85de81a5d4	[libc++] Rename include/support to include/__support We do ship those headers, so the directory name should not be something that can potentially conflict with user-defined directories. Differential Revision: https://reviews.llvm.org/D95956	2021-02-04 10:16:33 -05:00
Simon Pilgrim	a18c56e6ee	[X86] Use VT::changeVectorElementType helper where possible. NFCI.	2021-02-04 15:03:56 +00:00
Dylan McKay	c468872427	[AVR] Add 'XFAIL' to the remaining failing Generic CodeGen tests for AVR This patch adds 'XFAIL: avr' to 2 Generic CodeGen tests, bringing the Generic CodeGen tests for AVR to a pass, with only two XFAILures. After this patch, the Generic CodeGen tests pass on AVR.	2021-02-05 04:02:27 +13:00
Dylan McKay	dd0f6f1963	[AVR] Fix 14 Generic CodeGen tests by making address space explicit or optional This fixes the vast majority of remaining failing AVR Generic CodeGen tests.	2021-02-05 04:02:27 +13:00
Sander de Smalen	fc8e04bae8	NFC: Migrate LoopUnrollPass to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: david-arm, fhahn Differential Revision: https://reviews.llvm.org/D95817	2021-02-04 14:05:40 +00:00
Florian Hahn	c3bf6e0e7a	[ConstraintElimination] Support conditions from loop preheaders This patch extends the condition collection logic to allow adding conditions from pre-headers to loop headers, by allowing cases where the target block dominates some of its predecessors.	2021-02-04 13:58:32 +00:00
Konstantin Zhuravlyov	f21b553131	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Dylan McKay	45fc9d7bf4	[AVR] Remove an assertion that causes generic CodeGen tests to fail It was discussed a few years ago and agreed that it makes sense to remove this assertion as other targets do not perform similar register size checking in inline assembly constraint logic, so the check just adds a needless barrier on AVR. This patch removes the assertion and removes 'XFAIL' from two Generic CodeGen tests for AVR as a result.	2021-02-05 02:05:23 +13:00
Simon Pilgrim	131bb74242	[X86] Remove stale TODO comment. NFC. We now handle implicit zero-extension shuffle mask cases.	2021-02-04 12:14:05 +00:00
Nico Weber	e65f0c4dde	[gn build] (manually) port 0609f257dc2e2c3	2021-02-04 06:52:55 -05:00
Sander de Smalen	beb3fa0881	[ElementCount] NFC: Set 'const' qualifier for getWithIncrement/Decrement. These class methods simply return a new UnivariateLinearPolyBase (e.g. ElementCount), and do not modify the object in any way or form, so qualify for being 'const'.	2021-02-04 11:27:45 +00:00
Jeremy Morse	1dab9824e4	Re-land D94976 after revert in e29552c5aff6 This modified patch avoids redirecting the unit in which a subprogram is created if type units are enabled: DIEs were getting children allocated from different units' memory pools. Original commit message: [DWARF] Create subprogram's DIE in DISubprogram's unit This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted to be shared between CUs. However, the creation of a subprogram DIE can be triggered early, from other CUs. The subprogram definition is then created in one CU, and when the function is actually emitted children are attached to the subprogram that expect to be in another CU. This breaks internal CU references in the children. Fix this by redirecting the creation of subprogram DIEs in getOrCreateContextDIE to the CU specified by its DISubprogram definition. This ensures that the subprogram DIE is always created in the correct CU. Differential Revision: https://reviews.llvm.org/D94976	2021-02-04 11:17:18 +00:00
David Green	a9c44b95c0	[ARM] Handle f16 in GeneratePerfectShuffle This new f16 shuffle under Neon would hit an assert in GeneratePerfectShuffle as it would try to treat a f16 vector as an i8. Add f16 handling, treating them like an i16. Differential Revision: https://reviews.llvm.org/D95446	2021-02-04 11:14:52 +00:00
Jan Svoboda	6e10f537ef	[clang][cli] Command line round-trip for HeaderSearch options This patch implements generation of remaining header search arguments. It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse. This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D94472	2021-02-04 10:18:34 +01:00
Joachim Meyer	f40f02934a	[Support] Indent multi-line descr of enum cli options. As noted in https://reviews.llvm.org/D93459, the formatting of multi-line descriptions of clEnumValN and the like is unfavorable. Thus this patch adds support for correctly indenting these. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D93494	2021-02-04 10:14:44 +01:00
Sebastian Neubauer	ab4f1aa423	[AMDGPU] Save all lanes for reserved VGPRs When SGPRs are spilled to VGPRs, they can overwrite any lane. We need to preserve the value of inactive lanes in function calls, so we save the register even if it is marked as caller saved. Also, teach buildPrologSpill to work when no registers are free like in CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir and update the comment on findScratchNonCalleeSaveRegister as it is not used anymore to realign the stack pointer since D95865. Differential Revision: https://reviews.llvm.org/D95946	2021-02-04 09:56:36 +01:00
wlei	ba7695d4ea	[CSSPGO][llvm-profgen] Compress recursive cycles in calling context This change compresses the context string by removing cycles due to recursive functions for CS profile generation. Removing recursion cycles is a way to normalize the calling context, which is better for sample aggregation and also makes context promotion deterministic. Specifically, for the implementation, we recognize adjacent repeated frames as cycles and deduplicate them through multiple rounds of iteration. For example, consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] So in the end, we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: one for the sample's context key right after unwinding, and one for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default -1 means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556	2021-02-03 22:16:07 -08:00
wlei	a12b3252a9	Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context" This reverts commit 0609f257dc2e2c3e4c7cd30fe2ffd520117e706b.	2021-02-03 22:16:05 -08:00
wlei	924cc19d25	Revert "[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation" This reverts commit 1714ad2336293f351b15dd4b518f9e8618ec38f2.	2021-02-03 22:16:05 -08:00
Petr Hosek	b7bd8d62b5	[NFC] Fix the noprofile attribute comment	2021-02-03 21:54:09 -08:00
Chuanqi Xu	9aeeaeef76	[NFC][Coroutine] Remove redundant comment The functionality in the TODO was added before: https://reviews.llvm.org/rGb3a722e66b75328ab5e2eb5c8572022cb083855b	2021-02-04 12:54:30 +08:00
Kazu Hirata	d6e82443c7	[Transforms/IPO] Use range-based for loops (NFC)	2021-02-03 20:41:20 -08:00
Kazu Hirata	7c818b1f11	[TableGen] Use ListSeparator (NFC)	2021-02-03 20:41:18 -08:00
Kazu Hirata	03a18c0465	[Support] Drop unnecessary const from return types (NFC) Identified with const-return-type.	2021-02-03 20:41:16 -08:00
wlei	99976f0cf2	[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation For CS profile generation, the process of call stack unwinding is time-consuming since for each LBR entry we need linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring the context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it (popping or pushing a trie node) dynamically during virtual unwinding so that the raw sample can just be recorded on the leaf node; the path (root to leaf) represents its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about 5X speed-up on some large workloads in SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110	2021-02-03 18:50:14 -08:00
wlei	4683e274de	[CSSPGO][llvm-profgen] Compress recursive cycles in calling context This change compresses the context string by removing cycles due to recursive functions for CS profile generation. Removing recursion cycles is a way to normalize the calling context, which is better for sample aggregation and also makes context promotion deterministic. Specifically, for the implementation, we recognize adjacent repeated frames as cycles and deduplicate them through multiple rounds of iteration. For example, consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] So in the end, we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: one for the sample's context key right after unwinding, and one for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default -1 means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556	2021-02-03 18:50:14 -08:00
Michael Kruse	930857b772	[OpenMPIRBuilder] Implement collapseLoops. The collapseLoops method implements a transformation facilitating the implementation of the collapse-clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop. This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93268	2021-02-03 19:12:02 -06:00
wlei	0adbe76ec7	[CSSPGO][llvm-profgen] Pseudo probe based CS profile generation This change implements profile generation infra for pseudo probe in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counter and branch counter and aggregated to sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of function sample is recorded by going through each probe among the range and callsite target sample is recorded by extracting the callsite probe from branch's source. Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen. Implementation - Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation. - `populateBodySamplesWithProbes` reading range counter is responsible for recording function body samples and inferring caller's body samples. - `populateBoundarySamplesWithProbes` reading branch counter is responsible for recording call site target samples. - Each sample is recorded with its calling context (named `ContextId`). Note that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two parts: one from the probe stack strings' concatenation and the other from the leaf frame probe. - Added regression test Test Plan: ninja & ninja check-llvm Differential Revision: https://reviews.llvm.org/D92998	2021-02-03 16:21:53 -08:00
Jessica Paquette	fb07198b1d	[AArch64][GlobalISel] Change store value type from p0 -> s64 to import patterns Similar to the G_PTR_ADD + G_LOAD twiddling we do in `preISelLower`. The imported patterns expect scalars only, so they can't handle things like ``` G_STORE %ptr1, %ptr2 ``` To get around this, use s64 instead. (This probably makes a good portion of the manual selection code for G_STORE dead.) This is a 0.2% geomean code size improvement on CTMark at -Os. (Best is consumer-typeset @ -0.7%) Differential Revision: https://reviews.llvm.org/D95908	2021-02-03 16:19:16 -08:00
Nico Weber	1c6798fa1f	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit 97ba5cde52664200819446c1a18de28faf2ed1c6. Still breaks tests: https://reviews.llvm.org/D76802#2540647	2021-02-03 19:14:34 -05:00
Jessica Paquette	19febf3e8e	[AArch64][GlobalISel] Emit G_ASSERT_ZEXT in assignValueToAddress for ZExt params When we have a zeroext parameter coming in on the stack, build ``` %x = G_LOAD ... %x_assert_zext = G_ASSERT_ZEXT %x, narrow_size %trunc = G_TRUNC %x_assert_zext ``` Rather than just loading into the truncated type. This allows us to optimize cases like this: https://godbolt.org/z/vfjhW8 Differential Revision: https://reviews.llvm.org/D95805	2021-02-03 16:06:05 -08:00
Florian Hahn	b7ef828be9	Revert "[LTO] Use lto::backend for code generation." This reverts commit 6a59f0560648b43324b5aed51b9ef996404a25e0, because it is causing failures on green dragon.	2021-02-03 22:49:30 +00:00
Florian Hahn	66a5e06679	Revert "[LTO] Add option enable NewPM with LTOCodeGenerator." This reverts commit 7a6a2cc81aaf064e6f5bc9a9a16973f552d2bdc2 because it is causing failures on green dragon.	2021-02-03 22:49:20 +00:00
Florian Hahn	b7fb55584a	Revert "[LTOCodeGenerator] Use lto::Config for options (NFC)." This reverts commit 0d487cf87aa1b609b7db061def3e5ad068576ecf because it is causing failures on green dragon.	2021-02-03 22:48:54 +00:00
Arthur Eubanks	c05f64384e	Turn on the new pass manager by default This turns on the new pass manager by default for the optimization pipeline in Clang and ThinLTO in various LLD backends. This also makes uses of `opt -instcombine` use the new pass manager (unless specifically opted out). This does not affect the backend target-dependent codegen pipeline. If this causes regressions, you can opt out of the new pass manager either via the -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF CMake flag while building LLVM, or via various compiler flags, e.g. -flegacy-pass-manager for Clang or -Wl,--lto-legacy-pass-manager for ELF LLD. Please file bugs for any regressions. Major differences: * The inliner works slightly differently * -O1 does some amount of inlining * LCSSA and LoopSimplify are run before all loop passes * Loop unswitching is implemented slightly differently * A new SpeculateAroundPHIs pass is added to the pipeline https://lists.llvm.org/pipermail/llvm-dev/2021-January/148098.html Reviewed By: asbirlea, ychen, MaskRay, echristo Differential Revision: https://reviews.llvm.org/D95380	2021-02-03 14:37:46 -08:00
Amara Emerson	cb21fc8971	[GlobalISel] Add sext(constant) -> constant artifact combine. This is the G_SEXT counterpart to the existing G_ZEXT/G_ANYEXT combines. Differential Revision: https://reviews.llvm.org/D95729	2021-02-03 14:10:08 -08:00
Arthur Eubanks	213c28c3fc	[NewPM][HelloWorld] Move HelloWorld to Utils To prevent creating a new component, which creates a new library. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D95907	2021-02-03 12:59:40 -08:00
Richard Smith	d38c68eaec	Fix overflowing signed left shift, found by ubsan buildbot.	2021-02-03 12:51:39 -08:00
Krzysztof Parzyszek	65e46aa23b	[Hexagon] Add LLVM instruction definitions for Hexagon V68	2021-02-03 13:59:34 -06:00
Rong Xu	8e87532070	[SampleFDO][NFC] Detach SampleProfileLoader from SampleCoverageTracker This patch detaches SampleProfileLoader from class SampleCoverageTracker. We plan to move SampleProfileLoader to a template class, while SampleCoverageTracker would remain a class. Also make callsiteIsHot() a file static function. Differential Revision: https://reviews.llvm.org/D95823	2021-02-03 11:38:04 -08:00
Justin Bogner	030339eb55	[GlobalISel] Combine narrowScalar of G_ADD and G_SUB. NFC These two cases have identical implementations other than an unreachable part of `G_ADD` that checks if the scalar we're narrowing is a vector. Combining them to avoid unnecessary divergence.	2021-02-03 11:06:04 -08:00
Matt Arsenault	0db1ff1c80	RegisterCoalescer: Fix not setting undef on coalesced subregister uses This was only adding undef to the use if the copy itself had a subregister index. It did not consider the subrange liveness if the use had a subreg index to begin with.	2021-02-03 13:54:43 -05:00
Matt Arsenault	a94851d3fb	RegisterCoalescer: Prune undef subranges from copy pairs in loops If we had a pair of copies inside a loop which introduced new liveness to a subregister which was undef before the loop, we would have a dummy phi-only segment remaining across the loop body. Later, this false segment would confuse RenameIndependentSubregs causing it to introduce IMPLICIT_DEFs with broken value numbering. It seems always adding the lanes to ShrinkMask is OK, so any conditions should be purely a compile time filter.	2021-02-03 13:42:53 -05:00
Matt Arsenault	89df6b0538	Revert "AMDGPU: Don't consider global pressure when bundling soft clauses" This reverts commit 1e377a273f59375d8e6a424f66f069b3adfa1ca4. A regression was reported.	2021-02-03 13:25:05 -05:00
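
The recursion compression described in ba7695d4ea (first landed as 4683e274de) can be illustrated with a minimal sketch. This is not the llvm-profgen implementation: the function names `compress_recursion` and `_dedup_adjacent` and the parameter `max_cycle_size` are hypothetical, chosen only to mirror the commit message, where `compress-recursion` limits the size of duplicated frames and -1 means no limit. The sketch assumes one pass per cycle size, starting at size 1, and drops a run of frames whenever it repeats the run immediately before it.

```python
from typing import List


def _dedup_adjacent(frames: List[str], size: int) -> List[str]:
    """Drop every run of `size` frames that repeats the run directly before it."""
    result: List[str] = []
    for frame in frames:
        result.append(frame)
        # If the last `size` frames equal the `size` frames preceding them,
        # the tail is one more turn of a recursion cycle: remove it.
        if len(result) >= 2 * size and result[-size:] == result[-2 * size:-size]:
            del result[-size:]
    return result


def compress_recursion(frames: List[str], max_cycle_size: int = -1) -> List[str]:
    """Remove adjacent repeated frame sequences, smallest cycle size first.

    `max_cycle_size` is a hypothetical stand-in for the `compress-recursion`
    switch: -1 means no size limit.
    """
    out = list(frames)
    size = 1
    # A cycle of length `size` needs at least 2 * size frames to repeat once.
    while size <= len(out) // 2:
        if 0 <= max_cycle_size < size:
            break
        out = _dedup_adjacent(out, size)
        size += 1
    return out


if __name__ == "__main__":
    stack = ["a", "a", "b", "c", "a", "b", "c", "b", "c", "d"]
    print(compress_recursion(stack, max_cycle_size=1))
    print(compress_recursion(stack, max_cycle_size=2))
    print(compress_recursion(stack))
```

Run on the stack from the commit message, limiting the cycle size to 1 and then 2 reproduces the two intermediate stacks quoted there, and the unlimited call yields the fully compressed context.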


210746 Commits