llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00

Author	SHA1	Message	Date
Matheus Izvekov	7065f0a696	[X86] Generate unaligned access for fixed slots in unaligned stack loadRegFromStackSlot()/storeRegToStackSlot() can generate aligned access instructions for stack slots even if the stack is unaligned, based on the assumption that the stack can be realigned. However, this doesn't work for fixed slots, which are e.g. used for spilling XMM registers in a non-leaf function with `__attribute__((preserve_all))`. When compiling such code with `-mstack-alignment=8`, this causes general protection faults. Fix it by only considering stack realignment for non-fixed slots. Note that this changes the output of three existing tests which spill AVX registers, since AVX requires higher alignment than the ABI provides on stack frame entry. Reviewed By: rnk, jyknight Differential Revision: https://reviews.llvm.org/D73126	2021-02-05 11:36:54 +08:00
Craig Topper	b58d6a4b20	[TargetLowering] Use Align in allowsMisalignedMemoryAccesses. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96097	2021-02-04 19:22:06 -08:00
Dan Gohman	1471de1de2	[WebAssembly] Support single-floating-point immediate value As mentioned in TODO comment, casting double to float causes NaNs to change bits. To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode. Patch by Yuta Saito. Differential Revision: https://reviews.llvm.org/D77384	2021-02-04 18:05:06 -08:00
Philip Reames	3f9664176e	Add missing test update from 3e5ce49 Sorry for the build break, apparently forgot to build ARM target.	2021-02-04 18:04:24 -08:00
Fangrui Song	5481fc6d44	DebugInfo: Temporarily work around -gsplit-dwarf + LTO .debug_gnu_pubnames regression after D94976 `-flto -gsplit-dwarf -g -O[123]` may create .debug_gnu_pubnames with 0 DIE offset entries. llvm-dwarfdump -debug-gnu-pubnames/ld.lld --gdb-index errors for that. ``` .section .debug_gnu_pubnames,"",@progbits .long .LpubNames_end2-.LpubNames_begin2 # Length of Public Names Info .LpubNames_begin2: .short 2 # DWARF Version .long .Lcu_begin2 # Offset of Compilation Unit Info .long 57 # Compilation Unit Length .long 0 # DIE offset .byte 16 # Attributes: TYPE, EXTERNAL .asciz "absl" # External Name .long 0 # DIE offset .byte 16 # Attributes: TYPE, EXTERNAL .asciz "absl::base_internal" # External Name .long 0 # End Mark ```	2021-02-04 17:35:09 -08:00
Philip Reames	d42b6ae9c0	[LV] Unconditionally branch from middle to scalar preheader if the scalar loop must execute If we know that the scalar epilogue is required to run, modify the CFG to end the middle block with an unconditional branch to scalar preheader. This is instead of a conditional branch to either the preheader or the exit block. The motivation to do this is to support multiple exit blocks. Specifically, the current structure forces us to identify immediate dominators and which exit block to branch from in the middle terminator. For the multiple exit case - where we know require scalar will hold - these questions are ill formed. This is the last change needed to support multiple exit loops, but since the diffs are already large enough, I'm going to land this, and then enable separately. You can think of this as being NFCI-ish prep work, but the changes are a bit too involved for me to feel comfortable tagging the change that way. Differential Revision: https://reviews.llvm.org/D94892	2021-02-04 17:28:30 -08:00
Craig Topper	7c9d7ad64a	[RISCV] Add i8/i16 test cases to div.ll and i8/i16/i64 to rem.ll. NFC This improves our coverage of these operations and shows that we use really large constants for division by constant on i8/i16 especially on RV64. The issue is that BuildSDIV/BuildUDIV are limited to legal types so we have to promote to i64 before it kicks in. At that point we've lost the range information for the original type.	2021-02-04 16:46:23 -08:00
wlei	e7e8e955e2	[CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line when we skip the call stack starting with an external address, we should also skip the bottom LBR entry, otherwise it will cause a truncated context issue. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D95480	2021-02-04 16:15:05 -08:00
Richard Smith	8a8273062f	Don't infer attributes on '::operator new'. These attributes were all incorrect or inappropriate for LLVM to infer: - inaccessiblememonly is generally wrong; user replacement operator new can access memory that's visible to the caller, as can a new_handler function. - willreturn is generally wrong; a custom new_handler is not guaranteed to terminate. - noalias is inappropriate: Clang has a flag to determine whether this attribute should be present and adds it itself when appropriate. - noundef and nonnull on the return value should be specified by the frontend on all 'operator new' functions if we want them, not here. In any case, inferring attributes on functions declared 'nobuiltin' (as these are when Clang emits them) seems questionable.	2021-02-04 13:59:49 -08:00
Richard Smith	c07a815edb	Revert "[BuildLibcalls, Attrs] Support more variants of C++'s new, add attributes for C++'s delete" Several of the new attributes here were incorrect, and even the ones that are generally correct were being added even to nobuiltin calls. This reverts commit bb3f169b59e1c8bd7fd70097532220bbd11e9967.	2021-02-04 13:59:49 -08:00
Ayke van Laethem	388be82ce9	[ARM] Do not emit ldrexd/strexd on Cortex-M chips The ldrexd/strexd instructions are not supported on M-class chips, see for example https://developer.arm.com/documentation/dui0489/e/arm-and-thumb-instructions/memory-access-instructions/ldrex-and-strex which says: > All these 32-bit Thumb instructions are available in ARMv6T2 and > above, except that LDREXD and STREXD are not available in the ARMv7-M > architecture. Looking at the ARMv8-M architecture, it appears that these instructions aren't supported either. The Architecture Reference Manual lists ldrex/strex but not ldrexd/strexd: https://developer.arm.com/documentation/ddi0553/bn/ Godbolt example on LLVM 11.0.0, which incorrectly emits ldrexd/strexd instructions: https://llvm.godbolt.org/z/5qqPnE Differential Revision: https://reviews.llvm.org/D95891	2021-02-04 21:55:34 +01:00
Craig Topper	7f5abbda56	[TargetLowering] Use LegalOnly operand to isOperationLegalOrCustom to simplify some code. NFC	2021-02-04 12:30:37 -08:00
Nikita Popov	0760338efa	[MemorySSA] Don't treat lifetime.end as NoAlias MemorySSA currently treats lifetime.end intrinsics as not aliasing anything. This breaks MemorySSA-based MemCpyOpt, because we'll happily move a read of a pointer below a lifetime.end intrinsic, as no clobber is reported. I think the MemorySSA modelling here isn't correct: lifetime.end(p) has approximately the same effect as doing a memcpy(p, undef), and should be treated as a clobber. This patch removes the special handling of lifetime.end, leaving alias analysis to handle it appropriately. Differential Revision: https://reviews.llvm.org/D95763	2021-02-04 20:58:28 +01:00
wlei	6ee5e73092	[CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size This change allows merging and trimming cold context profiles in llvm-profgen to solve the profile size bloat problem. Currently, when a profile's total samples are below a threshold (set by a switch), it is considered cold and merged into a base context-less profile, which keeps the profile quality at least as good as the baseline (non-CS). For example, two input profiles: [main @ foo @ bar]:60 [main @ bar]:50 Under threshold = 100, the two profiles will be merged into one with the base context, giving the result: [bar]:110 Added two switches: `--csprof-cold-thres=<value>`: Specifies the total samples threshold for a context profile to be considered cold, with 100 being the default. Any cold context profiles will be merged into the context-less base profile by default. `--csprof-keep-cold`: Force profile generation to keep cold context profiles instead of dropping them. By default, any cold context will not be written to the output profile. Results: Though not yet evaluated with the latest CSSPGO, our internal branch is neutral on performance but significantly reduces the profile size. Detailed evaluation of llvm-profgen with CSSPGO will come later. Differential Revision: https://reviews.llvm.org/D94111	2021-02-04 11:05:03 -08:00
Adrian Prantl	81c0c6d2aa	Remove overzealous verifier check on DW_OP_LLVM_entry_value and improve the documentation Based on the comments in the code, the idea is that AsmPrinter is unable to produce entry value blocks of arbitrary length, such as DW_OP_entry_value [DW_OP_reg5 DW_OP_lit1 DW_OP_plus]. But the way the Verifier check is written it also disallows DW_OP_entry_value [DW_OP_reg5] DW_OP_lit1 DW_OP_plus which seems to overshoot the target. Note that this patch does not change any of the safety guards in LiveDebugValues — there is zero behavior change for clang. It just allows us to legalize more complex expressions in future patches. rdar://73907559 Differential Revision: https://reviews.llvm.org/D95990	2021-02-04 10:58:35 -08:00
Sanjay Patel	685525f3ae	[ExpandReductions] fix FMF requirement for fmin/fmax The upstream callers (the vectorizers) were fixed with: bbed5f2f8a04 ( D95690 ) 77adbe6a8c71 We should remove this pass entirely now that reduction legalization/lowering is expected to work just as well, but we need to confirm that the shuffle ops do not regress (for x86 in particular). This should be the last step needed to close: https://llvm.org/PR23116	2021-02-04 13:32:08 -05:00
Christopher Tetreault	27e0b248a8	Reland "Ensure that InstructionCost actually implements a total ordering" The operator< in the previous attempt was incorrect. It is unfortunate that this was only caught by the expensive checks. This reverts commit ff1147c3635685ba6aefbdc9394300adb5404595.	2021-02-04 10:04:10 -08:00
Peng Guo	6d08b0343e	[NFC][llvm-mca] Fix compiler warning Fix clang compiler warning from `-Wrange-loop-analysis`. Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D95997	2021-02-04 09:44:36 -08:00
Wen-Heng (Jack) Chung	5cc002b97f	[AMDGPU] Add f16 to i1 CodeGen patterns. Follow patterns used for f32 and f64 types. Differential Revision: https://reviews.llvm.org/D95964	2021-02-04 11:44:18 -06:00
Paul Robinson	afb45f94ea	[PS4] Allow triple to reflect the new company name.	2021-02-04 09:43:17 -08:00
xgupta	50ff35b458	[examples] Fix Target does not support MC emission in ParallelJIT	2021-02-04 22:44:46 +05:30
Fangrui Song	8e87244228	[llvm-objdump] --source: drop the warning when there is no debug info Warnings have been added for three cases (PR41905): (1) missing debug info, (2) the source file cannot be found, (3) the debug info points at a line beyond the end of the file. (1) is probably less useful. This was brought up once on http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two internal users mentioned it to me that it was annoying. (I personally find the warning confusing, too.) Users specify --source to get additional information if sources happen to be available. If sources are not available, it should be obvious as the output will have no interleaved source lines. The warning can be especially annoying when using llvm-objdump -S on a bunch of files. This patch drops the warning when there is no debug info. (If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be an error. There is currently no test for an `Error` return value. The only code path is probably a broken symbol table, but we probably already emit a warning in that case) `source-interleave-prefix.test` has an inappropriate "malformed" test - the test simply has no .debug_* because new llc does not produce debug info when the filename is empty (invalid). I have tried tampering the header of .debug_info/.debug_line but llvm-symbolizer does not warn. This patch does not intend to add the missing test coverage. Differential Revision: https://reviews.llvm.org/D88715	2021-02-04 09:07:44 -08:00
Jay Foad	8ec3c48d90	[AMDGPU][GlobalISel] Fix v2s16 right shifts When widening, each half of the v2s16 operands needs to be sign extended for G_ASHR or zero extended for G_LSHR. Differential Revision: https://reviews.llvm.org/D96048	2021-02-04 17:04:32 +00:00
Jay Foad	414015c7e8	[AMDGPU][GlobalISel] Use scalar min/max instructions SALU min/max s32 instructions exist so use them. This means that regbankselect can handle min/max much like add/sub/mul/shifts. Differential Revision: https://reviews.llvm.org/D96047	2021-02-04 17:04:32 +00:00
wlei	425d60e18c	[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation For CS profile generation, the process of call stack unwinding is time-consuming since for each LBR entry we need linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring the context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it (pops or pushes a trie node) dynamically during virtual unwinding so that the raw sample can just be recorded on the leaf node; the path (root to leaf) will represent its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about 5X speed-up on some large workloads in the SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110	2021-02-04 08:43:21 -08:00
Sanjay Patel	b659297be9	[InstCombine] add tests for demanded/known bits of shifted constant; NFC These are variations of a missed analysis noted in: https://llvm.org/PR48984	2021-02-04 10:31:22 -05:00
Dylan McKay	93bf157993	[AVR] Fix up a few accidentally-regressed Generic CodeGen tests recently broken In 85e8e6246e0fcc62ba727e8fb5990f1a632125d0, these tests were modified to work with AVR, but the regex matchers were finicky and required a fix forward patch, being this.	2021-02-05 04:21:54 +13:00
Louis Dionne	85de81a5d4	[libc++] Rename include/support to include/__support We do ship those headers, so the directory name should not be something that can potentially conflict with user-defined directories. Differential Revision: https://reviews.llvm.org/D95956	2021-02-04 10:16:33 -05:00
Simon Pilgrim	a18c56e6ee	[X86] Use VT::changeVectorElementType helper where possible. NFCI.	2021-02-04 15:03:56 +00:00
Dylan McKay	c468872427	[AVR] Add 'XFAIL' to the remaining failing Generic CodeGen tests for AVR This patch adds 'XFAIL: avr' to 2 Generic CodeGen tests, bringing the Generic CodeGen tests for AVR to a pass, with only two XFAILures. After this patch, the Generic CodeGen tests pass on AVR.	2021-02-05 04:02:27 +13:00
Dylan McKay	dd0f6f1963	[AVR] Fix 14 Generic CodeGen tests by making address space explicit or optional This fixes the vast majority of remaining failing AVR Generic CodeGen tests.	2021-02-05 04:02:27 +13:00
Sander de Smalen	fc8e04bae8	NFC: Migrate LoopUnrollPass to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: david-arm, fhahn Differential Revision: https://reviews.llvm.org/D95817	2021-02-04 14:05:40 +00:00
Florian Hahn	c3bf6e0e7a	[ConstraintElimination] Support conditions from loop preheaders This patch extends the condition collection logic to allow adding conditions from pre-headers to loop headers, by allowing cases where the target block dominates some of its predecessors.	2021-02-04 13:58:32 +00:00
Konstantin Zhuravlyov	f21b553131	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat\|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Dylan McKay	45fc9d7bf4	[AVR] Remove an assertion that causes generic CodeGen tests to fail It was discussed a few years ago and agreed that it makes sense to remove this assertion as other targets do not perform similar register size checking in inline assembly constraint logic, so the check just adds a needless barrier on AVR. This patch removes the assertion and removes 'XFAIL' from two Generic CodeGen tests for AVR as a result.	2021-02-05 02:05:23 +13:00
Simon Pilgrim	131bb74242	[X86] Remove stale TODO comment. NFC. We now handle implicit zero-extension shuffle mask cases.	2021-02-04 12:14:05 +00:00
Nico Weber	e65f0c4dde	[gn build] (manually) port 0609f257dc2e2c3	2021-02-04 06:52:55 -05:00
Sander de Smalen	beb3fa0881	[ElementCount] NFC: Set 'const' qualifier for getWithIncrement/Decrement. These class methods simply return a new UnivariateLinearPolyBase (e.g. ElementCount), and do not modify the object in any way or form, so qualify for being 'const'.	2021-02-04 11:27:45 +00:00
Jeremy Morse	1dab9824e4	Re-land D94976 after revert in e29552c5aff6 This modified patch avoids redirecting the unit in which a subprogram is created if type units are enabled -- DIEs were getting children allocated from different units' memory pools. Original commit message: [DWARF] Create subprogram's DIE in DISubprogram's unit This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted to be shared between CUs. However, the creation of a subprogram DIE can be triggered early, from other CUs. The subprogram definition is then created in one CU, and when the function is actually emitted children are attached to the subprogram that expect to be in another CU. This breaks internal CU references in the children. Fix this by redirecting the creation of subprogram DIEs in getOrCreateContextDIE to the CU specified by its DISubprogram definition. This ensures that the subprogram DIE is always created in the correct CU. Differential Revision: https://reviews.llvm.org/D94976	2021-02-04 11:17:18 +00:00
David Green	a9c44b95c0	[ARM] Handle f16 in GeneratePerfectShuffle This new f16 shuffle under Neon would hit an assert in GeneratePerfectShuffle as it would try to treat a f16 vector as an i8. Add f16 handling, treating them like an i16. Differential Revision: https://reviews.llvm.org/D95446	2021-02-04 11:14:52 +00:00
Jan Svoboda	6e10f537ef	[clang][cli] Command line round-trip for HeaderSearch options This patch implements generation of remaining header search arguments. It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse. This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D94472	2021-02-04 10:18:34 +01:00
Joachim Meyer	f40f02934a	[Support] Indent multi-line descr of enum cli options. As noted in https://reviews.llvm.org/D93459, the formatting of multi-line descriptions of clEnumValN and the likes is unfavorable. Thus this patch adds support for correctly indenting these. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D93494	2021-02-04 10:14:44 +01:00
Sebastian Neubauer	ab4f1aa423	[AMDGPU] Save all lanes for reserved VGPRs When SGPRs are spilled to VGPRs, they can overwrite any lane. We need to preserve the value of inactive lanes in function calls, so we save the register even if it is marked as caller saved. Also, teach buildPrologSpill to work when no registers are free like in CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir and update the comment on findScratchNonCalleeSaveRegister as it is not used anymore to realign the stack pointer since D95865. Differential Revision: https://reviews.llvm.org/D95946	2021-02-04 09:56:36 +01:00
wlei	ba7695d4ea	[CSSPGO][llvm-profgen] Compress recursive cycles in calling context This change compresses the context string by removing cycles due to recursive functions for CS profile generation. Removing recursion cycles is a way to normalize the calling context, which is better for sample aggregation and also makes context promotion deterministic. Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicate them through multiple rounds of iteration. For example: Consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] So in the end, we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: one for the sample's context key right after unwinding, and one for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default -1 means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556	2021-02-03 22:16:07 -08:00
wlei	a12b3252a9	Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context" This reverts commit 0609f257dc2e2c3e4c7cd30fe2ffd520117e706b.	2021-02-03 22:16:05 -08:00
wlei	924cc19d25	Revert "[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation" This reverts commit 1714ad2336293f351b15dd4b518f9e8618ec38f2.	2021-02-03 22:16:05 -08:00
Petr Hosek	b7bd8d62b5	[NFC] Fix the noprofile attribute comment	2021-02-03 21:54:09 -08:00
Chuanqi Xu	9aeeaeef76	[NFC][Coroutine] Remove redundant comment The functionality in the TODO was added before: https://reviews.llvm.org/rGb3a722e66b75328ab5e2eb5c8572022cb083855b	2021-02-04 12:54:30 +08:00
Kazu Hirata	d6e82443c7	[Transforms/IPO] Use range-based for loops (NFC)	2021-02-03 20:41:20 -08:00
Kazu Hirata	7c818b1f11	[TableGen] Use ListSeparator (NFC)	2021-02-03 20:41:18 -08:00
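
The cold-context merge described in commit 6ee5e73092 can be illustrated with a small sketch. This is not llvm-profgen's code: the function name `merge_cold_contexts`, the tuple representation of a context, and the dict of sample totals are all assumptions made for illustration. Only the threshold of 100 and the example numbers come from the commit message.

```python
def merge_cold_contexts(profiles, cold_threshold=100):
    """Fold context profiles below cold_threshold into their base context.

    `profiles` maps a context (tuple of frames, outermost caller first)
    to its total sample count. This layout is hypothetical.
    """
    merged = {}
    for context, samples in profiles.items():
        # A cold context keeps only its leaf frame, i.e. the context-less base profile.
        key = context if samples >= cold_threshold else context[-1:]
        merged[key] = merged.get(key, 0) + samples
    return merged


# The example from the commit message: [main @ foo @ bar]:60 and [main @ bar]:50.
profiles = {("main", "foo", "bar"): 60, ("main", "bar"): 50}
print(merge_cold_contexts(profiles))  # {('bar',): 110}
```

Both inputs fall below the threshold, so they collapse into the single base profile `[bar]:110`. A context at or above the threshold would keep its full key.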

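The recursion compression described in commit ba7695d4ea (removing adjacent repeated frame sequences of size 1, then size 2, and so on) can be sketched as follows. This is a minimal illustration under stated assumptions, not the llvm-profgen implementation: the function name `compress_recursion` and the list-of-strings stack are hypothetical, and `max_size` only loosely mirrors the `compress-recursion` switch, where -1 means no size limit.

```python
def compress_recursion(frames, max_size=-1):
    """Remove adjacent repeated frame sequences, smallest cycle size first."""
    frames = list(frames)
    size = 1
    while size * 2 <= len(frames) and (max_size < 0 or size <= max_size):
        kept = []
        for frame in frames:
            kept.append(frame)
            # Drop the last `size` frames if they repeat the `size` frames before them.
            if len(kept) >= 2 * size and kept[-size:] == kept[-2 * size:-size]:
                del kept[-size:]
        frames = kept
        size += 1
    return frames


# The example stack from the commit message.
stack = ["a", "a", "b", "c", "a", "b", "c", "b", "c", "d"]
print(compress_recursion(stack, max_size=1))  # ['a', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'd']
print(compress_recursion(stack, max_size=2))  # ['a', 'b', 'c', 'a', 'b', 'c', 'd']
print(compress_recursion(stack))              # ['a', 'b', 'c', 'd']
```

Limiting `max_size` to 1 and 2 reproduces the two intermediate stacks quoted in the commit message, and running without a limit yields the final compressed context.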

210917 Commits