llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

Author	SHA1	Message	Date
James Henderson	ccd76d48a6	[llvm-objdump] Fix missing first line of license in header file	2021-02-05 08:45:50 +00:00
Max Kazantsev	95717fc944	[Test] Add more tests demonstrating oddities in behavior of LSR These tests demonstrate that LSR does not insert the IV increment into the latch block (as it is supposed to) when it can use an existing Phi as IV rather than creating a new LSR IV.	2021-02-05 14:04:29 +07:00
Kazu Hirata	9912faec1f	[Transforms/Scalar] Use range-based for loops (NFC)	2021-02-04 21:18:05 -08:00
Kazu Hirata	f9bdd6d212	[GlobalISel] Use ListSeparator (NFC)	2021-02-04 21:18:04 -08:00
Kazu Hirata	70b89a598e	[IR] Drop unnecessary const from return types (NFC) Identified with const-return-type.	2021-02-04 21:18:02 -08:00
Fangrui Song	8a939b8ebd	LLVMgold.so: Fix tests after D95380	2021-02-04 21:14:36 -08:00
Fangrui Song	b5deb0e404	[MC] Add isFPImm after D96091	2021-02-04 20:51:02 -08:00
Fangrui Song	01de7047c8	[VE] Fix allowsMisalignedMemoryAccesses after D96097	2021-02-04 20:46:18 -08:00
Fangrui Song	1c762d7312	[MC] Add createFPImm/isFPImm/setFPImm to smooth migration from FPImm to DFPImm after D96091	2021-02-04 20:42:35 -08:00
Fangrui Song	161c766c65	[ARM][WebAssembly] Fix incorrect MCOperand::createDFPImm after D96091	2021-02-04 20:39:52 -08:00
Craig Topper	2420728a51	[RISCV] Use LLVMScalarOrSameVectorWidth to avoid needing to mention the index type for vrgatherei16 intrinsics. Add .vv to the intrinsic name to be consistent with D95979. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D95981	2021-02-04 20:26:45 -08:00
Craig Topper	29b2411301	[RISCV] Split vrgather intrinsics into separate vrgather.vv and vrgather.vx intrinsics. The vrgather.vv instruction uses a vector of indices with the same SEW as operand 0. The vrgather.vx instructions use a scalar index operand of XLen bits. By splitting this into 2 intrinsics we are able to use LLVMMatchType in the definition to avoid specifying the type for the index operand when creating the IR for the intrinsic. For .vv it will match the operand 0 type. And for .vx it will match the type of the vl operand we already needed to specify a type for. I'm considering splitting more intrinsics. This was a somewhat odd one because the .vx doesn't use the element type, it always uses XLen. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D95979	2021-02-04 19:50:12 -08:00
Matheus Izvekov	7065f0a696	[X86] Generate unaligned access for fixed slots in unaligned stack loadRegFromStackSlot()/storeRegToStackSlot() can generate aligned access instructions for stack slots even if the stack is unaligned, based on the assumption that the stack can be realigned. However, this doesn't work for fixed slots, which are e.g. used for spilling XMM registers in a non-leaf function with `__attribute__((preserve_all))`. When compiling such code with `-mstack-alignment=8`, this causes general protection faults. Fix it by only considering stack realignment for non-fixed slots. Note that this changes the output of three existing tests which spill AVX registers, since AVX requires higher alignment than the ABI provides on stack frame entry. Reviewed By: rnk, jyknight Differential Revision: https://reviews.llvm.org/D73126	2021-02-05 11:36:54 +08:00
Craig Topper	b58d6a4b20	[TargetLowering] Use Align in allowsMisalignedMemoryAccesses. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96097	2021-02-04 19:22:06 -08:00
Dan Gohman	1471de1de2	[WebAssembly] Support single-floating-point immediate value As mentioned in TODO comment, casting double to float causes NaNs to change bits. To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode. Patch by Yuta Saito. Differential Revision: https://reviews.llvm.org/D77384	2021-02-04 18:05:06 -08:00
Philip Reames	3f9664176e	Add missing test update from 3e5ce49 Sorry for the build break, apparently forgot to build ARM target.	2021-02-04 18:04:24 -08:00
Fangrui Song	5481fc6d44	DebugInfo: Temporarily work around -gsplit-dwarf + LTO .debug_gnu_pubnames regression after D94976 `-flto -gsplit-dwarf -g -O[123]` may create .debug_gnu_pubnames with 0 DIE offset entries. llvm-dwarfdump -debug-gnu-pubnames/ld.lld --gdb-index errors for that. ``` .section .debug_gnu_pubnames,"",@progbits .long .LpubNames_end2-.LpubNames_begin2 # Length of Public Names Info .LpubNames_begin2: .short 2 # DWARF Version .long .Lcu_begin2 # Offset of Compilation Unit Info .long 57 # Compilation Unit Length .long 0 # DIE offset .byte 16 # Attributes: TYPE, EXTERNAL .asciz "absl" # External Name .long 0 # DIE offset .byte 16 # Attributes: TYPE, EXTERNAL .asciz "absl::base_internal" # External Name .long 0 # End Mark ```	2021-02-04 17:35:09 -08:00
Philip Reames	d42b6ae9c0	[LV] Unconditionally branch from middle to scalar preheader if the scalar loop must execute If we know that the scalar epilogue is required to run, modify the CFG to end the middle block with an unconditional branch to scalar preheader. This is instead of a conditional branch to either the preheader or the exit block. The motivation to do this is to support multiple exit blocks. Specifically, the current structure forces us to identify immediate dominators and which exit block to branch from in the middle terminator. For the multiple exit case - where we know require scalar will hold - these questions are ill formed. This is the last change needed to support multiple exit loops, but since the diffs are already large enough, I'm going to land this, and then enable separately. You can think of this as being NFCI-ish prep work, but the changes are a bit too involved for me to feel comfortable tagging the change that way. Differential Revision: https://reviews.llvm.org/D94892	2021-02-04 17:28:30 -08:00
Craig Topper	7c9d7ad64a	[RISCV] Add i8/i16 test cases to div.ll and i8/i16/i64 to rem.ll. NFC This improves our coverage of these operations and shows that we use really large constants for division by constant on i8/i16 especially on RV64. The issue is that BuildSDIV/BuildUDIV are limited to legal types so we have to promote to i64 before it kicks in. At that point we've lost the range information for the original type.	2021-02-04 16:46:23 -08:00
wlei	e7e8e955e2	[CSSPGO][llvm-profgen] Fix bug with parsing hybrid sample trace line when we skip the call stack starting with an external address, we should also skip the bottom LBR entry, otherwise it will cause a truncated context issue. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D95480	2021-02-04 16:15:05 -08:00
Richard Smith	8a8273062f	Don't infer attributes on '::operator new'. These attributes were all incorrect or inappropriate for LLVM to infer: - inaccessiblememonly is generally wrong; user replacement operator new can access memory that's visible to the caller, as can a new_handler function. - willreturn is generally wrong; a custom new_handler is not guaranteed to terminate. - noalias is inappropriate: Clang has a flag to determine whether this attribute should be present and adds it itself when appropriate. - noundef and nonnull on the return value should be specified by the frontend on all 'operator new' functions if we want them, not here. In any case, inferring attributes on functions declared 'nobuiltin' (as these are when Clang emits them) seems questionable.	2021-02-04 13:59:49 -08:00
Richard Smith	c07a815edb	Revert "[BuildLibcalls, Attrs] Support more variants of C++'s new, add attributes for C++'s delete" Several of the new attributes here were incorrect, and even the ones that are generally correct were being added even to nobuiltin calls. This reverts commit bb3f169b59e1c8bd7fd70097532220bbd11e9967.	2021-02-04 13:59:49 -08:00
Ayke van Laethem	388be82ce9	[ARM] Do not emit ldrexd/strexd on Cortex-M chips The ldrexd/strexd instructions are not supported on M-class chips, see for example https://developer.arm.com/documentation/dui0489/e/arm-and-thumb-instructions/memory-access-instructions/ldrex-and-strex which says: > All these 32-bit Thumb instructions are available in ARMv6T2 and > above, except that LDREXD and STREXD are not available in the ARMv7-M > architecture. Looking at the ARMv8-M architecture, it appears that these instructions aren't supported either. The Architecture Reference Manual lists ldrex/strex but not ldrexd/strexd: https://developer.arm.com/documentation/ddi0553/bn/ Godbolt example on LLVM 11.0.0, which incorrectly emits ldrexd/strexd instructions: https://llvm.godbolt.org/z/5qqPnE Differential Revision: https://reviews.llvm.org/D95891	2021-02-04 21:55:34 +01:00
Craig Topper	7f5abbda56	[TargetLowering] Use LegalOnly operand to isOperationLegalOrCustom to simplify some code. NFC	2021-02-04 12:30:37 -08:00
Nikita Popov	0760338efa	[MemorySSA] Don't treat lifetime.end as NoAlias MemorySSA currently treats lifetime.end intrinsics as not aliasing anything. This breaks MemorySSA-based MemCpyOpt, because we'll happily move a read of a pointer below a lifetime.end intrinsic, as no clobber is reported. I think the MemorySSA modelling here isn't correct: lifetime.end(p) has approximately the same effect as doing a memcpy(p, undef), and should be treated as a clobber. This patch removes the special handling of lifetime.end, leaving alias analysis to handle it appropriately. Differential Revision: https://reviews.llvm.org/D95763	2021-02-04 20:58:28 +01:00
wlei	6ee5e73092	[CSSPGO][llvm-profgen] Merge and trim profile for cold context to reduce profile size This change allows merging and trimming cold context profiles in llvm-profgen to solve the profile size bloat problem. Currently, when a profile's total sample count is below a threshold (controlled by a switch), it is considered cold and merged into a base context-less profile, which keeps the profile quality at least as good as the baseline (non-CS). For example, given two input profiles: [main @ foo @ bar]:60 [main @ bar]:50 Under threshold = 100, the two profiles will be merged into one with the base context, giving the result: [bar]:110 Added two switches: `--csprof-cold-thres=<value>`: Specifies the total samples threshold for a context profile to be considered cold, with 100 being the default. Any cold context profiles will be merged into the context-less base profile by default. `--csprof-keep-cold`: Force profile generation to keep cold context profiles instead of dropping them. By default, any cold context will not be written to the output profile. Results: Though we have not yet evaluated it with the latest CSSPGO, our internal branch shows neutral performance but a significantly reduced profile size. Detailed evaluation on llvm-profgen with CSSPGO will come later. Differential Revision: https://reviews.llvm.org/D94111	2021-02-04 11:05:03 -08:00
Adrian Prantl	81c0c6d2aa	Remove overzealous verifier check on DW_OP_LLVM_entry_value and improve the documentation Based on the comments in the code, the idea is that AsmPrinter is unable to produce entry value blocks of arbitrary length, such as DW_OP_entry_value [DW_OP_reg5 DW_OP_lit1 DW_OP_plus]. But the way the Verifier check is written it also disallows DW_OP_entry_value [DW_OP_reg5] DW_OP_lit1 DW_OP_plus which seems to overshoot the target. Note that this patch does not change any of the safety guards in LiveDebugValues — there is zero behavior change for clang. It just allows us to legalize more complex expressions in future patches. rdar://73907559 Differential Revision: https://reviews.llvm.org/D95990	2021-02-04 10:58:35 -08:00
Sanjay Patel	685525f3ae	[ExpandReductions] fix FMF requirement for fmin/fmax The upstream callers (the vectorizers) were fixed with: bbed5f2f8a04 ( D95690 ) 77adbe6a8c71 We should remove this pass entirely now that reduction legalization/lowering is expected to work just as well, but we need to confirm that the shuffle ops do not regress (for x86 in particular). This should be the last step needed to close: https://llvm.org/PR23116	2021-02-04 13:32:08 -05:00
Christopher Tetreault	27e0b248a8	Reland "Ensure that InstructionCost actually implements a total ordering" The operator< in the previous attempt was incorrect. It is unfortunate that this was only caught by the expensive checks. This reverts commit ff1147c3635685ba6aefbdc9394300adb5404595.	2021-02-04 10:04:10 -08:00
Peng Guo	6d08b0343e	[NFC][llvm-mca] Fix compiler warning Fix clang compiler warning from `-Wrange-loop-analysis`. Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D95997	2021-02-04 09:44:36 -08:00
Wen-Heng (Jack) Chung	5cc002b97f	[AMDGPU] Add f16 to i1 CodeGen patterns. Follow patterns used for f32 and f64 types. Differential Revision: https://reviews.llvm.org/D95964	2021-02-04 11:44:18 -06:00
Paul Robinson	afb45f94ea	[PS4] Allow triple to reflect the new company name.	2021-02-04 09:43:17 -08:00
xgupta	50ff35b458	[examples] Fix Target does not support MC emission in ParallelJIT	2021-02-04 22:44:46 +05:30
Fangrui Song	8e87244228	[llvm-objdump] --source: drop the warning when there is no debug info Warnings have been added for three cases (PR41905): (1) missing debug info, (2) the source file cannot be found, (3) the debug info points at a line beyond the end of the file. (1) is probably less useful. This was brought up once on http://lists.llvm.org/pipermail/llvm-dev/2020-April/141264.html and two internal users mentioned it to me that it was annoying. (I personally find the warning confusing, too.) Users specify --source to get additional information if sources happen to be available. If sources are not available, it should be obvious as the output will have no interleaved source lines. The warning can be especially annoying when using llvm-objdump -S on a bunch of files. This patch drops the warning when there is no debug info. (If LLVMSymbolizer::symbolizeCode returns an `Error`, there will still be an error. There is currently no test for an `Error` return value. The only code path is probably a broken symbol table, but we probably already emit a warning in that case) `source-interleave-prefix.test` has an inappropriate "malformed" test - the test simply has no .debug_* because new llc does not produce debug info when the filename is empty (invalid). I have tried tampering the header of .debug_info/.debug_line but llvm-symbolizer does not warn. This patch does not intend to add the missing test coverage. Differential Revision: https://reviews.llvm.org/D88715	2021-02-04 09:07:44 -08:00
Jay Foad	8ec3c48d90	[AMDGPU][GlobalISel] Fix v2s16 right shifts When widening, each half of the v2s16 operands needs to be sign extended for G_ASHR or zero extended for G_LSHR. Differential Revision: https://reviews.llvm.org/D96048	2021-02-04 17:04:32 +00:00
Jay Foad	414015c7e8	[AMDGPU][GlobalISel] Use scalar min/max instructions SALU min/max s32 instructions exist so use them. This means that regbankselect can handle min/max much like add/sub/mul/shifts. Differential Revision: https://reviews.llvm.org/D96047	2021-02-04 17:04:32 +00:00
wlei	425d60e18c	[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation For CS profile generation, call stack unwinding is time-consuming, since for each LBR entry we need linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring the context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that the raw sample can just be recorded on the leaf node; the path (root to leaf) represents its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about 5X speed-up on some large workloads in SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110	2021-02-04 08:43:21 -08:00
Sanjay Patel	b659297be9	[InstCombine] add tests for demanded/known bits of shifted constant; NFC These are variations of a missed analysis noted in: https://llvm.org/PR48984	2021-02-04 10:31:22 -05:00
Dylan McKay	93bf157993	[AVR] Fix up a few accidentally-regressed Generic CodeGen tests recently broken In 85e8e6246e0fcc62ba727e8fb5990f1a632125d0, these tests were modified to work with AVR, but the regex matchers were finicky and required a fix forward patch, being this.	2021-02-05 04:21:54 +13:00
Louis Dionne	85de81a5d4	[libc++] Rename include/support to include/__support We do ship those headers, so the directory name should not be something that can potentially conflict with user-defined directories. Differential Revision: https://reviews.llvm.org/D95956	2021-02-04 10:16:33 -05:00
Simon Pilgrim	a18c56e6ee	[X86] Use VT::changeVectorElementType helper where possible. NFCI.	2021-02-04 15:03:56 +00:00
Dylan McKay	c468872427	[AVR] Add 'XFAIL' to the remaining failing Generic CodeGen tests for AVR This patch adds 'XFAIL: avr' to 2 Generic CodeGen tests, bringing the Generic CodeGen tests for AVR to a pass, with only two XFAILures. After this patch, the Generic CodeGen tests pass on AVR.	2021-02-05 04:02:27 +13:00
Dylan McKay	dd0f6f1963	[AVR] Fix 14 Generic CodeGen tests by making address space explicit or optional This fixes the vast majority of remaining failing AVR Generic CodeGen tests.	2021-02-05 04:02:27 +13:00
Sander de Smalen	fc8e04bae8	NFC: Migrate LoopUnrollPass to work on InstructionCost This patch migrates cost values and arithmetic to work on InstructionCost. When the interfaces to TargetTransformInfo are changed, any InstructionCost state will propagate naturally. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: david-arm, fhahn Differential Revision: https://reviews.llvm.org/D95817	2021-02-04 14:05:40 +00:00
Florian Hahn	c3bf6e0e7a	[ConstraintElimination] Support conditions from loop preheaders This patch extends the condition collection logic to allow adding conditions from pre-headers to loop headers, by allowing cases where the target block dominates some of its predecessors.	2021-02-04 13:58:32 +00:00
Konstantin Zhuravlyov	f21b553131	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat\|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Dylan McKay	45fc9d7bf4	[AVR] Remove an assertion that causes generic CodeGen tests to fail It was discussed a few years ago and agreed that it makes sense to remove this assertion as other targets do not perform similar register size checking in inline assembly constraint logic, so the check just adds a needless barrier on AVR. This patch removes the assertion and removes 'XFAIL' from two Generic CodeGen tests for AVR as a result.	2021-02-05 02:05:23 +13:00
Simon Pilgrim	131bb74242	[X86] Remove stale TODO comment. NFC. We now handle implicit zero-extension shuffle mask cases.	2021-02-04 12:14:05 +00:00
Nico Weber	e65f0c4dde	[gn build] (manually) port 0609f257dc2e2c3	2021-02-04 06:52:55 -05:00
Sander de Smalen	beb3fa0881	[ElementCount] NFC: Set 'const' qualifier for getWithIncrement/Decrement. These class methods simply return a new UnivariateLinearPolyBase (e.g. ElementCount), and do not modify the object in any way or form, so qualify for being 'const'.	2021-02-04 11:27:45 +00:00
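
The v2s16 right-shift fix in 8ec3c48d90 rests on a small piece of arithmetic: when a 16-bit lane is widened to 32 bits, the upper half must be filled according to the shift kind, or the bits shifted in from above are wrong. The following is a minimal sketch of that reasoning in plain Python integers, not the AMDGPU GlobalISel code. All function names are hypothetical, and the shift amount is assumed to be in the range 0..15.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def sext16_to_32(x):
    """Sign-extend a 16-bit pattern to a 32-bit pattern."""
    x &= MASK16
    return x | 0xFFFF0000 if x & 0x8000 else x


def zext16_to_32(x):
    """Zero-extend a 16-bit pattern to a 32-bit pattern."""
    return x & MASK16


def ashr32(x, n):
    """Arithmetic shift right on a 32-bit pattern."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret as a negative Python int
    return (x >> n) & MASK32


def lshr32(x, n):
    """Logical shift right on a 32-bit pattern."""
    return (x & MASK32) >> n


def widened_ashr16(x, n):
    # Arithmetic shift: the widened value must carry the sign bit upward.
    return ashr32(sext16_to_32(x), n) & MASK16


def widened_lshr16(x, n):
    # Logical shift: the widened value must have zeros above bit 15.
    return lshr32(zext16_to_32(x), n) & MASK16


# With the matching extension, the 16-bit results are as expected.
good_ashr = widened_ashr16(0x8000, 4)
good_lshr = widened_lshr16(0x8000, 4)

# With the mismatched extension, the wrong bits are shifted into the lane.
bad_ashr = ashr32(zext16_to_32(0x8000), 4) & MASK16
bad_lshr = lshr32(sext16_to_32(0x8000), 4) & MASK16
```

For the input `0x8000` shifted by 4, the matching extensions give `0xF800` (arithmetic) and `0x0800` (logical), while the mismatched ones swap those results.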

210779 Commits
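
The cold-context merging described in 6ee5e73092 can be illustrated with the example from that commit message. This is a simplified sketch, not llvm-profgen's implementation: it assumes a profile is just a mapping from a context string (frames joined by ` @ `) to a total sample count, and the function name `merge_cold_contexts` is hypothetical. A context whose total is below the threshold is folded into the context-less profile of its leaf function.

```python
from collections import defaultdict


def merge_cold_contexts(profiles, cold_threshold=100):
    """Fold cold context profiles into context-less base profiles.

    profiles maps a context such as "main @ foo @ bar" to its total
    sample count (an assumed, simplified representation).
    """
    merged = defaultdict(int)
    for context, samples in profiles.items():
        if samples < cold_threshold:
            # Cold: keep only the leaf function as the base context.
            leaf = context.split(" @ ")[-1]
            merged[leaf] += samples
        else:
            # Hot: keep the full calling context.
            merged[context] += samples
    return dict(merged)


# The two input profiles from the commit message, threshold = 100.
profiles = {"main @ foo @ bar": 60, "main @ bar": 50}
result = merge_cold_contexts(profiles, cold_threshold=100)
print(result)  # {'bar': 110}
```

Both inputs are below the threshold, so they collapse into a single `bar` entry with 110 samples, matching the `[bar]:110` result quoted in the commit message.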