llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 18:54:02 +01:00

Author	SHA1	Message	Date
Fangrui Song	8b3392c500	ELFObjectWriter: Simplify * Delete unused ELFSymbolData::operator< * Inline createStringTable * Fix a comment * Change align to return uint64_t	2021-02-13 14:52:30 -08:00
Craig Topper	9d5ee57922	[RISCV] Rename the RVVBaseAddr ComplexPattern to just BaseAddr and use it to merge some scalar load/store patterns too.	2021-02-13 12:01:51 -08:00
Fangrui Song	07bc1859f5	ELFObjectWriter: Delete redundant registerSymbol MCELFStreamer::changeSection has registered the group signature symbol.	2021-02-13 12:01:37 -08:00
Fangrui Song	d33006d182	ELFObjectWriter: Don't sort non-local symbols As we don't sort local symbols, don't sort non-local symbols. This makes non-local symbols appear in their register order, which matches GNU as. The register order is nice in that you can write tests with interleaved CHECK prefixes, e.g. ``` // CHECK: something about foo .globl foo foo: // CHECK: something about bar .globl bar bar: ``` With the lexicographical order, the user needs to place lexicographical smallest symbol first or keep CHECK prefixes in one place.	2021-02-13 10:32:27 -08:00
Sanjay Patel	17f939474e	[InstCombine] add tests for pow() divisor; NFC	2021-02-13 13:04:38 -05:00
Nikita Popov	4d31ccb85f	[IRBuilder] Remove Align-related deprecated APIs This removes IRBuilder methods accepting unsigned alignments in favor of their Align/MaybeAlign variants. These methods have been deprecated for more than a year at this point, so they should be safe to remove.	2021-02-13 16:42:37 +01:00
David Green	0472de536c	[ARM] Fix duplicate fdiv tests, changing them to frem. NFC	2021-02-13 15:16:11 +00:00
David Green	2f0db07412	[ARM] Extra vector shuffle tests of various kinds. NFC	2021-02-13 15:03:10 +00:00
Simon Pilgrim	998d22c332	[DAG] Fold i1/vXi1 saddsat/uaddsat(x,y) -> or(x,y) Alive2: https://alive2.llvm.org/ce/z/FzcrpH	2021-02-13 15:02:01 +00:00
Simon Pilgrim	953099c481	[DAG] Fold i1/vXi1 ssubsat/usubsat(x,y) -> and(x,~y) Alive2: https://alive2.llvm.org/ce/z/4nkNGh	2021-02-13 13:21:15 +00:00
Simon Pilgrim	a0306a3243	[DAG] PromoteIntRes_ADDSUBSHLSAT - use promoted ISD::USUBSAT directly As discussed on D96413, as long as the promoted bits of the args are zero we can use the basic ISD::USUBSAT pattern directly, without the shifting like we do for other ops. I think something similar should be possible for ISD::UADDSAT as well, which I'll look at later. Also, create a ISD::USUBSAT node directly - this will be expanded back by the legalizer later on if necessary. Differential Revision: https://reviews.llvm.org/D96622	2021-02-13 12:35:10 +00:00
Tyker	443904009e	reland [InstCombine] convert assumes to operand bundles InstCombine will convert nonnull and alignment assumptions that use a boolean condition into assumptions that use operand bundles when knowledge retention is enabled. Differential Revision: https://reviews.llvm.org/D82703	2021-02-13 13:03:11 +01:00
Simon Pilgrim	1123f44663	[DAG] Fix shift amount limit in SimplifyDemandedBits trunc(shift(x,c)) to truncated bitwidth We lost this in D56387/rG69bc0990a9181e6eb86228276d2f59435a7fae67 - where I got the src/dst bitwidths mixed up and assumed getValidShiftAmountConstant would catch it. Patch by @craig.topper - confirmed by @Carrot that it fixes PR49162	2021-02-13 12:00:08 +00:00
Heejin Ahn	4214250471	[WebAssemblly] Fix rethrow's argument computation Previously we assumed `rethrow`'s argument was always 0, but it turned out `rethrow` follows the same rule with `br` or `delegate`: https://github.com/WebAssembly/exception-handling/pull/137 https://github.com/WebAssembly/exception-handling/issues/146#issuecomment-777349038 Currently `rethrow`s generated by our backend always rethrow the exception caught by the innermost enclosing catch, so this adds a function to compute that and replaces `rethrow`'s argument with its computed result. This also renames `EHPadStack` in `InstPrinter` to `TryStack`, because in CFGStackify we use `EHPadStack` to mean the range between `catch`~`end`, while in `InstPrinter` we used it to mean the range between `try`~`catch`, so choosing different names would look clearer. Doesn't contain any functional changes in `InstPrinter`. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D96595	2021-02-13 03:43:15 -08:00
Simon Pilgrim	27eeae4b39	[X86] Add reduced test case for PR49162	2021-02-13 11:33:35 +00:00
David Green	89d0773885	[ARM] MVE min/max cost tests. NFC	2021-02-13 11:12:12 +00:00
Fangrui Song	5e054365c4	[test] Make ELF tests less reliant on the lexicographical order of non-local symbols	2021-02-13 01:01:06 -08:00
Kazu Hirata	9c56d039e9	[CodeGen] Use range-based for loops (NFC)	2021-02-12 23:44:33 -08:00
Kazu Hirata	f0327193f8	[AMDGPU] Drop unnecessary const from a return type (NFC) Identified with readability-const-return-type.	2021-02-12 23:44:32 -08:00
Kazu Hirata	33601fbc90	[TableGen] Use ListSeparator (NFC)	2021-02-12 23:44:30 -08:00
Juneyoung Lee	6da8315c72	[InstSimplify] add tests that look into pointer operands of instructions	2021-02-13 16:25:27 +09:00
Wei Wang	58f68f472f	[LTO] Perform DSOLocal propagation in combined index Perform DSOLocal propagation within summary list of every GV. This avoids the repeated query of this information during function importing. Differential Revision: https://reviews.llvm.org/D96398	2021-02-12 22:58:26 -08:00
Juneyoung Lee	b74bacff63	[LangRef] Update memory access ops to raise UB if ptrs are not well defined In the past, it was stated in D87994 that it is allowed to dereference a pointer that is partially undefined if all of its possible representations fit into a dereferenceable range. The motivation of the direction was to make a range analysis helpful for assuring dereferenceability. Even if a range analysis concludes that its offset is within bounds, the offset could still be partially undefined; to utilize the range analysis, this relaxation was necessary. https://groups.google.com/g/llvm-dev/c/2Qk4fOHUoAE/m/KcvYMEgOAgAJ has more context about this. However, this is currently blocking another optimization, which is annotating the noundef attribute for library functions' arguments. D95122 is the patch. Currently, there are quite a few library functions which cannot have noundef attached to its pointer argument because it can be transformed from load/store. For example, MemCpyOpt can convert stores into memset: ``` store p, i32 0 store (p+1), i32 0 // Since currently it is allowed for store to have partially undefined pointer.. -> memset(p, 0, 8) // memset cannot guarantee that its ptr argument is noundef. ``` A bigger problem is that this makes unclear which library functions are allowed to have 'noundef' and which functions aren't (e.g., strlen). This makes annotating noundef almost impossible for this kind of functions. This patch proposes that all memory operations should have well-defined pointers. For memset/memcpy, it is semantically equivalent to running a loop until the size is met (and branching on undef is UB), so the size is also updated to be well-defined. Strictly speaking, this again violates the implication of dereferenceability from range analysis result. However, I think this is okay for the following reasons: 1. It seems the existing analyses in the LLVM main repo does not have conflicting implementation with the new proposal. 
`isDereferenceableAndAlignedPointer` works only when the GEP offset is constant, and `isDereferenceableAndAlignedInLoop` is also fine. 2. A possible miscompilation happens only when the source has a pointer with a partially undefined offset (it's okay with poison because there is no 'partially poison' value). But, at least I'm not aware of a language using LLVM as backend that has a well-defined program while allowing partially undefined pointers. There might be such a language that I'm not aware of, but improving the performance of the mainstream languages like C and Rust is more important IMHO. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95238	2021-02-13 14:13:19 +09:00
Fangrui Song	9197899fc0	[test] Make ELF tests amenable to the order of non-local symbols	2021-02-12 21:00:42 -08:00
Serge Pavlov	c5d859bfe2	[FPEnv][ARM] Implement lowering of llvm.set.rounding Differential Revision: https://reviews.llvm.org/D96501	2021-02-13 11:16:29 +07:00
Jian Cai	f73c5e8b40	[llvm-objcopy] preserve file ownership when overwritten by root As of binutils 2.36, GNU strip calls chown(2) for "sudo strip foo" and "sudo strip foo -o foo", but not for "sudo strip foo -o bar" or "sudo strip foo -o ./foo". In other words, while "sudo strip foo -o bar" creates a new file bar with root access, "sudo strip foo" will keep the owner and group of foo unchanged. Currently llvm-objcopy and llvm-strip behave differently, always changing the owner and group to root. The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as they change file ownership and cause intended users/groups to lose access when invoked by sudo with the following sequence (recommended in man page of GNU strip). 1.<Link the executable as normal.> 1.<Copy "foo" to "foo.full"> 1.<Run "strip --strip-debug foo"> 1.<Run "objcopy --add-gnu-debuglink=foo.full foo"> This patch makes llvm-objcopy and llvm-strip follow GNU's behavior. Link: crbug.com/1108880	2021-02-12 18:01:43 -08:00
Adrian Prantl	e9020e2ee2	Store the LocationKind of an entry value buffer independently from the main LocationKind (NFC) This patch hides the logic for setting the location kind of an entry value inside the begin/finalize/cancel functions. This way we get rid the strange workaround that is currently in setLocation(). In the future, this will allow us to set the location kind of the entry value independently from the location kind of the main expression. Differential Revision: https://reviews.llvm.org/D96554	2021-02-12 16:59:39 -08:00
wlei	be4f05ba0b	[CSSPGO][llvm-profgen] Filter out the instructions without location info for symbolizer It appears some instructions don't have debug location info, and the symbolizer returns an empty call stack for them, which causes a crash later in profile unwinding. We do not record sample info for them, so this change just filters out those instructions. As those instructions appear at the beginning and end of the instruction list, without them we need to add a boundary check for IP `advance` and `backward`. Also, for pseudo-probe-based profiles we don't need the symbolized location info, so this changes them to use an empty stack. This could save half of the binary loading time. Differential Revision: https://reviews.llvm.org/D96434	2021-02-12 16:47:49 -08:00
Arthur Eubanks	fe7a083c4c	[NFC] Combine runNewPMPasses() and runNewPMCustomPasses() I've already witnessed two separate changes missing runNewPMPasses() because runNewPMCustomPasses() is so similar. This cleans up some duplicated code. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D96553	2021-02-12 16:44:52 -08:00
Craig Topper	25ee8ece36	[RISCV] Move riscv_vfmv_v_f_vl patterns to RISCVInstrInfoVVLPatterns.td for consistency with riscv_vmv_v_x_vl. NFC	2021-02-12 16:08:27 -08:00
Craig Topper	542e18bed6	[RISCV] Add support for fixed vector fabs	2021-02-12 15:33:36 -08:00
Craig Topper	a3618cb881	[RISCV] Add support for fixed vector sqrt.	2021-02-12 15:33:29 -08:00
James Y Knight	de99742eb0	LLVM-C: Allow LLVM{Get/Set}Alignment on an atomicrmw/cmpxchg instruction. (Now that these can have alignment specified.)	2021-02-12 18:31:18 -05:00
wlei	59ffba4f90	[CSSPGO][llvm-profgen] Renovate perfscript check and command line input validation This include some changes related with PerfReader's the input check and command line change: 1) It appears there might be thousands of leading MMAP-Event line in the perfscript for large workload. For this case, the 4k threshold is not eligible to determine it's a hybrid sample. This change renovated the `isHybridPerfScript` by going through the script without threshold limitation checking whether there is a non-empty call stack immediately followed by a LBR sample. It will stop once it find a valid one. 2) Added several input validations for the command line switches in PerfReader. 3) Changed the command line `show-disassembly` to `show-disassembly-only`, it will print to stdout and exit early which leave an empty output profile. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D96387	2021-02-12 15:18:50 -08:00
Jessica Paquette	1f262f8352	[AArch64][GlobalISel] Fold constants into G_GLOBAL_VALUE This is pretty much just ports `performGlobalAddressCombine` from AArch64ISelLowering. (AArch64 doesn't use the generic DAG combine for this.) This adds a pre-legalize combine which looks for this pattern: ``` %g = G_GLOBAL_VALUE @x %ptr1 = G_PTR_ADD %g, cst1 %ptr2 = G_PTR_ADD %g, cst2 ... %ptrN = G_PTR_ADD %g, cstN ``` And then, if possible, transforms it like so: ``` %g = G_GLOBAL_VALUE @x %offset_g = G_PTR_ADD %g, -min(cst) %ptr1 = G_PTR_ADD %offset_g, cst1 %ptr2 = G_PTR_ADD %offset_g, cst2 ... %ptrN = G_PTR_ADD %offset_g, cstN ``` Where min(cst) is the smallest out of the G_PTR_ADD constants. This means we should save at least one G_PTR_ADD. This also updates code in the legalizer + selector which assumes that G_GLOBAL_VALUE will never have an offset and adds/updates relevant tests. Differential Revision: https://reviews.llvm.org/D96624	2021-02-12 14:55:15 -08:00
Craig Topper	42bd4878de	[RISCV] Use a ComplexPattern to merge the PatFrags for removing unneeded masks on shift amounts. Rather than having patterns with and without an AND, use a ComplexPattern to handle both cases. Reduces the isel table by about 700 bytes.	2021-02-12 14:03:23 -08:00
Jay Foad	d27645b72c	[GlobalISel] Simpler verification of G_SEXT_INREG and G_ASSERT_ZEXT There's no need to call verifyVectorElementMatch since we already know that the source and destination types are identical. Differential Revision: https://reviews.llvm.org/D96589	2021-02-12 21:33:27 +00:00
Arthur Eubanks	6bb03a7eef	[gn build] Add missing llvm-profgen dependency Or else a clean build fails with missing Attributes.inc.	2021-02-12 12:44:17 -08:00
Nikita Popov	5e6e337ba9	[AA] Add option for tracing AA queries (NFC) Add an -aa-trace debug option that can be used to print AA queries, including any recursive queries and their results.	2021-02-12 21:42:49 +01:00
Nikita Popov	f6693f56a5	[AA] Move Depth member from AAResults to AAQI (NFC) Rather than storing the query depth in AAResults, store it in AAQI. This makes more sense, as it is a property of the query. This sidesteps the issue of D94363, fixing slightly inaccurate AA statistics. Additionally, I plan to use the Depth from BasicAA in the future, where fetching it from AAResults would be unreliable. This change is not quite as straightforward as it seems, because we need to preserve the depth when creating a new AAQI for recursive queries across phis. I'm adding a new method for this, as we may need to preserve additional information here in the future.	2021-02-12 21:42:36 +01:00
Stanislav Mekhanoshin	525e98279b	[AMDGPU] Fix Windows build A trivial fix, 64 bit constant is 1ull, not 1ul on Windows. Fixed build broken by c0d7a8bc6241.	2021-02-12 12:30:52 -08:00
Jessica Paquette	b5aa821b4f	[GlobalISel] Combine (x + 0) -> x, G_PTR_ADD edition Add it to right_identity_zero. Differential Revision: https://reviews.llvm.org/D96621	2021-02-12 12:09:48 -08:00
Vedant Kumar	97f4d2cb51	[docs/Coverage] Document -show-region-summary As a drive-by, fix the section in the clang docs about the number of statistics visible in a report.	2021-02-12 12:05:45 -08:00
James Y Knight	91b7513f11	Fix layering after ed4718eccb12. That commit added a dependency from IR to Analysis, which isn't allowed. Fix it by duplicating a string constant.	2021-02-12 14:54:59 -05:00
Amara Emerson	988eea1dc5	[GlobalISel] Propagate extends through G_PHIs into the incoming value blocks. This combine tries to do inter-block hoisting of extends of G_PHIs, into the originating blocks of the phi's incoming value. The idea is to expose further optimization opportunities that are normally obscured by the PHI. Some basic heuristics, and a target hook for AArch64 is added, to allow tuning. E.g. if the extend is used by a G_PTR_ADD, it doesn't perform this combine since it may be folded into the addressing mode during selection. There are very minor code size improvements on AArch64 -Os, but the real benefit is that it unlocks optimizations like AArch64 conditional compares on some benchmarks. Differential Revision: https://reviews.llvm.org/D95703	2021-02-12 11:52:52 -08:00
Fangrui Song	1fad034c68	DebugInfo/Symbolize: Exclude ARM mapping symbols for .symtab symbolization after D95916 Their names don't convey much information, so they should be excluded. The behavior matches addr2line. Differential Revision: https://reviews.llvm.org/D96617	2021-02-12 11:04:20 -08:00
Paul Robinson	2571f5980f	[RGT][GlobalIsel] Add missing setUp() calls to legalizer unittests Some of these accidentally disabled tests failed as a result; updated tests per @qcolombet instructions. A small number needed additional updates because legalization has actually changed since they were written. Found by the Rotten Green Tests project. Differential Revision: https://reviews.llvm.org/D95257	2021-02-12 10:45:48 -08:00
Xun Li	fcb89a3296	[NFC][Coroutine] Fix an error message on coro.id verification The error message should be about coro.id, not coro.begin Differential Revision: https://reviews.llvm.org/D96447	2021-02-12 10:44:03 -08:00
LLVM GN Syncbot	5ea384d16f	[gn build] Port cb2d2ae56ae3	2021-02-12 18:40:40 +00:00
David Green	0c8d4b9ca1	[ARM] Optimize fp store of extract to integer store if already available. Given a floating point store from an extracted vector, with an integer VGETLANE that already exists, storing the existing VGETLANEu directly can be better for performance. As the value is known to already be in an integer registers, this can help reduce fp register pressure, removed the need for the fp extract and allows use of more integer post-inc stores not available with vstr. This can be a bit narrow in scope, but helps with certain biquad kernels that store shuffled vector elements. Differential Revision: https://reviews.llvm.org/D96159	2021-02-12 18:34:58 +00:00
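
Note on the two [DAG] i1 saturating folds listed above (998d22c332 and 953099c481): the commit messages cite Alive2 proofs. As an informal cross-check only, the sketch below enumerates every 1-bit input pair in plain Python. It is not LLVM code; the helper names `sat` and `check_i1_folds` are hypothetical, and it assumes the usual interpretation that an unsigned i1 holds {0, 1} and a signed i1 holds {0, -1}.

```python
from itertools import product


def sat(value, lo, hi):
    """Clamp value into the inclusive range [lo, hi] (saturating arithmetic)."""
    return max(lo, min(hi, value))


def check_i1_folds():
    """Return True if the four i1 folds hold for every pair of 1-bit inputs."""
    for xb, yb in product((0, 1), repeat=2):
        or_bits = xb | yb                # or(x, y)
        and_not_bits = xb & (~yb & 1)    # and(x, ~y), masked to one bit

        # Unsigned i1: representable range is [0, 1].
        uadd = sat(xb + yb, 0, 1)
        usub = sat(xb - yb, 0, 1)

        # Signed i1: bit pattern 1 means -1, so the range is [-1, 0].
        xs, ys = -xb, -yb
        sadd = sat(xs + ys, -1, 0) & 1   # convert back to a bit pattern
        ssub = sat(xs - ys, -1, 0) & 1

        if (uadd, sadd) != (or_bits, or_bits):
            return False
        if (usub, ssub) != (and_not_bits, and_not_bits):
            return False
    return True
```

Calling `check_i1_folds()` covers all four input combinations for the scalar case. The vXi1 case in the commit titles applies the same identity lane by lane.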


211170 Commits