llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Jay Foad	14dd772f1b	[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics (fix) Try to fix Windows buildbots.	2020-06-03 09:44:33 +01:00
Serge Pavlov	86f8cab821	Revert "[Support] Add file lock/unlock functions" This reverts commit f51bc4fb60fbcef26d18eff549fc68307fd46489. It broke the Solaris buildbots (Builder clang-solaris11-sparcv9 Build #5494 <http://lab.llvm.org:8014/builders/clang-solaris11-sparcv9/builds/54).	2020-06-03 15:40:12 +07:00
Vitaly Buka	f99a2b16cb	[StackSafety,NFC] Convert to template internal stuff It's going to be useful for ThinLTO.	2020-06-03 01:36:20 -07:00
Vitaly Buka	299555c7e7	[StackSafety,NFC] Rename internal class	2020-06-03 01:36:20 -07:00
Jay Foad	4a4ccbfa8d	[AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics Differential Revision: https://reviews.llvm.org/D80702	2020-06-03 09:34:22 +01:00
hsmahesha	d03b426308	[AMDGPU/MemOpsCluster] Code clean-up around accessing of memory operand width Summary: Clean-up the width computing logic given a memory operand, and re-arrange code to avoid code duplication. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80946	2020-06-03 14:03:52 +05:30
LLVM GN Syncbot	d96290b617	[gn build] Port 755a8959152	2020-06-03 08:27:24 +00:00
Thomas Lively	21ce405fed	Revert "[WebAssembly] Eliminate range checks on br_tables" This reverts commit f99d5f8c32a822580a732d15a34e8197da55d22b. The change was causing UBSan and other failures on some bots.	2020-06-03 01:26:53 -07:00
Vitaly Buka	5b5814a4b5	[StackSafety] Skip non-pointer parameters Summary: Depends on D80908. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80956	2020-06-03 01:16:39 -07:00
Vitaly Buka	9a416f28c3	[NFC, StackSafety] Change type of internal container Summary: Depends on D80771. Reviewers: eugenis Reviewed By: eugenis Subscribers: mehdi_amini, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80847	2020-06-03 01:05:10 -07:00
QingShan Zhang	4237f2e791	[NFC][PowerPC] Remove unused node PPCISD::VMADDFP and PPCISD::VNMSUBFP These two nodes were added by 69caef2b781130a7d0eeaf8898eb346b6423ae03 in 2005 and they are not used by PowerPC backend anymore. And ISD::FMA is the preferred way for VMADDFP if we really want to create that node. For VNMSUBFP, we will also add a more generic node FNMSUB in D76585 if we really want it. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D80429	2020-06-03 06:36:30 +00:00
David Sherwood	5423c8ae53	[CodeGen] Fix warnings in getPackedVectorTypeFromPredicateType Use getVectorElementCount() instead of getVectorNumElements(). The code changed in this patch is covered by an existing test: CodeGen/AArch64/sve-intrinsics-contiguous-prefetches.ll Differential Revision: https://reviews.llvm.org/D80615	2020-06-03 07:01:20 +01:00
Craig Topper	bb3074f462	[X86] Add CLWB to Tremont CPU. Remove CLDEMOTE, MOVDIRI, MOVDIR64B, and WAITPKG to match gcc.	2020-06-02 22:38:51 -07:00
Serge Pavlov	0a558b962d	[Support] Add file lock/unlock functions New functions `lockFile`, `tryLockFile` and `unlockFile` implement simple file locking. They lock or unlock the entire file. This must be enough to support simultaneous writes to log files in parallel builds. Differential Revision: https://reviews.llvm.org/D78896	2020-06-03 12:22:45 +07:00
Carl Ritson	8b90fe296e	[AMDGPU] Make SGPR spills exec mask agnostic Explicitly set the exec mask for SGPR spills and reloads. This fixes a bug where SGPR spills to memory could be incorrect if the exec mask was 0 (or differed between spill and reload). Additionally pack scalar subregisters (up to 16/32 per VGPR), so that the majority of scalar types can be spilt or reloaded with a simple memory access. This should amortize some of the additional overhead of manipulating the exec mask. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D80282	2020-06-03 12:34:26 +09:00
Mehdi Amini	356c1d11d8	Revert "[NFC, StackSafety] Change type of internal container" This reverts commit f62813e7eae148a6175de28bfa384524a9f2bf94. GCC 5.3 build is broken.	2020-06-03 03:02:28 +00:00
Jessica Paquette	9e9596734c	[AArch64][GlobalISel] Select zip1 and zip2 Port the code to recognize a zip1/zip2 shuffle mask from AArch64ISelLowering and put it into the post-legalizer combiner. Add G_ZIP1 and G_ZIP2 to AArch64InstrGISel.td and hook them up as equivalent nodes to AArch64zip1 and AArch64zip2. This allows us to select them. Minor code size improvements for SPECINT2000 at -O3 on 197.parser, 252.eon, and 186.crafty. Differential Revision: https://reviews.llvm.org/D80969	2020-06-02 18:57:11 -07:00
Kazu Hirata	47434a0533	[JumpThreading] Simplify FindMostPopularDest (NFC) Summary: This patch simplifies FindMostPopularDest without changing the functionality. Given a list of jump threading destinations, the function finds the most popular destination. To ensure determinism when there are multiple destinations with the highest popularity, the function picks the first one in the successor list with the highest popularity. Without this patch: - The function populates DestPopularity -- a histogram mapping destinations to their respective occurrence counts. - Then we iterate over DestPopularity, looking for the highest popularity while building a vector of destinations with the highest popularity. - Finally, we iterate the successor list, looking for the destination with the highest popularity. With this patch: - We implement DestPopularity with MapVector instead of DenseMap. We populate the map with popularity 0 for all successors in the order they appear in the successor list. - We build the histogram in the same way as before. - We simply use std::max_element on DestPopularity to find the most popular destination. The use of MapVector ensures determinism. Reviewers: wmi, efriedma Reviewed By: wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81030	2020-06-02 18:43:31 -07:00
Vitaly Buka	78160ddec5	[NFC, StackSafety] Change type of internal container Summary: Depends on D80771. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80847	2020-06-02 18:27:22 -07:00
Vitaly Buka	4d44ddfc60	[MTE] Move tagging in pipeline Summary: This removes two analyses from pipeline. Depends on D80771. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80780	2020-06-02 17:48:55 -07:00
Wei Mi	93d80a543f	[SampleFDO] Add use-sample-profile function attribute. When sampleFDO is enabled, people may expect they can use -fno-profile-sample-use to opt-out using sample profile for a certain file. That could be either for debugging purpose or for performance tuning purpose. However, when thinlto is enabled, if a function in file A compiled with -fno-profile-sample-use is imported to another file B compiled with -fprofile-sample-use, the inlined copy of the function in file B may still get its profile annotated. The inconsistency may even introduce profile unused warning because if the target is not compiled with explicit debug information flag, the function in file A won't have its debug information enabled (debug information will be enabled implicitly only when -fprofile-sample-use is used). After it is imported into file B which is compiled with -fprofile-sample-use, profile annotation for the outline copy of the function will fail because the function has no debug information, and that will trigger profile unused warning. We add a new attribute use-sample-profile to control whether a function will use its sample profile no matter for its outline or inline copies. That will make the behavior of -fno-profile-sample-use consistent. Differential Revision: https://reviews.llvm.org/D79959	2020-06-02 17:23:17 -07:00
Guozhi Wei	223393b287	[X86] Add a flag to guard the wide load As shown in http://lists.llvm.org/pipermail/llvm-dev/2020-May/141854.html, widen load can also cause stall. Add a flag to guard the widening code, so users can disable it and evaluate its performance impact. Differential Revision: https://reviews.llvm.org/D80943	2020-06-02 16:16:13 -07:00
Vitaly Buka	69b767eb38	[MTE] Convert StackSafety into analysis This lets us remove !stack-safe metadata and better control when to perform StackSafety analysis. Reviewers: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80771	2020-06-02 16:08:14 -07:00
Vitaly Buka	eb471b1fc4	[StackSafety] Delete useless test	2020-06-02 16:08:14 -07:00
Nick Desaulniers	6cf6be73f3	[Clang][A32/T32][Linux] -O1 implies -fomit-frame-pointer Summary: An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of errors related to writing to reserved registers for a Linux kernel's arm64 compat vdso (which is a aarch32 image). After a discussion on LKML [2], it was determined that -f{no-}omit-frame-pointer was not being specified. Comparing GCC and Clang [3], it becomes apparent that GCC defaults to omitting the frame pointer implicitly when optimizations are enabled, and Clang does not. ie. setting -O1 (or above) implies -fomit-frame-pointer. Clang was defaulting to -fno-omit-frame-pointer implicitly unless -fomit-frame-pointer was set explicitly. Why this becomes a problem is that the Linux kernel's arm64 compat vdso contains code that uses r7. r7 is used sometimes for the frame pointer (for example, when targeting thumb (-mthumb)). See useR7AsFramePointer() in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM AAPCS looks to standardize r11 as the frame pointer for aarch32, though this is not yet implemented in LLVM. Users that are reliant on the implicit value if unspecified when optimizations are enabled should explicitly choose -fomit-frame-pointer (new behavior) or -fno-omit-frame-pointer (old behavior). [0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372 [1] https://reviews.llvm.org/D76848 [2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/ [3] https://godbolt.org/z/0oY39t Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80828	2020-06-02 15:54:14 -07:00
Eric Christopher	995cc9c405	Undo initialization of TRI in CGP as this is unconditionally initialized later.	2020-06-02 15:08:54 -07:00
Craig Topper	5a956160d0	[X86] Remove DeleteNode calls from PreprocessISelDAG. Rely on the RemoveDeadNodes call at the end. Add a MadeChange flag so we don't call RemoveDeadNodes unless something changed.	2020-06-02 14:10:20 -07:00
Craig Topper	5e8876d712	[X86] Cleanup inconsistencies in our zext/sext vector patterns. -Fix one place where we had a X86vzload64 but should have had X86vzload32. -Make sure all patterns that have scalar_to_vector+loadi64 also have scalar_to_vector+f64 to match 32-bit codegen. -Add some bitcasts that were missing from patterns. -Make sure that if we have a scalar_to_vector+load pattern we also have a vzload pattern. We probably need some better canonicalization to avoid having so many patterns.	2020-06-02 13:50:16 -07:00
Kadir Cetinkaya	def2e34328	[llvm] Fix unused variable warning	2020-06-02 22:46:24 +02:00
LLVM GN Syncbot	0638ffd642	[gn build] Port f99d5f8c32a	2020-06-02 20:36:52 +00:00
Eric Christopher	2f90457f4d	Fix up clang-tidy warnings around null and pointers.	2020-06-02 13:24:20 -07:00
Amy Kwan	97fd4517d5	[DAGCombiner] Combine shifts into multiply-high This patch implements a target independent DAG combine to produce multiply-high instructions from shifts. This DAG combine will combine shifts for any type as long as the MULH on the narrow type is legal. For now, it is enabled on PowerPC as PowerPC is the only target that has an implementation of the isMulhCheaperThanMulShift TLI hook introduced in D78271. Moreover, this DAG combine focuses on catching the pattern: (shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>) to produce mulhs when we have a sign-extend, and mulhu when we have a zero-extend. The patch performs the following checks: - Operation is a right shift arithmetic (sra) or logical (srl) - Input to the shift is a multiply - Both operands to the shift are sext/zext nodes - The extends into the multiply are both the same - The narrow type is half the width of the wide type - The shift amount is the width of the narrow type - The respective mulh operation is legal Differential Revision: https://reviews.llvm.org/D78272	2020-06-02 15:22:48 -05:00
Thomas Lively	dbbd248c77	[WebAssembly] Eliminate range checks on br_tables Summary: Jump tables for most targets cannot handle out of range indices by themselves, so LLVM emits range checks to guard the jump tables. WebAssembly, on the other hand, implements jump tables using the br_table instruction, which takes a default branch target as an operand, making the range checks redundant. This patch introduces a new MachineFunction pass in the WebAssembly backend to find and eliminate the redundant range checks. Reviewers: aheejin, dschuff Subscribers: mgorny, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80863	2020-06-02 13:14:27 -07:00
dstuttar	11327a08da	[TableGen] Avoid generating switch with just default Summary: Switch with just default causes an MSVC warning (warning C4065: switch statement contains 'default' but no 'case' labels). Change-Id: I9ddeccdef93666256b5454b164b567b73b488461 Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81021	2020-06-02 19:48:07 +01:00
Diego Caballero	986805054f	Update 'git push' command in GettingStarted guide 'git push' command, without any other arguments, can do different things depending on the local configuration of Git. This patch updates the 'git push' command with extra arguments to be more resilient to any local configuration. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D79964	2020-06-02 21:25:29 +03:00
Jonas Devlieghere	2b24554801	[llvm-dwarfdump] Print [=<offset>] after --debug-* options in help output. Some of the --debug-* options can take an optional offset. Although the man page does a good job of making that clear, it's much harder to discover from the help output. Currently the only reference to this is the following sentence: > Where applicable these parameters take an optional =<offset> argument > to dump only the entry at the specified offset. This patch changes the help output to print [=<offset>] after the options that take an offset. --debug-info[=<offset>] - Dump the .debug_info section rdar://problem/63150066 Differential revision: https://reviews.llvm.org/D80959	2020-06-02 11:06:11 -07:00
Matt Arsenault	f2733aab9a	AMDGPU: Fix a test to be more stable The chained unconditional branches can be eliminated and it's not relevant to the test.	2020-06-02 13:47:48 -04:00
Matt Arsenault	bf98af2851	AMDGPU: Don't run indexing mode switches with exec = 0 Add mode defs rather than special casing this like some of the other instructions.	2020-06-02 13:47:48 -04:00
Matt Arsenault	58874f0270	AMDGPU: Don't run mode switches with exec 0 These are scalar instructions that change vector instructions, so they should not be executed without any active lanes. The implementation of -amdgpu-skip-threshold also seems to be backwards from expected, since decreasing it prevents removal.	2020-06-02 13:47:48 -04:00
Hiroshi Yamauchi	22e9592a06	[PGO] Enable memcmp/bcmp size value profiling. Summary: Following up D79751. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80578	2020-06-02 10:27:11 -07:00
Sanjay Patel	4bd1716bf2	[InstCombine] add tests for select-of-select-shuffle; NFC	2020-06-02 13:26:21 -04:00
Sanjay Patel	48da5ea648	[InstCombine] regenerate complete test checks; NFC	2020-06-02 13:26:21 -04:00
Simon Pilgrim	2ec7714439	TypeSymbolEmitter.h - reduce includes to forward declarations. NFC.	2020-06-02 16:30:17 +01:00
Alexey Bataev	1d1cfb66ce	[OPENMP50]Initial codegen for 'affinity' clauses. Summary: Added initial codegen for 'affinity' clauses on task directives. Emits the following code: ``` kmp_task_affinity_info_t affs[<num_elems>]; void *td = __kmpc_task_alloc(..); affs[<i>].base = &data_i; affs[<i>].size = sizeof(data_i); __kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs); ``` The result returned by the call of `__kmpc_omp_reg_task_with_affinity` function is ignored currently since the runtime currently ignores args and returns 0 unconditionally. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80240	2020-06-02 10:50:08 -04:00
Georgii Rymar	935ddb2639	[yaml2obj] - Allocate the file space for SHT_NOBITS sections in some cases. This teaches yaml2obj to allocate file space for a no-bits section when there is a non-nobits section in the same segment that follows it. It was discussed in D78005 thread and matches GNU linkers and LLD behavior. Differential revision: https://reviews.llvm.org/D80629	2020-06-02 17:19:24 +03:00
serge-sans-paille	965b4cbc82	Use Pseudo Instruction to carry stack probing information Instead of using a fake call and metadata to temporarily represent a probed static alloca, use a pseudo instruction. This is inspired by the SystemZ approach proposed in https://reviews.llvm.org/D78717. Differential Revision: https://reviews.llvm.org/D80641	2020-06-02 16:14:06 +02:00
Matt Arsenault	e6f5e03023	AMDGPU: Fix not using scalar loads for global reads in shaders The pass which infers when it's legal to load a global address space as SMRD was only considering amdgpu_kernel, and ignoring the shader entry type calling conventions.	2020-06-02 09:49:23 -04:00
Nico Weber	50523c63f4	[gn build] (manually) port 44f989e7809	2020-06-02 08:18:42 -04:00
Igor Kudrin	17735d83e5	Fix a failing test.	2020-06-02 18:50:36 +07:00
Djordje Todorovic	475384322f	[CSInfo][NFC] Interpret loaded parameter value separately The collectCallSiteParameters() method searches for instructions which load values into registers used for parameters passing. Previously, interpretation of those values, loaded by one such instruction, was implemented inside collectCallSiteParameters() method. This patch moves the interpretation code from collectCallSiteParameters() method into a separate static method named interpretValue. New method is called from collectCallSiteParameters() to process each instruction from targeted instruction scope. The collectCallSiteParameters() searches for loaded parameter value among instructions which precede the call instruction, inside the same basic block. When needed, new method (interpretValue) could be used for searching any instruction scope. This is preparation for search of parameter value, loaded inside call delay slot. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D78106	2020-06-02 13:05:04 +02:00
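
Note on commit 97fd4517d5 ([DAGCombiner] Combine shifts into multiply-high): the pattern described in that message, (shift (mul (ext a), (ext b)), narrow_width), can be illustrated at the source level. The sketch below is not taken from the patch; it assumes a 32-bit narrow type and a 64-bit wide type, and the helper names `mulhs_via_shift` and `mulhu_via_shift` are hypothetical, chosen only for this illustration. It shows the value that a mulhs or mulhu node would produce for the signed and unsigned cases.

```cpp
#include <cstdint>

// Illustrative only: these helper names are hypothetical and are not LLVM APIs.
// Each function spells out the "extend, multiply, shift right by the narrow
// width" pattern for a 32-bit narrow type and a 64-bit wide type.

// Signed form: sign-extend both operands, multiply, arithmetic shift right
// by 32. This is the high half of the signed product (what mulhs computes).
// Right-shifting a negative value is arithmetic on mainstream compilers and
// is guaranteed to be arithmetic from C++20 on.
inline int32_t mulhs_via_shift(int32_t a, int32_t b) {
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  return static_cast<int32_t>(wide >> 32);
}

// Unsigned form: zero-extend both operands, multiply, logical shift right
// by 32. This is the high half of the unsigned product (what mulhu computes).
inline uint32_t mulhu_via_shift(uint32_t a, uint32_t b) {
  uint64_t wide = static_cast<uint64_t>(a) * static_cast<uint64_t>(b);
  return static_cast<uint32_t>(wide >> 32);
}
```

For example, `mulhu_via_shift(0x80000000u, 2u)` yields 1, because the 64-bit product is exactly 2^32. Both extends must be of the same kind and the shift amount must equal the narrow width, which matches the checks listed in the commit message.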

197848 Commits