llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00

Author	SHA1	Message	Date
Chandler Carruth	c387751c70	Enable '-Wstring-conversion' and fix some bad asserts that it helped find. Notable is the assert in NewGVN which had no effect because of the bug. llvm-svn: 290400	2016-12-23 01:38:06 +00:00
Matt Arsenault	1edd642a1d	AMDGPU: Invert cmp + select with constant Canonicalize a select with a constant to the false side. This enables more instruction shrinking opportunities since an inline immediate can be used for the false side of v_cndmask_b32_e32. This seems to usually be better but causes some code size regressions in some tests. llvm-svn: 290372	2016-12-22 21:40:08 +00:00
Matt Arsenault	66ebaecd36	AMDGPU: Use i16 for i16 shift amount llvm-svn: 290351	2016-12-22 16:36:25 +00:00
Matt Arsenault	ca3de8cd67	AMDGPU: Fix missing 16-bit cmpx instructions llvm-svn: 290349	2016-12-22 16:27:14 +00:00
Matt Arsenault	9419a1ea69	AMDGPU: Use i16 comparison instructions llvm-svn: 290348	2016-12-22 16:27:11 +00:00
Matt Arsenault	0975fb877d	AMDGPU: Fixed '!NodePtr->isKnownSentinel()' assert Caused by dereferencing end iterator when trying to const cast the iterator. Patch by Martin Sherburn llvm-svn: 290347	2016-12-22 16:06:32 +00:00
Sam Kolton	dc4ffc9328	[AMDGPU] Add pseudo SDWA instructions Summary: This is needed for later SDWA support in CodeGen. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27412 llvm-svn: 290338	2016-12-22 12:57:41 +00:00
Sam Kolton	0ab0b61c0c	[AMDGPU] Disassembler: fix for disassembling v_mac_f32/16_dpp/sdwa Summary: Real instruction should copy constraints from real instruction. This allows the auto-generated disassembler to correctly process tied operands. Reviewers: nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27847 llvm-svn: 290336	2016-12-22 11:30:48 +00:00
Matt Arsenault	13f610555b	AMDGPU: Fix missing commute table entries for cmpx No tests because these aren't currently used anywhere. llvm-svn: 290316	2016-12-22 04:39:41 +00:00
Matt Arsenault	ee5d8d2da0	AMDGPU: Swap order of operands in fadd/fsub combine FMA is canonicalized to constant in the middle operand. Do the same so fmad matches and avoid an extra combine step. llvm-svn: 290313	2016-12-22 04:03:40 +00:00
Matt Arsenault	9d4a891569	AMDGPU: Check fast math flags in fadd/fsub combines llvm-svn: 290312	2016-12-22 04:03:35 +00:00
Matt Arsenault	dd6b858bbf	AMDGPU: Form more FMAs if fusion is allowed Extend the existing fadd/fsub->fmad combines to produce FMA if allowed. llvm-svn: 290311	2016-12-22 03:55:35 +00:00
Matt Arsenault	f4e299f829	AMDGPU: Move combines into separate functions llvm-svn: 290309	2016-12-22 03:44:42 +00:00
Matt Arsenault	5ba9667c15	AMDGPU: Enable some f32 fadd/fsub combines for f16 llvm-svn: 290308	2016-12-22 03:40:39 +00:00
Matt Arsenault	d7ec3d5ba4	AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16 llvm-svn: 290307	2016-12-22 03:21:48 +00:00
Matt Arsenault	5ecf306700	AMDGPU: Allow rcp and rsq usage with f16 llvm-svn: 290302	2016-12-22 03:05:44 +00:00
Matt Arsenault	a844bf67ff	AMDGPU: Custom lower f16 fdiv llvm-svn: 290301	2016-12-22 03:05:41 +00:00
Matt Arsenault	263e20ee06	AMDGPU: Implement f16 fcanonicalize llvm-svn: 290300	2016-12-22 03:05:37 +00:00
Matt Arsenault	5979870866	AMDGPU: Update isFPImmLegal for f16 I don't think this matters because ConstantFP is legal. llvm-svn: 290299	2016-12-22 03:05:30 +00:00
Tom Stellard	4026197eca	AMDGPU/SI: Fix file header llvm-svn: 290265	2016-12-21 19:06:24 +00:00
Davide Italiano	9d09222590	[AMDGPU] Garbage collect dead code. NFCI. llvm-svn: 290249	2016-12-21 10:19:00 +00:00
Matt Arsenault	ea409cc20d	AMDGPU: Allow 16-bit types in inline asm constraints llvm-svn: 290193	2016-12-20 19:06:12 +00:00
Matt Arsenault	63d92e4ebb	AMDGPU: Don't add same instruction multiple times to worklist When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191	2016-12-20 18:55:06 +00:00
Tom Stellard	5634d231ec	AMDGPU/SI: Make a function const llvm-svn: 290185	2016-12-20 17:26:34 +00:00
Tom Stellard	2c0dd4ec69	AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184	2016-12-20 17:19:44 +00:00
Tom Stellard	ec757acb99	AMDGPU/SI: Add a MachineMemOperand to MIMG instructions Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179	2016-12-20 15:52:17 +00:00
Konstantin Zhuravlyov	3febbc8b1e	[AMDGPU] When unifying metadata, add operands to named metadata individually Differential Revision: https://reviews.llvm.org/D27725 llvm-svn: 290114	2016-12-19 16:54:24 +00:00
Sam Kolton	fcea1ddb2f	AMDGPU: [AMDGPU] Assembler: add .hsa_code_object_metadata directive for runtime metadata V2.0 Summary: Added a pair of directives, .hsa_code_object_metadata/.end_hsa_code_object_metadata. Between them the user can put a YAML string that is placed directly into the generated note. E.g.: ''' .hsa_code_object_metadata { amd.MDVersion: [ 2, 0 ] } .end_hsa_code_object_metadata ''' Based on D25046 Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye Differential Revision: https://reviews.llvm.org/D27619 llvm-svn: 290097	2016-12-19 11:43:15 +00:00
Matt Arsenault	5f7fab5b1a	AMDGPU: Fix name for v_ashrrev_i16 llvm-svn: 289967	2016-12-16 17:40:11 +00:00
Matt Arsenault	b1034a224d	AMDGPU: Select branch on undef to uniform scc branch llvm-svn: 289877	2016-12-15 21:57:11 +00:00
Matt Arsenault	feb1ec1fb2	AMDGPU: Fix asserting on returned tail calls llvm-svn: 289868	2016-12-15 20:50:12 +00:00
Matt Arsenault	496e9bc65d	AMDGPU: Assembler support for vintrp instructions llvm-svn: 289866	2016-12-15 20:40:20 +00:00
Alexander Timofeev	aa7ea574e9	Fix for regression after Global Load Scalarization patch llvm-svn: 289822	2016-12-15 15:17:19 +00:00
Krzysztof Parzyszek	b6cc44c368	Extract LaneBitmask into a separate type Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820	2016-12-15 14:36:06 +00:00
Nico Weber	bc8a99a6d1	fix gcc warning about a superfluous ; llvm-svn: 289705	2016-12-14 20:33:54 +00:00
Yaxun Liu	d4bb42cc3b	Fix build failure due to r289674 on certain systems Removed a useless include which caused conflict. llvm-svn: 289700	2016-12-14 20:17:47 +00:00
Yaxun Liu	98de4b3c84	AMDGPU: Emit runtime metadata version 2 as YAML Differential Revision: https://reviews.llvm.org/D25046 llvm-svn: 289674	2016-12-14 17:16:52 +00:00
Matt Arsenault	6bc4304e35	AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs Since SGPRs should spill to VGPRs, they should be allocated first. I don't think this is sufficient for SGPRs to always spill to VGPRs though. llvm-svn: 289671	2016-12-14 16:52:06 +00:00
Nirav Dave	9fd3ae9cf9	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667	2016-12-14 16:43:44 +00:00
Matt Arsenault	c74ace61e1	AMDGPU: Change vintrp printing llvm-svn: 289664	2016-12-14 16:36:12 +00:00
Nirav Dave	afe2eccae3	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after removing load-store factoring through token factors in favor of improved token factor operand pruning. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner, as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-adds the Store node to the worklist after calling SimplifyDemandedBits. 4. Increases GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.ll - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do not do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? 
CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659	2016-12-14 15:44:26 +00:00
Stephan Bergmann	aba15d97df	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Eugene Zelenko	c816ae3436	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289475	2016-12-12 22:23:53 +00:00
Nicolai Haehnle	baedb7e4bc	AMDGPU: llvm.amdgcn.interp.mov is a source of divergence Summary: While the result is constant across a single primitive, each pixel shader wave can have pixels from multiple primitives. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27572 llvm-svn: 289447	2016-12-12 16:52:19 +00:00
Matt Arsenault	82492d300b	AMDGPU: Fix asan errors when folding operands This was failing when trying to fold immediates into operand 1 of a phi, which only has one statically known operand. llvm-svn: 289337	2016-12-10 19:58:00 +00:00
Matt Arsenault	78957bd5fc	AMDGPU: Fix AMDGPUPromoteAlloca breaking addrspacecasts The users of the addrspacecast were having their types incorrectly changed, producing invalid bitcasts between address spaces. llvm-svn: 289307	2016-12-10 00:52:50 +00:00
Matt Arsenault	c2c2a10170	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306	2016-12-10 00:39:12 +00:00
Matt Arsenault	365f8ab107	AMDGPU: Fix vintrp disassembly llvm-svn: 289292	2016-12-10 00:29:55 +00:00
Matt Arsenault	61a1b18506	AMDGPU: Change vintrp printing to better match sc Some of the immediates need to be printed differently eventually. llvm-svn: 289291	2016-12-10 00:23:12 +00:00
Eugene Zelenko	796f37f3bb	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289282	2016-12-09 22:06:55 +00:00
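
The widening pitfall described in b6cc44c368 (Extract LaneBitmask into a separate type) can be sketched in C++ as follows. This is only a minimal illustration, assuming a 32-bit unsigned int; Mask64, getAll and fullMaskFromUnsigned are hypothetical names chosen for the example and are not LLVM's actual LaneBitmask interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a dedicated mask type (not LLVM's LaneBitmask).
// The explicit constructor blocks implicit conversions from plain integers.
struct Mask64 {
  explicit constexpr Mask64(uint64_t V) : Bits(V) {}
  // Builds the "full" mask from the underlying type itself.
  static constexpr Mask64 getAll() { return Mask64(~uint64_t(0)); }
  uint64_t Bits;
};

// The pattern the commit message warns about: ~0u is computed as an
// unsigned int and then zero-extended, so the upper 32 bits stay clear
// (assuming unsigned int is 32 bits wide).
inline uint64_t fullMaskFromUnsigned() {
  uint64_t LaneMask = ~0u;
  return LaneMask;
}
```

With a wrapper of this shape, an initialization such as "Mask64 M = ~0u;" no longer compiles, so the width mismatch is caught at build time instead of silently producing a half-empty mask.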

1392 Commits
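
The log entry for c387751c70 does not show the asserts it fixed. As a hedged illustration only, the C++ sketch below shows one kind of mistake that Clang's -Wstring-conversion is designed to flag: a string literal used where a condition was intended. The function names are hypothetical and the code is not taken from NewGVN.

```cpp
#include <cassert>

// A string literal decays to a non-null pointer, so it always converts to
// true. An assert whose whole condition is a literal can therefore never fire.
inline bool literalAloneIsTrue() {
  return static_cast<bool>("this message is not a condition");
}

// The usual idiom: the real condition comes first, and the literal after &&
// only documents the failure without changing the result.
inline bool conditionWithMessage(int Value) {
  return Value > 0 && "Value must be positive";
}
```

Writing the condition before the message keeps the check effective while still printing the text when the assert fails.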