llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

Author	SHA1	Message	Date
Matt Arsenault	40d3e7765c	AMDGPU: Remove denormal subtarget features Switch to using the denormal-fp-math/denormal-fp-math-f32 attributes.	2020-04-02 17:17:12 -04:00
Changpeng Fang	d65c44efb7	AMDGPU: Implement FDIV optimizations in AMDGPUCodeGenPrepare Summary: RCP has limited accuracy. If the !fpmath on an FDIV requires high accuracy, rcp may not meet the requirement. However, in DAG lowering, fpmath information gets lost, and thus we may generate either inaccurate rcp-related computation or slow code for fdiv. This patch implements fdiv optimizations in AMDGPUCodeGenPrepare, which knows the exact !fpmath. FastUnsafeRcpLegal: We determine whether it is legal to use rcp based on unsafe-fp-math, fast math flags, denormals and the fpmath accuracy request. RCP Optimizations: 1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with denormals flushed. a/b -> a*rcp(b) when fast unsafe rcp is legal. Use fdiv.fast: a/b -> fdiv.fast(a, b) when RCP optimization is not performed and fpmath >= 2.5ULP with denormals flushed. 1/x -> fdiv.fast(1,x) when RCP optimization is not performed and fpmath >= 2.5ULP with denormals. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D71293	2020-01-23 16:57:43 -08:00
Qiu Chaofan	3d8ed5a845	[DAGCombiner] Improve division estimation of floating points. Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy. Reviewed By: Sanjay Patel, Jinsong Ji Differential Revision: https://reviews.llvm.org/D66050 llvm-svn: 371713	2019-09-12 07:51:24 +00:00
Matt Arsenault	7ead0f3ed6	AMDGPU: Allow SIShrinkInstructions to work in non-SSA Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575	2017-07-10 19:53:57 +00:00
Alexander Timofeev	71cf453d98	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097	2017-07-04 17:32:00 +00:00
NAKAMURA Takumi	e6c7524092	Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default" It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054	2017-07-04 02:14:18 +00:00
Alexander Timofeev	ef2bf78d2f	[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026	2017-07-03 14:54:11 +00:00
Matt Arsenault	dd9ab77318	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444	2017-03-21 21:39:51 +00:00
Matt Arsenault	6944286a7c	AMDGPU: fdiv -1, x -> rcp -x llvm-svn: 277535	2016-08-02 22:25:04 +00:00
Matt Arsenault	7193bf43c5	AMDGPU: Add volatile to test loads and stores When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads. llvm-svn: 266071	2016-04-12 13:38:18 +00:00
Matt Arsenault	e676b40286	AMDGPU: Remove some old intrinsic uses from tests llvm-svn: 260493	2016-02-11 06:02:01 +00:00
Tom Stellard	3f1708598e	R600 -> AMDGPU rename llvm-svn: 239657	2015-06-13 03:28:10 +00:00
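
The rule set summarized in commit d65c44efb7 (D71293) can be read as a small decision table. The sketch below only restates that summary in Python; it is not the pass's code. The function name, the parameter names, and the returned strings are hypothetical, and `fast_unsafe_rcp_legal` is taken as an input because the summary does not spell out how it is computed.

```python
from typing import Optional

# Accuracy threshold (in ULP) named in the commit summary.
RELAXED_ULP = 2.5


def choose_fdiv_lowering(
    numerator_is_one: bool,
    fast_unsafe_rcp_legal: bool,
    fpmath_ulp: Optional[float],
    denormals_flushed: bool,
) -> str:
    """Return which form the summary says an fdiv is rewritten to.

    fpmath_ulp is the !fpmath accuracy request, or None when absent
    (assumed here to mean that full accuracy is required).
    """
    relaxed = fpmath_ulp is not None and fpmath_ulp >= RELAXED_ULP

    if numerator_is_one:
        # 1/x -> rcp(x): fast unsafe rcp legal, or relaxed accuracy
        # with denormals flushed.
        if fast_unsafe_rcp_legal or (relaxed and denormals_flushed):
            return "rcp(x)"
        # 1/x -> fdiv.fast(1, x): no rcp rewrite, relaxed accuracy,
        # denormals not flushed.
        if relaxed:
            return "fdiv.fast(1, x)"
        return "fdiv"

    # a/b -> a * rcp(b): only when fast unsafe rcp is legal.
    if fast_unsafe_rcp_legal:
        return "a * rcp(b)"
    # a/b -> fdiv.fast(a, b): no rcp rewrite, relaxed accuracy,
    # denormals flushed.
    if relaxed and denormals_flushed:
        return "fdiv.fast(a, b)"
    return "fdiv"


if __name__ == "__main__":
    print(choose_fdiv_lowering(True, False, 2.5, True))
    print(choose_fdiv_lowering(False, False, 2.5, False))
```

Keeping the rcp checks ahead of the fdiv.fast checks mirrors the summary's wording that fdiv.fast is used only "when RCP optimization is not performed".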

12 Commits
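
Commit 3d8ed5a845 (D66050) describes re-ordering the arithmetic in the last refinement iteration so that the quotient is not formed by a final multiply with an already rounded reciprocal. A minimal sketch of that idea follows, assuming the usual Newton-Raphson reciprocal step `x + x * (1 - d * x)`; the function names are hypothetical, and the code uses Python double-precision floats rather than DAG nodes, so it shows the shape of the computation, not the actual DAGCombiner implementation.

```python
def refine_reciprocal(d: float, est: float, iterations: int) -> float:
    """Refine an estimate of 1/d with Newton-Raphson steps."""
    for _ in range(iterations):
        est = est + est * (1.0 - d * est)
    return est


def div_reciprocal_first(n: float, d: float, est: float, iterations: int) -> float:
    """Older ordering: fully refine 1/d, then multiply by the numerator."""
    return n * refine_reciprocal(d, est, iterations)


def div_numerator_in_last_step(n: float, d: float, est: float, iterations: int) -> float:
    """Re-ordered form: fold the numerator into the final iteration."""
    if iterations == 0:
        return n * est
    est = refine_reciprocal(d, est, iterations - 1)
    q = n * est                    # preliminary quotient
    return q + est * (n - d * q)   # correct the quotient by its residual


if __name__ == "__main__":
    # 0.3 is a deliberately crude starting estimate of 1/3.
    print(div_reciprocal_first(1.0, 3.0, 0.3, 3))
    print(div_numerator_in_last_step(1.0, 3.0, 0.3, 3))
```

In exact arithmetic the two forms are identical; they differ only in where rounding happens, since the last step of the re-ordered form corrects the quotient itself using the residual `n - d * q`.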