llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Nikolai Bozhenov 2540ce6c57 [X86] Heuristic to selectively build Newton-Raphson SQRT estimation	2016-08-04 12:47:28 +00:00

On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation.

The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized.

Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR.

Differential Revision: https://reviews.llvm.org/D21379
llvm-svn: 277725
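
The heuristic in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the names `CoreKind`, `useNewtonRaphsonSqrt`, `refineRsqrt`, and `sqrtViaNewtonRaphson` are hypothetical, and the three-way core classification is an assumption made only to mirror the wording "big cores" and "Skylake". The refinement step is the standard Newton-Raphson iteration for 1/sqrt(a), with sqrt(a) recovered as a * rsqrt(a).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical classification, for illustration only (not LLVM's API).
enum class CoreKind { Other, BigCore, Skylake };

// Returns true when RSQRT plus Newton-Raphson refinement would be chosen
// over the hardware SQRT instruction, following the commit message:
//   - Skylake: NR disabled completely (vector SQRT throughput improved).
//   - Big cores: NR disabled for scalars (scalar SQRT has shorter latency).
//   - Everything else: NR stays enabled (assumption of this sketch).
bool useNewtonRaphsonSqrt(CoreKind core, bool isVector) {
  if (core == CoreKind::Skylake)
    return false;
  if (core == CoreKind::BigCore && !isVector)
    return false;
  return true;
}

// One Newton-Raphson step refining an estimate y of 1/sqrt(a):
//   y' = y * (1.5 - 0.5 * a * y * y)
float refineRsqrt(float a, float y) {
  return y * (1.5f - 0.5f * a * y * y);
}

// Software square root: sqrt(a) = a * rsqrt(a), using one refinement step.
// roughRsqrt stands in for the low-precision hardware RSQRT result.
float sqrtViaNewtonRaphson(float a, float roughRsqrt) {
  assert(a > 0.0f && "sketch handles positive inputs only");
  return a * refineRsqrt(a, roughRsqrt);
}
```

In this sketch a caller would check `useNewtonRaphsonSqrt` once per operation and fall back to the plain hardware square root when it returns false.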
AsmParser	[AMDGPU] Assembler: fix row_bcast parsing	2016-07-14 14:50:35 +00:00
Disassembler	AMDGPU: Expand register indexing pseudos in custom inserter	2016-07-19 00:35:03 +00:00
InstPrinter	AMDGPU: Remove unnecessary string usage in AsmPrinter	2016-07-05 22:06:56 +00:00
MCTargetDesc	[MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC	2016-07-25 17:18:28 +00:00
TargetInfo	Remove autoconf support	2016-01-26 21:29:08 +00:00
Utils	[AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields.	2016-06-23 14:13:06 +00:00
AMDGPU.h	AMDGPU: Change fdiv lowering based on !fpmath metadata	2016-07-19 23:16:53 +00:00
AMDGPU.td	AMDGPU: Add feature for unaligned access	2016-07-01 23:03:44 +00:00
AMDGPUAlwaysInlinePass.cpp	Cloning: Clean up the interface to the CloneFunction function.	2016-05-10 20:23:24 +00:00
AMDGPUAnnotateKernelFeatures.cpp	AMDGPU: Add HSA dispatch id intrinsic	2016-07-22 17:01:30 +00:00
AMDGPUAnnotateUniformValues.cpp	Add optimization bisect opt-in calls for AMDGPU passes	2016-04-25 22:23:44 +00:00
AMDGPUAsmPrinter.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
AMDGPUAsmPrinter.h	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
AMDGPUCallingConv.td	AMDGPU: Fix kernel argument alignment impacting stack size	2016-06-18 05:15:53 +00:00
AMDGPUCallLowering.cpp	AMDGPU: Add skeleton GlobalIsel implementation	2016-04-14 19:09:28 +00:00
AMDGPUCallLowering.h	AMDGPU: Add skeleton GlobalIsel implementation	2016-04-14 19:09:28 +00:00
AMDGPUCodeGenPrepare.cpp	AMDGPU: Use rcp for fdiv 1, x with fpmath metadata	2016-07-26 23:25:44 +00:00
AMDGPUFrameLowering.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
AMDGPUFrameLowering.h	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.cpp	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.h	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.td	AMDGPU: Add intrinsics for compare with the full wavefront result	2016-07-28 16:42:13 +00:00
AMDGPUInstructions.td	AMDGPU: Fix i1 fp_to_int	2016-07-22 17:01:21 +00:00
AMDGPUIntrinsicInfo.cpp	AMDGPU: Change fdiv lowering based on !fpmath metadata	2016-07-19 23:16:53 +00:00
AMDGPUIntrinsicInfo.h	AMDGPU: Change fdiv lowering based on !fpmath metadata	2016-07-19 23:16:53 +00:00
AMDGPUIntrinsics.td	AMDGPU: Remove read_workdim intrinsic	2016-07-25 20:17:02 +00:00
AMDGPUISelDAGToDAG.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
AMDGPUISelLowering.cpp	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation	2016-08-04 12:47:28 +00:00
AMDGPUISelLowering.h	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation	2016-08-04 12:47:28 +00:00
AMDGPUMachineFunction.cpp	AMDGPU: Make AMDGPUMachineFunction fields private	2016-07-26 16:45:58 +00:00
AMDGPUMachineFunction.h	AMDGPU: Make AMDGPUMachineFunction fields private	2016-07-26 16:45:58 +00:00
AMDGPUMCInstLower.cpp	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL	2016-07-13 14:23:33 +00:00
AMDGPUMCInstLower.h	AMDGPU: R600 code splitting cleanup	2016-03-11 08:00:27 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp	[NFC] Header cleanup	2016-04-18 09:17:29 +00:00
AMDGPUPromoteAlloca.cpp	AMDGPU: Remove pointless dyn_cast_or_null	2016-07-18 19:00:07 +00:00
AMDGPURegisterInfo.cpp	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
AMDGPURegisterInfo.h	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
AMDGPURegisterInfo.td
AMDGPURuntimeMetadata.h	Re-commit [AMDGPU] Add metadata for runtime	2016-07-16 05:09:21 +00:00
AMDGPUSubtarget.cpp	AMDGPU: Delete dead code	2016-07-25 19:06:25 +00:00
AMDGPUSubtarget.h	AMDGPU: Delete dead code	2016-07-25 19:06:25 +00:00
AMDGPUTargetMachine.cpp	[GlobalISel] Introduce an instruction selector.	2016-07-27 14:31:55 +00:00
AMDGPUTargetMachine.h	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
AMDGPUTargetObjectFile.cpp	Revert "[AMDGPU] Emit read-only data to .rodata for hsa"	2016-07-22 23:46:40 +00:00
AMDGPUTargetObjectFile.h	AMDGPU/SI: Add support for AMD code object version 2.	2016-05-05 17:03:33 +00:00
AMDGPUTargetTransformInfo.cpp	AMDGPU: Implement getLoadStoreVecRegBitWidth	2016-07-01 00:56:27 +00:00
AMDGPUTargetTransformInfo.h	[TTI] The cost model should not assume vector casts get completely scalarized	2016-07-06 17:30:56 +00:00
AMDILCFGStructurizer.cpp	AMDGPU: Remove implicit iterator conversions, NFC	2016-07-08 19:16:05 +00:00
AMDKernelCodeT.h	[AMDGPU] fix amd_kernel_code_t bit field position as per spec (added missing reserved fields)	2016-02-24 10:54:25 +00:00
CaymanInstructions.td	AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads	2016-07-05 00:12:51 +00:00
CIInstructions.td	[AMDGPU] refactor DS instruction definitions. NFC.	2016-08-01 14:21:30 +00:00
CMakeLists.txt	AMDGPU/R600: Delete/rename intrinsics no longer used by mesa	2016-07-14 05:47:17 +00:00
DSInstructions.td	[AMDGPU] refactor DS instruction definitions. NFC.	2016-08-01 14:21:30 +00:00
EvergreenInstructions.td	AMDGPU/R600: Replace barrier intrinsics	2016-07-18 18:34:59 +00:00
GCNHazardRecognizer.cpp	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
GCNHazardRecognizer.h	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
LLVMBuild.txt	AMDGPU: Prune AMDGPUAsmParser in libdeps.	2016-07-09 07:54:27 +00:00
Processors.td	AMDGPU: Fix crashes on unknown processor name	2016-06-02 18:37:16 +00:00
R600ClauseMergePass.cpp	AMDGPU: Remove implicit iterator conversions, NFC	2016-07-08 19:16:05 +00:00
R600ControlFlowFinalizer.cpp	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
R600Defines.h	AMDGPU: R600 code splitting cleanup	2016-03-11 08:00:27 +00:00
R600EmitClauseMarkers.cpp	AMDGPU: Remove implicit iterator conversions, NFC	2016-07-08 19:16:05 +00:00
R600ExpandSpecialInstrs.cpp	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
R600FrameLowering.cpp	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
R600FrameLowering.h	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
R600InstrFormats.td
R600InstrInfo.cpp	[AMDGPU] Fix lifetime of SmallVector temporaries.	2016-07-30 11:31:16 +00:00
R600InstrInfo.h	AMDGPU/R600: Delete dead code.	2016-07-15 21:26:46 +00:00
R600Instructions.td	AMDGPU: Fix TargetPrefix for remaining r600 intrinsics	2016-07-15 21:27:08 +00:00
R600Intrinsics.td	AMDGPU: Fix TargetPrefix for remaining r600 intrinsics	2016-07-15 21:27:08 +00:00
R600ISelLowering.cpp	AMDGPU/R600: Remove dead custom inserters	2016-07-26 21:03:38 +00:00
R600ISelLowering.h	AMDGPU: Fix i1 fp_to_int	2016-07-22 17:01:21 +00:00
R600MachineFunctionInfo.cpp	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
R600MachineFunctionInfo.h	AMDGPU: Delete more dead code	2016-07-22 17:01:25 +00:00
R600MachineScheduler.cpp	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC	2016-06-30 00:01:54 +00:00
R600MachineScheduler.h	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
R600OptimizeVectorRegisters.cpp	AMDGPU: Remove implicit iterator conversions, NFC	2016-07-08 19:16:05 +00:00
R600Packetizer.cpp	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC	2016-06-30 00:01:54 +00:00
R600RegisterInfo.cpp	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
R600RegisterInfo.h	AMDGPU: Move R600 only pieces into R600 classes	2016-07-09 18:11:15 +00:00
R600RegisterInfo.td
R600Schedule.td	AMDGPU: Fix trailing whitespace	2016-06-10 02:18:02 +00:00
R700Instructions.td
SIAnnotateControlFlow.cpp	AMDGPU/SI: Don't handle a loop if there is no loop at all for a terminator BB.	2016-07-28 23:01:45 +00:00
SIDebuggerInsertNops.cpp	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
SIDefines.h	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
SIFixControlFlowLiveIntervals.cpp	Revert "AMDGPU: Remove unused control flow intrinsic"	2016-07-09 17:18:39 +00:00
SIFixSGPRCopies.cpp	Revert "AMDGPU: Remove unused control flow intrinsic"	2016-07-09 17:18:39 +00:00
SIFoldOperands.cpp	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC	2016-06-30 00:01:54 +00:00
SIFrameLowering.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
SIFrameLowering.h	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header	2016-06-25 03:11:28 +00:00
SIInsertWaits.cpp	AMDGPU: Remove implicit iterator conversions, NFC	2016-07-08 19:16:05 +00:00
SIInstrFormats.td	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
SIInstrInfo.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
SIInstrInfo.h	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
SIInstrInfo.td	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
SIInstructions.td	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
SIIntrinsics.td	AMDGPU: Change fdiv lowering based on !fpmath metadata	2016-07-19 23:16:53 +00:00
SIISelLowering.cpp	AMDGPU: fdiv -1, x -> rcp -x	2016-08-02 22:25:04 +00:00
SIISelLowering.h	AMDGPU: Remove analyzeImmediate	2016-07-28 00:32:02 +00:00
SILoadStoreOptimizer.cpp	AMDGPU: Move subtarget feature checks into passes	2016-06-27 20:32:13 +00:00
SILowerControlFlow.cpp	AMDGPU: add execfix flag to SI_ELSE	2016-07-28 11:39:24 +00:00
SILowerI1Copies.cpp	AMDGPU: Cleanup subtarget handling.	2016-06-24 06:30:11 +00:00
SIMachineFunctionInfo.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
SIMachineFunctionInfo.h	AMDGPU: Make AMDGPUMachineFunction fields private	2016-07-26 16:45:58 +00:00
SIMachineScheduler.cpp	AMDGPU/SI: Fix SI scheduler refcount issue	2016-07-19 00:35:22 +00:00
SIMachineScheduler.h	AMDGPU: R600 code splitting cleanup	2016-03-11 08:00:27 +00:00
SIRegisterInfo.cpp	MachineFunction: Return reference for getFrameInfo(); NFC	2016-07-28 18:40:00 +00:00
SIRegisterInfo.h	AMDGPU/SI: Don't use reserved VGPRs for SGPR spilling	2016-07-28 14:30:43 +00:00
SIRegisterInfo.td	[AMDGPU] Some code cleaning in SIRegisterInfo.td	2016-07-21 13:29:57 +00:00
SISchedule.td	AMDGPU: Define a schedule class for COPY.	2016-06-24 23:52:11 +00:00
SIShrinkInstructions.cpp	AMDGPU: Expand register indexing pseudos in custom inserter	2016-07-19 00:35:03 +00:00
SITypeRewriter.cpp	AMDGPU: Add a shader calling convention	2016-04-06 19:40:20 +00:00
SIWholeQuadMode.cpp	AMDGPU: Stay in WQM for non-intrinsic stores	2016-08-02 19:31:14 +00:00
VIInstrFormats.td	[AMDGPU] refactor DS instruction definitions. NFC.	2016-08-01 14:21:30 +00:00
VIInstructions.td	[AMDGPU] refactor DS instruction definitions. NFC.	2016-08-01 14:21:30 +00:00