AMD FSR rollback FP32 single precision test, native FP16 is 7% faster • InfoTech News
More In-Depth Details of Floating Point Precision - NVIDIA CUDA - PyTorch Dev Discussions
AMD's FidelityFX Super Resolution Is Just 7% Slower in FP32 Mode vs FP16 | Tom's Hardware
Understanding Mixed Precision Training | by Jonathan Davis | Towards Data Science
Mixed-Precision Training of Deep Neural Networks | NVIDIA Technical Blog
AMD FidelityFX Super Resolution FP32 fallback tested, native FP16 is 7% faster - VideoCardz.com
Revisiting Volta: How to Accelerate Deep Learning - The NVIDIA Titan V Deep Learning Deep Dive: It's All About The Tensor Cores
Bfloat16 – a brief intro - AEWIN
Post-Training Quantization of TensorFlow model to FP16 | by zong fan | Medium
FP16 Throughput on GP104: Good for Compatibility (and Not Much Else) - The NVIDIA GeForce GTX 1080 & GTX 1070 Founders Editions Review: Kicking Off the FinFET Generation
The bfloat16 numerical format | Cloud TPU | Google Cloud
A Shallow Dive Into Tensor Cores - The NVIDIA Titan V Deep Learning Deep Dive: It's All About The Tensor Cores
Mixed-Precision Programming with CUDA 8 | NVIDIA Technical Blog
fp16 – Nick Higham
Choose FP16, FP32 or int8 for Deep Learning Models