Distributed Training Basics
Reading Note: Megatron-LM v1
Quantization for NN Inference
Reading Note: TVM
Reading Note: Triton
Moore’s Law, and the future of computing beyond Moore’s Law
Deep Learning Performance Background
Reading Notes: MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive
An Architecture Overview of ML Systems
PMPP Reading Notes
Summary Notes on “Deep Learning Systems: Algorithms and Implementation” course