Reading Notes: “FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness”
Reading Notes: “Efficient Memory Management for Large Language Model Serving with PagedAttention”
Reading Notes: “Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning”
Reading Notes: “GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism”
Distributed Training Basics
Reading Notes: Megatron-LM v1
Quantization for NN Inference
Reading Notes: TVM
Reading Notes: Triton
Moore’s Law and the Future of Computing Beyond It
Deep Learning Performance Background
Reading Notes: MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive
An Architecture Overview of ML Systems
PMPP Reading Notes
Summary Notes on the “Deep Learning Systems: Algorithms and Implementation” Course