Lifan Sun's blog

Welcome to my blog.

Reading Notes: “FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness”

Mar 8, 2025

Reading Notes: “Efficient Memory Management for Large Language Model Serving with PagedAttention”

Mar 7, 2025

Reading Notes: “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”

Mar 2, 2025

Reading Notes: “Training Compute-Optimal Large Language Models”

Mar 1, 2025

Reading Notes: “GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints”

Feb 28, 2025

Reading Notes: GPT Series

Feb 27, 2025

Reading Notes: “Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning”

Feb 23, 2025


© Lifan Sun 2023 - 2025