Rotary Positional Encoding
Reading Notes: “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”
Reading Notes: “Training Compute-Optimal Large Language Models”
Reading Notes: “GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints”
Reading Notes: GPT Series
Two ways of adapting LLMs for Recommender Systems
Reading Notes on NLP Papers
Reading Notes: “Annotated Transformer”
Introduction to GenAI 2024 Spring, Course Notes
Reading Note: “WizardLM: Empowering Large Language Models to Follow Complex Instructions”
Brief Notes on Instruction Tuning