CMU 11-667: Large Language Models: Methods and Applications

Course Overview

  • University: Carnegie Mellon University
  • Prerequisites: Solid background in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with PyTorch or similar deep learning frameworks.
  • Programming Language: Python
  • Course Difficulty: 🌟🌟🌟🌟
  • Estimated Study Hours: 100+

This graduate-level course provides a comprehensive overview of methods and applications of Large Language Models (LLMs), covering a wide range of topics from core architectures to cutting-edge techniques. Course content includes:

  1. Foundations: Neural network architectures for language modeling, training procedures, inference, and evaluation metrics (a minimal training/evaluation sketch follows this list).
  2. Advanced Topics: Model interpretability, alignment methods, emergent capabilities, and applications in both textual and non-textual domains.
  3. System & Optimization Techniques: Large-scale pretraining strategies, deployment optimization, and efficient training/inference methods.
  4. Ethics & Safety: Addressing model bias, adversarial attacks, and legal/regulatory concerns.

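To make item 1 concrete, here is a minimal sketch, assuming PyTorch, of how language-model training and evaluation typically fit together: a model is scored with next-token cross-entropy, and perplexity is the exponential of that loss. The toy vocabulary size, model, and random batch are illustrative assumptions, not course material.

```python
# Minimal sketch (illustrative assumptions throughout): a toy next-token
# predictor trained with cross-entropy; perplexity = exp(average loss).
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32

# Toy "language model": embed the current token, predict the next token.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Fake batch: input tokens and the next-token targets they should predict.
inputs = torch.randint(0, vocab_size, (8, 16))    # (batch, sequence)
targets = torch.randint(0, vocab_size, (8, 16))

logits = model(inputs)                            # (batch, sequence, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()                                   # one "training" gradient step
optimizer.step()

perplexity = loss.exp()                           # standard LM evaluation metric
print(f"cross-entropy: {loss.item():.3f}, perplexity: {perplexity.item():.1f}")
```
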
The course blends lectures, readings, quizzes, interactive exercises, assignments, and a final project to offer students a deep and practical understanding of LLMs, preparing them for both research and real-world system development.

Self-Study Tips:

  • Thoroughly read all assigned papers and materials before each class.
  • Become proficient with PyTorch and implement core models and algorithms by hand (see the self-attention sketch after these tips).
  • Complete the assignments diligently to build practical skills and reinforce theoretical understanding.

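As a starting point for the "implement by hand" tip, here is a minimal sketch of scaled dot-product self-attention in PyTorch. The class name, dimensions, and toy input are illustrative assumptions, not taken from the course assignments.

```python
# Minimal single-head self-attention, the kind of building block worth
# reimplementing by hand before tackling a full Transformer.
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, embed_dim: int):
        super().__init__()
        # One projection each for queries, keys, and values.
        self.q = nn.Linear(embed_dim, embed_dim)
        self.k = nn.Linear(embed_dim, embed_dim)
        self.v = nn.Linear(embed_dim, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence, embed_dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        weights = scores.softmax(dim=-1)          # attention distribution
        return weights @ v                        # weighted sum of values

x = torch.randn(2, 5, 64)                         # toy batch
print(SelfAttention(64)(x).shape)                 # torch.Size([2, 5, 64])
```
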
Course Resources

  • Course Website: https://cmu-llms.org/
  • Course Videos: Selected lecture slides and materials are available on the website; full lecture recordings may require CMU internal access.
  • Course Materials: Curated research papers and supplementary materials, with the full reading list available on the course site.
  • Assignments: Six programming assignments covering topics such as data preparation, Transformer implementation, retrieval-augmented generation, model evaluation and debiasing, and training efficiency (a minimal retrieval sketch follows below). Details at https://cmu-llms.org/assignments/
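
For orientation only, here is a minimal sketch of the retrieval step behind retrieval-augmented generation: embed the documents and the query, rank by cosine similarity, and prepend the best match to the prompt. The toy corpus and the random stand-in "embeddings" are assumptions; an actual assignment would use a trained encoder and a real generator.

```python
# Minimal retrieval-augmented prompting sketch (illustrative only).
import torch

corpus = [
    "The Transformer architecture relies on self-attention.",
    "Perplexity is a common evaluation metric for language models.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]

torch.manual_seed(0)
doc_embeddings = torch.randn(len(corpus), 128)    # stand-in for encoder outputs
query_embedding = torch.randn(128)                # stand-in for the encoded query

# Cosine similarity between the query and every document embedding.
scores = torch.nn.functional.cosine_similarity(
    doc_embeddings, query_embedding.unsqueeze(0), dim=-1
)
best = scores.argmax().item()

prompt = f"Context: {corpus[best]}\nQuestion: What is RAG?\nAnswer:"
print(prompt)   # this augmented prompt would then be passed to the LLM
```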