CMU 11-667: Large Language Models: Methods and Applications

Course Overview

  • University: Carnegie Mellon University
  • Prerequisites: Solid background in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with PyTorch or similar deep learning frameworks.
  • Programming Language: Python
  • Course Difficulty: 🌟🌟🌟🌟
  • Estimated Study Hours: 100+

This graduate-level course provides a comprehensive overview of methods and applications of Large Language Models (LLMs), covering a wide range of topics from core architectures to cutting-edge techniques. Course content includes:

  1. Foundations: Neural network architectures for language modeling, training procedures, inference, and evaluation metrics (a minimal training/evaluation sketch follows this list).
  2. Advanced Topics: Model interpretability, alignment methods, emergent capabilities, and applications in both textual and non-textual domains.
  3. System & Optimization Techniques: Large-scale pretraining strategies, deployment optimization, and efficient training/inference methods.
  4. Ethics & Safety: Addressing model bias, adversarial attacks, and legal/regulatory concerns.

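To make item 1 concrete, here is a minimal sketch, assuming PyTorch, of how language-model training and evaluation typically fit together: a model is scored with next-token cross-entropy, and perplexity is the exponential of that loss. The toy vocabulary size, model, and random batch are illustrative assumptions, not course material.

```python
# Minimal sketch (illustrative assumptions throughout): a toy next-token
# predictor trained with cross-entropy; perplexity = exp(average loss).
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32

# Toy "language model": embed the current token, predict the next token.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Fake batch: input tokens and the next-token targets they should predict.
inputs = torch.randint(0, vocab_size, (8, 16))    # (batch, sequence)
targets = torch.randint(0, vocab_size, (8, 16))

logits = model(inputs)                            # (batch, sequence, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()                                   # one "training" gradient step
optimizer.step()

perplexity = loss.exp()                           # standard LM evaluation metric
print(f"cross-entropy: {loss.item():.3f}, perplexity: {perplexity.item():.1f}")
```
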
The course blends lectures, readings, quizzes, interactive exercises, assignments, and a final project to offer students a deep and practical understanding of LLMs, preparing them for both research and real-world system development.

Self-Study Tips:

  • Thoroughly read all assigned papers and materials before each class.
  • Become proficient with PyTorch and implement core models and algorithms by hand (see the self-attention sketch after these tips).
  • Complete the assignments diligently to build practical skills and reinforce theoretical understanding.

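As a starting point for the "implement by hand" tip, here is a minimal sketch of scaled dot-product self-attention in PyTorch. The class name, dimensions, and toy input are illustrative assumptions, not taken from the course assignments.

```python
# Minimal single-head self-attention, the kind of building block worth
# reimplementing by hand before tackling a full Transformer.
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, embed_dim: int):
        super().__init__()
        # One projection each for queries, keys, and values.
        self.q = nn.Linear(embed_dim, embed_dim)
        self.k = nn.Linear(embed_dim, embed_dim)
        self.v = nn.Linear(embed_dim, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence, embed_dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        weights = scores.softmax(dim=-1)          # attention distribution
        return weights @ v                        # weighted sum of values

x = torch.randn(2, 5, 64)                         # toy batch
print(SelfAttention(64)(x).shape)                 # torch.Size([2, 5, 64])
```
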
Course Resources

  • Course Website: https://cmu-llms.org/
  • Course Videos: Selected lecture slides and materials are available on the website; full lecture recordings may require CMU internal access.
  • Course Materials: Curated research papers and supplementary materials, with the full reading list available on the course site.
  • Assignments: Six programming assignments covering topics such as data preparation, Transformer implementation, retrieval-augmented generation, model evaluation and debiasing, and training efficiency (a minimal retrieval sketch follows below). Details at https://cmu-llms.org/assignments/
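
For orientation only, here is a minimal sketch of the retrieval step behind retrieval-augmented generation: embed the documents and the query, rank by cosine similarity, and prepend the best match to the prompt. The toy corpus and the random stand-in "embeddings" are assumptions; an actual assignment would use a trained encoder and a real generator.

```python
# Minimal retrieval-augmented prompting sketch (illustrative only).
import torch

corpus = [
    "The Transformer architecture relies on self-attention.",
    "Perplexity is a common evaluation metric for language models.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]

torch.manual_seed(0)
doc_embeddings = torch.randn(len(corpus), 128)    # stand-in for encoder outputs
query_embedding = torch.randn(128)                # stand-in for the encoded query

# Cosine similarity between the query and every document embedding.
scores = torch.nn.functional.cosine_similarity(
    doc_embeddings, query_embedding.unsqueeze(0), dim=-1
)
best = scores.argmax().item()

prompt = f"Context: {corpus[best]}\nQuestion: What is RAG?\nAnswer:"
print(prompt)   # this augmented prompt would then be passed to the LLM
```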