AI Research Experiences

Harvard CS197

Take your AI skills to the next level

New! Course materials have been compiled into a Course Book, now available here.

Dive into cutting-edge development tools like PyTorch, Lightning, and Hugging Face, and streamline your workflow with VSCode, Git, and Conda. You'll learn how to harness the power of the cloud with AWS and Colab to train massive deep learning models with lightning-fast GPU acceleration. Plus, you'll master best practices for managing a large number of experiments with Weights & Biases. And that's just the beginning! This course will also teach you how to systematically read research papers, generate new ideas, and present them in slides or papers. You'll even learn valuable project management and team communication techniques used by top AI researchers. Don't miss out on this opportunity to level up your AI skills.

Instructed by Professor Pranav Rajpurkar.

Lecture Notes

You Complete My Sandwiches

Exciting Advances with AI Language Models

Lecture 1 notes

  • Interact with language models to test their capabilities using zero-shot and few-shot learning (a minimal sketch follows this list).

  • Learn to build simple apps with GPT-3’s text completion and use Codex’s code generation abilities.

  • Learn how language models can have a pernicious tendency to reflect societal biases.
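
For a concrete picture of the prompting styles above, here is a minimal sketch of zero-shot versus few-shot prompting. It assumes the pre-1.0 openai Python client and the text-davinci-003 completion model that was current when the course ran; the review text and API key are placeholders.

```python
# Minimal sketch: zero-shot vs. few-shot prompting with the (pre-1.0) openai client.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Zero-shot: the task is described, but no examples are given.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The plot dragged, but the acting was superb.\nSentiment:"
)

# Few-shot: a handful of worked examples precede the query.
few_shot = (
    "Review: I loved every minute.\nSentiment: positive\n"
    "Review: A total waste of time.\nSentiment: negative\n"
    "Review: The plot dragged, but the acting was superb.\nSentiment:"
)

for prompt in (zero_shot, few_shot):
    response = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=5, temperature=0)
    print(response["choices"][0]["text"].strip())
```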

The Zen of Python

Software Engineering Fundamentals

Lecture 2 notes

  • Edit Python codebases effectively using the VSCode editor.

  • Use git and conda comfortably in your coding workflow.

  • Debug without print statements using breakpoints and logpoints (a short sketch follows this list).

  • Use linting to find errors and improve Python style.
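
As a small illustration of the debugging bullet above, here is a hypothetical snippet showing where you might set a VSCode breakpoint or logpoint, or pause with Python's built-in breakpoint(), instead of adding print statements.

```python
# Hypothetical sketch of print-free debugging.

def normalize(scores):
    total = sum(scores)
    # Set a VSCode breakpoint here (or a logpoint that prints {total}),
    # or uncomment the next line to pause in the pdb debugger and inspect `total`:
    # breakpoint()
    return [s / total for s in scores]

if __name__ == "__main__":
    print(normalize([0.2, 0.3, 0.5]))
    # Calling normalize([0.0, 0.0]) would raise ZeroDivisionError; pausing at
    # the breakpoint lets you inspect `total` without sprinkling print() calls.
```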

Shoulders of Giants

Reading AI Research Papers

Lecture 3 notes

  • Conduct a literature search to identify papers relevant to a topic of interest.

  • Read a machine learning research paper and summarize its contributions.

  • Summarize previous works in an area.

In-Tune with Jazz Hands

Fine-tuning a Language Model using Hugging Face

Lecture 4 notes

  • Load up and process a natural language processing dataset using the datasets library.

  • Tokenize a text sequence, and understand the steps used in tokenization.

  • Construct a dataset and training step for causal language modeling (a minimal sketch follows this list).
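
A minimal sketch of the datasets/tokenization/causal-LM workflow described above, assuming the Hugging Face datasets and transformers libraries; the dataset (wikitext-2) and model (gpt2) are illustrative choices, not necessarily the ones used in lecture.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Load a small slice of a text dataset and drop empty lines.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
raw = raw.filter(lambda ex: len(ex["text"]) > 0)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# mlm=False makes this a causal-LM collator: labels are copies of input_ids,
# and the model handles the next-token shift internally.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
model = AutoModelForCausalLM.from_pretrained("gpt2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```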

Lightning McTorch

Fine-tuning a Vision Transformer using Lightning

Lecture 5 notes

  • Interact with code to explore data loading and tokenization of images for Vision Transformers.

  • Parse code for PyTorch architecture and modules for building a Vision Transformer.

  • Get acquainted with an example PyTorch Lightning training workflow (a minimal sketch follows this list).
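
A minimal sketch of a Lightning training workflow around a Vision Transformer, assuming the Hugging Face ViT checkpoint google/vit-base-patch16-224-in21k and a dataloader that yields pixel_values and labels; it is not the exact code walked through in lecture.

```python
import torch
import pytorch_lightning as pl
from transformers import ViTForImageClassification

class ViTClassifier(pl.LightningModule):
    def __init__(self, num_labels: int = 10, lr: float = 2e-5):
        super().__init__()
        self.save_hyperparameters()
        self.model = ViTForImageClassification.from_pretrained(
            "google/vit-base-patch16-224-in21k", num_labels=num_labels)

    def training_step(self, batch, batch_idx):
        # batch["pixel_values"]: (B, 3, 224, 224); batch["labels"]: (B,)
        outputs = self.model(pixel_values=batch["pixel_values"], labels=batch["labels"])
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.lr)

# Usage (my_dataloader is an assumed image-classification DataLoader):
# trainer = pl.Trainer(max_epochs=3, accelerator="auto")
# trainer.fit(ViTClassifier(), train_dataloaders=my_dataloader)
```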

Moonwalking with PyTorch

Solidifying PyTorch Fundamentals

Lectures 6+7 notes

  • Perform tensor operations in PyTorch.

  • Understand the backward and forward passes of a neural network in the context of Autograd (a short sketch follows this list).

  • Detect common issues in PyTorch training code.
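
A short, self-contained sketch of the ideas above: basic tensor operations, a forward pass, and the backward pass that Autograd performs, plus one common issue (forgotten gradient zeroing).

```python
import torch

# Basic tensor operations
x = torch.randn(3, 4)
w = torch.randn(4, 2, requires_grad=True)   # track gradients for w
b = torch.zeros(2, requires_grad=True)

# Forward pass: a tiny linear layer followed by a scalar loss
y = x @ w + b            # matrix multiply + broadcasted add
loss = (y ** 2).mean()   # dummy mean-squared loss

# Backward pass: autograd populates .grad for every tensor with requires_grad=True
loss.backward()
print(w.grad.shape, b.grad.shape)  # torch.Size([4, 2]) torch.Size([2])

# A common issue to detect: forgetting to zero gradients between steps,
# which makes .grad accumulate across iterations.
w.grad.zero_()
b.grad.zero_()
```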

Experiment Organization Sparks Joy

Organizing Model Training with Weights & Biases and Hydra

Lectures 8+9 notes

  • Manage experiment logging and tracking through Weights & Biases.

  • Perform hyperparameter search with Weights & Biases Sweeps.

  • Manage complex configurations using Hydra (a combined Hydra and W&B sketch follows this list).
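
A minimal sketch of how Hydra configuration and Weights & Biases logging fit together, assuming a Hydra config file at conf/config.yaml with lr and epochs keys and a hypothetical W&B project name; the training loop is a stand-in.

```python
import hydra
import wandb
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="conf", config_name="config", version_base=None)
def train(cfg: DictConfig) -> None:
    # Log the fully resolved config to W&B so every run records its settings.
    wandb.init(project="cs197-demo", config=OmegaConf.to_container(cfg, resolve=True))
    for epoch in range(cfg.epochs):
        fake_loss = 1.0 / (epoch + 1)           # stand-in for a real training loop
        wandb.log({"epoch": epoch, "loss": fake_loss, "lr": cfg.lr})
    wandb.finish()

if __name__ == "__main__":
    train()
```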

I Dreamed a Dream

A Framework for Generating Research Ideas

Lectures 10+11 notes

  • Identify gaps in a research paper, including in the research question, experimental setup, and findings.

  • Generate ideas to build on a research paper, thinking about the elements of the task of interest, the evaluation strategy, and the proposed method.

  • Iterate on your ideas to improve their quality.

Midjourney Generation: “a dream of climbing rainbow stairs”

Today Was a Fairytale

Structuring a Research Paper

Lectures 12+13 notes

  • Deconstruct the elements of a research paper and their sequence.

  • Take notes on the global and local structure of research paper writing.

Deep Learning on Cloud Nine

AWS EC2 for Deep Learning: Setup, Optimization, and Hands-on Training with CheXzero

Lectures 14+15 notes

  • Understand how to set up and connect to an AWS EC2 instance for deep learning.

  • Learn how to modify deep learning code for use with GPUs (a short sketch follows this list).

  • Gain hands-on experience running the model training process using a real codebase.
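
A minimal sketch of the usual CPU-to-GPU changes mentioned above: pick a device, move the model once, and move each batch inside the training loop. The tiny model and synthetic dataloader are placeholders, not the CheXzero codebase.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(512, 14).to(device)          # move parameters to the GPU once
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

# Stand-in for a real dataloader of (features, labels) batches.
dataloader = [(torch.randn(32, 512), torch.randint(0, 2, (32, 14)).float())
              for _ in range(10)]

for features, labels in dataloader:
    features, labels = features.to(device), labels.to(device)  # move each batch
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
```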

Make your dreams come tuned

Fine-Tuning Your Stable Diffusion Model

Lectures 16+17 notes

  • Create and fine-tune Stable Diffusion models using a Dreambooth template notebook (an inference sketch with a fine-tuned checkpoint follows this list).

  • Use AWS to accelerate the training of Stable Diffusion models with GPUs.

  • Work with unfamiliar codebases and use new tools, including Dreambooth, Colab, Accelerate, and Gradio, without necessarily needing a deep understanding of them.
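
A minimal inference sketch with the diffusers library, assuming a local directory of Dreambooth-fine-tuned weights (./dreambooth-output is a placeholder path); it is not the course's template notebook.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the fine-tuned pipeline from a local directory (placeholder path).
pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth-output", torch_dtype=torch.float16)
pipe = pipe.to("cuda")  # a GPU is strongly recommended for Stable Diffusion

prompt = "a photo of sks dog on the moon"  # 'sks' is the Dreambooth identifier token
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("sample.png")
```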

Research Productivity Power-Ups

Tips to Manage Your Time and Efforts

Lecture 18 notes

  • Learn how to use update meetings and working sessions to stay aligned and make progress on a project.

  • Understand how to use various tools and techniques to improve team communication and project organization.

  • Learn strategies for organizing your efforts on a project, considering the stage of the project and the various tasks involved.

The AI Ninja

Making Progress and Impact in AI Research

Lecture 19 notes

  • Learn how to make steady progress in research, including managing the relationship with your advisor and identifying the skills to develop.

  • Gain a deeper understanding of how to increase the impact of your work.

Bejeweled

Tips for Creating High-Quality Slides

Lecture 20 notes

  • Apply key principles of the assertion-evidence approach for creating effective slides for talks.

  • Identify common pitfalls in typical slide presentations and strategies for avoiding them.

  • Apply the techniques learned in this lecture to real-world examples of research talk slides to improve their effectiveness.

Model Showdown

Statistical Testing to Compare Model Performances

Lecture 21 notes

  • Understand the different statistical tests that can be used to compare machine learning models, including McNemar's test, the paired t-test, and the bootstrap method.

  • Be able to implement these statistical tests in Python to evaluate the performance of two models on the same test set (a bootstrap sketch follows this list).

  • Be able to select an appropriate test for a given research question, including tests for statistical superiority, non-inferiority, and equivalence.
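
A minimal sketch of one of the tests above, a paired bootstrap on the accuracy difference between two models evaluated on the same test set; the per-example correctness arrays are simulated placeholders, not real results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# 1 = correct, 0 = incorrect, per test example (placeholder data).
model_a = rng.binomial(1, 0.82, size=n)
model_b = rng.binomial(1, 0.78, size=n)

observed_diff = model_a.mean() - model_b.mean()

# Resample test examples with replacement, keeping the pairing between models.
boot_diffs = []
for _ in range(10_000):
    idx = rng.integers(0, n, size=n)
    boot_diffs.append(model_a[idx].mean() - model_b[idx].mean())
boot_diffs = np.array(boot_diffs)

ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])
print(f"accuracy difference = {observed_diff:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
# If the confidence interval excludes 0, the difference is significant at the 5% level.
```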