Stanford Doerr School of Sustainability  ·  Stanford Data Science

MATRICS HPC Bootcamp

High Performance Computing Essentials for Stanford Researchers

Stanford Campus Postdocs & Graduate Students Certificate Program

Hands-On HPC for AI-Driven Research

MATRICS is a week-long, on-campus intensive designed to give Stanford postdocs and graduate students practical mastery of High Performance Computing workflows — from resource estimation and job scheduling all the way to GPU-accelerated distributed AI/ML pipelines.

Each morning kicks off with experiential tutorials led by the SDSS Research Computing Team and NVIDIA engineers. Afternoons are yours: bring your own research project and get direct, hands-on support from instructors and peers in a collaborative cohort setting.

Stanford Doerr School of Sustainability  ·  Stanford Data Science  ·  NVIDIA  ·  NSF ACCESS
5 Intensive days
8 Expert tutorials
1 Certificate earned

Platforms You'll Work With

Sherlock  ·  Farmshare  ·  Oak  ·  NVIDIA Cloud  ·  NSF ACCESS  ·  Globus

What You'll Learn

Eight tutorials spanning the full HPC stack — from job orchestration to GPU profiling.

⚙️

Resource Estimation & sbatch

Master SLURM job scripting, resource allocation strategies, and sbatch tips that save you wall-clock time and service units (SUs).

📦

Containers with Apptainer

Build, port, and run reproducible containers across Sherlock, Farmshare, and NVIDIA Cloud using Apptainer.

🔀

Embarrassingly Parallel Jobs

Turn single-threaded scripts into massively parallel SLURM array jobs with minimal code changes.
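
As a rough illustration of what "minimal code changes" can look like, here is a sketch of a script that picks its share of the work from the array index SLURM exports. It assumes the job is submitted with zero-based indices (for example `sbatch --array=0-3`). The input file names and the `chunk_for_task` helper are hypothetical and not taken from the tutorial materials.

```python
import os


def chunk_for_task(items, task_id, num_tasks):
    """Return the items this array task should handle (strided split)."""
    return items[task_id::num_tasks]


def main():
    # SLURM sets these variables inside an array job. The defaults let the
    # same script run unchanged as a plain serial job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    num_tasks = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))

    # Hypothetical input list; a real script would list its own files.
    inputs = [f"sample_{i:03d}.dat" for i in range(10)]

    for path in chunk_for_task(inputs, task_id, num_tasks):
        # Stand-in for the original single-threaded processing step.
        print(f"task {task_id} would process {path}")


if __name__ == "__main__":
    main()
```

The strided split keeps every input assigned to exactly one task without any coordination between tasks, which is what makes the problem embarrassingly parallel.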

🎛️

Hyperparameter Optimization

Use Weights & Biases for systematic, distributed hyperparameter sweeps on HPC and cloud resources.

🗂️

Storage Workflows

Navigate Oak, Sherlock, and L_SCRATCH effectively — data placement, Globus transfers, and I/O best practices.

📈

Single-Node Scaling

Scale CPU workloads with Dask and GPU training with PyTorch Distributed Data Parallel (DDP) on a single node.

💾

Checkpointing & Resilience

Implement robust checkpointing, handle pre-emption gracefully, and restart long-running jobs without lost progress.
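
One common pattern behind this topic can be sketched with the standard library alone: save state periodically with an atomic rename, resume from the last saved state on startup, and stop cleanly on SIGTERM. This is a sketch under assumptions, not the tutorial's method. Whether a pre-empted job receives SIGTERM, and how much grace time it gets, depends on how the cluster is configured. The file name and the toy "work" loop are hypothetical.

```python
import json
import os
import signal
import tempfile

stop_requested = False


def _on_sigterm(signum, frame):
    # Only set a flag; the main loop finishes the current step and saves.
    global stop_requested
    stop_requested = True


def save_checkpoint(state, path):
    """Write to a temporary file, then rename so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic when both paths are on one filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(path, num_steps, save_every=10):
    state = load_checkpoint(path)
    while state["step"] < num_steps and not stop_requested:
        state["total"] += state["step"]  # stand-in for real work
        state["step"] += 1
        if state["step"] % save_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)  # final save, also reached after SIGTERM
    return state


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, _on_sigterm)
    print(run("checkpoint.json", 100))  # hypothetical checkpoint file name
```

Running the script a second time with a larger `num_steps` picks up from the saved step rather than starting over.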

🔬

CPU & GPU Profiling

Use NVIDIA NSIGHT to profile memory, timing, and efficiency bottlenecks across both CPU and GPU workloads.

Tutorial Schedule

Two tutorials each morning, Mon–Thu. Afternoons are dedicated project work time with instructor support.

📅 Friday is reserved for project presentations, wrap-up, and certificate award — no new tutorials scheduled.

Monday, May 11
Morning Tutorial 1: Estimating HPC Resources, and Job Request Tips and Tricks
Morning Tutorial 2: Containerizing Package Management for Reproducible Workflows

Tuesday, May 12
Morning Tutorial 3: Job Arrays for Perfectly Parallel Problems
Morning Tutorial 4: Weights & Biases for Hyperparameter Optimization on Distributed Systems

Wednesday, May 13
Morning Tutorial 5: I/O Optimized Workflows for Data-Heavy Pipelines
Morning Tutorial 6: Scaling Up to Multi-CPU and Multi-GPU Compute with Python Automation

Thursday, May 14
Morning Tutorial 7: Checkpointing and Preemptibility in Depth
Morning Tutorial 8: Profiling Distributed Systems with NVIDIA NSIGHT Systems

Your Instructors

The MATRICS bootcamp is led by Stanford's SDSS Research Computing Team alongside partners from Stanford Research Computing and NVIDIA.

Dr. Ellianna Abrahams
Research Data Scientist
SDSS Center for Computation
🧊 Once lived on an Alaskan ice field for an entire summer
GPU Computing  ·  Computer Vision  ·  Earth Observation  ·  Distributed AI/ML
Ask Ellianna about: HPC, GPUs, computer vision, earth observation, hyperparameter sweeps, and building computational community across campus.
Brian Chivers
Research Software Engineer
SDSS Center for Computation
🍕 Current hobby: perfecting homemade pizza
HPC Workflows  ·  Containers  ·  Checkpointing  ·  Storage
Ask Brian about: Getting the most out of Stanford's computing resources, all things data, sbatch tips, and anything Stanford.
Christina Gancayco
Consultant
Stanford Research Computing
🦜 800+ day Duolingo streak and counting
Job Arrays  ·  Sherlock  ·  Parallel Computing
Ask Christina about: Sherlock, HPC fundamentals, job arrays, and getting started if you're new to HPC. Also: books, travel, and movies!
Brian Tempero
HPC/Cloud Administrator
Stanford Research Computing
🚗 Classic muscle car enthusiast
Cloud Computing  ·  GCP  ·  HPC Administration
Ask Brian T. about: Finding the right compute platform for your research, HPC or cloud questions, and GCP CLI commands. Also: classic cars.
Amanda Butler
Instructor
NVIDIA
🎵 Was in a medieval orchestra in college
HPC  ·  GPU Hardware
Ask Amanda about: GPU hardware, exascale compute, and agentic AI in GPU settings.
Zoe Ryan
Instructor
NVIDIA
Loves throwing ceramic coffee mugs
Profiling  ·  NSIGHT  ·  GPU
Ask Zoe about: Multinode GPU scaling and profiling, NVIDIA NSIGHT, performance optimization.

Certificate in High Performance Research Computing

Participants who complete the full bootcamp receive a formal certificate recognizing their mastery of HPC essentials for AI and research workflows.

🎓

Certificate in High Performance Research Computing

Awarded to Stanford postdoctoral researchers and graduate students who complete the MATRICS HPC Bootcamp, demonstrating practical competency in job orchestration, containerization, distributed computing, and performance profiling on modern HPC and cloud infrastructure.

Issued by Stanford Doerr School of Sustainability  ·  Stanford Data Science

We gratefully acknowledge our sponsors. MATRICS 2026 was powered by: