Stanford Doerr School of Sustainability · Stanford Data Science
High Performance Computing Essentials for Stanford Researchers
About the Bootcamp
MATRICS is a week-long, on-campus intensive designed to give Stanford postdocs and graduate students practical mastery of High Performance Computing workflows — from resource estimation and job scheduling all the way to GPU-accelerated distributed AI/ML pipelines.
Monday through Thursday mornings kick off with experiential tutorials led by the SDSS Research Computing Team and NVIDIA engineers. Afternoons are yours: bring your own research project and get direct, hands-on support from instructors and peers in a collaborative cohort setting.
Platforms You'll Work With
Curriculum
Eight tutorials spanning the full HPC stack — from job orchestration to GPU profiling.
Master SLURM job scripting, resource allocation strategies, and sbatch tips that save you wall-clock time and service units (SUs).
Build, port, and run reproducible containers across Sherlock, FarmShare, and NVIDIA Cloud using Apptainer.
Turn single-threaded scripts into massively parallel SLURM array jobs with minimal code changes.
Use Weights & Biases for systematic, distributed hyperparameter sweeps on HPC and cloud resources.
Navigate Oak, Sherlock, and L_SCRATCH effectively — data placement, Globus transfers, and I/O best practices.
Scale CPU workloads with Dask and GPU training with PyTorch Distributed Data Parallel (DDP) on a single node.
Implement robust checkpointing, handle pre-emption gracefully, and restart long-running jobs without lost progress.
Use NVIDIA Nsight to profile memory use and timing and to find efficiency bottlenecks across both CPU and GPU workloads.
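As a taste of the array-job tutorial, here is a minimal Python sketch of how a single-threaded script might be split across SLURM array tasks. It is an illustration only, not the bootcamp's actual material. It assumes the job is submitted with zero-based indices (for example `sbatch --array=0-3`), and the input file names and the `shard_for_task` helper are hypothetical. `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` are environment variables SLURM sets inside array jobs.

```python
import os


def shard_for_task(items, task_id, task_count):
    """Return the subset of items one array task should handle.

    Items are dealt out round-robin, so task k gets items k, k + n, k + 2n, ...
    where n is the number of tasks. Assumes task IDs run from 0 to n - 1.
    """
    return items[task_id::task_count]


def main():
    # SLURM sets these inside an array job. The defaults let the same
    # script run unchanged as an ordinary single process.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    task_count = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))

    # Hypothetical work list; in practice this would be your own inputs.
    inputs = [f"sample_{i:03d}.csv" for i in range(10)]

    for path in shard_for_task(inputs, task_id, task_count):
        # Placeholder for the real per-file computation.
        print(f"task {task_id}: processing {path}")


if __name__ == "__main__":
    main()
```

Run outside SLURM, the script processes every input in one process. Inside a four-task array, each task picks up its own share, and the only change to the original loop is the call to `shard_for_task`.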
Daily Schedule
Two tutorials each morning, Mon–Thu. Afternoons are dedicated project work time with instructor support.
📅 Friday is reserved for project presentations, wrap-up, and the certificate award; no new tutorials are scheduled.
Meet the Team
The MATRICS bootcamp is led by Stanford's SDSS Research Computing Team alongside partners from Stanford Research Computing and NVIDIA.
Recognition
Participants who complete the full bootcamp receive a formal certificate recognizing their mastery of HPC essentials for AI and research workflows.
Awarded to Stanford postdoctoral researchers and graduate students who complete the MATRICS HPC Bootcamp, demonstrating practical competency in job orchestration, containerization, distributed computing, and performance profiling on modern HPC and cloud infrastructure.
Acknowledgements