CQS Summer Institute | Center for Quantitative Sciences

July 20-24 and 27-31, 2026

The 2026 CQS Summer Institute will feature four in-person courses at 2525 West End Avenue, on the main campus of Vanderbilt Health.

Sharpen your quantitative research skills and deepen your understanding of biostatistics and bioinformatics. Participants are highly encouraged to enroll in multiple courses.

2026 CQS Summer Institute

Tuition per course (in USD)

Regular: $950

Vanderbilt University / Vanderbilt University Medical Center faculty and staff: $700

Vanderbilt University / Vanderbilt University Medical Center students, trainees, and postdocs: $450

Registration will be open through July 13 for courses that begin on July 20, and through July 20 for courses that begin on July 27. Space is limited.

Register for the 2026 Summer Institute

Course Descriptions

Genomic Data Analysis: From Sequencing to Biological Insights
Monday, July 20 to Friday, July 24, 9 a.m. to noon
Taught by Dr. Qi Liu, professor of biostatistics and biomedical informatics, technical director of VANGARD (Vanderbilt Technologies for Advanced Genomics Analysis and Research Design), associate director of bioinformatics at the Center for Quantitative Sciences, and director of the CQS Omics Coordinating Center.
This course introduces the statistical and bioinformatic approaches used to analyze large-scale omics data in modern biomedical research. Students will explore key technologies and analytical frameworks for DNA sequencing, RNA sequencing, single-cell RNA sequencing, and spatial transcriptomics. Core topics include sequencing technologies, data preprocessing, quality control, read alignment, transcript quantification, dimensionality reduction, differential expression analysis, and functional enrichment.
Through a combination of lectures and hands-on practical sessions, students will learn how to transform raw sequencing data into biologically meaningful insights. Using real datasets, students will gain practical experience with commonly used genomics analysis workflows, with a particular focus on RNA-seq and single-cell RNA-seq. By the end of the course, students will understand the principles behind modern genomics pipelines and will be able to conduct basic analyses of high-throughput sequencing data.
Prerequisites
Course participants should have basic or entry-level knowledge of R programming, Linux/Unix commands, and biostatistics.
Preparing for this course
These two free software programs will be used throughout the course. Please download them before July 19:
GSEA
IGV
Also, please bookmark the WebGestalt online tool.
If you have limited experience with using R and Linux, please review these materials prior to the start of the course:
R tutorial
ACCRE cheat sheet
Participants who would like a primer on biomedical terminology are encouraged to consult the Chandran Lab’s Beginner’s Guide to RNA-seq or similar resources.

Introduction to PyMC and Bayesian Modeling
Monday, July 20 to Friday, July 24, 1 to 4 p.m.
Taught by Dr. Christopher Fonnesbeck, principal data scientist at PyMC Labs and adjoint associate professor of biostatistics at Vanderbilt University School of Medicine.
This course provides a comprehensive introduction to Bayesian statistical modeling using PyMC. Participants will progress from foundational concepts through applied modeling techniques, building practical skills through hands-on coding exercises with real-world datasets. Each session combines conceptual instruction with interactive notebook-based exercises.
Prerequisites
Working knowledge of Python programming
Familiarity with basic statistical concepts (e.g., distributions, regression concepts)
No prior Bayesian experience required
Preparing for this course
Please see the course README page on GitHub for what to download and set up before July 20. The page also features a detailed schedule for each day of the course.

Terra-Based Cloud Computing
Monday, July 27 to Friday, July 31, 9 a.m. to noon
Taught by Dr. Quanhu “Tiger” Sheng, associate professor of biostatistics, VANGARD deputy technical director, and associate director of advanced computing at CQS, with assistance from Dr. Hua-Chang Chen.
This course provides an in-depth exploration of Terra-based cloud computing with a focus on genome-wide association study (GWAS) analysis using BioVU whole genome sequencing data. Students will use Visual Studio Code to navigate course materials and engage in hands-on exercises. The curriculum introduces key concepts and tools, including the Terra environment, Docker image creation, Workflow Description Language (WDL), cohort building with the BioVU Synthetic Derivative (SD) BigQuery database, and GWAS analysis using Regenie4. Through practical activities, participants will develop skills in cloud-based GWAS analysis, covering environment setup, software packaging, cohort construction, and data processing.
Prerequisites
Knowledge of genomics and GWAS fundamentals
Familiarity with Python and Jupyter Notebook
Basic proficiency in SQL
Experience using Linux command-line interfaces
Visual Studio Code installed on your computing device (see "Preparing for this course" below for more instructions)
GitHub, Docker Hub, Google Cloud Platform, and Terra accounts set up by July 20 (one week before the start of the course). See below for more information.
Preparing for this course
This course uses Visual Studio Code (VS Code) for navigating course materials and conducting hands-on exercises.
The community version of VS Code is free (download here)
Required extensions: Remote-SSH, Python, Jupyter, R
Students must have the following accounts set up by July 20, one week before the start of this course:
GitHub – for code version control
Personal GitHub accounts created with non-VU/non-VUMC emails are acceptable.
You must provide your GitHub account username to Dr. Sheng at least one week prior to the start of the course.
Docker Hub – for publishing Docker images
Personal Docker accounts created with non-VU or non-VUMC emails are acceptable.
Google Cloud Platform (GCP)
Your GCP account must be registered to your VU (vanderbilt.edu) or VUMC (vumc.org) email address.
You must provide your Google Cloud Platform (GCP) account email to Dr. Sheng at least one week prior to the start of the course.
Terra
Your Terra account must be linked to your GCP account (with your VU or VUMC email address).
When you register for Terra, select “Sign in with Google” (not “Sign in with Microsoft”) to ensure proper access to AGD (Alliance for Genomic Discovery) genomics data, the Synthetic Derivative (SD) BigQuery database, and the introductory data portal.
For students with limited or no experience in Python or SQL, Dr. Sheng strongly recommends reviewing the following resources prior to the start of the course:
Wes McKinney’s Python for Data Analysis, 3rd Edition. Python is utilized in the course for data manipulation and querying the SD BigQuery database.
SQL. You will use SQL during the course to extract and analyze phenotype data from the SD BigQuery database.
Need to brush up on the prerequisites? Dr. Sheng recommends the following:
Knowledge of genomics and GWAS fundamentals
en.wikipedia.org/wiki/Genome-wide_association_study
rgcgithub.github.io/regenie/overview/
Familiarity with Python and Jupyter Notebook
code.visualstudio.com/docs/datascience/jupyter-notebooks
Basic proficiency in SQL
docs.cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax
Experience using Linux command-line interfaces
documentation.ubuntu.com/desktop/en/latest/tutorial/the-linux-command-line-for-beginners/

Introduction to Causal Inference
Monday, July 27 to Friday, July 31, 1 to 4 p.m.
Taught by Dr. Andrew Spieker, associate professor of biostatistics.
Many have likely heard that “correlation does not imply causation,” but that raises the question: what exactly is causation in the first place? This five-day short course will provide a framework for modern causal inference. The first day will give an overview of the potential outcomes framework and the theory of DAGs (directed acyclic graphs). The second and third days will cover commonly implemented causal inference methods for cross-sectional data, including standardization, matching, inverse weighting, and instrumental variables. The fourth day will focus on methods for longitudinal data, including marginal structural models and g-computation. The fifth day will likely feature miscellaneous advanced topics, which may include sensitivity analyses, parametric identification, and Bayesian methods. Throughout the course, emphasis will be placed on graphical representation of variables through DAGs and software-based application of the methods to real-world data. Upon completion of this course, participants should be able to:
explain the potential outcomes framework.
explain causal identifiability assumptions.
use directed acyclic graphs to characterize relationships between variables.
choose between and implement causal methods suitable for real-world cross-sectional and longitudinal data.
assess covariate balance and positivity violations.
Prerequisites
A basic understanding of biostatistical methods, including linear and logistic regression.
Prior experience working in R will be helpful, although it is not strictly necessary.

Frequently Asked Questions

Where do your courses take place?
The courses will be held on the Vanderbilt Health campus, in the 2525 West End Avenue building. Live participation is expected, and some courses may feature small-group discussions. The sessions will not be recorded or streamed.
Meals and snacks are included with these intensive courses. We will provide a light continental breakfast before each morning class, and boxed lunches to all in-person participants. Coffee, tea, and snacks will be available throughout each day.
I will be driving to campus to attend the Summer Institute. Where can I park my car?
When you register, please indicate that you will need parking. We will assist with making arrangements for a complimentary spot.
Will the instructors make reasonable accommodations for my disability?
Contact us to discuss what is needed. All requests will be kept confidential. We are committed to maintaining an inclusive and accessible learning environment.
2525 West End is equipped with accessible parking, as well as ramps to the building from the parking garage, touch-open doorways, and an easy-to-access elevator bank.
Is there a lactation room?
Yes, in the Department of Biostatistics, on the tenth floor. Contact the course administrator to arrange access.
Are vaccination and/or masking required?
At present, no. However, individuals may still choose to wear a mask in any area at Vanderbilt Health, and we ask all participants to respect this choice. We follow VUMC guidance on COVID-19 prevention, which may be updated if a new variant or other factors suggest an increased risk of spread.
How do I obtain documentation of my participation?
A certificate of completion will be emailed to you after the end of the course.
Can I earn academic credit for my course(s)?
Vanderbilt’s Cancer Biology program is handling credit arrangements for its members. We do not grade or track the attendance of other participants.
I have a scheduling conflict. Would it be possible for me to participate virtually?
This is not an option, as the classes are not recorded or streamed.
Do you plan to offer these courses in the future?
After every Institute, we review what worked and what needs to be adjusted, with timing and format among our considerations. We collect feedback from participants at the end of each course, including whether the instruction met their expectations and what they would like to see offered. This feedback helps guide the planning and design of future Summer Institutes.
I have a question that hasn’t been answered yet. Whom do I contact?
Jena Altstatt, administrator
Details subject to change without notice.