CQS Summer Institute

The 2022 CQS Summer Institute features four short online courses, hosted on Zoom.

Registration is now closed.

These great weeklong courses will sharpen your quantitative research skills and deepen your understanding of biostatistics and bioinformatics. Participants are highly encouraged to enroll in multiple courses. 


FEES (per course)

$150 for VU/VUMC students/trainees/postdocs
$300 for VU/VUMC faculty or staff
$500 for non-VU/VUMC faculty, staff, students, trainees, or postdocs

Registration for week 1 courses was open through July 25. Registration for week 2 courses was open through August 1.

Please note that academic credit is not available for these courses. 

VUMC Department of Biostatistics faculty/staff: please register via dbConnect. Students/trainees, consult your advisor on how to proceed.




Big Data in Biomedical Research · August 1–5 from 9 a.m. to noon (CT) · [registration is now closed]


Yu Shyr, PhD, professor of biostatistics, biomedical informatics, and health policy

Qi Liu, PhD, associate professor of biostatistics and biomedical informatics

This course will explore statistical, bioinformatic, and computational methods and tools for analyzing big omics data in biomedical research, including experimental design for omics research, RNA-sequencing, single-cell RNA-sequencing, and statistical and bioinformatic methods in high-dimensional data. Students will gain practical experience with RNA-seq and single-cell RNA-seq analysis, including read mapping, quantification, differential expression, cell clustering, and marker gene identification, as well as performing functional interpretation of results.

Prerequisites: Students should have basic or entry-level knowledge of R programming, Linux/Unix commands, and biostatistics. See the "Preparing for Big Data in Biomedical Research" section of the supplement page for directions on what to prepare and review in advance, especially if you lack experience with R and Linux.


Introduction to Causal Inference · August 1–5 from 1 p.m. to 4 p.m. (CT) · [registration is now closed]

Instructor: Andrew Spieker, PhD, assistant professor of biostatistics

Many are familiar with the phrase “correlation does not imply causation,” but that raises the question: what exactly is causation in the first place? In this five-day short course, we will introduce the fundamentals of causal inference. The first three days will provide an overview of commonly implemented causal inference methods, including standardization, matching, inverse-weighting, and instrumental variables. The fourth day will focus on methods for longitudinal data, and the fifth day will address miscellaneous topics, including sensitivity analyses and causal inference with survival outcomes. Throughout the course, emphasis will be placed on software implementation and on graphical representation of variables through directed acyclic graphs (DAGs).

Prerequisites: Some familiarity with basic statistics (including linear and logistic regression) and/or interest in designing and evaluating clinical trials. See the "Preparing for Introduction to Causal Inference" section of the supplement page for suggestions on what to prepare and review in advance.


Regression and Modeling in R · August 8–12 from 9 a.m. to noon (CT) · [registration is now closed]

Instructor: Fei Ye, PhD, associate professor of biostatistics and medicine

This course will cover advanced statistical topics frequently used in biological and medical research. Emphasis will be placed on practical applications of statistical methods and interpretation of the results. During this week, you will expand your understanding of the advantages and limitations of various methods, choose appropriate analytic approaches based on type of outcome variable and data structure, develop advanced statistical models in R, perform model diagnostic analyses, and interpret R output and analysis results.

Prerequisites: Biostatistics I or equivalent course(s). Students should be familiar with the basic notions and concepts of linear algebra and statistical modeling, types of variables (continuous, categorical, ordinal, etc.), common probability distributions (such as normal and binomial), and descriptive statistics, including summary statistics (mean, median, standard deviation, variance, etc.) and simple tests (t-test, Wilcoxon rank-sum test, chi-square test, etc.). See the "Preparing for Regression and Modeling in R" section of the supplement page for suggestions on what to prepare and review in advance.


Cloud Computing and Case Studies in Biomedical Data Science · August 8–12 from 1 p.m. to 4 p.m. (CT) · [registration is now closed]


Yaomin Xu, PhD, assistant professor of biostatistics and bioinformatics; principal investigator, Translational Bioinformatics & Biostatistics Lab

Quanhu (Tiger) Sheng, PhD, assistant professor of biostatistics

Shilin Zhao, PhD, assistant professor of biostatistics

Brian Sharber, BS, lead cloud application developer (Bick Lab)

Alex Bick, MD, PhD, assistant professor of medicine; principal investigator, Bick Lab

With the unprecedented availability of big data, biomedical data scientists increasingly face novel challenges to properly and efficiently handle large-scale, high-volume data to solve complex scientific problems.
Cloud computing is a new generation of technologies and architectures, designed to deliver computer resources over the internet, aiming to economically analyze very large volumes of a wide variety of data by enabling the rapid provision of a large pool of shared computational tools on demand.
In this course, we will introduce cloud computing techniques and utilities in the context of biomedical research and help participants understand the deployment, scalability, and cost-efficiency of applications in the cloud. Topics will include: real-world case studies on implementing flexible and scalable data and project management; data structures for big data; high-throughput data analyses in the cloud using R; whole genome sequencing data analysis using WDL and Cromwell; machine learning in the cloud; the UK Biobank (UKB) and the UK Biobank Research Analysis Platform (RAP); analyzing and exploring UKB data using customized pipelines in RAP with WDL and CLI; and cloud computing in All of Us and Terra.
This course is for beginner and intermediate students interested in applying cloud computing and big data systems to data science, machine learning, and data engineering. Students are expected to have beginner-level Linux, basic computing, and statistical data analysis skills in the field of biomedical data research. See the "Preparing for Cloud Computing and Case Studies in Biomedical Data Science" section of the supplement page for suggestions on what to prepare and review in advance.



For all courses, live participation is expected, but we also plan to record each session, which participants may view through August 31. These videos will be available solely for short-term study reference; they will not be edited, and downloading copies will not be permitted.

QUESTIONS? Contact Jenny Jones. You may reach out to the instructors as well if you have specific course-related questions.

A flyer is available. Thank you for sharing the news about this year's offerings!