SMI 2022 - New approaches for normalization and cell phenotyping in multiplex immunofluorescence imaging data

Friday, May 27, 2022 • 11:15 am–12:30 pm (CT) • Light Hall, Room 208

Organizer: Inna Chervoneva, Thomas Jefferson University

Chair: Brooke Fridley, Moffitt Cancer Center

Statistical analysis of multiplex immunofluorescence and immunohistochemistry imaging data

Julia Wrobel, Colorado School of Public Health

Kim Jordan, University of Colorado

Erin Schenk, University of Colorado

Advances in multiplex single-cell immunofluorescence (mIF) and multiplex immunohistochemistry (mIHC) imaging technology have enabled analysis of cell-to-cell spatial relationships that promise to revolutionize our understanding of tissue-based diseases and autoimmune disorders. However, mIF/mIHC data are noisy and require a multi-step image processing pipeline. Specifically, mIF images are collected as multichannel TIFF files and must then be de-noised, segmented to identify cells and nuclei, normalized across slides and protein markers to correct for batch effects, and phenotyped, a process by which cell subtypes are identified and labeled for further analysis. This complex pipeline typically produces a tabular dataset where each row is a cell and the columns contain features describing the cell location, cell type, and patient characteristics. The tabular dataset is then often used to analyze the spatial relationships between cells and their correlation with patient outcomes (e.g., tumor progression). There are opportunities for statisticians to improve modeling of mIF data at every step of the image processing and analysis pipeline. The goal of this talk is to introduce the audience to the data structure and challenges associated with mIF imaging as well as to detail specific steps in the image processing pipeline.
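As a rough illustration of the one-row-per-cell table described above, the following Python sketch builds a few made-up rows and computes two simple summaries from them. The field names, phenotype labels, and helper functions are all assumptions chosen for illustration; they are not taken from the talk or from any particular software.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One row of the per-cell table (field names are illustrative)."""
    patient_id: str
    slide_id: str
    x: float          # cell centroid, in arbitrary image units
    y: float
    phenotype: str    # label assigned during phenotyping


# Made-up rows for illustration only; not data from the talk.
cells = [
    Cell("P01", "slide_1", 10.0, 10.0, "tumor"),
    Cell("P01", "slide_1", 13.0, 14.0, "CD8+ T cell"),
    Cell("P01", "slide_1", 40.0, 10.0, "tumor"),
    Cell("P02", "slide_2", 5.0, 20.0, "CD8+ T cell"),
]


def phenotype_proportions(rows, slide_id):
    """Fraction of cells of each phenotype on one slide."""
    counts = Counter(c.phenotype for c in rows if c.slide_id == slide_id)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def nearest_distance(rows, slide_id, from_type, to_type):
    """For each from_type cell, distance to the closest to_type cell on the slide."""
    src = [c for c in rows if c.slide_id == slide_id and c.phenotype == from_type]
    dst = [c for c in rows if c.slide_id == slide_id and c.phenotype == to_type]
    if not dst:
        return []  # no target cells on this slide, so no distances to report
    return [min(math.hypot(s.x - d.x, s.y - d.y) for d in dst) for s in src]
```

Slide-level summaries of this kind (cell-type proportions, nearest-neighbor distances) are the sort of quantity that could then be related to patient outcomes; real analyses would typically use dedicated spatial statistics rather than this minimal version.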

Computational tools and statistical methods to normalize multiplexed immunofluorescence images

Coleman Harris, Vanderbilt University Medical Center

Julia Wrobel, Colorado School of Public Health

Simon Vandekar, Vanderbilt University Medical Center

Multiplexed imaging has emerged at the forefront of imaging methods in biological research, measuring dozens of marker channels at the single-cell level while preserving their spatial coordinates. These methods are valuable in understanding cell-cell interactions, but few computational tools exist to quantify or correct for technical variation in multiplexed imaging data, and in general this field lacks a clear set of analysis standards, pipelines, and methods. This work introduces mxnorm, an R package that provides normalization methods and quantitative metrics for multiplexed imaging data. The package aims to set a foundation for the analysis of multiplexed imaging data in R, extend normalization methods into the field in a user-friendly way, and provide easy applications of a robust evaluation framework to measure both technical variability and the efficacy of various normalization methods. Here we illustrate clear slide-to-slide variation in the raw, unadjusted data and demonstrate that many of the proposed normalization methods included in the mxnorm package reduce this variation while preserving and improving the biological signal. We also demonstrate the flexibility of the mxnorm package to handle user-defined normalization techniques, additional thresholding algorithms, and custom random effects modeling, to allow for a flexible, user-friendly interface for researchers to normalize and analyze multiplexed imaging data.
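To make the idea of slide-level normalization concrete, here is a minimal Python sketch of one simple approach: dividing each slide's marker intensities by that slide's mean and then applying a log10 transform. This is a generic illustration, not mxnorm's implementation or API (mxnorm is an R package); the function name, the `+ 1` offset, and the toy intensities are assumptions.

```python
import math


def mean_divide_log10(values_by_slide):
    """Divide each slide's intensities by that slide's mean, then take log10(x + 1).

    Assumes every slide has at least one value and a positive mean.
    """
    normalized = {}
    for slide_id, values in values_by_slide.items():
        slide_mean = sum(values) / len(values)
        normalized[slide_id] = [math.log10(v / slide_mean + 1.0) for v in values]
    return normalized


# Toy example: slide_B is slide_A scaled by 10, mimicking a purely
# multiplicative slide-to-slide (batch) effect.
raw = {"slide_A": [1.0, 2.0, 3.0], "slide_B": [10.0, 20.0, 30.0]}
normalized = mean_divide_log10(raw)
```

In this toy case the two slides become identical after normalization because the only difference between them was a multiplicative scale. Real slide effects are rarely this clean, which is why an evaluation framework for comparing normalization methods is useful.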

Assessment of Bayesian models for zero-inflated and over-dispersed multiplex immunofluorescence data

Jose Laborde, Moffitt Cancer Center

Lauren Peres, Moffitt Cancer Center

Joellen M. Schildkraut, Winship Cancer Institute / Emory University

Julia Wrobel, Colorado School of Public Health

Brooke Fridley, Moffitt Cancer Center

Multiplex immunofluorescence (mIF) is a novel technique to characterize the tumor immune microenvironment. mIF provides data on the number of cells in the sample and the number of positive cells for the markers. A challenge in the analysis is the over-dispersed and zero-inflated (ZI) nature of the data, with many of the tumor cores showing no positive cells for a given marker. Using mIF data from a study of Black women with high-grade serous ovarian cancer (94 subjects, 264 samples), we set out to compare a variety of models for the association of stage and age at diagnosis with immune infiltrate levels. We measured five immune markers and relevant combinations of these markers: CD3+ (10% of samples with zero positive cells), CD8+ (16%), FoxP3+ (16%), CD11b+ (49%), CD15+ (72%), CD3+FoxP3+ (30%), CD3+CD8+ (23%), and CD11b+CD15+ (85%). Seven Bayesian repeated measures models were assessed: Binomial (B), Poisson (P), Beta-Binomial (BB), Negative Binomial (NB), ZI-Binomial (ZIB), ZI-Poisson (ZIP), and ZI-Negative Binomial (ZINB). Using leave-one-out cross validation with Pareto smoothed importance sampling, the ZINB model was the best fit for the majority of markers in this study (followed closely by the NB and BB models). Work is ongoing to assess model fit in a second ovarian cancer mIF study, along with assessment of model performance for the zero-inflated beta-binomial (ZI-BB).
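For readers unfamiliar with zero-inflated count models, the Python sketch below writes out the probability mass function of a zero-inflated negative binomial (ZINB) distribution. It shows only the likelihood component; the Bayesian repeated measures structure, covariates, and model comparison described above are not reproduced. The mean/size parameterization and the parameter values are assumptions chosen for illustration.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu and dispersion parameter size (> 0).

    The variance is mu + mu**2 / size, so smaller size means more over-dispersion.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, size, pi_zero):
    """ZINB pmf: with probability pi_zero the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(y, mu, size)
    return pi_zero + base if y == 0 else base


# Illustrative parameter values only; not estimates from the study.
mu, size, pi_zero = 2.0, 1.5, 0.4
p_zero_nb = nb_pmf(0, mu, size)
p_zero_zinb = zinb_pmf(0, mu, size, pi_zero)
```

The zero-inflated version always places at least as much probability on zero as the plain negative binomial with the same `mu` and `size`, which is the feature that helps with markers for which most samples have no positive cells.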
