STATISTICAL AND COMPUTATIONAL METHODS FOR ANALYZING SINGLE-CELL RNA-SEQ AND IMMUNE PROFILING DATA

dc.contributor.advisor: Ji, Hongkai
dc.contributor.advisor: Klein, Sabra
dc.contributor.committeeMember: Zhao, Ni
dc.contributor.committeeMember: Smith, Kellie
dc.creator: Zhang, Boyang
dc.date.accessioned: 2022-09-23T18:10:44Z
dc.date.created: 2022-08
dc.date.issued: 2022-07-21
dc.date.submitted: August 2022
dc.date.updated: 2022-09-23T18:10:45Z
dc.description.abstract: With the advancement of single-cell technologies, single-cell RNA-seq experiments increasingly generate data from multiple biological or patient samples. Beyond single-modality assays, single-cell multimodal omics, such as paired single-cell RNA-seq (scRNA-seq) and single-cell TCR-seq (scTCR-seq), enables one to profile multiple data types in the same cell simultaneously and thus provides unprecedented opportunities to study the complex interactions among features from multiple molecular layers. However, analyzing and visualizing the complex cell type-phenotype association in such multi-sample single-cell data remains challenging. First, we develop TreeCorTreat, an open-source computational tool that uses a tree-based correlation screen to analyze and visualize the association of phenotype with transcriptomic features and cell types at multiple cell type resolution levels. We also introduce a new TreeCorTreat plot to summarize and visualize the results. With TreeCorTreat, one can conveniently explore, visualize, and compare results from different cell types, resolutions, feature types, traits, datasets, analysis protocols, and covariate adjustments. These functionalities are demonstrated on two real datasets: a COVID-19 dataset and a non-small cell lung cancer dataset. Second, we develop TreeCorIWAS to interrogate a large collection of repertoire features and transcriptional profiles simultaneously and systematically at different cell type resolutions. TreeCorIWAS can facilitate the detection of the immune features associated with sample phenotype that are defined by gene expression profile and the comparison of transcriptional profile changes across different immunophenotypic groups. Third, we utilize immune profiling data to confirm the existence of unique memory CD4+ T cell clonotypes that cross-recognize severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and common cold coronaviruses (CCCs) and to assess their functional avidity. Overall, this thesis provides new statistical and computational insights for analyzing large, complex, multi-sample high-throughput sequencing datasets.
dc.format.mimetype: application/pdf
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/67507
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.publisher.country: USA
dc.subject: Single cell genomics
dc.subject: immune repertoire
dc.subject: hierarchical clustering tree
dc.subject: multi-resolution and multi-feature type association analysis
dc.subject: TreeCorTreat plot
dc.title: STATISTICAL AND COMPUTATIONAL METHODS FOR ANALYZING SINGLE-CELL RNA-SEQ AND IMMUNE PROFILING DATA
dc.type: Thesis
dc.type.material: text
local.embargo.lift: 2026-08-01
local.embargo.terms: 2026-08-01
thesis.degree.department: Biostatistics
thesis.degree.discipline: Biostatistics
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Bloomberg School of Public Health
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.