A BIOINFORMATIC PIPELINE FOR THE ANALYSIS OF PROTEIN MICROARRAY DATA WITH APPLICATIONS TO MALARIA AND LUNG CANCER STUDIES

Embargo until
2022-05-01
Date
2021-04-22
Publisher
Johns Hopkins University
Abstract
Bioinformatic pipelines are the steps taken to transform data from raw measurements into a form that enables direct biological inference. These steps vary across assays, and the choice of method can have an important impact on downstream analyses and subsequent inference. While there has been substantial work on optimizing methods for many types of assays, including DNA microarrays, relatively few methods have been developed and evaluated specifically for protein microarrays. Because protein microarray data exhibit high levels of technical variation and provide only relative measurements, methods specifically suited to these assays are especially important to ensure that biological questions of interest can be directly answered with these data. Here, we propose a bioinformatic pipeline for protein microarray data that contains three main steps: a pre-processing pipeline to quantify and address technical variation, a Bayesian model to produce full posterior distributions of signal, and ranking methods that use information from the full posterior distributions. In Chapter 2, we use Bland-Altman plots and associated analyses to show that the pre-processing pipeline reduces technical variation in two previously published data sets that use protein microarrays to investigate lung cancer and malaria. In Chapter 3, we show that our proposed Bayesian model fits these same two data sets well and produces estimates of signal that are well suited to downstream inference, specifically to ranking methods that account for uncertainty. Finally, in Chapter 4, we show how the use of our bioinformatic pipeline can impact downstream inference. In particular, using protein microarray data from a previously published malaria study, we show how our pipeline identifies potential biomarkers of past malaria infection that were not identified with previous analysis methods.
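As a rough illustration of the Bland-Altman analysis mentioned in the abstract, the sketch below computes the standard quantities behind such a plot for two replicate measurements of the same spots: the per-spot mean, the per-spot difference, the mean difference (bias), and the 95% limits of agreement. It is not the thesis's own code. The function name `bland_altman`, the replicate arrays, and their values are hypothetical and chosen only for illustration.

```python
import numpy as np


def bland_altman(rep_a, rep_b):
    """Return Bland-Altman summary statistics for two paired replicates.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    a = np.asarray(rep_a, dtype=float)
    b = np.asarray(rep_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("replicates must have the same shape")

    mean = (a + b) / 2.0       # x-axis of a Bland-Altman plot
    diff = a - b               # y-axis of a Bland-Altman plot
    bias = diff.mean()         # average disagreement between replicates
    sd = diff.std(ddof=1)      # sample standard deviation of the differences

    return {
        "mean": mean,
        "diff": diff,
        "bias": bias,
        "lower": bias - 1.96 * sd,  # lower 95% limit of agreement
        "upper": bias + 1.96 * sd,  # upper 95% limit of agreement
    }


# Made-up log-scale intensities for four spots measured twice.
rep1 = [10.0, 12.0, 9.0, 11.0]
rep2 = [9.0, 12.5, 8.0, 11.5]
result = bland_altman(rep1, rep2)
```

Under this reading, narrower limits of agreement after pre-processing than before would be one indication that technical variation between replicates has been reduced.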
Keywords
bioinformatics, protein microarrays, malaria