Pattern Recognition on Random Graphs

dc.contributor.advisorFishkind, Donniell E.en_US
dc.contributor.authorChen, Lien_US
dc.contributor.committeeMemberPriebe, Carey E.en_US
dc.contributor.committeeMemberLyzinski, Vinceen_US
dc.contributor.committeeMemberVogelstein, Joshua T.en_US
dc.contributor.committeeMemberAthreya, Avantien_US
dc.contributor.committeeMemberPark, Youngseren_US
dc.date.accessioned2015-09-16T03:35:04Z
dc.date.available2015-09-16T03:35:04Z
dc.date.created2015-05en_US
dc.date.issued2015-03-17en_US
dc.date.submittedMay 2015en_US
dc.description.abstractThe field of pattern recognition developed significantly in the 1960s, and the field of random graph inference has enjoyed much recent progress in both theory and application. This dissertation focuses on pattern recognition in the context of a particular family of random graphs, namely the stochastic blockmodels, from the two main perspectives of single graph inference and joint graph inference. Single graph inference is the performance of statistical inference on one single observed graph. Given a single graph realized from a stochastic blockmodel, we here consider the specific exploitation tasks of vertex classification, clustering, and nomination. Given an observed graph, vertex classification is the identification of the block labels of test vertices after learning from the training vertices. We propose a robust vertex classifier, which utilizes a representation of a test vertex as a sparse combination of the training vertices. Our proposed classifier is demonstrated to be robust against data contamination, and has superior performance over classical spectral-embedding classifiers in simulation and real data experiments. Vertex clustering groups vertices based on their similarities. We present a model-based clustering algorithm for graphs drawn from a stochastic blockmodel, and illustrate its usefulness on a case study in online advertising. We demonstrate the practical value of our vertex clustering method for efficiently delivering internet advertisements. Under the stochastic blockmodel framework, suppose one block is of particular interest. The task of vertex nomination is to create a nomination list so that vertices from the group of interest are concentrated abundantly near the top of the list. We present several vertex nomination schemes, and propose a vertex nomination scheme that is scalable for large graphs. We demonstrate the effectiveness of our methodology on simulation and real datasets. Next, we consider joint graph inference, which involves the joint space of multiple graphs; in this dissertation, we specifically consider joint graph inference on two graphs. Given two graphs, we consider the tasks of seeded graph matching for large graphs and joint vertex classification. Graph matching is the task of aligning two graphs so as to minimize the number of edge disagreements between them. We propose a scalable graph matching algorithm, which uses a divide-and-conquer approach to scale the state-of-the-art seeded graph matching algorithm to big graph data. We prove theoretical performance guarantees, and demonstrate desired properties such as scalability, robustness, accuracy and runtime in both simulated data and human brain connectome data. Within the joint graph inference framework, we present a case study on the paired chemical and electrical Caenorhabditis elegans neural connectomes. Motivated by the success of seeded graph matching on the paired connectomes, we propose joint vertex classification on the paired connectomes. We demonstrate that joint inference on the paired connectomes yields more accurate results than single inference on either connectome. This serves as a first step for providing a methodological and quantitative approach for understanding the coexistent significance of the chemical and electrical connectomes.en_US
dc.format.mimetypeapplication/pdfen_US
dc.identifier.urihttp://jhir.library.jhu.edu/handle/1774.2/37863
dc.languageen
dc.publisherJohns Hopkins University
dc.subjectPattern recognitionen_US
dc.subjectrandom graphen_US
dc.subjectstochastic blockmodelen_US
dc.subjectvertex classificationen_US
dc.subjectvertex clusteringen_US
dc.subjectvertex nominationen_US
dc.subjectgraph matchingen_US
dc.subjectjoint vertex classificationen_US
dc.subjectsingle graph inferenceen_US
dc.subjectjoint graph inference.en_US
dc.titlePattern Recognition on Random Graphsen_US
dc.typeThesisen_US
dc.type.materialtexten_US
thesis.degree.departmentApplied Mathematics and Statisticsen_US
thesis.degree.disciplineApplied Mathematics & Statisticsen_US
thesis.degree.grantorJohns Hopkins Universityen_US
thesis.degree.grantorWhiting School of Engineeringen_US
thesis.degree.levelDoctoralen_US
thesis.degree.namePh.D.en_US
Files
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
LICENSE.txt
Size:
2.67 KB
Format:
Plain Text
Description: