Improving Antibody CDR Template Selection by Structural Cluster Prediction

Long, Xiyao

dc.contributor.advisor	Gray, Jeffrey J.
dc.contributor.committeeMember	Schulman, Rebecca
dc.creator	Long, Xiyao
dc.date.accessioned	2018-05-22T03:50:57Z
dc.date.available	2018-05-22T03:50:57Z
dc.date.created	2017-12
dc.date.issued	2017-12-28
dc.date.submitted	December 2017
dc.date.updated	2018-05-22T03:50:58Z
dc.description.abstract	With the advent of high-throughput sequencing, antibody sequences can be acquired much faster than the corresponding structures, creating a need for rapid structure determination. Computational modeling is the only feasible method for high-throughput structure determination; however, it does not always produce accurate models. In antibody modeling, the framework regions are well conserved and readily modeled to sub-angstrom accuracy, but accurate modeling of the complementarity-determining region (CDR) loops remains elusive. This is a challenge we must overcome if we are to use models to study antibody function or design antibodies. Of the six CDR loops, the non-H3 loops (H1, H2, and L1–L3) are easier to model than the H3 loop because they are shorter and vary less in structure and length. Moreover, most non-H3 CDR loop structures can be grouped by CDR and length and clustered into a few canonical structure clusters. The ability to predict the correct cluster of a CDR from sequence alone could improve structural modeling. In this thesis, I assessed how well current modeling techniques identify CDR canonical structures from sequence alone, and I improved the retrieval accuracy. First, I benchmarked the current CDR loop modeling method in Rosetta and found that it failed to predict the correct canonical structure cluster for 19% of CDRs. Next, I assessed the significance of these failures by comparing them to a random cluster selection model. Then, to improve the accuracy of template selection, I trained a machine learning classifier for each CDR and length group, with sequences as features, and found that the classifier improved the retrieval of canonical structures. This improvement is not achievable by residue position rules alone. Finally, I propose incorporating canonical class prediction via machine learning to improve canonical structure retrieval accuracy, and I expect this improvement to increase as the less populated CDR clusters become better populated.
dc.format.mimetype	application/pdf
dc.identifier.uri	http://jhir.library.jhu.edu/handle/1774.2/58703
dc.language.iso	en_US
dc.publisher	Johns Hopkins University
dc.publisher.country	USA
dc.subject	Antibody
dc.subject	complementarity determining regions
dc.subject	CDRs
dc.subject	Rosetta Antibody
dc.subject	protein structural modeling
dc.title	Improving Antibody CDR Template Selection by Structural Cluster Prediction
dc.type	Thesis
dc.type.material	text
thesis.degree.department	Chemical and Biomolecular Engineering
thesis.degree.discipline	Chemical & Biomolecular Engineering
thesis.degree.grantor	Johns Hopkins University
thesis.degree.grantor	Whiting School of Engineering
thesis.degree.level	Masters
thesis.degree.name	M.S.E.

Files

Original bundle

Name: LONG-THESIS-2017.pdf
Size: 10.34 MB
Format: Adobe Portable Document Format

Collections

ETD -- Graduate theses