Subspace Learning for Data Arising from a Union of Subspaces of High Relative Dimension

dc.contributor.advisor: Basu, Amitabh
dc.contributor.committeeMember: Robinson, Daniel
dc.contributor.committeeMember: Vidal, Rene
dc.contributor.committeeMember: Zhu, Zhihui
dc.creator: Ding, Tianyu
dc.creator.orcid: 0000-0001-8445-4330
dc.date.accessioned: 2021-09-16T18:55:21Z
dc.date.available: 2021-09-16T18:55:21Z
dc.date.created: 2021-08
dc.date.issued: 2021-06-24
dc.date.submitted: August 2021
dc.date.updated: 2021-09-16T18:55:21Z
dc.description.abstract: As statistical machine learning has grown rapidly over the past decades, the ability to process big and corrupted data has become increasingly important. One of the major challenges is that structured data arising in many applications, such as images, videos, and 3D point clouds, are high-dimensional. Conventional techniques usually approximate such high-dimensional data with low-dimensional structures by fitting the data with one or more linear subspaces. However, their theory and algorithms are restricted to the setting in which the underlying subspaces have a low dimension relative to the ambient space. This thesis advances the understanding of subspace learning for data arising from subspaces of high relative dimension, and develops efficient algorithms for handling big and corrupted data.

The first major contribution of this thesis is a theoretical analysis that extends Dual Principal Component Pursuit (DPCP), a non-convex approach that learns a hyperplane from noiseless data, to learning a subspace of any dimension from noisy data. We provide geometric and probabilistic analyses that characterize how the principal angles between the global solution and the orthogonal complement of the subspace behave as a function of the noise level. Moreover, we improve the DPCP theory in the multi-hyperplane case with a more interpretable geometric analysis and a new statistical analysis.

The second major contribution of this thesis is the development of a linearly convergent method for non-convex optimization on the Grassmannian. We show that if the objective function satisfies a certain Riemannian regularity condition (RRC) with respect to some point on the Grassmannian, then a Projected Riemannian Sub-Gradient Method (PRSGM) converges at a linear rate to that point. In particular, we prove that the DPCP problem for learning a single subspace satisfies the RRC, and that PRSGM converges linearly to a neighborhood of the orthogonal complement of the subspace with error proportional to the noise level. We also extend the RRC to DPCP for a union of hyperplanes and prove linear convergence of PRSGM to a specific hyperplane. Finally, both synthetic and real experiments demonstrate the superiority of the proposed method.
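The abstract describes DPCP for a single hyperplane, which seeks a unit normal vector b minimizing the l1 objective ||X^T b||_1, together with a projected Riemannian subgradient method using geometrically decaying step sizes. Below is a minimal Python sketch of that scheme in the sphere case, the simplest instance of the Grassmannian setting treated in the thesis; the function name dpcp_prsgm and all step-size and iteration defaults are illustrative assumptions, not parameters taken from the dissertation.

import numpy as np

def dpcp_prsgm(X, mu0=1e-2, beta=0.9, iters=300, seed=0):
    # Minimize ||X^T b||_1 over the unit sphere via a projected
    # Riemannian subgradient method with geometric step-size decay.
    # X is D x N with columns near a hyperplane; the minimizer
    # approximates the hyperplane's unit normal. Parameter defaults
    # are assumptions for illustration only.
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(X.shape[0])
    b /= np.linalg.norm(b)
    mu = mu0
    for _ in range(iters):
        g = X @ np.sign(X.T @ b)   # Euclidean subgradient of ||X^T b||_1
        rg = g - (b @ g) * b       # project onto the tangent space at b
        b = b - mu * rg            # Riemannian subgradient step
        b /= np.linalg.norm(b)     # project back onto the sphere
        mu *= beta                 # geometrically decaying step size
    return b

# Toy check: 500 points near the hyperplane x_5 = 0 in R^5 (normal e_5).
rng = np.random.default_rng(1)
D, N = 5, 500
X = np.vstack([rng.standard_normal((D - 1, N)), np.zeros((1, N))])
X += 0.01 * rng.standard_normal((D, N))   # small additive noise
b_hat = dpcp_prsgm(X)
print(abs(b_hat[-1]))   # near 1 when the normal is recovered

With noisy data, such iterates can be expected to converge only to a neighborhood of the true normal whose radius scales with the noise level, consistent with the error bound stated in the abstract.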
dc.format.mimetype: application/pdf
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/64364
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.publisher.country: USA
dc.subject: subspace learning
dc.subject: subspace clustering
dc.subject: high relative dimension
dc.subject: dual principal component pursuit
dc.title: Subspace Learning for Data Arising from a Union of Subspaces of High Relative Dimension
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Applied Mathematics and Statistics
thesis.degree.discipline: Applied Mathematics & Statistics
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Whiting School of Engineering
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
Files
Original bundle
Name: DING-DISSERTATION-2021.pdf
Size: 5.5 MB
Format: Adobe Portable Document Format
License bundle
Name: LICENSE.txt
Size: 2.67 KB
Format: Plain Text