Subspace Learning for Data Arising from a Union of Subspaces of High Relative Dimension

dc.contributor.advisor: Basu, Amitabh
dc.contributor.committeeMember: Robinson, Daniel
dc.contributor.committeeMember: Vidal, Rene
dc.contributor.committeeMember: Zhu, Zhihui
dc.creator: Ding, Tianyu
dc.creator.orcid: 0000-0001-8445-4330
dc.date.accessioned: 2021-09-16T18:55:21Z
dc.date.available: 2021-09-16T18:55:21Z
dc.date.created: 2021-08
dc.date.issued: 2021-06-24
dc.date.submitted: August 2021
dc.date.updated: 2021-09-16T18:55:21Z
dc.description.abstract: As statistical machine learning has grown rapidly over the past decades, the ability to process big and corrupted data has become increasingly important. One of the major challenges is that structured data arising in many applications, such as images, videos, and 3D point clouds, are high-dimensional. Conventional techniques usually approximate such high-dimensional data with low-dimensional structures by fitting the data with one or more linear subspaces. However, their theory and algorithms are restricted to the setting in which the underlying subspaces have a low dimension relative to the ambient space. This thesis advances the understanding of subspace learning for data arising from subspaces of high relative dimension, and develops efficient algorithms for handling big and corrupted data.

The first major contribution of this thesis is a theoretical analysis that extends Dual Principal Component Pursuit (DPCP), a non-convex approach that learns a hyperplane from noiseless data, to learning a subspace of any dimension from noisy data. We provide geometric and probabilistic analyses that characterize how the principal angles between the global solution and the orthogonal complement of the subspace behave as a function of the noise level. Moreover, we improve the DPCP theory in the multi-hyperplane case with a more interpretable geometric analysis and a new statistical analysis.

The second major contribution of this thesis is the development of a linearly convergent method for non-convex optimization on the Grassmannian. We show that if the objective function satisfies a certain Riemannian regularity condition (RRC) with respect to some point on the Grassmannian, then a Projected Riemannian Sub-Gradient Method (PRSGM) converges at a linear rate to that point. In particular, we prove that the DPCP problem for learning a single subspace satisfies the RRC, and that PRSGM converges linearly to a neighborhood of the orthogonal complement of the subspace with error proportional to the noise level. We also extend the RRC to DPCP for a union of hyperplanes and prove linear convergence of PRSGM to a specific hyperplane. Finally, both synthetic and real experiments demonstrate the superiority of the proposed method.
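The abstract describes DPCP for a single hyperplane, which seeks a unit normal vector b minimizing the l1 objective ||X^T b||_1, together with a projected Riemannian subgradient method using geometrically decaying step sizes. Below is a minimal Python sketch of that scheme in the sphere case, the simplest instance of the Grassmannian setting treated in the thesis; the function name dpcp_prsgm and all step-size and iteration defaults are illustrative assumptions, not parameters taken from the dissertation.

import numpy as np

def dpcp_prsgm(X, mu0=1e-2, beta=0.9, iters=300, seed=0):
    # Minimize ||X^T b||_1 over the unit sphere via a projected
    # Riemannian subgradient method with geometric step-size decay.
    # X is D x N with columns near a hyperplane; the minimizer
    # approximates the hyperplane's unit normal. Parameter defaults
    # are assumptions for illustration only.
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(X.shape[0])
    b /= np.linalg.norm(b)
    mu = mu0
    for _ in range(iters):
        g = X @ np.sign(X.T @ b)   # Euclidean subgradient of ||X^T b||_1
        rg = g - (b @ g) * b       # project onto the tangent space at b
        b = b - mu * rg            # Riemannian subgradient step
        b /= np.linalg.norm(b)     # project back onto the sphere
        mu *= beta                 # geometrically decaying step size
    return b

# Toy check: 500 points near the hyperplane x_5 = 0 in R^5 (normal e_5).
rng = np.random.default_rng(1)
D, N = 5, 500
X = np.vstack([rng.standard_normal((D - 1, N)), np.zeros((1, N))])
X += 0.01 * rng.standard_normal((D, N))   # small additive noise
b_hat = dpcp_prsgm(X)
print(abs(b_hat[-1]))   # near 1 when the normal is recovered

With noisy data, such iterates can be expected to converge only to a neighborhood of the true normal whose radius scales with the noise level, consistent with the error bound stated in the abstract.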
dc.format.mimetype: application/pdf
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/64364
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.publisher.country: USA
dc.subject: subspace learning
dc.subject: subspace clustering
dc.subject: high relative dimension
dc.subject: dual principal component pursuit
dc.title: Subspace Learning for Data Arising from a Union of Subspaces of High Relative Dimension
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Applied Mathematics and Statistics
thesis.degree.discipline: Applied Mathematics & Statistics
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Whiting School of Engineering
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
Files
Original bundle
Name: DING-DISSERTATION-2021.pdf
Size: 5.5 MB
Format: Adobe Portable Document Format
License bundle
Name: LICENSE.txt
Size: 2.67 KB
Format: Plain Text