On Efficient Bayesian Scene Interpretation

dc.contributor.advisor: Hermansky, Hynek
dc.contributor.committeeMember: Geman, Donald J.
dc.contributor.committeeMember: Younes, Laurent
dc.contributor.committeeMember: Yuille, Alan L.
dc.contributor.committeeMember: Tran, Trac Duy
dc.creator: Jahangiri, Ehsan
dc.creator.orcid: 0000-0001-6208-5474
dc.date.accessioned: 2018-10-03T02:53:41Z
dc.date.available: 2018-10-03T02:53:41Z
dc.date.created: 2016-05
dc.date.issued: 2016-02-05
dc.date.submitted: May 2016
dc.date.updated: 2018-10-03T02:53:41Z
dc.description.abstract: Scene understanding, including object recognition, is perhaps the most challenging task in computer vision. Deep convolutional neural networks (CNNs) have received a flurry of interest in the past few years due to their superior performance. However, deep networks are computationally expensive and, without an efficient implementation on high-performance computing systems, are not as practical as older methods. Furthermore, CNNs do not benefit from the selective visual attention and top-down contextual feedback connections of the human visual system. The human visual system makes extensive use of contextual information to facilitate and refine object detection; detection and recognition based only on the intrinsic features of target objects are usually not sufficient for reliable inference. In this thesis, we use a model-based approach to incorporate top-down contextual information and analyze scenes in a coarse-to-fine fashion inspired by visual selective attention. In addition to disambiguating object detection, contextual relations between scene entities allow the space of objects and their poses to be searched more efficiently. We present a new approach to efficiently searching this space using a Bayesian method called "Entropy Pursuit", in which contextual relations between object instances and other scene entities are incorporated via a prior model. With entropy pursuit, we collect bits of information about the scene sequentially by greedily selecting the patches whose analysis is most informative in an information-theoretic sense. As a proof of concept, we apply entropy pursuit to multi-category object recognition in table-setting scenes and investigate whether a scene interpretation can be generated by processing only a fraction of the patches in an input image. Our results confirm the hypothesis that an accurate interpretation can be identified by processing only a fraction of the patches, and therefore with less computation time, provided the right patches are selected in the right order.
dc.format.mimetype: application/pdf
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/59373
dc.language: en
dc.publisher: Johns Hopkins University
dc.publisher.country: USA
dc.subject: Scene Interpretation, Object Detection, Convolutional Neural Networks, Statistical Inference, Stochastic Approximation, MCMC.
dc.title: On Efficient Bayesian Scene Interpretation
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Applied Mathematics & Statistics
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: Whiting School of Engineering
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
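Illustrative note on the abstract: the greedy, information-theoretic patch selection it describes can be sketched as follows. This is a minimal toy sketch, not the thesis's implementation. It assumes a small finite set of scene hypotheses, binary patch answers that are conditionally independent given the hypothesis, and a known likelihood table. All names (`greedy_pursuit`, `information_gain`, `lik`, `observe`) are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def information_gain(post, lik_q):
    """Mutual information (bits) between the hypothesis and one query's
    binary answer. lik_q[h] = P(answer = 1 | hypothesis h) (assumed known)."""
    p1 = sum(p * l for p, l in zip(post, lik_q))  # marginal P(answer = 1)
    h_answer = entropy([p1, 1.0 - p1])
    h_answer_given_h = sum(p * entropy([l, 1.0 - l])
                           for p, l in zip(post, lik_q))
    return h_answer - h_answer_given_h

def posterior_update(post, lik_q, answer):
    """Bayes update of the hypothesis distribution after a binary answer."""
    w = [p * (l if answer == 1 else 1.0 - l) for p, l in zip(post, lik_q)]
    z = sum(w)  # assumed > 0, i.e. the answer is possible under the model
    return [x / z for x in w]

def greedy_pursuit(prior, lik, observe, budget):
    """Query at most `budget` patches, each time choosing the unqueried
    patch whose answer is expected to be most informative."""
    post = list(prior)
    remaining = set(range(len(lik)))
    order = []
    for _ in range(min(budget, len(lik))):
        q = max(sorted(remaining),
                key=lambda j: information_gain(post, lik[j]))
        remaining.remove(q)
        post = posterior_update(post, lik[q], observe(q))
        order.append(q)
    return order, post

# Toy example: 4 hypotheses, 3 candidate patches (the last is uninformative).
prior = [0.25, 0.25, 0.25, 0.25]
lik = [
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.1, 0.9, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]
answers = {0: 1, 1: 1, 2: 0}  # hypothetical observed answers
order, post = greedy_pursuit(prior, lik, answers.get, budget=2)
best = max(range(len(post)), key=post.__getitem__)
```

With a budget of two out of three patches, the uninformative patch is never queried, which mirrors the abstract's point that only a fraction of patches needs processing when they are chosen well. In the thesis the prior is a contextual scene model rather than a flat table, so this sketch only conveys the selection principle.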
Files
Original bundle
Name: JAHANGIRI-DISSERTATION-2016.pdf
Size: 26.3 MB
Format: Adobe Portable Document Format