Sparse and low rank approximations for action recognition
MetadataShow full item record
Action recognition is crucial area of research in computer vision with wide range of applications in surveillance, patient-monitoring systems, video indexing, Human- Computer Interaction and many more. These applications require automated action recognition. Robust classification methods are sought-after despite influential research in this field over past decade. The data resources have grown tremendously owing to the advances in the digital revolution which cannot be compared to the meagre resources in the past. The main limitation on a system when dealing with video data is the computational burden due to large dimensions and data redundancy. Sparse and low rank approximation methods have evolved recently which aim at concise and meaningful representation of data. This thesis explores the application of sparse and low rank approximation methods in the context of video data classification with the following contributions. 1. An approach for solving the problem of action and gesture classification is proposed within the sparse representation domain, effectively dealing with large feature dimensions, 2. Low rank matrix completion approach is proposed to jointly classify more than one action 3. Deep features are proposed for robust classification of multiple actions within matrix completion framework which can handle data deficiencies. This thesis starts with the applicability of sparse representations based classifi- cation methods to the problem of action and gesture recognition. Random projection is used to reduce the dimensionality of the features. These are referred to as compressed features in this thesis. The dictionary formed with compressed features has proved to be efficient for the classification task achieving comparable results to the state of the art. Next, this thesis addresses the more promising problem of simultaneous classifi- cation of multiple actions. This is treated as matrix completion problem under transduction setting. Matrix completion methods are considered as the generic extension to the sparse representation methods from compressed sensing point of view. The features and corresponding labels of the training and test data are concatenated and placed as columns of a matrix. The unknown test labels would be the missing entries in that matrix. This is solved using rank minimization techniques based on the assumption that the underlying complete matrix would be a low rank one. This approach has achieved results better than the state of the art on datasets with varying complexities. This thesis then extends the matrix completion framework for joint classification of actions to handle the missing features besides missing test labels. In this context, deep features from a convolutional neural network are proposed. A convolutional neural network is trained on the training data and features are extracted from train and test data from the trained network. The performance of the deep features has proved to be promising when compared to the state of the art hand-crafted features.