Scientific Publications

'Discriminant Non-negative Matrix Factorization and Projected Gradients for Frontal Face Verification' by I. Kotsia, S. Zafeiriou and I. Pitas

Abstract
A novel Discriminant Non-negative Matrix Factorization (DNMF) method that uses projected gradients is presented in this paper. The proposed algorithm is guaranteed to converge to a stationary point, in contrast to the methods introduced so far, which only ensure that the cost function is non-increasing. The algorithm also employs some extra modifications that make it more suitable for classification tasks. The usefulness of the proposed technique for the frontal face verification problem is also demonstrated.
1st COST 2101 Workshop on Biometrics and Identity Management (BIOID 2008)
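
The projected-gradient idea mentioned in the abstract above can be illustrated with a minimal sketch. This is plain NMF with a Frobenius-norm cost, not the discriminant cost of the paper, and the function name, the alternating update order, and the 1/L step size are assumptions made for illustration only.

```python
import numpy as np

def nmf_projected_gradient(V, rank, n_iter=200, seed=0):
    """Alternating projected gradient for min 0.5 * ||V - W H||_F^2 with W, H >= 0.

    Illustrative only: plain NMF, not the DNMF cost described in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    costs = []
    for _ in range(n_iter):
        # Gradient step on H with step 1/L, where L is the Lipschitz constant
        # of the gradient in H, then projection onto the non-negative orthant.
        L_h = max(np.linalg.norm(W.T @ W, 2), 1e-12)
        H = np.maximum(H - (W.T @ (W @ H - V)) / L_h, 0.0)
        # Same step for W with H fixed.
        L_w = max(np.linalg.norm(H @ H.T, 2), 1e-12)
        W = np.maximum(W - ((W @ H - V) @ H.T) / L_w, 0.0)
        costs.append(0.5 * np.linalg.norm(V - W @ H) ** 2)
    return W, H, costs
```

With the 1/L step, each block update cannot increase the cost, so the recorded costs are non-increasing. The projection (the np.maximum call) is what keeps both factors non-negative.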

'Pathological Motion Detection for Robust Missing Data Treatment' by David Corrigan, Naomi Harte and A. Kokaram

Abstract
This paper presents a new missing data detection algorithm that is robust to Pathological Motion (PM). PM can cause clean image data to be misdiagnosed as missing data, resulting in damage to the image data. The proposed algorithm uses a probabilistic framework to jointly detect PM and missing data. A five-frame window is employed instead of the typical three-frame window. This allows the temporally impulsive intensity profile of blotches to be distinguished from the quasi-periodic profile of PM. A second diagnostic for PM is defined on the local motion fields of the five-frame window. This follows the observation that PM results in motion fields which are not smooth. A ground truth comparison with standard missing data detectors shows that the proposed algorithm dramatically reduces the number of falsely detected missing data regions. The algorithm is also shown to reduce image damage during missing data treatment.
EURASIP Journal on Advances in Signal Processing - May 2008

'Simplification of objects for faster human body posture estimation and clinical registration' by A. Hajdu, P. Veres, A. Tanacs, I. Pitas

Abstract
In this paper, we investigate the impact of an object simplification approach in registration using the Iterative Closest Point (ICP) algorithm. Though the simplification of the objects remarkably reduces the registration time, it also raises a critical issue, whether the simplified version of the object carries sufficient information to recover the proper geometric parameters of registration. Accordingly, we investigate both simulated and real data to test the geometric degradation caused by the simplification. As for computational times, we monitor how much time we can save during the ICP registration using simpler objects. Moreover, we check the times taken by the simplification process itself to see whether simplification can be considered online, as well.
NUMGRID2008
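
As a rough illustration of the registration pipeline discussed in the abstract above, the sketch below runs a basic point-to-point ICP on a simplified source point set. The abstract does not say how objects are simplified, so simplify_by_subsampling is a hypothetical stand-in (it just keeps every k-th point). The brute-force nearest-neighbour search is likewise an assumption chosen for brevity.

```python
import numpy as np

def simplify_by_subsampling(points, keep_every):
    """Hypothetical simplification: keep every k-th point."""
    return points[::keep_every]

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~ dst[i] (Kabsch method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c

def icp(src, dst, n_iter=30):
    """Basic point-to-point ICP; returns the accumulated rotation and translation."""
    dim = src.shape[1]
    R_total, t_total = np.eye(dim), np.zeros(dim)
    cur = src.copy()
    for _ in range(n_iter):
        # Brute-force closest point in dst for every current source point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

The nearest-neighbour step costs time proportional to the product of the two point counts, which is why a smaller source set shortens each iteration. Whether the simplified set still recovers the correct transform is exactly the question the paper investigates.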

'Human Movement Recognition Using Fuzzy Clustering And Discriminant Analysis' by N. Gkalelis, A. Tefas, I. Pitas

Abstract
In this paper a novel method for human movement representation and recognition is proposed. A movement is regarded as a sequence of basic movement patterns, the so-called dynemes. Initially, the fuzzy c-means (FCM) algorithm is used to identify the dynemes in the input space, and then principal component analysis plus linear discriminant analysis (PCA plus LDA) is employed to project the postures of a movement to the identified dynemes. In this space, the posture representations of the movement are combined to represent the movement in terms of its constituent dynemes. This representation allows for efficient Mahalanobis or cosine-based nearest centroid classification of variable-length movements.
EUSIPCO 2008
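
The fuzzy c-means step named in the abstract above can be sketched as follows. This is the textbook FCM iteration. The fuzzifier m = 2, the random initialisation, and the function name are assumptions and are not taken from the paper. The cluster centres play the role of the dynemes, and each row of the membership matrix expresses one posture in terms of them.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships).

    X: array of shape (n_samples, n_features).
    memberships: array of shape (n_samples, n_clusters); each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centres are means of the data weighted by memberships raised to m.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Memberships are proportional to distance ** (-2 / (m - 1)).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.maximum(dist, 1e-12) ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```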

'Motivating Class-Specific nonlinear projections for multiview face verification' by S. Zafeiriou, G. Goudelis, A. Tefas, N. Nikolaidis, I. Pitas

Abstract
In this paper we motivate the use of class-specific nonlinear subspace methods for face verification. The problem of face verification is considered as a two-class problem (genuine versus impostor class). The typical Fisher’s Linear Discriminant Analysis (FLDA) gives only one or two projections in a two-class problem, which severely limits the search for discriminant dimensions. In FLDA for N-class problems (N > 2), on the other hand, the transformation is not person specific. Novel kernel discriminant algorithms are used in order to remedy these limitations of FLDA, to exploit the individuality of human faces, and to take into account the fact that the distribution of facial images under different viewpoints, illumination variations and facial expressions is highly complex and nonlinear. The new method was tested on the face verification problem using single and multiple view datasets and was found to outperform other commonly used kernel approaches.
ICIP 2008

Sparse human movement representation and recognition
by Nikolaos Gkalelis, Anastasios Tefas, Ioannis Pitas

Abstract
In this paper a novel method for human movement representation and recognition is proposed. A movement type is regarded as a unique combination of basic movement patterns, the so-called dynemes. The fuzzy c-means (FCM) algorithm is used to identify the dynemes in the input space and allow the expression of a posture in terms of these dynemes. In the so-called dyneme space, the sparse posture representations of a movement are combined to represent the movement as a single point in that space, and linear discriminant analysis (LDA) is further employed to increase movement type discrimination and compactness of representation. This method allows for simple Mahalanobis or cosine distance comparison of movements, implicitly taking into account time shifts and internal speed variations, and thus aiding the design of a real-time movement recognition algorithm.

2008 IEEE International Workshop on Multimedia Signal Processing

OBJECTIVE QUALITY ASSESSMENT IN FREE-VIEWPOINT VIDEO PRODUCTION by J. Starck, J. Kilner and A. Hilton

Abstract
This paper addresses the problem of objectively measuring quality in free-viewpoint video production. The accuracy of scene reconstruction is typically limited and an evaluation of free-viewpoint video should explicitly consider the quality of image production. A simple objective measure of accuracy is presented in terms of structural registration error in view synthesis. This technique can be applied as a full-reference metric to measure the fidelity of view synthesis to a ground truth image or as a no-reference metric to measure the error in registering scene appearance in image-based rendering. The metric is applied to a data-set with known geometric accuracy and a comparison is also demonstrated between two free viewpoint video techniques across two prototype production studios.

3dtv08.pdf (Adobe PDF - 254Kb)

Objective quality assessment in free-viewpoint video production by J. Kilner, J. Starck, J.Y. Guillemaut, A. Hilton

Abstract
This paper addresses the problem of objectively quantifying accuracy in free-viewpoint video production. Free-viewpoint video makes use of geometric scene reconstruction and renders novel views using the appearance sampled in multiple camera images. Previous work typically adopts an objective evaluation of geometric accuracy against ground-truth data or a subjective evaluation of visual quality in view synthesis. We consider two production scenarios, human performance capture in a highly constrained studio environment and sports production in a large-scale external environment. The accuracy of scene reconstruction is typically limited and absolute geometric accuracy does not necessarily reflect the quality of free-viewpoint rendering. A framework is introduced to quantify error at the point of view synthesis. The approach can be applied as a full-reference metric to measure fidelity to a ground-truth image or as a no-reference metric to measure the error in rendering. The framework is applied to a data set with known geometric accuracy and a comparison is presented for studio based and sports production scenarios.

sdarticle.pdf (Adobe PDF - 637Kb)

Fusion of movement specific human identification experts by Nikolaos Gkalelis, Anastasios Tefas, and Ioannis Pitas

Abstract
In this paper a multi-modal method for human identification that exploits the discriminant features derived from several movement types performed by the same person is proposed. Utilizing an algorithm based on fuzzy vector quantization (FVQ) and linear discriminant analysis (LDA), an unknown movement is first classified, and then the person performing the movement is recognized by a movement specific person recognition expert. If the unknown person performs more than one movement, a multi-modal algorithm combines the scores of the individual experts to yield the final decision on the identity of the unknown person. Using a publicly available database, we provide promising results regarding the human identification strength of movement specific experts, and we show that combining the outputs of the experts increases the robustness of the human recognition algorithm.

bioId09.pdf (Adobe PDF - 137Kb)

View independent human movement recognition from multi-view video exploiting a circular invariant posture representation by Nikolaos Gkalelis, Nikos Nikolaidis, Ioannis Pitas

Abstract
In this paper a novel method for view independent human movement representation and recognition, exploiting the rich information contained in multi-view videos, is proposed. The binary masks of a multi-view posture image are first vectorized and concatenated, and the view correspondence problem between train and test samples is solved using the circular shift invariance property of the discrete Fourier transform (DFT) magnitudes. Then, using fuzzy vector quantization (FVQ) and linear discriminant analysis (LDA), different movements are represented and classified. This method allows view independent movement recognition without the use of calibrated cameras, a priori view correspondence information or 3D model reconstruction. A multi-view video database has been constructed for the assessment of the proposed algorithm. Evaluation of the algorithm on the new database shows that it is particularly efficient and robust, and can achieve good recognition performance.

icme09.pdf (Adobe PDF - 901Kb)
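
The circular shift invariance of DFT magnitudes used in the abstract above can be checked with a small sketch. The function name and the array layout (one row per camera, ordered around the circle) are assumptions for illustration. The paper's actual preprocessing may differ.

```python
import numpy as np

def view_invariant_descriptor(view_masks):
    """Concatenate the vectorized masks of all views and keep DFT magnitudes.

    view_masks: array of shape (n_views, n_pixels), rows ordered by camera
    position around the subject. Starting the ordering at a different camera
    circularly shifts the concatenated vector, which changes only the phase
    of its DFT and leaves the magnitudes unchanged.
    """
    concatenated = np.asarray(view_masks, dtype=float).ravel()
    return np.abs(np.fft.fft(concatenated))
```

For example, view_invariant_descriptor(masks) and view_invariant_descriptor(np.roll(masks, 3, axis=0)) agree up to floating-point error, so train and test samples can be compared without knowing which camera comes first.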

Frontal view recognition in multiview video sequences by I. Kotsia, N. Nikolaidis and I. Pitas

Abstract
In this paper, a novel method is proposed as a solution to the problem of frontal view recognition from multiview image sequences. Our aim is to correctly identify the view that corresponds to the camera placed in front of a person, or the camera whose view is closest to a frontal one. By doing so, frontal face images of the person can be acquired for use in face or facial expression recognition techniques that require frontal faces to achieve a satisfactory result. The proposed method first applies the Discriminant Non-Negative Matrix Factorization (DNMF) algorithm to the input images acquired from every camera. The output of the algorithm is then used as input to a Support Vector Machines (SVMs) system that classifies the head poses acquired from the cameras into two classes, corresponding to the frontal and non-frontal pose. Experiments conducted on the IDIAP database demonstrate that the proposed method achieves an accuracy of 98.6% in frontal view recognition.

kotsia_icme09.pdf (Adobe PDF - 943Kb)

Skeleton Driven Laplacian Volumetric Deformation by C. Budd and A. Hilton

Abstract
This paper proposes a novel mesh animation technique which combines the flexible interactive control of skeleton based animation rigs with volumetric mesh deformation to avoid mesh collapse and self-intersection under folding and twisting motion. Our solution combines the industry standard Linear Skin Blending with a mesh based volumetric deformation approach. Linear Skin Blending is used to attach and efficiently animate a small number of points with a skeletal control rig. These points provide constraints for a Laplacian mesh deformation scheme which solves for the mesh which satisfies the constraints and gives minimum volume deformation of a tetrahedralization of the mesh vertices. This approach allows rigging and animation of high-resolution captured surface meshes from multiple view video or 3D scans. Interactive skeleton driven animation is achieved for meshes of several thousand vertices without the known drawbacks of Linear Skin Blending, mesh collapse around joints and the ’candy wrapper effect’.

CVMP-CBudd.pdf (Adobe PDF - 2.2Mb)
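
The Linear Skin Blending drawback mentioned in the abstract above can be reproduced with a minimal sketch. The function name and array shapes are assumptions for illustration. The code shows only standard linear blend skinning and does not implement the Laplacian volumetric deformation proposed in the paper.

```python
import numpy as np

def linear_blend_skinning(vertices, weights, rotations, translations):
    """Blend per-bone rigid transforms: v' = sum_b w[v, b] * (R_b @ v + t_b).

    vertices: (V, 3), weights: (V, B) with rows summing to 1,
    rotations: (B, 3, 3), translations: (B, 3).
    """
    per_bone = np.einsum('bij,vj->vbi', rotations, vertices) + translations[None, :, :]
    return np.einsum('vb,vbi->vi', weights, per_bone)

# Two bones along the x axis: one fixed, one twisted by 180 degrees about x.
identity = np.eye(3)
twist_180 = np.diag([1.0, -1.0, -1.0])
rotations = np.stack([identity, twist_180])
translations = np.zeros((2, 3))

# A vertex at the joint, one unit off the bone axis, influenced equally by both bones.
vertex = np.array([[0.5, 1.0, 0.0]])
weights = np.array([[0.5, 0.5]])
collapsed = linear_blend_skinning(vertex, weights, rotations, translations)
```

Here collapsed lands on the bone axis at (0.5, 0, 0). Averaging the two transformed positions removes the vertex's offset from the axis, which is the collapse behind the candy wrapper effect.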

Wide Baseline Matte Propagation for indoor scenes by M. Sarim, A. Hilton and J.-Y. Guillemaut

Abstract
Digital image matting is the process of extracting foreground objects from an image. This is extremely challenging for natural images and videos because of its ill-posed nature. Initial user interaction is required to aid the algorithms in identifying the definite foreground and background regions. Recently, techniques have been developed to estimate the alpha matte of an image using multi-view images of a foreground object. However, these algorithms are only capable of handling narrow baseline views with small intensity and structural variations in the foreground. In this paper, we propose a novel non-parametric approach to generate alpha mattes for wide-baseline multi-view images in which the foreground appearance differs between views.

CVMP-Sarim.pdf (Adobe PDF - 3.2Mb)

Graph Based Foreground extraction in Extended color space by H. Kim and A. Hilton

Abstract
We propose a region-based method to extract semantic foreground regions from color video sequences with static backgrounds. First, we introduce a new distance measure for background subtraction which is robust against shadows. Then the foreground region is extracted with a graph-based region segmentation method considering background difference and spatial homogeneity. For efficient computation, the graph structure is optimized by the minimum spanning tree before segmentation. The main contribution is that the proposed algorithm improves on conventional approaches, especially in strong shadow regions, and does not require manual initialization. We have verified through experiments and comparison to state-of-the-art methods that the proposed algorithm works well with various cameras and environments.

ICIP-kim.pdf (Adobe PDF - 594Kb)