Journal of Pattern Recognition Research, Vol 4, No 1 (2009)

QMOS - A Robust Visualization Method for Speaker Dependencies with Different Microphones

Andreas Maier, Maria Schuster, Ulrich Eysholdt, Tino Haderlein, Tobias Cincarek, Stefan Steidl, Anton Batliner, Stefan Wenhardt, Elmar Nöth

Abstract


There are several methods to create visualizations of speech data. All of them,
however, lack the ability to remove microphone-dependent distortions. In this
work, we examined Principal Component Analysis (PCA), Linear Discriminant
Analysis (LDA), and the COmprehensive Space Map of Objective Signal (COSMOS)
method. To address the lack of microphone independence in PCA, LDA, and COSMOS,
we present two methods that reduce the influence of the recording conditions on
the visualization. The first is a rigid registration of maps created from
identical speakers recorded under different conditions, i.e., different
microphones and distances. The second is an extension of the COSMOS method
that performs a non-rigid registration during the mapping procedure.
As measures of visualization quality, we computed the mapping error, which
occurs during the dimension reduction, and the grouping error, defined as the
average distance between the representations of the same speaker recorded by
different microphones. The best linear method in leave-one-speaker-out
evaluation is PCA plus rigid registration, with a mapping error of 47% and a
grouping error of 18%. The proposed method, however, outperforms this, with a
mapping error of 24% and a grouping error close to zero.
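
As a rough illustration of the rigid registration and the grouping error described above, the following sketch aligns one 2-D map onto another with a least-squares rotation and translation and then averages the distances between paired points. It rests on several assumptions not stated in the abstract: the maps are two-dimensional, each speaker appears at the same index in both maps, and the grouping error is an unnormalized mean Euclidean distance (the normalization behind the reported percentages is not given here). The function names `rigid_register_2d` and `grouping_error` and the toy coordinates are hypothetical.

```python
import math


def rigid_register_2d(source, target):
    """Rotate and translate `source` onto `target` (paired 2-D points).

    Uses the closed-form least-squares rotation angle for 2-D point sets;
    no scaling or reflection is applied, so the transform stays rigid.
    """
    n = len(source)
    # Centroids of both maps.
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n

    # Accumulate dot and cross products of the centered point pairs.
    dot = 0.0
    cross = 0.0
    for (x, y), (u, v) in zip(source, target):
        x, y, u, v = x - sx, y - sy, u - tx, v - ty
        dot += x * u + y * v
        cross += x * v - y * u

    # Angle that maximizes the alignment between the centered maps.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate about the source centroid, then move to the target centroid.
    return [
        (c * (x - sx) - s * (y - sy) + tx, s * (x - sx) + c * (y - sy) + ty)
        for x, y in source
    ]


def grouping_error(map_a, map_b):
    """Mean Euclidean distance between paired speaker positions (assumed definition)."""
    return sum(math.dist(p, q) for p, q in zip(map_a, map_b)) / len(map_a)


# Toy maps: the same three speakers, once per (hypothetical) microphone.
close_talk = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
# The second map is the first one rotated by 90 degrees and shifted by (3, 1).
distant = [(3.0, 1.0), (3.0, 2.0), (1.0, 1.0)]

registered = rigid_register_2d(distant, close_talk)
error_before = grouping_error(distant, close_talk)
error_after = grouping_error(registered, close_talk)
```

In this toy case the second map differs from the first only by a rotation and a shift, so the registration removes the discrepancy almost entirely. Real recordings would also differ by non-rigid distortions, which is the case the COSMOS extension is meant to handle.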

 

