Voice-based Age and Gender Recognition using Training Generative Sparse Model


University of Mazandaran


Abstract: Gender recognition and age detection are important problems in telephone speech processing, where the identity of an individual is investigated from voice characteristics. In this paper, a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. As in general signal classification schemes, the proposed algorithm includes a training step that provides the atoms related to each signal class and a test phase that assesses classification performance. Since classification accuracy depends highly on the selected features, we employ Mel-frequency cepstral coefficients to train the bases for a better representation of speech structure. These bases are learned over data from male and female speakers using non-negative matrix factorization with a sparsity constraint. Then, atom correction is carried out using an energy-based algorithm to decrease the coherence between the trained dictionaries of different categories. In the sparse representation of each data class, the atoms related to other sets that have the highest energy are replaced with the lowest-energy bases, provided the reconstruction error does not exceed a specified limit. The experimental results show that the proposed algorithm performs better than earlier methods in this context, especially in the presence of background noise.
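The train and test phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes multiplicative-update NMF with an L1 penalty on the activations, classification by smallest reconstruction error, and synthetic non-negative features standing in for real speech features. The names `sparse_nmf`, `encode`, `classify`, and `cross_coherence` are hypothetical, and the energy-based atom correction step is not reproduced because the abstract does not specify it in enough detail; only the coherence it aims to reduce is measured.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero


def sparse_nmf(V, n_atoms, sparsity=0.1, n_iter=200, seed=0):
    """Learn a non-negative dictionary W (features x atoms) with V ~ W @ H.

    Assumption: multiplicative updates with an L1 penalty on H.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_atoms)) + EPS
    H = rng.random((n_atoms, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
        # Keep atoms at unit norm; rescale H so that W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + EPS
        W /= norms
        H *= norms.T
    return W, H


def encode(V, W, sparsity=0.1, n_iter=200, seed=0):
    """Sparse non-negative activations of V over a fixed dictionary W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
    return H


def classify(V, dictionaries):
    """Pick the class whose dictionary reconstructs V with the least error."""
    errors = {label: float(np.linalg.norm(V - W @ encode(V, W)))
              for label, W in dictionaries.items()}
    return min(errors, key=errors.get), errors


def cross_coherence(Wa, Wb):
    """Largest absolute inner product between unit-norm atoms of Wa and Wb."""
    Wa = Wa / (np.linalg.norm(Wa, axis=0, keepdims=True) + EPS)
    Wb = Wb / (np.linalg.norm(Wb, axis=0, keepdims=True) + EPS)
    return float(np.max(np.abs(Wa.T @ Wb)))


# Synthetic stand-in data: each class occupies a different block of features.
rng = np.random.default_rng(1)
basis_m = np.vstack([rng.random((10, 4)), np.zeros((10, 4))])
basis_f = np.vstack([np.zeros((10, 4)), rng.random((10, 4))])
train_m = basis_m @ rng.random((4, 50))
train_f = basis_f @ rng.random((4, 50))

# Train step: one dictionary per class.
dictionaries = {
    "male": sparse_nmf(train_m, n_atoms=4)[0],
    "female": sparse_nmf(train_f, n_atoms=4)[0],
}

# Test step: classify unseen frames from each class.
pred_m, errors_m = classify(basis_m @ rng.random((4, 5)), dictionaries)
pred_f, errors_f = classify(basis_f @ rng.random((4, 5)), dictionaries)
coherence = cross_coherence(dictionaries["male"], dictionaries["female"])
```

With real speech, the feature matrix would need to be made non-negative before factorization, and lower values of `coherence` between class dictionaries would indicate less overlap between the male and female models.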