Vol.64, No.1, 2020, pp.253-271, doi:10.32604/cmc.2020.09848
Sound Source Localization Based on SRP-PHAT Spatial Spectrum and Deep Neural Network
Xiaoyan Zhao1,*, Shuwen Chen2, Lin Zhou3, Ying Chen3,4
1 School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing, 211167, China.
2 School of Mathematics and Information Technology, Jiangsu Second Normal University, Nanjing, 210013, China.
3 School of Information Science and Engineering, Southeast University, Nanjing, 210096, China.
4 Department of Psychiatry, Columbia University and NYSPI, New York, 10032, USA.
* Corresponding Author: Xiaoyan Zhao. Email: xiaoyanzhao205@163.com.
Received 22 January 2020; Accepted 02 March 2020; Issue published 20 May 2020
Microphone array-based sound source localization (SSL) is a challenging task in adverse acoustic scenarios. To address this, this paper presents a novel SSL algorithm based on a deep neural network (DNN) that uses the steered response power-phase transform (SRP-PHAT) spatial spectrum as its input feature. The SRP-PHAT spatial power spectrum is adopted as the input feature because it contains spatial location information, and a DNN is used to extract effective location information from it because of its strength in learning high-level features. The SRP-PHAT values at all steering positions within a frame are arranged into a vector, which serves as the DNN input. A DNN model that maps the SRP-PHAT spatial spectrum to the azimuth of the sound source is learned from the training signals, and the azimuth of the sound source in the testing signals is then estimated with the trained model. Experimental results demonstrate that the proposed algorithm significantly improves localization performance whether or not the training and testing conditions match, and that it is more robust to noise and reverberation.
Keywords: Sound source localization, microphone array, steered response power-phase transform (SRP-PHAT) spatial spectrum, deep neural network.
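The abstract describes arranging the SRP-PHAT value at each steering position within a frame into a vector for the DNN. As a rough illustration only, the sketch below computes such a vector in Python with NumPy. It is not the authors' implementation: the function name `srp_phat_spectrum`, the far-field (plane-wave) model, the planar microphone coordinates, the azimuth-only steering grid, and the speed of sound of 343 m/s are all assumptions made for this example.

```python
import numpy as np

def srp_phat_spectrum(frame, mic_xy, azimuths_deg, fs, c=343.0):
    """Return one SRP-PHAT value per candidate azimuth for a single frame.

    frame        : (M, N) array, one row of N time samples per microphone
    mic_xy       : (M, 2) array of microphone positions in meters (assumed planar)
    azimuths_deg : candidate steering azimuths in degrees (assumed far-field)
    fs           : sampling rate in Hz
    c            : assumed speed of sound in m/s
    """
    frame = np.asarray(frame, dtype=float)
    mic_xy = np.asarray(mic_xy, dtype=float)
    num_mics, num_samples = frame.shape

    spectra = np.fft.rfft(frame, axis=1)                 # (M, K)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)     # (K,)

    az = np.deg2rad(np.asarray(azimuths_deg, dtype=float))
    directions = np.stack([np.cos(az), np.sin(az)], axis=1)  # (D, 2) unit vectors

    # Far-field arrival time at each microphone relative to the origin:
    # microphones closer to the source receive the wavefront earlier.
    arrival = -(mic_xy @ directions.T) / c               # (M, D)

    power = np.zeros(len(az))
    for i in range(num_mics):
        for j in range(i + 1, num_mics):
            cross = spectra[i] * np.conj(spectra[j])
            cross /= np.abs(cross) + 1e-12               # PHAT: keep phase only
            tdoa = arrival[i] - arrival[j]               # (D,) hypothesized delays
            steer = np.exp(2j * np.pi * np.outer(tdoa, freqs))  # (D, K)
            power += np.real(steer @ cross)              # sum over frequency bins
    return power
```

Under these assumptions, the returned vector has one entry per steering azimuth and could be fed to a DNN as the per-frame input feature; taking its `argmax` gives the conventional SRP-PHAT azimuth estimate that a learned model would be compared against.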
Cite This Article
Zhao, X., Chen, S., Zhou, L., Chen, Y. (2020). Sound Source Localization Based on SRP-PHAT Spatial Spectrum and Deep Neural Network. CMC-Computers, Materials & Continua, 64(1), 253–271.