Binaural Speech Separation Algorithm Based on Deep Clustering

Lin Zhou; Kun Feng; Tianyi Wang; Yue Xu; Jingang Shi

doi:10.32604/iasc.2021.018414

Open Access

ARTICLE

Lin Zhou¹,*, Kun Feng¹, Tianyi Wang¹, Yue Xu¹, Jingang Shi²

¹ School of Information Science and Engineering, Southeast University, Nanjing, 210096, China
² Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, FI-90014, Finland

* Corresponding Author: Lin Zhou.

Intelligent Automation & Soft Computing 2021, 30(2), 527-537. https://doi.org/10.32604/iasc.2021.018414

Received 08 March 2021; Accepted 09 April 2021; Issue published 11 August 2021

Abstract

Neural networks (NN) and clustering are the two commonly used methods for speech separation based on supervised learning. Recently, deep clustering methods have shown promising performance. In our study, considering that the spectrum of a sound source is correlated over time and that the spatial position of a sound source is stable over short intervals, we combine spectral and spatial features for deep clustering. In this work, the logarithmic amplitude spectrum (LPS) and the interaural phase difference (IPD) function of each time-frequency (TF) unit of the binaural speech signal are extracted as features. The features of consecutive frames then form feature maps, which serve as the input to a bi-directional long short-term memory (BiLSTM) network. The BiLSTM converts the feature maps into high-dimensional vectors, which are used to classify the TF units by K-means clustering. The cluster indices are combined with the mixed speech signal to reconstruct the target speech signal. The simulation results show that the proposed algorithm significantly improves speech separation and speech quality, since both spectral and spatial information are used for clustering. The method also generalizes better to untrained conditions than traditional NN methods, e.g., methods based on deep neural networks (DNN) and convolutional neural networks (CNN).
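The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the STFT settings (`n_fft`, `hop`), the cosine/sine form of the IPD function, and all function names are assumptions made for illustration. The BiLSTM embedding network is not reproduced; `separate` accepts any per-TF-unit embedding array, and the usage lines after the block pass the raw features only as a stand-in for learned embeddings.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns a (frames, n_fft // 2 + 1) complex array."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def binaural_features(left, right, n_fft=512, hop=256, eps=1e-8):
    """Per-TF-unit features: log-amplitude spectrum plus cos/sin of the IPD."""
    spec_l, spec_r = stft(left, n_fft, hop), stft(right, n_fft, hop)
    lps = np.log(np.abs(spec_l) + eps)
    # Phase difference between the two ears, wrapped by np.angle.
    ipd = np.angle(spec_l * np.conj(spec_r))
    # Assumption: the "IPD function" is represented here as (cos, sin).
    feats = np.stack([lps, np.cos(ipd), np.sin(ipd)], axis=-1)
    return feats, spec_l

def kmeans_labels(points, k=2, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per row of points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def separate(mixture_spec, embeddings, k=2):
    """Cluster TF embeddings and apply the binary masks to the mixture spectrum."""
    t, f = mixture_spec.shape
    labels = kmeans_labels(embeddings.reshape(t * f, -1), k).reshape(t, f)
    masked = [mixture_spec * (labels == j) for j in range(k)]
    return masked, labels
```

A possible call, using synthetic noise in place of a real binaural recording: `feats, spec = binaural_features(left, right)` followed by `masked, labels = separate(spec, feats)`. Each entry of `masked` is a masked spectrogram for one cluster; an inverse STFT with overlap-add, omitted here, would be needed to return to the time domain.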

Keywords

Binaural speech separation; K-means clustering; BiLSTM

Cite This Article

APA Style

Zhou, L., Feng, K., Wang, T., Xu, Y., & Shi, J. (2021). Binaural speech separation algorithm based on deep clustering. Intelligent Automation & Soft Computing, 30(2), 527-537. https://doi.org/10.32604/iasc.2021.018414

Vancouver Style

Zhou L, Feng K, Wang T, Xu Y, Shi J. Binaural speech separation algorithm based on deep clustering. Intell Automat Soft Comput. 2021;30(2):527-537. https://doi.org/10.32604/iasc.2021.018414

IEEE Style

L. Zhou, K. Feng, T. Wang, Y. Xu, and J. Shi, "Binaural Speech Separation Algorithm Based on Deep Clustering," Intell. Automat. Soft Comput., vol. 30, no. 2, pp. 527-537, 2021. https://doi.org/10.32604/iasc.2021.018414

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
