Lin Zhou1,*, Yue Xu1, Tianyi Wang1, Kun Feng1, Jingang Shi2
CMC-Computers, Materials & Continua, Vol.69, No.2, pp. 2705-2716, 2021, DOI:10.32604/cmc.2021.017080
Abstract: Traditional separation methods have limited ability to handle speech separation in highly reverberant, low signal-to-noise ratio (SNR) environments, and thus achieve unsatisfactory results. In this study, a convolutional neural network with temporal convolution and a residual network (TC-ResNet) is proposed to realize speech separation in complex acoustic environments. A simplified steered-response power phase transform, denoted as GSRP-PHAT, is employed to reduce the computational cost. The extracted features are reshaped into a special tensor that serves as the system input and is processed by temporal convolution, which not only enlarges the receptive field of the convolution layer ...
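
The abstract's point about the receptive field can be illustrated with a small calculation. The sketch below is not the authors' network: the function name `receptive_field` and the layer settings are hypothetical, chosen only to show how the number of input frames seen by one output sample grows with the kernel width, stride, and dilation of stacked one-dimensional (temporal) convolutions.

```python
def receptive_field(layers):
    """Return the receptive field, in input frames, of stacked 1-D convolutions.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    field = 1  # with no layers, one output sample sees exactly one input frame
    jump = 1   # spacing, in input frames, between adjacent samples at this depth
    for kernel_size, stride, dilation in layers:
        # Each layer widens the view by (kernel_size - 1) taps, scaled by the
        # dilation and by the accumulated stride of the layers before it.
        field += (kernel_size - 1) * dilation * jump
        jump *= stride
    return field


# Hypothetical layer stacks, for illustration only (not taken from the paper).
small_kernels = [(3, 1, 1)] * 3  # three layers, kernel width 3
wide_kernels = [(9, 1, 1)] * 3   # three layers, kernel width 9

print(receptive_field(small_kernels))
print(receptive_field(wide_kernels))
```

Under these assumed settings, widening the temporal kernel from 3 to 9 frames increases the span of input frames that each output depends on without adding layers, which is the general effect the abstract attributes to temporal convolution.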