综采工作面对讲系统非平稳噪声低功耗去噪方法

杨艺; 谭晓; 常亚军; 王科平; 刘斌斌; 王田

doi:10.13225/j.cnki.jccs.2024.0796

摘要: 综采工作面语音对讲系统面临严重的非平稳噪声干扰。在功耗限制条件下，实现对讲系统的超低信噪比语音去噪，是确保工作面语音信息正确传输的核心技术之一。基于IMCRA算法，提出一种面向综采工作面语音特点的非平稳噪声去除方法MIMCRA。其中，针对先验信噪比估计延迟导致的非平稳噪声估计不准的问题，引入改进2步噪声去除方法。即利用前一帧的先验信噪比和当前帧的纯净语音来滚动估计当前帧的先验信噪比和下一帧的纯净语音，实现了先验信噪比实时估计。针对固定平滑因子对含噪功率谱进行平滑处理容易引起噪声过估计，从而导致语音信息难以提取的问题，引入帧−频动态平滑因子调节机制。以平滑功率谱密度和噪声功率谱密度的最小均方差为依据，对含噪语音的功率谱实现动态平滑处理。针对信噪比过低，噪声去除不彻底的问题，提出一种面向弱语音分量保护的噪声存在概率检测机制。根据2～4 kHz频率范围内，噪声与弱语音能量分布的统计特性差别，对去噪后的信号再进行噪声检测，并消除存在的残余噪声。对比试验结果表明：当输入语音信噪比为−5～10 dB时，MIMCRA算法与IMCRA算法相比，分段信噪比提高约3 dB，分段误差降低约0.3，对数谱距离降低约0.2。特别当语音信噪比为−5 dB时，MIMCRA算法仍然能将分段信噪比提高到−2.799 5 dB，表明该算法对超低信噪比含噪语音有较强的去噪能力。MIMCRA算法在郑煤机最新研发的综采工作面对讲系统中实现了低功耗部署，芯片功耗为16.5～66.0 mW；处理32 ms帧长的语音帧耗时约16 ms，达到实时性要求。

Abstract: Voice intercom systems at fully mechanized mining faces suffer severe non-stationary noise interference. Denoising speech at ultra-low signal-to-noise ratio (SNR) under a tight power budget is one of the core technologies for ensuring that voice information is transmitted correctly at the working face. Building on the IMCRA algorithm, this paper proposes MIMCRA, a non-stationary noise removal method tailored to the speech characteristics of fully mechanized mining faces. First, to address the inaccurate estimation of non-stationary noise caused by delay in a priori SNR estimation, an improved two-step noise removal method is introduced: the a priori SNR of the previous frame and the clean speech of the current frame are used to recursively estimate the a priori SNR of the current frame and the clean speech of the next frame, which yields a real-time a priori SNR estimate. Second, because smoothing the noisy power spectrum with a fixed smoothing factor tends to overestimate the noise and makes speech information hard to extract, a frame-frequency dynamic smoothing factor adjustment mechanism is introduced; the power spectrum of the noisy speech is smoothed dynamically according to the minimum mean square error between the smoothed power spectral density and the noise power spectral density. Third, to address incomplete noise removal when the SNR is too low, a noise presence probability detection mechanism that protects weak speech components is proposed: based on the statistical differences in energy distribution between noise and weak speech in the 2 to 4 kHz band, the denoised signal is checked for noise again and any residual noise is removed. Comparative experiments show that for input SNRs from −5 to 10 dB, MIMCRA improves the segmental SNR by about 3 dB, reduces the segmental error by about 0.3, and reduces the log-spectral distance by about 0.2 relative to IMCRA. In particular, at an input SNR of −5 dB, MIMCRA still raises the segmental SNR to −2.7995 dB, indicating strong denoising capability for noisy speech at ultra-low SNR. MIMCRA has been deployed at low power in the latest fully mechanized mining face intercom system developed by Zhengzhou Coal Mining Machinery (ZMJ), with a chip power consumption of 16.5 to 66.0 mW; processing a 32 ms speech frame takes about 16 ms, which meets the real-time requirement.
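The two-step idea mentioned in the abstract can be illustrated with a minimal sketch. It follows the classic two-step noise reduction scheme (a decision-directed a priori SNR estimate, then a refinement pass using the first-step gain) and is not the authors' improved variant, whose details the abstract does not give. The function name `two_step_gain`, the weighting factor `beta = 0.98`, and the use of a Wiener gain are assumptions made for illustration; inputs are per-bin power values for a single frame.

```python
def two_step_gain(noisy_power, noise_power, prev_clean_power,
                  beta=0.98, floor=1e-12):
    """Per-bin spectral gain from a two-step a priori SNR estimate.

    noisy_power      -- |Y(k)|^2 of the current frame
    noise_power      -- estimated noise PSD for the current frame
    prev_clean_power -- |S_hat(k)|^2 estimated for the previous frame
    Returns (gains, clean_power) for the current frame.
    """
    gains, clean_power = [], []
    for y2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        n2 = max(n2, floor)                  # guard against division by zero
        gamma = y2 / n2                      # a posteriori SNR
        # Step 1: decision-directed a priori SNR (lags by about one frame).
        xi_dd = beta * s2_prev / n2 + (1.0 - beta) * max(gamma - 1.0, 0.0)
        g_dd = xi_dd / (1.0 + xi_dd)         # first-pass Wiener gain
        # Step 2: re-estimate the a priori SNR from the current frame
        # after applying the first-pass gain, which removes the frame lag.
        xi_2 = (g_dd ** 2) * y2 / n2
        g = xi_2 / (1.0 + xi_2)              # final Wiener gain
        gains.append(g)
        clean_power.append(g * g * y2)       # feeds the next frame's step 1
    return gains, clean_power
```

In use, `clean_power` from one frame would be passed as `prev_clean_power` for the next frame, so the estimate rolls forward frame by frame; the noise PSD would come from a separate tracker such as IMCRA.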

Low-power speech denoising method for non-stationary noise in underground mines
