Citation: NI Jingfeng,LIU Xuefeng,DENG Lijun. Method for filling missing data of mine ventilation parameters[J]. Journal of China Coal Society,2024,49(5):2315−2323. doi: 10.13225/j.cnki.jccs.2023.0481

Method for filling missing data of mine ventilation parameters

Abstract: Intelligent mine ventilation systems are essential to the intelligent construction of mines. In practice, mine ventilation parameters are often missing because roadways lack the conditions for measurement, instrument signals suffer interference, air velocity is non-uniform across the roadway cross-section, or manual operation is improper. To address this problem, a method for imputing missing mine ventilation parameter data is proposed, based on multiple imputation by chained equations with random forests. Multiple imputation by chained equations iteratively generates n imputed values for each missing attribute value, yielding n complete datasets, which are then analyzed and optimized into one final complete dataset. To improve imputation accuracy, the influence of the uncertainty of the missing data on the analysis is taken into account: within the random forest prediction task, missing values are imputed in combination with a predictive mean matching model. Taking the Luxin No.2 Mine as the experimental object, the intelligent mine ventilation simulation system IMVS was used to preprocess the mine's original ventilation parameter dataset, producing a complete and accurate dataset. Comparative experiments were then run on this complete dataset with different missing attributes, different missing rates, and different numbers of iterations, and the model was assessed with several evaluation metrics. The results show that: the complete dataset produced by the random-forest-based chained-equation multiple imputation model closely resembles the original dataset; imputation experiments on different missing columns show that the model readily handles mixed-type data and learns the correlations among parameters on its own, which reduces imputation complexity; merging the n datasets formed after iteration into one final dataset through analysis improves imputation accuracy; and experiments with different numbers of iterations on the initially imputed complete dataset show that the data correlation converges once the number of iterations exceeds a certain value.
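The procedure summarized in the abstract (chained equations, a random forest predictor per column, predictive mean matching, and n imputed datasets merged into one) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `mice_rf_pmm`, the parameters `n_donors` and `n_iter`, the column-mean initial fill, the synthetic numeric data, and the use of a plain average to merge the n datasets are all assumptions made for illustration; the abstract does not state how the datasets are merged, and averaging would not apply to categorical columns.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def mice_rf_pmm(X, n_datasets=3, n_iter=3, n_donors=5, seed=0):
    """Return n_datasets imputed copies of X, where NaN marks a missing value.

    Hypothetical helper: chained equations with a random forest per column
    and predictive mean matching (PMM) to draw each imputed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    rng = np.random.default_rng(seed)
    col_means = np.nanmean(X, axis=0)
    results = []
    for _ in range(n_datasets):
        # Initial fill with column means (assumed starting point).
        filled = np.where(missing, col_means, X)
        for _ in range(n_iter):
            for j in range(X.shape[1]):
                miss_j = missing[:, j]
                if not miss_j.any():
                    continue
                # Predict column j from all other columns (current fill).
                others = np.delete(filled, j, axis=1)
                forest = RandomForestRegressor(
                    n_estimators=30,
                    random_state=int(rng.integers(1 << 31)),
                )
                observed = X[~miss_j, j]
                forest.fit(others[~miss_j], observed)
                pred_obs = forest.predict(others[~miss_j])
                pred_mis = forest.predict(others[miss_j])
                # PMM: pick one of the n_donors observed rows whose
                # prediction is closest and copy its observed value.
                k = min(n_donors, observed.size)
                for row, p in zip(np.flatnonzero(miss_j), pred_mis):
                    donors = np.argsort(np.abs(pred_obs - p))[:k]
                    filled[row, j] = observed[rng.choice(donors)]
        results.append(filled)
    return results


# Synthetic, correlated columns standing in for ventilation parameters.
rng = np.random.default_rng(42)
a = rng.uniform(1.0, 10.0, size=120)
b = 2.0 * a + rng.normal(0.0, 0.5, size=120)
c = 0.3 * a**2 + rng.normal(0.0, 0.5, size=120)
X_full = np.column_stack([a, b, c])

# Remove roughly 10% of the entries at random.
X_missing = X_full.copy()
X_missing[rng.random(X_full.shape) < 0.1] = np.nan

datasets = mice_rf_pmm(X_missing, n_datasets=3, n_iter=3)
# Assumed merge rule: element-wise average of the n imputed datasets.
final = np.mean(datasets, axis=0)

hidden = np.isnan(X_missing)
rmse = np.sqrt(np.mean((final[hidden] - X_full[hidden]) ** 2))
print(f"RMSE on artificially removed entries: {rmse:.3f}")
```

Because predictive mean matching copies values from observed rows, every imputed entry stays within the observed range of its column, which is one reason the technique is often preferred over inserting raw model predictions.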

     
