Abstract:
Depth information is essential to the environmental perception of unmanned mining trucks operating in the complex, harsh environment of open-pit mines. Depth measurements obtained from LiDAR or stereo systems are easily degraded by dust, rain, fog, and complex lighting conditions in open-pit mines. This paper therefore proposes an obstacle depth estimation method based on thermal infrared images. First, depth completion converts the sparse depth map into a dense depth map that serves as the supervision label; monocular depth estimation is then performed on thermal infrared images, improving the ability of unmanned mining trucks to acquire depth information in harsh environments. The paper proposes a network structure that combines Laplacian pyramid depth residuals with skip connections, significantly improving depth estimation accuracy. This structure captures detail features at different levels through multi-scale depth information and uses the skip connections to strengthen detail restoration, enabling the model to better handle subtle changes in complex scenes. To further improve performance, the paper introduces a fusion module based on content-guided attention (CGA), called CGA-Driven Cross-layer Interaction (CGA-DCI), which improves the fusion of high-frequency and low-frequency information, avoids the information loss caused by simple concatenation, and enhances the accuracy and robustness of depth estimation. In addition, the proposed detail-enhanced attention blocks (DEAB) strengthen the extraction of image details, allowing the model to cope with interference from dust, rain, fog, and complex lighting in open-pit mines and to maintain high depth estimation accuracy even in harsh environments.
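As a rough illustration of the Laplacian pyramid idea referred to above, the following minimal sketch decomposes a dense depth map into per-scale residuals plus a coarse base and then reconstructs it. It is not the paper's network: the function names, the 2x2 average-pool downsampling, and the nearest-neighbour upsampling are assumptions chosen for brevity, and in a learned model the residuals would be predicted by decoder stages rather than computed from a known depth map.

```python
import numpy as np

def downsample(x):
    """Halve resolution by 2x2 average pooling (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double resolution by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(depth, levels):
    """Return [residual_0, ..., residual_{levels-1}, coarse_base].

    Each residual holds the detail lost when moving to the next coarser scale.
    """
    pyramid = []
    current = depth
    for _ in range(levels):
        coarse = downsample(current)
        pyramid.append(current - upsample(coarse))  # high-frequency residual
        current = coarse
    pyramid.append(current)  # low-frequency base
    return pyramid

def reconstruct(pyramid):
    """Rebuild the full-resolution map from coarse to fine by adding residuals."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current

# Hypothetical 16x16 depth map used only to exercise the functions.
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 50.0, size=(16, 16))
pyramid = laplacian_pyramid(depth, levels=3)
restored = reconstruct(pyramid)
```

Because each residual stores exactly what the coarser level cannot represent, summing from coarse to fine recovers the original map up to floating-point error, which is why a coarse-to-fine decoder can refine depth by adding one residual per scale.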
Experimental results show that the model, which integrates Laplacian pyramid depth residuals with skip connections together with the CGA-DCI and DEAB modules, achieves 83.3% on $\delta_{1.25}$ and an AbsRel of 15.2% for mining truck depth estimation. This performance surpasses mainstream depth estimation methods, and the model can provide reliable, real-time depth information for autonomous mining trucks in challenging environments.
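The two reported metrics are not defined in the abstract. The sketch below assumes the definitions conventionally used in monocular depth estimation: threshold accuracy is the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25, and AbsRel is the mean of |pred - gt| / gt. The function name `depth_metrics`, the masking of pixels without ground truth (`gt > 0`), and the sample values are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def depth_metrics(pred, gt, threshold=1.25):
    """Return (threshold accuracy, AbsRel) over pixels with valid ground truth.

    Assumes predictions are strictly positive and that gt == 0 marks
    pixels without a depth label.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    ratio = np.maximum(pred / gt, gt / pred)
    delta = float(np.mean(ratio < threshold))
    abs_rel = float(np.mean(np.abs(pred - gt) / gt))
    return delta, abs_rel

# Hypothetical depths in metres; the last pixel has no label and is ignored.
gt = [10.0, 20.0, 40.0, 0.0]
pred = [11.0, 30.0, 40.0, 5.0]
delta, abs_rel = depth_metrics(pred, gt)
```

In this toy case two of the three labelled pixels fall within the 1.25 ratio, so the threshold accuracy is about 0.667, and the relative errors 0.1, 0.5, and 0 average to an AbsRel of about 0.2.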