Abstract: To address the data scarcity in aerial video scene modeling and the loss of modeling detail in existing algorithms, a flight-scene data acquisition method based on a simulation engine is proposed, and the 3D Gaussian Splatting (3DGS) algorithm is adopted to model and render aerial scenes. First, a simulation engine is used to build datasets covering different flight trajectories, flight speeds, and environmental conditions in three typical scenes (urban area, beach, and river bridge), and aerial video frames are sampled from the footage. Then, a sparse model of each scene is recovered via structure from motion (SfM), after which the 3DGS algorithm performs dense modeling and rendering. Finally, quantitative comparison and qualitative analysis experiments are conducted. The results show that, compared with the classical COLMAP algorithm, the proposed method yields better models with more complete image detail; linear, grid, and spiral flight trajectories all achieve high image quality, with the spiral trajectory generalizing best across scenes; flight speeds below 30 m/s yield high-quality models (PSNR > 32 dB, SSIM > 0.93, LPIPS < 0.13); and high image quality is obtained under all tested environmental conditions, with the best results under ideal daytime lighting on sunny days. This work provides a high-precision, high-robustness modeling solution for aerial video, whose flexible camera poses, fast flight speeds, and large scenes make it difficult to model.
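The sparse-modeling step described above (structure from motion over the sampled frames) is what the COLMAP toolchain automates. The paper does not publish its commands, so the following is only a minimal sketch of a standard COLMAP CLI run; the directory names are hypothetical, and sequential matching is chosen because the input consists of temporally ordered video frames.

```python
import subprocess
from pathlib import Path

# Hypothetical layout; substitute the actual sampled-frame directory.
frames_dir = Path("scene_urban/frames")
sparse_dir = Path("scene_urban/sparse")
sparse_dir.mkdir(parents=True, exist_ok=True)
database = sparse_dir / "database.db"

def colmap(*args: str) -> None:
    """Run one COLMAP CLI subcommand, failing loudly on error."""
    subprocess.run(["colmap", *args], check=True)

# 1. Extract SIFT features from every sampled frame.
colmap("feature_extractor",
       "--database_path", str(database),
       "--image_path", str(frames_dir))

# 2. Match features between neighboring frames; sequential
#    matching suits ordered aerial video.
colmap("sequential_matcher", "--database_path", str(database))

# 3. Incremental SfM: recover camera poses and a sparse point
#    cloud, the usual initialization for 3DGS dense modeling.
colmap("mapper",
       "--database_path", str(database),
       "--image_path", str(frames_dir),
       "--output_path", str(sparse_dir))
```

The resulting poses and sparse point cloud are the standard inputs that the reference 3DGS implementation optimizes into the dense Gaussian model.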
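The reported thresholds (PSNR > 32 dB, SSIM > 0.93, LPIPS < 0.13) use standard image-quality metrics, so they can be recomputed with off-the-shelf implementations. A minimal sketch, assuming one rendered frame and its ground-truth counterpart on disk; the file names are placeholders, not the paper's data:

```python
import numpy as np
import torch
import lpips  # pip install lpips
from imageio.v3 import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names; substitute rendered / ground-truth frames.
rendered = imread("render_000.png")    # H x W x 3, uint8
reference = imread("gt_000.png")

# PSNR and SSIM operate directly on the 8-bit RGB arrays.
psnr = peak_signal_noise_ratio(reference, rendered, data_range=255)
ssim = structural_similarity(reference, rendered,
                             channel_axis=-1, data_range=255)

def to_tensor(img: np.ndarray) -> torch.Tensor:
    """Convert HWC uint8 to the NCHW float tensor in [-1, 1] that LPIPS expects."""
    return (torch.from_numpy(img).permute(2, 0, 1).float() / 127.5 - 1.0).unsqueeze(0)

# LPIPS with the AlexNet backbone, the library's common default.
loss_fn = lpips.LPIPS(net="alex")
lpips_val = loss_fn(to_tensor(rendered), to_tensor(reference)).item()

print(f"PSNR={psnr:.2f} dB  SSIM={ssim:.4f}  LPIPS={lpips_val:.4f}")
```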
[1] LI Y L. Research on structure from motion based on UAV image and video [D]. Harbin: Harbin University of Science and Technology, 2017: 1-5.
[2] JIANG G L. Research on multi-view stereo three-dimensional point cloud reconstruction techniques with UAV [D]. Guanghan: Civil Aviation Flight University of China, 2024: 1-2.
[3] WANG S J. Neural radiance fields based 3D reconstruction for UAV images [D]. Chengdu: University of Electronic Science and Technology of China, 2024: 3-4.
[4] LI Y, HU Q W, WU M, et al. Extraction and simplification of building façade pieces from mobile laser scanner point clouds for 3D street view services [J]. ISPRS International Journal of Geo-Information, 2016, 5(12): 231.
[5] WANG Y, LIU K, HAO Q, et al. Period coded phase shifting strategy for real-time 3D structured light illumination [J]. IEEE Transactions on Image Processing, 2011, 20(11): 3001-3013.
[6] JWA Y, SOHN G, KIM H B. Automatic 3D powerline reconstruction using airborne LiDAR data [J]. Engineering, Environmental Science, 2009: 105-110.
[7] XIONG S B, WANG Q, LIU G J. A review of the development and current status of 3D reconstruction technology [J]. Computer Knowledge and Technology, 2022, 18(36): 114-117.
[8] IRSCHARA A, ZACH C, FRAHM J M, et al. From structure-from-motion point clouds to fast location recognition [C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE, 2009: 2599-2606.
[9] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis [C]//Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 405-421.
[10] KERBL B, KOPANAS G, LEIMKUEHLER T, et al. 3D Gaussian splatting for real-time radiance field rendering [J]. ACM Transactions on Graphics, 2023, 42(4): 1-14.
[11] SEDAGHAT A, MOKHTARZADE M, EBADI H. Uniform robust scale-invariant feature matching for optical remote sensing images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2011, 49(11): 4516-4527.
[12] ALIAKBARPOUR H, PALANIAPPAN K, SEETHARAMAN G. Fast structure from motion for sequential and wide area motion imagery [C]//2015 IEEE International Conference on Computer Vision Workshop. Piscataway, NJ, USA: IEEE, 2015: 1086-1093.
[13] XU Z, WU L X, GERKE M, et al. Skeletal camera network embedded structure-from-motion for 3D scene reconstruction from UAV images [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 121: 113-127.
[14] JIANG S, JIANG W S. Efficient structure from motion for oblique UAV images based on maximal spanning tree expansion [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2017, 132: 140-161.
[15] SUN Y B, SUN H B, YAN L, et al. RBA: reduced bundle adjustment for oblique aerial photogrammetry [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 121: 128-142.
[16] KONOLIGE K. Sparse sparse bundle adjustment [C]//Proceedings of the British Machine Vision Conference. Edinburgh: BMVA Press, 2010: 102.
[17] BARRON J T, MILDENHALL B, TANCIK M, et al. Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields [EB/OL]. (2021-08-14) [2024-11-27]. http://arxiv.org/abs/2103.13415v3.
[18] VERBIN D, HEDMAN P, MILDENHALL B, et al. Ref-NeRF: structured view-dependent appearance for neural radiance fields [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024: 1-12.
[19] HEDMAN P, SRINIVASAN P P, MILDENHALL B, et al. Baking neural radiance fields for real-time view synthesis [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024: 1-12.
[20] CHEN Z Q, FUNKHOUSER T, HEDMAN P, et al. MobileNeRF: exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures [EB/OL]. (2023-05-30) [2024-11-27]. http://arxiv.org/abs/2208.00277v5.
[21] NAVANEET K, MEIBODI K P, KOOHPAYEGANI S A, et al. CompGS: smaller and faster gaussian splatting with vector quantization [EB/OL]. (2024-09-26) [2024-11-27]. http://arxiv.org/abs/2311.18159v3.
[22] DURVASULA S, ZHAO A, CHEN F, et al. DISTWAR: fast differentiable rendering on raster-based rendering pipelines [EB/OL]. (2023-12-01) [2024-11-27]. http://arxiv.org/abs/2401.05345v1.
[23] YAN Z W, LOW W F, CHEN Y, et al. Multi-scale 3D gaussian splatting for anti-aliased rendering [EB/OL]. (2024-05-29) [2024-11-27]. http://arxiv.org/abs/2311.17089v2.
[24] SCHONBERGER J L, FRAHM J M. Structure-from-motion revisited [C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ, USA: IEEE, 2016: 4101-4113.
[25] FISCHLER M A, BOLLES R C. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography [J]. Readings in Computer Vision, 1987: 726-740.
[26] SNAVELY N, SEITZ S M, SZELISKI R. Modeling the world from internet photo collections [J]. International Journal of Computer Vision, 2008, 80: 189-210.
[27] TRIGGS B, MCLAUCHLAN P F, HARTLEY R I, et al. Bundle adjustment - a modern synthesis [J]. Vision Algorithms: Theory and Practice, 2000: 298-372.
[28] State Council of the People's Republic of China, Central Military Commission of the People's Republic of China. Interim regulations on the flight management of unmanned aircraft [EB/OL]. (2023-05-31) [2024-10-16]. http://www.gov.cn/zhengce/zhengceku/202306/content-6888800.htm.
Basic information:
DOI: 10.20189/j.cnki.CN/61-1527/E.202502001
CLC number: TP391.41; V19
Citation:
[1] HE T Q, SONG J J, CHENG J C, et al. Aerial scene modeling based on 3D Gaussian splatting [J]. Journal of Rocket Force University of Engineering, 2025, 39(2): 1-12+21. DOI: 10.20189/j.cnki.CN/61-1527/E.202502001.
Funding:
Open Fund of the Key Laboratory of Rocket Force University of Engineering (AAIE-2023-0202)