Abstract: Advances in unmanned aerial vehicle (UAV) technology and products have brought great convenience to fields such as aerial photography, plant protection, and power-line patrol, but the spread of UAVs also creates a series of management problems. Effective UAV detection, a key component of any anti-UAV system, is therefore a pressing research issue. In public environments such as parks, stadiums, and schools, the inherent characteristics of UAV targets and environmental factors make detection and tracking difficult: when a UAV is occluded by background interference such as trees, buildings, or strong light, a detector cannot extract effective target features and detection fails. Studying anti-occlusion detection and tracking algorithms for anti-UAV systems is thus of great significance for cases where occlusion would otherwise defeat detection. This paper proposes YOLOX-drone, an anti-UAV target detection algorithm improved from YOLOX-S, to address UAV targets that are deformed or partially occluded in complex scenes and therefore hard to identify. First, a UAV image dataset was built: numerous occluded drone images were collected in complex scenes, drone pictures downloaded from the internet were additionally processed to add occlusion, and all images were labeled. Second, the YOLOX-S detection network was constructed, and a coordinate attention mechanism was introduced on top of it to improve the saliency of occluded UAV targets by highlighting useful features and suppressing useless ones. Then, the bottom-up path-augmentation structure in the feature fusion layer was removed to reduce network complexity, and an adaptive feature fusion network structure was designed to strengthen the expression of useful features, suppress interference, and improve detection accuracy. On the Dalian University of Technology Anti-UAV (DUT-Anti-UAV) dataset, YOLOX-drone improves average precision (IoU = 0.5) by 3.2%, 4.7%, and 10.1% over YOLOX-S, YOLOv5-S, and YOLOX-tiny, respectively. On the self-built UAV image dataset, compared with the original YOLOX-S model, YOLOX-drone improves average precision (IoU = 0.5) by 2.4%, 2.1%, and 6.4% under no occlusion, general occlusion, and severe occlusion, respectively, demonstrating that the improved algorithm has good anti-occlusion detection ability.
Key words:
- anti-drone system
- target detection
- occlusion
- attention mechanism
- adaptive feature fusion
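The coordinate attention (CA) mechanism that the abstract introduces into YOLOX-S factorizes global pooling into two direction-aware pooling operations, so the resulting gates retain positional information along each axis; this is what lets the network re-weight features toward a partially visible airframe. The following is a minimal PyTorch sketch of the published CA design (Hou et al., 2021); the module name, reduction ratio, and activation choice are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal sketch of coordinate attention (Hou et al., 2021).

    Pooling along H and W separately keeps positional information,
    so the gates can emphasize the rows/columns where a partially
    occluded target still appears. The reduction ratio is an assumption.
    """

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        pool_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([pool_h, pool_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w
```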
-
Table 1. Feature map scaling table

| Feature map | Scale_0 | Scale_1 | Scale_2 |
| --- | --- | --- | --- |
| X0 | — | Upsample (s = 2) | Upsample (s = 4) |
| X1 | Downsample (s = 2) | — | Upsample (s = 2) |
| X2 | Downsample (s = 4) | Downsample (s = 2) | — |
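Table 1 fixes only the scale factors used to bring the three pyramid levels X0, X1, and X2 onto a common resolution before fusion. The sketch below shows that rescaling step; the concrete operators (nearest-neighbour interpolation up, adaptive average pooling down) are assumptions, since the table does not name them.

```python
import torch
import torch.nn.functional as F

def rescale_to(feat: torch.Tensor, target_hw: tuple[int, int]) -> torch.Tensor:
    """Resize a pyramid level to the target (H, W) per Table 1.

    Only the scale factors (2 and 4) come from the table; the choice
    of nearest-neighbour upsampling and adaptive average pooling for
    downsampling is an assumption.
    """
    if feat.shape[-2:] == torch.Size(target_hw):
        return feat  # the diagonal "—" entries: no rescaling needed
    if target_hw[0] > feat.shape[-2]:
        return F.interpolate(feat, size=target_hw, mode="nearest")
    return F.adaptive_avg_pool2d(feat, target_hw)
```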
Table 2. Experimental condition setting table
| N (No occlusion) | R (Reasonable) | HO (Heavy occlusion) |
| --- | --- | --- |
| v = 0 | v ∈ (0, 0.35] | v ∈ (0.35, 0.6] |
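Table 2 partitions the test data by an occlusion measure v. Assuming v is the fraction of the target box that is occluded (the table itself only gives the intervals), the split can be expressed as:

```python
def occlusion_level(v: float) -> str:
    """Assign a sample to the splits of Table 2 from its occlusion
    ratio v. That v is (occluded target area) / (total target area)
    is an assumption; the thresholds come from the table."""
    if v == 0:
        return "N"   # no occlusion
    if v <= 0.35:
        return "R"   # reasonable (general) occlusion
    if v <= 0.6:
        return "HO"  # heavy occlusion
    raise ValueError("v > 0.6 lies outside the ranges defined in Table 2")
```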
Table 3. Detection performance with different attention mechanisms (mAP@0.5, %)

| Model | N | R | HO | N+R+HO |
| --- | --- | --- | --- | --- |
| YOLOX-S | 89.3 | 82.8 | 58.1 | 77.5 |
| +SE | 89.6 | 83.5 | 61.2 | 79.1 |
| +ECA | 89.9 | 83.2 | 61.0 | 79.1 |
| +CBAM | 90.0 | 83.3 | 61.6 | 79.8 |
| +CA | 91.4 | 83.8 | 61.7 | 80.1 |
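For contrast with the CA sketch above, the channel-attention baselines in Table 3 (SE, ECA) gate channels from a single globally pooled vector and so discard spatial position, which is one plausible reading of why CA leads on the occluded splits. A minimal SE block, with the common reduction ratio of 16 as an assumption:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation gate (Hu et al., 2018): one global
    vector re-weights channels; spatial position is discarded."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        gate = self.fc(x.mean(dim=(2, 3))).view(n, c, 1, 1)  # squeeze, then excite
        return x * gate
```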
Table 4. Detection performance of YOLOX-S with different feature fusion layers (mAP@0.5, %)

| Feature fusion method | N | R | HO | N+R+HO |
| --- | --- | --- | --- | --- |
| PANet | 89.3 | 82.8 | 58.1 | 77.5 |
| FPN | 89.6 | 83.2 | 60.4 | 78.5 |
| ASCFM | 90.4 | 84.3 | 64.3 | 80.5 |
| PANet+ASCFM | 89.9 | 84.9 | 62.6 | 80.3 |
| FPN+ASCFM | 90.8 | 84.9 | 64.4 | 80.8 |
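The FPN rows of Table 4 correspond to the structural change described in the abstract: dropping PANet's extra bottom-up path and keeping only top-down fusion. A minimal top-down sketch in the style of the original FPN (Lin et al., 2017) follows; the channel widths and 1×1 lateral convolutions are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFPN(nn.Module):
    """Top-down-only fusion (the FPN rows of Table 4). YOLOX-drone
    removes PANet's bottom-up path augmentation to cut complexity;
    lateral 1x1 convolutions and nearest-neighbour upsampling follow
    Lin et al. (2017). Channel widths are assumptions."""

    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2.0, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2.0, mode="nearest")
        return p3, p4, p5
```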
Table 5. ASCFM module ablation experiment (mAP@0.5, %)

| Feature fusion method | N | R | HO | N+R+HO |
| --- | --- | --- | --- | --- |
| FPN | 89.6 | 83.2 | 60.4 | 78.5 |
| FPN+ASFM | 89.6 | 83.4 | 61.9 | 79.5 |
| FPN+ACFM | 89.9 | 84.8 | 62.6 | 80.0 |
| FPN+ASCFM | 90.8 | 84.9 | 64.4 | 80.8 |
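Table 5 suggests ASCFM combines a spatial part (ASFM) and a channel part (ACFM), each helping alone and more together. The exact wiring is not given in this excerpt, so the sketch below is speculative: it pairs ASFF-style learned per-pixel fusion weights (Liu et al., 2019) with an SE-style channel gate, reusing the SEBlock sketch after Table 3.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Speculative sketch of an ASCFM-like block for one output scale.

    Inputs: the three pyramid maps, already rescaled per Table 1 and
    projected to a shared channel count. ASFF-style spatial weights
    are followed by an SE-style channel gate; the paper's actual
    ASCFM wiring may differ."""

    def __init__(self, channels: int):
        super().__init__()
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(3)]
        )
        self.channel_gate = SEBlock(channels)  # SEBlock: see the Table 3 sketch

    def forward(self, feats):  # feats: three tensors of shape (n, c, h, w)
        logits = torch.cat(
            [conv(f) for conv, f in zip(self.weight_convs, feats)], dim=1
        )
        weights = torch.softmax(logits, dim=1)           # (n, 3, h, w), sums to 1
        fused = sum(weights[:, i:i + 1] * f for i, f in enumerate(feats))
        return self.channel_gate(fused)                  # channel re-weighting
```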
Table 6. Detection performance as each module is added step by step (mAP@0.5, %)

| FPN | ASCFM | CA | N | R | HO | N+R+HO |
| --- | --- | --- | --- | --- | --- | --- |
| | | | 89.3 | 82.8 | 58.1 | 77.5 |
| √ | | | 89.6 | 83.2 | 60.4 | 78.5 |
| √ | √ | | 90.8 | 84.9 | 64.4 | 80.8 |
| √ | √ | √ | 91.7 | 84.9 | 64.5 | 81.3 |
Table 7. Network complexity before and after improvement

| Model | Params (M) | GFLOPs | Time (ms) |
| --- | --- | --- | --- |
| YOLOX-S | 8.94 | 26.64 | 16.67 |
| YOLOX-drone | 10.82 | 29.91 | 20.41 |
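The Params and GFLOPs columns of Table 7 can be reproduced with an off-the-shelf profiler. The sketch below uses thop, which is one common choice; the tool, the stand-in model, and the 640×640 input size are assumptions, since the paper does not state its measurement setup. Note that thop counts multiply-accumulate operations, and conventions differ on whether to double them when reporting FLOPs.

```python
import torch
import torch.nn as nn
from thop import profile  # pip install thop

# Stand-in network; substitute YOLOX-S or YOLOX-drone here.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.SiLU())
dummy = torch.randn(1, 3, 640, 640)  # the 640x640 input size is an assumption
macs, params = profile(model, inputs=(dummy,))
# thop counts multiply-accumulates; some conventions report 2 * MACs as FLOPs.
print(f"Params: {params / 1e6:.2f} M   GMACs: {macs / 1e9:.2f}")
```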
Table 8. Detection performance comparison of classic detection algorithms

| Model | Params (M) | GFLOPs | mAP@0.5 (%) | Time (ms) |
| --- | --- | --- | --- | --- |
| YOLOX-tiny | 5.03 | 6.40 | 83.3 | 3.94 |
| YOLOv5-S | 7.01 | 15.80 | 88.7 | 13.00 |
| YOLOX-S | 8.94 | 26.64 | 90.2 | 16.67 |
| YOLOX-drone | 10.82 | 29.91 | 93.4 | 20.41 |
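All mAP@0.5 figures above count a detection as correct when its intersection-over-union with a ground-truth box reaches 0.5. For reference, a minimal IoU computation on (x1, y1, x2, y2) boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes; at
    mAP@0.5 a detection is a true positive when IoU >= 0.5 with a
    ground-truth box of the same class."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```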