Abstract: The automatic recognition of wagon numbers plays an important role in railroad transportation systems. However, the wagon number occupies only a very small area of the entire wagon image and is often affected by uneven illumination, complex backgrounds, image contamination, and broken character strokes, which makes high-precision automatic recognition difficult. In recent years, object detection algorithms based on deep learning have made great progress, providing a solid technical basis for improving the performance of wagon number recognition. This paper proposes an efficient two-phase wagon number recognition algorithm based on the high-performance YOLOv3 object detection algorithm. In the first phase, the wagon number region is detected in a low-resolution global image; in the second phase, the individual characters are detected in a high-resolution local image and assembled into a 12-character wagon number according to their spatial positions, and the final number is obtained after verification based on the recognition confidence of each character and the wagon number coding rules. In addition, we propose a network-pruning algorithm that combines the batch normalization scale factor with filter correlation: the importance of every filter is computed from the correlation between filter weights and the scale factor generated by batch normalization. By pruning and retraining both the region detection model and the character detection model, storage occupation and computational complexity are reduced without sacrificing recognition accuracy (which even improves slightly in our experiments). On 1072 images collected from practical engineering application scenarios, the proposed algorithm achieves an overall correct rate of 96.92% (a result counts as correct only if all 12 characters are detected and recognized correctly), with an average recognition time of only 191 ms.
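The pruning criterion described above combines each filter's batch-normalization scale factor with its correlation to the other filters in the same layer. The exact combination formula is not given in this excerpt; the sketch below uses an illustrative assumption, scoring each filter as |γ| scaled by one minus its maximum cosine similarity to any other filter, so that filters which are both weakly scaled and redundant are pruned first.

```python
import numpy as np

def filter_importance(gammas, filters):
    """Per-filter importance score (illustrative assumption, not the
    paper's exact formula).

    gammas  : (n,) batch-norm scale factors of one conv layer
    filters : (n, k) flattened filter weights of the same layer
    """
    # Normalize filter vectors so the dot product is cosine similarity.
    w = filters / (np.linalg.norm(filters, axis=1, keepdims=True) + 1e-12)
    sim = np.abs(w @ w.T)          # |cosine similarity| between filters
    np.fill_diagonal(sim, 0.0)     # ignore self-similarity
    redundancy = sim.max(axis=1)   # how replaceable each filter is
    # A filter matters if its BN scale is large AND it is not redundant.
    return np.abs(gammas) * (1.0 - redundancy)

def prune_mask(gammas, filters, ratio=0.5):
    """Boolean keep-mask that drops the lowest-importance `ratio` of filters."""
    score = filter_importance(gammas, filters)
    k = int(len(score) * ratio)
    mask = np.ones(len(score), dtype=bool)
    mask[np.argsort(score)[:k]] = False   # drop the k least important
    return mask
```

In this scoring, a filter duplicated elsewhere in the layer gets redundancy near 1 and hence importance near 0 regardless of its γ, which is what distinguishes the combined criterion from pruning on the BN scale factor alone.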
Key words:
- pattern recognition /
- wagon number recognition /
- deep learning /
- neural network /
- object detection /
- model pruning
Table 1. Results of wagon number region detection and character detection
| Phase | Detection model | Pruning | mAP/% | Model size/MB | Runtime memory/MB | Mean time/ms |
| --- | --- | --- | --- | --- | --- | --- |
| Region detection | YOLOv3 | N | 95.31 | 241 | 1625 | 44.06 |
| Region detection | YOLOv3 | Y | 95.37 | 93 | 1155 | 31.41 |
| Region detection | Faster-RCNN | – | 95.38 | 323 | 1121 | 103.16 |
| Region detection | SSD | – | 95.20 | 192 | 833 | 62.01 |
| Character detection | YOLOv3 | N | 90.39 | 241 | 1307 | 29.22 |
| Character detection | YOLOv3 | Y | 90.69 | 84 | 881 | 18.43 |
| Character detection | Faster-RCNN | – | 90.68 | 323 | 1161 | 110.09 |
| Character detection | SSD | – | 90.40 | 201 | 851 | 62.49 |
Table 2. Influence of model pruning and verification on the recognition results
| Verification & correction | Pruning | Accuracy rate/% | Error rate/% | Rejection rate/% | Mean time/ms |
| --- | --- | --- | --- | --- | --- |
| N | N | 93.56 | 4.76 | 1.68 | 221 |
| Y | N | 96.36 | 1.96 | 1.68 | 224 |
| Y | Y | 96.92 | 2.15 | 0.93 | 191 |
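The verification step that lifts the accuracy rate in Table 2 checks the assembled 12-digit number against the wagon number coding rules. The paper's exact rules are not reproduced in this excerpt; the sketch below assumes a Luhn-style self-check digit over the first 11 digits, as used in UIC wagon numbering, where the 12th digit completes a weighted digit sum to a multiple of 10.

```python
def uic_check_digit(number11: str) -> int:
    """Luhn-style check digit over the first 11 digits (assumed rule):
    weights 2,1,2,1,... applied from the right; the digits of each
    product are summed; the check digit completes the total to a
    multiple of 10."""
    total = 0
    for i, ch in enumerate(reversed(number11)):
        prod = int(ch) * (2 if i % 2 == 0 else 1)
        total += prod // 10 + prod % 10   # digit sum of the product
    return (10 - total % 10) % 10

def verify_wagon_number(number12: str) -> bool:
    """True iff the 12-digit string is self-consistent under the
    assumed check-digit rule."""
    return (len(number12) == 12 and number12.isdigit()
            and uic_check_digit(number12[:11]) == int(number12[-1]))
```

A check of this kind rejects most single-character recognition errors, which is consistent with Table 2: enabling verification and correction converts error cases into either corrected or rejected ones.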