Sound recognition method of an anti-UAV system based on a convolutional neural network
Abstract: With the rapid growth of the UAV market, UAVs are now widely used in aerial photography, agricultural plant protection, power-line inspection, forest fire prevention, high-altitude firefighting, emergency communication, and logistics. However, "black flight" incidents involving unlicensed and unregulated flights occur frequently and pose severe security risks to civil airports, sensitive targets, and major events. Moreover, because UAVs are maneuverable, intelligently controlled, and inexpensive, they are easily used for criminal activity, which threatens public and national security. Detecting UAVs effectively and taking countermeasures against them, especially against "black-flying" UAVs, is a difficult and pressing problem and an important research topic for anti-UAV systems. UAV identification is one of the key technologies in such systems. To address the problem of recognizing UAVs, a sound-recognition method based on a convolutional neural network (CNN) is proposed. Detection based on acoustic signals is not easily affected by UAV size, occlusion, ambient light, or ground clutter, and because sound is an inherent attribute of a UAV, the approach also applies to UAVs in a radio-silent state. In this study, the sounds of UAVs, birds, and human voices within 100 m were collected and preprocessed, and mel-frequency cepstral coefficient (MFCC) and gammatone frequency cepstral coefficient (GFCC) features were extracted and used as the dataset for training and testing the CNN. Support vector machine (SVM) and CNN models were then designed to recognize UAV sounds and the other sounds. The experimental results show that the SVM and CNN accuracies are 91.9% and 96.5%, respectively. To further verify the recognition ability of the designed CNN, it was tested on part of the UrbanSound8K dataset, where its accuracy reached 90%. The results show that a CNN is feasible for UAV recognition and outperforms an SVM.
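MFCC and GFCC differ mainly in the frequency warping of their filter banks: MFCC filters are spaced on the mel scale, while gammatone filters are commonly spaced on the ERB-rate scale. The following is a minimal sketch of that difference, not the authors' feature extractor. The filter count (13), the 50 Hz to 8000 Hz band, and all function names are illustrative assumptions that do not come from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Mel scale: m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hz_to_erb_rate(f_hz):
    """ERB-rate scale (Glasberg and Moore): E = 21.4 * log10(1 + 0.00437 * f)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def center_frequencies(n_filters, f_min, f_max, to_scale, from_scale):
    """Return n_filters center frequencies equally spaced on the warped scale."""
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings chosen for illustration only (not from the paper).
N_FILTERS, F_MIN, F_MAX = 13, 50.0, 8000.0

mel_centers = center_frequencies(N_FILTERS, F_MIN, F_MAX, hz_to_mel, mel_to_hz)
erb_centers = center_frequencies(
    N_FILTERS, F_MIN, F_MAX, hz_to_erb_rate, erb_rate_to_hz
)

for m, e in zip(mel_centers, erb_centers):
    print(f"mel center: {m:8.1f} Hz    ERB-rate center: {e:8.1f} Hz")
```

Both scales place more filters at low frequencies. A full extractor would additionally frame the signal, apply the filter bank to each frame's spectrum, take logarithms (or a root compression for GFCC), and apply a discrete cosine transform.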
Key words: UAV; voice detection; public security; MFCC eigenvalue; GFCC eigenvalue; convolutional neural network
Table 1. CNN parameter settings

| Layer | Input dimension | Output dimension | Sampling window | Function selection |
| --- | --- | --- | --- | --- |
| Input layer | [99,26] | | | |
| Convolution layer 1 | [99,26] | [99,26,32] | 5×5, stride=1, padding=same, convolution kernels=32 | Activation function ReLU |
| Pool layer 1 | [99,26,32] | [50,13,32] | 2×2, stride=2 | |
| Convolution layer 2 | [50,13,32] | [50,13,64] | 5×5, stride=1, padding=same, convolution kernels=64 | Activation function ReLU |
| Pool layer 2 | [50,13,64] | [25,7,64] | 2×2, stride=2 | |
| Full connection layer 1 | [25,7,64] | [1,10] | | |
| Full connection layer 2 | [1,10] | [1,10] | | |
| Output layer | [1,10] | [1,3] | | Softmax |
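The dimensions in Table 1 can be checked with a few lines of shape arithmetic. The sketch below assumes a single-channel [99,26] input and ceiling rounding in the pooling layers (equivalent to "same" padding). The table does not state the pooling padding, but ceiling rounding is the reading under which 99 → 50 and 13 → 7 hold. The second convolution is given 64 kernels so that its output matches the [50,13,64] listed in the table. Function names are hypothetical.

```python
import math


def conv2d_same(shape, n_kernels, stride=1):
    """Output shape of a 2-D convolution with 'same' padding.

    With 'same' padding each spatial size becomes ceil(size / stride),
    regardless of the kernel size (5x5 here).
    """
    h, w, _ = shape
    return (math.ceil(h / stride), math.ceil(w / stride), n_kernels)


def pool2d_ceil(shape, stride=2):
    """Output shape of 2x2 pooling with stride 2 and ceiling rounding (assumed)."""
    h, w, c = shape
    return (math.ceil(h / stride), math.ceil(w / stride), c)


x = (99, 26, 1)  # assumption: one input channel
shapes = []

x = conv2d_same(x, n_kernels=32)
shapes.append(("Convolution layer 1", x))
x = pool2d_ceil(x)
shapes.append(("Pool layer 1", x))
x = conv2d_same(x, n_kernels=64)  # 64 inferred from the listed output [50,13,64]
shapes.append(("Convolution layer 2", x))
x = pool2d_ceil(x)
shapes.append(("Pool layer 2", x))

# Number of values fed to the first fully connected layer after flattening.
flattened = x[0] * x[1] * x[2]

for name, shape in shapes:
    print(f"{name}: {list(shape)}")
print("Flattened input to full connection layer 1:", flattened)
```

With floor rounding ("valid" pooling) the first pooling layer would instead give 49 × 13, which does not match the table.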
Table 2. Number of audio samples

| Sample | Training set (pieces) | Test set (pieces) |
| --- | --- | --- |
| UAV | 1500 | 300 |
| Bird | 1500 | 300 |
| People | 1500 | 300 |
Table 3. Experimental results of different models

| Model | Accuracy /% |
| --- | --- |
| CNN | 96.5 |
| SVM | 91.9 |
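To express the gap in Table 3 in terms of errors rather than accuracy, the short calculation below converts both accuracies to error rates and computes the relative reduction. It assumes, as a reading of Table 2 rather than something the source states, that both models were scored on the same 900-sample test set (3 classes × 300 samples). Variable names are illustrative.

```python
# Accuracies in percent, as listed in Table 3.
accuracy = {"CNN": 96.5, "SVM": 91.9}

# Assumption: both models were evaluated on the full test set of Table 2.
N_TEST = 3 * 300

# Error rate in percent for each model.
error_rate = {model: 100.0 - acc for model, acc in accuracy.items()}

# Approximate number of misclassified test samples under the assumption above.
approx_errors = {model: N_TEST * err / 100.0 for model, err in error_rate.items()}

# Fraction of the SVM's errors that the CNN removes.
relative_reduction = (error_rate["SVM"] - error_rate["CNN"]) / error_rate["SVM"]

for model in accuracy:
    print(
        f"{model}: error rate {error_rate[model]:.1f}%, "
        f"about {approx_errors[model]:.1f} of {N_TEST} samples"
    )
print(f"Relative error reduction (CNN vs SVM): {relative_reduction:.1%}")
```

On these figures the CNN's error rate (3.5%) is less than half of the SVM's (8.1%).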
Table 4. Test-set accuracy for different numbers of convolution layers

| Number of layers | Accuracy /% | Training time /s | Number of iterations |
| --- | --- | --- | --- |
| 2 | 96.52225 | 26580.6 | 1500 |
| 3 | 96.53334 | 41907.1 | 1700 |
| 4 | 96.53334 | 76055.3 | 2000 |
| 5 | 96.56667 | 126223.5 | 2500 |
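The trade-off in Table 4 is easier to see when the accuracy gain is set against the training cost. The sketch below uses only the values listed in the table and derives the time per iteration, the accuracy gain over the two-layer network, and the training-time ratio. The field names are illustrative.

```python
# (number of layers, accuracy in %, training time in s, iterations), from Table 4.
rows = [
    (2, 96.52225, 26580.6, 1500),
    (3, 96.53334, 41907.1, 1700),
    (4, 96.53334, 76055.3, 2000),
    (5, 96.56667, 126223.5, 2500),
]

_, base_acc, base_time, _ = rows[0]  # the two-layer network is the baseline

summary = []
for layers, acc, seconds, iterations in rows:
    summary.append(
        {
            "layers": layers,
            "sec_per_iter": seconds / iterations,  # average cost of one iteration
            "gain_points": acc - base_acc,         # gain in percentage points
            "time_ratio": seconds / base_time,     # total time vs. baseline
        }
    )

for s in summary:
    print(
        f"{s['layers']} layers: {s['sec_per_iter']:.2f} s/iteration, "
        f"+{s['gain_points']:.5f} points, {s['time_ratio']:.2f}x training time"
    )
```

Going from two to five convolution layers raises accuracy by about 0.04 percentage points while multiplying the training time by roughly 4.7, which is consistent with the two-layer design in Table 1.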