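Sound recognition method of an anti-UAV system based on a convolutional neural network

Xue Shan¹,², Li Guangqing¹, Lü Qiongying¹, Mao Yiwei¹

doi: 10.13374/j.issn2095-9389.2020.06.30.008

1. School of Mechanical and Electrical Engineering, Changchun University of Science and Technology, Changchun 130022, China
2. Chongqing Research Institute, Changchun University of Science and Technology, Chongqing 400000, China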

Funding: Key Science and Technology R&D Funding Project of Jilin Province (20180201058SF)

Corresponding author: E-mail: 1660348815@qq.com

CLC number: TP391
Publication history
- Received: 2020-06-30
- Published: 2020-11-25

Abstract: With the rapid growth of the UAV market, UAVs have been widely used in aerial photography, agricultural plant protection, power inspection, forest fire prevention, high-altitude firefighting, emergency communication, and logistics. However, unlicensed and uncontrolled “black flight” incidents occur frequently and pose severe security risks to civil aviation airports, sensitive targets, and major events. Moreover, because UAVs are maneuverable, intelligently controlled, and inexpensive, they can easily be used for criminal activities, which threatens public and national security. Effectively detecting and countering UAVs, especially “black-flying” UAVs, is an active and difficult problem that urgently needs to be solved, and it is an important research area in the field of anti-UAV systems. The research and development of anti-UAV systems is an important focus in national public security, and UAV identification is one of their key technologies.

To address the problem of recognizing UAVs, a sound-recognition method based on a convolutional neural network (CNN) is proposed. Anti-UAV technology based on acoustic signals is not easily affected by UAV size, occlusion, ambient light, or ground clutter, and because sound is an inherent attribute of UAVs, the method also applies to UAVs in a radio-silence state. In this study, UAV sounds, bird sounds, and human voices within 100 m were collected and preprocessed; mel frequency cepstral coefficient (MFCC) and gammatone frequency cepstral coefficient (GFCC) features were then extracted, and the combined MFCC+GFCC feature parameters served as the dataset for learning and recognition. Support vector machine (SVM) and CNN models were designed to recognize UAV sounds and the other sounds. The experimental results show that the SVM and CNN accuracies are 91.9% and 96.5%, respectively. To further verify the recognition ability of the designed CNN, it was tested on a subset of the UrbanSound8K dataset, where its accuracy reached 90%. The experimental results show that a CNN is feasible for UAV recognition and that it achieves better recognition performance than an SVM.

Keywords:
- UAV /
- sound detection /
- public security /
- MFCC features /
- GFCC features /
- convolutional neural network

Figure 1. Pre-emphasis of a UAV sound sample
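The pre-emphasis step named in Figure 1 is commonly implemented as a first-order high-pass filter, y[n] = x[n] - α·x[n-1]. The sketch below is an illustration only: the function name `pre_emphasis` and the coefficient α = 0.97 are assumptions, since this page does not state the value the authors used.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, commonly used value."""
    signal = np.asarray(signal, dtype=float)
    # The first sample has no predecessor, so it is passed through unchanged.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

# A constant (low-frequency) signal is strongly attenuated after the first sample.
print(pre_emphasis([1.0, 1.0, 1.0]))
```

Boosting the high-frequency part in this way is a usual first step before framing and cepstral feature extraction.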

Figure 2. UAV sound sample after applying a Hamming window
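Windowing as in Figure 2 is normally applied frame by frame. The following is a minimal sketch of framing plus a Hamming window, w[n] = 0.54 - 0.46·cos(2πn/(N-1)). The frame length (400 samples) and hop (160 samples), which correspond to 25 ms and 10 ms at 16 kHz, are assumed values for illustration and are not taken from the paper.

```python
import numpy as np

def frame_and_window(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames and apply a Hamming window to each.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    The signal must contain at least frame_len samples.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    n = np.arange(frame_len)
    window = 0.54 - 0.46 * np.cos(2 * np.pi * n / (frame_len - 1))
    return signal[idx] * window

frames = frame_and_window(np.ones(16000))
print(frames.shape)
```

The window tapers each frame toward its edges, which reduces spectral leakage in the subsequent Fourier transform.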

Figure 3. Conversion curve between linear frequency and mel frequency
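A conversion curve like the one in Figure 3 is usually generated from the widely used formula mel = 2595·log10(1 + f/700). The sketch below assumes that formula; this page does not show which variant the authors used, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Convert linear frequency (Hz) to mel using the common 2595*log10(1 + f/700) form."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

# Sample the curve over 0 to 8 kHz; it is close to linear at low frequencies
# and approximately logarithmic at high frequencies.
freqs = np.linspace(0, 8000, 9)
print(np.round(hz_to_mel(freqs), 1))
```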

Figure 4. Amplitude-frequency characteristics of the gammatone filter
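An amplitude-frequency plot such as Figure 4 can be reproduced from the standard gammatone impulse response g(t) = t^(n-1)·exp(-2πbt)·cos(2πf_c t). The sketch below assumes a fourth-order filter, the Glasberg and Moore ERB approximation, and b = 1.019·ERB(f_c); all of these, along with the sampling rate and function names, are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs=16000, duration=0.05, order=4):
    """Impulse response of one gammatone filter centered at fc (all defaults assumed)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

def magnitude_response(ir, fs=16000, n_fft=4096):
    """Peak-normalized magnitude spectrum of an impulse response."""
    mag = np.abs(np.fft.rfft(ir, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, mag / mag.max()

freqs, mag = magnitude_response(gammatone_ir(1000.0))
print(freqs[np.argmax(mag)])  # the peak lies close to the 1 kHz center frequency
```

Repeating this for a set of center frequencies gives the overlapping band-pass curves of a gammatone filter bank.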

Figure 5. Feature spectrograms: (a) mel frequency cepstral coefficient (MFCC) + gammatone frequency cepstral coefficient (GFCC) features; (b) MFCC features; (c) GFCC features

Figure 6. Schematic of support vector machine (SVM) classification

Figure 7. Structure of the designed CNN

Figure 8. Sample collection experiments: (a) sample collection in a parking lot during the day; (b) sample collection on a playground at night

Figure 9. CNN results: (a) Python output; (b) test-set recognition accuracy curve

Figure 10. SVM results

Figure 11. Experimental results on a subset of the UrbanSound8K dataset: (a) Python output; (b) recognition accuracy curve

Table 1. CNN parameter settings

Layer	Input dimension	Output dimension	Kernel / window settings	Function
Input layer		[99,26]
Convolution layer 1	[99,26]	[99,26,32]	5×5, stride=1, padding=same, kernels=32
Activation function				ReLU
Pooling layer 1	[99,26,32]	[50,13,32]	2×2, stride=2
Convolution layer 2	[50,13,32]	[50,13,64]	5×5, stride=1, padding=same, kernels=64
Activation function				ReLU
Pooling layer 2	[50,13,64]	[25,7,64]	2×2, stride=2
Fully connected layer 1	[25,7,64]	[1,10]
Fully connected layer 2	[1,10]	[1,10]
Output layer	[1,10]	[1,3]		Softmax
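The dimensions in Table 1 can be checked with simple shape arithmetic. The sketch below assumes "same" padding for both convolution and pooling, so that each output size is ceil(input / stride), and takes the second convolution layer's filter count as 64, as implied by its output dimension [50,13,64]. The helper names are hypothetical.

```python
import math

def same_out(size, stride):
    """Output length along one axis under 'same' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

def cnn_shapes(height=99, width=26, filters=(32, 64)):
    """Trace feature-map shapes through conv (stride 1) + 2x2 pooling (stride 2) stages."""
    shapes = [(height, width)]
    h, w = height, width
    for f in filters:
        h, w = same_out(h, 1), same_out(w, 1)   # 5x5 convolution, stride 1
        shapes.append((h, w, f))
        h, w = same_out(h, 2), same_out(w, 2)   # 2x2 pooling, stride 2
        shapes.append((h, w, f))
    return shapes

for shape in cnn_shapes():
    print(shape)
```

The trace reproduces the table's sequence [99,26] → [99,26,32] → [50,13,32] → [50,13,64] → [25,7,64]. The odd sizes 99 → 50 and 13 → 7 are consistent with pooling that rounds up, which is why "same" padding is assumed here. Flattening the last feature map gives 25 × 7 × 64 = 11200 inputs to the first fully connected layer.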

Table 2. Number of audio samples per class

Sample class	Training set (samples)	Test set (samples)
UAV	1500	300
Bird	1500	300
Human	1500	300

Table 3. Experimental results of different models

Model	Accuracy (%)
CNN	96.5
SVM	91.9

Table 4. Test-set accuracy for different numbers of convolution layers

Number of convolution layers	Accuracy (%)	Training time (s)	Number of iterations
2	96.52225	26580.6	1500
3	96.53334	41907.1	1700
4	96.53334	76055.3	2000
5	96.56667	126223.5	2500
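The trade-off in Table 4 is easier to see as accuracy gain versus training cost relative to the two-layer network. The short calculation below uses only the values from the table; the function and field names are hypothetical.

```python
# Rows of Table 4: (convolution layers, test accuracy in %, training time in s, iterations).
TABLE_4 = [
    (2, 96.52225, 26580.6, 1500),
    (3, 96.53334, 41907.1, 1700),
    (4, 96.53334, 76055.3, 2000),
    (5, 96.56667, 126223.5, 2500),
]

def compare_to_baseline(rows):
    """Compare every deeper network against the first (two-layer) row."""
    _, base_acc, base_time, _ = rows[0]
    results = []
    for layers, acc, time_s, iters in rows[1:]:
        results.append({
            "layers": layers,
            "accuracy_gain": acc - base_acc,          # percentage points
            "time_ratio": time_s / base_time,         # relative training cost
            "seconds_per_iteration": time_s / iters,
        })
    return results

for row in compare_to_baseline(TABLE_4):
    print(row)
```

On these figures, going from two to five convolution layers raises test accuracy by about 0.04 percentage points while training takes roughly 4.7 times as long.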
