MKL-SVM algorithm for pulmonary nodule recognition based on swarm intelligence optimization
Abstract: A support vector machine (SVM) with a single kernel cannot balance learning ability against generalization ability, and the parameters of multiple kernel functions are difficult to optimize. To address both problems, a multiple kernel learning support vector machine (MKL-SVM) algorithm based on swarm intelligence optimization is proposed. First, the effect of five single kernel functions on the classification performance of the SVM was studied: two global kernels (the polynomial and sigmoid kernels) and three local kernels (the radial basis function, exponential, and Laplacian kernels). On this basis, an MKL-SVM algorithm was constructed from a convex combination of a polynomial kernel, which has global properties, and a Laplacian kernel, which has local properties. Second, to increase particle diversity so that the search avoids local optima during iteration, and to shorten model training time, the crossover operation of the genetic algorithm (GA) was introduced into the particle swarm optimization (PSO) algorithm, and the resulting hybrid swarm intelligence algorithm was used to optimize the parameters of the MKL-SVM. Finally, deep learning features extracted with the classical VGG16 model and handcrafted features chosen on doctors' advice were each used as input to the recognition algorithm. The deep learning features were extracted by transfer learning, and principal component analysis was applied to reduce their dimensionality and hence the computational cost. The results show that deep learning features outperform handcrafted features. Therefore, this paper uses deep learning features as the input of the MKL-SVM and the hybrid of GA crossover and PSO as its optimization method. The algorithm was trained and preliminarily tested on a dataset from a cooperating hospital and, to verify its generalization ability, was further tested on the public dataset LUNA16. The experimental results show that the proposed algorithm escapes local optima easily, improves learning and generalization ability, and achieves better classification performance.
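The convex combination of a global polynomial kernel and a local Laplacian kernel described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the default parameters, and the specific kernel forms (polynomial kernel `(<x, y> + coef0) ** degree`, Laplacian kernel `exp(-||x - y|| / sigma)` with the Euclidean norm) are assumptions, since the abstract does not state them.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def polynomial_kernel(x: Vector, y: Vector, degree: int = 2, coef0: float = 1.0) -> float:
    """Global kernel (assumed form): (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def laplacian_kernel(x: Vector, y: Vector, sigma: float = 1.0) -> float:
    """Local kernel (assumed form): exp(-||x - y|| / sigma), Euclidean norm."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / sigma)


def combined_kernel(x: Vector, y: Vector, weight: float = 0.5,
                    degree: int = 2, coef0: float = 1.0, sigma: float = 1.0) -> float:
    """Convex combination: weight * K_poly + (1 - weight) * K_lap."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1] for a convex combination")
    return (weight * polynomial_kernel(x, y, degree, coef0)
            + (1.0 - weight) * laplacian_kernel(x, y, sigma))


def gram_matrix(samples: Sequence[Vector], **kernel_params) -> List[List[float]]:
    """Kernel (Gram) matrix of the combined kernel over a list of samples."""
    return [[combined_kernel(a, b, **kernel_params) for b in samples]
            for a in samples]


# Hypothetical toy feature vectors, only to show the call pattern.
toy_samples = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_gram = gram_matrix(toy_samples, weight=0.3)
```

A Gram matrix of this kind could be passed to any SVM solver that accepts a precomputed kernel. In that setting the mixing weight, the kernel parameters, and the SVM penalty term would be the quantities tuned by the swarm optimizer.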
Figure 3. ROC and PR curves of the different algorithms: (a) ROC curves of SVM algorithms with different kernel functions; (b) PR curves of SVM algorithms with different kernel functions; (c) ROC curves of the MKL-SVM algorithm with different optimization methods; (d) PR curves of the MKL-SVM algorithm with different optimization methods
Table 1. Experimental results of various kernel functions
| Algorithm | ACC_mean/% | ACC_max/% | MASEN/% | MASPE/% | F1_score/% | MCC/% | AUC | AP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Polynomial kernel + GAPSO | 90.00 | 90.00 | 85.19 | 91.78 | 82.14 | 75.30 | 0.9584 | 0.8506 |
| Sigmoid kernel + GAPSO | 89.00 | 89.00 | 77.78 | 93.15 | 81.26 | 74.35 | 0.9482 | 0.7990 |
| RBF kernel + GAPSO | 90.50 | 91.00 | 88.89 | 91.78 | 82.85 | 76.39 | 0.9498 | 0.8022 |
| Exponential kernel + GAPSO | 90.40 | 91.00 | 92.59 | 90.41 | 83.71 | 77.60 | 0.9604 | 0.8470 |
| Laplacian kernel + GAPSO | 90.60 | 91.00 | 92.59 | 90.41 | 84.18 | 78.23 | 0.9655 | 0.8464 |
| MKL-SVM + PSO | 90.80 | 92.00 | 88.89 | 93.15 | 83.68 | 77.44 | 0.9609 | 0.8726 |
| MKL-SVM + GA | 89.50 | 90.00 | 92.59 | 89.04 | 82.50 | 75.93 | 0.9619 | 0.8830 |
| MKL-SVM + GAPSO | 91.10 | 92.00 | 88.89 | 93.15 | 84.30 | 78.29 | 0.9650 | 0.8984 |
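The GAPSO rows refer to the hybrid optimizer in which the GA crossover operation is introduced into PSO. A minimal sketch of that idea is given here under stated assumptions: the abstract does not specify the crossover operator, so an arithmetic (blend) crossover on particle positions is assumed, and the function name `pso_with_crossover`, its parameters and default values, and the `sphere` toy objective are all hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_with_crossover(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 20,
    n_iter: int = 50,
    inertia: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    crossover_rate: float = 0.3,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `objective` with PSO plus an assumed GA-style blend crossover."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(value: float, d: int) -> float:
        low, high = bounds[d]
        return min(max(value, low), high)

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(*bounds[d]) for d in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        # Standard PSO velocity and position update.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], d)

        # Crossover step: randomly selected particles are paired, and each
        # pair is replaced by two children that blend the parents' positions.
        chosen = [i for i in range(n_particles) if rng.random() < crossover_rate]
        rng.shuffle(chosen)
        for a, b in zip(chosen[0::2], chosen[1::2]):
            for d in range(dim):
                r = rng.random()
                pa, pb = pos[a][d], pos[b][d]
                pos[a][d] = r * pa + (1.0 - r) * pb
                pos[b][d] = (1.0 - r) * pa + r * pb

        # Update personal bests and the global best.
        for i in range(n_particles):
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val


def sphere(x: Sequence[float]) -> float:
    """Toy objective used only for illustration; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


best_position, best_value = pso_with_crossover(sphere, [(-5.0, 5.0)] * 3)
```

For parameter tuning of the MKL-SVM, the objective would instead map a candidate parameter vector (for example the kernel mixing weight, kernel parameters, and penalty term) to a validation error. Because each child is a convex blend of two in-bounds parents, the crossover step keeps particles inside the search box without extra clipping.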
Table 2. Results of the proposed algorithm combined with deep learning features
| Algorithm | ACC_mean/% | ACC_max/% | MASEN/% | MASPE/% | F1_score/% | MCC/% | AUC | AP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Handcrafted features + MKL-SVM + GAPSO | 91.10 | 92.00 | 88.89 | 93.15 | 84.30 | 78.29 | 0.9650 | 0.8984 |
| Deep learning features + MKL-SVM + GA | 88.00 | 88.00 | 81.82 | 91.04 | 81.82 | 72.86 | 0.9038 | 0.8755 |
| Deep learning features + MKL-SVM + PSO | 89.80 | 91.00 | 75.76 | 97.01 | 82.57 | 76.81 | 0.9484 | 0.9038 |
| Deep learning features + MKL-SVM + GAPSO (Proposed work) | 91.50 | 94.00 | 81.82 | 100 | 85.81 | 80.69 | 0.9588 | 0.9043 |
Table 3. Performance comparison of the proposed algorithm with current state-of-the-art methods
| References | Year | Datasets | Methods | ACC/% | SEN/% | SPE/% | AUC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Zhao et al. [24] | 2019 | LIDC-IDRI (743 images) | Transfer learning CNNs | 85.00 | 94.00 | - | 0.94 |
| Masood et al. [25] | 2020 | LIDC-IDRI (892 images) | Enhanced multidimensional region-based fully CNN | 97.91 | 98.1 | 93.2 | 0.9813 |
| Mastouri et al. [14] | 2020 | LUNA16 (3186 images) | Bilinear CNN + SVM | 91.99 | 91.85 | 92.27 | 0.959 |
| Proposed work | 2021 | LUNA16 (1140 images) | Deep learning features + improved MKL-SVM | 95.29 | 94.85 | 95.89 | 0.9803 |
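The threshold-based indexes reported in Tables 1 to 3 (accuracy, sensitivity, specificity, F1 score, and MCC) are standard functions of a binary confusion matrix. The sketch below shows those definitions on hypothetical counts; the counts and the function name `classification_metrics` are illustrative assumptions and are not taken from the paper. AUC and AP are not covered, because they depend on the full ranking of classifier scores rather than on a single confusion matrix.

```python
import math
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary metrics as fractions in [0, 1] (MCC lies in [-1, 1]).

    Assumes at least one positive and one negative sample, so that
    sensitivity and specificity are defined.
    """
    acc = (tp + tn) / (tp + fp + tn + fn)
    sen = tp / (tp + fn)   # sensitivity (recall, true positive rate)
    spe = tn / (tn + fp)   # specificity (true negative rate)
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SEN": sen, "SPE": spe, "F1": f1, "MCC": mcc}


# Hypothetical counts, chosen only to show the call pattern.
example_metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

Multiplying each value by 100 gives the percentage form used in the tables.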
References
[1] Shen W, Zhou M, Yang F, et al. Multi-crop Convolutional Neural Networks for lung nodule malignancy suspiciousness classification. Pattern Recognit, 2017, 61: 663 doi: 10.1016/j.patcog.2016.05.029
[2] Ferlay J, Colombet M, Soerjomataram I, et al. Cancer incidence and mortality patterns in Europe: Estimates for 40 countries and 25 major cancers in 2018. Eur J Cancer, 2018, 103: 356 doi: 10.1016/j.ejca.2018.07.005
[3] Siegel R L, Miller K D, Jemal A. Cancer statistics, 2018. CA: A Cancer J Clin, 2018, 68(1): 7 doi: 10.3322/caac.21442
[4] Li Y, Zhu Z, Hou A, et al. Pulmonary nodule recognition based on multiple kernel learning support vector machine-PSO. Comput Math Methods Med, 2018, 2018: 1461470
[5] Renita D B, Christopher C S. Novel real time content based medical image retrieval scheme with GWO-SVM. Multimed Tools Appl, 2020, 79(23-24): 17227 doi: 10.1007/s11042-019-07777-w
[6] Jia D Y, Li Z Y, Zhang C W. Detection of cervical cancer cells based on strong feature CNN-SVM network. Neurocomputing, 2020, 411: 112 doi: 10.1016/j.neucom.2020.06.006
[7] Shankar K, Lakshmanaprabu S K, Gupta D, et al. Optimal feature-based multi-kernel SVM approach for thyroid disease classification. J Supercomput, 2020, 76(2): 1128 doi: 10.1007/s11227-018-2469-4
[8] Peng Z C, Hu Q H, Dang J W. Multi-kernel SVM based depression recognition using social media data. Int J Mach Learn Cybern, 2019, 10(1): 43 doi: 10.1007/s13042-017-0697-1
[9] Valdez F. A review of optimization swarm intelligence-inspired algorithms with type-2 fuzzy logic parameter adaptation. Soft Comput, 2020, 24(1): 215 doi: 10.1007/s00500-019-04290-y
[10] Zhou T, Lu H L, Hu F Y, et al. A model of high-dimensional feature reduction based on variable precision rough set and genetic algorithm in medical image. Math Probl Eng, 2020, 2020: 1
[11] Gao Y K, Xie L B, Zhang Z D, et al. Twin support vector machine based on improved artificial fish swarm algorithm with application to flame recognition. Appl Intell, 2020, 50(8): 2312 doi: 10.1007/s10489-020-01676-6
[12] Litjens G, Kooi T, Bejnordi B E, et al. A survey on deep learning in medical image analysis. Med Image Anal, 2017, 42: 60 doi: 10.1016/j.media.2017.07.005
[13] Zhang B H, Qi S L, Monkam P, et al. Ensemble learners of multiple deep CNNs for pulmonary nodules classification using CT images. IEEE Access, 2019, 7: 110358 doi: 10.1109/ACCESS.2019.2933670
[14] Mastouri R, Khlifa N, Neji H, et al. A bilinear convolutional neural network for lung nodules classification on CT images. Int J Comput Assist Radiol Surg, 2021, 16(1): 91 doi: 10.1007/s11548-020-02283-z
[15] Özyurt F, Sert E, Avci E, et al. Brain tumor detection based on Convolutional Neural Network with neutrosophic expert maximum fuzzy sure entropy. Measurement, 2019, 147: 106830 doi: 10.1016/j.measurement.2019.07.058
[16] Hao P Y. Dual possibilistic regression analysis using support vector networks. Fuzzy Sets Syst, 2020, 387: 1 doi: 10.1016/j.fss.2019.03.012
[17] Yan B L, Zhao Z, Zhou Y C, et al. A particle swarm optimization algorithm with random learning mechanism and Levy flight for optimization of atomic clusters. Comput Phys Commun, 2017, 219: 79 doi: 10.1016/j.cpc.2017.05.009
[18] Wu Z, Zhang S, Wang T. A cooperative particle swarm optimization with constriction factor based on simulated annealing. Computing, 2018, 100(8): 861 doi: 10.1007/s00607-018-0625-6
[19] Freitas D, Lopes L G, Morgado-Dias F. Particle swarm optimisation: A historical review up to the current developments. Entropy, 2020, 22(3): 362 doi: 10.3390/e22030362
[20] Choudhary A, Kumar M, Gupta M K, et al. Mathematical modeling and intelligent optimization of submerged arc welding process parameters using hybrid PSO-GA evolutionary algorithms. Neural Comput Appl, 2020, 32(10): 5761 doi: 10.1007/s00521-019-04404-5
[21] Koessler E, Almomani A. Hybrid particle swarm optimization and pattern search algorithm. Optim Eng, 2020: 1
[22] Setio A A A, Traverso A, Bel T, et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Med Image Anal, 2017, 42: 1 doi: 10.1016/j.media.2017.06.015
[23] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition//Proceedings of 2015 International Conference on Learning Representations. California, 2015: 1
[24] Zhao X, Qi S, Zhang B, et al. Deep CNN models for pulmonary nodule classification: Model modification, model integration, and transfer learning. J Xray Sci Technol, 2019, 27(4): 615
[25] Masood A, Sheng B, Yang P, et al. Automated decision support system for lung cancer detection and classification via enhanced RFCN with multilayer fusion RPN. IEEE Trans Ind Inform, 2020, 16(12): 7791 doi: 10.1109/TII.2020.2972918