Localization model of traditional Chinese medicine Zang-fu organs based on ALBERT and Bi-GRU
-
Abstract: The rapid development of artificial intelligence (AI) has injected new vitality into many industries and provided new ideas for the development of traditional Chinese medicine (TCM). Combining AI with TCM offers additional technical support for TCM-assisted diagnosis and treatment. Many methods of syndrome differentiation have emerged over the history of TCM, among which the differentiation of Zang-fu organs is one of the most important. Zang-fu localization, i.e., determining which Zang-fu organs are affected by a lesion, is a key stage of Zang-fu syndrome differentiation, and the purpose of this paper is to support it with AI techniques. A neural-network-based Zang-fu localization model was established: given symptom text as input, it outputs the corresponding Zang-fu labels of the lesion, thereby supporting Zang-fu syndrome differentiation in TCM-assisted diagnosis and treatment. The Zang-fu localization problem was formulated as multi-label text classification in natural language processing, and, using TCM medical record data, a localization model based on the pretrained model A Lite BERT (ALBERT) and a bidirectional gated recurrent unit (Bi-GRU) was proposed. Comparison and ablation experiments show that the proposed method is more accurate than multilayer perceptron and decision tree models, and that using ALBERT for text representation effectively improves accuracy compared with Word2Vec. In terms of model parameters, ALBERT greatly reduces the parameter count relative to BERT and thus effectively reduces the model size. The proposed Zang-fu localization model reaches an F1-value of 0.8013 on the test set, providing support for TCM-assisted diagnosis and treatment.
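For concreteness, below is a minimal PyTorch sketch of the architecture the abstract describes: ALBERT produces contextual token embeddings, a bidirectional GRU reads them, and a sigmoid layer emits one probability per Zang-fu label. The checkpoint name, label count, and head design are assumptions not fixed by this excerpt; the GRU width (128) and dropout (0.4) follow Table 2.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel

class AlbertBiGRUClassifier(nn.Module):
    """ALBERT encoder -> bidirectional GRU -> sigmoid multi-label head."""

    def __init__(self, albert_name: str, num_labels: int,
                 gru_units: int = 128, dropout: float = 0.4):
        super().__init__()
        # albert_name is a placeholder for a Chinese ALBERT checkpoint
        self.albert = AlbertModel.from_pretrained(albert_name)
        hidden = self.albert.config.hidden_size
        self.bigru = nn.GRU(hidden, gru_units, batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(2 * gru_units, num_labels)

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from ALBERT
        seq = self.albert(input_ids=input_ids,
                          attention_mask=attention_mask).last_hidden_state
        # Final hidden state of each GRU direction, concatenated
        _, h = self.bigru(seq)                  # h: (2, batch, gru_units)
        h = torch.cat([h[0], h[1]], dim=-1)     # (batch, 2 * gru_units)
        # One independent probability per Zang-fu label (multi-label output)
        return torch.sigmoid(self.classifier(self.dropout(h)))
```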
-
Table 1. Zang-fu localization data format

No.  Symptoms                                                                                         Tags
1    Legs ache; the patient wakes unable to fall back asleep, with hemoptysis and a sore throat      spleen, kidney, heart
2    The patient had high blood pressure, weakness in the right limb, and pain in the left upper arm  liver, kidney
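Table 1 treats each case as a multi-label example: one symptom text maps to several organ tags at once. The following sketch (using scikit-learn, an assumption; the paper does not name its tooling) shows how such tags become the 0/1 target vectors a sigmoid head is trained against:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Each record pairs free-text symptoms with one or more Zang-fu tags (Table 1).
records = [
    ("Legs ache; wakes unable to fall back asleep; hemoptysis; sore throat",
     ["spleen", "kidney", "heart"]),
    ("High blood pressure, weakness in the right limb, pain in the left upper arm",
     ["liver", "kidney"]),
]
texts, tags = zip(*records)

# Multi-label targets become fixed-length 0/1 vectors, one slot per organ tag.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)
print(mlb.classes_)  # ['heart' 'kidney' 'liver' 'spleen']
print(y)             # [[1 1 0 1]
                     #  [0 1 1 0]]
```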
Table 2. Parameters in the training process

Parameter name   Parameter value
Max_seq_length   128
GRU_units        128
Dropout          0.4
Learning_rate    1×10⁻⁴
Epochs           10
Batch_size       128
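A hedged sketch of a training loop wired up with the Table 2 values; the Adam optimizer, the binary cross-entropy loss, the checkpoint name, and `train_loader` are assumptions not stated in this excerpt (the model class and `mlb` come from the sketches above):

```python
import torch

# Values from Table 2; optimizer and loss are assumptions.
MAX_SEQ_LENGTH = 128
LEARNING_RATE = 1e-4
EPOCHS = 10
BATCH_SIZE = 128

model = AlbertBiGRUClassifier("albert_chinese_base",  # placeholder checkpoint
                              num_labels=len(mlb.classes_))
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
criterion = torch.nn.BCELoss()  # matches the sigmoid outputs of the model

model.train()
for epoch in range(EPOCHS):
    for input_ids, attention_mask, targets in train_loader:  # hypothetical DataLoader
        optimizer.zero_grad()
        probs = model(input_ids, attention_mask)
        loss = criterion(probs, targets.float())
        loss.backward()
        optimizer.step()
```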
Table 3. Comparative experimental results of multi-label classification

No.  Method                    Precision  Recall  F1-value
1    Word2Vec+Bi-GRU           0.8015     0.7653  0.7830
2    MLP Classifier            0.7091     0.7067  0.7079
3    Decision Tree Classifier  0.6744     0.6633  0.6688
4    ALBERT+Bi-GRU             0.8301     0.7745  0.8013
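Tables 3-5 report precision, recall, and F1 over the 0/1 prediction matrix. The averaging scheme is not stated in this excerpt, so the sketch below assumes micro-averaging and a 0.5 decision threshold:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# y_true: ground-truth 0/1 matrix, shape (n_samples, n_labels).
# probs: sigmoid outputs from the model; 0.5 is an assumed cutoff.
y_pred = (probs.detach().cpu().numpy() >= 0.5).astype(int)

p = precision_score(y_true, y_pred, average="micro")
r = recall_score(y_true, y_pred, average="micro")
f1 = f1_score(y_true, y_pred, average="micro")
print(f"Precision={p:.4f}  Recall={r:.4f}  F1={f1:.4f}")
```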
Table 4. Comparative experimental results of BERT and ALBERT

No.  Method         Precision  Recall  F1-value  Time/s   Model parameters/MB
1    BERT+Bi-GRU    0.8253     0.7783  0.8011    99.8219  363.3
2    ALBERT+Bi-GRU  0.8301     0.7745  0.8013    84.7045  37.3
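The "Model parameters/MB" column in Table 4 compares model sizes. One plausible way to derive such a figure (an assumption; the paper does not show its calculation) is to count trainable parameters and multiply by 4 bytes per 32-bit float:

```python
import torch

def model_size_mb(model: torch.nn.Module) -> float:
    """Approximate weight size in MB, assuming 32-bit (4-byte) floats."""
    n_params = sum(p.numel() for p in model.parameters())
    return n_params * 4 / (1024 ** 2)

# e.g. model_size_mb(model) for the ALBERT+Bi-GRU sketch above
```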
Table 5. Ablation experiment results of multi-label classification

Method         Precision  Recall  F1-value
ALBERT         0.7711     0.7315  0.7508
ALBERT+Bi-GRU  0.8301     0.7745  0.8013