Summary: To systematically summarize time series forecasting methods and their applications in industrial scenarios, this review first introduced three classes of time series forecasting algorithms: statistical learning, ensemble learning, and deep learning. Focusing on industrial data analysis and decision-making problems, it then analyzed the advantages, disadvantages, and suitable industrial application scenarios of three classes of deep learning models: recurrent neural networks, convolutional neural networks, and encoder-decoder models. To evaluate model performance clearly and comprehensively, statistical metrics and error calculation methods for point prediction and sequence prediction problems were introduced. Classic public industrial datasets were also compiled so that researchers can quickly evaluate algorithm performance. Taking mining and metallurgy in the process industry as examples, the review described the application and effect of time series forecasting methods in real industrial scenarios. Finally, it summarized the problems facing deep learning in the industrial field, such as low robustness and weak interpretability, and discussed future research directions for time series forecasting methods in industrial scenarios.

Abstract: This review outlined recent developments in deep-learning-based time series forecasting for industrial applications. With the advances in industrial automation, the storage and analysis of massive production data have become possible. Traditional mechanism modeling methods based on statistics encounter difficulties in dealing with high-dimensional industrial problems. Thus, time series forecasting for complex processes and products has played an important role in industrial scenarios, such as device modeling, production forecasting, remaining useful life prediction, and precise control, thereby receiving considerable attention from both academia and industry. To methodically review time series forecasting methods and their industrial applications, this review first introduced three types of time series forecasting methods, namely, statistical learning, ensemble learning, and deep learning, and compared their ease of use, complexity, and applicability. Focusing on industrial data analysis and decision-making, this review analyzed three types of deep learning models, namely, recurrent neural networks, convolutional neural networks, and encoder-decoder networks. The advantages and disadvantages of these models and their applicable industrial environments were given, along with a description of how they can be embedded in industrial application sites to reduce costs and improve production efficiency. Afterward, to evaluate the performance of different algorithms clearly and comprehensively, statistical metrics and loss functions for point prediction, sequence prediction, and shape (motif) prediction problems were presented.
At the same time, this review compiled classic public industrial datasets so that researchers can quickly find authoritative evaluation data for a given industry sector or problem. Taking mining and metallurgy in the process industry as examples, this review presented some widespread problems in this field, such as nonlinearity, long time delays, and unobservable variables. This review also showed how deep-learning-based time series forecasting techniques can address these problems, build soft sensors, create digital twins, and visualize complex processes. This review revealed that applying deep learning in the process industry requires highly robust and strongly interpretable (explainable) algorithms. For the robustness problem, ordinary differential equation models and Kalman filter methods were proposed for modeling irregularly sampled time series, and deep learning methods were proposed for detecting sensor abnormalities online. For the interpretability problem, sample-based, structure-based, and external co-explanation methods were introduced. This review also analyzed how explainable techniques can be applied to industrial deep learning models. Finally, the future directions of time series research were discussed in terms of both deep learning methods and industrial applications.
Key words:
- process industry
- time series forecasting
- deep learning
- model interpretability
- evaluation metric

Table 1. Comparison of time series forecasting algorithms

| Method | Interpretability | Design difficulty | Efficiency | Applicable scenario |
| --- | --- | --- | --- | --- |
| Statistical algorithms | High | Easy | Low | Stationary random process |
| Ensemble learning | Medium | Hard | High | Experienced experts available |
| Deep learning | Low | Medium | High | The predicted system has complex nonlinearity and sufficient data |

Table 2. Definitions of the mathematical metrics

| Metric name | Formula |
| --- | --- |
| Mean squared error, MSE | $ \text{MSE}=\dfrac{1}{n}\sum\limits_{i=1}^{n} e_i^{2} $ |
| Mean absolute error, MAE | $ \text{MAE}=\dfrac{1}{n}\sum\limits_{i=1}^{n}\lvert e_i\rvert $ |
| Normalized quantile error | $ Q_\rho = \dfrac{\sum\limits_{i=1}^{n} 2\left(\rho\, e_{i;\,P_i > A_i} - (1-\rho)\, e_{i;\,P_i \leqslant A_i}\right)}{\sum\limits_{i=1}^{n} A_i} $ |
| R-squared | $ R^{2} = \left(1 - \dfrac{\sum\limits_{i=1}^{n} e_i^{2}}{\sum\limits_{i=1}^{n} (A_i - \overline{A})^{2}}\right) \times 100\% $ |
| Mean absolute scaled error, MASE | $ \text{MASE}=\text{MAE}/Q, \text{ where } Q=\dfrac{1}{n-1}\sum\limits_{i=2}^{n}\lvert A_i - A_{i-1}\rvert $ |
| Pearson correlation coefficient | $ r_{xy} = \dfrac{\sum\limits_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{\sqrt{\sum\limits_{i=1}^{n} (X_i - \overline{X})^{2}}\,\sqrt{\sum\limits_{i=1}^{n} (Y_i - \overline{Y})^{2}}} = \dfrac{\sum\limits_{i=1}^{n} (A_i - \overline{A})(P_i - \overline{P})}{\sqrt{\sum\limits_{i=1}^{n} (A_i - \overline{A})^{2}}\,\sqrt{\sum\limits_{i=1}^{n} (P_i - \overline{P})^{2}}} $ |
| Mean absolute percentage error, MAPE | $ \text{MAPE}=\dfrac{100}{n}\sum\limits_{i=1}^{n}\left\lvert\dfrac{e_i}{A_i}\right\rvert $ |
| KL divergence | $ D_{\text{KL}}(P\Vert Q)=\sum\limits_{i=1}^{n} P(x_i)\log\left(\dfrac{P(x_i)}{Q(x_i)}\right) $ |
| Root relative squared error, RRSE | $ \text{RRSE}=\sqrt{\dfrac{\sum\limits_{i=1}^{n} e_i^{2}}{\sum\limits_{i=1}^{n} (A_i - \overline{A})^{2}}} $ |

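The metric definitions in Table 2 can be illustrated with a minimal Python sketch using only the standard library. Several points here are assumptions rather than statements from the review: $A_i$ is read as the actual value and $P_i$ as the predicted value, the error is taken as $e_i = P_i - A_i$, the logarithm in the KL divergence is the natural logarithm, and MAPE and RRSE are computed in their standard forms (absolute percentage error, and the square root of the ratio of the two sums). All function names are hypothetical and chosen only for this example.

```python
import math
from typing import List, Sequence


def _errors(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Per-sample errors e_i = P_i - A_i (this sign convention is an assumption)."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally long series with at least two points")
    return [p - a for a, p in zip(actual, predicted)]


def _total_sum_of_squares(actual: Sequence[float]) -> float:
    """Sum of squared deviations of the actual values from their mean."""
    mean_a = sum(actual) / len(actual)
    return sum((a - mean_a) ** 2 for a in actual)


def mse(actual, predicted):
    e = _errors(actual, predicted)
    return sum(x * x for x in e) / len(e)


def mae(actual, predicted):
    e = _errors(actual, predicted)
    return sum(abs(x) for x in e) / len(e)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; undefined if any A_i is 0."""
    e = _errors(actual, predicted)
    return 100.0 / len(e) * sum(abs(x / a) for x, a in zip(e, actual))


def rrse(actual, predicted):
    """Root relative squared error; undefined for a constant actual series."""
    e = _errors(actual, predicted)
    return math.sqrt(sum(x * x for x in e) / _total_sum_of_squares(actual))


def r_squared(actual, predicted):
    """R-squared expressed in percent, following the x 100% factor in Table 2."""
    e = _errors(actual, predicted)
    return (1.0 - sum(x * x for x in e) / _total_sum_of_squares(actual)) * 100.0


def mase(actual, predicted):
    """MAE scaled by the mean absolute one-step change of the actual series."""
    n = len(actual)
    q = sum(abs(actual[i] - actual[i - 1]) for i in range(1, n)) / (n - 1)
    return mae(actual, predicted) / q


def pearson(actual, predicted):
    _errors(actual, predicted)  # reuse the length check
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(_total_sum_of_squares(actual) * var_p)


def quantile_error(actual, predicted, rho):
    """Normalized quantile error: weight rho when P_i > A_i, else -(1 - rho)."""
    e = _errors(actual, predicted)
    total = 0.0
    for a, p, x in zip(actual, predicted, e):
        total += 2.0 * (rho * x if p > a else -(1.0 - rho) * x)
    return total / sum(actual)


def kl_divergence(p, q):
    """Discrete KL divergence with natural log; terms with P(x_i) = 0 are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    actual = [2.0, 4.0, 6.0, 8.0]      # toy values, not from any dataset
    predicted = [3.0, 4.0, 5.0, 10.0]
    for metric in (mse, mae, mape, rrse, r_squared, mase, pearson):
        print(metric.__name__, metric(actual, predicted))
    print("quantile_error(rho=0.5)", quantile_error(actual, predicted, 0.5))
```

With `rho = 0.5` the normalized quantile error reduces to the sum of absolute errors divided by the sum of actual values, which is a convenient sanity check for an implementation.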