HEALTH MONITORING OF A COMPUTER NETWORK BASED ON SEQUENTIAL ANALYSIS OF SERIAL PATTERN
Oleg I. Sheluhin, MTUCI, Moscow, Russia, sheluhin@mail.ru
Andrey V. Osin, MTUCI, Moscow, Russia, osin_a_v@mail.ru
Denis V. Kostin, MTUCI, Moscow, Russia, d.v.kostin@mail.ru
Abstract
To monitor the "health" of a computer network, we propose using methods and algorithms of statistical data processing, machine learning, and data mining applied to historical data on past network behavior in order to extract patterns. Anomalous events in a computer network are predicted by selecting the most appropriate "network health" pattern through sequential analysis of the extracted series of patterns. The article considers the stages of forecasting based on sequential patterns and the structure of the implemented algorithm for forecasting the current state of a computer system. A Pareto function is proposed as the target function for optimizing the hyperparameters of the forecasting algorithm; the best set of parameters is the one that maximizes the target function. Service Level Objectives and the Service Level Agreement serve as the system indicators characterizing the "health" of the computer network. Clustering of computer network states is visualized on a specific example using the k-means algorithm and the t-SNE dimensionality reduction algorithm. Network health is evaluated as the distance from the forecast state of the computer network to the region of anomalous states formed by clustering, measured as the distance to the nearest cluster centers on an ordinal scale from 1 to 5. The paper proposes reporting network health using "Green", "Yellow", and "Red" levels.
Keywords: anomalous states, computer network, forecasting, machine learning, data mining, monitoring system metrics, clustering, sequential analysis, pattern.
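The health evaluation described in the abstract (distance from the forecast state to the nearest anomalous cluster center, expressed on a 1 to 5 ordinal scale and reported as a color level) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the Euclidean metric, the threshold values, and the mapping of scores to "Green", "Yellow", and "Red". The cluster centers are assumed to come from a prior clustering step such as k-means.

```python
import math

def nearest_center_distance(state, anomalous_centers):
    """Euclidean distance from a forecast state to the closest anomalous cluster center."""
    return min(math.dist(state, center) for center in anomalous_centers)

def health_score(distance, thresholds=(1.0, 2.0, 3.0, 4.0)):
    """Map a distance to an ordinal score from 1 (nearest to anomalies) to 5 (farthest).

    The threshold values are hypothetical; in practice they would be tuned
    to the scale of the monitored metrics.
    """
    return 1 + sum(distance >= t for t in thresholds)

def health_level(score):
    """Map the ordinal score to a color level (hypothetical mapping)."""
    if score <= 2:
        return "Red"
    if score == 3:
        return "Yellow"
    return "Green"

# Hypothetical anomalous cluster centers in a 2-D metric space.
centers = [(0.0, 0.0), (10.0, 10.0)]
forecast_state = (3.0, 4.0)

d = nearest_center_distance(forecast_state, centers)
print(d, health_score(d), health_level(health_score(d)))
```

A larger distance from the anomalous region yields a higher score, so the color level degrades from "Green" to "Red" as the forecast state approaches an anomalous cluster.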
Information about authors:
Oleg I. Sheluhin, doctor of technical sciences, professor, head of the Department of Information Security, MTUCI, Moscow, Russia
Andrey V. Osin, PhD, MTUCI, Moscow, Russia
Denis V. Kostin, graduate student, Department of Information Security, MTUCI, Moscow, Russia