Log mining and knowledge-based models in data storage systems diagnostics
Modern data storage systems have a sophisticated hardware and software architecture that includes multiple storage processors, storage fabrics, network equipment, and storage media, and they hold information that can be damaged or lost because of a hardware or software fault. The approach to storage software diagnostics presented in this paper combines log mining algorithms for fault detection, based on natural language processing text classification methods, with a diagnostic model for fault source detection. Existing approaches to the diagnostics of computational systems either ignore system and event log data and use only numeric monitoring parameters, or they target only certain log types or use logs to build chains of structured events. The main advantage of using natural language processing methods for log text classification is that no information about the log message structure, source, or purpose is required, provided there is enough data to train the classifier model. The developed diagnostic procedure achieves accuracy comparable with that of existing methods and can target every fault represented in the training set without prior study of the log structure.
Computational system, data storage systems, diagnostic procedure, hardware and software architectures, knowledge-based model, monitoring parameters, natural language processing, text classification methods
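As an illustration of the log text classification idea summarized above, the following minimal sketch treats each raw log line as a bag of words and trains a classifier on labeled examples, without any knowledge of the log format. It is not the procedure used in this paper: the multinomial naive Bayes model is a stand-in chosen for brevity, and the class name `NaiveBayesLogClassifier`, the labels `fault` and `normal`, and all log messages are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a raw log line and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", message.lower())


class NaiveBayesLogClassifier:
    """Multinomial naive Bayes over bag-of-words counts (add-one smoothing).

    Hypothetical stand-in for the text classifier; the paper's model may differ.
    """

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for message, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(message))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, message):
        # Words never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(message) if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: raw messages with no assumed structure.
train_messages = [
    "disk sda read error sector unreadable",
    "raid controller timeout, path to enclosure lost",
    "io error on volume, write failed",
    "volume mounted successfully",
    "scheduled snapshot completed",
    "user admin logged in",
]
train_labels = ["fault", "fault", "fault", "normal", "normal", "normal"]

model = NaiveBayesLogClassifier().fit(train_messages, train_labels)
print(model.predict("write error on disk sdb"))
print(model.predict("snapshot completed for user admin"))
```

The classifier only sees word counts, which reflects the point made in the abstract: message structure, source, and purpose are not needed as long as enough labeled log text is available for training.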