Machine Learning Based System Health Check Analyzer For Energy Components
Master thesis
Date
2018-06-15
Collections
- Studentoppgaver (TN-IDE) [823]
Abstract
In any system, a health check is an important measure: it provides details on how the system is performing and whether an intervention, manual or automated, is needed to correct an anomaly. There are several approaches to measuring a system's health; this thesis uses server logs.
In this thesis, a prototype of a health check analyzer tool is developed for a product called Energy Components. The tool can be used to monitor the system state based on the generated server log files.
In this study, supervised machine learning techniques are applied to automate log analysis. Incoming logs are read by Logstash, which filters them, extracts useful information, and stores it in Elasticsearch. Elasticsearch indexes the parsed, structured logs, which are then read by the machine learning model. Features are extracted from the log contents using different vectorizers and used to train the machine learning model. Several variants of text classification algorithms are tested and compared in order to select the most suitable model for the problem addressed in this study. K-fold cross-validation, F1-score, performance matrix, and learning curves are used to evaluate the different learning models. By trying different machine learning algorithms and varying the tuning parameters, an accuracy of 94% with 93% precision and a standard deviation of 0.058% is achieved. The case study results show that the Support Vector Machine algorithm with a hashing vectorizer gave the best accuracy among the compared algorithms.
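The feature extraction and evaluation steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the thesis implementation. The log lines, labels, bucket count, and CRC32 hash are assumptions made for illustration, and a simple perceptron stands in for the Support Vector Machine to keep the example short. In practice, a library such as scikit-learn (with `HashingVectorizer` and `LinearSVC`) could fill these roles.

```python
import zlib
from statistics import mean, pstdev

N_FEATURES = 64  # hypothetical bucket count; real setups use far more


def hash_vectorize(text, n_features=N_FEATURES):
    """Map a log line to a fixed-length count vector (hashing trick)."""
    vec = [0.0] * n_features
    for token in text.lower().split():
        # CRC32 is an assumed, deterministic stand-in for the hash function.
        vec[zlib.crc32(token.encode("utf-8")) % n_features] += 1.0
    return vec


def train_perceptron(X, y, epochs=20):
    """Fit a linear classifier (stand-in for an SVM); labels are +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified: move the boundary
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
    return w, b


def predict(model, xi):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1


def f1_score(y_true, y_pred):
    """F1 for the positive class (+1), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_f1(lines, labels, k=3):
    """Return mean and standard deviation of F1 across k folds."""
    X = [hash_vectorize(line) for line in lines]
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))  # every k-th sample is held out
        train = [i for i in range(len(X)) if i not in test_idx]
        model = train_perceptron([X[i] for i in train], [labels[i] for i in train])
        held_out = sorted(test_idx)
        preds = [predict(model, X[i]) for i in held_out]
        scores.append(f1_score([labels[i] for i in held_out], preds))
    return mean(scores), pstdev(scores)


# Made-up log lines: +1 marks an anomaly, -1 marks normal operation.
LOGS = [
    "ERROR database connection timeout", "INFO request served ok",
    "ERROR out of memory exception", "INFO scheduled job finished ok",
    "ERROR connection refused by database", "INFO user login ok",
]
LABELS = [1, -1, 1, -1, 1, -1]

mean_f1, std_f1 = k_fold_f1(LOGS, LABELS, k=3)
print(f"mean F1 = {mean_f1:.2f}, std = {std_f1:.2f}")
```

The hashing approach needs no stored vocabulary, so unseen tokens in new log lines still map to a feature bucket. The cost is that distinct tokens can collide in the same bucket.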
Description
Master's thesis in Computer science