A Data Clustering Algorithm for Mining Patterns from Event Logs

**project girl** · 01-02-2013, 04:39 PM

ABSTRACT

Event logging and log files are playing an increasingly important role in system and network management. Over the past two decades, the BSD syslog protocol has become a widely accepted standard that is supported on many operating systems and implemented in a wide range of system devices. Well-written system applications either use the syslog protocol or produce log files in a custom format, while many devices such as routers, switches, and laser printers are able to log their events to a remote host using the syslog protocol. Normally, events are logged as single-line textual messages.
Since log files are an excellent source for determining the health status of the system, many sites have built a centralized logging and log file monitoring infrastructure, and a number of tools have been developed for monitoring log files, e.g., Swatch, Logsurfer, and SEC.
Log file monitoring techniques can be categorized into fault detection and anomaly detection. In the case of fault detection, the domain expert creates a database of fault message patterns. If a line that matches a pattern is appended to a log file, the log file monitor takes a certain action. This commonly used approach has one serious flaw: only those faults that are already known to the domain expert can be detected.
If a previously unknown fault condition occurs, the log file monitor simply ignores the corresponding message in the log file, since there is no match for it in the pattern database. Also, it is often difficult to find a person with sufficient knowledge about the system. In the case of anomaly detection, a system profile is created which reflects normal system activity. If messages are logged that do not fit the profile, an alarm is raised. With this approach, previously unknown fault conditions are detected, but on the other hand, creating the system profile by hand is time-consuming and error-prone.
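The difference between the two monitoring approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the pattern database, the action names, the profile, and the sample log lines are hypothetical and do not come from Swatch, Logsurfer, or SEC.

```python
import re

# Hypothetical fault pattern database: regular expression -> action name.
FAULT_PATTERNS = {
    r"disk \S+ failed": "page_admin",
    r"link down on port \d+": "open_ticket",
}

# Hypothetical system profile: patterns describing normal activity.
NORMAL_PROFILE = [
    r"session opened for user \w+",
    r"session closed for user \w+",
]

def detect_faults(lines, patterns=FAULT_PATTERNS):
    """Fault detection: report only lines that match a known fault pattern."""
    hits = []
    for line in lines:
        for pattern, action in patterns.items():
            if re.search(pattern, line):
                hits.append((line, action))
                break  # one action per line is enough for this sketch
    return hits

def detect_anomalies(lines, profile=NORMAL_PROFILE):
    """Anomaly detection: report every line that does not fit the profile."""
    return [
        line for line in lines
        if not any(re.search(pattern, line) for pattern in profile)
    ]

# Hypothetical log excerpt; the third line is a fault nobody wrote a pattern for.
LOG = [
    "session opened for user alice",
    "disk sda1 failed",
    "fan speed below threshold",
    "session closed for user alice",
]

faults = detect_faults(LOG)
anomalies = detect_anomalies(LOG)
```

Here the fan message passes silently through `detect_faults` because no pattern exists for it, while `detect_anomalies` flags it along with the disk failure. The cost is that `NORMAL_PROFILE` has to be written and maintained by hand.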
However, note that log file data clustering is not merely a preprocessing step. A clustering algorithm could identify many line patterns that reflect normal system activity and that can be immediately included in the system profile, since the user does not wish to analyze them further with the association rule algorithms. Furthermore, the cluster of outliers that is formed by the clustering algorithm contains infrequent lines that could represent previously unknown fault conditions, or other unexpected behavior of the system that deserves closer investigation. Although data clustering algorithms give the user valuable insight into event logs, they have received little attention in the context of system and network management. In this paper, we discuss existing data clustering algorithms and propose a new clustering algorithm for mining line patterns from log files. We also present an experimental clustering tool called SLCT (Simple Logfile Clustering Tool).
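The abstract does not spell out the steps of the proposed algorithm, so the following is only a rough sketch of how line patterns and a cluster of outliers might be obtained. It assumes a simple scheme based on frequent (position, word) pairs; the function name `cluster_lines`, the `support` threshold, and the sample lines are hypothetical, and the code should not be read as SLCT's actual implementation.

```python
from collections import Counter, defaultdict

def cluster_lines(lines, support):
    """Group lines by the frequent words they share at fixed positions.

    Returns (clusters, outliers): clusters maps a line pattern such as
    "session opened for user *" to its member lines, and outliers holds
    the lines that fit no sufficiently frequent pattern.
    """
    # Pass 1: count how often each word occurs at each position.
    word_counts = Counter()
    for line in lines:
        for pos, word in enumerate(line.split()):
            word_counts[(pos, word)] += 1
    frequent = {pair for pair, n in word_counts.items() if n >= support}

    # Pass 2: build a candidate pattern per line, keeping frequent words
    # and replacing everything else with a wildcard.
    candidates = defaultdict(list)
    for line in lines:
        key = tuple(
            word if (pos, word) in frequent else "*"
            for pos, word in enumerate(line.split())
        )
        candidates[key].append(line)

    # Keep candidates that are frequent and contain at least one fixed word;
    # every other line goes to the cluster of outliers.
    clusters, outliers = {}, []
    for key, members in candidates.items():
        if len(members) >= support and any(word != "*" for word in key):
            clusters[" ".join(key)] = members
        else:
            outliers.extend(members)
    return clusters, outliers

# Hypothetical log excerpt.
LOG = [
    "session opened for user alice",
    "session opened for user bob",
    "session opened for user carol",
    "fan speed below threshold",
]

clusters, outliers = cluster_lines(LOG, support=2)
```

With this input, the three session lines end up under a single pattern that could be added to the system profile, and the fan message lands among the outliers, where it can be inspected as a possible unknown fault condition.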
