29-09-2016, 02:16 PM
Abstract: Bug triage is an unavoidable step in handling software bugs, and the time and cost of resolving bugs are high. When a bug arises, the admin stores its details. We use instance selection and feature selection to obtain a subset of the relevant instances and features and to give an enhanced solution. These algorithms are used to reduce the data, and the historical bugs are stored for later use. The results show that our data reduction can effectively reduce the data scale and improve the accuracy of bug triage. We apply instance selection and feature selection simultaneously to reduce the historical bug data. The reduced bug data contain fewer bug reports than the original bug data while providing similar information. The details of the bugs are shown automatically when developers log in to find the solution to a bug. Because the bugs are visible, everyone can find solutions for their projects, and different solutions can be obtained. We have also added a new module that describes the status of each bug, such as whether it has been assigned to a developer and whether it has been rectified.
INTRODUCTION
Many software companies spend most of their money on fixing bugs. Large software projects have a bug repository that collects all the information related to bugs. In a bug repository, each software bug has a bug report. The bug report consists of textual information about the bug and updates on the status of bug fixing.
Once a bug report is formed, a human triager assigns the bug to a developer, who will try to fix it. This developer is recorded in an item called assigned-to. The assigned-to item changes to another developer if the previously assigned developer cannot fix the bug. The process of assigning the correct developer to fix a bug is called bug triage. Bug triage is one of the most time-consuming steps in handling bugs in software projects. Manual bug triage by a human triager is time-consuming and error-prone, because the number of daily bugs is large and no developer has knowledge of all bugs. As a result, manual bug triage leads to expensive time loss, high cost, and low accuracy.
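The assignment and reassignment flow described above can be sketched as a small data structure. This is only an illustration: the class name `BugReport`, its fields other than the assigned-to item, and the status labels are assumptions, not taken from any particular bug tracking system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BugReport:
    """Hypothetical, simplified bug report record."""
    bug_id: int
    summary: str                        # textual description of the bug
    status: str = "NEW"                 # assumed labels: NEW, ASSIGNED, FIXED
    assigned_to: Optional[str] = None   # developer currently responsible
    history: List[str] = field(default_factory=list)  # earlier assignees

    def assign(self, developer: str) -> None:
        """Assign or reassign the bug, remembering the previous assignee."""
        if self.assigned_to is not None:
            self.history.append(self.assigned_to)
        self.assigned_to = developer
        self.status = "ASSIGNED"

    def mark_fixed(self) -> None:
        """Record that the current assignee has fixed the bug."""
        if self.assigned_to is None:
            raise ValueError("a bug must be assigned before it can be fixed")
        self.status = "FIXED"


report = BugReport(bug_id=1, summary="Crash when saving a file")
report.assign("alice")   # the triager's first choice
report.assign("bob")     # alice could not fix it, so the bug is reassigned
report.mark_fixed()
print(report.assigned_to, report.history, report.status)
```

Every reassignment costs time, which is one reason an accurate first assignment matters.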
The information stored in bug reports poses two main challenges: first, the large scale of the data, and second, the low quality of the data. Because of the large number of bugs reported daily, the number of bug reports in the repository keeps growing. Noisy and redundant bugs degrade the quality of bug reports. Bug fixing is a significant and time-consuming process in software maintenance. For a large-scale software project, the number of daily bugs is so large that it is impossible to handle them without delay. The work of managing bugs increases the cost of software quality maintenance. Many software projects use a bug tracking system to store and manage bugs submitted by users, including end users, testers, and developers.
The bug tracking system provides a platform where users can communicate with each other during the bug fixing process. Such bug tracking systems are used by many large open source software projects. With a bug tracking system, developers can easily search and maintain all the existing bugs. Bug triage, an important step in bug fixing, assigns a new bug to a relevant developer for further handling. A general method for bug triage is to assign bugs manually. In practice, because software development teams change frequently, it is difficult to identify the correct developer in manual triage. 37 bugs per day are submitted to the bug tracking system, and 3 person-hours per day are required for manual triage.
Normally, companies have to maintain bugs properly, and great care must be taken in maintaining and resolving them. Redundant data increases the cost of data processing and bug triage. Each bug is assigned to a particular developer to fix, so the time taken increases. Low-quality bugs decrease the effectiveness of bug fixing in software development.
This paper addresses the problem of data reduction for bug triage, i.e., how to reduce the bug data to save the labor cost of developers and improve its quality to facilitate the process of bug triage. The solution presented in this paper is intended to be used by all the developers in the company so that the bug triage process is carried out effectively. The system uses a classification and prediction algorithm to reduce the data in the bug sets, and its performance improves on that of the existing system.
II. Related Works
A. Data Set Preprocessing
Bug data records the textual description for reproducing the bug and is updated according to the status of bug fixing. A bug repository provides a data platform to support many types of tasks on bugs, such as bug localization and reopened bug analysis.
The bug data set supports information collection and assists developers in handling bugs. It is prepared and stored by all the developers when they face complex bugs.
Instance selection and feature selection are widely used techniques in data processing. For a given data set in a certain application, instance selection obtains a subset of relevant instances (i.e., bug reports in bug data), while feature selection obtains a subset of relevant features (i.e., words in bug data). In our work, we employ the combination of instance selection and feature selection.
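A minimal sketch of the two techniques on a toy data set follows. The source does not name the concrete algorithms here, so the sketch uses stand-ins that are assumptions: a simple word score (reports from the word's dominant developer minus reports from everyone else) for feature selection, and condensed nearest neighbour with Jaccard similarity for instance selection. The toy reports and developer names are made up.

```python
from collections import Counter, defaultdict

# Toy bug data: (words in the report, developer who fixed it). Values are made up.
reports = [
    ({"crash", "save", "file"}, "alice"),
    ({"crash", "save", "dialog"}, "alice"),
    ({"crash", "open", "file"}, "alice"),
    ({"slow", "render", "page"}, "bob"),
    ({"slow", "render", "image"}, "bob"),
    ({"render", "page", "blank"}, "bob"),
]


def select_features(data, k):
    """Feature selection: keep the k words most tied to a single developer."""
    per_dev = defaultdict(Counter)
    total = Counter()
    for words, dev in data:
        for w in words:
            per_dev[w][dev] += 1
            total[w] += 1
    # Score = reports from the dominant developer minus reports from all others.
    score = {w: 2 * max(per_dev[w].values()) - total[w] for w in total}
    ranked = sorted(total, key=lambda w: (-score[w], w))
    keep = set(ranked[:k])
    return [(words & keep, dev) for words, dev in data]


def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_instances(data):
    """Instance selection: condensed nearest neighbour (expects non-empty data)."""
    store = [data[0]]
    changed = True
    while changed:
        changed = False
        for item in data:
            if item in store:
                continue
            words, dev = item
            nearest = max(store, key=lambda s: jaccard(words, s[0]))
            if nearest[1] != dev:   # the current store misclassifies this report
                store.append(item)
                changed = True
    return store


reduced = select_instances(select_features(reports, k=4))
vocabulary = set().union(*(words for words, _ in reduced))
print(len(reports), "->", len(reduced), "reports;", len(vocabulary), "words kept")
```

On this toy input the six reports shrink to two, one per developer, described by four words. Real bug data would need a more careful score and similarity measure.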
B. Classification and Prediction
Applying instance selection to the data set can reduce the number of bug reports, but the accuracy of bug triage may decrease; applying feature selection can reduce the number of words in the bug data, and the accuracy can increase. Combining both techniques can increase the accuracy as well as reduce the number of bug reports and words, thereby reducing the scale of the bug data.
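To make the notion of triage accuracy concrete, the sketch below predicts a developer from the words of a report. The source does not name the classifier; a multinomial Naive Bayes with add-one smoothing is assumed here purely for illustration, and the class name `NaiveBayesTriager` and the training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTriager:
    """Multinomial Naive Bayes over report words (an assumed classifier)."""

    def fit(self, reports):
        self.dev_counts = Counter()               # reports per developer
        self.word_counts = defaultdict(Counter)   # word counts per developer
        self.vocab = set()
        for words, dev in reports:
            self.dev_counts[dev] += 1
            self.word_counts[dev].update(words)
            self.vocab.update(words)
        return self

    def predict(self, words):
        total_reports = sum(self.dev_counts.values())
        best_dev, best_score = None, -math.inf
        for dev, n in self.dev_counts.items():
            score = math.log(n / total_reports)   # log prior
            dev_total = sum(self.word_counts[dev].values())
            for w in words:
                if w not in self.vocab:
                    continue                      # ignore words never seen in training
                # Add-one (Laplace) smoothed log likelihood of the word.
                count = self.word_counts[dev][w] + 1
                score += math.log(count / (dev_total + len(self.vocab)))
            if score > best_score:
                best_dev, best_score = dev, score
        return best_dev


# Made-up training reports: (words, developer who fixed the bug).
training = [
    (["crash", "save", "file"], "alice"),
    (["crash", "open", "file"], "alice"),
    (["slow", "render", "page"], "bob"),
    (["render", "page", "blank"], "bob"),
]
triager = NaiveBayesTriager().fit(training)
print(triager.predict(["crash", "file", "toolbar"]))
print(triager.predict(["render", "blank"]))
```

Accuracy can then be measured as the fraction of held-out reports for which the predicted developer matches the recorded one, before and after data reduction.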
C. Data Reduction
Data reduction is mainly used for:
1) Reducing the data scale.
2) Improving the accuracy of bug triage.
• Bug Dimension
The aim of bug triage is to assign developers for bug fixing. Once a developer is assigned to a new bug report, the developer can examine historically fixed bugs to form a solution to the current bug report. For example, historical bugs are checked to detect whether the new bug is a duplicate of an existing one; moreover, existing solutions to bugs can be searched and applied to the new bug. Thus, we consider reducing duplicate and noisy bug reports to decrease the number of historical bugs.
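The duplicate check mentioned above can be sketched as a similarity search over historical reports. The source does not say how duplicates are detected, so the Jaccard similarity, the 0.6 threshold, the function names, and the sample reports are all assumptions.

```python
import re


def tokens(text):
    """Lowercase a report and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_duplicate(new_text, historical, threshold=0.6):
    """Return the id of the most similar historical report, or None if none is close enough."""
    new_tokens = tokens(new_text)
    best_id, best_sim = None, 0.0
    for bug_id, text in historical.items():
        sim = jaccard(new_tokens, tokens(text))
        if sim > best_sim:
            best_id, best_sim = bug_id, sim
    return best_id if best_sim >= threshold else None


# Made-up historical reports keyed by bug id.
historical = {
    101: "Application crashes when saving a large file",
    102: "Page renders blank after login",
}
print(find_duplicate("Application crashes when saving a file", historical))  # 101
print(find_duplicate("Toolbar icons are missing", historical))               # None
```

A report flagged this way can be linked to the existing bug and its solution instead of being stored as a new historical entry.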
• Word Dimension
By removing uninformative words, feature selection improves the accuracy of bug triage. We use feature selection to remove noisy or duplicate words in a data set. Based on feature selection, the reduced data set can be handled more easily.
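One simple way to remove noisy or duplicate words is a document-frequency filter, sketched below. This is an assumed stand-in rather than the method of the paper: words that occur in only one report are treated as noise, words that occur in almost every report are treated as uninformative, and repeated words within a report are collapsed. The thresholds `min_df` and `max_df_ratio` are illustrative.

```python
from collections import Counter


def filter_words(reports, min_df=2, max_df_ratio=0.8):
    """Drop rare and near-universal words and remove repeats within each report."""
    # Document frequency: in how many reports does each word occur?
    df = Counter(w for words in reports for w in set(words))
    limit = max_df_ratio * len(reports)
    keep = {w for w, n in df.items() if min_df <= n <= limit}
    # dict.fromkeys removes duplicate words while preserving their order.
    return [[w for w in dict.fromkeys(words) if w in keep] for words in reports]


# Made-up tokenized reports; "blnk" stands for a typo that acts as noise.
raw = [
    ["the", "app", "crash", "crash", "save"],
    ["the", "app", "crash", "open"],
    ["the", "app", "render", "page"],
    ["the", "render", "page", "blnk"],
]
cleaned = filter_words(raw)
print(cleaned)
```

Here "the" is dropped because it appears in every report, while "save", "open", and "blnk" are dropped because each appears only once.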
LITERATURE SURVEY
• Review and Evaluation of Feature Selection Algorithms in Synthetic Problems[1]
The main purpose of feature subset selection is to find a reduced subset of attributes from a data set described by a feature set. A measure to evaluate feature selection algorithms (FSAs) is devised that computes the degree of matching between the output given by an FSA and the known optimal solution. An extensive experimental study on synthetic problems is carried out to assess the behavior of the algorithms in terms of solution accuracy and size as a function of the relevance, irrelevance, redundancy, and size of the samples.
• An Efficient Greedy Method for Unsupervised Feature Selection[2]
In data mining applications, data instances are typically described by a huge number of features. Most of these features are irrelevant or redundant, which negatively affects the efficiency and effectiveness of different learning algorithms. The selection of relevant features is a crucial task that can be used to allow a better understanding of the data or to improve performance.
• Reducing Features To Improve Code Change Based Bug Prediction[3]
This paper investigates multiple feature selection techniques that are generally applicable to classification-based bug prediction methods. The techniques discard less important features until optimal classification performance is reached. The total number of features used for training is substantially reduced.
• Reducing The Effort Of Bug Report Triage[4]
Although a bug repository can improve the software development process in a number of ways, reports added to the repository need to be triaged. To assist triagers with their work, this article presents a machine learning approach to create recommenders that assist with a variety of decisions.
CONCLUSION AND FUTURE WORK
Bug triage is an expensive step of software maintenance in both labor cost and time cost. In this paper, we combine feature selection with instance selection to reduce the scale of bug data sets as well as improve the data quality. To determine the order of applying instance selection and feature selection for a new bug data set, we extract attributes of each bug data set and train a predictive model based on historical data sets. We empirically investigate data reduction for bug triage in bug repositories. Our work provides an approach to leveraging data processing techniques to form reduced and high-quality bug data in software development and maintenance. In future work, we plan to improve the results of data reduction in bug triage, to explore how to prepare a high-quality bug data set, and to tackle a domain-specific software task. For predicting reduction orders, we plan to investigate the potential relationship between the attributes of bug data sets and the reduction orders.
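The idea of predicting the reduction order from data set attributes can be sketched as follows. Everything specific here is an assumption: the three attributes, the 1-nearest-neighbour model, the order labels "FS->IS" and "IS->FS", and the historical entries, which are made up for illustration and are not measurements.

```python
import math


def extract_attributes(data):
    """Describe a bug data set by a few simple attributes (the choice is assumed)."""
    n_reports = len(data)
    n_words = len({w for words, _ in data for w in words})
    n_devs = len({dev for _, dev in data})
    return (n_reports, n_words, n_devs)


def predict_order(attrs, history):
    """1-nearest-neighbour over historical data sets, using log-scaled attributes."""
    def dist(a, b):
        return math.sqrt(sum((math.log1p(x) - math.log1p(y)) ** 2
                             for x, y in zip(a, b)))
    nearest = min(history, key=lambda h: dist(attrs, h[0]))
    return nearest[1]


# Made-up history: (attributes of a past data set, order that worked best there).
history = [
    ((1000, 5000, 20), "FS->IS"),   # feature selection first
    ((200, 800, 5), "IS->FS"),      # instance selection first
]

new_data = [
    ({"crash", "save"}, "alice"),
    ({"render", "page"}, "bob"),
    ({"crash", "file"}, "alice"),
]
attrs = extract_attributes(new_data)
print(attrs, predict_order(attrs, history))
```

Any classifier could replace the nearest-neighbour step; the point is only that the order is chosen from attributes of the new data set rather than by trial on each one.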