25-01-2013, 03:00 PM
Enhancing History-Based Concern Mining With Fine-Grained Change Analysis
Abstract—
Maintenance of large software projects is often hindered
by cross-cutting concerns scattered over multiple modules.
History-based mining techniques have been proposed to mitigate
the difficulty by examining changes related to methods/functions
in development history to suggest potential concerns. However,
the techniques do not cope well with renamed entities and may
lead to irrelevant information about concerns. The intricate
procedures of the methods also make the results difficult for
others to reproduce, utilize or improve.
In this paper, we reinforce history-based concern mining
techniques with fine-grained change analysis based on tree
differencing on abstract syntax trees. Source code changes are
recorded as facts over source code regions according to the
RDF (Resource Description Framework) data model so that the
analysis can be performed in terms of factbase queries.
To show the capability of the method, we report on an
experiment that emulates the state-of-the-art concern mining
technique called COMMIT using our own change analysis tool
called Diff/TS. A comparative case study on several open source
projects written in C and Java shows that our technique improves
results and overcomes the language barrier in the analysis.
INTRODUCTION
Maintenance of large software projects is often hindered by
cross-cutting concerns scattered over multiple modules such
as caching, logging, and authentication. History-based mining
techniques have been proposed to mitigate the difficulty by
examining changes related to methods/functions in development
history to suggest potential concerns. However, the
techniques do not carry out detailed change analysis to track
renamed source code entities across versions and may report
irrelevant methods/functions as part of candidate concerns.
In addition, the intricate procedures of history-based
methods make the results difficult for others to reproduce,
utilize or improve. One has to collect change histories from
source code management systems such as CVS and SVN and
process source files to extract changes related to methods/functions
before applying one's own technique for concern
mining. It is a demanding task considering the volume of
changes involved in a large project and the precision needed
for analyzing the source code. The difficulty leads to limited
use and segregation of methods, since preparing the tools needed
for target projects and programming languages takes too much
effort from developers and maintainers. In this regard, we
believe that generation and analysis of facts must be separated
in a way that third-party users can access the database of facts
(called the factbase) with reasonable costs of initial learning.
In this paper, we propose a method for recording source
code changes as facts over textual regions according to the
RDF (Resource Description Framework) data model to allow
analysis to be performed in terms of factbase queries. We apply
the method to history-based concern mining by reinforcing the
techniques with a fine-grained change analysis based on tree
differencing on abstract syntax trees. To show the capability
of the method, we report on an experiment that emulates the
state-of-the-art concern mining technique called COMMIT [1]
using our own change analysis tool called Diff/TS [2]. A
comparative case study on several open source projects written
in C and Java shows that our technique improves the results
and overcomes the language barrier in the analysis.
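As a rough illustration of why syntax-level comparison helps with renamed
entities, the sketch below pairs functions that disappear from one version
with functions that appear in the next when their parameter lists and bodies
have identical abstract syntax trees. It is not the Diff/TS algorithm: it
uses Python's standard ast module as a stand-in for C and Java parsers, it
requires exact structural equality where real tree differencing tolerates
edits, and the names function_shapes and detect_renames are hypothetical.

```python
import ast


def function_shapes(source):
    """Map each top-level function name to a name-independent AST signature."""
    shapes = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            # Dump parameters and body only, so the function's own name
            # does not take part in the comparison.
            parts = [ast.dump(node.args)]
            parts.extend(ast.dump(stmt) for stmt in node.body)
            shapes[node.name] = "|".join(parts)
    return shapes


def detect_renames(old_source, new_source):
    """Pair removed and added functions whose ASTs are structurally equal."""
    old = function_shapes(old_source)
    new = function_shapes(new_source)
    removed = {name: shape for name, shape in old.items() if name not in new}
    added = {name: shape for name, shape in new.items() if name not in old}
    renames = {}
    for old_name, shape in removed.items():
        for new_name, new_shape in added.items():
            if shape == new_shape:
                renames[old_name] = new_name
                break  # assumption: take the first structural match
    return renames


OLD_VERSION = "def log_msg(m):\n    print(m)\n\ndef keep(x):\n    return x\n"
NEW_VERSION = "def write_log(m):\n    print(m)\n\ndef keep(x):\n    return x + 1\n"

RENAMES = detect_renames(OLD_VERSION, NEW_VERSION)
```

A name-based comparison would report log_msg as deleted and write_log as
added. Comparing the trees lets the two be treated as one entity whose
history continues across the rename.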
To summarize, our contribution is twofold:
• allowing history-based analysis to be performed at the
level of factbase queries, independent of particular programming
languages and models of dependencies, and
• improving history-based concern mining techniques by
integrating a fine-grained change analysis into the factbase.
The rest of the paper is organized as follows. Section II
explains the method of fine-grained change analysis and
construction of the factbase before overviewing the mining technique
of COMMIT in Section III. The emulation of COMMIT
in our setting including the enhancement in change analysis
is presented in Section IV and the result of an experiment
is reported in Section V. After related work is reviewed in
Section VI, we conclude in Section VII with future plans.
FINE-GRAINED CHANGE ANALYSIS AND FACTBASE
Our aim is to facilitate collaborative efforts in software evolution
analysis in a way that is independent of specific tools,
platforms, protocols and languages. This led us to the idea
of representing facts about source code as properties/relations
on textual regions in source files. This is probably the most
universal and independent way of sharing and exchanging
information about source code, since any tool can point to the
code regions of its interest and users normally comprehend
source code by browsing its text.
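To make the idea concrete, here is a minimal sketch of a factbase whose
subjects are textual regions, stored as subject-predicate-object triples in
the spirit of the RDF data model. Everything specific in it is an assumption
made for illustration: the region identifier scheme, the ex: predicates, the
file path and offsets, and the helper names are hypothetical and are not
taken from Diff/TS. Plain strings stand in for RDF IRIs and literals.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]


def region(path: str, start: int, end: int) -> str:
    """Identify a textual region by file path and character offsets."""
    return f"region:{path}#{start}-{end}"


def match(facts: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in facts
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def renamed_entities(facts: Iterable[Triple]) -> List[Tuple[str, str]]:
    """Query: list (old name, new name) for every recorded rename."""
    facts = list(facts)
    pairs = []
    for old, _, new in match(facts, p="ex:renamedTo"):
        old_name = match(facts, s=old, p="ex:name")[0][2]
        new_name = match(facts, s=new, p="ex:name")[0][2]
        pairs.append((old_name, new_name))
    return pairs


# Hypothetical facts about one function in two versions of a file.
old_region = region("src/cache.c", 120, 245)
new_region = region("src/cache.c", 130, 260)

FACTS = {
    (old_region, "ex:inVersion", "v1"),
    (old_region, "ex:isA", "ex:FunctionDefinition"),
    (old_region, "ex:name", "cache_get"),
    (new_region, "ex:inVersion", "v2"),
    (new_region, "ex:isA", "ex:FunctionDefinition"),
    (new_region, "ex:name", "cache_lookup"),
    (old_region, "ex:renamedTo", new_region),
}

RENAMED = renamed_entities(FACTS)
```

Because the query only touches triples, a third party could run it without
the parser or differencing tool that produced the facts, which is the
separation of fact generation from analysis argued for above.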