11-02-2013, 02:56 PM
Labeling Dynamic XML Documents: An Order-Centric Approach
1Labeling Dynamic.pdf (Size: 1.34 MB / Downloads: 19)
Abstract
Dynamic XML labeling schemes have important applications in XML Database Management Systems. In this paper, we
explore dynamic XML labeling schemes from a novel order-centric perspective. We compare the various labeling schemes proposed in
the literature with a special focus on their orders of labels. We show that the order of labels fundamentally impacts the update
performance of a labeling scheme and develop an order-based framework to classify and characterize XML labeling schemes.
Although there are dynamic XML labeling schemes that can completely avoid relabeling, the gain in update performance all come with
considerable costs such as larger label size and lower query performance, even if the XML documents are hardly updated. We
introduce vector order which is the foundation of the dynamic labeling schemes we propose. Compared with previous solutions that are
based on natural order or lexicographical order, vector order is a simple, yet most effective solution to process updates in XML DBMS.
We show that vector order can be gracefully applied to both range-based and prefix-based labeling schemes with little overhead
introduced. Moreover, vector order-based labeling schemes are not only efficient to process, but also resilient to skewed insertions.
Qualitative and experimental evaluations confirm the benefits of our approach compared to previous solutions.
INTRODUCTION
THE increasing importance of XML data management has
led to extensive research on native XML storage and
query support. The challenge of managing XML data comes
from its ordered tree-structured model which provides rich
semantic content and is essential for querying XML data at a
fine level of granularity. Labeling schemes encode both
order and structural information (e.g., Parent/Child, Ancestor/
Descendant) of the XML tree into extremely compact
labels, which is a widely adopted approach for XML query
processing and has received a lot of research attention.
Existing XML labeling schemes are usually classified into
two categories: range-based [13], [18], [14], [1] and prefixbased
[7], [2], [15], [12], [10]. Both range-based and prefixbased
labeling schemes provide fine-grained labeling of
XML data and are adopted by an array of applications.
Road Map
In Section 2, we describe the various labeling schemes that
have been proposed in the literature. We compare and
contrast the orders adopted by these labeling schemes
which play an important part in their behaviors for both
update and query processing. We show the limitations of
these labeling schemes which motivate the introduction of
our vector order in Section 3. We present range-based and
prefix-based labeling schemes based on vector order in
Sections 4 and 5. Qualitative comparisons are presented in
Section 6 which characterize our labeling schemes and
existing ones under a unified framework. Experimental
results in Section 7 demonstrate the benefits of our labeling
scheme and therefore justify the use of vector order. We
conclude the paper in Section 8.
V-Pre/Post Labeling Scheme
In V-Pre/post labeling scheme, insertion of a new node A
can be processed based on two principles: 1) the pre of A
should be between the pres of two nodes that immediately
precede and follow A during preorder traversal (if they
exist) and 2) the post of A should be between the posts of
two nodes that immediately precede and follow A during
postorder traversal (if they exist).
CONCLUSION
In this paper, we study the problem of designing dynamic
XML labeling scheme from a novel order-centric approach.
A framework has been developed to characterize different
labeling schemes and provide insight into their behaviors
for updates, paving the way for future studies on this topic.
After pointing out the limitations of existing labeling
schemes, we propose vector order to solve this problem.