01-08-2012, 04:33 PM
Efficient Deduplication Techniques for Modern Backup Operation
INTRODUCTION
Motivation
The recent introduction of digital TV, digital camcorders, and other communication technologies has rapidly accelerated the growth of data maintained in digital form. In 2007, for the first time ever, the total volume of digital content exceeded the global storage capacity, and it was estimated that by 2011 only half of all digital information would be stored [1]. Further, the volume of automatically generated information now exceeds the volume of human-generated digital information [1]. Compounding the problem of storage space, digitized information has a more fundamental weakness: it is more vulnerable to error than information in legacy media such as paper, books, and film. When data is stored in a computer storage system, a single storage error or power failure can put a large amount of information in danger. To protect against such problems, a number of technologies have been used to strengthen the availability and reliability of digital data, including mirroring, replication, and parity information. At the application layer, the administrator replicates the data onto additional copies called "backups" so that the original information can be restored in case of data loss.
Related Work
There are broadly three approaches for reducing the size of information: delta encoding, duplicate elimination, and compression. These techniques are used independently or in combination to improve space efficiency and network bandwidth utilization. Delta encoding stores only the differences between successive versions of the data. It is a common and efficient way to reduce redundancy when changes are small, and it is used in many applications including source control [2] and backup [3]. Kulkarni et al. [4] proposed redundancy elimination at the block level (REBL), which combines block suppression, delta encoding, and compression.
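
As a concrete illustration (not code from the paper), the sketch below stores a revised version as copy references into a base version plus the literal bytes that changed; the function names are invented for the example.

[code]
# A minimal sketch of delta encoding over two in-memory versions.
import difflib

def make_delta(base, revised):
    # Keep references into `base` for unchanged regions; store only the
    # literal content of regions that differ.
    delta = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, base, revised).get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))            # reference into the base
        else:
            delta.append(("insert", revised[j1:j2]))  # new/changed data only
    return delta

def apply_delta(base, delta):
    # Rebuild the revised version from the base plus the stored delta.
    return "".join(base[op[1]:op[2]] if op[0] == "copy" else op[1] for op in delta)

base = "the backup stores full copies of the data"
revised = "the backup stores incremental copies of the data"
delta = make_delta(base, revised)
assert apply_delta(base, delta) == revised  # small change, small delta
[/code]

When changes are small, the delta consists mostly of short copy references, which is why delta encoding works well in the version-control and backup settings cited above.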
SYSTEM OVERVIEW
System Organization
PRUNE (Prompt Redundancy Elimination) is designed for a distributed environment where backups are located at a remote site. We use the terms "client" and "server" for the location of the original data to be backed up and the location of the backup files, respectively. Deduplication consists of three components: chunking, fingerprint generation, and redundancy detection.
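
As a rough illustration of how these three components fit together, the Python sketch below chunks a byte stream, fingerprints each chunk, and stores only chunks whose fingerprints have not been seen before. It shows the general deduplication flow only; the 8 KB fixed chunk size and the SHA-1 fingerprint are assumptions for the example, not PRUNE's actual design.

[code]
# A minimal sketch of the deduplication flow, not PRUNE's protocol.
import hashlib

CHUNK_SIZE = 8 * 1024  # assumed fixed chunk size for the example

def backup(data, store):
    # Returns a "recipe" of fingerprints; only previously unseen chunks
    # are added to the store (and, in a real system, sent to the server).
    recipe = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]    # 1. chunking
        fp = hashlib.sha1(chunk).hexdigest()  # 2. fingerprint generation
        if fp not in store:                   # 3. redundancy detection
            store[fp] = chunk
        recipe.append(fp)
    return recipe

def restore(recipe, store):
    return b"".join(store[fp] for fp in recipe)

store = {}
backup(b"A" * 20000, store)                        # stores 2 unique chunks
recipe = backup(b"A" * 20000 + b"B" * 100, store)  # adds only 1 new chunk
assert restore(recipe, store) == b"A" * 20000 + b"B" * 100
[/code]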
Chunking Module
Chunking is the operation of scanning a file and partitioning it into pieces. Each piece is called a chunk and is the unit of redundancy detection. There are two types of chunking: fixed-size chunking and variable-size chunking. With fixed-size chunking, a file is partitioned into fixed-size units, e.g., 8 KB blocks. Fixed-size chunking is conceptually simple and fast. However, it has an important drawback: when a small amount of data is inserted into or deleted from a file, an entirely different set of chunks is generated from the updated file, because every chunk boundary after the edit point shifts. To address this problem, variable-size chunking, also known as content-based chunking, has been proposed [8]; a sketch follows below.
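
The following is a minimal sketch of content-based chunking, using a simple polynomial rolling hash as a stand-in for the Rabin fingerprinting typically used for this purpose; the window size, minimum chunk size, and boundary mask are illustrative values, not parameters from the paper. Because a cut point depends only on the bytes inside the rolling window, inserting or deleting a few bytes shifts boundaries only near the edit, and the chunks further along the file are reproduced unchanged.

[code]
# A minimal sketch of variable-size (content-based) chunking using a
# simple polynomial rolling hash; all constants are illustrative.
import os

WINDOW = 48       # bytes in the rolling window
MIN_CHUNK = 2048  # suppress cut points that would be too close together
MASK = 0x1FFF     # cut when the low 13 hash bits are zero (~8 KB average)
PRIME = 1000003
MOD = (1 << 61) - 1
POW_OUT = pow(PRIME, WINDOW - 1, MOD)  # weight of the byte leaving the window

def chunks(data):
    h, start = 0, 0
    for i, b in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * POW_OUT) % MOD  # drop oldest byte
        h = (h * PRIME + b) % MOD                       # add newest byte
        # A boundary is declared purely from the window contents, so an
        # edit shifts cut points only near the edit itself.
        if i + 1 - start >= MIN_CHUNK and (h & MASK) == 0:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]

data = os.urandom(200_000)
before = set(chunks(data))
after = set(chunks(data[:500] + b"X" + data[500:]))  # insert one byte
print(len(before & after), "of", len(before), "chunks unchanged after the edit")
[/code]

Under fixed-size chunking, the same one-byte insert would shift every chunk boundary after the edit point and defeat redundancy detection for the rest of the file.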