28-06-2014, 10:52 AM
Seminar Report Motivation
Abstract
Ext3 has been the most widely used general-purpose Linux filesystem for many years.
In keeping with increasing disk capacities and state-of-the-art feature requirements,
the next generation of the ext3 filesystem, ext4, was created in 2006. This new
filesystem incorporates scalability and performance enhancements for supporting large
filesystems, while maintaining reliability and stability. Ext4 will be suitable for a
large variety of workloads and is expected to replace ext3 as the "Linux filesystem".
Introduction
The ext4 or fourth extended filesystem is a journaling file system developed as the
successor to ext3.
It was born as a series of backward-compatible extensions to ext3, intended to
remove the storage limits imposed by its 32-bit block numbers and to add other
performance improvements. However, other Linux kernel developers opposed accepting
the extensions into ext3 for stability reasons and proposed to fork the source code
of ext3, rename it ext4, and do all the development there, without affecting current
ext3 users. This proposal was accepted, and in June 2006 Theodore Ts’o, the ext3
maintainer, announced the new development plan for ext4.
A preliminary development snapshot of ext4 was included in version
2.6.19 of the Linux kernel. On 11 October 2008, the patches that mark ext4 as stable
code were merged in the Linux 2.6.28 source code repositories, denoting the end of
the development phase and recommending ext4 adoption. Kernel 2.6.28, containing
the ext4 filesystem, was finally released on 25 December 2008.
Motivation for ext4
Ext3 has been a very popular Linux filesystem due to its reliability, rich feature
set, relatively good performance, and strong compatibility between versions. The
conservative design of ext3 has given it the reputation of being stable and robust, but
has also limited its ability to scale and perform well on large configurations.
With the pressure of increasing capabilities of new hardware and
on-line resizing support in ext3, the requirement to address ext3 scalability and
performance is more urgent than ever. One of the outstanding limits faced by ext3 today
is the 16TB maximum filesystem size. Enterprise workloads are already approaching
this limit, and with disk capacities doubling every year and 1TB hard disks easily
available in stores
Features
The ext4 filesystem is the evolution of ext3 - more scalable and better performing.
Features include:
Large filesystem
The current 16TB filesystem size limit is caused by the 32-bit block numbers in ext3.
To enlarge the filesystem limit, the straightforward method is to increase the number
of bits used to represent block numbers and then fix all the references to data and
meta-data blocks.
Previously, there was an extents[2] patch for ext3 with the capacity to support
48-bit physical block numbers. In ext4, instead of just extending the block numbers
to 64 bits based on the current ext3 indirect block mapping, the ext4 developers
decided to use extent mapping with 48-bit block numbers. This both increases
filesystem capacity and improves large-file efficiency. With 48-bit block numbers,
ext4 can support a maximum filesystem size of 2^(48+12) = 2^60 bytes (1 EB) with a
4KB block size.
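The two limits quoted above follow directly from the block-number width. A minimal sketch of the arithmetic, assuming the 4KB block size used in the text (the helper name `max_fs_bytes` is made up for illustration):

```python
# Back-of-the-envelope capacity check, assuming a 4 KB (2**12 byte) block size.
BLOCK_SIZE = 4096


def max_fs_bytes(block_number_bits: int, block_size: int = BLOCK_SIZE) -> int:
    """Largest addressable filesystem size for a given block-number width."""
    return (2 ** block_number_bits) * block_size


ext3_limit = max_fs_bytes(32)  # 2**32 blocks * 2**12 bytes = 2**44 bytes
ext4_limit = max_fs_bytes(48)  # 2**48 blocks * 2**12 bytes = 2**60 bytes

print(ext3_limit // 2**40, "TB")  # 16 TB
print(ext4_limit // 2**60, "EB")  # 1 EB
```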
After changing the data block numbers to 48-bit, the next step was to correct the
references to meta-data blocks correspondingly. Meta-data is present in the
super-block, the group descriptors, and the journal. New fields have been added at
the end of the super-block structure to store the most significant bits for the
block-counter variables s_free_blocks_count, s_blocks_count, and s_r_blocks_count,
extending them to 64 bits.
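One way to picture this extension is a legacy 32-bit field paired with a new field holding the most significant bits. The sketch below only illustrates that idea. The function names are hypothetical and it does not reproduce the kernel's actual structures:

```python
from typing import Tuple

MASK_32 = 0xFFFFFFFF


def split_hi_lo(value: int) -> Tuple[int, int]:
    """Split a 64-bit counter into (low 32 bits, high 32 bits)."""
    return value & MASK_32, (value >> 32) & MASK_32


def combine_hi_lo(lo: int, hi: int) -> int:
    """Rebuild the 64-bit counter from the legacy field and its high-order companion."""
    return (hi << 32) | lo


# A block count that no longer fits in 32 bits: 2**40 + 5 blocks.
lo, hi = split_hi_lo(2**40 + 5)
print(lo, hi)                  # 5 256
print(combine_hi_lo(lo, hi))   # 1099511627781
```

A filesystem small enough for 32-bit counts simply has a high part of zero, so the legacy field keeps its old meaning.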
Extents
The ext3 filesystem uses an indirect block mapping scheme providing one-to-one
mapping from logical blocks to disk blocks. This scheme is very efficient for sparse
or small files, but has high overhead for larger files, performing poorly especially
on large file delete and truncate operations.
Extent mapping efficiently maps logical to physical blocks for large contiguous
files. An extent is a single descriptor that represents a range of contiguous
physical blocks. Figure 2.1 shows the extents structure. A single extent can
represent 2^15 contiguous blocks, or 128MB with a 4KB block size. The MSB of the
extent length is used to flag uninitialized extents for the preallocation feature.
Four extents can be stored directly in the ext4 inode structure. This is generally
sufficient to represent small or contiguous files. For very large, highly fragmented,
or sparse files, more extents are needed. In this case a constant-depth extent tree
is used to store the extent map of a file. Figure 2.2 shows the layout of the extents
tree. The root of this tree is stored in the ext4 inode structure, and the extents
themselves are stored in the leaf nodes of the tree.
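To make the contrast with one-to-one block mapping concrete, the sketch below coalesces a per-block map into extents and then resolves a logical block through them. It is a simplified model, not the on-disk format. The `Extent` type and both function names are assumptions made for illustration, and the 2^15 cap is taken from the text:

```python
from typing import List, NamedTuple, Optional

MAX_EXTENT_LEN = 2 ** 15  # blocks per extent, as stated in the text


class Extent(NamedTuple):
    logical: int   # first logical block covered by this extent
    physical: int  # first physical block of the contiguous run
    length: int    # number of contiguous blocks


def build_extents(physical_blocks: List[int]) -> List[Extent]:
    """Coalesce a per-block map (list index = logical block) into extents."""
    extents: List[Extent] = []
    for logical, phys in enumerate(physical_blocks):
        if extents:
            last = extents[-1]
            contiguous = phys == last.physical + last.length
            if contiguous and last.length < MAX_EXTENT_LEN:
                extents[-1] = last._replace(length=last.length + 1)
                continue
        extents.append(Extent(logical, phys, 1))
    return extents


def lookup(extents: List[Extent], logical: int) -> Optional[int]:
    """Translate a logical block number to a physical one, or None if unmapped."""
    for ext in extents:
        if ext.logical <= logical < ext.logical + ext.length:
            return ext.physical + (logical - ext.logical)
    return None


# 13 blocks in two contiguous runs need 13 one-to-one entries but only 2 extents.
block_map = list(range(1000, 1010)) + list(range(5000, 5003))
extents = build_extents(block_map)
print(len(block_map), len(extents))  # 13 2
print(lookup(extents, 11))           # 5001
```

A real extent tree would replace the linear scan in `lookup` with a search through index nodes. The list is kept flat here only to keep the example short.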
Large files
In Linux, file size is calculated based on the i_blocks counter value. However, the
unit is sectors (512 bytes), rather than the filesystem block size (4096 bytes by
default). Since ext4's i_blocks is a 32-bit variable in the inode structure, this
limits the maximum file size in ext4 to 2^32 * 512 bytes = 2^41 bytes = 2 TB. This is
a scalability limit that ext3 has planned to break for a while.
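The 2 TB figure is simply the counter width multiplied by the counting unit. A short check of that arithmetic (the function name is illustrative only):

```python
SECTOR_SIZE = 512  # bytes; the unit in which i_blocks is counted, per the text


def max_file_bytes(counter_bits: int = 32, unit: int = SECTOR_SIZE) -> int:
    """Largest file size representable by a counter of the given width and unit."""
    return (2 ** counter_bits) * unit


limit = max_file_bytes()
print(limit == 2**41)        # True
print(limit // 2**40, "TB")  # 2 TB
```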
The solution for ext4 is quite s
Large number of files
Some applications already create billions of files today, and even ask for support for
trillions of files. In theory, the ext4 filesystem can support billions of files with 32-bit
inode numbers. However, in practice, it cannot scale to this limit. This is because
ext4, following ext3, still allocates inode tables statically. Thus, the maximum number
of inodes has to be fixed at filesystem creation time. To avoid running out of inodes
later, users often choose a very large number of inodes up front. The consequence
is that unnecessary disk space has to be allocated to store unused inode structures.
The wasted space becomes more of an issue in ext4 with its larger default inode size.
This also makes the management and repair of large filesystems more difficult than it
should be.
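A rough estimate shows how quickly statically allocated inode tables add up. The 128-byte and 256-byte inode sizes below are assumptions (commonly cited defaults for ext3 and ext4). The text itself says only that the ext4 default is larger:

```python
def inode_table_bytes(inode_count: int, inode_size: int) -> int:
    """Disk space reserved for inode tables, whether or not the inodes are used."""
    return inode_count * inode_size


ONE_BILLION = 10**9

# Assumed inode sizes: 128 bytes (ext3-style) and 256 bytes (ext4-style).
for size in (128, 256):
    reserved = inode_table_bytes(ONE_BILLION, size)
    print(size, "byte inodes:", reserved // 10**9, "GB reserved")
# 128 byte inodes: 128 GB reserved
# 256 byte inodes: 256 GB reserved
```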
Block allocation enhancements
Increased filesystem throughput is the premier goal for all modern filesystems. In
order to meet this goal, developers are constantly attempting to reduce filesystem
fragmentation. High fragmentation rates cause greater disk access time affecting
overall throughput, and increased metadata overhead causing less efficient mapping.
There is an array of new features in line for ext4, which take advantage of the
existing extents mapping and are aimed at reducing filesystem fragmentation by
improving block allocation techniques.
Persistent preallocation
Some applications, like databases and streaming media servers, benefit from the
ability to preallocate blocks for a file up-front (typically extending the size of
the file in the process), without having to initialize those blocks with valid data
or zeros. Preallocation helps ensure contiguous allocation as far as possible for a
file (irrespective of when and in what order data actually gets written) and
guaranteed space allocation for writes within the preallocated size. It is useful
when an application has some foreknowledge of how much space the file will require.
The filesystem internally interprets the preallocated but not yet initialized
portions of the file as zero-filled blocks. This avoids exposing stale data for each
block until it is explicitly initialized through a subsequent write. Preallocation
must be persistent across reboots, unlike ext3 and ext4 block reservations.
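From an application's point of view, preallocation can be requested through the POSIX `posix_fallocate` call, which Python exposes as `os.posix_fallocate` on Unix systems. The sketch below is one possible usage and is not taken from the report. The `preallocate` helper and the file name are hypothetical, and whether the space is reserved without writing zeros depends on the underlying filesystem:

```python
import os
import tempfile


def preallocate(path: str, size: int) -> None:
    """Reserve `size` bytes for `path` without the caller writing any data."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)  # offset 0, length `size`
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "prealloc.dat")
        preallocate(target, 1 << 20)  # reserve 1 MiB
        print(os.path.getsize(target))
        with open(target, "rb") as handle:
            # Preallocated but unwritten ranges read back as zeros.
            print(handle.read(16) == bytes(16))
```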
Delayed and multiple block allocation
The block allocator in ext3 allocates one block at a time during the write operation,
which is inefficient for larger I/O. Since block allocation requests are passed through
the VFS layer one at a time, the underlying ext3 filesystem cannot foresee and cluster
future requests. This also increases the possibility of file fragmentation.
Delayed allocation is a well-known technique in which block allocations are
postponed to page flush time, rather than during the write() operation. This method
provides the opportunity to combine many block allocation requests into a single
request, reducing possible fragmentation and saving CPU cycles. Delayed allocation
also avoids unnecessary block allocation for short-lived files.
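The effect of postponing allocation can be shown with a toy model. This is a deliberately simplified sketch, not how the kernel implements it, and the class and method names are invented for illustration. Writes only record dirty blocks, and allocation requests are issued once per file at flush time:

```python
from typing import Dict, List, Tuple


class DelayedAllocator:
    """Toy model: write() buffers dirty blocks; blocks are allocated at flush()."""

    def __init__(self) -> None:
        self.dirty: Dict[str, int] = {}             # file name -> buffered blocks
        self.requests: List[Tuple[str, int]] = []   # allocation requests issued

    def write(self, name: str, blocks: int = 1) -> None:
        # No allocation happens here; the write is only recorded.
        self.dirty[name] = self.dirty.get(name, 0) + blocks

    def unlink(self, name: str) -> None:
        # A file deleted before flush never triggers an allocation.
        self.dirty.pop(name, None)

    def flush(self) -> None:
        # One multi-block request per file instead of one request per write.
        for name, blocks in self.dirty.items():
            self.requests.append((name, blocks))
        self.dirty.clear()


alloc = DelayedAllocator()
for _ in range(8):
    alloc.write("log")      # eight single-block writes
alloc.write("scratch")      # short-lived file
alloc.unlink("scratch")
alloc.flush()
print(alloc.requests)       # [('log', 8)]
```

Eight writes collapse into a single eight-block request, and the short-lived file is never allocated at all.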
Migration tool
Ext3 developers worked to maintain backwards compatibility between ext2 and ext3,
a characteristic users appreciate and depend on. While ext4 attempts to retain
compatibility with ext3 as much as possible, some incompatible on-disk layout
changes are unavoidable. Even with these changes, users can still easily upgrade
their ext3 filesystem to ext4, just as is possible from ext2 to ext3. There are
methods available for users to try new ext4 features immediately, or to migrate their
entire filesystem to ext4 without requiring backup and restore.
Upgrading from ext3 to ext4
There is a simple upgrade solution for ext3 users to start using extents and some
ext4 features without requiring a full backup or migration. By mounting an existing
ext3 filesystem as ext4 (with extents enabled), any new files are created using extents,
while old files are still indirect block mapped and interpreted as such. A flag in the
inode differentiates between the two formats, allowing both to coexist in one ext4
filesystem. All new ext4 features based on extents, such as preallocation and multiple
block allocation, are available to the new extents files immediately.
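The coexistence of both formats comes down to a per-inode flag check. A minimal sketch follows, using the value the Linux kernel headers define for `EXT4_EXTENTS_FL`. The function itself is illustrative only:

```python
# "Inode uses extents" bit, as defined in the Linux kernel's ext4 headers.
EXT4_EXTENTS_FL = 0x00080000


def mapping_format(i_flags: int) -> str:
    """Report which block-mapping format an inode's flags select."""
    return "extents" if i_flags & EXT4_EXTENTS_FL else "indirect"


print(mapping_format(0x00000000))  # indirect (file created under ext3)
print(mapping_format(0x00080000))  # extents  (file created after mounting as ext4)
```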
Downgrading from ext4 to ext3
Though not as straightforward as ext3 to ext4, there is a path for any user who may
want to downgrade from ext4 back to ext3. In this case the user would remount the
filesystem with the noextents mount option, copy all files to temporary files, and
rename those files over the originals. After all files have been converted back to
indirect block mapping format, the INCOMPAT_EXTENTS flag must be cleared using
tune2fs, and the filesystem can be remounted as ext3.
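The copy-and-rename step can be sketched for a single file as follows. This covers only that one step, under the assumption that the filesystem is already mounted with noextents so the new copy is written in indirect-mapped format. The helper name is hypothetical, and the sketch does not preserve ownership or hard links:

```python
import os
import shutil
import tempfile


def rewrite_in_place(path: str) -> None:
    """Copy `path` to a temporary file in the same directory, then rename it
    over the original so the file's blocks are freshly allocated."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # copies data, mode bits and timestamps
        os.replace(tmp_path, path)    # rename over the original
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)       # do not leave a stray temporary file
        raise
```

Keeping the temporary file in the same directory ensures the final rename stays within one filesystem.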
Linux filesystems
Difference between ext2, ext3, and ext4
The ext2, ext3, and ext4 file systems are a family of file systems with a high
degree of backward and forward compatibility. In fact, they can be considered a
single filesystem format with a number of feature extensions, and ext2, ext3, and
ext4 are merely the names of the implementations found in the Linux kernel. This way
of looking at things is supported by the fact that they share the same userspace
utilities (e2fsprogs), and that many filesystems can be mounted using more than one
of these implementations. For example, a filesystem created for use with ext3 can be
mounted using either ext2 or ext4. However, a filesystem with ext4-specific
extensions cannot be mounted using ext2 or ext3, and the ext3 and ext4 filesystem
code in the kernel (at least as of this writing) requires the presence of a journal,
which is generally not present in partitions formatted for use by the ext2 file
system.
Conclusion
As we have discussed, the new ext4 filesystem brings many new features and
enhancements to ext3, making it a good choice for a variety of workloads. A
tremendous amount of work has gone into bringing ext4 to Linux, with a busy roadmap
ahead to finalize ext4 for production use. What was once essentially a simple
filesystem has become an enterprise-ready solution, with a good balance of
scalability, reliability, performance and stability. Soon, the ext3 user community
will have the option to upgrade their filesystem and take advantage of the newest
generation of the ext family.