13-10-2012, 05:32 PM
TriangleOS Virtual File- and Database System
Introduction
This document provides information about the new TriangleOS Storage Layer.
Instead of replacing a filesystem, it aims at replacing the Virtual File System with
the “Virtual File- & Database System” (VFDBS).
The introduction briefly describes the general idea in a non-technical and informal
way. Other parts of the document go into the technical details.
Why a new system?
Problems with the current system
Where did I leave this file? Does it even exist? “Save as...”, how am I going to
save this file to make sure I can still find it next month? When did I send that email
about my vacation? What is this file for?
These are a few of the questions an everyday computer user would probably
recognize. Organising computer files can be a difficult and time-consuming task.
Often users just let their files pile up in one place because they don’t know where
to store them permanently (the desktop is a very popular place for this and often
gets very crowded). This works for a while, since the average user is quite
good at remembering which files they were working on. But as the number of
files increases and other files become ‘older’, the directories become
cluttered and we lose files. Even if we frequently (re-)organise all
our files, we sometimes forget what ordering we used. A photo (file) of a volcano
could have been put in C:\Photos\Vacation, but also in C:\Photos\Volcanoes, or
even in C:\E-Mail\Sent\Photos\Volcanoes. We can only find this file if we
remember what the file was called or where we put it, or if we use a tool to
traverse all directories (which can take a while). Even if a fast tool is available,
we can’t do much more than track down the file. We still can’t query the data, as in
“Show a list of all artists of whom there is music on my computer” or “Display all
phone numbers of all contacts I sent an e-mail to yesterday”. If we want to know
more about the data, it usually takes a while to find out. This problem keeps
growing as storage capacity and the number of files increase. So instead
of searching through boxes full of papers, we now have to search through virtual
boxes of virtual paper (files), of which we have far more than we ever had of real paper.
Existing ways of searching
The current paradigm of storing files in directories (also called “folders”) is called
a hierarchical system. The files and directories are stored in a tree. To find a file,
we start at the root, choose a directory, choose another directory inside it, and
so on until the file is found in the current directory. A number of
programs automate this search, but they usually only work when the
filename is already known.
To help users organise their files, central directories were introduced, each
meant to hold all files of a certain kind. These directories have straightforward
names such as “My Documents”, “My Music”, etc. A list of applications can usually
be found in some kind of menu. These folders usually don’t offer much more than
a place to store the files. Quickly reorganising the files by using different
queries (like ‘Show all documents written by J. Janssen’, ‘Show a menu with all
3D-shooter games’, ‘A list of music per genre’) often isn’t possible. Another
problem is that most applications take no notice of the existence of such
directories, which makes the interface very inconsistent.
Recently, searching for files has been discussed more and more, and new tools
that make searching easier have emerged. Most of these tools, however.
Replacing the hierarchical system
With the increase of the amount of data, it’s obvious another system is needed to
present all kinds of data in a well-ordered way. Saving the documents in a tree
structure isn’t always natural.
As an example, take a document called X. There are certain things we associate
with this file, just as we do with objects in the real world (airplane: wings, engines,
airport, air, transportation; we are not likely to immediately see it as
world.machine.transport.air.airplane).
Components
The VFDBS consists of different layers or components. These components are:
Block Drivers, Cache, File System Driver, Journaling, SQL, Querying, File
Operations and Legacy Functions. Those last three components are considered to
be the highest layer, and all their functions have the prefix vfdbs-.
Block Cache
All disk I/O goes through the cache. A special flag can be set to indicate that a
certain block is not to be cached and has to be written to or read from the disk
directly. The cache uses a hash table together with an LRU and an MRU list to
provide fast (O(1)) lookups of the stored blocks. The flush thread performs
background disk flushes of dirty blocks (i.e. blocks that have been modified since
they were last written to disk). The flusher ‘glues’ blocks with consecutive block
numbers together so that the disk driver can write entire ‘tracks’ at the same
time, which speeds things up considerably. The device driver provides the
necessary information about the optimal length of coalesced block groups
(usually the length of a track) to the cache layer.
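The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TriangleOS implementation: the class name BlockCache, its methods, the use of a plain dict as a stand-in for the block driver, and the single ordered map replacing the separate hash table and LRU/MRU lists are all hypothetical.

```python
from collections import OrderedDict


class BlockCache:
    """Toy write-back block cache (hypothetical names, not the VFDBS API)."""

    def __init__(self, disk, capacity, max_run):
        self.disk = disk              # dict: block number -> bytes, stands in for the block driver
        self.capacity = capacity      # maximum number of cached blocks
        self.max_run = max_run        # optimal coalesced length, as reported by the device driver
        self.blocks = OrderedDict()   # hashed lookup; ordering runs from LRU to MRU
        self.dirty = set()            # blocks modified since their last write to disk
        self.writes = []              # log of (first_block, block_count) writes issued

    def read(self, n):
        if n in self.blocks:          # O(1) average lookup
            self.blocks.move_to_end(n)
            return self.blocks[n]
        data = self.disk[n]
        self._insert(n, data)
        return data

    def write(self, n, data, cached=True):
        if not cached:                # "do not cache" flag: go straight to disk
            self.disk[n] = data
            self.writes.append((n, 1))
            self.blocks.pop(n, None)  # drop any stale cached copy
            self.dirty.discard(n)
            return
        self._insert(n, data)
        self.dirty.add(n)

    def _insert(self, n, data):
        self.blocks[n] = data
        self.blocks.move_to_end(n)
        while len(self.blocks) > self.capacity:
            victim, vdata = self.blocks.popitem(last=False)  # evict the LRU block
            if victim in self.dirty:                          # write it back first
                self.disk[victim] = vdata
                self.writes.append((victim, 1))
                self.dirty.discard(victim)

    def flush(self):
        """Write dirty blocks, gluing consecutive block numbers into runs."""
        run = []
        for n in sorted(self.dirty):
            if run and (n != run[-1] + 1 or len(run) == self.max_run):
                self._write_run(run)
                run = []
            run.append(n)
        if run:
            self._write_run(run)
        self.dirty.clear()

    def _write_run(self, run):
        for n in run:
            self.disk[n] = self.blocks[n]
        self.writes.append((run[0], len(run)))
```

With a capacity of 8 and max_run of 4, writing blocks 3, 4, 5 and 9 and then calling flush() issues two disk writes in this sketch: one run of three blocks starting at block 3, and a single write for block 9. In a real system flush() would run in the background flush thread rather than being called by hand.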
Journaling
When looking at figure 3 you probably noticed a separate component for
Journaling. In the TriangleOS VFDBS journaling is no longer a feature of the
filesystem, but has become an extra layer in the system, used by the other system
components. This has a number of advantages:
a) This guarantees that all SQL transactions are atomic.
b) Journaling is easily added to existing file systems.
c) No duplicated code for journaling filesystems.
d) All disk-write operations in every part of the OS are atomic.
e) Fast track-writing by using the cache flusher.
f) Easy to keep a log of all disk activities.
For example, the fat-fs driver listed in table 2 is also a journaling filesystem. If
the system isn’t properly shut down while writing data to a FAT partition, the
FAT partition remains consistent. Naturally this also holds for trianglefs. No more
disk checks ever!
Filesystem
The file system driver contains file system specific implementations for functions
like open, read, write, seek, etc. Two extra functions to deal with journaling are
added: jcontrol() and jopen(). These functions are called by the journaling layer
(see chapter Journaling). These functions remain very small since the real work is
done by the journaling layer, but there has to be some way for the filesystem to
specify what the journaling file looks like and where it is stored on the disk.
Consistency checks
At startup, the mount function checks for hot journal entries (i.e. journal
entries having status JOURNAL_ST_HOT). For all such entries, the transaction is
played back, causing it to be committed. Once the transaction has been
committed, the journal entry is set to JOURNAL_ST_FREE again. The
filesystem is now consistent again. This is a fully automated process that
runs quickly during start-up.
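The replay step can be illustrated with a small Python sketch. Only the status names JOURNAL_ST_HOT and JOURNAL_ST_FREE come from the text; their numeric values, the JournalEntry class, the two functions, and the use of a dict as the disk are assumptions made for illustration, and the real journal layout is defined by each filesystem driver.

```python
JOURNAL_ST_FREE = 0   # numeric values are assumed, only the names appear in the text
JOURNAL_ST_HOT = 1


class JournalEntry:
    """One journal slot: a status plus the block writes of a transaction."""

    def __init__(self):
        self.status = JOURNAL_ST_FREE
        self.writes = []              # list of (block number, data)


def journaled_write(journal, disk, writes, crash_after=None):
    """Log the transaction as hot, apply it, then free the entry.

    crash_after simulates a power loss after that many block writes.
    Raises StopIteration if the journal has no free entry (sketch only).
    """
    entry = next(e for e in journal if e.status == JOURNAL_ST_FREE)
    entry.writes = list(writes)
    entry.status = JOURNAL_ST_HOT
    for i, (n, data) in enumerate(entry.writes):
        if crash_after is not None and i == crash_after:
            return False              # simulated crash: the entry stays hot
        disk[n] = data
    entry.status = JOURNAL_ST_FREE
    return True


def replay_hot_entries(journal, disk):
    """Mount-time check: play back every hot entry, then mark it free."""
    replayed = 0
    for entry in journal:
        if entry.status == JOURNAL_ST_HOT:
            for n, data in entry.writes:
                disk[n] = data        # rewriting an already written block is harmless
            entry.status = JOURNAL_ST_FREE
            replayed += 1
    return replayed
```

Because playing back a transaction only rewrites whole blocks with the logged data, it does not matter how far the original write got before the crash; after replay_hot_entries() every logged transaction is fully on disk and all entries are free again.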
Database Structure
As explained in the introduction, every location has its own database. This
chapter will describe the lay-out of the filesystem database.
There are three important tables which define the relations between “files” (also
called objects) and their meta-data: Keywords, Categories and Relation. The
Volume Information table contains some general info about the volume. Rights
specifies the rights for a certain object, and Object models the objects
themselves. See the following list for an overview of all
tables, their function and properties:
Volume Information
Contains general volume information. A volume normally only has one entry in
this table. The UniqueVolId is needed to define relations, and is generated at the
creation of the volume or database.
A volume also has other properties such as a size, a volume name, a description and
maybe even other keywords like Owner, Password, etc. There is always a special
Object with ID 1 that describes the volume in greater detail than just the unique
number. So the volume can, like any other file, have keywords and a wide
range of other attributes. It can even have relations to other objects (including
files, other volumes, etc.).
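To make the idea of the volume being described by Object 1 concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for the VFDBS SQL layer. Only the table names, UniqueVolId and the special object ID 1 come from the text; every other column name, the helper functions, and the choice of a random UUID as the volume identifier are assumptions.

```python
import sqlite3
import uuid


def create_volume(name):
    """Create an in-memory volume database with one Volume Information row
    and the special Object with ID 1 that describes the volume."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE VolumeInformation (UniqueVolId TEXT PRIMARY KEY);
        CREATE TABLE Object (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
        CREATE TABLE Keywords (
            ObjectID INTEGER NOT NULL REFERENCES Object(ID),
            Keyword  TEXT NOT NULL,
            Value    TEXT
        );
    """)
    # The unique volume ID is generated when the volume/database is created.
    db.execute("INSERT INTO VolumeInformation VALUES (?)", (str(uuid.uuid4()),))
    db.execute("INSERT INTO Object (ID, Name) VALUES (1, ?)", (name,))
    return db


def add_keyword(db, object_id, keyword, value):
    db.execute("INSERT INTO Keywords VALUES (?, ?, ?)", (object_id, keyword, value))


def keywords_of(db, object_id):
    """Return all keywords of an object as a dict."""
    rows = db.execute(
        "SELECT Keyword, Value FROM Keywords WHERE ObjectID = ? ORDER BY Keyword",
        (object_id,),
    )
    return dict(rows.fetchall())


db = create_volume("Data")
add_keyword(db, 1, "Owner", "alice")
add_keyword(db, 1, "Description", "Backup disk")
volume_keywords = keywords_of(db, 1)
```

The volume's keywords are stored exactly like those of any other object, which is what lets a volume be queried and related to other objects in the same way as a file.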