22-12-2012, 03:31 PM
Digital Libraries and Data Warehouses
INTRODUCTION
Two other systems frequently described in the context of information
retrieval are Digital Libraries and Data Warehouses (or DataMarts). There is significant overlap between these two systems and an Information Storage and
Retrieval System. All three systems are repositories of information and their
primary goal is to satisfy user information needs. Information retrieval dates
back at least to Vannevar Bush’s 1945 article "As We May Think" (Bush-45), which set the stage for
many concepts in this area. Libraries have been in existence since the beginning of
writing and have served as a repository of the intellectual wealth of society. As
such, libraries have always been concerned with storing and retrieving information
in the medium in which it was created. As the quantity of information grew exponentially,
libraries were forced to make maximum use of electronic tools to facilitate the
storage and retrieval process. With the worldwide interconnection of libraries and
information sources (e.g., publishers, news agencies, wire services, radio
broadcasts) via the Internet, more focus has been placed on the concept of an electronic
library. Between 1991 and 1993 this area attracted significant interest
because of U.S. Government and private funding for making more
information available in digital form (Fox-93). During this time the terminology
evolved from electronic libraries to digital libraries. As the Internet continued its
exponential growth and project funding became available, interest in Digital
Libraries grew. By 1995 enough research and pilot efforts had started to
support the First ACM International Conference on Digital Libraries (Fox-96).
There remains significant discussion about what a digital library is.
Everyone starts with the metaphor of the traditional library. The question is how
the traditional library functions change as they migrate to supporting a digital
collection. Since the collection is digital and there is a worldwide communications
infrastructure available, the library no longer must own a copy of information as
long as it can provide access. The existing quantity of hardcopy material
guarantees that we will not have all-digital libraries for at least another generation
of technology improvements. But there is no question that libraries have started
and will continue to expand their focus to digital formats. With direct electronic
access available to users, the social aspects of congregating in a library and learning
from librarians, friends and colleagues will be lost, and new electronic collaboration
equivalents will come into existence (Wiederhold-95).
A digital library is a library in which collections are stored in digital formats (as opposed to print, microform, or other media) and are accessible via computers.[1] The digital content may be stored locally or accessed remotely via computer networks. A digital library is a type of information retrieval system.
In the context of the DELOS, a Network of Excellence on Digital Libraries, and DL.org, a Coordination Action on Digital Library Interoperability, Best Practices and Modelling Foundations, Digital Library researchers and practitioners produced a Digital Library Reference Model[2][3] which defines a digital library as: "A potentially virtual organisation, that comprehensively collects, manages and preserves for the long depth of time rich digital content, and offers to its targeted user communities specialised functionality on that content, of defined quality and according to comprehensive codified policies."[4]
The first use of the term digital library in print may have been in a 1988 report to the Corporation for National Research Initiatives.[5] The term digital libraries was first popularized by the NSF/DARPA/NASA Digital Libraries Initiative in 1994.[6] These efforts draw heavily on "As We May Think" by Vannevar Bush (1945), which set out a vision not in terms of technology, but of user experience.[7] The term virtual library was initially used interchangeably with digital library, but is now primarily used for libraries that are virtual in other senses (such as libraries which aggregate distributed content).
Description of and pseudocode for the search algorithm
The above example contains all the elements of the algorithm. For the moment, we assume the existence of a "partial match" table T, described below, which indicates where we need to look for the start of a new match in the event that the current one ends in a mismatch. The entries of T are constructed so that if we have a match starting at S[m] that fails when comparing S[m + i] to W[i], then the next possible match will start at index m + i - T[i] in S (that is, T[i] is the amount of "backtracking" we need to do after a mismatch). This has two implications: first, T[0] = -1, which indicates that if W[0] is a mismatch, we cannot backtrack and must simply check the next character; and second, although the next possible match will begin at index m + i - T[i], as in the example above, we need not actually check any of the T[i] characters after that, so that we continue searching from W[T[i]]. The following is a sample pseudocode implementation of the KMP search algorithm.
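Since the pseudocode itself is not reproduced here, the following is a minimal Python sketch of the search as described above, not the original listing. The function names build_table and kmp_search, the choice to return -1 when no match is found, the rejection of an empty pattern, and the table-building routine (which the text assumes rather than describes) are all illustrative assumptions.

```python
def build_table(w: str) -> list[int]:
    """Build a partial-match table T for pattern w (assumed helper).

    T[0] is -1; for i >= 1, T[i] is the length of the longest proper
    prefix of w that is also a suffix of w[:i].
    """
    if not w:
        raise ValueError("pattern must be non-empty")  # assumption of this sketch
    t = [0] * len(w)
    t[0] = -1
    pos, cnd = 2, 0  # pos: entry being computed; cnd: current candidate prefix length
    while pos < len(w):
        if w[pos - 1] == w[cnd]:
            # The candidate prefix extends by one character.
            cnd += 1
            t[pos] = cnd
            pos += 1
        elif cnd > 0:
            # Fall back to a shorter candidate prefix.
            cnd = t[cnd]
        else:
            # No prefix matches here.
            t[pos] = 0
            pos += 1
    return t


def kmp_search(s: str, w: str) -> int:
    """Return the index in s where the first match of w starts, or -1."""
    t = build_table(w)
    m = 0  # start of the current match attempt in s
    i = 0  # position of the current character in w
    while m + i < len(s):
        if w[i] == s[m + i]:
            if i == len(w) - 1:
                return m  # the whole pattern matched
            i += 1
        elif t[i] > -1:
            # Mismatch: the next possible match starts at m + i - T[i],
            # and its first T[i] characters are already known to match.
            m = m + i - t[i]
            i = t[i]
        else:
            # T[0] == -1: no backtracking possible, move to the next character.
            m += 1
            i = 0
    return -1  # assumption: -1 signals "not found"


# kmp_search("ABC ABCDAB ABCDABCDABDE", "ABCDABD") returns 15
```

Returning -1 on failure mirrors the convention of Python's built-in str.find; a different sentinel, such as the length of S, would work equally well as long as callers agree on it.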