25-10-2012, 12:38 PM
The Mercury System: Embedding Computation into Disk Drives
ABSTRACT
Inexpensive data storage has enabled the amassing of vast amounts of information. These data sets have now outgrown the
capacity of modern processors to examine them, so searching them has become a serious challenge. In a recent invited talk at the
High Performance Embedded Computing Workshop, John Reynders of Celera Genomics commented that, ‘The size of the
databases we deal with is no longer measured in terabytes, but in exabytes.’ The Mercury system is a prototype data search
engine that can be embedded within the disk drive itself. We focus on the specific problems associated with searches of
unstructured, unindexed data. Three specific applications are approximate matching of text (important for text searches
of interest to homeland security, where the original alphabet differs from the Latin alphabet and transliteration is
involved), genomics and proteomics searches (important biological applications), and image searches (also of significant
interest for homeland security).
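
To make "approximate matching of text" concrete, here is a minimal software sketch. It assumes edit distance (insertions, deletions, substitutions) as the similarity measure and uses the classic dynamic-programming scan over unindexed text. The abstract does not say which matching method Mercury uses, so this is only an illustration of the problem, not the system's implementation. The function name `approx_find` and the sample name spellings are hypothetical.

```python
def approx_find(pattern, text, max_errors):
    """Return the end offsets (exclusive) in `text` where `pattern`
    matches some substring with at most `max_errors` edits.

    Assumption: similarity is plain edit distance. This is an
    illustrative sketch, not the method used by the Mercury system.
    """
    m = len(pattern)
    # column[i] = fewest edits needed to match pattern[:i] against a
    # substring of the text ending at the current position.
    column = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        # column[0] stays 0 so that a match may begin anywhere in the text.
        prev_diag = column[0]
        for i in range(1, m + 1):
            prev_row = column[i]
            cost = 0 if pattern[i - 1] == ch else 1
            column[i] = min(
                prev_diag + cost,   # match or substitute
                prev_row + 1,       # extra character in the text
                column[i - 1] + 1,  # pattern character missing from the text
            )
            prev_diag = prev_row
        if column[m] <= max_errors:
            hits.append(j + 1)
    return hits


# Two transliterations of the same name differ by two substitutions,
# so a search that tolerates two errors still finds the name.
hits = approx_find("Qaddafi", "Colonel Gadhafi spoke", 2)
```

The scan reads every character of the text once and needs no index, which is the kind of streaming workload the abstract describes for unstructured, unindexed data.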