
Abstract—Because of the ever-increasing application of next-generation
sequencing (NGS) in research, and the expectation of
faster experiment turn-around, it is becoming unfeasible and
unscalable for analysis to be done exclusively by existing trained
bioinformaticians. Instead, researchers and bench biologists are
performing at least parts of most analyses. In order for this to be
realized, two conditions must be satisfied: (1) well-designed and
accessible tools need to be made available, and (2) researchers
and biologists need to be trained to use such tools in order to
confidently handle high volumes of NGS data. Bio-Linux is a
fully featured, powerful, configurable and easy to maintain
bioinformatics workstation and helps on both counts by offering
well over one hundred bioinformatics tools packaged into a single
distribution, easily accessible and readily usable. Bio-Linux is
also accessible in the form of virtual images or on the cloud, thus
providing researchers with immediate access to scalable compute
infrastructure required to run the analysis. Furthermore, this
paper discusses how bioinformatics training on Bio-Linux is
helping to bridge the data production and analysis gap.
Keywords—bioinformatics; next-generation sequencing;
training; cloud computing.
I. INTRODUCTION
A. The analysis and tools problem
Bioinformaticians are familiar with the current analysis and
tools problem in bioinformatics, exemplified by
the observation that a genome sequenced for $1,000 might
require $100,000 of analysis [1] in a current clinical
setting. Researchers, especially PhD candidates and postdocs,
find that they can run sequencing experiments quickly and cheaply,
but the effective analysis of the resulting data remains
hard. Fixed processing pipelines quickly become obsolete in
the face of new sequencing technologies, new lab protocols,
new questions to ask, new reference databases, and increasing
data volumes.
There is an ongoing need for a multitude of flexible, ready-to-use
tools, as well as a need for bioinformatics training that
empowers researchers with the software and the knowledge
to plan and perform analyses, creating the bioinformaticians of
the future. The Bio-Linux platform [2] provides researchers with an
easy way to: (1) set up a Linux-based bioinformatics
workstation and (2) get the tools and data installed and
configured on the system. Beyond that, performing
bioinformatics tasks remains difficult: one needs to
deal with errors and subtleties in the data and understand the tools,
as well as their strengths and weaknesses for a given problem. But
with the system set-up taken care of, the researcher is free to
focus on these problems.
