Databases in Bioinformatics 
 
Institute of Lifelong Learning, University of Delhi 
 
Subject : Bioinformatics
Lesson : Databases in Bioinformatics
Lesson Developer : Arun Jagannath
College/ Department : Department of Botany, University of Delhi
 
 
  
 
Table of Contents
Chapter: Databases in Bioinformatics
• Introduction
• Biological databases
• Classification of databases
  o Type of data/information
  o Source of data/information
• Biological database retrieval systems – Case studies
  o Identification and classification of databases
  o Retrieval of nucleotide sequences
  o Bibliographic databases
  o Whole genome sequence databases
  o Organism-specific databases
  o Gene expression databases
  o Protein databases
• Summary
• Exercises
• Glossary
• References

Introduction 
Living organisms have been studied at many levels, including structure
(morphology, anatomy), function (physiology, biochemistry), inheritance
(genetics), evolution and taxonomy. Over the last few decades, scientists have
also attempted to unravel the molecular basis of processes that are integral to
organism biology and diversity. These studies initially focused on relatively
simple organisms that came to be known as Model Organisms or Model Systems.
Such organisms span a wide range of life forms, from viruses and bacteria to
higher plants and animals. Notable examples include Drosophila, C. elegans,
Arabidopsis, mice, yeast and, more recently, Oryza sativa, Medicago and Lotus.
Molecular genetic studies on many of these life forms led to the development of
markers and linkage maps, which in turn facilitated whole-genome sequencing
programs to extract the encoded information (genome sequence) that supports
life. Subsequent analysis of gene function based on expression profiling
(transcriptome studies) and mutant analysis (functional genomics) contributed
further to our understanding of biological systems. Rapid developments in
sequencing chemistry ushered in an era of high-throughput genome and
transcriptome sequencing, which led to a virtual explosion of biological data
across the world and extended biological studies beyond the limits of "model
systems". Seminal developments in Bioinformatics centered mainly on the
development of databases, which functioned as electronic filing cabinets for
organizing and analyzing the large amounts of biological data generated from
such studies.
 
Biological Databases 
 
Biological databases serve a critical purpose in the collation and organization
of data related to biological systems. They provide computational support and a
user-friendly interface that allow a researcher to analyze biological data such
as gene and protein sequences and molecular structures. Computational tools and
techniques have also been used successfully for simulation studies on
biological macromolecules, their structures and interactions, and for molecular
modeling and drug design. These interdisciplinary areas have accumulated a
significant amount of data and are dealt with separately in later units of this
paper.
 
This lesson provides a brief overview of the different types/categories of
databases. It avoids, however, detailed descriptions that can be found in
several standard Bioinformatics textbooks or on the home pages of the various
databases. A few practice exercises for access and retrieval of information
are provided at the end of the lesson. Some of these exercises are supported
with step-by-step instructions for the benefit of beginners, while others are
to be completed by students on their own.
 
Questions:  
How would I know whether a database relevant to my interest/study exists or 
not?  
How can I be assured of the authenticity of the information available in any 
database?  
Answer: 
The journal Nucleic Acids Research (NAR) publishes, in its January issue every
year, a comprehensive compilation of peer-reviewed databases and online tools.
These issues can be accessed at http://nar.oxfordjournals.org/. Because each
entry has passed peer review, the databases listed there can be treated as
reliable sources, although peer review cannot guarantee that every record is
free of errors.
 
Classification of Biological Databases 
 
As mentioned earlier, the quantum of biological information available and its rate 
of increase have necessitated the creation of databases to collect and organize 
the data in a meaningful form. In order to maintain quality, improve accessibility 
of information and reduce redundancy, databases have been classified into 
different types.  
 
NOTE: 
The mode of database classification might vary in published literature. It is more 
important for a student/researcher to identify the information that he/she is 
searching for and attempt to access it from a relevant database rather than dwell 
upon its hierarchy. 
 
Two main approaches have been used to classify databases: 
Type of data/information 
In this mode of classification, databases are categorized based on the data type. 
A few examples are listed below. 
S. No. | Type of data | Example(s) | Weblinks
1. | Sequence of biomolecules viz., DNA, RNA, proteins | GenBank, EMBL, DDBJ, Swiss-Prot, PIR | (i) www.ncbi.nlm.nih.gov/genbank/ (ii) https://www.ebi.ac.uk/embl/ (iii) www.ddbj.nig.ac.jp/ (iv) http://web.expasy.org/docs/swiss-prot_guideline.html (v) http://pir.georgetown.edu/
2. | Bio-molecular structures | PDB | http://www.rcsb.org/pdb/home/home.do
3. | Bibliography/scientific literature ** | PubMed, Scopus (Search engine) | (i) www.ncbi.nlm.nih.gov/pubmed (ii) www.scopus.com
4. | Patent databases | USPTO | www.uspto.gov/
5. | Metabolic pathways / molecular interactions | KEGG | http://www.genome.jp/kegg/pathway.htm
6. | Gene expression profiles | eFP Browser | http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi
7. | Genetic disorders | OMIM | www.ncbi.nlm.nih.gov/omim
8. | Whole genome sequences | Entrez Genomes | www.ncbi.nlm.nih.gov/sites/entrez?db=genome
9. | Education | Teaching tools – Plant Cell | http://www.plantcell.org/site/teachingtools/teaching.xhtml
**: Some of the bibliographic databases/search engines require a subscription to
access their contents. The Delhi University Library System has procured online
subscriptions to several national/international journals of repute and to search
engines, such as Scopus, that are relevant to different disciplines.
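
Classification by data type amounts to a lookup from a category to the databases that hold that kind of data. The following is a minimal sketch of that idea in Python, using only the examples listed in the table above. The category keys (such as "sequence" or "pathway") and the function names are illustrative assumptions and are not part of the original lesson.

```python
# Illustrative lookup table: data type -> example databases from the table above.
# The short category keys are assumptions chosen for this sketch.
DATABASES_BY_DATA_TYPE = {
    "sequence": ["GenBank", "EMBL", "DDBJ", "Swiss-Prot", "PIR"],
    "structure": ["PDB"],
    "literature": ["PubMed", "Scopus"],
    "patent": ["USPTO"],
    "pathway": ["KEGG"],
    "expression": ["eFP Browser"],
    "genetic disorder": ["OMIM"],
    "genome": ["Entrez Genomes"],
    "education": ["Teaching tools - Plant Cell"],
}


def databases_for(data_type):
    """Return the example databases for a data type, or [] if it is unknown."""
    key = data_type.strip().lower()
    return list(DATABASES_BY_DATA_TYPE.get(key, []))


def data_type_of(database):
    """Reverse lookup: return the data type a database is listed under, or None."""
    wanted = database.strip().lower()
    for data_type, names in DATABASES_BY_DATA_TYPE.items():
        if any(name.lower() == wanted for name in names):
            return data_type
    return None
```

For instance, `databases_for("structure")` returns the single entry listed for bio-molecular structures, and `data_type_of("KEGG")` returns the category key under which KEGG was filed.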
  
Question: 
Is it necessary to remember the website addresses of databases? 
Answer: 
No. It is easier to access a database through its published reference or by
searching for its home page with a search engine such as Google.
  
Source of data/information 
 

FAQs on Lecture 2 - Databases in Bioinformatics (PART 1) - Bioinformatics - Botany

1. What is the importance of databases in bioinformatics?
Ans. Databases play a crucial role in bioinformatics by providing a centralized repository of biological data. They allow researchers to store, retrieve, and analyze large amounts of biological information, such as DNA sequences, protein structures, and gene expression data. This enables the discovery of new insights, the identification of patterns, and the development of computational tools and algorithms in various fields of biological research.
2. What types of data are typically stored in bioinformatics databases?
Ans. Bioinformatics databases store a wide range of biological data, including DNA and RNA sequences, protein sequences and structures, gene annotations, genomic variations, metabolic pathways, and gene expression profiles. These databases often integrate data from multiple sources and provide standardized formats for easy access and analysis by researchers.
3. How are bioinformatics databases organized?
Ans. Bioinformatics databases are organized in a hierarchical manner to facilitate data retrieval and analysis. At the top level, there are general databases that cover a wide range of biological information, such as GenBank and UniProt. These are further subdivided into specialized databases that focus on specific types of data, such as the Protein Data Bank (PDB) for protein structures or the Gene Expression Omnibus (GEO) for gene expression data. Within each specialized database, data is typically organized into tables or files, with specific fields or columns representing different attributes or properties of the biological entities being stored.
4. How can bioinformatics databases be accessed and queried?
Ans. Bioinformatics databases can be accessed and queried through various methods. Many databases provide web interfaces that allow users to search for specific data using keywords, sequence patterns, or other criteria. Some databases also offer programmatic access through application programming interfaces (APIs), which allow researchers to retrieve data directly into their own software or scripts. Additionally, some databases provide downloadable data files that can be imported into local bioinformatics tools for further analysis.
5. Are bioinformatics databases freely available to the public?
Ans. Many bioinformatics databases are freely available to the public, as they are funded by government agencies or research institutions with the aim of promoting open access to biological data. Examples of freely accessible databases include GenBank, UniProt, and Ensembl. However, there are also some databases that require a subscription or fee for full access to certain features or datasets. Nonetheless, the majority of essential biological data can be accessed without any cost, facilitating research and collaboration in the field of bioinformatics.
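
The programmatic access mentioned in FAQ 4 can be sketched for one concrete case: retrieving a nucleotide sequence in FASTA format through the NCBI E-utilities `efetch` endpoint. This is a minimal sketch, not the lesson's own method. The function names are hypothetical, the choice of the `nuccore` database is an assumption, and the accession passed in is a placeholder that must be replaced with a real one. NCBI also asks clients to limit their request rate, which this sketch does not handle.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Build an efetch URL requesting one record as plain text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EUTILS_EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_fasta(accession):
    """Download and parse one record (requires network access)."""
    with urlopen(build_efetch_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Keeping URL construction and parsing separate from the network call means both can be checked offline. Only `fetch_fasta` contacts the server.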