Biological sequence database: NCBI
Subject : Bioinformatics
Lesson : Biological sequence database: National Center for Biotechnology
Information (NCBI)
Lesson Developer : Sandip Das
College/ Department: Department of Botany, University of Delhi
Table of Contents
Chapter: Biological sequence database: National Center for
Biotechnology Information (NCBI)
• Introduction
• Databases at NCBI
• Literature
• Bookshelf
• PubMed
• Nucleic Acid
• dbEST
• dbGSS
• PopSet
• dbGaP
• dbVar
o Genome
o Taxonomy
o PubChem
o Expression analysis
o Protein
• Summary
• Exercise/ Practice
• Glossary
• References/ Bibliography/ Further Reading
National Center for Biotechnology Information (NCBI)
NCBI has emerged as the primary free-to-access source of data and analysis tools in the
field of computational biology. Free access is possible because funding and publication
policies in most countries require researchers to deposit data generated with public funds
into a free-to-access central repository. In return, the repository (such as NCBI or EMBL)
assigns the data a unique identification number, often termed an accession number, which
can also be used to identify the depositor and
several other features. The following section introduces you to a variety of databases
dealing with a wide range of disciplines. Note that although the data are organized
separately for the sake of simplicity and clarity, all the databases are in reality
interlinked, and a user can navigate from one to another. The databases are also associated
with their appropriate analysis tools.
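As an illustration of how an accession number identifies a record, the following minimal sketch builds a request to EFetch, one of NCBI's Entrez Programming Utilities (E-utilities), which is not covered in this lesson. The function names `build_efetch_url` and `fetch_record` are hypothetical helpers written for this example, and the accession `NM_000546` is used only as an illustrative value.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the EFetch utility (part of NCBI's E-utilities).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nucleotide", rettype="fasta"):
    """Return the EFetch URL for one record, identified by its accession number.

    Hypothetical helper for this lesson; the defaults are assumptions.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_record(accession, db="nucleotide", rettype="fasta"):
    """Download the record as text. Requires network access."""
    with urlopen(build_efetch_url(accession, db, rettype)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only the URL is printed here; call fetch_record() to download the record.
    print(build_efetch_url("NM_000546"))
```

Pasting the printed URL into a web browser should return the same record that `fetch_record` would download.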
The following section lists some of the databases that have been created at NCBI. For the
sake of simplicity, the databases in this lesson have been divided into three sections:
section I deals with publications, literature and small-scale DNA/RNA sequencing projects;
section II deals with whole genomes, epigenomes, genome maps, taxonomy and chemical
structures; and section III deals with resources for RNA and protein that are required for
“functional genomics”. These sections, marked I, II and III, will be dealt with in their
respective chapters.
Databases-I:
Literature (PubMed, PubMed Central, NCBI Bookshelf):
DNA and RNA (RefSeq, Nucleotide, EST, GSS, WGS, PopSet, Trace Archive, SRA):
Databases-II:
Genomes (Map Viewer, Genome Workbench, Plant Genome Central, Genome Reference Consortium,
Epigenomics, Genomic Structural Variation):
Maps:
Taxonomy:
PubChem Substance:
Databases-III:
Expression analysis (GEO):
Proteins (Reference sequences, GenPept, UniProt/SwissProt, PRF, PDB, Protein clusters,
Structure, UniGene, CDD):
Entrez is the single-point database search and retrieval system that allows a user to
search and retrieve records from “all” databases or from a “specific” database in an
interlinked manner.
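A search against a “specific” database can also be run programmatically. The following is a minimal sketch, assuming NCBI's ESearch utility (part of the E-utilities, which this lesson does not otherwise describe) and its JSON output. The helper names `build_esearch_url` and `fetch_ids`, the default database and the example query are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import urlopen

# Base address of the ESearch utility (part of NCBI's E-utilities).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nucleotide", retmax=5):
    """Return an ESearch URL that queries a single Entrez database.

    Hypothetical helper: 'db' names the database to search, 'term' is the
    query text, and 'retmax' limits how many record IDs are returned.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def fetch_ids(term, db="nucleotide", retmax=5):
    """Run the search and return the list of matching record IDs.

    Requires network access.
    """
    with urlopen(build_esearch_url(term, db, retmax)) as response:
        result = json.load(response)
    return result["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only the URL is printed here; call fetch_ids() to run the search.
    print(build_esearch_url("Arabidopsis thaliana[Organism]", db="protein"))
```

Changing the `db` argument (for example to `pubmed` or `protein`) directs the same query to a different database.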
Figure: Various databases at NCBI can be accessed through the Entrez portal
Source: http://www.ncbi.nlm.nih.gov/sites/gquery
The National Center for Biotechnology Information (NCBI) site is conveniently organized into
four major, interlinked domains:
1. Databases,
2. Tools,
3. Data submission and
4. Education
The following figure depicts the interlinked nature of these domains. The page shown in the
figure can be reached as follows:
1. Open the NCBI page by typing www.ncbi.nlm.nih.gov in the web browser.
2. Click the “search” button on the home page without entering any keyword.
3. In the top left-hand corner of the webpage, click on “site map” to reach the
page.
Figure: Various databases are organized into four major domains and are interlinked
Source: http://www.ncbi.nlm.nih.gov/guide/sitemap/
Databases of NCBI
The following section introduces you to some of the databases at NCBI.
Databases-I:
Literature (PubMed, PubMed Central, NCBI Bookshelf):
DNA and RNA (RefSeq, Nucleotide, EST, GSS, WGS, PopSet, Trace Archive, SRA):
Literature:
• Bookshelf provides free access and allows users to browse and retrieve a wealth of
information in life sciences and healthcare. The information may be in the form of
books, documents and policy information from various government agencies and
publishers. The Bookshelf titles are organized by subject, by type or by publisher
in a searchable or browsable format.