Lecture 13 - Biological Sequence Databases Protein Information Resource (PIR) | Bioinformatics - Botany PDF Download

Biological Sequence Databases: Protein Information Resource (PIR)

Institute of Lifelong Learning, University of Delhi

Subject: Bioinformatics
Lesson: Biological Sequence Databases: Protein Information Resource (PIR)
Lesson Developer: Suman Sharma
College/Department: Department of Botany, Ramjas College, University of Delhi

Table of Contents

Chapter 1: Protein Information Resource (PIR)
• Introduction
• Features of PIR
    o Classification
    o Non-Redundancy
    o Standardized Annotation
    o Cross Reference
    o Comprehensiveness
    o Regular releases with free accessibility
    o Retrieval of information from the site
• Database organization and annotation
    o PIR-International sequences and auxiliary database
• PIR Resources
    o Data Retrieval system
    o Databases in PIR
• Summary
• Exercises
• Glossary
• Suggested Reading

Introduction
The rapid increase in the number of genome sequencing projects has generated an enormous
amount of molecular data. To make full use of this genome-based data, computational tools
are required that help identify structure, function and biologically relevant features in
the sequences. The Protein Information Resource (PIR) was established to serve this
purpose: it provides tools and resources for the storage and analysis of protein sequence
data for the scientific community.
In 1984, the National Biomedical Research Foundation (NBRF) established the Protein
Information Resource (PIR) for the identification and interpretation of protein sequence
information (http://www.nbrf.georgetown.edu/pir/find.html). The database was derived from
the ‘Atlas of Protein Sequence and Structure’, first published by Margaret O. Dayhoff in
1965. Four years later, in 1988, NBRF, the Munich Information Centre for Protein Sequences
(MIPS) and the Japan International Protein Information Database (JIPID) formed a
collaboration known as PIR-International, with four main aims:
(1) to create an organized, non-redundant, comprehensive protein database for the study of
structural, functional and evolutionary relationships
(2) to provide information on the biological origin of protein sequences
(3) to make the database easily accessible in the public domain
(4) to enable cross-referencing with other databases that present structural information
on biomolecules.
The Protein Information Resource (PIR) is one of the best-established public-domain
databases of annotated protein sequences. Beyond sequence similarity searching, the
expanded PIR website supports text-based searches of protein sequences, links to auxiliary
databases, annotation-sorted searches, domain searches, combined global and domain
searches, and interactive text searches.

Figure: PIR Homepage
Source: http://pir.georgetown.edu/pirwww/
Features of PIR Database
Classification: In PIR, sequences are classified on the basis of similarity into families,
superfamilies and homology domains. These families are organized and aligned so that the
database can easily be searched by the name of the gene family.
Cross Reference: All PIR entries are cross-referenced to literature and molecular
databases such as Medline, GenBank, EMBL, DDBJ, Protein Data Bank and Human Genome
Database, so that information retrieval can be optimized. Cross-referenced database
entries are presented as hypertext links.
Non-Redundancy: PIR is a non-redundant database; sequences from the same species with
very high identity and similarity are merged into a single entry. Merging does not discard
the identity of an independently reported sequence: it remains distinguishable from the
canonical sequence, so the reported sequence can be reconstructed on the PIR site.
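The non-redundancy idea described above can be illustrated with a minimal sketch. It is not PIR's actual procedure: the `Entry` structure, the function names, the 0.9 identity threshold and the restriction to equal-length sequences (substitutions only) are all assumptions made for illustration. Each merged entry keeps one canonical sequence plus, for every independent report, the positions where that report differs, which is enough to rebuild the reported sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One merged, non-redundant entry (hypothetical structure, not PIR's schema)."""
    species: str
    canonical: str
    # report id -> {position: residue} where the report differs from canonical
    variants: dict = field(default_factory=dict)


def identity(a: str, b: str) -> float:
    """Fraction of identical positions; this sketch only handles equal lengths."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def add_report(entries, report_id, species, seq, threshold=0.9):
    """Merge seq into a matching same-species entry, or start a new entry.

    The threshold value is an illustrative assumption.
    """
    for entry in entries:
        if entry.species == species and identity(entry.canonical, seq) >= threshold:
            entry.variants[report_id] = {
                i: r for i, (c, r) in enumerate(zip(entry.canonical, seq)) if c != r
            }
            return entry
    entry = Entry(species, seq, {report_id: {}})
    entries.append(entry)
    return entry


def reconstruct(entry, report_id):
    """Rebuild the sequence exactly as it was originally reported."""
    residues = list(entry.canonical)
    for pos, res in entry.variants[report_id].items():
        residues[pos] = res
    return "".join(residues)


if __name__ == "__main__":
    entries = []
    # Toy sequences invented for this example; they are not real database records.
    add_report(entries, "report-1", "Homo sapiens", "MKTAYIAKQRQISFVKSHFS")
    add_report(entries, "report-2", "Homo sapiens", "MKTAYIAKQRQISFVKSHFT")
    print(len(entries), reconstruct(entries[0], "report-2"))
```

Storing only the differences keeps the merged entry compact while still allowing each report to be recovered. A realistic version would need alignment to handle insertions and deletions.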
Standardized Annotation: Unlike other databases, PIR annotates the original submission
entries itself. All entries carry complete citations, including article titles, as well as
genetic information, mapped genes and intron positions. For consistency and accuracy,
standardized terminology and annotations are used throughout the database.
Comprehensiveness: Together with the other databases it maintains, PIR presents the most
comprehensive repository of protein sequences.
Regular releases and free accessibility: The database is updated and released quarterly.
Weekly updates can also be searched on the PIR website. Unlike in other databases,
sequences in PIR become publicly accessible as soon as the resource receives them.
Retrieval of information from the site: Data retrieval is supported by options such as
superfamily, feature, author, keyword and sequence similarity searches. Multiple sequence
alignments and family classifications, supported by hypertext links, allow fast retrieval
of information on related sequences, either in PIR or in other molecular databases.
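As a rough illustration of retrieval by annotation fields, the sketch below filters a small in-memory list of records by superfamily, author and keyword. It does not reflect how the PIR site is implemented: the `Record` fields, the `search` function and the sample entries are hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Illustrative annotated entry; the field names are assumptions."""
    entry_id: str
    superfamily: str
    authors: tuple
    keywords: tuple


def search(records, superfamily=None, author=None, keyword=None):
    """Return records matching every criterion that was supplied (case-insensitive)."""
    def has(values, wanted):
        return wanted.lower() in (v.lower() for v in values)

    hits = []
    for rec in records:
        if superfamily and rec.superfamily.lower() != superfamily.lower():
            continue
        if author and not has(rec.authors, author):
            continue
        if keyword and not has(rec.keywords, keyword):
            continue
        hits.append(rec)
    return hits


if __name__ == "__main__":
    # Made-up records for demonstration only.
    records = [
        Record("X0001", "cytochrome c", ("Author A",), ("heme", "electron transport")),
        Record("X0002", "globin", ("Author B",), ("heme", "oxygen transport")),
    ]
    for rec in search(records, keyword="heme", superfamily="globin"):
        print(rec.entry_id)
```

Criteria that are left as `None` are ignored, so the same function covers single-field and combined searches.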
Table: PIR web site URLs
Tool                    URL
PIR Home Page           http://www.nbrf.georgetown.edu/pir/
MIPS Home Page          http://www.mips.biochem.mpg.de/
Text Search             http://www.nbrf.georgetown.edu/pir/find.html
Sequence Scan           http://www.nbrf.georgetown.edu/nbrf/scan.html
Sequence Search         http://www.nbrf.georgetown.edu/nbrf/search.html
Complete Genome         http://www.nbrf.georgetown.edu/pir/genome.html
PIR Alignment Search    http://www.nbrf.georgetown.edu/nbrf/getaln.html
FAQs on Lecture 13 - Biological Sequence Databases Protein Information Resource (PIR) - Bioinformatics - Botany

1. What is the Protein Information Resource (PIR) and how does it relate to biological sequence databases?
Ans. The Protein Information Resource (PIR) is a database that stores and provides access to biological sequence information, particularly protein sequences. It is one of the oldest and most comprehensive sequence databases, containing information on protein sequences from various organisms. PIR serves as a valuable resource for researchers studying protein structure, function, and evolution.
2. How does the Protein Information Resource (PIR) contribute to our understanding of botany?
Ans. The Protein Information Resource (PIR) contributes to our understanding of botany by providing a vast collection of protein sequences from plants. Researchers in the field of botany can utilize PIR to study the genetic makeup and functions of proteins in different plant species. This information helps in identifying key proteins involved in plant growth, development, and response to environmental factors.
3. What are biological sequence databases, and why are they important in the field of bioinformatics?
Ans. Biological sequence databases are repositories that store and organize various types of biological sequences, such as DNA, RNA, and protein sequences. These databases play a crucial role in bioinformatics, a field that combines biology and computer science. They provide researchers with easy access to a vast amount of sequence data, allowing them to analyze, compare, and interpret biological information. Sequence databases are essential for various bioinformatics applications, including gene discovery, evolutionary studies, drug design, and functional annotation.
4. How can researchers utilize the Protein Information Resource (PIR) for protein structure prediction?
Ans. Researchers can utilize the Protein Information Resource (PIR) for protein structure prediction by accessing the database's collection of known protein sequences. By comparing the target protein sequence with similar sequences in PIR, researchers can infer the likely three-dimensional structure of the protein. PIR's extensive sequence database provides a valuable resource for homology modeling and other structure prediction methods, helping researchers gain insights into protein folding and function.
5. What are the advantages of using the Protein Information Resource (PIR) compared to other biological sequence databases?
Ans. The Protein Information Resource (PIR) offers several advantages compared to other biological sequence databases. Firstly, PIR is one of the oldest and most established databases, known for its comprehensive collection of protein sequences. Secondly, PIR provides manually curated and annotated data, ensuring high-quality information for researchers. Additionally, PIR offers various search and analysis tools, making it easier to retrieve and analyze protein sequence data. Overall, PIR's reputation, data quality, and user-friendly features make it a preferred choice for researchers in the field of bioinformatics.