Introduction to the NCBI Data Bases and ENTREZ

 
 

REVIEW: Boolean Logic
The National Center for Biotechnology Information (NCBI)
What is a Data Base?
The NCBI Data Bases
ENTREZ and a few of the most commonly used NCBI Databases
Data Mining: Sickle Cell anemia
POSTER PROJECT

 
 
The National Center for Biotechnology Information (NCBI) was established in 1988 under the National Library of Medicine (NLM), which is a subdivision of the National Institutes of Health (NIH). As a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genomic, proteomic and transcriptomic data, and disseminates biomedical information, all for the better understanding of the molecular processes affecting human health and disease. See About NCBI.






The NCBI offers many data bases and tools.  Some of them are indicated below.
 
ALPHABETICAL LIST
BankIt            GenBank sample record       Plant Genomes
BLAST             GeneMap'99                  Proteins Sequences
Books             Genes and Disease           PROW
CCAP              Genomes and Maps            PubMed
CDD               GEO                         PubMed Central
CGAP              HTGs                        RefSeq
Clone Registry    HomoloGene                  Research at NCBI
Cn3D              Human Genome Resources      Retroviruses
Coffee Break      Human Genome Sequencing     SAGEmap
COGs              Human-Mouse Homology Maps   Seminars
                  LocusLink                   Sequin
DART              Malaria                     Site Search
dbEST             Map Viewer                  SKY/CGH
dbGSS             MGC                         Structures
dbSNP             Microbial Genomes           Taxonomy
dbSTS             MMDB                        Trace Archive
Education         Mutation Databases          UniGene
E-mail servers    NCBI Home                   UniSTS
E-PCR             NCBI News                   VAST
ENTREZ            Nucleotide Sequences        VecScreen
FTP               OMIM                        What's New
GenBank           ORF Finder

Tools and data bases have been color coded as follows:  white = general;  cerulean = protein resources;  red = genome and map resources;  pink = nucleotide resources;  lime = biochemical and phenotype resources.


The search and retrieval system used for the NCBI databases is known as ENTREZ. This software not only searches all the databases but also links them together. ENTREZ has at least two great advantages:
  • It provides a single search engine, with the same features and controls, for searching all the databases.
  • The "hits" from a search of any database can be linked directly to the same or related records in other relevant databases.
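The two advantages above can also be seen when Entrez is queried from a program rather than through the web page. The sketch below is only an illustration and is not part of the tutorials: it assumes NCBI's E-utilities interface (the `esearch.fcgi` and `elink.fcgi` endpoints) and does nothing more than build the request URLs, without contacting NCBI. The helper names `search_url` and `link_url` and the record ID `12345` are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def search_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL; only `db` changes from one database to the next."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


def link_url(dbfrom: str, db: str, record_id: str) -> str:
    """Build an ELink URL asking for records in `db` related to a hit in `dbfrom`."""
    query = urlencode({"dbfrom": dbfrom, "db": db, "id": record_id})
    return f"{EUTILS_BASE}/elink.fcgi?{query}"


if __name__ == "__main__":
    # Advantage 1: one search interface, several databases.
    for db in ("pubmed", "omim", "protein"):
        print(search_url(db, "sickle cell anemia"))
    # Advantage 2: follow a hit from one database into another.
    # "12345" is a placeholder, not a real record ID.
    print(link_url("gene", "protein", "12345"))
```

Pasting one of the printed URLs into a browser should return the matching record IDs as XML. The tutorials that follow use the Entrez web pages to perform the same kinds of searches and links interactively.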


The graphic shows a highly simplified representation of the various databases and how they are searched and linked together by ENTREZ. The databases covered in the following tutorials are indicated with arrows.



A Data Mining Strategy for Information on Human Genetic Disease:
 


Introduction. This page! An introduction to the concept of databases. The NCBI databases. The ENTREZ search and retrieval system for bioinformatics data mining.
Bookshelf. Using Entrez to search full-text books online.
OMIM. Using Entrez to search "Online Mendelian Inheritance in Man" for the phenotypes and clinical aspects of human genetic diseases and syndromes.
PubMed. Using Entrez to search the published literature for primary references and reviews. Building a bibliography with the clipboard.
Gene Database. Summaries of available information on genes.
Nucleotide Database (GenBank). Using Entrez to search for nucleotide sequences and annotations.
Protein Database. Using Entrez to search for protein sequences and annotations.
Structure Database. Visualization of 3D structure of macromolecules and relation to function.


Poster Project:
 

The final project for this module is a Poster and presentation to the rest of the class. This will be worth 1/3 of your grade for this module. 

The topic is any human genetic disease or condition of your choice. A list of suggested phenotypes is provided. However, since knowledge of human biology is still fragmentary and incomplete, you should check with the Instructor before choosing a topic that is not on this list.

Each poster should incorporate a discussion of the following aspects of the phenotype:

  • Clinical Features / Genotype / Phenotype.
  • Molecular Genetics / Gene Function.
  • Protein Function / Biochemistry / Allelic Variants.
  • An explanation*+ of the structure and function of the relevant protein.
  • A 3-D visualization of the protein which explains its structure and function.
  • A description of the changes in primary, secondary and tertiary structure of the protein which result in aberrant function.
  • An explanation of how and why the structure of the variant changes the function of the protein.
  • An explanation of how and why the defective function of the variant results in the aberrant phenotype.
  • A bibliography of resources you have used.


    *  NOTE: An explanation is not simply a statement of what! An explanation discusses how and why!
    +  NOTE: An explanation is not simply a paraphrase or a quote from a reference! It reflects your own understanding!


GRADING CRITERIA:  The Poster will be graded by your Instructor using the following criteria. Alternatively, your Instructor may choose different criteria of her/his own.
