Condensed and edited from “SNP Fact Sheet” at the ORNL website, “SNPs: Variations on a Theme” at the NCBI website, and “What is the HapMap?” at the International HapMap website.
William S. Barnes, Ph.D., Clarion University of Pennsylvania
The human genome is composed of approximately 3,000,000,000 base pairs and 25,000 genes. About 99.9% of the sequence of the human genome is the same in all 7,000,000,000 people alive on earth today. The remaining 0.1% is the relatively small number of differences in the genome that accounts for the inherited phenotypic differences between individual humans.
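To make the scale concrete, the short Python sketch below works out how many bases that small fraction represents. It is only arithmetic on the round figures quoted in this tutorial; the variable names are illustrative.

```python
# Round figures quoted in the text; real values vary with the genome assembly.
GENOME_SIZE_BP = 3_000_000_000   # approximate base pairs in the human genome
SHARED_PER_THOUSAND = 999        # 99.9% identity, written as parts per thousand

# Integer arithmetic avoids floating-point rounding in the subtraction.
differing_bp = GENOME_SIZE_BP * (1000 - SHARED_PER_THOUSAND) // 1000
spacing_bp = GENOME_SIZE_BP // differing_bp

print(f"Differing bases: {differing_bp:,}")
print(f"About one difference every {spacing_bp:,} bases")
```

On these figures, two genomes differ at about 3 million positions, or roughly one base in every thousand.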
SNPs. Most of these differences are changes in a single base. These are called SNPs (pronounced "snips"), an abbreviation for single nucleotide polymorphisms. For example, substituting T for the second A in the DNA sequence AAGGCTAA would change it to ATGGCTAA. For a variation to be considered a SNP, it must occur in at least 1% of the population. The following are important facts about SNPs:
An example is apolipoprotein E (ApoE), one of the genes associated with Alzheimer's disease. This gene contains two SNPs that result in three possible alleles: E2, E3, and E4. E2 and E4 each differ from E3 by one DNA base, and the protein product of each differs from the E3 protein by one amino acid. Each individual inherits one maternal copy of ApoE and one paternal copy, so there are 6 possible genotypes: E2/E2, E2/E3, E2/E4, E3/E3, E3/E4, and E4/E4.
Research has shown that an individual who inherits at least one E4 allele has a greater chance of developing Alzheimer's disease. Apparently, the change of one amino acid in the E4 protein alters its structure and function enough to make disease development more likely. Inheriting the E2 allele, on the other hand, seems to indicate that an individual is at lower risk.
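The count of six ApoE genotypes follows from choosing two alleles out of three when the order of the maternal and paternal copies is ignored. The minimal Python sketch below enumerates them and marks the genotypes that include at least one E4 allele. The function names are illustrative, and the code only does the bookkeeping; it says nothing about the size of any risk.

```python
from itertools import combinations_with_replacement

APOE_ALLELES = ["E2", "E3", "E4"]

def possible_genotypes(alleles):
    """Return every unordered pair of alleles (one maternal, one paternal)."""
    return list(combinations_with_replacement(alleles, 2))

def carries_allele(genotype, allele):
    """True if at least one of the two inherited copies is the given allele."""
    return allele in genotype

genotypes = possible_genotypes(APOE_ALLELES)
for genotype in genotypes:
    note = "(carries E4)" if carries_allele(genotype, "E4") else ""
    print("/".join(genotype), note)

print(len(genotypes), "genotypes in total")
```

Three of the six genotypes (E2/E4, E3/E4, and E4/E4) carry at least one E4 allele.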
Haplotypes. Even SNPs that have no effect on function may still be useful as “tags” for the multiple genes associated with complex diseases such as cancer, diabetes, vascular disease, and some forms of mental illness. These associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to the disease. However, SNPs can be used as markers to locate genes on chromosomes, and this leads to the “International HapMap Project”. If there are 1 x 10^7 SNPs distributed across 23 chromosomes, there are roughly 435,000 SNPs per chromosome on average. Some SNPs are so close to each other that they tend to be inherited together as a block. Although the current terminology is slightly ambiguous, for the purposes of this tutorial a set of SNPs clustered together at a particular location on a single chromosome will be said to form a haplotype block. Since SNPs are by definition polymorphic, different combinations of SNP alleles result in different haplotypes (variants of a haplotype block). For example, suppose:
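As one hypothetical illustration, consider a haplotype block containing three neighbouring SNPs with two alleles each. The Python sketch below lists every combination of alleles that could in principle occur on one chromosome. The SNP names and bases are invented for this sketch and are not taken from HapMap data.

```python
from itertools import product

# Hypothetical block: three neighbouring SNPs, each with two alleles.
# The names and bases are invented for illustration only.
block = {
    "snp_1": ("A", "G"),
    "snp_2": ("C", "T"),
    "snp_3": ("G", "T"),
}

# Every combination of one allele per SNP is a conceivable haplotype.
possible_haplotypes = ["".join(alleles) for alleles in product(*block.values())]

print(len(possible_haplotypes), "conceivable haplotypes:")
print(", ".join(possible_haplotypes))
```

Three two-allele SNPs allow 2 x 2 x 2 = 8 conceivable haplotypes. Because SNPs in a block tend to be inherited together, a real population would usually show only a few of them.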
The HapMap Project. With so many different haplotype blocks, each with several different haplotypes, it becomes interesting to know the frequencies of haplotypes within and between different populations. One of the big biology projects of the mid-2000s is the “International HapMap Project” to map each SNP to a locus on one of the 23 chromosomes, and to locate each one within haplotype blocks of linked loci. In many parts of our chromosomes, there are only a few haplotypes. In a given population, 55 percent of people may have one haplotype, 30 percent may have another, 8 percent may have a third, and the remaining 7 percent may have a variety of less common haplotypes. The number of these haplotypes is estimated to be about 300,000 to 600,000, which is far fewer than the 10 million common SNPs. This greatly simplifies the task of finding the genes associated with polygenic traits such as cancer, diabetes, vascular disease, and some forms of mental illness. For example, consider the task of identifying all the genes associated with high blood pressure. The HapMap will make it possible to compare the haplotypes of individuals who have high blood pressure with those of individuals who do not. If people with high blood pressure tend to share a particular haplotype, then genes contributing to the disease might be somewhere within or near that haplotype, and more detailed screening of that region could be done.
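The comparison described above amounts to tallying haplotype frequencies in two groups and looking for a haplotype that is noticeably more common in one of them. The Python sketch below shows that bookkeeping for a single haplotype block. The samples are invented, the function name is illustrative, and no statistical test is applied, so it is a sketch of the idea rather than a method from the HapMap Project.

```python
from collections import Counter

def haplotype_frequencies(haplotypes):
    """Map each haplotype to its share of the sample (0.0 to 1.0)."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {hap: count / total for hap, count in counts.items()}

# Invented samples: one haplotype per chromosome examined at a single block.
cases = ["ACG"] * 12 + ["GTT"] * 6 + ["ATG"] * 2      # high blood pressure
controls = ["ACG"] * 6 + ["GTT"] * 11 + ["ATG"] * 3   # normal blood pressure

case_freq = haplotype_frequencies(cases)
control_freq = haplotype_frequencies(controls)

# Order haplotypes by how much more common they are among cases.
all_haplotypes = sorted(
    set(case_freq) | set(control_freq),
    key=lambda hap: case_freq.get(hap, 0.0) - control_freq.get(hap, 0.0),
    reverse=True,
)
for hap in all_haplotypes:
    print(f"{hap}: cases {case_freq.get(hap, 0.0):.2f}, "
          f"controls {control_freq.get(hap, 0.0):.2f}")
```

In this made-up data the haplotype ACG is twice as frequent among cases as among controls, which would mark its region for closer screening. A real study would need far larger samples and a formal statistical test before drawing any conclusion.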