Cloning DNA and Finding Genes
Isolating genes:
- "Easy" if the gene product (protein) is
known:
- Create a cDNA library using an expression
vector.
- Probe with antibodies that bind the gene
product.
- Isolate and sequence positive clones.
- More difficult if product is not known:
- Identify a marker (microsatellite, RFLP) that is linked
to the disease.
- Use a technique called positional cloning to find the
gene.
- e.g., cloning of the Huntington's Disease
gene.
Huntington's Disease
- Autosomal dominant neurodegenerative disease
- Fatal
- Occurs in about 1/10,000 live births
- Late onset, typically age 35 to 45 (many have children before symptoms
appear)
This was one of the early examples of successfully finding a
gene for a major human disease.
Step 1: Find a linked marker
1. First they had to find a pedigree for a family with a high
frequency of the disease. They found a large family in Venezuela.
They collected blood samples from family members with and without
the disease, extracted the DNA, and examined lots of DNA
markers.
2. They used "restriction fragment length
polymorphisms" (RFLPs) and looked for restriction sites that were
correlated with the disease. What is a restriction enzyme?
An enzyme that cuts DNA at a particular sequence of 4-6 bases. There are
lots of different enzymes that recognize different sequences.
See Fig. 7.1 and 7.2
We also talked about how fragments are separated on
an agarose gel, and how radioactive probes can be used to label
particular fragments (both the fragment and the probe are denatured to
make them single-stranded; when they reanneal, the probe will
hybridize to any fragments that have the complementary base
sequence).
3. They got lucky and found a marker only 4 cM
away.
- How often should you find a person with Huntington's
disease who does not also have that marker, given that it is 4 cM
away?
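The question above can be worked as a small calculation. This is a minimal sketch, assuming the usual rule of thumb that 1 cM corresponds to about 1% recombinant offspring over short distances; the function name and the optional Haldane correction are illustrative additions, not something from the lecture.

```python
import math

def recombination_fraction(map_distance_cM, use_haldane=False):
    """Approximate fraction of recombinant offspring for a given map distance."""
    morgans = map_distance_cM / 100.0
    if use_haldane:
        # Haldane's map function: corrects for multiple crossovers,
        # assuming crossovers occur independently (no interference).
        return 0.5 * (1.0 - math.exp(-2.0 * morgans))
    # Short-distance rule of thumb: 1 cM ~ 1% recombinants, capped at 50%.
    return min(morgans, 0.5)

r = recombination_fraction(4)
print(f"Expected recombinant fraction at 4 cM: {r:.2%}")
```

Under these assumptions, roughly 4 in every 100 people who inherit the disease allele from an affected parent would be expected to carry it without that parent's linked marker allele.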
Eventually they localized the gene to chromosome
4.
They examined many more markers and many more individuals and
finally found markers that narrowed the region to about 500
kb.
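The RFLP idea from Step 1 can be sketched in a few lines. This is only an illustration: the two allele sequences are made up, and the example assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the first G. A single base change that destroys one site changes the fragment lengths seen on the gel.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is how many bases into the site the cut falls
    (1 for EcoRI, which cuts G^AATTC).
    """
    cut_positions = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx + cut_offset)
        start = idx + 1

    fragments = []
    previous = 0
    for pos in cut_positions:
        fragments.append(pos - previous)
        previous = pos
    fragments.append(len(sequence) - previous)  # last piece, after the final cut
    return fragments

ECORI_SITE = "GAATTC"

# Hypothetical alleles: allele_b has one base changed in the second site.
allele_a = "ATGCGT" + "GAATTC" + "TTAGGCAT" + "GAATTC" + "CCGTA"
allele_b = "ATGCGT" + "GAATTC" + "TTAGGCAT" + "GAGTTC" + "CCGTA"

print(digest(allele_a, ECORI_SITE, 1))
print(digest(allele_b, ECORI_SITE, 1))
```

The two alleles give different sets of fragment lengths, which is the polymorphism that can be followed through a pedigree.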
Step 2: Use those linked markers to find nearby DNA
sequences
- Make a genomic "library"
- Probe that library for the marker
- "Walk" to extend the known sequence
See Fig. 7.9, 8.8, 7.13
How do you make a library?
- Use a restriction enzyme to make lots of fragments of genomic
DNA
- Use the same enzyme to cut the cloning vector (e.g.
plasmid).
- The sticky ends will anneal together and they can be sealed
with DNA ligase.
See Fig. 7.3, 7.4, 7.5
We talked about the way pUC plasmids have been
engineered to be convenient for cloning: they have an ampicillin
resistance gene (ampicillin is a penicillin-type antibiotic; the gene
lets you select bacteria that contain the plasmid) and
they have a lacZ gene (whose product can cleave X-gal to make a blue dye).
If foreign DNA is inserted into the lacZ gene, no dye will be
produced and the colony will be white.
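The selection and screening logic can be written out as a simple decision rule. This is a simplified sketch with a hypothetical function name; it assumes the plate contains both the antibiotic and X-gal, and that the host bacteria cannot make the blue dye without the plasmid's lacZ gene.

```python
def colony_phenotype(has_plasmid, insert_in_lacZ):
    """Predict what is seen on a plate containing the antibiotic and X-gal."""
    if not has_plasmid:
        # No resistance gene, so the cell is killed by the antibiotic.
        return "no colony"
    if insert_in_lacZ:
        # lacZ is interrupted by foreign DNA, so X-gal is not cleaved.
        return "white colony"
    # Intact lacZ: X-gal is cleaved and the colony turns blue.
    return "blue colony"

for has_plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(has_plasmid, insert, "->", colony_phenotype(has_plasmid, insert))
```

The white colonies are the ones worth picking, since they carry a plasmid with an insert.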
How do you sequence the final fragments?
- Sequencing reactions build on our understanding of DNA
replication. Mix a single-stranded template, short complementary
primer, DNA polymerase, and deoxynucleotides and you can
synthesize a complementary strand in a test tube.
- In the sequencing reaction, a small amount of dideoxynucleotides
is added; these stop synthesis wherever they are
incorporated. (Why?)
- ddA, ddC, ddG, ddT are each added to separate
tubes
- When fragments are separated on a gel, you can read off the
sequence by reading the bases from bottom of the gel (shortest
fragments) to the top
See Fig. 7.19 Dideoxy DNA sequencing of a theoretical DNA
fragment
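The gel-reading logic can be simulated with a toy example. This is a sketch under simplifying assumptions: the template is hypothetical, primer length is ignored, and every possible termination product is assumed to appear as a band. The function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def new_strand(template_3_to_5):
    """Strand the polymerase synthesizes (5'->3') while copying the template."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def sanger_lanes(template_3_to_5):
    """For each ddNTP tube, list the lengths of the terminated fragments."""
    synthesized = new_strand(template_3_to_5)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(synthesized, start=1):
        # A fragment of this length ends with the dideoxy form of `base`.
        lanes[base].append(position)
    return lanes

def read_gel(lanes):
    """Read bands from shortest (bottom of the gel) to longest (top)."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for length, base in sorted(bands))

template = "TACGGATC"  # hypothetical template, written 3'->5'
lanes = sanger_lanes(template)
print(lanes)
print(read_gel(lanes))
```

The sequence read from the gel is that of the newly synthesized strand, which is the complement of the template.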
A current example: Parkinson's
disease
Valente et al. (Science, May 2004) characterized the gene for a
rare form of Parkinson's disease.
The procedure was similar to that used for Huntington's
disease, but they could make use of the complete human genome
sequence.
- Find markers linked to the disease using pedigree analysis of
families from Italy and Spain
- Many more markers are available today
- Narrowed the region to 3 cM (2,800,000
bp)
- Searched the Human Genome database: 40 genes in that
region
- Narrowed the list of candidates:
- Expressed in brain cells?
- Similar to other known genes?
- Sequencing the final candidate gene showed 2
mutations
Yet another example: Cloning a Spider Silk
gene
How did they do it? (simplified version)
- Knew the protein structure of silk:
- Repetitive amino acid sequence
- GGPXGGPX . . . .
- Used the protein to predict the underlying DNA
sequence
- Made degenerate oligonucleotides
- Gly = GGA, GGC, GGG, GGU
- Pro = CCA, CCC, CCG, CCU
- Make lots of different DNA sequences to match all
possible codons for a short fragment of the
protein
- Use that to probe a cDNA library
- What is a cDNA library? Why would they want to do
that?
- (cDNA is made only from expressed genes, so all of the
junk and introns have been removed, as well as any genes that
are not transcribed in the tissue of interest. This makes a
smaller library of clones to search
through.)
- What they did:
- Dissected spider silk glands
- Extracted RNA
- Pulled out sequences with a poly-A tail
- Used reverse transcriptase to copy the RNA to
DNA
- Cloned that DNA into a vector (as above)
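The degenerate oligonucleotide step described earlier can be sketched as follows. This is a minimal illustration using only the Gly and Pro codons listed above; the peptide fragment "GGP" and the function names are assumptions chosen for the example, and the unknown X position is left out.

```python
from itertools import product

# Codons from the notes, written as RNA (U instead of T).
CODONS = {
    "G": ["GGA", "GGC", "GGG", "GGU"],  # glycine
    "P": ["CCA", "CCC", "CCG", "CCU"],  # proline
}

def degenerate_sequences(peptide):
    """List every RNA coding sequence that could encode `peptide`."""
    choices = [CODONS[amino_acid] for amino_acid in peptide]
    return ["".join(combo) for combo in product(*choices)]

def to_dna(rna):
    """Write the same sequence as DNA, as needed for a synthetic probe."""
    return rna.replace("U", "T")

sequences = degenerate_sequences("GGP")
print(len(sequences), "possible coding sequences")
print([to_dna(seq) for seq in sequences[:4]])
```

Each added amino acid multiplies the number of sequences by its codon count, which is why only a short fragment of the protein is used to design the probe.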