NDjinn Multiple Database Search. Searches many different databases (PIR
1,2,3,4; SwissProt; TrEMBL) as well as specific genomes by keyword.
You can also get annotated features from the selected record by clicking
the "Show Sequences" button at the bottom.
BLAST
BLAST (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or DNA. The BLAST programs have been designed for speed, with a minimal sacrifice of sensitivity to distant sequence relationships. The scores assigned in a BLAST search have a well-defined statistical interpretation, making real matches easier to distinguish from random background hits. BLAST uses a heuristic algorithm which seeks local as opposed to global alignments and is therefore able to detect relationships among sequences which share only isolated regions of similarity (Altschul et al., 1990). For a better understanding of BLAST you can refer to the BLAST Course which explains the basics of the BLAST algorithm.
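To make the local versus global distinction concrete, here is a minimal sketch of local alignment scoring. It is not the BLAST heuristic itself: it fills the full dynamic-programming table (the Smith-Waterman approach), which BLAST avoids for speed. The function name and the match, mismatch and gap values are assumptions chosen for illustration only.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Exhaustive dynamic programming (Smith-Waterman style), NOT the BLAST
    heuristic. Scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the score table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # first column is always 0
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # Clamping at 0 lets an alignment start anywhere, which is
            # what makes the alignment local rather than global.
            cell = max(0, prev[j - 1] + pair, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

Two sequences that share only an isolated region, such as "TTTTACGTAAAA" and "GGGGACGTCCCC", still score well because only the shared "ACGT" stretch contributes to the result.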
There are several "flavors" of BLAST, including BLASTN (nucleotides),
BLASTP (proteins), BLASTX (compares the six-frame conceptual translation
products of a nucleotide query sequence, both strands, against a protein
sequence database), TBLASTN (compares a protein query sequence against
a nucleotide sequence database dynamically translated in all six reading
frames, both strands), and TBLASTX (compares the six-frame translations of
a nucleotide query sequence against the six-frame translations of a nucleotide
sequence database).
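The six-frame translation used by BLASTX, TBLASTN and TBLASTX can be sketched as follows. This is only an illustration of the idea, assuming the standard genetic code and an unambiguous A/C/G/T sequence. The function names are made up for this example and are not part of any BLAST program.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations of frames 1-3 on the given strand, then
    frames 1-3 on the reverse-complement strand."""
    strands = (dna.upper(), reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]
```

For the short made-up sequence "ATGGCC" the three forward frames translate to "MA", "W" and "G", and the three reverse frames to "GH", "A" and "P".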
FASTA
This is another similarity search program. It is generally regarded as
more sensitive than BLAST, but it is slower and requires more computing power.
FASTA format explained
The genomic sequence below is in FASTA format, which is often required
when searching molecular databases. Take care: the first line must begin
with ">" followed by a short description. The description can be anything
that you choose to write. If you obtained the FASTA report from a public
database such as GenBank or EMBL, the description can be read as follows:
GenBank generated FASTA report:
>gb|accession|locus|description
EMBL generated FASTA report:
>emb|accession|locus|description
DDBJ generated FASTA report:
>dbj|accession|locus|description
SWISS-PROT generated FASTA report:
>sp|accession|entry name
nr (non-redundant) database generated FASTA report; sequences derived from other databases:
>gi|gi_identifier|accession of nucleotide sequence from which it was derived|description
Accession and locus refer to the ACCESSION and LOCUS numbers in the database.
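The description-line layouts listed above can be split into named fields with a few lines of code. The sketch below follows those layouts exactly as written here. The function name and the field names are assumptions for illustration, and the example values in the usage note are made-up placeholders, not real database entries.

```python
# Field layouts as listed above, keyed by the database tag after ">".
FIELD_NAMES = {
    "gb": ("accession", "locus", "description"),
    "emb": ("accession", "locus", "description"),
    "dbj": ("accession", "locus", "description"),
    "sp": ("accession", "entry_name"),
    "gi": ("gi_identifier", "source_accession", "description"),
}


def parse_description_line(line):
    """Split a FASTA description line into a dict of named fields.

    Lines with an unrecognised tag (for example a description you wrote
    yourself) are returned whole under 'description'.
    """
    if not line.startswith(">"):
        raise ValueError("a FASTA description line must begin with '>'")
    tag, _, rest = line[1:].partition("|")
    names = FIELD_NAMES.get(tag)
    if names is None:
        return {"database": None, "description": line[1:].strip()}
    # Limit the split so that any "|" inside the last field is preserved.
    values = rest.split("|", len(names) - 1)
    record = {"database": tag}
    record.update(zip(names, (value.strip() for value in values)))
    return record
```

For example, parse_description_line(">gb|ACC0001|LOCUS1|example description") returns a dictionary with database "gb", accession "ACC0001", locus "LOCUS1" and description "example description".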
The DNA sequence of exon 11 coding for human lamin B1 is given below.
Lamin B1 is a protein found in the nucleus that helps to organise the
chromatin during interphase. It does so by binding both to the nuclear
matrix (a fibrous scaffold just below the inner nuclear envelope) and to
special regions of DNA. When you are confident, you might like to use this
sequence to try a specialised database search to identify matrix attachment
regions (MARs) of a DNA sequence, of which Lamin B is a classic example.
The TimeLogic Server requires registration, but offers the MAR Finder service.
Sequences reported as being mRNA are in fact cDNA, obtained from mRNA.
You will notice the absence of uracil (only present in RNA) in such sequences.
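As a small illustration of the two points above, the sketch below reads FASTA-formatted text into (description, sequence) pairs and converts a cDNA-style sequence back to the RNA alphabet by replacing T with U. The function names are made up for this example, and the sequence in the test data is an invented placeholder, not the lamin B1 exon.

```python
def read_fasta(lines):
    """Yield (description, sequence) pairs from an iterable of FASTA lines."""
    description = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def cdna_to_mrna(sequence):
    """Rewrite a cDNA sequence in the RNA alphabet (T becomes U)."""
    return sequence.upper().replace("T", "U")


def contains_uracil(sequence):
    """Return True if the sequence uses U, i.e. is written as RNA."""
    return "U" in sequence.upper()
```

A database record labelled mRNA should normally make contains_uracil return False, because the stored sequence is the cDNA.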
CLUSTAL W
BOX SHADE
TREE GRAM
How are "repeats" identified?
MSA an alignment algorithm
FINGERPRINTSCAN searches a protein name; outputs info about fingerprints (whatever they are ???) for that family of proteins. Also gives alignment views of motifs.
PFSCAN Sequence Search Against a Set of Profiles
CHOFAS Predict Secondary Structure of PS(s) (Chou-Fasman)
HTH Predict HTH (Helix-Turn-Helix ???) Motifs in Protein Chains
PELE Protein Secondary Structure Prediction, but incorporates 8 different
algorithms, not just one as per CHOFAS. Would secondary structure correspond
to: Motifs? Conserved regions?