DEFINITION of the elements in the Features Table

 

This section of the flatfile includes information about regions of biological significance in genes and gene products.  The range of features to be represented is diverse, including:
  • structural genes, which perform a biological function
  • regulatory elements, which control the expression of a biological function
  • binding sites, which interact with other molecules
  • origins, which affect replication of a sequence
  • regions which affect or are the result of recombination of different sequences
  • repetitive elements, such as VNTRs, microsatellites, etc.
  • regions of secondary or tertiary structure, as in tRNA or proteins
  • regions which exhibit variation, or have been revised or corrected


Feature Key  - a Feature Key is a keyword which indicates the nature of each functional element which occurs in the sequence (such as: Source, CDS gene, CDS protein, attenuator, enhancer, exon, gene, intron, mRNA, LTR, etc.).

In the example to the left, the Feature Key = CDS.



Location  - the Location indicates the region of the presented sequence which corresponds to a feature.
  • In a nucleotide sequence, base 1 is the first base (5' end) of the presented sequence.
  • In a protein sequence, the first amino acid presented is the N-terminal amino acid.
  • Exons are indicated by the prefix "x" (for example, "x51" refers to exon 51 in a given sequence).

In the example to the left, the Location of this feature is bases 1 - 206 of the sequence presented.
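
Because Location positions count from base 1 and include both endpoints, extracting a feature in a program usually needs a small conversion step. The following is a minimal sketch, assuming the span is written in the simple `start..end` form (for example `1..206`); the function name `location_to_slice` and the test sequence are hypothetical, and more complex locations are not handled.

```python
def location_to_slice(location: str) -> slice:
    """Convert a simple 'start..end' location (1-based, inclusive) to a Python slice."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid location: {location!r}")
    # Python slices are 0-based and exclude the end index,
    # so only the start needs to be shifted down by one.
    return slice(start - 1, end)


# Hypothetical 240-base sequence, used only to show the indexing.
sequence = "ACGT" * 60
feature = sequence[location_to_slice("1..206")]
print(len(feature))  # 206
```

The slice covers bases 1 through 206, so the extracted feature is 206 bases long.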



Qualifiers - auxiliary information about a feature.  Every Feature Key is associated with a defined set of additional descriptive terms, or Qualifiers.  Each Qualifier begins with a forward slash and is written in the following format: /qualifier=abcde
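
The /qualifier=abcde format can be split into a name and a value with a few lines of code. This is only an illustrative sketch: `parse_qualifier` is a hypothetical helper, it assumes one complete qualifier per line, and it does not handle values that continue across several lines.

```python
def parse_qualifier(line: str) -> tuple:
    """Split one '/name=value' qualifier line into (name, value)."""
    text = line.strip()
    if not text.startswith("/"):
        raise ValueError(f"not a qualifier line: {line!r}")
    # Drop the leading slash, then split on the first '=' only,
    # so that any '=' inside the value is preserved.
    name, _, value = text[1:].partition("=")
    # Text values are commonly quoted; remove surrounding double quotes.
    return name, value.strip('"')


print(parse_qualifier('/product="TCP1-beta"'))  # ('product', 'TCP1-beta')
print(parse_qualifier("/codon_start=3"))        # ('codon_start', '3')
```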

In the example to the left, the Qualifiers are:

  • /codon_start  Indicates that the first codon starts with base #3 of the sequence.
    • The first codon is therefore TCC (UCC in the mRNA), which is a codon for serine.
    • The second codon is TCC (UCC), which is also a codon for serine.
    • The third codon is ATA (AUA), which is a codon for isoleucine.
  • /product  Indicates that this sequence codes for a protein named "TCP1-beta".
  • /protein_id  Gives the accession number of the protein record for "TCP1-beta".
  • /db_xref  Gives a database cross-reference to a related record for "TCP1-beta".
  • /translation  Gives the amino acid sequence for the "TCP1-beta" protein as derived from the base sequence 1-206.
  • NOTE:  The first 3 amino acids (SSI) correspond to the first 3 codons (TCCTCCATA).
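
The effect of /codon_start on the reading frame can be checked with a short sketch. The sequence below is hypothetical: two placeholder bases followed by the three codons named above, not the actual record. The codon table is deliberately reduced to the two codons needed, and `translate` is an illustrative helper rather than any library's function.

```python
# Reduced codon table: only the codons that occur in this example.
CODON_TABLE = {"TCC": "S", "ATA": "I"}


def translate(dna: str, codon_start: int = 1) -> str:
    """Translate DNA starting at the 1-based position given by codon_start."""
    offset = codon_start - 1  # convert the 1-based position to a 0-based index
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(offset, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        protein.append(CODON_TABLE.get(codon, "X"))  # "X" marks codons not in the table
    return "".join(protein)


# Hypothetical sequence: two leading bases, then TCC TCC ATA.
dna = "GGTCCTCCATA"
print(translate(dna, codon_start=3))  # SSI
```

With codon_start=3 the first two bases are skipped, so reading begins at the first TCC and the result matches the first three amino acids of the /translation.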


 
 
