UVM CHEMISTRY


Protein Design by Combinatorial Protein Microarrays


Our objective is to explore the relationship between the primary amino acid sequence of a protein and its stability. We would like to address three specific questions:

(1) Is protein stability dependent on sequence in any way that suggests "rules" for translating sequence information into folding stability?

(2) Is stability thermodynamically dictated? It is well known that proteins fold by forming a "hydrophobic core". The desolvation of hydrophobic amino acid side chains in this core provides most of the driving force for folding. Is stability thus dependent solely on the amount of hydrophobic surface sequestered in the core?

(3) Alternatively, is stability dependent on a "jigsaw" model in which hydrophobic residue sizes are matched in the hydrophobic core?

To address these questions we need to explore many possible packing alternatives and to quantify which of them exhibit optimal stability. We are developing protein array systems in which one thousand arrays, each containing one thousand different protein sequences, are interrogated, thus allowing the relative stabilities of one million proteins to be measured and compared. These arrays should allow optimally stable sequences to be determined. The challenge of quantitatively relating the primary amino acid sequence of a protein to its three-dimensional fold is known as "the protein folding problem". An understanding of this relationship, and hence the ability to predict tertiary protein structures, has profound implications for biochemical, pharmacological, and medical research.

All of the information necessary for folding a peptide sequence into its three-dimensional structure is contained in the primary sequence of amino acids. Just how proteins interpret this sequence information to assemble into three-dimensional structures is still unknown. Predicting the structure of even a hundred-residue sequence remains a formidable challenge. Fortunately, we are able to confidently predict the secondary structures of sequences twenty to thirty residues in length. We are using this knowledge to address the protein folding problem via directed self-assembly of libraries of short peptide sequences, in order to identify those assemblies of peptides that constitute the optimally stable sequences of higher-order protein structures.

Protein arrays are prepared as nanoliter spots on conventional (3 x 1 inch) glass microscope slides functionalized with aldehyde groups. Aldehyde groups have been shown to react chemoselectively with oxyamines appended to the amino termini of peptides. The amino termini of the peptides also bear a 2,2'-bipyridyl ligand. Using metal-assisted assembly to direct topology, all possible combinations of the peptides are prepared as parallel three-helix bundles. Two peptides from solution associate, with varying degrees of success, with each of the surface-immobilized peptides. These solution peptides bear a pyrene residue at their carboxy termini, farthest away from the metal ligand. When pyrenes are brought into close proximity (5-10 Angstroms) they form an excimer characterized by the emission of green light. Only if the three-helix structures are stable will the pyrenes be close enough to form the excimer.


Schematic showing the assembly of a stable three-helix bundle on a glass microscope slide. Two soluble helices associate with the immobilized helix depending on the stability of the resulting hydrophobic core. Assembly results in formation of a pyrene excimer with characteristic green fluorescence. The parallel three-helix topology is dictated by the formation of a metal tris-bipyridyl complex at the N-terminus.
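
The combinatorial step described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the laboratory's actual procedure: the peptide names are hypothetical placeholders, each array spot is assumed to carry one immobilized peptide, and the two solution peptides are treated as an unordered pair that may contain the same peptide twice.

```python
from itertools import combinations_with_replacement, product


def enumerate_bundles(surface_peptides, solution_peptides):
    """Yield (surface, solution_a, solution_b) for every candidate bundle.

    Assumption: the two solution helices form an unordered pair, and a
    pair may consist of two copies of the same peptide.
    """
    pairs = combinations_with_replacement(solution_peptides, 2)
    for pair, surface in product(pairs, surface_peptides):
        yield (surface, *pair)


# Hypothetical placeholder names, not real sequences.
surface = ["S1", "S2", "S3"]
solution = ["A", "B"]

bundles = list(enumerate_bundles(surface, solution))
print(len(bundles))  # 3 solution pairs x 3 surface peptides = 9
```

Under the same assumptions, one thousand immobilized peptides per array combined with one thousand distinct solution pairs (one pair per array) would give the one million candidate bundles mentioned earlier.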

In preliminary experiments, we are reading the arrays by confocal microscopy at the Vermont Cancer Cell Imaging Facility. After formation of the three-helix topologies using pyrene-labeled peptides, these arrays are subjected to increasing concentrations of a protein denaturant. As the concentration of denaturant increases, the least stable sequences unfold, resulting in loss of pyrene excimer emission on the array. This allows us to determine which sequences exhibit optimal stability, and exactly which sequences unfold at which denaturant concentration. Successful completion of this project will allow identification of a subset of optimally stable proteins, which in turn will allow us to determine how the sequence of amino acids informs protein stability. The sequence/stability information thus obtained will allow the construction of optimally stable proteins as continuous linear amino acid sequences for subsequent expression in vivo.
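
One simple way to turn such a denaturant series into a stability ranking is to estimate, for each spot, the denaturant concentration at which the excimer signal falls to half of its starting value. The Python sketch below is a hypothetical illustration only: the function name, the half-signal criterion, the linear interpolation, and the example intensities are all assumptions, not the analysis actually used for these arrays.

```python
def unfolding_midpoint(concentrations, intensities):
    """Estimate the denaturant concentration at half the initial signal.

    Uses linear interpolation between the two measurements that bracket
    the half-signal level. Returns None if the signal never drops that
    far within the measured range.
    """
    half = intensities[0] / 2.0
    for i in range(1, len(concentrations)):
        before, after = intensities[i - 1], intensities[i]
        if before > half >= after:
            c0, c1 = concentrations[i - 1], concentrations[i]
            fraction = (before - half) / (before - after)
            return c0 + fraction * (c1 - c0)
    return None


# Made-up example data: denaturant concentration (arbitrary units)
# and excimer intensity for three hypothetical spots.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
spots = {
    "spot_a": [100, 95, 60, 40, 5],
    "spot_b": [100, 40, 10, 5, 0],
    "spot_c": [100, 98, 95, 90, 80],  # never reaches half signal
}

midpoints = {name: unfolding_midpoint(conc, y) for name, y in spots.items()}
for name, mid in midpoints.items():
    print(name, mid)
```

A higher midpoint indicates a bundle that retains its excimer signal at higher denaturant concentration; spots that return None did not unfold within the measured range.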