Translation of mRNA to produce a peptide chain of amino acids is achieved by reading the sequence of nucleotides in groups of three (triplets or codons). The starting point for how these codons are read has a significant effect on the subsequent amino acid sequence. Consider the following (coding) DNA sequence and the peptide which it codes for:

Normal reading frame

If reading starts one base further along, a completely different sequence is generated:

Frameshift one base to the right

This is called a “frameshift” as the reading frame is moved along the sequence. Because the peptide sequence is different, the folding and properties of the protein are significantly altered. In some cases the frameshift results in a premature stop codon which truncates the protein.
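The effect of a frameshift can be sketched in a few lines of Python. The example below is only an illustration: the 15-base sequence is made up for the purpose (it is not the sequence from the figures above), and `translate` is a hypothetical helper that uses the standard genetic code. A stop codon is shown as `*`.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, offset=0):
    """Translate a coding-strand DNA sequence, starting `offset` bases in.

    Incomplete codons at the end of the sequence are ignored.
    """
    dna = dna.upper()
    peptide = []
    for i in range(offset, len(dna) - 2, 3):
        peptide.append(CODON_TABLE[dna[i:i + 3]])
    return "".join(peptide)


# Made-up example sequence (an assumption, not taken from the source figures).
seq = "ATGGCCAAGTTAGGC"

print(translate(seq, 0))  # normal reading frame      -> MAKLG
print(translate(seq, 1))  # shifted one base to right -> WPS*
```

In this made-up case the shifted frame not only changes every amino acid but also runs into a stop codon (`TAG`) after three residues, which is the truncation described above.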

To regulate the reading frame, translation commences at the “start” codon - AUG in mRNA (ATG in the coding strand of DNA), which also codes for the amino acid methionine. The placement of this start codon creates a reading frame which determines which codons are read subsequently.
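As a rough sketch of how the start codon fixes the frame, the code below looks for the first ATG in a coding-strand sequence and reads codons from there until a stop codon. The function name `first_orf` and the example sequence are assumptions made for illustration, and the sketch deliberately ignores the fact that a real ribosome does not always initiate at the first ATG it meets.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def first_orf(dna):
    """Return (start index, frame, peptide) for the first ATG, or None.

    `frame` is the start index modulo 3, i.e. which of the three possible
    forward reading frames the start codon selects.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: translation ends here
            break
        peptide.append(aa)
    return start, start % 3, "".join(peptide)


# Made-up sequence with two bases ahead of the start codon.
print(first_orf("CCATGGCCAAGTAAGG"))  # -> (2, 2, 'MAK')
```

Every codon after the ATG is read in steps of three from that position, so moving the ATG by one base would change all of the codons that follow it.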

Fusion genes are created when the gene for one protein is attached to the gene for another. The translational apparatus of the cell creates a single peptide chain which behaves as if it were one protein, whereas in reality it is a hybrid or “chimera” of two separate proteins. Fusion proteins are useful for creating “tags” for proteins of interest (e.g. attaching the gene for a green fluorescent protein to another gene creates a version of that protein which fluoresces and can therefore be used to localize it within the cell). However, since the fusion gene contains a single start codon, the two genes must be read in the same frame for the full fusion protein to be made correctly.
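The in-frame requirement comes down to simple arithmetic: the number of bases between the first base of the start codon and the first base of the downstream gene must be a multiple of three. A minimal sketch of that check follows. The function names and the short tag and linker sequences are hypothetical placeholders, not real vector sequences.

```python
def frame_offset(upstream_coding, linker=""):
    """Return how far (0, 1 or 2 bases) the downstream gene is out of frame.

    `upstream_coding` is the tag's coding sequence from its start codon,
    without a stop codon; `linker` is whatever sequence lies between the
    tag and the first codon of the downstream gene.
    """
    return (len(upstream_coding) + len(linker)) % 3


def is_in_frame(upstream_coding, linker=""):
    """True when the downstream gene is read in the same frame as the tag."""
    return frame_offset(upstream_coding, linker) == 0


# Placeholder sequences, chosen only to make the arithmetic visible.
tag = "ATGTCCCCT"  # 9 bases = 3 complete codons

print(is_in_frame(tag, "GGATCC"))    # 6-base linker -> True
print(is_in_frame(tag, "GGATCCC"))   # 7-base linker -> False
print(frame_offset(tag, "GGATCCC"))  # -> 1
```

A single extra or missing base in the junction is enough to shift the downstream gene out of frame, which is why the choice of restriction site and vector matters.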

The process of ensuring that the two genes are kept in-frame involves careful selection of restriction sites and vectors. The following example is taken from the process used to develop the protocols for the Polo-box projects.

In the project Cloning the PLK1 Polo-box Domain, the gene for the Polo-box portion of the PLK1 gene was amplified using PCR and inserted into a commercial vector, pGEM-T Easy. The ends of this vector have single overhanging thymidine residues which allow the insertion of a length of DNA, so long as it has overhanging adenosine residues (see diagram below).

Using A-tailing to assist in ligation

When the Polo-box gene is recovered for insertion into the pGEX vector, it needs to be cut from the pGEM-T Easy vector using restriction enzymes. The EcoRI enzyme cuts the vector at a point where the reading frame for the Polo-box domain is preserved.

EcoRI restriction to preserve reading frame

When this length of DNA is inserted into the pGEX 4T-3 vector which has been cut with the EcoRI enzyme, the reading frame is preserved and is consistent with the reading frame of the GST gene contained in the pGEX vector upstream of the insertion site.

Ligation into pGEX 4T-3 preserves the reading frame

The pGEX 4T-3 vector was selected to ensure that this reading frame is preserved. Different versions of this vector have different reading frames at the insertion site, so using the wrong version would result in a different amino acid sequence for the Polo-box domain. The diagram below shows what would happen if we used the pGEX 4T-1 vector:

Reading frame is not preserved with ligation into pGEX 4T-1