Proteins are large molecules composed of chains of amino acids. Each amino acid is joined to the next by a peptide bond, which forms between the carboxyl (acid) group of one amino acid, which contains a carbon atom, and the amino group of the next, which contains a nitrogen atom. This means that all of the amino acids are arranged in the same “direction” in the protein, so one end can be designated the “carbon end” (or C terminus) and the other the “nitrogen end” (or N terminus).

Because a protein consists of many amino acids joined by peptide bonds, the word “peptide” is used to describe proteins. A “peptide chain” is the sequence of amino acids, an “oligopeptide” is a sequence of a few (e.g. a dozen or so) amino acids, and a “polypeptide” is a much longer sequence; that term is often used interchangeably with “protein”.

Because proteins are so large, their size is usually expressed in daltons (Da) or kilodaltons (kDa). One dalton is one atomic mass unit, roughly the mass of a single hydrogen atom. An amino acid residue in a protein has an average mass of about 110 Da, so the length of a protein can be estimated by dividing its mass in daltons by 110 (e.g. a protein of about 27 kDa is roughly 244 amino acids long).
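The relationship between chain length and mass can be sketched in a few lines of Python. This is only a rough estimate: it assumes the commonly quoted average of about 110 Da per amino acid residue, and the function names are made up for this illustration. The real mass of a protein depends on which amino acids it actually contains.

```python
# Rough size estimates for a protein, assuming an average residue mass.
# The 110 Da figure is an approximation, not an exact constant.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_da(n_residues: int) -> float:
    """Estimate the mass in daltons of a chain with n_residues amino acids."""
    if n_residues < 0:
        raise ValueError("number of residues cannot be negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA


def estimate_length(mass_da: float) -> int:
    """Estimate how many amino acids a protein of the given mass contains."""
    if mass_da < 0:
        raise ValueError("mass cannot be negative")
    return round(mass_da / AVERAGE_RESIDUE_MASS_DA)


if __name__ == "__main__":
    mass = estimate_mass_da(244)
    print(f"244 residues is about {mass / 1000:.1f} kDa")  # about 26.8 kDa
    print(f"27 kDa is about {estimate_length(27000)} residues")
```

For an exact figure, the masses of the individual residues would need to be added up from the sequence itself.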

There are twenty standard amino acids commonly found in proteins. Different combinations of these amino acids determine the structure and function of each protein. Amino acids all have the same basic structure and differ only in the chemical structure of a “side chain” which comes off the central carbon atom. Once incorporated into a protein, amino acids are often referred to as “residues”. The structures and classification of the twenty amino acids can be found here.

The Biology Project developed at the University of Arizona offers an excellent tutorial (including a quiz) on The Chemistry of Amino Acids (note : these links open to external sites).

Proteins can be thought of as having a number of levels of structure: the primary, secondary, tertiary and sometimes quaternary structures. A video which briefly describes each of these can be found here (note: this opens to an external site and is a YouTube video which may not be available to some users).

The order of amino acids in a protein is called the primary structure. The peptide bond is a covalent bond with partial double bond character. This means that the atoms on either side cannot rotate freely around the peptide bond, so the bond remains rigid and gives some structure to the chain. Because covalent bonds are quite strong, it takes a lot of energy to break up the primary structure of a protein.

Once a chain of amino acids has been assembled by the ribosome, it tends to fold in on itself. Nitrogen and oxygen atoms are strongly electronegative, which means that they attract hydrogen atoms bound to nearby atoms, forming hydrogen bonds. The first hydrogen bonds to form tend to occur between the atoms which make up the peptide bonds. This makes for tight structures within the protein. When the chain coils up like a corkscrew, an α helix is formed. When stretches of the chain line up side by side, a β pleated sheet is formed. Both of these components make up the secondary structure of the protein. Whether a region of the peptide chain produces α helices or β pleated sheets depends largely on the chemical nature of the amino acids which make it up.

 

Structure of an α helix

Structure of a β pleated sheet

The tertiary structure of a protein, showing α helices (red) and β pleated sheets (gold), from Wikimedia Commons

A protein then folds up to give further three-dimensional structure, based on hydrogen bonds and other weak interactions between the side chains of its amino acids and on their interaction with the surrounding environment. If the side chain of an amino acid is non-polar (hydrophobic, or water-repelling), it is pushed to the inside of the protein molecule by the aqueous environment inside the cell. Polar or charged side chains tend to be found on the outside of the protein molecule. This three-dimensional structure is the tertiary structure of the protein. The tertiary structure determines the function of the protein and how it interacts with other substances.
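The split between non-polar and polar or charged side chains can be illustrated with a short Python sketch that sorts residues by their one-letter codes. The grouping used here is an assumption for illustration: textbooks differ on borderline cases such as glycine, cysteine, tyrosine and histidine, and the function names and example sequence are hypothetical. The sketch says nothing about where a residue actually ends up in a folded protein.

```python
# One-letter codes grouped by side chain type.
# This grouping is one common textbook convention, not a definitive scheme.
NON_POLAR = set("AVLIMFWPG")
POLAR = set("STNQYC")
CHARGED = set("DEKRH")


def side_chain_class(residue: str) -> str:
    """Return the side chain class for a single one-letter amino acid code."""
    code = residue.upper()
    if code in NON_POLAR:
        return "non-polar"
    if code in POLAR:
        return "polar"
    if code in CHARGED:
        return "charged"
    raise ValueError(f"unknown amino acid code: {residue!r}")


def non_polar_fraction(sequence: str) -> float:
    """Return the fraction of residues in the sequence that are non-polar."""
    if not sequence:
        raise ValueError("sequence is empty")
    classes = [side_chain_class(residue) for residue in sequence]
    return classes.count("non-polar") / len(classes)


if __name__ == "__main__":
    example = "AVLK"  # made-up sequence, for illustration only
    for residue in example:
        print(residue, side_chain_class(residue))
    print(non_polar_fraction(example))  # 0.75
```

A region with a high non-polar fraction is a candidate for the interior of the molecule, though real predictions use more detailed hydrophobicity scales.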

Hydrogen bonds are quite weak compared to covalent bonds. It does not take much to disturb a hydrogen bond: gentle heating (e.g. above 60°C), a change in pH, a change in the level of salts or the presence of certain disrupting surfactants can all interfere with the hydrogen bonding which holds the tertiary structure of proteins together. When this happens, we say that the protein has been denatured. An example of this is when we heat egg white. The mostly transparent and water-soluble proteins which make up albumen are denatured by the heat, producing the hard, insoluble and opaque substance we associate with cooked egg. Once a protein has been denatured, it can no longer carry out its normal function. Its solubility changes, and the alterations to its tertiary structure mean that the protein can no longer bind to other substances or catalyse chemical processes.

Sometimes stronger bonds form between regions in the tertiary structure of proteins. Cysteine is a sulphur-containing amino acid. In proteins which contain a lot of this residue, the sulphur atoms of nearby cysteines may form covalent disulphide bridges. These covalent bonds give the protein a strength not found in others. Tough proteins like keratin (found in fingernails and hair) contain a lot of disulphide bridges.
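Since each disulphide bridge joins two cysteine residues, counting the cysteines in a sequence gives an upper limit on the number of bridges a single chain could form. The Python sketch below shows that arithmetic only. The function name and the example sequence are made up for illustration, and the real number of bridges depends on which cysteines end up close together in the folded protein.

```python
def max_disulphide_bridges(sequence: str) -> int:
    """Upper limit on disulphide bridges within one chain.

    Each bridge needs two cysteines (one-letter code C), so the limit
    is the cysteine count divided by two, rounded down.
    """
    cysteines = sequence.upper().count("C")
    return cysteines // 2


if __name__ == "__main__":
    example = "ACCGCK"  # hypothetical sequence with three cysteines
    print(max_disulphide_bridges(example))  # 1
```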

The quaternary structure of haemoglobin, showing the α subunits in red, the β subunits in blue and the haem groups in green, from Wikimedia Commons

The protein can be thought of as containing a number of distinct regions called domains. Each domain either has a particular structure (e.g. a large number of α helices) or a distinct function in the molecule. For example, while in most proteins, domains with a high proportion of non-polar amino acids are forced to the inside of the molecule, transmembrane proteins tend to have hydrophobic domains where these side chains are on the outside. This allows these proteins to embed in and attach to the non-polar inner part of the cell membrane.

Some proteins, once assembled and folded, may join together with other proteins to form larger molecules. This is called the quaternary structure and is not found in all proteins. If two or more identical subunits join together, the result is called a homodimer, homotrimer, homotetramer, etc. If the subunits are different, the result is called a heteromer. For example, the protein haemoglobin is a tetramer consisting of two α chains and two β chains.
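The naming scheme for these assemblies follows a simple rule: a prefix says whether the subunits are identical or different, and the ending says how many there are. A minimal Python sketch of that rule is shown here. The function name and the subunit labels are hypothetical, and only the sizes up to a tetramer are given special names in this illustration.

```python
# Names for the number of subunits; larger assemblies fall back to "N-mer".
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer"}


def describe_assembly(subunits: list[str]) -> str:
    """Name a protein assembly from a list of subunit labels."""
    count = len(subunits)
    if count == 0:
        raise ValueError("an assembly needs at least one subunit")
    if count == 1:
        return "monomer"
    size = SIZE_NAMES.get(count, f"{count}-mer")
    # Identical labels mean identical subunits ("homo"); otherwise "hetero".
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + size


if __name__ == "__main__":
    # Haemoglobin: two alpha chains and two beta chains.
    print(describe_assembly(["alpha", "alpha", "beta", "beta"]))  # heterotetramer
    print(describe_assembly(["a", "a"]))  # homodimer
```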

Other groups may also be attached to proteins to assist in their function. Lipids may be incorporated to form lipoproteins, which may be associated with the cell membrane or involved in the transport of lipid substances like cholesterol in the aqueous environment of the blood. Carbohydrates associate with proteins to form glycoproteins, which are important as cell surface proteins involved in cellular recognition and in structural components like cartilage.

Proteins with a transport or catalytic role may also incorporate metal ions encased in special molecular cages. For example, each subunit in the haemoglobin molecule contains a carbon- and nitrogen-containing cage called a haem group, which binds an iron ion. This allows the haemoglobin molecule to transport molecular oxygen throughout the body in the blood. Such proteins are called metalloproteins.

Protein biochemistry links

(Note : these links open to external websites)

Folding@Home – a distributed computer project aimed at investigating how proteins fold. Download the software and help protein scientists in their research.

Folding@Home’s guide to Proteins – an excellent summary from amino acids to protein folding. Includes some great animations.

Introduction to Protein Structure – a presentation by Dr Frank Gorga.

Interactive Concepts in Biochemistry – a wide range of molecular biology related interactive activities, including information on protein folding and an amino acid identification game.