At the beginning of the chapter, we defined proteins as strings of amino acids that fold into complex 3-D shapes. There are 20 standard amino acids that can be strung together in different combinations in humans, and the result is that proteins can perform an impressive number of different functions. For instance, muscle fibers are proteins that help facilitate movement. A special class of proteins (immunoglobulins) helps protect the organism by detecting disease-causing pathogens in the body. Protein hormones, such as insulin, help regulate physiological activity. Blood hemoglobin is a protein that transports oxygen throughout the body. Enzymes are also proteins, and they are catalysts for biochemical reactions that occur in the cell (e.g., metabolism). Larger-scale protein structures can be seen as physical features of an organism (e.g., hair and nails).
Nucleotides in our DNA provide the coding instructions on how to make proteins. Making proteins, also known as protein synthesis, can be broken down into two main steps referred to as transcription and translation. The purpose of transcription, the first step, is to make a ribonucleic acid (RNA) copy of our genetic code. Although there are many different types of RNA molecules that have a variety of functions within the cell, we will mainly focus on messenger RNA (mRNA). Transcription concludes with the processing (splicing) of the mRNA. The second step, translation, uses mRNA as the instructions for chaining together amino acids into a new protein molecule (Figure 3.17).
Unlike double-stranded DNA, RNA molecules are single-stranded nucleotide sequences (Figure 3.18). Additionally, while DNA contains the nucleotide thymine (T), RNA does not—instead its fourth nucleotide is uracil (U). Uracil is complementary to (or can pair with) adenine (A), while cytosine (C) and guanine (G) continue to be complementary to each other.
For transcription to proceed, a gene must first be turned “on” by the cell. A gene is a segment of DNA that codes for RNA, and genes can vary in length from a few hundred to as many as two million base pairs. The double-stranded DNA is then separated, and one strand of the DNA is used as a template that is read by RNA polymerase. Next, complementary free-floating RNA nucleotides are linked together (Figure 3.19) to form a single-stranded mRNA. For example, if a DNA template is TACGGATGC, then the newly constructed mRNA sequence will be AUGCCUACG.
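The base-pairing rule in the example above can be sketched in a few lines of Python. This is only an illustration of complementary pairing, not a model of RNA polymerase: the names `DNA_TO_RNA` and `transcribe` are made up for this sketch, and strand direction is ignored, as it is in the example.

```python
# Pairing rule used when a DNA template strand is read:
# each DNA base is matched with its complementary RNA base.
# (Names here are illustrative, not from any library.)
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template):
    """Build an mRNA string base by base from a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


print(transcribe("TACGGATGC"))  # AUGCCUACG
```

Note that thymine never appears in the output: wherever the template has adenine (A), the mRNA receives uracil (U).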
Genes contain segments called introns and exons. Exons are considered “coding” while introns are considered “noncoding”—meaning the information they contain will not be needed to construct proteins. When a gene is first transcribed into pre-mRNA, introns and exons are both included (Figure 3.20). However, once transcription is finished, introns are removed in a process called splicing. During splicing, a protein/RNA complex attaches itself to the pre-mRNA. Next, introns are removed and the remaining exons are connected, thus creating a shorter mature mRNA that serves as a template for building proteins.
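Splicing can be pictured as cutting out the introns and joining what is left. The sketch below assumes we already know where the exons sit in a pre-mRNA; the sequence, the exon positions, and the function name `splice` are all invented for illustration, and the intron is written in lowercase only to make it easy to spot.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA, dropping everything between them.

    `exons` is a list of (start, end) positions, counted from 0,
    where `end` is one past the last base of the exon.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: exon 1, a made-up intron (lowercase), exon 2.
pre_mrna = "AUGCCU" + "guaaguuuag" + "ACGUAA"
exons = [(0, 6), (16, 22)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGCCUACGUAA
```

The mature mRNA is shorter than the pre-mRNA (12 bases instead of 22) because the intron has been removed.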
As described above, the result of transcription is a single-stranded mRNA copy of a gene. Translation is the process by which amino acids are chained together to form a new protein. During translation, the mature mRNA is transported outside of the nucleus, where it is bound to a ribosome (Figure 3.21). The nucleotides in the mRNA are read in triplets, which are called codons. Each mRNA codon corresponds to an amino acid, which is carried to the ribosome by a transfer RNA (tRNA). Thus, tRNAs are the link between the mRNA molecule and the growing amino acid chain.
Continuing with our mRNA sequence example from above, the mRNA sequence AUG-CCU-ACG codes for three amino acids. Using a codon table (Figure 3.22), AUG is a codon for methionine (Met), CCU is proline (Pro), and ACG is threonine (Thr). Therefore, the protein sequence is Met-Pro-Thr. AUG, the codon for methionine, is the most common “start codon” for the initiation of protein translation in eukaryotes. As the ribosome moves along the mRNA, the growing amino acid chain exits the ribosome and folds into a protein. When the ribosome reaches a “stop” codon (UAA, UAG, or UGA), the ribosome stops adding any new amino acids, detaches from the mRNA, and the protein is released. Depending upon the amino acid sequence, a linear protein may undergo additional “folding.” The final three-dimensional protein shape is integral to completing a specific structural or functional task.
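Reading an mRNA three nucleotides at a time can also be sketched in Python. The table below is deliberately partial: it holds only the three codons from the example plus the three stop codons, so any other codon raises an error. The names `CODON_TABLE`, `STOP_CODONS`, and `translate` are made up for this sketch, and starting at the first AUG found is a simplification of how a ribosome locates the start codon.

```python
# Partial codon table: only the codons used in the worked example.
CODON_TABLE = {"AUG": "Met", "CCU": "Pro", "ACG": "Thr"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Return the list of amino acids coded by an mRNA string."""
    start = mrna.find("AUG")  # begin at the first start codon
    if start == -1:
        return []  # no start codon, so nothing is translated
    protein = []
    # Step through the sequence one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # a stop codon ends the chain
        protein.append(CODON_TABLE[codon])  # KeyError if codon not in table
    return protein


print("-".join(translate("AUGCCUACG")))     # Met-Pro-Thr
print("-".join(translate("AUGCCUACGUAA")))  # Met-Pro-Thr
```

Both calls give the same chain because the stop codon UAA in the second sequence ends translation without adding an amino acid.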
To see protein synthesis in animation, please check out the From DNA to Protein video on YourGenome.org.
The LCT gene codes for a protein called lactase, an enzyme produced in the small intestine. It is responsible for breaking down the sugar “lactose,” which is found in milk. Lactose intolerance occurs when not enough lactase enzyme is produced and, in turn, digestive symptoms occur. To avoid this discomfort, individuals may take lactase supplements, drink lactose-free milk, or avoid milk products altogether.
The LCT gene is a good example of how cells regulate protein synthesis. The promoter region of the LCT gene helps regulate whether it is transcribed or not transcribed (i.e., turned “on” or “off,” respectively). Lactase production is initiated when a regulatory protein known as a transcription factor binds to a site on the LCT promoter. RNA polymerases are then recruited; they read DNA and string together nucleotides to make RNA molecules. An LCT pre-mRNA is synthesized (made) in the nucleus, and chemical modifications are added to both ends of the pre-mRNA so that the molecule will not be degraded in the cell. Next, a spliceosome complex removes the introns from the LCT pre-mRNA and connects the exons to form a mature mRNA. The LCT mRNA is then translated, and the growing protein folds into the lactase enzyme, which can break down lactose.
Most animals lose their ability to digest milk as they mature because the LCT gene is increasingly transcriptionally “silenced” over time. However, some humans have the ability to digest lactose into adulthood (also known as “lactase persistence”). This means they have a genetic mutation that leads to continuous transcriptional activity of LCT. Lactase persistence mutations are common in populations with a long history of pastoral farming, such as northern European and North African populations. It is believed that lactase persistence evolved because the ability to digest milk was nutritionally beneficial. More information about lactase persistence will be covered in Chapter 14.