Overview
This module will examine how information is encoded in DNA, and how that information is interpreted to bring about changes in cells and tissues.
Objectives
1. Understand the triplet nature of the genetic code, and know the meaning of the term codon.
2. Know that the code is degenerate, and what that means.
3. Know that the code is unambiguous, and what that means.
4. Know the identities of the start and stop codons, and understand how they work.
The Genetic Code
Earlier modules have noted that DNA stores genetic information. That much was clear from the experiments of Avery, MacLeod, and McCarty, and of Hershey and Chase. However, these experiments did not explain how DNA stores that information, and the elucidation of the structure of DNA by Watson and Crick did not offer an obvious explanation either. DNA is constructed from nucleotides containing only four possible bases (A, G, C, and T). The big question was: how do you code for all of the traits of an organism using only a four-letter alphabet?
Recall the central dogma of molecular biology. The information stored in DNA is ultimately transferred to protein, which is what gives cells and tissues their particular properties. Proteins are linear chains of amino acids, and there are 20 amino acids found in proteins. So the real question becomes: how does a four-letter alphabet code for all possible combinations of 20 amino acids?
By constructing multi-letter "words" out of the four letters in the alphabet, it is possible to code for all of the amino acids. One-letter words could specify only 4 amino acids, and two-letter words only 16, which is still too few. It is, however, possible to make 64 different three-letter words from just the four letters of the genetic alphabet, which covers the 20 amino acids easily. This kind of reasoning led to the proposal of a triplet genetic code.
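The counting argument above can be checked directly. The following is a minimal sketch that enumerates every word of a given length over the four RNA bases; the names `BASES` and `count_words` are illustrative and do not come from the module.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet; DNA would use T in place of U


def count_words(length):
    """Count the distinct words of the given length over the four bases."""
    return len(list(product(BASES, repeat=length)))


# Word lengths 1, 2, and 3 give 4, 16, and 64 possible words.
for n in (1, 2, 3):
    print(n, count_words(n))
```

Only the three-letter words are numerous enough to give each of the 20 amino acids at least one word of its own.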
Experiments involving in vitro translation of short synthetic RNAs eventually confirmed that the genetic code is indeed a triplet code. The three-letter "words" of the genetic code are known as codons. This experimental approach was also used to work out the relationship between individual codons and the various amino acids. After this "cracking" of the genetic code, several properties of the genetic code became apparent:
* The genetic code is composed of nucleotide triplets. In other words, three nucleotides in mRNA (a codon) specify one amino acid in a protein.
* The code is non-overlapping. This means that successive triplets are read in order. Each nucleotide is part of only one triplet codon.
* The genetic code is unambiguous. Each codon specifies a particular amino acid, and only one amino acid. In other words, the codon ACG codes for the amino acid threonine, and only threonine.
* The genetic code is degenerate. Although each codon specifies only one amino acid, most amino acids can be specified by more than one codon.
* The code is nearly universal. Almost all organisms in nature (from bacteria to humans) use exactly the same genetic code. The rare exceptions include some changes in the code in mitochondria, and in a few protozoan species.
A Non-overlapping Code
The genetic code is read in groups (or "words") of three nucleotides. After reading one triplet, the "reading frame" shifts over three letters, not just one or two. For example, the sequence GACUGACUGACU... would not be read GAC, ACU, CUG, UGA...
Rather, it would be read GAC, UGA, CUG, ACU...
Degeneracy of the Genetic Code
There are 64 different triplet codons, and only 20 amino acids. Unless some amino acids are specified by more than one codon, some codons would be completely meaningless. Therefore, some redundancy is built into the system: some amino acids are coded for by multiple codons.
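To make the redundancy concrete, the sketch below builds the standard genetic code as a lookup table and counts how many codons specify each amino acid. The compact one-letter string is the usual way of writing the standard table with bases ordered U, C, A, G; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order.
# '*' marks a stop codon; other letters are one-letter amino acid codes.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each codon (e.g. "ACG") to its amino acid (e.g. "T" for threonine).
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Count the codons assigned to each amino acid, ignoring stop codons.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), "codons in total")
print(len(codons_per_amino_acid), "amino acids")
for aa, count in sorted(codons_per_amino_acid.items(), key=lambda kv: kv[1]):
    print(aa, count)
```

In the standard code, 61 codons specify amino acids and 3 are stop codons. Methionine (M) and tryptophan (W) have a single codon each, while leucine (L), serine (S), and arginine (R) have six.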
In some cases, the redundant codons are related to each other by sequence; for example, leucine is specified by the codons CUU, CUA, CUC, and CUG (as well as UUA and UUG). Note how the first four codons are the same except for the third nucleotide position. This third position is known as the "wobble" position of the codon. This is because in a number of cases, the identity of the base at the third position can wobble, and the same amino acid will still be specified. This property allows some protection against mutation: if a mutation occurs at the third position of a codon, there is a good chance that the amino acid specified in the encoded protein won't change.
Reading Frames
Because the genetic code is triplet based, there are three possible ways a particular message can be read, depending on which of the first three nucleotides begins the first triplet, as shown in the following figure:
[Figure: the three possible reading frames of a message]
Clearly, each of these would yield completely different results. To illustrate the point using an analogy, consider the following set of letters:
theredfoxatethehotdog
If this string of letters is read three letters at a time, there is one reading frame that works:
the red fox ate the hot dog
and two reading frames that produce nonsense:
t her edf oxa tet heh otd og
th ere dfo xat eth eho tdo g
Genetic messages work much the same way: there is one reading frame that makes sense, and two reading frames that are nonsense.
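The analogy above can be reproduced in a few lines. This is a minimal sketch; the function name `read_in_frame` is illustrative. It drops the leftover letters at either end of a frame and keeps only complete three-letter words.

```python
def read_in_frame(message, frame, size=3):
    """Split message into non-overlapping words of `size` letters.

    Reading starts at offset `frame` (0, 1, or 2). Letters before the
    offset and any incomplete trailing word are dropped.
    """
    end = len(message) - (len(message) - frame) % size
    return [message[i:i + size] for i in range(frame, end, size)]


message = "theredfoxatethehotdog"
for frame in range(3):
    print(frame, " ".join(read_in_frame(message, frame)))
```

Only the frame starting at offset 0 yields readable words. The same function applies unchanged to an RNA sequence, where each three-letter word is a codon.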
So how is the reading frame chosen for a particular mRNA? The answer is found in the genetic code itself. The code contains signals for starting and stopping translation of the code. The start codon is AUG. AUG also codes for the amino acid methionine, but the first AUG encountered signals for translation to begin. The start codon sets the reading frame: AUG is the first triplet, and subsequent triplets are read in the same reading frame. Translation continues until a stop codon is encountered. There are three stop codons: UAA, UAG, and UGA. To be recognized as a stop codon, the triplet must be in the same reading frame as the start codon. A reading frame between a start codon and an in-frame stop codon is called an open reading frame.
Let's see how a sequence would be translated by considering the following sequence:
5'-GUCCCGUGAUGCCGAGUUGGAGUCGAUAACUCAGAAU-3'
First, the code is read in a 5' to 3' direction. The first AUG read in that direction sets the reading frame, and subsequent codons are read in frame (AUG, CCG, AGU, UGG, AGU, CGA) until the stop codon, UAA, is encountered. Note that the three nucleotides UGA at positions 7 to 9, which overlap the A of the start codon, would otherwise constitute a stop codon, except that this triplet is out of frame and is not recognized as a stop.
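The reading of this sequence can be traced with a short sketch. It assumes the standard genetic code (written in the usual compact UCAG order) and simply starts at the first AUG; the function name `translate` and its return format are illustrative choices, not part of the module.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Returns (peptide, start_index, stop_index) using 0-based indices and
    one-letter amino acid codes, or None if there is no AUG or no
    in-frame stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            return "".join(peptide), start, i
        peptide.append(amino_acid)
    return None


seq = "GUCCCGUGAUGCCGAGUUGGAGUCGAUAACUCAGAAU"
print(translate(seq))
```

For this sequence the sketch yields the peptide MPSWSR (methionine, proline, serine, tryptophan, serine, arginine), with the start codon at index 8 and the stop codon at index 26, counting from zero.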
In this sequence, there are nucleotides at either end that are outside of the open reading frame. Because they are outside of the open reading frame, these nucleotides are not used to code for amino acids. This is a common situation in mRNA molecules. The region at the 5' end that is not translated is called the 5' untranslated region, or 5' UTR. The region at the 3' end is called the 3' UTR. These sequences, even though they do not encode any polypeptide sequence, are not wasted: in eukaryotes these regions typically contain regulatory sequences that can affect when a message gets translated, where in a cell an mRNA is localized, and how long an mRNA lasts in a cell before it is destroyed. A detailed examination of these sequences is beyond the scope of this course.
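As a rough illustration of how the message divides into these three regions, the sketch below uses a regular expression to locate the open reading frame and slices off whatever lies on either side. The names `ORF_PATTERN` and `split_mrna` are hypothetical. Two simplifying assumptions are made: the stop codon is kept with the open reading frame slice, and if the first AUG had no in-frame stop codon the pattern would move on to a later AUG, which a real ribosome would not necessarily do.

```python
import re

# AUG, then as few whole codons as possible, then the first in-frame stop codon.
ORF_PATTERN = re.compile(r"AUG(?:[ACGU]{3})*?(?:UAA|UAG|UGA)")


def split_mrna(mrna):
    """Return (five_prime_utr, open_reading_frame, three_prime_utr), or None."""
    match = ORF_PATTERN.search(mrna)
    if match is None:
        return None
    return mrna[:match.start()], match.group(), mrna[match.end():]


seq = "GUCCCGUGAUGCCGAGUUGGAGUCGAUAACUCAGAAU"
five_prime_utr, orf, three_prime_utr = split_mrna(seq)
print("5' UTR:", five_prime_utr)
print("ORF:   ", orf)
print("3' UTR:", three_prime_utr)
```

For the example sequence this gives eight untranslated nucleotides at each end (GUCCCGUG and CUCAGAAU) around a 21-nucleotide open reading frame.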
The Genetic Code: Summary of Key Points
* The genetic code is a triplet code, with codons of three bases coding for specific amino acids. Each triplet codon specifies only one amino acid, but an individual amino acid may be specified by more than one codon.
* A start codon, AUG, sets the reading frame, and signals the start of translation of the genetic code. Translation continues in a non-overlapping fashion until a stop codon (UAA, UAG, or UGA) is encountered in frame. The nucleotides between the start and stop codons comprise an open reading frame.