The Human Genome

To appreciate current findings and to approach rationally the laboratory investigation of a patient suspected of having a genetic disorder, it is essential to understand some basic principles of genetics and molecular biology. All biological activities depend directly or indirectly on the proteins that lie within cells. The totality of genetic information for an organism is referred to as the genome. The human genome is divided among 23 pairs of chromosomes; females have two X chromosomes and 22 pairs of autosomes (non-sex chromosomes), and males have an X and a Y chromosome in addition to the 22 pairs of autosomes. Genetic information is contained in the linear sequence of nucleotides, or bases, in deoxyribonucleic acid (DNA). Each chromosome is a single enormous molecule of linear DNA associated with a complex assortment of proteins that play structural, regulatory, and enzymatic roles during DNA replication and gene expression. The autosomes are numbered according to their sizes and range from about 50 million bases in chromosome 22 to more than 200 million bases in chromosome 1. Genetic information is also contained within the mitochondrial genome of 16,569 nucleotides. It is important to recognize that this genome encodes only a small proportion of the proteins that constitute mitochondria. The majority of mitochondrial proteins are encoded by the nuclear genome.

It is estimated that 50,000 to 100,000 genes are within the human genome. These genes range in size from nearly 100 bases to about 2.5 million bases. Gene expression is the sequence of processes from messenger RNA synthesis, or transcription, to protein synthesis, or translation. Gene transcription is a complex process and begins with the regulated initiation of transcription from a transcription initiation site. This initiation process is of fundamental importance and is mediated by proteins termed transcription factors. A primary transcript is synthesized down the length of the entire gene from the 5′ to the 3′ end. The information in most genes is broken into discontinuous pieces termed exons that are separated by introns, or intervening sequences. These introns are removed from the primary transcript through splicing. Most spliced products then have a string of adenosine bases added to their 3′ ends. This spliced and polyadenylated product is the mRNA that is then transported to the cytoplasm of the cell and serves as a template for the synthesis of protein. An individual gene, then, is the composite not only of the sequences that code for protein, but also the introns, the transcribed but not translated regions at the 5′ and at the 3′ ends of the mRNA, and sequences that regulate the transcription initiation, termination, and splicing processes. Regulatory sequences can flank the transcription initiation and termination site or even lie within introns or exons.
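The path from primary transcript to mRNA described above can be sketched in a few lines of Python. This is only a toy illustration: the sequence, the exon coordinates, the tail length, and the function names `splice` and `polyadenylate` are all assumptions made for the example, and real splicing depends on regulatory sequences that are not modeled here.

```python
def splice(primary_transcript, exons):
    """Join the exon segments and drop everything between them (the introns).

    `exons` is a list of (start, end) positions, end-exclusive, in 5' to 3' order.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


def polyadenylate(spliced, tail_length=20):
    """Append a string of adenosine bases to the 3' end (tail length is arbitrary)."""
    return spliced + "A" * tail_length


# Hypothetical primary transcript: exons in upper case, intron in lower case.
primary = "GGAUGGCU" + "guaaguuuuag" + "GAAUAAGC"
exon_positions = [(0, 8), (19, 27)]  # assumed coordinates for this toy sequence

mrna = polyadenylate(splice(primary, exon_positions), tail_length=5)
print(mrna)  # GGAUGGCUGAAUAAGCAAAAA
```

The 5′ and 3′ untranslated regions stay in the mRNA; only the intron is removed.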

The sequence of bases in an mRNA encodes the amino acid sequence of the protein for a gene. Nucleotide information is interpreted in groups of three bases, termed codons. With 4 nucleotide bases, 64 possible codons exist. Four codons are particularly important. AUG encodes methionine, which is always the first amino acid of the primary translation product (which may be cleaved to make a mature protein product). The codons UAG, UGA, and UAA are stop codons. Mutations that cause disease can be as subtle as the substitution of one nucleotide base for another. The consequences of point mutations demonstrate the great variety of effects that result from an alteration in a gene and illustrate most of the key problems that result from mutations. The results of such a mutation depend critically on the location of the mutation within the gene as well as on the particular bases that are involved. If the mutation lies within the coding region (also known as the open reading frame), the result could be the substitution of one amino acid for another. Some point mutations result in no change of the amino acid and are termed silent mutations. Other protein coding region mutations can result in the conversion of an amino acid encoding codon into a termination or stop codon. A mutation affecting an initiation codon would prevent translation initiation unless another AUG codon was present elsewhere. Point mutations can occur outside of the coding region with the resulting consequences including obliteration or creation of splice sites, destruction or creation of transcription initiation, termination, and regulatory sites.
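The codon rules and the outcomes of point mutations within the coding region can be illustrated with a short Python sketch. It uses the standard genetic code; the helper names `translate` and `classify_point_mutation` are hypothetical, and starting translation at the first AUG found is a simplifying assumption.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first stop codon (one-letter amino acid codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so no translation in this simple model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_point_mutation(codon, position, new_base):
    """Describe the effect of substituting one base within a single codon."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "*":
        return "premature stop"
    return "amino acid substitution"


print(len(CODON_TABLE))                        # 64
print(translate("GGAUGGCUGAAUAAGC"))           # MAE
print(classify_point_mutation("GAA", 2, "G"))  # silent (GAG still encodes glutamate)
print(classify_point_mutation("GAA", 1, "U"))  # amino acid substitution (GUA encodes valine)
print(classify_point_mutation("GAA", 0, "U"))  # premature stop (UAA)
```

The same codon, GAA, gives three different outcomes depending on which base changes and what it changes to, which is the point the paragraph makes about location and base identity.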

Table 25-1

| Disease | Gene or protein | Normal range/pathological range | Other mutations |
|---|---|---|---|
| Huntington's disease | | 15 to 20/39 to 100 | |
| Machado-Joseph disease (MJD)/SCA III | | 13 to 36/68 to 80 | |
| X-linked spinal and bulbar atrophy | Androgen receptor | 15 to 20/40 to 100 | |
| | Ataxin-2 (ATX2) | 17 to 24/36 to 52 | |
| | Alpha (1A) calcium channel subunit (CACNA1A) | 6 to 17/21 to 30 | |
| | Ataxin-7 (AT7) | 7 to 17/38 to 130 | |
| | | 7 to 23/49 to 80 | |
| Myotonic dystrophy | | 5 to 27/50 to thousands | |
| Fragile X | | | ILE367ASN; 1BP DEL, ACT125CT, FS159TER; IVS1, G-T, -1; and G-A, +1 |
| Friedreich's ataxia | Frataxin (FRDA) | 7 to 20/200 to 900 | Leu106Ter, splice acceptor mutation intron 3, Ile154Phe |

SCA, spinocerebellar ataxia; DRPLA, dentatorubropallidoluysian atrophy.

Mutations can also result from deletions or insertions, which can range in size from a single nucleotide to thousands of nucleotides. When an insertion or deletion lying within the coding region for a protein involves a number of bases that are not a multiple of 3, then the coding frame is disrupted, resulting in a protein sequence that diverges from the normal sequence from that point on, unless a second mutation restores the proper reading frame.
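The effect of deletion size on the reading frame can be seen by splitting a coding sequence into codons before and after a deletion. The sketch below is a toy example in Python; the sequence and the helper names `codons` and `delete` are assumptions made for illustration.

```python
def codons(seq):
    """Split a sequence into complete three-base codons, ignoring leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq, start, length):
    """Return the sequence with `length` bases removed beginning at `start`."""
    return seq[:start] + seq[start + length:]


coding = "AUGGCUGAAUUCGGAUAA"  # hypothetical coding region

print(codons(coding))
# ['AUG', 'GCU', 'GAA', 'UUC', 'GGA', 'UAA']

# Deleting one base (not a multiple of 3) changes every downstream codon
# and, in this example, also loses the stop codon.
print(codons(delete(coding, 3, 1)))
# ['AUG', 'CUG', 'AAU', 'UCG', 'GAU']

# Deleting three bases removes one codon but leaves the rest in frame.
print(codons(delete(coding, 3, 3)))
# ['AUG', 'GAA', 'UUC', 'GGA', 'UAA']
```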

A recently recognized but important category of mutations, which is considered a subtype of an insertion, is the trinucleotide repeat expansion mutations (Table 25-1). These mutations are characterized by an elongation of a gene region with repeats of a three-nucleotide sequence. The precise sequence varies among the disorders, as does the scale of expansions. Some general distinctions can be drawn. When the trinucleotide, or triplet, repeat unit lies outside of the coding region, the expansions may be very large. For example, the mutations that cause fragile X mental retardation syndrome (FRAXA) (see Chapter 32) are expansions of a region involving the trinucleotide CCG that lies in the 5′ untranslated, but transcribed portion of the gene. Normally, 5 to 50 CCG repeats exist in this region, but affected individuals may have hundreds or even thousands of repeats. Expansions of over several hundred CTG triplet repeats have been found in patients with myotonic dystrophy (Steinert's disease). The repeat unit CTG is normally present in only 5 to 27 copies and lies in the 3′ untranslated region of the myotonin gene. The Friedreich's ataxia triplet repeat unit GAA lies within the first intron of the frataxin gene. Normal individuals have 7 to 22 GAA repeats, whereas individuals with Friedreich's ataxia have about 200 to 1000 repeats. Interestingly, point mutations in the FRAXA and Friedreich's ataxia genes can also cause mental retardation and Friedreich's ataxia, respectively. The other type of triplet repeat mutations involves repeats of the nucleotide bases CAG, which encode the amino acid glutamine and lie within the coding domains of their respective genes. Not only is the repeat unit the same for these disorders, but also the scale of the repeat expansion is quite similar; whereas the length of the normal repeat is approximately 20, affected individuals have 40 or more repeats. To date, all mutations of this type are of special importance to neurologists, because the major syndromes affect the brain, spinal cord, or muscles.
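Counting the copies of a repeat unit in a sequence is a simple string operation, sketched below in Python. The function names, the example sequence, and the default threshold of 40 are assumptions for illustration; the threshold is the approximate figure quoted above for CAG disorders, not a diagnostic cutoff, and laboratory sizing of repeats relies on other methods.

```python
import re


def longest_repeat_run(dna, unit):
    """Return the largest number of consecutive copies of `unit` found in `dna`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)


def is_expanded_cag(repeat_count, threshold=40):
    """Rough check against the '40 or more repeats' figure quoted in the text."""
    return repeat_count >= threshold


# Hypothetical sequence fragments containing CAG tracts of different lengths.
normal_allele = "GGC" + "CAG" * 20 + "CCGCCA"
expanded_allele = "GGC" + "CAG" * 42 + "CCGCCA"

for allele in (normal_allele, expanded_allele):
    count = longest_repeat_run(allele, "CAG")
    print(count, is_expanded_cag(count))
# 20 False
# 42 True
```

The same counting function works for the other repeat units mentioned above, such as CTG or GAA, by changing the `unit` argument.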
