In prokaryotes, almost all the DNA codes for proteins.
In eukaryotes, however, most of the DNA does not code for proteins.
About 97 % of human DNA is noncoding and has often been called “junk DNA,” although parts of it have structural roles or are transcribed, as described below.
Only about 3 % of your DNA codes for proteins!
Tandemly Repetitive DNA (Satellite DNA)
A short sequence of DNA is repeated many times in a row.
For example, GTTAC repeated four times: GTTACGTTACGTTACGTTAC
These are collectively called tandemly repetitive DNA because they repeat next to one another (i.e. in tandem).
They are also called satellite DNA, because when the DNA is cut up into small pieces, these segments have different densities than the rest of the DNA.
When the DNA is centrifuged, it forms satellite bands next to the main DNA bands.
Because the number of repeats at a site varies between individuals, tandemly repetitive DNA is used in DNA fingerprinting.
Tandemly repetitive DNA makes up about 10–15 % of mammalian DNA.
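To make the idea of repeats "in tandem" concrete, here is a minimal Python sketch that counts how many back-to-back copies of a unit occur in a sequence. The function name `count_tandem_repeats` is hypothetical and chosen only for this illustration; the sequence is the GTTAC example above.

```python
def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    # Try every start position and count consecutive copies from there.
    for start in range(len(sequence)):
        run = 0
        pos = start
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


example = "GTTACGTTACGTTACGTTAC"
print(count_tandem_repeats(example, "GTTAC"))  # 4
```

The sketch checks every start position, which is slow for long sequences but keeps the logic easy to follow.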
Location
Tandemly repetitive DNA is found in centromeres and telomeres.
It seems that the tandemly repetitive DNA has a structural role for chromosomes:
Centromeres are important in the separation of sister chromatids in cell division.
Telomeres are located at the end of chromosomes and can shorten with each cell division.
They contain thousands of repeats of the nucleotide sequence: TTAGGG.
Tandemly Repetitive DNA Can Cause Diseases
Fragile X Syndrome
“CGG” is repeated hundreds or even thousands of times, creating a “fragile” site on the X chromosome.
It leads to intellectual disability (see page 274).
Huntington's Disease
A “CAG” repeat causes a protein to have long stretches of the amino acid glutamine.
This leads to a neurological disorder that results in death.
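The link between a CAG repeat and a glutamine stretch follows from the genetic code, in which the codon CAG specifies glutamine (one-letter code Q). A minimal Python sketch, assuming a toy gene made of a start codon, a run of CAG codons, and a stop codon; the helper names and the repeat counts are hypothetical and are not taken from any real gene:

```python
# Only the codons needed for this illustration; the full genetic code
# has 64 entries.
CODON_TABLE = {"ATG": "M", "CAG": "Q", "CAA": "Q", "TAA": "*"}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError outside the toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def toy_gene(n_repeats: int) -> str:
    """Hypothetical gene: start codon, n CAG repeats, stop codon."""
    return "ATG" + "CAG" * n_repeats + "TAA"


print(translate(toy_gene(5)))               # MQQQQQ
print(translate(toy_gene(40)).count("Q"))   # 40
```

Each extra CAG in the DNA adds exactly one glutamine to the protein, so a longer repeat gives a longer glutamine stretch.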
Interspersed Repetitive DNA
Interspersed repetitive DNA accounts for 25–40 % of mammalian DNA.
The repeats of interspersed repetitive DNA are not found next to each other as in tandemly repetitive DNA.
They are scattered randomly throughout the genome.
The units are hundreds to thousands of base pairs long.
Copies are similar but not identical to each other.
Famous example: Alu elements
300 base pairs long
Can be transcribed, but the function, if any, is not known.
Comprise 5 % of human genome!
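The two Alu figures above (300 base pairs per element, 5 % of the genome) imply a rough copy number. The Python sketch below does the arithmetic. The genome size of about 3.2 billion base pairs is an assumption brought in for this example and is not stated in these notes, so the result is only an order-of-magnitude estimate.

```python
# Figures taken from the notes above.
ALU_LENGTH_BP = 300
ALU_PERCENT_OF_GENOME = 5

# Assumption (not stated in the notes): approximate size of the
# haploid human genome in base pairs.
GENOME_SIZE_BP = 3_200_000_000

alu_total_bp = GENOME_SIZE_BP * ALU_PERCENT_OF_GENOME // 100
estimated_copies = alu_total_bp // ALU_LENGTH_BP

print(f"Alu DNA in total: {alu_total_bp:,} bp")
print(f"Estimated Alu copies: about {estimated_copies:,}")
```

Under these assumptions the estimate comes out at roughly half a million copies.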
Tandemly Repetitive DNA versus Interspersed Repetitive DNA
Proportion of mammalian DNA
Tandemly repetitive DNA: 10–15 %
Interspersed repetitive DNA: 25–40 %
Length of each repeated unit
Tandemly repetitive DNA: 1–10 base pairs
Interspersed repetitive DNA: 100–10 000 base pairs
Relevant Numerical Characteristics
Tandemly repetitive DNA, total length of repetitive DNA per site, in base pairs:
Regular satellite DNA: 100 000–10 million
Minisatellite DNA: 100–100 000
Microsatellite DNA: 10–100
Interspersed repetitive DNA, number of repetitions per genome: 10–1 million
Notes
Tandemly repetitive DNA: repeated units at a site are usually identical.
Interspersed repetitive DNA: “copies” are very similar but not identical.
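The three satellite classes differ only in the total length of repetitive DNA at a site, so the ranges above can be written as a simple lookup. A minimal Python sketch; the function name is hypothetical, and since the listed ranges share their endpoints (100 and 100 000), assigning a boundary value to the smaller class is an assumption made here for illustration.

```python
def classify_satellite(total_length_bp: int) -> str:
    """Classify a tandem-repeat site by its total length in base pairs."""
    if total_length_bp < 10:
        return "below the listed ranges"
    if total_length_bp <= 100:
        return "microsatellite"      # 10-100 bp
    if total_length_bp <= 100_000:
        return "minisatellite"       # 100-100 000 bp
    if total_length_bp <= 10_000_000:
        return "regular satellite"   # 100 000-10 million bp
    return "above the listed ranges"


# The GTTAC example from earlier: 4 repeats of a 5 bp unit = 20 bp.
print(classify_satellite(4 * 5))      # microsatellite
print(classify_satellite(5_000))      # minisatellite
print(classify_satellite(2_000_000))  # regular satellite
```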
Some Repetitive DNA Sequences are Transcribed (But Don't Make Proteins): Multigene Families
A collection of identical or very similar genes.
The entire family of genes probably evolved from a single ancestral gene.
Famous example: rRNA (ribosomal RNA)
Ribosomes, the large structures that make proteins, are made from proteins and RNA.
Four different pieces of rRNA are used to make up a ribosome: 18S, 5.8S, 28S, and 5S.
Three of these rRNAs (18S, 5.8S, and 28S) occur in the genome as a gene family that is transcribed together as a single unit.
The entire multigene family is repeated nearly 300 times in clusters on five different chromosomes!
It makes sense to have many repeats of this multigene family because each cell needs many ribosomes for protein synthesis.
Figure 14.2, Purves's Life: The Science of Biology, 7th Edition
Pseudogenes
Pseudogenes are DNA sequences that are similar to real genes, but lack the regulatory sequences necessary for gene expression (e.g. promoters).
Transposons and Retrotransposons
Many interspersed repetitive sequences are not stably integrated in the genome; they move from place to place.
These are called transposable elements, or transposons.
A transposon moves as DNA using the enzyme transposase, whereas a retrotransposon moves through an RNA intermediate that reverse transcriptase copies back into DNA.
They can sometimes disrupt functional genes by inserting into them.
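The outcome of the two ways of moving can be sketched on a toy genome written as a string. This is a simplified model under stated assumptions: the transposase route is treated as "cut and paste" (the element leaves its old site), and the reverse transcriptase route as "copy and paste" (the original stays and a new copy is inserted). The marker `[TE]`, the toy genome, and both function names are hypothetical.

```python
def cut_and_paste(genome: str, element: str, target: int) -> str:
    """Excise `element` from its current site and insert it at `target`.

    `target` is an index into the genome after the element is removed.
    """
    site = genome.find(element)
    if site == -1:
        raise ValueError("element not found in genome")
    without = genome[:site] + genome[site + len(element):]
    return without[:target] + element + without[target:]


def copy_and_paste(genome: str, element: str, target: int) -> str:
    """Leave `element` where it is and insert a new copy at `target`."""
    if element not in genome:
        raise ValueError("element not found in genome")
    return genome[:target] + element + genome[target:]


genome = "aaaa[TE]ccccGENE"  # toy genome; "GENE" stands for a working gene

print(cut_and_paste(genome, "[TE]", 10))   # aaaaccccGE[TE]NE
print(copy_and_paste(genome, "[TE]", 14))  # aaaa[TE]ccccGE[TE]NE
```

In both outputs the element lands inside GENE, which illustrates how an insertion can interrupt a gene.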
Figure 14.3, Purves's Life: The Science of Biology, 7th Edition; Figure 19.5, page 350, Campbell's Biology, 5th Edition
Evolution of a Multigene Family: Hemoglobin Proteins
Hemoglobin has quaternary structure: it is composed of four polypeptide subunits:
Two α-globins
Two β-globins
Hypothesis: one ancestral globin gene
Duplication: the ancestral globin was duplicated, producing two copies in the genome.
Mutation: each gene mutated, producing two slight variations: alpha and beta.
Transposition: one gene moved to another chromosome via a transposon.
Duplications and mutations:
The α and β genes undergo further duplications and mutations.
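The steps of this hypothesis can be walked through on a toy genome. In the Python sketch below, the chromosome names (`chrA`, `chrB`), the 12-base sequence, and the mutation positions are all hypothetical and serve only to show the order of events; they do not represent real globin genes.

```python
def point_mutation(seq: str, position: int, new_base: str) -> str:
    """Return a copy of `seq` with the base at `position` replaced."""
    return seq[:position] + new_base + seq[position + 1:]


# Starting point: one ancestral globin gene on one chromosome.
genome = {"chrA": {"globin": "ATGGTGCTGTCT"}, "chrB": {}}

# 1. Duplication: the ancestral gene now exists in two copies.
genome["chrA"]["globin_copy"] = genome["chrA"]["globin"]

# 2. Mutation: each copy changes slightly, giving alpha and beta.
genome["chrA"]["alpha"] = point_mutation(genome["chrA"].pop("globin"), 3, "C")
genome["chrA"]["beta"] = point_mutation(genome["chrA"].pop("globin_copy"), 7, "A")

# 3. Transposition: one gene moves to another chromosome.
genome["chrB"]["beta"] = genome["chrA"].pop("beta")

# 4. Further duplications and mutations within each family.
genome["chrA"]["alpha_2"] = point_mutation(genome["chrA"]["alpha"], 10, "G")
genome["chrB"]["beta_2"] = point_mutation(genome["chrB"]["beta"], 5, "A")

for chromosome, genes in genome.items():
    print(chromosome, genes)
```

The end state has two related gene families on different chromosomes, each with members that are similar but not identical.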