DNA STRUCTURE AND THE GENOME
Figure 2.4 The human genome can be classified into different types of DNA based on its structure and function.
(Modified with permission from Jasinska, A., and Krzyzosiak, W.J. (2004) Repetitive sequences that shape the human transcriptome. FEBS Letters 567, 136-141).
Around 1.5% of the genome is directly involved in encoding proteins [2-4]. Gene structure, sequence and activity are a focus of medical genetics because of the interest in genetic defects and the expression of genes within cells. Approximately 23.5% of the genome is classified as genic sequence but does not encode proteins. This non-coding genic sequence contains several elements that are involved in the regulation of genes, including promoters, enhancers, repressors and polyadenylation signals; the majority of gene-related DNA, around 23%, is made up of introns, pseudogenes and gene fragments.
Most of the genome, approximately 75%, is extragenic. Around 20% of the genome is single-copy DNA, which in most cases has no known function, although some regions appear to be under evolutionary pressure and presumably play an important, but as yet unknown, role. The largest portion of the genome, over 50%, is composed of repetitive DNA; most of this is interspersed repetitive DNA, whose repeat elements are dispersed throughout the genome and which makes up around 45% of the genome.
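As a quick arithmetic check, the percentages quoted so far can be tallied. The short Python sketch below uses only the approximate figures given in the text; the variable names are illustrative, and "over 50%" is treated as a lower bound of 50%, which is an assumption made purely for the calculation.

```python
# Approximate shares of the human genome, in percent, as quoted in the text.
genic = {"protein_coding": 1.5, "non_coding_genic": 23.5}
extragenic_total = 75.0

# "over 50%" repetitive DNA is treated here as a lower bound (assumption).
extragenic_parts = {"single_copy": 20.0, "repetitive_minimum": 50.0}

genic_total = sum(genic.values())
whole_genome = genic_total + extragenic_total
extragenic_unassigned = extragenic_total - sum(extragenic_parts.values())

print(f"genic total: {genic_total}%")
print(f"genic + extragenic: {whole_genome}%")
print(f"extragenic share not itemised (at most): {extragenic_unassigned}%")
```

The genic and extragenic classes together cover the whole genome. The two extragenic classes named in the text sum to at least 70% of the genome, so at most about 5% is not itemised by these rounded figures.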
The four most common types of interspersed repetitive element are short interspersed elements (SINEs), long interspersed elements (LINEs), long terminal repeats (LTRs) and DNA transposons; together they account for 45% of the genome [3, 7]. These repeat sequences are all derived through transposition. The most common interspersed repeat element is the Alu SINE: the repeat is approximately 300 bp long and, with over 1 million copies, comprises around 10% of the genome.
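The Alu figures can be checked with a rough calculation. The sketch below multiplies the copy number by the repeat length and divides by the size of the genome. The haploid genome size of about 3.2 billion bp is an assumption, since it is not stated in this section, and the copy number is taken as exactly 1 million even though the text says "over 1 million".

```python
# Assumed haploid genome size in base pairs; not stated in this section.
GENOME_SIZE_BP = 3.2e9

ALU_COPIES = 1_000_000  # "over 1 million copies", used here as a lower bound
ALU_LENGTH_BP = 300     # approximate length of one Alu repeat

alu_total_bp = ALU_COPIES * ALU_LENGTH_BP
alu_fraction = alu_total_bp / GENOME_SIZE_BP

print(f"Alu total: {alu_total_bp / 1e6:.0f} Mb, {alu_fraction:.1%} of the genome")
```

Under these assumptions the Alu repeats total about 300 Mb, a little over 9% of the genome, which is in line with the figure of around 10% quoted above.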
There is a similar number of LINE elements within the genome. The most common is LINE1, which is between 6 and 8 kb long and is represented in the genome around 900,000 times; LINEs make up around 20% of the genome [3, 7].
The other class of repetitive element is tandemly repeated DNA. This can be separated into three different types: satellite DNA, minisatellites, and microsatellites.