|MoBio||RNA Splicing||Chapter 5|
RNA splicing is a process that removes introns and joins exons in a primary transcript. An intron usually contains a clear signal for splicing (e.g., the beta globin gene). In some cases (e.g., the Tau gene), a splicing signal may be masked by a regulatory protein, resulting in alternative splicing. In rare cases (e.g., HIV genes), a pre-mRNA may contain several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs.
Most introns begin with the sequence GU and end with the sequence AG (read in the 5' to 3' direction). These are referred to as the splice donor site and the splice acceptor site, respectively. However, the sequences at these two sites are not sufficient to signal the presence of an intron. Another important sequence is the branch site, located 20-50 bases upstream of the acceptor site. The consensus sequence of the branch site is "CU(A/G)A(C/U)", where the A in the fourth position is conserved in all genes.
In over 60% of cases, the exon sequence flanking the intron is (A/C)AG immediately upstream of the donor site and G immediately downstream of the acceptor site.
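The three intron signals described above (GU at the donor site, AG at the acceptor site, and a branch site 20-50 bases upstream of the acceptor) can be expressed as a simple pattern search. The following is a minimal sketch, not a real splice-site predictor: the function name `find_candidate_introns`, the toy sequence, and the choice to measure the distance from the conserved A to the acceptor AG are all assumptions made for illustration.

```python
import re

def find_candidate_introns(pre_mrna, min_gap=20, max_gap=50):
    """Return (start, end, branch_a) for every GU...AG span with a branch site.

    start/end are 0-based, end exclusive; branch_a is the index of the
    conserved A in CU(A/G)A(C/U). The gap is measured from that A to the
    A of the acceptor AG (an assumption made for this sketch).
    """
    seq = pre_mrna.upper().replace("T", "U")
    # Lookahead patterns so that overlapping matches are all reported.
    donors = [m.start() for m in re.finditer(r"(?=GU)", seq)]
    acceptors = [m.start() for m in re.finditer(r"(?=AG)", seq)]
    branch_as = [m.start() + 3 for m in re.finditer(r"(?=CU[AG]A[CU])", seq)]

    hits = []
    for d in donors:
        for a in acceptors:
            for b in branch_as:
                # The branch motif must start after the donor GU and its
                # conserved A must sit 20-50 bases upstream of the acceptor.
                if d + 2 <= b - 3 and min_gap <= a - b <= max_gap:
                    hits.append((d, a + 2, b))
    return hits

# Hypothetical toy pre-mRNA: exon "CAG", a 44-nt intron, exon "GCC".
toy = "CAG" + "GU" + "C" * 10 + "CUAAC" + "C" * 25 + "AG" + "GCC"
print(find_candidate_introns(toy))  # [(3, 47, 18)]
```

On real sequences a search like this returns many overlapping candidates, which illustrates why these short motifs alone are not sufficient to define an intron.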
The detailed splicing mechanism is quite complex. In short, it involves five snRNAs and their associated proteins. These ribonucleoproteins form a large (60S) complex called the spliceosome. Then, in a two-step enzymatic reaction, the intron is removed and the two neighboring exons are joined together (see Alberts et al.). The branch point A residue plays a critical role in this reaction.
The β-globin gene is a typical example. This gene contains two introns and three exons. Interestingly, the codon for the 30th amino acid, AGG, is split by an intron. As a result, the first two nucleotides (AG) are in one exon and the third nucleotide (G) is in the next exon.
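The split codon can be illustrated with a small sketch. The sequences below are hypothetical toy strings, not the actual β-globin sequence, and the helper names `splice` and `codons` are made up for this example. The point is only that a codon interrupted by an intron becomes contiguous again once the intron is removed.

```python
def splice(pre_mrna, introns):
    """Remove introns given as 0-based, half-open (start, end) intervals."""
    pieces, prev = [], 0
    for start, end in sorted(introns):
        pieces.append(pre_mrna[prev:start])
        prev = end
    pieces.append(pre_mrna[prev:])
    return "".join(pieces)

def codons(mrna):
    """Split an mRNA into consecutive triplets, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

# Hypothetical toy transcript: the first exon ends with "AG",
# the next exon starts with "G", so the codon AGG spans the intron.
exon_a = "AUGCUG" + "AG"
intron = "GUAAGUCCCUAACCCCCCAG"
exon_b = "G" + "CUGUAA"
pre_mrna = exon_a + intron + exon_b

mrna = splice(pre_mrna, [(len(exon_a), len(exon_a) + len(intron))])
print(codons(mrna))  # ['AUG', 'CUG', 'AGG', 'CUG', 'UAA']
```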
Tau gene (MAPT)
The Tau protein has six isoforms produced from a single gene through alternative RNA splicing (Figure 5-A-7). They differ in the number of inserts in the N-terminal half and the number of repeats in the C-terminal half. The number of inserts may be 0, 1 or 2, depending on whether exon 2 and/or exon 3 are included during RNA splicing. The number of repeats may be either 3 or 4. The 4-repeat (4R) Tau includes the second repeat, which is encoded by exon 10.
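The count of six isoforms follows from combining three insert states with two repeat states (3 x 2 = 6). A minimal sketch of this enumeration is given below. The labels such as "2N4R" are a common shorthand rather than names used in this chapter, and the mapping assumes that one insert means exon 2 alone and two inserts mean exons 2 and 3 together; the function name `tau_isoforms` is hypothetical.

```python
from itertools import product

def tau_isoforms():
    """Enumerate Tau isoforms as (label, alternatively included exons)."""
    isoforms = []
    for n_inserts, n_repeats in product((0, 1, 2), (3, 4)):
        included = []
        if n_inserts >= 1:
            included.append(2)   # first N-terminal insert (assumed: exon 2)
        if n_inserts == 2:
            included.append(3)   # second N-terminal insert (assumed: exon 3)
        if n_repeats == 4:
            included.append(10)  # second repeat, encoded by exon 10
        isoforms.append((f"{n_inserts}N{n_repeats}R", tuple(included)))
    return isoforms

for label, exons in tau_isoforms():
    print(label, exons)
# 0N3R ()
# 0N4R (10,)
# 1N3R (2,)
# 1N4R (2, 10)
# 2N3R (2, 3)
# 2N4R (2, 3, 10)
```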
The repeat region is the microtubule-binding domain. The 4R Tau binds to, assembles, and stabilizes microtubules more effectively than 3R Tau. In a healthy adult brain, the levels of 4R and 3R Tau proteins are approximately equal. Disruption of this balance can lead to neurodegeneration, as in Alzheimer's disease. The underlying mechanism is explained in this book.
The HIV-1 genome contains nine genes: gag, pol, vif, vpr, vpu, env, nef, rev and tat. Their protein products are all derived from a single primary transcript. This is achieved by three mechanisms: (i) alternative splicing, (ii) leaky scanning of the initiation codon, and (iii) ribosomal frameshifting.
The HIV genome contains several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs. They can be divided into three groups: (I) unspliced, (II) singly spliced, and (III) doubly spliced. As shown in the above figure, the resulting mRNAs (i), (iv), and (vi) are bicistronic (each encoding two proteins): mRNA (i) encodes the gag and pol proteins, mRNA (iv) encodes vpu and env, and mRNA (vi) encodes rev and nef.
Protein synthesis starts at the initiation codon (AUG) and ends at one of three stop codons. In HIV, mRNAs (iv) and (vi) each have two initiation codons, but the first is sometimes skipped (leaky scanning) so that the second protein can be synthesized. mRNA (i) has only one initiation codon; synthesis of its second protein (pol) is due to translational frameshifting.
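Both mechanisms can be mimicked on toy sequences. The sketch below is illustrative only: the sequences are invented, the function names `start_codons` and `read_codons` are hypothetical, and the frameshift is modelled as a single slip of one nucleotide backwards (a -1 shift, assumed here) after a chosen number of codons.

```python
import re

STOP_CODONS = {"UAA", "UAG", "UGA"}

def start_codons(mrna):
    """Return the 0-based positions of every AUG in the mRNA."""
    return [m.start() for m in re.finditer(r"(?=AUG)", mrna)]

def read_codons(mrna, start, slip_after=None, slip=-1):
    """List the codons read from `start` up to and including a stop codon.

    If `slip_after` is set, the reading position moves by `slip` nucleotides
    once, after that many codons have been read.
    """
    read, pos = [], start
    while pos + 3 <= len(mrna):
        codon = mrna[pos:pos + 3]
        read.append(codon)
        if codon in STOP_CODONS:
            break
        pos += 3
        if slip_after is not None and len(read) == slip_after:
            pos += slip  # the ribosome slips into a different reading frame
    return read

# Leaky scanning: two AUGs on one toy mRNA; skipping the first AUG
# lets translation begin at the second one.
bicistronic = "AUGCCCUAAGAUGGGGUGA"
first, second = start_codons(bicistronic)
print(read_codons(bicistronic, first))   # ['AUG', 'CCC', 'UAA']
print(read_codons(bicistronic, second))  # ['AUG', 'GGG', 'UGA']

# Frameshifting: one AUG; a slip after three codons bypasses the first stop.
toy = "AUGAAAUUUUAAGGCUGUAA"
print(read_codons(toy, 0))                # ['AUG', 'AAA', 'UUU', 'UAA']
print(read_codons(toy, 0, slip_after=3))  # ['AUG', 'AAA', 'UUU', 'UUA', 'AGG', 'CUG', 'UAA']
```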