
RNA Splicing

 


RNA splicing is the process that removes introns from a primary transcript and joins the exons together.  An intron usually carries a clear splicing signal (e.g., in the beta globin gene).  In some cases (e.g., the sex lethal gene of the fruit fly), a splicing signal may be masked by a regulatory protein, resulting in alternative splicing.  In rare cases (e.g., HIV genes), a pre-mRNA may contain several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs.

Splicing signal

Most introns start with the sequence GU and end with the sequence AG (in the 5' to 3' direction).  These two sites are referred to as the splice donor and splice acceptor sites, respectively.  However, the sequences at the two sites are not sufficient to signal the presence of an intron.  Another important sequence is the branch site, located 20-50 bases upstream of the acceptor site.  The consensus sequence of the branch site is "CU(A/G)A(C/U)", where the A at the fourth position is conserved in all genes.

In over 60% of cases, the exon sequence is (A/C)AG immediately upstream of the donor site, and G immediately downstream of the acceptor site.


Figure 5-A-4.  The consensus sequence for splicing.  Pu = A or G; Py = C or U. 
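The three signals described above (GU donor, AG acceptor, and a branch site 20-50 bases upstream of the acceptor) can be combined into a simple scan.  The following is only a minimal sketch: the function name, the toy sequence, and the choice to measure the distance from the conserved branch-point A to the acceptor AG are assumptions made for illustration, and real splice-site recognition depends on much more than these motifs.

```python
import re

def find_candidate_introns(rna, min_dist=20, max_dist=50):
    """Return (start, end, branch_a) for each GU...AG span with a branch site.

    start/end are 0-based, end-exclusive indices of the candidate intron;
    branch_a is the index of the conserved A in CU(A/G)A(C/U).
    Assumption: distance is counted from that A to the acceptor AG.
    """
    rna = rna.upper().replace("T", "U")
    donors = [m.start() for m in re.finditer(r"(?=GU)", rna)]
    acceptors = [m.start() for m in re.finditer(r"(?=AG)", rna)]
    # The conserved A is the fourth base of the five-base consensus.
    branch_as = [m.start() + 3 for m in re.finditer(r"(?=CU[AG]A[CU])", rna)]

    candidates = []
    for d in donors:
        for a in acceptors:
            for b in branch_as:
                starts_after_donor = d + 2 <= b - 3
                if starts_after_donor and min_dist <= a - b <= max_dist:
                    candidates.append((d, a + 2, b))
    return candidates

# Hypothetical toy pre-mRNA: exon 1, one intron, exon 2.
pre_mrna = "AUGGCCAAG" + "GU" + "C" * 10 + "CUAAC" + "U" * 25 + "AG" + "GCUUAA"

introns = find_candidate_introns(pre_mrna)
start, end, branch_a = introns[0]
spliced = pre_mrna[:start] + pre_mrna[end:]
print(introns)   # [(9, 53, 24)]
print(spliced)   # AUGGCCAAGGCUUAA
```

In this toy sequence the upstream exon ends with AAG and the downstream exon begins with G, matching the common exon pattern at the two sites.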

Exceptions to the GU-AG rule are discussed in the following review article:

AT-AC Pre-mRNA Splicing Mechanisms and Conservation of Minor Introns in Voltage-Gated Ion Channel Genes - Molecular and Cellular Biology, 1999.

 

Splicing mechanism

The detailed splicing mechanism is quite complex.  In short, it involves five snRNAs and their associated proteins.  These ribonucleoproteins form a large (60S) complex called the spliceosome.  Then, in a two-step enzymatic reaction, the intron is removed and the two neighboring exons are joined together (see Alberts et al.).  The branch point A residue plays a critical role in the enzymatic reaction.


Figure 5-A-5.  Schematic drawing of the formation of the spliceosome during RNA splicing.  U1, U2, U4, U5 and U6 denote snRNAs and their associated proteins.  The U3 snRNA is not involved in RNA splicing, but is involved in the processing of pre-rRNA.


β-globin gene

The β-globin gene is a typical example of splicing.  This gene contains two introns and three exons.  Interestingly, the codon for the 30th amino acid, AGG, is interrupted by an intron.  As a result, the first two nucleotides (AG) are in one exon and the third nucleotide (G) is in the next exon.


Figure 5-A-6.  Expression of the human β-globin gene.  U5 and U3 represent untranslated regions at the 5' and 3' end, respectively.  Note that the mature β-globin protein does not contain the initiating methionine for protein synthesis.
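The split AGG codon can be illustrated with a small calculation: if the coding length of the upstream exon is not a multiple of three, the intron falls inside a codon, and splicing restores that codon.  The sketch below uses a hypothetical helper (split_codon) and short made-up exons, not the real β-globin sequence, so the codon number it reports applies only to the toy data.

```python
def split_codon(exon1_coding, exon2_coding):
    """Return (codon_number, codon) for the codon spanning the exon junction.

    Returns None when the junction falls exactly between two codons.
    codon_number is 1-based, counted from the first codon of exon1_coding.
    """
    # Number of nucleotides of the interrupted codon left in the upstream exon.
    phase = len(exon1_coding) % 3
    if phase == 0:
        return None
    codon = exon1_coding[-phase:] + exon2_coding[:3 - phase]
    codon_number = len(exon1_coding) // 3 + 1
    return codon_number, codon

# Hypothetical toy exons: exon 1 ends with AG, exon 2 begins with G.
exon1 = "AUG" + "GUG" + "AG"
exon2 = "G" + "CUG" + "UAA"

print(split_codon(exon1, exon2))   # (3, 'AGG')
```

Here the upstream exon contributes two nucleotides (AG) and the downstream exon contributes one (G), the same arrangement the text describes for the β-globin codon.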

 

Sex lethal gene

Sexual differentiation in Drosophila (fruit fly) is regulated by a protein called the sex-lethal (sxl) protein.  The female embryo expresses functional sxl protein, whereas the male embryo expresses non-functional sxl protein.  The difference results from alternative splicing, as shown in the following figure.


Figure 5-A-7.  Expression of the Drosophila sex-lethal (sxl) protein.
(a) In the early stage of embryogenesis, the sxl protein is expressed in the female embryo, but not in the male embryo.
(b) In the late female embryo, the sxl protein produced in the early stage may mask the splicing signal for the second intron, resulting in a different protein than in the male embryo.

The gene which encodes the sxl protein contains two promoters, denoted by PL and PE.  PL is active in the late development of both female and male embryos, but PE is active only in the early stage of female embryogenesis. Therefore, in early embryogenesis, the sxl protein is expressed only in the female embryo.

The primary transcript generated by PL consists of four exons separated by three introns.  In the male embryo, the three introns are removed and all four exons are joined together.  The product is a non-functional sxl protein.  In the female embryo, the sxl protein produced at the early stage may bind to the splice acceptor site of the second intron.  As a result, the splicing machinery uses the next acceptor site for splicing.  The third exon is then skipped, producing a functional sxl protein.
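The exon-skipping logic can be sketched as a toy model in which a masked acceptor site simply causes the exon behind it to be left out.  The function name, the exon labels E1 to E4, and the idea of passing masked sites as a set of intron numbers are illustrative assumptions; the model ignores everything about how the sxl protein actually recognizes its binding site.

```python
def splice(exons, masked_acceptors=frozenset()):
    """Join exons in order, skipping any exon whose upstream acceptor is masked.

    exons: exon names in transcript order.
    masked_acceptors: 1-based intron numbers whose acceptor site is blocked.
    Intron k lies between exon k and exon k+1, so its acceptor precedes exon k+1.
    """
    included = [exons[0]]
    for intron_number in range(1, len(exons)):
        if intron_number in masked_acceptors:
            # Acceptor hidden: splicing proceeds to the next available acceptor.
            continue
        included.append(exons[intron_number])
    return included

transcript = ["E1", "E2", "E3", "E4"]

male_mrna = splice(transcript)                           # no masking
female_mrna = splice(transcript, masked_acceptors={2})   # intron 2 acceptor masked

print(male_mrna)     # ['E1', 'E2', 'E3', 'E4']
print(female_mrna)   # ['E1', 'E2', 'E4']
```

Masking the acceptor of the second intron removes only the third exon from the product, which corresponds to the female-specific mRNA described above.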

Exon skipping is also frequently observed when a critical residue in the splicing signal is mutated.


 

HIV-1 genome

The HIV-1 genome contains nine genes: gag, pol, vif, vpr, vpu, env, nef, rev and tat.  Their protein products are all derived from a single primary transcript.  This is achieved by three mechanisms: (i) alternative splicing, (ii) leaky scanning of the initiation codon, and (iii) ribosomal frameshifting.


Figure 5-A-8.  Alternative splicing of the HIV-1 primary transcript.  (i) is unspliced, (ii) to (iv) are singly spliced, and (v) and (vi) are doubly spliced.  The resulting mRNAs (i), (iv) and (vi) are bicistronic.  The star "*" indicates the location of the initiation codon (AUG).

The HIV genome contains several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs.  These can be divided into three groups: (I) unspliced, (II) singly spliced, and (III) doubly spliced.  As shown in the above figure, the resulting mRNAs (i), (iv) and (vi) are bicistronic (each encoding two proteins).  mRNA (i) encodes the gag and pol proteins, mRNA (iv) encodes vpu and env, and mRNA (vi) encodes rev and nef.

Protein synthesis starts from the initiation codon (AUG) and ends with one of three stop codons.  In HIV, mRNAs (iv) and (vi) have two initiation codons, but the first is sometimes skipped so that the second protein can be synthesized.  mRNA (i) has only one initiation codon.  Synthesis of the second protein (pol) is due to translational frameshifting.
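Leaky scanning can be sketched by reading an open reading frame from either the first AUG or, when that one is skipped, the next AUG downstream.  The function name, the skipped_aug parameter, and the toy bicistronic mRNA below are assumptions for illustration only; they are not HIV sequences, and the sketch does not model frameshifting.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def open_reading_frame(mrna, skipped_aug=0):
    """Return the codons read from the chosen AUG up to the first in-frame stop.

    skipped_aug: how many AUGs the scanning ribosome passes over before
    initiating (0 = start at the first AUG, 1 = start at the second).
    """
    start = -1
    for _ in range(skipped_aug + 1):
        start = mrna.find("AUG", start + 1)
        if start == -1:
            return []   # not enough initiation codons in this mRNA
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons

# Hypothetical toy mRNA carrying two short reading frames.
toy_mrna = "GG" + "AUGCCCUAA" + "C" + "AUGGGGUUUUGA"

print(open_reading_frame(toy_mrna))                  # ['AUG', 'CCC']
print(open_reading_frame(toy_mrna, skipped_aug=1))   # ['AUG', 'GGG', 'UUU']
```

Initiating at the first AUG yields the upstream product, while skipping it lets the same mRNA yield the downstream product, which is the behavior described for mRNAs (iv) and (vi).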
