Translation is initiated when the small ribosomal subunit, which scans from the 5’ end of the mRNA, recognises a start codon. Identifying sites of translation initiation is desirable for uncovering alternatively translated reading frames. When ribosomes initiate at noncanonical start codons of open reading frames (ORFs), they produce polypeptides whose sequences deviate from those encoded by the corresponding main ORFs. This diversifies the proteome without expanding the genome and allows several protein isoforms to be expressed from individual transcripts, as seen in the figure above.
Ribosome profiling is a powerful tool for detecting translation start sites and can be adapted for this use with inhibitors that preferentially block the translocation of initiating ribosomes. Harringtonine, for example, captures initiating ribosomes, while depleting other ribosomes by run-off elongation. Lactimidomycin acts more gradually to trap initiating ribosomes. Initiation sites can also be defined by depleting most elongating ribosomes using the drug puromycin or by sequential treatment with lactimidomycin to stabilise initiating ribosomes, followed by puromycin to destabilise elongating and terminating ribosomes.
Below is a series of five papers that exemplify the use of ribosome profiling to identify unique translation initiation sites (TISs) and the impact these sites may have on alternative gene expression.
Ribosome Profiling of Mouse Embryonic Stem Cells Reveals the Complexity and Dynamics of Mammalian Proteomes
Cell, 2011; 147(4), pp.789-802.
Ingolia, N.T., Lareau, L.F. and Weissman, J.S.
Treating eukaryotic cells with harringtonine leads to an accumulation of ribosomes at translation initiation sites. This is because harringtonine binds to free 60S subunits that have not complexed with a 40S subunit to form an 80S ribosome. Elongating ribosomes are therefore unaffected by the addition of harringtonine, but a 60S subunit bound by harringtonine will form an 80S ribosome at a start site and will not proceed further. In this paper, Ingolia et al. used a support vector machine (SVM)-based learning strategy to identify initiation sites from harringtonine-treated ribosome footprint profiles of mouse embryonic stem cells (mESCs). The SVM model was trained on a set of annotated genes to identify features of footprint profiles that distinguish the start codon from other positions.
- The authors identified 13,454 candidate translation start sites within ∼5000 well-expressed genes. Up to 65% of the genes contained more than one detectable site of translation initiation, with 16% containing four or more sites.
- As many as 14% of the initiation sites observed were predicted to produce alternate protein isoforms of known genes. In particular, 570 genes with potential N-terminal extensions and 870 with N-terminal truncations were identified among the 4,994 genes analysed.
- The authors uncovered information regarding protein products arising from alternative splicing, which are often difficult to annotate. For instance, the growth factor Igf2, whose two annotated isoforms carry the same coding region but contain varying 5′ untranslated regions, was found to express an N-terminal extension specific to one isoform only.
- The identified alternative start sites indicated widespread translation of upstream open reading frames (uORFs) initiating at near-cognate codons, while also lending support to multiple short regions of translation on non-coding RNAs, which the authors termed short polycistronic ribosome-associated coding RNAs (sprcRNAs).
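To make the idea of "features of footprint profiles" more concrete, the following is a minimal sketch of how a per-position feature vector could be built from a harringtonine footprint profile before being passed to a classifier such as an SVM. The window size, the zero padding and the normalisation are illustrative assumptions, and the function name `window_features` is hypothetical; this is not the feature set used by Ingolia et al.

```python
from typing import List


def window_features(counts: List[int], pos: int,
                    upstream: int = 3, downstream: int = 3) -> List[float]:
    """Return footprint counts in a window around `pos`, scaled to sum to 1.

    Positions that fall outside the transcript are padded with 0.
    Window size and scaling are assumptions made for illustration only.
    """
    window = []
    for i in range(pos - upstream, pos + downstream + 1):
        window.append(counts[i] if 0 <= i < len(counts) else 0)
    total = sum(window)
    if total == 0:
        # No footprints in the window: return an all-zero vector.
        return [0.0] * len(window)
    return [c / total for c in window]


# Hypothetical profile with a sharp footprint peak at index 3.
profile = [0, 0, 1, 8, 1, 0, 0]
features = window_features(profile, pos=3, upstream=1, downstream=1)
```

A sharp peak such as the one at index 3 gives a vector dominated by its central element, which is the kind of shape a trained classifier could learn to associate with annotated start codons.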
This seminal paper in the field of translatomics was the first to utilise the ribosome profiling technique to target initiating ribosomes, the results of which revealed a hitherto unanticipated complexity to mammalian proteomes. Such findings include the pervasive use of alternate initiation sites that drive the production of extended or truncated isoforms of known proteins as well as the translation of sprcRNAs.
Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution
Proceedings of the National Academy of Sciences, 2012; 109(37), pp.E2424-E2432.
Lee, S., Liu, B., Lee, S., Huang, S.X., Shen, B. and Qian, S.B.
Translation inhibitors cycloheximide (CHX) and lactimidomycin (LTM) bind to the ribosome E-site, resulting in inhibition of ribosomal translocation. While CHX binds to all translating ribosomes, LTM preferentially incorporates into the initiating ribosomes when the E-site is free of tRNA. The authors carried out ribosome profiling using CHX and LTM side by side, allowing initiating ribosomes to be distinguished from elongating ribosomes. This adapted ribosome profiling method, which the authors termed global translation initiation sequencing (GTI-Seq), has the potential to reveal a comprehensive set of TIS codons at nearly single-nucleotide resolution. The authors treated HEK293 cells with DMSO, CHX, or LTM for 30 min before carrying out ribosome profiling to investigate TIS. The same analysis was also carried out using a mouse embryonic fibroblast (MEF) cell line, to investigate cross-species conservation of alternative translation initiators. Finally, the authors experimentally validated by immunoblotting several translational products observed by GTI-Seq to initiate from alternative start codons, further confirming the accuracy of this method.
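The side-by-side comparison of LTM and CHX profiles can be sketched as follows. This is a simplified illustration, assuming per-position read counts for one transcript from each treatment; the normalisation by total reads, the `min_diff` threshold and the function name `call_tis_peaks` are assumptions chosen for the example and do not reproduce the scoring used in the GTI-Seq paper.

```python
from typing import List


def call_tis_peaks(ltm: List[int], chx: List[int],
                   min_diff: float = 0.05) -> List[int]:
    """Return positions where LTM density exceeds CHX density.

    Each profile is scaled by its total read count, the CHX fraction is
    subtracted from the LTM fraction, and a position is reported when the
    difference is a local maximum of at least `min_diff` (an assumed cutoff).
    """
    if len(ltm) != len(chx):
        raise ValueError("profiles must cover the same positions")
    ltm_total = sum(ltm) or 1  # avoid division by zero on empty profiles
    chx_total = sum(chx) or 1
    diff = [l / ltm_total - c / chx_total for l, c in zip(ltm, chx)]

    peaks = []
    for i, d in enumerate(diff):
        left = diff[i - 1] if i > 0 else float("-inf")
        right = diff[i + 1] if i + 1 < len(diff) else float("-inf")
        if d >= min_diff and d > left and d >= right:
            peaks.append(i)
    return peaks


# Hypothetical counts: LTM reads pile up at positions 2 and 6,
# while CHX reads are spread along the transcript.
ltm_counts = [0, 0, 40, 2, 1, 1, 10, 1, 0, 0]
chx_counts = [1, 1, 5, 5, 5, 5, 5, 5, 5, 5]
peaks = call_tis_peaks(ltm_counts, chx_counts)
```

Subtracting the CHX signal is what removes positions that are simply well covered by elongating ribosomes, leaving positions enriched specifically for initiating ribosomes.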
- Over half the TIS codons used AUG as a start codon. Near-cognate codons that differ from AUG by a single nucleotide made up a large proportion of TIS, with CUG accounting for 16%.
- As many as 49.6% of transcripts analysed contained multiple TIS, suggesting that alternative translation prevails even under physiological conditions.
- In contrast to TIS peaks located downstream of the annotated start codon (dTIS), which used AUG as the primary start codon, the majority (74.4%) of TIS peaks located upstream of the annotated start codon (uTIS) were non-AUG codons, with CUG being the most prominent.
- Approximately 4% of non-coding RNAs were found by GTI-Seq to feature short regions of translational activity, with the median length of identified ORFs being 54nt.
- The general features of alternative translation initiation were found to be well conserved between human and mouse cells, even for orthologs with low sequence similarity. MEF cells showed remarkable similarity to HEK293 cells in the prevalence of non-AUG codons, especially CUG, being identified as uTIS, while about half the transcripts in MEF cells also exhibited multiple initiators.
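The terms used in the findings above can be pinned down with a short sketch: near-cognate codons are those differing from AUG at exactly one position, and a TIS is upstream (uTIS) or downstream (dTIS) depending on its position relative to the annotated start codon. The function names and the example calls are hypothetical and are not data from the paper.

```python
from typing import List


def near_cognate_codons(start: str = "AUG") -> List[str]:
    """Return all codons that differ from `start` at exactly one position."""
    bases = "ACGU"
    return sorted(
        start[:i] + b + start[i + 1:]
        for i in range(3)
        for b in bases
        if b != start[i]
    )


def tis_class(pos: int, annotated_start: int) -> str:
    """Label a TIS by its position relative to the annotated start codon."""
    if pos < annotated_start:
        return "uTIS"
    if pos > annotated_start:
        return "dTIS"
    return "annotated"


def non_aug_fraction(codons: List[str]) -> float:
    """Fraction of TIS codons that are not AUG (0.0 for an empty list)."""
    if not codons:
        return 0.0
    return sum(c != "AUG" for c in codons) / len(codons)


near_cognates = near_cognate_codons()
# Hypothetical uTIS codons for one small set of transcripts.
example_utis_codons = ["CUG", "AUG", "GUG", "CUG"]
example_fraction = non_aug_fraction(example_utis_codons)
```

There are nine near-cognate codons in total, which is why a handful of them, such as CUG, can account for a large share of non-AUG initiation.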
This study greatly expanded the list of mRNAs with hidden coding potential not visible by sequence-based in silico analysis and thus aids in identifying sets of candidate genes for intensive study. In addition to allowing for exploration of the mechanisms of TIS selection, the illustration of altered TIS selection under various growth conditions will set the stage for future investigation of translational reprogramming during human diseases and development of organisms.
Retapamulin-Assisted Ribosome Profiling Reveals the Alternative Bacterial Proteome
Molecular Cell, 2019; 74(3), pp.481-493.
Meydan, S., Marks, J., Klepacki, D., Sharma, V., Baranov, P.V., Firth, A.E., Margus, T., Kefi, A., Vázquez-Laslop, N. and Mankin, A.S.
The use of alternative translation initiation sites allows expansion of the cellular proteome by enabling production of more than one protein from a single gene. Although translation of most bacterial genes is initiated at a specific translation initiation site (TIS) known as the primary TIS (pTIS), many genes also have an additional internal TIS (iTIS) where ribosomes that initiate translation at the in-frame iTIS synthesize an alternative, N-terminally truncated polypeptide. Primary and alternative proteins may have similar but specialised functions. The authors of this paper set out to find a way in which ribosome profiling could be adapted for use in bacteria in a way that allowed identification of such iTISs. An inhibitor useful for mapping start sites should fulfil a number of criteria: it should allow assembly of the 70S translation initiation complex at a TIS while also preventing the ribosome from leaving the start codon. Unfortunately, most of the ribosomal antibiotics traditionally viewed as initiation inhibitors do not satisfy these criteria. Although tetracycline (TET), an antibiotic that prevents aminoacyl-tRNAs from entering the ribosomal A site, has been successfully paired with Ribo-Seq in past studies to reveal pTISs of many actively translated genes, it was also found to generate a significant number of footprints from elongating ribosomes. Here the authors show that another antibiotic, retapamulin (RET), predominantly arrests bacterial ribosomes at initiation sites by binding to the ribosome’s peptidyl transferase center and preventing peptide bond formation following 70S assembly at the start codon. They found that combining this treatment with Ribo-Seq, an approach they termed Ribo-RET (Retapamulin-enhanced Ribo-Seq analysis), provides an effective means of selectively mapping TISs across bacterial genomes.
- RET exclusively stalls ribosomes at the start codons of the ORFs. Brief pre-treatment of E. coli cells with RET dramatically rearranges the distribution of ribosomes along the ORFs, confining the ribosomal footprints obtained by Ribo-Seq to the TISs of the genes.
- The authors found that Ribo-RET not only allowed mapping of conventional initiation sites at the beginning of the genes, but also identified 239 and 620 iTIS candidates among E. coli BW25113 and BL21 strains, respectively. Both in-frame and out-of-frame iTISs were identified, with several candidates demonstrated to play functionally important roles.
- Ribo-RET also characterized 6 and 36 uTISs among E. coli BW25113 and BL21 strains, respectively, and identified 41 translation start sites of unannotated short ORFs that were common between the two strains. Ribo-RET revealed 14 unique sites where a start codon immediately precedes a stop codon (start-stops). A start-stop in the yecJ gene was experimentally validated to attenuate translation of the main ORF, indicating a regulatory function of these motifs. Although start-stops have been identified in the 5’ UTRs of some viral and plant genes, operational start-stops had not previously been reported within bacterial genes.
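The start-stop motif described above, a start codon immediately followed by a stop codon, is easy to express as a sequence scan. The sketch below is purely sequence-based and assumes ATG, GTG and TTG as candidate start codons; Ribo-RET itself relies on ribosome footprint evidence, so a scan like this would only nominate candidates. The function name `find_start_stops` is hypothetical.

```python
from typing import List

# Assumed candidate start codons for this illustration.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_start_stops(seq: str) -> List[int]:
    """Return 0-based positions where a start codon is directly followed
    by a stop codon in the same frame (a six-nucleotide start-stop)."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for i in range(len(seq) - 5):
        if seq[i:i + 3] in START_CODONS and seq[i + 3:i + 6] in STOP_CODONS:
            hits.append(i)
    return hits


# Hypothetical sequences, not taken from the E. coli genome.
hits_dna = find_start_stops("CCATGTAAGG")
hits_rna = find_start_stops("GGGUUGUGACC")
```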
The development of Ribo-RET has helped unveil the hidden fraction of the bacterial proteome and offers insights into gene regulatory mechanisms. In particular, this study revealed the presence of internal start codons in over a hundred E. coli genes, dramatically expanding the number of known internal initiation sites in bacteria.
Ribosome profiling in archaea reveals leaderless translation, novel translational initiation sites, and ribosome pausing at single codon resolution
Nucleic Acids Research, 2020; 48(10), pp.5201-5216.
Gelsinger, D.R., Dallon, E., Reddy, R., Mohammad, F., Buskirk, A.R. and DiRuggiero, J.
Archaea are prokaryotes, and their ribosomal subunits (30S/50S) and rRNA genes (16S/23S) are closer in size to their bacterial counterparts. Despite this, Archaea are evolutionarily closer to Eukarya. Archaeal translation mechanisms remain poorly understood, however, primarily due to the difficulty in cultivating these organisms as well as the fact that many of them are extremophiles, making it difficult to characterise them using classical biochemical and molecular methods. Additionally, many different transcriptional units co-exist in Archaea, such as leadered, leaderless, and operonic transcripts, complicating investigation of global translation processes and their regulation. To address the current knowledge gap in this area, the authors of this paper developed a protocol for ribosome profiling in the model halophilic archaeon Haloferax volcanii (Hv) in order to view and understand various aspects of archaeal translation. They also used harringtonine to stall ribosomes at initiation codons and identified putative novel and alternative translation initiation sites genome wide.
- The predominant footprints in Hv were 27 nt in length and were enriched on leadered transcripts, while a subset of small footprints 15-16 nt in length were enriched at initiation sites on leaderless transcripts, providing a means of distinguishing between these two forms of initiation.
- The authors detected 160 novel TIS and classified them as putative alternative translation start sites (paTSS) based on their relationship to the annotated ORF. Of these 160 TISs, 68 were small ORFs <50 codons in length, while 18 were novel unannotated TSS that encode large ORFs in both intergenic regions and antisense to annotated genes.
- The authors discovered hundreds of putative alternative TIS in Hv including upstream of annotated TIS (N-terminal extensions), within annotated ORFs that could produce truncated proteins (internal in-frame) or completely new proteins (internal out-of-frame), and small ORFs.
- The authors observed that many paTSS started with AUG, GUG and UUG, while ∼30% of paTSS started at non-canonical start sites that were near-cognate codons to AUG.
- ORFs featuring N-terminal extensions were found to exhibit higher levels of translational efficiency than their corresponding genes, suggesting that their translated products are of functional significance.
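The categories of alternative TIS listed above (N-terminal extensions, internal in-frame and internal out-of-frame sites) follow from the position and reading frame of a TIS relative to the annotated ORF. The sketch below illustrates that logic under simplifying assumptions: coordinates are 0-based on a single strand, `ann_stop` is exclusive, and an upstream in-frame site is labelled an extension without checking for intervening stop codons. The function name and labels for the remaining cases are hypothetical.

```python
def classify_tis(tis: int, ann_start: int, ann_stop: int) -> str:
    """Classify a TIS relative to an annotated ORF.

    `ann_start` is the first base of the annotated start codon and
    `ann_stop` is the first base after the ORF (exclusive end).
    Simplification: upstream in-frame sites are called extensions
    without checking for an in-frame stop codon in between.
    """
    if tis == ann_start:
        return "annotated"
    in_frame = (tis - ann_start) % 3 == 0
    if tis < ann_start:
        return "N-terminal extension" if in_frame else "upstream out-of-frame"
    if tis < ann_stop:
        return "internal in-frame" if in_frame else "internal out-of-frame"
    return "downstream of ORF"


# Hypothetical ORF spanning positions 100 to 399.
labels = [classify_tis(p, 100, 400) for p in (94, 100, 130, 131)]
```

In practice a pipeline would also confirm that an upstream in-frame site is not separated from the annotated start by a stop codon before calling it an extension.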
This paper presented the first global view of translation in an archaeon and in an extremophile, and provides a framework for adapting this technique to other Archaea. The work offers novel insights into archaeal translation processes and provides an experimental concept for the in vivo study of translation at high salt concentrations. The many alternative proteins discovered in this study, not previously observed in proteomic studies, may play important roles in cell physiology and stress response.
Combinatorial analysis of translation dynamics reveals eIF2 dependence of translation initiation at near-cognate codons
Nucleic Acids Research, 2021; 49(13), pp.7298-7317.
Ichihara, K., Matsumoto, A., Nishida, H., Kito, Y., Shimizu, H., Shichino, Y., Iwasaki, S., Imami, K., Ishihama, Y. and Nakayama, K.I.
During the initiation stage of translation in eukaryotes, the initiator methionyl-tRNA (Met-tRNAi) binds to the 40S ribosomal subunit, which in turn binds to the 5′ end of the mRNA. This complex scans the mRNA until it recognises an initiation codon, at which point the 60S ribosomal subunit joins to form the 80S ribosome. Despite progress in TIS identification, precise detection of TISs remains challenging due to experimental artefacts. Ribo-Seq reveals the dynamics of 80S ribosomes by investigating ribosome footprint sequences. Ribo-Seq combined with treatment of cells with lactimidomycin is referred to as global translation initiation sequencing (GTI-Seq) and is used to identify start codons. Selective Translation Complex Profile sequencing (Sel-TCP-Seq) combines TCP-Seq, which uses formaldehyde to fix 40S and 80S ribosomes in place on transcripts, with immunopurification, in which ribosomes associated with translation factors of interest (specifically eIF3B and eIF3D) are pulled down. It monitors 40S and 80S ribosomes and reveals the dynamics of each translation factor. By combining outputs from these technologies, the authors created an analytical framework, which they called TISCA (TIS detection by translation Complex Analysis), that addresses the limitations of each individual method. TISs were identified from the combination of GTI-Seq and 40S dynamics, followed by frame-fitting on the basis of Ribo-Seq results. The authors then verified TISCA’s functionality and superiority over alternative methods by analysis of proteomics data.
- The authors report that TISCA efficiently removes noise associated with GTI-Seq and thereby allows more accurate identification of TISs and, consequently, correct identification of ORFs. In a comparison of TISCA with existing TIS identification methods (Ribo-TISH, RiboCode, RiboTaper, ORF-RATER and riboHMM), the framework was reported to identify more TISs with higher reliability.
- In addition to detecting their canonical ORFs, TISCA successfully identified both a novel extended ORF for eIF2 alpha kinase 1 (EIF2AK1), with a GUG initiation codon located 54nt upstream of the canonical AUG codon, and a noncanonical truncated ORF for lamin B2 (LMNB2), with an AUG initiation codon located 60nt downstream from the start site of the canonical ORF. This confirmed the ability of TISCA to reliably identify TISs located both upstream and downstream of the canonical TIS.
- The authors performed motif analysis for the near-cognate codons identified by TISCA. The sequences surrounding AUG initiation codons identified by TISCA showed the same pattern as the typical Kozak motif, and the CUG, GUG, UUG and ACG initiation codons also manifested Kozak-like motifs. These results suggested that near-cognate codons also require Kozak-like motifs for efficient translation initiation.
- The authors postulated that translation initiation at near-cognate codons is most likely to be dependent on eIF2. For example, TISCA identified a novel extended ORF with an ACG initiation codon for the heterogeneous nuclear ribonucleoprotein A2/B1 (HNRNPA2B1) gene. The translational efficiency of this ORF was significantly attenuated by knockdown of eIF2α, while no effect was seen on the canonical AUG-initiated ORF.
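The Kozak-like context mentioned above can be illustrated with a simple check of the two positions generally regarded as most influential in the Kozak consensus: a purine at position -3 and a G at position +4, counting the first base of the initiation codon as +1. The three-level labelling and the function name `kozak_context` are assumptions for illustration, and this is not the motif analysis performed in the paper.

```python
def kozak_context(seq: str, start: int) -> str:
    """Grade the sequence context around an initiation codon.

    `start` is the 0-based index of the first base of the codon (+1).
    Position -3 is `start - 3` and position +4 is `start + 3`.
    Returns "strong" if -3 is a purine and +4 is G, "adequate" if only
    one of the two holds, and "weak" otherwise (assumed labels).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    if start < 3 or start + 3 >= len(seq):
        raise ValueError("not enough flanking sequence around the codon")
    minus3_purine = seq[start - 3] in "AG"
    plus4_g = seq[start + 3] == "G"
    if minus3_purine and plus4_g:
        return "strong"
    if minus3_purine or plus4_g:
        return "adequate"
    return "weak"


# Hypothetical contexts around an AUG, an ACG and a CUG codon at index 6.
contexts = [
    kozak_context("GCCACCAUGG", 6),
    kozak_context("GCCGCCACGU", 6),
    kozak_context("GCCUCCCUGA", 6),
]
```

Because the check looks only at the flanking bases, it applies equally to AUG and to near-cognate codons such as CUG, GUG, UUG and ACG.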
The comprehensive identification of near-cognate codons is important to provide insight into the molecular mechanism of noncanonical translation initiation. In the present study, the authors have identified a large number of TISs with high reliability by combining analyses of translation dynamics. The development of such methods will also allow characterisation of changes in TIS usage associated with diseases such as cancer and diabetes, which promises better understanding of disease pathogenesis and may inform the development of new treatments.