uORF detection

Upstream open reading frames (uORFs) are sequences delimited by start and stop codons that lie upstream of the main protein-coding region, in the 5’ untranslated region (UTR). uORFs vary in number per transcript, in length, and in position relative to the 5’ cap and the main start codon. They act as tissue-specific cis-regulatory elements and can inhibit downstream translation in several ways. Some uORFs encode small peptides that directly inhibit translation by interacting with and stalling elongating ribosomes at or near the uORF stop codon. In other cases, initiation at an AUG in the uORF is itself enough to inhibit translation of the downstream protein, without any involvement of the translated peptide. In others still, termination at a uORF stop codon can trigger the nonsense-mediated decay pathway.


The development of ribosome profiling (also known as Ribo-Seq) has enabled the genome-wide characterisation of translated uORFs. In addition to the discovery of canonical (AUG-initiated) uORFs, modified ribosome profiling methods have allowed the identification of many noncanonical uORFs and have helped highlight their prevalence and potential functional importance.


The papers below exemplify the use of translatomics tools to identify translated uORFs in the 5’ UTRs of various genes.

Translation of 5′ leaders is pervasive in genes resistant to eIF2 repression

eLife, 2015, 4, p.e03971.

Andreev, D.E., O’Connor, P.B., Fahey, C., Kenny, E.M., Terenin, I.M., Dmitriev, S.E., Cormican, P., Morris, D.W., Shatsky, I.N. and Baranov, P.V.

Translation requires the initiation factor eIF2, which functions in translation initiation at start codons. Eukaryotic cells respond to various stress conditions by activating a kinase that recognises and phosphorylates the α subunit of eukaryotic initiation factor 2 (eIF2α). As a result, protein synthesis is reduced. However, the persistent translation of certain mRNAs is required to mount an adequate stress response, and this is where uORFs can play a functional role. The global downregulation of protein synthesis triggered by eIF2 inactivation has two main purposes: to conserve cellular resources, and to provide a delay in which to evaluate the severity of the damage and, depending on its level, reprogram gene expression towards either apoptosis or a pro-survival repair response. This necessarily requires activation of genes involved in the integrated stress response (ISR). The authors of this study treated human HEK293T cells with 40 µM sodium arsenite for 30 minutes to induce stress and then carried out ribosome profiling on these cells. They also identified the most probable translation initiation sites by manually examining the Ribo-Seq profiles of eight of the translationally resistant genes (PPP1R15A, IFRD1, SLC35A4, C19ORF48, PTP4A1, PCNXL4, UCP2, PPP1R15B).

Key Findings

  • The arsenite treatment was found to significantly alter the transcript levels of 24 genes.
  • There was a 5.4-fold repression in overall translation, but the protein coding open reading frames (ORFs) of certain individual mRNAs showed resistance to the inhibition. IFRD1 has only a single AUG in its 5′ leader while SLC35A4 has 11. These mRNAs have at least one efficiently translated uORF that represses translation of the main coding ORF under normal conditions.
  • Under normal conditions the translation of mRNAs bearing the 5′ leaders of IFRD1, PPP1R15B, UCP2, and PTP4A1 was about sevenfold lower than that of the control mRNA with the simple non-specific leader (pGL3).
  • Substitution of the IFRD1 uORF AUG with the AUA codon increased the reporter expression eight-fold but made the translation susceptible to eIF2 inhibition.
  • Translation of only one uORF is sufficient to provide resistance to eIF2 inhibition for IFRD1 and PPP1R15B.
  • The authors also investigated the nucleotide context surrounding uORF start codons (mostly AUG but also CUG) in resistant mRNAs but found no evidence for selection for a particular context.
  • The addition of a 5’ single sequence to pGL3 and IFRD1 did not affect stress resistance but the addition of a 5′ terminal stem loop to the 5′ leaders of the uORFs of these two reporters resulted in a threefold to fourfold decrease in the activity of both reporters under normal conditions. This suggests that the nucleotide sequence in the 5’ leader may be important for rapid ribosome loading.

Implications

This work expands the list of mRNAs known to be persistently translated under conditions of eIF2 phosphorylation. The analysis of ribosome densities on mRNAs resistant to eIF2 phosphorylation underlines the vital role of uORF translation in providing that resistance. Further work is now needed to explore the mechanisms by which translation of these uORFs allows mRNAs to resist the stress. Some of the findings also offer insight into why translation of many mammalian mRNAs possessing uORFs is not resistant to eIF2 phosphorylation, while that of others is.

Translation from the 5′ untranslated region shapes the integrated stress response

Science, 2016; 351(6272).

Starck, S.R., Tsai, J.C., Chen, K., Shodiya, M., Wang, L., Yahiro, K., Martins-Green, M., Shastri, N. and Walter, P.

eIF2 phosphorylation is a key event in the response of cells to various stresses. Ribosome-profiling experiments have revealed that mRNAs encoding stress-response proteins tend to have a high abundance of uORFs. Yet, despite these findings, very few uORF peptides have been identified by mass spectrometry. The authors addressed this limitation of proteomic approaches by exploiting the sensitivity and specificity of T cells to detect such translation products, using a technique termed tracing translation by T cells (3T). 3T involves insertion of a tracer peptide coding sequence into a candidate DNA sequence. The resulting mRNAs are translated to produce tracer peptides, which are loaded onto major histocompatibility complex class I (MHC I) molecules and transit to the cell surface, where they can be detected by specific T cell hybridomas. The authors examined ribosome profiling data for a variety of chaperones to identify non-AUG uORFs in their 5′ UTRs. They then used 3T to measure uORF expression at multiple distinct regions of the 5′ UTR, while simultaneously detecting expression of the CDS.

Key Findings

  • 3T proved to be a sensitive and robust indicator of uORF expression.
  • UUG- and CUG-initiated uORFs predicted from ribosome profiling of human chaperone mRNAs generate MHC I peptides predicted to be HLA epitopes that are active in adaptive immunity. The authors propose that this presents an extracellular signature during the ISR.
  • uORF peptide expression from ATF4 mRNA was measured showing that its translation persists during the ISR.
  • BiP, an ER chaperone stably synthesised during the ISR, has a 5′ UTR with uORFs that are exclusively initiated by UUG and CUG start codons.
  • BiP uORF expression was dependent on the alternative initiation factor eIF2A and translation of the UUG-initiated uORF during the ISR.
  • Cells utilise a distinct eIF2A-mediated initiation pathway, which includes uORF translation, to sustain expression of particular proteins during the ISR.

Implications

The discrepancy between the number of translated uORFs predicted by ribosome profiling and the number of translated uORFs identified by mass spectrometry can be attributed to the challenges encountered by extensive mass spectrometry analysis of HLA-bound peptides, namely proteolysis and the short lengths of uORF peptides. 3T offers a valuable solution as it is complementary to ribosome profiling and mass spectrometry in uORF discovery. Ribosome profiling along with 3T can characterise the importance of uORF biology for regulation, and generate fundamental insights into uORF mutation-based diseases.

Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish

Nature Communications, 2016, 7(1), pp.1-10.

Chew, G.L., Pauli, A. and Schier, A.F.

uORFs are present in 50% of human and mouse mRNAs and in ∼65% of zebrafish mRNAs, and many of these uORFs are translated, as confirmed by ribosome profiling and mass spectrometry. Ribosome profiling data have yielded considerable knowledge about uORFs and their roles in individual genes. However, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored, and whether the repressive effects of uORFs are conserved across species is largely unknown. The authors of this paper explored these questions by analysing three independently generated ribosome profiling data sets from human, mouse and zebrafish. With these data the authors determined how various transcript features modulate uORF repressiveness and coding sequence (CDS) translation efficiency (TE). uORFs were defined as ATG-Stop delimited sequences beginning upstream of the CDS start.
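The uORF definition used here (an ATG-to-stop sequence beginning upstream of the CDS start) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name find_uorfs, the 0-based coordinates and the overlaps_cds flag are assumptions introduced for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, cds_start):
    """List ATG-to-stop ORFs whose ATG begins upstream of the CDS start.

    transcript: mRNA or cDNA sequence (U is treated as T).
    cds_start:  0-based index of the first base of the main start codon.
    Returns (start, end, overlaps_cds) tuples; `end` is exclusive and
    includes the stop codon.
    """
    seq = transcript.upper().replace("U", "T")
    uorfs = []
    # Only consider ATGs whose first base lies upstream of the CDS start.
    for start in range(min(cds_start, len(seq) - 2)):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the ATG's frame until the first stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                uorfs.append((start, end, end > cds_start))
                break
        # ATGs with no in-frame stop are skipped (not ATG-Stop delimited).
    return uorfs


# One uORF (ATG AAA TAA) upstream of a main ORF that starts at index 15.
print(find_uorfs("GGATGAAATAAGGCCATGGCCTAA", 15))
```

Note that under this simple rule an upstream ATG in frame with the CDS and without an intervening stop is reported as an overlapping ORF that ends at the CDS stop codon; a real analysis would probably want to handle such N-terminal extensions separately.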

Key Findings

  • In contrast to CDSs, uORFs do not have distinct initiation sequence contexts that promote their translation and uORF initiation contexts have generally been under selective pressure to be weakly translated.
  • Vertebrate uORFs tend to have features associated with weak repressiveness such as initiation sequences and secondary structures that are unfavourable for their translation, and they are depleted from regions closest to the CDS where they would be most repressive.
  • uORF sequence features contribute less than 5′ leader and CDS sequence features towards specifying CDS TE.
  • The repressiveness and sequence features of uORFs and 5′ leaders are broadly conserved and may thus contribute to the conservation of CDS TE.
  • More conserved sequence features tend to have stronger contributions towards specifying CDS TE.

Implications

This paper provided a global view of uORF-mediated translational repression. In addition, the authors’ approach allowed the characterisation of existing endogenous variation amongst uORFs. The authors found that while uORF sequence features generally disfavour uORF translation, uORFs still contribute significantly and negatively towards CDS TE at a genome-wide scale. The methods of analysing the effect of various transcript sequence features on downstream translation may be used to explore the impact of non-coding sequence variation within transcripts and provide a molecular framework to understanding their effects on gene expression and physiology.
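CDS translation efficiency is commonly estimated as the ratio of ribosome footprint density to mRNA abundance over the same region. The sketch below shows that calculation on a log2 scale. The function name, the counts-per-million normalisation and the pseudocount are illustrative assumptions, not the procedure used in the paper.

```python
import math


def log2_translation_efficiency(rpf_count, rna_count,
                                rpf_library_size, rna_library_size,
                                pseudocount=0.5):
    """log2 of (footprint counts per million) / (RNA-seq counts per million).

    Both counts are assumed to come from the same CDS region, so the
    region length cancels out of the ratio. The pseudocount (an assumed
    value) avoids division by zero for unexpressed transcripts.
    """
    rpf_cpm = (rpf_count + pseudocount) / rpf_library_size * 1e6
    rna_cpm = (rna_count + pseudocount) / rna_library_size * 1e6
    return math.log2(rpf_cpm / rna_cpm)


# Twice as many footprints as RNA-seq reads in equally deep libraries
# corresponds to a log2 TE of about 1.
te = log2_translation_efficiency(200, 100, 1_000_000, 1_000_000, pseudocount=0)
print(round(te, 3))
```

Comparing this value between transcripts with and without translated uORFs is one simple way to look at uORF repressiveness, although a genome-wide analysis would also need to account for the other transcript features discussed above.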

Polysome profiling links translational control to the radioresponse of glioblastoma stem-like cells

Cancer Research, 2016, 76(10), pp.3078-3087.

Wahba, A., Rath, B.H., Bisht, K., Camphausen, K. and Tofilon, P.J.

Glioblastoma (GBM) is a cancer of the glial cells in the brain and is the most aggressive of all intracranial tumours. Radiotherapy is a highly cost-effective treatment for glioblastoma, but the DNA damage induced by the radiation can trigger a signalling cascade that mediates radioresistance, a major cause of treatment failure in patients with GBM. The aim of this paper was to better understand the effect of ionising radiation (IR) on the human glioblastoma translatome using polysome profiling of a set of human glioblastoma stem-like cell (GSC) lines. Initially the authors conducted this study on established glioma cell lines 6 hours after exposure to 7 Gy. They then carried out the investigation on three GSC lines (NSC11, 0923, and GBMJ1) and collected polysome-bound mRNA 1-6 hours after exposure to 2 Gy. To determine whether the IR-induced changes observed were of biological significance, the authors then examined the affected genes in terms of cellular processes and pathways.

Key Findings

  • Radiation primarily modifies gene expression via translational control.
  • DNA repair and cell cycle checkpoint regulation were among pathways that were upregulated after IR exposure.
  • Cellular processes not traditionally associated with the radioresponse, such as eIF4E and mTOR activation, were induced by exposure to IR. This suggests that cap-dependent translation is increased after exposure of GSCs to IR.
  • Mitochondrial response to IR has a high cell line specificity.

Implications

These data show that IR-induced translational control plays a significant role in the cellular response to IR in glioblastoma. This response influences cell survival and thus plays a role in radioresistance in this tumour type. A better understanding of how translational control of these genes responds to IR exposure may offer a target for glioblastoma radiosensitisation, which could reduce relapse in patients treated for GBM.

Disrupting upstream translation in mRNAs is associated with human disease

Nature Communications, 2021, 12(1), pp.1-14.

Lee, D.S., Park, J., Kromer, A., Baras, A., Rader, D.J., Ritchie, M.D., Ghanem, L.R. and Barash, Y.

It has been shown that genetic variants that create new uORFs are rare, suggesting that these variants are subject to strong negative selection. It has also been shown that variants destroying stop codons in translated uORFs are under strong negative selection. However, much less is known about the impact of genetic variation within translated uORFs themselves. Ribosome-profiling experiments have provided evidence of the active translation of thousands of uORFs throughout the genome, but the biological significance of this remains uncertain. Using translated uORFs from 4392 genes identified by deep ribosome profiling of two human cell lines, the authors mapped genetic variation from 71,702 whole-genome sequences in gnomAD. The authors also used the Penn Medicine Biobank (PMBB) to discover new disease-gene associations using uORF stop-creating and stop-strengthening variants, and then replicated these associations in the UK Biobank (UKB) and in gene burden tests that aggregated rare protein-coding loss-of-function variants. Finally, they investigated the functional importance of translation at uORFs and explored possible phenotypic consequences associated with genetic variation in these sequences.

Note: genetic variants that introduce new stop codons in translated uORFs were termed upstream termination codons (UTCs) to distinguish them from premature termination codons within protein-coding sequences. UTCs could impede downstream translation initiation without affecting the CDS directly.
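To make the UTC definition concrete, the sketch below checks whether a single-nucleotide substitution inside a uORF converts a sense codon into a stop codon. It is a simplified illustration under stated assumptions: the function name creates_utc and its arguments are hypothetical, the uORF is supplied in frame from its start codon through its existing stop codon, and stop-strengthening variants are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def creates_utc(uorf_seq, offset, alt_base):
    """Return True if substituting alt_base at `offset` creates a new stop.

    uorf_seq: uORF sequence from its start codon through its stop codon,
              in frame (U is treated as T).
    offset:   0-based position of the variant within uorf_seq.
    """
    seq = uorf_seq.upper().replace("U", "T")
    alt = alt_base.upper().replace("U", "T")
    if len(seq) % 3 != 0 or seq[-3:] not in STOP_CODONS:
        raise ValueError("uORF must be in frame and end with a stop codon")
    if alt not in {"A", "C", "G", "T"}:
        raise ValueError("alt_base must be a single nucleotide")
    # Variants in the existing stop codon (or outside the uORF) cannot
    # introduce an upstream termination codon.
    if not 0 <= offset < len(seq) - 3:
        return False
    codon_start = offset - offset % 3
    ref_codon = seq[codon_start:codon_start + 3]
    pos = offset % 3
    alt_codon = ref_codon[:pos] + alt + ref_codon[pos + 1:]
    return ref_codon not in STOP_CODONS and alt_codon in STOP_CODONS


# In ATG CAA TAA, a C>T change at offset 3 turns CAA into TAA.
print(creates_utc("ATGCAATAA", 3, "T"))
```

Applying a check of this kind to every possible substitution in a set of translated uORFs gives the positions at which a UTC could arise, which is the sort of site the findings below refer to.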

Key Findings

  • Two categories of variants in 5′UTRs that are capable of causing reduced translation of downstream coding genes were identified. These variants either introduce UTCs in uORFs or strengthen uORF stop sites.
  • Variants introducing new stop codons in uORFs are under strong negative selection.
  • Introducing UTCs and stop-strengthening variants in translated uORFs decreases protein expression of downstream genes in reporter assays, suggesting that uORF UTC and stop-strengthening variants might be associated with disease in humans.
  • Of the 4392 genes with translated uORFs used for this analysis, 1121 were also annotated as having pathogenic CDS variants in ClinVar, suggesting that UTC and stop-strengthening variants in these genes may have additional utility for the diagnosis of rare disease.
  • Translated uORFs use weak stop codons.
  • Genomic positions that can create new stop codons in uORFs are conserved.
  • uORF start codons are conserved and under strong selective pressure.
  • A novel association was found between stop-strengthening and predicted loss-of-function (pLOF) variants in PMVK and diabetes, which strengthens existing evidence linking the mevalonate pathway to diabetes. These data are the first to establish a putative link between PMVK and diabetes.
  • SNVs introducing new stop codons, or creating stronger translation termination signals in uORFs are under strong selective constraints within 5′UTRs.

Implications

The approach taken by this paper demonstrates a powerful combination of data, in this case population-scale databases of human genetic variation integrated with cellular-scale -omics data, to identify patterns in how variation impacts regulatory elements. These data demonstrate that genetic variants in translated uORFs that create new stop codons or strengthen existing ones can contribute to disease pathology by changing protein expression.
