Research
Unbiased Sequencing of RNA Modifications and Human RNome Project
An RNA sequence and its diverse modifications together constitute the complete informational content of RNA. Defects in RNA modifications account for over 100 human diseases, affecting millions of Americans, including those with cancers, diabetes, and Alzheimer’s and Parkinson’s diseases. Despite this significance, our understanding of RNA sequence diversity remains limited. Current sequencing technologies offer partial insights but fail to capture the full spectrum of RNA sequence variants. We do not know exactly how many unique RNA molecules or sequence variants are present in a given sample, nor do we know the complete sequence content of each RNA, including the identity and location of every nucleotide (canonical or modified) within a full-length RNA.
Our lab aims to develop advanced methods to decode complete RNA sequence information by creating a novel RNA sequencing platform with three unprecedented capabilities: 1) exhaustive sequencing of every RNA sequence without omission (targeting all RNA molecules), 2) unbiased sequencing of all RNA modifications (targeting all RNA nucleotides, modified or not), and 3) global profiling of RNA and its modifications in human diseases (targeting all RNA and modification changes). These capabilities have the potential to reveal the complete sequence information of RNA molecules for the first time, laying the foundation for the world’s first Human Epitranscriptome Project. This initiative aims to draft the first complete sequence of all human RNA molecules and their modifications, an effort poised to be as significant as the Human Genome Project, which was completed in 2003.
Read more:
1. Yuan X, Su Y, Johnson B, Kirchner M, Zhang X, Xu S, Jiang S, Wu J, Shi S, Russo JJ, Chen Q, Zhang S*. Mass Spectrometry-Based Direct Sequencing of tRNAs De Novo and Quantitative Mapping of Multiple RNA Modifications. J Am Chem Soc., 2024, DOI: 10.1021/jacs.4c07280. https://pubs.acs.org/doi/10.1021/jacs.4c07280
2. National Academies of Sciences, Engineering, and Medicine. 2024. Charting a Future for Sequencing RNA and Its Modifications: A New Era for Biology and Medicine. Washington, DC: The National Academies Press. https://doi.org/10.17226/27165
De Novo and Direct Sequencing of Modified RNA
Mass spectrometry (MS) is an essential tool for studying protein modifications, where peptide fragmentation produces “ladders” that reveal the identity and position of modifications. A similar approach is not yet feasible for RNA, because no in situ fragmentation technique provides satisfactory sequence coverage. Aberrant RNA nucleobase modifications, especially methylations and pseudouridylations, have been correlated with the development of major diseases such as breast cancer, type 2 diabetes, and obesity, each of which affects millions of Americans. Despite their significance, the available tools to reliably identify, locate, and quantify nucleobase modifications in RNA are very limited. As a result, we know the function of only a few of the more than 100 RNA modifications that have been identified.
One way to circumvent this issue is to chemically degrade the RNA beforehand so that well-defined mass ladders form before the sample enters the spectrometer. However, the ladder fragments generated by this degradation step are not structurally uniform, which complicates downstream data analysis. We have spearheaded the development of a two-dimensional LC/MS-based de novo RNA sequencing tool that takes advantage of predictable regularities in the liquid chromatographic separation of optimized RNA digests to greatly simplify the interpretation of complex MS data. To sequence an RNA, it is first chemically degraded into a series of short fragments (a sequence ladder). Comparing the mass difference between adjacent ladder fragments with the known masses of nucleoside monophosphates (canonical or chemically modified) then identifies each nucleotide in turn, allowing simple sequence determination by LC-MS.
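The ladder-reading step described above can be illustrated with a minimal Python sketch. It is not the lab's actual analysis pipeline: the function names (`read_ladder`, `simulate_ladder`), the starting fragment mass, and the 0.01 Da tolerance are hypothetical choices made for illustration. The residue masses are approximate monoisotopic values for nucleoside monophosphate units within an RNA chain. The sketch assumes an idealized ladder in which each fragment is exactly one nucleotide longer than the previous one.

```python
# Approximate monoisotopic masses (Da) of nucleotide residues inside an RNA chain
# (nucleoside monophosphate minus water). Illustrative values, not a curated table.
RESIDUE_MASS = {
    "A": 329.0525,
    "C": 305.0413,
    "G": 345.0474,
    "U": 306.0253,
    "m5U": 320.0410,  # U plus one methyl group (+14.0157 Da)
}


def read_ladder(ladder_masses, tol=0.01):
    """Infer residues from mass differences between adjacent ladder fragments.

    `tol` (Da) is an assumed matching tolerance; it must stay well below the
    ~0.98 Da gap between C and U for the calls to be unambiguous.
    """
    masses = sorted(ladder_masses)
    calls = []
    for lighter, heavier in zip(masses, masses[1:]):
        delta = heavier - lighter
        # Pick the known residue whose mass is closest to the observed step.
        name, ref = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
        calls.append(name if abs(ref - delta) <= tol else "?")
    return calls


def simulate_ladder(residues, start_mass=500.0):
    """Build an idealized ladder; `start_mass` is an arbitrary placeholder
    for the mass of the shortest fragment (end groups, tags, etc.)."""
    ladder = [start_mass]
    for residue in residues:
        ladder.append(ladder[-1] + RESIDUE_MASS[residue])
    return ladder


# Usage: recover a short hypothetical sequence from its simulated ladder.
ladder = simulate_ladder(["G", "A", "m5U", "C", "U"])
print(read_ladder(ladder))
```

Mass differences alone cannot distinguish isomeric modifications: pseudouridine, for example, has the same mass as uridine, so a lookup like this would call it "U". Any step that matches no entry in the table is reported as "?" rather than forced onto the nearest residue.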
Read more:
- Zhang N, Shi S, Wang X, Ni W, Yuan X, Duan J, Jia TZ, Yoo B, Ziegler A, Russo JJ, Li W, Zhang S*. Direct sequencing of tRNA by 2D-HELS-AA MS Seq reveals its different isoforms and multiple dynamic base modifications. ACS Chemical Biology, 2020, 15(6):1464-1472.
- Zhang N, Shi S, Jia TZ, Ziegler A, Yoo B, Yuan X, Li W, Zhang S*. A General LC-MS-based RNA Sequencing Method for Direct Analysis of Multiple-base Modifications in RNA Mixture. Nucleic Acids Research, 2019, gkz731, https://doi.org/10.1093/nar/gkz731.
- Björkbom A#, Lelyveld VS#, Zhang S#, Zhang W, Tam CP, Blain JC, Szostak JW. Bidirectional direct sequencing of noncanonical RNA by two-dimensional analysis of mass chromatograms. J Am Chem Soc. 2015, 137 (45): 14430-14438.
Regulation of Gene Expression by DNA/RNA Modifications
RNA modifications are functionally significant and play important roles in biological processes and diseases in vertebrates. Although more than 100 RNA modifications have been identified so far, the functions of only a few are known. This is mainly due to technological limitations and to the complexity of gene regulation in most living organisms. To probe this modification-activity relationship, we use a simple system based on non-enzymatic template-directed primer extension for preliminary screening of various classes of RNA modifications. Using this experimental system, we demonstrated that 5-methylation and 2-thiolation of uracil significantly increase both the rate and fidelity of the non-enzymatic copying of native DNA and RNA as well as their non-canonical counterparts. We continue to use this experimental setup to probe the functional advantages that various classes of RNA modifications can confer. The interesting properties of the chemical modifications observed in our template-directed primer extension reactions prompt us to study their impact on gene expression using an in vitro transcription/translation system. We will systematically examine how each individual modification, as well as combinations of different modifications, influences the expression of genes such as those encoding DNA polymerases and reverse transcriptases. Specifically, we plan to introduce nucleobase modifications into the DNA, mRNA, and tRNAs involved in the synthesis of DNA polymerase I and to observe the consequences for both polymerase expression levels and enzymatic properties, such as the rate and fidelity of the DNA polymerase reaction. Modifications that lead to significant changes in any of these parameters will then be introduced into cells, e.g., with well-established mRNA transfection protocols.
The success of this modality will have profound implications for controlling the expression levels of various proteins, such as Adenosine Deaminase Acting on RNA (ADAR), which plays important roles in neurodegenerative diseases such as Alzheimer’s and Parkinson’s diseases.
Read more:
- Fast and accurate nonenzymatic copying of an RNA-like synthetic genetic polymer. Proc Natl Acad Sci USA. 2013, 110 (44): 17732-17737.
- Synthesis of N3′-P5′-linked phosphoramidate DNA by nonenzymatic template-directed primer extension. J Am Chem Soc. 2013, 135 (2): 924-932.
Correlation between RNA Modifications and Their Functions
It is well accepted that RNA can catalyze a range of cellular processes. Wochner et al. demonstrated that a ribozyme can serve as a simple RNA polymerase and catalyze the transcription of an active ribozyme (1). Unfortunately, its catalytic efficiency is not comparable to that of protein polymerases, and the fidelity of the process is hampered by G:U wobble base pairing. Encouragingly, our recent studies show that non-enzymatic primer extension occurs significantly faster on oligo-ribo-T templates than on oligo-ribo-U templates (2). In addition, thiolation of the 2-carbonyl group of uracil reduces mismatches arising from G:U and A:C base pairing (3). We believe that methylation and thiolation of nucleobases, as well as other biologically relevant DNA/RNA modifications, can enhance the rate and fidelity of such small RNA polymerase ribozymes. We plan to systematically and surgically introduce various naturally occurring and biologically relevant epigenetic modifications into different regions of small ribozyme systems, and to methodically study how these modifications affect ribozyme function. With such information, we can evolve small but powerful ribozymes that perform specific cellular functions and rival protein machineries. However, concise and efficient preparation of many complex modified nucleotides, such as 5-methylaminomethyl-2-thiouridine (mnm5s2U), 5-taurinomethyl-2-thiouridine (τm5s2U), queuosine (Q), wybutosine (yW), and their respective glycosylated analogs, remains a significant challenge. Chemical or enzymatic incorporation of these non-canonical nucleotides into RNA requires the installation of their phosphoramidites or activated triphosphates on the 5’-position of ribose; currently, synthetic methods that allow facile syntheses of these activated non-canonical ribonucleotide monomers do not exist.
With our expertise in nucleotide synthetic chemistry, we plan to develop robust and general synthetic methodologies; this will not only drive the advancement of synthetic chemical science, but will also allow bulk preparation of non-canonical ribonucleotides for many downstream studies in epigenetic RNA regulation. With convenient access to the broadened nucleotide chemical space, we plan to study how these complex modifications affect the recognition and processing of both coding and non-coding RNA in gene expression regulation.
Read more:
1. Ribozyme-catalyzed transcription of an active ribozyme. Science. 2011, 332:209-212.
2. Synthesis of N3′-P5′-linked phosphoramidate DNA by nonenzymatic template-directed primer extension. J Am Chem Soc. 2013, 135 (2): 924-932.
3. Fast and accurate nonenzymatic copying of an RNA-like synthetic genetic polymer. Proc Natl Acad Sci USA. 2013, 110 (44): 17732-17737.