<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article"><?properties open_access?><?properties manuscript?><front><journal-meta><journal-id journal-id-type="nlm-journal-id">9604648</journal-id><journal-id journal-id-type="pubmed-jr-id">20305</journal-id><journal-id journal-id-type="nlm-ta">Nat Biotechnol</journal-id><journal-id journal-id-type="iso-abbrev">Nat. Biotechnol.</journal-id><journal-title-group><journal-title>Nature biotechnology</journal-title></journal-title-group><issn pub-type="ppub">1087-0156</issn><issn pub-type="epub">1546-1696</issn></journal-meta><article-meta><article-id pub-id-type="pmid">26237516</article-id><article-id pub-id-type="pmc">4564351</article-id><article-id pub-id-type="doi">10.1038/nbt.3289</article-id><article-id pub-id-type="manuscript">NIHMS700917</article-id><article-categories><subj-group subj-group-type="heading"><subject>Article</subject></subj-group></article-categories><title-group><article-title>High-throughput determination of RNA structure by proximity ligation</article-title></title-group><contrib-group><contrib contrib-type="author"><name><surname>Ramani</surname><given-names>Vijay</given-names></name><xref ref-type="aff" rid="A1">1</xref></contrib><contrib contrib-type="author"><name><surname>Qiu</surname><given-names>Ruolan</given-names></name><xref ref-type="aff" rid="A1">1</xref></contrib><contrib contrib-type="author"><name><surname>Shendure</surname><given-names>Jay</given-names></name><xref ref-type="aff" rid="A1">1</xref></contrib></contrib-group><aff id="A1"><label>1</label> Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.</aff><pub-date pub-type="nihms-submitted"><day>26</day><month>6</month><year>2015</year></pub-date><pub-date pub-type="epub"><day>03</day><month>8</month><year>2015</year></pub-date><pub-date pub-type="ppub"><month>9</month><year>2015</year></pub-date><pub-date 
pub-type="pmc-release"><day>01</day><month>3</month><year>2016</year></pub-date><volume>33</volume><issue>9</issue><fpage>980</fpage><lpage>984</lpage><!--elocation-id from pubmed: 10.1038/nbt.3289--><permissions><license xlink:href="http://www.nature.com/authors/editorial_policies/license.html#terms"><license-p>Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use:<ext-link ext-link-type="uri" xlink:href="http://www.nature.com/authors/editorial_policies/license.html#terms">http://www.nature.com/authors/editorial_policies/license.html#terms</ext-link></license-p></license></permissions><abstract><p id="P1">We present an unbiased method to globally resolve RNA structures through pairwise contact measurements between interacting regions. RNA Proximity Ligation (RPL) uses proximity ligation of native RNA followed by deep sequencing to yield chimeric reads with ligation junctions in the vicinity of structurally proximate bases. We apply RPL in both baker's yeast (<italic>Saccharomyces cerevisiae</italic>) and human cells and generate contact probability maps for ribosomal and other abundant RNAs, including yeast snoRNAs, the RNA subunit of the signal recognition particle, and the yeast U2 spliceosomal RNA homolog. RPL measurements correlate with established secondary structures for these RNA molecules, including stem-loop structures and long-range pseudoknots. We anticipate that RPL will complement the current repertoire of computational and experimental approaches in enabling the high-throughput determination of secondary and tertiary RNA structures.</p></abstract></article-meta></front><body><p id="P2">The folding of RNA species into complex secondary and tertiary structures is central to RNA's catalytic, regulatory, and information-carrying roles <sup><xref rid="R1" ref-type="bibr">1</xref></sup>. 
Pioneering approaches for elucidating RNA structure&#x02014;including crystallography<sup><xref rid="R2" ref-type="bibr">2</xref></sup>, electron microscopy<sup><xref rid="R3" ref-type="bibr">3</xref></sup>, and spectroscopy<sup><xref rid="R4" ref-type="bibr">4</xref></sup>&#x02014;are technically complex and difficult to scale, motivating the development of computational algorithms for RNA structure prediction<sup><xref rid="R5" ref-type="bibr">5</xref>&#x02013;<xref rid="R7" ref-type="bibr">7</xref></sup>. Current algorithms have limited predictive power, particularly for long-range interactions such as pseudoknots (secondary structures involving intercalated stem loops). With the advent of massively parallel sequencing<sup><xref rid="R8" ref-type="bibr">8</xref></sup>, less laborious experimental techniques have been developed for the global interrogation of RNA secondary structures. These include methods relying on structure-specific chemical modifications<sup><xref rid="R9" ref-type="bibr">9</xref>&#x02013;<xref rid="R11" ref-type="bibr">11</xref></sup>, such as DMS-seq and SHAPE-seq, as well as methods involving digestion with structure-specific RNases<sup><xref rid="R12" ref-type="bibr">12</xref>&#x02013;<xref rid="R14" ref-type="bibr">14</xref></sup>, such as PARS-seq and Frag-seq. Although these methods probe the extent to which individual bases participate in secondary structures, they do not directly query which specific pairs of bases or regions interact to form these structures. To address this, recent efforts have combined systematic mutagenesis and structure-specific probing to generate pairwise information for inferring RNA folds<sup><xref rid="R15" ref-type="bibr">15</xref>,<xref rid="R16" ref-type="bibr">16</xref></sup>. 
However, despite considerable progress, the high-throughput determination of RNA secondary and tertiary structures remains a challenging problem.</p><p id="P3">Here we show that proximity ligation is a straightforward means of generating global pairwise data about RNA secondary and tertiary structure. Proximity ligation records the physical proximity of two nucleic acid termini through their ligation, and has been applied to detect DNA aptamer-bound proteins<sup><xref rid="R17" ref-type="bibr">17</xref></sup>, to probe protein-protein interactions via antibody-bound oligonucleotides<sup><xref rid="R18" ref-type="bibr">18</xref></sup>, and for targeted or global chromosome conformation capture (3C)<sup><xref rid="R19" ref-type="bibr">19</xref>,<xref rid="R20" ref-type="bibr">20</xref></sup>. Proximity ligation has also been applied in conjunction with crosslinking and either affinity purification or immunoprecipitation to characterize snoRNA-rRNA interactions<sup><xref rid="R21" ref-type="bibr">21</xref></sup> and Ago-mediated miRNA-target interactions<sup><xref rid="R22" ref-type="bibr">22</xref></sup>. However, these efforts have primarily focused on assessing specific <italic>trans</italic> interactions, rely on low-efficiency 254 nanometer UV crosslinking, and require time-consuming purification steps.</p><p id="P4">RPL (&#x02018;ripple&#x02019;) globally assesses which pairs of regions are interacting to form intramolecular RNA structure (<bold><xref ref-type="fig" rid="F1">Fig. 1</xref></bold>). Similar to 3C methods for DNA conformation, RPL uses digestion and re-ligation of RNA, but omits crosslinking, relying instead on the inherent spatial proximity of RNA nucleobases in secondary structural features (for example, stem-loops). To generate RPL libraries, we performed RNase digestion <italic>in situ</italic> (or, for yeast, took advantage of endogenous single-stranded RNases), followed by treatment with exogenous T4 RNA Ligase I under non-denaturing conditions. 
These steps result in chimeric molecules formed from RNA strands intra-molecularly ligating across digested loops (<bold><xref ref-type="fig" rid="F1">Fig. 1a</xref></bold>, inset). By deeply sequencing these resulting fragments and quantifying the relative abundance of specific intramolecular ligation junctions, we are able to create pairwise contact maps that reflect the short- and long-range stem-loop and pseudoknot interactions of intramolecular RNA secondary structures.</p><p id="P5">First we tested RPL in the budding yeast <italic>S. cerevisiae</italic>. To create libraries, we spheroplasted whole yeast cells for 1 h with zymolyase (dissolved in 1X PBS without DTT to allow endogenous RNases to remain active). We then treated the resulting slurries with T4 polynucleotide kinase (PNK) to convert 5&#x02032;-hydroxyl to 5&#x02032;-phosphate termini, and diluted and incubated these mixtures overnight in the presence of a single-stranded RNA ligase (T4 RNA Ligase I) under non-denaturing conditions. We then purified total RNA using acid guanidinium-phenol, and carried out a standard RNA-seq library preparation. Sequencing (Illumina) yielded 304 million (M) concatenated reads for a (+) ligase sample, and 342M concatenated reads for a (&#x02212;) ligase control sample (<bold>Methods</bold>).</p><p id="P6">To identify candidate ligation junctions in these sequencing reads, we adapted an algorithm for identifying novel RNA isoforms from RNA-seq data<sup><xref rid="R23" ref-type="bibr">23</xref></sup>, relaxing constraints on splice-site composition to more generally recognize intramolecular chimeric reads that map discontinuously to a single RNA sequence (<bold>Methods</bold>). To quantify the enrichment of candidate ligations in our samples, we first examined the distribution of spanned distances of intramolecular chimeric reads (i.e. gap sizes), per million reads, in both (+) and (&#x02212;) ligase samples. 
Although the overall fraction of reads corresponding to candidate intramolecular ligation junctions is low, the (+) ligase sample is enriched for these across a broad range of spanned distances (0.28% in (+) ligase sample vs. 0.011% in (&#x02212;) ligase sample; <bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 1</xref></bold>).</p><p id="P7">Potential sources of technical artifacts in these data include the formation of chimeric molecules by reverse transcriptase (RT) template switching, systematic mapping artifacts, PCR-mediated duplicates and non-specific ligation events. To reduce the impact of RT template switching, we discarded candidate ligation junctions with &#x0003e;5 nucleotides (nt) microhomology, as well as those mapping to opposite strands. To remove PCR-mediated duplicates, we collapsed all reads with identical mapping coordinates and CIGAR alignment strings. To reduce the impact of systematic mapping artifacts caused by errors within our reference transcriptome (for example, gross deletions, un-annotated splice junctions), we conservatively discarded candidate ligation junctions containing the highest 1% of ligation counts, for each RNA species analyzed. Finally, to quantify the extent of nonspecific ligation, we performed an experiment in duplicate wherein human cells were taken through a modified version of the RPL protocol (<bold>Methods</bold>) and spiked into yeast slurries immediately before proximity ligation. The resulting data demonstrate marked enrichment for intraspecies, intramolecular chimeric reads (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 2</xref></bold>).</p><p id="P8">We first analyzed RPL data in the context of the complex but extensively validated secondary structures of the yeast ribosomal RNAs (rRNAs). 
The yeast ribosome is composed of the 60S large subunit (LSU), which includes the 3.4 kb 25S rRNA and short 5.8S and 5S rRNAs, and the 40S small subunit (SSU), which includes the 1.8 kb 18S rRNA. To assess whether RPL captures the proximity implied by secondary structure base-pairing, we tallied candidate ligation junctions in a 500-nt window centered on known base pairs of the established rRNA structures, effectively quantifying ligation probability as a function of distance (in linear sequence) from known base pairs (in secondary structure). We observe an enrichment of candidate ligation junctions immediately proximal (i.e. within 10 nt) to known base pairs in both the 5.8S/25S rRNAs (~9-fold; <bold><xref ref-type="fig" rid="F1">Fig. 1b</xref></bold>) and 18S rRNA (~6-fold; <bold><xref ref-type="fig" rid="F1">Fig. 1c</xref></bold>). Furthermore, in the case of the 5.8S/25S rRNAs, which contain many long-range base-pairing interactions, this enrichment is maintained even if we restrict analysis to candidate ligation junctions that span &#x0003e;100 bases in the linear sequence (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 3</xref></bold>).</p><p id="P9">The observed signal is entirely dependent on the inclusion of ligase, and is not explained by sequencing errors, mapping artifacts or by proximity in sequence space (as opposed to structure space). As such, we conclude that it primarily derives from intramolecular ligation events between structurally proximal bases. Nonetheless, the signal shown in <bold><xref ref-type="fig" rid="F1">Fig. 1b,c</xref></bold> is &#x0201c;noise-averaged&#x0201d; over all base pairs in these rRNA structures. Consistent with the stochastic nature of individual ligation events, we observe weaker enrichment when repeating our analysis with a randomly selected subset of 10, 25, or 50 paired bases in either the LSU or SSU rRNAs (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 
4</xref></bold>). The ligation junctions that we observe are also clearly affected by other biases, including the bias against G/C extremes routinely seen with Illumina sequencing, as well as more subtle base composition preferences at the ligation junction (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 5</xref></bold>). We also observe that ligation junctions are enriched for single-stranded bases in the LSU and SSU rRNAs (Odds Ratio (OR) = 2.24; <italic>P</italic> &#x0003c; 2.2E-16, Fisher's Exact Test). This bias, and the noisiness of the raw data, is evident when ligation junctions are overlaid onto a known secondary structure (<bold><xref ref-type="fig" rid="F2">Fig. 2a</xref></bold>).</p><p id="P10">Given these observations, we concluded that the signal of RPL likely arises from the combinatorial digestion and ligation of predominantly unpaired ribonucleotides across broken loop structures. Considering this, along with the stochastic, biased nature of individual ligation events, we speculated that our ability to resolve secondary structure would improve by calculating the frequency of ligation events between pairs of sliding windows (21 nt each), effectively capturing a combinatorial diversity of ligation events surrounding secondary structural elements. Concurrent with this, we adapted normalization methods developed for Hi-C matrices<sup><xref rid="R24" ref-type="bibr">24</xref></sup> to account for other one-dimensional biases (for example, sequence biases of RNA ligase and PCR) (<bold>Methods</bold>). We then visualized these normalized RPL scores, calculated for pairwise windows, by directly overlaying them onto known secondary structures. RPL scores broadly mirror the secondary structures of the 5.8S/25S LSU rRNAs (<bold><xref ref-type="fig" rid="F1">Fig. 1d</xref></bold>, <bold><xref ref-type="fig" rid="F2">Fig. 2b</xref></bold>; <bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 
6a</xref></bold>) as well as the SSU 18S rRNA (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 6b</xref></bold>). Furthermore, we observe signal corresponding to distal tertiary structures, including long-range &#x0201c;pseudoknots&#x0201d; in the LSU rRNAs (<bold><xref ref-type="fig" rid="F1">Fig. 1d</xref></bold>, right inset)<sup><xref rid="R25" ref-type="bibr">25</xref></sup>.</p><p id="P11">We next sought to evaluate the correspondence between proximity ligation events and the structures of non-ribosomal RNA transcripts. Because we are limited by sampling depth, we focused on well-characterized, abundant RNAs; specifically, the snoRNA <italic>snR86</italic> (<bold><xref ref-type="fig" rid="F3">Fig. 3a</xref></bold>), which guides pseudouridylation of the LSU rRNA, the U1 spliceosomal RNA (<italic>snR19</italic>) (<bold><xref ref-type="fig" rid="F3">Fig. 3b</xref></bold>), the RNA component of the signal recognition particle (<italic>SCR1</italic>) (<bold><xref ref-type="fig" rid="F3">Fig. 3c</xref></bold>), and the U2 spliceosomal RNA homolog (<italic>LSR1</italic>) (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 7</xref></bold>). In &#x0201c;contact probability maps&#x0201d; for these RNAs (based on the normalized RPL scores described above), we observe a striking anti-diagonal pattern, reminiscent of signal observed at known stems in the 5.8S/25S and 18S rRNAs. When comparing our contact probability maps to secondary structure predictions generated with INFERNAL<sup><xref rid="R26" ref-type="bibr">26</xref></sup> using covariance models taken from Rfam<sup><xref rid="R27" ref-type="bibr">27</xref></sup>, our observations are consistent with conserved stems in both <italic>snR86</italic> and <italic>snR19</italic> (<bold><xref ref-type="fig" rid="F3">Fig. 3a,b</xref></bold>). 
In RPL measurements for <italic>snR19</italic>, we also observed signal indicative of stem formation in the region comprising bases 320-510&#x02014;MFE predictions suggest that this region can form a helix, raising the possibility that this structure is present endogenously.</p><p id="P12">We also analyzed RPL measurements in the context of a non-ribosomal RNA with a solved structure, the RNA subunit of the signal recognition particle (<italic>SCR1</italic>). Again, we observed broad agreement between RPL scores and regions containing paired bases (<bold><xref ref-type="fig" rid="F3">Fig. 3c</xref></bold>), though we do find that certain expected long-range interactions (for example, folding between the molecule termini) are not seen. Further work will be needed to determine whether this is simply an artifact of insufficient depth-of-coverage, or is symptomatic of some other bias with respect to the classes of structural elements that proximity ligation can resolve.</p><p id="P13">Finally, our observations for <italic>LSR1</italic> (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 7</xref></bold>) are consistent with previous work employing cross-linking, affinity-purification, and proximity ligation of RNA<sup><xref rid="R21" ref-type="bibr">21</xref></sup>, which found ligation products supporting stem-formation between the two termini. In agreement with this cross-linking based approach, our data support the formation of both proximal (for example, stem formation at bases 1100 &#x02013; 1150), and distal folds.</p><p id="P14">We next explored the value of RPL scores as a predictive tool for classifying pairs of interacting regions within a structured RNA. To show that RPL scores can be used in this manner, we examined their positive predictive value (PPV) at varying quantile thresholds for the gold-standard 5.8S/25S and 18S rRNAs (<bold><xref ref-type="fig" rid="F4">Fig. 4a,b</xref></bold>). 
This is a challenging classification problem (92,392 true positive interacting windows out of 6,317,235 possible interacting windows for the LSU rRNAs (1.5%); 41,981 true positive interacting windows out of 1,620,900 possible interacting windows for the SSU rRNA (2.6%)). The highest RPL scores are strongly enriched for true positive interacting windows (LSU rRNA: PPV of 54% using the top 1% of RPL scores; SSU rRNA: PPV of 61% using the top 1% of RPL scores). Plotting PPV as a function of threshold illustrates the tradeoff with sensitivity (<bold><xref ref-type="fig" rid="F4">Fig. 4c,d</xref></bold>). For example, at a sensitivity of 50%, RPL scores have a PPV of 43% for the LSU rRNA and 27% for the SSU rRNA, for predicting structurally interacting pairs of regions.</p><p id="P15">The high-throughput, unbiased identification of intermolecular RNA-RNA interactions is of strong interest in the RNA biology field. Recent work has shown that psoralen-mediated crosslinking may be used in tandem with anti-sense purification to capture <italic>trans</italic> RNA-RNA interactions<sup><xref rid="R28" ref-type="bibr">28</xref></sup>. In principle, RPL should be able to provide complementary information, as interacting RNAs may form ligation products at a higher rate than non-interacting RNAs. Although we observed a modest enrichment for intermolecular yeast ligation junctions in the species mixing experiment (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 2</xref></bold>), this enrichment in our yeast RPL experiment derives primarily from ligation products between the small and large ribosomal subunits (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 8</xref></bold>). 
While no inter-subunit RPL scores approached those of strongly interacting intramolecular windows, it remains possible that a combination of methodological improvements to reduce background and deeper sequencing of RPL libraries may enable global surveys of <italic>trans</italic> RNA-RNA interactions (for example, the signal recognition particle-ribosome interaction; subunit interactions in the translating ribosome).</p><p id="P16">We next sought to adapt RPL to generate secondary structure information corresponding to RNAs in human cells. Most notably, we replaced the zymolyase treatment with a limited <italic>in situ</italic> digestion with exogenous single-stranded RNases A and T1. In analyzing the resulting data in the context of the well-studied human ribosomal RNAs, we again observed correlation of high RPL scores with known interacting regions (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 9</xref></bold>). However, an RNase (&#x02212;), ligase (&#x02212;) control also demonstrated signal that correlated with secondary structure, albeit much more weakly and possibly reflecting endogenous nuclease and ligase activity (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 10</xref></bold>). The possibility that endogenous enzymatic activity may contribute to the formation of chimeric RNAs is not novel; recent work using a cross-linking approach to characterize the miRNA interactome of <italic>C. elegans</italic> curiously found that expected ligation products could form in the absence of exogenous T4 RNA Ligase I<sup><xref rid="R29" ref-type="bibr">29</xref></sup>.</p><p id="P17">We anticipate several directions for improving RPL. First, RPL libraries require deep sequencing to reliably map interacting regions, even for highly abundant RNA species. 
Sufficient sampling of lower-abundance RNA species of interest (for example, mRNAs) might be achieved by optimizing the enzymatic steps of the protocol, by adopting hybrid capture enrichment or subtraction, or simply by brute-force deep sequencing.</p><p id="P18">Second, given the high predictive value<sup><xref rid="R9" ref-type="bibr">9</xref>,<xref rid="R15" ref-type="bibr">15</xref>,<xref rid="R16" ref-type="bibr">16</xref>,<xref rid="R30" ref-type="bibr">30</xref></sup> of <italic>in vivo</italic> structure probing methods (for example, DMS-seq, SHAPE-seq) in determining the pairedness of individual bases in secondary structures, a framework that integrates two-dimensional, lower-resolution RPL data with one-dimensional, higher-resolution structure probing data seems highly attractive. Ideally, computational predictions would be integrated at the same time, thereby taking advantage of three largely orthogonal approaches to maximize the accuracy of RNA structural predictions.</p><p id="P19">The current repertoire of high-throughput empirical assays for RNA secondary structure provides us with a deep, but ultimately one-dimensional window into the structural landscape of RNA molecules. In contrast, RPL globally captures information with respect to pairwise interactions within RNA secondary structures. Through its integration with complementary computational and experimental approaches, we anticipate that RPL will facilitate the high-throughput elucidation of RNA secondary structures in diverse organisms.</p><sec sec-type="methods" id="S1" specific-use="web-only"><title>METHODS</title><sec id="S2"><title>Cell culture</title><p id="P20"><italic>S. cerevisiae</italic> strain FY3 was streaked out on YPD plates and grown at 30 &#x000b0;C. 
Mammalian cells (lymphoblastoid cell line GM12878; Coriell) were cultured at 37 &#x000b0;C, 5% CO<sub>2</sub> in RPMI-1640 supplemented with 1X Anti-Anti (Gibco), 1X Plasmocin (Invivogen), and 15% FBS (Gibco).</p></sec><sec id="S3"><title>RNA Proximity Ligation (RPL)</title><p id="P21">Individual yeast colonies were added directly to 0.5 U Zymolyase in 10 &#x000b5;L 1X phosphate buffered saline (PBS) (Gibco) with 0.2% IGEPAL (Sigma) and incubated at 37 &#x000b0;C for 60 min to spheroplast while maintaining endogenous RNase activity. Spheroplasted yeast were immediately transferred to ice, and mixed with 0.5 &#x000b5;L SUPERase-In (Ambion), 2.5 &#x000b5;L T4 PNK (NEB), 5 &#x000b5;L 10X T4 DNA Ligase Buffer with 10 mM ATP (NEB), and 32 &#x000b5;L 1X PBS with 0.2% IGEPAL, after which the slurry was incubated at 37 &#x000b0;C for 30 min. Following end-repair, complexes were immediately transferred to 450 &#x000b5;L ligation reaction mix (50 &#x000b5;L 10X T4 DNA Ligase Buffer with 10 mM ATP (NEB), 5 &#x000b5;L SUPERase-In (Ambion), 12.5 &#x000b5;L T4 RNA Ligase I (NEB), 382.5 &#x000b5;L 1X PBS with 0.2% IGEPAL), and incubated overnight in a 16 &#x000b0;C water bath, after which complexes were added to 1.5 mL TRIzol (Ambion). Samples were then purified using Direct-zol spin columns (Zymo) according to the manufacturer's protocol. For mammalian experiments, a modified version of RPL was performed wherein 2E6 whole human lymphoblastoid cells (GM12878, Coriell) were treated <italic>in situ</italic> with 0.2 &#x000b5;L of RNace-IT (Agilent) diluted in 9.8 &#x000b5;L 1X PBS with 0.2% IGEPAL for 10 min at 22 &#x000b0;C, after which the RPL protocol was followed, beginning with PNK treatment.</p><p id="P22">T4 PNK is known to have minimal 3&#x02032; phosphatase activity under the buffer conditions we use during our end-repair step<sup><xref rid="R31" ref-type="bibr">31</xref></sup>. 
To ensure that phosphatase activity was not limiting ligation efficiency, we also repeated our yeast RPL experiments using a low-pH imidazole buffer (50 mM imidazole-HCl, pH 6.0, 10 mM MgCl<sub>2</sub>, 1 mM ATP, and 10 mM DTT) for our PNK reactions. We observed comparable ligation efficiencies independent of the use of low-pH buffer (0.28% of analyzed reads in our sample compared to 0.21% and 0.14% in imidazole experiments performed in duplicate).</p><p id="P23">For spike-in experiments, an individual yeast colony and 5E5 human lymphoblastoid cells were subjected to their respective RPL treatments described above. Following PNK treatment, the two slurries were mixed and treated with T4 RNA Ligase I overnight, after which complexes were purified as described above.</p><p id="P24">To quantify the extent of RNA degradation during the yeast RPL protocol, we repeated the yeast RPL experiment, isolating RNA after PNK treatment, as well as after overnight incubations both in the presence and absence of T4 RNA Ligase I. We then analyzed the integrity of these RNA products using an RNA 6000 Nano Lab-on-Chip (Agilent), finding that our products were mildly degraded following PNK treatment (RIN score of ~7), though this degradation appears to have been halted before ligation (<bold><xref ref-type="supplementary-material" rid="SD1">Supplementary Fig. 11</xref></bold>).</p></sec><sec id="S4"><title>Library Preparation</title><p id="P25">Libraries were prepared according to standard Illumina TruSeq RNA guidelines, with minor changes. Notably, polyA-selection steps were skipped, RNA fragmentation (Elute, Prime, Fragment) was carried out for 2.5 min, and PCR amplification of the final library was carried out using qPCR for 8-12 cycles on a Bio-Rad MiniOpticon to prevent library overamplification. Two biological replicate libraries were generated and sequenced for (+) ligase yeast experiments, one of which was selected for deep sequencing and analyzed further in this paper. 
Two biological replicate libraries each were generated for imidazole and species-mixing experiments, for both (+) and (&#x02212;) ligase samples.</p></sec><sec id="S5"><title>Sequencing and sequence alignment</title><p id="P26">Sequencing of libraries was carried out using the Illumina MiSeq, NextSeq 500, and HiSeq 2000 instruments, generating paired-end 80 bp and 101 bp reads. All raw sequencing data and processed data files are accessible under GEO accession GSE69472.</p><sec id="S6"><title>FASTQ Post-processing</title><p id="P27">Raw paired-end FASTQ files were adaptor-trimmed and merged with SeqPrep (<ext-link ext-link-type="uri" xlink:href="https://github.com/jstjohn/SeqPrep">https://github.com/jstjohn/SeqPrep</ext-link>) to collapse read pairs whose mates overlap and therefore carry redundant sequence content. We then took the resulting &#x0201c;singleton&#x0201d; forward and reverse reads (i.e. those that did not contain sufficient overlap to be fused) and concatenated them along with fused reads to yield 304M (for the treated sample) and 342M (for the negative control) concatenated reads, which were then analyzed.</p></sec><sec id="S7"><title>Alignment</title><p id="P28">These resulting FASTQ files were aligned to references generated from either a manually curated list of yeast transcripts with duplicated transcripts removed, taken from the Saccharomyces Genome Database (<ext-link ext-link-type="uri" xlink:href="http://yeastgenome.org">http://yeastgenome.org</ext-link>), or a selected list of deduplicated RefSeq human transcripts, using the STAR aligner with the following parameters:
<list list-type="simple" id="L1"><list-item><p id="P29">--outSJfilterOverhangMin 6 6 6 6</p></list-item><list-item><p id="P30">--outSJfilterCountTotalMin 1 1 1 1</p></list-item><list-item><p id="P31">--outSJfilterDistToOtherSJmin 0 0 0 0</p></list-item><list-item><p id="P32">--alignIntronMin 10</p></list-item><list-item><p id="P33">--chimSegmentMin 15</p></list-item><list-item><p id="P34">--chimScoreJunctionNonGTAG 0</p></list-item><list-item><p id="P35">--chimJunctionOverhangMin 6</p></list-item></list></p></sec></sec><sec id="S8"><title>Bioinformatic Analyses</title><p id="P36">Secondary structures in BPSEQ format for <italic>S. cerevisiae</italic> were downloaded from the Comparative RNA Website<sup><xref rid="R32" ref-type="bibr">32</xref></sup> and RNA structures were visualized through a modified version of VARNA. <italic>H. sapiens</italic> rRNA structures were inferred from a published cryo-EM structure<sup><xref rid="R33" ref-type="bibr">33</xref></sup>, using 3DNA<sup><xref rid="R34" ref-type="bibr">34</xref></sup>. STAR-generated output was analyzed with custom Python and R scripts to generate contact probability maps (all custom scripts used to analyze aligned data are provided in <bold>Supplementary Scripts</bold>). First, STAR alignments were deduplicated by collapsing all alignments with identical start coordinates and CIGAR strings. These deduplicated alignments were then converted to &#x0201c;splice junction&#x0201d; and &#x0201c;chimer&#x0201d; files using awk, and ligation junctions were parsed from these files. For specific species of interest, these ligation counts were then filtered further to remove the highest 1% of counts between individual pairs of bases. 
To calculate the distribution of ligations around known base-pairs, we looked at all pairs of bases (<italic>i,j</italic>) in our secondary structure BPSEQ files, and calculated the abundance of ligation events between (<italic>i, j</italic> &#x02013; 250) to (<italic>i, j</italic> + 250) for each base. For sub-sampling experiments, we randomly sampled 10, 25, or 50 paired-bases and repeated these calculations.</p><p id="P37">To compute RPL scores, which measure the extent of ligation between two regions of a molecule, we first considered the sparse matrix <italic>M</italic> where <italic>M<sub>ij</sub></italic> is the ligation count between base <italic>i</italic> and base <italic>j</italic>. To generate the RPL score matrix <italic>M</italic>*, we compute the coverage at each base <italic>i</italic> and <italic>j</italic> (<italic>c<sub>i</sub></italic>; <italic>c<sub>j</sub></italic>) and generate a normalized matrix <italic>M<sub>norm</sub></italic> such that: <disp-formula id="FD1"><mml:math display="block" id="M1" overflow="scroll"><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:msqrt><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>c</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mfrac></mml:mrow></mml:math></disp-formula> We then use this normalized matrix to generate <italic>M</italic>* by binning all normalized scores: <disp-formula id="FD2"><mml:math display="block" id="M2" 
overflow="scroll"><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mo>&#x02217;</mml:mo></mml:msubsup><mml:mo>=</mml:mo><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>=</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>+</mml:mo><mml:mn>10</mml:mn></mml:mrow></mml:munderover><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>=</mml:mo><mml:mi>j</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>+</mml:mo><mml:mn>10</mml:mn></mml:mrow></mml:munderover><mml:msubsup><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:math></disp-formula> Classification analyses were performed as follows: we thresholded the RPL scores resulting from the above smoothing by quantile, with a quantile step size of 0.001, and classified as true positives those pairs of interacting 21 nt windows with RPL scores greater than the specified threshold that also contained at least one set of paired bases.</p><p id="P38">To generate secondary structures for <italic>snR86</italic> and <italic>snR19</italic>, we downloaded covariance models from Rfam (<italic>snR86</italic> Accession: RF01272; <italic>snR19</italic> Accession: RF00488), aligned the respective yeast sequences to their covariance models using the cmalign program from INFERNAL v1.1.1, and converted the resulting Stockholm alignment files to BPSEQ format using VARNA.</p><p id="P39">Structures of the yeast ribosome (PDB Accession: 4V88) were visualized using PyMOL (<ext-link ext-link-type="uri" xlink:href="http://www.pymol.org/">http://www.pymol.org/</ext-link>).</p></sec></sec><sec sec-type="supplementary-material" id="SM"><title>Supplementary 
Material</title><supplementary-material content-type="local-data" id="SD1"><label>1</label><media xlink:href="NIHMS700917-supplement-1.doc" orientation="portrait" xlink:type="simple" id="d37e749" position="anchor"/></supplementary-material></sec></body><back><ack id="S9"><title>ACKNOWLEDGMENTS</title><p>We thank members of the Shendure Lab (particularly D. Cusanovich, M. Kircher, A. McKenna, and M. Snyder), D. Fowler, C. Trapnell, and J. Underwood for helpful discussions and comments on the manuscript. We thank G. Kudla, A. Helwak, and D. Tollervey for answering questions pertaining to the CLASH protocol. We would also like to acknowledge A. Dobin for making auxiliary scripts for processing STAR alignments publicly available. This work was funded by an NIH Director's Pioneer Award (1DP1HG007811 to J.S.) and an NIH NHGRI Genome Training Grant (5T32HG000035 to V.R.).</p></ack><fn-group><fn id="FN1"><p id="P40">AUTHOR CONTRIBUTIONS</p><p id="P41">V.R. and J.S. conceived of the project and devised experiments. V.R. and R.Q. carried out the experiments. V.R. performed computational analyses. V.R. and J.S. wrote the manuscript.</p></fn><fn id="FN2"><p id="P42">COMPETING FINANCIAL INTERESTS</p><p id="P43">The authors declare no competing financial interests.</p></fn></fn-group><ref-list><ref id="R1"><label>1</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mortimer</surname><given-names>SA</given-names></name><name><surname>Kidwell</surname><given-names>MA</given-names></name><name><surname>Doudna</surname><given-names>JA</given-names></name></person-group><article-title>Insights into RNA structure and function from genome-wide studies.</article-title><source>Nat. Rev. 
Genet</source><year>2014</year><volume>15</volume><fpage>469</fpage><lpage>479</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24821474</pub-id></element-citation></ref><ref id="R2"><label>2</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cate</surname><given-names>JH</given-names></name><etal/></person-group><article-title>Crystal Structure of a Group I Ribozyme Domain: Principles of RNA Packing.</article-title><source>Science</source><year>1996</year><volume>273</volume><fpage>1678</fpage><lpage>1685</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">8781224</pub-id></element-citation></ref><ref id="R3"><label>3</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>Y-H</given-names></name><name><surname>Murphy</surname><given-names>FL</given-names></name><name><surname>Cech</surname><given-names>TR</given-names></name><name><surname>Griffith</surname><given-names>JD</given-names></name></person-group><article-title>Visualization of a Tertiary Structural Domain of the Tetrahymena Group I Intron by Electron Microscopy.</article-title><source>J. Mol. 
Biol</source><year>1994</year><volume>236</volume><fpage>64</fpage><lpage>71</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">7508985</pub-id></element-citation></ref><ref id="R4"><label>4</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Latham</surname><given-names>MP</given-names></name><name><surname>Brown</surname><given-names>DJ</given-names></name><name><surname>McCallum</surname><given-names>SA</given-names></name><name><surname>Pardi</surname><given-names>A</given-names></name></person-group><article-title>NMR Methods for Studying the Structure and Dynamics of RNA.</article-title><source>ChemBioChem</source><year>2005</year><volume>6</volume><fpage>1492</fpage><lpage>1505</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">16138301</pub-id></element-citation></ref><ref id="R5"><label>5</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zuker</surname><given-names>M</given-names></name></person-group><article-title>Mfold web server for nucleic acid folding and hybridization prediction.</article-title><source>Nucleic Acids Res</source><year>2003</year><volume>31</volume><fpage>3406</fpage><lpage>3415</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">12824337</pub-id></element-citation></ref><ref id="R6"><label>6</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Reuter</surname><given-names>J</given-names></name><name><surname>Mathews</surname><given-names>D</given-names></name></person-group><article-title>RNAstructure: software for RNA secondary structure prediction and analysis.</article-title><source>BMC Bioinformatics</source><year>2010</year><volume>11</volume><fpage>129</fpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">20230624</pub-id></element-citation></ref><ref id="R7"><label>7</label><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Lorenz</surname><given-names>R</given-names></name><etal/></person-group><article-title>ViennaRNA Package 2.0.</article-title><source>Algorithms Mol. Biol</source><year>2011</year><volume>6</volume><fpage>26</fpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">22115189</pub-id></element-citation></ref><ref id="R8"><label>8</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shendure</surname><given-names>J</given-names></name><name><surname>Aiden</surname><given-names>EL</given-names></name></person-group><article-title>The expanding scope of DNA sequencing.</article-title><source>Nat. Biotechnol</source><year>2012</year><volume>30</volume><fpage>1084</fpage><lpage>1094</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">23138308</pub-id></element-citation></ref><ref id="R9"><label>9</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rouskin</surname><given-names>S</given-names></name><name><surname>Zubradt</surname><given-names>M</given-names></name><name><surname>Washietl</surname><given-names>S</given-names></name><name><surname>Kellis</surname><given-names>M</given-names></name><name><surname>Weissman</surname><given-names>JS</given-names></name></person-group><article-title>Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo.</article-title><source>Nature</source><year>2014</year><volume>505</volume><fpage>701</fpage><lpage>705</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24336214</pub-id></element-citation></ref><ref id="R10"><label>10</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname><given-names>Y</given-names></name><etal/></person-group><article-title>In vivo genome-wide profiling of RNA secondary 
structure reveals novel regulatory features.</article-title><source>Nature</source><year>2014</year><volume>505</volume><fpage>696</fpage><lpage>700</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24270811</pub-id></element-citation></ref><ref id="R11"><label>11</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lucks</surname><given-names>JB</given-names></name><etal/></person-group><article-title>Multiplexed RNA structure characterization with selective 2&#x02032;-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq).</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2011</year><volume>108</volume><fpage>11063</fpage><lpage>11068</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">21642531</pub-id></element-citation></ref><ref id="R12"><label>12</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kertesz</surname><given-names>M</given-names></name><etal/></person-group><article-title>Genome-wide measurement of RNA secondary structure in yeast.</article-title><source>Nature</source><year>2010</year><volume>467</volume><fpage>103</fpage><lpage>107</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">20811459</pub-id></element-citation></ref><ref id="R13"><label>13</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wan</surname><given-names>Y</given-names></name><etal/></person-group><article-title>Landscape and variation of RNA secondary structure across the human transcriptome.</article-title><source>Nature</source><year>2014</year><volume>505</volume><fpage>706</fpage><lpage>709</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24476892</pub-id></element-citation></ref><ref id="R14"><label>14</label><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Underwood</surname><given-names>JG</given-names></name><etal/></person-group><article-title>FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing.</article-title><source>Nat. Methods</source><year>2010</year><volume>7</volume><fpage>995</fpage><lpage>1001</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">21057495</pub-id></element-citation></ref><ref id="R15"><label>15</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kladwang</surname><given-names>W</given-names></name><name><surname>VanLang</surname><given-names>CC</given-names></name><name><surname>Cordero</surname><given-names>P</given-names></name><name><surname>Das</surname><given-names>R</given-names></name></person-group><article-title>A two-dimensional mutate-and-map strategy for non-coding RNA structure.</article-title><source>Nat. Chem</source><year>2011</year><volume>3</volume><fpage>954</fpage><lpage>962</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">22109276</pub-id></element-citation></ref><ref id="R16"><label>16</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Siegfried</surname><given-names>NA</given-names></name><name><surname>Busan</surname><given-names>S</given-names></name><name><surname>Rice</surname><given-names>GM</given-names></name><name><surname>Nelson</surname><given-names>JAE</given-names></name><name><surname>Weeks</surname><given-names>KM</given-names></name></person-group><article-title>RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP).</article-title><source>Nat. Methods</source><year>2014</year><volume>11</volume><fpage>959</fpage><lpage>965</lpage></element-citation></ref><ref id="R17"><label>17</label><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Fredriksson</surname><given-names>S</given-names></name><etal/></person-group><article-title>Protein detection using proximity-dependent DNA ligation assays.</article-title><source>Nat. Biotechnol</source><year>2002</year><volume>20</volume><fpage>473</fpage><lpage>477</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">11981560</pub-id></element-citation></ref><ref id="R18"><label>18</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>S&#x000f6;derberg</surname><given-names>O</given-names></name><etal/></person-group><article-title>Direct observation of individual endogenous protein complexes in situ by proximity ligation.</article-title><source>Nat. Methods</source><year>2006</year><volume>3</volume><fpage>995</fpage><lpage>1000</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">17072308</pub-id></element-citation></ref><ref id="R19"><label>19</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dekker</surname><given-names>J</given-names></name><name><surname>Rippe</surname><given-names>K</given-names></name><name><surname>Dekker</surname><given-names>M</given-names></name><name><surname>Kleckner</surname><given-names>N</given-names></name></person-group><article-title>Capturing Chromosome Conformation.</article-title><source>Science</source><year>2002</year><volume>295</volume><fpage>1306</fpage><lpage>1311</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">11847345</pub-id></element-citation></ref><ref id="R20"><label>20</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lieberman-Aiden</surname><given-names>E</given-names></name><etal/></person-group><article-title>Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human 
Genome.</article-title><source>Science</source><year>2009</year><volume>326</volume><fpage>289</fpage><lpage>293</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">19815776</pub-id></element-citation></ref><ref id="R21"><label>21</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kudla</surname><given-names>G</given-names></name><name><surname>Granneman</surname><given-names>S</given-names></name><name><surname>Hahn</surname><given-names>D</given-names></name><name><surname>Beggs</surname><given-names>JD</given-names></name><name><surname>Tollervey</surname><given-names>D</given-names></name></person-group><article-title>Cross-linking, ligation, and sequencing of hybrids reveals RNA&#x02013;RNA interactions in yeast.</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2011</year><volume>108</volume><fpage>10010</fpage><lpage>10015</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">21610164</pub-id></element-citation></ref><ref id="R22"><label>22</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Helwak</surname><given-names>A</given-names></name><name><surname>Kudla</surname><given-names>G</given-names></name><name><surname>Dudnakova</surname><given-names>T</given-names></name><name><surname>Tollervey</surname><given-names>D</given-names></name></person-group><article-title>Mapping the Human miRNA Interactome by CLASH Reveals Frequent Noncanonical Binding.</article-title><source>Cell</source><year>2013</year><volume>153</volume><fpage>654</fpage><lpage>665</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">23622248</pub-id></element-citation></ref><ref id="R23"><label>23</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dobin</surname><given-names>A</given-names></name><etal/></person-group><article-title>STAR: ultrafast universal 
RNA-seq aligner.</article-title><source>Bioinforma</source><year>2012</year><comment>doi:10.1093/bioinformatics/bts635</comment></element-citation></ref><ref id="R24"><label>24</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Rao</surname><given-names>SSP</given-names></name><etal/></person-group><article-title>A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping.</article-title><source>Cell</source><year>2014</year><volume>159</volume><fpage>1665</fpage><lpage>1680</lpage><comment>Medline</comment><pub-id pub-id-type="pmid">25497547</pub-id></element-citation></ref><ref id="R25"><label>25</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ben-Shem</surname><given-names>A</given-names></name><etal/></person-group><article-title>The Structure of the Eukaryotic Ribosome at 3.0 &#x000c5; Resolution.</article-title><source>Science</source><year>2011</year><volume>334</volume><fpage>1524</fpage><lpage>1529</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">22096102</pub-id></element-citation></ref><ref id="R26"><label>26</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nawrocki</surname><given-names>EP</given-names></name><name><surname>Eddy</surname><given-names>SR</given-names></name></person-group><article-title>Infernal 1.1: 100-fold faster RNA homology searches.</article-title><source>Bioinformatics</source><year>2013</year><volume>29</volume><fpage>2933</fpage><lpage>2935</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24008419</pub-id></element-citation></ref><ref id="R27"><label>27</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Burge</surname><given-names>SW</given-names></name><etal/></person-group><article-title>Rfam 11.0: 10 years of RNA 
families.</article-title><source>Nucleic Acids Res</source><year>2013</year><fpage>D226</fpage><lpage>D232</lpage><comment>Medline</comment><pub-id pub-id-type="pmid">23125362</pub-id></element-citation></ref><ref id="R28"><label>28</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Engreitz</surname><given-names>JM</given-names></name><etal/></person-group><article-title>RNA-RNA Interactions Enable Specific Targeting of Noncoding RNAs to Nascent Pre-mRNAs and Chromatin Sites.</article-title><source>Cell</source><year>2014</year><volume>159</volume><fpage>188</fpage><lpage>199</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">25259926</pub-id></element-citation></ref><ref id="R29"><label>29</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grosswendt</surname><given-names>S</given-names></name><etal/></person-group><article-title>Unambiguous Identification of miRNA:Target Site Interactions by Different Types of Ligation Reactions.</article-title><source>Mol. 
Cell</source><year>2014</year><volume>54</volume><fpage>1042</fpage><lpage>1054</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">24857550</pub-id></element-citation></ref><ref id="R30"><label>30</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cordero</surname><given-names>P</given-names></name><name><surname>Lucks</surname><given-names>JB</given-names></name><name><surname>Das</surname><given-names>R</given-names></name></person-group><article-title>An RNA Mapping DataBase for curating RNA structure mapping experiments.</article-title><source>Bioinformatics</source><year>2012</year><volume>28</volume><fpage>3006</fpage><lpage>3008</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">22976082</pub-id></element-citation></ref><ref id="R31"><label>31</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cameron</surname><given-names>V</given-names></name><name><surname>Uhlenbeck</surname><given-names>OC</given-names></name></person-group><article-title>3&#x02032;-Phosphatase activity in T4 polynucleotide kinase.</article-title><source>Biochemistry</source><year>1977</year><volume>16</volume><fpage>5120</fpage><lpage>5126</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">199248</pub-id></element-citation></ref><ref id="R32"><label>32</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cannone</surname><given-names>J</given-names></name><etal/></person-group><article-title>The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs.</article-title><source>BMC Bioinformatics</source><year>2002</year><volume>3</volume><fpage>2</fpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">11869452</pub-id></element-citation></ref><ref 
id="R33"><label>33</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Anger</surname><given-names>AM</given-names></name><etal/></person-group><article-title>Structures of the human and Drosophila 80S ribosome.</article-title><source>Nature</source><year>2013</year><volume>497</volume><fpage>80</fpage><lpage>85</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">23636399</pub-id></element-citation></ref><ref id="R34"><label>34</label><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname><given-names>X</given-names></name><name><surname>Olson</surname><given-names>WK</given-names></name></person-group><article-title>3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures.</article-title><source>Nucleic Acids Res</source><year>2003</year><volume>31</volume><fpage>5108</fpage><lpage>5121</lpage><comment>CrossRef Medline</comment><pub-id pub-id-type="pmid">12930962</pub-id></element-citation></ref></ref-list></back><floats-group><fig id="F1" orientation="portrait" position="float"><label>Figure 1</label><caption><title>RNA Proximity Ligation identifies structurally proximate regions within the complex secondary structures of <italic>S. cerevisiae</italic> ribosomal RNAs. a.)</title><p>A schematic representation of the RPL method. Whole cells are spheroplasted with zymolyase and RNA is allowed to react with endogenous RNases. RNA ends are repaired <italic>in situ</italic> via T4 PNK to yield 5&#x02032;-phosphate termini. Complexes are ligated overnight in the presence of T4 RNA Ligase I. Ligation products are cleaned up via acid guanidinium-phenol and subsequent DNase treatment, and subjected to Illumina TruSeq RNA-seq library preparation. 
These libraries are sequenced to map and count ligation junctions. <bold>b.-c.)</bold> We examined the distribution of ligation junctions as a function of distance from known base-pair partners in the 5.8S/25S and 18S rRNAs. Ligation products capture the structural proximity implied by base-pairing relationships, as evidenced by the enrichment for ligation junctions in the immediate vicinity of paired bases. Y-axes are shown as ligation counts per million reads analyzed. <bold>d.)</bold> Contact probability map for the eukaryotic 5.8S/25S rRNA based on RPL scores, which are calculated from the frequencies of ligation events between pairs of 21 nt windows (<bold>Methods</bold>). <bold>Lower inset:</bold> Ligation events, shown for bases 1300 to 1475 of the LSU rRNA in orange, primarily occur across digested single-stranded loops. RPL scores effectively smooth this noisy signal and are enriched for pairs of interacting regions. Plotted here are the 8,463 ligation events where both nucleotides fall within the displayed domain (compared to 17,029 ligation events where one nucleotide falls within the displayed domain and one does not, not shown). <bold>Right inset:</bold> RPL scores localize known pseudoknots in the LSU rRNA structure, such as the interaction between bases 1727&#x02013;1812 (shown in red) and bases 1941&#x02013;2038 (shown in blue).</p></caption><graphic xlink:href="nihms-700917-f0001"/></fig><fig id="F2" orientation="portrait" position="float"><label>Figure 2</label><caption><p>Smoothing of ligation junction data results in ligase-dependent signal around known stem-loop formations. a.) The 10,000 most abundant ligation pairs for the LSU rRNA (red) overlaid onto the known secondary structure (blue). While signal across stem-loops is evident, there is considerable noise. 
<bold>b.)</bold> Top 25,000 interacting windows based on RPL scores, which are calculated from the frequencies of ligation between pairs of 21 nt windows (<bold>Methods</bold>), for the LSU rRNA in the (+) ligase sample (red), again overlaid onto the known secondary structure (blue). Lines are drawn between the central bases of two interacting 21 nt windows. For <bold>b.)</bold>, the shading of the red lines is proportional to the ligation frequency.</p></caption><graphic xlink:href="nihms-700917-f0002"/></fig><fig id="F3" orientation="portrait" position="float"><label>Figure 3</label><caption><p>2D RPL contact probability maps recapitulate known and predicted non-ribosomal RNA structures. a.) Contact probability map for <italic>snR86</italic> mirrored against interacting windows containing paired bases, based on conserved secondary structure. <bold>b.)</bold> Contact probability map for <italic>snR19</italic> mirrored against interacting windows containing paired bases, based on conserved secondary structure. RPL signal indicating the formation of a stem-loop in bases 320-510 within the molecule is supported by MFE predictions, but not conservation. <bold>c.)</bold> Contact probability map for <italic>SCR1</italic> mirrored against interacting windows containing paired bases, based on the known structure of <italic>SCR1</italic>. For all analyses shown here, RPL scores were calculated using a window size of 21 nt.</p></caption><graphic xlink:href="nihms-700917-f0003"/></fig><fig id="F4" orientation="portrait" position="float"><label>Figure 4</label><caption><title>RPL scores demonstrate modest positive predictive value for pairs of interacting windows in RNA secondary structure. a-b.)</title><p>Plots of number of true positive interacting windows versus number of false positive interacting windows for the <bold>(a)</bold> 5.8S/25S rRNAs and <bold>(b)</bold> 18S rRNA, at various quantile thresholds on RPL scores. 
This analysis shows that RPL scores have predictive value in classifying interacting regions containing at least one set of paired bases within RNA secondary structure. <bold>c-d.)</bold> Plots of the positive predictive value (green) and sensitivity (purple) of RPL-based classification of interacting regions, as a function of quantile threshold used for <bold>(c)</bold> 5.8S/25S and <bold>(d)</bold> 18S rRNAs. The quantile step size used for all analyses shown in this figure was 0.001.</p></caption><graphic xlink:href="nihms-700917-f0004"/></fig></floats-group></article>